Here are two screenshots I took.
The first describes what methods I used to render the four boxes.
The second is the same as the first, except I changed the PresentationInterval from D3DPRESENT_INTERVAL_ONE to D3DPRESENT_INTERVAL_IMMEDIATE, basically a speed test.
I think if I were using a native display, I could get a few more FPS out of it, but it works, so I'm not complaining. The only thing I will complain about is the way D3D locks its vertex/index buffers. To mirror Lock directly, I would have to create a new direct buffer around the returned memory area every time it is called. That would be fine, except that when doing things like animation, you call Lock every frame. I guess that's what garbage collectors are for, but still, that's a lot of buffers. For now, I just copy the data from the buffer that's passed in into the locked memory (a memcpy, to be precise). I have an idea for how to fix this to give it more of a 1:1 feel, but I'm not sure if it will work.
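To make the trade-off concrete, here is a minimal sketch of the two approaches. It assumes the binding is Java with NIO direct buffers, which the post doesn't actually say. Nothing here touches D3D: `nativeMemory` stands in for the memory that Lock would return, and `LockSketch`, `lockAsView`, and `setData` are hypothetical names, not part of any real wrapper.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LockSketch {
    // Stand-in for the memory D3D owns (assumption: a real binding would
    // get this region from the pointer that Lock returns).
    static final ByteBuffer nativeMemory =
            ByteBuffer.allocateDirect(64).order(ByteOrder.nativeOrder());

    // Approach A: hand back a fresh buffer object wrapping the locked region.
    // Writes go straight to the underlying memory, but every call allocates
    // a new (small) buffer object for the garbage collector to clean up.
    static ByteBuffer lockAsView(int offset, int size) {
        ByteBuffer dup = nativeMemory.duplicate();
        dup.position(offset);
        dup.limit(offset + size);
        // slice() resets the byte order to big-endian, so set it again.
        return dup.slice().order(ByteOrder.nativeOrder());
    }

    // Approach B: the caller passes its own buffer and the contents are
    // copied into the locked region (the memcpy-style route).
    static void setData(int offset, ByteBuffer src) {
        ByteBuffer dup = nativeMemory.duplicate();
        dup.position(offset);
        dup.put(src.duplicate()); // duplicate so src's position is untouched
    }

    public static void main(String[] args) {
        // Approach A: write a float through the view, read it from the backing memory.
        ByteBuffer view = lockAsView(16, 8);
        view.putFloat(0, 1.5f);
        System.out.println("via view: " + nativeMemory.getFloat(16));

        // Approach B: fill a caller-owned buffer once and copy it in each frame.
        ByteBuffer vertexData = ByteBuffer.allocate(4).order(ByteOrder.nativeOrder());
        vertexData.putFloat(0, 2.5f);
        setData(32, vertexData);
        System.out.println("via copy: " + nativeMemory.getFloat(32));
    }
}
```

Approach A is closer to the 1:1 feel but creates one wrapper object per Lock call. Approach B reuses a single caller-owned buffer at the cost of an extra copy.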
-gz