Sunday, August 1, 2010

Currently Researching...

Dynamic Vertex Buffers in OpenGL. This is what DirectQ uses, and it can handle my 400 knights test map at over 40 FPS. RMQ is still using plain old vertex arrays, however, and the performance is just not there: less than half that.

So why would I want to draw 400 knights anyway? Performance headroom is the reason. There are ambitious things planned for RMQ, and the faster it can handle MDL drawing (which is the current major bottleneck), the more headroom will be available for other things to happen.

With D3D correct use of a dynamic VBO is clear and easy. Create it with D3DUSAGE_DYNAMIC, lock it with D3DLOCK_DISCARD, write in your data, unlock when done, and draw away to your heart's content. DirectQ pulls a few more tricks to keep the number of lock/unlock operations to a minimum, such as saving out state changes to a replayable list so that we can write in objects with different states without needing to unlock/render/lock; the only thing that actually causes an unlock/render is a change in vertex format (happens a LOT less often than you might think) or the end of the current frame. These are all viable techniques for RMQ too.
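To make the replayable list idea a bit more concrete, here is a minimal sketch in plain C. It is not DirectQ's actual code: the names (batch_t, batch_add, batch_flush), the buffer sizes, and the use of a single texture number as a stand-in for "state" are all assumptions made for illustration, and the lock/unlock and draw calls are only simulated with counters so that it runs without a graphics API.

```c
#include <assert.h>
#include <string.h>

#define MAX_FLOATS  4096   /* staging buffer size (hypothetical) */
#define MAX_CHANGES 64     /* state change list size (hypothetical) */

/* One recorded state change: "from this vertex on, use this texture". */
typedef struct { int first_vert; int texture; } state_change_t;

typedef struct {
    float          data[MAX_FLOATS];
    int            num_verts;
    int            stride;               /* floats per vertex, i.e. the vertex format */
    state_change_t changes[MAX_CHANGES];
    int            num_changes;
    int            flushes;              /* simulated lock/unlock pairs */
    int            draws;                /* simulated draw calls */
} batch_t;

/* Upload everything once, then replay the state changes range by range. */
static void batch_flush(batch_t *b)
{
    int i;

    if (b->num_verts == 0) { b->num_changes = 0; return; }

    /* a real renderer would lock, copy b->data, and unlock here, once per flush */
    b->flushes++;

    for (i = 0; i < b->num_changes; i++) {
        int first = b->changes[i].first_vert;
        int end = (i + 1 < b->num_changes) ? b->changes[i + 1].first_vert
                                           : b->num_verts;
        if (end > first) {
            /* a real renderer would set the state and draw [first, end) here */
            b->draws++;
        }
    }

    b->num_verts = 0;
    b->num_changes = 0;
}

/* Append an object; only a format change or a full buffer forces a flush. */
static void batch_add(batch_t *b, int stride, int texture,
                      const float *verts, int count)
{
    assert(count * stride <= MAX_FLOATS);   /* one object must fit on its own */

    if (stride != b->stride ||
        (b->num_verts + count) * stride > MAX_FLOATS ||
        b->num_changes == MAX_CHANGES)
        batch_flush(b);
    b->stride = stride;

    /* a different state is just recorded in the list, not rendered yet */
    if (b->num_changes == 0 ||
        b->changes[b->num_changes - 1].texture != texture) {
        b->changes[b->num_changes].first_vert = b->num_verts;
        b->changes[b->num_changes].texture = texture;
        b->num_changes++;
    }

    memcpy(b->data + b->num_verts * stride, verts,
           sizeof(float) * (size_t)count * (size_t)stride);
    b->num_verts += count;
}
```

The point of the structure is that objects with different textures can share one lock/unlock: the texture switch costs a list entry at write time and a draw call at replay time, but never an extra trip through the buffer. Calling batch_flush at the end of the frame picks up whatever is left over.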

With OpenGL the situation is a little muddier, and this comes back to my theory about somebody in OpenGL land panicking and leaving us with VBOs as a half-thought-through first-draft API. There are operations that seem like they should do much the same as in D3D, but there are also subtle little differences in important parts. The fact that anything could be going through software emulation at any point in time, and you never quite know, is another negative point against OpenGL. My initial attempt at bringing on VBOs in RMQ went down in flames as a result of these, but now that I have almost everything in vertex arrays it's getting near time to try again.

I'm thinking a two-step process will be best here. Step 1 will involve moving the renderer to the same final structure as DirectQ uses, meaning the replayable list of state changes, but keep it using standard vertex arrays. This won't be too painful and can be done on an even more modular basis, such as taking brush models across first. Step 2 will involve bringing on VBOs on top of that.

Step 2 is the one that concerns me. Another point about OpenGL VBOs I haven't yet mentioned is that every time I experimented with them in the past I observed zero performance gain (hardware platform is irrelevant here as all of my prior tests were on NVIDIA). The possibility that I was doing something wrong can't be discounted, of course, but there is still an element of reluctance in me to undertake the work required for Step 1 (even though it's not much at all) without knowing for certain that Step 2 will give the improvement that it should. Without Step 2 doing its thing, Step 1 amounts to little more than code aesthetics, which are nice but unnecessary.

Fortunately the amount of work for Step 1 is quite small so it's getting near time to dive in. I'll report back on progress. Till then!
