Wednesday, July 21, 2010

glTexSubImage2D and the quest for fast lightmap uploads

The saga of lightmap uploads is continuing, and while I do have a solution in place, it is not one that I consider trustworthy in the long term. It's good enough for testing though.

glTexSubImage2D is an interesting beast. When I was doing the original port of what became DirectQ I literally agonized over this part of the code. The OpenGL version (a single call to glTexSubImage2D) seemed so clean and neat, whereas the D3D version (lots of LockRect, memcpy, UnlockRect, decisions about memory pools and when to dirty the update region) just looked awful.

Surely the one function call version just had to be the optimal performer?

Things are not always what they seem. Right now glTexSubImage2D is just not performing well at all. Tests have indicated that the most likely cause is one of two things: either Windows 7 with Aero enabled, or newer (OpenGL 2.0+) drivers. I'll find out later this week what happens with a GL2 driver on XP.

Performance is of course relative, so how bad is it? I've run a number of test cases in standalone apps, explicitly timing only the glTexSubImage2D call. On the older hardware/XP machines a single 1024x1024 texture can be fully updated in 2 milliseconds. On the newer hardware/Windows 7 machines it takes 78ms. More representative of what's happening in RMQ is 16 lightmaps created at 64x512. For that case the timings are 5ms versus 40ms.
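
To put those timings in perspective, here is a rough back-of-envelope sketch in C that converts them into an effective upload rate. The function name and the 4 bytes per texel figure are assumptions made for illustration; the tests above do not state which texture format was used.

```c
#include <assert.h>

/* Effective upload rate in MiB/s for `count` textures of width x height
   texels, given a measured upload time in milliseconds.
   ASSUMPTION: bytes_per_texel is supplied by the caller; 4 (RGBA8 or
   BGRA8) is a guess, not something the timings above specify. */
static double upload_mib_per_sec(int width, int height, int count,
                                 int bytes_per_texel, double ms)
{
    double bytes;

    assert(ms > 0.0);
    bytes = (double)width * height * count * bytes_per_texel;

    /* bytes -> MiB, milliseconds -> seconds */
    return (bytes / (1024.0 * 1024.0)) / (ms / 1000.0);
}
```

Under the 4 bytes per texel assumption, the 1024x1024 case works out to roughly 2000 MiB/s at 2ms and roughly 51 MiB/s at 78ms, while the 16 lightmaps at 64x512 come to about 400 MiB/s at 5ms and 50 MiB/s at 40ms. If that assumption holds, both slow cases land at a similar rate of around 50 MiB/s.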

None of this happens in DirectQ, where performance consistently scales up with increasing hardware capabilities.

So what is happening and why? Nobody seems to know, or if they do, they're not telling. What is definite is that various support forums are filling with questions about it, and the same tired old answers are always given (use a PBO, use BGRA, etc.). I have tried all of them and none of them work.

The current RMQ solution (lots of small updates) does work, but I would want some semi-official indication that it is the correct approach before committing to it.
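
As an illustration only, one way "lots of small updates" could be structured is to split the dirty rows of a lightmap into short full-width strips and issue one upload per strip. This is a minimal sketch under that assumption, not the actual RMQ code: `upload_fn`, `upload_in_strips` and `count_rows` are hypothetical names, and the callback stands in for the real glTexSubImage2D call so the splitting logic can be shown without a GL context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one upload call. In a real renderer this
   would wrap glTexSubImage2D(GL_TEXTURE_2D, 0, x, y, w, h, format,
   type, pixels) with whatever format and type the lightmaps use. */
typedef void (*upload_fn)(int x, int y, int w, int h,
                          const unsigned char *pixels, void *user);

/* Upload rows [dirty_top, dirty_top + dirty_rows) of a tightly packed
   lightmap as full-width strips of at most strip_rows rows each.
   Full-width strips are contiguous in memory, so no row-length
   unpack state is needed. Returns the number of upload calls made. */
static int upload_in_strips(const unsigned char *texels, int width,
                            int bytes_per_texel, int dirty_top,
                            int dirty_rows, int strip_rows,
                            upload_fn upload, void *user)
{
    int calls = 0;
    int y = dirty_top;
    int end = dirty_top + dirty_rows;

    assert(strip_rows > 0);

    while (y < end) {
        /* The last strip may be shorter than strip_rows. */
        int h = (end - y < strip_rows) ? (end - y) : strip_rows;
        const unsigned char *src =
            texels + (size_t)y * width * bytes_per_texel;

        upload(0, y, width, h, src, user);
        y += h;
        calls++;
    }
    return calls;
}

/* Example callback that only tallies how many rows it was handed. */
static void count_rows(int x, int y, int w, int h,
                       const unsigned char *pixels, void *user)
{
    (void)x; (void)y; (void)w; (void)pixels;
    *(int *)user += h;
}
```

For example, calling `upload_in_strips` with 100 dirty rows and a strip height of 16 issues 7 uploads, the last covering the remaining 4 rows. The strip height that performs best, if any, would have to be measured on the affected drivers.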
