Saturday, June 19, 2010

More Thoughts for the Future

Just a few thoughts rather than many words.

Firstly, I'm not going to get lightmap updates for free with the current architecture. This was a bit ambitious of me, but I still believe that it was a worthwhile goal and I'm glad that I shot for it.

I know how to do animated lightstyles entirely on the GPU, at least for single-component traditional Quake white lighting (RGBA is possible as well but requires beefier hardware); that would give lightstyle updates more or less for free. I'm going to consider it for a future release, but it's not on the list for 1.8.5.
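
For illustration, the CPU side of that could be tiny; the real work happens at load time, baking each of a surface's (up to 4) style layers into its own lightmap channel. Something like this, with the constant register and all names invented purely for the sketch:

    #include <d3d9.h>

    #define MAX_LIGHTSTYLES 64

    /* 0..2 scale per style, evaluated each frame from the classic "a".."z"
       animation strings, the same way the software renderer did */
    float d_lightstyle[MAX_LIGHTSTYLES];

    void R_UploadLightStyles (IDirect3DDevice9 *device)
    {
        /* 64 floats pack into 16 float4 constant registers; c32 is an
           arbitrary choice for this sketch.  The pixel shader weights the
           lightmap's 4 channels by the surface's style intensities, so
           animating a style never touches the lightmap texture again. */
        IDirect3DDevice9_SetPixelShaderConstantF (device, 32, d_lightstyle, MAX_LIGHTSTYLES / 4);
    }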

Dynamic lights are trickier. The problem is that we don't have the information sufficiently far in advance, and the key inputs change on a frame-by-frame basis. I have some experimental thoughts that will work on any hardware, but they involve a trade-off between sending a new lightmap to the graphics hardware and accepting some extra overdraw. I think that the overdraw will be substantially cheaper in the common case, but I know that in the worst case (every visible surface modified by a large number of dynamic lights) it will be substantially more expensive.
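
To make that trade-off concrete, the per-surface decision could look something like this; the struct and the cost constants are purely illustrative, not measured numbers:

    /* invented relative costs: uploading one lightmap texel vs pushing one
       vertex through an extra additive blend pass */
    #define TEXEL_UPLOAD_COST   4
    #define OVERDRAW_VERT_COST  10

    typedef struct dlightsurf_s
    {
        int lightmap_texels;    /* width * height of the surface's lightmap rect */
        int num_verts;          /* verts resubmitted by one extra overdraw pass */
        int num_dlights;        /* dynamic lights touching it this frame */
    } dlightsurf_t;

    /* returns 1 to re-upload the lightmap, 0 to take the overdraw instead */
    int R_PreferLightmapUpload (const dlightsurf_t *surf)
    {
        int upload_cost   = surf->lightmap_texels * TEXEL_UPLOAD_COST;
        int overdraw_cost = surf->num_verts * surf->num_dlights * OVERDRAW_VERT_COST;

        /* in the worst case (many lights on every visible surface) the
           overdraw term explodes and we fall back to uploading */
        return (overdraw_cost > upload_cost);
    }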

There is another solution that requires shaders, which I might also consider.

Hardware instancing. I might do it for certain types of object. It needs shader model 3, and it's only a win when the per-instance data is small enough that submitting it costs less than just sending the full data directly (which is why it's not suitable for traditional Quake MDLs). Torches could certainly benefit from it though, as their animations are largely in lockstep, so the vertexes and texcoords are identical per instance, with just a matrix and a colour differing. Being client-side static entities means that there's nothing from the server that could upset things either.
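
In D3D9 terms that's stream-frequency instancing. For the torch case it would look roughly like this, with buffer creation, the vertex declaration and the shader assumed to exist elsewhere; none of these names are final code:

    #include <d3d9.h>

    typedef struct torchvert_s
    {
        float xyz[3];       /* identical for every torch in a given frame */
        float st[2];
    } torchvert_t;

    typedef struct torchinstance_s
    {
        float matrix[16];   /* per-torch world transform */
        float colour[4];    /* per-torch tint */
    } torchinstance_t;

    void R_DrawTorchesInstanced (IDirect3DDevice9 *device,
        IDirect3DVertexBuffer9 *geometry, IDirect3DVertexBuffer9 *instances,
        IDirect3DIndexBuffer9 *indexes, int numverts, int numtris, int numtorches)
    {
        /* stream 0: one torch's geometry, replicated once per instance */
        IDirect3DDevice9_SetStreamSourceFreq (device, 0, D3DSTREAMSOURCE_INDEXEDDATA | (UINT) numtorches);
        IDirect3DDevice9_SetStreamSource (device, 0, geometry, 0, sizeof (torchvert_t));

        /* stream 1: advances by one matrix + colour per instance */
        IDirect3DDevice9_SetStreamSourceFreq (device, 1, D3DSTREAMSOURCE_INSTANCEDATA | 1);
        IDirect3DDevice9_SetStreamSource (device, 1, instances, 0, sizeof (torchinstance_t));

        IDirect3DDevice9_SetIndices (device, indexes);
        IDirect3DDevice9_DrawIndexedPrimitive (device, D3DPT_TRIANGLELIST, 0, 0, numverts, 0, numtris);

        /* restore non-instanced rendering */
        IDirect3DDevice9_SetStreamSourceFreq (device, 0, 1);
        IDirect3DDevice9_SetStreamSourceFreq (device, 1, 1);
    }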

Particles would need a position and a colour, so the only saving per vertex is two floats (unless I find a clever way of reusing vertexes as texcoords); it's certainly doable, but is it worthwhile? I'll certainly be looking at putting some of the CPU work into a vertex shader for the general (no hardware instancing) case anyway.
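
Counting the bytes makes the point; this is roughly the per-corner vertex in question, with illustrative names:

    typedef struct partvert_s
    {
        float origin[3];      /* particle centre - repeated for all 4 corners */
        unsigned int colour;  /* packed RGBA     - repeated for all 4 corners */
        float corner[2];      /* (0,0) (1,0) (1,1) (0,1): serves as the
                                 texcoord AND selects the right/up billboard
                                 offset in the vertex shader, which is the
                                 "reusing vertexes as texcoords" trick */
    } partvert_t;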

Occlusion queries as a means of supplementing the PVS? Interesting idea, but you're doubling the raw xyz position submission overhead for the world model. In scenes with a high wpoly count, where everything submitted actually should be visible, it's going to be bad. One solution would be changing the renderer so that every second frame we submit occlusion queries but present nothing to the screen. Need to think.
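
The stock D3D9 occlusion query interface would carry that scheme. A rough sketch, assuming one query per leaf (a real version would budget these) and a hypothetical helper that draws the leaf's bounding box with colour writes disabled:

    #include <d3d9.h>

    #define MAX_MAP_LEAFS 8192

    static IDirect3DQuery9 *leaf_query[MAX_MAP_LEAFS];

    void R_DrawLeafBBox (int leafnum);   /* hypothetical: xyz-only bbox, colour writes off */

    /* on a query frame: wrap the bbox draw in a begin/end pair */
    void R_IssueLeafQuery (IDirect3DDevice9 *device, int leafnum)
    {
        if (!leaf_query[leafnum])
            IDirect3DDevice9_CreateQuery (device, D3DQUERYTYPE_OCCLUSION, &leaf_query[leafnum]);

        IDirect3DQuery9_Issue (leaf_query[leafnum], D3DISSUE_BEGIN);
        R_DrawLeafBBox (leafnum);
        IDirect3DQuery9_Issue (leaf_query[leafnum], D3DISSUE_END);
    }

    /* on the next frame: skip leafs whose bbox drew no pixels */
    int R_LeafOccluded (int leafnum)
    {
        DWORD pixels = 1;   /* treat a not-ready result as visible */

        if (leaf_query[leafnum])
            IDirect3DQuery9_GetData (leaf_query[leafnum], &pixels, sizeof (pixels), 0);

        return (pixels == 0);
    }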

Slow machine performance: I occasionally test on a VMWare session running Windows XP with VMWare's display driver. This is great for identifying potential bottlenecks in the code; it's how I identified that my console character drawing was a serious slowdown. I recommend it to everyone. Right now it's telling me that I've still got room for improvement in my handling of bigger scenes, possibly by reusing vertex buffer contents across multiple frames if things haven't changed enough to require a full rebuild of the world.
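
The reuse check itself could be cheap: key the world buffer on what went into it and only rebuild when the key changes. A sketch, with invented field names:

    #include <string.h>

    typedef struct worldkey_s
    {
        int viewleaf;       /* the PVS only changes when the view leaf does */
        int dlightframe;    /* last frame a dynamic light touched the world */
        int texchain_hash;  /* cheap hash over the visible surface chains */
    } worldkey_t;

    static worldkey_t last_worldkey;

    /* returns 1 if the vertex buffer must be rebuilt, 0 if last frame's
       contents can be reused as-is */
    int R_WorldBufferDirty (const worldkey_t *key)
    {
        if (memcmp (key, &last_worldkey, sizeof (last_worldkey)) != 0)
        {
            last_worldkey = *key;
            return 1;
        }

        return 0;
    }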

That's all for now; these are just some theoretical possibilities for the future. Some, none or all of them might happen, and those that do happen might be wildly different in implementation from how I've described above - things always change radically when you get down to writing code.
