Tuesday, January 6, 2009

Bug is firmly squished

It was the beams list all along that was somehow somewhere going crazy and trashing the alias model verts (and who knows what else in other maps). While I'm a bit worried that I haven't quite figured out what was going crazy about it, I'm quite happy to lay the matter to rest for the time being.

So I've taken beams, and for good measure temp entities as well, off the hunk (allocated per map) to a single global static array with fixed size. I don't really mind doing that, as these two were only ever used for lightning bolts and grappling hook ropes, which there will never be an extremely large amount of. In any event, I've bumped the maximum on them to 64 and 128 respectively, which should be sufficient to cover all normal and most abnormal situations.
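For illustration, a fixed-size static beam list might look something like the sketch below. This is not the engine's actual code: the `beam_t` fields, the `beam_alloc` helper and the slot reuse rule are all assumptions, and only the two limits (64 and 128) come from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BEAMS          64   /* limit quoted in the post */
#define MAX_TEMP_ENTITIES 128   /* limit quoted in the post; a temp entity
                                   array would follow the same pattern */

/* Hypothetical beam record; the real structure is not shown in the post. */
typedef struct beam_s {
    int   active;      /* non-zero once the slot has been handed out */
    float endtime;     /* time after which the slot may be reused */
    float start[3];
    float end[3];
} beam_t;

/* One global array that lives for the whole program, not per map. */
static beam_t cl_beams[MAX_BEAMS];

/* Return a free or expired slot, or NULL if all MAX_BEAMS are in use. */
static beam_t *beam_alloc(float now, float lifetime)
{
    int i;

    assert(lifetime > 0.0f);

    for (i = 0; i < MAX_BEAMS; i++) {
        beam_t *b = &cl_beams[i];

        if (!b->active || b->endtime < now) {
            b->active = 1;
            b->endtime = now + lifetime;
            return b;
        }
    }

    return NULL;   /* over the limit: the caller just drops the beam */
}
```

Because the array never moves and is never freed, nothing allocated per map can end up sharing memory with it, which is the point of taking it off the hunk.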

I've made a mental note to still investigate alternative means of removing the limits. A Hunk-allocated linked list for beams seems in order, and temp entities could maybe be taken from the top of the standard entities list (which is already an array of 8192 entity_t pointers).

Note that I said pointers there, so storage overhead for 8192 entities client-side is only 32K (at 4 bytes per pointer). The first 512 are allocated on the Hunk at map startup time, with anything beyond that allocated as required.
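A rough sketch of that pointer-table scheme follows. The names `cl_entities`, `entities_map_start` and `entity_get` are hypothetical, the `entity_t` here is a tiny stand-in for the real structure, and plain `calloc` stands in for the Hunk allocator; only the 8192 and 512 figures come from the post.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_CL_ENTITIES  8192   /* size of the pointer table */
#define BASE_CL_ENTITIES  512   /* entities allocated up front at map start */

/* Hypothetical stand-in for the engine's entity_t. */
typedef struct entity_s {
    float origin[3];
    int   model_index;
} entity_t;

/* The table itself holds pointers only: 8192 * sizeof(entity_t *). */
static entity_t *cl_entities[MAX_CL_ENTITIES];
static entity_t *base_block;

/* Allocate the first 512 entities in one block (calloc here, Hunk in the engine). */
static int entities_map_start(void)
{
    int i;

    assert(BASE_CL_ENTITIES <= MAX_CL_ENTITIES);

    if (base_block == NULL) {
        base_block = calloc(BASE_CL_ENTITIES, sizeof(entity_t));
        if (base_block == NULL)
            return 0;
    }

    for (i = 0; i < BASE_CL_ENTITIES; i++)
        cl_entities[i] = &base_block[i];

    return 1;
}

/* Fetch an entity by number, allocating it on first use if needed. */
static entity_t *entity_get(int num)
{
    if (num < 0 || num >= MAX_CL_ENTITIES)
        return NULL;

    if (cl_entities[num] == NULL)
        cl_entities[num] = calloc(1, sizeof(entity_t));

    return cl_entities[num];
}
```

With 4-byte pointers the table costs 8192 x 4 = 32768 bytes regardless of how many entities a map actually uses.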

I've decided that the render to texture warp update will very probably have to go. I'm not one bit happy with the FPS loss on the Intel 910, and the video RAM overhead is bordering on the extreme. It requires a target texture 4 x the size (in each dimension!) of the source (otherwise miplevel 1 of the source is used for creating miplevel 0 of the target - ugleeeeee city!).

This means that for 512 x 512 source textures we have a 21 MB video RAM overhead per texture! The start map will eat 84 MB video RAM! I tried to ease this off by using 16 bit render targets, but it still works out as 42 MB for start. I'm considering that to be quite unacceptable.
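The arithmetic can be checked with a few lines of C. This sketch assumes the 21 MB figure includes a full mip chain on the target (a bare 2048 x 2048 32-bit surface is 16 MB on its own) and that the 84 MB for start corresponds to four such textures; both helper names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes for a square texture with a full mip chain down to 1 x 1.
   size is the edge length in texels (assumed to be a power of two);
   bytes_per_texel is 4 for 32-bit targets and 2 for 16-bit ones. */
static size_t mipchain_bytes(size_t size, size_t bytes_per_texel)
{
    size_t total = 0;

    assert(size > 0 && bytes_per_texel > 0);

    for (;;) {
        total += size * size * bytes_per_texel;
        if (size == 1)
            break;
        size /= 2;
    }

    return total;
}

/* The warp render target is 4 x the source in each dimension. */
static size_t warp_target_bytes(size_t source_size, size_t bytes_per_texel)
{
    return mipchain_bytes(source_size * 4, bytes_per_texel);
}
```

For a 512 x 512 source, `warp_target_bytes(512, 4)` gives 22,369,620 bytes, or about 21.3 MB, and `warp_target_bytes(512, 2)` gives half of that. Four such targets land in the mid 80s of MB at 32 bit and the low 40s at 16 bit, in line with the figures above.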

Now that I know the Intel 910 is a respectable enough Pixel Shader 2.0 performer, doing the warp in a pixel shader seems an option worth exploring. The 910 can run RenderMonkey's DoF sample with 9 passes and 120,000 triangles at 10 FPS, and very easily hits 250 FPS for more typical Quake-ey effects (one or two passes, triangle counts in the order of 10,000 or so). I am worried though that I might be locking out owners of older GeForces here - even a GeForce 4 is limited to Pixel Shader 1.3.

Another option is to revert to surface subdivision and use a vertex shader to evaluate texcoords from the vertexes, which will work even on cards that don't support vertex shaders in hardware, as Direct 3D can emulate them in software. It's fast too - that 250 FPS quoted above is with a software emulated vertex shader.

Of course this doesn't mean that the engine speeds will suddenly increase by a factor of 3 or so; there's a lot more going on in Quake aside from the render!
