I've been cleaning up and generalising my brushmodel code and now have it at a stage where the only real difference between brushmodels and the world is visible surface determination. I have a really nice single function that is used to handle all brush model rendering, and the code is clean, elegant, and makes me happy.
Texture chains are gone. They are only used for an initial sort by texture at load time now (and I'm hoping to get rid of even that). Instead there's a linked list of triangle list definitions attached to each brush model (allocated in texture order), with each surf possessing its own pointer to the triangle list that contains its verts (the old glpoly_t is almost gone too). An array of indexes into this list for the visible surfaces is generated at runtime, then each entire list is blasted down to the GPU (as indexed primitives, using the array). I can draw potentially hundreds of surfaces (in practice it's more like tens, but you get the picture) in a single call using this method, which is friendlier to the API and friendlier to (reasonably) modern GPUs.
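To make the index array idea a bit more concrete, here's a rough sketch in C. It is not the actual DirectQ code: `surf_t`, `batch_t`, `batch_add_surf` and `build_batch` are names made up for illustration, and it assumes each surf is a convex polygon whose verts sit contiguously in its triangle list's vertex array. Each visible surf is expanded from a fan into plain triangle indexes, and the finished array is what would then go to the GPU in one indexed draw (with Direct3D 9 that would presumably be something like a single `DrawIndexedPrimitive` call per list).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BATCH_INDEXES 4096 /* arbitrary cap for this sketch */

/* Hypothetical stand-in for a surface: a convex polygon whose verts
   live contiguously in the owning triangle list's vertex array. */
typedef struct {
    unsigned short firstvert; /* offset of the polygon's first vert */
    unsigned short numverts;  /* vert count, must be at least 3 */
    int visible;              /* set by visible surface determination */
} surf_t;

/* Hypothetical per-list index array, rebuilt at runtime. */
typedef struct {
    unsigned short indexes[MAX_BATCH_INDEXES];
    size_t numindexes;
} batch_t;

/* Append one polygon, expanding its fan into a plain triangle list.
   Returns 0 if the batch is full (a real renderer would flush it
   and start a new one), 1 otherwise. */
static int batch_add_surf(batch_t *b, const surf_t *s)
{
    assert(s->numverts >= 3);
    size_t needed = (size_t)(s->numverts - 2) * 3;
    if (b->numindexes + needed > MAX_BATCH_INDEXES)
        return 0;
    for (unsigned short i = 2; i < s->numverts; i++) {
        b->indexes[b->numindexes++] = s->firstvert;
        b->indexes[b->numindexes++] = (unsigned short)(s->firstvert + i - 1);
        b->indexes[b->numindexes++] = (unsigned short)(s->firstvert + i);
    }
    return 1;
}

/* Rebuild the index array from the visible surfs only and return the
   index count; the caller would submit the whole array in one draw. */
static size_t build_batch(batch_t *b, const surf_t *surfs, size_t numsurfs)
{
    b->numindexes = 0;
    for (size_t i = 0; i < numsurfs; i++) {
        if (surfs[i].visible)
            batch_add_surf(b, &surfs[i]);
    }
    return b->numindexes;
}
```

With a visible quad, a hidden triangle and a visible pentagon in the same list, `build_batch` produces 6 + 9 = 15 indexes, and those two visible surfs would go down in one call rather than two.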
r_speeds counts are totally meaningless here. Even the busiest scenes tend to top out at a wpoly count of about 60, which equates to the number of DrawPrimitive calls used (the equivalent of a glBegin/glEnd pair). The long and winding road to the castle in Marcher is the record setter with just under 100 (this can still slow down on slower/older GPUs, but that's due to the hardware and not code inefficiency). epoly counts are strictly 1 per alias model as each alias model is drawn with a single DrawPrimitive (it will be 2 when I implement the 1 TMU path for these).
I decided to stop being paranoid about batching textures up according to whether they have fullbrights or not, and it's paid off. In practice the better drawing efficiency far outweighs the overhead from state and texture changes, so on balance it's a speed increase. It could go faster if I did batch them, but only slightly so, and it would be at the expense of adding complexities to the code, which would cause trouble for future maintainability.
One compromise I've had to make has been with lightmaps. Better drawing efficiency can be had by having as many surfaces as possible sharing both the same texture and the same lightmap (bigger batches, fewer texture changes, and so on). For previous releases of DirectQ I'd found that the optimal lightmap size was 64: smaller and you're uploading too many lightmaps at runtime, larger and you're uploading huge unchanged chunks of a lightmap for a handful of dynamic surfaces. To get better batching I've had to increase this to 256. This means that dynamic lights have a bigger speed impact than I'd like, so I'm considering how to work around this. Obviously we don't know enough info in advance to do something like alloc lightmaps big enough or small enough for each texture batch, so clever trickery will be required (if it even happens at all).
One idea is to just upload the lightmap for each surf as it changes. A second is to use tall but narrow lightmaps, as lightmap width is a major factor here. I think a combination of these will get results, but we'll see.
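Some back-of-envelope arithmetic shows why these two ideas look promising. The sketch below just counts texels uploaded under three strategies. The names are hypothetical, and the "rows" strategy assumes a dirty region is uploaded as full-width rows of the lightmap (a common approach, but an assumption on my part here), which is what would make width matter so much.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dirty region inside a lightmap, in texels. */
typedef struct { int x, y, w, h; } rect_t;

/* Strategy A: re-upload the whole lightmap whenever anything changes. */
static size_t upload_whole(int lm_width, int lm_height)
{
    return (size_t)lm_width * (size_t)lm_height;
}

/* Strategy B (assumed): upload full-width rows spanning the dirty
   region, so the cost scales with lightmap width. */
static size_t upload_rows(int lm_width, rect_t dirty)
{
    assert(dirty.w <= lm_width && dirty.h >= 0);
    return (size_t)lm_width * (size_t)dirty.h;
}

/* Strategy C: upload only the changed surf's own rectangle. */
static size_t upload_rect(rect_t dirty)
{
    assert(dirty.w >= 0 && dirty.h >= 0);
    return (size_t)dirty.w * (size_t)dirty.h;
}
```

For a single 16x16 dynamic surf (an illustrative size), a whole 256x256 lightmap is 65536 texels, full-width rows of a 256-wide lightmap come to 4096, the same rows in a hypothetical 64-wide tall lightmap come to 1024, and the surf's own rectangle is just 256.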
Till next time.
Tuesday, December 29, 2009
More fun with Brushmodels
Posted by mhquake at 1:18 PM