Here's one that I find quite interesting...
I've been considering the relative performance merits of triangle lists vs strips vs fans, with indexing or without. There's a lot of literature on the topic, much of it contradictory, and much of it fails to take account of real-world usage scenarios.
The idea is to submit as many vertexes in a single API call as possible; a figure of about 300 is mentioned in some of the docs. Problem is - and maybe it's just me - I fail to see how that's achievable.
First up, a fan or strip is a discrete unit: you can't submit more than one in a single call. You can go a long way with strips, but you'll eventually hit the boundary.
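For what it's worth, one way around the one-fan-per-call limit is to give up the fan as a primitive and describe the same triangles with list indices, so several polygons can share one indexed call. This is only a sketch of that idea, not something I've measured; `fan_to_list` and its parameters are made-up names, and it assumes 16-bit indices into a shared vertex buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write triangle-list indices for one fan.
   first_vert is the fan's first vertex in the shared vertex buffer,
   num_verts is the number of vertexes in the fan (at least 3).
   Returns the number of indices written, which is 3 * (num_verts - 2). */
size_t fan_to_list(unsigned short *out, unsigned short first_vert, int num_verts)
{
    size_t n = 0;

    assert(num_verts >= 3);

    for (int i = 1; i < num_verts - 1; i++) {
        out[n++] = first_vert;                           /* fan centre */
        out[n++] = (unsigned short)(first_vert + i);     /* previous rim vertex */
        out[n++] = (unsigned short)(first_vert + i + 1); /* next rim vertex */
    }
    return n;
}
```

Calling it once per polygon and advancing the output pointer by the return value each time builds up a single index array covering many fans. The price is an index buffer and 3 indices per triangle instead of roughly 1 vertex per triangle.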
Secondly, texture changes mean that you'll have to stop what you're doing and resubmit at some point. The only way I can see around this is to cache a number of textures in spare TMUs and toggle them in a shader, but I haven't done any tests to see how efficient that would be.
Thirdly, there is the small matter of PVS. The primitives you're submitting aren't necessarily going to be contiguous in your VBO. Best case is you sort by leaf and you might get 4 to 5 good blocks, but you run the risk of breaking texture sorting. Alternatively you keep a dynamic VBO and stream into that from system memory in the correct order, but per-frame dynamic VBOs really defeat the purpose of having a VBO in the first place, don't they? Did I mention that you might also break back-to-front order (already partially broken by texture sorting)? It gets worse.
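To get a feel for how badly PVS and texture changes fragment the batches, something like the following could be run over a frame's visible surfaces. It's a rough sketch under assumptions of my own: surfaces are stored as triangle lists (so adjacent ranges can legally be merged), `vis_surf_t` and `count_draw_calls` are hypothetical names, and a new call is needed whenever the texture changes or the vertex range stops being contiguous.

```c
#include <assert.h>

/* Hypothetical record for one visible surface. */
typedef struct {
    int first_vert; /* offset of the surface's first vertex in the static VBO */
    int num_verts;  /* number of vertexes the surface occupies */
    int texture;    /* made-up texture id */
} vis_surf_t;

/* Count the draw calls needed when one call can cover exactly one
   contiguous VBO range drawn with one texture. The surfaces must
   already be in the order they would be submitted. */
int count_draw_calls(const vis_surf_t *surfs, int num_surfs)
{
    int calls = 0;

    assert(num_surfs >= 0);

    for (int i = 0; i < num_surfs; i++) {
        if (i > 0 &&
            surfs[i].texture == surfs[i - 1].texture &&
            surfs[i].first_vert == surfs[i - 1].first_vert + surfs[i - 1].num_verts)
            continue; /* extends the previous range, no new call */
        calls++;
    }
    return calls;
}
```

Comparing the result against the raw surface count for a few typical views would show whether sorting by leaf really does yield only a handful of good blocks.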
Let's look at indexing. For a given world surf in classic Quake there are 7 floats per vert - 3 xyz, 2 st for diffuse and 2 st for lightmap. For the majority of cases the only reuse you're going to get out of that is in the xyz, which saves a grand total of 8 bytes (12 less 4 for the vertex index). In extreme high poly scenes those 8 bytes will be precious, but in Quake? I ain't so certain. Vertex caching performance may make indexing desirable despite that, but again I'm coming back to low poly. And again, how much data is going to be cacheable? Not much.
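Putting rough numbers on it: with a conventional single index per vertex, a vert is only reusable when all 7 floats match, not just the xyz. The sketch below compares storage for the two layouts. The function names are mine, and it assumes 4-byte floats and either 16-bit or 32-bit indices.

```c
#include <assert.h>
#include <stddef.h>

enum { FLOATS_PER_VERT = 7 }; /* 3 xyz + 2 diffuse st + 2 lightmap st */

/* Bytes needed when every vertex reference carries a full vertex. */
size_t bytes_unindexed(size_t vert_refs)
{
    return vert_refs * FLOATS_PER_VERT * sizeof(float);
}

/* Bytes needed when unique vertexes are stored once and each
   reference is an index of index_size bytes (2 or 4). */
size_t bytes_indexed(size_t unique_verts, size_t vert_refs, size_t index_size)
{
    assert(index_size == 2 || index_size == 4);
    return unique_verts * FLOATS_PER_VERT * sizeof(float)
         + vert_refs * index_size;
}
```

A quad drawn as two list triangles has 6 references to 4 unique verts: 168 bytes unindexed against 124 bytes with 16-bit indices. If the texcoords differ so that nothing can be shared, the indexed version comes out at 180 bytes, which is worse than not indexing at all.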
Fourthly there's the expense of setting up all the infrastructure required to support it. Indexing for certain types of data looks like O(n²) to me... not nice. Ideally your build tools would do it all for you, but once again this is Quake, not la-la-idealistic-hippy-land.
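The O(n²) I have in mind is the obvious way of building an index buffer at load time: for each incoming vert, search everything stored so far for a match. A minimal sketch, with `world_vert_t` and `build_indices` as illustrative names rather than anything from the engine:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical 7-float world vertex: position, diffuse st, lightmap st. */
typedef struct {
    float xyz[3];
    float st[2];
    float lm[2];
} world_vert_t;

/* Naive de-duplication. Each input vertex is compared against every
   unique vertex found so far, so the worst case is O(n^2) comparisons.
   unique must have room for num_in entries; indices[i] receives the
   slot of in[i]. Returns the number of unique vertexes.
   The comparison is bitwise, so 0.0f and -0.0f count as different. */
int build_indices(const world_vert_t *in, int num_in,
                  world_vert_t *unique, unsigned short *indices)
{
    int num_unique = 0;

    assert(num_in >= 0 && num_in <= 65535);

    for (int i = 0; i < num_in; i++) {
        int j;

        for (j = 0; j < num_unique; j++)
            if (memcmp(&in[i], &unique[j], sizeof(world_vert_t)) == 0)
                break;

        if (j == num_unique)
            unique[num_unique++] = in[i]; /* first time we've seen this one */

        indices[i] = (unsigned short)j;
    }
    return num_unique;
}
```

Hashing the vertex data instead of scanning linearly should bring the typical cost down a lot, but that's more load-time infrastructure again.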
Despite all that I can definitely see cases where certain techniques are advantageous. Strips are faster than fans as the driver can decompose them into individual triangles more easily (reuse the last 2 verts and tack on the new one), and there may be certain cases where what's currently a trifan could be represented as a strip instead. Alias models are another good one: there's heavy reuse potential across both vertexes and texcoords, PVS ain't an issue, and texture changing ain't an issue. Interpolation might be, which is why I currently run it in a vertex shader - just submit the 2 sets of xyz and the blend factors and let the GPU do the work. It's a heavier submission, but indexing will have greater potential there. Lighting is the big one, and is why I still use DrawPrimitiveUP with alias models - the lighting potentially changes for every vertex each frame. A caching system may be viable at 10 FPS updates, but interpolation will break that.
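The "reuse the last 2 verts and tack on the new one" decomposition looks roughly like this when written out by hand. It's a sketch only; `strip_to_list` is a made-up name, and the swap on odd triangles follows the usual convention for keeping the winding order consistent across the strip.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: expand one triangle strip into list indices.
   Triangle i uses strip vertexes i, i+1 and i+2; odd-numbered
   triangles swap their first two vertexes so every triangle keeps
   the same winding. Returns the index count, 3 * (num_verts - 2). */
size_t strip_to_list(unsigned short *out, unsigned short first_vert, int num_verts)
{
    size_t n = 0;

    assert(num_verts >= 3);

    for (int i = 0; i < num_verts - 2; i++) {
        unsigned short a = (unsigned short)(first_vert + i);
        unsigned short b = (unsigned short)(first_vert + i + 1);
        unsigned short c = (unsigned short)(first_vert + i + 2);

        if (i & 1) {
            out[n++] = b; out[n++] = a; out[n++] = c; /* flipped */
        } else {
            out[n++] = a; out[n++] = b; out[n++] = c;
        }
    }
    return n;
}
```

Each new triangle costs one fresh vertex plus a possible swap, which is about as cheap as decomposition gets.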
The solution to alias models probably looks like (1) store verts, texcoords and lightnormals in a vertex buffer (thinks - lightnormals are shared for all alias models so could probably be encoded in a texture...), (2) store vert indexes, texcoord indexes and lightnormal indexes in an index buffer, (3) submit the indexes together with the shadelight values (which can be a one time submission per entity) and the interpolation blend factors in a single DrawPrimitive call per entity, and (4) run all the interpolation code currently in GL_DrawAliasModel in a vertex shader.
Mmmmm - this would definitely crack the problem of alias models being something of a bottleneck.
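For reference, the per-vertex work that step (4) would move onto the GPU is just a linear blend between the two frames' positions. Here it is as a plain C sketch of the maths rather than actual shader code; `pos3_t` and `lerp_vert` are illustrative names, and it assumes a blend factor of 0 means the old frame and 1 means the new frame.

```c
#include <assert.h>

/* Hypothetical position type for the sketch. */
typedef struct {
    float x, y, z;
} pos3_t;

/* Blend one vertex between two animation frames.
   blend = 0 gives old_pos, blend = 1 gives new_pos. */
pos3_t lerp_vert(pos3_t old_pos, pos3_t new_pos, float blend)
{
    pos3_t out;

    assert(blend >= 0.0f && blend <= 1.0f);

    out.x = old_pos.x + (new_pos.x - old_pos.x) * blend;
    out.y = old_pos.y + (new_pos.y - old_pos.y) * blend;
    out.z = old_pos.z + (new_pos.z - old_pos.z) * blend;
    return out;
}
```

Since the blend factor is the same for every vertex of an entity in a given frame, it only needs submitting once per entity, same as the shadelight values.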
Triangle fans, lists and strips
Posted by mhquake on Tuesday, January 13, 2009 at 8:20 PM