It's rare enough that I look at the main news page of Inside3D, but today I did, and saw a news item/review on DirectQ 1.7.2 there (thanks folks!). One point of interest to me was a note that the engine was quite slow on the bigass1 demo. Now, I can't resist a challenge like this, so I downloaded the demo and played it, and yup - really slow, with lots of hitching and stalling.
So on a hunch I went to my player skin translation routine and stuck in a Con_Printf to see if that was the culprit, and yes, it seems that DirectQ is running a new skin translation on every respawn (rather than only when the skin actually changes). I don't know if this is inherited from original GLQuake, but I more than half suspect that it is (the original comment in the ID code about doing it fast instead of sending it through GL_Upload8 is the giveaway here).
As a further test I put a return at the start of this function and reran the demo, which confirmed it: things were nice and smooth, with no hitching or stalls.
This has now been fixed by a combination of caching, some memory optimizations and a few other small things. The difference is like night and day; sweet.
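As a rough illustration of the caching idea, here is a minimal sketch. It is not the actual DirectQ code: `skin_cache_t`, `translate_player_skin` and `rebuild_translated_skin` are names made up for the example, and the expensive remap-and-upload step is replaced by a counter.

```c
#define MAX_CLIENTS 16

/* Hypothetical per-player cache entry; not DirectQ's real structure. */
typedef struct {
    int valid;        /* has this slot been translated at least once? */
    int skinnum;      /* skin the cached translation was built from */
    int top, bottom;  /* shirt and pants colours it was built with */
} skin_cache_t;

skin_cache_t skin_cache[MAX_CLIENTS];
int translations_done;  /* counts how often the expensive path runs */

/* Stand-in for the expensive work (palette remap plus texture upload). */
void rebuild_translated_skin(int playernum, int skinnum, int top, int bottom)
{
    (void)playernum; (void)skinnum; (void)top; (void)bottom;
    translations_done++;
}

/* Safe to call on every respawn: it only rebuilds when the skin or the
   colours differ from what is cached for this player.
   Returns 1 if a rebuild happened, 0 if the cached skin was reused. */
int translate_player_skin(int playernum, int skinnum, int top, int bottom)
{
    skin_cache_t *c;

    if (playernum < 0 || playernum >= MAX_CLIENTS)
        return 0;

    c = &skin_cache[playernum];
    if (c->valid && c->skinnum == skinnum && c->top == top && c->bottom == bottom)
        return 0;  /* nothing changed, so skip the translation */

    rebuild_translated_skin(playernum, skinnum, top, bottom);
    c->valid = 1;
    c->skinnum = skinnum;
    c->top = top;
    c->bottom = bottom;
    return 1;
}
```

With something like this in place a respawn with unchanged colours costs a few integer compares instead of a full translation.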
For the record, the original Inside3D item that prompted this can be viewed here.
Once again, thanks are due to Inside3D in general and scar3crow in particular for this one, without which a fairly evil performance bug would have continued into future releases. I've said it before but I'll say it again: I love being notified of issues like this as it draws my attention to areas of the engine I need to do better on.
Saturday, October 17, 2009
Inside3D Review/News Item and consequent bugfix
Posted by mhquake at 1:53 PM
3 comments:
As it is, that's quite a nice review. We followers don't need any introduction to the engine, but well-played, as they say.
I played a few levels of the first expansion pack this afternoon, and this engine still keeps rocking.
Yes, that is a legacy from glquake, just like the lack of colormapping on bodies and other non-client entities.
I rewrote the entire colormap handling in darkplaces just to fix bigass1 performance - darkplaces takes the 8bit skin and uploads it using 5 different palettes (skipping ones that produce a blank image), specifically these:
merged - fullbrights removed
base - fullbrights, pants, shirt removed
pants - all but pants removed
shirt - all but shirt removed
glow - all but fullbrights removed
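A minimal sketch of how an 8-bit skin could be split into the base, pants, shirt and glow layers listed above (the merged layer is left out for brevity). This is not darkplaces code. It assumes the standard Quake palette layout, with shirt colours at indices 16-31, pants colours at 96-111 and fullbrights at 224-255, and it assumes index 0 can stand in for "removed"; `layer_for_index` and `split_skin` are hypothetical names.

```c
#include <stddef.h>

enum { LAYER_BASE, LAYER_PANTS, LAYER_SHIRT, LAYER_GLOW, NUM_LAYERS };

/* Decide which layer a palette index belongs to (standard Quake palette). */
int layer_for_index(unsigned char idx)
{
    if (idx >= 16 && idx < 32)  return LAYER_SHIRT;  /* top colour row */
    if (idx >= 96 && idx < 112) return LAYER_PANTS;  /* bottom colour row */
    if (idx >= 224)             return LAYER_GLOW;   /* fullbrights */
    return LAYER_BASE;
}

/* Copy each pixel into exactly one layer and write 0 into the others.
   used[] receives the pixel count per layer, so a caller can skip
   uploading any layer that came out blank. */
void split_skin(const unsigned char *skin, size_t count,
                unsigned char *layers[NUM_LAYERS], size_t used[NUM_LAYERS])
{
    size_t i;
    int l;

    for (l = 0; l < NUM_LAYERS; l++)
        used[l] = 0;

    for (i = 0; i < count; i++) {
        for (l = 0; l < NUM_LAYERS; l++)
            layers[l][i] = 0;  /* assumption: 0 means "removed" */
        l = layer_for_index(skin[i]);
        layers[l][i] = skin[i];
        used[l]++;
    }
}
```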
Then the rendering of the models simply checks if ent->colormap is > 0 and looks up the colors and renders base*light+pants*bottomcolor*light+shirt*topcolor*light+glow
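Written out per texel on the CPU, that blend looks roughly like the sketch below. It is only an illustration of the formula, not darkplaces code: `rgb_t` and `shade_texel` are made-up names, colours are assumed to be floats in the 0..1 range, and no clamping of the result is modelled.

```c
typedef struct { float r, g, b; } rgb_t;

/* One channel of: base*light + pants*bottomcolor*light + shirt*topcolor*light + glow */
static float shade_channel(float base, float pants, float shirt, float glow,
                           float bottomcolor, float topcolor, float light)
{
    return base * light
         + pants * bottomcolor * light
         + shirt * topcolor * light
         + glow;  /* glow is added unlit, so fullbrights ignore lighting */
}

/* Combine the four layer texels with the player's colours and the light. */
rgb_t shade_texel(rgb_t base, rgb_t pants, rgb_t shirt, rgb_t glow,
                  rgb_t bottomcolor, rgb_t topcolor, rgb_t light)
{
    rgb_t out;
    out.r = shade_channel(base.r, pants.r, shirt.r, glow.r,
                          bottomcolor.r, topcolor.r, light.r);
    out.g = shade_channel(base.g, pants.g, shirt.g, glow.g,
                          bottomcolor.g, topcolor.g, light.g);
    out.b = shade_channel(base.b, pants.b, shirt.b, glow.b,
                          bottomcolor.b, topcolor.b, light.b);
    return out;
}
```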
In a shader this is a single pass; in fixed function it really is 4 passes per player model, but with no hitching.
Technically on DX8-class cards you can do all 4 layers in one pass with a pixel shader.
The major drawback is vram usage: a player skin expands from 57424 bytes (296x194x8bit) to 699052 bytes (512x256x32bit, a figure that matches the full mipmap chain), and you then multiply that by 5 for all the layers. The bigger problem than player skins is monster skins, though. They tend to use pants/shirt colors in a non-colormapped way (which is why the merged texture exists), and you don't want to upload any of those textures if you can avoid it, so deferred upload is worth it (this is actually the only reason darkplaces supports deferred uploads).
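To check the numbers: 296x194 at one byte per pixel is 57424 bytes, and a 512x256 texture at four bytes per texel is 524288 bytes for the top level alone. The 699052 figure is what you get once every mipmap level down to 1x1 is added in; that reading is an inference from the arithmetic, not something stated in the comment. A small sketch, with `mip_chain_bytes` as a made-up helper name:

```c
#include <stddef.h>

/* Total bytes for a texture plus all its mipmap levels down to 1x1,
   halving each dimension per level and never going below 1. */
size_t mip_chain_bytes(unsigned w, unsigned h, unsigned bytes_per_texel)
{
    size_t total = 0;

    if (w == 0 || h == 0)
        return 0;  /* an empty texture has no levels */

    for (;;) {
        total += (size_t)w * h * bytes_per_texel;
        if (w == 1 && h == 1)
            break;
        if (w > 1) w /= 2;
        if (h > 1) h /= 2;
    }
    return total;
}
```

Calling `mip_chain_bytes(512, 256, 4)` gives 699052, and five such layers come to 3495260 bytes per player skin.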
Yeah, pretty much as I thought. I was planning along similar lines, but the vram usage really scared me off. As it is, the caching system I have now will hitch and stall a little at the start of the demo, but as things pick up it's quite performant. I'm probably going to dive in a little further and see how much of this I can preload (if any).