Wednesday, March 2, 2011

DirectQ Update - 2nd March 2011

gl_texturemode is now saved to your config file. I've also fixed a bug in it where GL_NEAREST was always selected irrespective of what you asked for. I guess that most people who used it were using it to select GL_NEAREST anyway, so it may have slipped through the cracks. This only affected gl_texturemode when used from the console; the corresponding video menu options always worked right.
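For anyone curious what a name-to-filter lookup of this kind can look like, here is a minimal sketch. It is not DirectQ's actual code and it does not reproduce the bug described above; the `Filter` enum, `TextureMode` struct and `FindTextureMode` function are hypothetical names, and a real D3D9 renderer would map to values such as D3DTEXF_POINT and D3DTEXF_LINEAR instead.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical filter values standing in for the renderer's real ones.
enum class Filter { None, Point, Linear };

// One row per gl_texturemode name: the filter used within a mip level,
// and the filter used between mip levels (None = no mipmapping).
struct TextureMode {
    const char *name;
    Filter minmag;
    Filter mip;
};

static const TextureMode kTextureModes[] = {
    {"GL_NEAREST",                Filter::Point,  Filter::None},
    {"GL_LINEAR",                 Filter::Linear, Filter::None},
    {"GL_NEAREST_MIPMAP_NEAREST", Filter::Point,  Filter::Point},
    {"GL_LINEAR_MIPMAP_NEAREST",  Filter::Linear, Filter::Point},
    {"GL_NEAREST_MIPMAP_LINEAR",  Filter::Point,  Filter::Linear},
    {"GL_LINEAR_MIPMAP_LINEAR",   Filter::Linear, Filter::Linear},
};

// Case-insensitive comparison, since console input is rarely typed in caps.
static bool EqualsIgnoreCase(const std::string &a, const std::string &b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); i++) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i]))) return false;
    }
    return true;
}

// Returns the matching table entry, or nullptr if the name is unknown.
// Returning nullptr (rather than silently falling back to the first entry)
// keeps a failed match from quietly turning into GL_NEAREST.
const TextureMode *FindTextureMode(const std::string &name) {
    for (const TextureMode &mode : kTextureModes) {
        if (EqualsIgnoreCase(name, mode.name)) return &mode;
    }
    return nullptr;
}
```

With this table, `FindTextureMode("gl_linear_mipmap_linear")` returns the trilinear entry, and an unrecognised name returns nullptr so the caller can print an error and leave the current mode alone.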

This command has tab autocompletion enabled on it, by the way, just in case anyone hasn't noticed. I've always found GL_NEAREST_MIPMAP_NEAREST (my preferred mode for when I get a hankering for crunchy pixels) a little too long-winded to type, so I did it a few versions back.
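Tab completion for a command argument boils down to prefix matching against a list of known values. The following is a rough sketch of that idea only, not the engine's implementation; `TextureModeNames` and `CompleteArgument` are hypothetical names made up for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// The six standard texture mode names that gl_texturemode accepts.
std::vector<std::string> TextureModeNames() {
    return {
        "GL_NEAREST",
        "GL_LINEAR",
        "GL_NEAREST_MIPMAP_NEAREST",
        "GL_LINEAR_MIPMAP_NEAREST",
        "GL_NEAREST_MIPMAP_LINEAR",
        "GL_LINEAR_MIPMAP_LINEAR",
    };
}

// Returns every candidate that starts with the typed prefix, ignoring case.
// An empty prefix matches everything.
std::vector<std::string> CompleteArgument(const std::string &prefix,
                                          const std::vector<std::string> &candidates) {
    std::vector<std::string> matches;
    for (const std::string &candidate : candidates) {
        if (candidate.size() < prefix.size()) continue;
        bool matched = true;
        for (std::size_t i = 0; i < prefix.size(); i++) {
            if (std::tolower(static_cast<unsigned char>(candidate[i])) !=
                std::tolower(static_cast<unsigned char>(prefix[i]))) {
                matched = false;
                break;
            }
        }
        if (matched) matches.push_back(candidate);
    }
    return matches;
}
```

Typing `gl_nearest_m` and hitting tab would narrow the list to the two GL_NEAREST_MIPMAP modes; a console would then either cycle through them or complete as far as their common prefix.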

The D3D effects framework is proving to be a minor pain, and something of a performance bottleneck. Warning - technobabble follows. The primary problem is with the ID3DXBaseEffect::SetTexture method, which does an AddRef and Release on your IDirect3DTexture9. Normally that's OK by the rules of COM (and highly recommended if you're copying a pointer, because otherwise there is a risk that something else may Release the object before you use the copy), but in this case the AddRef and Release don't actually accomplish anything, as there are no D3D calls between them. To compound the misery, the AddRef call on IDirect3DTexture9 is fucking expensive. Taken together, these AddRef and Release calls account for almost 3 times the CPU time that D3D is spending on everything else combined.

Obviously something is happening here that goes beyond simply incrementing or decrementing a reference counter, and I suspect based on this evidence that MS have coded some additional logic into the AddRef method for IDirect3DTexture9. Needless to say I'm exploring options for working around this shoddiness.
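One generic way of dodging a per-call cost like this is to filter out redundant state changes before they ever reach the expensive method. The sketch below shows that technique only; it is not necessarily the workaround DirectQ uses. `MockTexture` and `MockEffect` are stand-ins invented for the example (the real ID3DXBaseEffect::SetTexture also takes a parameter handle), and `TextureStateFilter` and `AddRefCallsFor` are hypothetical names.

```cpp
// Stand-in for IDirect3DTexture9 that counts how often AddRef is hit.
struct MockTexture {
    unsigned long refs = 1;
    unsigned long addRefCalls = 0;
    unsigned long AddRef() { addRefCalls++; return ++refs; }
    unsigned long Release() { return --refs; }
};

// Stand-in for the effect: every SetTexture does an AddRef/Release pair,
// mimicking the behaviour described above.
struct MockEffect {
    MockTexture *bound = nullptr;
    void SetTexture(MockTexture *tex) {
        if (tex) { tex->AddRef(); tex->Release(); }
        bound = tex;
    }
};

// Remembers the last texture that was set and skips the call entirely
// when the same texture is requested again.
class TextureStateFilter {
public:
    explicit TextureStateFilter(MockEffect &effect) : effect_(effect) {}

    // Returns true if the call was forwarded to the effect.
    bool SetTexture(MockTexture *tex) {
        if (hasCurrent_ && tex == current_) return false;
        effect_.SetTexture(tex);
        current_ = tex;
        hasCurrent_ = true;
        return true;
    }

private:
    MockEffect &effect_;
    MockTexture *current_ = nullptr;
    bool hasCurrent_ = false;
};

// Simulates `draws` consecutive draw calls that all use the same texture,
// and reports how many AddRef calls the texture received.
unsigned long AddRefCallsFor(bool filtered, int draws) {
    MockTexture texture;
    MockEffect effect;
    TextureStateFilter filter(effect);
    for (int i = 0; i < draws; i++) {
        if (filtered) filter.SetTexture(&texture);
        else effect.SetTexture(&texture);
    }
    return texture.addRefCalls;
}
```

In this mock, 100 draws with the same texture cost 100 AddRef calls unfiltered and only 1 with the filter in place. How much that buys in a real renderer depends on how often consecutive draws actually share a texture.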

______________________________


Update: I've resolved the texturing problem, so that CPU overhead is now gone. Digging into PIX, it seems there's a lot more evil stuff going on behind the scenes with the effects framework, so over time I'll be cleaning that out too. It's a pity, because the framework is really handy for compiling and managing shaders, but in use it's a strange black box where you never really know what's happening. This confirms my initial observations on it from about 2 years ago, when I noted that framerates had dropped a few percent compared to raw shaders.

I'm playing with the idea of putting mouse updates into a separate thread. The real problem here is that Quake's default 72 FPS framerate just doesn't match typical mouse polling rates, which hurts overall smoothness.
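To put some numbers on the mismatch, the sketch below counts how many mouse reports land inside each rendered frame. It assumes an idealised mouse that reports at perfectly regular intervals starting at time zero, and the 125 Hz figure used in the example is an assumption (a common USB default), not something measured here. `ReportsPerFrame` is a hypothetical helper written for this illustration.

```cpp
#include <vector>

// Counts mouse reports falling inside each frame, for a mouse polled at
// pollHz and a game running at fps. Report k arrives at time k / pollHz;
// frame i covers the interval [i / fps, (i + 1) / fps).
// Integer arithmetic is used throughout to avoid rounding surprises.
std::vector<int> ReportsPerFrame(int pollHz, int fps, int frames) {
    std::vector<int> counts;
    counts.reserve(frames);
    for (long long i = 0; i < frames; i++) {
        // Number of reports with k / pollHz < n / fps is ceil(n * pollHz / fps).
        long long before = (i * pollHz + fps - 1) / fps;
        long long after = ((i + 1) * pollHz + fps - 1) / fps;
        counts.push_back(static_cast<int>(after - before));
    }
    return counts;
}
```

With 125 Hz against 72 FPS the first four frames receive 2, 2, 2 and 1 reports, so the amount of mouse movement applied per frame wobbles even when the hand is moving at a constant speed. When the two rates match, every frame gets exactly one report.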

I did have some problems with single CPU/single core machines when I moved a portion of the renderer to another thread, but I think I can pull this one off without so much bother.
