Friday, October 29, 2010

More Performance Notes

Things are building up nicely again; I just got maybe 25% extra speed out of the current code. It's still not as fast as the modified 1.0 I keep for testing and experiments, never mind being as fast as it should be, but it's definitely getting there.

Specific trouble spots.

The sound code I've been using for the past few releases was EFF-YOU-SEA-KAY'ed. I ended up rolling that back to the 1.7.666 level, losing some of the extra functionality I'd added but gaining a whole heap of speed in exchange. I'm going to need to add back the ability to restart the sound system on-demand, as well as player-selectable sampling rates. It's been my stated intention for a while to port the Quake II sound code, so now I have added incentive to do so.

Obviously there were portions of that code still running even when I thought I'd disabled sound in previous tests. I'm happier having rolled back, though, as some of the changes I'd made since were quite fragile.

There is a call to "rand ()" at the start of Host_Frame that I'd always been dubious about (it was in id's original code) but I'd left in all the same, because I never saw any reason to remove it. That was sucking away a few percent CPU. I've commented it out for now, and am going to run with it like that for a while to see what happens.
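For context, a bare "rand ()" at the top of the frame function just throws away one random number per frame. The sketch below is not id's code and not mine; it uses a small hypothetical generator (prng_t, prng_next) in place of the C library's rand so the behaviour is the same on every compiler. It shows the one visible effect of such a call: the next number game code draws depends on how many frames have run.

```c
#include <stdint.h>

/* Hypothetical stand-in for rand(): a 32-bit linear congruential generator.
   The constants give a full 2^32 period, so no state repeats early. */
typedef struct { uint32_t state; } prng_t;

static uint32_t prng_next(prng_t *g)
{
    g->state = g->state * 1664525u + 1013904223u;
    return g->state;
}

/* Run `frames` frames, optionally discarding one value per frame (the
   equivalent of the bare rand() call), then return the next value that
   game code would draw. */
static uint32_t draw_after_frames(uint32_t seed, int frames, int burn_per_frame)
{
    prng_t g = { seed };

    for (int i = 0; i < frames; i++) {
        if (burn_per_frame)
            (void)prng_next(&g);   /* result deliberately ignored */
    }

    return prng_next(&g);
}
```

With burn_per_frame set to 0 the result is the same however many frames have passed; with it set to 1 the result changes with the frame count.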

Host_FilterTime was brutal; just pushing the stack each run through the main loop was taking about 8% CPU. I've inlined that one and we're in better shape now.
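For anyone who hasn't looked at it, a frame-time filter of this kind is tiny, which is why the call overhead stands out in a profile. The following is a rough sketch only, with hypothetical names (host_timing_t, host_filter_time, max_fps) rather than the real engine code; note that "static inline" is a request to the compiler, not a guarantee.

```c
#include <stdbool.h>

/* Hypothetical timing state; the real engine keeps similar values elsewhere. */
typedef struct {
    double realtime;     /* accumulated wall-clock time, in seconds */
    double oldrealtime;  /* realtime at the last frame that was run */
    double frametime;    /* duration of the frame about to be run */
} host_timing_t;

/* Returns true if enough time has passed to run a new frame at max_fps.
   Small enough that a compiler can fold it into the main loop. */
static inline bool host_filter_time(host_timing_t *t, double elapsed, double max_fps)
{
    t->realtime += elapsed;

    if (t->realtime - t->oldrealtime < 1.0 / max_fps)
        return false;    /* frame rate is too high, skip this pass */

    t->frametime = t->realtime - t->oldrealtime;
    t->oldrealtime = t->realtime;
    return true;
}
```

Called with 5 ms elapsed and a 72 fps cap it declines to run a frame; called again with another 10 ms it accepts, and frametime comes out as the 15 ms total.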

As things stand now, the heaviest functions are more or less the ones I would have expected: dynamic lightmap building, alias model vertex interpolation and R_RecursiveWorldNode. There are still some anomalies, a heavy stack push in my Sys_Milliseconds function and the setjmp in Host_Frame being two of the more important ones. There's some runtime checking I've added to the Debug build which is more or less irrelevant (it won't be in Release builds), and drawing the console characters is heavier than I'd like, but not too dramatically bad overall.

I now suspect that the biggest cause of some of these performance drains lies in the project options I have selected, so I'm going to try reverting some of them to match my test/experimental codebase and see what comes out of it. On balance though I'm a lot happier with where things currently stand than I was even yesterday.

Phew! Onwards!

_________________________


Update.

That worked! It seems to be back up to full speed now, and is going about 55% faster than it was yesterday, or over twice as fast as 1.8.666b was.

What a horrible nightmare of an experience...

3 comments:

Spike said...

> There is a call to "rand ()" at the start of Host_Frame that I'd always been dubious about (it was in id's original code) but I'd left in all the same, because I never saw any reason to remove it. That was sucking away a few percent CPU. I've commented it out for now, and am going to run with it like that for a while to see what happens.

Mwahaha! This means I can now choose my team+enemy spawn spots by using the pre-war axe to figure out where in the rand() table it is, and spawn when the random number generator is favourable.

(This was a fun exploit made possible by mvdsv's qvm mods not calling rand() at (unpredictable) intervals - it's a fun one)

mhquake said...

That seems a valid reason for it to exist; cheers!

Since the latest fixes went in, putting it back doesn't affect performance, so obviously it was being misreported as a drain on account of the other stuff elsewhere.

Currently benchmarking about 25% higher than my previous personal best. Happy days.

=peg= said...

Congratz! :D