Sunday, January 3, 2010

DirectQ 1.8 Speeds Explained

OK, this might cause some controversy so here's the deal upfront.

If you are using timedemos of ID1 demos as benchmarks, 1.8 might report that it's running slower than 1.7 was.

I'm not going to sugar the pill: the most likely explanation is that it is. The dropoff won't be too dramatic, on the order of 5% to 10% in an ID1 timedemo. In regular gameplay, when you're locked to a maximum of 72 FPS, you won't notice a thing.

The reason is that, in order to achieve better geometry batching, I need to build up a database of all geometry in the map and other models, sorted and otherwise organised in a certain fashion. Doing this has overhead, mainly in CPU and memory, and in more trivial cases (like ID1 maps) the loss from the overhead may outweigh the gains from the rendering optimizations.

So if you drop from 260 FPS to 235 FPS on "timedemo demo1" don't come running to me hollering about it, OK? ;)
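To put those numbers in perspective, here is a rough sketch of the arithmetic. The function names are made up for illustration and are not anything in DirectQ. Going from 260 FPS to 235 FPS means each frame goes from about 3.8 ms to about 4.3 ms, a drop of roughly 9.6%, while a 72 FPS cap allows about 13.9 ms per frame.

```cpp
#include <cassert>

// Convert a frame rate into milliseconds spent per frame.
double frameTimeMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Percentage drop in frame rate going from oldFps to newFps.
double percentDrop(double oldFps, double newFps) {
    assert(oldFps > 0.0);
    return 100.0 * (oldFps - newFps) / oldFps;
}

// True if the uncapped rate is still at or above the cap, in which
// case the cap (not the renderer) decides what the player sees.
bool hiddenByCap(double uncappedFps, double capFps) {
    return uncappedFps >= capFps;
}
```

With these, `hiddenByCap(235.0, 72.0)` is true: the extra 0.4 ms or so per frame fits comfortably inside the 72 FPS budget, which is why the difference only shows up in a timedemo.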

The places where you are going to see huge gains are maps like ne_tower, marcher, masque and ctfrq1. These are all cases where the older style "classic Quake" renderers get choked, and 1.8 will give them a good liberal dose of WD40 and get them moving smoothly.

You are still going to be able to find some strange places in these maps where 1.8 gets choked.

Much of the optimization comes from being able to efficiently batch together surfaces that share the same texture and lightmap. It's perfectly possible to construct a scene that breaks this; lots of brushmodels, each using many different textures, is one such scenario. These scenes will choke any other engine too, and the worst case is a fallback to a similar level of efficiency (or lack thereof) as 1.7 had.
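As a minimal sketch of the general idea (not the actual DirectQ code; the `Surface` and `Batch` structs and `buildBatches` are hypothetical names invented for this example), sorting surfaces by texture and lightmap puts matching surfaces next to each other, so each run can be merged into a single batch:

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical surface record; the fields are illustrative only.
struct Surface {
    int texture;       // index of the texture used by this surface
    int lightmap;      // index of the lightmap used by this surface
    int numTriangles;  // triangles this surface contributes
};

// One batch = everything that can be submitted with the same state.
struct Batch {
    int texture;
    int lightmap;
    int numTriangles;
};

// Sort so surfaces sharing texture + lightmap are adjacent, then
// merge each adjacent run into a single batch.
std::vector<Batch> buildBatches(std::vector<Surface> surfaces) {
    std::sort(surfaces.begin(), surfaces.end(),
              [](const Surface& a, const Surface& b) {
                  return std::tie(a.texture, a.lightmap) <
                         std::tie(b.texture, b.lightmap);
              });

    std::vector<Batch> batches;
    for (const Surface& s : surfaces) {
        assert(s.numTriangles >= 0);
        if (!batches.empty() &&
            batches.back().texture == s.texture &&
            batches.back().lightmap == s.lightmap) {
            // Same state as the previous surface: grow the current batch.
            batches.back().numTriangles += s.numTriangles;
        } else {
            // State change: start a new batch.
            batches.push_back({s.texture, s.lightmap, s.numTriangles});
        }
    }
    return batches;
}
```

Five surfaces that use only two texture/lightmap combinations collapse into two batches. Four surfaces that each use a different texture stay as four batches, which is the kind of scene where batching gains nothing.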

So likewise if you find such a scene don't come hollering to me about it.

This engine won't be able to make a slow 3D card into a fast one.

If, taking these factors into account, you still find that you're getting sub-20 framerates most of the time, it might just be the case that your 3D card is plain-old-fashioned slow. No amount of clever coding can make a slow card into a fast one; that's hardware, baby. All that I can do is help it make better use of what resources it has (and even then I might not succeed all of the time).

For the record, the primary limiting factor in 1.8 is probably going to be bandwidth. Because I'm submitting large batches of triangles at a time, the more bandwidth you have the better you will like it. However, note that even a 4-year-old integrated Intel has sufficient bandwidth to cope with most scenes here, so the majority of people should be fine.

Even when I release 1.8 I won't be "finished".

Yes, there will still be more room for optimization. There are 2 or 3 things I am aware of right now that I could do to improve performance, but I'm probably not going to do them for at least the first release of 1.8 as it would mean delays in getting it out.
