Friday, July 30, 2010

More adventures in Linux land

So here I am on Linux verifying correct operation of the RMQ engine. Thanks to a nice little tool called Wubi I've got my Ubuntu 10.04 up and running, and - yes - I know that true propeller-heads prefer other distros, but I have a life to get on with.

Where was I? Oh yeah, Linux; specifically Ubuntu 10.04. It's not actually that bad you know. For an OS whose primary intended use is a bit of office work, surfing the internet, and reading email it's more than adequately capable. It's not that great either though.

sudo - and I thought UAC in Vista was annoying and intrusive. I'll say no more.

Automatic updates. I must have told it "go away and never bother me again" 20 times but it's not getting the message.

apt-get - it's OK. It doesn't fully resolve all dependencies but it does the job.

Config files. Just say "no". Give me a nice clean registry with a logical hierarchical organisation and everything I need accessible and searchable from the one location, all using the same data types and the same API any day please.

Ctrl-Alt-F2 - mental note to self - find out how to get back to the GUI if you're ever going to press these again, please.

Some of the UI is still appallingly primitive and clunky, but it's getting there. I still find the "Hardware Drivers" applet cute. It's almost like a nice little moral ideological policeman ready to tap me on the shoulder and say "ello ello, wot's all this then" if I ever venture out of cozy-wozy fuzzy-wuzzy safe little GPL land.

Enough of that, let's talk about debugging.

OH MY GOD GDB IS RUBBISH!!!!!

A good debugger is a vital part of any development tool chain, and I've been spoiled by having a really really good debugger in MSVC and the Windows debugging tools, but this thing is just plain bad. There is really no comparison, and it has to be admitted that - whatever the real or imagined failures of their business model and the rest of their software - the folks at MS really know a thing or two when it comes to debugging tools.

It's not really that big a deal as I have an MSVC port of the RMQ project that I do most of my work on (I regularly drop into Code::Blocks or Linux to verify that I haven't introduced any MSVC-isms or Win32-isms) but sometimes when I'm debugging a Linux-only problem having a decent debugger helps. A lot.

Message to gdb defenders everywhere: see for yourself! It's really easy! Install Windows, install MSVC (and no, you don't need to go into the registry to configure it, we have GUI tools for that), write a simple enough program - a sort routine that works with pointers, say - and run it in the debugger. Set some breakpoints (you can do this while it's running), attach it to processes, inspect and modify data, use edit and continue, see all of the cool stuff you can do.

Anyway, I've wasted enough time talking about this, and no doubt there are features of gdb I'm as yet unaware of that enable you to do this kind of stuff, and no doubt a large percentage of it is down to me preferring the tools I'm most familiar with, and I'm 100% certain that at least some of it was influenced by the fact that when I hit a breakpoint it didn't give the mouse or the keyboard back to the OS, so I'll stop right about here.

Till next time!

More HLSL/GLSL Fun

I've successfully ported the DirectQ waterwarp to GLSL and now have it running in the engine. DirectQ's version is very configurable with cvars to define the warp speed and scale (they all begin with r_warp so look out for them) but I likely won't be doing this for RMQ as all I wanted was a higher quality version of the classic water warp - something that doesn't look like shit, in other words.

Most of the differences between HLSL and GLSL arise from the underlying different philosophies behind Direct3D and OpenGL. In general I find D3D to be more powerful and closer to the hardware - you ask for something, that's exactly what you get, and if it doesn't exist it blows up in your face - whereas GL has at least one more layer of abstraction between the hardware and the API - you don't always get exactly what you ask for but you'll get something that's likely to work anyway, and if it doesn't exist it will possibly fall back to software emulation. Both are valid perspectives and there are sound reasons in favour of both. I personally prefer the D3D approach as I like to know what I'm dealing with, and am a little uneasy at the vagueness and sense of never quite being certain that comes with GL, but that's just me; other people will be different and that's OK too.

Parts of GLSL are pretty nifty. I now like the shader/program attach/detach model a lot, and it's a reasonably close analog of D3D's FX framework. Not as nice to work with, but not as horrible as my initial impressions led me to fear. The terminology is slightly bizarre though; at the very least I would have chosen "unit" instead of "shader". Reading the ARB docs from the time reveals that there was possibly a hint of pedantry and inverse snobbery involved in the choices here. Sigh. How very unlike OpenGL (sarcasm).

The whole uniform/varying thing is freakish. D3D's approach of just passing structs around seems more sensible. Pixel shader gets the struct that's returned from the vertex shader as a param - simple, sensible, clean. On the other hand GL's built-in gl_Vertex, gl_Position, gl_TexCoord, etc is very much the right thing, although I understand that they're gone from more recent versions.

The quality of the documentation and tools for GLSL is horrible. Tool-wise, nothing with the capabilities of PIX exists and debugging is very hit and miss. GLIntercept is reasonably decent and has been a life-saver, but PIX lets you go so far as to debug individual pixels on-screen at each stage of the frame. It's a joy to use, and an essential tool.

Documentation-wise I could use a proper SDK. Something I can download, install, and have indexed and searchable on my hard disk at all times. With a suite of tools and examples (ones that someone has actually bothered to compile and test would be nice), technical articles, FAQs, and all the other good stuff you need. I'll never complain about the D3D SDK again. It's a sad and shocking fact that the best OpenGL documentation, even today, is still the old MSDN OpenGL 1.1 stuff.

I think that just about wraps up the extent to which I'm going to be using GLSL for the RMQ engine. I could potentially spend the next 6 months implementing the entire renderer in GLSL, fine-tuning and tweaking it along the way, but nothing else would get done. Like I said, improving the water warp quality was the objective here and that's just a bonus feature, and an interlude to give me time to think over a few things relating to what I need to do next. Higher priorities are beckoning.

Till next time!

Wednesday, July 28, 2010

Good Vibes and HLSL/GLSL Comparisons

I've been a bit quiet lately but that's been on account of having my nose buried in work on the RMQ engine. Things are coming along quite well; obviously there is still a long way to go, even with a single subsystem such as the renderer, but other things are also getting done.

Removing and extending limits is a critical part of any project where the content is demanding, and I've been able to draw on the experience of some things I did right (and some things I did wrong!) with DirectQ for this one. Overall I think the code is a lot cleaner than DirectQ's. Of course a natural credit for this needs to go to both metlslime and the QuakeSpasm team as they provided such a great cleanup of a lot of things that were fundamentally wrong with the GLQuake source code (with DirectQ I started from scratch myself).

Some really good vibes coming through these days. For a long time now I've felt the need for some more serious testing on engine releases, and the experience I had sending advance copies of the DirectQ 1.8.666 executables to various people really brought that need home in a big way. Frankly, a lot of the bugs that came out in the released versions could have, would have, and should have been avoided if I had been more forthcoming with release copies at earlier stages of the development (I should have also released after porting the ProQuake netcode but before rewriting the renderer, but that's another story). Working with the RMQ team, I'm getting new code out to them on a quite regular basis - daily, often more frequently - and it's showing in the improved quality and stability.

They're also tolerant of some of the more serious mistakes I've made, like the time it crashed and burned on Linux for two days. That's pretty much all that an engine coder could ask for. Happy days.

It's also great to see some of the performance improvements and other stuff I've done being used by a talented mod team who are taking the opportunity to go nuts with them. The next RMQ demo (am I allowed talk about that?) should hopefully showcase some quite cool stuff.

On the other subject, I'm implementing some GLSL code in the engine. I had played with it a little some years ago, but most of my recent experience has been with HLSL, and it's interesting to do a comparison of the two. Of course my opinion may change as time goes on, but here's how things stand for now.

I've long been of the opinion that up to and including OpenGL 1.4 things were great and OpenGL was by far the superior of the two APIs. That opinion still stands, and if I'm working with OpenGL code up to 1.4 it's an absolute pleasure to use. I think the only things I would mark in its disfavour are the need to bind objects before performing operations on them and the fact that some of the GL_ARB_texture_env_combine stuff is a little on the horrible side.

VBOs were where things started getting ugly. They could have been more tightly integrated with vertex arrays for starters, and the whole API could have had its behaviour more clearly defined. In actual fact if VBOs could have just been implemented as a glHint things would have been a lot nicer (and probably more in tune with OpenGL's underlying philosophy).

My theory is that when D3D got vertex buffers (one of the first times, if not the first time, that D3D suddenly took a jump ahead of OpenGL) somebody in OpenGL-land panicked and the end result was a botched half-thought-through first draft of an API that's had to end up being patched through subsequent revisions. In other words a part of OpenGL suddenly became like what the really early versions of D3D were like.

GLSL certainly mixes the sour with the sweet. The whole operation of creating, setting source, compiling, linking, attaching, etc shaders with programs (and even the fact that there are two separate types of object in the first place) leaves one with a bad taste. It's like Lotus Notes in a lot of ways - lots of people might think it's brilliant but unless you've been exposed to something that comprehensively pisses all over it, you'll never know better. The D3D FX framework is just intrinsically superior in every way.

On the other hand D3D tends to assume that if you're using shaders at all you're going to want to use them for everything - a common fault with many MS technologies. Mixing shaders with the fixed pipeline is just not fun in D3D; states will start going haywire, matrix transforms will give subtly - but significantly - different results, things that used to work (like fog for example) stop working, and getting the whole thing into coherent shape will take so much work that you might as well just rewrite it to use shaders everywhere. GL is a lot nicer and more sensible in that regard.

So what am I using shaders for in RMQ? Nothing much or nothing fancy; we want this content to be fully functional on as many engines as possible, so all I'm doing is supplementing and boosting some of the existing rendering paths. It'll end up being much the same as DirectQ in other words, just using them to improve the classic Quake water and sky warps and not much else.

Think that's all for now. Till next time.

Sunday, July 25, 2010

1.8.666 Full Release is Out!

I meant to get this out days ago, and it's been ready for a long time, but just never got round to it. So here it is - the last DirectQ release for a while.

http://directq.codeplex.com/releases/view/49583.

It's kinda sad in a very large way to be posting that, but positive too as RMQ is quite a cool thing to be working on, and the improvements that come back to DirectQ from it will be very worthwhile.

I'll be keeping you up to date with technical tid-bits here, of course, but I'm not going to dishonour the level of trust put in me by giving away any advance details on what's coming from RMQ. The place for info on that is the RMQ blog: http://kneedeepinthedoomed.wordpress.com/.

Till next time!

Underwater Warps

I've added underwater warps to the RMQ engine using code derived from DirectQ, and the funny thing is that the OpenGL code for doing this is immensely superior to the D3D code. It's not using an FBO or a pbuffer, just good old glCopyTexSubImage2D, but it does also add in the blend colour for free. The end result is an underwater warp that's able to run at the full screen resolution (although for now I've clamped it at 1024x1024 to prevent excessive texture memory usage) without the slowdown problems I had to deal with in DirectQ.

Of course D3D's CreateRenderTarget/SetRenderTarget/etc are more flexible as a general solution, but for something like this glCopyTexSubImage2D is more than amply good enough and there's no need to complicate things like I had to do with DirectQ.

One sticking point with the OpenGL code is the whole "bottom-left is the origin" thing. This obviously goes back to OpenGL's CAD beginnings, but when you're working in screen space it's awkward, unintuitive and involves an amount of mental juggling.
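The mental juggling mostly comes down to one subtraction. As a minimal sketch (the rect_t type and the function name are made up for illustration, not taken from the RMQ source), converting a top-left-origin screen rectangle into the bottom-left-origin rectangle that GL expects looks something like this:

```c
#include <assert.h>

/* A rectangle in pixels. Hypothetical type, for illustration only. */
typedef struct {
    int x, y, width, height;
} rect_t;

/*
 * Convert a rectangle given in top-left-origin screen space (y grows
 * downwards) into OpenGL window space (origin at bottom-left, y grows
 * upwards). Only y changes: the GL rectangle starts at the bottom edge
 * of the screen-space rectangle, measured up from the bottom of the
 * window.
 */
static rect_t screen_rect_to_gl(rect_t screen, int window_height)
{
    rect_t gl = screen;

    assert(screen.y >= 0 && screen.y + screen.height <= window_height);
    gl.y = window_height - (screen.y + screen.height);
    return gl;
}
```

The resulting x, y, width and height are what would be handed to glCopyTexSubImage2D as the source rectangle. For a 640x400 view sitting at the top of a 640x480 window, the GL y comes out as 80 rather than 0.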

It's while working on things like this that sometimes it comes home how much I really missed the elegant simplicity of OpenGL, whereas other times I just want to curse at it and punch the person responsible for that design decision, and - occasionally - both things happen at once.

Friday, July 23, 2010

Making the RMQ Engine even more portable

The RMQ engine is based on QuakeSpasm, an SDL port of FitzQuake, is developed using a mixture of MSVC (for the superior debugger) and Code::Blocks/mingw, and is regularly given shakedowns on both Linux and Windows (I don't know if we have a Mac person or anyone who can cross-compile for Mac) so portability should be top-notch, right?

You'd think so, but that's not always the case. Take the example of vsync. The engine uses SDL/OpenGL for graphics and grabs the GL_EXT_swap_control extension (if present) to control vertical sync settings. The trouble here is that GL_EXT_swap_control is not always present on a lot of Windows machines, and you need to use WGL_EXT_swap_control (and its associated wgl functions) instead!
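A rough sketch of that kind of check, assuming the two extension strings have already been fetched (from glGetString(GL_EXTENSIONS) and, on Windows, from wglGetExtensionsStringARB or wglGetExtensionsStringEXT). The function names here are invented for the example and are not from the RMQ source. The whole-token match matters because one extension name is a substring of the other:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if `name` appears as a complete space-delimited token in
 * `extensions`, 0 otherwise. A bare strstr() is not enough: it would
 * report "GL_EXT_swap_control" as present when only
 * "WGL_EXT_swap_control" is listed.
 */
static int has_extension(const char *extensions, const char *name)
{
    size_t len;
    const char *p;

    assert(name != NULL);
    len = strlen(name);
    if (extensions == NULL || len == 0)
        return 0;

    for (p = extensions; (p = strstr(p, name)) != NULL; p += len) {
        int starts_token = (p == extensions) || (p[-1] == ' ');
        int ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
    }
    return 0;
}

/*
 * Pick the swap-control extension to use. `gl_ext` stands in for the
 * GL extension string, `wgl_ext` for the WGL one (pass NULL on other
 * platforms). Returns the extension name, or NULL if neither is
 * advertised and vsync cannot be controlled this way.
 */
static const char *pick_swap_control(const char *gl_ext, const char *wgl_ext)
{
    if (has_extension(gl_ext, "GL_EXT_swap_control"))
        return "GL_EXT_swap_control";
    if (has_extension(wgl_ext, "WGL_EXT_swap_control"))
        return "WGL_EXT_swap_control";
    return NULL;
}
```

With the WGL variant selected, the actual vsync call would be wglSwapIntervalEXT, fetched through wglGetProcAddress.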

This is a rare and interesting case of a portability concern occurring in the opposite direction to that which one would normally and reasonably expect. All in all an interesting learning experience.

_________________________________

I was intending to release the full DirectQ 1.8.666 release build today, but became detained with chasing down some bugs in RMQ and other stuff. Sorry about that; I'll try to get it out over the weekend.

_________________________________

I'm quite saddened to report that my 6 year old home-built PC finally expired today. It's had an extremely good innings, was still quite capable (I specced it well at the outset) and I was intending to see it off with full honours by bringing up a Linux build environment on it for my own RMQ testing.

I guess I'm going to be in the market for a new PC then.

Wednesday, July 21, 2010

RMQ Engine Update

As is by now a tradition whenever I achieve a milestone with an engine build, here is the first screenshot of the RMQ engine running the Marcher Fortress.



I mentioned that I was doing some awful things to the FitzQuake code a while back, so I think an explanation is in order. Fundamentally what I'm doing is completely rewriting the renderer to use the same surface batching techniques that DirectQ does. The FitzQuake renderer as it stood was a perfectly fine renderer, and very faithful to the GLQuake standard but with a lot of annoying niggles fixed. However, for the kind of content-heavy maps we're getting here, it wasn't performing as well as it could have.

I could see that was going to be a problem. There are places with very high polycounts, places where almost every lightmap in the map gets an update, and places with extremely high levels of detailed brushwork. Think Doom 3 like levels of detail, but with Quake textures and architecture, and running in the Quake engine.

Drawing each polygon individually is a fine technique for older content, or if you want to maintain compatibility with older (20th century) hardware, which is a very valid choice and not one I would criticise. The kind of detail we're talking about here, however, needs a bit more rendering muscle behind it.
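To give a flavour of what batching means here, below is a deliberately simplified sketch and not the actual DirectQ or RMQ code: surface_t and batch_t are hypothetical types, and a real renderer would also key on lightmap and other state. Surfaces are sorted by texture and each run of identical textures becomes one submission instead of one draw per polygon.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, stripped-down surface: just a texture and a vertex count. */
typedef struct {
    int texture_id;
    int num_verts;
} surface_t;

/* One draw submission covering every surface that shares a texture. */
typedef struct {
    int texture_id;
    int num_surfaces;
    int num_verts;
} batch_t;

static int compare_by_texture(const void *a, const void *b)
{
    const surface_t *sa = (const surface_t *)a;
    const surface_t *sb = (const surface_t *)b;
    return (sa->texture_id > sb->texture_id) - (sa->texture_id < sb->texture_id);
}

/*
 * Sort the surfaces by texture, then merge each run of equal textures
 * into a single batch. Returns the number of batches written to `out`,
 * which must have room for `count` entries (the worst case).
 */
static int build_batches(surface_t *surfs, int count, batch_t *out)
{
    int i, num_batches = 0;

    assert(count >= 0);
    qsort(surfs, (size_t)count, sizeof(surfs[0]), compare_by_texture);

    for (i = 0; i < count; i++) {
        if (num_batches == 0 ||
            out[num_batches - 1].texture_id != surfs[i].texture_id) {
            out[num_batches].texture_id = surfs[i].texture_id;
            out[num_batches].num_surfaces = 0;
            out[num_batches].num_verts = 0;
            num_batches++;
        }
        out[num_batches - 1].num_surfaces++;
        out[num_batches - 1].num_verts += surfs[i].num_verts;
    }
    return num_batches;
}
```

Five surfaces spread over three textures collapse to three submissions; on a map with thousands of visible surfaces and a few dozen textures the reduction in draw calls and texture binds is far larger.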

It's still early days, and it's still the first revision of the rewrite, so I can't really say for definite what's going to fall out of the end of it yet - mostly because I don't even know myself yet! What I can say is that the new renderer is very highly optimized for drawing high poly scenes.

As a general guideline, if you can run Q3A well you should also be able to run this quite well. I'm aiming at a TNT2 level of functionality as an absolute minimum, so that's maybe OpenGL 1.4 or thereabouts. You will need multitexture and combine modes - both of which the TNT2 does support. More advanced features - like vertex buffers and occlusion queries - will be used if available, but it won't depend on them.

Now, a few people might be pissed at me for bringing up the minimum requirements for FitzQuake like this, but please, do try to remember that these are not normal Quake maps and that the techniques required to render them well just do not exist on first or second gen hardware. Even if I did retain that level of support and coded my renderer to that spec, the maps would just run very very poorly for everybody.

Forcing everybody else to suffer just to accommodate an incredibly small minority still clinging to their 3DFXs or Power VRs does not make the blindest bit of sense.

You're not going to see any sneak-previews of undisclosed RMQ content on here by the way, sorry about that. I may post screenshots of the engine running RMQ content from time to time, but it will be firmly restricted to content that has already been released.

glTexSubImage2D and the quest for fast lightmap uploads

The saga of lightmap uploads is continuing, and while I do have a solution in place, it is not one that I consider trustworthy in the long term. It's good enough for testing though.

glTexSubImage2D is an interesting beast. When I was doing the original port of what became DirectQ I literally agonized over this part of the code. The OpenGL version (a single call to glTexSubImage2D) seemed so clean and neat, whereas the D3D version (lots of LockRect, memcpy, UnlockRect, decisions about memory pools and when to dirty the update region) just looked awful.
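For anyone who hasn't seen the D3D side, the core of the "awful" version is a row-by-row copy that respects the pitch of the locked surface. The sketch below fakes the locked rectangle with a plain struct (locked_rect_t is a stand-in for what LockRect returns, not the real D3D type) so that it stands alone; it is an illustration of the pattern, not DirectQ's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for the information D3D hands back from LockRect: a pointer
 * to the first byte of the locked region and the pitch, i.e. the
 * number of bytes from the start of one row to the start of the next.
 * The pitch may be larger than width * bytes_per_texel.
 */
typedef struct {
    unsigned char *bits;
    size_t pitch;
} locked_rect_t;

/*
 * Copy a tightly packed source image into a locked destination one row
 * at a time, stepping the destination by its pitch.
 */
static void copy_rows(locked_rect_t dst, const unsigned char *src,
                      size_t width, size_t height, size_t bytes_per_texel)
{
    size_t row_bytes = width * bytes_per_texel;
    size_t y;

    assert(dst.bits != NULL && src != NULL);
    assert(dst.pitch >= row_bytes);

    for (y = 0; y < height; y++)
        memcpy(dst.bits + y * dst.pitch, src + y * row_bytes, row_bytes);
}
```

In the real thing this loop sits between a LockRect and an UnlockRect, which is exactly the bookkeeping that a single glTexSubImage2D call hides.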

Surely the one function call version just had to be the optimal performer?

Things are not always what they seem. Right now glTexSubImage2D is just not performing well at all. Tests have indicated that the most likely cause is one of 2 things: either Windows 7 with Aero enabled or newer (OpenGL 2.0+) drivers. I'll find out later on this week what happens with a GL2 driver on XP.

Performance is of course relative, so how bad is it? I've run a number of test cases in standalone apps to measure it, explicitly timing the glTexSubImage2D call. On the older hardware/XP machines a single 1024x1024 texture can be fully updated in 2 milliseconds. On the newer hardware/7 machines it takes 78ms. More representative of what's happening in RMQ is 16 lightmaps created at 64x512; there the timings are 5ms versus 40ms.
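To put those timings on a common footing, here is the arithmetic as a tiny helper (the function name is made up for this example). It works in texels rather than bytes because the texel format used in the tests isn't stated.

```c
#include <assert.h>

/* Whole texels uploaded per millisecond (integer division, rounds down). */
static long texels_per_ms(long width, long height, long count, long ms)
{
    assert(ms > 0);
    return (width * height * count) / ms;
}
```

Plugging in the figures above: the single 1024x1024 update works out at 524288 texels/ms on the older machines against 13443 on the newer ones (39 times slower), and the 16 lightmaps at 64x512 work out at 104857 texels/ms against 13107 (8 times slower). Note that the 16 lightmaps together hold exactly half the texels of the 1024x1024 texture.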

None of this happens in DirectQ where performance consistently scales up with increasing hardware capabilities.

So what is happening and why? Nobody seems to know, or if they do, they're not telling. What is definite is that various support forums are filling with questions about it, and the same tired old answers are always given (use a PBO, use BGRA, etc), all of which have been tried and none of which work.

The current RMQ solution (lots of small updates) does however work, but I would need some semi-official indication that it's the correct thing to do before committing to it.

Tuesday, July 20, 2010

Fun With OpenGL

One lesson I learned when initially moving from OpenGL to D3D was that techniques that worked well in OpenGL don't necessarily work well in D3D. Now that I'm back working with OpenGL, it's an interesting experience to be learning that the reverse is also true. Of course it makes sense that this is the case, and you don't (and shouldn't) even have to think about it.

A major stumbling block over the past few days has been the issue of dynamic light updates. This was a critical engine issue because the RMQ folks have maps which are positively swamped in animating lightstyles, and I want to give them the ability to exploit this to the utmost. The world looks active and alive, and this adds so much atmosphere to the whole experience. You feel as if you're moving through a real place (the fact that it's a terminally fucked-up real place only adds to the sense of menace this creates).

Trouble is that it wasn't running well. At all. The engine was hitching and stalling, framerates were see-sawing wildly back and forth between the kind I wanted (steady higher than 72) and the kind I definitely did NOT want (think of a number below 20 and start subtracting from it), and movement and interaction with the world was jerky and awkward. Total mood killer.

The really annoying thing was that the very same technique in DirectQ worked perfectly, and gave ultra smooth performance with none of the issues I had experienced. OK, so they're different APIs, but at the end of the day it's all lines of code and silicon on a chip - the API should be largely irrelevant.

So why was DirectQ so fast and RMQEngine not?

With D3D you need to shove data to the GPU in large batches, and the GPU will love you for it, and reward you with fast and smooth performance in demanding scenes. With OpenGL that is also the case - most of the time. One place where it's different is the case of updating dynamic textures at runtime.

This is where you need to step back and start challenging what you know. In this case I "knew" that I was doing it "right", but I just wasn't getting the kind of performance I wanted. After bashing at it for two days, trying out various techniques (pixel buffers, uploading with sizes as multiples of 4, double-buffering textures, etc) with nothing working it became obvious that it was time for alternative techniques.

I decided as a test to see how the updates worked if I used lots of small updates instead of very few large ones. This totally contradicts everything I considered to be common sense here, but damn it all to hell, it worked.
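The shape of the idea, as a sketch only: the tile size, the callback and all the names below are assumptions made for illustration, since the actual split used in RMQEngine isn't described here. The upload callback stands in for a wrapper around glTexSubImage2D so the example runs without a GL context.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Callback standing in for the real upload, e.g. a wrapper around
 * glTexSubImage2D(GL_TEXTURE_2D, 0, x, y, w, h, format, type, pixels).
 */
typedef void (*upload_fn)(int x, int y, int w, int h, void *user);

/*
 * Split the dirty rectangle (x, y, w, h) into tiles no larger than
 * tile_w x tile_h and issue one upload per tile. Returns the number of
 * uploads issued.
 */
static int upload_in_tiles(int x, int y, int w, int h,
                           int tile_w, int tile_h,
                           upload_fn upload, void *user)
{
    int ty, tx, calls = 0;

    assert(tile_w > 0 && tile_h > 0 && upload != NULL);

    for (ty = 0; ty < h; ty += tile_h) {
        int th = (h - ty < tile_h) ? (h - ty) : tile_h;
        for (tx = 0; tx < w; tx += tile_w) {
            int tw = (w - tx < tile_w) ? (w - tx) : tile_w;
            upload(x + tx, y + ty, tw, th, user);
            calls++;
        }
    }
    return calls;
}

/* Example callback: adds up the texels it is asked to upload. */
static void count_texels(int x, int y, int w, int h, void *user)
{
    (void)x;
    (void)y;
    *(long *)user += (long)w * h;
}
```

A 64x512 lightmap split into 16x16 tiles becomes 128 uploads covering the same 32768 texels. With real GL calls, pulling each tile out of a larger client-side buffer would also need GL_UNPACK_ROW_LENGTH set through glPixelStorei (or a tightly packed copy of each tile).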

RMQEngine is now running faster than all versions of DirectQ prior to about 1.8.4, by the way.
__________________________________________________

Speaking of DirectQ, I'm probably going to release the 1.8.666 final build either tomorrow or the day after. There are a few bugs outstanding but I'm going to leave them be for now: I'm pleased as anything with this codebase now, and consider it a great personal achievement. It has been however a quite intense period leading up to this point, and it's been great for me that the opportunity to jump aboard with RMQ came up when it did.

It will also mean that when I get RMQ in a satisfactory initial state I'll be able to take a short break from it, come back and fix those bugs and do some more work with DirectQ, bringing forward things I've learned from RMQ and making DirectQ even better (I'm fascinated to find out if the same lightmaps trick will apply to it).

DirectQ is still my main thing here and - above all else - it is my engine, where I get to call the shots. That's important. What's also important is to vary your experience and learn new things a little outside of your comfort zone (but not too much!) We should all do that a bit more at times.

Monday, July 19, 2010

ACME Translation Services Strikes Again!

While looking for some stuff I came across this article. Seems as though the Dirty Hungarian Phrasebook is making a return of sorts in the land of D3D optimization!

Is YOUR hovercraft full of eels?

One of the all-time classics:

The worst marries scenario occurs when an application is repeatedly rendering one or two triangles Rep rendering call. I see this in applications Officers' Club of Revolutionary Armed Forces dwells frequently than one might expect, and the majority of these occurrences seem to fall into several design categories:

DirectQ is going to be on temporary hold soon

Some of you may have seen that I've joined the Remake Quake team as an engine coder. As well as getting advance access to all the cool insane stuff they're doing, it means that I'll be calling a temporary halt to further work on DirectQ for a while.

I still intend getting the final 1.8.666 release out over the next few days (Easter Egg included), and I have definitely not cancelled any prospect of further work. DirectQ WILL return, and I'm hoping to bring some cool engine advances from this project into it.

Initially I'm working on the renderer side of their engine, porting many of the optimisations I've made to DirectQ over to it, and fixing up a few things so that it works better with their content. FitzQuake purists might choke on some of what I've done so far, but it's OK - this is a specific engine for a specific project, so there's greater freedom to take a butcher's knife to it.

It's already running faster than DirectQ in many places, by the way. Sigh - I did miss OpenGL, really.

From there I'll be moving on to more behind-the-scenes work in the engine. Can we say CSQC? Crazy protocol extensions?

This is a very positive move and a step in a very good direction. I've always held the opinion that a mod team and an engine coder should work closely together, and feed off each other's ideas and enthusiasm, and - in the day or so I've been with them so far - there is certainly no shortage of that. Stuff they're coming up with is pushing me to do better things with the engine, and stuff I'm coming up with is opening doors for them to do great things with their mod.



Sir wanted horde combat? Would 400 knights be enough for sir?

Sunday, July 18, 2010

So somebody wanted an Easter Egg...

DirectQ now has a rather cool Easter Egg. Of course it's a totally frivolous waste of my time when I could have been doing some more serious stuff, but what the hell?

It's quite hard to find (unless you cheat by looking at the source code, of course) and when you do you might wish that you hadn't. And don't worry, it doesn't mess up anything on any maps or mods, it's nothing stupid like that. In fact it might even be quite scary the first time you see it (it still frightens me a little even now).

I'll see if anyone finds it quickly enough before I start dropping hints. Because I'm nasty like that.

Saturday, July 17, 2010

Final list of upcoming changes

This is part 1 of what I hope will be the final list of upcoming changes for 1.8.666; there may not be a part 2 (depending on how things work out with the RC3 release over the next few days).

I've removed the palette gamma scaling. This was always something of a kludgy hack, it never worked right with external textures, and it's not going to work right with HL BSP (whenever I get around to re-implementing it properly). To prevent DirectQ from being too dark when you start it up I've rescaled the value of the gamma cvar to the same range when it's applied. This should hopefully prevent the final release from being too much of a shock to everyone's system.

Of course this means that when you run DirectQ windowed the gamma ramp will be applied to your full desktop. There are some things I've done to lessen the effect of this. Firstly, when you Alt-Tab away your original desktop gamma will be restored. When you Alt-Tab back DirectQ's gamma is restored. Secondly, if DirectQ crashes it will attempt to restore your original desktop gamma.
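The save/restore logic is simple enough to sketch. Everything named below is hypothetical: the set_ramp callback stands in for whatever platform call applies a ramp (on Windows presumably something like GDI's SetDeviceGammaRamp, though which call DirectQ uses isn't stated here), and the struct layout is only an illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A gamma ramp in the 3 x 256 x 16-bit layout Windows uses. */
typedef struct {
    unsigned short rgb[3][256];
} gamma_ramp_t;

/* Stand-in for the platform call that applies a ramp to the display. */
typedef void (*set_ramp_fn)(const gamma_ramp_t *ramp, void *user);

typedef struct {
    gamma_ramp_t desktop;   /* captured once at startup */
    gamma_ramp_t game;      /* built from the gamma cvar */
    int game_active;        /* 1 while the game ramp is applied */
    set_ramp_fn set_ramp;
    void *user;
} gamma_state_t;

/* Apply the game ramp, e.g. when the window gains focus. */
static void gamma_on_focus_gained(gamma_state_t *gs)
{
    assert(gs != NULL && gs->set_ramp != NULL);
    gs->set_ramp(&gs->game, gs->user);
    gs->game_active = 1;
}

/*
 * Put the desktop ramp back. Safe to call from several places (focus
 * loss, normal shutdown, a crash handler): it does nothing if the
 * desktop ramp is already in effect.
 */
static void gamma_restore_desktop(gamma_state_t *gs)
{
    assert(gs != NULL && gs->set_ramp != NULL);
    if (!gs->game_active)
        return;
    gs->set_ramp(&gs->desktop, gs->user);
    gs->game_active = 0;
}

/* Example callback that just remembers which ramp was applied last. */
static void record_ramp(const gamma_ramp_t *ramp, void *user)
{
    *(const gamma_ramp_t **)user = ramp;
}
```

Making the restore idempotent is the useful design point: the Alt-Tab path and the crash path can both call it without caring which one got there first.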

At some point in time I may change it to use D3D gamma control, but I would need to be certain of its behaviour if I did so. Not for this release in other words.

I'm going to re-examine the way I currently handle lightmaps. What I do at present is calculate a lightmap table from the colormap and use that as a lookup for the final lightmap intensity; this is great if one of Those Wacky Modders has implemented a custom colormap with different lighting ranges to what ID Quake uses, but it does make DirectQ look a little different from everything else. Probably not that big a deal overall, but I do suspect that it's not getting the full 2x range as a result.

I'm also doing something similar with the palette and I want to be certain that it's valid and that I'm not getting anything too odd as a result.

There are some system settings that I would really love to move out of the cfg files and into the Registry. Unfortunately the Registry has a bad reputation (based on FUD, lies and deception) so I probably won't do it. It would make the whole application, and in particular video startup, a lot more stable and solid if I felt that I could, however.

That's about it for now. Till next time!

Update - Part 2

I've removed the code that checks whether the status bar needs to be redrawn and only redraws it if so. It proved to be too fragile, and the entire engine was littered with calls to Sbar_Changed () in some quite odd places. This will cost a percent or two of framerate, but nothing too drastic.

This also means that I may be able to move the FPS counter and clock back down towards the bottom of the screen instead of having them overlay the 3D refresh area. However, I kinda like them where they are, so unless anyone really wants them moved I'm going to leave them be.

There's a bug where your health doesn't update properly on the status bar when you die. Shame on the lot of you for not spotting that one! This was a case of me trying to be too clever and falling flat on my face instead. Fixed now.

Palette and lightmap derivation from the colormap are gone now. Things are looking a lot more solid as a result. The next step seems to be to brighten up MDLs a little; I've already done this in MP, but they're a little too dark in SP as well (the Quad secret in e1m2 is a good example here).

All the fun of the fair.

Update - Part 3

RC3 was mistakenly labelled as "RC2" on both the splash screen and the title bar (if you're running windowed). Make sure that you're using the right version before reporting a bug!

RC3 shipped with my implementation of the Remake Quake protocol 999 (but I'm using number 777 to avoid conflicts with any future changes they may make - clear as mud!) There is a bug in the co-ordinate read/write functions, so if - for example - the first elevator ride in e1m1 is quite jerky, it means that you've just discovered that bug.

I've cross-checked with the RMQ engine and it exists there too, so I've notified the RMQ people of it. Meantime I've changed the co-ordinate format a little to remove the bug. This means that my implementation is no longer 100% compatible with theirs, but that's OK for now. (It has also fixed a host of old problems, like getting stuck on the edges of slopes. Bonus!)
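For background only: the details of the protocol 999 bug and of the fix aren't given here, so the sketch below shows just the baseline that any such format is measured against. Stock Quake sends each co-ordinate as a signed 16-bit value in units of 1/8, and converts with a plain cast, which truncates toward zero. Rounding to nearest is a common alternative. The function names are invented for the example.

```c
#include <assert.h>

/*
 * Stock Quake style: 16-bit signed, units of 1/8 (13.3 fixed point),
 * giving a range of roughly +/-4096 with 0.125 precision. The cast
 * truncates toward zero.
 */
static short coord_encode_truncate(float f)
{
    return (short)(int)(f * 8.0f);
}

/* Same format, but rounding to the nearest 1/8 instead of truncating. */
static short coord_encode_round(float f)
{
    float scaled = f * 8.0f;
    return (short)(int)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Decode either variant back to a float. */
static float coord_decode(short s)
{
    return (float)s * (1.0f / 8.0f);
}
```

With truncation 100.45 goes over the wire as 803 and comes back as 100.375; with rounding it goes as 804 and comes back as 100.5. Small errors like that are harmless for a static object but can show up as jitter on something moving smoothly.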

My new protocol 777 will be shipping with the final release, and I've bumped the number to 778 to reflect the change. I'm not retaining support for my old buggy 777 beyond RC3, so any demos you might record with RC3 won't work going forward.

Release Candidate 3 is Out

http://directq.codeplex.com/releases/view/49131.

This has actually been ready for a coupla days, but I wanted to put it through its paces with a few maps and mods before releasing it. It's a good sign of things settling down that the latest modified dates are the 15th.

By now, all issues that were reported to me should have been fixed, with the exception of a playerskins problem in an old mod from 1996. I just haven't really had the opportunity to look at that one yet.

Assuming that everything is fine with this one I'm going to wait a few more days, then compile a release build and put it up as 1.8.666 (final).

Thanks to everyone who got involved in the testing and feedback process with this one.

Friday, July 16, 2010

More fun with Linux

This is getting annoying now. DirectQ works, but I'm getting less than 1 FPS. That's OK, it may just be the case that I'm going through software emulation. But I seem to have no way of finding out. Even if I did find out, how do I enable hardware acceleration? Hell, I don't even seem to have any way of finding out the make and model of video driver it's using. DirectQ tells me that it's an NVIDIA FX 5600 which is quite odd as I certainly do not have an NVIDIA anything in this box.

What the hell is going on?

Right now I'm spending much more time battling with the OS than I am spending actually doing things with applications. That's not good, and it's not productive. Gah! It's still a long way from being "Linux for Humans" I think. Or even for IT heads.

I've managed big networks for 12 years, repaired faulty DNS servers, recovered Oracle databases, built Cisco router configurations, and many other fun things, so I'm reasonably confident that I am not stupid when it comes to this kind of thing, but every time I try anything it's just throwing more obstacles in my face.

So I'm shutting it down now. I might get back to it later on, but at the moment I'm no longer inclined to spend time on it.

I wonder do Linux people have the same kind of experience when confronted with a Windows box?

(As a sidenote, it's also quite ironic that my fresh install of Ubuntu 9.10 was about the same size as my fresh install of Windows 7 - bloatware, anyone?)

Softly Softly Catchee Buggee

OK, I mentioned that there were a few bugs I needed to track down, and one of them - thankfully the worst one - has now been fixed.

I had partially implemented the Rotating Brush Models fix way back some months ago, but had never fully linked in the code as there wasn't really anything to test with at the time. For 1.8.666 the timing was right, a good QC method of handling the entity had been devised, some test maps emerged, and I finished off the code.

Unfortunately there was something weird elsewhere in my code causing the player (that's you!) to be thrown into the solid leaf when you clipped against a rotating brush model. Not a nice thing to happen in other words.

This, by the way, ties in with a more general clipnodes bug in my code that I had seen once or twice before, but that was quite rare (it only ever happened on one map, and even then did not happen consistently) so I didn't bother fixing it (some bugs are just like that, alas! No, I don't like it either.)

Fortunately however, the old world.c (sv_world.cpp in my code) was one of the files I have changed the least in DirectQ so I was able to go back to a known-good version, bring the working code over, test, bring on the most critical of my changes, re-test, and eventually get there.

It's nice to be able to finally wave goodbye to this one.

Thursday, July 15, 2010

Updates for 15th July 2010

I've all but decided that I'm going to be doing an RC3 before I do a full release, as there are a few more bugs coming through that I want to get fixes coded for and confirmation that they work for other people too before I declare it "done".

Brush and alias model transformations have been optimized slightly. Because of the way my renderer is structured, and because I batch brush models with the world and all entities sharing the same alias model together, I need to run these transforms within the engine. (You should note that hardware T&L does this too, it's only the final multiplication of the vertex coords by MVP that is done in hardware). Previously I had done a full matrix multiply, but now I only do that if needed, otherwise I just do a simpler translate (and optional scale if required).
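As a rough sketch of that "only do the full multiply if needed" idea (this is not the actual DirectQ code; the Placement struct and all function names are made up for illustration, and rotation is reduced to yaw only to keep it short):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical per-entity placement: position, yaw in degrees, uniform scale.
struct Placement {
    Vec3  origin;
    float yawDegrees;
    float scale;
};

struct Mat4 { float m[4][4]; };

// Builds a row-major matrix combining scale, yaw rotation and translation.
Mat4 BuildMatrix(const Placement &p) {
    const float r = p.yawDegrees * 3.14159265358979f / 180.0f;
    const float c = std::cos(r) * p.scale;
    const float s = std::sin(r) * p.scale;
    Mat4 out = {{
        { c,    -s,   0.0f,    p.origin.x },
        { s,     c,   0.0f,    p.origin.y },
        { 0.0f, 0.0f, p.scale, p.origin.z },
        { 0.0f, 0.0f, 0.0f,    1.0f }
    }};
    return out;
}

// Expensive path: full matrix multiply (the w row is skipped as it is always 0,0,0,1).
Vec3 TransformFull(const Mat4 &m, const Vec3 &v) {
    return {
        m.m[0][0] * v.x + m.m[0][1] * v.y + m.m[0][2] * v.z + m.m[0][3],
        m.m[1][0] * v.x + m.m[1][1] * v.y + m.m[1][2] * v.z + m.m[1][3],
        m.m[2][0] * v.x + m.m[2][1] * v.y + m.m[2][2] * v.z + m.m[2][3]
    };
}

// Cheap path: scale and translate only.
Vec3 TransformCheap(const Placement &p, const Vec3 &v) {
    return { v.x * p.scale + p.origin.x,
             v.y * p.scale + p.origin.y,
             v.z * p.scale + p.origin.z };
}

// Picks the cheap path whenever the entity is not rotated.
Vec3 TransformVertex(const Placement &p, const Vec3 &v) {
    if (p.yawDegrees != 0.0f) return TransformFull(BuildMatrix(p), v);
    return TransformCheap(p, v);
}
```

In a real renderer the matrix would be built once per entity rather than once per vertex; it is rebuilt per call here only to keep the sketch small.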

In practice it doesn't make much difference to performance, but it just feels better this way.

I'm cleaning up the mess I had made of the various MSG_Write/MSG_Read Coord/Angle functions.

Regarding protocols, I've decided to not do a full implementation of the Remake Quake protocol just yet. Instead what I'll be doing is implementing an interim extension of the FitzQuake protocol (using number 777) based on what the folks have done so far.

The reason for this is because as far as I am concerned the protocol number 999 belongs to the Remake Quake people, and they are obviously not finished their project yet; therefore implementation details would be subject to change. If I was to take the number 999 and use it, there would be danger of having two incompatible protocols using the same number. That's not in their interest and it's not in mine either.

I already know the grief that this kind of thing causes from my adventures in coding Nehahra support, where the Nehahra people extended protocol 15 but didn't change the number. Let's not have a repeat of that crap; once is enough.

I'm going to need to clean out a lot of my current protocol checking code at some time as the "if" statements are now starting to get quite lengthy.
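One possible shape for that cleanup, purely as a sketch: a small table keyed by protocol number, so that feature checks become a lookup instead of a growing chain of "if" statements. The flag names and the feature assignments below are assumptions for illustration only, not what DirectQ actually does (15 is the original protocol mentioned above, 666 is the number FitzQuake uses, 777 is the interim number).

```cpp
#include <cstdint>

// Hypothetical feature flags; the names are invented for this sketch.
enum : std::uint32_t {
    PF_FITZ_EXTENSIONS = 1u << 0,
    PF_EXTENDED_COORDS = 1u << 1,
    PF_SMOOTH_ANGLES   = 1u << 2
};

struct ProtocolInfo {
    int           number;
    std::uint32_t flags;
};

// Illustrative table; which protocol really carries which feature is an assumption.
static const ProtocolInfo kProtocols[] = {
    { 15,  0 },
    { 666, PF_FITZ_EXTENSIONS },
    { 777, PF_FITZ_EXTENSIONS | PF_EXTENDED_COORDS | PF_SMOOTH_ANGLES }
};

// Returns the table entry for a protocol number, or nullptr if it is unsupported.
const ProtocolInfo *FindProtocol(int number) {
    for (const ProtocolInfo &p : kProtocols)
        if (p.number == number) return &p;
    return nullptr;
}

// True only if the protocol is known and advertises the given feature.
bool ProtocolHas(int number, std::uint32_t flag) {
    const ProtocolInfo *p = FindProtocol(number);
    return p != nullptr && (p->flags & flag) != 0;
}
```

A read or write function would then ask ProtocolHas(protocol, PF_EXTENDED_COORDS) rather than comparing against each number in turn, and an unknown number such as 999 is simply rejected because it has no entry.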

All going well I'll release RC3 sometime over the weekend, and hopefully follow that up with the full release next week. See you there!

DirectQ now works on Linux/Wine

In the end I decided that the Wine implementations of the D3DX functions were just so shabby that there was a spiral of diminishing returns in trying to work around them, and I would have ended up compromising Win32 functionality in order to get things done. Compromising your primary platform in favour of a secondary platform doesn't seem to me like anyone's idea of being particularly clever, so I tried something else.

It may not be immediately obvious to everybody, but D3DX is just a utility library, in kind of the same way that GLU is a utility library for OpenGL. So really all it contains is a bunch of C/C++ code that calls into the main D3D DLL.

With that in mind, I copied the D3DX DLLs from a Windows machine over to my Wine installation. Bang, up she comes, it worked.

Now, I have to say at this point in time that doing this might be some kind of violation of the D3D EULA, and that there is no way on earth I would endorse or recommend it, and that if you were to do so, you would be doing so totally under your own volition and with full awareness of the consequences of your own actions. All I am doing is passing on the information that it works.

It may also sully the purity of a "Free" OS, but I would think that if you're using Wine to try run a D3D application you're already halfway to beyond redemption anyway.

So that was roundabout where I left things off with that. I haven't done any real performance tests, or even enough to establish that it's playable. I'm not even aware if my Linux installation can do hardware accelerated rendering at the moment: Ubuntu does have something that looked like a driver control panel, but all it told me was whether or not I had any proprietary drivers installed. Now that's cute, but it's also quite useless.

I'm going back to a few more things I need to do on the main codebase, and probably won't pick this particular experiment up again until the weekend, but so far I have at least established that it can be done.

Testing on Linux

I've been doing some preliminary testing on Linux (running via Wine in a VM) and the initial indication is that there is a little work required on my part to get things running OK.

The primary problem is with DirectQ's use of the D3DX library, or more specifically the fact that the Wine developers have not fully implemented all functions in this library, leaving a lot of them as stubs. D3DXCreateRenderToSurface and D3DXDeclaratorFromFVF are both called very early during DirectQ's video startup, and according to info on the Wine pages, both are currently either unimplemented or not implemented correctly.

The latter can be easily worked around in my code (I just need to create my vertex declarations the long way instead and the problem just goes away) but the former is a more interesting case. In summary, an exception is being thrown from inside the Wine implementation of D3DXCreateRenderToSurface. I know this for a fact as I surrounded it with MessageBox calls, so I'm quite confident that I'm not blaming the Wine folks in the wrong here. No amount of error checking or working around on my part can do anything about that.

The only reasonably sane way of dealing with this is to surround the render target creation with a __try/__except handler and catch the failure, which actually works out quite OK. DirectQ already has code for an alternative underwater warp if the rendertarget failed to create anyway, so I guess this is the current fallback mode for Wine, and, one day if the Wine developers implement this properly, it will just start working automatically.
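The control flow of that fallback looks roughly like the sketch below. Note the big assumption: it uses a standard C++ try/catch with a thrown std::runtime_error as a portable stand-in, whereas the real thing uses MSVC's __try/__except, which catches structured exceptions that ordinary C++ catch blocks generally do not see. The function names are hypothetical.

```cpp
#include <stdexcept>

enum class WarpMode { RenderTarget, Fallback };

// Hypothetical stand-in for the render target creation call. The `broken`
// parameter simulates an implementation that faults internally.
void CreateUnderwaterRenderTarget(bool broken) {
    if (broken) throw std::runtime_error("render target creation faulted");
}

// Tries the render target path first and drops to the alternative warp
// if creation blows up, instead of taking the whole engine down.
WarpMode InitUnderwaterWarp(bool broken) {
    try {
        CreateUnderwaterRenderTarget(broken);
        return WarpMode::RenderTarget;
    } catch (const std::exception &) {
        return WarpMode::Fallback;
    }
}
```

The useful property is the one described above: if the underlying call ever starts working, the first branch is taken with no further code changes.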

I'm going to do the vertex declaration changeover later on, but it's not beyond the bounds of possibility that there are other things either unimplemented or just broken in Wine, so that doesn't mean it's going to work after I get that done.

Upcoming changes

Now that RC2 is out I'm going to wait a while and see if bug/crash reports come in, but I'm starting to feel as though the next release might be the full and final 1.8.666 release.

Even if not, there are a few more changes going to come through next time around.

DirectQ entity movement and rotation has always been a little jerkier than it should be, despite the availability of interpolation; I've identified the cause of this and it has been eliminated from the engine. Everything is nice and smooth now.

Some more robust capabilities checking at startup has greatly increased support for older and/or downlevel hardware. Massive credit needs to go to the mighty Baker (he of ProQuake fame) for testing various builds, reporting crashes, and generally being patient during the troubleshooting of this one.

I am sorely tempted to change my default protocol to the new RemakeQuake protocol. This is based on the FitzQuake protocol (which is my current default) but has extended co-ordinates and smoother angles for all entities (not just the player) built in (which, IMO, Fitz should have also had from the outset). It might be a bit hairy doing this at this stage in the game, but I do want it at some point in time, and it's of benefit to everybody so if it can be done now I think it's worth shooting for.
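To give a feel for what "smoother angles" buys, here is a sketch comparing a one-byte angle encoding (the style traditionally used in Quake-derived protocols) with a 16-bit one. This is written from scratch for illustration and is not code from DirectQ, FitzQuake or RMQ; the function names are made up.

```cpp
#include <cmath>
#include <cstdint>

// One byte per angle: 256 steps per full turn, so about 1.4 degrees per step.
std::uint8_t EncodeAngle8(float degrees) {
    return static_cast<std::uint8_t>(static_cast<int>(degrees * 256.0f / 360.0f) & 255);
}

float DecodeAngle8(std::uint8_t v) {
    return v * (360.0f / 256.0f);
}

// Two bytes per angle: 65536 steps per full turn, so about 0.0055 degrees per step.
std::uint16_t EncodeAngle16(float degrees) {
    return static_cast<std::uint16_t>(std::lround(degrees * 65536.0 / 360.0) & 65535);
}

float DecodeAngle16(std::uint16_t v) {
    return v * (360.0f / 65536.0f);
}
```

Sending 45.5 degrees through the one-byte version gives back 45.0, while the 16-bit version stays within a hundredth of a degree. On a slowly rotating entity that coarse stepping is visible as jerkiness, which is why having the finer encoding for all entities matters.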

On the other hand, I mentioned that I wanted to make a change to handling of fullbrights, but have now decided to defer this one. I'm reasonably certain that I can get fullbright colours without needing to load an extra texture for them (confirmed with HLSL, actually), and definitely certain that I can seamlessly and robustly support gl_fullbrights 0, 1 and 2 (i.e. old blend mode), but it needs a bit more restructuring of the surface refresh than I'm currently willing to take on.

I'm also planning to at least make an attempt at Linux/Wine support; even if it only gets as far as checking whether or not it works and getting some idea of what needs to be done, it will be useful information.

Once I get the full and final release out I'm planning on taking a break from DirectQ for a short while. This will be to let ideas for 1.9 settle better in my mind, get away from what has been a quite intense piece of work recently, try some experiments outside of the main codebase, and generally relax and chill out more. I also fancy doing some OpenGL work, which I haven't really done in a while, and think I'm going to write a replacement gl_rsurf.c that should utilise some ideas from DirectQ's renderer but be adaptable to other engines quite easily. Sometimes a change is as good as a rest!

Wednesday, July 14, 2010

RC2 is Out

If you've already received a private build of RC2, you should download this one as there have been a few more changes and fixes since then.

Again, this is not totally recommended for general use or as your primary engine, but should nonetheless be a LOT more stable than RC1 was.

Link: http://directq.codeplex.com/releases/view/48964.

I expect that I will probably do an RC3 as there are a few more things that still need fixing, and there is a change I want to make in how it handles fullbright colours. So strictly speaking this isn't really a "release candidate" then, is it? Ah well, it's what it's called anyway.

Adventures in Linux Land

You'll need to wait a little while longer for the RC2 announcement, but it should be going up later on today. :)

Part of my stated intent with DirectQ is to have it running (or at least able to run) on a wide range of systems.  This is one of the reasons why I fret over things like Windows 98 support, and the implications of migrating to different versions of Visual Studio.  There is however only so much that can be done while keeping the program (and myself!) sane and rational at the same time.  That is one of the reasons why it doesn't run on Windows 98.

Anyway, I am by no means a "Weenix Loony", and I have a healthy disrespect for "Weenix Loonies" the world over, but there is another class of Linux user out there (who I would hope is - or one day will be - a majority among Linux users).  These are people who have truly evaluated the options available in the cold light of day, and not being tainted by subscription to any ideology, nor being motivated by what they hate (as opposed to being motivated by what they like), have made the decision that Linux is the correct OS for them (these people aren't "Weenix Loonies" either, by the way).

I think it would be cool if these people could get a decent game of Quake, using DirectQ, through Wine.

There are a number of reasons why somebody might want to use DirectQ instead of another engine, even in a case where you know that both the Windows API calls and the Direct3D calls are going through an emulation layer.  One reason is the increased capacity, another reason is its ability to handle bigger maps and more complex scenes with ease, and a third reason is that it's so damn fast (that one maybe not so relevant on account of the emulation).  Maybe someone just likes the way DirectQ looks and the way it does things, even.

So over the next few days I intend bringing up a Linux machine and running some tests.  To begin with I'm going to run it through VMWare just to establish that the thing works, but I also have an old(ish) PC knocking around that I could even do an installation on if I needed to or wanted to.  So far I've been doing some preliminary familiarisation work, just installing and configuring parts of the OS (Ubuntu 9.10 for now), and believe it or not I'm actually liking a lot of what I see.  I'll probably never feel totally at ease with Linux (my days as a Unix admin have left deep scars) but it's certainly a lot better in terms of an OS being a means to an end (running applications and getting stuff done) rather than an end in itself than it was the last time I looked.

What a very pleasant surprise.

Slow Machines, Visual Studio Versions, and More

Currently I can do a Visual Studio 2003 build of DirectQ that should in theory work on Windows 98; however there are Windows API calls that I use which are only available on Windows 2000 or higher (and DirectQ really does prefer to use the D3D10 shader compiler which is only available on Windows XP or higher) so a Windows 98 build is not a viable proposition any more.  It's an interesting thing to test all the same, and gives the lie to the theory that more recent software from Microsoft is just slow and bloated.  While 2008 does certainly produce a larger exe (1.3 MB vs 1 MB in the debug build; release will be smaller) it also runs about 25% faster.

No further comment necessary there, I think.

I'm looking forward to the day when I can just ditch Windows 2000 support and move to Visual C++ 2010; it's definitely a much nicer environment to work in, and it will be interesting to do performance comparisons with that.  I am however waiting for SP1 of 2010 to come out, and also you would likely all need to upgrade your Visual C++ runtimes and DirectX versions (2010 isn't happy with older versions of the DirectX SDK), which wouldn't be a nice thing to do; at least for a while.  I have done test migrations and have confirmed that DirectQ will compile clean and run well, but I can't make any performance comparisons right now as the last such build I made was about midway through 1.8.4.

Slow machine testing is a critical part of any DirectQ release, and I currently do test runs about twice a week.  This normally just involves some timedemos, a quick run through an e1 map or two, as well as an occasional big map (just for laughs).  It gives vital information on potential bottlenecks that just don't appear when you run on a faster machine, and is also great for testing fallback modes for when the latest and greatest hardware features aren't available.

The current slow machine of choice is a VMWare VM running Windows XP.  This is configured with 512 MB of main memory, 128 MB of video RAM and a single CPU core, so it would be reasonably representative of the type of machine that was common enough maybe 5 years ago.  It uses VMWare's video driver (no hardware virtualisation) which I guess is equivalent in speed to maybe a TNT2 (but with a more modern feature set) - I certainly seem to remember getting similar timedemo demo1 results (90-100 FPS at 800x600) back when I had a TNT2.

I also have a Windows 2000 VM which I've specced even lower, but very rarely use it.  It's slightly faster than the XP box, but has something of a tendency to bluescreen.  Some day I'm going to add a Linux box to the mix (is Ubuntu still the hip thing with the young things?) and try it out over WINE.

All in all there's no conclusion and nothing really relevant here if you're tracking current progress with the engine, just some of my random ramblings. :)

Next time I'll probably have the RC2 release available for everyone to try.  Till then.

Tuesday, July 13, 2010

Release Candidate 2 coming soon

As I've said, 1.8.666 contained too many code changes, hence the weird bugs that are emerging. The more cautious approach of doing release candidates seems to be the way to go until I get confirmation that things are settled down for everyone, at which point I'll do a full stable release.

RC2 will be on its way shortly based on feedback and bug reports from RC1.  The following items (so far) are addressed:

  • Mouse no longer locks when alt-tabbing back to a fullscreen mode.
  • Crashes will now give more descriptive info, including the file name and line number on which the crash event occurred (for those of you who've seen the sys_win.cpp, line 812 crash: that was my error handler!)
  • The Intel 945 "everything is black" bug has now been fixed; this was due to some rogue state changes not being reset properly when switching between HLSL on or off on a software T&L device.
  • Mouse input sending has been reverted to the old way in single-player games.
  • Changing the value of d3dx_version now causes a vid_restart to automatically issue, so that the depth buffer is properly recreated.
  • gl_fullbrights 0 and 2 modes have been removed owing to excessive video RAM usage.
  • Potential for vertex and index buffer sizes to overflow max allowed by your hardware has been removed.
More news as it happens!

Monday, July 12, 2010

And it's out

This is Release Candidate 1, not a full stable release, so expect bugs and loads of potential "fun".

http://directq.codeplex.com/releases/view/48822

One major point to note (this is also covered in an included readme).

I had originally intended to put as many of my changes to the code into the public domain as possible. That is still the long-term intention, nothing has changed there. However, it is important to protect the community from people who use other projects' code in their own work and give nothing back. On account of that, I am retaining the GPL (but using version 3) for this release.

Update

It seems that I made the right decision in putting out a "Release Candidate" first. I really did change too many parts of too many subsystems with this one, and overall stability, as well as the ability to isolate causes of problems, is suffering as a result. With hindsight releasing a "1.8.4b" after I had ported the ProQuake netcode, and then doing incremental releases as everything else came on, would have been a wiser move. Gratitude to everyone who's been brave enough to download and run this, and keep the crash and bug reports coming - we'll lick this one yet!

1.8.666 is now feature-complete

Yes, it's true. The last code change to bring the functionality level back to the old 1.8.4 level has just gone in. Everything else is just bugfixes at this stage. I think I might even go crazy and put a "Release Candidate 1" out sometime soon.

Most recent changes and fixes are:

  • Subpicture drawing for cl_sbar 2 and 3 completed.
  • Brush model alpha for enginetest-spirit and other maps that use alpha brushmodels without exporting a field from progs.dat fixed.
  • Texture compression restored for external textures only.
  • Underwater warp code tightened up slightly and made look better for values of r_waterwarptess other than 32.
  • Crash bug for playing a demo for which you don't have the map fixed.
  • Loading of external textures for 2D HUD and console elements tested and confirmed OK.
  • Additional testing and validation of video mode changes implemented.
  • Brush model alpha in non-HLSL path checked over and declared satisfactory.
There are a few small things to just review, but - assuming I don't get sense in the meantime - you can probably expect an RC1 release later on today! Wooot!

Department of crazy ideas - the return

Occasionally I'll have these crazy ideas; sometimes they come from something I've read, other times they seem to pop into my head from nowhere. Sometimes they are complete moonshine, other times they end up being useful and good.

The current one is relating to lightmaps. Handling lightmaps correctly (i.e. fast) in a multitextured renderer is not too easy. The way DirectQ currently does it is:

  • Lightmaps are built in texture order.
  • Tall but narrow lightmaps are used. Tall so that we can get a better chance of all surfs that use the same texture using the same lightmap. Narrow for dynamic light upload optimizations.
  • For rendering, first we chain all surfaces by texture.
  • Then we walk through each texture chain building up sub-chains by lightmap.
  • Finally we walk through each lightmap chain doing the actual rendering.
The goals here are twofold. Firstly to reduce the number of texture changes. That's not too big a deal anymore, texture changes are fast on modern hardware. The second goal is to increase the number of surfaces that can be included per-batch. That's a big part of the reason why DirectQ is able to render scenes like the Start map skill selection room in a total of 24 draw calls (including all the 2D HUD elements, models, etc).
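The chaining steps above can be sketched as follows. Instead of the linked chains the engine uses, this sketch swaps in a sort on (texture, lightmap), which produces the same grouping; the Surface and Batch structs and the function name are invented for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Surface { int texture; int lightmap; };

// One batch corresponds to one draw call for a (texture, lightmap) pair.
struct Batch { int texture; int lightmap; std::size_t surfaceCount; };

// Groups surfaces by texture first and by lightmap within each texture,
// then collapses each run of identical pairs into a single batch.
std::vector<Batch> BuildBatches(std::vector<Surface> surfaces) {
    std::stable_sort(surfaces.begin(), surfaces.end(),
        [](const Surface &a, const Surface &b) {
            if (a.texture != b.texture) return a.texture < b.texture;
            return a.lightmap < b.lightmap;
        });

    std::vector<Batch> batches;
    for (const Surface &s : surfaces) {
        if (batches.empty() ||
            batches.back().texture != s.texture ||
            batches.back().lightmap != s.lightmap) {
            batches.push_back({ s.texture, s.lightmap, 0 });
        }
        ++batches.back().surfaceCount;
    }
    return batches;
}
```

The number of batches is the number of distinct (texture, lightmap) pairs in view, which is why getting all surfaces that share a texture onto the same lightmap keeps the draw call count down.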

So now we come back to today's crazy idea. The goal is to increase the number of surfaces that can be stored on each lightmap texture without affecting performance. Two ways spring to mind.

Lightmap textures could become cubemaps, with each face of the cubemap corresponding to a traditional lightmap. Some twiddling of texture coords would be needed to address the proper part of the cubemap, but texture changes could be cut by a factor of 6, thus enabling bigger batch sizes. The downside is that I would have to say goodbye to my tall-but-narrow lightmaps, as each face of a cubemap must be square. This might hurt dynamic light performance.

Alternatively lightmaps could become 3D textures. Each slice represents a traditional lightmap, addressing is easier (you just need to know which slice to use), we get to keep tall-but-narrow and all the other benefits, but we have additional restrictions on 3D textures imposed on us, including stricter size restrictions (but it might be possible to store all lightmaps for the entire world in a single 3D texture, thus totally eliminating texture changes).
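For the 3D texture variant, the "just need to know which slice to use" part might look like this. It is a speculative sketch that assumes normalised texture coordinates and sampling at the centre of each slice so that filtering does not blend neighbouring lightmaps together; the names are hypothetical.

```cpp
struct TexCoord3 { float u, v, w; };

// Converts a traditional lightmap coordinate plus a slice index into a
// 3D texture coordinate. The u and v values pass through unchanged; w is
// placed at the centre of the chosen slice.
TexCoord3 LightmapTo3D(float u, float v, int slice, int sliceCount) {
    const float w = (static_cast<float>(slice) + 0.5f) / static_cast<float>(sliceCount);
    return { u, v, w };
}
```

With four slices, slice 0 maps to w = 0.125 and slice 3 to w = 0.875. The cubemap variant would need a more involved mapping from (face, u, v) to a direction vector, which is the extra texture coordinate twiddling mentioned above.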

Both of these are in the realm of the purely speculative at the moment, and I definitely haven't thought everything through. For example, DirectQ allows variable-sized lightmaps; how would they fit into such a picture? At what point in the past did 3D textures become commonly supported in hardware? Benchmarking of a tex2D vs a tex3D lookup also needs to be done.

However, if something works out from this it would be an exciting and interesting addition to 1.9; we'll see how things go.

Now back to 1.8.666!

Sunday, July 11, 2010

Updates for 11th July 2010

Texture compression has now been removed, unless anyone can come up with a compelling reason to keep it. Performance loss is in the order of 1 or 2 percent on ID1 timedemos and non-existent in bigger, more complex scenes.

Draw_PicFromWad and Draw_CachePic have been largely merged into a single function. I don't like Draw_CachePic and am going to remove it and just preload all the necessary pics at game startup time instead. This will make game startup take a little longer but will prevent temporary hitches when you access menus for the first time. I can see the sense in it back in the days when you had to co-exist with a software renderer and run on a machine with limited texture memory (only load what you need, and wait until you're certain you need it before loading it), but it's quite irrelevant, silly and over-complicating things today.

I also want to change the in-memory picture format. Having to cast the data member of a qpic_t to a glpic_t is a horrible hack that deserves to be grabbed by the ears and shown where the door is. That probably won't happen this time around though.

DirectQ now runs through a test ChangeDisplaySettings call before admitting a display mode that Direct3D reports back to the family. GLQuake did the same, but with D3D there should be no need for it - tighter coupling to the windowing system and video hardware means that it should give you valid modes all the time. It just feels a little safer this way though, especially now that I'm allowing the end-user to change more things with the selected modes.

I need to get a new beta version out to people for testing to see if the refresh rate changes I have made have been effective in eliminating reported crashes. I'm going to wait until I have the next round of changes made (to the 2D pic system) before doing so, however.

The current status is that correctly handling sub-pics for the QuakeWorld HUD is now the only real major item remaining on my to-do list. There are a few other more minor ones that are either less critical or will be left to the next release after.

Correcting brush model alpha in the fixed functionality path is not critical. At the present moment there is some kind of alpha present, but it doesn't blend correctly when you have multiple models layered on top of each other in the Z-order. The same probably applies to r_wateralpha. Either way I don't see it as a show-stopper as neither of these are used extensively in mods, and both work correctly in the HLSL path (which most people will have and be able to use).

There is also another issue with brush model alpha where there appear to be at least two standards in use for setting alpha. That used by Spirit's Engine Test Map doesn't work in DirectQ at the moment, but that used by Remake Quake does. I suspect that this might be a progs.dat or protocol issue, and will probably investigate further and fix it if possible before release.

Half Life BSP support isn't going to happen this time round. Most of the support required is already there, and an earlier version of the engine actually was able to load them OK, but I changed the palette handling a few versions back, and have changed it again for this version, and still haven't gotten round to integrating it properly yet. It may crash in other places too; this hasn't even been tested for a few months.

To be honest I see this as being more of a novelty feature than anything else. OK, it's kinda neat to be able to run Half Life maps in a Quake engine, but what does it actually gain you? What does it achieve? There are no mods out there that seriously use the Half Life BSP format, and until such a time as one appears this feature is purely window dressing.

Long map names in the Save and Load menus still need proper handling. For now I'm probably just going to chop them off at the max supported by the display. Having a clean and clear display seems more important to me than showing all of a cutesy map name.

External textures for HUD and Menu items need to be tested again. Not really a big deal as the new code was written with support for these in mind, but I just need to be sure that I've missed nothing.

That's about it for this version; most of what's needed could potentially be turned around in a very short time, but I do want to take some extra time out to ensure that nothing nasty has managed to sneak in. With hindsight it's possible that too much code has changed with this one, but done is done and going back would be a little silly right now. If nothing else it should make 1.9 a little easier!

Some proposed changes

These are going to change the way DirectQ looks and how fast it runs so I want to shoot them out to the floor before doing anything.

First one is that I'm thinking of removing texture compression. I'm really starting to notice some quite severe compression artefacts on certain textures, and visually DirectQ compares unfavourably to other engines as a result. The brick texture on the floor of the start map is one example here; with texture compression it just looks messy and ugly, whereas without it looks more solid and clean. The conclusion is that Quake's low resolution textures just don't behave too well with texture compression enabled.

The downside is that there will be a loss of performance as a result; even a 64 x 64 texture will give a faster engine with texture compression enabled than without. I need to measure this with big maps to see how bad it's going to be, but right now I'm increasingly of the opinion that the trade-off just isn't worth it.

Before you say it, with earlier versions of the engine I had provided a cvar to let you switch it on or off. But owing to the way Quake caches textures it's not an immediate change; at the very least you would need to reload the map, worst case is restart the engine.

There have already been cases where I have had to remove compression from certain textures - the 2D HUD and menu graphics for example - owing to the quality loss being unacceptable. Doing a direct with/without comparison indicates that it's a global factor, not just something confined to these textures.

(Before you get too worried: texture compression is only one of the factors that give DirectQ its high performance, and is probably the least significant of those. We're not talking about scenes that used to run at 200 FPS suddenly grinding down to 20 FPS here, more like 195. It's still going to be faster than 1.8.4 was.)

The second one is that I think DirectQ is too bright. This is a result of the fact that I apply the same default gamma scaling to the Quake palette as GLQuake does, but DirectQ is really the only engine that does this, and there are places where it really stands out badly. It also loses a certain amount of precision in the palette. It never really worked well with external textures either, and I had to resort to some evil hackery (including slowing down external texture loading quite a bit) to get it consistent.
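The precision loss is easy to demonstrate with a sketch. This is not the engine's code: the curve shape and the 0.7 exponent are assumptions chosen for illustration, and both function names are hypothetical.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <set>

// Build a 256-entry lookup table that applies a gamma curve to 8-bit values.
// A gamma below 1 brightens the image. The formula is an illustrative
// assumption, not a confirmed copy of what any engine does.
std::array<std::uint8_t, 256> BuildGammaTable(double gamma) {
    std::array<std::uint8_t, 256> table{};
    for (int i = 0; i < 256; ++i) {
        double f = std::pow((i + 1) / 256.0, gamma);
        int v = static_cast<int>(f * 255.0 + 0.5);
        if (v < 0) v = 0;
        if (v > 255) v = 255;
        table[i] = static_cast<std::uint8_t>(v);
    }
    return table;
}

// Count how many distinct output levels survive the remap.
std::size_t DistinctLevels(const std::array<std::uint8_t, 256>& table) {
    return std::set<std::uint8_t>(table.begin(), table.end()).size();
}
```

Calling DistinctLevels(BuildGammaTable(0.7)) returns fewer than 256: the darkest input already maps well above zero, so several inputs have to share an output level. Once the palette has been through that, the lost levels can't be recovered by the brightness slider.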

This is probably more of a no-brainer to be honest. Brightness is a factor influenced by many things, including your video card, your monitor, the time of day, the lighting in your room, and even your eyesight. There is no "one size fits all" value here, and trying to find one is just an exercise in futility. However, doing it will give more of an immediate visual change than removing compression would.

The proposal is to remove the gamma scaling. If you want DirectQ to look brighter (or darker) you can always just use the brightness slider in the menu to find a value that works well for you.

In both cases I am inclining very strongly towards doing what I have proposed. Both cases involve some kind of trade-off, but you really do need to balance the pros and cons here, and it seems to me right now that the pros of removing them far outweigh the pros of keeping them.

Update:

OK, I've pretty much decided to keep the gamma scaling. Having run it without on a few different machines so far, it's just too big a difference, and too far in the opposite direction.

Texture compression is still a candidate for removal. To illustrate better, and because pictures tell more than words, here's an example of what I mean. The left-hand side has compression, the right-hand side doesn't. (You may need to click on this for the larger image to see it better).



Some benchmarks also. timedemo demo1 currently gives 328 FPS with compression and 324 without. The (in)famous ne_tower scene currently gives 100 FPS in both cases, no difference. Obviously in this one the bottlenecks are elsewhere and having texture compression enabled does nothing to help. This is really swinging me more towards removing it.
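FPS differences are easier to judge as time per frame. A small sketch (the helper names are hypothetical, not engine functions):

```cpp
// Convert a frames-per-second figure into milliseconds spent per frame.
constexpr double FrameTimeMs(double fps) {
    return 1000.0 / fps;
}

// Extra milliseconds per frame when the rate drops from fastFps to slowFps.
constexpr double FrameTimeCostMs(double fastFps, double slowFps) {
    return FrameTimeMs(slowFps) - FrameTimeMs(fastFps);
}
```

FrameTimeCostMs(328.0, 324.0) comes out at a little under 0.04 ms per frame, or roughly 1.2%, for the demo1 figures above.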

Saturday, July 10, 2010

Updates for 10th July 2010

gl_fullbrights 0 mode has now been done. The chosen solution was to load an extra copy of the texture for use with this mode; as I've said before this only applies to the native low-resolution 8-bit textures so video RAM overhead from it is going to be minimal. I'm also going to do gl_fullbrights 2 which will give you the same (uncorrected) blend as is used by most other engines.

gl_fullbrights and gl_overbright are now no longer saved to your config; you'll need to explicitly request them if you want to use them, and explicitly put them in an autoexec if you want to keep them. This is in line with my philosophy that fullbrights and overbrights are intended features of the Quake engine that should normally not be switched off, but also gives you the ability to switch them off if you're running a "made for GLQuake" map that has bad texturing in it.

By the same philosophy, neither option is exposed by the menus.

A few things have been fixed up with the 2D HUD/console textures. Firstly I now offset the vertex coords by -0.5 in both the x and y dimensions so that it gives correct mapping of texels in the texture image to pixels on-screen. This isn't some cheesy non-robust hack but is the recommended approach; see http://msdn.microsoft.com/en-us/library/bb219690%28VS.85%29.aspx for more information (as well as a description of the problems it overcomes).
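In outline the offset looks something like the sketch below. The HudVertex struct and BuildHudQuad function are hypothetical names for illustration, and the vertex order assumes a triangle strip; the real vertex layout isn't shown in this post.

```cpp
// Hypothetical pre-transformed 2D vertex: screen position plus texcoords.
struct HudVertex {
    float x, y;   // screen-space position in pixels
    float u, v;   // texture coordinates in the 0..1 range
};

// Fill four vertices (triangle-strip order) for a quad covering w x h pixels
// at (x, y). Positions are shifted by -0.5 so that texel centres line up with
// pixel centres under Direct3D 9 rasterization rules.
void BuildHudQuad(HudVertex out[4], float x, float y, float w, float h) {
    const float offset = -0.5f;
    out[0] = {x + offset,     y + offset,     0.0f, 0.0f};  // top left
    out[1] = {x + w + offset, y + offset,     1.0f, 0.0f};  // top right
    out[2] = {x + offset,     y + h + offset, 0.0f, 1.0f};  // bottom left
    out[3] = {x + w + offset, y + h + offset, 1.0f, 1.0f};  // bottom right
}
```

Only the positions move; the texture coordinates stay exactly as they were.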

Secondly I'm now padding scrap texture images properly so that linear filtering will no longer cause adjacent HUD icon texels to bleed into each other.
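One common way to pad is to surround each image with a one-texel border that repeats its edge texels, so that the filter never samples a neighbouring icon. The sketch below shows that approach for 8-bit data; it is an assumption that this matches what the engine does, and CopyWithPadding is a made-up name.

```cpp
#include <cstdint>
#include <vector>

// Copy a w x h 8-bit image into an atlas so that its top-left texel lands at
// (dstX, dstY), and surround it with a one-texel border that repeats the
// nearest edge texel. The caller must have reserved (w + 2) x (h + 2) texels
// starting at (dstX - 1, dstY - 1).
void CopyWithPadding(std::vector<std::uint8_t>& atlas, int atlasWidth,
                     const std::vector<std::uint8_t>& src, int w, int h,
                     int dstX, int dstY) {
    for (int y = -1; y <= h; ++y) {
        // Clamp the source row so border rows reuse the first/last row.
        int sy = y < 0 ? 0 : (y >= h ? h - 1 : y);
        for (int x = -1; x <= w; ++x) {
            // Clamp the source column in the same way.
            int sx = x < 0 ? 0 : (x >= w ? w - 1 : x);
            atlas[(dstY + y) * atlasWidth + (dstX + x)] = src[sy * w + sx];
        }
    }
}
```

The cost is two extra rows and columns of scrap space per icon, which is small change for HUD graphics.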

There are a few other minor changes coming up soon which I may update this post with later on. Till then.

Thursday, July 8, 2010

Updates for 8th July 2010

Refresh rate changing seems to be a mite unstable on XP at the moment so I've removed the option from all versions of Windows prior to Vista. This may be a "feature" of the older driver model that XP uses, or it may be that I've missed something in my code, but until I know for certain it seems better to play it safe. DirectQ will by default run at your desktop refresh rate anyway, so it shouldn't be that big a deal for most people.

The refresh rate option is also not visible if you only have one refresh rate available.

You should note that DirectQ may report different refresh rates than your Display control panel or an OpenGL engine reports. This is because I use the D3D display mode enumeration functions to retrieve the list of refresh rates; it's just telling you what D3D tells it, and isn't a DirectQ bug.
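As a rough sketch of the bookkeeping (not the actual D3D calls): given a list of enumerated modes, pick out the unique refresh rates for one resolution, cap the list at the 64 entries mentioned with the video menu, and only show the option if there's a choice to make. DisplayMode here is a hypothetical stand-in for what the enumeration returns, and filtering per resolution is my assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one enumerated display mode.
struct DisplayMode {
    int width;
    int height;
    int refreshRate;  // in Hz
};

// Collect the unique refresh rates reported for one resolution, in ascending
// order, keeping at most maxRates of them.
std::vector<int> CollectRefreshRates(const std::vector<DisplayMode>& modes,
                                     int width, int height,
                                     std::size_t maxRates = 64) {
    std::vector<int> rates;
    for (const DisplayMode& m : modes) {
        if (m.width != width || m.height != height) continue;
        if (std::find(rates.begin(), rates.end(), m.refreshRate) != rates.end()) continue;
        if (rates.size() >= maxRates) break;
        rates.push_back(m.refreshRate);
    }
    std::sort(rates.begin(), rates.end());
    return rates;
}

// The menu option is only worth showing when there is more than one choice.
bool ShowRefreshRateOption(const std::vector<int>& rates) {
    return rates.size() > 1;
}
```

Whatever goes into the list is whatever the enumeration reported, which is why the numbers can differ from those shown elsewhere.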

I've also fixed a mouse input bug when connected to ProQuake servers, so things should no longer jerk around badly when you're looking straight down. In many respects connecting to a ProQuake server is a lot like running a high-capacity map. There are just so many small, subtle, but significant changes scattered all throughout the code (as well as some not-so-small ones!) that it's difficult to say if I've caught them all. It would have been preferable if ProQuake had defined a new protocol back in the day; as it stands the changes practically constitute a new protocol and the differences would have been clearer and easier to pick up.

I'm going to be taking a short break to catch up on some RL stuff, so further work is temporarily on hold for a few days. It's likely that the release will now be pushed back to mid/late July on account of this as well as there being a few more things with it that I need to work a little on. It will be a better program as a result of this, however, so it's worthwhile.

See you all soon!

Wednesday, July 7, 2010

Updates for 7th July 2010

gl_fullbrights 0 mode has been removed, at least until I find an acceptable solution. The options were:

  • Make it available in the HLSL path only.
  • Keep the original texture data in memory and recreate the textures if the mode changes.
  • Recreate the textures from disk if the mode changes.
  • Load a second copy of the affected textures for use with gl_fullbrights 0 mode.
  • Rewrite the renderer.
Of these, the second-last one is the most acceptable to me; the others have varying degrees of unacceptability, ranging from "it'll work but it'll be ugly" to "that's flat-out crazy".

I may yet decide to go with the second copy of the main texture; it only applies when native (low resolution 8-bit) textures that contain fullbright colours are loaded, so it's not going to be much video RAM overhead - a Voodoo 1 would probably even cope with it. Plus it's more elegant in operation as I can set the correct texture to use during setup instead of at render time, as well as provide a gl_fullbrights 2 mode which over-saturates the colours in a manner that people may be more accustomed to from some other engines.

Unless there are any major objections that seems to be the solution of choice.

The current build went out for beta testing to some people today and I've already got some useful feedback and bug reports in. Pretty much all major remaining changes are purely cosmetic in nature, so we're into the settling down and stress-testing phase.

Longer term plans for 1.9 are also forming. A certain amount of what I had intended that version to be actually ended up making it into this version, but there are still a few other things that were in my original plans that I still need to do with it. As things develop you'll be the first to know, but let's get this one out the door before that.

Till next time.

Tuesday, July 6, 2010

Updates for 6th July 2010

Pretty much all that got done today is implementation of gl_fullbrights 0 mode on the HLSL path. I'm quite dubious about this mode; on the one hand it's very much incorrect for Quake to disable fullbrights and doing so will completely wreck the intended look of the original game. On the other hand many mappers and texture artists only ever tested their work in GLQuake (which didn't have fullbrights) and as a result have fullbright pixels where they shouldn't.

These works will also look crap in software Quake, so there's a strong case to be made for not fully supporting gl_fullbrights 0 mode. Software Quake is after all the Gold Standard, the intended look of Quake, and anything that diverges from that is incorrect. Dilbert being told to "put the bugs back in" springs to mind here.

It's almost a definite decision that it won't be a "fast path"; the new texture creation code I use would mean that I would have to completely recreate all textures every time gl_fullbrights changed, and I don't want to do that. I could go with requiring you to reload the map but that goes against the ethos of DirectQ, where even changing gl_subdivide_size doesn't need a map reload.

Some other minor things; I managed to test refresh rate changing and fixed a small bug in it, as well as found and fixed two not-so-small bugs that go back a few versions. One was a stupid array overflow, which doesn't seem to negatively affect the currently released versions, but occasionally when I made code changes I found a -1 sneaking in where it shouldn't be. It had me puzzled for a while but I've got it licked now. The second was a D3D resource leak during shutdown - nothing dramatic or dangerous, but more in the interests of correctness.

UPDATE:

OK, I've done it for r_hlsl 0 with brush models, and Christ it was messy. Right now MDLs are shaping up to be a major pain. Based on this I'm really really leaning towards just removing gl_fullbrights 0 mode, or - at the very best - making it ineffective with r_hlsl 0. The more I think about it, the more I think that I'm covering the arses of content makers who didn't bother doing their homework back in the day, and that just gives me one of those baaaad feelings.

I'll sleep on it.

New Video Options Menu

OK, here's your new video options menu:



Some things to note:

  • The resolutions available are different for windowed modes and for fullscreen modes, and the options list will switch when you select a different type.  This is consistent with the way WinQuake works, and to be honest if I was to change it I would mess up everyone's configs.
  • Colour depth (must change that to "color" for release) is either 16 or 32 and is not available in windowed modes.  In a windowed mode you just get the same colour depth as your desktop.  That's the way windowed modes work, at least in D3D.
  • Refresh rate is likewise not available in windowed modes and for the same reason.  It's also controlled by the "vid_refreshrate" cvar; this will default to the same refresh rate as your desktop.  I allow up to 64 different refresh rates, but - danger - every single machine I've tested on so far has only one refresh rate available.  If I can find one with more I'll verify that it works right, otherwise I guess I won't.
  • Depth/Stencil format allows you to select the depth and stencil buffer formats you want.  These are combined into a single format in D3D and so can't be separated.
  • Due to the way D3D display modes work, and if you have any exotic or unusual hardware, it might be possible to set a combination of modes here that your hardware doesn't actually support.  I'm likely going to need to add some extra validation when switching modes, like a CheckDeviceFormat call or similar.  In general however this should only happen if you have cruddy old kit from the year 1473.
  • I've enforced that the changes only happen one at a time with a full screen refresh in between them.  This even applies if you change multiple options together before hitting that "Apply Video Mode Change" option.  This should (hopefully) make it easier to recover from one of those invalid modes I mentioned above.
That's about it - tomorrow it's back to more bug-fixing and finishing up.

Monday, July 5, 2010

Updates for 5th July 2010

Having resolved the problem of the skybox cubemap in much less time than I had originally budgeted, I now find myself with the opportunity to add some extra "value added" stuff. This is something I like to do from time to time, and today's chosen victim is the video options menu.

Much of this menu dates back over a year, and some of the implementation is quite crude. So a nice layer of polish is going to be added to the mode selection, with the ability to select windowed or full screen modes, bit depths, resolutions, and possibly even refresh rate separately from each other.

I'm also going to add the ability to select a depth buffer format. Right now DirectQ locks the depth buffer format at startup time, and prefers 24-bit depth and 8-bit stencil. The reason for this is to make r_shadows mode look as acceptable as possible (the major flaws with it being beyond redemption), but some people (and I'm included in their number) frankly couldn't be arsed about r_shadows.

Now, it's a strange thing among certain Quake players, but they seem to love those cheesy, hacky and unstable "not very robust but cool to look at" (John Carmack, 1996) GLQuake effects. Even if there are obvious flaws with them, and even if other games before and since have done them much better, they still want to see their crap shadows, they want to see their crap mirrors and they want to see their crap fog. The fact that DirectQ can run Masque of the Red Death at 200 FPS, fixes the rubbish GLQuake sky and water, and restores the old classic underwater warp doesn't seem to matter much to this kind of person. No crap mirrors - DIE!

OK, enough of that. If you're one of those people I obviously don't include you among those I'm complaining about! But if you couldn't give a flying one about r_shadows there is a 10-15% performance gain to be had from not even creating a stencil buffer in the first place - even with r_shadows 0! I think you deserve to have that performance gain, so you're going to be able to get it.

On the other hand if you do care about r_shadows, there is still a performance gain to be had by not using a stencil buffer and drawing the shadows opaque black. Not as much (drawing the shadows adds overhead) but worth getting. So you should be able to get that too. Finally if you want an intermediate level of shadowing with the stencil buffer used to draw them, you should also be able to get that.

However, at the end of the day it is my engine, and I'm the one calling the shots here, so the default mode is going to be a fast one that doesn't have a stencil buffer. I expect you all to remember that and not come complaining to me when your shadows look crap later on. :)
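The three cases boil down to a simple mapping, sketched below. The enum and function names are hypothetical, not DirectQ identifiers; in D3D9 terms the two results would correspond to something like D3DFMT_D24X8 and D3DFMT_D24S8, though the exact formats the menu offers aren't listed here.

```cpp
// Hypothetical shadow settings matching the three cases described above.
enum class ShadowMode { Off, OpaqueBlack, Stencilled };

// Hypothetical depth buffer choices: 24-bit depth with or without stencil.
enum class DepthFormat { Depth24NoStencil, Depth24Stencil8 };

// Only stencilled shadows need a stencil buffer; everything else can take the
// faster depth-only format.
constexpr DepthFormat ChooseDepthFormat(ShadowMode mode) {
    return mode == ShadowMode::Stencilled ? DepthFormat::Depth24Stencil8
                                          : DepthFormat::Depth24NoStencil;
}
```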

So that's today's "spare time" project, and I'm currently carefully crafting the code that will let you do all of this, as well as cleaning out some accumulated legacy junk that was originally written to resolve problems that I've since resolved elsewhere in a much more elegant manner.

It should be fun.

Sunday, July 4, 2010

A funny thing happened on the way to my cubemap...

So after doing some research on cubemap projections in HLSL, all that I found waffled on at great length about reflection vectors, tangents, half-angles and so forth. Nice stuff for certain but it only serves to confuse the matter when all you want is to project a cubemap onto skybox surfaces.

My normal reaction in these cases is to say "this is a load of crap, I'll do it myself". So I bashed out a quick shader, made some mistakes, learned a few things about the differences between cubemap texture lookups and normal texture lookups in HLSL, and got to the stage where I was just doing a standard texture lookup with none of the fancy projection stuff. Just to see what happened so I could make a decision on where to go next.

It worked. That was all that was needed.

This normally does not happen. Normally I need to wade through layers of obfuscated so-called documentation that stops just short of giving me the hint I need to get to what I need to know. Normally I need to swear blind at Microsoft's technical writers and say things like "if the person responsible for this had their head on a pike in front of my house I'd be quite happy right now". Stuff like that.

So I was expecting literally hours of sweating and swearing to get this, and instead it took maybe 5 minutes? And when I did find something relevant it confirmed that what I had done was right? Guess I got lucky with this one.

Updates for 4th July 2010

First of all it's a happy wotsit to my American readers. Drink lots of root beer, eat your mom's pie, and don't flunk math now, y'hear. ;)

Anyway, I decided to bite the bullet and implement a full HLSL path for 1.8.5 (must remember to stop calling it that). This is the preferred alternative to the fixed functionality path, and is enabled by default if your 3D hardware supports Shader Model 2 or above; you can switch it off with r_hlsl 0 if it causes you trouble, of course.

In terms of quality I've tried to keep the 2 paths as close as possible. Naturally there are some parts of the renderer that are always going to be better with HLSL as it can perform its operations per-pixel instead of per-vertex. These would be water and sky, so nothing much has changed since 1.8.4 in that regard.

In terms of performance the HLSL path should eat the fixed path for breakfast most of the time. A big win is sky, which runs considerably faster using HLSL than it does with the fixed path. Nothing much changed since 1.8.4 there either.

So what is different then? At present it's really all just behind the scenes stuff. By having pure HLSL all the way (or pure fixed all the way if you set r_hlsl 0) what we have is a cleaner and more consistent renderer with none of the "funnies" you tend to get if you try mixing HLSL with fixed in the same path.

Longer term things will get faster or better in certain areas. A lot of the overhead of particle setup can be shifted to a vertex shader, for example, meaning that particle-heavy scenes should become more efficient. Likewise there may be scope for reduction of vertex submission overhead elsewhere; this will need to be examined on a case-by-case basis.

About the only thing left that I haven't migrated is the skybox, the main reason being that I'm not too certain how the automatic texture coord generation I've used for it is going to work out. As soon as I knuckle down to do it I'll know, of course.

The rest of the items remaining on my to do list have been joined by a few more, but overall it's coming together quite well. Most of them are just relating to tidying up of loose ends at this point in time, but some of those may need a bit of work.

The big one is the sound restart crash. I've determined that it only seems to happen when you're accessing the menus, so for now the solution seems to be to just remove the offending items from the menu. When I port the Q2 (or Q3A) sound code things should get a lot better.

Overall today I got almost nothing done, but did test quite heavily, running through most of e2 and e3, as well as Masque of the Red Death. Performance remains quite solid, and handling of some of the more intense moments in Masque is impressive. A lot of the testing was also "slow machine testing", using a VMWare XP session and VMWare's video driver. Getting a solid 72FPS on ID1 maps with that is a great result, even though it struggles to go higher (it can manage 125FPS timedemos though).

Tomorrow I hope to hit the skybox code and get that one off the list. Till then.

Friday, July 2, 2010

Updates for 2nd July 2010

Not too much today as I was feeling tired. I did implement HLSL paths for all brush models (including the world) and alias models (i.e. MDLs) but right now they don't really do much more than the old fixed-functionality paths did. About the only thing new is that translucent brush model alpha is now done correctly. I guess that when I implemented it in the fixed path I saw some alpha and just assumed that it was OK, but it's not really. I'm not too certain how much effort I'll invest in fixing it there.

Both of these were originally scheduled for 1.9, but they were so easy to do, and as I didn't really feel like getting stuck into some of the other required work I decided to just keep the momentum going by doing something.

The ultimate long-term objective is that everything (at least in the 3D refresh) will have both an HLSL path and a fixed path available. This is an absolute prerequisite for the return of fog to the engine, as - I believe I may have mentioned this before - the old hardware fog is no longer guaranteed to be available in Shader Model 3 or above hardware (yes, it works in OpenGL, but D3D - being closer to the hardware - is affected by this kind of thing more).

Depending on your hardware you might find that the HLSL paths are somewhat faster than the fixed func paths. If you've anything from within the last 5 or so years you probably will, and if you've anything from within the last 2 or 3 you definitely should. This is mostly because on newer hardware fixed func is emulated through shaders constructed on the fly, so we're bypassing that and going direct. Another reason is that shaders are much lighter on the state changes than fixed func, requiring no texture blend modes to be set up or texture stages to be enabled or disabled.

Anyway, I'm reasonably certain that the only fixed functionality items left in the 3D view are particles and sprites (and the skybox, but I'll probably leave that as is), so I'm seriously considering going all the way with this for 1.8.5; the next step would be to start amalgamating my shaders so that we have One Shader To Rule Them All with different Techniques and Passes being set depending on what's being drawn.

In some respects this might be seen as a step backwards to the old 1.6.x releases, but overall what I've got now is much better. For starters HLSL is no longer an absolute requirement; it will be used if it's there but a fixed functionality alternative is available too. Secondly, things are a lot better integrated than they were in the past, with only a few lines of code in the difference between the paths.

All of this means that what I've been calling 1.8.5 all this time (even in this post) may end up getting a further version number bump. Not all the way to 1.9, of course, but something like 1.8.666 sounds about right. It does have implications for how 1.9 is going to shape up though, as a good chunk of work I had scheduled for it has now been brought forward. Interesting times.

Hardware gamma evil

It's a common enough story. You've used the brightness slider, something goes bad, the engine crashes, and now you can microwave a cat with the brightness pouring from your monitor.

Well no more. I've installed an exception handling routine that will restore your monitor gamma ramp to a flat level on a crash. Obviously it's better not to crash at all, but in the event that you do, at least it happens a little more gracefully now.
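For the curious, a flat ramp is nothing more than a straight line from 0 to 65535 in each channel. The sketch below builds one; it's an illustration rather than the engine's code, and I'm reading "flat" as "linear". The GammaRamp struct and BuildLinearRamp name are hypothetical, but the layout (three arrays of 256 16-bit values: red, green, blue) is the one the Windows SetDeviceGammaRamp function expects, and a crash handler of this kind would typically be registered with SetUnhandledExceptionFilter.

```cpp
#include <cstdint>

// Three 256-entry channels of 16-bit values: red, then green, then blue.
struct GammaRamp {
    std::uint16_t red[256];
    std::uint16_t green[256];
    std::uint16_t blue[256];
};

// Fill the ramp with a straight line so that input level i maps to the
// matching 16-bit output level (0 -> 0, 255 -> 65535).
void BuildLinearRamp(GammaRamp& ramp) {
    for (int i = 0; i < 256; ++i) {
        std::uint16_t v = static_cast<std::uint16_t>(i * 257);
        ramp.red[i] = v;
        ramp.green[i] = v;
        ramp.blue[i] = v;
    }
}
```

Building the ramp ahead of time keeps the handler itself short, which is what you want in code that only runs when something has already gone wrong.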

Longer term I intend using this to also allow you to give me useful info about what caused the crash, but for now it's justified its existence well enough.

Thursday, July 1, 2010

Updates for 1st July 2010

Things are really falling off the "to do" list now, with everything starting to come together well and to look and behave more cohesively.

I broke anisotropic filtering again so I fixed it again.

I've removed sv_gameplayfix_gravityunaffectedbyticrate. Fundamentally it didn't work right, and I'd rather not have it at all than give the impression that it's there and working.

fullsbardraw is now a cvar and a menu option. In most cases you shouldn't need it owing to the flipping policy DirectQ uses, but it's there in case you ever do.

gl_polyblend has now been properly integrated with the underwater warp code.

I'm probably going to make the default for occlusion queries 3 (full) now. I've been regularly testing this and haven't seen blinky once, not even under conditions where I used to before. I'll need to fine-tune the cutoff points at which models get queries run, and may cvar-ize them. If I do they won't be menu options; these are "advanced users only, and you are responsible for the results of your own actions" stuff.

The surface chaining by texture/lightmap/f2b setup is done and gets a few extra frames in heavy situations. It's of real benefit in Marcher.
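The idea behind ordering by texture, then lightmap, then front-to-back distance can be sketched with a plain sort. This is an illustration only: the real engine uses chains rather than a sorted vector, and the Surface struct and function names below are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical surface record: just the keys that matter for ordering.
struct Surface {
    int texture;     // texture id
    int lightmap;    // lightmap id
    float distance;  // distance from the viewer
};

// Order surfaces by texture, then lightmap, then nearest first, so that
// surfaces sharing state end up next to each other.
void SortForBatching(std::vector<Surface>& surfaces) {
    std::sort(surfaces.begin(), surfaces.end(),
              [](const Surface& a, const Surface& b) {
                  return std::tie(a.texture, a.lightmap, a.distance) <
                         std::tie(b.texture, b.lightmap, b.distance);
              });
}

// Count how many times the texture/lightmap pair changes while drawing the
// surfaces in their current order (the first surface counts as one change).
int CountStateChanges(const std::vector<Surface>& surfaces) {
    int changes = 0;
    for (std::size_t i = 0; i < surfaces.size(); ++i) {
        if (i == 0 || surfaces[i].texture != surfaces[i - 1].texture ||
            surfaces[i].lightmap != surfaces[i - 1].lightmap) {
            ++changes;
        }
    }
    return changes;
}
```

Four surfaces that alternate between two textures cost four state changes as submitted and two once sorted; scale that up to a map the size of Marcher and the saving becomes visible.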

The 2 TMU renderer works. In the current build it's user-selectable but I'll be removing that before release.

I'll probably be making the sound restarting stuff a startup-time-only option for now. I haven't seriously looked at it yet, but I don't think it's a big enough deal to invest too much time in just now. I'll likely spend a few hours over the weekend but if I don't get it by then I'm killing it.

One final thing for now. Some of you may have noticed that there is a new DirectX SDK out. I'm not intending to move DirectQ over to it; in fact I might be backtracking it a little to an older version. In general I'm an opponent of the "older software is better just because it's older" mentality (but likewise I don't subscribe to "newer software is better just because it's newer" - good software is good, irrespective of age) but in this case using an older version means that the "upgrade your DirectX" thing that's been a problem should fade away even more.

Some messy fiddly stuff coming up in the next items on the list, but right now I'm willing to say that there will be a release sometime within the next 2 weeks. I can't be more specific just yet, but we're getting there.

UPDATE:

Setting the occlusion threshold (number of vertexes below which models won't get queries run) to about 200 for both MDLs and brush models appears to give the best balance between cutting out the really heavy stuff and avoiding the overhead of queries for smaller stuff. Of course it's totally possible to construct a model with a small number of vertexes that totally punishes fillrate. I've decided to not be worried about that for now.

1.8.4 versus 1.8.5 performance

Performance optimizations that benefit one type of map don't always translate into optimizations that benefit another type. There are some that are no-brainers for sure, like the improved batching, reduced particle fillrate, chopping off the status bar area from per-frame updates and scaled down rendertarget for bonus flashes.

Others are different. With ID1 maps, using a common vertex size and format for everything made sense. It was only necessary to set up the vertex buffer once per frame, and so on. With other maps it was a drain. The sheer amount of data going in meant that not only was the advantage of only setting up the buffer once wiped out, but also that further performance was lost.

I could have cvar-ized this one and let the player decide, but ultimately I decided myself. ID1 maps are going to run fast anyway, so any further performance gains - while certainly nice - are dubious at best. When they're achieved at the expense of reduced performance where you really need it most, then they're just more trouble than they're worth.

So I've reverted 1.8.5 back to the old 1.8.4 way of using different vertex sizes and formats. Ultimately this means a drop of a few percent in ID1 maps (doesn't matter, 1.8.5 already wipes the floor with 1.8.4) but a worthwhile (and sometimes even colossal - I've seen 30% increases here) gain in big complex scenes that really stress the GPU.

FPS scores are meaningless; all they serve to indicate is the difference on my own machine. But for what it's worth, 1.8.4 was nudging on 260 in timedemo demo1, whereas 1.8.5 will get you over 310. A good chunk of this comes from the reduced-size rendertarget (to 80% of the viewport size, which I've found keeps framerates when underwater consistent with framerates when above water) but even without that it was getting over 280.
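To see why the 80% figure matters so much, note that it applies to each dimension, so the pixel count falls to about 64% of the original. A small sketch, with hypothetical names and integer percentages to keep the rounding predictable:

```cpp
// Width and height of a render target in pixels.
struct TargetSize {
    int width;
    int height;
};

// Scale one dimension by a whole-number percentage, never going below 1.
constexpr int ScaleDimension(int pixels, int percent) {
    return pixels * percent / 100 > 0 ? pixels * percent / 100 : 1;
}

// Scale both dimensions of the viewport by the same percentage.
TargetSize ScaleRenderTarget(TargetSize viewport, int percent) {
    return {ScaleDimension(viewport.width, percent),
            ScaleDimension(viewport.height, percent)};
}
```

A 1280 x 1024 viewport at 80 percent gives a 1024 x 819 target, which is 838656 pixels instead of 1310720.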

Others may scoff but I've found the ID1 demos to be good representations of different parts of the renderer that stress the engine in different ways. demo1 is the particle (and therefore fillrate) monster, demo3 is the dynamic lighting monster, and demo2 gives the underwater warp code a good working out.

Something like bigass1 (170 FPS, by the way) may be quite brutal, but that's just stressing everything. No useful information there. For testing purposes you need to be able to home in on more specific items, and particularly demo1 and demo3 are really useful for identifying performance in two parts of the renderer that every engine is going to need to tackle and optimize.