Thursday, December 14, 2006

XeO3: Bug spray....

I've had an interesting evening here; what with fighting with 2 year olds and trying to get them to sleep, and tracking down some elusive bugs, it's been quite eventful. However, I've managed to eliminate 2 nasty bugs, one in the cache system and one on the front end... let me take you through them, or rather how I go about bug hunting... I've discovered over the years that debugging is a real art form, and one not many new folk are very good at - so here we go...

Bug 1. Cache system - okay, I had a fair idea where this one was since I had just rewritten the whole cache, but the problem is that the cache is pretty large, so first things first: chop it down to 1 entry. This means it's gonna hurt as every sprite will get rotated every frame, but I'm debugging so what the hell. After stepping through the code I discover I've taken some short cuts which speed things up, but also require at least 2 cache entries - okay, 2's still not bad, so we set that. Next, we have to verify that the cache is actually intact and looking correct. With 140 odd entries this would be pretty horrible to do, but with 2 it's a doddle. Turns out not all is well in the state of cache land. So I set about rewriting the cache initialisation stuff. This is a bit of a mind bender - again because I've decided to improve things.

You see, being a coder I count from 0 up, which means my indexes also start at 0. But the problem is, I also want to use 0 to specify "no link" (like a NULL in C), so I increment every entry by 1 so that 0 is 1, 1 is 2 and so on.... this is pretty easy, but then it means that all the table accesses have to have a -1 with them to offset this.

Anyway, after rebuilding the linked list (using indexes this time), I fix up the main code that references them to use -1's, fix a couple of $FF index issues and we're all set. Cache bug appears to be fixed.
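For anyone who wants to see the "0 means no link" trick spelled out, here's a minimal sketch in C rather than 6502. The names (CACHE_ENTRIES, next_link, cache_init, cache_count) are made up for illustration and are not the actual XeO3 code; the only assumption is a singly linked list held in a byte table.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_ENTRIES 2   /* shrunk for debugging; hypothetical name */
#define NO_LINK 0         /* 0 means "no link", like NULL */

/* Links are stored as (slot + 1) so that 0 stays free for NO_LINK. */
static uint8_t next_link[CACHE_ENTRIES];
static uint8_t head_link;

/* Chain every slot into one list: slot 0 -> slot 1 -> ... -> NO_LINK. */
void cache_init(void)
{
    for (int slot = 0; slot < CACHE_ENTRIES; slot++) {
        int following = slot + 1;
        next_link[slot] = (following < CACHE_ENTRIES)
                              ? (uint8_t)(following + 1)  /* store slot + 1 */
                              : NO_LINK;
    }
    head_link = 1;  /* link 1 refers to slot 0 */
}

/* Walk the list and count entries; note the -1 on every table access. */
int cache_count(void)
{
    int count = 0;
    uint8_t link = head_link;
    while (link != NO_LINK) {
        assert(link <= CACHE_ENTRIES);  /* catches stray values such as $FF */
        assert(count < CACHE_ENTRIES);  /* catches a list that loops */
        count++;
        link = next_link[link - 1];
    }
    return count;
}
```

With only 2 entries, calling cache_init() and then cache_count() is enough to check the whole list by eye, which is the point of shrinking the cache first.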

Bug 2. I thought this was a really simple one. I'd moved all the graphics around and suddenly the front end font vanished; okay I thought, I'll just have to move something to point to the new graphics... but no. Bugger. Now, whenever you get a bug that you simply don't understand, the trick is to knock out as much code as you can while still having the bug occur. This narrows down the amount of code you have to debug, and also reduces any interactions between modules where any funny issues might be happening. So, I do this.... I remove the ECM mode stuff, force in specific character set addresses, and make damn sure the character set is where I expect - it is, and it's all as expected; but still no font.

I play for a while trying to get something to appear, firstly by altering the PRINT routine so that it only writes WHITE to the colourmap, and then by filling the whole damn screen with WHITE. Ah! There we go! Something... but not a font. Now, I know it's displaying the right character set as the stars are all working, and if I poke around in there with $00's and $FF's I can see it - but still no font.

Okay... what next. If in doubt - try it on a real machine. Mmm... looks the same. Press and hold reset and hey look... the logo changes to the FONT characters as the IRQs go down, so they are there.

Okay... Now I know the graphics are right, let's step through the print routine - which I was assuming worked fine. Now I have a table lookup to translate ASCII into my font using self mod code, and it turns out that this table USED to be on a 256 byte boundary, so I would store the character in the low byte of the address and have the high byte point to the table. However, with the big memory move, it's all changed... and this table no longer sits on a nice 256 byte boundary! The result is that during the lookup, all letters are changing to a space! DOH! So, now I align the table and PING! All working again, so I quickly put in all the bits I'd taken out, and bang! Back to normal.
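To show why the alignment matters, here's a rough simulation in C of that kind of lookup. It is a sketch under assumptions, not the real print routine: the 64K memory array, the addresses $C000 and $C080, the function names and the glyph numbering (A to Z map to 1 to 26, everything else to 0 for a space) are all invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint8_t memory[65536];  /* simulated 64K address space */

/* Hypothetical glyph numbering: 'A'..'Z' -> 1..26, anything else -> 0 (space). */
static uint8_t font_code(int ch)
{
    return (ch >= 'A' && ch <= 'Z') ? (uint8_t)(ch - 'A' + 1) : 0;
}

/* Clear memory and place a 256-byte ASCII->font table at `base`. */
void install_table(uint16_t base)
{
    assert(base <= 0xFF00);  /* table must fit inside the 64K space */
    memset(memory, 0, sizeof memory);
    for (int ch = 0; ch < 256; ch++) {
        memory[base + ch] = font_code(ch);
    }
}

/* The self-mod style lookup: keep the table's high byte and patch the
   character into the low byte. Only correct if `base` is page aligned. */
uint8_t lookup_patched(uint16_t base, uint8_t ch)
{
    uint16_t addr = (uint16_t)((base & 0xFF00) | ch);
    return memory[addr];
}

/* The safe version: add the character to the full base address. */
uint8_t lookup_indexed(uint16_t base, uint8_t ch)
{
    return memory[(uint16_t)(base + ch)];
}
```

With the table at $C000 both lookups return 1 for 'A'. Move it to $C080 and lookup_patched reads from $C041, which is below the table and in this simulation holds 0, so every letter comes back as a space, while lookup_indexed still returns 1.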

So... when debugging, lop out as much as you can; comment out huge sections, #if 0, or whatever! The less code you have to step through, the quicker you'll find that bug - Here endeth the lesson.


Anonymous said...

Next time you run out of memory, remember your ASCII->font table and code. That may require using ugly equates like CHAR_A and non-readable text data at PC, but it removes the need for translation inside the game. Paradroid Redux is full of strings like

GameOver: HEX 40 0A 42 0E 30 18 1F 0E 1B B0


Mike said...

It's okay, it's one of those bugs you only get once, until you put big comments around it :)

I only ever do those big HEX things if there are serious speed-ups to be had.

Anonymous said...

If you just have to find more memory, then runtime conversion is a good candidate to cut. At that point there shouldn't be too many text changes any more, so source readability can be degraded slightly.

Back to debugging related issues... I know it's good practice to put those comments around kludgy routines and data, but I wonder why I forget that when coding. Maybe I trust my memory too much :) I've put IF clauses around tricky parts to halt assembly if a table or piece of code crosses a page boundary, but sometimes that's not enough to catch the most obscure tricks.
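The page-boundary check itself is only a few lines. Here is a minimal sketch in C of the same idea, assuming 256-byte pages and 16-bit addresses; crosses_page and require_same_page are hypothetical names, and an assembler would do this with its own conditional directives at assembly time rather than at run time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the block [start, start + length - 1] spans more than one
   256-byte page. Addresses wrap at 64K, as on an 8-bit machine. */
bool crosses_page(uint16_t start, uint16_t length)
{
    if (length == 0) {
        return false;  /* an empty block cannot cross anything */
    }
    uint16_t last = (uint16_t)(start + length - 1);
    return (start >> 8) != (last >> 8);
}

/* Stop right here if the block is not confined to a single page. */
void require_same_page(uint16_t start, uint16_t length)
{
    assert(!crosses_page(start, length));
}
```

A 256-byte table at $C000 passes the check; the same table at $C080 does not.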

One of the most "interesting" things I've seen was "ld de,$a028; inc d" inside the Z88 operating system. It not only ends up with the correct value in DE, but also clears the Z flag, which is checked much later after it has been pushed/pulled through the stack. Puzzled me for several weeks, that one...