Saturday, June 30, 2007

XeO3: Feeling blue...

Well, since I was in a colour mood, I thought I'd smooth out some of the presentation a little, so I've added a nice little fade on when the game starts and a fade off when you die. It costs around 200-300 bytes in all, but it looks much nicer than the abrupt start/stop I had before. Funnily, the fade on looks much smoother, but I'm not really sure why - not that it really matters I guess; it's probably an optical illusion, as I can't really see what could be different about it.

I'm tempted to do a delayed DMA thing and scroll the panel on from the bottom, but that's probably a little too much when I don't have a lot of memory going spare...

XeO3: Colour....

Mmm... the more I look at the Plus4 version, the more I wish it had colour. If you followed the way Spectrum games work - that is, the background gets colour and ships don't, they just fly through it - then you might be able to pull it off. The biggest problem is moving the colour screen. Unlike the C64, you can't scroll a whole colour screen over 8 game cycles, as I have to double buffer the software sprites (the C64 doesn't need that thanks to hardware sprites), but...... if you could afford to lose another 4k (ouch!) then it might JUST be possible...

Basically you have 4 screens. 2 are active at any one time over 8 game cycles and the other 2 are hidden. Character rendering goes on as normal, but the hidden 2 screens get the colour slowly moved (or rather copied!) from screens 1&2 to 3&4 - but 1 character further on. It still requires you to shift around 5 lines per frame, but that's a far cry from blitting a whole 21 line high screen each game cycle as I do now. However, if you did that (and you might have to drop a sprite to fit it in), you could get XeO3 - with colour!

It's way too late for me to add that now, but once the source is released, it'd be cool if someone else did it. 9 software sprites is still loads! So drop 1 sprite, free up 4K+ of space for colour tiles, and it should work!

Okay... More minor optimisations. I've removed a couple of functions and inlined them whilst making them faster, and finally removed the old, slow PATH allocator. It was still using the simple loop rather than the stack method, so that's now changed. I got fed up writing that bloody thing, so I've macro-ized it even more, and here it is.


;****************************************************************************************
;
; Allocate/Free a Turret.
; Out: X=spare slot or -1 for error
;
; Best 29 - OLD
; Worst 134 - OLD
;
; Alloc - BEST/WORST - 12/18
; Free - Best/Worst - 24 (constant)
;
;
; Usage:- FastAlloc [xy], ListAddress, ObjectOnOff
; X or Y = index register to use
; ListAddress = Base of pre-filled stack
; ObjectOnOff = Value to clear/set when allocated/free'd
;
;****************************************************************************************
FastAlloc macro
        ld\0 \1FreeIndex        ; 3 ldx FreeIndex
        lda \1FreeList,\0       ; 4 lda FreeList,x
        bmi !NoneFree           ; 2 bmi !NoneFree
        in\0                    ; 2 inx
        st\0 \1FreeIndex        ; 3 stx FreeIndex
!NoneFree:
        ta\0                    ; 2 tax - get next free slot (or $ff for none left)
        endm

;****************************************************************************************
;
; Usage:- FastFree [xy], ListAddress, ObjectOnOff
; X or Y = index register to use (and hold the object to free)
; ListAddress = Base of pre-filled stack
; ObjectOnOff = Value to clear/set when allocated/free'd
;
;****************************************************************************************
FastFree macro
        t\0a                    ; 2 txa
        ld\0 \1FreeIndex        ; 3 ldx FreeIndex
        de\0                    ; 2 dex
        st\0 \1FreeIndex        ; 3 stx FreeIndex
        sta \1FreeList,\0       ; 5 sta FreeList,x
        ta\0                    ; 2 tax - restore index
        lda #0                  ; 2 lda #0
        sta \2,\0               ; 5 sta BullInUse,x
        endm



And this is how I use it - it's pretty simple, although I do have to replicate a branch which technically speaking I could do without. However, if it comes down to the game not working because of 2 cycles wasted here.... I'll kill myself!

        FastAlloc x,Path,PathsInUse 

These are obviously SNASM macros. SNASM has a really cool macro feature in that it lets you build new opcodes, labels or anything else using the parameters. This means I can decide later what register to use, and prepend a name to get a new label - cool eh!

Oh! And for the record.... the micro-optimisations I've been doing are generally called peephole optimisations by compilers. That is where they look at a few instructions at a time and try to make them better - I'm pretty much doing the same.
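For anyone who finds the macros hard to read, here is a rough Python model of the same stack-style free list. It is only a sketch of the idea, not the XeO3 source: the class and field names are made up, and it assumes the list is pre-filled with the slot numbers followed by a $FF terminator, as the "or $ff for none left" comment suggests.

```python
NONE_FREE = 0xFF  # assumed terminator stored after the last real slot number


class SlotPool:
    """Python model of a stack-style free list (names are illustrative)."""

    def __init__(self, slots):
        # Pre-filled stack: 0, 1, ..., slots-1, then the $FF terminator.
        self.free_list = list(range(slots)) + [NONE_FREE]
        self.free_index = 0            # points at the next free entry
        self.in_use = [0] * slots      # stands in for the ObjectOnOff table

    def alloc(self):
        slot = self.free_list[self.free_index]
        if slot & 0x80:                # like BMI: top bit set means none left
            return NONE_FREE
        self.free_index += 1           # pop the entry off the stack
        return slot                    # the caller marks it in use, as in the macro

    def free(self, slot):
        self.free_index -= 1           # push the slot back onto the stack
        self.free_list[self.free_index] = slot
        self.in_use[slot] = 0          # mark the object as off


pool = SlotPool(3)
a, b, c = pool.alloc(), pool.alloc(), pool.alloc()
exhausted = pool.alloc()      # nothing left, so this is NONE_FREE
pool.free(b)
reused = pool.alloc()         # the most recently freed slot comes back first
```

Both operations take the same few steps however many slots exist, which is where the constant timings in the macro header come from; the old loop had to search for a free entry.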

XeO3: More micro optimisations....

I've spent some more time relaxing and scanning the code looking for the odd cycle here and there, and I've managed to smooth out some of the path code a little. I'd noticed that just before calling the actual command I set Y to 0 so that it pointed back to the start of the command (and not the index into the jump table), however the first thing all the commands did was to skip the first byte so they could read the parameters (using an INY). This meant a wasted 2 cycles in every function. Not a lot... but with 10 scripts running that's 20 cycles saved, and with around 20 functions it's around 20 bytes as well - and the code's a little cleaner.

I also looked at the Misc part - this is the messy bit that does all the grunt work of getting baddie numbers into sprites etc, and I've streamlined that a little too, as it had duplicate checks and the average case wasn't as optimal as it could be.

All in all - good fun!

Thursday, June 28, 2007

XeO3: Scripts...

I've decided to release what little docs I have for the scripting system to give you a better idea of what it's doing. This isn't finished by any means, but I started it to help Luca play with some level editing. It's actually a reference document and a few tutorials. Anyway - enjoy (even if you can't type them in and play with it just yet!!)

HERE it is!

XeO3: More and more.

Okay - I've changed the little bits of the script engine anyway... damn you all!! Making me feel guilty! bah! But I guess a scanline saved is still a scanline saved, so... *sigh* - you're all bastards.... the lot of you! :)

I'm also busy changing my code from using #<Address to #Lo(Address) so that I can remove the shortcut "<". This will let me add to my assembler and do 65816 and do a full 24bit address range, which will in turn allow me to do SuperCPU stuff someday.

If anyone uses SNASM, then you might want to start changing over now, as the assembler supports both at the moment. Future versions will only support the Lo() and Hi() operators - but it's all in a good cause!
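As a quick illustration of what the operators pick out, here is a small Python sketch. It assumes Lo() and Hi() return the low and high bytes of an address, as `<` and `>` traditionally do in 6502 assemblers; the bank() helper is purely hypothetical and only shows the extra byte a 24-bit 65816 address would need.

```python
def lo(address):
    """Low byte of an address, the value #Lo(Address) is assumed to give."""
    return address & 0xFF


def hi(address):
    """High byte of a 16-bit address, the value #Hi(Address) is assumed to give."""
    return (address >> 8) & 0xFF


def bank(address):
    """Hypothetical third operator: the bank byte of a 24-bit address."""
    return (address >> 16) & 0xFF


screen = 0x0C00                  # an arbitrary example address
pieces = (lo(screen), hi(screen))
```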

I find it oddly relaxing doing minor code changes: moving this, changing that, while not doing any real work. It's all just cleaning up and making it better. This is probably because you don't really get the time to do it as a professional developer; you code as best you can as quickly as you can, and rarely do you get to go back and polish the code just for the sake of it.

XeO3: More scripting.

Okay, rather than reply to the comments, I'll write a bit more about it here. The actual loop for processing scripts is pretty small, and really easy to follow - so here it is....


;************************************************************
; *
; Name : ProcessPaths *
; Function: Loop through all our paths and move the *
; objects associated with them. This includes *
; the master "SPAWN" path. *
; *
;************************************************************
ProcessPaths                            ; must start at 0 and go up
        ldx #0                          ; This means paths being started by the MASTER PATH
ProcessNextPath                         ; are not delayed a game cycle

        lda PathsInUse,x                ; path in use? Keep it a byte to make it quick....
        beq PathNotActive

        lda PathAddressLo,x             ; Get baddie current path address/location
        sta PathAddress
        lda PathAddressHi,x
        sta PathAddress+1


;
; Jump back to here to execute another command on the same sprite!
; "some" commands need to be free, while others will take a game tick.
; Changing object attribute or animation are "FREE", while movement is 1 tick.
;
DoNextCommand
        ldy #0                          ; reset index into command
        lda (PathAddress),y             ; get command
        asl a                           ; *2 to index table
        tay
        lda PathJumpTable,y
        sta DoJumpHere+1
        lda PathJumpTable+1,y
        sta DoJumpHere+2

        ldy #0                          ; point to first byte again
DoJumpHere
        jmp $0101


;
; once we've finished processing THIS sprite, jump to here
; we can then animate and do any "special" checking that needs to be done.
;
DoNextPath
        jsr DoMisc                      ; 1 scanline wasted doing jsr/ret's.... could make it quicker... or inline it.

        lda PathAddress                 ; Save the path location back into the sprite system
        sta PathAddressLo,x
        lda PathAddress+1
        sta PathAddressHi,x

        inx
        cpx #MaxPaths
        bne ProcessNextPath
ExitPathSystem
        rts

PathNotActive:
        cpx #0
        beq SkipNuke
        sta SY,x                        ; if path not active - nuke Sprite coordinate!
SkipNuke:
        inx
        cpx #MaxPaths
        bne ProcessNextPath
        rts



There's a couple of places where I could save a little, and I may later on, but I only really do that when it's not going to change much; updating highly optimised code is next to impossible.
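To make the "free" versus "1 tick" idea a bit more concrete, here is a loose Python model of the loop. The two commands, their opcodes and their parameters are invented for the sketch (the real command set isn't listed here); only the shape of the dispatch follows the listing: look the command up in a jump table, run it, and either carry on with the next command or finish with this path for the tick.

```python
# Hypothetical command set: the real opcodes and parameters are not shown here.
CMD_SET_ANIM = 0   # "free": changes state, then the next command runs at once
CMD_MOVE = 1       # costs a tick: moves the object, then ends this path's turn


def cmd_set_anim(path, script):
    path["anim"] = script[path["pc"] + 1]
    path["pc"] += 2
    return True            # keep going, like jumping back to DoNextCommand


def cmd_move(path, script):
    path["x"] += script[path["pc"] + 1]
    path["pc"] += 2
    return False           # done for this tick, like jumping to DoNextPath


JUMP_TABLE = [cmd_set_anim, cmd_move]


def process_paths(paths, script):
    for path in paths:                     # start at 0 and go up
        if not path["in_use"]:
            continue
        # Run commands until one of them uses up the game tick.
        while JUMP_TABLE[script[path["pc"]]](path, script):
            pass


script = [CMD_SET_ANIM, 3, CMD_MOVE, 2, CMD_MOVE, 5]
paths = [
    {"in_use": True, "pc": 0, "x": 10, "anim": 0},
    {"in_use": False, "pc": 0, "x": 0, "anim": 0},
]
process_paths(paths, script)   # one game tick: set the animation, then move once
```

The 6502 version patches the jump address into the code rather than calling through a table of functions, but the control flow is the same.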

DoMisc does things like copy the baddie coords over to the sprite system (the sprites are independent and not tied into the paths, since I want to be able to take one and not the other to another game). DoMisc also does things like check for "Kill when clipped" options, and updates sprite centers so that collision routines don't have to keep doing that. It also deals with baddie animation, getting the address of sprite shapes etc.

As for the Circle routine - that's pretty big... Aside from working out the next set of offsets, it's just a MoveABS style function with a counter, so I'll show you the actual GetOffsets function. There are 2 phases to the circle command - the 1st is a reverse calculation that works out the center for you, allowing you to specify the radius of an arc you are already on - this is a huge bonus, and all that I do is work out 180 degrees further on, and reverse the add/subtract to the deltas the 1st time in.


;*****************************************************************************
; workout offsets using path Radius and angle values.
; Store in zero page "CircleOff?" variables
;
; We use an inline macro for the xply as 2*9 cycles * 9 sprites=162 cycles!
; (or a couple of scanlines), and its only 66 bytes each...
;
;*****************************************************************************
GetOffsets
;
; Do RadiusY*sin(AngleY)
;
        ldy PathAngleY,x        ; get angle
        lda SineWave,y          ; get "sin(angle)"
        sta xplyd               ; store
        lda PathRadY,x
        sta xplyc
        XPLY_M                  ; Use inline macro to avoid the 9 cycles for call/ret.
        sta CircleOffY

        ldy PathAngleY,x        ; angle >180 deg? if so then NEGATE value
        bpl !NotNegY
        eor #$ff
        clc
        adc #1
        sta CircleOffY
!NotNegY:

;
; Do RadiusX*sin(AngleX).
; (angle has been offset 90 degrees (64 bytes) to account for COS->SIN translation)
;
        lda #0
        sta CircleOffX+1        ; high byte of X offset

        ldy PathAngleX,x        ; get angle
        lda SineWave,y          ; get "sin(angle)"
        sta xplyd               ; store
        lda PathRadX,x
        sta xplyc
        XPLY_M                  ; Use inline macro to avoid the 9 cycles for call/ret.
        sta CircleOffX

        ldy PathAngleX,x        ; angle >180 deg? if so then NEGATE value
        bpl !NotNegX
        eor #$ff
        clc                     ; NEG a
        adc #1
        sta CircleOffX
        bcs !NotNegX
        dec CircleOffX+1        ; take high byte from 0 to $ff
!NotNegX:
        rts




I haven't really looked into optimising this that much yet, as most of the time is spent doing the multiplies, and they are as fast as they can be.

So this function returns a set of signed deltas from the circle origin, and the main function adds them onto the center and sets the sprite's location. It's a pretty expensive function, but all the smooth wavy paths in XeO3 are done with this, and it also saves huge amounts of SCRIPTing memory as one command can control a sprite for a long time. The last part is to increment the current angle, allowing the baddie to move in an arc (or more, depending on the radius of X and Y).

The scripting also allows for what I call drift... this means I can drift the sprite on X by a certain amount. A drift of 1 means that if I pause, the sprite will move with the scrolling background without needing more commands. Now, if you apply drift to a circle, you can have a zero radius on X and wobble it vertically, which allows you to make a simple snake wave. There are lots of tricks you can use with command combinations to get some cool effects.

I did a shoot-em-up on a phone where I allowed the center of a circle to be attached to a parent, which allowed for hierarchical rotation - small orbs spinning around a larger orb - very cool. The plus/4 isn't quite able to handle that I think, but that doesn't mean you can't get some nice paths out of it!

If I've missed anything, or if something's not clear - just ask again! :)

EDIT: Oh - and I don't have 360 degrees, but 256. 128 = 180, 64 = 90 and so on...
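A rough Python model of the offset calculation with 256-step angles might look like the following. Several details are assumptions rather than facts about XeO3: the table is taken to hold |sin| scaled to 0..255, the multiply is taken to keep the high byte of an 8x8 product, and a single angle is passed in with the 64-step (90 degree) offset applied for X, where the real code keeps separate X and Y angles.

```python
import math

# Assumed table: 256 entries holding |sin(angle)| scaled to 0..255.
SINE_WAVE = [
    min(255, round(abs(math.sin(2 * math.pi * a / 256)) * 256))
    for a in range(256)
]


def multiply_high(a, b):
    """High byte of an unsigned 8x8 multiply (assumed behaviour of XPLY_M)."""
    return (a * b) >> 8


def signed_offset(radius, angle):
    angle &= 0xFF
    magnitude = multiply_high(radius, SINE_WAVE[angle])
    # Angles of 128 and above (more than 180 degrees) have the top bit set,
    # which is what the BPL test checks before negating.
    return -magnitude if angle & 0x80 else magnitude


def get_offsets(radius_x, radius_y, angle):
    off_y = signed_offset(radius_y, angle)
    off_x = signed_offset(radius_x, angle + 64)   # cos via sin shifted by 64
    return off_x, off_y


quarter_points = [get_offsets(100, 100, a) for a in (0, 64, 128, 192)]
snake = [get_offsets(0, 20, a) for a in range(0, 256, 32)]   # zero X radius
```

With a zero X radius every X offset is 0, so only the vertical wobble is left; adding drift on X then gives the snake wave mentioned above.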

Wednesday, June 27, 2007

XeO3: Scripting Engine.

Okay, after a request for more info on the scripting engine/system I use, I've written up a small description of how it all works. It's not hugely in-depth, but it should give you the gist of how it all works...

So have a look HERE to read through, but feel free to ask questions afterwards. I do intend to write up an API for the script system, and do several examples and tutorials for when I release the code. I've done some small samples and docs so that Luca can play with it, but I really need to do far more.

Monday, June 25, 2007

XeO3: update....

Well, the cache really is working well now. I've been able to strip it down from 137 entries to 100 without any noticeable slowdown. This saves around 2.5k which is great, but I may need the extra space if I want to bring on multiple baddie types at once, so I'm not about to splurge on memory just yet!

But the good news is that we're now at a point where I can FINALLY progress to the weapons system!! Everything appears to be running, and is stable, so that's going to be the next task!

Once I've done that I can look to making a new demo - PLAYABLE this time, albeit looking like mince. We don't want to give too much away on how it looks/feels, so we're going to remove most of the backgrounds (make them solid blocks) and make the sprites very simple too. This will allow me to craft the difficulty a little better to users' abilities and not make it too hard, which was the chief complaint with Blood Money (apparently). So while the game's not really close to being finished, you do at least have something in the short term to look forward to!

XeO3: Sprite Cache....

Mmm... Now it gets interesting! Now that the cache routine is working properly, I can finally visualise it and see how much is used, and how it behaves... It's showing that it's actually got a little room to spare - which is great as it means I can always steal some if I need to. However, that little function below that makes the cache work takes around 50+ cycles per sprite per game cycle in normal usage (that is, when the entry isn't already at the end or whatever...). That's 10 scanlines at least! Every game cycle! (when all sprites are active)

So once again... it's down to a balancing act - do we use the cache and perhaps get a couple of K back... or do we ditch it and gain 10 scanlines? Of course there's also an additional penalty of having to rotate the player ship every now and then if we don't use the cache properly, and it could come at any time, whereas the cache is reasonably predictable.

It's never easy..... I suspect the cache will stay, as 10 scanlines is less than rotating 1 sprite, and if I don't overload the screen with NEW sprites, I can be sure that it won't rotate + use the full 10 scanlines at any one time. I'll have a think.... perhaps there's an even quicker way of doing this....

XeO3: Internet development to the rescue!!

Well, TNT to the rescue! He spotted the problem and it's now working a treat. Here's the final function, and you can see how all the suggestions have been added and have sped things up.


;
; Y = rotation index
; Temp+16 = pointer to the raw sprite info with an
; 8 byte cache index table
;
        lda (Temp+16),y
        tay
        ldx Cache_Next-1,y
        beq Skip_MoveToEnd
        lda Cache_Prev-1,y
        sta Cache_Prev-1,x

        bne !We_are_Not_First   ; STA above does not change flags
        stx FirstCache          ; X still holds Cache_Next-1
        beq !SkipNotFirst

!We_are_Not_First:
        tax                     ; NOW get prev
        lda Cache_Next-1,y
        sta Cache_Next-1,x
        lda #0                  ; Only need to clear here

!SkipNotFirst:
        sta Cache_Next-1,y
        lax LastCache
        sta Cache_Prev-1,y      ; *BUG HERE* -1 added.
        tya
        sta Cache_Next-1,x
        sta LastCache
Skip_MoveToEnd:


And there you go! Believe it or not, this tiny bit of code is the whole reason XeO3 got started!! I wrote the whole sprite routine to try THIS bit of code out! Stupid really... but there ya go!

Many thanks guys!
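For reference, the same move-to-end operation can be sketched in Python. This is a model of the technique rather than a translation of the routine: the class name and methods are invented, and instead of subtracting 1 from the table addresses it simply leaves entry 0 of each list unused so that 0 can mean "no link".

```python
class CacheList:
    """Doubly linked list with 1-based indices, where 0 marks the end."""

    def __init__(self, blocks):
        self.next = [0] * (blocks + 1)     # entry 0 is never used as a block
        self.prev = [0] * (blocks + 1)
        for i in range(1, blocks + 1):
            self.prev[i] = i - 1
            self.next[i] = i + 1 if i < blocks else 0
        self.first = 1
        self.last = blocks

    def move_to_end(self, i):
        nxt = self.next[i]
        if nxt == 0:                       # already last, nothing to do
            return
        prv = self.prev[i]
        self.prev[nxt] = prv               # unlink from the NEXT item
        if prv == 0:
            self.first = nxt               # we were first, so NEXT is now first
        else:
            self.next[prv] = nxt           # unlink from the PREV item
        self.next[i] = 0                   # relink at the end of the list
        self.prev[i] = self.last
        self.next[self.last] = i
        self.last = i

    def order(self):
        out, i = [], self.first
        while i:
            out.append(i)
            i = self.next[i]
        return out


cache = CacheList(4)
cache.move_to_end(2)       # a block in the middle gets used
cache.move_to_end(1)       # then the first block gets used
result = cache.order()
```

Blocks that keep getting used stay near the end, so the least recently used ones collect at the front, ready to be recycled.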

Sunday, June 24, 2007

XeO3: Bloody bugs!!!

I've been writing the cache code, and it's simple enough, but there's a bug in it, and I can't for the life of me find it!! So I thought I'd post it here to see if anyone's got any ideas. I suspect I've just been looking at it too much; not that that helps me of course! Anyway, it's a simple linked list, but the indexes are all +1 so that 0 specifies the end of the list. To offset this I simply subtract 1 from the address of the links, which allows normal use again. Anyway, here's the code....


;
; Y = rotation index
; Temp+16 = pointer to the raw sprite info with an
; 8 byte cache index table
;
        lda (Temp+16),y         ; get cache index using the current rotation index
        tay                     ; into the sprite cache index list
        ldx Cache_Next-1,y      ; If we are last, then do nothing!!!
        beq Skip_MoveToEnd
        lda Cache_Prev-1,y      ; get the PREV from our current cache block
        sta Cache_Prev-1,x      ; and store in the NEXT's Prev

; We've now unlinked from the NEXT item, unlink from PREV item

        tax                     ; X = our PREV (0 means we are first)
        beq !We_are_First
        lda Cache_Next-1,y
        sta Cache_Next-1,x
        bne !SkipFirst

!We_are_First:
        ldx Cache_Next-1,y
        stx FirstCache
        lda #$00
        sta Cache_Prev-1,x

!SkipFirst:
; Now link the cache block to the end
        lda #0
        sta Cache_Next-1,y      ; we are about to become last, so clear our NEXT
        lax LastCache
        sta Cache_Prev,y        ; Set the LAST entry as our prev.
        tya
        sta Cache_Next-1,x      ; and set the last entry's NEXT as our new item
        sta LastCache           ; now set the NEW last as the new item.
Skip_MoveToEnd:


So.... There you go! If you spot anything, let me know - while I've still got hair!

Saturday, June 23, 2007

XeO3: Microcoding

I've spent a very pleasant day slowly sifting through the sprite source and doing some micro optimisations; that is, removing unneeded instructions and the odd cycle here and there. It'll have no huge impact, but it might help compensate for when I do the cache linking every frame. The idea (as I've said before) is that when a frame from the cache is used, I remove it from the linked list and add it to the bottom again (the end). This keeps used ones in the list, and unused ones drift to the top.

So, now I've done this, I'll go and put the code in to actually use the cache properly and see how it all holds together.

Friday, June 22, 2007

XeO3: Working Lunch!

The great thing about owning my own laptop is I can put the code onto it and bring it into work so I can play with it at lunchtime - fab! So, I've made the change whereby I no longer have a bitmask marking the rotations that have been cached; I now just use the cache index value directly, and if it's 0, then it's not cached. This works great, and frees up 167 bytes (space for 4 more sprites!) Not that you really see a speed up, but I know it's a little quicker, so that's nice.

The next thing to do is to try and finally make the cache work the way its supposed to! This simply means keeping commonly used rotations in there, and letting ones I don't use much drift to the top of the free list - I hope to do that later tonight.

Thursday, June 21, 2007

XeO3: Sprite time...

So I'm sitting here fixing the damage I did while debugging last night, and it occurs to me that I'm wasting time and memory with my sprite system - again; although not much of either this time.....

I currently store a bitmask to tell me if a rotation is cached, and an index which points to it - like this


        db BitMask              ; 1 bit per rotation. In MCM mode, every other bit is unused.
        db Rotation1_Index
        db Rotation2_Index
        db Rotation3_Index
        db Rotation4_Index
        db Rotation5_Index
        db Rotation6_Index
        db Rotation7_Index
        db Rotation8_Index
        ds 32                   ; Raw Sprite Graphic


Now, before I changed everything to indexes, the bitmask made sense, as it was quicker to check than the address. But now.... All I need to do is load the rotation index data and see if it's 0. This will save 167 bytes, and all the time I spend masking the rotation flags. It will also help me speed up the cache itself - a little.
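The check without the bitmask boils down to something like this Python sketch. The function names are made up, and it assumes cache indexes start at 1 so that 0 can stand for "not cached"; the record is the layout above minus the BitMask byte.

```python
ROTATIONS = 8          # one cache index byte per rotation
GRAPHIC_BYTES = 32     # raw sprite graphic, as in the layout above


def make_sprite_record():
    """A sprite record without the bitmask: 8 index bytes plus the raw graphic."""
    return bytearray(ROTATIONS + GRAPHIC_BYTES)


def cached_block(record, rotation):
    """Return the cache index for a rotation, or None if it isn't cached."""
    index = record[rotation]
    return index if index != 0 else None


record = make_sprite_record()
before = cached_block(record, 3)   # nothing cached yet
record[3] = 5                      # pretend rotation 3 now lives in cache block 5
after = cached_block(record, 3)
```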

Classic retro music on a HUGE scale....


This has to be seen (and heard) to be believed.... Not only does it look pretty cool, but it's actually playing music! Click HERE to see it... There's a tune in there I recognise, but can't place.... yet.....

While we're at it... here's another scary image....

Edit: TETRIS!!! That's it!!!

Wednesday, June 20, 2007

XeO3: Success!!!

At last!!! Found the little ******!!!!

After a really painful debug session with Yape (he really needs to make his debugger more usable!!!) I finally tracked the problem down to.......... the player ship sprite! Yep... nothing to do with the cache at all - well, not really. It was the sprite rendering that fried the memory, but that was only because the player ship pointers had been nuked! Funny.... I'm sure this was the same bad bug as last time... I wonder if the fix got lost; I'll have to check my source control logs... Anyway, the fix was just 3 lines!

        lda PlayerAnims
        ldx #0
        jsr SetSpriteShape

So... there we go... Crappy night's work, but worth it in the end. I'm thinking about releasing the current Minus4, as its debugger is much nicer than anything else out there, and really helps.

Edit: Yep...it appears I only half fixed this the last time. I was drawing a bad sprite last time, and now it just killed it dead! Oh well...all fixed.

XeO3: Problems....problems................problems

Still no advance on this graphics corruption and crash; I spent all of last night looking through code and playing with Minus4 to try and replicate the bug there, all with no luck. I still think it's to do with the new sprite cache initialisation code because if I run it once only, it seems to be fine. Problem is, the code looks okay to me, which means I'll have to step through all 137 sprite slots being initialised.

You know, in a way, this restores my faith in development. When I started this, I had come to the conclusion that either I'd just gotten way better than when I was doing this stuff professionally, or things were just not that hard now. But this shows that really nasty bugs can still rear their ugly heads, and are still a fag to track down. That's not to say that it's not easier overall - if I hadn't improved in the 18 years I've been doing this for a living, I'd be really worried!

Monday, June 18, 2007

XeO3: Damn bugs.......

I'm still trying to find this bug, and without much luck. I thought I'd found it earlier, but no joy. I'm fairly sure it's something to do with the new cache, but I just can't track it down. If I could get it to do the same in Minus4, I could debug it properly, but it runs okay there; that leads me to think it's hitting an undocumented opcode that Yape and a real Plus4 won't do, but that Minus4 does do. When adding the opcodes, I just added them all.... I didn't really stop to think that they might not work on the actual machine. I hope to build a new version of Minus4 tomorrow without the undocumented ops - or rather without all the extra ones XeO3 doesn't use.

Sunday, June 17, 2007

RetroEdit: reliving the past....

I was busy building a shed today, so I didn't get much done, but I did have a look through the source of RetroEdit and was thinking that it needs cleaned up a bit. I started to write this right back when I was only just starting to use C#, and it shows. I currently use C# day-to-day at work, so I know a lot more about it, and I can make it much nicer inside. So I may try and clean up the code before extending it in the near future - like putting editing in for example.....

Saturday, June 16, 2007

XeO3: damn.....damn........................damn....

Just proves you should never say "finished" on anything.... After assuming I'd fixed all the bugs, a real topper comes along and kicks me on my arse! After a couple of restarts (and this varies!) it looks like there's MAJOR corruption going on, and it's killing all the graphics. The game itself looks like it's playing away fine, but the graphics (sprites, backgrounds and everything else) look like they've been toasted by something. I can't get this to happen in Minus4, only in Yape and on a real machine, so it looks like it'll be a bugger to find.

Damn.....damn..................................damn.

XeO3: Whoops! -*Blush*-

See.... Now I just feel really embarrassed! The reason it only shows up when I have 2 or 3 (or, as it turns out, fewer than 8) cache entries is because the INIT loop uses the number of cache blocks to clear the array! Here's the start of the init() call..


InitCache:
;
; First wipe cache data
;
        ldx #0
        txa
!Lp1    sta Cache_GraphicLo,x
        sta Cache_GraphicHi,x
        sta Cache_Index,x
        sta Cache_8Bytes,x      ; clear first 8 bytes (and more) of cache
        inx
        cpx #CacheBlocks        ; number of cache entries to init
        bne !Lp1

Now.... those Cache_8Bytes are really important; that's how I scroll sprites up and down cleanly without having to store/cache extra space for vertical movement. Because sprites are stored in characters, you have to rotate horizontally and, in theory, vertically, so that you can move the sprite around the screen.

However.... if you store the sprites in columns (which is faster anyway), you can then store 8 bytes before the sprite and just move the start address backwards a byte to move the sprite down a pixel. This means you only ever copy in 3x2 characters, and the rest are blank. Each column moves up 1 byte and gets the space from the previous column (a column is 24 pixels high, but the lower 8 bytes are always blank).

Now the next sprite in the cache doesn't have these 8 spare bytes in front; it does however have the previous sprite's last empty character - so it uses that!
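The column trick is easier to see with some numbers, so here is a small Python sketch of it. The constants follow the description (8 spare bytes in front, 24-byte columns whose lower 8 bytes are blank), while the function names and the assumption that the image itself is 16 rows high are only there for illustration.

```python
BLANK = 8          # spare blank bytes in front of the first column
COLUMN = 24        # bytes per column: the image rows plus 8 blank rows
IMAGE_ROWS = 16    # assumed image height, leaving the lower 8 bytes blank


def build_sprite(columns):
    """Lay out columns one after another, each followed by its blank rows."""
    data = [0] * BLANK
    for col in columns:
        data += list(col) + [0] * (COLUMN - IMAGE_ROWS)
    return data


def read_column(data, column, dy):
    """Fetch the 24 bytes for one column, moved down by dy pixels (0..8)."""
    start = BLANK + column * COLUMN - dy   # just move the start address back
    return data[start:start + COLUMN]


sprite = build_sprite([range(1, 17), range(17, 33)])
unshifted = read_column(sprite, 0, 0)
shifted = read_column(sprite, 1, 3)        # second column, 3 pixels down
```

For the second column the blank rows in front come from the tail of the first column, which is the same reason the next sprite in the cache can borrow the previous sprite's last empty character.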

So... anyway... it turns out the bug was that the init wasn't clearing the whole of the 8 byte cache space, simply because I had lowered the number of cache blocks - DOH! I could "fix" it so that it will cope with caches lower than 8, but that's virtually useless anyway, and this way uses less memory - so as long as I now know WHY it's doing it, I'm happy.

XeO3: Bug time..

So... I finally found some time and energy to boot up the project again, and lo and behold - another bug. If you look at the image you'll see the little white lines above some of the sprites. This, I think, is due to the first or last sprite not being set up correctly, but I'm having a little trouble pinning down why. This is running with only 3 cache entries, which is pretty cool really as it means it's using virtually no memory (saving 10k!), but it does slow down a lot when the screen gets busy and the cache is emptied on a regular basis.

Once I fix this, I think that'll be all the bugs out of it, and I can once again start on the weapons system. If you recall, I did start on that before, but it went slightly pear-shaped, and I had no idea why, so I had to undo it all and hope to start again fresh later.

Anyways... I'm off to try and find this one. It's been around for a while, but with 137 cache slots it's hard to spot - at least with only 2 or 3, it's right there in front of me.