Well, since I was in a colour mood, I thought I'd smooth out some of the presentation a little, so I've added a nice little fade on when the game starts and a fade off when you die. It costs around 200-300 bytes in all, but it looks much nicer than the abrupt start/stop I had before. Funnily, the fade on looks much smoother, but I'm not really sure why - not that it really matters I guess; it's probably an optical illusion, as I can't really see what could be different about it.
I'm tempted to do a delayed DMA thing and scroll the panel on from the bottom, but that's probably a little too much when I don't have a lot of memory going spare...
Saturday, June 30, 2007
XeO3: Colour....
Mmm... the more I look at the Plus4 version, the more I wish it had colour. If you followed the way Spectrum games work - that is, the background gets colour and ships don't, they just fly through it - then you might be able to pull it off. The biggest problem is moving the colour screen. Unlike the C64, you can't scroll a whole colour screen over 8 game cycles as I have to double buffer the software sprites (the C64 doesn't need that due to hardware sprites), but...... if you could afford to lose another 4k (ouch!) then it might JUST be possible...
Basically you have 4 screens. 2 are active at any one time over 8 game cycles and the other 2 are hidden. Normal character rendering goes on as normal, but the hidden 2 screens get the colour slowly moved (or rather copied!) from screens 1&2 to 3&4 - but 1 character further on. It still requires you to shift around 5 lines per frame, but that's a far cry from blitting a whole 21 line high screen each game cycle as I do now. However, if you did that (and you might have to drop a sprite to fit it in), you could get XeO3 - with colour!
It's way too late for me to add that now, but once the source is released, it would be cool if someone else did it. 9 software sprites is still loads! So drop 1 sprite, free 4K+ of space for colour tiles, and it should work!
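As a rough check on the "around 5 lines" figure, here is the arithmetic as a tiny Python sketch. It assumes (this isn't taken from the XeO3 source) that both hidden screens need all 21 rows of colour copied, and that the work is spread evenly over the 8 game cycles of one scroll step.

```python
# Assumptions (not from the original code): both hidden screens need all
# 21 character rows of colour copied before they become visible, and the
# copy is spread evenly over the 8 game cycles of one scroll step.
SCREEN_ROWS = 21
HIDDEN_SCREENS = 2
GAME_CYCLES_PER_SCROLL = 8

rows_to_copy = SCREEN_ROWS * HIDDEN_SCREENS
rows_per_cycle = rows_to_copy / GAME_CYCLES_PER_SCROLL
print(rows_per_cycle)
```

That comes out a little over 5 rows per game cycle, against the 21 rows per game cycle of a full blit.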
Okay...More minor optimisations. I've removed a couple of functions and inlined them whilst making them faster, and finally removed the old, slow PATH allocator. It was still using the simple loop rather than the stack method, so that's now changed. I got fed up writing that bloody thing, so I've macro-ized it even more, and here it is.
;****************************************************************************************
;
; Allocate/Free a Turret.
; Out: X=spare slot or -1 for error
;
; Best 29 - OLD
; Worst 134 - OLD
;
; Alloc - BEST/WORST - 12/18
; Free - Best/Worst - 24 (constant)
;
;
; Usage:- FastAlloc [xy], ListAddress, ObjectOnOff
; X or Y = index register to use
; ListAddress = Base of pre-filled stack
; ObjectOnOff = Value to clear/set when allocated/free'd
;
;****************************************************************************************
FastAlloc macro
ld\0 \1FreeIndex ; 3 ldx FreeIndex
lda \1FreeList,\0 ; 4 lda FreeList,x
bmi !NoneFree ; 2 bmi !NoneFree
in\0 ; 2 inx
st\0 \1FreeIndex ; 3 stx FreeIndex
!NoneFree:
ta\0 ; 2 tax - X = next free slot (or $ff for none left)
endm
;****************************************************************************************
;
; Usage:- FastFree [xy], ListAddress, ObjectOnOff
; X or Y = index register to use (and hold the object to free)
; ListAddress = Base of pre-filled stack
; ObjectOnOff = Value to clear/set when allocated/free'd
;
;****************************************************************************************
FastFree macro
t\0a ; 2 txa
ld\0 \1FreeIndex ; 3 ldx FreeIndex
de\0 ; 2 dex
st\0 \1FreeIndex ; 3 stx FreeIndex
sta \1FreeList,\0 ; 5 sta FreeList,x
ta\0 ; 2 tax - restore index
lda #0 ; 2 lda #0
sta \2,\0 ; 5 sta BullInUse,x
endm
And this is how I use it - it's pretty simple, although I do have to replicate a branch which, technically speaking, I could do without. However, if it comes down to the game not working because of 2 cycles wasted here....I'll kill myself!
FastAlloc x,Path,PathsInUse
These are obviously SNASM macros. SNASM has a really cool macro feature in that it lets you build new opcodes, labels or anything, using the parameters. This means I can decide later what register to use, and prepend a name and have a new label - cool eh!
Oh! And for the record....the micro-optimisations I've been doing are generally called peep-hole optimisations by compilers. That is where they look at a few instructions at a time and try to make them better - I'm pretty much doing the same.
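For anyone who doesn't read 6502, the stack method used by the FastAlloc/FastFree macros above can be modelled in a few lines of Python. This is only a sketch: the class and method names are made up for illustration, and it assumes the free list is pre-filled with slot numbers and ends with a $ff marker, as the macro comments suggest.

```python
class SlotStack:
    """Python model of a pre-filled free-list stack (names are hypothetical)."""

    NONE_FREE = 0xFF  # negative on the 6502, so BMI catches it

    def __init__(self, count):
        # Pre-filled stack of slot numbers, terminated by the $ff marker.
        self.free_list = list(range(count)) + [self.NONE_FREE]
        self.free_index = 0            # next free entry in the stack
        self.in_use = [0] * count      # the "ObjectOnOff" table

    def alloc(self):
        slot = self.free_list[self.free_index]
        if slot & 0x80:                # hit the terminator: nothing left
            return self.NONE_FREE
        self.free_index += 1           # pop the slot off the stack
        return slot                    # caller marks it in use, as with the macro

    def free(self, slot):
        self.free_index -= 1           # push the slot back on the stack
        self.free_list[self.free_index] = slot
        self.in_use[slot] = 0          # clear the in-use flag
```

Both operations take the same time however many slots are in use, which is the whole point compared with looping over the in-use table looking for a gap.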
XeO3: More micro optimisations....
I've spent some more time relaxing and scanning the code looking for the odd cycle here and there, and I've managed to smooth out some of the path code a little. I'd noticed that just before calling the actual command I set Y to 0 so that it pointed back to the start of the command (and not the index into the jump table), however the first thing all the commands did was to skip the first byte so they could read the parameters (using an INY). This meant a wasted 2 cycles in every function. Not a lot... but 10 scripts, and around 20 functions means 20 cycles saved, and around 20 bytes as well - and the code's a little cleaner.
I also looked at the Misc part - this is the messy bit that does all the grunt work of getting baddie numbers into sprites etc, and I've streamlined that a little too as it had duplicate checks and the average case wasn't as optimal as it could be.
All in all - good fun!
Thursday, June 28, 2007
XeO3: Scripts...
I've decided to release what little docs I have for the scripting system to give you a better idea of what it's doing. This isn't finished by any means, but I started it to help Luca play with some level editing. It's actually a reference document and a few tutorials. Anyway - enjoy (even if you can't type them in and play with it just yet!!)
HERE it is!
XeO3: More and more.
Okay - I've changed the little bits of the script engine anyway... damn you all!! Making me feel guilty! bah! But I guess a scanline saved is still a scanline saved so... *sigh* - you're all bastards....the lot of you! :)
I'm also busy changing my code from using #<Address to #Lo(Address) so that I can remove the shortcut "<". This will let me extend my assembler to do 65816 and a full 24bit address range, which will in turn allow me to do SuperCPU stuff someday.
If anyone uses SNASM, then you might want to start changing over now as the assembler supports both at the moment. Future versions will only support the Lo() and Hi() operators - but it's all in a good cause!
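If you've not used these operators before, this is what they boil down to, modelled in Python rather than SNASM. The lo() and hi() functions mirror the Lo()/Hi() operators mentioned above; bank() is a hypothetical name for the third byte a 24bit address would need, and isn't something SNASM is stated to have.

```python
def lo(address):
    """Low byte of an address, what #<Address and #Lo(Address) produce."""
    return address & 0xFF

def hi(address):
    """High byte of a 16-bit address (bits 8-15)."""
    return (address >> 8) & 0xFF

def bank(address):
    """Hypothetical third byte (bits 16-23) of a 24-bit address."""
    return (address >> 16) & 0xFF
```

So for an address of $12ABCD the three bytes are $CD, $AB and $12, and for a plain 16bit address the bank is simply 0.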
I find it oddly relaxing doing minor code changes, moving this, changing that, while not doing any real work; it's all just cleaning up and making it better. This is probably because you don't really get the time to do it as a professional developer: you code as best you can as quickly as you can, and rarely do you get to go back and polish the code just for the sake of it.
XeO3: More scripting.
Okay, rather than reply to the comments, I'll write a bit more about it here. The actual loop for processing scripts is pretty small, and really easy to follow - so here it is....
;************************************************************
; *
; Name : ProcessPaths *
; Function: Loop through all our paths and move the *
; objects associated with them. This includes *
; the master "SPAWN" path. *
; *
;************************************************************
ProcessPaths ; must start at 0 and go up
ldx #0 ; This means paths being started by the MASTER PATH
ProcessNextPath ; are not delayed a game cycle
lda PathsInUse,x ; path in use? Keep it a byte to make it quick....
beq PathNotActive
lda PathAddressLo,x ; Get baddie current path address/location
sta PathAddress
lda PathAddressHi,x
sta PathAddress+1
;
; Jump back to here to execute another command on the same sprite!
; "some" commands need to be free, while others will take a game tick.
; Changing object attribute or animation are "FREE", while movement is 1 tick.
;
DoNextCommand
ldy #0 ; reset index into command
lda (PathAddress),y ; get command
asl a ; *2 to index table
tay
lda PathJumpTable,y
sta DoJumpHere+1
lda PathJumpTable+1,y
sta DoJumpHere+2
ldy #0 ; point to first byte again
DoJumpHere jmp $0101
;
; once we've finished processing THIS sprite, jump to here
; we can then animate and do any "special" checking that needs to be done.
;
DoNextPath
jsr DoMisc ; 1 scanline wasted doing jsr/ret's.... could make it quicker... or inline it.
lda PathAddress ; Save the path location back into the sprite system
sta PathAddressLo,x
lda PathAddress+1
sta PathAddressHi,x
inx
cpx #MaxPaths
bne ProcessNextPath
ExitPathSystem
rts
PathNotActive:
cpx #0
beq SkipNuke
sta SY,x ; if path not active - nuke Sprite coordinate!
SkipNuke:
inx
cpx #MaxPaths
bne ProcessNextPath
rts
There's a couple of places where I could save a little, and I may later on, but I only really do that when it's not going to change much; updating highly optimised code is next to impossible.
DoMisc does things like copy the baddie coords over to the sprite system (the sprites are independent and not tied into the paths since I want to be able to take one and not the other to another game). DoMisc also does things like check for "Kill when clipped" options, and updates sprite centers so that collision routines don't have to keep doing that. It also deals with baddie animation, and getting the address of sprite shapes etc.
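If the self-modified JMP is hard to follow, here is the shape of the ProcessPaths loop above as a small Python model. The two commands and their numbers are made up purely for illustration (the real command set lives in the script docs), and each handler returns True if it used up the game tick or False if it was "free".

```python
# Hypothetical command numbers; the real script commands are not listed here.
CMD_SET_ANIM = 0   # "free": takes no game tick
CMD_MOVE = 1       # takes one game tick

def cmd_set_anim(path, script):
    path["anim"] = script[path["pc"] + 1]   # read the parameter byte
    path["pc"] += 2                         # step over command + parameter
    return False                            # free: run the next command too

def cmd_move(path, script):
    path["x"] += script[path["pc"] + 1]
    path["y"] += script[path["pc"] + 2]
    path["pc"] += 3
    return True                             # movement uses up this tick

# Stands in for PathJumpTable: command number -> handler.
JUMP_TABLE = [cmd_set_anim, cmd_move]

def process_paths(paths, script):
    for path in paths:                      # always from path 0 upwards
        if not path["in_use"]:
            continue                        # the real code also nukes SY here
        while True:                         # the DoNextCommand loop
            handler = JUMP_TABLE[script[path["pc"]]]
            if handler(path, script):
                break
        # The real loop calls DoMisc and saves the path address back here.
```

Running the loop from path 0 upwards matters in the original because the master path can start new paths, and they then get processed in the same game cycle.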
As for the Circle routine - that's pretty big... Aside from working out the next set of offsets, it's just a MoveABS style function with a counter, so I'll show you the actual GetOffsets function. There's 2 phases to the circle command - the 1st is a reverse calculation that works out the center for you, allowing you to specify the radius of an arc you are already on - this is a huge bonus, and all that I do is work out 180 degrees further on, and reverse the add/subtract to the deltas the 1st time in.
;*****************************************************************************
; workout offsets using path Radius and angle values.
; Store in zero page "CircleOff?" variables
;
; We use an inline macro for the xply as 2*9 cycles * 9 sprites=162 cycles!
; (or a couple of scanlines), and its only 66 bytes each...
;
;*****************************************************************************
GetOffsets
;
; Do RadiusY*sin(AngleY)
;
ldy PathAngleY,x ; get angle
lda SineWave,y ; get "sin(angle)"
sta xplyd ; store
lda PathRadY,x
sta xplyc
XPLY_M ; Use inline macro to avoid the 9 cycles for call/ret.
sta CircleOffY
ldy PathAngleY,x ; angle >180 deg? if so then NEGATE value
bpl !NotNegY
eor #$ff
clc
adc #1
sta CircleOffY
!NotNegY:
;
; Do RadiusX*sin(AngleX).
; (angle has been offset 90 degrees (64bytes) to account for COS->SIN translation)
;
lda #0
sta CircleOffX+1 ; high byte of X offset
ldy PathAngleX,x ; get angle
lda SineWave,y ; get "sin(angle)"
sta xplyd ; store
lda PathRadX,x
sta xplyc
XPLY_M ; Use inline macro to avoid the 9 cycles for call/ret.
sta CircleOffX
ldy PathAngleX,x ; angle >180 deg? if so then NEGATE value
bpl !NotNegX
eor #$ff
clc ; NEG a
adc #1
sta CircleOffX
bcs !NotNegX
dec CircleOffX+1 ; take high byte from 0 to $ff
!NotNegX:
rts
I haven't really looked into optimising this that much yet as most of the time is spent doing the multiplies, and they are as fast as they can be.
So this function returns a set of signed deltas from the circle origin, and the main function adds them onto the center and sets the sprite's location. It's a pretty expensive function, but all the smooth wavy paths in XeO3 are done with this, and it also saves huge amounts of SCRIPTing memory as one command can control a sprite for a long time. The last part is to increment the current angle, allowing the baddie to move in an arc (or more depending on the radius of X and Y).
The scripting also allows for what I call drift... this means I can drift the sprite on X by a certain amount. A drift of 1 means that if I pause, the sprite will move with the scrolling background without needing more commands. Now, if you apply drift to a circle, you can have a zero radius on X and wobble it vertically, which allows you to make a simple snake wave. There's lots of tricks you can use with command combinations to get some cool effects.
I did a shoot-em-up on a phone where I allowed the center of a circle to be attached to a parent, which allowed for hierarchical rotation - small orbs spinning around a larger orb - very cool. The plus/4 isn't quite able to handle that I think, but that doesn't mean you can't get some nice paths out of it!
If I've missed anything, or if something's not clear - just ask again! :)
EDIT: Oh - and I don't have 360 degrees, but 256. 128 = 180, 64 = 90 and so on...
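To make the 256-step angles and the drift trick a bit more concrete, here's a Python model of the same idea. It's a sketch under a few assumptions that aren't confirmed by the listing: that the sine table holds only the magnitude of sin() scaled to 0..255, that the multiply keeps the high byte of the result, and that one shared angle is used (the real code keeps separate X and Y angles, with X already offset by 64). The function names are made up.

```python
import math

# Assumed table format: 256 entries holding |sin| scaled to 0..255.
SINE_WAVE = [min(255, int(abs(math.sin(i * 2 * math.pi / 256)) * 256))
             for i in range(256)]

def circle_offset(radius, angle):
    """radius * sin(angle), with 256 steps to a full circle."""
    angle &= 0xFF
    magnitude = (radius * SINE_WAVE[angle]) >> 8   # assumed: keep the high byte
    # Angles of 128 and above (over 180 degrees) give a negative offset.
    return -magnitude if angle & 0x80 else magnitude

def get_offsets(rad_x, rad_y, angle):
    # cos is just sin shifted on by 90 degrees, which is 64 steps here.
    off_x = circle_offset(rad_x, angle + 64)
    off_y = circle_offset(rad_y, angle)
    return off_x, off_y

def snake_wave(centre_x, centre_y, rad_y, drift, ticks, angle_step=4):
    """Zero X radius plus drift on X: the sprite wobbles vertically as it drifts."""
    points = []
    angle = 0
    for _ in range(ticks):
        off_x, off_y = get_offsets(0, rad_y, angle)
        points.append((centre_x + off_x, centre_y + off_y))
        centre_x += drift                      # drift moves the circle centre
        angle = (angle + angle_step) & 0xFF    # step round the circle
    return points
```

Calling snake_wave() with a drift of 1 gives a list of positions that creep along X while bobbing up and down on Y, which is the simple snake wave described above.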
Wednesday, June 27, 2007
XeO3: Scripting Engine.
Okay, after a request for more info on the scripting engine/system I use, I've written up a small description of how it all works. It's not hugely in-depth, but it should give you the gist of how it all works...
So have a look HERE to read through, but feel free to ask questions afterwards. I do intend to write up an API for the script system, and do several examples and tutorials for when I release the code. I've done some small samples and docs so that Luca can play with it, but I really need to do far more.
Monday, June 25, 2007
XeO3: update....
Well, the cache really is working well now. I've been able to strip it down from 137 entries to 100 without any noticeable slow down. This saves around 2.5k which is great, but I may need the extra space if I want to bring on multiple baddie types at once, so I'm not about to splurge on memory just yet!
But the good news is that we're now at a point where I can FINALLY progress to the weapons system!! Everything appears to be running, and is stable, so that's going to be the next task!
Once I've done that I can look at making a new demo - PLAYABLE this time, albeit looking like mince. We don't want to give too much away on how it looks/feels so we're going to remove most of the backgrounds (make them solid blocks) and make the sprites very simple too. This will allow me to craft the difficulty a little better to users' abilities and not make it too hard, which was the chief complaint with Blood Money (apparently). So while the game's not really close to being finished, you do at least have something in the short term to look forward to!
XeO3: Sprite Cache....
Mmm...Now it gets interesting! Now that the cache routine is working properly, I can finally visualise it and see how much is used, and how it behaves... It's showing that it's actually got a little room to spare - which is great as it means I can always steal some if I need to - however, that little function below that makes the cache work takes around 50+ cycles per sprite per game cycle for normal usage. That is when it's not at the end or whatever... That's 10 scanlines at least! Every game cycle! (when all sprites are active)
So once again... it's down to a balancing act - do we use the cache and perhaps get a couple of K back...or do we ditch it and gain 10 scanlines? Of course there's also an additional penalty of having to rotate the player ship every now and then if we don't use the cache properly, and it could come at any time, whereas the cache is reasonably predictable.
It's never easy.....I suspect the cache will stay, as 10 scanlines is less than rotating 1 sprite, and if I don't overload the screen with NEW sprites, I can be sure that it won't rotate + the full 10 scanlines at any one time. I'll have a think....perhaps there's an even quicker way of doing this....
XeO3: Internet development to the rescue!!
Well, TNT to the rescue! He spotted the problem and it's now working a treat. Here's the final function, and you can see how all the suggestions have been added and sped things up.
;
; Y = rotation index
; Temp+16 = pointer to the raw sprite info with an
; 8 byte cache index table
;
lda (Temp+16),y
tay
ldx Cache_Next-1,y
beq Skip_MoveToEnd
lda Cache_Prev-1,y
sta Cache_Prev-1,x
bne !We_are_Not_First ; STA above does not change flags
stx FirstCache ; X still holds Cache_Next-1
beq !SkipNotFirst
!We_are_Not_First:
tax ; NOW get prev
lda Cache_Next-1,y
sta Cache_Next-1,x
lda #0 ; Only need to clear here
!SkipNotFirst:
sta Cache_Next-1,y
lax LastCache
sta Cache_Prev-1,y ; *BUG HERE* -1 added.
tya
sta Cache_Next-1,x
sta LastCache
Skip_MoveToEnd:
And there you go! Believe it or not, this tiny bit of code is the whole reason XeO3 got started!! I wrote the whole sprite routine to try THIS bit of code out! Stupid really... but there ya go!
Many thanks guys!
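For anyone trying to follow the routine above without the 6502 tricks, here is the same unlink-and-append logic as a Python model. It's only a sketch: the class and method names are made up, and it assumes entries are numbered from 1 with 0 meaning "no link" (which is what the -1 on the table addresses suggests). Moving a used entry to the end like this looks like a least-recently-used ordering, but that's my reading of it rather than something stated here.

```python
class CacheOrder:
    """Doubly linked list held in two arrays, as Cache_Next/Cache_Prev are."""

    def __init__(self, count):
        # Entry 0 is unused; 0 in a link means "none".
        self.next = [0] * (count + 1)
        self.prev = [0] * (count + 1)
        for i in range(1, count + 1):
            self.prev[i] = i - 1
            self.next[i] = i + 1 if i < count else 0
        self.first = 1
        self.last = count

    def move_to_end(self, entry):
        nxt = self.next[entry]
        if nxt == 0:
            return                      # already last: Skip_MoveToEnd
        prv = self.prev[entry]
        self.prev[nxt] = prv            # unlink from the entry after us
        if prv == 0:
            self.first = nxt            # we were first, so next becomes first
        else:
            self.next[prv] = nxt        # otherwise patch the entry before us
        self.next[entry] = 0            # we are now the tail
        self.prev[entry] = self.last
        self.next[self.last] = entry    # old tail points at us
        self.last = entry

    def order(self):
        """Walk the list from first to last (for checking only)."""
        out, i = [], self.first
        while i:
            out.append(i)
            i = self.next[i]
        return out
```

The assembly version gets its speed from reusing register contents and flags between steps; the Python just spells out each link update.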