Saturday, July 07, 2007

65816: The power of 16bit.

I've been wondering just how much faster the SuperCPU actually is than a stock C64, and aside from the x20 jump you get from the raw clock speed, the new instructions and 16bit nature give you an even bigger boost - almost another x2! Here's a little example....

The scrolling in XeO3 takes a long time. Every game cycle I do this:


ldx #39
ScrollLoop
lda BackBuffer,x
sta HWScreen2+$400+(40*00),x
lda BackBuffer,x
sta HWScreen2+$400+(40*01),x
lda BackBuffer,x
sta HWScreen2+$400+(40*02),x
lda BackBuffer,x
sta HWScreen2+$400+(40*03),x
lda BackBuffer,x
sta HWScreen2+$400+(40*04),x
lda BackBuffer,x
sta HWScreen2+$400+(40*05),x
lda BackBuffer,x
sta HWScreen2+$400+(40*06),x
lda BackBuffer,x
sta HWScreen2+$400+(40*07),x
lda BackBuffer,x
sta HWScreen2+$400+(40*08),x
lda BackBuffer,x
sta HWScreen2+$400+(40*09),x
lda BackBuffer,x
sta HWScreen2+$400+(40*10),x
lda BackBuffer,x
sta HWScreen2+$400+(40*11),x
lda BackBuffer,x
sta HWScreen2+$400+(40*12),x
lda BackBuffer,x
sta HWScreen2+$400+(40*13),x
lda BackBuffer,x
sta HWScreen2+$400+(40*14),x
lda BackBuffer,x
sta HWScreen2+$400+(40*15),x
lda BackBuffer,x
sta HWScreen2+$400+(40*16),x
lda BackBuffer,x
sta HWScreen2+$400+(40*17),x
lda BackBuffer,x
sta HWScreen2+$400+(40*18),x
lda BackBuffer,x
sta HWScreen2+$400+(40*19),x
lda BackBuffer,x
sta HWScreen2+$400+(40*20),x
dex
jpl ScrollLoop
rts


This code is self-modified to address the new location of the back buffer, and I have to use a jpl (macro) since a normal branch is just out of reach, so this takes (40*21*9)+(40*7) = 7840 cycles. (This is approximate, as there are also page boundary crossings hidden in here.)
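For reference, here's roughly what a long-branch macro like jpl has to expand to. This is a sketch rather than the actual macro, and ScrollDone is a made-up label. The loop body is 21 LDA/STA pairs at 6 bytes each plus the DEX, so a plain BPL back to the top would have to reach back 129 bytes, one more than the 128 a branch can manage.

```asm
        dex                     ; 2 cycles
        bmi ScrollDone          ; X went negative: hop over the jump (2 cycles when not taken)
        jmp ScrollLoop          ; absolute jump, no range limit (3 cycles)
ScrollDone:
        rts
```

That's 2+2+3 = 7 cycles per pass, which would account for the (40*7) term above.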

Now on the 65816 I can do exactly the same, but being 16 bit, the loop only runs half as many times, and although each LDA/STA costs an extra cycle, it's still much quicker. So the loop is now (20*21*11)+(40*7) = 4900 cycles.
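A sketch of what the 16bit loop could look like, assuming the CPU is already in native mode (clc/xce) and reusing the BackBuffer and HWScreen2 labels from the listing above. ScrollLoop16 is a made-up label, and only the first two rows and the last row are written out.

```asm
        rep #$20                        ; 16bit accumulator
        longa on
        sep #$10                        ; keep X/Y 8bit so X goes negative after 0
        longi off

        ldx #38                         ; offset of the last word in a 40 byte row
ScrollLoop16:
        lda BackBuffer,x                ; 5 cycles, fetches 2 characters at once
        sta HWScreen2+$400+(40*00),x    ; 6 cycles
        lda BackBuffer,x
        sta HWScreen2+$400+(40*01),x
        ; ... rows 02 to 19 follow the same pattern ...
        lda BackBuffer,x
        sta HWScreen2+$400+(40*20),x
        dex
        dex                             ; step back one word
        jpl ScrollLoop16                ; same long-branch macro as before

        sep #$20                        ; back to an 8bit accumulator for the caller
        longa off
        rts
```

X counts 38, 36, ... 0, so the body runs 20 times instead of 40.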

And lastly, the 65816 has block transfer instructions, MVN and MVP, which are like the Z80's LDIR instruction. They cost 7 cycles for every byte moved though, so even in the BEST case it's (40*21*7) = 5880 cycles, and the block transfer would be broken up a little more than that (to do lines mainly). So the unrolled 16bit loop is still the one to use: at 4900 cycles it's around 1.6 times the speed of the 6502 version, and we have the new 20MHz clock as well.
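Setting up a block move for a single 40 byte row might look something like this. It's only a sketch: the operand syntax for MVN varies between assemblers, so both banks are set to 0 here to make the operand order irrelevant, and the labels are the ones from the scroll listing.

```asm
        rep #$30                    ; A, X and Y all 16bit (native mode assumed)
        longa on
        longi on

        ldx #BackBuffer             ; X = source address
        ldy #HWScreen2+$400         ; Y = destination address
        lda #40-1                   ; A = number of bytes to move, minus one
        mvn $00,$00                 ; copy upwards, source and destination both in bank 0
        ; afterwards A = $FFFF and X/Y point one past the last byte copied
```

MVN also leaves the data bank register set to the destination bank, which is harmless here since everything lives in bank 0.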

..............Bitmap blitting suddenly becomes REALLY interesting!!

Friday, July 06, 2007

Snasm: 65816

I've done most of the assembler stuff - not all, but most - to the point that I can now build simple programs. So I've built a simple 65816 program, and it appears to be falling over; and I've no idea why! These kinds of things suck coz I can't debug it at all - just try new things over and over until I get somewhere....

Here's the code....


opt prg
opt C64=START ; Insert BASIC header, and jump to start of the code
opt A65816 ; Set 65816 assembler mode


org $1000
START:
longa off
longi off
sei
clc ; Clear carry...
xce ; ...and swap it into the emulation flag: 65816 native mode

rep #$30 ; Clear M and X flags: 16bit accumulator and index registers
longi on
longa on

lda #$0000
sta $20

Here:
lda $20
ldx #$1000
Here2:
sta $0400,x
dex
bne Here2

inc $20
jmp Here



If I don't put it into 16bit mode, it appears to work fine, but execute the rep #$30, and it dies.... I'm not sure why... If I didn't know better, I'd say there was an interrupt going off - I wonder if the NMIs are still on....
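One thing that might be worth checking (a guess, not a confirmed diagnosis): once rep #$30 makes X 16bit, ldx #$1000 really does load $1000, so sta $0400,x walks all the way down from $1400 to $0401 and passes straight over the code sitting at $1000. With 8bit index registers X can never go above $FF, so the same loop stays inside $0400-$04FF. A sketch that keeps the fill inside the 1000 byte screen, assuming that was the intent, would replace the Here loop with:

```asm
Here:
        lda $20
        ldx #1000-2         ; decimal: offset of the last word of the screen
Here2:
        sta $0400,x         ; 16bit store, fills 2 screen bytes at a time
        dex
        dex
        bpl Here2           ; X is 16bit: runs down to 0, then wraps to $FFFE and exits

        inc $20
        jmp Here
```

The highest address written is $07E7, so nothing above the screen gets touched.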

Thursday, July 05, 2007

Snasm: Addressing modes.

I've finished the last couple of new addressing modes the 65816 has, so tonight I'll start throwing in the new instructions which won't take long and then I can start having a play on the real machine and see what it can do. The idea of a fully software driven game appeals to me, as in bitmap mode you can do some pretty big sprites without losing all your background tiles.

However... A multiplexor is still a great tool, and you should never discard useful tools.

Wednesday, July 04, 2007

SNASM: 65816...

Almost done with the addressing modes. Only 2 small ones to go, and I'll do them tomorrow.

EOR (dp,X)
EOR sr,S
EOR dp
EOR [dp]
EOR #const
EOR addr
EOR >long
EOR (dp),Y
EOR (dp)
EOR (sr,S),Y
EOR dp,X
EOR [dp],Y
EOR addr,Y
EOR addr,X
EOR >long,X


So the only modes left are sr,S and (sr,S),y - neither of which I've ever used, but I'll put them in for completeness. This hasn't been nearly as painful as I thought it would be, 2-3 days at most. I've still to add all the extra instructions, but they're all just implied addressing (TXA etc.) and are just table modifications. MVN, MVP, PER and BRL are the only ones that need special work, and they won't take long.

Oh....I've been using THIS page as an opcode reference - pretty good too.

XeO3: SuperCPU...

I'm just running the current version of XeO3 on my C128 with the SuperCPU attached, and I still can't believe just how fast it is... The whole game (as it currently stands - scrolling, paths, animations, turrets) runs in around 20 scanlines. What's amazing is that XeO3 is designed to run in 2 frames so that it's a nice slow paced scroller, so that's basically 2 WHOLE FRAMES with nothing to do but fancy stuff!!! How cool is that!

With the turbo off, it's taking about a frame - which is a lot. It just shows how much faster the Plus4 really is.

I'm also really pleased with the downloader - because it uses control lines and not timing code, it just works! AND I can switch the turbo on/off as it downloads without it doing something horrible!

The only current issue is that while everything runs, the background tiles aren't being drawn correctly - for some reason. That's a bit of a bugger...

Edit: I think part of the problem is that the C64 version has a multiplexor running every frame, and the sort is as well - when it doesn't have to be.

SNASM: 65816 support....

I've been adding more addressing modes during lunchtime, so now I can do full 16bit immediate mode stuff. This also means I had to put in 8/16 bit flow control using new LongA and LongI, so you can now do commands like this!

LongA off
LongI off

lda #$ff
adc #$55
and #$fe
cmp #$12
eor #$55
ora #$55
sbc #$55
bit #$55

ldx #$23
cpx #$55
ldy #$23
cpy #$55


LongA on
LongI on

lda #$ffff
adc #$5125
and #$fe34
cmp #$1112
eor #$5544
ora #$5555
sbc #$5566
bit #$5532

ldx #$2312
cpx #$5542
ldy #$2363
cpy #$5513


Great fun!! I found out from the forums on Lemon64 that I can happily run my SuperCPU on my C128 in C64 mode without fear of blowing it up, so I can probably start to test some of the output soon.
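Worth spelling out, as a sketch based on how the 65816 itself behaves rather than on SNASM's internals: LongA and LongI only tell the assembler how many bytes to emit for an immediate operand. The CPU's real register width is set with sep/rep, so the two have to be kept in step by hand.

```asm
        sep #$30        ; CPU: A, X and Y 8bit
        LongA off       ; assembler: 1 byte immediates for A
        LongI off       ; assembler: 1 byte immediates for X/Y
        lda #$ff        ; assembles to A9 FF
        ldx #$23        ; assembles to A2 23

        rep #$30        ; CPU: A, X and Y 16bit
        LongA on
        LongI on
        lda #$ffff      ; assembles to A9 FF FF
        ldx #$2312      ; assembles to A2 12 23 (low byte first)
```

If the directive and the CPU flag ever disagree, the processor reads one operand byte too many or too few and carries on executing from the wrong place.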

I was thinking of doing a bitmap scroller and actually blitting the bitmap on each frame - which at 20MHz, should only take around 1/5th of a frame!! Wow, that's fast... I could then also do software sprites on top of that! The great thing being that the sprites would be much quicker (aside from the 20MHz stuff) since I don't have to copy character data around - AND you wouldn't be limited by the character set either - it's a bitmap! Colour would also be possible with this as well I guess... and that's not even thinking about C64 hardware sprites yet either!

There's surprisingly little added to the 6502: a couple of addressing modes (and I'm doing 24bit Absolute addressing just now), and a few extra instructions that take no effort at all really. So this shouldn't take that long at all to finish....

Edit: Thinking a bit more about this....I really need to update my C64 emulator to allow 65816; development would go much quicker if I had an emulator to run it in... I'm not sure how much effort that would be, I've not looked at it in years - since I put it on the PS2/XBox really.

Tuesday, July 03, 2007

Paradroid!!!!

At long (long, long, long, long, long, long, long, long, long, long, LONG!) last... TNT has started a proper blog of his Paradroid update. It's basically a disassembled, updated (a lot) and then reassembled version of the original game. He's making heaps of improvements and it's well worth a shot if you haven't already. You can find his blog HERE

I've decided to start the work of upgrading my assembler to be 65816 compatible (which is used in the SuperCPU) as I fancy having a little play with 20MHz of power! I know - I'm jumping around quite a bit, but that happens as I try and keep my interest going. I'll probably play with this at lunchtimes at work, so I hope it won't get in the way too much. The 65816 is very neat and I've used it a lot in the past when doing SNES work (Lemmings2). It's got full 16 bit registers, although you can swap them back and forth between 8 and 16 bit. It's also got access to a full 16MB of RAM (which my SuperCPU has!) which could make for some REALLY cool stuff - lots of space for buffers and tables here! Also Zero page becomes DirectPage and it can move!! In Lemmings2 I pointed zero page at my graphics so that the code ran faster! I'll need to look out my SNES assembler manual to see the syntax I used in that, but all in all - it should be pretty good fun!
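To show what a movable direct page means in practice, here's a minimal sketch. The $2000 address is an arbitrary example rather than anything from the game, and native mode is assumed.

```asm
        rep #$20            ; 16bit accumulator
        longa on

        lda #$2000
        tcd                 ; direct page register D = $2000
        lda $10             ; direct page access: now reads $2010/$2011, not $0010

        lda #$0000
        tcd                 ; put the direct page back at $0000 when done
```

Direct page instructions are a byte shorter than their absolute equivalents and usually a cycle faster, which is the speed-up you get from parking D on top of the data you're hammering. Keeping D page aligned avoids an extra cycle on every direct page access.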

Edit: So I've just added my first new 65816 instructions! yeeeeeeaaa!!
inc a
opcode [$00]
opcode [$00],y

I'm so happy... :) This does 24bit indirect addressing through the direct page register (the new movable ZeroPage)
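As an illustration of the [$00],y form, here's a sketch that builds a 24bit pointer in the direct page and reads through it. The target address $02:4000 is made up for the example, D is assumed to be $0000, and the CPU is assumed to be in native mode.

```asm
        rep #$30            ; 16bit A, X and Y
        longa on
        longi on

        lda #$4000
        sta $00             ; pointer bits 0-15 go into $00/$01
        lda #$0002
        sta $02             ; bank byte $02 goes into $02 (the spare high byte lands in $03)

        ldy #$0010
        lda [$00],y         ; reads the 16bit word at $02:4010
```

The pointer itself is 3 bytes, so the data it points at can live in any bank without touching the data bank register.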

You know something....the THOUGHT of filling even 4MB with C64 graphics/sound is frightening.... On the SNES I had a ROM disk system, and most of the memory was taken up with that, but here... it's RAM...so you can actually DO stuff with it - outstanding! I may have to update my C64 emulator to allow 65816 code as well...Mmmm... even simple stuff without the need to fall back to 1MHz for custom chips would probably do - I don't think anyone's done a SuperCPU emulator before....

The other thing I didn't realise is that the 65816 is also 65c02 compatible...which means I can probably add 65c02 to the assembler as well...

RetroEdit: Painting by numbers.

I got basic editing in last night, although I really do mean basic. I can paint with 2 pre-defined colours - which isn't very handy. Now I need to put all the really boring editor stuff in to allow you to select colours from a palette, swap between them, validate the system it's on, and then swap between systems....I hate editors :)

(oh....then save it all)

Sunday, July 01, 2007

RetroEdit: Editing...

I've almost got editing working; I can click on the bitmap and it draws a pixel into the right place. There's still loads to do, but it shouldn't take long now. Once I get editing and saving working, I can let Luca loose on it and try it out - it should help him do later level sprites. I'm expecting great things from him once he gets a good editor, as he's already done amazing things with - well, crap editors!

XeO3: Squishing bugs!

I've had a really bugging..well...bug! for a while now in the front end; after you die and return there I seemed to be losing a layer of stars and I couldn't think what it could be! My first thought was that one of the counters wasn't being reset and so a layer was in fact 2, but I couldn't see how that was possible. So I then thought that perhaps some stars were black, and I just wasn't seeing them. This wasn't right either - although I did discover some were in fact being made black - but it was fairly even throughout all the layers. Once I fixed that I went back to hunting for the missing layer.... Turns out my first instinct was right and a counter wasn't being reset! The slowest speed only moves every other frame and was being toggled with $FF and getting checked against 0, but since I use the game data area it wasn't being cleared to 0, which meant the EOR #$FF was setting it from one non-zero value to another, and so it was moving at the same speed as the middle layer.
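In code, the toggle and its fix look something like the sketch below. The names StarToggle, MoveSlowLayer and SkipSlowLayer are made up for illustration, and the direction of the test (move when non-zero) is an assumption based on the symptom described.

```asm
; on entry to the front end
        lda #0
        sta StarToggle          ; the missing reset: start from a known 0

; every frame
        lda StarToggle
        eor #$ff                ; 0 -> $FF -> 0 -> $FF ...
        sta StarToggle
        beq SkipSlowLayer       ; 0 this frame: the slow layer sits still
        jsr MoveSlowLayer       ; $FF this frame: move it
SkipSlowLayer:
```

Without the reset, a leftover value like $37 just flips between $37 and $C8, which is never 0, so the layer moves every single frame.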

Glad I fixed that, it's been bugging me for a while now....