Wednesday, December 05, 2007

65816: Cool bitmap stuff...

Yeah, I know I know.... I should be doing XeO3, but the lure of 65816 code is too much just now! :)

Anyway... After doing my block transfer code for copying a bitmap, I started wondering how expensive it would be to physically draw the screen every frame from a tileset. This then got me thinking about making a software character map. If you could do a 16bit character map, you could extend the characters available, and even do some pretend hardware flipping! So I did some basic code (no visuals yet) to see how bad it was, and was pretty pleased. It worked out at about 1.5x to 2x the cost of the raw block copy. Not bad at all! After playing with the hardware FLIP (on Y) idea, it came out at about 2x to 2.5x, which is still pretty good, and actually BETTER than my current character screen copy! (code shown below)
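     ldx #8000-8 ; X=offset of the last cell in the bitmap
BlitLoop2
lda $9000+1000 ; get character (operand is self-modified below)
dec BlitLoop2+1 ; step the LDA operand back one 16bit map entry
dec BlitLoop2+1
and #$3ff ; 10 bits = 1024 characters
asl ; x8 = offset into the character data
asl
asl
tay ; Y=Character offset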

lda $8000,y ; copy character
sta $4000,x
lda $8002,y
sta $4002,x
lda $8004,y
sta $4004,x
lda $8006,y
sta $4006,x
txa
sbc #7 ; -8 coz carry is clear
tax
bpl BlitLoop2
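
For anyone who doesn't read 65816, the same idea can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the names (`blit_charmap`, `FLIP_Y`) are made up for illustration, and using bit 15 of the map entry as the Y flip flag is a guess, since the flip code isn't shown above. The 10 bit mask and the x8 multiply come straight from the loop.

```python
TILE_BYTES = 8        # one 8x8 cell is 8 consecutive bytes in the bitmap
INDEX_MASK = 0x3FF    # low 10 bits pick one of 1024 characters (the "and #$3ff")
FLIP_Y = 0x8000       # hypothetical flag bit: draw the character upside down


def blit_charmap(charmap, tiles):
    """Build a bitmap from a list of 16bit character map entries."""
    bitmap = bytearray(len(charmap) * TILE_BYTES)
    for cell, entry in enumerate(charmap):
        src = (entry & INDEX_MASK) * TILE_BYTES   # same as the three ASLs
        rows = tiles[src:src + TILE_BYTES]
        if entry & FLIP_Y:
            rows = rows[::-1]                     # reverse the 8 rows
        dst = cell * TILE_BYTES
        bitmap[dst:dst + TILE_BYTES] = rows
    return bitmap


# Two characters of dummy data, and a three cell "screen" using them.
tiles = bytes(range(16))
screen = blit_charmap([1, 0, FLIP_Y | 1], tiles)
```

The Python version copies a byte at a time, where the assembly moves the same 8 bytes as four 16bit words.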


Imagine it.... a character map screen that has proper bitmap colouring, a colour screen that is double buffered, and around 1024 (at least) characters. Very cool. On top of all that, your software sprites DON'T steal any of them, and DON'T have to draw into chars before going on screen.

Then I started thinking... On the 65816 the direct page is moveable.... So why not point it at the bitmap screen? This could save 2 cycles per STA - a LOT when dealing with a bitmap. The results are amazing to say the least!! Believe it or not, it's now about the SAME time as the raw block transfer instruction!!! How amazing is that!! (basic code below)
DoChar     macro 
lda charScr+\0,x ; get character (X=offset of this row of 32 in the map)
and #$3ff
asl
asl
asl
tay ; Y=Character offset
lda $8000,y ; copy character
sta $00+\1
lda $8002,y
sta $02+\1
lda $8004,y
sta $04+\1
lda $8006,y
sta $06+\1
endm

; Actual loop....

phd
lda #31
sta LoopCounter+1
lda #$4000
tcd
ldx #0
BlitLoop:
DoChar 0,0
DoChar 2,(8*1)
DoChar 4,(8*2)
DoChar 6,(8*3)
DoChar 8,(8*4)
;
; etc...
;
DoChar 60,(8*30)
DoChar 62,(8*31)

txa
clc
adc #64
tax

tdc
clc
adc #$100
tcd

dec LoopCounter+1

LoopCounter:
lda #$0101
beq AllDone
jmp BlitLoop
AllDone:
; 8 more DoChar's to finish off....
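
As a rough sanity check on the "same time as the block transfer" result, the cycles can be added up by hand. The sketch below uses nominal 65816 cycle counts with 16bit registers and makes some loud assumptions: no page-crossing penalties, a page-aligned direct page, no SuperCPU-specific wait states, and none of the per-row loop overhead. The function name is made up for illustration.

```python
# Nominal 65816 cycle counts, 16bit accumulator and index registers.
MAP_FETCH = 5       # lda absolute (or absolute,x) to read the map entry
AND_IMM = 3         # and #$3ff
ASL_A = 2           # asl, done three times
TAY = 2
LDA_ABS_Y = 5       # lda $8000,y
STA_ABS_X = 6       # sta $4000,x  (the first version of the loop)
STA_DP = 4          # sta $00      (direct page pointed at the bitmap)
MVN_PER_BYTE = 7    # block move cost per byte


def dochar_cycles(store_cycles):
    """Cycles to fetch one map entry and copy its 8 bytes as 4 words."""
    setup = MAP_FETCH + AND_IMM + 3 * ASL_A + TAY
    copy = 4 * (LDA_ABS_Y + store_cycles)
    return setup + copy


direct_page = dochar_cycles(STA_DP)       # 52 cycles per character
absolute = dochar_cycles(STA_ABS_X)       # 60 cycles per character
block_move = 8 * MVN_PER_BYTE             # 56 cycles per 8 bytes
```

On these numbers the direct page version comes in slightly under the block move for the same 8 bytes, which fits the measurement above. Real hardware timings will differ once loop overhead and memory speed are included.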



So I can now not ONLY copy a bitmap, but create a whole screen in realtime from a 16bit character map. This means you can do all the old tricks of animating a couple of characters and the whole screen could change (as I do for my turrets in XeO3), and if you really wanted you could draw huge character style sprites/baddies again.

I need to verify all this by actually drawing a screen (probably get the XeO3 scrolling going), but it looks really REALLY cool.....

Tuesday, December 04, 2007

Play.com: How the mighty have fallen....

For those that don't know, Play.com are a fairly successful on-line DVD shop. They have recently branched out into electricals and games, but while doing so have sacrificed what made them great...

I just tried to order a Hi Def DVD player from them only to have my account suspended. I then received an email telling me to phone customer support to verify who I was. Now.... if I was a new user, OR had changed my details in the past - hell - YEAR! then I could understand this. I had added a new delivery address, but these items "for security reasons" were being shipped to my home/billing address. On top of that, they emailed the email address they've held for the past several years to ask for verification.

This comes on top of increasingly poor service. Now, I'm a DVD fan - a HUGE DVD fan, so I notice this more than most. But in the days of "next day delivery", or at the most a couple of days, a couple of weeks is bad. And that's what play now deliver on a regular basis. My current record is 4 weeks, and I now NEVER receive an order from play quicker than 2 weeks.

And to top all this off, our local Asda has started beating them on price - HUGELY! The new Shrek the 3rd movie on play is around £12.99, while Asda have it for £8.88! This is happening more and more, so much so that I found the only reason to order from play, was the fact I could do it easily, and then they pull this.

Not only that, but when you buy a pre-release, you could usually assume you would get it on the day of the release. NOPE! They've screwed that as well. I ordered Call of Duty 4 months before it came out, yet I had to wait several days after launch to finally receive it, when I could have walked into a shop on the Friday and got it for exactly the same price.

Not only did they ask for verification, they actually suspended my account. Bastards the lot of them. Everything on play can be bought elsewhere, usually at the same price or sometimes now cheaper. It didn't used to be this way.

.................So much for internet shopping being cheaper, quicker and easier - to hell with them.

(rant over)

Monday, December 03, 2007

65816: More coding goodness....

I finally managed to get a stable framework now (more or less), complete with interrupts. I did have to go looking for some answers though as I was getting nowhere fast! Turns out when the 65816 is in native (16bit) mode, its interrupt vectors move: IRQ goes to $FFEE and NMI to $FFEA. Oh well, live and learn (again). When I did this on the SNES I would have done that in the 1st couple of days then forgotten about it!
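
For reference, the two sets of vector addresses can be written down as a small lookup. The addresses are the standard 65816 ones. Everything else in this Python sketch (the `VECTORS` table layout, `read_vector`, and the $C000 handler address) is hypothetical and only there to show the idea.

```python
# Bank 0 addresses of the interrupt vectors in each CPU mode.
VECTORS = {
    "emulation": {"NMI": 0xFFFA, "RESET": 0xFFFC, "IRQ": 0xFFFE},
    "native": {"NMI": 0xFFEA, "IRQ": 0xFFEE},
}


def read_vector(memory, mode, name):
    """Return the 16bit handler address stored little-endian at a vector."""
    addr = VECTORS[mode][name]
    return memory[addr] | (memory[addr + 1] << 8)


# Install a made-up handler at $C000 for native mode IRQs only.
memory = bytearray(0x10000)
memory[0xFFEE] = 0x00
memory[0xFFEF] = 0xC0
native_irq = read_vector(memory, "native", "IRQ")
emulation_irq = read_vector(memory, "emulation", "IRQ")
```

With only the native vector filled in, an IRQ taken in emulation mode would still go through $FFFE, which is why a framework that switches modes needs both sets pointing somewhere sensible.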

Anyway, now that it's running the real problems with block transfer routines show themselves; as far as I can tell, you can't interrupt them! Yup, on most CPUs the next interrupt happens at the end of the last instruction - EVEN if that instruction takes seconds to complete! So, looks like I'd need a loop around a few of these, or not use it at all!

I guess for the final BLIT, as long as I knew it wasn't going to interfere I could use it, but it does look like the best/safest way is simply to use lda/sta instructions. Oh well.

Saturday, December 01, 2007

SuperCPU: Wowza!

You know, the more I toy with this machine, the more I'm blown away by it. Now that I have a reasonably stable assembler I can see just what it's capable of.

For example, a full BITMAP copy using the block transfer instruction runs in at around 56 scan lines (7 characters high), while a transfer to non-VIC memory runs in at around 40 scan lines (5 chars high). That's fast, really fast! Normally, this would take over a frame to do. It means that software sprites AND bitmap scrolling are easily possible, and it would be interesting to take over the XeO3 sprite routine and apply it to a bitmap screen.

I suspect you could do a parallax bitmap scroll in the same way I did a character one, but by using a bitmap screen you would get access to many more colours.

If you ran the game in 2 frames (same as XeO3), you would have a phenomenal amount of CPU time to burn, and could probably fill the screen with software sprites!

All that said, I am still having problems using the SuperCPU to its fullest as I can't seem to access the higher banks without it locking up (using the block transfer instruction).

For those interested, I've been using THIS PAGE as my 65816 reference. Although I do have an old 65816 cycle sheet from my SNES days somewhere....


You know, the more I think about it, the more I think that 16MB of RAM and sprite caches would be amazing! I need to do a new SuperCPU demo :)

Friday, November 30, 2007

65816: Aaaaaaaannnnnnnddddd we're done..........

Well, I appear to have now finished 65816 support. I had some nasty little bugglets that were messing with my REP/SEP commands, which is exactly the reason I couldn't swap in/out of 16bit mode easily! Basically, even in 16bit mode, REP/SEP take an 8bit operand, but the assembler was emitting 16bit immediate values, which meant the CPU would then hit the extra $00 (BRK) as the next instruction.
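
The rule behind that bug is easy to show in a few lines. This Python sketch is not SNASM's actual code; `encode_immediate` and the three tables are hypothetical, and only a handful of opcodes are included. It just shows the sizing rule: accumulator immediates follow the accumulator width, index immediates follow the index width, and REP/SEP are always one operand byte.

```python
ACC_IMMEDIATE = {"lda": 0xA9, "and": 0x29, "adc": 0x69, "sbc": 0xE9}
INDEX_IMMEDIATE = {"ldx": 0xA2, "ldy": 0xA0}
FIXED_8BIT = {"rep": 0xC2, "sep": 0xE2}   # never widened, whatever the mode


def encode_immediate(mnemonic, value, acc16=False, index16=False):
    """Return the bytes for an immediate-mode instruction."""
    if mnemonic in FIXED_8BIT:
        opcode, width = FIXED_8BIT[mnemonic], 1
    elif mnemonic in ACC_IMMEDIATE:
        opcode, width = ACC_IMMEDIATE[mnemonic], 2 if acc16 else 1
    elif mnemonic in INDEX_IMMEDIATE:
        opcode, width = INDEX_IMMEDIATE[mnemonic], 2 if index16 else 1
    else:
        raise ValueError("not handled in this sketch: " + mnemonic)
    return bytes([opcode]) + value.to_bytes(width, "little")


rep_bytes = encode_immediate("rep", 0x30, acc16=True, index16=True)
lda_bytes = encode_immediate("lda", 0x3FF, acc16=True)
```

Here `rep #$30` stays at two bytes even with both widths set to 16bit, while `lda #$3ff` grows to three. The buggy behaviour described above would have produced C2 30 00 for the REP, leaving a stray $00 in the instruction stream.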

So, this is pretty cool, as I'm now done with the core support for 65816 and 65c02. It still needs heavy testing, but looks like it's mostly there.

While I was at it, I also fixed a couple of other instructions in normal 6502 mode that were broken ( jmp($1234), and lax $00,y). So that helps me anyway for XeO3.

So all I need to do now is release this version and then back onto paths for XeO3!
(have I mentioned just how much I HATE doing paths???)

Thursday, November 29, 2007

65816: Problems....

Okay....I was a bit peeved that I released the latest SNASM with only partial 65816 support, so I've decided to finish it off. While doing this, I've hooked up my SuperCPU again along with my fab new Heavy Duty PSU.

I'm pleased to say my MMC64 and Ethernet still work on the SuperCPU, so downloading is a doddle. You know...I was halfway through this post, moaning that nothing was working, and then I noticed a REALLY stupid mistake, and now all is well....

So now I have a full 65816 - 16bit program running!! COOOOOOL!

Slight problems still exist however....If I'm in 16bit mode, I can't go into 16bit mode. (yes you read that right). It's a bit strange. As it's not that big, I've done a HEX dump on the code and it all looks fine!

*sigh*........What the hell's wrong now.......

Wednesday, November 28, 2007

New Release!!!!

I have just done my first new release of Minus4 in about 2-3 years!! Yes, all these new features I keep going on about have now been released into the wild. So if you head on over to THE MINUS4 HOMEPAGE you'll be able to download new versions of Minus4w, Minus4wsrc and the new Snasm.

The new Snasm also has a lot of 65816 assembler support in it, but more importantly outputs symbols for Minus4's built-in debugger.

If there are any cool features you can think of, let me know.... Minus4 has moved from being the best emulator (now easily beaten by YAPE) to being the best for development - by far!

XeO3 is being developed using SNASM as the assembler, and Minus4 as the debugger. Yape provides more realistic playback, and the uploader provides real hardware downloading! All I really need to do is finish the remote debugger and we'd have a full devkit!!

Tuesday, November 27, 2007

XeO3: Profiled!

I'll get into this in a bit, but I just wanted to show the basic XeO3 profile. As expected sprite drawing is coming out on top with the HUGE spike inside ZeroBlock which is the bit that draws sprites when they are over an empty character. This is the fastest path through the sprite system.

The 2 big peaks to the right are the screen copy used for scrolling (which also wipes the display). There are 2 separate peaks as it's double buffered and it flips from one to the other.

This is pretty good as it means code is taking its time in the areas expected, as is usually the case for 8bit stuff. However, I'll start looking at things like collision and zeropage next. There's something funny happening in zeropage that I'll need to track down; it looks like something is overwriting a couple of bytes of unallocated memory down there.

Oh...The reason I was able to pick this bit out is that I've implemented the ZOOM feature. So now all you do is click and drag over an area of interest and then click ZOOM. I'll need to change the Execute buffer, as when you zoom in to byte level it only counts the OPCODE bytes as being used. This means you get a peak, 0, peak, 0,0 peak etc. as the parameters aren't tagged. All I need is a 256 byte table of how big each opcode is, then fill in each location of an instruction.
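
The fix described there could look something like this Python sketch. It is not Minus4's code: `tag_operands` and `make_length_table` are made-up names, and the length table only fills in the three opcodes used by the example (everything else defaults to 1), whereas a real table would cover all 256.

```python
def make_length_table():
    """256 entry table of instruction sizes; only a few filled in here."""
    lengths = [1] * 256
    lengths[0xA9] = 2    # lda #imm
    lengths[0x8D] = 3    # sta abs
    lengths[0x60] = 1    # rts
    return lengths


def tag_operands(exec_counts, memory, lengths):
    """Copy each opcode's execute count onto its operand bytes."""
    tagged = list(exec_counts)
    for addr, count in enumerate(exec_counts):
        if count == 0:
            continue                      # not an executed opcode byte
        size = lengths[memory[addr]]
        for operand in range(addr + 1, min(addr + size, len(tagged))):
            tagged[operand] = max(tagged[operand], count)
    return tagged


# lda #$01 / sta $d020 / rts, each executed 5 times.
memory = bytes([0xA9, 0x01, 0x8D, 0x20, 0xD0, 0x60])
counts = [5, 0, 5, 0, 0, 5]               # the "peak, 0, peak, 0,0 peak"
profile = tag_operands(counts, memory, make_length_table())
```

After tagging, the gaps between opcodes are filled and the zoomed view shows one solid block. One caveat with this approach: it reads the opcode from memory as it is now, so self-modifying code could be sized wrongly.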

Monday, November 26, 2007

Minus4: More on profiling....

You know, the one thing I really like about machines getting bigger and better is that it lets you be lazy. In Minus4 I could do special functions to hunt for symbols as quickly as possible, arranging it all in a nice binary tree, while at the same time maintaining a 2D linked list so that I can find the symbol closest to an address.

Or.... being lazy and having a big machine....I can create a 256K table of symbol pointers.

I like lazy. Code is easy, and I can get onto what matters.

So I've only a couple of bits left of the Minus4 profiler. One is a ZOOM function; which it does actually do just now but I need a user interface, and the second is a symbol look up so that the addresses make sense to a developer. The blurb above means I can do the symbol lookup REALLY easily, so that will just leave the zoom and scroll bit (to let you view around the window area easily).
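
The lazy table is simple enough to sketch: one entry per address in the 64K space (at 4 bytes a pointer, that's the 256K). The Python below is only an illustration with assumptions called out: the function names are invented, the symbol addresses are made up, and "closest" is taken to mean the nearest symbol at or below the address, which may not be how Minus4 fills its table.

```python
def build_symbol_table(symbols, size=0x10000):
    """One slot per address, holding the nearest symbol at or below it."""
    by_addr = {addr: name for name, addr in symbols.items()}
    table = [None] * size
    current = None
    for addr in range(size):
        if addr in by_addr:
            current = (by_addr[addr], addr)
        table[addr] = current
    return table


def describe(table, addr):
    """Turn an address into 'Symbol', 'Symbol+offset' or a plain hex value."""
    entry = table[addr]
    if entry is None:
        return "$%04X" % addr
    name, base = entry
    return name if addr == base else "%s+%d" % (name, addr - base)


# Hypothetical addresses for two labels mentioned in these posts.
table = build_symbol_table({"BlitLoop": 0x1000, "ZeroBlock": 0x2000})
label = describe(table, 0x1003)
```

Lookup is a single index into the table, with no tree or list walking, which is the whole appeal of being lazy here.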

Then I can get back onto XeO3 and rearrange memory to be more efficient.

Sunday, November 25, 2007

Minus4: Profiling other games!!

Now this gets really cool.... I was busy flicking through other people's demos and games and it's amazing what a basic profiler does. In Elite, you can clearly see where the linedraw (I assume) is called as there's a huge spike in the Execute map, while in Monty on the Run, shown in this new image, you can clearly see that Monty is rotated in Zeropage!!

72 bytes appear to be set aside for the Monty graphic in zero page, and as you run about it goes up and down in time. Really cool.

It also shows up some odd patterns. For example, in one of Luca's demos it looks like he's using ZeroPage oddly, as something is reading through zeropage over time, and the read bar runs up smoothly through memory.

On XeO3, you can clearly see where both the sprites and scroll occur in memory, AND you can tell when it's drawing tiles as there's a noticeable spike every few frames.

The thing a static image can't get over is the fact that because it's in realtime, you see spikes as they happen - and where! Really neat.

Great fun this :)