Tuesday, October 02, 2018

Advanced programming of the ZX Spectrum Next

So while porting Super Crate Box to the NEXT I decided to break with the norm, and use the hardware more fully. While doing so... I realised I'm probably the first to take advantage of this. We all knew about it when the hardware features were being designed, but I don't think anyone has actually used it yet. That being so... I thought I'd write it up so others could also take advantage.

One of the (many) things I loved about the C64 was the 64K of RAM and address space you had. You could fill that machine with your game, banking out ROMs, VIC, character sets, IO ports etc. and use every part of it! This was awesome. I used this when writing Blood Money on the C64, and it was very liberating. :P

When putting the Next together, the team extended the new 8K memory mapping to allow for the same thing - you can now have a full 64k of program space. This means you can now bank RAM into the lower 16k, and even move the screen location. This is incredible, and really helps you make full use of the machine.

So... how did I do it? Actually... it's fairly simple, but it's nice to see it all laid out. Using the new .NEX format file, I simply had to move code around a bit, and have a small "boot" loader somewhere. Here's the memory map I was aiming for...

$0000-$7FFF - Game Code
$8000-$BFFF - Game Data
$C000-$DFFF - Graphics
$E000-$FFFF - ULA/Timex screen

I will use $C000-$FFFF for other things on demand (like hardware sprite graphics etc), but this is the main layout. As you can see, it's a much better layout, meaning you can have so much more code, and not fill the middle of memory with 16K of screens. The 8K banks also allow the paging of graphics AND the ULA/Timex screen.

Using the Timex screen means I can double buffer the ULA screen without using the ZX128 shadow screen, and it's much easier to control. It's also important to realise that the NEXT will ALWAYS use 16K bank 5 (8K banks 10 and 11) for the ULA and Timex screens, even if it's not banked in. This is very cool.

Using SNASM and the new segments I can lay this out easily, but I'll be more generic so you can use another assembler if you want.

First, you need to ORG at $8000 so we can write our BOOT loader. This is incredibly simple:

                di
                NextReg $50,Code_Bank       ; $0000-$1FFF
                NextReg $51,Code_Bank+1     ; $2000-$3FFF
                NextReg $52,Code_Bank+2     ; $4000-$5FFF
                NextReg $53,Code_Bank+3     ; $6000-$7FFF
                rst     $00                 ; jump to $0000, which is now RAM

StackEnd:       ds      127
StackStart:     db      0

Technically, I only need to bank in the first bank, and then I could do the rest in the main code block, but for me this is fine. I've set the code to use 8K banks 12, 13, 14 and 15 - which map to the ZX Spectrum 128 banks 6 and 7. Bank 2 (4,5) is my data, and bank 5 (10,11) is the ULA/Timex screen.
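
As a rough sketch of that "only bank in the first bank" variation: the loader maps just the first 8K, and the code at $0000 maps the rest itself. This assumes the start-up code (the StartCode label in the listing further down) sits inside the first 8K of the code segment, and that the stack is not somewhere in $2000-$7FFF while those slots are being remapped. It's an illustration only, not the layout the game actually uses.

```asm
                ; Boot loader (assembled at $8000): map only the first 8K
                di
                NextReg $50,Code_Bank       ; $0000-$1FFF
                rst     $00                 ; enter the code now at $0000

                ; Later, in the main code block assembled to $0000:
StartCode:      NextReg $51,Code_Bank+1     ; $2000-$3FFF
                NextReg $52,Code_Bank+2     ; $4000-$5FFF
                NextReg $53,Code_Bank+3     ; $6000-$7FFF
                ; ...rest of start up follows
```

The trade-off is a slightly smaller loader against having the mapping split across two places.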

I've yet to use Bank 0 (0,1)... so it's free for something else - perhaps tables I need to bank in on demand. I can decide that later.

So next I need to set bank 6 (8K bank 12) as the bank to assemble into (however your assembler does this), then set the target "ASSEMBLE TO" address to $0000. In SNASM you would do this...

                SEG  CODE_SEG,12:0,$0000    ; create segment (bank 12,offset 0. Assemble to location $0000)
                SEG  CODE_SEG               ; set segment


BootUp          di                          ; RST $00 jumps here....
                jp      StartCode
                nop
                nop
                nop
                nop

                ; RST $08
                ret
                nop
                nop
                nop
                nop
                nop
                nop
                nop

                ; RST $10
                ret
                nop
                nop
                nop
                nop
                nop
                nop
                nop

                ; RST $18
                ret
                nop
                nop
                nop
                nop
                nop
                nop
                nop

                ; RST $20
                ret
                nop
                nop
                nop
                nop
                nop
                nop
                nop

                ; RST $28
                ret
                nop
                nop
                nop
                nop
                nop
                nop
                nop

                ; RST $30
                ret
                nop
                nop
                nop
                nop
                nop
                nop
                nop

                ; RST $38
IRQ             ei
                reti
                nop
                nop
                nop
                nop
                nop
                nop

StartCode:
                ; Main game start up....

One interesting side effect of this is that you no longer have to use IM 2 for interrupts. With RAM mapped at $0000 you have direct access to the hardware vector (IM 1 jumps to $0038), which is handy.
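
For anyone who has only ever used IM 2 on a Spectrum, here is a hedged sketch of what an IM 1 setup could look like with this layout. FrameCount is a made-up byte variable assumed to be defined elsewhere in RAM, and putting the stack at StackStart (the block after the boot loader) is an assumption too. The handler is longer than 8 bytes, which is fine because $38 is the last RST slot.

```asm
                ; RST $38 slot: IM 1 interrupt handler
IRQ             push    af
                ld      a,(FrameCount)      ; hypothetical frame counter
                inc     a
                ld      (FrameCount),a
                pop     af
                ei
                reti

StartCode:      ld      sp,StackStart       ; assumed stack location
                im      1                   ; maskable interrupts now vector to $0038
                ei
                ; ...rest of start up follows
```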

Now when I (say) clear the screen, no matter which it is - Timex or ULA - I just bank the correct bank (10 or 11) into $E000, and then clear - like this.

; ************************************************************************
; Clear the ULA Screen using DMA
; ************************************************************************
ClearULAScreen:
                ld      a,(ULABank)         ; 8K bank of the buffer we're drawing to
                NextReg $57,a               ; bank screen into $E000

                ld      hl,$e000            ; start of the pixel data
                ld      de,$e001
                ld      (hl),0              ; fill value
                ld      bc,6143             ; fill the rest of the 6144 pixel bytes
                jp      DMACopy


You'll notice I use DMA instead of LDIR, as it's much faster - but that's just an aside. As you can see, I bank in the current buffer, then always target $E000, it's as simple as that.
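
For comparison, if you don't have a DMA routine to hand, the same fill can be done with a plain LDIR. This sketch assumes DMACopy takes its arguments in the usual LDIR registers (HL = source, DE = destination, BC = length), which is what the register setup above suggests. ClearULAScreenLDIR is just a name invented here.

```asm
ClearULAScreenLDIR:
                ld      a,(ULABank)
                NextReg $57,a               ; bank screen into $E000
                ld      hl,$e000            ; first byte of the pixel data
                ld      de,$e001
                ld      (hl),0              ; seed the fill value
                ld      bc,6143             ; remaining 6143 of the 6144 pixel bytes
                ldir                        ; each byte is copied forward, spreading the 0
                ret
```

Like the DMA version, this only wipes the pixel data and leaves the 768 attribute bytes alone.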

The only real gotcha is that the new opcode pixelad returns an address based on $4000, so we need to OR $A0 into the high byte to get the proper $E000-based address. Fortunately, pixeldn works with any base address - which is cool.
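
A small sketch of that fix-up: pixelad takes Y in D and X in E and returns the $4000-based address in HL, so only H needs adjusting. The routine name is made up for illustration, and note that it trashes A.

```asm
; In:  D = Y (0-191), E = X (0-255)
; Out: HL = address of the pixel byte in the screen banked at $E000
GetPixelAddr:   pixelad                     ; HL = $4000-$57FF
                ld      a,h
                or      $A0                 ; high byte $40-$57 becomes $E0-$F7
                ld      h,a
                ret
```

The OR works because bits 7 and 5 of the high byte are never set for an address in $4000-$57FF, so setting them simply moves the address up by $A000.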

Lastly...the main loop and double buffering is pretty simple, and looks like this...

; *****************************************************************************************************************************
; Flip Buffers
; *****************************************************************************************************************************
FlipBuffers:
                ; Flip ULA/Timex screen (double buffer ULA screen)
                ld      a,(ULABank)             ; Get screen to display this frame
                cp      10
                jr      z,@DisplayTimex

                ld      b,10                    ; set target screen to ULA
                ld      a,1                     ; set CURRENT screen to TIMEX
                jp      @DisplayULA

@DisplayTimex:  ld      b,11                    ; set target screen to TIMEX
                xor     a                       ; set CURRENT screen to ULA
                
@DisplayULA:    out     ($ff),a                 ; Select Timex/ULA screen
                ld      a,b                     ; get bank to render to next frame
                ld      (ULABank),a             ; store...

                jp      ClearULAScreen          ; wipe ULA/Timex screen 


A little aside... using RST $?? for common functions is very handy, especially as they are smaller than normal calls (1 byte instead of 3), and if it's a tiny function, it's also faster than a CALL by 6 T-states (11 versus 17). If interrupts are disabled, you can also use RST $38 for a larger function, as it's the last vector and isn't limited to 8 bytes, giving you much quicker CALLs to a common function.
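
To make that concrete, here is a hedged sketch of a tiny helper living in the RST $08 slot. The helper itself (fetch a byte and advance HL) and the GetByte label are invented for illustration; the byte counts and timings are the standard Z80 figures.

```asm
                ; RST $08 slot (8 bytes): A = (HL), HL = HL + 1
GetByte         ld      a,(hl)
                inc     hl
                ret
                nop                         ; pad the slot out to 8 bytes
                nop
                nop
                nop
                nop

                ; Two ways of reaching the same helper from elsewhere:
                rst     $08                 ; 1 byte, 11 T-states
                call    GetByte             ; 3 bytes, 17 T-states
```

For something called hundreds of times a frame, the saving in both bytes and T-states adds up.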

So that's the basics. As you can see, it's not that complicated, but the new memory layout is awesome. You get lots of code without having to bank in overlays, and being able to move the ULA screen AND have it double buffered is just brilliant.