Sunday, June 04, 2017

ZX Spectrum Next - Bitmap example disassembly

I've been having a poke around in the ZX Spectrum Next Bitmap example. It's mostly clear, but one or two things are... odd.
This is what I have so far.

; Disassembly of the file "bitmaps\BMPLOAD"
; 
; BMP Port $123b - bit 2, 1 = bitmap ON, 0 = off
;
; On entry, HL = arguments
;

2000 226221    ld      (HL_Save),hl     ; Store HL
2003 216421    ld      hl,UploadCode    ; src  = UploadCode ($2164)
2006 110060    ld      de,6000h         ; dest = $6000
2009 011f00    ld      bc,001fh         ; size = 31 bytes
200c edb0      ldir    

200e 2a6221    ld      hl,(HL_Save)     ; get HL back
2011 7c        ld      a,h
2012 b5        or      l                ; is hl 0?
2013 2008      jr      nz,201dh         ; if we have an argument, carry on

2015 213121    ld      hl,2131h         ; hl = usage message (Message, $2131)
2018 cd1d21    call    PrintText        ; print the usage message
201b 1823      jr      Exit             ; exit

; Copy the filename argument, then load the file
CopyFilename:
201d 112421    ld      de,FileName      ; "filename.ext" text - space to store filename?
2020 060c      ld      b,0ch            ; b = $0c (max length of allowed filename - no path it seems)

; Copy, and validate characters in filename
2022 7e        ld      a,(hl)           ; first byte of filename
2023 fe3a      cp      3ah              ; is it a ":"?  End of filename
2025 2810      jr      z,EndFilename    ; if so end of filename  
2027 b7        or      a                ; 0?
2028 280d      jr      z,EndFilename    ; if so end of filename
202a fe0d      cp      0dh              ; carriage return? End of filename
202c 2809      jr      z,EndFilename    ; if so end of filename
202e cb7f      bit     7,a              ; over 127?
2030 2005      jr      nz,EndFilename   ; if so end of filename
2032 12        ld      (de),a           ; copy over to filename cache
2033 23        inc     hl               ; next src letter
2034 13        inc     de               ; next dest letter
2035 10eb      djnz    2022h            ; copy all
EndFilename:
2037 af        xor     a                ; Mark end of filename
2038 12        ld      (de),a           ; store filename
2039 dd212421  ld      ix,FileName      ; get filename base address
203d cd6220    call    LoadFile

Exit:
2040 af        xor     a                ; no error
2041 c9        ret                      ; exit


; esxDOS detect if unit is ready
DetectUnit:
2042 af        xor     a                ; Detect if unit is ready
2043 cf        rst     08h              ; call esxDOS
2044 89        db      $89              ; M_GETSETDRV

; Open
; IX= Filename (ASCIIZ)
; B = FA_READ       ($01)
;     FA_APPEND     ($06)
;     FA_OVERWRITE  ($0C)
;
OpenFile:
2045 dde5      push    ix               ; ix = filename
2047 e1        pop     hl               ; hl = filename (esxDOS dot commands pass the pointer in HL)
2048 0601      ld      b,01h            ; b = FA_READ
204a 3e2a      ld      a,2ah            ; a = '*' ($2a) = use the default drive
204c cf        rst     08h              ; call esxDOS
204d 9a        db      $9a              ; F_Open
204e 325620    ld      (2056h),a        ; Store file handle  (self modify read from file code)
2051 c9        ret                      ;


; esxDOS command - Read from file - command $9d
; IX = address to load into
; BC = number of bytes to load
; A  = file handle
ReadBytes:
2052 dde5      push    ix               ; IX = where to store data
2054 e1        pop     hl               ; get address to load into
2055 3e00      ld      a,00h            ; a = file handle ($00 is self-modified by OpenFile)
2057 cf        rst     08h              ; call esxDOS
2058 9d        db      $9d              ; F_Read
2059 c9        ret     

; esxDOS command - Close File - command $9b
; A = file handle
CloseFile:
205a 3a5620    ld      a,(2056h)         ; Get open file handle 
205d b7        or      a                 ; is it 0? (did it open)
205e c8        ret     z                 ; if file handle is 0, return
205f cf        rst     08h               ; Call esxDOS
2060 9b        db      $9b               ; F_Close
2061 c9        ret     

LoadFile:
2062 dde5      push    ix                ; remember filename
2064 cd4220    call    DetectUnit        ; check the drive is ready - note DetectUnit falls straight through into OpenFile
2067 dde1      pop     ix                ; get filename back
2069 cd4520    call    OpenFile          ; OpenFile again (the fall-through above already opened it once)
206c dd21e720  ld      ix,BMPHeader      ; Read the BMP file header
2070 013600    ld      bc,0036h          ; read header ($36 bytes)
2073 cd5220    call    ReadBytes
2076 dd218321  ld      ix,BitMap         ; read the palette into the buffer at $2183
207a 010004    ld      bc,0400h          ; 256 entries x 4 bytes (BGRA) = 1024 bytes
207d cd5220    call    ReadBytes

;
; Convert the 24bit palette into a simple RRRGGGBB format
;
ConvertBMP:
2080 218321    ld      hl,BitMap         ; Get buffer address ($2183)
2083 11003f    ld      de,3f00h          ; Dest address of converted palette
2086 0600      ld      b,00h             ; b = 0 -> 256 iterations (one per palette entry)

ConvertionLoop:
2088 7e        ld      a,(hl)            ; get BLUE byte
2089 23        inc     hl                ; move on to green
208a c620      add     a,20h             ; brighten up a bit? (blue is always pretty dark??)
208c 3002      jr      nc,SkipSatB       ; no overflow? skip the saturate
208e 3eff      ld      a,0ffh            ; if overflow, then saturate to $FF

SkipSatB:
2090 1f        rra                       ; get top 2 bits only  RRRGGGBB
2091 1f        rra     
2092 1f        rra     
2093 1f        rra     
2094 1f        rra     
2095 1f        rra     
2096 e603      and     03h               ; and store them at the bottom
2098 4f        ld      c,a               ; c holds current byte

2099 7e        ld      a,(hl)            ; get GREEN
209a 23        inc     hl                ; move onto red
209b c610      add     a,10h             ; brighten up a bit as well
209d 3002      jr      nc,SkipSatG       ; if no overflow, skip saturate
209f 3eff      ld      a,0ffh

SkipSatG:
20a1 1f        rra                       ; get 3 bits of green into right place
20a2 1f        rra     
20a3 1f        rra     
20a4 e61c      and     1ch               ; mask off remaining lower bits
20a6 b1        or      c                 ; merge with output byte
20a7 4f        ld      c,a               ; store into output

20a8 7e        ld      a,(hl)            ; get RED
20a9 23        inc     hl                ; move to next byte of colour (assuming alpha)
20aa c610      add     a,10h             ; brighten up a bit
20ac 3002      jr      nc,SkipSatR       ; no overflow?
20ae 3eff      ld      a,0ffh            ; Saturate to $FF

SkipSatR:
20b0 e6e0      and     0e0h              ; keep top 3 bits
20b2 b1        or      c                 ; merge with output pixel
20b3 12        ld      (de),a            ; store converted pixel
20b4 13        inc     de                ; move to next output pixel
20b5 23        inc     hl                ; move to next BGRA pixel
20b6 10d0      djnz    ConvertionLoop    ; loop over all 256 palette entries (b=0 -> 256)

20b8 06c0      ld      b,0c0h            ; b = $c0 (192 x 256-byte rows = full 256x192 bitmap)

ConvertUploadLoop:
20ba c5        push    bc
20bb dd21003e  ld      ix,3e00h          ; $3e00 = Destination address
20bf 010001    ld      bc,0100h          ; read 256 bytes of data
20c2 cd5220    call    ReadBytes         ; Read from file

;
; convert 256 value palette index into actual RGB byte pixel using palette lookup
;
20c5 2e00      ld      l,00h             ; l = xx (loop counter)

CopyLoop:
20c7 263e      ld      h,3eh             ; hl = $3exx
20c9 5e        ld      e,(hl)            ; e = Get palette index
20ca 163f      ld      d,3fh             ; d = $3f palette base address - 256 byte aligned
20cc 1a        ld      a,(de)            ; a = palette value (24bit converted downto 8bit)
20cd 265b      ld      h,5bh             ; hl = $5bxx converted 256 byte buffer
20cf 77        ld      (hl),a            ; ($5bxx) = converted colour pixel
20d0 263e      ld      h,3eh             ; hl = $3e00
20d2 2c        inc     l                 ; do 256 bytes of this....
20d3 20f2      jr      nz,CopyLoop

20d5 c1        pop     bc                ; bc = $c000
20d6 c5        push    bc
20d7 cd0060    call    6000h             ; block transfer 256 bytes of the bitmap
20da c1        pop     bc
20db 10dd      djnz    ConvertUploadLoop
20dd 013b12    ld      bc,123bh          ; Bitmap port
20e0 3e02      ld      a,02h             ; 2 = enable
20e2 ed79      out     (c),a             ; switch bitmap layer 2 on
20e4 c35a20    jp      205ah             ; jump to CloseFile (close the file, then return)

BMPHeader:     ds   54                   ; $20e7 to $211c

PrintText:
211d 7e        ld      a,(hl)            ; get character
211e 23        inc     hl                ; move to next one
211f b7        or      a                 ; is this 0? 
2120 c8        ret     z                 ; if so... end of string
2121 d7        rst     10h               ; outchr()
2122 18f9      jr      211dh             ; print all characters

;
; Looks like data of some kinds
;
FileName: db  "filename.ext",0
Message:  db  ".picload  to load image to background",$d,$00
HL_Save:  dw  0

;
; Copied up to $6000
;
UploadCode:
2164 05        dec     b                ; b = remaining page count -> page index ($bf down to $00)
2165 78        ld      a,b              ; get 256 byte page into a
2166 e63f      and     3fh              ; low 6 bits = offset within the 16K window
2168 57        ld      d,a              ;
2169 1e00      ld      e,00h            ; de = dest address inside the paged-in window
216b 78        ld      a,b
216c e6c0      and     0c0h             ; top 2 bits select the Layer 2 16K bank
216e f601      or      01h              ; bit 0 = enable Layer 2 write paging
2170 013b12    ld      bc,123bh         ; bitmap register
2173 ed79      out     (c),a            ; page the Layer 2 bank in for writing
2175 21005b    ld      hl,5b00h         ; $5b00 src image chunk
2178 010001    ld      bc,0100h         ; 256 byte copy
217b edb0      ldir                     ; copy up
217d 013b12    ld      bc,123bh         ; bitmap register
2180 ed69      out     (c),l            ; l = 0 after the ldir - disable Layer 2 write paging
2182 c9        ret     

BitMap:
2183 00        nop                      ; file is loaded into here....
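For anyone wanting to reuse the conversion outside the Next, the palette loop above boils down to this C sketch. It's my translation of the assembly - the +$20/+$10 brightness bumps, the saturation and the bit packing all mirror the code; the function name is mine. Note the argument order matches the BMP palette's on-disk BGR(A) order:

```c
#include <stdint.h>

/* Convert one BGR palette entry (as stored in a BMP's BGRA palette)
   to the Next's RRRGGGBB byte, mirroring the assembly above:
   each channel gets a small brightness bump, saturates at $FF,
   then contributes its top bits. */
uint8_t bgr_to_rrrgggbb(uint8_t b, uint8_t g, uint8_t r)
{
    unsigned bb = b + 0x20; if (bb > 0xFF) bb = 0xFF;   /* add $20, saturate */
    unsigned gg = g + 0x10; if (gg > 0xFF) gg = 0xFF;   /* add $10, saturate */
    unsigned rr = r + 0x10; if (rr > 0xFF) rr = 0xFF;

    return (uint8_t)((rr & 0xE0)           /* top 3 bits of red   -> RRR..... */
                   | ((gg >> 3) & 0x1C)    /* top 3 bits of green -> ...GGG.. */
                   | ((bb >> 6) & 0x03));  /* top 2 bits of blue  -> ......BB */
}
```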

Friday, April 28, 2017

Hacking ZX Spectrum Manic Miner

So I was playing around with the original Manic Miner game, using a ZX Spectrum .SNA snap shot file (so an actual memory image), and while extracting levels etc. I also extracted graphics - which were obviously included as well.

For the record, I didn't reverse engineer all this, I just used one of the many pages out there describing it.  ( http://www.icemark.com/dataformats/manic/mmformat.htm )

But it also occurred to me that it might be nice to have these graphics readily available for folk to play with, it would certainly have been handy when I was doing my remake a few months ago! So here's what I've extracted....

So, this is the baddies that come with each level, the items (1 per level), the 8 tiles (left to right per level) in colour, and in B/W, and then Miner Willy himself. There are several "special" baddies, and if I extract them I'll add them here as well.

I might see if I can go and grab the Z Spectrum font from the ROM, because I couldn't find it as a simple strip at the original size.

I'll just mention the sprites move through the graphic, so the animation frame and X position is usually....

frame = 3 - ((x >> 1) & 3);
draw_xpos = x & 0xfffffff8;
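As a quick sanity check, those two formulas drop straight into C (the helper names are mine):

```c
#include <stdint.h>

/* Manic Miner sprites are pre-shifted through the graphic data:
   the animation frame comes from bits 1-2 of x, and the draw
   position is x rounded down to the previous 8-pixel cell. */
static int frame_for_x(uint32_t x)  { return 3 - ((x >> 1) & 3); }
static uint32_t draw_x(uint32_t x)  { return x & 0xfffffff8u; }
```

So x = 0, 2, 4, 6 give frames 3, 2, 1, 0, and the sprite is always drawn at the previous character-cell boundary.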

ZX Spectrum 8x8 pixel Font

Sunday, September 25, 2016

ZX Spectrum colour clash on modern hardware

There are always some games that go for a very retro look, usually in the style of a Spectrum, since you still get "hi-res" from it, rather than the C64 or something where pixels were doubled and could look a bit nasty, even for today's retro fan club. However, as retro as they look, they never really capture the full spirit of the old machine because of one thing - colour clash.



The ZX Spectrum's screen consisted of a hires (for its day) bitmap of 256x192 pixels, and a colour (or attribute) map. The pixels in the bitmap would get coloured every 8x8 pixels depending on the colour in the attribute map.
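For reference, finding the attribute for any pixel is simple: the attribute map sits at $5800, one byte per 8x8 cell, 32 cells per row (this is the standard Spectrum memory layout, not something specific to this post):

```c
#include <stdint.h>

/* Address of the attribute byte covering pixel (x, y) on a
   256x192 Spectrum screen: the attribute map starts at $5800
   and holds one byte per 8x8 character cell, 32 cells per row. */
uint16_t attr_addr(int x, int y)
{
    return (uint16_t)(0x5800 + (y >> 3) * 32 + (x >> 3));
}
```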

So, roll onto modern day. We no longer have separate pixels and attributes so simulating this old mode can be a little tricky.

Roll on further, and I suddenly realised a really simple way of achieving this using the hardware we all have in our machines already. So let's say we simulate the way the Spectrum drew things: a bitmap screen and a colour map. First, let's draw everything in "white" pixels only - which is much like the Spectrum would be. It would look something like this.



If we then drew the colour map, we'd just end up with big blocks of colour going over the top of the screen - like this

And this is where modern hardware struggles. However... if we could somehow use the first image, as a "mask" for the second, then we could get rid of the pixels from the blocky colour image and get the image we wanted. We could generate both these images and then send them through a shader and do - well, whatever we want to, but there's actually a simpler way.

Let me introduce you to destination alpha. DestAlpha is the alpha channel on the screen surface (or render target) that you render to all the time anyway. Whenever you draw a sprite, the RGB goes onto the surface from the texture/sprite you're drawing, and so does the alpha. When you clear the screen to black with no alpha (a UINT32 of 0x00000000) and then draw a sprite onto it, you're effectively making a copy of the sprite. When you start drawing lots of other sprites it all gets a bit of a mess and is basically useless, but if you control it... it can be very handy.

There's also another string for our bow; the colour channel write mask. This lets you switch off and on the different channels of the screen (red,green,blue or alpha). Now it gets very interesting.....

Let's say that black and white image above was also represented via the alpha channel.
The image shown here has white pixels where there is data, and transparent everywhere else. This is a "mask" of the data we want to draw, and if we put the same data into the alpha channel we can use "destalpha" to actually mask everything.

So, the first thing we need to do is draw the background - whatever we want, however we want. Next we need to clear the alpha channel only, leaving the colour screen intact. We do this by setting the colour channel mask so that only the alpha channel is being written to; we can then render a filled rectangle with an alpha of 0. Once the alpha channel is cleared, we set the colour mask back to normal and render everything else.

Once this is done, the alpha channel will look something like the transparent image above - a lovely alpha mask. Now if only we could use it like any normal texture's alpha... Step in dest alpha. Normally we use SrcAlpha, Inv_SrcAlpha as the blend mode, which takes the alpha value from the source, i.e. the texture; but now we'll use DestAlpha, Inv_DestAlpha. This uses the screen's alpha channel as the alpha for the blend.

Now, when we render the solid rectangles of colour that make up the colour map, the underlying pixels that created the alpha channel will only let colour through where we drew a pixel in the first place.


So, by using the transparent image as the alpha channel of the blocky image, we're left with the image above - which is just like the spectrum.

Now comes the good bit. When we render objects - players, baddies, pickups etc. - all we need to do is change the colour block in the blocky image (which is rendered last), and this gives us the same blocky, colour clash feel that the Spectrum had.

Here's the general order to rendering things....


Render "paper" colour map  (32x24 of 8x8 blocks of colour)

Set colour mask to alpha only

Disable blending and alpha-test
Draw giant flat rectangle of 0 alpha, and 0 colour
Reset colour mask to full ARGB
Enable alpha blend and alpha test

Draw everything

Set colour write mask to RGB - disabling ALPHA

Enable alpha blend and alpha test
Set blend mode to DestAlpha, Inv_DestAlpha

Draw all colour blocks (32x24 of 8x8 blocks of solid colour)

Set blend mode back to SrcAlpha, Inv_SrcAlpha
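The list above can be modelled per pixel in plain C, with no graphics API at all - a toy software model of what DestAlpha/Inv_DestAlpha is doing, just to show why the mask works (struct and helper names are mine):

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b, a; } Pixel;

/* Colour mask set to alpha only: clear dest alpha, then let sprite
   pixels fill it in. */
static void clear_alpha(Pixel *p) { p->a = 0; }
static void draw_mask_bit(Pixel *p, int sprite_pixel_set)
{
    if (sprite_pixel_set) p->a = 255;   /* sprite pixels fill dest alpha */
}

/* Final step: blend a solid attribute colour with
   DestAlpha / Inv_DestAlpha - the screen's own alpha decides
   how much of the attribute colour gets through. */
static void blend_attr(Pixel *p, uint8_t r, uint8_t g, uint8_t b)
{
    p->r = (uint8_t)((r * p->a + p->r * (255 - p->a)) / 255);
    p->g = (uint8_t)((g * p->a + p->g * (255 - p->a)) / 255);
    p->b = (uint8_t)((b * p->a + p->b * (255 - p->a)) / 255);
}
```

An untouched pixel keeps alpha 0, so the attribute colour is rejected and the background shows through; a sprite pixel has alpha 255 and takes the attribute colour completely - which is exactly the 8x8 colour-clash behaviour.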


Once you've done all this... you'll be using the "pixels" from your sprites, and the colour from the attribute map. What's better is every bit of hardware out there can do this - it's all standard blend mode stuff, even on older hardware. If you've done all this, you should get something like the video below. Make sure ALL your sprites' background graphics are transparent, and don't have solid pixel colours (like black), as this will affect the mask.

One final point... it should be noted that the "colour" pixels written at the same time as the mask aren't really needed, but if you colour them as they are meant to be (not just white), then you can easily switch the effect on and off without affecting the game. This gives the end user the option to have colour clash or not, which is a nice side effect.




Friday, January 01, 2016

Why create a GameMaker export to the Raspberry Pi?

Okay, these are very much personal opinions...
So I've been asked a couple of times: why bother? It's a tiny machine with (probably) zero chance of making money from it - so why?


First, the Raspberry Pi is an amazing thing, Eben Upton and his various partners have done a phenomenal job getting it out in the first place, never mind the subsequent work of getting the Pi Zero - which is beyond amazing!

What I love about the Pi more than anything else, is that you no longer have to decide to take out a second mortgage on your home to give your kid a computer in their room! What's more, you no longer have to worry about them breaking it. At $25 you can just get another one. With the Pi Zero, this is an even simpler decision. Having a computer in my bedroom when I was growing up was life changing. Without that, I'd never be where I am now.

Of course.... there were other factors that helped. First - content. Without games to play, I'd have grown bored very quickly, and my ZX81 would have ended up in a drawer. What you need to inspire development is inspirational content. You want to play games, use programs and hardware and think - "That is SO cool! I want to do that!"  This is what drove me. We used to play the simple games available, type in programs from magazines, and spend hours trying to do stuff ourselves. 

Put simply, this is what I want to help achieve. With GameMaker, we have access to users who can provide these inspirational games, games that you want to play over and over, games that make you want to try and create stuff yourself.  Sure, there are limits. This isn't a Quad core i7 with 16 Gigs of ram and enough HD space to store most of the internet on. But....games don't have to be full 3D ultra-realistic to be good. Vlambeer have some amazing games, Locomalito has made some astounding games - ALL FREE!  All they need (in theory) is a port of the GameMaker runner to help make it happen.


So... that's my reason number one. It's how I started, and if I can help others do the same, I'm all for that.


Second - Education. Not so much for game creation itself, but for Electronics. Currently most folk seem to use Python to do this, and to be frank - python sucks. Even with simple GPIO commands I've been having great fun with it. Remote development is ideal for this kind of thing, because when things go wrong and the Pi crashes (as it inevitably will), you can simply reset while your entire dev environment is safe and sound on a separate machine.  At some point I'll get the remote debugger working as well (It almost works now), and that means you'll be able to step through your code and development will be even easier.

On top of this there is an emerging homebrew arcade scene. Folk like Locomalito take their games and stick them in PCs inside arcade cabinets - much like a MAME cabinet - and then run their games like in the old days, complete with arcade buttons and joysticks. These days they appear to be moving more towards using the Raspberry Pi: it's cheap, has lots of easy-to-use pins for interfacing, and if it blows up - meh, they can buy another one. The runner port will allow games to be run on the machine, while the GPIO commands I've added will help them interface with buttons and joysticks easily.


Lastly... a word about the Pi Zero. I can't begin to say how impressed I am that they made this happen. It's such an incredible breakthrough. They even gave one away on the cover of a magazine! I'll let that sink in...

They gave away a whole computer on the cover of a magazine.

Wow..... It has the same GPIO pins as the Pi, and while only a single core, it is a 1GHz core. This means you could make a case for it with a 3D printer, get some very cheap SD cards, and then sell your game as a plug-in console - probably for $20 or so. That astounds me. Even without the case etc. you could sell your game - complete with computer - for less than $10. You can buy 4GB micro SD cards in packs of 50 at £1.36 each; add that to the price of the Pi ($5) and you're still going to make a profit at $10. Wow.

Before the Pi came out, we tried to get GameMaker 7 actually running ON the machine. We almost got there, but ran out of time. We were doing it for free because we believed in what they were doing, but commercial pressures reasserted themselves and we had to move on. Long term, I'd love to see a GameMaker IDE of some kind running ON the Raspberry Pi. It would be an amazing thing to behold, and would make learning so much more fun.

..........one day perhaps.....one day.





Thursday, December 31, 2015

Hooking up an SD card to the RaspberryPi

So I've not done a personal blog post for fricken ages! Mainly because any post I would have done would have been a GameMaker one, so they've been on the YoYo Games Techblog instead.

While porting the GameMaker: Studio runner to the Raspberry Pi, I thought it would be very cool to allow access to the GPIO (general purpose input/output) pins. This would let you use GameMaker to do some cool electrical experiments without A) having to resort to using Python (yuck), and B) it would allow remote development - meaning you don't have to work on the actual Raspberry Pi itself - and possibly even remote debugging using the GameMaker debugger. All of which would be pretty cool.

I've already hooked up 2 different LCD screens, so I wanted to do something actually useful. Having a secondary SD card that isn't controlled by the OS could be pretty useful - especially if you want to somehow make a console or something. You could boot into Linux on the OS card, but then allow games to be plugged in via this dedicated card - well, that's the theory.

I've done SD card comms before with a Commodore 64, but it was some time ago, so I was almost starting fresh. Starting any new project is always annoying; you're never sure if a problem is a software bug or an electrical one. It's doubly tricky for a noob like me, as I've only just toyed with electronics - it's mostly guesswork. Still, that's never stopped me before! I had a couple of false starts, but finally got the connections right; the final layout is shown here...


Because the Raspberry Pi's GPIO is 3.3v, I was able to just connect the pins directly. If it were different (5v for example) then I'd have had to alter the voltage before it hit the card. But you don't need that with the Pi, which makes life really simple - although I have been reminded that I should really use a capacitor to smooth things out, but this does work for just hacking around.

Once connected, the real fun of trying to talk to the damn thing starts! SD cards have a very cool 1-bit SPI interface where, basically, you set a bit and toggle a line up and down and the card accepts that bit. Send eight of these and it has a byte - and so on. It also sends data in the same way: just toggle lines up and down and accept bits at your leisure!

So first, let's initialise the ports/data lines, and then we can get to throwing some data at it. Below is the code (written in GameMaker: Studio, obviously) that sets up the ports and line direction. It's worth mentioning that the Pi can set lines to input or output, so you can see below that I set up most to be output, and the DataIn line to be input.


The CS is the chip select line. Whenever you want the card to accept your data, you set this line low (to 0).
The CLK line is the line we toggle Low/Hi to acknowledge that a bit has been sent or received.
DataIn/DataOut are the lines for the actual bit data we're reading/writing.

All in all, it's pretty simple. The Pi does have hardware SPI lines, but I'm not using these yet as I've yet to add support to GameMaker for them.

Once we've set these up, we need to start initialising the SD card - or MultiMediaCard interface, as MMC used this protocol first. Before getting going, we need a bit and byte sending routine. This is the lowest level function we'll use, and it'll handle all the bit flipping we require.

The function to send a bit is very simple. We set the bit to 0 or 1, set the CLK line LOW (0), and wait for a bit - how long depends on the speed of your CPU, but on the Pi it's a small (or non-existent) delay. We then read a bit in. We do this because it's easier to have a single function that reads and writes, rather than having to worry about the state of the DataOut line while trying to use the DataIn line; it just makes life simpler. We then take the CLK line HIGH again (1), and pause again. After that... we're done; just return the bit we received.


/// SendBit(bit)
SetDataOut(argument0);        // put the bit onto the DataOut line
SetCLK(0);                    // take the clock low
Delay(_Delay);                // small settle delay (CPU speed dependent)
var b = gpio_get(DataIn);     // sample the bit the card is driving back
SetCLK(1);                    // take the clock high again
Delay(_Delay);
return b;                     // return the received bit

With this setup, we need to be able to send a command, and this is the sequence you need (image from http://elm-chan.org/docs/mmc/mmc_e.html ):


Command/Response sequence

You can see from the image above that we basically send bytes of info to the card, using the SendBit() function above for each bit in the byte. The send command function is pretty simple once you know these two facts.


Once you have the Write8() function (which is just a loop of 8 around the SendBit() function) and the SendCMD() function, we really just need to use them to initialise the SD card. The core of the initialisation centres around sending CMD0 and CMD1, but before doing this you have to pulse the clock line a lot to put the card into SPI mode. This means sending "more than" 74 clock pulses while the CS and DI lines are HIGH. All commands should return 0 if they are accepted, or an error code if not.
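As a sketch of what a SendCMD() ends up clocking out, the standard 6-byte SPI command frame looks like this (the helper name is mine; $95 is the fixed, valid CRC for CMD0, and after that a dummy $FF does the job, since its bottom bit is the mandatory stop bit):

```c
#include <stdint.h>

/* Build the 6-byte SPI command frame for an SD/MMC command. */
void build_cmd_frame(uint8_t out[6], uint8_t cmd, uint32_t arg)
{
    out[0] = (uint8_t)(0x40 | cmd);      /* start bit 0, transmit bit 1, command index */
    out[1] = (uint8_t)(arg >> 24);       /* 32-bit argument, MSB first */
    out[2] = (uint8_t)(arg >> 16);
    out[3] = (uint8_t)(arg >> 8);
    out[4] = (uint8_t)(arg);
    out[5] = (cmd == 0) ? 0x95 : 0xFF;   /* real CRC7 for CMD0, dummy after that */
}
```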


The InitCMDS() script just sets up the command values - CMD0=$40, CMD1=$41 and so on - while InitFAT() sets up stuff for reading the actual disk later...

After we've done this, the SD card is ready to send us sector data! CMD17 ($51) is the command that returns sectors; it takes the form of <$51,32Bit_Address,$ff>. It's worth knowing that after CMD0 you don't need to worry about the CRC for the most part - just send $ff, as the last bit needs to be 1 (see the SPI diagram above).

After sending the CMD17 command, we wait for the card to return $FE - basically sitting in a loop while it's sending $FF. If we get a value back that isn't $FF or $FE, then it's an error code. Once we DO have $FE we can sit in a loop reading in 512 bytes of data, followed by 2 bytes of CRC (which we'll ignore).
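That token-then-data sequence is easy to express over a captured byte stream - a C sketch of just the parsing logic, not the bit-banged I/O itself (the function name is mine):

```c
#include <stdint.h>

/* Parse one CMD17 response from a stream of bytes read off the SPI bus:
   skip the $FF busy bytes, expect the $FE data token, copy 512 data
   bytes, and leave the 2 CRC bytes unchecked. Returns 0 on success,
   the card's (non-zero) error token, or -1 if the stream is truncated. */
int parse_sector_response(const uint8_t *spi, int len, uint8_t sector[512])
{
    int i = 0;
    while (i < len && spi[i] == 0xFF) i++;    /* card idles at $FF while busy */
    if (i >= len) return -1;
    if (spi[i] != 0xFE) return spi[i];        /* anything else is an error token */
    i++;
    if (i + 512 + 2 > len) return -1;         /* need data plus 2 CRC bytes */
    for (int j = 0; j < 512; j++)
        sector[j] = spi[i + j];               /* the sector payload itself */
    return 0;
}
```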


And that's it! This should start getting you real information from the SD card. After this you can just Google FAT16 or FAT32 (which is easier I think), and start to decode the disk. For the sake of completeness I'll include my (inlined and unrolled) byte b = Write8( _byte ) function....
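For reference, a Write8() built on SendBit() is just the eight bits of the byte pushed out MSB first while collecting the reply bits. Here it's modelled in C over the raw wire bits rather than real GPIO - my sketch of the idea, not the original script:

```c
#include <stdint.h>

/* One SPI byte exchange, modelled on the SendBit() script above:
   bit 7 goes out first; the reply bit sampled on each clock is
   shifted into the result. 'wire_in' holds the 8 bits the card
   drives back (MSB first); 'wire_out' receives the 8 bits we sent. */
uint8_t write8(uint8_t out, const int wire_in[8], int wire_out[8])
{
    uint8_t in = 0;
    for (int i = 0; i < 8; i++) {
        wire_out[i] = (out >> (7 - i)) & 1;          /* MSB first, like SendBit */
        in = (uint8_t)((in << 1) | (wire_in[i] & 1));/* assemble the reply byte */
    }
    return in;
}
```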



As you can see, once you have the SD card connected to the Pi, there's not actually that much involved in getting data out of it. This makes SD cards ideal for storage on electrical projects - after all, no one says you have to store data in a DOS format, you could just read/write whole sectors and store everything in a custom format. This is what we used to do in the old days, as "standard" formats are bloated for game purposes.

Some point soon, I hope we can release this GameMaker: Studio Raspberry Pi export; then you can have as much fun as I've had with this stuff. One thing I will say is that remote development - i.e. not working on the actual machine - is brilliant for this kind of thing, because when you make a mistake and crash the Pi (and I do - a lot!) I can reboot it without actually losing my source, or even having to load it all up again! Also, because I don't have to boot into X for dev, it boots really quickly, making these mistakes pretty minor.

All in all, I've had great fun doing this and other little electrical projects. These so far include my own microswitch joystick, a 2-line character LCD screen, a 128x64 pixel screen and this SD card reader. Not bad for a couple of weeks' effort!

Here's a list of pages I've found invaluable while doing this project - including some FAT16/32 resources.

SD card stuff
http://elm-chan.org/docs/mmc/mmc_e.html
https://www.sdcard.org/downloads/pls/part1_410.pdf
https://www.sdcard.org/downloads/pls/partE1_300.pdf
http://www.retroleum.co.uk/electronics-articles/basic-mmc-card-access/

Disk FAT16/FAT32
https://www.win.tue.nl/~aeb/linux/fs/fat/fat-1.html
file://drobo-fs/Backup/Electronics/MMC/Paul's%208051%20Code%20Library%20Understanding%20the%20FAT32%20Filesystem.htm
http://www.maverick-os.dk/FileSystemFormats/FAT16_FileSystem.html
http://www.maverick-os.dk/FileSystemFormats/FAT32_FileSystem.html
http://www.tavi.co.uk/phobos/fat.html#boot_block
http://averstak.tripod.com/fatdox/dir.htm




Friday, October 17, 2014

Why aiming for 60fps or full 1080p is a pointless goal.

So yet again on the web I've seen posts by journalists that 1080p is worthwhile, and that developers should be spending all their time trying to make full 1080p/60fps games, and yet again, it gets my back up...

A few years back I did a post about the mass market, which was an article I did way back in 2004 ( A new direction in gaming), and it seems to me this is the same thing, coming up again.  I mean, seriously, who do you want to buy your games, a handful of journalists, or millions of gamers?

Making a full 1080p/60fps game can be hard - depending on the game of course - but ask professional developers just how much time and effort they spend making a game run at 60fps while keeping the detail level up, and they'll tell you it's incredibly tricky. You end up leaving a lot of performance on the table, because you can't let busy sections - explosions, multiplayer, or just a load of baddies appearing at the wrong time - slow down the game even a fraction; everyone, no matter who they are, will notice a stutter in movement and gameplay. And that's the crux of the problem. Even a simple driving game, where it's all pretty much on rails so you know poly counts and the like, has to leave a lot on the table; otherwise, if there's a pileup and there are more cars, particles and debris spread around than you thought there would be, then it'll all slow down and look utterly horrible.

Take a simple example. You're in a race with 5 other cars, the cars in single file, spread out over the track - as the developer expected them to be. You're in the lead with a clear track. Let's say the developer has accounted for this, and is using 90% of the machine's power for most of the track, reducing this in certain areas to account for more cars and effects - like the start line, for example. But suddenly you lose control and spin; the car crashes and bits are spread everywhere, along with lots of particles. Now the other cars catch up, hit you, and it gets messy. Suddenly, that 10% spare just wasn't enough - it needs a good 30-40% to account for the particle mayhem and damage, and the game slows down. As a gamer, you've now dropped from 60 to 30 - or perhaps even lower depending on how many cars are in the race (like an F1 race, for example). Now, 30fps isn't terrible, and even 20fps would probably be fine, but the thing is... the player has experienced the 60fps responsiveness, and now it suddenly isn't handling the same way - and they notice.

The problem is people notice change, even if they don't understand the technical effects or reasons behind it. So going from 60 to 30 will be noticed by everyone, even when you compensate for it. It is much harder to notice going from 30 to 20 when there is frame rate correction going on, but many can still "feel it".

So, if I'm saying you shouldn't do 60fps or full 1080p (if the game struggles to handle it), what should you do? Well, what people really notice isn't smooth graphics, but smooth input. Games have to be responsive, and it's the lag on controls that everyone really notices. If you move left and it takes a few frames for that to register, everyone cares, not just hardcore gamers. But if a game responds smoothly, then whether you're at 60 or 30 - or even 20! - gamers are much more forgiving. This is mainly because once they're in the thick of the action, they're concentrating on the game, not the pretty graphics or effects. I can prove this too, even to hardcore developers/gamers. Years ago, when projectors first came out, they were made with reasonably low-res LCD panels, and you got what was called the "screen door" effect. Pixels, when projected, didn't sit right next to each other; there were spaces between them. When you started a film you'd see this, and you would end up seeing nothing BUT the spaces. However, as soon as you got into watching and, more importantly, enjoying the film, that went away. You never saw the flaws in the display, because you were concentrating on the content.

The same is true of games. Sure, you power up a game and ooo... look, it's super high res, and ever so smooth! But 10 minutes later you're deep in the game and couldn't give a crap about the extra pixels or the slightly smoother frame rate; all you care about is the responsiveness of it all. If it moves when you tell it to, you're happy.

So what about the 1080p aspect? Well, years ago when 1080p started to become the norm, and shops had both 720p and 1080p large screen TVs in store, I happened to be in one where they had the same model - one 1080p, and one 720p - hanging right next to each other, playing the same demo. I went right up to them, and could still hardly tell the difference. These were two 50" displays, yet with my nose almost touching them, it was hard to tell where the extra pixels were. Now, if you have a 1080p projector and are viewing on a 2 to 3-meter-wide screen, I'm pretty sure you'll notice, but on a TV, when you're running/driving at high speed? No chance. Every article about resolution in games shows stills, and this is the worst thing to use, as that's not how you play games. It's also worth remembering that at that moment you're not playing, so you're not concentrating on the game; you're doing nothing but searching for extra pixels, so it's yet another con really.

So what's the benefit of running slower, at a slightly reduced resolution? Well, 1080p at 30fps means you can draw at least twice as much. That's a LOT of extra graphics you can suddenly draw, and even if you don't need or want to draw that much more, it makes your coding life MUCH simpler. You no longer have to struggle to maintain the frame rate, or worry about a sudden increase in particles or damage - or just more characters on screen.

What about resolution? Well, 720p is still really high-res. There are still folk getting HD TVs and watching standard definition TV thinking it's so much sharper than it used to be! The mass market usually doesn't "get" what higher res is until they see it side by side, and once things are moving and they are concentrating on other things, they will neither know nor care.

At 720p/30fps you can draw over 4 times as much as at 1080p/60fps (1920x1080x60 is roughly 4.5 times the pixel throughput of 1280x720x30). That's a LOT of extra time and drawing. Imagine how much more detailed you could make a scene with that amount of extra geometry or pixel throughput. Even at 3 times, you could leave yourself a massive amount of spare bandwidth to handle effects and keep gameplay smooth.

2D games "probably" won't suffer too much from this problem, although on mobile web or consoles like the OUYA they will as the chips/tech themselves just isn't very quick.

So rather than spend all your time with tech and struggle to maintain a frame rate most gamers won't notice, shouldn't you spend all your time making a great game instead? If you're depending on pretty pixels to make your game enjoyable, it probably isn't a great game to start with, and you should really fix that.

Games with pretty pixels sell a few copies, truly great games sell far more, and while it's true that games with both will sell even more, the first rule of game development is that only games you release will sell anything at all, and while playing with tech is great fun, constant playing/tuning doesn't ship the game.






Wednesday, September 17, 2014

When Scotland changes forever.....

In a world of history, a truly historic event is rare, but that's what Scotland gets tomorrow. On the 18th of September 2014, we the people of Scotland, get to vote on Independence. We get to decide whether we break away from the United Kingdom, or whether we stay. But either way, things look like they'll change, for the worse or better is anyone's guess.

The stay-together campaign is promising loads of new powers from Westminster, and that the UK will stay strong in both military and monetary might. They say everything a Yes vote promises is dangerous, and too big a risk to gamble on. Perhaps they're right... perhaps.

Or perhaps not. The Yes campaign obviously states the exact opposite: that staying will continue the downward spiral of Scotland, with more and more power and money going to Westminster. They will continue to take Scotland's oil and squander it, while giving little or nothing back. They also say the so-called powers that are on offer will never appear, and not to trust them.

A couple of years ago, I started out firmly in the No camp, but over time came over to be a resounding Yes. So what changed my mind?

Well, at the start I was concerned with things like defence, and thought that being an island nation, we could defend our borders better together, and that leaving would make travel ultra-complicated - at least for a few years.

But over time I realised that just like with other nations, partnerships could be done for defence - and probably will be, while travel disruption will actually be fairly minimal as I'm still on a UK passport - that isn't going to change.

The other thing I came to realise was that Scotland is just as capable of running her own affairs as Westminster; after all, there have been both Scottish Prime Ministers and Scottish Chancellors. We're a smaller country with amazing natural resources, something Westminster really wants to keep a hold of, and I do wonder: if it wasn't for oil and renewable energy (not to mention some place to keep their nuclear arsenal), would they care so much?

I also don't trust the current promise of new powers, because while the "current" leaders may indeed be sincere, there is no way they can guarantee it. Any set of powers MUST go through current parliamentary procedures and be approved by the House of Commons, and probably the House of Lords. Backbenchers of all parties - not to mention some grassroots members - have expressed distress, and no small disgust, at additional powers being promised without proper discussion and review. So chances are, whatever powers come - IF they come at all - won't be what is being promised, nor what No voters are expecting.

The ironic thing is that the SNP wanted "Devo-Max" (which is what the offer of new powers amounts to) on the ballot paper as an option, but Westminster refused, saying it MUST be a yes/no only. If it had been on, I suspect it would have been an easy victory for the No/Devo-Max folk.

There's also the argument of money, and what currency to use. I'm indifferent to this at best. We could of course use the English pound, or we could use the Euro, or even just make our own. I don't really care either way. I suspect it'll be a few years of uncertainty, and then things will stabilise. Scotland is, or would be, a wealthy nation: oil-rich, great renewable energy, and we export more than we import.

Why does the Better Together campaign think Scotland would struggle? What other oil-rich country struggles? And this isn't even considering the renewables. Scotland is currently on track to have 100% of its power generated by renewable energy - we're already at 40%.

Not that I actually think we'll win, to be honest. While there have been huge - HUGE - Yes rallies, I also think those voting No are more the kind to just watch, stay at home... then vote No. I know this because many of my older relatives are No voters, and they are exactly like this. On top of this there are also the undecideds, and they may well swing it either way.

But the thing I'm truly excited about, is the level of engagement on this vote. Unlike a general election where the public believes their vote counts for nothing - especially in Scotland, everyone here feels their single vote could swing things, so they HAVE to take part. It's amazing. There has been a 97% registration rate - unheard of in any election, and they are expecting 80-90% of these to actually vote. That's astounding.

So, yes... I'm looking forward to tomorrow, to take the leap and cast my vote, and while I don't "think" we'll win, I'm certainly not sure, and I'm excited to hear the voice of Scotland - virtually ALL of Scotland.

So I'll vote Yes, and I'll hope everyone else wants to take that leap of faith with me, and believe that Scotland is best suited to govern Scotland. While it might be a tricky road ahead, nothing truly worthwhile is ever easy, and this is something worth taking that leap of faith for - not just for me, but for my kids and my kids' kids.

Wish us luck, no matter how the vote goes!!

Sunday, August 24, 2014

The beginner's guide to Ray Tracing

So, I was asked recently on Twitter how Ray Tracing works, but it's not something that you can easily explain in 140 characters, so I thought I'd do a small post to explain it better.

So... the basic idea when rendering a scene is to have a near and a far plane; the corners of these two planes define a "box" in 3D space, and that's what you're going to render - everything in the box. Orthographic projections use a straight box, where the far plane is the same size as the near plane, whereas perspective uses a far plane that is bigger than the near one - meaning more appears in the distance than close to you (as you would expect).

Doing ray tracing is no different. First create a box aligned with the Z axis, so the two planes sit perpendicular to Z. Something like... (values not real/worked out!)

Near plane  -100,-100,0  (top left)  100,100,0 (bottom right)
Far plane     -500,-500,200  (top left)  500,500,200 (bottom right)

This defines the perspective view we're going to render.

Now, how you "step" across the near plane, determines your resolution. So, let's say we're going to render a 320x200 view, then we'd step 320 times across the X axis, while we step 200 times down the Y.
This means we have 320 steps from -100,-100,0  to  100,-100,0 (a step of 0.625 on  X). And you then do the same on Y (-100 to 100 giving a DeltaY of 1.0). With me so far?


So, this image shows the basics of what I'm talking about. The small rectangle at the front is the near plane, and the large one at the back; the far plane. Everything inside the white box in 3D is what we consider as part of our view. If we wanted to make a small 10x10 ray traced image, then we'd step along the top 10 times - as shown by the red lines. So we have the 8 steps and the edges, giving us 10 rays to cast.

So let's go back to our 320x200 image, and consider some actual code for this (this code isn't tested, it's just pseudocode).

float nx, ny, fx, fy;
float nz = 0.0f;
float fz = 200.0f;
float ndx = 0.625f;   // near plane step on X (200 / 320)
float ndy = 1.0f;     // near plane step on Y (200 / 200)
float fdx = 3.125f;   // far plane step on X (1000 / 320)
float fdy = 5.0f;     // far plane step on Y (1000 / 200)
fy = -500.0f;
for (ny = -100.0f; ny < 100.0f; ny += ndy)
{
    nx = -100.0f;
    fx = -500.0f;
    for (; nx < 100.0f; nx += ndx)
    {
        vec4 colour = CastRay(nx, ny, nz, fx, fy, fz);
        // ...store "colour" into the next pixel of the image...
        fx += fdx;
    }
    fy += fdy;
}

As you can see, it's a pretty simple loop really. This gets a little more complicated once the view is rotated, as each of X, Y and Z has its own delta, but it's still much the same, with an outer pair of loops dictating the final resolution. In fact, at that point the outer X and Y loops are usually just from 0 to 320 and 0 to 200 (or whatever the resolution is).

Assuming we have a background colour, then CastRay() would currently return only the background colour, as we have nothing in the scene! And when you get right down to it, this is the core of what ray tracing is. It's not that complicated to define a view and start to render it. The tricky stuff comes when you start trying to "hit" things with your rays.


Let's assume we have a single sphere in this scene; then the cast ray function will check for a collision from the line nx,ny,nz to fx,fy,fz with our sphere (which is just a position and radius). This is basic maths stuff now - ray-to-sphere collision. There's a reasonable example of ray to sphere HERE.
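The ray-to-sphere test itself boils down to solving a quadratic along the ray. Here's a minimal sketch of the idea (the Vec3/RaySphere names are mine, not from the linked example, and the ray direction is assumed normalised):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns the distance t along the ray to the nearest hit point,
// or -1.0f if the ray misses the sphere entirely.
float RaySphere(Vec3 origin, Vec3 dir, Vec3 centre, float radius)
{
    Vec3 oc = sub(origin, centre);
    float b = dot(oc, dir);                 // half the usual quadratic "b" term
    float c = dot(oc, oc) - radius * radius;
    float disc = b * b - c;                 // discriminant
    if (disc < 0.0f) return -1.0f;          // no real roots: a miss
    float t = -b - std::sqrt(disc);         // nearest of the two roots
    return (t >= 0.0f) ? t : -1.0f;         // behind the ray origin also counts as a miss
}
```

So a ray fired straight down Z from the origin at a sphere 10 units away with radius 2 hits at t = 8.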


So once you have that code in place, and you decide you've hit your sphere, you'll return the colour of your sphere. Pretty easy so far. So what if we add another sphere into the scene? Well, then the ray cast will simply loop through everything in the scene, no matter what we add: spheres, planes, triangles, whatever.


Now we have a scene with lots of things being checked and hit, and we return a single colour for the CLOSEST collision point (giving us the nearest thing we've hit). This is basically our Z-buffer! It means things can intersect without us having to do anything else: the closest distance down the ray to an intersection point is the nearest thing to us, so that's the one we see first.
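The closest-hit loop can be sketched like this (the Hit type and names are illustrative; in a real tracer the distances would come straight from the intersection tests above):

```cpp
#include <vector>
#include <cfloat>

// A precomputed (distance, colour) pair from some intersection routine.
// A negative t means "the ray missed this object".
struct Hit { float t; int colour; };

// Walk every object in the scene; the smallest positive hit distance
// decides which colour we return - our poor man's Z-buffer.
int CastRay(const std::vector<Hit>& hits, int background)
{
    float bestT = FLT_MAX;
    int colour = background;              // nothing hit: background colour
    for (const Hit& h : hits)
    {
        if (h.t >= 0.0f && h.t < bestT)   // a hit, and closer than anything so far
        {
            bestT = h.t;
            colour = h.colour;            // nearest object "wins"
        }
    }
    return colour;
}
```

Note there's no sorting anywhere; intersections between objects just fall out of picking the smallest t.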


The next simplest thing we might try is reflections. In realtime graphics this is pretty hard, but in ray tracing it's really not. Let's say we have 2 spheres, and the ray we cast hits the 1st one. All we now need to do is calculate the "bounce" - the new vector we get from taking the incoming ray and working out how it hits and bounces off the sphere. Once we have this, all we do is call cast ray again, and it'll fly off and either hit nothing, or something. If it hits something, we return the colour, and this is then merged into the initial collision colour and returned. Now, how you merge the colour is basically how shiny the thing is! So if it's VERY shiny, a simple LERP with 1.0 towards the reflected colour gives a perfect reflection, whereas 0.5 gives a mix of the first and second spheres, and 0.0 gives only the original sphere's colour (in which case you'd probably never do the bounce in the 1st place!)
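The "bounce" itself is just the standard reflect formula, r = d - 2(d.n)n, followed by the blend. A rough sketch (names are mine, and the normal is assumed to be unit length):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Reflect incoming direction d about the unit surface normal n:
// r = d - 2(d.n)n
Vec3 Reflect(Vec3 d, Vec3 n)
{
    float k = 2.0f * dot(d, n);
    return { d.x - k * n.x, d.y - k * n.y, d.z - k * n.z };
}

// The LERP described above, per colour channel:
// shininess 0.0 = matte (surface only), 1.0 = perfect mirror (reflection only).
float MergeChannel(float surface, float reflected, float shininess)
{
    return surface + (reflected - surface) * shininess;
}
```

A ray going straight down onto a floor (normal straight up) bounces straight back up, as you'd hope.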


So the number of "bounces" you allow is basically the level of recursion you do for reflections. Do you only want to "see" one thing in the reflection? Then one bounce it is. Do you want to see a shiny object in another shiny object? Then up the recursion count! This is one of the places ray tracing starts to hurt, as you can increase this number massively, and doing so will also increase the number of rays being cast, and the amount of CPU time required.


We should now have a shiny scene with a few spheres all reflecting off each other. What about texture mapping? Well, this is also pretty simple. When a collision occurs, it'll also give you a "point" of collision, and now all you need to do is project that into texture space and return the pixel colour at that point. Hey presto - texturing. This is very simple on, say, a plane, but gets harder as you start to do triangles, or spheres - or displacement mapping, or bezier patches!


For things like glass and water, you follow the same rule as reflection, except the "bounce" is a refract calculation. So instead of coming off, the ray bends a little as it carries on through the primitive you've hit. Aside from that, it's pretty much the same.


Now... lighting. Again, in realtime graphics this is tricky, but in ray tracing it's pretty easy. Let's say you've just hit something with your first ray. All you now do is take this point, create a new ray to your light source, and cast again. If this new ray hits anything, it means there's something in the way, and your pixel is in shadow. You do this for all the light sources you want to consider, merge the shadows together, and blend the result with the surface pixel you hit in the first place, making it darker. And that's all lighting is (at a basic level). Area lights, spot lights and the rest get more interesting, but the basics are the same: if you can't see the light, you're in shadow.
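A shadow test following that description might look like this (illustrative names; the occluder distances would come from re-using the same intersection code along the new hit-point-to-light ray):

```cpp
#include <vector>

// occluderT holds hit distances along the shadow ray for everything in the
// scene (-1 = that object missed); lightDist is the distance from the hit
// point to the light itself.
bool InShadow(const std::vector<float>& occluderT, float lightDist)
{
    for (float t : occluderT)
    {
        // Something sits between the hit point and the light
        // (t beyond lightDist means it's BEHIND the light - no shadow).
        if (t > 0.0f && t < lightDist) return true;
    }
    return false;   // clear line of sight: the pixel is lit
}
```

Note we don't even need the nearest occluder here - any hit before the light is enough to darken the pixel, so you can early-out on the first one.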


So how do you do all this in realtime? Well, now it gets fun! The simple approach is to test against everything, and this will work just fine, but it'll be slow. The 1st optimisation is to only test things in your view frustum. This will exclude a massive amount - even a simple test to make sure something is close to the frustum will help. After this, it gets tricky. A BSP-style approach will help: only test regions that the RAY itself intersects. This again will exclude lots, and help speed things up hugely.




After this... well, this is why we have large graphics conferences like SIGGRAPH. Professionals from the likes of Pixar, other studios, and universities spend long hours trying to make this stuff better and faster, so reading up on ideas from these people would be a good idea.


Lastly... there is also some serious benefit to threading here. Parallelisation is a big win in ray tracing, as each pixel you process is (usually) totally self-contained, meaning the more CPU cores you have, the faster you'll go! Or, if you're really adventurous, you can stick it all on the GPU and benefit from some serious parallelisation!!
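Since rows of pixels are independent, splitting them across threads is about the simplest win available. A minimal sketch using std::thread (TraceRow here is a stand-in for the real per-pixel CastRay work):

```cpp
#include <thread>
#include <vector>

// Stand-in for the real work: in a proper tracer this would call
// CastRay() for every pixel in row y.
void TraceRow(std::vector<int>& pixels, int width, int y)
{
    for (int x = 0; x < width; x++)
        pixels[y * width + x] = x + y;   // placeholder "colour"
}

void TraceImage(std::vector<int>& pixels, int width, int height, int numThreads)
{
    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; t++)
    {
        // Each thread takes every numThreads-th row. Rows never overlap,
        // so no locking is needed anywhere.
        workers.emplace_back([&, t]() {
            for (int y = t; y < height; y += numThreads)
                TraceRow(pixels, width, y);
        });
    }
    for (auto& w : workers) w.join();
}
```

Interleaving rows (rather than giving each thread a solid block) also balances the load a bit when some parts of the image are much more expensive than others.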


Wednesday, January 01, 2014

Improving the runner

So, I thought it might be fun to look back at some of the changes we've made to the runner, why we made them, and what impact they've had on performance. Back when we started, the main complaint with Game Maker was that it was too slow, and that for more complex games it just didn't cut it. This is clearly not the case any more, so what's really changed? Well, this only really affects the C++ runner, but let's take a look at what we did...

WAD

When we started on the PSP port, the 1st thing we did was to remove all the compression from the output. The game was in fact doubly compressed (which is nuts BTW - it's not like it's going to get magically smaller again!), and on the PSP this led to loading times of around 40 seconds to a minute in even very simple games. So we changed the whole output, altering it into an IFF-based WAD file, which meant it could be loaded incredibly quickly and was ready to go as soon as it loaded. This was a huge change code-wise, and touched every element of Game Maker. However, this change alone meant games would load either instantly, or in a few seconds.

Virtual Machine

Game Maker was originally split into 2 parts, scripting and D&D (Drag and Drop), and believe it or not, there were two totally separate code paths for each. On top of this, everything was compiled on load, meaning there was another noticeable pause when games loaded. So the first thing we did was change the script engine to use a virtual machine; this meant we could optimise small paths of code rather than huge chunks. This gave a modest boost - not as much as we hoped, but that was due to the way Game Maker dealt with its variables and how things were cast, and we were stuck with that. The next thing we did was to remove the D&D path, compiling all D&D actions into script so they could then be compiled in the same manner. Lastly, we pre-compiled everything. This removed the script source from the final game - finally removing the worry of simple decompilation - but it also sped up the loading of a game yet again. It was now at the point where a Windows game would load and run almost instantly.


Rendering

There's quite a number of changes here, so I'll break it down a little further. But basically, the old Delphi runner had some major CPU->GPU stalls, and so we had to look into fixing them.

Hardware T&L

This was introduced in GM8.1, and it was the first real boost for performance. All later versions of Studio also use this "Hardware Transform and Lighting" mode. By using this, the GPU does all the transformations, leaving the CPU free for running the game. Up until we switched, after you submitted a sprite, DirectX used the CPU to transform all the vertices into screen space before passing them on to the GPU - this places severe limits on how much you can push through the graphics pipeline, so it was changed in the first version of GM8.1 we released.

DirectX 9

The first thing we did was upgrade to DX9. This meant we were on a version which Microsoft still maintained properly, and it gave us access to more interesting tools down the line - like better shaders. While doing this, we also introduced a proper fullscreen mode, and while this may not seem like much, when in fullscreen mode things do render slightly quicker. Also... DX9 is just a little faster than DX8, and the upgrade is a pretty simple process (it takes time, but it's simple enough).


Texture pages

One of the biggest problems with Game Maker was that every sprite and every background was given its own texture page, and while this might seem like a good idea - it means you can easily load things in and throw them away - it utterly destroys any real performance. So we set about putting all images onto shared texture pages (TPages), but this was more complex, as Game Maker allows non-power-of-2 images.

Standard methods involve sub-dividing the page down, splitting a free area in 2 each time you allocate a sprite, until you fill up the whole sheet. This is very fast, and very efficient. However... we needed something more. So what we did was create a multi-stream, virtual rectangle system which would clip against everything as you added any sprite of any size. I came up with this years ago, and it was incredibly powerful and very fast - but it was recursive, and complex. Still, having written it before, I did it again, and we had a very nice texture packer for any size of image.

Next we noticed that current Game Maker developers were incredibly wasteful with images. They would have huge sprites with nothing but empty space in them, just so they could be aligned easily with screen elements. A cool fix for this was to crop out anything which had a 0 alpha value. You basically search the sprite for the bounding box of everything that isn't 0 alpha, then crop to it, remembering all the offsets so you can draw it again as normal. This is a cool trick that helps us pack even more onto a page, which then helps with batching as well.
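The alpha-crop search is just a bounding-box scan over the alpha channel. A rough sketch (names and types are mine, not the actual Studio code):

```cpp
#include <vector>

// The smallest box containing every non-transparent pixel; left/top
// become the draw offsets once the sprite has been cropped.
struct CropBox { int left, top, right, bottom; bool empty; };

// alpha is width*height bytes, row-major; 0 = fully transparent.
CropBox FindAlphaCrop(const std::vector<unsigned char>& alpha, int width, int height)
{
    CropBox box = { width, height, -1, -1, true };
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            if (alpha[y * width + x] != 0)   // visible pixel: grow the box
            {
                if (x < box.left)   box.left = x;
                if (x > box.right)  box.right = x;
                if (y < box.top)    box.top = y;
                if (y > box.bottom) box.bottom = y;
                box.empty = false;
            }
        }
    }
    return box;
}
```

A sprite that's all padding except a couple of pixels shrinks down to just the box around those pixels, with the offsets remembered so it still draws in the same place.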


Batching

Speaking of batching... since the old version changed texture all the time, it could never hope to batch more than a couple of triangles at once, and this meant the CPU was forever getting in the way. Modern graphics hardware wants you to send as much as you can in one shot, then go away and leave it alone!

So, the first thing we did was go through and change all the primitive types. The old way (TriFan) is unbatchable, so you would always be stuck with 2 triangles per batch, and that's useless. So we changed them to the most batchable prim we could use: TriList. I wrote a new vertex manager that you could allocate vertices from, and it detected if there were any changes or breaks in batches, so all the engine had to do was draw, and the underlying API would auto-batch it for us.

Couple this with the new texture pages, and if you were just drawing sprites, you had a good chance of actually drawing multiple sprites in one render call. In fact, many games (like They Need to be Fed) can render virtually the whole game in one render call depending on the size of the texture page.

There are still some things which can break the batching (obviously), but with careful set up, rendering sprites is no longer the limiting factor of your game.


Blend States

Blend states are always tricky, and in the past I've tended to use a layering system - something that might appear in later versions of Studio, or more likely GM: Next, as it helps manage state changes much better. However, we got a shock early in the year when we discovered how GM users actually manage state, and so we worked hard to hide the pain of state management from our users.

Now a user can simply set state and reset state as they see fit, and aside from the GML call overhead, we handle all the blend state batching behind the scenes. For example, if you set a blend state, draw something, reset it, then set it again, draw something and then reset it, we recognise this, and both items will be drawn together in a single batch, making the rendering code much more efficient without the developer ever having to care.

It'll also handle more extreme cases, where you set blends several times, then set a different one, and then set more blends several times. The engine will recognise this too, and will only submit 3 batches - or however many blend state batches are actually needed. This is incredibly powerful, and a major player in rendering performance.


Static Models

In the past, whenever you created a D3D model, it didn't actually create anything, it just remembered the prims and commands you sent, then replayed them back dynamically. This meant 3D rendering was terrible, and you would struggle to get above a few thousand polys.

We actually introduced this new concept in GM8.1, although it was more of a pet project for me at the time. Now it's a staple of how you should work. Studio will now let you build fully static models, meaning that when you're finished creating them, you'll be left with a single vertex buffer (or multiple, depending on lines and points) that is simply submitted to the hardware when you want to draw something.

This is basically as fast as it's possible to draw on modern hardware: you create a buffer full of vertices and submit it by simply pointing to the buffer. This is what all modern games do; it's what engines like Unity do. With this addition, you can quite literally draw models with hundreds of thousands of polys at hundreds of frames per second (depending on your graphics card, of course!)


Shaders

Although a recent addition, this is a powerful performance tool. Up until this point, special effects would be done on surfaces and processed by the CPU - most of the time painfully so! But with all the previous optimisations in place, shaders were finally able to take their place and show you what the GPU was really able to do.

From effects such as shadows and environment mapping, to more 2D effects such as outlines, colour remapping, or even full shader tilemaps - all of it meant the CPU got more time to work on the game, rather than spending time trying to make things look pretty, especially when there was a much better way of doing it sitting there doing nothing.


string_execute()

Yeah... this old one. Well, this was a popular command, mainly because it let you be really lazy in the way you coded, and yeah... that's not a terrible thing in itself, but it was one of the worst performance hogs, and folk using Game Maker had come to rely on it, meaning it was pulling performance down for everyone. So why was it so bad? Well, unlike normal code, every time you executed it, it had to compile the string before running it, and this was incredibly bad. Let's have a quick look at what's involved in compiling.

Lexical Analysis. First it had to scan the string, break it down into tokens, allocate memory for them, and build a parse tree. Even a very simple string like "room_goto(MenuScreen)" means parsing the string, looking at each character and breaking it into several tokens: the room_goto command token (finding and verifying the syntax of the command), then the contents of the brackets - in this case MenuScreen, which is also stored as a token. These are then stored as a parse tree, so that equations and evaluations are calculated correctly. So if you gave it a calculation like "A = instance3.x+((instance.B*12)/instance2.C)", it would be evaluated correctly when executed.

Compilation. This then transforms the parse tree into byte code. Previously it was transformed into a stream of tokens and the engine just ran these, but with the virtual machine we now compile into byte code. This memory also has to be allocated.

Running. Now the new script can be run. First it sets up a code environment (again, more memory allocation), complete with the current instance so it can store things correctly, then the VM takes over and runs the code.

Freeing. This is the real crime. Once all this work is done and the script has been run... it's all thrown away. All the different memory blocks are freed, and when you try and run it again, it has to do all of this all over again.
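To get a feel for the lexing cost alone, here's a toy tokeniser for a string like "room_goto(MenuScreen)" - purely illustrative, nothing like GML's real lexer, but it shows the per-character work string_execute() repaid on every single call:

```cpp
#include <string>
#include <vector>
#include <cctype>

// Split a source string into identifier/number tokens and single-character
// punctuation tokens, skipping whitespace. Every character gets inspected,
// and every token is a fresh allocation - all of which gets thrown away again.
std::vector<std::string> Tokenise(const std::string& src)
{
    std::vector<std::string> tokens;
    size_t i = 0;
    while (i < src.size())
    {
        if (std::isalnum((unsigned char)src[i]) || src[i] == '_')
        {
            // Identifier or number: consume the whole run as one token
            size_t start = i;
            while (i < src.size() && (std::isalnum((unsigned char)src[i]) || src[i] == '_'))
                i++;
            tokens.push_back(src.substr(start, i - start));
        }
        else if (!std::isspace((unsigned char)src[i]))
        {
            // Punctuation: brackets, operators, commas, etc.
            tokens.push_back(std::string(1, src[i]));
            i++;
        }
        else
        {
            i++;   // skip whitespace
        }
    }
    return tokens;
}
```

And that's just step one; the parse tree, byte code and code environment all come after it, every time the command runs.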

Knowing this... it's no wonder this command was such a hog. And file_execute was even worse, as it had to load the file first before doing any of this - and disk access is ALWAYS slow. So this is why this command was removed. You should never need it if you code properly anyway. It can be great for tools, but when everyone just sees it as a command they can use... well, they just blame Game Maker for not being quick enough, so it had to go.



Conclusion

Now, there are obviously countless other optimisations in the runner, from ds_grid get/set being made much quicker, to the total rewrite of ds_map allowing for huge, lightning-fast dictionary lookups, to the way variables are now handled so that the VM can access them much quicker, or how YYC now translates everything into C++ so it can be natively compiled, thereby lifting what little lid there was left.

And don't forget, we're a multi-platform engine, and each platform gets the same kind of treatment as Windows does. Each platform has its own quirks and its own set of performance-enhancing tools - like, for example, the total rewrite of the HTML5 WebGL engine, which does things in a totally different way just so it can push things a little better for that platform, or the Windows RT platform, which has a custom DX11 engine designed and optimised just for RT devices. Each platform is taken on its own merits, and enhanced to best fit that device.

But no matter how you look at it, GameMaker: Studio has moved well beyond the mere learning tool it was once designed to be. It's no longer only used by beginners, or bedroom coders looking for some fun; it's now used by professional developers around the world, because it delivers fast, cross-platform development with a truly fast engine that's always being improved.



Tuesday, December 31, 2013

2013 - The year of travelling.

2013 kicked off with a trip to CES, where we had been invited to take part in MIPS' presence at the show. Sandy, knowing my utter love of all things MIPS, asked if I'd like to go, and I jumped at the chance! Russell, feeling left out, begged to come as well. He'd never been to CES and was itching to see the largest consumer electronics show on earth. Sandy relented, and before he booked everything, I asked if I could pay to take my better half with us. I'd never been to Vegas, and suspected I'd never get back, so I thought this an ideal opportunity to do all the tourist things out there in one go. We stayed at the New York, New York hotel/casino and it was an experience to be sure. I hired a car for a couple of days and we took trips out to the Hoover Dam and the Grand Canyon, both on my life "to-do" list.

The MIPS hotel suite was odd though; Russell and I turned up and no one knew why we were there, or what we were supposed to be doing. However, the head guy appeared a little later, knew what was going on, and just told us to set up somewhere. We were there to help support the sale of the chips and devices, to show that there were free tools available that supported MIPS. By this time we had the free MIPS version out, so it was a good sales pitch, I guess. There wasn't actually that much for us to do, so we just hung around for a while, then met up with my wife and headed over to the show. CES is massive, and it was great fun wandering around; my wife Frances took great delight in getting as much free stuff as was possible.

Once we got back we had some serious catching up to do, as the race was on to get ready for this year's GDC. Work had finally started on the YoYo GameMaker Compiler (YYC), and this was to be the big push for GDC this year. Yet again, the bulk of the work fell to Russell, as he was "Mr Compiler" in the team. The games guys were to do a new set of example games, and this year they outdid themselves, making the lovely-looking Angry Cats Space game. We also put together a great video of indie hits created using GameMaker, including Spelunky, Hotline Miami and quite a few others.

It was decided to get a proper stand this year, something Russell and I were overjoyed about, as we wouldn't have to carry the stand over there this time! But with only a couple of weeks to go and lots of meetings stacking up, Sandy suddenly felt we didn't have enough folk to man the stand, so asked Malcolm and Geoff to come as well. With 6 of us there for the whole show, and Jaime (our lone US presence) pitching in to help, we were well staffed and ready for the rush. The stand looked great, and once again we had lucked out on the stand location, right next to the indie games expo.

Everything went as planned, although we adjusted things a little as the show went on to draw more folk in. It was another busy show, with lots of interest. We had some great meetings while we were there as well, and came back ready to press on with the compiler and 1.2 update changes.

1.2 had some pretty large features, the compiler being one of them, but we also added shader support. We had intended to add a proper source-level debugger too, but as I was also doing the initial shader support, this slipped - a shame, as it was why we had added cross-platform networking support prior to GDC. Still, the shaders were going to be a huge deal, and when paired with the compiler, we figured we would finally be able to stand up to anyone comparing us with Unity. The speed-up with the compiler was monumental in some cases and hardly noticeable in others, but this was to be expected, as it depended on just how many scripts and loops you used.

Perhaps the largest change, however, was that we had finally outgrown the office, and had been looking around for a while for somewhere we could move to. This came in the form of Dundee One, about as prestigious an office space as you'll find in Dundee. While we cracked on with the work, Sharon, our new head of finance, was given the task of overseeing the move. This included planning it all and liaising with architects, electricians, builders, and building managers. It almost broke her, but the results were spectacular. While Russell and I were asked for general layouts and plans, she did all the hard graft, and the office has been acknowledged as one of the best in Dundee and the surrounding area. It's a fantastic place to work, and we still can't believe we get to go there every day!

At the start of July, a few of us went down to the Develop awards, as we had for the first time been nominated for the tools we'd spent so long creating. Stuart and a few of the guys were already down there, as we also had a stand at the Develop conference. Sharon, Russell and I went down for the evening, and we invited Tom Francis (of Gunpoint fame!) to join us; much fun, and many a drink, was had by all.

Only a week later, however, I was off on my 4th trip of the year, this time part holiday and part business. I had decided to take myself to SIGGRAPH, a huge 3D graphics show that tours around the US each year. I'd been to a few of them over the years, and they are always hard going, but great fun. The "hard going" part is the sitting in a dark room for a week, listening to folk drone on about stuff you're only half interested in, but someone at work has asked you to attend as it was of interest to them, and being on a work ticket, you felt obliged. This time, however, would be different. Being there on my own buck, I could ignore the boring talks and just attend the ones I was interested in.

Pixar stand - where I got a Renderman walking Teapot!
As I was attending a conference the week after, Sandy had graciously volunteered to pay for my flight over, and as luck would have it, I found a dirt-cheap business class seat out of Amsterdam, and so for the first time ever I went to the US in some luxury. Talk about how the other half live... it was a fantastic trip, and something I now look for whenever I fly; it's not always possible, but it's brilliant when it happens.

So I spent a week in LA at SIGGRAPH, met a couple of friends, did some great holiday things and attended what turned out to be more like live "DVD extras" events (which was great fun too), then headed up to San Francisco for Casual Connect. Unlike most trips to the US, I'd now been there a week, so was actually in time sync and was looking forward to a show where I was actually awake! However, Casual Connect was hard going and almost broke me. Stuart came through to help out, and with Sandy and Jaime we should have been in good shape, but everyone else had meetings to go to, and that left me on the stand for 4 days, from around 8am to 7pm - a total killer. Even Stuart, who did have meetings to break things up, agreed, but that's what shows are all about, so we finished up and went back to the UK exhausted.

Yet only a few weeks after getting back, Russell, Stuart and I were due to head out once again to GDC Europe in Cologne. Unlike last year, where we flew into Amsterdam and got a train from there to Cologne, this time we found direct flights, and so were all pretty keen to go. But with the V1.2 release looming, Sandy told us someone had to stay back to help push it out, and as Russell was involved in it far more than I was by now, he volunteered, and we rushed around changing tickets and hotel bookings to take Gavin instead.

So in mid-August, just 2 weeks after I had returned from the US, I was off again, this time to Europe and Cologne. We had a lovely stand this year, and it really stood out compared to everyone else's; in fact, quite a few people came up to ask where we got it from. The show went fine, although we felt it was a little quiet compared to the last couple of years, and certainly compared to the US one, but we still made some good contacts and had fun doing it. We got to go to Gamescom as well, which is always good, particularly as it was on the business day, when it's totally empty, not the mad rush that the public get to see.

So, job done, we packed up and took ourselves out for a last relaxing evening before heading back, when a Microsoft evangelist called Kristina Rothe tweeted me and asked if we were around. I said we were just having dinner and would be having an early night as we were leaving tomorrow, then made the mistake of inviting her over if she was in the neighbourhood - thinking she'd be miles and miles away. But no, she was virtually around the corner and appeared about 20 minutes later, and to be fair, we had a lovely drink, meal and chat. She then invited us out to a party she was going to and said she could get us in, and we thought sure... one drink will do, and then we can head back.

The devil woman snares another victim
As it turned out, she was the devil in disguise, and the party was awesome, and fun... and I drank WAY too much, something I never - ever - do. I made an utter fool of myself and managed to spray beer all over the host, who was a total gent and never even beat me up for it. It was a fab night, and we finally headed back to the hotel after a second party turned out to be closing down when we got there. I failed to drink enough water during the night and awoke the next day with a massive hangover, mainly, I think, due to dehydration! A nightmare trip to the airport followed, along with a long wait at a Starbucks for the flight to board. I did however manage to eat and drink something, so by the time we got home I was more or less back to normal again, and I vowed - like everyone else - "Never again".

V1.2 was released a few days later, and went down incredibly well. Users were especially excited about the new shader support, and have been having fun with it ever since. We also shipped the YYC as a new module, something customers again felt should have been free. Back in the day, when I used to use products like Blitz BASIC - or just BASIC, for that matter - compilers were always viewed as an add-on you paid for. The basics worked just fine, and the compiler was an optional extra that gave you a boost when you needed it. These days, with the internet giving so much away for free, many users forget the time and effort that goes into something like this and demand it be reduced in price - usually to a point where they specifically can afford it.

Iconic view from the Empire State Building
Still, everything was going well, when all of a sudden a few things clicked into place, and we were all off again on another trip to the US. Russell, Sandy and I were due to head out mid-September to various meetings, first on the east coast in Boston, then down to Washington, then up to New York. It was a pretty quick trip, taking in a couple of cities a day, but fortunately I was only doing the east coast, and the guys then left for San Francisco without me. Having never been to New York before, I was pretty excited to take in some of the sights, and to finally meet up and have dinner with Jesse Freeman. After a great dinner, wonderful conversation and a good sleep, I was able to take in some of New York's sights before heading back to Boston by train for my flight home.


Once I got back, I joined in the work that had started on 1.3 - along with some other projects - and started to get back into the groove. At home, however, I'd started a new GameMaker project that initially got some folk over-excited. The Your World project started out life as an Ant Attack map loader, but quickly evolved into the cityscape so well known from another game - which shall remain nameless. It didn't take very long to get it up and running, and it being a personal project at the time, I had great fun showing its progress and trying out new things. I didn't realise the interest in it, however, until I left on holiday with my wife to Erik Ruth's wedding in Washington (Trip 6!), when the media got hold of it and coverage exploded! My Twitter followers more than doubled in a matter of days, and interest in the project took off. We decided to rein things in, and to try and do something more with the project than a toy for me to play with, and so Your World was born. There was a necessary lull in exposure to the project as we refactored it and created our own assets, but in doing so, we created a fully open source world that was free, and had a built-in editor for folk to play with.

Our trip to Washington was also great, and the wedding was fun to go to - much tamer than a Scottish wedding, BTW! No one got knifed, or drank themselves to death, but it was fun all the same. We also managed to spend some time with them a couple of days later, going around the museums and malls. Both of us loved sightseeing around Washington, spotting all the landmarks we'd seen on so many TV programs, and would go back again in a heartbeat!

Back at work again, we carried on working towards the 1.3 build, while I also did some other browser work and helped get the new Your World project going. With very little free time, I was looking forward to a long weekend away in Amsterdam (Trip 7!), once again with my wife. I'd been through Amsterdam a few times, but had never managed to spend any time in it, so a few days walking around there at the end of November were a welcome break. With the canals and lots of tiny unique shops, it's a very relaxing place to wander, and we had great fun doing just that.

Coming back from there, Russell was pulled onto another project, and I started pulling together the brand new Early Access build. This was a new version of GameMaker that would live alongside the current one, the idea being that all the new - and probably broken - features would go into this one, while we kept the main build as stable as possible. Over the year and a half of Studio's development, a huge number of features have gone into it, and as a result it's definitely had some stability issues. We tackled them as soon as we found them, and the beta branch worked okay for that, but we didn't feel it was enough, so over the last few versions we've kept a totally separate code branch that we simply merge fixes into, and we feel that's helping to keep the main version much more stable.

But we do still want to get new features out there so users can try and test them. We now have tens of thousands of users - hundreds of thousands if you include the free version - and we simply can't test as much as all those users can, so the decision was made to create a new version, explicitly designed to house new features - complete with the lovely blue icon.

However, even though this is supposed to be an unstable branch, this first version is actually pretty stable in itself, and the feature set is utterly desirable. Lots of hard work has gone into the 1.3 release, which includes things like Flash asset importing, the source-level debugger (finally making an appearance), and full extension support for iOS and Android. On top of this, there are some major updates to the IAP API, Push notifications - probably the most requested feature - have finally been added, and some new events as well. This makes the first Early Access build one of the most desirable versions of GameMaker: Studio ever, and it's something we pushed hard to get out before Christmas so that everyone could have some fun with it.

So there you go... 2013 has come and gone, and it's been a totally mad year. I've done more travel this year than in the past 3 years put together, but it's been great fun - and it turns out I love to travel, which is just as well, really. 2013 certainly ended much better than 2012 did, and we have the foundations of an amazing 2014. Things are already afoot that should bring some surprising and exciting news, and while there's a huge amount of work still to be done, we're excited and ready to do it - here's looking forward to a great 2014!