Friday, October 17, 2014

Why aiming for 60fps or full 1080p is a pointless goal.

So yet again on the web I've seen posts by journalists that 1080p is worthwhile, and that developers should be spending all their time trying to make full 1080p/60fps games, and yet again, it gets my back up...

A few years back I did a post about the mass market, based on an article I wrote way back in 2004 (A new direction in gaming), and it seems to me this is the same thing, coming up again. I mean, seriously, who do you want to buy your games, a handful of journalists, or millions of gamers?

Making a full 1080p/60fps game can be hard - depending on the game of course, but ask professional developers just how much time and effort they spend making a game run at 60fps, while keeping the detail level up, and they'll tell you it's incredibly tricky. You end up leaving a lot of performance on the table because you can't let busy sections - explosions, multiplayer, or just a load of baddies appearing at the wrong time - slow down the game even a fraction, because everyone, no matter who they are, will notice a stutter in movement and gameplay. And that's the crux of the problem. Even a simple driving game, where it's all pretty much on rails, so you know poly counts and the like, has to leave a lot on the table, otherwise if there is a pileup, and there are more cars, particles and debris spread around than you thought there would be, then it'll all slow down, and will look utterly horrible.

Take a simple example. You’re in a race with 5 other cars, cars are in single file, spread out over the track - as the developer expected it to be. You’re in the lead with a clear track. So let’s say the developer had accounted for this, and is using 90% of the machine's power for most of the track, and in certain areas they reduce this to account for more cars and effects - like the start line for example. But suddenly, you lose control and spin, the car crashes and bits are spread everywhere along with lots of particles. Now the other cars catch up, hit you, and it gets messy. Suddenly, that 10% spare just wasn't enough. It needs a good 30-40% to account for particle mayhem and damage, and the game slows down. As a gamer, it’s now dropped from 60 to 30 - or perhaps even lower depending on how many cars are in the race (like an F1 race for example). Now, 30fps isn't terrible, and even 20fps would be fine - probably, but the thing is.... the player has experienced the 60fps responsiveness and now it's suddenly not handling the same way, and they notice.

The problem is people notice change, even if they don't understand the technical effects or reasons behind it. So going from 60 to 30 will be noticed by everyone, even when you compensate for it. It is much harder to notice going from 30 to 20 when there is frame rate correction going on, but many can still "feel it".

So, if I'm saying you shouldn't do 60fps or full 1080p (if the game struggles to handle it), what should you do? Well, what people really notice isn't smooth graphics, but smooth input. Games have to be responsive, and it's the lag on controls that everyone really notices. If you move left, and it takes a few frames for that to register - everyone cares, not just hard core gamers. But if a game responds smoothly, then whether you’re at 60 or 30 - or even 20! - gamers are much more forgiving. This is mainly because once they're in the thick of the action, they are concentrating on the game, not the pretty graphics or effects. I can prove this too, even to hard core developers/gamers. Years ago when projectors first came out they were made with reasonably low res LCD panels, and you got what was called the "Screen Door" effect. Pixels, when projected, didn't sit right next to each other; there were spaces between them. Now when you started a film, you'd see this and you would end up seeing nothing BUT the spaces. However as soon as you got into watching and, more importantly, enjoying the film, that went away. You never saw the flaws in the display, because you were concentrating on the content.

The same is true of games. Sure, you power up a game and ooo...look, it’s super high res, and ever so smooth! But 10 minutes later, you're deep in the game and couldn't give a crap about the extra pixels, or slightly smoother game, all you care about is the responsiveness of it all. If it moves when you tell it to, you're happy.

So what about the 1080p aspect? Well years ago when 1080p started to become the norm, and shops had both 720p and 1080p large screen TVs in store, I happened to be in one where they had the same model, one 1080p and one 720p, hanging right next to each other, playing the same demo. I went right up to them, and could still hardly tell the difference. These were two 50" displays, yet with my nose almost touching them, it was hard to tell where the extra pixels were. Now, if you have a 1080p projector, and are viewing on a 2 to 3-meter-wide screen, I'm pretty sure you'll notice, but on a TV? When you're running/driving at high speed - no chance. Every article about resolution in games shows stills, and this is the worst thing to use, as that's not how you play games. It's also worth remembering that at that moment in time you're not playing, so you’re not concentrating on the game, you’re doing nothing but searching for extra pixels, so it's yet another con really.

So what’s the benefit to running slower, at a slightly reduced resolution? Well, 1080p at 30fps means you can draw at least twice as much. That's a LOT of extra graphics you can suddenly draw, and even if you don't need/want to draw that much more, it makes your coding life MUCH simpler. You no longer have to struggle to maintain the frame rate, or worry about when there is a sudden increase in particles or damage - or just more characters on screen.

What about resolution? Well 720p is still really high-res. There are still folk getting HD TVs and watching standard definition TV thinking it's so much sharper than it used to be! The mass market usually doesn't "get" what higher res is until they see it side by side, and once things are moving and they are concentrating on other things, they will neither know nor care.

At 720p, 30fps you can draw over 4 times as much as at 1080p, 60fps (2.25 times fewer pixels per frame, and twice as long to draw each frame, so 4.5 times in total). That's a LOT of extra time and drawing. Imagine how much more detailed you could make a scene with that amount of extra geometry or pixel throughput. Even at 3 times, you could leave yourself a massive amount of spare bandwidth to handle effects and keep gameplay smooth.
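To put rough numbers on that, here's a minimal sketch that just counts the pixels each mode has to produce per second. The helper names (PixelsPerSecond, BudgetRatio) are made up for illustration, and it assumes cost scales purely with pixel count, which ignores geometry, CPU work and everything else.

```cpp
#include <cstdint>

// Pixels that have to be produced every second for a given mode.
constexpr std::int64_t PixelsPerSecond(int width, int height, int fps)
{
    return static_cast<std::int64_t>(width) * height * fps;
}

// How many times more per-pixel budget the cheaper mode leaves you
// (pure pixel count only - an assumption, not a full performance model).
constexpr double BudgetRatio(std::int64_t expensive, std::int64_t cheap)
{
    return static_cast<double>(expensive) / static_cast<double>(cheap);
}

constexpr std::int64_t k1080p60 = PixelsPerSecond(1920, 1080, 60); // 124,416,000
constexpr std::int64_t k1080p30 = PixelsPerSecond(1920, 1080, 30); //  62,208,000
constexpr std::int64_t k720p30  = PixelsPerSecond(1280,  720, 30); //  27,648,000

// BudgetRatio(k1080p60, k1080p30) is 2.0, BudgetRatio(k1080p60, k720p30) is 4.5
```

So dropping the frame rate alone doubles the budget, and dropping the resolution as well takes it to 4.5 times.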

2D games "probably" won't suffer too much from this problem, although on mobile web or consoles like the OUYA they will, as the chips/tech themselves just aren't very quick.

So rather than spend all your time with tech and struggle to maintain a frame rate most gamers won't notice, shouldn't you spend all your time making a great game instead? If you're depending on pretty pixels to make your game enjoyable, it probably isn't a great game to start with, and you should really fix that.

Games with pretty pixels sell a few copies, truly great games sell far more, and while it's true that games with both will sell even more, the first rule of game development is that only games you release will sell anything at all, and while playing with tech is great fun, constant playing/tuning doesn't ship the game.






Wednesday, September 17, 2014

When Scotland changes forever.....

In a world of history, a truly historic event is rare, but that's what Scotland gets tomorrow. On the 18th of September 2014, we the people of Scotland, get to vote on Independence. We get to decide whether we break away from the United Kingdom, or whether we stay. But either way, things look like they'll change, for the worse or better is anyone's guess.

The Stay together campaign is promising loads of new powers from Westminster, and that the UK will stay strong, in both military and monetary might. They say everything a Yes vote promises is dangerous, and too big a risk to gamble on. Perhaps they're right.... perhaps.

Or perhaps not. The Yes campaign obviously states the exact opposite, that staying will continue the downward spiral of Scotland, and more and more power and money going to Westminster. They will continue to take Scotland's Oil and squander it, while giving little or nothing back. They also say the so-called powers that are on offer, will never appear and not to trust them.

A couple of years ago, I started out firmly in the No camp, but over time came over to be a resounding Yes. So what changed my mind?

Well, at the start I was concerned with things like defence, and thought that being an island nation, we can defend our borders better together,  and that leaving would make travel ultra complicated - at least for a few years.

But over time I realised that just like with other nations, partnerships could be done for defence - and probably will be, while travel disruption will actually be fairly minimal as I'm still on a UK passport - that isn't going to change.

The other thing I came to realise was that Scotland is just as capable of running her affairs as Westminster, after all there have been both Scottish Prime Ministers, and Scottish Chancellors. We're a smaller country with amazing natural resources, something Westminster really want to keep a hold of, and I do wonder if it wasn't for Oil and renewable energies (not to mention some place to keep their nuclear arsenal), would they care so much?

I also don't trust the current promise of new powers, because while the "current" leaders may indeed be sincere, there is no way they can promise. Any set of powers MUST go through current parliamentary procedures and be approved by both the House of Commons, and probably the House of Lords. Backbenchers of all parties - not to mention some grass roots members - have expressed distress, and no small disgust, at additional powers being promised without proper discussion and review. So chances are, whatever comes - IF they come at all - won't be what they are promising, nor what No voters are expecting.

The ironic thing is that the SNP wanted "Devo-Max" (which is what the offer of new powers is) on the ballot paper as an option, but Westminster refused, saying it MUST be a yes/no only. If it had been on, I suspect it would have been an easy victory for the No/Devo-Max folk.

There's also the argument of Money, and what currency to use. I'm indifferent to this at best. We can of course use the English pound, or we could use the Euro or even just make our own. I don't really care either way. I suspect it'll be a few years uncertainty, and then stabilise. Scotland is, or would be, a wealthy nation. Oil rich, great renewable energy, and we export more than we import.

Why does the Better Together campaign think Scotland would struggle? What other Oil rich country struggles? And this isn't even considering the renewables. Scotland is currently on track to have 100% of its power generated by renewable energy - we're already at 40%.

Not that I actually think we'll win to be honest. While there have been huge - HUGE - Yes rallies, I also think those voting No are more the kind to just watch, stay at home... then vote No. I know this because many of my older relatives are No's, and they are exactly like this. On top of this there are also the undecideds, and they may well swing it either way.

But the thing I'm truly excited about, is the level of engagement on this vote. Unlike a general election where the public believes their vote counts for nothing - especially in Scotland, everyone here feels their single vote could swing things, so they HAVE to take part. It's amazing. There has been a 97% registration rate - unheard of in any election, and they are expecting 80-90% of these to actually vote. That's astounding.

So, yes.... I'm looking forward to tomorrow, to take the leap and cast my vote, and while I don't "think" we'll win - I'm certainly not sure, and I'm excited to hear the voice of Scotland - virtually ALL of Scotland.

So I'll vote Yes, and I'll hope everyone else wants to take that leap of faith with me, and believe that Scotland is best suited to govern Scotland, because while it might be a tricky road ahead, nothing truly worthwhile is ever easy, and this is something worth taking that leap of faith for, not just for me, but my kids and my kids' kids.

Wish us luck, no matter how the vote goes!!

Sunday, August 24, 2014

The beginners guide to Ray Tracing

So, I was asked recently on Twitter how Ray Tracing works, but it's not something that you can easily explain in 140 characters, so I thought I'd do a small post to explain it better.

So... the basic idea in rendering a scene in graphics is to have a near and a far plane, and the corners of these two planes define a "box" in 3D space, and that's what you're going to render - everything in the "box". Orthographic projections use a straight "box", where the far plane is the same size as the near plane, whereas perspective uses a far plane that is bigger than the near one - meaning more appears in the distance than closer to you (as you would expect).

Doing Ray Tracing is no different. First create a box aligned with the Z axis, so the planes run along the "X" axis. Something like.... (values not real/worked out!)

Near plane  -100,-100,0  (top left)  100,100,0 (bottom right)
Far plane     -500,-500,200  (top left)  500,500,200 (bottom right)

This defines the perspective view we're going to render.

Now, how you "step" across the near plane, determines your resolution. So, let's say we're going to render a 320x200 view, then we'd step 320 times across the X axis, while we step 200 times down the Y.
This means we have 320 steps from -100,-100,0  to  100,-100,0 (a step of 0.625 on  X). And you then do the same on Y (-100 to 100 giving a DeltaY of 1.0). With me so far?


So, this image shows the basics of what I'm talking about. The small rectangle at the front is the near plane, and the large one at the back; the far plane. Everything inside the white box in 3D is what we consider as part of our view. If we wanted to make a small 10x10 ray traced image, then we'd step along the top 10 times - as shown by the red lines. So we have the 8 steps and the edges, giving us 10 rays to cast.

So let's go back to our 320x200 image, and consider some actual code for this. (This code isn't tested, it's just pseudo code.)

float fx, fy, fdx = 3.125f, fdy = 5.0f;   // far plane steps: 1000/320 and 1000/200
float nz = 0.0f;
float fz = 200.0f;
fy = -500.0f;
for (float ny = -100.0f; ny < 100.0f; ny += 1.0f) {
    fx = -500.0f;
    for (float nx = -100.0f; nx < 100.0f; nx += 0.625f) {
        vec4 colour = CastRay(nx, ny, nz, fx, fy, fz);   // raytrace a pixel
        fx += fdx;
    }
    fy += fdy;
}

As you can see, it's a pretty simple loop really. This gets a little more complicated once the view is rotated, as each X,Y,Z has its own delta, but it's still much the same, with an outer loop of X and Y dictating the final resolution. In fact, at that point the outer X and Y loops are usually just from 0 to 320, and 0 to 200 (or whatever the resolution is).

Assuming we have a background colour, then CastRay() would currently return only the background colour, as we have nothing in the scene! And when you get right down to it, this is the core of what ray tracing is. It's not that complicated to define a view, and start to render it. The tricky stuff comes when you start trying to "hit" things with your rays.


Let's assume we have a single sphere in this scene, then the cast ray function will check a collision from the line nx,ny,nz  to  fx,fy,fz with our sphere (which is a position and radius). This is basic maths stuff now - Ray to Sphere collision. There's a reasonable example of ray to sphere HERE.
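As a rough idea of what that test looks like, here's a minimal sketch of the standard quadratic ray to sphere check, treating the ray as the line from the near point to the far point. The names (Vec3, Sphere, RayToSphere) are my own for illustration, not from any particular library, and it only handles the simple cases.

```cpp
#include <cmath>

struct Vec3   { float x, y, z; };
struct Sphere { Vec3 centre; float radius; };

Vec3  Sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Tests the line from nearPt to farPt against the sphere.
// On a hit, t is set so that the hit point is nearPt + t * (farPt - nearPt),
// with t between 0 (near plane) and 1 (far plane).
bool RayToSphere(Vec3 nearPt, Vec3 farPt, const Sphere& s, float& t)
{
    Vec3 d = Sub(farPt, nearPt);      // direction (not normalised)
    Vec3 m = Sub(nearPt, s.centre);   // sphere centre to ray start

    // Solve |m + t*d|^2 = radius^2, i.e. a*t^2 + b*t + c = 0
    float a = Dot(d, d);
    float b = 2.0f * Dot(m, d);
    float c = Dot(m, m) - s.radius * s.radius;
    if (a == 0.0f) return false;      // near and far are the same point

    float disc = b * b - 4.0f * a * c;
    if (disc < 0.0f) return false;    // the line misses the sphere entirely

    float root = std::sqrt(disc);
    float t0 = (-b - root) / (2.0f * a);   // entry point
    float t1 = (-b + root) / (2.0f * a);   // exit point
    if (t0 >= 0.0f && t0 <= 1.0f) { t = t0; return true; }
    if (t1 >= 0.0f && t1 <= 1.0f) { t = t1; return true; }  // started inside
    return false;
}
```

Returning t rather than just true/false is deliberate, as the distance down the ray is what you need later to work out which object is closest.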


So once you have that code in place, and you decide you've hit your sphere, you'll return the colour of your sphere. Pretty easy so far. So what if we add another sphere into the scene? Well, then the ray cast will now simply loop through everything in the scene no matter what we add; spheres, planes, triangles, whatever. 


Now we have a scene with lots of things being checked and hit, and we return a single colour for the CLOSEST collision point (giving us the nearest thing we've hit). This is basically our Z-Buffer! It means things can intersect without us having to do anything else. The closest distance down the ray to the intersection point is the nearest thing to us, so that's the one we see first.
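A minimal sketch of that "loop over everything and keep the closest" idea might look like the following. It assumes a scene made only of spheres, and the types and the five-argument CastRay signature here are hypothetical, chosen just to keep the example self-contained.

```cpp
#include <cmath>
#include <vector>

struct Vec3   { float x, y, z; };
struct Colour { float r, g, b; };
struct Sphere { Vec3 centre; float radius; Colour colour; };

Vec3  Sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Nearest t in [0,1] where nearPt + t*(farPt - nearPt) enters the sphere.
bool HitSphere(Vec3 nearPt, Vec3 farPt, const Sphere& s, float& t)
{
    Vec3 d = Sub(farPt, nearPt), m = Sub(nearPt, s.centre);
    float a = Dot(d, d), b = 2.0f * Dot(m, d);
    float c = Dot(m, m) - s.radius * s.radius;
    float disc = b * b - 4.0f * a * c;
    if (a == 0.0f || disc < 0.0f) return false;
    t = (-b - std::sqrt(disc)) / (2.0f * a);
    return t >= 0.0f && t <= 1.0f;
}

// Check the ray against everything and return the colour of the closest hit.
Colour CastRay(Vec3 nearPt, Vec3 farPt,
               const std::vector<Sphere>& scene, Colour background)
{
    float closest = 2.0f;          // anything above 1 means "nothing hit yet"
    Colour result = background;
    for (const Sphere& s : scene) {
        float t;
        if (HitSphere(nearPt, farPt, s, t) && t < closest) {
            closest = t;           // this is the per-ray "Z-Buffer" test
            result = s.colour;
        }
    }
    return result;
}
```

The order the objects are stored in doesn't matter, because only the smallest t survives the loop.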


The next simplest thing we might try is reflections. In realtime graphics, this is pretty hard, but in Ray Tracing, it's really not. Let's say we have 2 spheres, and the ray we cast hits the 1st one. Well, all we now need to do is calculate the "bounce", that is the new vector that we'd get from calculating the incoming ray, and how it hits and bounces off the sphere. Once we have this, then all we do is call cast ray again, and it'll fly off and either hit nothing - or something. If it hits something, we return the colour and this is then merged into the initial collision and returned. Now, how you merge the colour is basically how shiny the thing is! So if it's VERY shiny, then a simple LERP with a 1.0 on the reflected colour would give a perfect reflection, whereas if it's 0.5, then you would get a mix of the first and second spheres, and if it's 1.0 on the original sphere, then you would only get the original sphere colour (and would probably never do the bounce in the 1st place!)


So the number of "bounces" you allow is basically the level of recursion you do for reflections. Do you only want to "see" one thing in the reflection? Then one bounce it is. Do you want to see a shiny object, in another shiny object? Then up the recursion count! This is one of the places Ray Tracing starts to hurt, as you can increase this number massively and doing so will also increase the number of rays being cast, and the amount of CPU time required.
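Here's a rough sketch of how the bounce, the LERP and the recursion limit could fit together. It assumes sphere-only scenes and rays given as an origin plus a unit length direction; Trace, Reflect, Lerp and the shininess field are illustrative names of my own, not a definitive implementation.

```cpp
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
Vec3  operator+(Vec3 a, Vec3 b)  { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3  operator-(Vec3 a, Vec3 b)  { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3  operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
float Dot(Vec3 a, Vec3 b)        { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3  Normalise(Vec3 v)          { return v * (1.0f / std::sqrt(Dot(v, v))); }

// Mirror the incoming direction d about the unit surface normal n.
Vec3 Reflect(Vec3 d, Vec3 n) { return d - n * (2.0f * Dot(d, n)); }

// Linear interpolation: t = 0 gives a, t = 1 gives b.
Vec3 Lerp(Vec3 a, Vec3 b, float t) { return a + (b - a) * t; }

// shininess: 0 = only the sphere's own colour, 1 = a perfect mirror.
struct Sphere { Vec3 centre; float radius; Vec3 colour; float shininess; };

// Distance along a unit-length ray to the sphere, or negative on a miss.
float HitSphere(Vec3 origin, Vec3 dir, const Sphere& s)
{
    Vec3 m = origin - s.centre;
    float b = Dot(m, dir);
    float c = Dot(m, m) - s.radius * s.radius;
    float disc = b * b - c;
    if (disc < 0.0f) return -1.0f;
    return -b - std::sqrt(disc);
}

Vec3 Trace(Vec3 origin, Vec3 dir, const std::vector<Sphere>& scene,
           Vec3 background, int bouncesLeft)
{
    const Sphere* nearest = nullptr;
    float closest = 1e30f;
    for (const Sphere& s : scene) {
        float t = HitSphere(origin, dir, s);
        // The small offset stops a bounce re-hitting the surface it left.
        if (t > 0.001f && t < closest) { closest = t; nearest = &s; }
    }
    if (!nearest) return background;
    if (bouncesLeft <= 0 || nearest->shininess <= 0.0f) return nearest->colour;

    Vec3 hit    = origin + dir * closest;
    Vec3 normal = Normalise(hit - nearest->centre);
    Vec3 seen   = Trace(hit, Reflect(dir, normal), scene, background,
                        bouncesLeft - 1);
    return Lerp(nearest->colour, seen, nearest->shininess);
}
```

Each extra bounce can add another full pass over the scene for every pixel, which is exactly why raising the recursion count gets expensive so quickly.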


We should now have a shiny scene with a few spheres all reflecting off each other. What about Texture Mapping? Well, this is also pretty simple. When a collision occurs, it'll also give you a "point" of collision, and now all you need to do is project that into the texture space and then return the pixel colour at that point. Hey presto - texturing. This is very simple on say a plane, but gets harder as you start to do triangles, or spheres - or displacement mapping, or bezier patches!
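For the simple plane case, a minimal sketch of "project the hit point into texture space" could look like this. It assumes the plane is described by an origin and two unit axes lying in it, and that the texture tiles every tileSize world units; TexturedPlane, Texture and SamplePlane are hypothetical names for illustration only.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Texture { int width, height; std::vector<std::uint32_t> pixels; };

// A plane described by a point on it and two unit axes lying in it.
struct TexturedPlane { Vec3 origin, uAxis, vAxis; float tileSize; };

// Wrap any value into 0..1 so the texture repeats across the plane.
float Wrap01(float v) { return v - std::floor(v); }

// Turn a 3D collision point on the plane into a texel colour.
std::uint32_t SamplePlane(const TexturedPlane& plane, const Texture& tex, Vec3 hit)
{
    Vec3 rel = {hit.x - plane.origin.x, hit.y - plane.origin.y,
                hit.z - plane.origin.z};
    float u = Wrap01(Dot(rel, plane.uAxis) / plane.tileSize);
    float v = Wrap01(Dot(rel, plane.vAxis) / plane.tileSize);

    int tx = static_cast<int>(u * static_cast<float>(tex.width));
    int ty = static_cast<int>(v * static_cast<float>(tex.height));
    if (tx >= tex.width)  tx = tex.width - 1;    // guard against rounding
    if (ty >= tex.height) ty = tex.height - 1;   // right at the tile edge
    return tex.pixels[static_cast<std::size_t>(ty) * tex.width + tx];
}
```

This is nearest-texel sampling with no filtering, which is the simplest thing that works; triangles and spheres need their own way of getting from the hit point to u and v.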


For things like glass and water, you follow the same rule as reflection, except the "bounce" is a refract calculation. So instead of coming off, the ray bends a little as it carries on through the primitive you've hit. Aside from that, it's pretty much the same.
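The "bend" itself is usually worked out with Snell's law. Here's a minimal sketch of that calculation in vector form, assuming a unit length incoming direction, a unit normal facing back towards the incoming ray, and eta being the ratio of the two refractive indices (the one you're leaving over the one you're entering). The function name and signature are my own.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// incoming : unit direction of the ray arriving at the surface
// normal   : unit surface normal, pointing against the incoming ray
// eta      : refractive index we are leaving / index we are entering
// Returns false on total internal reflection (no refracted ray exists).
bool Refract(Vec3 incoming, Vec3 normal, float eta, Vec3& out)
{
    float cosIn = -Dot(incoming, normal);
    float k = 1.0f - eta * eta * (1.0f - cosIn * cosIn);
    if (k < 0.0f) return false;

    float scale = eta * cosIn - std::sqrt(k);
    out = { incoming.x * eta + normal.x * scale,
            incoming.y * eta + normal.y * scale,
            incoming.z * eta + normal.z * scale };
    return true;
}
```

With eta set to 1.0 the ray carries straight on unchanged, and when the function returns false you would normally fall back to a plain reflection bounce.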


Now... lighting. Again, in realtime graphics, this is tricky, but in Ray Tracing it's again pretty easy. So let's say you've just hit something with your first ray, all you now do is take this point, and create a new ray to your light source, and then cast again. If this new ray hits anything, it means there's something in the way, and your pixel is in shadow. You'll do this for all light sources you want to consider, and merge the shadows together, blending with the surface pixel you hit in the first place, making it darker. And that's all lighting is (at a basic level). Area lights, spot lights and the rest get more interesting, but the basics are the same. If you can't see the light, you're in shadow.
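A minimal sketch of that shadow test, assuming point lights and a sphere-only scene, might be as follows. Blocks and LightVisibility are made-up names, and "merging" the shadows here is just the fraction of lights that can see the point, which is one simple choice among many.

```cpp
#include <cmath>
#include <vector>

struct Vec3   { float x, y, z; };
struct Sphere { Vec3 centre; float radius; };

Vec3  Sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// True when the sphere gets in the way of the straight line from -> to.
bool Blocks(const Sphere& s, Vec3 from, Vec3 to)
{
    Vec3 d = Sub(to, from), m = Sub(from, s.centre);
    float a = Dot(d, d), b = 2.0f * Dot(m, d);
    float c = Dot(m, m) - s.radius * s.radius;
    float disc = b * b - 4.0f * a * c;
    if (a == 0.0f || disc < 0.0f) return false;

    float root = std::sqrt(disc);
    float t0 = (-b - root) / (2.0f * a);
    float t1 = (-b + root) / (2.0f * a);
    const float eps = 1e-3f;   // ignore the surface we're standing on
    return (t0 > eps && t0 < 1.0f) || (t1 > eps && t1 < 1.0f);
}

// Fraction of lights that can see the point: 1 = fully lit, 0 = all in shadow.
float LightVisibility(Vec3 point, const std::vector<Vec3>& lights,
                      const std::vector<Sphere>& scene)
{
    if (lights.empty()) return 0.0f;
    int visible = 0;
    for (const Vec3& light : lights) {
        bool blocked = false;
        for (const Sphere& s : scene) {
            if (Blocks(s, point, light)) { blocked = true; break; }
        }
        if (!blocked) ++visible;
    }
    return static_cast<float>(visible) / static_cast<float>(lights.size());
}
```

You would then multiply the surface colour by this value (perhaps with a little ambient light added) to darken the pixel.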


So how do you do all this in realtime? Well, now it gets fun! The simple view is to test with everything! And this will work just fine, but it'll be slow. The 1st optimisation is to only test with things in your view frustum. This will exclude a massive amount - and even a simple test to make sure something is even close to it, will help. After this, it gets tricky. A BSP style approach will help, and only test regions that the RAY itself intersects. This again will exclude lots, and help speed things up hugely. 


(no idea whose image this is BTW)


After this...well, this is why we have large graphics conferences like SIGGRAPH. Professionals from the likes of Pixar and other studios and universities spend long hours trying to make this stuff better and faster, so reading up on ideas from these people would be a good idea.


Lastly....  There is also some serious benefit to threading here. Parallelisation is a big win in Ray Tracing as each pixel you process is (usually) totally contained, meaning the more CPU cores you have, the faster you'll go! Or, if you're really adventurous you can stick it all on the GPU and benefit from some serious parallelisation!!


Wednesday, January 01, 2014

Improving the runner

So, I thought it might be fun to look back at some of the changes we've made to the runner, why, and what impact they've had on performance. Back when we started, the main complaint with Game Maker was that it was too slow, and that for more complex games, it just didn't cut it. This is clearly not the case any more, so what's really changed? Well, this only really affects the C++ runner, but let's take a look and see what we did...

WAD

When we started on the PSP port, the 1st thing we did was to remove all the compression from the output. The game was in fact doubly compressed (which is nuts BTW - it's not like it's going to get magically smaller again!), and on the PSP this led to loading times of around 40 seconds to a minute in even very simple games. So, we changed the whole output, altering it into an IFF based WAD file, and this meant it could be loaded incredibly quickly, and was ready to go as soon as it loaded. This was a huge change code wise, and touched every element of Game Maker. However, this change alone meant games would load either instantly, or in a few seconds.

Virtual Machine

Game Maker was originally split into 2 parts, scripting, and D&D (Drag and Drop), and believe it or not, there were two totally separate code paths for each. On top of this, everything was compiled on load, meaning there was another noticeable pause when games loaded. So, the first thing that was done was we changed the script engine to use a virtual machine, which meant we could optimise small paths of code rather than huge chunks - twice. This gave a modest boost - not as much as we hoped, but this was due to the way GameMaker dealt with its variables, and how things were cast, and we were stuck with that. The next thing we did was to remove the D&D path, and then compile all D&D actions into script so they could then be compiled in the same manner. Lastly, we pre-compiled everything. This removed the script from the final game - finally removing the worry of simple decompilation - but it also sped the loading of a game up yet again. It was now at the point where a Windows game would load and run almost instantly.


Rendering

There's quite a number of changes here, so I'll break it down a little further. But basically, the old Delphi runner had some major CPU->GPU stalls, and so we had to look into fixing them.

Hardware T&L

This was introduced in GM8.1, and it was the first real boost for performance. All later versions of Studio also use this "Hardware Transform and Lighting" mode. By using this, the GPU does all the transformations, and leaves the CPU free for running the game. Up until we switched it, after you submitted a sprite, DirectX then used the CPU to transform all the vertices into screen space before passing them on to the GPU - this places severe limits on how much you can push through the graphics pipeline, so it was changed on the first version of GM8.1 we released.

DirectX 9

The first thing we did was to upgrade to DX9. This meant we were on a version which Microsoft still maintained properly, and gave us access to more interesting tools down the line - like better shaders. While doing this, we also introduced a proper fullscreen mode, and while this may not seem like much, when in fullscreen mode, things do render slightly quicker. Also...DX9 is just a little faster than DX8, and the upgrade is a pretty simple process (takes time, but it's simple enough).


Texture pages

One of the biggest problems with Game Maker was that every sprite and every background was given its own texture page, and while this might seem like a good idea as it means you can easily load things in, and throw them away, it utterly destroys any real performance. So, we set about putting all images on texture pages (TPages), but this was more complex as Game Maker allows non-power of 2 images.

Standard methods involved sub-dividing images down, splitting them in 2 each time until you allocate the sprite, and fill up the whole sheet. This is very fast, and very efficient. However... we needed something more. So what we did was create a multi-stream, virtual rectangle system which would clip to everything as you added any sprite of any size. I came up with this years ago, and it was incredibly powerful, and very fast - but it was recursive, and complex. Still, having written it before, I did it again and we had a very nice texture packer for any size of image.

Next we noticed that current Game Maker developers were incredibly wasteful of images. They would have huge sprites with nothing but space in them, so they could be aligned easily with screen elements. A cool fix for this was to crop out anything which had a 0 alpha value. You basically search the sprite for a bounding box that isn't 0 alpha, and then crop it, and remember all the offsets to enable you to draw it again as normal. This is a cool trick that helps us pack even more onto a page, which then helps with batching as well.
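The bounding box search is simple enough to sketch. This is only an illustration, not the actual runner code: it assumes 32-bit pixels stored row by row with alpha in the top byte, and CropRect and FindOpaqueBounds are names I've made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// x,y is the offset of the cropped area inside the original sprite, which is
// what you need to remember so it can be drawn back in the right place.
struct CropRect { int x, y, width, height; };

// Assumes row-major 32-bit pixels with alpha in the top 8 bits.
CropRect FindOpaqueBounds(const std::vector<std::uint32_t>& pixels,
                          int width, int height)
{
    int minX = width, minY = height, maxX = -1, maxY = -1;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            std::uint32_t p = pixels[static_cast<std::size_t>(y) * width + x];
            if ((p >> 24) == 0) continue;      // fully transparent, skip it
            if (x < minX) minX = x;
            if (x > maxX) maxX = x;
            if (y < minY) minY = y;
            if (y > maxY) maxY = y;
        }
    }
    if (maxX < 0) return {0, 0, 0, 0};         // nothing visible at all
    return {minX, minY, maxX - minX + 1, maxY - minY + 1};
}
```

Only the cropped rectangle then needs to go onto the texture page, with the x,y offset added back on when the sprite is drawn.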


Batching

Speaking of batching.... Since the old version changed texture all the time, it could never hope to batch more than a couple of triangles at once, and this meant the CPU was forever getting in the way. Modern graphics hardware wants you to send as much as you can in one shot, then to go away and leave it alone!

So, the first thing we did was go through and change all the primitive types. The old way (TriFan) is unbatchable, and so you would always be stuck with 2 triangles per batch, and that's useless. So we changed them to the most batchable prim we could use, TriList. I wrote a new vertex manager that you could allocate vertices from, and it detected if there were any changes, or breaks in batches, so all the engine had to do was draw, and the underlying API would auto-batch it for us.
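To give a feel for the idea, here's a heavily simplified sketch of a batcher that queues sprites as triangle lists and only submits when the texture page changes. This is not the real vertex manager; SpriteBatcher and its members are hypothetical, texture coordinates are simplified to the whole page, and the "submit" is just a counter.

```cpp
#include <vector>

struct Vertex { float x, y, u, v; };

class SpriteBatcher {
public:
    // Queue one sprite as two triangles (a triangle list: 6 vertices).
    void DrawSprite(int texturePage, float x, float y, float w, float h)
    {
        // A texture change breaks the batch, so submit what we have first.
        if (!m_vertices.empty() && texturePage != m_currentPage) Flush();
        m_currentPage = texturePage;

        const Vertex tl{x,     y,     0.0f, 0.0f};
        const Vertex tr{x + w, y,     1.0f, 0.0f};
        const Vertex bl{x,     y + h, 0.0f, 1.0f};
        const Vertex br{x + w, y + h, 1.0f, 1.0f};
        m_vertices.insert(m_vertices.end(), {tl, tr, bl, tr, br, bl});
    }

    // Submit everything queued so far as a single draw call.
    void Flush()
    {
        if (m_vertices.empty()) return;
        ++m_drawCalls;   // a real engine would hand m_vertices to the GPU here
        m_vertices.clear();
    }

    int DrawCalls() const { return m_drawCalls; }

private:
    std::vector<Vertex> m_vertices;
    int m_currentPage = -1;
    int m_drawCalls = 0;
};
```

Because every sprite is just six more vertices in the same list, any number of sprites sharing a texture page can go out together, which a fan per sprite can never do.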

Couple this with the new texture pages, and if you were just drawing sprites, you had a good chance of actually drawing multiple sprites in one render call. In fact, many games (like They Need to be Fed) can render virtually the whole game in one render call depending on the size of the texture page.

There are still some things which can break the batching (obviously), but with careful set up, rendering sprites is no longer the limiting factor of your game.


Blend States

Blend states are always tricky, and in the past I've tended to use a layering system - something that might appear in later versions of Studio or more likely GM: Next, as it helps manage state changes much better. However, we got a shock early in the year when we discovered how GM users actually manage state, and so we worked hard to hide the pain of management from our users.

Now a user can simply set state and reset state as they see fit, and aside from the GML call overhead, we handle all the blend state batching behind the scenes. For example, if you set a blend state, draw something, reset it, then set it again, draw something and then reset it, we recognise this, and both items will be drawn together in a single batch, making the rendering code much more efficient without the developer ever having to care.

It'll also handle more extreme cases, where you set blends several times, then set a different one, and then set more blends several times. The engine will also recognise this, and will only submit 3 batches - or however many blend state batches are actually needed. This is incredibly powerful, and a major player in rendering performance.
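One way to get that behaviour, sketched very roughly, is to treat set and reset as nothing more than "remember what was asked for", and only compare states at the moment something is actually drawn. This is an illustration of the idea under that assumption, not the engine's real code; BlendBatcher and BlendMode are hypothetical names.

```cpp
enum class BlendMode { Normal, Additive, Multiply };

class BlendBatcher {
public:
    // Setting or resetting state is cheap: nothing is submitted here.
    void SetBlend(BlendMode mode) { m_requested = mode; }
    void ResetBlend()             { m_requested = BlendMode::Normal; }

    // Only at draw time do we check whether the state REALLY changed.
    void Draw()
    {
        if (m_batchOpen && m_requested != m_batchBlend) Flush();
        m_batchBlend = m_requested;
        m_batchOpen  = true;
    }

    // Submit the current batch (counted here; a real engine would render it).
    void Flush()
    {
        if (!m_batchOpen) return;
        ++m_batches;
        m_batchOpen = false;
    }

    int Batches() const { return m_batches; }

private:
    BlendMode m_requested  = BlendMode::Normal;
    BlendMode m_batchBlend = BlendMode::Normal;
    bool m_batchOpen = false;
    int  m_batches   = 0;
};
```

With this, set, draw, reset, set, draw, reset ends up as a single batch, because the state in force at both draws is identical and the resets in between never reach the hardware.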


Static Models

In the past, whenever you created a D3D model, it didn't actually create anything, it just remembered the prims and commands you sent, then replayed them back dynamically. This meant 3D rendering was terrible, and you would struggle to get above a few thousand polys.

We actually introduced this new concept in GM8.1, although it was more of a pet project for me at the time. Now it's a staple of how you should work. Studio will now let you build fully static models, meaning that when you're finished creating them, you'll be left with a single vertex buffer (or multiple depending on lines and points), that will simply be submitted to the hardware when you want to draw something.

This is basically as fast as it's possible to draw on modern hardware: you create a buffer full of vertices and submit them by simply pointing to the buffer. This is what all modern games do, it's what engines like Unity do. With this addition, you can quite literally draw models with hundreds of thousands of polys at hundreds of frames per second (depending on your graphics card of course!)


Shaders

Although a recent addition, this is a powerful performance tool. Up until this point, special effects would be done on surfaces, then processed by the CPU, most of the time, painfully so! But with all the previous optimisations in place, shaders were finally able to get their place and show you what the GPU was really able to do.

Effects such as shadows and environment mapping, and more 2D effects such as outlines, colour remapping or even full shader tilemaps, all meant that the CPU got more time to work on the game, rather than spending time trying to make things look pretty, especially when there was a much better way of doing it sitting there doing nothing.


execute_string()

Yeah... this old one. Well, this was a popular command, mainly because it let you be really lazy in the way you coded, and yeah.... it's not a terrible thing, but it was one of the worst performance hogs, and folk using Game Maker had come to rely on it, meaning it was pulling performance down for everyone. So why was it so bad? Well, unlike normal code, when you executed it, it had to compile the string every time before running it, and this was incredibly bad. Let's have a quick look at what's involved when compiling.

Lexical Analysis. First it had to scan the string, break it down into tokens, allocate memory for it and store a parse tree. Even a very simple string like "room_goto(MenuScreen)" means we had to parse the string, looking at each character, and breaking it into several tokens: the room_goto command token, find and verify the syntax of the command, and evaluate the contents of the brackets - in this case MenuScreen, which is also stored as a token. These are then stored as a parse tree, so that equations and evaluations are calculated correctly. So if you gave a calculation like "A = instance3.x+((instance.B*12)/instance2.C)", it would be evaluated correctly when executed.

Compilation. This then transforms the parse tree into byte code. Previously, it was transformed into a stream of tokens, and the engine just ran these, but with the virtual machine, we now compile into byte code. This memory also has to be allocated.

Running. Now the new script can be run. First it sets up a code environment (again, more code allocation), complete with the current instance so it can store things correctly, then the VM takes over and runs the code.

Freeing. This is the real crime. Once all this work is done, and the script has been run.... it's all thrown away. All the different memory blocks are freed, and when you try and run it again, it has to do this all again.
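Just to show how much character-by-character work even the first of those steps involves, here's a tiny sketch of a tokeniser for strings like the ones above. It's purely illustrative and nothing like the real compiler: Tokenise is a made-up function that only knows about names, numbers and single-character symbols.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Break a source string into name, number and single-character tokens.
std::vector<std::string> Tokenise(const std::string& source)
{
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < source.size()) {
        unsigned char ch = static_cast<unsigned char>(source[i]);
        if (std::isspace(ch)) { ++i; continue; }

        std::size_t start = i;
        if (std::isalpha(ch) || ch == '_') {
            // Names: letters, digits and underscores (room_goto, instance3)
            while (i < source.size() &&
                   (std::isalnum(static_cast<unsigned char>(source[i])) ||
                    source[i] == '_')) ++i;
        } else if (std::isdigit(ch)) {
            // Numbers: digits with an optional decimal point
            while (i < source.size() &&
                   (std::isdigit(static_cast<unsigned char>(source[i])) ||
                    source[i] == '.')) ++i;
        } else {
            ++i;   // brackets, operators and dots are one character each
        }
        tokens.push_back(source.substr(start, i - start));   // allocates!
    }
    return tokens;
}
```

Every token here is its own little memory allocation, and this is before a parse tree or any byte code has been built, all of which gets repeated on every single call.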

Knowing this... it's no wonder this command was such a hog. And execute_file was even worse, as it had to load the thing first, before doing any of this - and disk access is ALWAYS slow. So this is why this command was removed. You should never need it if you code properly anyway. It can be great for tools, but when everyone just sees it as a command they can use....well, they just blame Game Maker for not being quick enough so it had to go.



Conclusion

Now, there are obviously countless other optimisations in the runner, from ds_grid get/set being made much quicker, to the total rewrite of the ds_map, allowing for huge, lightning fast dictionary lookups, or from the way variables are now handled so that the VM can access them much quicker, or how YYC now translates everything into C++ so it can be natively compiled, thereby lifting what little lid there was left.

And don't forget, we're a multi-platform engine, and each platform gets the same kind of treatment as the Windows one does. Each platform has its own quirks, and its own set of performance enhancing tools, like for example the total rewrite of the HTML5 WebGL engine, that does things in a totally different way just so it can push things a little better for that platform, or the Windows RT platform, that has a custom DX11 engine designed and optimised just for RT devices. Each platform is taken on its own merits, and enhanced to best fit that device.

But no matter how you look at it, GameMaker: Studio has moved well beyond the mere learning tool it was once designed to be. It's no longer only used by beginners, or bedroom coders looking for some fun, it's now used by professional developers around the world because it delivers fast, cross platform development, with a truly fast engine that's always being improved.