I've been playing with trying to speed up the multiplexor sorting, and I thought I had a much quicker way of doing it too. The old method was dumb, really dumb. I'd run through all 16 sprites, get the smallest Y value one store it, set it to $FF and then loop again until there were none left. VERY simple stuff. However that means for the worst case its doing a 16*16 loop, with a preloop of 16 to setup some values.
So, I though, "How about an insertion linked list?" Simply add a sprite index to a 1D indexed-linked-list at the correct point. Now, worst case shouldn't be as bad as 16*16 since you will only ever check with whats in the list already. Now for the 1st few, that means theres only a couple of entrys there, while the last few will obviously check to n-1(ish). Now this sounds all very good, and I'd hoped that it double the speed.... however, that was not to be. Would you believe for the worst case, its only about 1 scanline faster! Damn!
It's still not working 100% but its mostly there, and while the code is around 72 bytes smaller, its much harder to follow (since linked lists under 6502 are tricky anyway) so I'm wondering which version I should take....
After the sort I need to copy all the sprite data into the IRQ buffer for displaying, so thats another N loop as well. On the plus side, when theres not as much on screen its quite a bit quicker, and only gets bad once a load of things come on. I guess this means that when I'm displaying lots of character sprites and only a couple of H/W sprites I'd win out over all; still, worst case is usually the ones to watch for.... *sigh*