SuperCPU Illuminated - Part 1



65816-Assembler?



Unfortunately the announced "Developer's Kit" from CMD is still not available.   Take a look at the Listings.


It's arrived!

There it is, the awe-inspiring Turbo card from CMD.  Everyone who sees it is fascinated by its speed and compatibility.  But how do you develop a program for the SuperCPU? How do you really use it?  This tutorial guides you bit by bit through the secrets of this marvel.


SuperCPU present?


First of all, it is important to be able to recognize the card.  Only when it's in the expansion port can you fall back on it; otherwise you have a normal C64 sitting in front of you, and the special programs aren't going to work. There are many ways to detect the SuperCPU; here we give two examples. At first, one would probably get the idea to let a CIA timer run to measure the speed. This method leads to incorrect results if the user has the switch in the 1 MHz position.  There is another way: In the SuperCPU there's a more modern incarnation of our 6502/6510, namely, the 65816.

In the 6502 there is a small, mostly unknown glitch with far-reaching consequences (don't worry, the processor still calculates correctly!): In decimal mode (switched on with SED) the Negative-Flag gets set incorrectly.  When LDA #$99 is executed in decimal mode, the N-Flag is set.  Add one to the value, and on the 6502 the N-Flag remains set, although the result is zero.  The 65816, by contrast, correctly determines that the result is zero (a positive number) and clears the flag.  We can use this to our advantage to find out whether a 6510 or a 65816 is doing the work: we switch on decimal mode, load $99 into the accumulator and add one.  Now we just need to test the Negative-Flag:  If it is set, the 6510 is working, otherwise the 65816 is the active processor. Listing 1.1 shows the routine.
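
The flag behaviour can also be tried out away from the hardware. What follows is a minimal Python model, not C64 code and not Listing 1.1: the function name `adc_decimal` and its `nmos` switch are made up for this illustration, and the arithmetic follows the commonly documented decimal-mode behaviour of the two processors.

```python
def adc_decimal(a, operand, carry_in=0, nmos=True):
    """Model a decimal-mode ADC; return (result, carry, n_flag).

    nmos=True  models the 6502/6510, which takes N from the
               intermediate sum before the final high-digit fix-up.
    nmos=False models the 65816, which takes N from the final result.
    """
    low = (a & 0x0F) + (operand & 0x0F) + carry_in
    if low > 9:
        # Decimal adjust of the low digit, carry into the high digit.
        low = ((low + 6) & 0x0F) + 0x10
    intermediate = (a & 0xF0) + (operand & 0xF0) + low
    total = intermediate
    if total >= 0xA0:
        total += 0x60  # decimal adjust of the high digit
    result = total & 0xFF
    carry = 1 if total >= 0x100 else 0
    source = intermediate if nmos else result
    n_flag = bool(source & 0x80)
    return result, carry, n_flag


# Roughly what SED / CLC / LDA #$99 / ADC #$01 would leave behind.
for name, nmos in (("6510", True), ("65816", False)):
    result, carry, n_flag = adc_decimal(0x99, 0x01, nmos=nmos)
    print(name, hex(result), carry, "N set" if n_flag else "N clear")
```

Both models arrive at the same accumulator value and carry; only the N flag differs, and that difference is what the detection routine branches on.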

Caution: There are some old turbocards for the C64 which use the 65816 processor, but they don't have nearly as many good qualities as the SuperCPU.  Apart from that, some C64 emulators behave like a 65816 but do not achieve the speed of the CMD card.

How can you be totally sure that you have a SuperCPU in front of you? Nothing could be easier:  There are some new, really interesting registers starting at $D070 and $D0B0.  One of them, $D0BC, helps us determine whether the SuperCPU is there or not.   To do so, test bit 7 of this register.  When the turbocard is working, this bit is always zero, while the stock C64 has a one in this position.   Take a look at Listing 1.2 to see how it's done.

Because you can't be 100% sure that every single C64 (without a SuperCPU) will give this result when this particular bit is examined, both methods in combination will let you know for sure.
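
As a rough illustration of combining the two checks, here is a small Python sketch. It is a model only: the function names are invented for this example, and the two inputs (the byte read from $D0BC and the N flag left behind by the decimal-mode addition) would have to be obtained on the real machine.

```python
SCPU_DETECT_REGISTER = 0xD0BC  # address named in the text


def supercpu_bit_clear(d0bc_value):
    """Bit 7 of $D0BC reads as zero when the SuperCPU is working."""
    return (d0bc_value & 0x80) == 0


def supercpu_present(d0bc_value, n_flag_after_bcd_add):
    """Combine the register test with the decimal-mode processor test.

    n_flag_after_bcd_add is the N flag after $99 + $01 in decimal
    mode: set on a 6510, clear on a 65816.
    """
    is_65816 = not n_flag_after_bcd_add
    return is_65816 and supercpu_bit_clear(d0bc_value)


print(supercpu_present(0x7F, False))  # 65816 and bit 7 clear
print(supercpu_present(0x7F, True))   # bit 7 clear, but a 6510 answered
```

Requiring both conditions guards against a stock C64 that happens to return a zero in bit 7, and against other 65816 cards that lack the register.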


Turbo or Normal?


So now we know that our SuperCPU is up and running.  But are we in Turbo-mode or the normal slow mode?  If our program is written for Turbo-mode, the SCPU must not be in normal mode. Still, it would be unprofessional to display a message every time saying, "Please turn on Turbo-mode".

It would be a lot better to show this message only if the card really is running at 1 MHz.   CMD has built in a very simple way to test this.  Bit 6 in the new register $D0B8 is high (one) if the card is in slow mode, otherwise it's zero.  In Listing 1.3 you can see how to program such a check with the corresponding reaction.
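
The check itself comes down to a single bit test. The following Python sketch models it on a value assumed to have been read from $D0B8; the function names are hypothetical and this is not Listing 1.3.

```python
SLOW_MODE_BIT = 0x40  # bit 6 of $D0B8, as described in the text


def in_slow_mode(d0b8_value):
    """True if bit 6 is set, i.e. the card is running at 1 MHz."""
    return bool(d0b8_value & SLOW_MODE_BIT)


def startup_message(d0b8_value):
    """Return the reminder only when it is actually needed."""
    if in_slow_mode(d0b8_value):
        return "Please turn on Turbo-mode"
    return None


print(startup_message(0x40))
print(startup_message(0x00))
```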

Speed-Switching


You may often find that you want to run programs which require the machine to be in 1 MHz mode.  Sometimes these programs won't run at all unless you are in normal mode.  The turbo card lets you change the speed with a switch, but you can actually do it with software too!  There are two registers in the $D07X region which are not meant to be polled, but to be written to.  To keep programs from changing them by accident, the registers must first be activated and afterwards deactivated again. All registers are write-sensitive, that is, a write access to the register is all it takes to select the desired option.  An STA, for example, does the job, and the value being stored does not matter.  The memory location that activates the remaining $D07X registers is $D07E.  An STA $D07E makes the other registers "visible" and "writable".  The registers at $D0BX can be polled at any time.
The developers have even considered the possibility that a Memory-Fill-Routine could accidentally fill up the registers.  But the memory location that deactivates the registers is directly behind it - $D07F - , with which the registers are immediately hidden again.  If the Fill-Routine comes from the other direction, nothing happens either: the register $D07F occurs again at $D07D.
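
To see why a fill routine from either direction leaves the registers switched off, the behaviour can be played through in a toy Python model. The class and function names are invented for this sketch, and the set of gated registers ($D074-$D077, the optimization registers mentioned later in this article) is an assumption chosen for illustration.

```python
ENABLE = 0xD07E
DISABLE = (0xD07D, 0xD07F)          # $D07F and its mirror at $D07D
GATED = range(0xD074, 0xD078)       # assumed: optimization registers


class RegisterGate:
    """Toy model of the write-sensitive enable/disable registers."""

    def __init__(self):
        self.enabled = False
        self.accepted = []  # gated writes that actually got through

    def write(self, address, value=0):
        # The value is ignored on purpose: only the write access counts.
        if address == ENABLE:
            self.enabled = True
        elif address in DISABLE:
            self.enabled = False
        elif address in GATED and self.enabled:
            self.accepted.append(address)


def fill(gate, addresses):
    """Simulate a memory fill sweeping across the given addresses."""
    for address in addresses:
        gate.write(address, 0xFF)
    return gate


upward = fill(RegisterGate(), range(0xD070, 0xD080))
downward = fill(RegisterGate(), range(0xD07F, 0xD06F, -1))
print(upward.enabled, upward.accepted)
print(downward.enabled, downward.accepted)
```

In both sweeps the enable register is hit between the two disable locations, so the gate is closed again before or after any gated register is touched. A deliberate sequence (write $D07E, write the register, write $D07F) is the only one in this model that gets a write through.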



Not So Fast!


The software-controlled speed switcher uses the same method.  Software switching has to be possible very quickly, so these registers do not need to be activated first.  The register that turns on turbo speed exists in two places for the same reason described above.
These are $D07B and $D079.  Exactly in between (at $D07A) is the register used to turn on 1 MHz mode.  So there's no need to worry: even if these registers are accidentally written to, the turbo card won't be shut down.  If you want your programs to use normal mode at some times and turbo mode at others, you can switch the SuperCPU to 1 MHz mode with an STA $D07A.
As soon as you want to have "full power", write to register $D07B, and you'll find yourself in turbo mode.  Turbo mode will stay active until $D07A is written to again.   In contrast to other turbo cards, the SuperCPU uses this very secure switching method.  You can be sure that a program won't change $D07A/$D07B by accident.   So the CMD card will always stay in the mode desired by the programmer.  Just be careful that you don't set the speed out of the range of the SuperCPU.
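
The effect of the mirrored register can be checked with a few lines of Python. This is a sketch under the reading used in the STA examples here, with $D07B (mirrored at $D079) selecting turbo and $D07A selecting 1 MHz; the function name `speed_after` is made up for the illustration.

```python
TURBO_ON = (0xD079, 0xD07B)   # $D07B and its mirror at $D079
TURBO_OFF = 0xD07A            # selects 1 MHz mode


def speed_after(writes, turbo=True):
    """Return True if the card ends up in turbo mode after the writes."""
    for address in writes:
        if address in TURBO_ON:
            turbo = True
        elif address == TURBO_OFF:
            turbo = False
    return turbo


# An accidental fill across the whole $D07X range, in either direction.
print(speed_after(range(0xD070, 0xD080)))
print(speed_after(range(0xD07F, 0xD06F, -1)))

# Deliberate switching: STA $D07A, later STA $D07B.
print(speed_after([0xD07A]))
print(speed_after([0xD07A, 0xD07B]))
```

Because the 1 MHz register sits between the two turbo locations, a sweep in either direction always touches a turbo register last.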


How is that possible?


Back in the day it was said that there would never be such a turbo card for the C64.   The C128 couldn't be used at 2 MHz in C64 mode without disabling the VIC chip.   The problem persisted for a long time, because the graphics chip of the C64 can't access the RAM on a turbo card directly. It was solved like this:  whenever something was written to the memory in the turbo card, it was simultaneously written to the C64's own memory, which the VIC can access easily.  This only works at 1 MHz, though. That sounds worse than it really is, because the card only has to be slowed down at the moment of the actual write, something that happens automatically anyway. That's the way it was for other turbo cards at least.

With the SuperCPU, the problem has been solved much more elegantly.  Programmers don't have to worry about it: the solution is a Cache-Byte.  With an STA the byte is put into the fast static memory of the SuperCPU and into the Cache-Byte.  The next instruction is then carried out right away!   Meanwhile the SuperCPU logic waits for the internal 1 MHz clock of the C64 and writes the Cache-Byte to the correct place in step with that clock frequency.  As a result, there's no delay. If, however, another write is due and the Cache-Byte is still not empty, the processor must insert a few waitstates.

As a coder you don't notice much of all this, but you have to know it to understand the behavior of the CMD card.  The speed loss depends on the program, but a factor of 10 - 15 will always be achieved.  Unfortunately, the double write procedure (called "mirroring") is necessary, because the SuperCPU would otherwise have to know whether it's dealing with graphical data that the VIC will want to use later (for example: screen ram, bitmap or sprite-area).  The CPU can't possibly know this, but the programmer knows it exactly!

Even Faster!


There are special SuperCPU optimization modes in existence which allow only certain programmer-chosen areas of memory to be mirrored. Every write to memory outside of this area will be carried out at full turbo speed!  This way you can get the optimal speed every time. There are four optimization modes which you can activate with an STA to the corresponding register: $4000-$8000 (VIC bank 1): $D075, $8000-$C000 (VIC bank 2): $D074, $0400-$0800 (only standard screen ram): $D076, no optimization (all memory will be mirrored; default): $D077.
To make sure that the write to the corresponding register takes effect, you must first enable the registers with an STA to $D07E, and afterwards disable them again with an STA to $D07F (see above).   Unfortunately there are no optimization modes for VIC banks 0 and 3, as these would have exceeded the space in the logic chip.  For SuperCPU projects it would be best to use VIC bank 1 or 2, or if you don't need any graphics (for example: just to do calculations), the screen optimization mode.
If, for example, you mirror VIC bank 2, the cache procedure described above and any resulting waitstates only come into play when writing to this memory area ($8000-$C000) - all other memory writes (as well as memory reads) will be carried out at maximum speed. Listing 1.4 shows a routine that uses raster time to display the difference between normal speed, turbo, and turbo with optimization mode.
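
The four modes can be summed up as a lookup from register to mirrored range. The Python sketch below does just that; the table and function names are hypothetical, and treating the upper bound of each range as exclusive (so that $8000 belongs to bank 2, not bank 1) is an assumption of this example.

```python
# Register address -> mirrored range (start inclusive, end exclusive).
OPTIMIZATION_MODES = {
    0xD074: (0x8000, 0xC000),    # VIC bank 2
    0xD075: (0x4000, 0x8000),    # VIC bank 1
    0xD076: (0x0400, 0x0800),    # standard screen RAM only
    0xD077: (0x0000, 0x10000),   # no optimization: mirror everything
}


def is_mirrored(address, mode_register=0xD077):
    """True if a write to address is mirrored in the given mode."""
    start, end = OPTIMIZATION_MODES[mode_register]
    return start <= address < end


# With VIC bank 2 mirrored, only writes into that bank go via the cache.
print(is_mirrored(0x8400, 0xD074))
print(is_mirrored(0x2000, 0xD074))
```

Any write for which `is_mirrored` is false in the chosen mode runs at full turbo speed; the rest are subject to the Cache-Byte behaviour.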

At some point you will in any case have to write to the area containing graphics information.  If you don't want to put up with speed loss when writing to this memory, you should make skillful use of the Cache-Byte in the SuperCPU.  A delay only occurs if the Cache-Byte has not yet been written out and a new write follows.  Instead of just letting the SuperCPU wait, you can fill the time yourself and execute a few more instructions which do not write to the graphic memory area but rather to a non-mirrored area.

This all happens at full speed.  Meanwhile the logic on the turbo card transfers the Cache-Byte to the C64 undisturbed.  This takes some 16 clock cycles (at 20 MHz!), during which various calculations or other things can execute.  Just try it out - the method described just now pays off if you need an extremely large amount of calculation time.  In most cases you will reach full speed.  It's always a good feeling to know how you can get more out of it...
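
The benefit of interleaving can be estimated with a very simple cycle count. The Python sketch below is a toy model, not a description of the real hardware timing: the 16-cycle drain time is taken from the figure above, while the 4-cycle cost per write and the function name `run` are assumptions made for illustration.

```python
DRAIN_CYCLES = 16  # fast cycles until the Cache-Byte is written out


def run(program, drain_cycles=DRAIN_CYCLES):
    """Return (total_cycles, wait_cycles) for a list of steps.

    Each step is ("mirrored", cycles) for a write into the mirrored
    area or ("other", cycles) for anything else.
    """
    clock = 0
    cache_free_at = 0
    waited = 0
    for kind, cycles in program:
        if kind == "mirrored":
            if clock < cache_free_at:
                # Cache-Byte still occupied: the CPU has to wait.
                waited += cache_free_at - clock
                clock = cache_free_at
            clock += cycles
            cache_free_at = clock + drain_cycles
        else:
            clock += cycles
    return clock, waited


back_to_back = [("mirrored", 4), ("mirrored", 4)]
interleaved = [("mirrored", 4), ("other", 16), ("mirrored", 4)]
print(run(back_to_back))
print(run(interleaved))
```

In this model both programs finish at the same time, but the interleaved one has spent the waiting period on 16 cycles of useful work instead of waitstates.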

In the next installment we will examine the 65816 processor more closely. It isn't just fast and compatible, it also offers a range of new opcodes. These make life a little easier and they'll speed up your programs too.



(w) ThunderBlade/DMAgic

1999 GO64! Redax & Count Zero/SCS*TRC for all HTML Stuff
