This post is just what I'm currently thinking about and implementing for the emulator, written down and shared in the hope of getting another point of view, even if it's just, "hah, that's silly."
At first this was going to be a statically-recompiling emulator, but plans changed once it became clear that the cost did not justify the benefit compared to a dynarec-based CPU core. IBA is emulating an ARM processor on another ARM, so statically recompiling with GCC in the middle wasn't worth it: register allocation and the code in general would not be optimal (I did a test and the results didn't look that good), the CPSR flags have to be calculated (rather than using the GP2X's), and it's rather cumbersome for the end user (run in decompilation mode, exit, compile, get back in again, rinse, lather, repeat). Since the whole thing is so experimental, nobody should be surprised by the changes. A GCC backend would probably do a better job with a processor that has more registers than the ARM (PPC maybe?), but with two CPUs this similar there is no need for it and it ends up getting in the way.
A dynarec has been built that outputs blocks of code and supports all of the ARM's instructions except, for now, the opcodes for co-processor communication. The code it outputs can be saved to the SD card so that ROM code does not have to be re-translated... so in a way, this emulator can still be a static recompiler, it just does ASM->ASM instead of ASM->C->ASM. (If anybody goes out and says, "I told you so", I'll pull out a large trout and deliver a few slaps before the day ends.)
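As an aside, those co-processor opcodes are easy to spot from the instruction word alone. A rough sketch, assuming the standard 32-bit ARM encoding (the function name is made up for illustration and isn't taken from the emulator's source):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not from the emulator: returns true if a 32-bit ARM
 * instruction word is one of the co-processor opcodes (LDC/STC, CDP,
 * MRC/MCR). Only bits 27..24 of the word need to be inspected. */
static bool is_coprocessor_op(uint32_t insn)
{
    uint32_t bits27_24 = (insn >> 24) & 0xFu;

    /* 110x: LDC/STC, co-processor data transfer */
    if ((bits27_24 & 0xEu) == 0xCu)
        return true;

    /* 1110: CDP (bit 4 clear) or MRC/MCR (bit 4 set).
     * 1111 is SWI, which is not a co-processor opcode. */
    return bits27_24 == 0xEu;
}
```

A translator could run a check like this on each fetched ARM word and fall back to some other path (an interpreter, or simply a logged warning) when it returns true. What the fallback should be is left open here.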
When the emulator starts, the processor attempts to execute whatever is at address 0x08000000, which resides on the cartridge (GPROM). To know whether the instruction at that address has been previously compiled, a second table is needed. Due to the size of the GPROM (up to 32MB), a table that keeps track of what's been compiled and what hasn't needs to be as compact as possible. Since the worst case is that all 32MB are filled with Thumb code, every short (16 bits) needs one bit to indicate previous compilation. So, 32MBytes / 2Bytes = 16M Thumb instructions = 16Mbits for flagging each instruction = 2MBytes for the GPROM Compilation Bit Map (GPROMcbm from now on).
Now that there's a table that tells the emulator it has compiled that instruction before, it needs to know where the translated block is. We can't afford a 32-bit pointer for each instruction (that would require 64MB, plus the GPROM's 32MB, totalling 96MB!), so we'll have to overwrite whatever the GPROM is storing there with a pointer to the code's actual location. In the case of ARM code this is not a problem, as the instructions are the same size as a pointer... but what about Thumb mode? A pointer can't be stored in 16 bits! Well, where you can't store an absolute address, a relative one will have to do. Let's say the letters on the left are GBA instructions and the ones on the right are the translated GP2X equivalent: