Tehpola posted this at the new wii64 website:
Since this is my first post on the blog discussing the dynarec, I’d like to first explain what a dynarec is and why we’re going to need one to accomplish full speed emulation on the Wii. Then I’d like to describe the history of the dynarec in our emulator, where its at now, and what needs to be done to get it working.
First of all, dynarec stands for dynamic recompiler, which is actually a bit of misnomer in the console emulation world: usually its not accomplished by creating an abstract syntax tree or control flow graph from the emulated machine code and running a target machine code compiler over it, which is what recompilation would really entail. The proper term would be binary translation: for each emulated instruction, I convert it to an equivalent target instruction. Since the N64 is a MIPS architecture machine, I take a MIPS instruction, decode it (determine what kind of instruction it is and what operands it operates on), and then generate equivalent PowerPC (GC/Wii use PPC) instructions to perform the operation that the MIPS instruction intends to. What we try to do is take a block of code that needs to be run, and fill out a new block with PowerPC code that was created by converting each of the MIPS instructions in the block. The emulator then runs the block of code as a function: it will return when a different section of code needs to run and the process repeats for the next block of code.
What we’re doing now is running an interpreter: instead of translating the MIPS code we want to run, we just decode each instruction and run a function written in C which performs what the MIPS instruction would do. Though this may seem like less work: we don’t have to translate all the code and then run it; we just run it, but because the code is ran so many times and running the translated code is much faster than running each instruction through the interpreter, the extra time translating is made up for my the faster time running through long loops.
The dynarec was the first thing I started working on with the emulator: it seemed like the most interesting aspect and the most crucial for such a port (besides the graphics which I didn’t understand well enough at the time to do much useful work besides porting a software renderer). It’s gone through a few different stages different stages: 1-to-1 register mapping binary translator, quickly dropped attempt at reworking the translator to be object oriented, slightly further progressed attempt at a MIPS to Scheme translator, and where I’m currently at: the first binary translator without 1-1 register mapping, confirming to the EABI (Embedded Application-Binary Interface).
I was concerned about performance initially, and I got a little greedy: I decided that since both MIPS and PowerPC had 32 general purpose registers, and MIPS has one hardwired to 0, and PowerPC has an extra register (ctr) I could move values into for temporary storage, I could do a simple translation of most of the instructions by using all the same registers as the MIPS would use on the PPC. The idea was that I wouldn’t have to shuffle things in and out of registers; I would load the whole MIPS register set values into the PPC registers, run the recompiled code which would operate on those values, and then when its done with a block, store those values back and restore the emulator’s registers. This was a bad idea for several reasons: small blocks that only fiddled with one or two registers still had every single register stored, loaded, and then stored and loaded again for each block, I had to disable interrupts because I destroyed the stack and environment pointers that were expected if any interrupts were taken, and because I couldn’t take interrupts, it was very difficult to debug because I couldn’t run gdb in the recompiled code. I had developed a pretty large code base and a somewhat working recompiler before I truly came to realize all the drawbacks of the method: it ran some simple hand-crafted demos I had written in MIPS which computed factorial and a few other simple things, but overall it was too unweildy and inefficient to continue to debug.
My attempt at refactoring the code I had written in a OOP way was soon abandoned, but it did inspire some improvement to the way I generated instructions. Instead of piecing together the machine code from all the different parts, I wrote new macros which would do that for me for specific instructions thus reducing some major code clutter in the translator functions.
I was unimpressed by the improvements I predicted I would see by refactoring the code in C++, and inspired by Daeken’s work on IronBabel to start the dynarec from scratch using a high-level language. The idea and the code was much simpler: decode the instructions using high-level magic and instead of generating low-level machine code, generate high-level code to execute each instruction, collect all
...
Catherine: Full Body’s English translation for the Vita