Pate has updated his Dos Emulator for the Nintendo DS, heres the news:
Sorry it took me over a month to get a new version released. However, this version now has two noteworthy improvements, namely FPU support and the fact that this version is now built with version 0.13beta of the SDK. This version of the SDK has already been released at the end of last year, but I only recently noticed that I still used the even older 0.12beta version to build DS2x86. This might have caused at least some of the audio problems, as the audio features were somewhat improved in SDK v0.13beta. I also decided to jump the version number to 0.20 with this version, to show that it has some major internal changes.
The biggest enhancement is the addition of FPU support. For now this only works in 32-bit protected mode (meaning those DOS4GW games), as those are the games that mostly expect the existence of an FPU. My two test games, X-COM UFO and Destruction Derby, now seem to run the FPU parts fine. The problems I mentioned in the previous blog post were actually both caused by the FPU opcodes, though in some rather interesting ways. It took me almost three days to track down and fix the problems. There is still a rather serious problem in the texture mapping in Destruction Derby, but I am not sure if that has anything to do with the FPU opcodes. In any case, Destruction Derby runs much too slow to be properly playable, so fixing the texture mapping is not a high priority. Fixing it might fix some other games as well, so I do plan to look into it at some point.
Anyways, the FPU problem in Destruction Derby was in the FCOMP (compare) opcode. When I first tested that opcode it seemed to return a correct result, so I spent a lot of time looking elsewhere for the problem. In the end it turned out that I had a small typo in the opcode handler, I had written "sll t4, t3" when I meant "sll t4, t3, 1", and the assembler had interpreted it as "sllv t4, t4, t3". In this case the assembler tried to be a bit too smart, when it replaced the sll opcode (which can only have an immediate shift value) with the sllv opcode (shift by a register value). Since I had forgotten to type the immediate value, it would have been nicer if the assembler had given an error instead. That problem meant that the comparison result was mostly random, it sometimes returned the correct result and sometimes not, somewhat depending on the lowest 5 bits in the registers I wanted to shift left by one bit before the comparison.
A more interesting problem was the one that affected X-COM UFO. It behaved very erratically, dropping to a debugger with an unsupported opcode at random locations, where the opcode should have been supported. That pointed to the interrupt routine failing to execute properly and dropping back to where the interrupt should have happened. It took me a while to notice that when it dropped into the debugger, the Virtual 86 mode flag in the flags registers was set! I had not coded support for interrupts in virtual 86 mode yet, so that was the reason for the weird unsupported opcodes. However, I was pretty sure that the game does not set this flag on purpose, so I added a check into DS2x86 to drop to the debugger immediately after this bit gets set in the CPU flags. After a few test runs, the results were quite interesting. Once the bit did not get set, but DS2x86 dropped into the debugger anyways. In other runs, twice the opcode was fsin, and once fcos, two of my new FPU opcodes.
The interesting part was that I did not touch the CPU flags in the fsin and fcos opcode handlers at all! I did store the register that contained the flags into stack, along with various other registers that calling the GCC math library sin() and cos() functions will globber, and then restored the registers after the call. I could not immediately see anything wrong with my code, so just for fun I copied the flags before the GCC library call to the EAX emulated register and the same flags after restoring them from stack to ECX emulated register, and then added a forced drop to the debugger after the call. And, much to my surprise, the value restored from the stack was completely different to the value I pushed to the stack!
I looked into the dump file for what happens in the GCC library sin() and cos() functions, and noticed that indeed, they destroy the two topmost words of the caller's stack!. I have a very hard time believing this behaviour is by design, this might actually be a bug in the GCC compiler itself. Here below is the dump from the start of the math library sin() function. The input is a 64-bit double value, in 32-bit registers a0 (low word) and a1 (high word).
80250fd0 <__sin>:
80250fd0: 3c027fff lui v0,0x7fff
80250fd4: 3442ffff ori v0,v0,0xffff
80250fd8: 00a23024 and a2,a1,v0
80250fdc: 3c033e50 lui v1,0x3e50
80250fe0: 27bdfda8 addiu sp,sp,-600 // sp -= 600, make room for local variables
80250fe4: 00c3182a slt v1,a2,v1
80250fe8: afbe0250 sw s8,592(sp)
80250fec: afbf0254 sw ra,596(sp)
80250ff0: afb7024c sw s7,588(sp)
80250ff4: afb60248 sw s6,584(sp)
...
Catherine: Full Body’s English translation for the Vita