Sorry, no new release of DSx86 today, as I have only been working on DS2x86 for the past two weeks. This porting work is progressing nicely; over half of the opcodes have been ported over to MIPS ASM. I have to mention, though, that the opcodes so far have been the easy ones (except the BCD opcodes); the more difficult opcodes like the string operations, shifts, INT and IRET, and port I/O are still ahead. These will take more time, and some of them will need some interfacing to the underlying hardware, so I cannot simply port them over from the ARM ASM code.
I am currently at opcode 0x8C, which is the mov r/m16,Sreg opcode, that is, moving a value from the segment register to memory or register. The problem above was caused by my tester code not yet supporting the FS and GS segment registers, while the CPU emulation already does this. So, every now and then I need to fix my tester program instead of the emulation code. :-)
Lazy Flags
Practically at the same time I started porting the opcode handlers from ARM ASM to MIPS ASM, I started thinking of ways to handle the Lazy Flags with the least amount of slowdown possible. Yesterday I figured out a method that is a little bit faster than the way I had when I started, so I spent a couple of hours refactoring all the opcodes I had already coded to use the new method. Too bad this did not occur to me earlier, but it is to be expected that I need to recode some parts of the code several times as I am still only learning the tricks in MIPS ASM.
I again used the DOSBox sources, together with the nice description at a www.emulators.com blog post, to figure out how the lazy flags need to work. There are six flags that change after each arithmetic operation in the x86 architecture, some of which are simple and some more difficult to determine after the operation. The flags are:
Carry flag. This determines the unsigned overflow of the operation.
Adjust flag. This is similar to Carry, but for the low 4 bits of the operation.
Overflow flag. This determines the signed overflow of the operation.
Zero flag. This determines if the result was zero.
Sign flag. This determines if the result was negative.
Parity flag. This determines whether the number of bits set in the low byte of the result is even.
The simple flags are Zero, Sign and Parity. The Zero flag is set if the result was zero, the Sign flag is set if the highest bit of the result was set, and the Parity flag can be set from a 256-item lookup table indexed by the low byte of the result. These three flags behave the same way in all opcodes (that change flags), so they can be determined simply from the result of the last operation. The other three flags behave differently in different opcodes, so based on the calculation operations in the DOSBox sources I compiled a list of the different cases, to see how these need to be handled. DOSBox names the result and operands lf_resd, lf_var1d and lf_var2d (for doubleword operands), and I named them lf_res, lf_val1 and lf_val2 in my code.
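To make the three simple flags concrete, here is a minimal C sketch of how they could be derived from a stored 32-bit result. This is only an illustration, not the actual DS2x86 code (which is MIPS ASM); the function names and the parity_table array are hypothetical, and only lf_res comes from the naming above.

```c
#include <stdint.h>

/* Hypothetical 256-item table: entry is 1 when the byte has an even
   number of set bits (which is when the x86 Parity flag is set). */
static uint8_t parity_table[256];

/* Build the table once at startup. */
static void init_parity_table(void)
{
    for (int i = 0; i < 256; i++) {
        int bits = 0;
        for (int b = 0; b < 8; b++)
            bits += (i >> b) & 1;
        parity_table[i] = (uint8_t)((bits & 1) == 0);
    }
}

/* Zero flag: set when the whole result is zero. */
static int zero_flag(uint32_t lf_res)   { return lf_res == 0; }

/* Sign flag: the highest bit of the (doubleword) result. */
static int sign_flag(uint32_t lf_res)   { return (int)((lf_res >> 31) & 1); }

/* Parity flag: table lookup on the low byte of the result only. */
static int parity_flag(uint32_t lf_res) { return parity_table[lf_res & 0xff]; }
```

Since none of these depend on which opcode produced the result, saving just lf_res is enough to evaluate them later.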
Carry
Unknown, INC, DEC, MUL: return previous flag state
ADD: return (unsigned)lf_res < (unsigned)lf_val1;
ADC: return ((unsigned)lf_res < (unsigned)lf_val1) || (lflags.oldcf && (lf_res == lf_val1));
SBB: return ((unsigned)lf_val1 < (unsigned)lf_res) || (lflags.oldcf && (lf_val2 == 0xffffffff));
SUB, CMP: return ((unsigned)lf_val1 < (unsigned)lf_val2);
SHL, SHR, SAR, ROL, ROR, RCL, RCR: All have different handling
NEG: return lf_val1;
OR, AND, XOR, TEST, DIV: return false;
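As a rough illustration of the ADD, ADC, SUB and SBB cases in the Carry list, the expressions can be wrapped into small C helpers like this. The helper names are hypothetical, and this assumes 32-bit operands where lf_res has already been computed with wrap-around arithmetic.

```c
#include <stdint.h>

/* ADD: the sum wrapped around if it ended up smaller than an operand. */
static int carry_add(uint32_t lf_val1, uint32_t lf_res)
{
    return lf_res < lf_val1;
}

/* ADC: as ADD, plus the case where the incoming carry and an operand of
   0xffffffff wrap the sum exactly back to lf_val1. */
static int carry_adc(uint32_t lf_val1, uint32_t lf_res, int oldcf)
{
    return (lf_res < lf_val1) || (oldcf && lf_res == lf_val1);
}

/* SUB, CMP: a borrow happens when the subtrahend is the larger value. */
static int carry_sub(uint32_t lf_val1, uint32_t lf_val2)
{
    return lf_val1 < lf_val2;
}

/* SBB: as SUB (seen through the result), plus the case where the
   incoming borrow is set and lf_val2 is 0xffffffff. */
static int carry_sbb(uint32_t lf_val1, uint32_t lf_val2,
                     uint32_t lf_res, int oldcf)
{
    return (lf_val1 < lf_res) || (oldcf && lf_val2 == 0xffffffffu);
}
```

For example, adding 1 to 0xffffffff gives a result of 0, which is smaller than the first operand, so carry_add reports a carry.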
Adjust
Unknown: return previous flag state
ADC, ADD, SBB, SUB, CMP: return ((lf_val1 ^ lf_val2) ^ lf_res) & 0x10;
INC: return (lf_res & 0x0f) == 0;
DEC: return (lf_res & 0x0f) == 0x0f;
NEG: return lf_val1 & 0x0f;
SHL, SHR, SAR: return lf_val2 & 0x1f;
OR, AND, XOR, TEST, DIV, MUL: return false;
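The Adjust expressions for the add/subtract family and for INC and DEC can be sketched in C the same way. Again the helper names are hypothetical; the expressions themselves are the ones from the list above.

```c
#include <stdint.h>

/* ADC, ADD, SBB, SUB, CMP: bit 4 of (val1 ^ val2 ^ res) is set exactly
   when a carry or borrow crossed from bit 3 into bit 4. */
static int adjust_addsub(uint32_t lf_val1, uint32_t lf_val2, uint32_t lf_res)
{
    return ((lf_val1 ^ lf_val2 ^ lf_res) & 0x10) != 0;
}

/* INC: the low nibble wrapped from 0xf to 0. */
static int adjust_inc(uint32_t lf_res) { return (lf_res & 0x0f) == 0; }

/* DEC: the low nibble wrapped from 0 to 0xf. */
static int adjust_dec(uint32_t lf_res) { return (lf_res & 0x0f) == 0x0f; }
```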
Overflow
Unknown, MUL: return previous flag state
ADD, ADC: return ((lf_val1 ^ lf_val2 ^ 0x80000000) & (lf_res ^ lf_val2)) & 0x80000000;
SBB, SUB, CMP: return ((lf_val1 ^ lf_val2) & (lf_val1 ^ lf_res)) & 0x80000000;
INC: return (lf_res == 0x80000000);
DEC: return (lf_res == 0x7fffffff);
NEG: return (lf_val1 == 0x80000000);
SHL: return (lf_res ^ lf_val1) & 0x80000000;
SHR: if ((lf_val2&0x1f)==1) return (lf_val1 > 0x80000000); else return false;
OR, AND, XOR, TEST, SAR, DIV: return false;
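For the Overflow flag, the ADD and SUB expressions above are pure bit tricks on the sign bits. A small C sketch, with hypothetical helper names and assuming 32-bit operands:

```c
#include <stdint.h>

/* ADD, ADC: signed overflow needs both operands to have the same sign
   (first term) and the result sign to differ from them (second term). */
static int overflow_add(uint32_t lf_val1, uint32_t lf_val2, uint32_t lf_res)
{
    return (((lf_val1 ^ lf_val2 ^ 0x80000000u) & (lf_res ^ lf_val2))
            & 0x80000000u) != 0;
}

/* SBB, SUB, CMP: signed overflow needs the operands to have different
   signs and the result sign to differ from the first operand. */
static int overflow_sub(uint32_t lf_val1, uint32_t lf_val2, uint32_t lf_res)
{
    return (((lf_val1 ^ lf_val2) & (lf_val1 ^ lf_res))
            & 0x80000000u) != 0;
}
```

For instance, 0x7fffffff + 1 gives 0x80000000: both operands are positive but the result is negative, so overflow_add reports an overflow.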
Based on these lists, it seemed to me that the Carry flag will be the most difficult and time-consuming to calculate. Besides the obvious conditional jump opcodes, there are many other opcodes (ADC, SBB, RCL, RCR, CMC) that need the current Carry flag value as their input. Also, the shift opcodes change and use the Carry flag in various ways, so it seemed to me that using switch-statement-style code to calculate the Carry flag lazily whenever it is needed would really slow down those operations. So, I decided to see how much extra code I would need if I went for a direct Carry flag calculation in each of the opcodes. It turned out that most of the time it only takes one ASM operation to calculate the Carry flag after the operation, so this