DSx86 bug fixing [Archive] - DCEmu Network: The Homebrew, Hacking & Gaming Network

wraggster

June 25th, 2012, 15:21

The additional project I have been working on has still required some fixing and enhancing, so I have not been able to focus on DSx86-related work fully. However, I decided to get back to working on DSx86 by first porting the much-enhanced tester program I have coded for DS2x86 back to DSx86. This turned out to be a good idea, as I have found quite a few bugs in DSx86 code that I would not have found without the new enhanced tester program. Here is a list of the bugs I have found so far:

Opcodes 0x10 .. 0x13 use adc instead of add in address calculations.
Opcode adc sometimes sets wrong overflow flag value.
Opcode sbb sometimes sets wrong overflow flag value.
BCD-opcodes daa, das etc do not work correctly.
Opcode repne cmpsb swaps Carry flag if input CX = 0.
Opcode repne scasb (direction=down) destroys all flags if input CX = 0.
Opcode rep stosb does not handle segment wrap properly.
Opcode popf mixes ARM and x86 flag bits, perhaps incorrectly.
Protected mode opcode handlers use si+disp8] instead of di+disp8].
Protected mode opcodes lgdt and lidt do not clear the highest address byte.
Protected mode opcodes lar and lsl do not handle descriptor type 0 properly.
Opcode handler push_EGA_r2 uses invalid stack address (Lords of Doom).
IRQ handling might check the Interrupt Enabled flag incorrectly.

Looking at that list of issues, the first impression is that it is a wonder that DSx86 has worked at all! However, on closer inspection nearly all of those problems are things that are either very rare, or need some special situation to occur for them to cause any problems. Even so, I am currently fixing them, as they may cause problems in some games.

The overflow flag handling problem in the adc opcode is worth a special mention. The adc (add with carry) opcode works like the normal add opcode if the input carry flag is not set, but if the input carry is set, it will add one to the result. Since I am using the high 16 bits of the 32-bit ARM registers when emulating the x86 registers, I have not been able to use the ARM adc opcode directly. Instead, I first add one to one of the input values if carry is set, and then perform the addition. This works fine in all other cases except when the input value is 0xFFFF (which would became 0 after adding 1 to it). So, I used to have a special code to detect this situation, like this (where the input operands are in high 16 bits of registers r0 and r1, with the result put into high 16 bits of r0, and the ARM flags set):
addcss r1, #0x00010000 @ If input Carry is set, the right operand = (register value + 1). bcs adc_pass_carry_r0 @ If Carry is now set, it means the right operand was 0xFFFF and carry was set, so need special handling. adds r0, r1 @ Perform the actual addition, setting the resulting flags.If the input carry was set, and a carry is still set after adding 1 to the r1 value, it meant that the input register r1 value was 0xFFFF0000, and it is now 0. Adding zero to the r0 register does not change r0 value, so I had a special common handling for this situation:
adc_pass_carry_r0: ands r0, r0 @ Set Sign and Zero flags, keep Carry set, Overflow flag is not changed mrs r0,cpsr @ Put the flags into r0 bic r0, #0x10000000 @ Clear the Overflow flag b restore_flags_from_r0 @ Back to loop, setting the proper flags.In other words, I always cleared the overflow flag, which however is not the correct behaviour.
After some thought I figured out a way to let the ARM processor calculate the proper flags for me, so that I don't need to attempt to calculate the overflow flag myself. The new code looks like this:
addcs r0, #0x00010000 @ If input Carry is set, adjust the right operand so that ... subcs r0, #0x00000001 @ ... we can use the ARM ADC opcode for the actual operation. adcs r0, r1 @ Perform the actual addition, setting the resulting flags.It is much cleaner than the original code as there is no need for a special case handling. The idea is that if the input carry is set, instead of adding 1 (shifted to the high 16 bits) we set the low 16 bits of the input register, so that the ARM adc when adding 1 (the input carry) to the actual lowest bit, will automatically clear the lowest 16 bits and add one to the high 16 bits (the lowest bit of the x86 register value). This way all the resulting conditional flags, and the resulting r0 register, will get correct values in one go. The corresponding sbb handling is somewhat trickier. I could not figure out a way to use the ARM sbc opcode for that, so I decided to handle the subtraction using two code brances, one for a simple subtraction when the input carry is not set, and one for a situation where the input carry is set (where I then have to calculate the proper flags myself). A bit slower solution, but the end result has all the correct flags.
I hope to be able to do all these fixes during the next week, so that I can then release a fixed DSx86 version on the first of July. This depends a bit on how much work I still have to do for the additional project, but hopefully not a lot any more.
http://dsx86.patrickaalto.com/DSblog.html