
FPU work, unaligned access, Doom benchmarks



wraggster
December 16th, 2013, 23:27
FPU support
For the past couple of weeks I have continued working on the WP8 version of my emulator core, and I have also made some enhancements to the Android version and to rpix86. The most notable enhancement is the addition of floating point support. So far only my emulators running on the MIPS architecture (ds2x86 and zerox86) have emulated the floating point opcodes. The ARM architecture emulators have silently ignored all FPU opcodes. Some games do need FPU support, though (like the X-COM series games), so I wanted to finally port the FPU emulation to my ARM emulators as well.
I began by implementing the FPU support in the new Windows Phone 8 version. I used a simple test C source file where I added the floating point code that I wanted to implement, and then compiled that C code to ASM code (by giving the /Fa parameter to the C compiler). This way I was able to determine the proper floating point calling convention to use, and also the internal names of the Windows runtime helper methods for some routines (conversion between a double and a 64-bit integer, for example).

Here is a small example of the C code, for converting a double value to an integer:
    double fpu_regs[9];
    int tst;
    void ftest64(int idx)
    {
        tst = fpu_regs[idx];
    }

This is the ARM ASM code that this C code produces:

    |ftest64| PROC
        movw r3,fpu_regs
        movt r3,fpu_regs
        movw r2,tst
        movt r2,tst
        add r3,r3,r0,lsl #3
        vldr d0,[r3]
        vcvt.s32.f64 s0,d0
        vmov r3,s0
        str r3,[r2]
    |$M4|
        bx lr
        ENDP ; |ftest64|

I managed to implement all the FPU opcodes for the WP8 version during the previous weekend, and they seem to work at least mostly correctly. I haven't yet tested them thoroughly, so there may still be problems. During this weekend I then implemented the same in the Android version (ax86) and in rpix86. Those use different floating point calling conventions: Raspberry Pi uses "hard float", while Android by default uses the "softfp" calling convention. I am not yet sure whether my code is correct for both of those calling conventions, so I will continue working on that.
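To make the idea of emulating FPU opcodes on top of a register array more concrete, here is a minimal C sketch of an x87-style register stack. It is only an illustration: the fpu_top index, the helper names and the way the nine-entry fpu_regs array is laid out are all assumptions, since the post does not describe how the emulator core actually organizes its FPU state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical emulated x87 state. The array size matches the test file
   above; using slot 8 as the "empty" position is an assumption. */
static double fpu_regs[9];
static int fpu_top = 8;              /* index of ST(0); 8 means empty stack */

/* FLD-style push: the value becomes the new ST(0). */
static void fpu_push(double value)
{
    assert(fpu_top > 0);             /* a real core would flag stack overflow */
    fpu_regs[--fpu_top] = value;
}

/* FSTP-style pop: return ST(0) and remove it from the stack. */
static double fpu_pop(void)
{
    assert(fpu_top < 8);             /* a real core would flag stack underflow */
    return fpu_regs[fpu_top++];
}

/* FADDP-style add: ST(1) = ST(1) + ST(0), then pop. */
static void fpu_faddp(void)
{
    double st0 = fpu_pop();
    assert(fpu_top < 8);
    fpu_regs[fpu_top] += st0;
}

/* Convert ST(0) to a 32-bit integer and pop it. The C cast truncates
   toward zero, like the vcvt.s32.f64 in the listing above; a real x87
   FISTP honours the rounding mode in the control word, which this
   sketch ignores. */
static int32_t fpu_pop_int32(void)
{
    return (int32_t)fpu_pop();
}
```

Pushing 1.5 and 2.25, calling fpu_faddp() and then fpu_pop_int32() leaves the stack empty and returns the truncated sum.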
Unaligned access
A couple of weeks ago I also decided to test what happens if I let my code use ldr opcodes to load data from memory addresses that are not correctly aligned to 32-bit word boundaries. Until now I have coded all my memory accesses that can be unaligned to use separate ldrb and strb opcodes. This seems rather slow, and since ARM processors from armv6 onwards have the option to allow unaligned memory accesses, I wanted to test whether that actually works. The kernel can decide whether such access is allowed or not, and I wasn't sure whether Windows Phone, Android or Raspbian actually allow this.
I coded some of the most often used opcodes to always use the ldr opcode, and somewhat to my surprise, the code worked on all those platforms! Next I changed some more opcodes, and then ran the DOOM timedemo to test whether the code runs faster or slower. The result was that on both Windows Phone and Android the code executed noticeably faster (the DOOM timedemo ran up to 10% faster), but on rpix86 it ran slightly slower. So, it looks like I will by default enable unaligned access on WP8 and Android, but not on rpix86.
DOOM benchmark results
Okay, since I was able to get some more speed using unaligned access, and since I have now benchmarked all of my emulators that can run DOOM, I thought it might be a good idea to collect the results in a single table. So, here is the DOOM benchmark result table for all of my emulator versions.
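The difference between the two approaches can be sketched in C. This is not the emulator's code (which is hand-written ARM assembly); the function names are made up for illustration. The first helper assembles a 32-bit little-endian value from four byte loads, which corresponds to the ldrb sequence, and the second expresses a single possibly-unaligned load through memcpy, which a compiler may turn into one ldr when the target permits unaligned access.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-by-byte little-endian load: safe at any address, but costs four
   loads plus shifts and ORs (the C analogue of a run of ldrb opcodes). */
static uint32_t load32_bytewise(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Single-load variant: memcpy is the portable way to express an
   unaligned read in C. The result is in host byte order, so it only
   matches the bytewise version on a little-endian host. */
static uint32_t load32_direct(const uint8_t *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Both helpers can be called with an odd address such as buf + 1; whether the second one is actually faster depends on the CPU and kernel, as the timedemo results above show.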


real tics | avg fps | SoC/CPU Type          | Device          | Emulator | Remarks
----------|---------|-----------------------|-----------------|----------|--------------------------------
2660      | 28.1    | Tegra 3 @ 1.3 GHz     | Nexus 7 Tablet  | ax86     |
5387      | 13.8    | Snapdragon S3 @ 1 GHz | Nokia Lumia 520 | pax86    |
6451      | 11.6    | MIPS32r2 @ 1 GHz      | GCWZero         | zerox86  | http://zerox86.patrickaalto.com
14352     | 5.1     | MIPS32r1 @ 396 MHz    | DSTwo cart      | DS2x86   | http://dsx86.patrickaalto.com
24112     | 3.1     | ARMv6 @ 700 MHz       | Raspberry Pi    | rpix86   | http://rpix86.patrickaalto.com
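For readers wondering how the two numeric columns relate, the sketch below shows the usual way a DOOM timedemo result is converted to frames per second. Two values here are assumptions that the post does not state: DOOM's fixed rate of 35 tics per second, and a demo length of 2134 game tics (the figure commonly reported for "timedemo demo3"). With those assumed values, 2660 real tics works out to roughly 28.1 fps, in line with the first row of the table.

```c
#include <assert.h>

/* Assumptions, not taken from the post: DOOM runs at 35 tics per second,
   and the benchmarked demo is 2134 game tics long. */
#define DOOM_TICRATE  35
#define DEMO_GAMETICS 2134

/* Average frames per second for a timedemo run: the demo renders one
   frame per game tic, and realtics measures the elapsed time in
   1/35 second units. */
static double timedemo_fps(int gametics, int realtics)
{
    assert(realtics > 0);
    return (double)gametics * DOOM_TICRATE / (double)realtics;
}
```

Calling timedemo_fps(DEMO_GAMETICS, 5387) gives a value close to the 13.8 listed for the Lumia 520.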


I still have a lot of improvements and compatibility enhancements to do for the Windows Phone 8 version of my emulator. It does run some interesting games already, for example Grand Prix 2 (http://en.wikipedia.org/wiki/Grand_Prix_2), from which the screenshot below is taken. That game can run in SVGA mode, but it also lists a Pentium 90 as the recommended system, so I am running it in VGA mode to get a reasonable framerate on my Nokia Lumia 520 phone.

http://ax86.patrickaalto.com/ablog.html