drk||Raziel has updated his website with some WIP news concerning his Dreamcast emulator for Windows:

After a few hours of working on nullDC Dynarec instrumentation/profiling …

Shenmue: Ingame
mov:32 25.07% 37931888
mov:64 0.61% 929370
readm:8 0.14% 215023
readm:16 0.88% 1325655
readm:32 21.96% 33223344
  46.25% 15364458 to mem
  0.00% 300 to route
  53.75% 17858586 to inline
  27.54% 9150191 static
  18.71% 6214567 fmem
readm:64 0.54% 817860
  90.76% 742252 to mem
  0.00% 0 to route
  9.24% 75608 to inline
  8.65% 70780 static
  82.10% 671472 fmem
writem:32 3.02% 4563780
  98.73% 4505665 to mem
  0.06% 2880 to route
  1.21% 55235 to inline
  0.00% 0 static
  98.79% 4508545 fmem
writem:64 0.11% 168904
  100.00% 168904 to mem
  0.00% 0 to route
  0.00% 0 to inline
  0.00% 0 static
  100.00% 168904 fmem
cmp:32 9.04% 13672121
test:32 2.98% 4502301
SaveT:32 13.33% 20163008
LoadT:32 1.95% 2943077
not:32 0.20% 302600
and:32 0.33% 500402
or:32 0.28% 429952
xor:32 0.14% 212814
shl:32 0.74% 1125146
shr:32 0.12% 182725
rcl:32 0.26% 398469
movex:8 0.19% 288026
add:32 10.19% 15414719
sub:32 2.02% 3055821
fadd:32 0.13% 192646
fsub:32 1.05% 1582237
fmul:32 1.30% 1969777
fdiv:32 0.17% 250172
fneg:32 0.11% 167815
fmac:32 0.30% 447457
ifb:8 0.41% 627217
ftrv:32 0.25% 377367
fipr:32 0.25% 376035
floatfpul:32 0.25% 382908
ftrc:32 0.25% 378803
fcmp:32 0.58% 880835
pref:32 0.46% 701954
rest(18 ops) 0.38% 578451
Total 151.28M

Profiling games sure is fun

These are IL opcode counts per Dreamcast second. Sadly it's not very practical to get execution time, so execution count will have to do for now … It's interesting to note that most games achieve between 120 and 200 MIPS (with most 30 fps rps on the low side, and DOA2LE getting constantly around 202 MIPS ingame).
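To make the idea of counting executions instead of timing them concrete, here is a minimal Python sketch. It is not nullDC's actual instrumentation; the function name `format_profile` and the sample opcode stream are made up for illustration. It tallies how often each opcode runs and prints each one's share of the total, in the same shape as the listing above.

```python
from collections import Counter


def format_profile(counts):
    """Render per-opcode execution counts as 'name percent count' lines."""
    total = sum(counts.values())
    lines = []
    for name, count in counts.items():
        # Share of all executed opcodes, guarded against an empty profile.
        share = 100.0 * count / total if total else 0.0
        lines.append(f"{name} {share:.2f}% {count}")
    lines.append(f"Total {total / 1e6:.2f}M")
    return lines


# Hypothetical stream of executed IL opcodes (made-up data, not a real trace).
executed = ["mov:32", "readm:32", "mov:32", "add:32"]
counts = Counter(executed)

for line in format_profile(counts):
    print(line)
```

A real profiler would bump the counters from inside the generated code rather than from a list, but the bookkeeping is the same: one counter per opcode, divided by the grand total at report time.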

mov32, readm32, writem32, cmp32, test32, SaveT, LoadT, add32 and sub32 make up about 90% of the opcodes executed. Out of these, readm32, SaveT and LoadT could be optimized, and maybe something can be done for movs as well.
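As a quick sanity check of the 90% figure, the shares of those nine opcodes can be added up. The dictionary below simply copies the percentages from the Shenmue listing, using the opcode names as they appear there; the variable names are illustrative.

```python
# Percent shares copied from the Shenmue ingame listing.
shares = {
    "mov:32": 25.07,
    "readm:32": 21.96,
    "writem:32": 3.02,
    "cmp:32": 9.04,
    "test:32": 2.98,
    "SaveT:32": 13.33,
    "LoadT:32": 1.95,
    "add:32": 10.19,
    "sub:32": 2.02,
}

combined = sum(shares.values())
print(f"{combined:.2f}%")  # prints 89.56%
```

So the nine opcodes account for just under 90% of everything executed in this sample.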

Memory:
readm:32 21.96% 33,223,344
  46.25% 15,364,458 to mem
  0.00% 300 to route
  53.75% 17,858,586 to inline
  27.54% 9,150,191 static
  18.71% 6,214,567 fmem
writem:32 3.02% 4,563,780
  98.73% 4,505,665 to mem
  0.06% 2,880 to route
  1.21% 55,235 to inline
  0.00% 0 static
  98.79% 4,508,545 fmem

Reads are about 7x more common than writes. Array/pointer access is pretty much the same between reads and writes (6.2M vs 4.5M; in other spots/games the difference is smaller). What's interesting is that static accesses (predicted static + inline) are over 27M for reads, but just 55K for writes. This verifies that the SH4 really sucks at loading constants, so pretty much all of the constants are loaded as mem-reads, and it also raises some questions about the generated code quality. Also, register reads+writes were REALLY low (10 MMR reads/frame, 96 MMR writes/frame). Interesting, huh?
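The figures quoted above can be recomputed from the readm:32 and writem:32 rows. This sketch assumes that "predicted static + inline" means the sum of the "static" and "to inline" rows, which is one reading of the numbers rather than something the post states outright; the variable names are illustrative.

```python
# Counts copied from the readm:32 and writem:32 rows of the Memory listing.
reads = {"total": 33_223_344, "inline": 17_858_586, "static": 9_150_191, "fmem": 6_214_567}
writes = {"total": 4_563_780, "inline": 55_235, "static": 0, "fmem": 4_508_545}

# How many 32-bit reads happen per 32-bit write.
ratio = reads["total"] / writes["total"]

# Assumption: "static accesses" = "static" row + "to inline" row.
static_reads = reads["inline"] + reads["static"]
static_writes = writes["inline"] + writes["static"]

print(f"reads per write: {ratio:.2f}")
print(f"static reads:  {static_reads:,}")
print(f"static writes: {static_writes:,}")
print(f"fmem reads vs writes: {reads['fmem']:,} vs {writes['fmem']:,}")
```

Under that assumption the static read count comes out at 27,008,777, which matches the "over 27M" in the text, against 55,235 for writes.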

Anyway, these numbers and other statistics I plan to gather over the following days will help to better optimize nullDC!

http://drk.emudev.org/blog/?p=143