Since the release of 0.30 last week, I have been working on the remaining problems in the new transfer system. Below are the problems that I have now managed to fix, and which will be included in the 0.31 version. The current plan is to release DS2x86 version 0.31 next Sunday. I still hope to add some additional fixes and improvements during the next week.

Faster EGA 0x0D mode blitting when the logical screen width is larger than 320 pixels.
Fixed EGA LineCompare pixel panning reset, using NDS hardware features.
Fixed AdLib audio buffering problem.
Fixed Warcraft BSOD crash in SoundBlaster detection.

The new EGA mode 0x0D blitting code now has two working modes. If the logical screen layout is 320 pixels wide (so no horizontal scrolling or additional trickery is used), the blitting takes 6.7 ms (149 fps). If the logical screen width is more than 320 pixels, separate transfer code is used on the MIPS side. This code sends only 8 extra pixels per screen row (to handle possible smooth pixel panning), and thus the blitting slows down only slightly, to 7.9 ms (126 fps). This change got rid of the screen tearing problem in Supaplex and in the Commander Keen 4 intro.

The EGA and VGA graphics cards have an option to jump back to the beginning of the graphics VRAM memory at a certain scanline (when the card is drawing the image on the monitor). This is activated by giving the EGA/VGA line compare register a scanline number that is less than the number of screen rows. There is also a bit in another register that tells the graphics card to reset the pixel panning to zero in this situation. The pixel panning register is used to shift the screen image 0..7 pixels left during the graphics VRAM scanning and drawing onto the monitor. Since the screen image start address needs to be at a byte boundary, and each byte in the 16-color modes contains 8 adjacent pixels, the pixel panning register is needed for smoothly panning the image horizontally by less than 8 pixels at a time.
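As a small illustration of the byte boundary and pixel panning split described above, here is a minimal sketch in C. The type and function names are made up for this example and are not taken from the DS2x86 sources.

```c
#include <stdint.h>

/* Hypothetical helper type: how a horizontal scroll position splits
   in the 16-color planar modes, where one byte holds 8 pixels. */
typedef struct {
    uint16_t start_byte; /* whole bytes, added to the screen start address */
    uint8_t  pixel_pan;  /* remaining 0..7 pixels, for the pixel panning register */
} EgaScroll;

/* Split a horizontal scroll position (in pixels) into the two parts. */
static EgaScroll ega_split_scroll(uint16_t x_pixels)
{
    EgaScroll s;
    s.start_byte = (uint16_t)(x_pixels >> 3); /* 8 pixels per byte */
    s.pixel_pan  = (uint8_t)(x_pixels & 7);   /* leftover pixels */
    return s;
}
```

For example, scrolling 13 pixels to the right would mean moving the start address by one byte and setting the pixel panning value to 5.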

In DS2x86 version 0.30 I simplified the screen blitting code (compared to DSx86 and previous DS2x86 versions) so that I no longer handle the pixel panning value in the code (by shifting the pixels before blitting them to the screen). Instead I use the Nintendo DS graphics background registers to emulate the pixel panning, much like the actual EGA/VGA card does it. However, it only occurred to me last weekend that I can also handle the line compare pixel panning reset using Nintendo DS hardware! Since the NDS graphics features include a VCount interrupt, I can use that to get an interrupt at the line compare scanline, and reset the NDS background register horizontal position to zero! The end result matches the EGA/VGA card behaviour, with much of the functionality done by the NDS graphics hardware! This is a change I plan to port back to the original DSx86, as it will simplify the EGA blitting code there as well. This change made the Supaplex bottom score panel stay put while the upper area is panning.
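The idea of the VCount interrupt resetting the horizontal position can be sketched as a plain C simulation. This is not the actual DS2x86 code and it does not touch real NDS registers: sim_bg_hofs, on_vcount_match and sim_frame are hypothetical names, and the exact scanline where the reset takes effect is simplified.

```c
#include <stdint.h>

#define SCREEN_ROWS 200

/* Simulated stand-in for the NDS background horizontal offset register. */
static uint16_t sim_bg_hofs;

/* Hypothetical handler: what the VCount interrupt would do on the
   line compare scanline, i.e. reset the horizontal position to zero. */
static void on_vcount_match(void)
{
    sim_bg_hofs = 0;
}

/* Simulate one frame and record the horizontal offset in effect on each
   scanline. pan_reset mirrors the EGA/VGA bit that requests the pixel
   panning reset at the line compare scanline. */
static void sim_frame(uint8_t pixel_pan, int line_compare, int pan_reset,
                      uint16_t hofs_per_line[SCREEN_ROWS])
{
    int line;
    sim_bg_hofs = pixel_pan; /* set once per frame, before drawing starts */
    for (line = 0; line < SCREEN_ROWS; line++) {
        if (pan_reset && line == line_compare)
            on_vcount_match(); /* the "interrupt" fires on this scanline */
        hofs_per_line[line] = sim_bg_hofs;
    }
}
```

With a pixel panning value of 5 and a line compare value of 168, the rows above scanline 168 are drawn shifted by 5 pixels and the rows from 168 onwards are not shifted, which is the behaviour that keeps a bottom panel in place.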

Supaplex also helped me in finding the problem in my AdLib emulation. The music seemed to skip a lot of notes during the beginning. I logged the AdLib notes in DOSBox to a file, and also wrote code to log the notes that the MIPS side sends to the ARM9 to a file on the SD card, and noticed that there were no differences. The exact same notes get sent with nearly identical timing from the MIPS to the ARM9 side. And since the ARM7 uses the same code as the original DSx86 (which works fine), it was easy to figure out that the problem must be in the new ARM9 code. And there the problem indeed was. I had a minor bug in the buffering scheme, where the last command in the buffer was never sent from ARM9 to ARM7 until the buffer got additional data from MIPS to ARM9. In Supaplex music there are places where only one instrument is playing, and in these places the game sends only three commands: Note Off, Note Frequency, and Note On. So, when the last new command never got sent, each Note On reached the ARM7 only together with the next group of commands. In effect the ARM7 saw the music as a sequence of Note On, Note Off, Note Frequency commands, so there was no sound output.
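To make the effect of this kind of bug concrete, here is a minimal sketch of a command queue whose flush routine can hold back the last command. It is only an illustration under my own assumptions: the queue layout, the command codes and all the names are hypothetical and do not come from the actual ARM9 code.

```c
#include <stddef.h>
#include <stdint.h>

#define QUEUE_SIZE 64 /* power of two, so indices wrap with a mask */

/* Hypothetical command codes, for illustration only. */
enum { NOTE_OFF = 1, NOTE_FREQ = 2, NOTE_ON = 3 };

typedef struct {
    uint16_t cmd[QUEUE_SIZE];
    size_t head; /* next slot to write (filled from the MIPS side) */
    size_t tail; /* next slot to forward (towards the ARM7) */
} CmdQueue;

static void queue_push(CmdQueue *q, uint16_t c)
{
    q->cmd[q->head & (QUEUE_SIZE - 1)] = c;
    q->head++;
}

/* Forward pending commands into out[] and return how many were forwarded.
   hold_back = 1 reproduces the kind of bug described above: the newest
   command stays in the queue until more data arrives.
   hold_back = 0 is the corrected behaviour. */
static size_t queue_flush(CmdQueue *q, uint16_t *out, size_t out_cap,
                          int hold_back)
{
    size_t n = 0;
    size_t limit = q->head;
    if (hold_back && q->head != q->tail)
        limit--; /* off by one: the last command is left behind */
    while (q->tail != limit && n < out_cap) {
        out[n++] = q->cmd[q->tail & (QUEUE_SIZE - 1)];
        q->tail++;
    }
    return n;
}
```

Pushing Note Off, Note Frequency and Note On twice, with a flush after each group, shows the problem: with hold_back set, the second flush forwards Note On, Note Off, Note Frequency, so the note is turned off right after it is turned on.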

After those fixes I then began debugging the Warcraft BSOD crash problem. It is somewhat weird, as the location of the crash (as reported by the BSOD texts) seems to jump all over the MIPS code area. What is even more strange, the location often seems to point to code that cannot crash, that is, code that has only some simple arithmetic operations or such. So, my first theory was that perhaps this is some interrupt routine re-entrancy problem in the new more accurate SoundBlaster IRQ emulation. I checked the Warcraft SB emulation code (which I reverse engineered some time ago, when an earlier DS2x86 version had problems with it), and noticed that it sets up an auto-init DMA audio transfer with a buffer length of 2 samples! That is, the SoundBlaster will send an IRQ after every 2 samples have been played! As the playing frequency was 22 kHz, this meant that my emulation code began getting over 11000 IRQs per second!
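The IRQ rate follows directly from the sample rate and the auto-init block length. A tiny sketch of the arithmetic in C, where the function name is made up and the exact 22050 Hz rate in the example is my assumption for "22 kHz":

```c
#include <stdint.h>

/* IRQs per second for an auto-init transfer that raises one IRQ
   after every samples_per_irq samples have been played.
   Hypothetical helper, only to show the arithmetic. */
static uint32_t sb_irqs_per_second(uint32_t sample_rate_hz,
                                   uint32_t samples_per_irq)
{
    if (samples_per_irq == 0)
        return 0; /* avoid dividing by zero on a bogus block length */
    return sample_rate_hz / samples_per_irq;
}
```

With an assumed rate of 22050 Hz and a block length of 2 samples this gives 11025 IRQs per second, which agrees with the "over 11000" figure above.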

I experimented by forcibly limiting the auto-init IRQ frequency, but rather annoyingly, even at an IRQ frequency of 366 Hz (24000000/65536) the BSOD problem remained. Only at a frequency of 183 Hz (24000000/(2*65536)) did I get rid of the BSOD problem. This made me realize that the IRQ speed itself cannot be the actual cause for the BSOD, as for example Windows sets the PC timer to run at 1000 Hz, which is also emulated similarly using a hardware IRQ at that speed, and it does work fine. Finally I then realized that my buffer copying code inside the IRQ handler expects the pointers to be word-aligned, and with the transfer buffer length of only 2 samples, the pointer was actually only halfword-aligned! Since Warcraft only uses this buffer setup when testing for a SoundBlaster, it is not so important to play the correct samples, and thus I forcibly aligned the pointers to be word-aligned. This got rid of the BSOD, but still the SB audio does not work quite correctly in Warcraft. I'll continue working on this problem during the next week. There are still various other problems in the new transfer code as well, which I also hope to be able to fix and/or implement during the upcoming weeks. But, you can expect at least the above fixes to be included in the next version.
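For reference, forcing a pointer to a word boundary as described above can be sketched like this in C. The helper names are hypothetical and this is not the actual IRQ handler code, just the general technique of masking off the two lowest address bits.

```c
#include <stdint.h>

/* Returns nonzero when p sits on a 4-byte (word) boundary. */
static int is_word_aligned(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}

/* Round p down to the previous 4-byte boundary. A halfword-aligned
   pointer (address ending in binary ...10) moves back by 2 bytes,
   so the copy may start from slightly wrong samples. */
static void *align_down_word(void *p)
{
    return (void *)((uintptr_t)p & ~(uintptr_t)3);
}
```

Rounding down means the copied data can be off by one halfword, which is acceptable only because the samples themselves do not matter much during the detection phase.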

http://dsx86.patrickaalto.com/DSblog.html