Copied from a post by Amadeus on the DSLinux forums:
See here for original post: http://www.dslinux.org/index.php?showtopic=1748&st=0
FAQ
===
What is the 8bit-write-problem, and why do I care?
--
The GBA ROM space is the only large (32 MBytes) addressable extension for the DS. If you want more memory, you need to use the GBA ROM space. Unfortunately, GBA ROM access is 16bit only. For an 8bit read, the CPU simply discards the unused byte. But for an 8bit write, the unused byte ends up being garbage and destroys your memory layout. This is the reason why the GBA ROM space is only used for read-only memory.
What about Opera for the DS lite?
--
Opera is sold with a RAM extension in the GBA slot. Nobody knows how they did it. As there is only one program using this extension, I think they have recoded the program to avoid 8bit writes.
What possibilities exist to overcome this problem?
--
First of all, you can recode your programs to avoid 8bit writes. Given the endless amount of available Linux software, this would be an endless task, and nobody has tried to do this.
Second, you can modify the compiler (gcc) to do a 16bit read-modify-write instead of an 8bit write. This will lead to slow programs, because the compiler has to check for even or odd addresses before doing the read-modify-write. Nobody has made an attempt to do this.
Third, you can disallow all(!) write accesses to the GBA ROM space, and let the data abort handler in the kernel do the read-modify-write. This has been done (outside DSLinux), but the resulting programs suffer a massive slowdown (100 times).
And then, there is the cache+swp solution...
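For illustration, the 16bit read-modify-write that the second and third options rely on can be sketched in C as follows. This is only a sketch: `store_byte_rmw` is a hypothetical helper name, not something from gcc or DSLinux, and it assumes little-endian byte order. A modified compiler would emit the equivalent instructions inline rather than call a function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store one byte into memory that only accepts
 * 16bit accesses. `mem` is the halfword-aligned base of the region and
 * `offset` is a byte offset into it. Assumes little-endian byte order. */
static void store_byte_rmw(volatile uint16_t *mem, size_t offset, uint8_t value)
{
    assert(mem != NULL);

    volatile uint16_t *half = &mem[offset >> 1];
    uint16_t old = *half;                 /* 16bit read */
    uint16_t merged;

    if (offset & 1u)                      /* odd address: upper byte */
        merged = (uint16_t)((old & 0x00FFu) | ((uint16_t)value << 8));
    else                                  /* even address: lower byte */
        merged = (uint16_t)((old & 0xFF00u) | value);

    *half = merged;                       /* 16bit write */
}
```

The even/odd test on every byte store is the extra work that makes this approach slow.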
How does the solution work?
--
Pepsiman and I discussed the 8bit problem over IRC and email, and discovered that a data cache writeback will write back 16 bytes at once. So if you enable the data cache in writeback mode for the GBA ROM space, an 8bit write which hits data in the data cache is no longer a problem, because the 8bit write is transformed into a 16byte write (when the cacheline is written back).
But what about data cache misses? They don't go into the data cache and are written back 8bit wise!
Working through the ARM assembler manual, I discovered the SWP instruction. This instruction does a read and a write from/to the same address in a single, uninterruptible instruction! The read will load the data cache, and the write will produce a data cache hit. And no interrupt can come between the read and the write and invalidate the cache!
I wrote some assembler code in head.S and studied the behaviour of swp, and proved that it works.
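The reasoning above can be tried out on any host with a toy model. This is not the head.S code, only a simulation under stated assumptions: one cache line of 16 bytes (the size given above), reads allocate a line but write misses do not, and the "garbage" of a byte write on the bus is modelled as the same value landing in the neighbouring byte. All names (`struct sim`, `sim_strb`, `sim_swpb`, ...) are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES 16u  /* writeback size as stated in the post */
#define MEM_BYTES  64u  /* size of the toy memory, chosen arbitrarily */

struct sim {
    uint8_t  mem[MEM_BYTES];   /* stands in for the GBA ROM space */
    uint8_t  line[LINE_BYTES]; /* a single cache line */
    uint32_t tag;              /* base address of the cached line */
    int      valid;
    int      dirty;
};

static void sim_init(struct sim *s)
{
    memset(s, 0, sizeof *s);
    for (uint32_t i = 0; i < MEM_BYTES; i++)
        s->mem[i] = (uint8_t)i;
}

/* Write a dirty line back as one block; no single-byte bus write occurs. */
static void sim_flush(struct sim *s)
{
    if (s->valid && s->dirty)
        memcpy(&s->mem[s->tag], s->line, LINE_BYTES);
    s->dirty = 0;
}

static int sim_hit(const struct sim *s, uint32_t addr)
{
    return s->valid && (addr & ~(LINE_BYTES - 1u)) == s->tag;
}

/* Reads allocate: a miss loads the whole line into the cache. */
static uint8_t sim_ldrb(struct sim *s, uint32_t addr)
{
    assert(addr < MEM_BYTES);
    if (!sim_hit(s, addr)) {
        sim_flush(s);
        s->tag = addr & ~(LINE_BYTES - 1u);
        memcpy(s->line, &s->mem[s->tag], LINE_BYTES);
        s->valid = 1;
    }
    return s->line[addr - s->tag];
}

/* Plain byte store: a hit stays in the cache, a miss goes to the bus. */
static void sim_strb(struct sim *s, uint32_t addr, uint8_t v)
{
    assert(addr < MEM_BYTES);
    if (sim_hit(s, addr)) {
        s->line[addr - s->tag] = v;
        s->dirty = 1;
    } else {
        s->mem[addr] = v;
        s->mem[addr ^ 1u] = v;  /* assumed garbage in the other byte */
    }
}

/* swpb: the read fills the line, so the write that follows always hits. */
static uint8_t sim_swpb(struct sim *s, uint32_t addr, uint8_t v)
{
    uint8_t old = sim_ldrb(s, addr);
    sim_strb(s, addr, v);
    return old;
}
```

In this model, `sim_strb` on an uncached address damages the neighbouring byte, while `sim_swpb` on an uncached address followed by `sim_flush` leaves the neighbour intact.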
What are the drawbacks of this solution?
--
If the compiler is modified, ALL 8bit writes are turned into swps. So there is a slight drawback in speed in the main memory, and you cannot access 8bit hardware registers with GCC any more. For hardware access, you must use the writeb() macro instead.
swp does a data cache load first. So the first access to a sequence of bytes is slow, but the subsequent accesses are fast. If your program has many accesses to 8bit data scattered all over the memory map, execution speed will suffer. But this is not the expected behaviour for most programs. Most of the time, there is no visible slowdown.
swp uses a third CPU register and is limited in its addressing modes. So the code generated by GCC is larger and slower. How much larger and slower depends on the usage pattern. Again, most of the time the slowdown is not visible.
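To show what the extra register and the restricted addressing look like, here is a hedged sketch of a byte store done with swpb through GCC inline assembly. `store_byte_swp` is a hypothetical name, and this is not the code the modified compiler generates. The swpb path is only compiled for ARM state on pre-ARMv6 targets where the compiler defines `__ARM_ARCH`; everywhere else a plain, non-atomic stand-in is used so the sketch still builds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store a byte with SWPB where the target supports
 * it, and with a plain read-then-write elsewhere. Returns the old byte. */
static uint8_t store_byte_swp(volatile uint8_t *addr, uint8_t value)
{
    assert(addr != 0);
#if defined(__arm__) && !defined(__thumb__) && \
    defined(__ARM_ARCH) && (__ARM_ARCH < 6)
    uint32_t old;
    /* swpb Rd, Rm, [Rn]: Rd receives the old byte, Rm holds the new
     * byte, Rn holds the address. The address must be a plain base
     * register, with no offset or index forms as strb allows. */
    __asm__ volatile("swpb %0, %1, [%2]"
                     : "=&r"(old)
                     : "r"((uint32_t)value), "r"(addr)
                     : "memory");
    return (uint8_t)old;
#else
    uint8_t old = *addr;   /* host-side stand-in, not atomic */
    *addr = value;
    return old;
#endif
}
```

The output operand `old` is a register that a plain strb would not touch, and any address arithmetic has to be done in separate instructions beforehand.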
Modify the compiler? Are you crazy?
--
Puhhh.. this is a huge task, anyway. But there is the GCC devel mailing list, and there are some friendly people with deep knowledge in the GCC community. So I decided to give it a try. After all, if you don't try it, failure is guaranteed...
It turned out that making this modification is a _very_ huge task, because the semantics of strb and swpb are different and because we need a third register. There were months of trying, tests, and emails. Finally, we have a GCC which is able to compile the whole of DSLinux (kernel and userland) without "internal compiler error".