Thread: SH4 DMAC demo

                  
   
  1. #1
    wraggster

    Heinrich Tillack just emailed me with this rather cool news for us Dreamcast sceners:

    hi !
    SH4 DMAC demo

    Yeah, my first working SH4 DMAC demo.
    I use DMAC channel 1 in this demo to copy large amounts of data to the video screen!

    The demo also shows how PVR DMA, store queues (SQ), and a plain CPU loop perform the same copy operation.
    (There is also code for color table DMA, which is not working.)

    Download at http://a128.ch9.de
    Some results from this test, for 1000 loops transferring 640*480*4 bytes per loop:
    test memory Sh4->PVR
    51344 ms dmac 32bit mode
    16298 ms dmac 256bit mode
    40782 ms cpu loop
    16321 ms sq loop
    6222 ms dma loop
    test memory Sh4->Sh4
    8920 ms dmac 256bit copy loop
    18966 ms CPU 32bit copy loop

    bye
    heinrich

    Thanks to Heinrich for the very interesting news.
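
The timings above are easier to compare as throughput. The following is a minimal sketch, not part of Heinrich's demo: it takes the loop count and transfer size stated in the post (1000 loops of 640*480*4 bytes) and converts each timing to MB/s. The function name `throughput_mb_per_s` is made up for illustration, and 1 MB is assumed to mean 10**6 bytes.

```python
# Transfer size from the post: 1000 loops, one 640*480 frame at 4 bytes per pixel each.
LOOPS = 1000
BYTES_PER_LOOP = 640 * 480 * 4

# Timings in milliseconds, copied from the results in the first post.
TIMINGS_MS = {
    "Sh4->PVR dmac 32bit mode": 51344,
    "Sh4->PVR dmac 256bit mode": 16298,
    "Sh4->PVR cpu loop": 40782,
    "Sh4->PVR sq loop": 16321,
    "Sh4->PVR dma loop": 6222,
    "Sh4->Sh4 dmac 256bit copy loop": 8920,
    "Sh4->Sh4 CPU 32bit copy loop": 18966,
}


def throughput_mb_per_s(elapsed_ms, loops=LOOPS, bytes_per_loop=BYTES_PER_LOOP):
    """Average throughput in MB/s (assumption: 1 MB = 10**6 bytes)."""
    total_bytes = loops * bytes_per_loop
    seconds = elapsed_ms / 1000.0
    return total_bytes / seconds / 1e6


if __name__ == "__main__":
    for name, ms in TIMINGS_MS.items():
        print(f"{name:32s} {throughput_mb_per_s(ms):7.1f} MB/s")
```

By this calculation the "dma loop" moves roughly 197 MB/s, while the 32-bit DMAC mode manages roughly 24 MB/s.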

  2. #2
    GPF

    Code:
    test memory Sh4->PVR
    51362 ms  dmac 32bit mode
    16315 ms  dmac 256bit mode
    40983 ms  cpu loop
    16368 ms  sq loop
    6240 ms  dma loop
    test memory Sh4->Sh4
    8940 ms  dmac 256bit copy loop
    19020 ms  CPU 32bit copy loop

    Why are my results slightly slower?

    Also, how does this differ from using KOS's pvr_txr_load_dma?

    Thanks,
    Troy
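
To put "slightly slower" in numbers, here is a small sketch that compares the two sets of timings posted in this thread and reports the relative slowdown per test. The helper name `slowdown_percent` is hypothetical; the timings are copied from the two posts, and nothing here says anything about the cause of the difference.

```python
# Timings in milliseconds as (first post, second post), copied from the thread.
RUNS_MS = {
    "Sh4->PVR dmac 32bit mode": (51344, 51362),
    "Sh4->PVR dmac 256bit mode": (16298, 16315),
    "Sh4->PVR cpu loop": (40782, 40983),
    "Sh4->PVR sq loop": (16321, 16368),
    "Sh4->PVR dma loop": (6222, 6240),
    "Sh4->Sh4 dmac 256bit copy loop": (8920, 8940),
    "Sh4->Sh4 CPU 32bit copy loop": (18966, 19020),
}


def slowdown_percent(reference_ms, measured_ms):
    """Relative slowdown of measured_ms against reference_ms, in percent."""
    return (measured_ms - reference_ms) / reference_ms * 100.0


if __name__ == "__main__":
    for name, (reference, measured) in RUNS_MS.items():
        print(f"{name:32s} {slowdown_percent(reference, measured):+.3f} %")
```

Every test comes out less than 0.5% slower than in the first post, and the CPU loop shows the largest gap.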

  3. #3
    a128

    Quote: Originally posted by GPF
        Code:
        test memory Sh4->PVR

        8940 ms  dmac 256bit copy loop
        19020 ms  CPU 32bit copy loop

        Why are my results slightly slower?

        Also, how does this differ from using KOS's pvr_txr_load_dma?

        Thanks,
        Troy

    The CPU loop, and maybe the other loops too, may be slower because your GCC version does not generate code as good as the GCC 3.3.5 I used (the CPU loop in particular could suffer from a bad GCC version).

    Try uploading the test.elf from the archive. It should give the same results.

  5. #5
    ron

    Why don't you try upgrading to GCC 3.4.2? Its results seem much better than those of older versions. Anyway, thanks very much for your work.
    SH4 Risc LittleEndian

  6. #6
    GPF

    Quote: Originally posted by a128
        The CPU loop, and maybe the other loops too, may be slower because your GCC version does not generate code as good as the GCC 3.3.5 I used (the CPU loop in particular could suffer from a bad GCC version).

        Try uploading the test.elf from the archive. It should give the same results.

    The results were from the test.elf, except that I first ran $KOS_OBJCOPY -O binary -R .stack test.elf upload.bin, since my version of dc-tool-ip doesn't work with ELF files.

    I will try to compile it soon and test it with GCC 4.0.0 and post the results.

    Is your example faster than the KOS functions? Or are these functions that are not in KOS at the moment?

    Thanks,
    Troy

  7. #7

    It will be slightly slower if you upload over the BBA than with a coder's cable or from CD.

  8. #8
    a128

    Quote: Originally posted by GPF
        The results were from the test.elf, except that I first ran $KOS_OBJCOPY -O binary -R .stack test.elf upload.bin, since my version of dc-tool-ip doesn't work with ELF files.

        I will try to compile it soon and test it with GCC 4.0.0 and post the results.

        Is your example faster than the KOS functions? Or are these functions that are not in KOS at the moment?

        Thanks,
        Troy

    Does KOS use DMAC channel 1 for any operation? I guess not. Or did I miss something in the KOS source tree?

    For PVR DMA, I do not know whether it's faster. It's basically what KOS does, except for some settings that I think are closer to how the PVR DMA has to be set up.
    Just have a look at the source code. I have put notes where it differs from KOS.

    Does anyone know why the color table DMA (which can also move memory Sh4->PVR) is not working in that demo?

  9. #9
    GPF

    Quote: Originally posted by a128
        Does KOS use DMAC channel 1 for any operation? I guess not. Or did I miss something in the KOS source tree?

        For PVR DMA, I do not know whether it's faster. It's basically what KOS does, except for some settings that I think are closer to how the PVR DMA has to be set up.
        Just have a look at the source code. I have put notes where it differs from KOS.

        Does anyone know why the color table DMA (which can also move memory Sh4->PVR) is not working in that demo?

    OK, I'll take a look at the source. I'm not that familiar with the PVR code in KOS, other than knowing that PVR DMA doesn't work with Chankast.

    Thanks,
    Troy

  10. #10

    It may be possible that you get different results based on the hardware revision you use it on.
