Page 9 of 10 FirstFirst ... 5678910 LastLast
Results 81 to 90 of 92

Thread: [WIP] YAPSxP: Yet Another PSX Emulator for PSP

                  
   
  1. #81
    DCEmu Coder
    Join Date
    Aug 2006
    Location
    Bloomington, IN
    Age
    41
    Posts
    268
    Rep Power
    73

    Default

    Quote Originally Posted by hlide View Post
    IR0/1/2/3 are normally set from MAC0/1/2/3, but they are saturated, so they always stay between -32768 and 32767. As long as MAC0/1/2/3 don't exceed those min and max values, IR0/1/2/3 are okay. If they do exceed them, IR0/1/2/3 would be saturated to those min/max values anyway, so I think they are indeed okay. Plus, they are always loaded or stored as 16-bit words, so they fit in a float without trouble. It sounds as if the problem only concerns MAC0/1/2/3, TRX/Y/Z and R/B/BBK, which are always 32-bit words, if you need to load/store them as floats.
    Yeah, you're right, with the saturation they should be okay.
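
A quick way to check this, as a rough sketch in Python rather than PSP code: round-trip values through a 32-bit float and see which ones survive. `to_f32` is a helper name made up for this example, and it assumes IEEE-754 round-to-nearest, which may not match VFPU rounding in every case.

```python
import struct

def to_f32(x):
    """Round x to the nearest IEEE-754 single-precision value (helper invented for this sketch)."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

# Every saturated IR value (-32768..32767) survives the round trip exactly.
ir_exact = all(to_f32(v) == v for v in range(-32768, 32768))

# A full 32-bit MAC value may not: float32 has only a 24-bit significand.
mac = (1 << 24) + 1
mac_as_float = to_f32(mac)

print(ir_exact, mac, int(mac_as_float))
```

So the 16-bit IR registers are safe as floats, while a MAC value just above 2^24 already lands on a neighbouring value.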

    Quote Originally Posted by hlide View Post
    Yes, that's the main trouble: I cannot keep MAC0/1/2/3 in float registers, because a float can only hold a "24-bit integer" exactly while MAC0/1/2/3 need to store 1:31:0 bits. So if you use GPL (General purpose interpolation):
    MAC1=A1[MAC1 + IR0 * IR1]
    MAC2=A2[MAC2 + IR0 * IR2]
    MAC3=A3[MAC3 + IR0 * IR3]
    IR1=Lm_B1[MAC1]
    IR2=Lm_B2[MAC2]
    IR3=Lm_B3[MAC3]
    [0,8,0] Cd0<-Cd1<-Cd2<- CODE
    [0,8,0] R0<-R1<-R2<- Lm_C1[MAC1]
    [0,8,0] G0<-G1<-G2<- Lm_C2[MAC2]
    [0,8,0] B0<-B1<-B2<- Lm_C3[MAC3]

    and call GPL several times in a row, you accumulate errors. But it is also true that IR1/2/3 and Rx/Gx/Bx get saturated anyway, so as long as the code doesn't need to look at MAC0/1/2/3 it should be okay.
    For interpolation they probably won't; isn't it usually used for color?
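
To make the error accumulation concrete, here is a minimal sketch of the MAC/IR part of the quoted GPL pseudo-code, run once with exact integers and once with every intermediate rounded to float32. `gpl_step`, `to_f32` and `clamp` are names invented for the example; the overflow flags (A1..A3) and the color FIFO are left out, and IEEE-754 rounding is assumed, which the VFPU may not follow exactly.

```python
import struct

def to_f32(x):
    """Round to the nearest float32 (assumes IEEE-754 round-to-nearest)."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

def clamp(v, lo, hi):
    return max(lo, min(hi, v))

def gpl_step(mac, ir0, ir, as_float=False):
    """MAC[i] += IR0 * IR[i], then IR[i] = MAC[i] saturated to 16 bits."""
    rnd = to_f32 if as_float else (lambda v: v)
    new_mac = [rnd(rnd(m) + rnd(ir0 * i)) for m, i in zip(mac, ir)]
    new_ir = [clamp(int(m), -32768, 32767) for m in new_mac]
    return new_mac, new_ir

# MAC1 starts just above 2^24, MAC2 is far out of IR range, MAC3 is small.
mac = [(1 << 24) + 1, 40_000_000, 100]
ir0, ir = -4096, [4096, 1, 0]

_, exact_ir = gpl_step(mac, ir0, ir)
_, float_ir = gpl_step(mac, ir0, ir, as_float=True)
print(exact_ir, float_ir)
```

In this run the second component saturates to the same IR value either way, but the first one comes back into range carrying a rounding error of 1. That is the case to worry about: a MAC that went above 24 bits and is then reduced again.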

    Quote Originally Posted by hlide View Post
    I don't see any easy way to do it :/

    Setting FLAGS for MAC1/2/3 would typically look something like this?

    vli.t C000, [1<<(30-12), 1<<(29-12), 1<<(28-12)] // bits A1p, A2p, A3p
    vli.t C010, [1<<(27-12), 1<<(26-12), 1<<(25-12)] // bits A1n, A2n, A3n
    vli.t C020, [MIN, MIN, MIN] // -32767.0
    vli.t C030, [MAX, MAX, MAX] // +32768.0
    vslt.t C020, MAC1MAC2MAC3, C020 // cn[i] = IR[i] < (1-(1<<31)) ? 1.0 : 0.0 (unsure about this instruction)
    vsge.t C030, MAC1MAC2MAC3, C030 // cp[i] = IR[i] >= (0+(1<<31)) ? 1.0 : 0.0
    vmul.t C020, C020, C000 // A[i]p = cp[i]<<(31-i)
    vmul.t C030, C030, C010 // A[i]n = cn[i]<<(28-i)
    vadd.t C020, C030, C020 // A[i] = A[i]p + A[i]n
    vadd.s S020, S020, S021 // FLAGS = A[1] + A[2];
    vadd.s S020, S020, S022 // FLAGS += A[3];
    vf2i.s S020, S020
    mfv t8, S020
    or t9, t9, t8 // t9 contains (GTE FLAG >> 12).

    ...

    li t8, 0x7F87E // CHECKSUM = 1 if one of those bits are set
    and t8, t9, t8
    beql t8, zero
    lui t8, 8
    or t9, t8, t9
    sll t9, t9, 12
    mtv t9, FLAG



    for IR1/2/3 with lm=0 (no negative limit):

    vli.t C000, [1<<(24-12), 1<<(23-12), 1<<(22-12)] // bits B1, B2, B3
    vli.t C010, [MAX, MAX, MAX] // 65535.0
    vslt.t C020, C010, IR1IR2IR3 // 65535.0 < IR[i] ? 1.0 : 0.0
    vmin.t IR1IR2IR3, IR1IR2IR3, C010 // IR[i] = min(IR[i], 65535.0)
    vmul.t C020, C020, C000
    vadd.s S020, S020, S021
    vadd.s S020, S020, S022
    vf2i.s S020, S020
    mfv t8, S020
    or t9, t9, t8 // t9 contains GTE FLAG.
    ...
    blahblah

    well i dunno if it works...
    Well, of course I didn't know the vslt instructions existed; in fairness you only just found them, so that changes things a lot. What I was saying is that you can keep the GTE flags in their own VFPU registers if possible (if you have room) and only do the conversions when you need them.
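
As a reference model for the IR1/2/3 limit check quoted above, here is a small Python sketch. The flag bit positions (B1, B2, B3, pre-shifted right by 12) are taken from the quoted code; `saturate_ir` and its `lo`/`hi` parameters are names invented here, and the default limits are simply the 16-bit range mentioned earlier in the thread, since the real limits depend on the lm bit.

```python
# Saturation flag bits for IR1, IR2, IR3 in (FLAG >> 12) form, as in the quoted code.
IR_FLAG_BITS = (1 << (24 - 12), 1 << (23 - 12), 1 << (22 - 12))

def saturate_ir(values, lo=-32768, hi=32767):
    """Clamp three IR candidates to [lo, hi]; return (clamped values, flag bits >> 12)."""
    flags = 0
    clamped = []
    for v, bit in zip(values, IR_FLAG_BITS):
        c = max(lo, min(hi, v))
        if c != v:
            flags |= bit  # this component had to be saturated
        clamped.append(c)
    return clamped, flags

values, flags = saturate_ir([100, 40000, -40000])
print(values, hex(flags))
```

The flag word stays a plain small number here, so it can sit in a register in that form until something actually reads FLAG.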

    Quote Originally Posted by hlide View Post
    Well, I use 6 matrices permanently, so only matrices 0 and 1 are left for transient use in my emulator. :/
    I'd have to see what kind of layouts you can get away with. 2 matrices is still 32 registers right?

    Quote Originally Posted by hlide View Post
    well, I think you see the point why i cannot load/store GTE registers as float register :/
    Yes, but since you can still statically determine within blocks what form the registers are taking, you can at least leave them cached as floating point registers in between certain GTE instructions, don't you think? And hope the error doesn't accumulate too much.
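
One way to picture this idea, purely as a hypothetical sketch: the recompiler tracks per register whether the cached copy is in integer or float form, and only emits a conversion when the form actually has to change. `GteRegCache` and the `int_to_float` / `float_to_int` pseudo-instructions are invented for illustration and are not taken from any real emulator.

```python
class GteRegCache:
    """Tracks whether each cached GTE register is currently in int or float form."""

    def __init__(self):
        self.form = {}     # register name -> "int" or "float"
        self.emitted = []  # conversion pseudo-instructions, for illustration only

    def load(self, reg):
        # lwc2: the value arrives as a raw integer word.
        self.form[reg] = "int"

    def use_as_float(self, reg):
        # Convert only if the cached copy is not already a float.
        if self.form.get(reg) != "float":
            self.emitted.append(f"int_to_float {reg}")
            self.form[reg] = "float"

    def store(self, reg):
        # swc2: memory must see the integer form again.
        if self.form.get(reg) == "float":
            self.emitted.append(f"float_to_int {reg}")
            self.form[reg] = "int"

cache = GteRegCache()
cache.load("MAC1")
cache.use_as_float("MAC1")  # first GTE instruction in the block
cache.use_as_float("MAC1")  # second one reuses the float copy
cache.store("MAC1")
print(cache.emitted)
```

Two GTE instructions in a row cost one conversion in and one conversion out, instead of a pair per instruction.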

  2. #82
    DCEmu Coder
    Join Date
    Sep 2006
    Posts
    57
    Rep Power
    0

    Default

    Quote Originally Posted by Exophase View Post
    For interpolation they probably won't; isn't it usually used for color?
    I guess
    Quote Originally Posted by Exophase View Post
    Well, of course I didn't know the vslt instructions existed; in fairness you only just found them, so that changes things a lot. What I was saying is that you can keep the GTE flags in their own VFPU registers if possible (if you have room) and only do the conversions when you need them.
    don't worry, i really understand your point.

    Actually, this code has some errors, and I just found a way to squeeze it down further:

    ____________________
    viim.s S012, 8192 # 1<<(25-12), weight for A3n
    viim.s S011, 16384 # 1<<(26-12), weight for A2n
    vadd.s S010, S011, S011 # 1<<(27-12), weight for A1n
    viim.s S000, 8
    vscl.t C000, C010, S000 # C000 = 8 * C010 = [1<<(30-12), 1<<(29-12), 1<<(28-12)], weights for A1p, A2p, A3p

    lvi.t C020, [-2147483648.0, -2147483648.0, -2147483648.0] # TOFIX they must be 43-bit limit not 30-bit limit :/
    lvi.t C030, [+2147483647.0, +2147483647.0, +2147483647.0] # TOFIX

    vslt.t C020, C130, C020 # cn[i] = MAC[i] < -2147483648.0 ? 1.0 : 0.0
    vsge.t C030, C130, C030 # cp[i] = MAC[i] >= +2147483647.0 ? 1.0 : 0.0

    vdot.t S100, C030, C000 # Ap = cp[0] * (1<<(30-12)) + cp[1] * (1<<(29-12)) + cp[2] * (1<<(28-12))
    vdot.t S133, C020, C010 # An = cn[0] * (1<<(27-12)) + cn[1] * (1<<(26-12)) + cn[2] * (1<<(25-12))
    vadd.s S133, S100, S133 # (FLAGS >> 12) = (An + Ap)
    ...
    ___________________________

    Funny to think that you can use a dot product to merge bits. Well, I hope it only takes 1 cycle :/
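
The reason the dot product works as a bit merge: each comparison result is 0.0 or 1.0 and each weight is a distinct power of two, so the sum is the same as a bitwise OR. A small Python model follows; `mac_overflow_bits` and its `lo`/`hi` parameters are names made up here, and the limits are passed in because the post itself still marks them as TOFIX.

```python
# Weights are the FLAG bit positions, pre-shifted right by 12 as in the post.
POS_WEIGHTS = (1 << (30 - 12), 1 << (29 - 12), 1 << (28 - 12))  # A1p, A2p, A3p
NEG_WEIGHTS = (1 << (27 - 12), 1 << (26 - 12), 1 << (25 - 12))  # A1n, A2n, A3n

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mac_overflow_bits(mac, lo, hi):
    """Return the MAC1/2/3 overflow bits in (FLAG >> 12) form."""
    cn = [1.0 if m < lo else 0.0 for m in mac]   # like vslt
    cp = [1.0 if m >= hi else 0.0 for m in mac]  # like vsge
    return int(dot(cp, POS_WEIGHTS) + dot(cn, NEG_WEIGHTS))

# MAC1 overflows positively, MAC2 is in range, MAC3 overflows negatively.
bits = mac_overflow_bits([2.0**32, 0.0, -2.0**32], lo=-2.0**31, hi=2.0**31)
print(hex(bits))
```

All the values involved are small powers of two, so the float additions are exact and converting the result to an integer loses nothing.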

    When you read FLAG (S333) with "mfc2 rd, FLAG", you just need to:
    convert the FLAG content to an integer, test it against a mask to see whether the CHKSUM bit must be set, and shift everything left by 12 bits. (I don't see why we would need to do this after every GTE instruction; doing it only when FLAG is read should be enough.) Of course, it would be better still if we could set the other bits only when FLAG is read, but that is more difficult because you need to pre-analyse the code being generated.
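
Those three steps, modelled in Python as a sketch. The mask 0x7F87E and the summary bit (the `lui t8, 8` value, i.e. bit 31 of FLAG shifted right by 12) come from the code quoted earlier in the thread; `read_flag` is a name invented for the example.

```python
ERROR_MASK = 0x7F87E   # bits of (FLAG >> 12) that feed the CHKSUM bit
SUMMARY_BIT = 0x80000  # bit 31 of FLAG, pre-shifted right by 12

def read_flag(flag_shifted):
    """Build the 32-bit FLAG value from the float kept in (FLAG >> 12) form."""
    bits = int(flag_shifted)     # float -> integer
    if bits & ERROR_MASK:        # any error bit set?
        bits |= SUMMARY_BIT      # then set the CHKSUM bit
    return (bits << 12) & 0xFFFFFFFF

print(hex(read_flag(float(1 << 18))))  # MAC1 positive overflow
print(hex(read_flag(float(1 << 10))))  # IR3 saturated only
```

The second call shows a bit that is outside the mask: it is reported in FLAG but does not raise the CHKSUM bit.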


    Quote Originally Posted by Exophase View Post
    I'd have to see what kind of layouts you can get away with. 2 matrices is still 32 registers right?
    Yes: 4 x 4 = 16 registers, so you need 2 matrices for 32 registers. The GTE has 64 registers, so 4 matrices. My GPR and GTE registers are loaded and stored in VFPU registers. The 2 remaining matrices are scratch matrices. For MDEC (I'm planning to use the VFPU vbfy1/2 "butterfly" instructions; I recently dug into the MDEC algorithm) I may need to "steal" some matrices from the GTE, since I don't think MDEC is used in conjunction with the GTE. Keep in mind that the GTE registers are in integer form; if you want all of them as floats, you will need nearly 8 matrices, if I remember correctly.



    Quote Originally Posted by Exophase View Post
    Yes, but since you can still statically determine within blocks what form the registers are taking, you can at least leave them cached as floating point registers in between certain GTE instructions, don't you think? And hope the error doesn't accumulate too much.
    _______________ DeViL code >8>=>
    lwc2 MAC1, 0(a0)
    lwc2 MAC2, 4(a0)
    lwc2 MAC3, 8(a0)
    ...

    bltz a1, t0, 0f
    nop
    jal gte_gpl12
    ...
    b 1f
    ...
    0: jal gte_gpl0
    ...
    1:
    swc2 MAC1, 0(a2)
    swc2 MAC2, 4(a2)
    swc2 MAC3, 8(a2)
    ...
    _______________

    How do you know whether ?(a0) actually holds a [1:3:12] value or a [1:16:0] value before you reach the actual GPL12 instruction?
    And how will you store ?(a2)?
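
To spell out why the format matters: the same raw word gives a completely different float depending on which interpretation is assumed, so the conversion cannot be picked at load time without knowing the consumer. A minimal Python sketch, where both helper names are invented and the 16-bit sign extension is an assumption made just for the illustration:

```python
def sign_extend_16(raw):
    """Treat the low 16 bits of raw as a signed value."""
    raw &= 0xFFFF
    return raw - 0x10000 if raw & 0x8000 else raw

def as_fixed_point(raw):
    """Interpret the word as signed fixed point with 12 fractional bits."""
    return sign_extend_16(raw) / 4096.0

def as_integer(raw):
    """Interpret the same word as a plain signed integer."""
    return float(sign_extend_16(raw))

raw = 0x1000
print(as_fixed_point(raw), as_integer(raw))
```

The same word is 1.0 under one reading and 4096.0 under the other, and the lwc2 alone does not say which one the later GPL call expects.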

  3. #83
    GP2X Coder
    Join Date
    Jul 2006
    Posts
    102
    Rep Power
    69

    Default

    I believe MDEC is completely separate from the GTE, so there should be no worries there; the two shouldn't "mix".

  4. #84
    DCEmu Newbie
    Join Date
    Aug 2006
    Posts
    84
    Rep Power
    0

    Default

    Sorry, I'm a little slow... but is dynarec implemented in this emulator already?

  5. #85
    DCEmu Coder
    Join Date
    Sep 2006
    Posts
    57
    Rep Power
    0

    Default

    The dynarec is 90% done; I still need to do the GTE, COP0 and HLE parts of it.
    What is missing is complete hardware emulation (GPU, CDR, SIO, etc.).
    So there is a dynarec but not an emulator. The plan is to pair up with another developer to make a complete emulator, and I'm waiting for SVN access so we can develop together.

  6. #86
    DCEmu Regular
    Join Date
    Jan 2006
    Location
    UK
    Age
    48
    Posts
    308
    Rep Power
    72

    Default

    Quote Originally Posted by hlide View Post
    The dynarec is 90% done; I still need to do the GTE, COP0 and HLE parts of it.
    What is missing is complete hardware emulation (GPU, CDR, SIO, etc.).
    So there is a dynarec but not an emulator. The plan is to pair up with another developer to make a complete emulator, and I'm waiting for SVN access so we can develop together.
    Hi hlide.

    This may be unfeasible, but if you have an almost complete dynarec, and the Anonymous Coder has an emulator without a working dynarec, why don't you ask Wraggy to put you guys in touch and see if you can't collaborate....?

  7. #87
    DCEmu Coder
    Join Date
    Aug 2006
    Location
    Bloomington, IN
    Age
    41
    Posts
    268
    Rep Power
    73

    Default

    He wants to promote new emulators written from scratch.

  8. #88
    DCEmu Regular
    Join Date
    Jan 2006
    Location
    UK
    Age
    48
    Posts
    308
    Rep Power
    72

    Default

    Quote Originally Posted by Exophase View Post
    He wants to promote new emulators written from scratch.
    Well, that's an honourable enough intention. I didn't realise though: is the Anonymous Coder's emulator a PCSX port?

  9. #89
    DCEmu Coder
    Join Date
    Sep 2006
    Posts
    57
    Rep Power
    0

    Default

    Quote Originally Posted by tsurumaru View Post
    Hi hlide.

    This may be unfeasible, but if you have an almost complete dynarec, and the Anonymous Coder has an emulator without a working dynarec, why don't you ask Wraggy to put you guys in touch and see if you can't collaborate....?
    Well, the way I want my dynarec to run is not really compatible with his emulator, which is undoubtedly derived from a PCSX-like emulator, because my dynarec is very specific to the PSP. I mean, I cannot even use it on another MIPS-based platform. What you need is not to adapt the dynarec to the emulator but to adapt the emulator to the dynarec, if you want to keep its full potential, which is basically the same as rewriting from scratch.

  10. #90
    Won Hung Lo wraggster's Avatar
    Join Date
    Apr 2003
    Location
    Nottingham, England
    Age
    53
    Posts
    141,446
    Blog Entries
    3209
    Rep Power
    50

    Default

    This is turning out to be a very interesting thread, not that I get a chance to read threads these days; too busy / too tired.
