**Nobbs66** · 02-27-2015, 01:56 AM

(02-27-2015, 01:55 AM)dogen Wrote: Yeah, I totally get that. I'm just wondering how much work this would be. Would it require new mips and vu recs? Would it be something you could add into what we currently have? If so, would there be differences between fpu and vu implementations?

Just curious.

It probably would require a rewrite of the EE

**Blyss Sarania** · 02-27-2015, 01:58 AM

(02-27-2015, 01:56 AM)Nobbs66 Wrote: It probably would require a rewrite of the EE

I don't think so. The rounding and clamping modes are separate from the recompilers, and are applied to values as needed and based on settings. If there is a better way and we learn it, it should be relatively easy to implement.

The software FPU would be more complicated ofc, but I still don't think it would mean rewriting the recompilers.
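To make "applied to values as needed" a bit more concrete, here is a minimal sketch of a clamp applied to a result after the fact. It assumes that clamping means forcing Inf/NaN results back to the largest finite float with the sign kept. The name `clampToFinite` is hypothetical; this is not PCSX2's actual code, and the real clamp modes may do more than this.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper (not taken from PCSX2): bring a host float back into
// the finite range. Finite inputs pass through untouched; Inf and NaN become
// +/-FLT_MAX, keeping whatever sign bit the input carried.
inline float clampToFinite(float value) {
    if (std::isfinite(value)) {
        return value;
    }
    return std::copysign(std::numeric_limits<float>::max(), value);
}
```

Because a helper like this only touches a value after the operation has produced it, it can be swapped or tuned without changing the code that performs the operation, which is the point being made above.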

JMC47 · 02-27-2015, 02:06 AM

In Dolphin, it was literally added per floating point instruction. I have no idea how hard the PS2 implementation would be.

**Blyss Sarania** · 02-27-2015, 02:17 AM

It's worth looking into. I wish I knew more than I do lol.

**Nneeve** · 02-27-2015, 05:29 AM (This post was last modified: 02-27-2015, 05:31 AM by Nneeve.)

Thought I'd try and clear things up a bit:

The two main obstacles in getting perfectly accurate FPU/VU emulation are:

A) Actually figuring out the exact behaviour of the FPU instructions on a PS2. I gave this a stab some years back and figured out add/sub/max/min, but mul/div/sqrt/rsqrt eluded me. (All of those produce results that are not always identical to what standard IEEE FPUs produce.)
Maybe someone else can figure these out...
(Note: the VU instructions most likely behave the same as the FPU ones, just operating on a vector)

B) Once (if) you have the accurate FPU implementation, you have to worry about the performance. For the FPU, this is negligible. But an accurate VU implementation would almost definitely need to do a large amount of work on each vector element separately (not in parallel), seriously hurting performance. (Though hardware always marches forward, so don't let this stop you!)
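A rough sketch of what "each vector element separately" could look like, assuming a 4-lane vector and a scalar routine that does the accurate work. `Vec4`, `accurateAddScalar` and `accurateAddVec4` are made-up names, and the scalar routine here is only a placeholder that uses the host's own add:

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<float, 4>;  // hypothetical 4-lane vector type

// Placeholder for a bit-accurate scalar add. A real one would do the
// PS2-specific work in software; this stand-in just uses the host add.
inline float accurateAddScalar(float a, float b) {
    return a + b;
}

// Serial path: one scalar call per lane, instead of a single SIMD
// instruction that would handle all four lanes at once.
inline Vec4 accurateAddVec4(const Vec4& a, const Vec4& b) {
    Vec4 out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = accurateAddScalar(a[i], b[i]);
    }
    return out;
}
```

The cost of the scalar routine is paid four times per vector operation here, which is where the performance worry comes from.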

As of several years ago (and it doesn't look like much has changed?), the most accurate (least inaccurate) FPU implementation sits in iFPUd.cpp (notice the "d").
It emits rather than executes code, though, and is based on IEEE FPU instructions rather than doing things fully in software, so it's really much harder to understand. But it has comments giving background on the algorithms, so it's not too bad (I think!).

An (interpreted?) software FPU core would be a nice addition, but do realize that the only advantage of it is that it'd be a lot easier to read, study and modify.
Just by writing a software FPU core, issue (A) won't magically disappear.

Still, it could be a promising first step if one's serious about doing (A), and might help prepare you for it.

**Blyss Sarania** · 02-27-2015, 05:48 AM

Thanks for the information. So the situation is basically what I thought.

Akio · 02-27-2015, 08:53 PM

(02-27-2015, 05:29 AM)Nneeve Wrote: Thought I'd try and clear things up a bit:

Thanks for the information. Rep+1.

JMC47 · 02-28-2015, 01:39 AM

Yeah; I guess the step I was missing was just how much work and hardware testing magumagu did in order to make a software floating point implementation on the interpreter. It just kind of showed up one day.

dogen · 02-28-2015, 02:56 AM

(02-27-2015, 05:29 AM)Nneeve Wrote: B) Once (if) you have the accurate FPU implementation, you have to worry about the performance. For the FPU, this is negligible. But an accurate VU implementation would almost definitely need to do a large amount of work on each vector element separately (not in parallel), seriously hurting performance. (Though hardware always marches forward, so don't let this stop you!)

In its current state, does PCSX2 work on the vector elements individually or in parallel?

Sort of off-topic, but this is really interesting.
