
**rama** · 09-01-2009, 02:37 PM

Still nice to see how clever workarounds plus minimal extra code (clamping, soft opcodes, etc.) made something like 90% of all games happy.
Does anyone still remember how incompatible PCSX2 0.9.4 was with every second game? It was mostly due to the reasons described in this blog post.

frankdd89 · 09-09-2009, 12:14 AM

Hi CottonVibes!
Can I ask why you didn't check whether the value is NaN or Inf before the clamp?

float ps2_sqrt(float value) {
    value = clamp(value); // Clamp value if NaN or Inf to an ordered/normal number
    value = abs(value);   // Make positive
    value = sqrt(value);  // Get sqrt of now-positive value
    return value;
}

You could save a little time by checking the value first, like this:

float ps2_sqrt(float value) {
    if(value==NaN||value==Inf)
        value = clamp(value);
    if(value<0)
        value = abs(value);
    value = sqrt(value);
    return value;
}

I think this solution would be more efficient because abs and clamp are not called when they are not necessary!

(I don't know how you actually implemented it in the source code, though.)
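A side note on the check itself, as a minimal sketch: the `value==NaN||value==Inf` test above is pseudocode, and in real C it would need the classification macros from `<math.h>`, because NaN is unordered and never compares equal to anything, and a bare `==Inf` would also miss negative infinity. The helper names `needs_clamp` and `equals_nan` are hypothetical and are not taken from the PCSX2 source.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper (not from PCSX2): true when the value is NaN or
   +/-Inf, i.e. when a clamp would actually change it. */
bool needs_clamp(float value) {
    return isnan(value) || isinf(value);
}

/* The literal comparison from the pseudocode. NaN is unordered, so this
   returns false even when value itself is NaN. */
bool equals_nan(float value) {
    return value == NAN;
}
```

So `needs_clamp(NAN)` and `needs_clamp(-INFINITY)` are both true, while `equals_nan(NAN)` is false.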

echosierra · 09-09-2009, 12:31 AM

(09-09-2009, 12:14 AM)frankdd89 Wrote: Hi CottonVibes!
Can I ask why you didn't check whether the value is NaN or Inf before the clamp?

float ps2_sqrt(float value) {
    value = clamp(value); // Clamp value if NaN or Inf to an ordered/normal number
    value = abs(value);   // Make positive
    value = sqrt(value);  // Get sqrt of now-positive value
    return value;
}

You could save a little time by checking the value first, like this:

float ps2_sqrt(float value) {
    if(value==NaN||value==Inf)
        value = clamp(value);
    if(value<0)
        value = abs(value);
    value = sqrt(value);
    return value;
}

I think this solution would be more efficient because abs and clamp are not called when they are not necessary!

(I don't know how you actually implemented it in the source code, though.)

Adding two conditionals may be reason enough to avoid it completely. Depending on how it's implemented, those ifs could turn out to be very expensive.

dralor · 09-09-2009, 12:48 AM

I don't think it would be that expensive: NaN and Inf are constant values, so no calculation is needed there. Plus, not wasting cycles on the clamp and the abs is always a good thing. Though it could get really nasty if the developers coded for that behavior and thus coded to use buffer overflows as non-junk data.

**Air** · (This post was last modified: 09-09-2009, 12:56 AM by Air.)

The "safe" stereotype is that "branching is always bad". (This isn't actually true in every case, but if you assume it is, you'll end up being right way more often than wrong.) You are almost always better off using non-conditional solutions in situations like these. This is especially true of i7 and AMD chips, which have faster SSE units than the C2Duo and a slower branching unit.

dralor · 09-09-2009, 01:00 AM

Branching is that expensive? I was taught to use conditionals as much as possible. Though of course it is much less expensive as the branches get longer.

**rama** · (This post was last modified: 09-09-2009, 01:14 AM by rama.)

Ehm, don't forget that the clamp itself is the comparison.

In the SSE code we're using it looks like this:

Code:
void mVUclamp1(int reg, int regT1, int xyzw) {
    if (CHECK_VU_OVERFLOW) {
        switch (xyzw) {
            case 1: case 2: case 4: case 8:
                SSE_MINSS_M32_to_XMM(reg, (uptr)mVU_maxvals);
                SSE_MAXSS_M32_to_XMM(reg, (uptr)mVU_minvals);
                break;
            default:
                SSE_MINPS_M128_to_XMM(reg, (uptr)mVU_maxvals);
                SSE_MAXPS_M128_to_XMM(reg, (uptr)mVU_minvals);
                break;
        }
    }
}

If you were to extract the value first, test whether it needs clamping, and then clamp it depending on the result... well, that's some 10 operations right there.
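To make "the clamp itself is the comparison" concrete, here is a rough scalar model of the MINSS/MAXSS pair in plain C. It is only a sketch: it assumes `mVU_maxvals` and `mVU_minvals` hold the largest positive and negative finite floats (the snippet above does not show their contents), and `min_ss`, `max_ss` and `clamp_model` are hypothetical names. The SSE min/max instructions return their second operand when the comparison fails, which includes the NaN case, so one min and one max handle NaN and both infinities without a separate test.

```c
#include <float.h>
#include <math.h>   /* INFINITY and NAN, for callers trying this out */

/* Scalar model of MINSS/MAXSS: the second operand is returned whenever
   the comparison is false, which is also what happens for NaN input. */
float min_ss(float a, float b) { return (a < b) ? a : b; }
float max_ss(float a, float b) { return (a > b) ? a : b; }

/* Assumption: the clamp limits are +/-FLT_MAX. */
float clamp_model(float value) {
    value = min_ss(value, FLT_MAX);   /* +Inf and NaN become FLT_MAX */
    value = max_ss(value, -FLT_MAX);  /* -Inf becomes -FLT_MAX */
    return value;
}
```

In this model `clamp_model(NAN)` and `clamp_model(INFINITY)` both give `FLT_MAX`, and ordinary values pass through unchanged. Compilers targeting SSE often lower the two ternaries to MINSS and MAXSS, though that is not guaranteed.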

**Air** · (This post was last modified: 09-09-2009, 01:15 AM by Air.)

Depends on the CPU. On MIPS CPUs branching is pretty cheap, so if you're coding for a MIPS then yeah, you can use lots of branching with little or no penalty. On most Intel and AMD chip designs branching can result in all kinds of potential stalls, cache flushes and prediction misses, and generally wreaks havoc on the exception-safe 4-6 pipeline superscalar design (that's how many instructions the chips can execute in parallel given optimum conditions). So branching on our desktop CPUs of choice is less recommended.

Furthermore, the more branches you have on an Intel/AMD CPU, the faster you flood out your branch prediction cache. So even a whole crapload of otherwise very fast branches (99% reliable prediction) can indirectly end up slowing things down because they force other branches to mispredict more often.

So yeah, if you're optimizing code for an Intel or AMD CPU, you're usually best off avoiding branches. The guideline I usually go by is this: if I can remove a branch instruction by replacing it with 4 asm instructions (or 1-2 slow SSE instructions), I'll probably net a performance win.
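As an illustration of that guideline, here is a hedged sketch of the `if(value<0) value = abs(value);` step from earlier in the thread next to a branch-free equivalent that just clears the sign bit. It assumes `float` is a 32-bit IEEE 754 value; `abs_branch` and `abs_mask` are hypothetical names, not PCSX2 functions. On SSE the masked form corresponds to a single AND against a constant.

```c
#include <stdint.h>
#include <string.h>

/* Branchy version, as in the proposal earlier in the thread. */
float abs_branch(float value) {
    if (value < 0) value = -value;
    return value;
}

/* Branch-free version: clear the sign bit (bit 31) of the float.
   Assumes a 32-bit IEEE 754 float; memcpy avoids aliasing issues. */
float abs_mask(float value) {
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    bits &= 0x7FFFFFFFu;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Both functions return the same result for ordinary inputs; the masked one never has to be predicted.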

echosierra · 09-09-2009, 01:16 AM

(09-09-2009, 01:00 AM)dralor Wrote: Branching is that expensive? I was taught to use conditionals as much as possible. Though of course it is much less expensive as the branches get longer.

It really depends on the context.

For floats, it's a bad idea to do a comparison in a manner that may require the value to be copied to a general-purpose register and worked on there. Floating-point operations take a relatively long time anyway, so doing everything you can to avoid lengthening them is always a good idea.

There's nothing inherently wrong with using conditionals; they just aren't useful in this case. Any checks for Inf or NaN are better done outside of each and every function call. Attempting to check every single time you call a function is the naive approach and doesn't scale well.

dralor · (This post was last modified: 09-09-2009, 01:35 AM by dralor.)

Well, of course it is, which means it makes sense not to have to do it, as it's probably the default case in the switch and thus runs after every comparison. Assuming there are side cases for when it is a NaN or an Inf, though, if the number of comparisons is the same then of course it's a lose-lose. In fact, it might depend on how often NaNs and Infs occur, as that would lengthen the compare in the true case, since it would tell you it was a NaN twice. So most likely it's a lose-lose scenario, though that has nothing to do with branching itself.

Ahh, branch prediction was what I didn't factor in. I was just thinking of a compare with a jump instruction, as this branch doesn't need a return address stored in a register, as far as I could tell from that snippet of code. Bah, of course it would, as it needs an address to jump to after the clamp call, but still, it's not that major.
