AMD Phenom II 975 doesn't support SSE3/4?
#21
DaZ Optimisations =/= SSE2's DaZ instruction.

AMD and Intel both support DaZ with SSE2, but only Intel has implemented the IEEE 754 hardware-level optimisations.

On supported Core 2 and i7 processors, Intel runs float ops faster where DaZ is required because AMD did not implement the optimisations recommended in the IEEE 754 floating-point standard.

cottonvibes Wrote:hmm. well PCSX2 uses the DaZ and FtZ flags for SSE, and relies heavily on SSE optimizations.

setting the DaZ and FtZ flags is a huge speedup on Intel CPUs, and for AMD CPUs it's decent but not as huge of a speedup as Intel's (at least it wasn't with the AMD X2 architecture).

cottonvibes Wrote:Intel CPUs don't follow the spec better, they both follow it the same; performance-wise however, Intel CPUs get a significant speedup with DaZ + FtZ flags whereas AMD CPUs don't get as big of a speedup.
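
For anyone following along, here is a rough sketch of what the two flags mean. It emulates single-precision multiplication in Python and only shows the arithmetic rule: DaZ treats denormal inputs as zero, FtZ replaces a denormal result with zero. It is not how PCSX2 or any CPU implements this, it ignores the finer points of hardware underflow detection, and the names `to_f32`, `flush` and `mul_f32` are made up for the example.

```python
import math
import struct

F32_MIN_NORMAL = 2.0 ** -126  # smallest positive normal single-precision value


def to_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush(x):
    """Return a signed zero if x is a single-precision denormal, else x."""
    if 0.0 < abs(x) < F32_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


def mul_f32(a, b, daz=False, ftz=False):
    """Emulated single-precision multiply with optional DaZ / FtZ behaviour."""
    a, b = to_f32(a), to_f32(b)
    if daz:
        a, b = flush(a), flush(b)  # DaZ acts on the inputs
    result = to_f32(a * b)         # product of two floats is exact in a double
    if ftz:
        result = flush(result)     # FtZ acts on the result
    return result
```

With a denormal input such as `2.0 ** -127`, `mul_f32(2.0 ** -127, 2.0, daz=True)` gives zero while the same call with only `ftz=True` does not, because the result `2.0 ** -126` is a normal number. With normal inputs and a denormal result, as in `mul_f32(2.0 ** -126, 0.5, ftz=True)`, it is the other way around.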

SSE's pre-existing instructions get faster simply because the hardware improves; that's all the difference here between AMD DaZ and Intel DaZ.

Last I heard from my old AMD source (who left after the ATI buyout), K8 and the initial K10 had not actually implemented real hardware support for denormals-are-zero; they were faking it by trapping in software.

Finally found what I was looking for:
http://www.agner.org/optimize/instruction_tables.pdf
#22
(08-24-2011, 04:38 AM)Squall Leonhart Wrote: DaZ Optimisations =/= SSE2's DaZ instruction.

AMD and Intel both support DaZ with SSE2, but only Intel has implemented the IEEE 754 hardware-level optimisations.

On supported Core 2 and i7 processors, Intel runs float ops faster where DaZ is required because AMD did not implement the optimisations recommended in the IEEE 754 floating-point standard.

SSE's pre-existing instructions get faster simply because the hardware improves; that's all the difference here between AMD DaZ and Intel DaZ.

Last I heard from my old AMD source (who left after the ATI buyout), K8 and the initial K10 had not actually implemented real hardware support for denormals-are-zero; they were faking it by trapping in software.

Finally found what I was looking for:
http://www.agner.org/optimize/instruction_tables.pdf

I still wonder how much you know about actually coding something like that. Google and tech docs are nice, but what else?

Teach me a lesson if you can. ;)
#23
I can't see the point; that document does not even touch on "Denormals Are Zero", only on calculating with denormals, which I never disputed Intel implemented in hardware and the others did not.

Anyway, DAZ is an SSE2 instruction, and it is simple: is the number small enough to be a denormal? Then make it zero... there is no optimization involved. Please, let's finish this argument; I agree Intel has an edge in every application critical enough to take such small numbers into account and calculate with them instead of just rounding them.
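
To make the "small enough to be a denormal, so make it zero" rule concrete, here is a minimal Python sketch that emulates it in software for single-precision values. It only illustrates the arithmetic rule and says nothing about how any CPU implements DAZ; the names `is_denormal_f32` and `daz_f32` are made up for this example. A single-precision value is a denormal when its 8 exponent bits are all zero and its 23 mantissa bits are not.

```python
import math
import struct


def is_denormal_f32(x):
    """True if x, stored as IEEE 754 single precision, is a denormal."""
    # Reinterpret the 4 bytes of the float as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = (bits >> 23) & 0xFF  # 8 exponent bits
    mantissa = bits & 0x7FFFFF      # 23 mantissa bits
    return exponent == 0 and mantissa != 0


def daz_f32(x):
    """Software stand-in for DAZ: a denormal input becomes a signed zero."""
    if is_denormal_f32(x):
        return math.copysign(0.0, x)
    return x
```

For example, `2.0 ** -126` is the smallest normal single-precision value and passes through unchanged, while `2.0 ** -127` is a denormal and comes back as zero.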

Imagination is where we are truly real
#24
Quote:Anyway, DAZ is an SSE2 instruction, and it is simple: is the number small enough to be a denormal? Then make it zero... there is no optimization involved.

You're talking out your rear end.

Quote:Explanation 1: The XMM registers have some tag bits that are used for remembering whether floating point values are normal, denormal or zero. These tag bits have to be set when the output of an integer instruction is used as input for a single or double precision floating point instruction. This causes a so-called reformatting delay.
#25
I agree, DAZ is a mode rather than an instruction, as I had been calling it. The result is still the same; the real issue is that Intel provides hardware optimization to "calculate" with denormals. If that is not needed, they are simply made zero and things go fine.
#26
Whatever the *****. We know it's just some flags for the SIMD code, and it gets dealt with in a different way on each chip because there's no standard for that yet.

Better-optimized regular PC-only programs wouldn't even care about that faulty behaviour; they would just do something else and skillfully overwrite that data anyway. So in this scope it is a PCSX2-only matter, since PCSX2 relies on it.

Can we cut that now?! I do love integer math. :D