What would happen if...
#1
hi

I've seen there's a "multithread" option in the GSdx plugin's software mode settings, and I was wondering...
How would the result look on a CPU with a large number of physical cores/thread units (like the i7), running a graphics-heavy game (like GT4 or Tekken 5) with that value set to, I don't know, 4 or 5?
Only the CPU would be used (no GPU involved), but with highly parallel rendering. I guess in that case the graphics card would be the bottleneck...

Is anyone willing to try?

(I know, maybe I just have nothing to do on this hot afternoon, lol)

#2
Not much.

The current design of the software rasterization model in GSdx causes the "weight" of the thread synchronization to outweigh the benefit of having multiple workers. Put another way, all your cores end up spending their time waiting on the other cores, and none of them actually does any more work than it would in a dual or quad core setup.

I get the best performance from GSdx Software mode on my Quad using MTGS with 2 software rasterizer threads (which actually works out to 3 total threads, since GSdx also uses the MTGS thread for rasterization).

This makes sense, and you'd think extending it further would be good -- until you learn the whole story. The fact is that 2 rasterizer threads perform only about 3-5% better than 1 rasterizer thread (which only utilizes 3 cores), and 1 rasterizer thread is in turn only about 10-15% faster than using no extra threads (MTGS only, dual core). If I turn off MTGS and use 3 rasterizer threads, I don't even come close to matching the performance of MTGS with 0 threads. Furthermore, I'm still quite GS limited, which means the EE core is idling ~40% of the time -- which in theory means that using one more raster thread should improve performance. But it doesn't... 3 raster threads take me from 90% cpu to 100% cpu, and drop my fps by 3-5%. -_-

(... and also causes my cpu to get really hot)

It's something I was looking at trying to improve, but it's a bit tricky. The GS typically batches primitives in small "jobs", which means that when you divide the jobs across more and more threads, they get so small that the overhead of dispatching each job outweighs the time a thread needs to actually do it. So all the workers just spend their time waiting for the main thread to prepare the next job. >_<
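
To make the dispatch-overhead argument concrete, here is a toy cost model (not GSdx code, and not based on measurements). It assumes the main thread prepares one slice per worker serially and the slices then run in parallel. The function names and the microsecond figures are hypothetical, chosen only for illustration.

```python
def batch_time(work_us: float, workers: int, dispatch_us: float) -> float:
    """Toy model of one batch: serial dispatch cost plus parallel work time.

    work_us     -- total rasterization work in the batch (hypothetical value)
    workers     -- number of worker threads the batch is split across
    dispatch_us -- main-thread cost to prepare one slice (hypothetical value)
    """
    if workers < 1:
        raise ValueError("workers must be >= 1")
    serial_dispatch = dispatch_us * workers  # main thread prepares each slice in turn
    parallel_work = work_us / workers        # slices then run side by side
    return serial_dispatch + parallel_work


def best_worker_count(work_us: float, dispatch_us: float, max_workers: int = 8) -> int:
    """Return the worker count (1..max_workers) with the lowest modelled time."""
    return min(range(1, max_workers + 1),
               key=lambda n: batch_time(work_us, n, dispatch_us))


if __name__ == "__main__":
    # A small job: adding workers stops paying off very quickly.
    for n in (1, 2, 4, 8):
        print(n, batch_time(100, n, 10))
    # A large job: the same dispatch cost is negligible, so more workers help.
    print(best_worker_count(100, 10), best_worker_count(10000, 10))
```

In this model a small batch is fastest with only a few workers, while a large batch keeps scaling, which matches the shape of the behaviour described above, though the real synchronization costs in GSdx are certainly more complicated.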
Jake Stine (Air) - Programmer - PCSX2 Dev Team
#3
(08-04-2009, 03:19 PM)Air Wrote: Not much.


Incredibly, I understood it all (university efforts sometimes pay off, lol).
A linear speedup is usually expected when a large batch of "static" work is divided among workers, but when the work becomes "dynamic" (it depends on other workers' jobs) the gain drops, and all you get from multithreading is multi-overheading. You could do the same work with fewer cores and less idle time.
That's a real pity.

Thanks for the explanation and your time.
I heard an Amdahl ring in my head.
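
For anyone who hasn't met it, Amdahl's law says that if a fraction p of the work can be parallelized across n workers, the best possible speedup is 1 / ((1 - p) + p / n). A minimal sketch follows; the value of p used here is made up for illustration and is not a measurement of GSdx.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup from Amdahl's law.

    parallel_fraction -- share of the work that can run in parallel (0.0 to 1.0)
    workers           -- number of parallel workers (>= 1)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical case: half of the work is parallelizable.
    # The speedup can never reach 2x, however many cores are added.
    for n in (1, 2, 4, 8, 64):
        print(n, round(amdahl_speedup(0.5, n), 3))
```

Note that the law only gives an upper bound: it ignores synchronization and dispatch costs, which is why real results can get worse as threads are added.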
#4
That said, using GSdx in software mode with 7 threads on an i7, plus PCSX2 in MTGS mode, is very nearly as fast as hardware mode (and faster in some cases), using a GTX 280 graphics card.
#5
(08-04-2009, 03:44 PM)refraction Wrote: that said using GSDX in software mode with 7 threads on an i7, plus pcsx2 in MTGS mode is pretty much nearly as fast (faster in some cases) than hardware mode (using a GTX 280 graphics card)

D'oh, I took Air's words to mean "no, there's no real optimization", and then you say the opposite.
Maybe I didn't get Air's point at all?
#6
Measuring hardware performance depends on whether you're using native resolution in HW mode, and also on whether you've benchmarked with the latest GSdx revisions, which have had significant speed improvements for almost every game that used to run slower in HW mode than in SW mode.

I have a Q9400. Going back less than 2 months, software mode was as fast or faster than hardware mode on most games. All of that has since been fixed however, and I'm not sure if Ref's done a new sweep of benchmarks since getting his i7 (which was before GSdx received the recent wave of major HW mode performance boosts). Since HW mode got boosted, now only a few select games are faster in SW mode, and only by a small amount. Most games run significantly faster in HW mode now, and my video card is considered "poop" by most people.

(for the record, most of my benchmark games run 75-220fps in HW, and 60-140fps in software -- I typically use speedhacks when benchmarking GS stuff, so that I work the video as hard as possible and reduce the EEcore's skew factor).

... and I might add, the comparison is a bit apples to oranges: HW mode almost always looks a LOT better. Even games with "dirty" text (lines caused by upscaling) are usually much more enjoyable in GSdx HW mode with a high internal resolution than they are in native res software mode (unless the game is basically just still images and menus, in which case upscaling isn't helpful -- but those are an extreme minority to be sure). So even if software is "matching" hardware in framerate, it doesn't really end up feeling like much of an achievement to you, the user.