Vector Units, and CUDA / GPGPU
#1
The people best placed to answer this question probably don't even have to read the rest of my post because they know what I'm going to say :P

I've searched to check whether anything like this has been asked before and couldn't find anything. This is something I haven't been able to get out of my mind ever since I heard that emulating the vector units (VU0 and VU1) is often the major load on the CPU running PCSX2. I didn't want to ask yet another question a couple of weeks ago, after I'd received an excellent answer (from Air/Jake) on why a 64-bit binary wouldn't really help, but this one is still bugging me.

As far as I know, these vector units are currently being emulated on the CPU using JIT compilation. Would it be possible to offload the work of VU1 at least, and possibly VU0 as well, to the graphics card, using either CUDA or the upcoming GPGPU layer in DX11? The vector units sound suspiciously like they do the same sort of work that the shader units on modern graphics cards are capable of. The very name "vector" implies parallelism, and parallel vector calculations are what graphics cards are now designed to do.

Given that PCSX2 is almost always CPU bound, and the vector-units are a major part of that, could some of the VU1 and possibly VU0 work be done on the GPU through CUDA or a future DX11 GPGPU interface?

Eeeek, even as I ask this, I feel I'm probably asking something the devs involved have already considered a hundred times and know the answer to, but as I couldn't find an answer in the forums, I'm going to hit the 'Post Thread' button with some trepidation.
CPU: Athlon 64 X2 4400+ (2.2GHz @ solid 2.53GHz)
GPU: nVidia GeForce 8800GTS 640MB (not currently O/C)
Memory: 2GB DDR400 (2x 1GB @ DDR422 2.5-3-2)

#2
Quote:Why not use CUDA to make things faster?

CUDA works best when doing simple operations on many parallel threads.
This isn't very useful for PS2 emulation, and any implementation that attempts to use CUDA will most likely slow down the emu.
CUDA might however prove to be useful in a GS plugin.

From the FAQ...
For a longer technical answer, I guess you'll have to wait for someone from the dev team.
#3
http://forums.pcsx2.net/thread-4768.html < my post about cuda

some devs gave simple answers about it
#4
So in other words probably a Graphics plugin will be better using CUDA instead of DirectX?
#5
No, in other words, it's not something the PCSX2 team will attempt to use, but someone like Gabest might.
#6
"He might", but very unlikely I guess.
#7
very unlikely... hmm
I think it would depend on 3 things...

1. whether it's more effective
2. how many people can use the technology (if compute shaders work it'd be no prob)
3. whether he has fun coding it^^

(not in order of evaluated importance XD)


Oo I'm writing complete crap I guess I should hit the hay...
Chicken is not Vegan?

NO VEGAN DIET NO VEGAN POWERS!!!!

http://www.flixxy.com/my-blackberry-is-not-working.htm
#8
seems in order to me kabooz
#9
Here's why CUDA won't be good for VU emulation:

Wikipedia Wrote:# The bus bandwidth and latency between the CPU and the GPU may be a bottleneck.
# Threads should be running in groups of at least 32 for best performance, with total number of threads numbering in the thousands. Branches in the program code do not impact performance significantly, provided that each of 32 threads takes the same execution path; the SIMD execution model becomes a significant limitation for any inherently divergent task (e.g., traversing a ray tracing acceleration data structure).
# CUDA-enabled GPUs are only available from NVIDIA (GeForce 8 series and above, Quadro and Tesla).

1) Most likely it will be slow due to data having to travel back and forth between the GPU and main memory/the CPU.
2) There's no way you can find 32 things to run in parallel with the VUs. Running VU instructions in parallel would require some very complex algorithms to figure out dependency chains, stalls, and flag writes (at the proper times); it's pretty much asking the impossible of a human working on a free project...
3) Only nVidia GPUs will run it. That cuts the userbase in half, which means we won't do it ever.
Check out my blog: Trashcan of Code
#10
That's true, don't forget about ATI owners please :(