OpenGL renderer is slow on AMD GPU due to inefficient driver. Sorry.
#21
Yeah, Robert Hallock of AMD talked about this on Twitter during a recent AMA, I believe. AMD could've implemented "multithreaded optimizations" (what this actually refers to is DCLs, or driver command lists, a feature that exists in DX11 and I guess OpenGL too) if they wanted to, but they opted to focus on Mantle instead (and its forks, DX12 and Vulkan). This is where that whole "AMD has more driver overhead than NV in DX11" hubbub came from.
[Image: WdHHrza.png]
(image courtesy of some dude at Reddit)

DX11, like its predecessors, basically only has a single submission queue point, which essentially means only one core is being used for feeding the GPU. What Nvidia did was take the commands being sent from the API and split them up the moment they arrived in the driver, assigning them to multiple threads and then sending the result to the GPU and, ultimately, the end user. DCLs have their pros and cons, but Robert claimed that you get the most draw calls out of the API by just using a single core/thread like they do, hence the "extra overhead" you hear so much about. This is because with DCLs you have no idea when each command list will be submitted to the GPU, yet you need to patch each command list with GPU addresses before sending it to the GPU. So the one core that's your submission queue thread on DX11 has to stop what it's doing and spend time going through the DCLs on the other threads.
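
For anyone who wants to see the shape of that flow, here is a minimal Python sketch. It is only a toy model: `CommandList`, `record_list`, `submit` and the address table are made-up names for illustration, not anything from a real driver or graphics API. Worker threads record lists in parallel, and a single submission thread then has to walk every list and patch in addresses before anything goes out.

```python
# Toy model of deferred command lists (DCLs). All names are hypothetical;
# no real graphics API is called.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class CommandList:
    # Commands refer to resources by name. Final GPU addresses are not
    # known at record time, so they are patched in at submission.
    commands: list = field(default_factory=list)


def record_list(draws):
    """Runs on a worker thread: builds a list without touching the GPU."""
    cl = CommandList()
    for resource in draws:
        cl.commands.append(("draw", resource))
    return cl


def submit(lists, address_table):
    """Runs on the single submission thread: patch, then emit in order."""
    gpu_stream = []
    for cl in lists:
        for op, resource in cl.commands:
            # Patching step: resolve the placeholder to an address.
            gpu_stream.append((op, address_table[resource]))
    return gpu_stream


work = [["tex0", "tex1"], ["tex2"], ["tex0", "tex2"]]
addresses = {"tex0": 0x1000, "tex1": 0x2000, "tex2": 0x3000}

with ThreadPoolExecutor(max_workers=3) as pool:
    recorded = list(pool.map(record_list, work))  # parallel recording

stream = submit(recorded, addresses)  # serial patching and submission
print(len(stream), "commands submitted from", len(recorded), "lists")
```

The recording part scales with the number of threads, but `submit` stays serial, which is the cost being argued about in this thread.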

edit: It's worth mentioning that Hallock referred to DCLs as useless for DX11, implying that they only benefit short benchmarks to artificially inflate scores, but various tests showed Nvidia getting a drastic increase in a before-and-after scenario when they added DCL support to their driver. My take on it is that, in spite of how flawed they can be, command lists still give a decent bonus and AMD just opted not to implement them in their DX11 driver in order to focus on Mantle/DX12/Vulkan instead. Basically DX11 is legacy at this point and most newer games are undoubtedly being developed with DX12 and Vulkan in mind.

In AMD's GCN performance guide, they recommended that devs dedicate one core to being the main GPU rendering thread, and move all other work to other cores/threads instead. I guess many devs forewent this and did things differently. However, in more recent games, AMD's DX11 performance has drastically improved.

My guess is that OpenGL does things in a similar manner, and since AMD doesn't let users toggle the DCL functionality in the driver control panel like Nvidia does, you lose out on these supposed benefits. DX12 and Vulkan inherently *require* DCLs, which is what AMD designed GCN and Mantle around in the first place (not DCLs in particular, but a more efficient approach to feeding the GPU in general). This is why you see much better driver and game performance on GCN GPUs in DX12 and eventually Vulkan, because they can fully exploit the uarch and card's potential.

So I guess under older OpenGL you'll just have to deal with it, preferably until somebody does a Vulkan version. No real reason to stick with crappy old OpenGL and DX11 when Vulkan drivers are a real thing now and are better than the preceding APIs in every possible way.

#22
Nvidia multithreading is more or less that:
Quote: In AMD's GCN performance guide, they recommended that devs dedicate one core to being the main GPU rendering thread, and move all other work to other cores/threads instead.
Nvidia does it once, and it is available for all games. AMD doesn't have enough money, so they ask the developers to do it. That is very cost-effective for a big studio, but it is a killer for a small project like ours. AMD's goal is to move as much driver code as possible into the application.

AMD GPUs also have several GPU queues. And the driver cost isn't in transferring the commands to the GPU but in generating and validating them. After that, it is a matter of a couple of PCIe transactions. And you have a single PCIe port ;)

The idea behind DCLs and related stuff (aka the new APIs) is to replace this code:
Code:
while (1) {
  set state A
  draw
  set state B
  draw
}
with this code:
Code:
/* cold code */
set state A
record state A
set state B
record state B

/* hot code */
while (1) {
  get recorded state A
  draw
  get recorded state B
  draw
}
So in short, you trade CPU time for memory.
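
The same idea as a small runnable Python sketch, for readers who want to count the work. This is only an illustration under stated assumptions: `validate` is a hypothetical stand-in for the driver's "generate and validate" step, and the state dictionaries and frame count are made up.

```python
# Toy model: 'validate' stands in for the driver work of generating and
# validating a command. It is hypothetical, not a real API.
counter = {"validations": 0}


def validate(state):
    """Expensive CPU-side step: turn a state description into a command."""
    counter["validations"] += 1
    return ("cmd", tuple(sorted(state.items())))


def draw(command, out):
    """Cheap step: hand an already-built command to the output stream."""
    out.append(command)


STATE_A = {"blend": "on", "depth": "less"}
STATE_B = {"blend": "off", "depth": "always"}
FRAMES = 1000

# Immediate style: every "set state" is validated again on every frame.
immediate_out = []
for _ in range(FRAMES):
    draw(validate(STATE_A), immediate_out)
    draw(validate(STATE_B), immediate_out)
immediate_cost = counter["validations"]

# Recorded style: validate once (cold code), replay from memory (hot code).
counter["validations"] = 0
recorded_a = validate(STATE_A)
recorded_b = validate(STATE_B)
recorded_out = []
for _ in range(FRAMES):
    draw(recorded_a, recorded_out)
    draw(recorded_b, recorded_out)
recorded_cost = counter["validations"]

print(immediate_cost, recorded_cost)
```

Both loops produce the same command stream, but the recorded version pays the validation cost once per state instead of once per draw, at the price of keeping `recorded_a` and `recorded_b` in memory.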

Quote:This is why you see much better driver and game performance on GCN GPUs in DX12 and eventually Vulkan, because they can fully exploit the uarch and card's potential.
It just proves that AMD's drivers suck, and that game developers did a better job than AMD.

Quote: So I guess under older OpenGL you'll just have to deal with it, preferably until somebody does a Vulkan version. No real reason to stick with crappy old OpenGL and DX11 when Vulkan drivers are a real thing now and are better than the preceding APIs in every possible way.
I feel sad for AMD buyers. Basically, they bought a GPU a couple of months/years ago, and they still need to wait a couple of years to use it.
And DX11/GL is much easier to use than Vulkan. It works on old cards. And if it is done well, the performance can be very close.
The biggest gain of Vulkan is reducing the dependency on crappy drivers.
#23
I wouldn't say that the AMD driver "sucks", but it's an odd move from them. It makes sense but it's ironic. It's not even a big deal in most cases in actual PC games, and their cards still perform well enough under the DX renderers and apparently still do okay in the OpenGL renderer when not using super high accuracy. Like I said, they deliberately made those design choices focusing on newer APIs, and recent game benchmarks clearly show it paid off for them. The ironic part I mentioned is the fact that AMD's approach to the DX11 driver requires high-IPC processors, which they themselves don't currently make (or at least don't have on the market yet).

Quote:I feel sad for AMD buyers. Basically, they bought a GPU a couple of months/years ago, and they still need to wait a couple of years to use it.
And DX11/GL is much easier to use than Vulkan. It works on old cards. And if it is done well, the performance can be very close.
The biggest gain of Vulkan is reducing the dependency on crappy drivers.

Sorry but this is nonsense. This whole "AMD crappy driver hurrr" meme needs to end. Go look at benches for recent games where AMD cards are trashing Nvidia cards in both DX11 and DX12. Crappy drivers? Waiting years to use them? That's absurd. Even older GPUs like Tahiti are walking all over Nvidia cards that used to win all the time in DX11. GCN has simply aged much better than Kepler did and now it's beating Maxwell.

Just because you as a software dev can't extract what you want from AMD's drivers with PCSX2 right now doesn't mean the situation is bleak universally across the software ecosystem. DX12 has been taking a dump on DX11 and OpenGL in PC games these past few months and it'll only get better when devs become more familiar with the API and do crazier things with the code. Vulkan will be the same. Whether something is easier to use or not doesn't matter when you can get so many benefits from the newer APIs.

I'm talking mostly about conventional gaming btw, honestly couldn't care less about emus at this point in that regard lol.
#24
Quote:Sorry but this is nonsense. This whole "AMD crappy driver hurrr" meme needs to end. Go look at benches for recent games where AMD cards are trashing Nvidia cards in both DX11 and DX12. Crappy drivers? Waiting years to use them? That's absurd. Even older GPUs like Tahiti are walking all over Nvidia cards that used to win all the time in DX11. GCN has simply aged much better than Kepler did and now it's beating Maxwell.
http://www.hardware.fr/focus/117/ashes-o...vidia.html
Sorry, it's in French. So in full HD, heavy batch, the Fury X is slower than a GTX960....

I'm not saying that AMD didn't work on their driver, but that a lot of work remains to be done. Even AMD says so, seriously, even them. May I quote you:
Quote:Yeah, Robert Hallock of AMD talked about this on Twitter during a recent AMA, I believe. AMD could've implemented "multithreaded optimizations" if they wanted to, but they opted to focus on Mantle instead (and its forks, DX12 and Vulkan).

So the "bad driver story" will end when AMD delivers a nice, performant driver ;) Honestly, they did a fairly good job; it was a nightmare several years ago. Now it is usable.

Quote:Whether something is easier to use or not doesn't matter when you can get so many benefits from the newer APIs.
Why don't devs use ASM instead of C/C++? Why don't devs use C/C++ instead of Java/Python/C#...? I'm sure ASM is 10 times faster than C#. DX12/Vulkan is around 10-50% better... (based on Nvidia results). Why isn't Vulkan/DX12 bare metal (lower level)? It would be even faster.

Quote:Just because you as a software dev can't extract what you want from AMD's drivers with PCSX2 right now doesn't mean the situation is bleak universally across the software ecosystem. DX12 has been taking a dump on DX11 and OpenGL in PC games these past few months and it'll only get better when devs become more familiar with the API and do crazier things with the code. Vulkan will be the same. Whether something is easier to use or not doesn't matter when you can get so many benefits from the newer APIs.

I'm talking mostly about conventional gaming btw, honestly couldn't care less about emus at this point in that regard lol.
I think you can understand that, as an emulator developer on an emulator forum, I only talk about emus ;) In the meantime, how many games were released with Mantle support? DX12 support? Vulkan support?
#25
(03-14-2016, 12:30 PM)gregory Wrote: http://www.hardware.fr/focus/117/ashes-o...vidia.html
Sorry, it's in French. So in full HD, heavy batch, the Fury X is slower than a GTX960....

The benchmarks on the page you linked have the Fury X delivering 249% of the performance of the 960. On that page the Fury X is outperforming the 980 Ti across the board in DX12.

[Image: capture72rba.png]
#26
I don't know what you read. Here are the values of the first bench in DX11 full HD. Of course, DX12 is much faster on AMD, whereas it doesn't bring any real performance improvement on Nvidia. The only sane conclusion is that the AMD DX11 driver is bad.

Fury X
normal: 71
medium: 46.4
heavy: 25.2

GTX960 (edit: sorry, mistyped it)
normal: 37.4
medium: 31.5
heavy: 28.5
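
To make the comparison explicit, here is a short Python sketch that turns the figures just listed into ratios. It assumes higher is better (for example frames per second), which the post does not state; the variable names are made up for illustration.

```python
# Values copied from the post above (DX11, full HD).
# Assumption: higher is better (e.g. frames per second).
fury_x = {"normal": 71.0, "medium": 46.4, "heavy": 25.2}
gtx_960 = {"normal": 37.4, "medium": 31.5, "heavy": 28.5}

# Ratio above 1.0 means the Fury X is ahead for that batch setting.
ratios = {batch: fury_x[batch] / gtx_960[batch] for batch in fury_x}

for batch, ratio in ratios.items():
    print(f"{batch}: Fury X / GTX960 = {ratio:.2f}")
```

On these numbers the Fury X leads by about 1.9x in the normal batch but falls slightly behind (about 0.88x) in the heavy batch, which is the case being pointed at.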
#27
(03-14-2016, 12:30 PM)gregory Wrote: http://www.hardware.fr/focus/117/ashes-o...vidia.html
Sorry, it's in French. So in full HD, heavy batch, the Fury X is slower than a GTX960....

Nowhere in that heavy batch benchmark were the 960 and Fury X even compared. Hell, in DX12, the Fury X beats the 980 Ti.
[Image: gmYzFII.png]
[Image: dvedn3-5.png]
#28
(03-14-2016, 03:10 PM)gregory Wrote: I don't know what you read. Here are the values of the first bench in DX11 full HD. Of course, DX12 is much faster on AMD, whereas it doesn't bring any real performance improvement on Nvidia. The only sane conclusion is that the AMD DX11 driver is bad.

Fury X
normal: 71
medium: 46.4
heavy: 25.2

GTX760
normal: 37.4
medium: 31.5
heavy: 28.5

We have to be looking at 2 completely different pages here. The 760 isn't listed, and the 960 is only compared to the Fury X in one benchmark, in which the Fury X beats it in every scenario except for DX11 Heavy (but wins in Extreme, Normal, and Medium by an absolutely massive margin).
#29
Quote:I think you can understand that, as an emulator developer on an emulator forum, I only talk about emus ;) In the meantime, how many games were released with Mantle support? DX12 support? Vulkan support?

Ah, emus. RPCS3 and Dolphin, and even Citra, are messing around with DX12 and Vulkan backends already. DX12 shows nice gains in Dolphin. Yeah yeah, different emus, but I doubt it would be a waste for PCSX2 to get Vulkan some day.

Anyway, what about the number of Mantle games? Mantle was literally a proof-of-concept to show everybody that this needed to be done. Vulkan and DX12 are literally just forks of Mantle. DX12 titles are rapidly releasing already and there will obviously be many more this coming year. Vulkan support in drivers just got released and AMD already has them in their 16.3 driver, NV either already has them or will have them soon (haven't checked.) Don't be surprised to see Vulkan games later this year and I won't be shocked if it supersedes DX12 altogether for the most part (which it should.)
#30
I mean this benchmark    

Of course, when the scene requires more GPU power, the Fury X is nice. But the point was to demonstrate that the DX11/OpenGL AMD drivers are bad (aka CPU limited).

"DX12/Vulkan are faster" isn't a valid answer for me. It just confirms that DX11/OpenGL are that bad. It confirms that AMD users will be able to use Dolphin at full speed now, whereas they could have run it at full speed 3-5 years ago (dunno when Dolphin implemented the DX11/GL API). Note: it also confirms that most games are GPU limited and that a new API wasn't that helpful.

Yes, DX12 and Vulkan are the future. But we don't have a tsunami of DX12 games. So the poor guy that paid $$$ for a Fury X still needs to wait until the end of the year (best case) to use it at its full potential.

From the above benchmark, you can infer that (OpenGL) PCSX2 potentially runs faster on a GTX960 than on the Fury X.
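
One way to picture the "CPU limited" point is a very rough frame-time model: if driver submission and GPU work overlap, the slower of the two sets the frame rate. The Python sketch below assumes exactly that. The function name and every number in it are hypothetical, chosen only to show the crossover, and are not measurements from the linked benchmark.

```python
def fps(cpu_us_per_draw, gpu_us_per_frame, draw_calls):
    """Frame rate when CPU submission and GPU work overlap: slower side wins.

    Simplification: GPU time per frame is treated as independent of the
    number of draw calls.
    """
    cpu_us = cpu_us_per_draw * draw_calls
    frame_us = max(cpu_us, gpu_us_per_frame)
    return 1_000_000 / frame_us


# Hypothetical cards: a fast GPU behind a costly driver, and a slower GPU
# behind a cheaper driver.
fast_gpu_slow_driver = {"cpu_us_per_draw": 4, "gpu_us_per_frame": 10_000}
slow_gpu_fast_driver = {"cpu_us_per_draw": 2, "gpu_us_per_frame": 25_000}

for draws in (1_000, 10_000):
    a = fps(draw_calls=draws, **fast_gpu_slow_driver)
    b = fps(draw_calls=draws, **slow_gpu_fast_driver)
    print(f"{draws} draws: fast GPU {a:.0f} fps, slow GPU {b:.0f} fps")
```

With few draw calls the fast GPU wins easily; once the draw-call count is high enough that the CPU side dominates, the card with the cheaper driver path comes out ahead even though its GPU is slower.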

Unfortunately, like AMD, the project doesn't have spare resources. So you won't see a Vulkan port this year (and the only purpose of a Vulkan port would be to support AMD GPUs).



