impossible blend => please read this thread
#21
Is conversion going to be more expensive than permanent, though? Or can we do that on a situational basis?
#22
With GL stuff you never know, but I think conversion will be the fastest. In all cases I need to run a 2nd pass to clamp the result, and I can confirm that I can merge 3a/3b. So it would only cost the CHAR -> HF conversion. Conversion must be faster than a draw call because it is only a blit.
#23
The half-float render target will help with color clipping. You could create an option with two states: one for the converted RT and another for a direct half-float render target without any conversion. Will the half-float render target have a big impact on performance, or is it minimal?
We're supposed to be working as a team, if we aren't helping and suggesting things to each other, we aren't working as a team.
- Refraction
#24
For performance, it depends on a lot of factors. Is the GPU capable of processing floating-point values at the same speed as fixed-point values in the blending unit? Potentially they use the same silicon, but I don't know.

However, for the bandwidth:
For each fragment you have
1/ 0/1/4 texel reads (no texture / nearest / linear) => 1 texel is 32 bits.
2/ 0/1/2 depth accesses (no depth / depth write only / depth test + write enabled) => each access is 32 bits.
3/ 1/2 color accesses (no blend / blend) => each access is 32 bits. It would be 64 bits with HF (+32 bits).

I'll let you compute the increase in bandwidth requirements ;)

Yes, I know about color clipping. However, in all cases I need to do a 2nd pass to clamp/wrap the HF values.

The mix of those 2 features (col clip + impossible blend) will be a nightmare ;)
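
To make the clamp/wrap step concrete, here is a minimal sketch of what such a 2nd pass has to do to a single channel value. It assumes 8-bit output channels, that "clamp" means saturating to 0..255, and that "wrap" means keeping only the low 8 bits. The function names are hypothetical, and real code would do this in a shader rather than on the CPU.

```python
import math


def clamp_u8(value: float) -> int:
    """Saturate an out-of-range blend result to the 0..255 range."""
    return max(0, min(255, math.floor(value)))


def wrap_u8(value: float) -> int:
    """Keep only the low 8 bits of the blend result (assumed wrap behavior)."""
    return math.floor(value) & 0xFF


# A half-float target can hold results outside 0..255 after blending.
for blended in (300.0, -3.0, 128.0):
    print(blended, "clamp:", clamp_u8(blended), "wrap:", wrap_u8(blended))
```

The point is only that the intermediate value must be allowed to leave the 0..255 range before this step, which is what the HF target buys you.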
#25
I have faith in you ;)
#26
(05-04-2015, 02:32 PM)gregory Wrote: For performance, it depends on a lot of factors. Is the GPU capable of processing floating-point values at the same speed as fixed-point values in the blending unit? Potentially they use the same silicon, but I don't know.

However, for the bandwidth:
For each fragment you have
1/ 0/1/4 texel reads (no texture / nearest / linear) => 1 texel is 32 bits.
2/ 0/1/2 depth accesses (no depth / depth write only / depth test + write enabled) => each access is 32 bits.
3/ 1/2 color accesses (no blend / blend) => each access is 32 bits. It would be 64 bits with HF (+32 bits).

I'll let you compute the increase in bandwidth requirements ;)

I see, thanks for the information. Honestly, I don't know how to compute that stuff accurately. :P

Quote: Yes, I know about color clipping. However, in all cases I need to do a 2nd pass to clamp/wrap the HF values.

The mix of those 2 features (col clip + impossible blend) will be a nightmare ;)

I see, then there is really no need for half-float render targets, and the conversion to char will be the right choice for performance. What do you think about the FS blending method?
#27
Let's take 2 texels and 1.5 depth accesses => 112 bits. Now let's add the color stuff.
Before, with CHAR:
No blending: 144 bits
With blending: 176 bits
After, with HF:
No blending: 176 bits (22% increase)
With blending: 240 bits (36% increase)
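
For anyone who wants to redo the sums, here is a small sketch that reproduces the figures above. It assumes 32 bits per texel read and per depth access, 32 bits per color access with CHAR and 64 bits with HF, and an average of 2 texel reads and 1.5 depth accesses per fragment. The helper name is made up for illustration.

```python
BITS_PER_ACCESS = 32  # one texel read or one depth access


def bits_per_fragment(texel_reads, depth_accesses, color_accesses, color_bits):
    """Total memory traffic for one fragment, in bits."""
    fixed = (texel_reads + depth_accesses) * BITS_PER_ACCESS
    return fixed + color_accesses * color_bits


def increase_percent(before, after):
    """Relative increase from before to after, rounded to a whole percent."""
    return round((after - before) / before * 100)


# color_accesses: 1 without blending (write only), 2 with blending (read + write)
for label, color_accesses in (("No blending", 1), ("With blending", 2)):
    char_bits = bits_per_fragment(2, 1.5, color_accesses, color_bits=32)
    hf_bits = bits_per_fragment(2, 1.5, color_accesses, color_bits=64)
    print(label, char_bits, "->", hf_bits,
          f"({increase_percent(char_bits, hf_bits)}% increase)")
```

Changing the texel or depth averages shifts the percentages, so treat the 22% and 36% as one plausible workload rather than a fixed cost.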

FS blending method: in the normal case you use a dedicated hardware unit, so blending is nearly (completely?) free. In this case you do it in the FS, so it will cost more ALU. But I think we have plenty of ALU to spare (well, maybe not for a big upscale).
But in the first method the copy will be massive anyway: 640x480 @ 4 B/color with a 6x upscale gives you => 42 MB per copy (actually x2 to convert it back). If you push the math:
1 special effect / frame => 2 copies => 84 MB => at 60 fps => 5 GB/s of bandwidth (ouch, ouch)!!!
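
The copy cost can be checked the same way. This sketch assumes the 6x upscale applies to both width and height (that is what gives roughly 42 MB), one special effect per frame, and two copies per effect. The names are illustrative only.

```python
def copy_bytes(width, height, bytes_per_pixel, upscale):
    """Size of one full render-target copy at the upscaled resolution."""
    return (width * upscale) * (height * upscale) * bytes_per_pixel


one_copy = copy_bytes(640, 480, 4, 6)   # CHAR target, 4 bytes per pixel
per_frame = 2 * one_copy                # convert to HF, then convert back
per_second = per_frame * 60             # one special effect per frame at 60 fps

MIB = 2 ** 20
GIB = 2 ** 30
print(f"one copy:   {one_copy / MIB:.1f} MiB")
print(f"per frame:  {per_frame / MIB:.1f} MiB")
print(f"per second: {per_second / GIB:.2f} GiB/s")
```

The figures scale with the number of special effects per frame, so a scene that triggers the effect several times per frame would multiply the total accordingly.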

Potentially the 2nd method could be faster. I don't know what it costs to split a draw call.
#28
Could you try creating an option where either method can be used?

Like having a settings option for either the FS blending method or the conversion method. That would allow users to select the option that suits their hardware. Is that possible?
#29
We have to be cautious of how much clutter we have on the interface; sometimes it's best just to pick a route and deal with it. If you put an option in for every change with multiple routes you get a massive interface of tick boxes or dropdowns and it will be very un-user friendly.
#30
(05-05-2015, 06:15 PM)refraction Wrote: If you put an option in for every change with multiple routes you get a massive interface of tick boxes or dropdowns and it will be very un-user friendly.

Oh, you mean kinda like we have now :P