January - February 2016 progress report

Progress report 1-2 2016

Hello everyone and welcome to another glorious PCSX2 progress report! This time we've got a lot of GSdx improvements to share with you. We also have an announcement regarding the progress reports.

Those of you who have been following them probably know that we've been a little late with them pretty much every time. The reason is a lack of manpower: for the most part, the same people who work on the emulator itself also write the progress reports, and they would often rather spend their free time improving the emulator than writing about the improvements. With this in mind, we have decided that from now on our progress reports will be released every three months.

This will enable us to release higher quality progress reports as well as dedicate more time to development.
If you wish to follow development more closely, you can always check out what's going on over at GitHub.

Now with that out of the way, let's get down to it!

[Enhancement]   GSDX-OpenGL: Fast accurate blending by Gregory and Turtleli

As most of you know, Blending Unit Accuracy is an option where a SW blending unit is emulated through the fragment shader with the intent of reproducing the blending unit output of the GS. This option fixes blending issues in a lot of games - Valkyrie Profile 2, Rule of Rose and Bakugan Battle Brawlers are examples. One of the major disadvantages of this from a user perspective is that increasing the accuracy increases the demand on the system. In a worst case scenario this can drop performance well below an acceptable level.

Poor performance was observed in some games with accurate blending for two reasons:

  • Primitives were drawn one by one whenever any primitive overlap was detected.
  • The GPU texture cache had to be flushed every time a target was read.

Turtleli fixed the first issue by reworking the code so that non-overlapping consecutive sprite primitives are drawn together when sprite overlap is detected, which can greatly reduce the number of barriers used and therefore improve rendering speed when accurate blending is enabled.
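
To illustrate the batching idea, here is a minimal Python sketch. It is not the GSdx code: the rectangle representation, `overlaps` and `batch_sprites` are hypothetical names made up for this example. It greedily groups consecutive sprites that do not overlap each other, so a barrier is only needed between groups rather than between every pair of sprites.

```python
from typing import List, Tuple

# Hypothetical sprite representation: (x0, y0, x1, y1), right/bottom edge exclusive.
Rect = Tuple[int, int, int, int]


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True if two axis-aligned rectangles share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def batch_sprites(sprites: List[Rect]) -> List[List[Rect]]:
    """Greedily group consecutive sprites that do not overlap each other.

    A new batch (and therefore a barrier) starts only when the next sprite
    overlaps a sprite that is already in the current batch.
    """
    batches: List[List[Rect]] = []
    current: List[Rect] = []
    for sprite in sprites:
        if any(overlaps(sprite, other) for other in current):
            batches.append(current)
            current = []
        current.append(sprite)
    if current:
        batches.append(current)
    return batches


# Three sprites side by side, then one that overlaps the first.
sprites = [(0, 0, 16, 16), (16, 0, 32, 16), (32, 0, 48, 16), (8, 8, 24, 24)]
batches = batch_sprites(sprites)
print(len(batches))  # 2 batches instead of 4 separate draws
```

In this toy case the four sprites need a single barrier (between the two batches) instead of one after every sprite.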

Gregory has tackled the second issue by not flushing the texture cache between primitives when we are emulating the frame buffer mask on an alpha channel. Normally, if you don't flush the cache you don't know what you're reading: it could be an old cached value, a value written to the render target, or even something undefined. Fortunately the "undefined" case is rather unlikely, as the read and write paths are separated for performance and memory is designed to handle a read and a write at the same time and at the same address.

In the end, it all comes down to the case of "cached value" vs "current value". In order to generate the effect, the fragment shader needs to read the masked bits and then merge them with the new alpha value. Since the alpha channel isn't blended (due to GS limitations), the masked bits in the target are constant. This also means that those bits are the same in the cache (which contains the value of the render target at some arbitrary time). Because of this, we don't care which data is read, cached or not: the required bits will be the same. This provides a significant increase in speed in games like Xenosaga.
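
The argument can be checked with a little bit arithmetic. The Python sketch below is only an illustration, not the actual shader: `merge_alpha`, the 8-bit alpha width and the mask value are assumptions chosen for the example. It shows that a stale cached value and the current target value give the same result, because the masked bits never change.

```python
def merge_alpha(read_alpha: int, new_alpha: int, fb_mask: int) -> int:
    """Keep the masked bits from the value read back; take the rest from new_alpha."""
    return (read_alpha & fb_mask) | (new_alpha & ~fb_mask & 0xFF)


FB_MASK = 0x80  # hypothetical mask: the top alpha bit is protected from writes

target_alpha = 0x92            # alpha stored in the render target
cached_alpha = target_alpha    # copy sitting in the texture cache

# A draw updates the target; the cache is not flushed and becomes stale.
current_alpha = merge_alpha(target_alpha, 0x55, FB_MASK)

# The next draw may read either value. The masked bits are identical in both,
# so the merged result is identical too.
from_cache = merge_alpha(cached_alpha, 0x2A, FB_MASK)
from_target = merge_alpha(current_alpha, 0x2A, FB_MASK)
print(hex(from_cache), hex(from_target))  # 0xaa 0xaa
```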

Note: The former slow method of accurate blending can still be used by enabling "Safe Accurate blending" under HW hacks (useful for checking for regressions caused by the current fast blending).

Speed impact of Fast accurate blending
Game | Blending unit accuracy [disabled] | Blending unit accuracy [medium] (before) | Blending unit accuracy [medium] (now)
Xenosaga | 172 | 36 | 153
Zone of the Enders | 172 | 27 | 156

[Enhancement]   GSDX-OpenGL: Depth buffer lookup optimization by Gregory

The Graphics Synthesizer (GS) allows the depth buffer and the color buffer to overlap. To support this on a standard PC GPU, the HW depth option was created to convert between the depth and color formats, which resolves many issues related to depth textures. Unfortunately, it is quite costly in terms of performance.

After some observation, it was found that in several cases the depth buffer is not used at all: it is neither read nor written. GSdx was updated to skip depth buffer handling when the game doesn't use it, which removes most of the overhead of the HW depth option in games like Tekken 5.
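
As a rough sketch of that check (the `DrawState` fields and `depth_buffer_needed` are hypothetical names, not the GSdx ones), the decision boils down to asking whether a draw either reads or writes depth:

```python
from dataclasses import dataclass


@dataclass
class DrawState:
    # Hypothetical per-draw flags, assumed to be derived from the GS registers.
    depth_test_reads_buffer: bool  # the depth test compares against stored depth
    depth_write_enabled: bool      # the draw writes new depth values


def depth_buffer_needed(state: DrawState) -> bool:
    """The depth buffer only has to be handled if it is read or written."""
    return state.depth_test_reads_buffer or state.depth_write_enabled


draws = [
    DrawState(depth_test_reads_buffer=False, depth_write_enabled=False),
    DrawState(depth_test_reads_buffer=True, depth_write_enabled=False),
    DrawState(depth_test_reads_buffer=False, depth_write_enabled=True),
]
skippable = [d for d in draws if not depth_buffer_needed(d)]
print(len(skippable))  # 1
```

Only the first draw in this made-up list can skip the depth lookup and its conversion cost.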

Speed impact of depth buffer lookup optimization
Game | HW depth [disabled] | HW depth [enabled] (before) | HW depth [enabled] (now)
Tekken 5 | 215 | 32 | 205

[Enhancement]   GSDX-FX: Post-Processing updates by Asmodean

Asmodean has updated his effects suite with improved cel edges in the cel shader, a new debanding shader, and Timothy Lottes' CRT emulation shader.

Here are some comparison screenshots showing the effect of the CRT emulation shader:

[Image: gundam] [Image: gundam]
[Image: dmc3] [Image: dmc3]

[Bug-Fix]   GSDX: Proper handling of 576P/720P/1080I video modes by ssakash

1080P (before)
video mode test
1080P (now)
video mode test fixed
Note: This screenshot is of an ELF file called "Video mode test", which was designed to observe how the Display/SMODE register values change in each video mode.

Gitaroo Man [PAL] (before)
gitaroo man broken

Gitaroo Man [PAL] (now)
gitaroo man broken
Note: The NTSC version of Gitaroo Man didn't suffer from this issue since it uses a height of 448, whereas the PAL version uses 576.

The physical display dimensions (CRTC size) of the video modes can be found using the values of the following display register fields:

  • DW - Display Width
  • DH - Display Height
  • MAGH - Horizontal Magnification factor
  • MAGV - Vertical Magnification factor

The width and height are calculated with the formulas (DW + 1) / (MAGH + 1) and (DH + 1) / (MAGV + 1) respectively. Different video modes use different magnification factors, and the factors also depend on the game.
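
As a worked example of these formulas, here is a small Python sketch. The function name `crtc_size`, the use of integer division and the register values are assumptions for illustration; the values were picked so that the height comes out at the 448 lines mentioned above for NTSC.

```python
from typing import Tuple


def crtc_size(dw: int, dh: int, magh: int, magv: int) -> Tuple[int, int]:
    """Compute the CRTC width and height from the display register fields."""
    width = (dw + 1) // (magh + 1)
    height = (dh + 1) // (magv + 1)
    return width, height


# Illustrative values only, not read from a real game.
print(crtc_size(dw=2559, dh=447, magh=3, magv=0))  # (640, 448)
```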

Previously there was a saturation limit on the display height (DH), as it's generally not higher than 640 for the NTSC/PAL video modes. However, that isn't the case for the other VESA/DTV video modes (576P,720P...1080I), since they use a bigger display height (DH) and scale it according to the magnification factor. To solve this, the saturation limit is ignored for video modes other than NTSC/PAL, as those can produce a higher DH value. The check was ultimately placed on the CRTC height instead of DH to avoid any possible conflicts with the vertical magnification factor scaling.

Another part of the fix was to prevent a bad division of the height when the INT and FFMD register values are set. Previously the code performed the division twice, since we were receiving the height of the device size instead of the display rectangle. We don't need to divide again, as the device size has already been halved in height when the respective register values are set.
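
The two parts of the fix can be sketched together as follows. This is a simplified illustration under stated assumptions, not the GSdx code: `display_height`, the mode strings, the `int_and_ffmd_set` flag and the register values are all hypothetical. The saturation limit is applied to the CRTC height and only for NTSC/PAL, and the height is halved exactly once.

```python
SATURATION_LIMIT = 640  # limit mentioned above for the NTSC/PAL video modes


def display_height(dh: int, magv: int, mode: str, int_and_ffmd_set: bool) -> int:
    """Return the displayed height for a video mode (hypothetical helper)."""
    height = (dh + 1) // (magv + 1)  # CRTC height, after magnification scaling

    # The limit is checked on the CRTC height and skipped for non-NTSC/PAL
    # modes, which can legitimately be taller.
    if mode in ("NTSC", "PAL"):
        height = min(height, SATURATION_LIMIT)

    # Halve exactly once when INT and FFMD are set; callers must not halve again.
    if int_and_ffmd_set:
        height //= 2
    return height


# Illustrative register values only.
print(display_height(dh=1079, magv=0, mode="1080I", int_and_ffmd_set=False))
print(display_height(dh=1079, magv=0, mode="NTSC", int_and_ffmd_set=False))
```

With these made-up inputs the 1080I case keeps its full height, while the same DH under NTSC is clamped to the limit.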

[Bug-Fix]   PCSX2: GUI Improvements by ssakash, Turtleli and NZJenkins

  • Console Window: Improved size handling and better autodock behavior when maximized.
  • Console Window: Enable/disable/restore defaults now affect all menu items.
  • Debugger: Proper deletion of a breakpoint when multiple breakpoints are present.
  • Debugger: Prevent unexpected resuming when stepping into/over disabled breakpoints.
  • Debugger: Step out works without any preceding step in / step over.
  • GSFrame: Resolution is displayed in the title bar with respect to the upscaling value.
  • VideoPanel: Better gray out of dialog elements.

[Bug-Fix]   PCSX2-Auto test suite by Gregory, Rama and Refraction

PS2 Autotests is a repository created by Unknown brackets.

PS2 Autotests consists of multiple small programs for the PS2 with each one being a different test. It tests certain things and then simply writes out a log of what it did and what happened when it did those things. These tests are run on real hardware and then we save that log.

After that an emulator can execute the same small program and get its own log. If what happened on the real hardware is different from what happened in the emulator then guess what? You've got a bug, or at least a potential problem. Even better, it's quite easy to reproduce.
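
The comparison step can be as simple as a text diff. The Python sketch below is only an illustration of the idea (the test suite's own tooling is not shown here); `compare_logs` and the log contents are made up for the example.

```python
import difflib
from typing import List


def compare_logs(hardware_log: str, emulator_log: str) -> List[str]:
    """Return a unified diff of the two logs; an empty list means they match."""
    return list(
        difflib.unified_diff(
            hardware_log.splitlines(),
            emulator_log.splitlines(),
            fromfile="hardware",
            tofile="emulator",
            lineterm="",
        )
    )


# Made-up log contents for illustration.
hardware = "test: add\nresult: 3\nflags: 0"
emulator = "test: add\nresult: 4\nflags: 0"

diff = compare_logs(hardware, emulator)
if diff:
    print("Potential bug, logs differ:")
    print("\n".join(diff))
else:
    print("Logs match")
```

Any line prefixed with `-` comes from the real hardware log and any line prefixed with `+` from the emulator, which points straight at the behavior that differs.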

Gregory has recently added a Perl script which makes it easier to run such tests and allows faster debugging for the developers. These tests have already helped our developers fix a couple of issues (a few of them are listed below for reference).

[Bug-Fix]   PCSX2-Vector Units: Scarface I bit hack by Refraction

Scarface - The World is Yours has a particular way of handling its vector programs which is a pain for recompilers. Most games upload a program with specific instructions to process vectors in a set order with a common outcome. This game does things a little differently: instead of having a vector program for certain processes and then just changing the input data, Scarface constantly changes what is known as the I bit in its VU program.

The issue here is that the recompiler creates a hexadecimal hash of the entire program. One little change to the program (in this case, an I bit value in place of an operation code) changes the hash completely, telling the recompiler that this is a new program and not the one it had previously recompiled. So it recompiles the program over and over again until it runs out of memory.

To work around this issue, a new hack was added which checks the status of the I bit value and avoids constant recompilation when only the I bit value has changed at the specific address. Maybe in the future we can deal with this in an accurate way, but it's not an easy thing to check without compromising on speed.
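
To make the idea concrete, here is a simplified Python sketch. It is not the recompiler's code: `program_hash`, the use of SHA-1, the instruction values, and the assumption that the I bit is bit 31 of the upper instruction word (with the value stored in the lower word) are all choices made for illustration. Ignoring the I bit value while hashing lets two programs that differ only in that value map to the same cached program.

```python
import hashlib
import struct
from typing import List, Tuple

I_BIT = 1 << 31  # assumed position of the I bit in the upper instruction word

# A program is modelled as a list of (upper, lower) 32-bit word pairs.
Program = List[Tuple[int, int]]


def program_hash(program: Program, ignore_i_bit_values: bool) -> str:
    """Hash a program, optionally ignoring the values carried by I bit slots."""
    digest = hashlib.sha1()
    for upper, lower in program:
        if ignore_i_bit_values and upper & I_BIT:
            lower = 0  # treat every I bit value as identical
        digest.update(struct.pack("<II", upper, lower))
    return digest.hexdigest()


# Two made-up programs that differ only in the value held by an I bit slot.
prog_a = [(0x000002FF, 0x8000033C), (I_BIT | 0x000002FF, 0x3F800000)]
prog_b = [(0x000002FF, 0x8000033C), (I_BIT | 0x000002FF, 0x40000000)]

print(program_hash(prog_a, False) == program_hash(prog_b, False))  # False
print(program_hash(prog_a, True) == program_hash(prog_b, True))    # True
```

The extra check on every instruction hints at why handling programs this way is slower than hashing the program as it is.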

This is an optional hack which can be found under the Gamefixes tab of the emulator as it is slower to handle things this way and is quite specific, but it was requested by quite a few users, so we obliged!

[New Feature]   LilyPad: Add Pop'n Music controller support by ekudritski

Native support for the Pop'n Music controller has been added to LilyPad. Several users had requested this feature since a few games rely on said controller and it's quite hard and troublesome to simulate the behavior of the controller using macros and third party programs.

That was it for these 2 months! The next report will come around June, but hopefully we will show signs of life before that ;) As always, check out our GitHub repository for daily updates to PCSX2.

[Image: newsig.jpg]

Cool, and GREAT WORK! Does that mean it can be played at least with some speed?
Where is the download to try it?
(03-02-2016, 03:13 AM)Solax Wrote: Cool, and GREAT WORK! Does that mean it can be played at least with some speed?
Where is the download to try it?

[Image: gmYzFII.png]
[Image: dvedn3-5.png]
(03-02-2016, 03:54 AM)Nobbs66 Wrote: buildbot.orphis.net

Thanks for the reply, but I see many of them and they are not really stable versions. Do they work faster and better than the last stable version? Sometimes I get low speed from it.
I am using ReShade. Does anyone know a LOTTES_CRT config to obtain similar results (if that is possible)?

[Image: D3NXgVS.jpg]
Quote:GSDX-OpenGL: Fast accurate blending
Note: The former slow method of accurate blending can still be used by enabling "Safe Accurate blending" under HW hacks (useful for checking for regressions caused by the current fast blending).
The information in this section makes it unclear if there are now 3 modes in the emulator or only 2 (Improved accurate blender, previous accurate blender and fast blender). Specifically because it says you test for regressions by comparing the former/older accurate blending mode against the fast mode instead of the older accurate mode which doesn't make sense!

It also kinda makes it seem like the improvements were made to the Fast Render instead of the Accurate Blender.
...what? Blending Unit Accuracy has 6 modes: none, basic, medium, high, full and ultra. The optimization is for all options, and you can disable it via the new option.
The optimization that you can disable with "safe accurate blending" only impacts the basic level of accurate blending. So you can choose between basic (fast) and basic (safe) mode.

But let's take it easy. I'm 99.999% sure that fast mode works as expected. I can't imagine how it could be wrong on the hardware. The safe mode is only a safeguard in case someone reports issues with accurate blending (basic). The safe mode will likely be removed in the future.
(03-02-2016, 04:02 AM)Solax Wrote: Thanks for the reply, but I see many of them and they are not really stable versions. Do they work faster and better than the last stable version? Sometimes I get low speed from it.

It isn't possible to provide a stable build every month. It is quite costly for us to provide a stable version.

Some games (Tekken 5/Zone of the Enders/Xenosaga) will be much faster in the latest git version if you enable accurate depth/blending. However, it isn't mandatory to enable those options, so you can keep the 1.4 version.
Wow, great ! Thanks, guys !

Where can I download that "video mode test" elf file ? I'd love to have Sanae in PCSX2 !