NVIDIA's VRWorks vs. AMD's LiquidVR
NVIDIA vs. AMD – the SDK war to minimize latency and deliver a premium VR experience
Please read Oculus Rift VR Benching, Part 1, as it gives a good introduction to VR. It details the difficulty of providing high-quality VR at 90 FPS, as the average VR game is about seven times more demanding than a PC game at 1920×1080. The following images from NVIDIA show this clearly.
In addition to rendering at a high frame rate and high resolution, the GPU also needs to maintain low latency between head motion and display updates. This low latency is important so that when you move your head, everything stays in sync with what your eyes see in the HMD. If the display updates too slowly, the user may experience serious discomfort. In a nutshell, keeping frame rates high and consistent and latency low is crucial to delivering a high-quality VR experience; otherwise the VR gamer might actually get ill.
VR research has shown that motion-to-photon latency should stay below 20 milliseconds for the experience to remain comfortable, which makes the GPU pipeline even more critical. Input has to first be processed and a new frame submitted by the CPU, then the image has to be rendered by the GPU, and finally scanned out to the display. Each of these steps adds latency, so new techniques were developed to reduce VR latency below typical PC gaming levels.
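As a rough illustration, the pipeline stages above can be summed into a motion-to-photon budget. The stage timings below are illustrative assumptions for the sketch, not measured values:

```python
# Illustrative motion-to-photon latency budget for a 90 Hz HMD.
# Stage timings are assumptions, not measurements.
PIPELINE_MS = {
    "sensor_read_and_cpu_submit": 3.0,   # input processed, frame submitted by CPU
    "gpu_render": 11.1,                  # roughly one 90 Hz frame time to render
    "scanout_to_display": 5.0,           # scanout plus pixel switching
}

total = sum(PIPELINE_MS.values())
print(f"motion-to-photon: {total:.1f} ms")   # 19.1 ms
print("within 20 ms budget:", total < 20.0)  # True
```

The point of the exercise: with rendering alone eating one full refresh interval, only a few milliseconds remain for everything else, which is why each stage has had dedicated latency-reduction techniques developed for it.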
Just as in the PC gaming space, AMD and NVIDIA are both heavily involved with VR, believing it to be the “next big thing” in gaming. Polaris was touted by AMD as an inexpensive way to bring VR to the masses, and both companies offer Software Development Kits (SDKs) to help game developers create the best VR experience for their games. NVIDIA has VRWorks and AMD’s SDK is LiquidVR; there are also SDKs for the Oculus Rift and for the HTC Vive which work together with both AMD and NVIDIA platforms.
To overcome the challenges of delivering VR smoothly, NVIDIA has created a VR graphics platform that increases performance, reduces latency, and provides a seamless out-of-box VR experience for GeForce users. This platform, named VRWorks, comprises NVIDIA GeForce GTX GPUs, GeForce Experience, and the VRWorks SDK.
We are going to test the latest Pascal-based GeForce GTX gaming GPUs (the GTX 1060, GTX 1070, GTX 1080, and GTX 1080 Ti), which are optimized to deliver the raw frame rates and high resolution required for demanding VR experiences. With full support for the DirectX 12 graphics API and Pascal’s Simultaneous Multi-Projection (SMP) architecture that enables new rendering techniques for VR, the GTX 1060 is, in our experience, a very good entry-level VR card that still provides an excellent experience with reduced in-game VR settings. At almost one-third faster, the GTX 1070 provides a higher level of detail than the GTX 1060, while the GTX 1080 can deliver a mostly maxed-out VR experience. For the ultimate VR experience, a GTX 1080 Ti or a TITAN XP usually provides enough performance headroom to increase pixel density further, providing an even more immersive experience.
GeForce Experience is the second part of the VRWorks platform; when installed, it automatically delivers the very latest drivers to the end user. It is crucial to use drivers that have been optimized for VR games, as we discovered with Star Trek: Bridge Crew.
Here is VRWorks for headset developers from NVIDIA’s website:
Direct Mode hides the display from the OS, preventing the desktop from extending onto the VR headset, although VR applications can still see the headset and render to it. Front buffer rendering enables direct access to the front buffer. Context Priority allows two priority levels for work – a normal priority that handles all the usual rendering, and a high-priority context that can be used for asynchronous timewarp (ATW). ATW is basically a synthetic frame reprojection “safety net”, similar to Oculus’ Asynchronous Spacewarp (ASW), for when the goal of rendering at or above 90 FPS cannot be met.
Here is VRWorks for app developers:
VR SLI isn’t particularly well-supported by developers yet, as only a few games have it available. In a follow-up to this evaluation, we will measure the performance of VR SLI with Serious Sam: The Last Hope, which supports it. VR has two views to render, and devoting one GPU to each eye can nearly double performance without the latency associated with AFR. Although the system generates a single command stream, affinity masking decides which states and commands apply to each GPU in VR SLI. For VR SLI to work, it needs to be part of the game engine, and it’s up to the developer to implement it; the Unity and Unreal engines both support it. Although duplicated work, including rendering shadow maps and physics, keeps the gain below a full doubling over one GPU, VR SLI can still give a major boost.
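A hedged sketch of the affinity-masking idea: one command stream is recorded, and a bitmask routes shared work to both GPUs while per-eye work goes to one GPU each. The mask values and function names below are illustrative, not the actual VR SLI API:

```python
# Conceptual sketch of VR SLI affinity masking (illustrative, not the real API).
GPU_LEFT, GPU_RIGHT = 0b01, 0b10
GPU_ALL = GPU_LEFT | GPU_RIGHT

commands = []

def set_affinity_mask(mask):
    commands.append(("mask", mask))

def draw(name):
    commands.append(("draw", name))

# Shared work (e.g. a shadow map) is broadcast to both GPUs...
set_affinity_mask(GPU_ALL)
draw("shadow_map")
# ...then per-eye view state is masked to one GPU each.
set_affinity_mask(GPU_LEFT)
draw("scene_left_eye")
set_affinity_mask(GPU_RIGHT)
draw("scene_right_eye")

# Replay the single stream and count draws executed per GPU.
per_gpu = {GPU_LEFT: 0, GPU_RIGHT: 0}
mask = GPU_ALL
for kind, arg in commands:
    if kind == "mask":
        mask = arg
    else:
        for gpu in per_gpu:
            if mask & gpu:
                per_gpu[gpu] += 1

print(per_gpu)  # each GPU executes 2 draws: the shadow map plus its own eye
```

Note how the shadow map is duplicated on both GPUs – exactly the kind of shared work the article mentions as preventing a perfect 2x scaling.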
NVIDIA also offers Multi-Resolution Shading (MRS), a feature of its multi-projection hardware that AMD doesn’t have. MRS builds on viewport multicast, the multi-projection acceleration that was also implemented in the PC version of Shadow Warrior 2, and it increases a game’s overall performance by rendering the outer edges of the screen at a lower percentage of the base screen resolution.
MRS keeps the center of the screen at full image quality, and it works particularly well with action PC games and with VR, as players are mostly focused on the center of the display. NVIDIA’s image below illustrates what MRS does when it is enabled at 60% of the in-game resolution. There are often a couple of MRS levels available as options – one more aggressive than the other, with the outer area rendered at about 40% of full resolution, which delivers even more performance.
Multi-Res Shading reduces rendering cost and improves performance without impacting perceived image quality by using Pascal’s (and Maxwell’s) hardware-based multi-projection feature. The screen is divided into multiple viewports, and the entire scene geometry is broadcast to each viewport simultaneously while extraneous geometry is culled. This is necessary because the lenses of an HMD such as the Oculus Rift distort the image, so the rendered image has to be warped to counteract the optical effects of the lenses. Instead of being square, the images appear curved and distorted until viewed through the appropriate lenses.
Current VR platforms use a two-step process that first renders a normal image (above left) and then uses a post-processing pass that warps the image to the view (above right, from NVIDIA’s example). NVIDIA considers this solution inefficient because there is oversampling at the edges and many rendered pixels are wastefully discarded. Their solution is to divide the viewport into nine regions, as below:
Each of the divided viewports is then warped so that the maximum sampling resolution needed within each portion of the image is closer to what is finally displayed. The center viewport is also warped but stays nearly the same, without overshading. Since fewer pixels are shaded, rendering is quicker, with the savings translating into a 1.3x to 2x pixel-shading speedup according to NVIDIA, depending on the MRS level setting.
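The shading savings can be sketched with simple arithmetic. The split proportions below are illustrative assumptions (a 60%-of-width/height center region at full rate), not NVIDIA's actual viewport sizes, which developers tune per title:

```python
# Rough pixel-shading cost of Multi-Res Shading relative to full-res rendering.
# Assumption: center region covers 60% of width and height at full shading rate;
# the outer bands are shaded at a reduced per-axis rate (e.g. 60% or 40%).
def mrs_relative_cost(center_frac=0.6, outer_rate=0.6):
    center_area = center_frac ** 2           # fraction of screen shaded fully
    outer_area = 1.0 - center_area           # fraction shaded at reduced rate
    # Outer viewports shade outer_rate^2 as many pixels per unit of area.
    return center_area + outer_area * outer_rate ** 2

conservative = mrs_relative_cost(outer_rate=0.6)  # milder setting
aggressive = mrs_relative_cost(outer_rate=0.4)    # stronger setting
print(f"conservative: {1 / conservative:.2f}x fewer shaded pixels")
print(f"aggressive:   {1 / aggressive:.2f}x fewer shaded pixels")
```

These upper-bound figures land in the same ballpark as NVIDIA's quoted 1.3x to 2x speedup; real gains are lower because shading is only part of total frame cost.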
Since MRS reduces the rendering cost of VR games, the in-game VR quality settings can be increased. Make sure to check out the performance testing that we carried out in Batman: Arkham VR, which compares performance with MRS, without MRS, and with “Fixed Foveated” rendering, the Warner Bros. VR developers’ own solution for improving performance without impacting image quality too negatively.
This same Simultaneous Multi-Projection architecture of NVIDIA’s Pascal-based GPUs that powers MRS also enables two major new techniques for tackling the unique performance challenges of VR: Lens Matched Shading and Single Pass Stereo.
Lens Matched Shading improves pixel-shading performance by matching rendering more closely to the requirements of VR display output, avoiding the rendering of many pixels that would otherwise be discarded before the image reaches the VR headset. Because VR displays have a lens between the viewer and the display which bends and distorts the image, the image has to be rendered with a special projection that inverts the lens distortion, so that the two distortions cancel each other out and the result looks natural to the viewer. Traditionally, producing a correct final image requires two steps: first, the GPU renders with a standard projection, generating more pixels than needed; second, for each pixel location in the output display surface, a pixel value is looked up from the first step’s rendered result and applied to the display surface.
With Lens Matched Shading, the SMP engine subdivides the display region into four quadrants, each applying its own projection plane. The parameters can be adjusted to approximate the shape of the lens distortion as closely as possible, with a significant reduction in shading rate that translates to a 50% increase in available pixel-shading throughput according to NVIDIA. Developers also have the option to use settings with a higher resolution in the center and undersampling at the outer edges, to maximize frame rate without degrading image quality significantly.
Single Pass Stereo increases geometry performance by allowing the HMD’s left and right displays to share a single geometry pass. Traditionally, VR applications have to draw geometry twice, once for each eye. Since Single Pass Stereo uses SMP to draw geometry only once and then simultaneously project both views of that geometry, it allows developers to nearly double the geometric complexity of VR games that use it.
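A minimal sketch of why this matters: traditional stereo processes each mesh's geometry once per eye, while Single Pass Stereo processes it once and lets the hardware project both views. The counting below is conceptual, not the actual API:

```python
# Conceptual comparison of geometry passes (illustrative, not NVIDIA's API).
meshes = ["terrain", "buildings", "characters", "props"]

def traditional_stereo(meshes):
    # Each mesh's geometry is processed once per eye.
    return [(mesh, eye) for eye in ("left", "right") for mesh in meshes]

def single_pass_stereo(meshes):
    # Geometry is processed once; SMP projects both eye views from that pass.
    return [(mesh, "both_eyes") for mesh in meshes]

print(len(traditional_stereo(meshes)))   # 8 geometry passes
print(len(single_pass_stereo(meshes)))   # 4 geometry passes
```

Halving the geometry passes is the headroom that lets developers roughly double scene complexity at the same geometry cost.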
There is much more to NVIDIA’s VRWorks, including a physics-based audio solution. Both Oculus and Vive are working closely with NVIDIA, Unreal Engine 4 and Unity 5 already have integrated VRWorks support, and quite a few of the latest VR games use these engines and VRWorks.
Just like NVIDIA’s SDK, LiquidVR aims to reduce latency and deliver a consistent frame rate. AMD’s VR technology delivers several key benefits that make this possible:
- TrueAudio Next: a scalable AMD technology that enables full real-time, dynamic, physics-based audio acoustics rendering. Leveraging the powerful resources of AMD GPU compute, it enables the truly immersive audio required to achieve full presence in VR.
- Asynchronous Shaders: provides a subset of the async compute functionality native to Direct3D 12 in Direct3D 11. Helps to increase performance and decrease latency.
- Affinity Multi-GPU: provides the ability to send Direct3D 11 API calls to one or more GPUs set via an affinity mask.
- Latest Data Latch: provides the ability to update data asynchronously from the CPU to reduce input or sensor latency.
- Direct-to-Display: bypasses the operating system and sends the result of VR rendering straight to the headset for lower latency and better compatibility. This LiquidVR functionality is exposed in a special SDK targeted at headset vendors, and is not application-accessible.
- GPU-to-GPU Resource Copies: provides the ability to copy resources between GPUs with explicit control over synchronization.
All of these technologies are supported via AMD’s GPUOpen initiative, and developers may modify the code as they wish. They accomplish essentially the same thing as NVIDIA’s VRWorks SDK (with the exception of multi-projection features like MRS): delivering frames smoothly and efficiently to the VR gamer. For AMD, the biggest difference is its Asynchronous Compute Engines (ACEs) and use of async shaders, which add flexibility to command scheduling by running compute and graphics work simultaneously.
AMD’s use of async shaders makes it easier and more efficient to perform certain VR tasks, including implementing ATW to reduce latency. VR needs to sustain a fixed frame rate target locked to 90 FPS, and if a PC can’t meet that target, the frame rate is halved to 45 FPS to avoid the judder that causes motion sickness. 90 Hz/90 FPS is the premium-experience standard for the Rift and the Vive. A game cannot exceed 90 FPS or the player will see tearing in the HMD and feel sick, and it cannot drop below or vary from a locked frame rate or the player will get VR sick. So it is crucial that frame rates are locked to either 45 FPS or 90 FPS.
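The frame-time budgets behind those two locked rates are simple to derive:

```python
# Frame-time budgets for the two locked VR frame rates on a 90 Hz HMD.
refresh_hz = 90
full_budget_ms = 1000 / refresh_hz        # ~11.1 ms per frame at 90 FPS
half_budget_ms = 1000 / (refresh_hz / 2)  # ~22.2 ms per frame at 45 FPS

print(f"90 FPS budget: {full_budget_ms:.1f} ms")
print(f"45 FPS budget: {half_budget_ms:.1f} ms")

# A frame that takes, say, 14 ms misses the 11.1 ms window, so the
# runtime drops to the 45 FPS lock and synthesizes in-between frames.
frame_time_ms = 14.0
print("meets 90 FPS:", frame_time_ms <= full_budget_ms)  # False
print("meets 45 FPS:", frame_time_ms <= half_budget_ms)  # True
```

This is why the fallback is exactly half rate rather than, say, 60 FPS: every rendered frame must land on a display refresh boundary, and every second refresh is the next boundary that fits.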
Although the next four images are from NVIDIA, they represent the way the VR pipeline works and what happens when timing goes wrong. The first image below represents an ideal VR pipeline for a premium experience, where no frames need to be synthesized at 90 FPS.
If a frame gets missed, this is what the pipeline looks like. Whenever a frame arrives too late to be displayed, a frame drop occurs and causes the game to stutter. An occasional drop is harmless, but if there are several, the user will notice, and if there are a lot of dropped frames, the VR experience will be ruined and the viewer may get VR sick.
When playing a game on the Oculus Rift, if you see performance locked at 45 FPS, you are no doubt running with Asynchronous Spacewarp (ASW). In the Oculus runtime, ASW performs motion prediction by inserting a synthetic frame every other frame. With ASW the cadence looks something like this:
Frame 0: Frame created by the GPU
Frame 1: Frame synthesized by ASW
Frame 2: Frame created by the GPU
Frame 3: Frame synthesized by ASW
Frame 4: Frame created by the GPU … and so on.
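The cadence above can be generated programmatically:

```python
# ASW cadence at a 45 FPS app rate displayed at 90 Hz:
# even display frames are GPU-rendered, odd frames are synthesized.
def asw_cadence(n_frames):
    return ["rendered" if i % 2 == 0 else "synthesized" for i in range(n_frames)]

cadence = asw_cadence(6)
print(cadence)
# ['rendered', 'synthesized', 'rendered', 'synthesized', 'rendered', 'synthesized']
```

The app only produces every other frame, yet the display still updates at the full 90 Hz, which is what keeps head tracking feeling smooth.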
ASW and ATW use a type of reprojection that takes the most recent head-sensor information and adjusts the old frame to match the current head position. It won’t improve the animation within a frame, which will still run at a lower rate with some judder, but reprojection provides a more stable visual experience that tracks better with the gamer’s head motion.
Even though the visual experience at 45 FPS is downgraded compared with 90 FPS, it is better to have ASW than not to have it. If the frame rate cannot be locked at 90 FPS and does not drop to a locked 45 FPS, frames will be dropped and the resulting judder will cause unease and/or VR sickness. ASW locks you to 45 FPS whenever your frame rate is anywhere between 45 and 90 FPS, trading the lower frame rate for smoother frame delivery.
If you want more details, take a look at the Oculus developer blog on ASW and Asynchronous Timewarp (ATW). These compromises work together to improve smoothness, but ASW comes at the cost of reduced image quality due to the synthesized, extrapolated frames, which are also sometimes called “reprojection”.
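At its core, rotational reprojection re-samples the last rendered frame using the newest head orientation. A toy one-dimensional version, assuming a small yaw change maps to a simple horizontal pixel shift (real ATW/ASW warp the full image in 3D):

```python
# Toy rotational reprojection: shift the previous frame's pixel columns by
# the yaw change since it was rendered. Illustrative only.
def reproject(frame, yaw_delta_deg, pixels_per_degree=10):
    shift = round(yaw_delta_deg * pixels_per_degree)
    width = len(frame)
    # Sample the old frame at the shifted position; edges revealed by the
    # head turn have no data, so they are left empty (None).
    return [frame[i + shift] if 0 <= i + shift < width else None
            for i in range(width)]

old_frame = list(range(8))                        # stand-in pixel columns 0..7
warped = reproject(old_frame, yaw_delta_deg=0.2)  # head turned 0.2 degrees right
print(warped)  # [2, 3, 4, 5, 6, 7, None, None]
```

The `None` columns at the edge illustrate why reprojection is only a safety net: it can re-aim old pixels at the new head pose, but it cannot invent image data the GPU never rendered.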
The worst thing that can happen is a warp miss, which occurs when the runtime fails to produce a new or a reprojected frame in time. In the following image from NVIDIA, an earlier warped frame is repeated by the GPU. The VR user will notice this repeated frame as an immersion-breaking stutter, and if there are many of them, may get ill.