TL;DR: Asynchronous timewarp (ATW) is a technique that generates intermediate frames in situations when the game can’t maintain frame rate, helping to reduce judder. However, ATW is not a silver bullet and has limitations that developers should be aware of.
Over the past year there’s been a lot of excitement around asynchronous timewarp (ATW). Many hoped that ATW would allow engines to run and render at a lower frame rate, using ATW to artificially fill in dropped frames without a significant drop in the VR quality.
On Gear VR Innovator Edition, ATW has been a key part of delivering a great experience. Unfortunately, it turns out there are intrinsic limitations and technical challenges that prevent ATW from being a universal solution for judder on PC VR systems with positional tracking like the Rift. Under some conditions, the perceptual effects of timewarp judder in VR can be almost as bad as judder due to skipped frames.
In this blog, we analyze these limitations and situations that cause particular difficulties. As you’ll see, while ATW may be helpful at times, there is no substitute for hitting the full frame rate when it comes to delivering great VR.
Timewarp, Asynchronous Timewarp, and Judder
Timewarp is a technique that warps the rendered image before sending it to the display in order to correct for head motion that occurred after the scene was rendered and thereby reduce the perceived latency. The basic version of this is orientation-only timewarp, in which only the rotational change in the head pose is corrected for; this has the considerable advantage of being a 2D warp, so it does not cost much performance when combined with the distortion pass. For reasonably complex scenes, this can be done with much less computation than rendering a whole new frame.
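To make the "2D warp" concrete, here is a minimal sketch of orientation-only reprojection for a simple pinhole camera. It is an illustration under stated assumptions, not the Rift runtime's implementation: the function names, the 500 px focal length, the 640x480 image, and the axis convention (x right, y down, z forward) are all hypothetical. The idea is that a pure head rotation R maps rendered pixels to their corrected positions through the homography H = K * R^T * K^-1, with no scene depth required.

```python
import math

def mat_mul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    """Transpose a 3x3 matrix (this is the inverse for a pure rotation)."""
    return [[m[j][i] for j in range(3)] for i in range(3)]

def head_yaw(theta_rad):
    """Head rotation that turns the view toward +x (x right, y down, z forward)."""
    c, s = math.cos(theta_rad), math.sin(theta_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def timewarp_homography(focal_px, cx, cy, head_rotation):
    """Return H = K * R^T * K^-1, mapping rendered pixels to warped pixels."""
    k = [[focal_px, 0.0, cx], [0.0, focal_px, cy], [0.0, 0.0, 1.0]]
    k_inv = [[1.0 / focal_px, 0.0, -cx / focal_px],
             [0.0, 1.0 / focal_px, -cy / focal_px],
             [0.0, 0.0, 1.0]]
    return mat_mul(k, mat_mul(transpose(head_rotation), k_inv))

def warp_pixel(h, x, y):
    """Apply the homography to one pixel, including the perspective divide."""
    wx = h[0][0] * x + h[0][1] * y + h[0][2]
    wy = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return wx / w, wy / w

# Hypothetical numbers: 1 degree of head yaw between render time and scan-out.
H = timewarp_homography(500.0, 320.0, 240.0, head_yaw(math.radians(1.0)))
new_x, new_y = warp_pixel(H, 320.0, 240.0)
print(new_x, new_y)  # the image centre shifts left by about 8.7 px; y stays at 240
```

In practice a warp like this would run per pixel on the GPU and be folded into the lens distortion pass, which is why it adds so little cost.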
Asynchronous timewarp refers to doing this on another thread in parallel (i.e. asynchronously) with rendering. Before every vsync, the ATW thread generates a new timewarped frame from the latest frame completed by the rendering thread.
Judder and its consequences are covered in detail by Michael Abrash in his 2013 blog posts. Reviewing Michael’s notes on judder would be helpful to get the most out of this post.
In order to produce a perceptually correct representation of the virtual world, the images on the displays must be updated with every vsync refresh. However, if rendering takes too long, a frame will be missed, resulting in judder. This is because when no new frame has been rendered, the video adapter scans out the same image a second time. Here is what an object location looks like if the same rendered frame is displayed two frames in a row before updating:
Here, the eye is rotating to the left. When the same image is displayed again, its light falls on a different part of the retina, resulting in double image judder.
Of course, doubling is not the only possible effect. If we displayed the same frame three times in a row, you would get a triple image on your retina and so on.
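As a rough back-of-the-envelope sketch, the separation between these multiple images can be estimated from how far the eye rotates relative to the display during one refresh. The 60 degrees per second eye speed and the function name below are assumptions chosen for illustration only.

```python
def repeated_frame_offsets_deg(eye_speed_deg_s, refresh_hz, times_shown):
    """Angular offset on the retina of each repeat of the same rendered frame.

    Assumes the eye rotates at a constant speed relative to the display and
    that the identical image is scanned out `times_shown` refreshes in a row.
    """
    frame_time_s = 1.0 / refresh_hz
    return [eye_speed_deg_s * frame_time_s * i for i in range(times_shown)]

# Hypothetical case: eye rotating at 60 deg/s, 90Hz display, frame shown 3 times.
offsets = repeated_frame_offsets_deg(60.0, 90.0, 3)
print([round(o, 2) for o in offsets])  # [0.0, 0.67, 1.33]
```

Under these assumptions each repeat lands about two thirds of a degree away from the previous one, which is easily large enough to be seen as a separate image.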
Orientation-only ATW can be used to help address judder: if the rendered game frame is not submitted before vsync, timewarp can interrupt and generate the image instead, by warping the last frame to reflect the head motion since the last frame was rendered. Although this new image will not be exactly correct, it will have been adjusted for head rotation, so displaying it will reduce judder as compared to displaying the original frame again, which is what would have happened without ATW.
In certain situations, simple rotation-warping can work well. It has been implemented on Gear VR Innovator Edition, where it fills in the frames whenever games can’t meet the frame rate. This smooths many glitches to the point where they’re mostly not noticeable. Because Gear VR lacks position tracking and the content generally avoids near-field objects, many of the artifacts discussed below are lessened or avoided.
There are several reasons why ATW on the PC is significantly more challenging than on Gear VR, starting with the Rift’s support for positional tracking.
Positional judder is one of the most obvious artifacts with orientation-only timewarp. When you move your head, only the additional rotational component is reflected in the ATW-generated images, while any translational head movement since the frame was rendered is ignored. This means that as you move your head from side to side, or even just rotate your head which translates your eyes, you will see multiple-image judder on objects that are close to you. The effect is very noticeable in spaces with near field objects, such as the submarine screenshot below.
Image of a scene with near field objects affected by multiple-image judder.
So, how bad is this effect?
The magnitude of positional judder depends on the environment the player is in and the types of movements they make. If you keep your head relatively still and only look around at the scenery, the positional error will be small and the judder will not be very noticeable.
Note: This is normally the case for Gear VR Innovator Edition, which doesn’t include positional tracking. Nevertheless, the head model generates virtual translations, so when a game is running at half-rate on Gear VR, you can still observe positional judder on near-field objects.
If you’re looking at objects far away, the displacement change due to your head movement is unlikely to be significant enough to be noticeable. In these cases, ATW allows you to look around a scene with mid-to-far-field geometry without any noticeable judder.
On the other hand, if you’re in an environment with near-field detail, and you translate your head, the positional judder will be nearly as bad as running without ATW. This will also be true when you look down at a textured ground plane, which is not far enough away to avoid artifacts. The resulting perceptual effect is that of a glitchy, unstable world, which can be disorienting and uncomfortable.
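The dependence on distance can be estimated with simple trigonometry. The sketch below is only an approximation under assumed numbers: a head moving sideways at 0.5 m/s and an ATW-generated frame that is one 90Hz refresh out of date. It computes the angular error left uncorrected by orientation-only timewarp for an object at a given depth.

```python
import math

def positional_error_deg(head_speed_m_s, stale_time_s, depth_m):
    """Angular error from ignoring head translation, for an object at depth_m.

    The head moves head_speed_m_s * stale_time_s sideways while the warped
    frame still shows the object where it was at render time.
    """
    translation_m = head_speed_m_s * stale_time_s
    return math.degrees(math.atan2(translation_m, depth_m))

stale = 1.0 / 90.0  # one refresh interval at 90Hz
near = positional_error_deg(0.5, stale, 0.3)   # object 30 cm away
far = positional_error_deg(0.5, stale, 10.0)   # object 10 m away
print(round(near, 2), round(far, 2))  # 1.06 0.03
```

With these assumed values the near object is off by about a degree while the far object is off by a few hundredths of a degree, which matches the observation that the artifact is concentrated on near-field geometry.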
One possible way to address positional judder is to implement full positional warping, which applies both translation and orientation fixups to the original rendered frame. Positional warping needs to consider the depth of the original rendered frame, displacing parts of the image by different amounts. However, such displacement generates dis-occlusion artifacts at object edges, where areas of space are uncovered that don’t have data in the original frame.
Additionally, positional warping is more expensive, can’t easily handle translucency, has trouble with certain anti-aliasing approaches, and doesn’t address the other ATW artifacts discussed below.
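The depth-dependent displacement and the resulting dis-occlusion holes can be seen in a toy example. The following sketch forward-warps a single scanline using a per-pixel depth value. It is a simplification (one dimension, nearest-pixel splatting, a hypothetical focal length of 100 px) rather than a production positional timewarp.

```python
def positional_warp_scanline(colors, depths, focal_px, translation_m):
    """Forward-warp one scanline for a sideways camera translation.

    Each pixel shifts by focal_px * translation_m / depth, so near pixels
    move further than far ones. A depth test keeps the nearest surface when
    two pixels land on the same spot. Pixels that nothing lands on stay None:
    these are the dis-occlusion holes.
    """
    width = len(colors)
    out = [None] * width
    out_depth = [float("inf")] * width
    for x in range(width):
        shift = round(focal_px * translation_m / depths[x])
        nx = x - shift  # camera moves right, so the image content moves left
        if 0 <= nx < width and depths[x] < out_depth[nx]:
            out[nx] = colors[x]
            out_depth[nx] = depths[x]
    return out

# Background "B" at 10 m with a near object "N" at 0.5 m covering pixels 3..5.
colors = list("BBBNNNBBBB")
depths = [10.0, 10.0, 10.0, 0.5, 0.5, 0.5, 10.0, 10.0, 10.0, 10.0]
warped = positional_warp_scanline(colors, depths, 100.0, 0.01)
print(warped)
# ['B', 'N', 'N', 'N', None, None, 'B', 'B', 'B', 'B']
```

The near object moves two pixels while the background barely moves, and the two `None` entries are background that was hidden behind the object in the original frame. A real implementation has to invent something plausible for those pixels, which is where the edge artifacts come from.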
Moving and Animated Objects
Animated or moving objects cause another artifact with ATW: because a new image is generated just by warping the original image without knowledge of the movement of objects, for all ATW-generated frames they are effectively frozen in time. This artifact manifests as multiple images of these moving objects — i.e. judder.
Image of scene with a moving object affected by judder.
The impact of this artifact depends on the number, projected area, and speed of animated objects in the game scene: if the number or size of moving objects is small or they are not moving fast, the multiple images may not be particularly noticeable. However, when moving objects or animation covers a large portion of the screen it can be disturbing.
Additionally, the frame rate ratio between the game rendering and device refresh rate affects the perceived quality of the motion judder. In our experience, when ATW is filling in frames the game should render at a fixed integer fraction of the display refresh rate. For example, at 90Hz refresh rate, we should either hit 90Hz or fall down to the half-rate of 45Hz with ATW. This will result in image doubling, but the relative positions of the double images on the retina will be stable. Rendering at an intermediate rate, such as 65Hz, will result in a constantly changing number and position of the images on the retina, which is a worse artifact.
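A small simulation illustrates why the intermediate rate is worse. This sketch assumes an idealized pipeline in which game frames complete at perfectly regular intervals and each vsync shows the most recently completed one. The function name is hypothetical. It counts how many refreshes in a row each game frame stays on screen.

```python
from itertools import groupby

def display_run_lengths(render_fps, refresh_hz, vsyncs):
    """How many consecutive vsyncs each game frame is the newest available.

    At vsync k the newest completed game frame has index
    floor(k * render_fps / refresh_hz); integer math avoids rounding noise.
    """
    shown = [(k * render_fps) // refresh_hz for k in range(vsyncs)]
    return [len(list(group)) for _, group in groupby(shown)]

print(display_run_lengths(45, 90, 12))  # [2, 2, 2, 2, 2, 2]
print(display_run_lengths(65, 90, 12))  # [2, 1, 2, 1, 1, 2, 1, 2]
```

At 45Hz every game frame covers exactly two refreshes, so a moving object always appears doubled in the same way. At 65Hz frames alternate irregularly between one and two refreshes, so the number of images and their spacing keep changing.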
Specular and Reflection
Calculations for reflections and specular lighting consider the direction of the eye vector, or rendering camera vector, to produce an image that is custom rendered for each eye.
Diagrams courtesy of Wikipedia and http://ogldev.atspace.co.uk/www/tutorial19/tutorial19.html respectively.
Since this eye vector changes with head movement, specular and reflection rendering is no longer correct after timewarp. This may result in reflections and specular highlights juddering.
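To see how sensitive a highlight can be to the eye vector, here is a minimal sketch using the classic Phong specular term. The surface, light, eye positions, and shininess value are assumed purely for illustration. Real engines use a variety of shading models, but most share this dependence on the view direction.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def phong_specular(normal, to_light, to_eye, shininess):
    """Phong specular term: (R . V)^shininess, with R the mirrored light direction."""
    n, l, v = normalize(normal), normalize(to_light), normalize(to_eye)
    n_dot_l = dot(n, l)
    r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    return max(dot(r, v), 0.0) ** shininess

normal = (0.0, 0.0, 1.0)
to_light = (0.0, 0.0, 1.0)
rendered = phong_specular(normal, to_light, (0.0, 0.0, 1.0), 64)  # eye at render time
moved = phong_specular(normal, to_light, (0.1, 0.0, 1.0), 64)     # eye after head motion
print(round(rendered, 2), round(moved, 2))  # 1.0 0.73
```

In this assumed setup a small sideways change in the eye direction should dim the highlight by roughly a quarter, yet a timewarped frame keeps showing the value computed for the old eye position.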
While specular highlights and reflections are two of the most common cases where shading relies on the eye vector, many other eye vector-dependent shading tricks will have similar issues. For example, parallax mapping and relief mapping (a.k.a. parallax occlusion mapping) will show similar artifacts.
Implementing ATW is challenging for two primary reasons:
Let’s start with preemption granularity. At 90Hz, the interval between frames is roughly 11ms. This means that in order for ATW to have any chance of generating a frame, it must be able to preempt the main thread rendering commands and run in under 11ms.
However, 11ms isn’t actually good enough. If ATW runs at randomly scheduled points within the frame, its latency (i.e. the amount of time between execution and frame scan-out) will also be random. We also need to make sure we don’t skip any game-rendered frames.
What we really want is for ATW to run consistently shortly before the video card flips to a new frame for scan-out, with just enough time to complete the generation of the new timewarped frame. Short of having custom vsync-triggered ATW interrupt routines, we can achieve this with high-priority preemption granularity and scheduling of around 2ms or less.
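The timing budget described above can be written down as simple arithmetic. In this sketch the 1ms warp cost is an assumed figure and the function name is hypothetical. The 2ms preemption latency is the target mentioned above.

```python
def atw_start_offset_ms(refresh_hz, warp_cost_ms, preemption_latency_ms):
    """Latest point within a refresh interval at which ATW can be requested.

    ATW must be requested early enough to cover the worst-case preemption
    delay plus the time the warp itself takes, and still finish before vsync.
    """
    frame_ms = 1000.0 / refresh_hz
    lead_ms = warp_cost_ms + preemption_latency_ms
    if lead_ms >= frame_ms:
        raise ValueError("warp plus preemption does not fit in one refresh")
    return frame_ms - lead_ms

# 90Hz display, assumed 1ms warp, 2ms worst-case preemption latency.
offset = atw_start_offset_ms(90.0, 1.0, 2.0)
print(round(offset, 2))  # 8.11
```

With these numbers ATW would need to be kicked off about 8.1ms into each 11.1ms interval. A coarser preemption granularity pushes that start time earlier, which both increases the latency of the warped image and takes GPU time away from the game.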
It turns out that 2ms preemption on general rendering is a tall order for modern video cards and driver implementations. Although many GPUs support limited forms of preemption, the implementation varies significantly:
If the preemption doesn’t occur quickly enough, ATW will not complete warping a new frame before vsync, and the last frame will be shown again, resulting in judder. This means that a correct implementation should be able to preempt and resume rendering arbitrarily, regardless of the pipeline state. In theory, even triangle-granularity preemption is not good enough because with complex shaders we don’t know how long rendering a triangle will take. We’re working with GPU manufacturers to implement better preemption, but it will be a while before it’s ubiquitous.
Another part of the equation is rendering preemption support in the OS. Prior to Windows 8, Windows Display Driver Model (WDDM) supported limited preemption using “batch queue” granularity, where batches were built by the graphics driver. Unfortunately, graphics drivers tend to accumulate large batches for rendering efficiency, resulting in preemption that is too coarse to support ATW well.
With Windows 8, the situation improved as WDDM 1.2 added support for preemption at finer granularities; however, these preemption modes are currently not universally supported by graphics drivers. Rendering pipeline management is expected to improve significantly with Windows 10 and DirectX 12, which gives developers lower-level rendering control. This is good news, but we’re still left without a standard way to support rendering preemption until Windows 10 becomes mainstream. As a result, ATW will require vendor-specific driver extensions for the foreseeable future.
ATW is helpful, but it’s not a silver bullet
Once we have ubiquitous GPU rendering pipeline management and preemption, ATW may become another tool to help developers increase performance and reduce judder in VR. However, due to the issues and challenges we’ve outlined here, ATW is not a silver bullet; VR applications will want to sustain high frame rates to deliver the best quality of experience. In the worst cases, ATW’s artifacts can cause users to have an uncomfortable experience. Or stated differently: in the worst cases, ATW can’t prevent an experience from being uncomfortable.
Given the complexities and artifacts involved, it’s clear that ATW, even with positional timewarp, won’t become a perfect universal solution. This means that both orientation-only and positional ATW are best thought of as pothole insurance, filling in when a frame is occasionally skipped. To deliver comfortable, compelling VR that truly generates presence, developers will still need to target a sustained frame rate of 90Hz+.
Thankfully, Crysis-level graphics are by no means required to deliver incredible VR experiences. It’s perfectly reasonable to reduce the number of lights, the shadow detail, and the shader complexity if it means reaching that 90Hz sweet spot.
Dual-mode titles that try to support traditional monitors and VR will have the most performance difficulties, as the steep performance requirements for good VR quickly become a challenge to engine scalability. For developers in this situation, ATW may look very attractive despite the artifacts. However, as is typical with new mediums, ports are unlikely to be the best experiences; made-for-VR experiences that target 90Hz are likely to be substantially more successful in generating comfort, presence, and the true magic of VR.
— Michael Antonov, Chief Software Architect