It’s been a little over a year since Oculus Rift went on sale to consumers. With its release, not to mention the releases of Gear VR and other consumer-grade VR headsets, the industry shifted its focus from experimental R&D to actual product development. The challenges of VR software design become very real when your app ceases to be a tech demo and suddenly goes on sale with your name plastered all over it. Over the past year, we’ve seen VR developers home in on design practices that work not just in theory, but with actual customers who are willing to spend real dollars for a good experience.
As a member of the Oculus Content team, I’m lucky enough to work with a large number of developers shipping some of the best VR software available today. These studios are on the bleeding edge, solving hard design problems with little precedent to work from, and doing it with polish. Though we’re still in the nascent stages of VR design, we’ve learned a lot from these early innovators.
This post is a summary of the key lessons learned and design best practices we’ve identified from full-fledged, customer-ready VR software that has shipped over the last year.
VR developers have been talking about vection—the “vestibular illusion” that causes some people to feel queasy in VR—for years. There’s no need to rehash all that here, but if you’re interested in the physiology of VR sickness, check out this talk by Oculus Perceptual Scientist Richard Yao.
At the most basic level, avoiding vection requires a good headset that can produce a stereoscopic, lens-warped 3D image in a very short amount of time with minimal head tracking latency. Early VR headsets struggled to meet these requirements, but modern devices like Rift can achieve this. Now that the necessary hardware is available, the problem of vection falls squarely on software design.
Understanding vection helps lay the groundwork for some base principles for VR design, particularly for VR motion design.
VR Motion Design Fundamentals:
There are almost no hard-and-fast rules when it comes to VR design, but maintaining these base principles is the first step towards comfortable VR software.
Modern VR Design
Finding ways to move people through virtual space without causing vection is the most common design problem facing VR developers today. In the first wave of VR applications, we noticed many developers designing around vection by simply not allowing the camera to move at all. People stood stationary or moved by teleporting, which makes motion implicit and is therefore comfortable for almost everybody. But we’re starting to see apps that find ways to achieve smooth motion without making people uncomfortable.
The common element of every compelling smooth-motion VR locomotion system I’ve seen is fixed velocity movement. The camera may move forward smoothly, but it never accelerates or decelerates—it only moves at a fixed velocity and is only ever on (when moving) or off (when stopped).
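As a minimal illustration of the on/off model, a fixed-velocity mover can be sketched in a few lines. This is engine-agnostic pseudocode-style Python: the speed constant, function name, and tuple-based vectors are all illustrative assumptions, not taken from any real VR SDK.

```python
# Sketch of fixed-velocity locomotion: the camera is either moving at a
# constant speed or stopped -- it never accelerates or decelerates.
# All names and values here are illustrative, not from any real SDK.

MOVE_SPEED = 2.0  # meters per second, constant while moving

def step_camera(position, forward, stick_pressed, dt):
    """Advance the camera one frame. Velocity is binary: on or off."""
    if not stick_pressed:
        return position  # off: no motion, and no deceleration ramp
    # on: translate at the fixed speed along the forward vector
    return tuple(p + f * MOVE_SPEED * dt for p, f in zip(position, forward))
```

The key property is that there is no ramp: releasing the stick stops the camera instantly, and pressing it resumes motion at full speed, so the visual flow never implies acceleration the vestibular system can’t feel.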
Another approach that has emerged is tunnel vision, or the vignetting of the eye buffers so that the field of view is narrowed and peripheral vision blocked off. Some apps fade out the peripheral view only when moving; others get it for free by virtue of the way the scene is framed (for instance, a diving helmet or a dark room lit only by a flashlight effectively limits pixel flow in the peripheral vision without explicit vignetting). The most dynamic and complicated version of this system that we’ve seen to date is Ubisoft’s Eagle Flight.
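A speed-driven vignette can be reduced to a single easing function that feeds the shader. The sketch below is a hypothetical illustration: the threshold values and names are assumptions, not measurements from any shipped title.

```python
# Sketch of a speed-driven tunnel-vision vignette: the faster the camera
# moves, the more the periphery is faded out. Thresholds are illustrative
# assumptions; a real app would tune them per experience.

def vignette_strength(speed, full_fov_speed=0.5, min_fov_speed=3.0):
    """Map camera speed (m/s) to 0.0 (no vignette) .. 1.0 (max fade)."""
    if speed <= full_fov_speed:
        return 0.0  # slow enough: full field of view
    if speed >= min_fov_speed:
        return 1.0  # fast: maximum peripheral fade
    # linear ramp between the two thresholds
    return (speed - full_fov_speed) / (min_fov_speed - full_fov_speed)
```

The returned strength would then drive a radial fade in the eye-buffer shader, narrowing apparent field of view only while motion is actually happening.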
Though smoothly rotating someone’s viewpoint is highly vection-inducing, developers have found a more comfortable way to map an analog stick to character rotation: snap turns. When the user pushes the stick left or right, the viewpoint instantly rotates (usually in 30- or 40-degree increments). This method omits visual information that would suggest rotation (which the vestibular system can’t corroborate). Snap turns are surprisingly comfortable.
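In code, a snap turn is just an instantaneous yaw change plus a latch so a held stick doesn’t repeat the turn every frame. The sketch below is an illustrative assumption (names, the 30-degree increment, and the deadzone value are not from any real SDK):

```python
# Sketch of snap turning: a stick flick rotates the view instantly by a
# fixed increment, with no animated rotation in between. The increment
# and deadzone are illustrative assumptions.

SNAP_DEGREES = 30.0
DEADZONE = 0.5  # ignore small stick deflections

def snap_turn(yaw_degrees, stick_x, ready):
    """Return (new_yaw, ready). `ready` forces a stick re-center between snaps."""
    if abs(stick_x) < DEADZONE:
        return yaw_degrees, True  # stick re-centered: arm the next snap
    if not ready:
        return yaw_degrees, False  # stick still held: don't repeat the snap
    direction = 1.0 if stick_x > 0 else -1.0
    return (yaw_degrees + direction * SNAP_DEGREES) % 360.0, False
```

Because the rotation is applied in a single frame, the user never sees the intermediate frames of visual flow that smooth rotation would produce.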
Snap turns are leveraging a deeper pattern called “change blindness.” The idea is to omit visual information that’s likely to cause vection by simply cutting to black or instantaneously changing the viewpoint. The brain is tuned to fill in the blanks in our perceptual streams (e.g. you don’t notice it when you blink), and we can leverage that ability to transition the user’s viewpoint through motions that are likely to cause vection. For example, consider someone who needs to open a car door and sit down behind the wheel. Animating this motion would violate several VR101 rules and would induce vection in many people. Simply cutting from the point at which the door is opened to the point at which they’re already seated behind the wheel, with a short black interstitial, feels natural and expected.
We’ve seen a number of applications enable first-person movement by making the user’s head a rudder. In this system, people always move in the direction that they’re looking, and they drive around the scene by twisting their neck. The visual and vestibular systems shouldn’t be in dramatic disagreement because the head really is rotating. This model has proven effective with titles like AFFECTED: The Manor from Fallen Planet Studios and OZWE’s Anshar Wars 2 and is particularly useful on Gear VR.
Depth Cue Issues
One common mistake first-time VR developers make is overlaying 2D elements on top of the 3D scene. Floating UI elements like HUDs and subtitles are common in non-VR applications, and it’s tempting to simply hang them in space in your VR application. However, if done improperly this can cause depth cue conflict. Generally speaking, floating UI elements draw on top of the scene with no respect for depth or occlusion (your subtitles are rarely “behind” a foreground object). In VR, when people can see these floating elements with stereoscopic vision and perceive how far away they are in the virtual space, this sort of overlay doesn’t make sense. It’s common for an object that should be “behind” a wall (in terms of distance from the camera) to be drawn “in front” of the wall because it’s been implemented as an overlay. This sends a conflicting cue about the depth of these objects, which can be uncomfortable. This effect is extremely common with reticles, subtitles, and other sorts of floating UI elements.
The best solution we’ve seen is to attach HUD elements and other UI to objects in the world. Plaster your subtitles on a wall, or attach the life bar to the user’s arm. By putting them in the world and treating them as objects rather than an overlay, these elements will have real depth in the scene and avoid depth cue conflicts.
One of the key takeaways from 2016 has been that proper use of spatialized audio can increase the feeling of presence and immersion. Spatialized audio—where sounds that have obvious in-app sources sound like they’re coming from the same direction as those sources—has such a huge impact on the overall believability of the scene that we now recommend every VR developer integrate a spatialized audio SDK (whether they use the Oculus Audio SDK or another).
Not all sound needs to be spatialized. Ambient room noise, background music, and other sounds that have no obvious spatial source can be played without spatialization. But any sound that issues from a physical location should be spatialized.
Designing for Positional Tracking
Positional tracking—the technology that lets people in VR directly control the camera’s position by moving their head around—is a basic component of many VR applications. It’s also the source of a number of meaty design problems.
Tracked Space Size
Proper positional tracking requires people to dedicate some space in their homes to VR. The amount of space available varies widely, making it difficult for developers to know how large to make their virtual spaces. Based on a survey of Rift owners, we believe that most people have, on average, four square meters of trackable space available. If that were a square, it would be about two meters (roughly six and a half feet) per side, but most users probably have a rectangular space available. Some people have more space than that available, but a significant number have less room to work with.
That said, there’s huge variation in the way people prefer to enjoy VR. Should you design for a roomscale setup, 360 tracking, 180 tracking, or for a forward-facing, seated experience? Some people on mobile VR devices will be in more restrictive locations where they’re physically unable to turn all the way around. Regardless of how you intend your software to be used, people will interact with it in whichever way they’re most comfortable. To the extent that you can accommodate a wide range of usage patterns, your software will be available to a wider audience.
You can’t prevent people from putting their heads through the walls of your virtual world. Attempting to stop this behavior by stopping the virtual camera when it collides with a wall will result in an uncomfortable experience. Instead, a common solution is to detect an intersection between someone’s head and a virtual wall and fade to black based on the depth of penetration. As they push their head further into the wall, the fade should increase. At the point where they would be able to see through the polygon, the screen should be completely faded out. The nice thing about this system is that people quickly learn that accidental intersections with world geometry cause a fade and begin to self-correct. Creative Assembly’s Alien Isolation was one of the first titles to implement this solution.
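The depth-proportional fade described above boils down to a small clamp-and-scale mapping. The following is a minimal sketch, assuming an illustrative near-clip threshold; the names are hypothetical, not from any engine API.

```python
# Sketch of fade-to-black wall intersection handling: the fade increases
# with how far the head has pushed into the geometry, reaching full black
# before the eye could see through the polygon. NEAR_CLIP is an
# illustrative assumption.

NEAR_CLIP = 0.1  # meters: depth at which the view would clip through the wall

def wall_fade(penetration_depth):
    """Map head penetration depth (meters) to a 0.0-1.0 black-fade amount."""
    if penetration_depth <= 0.0:
        return 0.0  # head is outside the wall: no fade
    # ramp to fully black by the time the eye would pass through the surface
    return min(penetration_depth / NEAR_CLIP, 1.0)
```

Driving the screen fade from this value each frame gives users the gradual feedback that teaches them to self-correct, rather than a jarring camera stop.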
Camera Origin Choices
When developing a VR application, you can choose to make the camera’s origin rest on people’s floor or on their eyes (these are called “floor” and “eye” origins, respectively). Both options have certain advantages and disadvantages.
Using the floor as the origin will cause people’s viewpoint to be at the same height off the ground that they are in real life. Aligning their virtual viewpoint height with their real-world height can increase the sense of immersion. However, you can’t control how tall people in your virtual worlds are. If you want to render a virtual body, you’ll need to build a solution that can scale to different people’s height.
Putting your camera’s origin at people’s eyes means that you can control their height within the virtual world. This is useful for rendering virtual bodies that are a specific height and also for offering perspectives that differ from people’s real-world experience (for example, you can show people what the world looks like from the eyes of a child). Eye origin is also common for seated experiences. EVE: Valkyrie is a great example of eye-origin design for a seated experience, including a believable virtual body. However, by using the eye point as the origin, you no longer know where the physical floor is. This complicates interactions that involve ducking low or picking things up from the ground.
It’s possible to use positional tracking data to animate virtual avatars, like a remote player in a networked game. For lots of details about one implementation of this type of system, see our blog post about positional tracking-based IK animation in Dead and Buried.
When animating virtual characters, make sure to include basic head and eye tracking animation to follow the player’s position. Static characters in VR feel like mannequins and are disconcerting.
A big part of VR in the last year has been the introduction of six-degrees-of-freedom tracked controllers like Oculus Touch. Tracked controllers introduce a whole new collection of interesting design challenges that developers are tackling in fascinating ways.
Touch is designed to give you access to your hands in VR—not just implements that you can hold, but your actual hands. When done properly, virtual hands let you interact with the virtual world effortlessly and without conscious thought—after all, you already know how to use your hands. When implemented poorly, virtual hands can cause an uncomfortable “uncanny valley of hands” feeling. Getting virtual hands right means you need good hand registration.
Registration occurs when your brain sees your virtual hands and accepts them as a representation of your physical hands. For this to happen, a number of requirements should be met. Most importantly, the hand position and orientation needs to match your actual hand. It’s common for a slight offset or rotational error in the hand models to lead to poor registration. To get registration right, one method is to put a controller model in the scene and ensure its pivot is correct by peeking out from the bottom of the HMD as you move the controller near your face. Properly implemented, you should see the controller pass from the real world into the virtual world seamlessly. From there, the next step is to model hands around the controller and then hide the controller model.
An easy way to test registration is to use the “back of the hand” test. Run your index finger along the back of your other hand in VR and look to see if the touch you’re feeling aligns with the positions of your hand graphics. Poorly aligned hands will often be wildly off, but virtual hands with good registration should match closely.
Avoid fleshy, overly realistic hands, which can feel lifeless and cause discomfort. Attaching large hands or other objects to the hand is easily accepted by the brain (people can assume their hands are “inside” those objects), but hands that are too small can be disconcerting. Oculus uses ethereal, semi-transparent hands because they believably map to a wide range of people regardless of gender, age, race, or ethnicity.
As with head positional tracking, you can’t prevent people from putting their hands through virtual geometry. Trying to prevent this with collision detection makes it feel like hand tracking has been lost, which is very disconcerting. Another reason that Oculus uses ghostly hands is that it’s believable that those hands might pass through geometry.
That said, some developers have found fascinating ways to achieve hand physicality. Bossa Studios’ Surgeon Simulator uses physical hands that collide with world geometry, but when a person’s hand gets stuck somewhere, they immediately display a second set of transparent, skeletal hands that continue to track with the player’s motion. This visually indicates that tracking hasn’t been lost, and the ethereal appearance of the skeleton hands suggests that they can’t be used to manipulate the world. This is a great example of a developer designing an innovative solution to a VR-specific design problem.
Proprioception is your brain’s innate sense of where your limbs are located in physical space, even when you aren’t looking at them. This sensory system is so accurate that trying to simulate virtual limbs, like forearms and elbows, can be difficult—you can see virtual limbs, but you can’t feel them (in the absence of haptic feedback) or locate them using proprioception. It’s common for developers to attempt to attach an IK arm to a tracked controller, but this solution often results in discontinuity between the rendered virtual arm position and the person’s real arm, which proprioception can detect. There are titles that have achieved very believable forearms and elbows (see Lone Echo, coming soon), but this is a design problem that requires a great deal of patience and attention. Incorrect arms are worse than no arms at all.
Resistance and Torque
Though tracked controllers can give you virtual hands, they can’t simulate the torque or resistance we feel when manipulating weighty objects. Interactions that involve significant resistance—like lifting a heavy rock or pulling a large, industrial lever—don’t feel believable. At best, the targets of these sorts of interactions feel fake and flimsy, as if they were made of foam or papier-mâché. On the other hand, interactions that involve objects for which we don’t expect to feel significant physical resistance, like flicking a light switch, are readily believable. Similarly, objects that map well to the shape of the controller’s grip in a person’s hand (like a gun) feel natural. When designing hand interactions, consider the apparent weight of the objects people are manipulating and look for ways to make the lack of physical resistance believable.
Also be careful when working with interactions that require two hands. Lifting a heavy box or holding a pitchfork with two hands is likely to feel strange because the rigid constraints we expect between the hands don’t exist in VR.
Picking Things Up
The best way to pick up an object in VR is to grab it the way it was designed to be held. Nobody expects to pick a gun up by its barrel or a coffee cup by its base. When a person tries to pick up an object that affords gripping in an obvious way, you should snap the object into their hand at the correct alignment. They reach for the gun, close their hand, and come away with the gun held perfectly, just as they expected.
Objects that don’t have an obvious handle or grip (e.g. a soccer ball) should be attached to the hand at the moment that the grip trigger is depressed. In this case, the offset and orientation of the object to the hand is arbitrary, but as long as the object sticks to the hand it will feel believable. You shouldn’t snap or correct the object in this case—just stick it to the hand at whatever positional offset it was at when the grip was invoked.
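The “stick, don’t snap” behavior amounts to capturing the hand-to-object offset at grab time and holding it constant while gripped. Here’s a minimal sketch with plain tuples standing in for whatever vector type a real engine would use; the function names are illustrative assumptions (and rotation is omitted for brevity).

```python
# Sketch of grabbing an object with no obvious grip (e.g. a soccer ball):
# capture the arbitrary offset at the moment the grip is pressed, then
# keep the object at that offset while it is held. No snapping or
# correction is applied.

def begin_grab(hand_pos, object_pos):
    """Record the hand-to-object offset at the moment the grip is pressed."""
    return tuple(o - h for o, h in zip(object_pos, hand_pos))

def held_object_pos(hand_pos, grab_offset):
    """While gripped, the object follows the hand at the captured offset."""
    return tuple(h + o for h, o in zip(hand_pos, grab_offset))
```

For objects with an obvious grip (the gun case above), a real implementation would instead ignore the captured offset and snap the object to a hand-authored grip pose.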
Objects that require shifting of grip can be tricky. For example, early Oculus experiments with Touch controllers found that ping-pong paddles felt strange because we tend to shift them in our hand when we get ready to return a ball. This sort of angle shift is hard to mimic with a tracked controller. Tripwire’s Killing Floor: Incursion implements a dead simple (but highly effective) solution to this problem. The player has a knife that can either be used for slashing (when held by the handle) or throwing (when held by the tip). To switch the knife’s grip, the player simply presses a face button on the Touch controller. This gives the knife a simple, reliable interface without relying on gestures, hand-offs, or any other complicated interaction.
Throwing objects reliably with tracked controllers is harder than it looks. For Toybox, our early testbed for hand interactions, we tracked a history of hand velocities which were averaged to produce the velocity of a thrown object at the moment that it leaves the hand. Getting this to feel believable required a lot of iteration and a little smoke and mirrors (for example, we reduce the force of gravity on thrown objects in some cases).
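The velocity-history approach can be sketched with a small ring buffer. This is a hedged illustration of the averaging idea described above, not the actual Toybox code; the window size and class name are assumptions.

```python
# Sketch of release-velocity smoothing for throwing: keep a short history
# of the hand's per-frame velocities and average them at release, so a
# single noisy tracking frame doesn't ruin the throw. Window size is an
# illustrative assumption.

from collections import deque

class ThrowTracker:
    def __init__(self, window=8):
        self.samples = deque(maxlen=window)  # last N hand velocities

    def record(self, velocity):
        """Call once per frame with the hand's instantaneous velocity (x, y, z)."""
        self.samples.append(velocity)

    def release_velocity(self):
        """Average the window to get the thrown object's launch velocity."""
        n = len(self.samples)
        if n == 0:
            return (0.0, 0.0, 0.0)
        return tuple(sum(v[i] for v in self.samples) / n for i in range(3))
```

On top of this, a shipping title would layer the “smoke and mirrors” mentioned above, such as reduced gravity on thrown objects or per-object launch rules.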
The main takeaway here is that many objects afford throwing in many different ways. A frisbee is thrown using a completely different motion than the way a paper airplane is thrown. Making both of these actions believable requires building per-object physics rules to govern throwing. The most thorough implementation of this idea we’ve seen is Owlchemy Labs’ Job Simulator, which implements unique physics for just about every object available.
Down the Rabbit Hole
We’ve learned a lot from innovative VR developers in the last 12 months, but the patterns described here are just the beginning. One of the most exciting things about VR development is that so much of the design grammar has yet to be written. Developers exploring this space today are authoring solutions to brand-new design problems and, in the process, writing the rule book for the future of VR software. The things we’ve learned from these pioneers so far are to be appreciated, but there’s so much more to look forward to.
To check out the full talk presented at GDC and F8 2017, click here.