*This post was updated on July 31, 2017*
360 degree videos, also known as immersive videos or spherical videos, are video recordings where a view in every direction is recorded at the same time. Unlike normal videos that capture whatever is within the camera's field of view, these videos capture the entire 360 degree scene. A common way to depict 360 videos in VR is by wrapping a video to cover a sphere's inner surface and placing the user at the sphere's center. The viewer is able to look all around them, in any direction.
This blog is an introduction to 360 video, with tips and tricks specifically for developing videos for Gear VR.
BASIC TIPS
Before we dive into the nuts and bolts of 360 video, here is a list of basic tips for developers working on 360 video for Gear VR. Note that these are not formal recommendations or requirements, only suggested options and a starting point: the “best” configuration is highly dependent on the video content itself.
- Suggested Monoscopic Configuration: 4096x2048 video at 30 FPS, encoded with H.264.
- Suggested Stereoscopic Configuration: 3840x2160 video at 30 FPS, encoded with H.265 or VP9 at a bitrate of 10-20 Mbps.
- A comprehensive option for bitrate and encoding experimentation is ffmpeg.
- Prefer streaming, but allow users to download videos for later playback when possible.
- When streaming, ensure you are using at least HLS or DASH.
- When streaming, you can experiment with bigger buffers to allow the radio to turn off for brief periods of time during playback.
- ExoPlayer generally works well for advanced video playback options on Android.
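As a starting point for the ffmpeg experimentation suggested above, here is a minimal sketch that assembles an ffmpeg command line for the two suggested configurations. It only builds the argument list and does not run anything. The file names, the 15 Mbps default bitrate (a mid-point of the 10-20 Mbps range above), and the helper name `build_ffmpeg_command` are assumptions for illustration; the encoder names assume an ffmpeg build that includes libx264, libx265, and libvpx-vp9.

```python
from typing import List

# Encoder names as exposed by ffmpeg builds that include these libraries.
ENCODERS = {"h264": "libx264", "h265": "libx265", "vp9": "libvpx-vp9"}


def build_ffmpeg_command(src: str, dst: str, codec: str,
                         width: int, height: int,
                         fps: int = 30, bitrate_mbps: int = 15) -> List[str]:
    """Return an ffmpeg argument list (hypothetical helper; nothing is executed)."""
    if codec not in ENCODERS:
        raise ValueError(f"unsupported codec: {codec}")
    return [
        "ffmpeg", "-i", src,
        "-c:v", ENCODERS[codec],           # video encoder
        "-vf", f"scale={width}:{height}",  # output resolution
        "-r", str(fps),                    # output frame rate
        "-b:v", f"{bitrate_mbps}M",        # target video bitrate
        "-c:a", "copy",                    # leave the audio track untouched
        dst,
    ]


# Suggested monoscopic and stereoscopic configurations from the list above.
mono_cmd = build_ffmpeg_command("input.mp4", "mono_360.mp4", "h264", 4096, 2048)
stereo_cmd = build_ffmpeg_command("input.mp4", "stereo_360.mp4", "h265", 3840, 2160)
print(" ".join(mono_cmd))
print(" ".join(stereo_cmd))
```

Varying `bitrate_mbps` and the codec across a few short test clips is one simple way to compare quality against file size before committing to a configuration.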
MONO 360 vs STEREO 360
Most of the 360 video content on the web is monoscopic. This means that there is no depth information between background and foreground, and the same image is displayed to both eyes. Our brains can still partially make up for the lack of depth information by comparing objects to each other to guess at their size and distance, but these types of 360 videos appear essentially flat.
Stereoscopic 360 videos, also called “3D 360 videos,” deliver two distinct images rendered individually to each eye, allowing the brain to perceive depth the way we do in real life. The addition of depth information dramatically improves the experience of a 360 video. That said, making high-quality stereoscopic 360 videos is a challenging task. As camera technology improves, they will become easier to produce, but for now creating a 3D 360 video involves serious considerations at every step of the process: content planning, platform and equipment selection, capture, post-production in particular, and actual playback.
VIDEO RESOLUTION vs PERCEIVED RESOLUTION
One of the challenges with 360 degree video is resolution and playback. It is common for first-time video developers to be dismayed at the apparent low resolution of their video content, even when it is encoded at 4K or higher. At issue is the difference between the video file itself (which is usually an equirectangular panoramic image) and the small chunk of that video file that the user actually sees. Though the video file itself may be very high resolution, the viewer sees only a small portion of it at any given time, requiring the video file to be stretched significantly around their field of view. A 4K video file viewed with a 90 degree field of view would result in a perceived resolution of roughly 1K, because 90 degrees is only a quarter of the full 360 degree width. In practice, to get 4K perceived definition within a 90 degree field of view, the actual resolution of the video must be roughly 16K.
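The arithmetic behind those figures can be sketched as follows. This is a simplified model that assumes a monoscopic equirectangular video whose horizontal pixels are spread evenly across 360 degrees; it ignores lens distortion and the headset's own display resolution. The function names are illustrative.

```python
def perceived_width(video_width_px: int, fov_deg: float) -> float:
    """Horizontal pixels of the source that fall inside the field of view."""
    return video_width_px * fov_deg / 360.0


def required_width(target_width_px: int, fov_deg: float) -> float:
    """Source width needed so the field of view alone contains the target width."""
    return target_width_px * 360.0 / fov_deg


# A 4K-wide (4096 px) source seen through a 90 degree field of view.
print(perceived_width(4096, 90))  # 1024.0, roughly "1K"
# Source width needed for 4096 visible pixels in that same field of view.
print(required_width(4096, 90))   # 16384.0, roughly "16K"
```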
ENCODING
Video codecs provide a format for video compression and decompression. These can be software- or hardware-based depending on the target platform. The most widely supported video codecs, and hence the most commonly suggested options, include H.264, H.265, VP8, and VP9.
Which encoding should I use?
It depends, but here are a few things to consider when selecting a codec:
- Overall codec support. Depending on the platforms you are targeting, you may want to choose a codec based on how widely it is available. For example, H.264 is a widely supported codec across platforms, so it is generally the easiest and default option for many.
- Hardware vs software decoding. Once you know which devices you are targeting, you can determine whether the codec you are planning to use can be decoded via hardware or software. Hardware decoding generally has a considerable performance and power efficiency advantage over software decoding, as the latter requires a lot of CPU work, and CPU time is a particularly precious resource on mobile devices. To avoid overheating and excessive battery consumption, hardware decoding formats are preferred over software decoding formats. VP9, HEVC (H.265), and H.264 can all be hardware decoded on Gear VR devices.
- Licensing. Some codecs, such as HEVC (H.265), come with licensing terms that may not be compatible with your use case. Other codecs, such as VP8 and VP9, are free to use but may not enjoy as widespread support as H.264.
- The Future. There are new formats on the horizon, such as AV1, which may be applicable to 360 video in the future.
STREAMING OR DOWNLOADING
An important consideration is whether the content will be streamed or available for download, and which is appropriate for the experience you want to deliver. Most videos nowadays are consumed via streaming rather than downloaded in advance, which leads users to expect the same kind of experience in VR. If applicable, a very valid option is to let the user choose between the two. That said, Gear VR users tend to prefer streaming.
Adaptive streaming...
Should you decide to go the streaming path, some form of adaptive streaming is required, either HLS or DASH. Adaptive streaming technologies require you to have numerous versions of the same video encoded at different bitrates, so that the appropriate version can be delivered based on the network speed of the consumer's device. When the consumer's bandwidth crosses a bitrate threshold, HLS/DASH serves the version appropriate for that threshold, and adapts again if the network speed changes. To make use of HLS/DASH, you will either have to use an existing streaming service or roll your own player, as Android's built-in media player does not support adaptive streaming. ExoPlayer is a common choice for Gear VR 360 video streaming apps. It is worth noting that the main advantage of HLS/DASH is that it prevents buffering by sacrificing quality in return for speed. This is not an ideal solution for VR, as a sudden drop in video quality is very noticeable and can damage immersion.
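The threshold behavior described above can be illustrated with a small sketch. This is not how ExoPlayer or any particular HLS/DASH player is implemented; the bitrate ladder, the `Rendition` type, and the 0.8 safety factor are assumptions chosen for illustration.

```python
from typing import List, NamedTuple


class Rendition(NamedTuple):
    name: str
    bitrate_kbps: int


def pick_rendition(renditions: List[Rendition], bandwidth_kbps: float,
                   safety: float = 0.8) -> Rendition:
    """Pick the highest-bitrate rendition that fits within a share of the bandwidth."""
    if not renditions:
        raise ValueError("at least one rendition is required")
    ladder = sorted(renditions, key=lambda r: r.bitrate_kbps)
    budget = bandwidth_kbps * safety  # leave headroom for throughput fluctuations
    chosen = ladder[0]                # fall back to the lowest rendition
    for rendition in ladder:
        if rendition.bitrate_kbps <= budget:
            chosen = rendition
    return chosen


# Hypothetical bitrate ladder for one 360 video.
LADDER = [Rendition("low", 5000), Rendition("mid", 10000), Rendition("high", 20000)]

print(pick_rendition(LADDER, 30000).name)  # high
print(pick_rendition(LADDER, 14000).name)  # mid
print(pick_rendition(LADDER, 3000).name)   # low
```

Re-running the selection whenever the measured bandwidth changes reproduces the adaptation step, and it also shows where the visible quality drop comes from: a small dip in bandwidth can push playback down a full step in the ladder.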
...is not enough
How can HLS/DASH be improved for VR? There are a number of techniques specifically designed to improve 360 video streaming that are under development, such as foveated streaming, barrel encoding, and equiangular cubemaps. However, the jury is still out on which of these approaches (if any) will prove to be the most useful, and each is still in the early stages of development.
Keep in mind that all forms of streaming playback are more power hungry than the download alternative. It is important to test your application not only for performance, but also for battery impact and overheating.
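One way to reason about the bigger-buffer tip from the basic tips list is a rough duty-cycle estimate. The sketch below assumes the player fetches video in bursts at full network speed and then lets the radio idle while that chunk plays; the function name and the example numbers are illustrative, and real radios also stay awake for a short period after each transfer, which this model ignores.

```python
from typing import Tuple


def radio_cycle(burst_seconds: float, video_mbps: float,
                network_mbps: float) -> Tuple[float, float]:
    """Return (download_time, idle_time) in seconds for one buffered burst."""
    if network_mbps <= video_mbps:
        raise ValueError("network must be faster than the video bitrate")
    # Time to download burst_seconds worth of video at full network speed.
    download_time = burst_seconds * video_mbps / network_mbps
    # The burst plays for burst_seconds, so the rest of that time is idle.
    idle_time = burst_seconds - download_time
    return download_time, idle_time


# 15 Mbps video on a 60 Mbps connection, with small and large bursts.
print(radio_cycle(5, 15, 60))   # (1.25, 3.75)
print(radio_cycle(30, 15, 60))  # (7.5, 22.5)
```

In this model the fraction of time spent downloading is the same in both cases, but the larger buffer produces longer uninterrupted idle windows, which is what gives the radio a chance to power down. Treat the result as a starting point for on-device battery and thermal measurements, not a substitute for them.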
AUDIO
If you are interested in learning more about spatialized audio, check out this post about Facebook's comprehensive 360 video audio toolset, Facebook Spatial Workstation.
NEXT STEPS
This article has only scratched the surface of 360 video production. There are many other tricks and techniques that content designers are employing today, as well as many new technologies (including WebVR) on the horizon.
Be sure to keep an eye on this blog for more Oculus best practices as well as developer guest posts.