Rendering to the Oculus Rift

The Oculus Rift requires split-screen stereo with distortion correction for each eye to cancel lens-related distortion.

OculusWorldDemo Stereo Rendering

Correcting for distortion can be challenging, with distortion parameters varying for different lens types and individual eye relief. To make development easier, the Oculus SDK handles distortion correction automatically within the Oculus Compositor process; it also applies latency-reducing timewarp and presents frames to the headset.

With the Oculus SDK doing much of the work, the main job of the application is to perform simulation and render the stereo world based on the tracking pose. Stereo views can be rendered into either one or two individual textures and are submitted to the compositor by calling ovr_SubmitFrame. We cover this process in detail in this section.

Rendering to the Oculus Rift

The Oculus Rift requires the scene to be rendered in split-screen stereo with half of the screen used for each eye.

When using the Rift, the left eye sees the left half of the screen, and the right eye sees the right half. Although it varies from person to person, the distance between human eye pupils is approximately 65 mm. This is known as interpupillary distance (IPD). The in-application cameras should be configured with the same separation.

Note:

This is a translation of the camera, not a rotation, and it is this translation (and the parallax effect that goes with it) that causes the stereoscopic effect. This means that your application will need to render the entire scene twice, once with the left virtual camera, and once with the right.

The reprojection stereo rendering technique, which relies on left and right views being generated from a single fully rendered view, is usually not viable with an HMD because of significant artifacts at object edges.

The lenses in the Rift magnify the image to provide a very wide field of view (FOV) that enhances immersion. However, this process distorts the image significantly. If the engine were to display the original images on the Rift, then the user would observe them with pincushion distortion.

Pincushion and Barrel Distortion

To counteract this distortion, the SDK applies post-processing to the rendered views with an equal and opposite barrel distortion so that the two cancel each other out, resulting in an undistorted view for each eye. Furthermore, the SDK also corrects chromatic aberration, which is a color separation effect at the edges caused by the lens. Although the exact distortion parameters depend on the lens characteristics and eye position relative to the lens, the Oculus SDK takes care of all necessary calculations when generating the distortion mesh.

When rendering for the Rift, projection axes should be parallel to each other as illustrated in the following figure, and the left and right views are completely independent of one another. This means that camera setup is very similar to that used for normal non-stereo rendering, except that the cameras are shifted sideways to adjust for each eye location.

HMD Eye View Cones

In practice, the projections in the Rift are often slightly off-center because our noses get in the way! But the point remains, the left and right eye views in the Rift are entirely separate from each other, unlike stereo views generated by a television or a cinema screen. This means you should be very careful if trying to use methods developed for those media because they do not usually apply in VR.

The two virtual cameras in the scene should be positioned so that they are oriented in the same way as the eye poses, specified by ovrEyeRenderDesc::HmdToEyePose, and such that the distance between them is the same as the distance between the eyes, or the interpupillary distance (IPD).

Although the Rift’s lenses are approximately the right distance apart for most users, they may not exactly match the user’s IPD. However, because of the way the optics are designed, each eye will still see the correct view. It is important that the software makes the distance between the virtual cameras match the user’s IPD as found in their profile (set in the configuration utility), and not the distance between the Rift’s lenses.
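
As a small sketch of this, the per-eye transforms reported by ovr_GetRenderDesc can be used to read the effective eye separation from the profile; the session variable is assumed to be an initialized ovrSession:

// Query the per-eye render descriptions for the default FOV.
ovrHmdDesc hmdDesc = ovr_GetHmdDesc(session);
ovrEyeRenderDesc eyeDesc[2];
eyeDesc[0] = ovr_GetRenderDesc(session, ovrEye_Left,  hmdDesc.DefaultEyeFov[0]);
eyeDesc[1] = ovr_GetRenderDesc(session, ovrEye_Right, hmdDesc.DefaultEyeFov[1]);

// The distance between the two eye positions is the user's IPD (in meters),
// taken from their profile rather than from the physical lens spacing.
ovrVector3f l = eyeDesc[0].HmdToEyePose.Position;
ovrVector3f r = eyeDesc[1].HmdToEyePose.Position;
float dx = r.x - l.x, dy = r.y - l.y, dz = r.z - l.z;
float ipd = sqrtf(dx * dx + dy * dy + dz * dz);   // typically around 0.065 m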

Rendering Setup Outline

The Oculus SDK makes use of a compositor process to present frames and handle distortion.

To target the Rift, you render the scene into one or two render textures, passing these textures into the API. The Oculus runtime handles distortion rendering, GPU synchronization, frame timing, and frame presentation to the HMD.

The following are the steps for SDK rendering:

  1. Initialize:
    1. Initialize Oculus SDK and create an ovrSession object for the headset as was described earlier.
    2. Compute the desired FOV and texture sizes based on ovrHmdDesc data.
    3. Allocate ovrTextureSwapChain objects, used to represent eye buffers, in an API-specific way: call ovr_CreateTextureSwapChainDX for either Direct3D 11 or 12, ovr_CreateTextureSwapChainGL for OpenGL, or ovr_CreateTextureSwapChainVk for Vulkan.
  2. Set up frame handling:

    1. Use ovr_GetTrackingState and ovr_CalcEyePoses to compute eye poses needed for view rendering based on frame timing information.
    2. Perform rendering for each eye in an engine-specific way, rendering into the current texture within the texture set. Current texture is retrieved using ovr_GetTextureSwapChainCurrentIndex and ovr_GetTextureSwapChainBufferDX, ovr_GetTextureSwapChainBufferGL, or ovr_GetTextureSwapChainBufferVk. After rendering to the texture is complete, the application must call ovr_CommitTextureSwapChain.
    3. Call ovr_SubmitFrame, passing the swap texture set(s) from the previous step within an ovrLayerEyeFov structure. Although a single layer is required to submit a frame, you can use multiple layers and layer types for advanced rendering. ovr_SubmitFrame passes layer textures to the compositor, which handles distortion, timewarp, and GPU synchronization before presenting the frame to the headset.
  3. Shutdown:
    1. Call ovr_DestroyTextureSwapChain to destroy swap texture buffers. Call ovr_DestroyMirrorTexture to destroy a mirror texture. To destroy the ovrSession object, call ovr_Destroy.

Texture Swap Chain Initialization

This section describes rendering initialization, including creation of texture swap chains.

Initially, you determine the rendering FOV and allocate the required ovrTextureSwapChain. The following code shows how the required texture size can be computed:

// Configure Stereo settings.
ovrHmdDesc hmdDesc = ovr_GetHmdDesc(session);
Sizei recommendedTex0Size = ovr_GetFovTextureSize(session, ovrEye_Left,
                                                  hmdDesc.DefaultEyeFov[0], 1.0f);
Sizei recommendedTex1Size = ovr_GetFovTextureSize(session, ovrEye_Right,
                                                  hmdDesc.DefaultEyeFov[1], 1.0f);
Sizei bufferSize;
bufferSize.w = recommendedTex0Size.w + recommendedTex1Size.w;
bufferSize.h = max(recommendedTex0Size.h, recommendedTex1Size.h);

The render texture size is determined based on the FOV and the desired pixel density at the center of the eye. Although both the FOV and pixel density values can be modified to improve performance, this example uses the recommended FOV (obtained from hmdDesc.DefaultEyeFov, where hmdDesc is returned by ovr_GetHmdDesc). The function ovr_GetFovTextureSize computes the desired texture size for each eye based on these parameters.

The Oculus API allows the application to use either one shared texture or two separate textures for eye rendering. This example uses a single shared texture for simplicity, making it large enough to fit both eye renderings.
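
For example, with a single shared texture the per-eye viewports can be derived directly from bufferSize; this brief sketch mirrors the layer viewport setup shown later in Frame Rendering:

// Left eye renders into the left half of the shared texture,
// right eye into the right half.
Recti eyeViewport[2];
eyeViewport[0] = Recti(0,                0, bufferSize.w / 2, bufferSize.h);
eyeViewport[1] = Recti(bufferSize.w / 2, 0, bufferSize.w / 2, bufferSize.h);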

If you're using Vulkan, there are three steps that you need to add before you can create the texture swap chain.

  • During initialization, call ovr_GetSessionPhysicalDeviceVk to get the current physical device matching the luid. Then, create a VkDevice associated with the returned physical device.
  • AMD hardware uses different extensions on Vulkan. Add code similar to the following example to handle the AMD GPU extensions during app initialization. This code example comes from Win32_VulkanAppUtil.h, which is included with the OculusRoomTiny_Advanced sample app that ships with the Oculus SDK.

    static const uint32_t AMDVendorId = 0x1002;
    isAMD = (gpuProps.vendorID == AMDVendorId);

    static const char* deviceExtensions[] =
    {
        VK_KHR_SWAPCHAIN_EXTENSION_NAME,
        VK_KHX_EXTERNAL_MEMORY_EXTENSION_NAME,
    #if defined(VK_USE_PLATFORM_WIN32_KHR)
        VK_KHX_EXTERNAL_MEMORY_WIN32_EXTENSION_NAME,
    #endif
    };

    static const char* deviceExtensionsAMD[] =
    {
        VK_KHR_SWAPCHAIN_EXTENSION_NAME
    };
  • Then, after the game loop has been established, identify which queue to synchronize when rendering by calling ovr_SetSynchonizationQueueVk; a brief sketch of these Vulkan-specific calls follows this list.
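
A compressed sketch of these Vulkan-specific steps is shown below. The session, luid, instance, and renderQueue variables are assumed to exist already, and the exact signatures should be checked against the OVR_CAPI_Vk.h header in your SDK version:

// 1. Match the Vulkan physical device to the headset's adapter
//    (luid is the ovrGraphicsLuid returned by ovr_Create).
VkPhysicalDevice physicalDevice = VK_NULL_HANDLE;
ovr_GetSessionPhysicalDeviceVk(session, luid, instance, &physicalDevice);

// ... create a VkDevice and its queues from physicalDevice, enabling the
//     device extensions listed above (the AMD list when isAMD is true) ...

// 2. Once the render queue exists, tell the runtime which queue to
//    synchronize against when compositing the submitted eye textures.
ovr_SetSynchonizationQueueVk(session, renderQueue);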

Create the Texture Swap Chain

Once texture size is known, the application can call ovr_CreateTextureSwapChainGL, ovr_CreateTextureSwapChainDX, or ovr_CreateTextureSwapChainVk to allocate the texture swap chains in an API-specific way.

Here's how a texture swap chain can be created and accessed under OpenGL:

ovrTextureSwapChain textureSwapChain = 0;

ovrTextureSwapChainDesc desc = {};
desc.Type = ovrTexture_2D;
desc.ArraySize = 1;
desc.Format = OVR_FORMAT_R8G8B8A8_UNORM_SRGB;
desc.Width = bufferSize.w;
desc.Height = bufferSize.h;
desc.MipLevels = 1;
desc.SampleCount = 1;
desc.StaticImage = ovrFalse;

if (ovr_CreateTextureSwapChainGL(session, &desc, &textureSwapChain) == ovrSuccess)
{
    // Sample texture access:
    unsigned int texId;
    ovr_GetTextureSwapChainBufferGL(session, textureSwapChain, 0, &texId);
    glBindTexture(GL_TEXTURE_2D, texId);
    ...
}

Here's a similar example of texture swap chain creation and access using Direct3D 11:

ovrTextureSwapChain textureSwapChain = 0;
std::vector<ID3D11RenderTargetView*> texRtv;

ovrTextureSwapChainDesc desc = {};
desc.Type = ovrTexture_2D;
desc.Format = OVR_FORMAT_R8G8B8A8_UNORM_SRGB;
desc.ArraySize = 1;
desc.Width = bufferSize.w;
desc.Height = bufferSize.h;
desc.MipLevels = 1;
desc.SampleCount = 1;
desc.StaticImage = ovrFalse;
desc.MiscFlags = ovrTextureMisc_None;
desc.BindFlags = ovrTextureBind_DX_RenderTarget;

if (ovr_CreateTextureSwapChainDX(session, DIRECTX.Device, &desc, &textureSwapChain) == ovrSuccess)
{
    int count = 0;
    ovr_GetTextureSwapChainLength(session, textureSwapChain, &count);
    texRtv.resize(count);
    for (int i = 0; i < count; ++i)
    {
        ID3D11Texture2D* texture = nullptr;
        ovr_GetTextureSwapChainBufferDX(session, textureSwapChain, i, IID_PPV_ARGS(&texture));
        DIRECTX.Device->CreateRenderTargetView(texture, nullptr, &texRtv[i]);
        texture->Release();
    }
}

Here's sample code from the provided OculusRoomTiny sample running in Direct3D 12:

ovrTextureSwapChain TexChain;
std::vector<D3D12_CPU_DESCRIPTOR_HANDLE> texRtv;
std::vector<ID3D12Resource*> TexResource;

ovrTextureSwapChainDesc desc = {};
desc.Type = ovrTexture_2D;
desc.ArraySize = 1;
desc.Format = OVR_FORMAT_R8G8B8A8_UNORM_SRGB;
desc.Width = sizeW;
desc.Height = sizeH;
desc.MipLevels = 1;
desc.SampleCount = 1;
desc.MiscFlags = ovrTextureMisc_DX_Typeless;
desc.StaticImage = ovrFalse;
desc.BindFlags = ovrTextureBind_DX_RenderTarget;

// DIRECTX.CommandQueue is the ID3D12CommandQueue used to render the eye textures by the app
ovrResult result = ovr_CreateTextureSwapChainDX(session, DIRECTX.CommandQueue, &desc, &TexChain);
if (!OVR_SUCCESS(result))
    return false;

int textureCount = 0;
ovr_GetTextureSwapChainLength(session, TexChain, &textureCount);
texRtv.resize(textureCount);
TexResource.resize(textureCount);
for (int i = 0; i < textureCount; ++i)
{
    result = ovr_GetTextureSwapChainBufferDX(session, TexChain, i, IID_PPV_ARGS(&TexResource[i]));
    if (!OVR_SUCCESS(result))
        return false;

    D3D12_RENDER_TARGET_VIEW_DESC rtvd = {};
    rtvd.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    rtvd.ViewDimension = D3D12_RTV_DIMENSION_TEXTURE2D;
    texRtv[i] = DIRECTX.RtvHandleProvider.AllocCpuHandle(); // Gives new D3D12_CPU_DESCRIPTOR_HANDLE
    DIRECTX.Device->CreateRenderTargetView(TexResource[i], &rtvd, texRtv[i]);
}
Note: For Direct3D 12, when calling ovr_CreateTextureSwapChainDX, the caller provides an ID3D12CommandQueue instead of an ID3D12Device to the SDK. It is the caller's responsibility to make sure that this ID3D12CommandQueue instance is where all VR eye-texture rendering is executed. Alternatively, it can be used as a "join-node" fence to wait for the command lists executed by other command queues that render the VR eye textures.

Here's how a texture swap chain can be created and accessed using Vulkan:

bool Create(ovrSession aSession, VkExtent2D aSize, RenderPass& renderPass, DepthBuffer& depthBuffer)
{
    session = aSession;
    size = aSize;

    ovrTextureSwapChainDesc desc = {};
    desc.Type = ovrTexture_2D;
    desc.ArraySize = 1;
    desc.Format = OVR_FORMAT_R8G8B8A8_UNORM_SRGB;
    desc.Width = (int)size.width;
    desc.Height = (int)size.height;
    desc.MipLevels = 1;
    desc.SampleCount = 1;
    desc.MiscFlags = ovrTextureMisc_DX_Typeless;
    desc.BindFlags = ovrTextureBind_DX_RenderTarget;
    desc.StaticImage = ovrFalse;

    ovrResult result = ovr_CreateTextureSwapChainVk(session, Platform.device, &desc, &textureChain);
    if (!OVR_SUCCESS(result))
        return false;

    int textureCount = 0;
    ovr_GetTextureSwapChainLength(session, textureChain, &textureCount);
    texElements.reserve(textureCount);
    for (int i = 0; i < textureCount; ++i)
    {
        VkImage image;
        result = ovr_GetTextureSwapChainBufferVk(session, textureChain, i, &image);
        texElements.emplace_back(RenderTexture());
        CHECK(texElements.back().Create(image, VK_FORMAT_R8G8B8A8_SRGB, size, renderPass, depthBuffer.view));
   }

    return true;
}

Once these textures and render targets are successfully created, you can use them to perform eye-texture rendering. The Frame Rendering section describes viewport setup in more detail.

The Oculus compositor provides sRGB-correct rendering, which results in more photorealistic visuals, better MSAA, and energy-conserving texture sampling, all of which are important for VR applications. As shown above, applications are expected to create sRGB texture swap chains. Proper treatment of sRGB rendering is a complex subject and, although this section provides an overview, extensive information is outside the scope of this document.

There are several steps to ensuring a real-time rendered application achieves sRGB-correct shading and different ways to achieve it. For example, most GPUs provide hardware acceleration to improve gamma-correct shading for sRGB-specific input and output surfaces, while some applications use GPU shader math for more customized control. For the Oculus SDK, when an application passes in sRGB-space texture swap chains, the compositor relies on the GPU's sampler to do the sRGB-to-linear conversion.

All color textures fed into a GPU shader should be marked appropriately with the sRGB-correct format, such as OVR_FORMAT_R8G8B8A8_UNORM_SRGB. This is also recommended for applications that provide static textures as quad-layer textures to the Oculus compositor. Failure to do so will cause the texture to look much brighter than expected.

For D3D 11 and 12, the texture format provided in desc for ovr_CreateTextureSwapChainDX is used by the distortion compositor for the ShaderResourceView when reading the contents of the texture. As a result, the application should request texture swap chain formats that are in sRGB-space (e.g. OVR_FORMAT_R8G8B8A8_UNORM_SRGB).

If your application is configured to render into a linear-format texture (e.g. OVR_FORMAT_R8G8B8A8_UNORM) and handles the linear-to-gamma conversion using HLSL code, or does not care about any gamma-correction, then:

  • Request an sRGB format (e.g. OVR_FORMAT_R8G8B8A8_UNORM_SRGB) texture swap chain.
  • Specify the ovrTextureMisc_DX_Typeless flag in the desc.
  • Create a linear-format RenderTargetView (e.g. DXGI_FORMAT_R8G8B8A8_UNORM).
Note: The ovrTextureMisc_DX_Typeless flag for depth buffer formats (e.g. OVR_FORMAT_D32) is ignored as they are always converted to be typeless.

The following code sample demonstrates how to use the ovrTextureMisc_DX_Typeless flag in D3D11:

    ovrTextureSwapChainDesc desc = {};
    desc.Type = ovrTexture_2D;
    desc.ArraySize = 1;
    desc.Format = OVR_FORMAT_R8G8B8A8_UNORM_SRGB;
    desc.Width = sizeW;
    desc.Height = sizeH;
    desc.MipLevels = 1;
    desc.SampleCount = 1;
    desc.MiscFlags = ovrTextureMisc_DX_Typeless;
    desc.BindFlags = ovrTextureBind_DX_RenderTarget;
    desc.StaticImage = ovrFalse;

    ovrResult result = ovr_CreateTextureSwapChainDX(session, DIRECTX.Device, &desc, &textureSwapChain);

    if(!OVR_SUCCESS(result))
        return;

    int count = 0;
    ovr_GetTextureSwapChainLength(session, textureSwapChain, &count);
    for (int i = 0; i < count; ++i)
    {
        ID3D11Texture2D* texture = nullptr;
        ovr_GetTextureSwapChainBufferDX(session, textureSwapChain, i, IID_PPV_ARGS(&texture));
        D3D11_RENDER_TARGET_VIEW_DESC rtvd = {};
        rtvd.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
        rtvd.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE2D;
        DIRECTX.Device->CreateRenderTargetView(texture, &rtvd, &texRtv[i]);
        texture->Release();
    }
        

For OpenGL, the format parameter of ovr_CreateTextureSwapChainGL is used by the distortion compositor when reading the contents of the texture. As a result, the application should request texture swap chain formats preferably in sRGB-space (e.g. OVR_FORMAT_R8G8B8A8_UNORM_SRGB). Furthermore, your application should call glEnable(GL_FRAMEBUFFER_SRGB); before rendering into these textures.
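
As a brief sketch of that recommended OpenGL path, assuming the swap chain created earlier and an application framebuffer object named fboId, each frame the current chain texture can be attached and rendered with sRGB writes enabled:

int currentIndex = 0;
ovr_GetTextureSwapChainCurrentIndex(session, textureSwapChain, &currentIndex);

unsigned int texId = 0;
ovr_GetTextureSwapChainBufferGL(session, textureSwapChain, currentIndex, &texId);

// Attach the current swap-chain texture and render with sRGB writes enabled,
// so the application's output matches the compositor's sRGB sampling.
glBindFramebuffer(GL_FRAMEBUFFER, fboId);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, texId, 0);
glEnable(GL_FRAMEBUFFER_SRGB);

// ... draw the scene ...

ovr_CommitTextureSwapChain(session, textureSwapChain);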

Even though it is not recommended, if your application is configured to treat the texture as a linear format (e.g. GL_RGBA) and performs linear-to-gamma conversion in GLSL or does not care about gamma-correction, then:

  • Request an sRGB format (e.g. OVR_FORMAT_R8G8B8A8_UNORM_SRGB) texture swap chain.
  • Do not call glEnable(GL_FRAMEBUFFER_SRGB); when rendering into the texture.

For Vulkan, the format parameter of ovr_CreateTextureSwapChainVk is used by the distortion compositor when reading the contents of the texture. Your application should request texture swap chain formats in the sRGB-space (e.g. OVR_FORMAT_R8G8B8A8_UNORM_SRGB) as the compositor does sRGB-correct rendering. The compositor will rely on the GPU’s hardware sampler to perform the sRGB-to-linear conversion.

If your application prefers rendering to a linear format (e.g. OVR_FORMAT_R8G8B8A8_UNORM) while handling the linear-to-gamma conversion in SPIR-V shader code, the application must still request the corresponding sRGB format and also set ovrTextureMisc_DX_Typeless in the MiscFlags field of ovrTextureSwapChainDesc. This allows the application to create a RenderTargetView in linear format, while allowing the compositor to treat it as sRGB. Failure to do this will result in unexpected gamma-curve artifacts. The ovrTextureMisc_DX_Typeless flag for depth buffer formats (e.g. OVR_FORMAT_D32_FLOAT) is ignored, as they are always converted to be typeless.

These sRGB considerations also apply to mirror texture creation. For more information, refer to the function documentation provided for ovr_CreateMirrorTextureDX, ovr_CreateMirrorTextureGL, and ovr_CreateMirrorTextureWithOptionsVk for D3D, OpenGL, and Vulkan, respectively.
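
For illustration, a D3D11 mirror texture for an on-monitor preview might be created along the following lines; the window dimensions, DIRECTX.Context, and backBuffer names here are assumptions standing in for the application's own objects:

ovrMirrorTexture mirrorTexture = nullptr;
ovrMirrorTextureDesc mirrorDesc = {};
mirrorDesc.Format = OVR_FORMAT_R8G8B8A8_UNORM_SRGB;   // sRGB, matching the swap chains
mirrorDesc.Width  = windowWidth;
mirrorDesc.Height = windowHeight;
ovr_CreateMirrorTextureDX(session, DIRECTX.Device, &mirrorDesc, &mirrorTexture);

// Each frame, after ovr_SubmitFrame, copy the mirror into the window's back buffer.
ID3D11Texture2D* mirrorTex = nullptr;
ovr_GetMirrorTextureBufferDX(session, mirrorTexture, IID_PPV_ARGS(&mirrorTex));
DIRECTX.Context->CopyResource(backBuffer, mirrorTex);
mirrorTex->Release();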

Frame Rendering

Frame rendering typically involves several steps: obtaining predicted eye poses based on the headset tracking pose, rendering the view for each eye and, finally, submitting eye textures to the compositor through ovr_SubmitFrame. After the frame is submitted, the Oculus compositor handles distortion and presents it on the Rift.

Before rendering frames it is helpful to initialize some data structures that can be shared across frames. As an example, we query eye descriptors and initialize the layer structure outside of the rendering loop:

 // Initialize VR structures, filling out description.
ovrEyeRenderDesc eyeRenderDesc[2];
ovrPosef      hmdToEyeViewPose[2];
ovrHmdDesc hmdDesc = ovr_GetHmdDesc(session);
eyeRenderDesc[0] = ovr_GetRenderDesc(session, ovrEye_Left, hmdDesc.DefaultEyeFov[0]);
eyeRenderDesc[1] = ovr_GetRenderDesc(session, ovrEye_Right, hmdDesc.DefaultEyeFov[1]);
hmdToEyeViewPose[0] = eyeRenderDesc[0].HmdToEyePose;
hmdToEyeViewPose[1] = eyeRenderDesc[1].HmdToEyePose;

// Initialize our single full screen Fov layer.
ovrLayerEyeFov layer;
layer.Header.Type      = ovrLayerType_EyeFov;
layer.Header.Flags     = 0;
layer.ColorTexture[0]  = textureSwapChain;
layer.ColorTexture[1]  = textureSwapChain;
layer.Fov[0]           = eyeRenderDesc[0].Fov;
layer.Fov[1]           = eyeRenderDesc[1].Fov;
layer.Viewport[0]      = Recti(0, 0,                bufferSize.w / 2, bufferSize.h);
layer.Viewport[1]      = Recti(bufferSize.w / 2, 0, bufferSize.w / 2, bufferSize.h);
// ld.RenderPose and ld.SensorSampleTime are updated later per frame.

This code example first gets rendering descriptors for each eye, given the chosen FOV. The returned ovrEyeRenderDesc structure contains useful values for rendering, including the HmdToEyePose for each eye. Eye view offsets are used later to adjust for eye separation.

The code also initializes the ovrLayerEyeFov structure for a full screen layer. Starting with Oculus SDK 0.6, frame submission uses layers to composite multiple view images or texture quads on top of each other. This example uses a single layer to present a VR scene. For this purpose, we use ovrLayerEyeFov, which describes a dual-eye layer that covers the entire eye field of view. Since we are using the same texture set for both eyes, we initialize both eye color textures to textureSwapChain and configure viewports to draw to the left and right sides of this shared texture, respectively.

Note: Although it is often enough to initialize viewports once in the beginning, specifying them as a part of the layer structure that is submitted every frame allows applications to change render target size dynamically, if desired. This is useful for optimizing rendering performance.
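
As a small illustration of that note, an application could shrink the rendered region each frame while keeping the same textures; resScale here is a hypothetical dynamic-resolution factor chosen by the application:

// The compositor only samples the viewport submitted with the layer,
// so rendering into a smaller sub-rectangle lowers GPU cost.
float resScale = 0.8f;   // e.g. reduced when the GPU is overloaded
int w = (int)(bufferSize.w / 2 * resScale);
int h = (int)(bufferSize.h     * resScale);
layer.Viewport[0] = Recti(0,                0, w, h);
layer.Viewport[1] = Recti(bufferSize.w / 2, 0, w, h);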

After setup completes, the application can run the rendering loop. First, we need to get the eye poses to render the left and right views.

// Get both eye poses simultaneously, with IPD offset already included.
double displayMidpointSeconds = ovr_GetPredictedDisplayTime(session, 0);
ovrTrackingState hmdState = ovr_GetTrackingState(session, displayMidpointSeconds, ovrTrue);
ovr_CalcEyePoses(hmdState.HeadPose.ThePose, hmdToEyeViewPose, layer.RenderPose);

In VR, rendered eye views depend on the headset position and orientation in the physical space, tracked with the help of internal IMU and external sensors. Prediction is used to compensate for the latency in the system, giving the best estimate for where the headset will be when the frame is displayed on the headset. In the Oculus SDK, this tracked, predicted pose is reported by ovr_GetTrackingState.

To do accurate prediction, ovr_GetTrackingState needs to know when the current frame will actually be displayed. The code above calls ovr_GetPredictedDisplayTime to obtain displayMidpointSeconds for the current frame, using it to compute the best predicted tracking state. The head pose from the tracking state is then passed to ovr_CalcEyePoses to calculate the correct view poses for each eye. These poses are stored directly into the layer.RenderPose array. With eye poses ready, we can proceed to the actual frame rendering.

if (isVisible)
{
    // Get next available index of the texture swap chain
    int currentIndex = 0;
    ovr_GetTextureSwapChainCurrentIndex(session, textureSwapChain, &currentIndex);

    // Clear and set up render-target.
    DIRECTX.SetAndClearRenderTarget(pTexRtv[currentIndex], pEyeDepthBuffer);

    // Render Scene to Eye Buffers
    for (int eye = 0; eye < 2; eye++)
    {
        // Get view and projection matrices for the Rift camera
        Vector3f pos = originPos + originRot.Transform(layer.RenderPose[eye].Position);
        Matrix4f rot = originRot * Matrix4f(layer.RenderPose[eye].Orientation);

        Vector3f finalUp      = rot.Transform(Vector3f(0, 1, 0));
        Vector3f finalForward = rot.Transform(Vector3f(0, 0, -1));
        Matrix4f view         = Matrix4f::LookAtRH(pos, pos + finalForward, finalUp);

        Matrix4f proj = ovrMatrix4f_Projection(layer.Fov[eye], 0.2f, 1000.0f, 0);
        // Render the scene for this eye.
        DIRECTX.SetViewport(layer.Viewport[eye]);
        roomScene.Render(proj * view, 1, 1, 1, 1, true);
    }

    // Commit the changes to the texture swap chain
    ovr_CommitTextureSwapChain(session, textureSwapChain);
}

// Submit frame with one layer we have.
ovrLayerHeader* layers = &layer.Header;
ovrResult       result = ovr_SubmitFrame(session, 0, nullptr, &layers, 1);
isVisible = (result == ovrSuccess);

This code takes a number of steps to render the scene:

  • It sets the texture as the render target and clears it for rendering. In this case, the same texture is used for both eyes.
  • The code then computes view and projection matrices and sets the viewport before rendering the scene for each eye. In this example, the view calculation combines the original pose (the originPos and originRot values) with the new pose computed based on the tracking state and stored in the layer. These original values can be modified by input to move the player within the 3D world.
  • After texture rendering is complete, we call ovr_SubmitFrame to pass frame data to the compositor. From this point, the compositor takes over by accessing texture data through shared memory, distorting it, and presenting it on the Rift.

ovr_SubmitFrame returns once the submitted frame is queued up and the runtime is available to accept a new frame. When successful, its return value is either ovrSuccess or ovrSuccess_NotVisible.

ovrSuccess_NotVisible is returned if the frame wasn't actually displayed, which can happen when the VR application loses focus. Our sample code handles this case by updating the isVisible flag, which is checked by the rendering logic. While frames are not visible, rendering should be paused to eliminate unnecessary GPU load.

If you receive ovrError_DisplayLost, the device was removed and the session is invalid. Release the shared resources (ovr_DestroyTextureSwapChain), destroy the session (ovr_Destroy), recreate it (ovr_Create), and create new resources (ovr_CreateTextureSwapChainXXX). The application's existing private graphics resources do not need to be recreated unless the new ovr_Create call returns a different GraphicsLuid.
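
A sketch of that recovery path might look like the following; RecreateGraphicsDevice and CreateEyeTextureSwapChains are placeholders for the application's own teardown and setup code:

if (result == ovrError_DisplayLost)
{
    // Release SDK-owned resources, then the session itself.
    ovr_DestroyTextureSwapChain(session, textureSwapChain);
    ovr_Destroy(session);

    // Try to reacquire the headset and rebuild the shared resources.
    ovrGraphicsLuid newLuid;
    if (OVR_SUCCESS(ovr_Create(&session, &newLuid)))
    {
        // Recreate private graphics resources only if the adapter changed.
        if (memcmp(&newLuid, &luid, sizeof(newLuid)) != 0)
            RecreateGraphicsDevice(newLuid);          // placeholder, app-specific
        CreateEyeTextureSwapChains(session);          // placeholder: ovr_CreateTextureSwapChainXXX
    }
}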

Frame Timing

The Oculus SDK reports frame timing information through the ovr_GetPredictedDisplayTime function, relying on the application-provided frame index to ensure correct timing is reported across different threads.

Accurate frame and sensor timing are required for accurate head motion prediction, which is essential for a good VR experience. Prediction requires knowing exactly when in the future the current frame will appear on the screen. If we know both sensor and display scanout times, we can predict the future head pose and improve image stability. Computing these values incorrectly can lead to under- or over-prediction, degrading perceived latency and potentially causing overshoot "wobbles".

To ensure accurate timing, the Oculus SDK uses absolute system time, stored as a double, to represent sensor and frame timing values. The current absolute time is returned by ovr_GetTimeInSeconds. Current time should rarely be used, however, since simulation and motion prediction will produce better results when relying on the timing values returned by ovr_GetPredictedDisplayTime. This function has the following signature:

double ovr_GetPredictedDisplayTime(ovrSession session, long long frameIndex);

The frameIndex argument specifies which application frame we are rendering. Applications that make use of multi-threaded rendering must keep an internal frame index and manually increment it, passing it across threads along with frame data to ensure correct timing and prediction. The same frameIndex value must be passed to ovr_SubmitFrame as was used to obtain timing for the frame. The details of multi-threaded timing are covered in the next section, Rendering on Different Threads.

A special frameIndex value of 0 can be used in both functions to request that the SDK keep track of frame indices automatically. However, this only works when all frame timing requests and render submission are done on the same thread.
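
In the common single-threaded case, the explicit-index pattern looks roughly like this; frameIndex is a long long the application keeps across frames, and layer and hmdToEyeViewPose are the structures initialized in Frame Rendering above:

static long long frameIndex = 0;
++frameIndex;

// Predict when this frame will reach the display and sample tracking for that time.
double predictedTime = ovr_GetPredictedDisplayTime(session, frameIndex);
ovrTrackingState state = ovr_GetTrackingState(session, predictedTime, ovrTrue);
ovr_CalcEyePoses(state.HeadPose.ThePose, hmdToEyeViewPose, layer.RenderPose);

// ... render and commit the eye textures ...

// Submit with the same index that was used for the timing query.
ovrLayerHeader* layers = &layer.Header;
ovr_SubmitFrame(session, frameIndex, nullptr, &layers, 1);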

Rendering on Different Threads

In some engines, render processing is distributed across more than one thread.

For example, one thread may perform culling and render setup for each object in the scene (we'll call this the “main” thread), while a second thread makes the actual D3D or OpenGL API calls (we'll call this the “render” thread). Both of these threads may need accurate estimates of frame display time, so as to compute best possible predictions of head pose.

The asynchronous nature of this approach makes this challenging: while the render thread is rendering a frame, the main thread might be processing the next frame. This parallel frame processing may be out of sync by exactly one frame or a fraction of a frame, depending on game engine design. If we used the default global state to access frame timing, the result of ovr_GetPredictedDisplayTime could either be off by one frame depending on which thread the function is called from or, worse, could be randomly incorrect depending on how threads are scheduled. To address this issue, the previous section introduced the concept of a frameIndex that is tracked by the application and passed across threads along with frame data.

For the multi-threaded rendering result to be correct, the following must be true: (a) pose prediction, computed based on frame timing, must be consistent for the same frame regardless of which thread it is accessed from; and (b) eye poses that were actually used for rendering must be passed into ovr_SubmitFrame, along with the frame index.

Here is a summary of steps you can take to ensure this is the case:

  1. The main thread needs to assign a frame index to the current frame being processed for rendering. It increments this index each frame and passes it to ovr_GetPredictedDisplayTime to obtain the correct timing for pose prediction.
  2. The main thread should call the thread-safe function ovr_GetTrackingState with the predicted time value. It can also call ovr_CalcEyePoses if necessary for rendering setup.
  3. The main thread needs to pass the current frame index and eye poses to the render thread, along with any rendering commands or frame data it needs.
  4. When the rendering commands are executed on the render thread, make sure that the following hold:
    1. The actual poses used for frame rendering are stored into the RenderPose for the layer.
    2. The same value of frameIndex as was used on the main thread is passed into ovr_SubmitFrame.

The following code illustrates this in more detail:

void MainThreadProcessing()
{
    frameIndex++;
        
    // Ask the API for the time when this frame is expected to be displayed.
    double frameTiming = ovr_GetPredictedDisplayTime(session, frameIndex);

    // Get the corresponding predicted pose state.  
    ovrTrackingState state = ovr_GetTrackingState(session, frameTiming, ovrTrue);
    ovrPosef         eyePoses[2];
    ovr_CalcEyePoses(state.HeadPose.ThePose, hmdToEyeViewPose, eyePoses);

    SetFrameHMDData(frameIndex, eyePoses);

    // Do render pre-processing for this frame. 
    ...        
}

void RenderThreadProcessing()
{
    int      frameIndex;
    ovrPosef eyePoses[2];
    
    GetFrameHMDData(&frameIndex, eyePoses);
    layer.RenderPose[0] = eyePoses[0];
    layer.RenderPose[1] = eyePoses[1];
    
    // Execute actual rendering to eye textures.
    ...    
    
   // Submit frame with one layer we have.
   ovrLayerHeader* layers = &layer.Header;
   ovrResult       result = ovr_SubmitFrame(session, frameIndex, nullptr, &layers, 1);
}

The Oculus SDK also supports Direct3D 12, which allows submitting rendering work to the GPU from multiple CPU threads. When the application calls ovr_CreateTextureSwapChainDX, the Oculus SDK caches off the ID3D12CommandQueue provided by the caller for future usage. As the application calls ovr_SubmitFrame, the SDK drops a fence on the cached ID3D12CommandQueue to know exactly when a given set of eye-textures are ready for the SDK compositor.

For a given application, using a single ID3D12CommandQueue on a single thread is easiest. But an application might also split the CPU rendering workload for each eye-texture pair, or push non-eye-texture rendering work, such as shadows, reflection maps, and so on, onto different command queues. If the application populates and executes command lists from multiple threads, it must also make sure that the ID3D12CommandQueue provided to the SDK is the single join-node for the eye-texture rendering work executed through different command queues; a sketch of one way to do this follows.
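
One way to express that join is a plain D3D12 fence, sketched here under the assumption that shadowQueue performs auxiliary work, eyeCommandList is an ID3D12CommandList* recorded for the eye textures, and vrQueue is the ID3D12CommandQueue that was passed to ovr_CreateTextureSwapChainDX:

// Signal completion of the auxiliary work on its own queue...
shadowQueue->Signal(fence, ++fenceValue);

// ...and make the VR queue wait for it before the eye-texture command lists run,
// so the queue handed to the SDK really is the single join-node.
vrQueue->Wait(fence, fenceValue);
vrQueue->ExecuteCommandLists(1, &eyeCommandList);

// ovr_SubmitFrame can now safely fence on vrQueue.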

Layers

Similar to the way a monitor view can be composed of multiple windows, the display on the headset can be composed of multiple layers. Typically at least one of these layers will be a view rendered from the user's virtual eyeballs, but other layers may be HUD layers, information panels, text labels attached to items in the world, aiming reticles, and so on.

Each layer can have a different resolution, can use a different texture format, can use a different field of view or size, and might be in mono or stereo. The application can also be configured to not update a layer's texture if the information in it has not changed. For example, it might not update if the text in an information panel has not changed since last frame or if the layer is a picture-in-picture view of a video stream with a low framerate. Applications can supply mipmapped textures to a layer and, together with a high-quality distortion mode, this is very effective at improving the readability of text panels.
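
For example, a HUD layer whose contents rarely change only needs to be re-rendered and committed when they do change, while the layer itself is still submitted every frame; hudTextChanged, RenderHudToSwapChain, and the HUD names below are hypothetical:

if (hudTextChanged)
{
    // Re-render the HUD into the current chain texture, then commit it.
    RenderHudToSwapChain(session, hudTextureSwapChain);   // placeholder, app-specific
    ovr_CommitTextureSwapChain(session, hudTextureSwapChain);
    hudTextChanged = false;
}

// The layer still references the chain every frame; the compositor keeps
// showing the most recently committed texture.
hudLayer.ColorTexture = hudTextureSwapChain;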

Every frame, all active layers are composited from back to front using pre-multiplied alpha blending. Layer 0 is the furthest layer, layer 1 is on top of it, and so on; there is no depth-buffer intersection testing of layers, even if a depth-buffer is supplied.

A powerful feature of layers is that each can be a different resolution. This allows an application to scale to lower performance systems by dropping resolution on the main eye-buffer render that shows the virtual world, but keeping essential information, such as text or a map, in a different layer at a higher resolution.

There are several layer types available:

  • EyeFov: The standard "eye buffer" familiar from previous SDKs, which is typically a stereo view of a virtual scene rendered from the position of the user's eyes. Although eye buffers can be mono, this can cause discomfort. Previous SDKs had an implicit field of view (FOV) and viewport; these are now supplied explicitly and the application can change them every frame, if desired.
  • Quad: A monoscopic image that is displayed as a rectangle at a given pose and size in the virtual world. This is useful for heads-up-displays, text information, object labels and so on. By default the pose is specified relative to the user's real-world space and the quad will remain fixed in space rather than moving with the user's head or body motion. For head-locked quads, use the ovrLayerFlag_HeadLocked flag as described below.
  • EyeMatrix: The EyeMatrix layer type is similar to the EyeFov layer type and is provided to assist compatibility with Gear VR applications. For more information, refer to the Mobile SDK documentation.
  • Disabled: Ignored by the compositor, disabled layers do not cost performance. We recommend that applications perform basic frustum-culling and disable layers that are out of view. However, there is no need for the application to repack the list of active layers tightly together when turning one layer off; disabling it and leaving it in the list is sufficient. Equivalently, the pointer to the layer in the list can be set to null.

Each layer style has a corresponding member of the ovrLayerType enum, and an associated structure holding the data required to display that layer. For example, the EyeFov layer is type number ovrLayerType_EyeFov and is described by the data in the structure ovrLayerEyeFov. These structures share a similar set of parameters, though not all layer types require all parameters:

  • Header.Type (ovrLayerType): Must be set by all layers to specify what type they are.
  • Header.Flags (bitfield of ovrLayerFlags): See below for more information.
  • ColorTexture (ovrTextureSwapChain): Provides color and translucency data for the layer. Layers are blended over one another using premultiplied alpha. This allows them to express either lerp-style blending, additive blending, or a combination of the two. Layer textures must be RGBA or BGRA formats and might have mipmaps, but cannot be arrays, cubes, or have MSAA. If the application desires to do MSAA rendering, then it must resolve the intermediate MSAA color texture into the layer's non-MSAA ColorTexture.
  • Viewport (ovrRecti): The rectangle of the texture that is actually used, specified in 0-1 texture "UV" coordinate space (not pixels). In theory, texture data outside this region is not visible in the layer. However, the usual caveats about texture sampling apply, especially with mipmapped textures. It is good practice to leave a border of RGBA(0,0,0,0) pixels around the displayed region to avoid "bleeding," especially between two eye buffers packed side by side into the same texture. The size of the border depends on the exact usage case, but around 8 pixels seems to work well in most cases.
  • Fov (ovrFovPort): The field of view used to render the scene in an Eye layer type. Note this does not control the HMD's display, it simply tells the compositor what FOV was used to render the texture data in the layer - the compositor will then adjust appropriately to whatever the actual user's FOV is. Applications may change FOV dynamically for special effects. Reducing FOV may also help with performance on slower machines, though typically it is more effective to reduce resolution before reducing FOV.
  • RenderPose (ovrPosef): The camera pose the application used to render the scene in an Eye layer type. This is typically predicted by the SDK and application using the ovr_GetTrackingState and ovr_CalcEyePoses functions. The difference between this pose and the actual pose of the eye at display time is used by the compositor to apply timewarp to the layer.
  • SensorSampleTime (double): The absolute time when the application sampled the tracking state. The typical way to acquire this value is to have an ovr_GetTimeInSeconds call right next to the ovr_GetTrackingState call. The SDK uses this value to report the application's motion-to-photon latency in the Performance HUD. If the application has more than one ovrLayerType_EyeFov layer submitted at any given frame, the SDK scrubs through those layers and selects the timing with the lowest latency. In a given frame, if no ovrLayerType_EyeFov layers are submitted, the SDK will use the point in time when ovr_GetTrackingState was called with latencyMarker set to ovrTrue as the substitute application motion-to-photon latency time.
  • QuadPoseCenter (ovrPosef): Specifies the orientation and position of the center point of a Quad layer type. The supplied direction is the vector perpendicular to the quad. The position is in real-world meters (not the application's virtual world, the actual world the user is in) and is relative to the "zero" position set by ovr_RecenterTrackingOrigin or ovr_SpecifyTrackingOrigin unless the ovrLayerFlag_HeadLocked flag is used.
  • QuadSize (ovrVector2f): Specifies the width and height of a Quad layer type. As with position, this is in real-world meters.

Layers that take stereo information (all those except Quad layer types) take two sets of most parameters, and these can be used in three different ways:

  • Stereo data, separate textures—the app supplies a different ovrTextureSwapChain for the left and right eyes, and a viewport for each.
  • Stereo data, shared texture—the app supplies the same ovrTextureSwapChain for both left and right eyes, but a different viewport for each. This allows the application to render both left and right views to the same texture buffer. Remember to add a small buffer between the two views to prevent "bleeding", as discussed above.
  • Mono data—the app supplies the same ovrTextureSwapChain for both left and right eyes, and the same viewport for each.

Texture and viewport sizes may be different for the left and right eyes, and each can even have different fields of view. However beware of causing stereo disparity and discomfort in your users.

The Header.Flags field available for all layers is a logical-or of the following:

  • ovrLayerFlag_HighQuality—enables 4x anisotropic sampling in the compositor for this layer. This can provide a significant increase in legibility, especially when used with a texture containing mipmaps; this is recommended for high-frequency images such as text or diagrams and when used with the Quad layer types. For Eye layer types, it will also increase visual fidelity towards the periphery, or when feeding in textures that have more than the 1:1 recommended pixel density. For best results, when creating mipmaps for the textures associated with the particular layer, make sure the texture sizes are a power of 2. However, the application does not need to render to the whole texture; a viewport that renders to the recommended size in the texture will provide the best performance-to-quality ratios.
  • ovrLayerFlag_TextureOriginAtBottomLeft—the origin of a layer's texture is assumed to be at the top-left corner. However, some engines (particularly those using OpenGL) prefer to use the bottom-left corner as the origin, and they should use this flag.
  • ovrLayerFlag_HeadLocked—Most layer types have their pose orientation and position specified relative to the "zero position" defined by calling ovr_RecenterTrackingOrigin. However the app may wish to specify a layer's pose relative to the user's face. When the user moves their head, the layer follows. This is useful for reticles used in gaze-based aiming or selection. This flag may be used for all layer types, though it has no effect when used on the Direct type.

At the end of each frame, after rendering to whichever ovrTextureSwapChain the application wants to update and calling ovr_CommitTextureSwapChain, the data for each layer is put into the relevant ovrLayerEyeFov / ovrLayerQuad / ovrLayerDirect structure. The application then creates a list of pointers to those layer structures, specifically to the Header field, which is guaranteed to be the first member of each structure. Then the application builds an ovrViewScaleDesc struct with the required data, and calls the ovr_SubmitFrame function.

// Create eye layer.
ovrLayerEyeFov eyeLayer;
eyeLayer.Header.Type    = ovrLayerType_EyeFov;
eyeLayer.Header.Flags   = 0;
for ( int eye = 0; eye < 2; eye++ )
{
	eyeLayer.ColorTexture[eye] = EyeBufferSet[eye];
	eyeLayer.Viewport[eye]     = EyeViewport[eye];
	eyeLayer.Fov[eye]          = EyeFov[eye];
	eyeLayer.RenderPose[eye]   = EyePose[eye];
}

// Create HUD layer, fixed to the player's torso
ovrLayerQuad hudLayer;
hudLayer.Header.Type    = ovrLayerType_Quad;
hudLayer.Header.Flags   = ovrLayerFlag_HighQuality;
hudLayer.ColorTexture   = TheHudTextureSwapChain;
// 50cm in front and 20cm down from the player's nose,
// fixed relative to their torso.
hudLayer.QuadPoseCenter.Position.x =  0.00f;
hudLayer.QuadPoseCenter.Position.y = -0.20f;
hudLayer.QuadPoseCenter.Position.z = -0.50f;
hudLayer.QuadPoseCenter.Orientation.x = 0;
hudLayer.QuadPoseCenter.Orientation.y = 0;
hudLayer.QuadPoseCenter.Orientation.z = 0;
hudLayer.QuadPoseCenter.Orientation.w = 1;
// HUD is 50cm wide, 30cm tall.
hudLayer.QuadSize.x = 0.50f;
hudLayer.QuadSize.y = 0.30f;
// Display all of the HUD texture.
hudLayer.Viewport.Pos.x = 0.0f;
hudLayer.Viewport.Pos.y = 0.0f;
hudLayer.Viewport.Size.w = 1.0f;
hudLayer.Viewport.Size.h = 1.0f;

// The list of layers.
ovrLayerHeader *layerList[2];
layerList[0] = &eyeLayer.Header;
layerList[1] = &hudLayer.Header;

// Set up positional data.
ovrViewScaleDesc viewScaleDesc;
viewScaleDesc.HmdSpaceToWorldScaleInMeters = 1.0f;
viewScaleDesc.HmdToEyePose[0] = HmdToEyePose[0];
viewScaleDesc.HmdToEyePose[1] = HmdToEyePose[1];

ovrResult result = ovr_SubmitFrame(Session, 0, &viewScaleDesc, layerList, 2);

The compositor performs timewarp, distortion, and chromatic aberration correction on each layer separately before blending them together. The traditional method of rendering a quad to the eye buffer involves two filtering steps (once to the eye buffer, then once during distortion). Using layers, there is only a single filtering step between the layer image and the final framebuffer. This can provide a substantial improvement in text quality, especially when combined with mipmaps and the ovrLayerFlag_HighQuality flag.

One current disadvantage of layers is that no post-processing can be performed on the final composited image, such as soft-focus effects, light-bloom effects, or the Z intersection of layer data. Some of these effects can be performed on the contents of the layer with similar visual results.

Calling ovr_SubmitFrame queues the layers for display, and transfers control of the committed textures inside the ovrTextureSwapChains to the compositor. It is important to understand that these textures are being shared (rather than copied) between the application and the compositor threads, and that composition does not necessarily happen at the time ovr_SubmitFrame is called, so care must be taken. To continue rendering into a texture swap chain the application should always get the next available index with ovr_GetTextureSwapChainCurrentIndex before rendering into it. For example:

// Create two TextureSwapChains to illustrate.
ovrTextureSwapChain eyeTextureSwapChain;
ovr_CreateTextureSwapChainDX ( ... &eyeTextureSwapChain );
ovrTextureSwapChain hudTextureSwapChain;
ovr_CreateTextureSwapChainDX ( ... &hudTextureSwapChain );

// Set up two layers.
ovrLayerEyeFov eyeLayer;
ovrLayerEyeFov hudLayer;
eyeLayer.Header.Type = ovrLayerType_EyeFov;
eyeLayer...etc... // set up the rest of the data.
hudLayer.Header.Type = ovrLayerType_Quad;
hudLayer...etc... // set up the rest of the data.

// the list of layers
ovrLayerHeader *layerList[2];
layerList[0] = &eyeLayer.Header;
layerList[1] = &hudLayer.Header;

// Each frame...
int currentIndex = 0;
ovr_GetTextureSwapChainCurrentIndex(... eyeTextureSwapChain, &currentIndex);
// Render into it. It is recommended the app use ovr_GetTextureSwapChainBufferDX for each index on texture chain creation to cache 
// textures or create matching render target views. Each frame, the currentIndex value returned can be used to index directly into that.
ovr_CommitTextureSwapChain(... eyeTextureSwapChain);

ovr_GetTextureSwapChainCurrentIndex(... hudTextureSwapChain, &currentIndex);
// Render into it. It is recommended the app use ovr_GetTextureSwapChainBufferDX for each index on texture chain creation to cache 
// textures or create matching render target views. Each frame, the currentIndex value returned can be used to index directly into that.
ovr_CommitTextureSwapChain(... hudTextureSwapChain);

eyeLayer.ColorTexture[0] = eyeTextureSwapChain;
eyeLayer.ColorTexture[1] = eyeTextureSwapChain;
hudLayer.ColorTexture = hudTextureSwapChain;
ovr_SubmitFrame(Session, 0, nullptr, layerList, 2);

Working with HMD Eye Poses

In the Oculus PC SDK, prior to version 1.17, eye poses only had three degrees of freedom (DOF), i.e. translation only. Eye poses were specified in the HmdToEyeOffset vector provided by the ovr_GetRenderDesc function. Starting with version 1.17, HmdToEyeOffset has been renamed to HmdToEyePose and uses the type ovrPosef, which contains a Position and Orientation, effectively giving eye poses six degrees of freedom. This means that each eye's render frustum can now be rotated away from the HMD's orientation, in addition to being translated by the SDK. Because of this, the eye frustums' axes are no longer guaranteed to be parallel to each other or to the HMD's orientation axes. This generalization provides greater freedom to the SDK in defining the HMD geometry. But it also means that, as a VR app developer, you need to be more careful about your previous assumptions, especially when it comes to rendering.

Here are some pointers to make sure your VR app is correctly using HmdToEyePose:

  • If your VR app needs the vector translation value of the (pre-version 1.17) HmdToEyeOffset, you can use HmdToEyePose.Position instead. However, unless you are absolutely sure about what you are doing, there is a good chance you actually want to treat HmdToEyePose as a whole transform, rather than separate out Orientation from Position.
  • Prior to PC-SDK version 1.17, rendering a 2D quad flat across the screen (e.g. a rectangle) would have always been acceptable. But with the possibility of rotating each eye frustum independently, your VR app will need to incorporate each eye’s orientation into the transformation of the quad so that the quad is rendered with the correct perspective in each eye. Here is a (somewhat exaggerated) example:

    The idea is to orient the quad such that it appears to use either the “center-eye” orientation, or the HMD orientation. This also applies to other screen-aligned quads, such as 3D splash screens, or particle effects such as large flip-book smoke quads. Except for particles, avoid rendering such quads natively; prefer using ovrLayerQuad instead.
  • Some VR apps generate a single monoscopic camera frustum from the ovrFovPort structures of both eyes in order to take advantage of various rendering optimizations. This is normally done by building an ovrFovPort that takes the maximum of the two eyes' values on each of the four sides of the frustum. Before generating the monoscopic frustum this way, however, be sure to remove any potential rotation from the ovrFovPort values by calling FovPort::Uncant, which is located in the ovr_math.h header. See the OculusWorldDemo sample code for an example of using FovPort::Uncant; a brief sketch also follows this list.
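
A rough sketch of that combination is shown below; the FovPort and Quatf helpers come from ovr_math.h, and the exact FovPort::Uncant signature used here is an assumption to verify against that header:

// Remove per-eye canting first, using each eye's orientation from HmdToEyePose.
FovPort leftFov  = FovPort::Uncant(FovPort(eyeRenderDesc[0].Fov),
                                   Quatf(eyeRenderDesc[0].HmdToEyePose.Orientation));
FovPort rightFov = FovPort::Uncant(FovPort(eyeRenderDesc[1].Fov),
                                   Quatf(eyeRenderDesc[1].HmdToEyePose.Orientation));

// Build a single monoscopic frustum wide enough for both eyes.
FovPort monoFov;
monoFov.UpTan    = std::max(leftFov.UpTan,    rightFov.UpTan);
monoFov.DownTan  = std::max(leftFov.DownTan,  rightFov.DownTan);
monoFov.LeftTan  = std::max(leftFov.LeftTan,  rightFov.LeftTan);
monoFov.RightTan = std::max(leftFov.RightTan, rightFov.RightTan);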

Asynchronous TimeWarp

Asynchronous TimeWarp (ATW) is a technique for reducing latency and judder in VR applications and experiences.

In a basic VR game loop, the following occurs:

  1. The software requests your head position.
  2. The CPU processes the scene for each eye.
  3. The GPU renders the scenes.
  4. The Oculus Compositor applies distortion and displays the scenes on the headset.

The following shows a basic example of a game loop:

Basic Game Loop

When the frame rate is maintained, the experience feels real and is enjoyable. When a frame is not completed in time, the previous frame is shown instead, which can be disorienting. The following graphic shows an example of judder during the basic game loop:

Basic Game Loop with Judder

When you move your head and the world doesn’t keep up, this can be jarring and break immersion.

ATW is a technique that shifts the rendered image slightly to adjust for head movement that occurs after the frame was rendered. Because only a short time passes between rendering and display, the head moves only slightly, so the required shift is small.

Additionally, ATW can help fix "potholes", or moments when the frame rate unexpectedly drops because of issues with the user's computer, game design, or the operating system.

The following graphic shows an example of frame drops when ATW is applied:

Game Loop with ATW

At each refresh interval, the Compositor applies TimeWarp to the last rendered frame. As a result, a TimeWarped frame will always be shown to the user, regardless of frame rate. If the frame rate is very bad, flicker will be noticeable at the periphery of the display, but the image will remain stable.

ATW is automatically applied by the Oculus Compositor; you do not need to enable or tune it. However, although ATW reduces latency, make sure that your application or experience maintains frame rate.

Adaptive Queue Ahead

To improve CPU and GPU parallelism and increase the amount of time that the GPU has to process a frame, the SDK automatically applies queue ahead of up to one frame.

Without queue ahead, the CPU begins processing the next frame immediately after the previous frame displays. After the CPU finishes, the GPU processes the frame, the compositor applies distortion, and the frame is displayed to the user. The following graphic shows CPU and GPU utilization without queue ahead:

CPU and GPU Utilization without Queue Ahead

If the GPU cannot process the frame in time for display, the previous frame displays. This results in judder.

With queue ahead, the CPU can start earlier; this provides the GPU more time to process the frame. The following graphic shows CPU and GPU utilization with queue ahead:

CPU and GPU Utilization with Queue Ahead