Mixed Reality Capture

Mixed reality capture places real-world people and objects in VR. This guide will review how to add support for mixed reality capture in your native Rift app.

Retrieving the Camera Details

When running an app that supports mixed reality capture, there are a couple of inputs that your app needs to use. Prior to launching your app, each user will run the CameraTool introduced on the Camera Calibration page in the user guide. This loads details about the camera into the system so that your app can retrieve them. Users won’t have to perform the calibration each time, as they can use the tool to load a previously saved file into the system.

  1. Camera intrinsics are the attributes of the camera itself: resolution, field of view, exposure frequency, and so on. Details of the camera intrinsics are sent to, and retrieved from, the SDK as part of ovrCameraIntrinsics. We do not expect the camera intrinsics to change frequently.
  2. Camera extrinsics are everything external to the camera: relative pose, latency to the system, the tracked device (VR Object) the camera is attached to, and so on. We expect the camera extrinsics to change frequently.
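
For illustration, here is a minimal sketch of reading a few of these fields from an ovrExternalCamera that has already been retrieved from the SDK (as shown in the implementation steps below). The PrintExternalCameraInfo helper is a hypothetical name used only for this example.

    #include <cstdio>
    #include "OVR_CAPI.h"   // ovrExternalCamera, ovrCameraIntrinsics, ovrCameraExtrinsics

    // Hypothetical helper: print a few intrinsic and extrinsic fields of an
    // external camera that has already been retrieved via ovr_GetExternalCameras.
    void PrintExternalCameraInfo(const ovrExternalCamera& camera)
    {
        const ovrCameraIntrinsics& intrinsics = camera.Intrinsics;
        const ovrCameraExtrinsics& extrinsics = camera.Extrinsics;

        printf("Sensor resolution:      %d x %d\n",
               intrinsics.ImageSensorPixelResolution.w,
               intrinsics.ImageSensorPixelResolution.h);
        printf("Near/far planes (m):    %.2f / %.2f\n",
               intrinsics.VirtualNearPlaneDistanceMeters,
               intrinsics.VirtualFarPlaneDistanceMeters);
        printf("Attached to device:     %d\n", (int)extrinsics.AttachedToDevice);
        printf("Additional latency (s): %.3f\n", extrinsics.AdditionalLatencySeconds);
    }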

Implementing Mixed Reality Capture

This guide assumes that you have previously implemented the Oculus SDK and have a functioning Rift application.

  1. The first step is to define an additional camera perspective, from a third-person view, that can be moved around the scene.
  2. In your app, pair that perspective with the external camera configured in the CameraTool. Call ovr_GetExternalCameras to retrieve a list of external cameras.
  3. Retrieve the properties of that camera from the pointer provided by the ovr_GetExternalCameras response. This gives you the following information.
    1. Intrinsics
      • LastChangedTime - Time in seconds from last change to the parameters.
      • FOVPort - Angles of all 4 sides of the viewport.
      • VirtualNearPlaneDistanceMeters - Distance, in virtual meters, of the near plane clipping distance. Your app will determine this distance.
      • VirtualFarPlaneDistanceMeters - Distance, in virtual meters, of the far plane clipping distance. Your app will determine this distance.
      • ImageSensorPixelResolution - Height and width, in pixels, of the image sensor.
      • LensDistortionMatrix - The lens distortion of the camera.
      • ExposurePeriodSeconds - How frequently, in seconds, the exposure is taken. This value is not provided by the CameraTool. You should request this information in your app from the user.
      • ExposureDurationSeconds - Length of the exposure time. This value is not provided by the CameraTool. You should request this information in your app from the user.
    2. Extrinsics
      • CameraStatusFlags - Current status of the camera, a mix of bits from ovrCameraStatusFlags.
      • AttachedToDevice - The tracked device, if any, to which the camera is rigidly attached. If set to ovrTrackedDevice_None, the camera is not attached to a tracked object. If the external camera moves while unattached (i.e. set to ovrTrackedDevice_None), its pose won't be updated.
      • RelativePose - The relative pose of the external camera. If AttachedToDevice is ovrTrackedDevice_None, this is an absolute pose in tracking space.
      • LastExposureTimeSeconds - The time, in seconds, when the last successful exposure was taken.
      • ExposureLatencySeconds - Estimated exposure latency to get from the exposure time to the system.
      • AdditionalLatencySeconds - Additional latency to get from the exposure time of the real camera to match the render time of the virtual camera.
  4. Now you have the external camera with the initial pose, the tracked VR Object that will give you the transform and pose in the scene, and the details about the camera that is capturing the real-world portion of the scene. The external camera has also been associated with the in-app camera.

    The following example from our Oculus World Demo demonstrates the process of retrieving an external camera and using the camera intrinsics to set the window/mirror size.

    bool OculusWorldDemoApp::SetupMixedReality()
    {
        ovrResult error = ovr_GetExternalCameras(&ExternalCameras[0], &NumberOfCameras);
        
        if (!OVR_SUCCESS(error))
        {
            DisplayLastErrorMessageBox("ovr_GetExternalCameras failure.");
            return false;
        }
        CurrentCameraID = 0;
        // We use 0 as the default camera ID. If more than one camera is connected, 
        // you'll need to find the camera ID based on the camera name string provided during calibration in the CameraTool.
    
        Sizei tempWindowSize;
        tempWindowSize.w = WindowSize.w + 2 * ExternalCameras[CurrentCameraID].Intrinsics.ImageSensorPixelResolution.w;
        tempWindowSize.h = std::max(WindowSize.h, ExternalCameras[CurrentCameraID].Intrinsics.ImageSensorPixelResolution.h);
    
        RenderParams.Resolution = tempWindowSize;
        if (pRender != nullptr)
        {
            pRender->SetWindowSize(tempWindowSize.w, tempWindowSize.h);
            pRender->SetParams(RenderParams);
        }
        pPlatform->SetWindowSize(tempWindowSize.w, tempWindowSize.h);   // resize the window
        
        NearRenderViewport = Recti(Vector2i(WindowSize.w, 0), ExternalCameras[CurrentCameraID].Intrinsics.ImageSensorPixelResolution);
        FarRenderViewport = Recti(Vector2i(ExternalCameras[CurrentCameraID].Intrinsics.ImageSensorPixelResolution.w + WindowSize.w, 0),
            ExternalCameras[CurrentCameraID].Intrinsics.ImageSensorPixelResolution);
         
         return true;
    }
  5. While rendering the scene, you’ll want to render every frame or match the frame rate of the external camera. At the same time, retrieve the pose of the camera by calling ovr_GetDevicePoses, which gives you the transform from the original pose. If the camera is attached to ovrTrackedDevice_None, the relative pose is actually an absolute pose.
  6. Save the rendered images and transform data, together with a timestamp, to a temporary buffer. You’ll use that timestamp to pair the rendered scene images with the external camera images. Images from the in-app scene are later retrieved from the temporary buffer after the system latency, gathered from the external camera extrinsics, has passed. A minimal sketch of such a buffer follows the rendering example below.

    The following example from the Oculus World Demo app demonstrates rendering the MR scene.

    void OculusWorldDemoApp::RenderCamNearFarView()
    {
        Posef tempPose;
        tempPose.SetIdentity();     // fixed camera pose
        if (ExternalCameras[CurrentCameraID].Extrinsics.AttachedToDevice == ovrTrackedDevice_LTouch)
            tempPose = HandPoses[0];
        else if (ExternalCameras[CurrentCameraID].Extrinsics.AttachedToDevice == ovrTrackedDevice_RTouch)
            tempPose = HandPoses[1];
        else if (ExternalCameras[CurrentCameraID].Extrinsics.AttachedToDevice == ovrTrackedDevice_Object0)
            tempPose = TrackedObjectPose;
    
        Posef CamPose = tempPose * Posef(ExternalCameras[CurrentCameraID].Extrinsics.RelativePose);
    
        Posef CamPosePlayer = ThePlayer.VirtualWorldTransformfromRealPose(CamPose, TrackingOriginType);
        Vector3f up = CamPosePlayer.Rotation.Rotate(UpVector);
        Vector3f forward = CamPosePlayer.Rotation.Rotate(ForwardVector);
        Vector3f dif = ThePlayer.GetHeadPosition(TrackingOriginType) - CamPosePlayer.Translation;
        float bodyDistance = forward.Dot(dif);
    
        bool flipZ = DepthModifier != NearLessThanFar;
        bool farAtInfinity = DepthModifier == FarLessThanNearAndInfiniteFarClip;
        unsigned int projectionModifier = ovrProjection_None;
        projectionModifier |= (RenderParams.RenderAPI == RenderAPI_OpenGL) ? ovrProjection_ClipRangeOpenGL : 0;
        projectionModifier |= flipZ ? ovrProjection_FarLessThanNear : 0;
        projectionModifier |= farAtInfinity ? ovrProjection_FarClipAtInfinity : 0;
    
    
        ViewFromWorld[2] = Matrix4f::LookAtRH(CamPosePlayer.Translation, CamPosePlayer.Translation + forward, up);
    
        // near view
        CamProjection = ovrMatrix4f_Projection(ExternalCameras[CurrentCameraID].Intrinsics.FOVPort,
            ExternalCameras[CurrentCameraID].Intrinsics.VirtualNearPlaneDistanceMeters,
            bodyDistance,
            projectionModifier);
        pRender->ApplyStereoParams(NearRenderViewport, CamProjection);
        pRender->SetDepthMode(true, true, (DepthModifier == NearLessThanFar ?
            RenderDevice::Compare_Less :
            RenderDevice::Compare_Greater));
    
        if ((GridDisplayMode != GridDisplay_GridOnly) && (GridDisplayMode != GridDisplay_GridDirect))
        {
            if (SceneMode != Scene_OculusCubes && SceneMode != Scene_DistortTune)
            {
                MainScene.Render(pRender, ViewFromWorld[2]);
                RenderControllers(ovrEye_Count);   // 2 : from the external camera
            }
        }
    
        // far view
        CamProjection = ovrMatrix4f_Projection(ExternalCameras[CurrentCameraID].Intrinsics.FOVPort,
            bodyDistance,
            ExternalCameras[CurrentCameraID].Intrinsics.VirtualFarPlaneDistanceMeters,
            projectionModifier);
        pRender->ApplyStereoParams(FarRenderViewport, CamProjection);
        pRender->SetDepthMode(true, true, (DepthModifier == NearLessThanFar ?
            RenderDevice::Compare_Less :
            RenderDevice::Compare_Greater));
    
        if ((GridDisplayMode != GridDisplay_GridOnly) && (GridDisplayMode != GridDisplay_GridDirect))
        {
            if (SceneMode != Scene_OculusCubes && SceneMode != Scene_DistortTune)
            {
                MainScene.Render(pRender, ViewFromWorld[2]);
                RenderControllers(ovrEye_Count);   // 2 : from the external camera
            }
        }
    }
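
    The demo's rendering code above assumes the device and object poses (HandPoses, TrackedObjectPose) are already available. As a rough sketch of steps 5 and 6, the following code queries the pose of the device the camera is attached to with ovr_GetDevicePoses and stores the rendered frame in a timestamped buffer; RenderedMrFrame, MrFrameBuffer, and CaptureMrFrame are hypothetical names, not part of the SDK or the demo.

    #include <deque>
    #include "OVR_CAPI.h"

    // Hypothetical record for one rendered MR frame awaiting composition.
    struct RenderedMrFrame
    {
        double   RenderTimeSeconds;   // timestamp used to pair with a camera image later
        ovrPosef DevicePose;          // pose of the device the camera is attached to
        // ... handles to the rendered near/far images would go here ...
    };

    static std::deque<RenderedMrFrame> MrFrameBuffer;   // temporary buffer (hypothetical)

    // Sketch: query the pose of the tracked device the camera is attached to,
    // then record the frame with its timestamp for later pairing.
    void CaptureMrFrame(ovrSession session, const ovrExternalCamera& camera)
    {
        ovrPosef devicePose = {};          // identity pose if the camera is unattached
        devicePose.Orientation.w = 1.0f;

        if (camera.Extrinsics.AttachedToDevice != ovrTrackedDevice_None)
        {
            ovrTrackedDeviceType device = camera.Extrinsics.AttachedToDevice;
            ovrPoseStatef poseState;
            if (OVR_SUCCESS(ovr_GetDevicePoses(session, &device, 1,
                                               ovr_GetTimeInSeconds(), &poseState)))
            {
                devicePose = poseState.ThePose;
            }
        }
        // If AttachedToDevice is ovrTrackedDevice_None, Extrinsics.RelativePose is
        // already an absolute pose in tracking space and no device pose is needed.

        RenderedMrFrame frame;
        frame.RenderTimeSeconds = ovr_GetTimeInSeconds();
        frame.DevicePose = devicePose;     // combine with Extrinsics.RelativePose as in RenderCamNearFarView
        MrFrameBuffer.push_back(frame);
    }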
  7. The final step is to combine the images. This process will vary depending on how your app is going to handle composition of the final scene.
    1. Direct Composition - If you’re handling composition in your app, it needs to capture the external camera images, apply a chroma key to clip the greenscreen, and match the rendered scene with the clipped external camera image. To pair them, take the timestamp of the rendered scene in the buffer, add the AdditionalLatencySeconds retrieved from the camera extrinsics, and apply the result to the external camera image whose timestamp most closely matches the calculated value (a sketch of this matching appears after this list).
    2. External Composition - If you’re not doing your own composition and are planning to have users run an external program, you’ll want to delay passing the rendered scene from the buffer by AdditionalLatencySeconds so the images are properly aligned. The external program handles the capture of the external camera, greenscreen clipping, and production of the final MR scene. Details about this process can be found in the Mixed Reality Capture Setup Guide.
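
    As an illustration of the timestamp matching described for direct composition, the sketch below selects the buffered rendered frame whose timestamp, shifted by AdditionalLatencySeconds, most closely matches the capture time of an external camera image. FindBufferedFrame is a hypothetical helper, and RenderedMrFrame is the hypothetical type carried over from the buffering sketch above.

    #include <cmath>
    #include <deque>
    #include <limits>

    // Minimal stand-in for the buffered frame from the earlier sketch.
    struct RenderedMrFrame
    {
        double RenderTimeSeconds;
        // ... rendered images and pose ...
    };

    // Hypothetical helper: find the rendered frame whose timestamp, plus the
    // AdditionalLatencySeconds reported in the camera extrinsics, is closest to
    // the capture time of the external camera image being composited.
    const RenderedMrFrame* FindBufferedFrame(const std::deque<RenderedMrFrame>& buffer,
                                             double cameraImageTimeSeconds,
                                             double additionalLatencySeconds)
    {
        const RenderedMrFrame* bestFrame = nullptr;
        double bestDelta = std::numeric_limits<double>::max();

        for (const RenderedMrFrame& frame : buffer)
        {
            double delta = std::fabs((frame.RenderTimeSeconds + additionalLatencySeconds)
                                     - cameraImageTimeSeconds);
            if (delta < bestDelta)
            {
                bestDelta = delta;
                bestFrame = &frame;
            }
        }
        return bestFrame;   // composite this frame with the chroma-keyed camera image
    }

    For external composition, no matching is needed in the app; the buffered frames are simply released after a delay of AdditionalLatencySeconds, as described above.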

All sample code provided on this page is covered under the Oculus Examples License.