Expressive Features for Avatars

Expressive features give Oculus Avatars more advanced facial geometry, allowing for realistic and nuanced animation of facial behaviors. They increase social presence and make interactions feel more natural and dynamic. Expressive features consist of the following:

  • Realistic lip-syncing powered by Oculus Lipsync technology.
  • Natural eye behavior, including gaze dynamics and blinking.
  • Ambient facial micro-expressions when an avatar is not speaking.

Using Oculus Avatars with Expressive Features

To add an avatar with expressive features to a Unity project, follow these steps:

  1. Open Unity 2017.4.11f1 or later and create a new project.
  2. Go to the Unity Asset Store and search for the “Oculus Integration” package. Install the package and update OVRPlugin if necessary. Allow Unity to restart if requested.
  3. On the menu bar, go to Oculus > Avatars > Edit Settings and add a valid app ID for your target device. Your app ID can be found on the API page of the Oculus Dashboard.
  4. On the menu bar, go to Oculus > Platform > Edit Settings and add a valid app ID for your target device. Your app ID can be found on the API page of the Oculus Dashboard.
  5. Add an avatar to the scene. In the Inspector, make sure the Enable Expressive check box is checked on the Ovr Avatar component.
  6. In the Inspector, set the Oculus User ID on Ovr Avatar to a valid value, or use one of the test user IDs listed at the bottom of this topic. (A script-based alternative is sketched after these steps.)
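
If you prefer to configure the avatar from code, the following minimal sketch shows one way to do it. It assumes the avatar in the scene uses the OvrAvatar component from the Oculus Integration package and that the oculusUserID and EnableExpressive field names match the Avatar SDK sources; verify them against your installed package version.

    using UnityEngine;

    public class AvatarIdSetter : MonoBehaviour
    {
        // Reference to the avatar already placed in the scene.
        public OvrAvatar localAvatar;

        void Awake()
        {
            // Any of the test user IDs listed at the bottom of this topic also works here.
            localAvatar.oculusUserID = "10150030458727564";
            localAvatar.EnableExpressive = true;
        }
    }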

Apps must request microphone permission for lip-syncing functionality to work. The permission must be requested separately at runtime rather than only through the general manifest, as in the sketch below.
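
The following is a minimal sketch of a runtime microphone permission request. It assumes Unity's Android runtime permission API (UnityEngine.Android, available in Unity 2018.3 and later); on earlier versions you will need a plugin or your own permission flow.

    using UnityEngine;
    #if UNITY_ANDROID
    using UnityEngine.Android;
    #endif

    public class MicPermissionRequester : MonoBehaviour
    {
        void Start()
        {
    #if UNITY_ANDROID
            // Prompt the user for microphone access if it has not been granted yet.
            if (!Permission.HasUserAuthorizedPermission(Permission.Microphone))
            {
                Permission.RequestUserPermission(Permission.Microphone);
            }
    #endif
        }
    }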

Setting Up Gaze Targets

As an avatar’s gaze shifts, different gaze targets enter and leave its field of view, and its eyes automatically move between and focus on valid gaze targets. See the “Gaze Modeling” section of this topic for more information on gaze targeting.

Any object in a scene can be made a gaze target by doing the following (a script-based equivalent is sketched after these steps):

  1. Select any object in the scene.
  2. In the Inspector, click the Add Component button and search for Gaze Target. Select the component when it appears.
  3. Use the Type property to specify what kind of gaze target the object is. The types below are listed in descending order of saliency:
    • Avatar Head
    • Avatar Hand
    • Object
    • Static Object
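
The same tagging can be done from a script. The following minimal sketch assumes the component and enum names used in the Avatar SDK sources (GazeTarget and ovrAvatarGazeTargetType, with ObjectStatic corresponding to the Static Object type); verify them against your installed package version.

    using UnityEngine;

    public class GazeTargetTagger : MonoBehaviour
    {
        void Start()
        {
            // Add the Gaze Target component and mark this object as a non-moving target.
            GazeTarget target = gameObject.AddComponent<GazeTarget>();
            target.Type = ovrAvatarGazeTargetType.ObjectStatic;
        }
    }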

Enabling VoIP and Lip Sync on Android Devices

Android allows access to the microphone from only a single process. This was not an issue when networking avatars before the expressive update, but microphone input now needs to be used both by VoIP for voice and by the OVRLipSync plugin to generate blend shapes and drive mouth movement. Attempting to read the microphone directly from both VoIP and OVRLipSync therefore causes an unavoidable race condition.

The SocialStarter sample scene demonstrates a solution to this issue. The VoIP SetMicrophoneFilterCallback function registers a callback that is invoked whenever there is microphone input, so the same input can be shared by both VoIP and OVRLipSync. The following steps walk through the relevant portions of the SocialStarter sample:

  1. SetMicrophoneFilterCallback is used to set a callback early in the local user’s connection. In SocialPlatformManager.cs, this is done in GetLoggedInUserCallback, which gets the identity and information of the logged-in user. See Voice Chat (VoIP) for more information on SetMicrophoneFilterCallback usage.
  2. The callback function (in this example, MicFilter) in turn calls UpdateVoiceData on s_instance with the same pcmData and numChannels values that were passed in.
     public static void MicFilter(short[] pcmData, System.UIntPtr pcmDataLength, int frequency, int numChannels)
     {
         s_instance.UpdateVoiceData(pcmData, numChannels);
     }
  3. UpdateVoiceData then calls the localAvatar’s UpdateVoiceData function (in OvrAvatar.cs).
     public void UpdateVoiceData(short[] pcmData, int numChannels)
     {
         if (localAvatar != null)
             localAvatar.UpdateVoiceData(pcmData, numChannels);

         // Compute the peak of the normalized samples as a simple voice level.
         float voiceMax = 0.0f;
         float[] floats = new float[pcmData.Length];
         for (int n = 0; n < pcmData.Length; n++)
         {
             float cur = floats[n] = (float)pcmData[n] / (float)short.MaxValue;
             if (cur > voiceMax)
                 voiceMax = cur;
         }
         voiceCurrent = voiceMax;
     }
  4. In OvrAvatar.cs, UpdateVoiceData forwards the audio data to ProcessAudioSamplesRaw, which hands it to the lip sync module.
     public void UpdateVoiceData(short[] pcmData, int numChannels)
     {
         if (lipsyncContext != null && micInput == null)
             lipsyncContext.ProcessAudioSamplesRaw(pcmData, numChannels);
     }
  5. Additionally, for both VoIP and OVRLipSync to work at the same time, CanOwnMicrophone on the localAvatar must be set to false in code or unchecked in the Unity Inspector, as in the sketch after these steps.
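
The pieces from steps 1 and 5 can be condensed into the following minimal sketch. It assumes the MicFilter callback from step 2 is reachable as SocialPlatformManager.MicFilter and that localAvatar references the scene's OvrAvatar; note that the SocialStarter sample performs the registration inside GetLoggedInUserCallback rather than in a separate component.

    using Oculus.Platform;
    using UnityEngine;

    public class MicRouting : MonoBehaviour
    {
        public OvrAvatar localAvatar;

        void Start()
        {
            // Step 5: VoIP owns the microphone, so the avatar must not open it itself.
            localAvatar.CanOwnMicrophone = false;

            // Step 1: route every microphone buffer through the shared filter callback
            // so that both VoIP and OVRLipSync receive the same PCM data.
            Voip.SetMicrophoneFilterCallback(SocialPlatformManager.MicFilter);
        }
    }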

Expressive Features

The following sections provide more information on each of the expressive features.


Lip Sync

OVRLipSync uses voice input from the microphone to drive realistic lip-sync animation for the avatar. Machine-learned viseme prediction translates the input into a set of blend shape weights used to animate the avatar’s mouth in real time. Physically based blending produces more natural and dynamic mouth movement, along with subtle facial movements around the mouth.
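
The avatar applies these blend shape weights internally, but the general idea can be illustrated with the OVRLipSync Unity plugin directly. This is an illustrative sketch only; it assumes the plugin’s OVRLipSyncContext component and its GetCurrentPhonemeFrame API, and a face mesh whose blend shapes are ordered to match the plugin’s viseme list.

    using UnityEngine;

    public class VisemeToBlendShape : MonoBehaviour
    {
        public OVRLipSyncContext lipsyncContext;   // produces viseme weights from mic input
        public SkinnedMeshRenderer face;           // mesh with one blend shape per viseme
        public int blendShapeOffset = 0;           // index of the first viseme blend shape

        void Update()
        {
            OVRLipSync.Frame frame = lipsyncContext.GetCurrentPhonemeFrame();
            if (frame == null)
                return;

            // Copy each predicted viseme weight (0..1) onto the matching blend shape (0..100).
            for (int i = 0; i < frame.Visemes.Length; i++)
            {
                face.SetBlendShapeWeight(blendShapeOffset + i, frame.Visemes[i] * 100.0f);
            }
        }
    }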

Gaze Modeling

Gaze modeling enables an avatar’s eyes to look around and exhibit gaze dynamics and patterns developed by studying human behavior. Here are the currently implemented kinds of eye behavior:

  • Fixated – Gaze is focused on a gaze target. This state periodically triggers micro-saccades, small jerk-like movements that occur while looking at an object as the eye adjusts focus and keeps the target centered on the retina.
  • Saccade – Fast, sweeping eye movements where the gaze quickly moves to refocus on a new gaze target.
  • Smooth Pursuit – A movement where the gaze smoothly tracks a gaze target across the avatar’s field of view.

Gaze targets are specifically tagged Unity objects that represent a visual point of interest for an avatar’s gaze. As an avatar’s gaze shifts, different gaze targets enter and leave its field of view, and its eyes will automatically move between and focus on valid gaze targets using the behaviors listed above. If no gaze targets are present, an avatar will exhibit ambient eye movement behavior.

When presented with multiple gaze targets, the targeting algorithm that determines where an avatar looks assigns a saliency score to each target. This score is based on the direction and velocity of head movement, eccentricity from the center of vision, distance, target type, and whether the object is moving.
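
As a purely illustrative sketch, the listed factors could be folded into a single score along these lines; the weights, ranges, and names here are invented for illustration and are not the SDK’s implementation.

    using UnityEngine;

    public static class SaliencyExample
    {
        // typePriority: higher for avatar heads and hands than for plain or static objects.
        public static float Score(Vector3 eyePosition, Vector3 gazeDirection,
                                  Vector3 targetPosition, bool targetIsMoving, float typePriority)
        {
            Vector3 toTarget = targetPosition - eyePosition;

            // Targets near the center of vision and close to the viewer score higher.
            float eccentricity = Vector3.Angle(gazeDirection, toTarget) / 180.0f;  // 0..1
            float distancePenalty = Mathf.Clamp01(toTarget.magnitude / 10.0f);     // 0..1 within 10 m

            float score = typePriority * (1.0f - eccentricity) * (1.0f - distancePenalty);

            // Moving targets get a mild boost; head movement direction and velocity,
            // also listed above, are omitted from this simplified illustration.
            if (targetIsMoving)
                score *= 1.2f;

            return score;
        }
    }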

Blink Modeling

Blink modeling enables an avatar to simulate human blinking behaviors, such as blinking to keep the eyes moist and blinking at the end of spoken sentences. Blinks may also be triggered by saccades, the sweeping eye movements described above.

Expression Modeling

Expression modeling enables slight facial nuances and micro-expressions that increase an avatar’s social presence when it isn’t actively speaking. These expressions are kept subtle so that they aren’t read as an actual mood or as a response to anything happening around the avatar.

Testing Expressive Features

You can test expressive features by using the following user IDs:

  • 10150030458727564
  • 10150030458738922
  • 10150030458747067
  • 10150030458756715
  • 10150030458762178
  • 10150030458769900
  • 10150030458775732
  • 10150030458785587
  • 10150030458806683
  • 10150030458820129
  • 10150030458827644
  • 10150030458843421