Expressive Features for Avatars - Unity

Expressive features give Avatars more advanced facial geometry, allowing for realistic and nuanced animation of various facial behaviors. They increase social presence and make interactions feel more natural and dynamic. Expressive features consist of the following:

  • Realistic lip-syncing powered by Oculus Lipsync technology.
  • Natural eye behavior, including gaze dynamics and blinking.
  • Ambient facial micro-expressions when an Avatar is not speaking.

Using Oculus Avatars with Expressive Features

To add an Avatar with expressive features to a Unity project, follow these steps:

  1. Open Unity 2017.4.11f1 or later and create a new project.
  2. Go to the Unity Asset Store and import the Oculus Integration package. Update OVRPlugin if necessary.
  3. On the menu bar, go to Oculus > Avatars > Edit Settings > Oculus Rift App Id and add a valid app ID.
  4. On the menu bar, go to Oculus > Platform > Edit Settings > Oculus Rift App Id and add a valid app ID.
  5. Add an Avatar to the scene. In the Inspector make sure the Enable Expressive check box is checked on the Ovr Avatar component.
  6. In the Inspector, set the Oculus User ID on Ovr Avatar to a valid value, or use one of the test user IDs listed at the bottom of this topic.
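The Oculus User ID can also be assigned from script, for example after fetching the logged-in user from the Platform SDK. The following is a minimal sketch; it assumes the public oculusUserID string field on OvrAvatar and the Users.GetLoggedInUser request from the Oculus Platform SDK, so verify both against your Oculus Integration version. Depending on the SDK version, the ID may need to be set before the Avatar component initializes.

    using Oculus.Platform;
    using Oculus.Platform.Models;
    using UnityEngine;

    public class AvatarUserIdSetter : MonoBehaviour
    {
        // Assign the scene's OvrAvatar in the Inspector.
        public OvrAvatar localAvatar;

        void Start()
        {
            Core.Initialize();

            // Fetch the logged-in user and hand its ID to the Avatar so the
            // correct expressive Avatar specification is loaded.
            Users.GetLoggedInUser().OnComplete(message =>
            {
                if (!message.IsError && localAvatar != null)
                {
                    localAvatar.oculusUserID = message.Data.ID.ToString();
                }
            });
        }
    }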

To use expressive features, apps can no longer request version 1.0 Avatars.

Apps must request microphone permissions for lip sync functionality to work. The permission must be requested explicitly rather than relying on a general declaration in the manifest.
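On Android, one way to request the permission at runtime is Unity's Android Permission API (available in Unity 2018.3 and later). The following is a minimal sketch, not the only way to request it.

    using UnityEngine;
    #if UNITY_ANDROID
    using UnityEngine.Android;
    #endif

    public class MicPermissionRequester : MonoBehaviour
    {
        void Start()
        {
    #if UNITY_ANDROID
            // Lip sync receives no audio until the user grants the permission.
            if (!Permission.HasUserAuthorizedPermission(Permission.Microphone))
            {
                Permission.RequestUserPermission(Permission.Microphone);
            }
    #endif
        }
    }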


Setting Up Gaze Targets

As an Avatar’s gaze shifts, different gaze targets enter and leave its field of view, and its eyes automatically move between and focus on valid gaze targets. See the “Gaze Modeling” section below for more information on gaze targeting.

Any object in a scene can be made a gaze target by doing the following:

  1. Select any object in the scene.
  2. In the Inspector, click Add Component and search for Gaze Target. Select the component when it appears.
  3. Use the Type property to specify what kind of gaze target the object is. The types below are listed in descending order of saliency; a script-based sketch follows this list.
    • Avatar Head
    • Avatar Hand
    • Object
    • Static Object
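A gaze target can also be added from script. The sketch below assumes the component added in step 2 is a class named GazeTarget exposing a Type field of the ovrAvatarGazeTargetType enum; check the component source in your Oculus Integration version for the exact names.

    using UnityEngine;

    public class MakeGazeTarget : MonoBehaviour
    {
        void Start()
        {
            // Mark this object as a point of interest for Avatar gaze.
            var target = gameObject.AddComponent<GazeTarget>();

            // Assumed enum member; Avatar heads and hands rank higher in saliency.
            target.Type = ovrAvatarGazeTargetType.Object;
        }
    }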

Enabling VoIP and Lip Sync on Android Devices

Android allows only a single process to access the microphone. This was not an issue when networking Avatars before the expressive update, but microphone input now needs to be used both by VoIP for voice chat and by the OVRLipSync plugin to generate blend shapes and drive mouth movement. Trying to use the microphone input directly from both therefore causes a race condition.

The SocialStarter sample demonstrates a solution to this issue: the VoIP SetMicrophoneFilterCallback function registers a callback that is invoked whenever there is microphone input, so the same input can be shared by VoIP and OVRLipSync. The following steps walk through the relevant portions of the SocialStarter sample:

  1. SetMicrophoneFilterCallback is used to set a callback early in the local user’s connection. In SocialPlatformManager.cs, this is done in GetLoggedInUserCallback, which gets the identity and information of the logged-in user. See Voice Chat (VoIP) for more information on SetMicrophoneFilterCallback usage.
    Voip.SetMicrophoneFilterCallback(MicFilter);
    
  2. The callback function (in this example, MicFilter) in turn calls s_instance's UpdateVoiceData function with the same pcmData and numChannels values that were passed in.
    [MonoPInvokeCallback(typeof(Oculus.Platform.CAPI.FilterCallback))]
    public static void MicFilter(short[] pcmData, System.UIntPtr pcmDataLength, int frequency, int numChannels)
    {
        s_instance.UpdateVoiceData(pcmData, numChannels);
    }
    
  3. UpdateVoiceData then calls the localAvatar’s UpdateVoiceData function (in OvrAvatar.cs).
    public void UpdateVoiceData(short[] pcmData, int numChannels)
    {
        if (localAvatar != null)
        {
            localAvatar.UpdateVoiceData(pcmData, numChannels);
        }

        // Track the peak level of the incoming samples to drive the voice indicator.
        float voiceMax = 0.0f;
        float[] floats = new float[pcmData.Length];
        for (int n = 0; n < pcmData.Length; n++)
        {
            float cur = floats[n] = (float)pcmData[n] / (float)short.MaxValue;
            if (cur > voiceMax)
            {
                voiceMax = cur;
            }
        }
        voiceCurrent = voiceMax;
    }
    
  4. In OvrAvatar.cs, UpdateVoiceData passes the data to ProcessAudioSamplesRaw, which passes the audio data to the lip sync module.
    public void UpdateVoiceData(short[] pcmData, int numChannels)
    {
        // Feed external voice data to the lip sync context only when the Avatar
        // is not capturing its own microphone input (micInput == null).
        if (lipsyncContext != null && micInput == null)
        {
            lipsyncContext.ProcessAudioSamplesRaw(pcmData, numChannels);
        }
    }
    
  5. Additionally, for both VoIP and OVRLipSync to work at the same time, the localAvatar’s CanOwnMicrophone must be set to false (or disabled in the Unity Inspector).
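The same setting can be applied from script before the Avatar initializes. A one-line sketch, assuming the public CanOwnMicrophone field shown on the Ovr Avatar component:

    // Keep the VoIP filter callback as the single consumer of microphone input
    // by preventing the Avatar from capturing the microphone itself.
    localAvatar.CanOwnMicrophone = false;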

Expressive Features

The following sections provide more information on each of the expressive features.

OVRLipSync

OVRLipSync uses voice input from the microphone to drive realistic lip-sync animation for the Avatar. Machine-learned viseme prediction translates the input into a set of blendshape weights that animate the mouth in real time. Physically based blending produces more natural and dynamic mouth movement, along with subtle facial movements around the mouth.
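The viseme weights that drive the mouth can be inspected at runtime. A minimal sketch, assuming the OVRLipSyncContextBase component and its GetCurrentPhonemeFrame call from the Oculus Lipsync Unity plugin:

    using UnityEngine;

    [RequireComponent(typeof(OVRLipSyncContextBase))]
    public class VisemeLogger : MonoBehaviour
    {
        private OVRLipSyncContextBase context;

        void Start()
        {
            context = GetComponent<OVRLipSyncContextBase>();
        }

        void Update()
        {
            // Frame.Visemes holds one predicted weight per viseme (0..1);
            // these weights are what get mapped onto the mouth blendshapes.
            OVRLipSync.Frame frame = context.GetCurrentPhonemeFrame();
            if (frame != null && frame.Visemes.Length > 0)
            {
                Debug.Log("First viseme weight: " + frame.Visemes[0]);
            }
        }
    }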

Gaze Modeling

An Avatar’s eyes look around and exhibit gaze dynamics and patterns developed by studying human behavior. Here are the currently implemented kinds of eye behavior:

  • Fixated – Gaze is focused on a gaze target. This state periodically triggers micro-saccades, small jerk-like movements that occur while looking at an object as the eye adjusts focus and moves its target onto the retina.
  • Saccade – Fast, sweeping eye movements where the gaze quickly moves to refocus on a new gaze target.
  • Smooth Pursuit – A movement where the gaze smoothly tracks a gaze target across the Avatar’s field of view.

Gaze targets are specifically tagged Unity objects that represent a visual point of interest for an Avatar’s gaze. As an Avatar’s gaze shifts, different gaze targets enter and leave its field of view, and its eyes will automatically move between and focus on valid gaze targets using the behaviors listed above. If no gaze targets are present, an Avatar will exhibit ambient eye movement behavior.

When presented with multiple gaze targets, the targeting algorithm assigns each target a saliency score that determines which one the Avatar looks at. The score is based on the direction and velocity of head movement, eccentricity from the center of vision, distance, target type, and whether the object is moving.
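The exact weighting is internal to the SDK, but the idea can be illustrated with a simplified, purely hypothetical scoring function (the weights and parameter names below are illustrative, not the SDK's):

    using UnityEngine;

    public static class GazeSaliencyExample
    {
        // Hypothetical scoring: nearer, more central, and moving targets of more
        // salient types (heads > hands > objects) receive higher scores.
        public static float Score(Vector3 eyePosition, Vector3 gazeDirection,
                                  Vector3 targetPosition, float typeWeight, bool isMoving)
        {
            Vector3 toTarget = targetPosition - eyePosition;
            float eccentricity = Vector3.Angle(gazeDirection, toTarget); // degrees off-center
            float distance = toTarget.magnitude;

            float score = typeWeight;
            score -= 0.01f * eccentricity;   // favor targets near the center of vision
            score -= 0.05f * distance;       // favor nearby targets
            if (isMoving)
            {
                score += 0.5f;               // moving objects attract attention
            }
            return score;
        }
    }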

Avatars simulate human blinking behaviors, such as periodic blinks that keep the eyes moist and blinks at the end of spoken sentences. Blinks may also be triggered by large, sweeping eye movements (saccades).

Expression Model

With expressive features, an Avatar shows slight facial nuances and micro-expressions to increase social presence when it is not actively speaking. These expressions are kept subtle so that they do not read as an actual mood or as a response to anything happening around the Avatar.

Testing Expressive Features

You can test expressive features by using the following user IDs:

  • 10150030458727564
  • 10150030458738922
  • 10150030458747067
  • 10150030458756715
  • 10150030458762178
  • 10150030458769900
  • 10150030458775732
  • 10150030458785587
  • 10150030458806683
  • 10150030458820129
  • 10150030458827644
  • 10150030458843421