Expressive features give Oculus Avatars more advanced facial geometry, allowing for realistic and nuanced animation of various facial behaviors. Expressive features increase social presence and make interactions seem more natural and dynamic. Expressive features comprise lip sync, gaze modeling, blink modeling, and expression modeling, each described in the sections below.
To add an avatar with expressive features to a Unity project, follow these steps:
In order to use expressive features, apps can no longer request version 1.0 avatars.
Apps must request mic permissions for lip sync functionality to work. Permissions must be requested separately rather than as part of the general manifest.
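For example, a minimal sketch of the separate runtime request, assuming Unity's UnityEngine.Android.Permission API (Unity 2018.3 or later); the class name here is only illustrative and your app's permission flow may differ:

    using UnityEngine;
    #if UNITY_ANDROID
    using UnityEngine.Android;
    #endif

    public class MicPermissionRequester : MonoBehaviour
    {
        void Start()
        {
    #if UNITY_ANDROID
            // Request the record-audio permission at runtime so lip sync
            // can receive microphone input on Android devices.
            if (!Permission.HasUserAuthorizedPermission(Permission.Microphone))
            {
                Permission.RequestUserPermission(Permission.Microphone);
            }
    #endif
        }
    }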
As an avatar’s gaze shifts, different gaze targets enter and leave its field of view, and its eyes automatically move between and focus on valid gaze targets. See the “Gaze Modeling” section below for more information on gaze targeting.
Any object in a scene can be made a gaze target by doing the following:
Android allows access to the microphone from only a single process. This was previously not an issue when networking avatars, but with the expressive update, microphone input is needed both by VoIP for voice and by the OVRLipSync plugin to generate the blend shapes that drive mouth movement. Trying to use the microphone input for both VoIP and OVRLipSync therefore causes an inevitable race condition.
The SocialStarter sample demonstrates a solution to this issue. By using the VoIP SetMicrophoneFilterCallback function to set a callback whenever there is microphone input, the input can be used by both VoIP and OVRLipSync. The following steps walk through the relevant portions of the SocialStarter sample:
1. SetMicrophoneFilterCallback is used to set a callback early in the local user's connection. In SocialPlatformManager.cs, this is done in GetLoggedInUserCallback, which gets the identity and information of the logged-in user. See Voice Chat (VoIP) for more information on SetMicrophoneFilterCallback usage.

    Voip.SetMicrophoneFilterCallback(MicFilter);
2. The callback (MicFilter) in turn calls the s_instance's UpdateVoiceData function using the same pcmData and numChannels data as was passed in.

    [MonoPInvokeCallback(typeof(Oculus.Platform.CAPI.FilterCallback))]
    public static void MicFilter(short[] pcmData, System.UIntPtr pcmDataLength, int frequency, int numChannels)
    {
        s_instance.UpdateVoiceData(pcmData, numChannels);
    }
3. UpdateVoiceData then calls the localAvatar's UpdateVoiceData function (in OvrAvatar.cs).

    public void UpdateVoiceData(short[] pcmData, int numChannels)
    {
        if (localAvatar != null)
        {
            localAvatar.UpdateVoiceData(pcmData, numChannels);
        }

        // Record the peak of the normalized samples as a rough voice level.
        float voiceMax = 0.0f;
        float[] floats = new float[pcmData.Length];
        for (int n = 0; n < pcmData.Length; n++)
        {
            float cur = floats[n] = (float)pcmData[n] / (float)short.MaxValue;
            if (cur > voiceMax)
            {
                voiceMax = cur;
            }
        }
        voiceCurrent = voiceMax;
    }
4. In OvrAvatar.cs, UpdateVoiceData passes the data to ProcessAudioSamplesRaw, which passes the audio data to the lip sync module.

    public void UpdateVoiceData(short[] pcmData, int numChannels)
    {
        if (lipsyncContext != null && micInput == null)
        {
            lipsyncContext.ProcessAudioSamplesRaw(pcmData, numChannels);
        }
    }
The localAvatar's CanOwnMicrophone must be set to false (or disabled in the Unity Inspector).
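A minimal sketch of doing this from a script, assuming CanOwnMicrophone is a public field on the OvrAvatar component (as the Inspector option suggests):

    using UnityEngine;

    public class DisableAvatarMicOwnership : MonoBehaviour
    {
        void Awake()
        {
            // Prevent the avatar from opening its own microphone capture;
            // voice data arrives through the VoIP filter callback instead.
            OvrAvatar localAvatar = GetComponent<OvrAvatar>();
            if (localAvatar != null)
            {
                localAvatar.CanOwnMicrophone = false;
            }
        }
    }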
The following sections provide more information on each of the expressive features.
OVRLipSync uses voice input from the microphone to drive realistic lip sync animation for the avatar. Machine-learned viseme prediction is applied to the input and translated into a set of blend shape weights that animate the mouth in real time. Physically based blending is used to produce more natural and dynamic mouth movement, along with subtle facial movements around the mouth.
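The SDK applies these weights to the avatar mesh internally, but the idea can be illustrated with standard Unity blend shape APIs. A rough sketch, with hypothetical viseme names and a hypothetical weight array (this is not the SDK's actual implementation):

    using UnityEngine;

    public class VisemeBlendShapeDriver : MonoBehaviour
    {
        // Hypothetical viseme blend shape names; the avatar mesh's real
        // blend shape set is managed by the Avatar SDK.
        static readonly string[] VisemeNames = { "viseme_sil", "viseme_aa", "viseme_oh" };

        public SkinnedMeshRenderer face;

        // weights are per-viseme values in [0, 1], e.g. from a lip sync frame.
        public void ApplyVisemes(float[] weights)
        {
            for (int i = 0; i < VisemeNames.Length && i < weights.Length; i++)
            {
                int index = face.sharedMesh.GetBlendShapeIndex(VisemeNames[i]);
                if (index >= 0)
                {
                    // Unity blend shape weights are expressed on a 0-100 scale.
                    face.SetBlendShapeWeight(index, weights[i] * 100.0f);
                }
            }
        }
    }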
Gaze modeling enables an avatar’s eyes to look around and exhibit gaze dynamics and patterns developed by studying human behavior. Here are the currently implemented kinds of eye behavior:
Gaze targets are specifically tagged Unity objects that represent a visual point of interest for an avatar’s gaze. As an avatar’s gaze shifts, different gaze targets enter and leave its field of view, and its eyes will automatically move between and focus on valid gaze targets using the behaviors listed above. If no gaze targets are present, an avatar will exhibit ambient eye movement behavior.
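As a rough sketch of tagging an object from code, assuming the Avatar SDK's Unity integration exposes a GazeTarget component that can simply be added to the object (verify the component name and its settings in your SDK version):

    using UnityEngine;

    public class TagAsGazeTarget : MonoBehaviour
    {
        void Start()
        {
            // Adding the component registers this object as a point of
            // interest for the avatar gaze-targeting system.
            if (GetComponent<GazeTarget>() == null)
            {
                gameObject.AddComponent<GazeTarget>();
            }
        }
    }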
When presented with multiple gaze targets, the targeting algorithm that determines which gaze target an avatar looks at calculates a saliency score for each target. This score is based on the direction and velocity of head movement, eccentricity from the center of vision, distance, target type, and whether the object is moving.
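The SDK's actual model and coefficients are not published, but as a purely illustrative sketch of how a saliency score might combine those factors (every weight and term here is hypothetical):

    using UnityEngine;

    public static class GazeSaliencySketch
    {
        // Hypothetical weights; not the SDK's real model.
        const float EccentricityWeight = 1.0f;
        const float DistanceWeight = 0.5f;
        const float HeadMotionWeight = 0.75f;
        const float MovingTargetBonus = 0.25f;

        public static float Score(Vector3 eyePosition, Vector3 gazeForward,
                                  Vector3 headVelocity, Vector3 targetPosition,
                                  bool targetIsMoving, float targetTypeWeight)
        {
            Vector3 toTarget = targetPosition - eyePosition;
            float distance = toTarget.magnitude;
            Vector3 toTargetDir = toTarget.normalized;

            // Targets near the center of vision score higher (low eccentricity).
            float eccentricity = Vector3.Angle(gazeForward, toTargetDir) / 180.0f;

            // Targets in the direction the head is already turning score higher.
            float headMotion = headVelocity.sqrMagnitude > 0.0f
                ? Mathf.Max(0.0f, Vector3.Dot(headVelocity.normalized, toTargetDir))
                : 0.0f;

            float score = targetTypeWeight
                - EccentricityWeight * eccentricity
                - DistanceWeight * distance
                + HeadMotionWeight * headMotion;

            if (targetIsMoving)
            {
                score += MovingTargetBonus;
            }
            return score;
        }
    }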
Blink modeling enables an avatar to simulate human blinking behaviors, such as blinking to keep the eyes clean and blinking at the end of spoken sentences. Blinks may also be triggered by sweeping eye movements.
Expression modeling adds slight facial nuances and micro-expressions to an avatar's face to increase social presence when the avatar isn't actively speaking. These expressions are kept minor to avoid implying an actual mood or a response to anything happening around the avatar.
You can test expressive features by using the following user IDs: