The Face Tracking API allows developers to use abstracted facial expression data to enhance social presence. For example, face tracking can help make a character’s facial expression look more natural during virtual interactions with other users. At a high level, creating a character with face tracking consists of two parts: authoring the character with blendshapes that represent its facial expressions, and adding scripts to that character that read the API and map the detected expressions to those blendshapes.
The Face Tracking API supports expressions based on the Facial Action Coding System (FACS) or on Oculus Viseme-based expressions. The FACS expressions represent 70 different muscle movements that are used to animate the face. Visemes represent the shape of the mouth when producing phonemes (i.e., sounds) and are represented by 15 blendshapes. Both the traditional Lipsync library and the Face Tracking API described in this section use the same 15 Oculus Visemes.
On the Quest Pro, the facial movements detected by the headset sensors are converted to activations of the FACS expression blendshapes (e.g., jaw drop, nose wrinkle). On other Quest headsets, the audio stream is analyzed and translated into either FACS expressions or Oculus Visemes by a machine learning model trained on speech samples, a feature we call Audio To Expressions.
Within the face tracking implementation in Unity, these expressions are mapped onto FACS-based blendshapes that an artist has created to represent the facial expressions of the character. In highly realistic cases, and in the samples we provide, there is one blendshape for each expression the sensors can detect. However, it is important to realize that a person wearing the headset will typically trigger multiple expressions at the same time. For instance, when you smile, you may see lip corner pullers along with other actions on the cheek or even the eye. The API returns a weight corresponding to the strength of each expression (e.g., barely raising an eyebrow versus an extreme raise of an eyebrow). The list of expressions that fire, along with their weights, is then used to activate the blendshapes. Since each blendshape mesh can be deformed in proportion to the strength (or weight) of its expression, the deformations of the different meshes combine to create the desired effect. There is no absolute requirement to match the number of FACS-based blendshapes to the number of expressions. For instance, you could create a simple avatar with two blendshapes (neutral and smile), then map only a few of the provided expressions (e.g., lip corner puller) to detect a smile and activate the smile blendshape.
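As a concrete illustration of that last point, the following minimal sketch maps the left and right lip corner puller weights (reported in the 0 to 1 range) onto a single “smile” blendshape on a stylized character. The component, the blendshape name, and the way the weights arrive are assumptions for illustration only; they are not part of the Face Tracking API.

```csharp
using UnityEngine;

// Illustrative sketch: drives a single "smile" blendshape on a stylized avatar
// from the left/right lip corner puller weights (0..1) reported by face tracking.
// The blendshape name and how the weights are obtained are assumptions,
// not part of the Face Tracking API itself.
public class SimpleSmileMapper : MonoBehaviour
{
    [SerializeField] private SkinnedMeshRenderer faceMesh;
    [SerializeField] private string smileBlendshapeName = "smile"; // hypothetical blendshape name

    private int smileIndex = -1;

    private void Awake()
    {
        // Returns -1 if the mesh has no blendshape with this name.
        smileIndex = faceMesh.sharedMesh.GetBlendShapeIndex(smileBlendshapeName);
    }

    // Call once per frame with the lip corner puller weights from the API.
    public void ApplySmile(float lipCornerPullerLeft, float lipCornerPullerRight)
    {
        if (smileIndex < 0)
        {
            return;
        }

        // Average the two sides and scale from the API's 0..1 range to
        // Unity's 0..100 blendshape weight range.
        float weight = 0.5f * (lipCornerPullerLeft + lipCornerPullerRight);
        faceMesh.SetBlendShapeWeight(smileIndex, weight * 100f);
    }
}
```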
The Face Tracking API provides FACS-based blendshapes that cover most of the face, including the nose, mouth, jaw, eyebrows, and the areas close to the eyes. This provides coverage for the facial movements that make up smiles, frowns, surprise, and other expressions. It allows developers to provide their users with characters that range from high-quality 3D representations for realistic VR experiences to extremely stylized ones for fantasy and science-fiction environments.
For highly realistic characters, it is especially important to use accurate photogrammetry to generate face assets that provide each of the 70 FACS expressions or the 15 visemes that the Face Tracking API defines. These can then be combined into a skinned mesh.
Policies and disclaimers
When a user enables Natural Facial Expressions for your app, your app is granted access to real-time abstracted facial expression data, which is user data under the Developer Data Use Policy. This data is only permitted to be used for purposes outlined in the Developer Data Use Policy. You are expressly forbidden from using this data for Prohibited Practices. The Natural Facial Expressions feature is powered by our Face Tracking technology.
Your use of the Face Tracking API must at all times be consistent with the Oculus SDK License Agreement, the Developer Data Use Policy, and all applicable Oculus and Meta policies, terms and conditions. Your use of Movement may also be covered by applicable privacy and data protection laws.
In particular, you must post and abide by a publicly available and easily accessible privacy policy that clearly explains your collection, use, retention, and processing of data through the Face Tracking API. You must ensure that a user is provided with clear and comprehensive information about, and consents to, your access to and use of abstracted facial expression data prior to collection, including as required by applicable privacy and data protection laws.
Please note that we reserve the right to monitor your use of the Face Tracking API to enforce compliance with our policies.
Known issues
The following are known issues:
Confidence values for Face Tracking API facial expressions are not populated.
Audio to Expressions: Yawning may not result in the character opening their mouth as the models are trained mostly on verbal input.
Audio to Expressions: When the app first launches, there may be a delay of up to 10 seconds before the voice input is fully connected to drive the face animation. This problem has been observed after a reboot of the device.
Similar to visual-based face tracking, the Audio to Expressions model is trained on how users really move their lips. If your users expect more exaggerated movements, such as those provided with visemes, you can increase the multipliers in the retargeting configuration file. See the JSON configuration section.
Integrate face tracking
Learning Objective
After completing this section, the developer should be able to:
1. Set up a new project for face tracking.
2. Enable a character using blendshapes corresponding directly to the Face Tracking API for face tracking.
3. Enable a character using ARKit blendshapes for face tracking.
4. Enable a character using Visemes for face tracking.
After configuring your project for VR, follow these steps.
Make sure you have an OVRCameraRig prefab in your scene. The prefab is located at Packages/com.meta.xr.sdk.core/Prefabs/OVRCameraRig.prefab.
From the OVRCameraRig object, navigate to the OVRManager component.
Select Target Devices.
Scroll down to Quest Features > General.
If you want hand tracking, select Controllers and Hands for Hand Tracking Support.
Under General, make sure Face Tracking Support is selected. Click General if that view isn’t showing.
Under OVRManager, select Face Tracking under Permission Requests On Startup. (A runtime permission check sketch follows after these setup steps.)
If your project depends on Face Tracking, Eye Tracking, or Hand Tracking, ensure that these are enabled on your HMD. This is typically part of the device setup, but you can verify or change the settings by clicking Settings > Movement Tracking.
Fix any issues diagnosed by the Project Setup Tool. On the menu in Unity, go to Edit > Project Settings > Meta XR to access the Project Setup Tool.
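If you want to confirm at runtime that the face tracking permission was actually granted before enabling face-driven features, a check along the following lines can help. This is a minimal sketch: the permission string is an assumption to verify against your SDK version, and OVRManager’s Permission Requests On Startup normally issues the request for you.

```csharp
using UnityEngine;
#if UNITY_ANDROID
using UnityEngine.Android;
#endif

// Sketch of a runtime permission check for face tracking on device.
// The permission string below is an assumption; confirm it against the
// SDK version you are using. OVRManager's "Permission Requests On Startup"
// setting normally triggers the request for you.
public class FaceTrackingPermissionCheck : MonoBehaviour
{
    private const string FaceTrackingPermission = "com.oculus.permission.FACE_TRACKING"; // assumed

    private void Start()
    {
#if UNITY_ANDROID && !UNITY_EDITOR
        if (!Permission.HasUserAuthorizedPermission(FaceTrackingPermission))
        {
            Debug.LogWarning("Face tracking permission not granted; face-driven features will stay inactive.");
        }
#endif
    }
}
```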
Setting up a character for FACS-based face tracking
Note: Some of the scripts below are distributed in the sample and not in Meta XR All-in-One SDK. As such, it is necessary to download the Oculus Samples GitHub repo to have access to these scripts. It is distributed as a package, so you can add it to an existing project.
Step 1: Characters with FACS-based blendshapes. If your character has blendshapes that correspond to our face tracking API blendshapes (see FACS-based blendshapes), follow this step. Otherwise, skip to Step 2.
Step 1a: There is a helper function under Game Object->Movement Samples->Face Tracking->A2E Face. This should be used as a base setting, but it should be examined and modified based on the target character to make certain that blendshapes predictably work together.
Step 2: Characters with ARKit Blendshapes. Right-click on your character’s face skinned mesh that has blendshapes, then use the ARKit helper function under Game Object->Movement Samples->Face Tracking->A2E ARKit Face which establishes our default mappings for ARKit.
Step 3: Make sure your character has FaceDriver and FaceRetargeterComponent components, and that the FaceRetargeterComponent references an OVRWeightsProvider. The FaceDriver should reference the meshes that animate in response to face tracking. (A validation sketch follows after Step 4.)
Step 4: Test your character. At this point, your character should be ready to test by donning the headset and running the scene with the character in use. The FaceRetargeterComponent’s “Retargeter Config” field references a JSON file that can be modified to influence how the character animates.
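Because FaceDriver and FaceRetargeterComponent ship with the samples rather than the SDK, a small runtime sanity check can make missing wiring obvious before you put on the headset. The sketch below is illustrative and only assumes the component type names mentioned in the steps above.

```csharp
using UnityEngine;

// Sketch: verifies that a character has the sample components described above.
// FaceDriver and FaceRetargeterComponent come from the Oculus Samples package,
// so this script only compiles with that package in the project.
public class FaceSetupValidator : MonoBehaviour
{
    private void Awake()
    {
        if (GetComponentInChildren<FaceDriver>() == null)
        {
            Debug.LogError($"{name}: no FaceDriver found; face tracking will not animate this character.");
        }

        if (GetComponentInChildren<FaceRetargeterComponent>() == null)
        {
            Debug.LogError($"{name}: no FaceRetargeterComponent found; also check that it references an OVRWeightsProvider and a Retargeter Config JSON in the Inspector.");
        }
    }
}
```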
Note: If you notice incorrect lighting on your character as it animates such as unusual creases that appear near the eyelids, please use the RecalculateNormals component.
To configure the RecalculateNormals component:
Assign the character’s face skinned mesh renderer to the Skinned Mesh Renderer field.
You must use a compatible material based on Movement/PBR (Specular) or Movement/PBR (Metallic) on the skinned mesh renderer in order for normal recalculation to work.
Set the Recalculate Material Indices array to the indices of the materials that normal recalculation should run on. Use the “Sub Mesh” array to indicate the index of each sub mesh that recalculation should operate on.
Set the Duplicate Layer field to the layer that the character is located on.
Set the Hidden Mesh Layer Name field to the layer name that isn’t rendered by the camera.
If no layer exists, head to Edit > Project Settings > Tags and Layers, and then create a new layer to use for the Hidden Mesh Layer Name field (see the sketch after this list). The RecalculateNormals script will change the culling mask for all cameras in the scene to exclude this layer from rendering.
Enable the Recalculate Independently field.
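Since the Hidden Mesh Layer Name must refer to a layer that actually exists in the project, a minimal check like the following can catch a missing layer early. It uses only standard Unity layer APIs; the layer name shown is an example.

```csharp
using UnityEngine;

// Sketch: confirms that the layer intended for RecalculateNormals'
// "Hidden Mesh Layer Name" field exists before the component relies on it.
// The layer name below is an example; use whatever you created under
// Edit > Project Settings > Tags and Layers.
public class HiddenMeshLayerCheck : MonoBehaviour
{
    [SerializeField] private string hiddenMeshLayerName = "HiddenMesh"; // example name

    private void Awake()
    {
        // LayerMask.NameToLayer returns -1 when no layer has this name.
        if (LayerMask.NameToLayer(hiddenMeshLayerName) == -1)
        {
            Debug.LogError($"Layer '{hiddenMeshLayerName}' does not exist. Create it under Edit > Project Settings > Tags and Layers.");
        }
    }
}
```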
Scripts and Components
This section offers a brief description of the components that drive face tracking from blendshapes.
OVRFaceExpressions
The OVRFaceExpressions component queries and updates face expression data every frame. The component script defines an indexer for accessing face expression data and provides the ValidExpressions field to indicate whether or not the expression data is valid. This component only works in apps for which users have granted the face tracking permission.
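A minimal usage sketch follows. It guards on ValidExpressions and reads two weights through the indexer; the enum member names are assumptions based on the FACS blendshape names listed at the end of this section, so verify them against your SDK version.

```csharp
using UnityEngine;

// Minimal sketch of reading per-frame weights from OVRFaceExpressions.
// The enum member names below are assumptions based on the FACS blendshape
// names listed later in this document; verify them against your SDK version.
public class ExpressionWeightLogger : MonoBehaviour
{
    [SerializeField] private OVRFaceExpressions faceExpressions;

    private void Update()
    {
        // Expression data is only valid once the face tracking permission has
        // been granted and the tracker is running.
        if (faceExpressions == null || !faceExpressions.ValidExpressions)
        {
            return;
        }

        float jawDrop = faceExpressions[OVRFaceExpressions.FaceExpression.JawDrop];
        float smileL = faceExpressions[OVRFaceExpressions.FaceExpression.LipCornerPullerL];
        Debug.Log($"JawDrop: {jawDrop:F2}, LipCornerPullerL: {smileL:F2}");
    }
}
```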
FaceRetargeterComponent
Implements a retargeting WeightsProvider that maps source tracking weights to a set of target weights based on a JSON configuration file. Each item in this file defines a set of input drivers and the combination of output weights that those drivers should map to. The FaceRetargeterComponent class then creates a mapping function implementing these mappings. The intended consumer of these weights is a FaceDriver instance.
FaceDriver
Implements a rig concept based on a naming convention, and drives the deformation. Using a list of blendshapes extracted from a list of skinned meshes, it builds a RigLogic instance, which interprets each name in terms of direct driver signals, in-betweens and correctives, and assembles their activation functions. The signals from the associated WeightsProvider then drive the deformation.
WeightsProvider
An abstract class that provides weights values to consumers. Examples include FaceRetargeterComponent and OVRWeightsProvider.
JSON configuration
Each character’s FaceRetargeterComponent has a “Retargeter Config” field that can be used to tweak the character’s performance by modifying the drivers (FaceExpression) that drive the targets (target FACS-based blendshapes) via weights. For instance, when setting up a character using Game Object->Movement Samples->Face Tracking->A2E ARKit Face, a default arkit_retarget_a2e_v10.json configuration is added. Each entry in this configuration has a FaceExpression name followed by a list of ARKit blendshape names that are driven by it. If you wish for the corresponding ARKit blendshape to react more strongly, you can increase the weight next to it. Optionally, you can change which blendshapes are modified by removing and adding entries. This workflow does not apply to Viseme-based blendshapes.
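Based on the description above, an entry pairs a driving FaceExpression name with the blendshape names and weights it drives. The snippet below is only an illustrative sketch of that idea, with hypothetical entries and ARKit-style target names; consult the arkit_retarget_a2e_v10.json file shipped with the samples for the exact schema, which may differ.

```json
{
  "jawDrop": [
    { "target": "jawOpen", "weight": 1.0 }
  ],
  "lipCornerPullerL": [
    { "target": "mouthSmileLeft", "weight": 1.2 }
  ]
}
```

Again, the key and field names here are hypothetical; only the overall shape (one driver mapped to a weighted list of targets) follows the description above.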
Setting up a character for Viseme-based face tracking
This section describes configuring a character with Viseme-compatible blendshapes. This is currently a public experimental feature and cannot be used for apps that are published to the Meta app store.
To use Visemes, follow the steps discussed above, and then enable the following:
“Enable Visemes” under OVRCameraRig -> OVRManager -> Movement Tracking.
“Audio” and “Visual” under OVRCameraRig -> OVRManager -> Movement Tracking -> Face Tracking Data Sources.
“Record Audio for audio based Face Tracking” under OVRCameraRig -> OVRManager -> Permission Requests On Startup.
“Experimental Features” under OVRCameraRig -> OVRManager -> Experimental Features.
OVRFaceExpressions provides several properties and functions to access visemes: use the component’s AreVisemesValid to query viseme validity and GetViseme to obtain the weight of a given FaceViseme. TryGetFaceViseme is similar to GetViseme, except that it returns false if the provided FaceViseme is invalid.
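The following minimal sketch reads one viseme weight per frame using the members named above. The FaceViseme member used (“SS”) and the out-parameter form of TryGetFaceViseme are assumptions; check them against your SDK version.

```csharp
using UnityEngine;

// Sketch of reading viseme weights from OVRFaceExpressions.
// The FaceViseme member name ("SS") and the out-parameter form of
// TryGetFaceViseme are assumptions; verify them against your SDK version.
public class VisemeWeightLogger : MonoBehaviour
{
    [SerializeField] private OVRFaceExpressions faceExpressions;

    private void Update()
    {
        // Viseme data is only valid once visemes are enabled and running.
        if (faceExpressions == null || !faceExpressions.AreVisemesValid)
        {
            return;
        }

        if (faceExpressions.TryGetFaceViseme(OVRFaceExpressions.FaceViseme.SS, out float weight))
        {
            Debug.Log($"SS viseme weight: {weight:F2}");
        }
    }
}
```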
To make viseme integration easier, we also provide a VisemeDriver component in the Oculus Samples GitHub package. To use it, follow these steps:
Step 1: Add VisemeDriver as a component to each skinned mesh renderer that has viseme-compatible blendshapes.
Step 2: Click on the “Auto Generate Mapping” button to associate the skinned mesh renderer’s blendshapes to visemes.
Step 2a: The “Clear Mapping” button can be used to clear these associations.
Step 3: Check the mappings that have been generated on the VisemeDriver component and make changes if required.
If you notice unusual creases that appear near the eyelids, please use the RecalculateNormals component discussed above.
FAQ
Which headset models support face tracking?
Natural Facial Expressions, which estimates facial movements based on inward facing cameras, is only available on Meta Quest Pro headsets. However, Audio To Expressions is available on all Meta Quest 2 and Meta Quest 3 devices with the same API.
How do I adapt my existing blendshapes to the FACS blendshapes Face Tracking API provides?
If the existing blendshapes don’t match the naming convention of the expected blendshapes in OVRFace, the blendshapes will need to be manually assigned. You can also inherit from OVRCustomFace to create your own custom mapping. The blendshape visual reference can be found in the Movement - Face BlendShapes topic.
Are tongue blendshapes supported?
Yes, a total of seven tongue blendshapes are supported among the FACS blendshapes provided, including tongue out.
FACS-based blendshapes
Blendshapes
BROW_LOWERER_L and BROW_LOWERER_R knit and lower the brow area and lower central forehead.
CHEEK_PUFF_L and CHEEK_PUFF_R fill the cheeks with air causing them to round and extend outward.
CHEEK_RAISER_L and CHEEK_RAISER_R tighten the outer rings of the eye orbit and squeeze the lateral eye corners.
CHEEK_SUCK_L and CHEEK_SUCK_R suck the cheeks inward and against the teeth to create a hollow effect in the cheeks.
CHIN_RAISER_B and CHIN_RAISER_T push the skin of the chin and the lower lip upward. When the lips are touching, the upward force from the lower lip pushes the top lip up as well.
DIMPLER_L and DIMPLER_R pinch the lip corners against the teeth, drawing them slightly backward and often upward in the process.
EYES_CLOSED_L and EYES_CLOSED_R lower the top eyelid to cover the eye.
EYES_LOOK_DOWN_L and EYES_LOOK_DOWN_R move the eyelid consistent with downward gaze.
EYES_LOOK_LEFT_L and EYES_LOOK_LEFT_R move the eyelid consistent with leftward gaze.
EYES_LOOK_RIGHT_L and EYES_LOOK_RIGHT_R move the eyelid consistent with rightward gaze.
EYES_LOOK_UP_L and EYES_LOOK_UP_R move the eyelid consistent with upward gaze.
INNER_BROW_RAISER_L and INNER_BROW_RAISER_R lift the medial brow and forehead area.
JAW_DROP moves the lower mandible downward and toward the neck.
JAW_SIDEWAYS_LEFT moves the lower mandible leftward.
JAW_SIDEWAYS_RIGHT moves the lower mandible rightward.
JAW_THRUST projects the lower mandible forward.
LID_TIGHTENER_L and LID_TIGHTENER_R tighten the rings around the eyelids and push the lower eyelid skin toward the inner eye corners.
LIP_CORNER_DEPRESSOR_L and LIP_CORNER_DEPRESSOR_R draw the lip corners downward.
LIP_CORNER_PULLER_L and LIP_CORNER_PULLER_R draw the lip corners up, back, and laterally.
LIP_FUNNELER_LB, LIP_FUNNELER_LT, LIP_FUNNELER_RB, and LIP_FUNNELER_RT fan the lips outward in a forward projection, often rounding the mouth and separating the lips.
LIP_PRESSOR_L and LIP_PRESSOR_R press the upper and lower lips against one another.
LIP_PUCKER_L and LIP_PUCKER_R draw the lip corners medially, causing the lips to protrude in the process.
LIP_STRETCHER_L and LIP_STRETCHER_R draw the lip corners laterally, stretching the lips and widening the jawline.
LIP_SUCK_LB, LIP_SUCK_LT, LIP_SUCK_RB, and LIP_SUCK_RT suck the lips toward the inside of the mouth.
LIP_TIGHTENER_L and LIP_TIGHTENER_R narrow or constrict each lip on a horizontal plane.
LIPS_TOWARD forces contact between top and bottom lips to keep the mouth closed regardless of the position of the jaw.
LOWER_LIP_DEPRESSOR_L and LOWER_LIP_DEPRESSOR_R draw the lower lip downward and slightly laterally.
MOUTH_LEFT pulls the left lip corner leftward and pushes the right side of the mouth toward the left lip corner.
MOUTH_RIGHT pulls the right lip corner rightward and pushes the left side of the mouth toward the right lip corner.
NOSE_WRINKLER_L and NOSE_WRINKLER_R lift the sides of the nose, nostrils, and central upper lip area. Often pairs with brow lowering muscles to lower the medial brow tips.
OUTER_BROW_RAISER_L and OUTER_BROW_RAISER_R lift the lateral brows and forehead areas.
UPPER_LID_RAISER_L and UPPER_LID_RAISER_R pull the top eyelid up and back to widen eyes.
UPPER_LIP_RAISER_L and UPPER_LIP_RAISER_R lift the top lip (in a more lateral manner than nose wrinkler).
TONGUE_TIP_INTERDENTAL raises the tip of the tongue to touch the top teeth like with the viseme “TH”. The tongue is visible and slightly sticks out past the teeth line.
TONGUE_TIP_ALVEOLAR raises the tip of tongue to touch the back of the top teeth like in the viseme “NN”.
TONGUE_FRONT_DORSAL_PALATE presses the front part of the tongue against the palate like in the viseme “CH”.
TONGUE_MID_DORSAL_PALATE presses the middle of the tongue against the palate like in the viseme “DD”.
TONGUE_BACK_DORSAL_VELAR presses the back of the tongue against the palate like in the viseme “KK”.
TONGUE_OUT sticks the tongue out.
TONGUE_RETREAT pulls the tongue back toward the throat and keeps the tongue down like in the viseme “AA”.