How to Obtain Stable GPU Measurements on Quest

Oculus Developer Blog | Posted by Rémi Palandri | April 30, 2021
Quest

When developing for Quest, it is important to get accurate and repeatable measurements of an application's GPU frame time. Even the smallest changes to an application, such as changing a shader, resizing a texture, or adding or removing a draw call, can impact your app’s performance. Changes that are usually accepted because they didn't negatively impact refresh rate can actually have a significant cost. For example, consider an app whose GPU frame time regresses 2x, from 5ms to 10ms: both values fit within the roughly 13.9ms frame budget at 72 Hz, so the app still maintains 72 Hz and the regression goes unnoticed.

Quest offers many tools to track GPU time, including RenderDoc for Oculus, GPU Systrace, ovrgpuprofiler, and Perfetto, but they all add some overhead. See Optimization Tools for a list of all tools available for Quest development. These tools are useful for finding the reason behind a change in GPU time, but for the initial investigation of a change they may be neither sufficient nor necessary.

Ensuring App Consistency

The app's rendering needs to be very consistent from frame to frame when measuring, which the optimization tools can't do for you. For VR development, you should lock the camera to a specific position. Looking roughly at a predetermined location isn't enough, as that adds measurable jitter from frame to frame. Additionally, depending on who is doing the testing, floor height itself may significantly influence the app's GPU times.

In UE4, we can add this simple blueprint to our test scenes:

At scene load, this disables the HMD's orientation and position tracking and replaces them with a fixed rotation and position. The app is now effectively camera-locked: when you look around, timewarp presents the rendered view as a "flat" 2D screen that always faces the same direction.

Afterwards, remove dynamic objects (moving NPCs, particle effects whose costs change too much between frames) and remove the hand meshes. It is easy to forget the hands when the HMD is on a desk and the head position is locked in the app, but they still render. Simply set them as hidden in the app. Your app's rendering should now be highly repeatable from frame to frame.

In Unity, this can be achieved in a similar way. If you are using OVRCameraRig, use the following script to override the camera/hand poses:

Otherwise, if you are using Unity’s XR subsystem directly, you could disable the TrackedPoseDriver on the camera game object, and manually override its transform to the preferred position.

Eliminating System Changes

Now that you have ensured consistency in your app, you can remove the sources of deltas originating from the system. These deltas come either from changes in the SoC’s frequency or from processes that run at a higher GPU priority than the app and therefore randomly interrupt the app's GPU workloads, increasing their overall time.

First, we can lock the system’s frequency to representative levels by doing:

Then, the two main processes that run at a higher priority than the app are Guardian and the system compositor (TimeWarp).

For all profiling, we recommend turning off Guardian under developer settings. TimeWarp is trickier to disable, given that disabling it effectively freezes the screen of your HMD. Calling the following in an adb shell session will remove TimeWarp's draw calls for 15 seconds:

Applications Not Meeting Framerate

When an application is not meeting framerate, the system’s Vsync model can introduce frame-to-frame differences, as one frame might wait for the next Vsync while another is released immediately. In that case, we recommend adjusting the Vsync model so that your application is blocked by Vsync (as if it were meeting framerate). For an application that targets 72 Hz but only reaches ~40 frames per second, calling

will lock the application at 36 Hz (72/2). After setting this setprop, you need to restart the compositor by pressing the power button twice (sleep-cycle). The VRAPI logs should then show VSnc=2 instead of VSnc=1.

An application only reaching 30 frames per second can be Vsync-locked at 24 Hz (72/3) with a swapInterval of 3, and so on.
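The arithmetic behind these choices can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the rule used here (pick the smallest swap interval whose locked rate does not exceed the measured framerate) is an assumption inferred from the two examples above. An app whose framerate hovers right at a locked rate may need the next interval up.

```python
import math

def swap_interval_for(refresh_hz: float, measured_fps: float) -> int:
    """Smallest swap interval whose locked rate the app can sustain."""
    if refresh_hz <= 0 or measured_fps <= 0:
        raise ValueError("rates must be positive")
    # refresh_hz / n <= measured_fps  =>  n >= refresh_hz / measured_fps
    return max(1, math.ceil(refresh_hz / measured_fps))

def locked_rate(refresh_hz: float, swap_interval: int) -> float:
    """Framerate the app is locked to for a given swap interval."""
    return refresh_hz / swap_interval

# The 72 Hz cases discussed above: ~40 fps and 30 fps.
for fps in (40, 30):
    n = swap_interval_for(72, fps)
    print(f"{fps} fps -> swapInterval {n}, locked at {locked_rate(72, n):g} Hz")
```

For the 40 fps case this yields a swap interval of 2 (36 Hz), and for the 30 fps case a swap interval of 3 (24 Hz), matching the values above.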

Measuring Stable GPU Time on Your App

Although the HMD is now screen-locked, the app is still rendering underneath it and all of its metrics are still reported correctly. Looking at the VRAPI logs (logcat | grep FPS works well), you should see something like this:

The “App=11.06ms” portion of the log is the OS-measured GPU time. TW=0 and GD=0 show that TimeWarp and Guardian were properly disabled. As you can see, the measurement on the app side is extremely stable, oscillating by at most 0.01ms. With this method you can track down the impact of changes that affect GPU time by as little as 0.02ms, and use it continuously to compare changes.
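If you want to collect these numbers automatically rather than read them by eye, a small script along the following lines may help. It is a minimal sketch, not an official tool: the function names are hypothetical, the sample log lines are abbreviated and made up for illustration, and the only assumption about the log format is that it contains App=, TW= and GD= fields as described above.

```python
import re
import statistics

# Matches fields such as "App=11.06ms", "TW=0" or "GD=0.00ms".
FIELD = re.compile(r"\b(App|TW|GD)=([0-9]+(?:\.[0-9]+)?)(?:ms)?")

def parse_gpu_times(lines):
    """Return App GPU times (ms) from lines where TW and GD are both zero."""
    samples = []
    for line in lines:
        fields = {name: float(value) for name, value in FIELD.findall(line)}
        if "App" not in fields:
            continue
        # Skip samples where TimeWarp or Guardian still used GPU time.
        # A missing TW or GD field is treated as zero.
        if fields.get("TW", 0.0) != 0.0 or fields.get("GD", 0.0) != 0.0:
            continue
        samples.append(fields["App"])
    return samples

def summarize(samples):
    """Sample count, mean, and max-min spread of the GPU times."""
    return {
        "count": len(samples),
        "mean_ms": statistics.mean(samples),
        "spread_ms": max(samples) - min(samples),
    }

# Hypothetical, abbreviated log lines for illustration only.
example_lines = [
    "VrApi: FPS=72,TW=0.00ms,App=11.06ms,GD=0.00ms",
    "VrApi: FPS=72,TW=0.00ms,App=11.07ms,GD=0.00ms",
    "VrApi: FPS=72,TW=2.90ms,App=11.40ms,GD=0.00ms",  # TimeWarp active: skipped
    "VrApi: FPS=72,TW=0.00ms,App=11.06ms,GD=0.00ms",
]
print(summarize(parse_gpu_times(example_lines)))
```

In practice the lines would come from the output of logcat, for example by saving it to a file and passing the open file object to parse_gpu_times.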

Although this process can take some time to set up, it can be extremely beneficial for a performance-oriented team tracking down performance changes. It does not need to be done for every mesh and every scene, but it is highly effective in answering a question like, "What is the GPU impact of switching my lightmaps from 1024 to 2048?" As you tune your application and maximize performance, we hope you will find this method helpful.
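Answering a question like that comes down to comparing two sets of stable GPU times, one captured before the change and one after. The sketch below shows one way to do this. The function names and the sample values are hypothetical, and treating a change as real only when it exceeds the frame-to-frame spread is an assumption of this example, not an official threshold.

```python
import statistics

def gpu_delta(baseline_ms, candidate_ms):
    """Mean GPU-time change between two runs, plus the larger per-run spread."""
    delta = statistics.mean(candidate_ms) - statistics.mean(baseline_ms)
    noise = max(max(run) - min(run) for run in (baseline_ms, candidate_ms))
    return delta, noise

def is_measurable(delta, noise):
    # Treat a change as real only when it exceeds the frame-to-frame spread.
    return abs(delta) > noise

# Hypothetical App GPU times (ms) captured before and after a change.
baseline = [11.06, 11.07, 11.06, 11.06]
candidate = [11.31, 11.30, 11.31, 11.31]

delta, noise = gpu_delta(baseline, candidate)
print(f"delta={delta:+.3f} ms, noise={noise:.3f} ms, "
      f"measurable={is_measurable(delta, noise)}")
```

Both runs should be captured with the same camera lock, frequency levels, and Vsync settings, so that the only difference between them is the change under test.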

If you have any questions, please let us know in the comments or developer forum.