The Android SDK tool Systrace can be modified to provide low-level GPU pipeline data for apps running on Oculus Quest. This data includes the render stages of the pipeline and timing data for each stage. Once GPU Systrace functionality has been enabled, traces can be captured from the command line or Android Device Monitor.
GPU Systrace supports render stage GPU tracing at the level of individual tiles. Unlike direct-mode GPUs, which execute draw calls sequentially, tile-based renderers batch the draw calls for an entire surface. The surface is then split into tiles that are processed sequentially, and each tile executes all the draw calls that touch it. GPU Systrace can tell you how much time was spent in each rendering stage for each surface rendered during a trace’s duration.
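The batching behavior described above can be illustrated with a small simulation. This is only a conceptual sketch, not how the Quest GPU or its driver is implemented: the function names, the use of inclusive pixel bounding boxes for draw calls, and the row-by-row tile order are all assumptions made for illustration.

```python
from collections import defaultdict


def bin_draw_calls(draw_calls, surface_w, surface_h, tile_w, tile_h):
    """Assign each draw call to every tile its bounding box touches.

    draw_calls is a list of (name, (x0, y0, x1, y1)) with inclusive pixel
    coordinates (a hypothetical representation chosen for this sketch).
    """
    cols = -(-surface_w // tile_w)  # ceiling division
    rows = -(-surface_h // tile_h)
    bins = defaultdict(list)
    for name, (x0, y0, x1, y1) in draw_calls:
        # Clamp the bounding box to the surface before binning.
        x0, y0 = max(x0, 0), max(y0, 0)
        x1, y1 = min(x1, surface_w - 1), min(y1, surface_h - 1)
        if x0 > x1 or y0 > y1:
            continue  # entirely off-surface, touches no tile
        for row in range(y0 // tile_h, y1 // tile_h + 1):
            for col in range(x0 // tile_w, x1 // tile_w + 1):
                bins[(col, row)].append(name)
    return cols, rows, dict(bins)


def render_tiled(draw_calls, surface_w, surface_h, tile_w, tile_h):
    """Return the (tile, draw call) execution order of a tiled renderer."""
    cols, rows, bins = bin_draw_calls(
        draw_calls, surface_w, surface_h, tile_w, tile_h)
    order = []
    for row in range(rows):
        for col in range(cols):
            # Each tile replays, in submission order, the calls that touch it.
            for name in bins.get((col, row), []):
                order.append(((col, row), name))
    return order
```

With a 64x64 surface split into 32x32 tiles, a full-screen draw call appears once in every tile, while a small object confined to the top-left corner appears only in tile (0, 0).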
GPU Systrace uses the installation of Systrace that comes with the Android SDK in the <ANDROID_SDK_DIR>/platform-tools/systrace/catapult/systrace/systrace folder. To use GPU Systrace, download the GPU Systrace package from our Downloads page and place the enclosed files in that folder, replacing any files that already exist.
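If you prefer to script the copy step, something along these lines may work. It is a minimal sketch that assumes the downloaded package has already been extracted to a local folder; the function name and both path arguments are placeholders, and it requires Python 3.8 or later for the dirs_exist_ok option.

```python
import shutil
from pathlib import Path


def install_gpu_systrace(package_dir, systrace_dir):
    """Copy the extracted package files into the Systrace folder.

    Existing files with the same name are overwritten. Both arguments are
    placeholders: point them at your extracted download and at
    <ANDROID_SDK_DIR>/platform-tools/systrace/catapult/systrace/systrace.
    """
    src = Path(package_dir)
    dst = Path(systrace_dir)
    if not dst.is_dir():
        raise FileNotFoundError(f"Systrace folder not found: {dst}")
    # dirs_exist_ok=True merges into the existing folder (Python 3.8+).
    shutil.copytree(src, dst, dirs_exist_ok=True)
    # Report which files were copied, relative to the package root.
    return sorted(p.relative_to(src).as_posix()
                  for p in src.rglob("*") if p.is_file())
```

Files already in the Systrace folder that are not part of the package are left untouched.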
Before use, GPU Systrace requires an additional step to enable detailed profiling mode on the Oculus Quest. Establish an ADB connection with the Oculus Quest and run the following command from a command prompt to prepare for GPU profiling:
adb shell ovrgpuprofiler -e
Detailed profiling mode must be enabled before the app to be traced is launched.
Any app used with GPU Systrace must declare the <uses-permission android:name="android.permission.INTERNET" /> permission in its manifest.
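One way to confirm the permission requirement before capturing a trace is to check the manifest programmatically. The sketch below assumes you have the plain-text AndroidManifest.xml from your project (not the binary XML packed inside an APK); the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard XML namespace used for android:* attributes in manifests.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
INTERNET = "android.permission.INTERNET"


def has_internet_permission(manifest_xml):
    """Return True if the manifest text declares the INTERNET permission."""
    root = ET.fromstring(manifest_xml)
    name_attr = f"{{{ANDROID_NS}}}name"
    return any(el.get(name_attr) == INTERNET
               for el in root.iter("uses-permission"))
```

To check a file on disk, read it as text and pass the contents to has_internet_permission.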
Use GPU Systrace for Render Stage Tracing
After detailed profiling mode has been enabled, render stage tracing can be performed from the command line or Android Device Monitor.
Follow these steps to take a render stage trace from the command line:
Open a command line at <ANDROID_SDK_DIR>/platform-tools/systrace.
Enter the following command and find the package name of your app in the output:
adb shell pm list packages
Start the target app on the Oculus Quest, and run this command to start the capture:
The new renderstage category initiates the GPU trace.
Press Enter to end the trace.
The output file containing the trace will be located at <ANDROID_SDK_DIR>/platform-tools/systrace/trace.html.
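The package lookup in the steps above can be tedious on a device with many installed packages. As a rough sketch, the helper below filters the output of adb shell pm list packages, which prints one package:<name> entry per line. The function names are hypothetical, and list_device_packages assumes adb is on your PATH with a device connected.

```python
import subprocess


def parse_package_list(pm_output):
    """Extract package names from lines such as 'package:com.example.app'."""
    prefix = "package:"
    names = []
    for line in pm_output.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            names.append(line[len(prefix):])
    return names


def find_packages(pm_output, keyword):
    """Return package names containing keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [name for name in parse_package_list(pm_output)
            if keyword in name.lower()]


def list_device_packages():
    """Run the adb command from the steps above and parse its output."""
    result = subprocess.run(
        ["adb", "shell", "pm", "list", "packages"],
        capture_output=True, text=True, check=True)
    return parse_package_list(result.stdout)
```

For example, find_packages(output, "mygame") narrows the list to packages whose name contains that keyword.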
Android Device Monitor
Follow these steps to take a render stage trace with Android Device Monitor:
Launch Android Device Monitor at <ANDROID_SDK_DIR>/tools/monitor.bat.
Click DDMS if it is not already selected.
Find the connected Oculus Quest in the list of devices, and select the app to be traced on that device.
Click the Systrace icon above the list of devices.
A dialog will appear. In Advanced Options, select GPU RenderStage.
Click OK to start the trace.
The output file containing the trace will be located at the location selected in the dialog.
Looking at the Trace
Open the output file in Google Chrome and you will see something similar to the following image:
The red square indicates the location of the zoom toggle. With the zoom toggle enabled, you can zoom in by holding down the left mouse button and dragging the mouse.
The red square indicates the location of the pan toggle. With the pan toggle enabled, you can move the view around by holding down the left mouse button.
Look at the GPU Timeline. Pan to find a surface and zoom in until you can see the render stages for the surface beneath it. You can click on the surfaces and their render stages to view more detailed information at the bottom of the screen. The Flow events, Processes, and Options buttons at the top of the trace provide different ways to filter and highlight flows and data.
When looking at the trace, the colors indicate the render stage:
Yellow - Binning - The Oculus Quest’s GPU uses a tiled architecture, meaning that all draw calls for a frame are executed in multiple stages. The first stage is the binning phase, where triangle vertex positions for all draw calls are calculated and assigned to bins that correspond to a partition of the drawing surface.
Light Green - Render - This is the second stage of the draw calls that began with binning. One chunk of this represents the total cost of all vertex and fragment operations for one bin. A simplified version of the vertex shaders is executed during binning to find each triangle’s position. During this stage, the full version of the vertex shaders is re-executed to compute the interpolants used by the fragment shader.
Red - Store Color - After all the pixel and fragment operations for a bin have finished executing, the calculated color values are copied from fast memory (dedicated to the bin’s rendering operations) to slow memory.
Orange - Blit - This represents copying between slow memory regions. This can happen for various operations, such as mipmap generation and when clearing a surface without rendering anything.
Gray - Preempt - The compositor is an OS-level service that executes at regular intervals to present the image submitted by the application onto the screen. In order to deliver the image at the proper cadence, the GPU will preempt the application’s workload so that the compositor can complete its work on time.
The following render stages are less common:
Light Purple - Load Color - Loads the color from slow memory into fast memory. This can happen when starting to render into a surface without clearing it.
Dark Purple - Load Depth Stencil - Similar to Load Color, this loads the content of the depth/stencil buffer into high-performance memory to facilitate depth/stencil operations. This can happen when you start rendering with a depth buffer attached without clearing it first.
Dark Green - Store Stencil Depth - Similar to Store Color, this moves the calculated depth/stencil values from fast memory to slow memory. However, since the compositor does not need this information, it is usually discarded rather than stored to slow memory, which is why this stage rarely appears.
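When comparing surfaces, it can help to total the time spent in each of the stages listed above. The sketch below assumes you have transcribed slice durations by hand from the trace viewer into (stage, duration) pairs; it does not parse trace.html, and the function name and input format are hypothetical. The color table simply restates the list above.

```python
from collections import defaultdict

# Stage names and colors as listed in this section.
STAGE_COLORS = {
    "Binning": "Yellow",
    "Render": "Light Green",
    "Store Color": "Red",
    "Blit": "Orange",
    "Preempt": "Gray",
    "Load Color": "Light Purple",
    "Load Depth Stencil": "Dark Purple",
    "Store Stencil Depth": "Dark Green",
}


def time_per_stage(slices):
    """Sum durations per stage, most expensive stage first.

    slices is an iterable of (stage_name, duration_us) pairs copied
    manually from the trace viewer (a hypothetical input format).
    """
    totals = defaultdict(int)
    for stage, duration_us in slices:
        if stage not in STAGE_COLORS:
            raise ValueError(f"Unknown render stage: {stage}")
        totals[stage] += duration_us
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

A large Store Color or Load Color total relative to Render suggests that memory traffic, rather than shading, dominates the surface.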
To see additional information, click individual bar slices.
The following information is available:
surfaceId - Internal surface identifier
width - Surface width
height - Surface height
colorBPP - Color bits per pixel
depthBPP - Depth bits per pixel
stencilBPP - Stencil bits per pixel
MSAA - MSAA level for the surface
MRT - Indicates multiple render targets
numberOfBins - Total number of bins for this surface
binWidth - Width (in pixels) of one bin
binHeight - Height (in pixels) of one bin
renderMode - Indicates render mode. Can be one of the following values:
Direct - Direct render to system memory
HwVizBinning - Binning render to tile memory using a visibility stream
SwBinning - Binning render to tile memory without a visibility pass
HwVizDirect - Direct render to system memory using a visibility stream
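The width, height, binWidth, and binHeight fields can be cross-checked against numberOfBins. The sketch below assumes the bins form a regular grid that covers the whole surface, with partial bins at the right and bottom edges; that layout is an assumption, so treat the value reported in the trace as authoritative if the two disagree. The function name and the example dimensions are illustrative only.

```python
def expected_bin_count(width, height, bin_width, bin_height):
    """Estimate the bin count for a surface, assuming a regular grid."""
    if min(width, height, bin_width, bin_height) <= 0:
        raise ValueError("All dimensions must be positive")
    cols = -(-width // bin_width)    # ceiling division
    rows = -(-height // bin_height)
    return cols * rows


# Illustrative values only, not taken from a real trace:
# a 1216x1344 surface with 256x192 bins needs a 5x7 grid.
print(expected_bin_count(1216, 1344, 256, 192))
```

Fewer, larger bins generally mean fewer Render and Store Color slices per surface in the timeline.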