A draw call occurs when a material and mesh are submitted to the GPU for drawing. Draw calls can be CPU intensive in VR because VR games tend to have more draw calls than 2D games and tend to run at higher frame rates. There is less CPU time per frame to draw all of the objects, and, because each eye is rendered separately, there are twice as many objects to draw. Note that single-pass stereo rendering (offered for Unity) can help remove some of this burden from the CPU, but you should still understand how changing various parameters affects draw calls.
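To make the frame-time pressure concrete, the following sketch works out the time available per frame at a given refresh rate and a rough ceiling on draw calls per frame. The 72 Hz refresh rate, the per-call cost, and the `render_share` parameter are illustrative assumptions, not values measured in these tests.

```python
def frame_budget_ms(refresh_hz: float) -> float:
    """Time available per frame, in milliseconds."""
    return 1000.0 / refresh_hz


def max_draw_calls(refresh_hz: float,
                   cost_per_call_ms: float,
                   render_share: float = 0.5) -> int:
    """Rough upper bound on draw calls per frame.

    render_share is a hypothetical fraction of the frame's CPU time that is
    available for submitting draw calls; the rest goes to game logic,
    physics, and so on.
    """
    budget_ms = frame_budget_ms(refresh_hz) * render_share
    return int(budget_ms / cost_per_call_ms)


# Assumed numbers, for illustration only.
budget = frame_budget_ms(72.0)
ceiling = max_draw_calls(72.0, cost_per_call_ms=0.05)
```

A higher refresh rate shrinks the budget, so the same per-call cost allows fewer draw calls per frame.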
The time each draw call takes depends on how much render state changes from the previous draw call. If parameters such as the material, mesh, or texture differ from the previous call, the draw call takes more time.
This document presents test results showing how various draw call parameters affect draw call time on the CPU and GPU.
These tests were conducted with an Oculus Quest using Unity 2018.1.6f1, but are relevant regardless of how you are developing your app.
Unity Test App:
Disable Asynchronous TimeWarp (ATW) to prevent the ATW thread from blocking the main render thread. TimeWarp is included in each raw GPU measurement.
Disable VrPowerManager to prevent the device from sleeping.
The following image shows an example of the rendered test:
The first set of test results graphs the impact of changing shaders, materials, meshes, textures and colors on the CPU.
Based on these results, to optimize performance, in general you should:
The next few sections provide graphed results and more details of the impact of these parameters.
The tests show that using the same shader but switching materials (without changing any other attributes) causes a 64% time increase for draw calls.
Switching shader programs yields an even worse result. The tests show that the draw call time cost is 175% higher after switching shaders.
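As a rough illustration of what these percentages mean for a whole frame, the sketch below adds up relative CPU cost over a sequence of draw calls. It assumes both percentages are measured against the same baseline (a draw call that reuses the previous shader and material) and that the first call of a frame pays the full shader-switch cost; neither assumption is confirmed by the test description.

```python
# Relative multipliers derived from the percentages quoted above.
BASELINE = 1.0          # same shader and material as the previous call
MATERIAL_SWITCH = 1.64  # 64% more than the baseline
SHADER_SWITCH = 2.75    # 175% more than the baseline


def relative_cpu_cost(calls):
    """Sum relative CPU cost for (shader, material) pairs in submission order."""
    total = 0.0
    prev = None
    for shader, material in calls:
        if prev is None or shader != prev[0]:
            # Assumption: the first call of a frame counts as a shader switch.
            total += SHADER_SWITCH
        elif material != prev[1]:
            total += MATERIAL_SWITCH
        else:
            total += BASELINE
        prev = (shader, material)
    return total


frame = [("A", "m1"), ("A", "m1"), ("A", "m2"), ("B", "m3")]
cost = relative_cpu_cost(frame)
```

The result is only meaningful for comparing one submission order against another, not as an absolute time.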
The following graph shows the effect of switching shaders and materials on the time taken to render objects that have zero textures and a shared mesh.
To help reduce the time of changing materials, you can sort your draw calls by material.
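A minimal sketch of sorting by material, assuming a hypothetical `DrawCall` record with `shader`, `material`, and `mesh` fields (these names are not from Unity or any other engine). A real renderer has other ordering constraints, such as transparent objects, which this ignores.

```python
from typing import NamedTuple


class DrawCall(NamedTuple):
    # Hypothetical record; field names are illustrative only.
    shader: str
    material: str
    mesh: str


def sort_for_submission(calls):
    """Group calls so that shader changes least often, then material, then mesh."""
    return sorted(calls, key=lambda c: (c.shader, c.material, c.mesh))


def count_material_switches(calls):
    """Count how many consecutive pairs of calls use different materials."""
    switches = 0
    for prev, cur in zip(calls, calls[1:]):
        if cur.material != prev.material:
            switches += 1
    return switches


unsorted_calls = [
    DrawCall("s", "m1", "a"),
    DrawCall("s", "m2", "a"),
    DrawCall("s", "m1", "b"),
    DrawCall("s", "m2", "b"),
]
sorted_calls = sort_for_submission(unsorted_calls)
```

In this example the unsorted order switches material on every call, while the sorted order switches only once.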
You should also avoid changing meshes between draw calls if possible. This graph shows the increased time when the mesh is switched between draw calls when the same material is used.
The impact of changing materials is larger than the impact of changing meshes. The following graph shows a comparison.
If you are changing materials and using multiple textures, it will be more expensive to make a draw call as you increase the number of textures.
If you reuse a material but have multiple textures and change them between draw calls, there is still a small cost for changing textures.
The following graph shows the effect of increasing textures for changed and reused materials.
Changing the texture, material color, texture size, texture filtering, or compression algorithm has little effect on draw call time.
In Unity 2018, changing textures doesn’t incur any cost beyond the cost of changing material. The following graph demonstrates this.
Changing the color of a material between draw calls doesn’t have a significant cost, so don’t worry about swapping colors if you are swapping materials. However, if color is the only change you are making, you should use the same material and change the mesh color.
Texture size won’t have a significant impact on your draw call cost. The following graph demonstrates this.
Changing the filtering algorithm or compression method won’t have any effect on the draw call cost.
The following graph shows the effect of changing filter algorithms.
The following graph shows the effect of compression algorithms.
Similar to texture size, the size of your meshes is not a significant CPU cost for draw calls; mesh size affects the GPU instead. The following graph demonstrates the impact of mesh size on the CPU.
Time per draw call can be estimated if you know the total rendering time for a frame under various permutations. To calculate the draw time for our test, we observed the average delta between renders for various object counts.
These tests show that a redraw of the same object costs about 25% of what drawing a different object costs.
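One way to read "average delta between renders" is as the average slope of frame time against object count. The sketch below implements that reading; it is an interpretation, not the exact procedure used in the tests, and the sample numbers are invented for illustration.

```python
def per_draw_cost_ms(samples):
    """Estimate the cost of one draw call from (object_count, frame_time_ms) pairs.

    Each pair is assumed to be measured with everything except the object
    count held fixed. The estimate is the mean of the per-object deltas
    between consecutive measurements.
    """
    ordered = sorted(samples)
    deltas = []
    for (n0, t0), (n1, t1) in zip(ordered, ordered[1:]):
        if n1 == n0:
            continue  # skip duplicate object counts to avoid dividing by zero
        deltas.append((t1 - t0) / (n1 - n0))
    if not deltas:
        raise ValueError("need at least two samples with different object counts")
    return sum(deltas) / len(deltas)


# Invented measurements: (objects drawn, total frame time in ms).
example = [(100, 5.0), (200, 7.0), (400, 11.0)]
estimate = per_draw_cost_ms(example)
```

Because the fixed per-frame overhead cancels out in each delta, the estimate isolates the marginal cost of one additional draw call.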
To reduce draw time you can:
The second set of tests examines the impact on the GPU of changing material, complex meshes and more.
In general, to reduce GPU draw time:
The following sections contain the graphed results of each parameter change.
A material change has the most expensive CPU cost, but what about its effect on GPU cost? It turns out that changing materials does impact GPU time, especially when the new material uses a different shader. The following graph shows the impact of changing materials on the GPU:
Material changes have an impact, but note that complex meshes with high vertex/triangle counts have a higher impact on GPU draw time than material changes. The following graph demonstrates that using higher polygon meshes will rapidly use up your GPU budget.
Be aware of the cost of sampling additional textures in your shader. Using a simple shader, the increased cost of additional texture samples can be seen in the graph. There is an increased memory cost with more textures, but compared to the increased cost of changing shaders, additional textures are not one of the big things to avoid.
Three things that didn’t affect the GPU cost of draw calls are texture compression, texture filtering and texture size.
Following is a graph of various texture compression levels, reusing a shader or changing the shader.
Following is a graph of various texture sizes, reusing a shader or changing the shader.
Following is a graph of various texture filter techniques, reusing a shader or changing the shader.
The following graph demonstrates the real time it takes per draw call for a small quad using a simple shader. This test was conducted under controlled conditions, so use it to compare the relative cost of different calls rather than as a source of absolute timings.
The next set of tests looks at the impact of changing shaders on GPU draw cost.
It is generally thought to be more expensive to sample from a Cubemap than from a Texture2D. While this is true, the difference is small, so if your app requires a Cubemap, you should use it. The following graph compares a Cubemap with a Texture2D.
Texture reads are often thought to be expensive and best avoided. However, a dependent texture read on Oculus Quest is only a little more expensive than an independent one. So consider sampling a look-up table (LUT) instead of performing more expensive shader operations.
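The idea behind a LUT can be sketched outside a shader: precompute an expensive function into a table once, then replace each evaluation with an interpolated lookup, much like sampling a linearly filtered 1D texture. This is a CPU-side illustration in Python rather than shader code, and `specular_falloff` is a hypothetical stand-in for whatever operation you want to replace.

```python
import math


def build_lut(fn, size=256):
    """Sample fn at `size` evenly spaced points over [0, 1]."""
    return [fn(i / (size - 1)) for i in range(size)]


def sample_lut(lut, x):
    """Linearly interpolated lookup for x, clamped to [0, 1]."""
    x = min(max(x, 0.0), 1.0)
    pos = x * (len(lut) - 1)
    i = int(pos)
    j = min(i + 1, len(lut) - 1)
    frac = pos - i
    return lut[i] * (1.0 - frac) + lut[j] * frac


def specular_falloff(x):
    # Hypothetical "expensive" operation to be replaced by a lookup.
    return math.pow(x, 32.0)


lut = build_lut(specular_falloff)
approx = sample_lut(lut, 0.9)
exact = specular_falloff(0.9)
```

The trade-off is accuracy against table size: a larger table follows the original function more closely but uses more memory.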
Shader complexity will likely be the largest consumer of GPU time. See the comparison between a simple diffuse shader and a PBR shader (Unity’s Standard Shader). The number and complexity of logical operations typically costs far more time than sampling additional textures.
Finally, the following graph shows the result of combining the different texture parameters. The graph shows that the complexity of the shader results in a longer draw call time in almost every case.
We can also perform the same derivation to calculate a ‘real GPU time’ cost for the different parameters. As noted previously, this is useful only for relative comparisons. The results show that shader complexity is the main thing to be concerned with when you measure GPU time.
The tests and resulting data show some of the effects of changing parameters on the length of draw calls. There are also a number of other things to be aware of when designing your application.
GL_EXT_shader_framebuffer_fetch (simply write your fragment shader with an inout shader parameter instead of returning the final color). However, this convenience hides the fact that framebuffer fetch can be quite expensive, and the cost increases with MSAA level (framebuffer fetch is handled on a per-sample basis instead of per fragment).