Developer Perspective: Optimizing Performance For Fast-Paced Interactions on Quest
Oculus Developer Blog
|
Posted by Steven Jian
|
October 9, 2019

Hi, my name is Steven Jian and I'm part of the team at Coinflip Studios. We recently released Ninja Legends, an intense melee combat game that makes you feel like a total combat master - often fighting off multiple enemies at the same time.

One of the most satisfying parts of our game is timing a well-placed blow to slice an enemy in half. When this happens, the enemy character is split dynamically along the plane created by your sword's movement. Splitting an enemy ninja in real time can be expensive on both the CPU and GPU. While this is usually not a problem for most modern gaming PCs, getting it to run smoothly on the Oculus Quest's mobile chipset was much trickier. In this article, we're going to cover some of the techniques we used on the CPU side to keep a smooth framerate across all devices.

Our approach to this problem focused on 3 main areas:

  • Ensure that the mesh splitting algorithms have enough CPU cycles to run smoothly.
  • Manage the high number of visual effects that are spawned from slicing.
  • Keep post-slice performance stutter-free.

Background threads

Splitting the enemy mesh is a pretty straightforward operation of calculating which edges are on which side of the slice, then generating new meshes with the correct edges. This is a lot of simple math so it is easily offloaded to a background worker thread. By putting these calculations on a background thread, we reduce the strain on the main CPU thread that could cause potential hiccups in the rendering cycle.
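To make the "which side of the slice" step concrete, here is a minimal, engine-agnostic sketch in Python. It is not the Ninja Legends implementation: the function names are made up for illustration, and it assumes the slice is described by a point on the plane plus a normal vector. It only covers the classification step; building the new meshes and capping the cut surface are left out.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def signed_distance(point: Vec3, plane_point: Vec3, plane_normal: Vec3) -> float:
    """Dot product of (point - plane_point) with the plane normal.

    Positive values are on the side the normal points to, negative on the other.
    """
    return sum((p - q) * n for p, q, n in zip(point, plane_point, plane_normal))


def classify_vertices(
    vertices: Sequence[Vec3], plane_point: Vec3, plane_normal: Vec3
) -> Tuple[List[int], List[int]]:
    """Return (front, back) lists of vertex indices relative to the slice plane."""
    front: List[int] = []
    back: List[int] = []
    for index, vertex in enumerate(vertices):
        if signed_distance(vertex, plane_point, plane_normal) >= 0.0:
            front.append(index)
        else:
            back.append(index)
    return front, back


def edge_crosses_plane(a: Vec3, b: Vec3, plane_point: Vec3, plane_normal: Vec3) -> bool:
    """An edge must be cut when its two endpoints land on different sides."""
    side_a = signed_distance(a, plane_point, plane_normal) >= 0.0
    side_b = signed_distance(b, plane_point, plane_normal) >= 0.0
    return side_a != side_b
```

Every function here is pure math on plain data, which is what makes this kind of work a good candidate for a worker thread: nothing touches the scene.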

One downside of using background threads is that you can no longer guarantee that an operation will complete in 1 frame. The asynchronous nature of multi-threaded programming means that it will likely take 2 or more frames to finish the work in the background and hand data back to the main thread. Another thing to keep in mind is that not all work can be done on a background thread - typically operations that affect the 3D scene itself must be executed on the main thread.
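The hand-off pattern described above can be sketched as follows. This is an illustrative Python version, not our shipping code: `BackgroundSlicer`, `request_split`, and `poll_finished` are hypothetical names, and `split_fn` stands in for whatever pure function does the mesh math. The worker never touches the scene; the main thread collects finished results once per frame and applies them itself.

```python
import queue
import threading


class BackgroundSlicer:
    """Runs split jobs on one worker thread and hands results back via a queue."""

    def __init__(self, split_fn):
        self._split_fn = split_fn          # pure function: (mesh, plane) -> result
        self._jobs = queue.Queue()         # main thread -> worker
        self._results = queue.Queue()      # worker -> main thread
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def request_split(self, job_id, mesh, plane):
        """Called from the main thread; returns immediately."""
        self._jobs.put((job_id, mesh, plane))

    def _run(self):
        while True:
            job = self._jobs.get()
            if job is None:                # sentinel posted by shutdown()
                break
            job_id, mesh, plane = job
            self._results.put((job_id, self._split_fn(mesh, plane)))

    def poll_finished(self):
        """Called once per frame on the main thread; never blocks."""
        finished = []
        while True:
            try:
                finished.append(self._results.get_nowait())
            except queue.Empty:
                return finished

    def shutdown(self):
        """Finish queued jobs, then stop the worker."""
        self._jobs.put(None)
        self._worker.join()
```

Because `poll_finished` may return nothing on the frame a job was requested, the calling code has to tolerate the result arriving a frame or more later, which is the trade-off described above.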

Preemptive CPU level ramping

The Oculus Quest OS allows the device to ramp up and down the CPU and GPU performance dynamically. We’ll focus on the CPU side of things, but similar behavior exists for the GPU as well. By ramping down the CPU whenever possible, the headset preserves battery and prevents overheating. The device will automatically detect when the game is straining at the current CPU level and ramp up the CPU level appropriately. The problem with leaving the decision up to the device is that it takes a few frames of heavy load to decide to increase the CPU levels.

The decision time required before the CPU is ramped up represents a potential framerate drop in our game! Since we know that slicing is stressful on the CPU, we ramp up the CPU manually whenever we detect that the player is about to slice an enemy. We max out the CPU level for half a second, then allow the default behavior to take over and reduce the CPU level naturally.

In the Unity Oculus SDK, this control is exposed via OVRManager.cpuLevel.
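The timing logic around that property is simple enough to sketch. The Python below is an assumption-laden illustration rather than SDK code: `set_level` is a hypothetical callback standing in for the write to the CPU level, and the `max_level` and `baseline_level` values are placeholders, not documented constants. The half-second hold comes from the description above.

```python
class CpuBoost:
    """Raises the CPU level on demand and releases it after a hold period."""

    def __init__(self, set_level, max_level=3, baseline_level=1, hold_seconds=0.5):
        self._set_level = set_level            # hypothetical hook into the engine
        self._max_level = max_level            # placeholder value
        self._baseline_level = baseline_level  # placeholder value
        self._hold_seconds = hold_seconds
        self._boost_until = 0.0
        self._active = False

    def request(self, now):
        """Call when a slice looks imminent; repeated calls extend the hold."""
        self._boost_until = now + self._hold_seconds
        if not self._active:
            self._active = True
            self._set_level(self._max_level)

    def update(self, now):
        """Call every frame; drops back to the baseline once the hold expires."""
        if self._active and now >= self._boost_until:
            self._active = False
            self._set_level(self._baseline_level)
```

Calling `request` again while a boost is active only pushes the deadline out, so a flurry of slices results in a single ramp up and a single ramp down rather than a level change on every swing.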

Object pooling for visual effects

When the player executes a slice, many visual effects trigger at that instant. The multiple effects work together to create a satisfying moment of feedback for the player. The problem is that instantiating these effects all at once is a very expensive operation. We’ve seen instantiation times of over 1ms per effect. At 72 fps (about 13.9ms per frame), each such effect eats at least 7% of the time you have with the CPU for that frame! In Ninja Legends, we mitigate this by using object pooling - maintaining a pool of commonly used effects in the scene, already instantiated but hidden until needed.
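A minimal object pool looks something like the Python sketch below. The `Effect` and `EffectPool` classes are hypothetical stand-ins, with an `active` flag playing the role of showing or hiding an object in the scene; a real pool would also reset particle state, position, and so on when an effect is reused.

```python
class Effect:
    """Stand-in for a visual effect; creating one is the expensive part."""

    def __init__(self, name):
        self.name = name
        self.active = False   # hidden until needed


class EffectPool:
    """Pays the instantiation cost up front and reuses effects afterwards."""

    def __init__(self, factory, size):
        self._factory = factory
        # Instantiate everything ahead of time, e.g. while a level loads.
        self._free = [factory() for _ in range(size)]

    def acquire(self):
        # Reuse a hidden effect if possible; grow the pool only as a fallback.
        effect = self._free.pop() if self._free else self._factory()
        effect.active = True
        return effect

    def release(self, effect):
        # Hide the effect and make it available for the next slice.
        effect.active = False
        self._free.append(effect)
```

The fallback in `acquire` still instantiates when the pool runs dry, so sizing the pool for the worst expected burst matters.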

Smearing visual effect activations across multiple frames

Even with object pooling, reactivating multiple objects in the same frame incurs some cost that really adds up. Our solution is to smear VFX activations across multiple frames. We built a system that can queue up low priority activations and dole out one activation per frame. Turning on 5 effects in one frame can exceed your budget. Turning on one effect per frame across 5 frames has much less of an impact. With this example, keep in mind that 5 frames at 72 fps is about 70ms - it will still appear instantaneous to the player.
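A queue like this takes very little code. The Python sketch below is illustrative only: `ActivationQueue`, `enqueue`, and `tick` are hypothetical names, and each queued item is just a callable that switches one effect on.

```python
from collections import deque


class ActivationQueue:
    """Spreads low-priority activations out, a fixed number per frame."""

    def __init__(self, per_frame=1):
        self._per_frame = per_frame
        self._pending = deque()

    def enqueue(self, activate):
        """Queue a zero-argument callable that turns one effect on."""
        self._pending.append(activate)

    def tick(self):
        """Call once per frame; runs at most per_frame activations, oldest first."""
        count = min(self._per_frame, len(self._pending))
        for _ in range(count):
            self._pending.popleft()()
        return count

    def __len__(self):
        return len(self._pending)
```

Effects that must appear on the exact frame of the hit can bypass the queue and activate immediately; only the low-priority ones need to wait their turn.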

Reduce physics interactions for the cut objects

Physics simulations in our game run on the CPU, so reducing the complexity of the simulation gives us more performance headroom to work with. Once the enemy ninja is cut, we only allow their physics colliders to interact with the floor. Body part vs. body part, body part vs. other enemy, and body part vs. player weapon interactions are no longer necessary after a cut, and they actually cause unwanted visual noise if left active. This lets us maintain a smooth framerate after the cut, which is important since the player is likely embroiled in an intense combat situation with multiple enemies and visual effects triggering everywhere.

See below for our Unity Physics collision matrix. Sliced enemies are assigned to the Chopped layer, which has no collision with anything except the Environment layer.
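The idea behind a layer collision matrix can be sketched in a few lines of Python. This is a toy model, not Unity's implementation: the `CollisionMatrix` class is hypothetical, and while Chopped and Environment are the layers named above, the Enemy and PlayerWeapon layer names are assumptions added for illustration.

```python
class CollisionMatrix:
    """Symmetric table of which layer pairs are allowed to collide."""

    def __init__(self):
        self._allowed = set()

    def allow(self, layer_a, layer_b):
        # frozenset makes (a, b) and (b, a) the same entry.
        self._allowed.add(frozenset((layer_a, layer_b)))

    def collides(self, layer_a, layer_b):
        return frozenset((layer_a, layer_b)) in self._allowed


def build_matrix():
    matrix = CollisionMatrix()
    # Intact enemies interact with the world and the player's weapon (assumed layers).
    matrix.allow("Enemy", "Environment")
    matrix.allow("Enemy", "PlayerWeapon")
    matrix.allow("Enemy", "Enemy")
    # Sliced pieces only ever hit the floor and walls.
    matrix.allow("Chopped", "Environment")
    return matrix
```

With a table like this, moving a sliced piece onto the Chopped layer is all it takes for the physics engine to skip every pair check except the one against the environment.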

Conclusion

With these techniques, we were able to create the ultimate ninja experience on the Oculus Quest. Although it took multiple technical and design iterations, the positive feedback we’ve gotten from players was well worth the effort. If you love slicing up enemies in Ninja Legends, or have any questions, find us on Twitter, Discord, or Instagram!