Occlusion Culling for Mobile VR - Part 2: Moving Cameras and Other Insights
Oculus Developer Blog | Posted by Darkwind Team | June 4, 2019

Hello! We at Darkwind Media worked hard to bring Gear VR and Go owners a chance to play Camouflaj’s Republique in VR. Along the way we had to explore every possible avenue to increase the performance of this graphics-heavy game, but none worked as well for us as the Dead Secret method of occlusion culling. This article is part two of a two-part series that showcases how we developed a custom occlusion culling solution tailored to fit our specific needs. In this article, we propose how the system can be expanded to handle a moving camera, while explaining some of the limitations and lessons we’ve learned along the way.


Extending the System

Until now we had been capturing occlusion from a single point, so we had to think through situations where the camera moves. We needed a system that could efficiently transition from one potentially visible set to another. In the very last scene of Republique, the player views the world through a hovering drone that moves along a fixed rail, essentially giving the camera 1D movement rather than 2D or 3D. That gave us the perfect opportunity to extend our occlusion culling to support some camera movement!

The first and most naive way to support a moving camera is to pick several spots along the 1D path and capture occlusion from those points, like before. But what if the camera ends up between two of these points? Generally, the shorter the distance between captures, the less likely the potentially visible set will change. Some might pack the 1D path with many, many capture points to shorten the distance between them. Unfortunately, that would end up eating more memory and require iterating over the whole list of renderers every time the camera moves.

So what if we started combining capture points into a single volume? Given two adjacent captures forming a line segment, we can’t know exactly where a renderer becomes occluded without taking another capture, but if we did that for every line segment, we’d end up taking an infinite number of captures!

Instead, the safest assumption is that any renderer seen by either capture is potentially visible whenever the camera is between those two points. We do the same for the next line segment, and the next, and so on. Notice that an individual capture is shared by both line segments it is attached to, so when the camera is exactly at a capture location, we can cull with either line segment and still guarantee that every renderer from the capture at that point is enabled.

...And since movement through these volumes is what we’re after, what if -- instead of storing whole lists for each volume -- we store just the change from the previous cell? We’ll need two lists: one for renderers to turn on and one for renderers to turn off. When moving in the opposite direction, the interpretation of the lists is reversed; renderers in the off-list should be turned on and renderers in the on-list should be turned off.

Formally, the list of renderers to turn on when crossing the boundary from cell A to cell B is the set of all renderers in B’s visibility set, but not A’s. This is called the relative complement of A in B.
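To make this concrete, here is a small Python sketch (the renderer names and capture data are made up for illustration): each segment's visibility set is the union of its two endpoint captures, and the on/off difference lists between adjacent segments fall out as plain set differences.

```python
# Hypothetical captures: each is the set of renderer IDs seen
# from one capture point along the 1D path.
captures = [
    {"wall", "crate", "door"},    # capture 0
    {"wall", "crate", "lamp"},    # capture 1
    {"wall", "lamp", "statue"},   # capture 2
]

# A segment between two adjacent captures is potentially visible
# to anything either endpoint saw: the union of the two sets.
segments = [a | b for a, b in zip(captures, captures[1:])]

# Crossing from segment A into segment B turns on B \ A (the
# relative complement of A in B) and turns off A \ B.
diffs = [(b - a, a - b) for a, b in zip(segments, segments[1:])]

for i, (on, off) in enumerate(diffs):
    print(f"segment {i} -> {i + 1}: on={sorted(on)}, off={sorted(off)}")
```

With this data, crossing from the first segment to the second turns on only the statue and turns off only the door; everything both segments share is left untouched.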

You can see how encoding these on-and-off difference lists can be much smaller than storing whole lists, especially in a real scene with hundreds or thousands of renderers and many more than 4 cells. It also greatly reduces the memory cost of making a lot of captures; the closer together the captures, the more likely they are to see the same set of objects, and therefore the more likely these difference lists will be small or even empty.

Perhaps even more importantly, if we remember which cell we occupied the previous frame, then even if the camera moves out of that cell within this frame, it’s not likely to have moved more than a few cells, and we just need to iterate over a few difference lists to update the occlusion culling. And that’s all it takes to build a simple 1D occlusion system!
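That update step can be sketched in Python like so (a hypothetical helper, not the actual Republique code): boundary `i` stores the `(on, off)` pair for crossing from cell `i` to cell `i + 1`, and moving backward just swaps the interpretation of the two lists.

```python
def move_camera(enabled, diffs, prev, curr):
    """Return the enabled-renderer set after moving from cell
    `prev` to cell `curr`, applying every boundary crossed."""
    enabled = set(enabled)
    if curr > prev:                             # moving forward
        for i in range(prev, curr):
            on, off = diffs[i]
            enabled |= on
            enabled -= off
    else:                                       # moving backward: swap lists
        for i in range(prev - 1, curr - 1, -1):
            on, off = diffs[i]
            enabled |= off
            enabled -= on
    return enabled

# One boundary between two cells: crossing forward enables the
# statue and disables the door (toy data).
diffs = [({"statue"}, {"door"})]
enabled = {"wall", "crate", "door", "lamp"}     # cell 0's full set
forward = move_camera(enabled, diffs, 0, 1)
back = move_camera(forward, diffs, 1, 0)        # round-trips to cell 0
```

Because each crossing is undone exactly by replaying its lists in reverse, moving back to a cell always reproduces that cell's original visibility set.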



Now that we have a working 1D occlusion culling system, it’s not too difficult to see how to extend the system to 2D or even 3D. To extend from 1D into 2D, you start with a grid of capture points and combine 4 of them into a rectangular cell. Each 2D cell will have 4 neighbors and therefore 4 sets of difference lists. To extend into 3D, you start with a 3D grid of capture points and combine 8 of them into a rectangular prism cell. Each 3D cell will have 6 neighbors and therefore 6 sets of difference lists.
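The 2D bookkeeping might look like this in Python (a toy layout; the grid indexing and function names are assumptions, not Republique's data structures): a cell's visibility set is the union of its 4 corner captures, and a difference-list pair is kept for each of its 4 edge-adjacent neighbors.

```python
def cell_visibility(captures, x, y):
    """Union of the 4 corner captures of the 2D cell at (x, y).
    captures[row][col] is the renderer-ID set seen from that grid point."""
    return (captures[y][x] | captures[y][x + 1]
            | captures[y + 1][x] | captures[y + 1][x + 1])

def neighbors_2d(x, y, w, h):
    """The edge-adjacent cells of (x, y) inside a w-by-h cell grid;
    each shared edge would carry its own (on, off) difference pair."""
    for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
        if 0 <= nx < w and 0 <= ny < h:
            yield nx, ny

# A 2-by-2 grid of capture points forms a single cell (0, 0):
grid = [[{1}, {2}],
        [{3}, {4}]]
```

The 3D case follows the same pattern with 8 corner captures per prism cell and 6 face-adjacent neighbors.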

Although no version of Republique currently uses such a system, if your game is targeting a 6DOF headset, you might need 2- or 3-dimensional occlusion culling.

Limitations & Lessons Learned

Every solution requires a tradeoff, and our solution is no exception. Here are the limitations of our system to remember when deciding whether or not our solution would work for your game.

Capture Resolution

For most of our game, we picked 512px as the capture size, which we found to be an acceptable balance between accuracy and calculation time in Republique. In theory, for perfect accuracy you should capture your game world at the same resolution it will appear on the player's screen. The default eye buffer size for Gear VR and Go is 1024px, each eye at roughly 90° FOV. On the other hand, if some renderers are missed at 512px, they must have been only 1 to 4 pixels wide in the final image, and at that size they would be prone to aliasing anyway. If your renderers are not too small, or you just want to be aggressive about culling small objects, you can probably get away with a lower resolution like 512px.

Static Geometry and Complex World States

In our system, we can only occlude static meshes that don’t move from their position during capture. Similarly, we are only occluding one state of the world. If your world has multiple distinct states featuring different renderers, you’ll have to do even more captures to occlude these other states.

Compatibility with other Visibility Systems

The visibility of some meshes may already be controlled by other game systems. In that case, you must either exclude those renderers from culling, or implement additional logic to combine the two forms of visibility. LODGroups, for example, control the visibility of renderers; you can either treat only LOD0 as capable of occluding (as Unity does) or treat each LOD level as a separate world state.

Higher Memory Consumption

Storing this occlusion data, especially in the naive form described above, takes memory and adds load time. Make sure you have the extra memory to spare.

Hierarchical Culling

In Republique, this fine-grained culling is merely the last layer on top of coarser room and area divisions. This speeds up the application of culling data to the scene during camera transitions because not every renderer in the area has to be touched, only the ones in rooms required by the current camera. Even if the savings are small in absolute terms, camera switches happen in such rapid succession during gameplay that the flow would have felt a lot worse without them.

Reauthoring Occlusion Data

When static renderers change in any way -- get moved, deleted, added, anything -- the occlusion data becomes stale and can cause visual errors. You must therefore recalculate occlusion any time the scene is modified, and depending on your level of automation, that can slow your workflow down a bit.

Realtime Shadowcasting

If you have any kind of realtime shadows, beware: objects culled by this system cannot cast realtime shadows. Use an alternative to fully-realtime shadows such as baked shadowmasks.

Optimizing the Authoring Process

The GPU is super fast at rendering our colorized scene, but reading all 6x512x512 pixels one at a time on the CPU is much slower. Here is just a sample of the many things you can do to speed this up:

  • Split the work into threads
  • Use a cheaper-to-compute hashcode
  • Cache data structures and initialize them with reasonable sizes, etc.

As an example, Republique can process the brig/power station area (requiring 385 cubemaps) in less than 25 seconds.
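For illustration, here is a toy Python sketch of splitting the pixel scan across workers (the data layout and names are hypothetical; in practice the real wins come from native batched readback and cheap hashing, since Python threads alone won't speed up pure-Python loops):

```python
from concurrent.futures import ThreadPoolExecutor

def unique_ids(pixels, workers=4):
    """Collect the set of unique renderer IDs from a captured face.
    pixels: list of rows, each row a list of renderer-ID ints."""
    def scan(rows):
        seen = set()
        for row in rows:
            seen.update(row)            # batch-insert a whole row at once
        return seen

    # Split the rows into roughly equal chunks, one per worker.
    chunk = max(1, len(pixels) // workers)
    chunks = [pixels[i:i + chunk] for i in range(0, len(pixels), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(scan, chunks)

    result = set()
    for s in partials:                  # merge the per-worker sets
        result |= s
    return result
```

The structure carries over directly to a compiled language, where the per-chunk scans genuinely run in parallel.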

3DOF Head Movement

We’ve only calculated visibility for a single point. 3DOF VR doesn’t let the user physically move through the virtual world, but Oculus Go and Gear VR do try to predict how much the user’s head and neck have moved based on the orientation of the HMD, and that does give the game camera a little bit of movement. We’ve said before that not much will change within a few centimeters, but if the camera is very close to a corner, leaning your head might make the difference.

We can mitigate this by moving the camera's capture location either slightly further away from the corner, or past the edge of the corner so the capture sees around it and the object is never culled in the first place.




That’s all for now. For even more information about writing your own occlusion culling system, don’t forget to check out Chris Pruett’s original post on the subject. He makes some great points about extra captures within larger cells, hand-authored occlusion “islands”, and non-uniform cell sizes.

We hope this has inspired you to get creative with performance optimizations in your own games. We’d love to hear from you, so be sure to reach out and follow Darkwind on Twitter and Facebook where we post about some of our other projects, as well as the occasional game jam entry or developer blog. Thanks for reading, and good luck with your next mobile VR build!

- The Darkwind Team