This is the second article in a series reviewing new functionality in the Audio SDK. The following post covers volumetric sounds and the various usage patterns for achieving better presence through sound design.
* * *
Spatial sound using HRTF is great at representing point sources; this is what it was designed to do, and you can get a long way with point sources alone. In the real world, however, not all sounds originate from a single point in space, and sometimes you want the sound to come from a larger volume. This is especially true when there is a large object or character near the listener. In these cases, using a point source can sound unnatural, as if the sound is coming only from the center of the object.
There are ways to work around the point-source problem, each with its own advantages and disadvantages. A common approach is to mix in some of the original signal to reduce the directionality; this is often referred to as “spread” or “2D blend”. The approach comes from traditional game audio, where spatialization is achieved with panning, so mixing in the original signal simply softens harsh panning to achieve envelopment. It doesn't work as well in VR with HRTF-based spatialization. 3D spatialization with HRTF includes interaural time difference (ITD), a per-ear delay applied to the signal. When that delayed, spatialized signal is mixed with the original input signal, the delay can introduce comb-filtering artifacts.
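The comb-filtering artifact can be seen directly in a small numerical sketch. Mixing a signal with a delayed copy of itself is equivalent to a filter with the two-tap impulse response below; its magnitude response has deep periodic notches. The delay value here is an arbitrary stand-in for an ITD, not a value taken from the SDK.

```python
import numpy as np

# Mixing a signal with a delayed copy of itself acts as a comb filter.
# D is an illustrative delay standing in for a per-ear ITD.
N = 64          # FFT size
D = 8           # delay in samples (hypothetical)

h = np.zeros(N)
h[0] = 1.0      # original (unspatialized) path
h[D] = 1.0      # delayed (spatialized) path

H = np.abs(np.fft.rfft(h))

# |H| = |1 + e^{-j*w*D}| = 2|cos(w*D/2)|: peaks of 2, notches near 0.
print(round(H[0], 3))            # DC: paths add constructively -> 2.0
print(round(H[N // (2 * D)], 3)) # first notch: paths cancel -> 0.0
```

The notches repeat across the spectrum at intervals set by the delay, which is what gives the mixed signal its characteristic hollow coloration.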
An alternative approach is to place many point sources to spread the sound out. This can work reasonably well, but it eats into the voice-count budget, and you need to find the right balance for the number of voices. There are also potential phase-coherence issues if each of the sources produces the exact same signal.
To represent a larger sound, Oculus Research developed a process that computes the source's projection onto the listener based on distance and radius, constructing the spatialization filter at run time. When the sound is very far away, the projection is small and it sounds like a point source. When the sound is close to the listener, the projection is larger and the sound is more spread out.
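The article doesn't give the exact formula, but the underlying geometry is straightforward: a sphere of radius r seen from distance d subtends a half-angle of asin(r/d). The function name and the handling of the inside-the-volume case below are illustrative assumptions, not the SDK's actual API.

```python
import math

def projected_half_angle(radius, distance):
    """Half-angle (radians) subtended by a sphere of the given radius at
    the given listener distance. Hypothetical helper for illustration;
    not the SDK's actual API."""
    if distance <= radius:
        return math.pi  # listener inside the source: full envelopment
    return math.asin(radius / distance)

# Far away the projection shrinks toward a point source...
print(round(projected_half_angle(1.0, 20.0), 4))  # ~0.05 rad
# ...and near the surface it approaches a hemisphere and beyond.
print(round(projected_half_angle(1.0, 1.001), 4))
```

As the listener crosses into the volume, the projection covers the whole sphere, matching the enveloping, surround-like behavior described below.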
This provides a physically correct and high-performance way to represent large sound sources. When the listener is inside the radius of the sound source, they are enveloped by it. The sound smoothly blends from spatial to surround as the listener approaches the center of the source.
This new technique is akin to work done in the graphics community on the problems of global illumination and light transport. The main idea is to represent smooth functions over a sphere, such as diffuse lighting, using a compact basis like spherical harmonics. This basis representation has dual benefits: (a) it saves memory, and (b) it simplifies complex integral expressions into simple dot products, saving computation. This has enabled real-time computation of diffuse lighting and global illumination for large, complex virtual environments.
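The "integrals become dot products" property can be verified numerically. The sketch below hand-codes the real spherical harmonics up to order 1 (values chosen arbitrarily for the demo) and checks that the surface integral of a product of two band-limited functions equals the dot product of their coefficient vectors; this is a generic property of orthonormal bases, not SDK code.

```python
import numpy as np

def sh_basis(x, y, z):
    # Real spherical harmonics for l=0 and l=1, orthonormal on the sphere.
    c0 = np.sqrt(1.0 / (4.0 * np.pi))
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    return np.stack([c0 * np.ones_like(x), c1 * y, c1 * z, c1 * x])

# Midpoint quadrature grid over the sphere.
n_theta, n_phi = 400, 800
theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta
phi = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi
T, P = np.meshgrid(theta, phi, indexing="ij")
x, y, z = np.sin(T) * np.cos(P), np.sin(T) * np.sin(P), np.cos(T)
Y = sh_basis(x, y, z)

# Two band-limited spherical functions given by arbitrary SH coefficients.
f_coef = np.array([0.7, -0.2, 0.5, 0.1])
g_coef = np.array([0.3, 0.4, -0.6, 0.2])
f = np.tensordot(f_coef, Y, axes=1)
g = np.tensordot(g_coef, Y, axes=1)

# The surface integral of f*g collapses to a dot product of coefficients.
cell = (np.pi / n_theta) * (2.0 * np.pi / n_phi)
integral = np.sum(f * g * np.sin(T)) * cell
print(abs(integral - f_coef @ g_coef) < 1e-3)  # prints: True
```

This is exactly why the basis representation is cheap at run time: evaluating the integral over the sphere reduces to one multiply-add per coefficient.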
The underpinning technology here is spherical harmonics, which is also the basis of ambisonics, but for volumetric sources it is applied differently. The HRTF filter is stored in the spherical harmonic domain, which allows us to calculate the filter for a point source or for any shaped projection onto the listener sphere. The initial implementation supports spherical volumes; this is achieved by projecting a circle onto the listener sphere to calculate the HRTF filter.
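A circle projected onto the listener sphere is a spherical cap, and because a cap is axisymmetric its spherical harmonic expansion only needs the zonal (m=0) terms. The sketch below computes those coefficients by quadrature, using hypothetical function names; it illustrates the general idea, not the SDK's actual implementation. A tiny cap (distant source) needs many orders to represent, while a cap covering the whole sphere collapses to the l=0 (direction-free) term.

```python
import numpy as np
from numpy.polynomial import legendre

def cap_zonal_coeffs(alpha, max_order=8, n=2000):
    """Zonal SH coefficients of a spherical cap of half-angle alpha
    (1 inside the cap, 0 outside), computed by midpoint quadrature.
    Illustrative sketch; not the SDK's actual implementation."""
    theta = (np.arange(n) + 0.5) * np.pi / n
    f = (theta <= alpha).astype(float)
    coeffs = []
    for l in range(max_order + 1):
        # Normalized zonal harmonic: sqrt((2l+1)/(4*pi)) * P_l(cos theta)
        Pl = legendre.legval(np.cos(theta), [0] * l + [1])
        Yl = np.sqrt((2 * l + 1) / (4.0 * np.pi)) * Pl
        coeffs.append(2.0 * np.pi * np.sum(f * Yl * np.sin(theta)) * np.pi / n)
    return np.array(coeffs)

# A tiny cap (distant source) spreads energy across many orders...
tiny = cap_zonal_coeffs(0.1)
print(np.sum(tiny[1:] ** 2) / np.sum(tiny ** 2) > 0.5)   # higher orders dominate
# ...while a cap covering the whole sphere keeps only the l=0 term,
# i.e. the sound becomes direction-free surround.
full = cap_zonal_coeffs(np.pi)
print(np.sum(full[1:] ** 2) / np.sum(full ** 2) < 1e-4)  # essentially all DC
```

This matches the behavior described earlier: a distant source stays sharply directional, while a source enclosing the listener smoothly turns into envelopment.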
Volumetric sources allow sound designers to model objects large and small; it is essentially spread done right for spatial audio. One thing that is important to emphasize is that volumetric sources are designed only to spread the sound out; they are not intended to provide a sense of scale. Making things sound “big” is still the job of the sound designer, with elements such as reverb balance and the dynamics of the mix. Volumetric is an additional tool in the tool-belt for getting the right mix.
* * *
Check out our previous article covering our approach to Near-Field Spatialization and how it can be used in VR experiences.