We occasionally post insights from the Oculus Developer community so that VR’s early pioneers can share their practices and help push the industry forward. Today, we feature Nick Pittom, a VR developer and founder of the VR production studio Fire Panda - creators of the episodic VR story "Dispatch".
Here Be Dragons got in contact with Fire Panda last year in order to develop ‘Dispatch’, working with the writer and director Ed Robles and overseen by Oculus Studios. We put it together over a 7 month period. This is a look back at the production, some of the steps we took to develop it, lessons learned and really anything else that I think is useful.
When Ed and Here Be Dragons got in touch with us they had some concept art and a basic prototype, but wanted to take this further. We would carry our new development tests all the way through to the final piece. This quickly moved into developing five episodes, later edited down to four for narrative and pacing reasons. Sometimes it’s better to leave things unseen! Indeed there would be script and story tweaks throughout the production, which I think is the natural process in developing a story in the way that we did.
Working with Here Be Dragons was a great experience, and one I thoroughly recommend! With them on production duties Fire Panda could focus on the artistic and technical side, which is the part I personally find most enjoyable. Susan Applegate as producer was especially lovely to work with and kept everything running very smoothly.
Dispatch is the story of Ted, a 911 dispatch operator, and a fateful night in his life. The concept for the VR experience would be that the visual elements are generated from what Ted is seeing: we are in his imagination. What we see is how Ted experiences the calls, and we only experience the story from what the audio of these calls reveals. Ed brought the idea that Ted’s imagination space was informed by the world he lived in – of screens and information and his isolation within that. Illustrated by early concept art, this began as a starkly lit, ghostly, wire-framed style. The task would be to develop this into real-time visuals in the VR experience.
We began developing a number of art and technology tests to assess which would have the best results. Imagery that works well in a still concept may not work so well in motion, either creatively or technically. With this in mind, ‘wireframe’ was not in and of itself the end goal of the artistic direction. Ed had some particular artistic ideas he wanted to pursue. The task at hand was both an artistic and a technical one.
We developed Dispatch within Unity3D, a real-time game engine. Unity3D has been especially good with its VR support and it’s become a natural platform for many to build their VR experiences within, not least because it makes switching platforms incredibly simple. (For anyone new to Unity3D you may find some useful information or tips in this link HERE).
Developing the visual style took in a number of potential approaches, exploring wireframe shaders, physical wireframe models and also textures.
With Gear VR as a target we had a specific challenge on our hands. The experience was to be narratively and artistically driven, yet we were beholden to the demands of a mobile platform. These are not inconsiderable: there are limits on vertex counts, on memory, and on the number of objects animating at once.
The legend that is John Carmack wrote an article that I have found incredibly useful in the past regarding optimal Gear VR development (which you can see HERE). It goes into really great detail about what does and does not make sense when developing for Gear VR. Improvements have been made within Unity3D and the OVR software since, but the information above still stands.
The number of verts (or polys) you can use is actually pretty generous and not a significant issue; the number of individual objects, however, can begin to be. Objects and particles (as well as other things) each create a draw call. These can be reduced with ‘batching’: identical particle systems will batch together, and ‘static’ objects will do so as well. It is certainly an area that has to be approached with great care.
However, in developing Dispatch it would be something altogether different that would cause us the most problems: Fill Rate, or specifically Overdraw (you can learn more about optimisation in this document HERE). Our approach leaned towards a great deal of transparent elements, and in our early tests we discovered that even a couple of large transparent planes would begin to cause Gear VR performance issues. We would need to take special care to ensure we did not have a significant number of semi-transparent objects existing together in view.
Yet aside from optimisation for performance there are visual fidelity issues. Our entire aesthetic leaned towards a wireframe-led approach, which in testing very clearly became an issue. Thin, sharp lines simply break up into horrible aliasing. Carmack himself has said the best way to avoid this problem is simply to avoid these sorts of visuals. This became extremely evident when testing wireframe shaders, all of which would break down into shimmering ugliness. Thin geometry did much the same, breaking into shimmering jagged lines at any significant distance.
While certain approaches looked bad, others held up remarkably well. Wireframe textures would not shimmer at all, and while they could often look quite muddy, with careful application they looked very nice. Higher-resolution textures with careful UV placement can go a long way. Semi-transparent geometry at a lower brightness and textured line renderers also appeared to avoid image-quality issues.
Richard Tongeman and James Kearsley were our 3D artists, and as a team we developed each of these approaches. Once settled on a final approach they could begin work creating each character and prop with a uniform style in mind. The amount of great work they were able to put out during the project was awesome, not only in modelling and animation, but in some of the other areas I discuss later.
Yet while we had an approach to the modelling and wireframe style effect we needed to push past this. We wanted to develop a visual language that could be dynamic. In Ted’s imagination, images would need to become more ‘defined’ when Ted focuses on them and break apart when he does not. An obvious place where this is relevant is when characters are speaking or their actions are audible.
Particles were a good basis to begin with but they didn’t really feel ‘connected’ enough. They felt separate from the characters.
It was in discovering how to generate a ‘Plexus’ that we were able to put the final piece in place. By taking a particle system we are able to connect different points within a set distance to each other via a line renderer. We have a great deal of control over this, from the number of lines, to the number of connections, to the distance of connection.
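The actual effect was built with Unity3D particle systems and line renderers, but the core idea – connect any two points that sit within a set distance, with a cap on how many connections each point takes – can be sketched in a few lines. This Python sketch is purely illustrative; the function name and the connection cap are my own, not the production code.

```python
import itertools
import math

def plexus_lines(points, max_dist, max_connections=4):
    """Return index pairs of points close enough to join with a line.

    points: list of (x, y, z) particle positions.
    max_dist: only points within this distance get connected.
    max_connections: cap per point, which is what keeps the effect
    from becoming a solid tangle (and keeps line counts bounded).
    """
    connections = {i: 0 for i in range(len(points))}
    lines = []
    for i, j in itertools.combinations(range(len(points)), 2):
        if connections[i] >= max_connections or connections[j] >= max_connections:
            continue
        dx = points[i][0] - points[j][0]
        dy = points[i][1] - points[j][1]
        dz = points[i][2] - points[j][2]
        if math.sqrt(dx * dx + dy * dy + dz * dz) <= max_dist:
            lines.append((i, j))
            connections[i] += 1
            connections[j] += 1
    return lines
```

Recomputing this every frame as the particles drift is what gives the effect its constantly shifting, hypnotic quality.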
Not only do they provide a dynamic and hypnotising visual effect, but they also blend with the character’s mesh. A character’s presence in a scene when not speaking becomes a tangled mass, with an underlying motion just about recognisably human.
The impact on performance was barely noticeable, allowing us to have nearly as many as we wanted. Indeed after initial tests we discovered that as long as the systems used identical materials they would batch together, allowing for near unlimited amounts.
These effects and approaches would be applied to all aspects of the visual design, with certain spot effects and styles used for specific moments.
With all the elements in hand it allowed Ed to craft the scenes and action as he wished. Indeed, while technical hurdles were overcome we discovered that it often became a case of ‘less is more’.
As with the characters, the locations themselves went through an iterative design process, starting with the requirements of the story. Once gathered, the rooms and locations were arranged and rearranged as we worked out the blocking of each shot and what would look good.
As scenes came together Ed provided further concept art to push the art style and explore creative options. This would form part of the iterative creative approach throughout the production.
Building the Story
Unity3D has recently released a fantastic tool called ‘Timeline’, which allows you to sync together different animations from different objects, animate those together, turn objects on and off and really get a significant step towards a video editing timeline that some may be familiar with. I myself come from an editing background and so this is fundamentally appealing to me. Unfortunately this tool was not available to us at the beginning of the project and so a new approach was required.
Happily Jason was familiar with a plugin called Koreographer, which is primarily for audio-reactive development – visualisers for music and the like. It also uses a timeline- and keyframe-based approach. As Dispatch was to be audio led, that meant we could take the audio track and ensure that different animations and events were synced together. We would also be able to skip along the timeline to cue up moments as we tested. It’s not as integrated into Unity3D as the new Timeline, but it provided us with a great analogue.
Each keyframe on the timeline would fire an event. We would be able to use these to not only trigger animations, but also to provide the foundation for audio reactivity.
The Event system Jason built would take in the audio events, and we could then globally distribute them to individual elements. To keep this coherent, ‘Actors’ were created. The Actor would control everything about a specific object, be it a character, a piece of scenery or a moving car. If we wanted an object to fade up, its Actor would control that, along with the colours it faded to and the speed at which it did so.
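Our Actors were components inside Unity3D, but the shape of the system – keyframes fire named events, a global bus fans them out, and each Actor decides what to do with them – can be sketched like this. The class names and the `fade_up` event are hypothetical, for illustration only.

```python
class EventBus:
    """Global distributor: keyframe events fan out to registered handlers."""

    def __init__(self):
        self.listeners = {}

    def subscribe(self, event_name, handler):
        self.listeners.setdefault(event_name, []).append(handler)

    def fire(self, event_name, payload=None):
        # Every keyframe on the audio timeline would call this.
        for handler in self.listeners.get(event_name, []):
            handler(payload)


class Actor:
    """Controls everything about one object: visibility, colour, speed..."""

    def __init__(self, name, bus):
        self.name = name
        self.visible = False
        bus.subscribe(self.name + ".fade_up", self.fade_up)

    def fade_up(self, payload):
        # In the real project this would tween alpha/colour over time.
        self.visible = True
```

The benefit of routing everything through one bus is exactly the re-timing flexibility described above: moving a keyframe on the audio moves every downstream reaction with it.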
One of the key effects we wanted to pursue was audio reactivity, so that the character voices would appear to have an effect on the world. The foundation of this was in the character models themselves, where the wireframe texture would fade up and then off with each syllable, and also swap out to a new texture. While you may not see the entire figure in any one texture, the way they flash builds up an image of the character. When a character is not speaking we drop back to the particle and plexus approach. Together it creates an abstracted character, but one coherent enough to interpret its movements.
Locations would flicker in the same way, and as we are able to specify the intention of each event we could ensure angry voices had a particular effect that was different from quieter voices.
Some things were not simply triggered by the audio keyframes but by other actions, such as the position of a camera, progress in animations and so on. These would then also trigger as desired, with all events feeding into the global system.
There are benefits to this approach for re-timing and constructing the narrative. If changes were made to the audio we would simply need to ensure the keyframes would align as required and then everything would play out as it should. This systematic approach also drew benefits in other places.
During one sequence Ted is composing an email to the Sheriff and the viewer stands on what is revealed to be an enormous representation of his keyboard. Rather than animating a sequence we created a ‘virtual keyboard’ such that keys that were pressed would write to the screen. We then created a copy of the text being written and using audio events of a real keyboard we time the pressing of the keys on the virtual keyboard. This subsequently writes to the screen. We now have complete freedom to change the copy, the tempo of the writing, the visual keyboard or the screen itself and all work in isolation. Indeed the script and keyboard visuals changed more than once, but these alterations did not impact upon the schedule in any significant way.
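As a rough sketch of that idea (in illustrative Python rather than the project's actual implementation): the copy text and the keypress timing live in separate systems, so either can change without touching the other.

```python
class VirtualKeyboard:
    """Audio keypress events drive typing of independently editable copy.

    The copy text can be rewritten, and the audio events re-timed,
    without either side knowing about the other.
    """

    def __init__(self, copy_text):
        self.copy = copy_text
        self.cursor = 0
        self.screen = ""

    def on_keypress_event(self):
        # Fired once per audio event of a real key press; each event
        # 'presses' the virtual key and writes the next character.
        if self.cursor < len(self.copy):
            self.screen += self.copy[self.cursor]
            self.cursor += 1
```

Change the copy, and the same stream of audio events types the new text; change the audio, and the same copy types out at the new tempo.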
These sorts of art-driven code-based systems bring benefits beyond pure animation and allow for a great deal of flexibility in how the art is visualised.
Directing the Viewer
As with the art style the approach to constructing and telling the story was also iterative. Ed provided a script which we broke down scene by scene into a plan which would identify the visual elements, camera motions, transitions etc. From here we could build our approach to how the viewer would experience each scene.
For each scene Ed provided a storyboard that identified where the viewer’s focus would be, along with his thoughts of scene blocking, viewer position and scene layout.
There’s a great deal of satisfaction in working out how viewers will experience and engage with the story. Which direction will they look in, what visual elements do we want to be the focus? What do we keep? What do we remove? How will the scale of these elements affect the intention of the story at this stage?
The way we approached the visual style leads to interesting possibilities, allowing us to mislead the viewer or reveal story information in a different way. Everything is based in Ted’s imagination and his interpretation of the scene is what we see and this was a creative opportunity that Ed wanted to explore as much as possible.
Indeed the first scene follows a call where a teenager appears to be lost in the woods, scared there is something after him. The trees build up around the teenager – forming as Ted pictures the location in his mind. As we look behind we see the monstrous, ambiguous ‘something’ that is following. Yet as the tension peaks we reveal it was all just a prank. The scene falls apart and we are left with teenagers sitting in a basement, a horror movie playing on the TV. We can move through all these elements naturally within the space of the VR, including scale changes.
Ed’s intention here was that scale indicated how intently Ted was focused on a scene – how much emotional investment he has. When we are with the teen in the forest Ted is intent on ensuring the caller is safe, but as it falls back to a prank call the visual elements shrink down in view. The scene now takes place in miniature, within the frame of one of the monitors he sits in front of. He’s no longer emotionally engaged with it.
Using scale in this way gives us interesting ways to engage emotionally, and also narratively, framing additional scenes. A second monitor appears as Ted calls for an officer to go visit the teenagers and inform their parents of their crime. We do this again with Ted speaking to Ray at a crime scene, overseeing a miniature scene at first, before diving into the scene as Ted realises he knows the victim from a previous call.
Framing the scenes is important, whether we are inside them or not. Much as with film, we want to lead the viewer towards important story details and events. In another scene we start on a map screen identifying the location of a squad car as Ted calls the Sheriff. We then smoothly move through the map and down to the car itself as it heads to the crime scene. Finally, as Ted begins to argue with the Sheriff, the scene begins to scale up, growing until we are within it.
As well as moving between scenes we also want to lead the viewer around scenes. We do this both by literally moving the viewer, and by visual and audio cues. In one scene we join Trevor as he calls Ted reporting that a dangerous man has arrived at his house. The viewer-as-camera is physically moved at times, but at others we move the characters around the viewer, causing them to follow the action around. At times there is more than one direction they can look in, but the important thing is to ensure that no matter which direction they do look there is something of importance to see, and that if we want them to see a specific event we are using visual and audio cues to make that happen.
One thing I’ve always been fascinated by is the use of the viewer’s look direction to trigger events. In one simple case we use look cues to trigger audio within the first scene in the forest. The viewer hears a sound and looks back to see the monstrous ‘something’ chasing the teen, triggering the creature’s moan. We can go further, triggering animations and events at certain points. In Dispatch we made a conscious choice to approach this with a light touch. Indeed, while interaction was explored it was kept to a minimum to allow the intent and authorship of Ed’s story to drive the experience.
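A common way to implement a look trigger like this is to compare the viewer's gaze direction against the direction to the target and fire once the angle between them is small enough. A minimal sketch, with a threshold I've chosen arbitrarily for illustration:

```python
import math

def gaze_triggered(look_dir, viewer_pos, target_pos, threshold_deg=15.0):
    """True when the gaze is within threshold_deg of the target direction."""
    to_target = [t - v for t, v in zip(target_pos, viewer_pos)]

    def normalised(v):
        mag = math.sqrt(sum(c * c for c in v))
        return [c / mag for c in v]

    a, b = normalised(look_dir), normalised(to_target)
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to guard against floating-point drift outside [-1, 1].
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot))))
    return angle <= threshold_deg
```

In practice you would latch the trigger so it fires once, rather than every frame the gaze lingers on the target.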
In telling this story the art is in many ways informed by what is possible within the technology at hand, in this case a game engine. Yet there are benefits to being in a real time system, and the art can instead drive the use of the technology.
A large part of Episode Three takes the form of an extended car chase, where we join a character inside their car as they attempt to get away from their assailant. Ed’s direction for the scene would be that we are within the car with Gloria, who is being chased. We needed to feel present in the scene and see it all as if we are also taking part.
As well as passing cars we would need to include streets, street lights, buildings and so on. Yet this is far from simple. If we hand-animated a street we would be locking ourselves into a timing or structure that might later alter. What if we needed to add 30 seconds in the middle of the car chase? What if we wanted to add a new action beat?
There was also the visual style, with buildings and cars fading in and out as they passed, creating a dreamy effect. Manually animating this would be time consuming and again risk locking ourselves into timing.
It made sense instead to approach it systematically. For this Jason came up with a ‘Street’ system. Each element we were going to pass would be treated as an Actor, fading up and down as it entered and exited. This would also control any particle systems, lens flares or sound effects (cars passing and so on).
These would then be organised into ‘Lanes’, with each Lane able to spawn from a list of chosen Actors. We could assign a Lane for street lights, another for buildings close to us, another for larger buildings further away and anything in between. We also had the ability to control spawn rates, delays, the distance at which Actors begin and the duration they appear for.
As the scene progressed we would pass through different parts of the city, from residential, to commercial and ending up in an industrial area. To manage this transition Lanes would be organised into ‘Districts’. For each District the Lanes would have their own spawner, so we could control the variety and placement for each Actor as the scene progresses. Each district can control the speed at which the scene passes and we can also control that manually.
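In very rough terms (illustrative Python; the production system was built in Unity3D and was rather richer), a Lane is a timed spawner drawing from a list of prefabs, and a District is a group of Lanes with its own scroll speed:

```python
import random

class Lane:
    """Spawns Actors from its prefab list at a configurable interval."""

    def __init__(self, prefabs, spawn_interval):
        self.prefabs = prefabs
        self.spawn_interval = spawn_interval
        self.timer = 0.0
        self.paused = False  # 'Feature' actors (junctions etc.) pause lanes

    def update(self, dt, rng=random):
        if self.paused:
            return None
        self.timer += dt
        if self.timer >= self.spawn_interval:
            self.timer -= self.spawn_interval
            return rng.choice(self.prefabs)
        return None


class District:
    """Groups Lanes so each city area gets its own spawn mix and speed."""

    def __init__(self, lanes, scroll_speed):
        self.lanes = lanes
        self.scroll_speed = scroll_speed

    def update(self, dt):
        # Collect whatever each lane spawned this tick.
        return [p for lane in self.lanes if (p := lane.update(dt)) is not None]
```

Swapping Districts as the chase progresses is then just a matter of handing control to a different set of Lanes, which is what let the city shift from residential to commercial to industrial without any hand animation.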
Richard and James created a large collection of buildings, street furniture and other elements. These objects were then styled, prepared and made into Actor prefabs.
Junctions, bridges and other features would also be required. These would form one off ‘Feature’ actors. When spawned they would pause all other lanes and then automatically restart them. This would include animations, such as a car screeching to a halt as we blow through a red light.
The car we are in can swerve around the street, with the camera able to detach from it as needed. Corners can be added, pausing the lanes and restarting them as we need, while animating the camera and car around these manually.
These system based approaches allow for an incredible amount of flexibility, artistic control and variety. It’s also important when we come to consider optimisation as we can reduce the amount that appears on screen at any one time.
Object pools are essential in the drive for performance, especially on Gear VR, as spawning or instantiating objects on the fly can cause performance dips. To avoid this we preload a number of objects and keep them ‘pooled’ in the scene. They can then be used when needed and returned to the pool when we are finished with them.
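A minimal pool looks something like this – illustrative Python rather than the Unity3D implementation we actually used, but the shape is the same:

```python
class ObjectPool:
    """Preload objects up front; reuse them rather than instantiate at runtime."""

    def __init__(self, factory, size):
        # Pay the allocation cost once, at load time, not mid-scene.
        self.free = [factory() for _ in range(size)]
        self.in_use = []

    def acquire(self):
        if not self.free:
            return None  # pool exhausted; real code might grow or recycle
        obj = self.free.pop()
        self.in_use.append(obj)
        return obj

    def release(self, obj):
        # 'Despawning' just deactivates and returns the object to the pool.
        self.in_use.remove(obj)
        self.free.append(obj)
```

The Street system's Lanes would acquire building and car Actors from pools like this as they entered view and release them as they passed behind the viewer.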
In other areas it was not purely down to optimisation, but simply what the hardware is capable of. We have a moment towards the end of Episode One where Franklin bursts through the door, shattering it into dozens of pieces. Simulating this live is not practical, as there are too many objects and mobile hardware is not necessarily best suited to it.
However, what we can do is simulate these complicated moments outside of Unity3D and then import them into the project. One such method uses ‘point clouds’, where each frame of the animation is essentially a new mesh that we flip through. This ran extremely well and looked great; however, once we began adding more into a scene we began to encounter prohibitive loading times.
Blendshapes however do not incur the same load times. We could blend between a reduced number of simulated frames and still retain much of the same effect.
For some elements this was ideal. Yet the door shatter involved so many individual and tiny pieces that morphs did not quite work as well, so the simulation was retained in full.
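The underlying blend is just a linear interpolation of vertex positions between cached simulation frames, which is why a reduced set of frames can still read as smooth motion. A sketch:

```python
def blend_frames(frame_a, frame_b, t):
    """Linearly interpolate vertex positions between two cached sim frames.

    frame_a, frame_b: lists of (x, y, z) vertices with matching order.
    t: blend factor, 0.0 = frame_a, 1.0 = frame_b.
    """
    return [tuple(a + (b - a) * t for a, b in zip(va, vb))
            for va, vb in zip(frame_a, frame_b)]
```

This works well when vertices travel roughly straight between frames; the door shatter failed that assumption, with tiny pieces tumbling between frames, which is why that one kept the full simulation.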
Animation and Mocap
We decided early on that motion capture would be used for the production. This was for artistic as well as practical reasons. Hand animation of course takes longer and realistically we did not have a production schedule to support it, however with such an abstracted art style it made sense that the core of the characters should be driven by something very real and human.
There are many mocap options, with varying degrees of cost, post-cleanup and shooting requirements. One option is Perception Neuron, which is extremely cost effective. Fire Panda already owns two Perception Neuron suits, and I had used the system previously on the Apollo 11 VR experience (produced by Immersive VR Education). Considering the cost benefits and our previous experience it became the right choice for the production. The system also has the benefit of being set up in a way that allows it to export useful files almost immediately, which was beneficial from a production time-scale point of view.
However, the system is not without its interesting quirks. The ‘neuron’ sensors are relative in nature, not absolute. They see where they are in relation to other sensors, and a skeleton is built from this. This does lead to some potential inaccuracies in the result, which manifest most at the contact points – feet and hands mostly – but also when an actor lies, kneels or otherwise interacts with the floor.
To get the best results from the system, Here Be Dragons organised the shoot with Ari Karczag at his specialised mocap setup, ‘Fonco Studios’. The studio is set up for a variety of solutions, and Ari is the senior US representative for Noitom Studios, makers of the Neuron suit. His expertise and his team were on hand for the duration and absolutely invaluable in ensuring the shoot went swiftly, smoothly and to schedule. It also ensured that we got the best possible results from the shoot with the least possible clean-up required.
The mocap performance team was led by Richard Dorton, a mocap performer and expert with decades of stunt experience. Performances were then provided by actors Tyler King and Jessica Troiani. All three provided excellent material for use in the experience and it was greatly satisfying to see their work transposed onto the characters in the story.
It was not uncommon for individual Neuron sensors to lose tracking, or for the suits to pop a connector. Ari, along with assistant Tk Gorgonia, was incredibly efficient in ensuring sensors were realigned, suits were correctly fitted and everything ran as smoothly as possible.
As the system is relative there is simply no foolproof way to ensure the captured movements align perfectly, especially when characters interact. For this a degree of cleanup is required, and Richard did an amazing job in this regard, ensuring the fight scene action from the shoot was preserved.
In other instances we needed to adapt character animations to the scene within Unity3D. Final IK is an asset I’ve used on almost every project; it allows us to provide ‘targets’ for hands, feet, spine and look direction. This all blends with animation seamlessly, letting us control for when characters need to reach for objects, or when we simply need them to sit still in a specific spot.
Audio was fundamental to the entire project and Ed worked closely with sound designer Matt Yocum. Conceptually, the narrative is driven by the audio, so it made sense for the project to be the same—with Ed and Matt delivering the episodes to us and crafting the experience around that. As with the visual style, this was an iterative process, with the result then informing how the audio might be altered to enhance it.
Being in VR, the location of audio is useful for driving where viewers look or drawing attention to important elements. We are able to take advantage of audio spatialisation to ensure sounds appear to come from a specific location. Character footsteps appearing behind the viewer can cause them to turn to see where the sound is coming from. The same goes for the sound of a car pulling up or a train passing. Characters themselves react to sound within scenes, so it can be a valuable tool.
During Episode 2 Trevor (seen above holding a knife for protection) is trapped inside his house as his assailant stalks around the outside. He begins at the front door, rattling it, but then begins to stalk around the house looking for an open window or a way in.
The viewer is compelled to follow the footsteps visually, but also audibly, following Trevor, but also turning when they hear someone striking a door behind them.
Being on Gear VR we needed to keep things as optimised as possible, and having numerous sound effects may have begun to cause performance issues or indeed increase loading times. As we progressed we found that not every element actually required placement in 3D space and could instead sit on the main stereo mix. It’s then our job to get the right balance between optimisation and scene immersion. Those moments that focused on a 3D audio placement became all the more intense.
Yet the most important element in all this was the voice acting. Ed and Here Be Dragons were able to entice Silicon Valley’s Martin Starr (aka Gilfoyle) and character actor Beth Grant to Dispatch. Their involvement elevated the entire project. It was certainly a joy to hear Martin speak as Ted for the first time.
We have a rather conspicuous use of video in Episode Three where Martin is seen in real life, reflected in the screen in front of us. I really like this as a dramatic moment, allowing us to see an emotional reaction at the most dramatic moment of the episode. It also allows us to see Martin himself, as well as reveal an important detail about his character.
The video was shot in stereo, with two cameras attached base to base, allowing for a close approximation of eye distance separation. Each video was then synced and then taken into After Effects. Here I combined them into a single top/bottom stereo video, ensuring each video was correctly aligned to create the stereo effect. If one frame is slightly rotated or too high then the stereo effect would become quite uncomfortable, so it’s important to get correct. This also applies to colour and image levels, which must also match for each eye.
The Unity3D Store asset ‘AVPro’ was used to process and display the video, and is built with stereo and VR requirements in mind. We did discover with the front menu previews that 5 videos in a single scene would cause crashes on some Gear VR headsets, so in this instance animated GIFs were used instead.
While the episodes themselves are non-interactive by design we did have to ensure there was an interface and menu system that supported the experience. Although we have released to Gear VR and Rift, this was the first multi-platform release we had worked on. The experience would need to support those with either headset, but it also had to support those with and without Touch or Gear VR controllers.
We also wanted to make sure that users were supported as they connected or disconnected controllers. A Gear VR user whose controller runs out of battery should be able to switch to headset control without interruption. This is not handled by default, so we needed to bring in our own solution. The system listens for disconnect and reconnect events and then chooses the correct form of input, including controller models and raycast point.
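Stripped of the Oculus-specific plumbing, the logic is a small state holder that reacts to connect and disconnect events – sketched here in Python with hypothetical names:

```python
class InputManager:
    """Falls back to gaze/headset input when the controller disconnects,
    and restores controller input on reconnect."""

    def __init__(self):
        # Start conservatively: headset (gaze) input always works.
        self.active_input = "gaze"

    def on_controller_connected(self):
        self.active_input = "controller"

    def on_controller_disconnected(self):
        # Seamless fallback, no interruption for the viewer.
        self.active_input = "gaze"
```

In the real system, switching `active_input` would also swap the visible controller model and move the raycast origin between the controller and the head.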
The app also has to support the ability to pause. It seems like it should be simple, but can become quite complicated. The solution here is a State System which handles pausing, interaction control and other events such as taking off the HMD.
In-app purchases (IAP) were a requirement we had also not encountered before. The first episode is free, but the following episodes sit behind a purchase gate. We needed to ensure that users could not accidentally view those episodes, and that once they had purchased them they were able to view them without issue.
In the end this was a fairly simple system which pings the Oculus servers and checks to see if the user owns a particular key. If they do then the buttons are enabled. Other requirements such as Entitlement (ensuring the user owns the app) are also very straight forward to implement.
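In rough terms (illustrative Python; the real check goes through the Oculus platform services, and the key names here are invented), the gate reduces to a lookup against the keys the user owns:

```python
def unlocked_episodes(owned_keys):
    """Episode one is free; later episodes unlock per purchased key.

    owned_keys: the set of purchase keys reported by the store servers.
    The "EP*_KEY" names are hypothetical placeholders.
    """
    episode_keys = {2: "EP2_KEY", 3: "EP3_KEY", 4: "EP4_KEY"}
    episodes = {1: True}  # episode one is always enabled
    for episode, key in episode_keys.items():
        episodes[episode] = key in owned_keys
    return episodes
```

The menu buttons are then simply enabled or disabled from this map, so an un-purchased episode can never be launched by accident.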
Indeed this project required us to engage with the Oculus services far more than any previously and they were surprisingly pleasant to use, especially uploading builds to release channels allowing different users to test on their devices.
Dispatch has really been a dream project to work on. Creatively it’s been both challenging and incredibly satisfying to see come together. A massive part of this is of course Ed’s great storytelling and direction, and I’m incredibly grateful to Here Be Dragons for bringing Fire Panda onto the project.
I’ve always wanted Fire Panda to be about Storytelling in VR and I think Dispatch is a great argument for the medium being an ideal place for narrative focused experiences.
For some time now I have been developing a project called ‘Decay Theory’, a Sci-fi experience set in the far future. We follow SuRei, a young girl, who lives on a spaceship, alone except for FeiMa, an ancient traveller, grandmother-figure and enormous bio-mechanical horror. Their journey is to explore the galaxy, seeking out human civilisation, lost in a cataclysm.
I’m currently in the process of developing a vertical slice of the experience and plan to reveal more as soon as it is ready.
I’m also a co-founder of Psytec Games and we are currently in development on Windlands 2, which is coming together very nicely. I am in the process of adding in characters, narrative and some exciting cinematic moments. The game will feature giant robots that players will be able to take on in groups of up to four and the opportunity to build a world for them to explore is quite satisfying indeed.
Thanks for joining us Nick! You can check out Dispatch now on the Oculus Store.