As discussed in the previous post "Non-repetitive design sound design in video games", one of the great challenges of game audio is creating realistic sound that not only suits the environment of a game but does so without repeating itself and without breaking the game's memory budget.
We have already discussed how non-repetitive design can work toward achieving this. But as effective as non-repetitive sound design methods are, they cannot create completely original sounds that the player has never heard; they can only alter existing sounds to form variations, or layer existing samples. A possible exception is the example of using oscillation and/or pitch shifting on machine hum to create a completely different object sound.
So is it possible to generate completely new but appropriate sounds every time one is needed? Yes. Procedural audio can do this, and there are a number of approaches to it that aim to maximise variation and realism.
Although there are different approaches to procedural audio, which will be discussed later, the majority of descriptions are similar (Fournel 2012, Farnell 2007, Collins 2009, Verron 2012). Andy Farnell, a pioneer of one approach to procedural audio, states that "Procedural audio is non-linear, often synthetic sound, created in real time according to a set of programmatic rules and live input" (Farnell, A 2012). In other words, procedural audio is the process of creating sound based upon a number of user-defined rules which can be adapted in real time. Farnell also states in his book 'Designing Sound' that, as procedural audio is such a broad term and is often used in different ways, it is sometimes easier to just say what it is not. In this vein, procedural audio is not the linear creation or playback of sound.
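To make "programmatic rules and live input" a little more concrete, here is a minimal sketch in plain Python. It is not taken from Farnell's work: the function name `wind_block`, the `wind_speed` parameter and the two rules are all hypothetical, chosen only to show a sound being computed on demand from a value the game could supply at run time.

```python
import random

SAMPLE_RATE = 8000  # assumed rate, kept low so the sketch runs quickly


def wind_block(wind_speed, n_samples, state, rng):
    """Generate one block of wind-like noise from a live wind_speed in [0, 1].

    `state` is the filter output carried over from the previous block,
    so consecutive blocks join without a click.
    """
    # Rule 1 (assumed): stronger wind is louder.
    gain = 0.1 + 0.9 * wind_speed
    # Rule 2 (assumed): stronger wind is brighter, i.e. less low-pass smoothing.
    alpha = 0.02 + 0.3 * wind_speed
    out = []
    y = state
    for _ in range(n_samples):
        white = rng.uniform(-1.0, 1.0)
        y += alpha * (white - y)  # one-pole low-pass filter
        out.append(gain * y)
    return out, y


# Simulated "live input": the wind picks up over four blocks.
rng = random.Random()
state = 0.0
audio = []
for speed in (0.1, 0.3, 0.6, 0.9):
    block, state = wind_block(speed, SAMPLE_RATE // 4, state, rng)
    audio.extend(block)
```

Nothing here is played back from a recording: every block is computed fresh, and changing `wind_speed` changes the result immediately.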
One reason to use procedural audio is the age-old and all-important issue of memory. Whilst it may be argued that memory is becoming less of an issue with newer, higher-powered platforms, some may say that as consoles and gaming PCs become more powerful, consumers' expectations rise, so memory is just as much an issue now as it has ever been.
Another reason is of course the possibility of avoiding repetition. Not only is it possible to create unique sound effects every time one is triggered, but these sound effects can also be adapted to what is shown on screen. For example, in the forthcoming, completely procedurally generated 'No Man's Sky' from Hello Games, sound is created based upon rules that are used in the generation of the game's creatures. Sound Designer Paul Weir passes these rules into a physically modelled vocal tract, which includes a virtual mouth, larynx and vocal cords, to create entirely unique sounds for every procedurally generated creature in the game. "Rather than working against the game’s algorithmic chaos, he embrace(s) it" (Khatchadourian, 2015). Read this article in The New Yorker for more information and samples of the audio. Paul also gave a presentation at a proceduralaudionow.com meetup earlier this year on the pros and cons of procedural audio.
Another example of how procedural audio may be utilised in this way comes from Verron & Drettakis in the form of their Audio Engineering Society presentation in 2012. Their paper "Procedural audio modeling for particle-based environmental effects" outlines the creation of a sound synthesizer that "simultaneously drive(s) graphical parameters and sound parameters", resulting in a "tightly-coupled interaction between the two modalities that enhances the naturalness of the scene" (Verron, Drettakis 2012). Here is a video example of the synthesizer in action.
But why is procedural audio predominantly used for video games and not other more traditional forms of media? In her 2009 paper "An Introduction to Procedural Music in Video Games", Karen Collins suggests that video games are an "ideal media form for procedural (audio)" as "many elements of gameplay—especially the timing of events and actions—are unpredictable and occur in a non-linear fashion" (Collins, 2009). Andy Farnell, though, suggests that procedural audio also has its uses in film.
It is generally considered that there are two main schools of thought on how procedural audio should be generated. We will refer to these methods as "bottom up" and "top down".
Bottom up approach
Pioneered by Andy Farnell, the bottom up approach to procedural audio is based upon the idea that sound effects can be created from nothing, or "generated from first principles, guided by analysis and synthesis" (Farnell, 2010). Farnell argues that by utilising the bottom up approach, a sound designer can create 'sound objects' which, unlike audio recordings, can be kept and changed in real time and can mimic the unpredictable nature of real world sounds.
In the book 'Designing Sound', Farnell suggests the use of Pure Data (PD) to create such 'sound objects'. PD is an open source visual programming language which enables the development of sound-based software without the need to write code.
Throughout Designing Sound, Farnell adopts a scientific approach to dissecting numerous sound sources such as fire, running water, motors and explosions. Farnell dissects an explosion into the following elements:
>Early ground waves (prerumble or dull thud)
>Shock front (dilated N-wave)
>Burning gases from a moving fireball (phasing / roaring)
>Collisions and fragmentation (noisy textures)
>Relativistic shifts in frequency (time dilation and compression)
>Discrete environmental reflections (echo and reverb)
By focusing on the creation of each of these individual elements it is possible to create a Pure Data patch which can be adapted to suit different forms of explosions, and offers endless variation.
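As a rough illustration of this layered way of thinking, the sketch below builds an explosion-like sound from four of the elements listed above: a thud, a shock burst, a roar and debris. It is written in plain Python rather than Pure Data, it is not Farnell's patch, and every name and constant in it (`explosion`, `size`, the frequency and decay ranges) is an assumption made for illustration. The frequency shifts and environmental reflections are left out entirely.

```python
import math
import random

SAMPLE_RATE = 8000  # assumed rate, kept low so pure Python stays fast


def explosion(size=0.5, seed=None, duration=1.5):
    """Return a list of samples in [-1, 1]; `size` in [0, 1] scales the blast."""
    rng = random.Random(seed)
    n = int(SAMPLE_RATE * duration)

    # Per-explosion parameters are re-rolled on every call, so no two match.
    thud_freq = rng.uniform(30.0, 60.0) / (0.5 + size)  # bigger blast, lower thud
    roar_decay = rng.uniform(2.0, 4.0) / (0.5 + size)   # bigger blast, longer roar
    debris_rate = rng.uniform(5.0, 20.0)                # fragments per second
    shock_len = int(0.01 * SAMPLE_RATE)                 # 10 ms burst

    low = 0.0  # state of the low-pass filter used for the roar
    out = []
    for i in range(n):
        t = i / SAMPLE_RATE
        # Early ground wave: a low sine that dies away quickly.
        thud = math.sin(2.0 * math.pi * thud_freq * t) * math.exp(-8.0 * t)
        # Shock front: a very short burst of full-band noise.
        shock = rng.uniform(-1.0, 1.0) if i < shock_len else 0.0
        # Fireball roar: low-passed noise with a slower decay.
        low += 0.1 * (rng.uniform(-1.0, 1.0) - low)
        roar = low * math.exp(-roar_decay * t)
        # Debris: sparse random clicks that thin out over time.
        hit = rng.random() < debris_rate / SAMPLE_RATE
        debris = rng.uniform(-1.0, 1.0) * math.exp(-3.0 * t) if hit else 0.0
        out.append(0.5 * thud + 0.3 * shock + 0.6 * roar + 0.3 * debris)

    peak = max(abs(s) for s in out) or 1.0
    return [s / peak for s in out]  # normalise to full scale


small = explosion(size=0.2)
large = explosion(size=0.9)
```

Each call produces a different result unless a `seed` is given, and the `size` argument shows how one model can be steered toward different kinds of explosion. A convincing result would need far more care than this, which is exactly the drawback discussed next.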
Drawbacks of the bottom up approach include the fact that a huge amount of work is required to develop a system of synthesis for a single sound effect. This, coupled with an in-depth application of sound propagation principles such as reflection, dispersion and oblique boundary loss, unfortunately puts this approach out of the reach of all but the most technologically and scientifically savvy sound designers.
Another common criticism of the bottom up approach is that, currently, it is very difficult, if not near impossible, to create a sound that wholly resembles the real thing.
So whilst an exciting and academically satisfying artform, this method of procedural audio still seems to be out of our grasp. But perhaps as it improves and evolves, sound designers will become more like programmers, and utilise scripting to create more unique and varied sounds. If you'd like to have a listen to some effects made with Pure Data, Andy Farnell's website has a selection of examples, plus some tutorials that you can work through yourself.
Top down approach
The top down approach, in line with how it sounds, is quite the opposite to bottom up. Whilst in 'bottom up' the aim is to recreate each minute detail of a sound with the use of coding and synthesis, when applying a top down approach the first step is to find or create a complete, pre-recorded/designed sound and work down from this.
In an interview with Designing Sound, Nicolas Fournel detailed the Spark procedural audio system that he built for Sony. He states that with Spark "You can create procedural models very quickly by analysing existing samples, extracting the features of interest, and then finding a way to model them".
According to Fournel the analysis system of sounds could be split into three main categories: audio generators, event generators and update modules. He then explains that you could extract transients, pitch contour, amplitude envelope and spectral flux amongst other parameters to build a model of a 'reference' sound. The tool can then create a model based upon this data and render countless variations of sounds from it.
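The general idea of analysing a reference sound and rendering variations from the result can be sketched in a few lines of plain Python. This is not how Spark works internally (that system is not public); it is a deliberately tiny stand-in that extracts only one of the features mentioned above, the amplitude envelope, and uses it to shape fresh noise. The function names, the frame size and the synthetic 'reference' sample are all assumptions made for the example.

```python
import math
import random


def amplitude_envelope(samples, frame=64):
    """RMS level of each consecutive frame: a coarse amplitude envelope."""
    env = []
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        env.append(math.sqrt(sum(s * s for s in chunk) / frame))
    return env


def render_variation(env, frame=64, seed=None, jitter=0.2):
    """Shape fresh noise with the envelope, perturbing each frame slightly."""
    rng = random.Random(seed)
    out = []
    for level in env:
        level *= 1.0 + rng.uniform(-jitter, jitter)
        # Uniform noise has an RMS of 1/sqrt(3), so scale it back up to `level`.
        out.extend(math.sqrt(3.0) * level * rng.uniform(-1.0, 1.0)
                   for _ in range(frame))
    return out


# Hypothetical 'reference' recording: a decaying noise burst stands in
# for a real sample loaded from disk.
_rng = random.Random(0)
reference = [_rng.uniform(-1.0, 1.0) * math.exp(-i / 400.0) for i in range(2048)]

model = amplitude_envelope(reference)         # analysis: 32 numbers
variation_a = render_variation(model, seed=1)  # synthesis: new sound each time
variation_b = render_variation(model, seed=2)
```

The 'model' here is just 32 numbers, which hints at the memory saving, and every call to `render_variation` yields a different sound with the same overall shape. A real tool would also model pitch, transients and spectral content.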
Fournel goes on to argue that this top down method greatly reduces the number of modules (snippets of visual scripting code) that would be required in comparison to using a bottom up approach. He states that with Pure Data, you are provided "all the elementary modules you might ever need" and are expected to "go learn about probability distributions, go learn about modal synthesis, and build everything from scratch" (Fournel 2012).
Whilst the top down approach is attractive, it does of course have its drawbacks. The main one is that creating, or gaining access to, such a system as a sound designer is extremely difficult (unless you work for an organisation that happens to use one). The Spark system discussed above is a proprietary system designed for Sony which is not available for public use, and to create such a system would require an immense amount of programming knowledge.
So this leads us to somewhat of an impasse. Whilst audio middleware such as Audiokinetic's Wwise has made steps forward in making procedural audio more attainable and user friendly for sound designers (see Soundseed), tools are unfortunately still lacking in variation and are generally geared toward certain types of sounds. Larger developers are certainly coming round to the idea (Rockstar, for example, used forms of granular procedural audio for vehicles in GTAV), but smaller studios may unfortunately, for the time being, have to continue on a more traditional (albeit effective) path until further developments are made.
Thanks for reading, and don't forget to follow me on Twitter at @thatjoethom. Until next time!
Collins, K. (2009). An Introduction to Procedural Music in Video Games. Contemporary Music Review 28, 5–15.
Farnell, A. (2010). Designing Sound. MIT Press, Cambridge, Mass.
Khatchadourian, R. (2015). What a Dinosaur’s Mating Scream Sounds Like [Online]. The New Yorker. Available from: <http://www.newyorker.com/tech/elements/what-a-dragons-mating-scream-sounds-like> [Accessed 10/10/2015].
Nair, V. (2012). Procedural Audio: An Interview with Nicolas Fournel [Online]. Designing Sound. Available from: <http://designingsound.org/procedural-audio-an-interview-with-nicolas-fournel> [Accessed 12/10/2015].
Verron, C., Drettakis, G. (2012). Procedural audio modeling for particle-based environmental effects. In: Audio Engineering Society Convention 133 26/10/2012 San Francisco USA. Audio