
Establishing an Aesthetic in Next Generation Sound Design

With video game technology at the point where aural artists no longer have to limit themselves creatively, is there a hidden aesthetic in minimalism? Game industry veterans Rob Bridgett and Leonard Paul think so, and explain their techniques in this exclusive Gamasutra feature.

Rob Bridgett, Blogger

June 21, 2006


Too Much Freedom in the Next Generation

Previous-generation platforms had very strict limits on how many sounds could be played at one time and how much audio could be loaded at any one time. These constraints forced a very particular aesthetic onto those games, one that can perhaps be learnt from when creating sound for games today. As we move away from having our boundaries and aesthetics defined for us by the hardware, we as sound designers need to enforce our own ‘boundaries’, or refined aesthetics: refining and constraining the means and methods of production during the art-making process to further focus the resulting aesthetic of its outcome. Instead of accepting arbitrary constraints from our materials, we now have the artistic freedom to choose a reduced spectrum of materials. This can clarify the basic goal of the artistic process without the confusion of limitless options. A refined aesthetic is the result of a set of rules that you cannot break, and with increasing freedom to achieve anything sonically, the need to enforce limits and boundaries on the choices you make in terms of sound effects and music is becoming more and more paramount.

Diminishing Sonic Returns

Further compounding the problem of keeping up with such fast-moving technology is the symptom of ever-increasing audio channel capacity. The almost infinite number of tracks we now have has given rise to a particularly digital problem. In current trends across music, TV, film, videogames and radio, sound is becoming more and more maximised, compressed, limited and overloaded. As hundreds of tracks are layered, all the sounds compete for our attention in the mix. There comes a point of diminishing sonic returns, where the more you add, the more you just end up with the sonic equivalent of ‘grey goo’: every frequency is filled and there is no room to add anything without taking something else away. The PS3 and Xbox 360 effectively allow a ten-fold increase in the amount of audio that may be loaded at any one time, letting audio designers add literally any sound they like, and keep adding variations of each individual sound. With the RAM and voice limitations of older gaming systems, a sound designer had to employ very careful editorial skills in selecting exactly which sound they wanted to hear at exactly which time.

This problematic aesthetic is compounded by an over-reliance on sampled sounds. A sampled sound is similar to a photograph of a dynamic event: very convincing on first playback, but on repeated plays it becomes repetitive and eventually irritating. To compensate, the tendency is to overlay many waveforms together to hide their non-dynamic nature.

Even though we have reached the limits of replicating what the human ear can actually discern (anything above 24 kHz is considered inaudible), the capacity to create and record sound continues to increase, to 96 kHz and beyond. One area where higher sample rates will come in useful is where sounds are pitched down in real time. In this case a higher sample rate will greatly reduce the artefacts that occur when lower sample-rate sounds are played below their original pitch.
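Why higher source rates help can be shown with a small pure-Python sketch (illustrative only, not any engine's actual resampler): the same tone is 'recorded' at two sample rates, pitched down an octave with cheap linear interpolation, and compared against a mathematically ideal result.

```python
import math

def render_sine(freq_hz, sample_rate, seconds):
    """'Record' a pure tone at a given sample rate."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def pitch_down(samples, step):
    """Play back at a fractional step (< 1.0 pitches down) using linear
    interpolation, the kind of cheap resampling done in real time."""
    out, pos = [], 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        pos += step
    return out

def rms_error(resampled, freq_hz, rate):
    """RMS difference from a mathematically ideal sine at the target pitch."""
    err = sum((s - math.sin(2 * math.pi * freq_hz * i / rate)) ** 2
              for i, s in enumerate(resampled))
    return math.sqrt(err / len(resampled))

# Pitch a 5 kHz tone down an octave (step 0.5) from two source rates.
err_lo = rms_error(pitch_down(render_sine(5000, 24000, 0.1), 0.5), 2500, 24000)
err_hi = rms_error(pitch_down(render_sine(5000, 96000, 0.1), 0.5), 2500, 96000)
print(err_hi < err_lo)  # prints True: the high-rate source pitches down cleaner
```

The high-frequency content of the low-rate source has only a few samples per cycle, so interpolation distorts it badly once it is slowed down; at 96 kHz the same tone has far more samples per cycle and the artefacts shrink accordingly.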

Stereo and surround sound began, to some extent, to get around the issue of a saturated sonic spectrum by spreading sounds around in space, so that at least they do not compete with frequencies in the same spatial location. However, the number of individual audio channels available to us far outstrips the number of speaker channels we currently have, and the content is often down-mixed to stereo anyway. Again, the question comes back: are we really making games that will survive the test of time better than the games we produced ten years ago?

Methods of Limitation

Establishing a Strict Aesthetic
Given that sound designers and composers now have so much more freedom to overproduce and over-implement sound for videogames, methods of limitation will become necessary in order to differentiate the sound of one game from another. How can we begin to make better sound in this seemingly limitless age by using less? It begins with realising that how we approach technical limitations as composers or sound designers forms the very core of our art. The second thing to realise is that we have greater limitations than the fully equipped million-dollar studio, and that this gives us a distinct advantage. The limitation of the expensive studio is cost, so usage time is bounded; with the rise of the professional-quality home studio and inexpensive digital equipment, however, cost has become less of an issue. All we need to do is recognise and capitalise on our limitations and avoid some of the easier traps. It is all too easy to create high-production-quality audio while ignoring the greater goals of high artistry and meaningful work. Instead of being able to identify the platform by playing a particular game, we should be able to have a good idea of who the audio team behind the game was and the style they have impressed upon the audio. The style we choose for our audio should be as distinctive and recognisable as the visual style of the game.

This whole process begins with establishing the limits you wish to work within. This is what will define your ‘aesthetic’.

Avoiding Complacency
Due to the ease with which digital sound design now occurs, it is easy to be guided by the samples and synthesiser voices immediately available. One of the main problems is that demo samples are often very attractive solo, but easily add to a sonic mush when combined with other elements. How often have you started to create a track by pulling a drum beat from a sample library, building your music up track by track from that drum-loop bed? You’ve just got a new soft synthesiser, the preset sounds are fresh and new to your ears, so you pick those sounds for your lead lines and pads. Now consider how many millions of other musicians have bought the same drum samples and own the same soft synthesiser, and you start to realise the problem in musical aesthetics that is occurring among computer musicians, and among sound designers who rely on the same sound effects libraries. Contrast this image with that of the classical composer, sitting down at his piano with quill in hand, ready for inspiration to strike. What have we lost and what have we gained over the years?

Many of the tips and tricks we talk about here will help you commit to a sound or musical idea earlier in the recording process than you normally would in the digital age. With the emphasis taken away from post production, from frame-by-frame editing, from mixing every single element individually and adding effects to each one separately, you begin to see that you can actually commit to the sound long before you record it.

This will help you avoid the morass of audio ‘grey goo’. Start with the idea first, and strip things down. Start with a melody; if you have problems even at this stage, then limit the notes you are using within the scale. Say you like a particular interval or chord structure: use only notes from those chords. The same can be done with rhythmic tracks. Start by tapping out a drum rhythm you like and then creating it in the computer. Essentially, make sure that your starting point is not at the computer at all.

Most rhythm-creation software, such as Native Instruments’ Battery, is actually very sophisticated in the degree of control it allows over exactly how you create beats, and supports varied workflow methods. Pretty much all sequencers, such as Acid, Nuendo, Live and Pro Tools, also support this kind of small-scale beat editing, and are not merely looping tools. Try zooming in on samples and doing very small-scale editing on a beat to replicate what you hear, rather than reaching for those easy loop tools.



Try breaking a loop into individual hits with a program like Recycle and removing the non-essential sections. With the additional processing power of the new consoles we can hope to run programs similar to Ableton Live in real time, rather than repeating the offline rendering process we perform on the older platforms. Once the loop is reduced, replace the remaining sections with new samples or notes to construct a percussive lead line. The breaks between the sections allow other layers to shine through and strengthen the transients of the newly reconstructed sections.
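The slice-thin-replace workflow above can be sketched in a few lines. This is an illustrative model only; the function and slice names are made up for the example and bear no relation to Recycle's actual file format or API.

```python
def reconstruct_loop(slices, keep, replacements=None):
    """Model a chopped drum loop as a list of slices. Keep only the
    essential indices (everything else becomes a rest), then optionally
    swap surviving slices for new samples to build a percussive lead line."""
    replacements = replacements or {}
    out = []
    for i, s in enumerate(slices):
        if i not in keep:
            out.append(None)               # a rest: room for other layers
        else:
            out.append(replacements.get(i, s))
    return out

# Eight slices (kick/hat/snare pattern): keep the backbone, swap one hit.
thinned = reconstruct_loop(['k', 'h', 's', 'h', 'k', 'h', 's', 'h'],
                           keep={0, 2, 4, 6}, replacements={4: 'rim'})
print(thinned)  # -> ['k', None, 's', None, 'rim', None, 's', None]
```

The `None` rests are where other layers get to breathe through, exactly as described above.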

Another technique, now increasingly forgotten, is to write all the parts of your music sitting at a piano or a guitar before you move to orchestrate on the computer. In this way you can fully realise the piece of music before you even get near the computer. When you do sit down to recreate the piece, you will find yourself using the software to build up an already fully realised piece of music. Conceptualise what the music might sound like on different instruments and focus on orchestration and arrangement. Allow the melody to define the timbre, colour and range that best suit its particular voice.

Limit Voices
These techniques apply equally to sound design and music creation. Choose only one instrument, or one sound voice, with which to explore your ideas. Try using only one soft-synthesiser patch and really explore the limits of what that one ‘voice’ can achieve. Consider musicians who spend years and years perfecting an instrument: they only really have the one voice that instrument carries, yet they spend their whole lives learning new and innovative things to do with it. Likely the appeal of analogue synthesisers is not only their dynamic sound quality but their playability as instruments compared to a soft-synth. Reducing things on this level leaves more room for dynamics within that voice. You can do this with everything from the simplest to the most complex soft synthesisers.

It can be tempting to go big when overdubbing things like vocal harmonies, but more often than not, getting three or more people to sing vocal lines perfectly in one take proves frustrating and time-consuming. In instances like this, where you cannot limit yourself to one track due to budget or time constraints, you can begin to get clever about how you use the few tracks you have. Of course, this is where software can really help us out of those jams…

As an exercise, try reducing lush harmonies and large chords to arpeggios. This method was commonly used in early game music, where a single channel rapidly alternated between the notes of a chord to simulate its sound. Listen to the resonances present in the chords, and remember that often the most important element of a voice is not the fundamental frequency but a harmonic overtone.
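The chip-music trick of faking a chord on one channel is simple enough to sketch directly. A minimal illustration (the function name and tick rate are inventions for the example):

```python
def chord_to_arpeggio(chord_notes, total_steps):
    """Simulate a chord on a single channel by rapidly cycling its notes,
    the classic one-voice chiptune trick. Returns the note sounding at
    each tick of a fast timer (e.g. one tick per screen refresh)."""
    return [chord_notes[i % len(chord_notes)] for i in range(total_steps)]

# A C major triad (MIDI note numbers 60, 64, 67) cycled over 8 fast ticks:
arp = chord_to_arpeggio([60, 64, 67], 8)
print(arp)  # -> [60, 64, 67, 60, 64, 67, 60, 64]
```

Cycled fast enough, the ear fuses the sequence into something chord-like while the channel only ever plays one note, which is exactly the kind of reduction the exercise is after.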

Old techniques like pitching or bouncing down onto one track are incredibly hard to pull off once you actually try them the old-fashioned way, particularly when you come from a digital background. Record a vocal line; listen back and record another vocal line on a second track; now bounce those two tracks down onto one track and add your third harmony line; again bounce those three onto one track, and carry on and on. You start to realise that control over the mix elements is no longer possible, and that you have to know in advance, at each mix-down stage, which lines need to be brought out more or mixed in deeper. You will find you create a piece of music and then need to completely recreate it again and again to get what you really want. When there is a consequence to each bounced layer, you become much more cautious and careful about each element you add. Above all, you begin to feel sheer awe for the compositions and recordings of artists such as Les Paul and The Beatles, while gaining a newfound respect for the amazing digital technology we now have available.

Limit Tracks
Limit the recording tracks available. Choose a limit with which you feel you can still create something that suits your music, but with which you may still struggle; perhaps only four tracks. Whatever number you feel comfortable with, take one more track away. Even if you start with more tracks, you can retain the core of the tracks you remove by adding small elements to the remaining ones. For example, you may have two drum-loop tracks and find, when reducing the track count, that you were only functionally using the crash and second kick drum of the second loop; these can be added discretely to the first loop while cleaning up the mix.

This may seem difficult at first, but just try it and see how it changes the way you think about sound design: get your ideas across with fewer layers. You may initially run into problems and start thinking ‘I need another track’, but as you think your way around these problems you will develop techniques to cope with the limitation, and escape it more easily. This can be done with any sequencer; try setting up templates for two-track sessions, four-track sessions and so on. Just remember that in the not-too-distant past, four tracks were an expensive luxury.

Limit DSP
OK, so you don’t have to totally disregard DSP; after all, it is one of the things that makes the next-generation audio environment so appealing. But try imagining life without DSP effects for a moment. This will make you think about what the DSP is actually doing to your sound design. Are you using reverbs and effects just for the sake of it, or does the sound you require really depend on them? You can find yourself doing what some of the great producers have done, from George Martin to Martin Hannett, who famously recorded Joy Division’s drummer on the roof of the studio to get the sound he wanted. Try not to use any EQ on playback, and perform it all when recording; digital EQ is notorious for smearing frequencies, attenuating low frequencies and weakening transients. Try performing all your volume changes during recording as well. Not only does this make you more aware of the dynamics of what you wish to record, but playing softer often changes the timbre too, which gives additional flavour to the mix.

Limit Your Microphones
Having amazing frequency reproduction and quality is something we all rely on these days, but the microphone you choose adds a particular colour and perspective to everything you record. Using the same microphone on every track, while creating a consistency that may be desirable, may also make everything begin to sound the same. One great way to really rethink the sounds you are recording is to use one microphone where you would previously have used several. With drums, for example, it is a great technical and artistic challenge to record the entire kit with a single microphone. This limiting of choice makes you think about the end result you want to create: rather than deciding how to mix the drums later, you have to make that decision while recording, and commit to it. You will need to experiment a great deal with microphone positioning, and perhaps think more about the location in which you record the drums; you could even get the reverb you require by recording them in a highly reverberant space, saving yourself a DSP pass.

You can also record using whatever you have lying around: an inexpensive microphone, a megaphone, a Dictaphone. All of these can create very unique-sounding samples.

One of the main reasons classical composers were able to survive with such a small palette of timbral colours is that classical instruments are very responsive to the nuances of the performer. When working with live musicians, we realise that there is the written score, and there is the score the performer takes upon themselves to realise. When using synthesised sound and samples, it is good to think of the computer as a performer that needs a lot of instruction, not just on what to play but on exactly how to play it. On the Commodore 64 of 1982, composers had access to a very limited set of parameters to modulate: waveform, volume, pitch and filtering. When we think of the parameters a violinist can modulate simply with the bow (pressure, speed, direction, acceleration, string position and more, all under nearly instantaneous control), we begin to realise how much of the voice of the instrument is not the sound of a string being bowed, but how it is bowed. Applied to computer music, this means volume, pitch, filtering and other parameters should be treated as elements that can be continuously adjusted to better convey feeling and emotion. We need only remember the artistry of Clara Rockmore on the theremin, who had control over only the pitch and volume of a simple sine oscillator.
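How much expression continuous control adds, even to a bare sine oscillator, can be sketched in a few lines of pure Python. This is a toy illustration, not a real synthesiser: the parameter names and rates are chosen for the example.

```python
import math

def expressive_sine(freq_hz, seconds, rate=8000,
                    vibrato_hz=5.0, vibrato_depth=0.01):
    """A single sine voice with a swelling amplitude envelope and gentle
    vibrato. The continuously varying controls, not the raw waveform,
    carry the expression (the Clara Rockmore lesson)."""
    n = int(rate * seconds)
    out, phase = [], 0.0
    for i in range(n):
        t = i / rate
        # amplitude: rise and fall over the note, like a bowed swell
        amp = math.sin(math.pi * t / seconds)
        # pitch: slow periodic deviation around the centre frequency
        f = freq_hz * (1.0 + vibrato_depth * math.sin(2 * math.pi * vibrato_hz * t))
        phase += 2 * math.pi * f / rate
        out.append(amp * math.sin(phase))
    return out

tone = expressive_sine(440.0, 1.0)  # one second of 'performed' A440
```

Remove the envelope and vibrato and you are left with the static, lifeless tone; two simple continuous controls are all that separates the two.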



One Track, One Take Foley or Music
Record your entire performance or Foley session in one go. This will show you just how many times you need to rehearse to get that ‘perfect take’, not by digital manipulation but by moving the focus onto the actual performance. This is a great technique when replacing loops, since it breathes new life into a phrase each time it is heard, rather than tiring the ear with each predictable repeat. The blood, sweat and tears of getting that perfect performance will increase your skills and motivation as a performer and your understanding as a producer. Again, we can see the movement of music from a performing medium into a medium of frame-by-frame manipulation and perfection; the creativity has moved from the performer onto the producer.

Interactive Mixing
Mixing is an essential part of limiting sounds, as it is about subtraction rather than addition. Interactive mixing is still in its infancy in videogames, but more reliance on the dynamic balancing of sounds, music and dialogue will become essential in establishing a ‘point of view’ rather than a ‘subjective microphone’ perspective.

With next-gen hardware comes the increasing danger of assuming that utilising 3D voices will make things more realistic. It is increasingly important to maintain aesthetic control over each sound and how it contributes to the overall experience. Making audio more 'realistic' can have the opposite effect of sounding 'lifeless', as the computer increasingly makes arbitrary decisions about the mix and quality of sounds as they play in the imagined space.

Ducking music and FX when dialogue occurs is a very basic way of achieving a more cinematic effect in games, and of ensuring that essential information is conveyed clearly and audibly. The interactive mixing process can identify a whole slew of prioritised sound effects that need to be heard at designated moments in gameplay, and sometimes dropping a sound entirely is the best option. Importantly, ducking allows the subtraction of sounds, so that you don’t have to make everything louder in order to hear it.
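The side-chain logic behind ducking is simple enough to sketch. The following is an illustrative pure-Python model, not any console's actual mixer API; thresholds and smoothing values are invented for the example.

```python
def duck(music, dialogue, threshold=0.05, duck_gain=0.3, smoothing=0.01):
    """Side-chain ducking sketch: while the dialogue signal is hot,
    smoothly pull the music down towards duck_gain; when it goes
    quiet, ease back up. One-pole smoothing avoids audible clicks."""
    out, gain = [], 1.0
    for m, d in zip(music, dialogue):
        target = duck_gain if abs(d) > threshold else 1.0
        gain += smoothing * (target - gain)   # glide towards the target
        out.append(m * gain)
    return out

# Steady music; dialogue enters halfway through.
music = [1.0] * 2000
dialogue = [0.0] * 1000 + [0.5] * 1000
ducked = duck(music, dialogue)
print(round(ducked[500], 2), round(ducked[-1], 2))  # prints 1.0 0.3
```

Note that the music is subtracted down to 30% rather than the dialogue being pushed up, which is exactly the point: clarity through reduction, not escalation.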

Next-generation software solutions to mixing also allow a slew of enhancements to interactive mixing. Dynamically carving frequencies out of a music track while dialogue is playing, for example, is a great way of generatively making space in the music when a dialogue event occurs.
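One way to think about frequency carving is as a dynamic EQ: isolate a rough 'speech band' of the music and attenuate only that band, only while dialogue is active. A minimal pure-Python sketch, assuming a crude band-pass built from the difference of two one-pole low-passes (real engines would use proper filters):

```python
import math

def one_pole_lowpass(signal, coeff):
    """Simple one-pole low-pass; coeff in (0, 1], higher = brighter."""
    out, y = [], 0.0
    for x in signal:
        y += coeff * (x - y)
        out.append(y)
    return out

def carve_speech_band(music, dialogue_active, cut=0.5,
                      lo_coeff=0.05, hi_coeff=0.4):
    """Dynamic EQ sketch: a mid band (difference of two low-passes)
    is attenuated only while dialogue is flagged active."""
    bright = one_pole_lowpass(music, hi_coeff)
    dark = one_pole_lowpass(music, lo_coeff)
    out = []
    for m, b, d, active in zip(music, bright, dark, dialogue_active):
        band = b - d                      # rough 'speech band' of the music
        out.append(m - cut * band if active else m)
    return out

# A mid-frequency tone: untouched when dialogue is silent, carved when active.
music = [math.sin(2 * math.pi * i / 8) for i in range(256)]
untouched = carve_speech_band(music, [False] * 256)
carved = carve_speech_band(music, [True] * 256)
energy = lambda xs: sum(x * x for x in xs)
print(untouched == music, energy(carved) < energy(music))  # -> True True
```

Unlike full-band ducking, the music keeps its lows and highs; only the region competing with the voice is thinned out.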

Real-time mastering and compression via software on PS3 and Xbox 360, applied either to separate elements of the soundtrack, such as dialogue, or to the soundtrack as a whole, also limits the dynamic levels and peaks that occur. Care must be taken not to over-compress the output when attempting to make the game 'sound louder'.
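The core of such a compressor is just a gain law applied above a threshold. A deliberately simplified, per-sample sketch (real mastering compressors add attack/release envelopes and look-ahead; the names and values here are illustrative):

```python
def soft_limit(samples, threshold=0.8, ratio=4.0):
    """Per-sample compressor sketch: any level over the threshold is
    reduced by `ratio`, taming peaks without touching quiet material."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out

peaks = soft_limit([0.5, 1.0, -1.2])
print([round(p, 2) for p in peaks])  # -> [0.5, 0.85, -0.9]
```

The quiet sample passes untouched while the overs are squeezed back towards the threshold; push the ratio too high across the whole mix and you get exactly the over-compressed 'loudness' the text warns against.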

The essential notion to understand about mixing and mastering in real time is that it is yet one more way of stripping unwanted audio information from the potential cacophony of sounds that can occur in the almost limitless sonic environment now available to sound designers and composers in next-generation videogames.

Next Generation Working Craft and Technique

All these exercises form the beginning of a process that can help you think more clearly about how digital technology can dictate the way you create sound. Next-generation audio production will, as a craft, be very different from current-generation audio production. It will more than likely pass through a phase of over-designing sound, and then pull back, reducing the sounds that are unwanted or not required for the aesthetic direction. Sound designers, composers and mixers will all need to work harder at sculpting, removing and shaping by subtraction at the beginning of this new limitless age of sound production.

Over the next few years we will begin to see the next generation of games addressing these issues and finding solutions to the problems of over-designed sound and over-created music, a reaction against being limited for so long by the hardware constraints of the PS2, GameCube and, to a lesser extent, the Xbox.




About the Author(s)

Rob Bridgett


Rob Bridgett is senior audio director at Radical Entertainment in Vancouver and the author of 'From the Shadows of Film Sound', a book dedicated to exploring the connections between video game production culture and film production. www.sounddesign.org.uk

