
iOS Audio Design: What Everyone Needs To Know

Experienced audio designer PJ Belcher dives deep into the iOS platforms to produce an overview of the ins and outs of iPhone audio design -- taking into account everything from user behavior to technical constraints.

PJ Belcher

December 8, 2010


Since the launch of the iPhone, apps have been big, especially in the games sector, which has seen some sell like AAA titles. Major developers have added the iPhone to their list of available platforms for existing titles, as well as moved to develop unique titles just for it.

It has been a great platform for the indie developer too: thanks to its accessibility, games from very small groups of people have reached huge audiences.

New opportunities have been opened up for people left in the wake of the recession to start their own developer ventures, setting up new companies and designing games, knowing there will be an audience for them.

The platform's mobile nature, unique interface, huge number of users, great accessibility to developers and the nature of the "app" (short, simple, high quality games with a focus on playability) have been incredibly useful to many developers.

With games being so large, though, and the bar for quality being set so high, what challenges does fitting such a big game in a tiny device involve? More specifically, what does this entail for the "audio guy"?

Aside from the normal list of challenges facing sound in any game, there is an added series of problems to overcome, and unique issues affect some routine procedures. This makes everything just that little bit harder, and adding to the confusion is the huge lack of information available to audio developers.

What this feature hopes to achieve is a clearer understanding of what challenges the iPhone audio developer faces, how to overcome them, and, in fact, what game audio on the iPhone really means. In this article, I go into the design considerations you should make, how this affects the app, what technical concerns there are, what unique iPhone issues there are to overcome, aspects of user behavior, hardware concerns and the iPhone app as a genre.

These are all issues, problems, and topics I have encountered while developing audio for the iPhone. For most of them (though not all) I have found a solution, or at least some explanation, which I hope to now pass on to other developers.

So what does it take to produce effective audio for the iPhone? Well, everything it takes to produce good audio for any game: dynamic soundscapes, an immersive soundtrack and great innovation -- only with a few added challenges thrown in for good measure!

You see, the iPhone is different from other platforms in a number of unique ways. Anything from the user interface, to the user him- or herself, is wildly different and far more diverse, leading to some very careful design considerations. The audio is no exception. Over the next few paragraphs I hope to shed some light on what these considerations should be and what you can do as the developer.

Asset Size

Possibly the biggest concern of any audio producer for the iPhone will be asset size. The scope of this topic is too large to fully address in this feature, but I write about this in more detail in my article "Audio for iPhone: Size Matters." The following, however, should give you a good grasp of what developers face.

Apple has an over-the-air size limit of 20 MB. This means in order for the app to be downloaded over 3G and not Wi-Fi, it must be under 20 MB. This is crucial to ensure good sales given the value of viral marketing in selling iPhone apps. The process is broken if the potential customer has to go and find a Wi-Fi network or wait until they get home to purchase an app, so under 20 MB is a must.

By the way, don't forget about downloadable content (DLC), which can be added to the app after the initial purchase. This is handy if your app has to go over the limit for the full version, but still lets you sell the lite version over 3G.

This lack of asset space is hard for the audio producer given the size of audio assets, not to mention fighting with other assets for space -- you are looking at a maximum of 50 percent, leaving you just 10 MB. Does that immersive, dynamic, innovative soundtrack seem tricky yet? This is assuming that you convince the other people working on the game that your ideas are worth 50 percent of the game!

Remember, you are only as good as your last job, so don't back down or produce sloppy audio in light of the challenges -- instead, produce even better audio in spite of the challenge and really impress.

The iPhone doesn't really help matters much either, since it can only decode one compressed file at any time, meaning most of your audio (aside from a few careful choices) has to be in uncompressed PCM formats, which any producer will know are rather large. There are lots of techniques that can be applied to mitigate this, though, such as asset-based generative sound systems, crossfade looping, and carefully selected bouncing, all of which I cover in detail in my above-mentioned article.
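As a rough illustration of crossfade looping, here is a minimal Python sketch. It reflects one common reading of the technique, not necessarily the method described in the other article: the tail of a clip is blended into its head so that the loop point lands on continuous audio. The function name, the linear fade, and the sample-list representation are all assumptions made for illustration.

```python
def crossfade_loop(samples, fade_len):
    """Return a shorter clip whose end flows seamlessly back into its start.

    The last `fade_len` samples are cut off and blended into the first
    `fade_len` samples with a linear crossfade (hypothetical helper).
    """
    if not 0 < fade_len <= len(samples) // 2:
        raise ValueError("fade_len must be between 1 and half the clip length")

    body_len = len(samples) - fade_len
    tail = samples[body_len:]          # the part that gets folded into the head
    looped = list(samples[:body_len])  # the loop is shorter by fade_len samples

    for i in range(fade_len):
        t = i / fade_len               # 0.0 at the loop start, rising toward 1.0
        # Start as pure tail (continuing from the end of the loop),
        # finish as pure head (the original opening of the clip).
        looped[i] = tail[i] * (1.0 - t) + samples[i] * t
    return looped


# Tiny demonstration on a ramp so the blend is easy to follow by eye.
clip = [float(n) for n in range(10)]
loop = crossfade_loop(clip, fade_len=2)
print(loop)
```

Because the first sample of the result is taken entirely from the old tail, it follows on directly from the last sample of the loop. A linear fade is the simplest choice; an equal-power curve may sound smoother on some material, so let your ears decide.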

I also recently came across another neat solution. Compressed formats, in general, cannot be looped, due to the "padding" added to the top and tail of all audio files by the compression. Normally this would mean the audio would have to be uncompressed, but I recently learned of utilities that will remove the "padding" and produce an MP3 that would in fact loop seamlessly.

Exactly how it works I don't know, but a quick Google search should reveal what you're looking for. I have in the past used one created by "Compu Phase" -- just remember most require the source file in PCM format. This effectively removes one requirement for uncompressed file formats, something you will find crucial.

In addition to this, the Apple documentation details a system that allows you to programmatically remove the padding and achieve seamless looping. This has the disadvantage of additional code, but does mean you can use M4A (AAC) encoding, which at really low bit rates does sound much better than MP3 -- in my opinion at least.

Environmental Issues

So what about production concerns and the iPhone App? These concerns are part of any game audio, but more predominant in the mobile device market because how the player listens to a game is hugely varied. This is due to the unique environments in which the user can be found, as well as forms of playback available to the player.

Normally it's the bedroom or living room: played over a TV's speakers or some home cinema system. Not for the iPhone -- it's anything from the kitchen to the bus, from the in-built speaker to whatever the user decides to plug into.

We can make some assumptions, though: most users will use either the in-built speaker (which produces a different sound from iPhone to iPod to iPad) or the earphones provided (which vary minimally between each Apple product), so you can cater your sound to work in these environments.

Always test your sounds in-game on the device. Testing your sounds in-game is crucial and you should already be doing this, but iOS devices particularly require it. You can produce the best sounds in the world on a set of studio monitors, but whether the Apple earphones will do them justice is another thing.

With this in mind, test your game on several different devices, in different environments (kitchen, bus, park, toilet...) with and without earphones, with headphones, etc. Obviously one way of mixing isn't going to suit all these criteria; the key is to find a good middle ground in the most common of setups.

The Device

The first time I got my hands on an iPhone, one of the things I instantly noticed was the lack of tactility in the screen, or any of the interfaces for that matter (aside from the home button). I cover this in more detail later, but it is worth mentioning now that this lack of feeling calls for an increase in feedback to the user, which is often the job of the audio (especially since whatever is being pressed can't be seen, being obscured by the user's own finger).

In most cases this comes in the form of a nice tick or pop, re-affirming the user's action, but this must be audible. The menu select sound of the iOS does a really good job of this. By working with the in-built speaker, the sound's dominant frequency is easily reproduced and the overall audio isn't drowned by frequencies the speaker would struggle to reproduce.

This means the tick is reproduced as a nice short transient -- it doesn't clip or distort, it isn't nasty to listen to, and easily reaches the user's ear. The same applies to earphones. As the audio producer you can use this to your advantage, although bear in mind that all devices are different.

Due to the nature of the tiny speaker (earphones or built-in) the spectrum of accurate audio playback is incredibly limited. Most low frequencies cannot be played back at all, and a speaker attempting to do so will sacrifice the quality of the higher frequencies that it should otherwise be able to play back.

The best way to help this is to apply a low cut to your mix before putting it on the iPhone. A cut for most music at about 20 to 70 Hz is generally sensible, but this can be increased further for iPhone given the possible forms of playback. It is worth noting though that cutting the low frequencies will allow the higher frequencies to "breathe". This can cause an increase in peak volume and will be true in whatever digital audio workstation software you are using, and in physical playback, so be wary of clipping.
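In practice you would use the low cut in your DAW, but to show what such a filter does, here is a minimal Python sketch of a first-order high-pass (low cut) filter. The function name and the 150 Hz cutoff are hypothetical choices for illustration; a DAW's low cut will usually be steeper than this gentle 6 dB per octave slope.

```python
import math


def low_cut(samples, cutoff_hz, sample_rate_hz):
    """First-order high-pass filter: attenuates content below cutoff_hz."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)   # filter time constant
    dt = 1.0 / sample_rate_hz                # time between samples
    alpha = rc / (rc + dt)

    out = [samples[0]]
    for i in range(1, len(samples)):
        # Each output follows the change in the input, while slowly
        # forgetting anything that stays constant (the low end).
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


RATE = 24_000  # Hz, matching the downsampled rate used elsewhere in this article

# A constant offset stands in for content far below the cutoff.
rumble = [1.0] * RATE
# A 5 kHz tone stands in for content the small speaker can reproduce.
tone = [math.sin(2 * math.pi * 5_000 * n / RATE) for n in range(RATE)]

print("rumble after cut:", abs(low_cut(rumble, 150, RATE)[-1]))
print("tone peak after cut:", max(abs(s) for s in low_cut(tone, 150, RATE)))
```

The offset decays to practically nothing while the tone passes almost untouched. Check the peak level of the filtered mix afterwards, since the caution about clipping still applies.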

My final point in terms of production is high volume clipping. All of the devices will have harmonic resonance distortion, from playback at high volumes and from certain frequencies. This means that if the user turns their device all the way up, it will result in distortion from the built-in speaker. Given the noisy environments in which the devices are used, this is often the case. It is not necessarily the user's fault, but it does make the audio sound bad, and, in turn, the game looks bad.

Unfortunately, I have no solution for this -- it is an issue that lies with the devices. One way around this is to reduce your overall mix, until you reach a point whereby peak volume does not cause cone break up, and in turn distortion, but this will make your game much quieter than other games playing at an equal level, in terms of volume setting. This can often be considered a negative feature.

My only feasible answer to this would be to allow the volume to increase as it would normally and then apply a ceiling: the volume rises until it reaches a peak level set just below the point of clipping, and goes no higher. Not perfect, but a solution nonetheless.
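The ceiling idea can be sketched in a few lines of Python. This is only an illustration of the logic, assuming the game has its own volume setting between 0.0 and 1.0. The function names and the 0.8 ceiling are hypothetical, and the right ceiling has to be found by ear on each device.

```python
def capped_gain(volume_setting, ceiling=0.8):
    """Follow the volume setting as normal, but never exceed the ceiling."""
    if not 0.0 <= volume_setting <= 1.0:
        raise ValueError("volume_setting must be between 0.0 and 1.0")
    return min(volume_setting, ceiling)


def apply_gain(samples, gain):
    """Scale every sample by the chosen gain."""
    return [s * gain for s in samples]


# Below the ceiling the setting behaves normally; above it, the gain holds.
for setting in (0.25, 0.5, 0.75, 1.0):
    print(setting, "->", capped_gain(setting))
```

Unlike turning the whole mix down, this keeps the game as loud as others at low and medium settings and only holds back at the top of the range.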

The tools I typically work with to create audio for the iPhone will generally not go beyond standard DAW software. Provided you can write, record, produce, mix, edit, optimize, encode and downsample, that is typically all you will need. Personally I will use Logic Pro and Cubase (increasingly just Logic Pro) to do everything, but any software with good production tools, a decent encoder and a downsampler will do.

Logic Pro works well for me because, also having been created by Apple, it has a very nice m4a encoder, as well as a top-notch mp3 encoder. (It is worth noting here that Unity 3, a popular iPhone development tool, no longer supports m4a.) There are also dedicated downsampling packages available, which can be of use.

Personally I will just use Logic for this, but only your ears will tell you if it is doing a good job. Just make sure it is doing something more complex than picking out every other sample (I will often downsample sound files from 44.1 kHz to 24 kHz just to ensure this).
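To see why picking out every other sample is not enough, here is a small Python sketch comparing that approach with one that averages neighbouring samples first. It halves the sample rate for simplicity. A real downsampler uses a much better filter than this crude average and also handles non-integer ratios such as 44.1 kHz to 24 kHz, and all function names here are hypothetical.

```python
import math


def tone(freq_hz, rate_hz, count):
    """Generate `count` samples of a full-scale sine wave."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


def halve_naive(samples):
    """Keep every other sample with no filtering at all."""
    return samples[::2]


def halve_averaged(samples):
    """Average each pair of samples (a crude low-pass) and keep one value per pair."""
    return [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples) - 1, 2)]


def peak(samples):
    return max(abs(s) for s in samples)


# A 20 kHz tone cannot exist at a 22.05 kHz sample rate (the limit is 11.025 kHz),
# so whatever survives the conversion is aliasing: a false tone near 2 kHz.
source = tone(20_000, 44_100, 4_410)
print("naive peak:", peak(halve_naive(source)))
print("averaged peak:", peak(halve_averaged(source)))
```

The naive version keeps the unwanted tone at close to full level, folded down into the audible range, while even this crude averaging reduces it considerably. A proper resampler removes it far more thoroughly, which is the difference your ears should be checking for.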

Beyond this I will rarely use any other software; I feel there is no need to complicate your creation process, and hopping between software packages will often increase your production time.

Provided you can give developers what they ask for, it doesn't matter what you use. Typically, my expected deliverables and what I deliver are pretty similar with each job. I will hand over all sounds and music in unedited, normalized PCM file formats at 44.1 kHz or higher.

Along with this I will hand over all files as they are intended for the game: downsampled and encoded versions that are mixed and edited. In addition, I will supply a technical file detailing what each sample does, explaining its intended use, where it goes, its file name and any other specific detail that may be required.

In terms of how this varies from each development environment the changes are minimal. As I mentioned earlier Unity 3 no longer supports m4a, so although I would use this file format by default I would instead have to use mp3, or ogg (popular in bespoke development environments) for clients using Unity 3 or a self-developed dev tool.

All development tools I have encountered will require mono sound files for any sounds to be played in 3D space, but this is often true of any platform. On a different note, most Mac users will not have the capabilities to open Word DOC files, so I will often send text in plain text file formats for optimum compatibility.

The iPhone is marketed as a mobile gaming device, so as I mentioned earlier you need to carefully consider where it will be played. This will have a strong influence on many design considerations in all aspects of development, and audio is no exception. Finally, don't forget to test on as many devices as possible.

We know of the various iOS devices (iPad, iPod, iPhone) and how they vary from each other, but what about how their interfaces vary from other devices? How do they affect the design of an iPhone game? Well, there are all the same principles of designing any other game, only with a few extra challenges thrown in (again!)

The iOS devices are unique in several ways, all of which have an effect on user behavior. They have motion sensing (a three-axis accelerometer, joined on newer models such as the iPhone 4 by a three-axis gyroscope for six-axis sensing), the main interface is a touch screen, the screen is tiny, and above all the device is mobile, possibly a phone, and open to a wide audience of consumers.

Thanks to this unique user interface, players are forced to learn a new way to interact with their technology -- and you as the developer have to design this. But what does this mean for the audio design? Compared to visual impact, the sound in any game is frequently an unsung hero, and given the often noisy, uncomfortable environments in which the game could be played, it often never gets heard by the player.

Furthermore, most users will turn off the sound so as not to disturb others, and will seldom use headphones due to their unsociable nature. It would seem that making the device so accessible is a problem for the game's audio. You can easily share a game, pass it to others, play it pretty much anywhere and with anyone, and it is this increased social nature that makes the sound get turned down, or, even worse, off!

In the event that the user is wearing their headphones, they are likely to be listening to their own music (it is an iPod at the end of the day), even whilst playing a game. I first noticed this behavior while beta testing a recent game. People would turn the sound down to avoid disturbing others, or, if they couldn't already hear it, not even be aware of its existence.

People presenting the games to others would often hand over the device muted, and in a world in which it is already hard to obtain audio feedback, I found myself with users not even aware there is sound. This can all seem very disheartening, but at the end of the day anything from tens to millions of people will play your game and how many of these listen is unknown. However, those that do will be impressed, reinforcing the role that audio plays in any game.

As I mentioned earlier, the interface on the iPhone is unique and as a result several design considerations need to be made to address this. One of the first things I noticed about the iPhone was the lack of tactile nature about any of its controls. The touch screen and accelerometer, which are the two main forms of interfacing, give the user very little feedback. This makes operation very tricky -- since each action does not necessarily have a reaction, the user constantly questions whether their interaction was received, or did what they wanted.

A real-life button does several things; it physically moves, has limits, and you can distinguish its whereabouts and differentiate it from the other buttons without looking at the device, something often necessary in gaming. On top of this, all interaction is reinforced in-game, with visuals and sound.

At its most basic level, a game is a simple positive reaction to the user's interaction, but the iPhone changes all this. To further complicate things, since the touch screen is your only way to see what's happening as well as being what you touch, what you need to see is often obscured by your finger.

All attempts to inform the user of the touch outside of the button are therefore hindered by the small screen, and using the accelerometer means the screen moves around. This leaves sound as the only available space for feedback. As you can see, the audio has a lot resting on it!

Taking into account all of the above-mentioned considerations, working with developers can produce a number of issues to address. Typically a developer (unless I am directing the audio) will send a list of the sound effects, music, ambiences, and any other audio they want, with details of their intended use, placement in-game, length (and whether music needs to loop), and a short brief describing how it should sound.

Normally this would be fine, but on the iPhone this presents several problems. For instance, say I'm requested to make five one-minute-long pieces of looping music, each with an equally-sized, looping, ambient track to play simultaneously on different-themed levels. As far as briefs go this is pretty reasonable, but it wouldn't work on the iPhone.

To start with, on disk, a one-minute-long PCM file at 24 kHz (16-bit stereo) is about 5.8 MB. So the ten music and ambience files alone amount to 58 MB. This is already well over the 20 MB over-the-air limit. But let's assume you're not fussed about asset size; it is still going to take some heavy memory management to deal with those files.
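The arithmetic behind those figures can be checked with a short Python sketch. It assumes 16-bit stereo audio, which is what the 5.8 MB figure implies, ignores the small file header, and uses decimal megabytes. The function name is made up for illustration.

```python
def pcm_size_bytes(seconds, sample_rate_hz, channels=2, bits_per_sample=16):
    """Size of raw PCM audio data, not counting the file header."""
    bytes_per_sample = bits_per_sample // 8
    return seconds * sample_rate_hz * channels * bytes_per_sample


ONE_MB = 1_000_000  # decimal megabytes

one_track = pcm_size_bytes(60, 24_000)
all_tracks = 10 * one_track  # five music loops plus five ambience loops

print(f"one track:  {one_track / ONE_MB:.2f} MB")   # 5.76 MB
print(f"ten tracks: {all_tracks / ONE_MB:.1f} MB")  # 57.6 MB
```

Switching a file to mono halves its size, and so does halving the sample rate, which is why both are usually the first things to consider when the budget is tight.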

So you compress them, right? Nope, the iPhone can only play back one compressed file at any time. In this case, ambience and music need to play simultaneously... And so you can see the problems that ensue.

These are by no means impossible to overcome. There are solutions to each of these issues mentioned earlier; the challenge is implementing them. The audio clearly requires more direction, planning, and thought than most will expect, and simply catering to a list of sounds is not an option.

To make matters worse, audio designers are often asked to produce the audio at the end of development. "We're now about to start beta testing and need to get some audio in," is not going to work. The sound is now a more integral part of the game's design and development than ever before, and it will only work if carefully thought through. So, developers! Get audio designers involved from the start -- and audio designers, be prepared for these challenges.

If I am presented with a list of sounds, then I will ask to play the game and develop my own list. This isn't because I think I know better, but because audio is my profession, and implementing the various design considerations required can be a lot easier if you start from scratch.

That's not at all to say you should throw away the developers' list. On the contrary, you should follow it -- just don't do so blindly. Plus, techniques such as generative sound systems, combining assets, compression, etc., can not only sound better but also be critical to the audio design. These are often something developers would not consider requesting.

Why would they? It's not their profession to know the ins and outs of iPhone audio design; it's the audio designer's. This can see composers and sound designers also having to act as directors, designers, and leads. With most indie game developers operating as a small team, and with one person doing audio, development on the iPhone can see the audio guy becoming a very diverse one-man band.

And Other Considerations...

I'd like to return to user behavior -- and how that can enable viral marketing. Viral marketing has been great for the iPhone; thanks to its mobile nature, easy web connectivity and small app price, stuff can really get around! Audio is one of the best indirect and direct ways to make others aware of what you're playing -- unlike visuals, you don't have to be looking in any particular direction.

With existing brands this is easier, since most sound logos are an established part of the brand, making both the sound and the brand instantly recognizable. Establishing a brand is one of the trickiest parts of starting a new product, and sound is a key feature. With this in mind you can see how the audio starts to play a role outside of the game.

Genres of games on the iPhone have been incredibly diverse, with pretty much every existing base covered, as well as a lot of innovation. Games from other devices have been modified to work on the iPhone, games have been made just for the iPhone, and some games have been made with very little consideration for the device at all. Some work well and others don't, but at the end of the day, with well over 50,000 games, the chances are it has been done. In terms of audio for games, well, they all have it -- but few do it well.

All the same, the level of quality on most successful iPhone games has set the bar very high. This includes the audio, which obviously needs to match the high standards of the game. Although this can seem very tricky given the challenges being faced, it is by no means impossible -- so get out there and make big audio for the next big game!



About the Author(s)

PJ Belcher


PJ Belcher is the owner and operator of PJ Belcher Pro Audio, a freelance Sound Design and Music Audio Production company for games, based in Cambridge, UK. He first started looking at different user interfaces (UIs) and their application in real-time music performance when he noticed the potential of the iPhone. This led to the development of an interactive music system designed for the iPhone based on its unique UIs. This allowed PJ to develop a broad knowledge of the iOS devices and what effects they have on audio design. Since then he has worked on audio for iPhone in a variety of projects, utilizing the skills and information he has established.

