We who make our living from game audio tend to have a strained, dysfunctional, co-dependent relationship with our counterparts who work in the film audio world. On one hand, we resent how sexy and alluring film is, and how games come off as the ugly spinsters of entertainment media. We love to throw around statistics about the gross revenues of our respective industries, as if these numbers somehow add credibility to our endeavors. On the other hand, many of us secretly (or even not so secretly) wish to make the jump to the big screen and be a part of that world, with all the glamour and recognition it promises. And we know that for the most part, the folks involved with film audio aren't thinking about us at all.
While the relationship between those who create audio for games and films is not necessarily healthy, it is becoming increasingly relevant as the two industries become more alike and work more closely together -- and most importantly, as the audiences for the two media begin to merge. My team and I at Stormfront Studios recently completed work on the audio for EA's The Lord of the Rings: The Two Towers. The experience gave me some valuable insight into how the worlds of games and films are colliding, and it got me thinking about the current state of the art in audio for each medium (and, based on reviews and popularity, I think it's fair to say that The Two Towers qualifies as a good example of the highest level of quality in both worlds).
I've been making game audio professionally for ten years, and I've shipped more than two dozen titles. I've worked on licensed crossover products before, albeit from television, such as Viacom New Media's Star Trek Deep Space Nine: Harbinger and Sony's ESPN Baseball Tonight. But working on The Two Towers put game audio into a new perspective for me. The "Fellowship" film had just set new standards in the genre for quality, earning an Academy Award nomination for sound, and winning the Academy Award for music. My team's mandate was to recreate the epic experience of the film, sonically, in an interactive action game running on a $200 console. No pressure, right?
We had access to the movie's audio assets, but as brilliant as they are, we quickly found that they did not make for great game audio. As a result, we spent more than four months intensively editing and processing the music and developed a sophisticated custom adaptive-music system to deliver it. In the end, we found that we could provide an experience in the game similar to the one the film score provided.
Unfortunately, we found it impossible to use most of the film's sound effects. Despite their amazing quality, we found it more effective to create our own. Consequently, almost every sound you hear in the game was created specifically for it, based on our attempt to recreate the overall effect of the film's sound. It was this process of trying to recreate the film experience without using the original assets that inspired this article.
It is unproductive to think of games as "interactive movies," although many people tend to think of games in those terms. Let's be clear: games and films are different media. The techniques, processes, and skills involved in the creation of each are unique and not interchangeable. The metrics by which each is judged are also different, meaning that many of the properties that make for a good film would lead to a lousy game, and vice versa. These points are important to keep in mind because, true as they are, there are many places where the two industries intersect, and we simply must start thinking about them.
The original audio assets from the film didn't work in the context of a real-time interactive game.
Most important is the audience: there is now enormous overlap between the audience that goes to films and the audience that plays games. This is key for the audio team to understand: the audience does not wonder whether the music was custom composed for a scene or is being modified adaptively at runtime; they just want the music to sound cool, fit the action, and provide emotional drive. Gamers don't care how much sound RAM there is to work with, that playback voices are limited, or that disc throughput limits the number of streams of ambient sound. They just want the sounds to punch out of the speakers, rattle the subwoofer, and bring the game world to life just the way sounds bring the world of a film to life.
A new trend I've noticed is the subtle way in which movies are adopting the aesthetic sense of games. Certainly, there is a genre of movies actually based on games (Tomb Raider, Resident Evil, etc.), and there have been movies that are blatantly about games (The Last Starfighter, Tron), but other mainstream movies are starting to demonstrate an awareness of a gameplay feel, too. For instance, look at Ocean's 11 from 2001 -- this is a remake of a classic "rat-pack" movie of the 1960s. The original was all style and cool in a way that only that group could pull off. The remake incorporates action and technology in a way that seems, at times, to be very much informed by the world of video games. There's a scene, for instance, in which the intrepid heroes need to drop themselves down a shaft into the vault of the casino below. The shaft is protected by a series of "lasers" (visible beams of light, anyway) that will trigger an alarm if their beams are broken. The support team needs to defeat the power to the lasers so the two thieves can rappel down the shaft -- which they do, just as the power comes back on and the security system is restored. Sound familiar? In how many games has that scenario been presented?
Look a little deeper: there is no real fictional reason that this shaft would need to be protected at all; there is very heavy security at the top and bottom. And if you're going to protect it, why not just use a motion sensor like those available in any decent car alarm? Why can we see the beams of light? The dry cleaner I go to uses an "electric eye" to sense when someone walks through the door, but I can't see the light as a solid beam across the doorway. The answer, of course, is that it makes for a better scene to have the security system visible, and a great moment of tension to have to disable it and make the move before it recovers. It's an aesthetic that feels very familiar to a game player, and would have made no sense to an audience of 1960: it's very much like a puzzle in an action game, in which a series of events in one location changes the state of another location such that progress can be made, all while the clock is ticking. I'm not suggesting that this technique was borrowed from any specific game, or even that the technique was invented in games. Rather, I'm suggesting that perhaps the paths of development for films and games are becoming more closely aligned and are beginning to intersect.
In terms of audio, it's harder to pinpoint a similar influence, but it cannot be far off. Certainly there are hints already: the soundtracks to The Matrix and M:I-2, for instance, pop back and forth between dramatic orchestral cues and driving techno music, as has been the habit of some fighting games for years. Or in the world of television, consider the little sounds that play during sportscasts when the screen overlays come up; don't they sound remarkably like the "interface" sounds in virtually every game made in the last five years?
Variety and Repetition
There is a constant tension in game audio that simply does not exist in film audio, that being the desire to minimize repetitive sounds versus the limitations of the delivery mechanism. If a sound designer working on a film wants every footstep, door slam, gunshot and telephone ring to be completely unique, it's simply a matter of creating the right number of instances of each sound and laying them into the soundtrack. Similarly, a film composer can choose the extent to which each cue is wholly original and balance that against some degree of repetition for aesthetic impact. Sometimes it might make sense to reuse music, but many times each cue needs to be original. Regardless, though, the medium itself does not dictate these choices. (Although the film's schedule, budget, and staff might dictate these choices in film just as they do in games.)
In a game, of course, it would be unfeasible to attempt to make every instance of a sound unique, if for no other reason than it would require too much RAM to store all the possible variations. But even assuming infinite system resources, there remains the runtime problem inherent in the interactive world of a game. It isn't possible to make every gunshot sound unique if you don't know how many gunshot sounds are needed! This is not to say that variety is impossible, of course, but making each instance of a sound unique is simply not possible, much less workable. Instead, in games, we need to achieve a sense of liveliness and variety in our sounds by other means.
In Blood Wake, the chain gun on the bow of the ship proved to be a particular challenge for the audio team.
In Blood Wake, a recent nautical combat game for the Xbox that I worked on, the "chain gun" sound presented a difficult problem. The player boat almost always has at least one chain gun (and, depending on upgrades, as many as four chain guns) and the enemy boats often have them too. We needed to make a rapid-firing weapon sound that was powerful and impactful, could be listened to for extended periods without becoming irritating, and was able to support multiple playback instances simultaneously. In addition, since the guns jammed up when fired for too long, the sound needed to be able to sputter out and stop. Loops long enough to avoid sounding repetitious would have caused us memory problems, and it would have been tough to get the jammed-up sound with loops anyway. So we had to go with individual shot sounds and find some way to make them sound natural.
Chris Hegstrom, the sound designer on the project, came up with a great solution. First, he created individual gunshot sounds that had the kind of punch and power that I was looking for. He created two groups of shot sounds, player chain gun and enemy chain gun sounds, and made eight variations within each group. I then asked the audio programmer to write a system that would call these sounds in a quasi-random order (it was random, but was weighted to be less likely to call the same sound twice in a row) and could adjust the playback rate. We also made adjustments to slightly randomize the pitch and volume, and to vary the timing of each shot so that it was almost, but not quite, perfectly regular. Then we added an additional layer of control: when more than one gun was firing, instead of calling another instance of the same system, we simply increased the rate of the shots and increased the pitch, volume, and time variations slightly. This avoided the horrible "flanging" problem caused by playing the same sound multiple times at very slightly different times and pitches. Hegstrom spent many painstaking days tweaking all the little values, but once he was done, he had created a very convincing multiple weapon sound.
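For readers who think better in code, here is a minimal Python sketch of the general idea described above. It is not the Blood Wake implementation: the class and function names, the repeat weight, the base firing rate, the jitter amounts, and the way rate and variation scale with the number of guns are all assumptions chosen for illustration.

```python
import random


class ShotSelector:
    """Pick one of N shot variations at random, weighted so that the
    variation played last is less likely to be picked again."""

    def __init__(self, num_variations=8, repeat_weight=0.1, rng=None):
        self.num_variations = num_variations
        self.repeat_weight = repeat_weight  # hypothetical value, < 1.0
        self.rng = rng or random.Random()
        self.last = None

    def next_index(self):
        # Every variation gets weight 1.0 except the previous one.
        weights = [
            self.repeat_weight if i == self.last else 1.0
            for i in range(self.num_variations)
        ]
        indices = list(range(self.num_variations))
        self.last = self.rng.choices(indices, weights=weights, k=1)[0]
        return self.last


def schedule_shots(duration, guns_firing, selector, rng,
                   base_rate=10.0, base_spread=0.05):
    """Return (time, variation, pitch, volume) events for ONE shot stream.

    Extra guns do not start extra streams. Instead the single stream
    fires faster and its random spread widens a little. Both scaling
    rules below are assumptions made for this sketch.
    """
    rate = base_rate * guns_firing                         # shots per second
    spread = base_spread * (1.0 + 0.25 * (guns_firing - 1))
    interval = 1.0 / rate
    events = []
    t = 0.0
    while t < duration:
        pitch = 1.0 + rng.uniform(-spread, spread)   # playback-rate multiplier
        volume = 1.0 - rng.uniform(0.0, spread)      # slight random attenuation
        events.append((t, selector.next_index(), pitch, volume))
        # Almost, but not quite, perfectly regular timing.
        t += interval * (1.0 + rng.uniform(-spread, spread))
    return events


if __name__ == "__main__":
    rng = random.Random(42)
    selector = ShotSelector(rng=rng)
    for event in schedule_shots(0.5, guns_firing=2, selector=selector, rng=rng):
        print("t=%.3f  variation=%d  pitch=%.3f  volume=%.3f" % event)
```

Each event would be handed to whatever playback call the audio engine provides. The point of the sketch is the design choice: one stream whose parameters change with the number of guns, not several copies of the same stream playing nearly identical samples on top of each other.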
If we had been working on a film, none of that would have been necessary. We would have purpose-built each combat sequence with its own gun sounds, and modulated them to get exactly the character we needed at the moment.
Quality Versus Quantity
The drive to increase the variety of sounds and reduce repetition often leads to a desire to increase the quantity of sounds. We then face a painful tradeoff: should we include more sounds at lower fidelity (lower sample rate, higher compression, etc.), or fewer sounds of higher fidelity? There are actually pretty good arguments on both sides of this issue, and I've shipped games that favor each approach. But my recent experience gave me new insight in this area.
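To make the tradeoff concrete, here is a small back-of-the-envelope sketch in Python. The 2 MiB budget, the one-second sound length, and the use of uncompressed 16-bit mono PCM are hypothetical figures chosen only to show the arithmetic; they are not numbers from any game discussed here, and real consoles typically use compressed formats that change the totals.

```python
def pcm_bytes(seconds, sample_rate_hz, bits_per_sample=16, channels=1):
    """Size in bytes of an uncompressed PCM sound."""
    return int(seconds * sample_rate_hz * channels * bits_per_sample / 8)


def sounds_that_fit(budget_bytes, seconds, sample_rate_hz):
    """How many sounds of the given length fit in a RAM budget."""
    return budget_bytes // pcm_bytes(seconds, sample_rate_hz)


if __name__ == "__main__":
    budget = 2 * 1024 * 1024  # hypothetical 2 MiB sound RAM budget
    for rate in (44100, 22050):
        count = sounds_that_fit(budget, 1.0, rate)
        print("%d Hz: %d one-second sounds" % (rate, count))
    # prints:
    # 44100 Hz: 23 one-second sounds
    # 22050 Hz: 47 one-second sounds
```

Under these assumptions, halving the sample rate roughly doubles the number of sounds that fit, at the cost of losing the top octave of frequency content.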
I have long believed that the only measure of success that matters is the overall sound quality of the final product. The quality of any given sound is arguably irrelevant. Nonetheless, while we were producing The Two Towers, we decided to push hard to make each sound as fine as it could be, then take care of the relationships between the sounds at mix time. This approach offered a couple of benefits. First, it forced us to concentrate on what we were trying to accomplish with each sound. We asked ourselves questions like "Is this sound realistic? Is it stylized? Does it have impact? Should it stand out and hold attention on its own, or should it meld into the ambience and the rest of the soundscape?" These decisions forced us to pay attention to the consistency between sounds in a way that we simply wouldn't have had to do if we were building the sounds directly into the mix. Plus, since we had access to a number of sounds from the film production, we wanted to meet or exceed the quality standards that they set at every point.
The result was a set of sounds that required quite a bit of finessing to work together at runtime -- because if every sound is just as warm and rich and vibrant as the next, the mix is often one big, muddy mess. Sure enough, with all elements present, the mix didn't work well. So we went through every sound and adjusted it slightly, then listened again, found the new problem areas and adjusted them yet again. We did this over and over in dozens of passes. It was very much like a film or music mix, slowly and subtly balancing each sound's volume and equalization until they all worked together just right, and yet retained as much of their previous character as possible.
When you take many passes over a game's audio, you find spots where unexpected sounds jump to the fore. This is due to the nature of an unpredictable run-time mix. For instance, perhaps the player has killed all the enemies in an area and chooses to swing his weapon again anyway, and hears the sound in all its exposed glory. Or maybe an AI-driven character chooses to attack with a unique combination of enemies. One way or another, the mix is bound to find a way to surprise you, no matter how much you work to control it.
Having each individual sound produced to such a high degree also had an interesting impact on the production of the game. As we dropped sound clips in, the game began to sound complete very quickly. Because each sound could stand on its own relatively well, even a skeletal set helped bring the game to life sonically much earlier in the development cycle than is typical. This allowed us to get useful feedback early and get the other disciplines bought in to what we were doing.
Finally, making each sound as good as it could be in isolation also led us to sacrifice quantity for the sake of quality, when forced to make the tradeoff. While there's hardly a moment in The Two Towers when there aren't at least ~130 sounds loaded into RAM, there are always moments when something that should be making a sound isn't. We decided to use this to our advantage by paying extra close attention to the "point of view" the sound created.
The sound designer, editor, mixer, and director of a film have a great deal of power to affect the audience's perceived POV, increase or decrease its scope, and elicit emotion. In a game, this can be more difficult. Again, because of the unpredictable nature of a run-time mix, it's hard to know what is going to seem important at any given moment. In our case, winnowing the number of sounds down for the sake of fidelity made us think deeply about this problem. Footsteps are boring; they're throwaway sounds, right? Actually no, because at least in the case of the player character, they provide a sense of presence in the world that is key to the player's involvement. Big, showy enemy creature vocalizations are the showcase for sound design, right? Well, not always, because when things get busy on screen, these sounds tend to clutter up the mix and provide little emotional benefit. It turns out that the things that we did to get the game audio to work in The Two Towers are precisely the things that cinema's top sound designers call for.
While very little that I've ever predicted about the future has proven to come true, that won't stop me from speculating. Certainly some general trends have developed along the paths I thought they might (though significantly less quickly than I would have liked), and I feel fairly confident I can extrapolate some useful projections. For one thing, the amount of attention paid to, and emphasis placed upon, game sound has increased dramatically over the last ten years. And, predictably, the quality has improved right along with it. It's important to note, though, that this hasn't been because some Hollywood hotshot has come in and dictated the "right" way to do things. Game audio has learned many lessons from film audio -- and has lots more to learn -- but it has also developed into a mature and unique craft of its own. I now believe that film audio can take some lessons from games.
For Further Reading
To those who think that the film guys have it easy and that the problems we face in game audio are unique, I highly recommend the following article by Randy Thom (of course, it's also full of great thinking about using audio for storytelling, point of view, etc).
To read more about my team's experiences adapting the Lord of the Rings movie music, sound, and voice into a game environment, check out these articles: