Making Ear Monsters: Developing a 3D Audio Game

Sound has always been an important component of videogames. But what happens when you turn that on its head and design a game where sound is the main driver of gameplay?

Sound has always been an important component of videogames.   But with the exception of music games, it has always taken on a supporting role for the gameplay itself.  That’s natural because video games are, well… video games.

But what challenges do we have when we turn that concept on its head and make sound the primary driver of gameplay?  In this and following posts, I’ll discuss the creation of Ear Monsters, an arcade-style casual audio game for iPhone/iPad/iPod touch that was designed with sound as the main gameplay mechanic.


There have been a few audio-only videogames over the past couple of years. Games such as Papa Sangre, Night Jar and BlindSide are excellent, innovative games that rely exclusively on audio for gameplay.

Unlike these longer, story-based games, my goal was to create a short, casual game experience, with arcade-style scoring designed to be played in bite-sized chunks.   I like the old-style arcade gameplay, where higher scores are the result of better skills and practice.  Not only does that match my short-experience goal, but it also provides a built-in achievement mechanism—you are continually trying to best your own score or that of your friends.

I chose a gameplay mechanic driven by 3D Sound (which we’ll get to later), but inspired by traditional visually oriented games such as Fruit Ninja or Whack-a-Mole, and even Asteroids.  You can learn the basic gameplay in a few seconds, and as you get better, the game becomes more difficult in response.

In this post, I’ll be discussing the gameplay, design and accessibility challenges I faced creating the 3D audio game, Ear Monsters; in a follow-up post, I’ll cover the sound-related challenges.

Gameplay: Idea vs. Reality

My initial conception was to base gameplay on variations of Whack-a-Mole or Fruit Ninja, but use a 3D sound engine to place the enemies randomly in 3D space around the listener instead of visual minute mole minions or floating flung fruit. The player would tap the screen at the spot corresponding to the location of the 3D sound enemy, and voila: an audio game. That seemed simple enough, so I figured people would pick up on it pretty quickly. A basic game mechanic prototype took only a few weeks to get up and running. Here was my initial, very sparse, very empty “playfield”:


  Ear Monsters 3D audio game

Turns out things weren’t so simple. When faced with an utterly blank play area, people had no idea what to do. Even when the game was described to them, it was maddeningly difficult for most. People simply aren’t used to reacting quickly to audio cues and audio cues alone, let alone on a visually oriented device like an iPhone or iPad. Players were desperate for something visual to latch on to. One mentioned being fixated on the countdown timer at the top of the screen. On top of that, it turns out that neither our natural hearing nor 3D sound technology really makes my idealized gameplay mechanic very practical; as human beings we just aren’t very good at precisely locating sounds around us in the absence of any additional visual information.

So I went back to square one. Clearly random 3D audio fruit was not going to work. But the concept still seemed sound; it was just too hard. So I simplified the game dramatically. Instead of letting the enemy appear at an arbitrary position in 3D sound space, I limited the number of possible positions where it could appear. That way, a player could quickly learn that sounds were coming from a limited number of positions, making the game much easier and more manageable. Coincidentally, this fit in with an original thematic idea for the game:

Invisible Monsters from a parallel dimension have created wormholes to invade our universe -- but they don't realize we can still hear them.


By limiting the location of the monsters to discrete positions, it was as if they were appearing at the mouths of their wormholes.  Sort of an alien 3D audio “Whack-a-Monster”.

I limited the number of discrete positions to 14: three each at varying distances laterally to the left and right of the player, as well as 4 up/front-ish and 4 down/back-ish. Still, the game was too hard unless you’d already practiced quite a lot. Although I had designed a ‘practice arena’, where you can see as well as hear the enemies, a game is no fun if you have to practice to be any good at it.

So I reduced the number of positions even further, so that at the start of the game the Monsters can only appear in one of 4 specific locations, unambiguously differentiated by sound. This not only let me make the game playable for people who hadn’t played it a lot, it also let me naturally add in another element common in arcade-style games: the game getting harder as you get better at playing it. As the player’s score increased, I could assume they were getting the hang of things. So just when they started hitting pretty much every monster, I’d increase the number of potential positions by a few. At higher scores, I’m back to the full 14 “wormholes”, which turns out to be challenging even for someone who has played quite a bit.
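The score-driven ramp described above can be sketched as a simple lookup. This is only an illustration in Python (the game itself is an iOS app): the 4 starting positions and the full 14 come from the description above, while the score thresholds, the intermediate counts, and the names `WORMHOLE_STEPS` and `active_wormholes` are assumptions made up for the example.

```python
# (minimum score, wormhole positions in play), sorted by score.
# Only the 4 and the 14 come from the article; every other number
# here is a hypothetical tuning value.
WORMHOLE_STEPS = [(0, 4), (500, 6), (1500, 8), (3000, 11), (5000, 14)]


def active_wormholes(score: int) -> int:
    """Return how many wormhole positions monsters may use at this score."""
    count = WORMHOLE_STEPS[0][1]
    for threshold, positions in WORMHOLE_STEPS:
        if score >= threshold:
            count = positions
    return count
```

A step table like this keeps the tuning in one place, so the thresholds can be adjusted after playtesting without touching the spawn logic.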

At that point, if you knew what you were doing, the game was simple to understand, fairly easy at the onset, and got harder as you got better.  Solving the “if you knew what you were doing” issue turned out to be the next big problem.

Teaching Gameplay

From the initial stages, I wanted to avoid an explicit tutorial, or a voice explaining what it is that you are supposed to do. While tutorials work fine for story-based games such as Papa Sangre, that didn’t fit well with the bite-sized casual experience I was shooting for. The challenge was how to make an audio game’s rules and goals self-evident when the gameplay is based on sound, without having a disembodied voice lead you through the game. For this, I ended up taking advantage of the video part of the videogame…

My first attempt at gameplay discovery relied on the separate practice arena. Here, the monsters and bombs are fully visible when they make their sounds. Simple, I thought. People will go to the practice arena, and make the connection between the sound and the location on-screen where they are supposed to tap to kill the enemy. Of course, the Achilles’ heel of that approach was “people will go to the practice arena.” Not only is that unlikely, but as mentioned above, who wants to have to practice and be trained to play their quick, casual game?

During playtesting, however, it was clear that players “got it” in the arena. In a sense, the practice arena is “whack-a-mole”, since you can see the enemies as well as hear them; they appear and you whack them by tapping where you see them. So the challenge was how to duplicate that teachability in the main game, so as to make the practice arena unnecessary. I ended up adding two very important visual elements to my otherwise audio-only game: visual “wormholes” and visible enemies.

Wormholes and Disappearing Monsters

Despite limiting the number of possible aural enemy locations to only 4 at the outset, players nonetheless were attempting to hit the monsters all over the screen, even after practicing in the arena. They figured that monsters could appear from anywhere, and as I’d mentioned, human hearing isn’t really good enough for the players to realize that every monster was coming from one of only 4 positions! The solution turned out to be to cue the player visually where monsters could potentially appear by displaying the wormholes. So now my gameplay screen looked something like this:

  Ear Games 3D Sound

Each of those black circles is a “wormhole” and monsters only appear from them.  Then to train the player further, I made the Monsters visible at the very beginning of the main game.  For the first few Monsters, the player  can see them as well as hear them.   However, with each subsequent appearance, they get dimmer and dimmer.  By the 8th monster, you can’t see them at all, and have to rely on sound, which is then the main gameplay mechanic.  This led me to modify my thematic idea to:
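One way to picture the fade is as an opacity that depends on how many monsters have appeared so far. The sketch below (Python, for illustration only) assumes a linear fade; the text above only says that monsters dim with each appearance and cannot be seen by the 8th, so the curve shape and the name `monster_opacity` are assumptions.

```python
INVISIBLE_FROM = 8  # from the article: by the 8th monster nothing is visible


def monster_opacity(appearance: int) -> float:
    """Opacity for the Nth monster (N >= 1): 1.0 is fully visible, 0.0 invisible.

    Assumes a linear fade from the 1st monster down to the 8th.
    """
    if appearance < 1:
        raise ValueError("appearance is 1-based")
    remaining = INVISIBLE_FROM - appearance
    # Clamp at zero so every monster after the 8th stays invisible.
    return max(0.0, remaining / (INVISIBLE_FROM - 1))
```

A non-linear curve that keeps the first few monsters fully visible before dimming would fit the description just as well; the linear version is simply the shortest to write down.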

Monsters from a parallel dimension have created wormholes to invade our universe – they quickly become invisible in our world, but they don't realize we can still hear them.


At that point, I had solved my main problems: the game being too hard for beginners, and the player not knowing what to do. Further, I had some very easy ways to increase the difficulty of the game: eliminating the visual enemy, increasing the number of wormholes, and finally eliminating the visibility of the wormholes themselves. In fact, at a certain point deep in the game, the screen reverts to the original concept as shown at the beginning of this article, which provides no visual clues whatsoever. But by the time a player has reached the point where all visual clues are gone, they have demonstrated a mastery of the main gameplay mechanics and rules.

Although the game could be played and understood at this point, I still wanted the sighted player to be able to understand what was going on more easily. So in addition to the aural feedback for elements such as defusing time bombs or earning extra bonus points or time, I added simple text-based visual elements to let the player know what they did. These float up and off the game screen as they’re awarded, as with the “air support bonus” in the picture above.

Accessibility and REAL Accessibility

Ear Monsters was not designed specifically to be a game for the visually impaired.  However, since the game is a natural fit for the visually impaired gaming community, I spent a good deal of time thinking how a visually impaired gamer would play Ear Monsters and what could be done to improve that experience.  This is a precept in all EarGames audio games.

Fortunately, Apple’s iOS SDK provides straightforward and powerful methods for increasing accessibility for the visually impaired. In addition to the basics, such as adding VoiceOver text to the gameplay buttons, Apple makes it easy to add Siri-style voice to portions of the game when a visually impaired player is playing, such as reading their score, listing the high scores and providing other information. It was also important to be able to automatically turn off the VoiceOver functionality on the main game screen itself, since by default apps that read screen touches directly are not very compatible with VoiceOver.

[Technical note: VoiceOver is an iOS feature that is globally turned on or off by the user. When on, it uses the Siri voice to read app names, menu items and buttons and to provide hints on what they do, and it lets the user navigate the device with special touch commands. Games can also provide specific Siri spoken output that is only heard if the user has VoiceOver turned on.]

All in all I was pleasantly surprised at how easy it was to add accessibility to Ear Monsters and I figured I had an audio game that was very accessible for the visually impaired community.  So I released Ear Monsters to the world. 

However, early feedback from great community sites for the visually impaired like and AppleVis showed me that Ear Monsters in fact wasn’t as accessible as I thought. My own analytics were showing this as well. Sure, all the buttons spoke their functions, and I used VoiceOver to speak the score and provide other information to the player. So Ear Monsters met the technical definition of being “accessible.” But I had provided the visually impaired gamer little of the help that I had given the fully sighted gamer when it came to learning the game. In fact, of games started with VoiceOver enabled (presumably by visually impaired gamers), almost 15% of game sessions ended with a score of zero! Clearly the game was too hard to understand for many non-sighted players who did not have the benefit of my visible wormholes and vanishing visual enemies.

My solution to this is to update the game with some specific additions that affect only games that are started when VoiceOver has been turned on by the player. If a game ends with a particularly low score, the Siri voice will provide more detailed, very specific gameplay hints, explaining that there are a limited number of possible positions and describing precisely where to tap on the screen to hit the monsters.
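The rule for when to speak the extra hints might look roughly like the following. This is a hypothetical Python sketch of the decision only, not the game's code: the threshold value, the placeholder hint wording, and the names `LOW_SCORE_THRESHOLD` and `post_game_hint` are all assumptions, and actually speaking the text would go through the platform's VoiceOver announcement mechanism.

```python
from typing import Optional

LOW_SCORE_THRESHOLD = 100  # hypothetical cutoff for a "particularly low" score


def post_game_hint(score: int, voiceover_on: bool, wormholes: int) -> Optional[str]:
    """Return extra spoken guidance for the game-over screen, or None.

    Hints are only produced for games played with VoiceOver on that
    ended at or below the low-score cutoff.
    """
    if not voiceover_on or score > LOW_SCORE_THRESHOLD:
        return None
    # Placeholder wording; the real hint would describe exact tap locations.
    return (
        f"Hint: monsters can only appear in {wormholes} positions. "
        "Listen for where each sound comes from, then tap that part of the screen."
    )
```

Keeping the decision separate from the speech call makes it easy to check the rule on its own, and sighted players who never enable VoiceOver are unaffected.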

One other accessibility issue was reported by several blind players, who said that all the sounds were backwards; when they tapped on the right side, they’d hear their attack from the left. It turns out that the code I’d added to detect orientation and flip the game (which, ironically, I had added specifically so that visually impaired gamers would always have the correct orientation!) was too clever by half. A sighted player will typically hold their device in their hands, tilted towards themselves, and that’s how we tested Ear Monsters. However, it turns out that many visually impaired players played the game with the device either lying flat on a table or in their laps, with a slight tilt away from them. In that case, the game would frequently rotate itself away from the player, upside down. And when they tried to turn their device around to fix it, it would rotate away again! That is also being addressed by fixing the game to one specific landscape orientation, which is common for iOS games.
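To see how a tilt-based flip can produce "backwards" sound, here is a deliberately naive Python model of the problem. It is an assumption about how such a flip could behave, not the game's actual orientation code; `orientation_from_tilt` and `heard_side` are hypothetical names.

```python
def orientation_from_tilt(tilt_toward_player_deg: float) -> str:
    """Naive auto-detect: pick the screen orientation from the tilt sign.

    Positive means the screen is tilted toward the player, negative away.
    """
    return "normal" if tilt_toward_player_deg >= 0 else "flipped"


def heard_side(tapped_side: str, orientation: str) -> str:
    """Side on which the attack is heard for a tap on a physical screen side."""
    if orientation == "normal":
        return tapped_side
    # A screen rotated 180 degrees mirrors left and right for the player.
    return "left" if tapped_side == "right" else "right"
```

In this model, a device tilted slightly away (a negative tilt) turns a tap on the physical right into an attack heard on the left, which matches what players reported. A device lying flat sits right at the boundary, so small sensor changes could push it either way. Locking the orientation removes the dependence on tilt altogether.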

My big lessons learned from those two issues? Game objective discoverability and knowing how visually impaired users use their devices are at least as important to accessibility as ensuring all the buttons have proper spoken labels and spoken hints!


Creating Ear Monsters seemed like a very straightforward idea, but as in many things, there were multiple, significant devils dwelling in the details.  Working through these details yielded a game somewhat  different than initially envisioned (en-hearon-ed ?), but with the basic gameplay mechanic pretty much as originally conceived.  And as discussed, creating a truly accessible app required more thinking and work than mere VoiceOver support.

In the next post, I’ll discuss the rather unique sound design challenges, including a couple of things that surprised me immensely when creating the sounds and music for Ear Monsters.

Brian Schmidt is a 25-year game audio veteran and an independent Game Audio Consultant, Composer and Sound Designer at Brian Schmidt Studios, LLC and is founder of the audio game company, EarGames. He is also the founder and Executive Director of GameSoundCon. Brian sits on the GDC Advisory Board and is President of the Game Audio Network Guild.
