Yonder: The Cloud Catcher Chronicles
An Episodic Audio Journal
Episode Two: 40 Species of Noise
I love open world games, and I find myself observing the environments in detail as I travel around exploring. But I often find myself thinking about how they compare to the real world. I spent three years living in Japan, 12 months in Tokyo and 2 years in Aomori Prefecture. One of the main things I noticed there was how the environments changed with both the season and the location.
Spring and Summer time are vibrant seasons in Japan; birds are very active, but the really noticeable aspect of these months from a sound point of view is the insects. Millions of cicadas during the day and crickets during the night saturate the Japanese landscape with their song. What many travellers to Japan do not realize is that the various species have very specific song types as well as very specific locations in which they live. This means if you travel from southern Japan up to Tokyo, and then north to Tohoku you will be able to hear different insect sounds. And the insect sounds will also change from season to season. Once I realized this I was able to watch a Japanese movie and if there were insect sounds I could identify where it was filmed and what time of year it was.
We often ask our audience to spend hours inside our game worlds and we design complex terrain with various biome types. And yet sometimes we craft the audio for these environments with much less detail than the visuals. I wanted to not only add more life to the world of Gemea, but to create a living world that would highlight the changes of day and night, summer and winter, sunshine and storm.
Yonder: The Cloud Catcher Chronicles has the sound of over 40 species of birds, insects and frogs. Did I go overboard with this? Perhaps, but the result is a world that has a dynamic ecology that reflects various states of change. But each choice and its implementation is also designed to support the overall narrative of exploring the land of Gemea.
To start with I wanted to select as broad a range of content as possible. I have a massive SFX collection as well as my own personal recordings. I have also recently been working on restoring some older recordings and making them suitable for game implementation purposes. Working through sound recordings is like selecting which instruments you want to compose for. Each sound has character and texture and a quality that suits certain emotional states. Selecting the most suitable Cicada sound is no less important than choosing which wind instrument to write a solo for.
I created a first pass of the dynamic environment system months ago. It worked, and from a technical standpoint it sounded effective in the game. This was a series of basic bird and insect sounds that functioned appropriately for the game world. I will go into the details of the system in a minute, but the selection of source content is critical, so I want to focus on that for a second. Putting in placeholder sounds meant the system could be tested. As the world grew and the artwork became more finalized I was able to spend many hours in the world. Most of this time was spent implementing other sounds or performing constant ongoing mix balancing, but sometimes I would just play and explore and I would listen and I would “feel”.
When it came to selecting the second pass content I knew the biomes very well and I knew what I wanted. I would go through my raw SFX material, often not looking at the names, and I would listen with an open mind and allow the sound to take me wherever it would. Some sounds were lush and full of life, others seemed sparse, and some would evoke a feeling of hot, dry, arid landscapes. In most instances this is because the sounds were from a creature that lived in the appropriate environment in the real world, but instead of working with names and descriptions I worked with feelings, and I think this allowed for some nice choices of content.
We have been using the State System in Wwise to control various aspects of the sound and music. We already had a day and night state, but I wanted to expand on this. So we added dawn and dusk states to create transition points between the main day and night states. This is used for both music and SFX, but for the species of birds and insects I crafted each one individually to align with how they made me feel. Certain bird sounds just didn’t sound to me like I would hear them immediately at dawn. They had more of a feeling of circling overhead at noon. So these sounds only triggered once the full day state was active. Some insects I implemented to trigger right at dusk and others not until full night had fallen.
So in this example the insect sounds will be active at dusk and during the night, but fade when dawn arrives. And this particular species is only active during the Spring and Summer months.
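The gating logic described above can be sketched in a few lines. This is not Wwise code and not how the project was built (the states were assigned in Wwise itself); it is only an illustration of the rule, and the class name, species name and state labels are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeciesRule:
    """Hypothetical record of when one species is allowed to sound."""
    name: str
    times_of_day: frozenset
    seasons: frozenset

    def is_active(self, time_of_day: str, season: str) -> bool:
        # The species sounds only when BOTH states match its rule.
        return time_of_day in self.times_of_day and season in self.seasons


# Illustrative species matching the example in the text:
# active at dusk and night, in spring and summer only.
night_insect = SpeciesRule(
    name="night_insect",  # hypothetical name
    times_of_day=frozenset({"dusk", "night"}),
    seasons=frozenset({"spring", "summer"}),
)

for state in ("dawn", "day", "dusk", "night"):
    print(state, night_insect.is_active(state, "summer"))
```

Keeping the time-of-day and season sets separate means a species can be retuned on one axis without touching the other.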
The dynamic structure for the birds used individual bird calls triggered at various rates depending on exactly which species I was working on. Smaller birds tend to twitter more often than larger birds. This system also allowed me to alter the trigger rate dynamically. So at dawn and dusk I could create the “massed chorus” effect of many birds all singing more often. Over the course of the day the trigger rate would drop off so that at noon when the day was hottest the birds were barely present.
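As a rough sketch of that idea, the delay before a species' next call can be derived from a base call rate scaled by a time-of-day multiplier. Every number and name below is an assumption chosen for illustration, not a value from the game, and the random jitter is my own addition to keep the calls from sounding mechanical.

```python
import random

# Hypothetical activity multipliers: a chorus at dawn and dusk,
# very little during the heat of the day, silence at night.
ACTIVITY = {"dawn": 3.0, "day": 0.25, "dusk": 3.0, "night": 0.0}


def next_call_delay(base_calls_per_minute, time_of_day, rng):
    """Return seconds until the next call, or None if the bird is silent."""
    rate = base_calls_per_minute * ACTIVITY[time_of_day]
    if rate <= 0:
        return None
    mean = 60.0 / rate
    # Jitter the delay around the mean so calls do not fall on a grid.
    return rng.uniform(0.5 * mean, 1.5 * mean)


rng = random.Random(0)
small_bird = 12.0  # assumed base rate: small birds call more often
large_bird = 3.0   # assumed base rate: larger birds call less often
for state in ACTIVITY:
    print(state, next_call_delay(small_bird, state, rng),
          next_call_delay(large_bird, state, rng))
```

With these assumed values a small bird calls every second or two at dawn but only about every twenty seconds during the day.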
From a Seasonal point of view I set many birds to be most active in Spring time when many creatures are mating, then slightly less active in summer and in some biomes completely replaced with noisy summer insects. In some regions the same bird species is active through spring, summer and autumn and then they fall quiet during the cold winter. In other biomes a specific bird may only be active in spring, and then through other months different birds or insects become more vocal.
The Grasslands_Birds consists of four species and in this example the BellsVireo has been expanded to show it has 16 sound files that make up the full pool for that species. Each bird has a varied number of sound files depending on the type of bird and what sound files I had available for that species.
The advantage of the Wwise state system is that I could easily assign and tune each species object to be unique but also so they would blend and dovetail nicely. The other useful implementation technique was to add all my biome specific sound objects into a single event. The main game objects I used for implementation into Unity were the biome specific trees. Each biome had its own tree species. As birds and insects generally gather in trees this was logical from a narrative point of view and provided objects spread throughout the game world as emitters. This also meant I had to sync only one event to a tree Prefab and it would be instanced across the whole world. (This was important because the player can collect plant seeds and place them anywhere else in the world.)
Initially I had both a bird event and an insect event attached to each tree Prefab. Then I realized I could simplify this. A Wwise event can contain multiple sound objects. So I could place each of the different bird and insect species that I wanted to inhabit a biome into a single event and attach that to a tree prefab. The State system meant that even though there might have been 4-6 sound objects in the one event, each would only play at the specific day/night and seasonal state defined for it. Each of these objects could have unique effect and attenuation behaviour. So again the drop-off range of species could all be tuned to present unique behaviour within the world.
A single Event can contain all the sound objects I need to produce the spatialized environment sound for a biome. Each of the objects in this event will only trigger when their appropriate states are met. So even though there are 6 sound objects here only one usually plays at any one time. This makes implementing into the game much simpler once the system is defined.
As you walk through the world there is a true spatial environment around you. A tree may hold 2 or 3 different species of birds, and each species has a range of bird calls, so the entire system generates a spatial dynamic environment. If you chop down a harvestable tree it stops emitting its related sounds. If you deforest an entire biome its environmental audio will reflect this.
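The consequence of tying sounds to trees can be shown with a small model. This is a hypothetical sketch, not the Unity or Wwise implementation: the class names, the 2D positions and the simple distance cut-off standing in for attenuation are all assumptions.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Tree:
    tree_id: int
    x: float
    y: float


@dataclass
class Biome:
    """Hypothetical container: every tree is a sound emitter."""
    trees: dict = field(default_factory=dict)

    def plant(self, tree: Tree) -> None:
        self.trees[tree.tree_id] = tree

    def chop(self, tree_id: int) -> None:
        # Removing the tree removes its emitter, so its sounds stop.
        self.trees.pop(tree_id, None)

    def audible_emitters(self, lx: float, ly: float, max_distance: float):
        """IDs of trees close enough to the listener to be heard."""
        return [
            t.tree_id
            for t in self.trees.values()
            if math.hypot(t.x - lx, t.y - ly) <= max_distance
        ]


forest = Biome()
forest.plant(Tree(1, 2.0, 0.0))
forest.plant(Tree(2, 0.0, 5.0))
forest.plant(Tree(3, 50.0, 50.0))  # too far away to hear from the origin
print(forest.audible_emitters(0.0, 0.0, 10.0))
forest.chop(1)
print(forest.audible_emitters(0.0, 0.0, 10.0))
```

Because the emitters are the trees themselves, nothing extra has to track deforestation: an empty biome simply has nothing left to emit from.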
Each biome also has birds in flight. These are very basic animated shapes. But they also have sounds attached to them so they sporadically emit a bird call as they fly past. This final element really helped to sell the feeling of a dynamic alive world.
The weather system for each biome is also unique. So while there is a general wind sound through all biomes, the forest also has a wind-through-leaves sound that is attached to the trees and emits from that location. Rain in forest areas is the sound of rain on leaves, while in the grassland it is a lighter rain-on-ground sound. Alpine and desert areas have very different wind sounds to other environment types. All of these choices were made with the same “how does it make me feel” approach to help support the overall narrative.
Making the audience feel cold in the winter months and in snow biomes, and hot and dry in the desert, can be achieved more successfully when the audio fully supports the visual effects. In fact, the audio can often be more evocative than visual changes in triggering emotional responses in the audience. Keep in mind that apart from the basic flying bird shapes, none of these birds or insects exist in the game as objects; they only exist as sounds. So the world is vastly populated with a great and diverse selection of lifeforms that exist only because of the audio. In this regard I got to decide and create much of the ecosystem of the world of Gemea, and this helped create a lush environment without having to create dozens of models and lots of complex code.
All of the techniques that I applied to creating the environmental audio for Yonder: The Cloud Catcher Chronicles were taken from my experience doing research for VR/AR/MR implementation. For the New Realities we are striving for more detailed and precise surround spatial environments; we want to immerse the audience into these worlds and make their experience more engaging. But I realized that many of these techniques were just as valid for a "traditional screen" format game world. So movement of the camera produces a similar “world rotating” effect in the audio that head tracking does in VR. This is because all the weather sounds such as wind and rain are set at four compass points and all of the environmental birds and insects are localized throughout the world inside the trees. So Gemea is a dynamic virtual environment in many ways and the player’s experience should be far more engaging and enjoyable because of it.
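The “world rotating” effect comes down to simple geometry, which can be sketched as follows. In the game the spatialization is handled by the audio engine, so this is only an illustration; the bearing convention (degrees clockwise from north) and the sine-based left/right pan are assumptions made for the example.

```python
import math

# Weather emitters fixed at four compass points (degrees clockwise from north).
COMPASS = {"north": 0.0, "east": 90.0, "south": 180.0, "west": 270.0}


def relative_angle(bearing: float, camera_yaw: float) -> float:
    """Angle of an emitter relative to where the camera faces, in [-180, 180)."""
    return (bearing - camera_yaw + 180.0) % 360.0 - 180.0


def stereo_pan(bearing: float, camera_yaw: float) -> float:
    """Simple pan value: -1 is fully left, +1 is fully right."""
    return math.sin(math.radians(relative_angle(bearing, camera_yaw)))


# Turning the camera moves every fixed emitter around the listener.
for yaw in (0.0, 90.0):
    print(yaw, {name: round(stereo_pan(b, yaw), 3) for name, b in COMPASS.items()})
```

Facing north, the east emitter sits hard right; turn the camera to face east and the north emitter swings to the left, much as it would with head tracking in VR.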