(Image: Felix the Cat in Feline Follies, 1919).
Between the early 1920s and the mid-1930s, Walt Disney Studios turned animated characters from a cheap distraction into an internationally recognized art form. Looking at Felix the Cat today is a somewhat painful experience, whereas Snow White seems artistically perfect even in the age of CGI. No amount of motion capture or hair simulation could make me feel more of her carefree happiness as she dances with the dwarfs.
(Image: Snow White and The Seven Dwarfs, 1937).
Much ink has been spilled on animation's rapid early evolution and the artistic as well as organizational forces that made it possible. What I want to investigate in this post is the simple observation that video games have not, by any standard, had their Disney moment yet. When animated characters become interactive characters (or non-player characters, NPCs), the illusion tends to break quickly.
(Image: Three NPCs in identical 'idle loops' in Animal Crossing: New Horizons, 2020).
(Image: Unresponsive NPC in Cyberpunk 2077, 2020).
(Image: Trying to feed an NPC in Pokémon Go AR, 2019).
I have yet to come across a mainstream game with NPCs whose behaviors aren't for the most part mechanistic, repetitive, or clumsy. Walt Disney knew that the key to making the audience suspend disbelief and accept animated characters as real, living beings isn't visual realism. Instead, it's about creating believable behaviors. In order to apply his insights to NPCs, we need to understand what makes behavior believable in the first place. I was a researcher in social psychology and artificial behavior before I became a founder of Virtual Beings, and I would like to share some research-based insights with you that help shed light on the concept of believability.
Social cognition 101
The first thing to note is that it's players, not developers, who get to decide if a given NPC is believable or not. The mental processes that happen when we humans perceive other creatures (human or not) are studied in the field of social cognition. There are three well-established findings from this field that help clarify believability.
Our perception of others is expectation-driven
Nature (via our genes) and nurture (via our upbringing) have worked together to endow us with assumptions about how the world works. These assumptions start with something as simple as object continuity: If I see you diving behind the couch during hide-and-seek, I take for granted you haven't just vaporized. Infants as young as 5 months are able to achieve this feat.
More complex assumptions concern what happens in other people's heads. Whenever we're with another being, we can't help but form expectations about what this being feels, believes, desires, and so on. Many studies have shown how and why these expectations are useful in 'normal' situations, but also that they are not infallible.
(Image: Tom Hanks giving a perfect illustration of the uncanny valley in Polar Express, 2005).
When these expectations fail, two things happen: First, cognition becomes slower and more effortful (because it needs to find out which expectations were violated, and why). Second, positive mental states such as visual pleasure, flow or immersion get interrupted and replaced by neutral or even negative ones such as confusion or disgust.
This effect has been researched for the problem of visual realism and is now widely known as the 'uncanny valley'. What is less well known is that it doesn't just apply to imagery but also to motions and behaviors, as illustrated by various GIFs throughout this post.
Our perception of others is automatic
One of the most robust findings from cognitive science over the last two decades is that most cognition and virtually all perception are automatic. Summarized as 'System 1' in Daniel Kahneman's Thinking, Fast and Slow, automatic cognition is fast, involuntary, expectation-driven and emotional.
The automatic nature of social cognition can be illustrated through a classic experiment by Fritz Heider and Marianne Simmel, first published in 1944. The researchers showed a short animated clip of moving geometric shapes to ordinary people and had them describe 'what they see'.
Virtually all subjects spontaneously described the three moving shapes as agents and constructed a story around the events, typically one about the large triangle 'bullying' the smaller one, the circle 'feeling scared', and so on. Try it yourself: when looking at the clip, it's almost impossible to see meaningless movements instead of meaningful social actions.
Agent and object aren't two extremes of a continuum
(Image: Careful there, the British grass snake kindly asks you to leave it alone.)
Heider & Simmel's experiments can also help illustrate the last finding that I want to mention here. We can perceive shades of color, feel degrees of pain and listen to a continuum of sounds, but we cannot perceive degrees of 'agenthood'. A long object on the ground in front of us can be a stick or a snake, but not a sticksnake.
Animals and objects present wildly different risks and opportunities to us, and confusing them could occasionally have been deadly for our ancestors. In consequence, our brains developed what the cognitive scientist Daniel Dennett calls the intentional stance. It refers to the automatic (see above) expectation (again, see above) that objects with certain properties are agents and therefore able to perform intentional actions. Examples of such telltale properties are the presence of a pair of eyes or self-propelled locomotion. The kinds of intentional actions we might expect from such agents range from fleeing or attacking to purring and talking.
Believability as coherence with expectations
Now back to the concept of believability. An interesting feature of the intentional stance is that it produces far more false positives than false negatives. As a result, it takes very little for us to 'see' or 'hear' agents everywhere - in pencil drawings, cloud patterns, creaky attics, and so on.
Why is this important? Because it doesn't take much to convince us that an animated picture represents an animated character. Just like in Heider & Simmel's research, early animation exploited this in clever ways, but it also learned that the illusion of agenthood can break. This lesson carries over directly to interactive characters.
|Behavioral believability is easy to achieve, but equally easy to destroy when the result violates our automatic expectations about agent behavior.|
The big question now is: What are those expectations? Drawing once more on psychological research, we can organize them along three levels: physical, biological and psychological. Let's go through them one by one.
Level 1: expectations about physical systems
(Image: Ralph doesn't know either why this NPC's path-finding doesn't work. From Wreck-It Ralph, 2012).
At the most basic level, we expect animate and inanimate objects to obey the laws of what psychologists call 'naïve physics'. This refers to our everyday (and partially innate) understanding of how the physical world works.
According to naïve physics, objects move but don't just appear out of thin air, they have weight and display inertia, they collide instead of interpenetrating with each other, and so on. NPCs are notorious for violating physical expectations - so much so that it's become an expected and meme-worthy 'feature' of today's games, commented on in blockbuster movies such as Wreck-It Ralph, or ridiculed in thousands of videos on YouTube.
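To make this a bit more concrete, here is a minimal Python sketch of how two of these expectations (no popping out of thin air, no interpenetration) could be checked automatically on a recorded NPC path. Everything in it is an assumption made for illustration: the function name `physics_violations`, the 2D positions, the circular collision shapes and the speed limit are hypothetical and not taken from any particular game or engine.

```python
import math

def physics_violations(track, max_speed, dt, radius, obstacles):
    """Flag samples of an NPC path that break two naive-physics expectations.

    track      -- list of (x, y) positions, sampled every dt seconds (assumed)
    max_speed  -- fastest plausible speed for this character (assumed)
    radius     -- radius of the NPC's circular collision shape (assumed)
    obstacles  -- list of (x, y, radius) circles the NPC must not overlap
    """
    issues = []
    for i, (x, y) in enumerate(track):
        # Expectation 1: characters move continuously, they don't teleport.
        if i > 0:
            px, py = track[i - 1]
            speed = math.hypot(x - px, y - py) / dt
            if speed > max_speed:
                issues.append((i, "teleport"))
        # Expectation 2: solid things collide, they don't interpenetrate.
        for ox, oy, orad in obstacles:
            if math.hypot(x - ox, y - oy) < radius + orad:
                issues.append((i, "interpenetration"))
                break
    return issues

# Hypothetical path: a sudden jump at sample 2, then walking into a pillar.
path = [(0, 0), (1, 0), (9, 0), (10, 0)]
report = physics_violations(path, max_speed=2.0, dt=1.0,
                            radius=0.5, obstacles=[(10, 0, 1.0)])
print(report)
```

A check like this doesn't make an NPC believable by itself, but it could catch the kind of glitches that end up in compilation videos before players do.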
Level 2: expectations about biological systems
(Image: Caressing a cat via touch input in Cat Hotel, 2020).
Our world is full of real agents, be they humans, pets, birds or insects. This constant exposure to agent behavior provides us with tons of 'naïve' data that we use automatically whenever we interact with other creatures. Some of these data relate to certain biological species (say, cats vs. people), some to specific individuals (grumpy aunt Maggie), while others concern entire groups of species (mammals vs. snakes).
Psychologists refer to the resulting expectations as 'naïve biology'. They are not equally distributed across the population - making a believable virtual pet for toddlers is easier than creating one for vets. Still, as the GIF above illustrates, many of today's games violate even the most universal expectations about biological systems.
Level 3: expectations about other minds
The final level concerns the expectations you and I use to make sense of what other creatures think, feel and want. Psychologists call such expectations 'theory of mind', referring to naïve 'theories' we all carry around in our heads to understand each other.
(Image: A passer-by not responding to the player in GTA V, 2013).
Building artificial characters that live up to all our expectations about other minds is still science fiction. Such characters would, by definition, be able to pass the Turing test, which many AI researchers believe will only be possible once we can build 'general' AI.
Until then, many improvements can be made so that NPCs violate fewer of these expectations.
Among the lowest-hanging fruit are expectations about social responsiveness, as illustrated by the NPCs of GTA V (or virtually any other AAA game with human NPCs). When we enter someone's field of vision or do something that concerns them (such as talking, poking, ...), we normally expect a reaction (a glance, a word, an interruption of what they are doing, ...). Not reacting is weird, but it's also a lot cheaper to implement. There are many more areas where it's possible to improve the social believability of NPCs without waiting for general AI, and we at Virtual Beings are working on them one by one.
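As a rough illustration of how little it can take to get a basic reaction, here is a minimal Python sketch of a responsiveness check. It is not how GTA V or our own technology works; the `Npc` class, the 120-degree field of view, the sight range and the reaction names are all assumptions chosen for the example.

```python
import math
from dataclasses import dataclass

@dataclass
class Npc:
    x: float
    y: float
    facing_deg: float           # direction the NPC looks, 0 = along +x
    fov_deg: float = 120.0      # assumed total field of view
    sight_range: float = 10.0   # assumed maximum viewing distance

def can_see(npc, px, py):
    """True if the point (px, py) lies inside the NPC's vision cone."""
    dx, dy = px - npc.x, py - npc.y
    dist = math.hypot(dx, dy)
    if dist > npc.sight_range:
        return False
    if dist == 0:
        return True
    angle_to_player = math.degrees(math.atan2(dy, dx))
    # Smallest signed angle between gaze and player, in [-180, 180).
    diff = (angle_to_player - npc.facing_deg + 180) % 360 - 180
    return abs(diff) <= npc.fov_deg / 2

def react(npc, px, py, player_action=None):
    """Pick a minimal social reaction (reaction names are hypothetical)."""
    if player_action in ("talk", "poke"):
        return "turn_and_respond"   # direct address always gets a response
    if can_see(npc, px, py):
        return "glance"             # acknowledge the player entering view
    return "continue_idle"

npc = Npc(x=0.0, y=0.0, facing_deg=0.0)
print(react(npc, 5.0, 0.0))             # player walks into view
print(react(npc, -5.0, 0.0))            # player is behind the NPC
print(react(npc, -5.0, 0.0, "poke"))    # player pokes the NPC from behind
```

The point of the sketch is the ordering: things the player does to the NPC come first, things the NPC merely sees come second, and the idle loop is only the fallback.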
Breaking and sustaining believability
(Image: The player interacting with helpful creatures in Journey, 2012).
If there's one thing I would like you to take away from this post, it's that NPC behavior becomes non-believable when it violates players' expectations, and these expectations can be managed. This is important enough to warrant a highlight.
|Non-believable behavior in today's NPCs is the combined result of a) deficient game AI and b) badly managed expectations.|
My favorite examples of believable NPCs are the flying paper-fish in Journey. They are interactive, beautiful, poetic, and they never do anything that ruins my suspension of disbelief. Sure, they are also very simple, and designed in such a way that the AI programmers never had to deal with complicated animation systems or path-finding. My point is, they work. And between flying paper-fish and realistic virtual humans, there's a lot of room - certainly enough to create games that delight today's players with NPCs, and not in spite of them.