My field of expertise is the perception of action: the way the human brain draws meaning from animated scenes. I am a psychologist interested in the spontaneous explanations our mind gives to movement, and in the way it uses movement to produce coherent representations of causes and intentions. Although I am not a specialist in AI for video games, I want to highlight an important complement to AI: properties of the design of non-player characters (NPCs) informed by findings from cognitive psychology. With this article, I want to provide a simple framework for those interested in the psychological foundations of the practice of video games. In doing so I will advocate a sort of minimalism in the conception of animated characters. Minimalism is key for two main reasons. First, what matters most when trying to depict convincing figures is the realism of the behavior rather than the realism of the appearance. Second, the brain proceeds by reducing the complexity of visual scenes to a few meaningful chunks of information. I suggest AI developers could benefit from knowing exactly which elements of behavioral meaning the brain extracts from minimal visual scenes. With this knowledge, it is possible to exploit psychological resources to create lively and interesting characters in video games without requiring computationally taxing algorithms. Such an undertaking has already started with the advent of non-antagonistic NPCs, which I will call companions. They will serve as an illustration of the way movement patterns are exploited to give the illusion of life and intelligence.
What’s a companion for?
What it takes to establish a solid bond
Game characters’ lifelike qualities are often discussed in terms of their resemblance to a human model, rather than the complexity of their behavior. As you may know, there is a puzzling effect in robotics, named the uncanny valley by the Japanese roboticist Masahiro Mori. When an android becomes too realistic, people tend to feel repelled, or even disgusted. The closer the resemblance to a human being, the more people tend to focus on the differences that separate the android from a real human. Mori also noticed the role of movement in the uncanny valley effect. An android may look weird to you because it is not able to reproduce the smoothness of natural movements. However realistic its skin may be, a robot still falls behind when trying to be graceful. Not being made of flesh and bone, a robot cannot take advantage of the stiffness and elasticity a biological muscular system is endowed with.
But you may be more inclined to excuse clunky behavior in a robot that does not pretend to mimic a human being. Think about a Roomba, for instance: its unpretentious appearance doesn’t preclude a certain evocativeness. OK, it’s a bland piece of plastic. OK, it may look awkward and limited. But one can still appreciate the way it moves, the autonomy it conveys, and perhaps even a sense of stubbornness as it relentlessly cleans your floor.
Going one step further in the direction of simplification of appearance, you could imagine a spot moving on a screen. What would it take for this spot to look alive? This is a question animators are eager to answer, and cognitive psychologists too. Both want to know the basic patterns of movement that provoke an impression of aliveness. The spontaneous bond that we seem to form with everything that looks remotely alive does not depend so much on a familiar appearance. Rather, how evocative an animated object looks depends on its behavioral realism. And how realistic a behavior is depends on the movements that the brain recognizes as meaningful patterns of activity.
We are going to examine the crucial components of this notion of behavioral realism. Following three stages which correspond to the questions one can ask when observing the behavior of an animated object, I will highlight the core properties of the process of attributing psychological traits to an object.
Is it alive?
The visual system is an exquisite mechanism. It draws various kinds of inferences about the world. It can tell you how far away an object is based on the difference between the images projected on your right and left eyes. It can complete the missing parts when an object is partially visible. It can tell you approximately how many peanuts there are in the jar, and how large the glass of beer is so you can hold it appropriately. It is also quite capable of dealing with the mechanical properties of a dynamic scene. If you see a collision between two objects, you can’t help but perceive a causal relationship between the object you identify as provoking the collision and the object that you interpret as receiving it. As pointed out by the Scottish philosopher David Hume, there is nothing in the sensory experience that can signal the causal relationship. Causation is a matter of the mind that lumps together pieces of experience into a unified percept.
We are also aware of certain constraints that shape the movement of objects. We know and expect that an object without support is going to fall down, that a moving object is eventually going to decelerate as a result of friction, or that a movement, if undisturbed, has to continue in a straight line. Apparent violations of these constraints (when, for instance, an object moves on its own or suddenly accelerates) constitute precious indications that some unobservable causes are influencing its movement. Numerous experiments in psychology show that when confronted with these types of motion cues, observers spontaneously adopt a different system of interpretation than the one they would use for inert objects. They tend to perceive the object as something that looks alive and that has a certain control over the way it moves.
It doesn’t take a familiar face or some fancy beast outfit to create life. Any simple geometrical figure will do, provided it moves the right way. More precisely, we can distinguish two components that characterize the process of attributing aliveness to a moving object:
Spontaneity: this is the propensity to initiate a movement and spontaneously change speed or direction (even without interaction with an external object). The more spontaneous, the more lively an object is likely to appear.
Directedness: this is the consistency of a trajectory over a certain period of time. The more often an object changes speed or direction, the less controlled its behavior will look.
Spontaneity determines the overall liveliness of the movement. Directedness determines how much an object appears to be in control of its movement. A movement without spontaneity looks rigid, mechanical. A movement without directedness looks random, unconstrained. You need to adjust both parameters to make something look alive. The movement you impart to the object has to give the impression of being self-generated, yet controlled enough that it doesn’t appear purely random. This is because life is something that imposes dynamic and transient structures on the world. In essence, spontaneity represents the spark of life, and directedness the ephemeral organization that constitutes a gesture.
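As a rough illustration, the two parameters can be sketched as knobs on a moving dot. The Python below is a minimal sketch under my own assumptions, not a model taken from the psychological literature: spontaneity is treated as the per-step probability of a self-initiated change of heading, and directedness as the absence of wobble between those changes. The names simulate_dot, straightness and max_jitter are hypothetical.

```python
import math
import random


def simulate_dot(spontaneity, directedness, steps=200, speed=1.0, seed=0):
    """Return a list of (x, y) positions for a self-propelled dot.

    spontaneity  -- probability per step of a self-initiated new heading (0..1)
    directedness -- 0..1; higher values mean less heading wobble between changes
    """
    rng = random.Random(seed)
    heading = rng.uniform(0.0, 2.0 * math.pi)
    x, y = 0.0, 0.0
    path = [(x, y)]
    max_jitter = math.pi / 4  # assumed upper bound on per-step wobble (radians)
    for _ in range(steps):
        if rng.random() < spontaneity:
            # Self-initiated change: a fresh heading with no external cause.
            heading = rng.uniform(0.0, 2.0 * math.pi)
        else:
            # Between changes, wobble around the current heading.
            heading += rng.gauss(0.0, (1.0 - directedness) * max_jitter)
        x += speed * math.cos(heading)
        y += speed * math.sin(heading)
        path.append((x, y))
    return path


def straightness(path):
    """Net displacement divided by distance travelled (1.0 = perfectly straight)."""
    travelled = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    net = math.dist(path[0], path[-1])
    return net / travelled if travelled else 0.0
```

With spontaneity at 0 and directedness at 1 the dot travels in a perfectly straight line (rigid, mechanical); with spontaneity at 1 it picks a new heading at every step and looks random. Interesting movers presumably live somewhere in between, for example simulate_dot(0.03, 0.95).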
We can illustrate these notions with some examples from Super Mario. For instance, Koopas move in a straightforward fashion until they hit an obstacle and reverse their direction. Although they initiate their movement spontaneously, further changes in their trajectory are induced by the environment. They score low on both spontaneity and directedness, because their movement is almost entirely constrained by the ground on which they move. As a result their behavior is rather dull and predictable. Paratroopas have a more interesting behavior: because they are not constrained by the ground, they do not move in such a straightforward way. Instead they oscillate slightly on an otherwise rectilinear trajectory. Bloopers are even more interesting. They score high on spontaneity as they often change direction, and they score relatively high on directedness as they maintain a rectilinear trajectory for a few seconds before changing direction.
Is it an intentional being?
Compared to the previous stage, this is where we start considering goal-oriented behaviors. Previously we only mentioned local cues from motion, such as spontaneous changes of direction. However, when designing NPCs with apparent goals or intentions, we need to consider their relationship with distant objects or events. Especially important is the reactivity of the character to events occurring in its surroundings. The perceived reactivity of a given NPC depends on the human ability to detect contingencies between objects and agents. A contingency is a non-accidental relation between one movement and another.
Contingency: this is the connection between events. We talk about mechanical contingencies when referring to physical connections involving physical forces; and of social contingencies when causation takes place at a distance, based on intentional properties rather than physical ones.
To illustrate this notion, let’s remember Ico’s Yorda, one of the most exquisite NPCs in the history of video games. This most graceful character successfully conveys a social presence by implementing certain contingencies. She turns her head in the direction of Ico, she turns her head in the direction of an immediate danger, or steps back slightly when Ico comes too close to her. Note, however, that she possesses some measure of autonomy. Her behavior is not perfectly contingent on external cues. You will see her walk away on some occasions. This is a very important property if you want to convey the illusion of freedom and self-sufficiency in a companion.
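The kind of contingencies described above, tempered by a measure of autonomy, can be sketched in a few lines of Python. This is a hypothetical illustration and not how Ico is actually implemented: the function companion_action, the action names, the priority order, the personal_space distance and the autonomy probability are all assumptions of mine.

```python
import math
import random


def companion_action(companion, player, danger=None, personal_space=1.5,
                     autonomy=0.1, rng=random):
    """Pick one action for a companion on this tick.

    Positions are (x, y) tuples; every threshold here is an illustrative guess.
    """
    # Occasionally ignore external cues, so that the behavior is not
    # perfectly contingent on what the player does.
    if rng.random() < autonomy:
        return "wander"
    # Social contingencies: reactions at a distance to events nearby.
    if danger is not None:
        return "look_at_danger"
    if math.dist(companion, player) < personal_space:
        return "step_back"
    return "look_at_player"
```

Setting autonomy to 0 yields a companion that reacts to every cue, which risks looking like a puppet; raising it too far yields one that seems to ignore the player altogether.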
An important aspect of contingency is temporal contiguity, i.e., the short delay between two events, for instance between something your avatar does and the reaction of an animated character. The human visual system is very sensitive to ruptures of the temporal continuity in an action sequence. For instance, imagine a sequence involving two objects. One moves in the direction of the other until it makes contact. The second object starts to move as soon as the first has touched it. In such conditions, you cannot help feeling that the first object has given a push to the second object (you can experience this causal impression here). But introduce a delay when the two objects collide and it will alter the causal impression. A temporal delay as short as 50 ms between the two movements suffices to break the causal illusion, and you would see instead two disconnected movements. In terms of intentional behavior, temporal contiguity is going to determine the perceived reactivity of an animated character, and possibly its degree of intelligence.
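In practical terms, this suggests keeping an eye on how many frames elapse between a player event and the NPC’s reaction. Here is a minimal Python sketch of that check, taking the 50 ms figure quoted above as a rough guide rather than a hard perceptual constant. The names CAUSAL_WINDOW_MS, delay_in_ms and looks_contingent are hypothetical.

```python
CAUSAL_WINDOW_MS = 50.0  # figure quoted in the text; a rough guide only


def delay_in_ms(frames, fps=60):
    """Convert a delay expressed in rendered frames into milliseconds."""
    return frames * 1000.0 / fps


def looks_contingent(event_time_ms, reaction_time_ms, window_ms=CAUSAL_WINDOW_MS):
    """True if the reaction follows the event closely enough to seem caused by it."""
    delay = reaction_time_ms - event_time_ms
    return 0.0 <= delay < window_ms
```

Under these assumptions, a reaction delayed by two frames at 60 frames per second (about 33 ms) stays within the window, while a delay of three frames (50 ms) already reaches the threshold.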
Contingency is but one aspect of goal-oriented behavior. Imagine now a dot moving on your screen. The dot moves in a specific direction, until an invisible force (a gust of wind, say) pushes it back in the opposite direction. Imagine that after a while, the dot resumes its movement in the direction of the same location, and is pushed back once again. Imagine that it does that several times. What would you conclude from this sequence of movements? Probably that the dot is stubborn, foolish, courageous, or blind to its inevitable fate. All those descriptions tap into a set of intuitive principles that psychologists sometimes call “naïve psychology”. Naïve psychology is about goals, motives, intentions; it is about the reasons for which someone does something. When observing the dot moving, you implicitly infer from the consistency of its trajectory that it ‘wants’ to reach a particular location. From the patterns of acceleration, deceleration and deviations from the main trajectory, you infer the physical constraints that it faces. And you can also infer certain personality traits that give another layer of explanation to the observed behavior.
Beyond the attribution of intentions and personality traits, you may also wonder why the dot wants to reach the corner of the screen. For humans, this is often a mandatory step in the way the attribution process naturally unfolds. It’s very hard to consider an intentional behavior without asking about the underlying motives of the action. This is exemplified by a pioneering experiment in the domain of psychological attribution (the attribution of psychological traits), designed by Heider & Simmel in 1944. In this experiment, people watch a short film involving simple geometric figures (you can watch it here). Instead of simply describing the kinematic characteristics of the scene, people elaborate sophisticated scenarios. They say for instance that the small triangle is trying to rescue his girlfriend (the disc) from a bully (the big triangle). Such a narration is a concentrate of intentions and motivations, packed elegantly into a coherent string of events. For a human being it is the natural output of the perception of a sequence of movements that appear goal-directed and contingent upon each other.
A narration is also a description of rational motives. The rationality principle states that an intentional agent adopts rational means to reach a goal. Seeing an object adjust its behavior to external constraints indicates that it possesses the ability to select appropriate strategies to achieve a goal. For example, if an agent’s goal is to reach another object and there is no obstacle in its way, it would seem more reasonable for the agent to move in a straight line rather than to jump around. In the observer’s opinion, an object’s behavior appears all the more intentional if it conforms to this rationality principle.
Rationality: this is the property of adopting the most efficient means to reach a goal, given the specific constraints of a situation. A principle of rational action entails that an intentional action functions to realize goal-states by the most efficient means available.
In terms of AI, a rational behavior is certainly more difficult to implement than a simple chasing behavior.
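One way to make the rationality principle concrete is to compare the route an agent actually took with the shortest route its environment allowed. The Python sketch below does this on a small grid, using breadth-first search as a stand-in for "the most efficient means available". It is an illustration under my own assumptions: the grid encoding ('#' for walls), the 4-connected moves and the names shortest_steps and rationality are all hypothetical.

```python
from collections import deque


def shortest_steps(grid, start, goal):
    """Breadth-first search: fewest 4-connected steps from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None  # the goal cannot be reached


def rationality(grid, observed_path):
    """Ratio of the optimal step count to the steps the agent actually took."""
    best = shortest_steps(grid, observed_path[0], observed_path[-1])
    taken = len(observed_path) - 1
    if best is None or taken == 0:
        return 0.0
    return best / taken
```

A score of 1.0 means the agent took one of the shortest routes available given the walls; lower scores correspond to detours that an observer would presumably read as less goal-directed. Note that a detour around a wall still scores 1.0, which matches the idea that efficiency is judged relative to the constraints of the situation.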
Does it have mind-reading abilities?
Another layer of psychological attribution can be added when considering social behavior. Thomas was alone, but another fellow comes along. How is Thomas going to react? That depends on Thomas’ social skills and motivations. If Thomas were, let’s say, a spider, he wouldn’t show much interest in a companion, and would mind his own business (or would try to eat it!). But if Thomas has human social skills, another guy in the room is an altogether different matter. Another guy in the room is someone you can interact with, chat with, make plans with, quarrel with, fight with. Another guy in the room may be just like you, someone able to recognize someone else as an intentional being.
When considering social interactions, what we consider in fact is the capacity for an agent to act not only relative to its own goals but also relative to other agents’ goals. This is what ‘social’ means: a coordination of goals in a negotiated space. Imagine for instance a square and a triangle. The triangle tries to climb a slope, or at least that’s what you infer from its movement and the context in which the movement takes place. The square puts itself immediately behind the triangle and moves in the same direction. This gives the triangle the push it needs to reach the plateau. You would probably conclude from this scene that the square wants to help the triangle. If we decompose this into a sequence of cognitive operations: 1) you infer that the triangle wants to reach the plateau, 2) you infer that the square wants the triangle to reach the plateau, 3) you interpret the alignment of both goals in terms of a prosocial behavior, and 4) you conclude that the square is friendly to the triangle. It turns out that even infants are sensitive to goal alignment. Several experiments have shown that infants as young as six months old are sensitive to the altruistic nature of an action. They tend to prefer the object that ‘helps’ the other object go up the slope, rather than the one that ‘hinders’ the other object from going up the slope by placing itself in front of it.
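The alignment of goals in the square and triangle scene can be approximated geometrically: does the push one agent gives point toward the other agent’s goal, or away from it? The Python sketch below uses the cosine between the two directions. It is a deliberately crude stand-in for the four inference steps listed above; the function classify_interaction and its threshold are hypothetical.

```python
import math


def classify_interaction(agent_pos, agent_goal, push, threshold=0.5):
    """Label a push applied to an agent as 'helps', 'hinders' or 'neutral'.

    agent_pos, agent_goal -- (x, y) positions of the agent and of its inferred goal
    push                  -- (dx, dy) displacement the other agent imparts
    threshold             -- assumed cosine cutoff separating the three labels
    """
    gx, gy = agent_goal[0] - agent_pos[0], agent_goal[1] - agent_pos[1]
    norm = math.hypot(gx, gy) * math.hypot(push[0], push[1])
    if norm == 0.0:
        return "neutral"  # no goal direction or no push to compare
    cosine = (gx * push[0] + gy * push[1]) / norm
    if cosine > threshold:
        return "helps"
    if cosine < -threshold:
        return "hinders"
    return "neutral"
```

A square pushing the triangle up the slope yields 'helps'; one pushing it back down yields 'hinders'. The sketch takes the triangle’s goal as given, whereas a human observer has to infer it first.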
Implementing genuine social behavior in an NPC is a challenging task. A game AI would need to evaluate and predict the player’s current goal to adjust the NPC’s behavior. It would need to integrate different metrics, such as the player’s pace, his reliance on certain tactics, or the options chosen in previous similar circumstances, to produce a model of the player’s personality and playing style. In short, the game AI would need to read the player’s mind, much as we humans attempt to do when interacting with each other. Of course there are some tricks developers can use to give the appearance of mind-reading. The game ‘knows’ the player’s goal when this is the only possible goal, for instance when the player needs to find an item to unlock a portal, and this is the only thing he can do. At that point, an NPC can adjust its behavior to the player’s goal. This is what Yorda does when directing Ico’s attention toward important clues for the resolution of a puzzle. The pointing gesture is an example of goal alignment. It indicates the ability to consider another agent’s perspective and influence his behavior in a manner that is relevant to the goal at stake.
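The "only possible goal" trick is simple enough to sketch. In the hypothetical Python below, the game tracks which objectives are still open; when exactly one remains, the companion can safely assume it is the player’s goal and point at the matching clue. The data layout and the names infer_player_goal and companion_hint are assumptions of mine, not a description of any shipped game.

```python
def infer_player_goal(objectives):
    """Return the player's goal only when a single objective is still open.

    objectives -- dict mapping objective name to True once it is completed
    """
    open_goals = [name for name, done in objectives.items() if not done]
    return open_goals[0] if len(open_goals) == 1 else None


def companion_hint(objectives, clue_for_goal):
    """Return a pointing action toward the relevant clue, or None if unsure."""
    goal = infer_player_goal(objectives)
    if goal is None:
        return None  # several goals are possible: do not pretend to read minds
    clue = clue_for_goal.get(goal)
    return f"point_at:{clue}" if clue else None
```

For instance, with objectives {"find_key": False, "light_torch": True} and clues {"find_key": "loose_brick"}, the companion would point at the loose brick. As soon as two objectives are open, the sketch stays silent rather than guess.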
Is that all?
I hope I have managed to convey the sense that there are certain rules to consider when designing an interesting companion. Those rules correspond to the process of psychological attribution. This is a process through which our brain recovers the structure of the social world: a world of goals, intentions, attempts, emotions, beliefs, or sentiments. Game designers intuitively tap into what the brain considers satisfying causal connections. But exposing these preferences may offer opportunities to refine NPCs’ behavior at little cost. I have tried to give a brief sketch of this program, with the hope that it could help spark life in a domain that deserves to be scouted out. Note that I have barely scratched the surface and much more needs to be said. What about emotions and personality? How are they conveyed through motion alone? What about empathy? How does an empathic relationship develop over time? How does it relate to the process of psychological attribution? Surely there is still some land to be tilled to plant the seed of life.