[In this Game Developer Magazine reprint, designer Ara Shirinian discusses affordances -- how interfaces suggest what they let people accomplish versus what they actually let people accomplish -- and how that affects game players.]
One peculiarity of video games is that we often think of them in terms of "games we are able to play" and "games we are not able to play." Much like a sport, and unlike most other forms of consumer entertainment, video games typically demand some standard of performance ability before the player can even begin to enjoy their various workings. From the very moment we start playing a game, we develop an impression of how easy or hard a time it's going to give us.
Some games are quite easy to understand. Regardless of whether there is an explicit tutorial, players instantly intuit what to do, what the basic rules are, what is good, what is bad, and how to go about doing the various things that can be accomplished.
They feel like they're capable of playing the game from the first moment. They don't really expend a lot of effort figuring out how to operate the basic mechanics of the game, they "just do it," and find themselves immediately engaged. Any problems and difficulties they do experience are intrinsic to the game.
Conversely, other games seem to be bewildering and obtuse. When you play those games, your capabilities are unclear, you find yourself punished for reasons you don't understand, and you take guesses (often to find out you are wrong) about what various cues and symbols mean.
You spend far more time thinking about "how to work" the game. If you're an experienced gamer, you'll often ask yourself questions like "What the hell is that?" or, "Why the hell is this here?" -- substituting your favorite expletive for "hell."
Of course, I have just illustrated two extreme situations, and most gameplay experiences have some mix of intuitive-feeling things and counterintuitive-feeling things. In this article, I'll explore some reasons why certain things feel natural or intuitive, why certain other things don't, how two people can have very different opinions of what intuitive means, and what the implications are for video game design.
Intuition Taste Test #1
For the sake of communicating in a more universally familiar way, but also to illustrate a more complete picture of how these dynamics work, I'm going to draw on some non-video game examples throughout this piece. The psychological ideas we explore here are indeed the same ones that govern general human interaction with interfaces. Video games just happen to be an application of these ideas, albeit one of primary importance to us.
Suppose you were presented with two different GUI arrangements. In Figure 1a, we have Mystery GUI A where the Cancel button is always displayed to the left of the Save button. In Figure 1b, we have Mystery GUI B where the order of the buttons is exactly reversed: The Cancel button is always displayed to the right of the Save button.
Is one of these arrangements more intuitive than the other? If so, why? You can make a reasonably compelling argument either way. Incidentally, Mystery GUI A is the Mac GUI standard, and Mystery GUI B is the Windows GUI standard. For the moment, all we will say about this example is that, interestingly, the two most popular computer GUIs use standards that are complete opposites of each other.
Intuition Taste Test #2
Another interesting example, and one that more readily elicits an opinion as far as intuitiveness goes, is a comparison of the QWERTY and Dvorak keyboard layouts. QWERTY, the original and oldest keyboard layout, has been in use for about 140 years. The Dvorak layout, while regarded by many as some kind of unfamiliar, novel alternative layout for hardcore typists, is still a relatively old invention, at 80 years.
When I present people with these two layouts and ask them which one is more intuitive, invariably they give their nod to QWERTY. When I ask why they think QWERTY is more intuitive than Dvorak, they typically say that it's because they are used to it.
Ironically, Dvorak's layout results in your fingers traveling significantly shorter distances overall, because that's what it was engineered to do (see References).
This ostensibly results in a higher performance ceiling than with QWERTY, and Dvorak's claim to fame has been higher typing speeds, although there have been numerous challenges to its superiority to QWERTY in various contexts.
In any case, to a naive user (someone who doesn't have any experience with any keyboard layout), there isn't much about either alternative that makes it inherently more intuitive than the other.
We like QWERTY because we've been using QWERTY all our lives. We have developed expectations from our past experiences with the keyboard, and it's pleasant to know where every key is, particularly when you don't have to look. In this way, we can say that our sense of what is intuitive is at least partially a function of our personal history or training.
This personal history is very powerful; it's so powerful that in any novel situation you encounter, your mind will inevitably produce explicit and implicit expectations about it. I like to call this "cognitive baggage" for its colorful effect, but it's directly related to your personal history, past experience, and training.
In other words, your idea of what makes sense, what you understand, or what you think you can or can't do is a function of your own personal cognitive baggage. What's more, you bring this baggage with you to every video game you ever play. When a game cooperates with your cognitive baggage nicely, you may say that the game is eminently playable or easy to learn. When a game is inconsiderate of your cognitive baggage, you might say that the game is frustrating or confusing.
Video Game Cognitive Baggage
In 2004, I was a designer on a game called The Red Star, which was eventually released in 2007 on PlayStation 2. Often I would present the game to a new player who had never seen the game before, but who had past experience with other games, and observe reactions without actually engaging with them while they played. Of course, it's best to do this while you are making the game instead of after the fact.
The Red Star is essentially a character-based melee and projectile action game comprising a linear series of levels. Some levels use a side-view camera, while others use a top-view camera. I would not always present the same starting level to a naive player, but it just so happened that the first level in the game begins with a side-view camera, while the second level in the game introduces the top-view camera.
One of my most interesting discoveries was the clear pattern that emerged from players' desire and expectation to be able to make their character jump; it depended on which level they played first! Why is the expectation of jumping so critical here? Well, in The Red Star, there is no conventional jumping action available to the player. A few special melee attacks propel the character into the air as a side effect, but for all practical purposes, this is a game where jumping is not part of the mechanics.
The Red Star: side-view camera section of gameplay.
If the subject played the side-view level first (see Figure 2a), it was common to hear reactions like "Where is the jump button?" or "Why can't I jump?" or perhaps something even more colorful.
However, if the subject played the top-view level first (see Figure 2b), I never heard them make any comments about jumping one way or the other, even if they played the side-scrolling level after that.
The Red Star: top-view camera section of gameplay.
So what's happening here? Many gamers, particularly those who grew up with the myriad 8- and 16-bit side-scrollers, carry baggage that says "Character games that are side-scrolling have a jump button." This baggage accumulates over time, after seeing many different examples with the same standard features. Conversely, apart from a few outliers, few games that feature a top-view also allow characters to jump, especially from the 8- and 16-bit eras.
Furthermore, it can be said that the side-view camera affords jumping quite well, while the top-view camera doesn't. Just by looking at the side-view screenshot, you can see the vertical space on the screen and it's very easy to imagine your character jumping. After all, most games that look like this let you jump, so this game looks like one where you can jump.
When you look at the top-view screenshot for the first time, the vertical dimension is squashed on top of itself, so you cannot really see any physical space for jumping, especially if the camera is oriented straight downward. It looks like a game where you cannot jump.
What do I mean when I say that something affords something else? This particular usage was popularized by Don Norman in 1990 (see References):
"...the term affordance refers to the perceived and actual properties of the thing, primarily those fundamental properties that determine just how the thing could possibly be used."
In this context, "the thing" is really any object that has a use, and its affordances are ways it can be used. Don Norman was primarily concerned with perceived affordances, and has since admonished the professional community against the increasing misuse of the word over the years (see References).
To be clear, one of the original uses of the word in this context was by J.J. Gibson in 1977, but Gibson did not care about any perceived aspect of affordance.
For Gibson, if an object can be used in a certain way, then it has that affordance, period, regardless of whether you think it does. For our purposes, we must distinguish between perceived affordance and actual affordance.
Suppose we have a common folding chair. Most everybody is able to perceive its most common affordances: You can sit on it, you can put things on it, you can use it as a step, and you can fold it for easy storage.
Perceived affordances can also be a function of culture or cognitive baggage. If you are a professional wrestler, for example, when presented with the folding chair you are more apt to perceive a different set of affordances, like throwing it, using it as a platform for body slamming, and folding it up, but this time for ease of swinging at your opponents.
At the risk of irritating Don Norman further, it can be useful to think of affordance as a point along some continuum, even though it was never intended to be that way. For example, a bicycle affords rolling. If you build the same bicycle with square wheels, you could then say that the bicycle does not afford rolling at all.
However, if you build the bicycle a third time with octagonal wheels, it still affords rolling, but perhaps not as well as the original example. In a game like Gran Turismo 5, every vehicle in the game affords driving in the strict, actual sense, but there is a huge difference in the drivability of a smooth front-wheel drive Honda Civic Type R versus the powerful and touchy '66 Shelby Cobra 427, or even the insane 1,500 horsepower Red Bull X2010.
Perceived vs. Actual Affordance
In our chair example, we enumerated several perceived affordances. Coincidentally, these perceived affordances were also the actual affordances of the chair in question. But what happens when a perceived affordance is not an actual affordance?
Sink drawers with more perceived affordances than actual affordances.
Let's examine the fine sink installation in Figure 3a. Of interest are the three drawers just below the countertop. We only have one legitimate perceived affordance, that of pulling, because each drawer sports a handle in the center, and we all know that handles are for pulling.
When we consult reality for confirmation, we find that if we pull on either of the two side drawers, the drawers do in fact afford being pulled out. However, when we attempt to pull out the center drawer, it doesn't go anywhere.
It's actually not a drawer at all, because there is a sink basin behind it and no room for a drawer. It just looks like a drawer. Its perceived affordance is pulling, but in reality it has no such affordance. There are bound to be problems when perceived affordances are different from the actual ones.
Sink drawers with the same perceived affordances as actual affordances.
In the case of sink drawer design, the problem is solved by creating a surface that does not allow any perceived affordance of pulling. In the case of Figure 3b, no one could confuse a neat relief design on a cover plate for something that could be pulled.
Returning to the camera view examples from The Red Star, we find that the sink drawer problem is analogous in terms of the number and type of affordances, if we trade pulling for jumping. Like the fake sink drawer, in the side-view level, a perceived affordance of jumping is frequently reported, but there is no actual such affordance. Similarly, like the relief design cover plate on the second sink, in the top-view level, there is no perceived affordance of jumping, and there is no actual affordance either.
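One way to make this comparison concrete is to treat perceived and actual affordances as two sets and look at what falls outside their overlap. The following is only a minimal sketch: the function name `affordance_gaps` and the action lists for each camera view are illustrative assumptions, not data from The Red Star's playtests.

```python
def affordance_gaps(perceived, actual):
    """Compare what players think they can do with what the game allows.

    Returns a pair of sets:
      false_affordances  - perceived but not actually available
      hidden_affordances - actually available but not perceived
    """
    perceived, actual = set(perceived), set(actual)
    false_affordances = perceived - actual
    hidden_affordances = actual - perceived
    return false_affordances, hidden_affordances


# Hypothetical action lists for the two camera views (assumed for illustration).
side_view = affordance_gaps(
    perceived={"move", "attack", "jump"},  # side-scroller baggage adds "jump"
    actual={"move", "attack"},
)
top_view = affordance_gaps(
    perceived={"move", "attack"},
    actual={"move", "attack"},
)
```

In this encoding, the side-view level behaves like the fake sink drawer (a non-empty set of false affordances), while the top-view level behaves like the cover plate (both sets empty).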
Given the opportunity to correct mistakes, it would have been a much clearer experience for the player if we had started the game in a top-view level, and only later transitioned into a side-view camera style. Once the player has an entire level of experience with how the game works, they no longer carry the expectation or baggage that they should be able to jump in The Red Star's side-scrolling levels.
After a level of play, the player has already learned a substantial amount about the game's actual affordances, and there is not much room left to falsely perceive other affordances from similarly presented games. The big question for us was this: How many players did we lose during the first level, who did not have the patience or resilience to come to grips with the fact that it looked like a game with jumping, yet was not?
Cognitive Baggage vs. Natural Mappings
Previous experience, training, or baggage is not the only force that gives us expectations about how things work. There is also the idea of a natural mapping, which Norman has expounded upon at length.
First, let's look at his definition of regular mapping:
"Mapping is a technical term meaning the relationship between two things, in this case between the controls and their movements and the results in the world."
In Super Mario Bros., the right arrow on the D-pad moves Mario to the right. We say that the right button is mapped to Mario moving to the right. Or, on the QWERTY keyboard, we say that the key under your left pinky when your fingers are resting on the home keys is mapped to the letter A. So, what makes a mapping natural?
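Before answering, it may help to see a mapping in its barest form: a lookup table from a control to a result in the game world. This is a rough sketch only. The table `D_PAD_MAPPING`, the function `apply_input`, and the coordinate convention are hypothetical and are not taken from any actual game's code.

```python
# Hypothetical control table: each physical input maps to one in-game result,
# expressed here as a (dx, dy) step on a 2D grid where +x points right.
D_PAD_MAPPING = {
    "right": (1, 0),
    "left": (-1, 0),
}


def apply_input(position, button, mapping=D_PAD_MAPPING):
    """Return the new (x, y) position after a button press.

    Buttons that are not in the mapping leave the position unchanged.
    """
    dx, dy = mapping.get(button, (0, 0))
    return (position[0] + dx, position[1] + dy)
```

Nothing in the table itself makes the mapping natural. Swapping the two entries would still be a valid mapping, just one that fights the player's expectation that pressing right moves the character right.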
Norman Identified Four Possible Characteristics of Natural Mappings
1. Natural Maps May Use Physical or Spatial Analogies
Stove burner interface with a natural mapping.
Stove controls are a great example of spatial analogy. Looking at the stove in Figure 4a, how do you know which knobs control which burner? In this case the mapping is natural because the spatial arrangement of the knobs corresponds exactly with the spatial arrangement of the burners. The left knob controls the left burner; the middle knob controls the middle burner, and so on.
Stove burner interface without a natural mapping.
Figure 4b illustrates an example that does not clearly use a spatial analogy. Which knob controls the upper-left burner? It could be the first knob, or the second. Or is it one of the last two? The only way to know is through trial and error.
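The difference between the two stoves can be sketched as a simple check: reading the knobs left to right, do the burners they control also run strictly left to right? The layouts below are assumptions made for illustration (a single row of burners for the first stove, a two-by-two grid for the second), and the names `has_clear_spatial_analogy`, `knob_x`, and `burner_x` are hypothetical.

```python
def has_clear_spatial_analogy(knob_to_burner, knob_x, burner_x):
    """True if left-to-right knob order implies a strict left-to-right burner order.

    knob_to_burner: dict mapping knob name -> burner name
    knob_x, burner_x: dicts mapping each name -> horizontal position
    """
    knobs_left_to_right = sorted(knob_to_burner, key=knob_x.get)
    xs = [burner_x[knob_to_burner[k]] for k in knobs_left_to_right]
    # A tie in horizontal position means the knob row cannot tell two burners apart.
    return all(a < b for a, b in zip(xs, xs[1:]))


# Assumed layout: three burners in a row, three knobs in a row.
row_stove = has_clear_spatial_analogy(
    {"k1": "left", "k2": "middle", "k3": "right"},
    knob_x={"k1": 0, "k2": 1, "k3": 2},
    burner_x={"left": 0, "middle": 1, "right": 2},
)

# Assumed layout: four burners in a 2x2 grid, four knobs in a row.
grid_stove = has_clear_spatial_analogy(
    {"k1": "front_left", "k2": "back_left", "k3": "back_right", "k4": "front_right"},
    knob_x={"k1": 0, "k2": 1, "k3": 2, "k4": 3},
    burner_x={"front_left": 0, "back_left": 0, "back_right": 1, "front_right": 1},
)
```

The grid case fails the check because two burners share each horizontal position, which is the same ambiguity a cook faces when guessing which of two knobs controls the upper-left burner.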
2. Natural Maps May Utilize Cultural Standards
Looking at the volume knob in Figure 5, if I asked you to turn the volume up, which way would you turn it? Most everyone would say clockwise, and of course they would be right. Even in the absence of labels, it's a generally accepted cultural standard that turning a knob clockwise means "more" and turning one counterclockwise means "less." (See References.)
Among PC gamers, it's a cultural standard to map the WASD keys to character movement, and furthermore that particular mapping also takes partial advantage of spatial analogy (the leftmost key, A, moves you left). However, those not accustomed to that culture, even if they are avid gamers, may have great trouble with WASD.
A volume knob. Which way do you turn it to turn the volume up, and how do you know?
Beyond the spatial analogy, such an input method still requires the player to build muscle memory through sufficient experience before they can perform competently. Video games in general depend more on this aspect than everyday interfaces because of the time constraint: We are not often required to turn a knob in the exact right direction and amount within a fraction of a second to adjust volume, but in Arkanoid that's exactly what the game demands from you.
3. Natural Maps May Utilize "Biological" Aspects
This is perhaps the most obscure characteristic of the four. Biological aspects refer to how some mapping analogies have a biological basis -- for example, "louder" naturally maps to "more." But this doesn't work with frequency of sound; we do not generally associate higher frequency sounds with "more."
However, there has been some esoteric research done on this category, such as Christopher Wickens' finding (see References) that aircraft pilots appeared to react faster and with less error when stimulus and response processing occurred in wholly separate hemispheres of the brain (e.g., your visual target appears in your right eye and you use your right hand to control the response).
4. Natural Maps May Use Principles Of Perception
This refers to concepts such as physical arrangements of controls where proximity or visual grouping or priority corresponds to function. For example, most car air conditioning systems have temperature controls, fan controls, and vent controls. A natural mapping of such controls in this case would imply that each of those three functional groupings would also be reflected in the physical presentation of the controls themselves; all vent controls should be together, as should all fan and temperature controls.
Now that we understand a few things about natural mappings, let's see how each of the features in Norman's list maps to our QWERTY keyboard example.
1. There isn't any physical or spatial analogy to take advantage of, but numbers and letters do have a logical sequence: alphanumeric order.
The order of the numbers on the top row already uses this element, and notice that most alternative keyboard layouts use the same order of numbers on the top row. The letters are a different story.
Following this idea blindly, we could make a keyboard with just one row of letters, in alphabetic order, but clearly this would be a bad idea. The sheer physical awkwardness of such a result completely outweighs any advantage conferred by sticking to a logical map.
The next logical alternative is an alphabetically ordered keyboard, but spread across three rows in the traditional aspect ratio. This too we find to be suboptimal, although for the absolutely naive user, an alphabetic order may confer a very slight performance advantage over an otherwise random layout.
2. The ubiquity of the QWERTY layout is a huge cultural advantage. We've been trained to use QWERTY almost exclusively, and the layout is present almost everywhere a person uses a device to input text, except perhaps for the occasional portable or entertainment device.
3. There really isn't any biological aspect to take advantage of in this case.
4. There are limited ways to apply functional or perceptual groupings here, since most keys are generically equal in purpose and weight. Interestingly, Dvorak does group all the vowels on the left side of the middle row, although it's for two separate reasons: first, the high frequency of vowel usage coupled with the fingers' natural position on the home row is part of why Dvorak can boast such an efficient usage of finger travel. With Dvorak, roughly 70 percent of all letters are typed on the home row.
Second, since vowels are frequently interlaced between consonants and are rarely consecutive, restricting vowel access to just one hand increases the balance of words typed with one hand versus the other. Generally, it has been regarded as optimal for performance when letters are typed with alternating hands versus on the same hand. As already mentioned, numbers on keyboards are always grouped together, and that is one sensible perceptual grouping. To a lesser extent, both Dvorak and QWERTY do the same with punctuation characters.
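The home-row claim is easy to check informally against any text you like. The sketch below counts what share of the letters in a string sit on each layout's home row. The home-row letter sets are the standard ones for each layout, but the function name `home_row_share` is hypothetical, and the result will vary with the sample text, so treat it as a rough probe and not a reproduction of the 70 percent figure.

```python
QWERTY_HOME_ROW = set("asdfghjkl")
DVORAK_HOME_ROW = set("aoeuidhtns")


def home_row_share(text, home_row):
    """Return the fraction of letters in `text` that sit on `home_row`.

    Non-letter characters (spaces, digits, punctuation) are ignored.
    Returns 0.0 for text that contains no letters.
    """
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in home_row for ch in letters) / len(letters)
```

For example, every letter of the word "the" is on the Dvorak home row, while only the "h" is on the QWERTY home row. Running the function over a longer passage with both sets gives a quick sense of how differently the two layouts distribute typing effort.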
Playing to Expectation
Finally, we've arrived at a point where we can make some interesting deductions about natural mappings and cognitive baggage. As we've seen with the keyboard example, not every technique to produce a natural mapping is equally available to exploit. In fact, when you are designing a control interface for a video game, you really have no guarantees that you can devise a natural mapping at all; it just depends on the idiosyncrasies of the design.
Because the player's interface is an inseparable component of any game design, you may sometimes have an incredible game design that is severely compromised by the reality that there is just no great way to interface with that design. In these cases it's of paramount importance that you resist the temptation to brush off the consequence of the interface component. Your players will always experience the interface before they experience your vision of the design, and depending on the quality of the interface, they may never even have an opportunity to experience your design in the way you originally intended.
If it has not yet become apparent, a cultural standard (natural mapping characteristic #2) is really nothing more than everybody having the same cognitive baggage or training. Some cultural standards are much broader than others -- the entire technologically-minded world is one culture that understands the QWERTY layout. PC gamers, who are able to identify the significance of WASD, comprise a smaller culture.
Old-school shoot-'em-up game experts who all know that their avatar's hit detection box is usually very small and in the center of their character are yet an even smaller culture (although this does not diminish the effectiveness of the "small hit box avatar" technique to improve playability). What's more, cultures can change over time on every scale -- even on the individual level.
In fact, cultural standards or cognitive baggage can carry so much weight, and be so deeply ingrained in a player's way of doing things, that they can often overpower all other natural mapping avenues. QWERTY's popularity and effectiveness for ease of use is primarily thanks to its cultural advantage. Natural mapping characteristics numbers 1 and 3 don't really come into play, and number 4 is better used by the Dvorak keyboard. And yet, for all Dvorak's advantages, it cannot even come close in ease of use when compared to QWERTY.
Part of the reason "observation only" playtests are so effective for game development is this cultural component and its strength. Playtests, when conducted without any outside influence on the subject, reveal the player's cultural knowledge and expectations, or their individual cognitive baggage.
When we receive unbiased reports from players about their gameplay experience, we must understand their baggage and culture to really understand the meaning and significance of that feedback. All these things allow us to tweak our game to be more congruent with players' cultures, and that in turn allows them to more readily enjoy our games in the way we intend.
On the other hand, this does not necessarily mean the best way forward is to always maximally align with the largest cultural standards. You may keep more players in the short term, but the novelty of the experience may suffer, or you may be sacrificing another meaningful component of gameplay dynamics. Games like Gunvalkyrie with its odd dual-analog controls, and Steel Battalion with its massive, button-laden controller are arguably poster children for obtuse, unexpected control schemes. Yet, for a player with patience to overcome the interface wall, each offers a uniquely rewarding method of play.
With each game we develop, we inexorably reinforce, nudge, or fight cultural standards in various ways. We are all steering the course of game development together in some direction, the consequences and tradeoffs of which may not be clear for decades. We want everybody to feel that our games are accessible to them, but in some cases in order to show the player new, meaningful, and compelling gameplay, you might have to gently break down a certain part of their culture first.
References
An analysis of QWERTY vs. Dvorak keyboard layouts: http://patorjk.com/keyboard-layout-analyzer/
Liebowitz, S.J. and Margolis, Stephen E. (1990) The Fable of the Keys. Journal of Law & Economics, vol. XXXIII. (http://wwwpub.utdallas.edu/~liebowit/keys1.html)
Norman, Donald A. (1990) The Design of Everyday Things.
Norman’s further clarification of affordance: www.jnd.org/dn.mss/affordances_and_design.html
A discussion of cultural norms: Bergum, B.O. and Bergum, J.E. (1991) Population stereotypes: An attempt to measure and define. Proceedings of the Human Factors Society 25th Annual Meeting, pp. 662-665.
Wickens, Christopher D., Vidulich, Michael and Sandry-Garza, Diane (1984) Principles of S-C-R compatibility with spatial and verbal tasks: The role of display-control location and voice-interactive display-control interfacing. Human Factors, 26(5), 533-543.
Norman on typewriting skills: Norman, Donald A. and Rumelhart, David E. (1983) Studies of Typing from the LNR Research Group. In Cognitive Aspects of Skilled Typewriting, edited by William E. Cooper.