Voiceover or no Voiceover?

Full voice acting is a given for most modern “AAA” games, but text still has value, and indie and mobile games operate under different expectations. Consider the following factors:

Budget. Regardless of design considerations, small projects may not have the budget to hire voice actors. This is especially likely with a branching dialogue game, where the number of dialogue lines could amount to several times that of a game with a linear script. Adding multiple Player character voices (as with a game that allows a speaking Player character to be male or female) compounds the burden. BioWare’s Mass Effect 3 reportedly had 40,000 lines of dialogue recorded, well beyond what small studios can afford and wildly inappropriate for mobile devices.

Presentation. But budget isn’t the only consideration when it comes to voiceover. A game that uses voiceover must wrestle with an extra expectation of realism in its presentation. Mid-conversation pauses–such as those caused by a Player considering conversation options–become more distracting; anything that inhibits the flow of the conversation as a conversation–such as abrupt shifts between subjects like those in a question hub–becomes a potential problem.

Voiceover is also more time-consuming for the Player than pure text. A typical Player reads much faster than a voice actor speaks, and while few people would suggest that fast-forwarding through a subtitled movie is a superior experience to watching it as filmed, movies have the advantage of actors providing visually nuanced performances that games can’t yet match. Games with voiceover must provide a compelling reason for a Player to sit through that voiceover. Well-rendered cinematic sequences assist greatly in this regard. On the other hand, voiceover playing over a static screen showing only text and character portraits can try the patience of many gamers, even with high-quality performances. (And we’re just going to assume the performances are high quality. No voice acting is almost always better than bad voice acting.)

In contrast, it’s difficult to create compelling cinematic conversations with text-only dialogue. Fully rendered characters standing in a scene and emoting–but not speaking–come across as odd and stilted. If realistic cinematics are a project priority or a strength of your engine, voiceover almost has to go with them. The more stylized your cinematics, the less voiceover will feel necessary.

Narration. Mixing dialogue and narration (‘Ralph wipes away a tear. “I miss my dog.”’) permits a degree of intricate detail that the best cinematic sequences can’t match, but narration mixes poorly with voiceover.

Sorcery! is a mobile adaptation of a 1980s-era gamebook, and clearly not a candidate for full voiceover (at least not without considerable design changes).

Text Manipulation. Voiceover limits direct alteration of the text. For example, in a game that allows the Player to determine the protagonist’s gender, different NPC lines will need to be recorded to reflect the proper pronoun (“So, <he/she> comes crawling back, huh?”). In a text-only environment, “tokenizing” individual words so that they can be swapped out according to variables permits flexibility made cumbersome with voiceover.
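To make the tokenizing idea concrete, here is a minimal Python sketch. The token syntax (borrowed from the `<he/she>` example above), the `render_line` helper, and the `"male"`/`"female"` values are all assumptions made for illustration, not a description of any particular engine.

```python
import re

# Hypothetical token format: "<male_form/female_form>", e.g. "<he/she>".
TOKEN = re.compile(r"<([^/<>]+)/([^/<>]+)>")

def render_line(template: str, gender: str) -> str:
    """Replace each <male/female> token with the form matching `gender`."""
    index = 0 if gender == "male" else 1
    return TOKEN.sub(lambda match: match.group(index + 1), template)

line = "So, <he/she> comes crawling back, huh?"
print(render_line(line, "female"))  # So, she comes crawling back, huh?
```

In text, one template covers every variant. With voiceover, each distinct rendering of that template is a separate recording.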

Recording. Producing scripts for voiceover doesn’t just require that a writer work under restrictions and avoid sloppy structure. It also challenges the voice actors and directors. Do you have a way to present scripts in a vaguely comprehensible manner that can be used in a recording booth? Do you have directors familiar with the format and the nuances of the medium? Do you have a way to tell a voice actor that line 214 can’t be delivered too passionately, since line 215 is linked from a separate, more subdued section of the conversation?

Familiarity. And of course, none of this matters if you don’t have writers who understand how to write for voice actors. Voiced lines need to be written differently than non-voiced lines. A compelling monologue in text can become implausible and dull when spoken. Character “voice” needs to be conveyed in radically different ways. (This is a subject that goes beyond interactive dialogue, of course, and therefore not one we’ll spend much time on here. It’s still good to keep in mind.)

Silent Protagonists. During the early and mid-2000s, as text-heavy role-playing games were transitioning to full voiceover and cinematics, it was common for games to have full NPC voiceover but silent Player characters (as in BioWare’s Knights of the Old Republic). Aside from saving money and avoiding problems with certain interface issues (see “Selection Interface and Options,” below), this allowed Players to imagine whatever voice they wanted for their highly customized protagonists–not a meaningless benefit, given that poor or inappropriate Player character voiceover can be a major detriment to a Player’s experience.

Nowadays, the silent protagonist has largely faded away in games using branching conversations with voice acting. As cinematic sequences become increasingly detailed and realistic, having one character–and the protagonist, at that!–say nothing while everyone else delivers voiced lines feels increasingly strange. The protagonist becomes a passive bystander in his own story while the voiced characters become the stars.

For these reasons, I can’t really recommend “silent protagonist, voiced NPCs” in many situations. In a game with cinematic conversations, it de-emphasizes the Player character. In a game without cinematic conversations, any voice acting can feel like a drag. There are always edge cases, of course, but there’s a reason the silent protagonist has been left in the past.

Hub and Spoke or Waterfall?

There are two foundational structures for branching conversations: hub and spoke structures and waterfall structures. They can be combined or not, but most games are best served by consistency–a game that uses a pure waterfall structure uses it in every conversation, and a game with a blended mix always uses hubs in one specific way. The base conversation structure will impact both the dialogue itself and the conversation interface you build–consider early and consider carefully.

We’ll examine these structures a little more next time and study how each should look from a designer’s (rather than a Player’s) perspective. For the moment, however, use the following definitions.

Hub and Spoke. In a hub and spoke structure, the Player chooses from a set of conversation options at a central “hub.” When an option is chosen, it branches down a series of unique lines and additional unique response options, until that branch is exhausted and the Player returns to the hub and can select a different option and progress down another spoke.

The conversation ends up reading like this:

NPC: What did you want to know about?

Player Option 1: Tell me about hubs.

Player Option 2: Tell me about waterfalls.

Player Option 3: Tell me about ornithomimids.

[Player selects option 1.]

NPC: Hubs are useful, but they don’t exactly sound natural.

NPC: Have you ever actually played a game with hubs and said, “That was a realistic conversation”?

Player Option 1: Actually, I have.

Player Option 2: Realism isn’t everything.

Player Option 3: Point taken.

[Player selects option 2.]

NPC: I suppose.

NPC: That’s enough of that, though. Did you want to know anything else?

Player Option 1: Tell me about hubs.

Player Option 2: Tell me about waterfalls.

Player Option 3: Tell me about ornithomimids.

A hub structure may hide options from the hub after the Player has selected them in order to prevent the Player from repeating large segments of the conversation, or it may limit the total number of options the Player may choose (ending the conversation after the Player chooses three of four options in the hub) to force the Player to prioritize. Certain branches (spokes) may end the conversation or add new options to the hub. Hubs can even be nested within spokes of other hubs.
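A rough Python sketch of the hub rules just described, hiding used options and capping the total number of picks, might look like the following. The `Hub` class and its field names are hypothetical, the bracketed spoke contents are placeholders, and nested hubs and spokes that add new options are left out to keep it short.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Hub:
    """A question hub: a prompt plus named spokes the Player can pick from."""
    prompt: str
    spokes: Dict[str, List[str]]        # option text -> NPC lines for that spoke
    hide_used: bool = True              # hide a spoke once it has been chosen
    max_picks: Optional[int] = None     # optional cap that forces prioritizing
    used: List[str] = field(default_factory=list)

    def available(self) -> List[str]:
        """Options currently shown at the hub."""
        if self.max_picks is not None and len(self.used) >= self.max_picks:
            return []                   # cap reached: the conversation ends
        if self.hide_used:
            return [o for o in self.spokes if o not in self.used]
        return list(self.spokes)

    def choose(self, option: str) -> List[str]:
        """Play a spoke's lines; afterward the Player is back at the hub."""
        if option not in self.available():
            raise ValueError(f"option not available: {option!r}")
        self.used.append(option)
        return self.spokes[option]

hub = Hub(
    prompt="What did you want to know about?",
    spokes={
        "Tell me about hubs.": ["Hubs are useful, but they don't exactly sound natural."],
        "Tell me about waterfalls.": ["[waterfall spoke lines]"],
        "Tell me about ornithomimids.": ["[ornithomimid spoke lines]"],
    },
    max_picks=2,
)
hub.choose("Tell me about hubs.")
print(hub.available())  # the two options not yet chosen
```

With `max_picks=2`, a second choice empties the hub, which is one simple way to force the Player to prioritize.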

Planescape: Torment is a blended game that’s never shy about using large question hubs. (Screenshot generously provided by Jasyla.)

Hubs are useful in situations where the Player may not want to view or hear all the available dialogue, or where making the Player feel in control of an investigation is paramount. At their best, hubs make Players feel clever and in charge, picking and choosing how to approach their subjects in interrogation. At their worst, hubs turn NPCs into information vending machines where the Player feels obligated to punch every button and get each available tidbit of data. Such conversations lose any sort of naturalistic flow; pacing ceases to exist and voice acting becomes implausible as actors veer wildly from one subject to the next.

Waterfall. In a waterfall structure, the Player can never go backward. Response options are never repeated. Instead, the conversation moves forward until it reaches a predefined conclusion (or one of several predefined conclusions). Player choices may branch the conversation and branches may or may not recombine (we’ll get into more details of structure in the next post), but a choice not taken is lost forever.

Waterfall structures lend themselves to more realistic conversations, where Players and NPCs can both give and take control of a conversation’s direction, characters can segue naturally from one topic to the next, and so on. A well-written branching conversation using a waterfall structure can and should read like a well-written linear conversation–no matter what path a Player chooses.

Of course, a writer working with a waterfall structure must take care to ensure that all necessary information is conveyed in every possible path. It’s easy to accidentally “bury” key topics under Player choices that may not get picked. The “must see” elements of the conversation comprise the conversation’s “critical path,” another topic we’ll return to in future posts.
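One way to catch “buried” topics is to walk every path through the waterfall and check that each one conveys the must-see information. The sketch below is a minimal Python version of that check. The `Node` structure, the `topics` tags, and the sample lines are assumptions for illustration, not a format any particular tool uses.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Set

@dataclass
class Node:
    """One NPC beat in a waterfall; `choices` lead only forward."""
    text: str
    topics: Set[str] = field(default_factory=set)    # information this beat conveys
    choices: List["Node"] = field(default_factory=list)

def paths(node: Node) -> Iterator[List[Node]]:
    """Yield every path from this node to an ending."""
    if not node.choices:
        yield [node]
        return
    for child in node.choices:
        for rest in paths(child):
            yield [node] + rest

def buried_topics(root: Node, required: Set[str]) -> List[Set[str]]:
    """For each path, the required topics it never conveys (empty set = fine)."""
    return [required - set().union(*(n.topics for n in p)) for p in paths(root)]

ending_a = Node("Meet me at the docks at midnight.", {"meeting_place"})
ending_b = Node("Fine. Forget I asked.")
root = Node("I need your help.", choices=[ending_a, ending_b])
print(buried_topics(root, {"meeting_place"}))  # [set(), {'meeting_place'}]
```

Here the second path never mentions the meeting place, so a Player who takes it misses critical-path information.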

Blends. Many games use a combination of waterfalls and hubs and spokes. Often, games use waterfalls by default and switch to hubs and spokes when the main path of the conversation is complete. The Mass Effect series permits Players to sidetrack from waterfall conversations into an “Investigate” submenu, which offers a hub and spoke structure allowing the Player to learn about nonessential pieces of game backstory.

While blends can certainly prove useful, I find that any presence of a hub and spoke structure greatly diminishes the realism offered by a waterfall; the moment the conversation becomes “game-y,” my suspension of disbelief is strained. The more realistic or cinematic the conversation, the worse that strain becomes–when pure text without voiceover glosses over a subject change, it’s less bothersome than when voice acting and facial expressions fail to transition naturally from one topic to the next.

Which isn’t to say that blends shouldn’t ever be used, or shouldn’t be used in highly cinematic games–but they shouldn’t be assumed to offer only the best of both worlds, or have no negative consequences.

Selection Interface and Options

How your Player selects conversation options will influence emotion and immersion as much as your structure and your choice of whether to use voiceover. Below are a few items to grapple with when building your selection interface.

Presentation. On the most basic level, how do you present a response option to the Player? Do you show a menu of dialogue lines, allowing the Player to choose among them? Do you show symbols or short descriptors representing options instead? (A “heart” symbol for a compassionate dialogue choice, or a menu option reading “Angry Dismissal.”) Do you show a shortened or “paraphrased” version of a choice instead of the full, literal line? (The choice “I won’t help you” results in the Player character declaring, “You’ll have to take care of it on your own. I can’t participate in this.”) Do you use a combination (symbols and paraphrases) or “layers” of interface (symbols by default, full text on mouseover)?

For most games, you’ll want a system that lets the Player comprehend the choices available as swiftly as possible. A handful of symbols is faster for a Player to take in than a full menu of dialogue lines–but if the Player can’t intuitively grasp what each symbol means, she may find it time-consuming (and frustrating!) to guess at the writer’s intent.

Dragon Age 2 uses a combination of symbols and paraphrases in a relatively consistent order (“heroic” at the top, “aggressive” down below, special options on the left).

Consider how much control you want the Player to have, keeping in mind that surprise decreases as control increases. A literal system where full lines are available for a Player’s inspection means a Player isn’t likely to accidentally choose an unintended response (“I thought the heart represented ‘rage’!”), but it also means the Player never sees her character brought to unexpected life. There can be great joy in choosing an “angry” response, then sitting back and seeing your character speak or react angrily in a manner both surprising and appropriate. Consider also the consequences of responses–if every Player choice causes the story to branch wildly, then it’s far more important to give the Player full understanding of the choices she’s making. If Player choices are largely (but not always) “cosmetic,” holding no long-term consequences, Players are more likely to forgive an occasional confusing set of options. (Just make sure they’re crystal clear when it matters!)

Literal menu choices tend to play poorly with Player voiceover–there’s little fun to be had in choosing a line and then hearing a Player character speak that exact line. The Player has already read the line in the character’s “voice” while browsing the menu.

Consider how much consistency you want in your presentation. Do you want options with the same attitude to have the same placement every time options become available? (e.g., the “evil” option always appears at the bottom of the menu, the “heroic” option always appears at the top.) Consistency usually adds to swift comprehension, but there may occasionally be reasons to take other approaches.

Note that consistency needn’t be obvious to be effective. In Star Wars: The Old Republic we arranged response options differently according to the Player’s character class, with the response most suitable to the class’s archetype (a classic Star Wars character) appearing first. The smuggler class always had the most “Han Solo” option appear at the top of the list, letting Players who didn’t care to read through all their options still experience a classic interpretation of the character.
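As an illustration only (this is not how Star Wars: The Old Republic was actually implemented), class-based ordering could be sketched in Python like this. The tone labels, the `ARCHETYPE_TONE` table, the `"knight"` class, and the sample lines are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from character class to its "archetypal" tone.
ARCHETYPE_TONE: Dict[str, str] = {"smuggler": "roguish", "knight": "noble"}

def order_options(options: List[Tuple[str, str]], player_class: str) -> List[str]:
    """Return option texts with the class's archetypal tone listed first.

    `options` is a list of (tone, text) pairs. sorted() is stable, so the
    remaining options keep their authored order.
    """
    favored = ARCHETYPE_TONE.get(player_class)
    ranked = sorted(options, key=lambda option: option[0] != favored)
    return [text for _tone, text in ranked]

options = [("noble", "I'll help, no charge."),
           ("roguish", "What's in it for me?"),
           ("aggressive", "Out of my way.")]
print(order_options(options, "smuggler")[0])  # What's in it for me?
```

The Player never sees the rule, only that the first option tends to fit the character.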

Presentation is an easy area in which to “innovate” (“My game has thought balloons slowly rise out of the Player character’s head, representing how he instinctively wants to reply; the faster the Player clicks on a balloon, the more Confidence Points the character earns!”), but it’s also an easy area in which to go terribly wrong. Try new things–heaven knows we need to keep improving–but don’t be afraid to simplify and fall back on tried-and-true methods after playtesting. Be wary as well of possible intellectual property issues when imitating competitors’ presentation too closely (see, for example, the patent on the Mass Effect dialogue wheel).

Forced Player Lines. Does the Player need to make a choice every time the Player character speaks? Or are choices reserved for key moments, leaving the Player character to often speak “on his own,” whether according to a baseline personality or adjusted for the Player’s previous choices? How much control does the Player really have over the character, and what degree of control does the Player even want?

The more naturalistic you want your conversations to be, the more forced Player lines you’re likely to want to use–real conversations entail a lot of back-and-forth, and if you don’t make the Player character speak on his own, you’ll either end up with weirdly one-sided conversations or distract the Player with dozens of less-than-pivotal response options. And in most cases, Players won’t mind the Player character adding simple conversation “filler” (“Are you Bob?” “Keep going.” “What did she say?”) or building on the response already chosen (arguing back and forth with an NPC, say, if the Player’s initial chosen response was argumentative).

On the other hand, it’s easy to go overboard and leave Players feeling alienated from their Player characters, lacking true control or frustrated over lines of dialogue that don’t fit the character they’ve envisioned. This is a danger for every single forced line, and one you’ll need to watch for constantly.

For less naturalistic conversations–if you lack voiceover, if you’re using a hub-and-spoke structure, and so on–forced Player lines will probably feel like less of a necessity, but much of the same principle applies. The less the Player character speaks up in a conversation, the less the Player character feels like a genuine, active character with a life, feelings, and opinions of his own and the more he feels like a mere interface for the Player herself.

A hub-and-spoke, text-only, no-forced-lines dialogue system can still support a character-focused story, of course. But such a system prioritizes Player agency over strong Player character definition, and the rest of the game and narrative should be designed accordingly.

Number of Options. How many options do you want to give the Player at once? This of course relates to presentation (and, accordingly, user interface design) but should also be examined on its own. The more options you allow the Player, the more control and granularity you allow in regards to Player character approaches and personalities. But every option also slows the Player down and makes the crux of the decision less clear.

Your number of options should reflect the “default” array of personalities available to the Player. We’ll discuss this further in another entry, but picture the major approaches most Players would choose for your game. In a high fantasy game, you might have Player characters who can be broadly stereotyped as noble heroes, brutish thugs, clever and witty rogues, and thoughtful seekers. That being the case, you’ll need a standard number of options large enough to cover those bases–or at least large enough to let the Player find an option that doesn’t contradict her image of the character at any given time.

Most dialogue-heavy games tend to go with 3-4 options for most Player choices–two choices often feel too limiting and unnuanced, while more than four becomes slow and difficult for the Player to digest. Note that consistency is important here, too: if you train your Player to expect 4-5 responses most of the time, there’s a risk she’ll rebel when you only present two choices for an important moment. (“Why can’t I do X, Y, or Z?”) Conversely, if you only offer two choices at a time from the beginning, your Player may simply accept the lack of granular control as the style of the game.

Timers. Putting a time limit on Player responses–that is, forcing the Player to choose a response within X seconds or else leaving the Player character silent / defaulting to a particular Player response–is primarily useful for two reasons. First, it adds tension to a conversation and encourages the Player to react “in the moment” and “in-character” rather than looking for the “best” approach. Second, for conversations with voiceover, it limits the awkward stop-and-start effect of a Player taking time to ponder responses and ruining the natural flow of the dialogue.

The big downside, of course, is that Players may not have enough time to decide what they want to do and may become frustrated. If you do want to use a timer, it’s more important than ever that your response options are clear and distinct (both in nature and in presentation). Don’t let your Player waste precious seconds trying to figure out what her responses mean instead of which she wants to pick!
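The two timeout behaviors mentioned above (leaving the Player character silent, or defaulting to a particular response) can be sketched as a small Python function. The function name, its parameters, and the sample options are assumptions for illustration. A real game would measure elapsed time itself rather than receive it as an argument.

```python
from typing import Optional, Sequence

def resolve_timed_choice(
    options: Sequence[str],
    picked_index: Optional[int],
    seconds_elapsed: float,
    time_limit: float,
    default_index: Optional[int] = None,
) -> Optional[str]:
    """Return the response to play, or None if the Player character stays silent.

    A pick counts only if it arrived within the time limit; otherwise the
    designated default (if any) is used.
    """
    if picked_index is not None and seconds_elapsed <= time_limit:
        return options[picked_index]
    if default_index is not None:
        return options[default_index]
    return None

options = ["Calm down.", "Back off!", "I'm leaving."]
print(resolve_timed_choice(options, 1, 2.5, 5.0))                      # Back off!
print(resolve_timed_choice(options, None, 5.0, 5.0, default_index=0))  # Calm down.
print(resolve_timed_choice(options, None, 5.0, 5.0))                   # None
```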

Interrupts. Do response options only appear when a non-player character stops speaking, or is it possible for the Player to interrupt an NPC? If so, are such interrupts presented differently than “normal” response options? This is a relatively rare mechanic–the Mass Effect franchise tried it in a limited capacity, and it came across largely as an entertaining gimmick–but if implemented successfully, it could add a much stronger sense of spontaneity to a conversation with voiceover.

Mechanical Aspects. Last but certainly not least: What visible role, if any, do other game systems play in your response options and dialogue?

For example, does the Player character have RPG-style skills? Are certain conversation options “locked out” unless the Player meets certain skill prerequisites? If so, are these options grayed-out if the Player doesn’t meet the prerequisite or are they not shown at all? Are options that require certain attribute thresholds marked differently? (e.g., an option that requires a Science score of 100 has a little beaker icon next to it; Seduction skill-based choices are in a flowery font, as in Vampire: The Masquerade – Bloodlines.) Can attempts at attribute-based responses fail, and how is that indicated to the Player? Can the Player adjust or “buff” attributes during a conversation by switching, for instance, to other game screens and interfaces, or must the Player make any adjustments beforehand?
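The grayed-out versus hidden question comes down to a single switch in how the menu is built. Here is a minimal Python sketch. The `Option` class, the `requires` field, and the `show_locked` flag are hypothetical names, and the skill values are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Option:
    text: str
    requires: Dict[str, int] = field(default_factory=dict)  # skill name -> minimum score

def visible_options(
    options: List[Option], skills: Dict[str, int], show_locked: bool
) -> List[Tuple[str, bool]]:
    """Return (text, selectable) pairs for the menu.

    With show_locked=True, unmet options are listed but not selectable
    (grayed out); with show_locked=False they are omitted entirely.
    """
    menu = []
    for option in options:
        unlocked = all(skills.get(skill, 0) >= minimum
                       for skill, minimum in option.requires.items())
        if unlocked or show_locked:
            menu.append((option.text, unlocked))
    return menu

options = [Option("Who are you?"),
           Option("[Science] That compound is unstable.", {"science": 100})]
print(visible_options(options, {"science": 40}, show_locked=True))
print(visible_options(options, {"science": 40}, show_locked=False))
```

Showing locked options tells the Player what she is missing. Hiding them keeps the menu clean but conceals that the skill mattered at all.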

What about options that change Player character statistics? For example, does the Player have a “morality meter” that tracks good and evil choices, fourteen “vice” and “virtue” trackers, or “reputation” scores that judge the Player character’s relationship with individuals and factions? Are options marked to indicate whether they will modify statistics? Do they indicate the modification only after the fact? Or is all modification invisible?

Is the Player given mechanical feedback on an NPC’s attitude while conversing? (For example, Westwood’s Blade Runner adventure game allowed Players to administer a “Voight-Kampff” test, similar to a polygraph.) How is this presented?

Deus Ex: Human Revolution provides extra information on characters that Players can use to find optimal conversation paths. It’s a clever idea, though a bit awkward for my tastes.

Tying mechanics into a dialogue interface can radically alter the way Players approach a game, for better or worse. Done poorly, it can shift the Player’s attention from narrative to numbers. Done well, it can create a sense of consequence and embedded narrative in a way basic dialogue can’t. This isn’t the right place for an in-depth discussion of mechanics in dialogue–doing so would require a larger discussion on the crossover of game systems and narrative–but how you handle this aspect of conversations will greatly impact their integration into your game as a whole. Are your conversations a minigame with narrative consequences but no mechanical impact? Or are your conversations a vital system interlinked with others?

♦ ♦ ♦

Much of this series assumes a certain “baseline” interface for branching dialogue. Specifically, my default assumption is that your game’s system typically a) plays one or more NPC lines, b) reveals a set of potential choices to the Player, c) allows the Player to make a selection (possibly timed), and d) plays an NPC response. Any “extras” (paraphrases, interrupts, etc.) are embellishments.
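For readers who think in code, the baseline steps a) through d) amount to a simple loop. The Python sketch below is one possible shape for it. The `Conversation` format, the `run_conversation` function, and the `select` and `play` callbacks are assumptions made for illustration, not a standard.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical format: node id -> (NPC lines, [(choice text, next node id or None)])
Conversation = Dict[str, Tuple[List[str], List[Tuple[str, Optional[str]]]]]

def run_conversation(
    conversation: Conversation,
    start: str,
    select: Callable[[List[str]], int],
    play: Callable[[str], None] = print,
) -> None:
    """Run the baseline loop until a node has no choices or a choice ends the talk."""
    node_id: Optional[str] = start
    while node_id is not None:
        npc_lines, choices = conversation[node_id]
        for line in npc_lines:                     # a) play one or more NPC lines
            play(f"NPC: {line}")
        if not choices:
            break
        texts = [text for text, _next in choices]  # b) reveal the choices
        picked = select(texts)                     # c) the Player selects (a timer could wrap this)
        play(f"Player: {texts[picked]}")
        node_id = choices[picked][1]               # d) the next node's lines are the NPC response

conversation: Conversation = {
    "start": (["What did you want to know about?"],
              [("Tell me about hubs.", "hubs"), ("Never mind.", None)]),
    "hubs": (["Hubs are useful, but they don't exactly sound natural."], []),
}
run_conversation(conversation, "start", select=lambda texts: 0)
```

Paraphrases, interrupts, and similar embellishments would hook into steps b) and c) without changing the shape of the loop.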

But there’s no reason a selection interface has to match that baseline. A system could present nothing but timed interrupt options, letting a conversation flow if no interrupt is selected and never presenting more than one option at a time. A system could allow a Player to move forward and backward through the conversation freely, “fast-forwarding” and “rewinding” and letting the Player simply construct the conversation as she sees fit. Interplay’s Fallout combined a branching conversation system with the ability for the Player to manually enter keywords through a separate interface, eliciting NPC responses otherwise not available before returning to the main course of dialogue.

The more your system diverges from the baseline, the more you’ll need to re-examine traditional interactive conversation writing techniques (including those discussed later). In all likelihood, most of the conventional wisdom will still apply–but some of it won’t, and there will be new lessons to learn. Not a bad thing, really.

Coming Up Next

We forget art and game design and talk practicalities (in a much shorter post!). How do you build a conversation tree without creating an incomprehensible, unmaintainable, uneditable mess? What are the proper tools and best practices? It doesn’t matter how brilliant your writing is if you can’t fix it when things go wrong (or if you need to guide your colleagues through your labyrinthine branches whenever cinematic designers or scripters are deployed).

About the Author(s)

Alexander Freed

Blogger
