Conversational systems in parser games: essay and gamedev

In this essay, I want to go over a variety of conversational systems that parser games can use. Emily Short has written many, many articles on this subject over time (see this page for a selection), so longtime followers of interactive fiction may find this redundant.

This article is part of my series advertising my upcoming Spring Thing game/Introcomp winner Sherlock Indomitable.

I’d like to discuss various ways to handle conversation in parser games. Conversation has typically been seen as a ‘problem’ in parser games, something to be solved. In choice games, there is no such distinction: conversation is handled the same way as the rest of gameplay and is, if anything, sometimes easier to implement than things like movement and puzzles.

Here are some of the systems used:

Traditional ask/tell

In many ways this is still the gold standard of IF. Parser games give off the illusion that you can do anything and get a response. While limited parser games get a lot of mileage out of showing how restrictive it really is, there’s still something exciting about a game that might understand anything reasonable you type.

This is hard, though. Three of the best-known purely conversational ask/tell games are Galatea, Alabaster, and Mirror and Queen. Galatea has hundreds of topics and around 60 endings, and it is responsive to the order that you say things in, indicating some sort of memory, in addition to a bunch of other variables like the direction Galatea faces, her mood, etc. Alabaster was co-written by many, many people. Mirror and Queen understands something like 1000 nouns and organizes conversations on distinct tracks.

What these all have in common is that they are extremely labor intensive. To make a great conversational game, you have to constrain your setting, and then implement every noun that appears in that setting. There’s an art of name-dropping new people and objects into your responses to unlock new topics.

Benefits: Immersion, fits traditional parser gameplay style, wonder of discovery.

Drawbacks: Time-consuming to implement, frustration from not guessing the noun, and typing ‘ask about NOUN’ doesn’t say exactly what you’re about to say. Does ‘ask about dog’ mean ‘accuse Tom of killing the dog’ or does it mean ‘ask Tom what a dog is’?

Traditional menu-based conversation

This system, where you choose from a selection of conversational choices, was popularized early on with polished games like Photopia, and even now sees great use in games like Brain Guzzlers from Beyond. This is also, of course, the standard in choice-based games.

Benefits: Easy to program, easy to use, allows the player to know exactly what they’re saying.

Drawbacks: Breaks immersion in parser games, less of a feeling of freedom, interface style doesn’t match the rest of the game.

Hybrid: menu-based conversations with ability to change topics by keywords

This is essentially Emily Short’s Pytho’s Mask and Best of Three, and no other games that I can recall. In these games, you have menus of things to say, but typing in keywords can change your menu.

Benefits: Allows both player flexibility and discovery while letting the player know what exactly they are saying, and cluing the player on interesting things to say.

Drawbacks: Also combines drawbacks of both systems: menus can break immersion or be intrusive, while good keyword mechanics can be opaque and new keywords difficult to find.

Hybrid: Ask/tell with suggested topics.

This is used in a large portion of the blockbuster games that have come out in the last two decades (what I called Class 1 and Class 2 games in this post): Blue Lacuna, Counterfeit Monkey, Eric Eve’s games (and most TADS 3 games in general), etc.

In these games, there are suggestions on what keywords to use. These suggestions can sometimes be found by typing ‘topics’, sometimes found in gray (like Counterfeit Monkey), and sometimes they show up in separate windows (like Blue Lacuna).
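As a rough illustration of how a TOPICS-style suggestion list might work, here is a minimal Python sketch. It is not how any of the games above actually do it. The class name, keywords, and Lestrade's replies are all invented for the example; the only idea taken from the essay is that replies name-drop new keywords, which then show up as suggestions.

```python
class SuggestingNPC:
    """Ask/tell NPC that tracks which keywords the player could try next."""

    def __init__(self, name, responses, opening_topics):
        self.name = name
        # keyword -> (reply text, keywords that the reply name-drops)
        self.responses = responses
        self.suggested = list(opening_topics)  # topics currently on offer
        self.asked = set()

    def ask(self, keyword):
        """Any implemented keyword works, whether or not it was suggested."""
        keyword = keyword.lower().strip()
        if keyword not in self.responses:
            return f"{self.name} has nothing to say about that."
        reply, unlocked = self.responses[keyword]
        self.asked.add(keyword)
        if keyword in self.suggested:
            self.suggested.remove(keyword)
        for new_keyword in unlocked:
            if new_keyword not in self.asked and new_keyword not in self.suggested:
                self.suggested.append(new_keyword)
        return reply

    def topics(self):
        """What the player sees after typing TOPICS."""
        if not self.suggested:
            return f"You can't think of anything else to ask {self.name}."
        return f"You could ask {self.name} about: " + ", ".join(self.suggested) + "."


# Invented sample data, purely for illustration.
lestrade = SuggestingNPC(
    "Lestrade",
    {
        "case": ("\"A burglary in Kensington. Only a letter was taken.\"", ["letter"]),
        "letter": ("\"Addressed to a Mr. Hargreave, I'm told.\"", ["hargreave"]),
        "hargreave": ("\"Never heard of the man before today.\"", []),
    },
    opening_topics=["case"],
)
```

The player stays at the normal prompt throughout: ask() accepts any implemented keyword, and topics() is only a hint about what is likely to be productive.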

Benefits: Provides a similar interface to the rest of the game while preserving the ease of keyword discovery. Some versions let you see what exactly you are going to say.

Drawbacks: Not that many, which is why it’s so popular. Less of the thrill of the discovery of new keywords, a bit of a feeling like you’re being led by the nose. Overall, a very effective strategy, though.

Yes/No conversation

Constraining conversation in this way can allow for extraordinary freedom in coming up with responses. Spider and Web did amazing work with this system, and Gun Mute is surprisingly moving with the way it handles this.

Benefits: Easy to program, easy to use. Full immersion in the game.

Drawbacks: Doesn’t allow much player freedom.

TALK TO [Character]

In this system, a player types TALK TO whenever they meet someone. That person says a piece of dialogue. Repeated TALK TOs can get different answers.

This is popular in scene-driven or action-based games, where conversation only serves to move on to the next scene.

Benefits: Easy to program, easy to use. Doesn’t break immersion.

Drawbacks: Completely linear. Feels like ‘push button to play story’.

The conversation system for my Sherlock Holmes game:

For Sherlock Indomitable, I’m using the conversation system I developed in Halloween Dance and used in Color the Truth and Absence of Law.

This system is essentially a cross between ask/tell with suggestions and menus. The biggest difference is that this conversation system has been designed to bring conversation as close as possible to the rest of gameplay in parser games; that is, to make conversation essentially a medium-sized dry goods problem.

I do this by having topics be persistent in a sort of thought inventory. Standard ask/tell is naturally persistent; a keyword, once you know it’s implemented with one NPC, can be used with many.

Menus and suggested ask/tell are not persistent. A topic, once used, never comes up again.

By allowing topics to persist, my goal is to combine the ease of use of menus (by always knowing exactly what you can say) with the discovery/freedom of expression allowed by ask/tell (since you have to figure out who to say it to).

This system can be immersive, once you’re used to it, as it uses the exact format that inventory does. However, because it’s unusual, many players find it odd or frustrating.

One further feature of this method is that permanence of topics allows actions on topics. Topics can be examined to remember or to get an idea of exactly what you’ll say, and in my last 3 games, topics can be combined to create new topics.
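To make the "thought inventory" idea concrete, here is a minimal Python sketch. It is not the code from any of the games mentioned, which are not written in Python; every class, method, topic, and reply below is a made-up illustration. It only shows the three properties described above: topics persist after being said, they can be examined, and two topics can be combined into a new one.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Topic:
    name: str
    description: str  # what EXAMINE shows: roughly what you will say


@dataclass
class ThoughtInventory:
    topics: dict = field(default_factory=dict)   # topic name -> Topic
    recipes: dict = field(default_factory=dict)  # frozenset of two names -> Topic

    def learn(self, topic):
        self.topics[topic.name] = topic

    def listing(self):
        return sorted(self.topics)

    def examine(self, name):
        return self.topics[name].description

    def add_recipe(self, first, second, result):
        # A frozenset key makes the order of the two topics irrelevant.
        self.recipes[frozenset((first, second))] = result

    def combine(self, first, second):
        """Create a new topic from two held topics; None if nothing fits."""
        if first not in self.topics or second not in self.topics:
            return None
        result = self.recipes.get(frozenset((first, second)))
        if result is not None:
            self.learn(result)
        return result

    def say(self, name, npc_replies):
        """Say a held topic to an NPC. The topic is kept afterwards."""
        if name not in self.topics:
            return "You have no such thought."
        return npc_replies.get(name, "That gets no useful reaction.")


# Invented sample data, purely for illustration.
thoughts = ThoughtInventory()
thoughts.learn(Topic("muddy boots", "Ask about the muddy boots by the door."))
thoughts.learn(Topic("dry weather", "Point out that it has not rained all week."))
thoughts.add_recipe(
    "muddy boots", "dry weather",
    Topic("river walk", "Suggest that someone has been down by the river."),
)
watson_replies = {"muddy boots": "\"Not mine, Holmes,\" says Watson."}
```

The fallback string in say() is the kind of rejection message mentioned under Drawbacks: the more NPCs and topics there are, the more often a player would see it unless each pairing gets its own reply.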

Due to the large number of NPCs in Sherlock Indomitable, I’ve combined this system with TALK TO, so that Sherlock reserves in-depth conversation for characters like Lestrade, Watson, and clients, while just TALKing TO all others.

Benefits: players know what they can say and what exactly they will say, allows for conversational puzzles and player freedom, allows for topic manipulation.

Drawbacks: requires a lot of response writing if every NPC reacts to every topic, and guidance otherwise to prevent players from getting the same rejection message over and over. Unusual, and, to be honest, somewhat clunky.

Conclusion:

Overall, it seems that the most successful conversational method in games that aren’t purely conversation-focused has been ask/tell with suggested topics, closely followed by menus.

The success of my particular conversational system is yet to be seen. Color the Truth and Absence of Law were well-received, but several players have complained about the conversational system. My next game after Sherlock may use traditional ask/tell. But for now, there’s plenty of room for exploration.

For an especially interesting take on conversation in a choice-based game, try 10pm by litrouke.

For me, the problems are not just about the system, but about the tightness of the constraints and how the game communicates that tightness.

Conversation, IRL, is a hugely complicated thing. Words alone can communicate very subtly. “Did you know that Charles has been arrested?”, “You heard about Charles’s arrest?”, “Has Charles really been arrested?”, “Why was Charles arrested?”, “So, Charles has been arrested, right?”, “So … Charles …”, “What did Charles do to get himself arrested?” That’s even before you take into account tone of voice, context (who is Charles? How do these characters know him? How do they know each other?) and so forth.

It’s practically impossible, I’d guess, to have any system, at least one covering any meaningful amount of ground, that can deal with all these variations. Even a system that understood words, and could distinguish between an open question and a closed one, or between a “why” and “when”, or could understand when what looks like a statement is a question, and vice-versa, and could deal with context (all very complex problems) would not deal with things like tone of voice, gesture, and so forth. I don’t suppose, even if an author could manage to do it, that a player would be very keen on

[1]> ASK WITH FEIGNED BUT BELIEVABLE SYMPATHY WHETHER IT IS REALLY TRUE THAT CHARLES WAS ARRESTED BECAUSE HE SEEMED SUCH A NICE GUY WHEN YOU MET HIM LAST YEAR
vs
[2]> ASK WHETHER HE’S SURPRISED THAT CHARLES HAS BEEN ARRESTED, POINTING OUT SARCASTICALLY THAT HE IS SUCH A “NICE” GUY
vs
[3]> ASK WITH GENUINE SHOCK HOW SUCH A STAND-UP GUY AS CHARLES MANAGED TO GET HIMSELF ARRESTED

But IRL we understand the difference between these instantly, and any competent speaker of English could pretty quickly act them out.

Because a game can’t do this, it has to simplify. That is almost bound to lead to “immersion” trouble in one sense, whichever system you use, because there cannot be any guarantee that the author will accommodate the precise nuance that the player wants to communicate.

Pure “ask/tell”, as it seems to me, resolves that by making some very rough assumptions. When the player types “ASK ABOUT CHARLES’S ARREST” it has to assume a form the question will take. But as soon as it does that, it is bound to make assumptions about the player’s attitude. Even if it were sophisticated enough to allow for variation (let us suppose, by tracking the player’s history with Charles to select between [1] and [2] and [3]), it’s quite likely to make mistakes, and those mistakes will be immersion-breaking. Any other system will, I think, be likely to be immersion-breaking in some other way. If you present choices (regardless of how you present them) you are already turning “conversation” into something unnatural, because IRL we don’t entirely consciously control our choices in this way. In any event you are, by the very act of framing choices, altering the player’s perception of the world (perhaps it would never have occurred to the player that Charles is not a nice guy: as soon as we present [1] and [2] as “choices” we are communicating new information about Charles, effectively “out of game”). And anyway, of course, the choices still have the same problem: they will never exhaust the possible range of utterances that the player might have had in mind.

A second, different, problem is that conversation IRL follows very complex conventions (of what is acceptable/rational/polite etc). Quite apart from the difficulty of encoding those conventions, especially outside some highly restricted area, games present the additional problem that we often want to relax them. We do things in games that we never would do IRL. We have somehow to communicate to the player how far ordinary conventions apply (“It’s not OK to try to talk about random subjects in no particular order”) and how far they don’t (“It’s OK, you can talk to total strangers”). We also have to decide whether we will communicate them “in game” (as a fact about the world) or “out of game” (as a fact about the interface):

[1]> ASK CEDRIC ABOUT HIS HALITOSIS
“So Cedric,” you say, “you have a really bad case of bad breath.”

Cedric looks upset. He pointedly ignores your comment, and walks away, muttering something about being late for an appointment.

[2]> ASK CEDRIC ABOUT HIS HALITOSIS
That would be inappropriate.

Both of these communicate that “ordinary rules apply”, but they do so in different ways. [1] positively invites the player in the future to “push” the boundaries (maybe it’s important to be able to antagonise people or get a reputation for brusque rudeness). [2] simply communicates that the game is going to force the player to “stick to the rules”. But of course, for the player, it is not at all a “given” that the game will require you to stick to the rules (and indeed it is common, in some ways, for games not to do so). A game that faithfully models the conventions that apply to conversation between strangers, for instance, could be deadly dull. The cardinal and common sin here is inconsistency. No good telling me that I can’t ask Cedric about his halitosis if I am positively required to ask the Reverend Green about his Viagra prescription.

To my mind, those fundamentally semantic aspects of the immersion problem are much more intractable than the mechanical ones; they are in practical terms unavoidable. However, they are also less serious in practice than they might appear. Players can cope with a wide variety of interfaces, so long as they understand the interface and can get it to do what they want. Guns are not, in the real world, aimed by pressing buttons on a controller: but players can make that work. What they can’t deal with, without frustration, is a controller whose aim is unpredictably off, or which is so fiddly to operate that it’s frustrating. Players can also cope with a wide variety of constraints and departures from reality so long, again, as they understand what the “rules” are.

From this perspective, there simply is no gold standard: conversation in games, as things stand, will sooner or later–and usually sooner–butt its head against the practical impossibility of realistically simulating the incredible complexity of human language and interaction. (That’s not a criticism of those who have tried so hard to make progress on this: it’s a worthwhile objective; it’s just a VERY hard problem.)

The really critical thing is that the game effectively manages the player’s expectations, that it clearly communicates what choices the player has, and that it is consistent. In the end, all the systems offer limited choice. At one extreme “TALK TO” offers only the crude choice “shall I bother communicating or not” (itself, of course, a breach of the ordinary social conventions, where communication is often compulsory and often prohibited). At the other extreme, complex choice-based systems may offer a wide variety of possibilities: but they will never be infinite, and they will rarely ever precisely match every player’s possible desires.

My own instinct/experience is that the systems which work most effectively for me as a player are those that most clearly communicate their constraints, whatever those constraints are. TALK TO and YES/NO and explicit choice are hugely constrained: but their constraints are absolutely explicit. They are completely unrealistic (they bear more or less no resemblance to conversation as it actually takes place); but I “know where I stand”. ASK/TELL often fails to communicate such constraints: I am left unsure about how specific I will be able to be (and that is worse if, as can often be the case, the constraints are inconsistent, with some “chatty” NPCs and some who are only responsive to very limited questions). For me, much as I admire their technical complexity, the hybrid systems a la Alabaster don’t work as well as I’d hope they would: the constraint and “slightly off target” choices are still there, and the strings almost seem more obvious because someone has tried so hard to hide them. I mostly played Alabaster as if it were a choice-based game. I guess I’m saying that in this context I think the “discovery” element is not helpful. Instead of spending time figuring out how to use the freedom I have, I spend quite a bit of time trying to figure out what freedom I may enjoy, and that means (often) the inevitable disappointment of discovering that I don’t have a freedom I hoped I would.

So personally, my take on this is “face up to the fact that the player’s choices will be limited and try to make the limitations consistent and to communicate them clearly so that the player can spend time exercising the freedom you have given her rather than just probing around to find what that freedom is”.

Another benefit of the “suggested topics” approach is that it makes NPC interactions seem very natural. Example:

Tom grins at you.
(You could agree with him, scold him or dismiss his idea.)

> dismiss his idea

The game would of course need to take care of recognizing as many variations as possible for the player’s response. Like “dismiss his idea” vs “dismiss tom’s idea” or “dismiss idea”, or just “dismiss”. Extra work, but it makes it feel rather natural.
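For a sense of how much "extra work" that is, here is a small Python sketch of one possible way to accept those variations. The filler-word list, the function names, and the verb/noun sets for Tom are all assumptions made up for the example. A real parser game would lean on its engine's grammar rather than hand-rolled matching like this.

```python
import re

# Words that carry no meaning for matching purposes (an assumed, partial list).
FILLER = {"a", "an", "the", "his", "her", "their", "him", "them", "with", "for"}


def normalize(command, npc_name):
    """Lower-case the command and drop filler words and the NPC's own name."""
    text = command.lower().replace("\u2019", "'")  # curly apostrophe -> straight
    words = []
    for word in re.findall(r"[a-z']+", text):
        word = word.strip("'")
        if word.endswith("'s"):
            word = word[:-2]  # "tom's" -> "tom"
        if word and word not in FILLER and word != npc_name.lower():
            words.append(word)
    return words


def match_suggestion(command, suggestions, npc_name):
    """Return the key of the suggestion the command refers to, or None."""
    words = normalize(command, npc_name)
    if not words:
        return None
    verb, rest = words[0], words[1:]
    for key, (verbs, nouns) in suggestions.items():
        # The verb must be known; any remaining words must be expected nouns.
        if verb in verbs and all(word in nouns for word in rest):
            return key
    return None


# suggestion key -> (accepted verbs, accepted nouns); invented for this example.
TOM_SUGGESTIONS = {
    "agree": ({"agree", "yes"}, {"idea"}),
    "scold": ({"scold", "chide"}, {"idea"}),
    "dismiss": ({"dismiss", "reject"}, {"idea"}),
}
```

With this table, "dismiss his idea", "dismiss tom's idea", "dismiss idea", and plain "dismiss" all resolve to the same suggestion, while something like "dismiss the dog" falls through to None and can be handed back to the regular parser.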

At this point though, wouldn’t it just be easier to use a choice menu?

Yeah, I know, some feel that pressing numbers like you’re in a phone menu destroys immersion, etc. etc. But saving all the work of accounting for every possible phrasing the player can come up with, and making a one-to-one circuit of comprehension, seems optimal in this case.

I think where the author can improve the pushbutton-simplicity of dialogue trees is by creating conversation trees that aren’t just “yes/no” and using cycling and random text to make it feel as though you’re not hitting the same dialogue nodes over and over. Sometimes it’s even possible to have several choices that lead to the same thing but give the player a little freedom to add nuance to how they respond, and that’s the author’s job. Even better if there’s some kind of system where rudeness or politeness can register despite essentially following the same branch, so that the tone of the conversation skews based on what the player chooses while still following the same logical path. It’s surprising how far you can get by having three “yes” answers, all phrased in different ways, so the player gets an attitude choice even though they are responding the same and hitting the same node, perhaps with a slightly modified response. It’s a trick, but it works. At least on the first playthrough!
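Here is a minimal Python sketch of the three-"yes" trick, under the assumption that tone is tracked as a single running number. The class names, the rapport counter, and the sample lines are invented for the example and are not taken from any particular game or engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Choice:
    text: str    # what the player sees and says
    target: str  # the node this choice leads to
    tone: int    # -1 rude, 0 neutral, +1 polite


class Conversation:
    def __init__(self, nodes):
        # node name -> {"rude": ..., "neutral": ..., "polite": ...}
        self.nodes = nodes
        self.rapport = 0

    def choose(self, choice):
        """Follow a choice; return the target node's text, coloured by rapport."""
        self.rapport += choice.tone
        variants = self.nodes[choice.target]
        if self.rapport < 0:
            return variants["rude"]
        if self.rapport > 0:
            return variants["polite"]
        return variants["neutral"]


# Three ways of saying "yes": same branch, different attitude.
YES_CHOICES = [
    Choice("\"Fine. Whatever.\"", "accept_job", -1),
    Choice("\"Yes.\"", "accept_job", 0),
    Choice("\"I'd be delighted to help.\"", "accept_job", +1),
]

NODES = {
    "accept_job": {
        "rude": "She hands over the key without a word.",
        "neutral": "She nods and hands over the key.",
        "polite": "She smiles warmly and presses the key into your hand.",
    }
}
```

All three choices land on the same node, so the story logic has one branch to maintain, but the running rapport value lets later text skew colder or warmer.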

I found this worked really well using Hybrid Choices and cycling text in Fair. Essentially, in the book-selling minigame there are only four actual responses the player can give, repeated over and over, but I was able to cycle the choice text to reflect the player getting more and more frustrated whenever people repeatedly ask if their book is like Harry Potter. Or when people ask what the title of the author’s new book is, the three choices the player can make get wilder and wilder.
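The cycling itself needs very little machinery. The following Python sketch is not the actual code from Fair, and the class name and sample lines are invented. It shows one way to step choice text through a list of variants, either stopping on the last one or looping back to the start.

```python
class TextSequence:
    """Returns the next variant on each use.

    With loop=False the sequence sticks on the final variant;
    with loop=True it wraps around to the first one.
    """

    def __init__(self, variants, loop=False):
        if not variants:
            raise ValueError("at least one variant is required")
        self.variants = list(variants)
        self.loop = loop
        self.uses = 0

    def next(self):
        if self.loop:
            index = self.uses % len(self.variants)
        else:
            index = min(self.uses, len(self.variants) - 1)
        self.uses += 1
        return self.variants[index]


# Invented example: the same underlying response, worded with rising frustration.
not_like_that_book = TextSequence([
    "\"It's a bit different, actually.\"",
    "\"No, not really. No wizards.\"",
    "\"No. It is nothing like that.\"",
    "\"...\"",
])
```

Each time the same question comes up, the menu would show not_like_that_book.next() as the label for that choice, while the node it leads to stays the same.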

A choice menu switches to a different input mode. With the suggestions, you’re still at the regular prompt, meaning you can enter something else and ignore the suggestions. I find switching input modes tends to take me out of the “flow”, so to speak. And it feels more restrictive; “these are your only choices” vs “you can try anything you want, but here’s a couple suggestions.”