Ideal Language for Playing Interactive Fiction

I know that English isn’t very well suited to playing interactive fiction, especially because of the number of prepositions and the lack of a marker for the accusative case.

Which features would a “perfect” or “ideal” language for playing IF have? At the moment the best one I can think of is Classical Latin, for three reasons: no capitalization, no articles, and no distinction between a container and a supporter (“on” and “in” are the same word).

Dare I ask what you would do with the answer to this question?

Nothing, probably. I’m not good enough with Inform to write a translation for it. I just thought it would be interesting.

Why would not having prepositions to distinguish between whether something is inside something or on top of it be a good thing?

I don’t know if others share this, but when I was starting out it was enormously frustrating to get things like “>PUT COIN IN LEFT PAN OF BALANCE. That can’t contain things. PUT COIN ON LEFT PAN. Placed.” Internally, aren’t containers and supporters both just objects which can have a hierarchy?

That’s the fault of the author, not English. It’s very conceivable to have a container that is also a supporter…

True, but for something like that to be implemented in current IF languages it usually needs to be split into two parts (a chest and a lid, for example). In that case, PUT APPLE IN CHEST and PUT APPLE ON LID could be translated as PONE MALUM IN THECA and PONE MALUM IN OPERCULUM without adding any ambiguity.

I remember hearing a linguistic rumour that Russian words conjugate (declench?) to each other with such rigor that in a given sentence you can include the object, verb and subject in any order and still, sometimes even with words missing, be intelligible. If a parser could be made smart enough to understand that, it seems like it would go far toward understanding player intentions even when not phrased the particular way the author was expecting.

There may be actual Russian IF out there that can confirm or explode this myth, though I don’t know that any of us are qualified to evaluate them.

There are many highly inflected languages with free word order, but I think they might actually be harder for IF, because in IF you are mostly interacting with third person singular objects.

I think actually English is the best language for IF.

Consider: the imperative is the same as the first and second person singular, and all of them are the same as the infinitive. Hence the ambiguity between “you’re telling the computer to do something” and “you’re saying what YOU would like to do”. Some non-English games get very confusing here: they often won’t accept the first person form, and then who knows if they’ll accept, say, “prendere” (Italian for “take”, infinitive) or “prendi” (second person singular/imperative, I think).

Also, there’s gender and hyphens. “Prends-le”. “Prendi-la/prendi-lo”. (Or is it prendilo/prendila? Does Italian use hyphens? I’m confusing myself. French definitely does.) In Spanish, “> X ME” has to be “XME” because apparently they don’t use hyphens OR separate the pronoun from the verb, so “XME” is short for “EXAMINAME”, and “X ME” will not work (though X is a recognised synonym).

Also, English has the great advantage of having its cardinal points and up/down directions all start with different letters.

Also, basic verb/noun reductions still make for comprehensible, natural English. Not always so in other languages. Also, English words tend to have fewer syllables, which makes them easier to type. I mean, do you want to compare “get coin” with “apanhar moeda”? It’s a mouthful.

I find English to be THE perfect language for IF.

Disclaimer: I know nothing of German, Swedish or Russian.

I think what you really mean is what is the language that makes creating interactive fiction easier for the programmer, right? For the player, any language would do as long as the parser supports it well (and therefore the obvious answer to the question would be “the player’s native language”) :)

For programmers, I think English is one of the easiest. If I compare English e.g. to Spanish, in Spanish you have to worry about at least (1) supporting characters not in 7-bit ASCII and all the related text encoding problems that can ensue, (2) supporting input verbs in imperative, infinitive or first person (in English those three forms are the same), (3) building sentences in different ways depending on noun and article gender, as in “el árbol” vs. “la mesa” (common nouns and articles don’t have a gender in English), (4) supporting clitics (“eat it” -> “come + lo” -> “cómelo”, all together and with an accent that doesn’t appear in the simple form of the verb “come”), and (5) contractions (“de el” -> “del”). Although most of this falls on the shoulders of the programmers of IF systems, the programmers of actual games also have some hassles, like defining their nouns as masculine or feminine, maybe defining the imperative/first person for some exotic verbs specific to their game, etc.
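To make points (2), (4) and (5) a bit more concrete, here is a rough Python sketch of the kind of normalization a Spanish parser has to do before it can even look at the command. Everything in it is an assumption for illustration: the tiny `KNOWN_IMPERATIVES`, `CLITICS` and `CONTRACTIONS` tables and the function names are made up, and a real system would need much larger tables plus the context-dependent handling this sketch ignores.

```python
# Hypothetical, deliberately tiny tables; a real system would need far more.
ACUTE = str.maketrans("áéíóú", "aeiou")  # "cómelo" carries an accent that "come" lacks
CLITICS = ("los", "las", "les", "lo", "la", "le", "me", "te", "se")  # longest first
CONTRACTIONS = {"del": ["de", "el"], "al": ["a", "el"]}
KNOWN_IMPERATIVES = {"come": "comer", "coge": "coger", "examina": "examinar"}


def normalize_verb(word):
    """Map an imperative, with or without an attached clitic, to [infinitive, clitic?]."""
    plain = word.translate(ACUTE)
    if plain in KNOWN_IMPERATIVES:
        return [KNOWN_IMPERATIVES[plain]]
    for clitic in CLITICS:
        stem = plain[: -len(clitic)]
        if plain.endswith(clitic) and stem in KNOWN_IMPERATIVES:
            return [KNOWN_IMPERATIVES[stem], clitic]
    return [word]  # unknown verb: leave it for later stages


def normalize(command):
    """Expand contractions everywhere and normalize the verb in first position."""
    tokens = []
    for position, word in enumerate(command.lower().split()):
        if word in CONTRACTIONS:
            tokens.extend(CONTRACTIONS[word])
        elif position == 0:
            tokens.extend(normalize_verb(word))
        else:
            tokens.append(word)
    return tokens


print(normalize("Cómelo"))
print(normalize("coge la llave del cofre"))
```

None of this is needed for English, where “eat it” already arrives as two plain ASCII tokens.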

There are even some common constructions in Spanish that are very ambiguous to parse and that systems just don’t handle by default. For example, “se” is a really devilish word: it can be an indirect object pronoun (“mandárselo” - to send it to him/her, “se lo mandé” - I sent it to him/her), a reflexive pronoun (“lavarse” - to wash oneself), or just an emphatic pronoun that does nothing at all (“comérselo” - to eat it, totally equivalent to just “comerlo” - compare with “mandárselo” above). Something similar happens with the equally common pronoun “le”, and it’s impossible to handle these kinds of things by default because they are dependent on context. So they have to be handled individually and context-sensitively by the IF author, which is mostly not done because it’s a lot of work to handle a couple of words that most players will not use due to previous bad experiences anyway. This is the most important drawback of IF in Spanish IMO.

Portuguese, Galician, Catalan and probably Italian have problems similar to those of Spanish (in some cases even more pronounced), and I wouldn’t vouch for French but I imagine it’s not very different in these respects either.

I wrote my system Aetheria Game Engine for Spanish, and then created the option to write IF in English too, and the adaptation to English (apart from the obvious translation of default messages, etc.) was mostly about ignoring things and deleting code. Gender? Just ignore it. Methods to generate an article+noun depending on gender? Unneeded. Methods to convert imperative to infinitive? Unneeded. Methods to find clitics? Unneeded. Etc. The only thing I actually had to add for English was the support for phrasal verbs such that, e.g., “pick the sword up” is understood as “pick up the sword”.
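For what it’s worth, the phrasal verb case can be sketched in a few lines of Python. This is only an illustration under assumptions, not how Aetheria Game Engine actually does it: the `PHRASAL_PARTICLES` table and the `join_phrasal_verb` name are hypothetical, and the sketch only handles a particle stranded at the very end of the command.

```python
# Hypothetical table: verb -> particles that may be separated from it.
PHRASAL_PARTICLES = {
    "pick": {"up"},
    "put": {"down"},
    "take": {"off"},
    "turn": {"on", "off"},
}


def join_phrasal_verb(command):
    """Move a trailing particle next to its verb: 'pick the sword up' -> 'pick up the sword'."""
    words = command.lower().split()
    if len(words) < 3:
        return words  # nothing between verb and particle, so nothing to move
    verb, last = words[0], words[-1]
    if last in PHRASAL_PARTICLES.get(verb, ()):
        return [verb, last] + words[1:-1]
    return words


print(join_phrasal_verb("pick the sword up"))
print(join_phrasal_verb("pick up the sword"))
```

Both spellings end up as the same token list, so the rest of the parser only ever has to know about “pick up”.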

Even easier than English is Esperanto. That’s a language that’s been artificially made to be regular, so it’s very easy to program IF systems and IF itself for it.

I suppose Japanese and Chinese should be easy too, as they have a simple grammar (the big problem in those is segmentation, but for something as specific as IF where you are always parsing the same kind of constructions, that shouldn’t be a huge problem). Regarding languages with declensions such as German or Latin, I don’t think they are easier for this purpose. They would be in theory, if you could always go from a word form to its lemma (base form) + declension, but that’s not the case in practice. At least in Latin (I don’t know about German) there are many ambiguities there: words that have the same form in accusative and dative, and even word forms that could be the nominative of a given noun or the accusative of another, and things like that. So declension is great for human speakers (who have few or no problems with ambiguity) but a pain for the programming of an IF system.

Just as a curiosity, in real unrestricted natural language parsing (which doesn’t have much to do with IF parsing) the languages for which parsers typically get the best precision tend to be English and Japanese, with Chinese following more or less closely. Then we have the Romance and the Germanic languages. The Slavic languages tend to be a bit more difficult, and Arabic and especially Turkish are very difficult and people get crappy precision (Turkish is noteworthy for being an agglutinative language that includes a lot of morphological information in a word). Note that this is a very rough and arguable outline, as these things depend on the domain of the texts, the availability of corpora, the number of people that happen to work on parsing or building grammars for a language, etc.; but that’s more or less the picture for natural language parsing.

Your points hold up well in comparison to both Swedish and German.

Interesting question! Some general thoughts on this:

  1. Flexible word order should be a lot harder to parse than strict word order. (That probably means: English and Romance languages easier than German, German easier than Slavonic languages.)

  2. No articles (Latin, Slavonic languages) does sound nice at first. In fact these languages will be pretty nightmarish to parse, since (as Al-Khwarizmi pointed out) there are ambiguities in the inflection system. Some endings will just translate to many different cases, depending on context. For example in Czech “muže” would be either accusative or genitive case singular, “růži” would be either accusative or dative case (or with sloppy spelling habits it could even be instrumental - though actually instrumental should be: “růží”).

  3. Prepositions are most likely a very good thing(!), because they help the parser recognize where a noun phrase ends. Flexible word order AND lack of prepositions make for very hard parsing. An example: one of the hardest verbs for German IF parsers seems to be “geben” (give).

Compare:

English: give the ancient book TO the tall man

Here the preposition “to” clearly separates the two noun phrases (even when you drop articles).

German:

Gib das alte Buch dem großen Mann. OR Gib dem großen Mann das alte Buch.
As you can see the word order in German is flexible and there is NO preposition separating the two objects. The parser has to understand that “dem großen Mann” is dative case (to) and “das alte Buch” is accusative case. But first it has to separate “dem großen Mann das alte Buch” into two chunks. It could use the articles to do this, but if you drop articles (and players like to do that) it gets difficult.
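A rough Python sketch of the article-based approach, just to show where it works and where it breaks. The article sets and the `split_give_objects` function are assumptions for illustration; “der” is left out on purpose because it is both dative feminine and nominative masculine, and “den” is treated as accusative although it can also be dative plural.

```python
# Hypothetical article tables; deliberately incomplete (see the caveats above).
DATIVE_ARTICLES = {"dem", "einem", "einer"}
ACCUSATIVE_ARTICLES = {"den", "das", "die", "einen", "ein", "eine"}


def split_give_objects(command):
    """Split 'gib <NP> <NP>' into recipient (dative) and item (accusative), or return None."""
    words = command.lower().rstrip(".!").split()
    if not words or words[0] != "gib":
        return None
    phrases = []  # list of (case, [words]) in the order they were typed
    for word in words[1:]:
        if word in DATIVE_ARTICLES:
            phrases.append(("dative", [word]))
        elif word in ACCUSATIVE_ARTICLES:
            phrases.append(("accusative", [word]))
        elif phrases:
            phrases[-1][1].append(word)  # adjective or noun: belongs to the open phrase
        else:
            return None  # the player dropped the article, so there is no boundary marker
    cases = {case: " ".join(chunk) for case, chunk in phrases}
    if len(phrases) != 2 or set(cases) != {"dative", "accusative"}:
        return None
    return {"recipient": cases["dative"], "item": cases["accusative"]}


print(split_give_objects("Gib das alte Buch dem großen Mann."))
print(split_give_objects("Gib dem großen Mann das alte Buch."))
print(split_give_objects("Gib altes Buch großem Mann"))
```

Both word orders come out the same as long as the articles are there. Once they are dropped the function gives up, and a real parser would have to fall back on adjective endings or on its knowledge of the objects in scope.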

Compare Italian:

Dai il libro antico all'uomo alto. OR Dai l'antico libro all'uomo alto.
Here you would have to handle all these strange articles changing from il to l’ and i to gli, and prepositions that are contracted with an article (like a/all’), depending on whether a vowel follows or not. But at least you have the preposition that tells you where to separate your noun phrases.

In Slavonic languages you have an even more flexible word order than in German, hardly any to no articles separating your objects and even fewer prepositions (for example the preposition “with” is sometimes expressed with pure instrumental). A Czech example (hopefully correct): “Dej Marii růži.” This could be “give the rose to Mary” or “give Mary to the rose”. I guess the parser would have to ask the world model to find out which one is more likely.

  4. Frankly I don’t remember much Latin, but I remember that it was quite similar to the Slavonic languages (no articles, strong inflection …), so I guess it would be a rather difficult language to parse.

  5. Esperanto could be nice indeed! I don’t know a lot about it, but apparently the ending vowel of a word often tells you if it’s a noun or a verb. No more problems like “fly” vs. “fly”. Neat.

(Of course I talked about programming a parser for a given language. I do now realize that maybe you were really just talking about the player experience? …?)

I was talking about both, really. Capitalization, for example, doesn’t matter much to the player, but it makes things a lot harder to program.

I believe some work has already been done on Esperanto IF – plover.net/~davidw/ifindex.html#if1_eo

Esperanto would also mitigate the problem of inflected languages in IF–there is only one stem for each verb, so just remove a final -u or -*s from all words in the player’s command to get the verb stems.
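As a quick sanity check of that idea, here is a minimal Python sketch. It assumes “-*s” means any of the finite endings -as, -is, -os, -us, and the `verb_stems` name is made up for the example. It blindly strips those endings (and the imperative -u) from every word, exactly as described, with no dictionary behind it.

```python
import re

# Finite verb endings -as/-is/-os/-us and the imperative -u, anchored at the end of the word.
VERB_ENDING = re.compile(r"(?:[aiou]s|u)$")


def verb_stems(command):
    """Return the stem of every word in the command that ends like a verb."""
    stems = []
    for word in command.lower().split():
        stem = VERB_ENDING.sub("", word)
        if stem and stem != word:  # something was stripped and something is left
            stems.append(stem)
    return stems


print(verb_stems("prenu la moneron"))
print(verb_stems("mi prenas la libron"))
print(verb_stems("donu unu moneron"))
```

Nouns (-o), adjectives (-a) and their accusative forms (-on, -an) pass through untouched, which is what makes the trick attractive. The last call shows the catch: a few non-verbs such as “unu” also end in -u, so the parser would still need a stem list to filter false positives.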

“Give the tall man the ancient book” is also good English.

And I know that that syntax is recognized by Inform. Have any of you run into an ambiguity with this? I can think of ways to make it happen, but I don’t think I’ve ever seen it happen.
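One way to see where an ambiguity could come from is to brute-force every place the two noun phrases might be split and check each half against the objects in scope. The Python sketch below does that; it is not how Inform works internally, and the `SCOPE` and `ANIMATE` tables and the function names are hypothetical.

```python
# Hypothetical objects in scope, each with the set of words that may refer to it.
SCOPE = {
    "tall man": {"tall", "man"},
    "ancient book": {"ancient", "book"},
}
ANIMATE = {"tall man"}  # only these may receive something
ARTICLES = {"the", "a", "an"}


def match_object(words, scope):
    """Names of in-scope objects whose vocabulary covers every given word."""
    return [name for name, vocab in scope.items() if words and set(words) <= vocab]


def parse_give(command, scope=SCOPE, animate=ANIMATE):
    """Return every (recipient, item) reading of 'give <NP> <NP>' that the scope allows."""
    words = [w for w in command.lower().split() if w not in ARTICLES]
    if not words or words[0] != "give":
        return []
    rest = words[1:]
    readings = []
    for cut in range(1, len(rest)):  # try every boundary between the two phrases
        for recipient in match_object(rest[:cut], scope):
            for item in match_object(rest[cut:], scope):
                if recipient in animate and recipient != item:
                    readings.append((recipient, item))
    return readings


print(parse_give("give the tall man the ancient book"))
```

With the scope above there is exactly one reading. It only becomes ambiguous when object vocabularies overlap: with a boy, a boy scout, a knife and a scout knife all in scope, GIVE BOY SCOUT KNIFE comes back with more than one reading and the game would have to ask.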

Actually Swedish is perfect for IF. Here in Sweden we mostly speak in the same grammatical tense, and rarely deviate from the two-word sentence structure:
“Get the icepick.” = “Ta ishackan.”
“Slay the moose.” = “Slakta älgen.”
“Unlock the chest with the red key.” = “Upplås rödnyckelkistan.”
“Take all.” = “Ta allt.”
“Take inventory.” = “I.”

On a serious note, I’d bet on Esperanto or some other artificially created language. Just anything but German, really.