Deep Reinforcement Learning for IF

Hey,

I’m a PhD student working on deep reinforcement learning, and our team is very interested in bringing the problem of solving text-based adventure games into the mainstream. However, we are not experts in these games and feel slightly lost. We are thinking about first conducting a human study to see how people generally (learn to) play these games. Do you guys have any suggestions about which games we should test and why? For example, I don’t think that epic stuff like Zork is necessarily a good idea to try, due to obvious time constraints. Another thing is that we are not particularly interested in huge mazes or Sokoban-style puzzles, and we would refrain from super cryptic puzzles of the nature “there was one guy in room 2 who had a purple shirt, so now !!!obviously!!1 you have to drink the yellow potion in room 98, because it is the complementary color of purple and there are 100 rooms and 100 - 2 = 98, duh”.

Are you guys interested in helping out?
Do you have any concrete suggestions or questions?
Do you know of any datasets of people playing through these games?

Thanks!

Some suggested games (of short or moderate length):

Aisle by Sam Barlow
ifdb.tads.org/viewgame?id=j49crlvd62mhwuzu

Galatea by Emily Short
ifdb.tads.org/viewgame?id=urxrv27t7qtu52lb

69,105 Keys by David Welbourn
ifdb.tads.org/viewgame?id=j3rwlhuy6j6v79qj

Castle of the Red Prince by C.E.J. Pacian
ifdb.tads.org/viewgame?id=bw3bnlf4ho8gqq1v

Captain Verdeterre’s Plunder by Ryan Veeder
ifdb.tads.org/viewgame?id=8u5me2jkkw3icqa9

16 Ways to Kill a Vampire at McDonalds by Abigail Corfman
ifdb.tads.org/viewgame?id=s8oklhvdqoo5dv4l

To Hell in a Hamper by J.J. Guest
ifdb.tads.org/viewgame?id=6d9dfn2akcrlq1bu

The Axolotl Project by Samantha Vick
ifdb.tads.org/viewgame?id=grmj2pkmo37x3fzs

Cactus Blue Motel by Astrid Dalmady
ifdb.tads.org/viewgame?id=7e699ifb6u3767yr

Dinner Bell by Jenni Polodna
ifdb.tads.org/viewgame?id=94vyohlxpwgwwl39

Toby’s Nose by Chandler Groover
ifdb.tads.org/viewgame?id=xf5y04yekcrqtnc

Oppositely Opal by Buster Hudson
ifdb.tads.org/viewgame?id=z3scyumvgsr75blm

Coloratura by Lynnea Glasser
ifdb.tads.org/viewgame?id=g0fl99ovcrq2sqzk

All Things Devours is a personal favorite: a short game that requires playing over and over again to get a satisfactory ending. (In a good way.)

Damnatio Memoriae also takes only a few minutes to play through, but many, many playthroughs to get a happy(ish) ending.

Savoir-Faire and Suveh Nux involve learning to understand a complex and rigorously-implemented system, to make it do what you want.

So, You’ve Never Played a Text Adventure Before, Huh? is a tutorial game that explains what to do along the way. It’s got your standard parser stuff. Compass directions. Examining things. Pushing stuff around. I’d recommend this if you want a game with traditional world modeling, and you want people to “type anything” and see how well they catch on to what works.

The Dreamhold is another tutorial game, supposedly. It also starts off by giving the player help, but it’s fairly large and becomes pretty complicated. If you want people to finish a game, avoid this one; but if it’s fine for them to spend just a while at the start, it might be a very good pick.

Lime Ergot is one I always recommend for new players. It’s very easy because it has a restricted command set where one verb, EXAMINE, carries 90% of the gameplay. It’s also short. You can finish in twenty minutes. However, it’s not a tutorial game, so players are expected to already know basic parser commands. Also, since it’s centered around just a few verbs, players have less room to attempt different actions (a good thing, to my mind, but perhaps not for your testing purposes).

Hypertext games are easier to learn and play than parser games. As on a regular web page, you play a hypertext game simply by activating links, which represent options or choices in an interactive story or simulation.

“Parser” games accept commands from the player in a simple formalised subset of natural language, e.g. “GO NORTH” or “PUT THE CAT ON THE MAT” (parsed into a verb “PUT”, a direct object “CAT” and an indirect object “MAT”).
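
Just to illustrate the structure a parser extracts, here is a toy sketch (nothing like a real IF parser, which also handles disambiguation, pronouns, multiple objects and so on):

```python
# Toy two-object command parser (illustrative only; real IF parsers
# also handle disambiguation, pronouns, "ALL", abbreviations, etc.).
ARTICLES = {"THE", "A", "AN"}
PREPOSITIONS = {"ON", "IN", "UNDER", "WITH", "TO"}

def parse(command):
    words = [w for w in command.upper().split() if w not in ARTICLES]
    verb, rest = words[0], words[1:]
    for i, w in enumerate(rest):
        if w in PREPOSITIONS:                   # split at the first preposition
            return {"verb": verb, "noun": " ".join(rest[:i]),
                    "prep": w, "second": " ".join(rest[i + 1:])}
    return {"verb": verb, "noun": " ".join(rest)}

print(parse("PUT THE CAT ON THE MAT"))
# {'verb': 'PUT', 'noun': 'CAT', 'prep': 'ON', 'second': 'MAT'}
print(parse("GO NORTH"))
# {'verb': 'GO', 'noun': 'NORTH'}
```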

Reduced-parser games simplify the command-line interface by reducing the effective number of verbs the game accepts (a regular parser game may accept 100 functionally different verbs). This may make them easier to learn by statistical methods, because valid player input is more constrained.
There are 22 hits for the tag limited verbs on IFDB.

Inside the Facility (only ~5 different commands needed) is a notable example.
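
Back-of-the-envelope arithmetic on why this matters for a learning agent (the numbers below are invented for illustration):

```python
# Rough size of the per-turn action space (invented numbers).
verbs_full, verbs_limited, nouns = 100, 5, 50   # nouns visible in a scene

print(verbs_full * nouns)      # 5000 candidate verb+noun actions per turn
print(verbs_limited * nouns)   # 250 -- far fewer actions for an agent to explore
```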

Short games are easier to solve. There are 4 hits for the tags jam+parser, and 3 pages for short+parser.

As far as play datasets go, it’s common to record a transcript of commands entered and the responses printed by the game into a text file. Club Floyd has a large number of annotated transcripts for parser games available online.
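
If you scrape such transcripts, splitting them into (command, response) pairs is simple when commands are echoed after a “>” prompt. A hedged sketch (transcript formats vary, so adjust to the files you actually get):

```python
# Split a parser-game transcript into (command, response) pairs,
# assuming commands appear on lines starting with ">" (a common,
# but not universal, transcript convention).
def parse_transcript(path):
    pairs, command, response = [], None, []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.startswith(">"):
                if command is not None:
                    pairs.append((command, "".join(response).strip()))
                command, response = line[1:].strip(), []
            else:
                response.append(line)
    if command is not None:
        pairs.append((command, "".join(response).strip()))
    return pairs
```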

You can find online videos of people playing parser games and hypertext (e.g. Twine) games, for example Lynnea Glasser’s live reviews.

Thank you guys for the suggestions! I’ll go over these and come back to you. Great community!

Hey,

Thanks for the great suggestions.

  • Suveh Nux and Inside the Facility seem like great suggestions, since these games focus on one specific type of problem.
  • So, You’ve Never Played a Text Adventure Before, Huh? is also a great suggestion.
  • Thanks for the Club Floyd suggestion; we’ve started checking out the playthroughs.
  • By the way, I forgot to mention that we are primarily looking at Z-machine games written in Inform at this time.

Based on your suggestions, we saw that certain types of puzzles are more approachable than others.
For example, some of the games require very extensive knowledge about a specific topic, e.g. ratios in 69,105 Keys, which is undesirable.

Could you guys help us identify the main types of recurring puzzles (that seem to be approachable) and maybe provide a couple of examples for each type?

Games designed for children or for beginners tend to have fewer, simpler and more straightforward puzzles.

The Inform Designer’s Manual (Nelson 2001) has an approachable section on puzzles including examples.

The TADS 3 Technical Manual section “Puzzles, whether and how” also includes some comments on puzzles.

ifwiki.org/index.php/Cruelty_scale
The Cruelty Scale (Plotkin, 1996) is an attempt to identify elements in games that players consider unfair or cruel, and to classify games by how many and how severe those elements are.

IFDB editors can note a cruelty (or “forgiveness”) level for a game.
ifdb.tads.org/help-forgiveness

Games on IFDB with the cruelty rating “Merciful”, the least cruel (most forgiving) rating.
ifdb.tads.org/search?searchfor=f … s:Merciful

“Merciful” Inform games.
ifdb.tads.org/search?searchfor=f … m%3Ainform

“Merciful” z-machine games.
ifdb.tads.org/search?searchfor=f … AZ-Machine

You can browse authoring systems and file formats on the IFDB search page.
ifdb.tads.org/search

Note that the cruelty scale is about one specific dimension of fairness/cruelty, basically the extent to which the game can be put in an unwinnable state (and how well clued those unwinnable states are). If a game has no unwinnable states, but it has a puzzle that most users can’t be expected to solve (like some kind of guess-the-verb or guess-the-action), it’ll be considered “merciful” on the Plotkin scale but still considered “unfair.”

Hey! Sorry, but I just saw this thread. You might be interested in an article I wrote in SPAG #64 :slight_smile:

Hello :slight_smile:

I’m working on something similar at the moment - in fact, I asked a similar question about a year ago: https://intfiction.org/t/ai-for-if-games-question-about-different-kinds-of-if-games/9888/1 (Also, I just realised I’m mentioned in Hugo’s article. Sorry for not responding to your e-mail, Hugo, I’ve been too busy…)

At the moment, I’m working on a library that simplifies access to various IF games: github.com/MikulasZelinka/pyfiction
To be more precise, I’ve been working on some algorithms that I’ve so far only tested on two games, and now that I’ve got something promising, it’s time to extend the library to support more games for testing the agents more extensively.

Anyway, I would really appreciate any tips for games with the following features:

  • hypertext or choice-based game (no parsing of the player’s input)
  • multiple endings
  • it is possible to find all the endings and assign a reward to each (this can be done manually)

I did try going through IFDB using tags and found these:

The Space Under the Window: ifdb.tads.org/viewgame?id=egtt6j8vif0plzrc
howling dogs: ifdb.tads.org/viewgame?id=mxj7xp4nffia9rbj

The thing is that I can’t think of a simple way to (ideally automatically) find all the endings and assign rewards to them. One way to overcome this would be to simply have the author (or anyone who knows the game well) write down all the endings and rate how ‘good’ or ‘bad’ each one is.
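
For games that are small and deterministic, one option would be to enumerate the endings by exhaustively replaying every choice path, and then annotate the terminal texts by hand. A sketch, assuming a hypothetical simulator interface with state(), choices() and choose() (not any existing library’s API):

```python
# Exhaustively explore a deterministic choice-based game to collect endings.
# `make_game()` returns a fresh game; state() gives a hashable snapshot,
# choices() the currently available options (empty at an ending), and
# choose(i) picks option i. This whole interface is hypothetical.
def find_endings(make_game):
    endings, seen, stack = [], set(), [[]]    # stack holds choice paths
    while stack:
        path = stack.pop()
        game = make_game()
        for i in path:                        # replay the path from the start
            game.choose(i)
        state = game.state()
        if state in seen:
            continue
        seen.add(state)
        options = game.choices()
        if not options:                       # no choices left: an ending
            endings.append((path, state))
        else:
            stack.extend(path + [i] for i in range(len(options)))
    return endings

# Rewards can then be assigned by hand, e.g.
# rewards = {ending_state_a: 1.0, ending_state_b: -0.5, ...}
```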

Do you guys have any tips on more suitable games or, perhaps more importantly, could anyone help me with the ending annotation for the two games mentioned above (or for any other suggested games)?

Thank you all very much! :slight_smile:

Some choice-based games with multiple endings of varying reward:

The Play - ifdb.tads.org/viewgame?id=ytohfp3jetsh1ik4

16 Ways to Kill a Vampire at McDonalds - ifdb.tads.org/viewgame?id=s8oklhvdqoo5dv4l

There’s a long list of multiple-ending hypertext games here:
ifdb.tads.org/search?sortby=rcu& … tag%3Acyoa

What rewards are you trying to assign? Are they something inside the program, or are they rewards that the player would get for completing the game, like a trophy?

I ask because there are lots of games where rewarding the player with trophies would feel inappropriate. howling dogs is one game like that: it doesn’t have “good” and “bad” endings. Plugging the game into a library that awards trophies would be a little too much like modifying the game itself. And if Porpentine had wanted the game to have rewards, she most likely would’ve included them.

Personally, I wouldn’t want people to get rewards for reaching different endings in my games. I try my best to write them so that players don’t feel the need to collect all the endings.

“Trophies” aren’t appropriate in every game. I know some people enjoy completionism, and sometimes it’s a good idea to mark the endings they’ve reached (without calling it a “trophy” - “achievement” may or may not be more appropriate). It’s an even better idea to hide this somewhere, like in the options, in case someone doesn’t want to play that way. When I like a game, I like to be able to know how much of it I have and haven’t seen, once I’m past the point of avoiding spoilers.

Per usual, every game, and your mileage, may vary!

I think the point is to assign machine-readable values for “good” or “bad” endings, so that the algorithm can become better at finding “good” ones.

That’s what I thought at first, but since the result is supposed to be a “library that would simplify access to various IF games,” I wasn’t sure how the rewards would figure into the final user experience. If they’re purely mechanics under the hood, it probably won’t be an issue.

Exactly. I didn’t mean something like “trophies” or “achievements”, but rather “feedback”. The agent needs to have some feedback supplied to it; this feedback is what actually drives the learning.
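
In RL terms, it’s the standard episodic loop, where the ending’s reward may be the only informative signal. A generic sketch (the env and agent interfaces are placeholders, not pyfiction’s actual API):

```python
# Generic episodic loop: the reward observed at the ending is the
# feedback that drives learning. `env` and `agent` are placeholders.
def run_episode(env, agent):
    state, done, total = env.reset(), False, 0.0
    while not done:
        action = agent.act(state)                     # pick one of the offered choices
        state_next, reward, done = env.step(action)   # reward often 0 until an ending
        agent.observe(state, action, reward, state_next, done)
        state, total = state_next, total + reward
    return total                                      # e.g. +1 good ending, -1 bad one
```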

I do realise that this whole concept of “good” and “bad” doesn’t quite make sense in a lot of IF games. Still, does anyone have any tips for games with the features mentioned above? If the rewards actually corresponded to how “good” or “bad” an ending is, that would be a plus, but for research purposes it is not strictly necessary (we want to find out whether it is possible to learn anything at all).

So far I’ve successfully tested the agent on ‘Saving John’ and ‘Machine of Death’, so games with similar formats would come in handy. Ideally, the game would also have a walkthrough or author’s notes containing a list of all possible endings, so the simulator could work with them easily.

ChoiceScript games are choice-based games that track statistics throughout the game. The stats are often displayed at the end, or accessible via a menu at any time. You could use them to assign scores.
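
For instance, if you can read off the final stats, a weighted sum turns them into a scalar reward (the stat names and weights below are invented; nothing here is part of ChoiceScript itself):

```python
# Map a final ChoiceScript-style stats dict to a scalar reward.
# Stat names and weights are invented for illustration.
WEIGHTS = {"health": 0.5, "reputation": 0.3, "wealth": 0.2}

def ending_reward(stats):
    # Assumes stats on the common 0-100 ChoiceScript scale.
    return sum(w * stats.get(name, 0) / 100 for name, w in WEIGHTS.items())

print(ending_reward({"health": 80, "reputation": 50, "wealth": 20}))  # 0.59
```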

Thank you, the ChoiceScript games look great for this purpose!

Unfortunately, however, I don’t have much time to play around with the more complex games at the moment. I’d appreciate any tips on simple choice-based games with purely textual states and actions (i.e. no inventory, no health, no map, etc.) and multiple endings, ideally similar to Saving John and Machine of Death in this regard, and ideally in an HTML format (exported from one of the engines, probably). I’ll assign rewards to the various endings manually should they not be present in the game (their presence is just a bonus and would make things easier for me).

I’m looking to take a few of these games and try some transfer learning (learn by playing all but one of the games, then observe the agent’s behaviour on the previously unseen game).
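
Schematically, that’s a leave-one-out loop over the game set (train and evaluate below are placeholders for the real training and evaluation code, and the game names are made up):

```python
# Leave-one-out transfer: train on all games but one, test on the held-out one.
# `train` and `evaluate` stand in for the actual training/evaluation code.
games = ["saving_john", "machine_of_death", "game_c", "game_d"]  # placeholders

for held_out in games:
    agent = train([g for g in games if g != held_out])
    print(held_out, evaluate(agent, held_out))  # behaviour on the unseen game
```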

I managed to write a simple interactive parser/player for HTML choice-based games in Python that I plan to use. I have no idea why I didn’t do this from the start, instead of trying to work with the source code of various game engines :frowning:
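
Something along these lines, using only the standard library, is enough for static HTML pages (a minimal sketch of the idea, not the actual pyfiction code; JavaScript-driven games, such as most Twine exports, need more machinery):

```python
# Minimal interactive player for static HTML choice pages: print the page
# text, list the links as numbered choices, follow the chosen href.
from html.parser import HTMLParser

class ChoicePage(HTMLParser):
    def __init__(self):
        super().__init__()
        self.text, self.links, self._href = [], [], None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None

    def handle_data(self, data):
        if self._href is not None:
            self.links.append((data.strip(), self._href))  # link text = a choice
        else:
            self.text.append(data)

def play(read_page, start):
    href = start
    while True:
        page = ChoicePage()
        page.feed(read_page(href))            # read_page: href -> HTML string
        print("".join(page.text).strip())
        if not page.links:                    # no links left: an ending
            return
        for i, (label, _) in enumerate(page.links):
            print(f"{i}: {label}")
        href = page.links[int(input("> "))][1]
```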

Anyway, thank you all again for helping me out, and I hope I’ll be able to share some results in about a month.