intfiction.org

The Interactive Fiction Community Forum

All times are UTC - 6 hours [ DST ]




Posted: Sun Jul 18, 2010 10:37 pm

Joined: Wed Oct 14, 2009 4:02 am
Posts: 963
There have been quite a few discussions this year about automatically collecting transcripts from testers/players, as some of the following links show. I'd like to make a more formal proposal now of a set of technologies which hopefully will meet everyone's desires. I will make three largely independent proposals here: the protocol, a metadata extension, and a Glulx/Glk extension for tagged output.


The transcript protocol

I've wanted the protocol to be simple and extensible. It should be easily implemented in all languages, and useful for all IF systems. I think JSON sent over HTTP POST requests will fulfil these needs. In the following descriptions, almost everything is optional, though the more that is sent the more useful a transcript will be, of course. Remember that within an object {} the order of elements does not matter, and that each key can be used only once.

The handshake

Firstly, the interpreter sends a message to the server giving the details of the story file. Either an IFID or a URL must be sent, with an IFID being strongly preferred. The other items are all optional, though as much information should be given as is possible. The interpreter must have the user's permission to send contact details.

Code:
{
   "ifid": "ZCODE-2-080406-A377",
   "url": "http://mirror.ifarchive.org/if-archive/games/zcode/LostPig.z8",
   "release": 2,
   "serial": "080406",
   "contact": "Dannii <curiousdannii@gmail.com>",
   "interpreter": "Gargoyle 2009-08-25 (Git 1.2.6)"
}


The server then replies with a session ID and a list of supported transcript types. If the server sends an HTTP code other than 200 or a non-JSON response, or it does not reply with both a session ID and at least the "input" type, then the client must not attempt to send any transcripts. Due to the protocol's extensible nature the client can send more data than the server supports, but that just wastes bandwidth. An example response is as follows:

Code:
{
   "session": 7253,
   "input": 1,
   "response": 1,
   "tagged": 0,
   "time": 1
}


If the server returns both a session ID and at least the "input" transcript type, then the client may proceed to send transcripts. Note that the "tagged" item could have been left out, with no difference to client behaviour.
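To make the rule concrete, here is a rough Python sketch of the check a client might run on the handshake reply. The function name parse_handshake_reply is made up for illustration, and treating any truthy value (such as 1) as "supported" is my assumption rather than something the proposal pins down.

```python
import json


def parse_handshake_reply(status, body):
    """Return (session, supported_types), or None if no transcripts may be sent.

    `status` is the HTTP status code and `body` the raw response text.
    """
    # Anything other than HTTP 200 means: do not send transcripts.
    if status != 200:
        return None
    # A non-JSON reply also means: do not send transcripts.
    try:
        reply = json.loads(body)
    except ValueError:
        return None
    if not isinstance(reply, dict):
        return None
    # Both a session ID and at least the "input" type are required.
    if "session" not in reply or not reply.get("input"):
        return None
    # Assumption: every other key with a truthy value names a supported type.
    supported = {key for key, value in reply.items()
                 if key != "session" and value}
    return reply["session"], supported
```

With the example response above this would give session 7253 and the types "input", "response" and "time"; "tagged" is dropped because its value is 0, exactly as if it had been left out.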

Sending transcripts

Again, the client sends transcripts over HTTP POST requests, but after the handshake, it does not need to listen to the response. A fairly full example is as follows, with an explanation of the elements after that:

Code:
{
   "session": 7253,
   "log": [
      {
         "time": "2010-07-18T03:04:12",
         "input": "x fsih",
         "response": "You can't see any such thing.",
         "response-type": "libmsg",
         "response-code": 30
      },
      {
         "time": "2010-07-18T03:04:48",
         "input": "x fish",
         "response": "Looks tasty."
      }
   ]
}


A client first sends its session ID, followed by the logged commands. A client may send one command, or 100, and the server must be able to accept as many as it sends.

If the server indicates that it supports time stamps, please send them in the following ISO 8601 format: YYYY-MM-DDTHH:MM:SS, as in the example above. If timestamps are not sent then the server should do its best to keep commands in order, and may use the time of the HTTP request instead. As JSON arrays are ordered structures, these timestamps must be strictly increasing.
The input command will always be sent, and if the server indicates that it supports response, the corresponding response may be sent too.
If the client has a way of detecting response types and codes it may send those too. Suggested types are "libmsg" for library messages, "parser" for parser errors and "runtime" for runtime errors. Again, check the server supports "tagged" logs first.

Here is an example of a minimal log, which may always be sent, even if the server supports more detailed logs:

Code:
{
   "session": 7253,
   "log": [
      {"input": "x fsih"},
      {"input": "x fish"}
   ]
}
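As a sketch of how a client could trim its log to what the server advertised, something like the following Python might do. The name build_log and the shape of the entries (dicts holding a datetime under "time") are assumptions for illustration only; the JSON keys are the ones from the examples above.

```python
import json
from datetime import datetime


def build_log(session, entries, supported):
    """Serialise logged commands, keeping only fields the server supports.

    `supported` is the set of type names taken from the handshake reply.
    """
    log = []
    for entry in entries:
        # The input command is always sent.
        item = {"input": entry["input"]}
        if "time" in supported and "time" in entry:
            # Timestamp format: YYYY-MM-DDTHH:MM:SS
            item["time"] = entry["time"].strftime("%Y-%m-%dT%H:%M:%S")
        if "response" in supported and "response" in entry:
            item["response"] = entry["response"]
        if "tagged" in supported:
            # Response types and codes are only sent for "tagged" logs.
            for key in ("response-type", "response-code"):
                if key in entry:
                    item[key] = entry[key]
        log.append(item)
    return json.dumps({"session": session, "log": log})


entries = [{
    "time": datetime(2010, 7, 18, 3, 4, 12),
    "input": "x fsih",
    "response": "You can't see any such thing.",
    "response-type": "libmsg",
    "response-code": 30,
}]
minimal = build_log(7253, entries, {"input"})
```

Here `minimal` carries only the session ID and the input, which matches the minimal log shown above.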


Hanging questions
  • What to do with key/mouse/hyperlink etc input?
  • Have an explicit exit code?

Metadata extension

A custom interpreter may be set up for sending transcripts, or the transcript server may be specified with a querystring parameter (for web interpreters). But in most cases it will be useful to keep the server info with the story file itself, so that it may be used with any terp which supports transcripts. I propose a simple extension to the iFiction XML record, which in most cases will be included in a blorbed story file. Both <url> and <lastdate> are required. <url> must use the HTTP protocol, and <lastdate> must be in the format "3-letter-month-name day, four-digit-year", e.g. "Aug 9, 2010". The <lastdate> is required so that terps won't attempt to send transcripts long after the author has stopped wanting them (or after the web server has been taken offline). It may be set many years or even centuries into the future (but don't do that unless you can guarantee your server's uptime!).

Code:
<transcripts>
   <url>URL</url>
   <lastdate>DATE</lastdate>
</transcripts>
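For terp authors, a minimal Python sketch of reading this element might look like the following. It assumes the <transcripts> fragment has already been pulled out of the iFiction record (namespace handling for a full record is left out), that <lastdate> is inclusive, and that English month abbreviations are in effect for parsing; the function name transcript_server is made up.

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime
from urllib.parse import urlparse


def transcript_server(fragment, today):
    """Return the transcript URL, or None if transcripts should not be sent."""
    node = ET.fromstring(fragment)
    url = (node.findtext("url") or "").strip()
    lastdate = (node.findtext("lastdate") or "").strip()
    # <url> must use the HTTP protocol.
    if urlparse(url).scheme != "http":
        return None
    # <lastdate> looks like "Aug 9, 2010"; a missing or malformed date fails.
    try:
        last = datetime.strptime(lastdate, "%b %d, %Y").date()
    except ValueError:
        return None
    # Assumption: transcripts may still be sent on the last date itself.
    return url if today <= last else None


fragment = ("<transcripts><url>http://example.com/log</url>"
            "<lastdate>Aug 9, 2010</lastdate></transcripts>")
server = transcript_server(fragment, date(2010, 7, 20))
```

The URL in the usage lines is a placeholder, not a real transcript server.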


Glulx/Glk extensions

This part of the proposal is actually where I really need help. Basically it would be good to be able to compile Glulx games which output tagged responses (with a type and code) so that the terps may then easily capture that information for the logs. Juhana's proposal emitted stuff like "#ERROR:[parser error code]#" to the normal output streams, which is a simple option we could add, with a gestalt check so that other terps don't see this nasty message. One advantage of this is that it could also be used with other VMs (in the Z-Machine for example, by first checking for a new previously unused interpreter number.) But it might be better to use new Glk streams. I don't really know.
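To show what the inline-marker option would mean on the terp side, here is a small Python sketch that strips such markers from the output and collects the codes. The exact marker syntax is not settled anywhere in this thread, so the pattern "#ERROR:" followed by digits and "#" is purely an assumption, as is the name split_tagged.

```python
import re

# Assumed marker shape: "#ERROR:30#" (a numeric parser error code).
TAG = re.compile(r"#ERROR:(\d+)#")


def split_tagged(output):
    """Return (text with markers removed, list of error codes found)."""
    codes = [int(code) for code in TAG.findall(output)]
    return TAG.sub("", output), codes


text, codes = split_tagged("#ERROR:30#You can't see any such thing.")
```

The terp would then show `text` to the player and put the codes into the "response-code" field of the log.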

Suggestions?


Last edited by Dannii on Tue Jul 20, 2010 8:06 pm, edited 3 times in total.

Posted: Mon Jul 19, 2010 4:46 am

Joined: Tue Dec 25, 2007 10:06 am
Posts: 887
Yes, this looks very good!

Have you thought of what the session ID should be? I realize this is just an example but 4 digits are not enough to prevent collisions. Perhaps it would be good to give some recommendations on what kind of ID the interpreter should generate.

I would also like to see an option to add the player's name and contact information to the handshake. This could be used for beta-testing games online and the player's information would be for the author to credit them and contact them for additional feedback.


Posted: Mon Jul 19, 2010 4:50 am

Joined: Tue Dec 25, 2007 10:06 am
Posts: 887
Also, I don't particularly like the "#ERROR:[error code]#" way of doing it. The problem with that is that you have to make two versions of the game, one for online play and one for off-line interpreters. If there's a way to hide the codes from interpreters that don't support this feature it would of course be the best way.


Posted: Mon Jul 19, 2010 11:07 am

Joined: Wed Oct 14, 2009 4:02 am
Posts: 963
The server generates the session ID, not the client, and can generate anything it likes: increasing integers, random numbers, IP addresses, timestamps, names of fruit. Anything can be used, and the client will just send it back again.

I like your idea of a contact field. How about "contact": "Name <email>"? Leave it up to the terp to collect the name before the handshake.

We do of course need a way for compatible terps to be detected. With Glulx this is very easy, we can just have a new gestalt value. If we also want Z-Machine support then it would be possible to do by checking the interpreter number... which is not nearly as good a solution as doing so would preclude other extensions. Perhaps we could have a new interpreter number and a new custom gestalt opcode? Then if we ever wanted to extend the Z-Machine again we could use the gestalt opcode.

However even once we've detected that we're using a compatible terp, we still need to send the data somehow. Just printing out #ERROR etc is probably the simplest, but it's definitely not a clean solution... that's why I'd like feedback about Glk streams.


Posted: Mon Jul 19, 2010 11:49 am

Joined: Thu Feb 11, 2010 1:51 pm
Posts: 198
Location: Chicago, Illinois, USA
Dannii wrote:
There have been quite a few discussions this year about automatically collecting transcripts from testers/players, as some of the following links show. I'd like to make a more formal proposal now of a set of technologies which hopefully will meet everyone's desires. I will make three largely independent proposals here: the protocol, a metadata extension, and a Glulx/Glk extension for tagged output.

The transcript protocol

Code:
{
   "session": 7253,
   "input": 1,
   "response": 1,
   "tagged": 0,
   "time": 1
}




In order to allow more detailed review of transcript data, I think the "location" should also be added to the basic response call. This allows someone to write all of the responses to a database and then query based on location.

David C.


Posted: Mon Jul 19, 2010 12:36 pm

Joined: Tue Apr 27, 2010 1:02 pm
Posts: 797
I'd also suggest a way to capture the version of the interpreter / library running the story file.

Certain bugs can cause otherwise valid player input to be flagged as errors. Glk libraries that lack Unicode support, for instance, will often display such parser errors. This could be frustrating for authors to troubleshoot without some way to group transcripts by interpreter.

On a related note, an exit code field could allow authors to distinguish sessions that ended normally from those that ended in a crash.

Quote:
The server generates the session ID, not the client, and can generate anything it likes: increasing integers, random numbers, IP addresses, timestamps, names of fruit. Anything can be used, and the client will just send it back again.


I'd encourage you to specify a more rigorous session ID mechanism. I often play games over the course of days or weeks. Any transcript session ID should be valid for an arbitrary duration, and not result in different transcripts being interleaved.

Unless you add a TTL field for the session ID, which may be useful for its own reasons: it provides a way to mark otherwise active transcripts as complete.


Posted: Mon Jul 19, 2010 7:17 pm

Joined: Wed Oct 14, 2009 4:02 am
Posts: 963
DavidC, what kind of location do you mean?

In any case, you could always add it to your own terps/servers. If other people think it would be useful to them too, then I'll consider adding it to the recommended handshake.

Ben, sending the terp and terp version is a great idea! Should it be a bunch of items, or more like "Gargoyle 2009-08-25 (Git 1.2.6)"?

I also like the idea of an exit code. But, what about if you want to pick it up later from a savefile?

I still don't think that we need to specify how to generate session IDs. Wouldn't it solve the situation you describe if the session ID gets stored in the savefile? However that makes me think of another issue... what if the server has already cleared the original log? If you have a session ID you should send it with the handshake, but then use whichever ID the server returns.


Posted: Tue Jul 20, 2010 10:48 am

Joined: Mon Dec 31, 2007 5:39 pm
Posts: 151
One possibility would be to extend the initial handshake to include not only story file information but also previous session ID, if available. That would allow interpreters to provide that information to the server if it's stored in a persistent cookie (for web interpreters) or in save game information. The server, of course, is welcome to ignore the previous session information, and at the very least would need to make sure the story file information matches that of the actual previous session.


Posted: Tue Jul 20, 2010 2:45 pm

Joined: Tue Apr 27, 2010 1:02 pm
Posts: 797
Dannii wrote:
Ben, sending the terp and terp version is a great idea! Should it be a bunch of items, or more like "Gargoyle 2009-08-25 (Git 1.2.6)"?


The latter, I think. I have something like the browser UA string in mind.

Dannii wrote:
I also like the idea of an exit code. But, what about if you want to pick it up later from a savefile?


Well, you could probably just ignore the save / restore question, since it should be mostly clear from the transcript where the player is in the game world. Restoring from a save within a single game would append to the current transcript. Restoring it at the start of a new game would print the introductory text through the first prompt, but there would be an early save command to signal the transition.

Dannii wrote:
I still don't think that we need to specify how to generate session IDs. Wouldn't it solve the situation you describe if the session ID gets stored in the savefile? However that makes me think of another issue... what if the server has already cleared the original log? If you have a session ID you should send it with the handshake, but then use whichever ID the server returns.


I'm not sure how I feel about storing the session ID in the save file, if it's meant to be a temporary variable tied to the individual player. In theory save files could be shared by any number of people, though in practice I doubt this happens often.

It might make more sense to task the interpreter with generating a UUID for each transcript session and ask the server to validate that identifier. It could either reject (non-200 code) or reply with the session ID. Then the interpreter can just write out <UUID>_<session>.txt somewhere and submit at a convenient time.

The session ID would function as a shared secret; the server will only accept updates for that UUID if the client also sends the correct session ID. In the event we wind up with a single community server hosting these transcripts, we want to avoid the situation where a malicious user or spammer could corrupt valid submissions by sending text fragments to random UUIDs. This is especially important if the transcripts would be available for search engines to index.
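A rough server-side sketch of that idea in Python, just to make the shared-secret part concrete. The class and method names are made up, the store lives in memory only, and generating the session ID with secrets.token_hex is my assumption; nothing in this thread fixes how the server should produce it.

```python
import secrets
import uuid


class TranscriptStore:
    """In-memory sketch: one secret session ID per client-generated UUID."""

    def __init__(self):
        self._sessions = {}  # UUID string -> session ID (the shared secret)
        self._logs = {}      # UUID string -> list of log entries

    def register(self, transcript_id):
        """Validate the client's UUID; return a session ID or None (reject)."""
        try:
            uuid.UUID(transcript_id)
        except ValueError:
            return None
        if transcript_id in self._sessions:
            return None  # already claimed, so reject rather than leak the ID
        session = secrets.token_hex(16)
        self._sessions[transcript_id] = session
        self._logs[transcript_id] = []
        return session

    def append(self, transcript_id, session, entries):
        """Accept entries only if the session ID matches the one issued."""
        expected = self._sessions.get(transcript_id)
        if expected is None or not isinstance(session, str):
            return False
        # Constant-time comparison of the two secrets.
        if not secrets.compare_digest(expected.encode(), session.encode()):
            return False
        self._logs[transcript_id].extend(entries)
        return True


store = TranscriptStore()
transcript_id = str(uuid.uuid4())
session = store.register(transcript_id)
accepted = store.append(transcript_id, session, [{"input": "x fish"}])
rejected = store.append(transcript_id, "guess", [{"input": "spam"}])
```

In this sketch a spammer who only knows (or guesses) the UUID cannot append to the transcript, because the update is refused without the matching session ID.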


Posted: Tue Jul 20, 2010 9:20 pm

Joined: Wed Oct 14, 2009 4:02 am
Posts: 963
You make some sense, Ben. Although I would hope no one would feel the need to be malicious, we should be thinking of it anyway. Can you outline in full how you suggest sessions be managed?

