Considerations for CYOA XML file format

Rudism · May 5, 2010, 1:55am

I’m not really sure how much demand there will be for this, but I am working out an XML file format specification for CYOA-style interactive novels, and plan on developing a number of tools for authoring and reading them (probably focusing on mobile apps and Kindle support for reading, initially).

I’ve been an IF-lurker for quite some time, and have great respect for the community that has been built around the hobby. As such, I am interested in any feedback or suggestions anyone here might have. I’ve roughed up a brief page describing the file format and its goals/limitations here: http://inml.rudism.com/.

I know there has been some controversy in the past over whether CYOA even counts as IF. My primary reason for considering this as a free-time project is that, while I do enjoy classic parser-based IF, I yearn for IF that’s a little less cerebral and can be comfortably read on an eReader or mobile phone without having to type on tiny thumb keyboards or on-screen keyboards. I feel that CYOA-style stories with simple hyperlinking would lend themselves very nicely to that purpose, and I have been unable to find any existing system or tools geared specifically towards publishing them in a way that would provide maximum portability between platforms.

I did stumble across http://ifml.sourceforge.net/ during my googlings, which seems similar to what I would like to achieve, only on a grander scale (defining parser-based IF in XML). It also appears to be abandoned, which I feel is a shame.

Anyway, like I said, this idea is still just forming in my mind and I’m looking for feedback from the IF community (even if it’s that I’m crazy and this is a waste of time).

George · May 5, 2010, 2:17am

The one thing I would consider carefully with a system like this is who you would like your target author audience to be. Because INML deliberately avoids scripting in its design (unlike other systems such as Twee or ChoiceScript, which are JavaScript-based), extending the system beyond choice-based narrative (CBN) is going to get pretty difficult. From my experience, the more interesting games include some amount of scripting. Of course you can write great works with CBN alone, but I think the perception may be that the INML system is simpler, and so some people will not consider it seriously when choosing a development environment.

I’m not sure how portable JavaScript-based systems are compared to INML, but since many mobile platforms do include a web browser, the possibly greater portability of INML may be moot in the end.

While I have a gut aversion to writing/editing XML, lately I’ve come around to the idea of structured content – so with a good front-end writing tool (or even just an XML editor of course) I could see how writing works for INML would be cool.

Anyway, just my 2 c., take it for what it’s worth.

Retro · May 6, 2010, 3:46pm

Post deleted by Retro.

zarf · May 6, 2010, 5:00pm

You don’t want to go down that road. Computer languages always copy ideas, and even syntax, from each other. It’s a good thing. That’s why Inform 6 looks a lot like C, and Hugo looks a lot like Inform 6, and so on.

Furthermore, if two people are trying to solve the same problem, they’re likely to invent similar solutions even if they never see each other’s work.

Legally, copyright applies to a manual or reference document that you write – but not to the structure of what you’re documenting. You could patent your language’s structure, but I wouldn’t recommend that either.

(Note: I am not a lawyer. I am, however, a programmer.)

Mick · May 6, 2010, 5:18pm

I’m having trouble imagining any CYOA script language that would not be structurally similar to the two above examples. You are not the inventor of the concept of hyper-linking, and the syntax isn’t even remotely similar. There is no copyright violation.

Retro · May 6, 2010, 5:34pm

Similarity between programming/scripting languages cannot be avoided. I have to agree with that.

Retro · May 6, 2010, 7:22pm

Post deleted by Retro.

Retro · May 6, 2010, 8:02pm

Post deleted by Retro.

Peter_Piers1 · May 6, 2010, 9:04pm

You know, I really wish you had stumbled upon someone like yourself when you started your work on NodeScript. You’d be starting your own project, and someone else would come along and say “Hey, besides the fact that your language looks suspiciously like mine, mine can do all the things your thingy can do, and much more besides, and it’s innovative, and in fact it’s the best thing next to sliced bread”. I really wish that had happened.

Rudism · May 6, 2010, 9:30pm

While I can appreciate the similarities between NodeScript (which honestly was off my radar prior to reading about it here) and INML (as has been mentioned, different solutions to the same problem are bound to have similarities), I believe that one of the core differences with the approach I want to take is maximum portability. This includes the ability to publish to any number of existing e-book formats to be read on devices that may or may not support JavaScript, and may or may not have any kind of session/state management capabilities. I even foresee the ability to generate a printed manuscript suitable for dead-tree publication. To that end, its simplicity and lack of support for more advanced concepts such as custom interfaces and inventory management are by design.

I wouldn’t perceive INML as a competitor to NodeScript, but rather as a storage medium and (ideally) a suite of user-friendly authoring tools for simple CYOA-style stories which could easily be transformed into NodeScript format (or Inform, or TADS, or any other format) if desired.
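
To make the "dead-tree" idea concrete, here is a rough Python sketch of how a tool might flatten a scene graph into numbered print sections with "turn to N" instructions. Everything in it is an assumption for illustration: the in-memory `scenes` dictionary, the `to_print_sections` function, and the numbering scheme (start scene first, the rest in input order) are hypothetical and not part of the INML spec. A real gamebook export would probably shuffle the section numbers.

```python
def to_print_sections(scenes, start):
    """Flatten a scene graph into numbered gamebook-style sections.

    scenes: dict mapping scene id -> (text, [(choice_text, target_id), ...])
    start:  id of the opening scene, which becomes section 1.
    (Hypothetical data layout, assumed for this sketch only.)
    """
    # The start scene comes first; the others keep their input order.
    order = [start] + [sid for sid in scenes if sid != start]
    number = {sid: i for i, sid in enumerate(order, start=1)}

    lines = []
    for sid in order:
        text, choices = scenes[sid]
        lines.append(f"{number[sid]}.")
        lines.append(text)
        # Each hyperlink-style choice becomes a printed page instruction.
        for choice_text, target in choices:
            lines.append(f"  {choice_text}: turn to {number[target]}.")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = {
        "cave": ("You stand at the mouth of a cave.", [("Enter", "tunnel")]),
        "tunnel": ("It is dark.", []),
    }
    print(to_print_sections(demo, "cave"))
```

Because no scripting or state is involved, the same traversal could target an e-book format with internal links instead of section numbers.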

vaporware · May 6, 2010, 10:03pm

Looks a lot like an INI file to me. The basic structure of choices and linking scenes to each other by ID is, as others have said, pretty much inherent in the problem that both formats were invented to solve.

One difference between INML and yours is that INML doesn’t have blocks consisting of nothing but opaque IDs: the choices are part of the scenes, everything is described in complete words, and IDs are only used to link to scenes. That makes it more readable, in my opinion. For example, I have no idea what “EC”, “NJ”, “PN”, “NC”, and “DN” stand for, although I can guess at the others; the INML version is self-explanatory. But maybe that’s not a big deal if the NodeScript file is meant to be edited with a customized program (is it?).

bcressey · May 6, 2010, 11:26pm

Bonus points if the INML interpreter also tells me to disable my antivirus scanner. Or is that another trade secret?

Retro · May 7, 2010, 12:39am

Then ask yourself why it didn’t happen to me and maybe you’ll find the answer…

Retro · May 7, 2010, 12:52am

Yes, a NodeScript editor is planned. But for now a simple text editor (e.g. Notepad in Windows) should do the job.

vaporware · May 7, 2010, 1:16am

Well, technically yes, since it is a text file. But if you expect people to use a text editor, I encourage you to reconsider some of the cryptic syntax. As it is, it seems more like an intermediate format that humans shouldn’t view or edit except in an emergency.

Retro · May 7, 2010, 1:47am

Already done. View NodeScript.NINI format.

Rudism · May 12, 2010, 3:27pm

I have started working on an authoring tool to read and save stories in INML format, and have already come across a few deficiencies in the initial specification (for anyone who is interested in following my progress):

Since a Scene’s id attribute is defined as an XML ID in the DTD, it must conform to certain rules (essentially it must be a valid XML name, so no spaces and no leading digit), which is not very friendly to human authors and is better suited to auto-generation by authoring tools. Because of this, I’ve added a “name” attribute to Scene elements which would be a human-friendly short description of the scene. This value would never actually be displayed to a reader of the story, but would rather be used by authoring tools to allow the author to label scenes.
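
As a rough illustration of the id/name split, the Python sketch below reads a made-up INML fragment and maps each machine-friendly id to its human-friendly name. Only the Scene element and its id and name attributes come from this thread; the `Story` wrapper element, the `scene_labels` helper, and the sample content are assumptions. The `XML_NAME` pattern is a simplified, ASCII-only approximation of what XML allows in an ID, not the full rule.

```python
import re
import xml.etree.ElementTree as ET

# Simplified ASCII-only approximation of an XML name; the real rule also
# admits many non-ASCII characters. Spaces and leading digits are rejected.
XML_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")

# Hypothetical fragment: the <Story> wrapper and the layout are assumed.
SAMPLE = """
<Story>
  <Scene id="s1" name="Cave entrance (first visit)"/>
  <Scene id="s2" name="Dark tunnel"/>
</Story>
"""


def scene_labels(xml_text):
    """Map each Scene id to its author-facing name, validating the ids."""
    labels = {}
    for scene in ET.fromstring(xml_text).iter("Scene"):
        scene_id = scene.get("id", "")
        if not XML_NAME.fullmatch(scene_id):
            raise ValueError(f"not a valid XML ID: {scene_id!r}")
        if scene_id in labels:
            raise ValueError(f"duplicate scene id: {scene_id!r}")
        # Fall back to the id when the optional name is missing.
        labels[scene_id] = scene.get("name", scene_id)
    return labels


if __name__ == "__main__":
    for scene_id, label in scene_labels(SAMPLE).items():
        print(scene_id, "->", label)
```

An authoring tool could generate ids like `s1` automatically and show only the name to the author.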

I’ve also decided to disallow HTML formatting in Prompt and Choice elements, so that HTML is allowed only in Description elements. My main reasoning behind this is that the Choices are essentially hyperlinks, so allowing the author to place their own conflicting hyperlinks and other HTML elements in the choice text could pose problems when rendering them to a reader.
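
One way a reader application might enforce that restriction when exporting to HTML is to treat choice text as plain text and escape it before wrapping it in a link. This is only a sketch under assumptions: the `render_choice` function and the `#id` fragment-style link target are hypothetical, and the spec itself does not say how a renderer should behave.

```python
from html import escape


def render_choice(target_id, text):
    """Render one choice as a hyperlink, treating its text as plain text.

    Escaping the text means markup typed by the author is displayed
    literally and cannot nest a second link inside the generated one.
    """
    return f'<a href="#{escape(target_id)}">{escape(text)}</a>'


if __name__ == "__main__":
    print(render_choice("s2", "Walk into the tunnel"))
    print(render_choice("s3", 'Open the <a href="x">door</a>'))
```

With this approach the renderer never has to detect or repair conflicting markup inside a choice.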

Another thing I am currently considering is a way to allow the author to specify page breaks within a Scene Description while keeping away from the added complexity of multiple Description elements within a scene. My first thought is some kind of control character or string like \p or \clear, but I don’t particularly like that either. Does anyone have any thoughts on how that could be achieved elegantly?

–edit: I actually came up with an idea I like better, which is to allow an element in the Meta section where the author/authoring program could specify its own escape or character sequence that represents page breaks within the Description elements, thus allowing for page breaks while shifting the onus of coming up with a viable system to include them out of the specification itself.

George · May 12, 2010, 10:19pm

A page break seems pretty fundamental though, wouldn’t you want that in the spec?

With regard to styling links – I guess that raises the question of styling in general, are you allowing authors to write a CSS file or something like that?

Rudism · May 13, 2010, 1:42am

Well, the spec now supports page breaks, while leaving implementation details (i.e., how to represent a pagebreak in the content) up to the individual story. I think flexibility is better than defining a standard in this situation, since picking some arbitrary universal string to represent a pagebreak could interfere with an author’s content (if they wanted to use that same string in their story without it meaning a pagebreak). This way, they can specify their own pagebreak string and guarantee that there won’t be parts of the story that will be incorrectly interpreted.
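
A reader application could honor a story-defined pagebreak string along these lines. This is a minimal Python sketch, assuming the reader has already pulled the marker out of the story's Meta section; the `split_pages` function and the `[[pb]]` marker are made up for the example and are not defined by the spec.

```python
def split_pages(description, page_break):
    """Split a Description's text into pages on the story's own marker.

    page_break is whatever string the story declared for itself; an empty
    value means the story does not use page breaks at all.
    """
    if not page_break:
        return [description]
    # Trim whitespace left around the marker so pages start cleanly.
    return [page.strip() for page in description.split(page_break)]


if __name__ == "__main__":
    text = r"You type \p at the prompt. [[pb]] Nothing happens."
    for number, page in enumerate(split_pages(text, "[[pb]]"), start=1):
        print(number, page)
```

Because the marker is chosen per story, a story that needs a literal `\p` in its text can simply declare a different marker, as in the example.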

As for styling, CSS is definitely a possibility for HTML formatted stories (either through inline tags or by including an external stylesheet when exporting to HTML format, or any format that inherently supports HTML). Prompts and choices would likely be styled through a generic template, as opposed to individually styling each prompt and choice (which is part of my thinking behind disallowing HTML formatting in that content; the other, more important part is that wrapping an <a> tag around arbitrary author-generated HTML code will break in a lot of situations, and detecting/managing that could become nightmarish).