Inform 6 debugging addition

I’ve been working on a feature for the I6 compiler which will allow better debugging in the future. It’s not useful right now; it may not become useful for the foreseeable future. But I don’t want it to remain completely unnoticed either, so I’ll describe it here.

(Inform issue tracker URL: inform7.com/mantis/view.php?id=1779 )

The idea is that the I7 compiler might generate directives in the I6 code like

#origsource "game.inform/Source/story.ni" 100 20;

This means that the following I6 code is generated from the I7 code in story.ni, line 100, character 20. This applies to all I6 code until the next #origsource directive.

Or it might add a directive like

#origsource "Extensions/Person Name/Rocks.i7x" 50;

…indicating that the following I6 code is generated from that extension, line 50.

The character and line number are optional, so we have three forms:

#origsource FILE;
#origsource FILE LINE;
#origsource FILE LINE CHAR;

Or you can just say

#origsource;

…to turn it off again.
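For anyone who wants to experiment with these directives outside the compiler, here is a minimal Python sketch that recognizes the four forms above. It is an illustration only, not the I6 compiler's lexer: the name `parse_origsource` is hypothetical, and the regular expression assumes a simple double-quoted filename with no escape handling.

```python
import re

# Hypothetical pattern for illustration; the real I6 lexer is not regex-based.
# Assumes the filename is a plain double-quoted string without escapes.
_ORIGSOURCE = re.compile(
    r'^\s*#origsource'
    r'(?:\s+"(?P<file>[^"]*)"'
    r'(?:\s+(?P<line>\d+)'
    r'(?:\s+(?P<char>\d+))?)?)?'
    r'\s*;\s*$'
)

def parse_origsource(text):
    """Parse one directive line.

    Returns None if the text is not an #origsource directive, an empty
    dict for the bare "#origsource;" form, and otherwise a dict with
    'file' plus 'line' and 'char' when they are present.
    """
    m = _ORIGSOURCE.match(text)
    if m is None:
        return None
    if m.group("file") is None:
        return {}  # bare form: turns the origin info off
    result = {"file": m.group("file")}
    if m.group("line") is not None:
        result["line"] = int(m.group("line"))
    if m.group("char") is not None:
        result["char"] = int(m.group("char"))
    return result
```

A tool tracking the "current origin" would simply remember the most recent non-None result and clear it when it sees the empty dict.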

The #origsource info does not affect the compiled I6 file, but it turns up in the gameinfo.dbg if you compile with -k. It also appears in I6 error messages. You might see:

Here 5000 is the line number in the I6 .inf file. The parenthesized info comes straight from the (most recent) #origsource directive.

Having this info in the gameinfo.dbg opens up the possibility of better debugging interpreters. To be clear, there is no debugging tool that uses this information right now! To make use of this, I7 will have to be updated to generate these directives, and then we’d need to upgrade an interpreter to make use of the information.

Now the downside: adding this feature required tracking more location information inside the I6 compiler. Because of this, Inform now uses more memory and is slightly slower even when the #origsource directive is not used. My tests indicate that it’s about 4% slower on most compiles. This is not a huge cost – I6 is only a small fraction of the time of an I7 compile anyhow. But it’s enough that I wanted to bring it up here.

Semi-related: at one point we were talking about using a more compact representation of source code locations in the gameinfo.dbg file. Is that worth pursuing? We’ve now got concepts like “bundle the gameinfo.dbg into the Blorb file for easier profiling and testing”. The gameinfo.dbg files are kind of enormous, and something like two-thirds of their bulk is <source-code-location> tags. (To be sure, they’d still be pretty enormous even at a third their current size.)

I’ve also considered adding a switch to suppress the <source-code-location> tags in gameinfo.dbg, since (as I said) nothing uses them right now. Perhaps this is optimizing the wrong thing, however.

In terms of a compressed format, some prior art would be JS sourcemaps. I don’t know whether that would be practical or how much it would actually help us.

The tags currently look like

      <source-code-location>
        <file-index>0</file-index>
        <file-position>219</file-position>
        <line>20</line>
        <character>1</character>
      </source-code-location>

You could reasonably take this down to

      <codeloc>F0,P219,L20,C1</codeloc>

with only a small cost in manual parsing for the end-user. (The debugging tool, that is.)
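To make the size difference concrete, here is a small Python sketch that rewrites one verbose element into the compact form. The function name `compact_location` is hypothetical, and the `<codeloc>` format is only the proposal from this post, not anything the compiler emits today.

```python
import xml.etree.ElementTree as ET

# Verbose child tag -> one-letter prefix in the proposed compact form.
_PREFIXES = [
    ("file-index", "F"),
    ("file-position", "P"),
    ("line", "L"),
    ("character", "C"),
]

def compact_location(xml_text):
    """Convert one <source-code-location> element to a <codeloc> string.

    Child elements that are missing are simply left out of the result.
    """
    elem = ET.fromstring(xml_text.strip())
    parts = []
    for tag, prefix in _PREFIXES:
        child = elem.find(tag)
        if child is not None and child.text is not None:
            parts.append(prefix + child.text.strip())
    return "<codeloc>" + ",".join(parts) + "</codeloc>"
```

Fed the example element above, this produces the `<codeloc>F0,P219,L20,C1</codeloc>` line shown.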

Would those F, P, L, C prefixes be needed? They could be skipped if they’re always in that order. (If file-position means the position in the .ulx then it should go first?) If ranges are required they could be “P:F,L,C-L,C” (e.g. “3432:1,18,3-18,17”) or “P:F,L-L” or “P:F”, depending on how much detail is available. (I originally had “P:F,L,C-F,L,C”, but there wouldn’t be any times when it makes sense for a range to span files, would there?)
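For comparison, the positional range notation could be parsed along these lines. This is only a sketch of the idea floated in this post: `parse_range` is a hypothetical name, and the field order (position, then file, then line and character) is taken from the “P:F,L,C-L,C” pattern.

```python
def _pad(values, size):
    """Pad a list of ints with None up to the given size."""
    return tuple(values) + (None,) * (size - len(values))

def parse_range(text):
    """Parse "P:F", "P:F,L-L" or "P:F,L,C-L,C" into a dict.

    Fields that are absent from the input come back as None.
    """
    position, sep, rest = text.partition(":")
    if not sep:
        raise ValueError("missing ':' in " + repr(text))
    start_text, _, end_text = rest.partition("-")
    start = [int(n) for n in start_text.split(",")]
    end = [int(n) for n in end_text.split(",")] if end_text else []
    if not 1 <= len(start) <= 3 or len(end) > 2:
        raise ValueError("unexpected field count in " + repr(text))
    file_index, start_line, start_char = _pad(start, 3)
    end_line, end_char = _pad(end, 2)
    return {
        "position": int(position),
        "file": file_index,
        "start": (start_line, start_char),
        "end": (end_line, end_char),
    }
```

The cost of dropping the prefixes shows up in the field-count checks: the meaning of each number depends entirely on where it sits.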

Another option to make things slightly smaller would be to output the numbers in base 16 or even 32.

An even bigger change would be to switch to single-letter tag names. (Or even to JSON.) Support for these XML debug files is pretty limited currently, so it might not be too disruptive to change the spec that much. But it may not be worth it either. Compressing the codelocs would surely make a huge difference by itself.

I want to treat all the fields as optional, so it’s easiest to give each one an independent prefix. A “C5-10” notation would save a character but makes the parsing logic another step more complicated.
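Here is roughly what the consuming side might look like with independent prefixes, as a Python sketch. The name `parse_codeloc` and the dictionary keys are hypothetical; the point is only that each field can be decoded on its own, in any order, and omitted freely.

```python
# Hypothetical mapping from prefix letter to a readable field name.
_FIELD_NAMES = {
    "F": "file_index",
    "P": "file_position",
    "L": "line",
    "C": "character",
}

def parse_codeloc(text):
    """Parse a compact location such as "F0,P219,L20,C1".

    Every field is optional, and fields may appear in any order.
    """
    result = {}
    for field in text.split(","):
        field = field.strip()
        if not field:
            continue
        prefix, digits = field[0], field[1:]
        if prefix not in _FIELD_NAMES:
            raise ValueError("unknown prefix in " + repr(field))
        if not (digits.isascii() and digits.isdigit()):
            raise ValueError("bad number in " + repr(field))
        result[_FIELD_NAMES[prefix]] = int(digits)
    return result
```

There is no positional bookkeeping here, which is the simplification the prefixes buy.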

The question I was asking, though, is whether this is worth doing at all. If you build a game.gblorb file for debugging – with debug commands and debug info included – does it matter if it’s 10 megabytes? Or 100 megabytes for a very large I7 game? Is halving that file size worth anybody’s time?

I think that a good approach to this would be to design and implement a single, narrow I7 debugging feature, maybe just for one type of construct, that extends from end to end throughout the elements of the system: the I7 IDE, ni, the I6 compiler, the interpreter, and possibly the blorb library.

The idea is to ensure that the pieces fit together without any surprises, and to have a full feature to which end users can react. It’s more compelling than “We have this thing that isn’t really useful or demoable on its own, but could be incorporated into something useful later”, and it requires that we define up front at least one actual I7 source-level debugging feature that we think people would want. When we’re done, I7 debugging is a real thing that just needs to be extended with additional features, rather than a thing that someone might do someday.

One of the issues in pursuing this approach is that ni is a black box in an otherwise open technology stack, but perhaps additional doors are open to you there.

Assuming that the project is worth doing at all, I don’t think file size is a deal breaker. If it were, we could go to a binary format. I think having a human-readable format is worth the additional size, especially during initial development.

A much simpler alternative would be to gzip the file before adding it to the blorb. Counterfeit Monkey’s gameinfo.dbg is 84MB, which gzips down to 3.6MB. Adding decompression support to the tools shouldn’t be a problem, even for JS. I found a port of zlib which is only 23KB.
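The compress/decompress step itself is only a couple of lines. Here is a Python sketch using the standard gzip module; the function names are hypothetical, the sample data is synthetic, and how the compressed bytes would actually be stored in the Blorb is left open.

```python
import gzip

def pack_debug_info(xml_bytes, level=9):
    """Gzip the debug file contents at the given compression level."""
    return gzip.compress(xml_bytes, compresslevel=level)

def unpack_debug_info(blob):
    """Recover the original debug file contents."""
    return gzip.decompress(blob)

# Synthetic stand-in for a debug file: the same kind of element repeated
# many times, which is why this sort of data compresses well.
sample = (
    b"<source-code-location><file-index>0</file-index>"
    b"<line>20</line><character>1</character></source-code-location>\n"
) * 1000

packed = pack_debug_info(sample)
restored = unpack_debug_info(packed)
```

The round trip is lossless, so the tools would see exactly the same XML after decompression.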

(Yes, HTTP compression means that this won’t give much of a download improvement. But if the game author compresses it highly then it might, since servers often only lightly gzip their output. And I think the smaller file size is valuable for mobile terps, for web terps which might want to cache it in localStorage or the like, or if you want to keep a lot of versions around so that you can track how a game is being progressively optimised. Compressing the codelocs as well could still be good, but this would be a very simple option.)

Thanks for the comments. I am coming down on the side of “file sizes don’t matter, let’s not introduce the complication of compression or alternate formatting”. This can of course be reconsidered if practical problems arise in the future.

The current reality is that all progress on Inform is incremental. I implemented this feature because I could; I had some free days in my schedule; it will allow someone (maybe me, maybe not) to build another feature on it in the future.

If you’re asking whether I have secret permission to add I7 compiler features – nope. Sorry.