I wrote a Python script that extracts all the text from an Inform 7 project so that it is easy to proofread. It also replaces text substitutions wherever possible, so that spell checkers are not tripped up by the brackets inside the text.
You can download it here:
You’ll need Python 3. Type the following command in a terminal to run the script:
$ python i7extract.py MyProject.inform
There are a number of options, all explained in the README. You can also define custom substitutions with a JSON file.
This link looks down. In case it is lost for good, I’d just like to add this VERY rough code that helped me:
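import re

remove_comments = True  # set to False to keep bracketed [substitutions] in the output

with open("story.ni") as file:
    for (line_count, line) in enumerate(file, 1):
        if '"' not in line: continue
        # Splitting on '"' leaves the quoted pieces at the odd indexes.
        quoted_text_ary = line.strip().split('"')[1::2]
        quoted_text = ' | '.join(quoted_text_ary)
        if remove_comments: quoted_text = re.sub(r"\[.*?\]", " ", quoted_text)
        print(f"{line_count}: {quoted_text}")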
Obviously you can look for extension files, too, but these basics served me well for finding typos, etc.
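For the extension files, here is a rough sketch of how the same idea might be applied to a whole folder. The function names (quoted_strings, scan_project) and the strip_brackets parameter are made up for illustration, and I'm assuming the sources are UTF-8 and that extensions use the .i7x suffix somewhere under the folder you point it at. It works line by line, so quoted text that spans several lines will come out wrong.

```python
import re
from pathlib import Path


def quoted_strings(path, strip_brackets=True):
    """Yield (line_number, text) for each line of `path` that has quoted text."""
    # Assumption: the source file is UTF-8.
    with open(path, encoding="utf-8") as file:
        for line_count, line in enumerate(file, 1):
            # Splitting on '"' leaves the quoted pieces at the odd indexes.
            pieces = line.split('"')[1::2]
            if not pieces:
                continue
            text = " | ".join(pieces)
            if strip_brackets:
                # Replace each bracketed [substitution] with a single space.
                text = re.sub(r"\[.*?\]", " ", text)
            yield line_count, text


def scan_project(folder):
    """Map every story.ni and *.i7x file under `folder` to its quoted strings."""
    root = Path(folder)
    sources = sorted(root.rglob("story.ni")) + sorted(root.rglob("*.i7x"))
    return {str(path): list(quoted_strings(path)) for path in sources}
```

Calling scan_project("MyProject.inform") would then give you a dictionary keyed by file path, which you can print or dump to a text file for the spell checker.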
Of related interest, glulx-strings pulls all readable strings out of any of: glulx, zcode (in or out of blorbs) or TADS 2 or 3. (glulx-strings.py is Python 2 that does just glulx/gblorb, but the README is literate CoffeeScript detailing the whole tale of a programmer having fun getting carried away with covering more cases.)
It’s operating on game files, not source, and you end up with lots of cruft and code fragments so it’s not suited for proofreading, but still handy.
I migrated from Bitbucket to GitLab when Bitbucket announced they’d no longer support Mercurial. I’ve updated the link in the original post. (And thanks zarf for giving the correct link!)
I think I’m the only one to have ever used this script, so I would be happy to know if it can help someone else!
This is a nice script, thanks very much. I gave it a try and it was interesting to see the output from my project. I was planning to do something like this at some point as I think I’ll need it but now I don’t have to, great.
That said, the main thing I’d love to have is a way to separate the text in understand statements from the text in say statements. Going through the output for my game feels a bit daunting because all the understand words make it so long; it would be much shorter without them. My Python isn’t good and I’m not sure where to start, but I’ve cloned the repo at least!
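In case it helps as a starting point, here is a minimal sketch of one way to do that split. This is not how i7extract works and not part of it; split_quoted_text is a hypothetical helper, and it assumes that every Understand statement begins its own line with the word "Understand", which will not hold for statements that wrap across lines.

```python
import re


def split_quoted_text(source):
    """Return (understand_strings, other_strings) found in Inform 7 source text."""
    understand, other = [], []
    for line in source.splitlines():
        # Splitting on '"' leaves the quoted pieces at the odd indexes.
        quoted = line.split('"')[1::2]
        if not quoted:
            continue
        # Assumption: an Understand statement starts its line with "Understand".
        if re.match(r"\s*understand\b", line, re.IGNORECASE):
            understand.extend(quoted)
        else:
            other.extend(quoted)
    return understand, other


sample = '''Understand "lamp" and "lantern" as the brass lamp.
The description of the brass lamp is "A dented [if lit]glowing [end if]lamp."
Instead of rubbing the lamp: say "Nothing happens."
'''
understand, other = split_quoted_text(sample)
print("Understand words:", understand)
print("Everything else:", other)
```

Proofreading only the second list should already cut the output down a lot, since the understand words end up in their own pile.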
Wow! I wasn’t expecting the original author to stop by. People move on, and so forth.
I think I really enjoyed writing my own text extraction program, because it made me feel competent. But I’m really glad others have attacked it in more detail.
It’s also cool to see other people like @Ben asking for features. Sometimes I wonder if I’m the only person who might ask for a feature, and other times it’s cool to see what people think of that’d make things easier.
@Zed, yes, glulx-strings is great. I’ve used it so often. The author got back to me really quickly after I found a bug in the z-code reading.
For when it still doesn’t quite work, txd.exe and mrifk.exe (utilities from ifarchive.org) tend to fill in a lot of the gaps.