[I6] [Glulx] Printing Unicode characters

Draconis · July 30, 2017, 5:19pm

Printing Unicode characters seems straightforward in I7: just include them in a string. And in I6 for Z-machine, it’s clunky but doable with liberal use of the @print_unicode opcode.

I expected that in I6 for Glulx, it would also be straightforward—after all, according to the Glulx spec, encoded strings can contain a mix of ASCII and Unicode characters, and unencoded strings are pure Unicode.

But:

When play begins:
	say "↤↥↦↧ ←↑→↓ ⇦⇧⇨⇩";
	say paragraph break;
	zork;
	say paragraph break.

To zork: (- print "↤↥↦↧ ←↑→↓ ⇦⇧⇨⇩"; -).

Is there some convenient way of using non-ASCII characters in encoded strings, without using @streamunichar directly? Or is that the best/only way to do it?
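For reference, the opcode-by-opcode approach mentioned above can be sketched as a small I6 helper that picks the right opcode for whichever VM is being targeted. This is only an illustration: `PrintUni`, `PrintArrows` and `arrow_codes` are made-up names, not part of the Inform library, and the sketch assumes an interpreter that can actually display these characters.

```inform6
! Print one Unicode code point on either VM.
! PrintUni is a hypothetical helper name, not a library routine.
[ PrintUni ch;
#Ifdef TARGET_GLULX;
    @streamunichar ch;      ! Glulx opcode
#Ifnot;
    @print_unicode ch;      ! Z-machine opcode (version 5 and later)
#Endif;
];

! Code points for the first four arrows in the test string.
Array arrow_codes --> $21A4 $21A5 $21A6 $21A7;

! Hypothetical caller: print the four arrows one code point at a time.
[ PrintArrows i;
    for (i = 0 : i < 4 : i++) {
        PrintUni(arrow_codes-->i);
    }
];
```

The code point is passed through a local variable because I6 assembly operands must be constants or variables, not arbitrary expressions such as `arrow_codes-->i`.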

zarf · July 30, 2017, 8:06pm

The I7 compiler should maybe not be performing transformations on I6 literal strings.

But, since it is, you will have to work around it by using the I6 escape sequence for Unicode characters:

To zork: (- print "@{21A4}@{21A5}@{21A6}@{21A7} @{2190}@{2191}@{2192}@{2193} @{21E6}@{21E7}@{21E8}@{21E9}"; -).

Unfortunately this doesn’t work either, because I7 assumes that {xxxx} sequences in a “to” phrase definition are inline argument interpolations. To work around that, you can do something like:

Include (-
Constant zorkstring = "@{21A4}@{21A5}@{21A6}@{21A7} @{2190}@{2191}@{2192}@{2193} @{21E6}@{21E7}@{21E8}@{21E9}";
-).

To zork: (- print (string) zorkstring; -).

You could also define the string at the I7 level:

Bork-string is always "↤↥↦↧ ←↑→↓ ⇦⇧⇨⇩".

To bork: (- print (TEXT_TY_Say) (+ bork-string +); -).

zarf · July 30, 2017, 8:53pm

I should also note that the I6 compiler doesn’t accept UTF-8 encoded source code by default. There’s an option for that (-Cu) but it’s relatively recent and the I7 toolchain doesn’t use it. I7 expects to generate an I6 intermediate source file which is Latin-1 encoded, and does not (cannot) contain any higher Unicode characters literally.