intfiction.org

The Interactive Fiction Community Forum

All times are UTC - 6 hours [ DST ]




 Post subject: unicode problem
Posted: Thu Jun 21, 2018 12:34 pm

Joined: Mon Apr 02, 2018 3:05 pm
Posts: 15
Hi,

I've started translating an IF game into my language, and I'm a beginner at this. My language has some extra characters. As far as I knew, using Unicode characters is not a problem in the newer IF compilers and interpreters. But when I try to compile the (partially) translated source, I get the following error message:
Code:
The grammar token 'unicode 337' in the sentence 'Understand "kér [something]t [someone]t[unicode 337]l" as querysmalling'   looked to me as if it might be a unicode character, but this isn't something allowed in parsing grammar.

The first word also contains a non-ASCII character (é), but that one is not a problem. I tried googling it, and it seems the compiler only allows Unicode characters with smaller code numbers. Is that true? Can I get around it and use my special characters somehow?
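
To make the error message easier to read, here is a minimal Python sketch that expands the `[unicode N]` token back into the character it stands for. It assumes N is a decimal code point, which fits here because 337 is U+0151 (ő). The helper name `expand_unicode_tokens` is made up for illustration and is not part of Inform.

```python
import re


def expand_unicode_tokens(text: str) -> str:
    """Replace each '[unicode N]' token with the character at decimal code point N."""
    return re.sub(r"\[unicode (\d+)\]", lambda m: chr(int(m.group(1))), text)


# The grammar line quoted in the error message above.
line = "kér [something]t [someone]t[unicode 337]l"
expanded = expand_unicode_tokens(line)

print(expanded)   # kér [something]t [someone]től
print(hex(337))   # 0x151
```

By comparison, é is code point 233, so a lower code number is consistent with what the poster found.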


 Post subject: Re: unicode problem
Posted: Thu Jun 21, 2018 12:36 pm

Joined: Mon Apr 02, 2018 3:05 pm
Posts: 15
Sorry, I forgot to mention that I tried to compile the source with gnome-inform7 on Ubuntu.


 Post subject: Re: unicode problem
Posted: Thu Jun 21, 2018 4:10 pm

Joined: Tue Mar 09, 2010 2:34 pm
Posts: 5452
Location: Burlington, VT
It looks to me as though the issue is this, from §5.10 of Writing with Inform:

Quote:
The world has a bewildering range of letters, accents, diacritics, markers and signs. Inform tries to support the widest range possible, but the works of IF produced by Inform are programs which then have to be run on a (virtual) computer whose abilities are more constrained: few players will have an Ethiopian font installed, after all. So a degree of caution is called for.

(a) Definitely safe to use. Inform's highest level of support is for the letters found on a typical English typewriter keyboard, including both the $ and £ signs (but not the Yen or Euro symbols ¥ and €), and in addition the following:

ä, á, à, ã, å, â and Ä, Á, À, Ã, Å, Â
ë, é, è, ê and Ë, É, È, Ê
ï, í, ì, î and Ï, Í, Ì, Î
ö, ó, ò, õ, ø, ô and Ö, Ó, Ò, Õ, Ø, Ô
ü, ú, ù, û and Ü, Ú, Ù, Û
ÿ, ý and Ý (but not Ÿ)
ñ and Ñ
ç and Ç
æ and Æ (but not œ or Œ)
ß
¡, ¿
These characters can be typed directly into the Source panel, and can be used outside quotation marks: we can call a room the Église, for instance.


The other Unicode characters can be written inside quoted text but not in the source text outside quotation marks--which I'm guessing means they can't be understood either. So é can be understood but unicode 337 can't.

Unfortunately I suspect there isn't a workaround for this--there's some internal representation in a format that doesn't include the Unicode characters (the ZSCII format). I'm not super familiar with the inner workings of the virtual machines though.
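
As a rough way to check a translation against the list quoted above, here is a Python sketch that flags characters outside it. It assumes that every printable ASCII character counts as "typewriter keyboard" and that the accented letters are stored in precomposed form. The names `SAFE_NON_ASCII` and `unsupported_chars` are made up for illustration. This only mirrors the documented list and is not a test of what the compiler actually accepts.

```python
import unicodedata

# Non-ASCII characters listed as "definitely safe" in the quoted section, plus £.
SAFE_NON_ASCII = set(
    "äáàãåâÄÁÀÃÅÂ"
    "ëéèêËÉÈÊ"
    "ïíìîÏÍÌÎ"
    "öóòõøôÖÓÒÕØÔ"
    "üúùûÜÚÙÛ"
    "ÿýÝ"
    "ñÑçÇæÆß¡¿£"
)


def unsupported_chars(text: str) -> list[str]:
    """Return the sorted, de-duplicated characters that fall outside the safe list."""
    # Normalise to precomposed form so 'e' + combining acute is compared as 'é'.
    text = unicodedata.normalize("NFC", text)
    # Assumption: anything up to '~' (0x7E) is plain ASCII and treated as safe.
    return sorted({ch for ch in text if ord(ch) > 0x7E and ch not in SAFE_NON_ASCII})


print(unsupported_chars("kér [something]t [someone]től"))  # ['ő']
print(unsupported_chars("Église"))                         # []
```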


 Post subject: Re: unicode problem
Posted: Thu Jun 21, 2018 9:35 pm

Joined: Fri Oct 18, 2013 10:13 am
Posts: 2669
Location: The Midwest
Indeed, nothing outside that range can be properly handled by the parser. This is a significant problem when trying to write IF in different languages, since the limited range shown above isn't even enough for the entire European Union. (Greek, for instance, is missing its entire alphabet, while other languages have more subtle problems: Polish needs letters like ż, Romanian ă, Icelandic ð…it looks like you're specifically missing Hungarian's ő?)

Zarf has written an extension that updates the parser to support Unicode. But since you can't use most Unicode characters in object names or Understand lines, you need to use Inform 6 inclusions for all parsing-related code (Understand lines, object names, verb definitions, conversation topics…).

Hopefully an upcoming release of Inform 7 will change this. But for now, it's not really possible to use it for works in most non-English languages. Sorry about that.

_________________
Daniel Stelzer


 Post subject: Re: unicode problem
Posted: Thu Jun 21, 2018 9:39 pm

Joined: Fri Oct 18, 2013 10:13 am
Posts: 2669
Location: The Midwest
That said, modern systems and interpreters do support Unicode quite well. If you managed to get a Hungarian game past the first stage of compiling, everything else would go off without a hitch, and it would be completely playable. The only problem is the ni compiler itself, which is also the one part that's not open source (as opposed to the GUI, the template library, the I6 compiler, the blorb tools, the Glulx format, the Quixe interpreter…).

_________________
Daniel Stelzer


 Post subject: Re: unicode problem
Posted: Fri Jun 22, 2018 6:11 am

Joined: Mon Apr 02, 2018 3:05 pm
Posts: 15
Thanks for all the answers.
Yes, I would like to translate it into Hungarian. I know of an old Hungarian IF game for the C64 that was rewritten in I6, and it has Unicode characters... I've written to its author to ask how he did it.
I downloaded an I6 source, wrote some special characters in it, and tried to compile it with the inform compiler, with -v8 flag, but it gave error messages for the special characters... I also tried the -C2 flag, but it didn't help.

Draconis wrote:
That said, modern systems and interpreters do support Unicode quite well. If you managed to get a Hungarian game past the first stage of compiling, everything else would go off without a hitch, and it would be completely playable. The only problem is the ni compiler itself, which is also the one part that's not open source (as opposed to the GUI, the template library, the I6 compiler, the blorb tools, the Glulx format, the Quixe interpreter…).


 Post subject: Re: unicode problem
Posted: Fri Jun 22, 2018 6:15 am

Joined: Mon Apr 02, 2018 3:05 pm
Posts: 15
Is it possible to transcode I7 to I6, TADS, or Hugo, if they handle the special characters better?


 Post subject: Re: unicode problem
Posted: Fri Jun 22, 2018 9:59 pm

Joined: Sat Jan 23, 2010 4:56 pm
Posts: 5840
You can transcode I7 to I6. That's what the ni compiler does. That's the piece we're missing. :/

Other formats, no.


 Post subject: Re: unicode problem
Posted: Fri Jun 22, 2018 10:01 pm

Joined: Sat Jan 23, 2010 4:56 pm
Posts: 5840
Quote:
I downloaded an I6 source, wrote some special characters in it, and tried to compile it with the inform compiler, with -v8 flag


You need to use the -G flag (for Glulx), and -Cu (to indicate that the I6 source code is in UTF-8).

Then you need additional settings to get the I6 dictionary to be Unicode-compatible. I don't have a complete example on hand, unfortunately.
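
For what it's worth, the two flags mentioned above could be put together roughly like this. It is only a sketch: the compiler binary name `inform` and the file names `game.inf` and `game.ulx` are assumptions (some installs name the binary differently), both helper functions are made up for illustration, and the additional dictionary settings are not covered.

```python
import subprocess


def build_i6_command(source: str, output: str, compiler: str = "inform") -> list[str]:
    """Build an Inform 6 command line for a Glulx build from UTF-8 source."""
    # -G selects Glulx output; -Cu marks the source file as UTF-8.
    return [compiler, "-G", "-Cu", source, output]


def compile_i6(source: str, output: str) -> int:
    """Run the compiler and return its exit code (needs the compiler on PATH)."""
    return subprocess.run(build_i6_command(source, output)).returncode


# Only print the command here; call compile_i6() once the compiler is installed.
print(" ".join(build_i6_command("game.inf", "game.ulx")))
# inform -G -Cu game.inf game.ulx
```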


 Post subject: Re: unicode problem
Posted: Sun Jun 24, 2018 6:13 pm

Joined: Fri Oct 18, 2013 10:13 am
Posts: 2669
Location: The Midwest
Honestly, if ni could just be hacked to pass non-ZSCII characters through unmolested, then all the necessary transformations could be applied on the I6 side. This might be possible with disassembly, but might not: it depends on the data structures used internally. (Ideally it would just use UTF-8 in byte arrays, and depend on the I6 compiler to handle character sets, but I don't know if this actually happens.)

_________________
Daniel Stelzer

