I think it would be better if there were a separate string compression option to control whether strings are polyvalent. People might want to use -Cu with Roman alphabets, or even within English. (I've wanted -Cu for years now.)
For the past few years, we've mostly been adding compilation options as $SETTING entries rather than command-line switches. (Too many command-line switches!) So the string compression option could be called $COMPRESS_UNICODE_CHARS, with a default value of 1, but you'd set it to 0 for your system.
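As a rough sketch, assuming the option is added under the proposed name and follows the usual $SETTING syntax, turning it off might look like the lines below. $COMPRESS_UNICODE_CHARS does not exist yet and game.inf is a placeholder filename.

```
# On the command line (quoted so the shell leaves the $ alone):
inform -Cu '$COMPRESS_UNICODE_CHARS=0' game.inf

! Or as a header comment at the very top of the source file:
!% -Cu
!% $COMPRESS_UNICODE_CHARS=0
```

Either way, -Cu would keep its current meaning and the new setting would only affect string compression.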
I also think the compression system will work better than you expect for Chinese text. But you can test that when you get there.
Thanks for working on this.
Great zarf replying in person!
Thanks for the info. I initially wanted to add a command-line option to switch the system to "Unicode mode", but neither -u nor -U is available. It's conceptually ugly to have -Cu act as a "global Unicode switch", so indeed, it's better to leave the internal workings to $SETTING entries.