Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: zsh doesn't understand some multibyte characters



On Wed, May 13, 2015 at 10:43:50AM -0700, Bart Schaefer wrote:

> On May 13,  9:14am, Danek Duvall wrote:
> } Subject: zsh doesn't understand some multibyte characters
> }
> } Perhaps this is just on Solaris, I dunno. But for some multibyte
> } characters [...] if I move the cursor back over them or delete back
> } over them, zsh gets confused and moves two positions instead of one
> }
> } I'll note that the same thing happens with all the other shells on
> } Solaris [...] Where else should I be looking for the problem?
> 
> This sounds like the WCWIDTH() macro or function is returning the wrong
> value for some characters.

It does.

> If you are compiling your own zsh, can you (a) check whether config.h
> defines BROKEN_WCWIDTH, and (b) if it does not, try defining it and
> recompile to see if that makes any difference?

Not on its own; Solaris doesn't appear to define __STDC_ISO_10646__.  But
if I #define that to 1 (because nothing in zsh uses its value), then it
does work.
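
To illustrate why the value itself is irrelevant, a guard of the following
shape only tests whether the macro is defined.  This is a sketch, not code
taken from the zsh sources, and the function name is hypothetical.

```c
/*
 * Hypothetical illustration, not taken from the zsh sources: report
 * whether the compiler environment defines __STDC_ISO_10646__.
 * Only the presence of the macro is tested, so any value (for example
 * 1, supplied on the compiler command line) selects the first branch.
 */
static int have_stdc_iso_10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}
```

Compiling with something like `cc -D__STDC_ISO_10646__=1` makes the function
return 1 even on a system whose headers leave the macro undefined.  How that
flag would be passed through zsh's own build is an assumption here.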

If I set

    comb_acute_mb[] = { (char)0xe2, (char)0x80, (char)0xa6 };

in the test (those bytes are the UTF-8 encoding of U+2026, HORIZONTAL
ELLIPSIS), wcwidth() reports that character's width as 2, not 1.  Perhaps
that should be a part of the test as well?  I don't know why the zero-width
combining character was chosen as the test.
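
For anyone who wants to reproduce this outside the configure test, here is a
minimal sketch of such a probe.  It is not zsh's actual test: the helper name
mb_char_width is made up for illustration, and it assumes the caller has
already selected a UTF-8 locale with setlocale().

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* asks for the wcwidth() declaration on some systems */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of zsh: convert the first multibyte
 * character in buf (len bytes) using the current locale and return
 * what wcwidth() reports for it.  Returns -2 if the bytes could not
 * be converted (empty, incomplete, or invalid input, or a null
 * character).
 */
static int mb_char_width(const char *buf, size_t len)
{
    mbstate_t st;
    wchar_t wc;
    size_t n;

    assert(buf != NULL);
    if (len == 0)
        return -2;
    memset(&st, 0, sizeof st);          /* start in the initial shift state */
    n = mbrtowc(&wc, buf, len, &st);
    if (n == (size_t)-1 || n == (size_t)-2 || n == 0)
        return -2;
    return wcwidth(wc);                 /* -1 means "not printable" */
}
```

Calling this with the three bytes 0xe2 0x80 0xa6 after setlocale(LC_ALL, "")
in a UTF-8 locale is the kind of check that reportedly comes back as 2 on the
Solaris system described here, where 1 would be expected.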

I'm less sure what to do about __STDC_ISO_10646__.  I see that most of the
places it's checked you're also checking for __APPLE__, but not all of them
(and I'm not sure why that would be).

I can talk to our globalization folks who might know why this isn't
defined, or what it should be set to, or whatever, and file a bug if
necessary.  I guess until we figure that out, I can just have our zsh build
define it on the command line (assuming that you don't want to hold 5.0.8
for this, and I wouldn't want you to).

Thanks,
Danek


