Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: zsh doesn't understand some multibyte characters



2015/05/14 03:29, Danek Duvall <duvall@xxxxxxxxxxxxxx> wrote
> 
> If I set
> 
>    comb_acute_mb[] = { (char)0xe2, (char)0x80, (char)0xa6 };
> 
> in the test, it thinks that character's wcwidth() is 2, not 1.

U+2026 is one of the characters whose "East Asian Width" property
is set to "Ambiguous". The widths of these characters really *are* ambiguous:
in Western (monospaced) fonts they are single width,
while in (most?) CJK fonts they are double width.

Usually, wcwidth() returns 1 for these characters so they are not
displayed correctly in CJK fonts, unless applications take special care of
them. For example, xterm has an option -cjk_width to handle this problem.

Your report indicates that Solaris is one of the rare systems on
which wcwidth() returns 2 for U+2026.

Are there any fonts in which U+2026 has double width on Solaris?

> I don't know why the zero-width
> combining character was chosen as the test.

The test was first introduced to detect a broken wcwidth() on Mac OS X,
where wcwidth() returns 1 for combining characters.
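
To see what a given system reports for these characters, something like
the following could be compiled and run. This is only a minimal sketch,
not the test that zsh's configure actually uses. The helper names
mb_width() and use_utf8_locale() are hypothetical, and the locale names
tried are assumptions that may not exist on every system.

```c
#define _XOPEN_SOURCE 700   /* so that glibc declares wcwidth() */
#include <assert.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: try a few UTF-8 locale names (assumed, not
 * guaranteed to be installed).  Returns 1 on success, 0 otherwise. */
int use_utf8_locale(void)
{
    static const char *const names[] = { "C.UTF-8", "en_US.UTF-8" };
    size_t i;

    for (i = 0; i < sizeof names / sizeof names[0]; i++)
        if (setlocale(LC_CTYPE, names[i]) != NULL)
            return 1;
    return 0;
}

/* Hypothetical helper: decode the first character of the multibyte
 * sequence mb (len bytes) in the current locale and return its
 * wcwidth().  Returns -2 if the bytes cannot be decoded. */
int mb_width(const char *mb, size_t len)
{
    wchar_t wc;
    mbstate_t st;
    size_t n;

    assert(mb != NULL && len > 0);
    memset(&st, 0, sizeof st);      /* start in the initial shift state */
    n = mbrtowc(&wc, mb, len, &st);
    if (n == (size_t)-1 || n == (size_t)-2)
        return -2;                  /* invalid or incomplete sequence */
    return wcwidth(wc);             /* -1 means non-printable */
}
```

After use_utf8_locale() succeeds, mb_width("\xe2\x80\xa6", 3) (U+2026)
would be expected to give 1 on most systems and 2 on a system that
behaves as in your report, while mb_width("\xcc\x81", 2) (U+0301,
combining acute) should give 0 wherever wcwidth() handles combining
characters correctly.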


