Zsh Mailing List Archive

Re: UNICODE Private Use Area characters in BUFFER



On 10/24/22, Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
> On Sun, Oct 23, 2022 at 4:35 PM Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
> wrote:
>>
>> Asserting that zsh "handles" those characters in other
>> contexts isn't indicative of anything beyond demonstrating that
>> terminal "handling" is a special case.
>
> Seems to me we've got the following options:
>
> 1.  Do nothing.
> 2.  Presume Roman is correct that these characters can always be
> treated as printable and narrow.  (Still no answer as to how best to
> change this?)
> 3.  Add an option UNICODE_PRINTABLE_NARROW that when set, asserts all
> these characters to be printable and narrow.  Default ... on?
> 4.  Add special variable(s) (perhaps via module?) to allow remapping
> the wcwidth9.h lookup tables to make individual characters printable
> and set their width.

If we do anything with wcwidth9.h, I think it should be to remove it.
Since it was added there have been 6 subsequent Unicode standards, the
latest one adding over 4000 ideographs alone[1] (I don't know what
width the version 9 wcwidth gives for this range). It is probably
returning wrong values for many thousands more characters on
systems where the libc has newer tables than Unicode 9. I suppose it
could be useful to enable when remoting into old systems from a modern
one.
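
To make the table-version point concrete, here is a rough sketch in
Python rather than C, since its unicodedata module reports which Unicode
version its tables come from. The helper name approx_width is made up
for illustration, and this is not what zsh or any libc wcwidth() does:
it only maps East Asian Width "W"/"F" to 2 and, as an explicit
assumption matching option 2 in the quoted list, treats Private Use
characters as printable and narrow.

```python
import unicodedata


def approx_width(ch: str) -> int:
    """Hypothetical helper: rough column width of a single character.

    Not zsh's or libc's algorithm. It only consults the Unicode tables
    bundled with the running Python interpreter.
    """
    cat = unicodedata.category(ch)
    if cat == "Cc":
        return -1  # control characters: not printable
    if cat == "Co":
        # Private Use: the standard assigns no glyph, so this is an
        # ASSUMPTION that the character is printable and narrow.
        return 1
    if cat in ("Mn", "Me"):
        return 0  # combining marks occupy no column of their own
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and fullwidth characters
    return 1


# The answers are only as current as this table version.
print("Unicode tables:", unicodedata.unidata_version)
for ch in ("a", "\u4e2d", "\U0001F600", "\ue0b0"):
    print(f"U+{ord(ch):04X} width={approx_width(ch)}")
```

Any lookup like this is tied to the version printed on the first line,
which is the same limitation a table frozen at Unicode 9 has: code
points assigned later are simply not in it.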

We should probably at least mark it as deprecated: glibc 2.26 added
support for Unicode 9 and was released in August 2017, and the Unicode
9 wcwidth9.h was added to zsh in November 2016, a rather small window
where it mattered. What happened in Unicode 9 was that the
presentation width for all emoji was changed to 2[2]. I'm not sure how
this motivated people to add custom tables to every program they used
instead of simply updating glibc and having every program be correct at
once...

[1] https://home.unicode.org/announcing-the-unicode-standard-version-15-0/
[2] I couldn't find a more official reference than this at the moment:
https://github.com/irssi/irssi/issues/720

-- 
Mikael Magnusson



