Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: utf-8



On Thu, Dec 18, 2014 at 7:48 AM, Павлов Николай Александрович
<kp-pav@xxxxxxxxx> wrote:
> On December 18, 2014 3:39:56 AM EAT, Ray Andrews <rayandrews@xxxxxxxxxxx> wrote:
>>On 12/17/2014 12:31 PM, ZyX wrote:
>>
>>
>>ZyX,
>>> It looks like it is the following: - Explicit support in RE patterns.
>>
>>> - COMBINING_CHARS option that tells zsh that terminal is able to display
>>... I did some reading, but it's too 'zoomed in' for me, it presumes one
>>already more or less knows what's going on.  I don't.
>
> Your question is too broad to give a more detailed answer, and the intent is not clear. You are also posting to zsh-users, while the developers mainly live on zsh-workers and read zsh-users with lower priority. I know some internals of zsh (not the part you are asking about, though) and know some "dark corners" of Unicode processing in general, but I cannot give a more detailed explanation without knowing what you are after.

All mails to zsh-users are automatically sent to subscribers of
zsh-workers as well. The main issue with multibyte encodings is
that almost all the code used to assume that one byte equals one
character equals one on-screen character cell. This took a couple of
years to fix, but is more or less done now. There is nothing specific
to UTF-8 in the code as far as I know, except in getkeystring, but
that looks more like an optimization to avoid calling iconv(). E.g., zsh
works fine if you run under EUC-JP too, but then you can of course
only type Japanese characters (and the ASCII set).
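
To make the byte / character / cell distinction concrete, here is a
rough sketch in Python (not zsh's actual code). The names cell_width
and measure are made up for illustration, and the width rule is a
simplification: it treats combining marks as zero cells and East Asian
wide or fullwidth characters as two cells, which is only an
approximation of what a terminal does.

```python
import unicodedata


def cell_width(ch):
    """Approximate number of terminal cells used by one code point.

    Simplified assumption: combining marks take 0 cells, East Asian
    wide/fullwidth characters take 2, everything else takes 1.
    """
    if unicodedata.combining(ch):
        return 0  # drawn on top of the preceding character
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def measure(text, encoding="utf-8"):
    """Return (bytes, characters, cells) for text in the given encoding."""
    n_bytes = len(text.encode(encoding))
    n_chars = len(text)
    n_cells = sum(cell_width(ch) for ch in text)
    return n_bytes, n_chars, n_cells


if __name__ == "__main__":
    for sample in ("abc", "\u65e5\u672c", "e\u0301"):
        print(repr(sample), measure(sample))
    # Same characters, different byte count under EUC-JP.
    print(measure("\u65e5\u672c", "euc_jp"))
```

For plain ASCII all three numbers agree, which is why the old
assumption held for so long. For the two kanji they are all different
from each other, and for "e" followed by a combining acute accent two
characters share one cell.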

Most of what Pavlov (if my Cyrillic isn't too rusty) said applies to
Unicode, which is a character set, rather than to UTF-8, which is a
character encoding. All the Unicode things should work fine in any
encoding/character set, assuming the character you want exists in it.
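
A small Python sketch of that last point, again only an illustration
and not anything from zsh: the same character maps to different bytes
in different encodings, and some encodings simply do not contain it.
The helper name encode_in and the list of encodings are my own choice
for the example.

```python
def encode_in(ch, encodings=("utf-8", "euc_jp", "koi8_r", "ascii")):
    """Map each encoding name to the bytes for ch, or None if the
    character does not exist in that encoding."""
    result = {}
    for name in encodings:
        try:
            result[name] = ch.encode(name)
        except UnicodeEncodeError:
            result[name] = None  # character not representable here
    return result


if __name__ == "__main__":
    for ch in ("\u65e5", "\u044f", "a"):
        print("U+%04X" % ord(ch), encode_in(ch))
```

The kanji U+65E5 comes out as three bytes in UTF-8 and two in EUC-JP,
and is missing from KOI8-R and ASCII. The Cyrillic letter U+044F is one
byte in KOI8-R and two in UTF-8. Only "a" is available everywhere.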

-- 
Mikael Magnusson


