Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: utf-8



On Thu, 18 Dec 2014 09:36:33 -0800
Ray Andrews <rayandrews@xxxxxxxxxxx> wrote:

> On 12/18/2014 01:25 AM, Peter Stephenson wrote:
> 
> Mikael, Peter:
> 
> > Chapter 5 of the FAQ is the best place to start. You can see this 
> > online at http://zsh.sourceforge.net/FAQ/zshfaq05.html#l52. The 
> > version in Etc of the source is newer but I don't think there are 
> > significant differences. pws 
> 
> Very nicely written. That's exactly what I wanted to learn. And though I
> knew it previously, I had half forgotten the difference between Unicode
> and UTF-8, which led to the fuzzy question. To ask it again more
> accurately: where are extended Unicode characters permitted? Or perhaps
> that's better reversed: where are they *not* permitted? Can a variable
> have a name beyond ASCII? I see that zsh is transparent to UTF-8
> everywhere, but that does not imply that one has use of the entire
> Unicode character set in all situations.

Yes, correct.  Most syntax is pinned down: a token is either a keyword,
something drawn from a fixed set such as a decimal number, or it's any
old string.  Identifiers are the exception, and there's an option that
controls which characters they may contain.

POSIX_IDENTIFIERS <K> <S>
       When  this option is set, only the ASCII characters a to z, A to
       Z, 0 to 9 and _ may be  used  in  identifiers  (names  of  shell
       parameters and modules).

       When  the  option  is  unset  and multibyte character support is
       enabled (i.e. it is compiled in  and  the  option  MULTIBYTE  is
       set), then additionally any alphanumeric characters in the local
       character set may be used in identifiers.  Note that scripts and
       functions  written  with this feature are not portable, and also
       that both options must be set before the script or  function  is
       parsed;  setting  them during execution is not sufficient as the
       syntax variable=value has  already  been  parsed  as  a  command
       rather than an assignment.

       If  multibyte  character  support is not compiled into the shell
       this option is ignored; all octets with the top bit set  may  be
       used  in  identifiers.   This  is non-standard but is the
       traditional zsh behaviour.
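
As a rough illustration of the description above, here is a minimal
sketch for zsh.  It assumes a shell built with multibyte support, a
UTF-8 locale, and a script file saved as UTF-8; try_assign is a
hypothetical helper invented for this example, not something zsh
provides.  It uses eval so that the text is parsed under whatever
options are in effect at that moment, which is the parse-time
dependency the manual entry warns about.

```shell
# zsh sketch.  try_assign is a hypothetical helper, not part of zsh.
# It returns 0 if "NAME=1" is parsed as an assignment, and non-zero if
# the shell treats the whole word as a command name and fails to run it.
try_assign() {
  # eval parses its argument now, under the options currently set.
  eval "$1=1" 2>/dev/null
}

# With MULTIBYTE set and POSIX_IDENTIFIERS unset, alphanumerics from the
# local character set are allowed (assumes a UTF-8 locale).
setopt multibyte
unsetopt posix_identifiers
try_assign größe && echo "größe: accepted as an identifier"

# With POSIX_IDENTIFIERS set, only a-z, A-Z, 0-9 and _ are allowed.
setopt posix_identifiers
try_assign größe || echo "größe: rejected, parsed as a command"
try_assign size && echo "size: accepted as an identifier"
```

Whether the first message appears depends on the locale; in a plain C
locale the non-ASCII name is not alphanumeric, so it is rejected in both
cases.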

pws


