Zsh Mailing List Archive

Re: UTF-8 input [was Re: PATCH: zle_params.c]



On Jan 31, 11:46am, Peter Stephenson wrote:
} Subject: Re: UTF-8 input [was Re: PATCH: zle_params.c]
}
} > Otherwise don't you have issues if what the user really means to
} > bind to self-insert is a single-byte character that happens to have
} > the high bit set?
}
} Hmmm... you mean that on a system where mbrtowc() reports that a
} single-byte character is incomplete, the user might nonetheless want to
} insert a single-byte character onto the command line?

No.  I mean, suppose the user uses the same .zshrc in both an iso-8859-*
and a UTF-8 locale, and has an explicit bindkey command which is intended
to work only in the iso-8859-* locale.  That bindkey happens to use a
character for which, in the UTF-8 locale, mbrtowc() reports incomplete.
This was in part why I added the footnote asking about plans for UTF-8
in shell scripts; is it even possible to have the same .zshrc in these
cases?

However, I wasn't thinking very clearly, since mbrtowc() won't report
incomplete for an iso-8859-* character if LC_CTYPE is set correctly.
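
To make the point about LC_CTYPE concrete, here is a minimal sketch (not
taken from the zsh sources) that feeds exactly one byte to mbrtowc() under
a given locale and reports what comes back.  The names classify_byte and
enum byte_status are made up for illustration, and which locale names are
installed varies from system to system, so the locale string is an
assumption the caller has to supply.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Outcome of feeding exactly one byte to mbrtowc(). (Hypothetical type.) */
enum byte_status {
    BYTE_NO_LOCALE,   /* setlocale() could not select the requested locale */
    BYTE_COMPLETE,    /* the byte is a complete character on its own */
    BYTE_INCOMPLETE,  /* (size_t)-2: valid start of a longer sequence */
    BYTE_INVALID      /* (size_t)-1: not valid in this encoding */
};

/* Hypothetical helper: switch LC_CTYPE, then convert a single byte. */
enum byte_status classify_byte(const char *locale_name, unsigned char byte)
{
    assert(locale_name != NULL);

    /* Note: this changes the process-wide LC_CTYPE setting. */
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return BYTE_NO_LOCALE;

    mbstate_t state;
    memset(&state, 0, sizeof state);   /* start in the initial shift state */

    wchar_t wc;
    char buf[1] = { (char)byte };
    size_t ret = mbrtowc(&wc, buf, 1, &state);

    if (ret == (size_t)-2)
        return BYTE_INCOMPLETE;
    if (ret == (size_t)-1)
        return BYTE_INVALID;
    return BYTE_COMPLETE;              /* 0 (NUL byte) or 1 */
}
```

With a UTF-8 locale selected, a byte such as 0xE9 should come back as
BYTE_INCOMPLETE, because it is the lead byte of a longer sequence.  With an
iso-8859-1 locale selected the same byte is a whole character and should
come back as BYTE_COMPLETE, which is the case described above.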

I'm still worried about the case where that bindkey exists but is for a
function other than self-insert.  If multibyte translation is handled by
a widget at the same priority as all other widgets, that "stray" bindkey
can mess up the whole scheme.

} In other words, are you supposing this is some kind of fallback in
} case the locale isn't set correctly, e.g. it's set to UTF-8 but on an
} xterm with character set ISO-8859-1?

That was probably what was in my head, but on reflection it's not really
something that the shell can deal with.


