Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

typeset -<non-ASCII> (Was: metafication in error messages)



By the way, while looking at other culprits for missing
metafication in calls to zerr*, I found (in UTF-8 locale):

$ typeset -é
typeset: bad option: -^
~$ zsh -c 'typeset -é' |& sed -n l
zsh:typeset:1: bad option: -^\003$

(gdb)
305         while (*fmt)
(gdb)
306             if (*fmt == '%') {
(gdb)
307                 fmt++;
(gdb)
308                 switch (*fmt++) {
(gdb)
348                     num = va_arg(ap, int);
(gdb)
350                     mb_charinit();
(gdb) p num
$2 = -61
(gdb) p (unsigned char) num
$3 = 195 '\303'

(that's 0xc3, the first byte of é, normally rendered as \M-C).

We end up with a ^ because in wcs_nicechar_sel():

wcs_nicechar_sel (c=-61 L'\xffffffc3', widthp=0x0, swidep=0x0, quotable=0) at utils.c:602
602         int ret = 0;
(gdb) n
603         VARARR(char, mbstr, MB_CUR_MAX);
(gdb)
612         newalloc = NICECHAR_MAX;
(gdb)
613         if (bufalloc != newalloc)
(gdb)
619         s = buf;
(gdb)
620         if (!WC_ISPRINT(c) && (c < 0x80 || !isset(PRINTEIGHTBIT))) {
(gdb)
621             if (c == 0x7f) {
(gdb)
629             } else if (c == L'\n') {
(gdb)
632             } else if (c == L'\t') {
(gdb)
635             } else if (c < 0x20) {
(gdb)
636                 if (quotable) {
(gdb)
641                     *s++ = '^';

c is < 0x20 because it is negative, and we end up with ^ followed by
the byte 0x03 because that's -61 + 64.
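The arithmetic can be checked in isolation. This is only a minimal sketch, not zsh code: both function names are made up for illustration, and it assumes a platform where plain char is signed, as in the gdb session above.

```c
/* Hypothetical helpers, not part of zsh. */

/* Models a signed plain char being widened to int for a variadic call:
 * the byte 0xc3 comes out as -61. (Converting an out-of-range value to
 * signed char is implementation-defined before C23; mainstream
 * compilers wrap it as shown here.) */
int widen_signed_byte(unsigned char byte)
{
    return (int)(signed char)byte;
}

/* Mimics only the "c < 0x20" branch reached in the trace, where the
 * character printed after '^' is the input offset by 64, as described
 * in the message. */
int caret_companion(int c)
{
    return c + 64;
}
```

widen_signed_byte(0xc3) gives -61, which passes the c < 0x20 test, and caret_companion(-61) gives 3, matching the \003 in the sed output.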

zerrmsg's %c expects a wchar_t when MULTIBYTE_SUPPORT is
enabled, but here we're passing it the first byte (signed) of
the encoding of the wide character.

Calling it with (unsigned char)*arg is not much better, as you
end up with: zsh:typeset:1: bad option: -Ã

That's because U+00C3 (Ã) happens to be printable. If it weren't,
-\M-C would not be much better anyway. And you get the same
zsh:typeset:1: bad option: -Ã for typeset -$'\xc3'

Some options could be:
- return a "no non-ASCII/multibyte option supported" error
  message when *arg >= 0x80
- extract and decode the full multibyte character before passing
  it to zerrmsg, which leaves the problem of how to render byte
  sequences that can't be decoded.
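To make the two options concrete, here is a rough sketch in plain C. It uses only standard library calls (mbrtowc, snprintf); the function names are hypothetical and none of this is existing zsh code. It also assumes the caller has already set the locale, and that \M-x is an acceptable notation for an undecodable byte.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Option 1: detect a non-ASCII option byte. The cast matters: where
 * plain char is signed, "*arg >= 0x80" can never be true. */
int is_non_ascii_byte(const char *arg)
{
    return (unsigned char)*arg >= 0x80;
}

/* Option 2 (hypothetical helper): decode the whole multibyte character
 * at arg in the current locale. Returns the number of bytes consumed
 * and stores the character in *wc, or returns 0 if the bytes are
 * invalid, incomplete, or arg is empty. */
size_t decode_option_char(const char *arg, wchar_t *wc)
{
    mbstate_t st;
    size_t n;

    memset(&st, 0, sizeof st);
    n = mbrtowc(wc, arg, strlen(arg), &st);
    if (n == (size_t)-1 || n == (size_t)-2)
        return 0;
    return n;
}

/* Fallback for a byte that could not be decoded: \M-x when the low
 * seven bits are printable ASCII, hex otherwise. This is an assumed
 * convention for the sketch, not zsh's actual rendering code. */
void render_raw_byte(unsigned char byte, char *out, size_t outlen)
{
    unsigned char low = byte & 0x7f;

    if (byte >= 0x80 && low >= 0x20 && low < 0x7f)
        snprintf(out, outlen, "\\M-%c", low);
    else
        snprintf(out, outlen, "\\x%02x", (unsigned)byte);
}
```

A caller would try decode_option_char() first and pass the resulting wchar_t to the error message, falling back to render_raw_byte() on the first byte when it returns 0. That would cover the typeset -$'\xc3' case, where the lone byte is an incomplete sequence in a UTF-8 locale.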

Maybe there are already functions in the code that do that?

-- 
Stephane



