Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

zsh crashes on completion of utf-8 file-names.



Hi,

I know that zsh-4.1.1 still doesn't support UTF-8, but as released it could
do completion on UTF-8 file names. However, I recently updated from CVS,
and now zsh crashes on completion of names when I have two candidates of the
form RA and RB and I hit R<TAB>. This happens when R=U+05E8 (0xd7 0xa8) or
U+05E9 (0xd7 0xa9) and A and B are U+05D0 and U+05D1. This is the Hebrew
range. I tried to reproduce the problem in the Latin-1 Supplement range
(U+0080..U+00FF) and didn't succeed. I produced a debug trace by configuring
zsh with CFLAGS=-g and LDFLAGS=-g, and here it is:

/usr/local/src/build/zsh$ gdb Src/zsh
GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(gdb) run
Starting program: /usr/local/src/build/zsh/Src/zsh 
/usr/local/src/build/zsh$ echo $ZSH_VERSION
4.1.1-dev-1
/usr/local/src/build/zsh$ ls tmp
doit  רא  רב  שא  שב
/usr/local/src/build/zsh$ ls tmp/ר
Program received signal SIGSEGV, Segmentation fault.
0x080b6b3a in ztrsub (t=0x81419b3 "", 
    s=0x8249001 <Address 0x8249001 out of bounds>) at utils.c:2875
2875            if (*s++ == Meta) {
(gdb) bt
#0  0x080b6b3a in ztrsub (t=0x81419b3 "", 
    s=0x8249001 <Address 0x8249001 out of bounds>) at utils.c:2875
#1  0x402af206 in unmetafy_line () at zle_tricky.c:918
#2  0x402aed72 in docomplete (lst=0) at zle_tricky.c:820
#3  0x402ad97a in expandorcomplete (args=0x402bf250) at zle_tricky.c:288
#4  0x402ad53d in completecall (args=0x402bf250) at zle_tricky.c:182
#5  0x402a1912 in execzlefunc (func=0x402bd648, args=0x402bf250)
    at zle_main.c:903
#6  0x402a1047 in zlecore () at zle_main.c:696
#7  0x402a1609 in zleread (lp=0x80e77c0 "%~%(#.#.$) ", rp=0x0, flags=7, 
    context=0) at zle_main.c:840
#8  0x0807c161 in inputline () at input.c:277
#9  0x0807c02b in ingetc () at input.c:214
#10 0x08074079 in ihgetc () at hist.c:241
#11 0x08082b59 in gettok () at lex.c:631
#12 0x08082461 in yylex () at lex.c:347
#13 0x080991f1 in parse_event () at parse.c:449
#14 0x08079332 in loop (toplevel=1, justonce=0) at init.c:128
#15 0x0807bcb1 in zsh_main (argc=1, argv=0xbfffe8d4) at init.c:1272
#16 0x08052226 in main (argc=1, argv=0xbfffe8d4) at main.c:37
#17 0x42015704 in __libc_start_main () from /lib/tls/libc.so.6

(gdb) p line
$1 = (unsigned char *) 0x81419a8 "ls tmp/ר�\203"
(gdb) p line[6]
$5 = 47 '/'
(gdb) p line[7]  <==== This and the following one are the UTF-8 0xd7 0xa8
$6 = 215 '�'
(gdb) p line[8] 
$7 = 168 '�'
(gdb) p line[9]  <==== This is a UTF-8 0xd7 byte without the following one
$8 = 215 '�'
(gdb) p line[10] <==== This is zsh's "Meta", at the end of the string!!!
$9 = 131 '\203'
(gdb) p line[11]
$10 = 0 '\0'
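
The byte values quoted above can be checked with a few lines of C. This is
only an illustrative sketch: utf8_encode_short and utf8_seq_len are
hypothetical helper names, not zsh functions. It shows that U+05E8 encodes to
0xd7 0xa8 and that 0xd7 is the lead byte of a two-byte sequence, so a lone
0xd7 is an incomplete character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: encode a code point below U+0800 as UTF-8.
 * Returns the number of bytes written (1 or 2), or 0 if cp is too
 * large for this sketch. */
size_t utf8_encode_short(unsigned cp, unsigned char out[2])
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));    /* 110xxxxx */
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));  /* 10xxxxxx */
        return 2;
    }
    return 0;
}

/* Hypothetical helper: sequence length announced by a lead byte.
 * Returns 0 for a continuation byte or an invalid lead byte.
 * It does not reject overlong forms. */
size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 0;
}
```

With these, utf8_encode_short(0x05E8, out) fills out with 0xd7 0xa8, and
utf8_seq_len(0xd7) reports 2, which is why line[9] on its own is half a
character.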


One can easily see that the code of ztrsub in Src/utils.c around line 2870 is
really buggy: unless DEBUG is set, it never checks for the end of the string,
and if Meta falls at the end, we are screwed. However, this code was already
there when the tag zsh-4_1_1 was generated, so I cannot see what triggered the
problem. Really, this "Meta" stuff shouldn't be the last character in the
string!
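
To make the missing check concrete, here is a minimal sketch of a bounded
variant. It is not the zsh source. It assumes only what the trace suggests:
the function walks from s towards t and skips the byte after each Meta. The
name bounded_ztrsub, the META_BYTE macro and the -1 return value are all
hypothetical. The value 0x83 is taken from the dump above (131, '\203').

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value of the Meta byte, as printed by gdb above (131). */
#define META_BYTE 0x83

/* Hypothetical bounded variant: count the bytes between s and t, with
 * each Meta + following byte pair counted as one.
 * Returns -1 if a Meta is the last byte before t, instead of stepping
 * past t and reading out of bounds. */
long bounded_ztrsub(const unsigned char *t, const unsigned char *s)
{
    long len;

    assert(s <= t);
    len = (long)(t - s);
    while (s < t) {
        if (*s++ == META_BYTE) {
            if (s == t)      /* dangling Meta: nothing left to skip */
                return -1;
            s++;             /* skip the byte that Meta escapes */
            len--;
        }
    }
    return len;
}
```

On the four bytes 0xd7 0xa8 0xd7 0x83 from the dump, this returns -1 rather
than running off the end of the buffer. Whether the right fix belongs here or
in whatever produced a string ending in Meta is a separate question.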


-- 
Dr. Zvi Har'El     mailto:rl@xxxxxxxxxxxxxxxxxxx     Department of Mathematics
tel:+972-54-227607 icq:179294841     Technion - Israel Institute of Technology
fax:+972-4-8293388 http://www.math.technion.ac.il/~rl/     Haifa 32000, ISRAEL
"If you can't say somethin' nice, don't say nothin' at all." -- Thumper (1942)
                             Sunday, 26 Kislev 5764, 21 December 2003,  4:18PM


