Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: Strange behavior of [[



On Tue, 9 Jun 2015 22:31:56 -0700
Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
> On Jun 9,  8:27pm, Maxime Arthaud wrote:
> } Subject: Strange behavior of [[
> }
> } Hi everybody!
> } 
> } I just found a very strange behavior in zsh (v5.0.8).
> } 
> } % [[ " X" =~ "X" ]]
> } where in " X" the first character is a non-breaking space (0xa0).
> } My shell gets stuck, and Ctrl-C is not working. With bash, no problem.
> } 
> } Does anyone have an explanation? I think it's a bug.
> 
> MB_METACHARLEN() is returning that 0xa0 is a zero-width character, so
> "ptr" in the "while (ptr < lhstr + m->rm_so)" loop in regex.c never
> advances.  That macro ultimately resolves to mb_metacharlenconv_r()
> from utils.c, which returns zero here:
> 
> 4861		return 0;		/* Probably shouldn't happen */
>
> This means that imeta() is (incorrectly?) returning true for 0xa0, which
> might mean that we're passing an unmetafied string where a metafied
> string is expected.

Yes, that's obvious from the context.  You can see lhstr being metafied
above to go into a variable, but the unmetafied variant is then handled
as if it were metafied.  The problem is that the match offsets from the
regexp library are all in unmetafied form.  Rather than attempt to metafy
with those, it probably needs to change to use mbrtowc() on the
unmetafied characters, with simple code for the case of no
MULTIBYTE_SUPPORT.
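
To illustrate the idea, here is a minimal standalone sketch and not a
patch against regex.c.  The helper name chars_before_offset() is made
up for this example; it counts the characters that precede a byte
offset (such as rm_so) in an unmetafied string.  MULTIBYTE_SUPPORT is
defined by hand here only so the sketch compiles on its own, and the
policy of counting an invalid byte as one character is an assumption.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Assumption: normally set by zsh's configure; defined here so the
 * sketch builds standalone. */
#define MULTIBYTE_SUPPORT 1

/*
 * Hypothetical helper, not zsh code: count the characters in the first
 * nbytes bytes of the unmetafied string s.  Every iteration advances by
 * at least one byte, so the loop always terminates.
 */
static size_t
chars_before_offset(const char *s, size_t nbytes)
{
    size_t count = 0;
#ifdef MULTIBYTE_SUPPORT
    const char *ptr = s, *end = s + nbytes;
    mbstate_t mbs;

    memset(&mbs, 0, sizeof(mbs));
    while (ptr < end) {
        size_t len = mbrtowc(NULL, ptr, (size_t)(end - ptr), &mbs);

        if (len == (size_t)-1 || len == (size_t)-2) {
            /* Invalid or incomplete sequence: count one byte as one
             * character and reset the conversion state. */
            memset(&mbs, 0, sizeof(mbs));
            len = 1;
        } else if (len == 0) {
            /* Embedded NUL: mbrtowc() returns 0, but it is one byte. */
            len = 1;
        }
        ptr += len;
        count++;
    }
#else
    /* No multibyte support: one byte is one character. */
    (void)s;
    count = nbytes;
#endif
    return count;
}
```

With this, a lone 0xa0 byte ahead of the match is counted as one
character instead of stalling the loop, whatever the locale.  In a
UTF-8 locale (after setlocale()), a complete multibyte sequence is
counted as a single character.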

pws


