Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [BUG] SIGSEGV under certain circumstances



On Sun, 5 Mar 2017 08:00:54 -0800
Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
> In computil.c:cfp_matcher_pats there is a loop that walks the string
> from the command line, in this case the file name recalled from the
> history, comparing each character to the matcher pattern.  If it gets
> a match it adjusts some counters that are initialized from strlen() of
> the candidate string, exiting the loop when the counters reach the
> end of the string.  It also adjusts pointers into string, and one of
> those pointers is running past the end.
> 
> The pattern is m:{a-zA-Z}={A-Za-z} m:{a-zA-Z}={A-Za-z} but I don't
> think that matters, it's the candidate string that's causing the
> confusion.
>...
> So there seem to be two problems, one that the history is either not
> saving or not reloading the Chinese characters correctly, and two
> that the loop in cfp_matcher_pats is not counting correctly when it
> examines that garbage string recalled from history.

The matcher code doesn't handle non-ASCII characters, and probably never
will --- I spent ages looking at this some years ago until I realised I
was getting nowhere.  The most we can hope for is that it's safe against
running off the end of the string.  That's complicated by the fact that
the rest
of completion does handle multibyte characters.

There's a good chance this is another problem with the handling of Meta
characters.  We know that broken history parsing can get these wrong ---
we've had problems like that not so long ago.  If the string gets past
that stage, we make assumptions about what the presence of a Meta in the
string means which, if they turn out to be wrong, can lead to any number
of problems.  In
particular, we assume that any Meta anywhere in the string has the
standard meaning.
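
To make "the standard meaning" concrete, here is a minimal sketch of the
decoding convention, assuming Meta is the byte 0x83 and the byte after it
holds the original byte XORed with 32.  The name unmeta_copy is
hypothetical and is not a function in the zsh source; it only
illustrates what goes wrong when a Meta has nothing after it.

```c
#include <stddef.h>

#define META 0x83  /* assumed value of zsh's Meta marker byte */

/*
 * Hypothetical helper: decode a metafied, NUL-terminated string into
 * out (capacity cap bytes).  Returns the number of decoded bytes, or
 * -1 if a Meta is the last byte before the NUL or out is too small.
 * The decoded data may itself contain NUL bytes, so it is returned
 * with an explicit length rather than as a C string.
 */
long unmeta_copy(const char *s, unsigned char *out, size_t cap)
{
    size_t n = 0;

    while (*s) {
        unsigned char c = (unsigned char)*s++;

        if (c == META) {
            if (*s == '\0')
                return -1;      /* Meta with nothing after it */
            /* the following byte is the original byte XOR 32 */
            c = (unsigned char)((unsigned char)*s++ ^ 32);
        }
        if (n == cap)
            return -1;          /* output buffer full */
        out[n++] = c;
    }
    return (long)n;
}
```

Code that skips two bytes on seeing a Meta without the check for '\0'
steps over the terminator, which is the sort of overrun described in the
quoted report.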

We might be able to debug the second half of this by testing for
incorrect metafication in the matcher code, but I'm not sure how far
that gets us.  We're not going to be able to be safe at every point in
the code.
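
A test of that kind could look roughly like the sketch below.  It
assumes Meta is the byte 0x83 and only checks the one property that
matters for not running off the end: every Meta must be followed by a
real byte rather than the terminating NUL.  The name meta_well_formed
is hypothetical, not something in the matcher code.

```c
#include <stdbool.h>

#define META 0x83  /* assumed value of zsh's Meta marker byte */

/*
 * Hypothetical check: return true if no Meta byte in the
 * NUL-terminated string s is immediately followed by the NUL.
 */
bool meta_well_formed(const char *s)
{
    for (; *s; s++) {
        if ((unsigned char)*s == META) {
            if (s[1] == '\0')
                return false;   /* dangling Meta at end of string */
            s++;                /* skip the byte the Meta applies to */
        }
    }
    return true;
}
```

This says nothing about whether the bytes are the ones the user typed,
only that a loop stepping over Meta pairs will stop at the terminator.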

I suspect tracking down the problem at the input stage is the only good
bet.  You might have thought that would be much more mechanical and
hence easy...

pws


