Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: PATCH: read full multibyte string a bit more sooner

On Sat, Sep 12, 2015 at 2:57 AM, Peter Stephenson
<p.w.stephenson@xxxxxxxxxxxx> wrote:
> On 11 Sep 2015 23:42, Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
>> This breaks for me with bracketed-paste-magic when pasting the multibyte
>> strings from Test/D07multibyte, specifically "More metafied characters
>> in prompt expansion" test that has several different languages.

I just reverted to the zsh-5.1.1 tag and tried again, and it breaks
there, too, so this is probably not specific to the patch in 36483.

> I won't have the source or anything more than
> phones or tablets for a week, but it might be
> meta aggro again.

Unfortunately I don't know what that refers to.

> I've a vague memory 'a grave' has one, if you
> want an easy check.

I threw in a { zle -M -- "$PASTED"; zle -R } in the read-command loop
and got the following output (hope it comes through OK with the
multibyte in email).  Here is the test string I'm pasting:

Ą Пётр Ильич Чайковский 梶浦由記

And the result (5.1.1 without 36483):

burner% ă Ѓутр Ѓлуиу Чайковский 梶浦烴訃
Ą Пётр Ильич Чайковский 梶浦由記

The first line is what got composed by mbchar+=$KEYS and the second
line is what is actually in $PASTED.  As you can see they match for
some but not all characters.
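One way to pin down exactly where the two lines diverge is to compare
them byte by byte rather than by eye.  The following is only a sketch:
hexbytes is a hypothetical helper (not part of zsh), and it assumes the
strings are UTF-8 and that od, tr and sed are available.  In the widget
itself one would pass "$mbchar" and "$PASTED" instead of the literals.

```shell
# Hypothetical helper: print the bytes of a string as space-separated hex.
hexbytes() {
  printf '%s' "$1" | od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# First character of the test string as pasted, and as it was composed.
hexbytes 'Ą'    # prints "c4 84" (U+0104 in UTF-8)
hexbytes 'ă'    # prints "c4 83" (U+0103 in UTF-8)
```

For this first character the lead byte is the same in both cases and
only the second byte differs.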

I then switched back to zsh-5.1.1-dev-0 and tried to repeat this.
Here's where things get really interesting.

The very first time I pasted the test string, I got this:

burner% Ą Пётр Ильич Чайковский 梶浦由記~

As you can see this is ALMOST correct, except for that unexpected
trailing tilde, which must be part of the terminal escape for ending
bracketed paste.

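For reference, xterm-style bracketed paste wraps the pasted text in
ESC [ 200 ~ and ESC [ 201 ~, so the end marker does finish with a
tilde.  The sketch below only illustrates how a lone "~" could be left
over if everything but the last byte of the end marker were consumed;
that this is what happened here is an assumption, and the variable
names are made up for the example.

```shell
# Standard xterm bracketed-paste markers.
paste_start=$(printf '\033[200~')
paste_end=$(printf '\033[201~')

# Assumption for illustration: the reader swallowed the end marker
# except for its final byte.
consumed=$(printf '\033[201')
leftover=${paste_end#"$consumed"}

printf '%s\n' "$leftover"    # prints "~"
```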
Sadly the next time I try pasting, I get this:

burner% ă<ffffffff> Ѓ<ffffffff>у<ffffffff>тр
Ѓ<ffffffff>лу<ffffffff>иу<ffffffff> Чайковский
Ą Пётр Ильич Чайковский 梶浦由記

(where all those <ffffffff> are highlighted).  So either there's some
memory corruption, or the internal multibyte parsing state is messed
up, or both.

Is there someone who works in a multibyte character set all the time
who can help with figuring out where this is going wrong?  (Insight
into what happened in the first [5.1.1] case would also be helpful.)
