Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: read -d $'\200' doesn't work with set +o multibyte (and [PATCH])



"Jun. T" wrote:
> > --- a/Test/B04read.ztst
> > +++ b/Test/B04read.ztst
> (snip)
> > +  read -ed $'\xc2'
> > +0:read delimited by a single byte terminates if the byte is part of a multibyte character
> > +<one£two
> > +>one
>
> Is this really what the standard requires (or will require)?
> Breaking in the middle of a valid multibyte character looks
> rather odd to me.

The proposed standard wording appears to cover only the case of the
delimiter consisting of "one single-byte character". $'\xc2' on its own
is not a valid UTF-8 character (it is only the lead byte of a two-byte
sequence), so my interpretation is that they are leaving this undefined.

Treating the input as raw bytes when the delimiter is a raw byte is at
least consistent, and it retains compatibility with the way things
work for a non-multibyte locale. Not all files are valid UTF-8 and it
can be useful to force things to work at a raw byte level.
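
To make the byte-level view concrete, here is a rough sketch in portable
shell rather than zsh's read itself. It only approximates what the test
case expects: "£" (U+00A3) is the two bytes 0xC2 0xA3 in UTF-8, so
splitting on the raw byte 0xC2 leaves "one". The helper name
first_field_raw is made up for illustration, and it assumes the input
contains no newline before the delimiter.

```shell
# Hypothetical helper (not part of zsh): print everything before the first
# raw 0xC2 byte on stdin. LC_ALL=C makes tr operate on single bytes.
# Assumes there is no newline ahead of the delimiter byte.
first_field_raw() {
  LC_ALL=C tr '\302' '\n' | head -n 1
}

# "one£two" spelled with octal escapes so the bytes are explicit.
bytes=$(printf 'one\302\243two' | od -An -tx1 | tr -d ' \n')
printf '%s\n' "$bytes"      # hex dump of the input, c2 a3 in the middle

# Split at the raw byte level, as in a non-multibyte locale.
result=$(printf 'one\302\243two' | first_field_raw)
printf '%s\n' "$result"     # the part before the 0xC2 byte
```

The remaining 0xA3 byte is left at the start of the unread input, which
is the "breaking in the middle of a character" effect being discussed.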

The only alternative I can think of would be to print an error for the
delimiter. Did you have something else in mind?

Oliver



