Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: Question about mb_metastrlen



On Tue, 27 Oct 2015 11:34:35 +0100
Sebastian Gniazdowski <sgniazdowski@xxxxxxxxx> wrote:

> On 27 October 2015 at 10:10, Peter Stephenson <p.stephenson@xxxxxxxxxxx> wrote:
> > On Tue, 27 Oct 2015 09:31:02 +0100
> > The function you're talking about is for a string length, not a
> > character length.  num_in_char counts the number of trailing bytes that
> > didn't form a wide character.  Each will be treated as a single byte.
> > So each counts 1 for the length of the string.
> 
> There is the condition:
>             if (ret == MB_INVALID) {
> 
> Isn't it the case that if there are many trailing bytes that do not
> form a character, they will be caught as MB_INVALID, and only the last
> "character" can remain not yet complete?

That's correct: only the last multibyte character can consist of several
individual bytes that look like part of an incomplete character rather
than simply being invalid.  Hence the note at the end of the function
about the use of num_in_char, and hence we reset num_in_char to 0 any
time we get a complete multibyte character or mark a byte as invalid
rather than incomplete.

pws
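
As a rough illustration of the counting rule discussed in this thread,
here is a minimal C sketch.  It is not the zsh source: the name
sketch_mb_strlen, its signature, and the rewind-on-invalid strategy are
assumptions made for the example, it calls the standard mbrtowc()
directly, and it ignores zsh's metafied string encoding.  A complete
multibyte character counts 1, an invalid byte counts 1, and any bytes
left over in a trailing incomplete character count 1 each.

```c
#include <assert.h>
#include <locale.h>   /* callers need setlocale() for multibyte input */
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not zsh's mb_metastrlen. */
static size_t sketch_mb_strlen(const char *s, size_t len)
{
    mbstate_t mbs;
    size_t num = 0;          /* complete characters and invalid bytes */
    size_t num_in_char = 0;  /* bytes seen in a not-yet-complete character */
    size_t start = 0;        /* index where the current character began */
    size_t i = 0;

    assert(s != NULL || len == 0);
    memset(&mbs, 0, sizeof mbs);

    while (i < len) {
        wchar_t wc;
        /* Feed exactly one byte; mbs carries the partial state. */
        size_t ret = mbrtowc(&wc, s + i, 1, &mbs);
        i++;

        if (ret == (size_t)-1) {
            /* Invalid: count the first byte of the failed sequence as a
             * single byte, reset the state, and rescan from the next byte. */
            memset(&mbs, 0, sizeof mbs);
            i = start + 1;
            start = i;
            num++;
            num_in_char = 0;
        } else if (ret == (size_t)-2) {
            /* Incomplete: may still become a full character. */
            num_in_char++;
        } else {
            /* Complete character (including a NUL byte). */
            num++;
            num_in_char = 0;
            start = i;
        }
    }

    /* Anything still pending is a trailing incomplete character;
     * each of its bytes counts 1. */
    return num + num_in_char;
}
```

Because num_in_char is reset on every complete or invalid result, it
can only be non-zero at the end for the final, unfinished character,
which is why adding it to the total is safe.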


