Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: num_in_chars incremented after each mbrtowc()



On Sun, 6 Dec 2015 17:33:55 +0000
Peter Stephenson <p.w.stephenson@xxxxxxxxxxxx> wrote:
> Are you saying, for example, that a trailing set of characters that are
> MB_INCOMPLETE appears as a single output (albeit invalid) character (I
> guess with a single width)?  That would mean the right return value was
> 
>      return num + (num_in_char > 0 ? 1 : 0);
> 
> (perhaps that was even what you meant above?)

Ah, reading your previous message in the light of the above, I think
that *is* what you're saying.

OK, as I said there isn't really a "right" answer here, just a
convenient one.  So if this is what works for you let's go with that.
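
To make the convention concrete, here is a minimal standalone sketch of the
counting rule, not the actual mb_metastrlenend() code.  It assumes UTF-8
input and uses a hypothetical helper, utf8_seq_len(), in place of mbrtowc()
so that it does not depend on the locale; count_chars() is likewise a
made-up name.  Only the final return expression mirrors the patch below.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for mbrtowc(): number of bytes a UTF-8
 * sequence starting with lead byte c needs, or 0 if c cannot
 * start a sequence.
 */
static size_t utf8_seq_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Count characters in s[0..len), illustrating the trailing-bytes rule. */
static size_t count_chars(const char *s, size_t len)
{
    size_t num = 0;         /* complete characters seen so far */
    size_t num_in_char = 0; /* bytes consumed of a still-incomplete character */
    size_t need = 0;        /* bytes the current character requires */

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];

        if (num_in_char == 0) {
            need = utf8_seq_len(c);
            if (need == 0) {
                num++;          /* stray byte: count it as one character */
                continue;
            }
            num_in_char = 1;
        } else if ((c & 0xC0) == 0x80) {
            num_in_char++;      /* continuation byte of the current character */
        } else {
            /* Broken sequence: count the prefix once, then re-scan this byte. */
            num++;
            num_in_char = 0;
            i--;                /* safe: i >= 1 whenever num_in_char > 0 */
            continue;
        }

        if (num_in_char == need) {
            num++;              /* character complete */
            num_in_char = 0;
        }
    }

    /* If incomplete, treat the remainder as a single trailing character. */
    return num + (num_in_char ? 1 : 0);
}
```

With this sketch, "a" followed by the first two bytes of a three-byte
sequence counts as 2 characters; the previous "num + num_in_char" rule
would have given 3.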

pws

diff --git a/Src/utils.c b/Src/utils.c
index 45f8286..fc2b192 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -5180,11 +5180,15 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
 
 	if (ret == MB_INCOMPLETE) {
 	    /*
-	     * "num_in_char" is only used for incomplete characters.  The
-	     * assumption is that we will output this ocatet as a single
+	     * "num_in_char" is only used for incomplete characters.
+	     * The assumption is that we will output all trailing octets
+	     * that form part of an incomplete character as a single
 	     * character (of single width) if we don't get a complete
-	     * character; if we do get a complete character, num_in_char
-	     * becomes irrelevant and is set to zero.
+	     * character.  This is purely pragmatic --- I'm not aware
+	     * of a standard way of dealing with incomplete characters.
+	     *
+	     * If we do get a complete character, num_in_char
+	     * becomes irrelevant and is set to zero
 	     *
 	     * This is in contrast to "num" which counts the characters
 	     * or widths in complete characters.  The two are summed,
@@ -5216,8 +5220,8 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
 	}
     }
 
-    /* If incomplete, treat remainder as trailing single bytes */
-    return num + num_in_char;
+    /* If incomplete, treat remainder as trailing single character */
+    return num + (num_in_char ? 1 : 0);
 }
 
 /*


