Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: Unicode, Korean, normalization form, Mac OS X and tab completion



There is a patch by a Japanese user that simply converts
file names obtained by readdir() into the composed form ("NFC"):
	https://gist.github.com/waltarix/1403346
The patch in this gist is against zsh-5.0.0 (I guess).
I attached the same patch against the current git master below
(I added defined(__APPLE__) to the #if condition).
We can use this to see what kinds of problems this simple
approach may cause.

Kwon Yeolhyun, can you test this patch?

In the current zsh (without this patch), 
$ ls 가<TAB>
doesn't work if 가 is input from keyboard (NFC), but works if it is
pasted from the ls output (NFD). With the patch, the opposite happens.
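
To make the difference between the typed and the pasted 가 concrete, here is
a minimal sketch of the arithmetic Hangul syllable decomposition defined in
the Unicode Standard, which is what turns the composed form into the
decomposed one. The helper name hangul_decompose is hypothetical and is not
part of zsh or of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants from the Unicode Standard's Hangul syllable decomposition. */
enum {
    S_BASE = 0xAC00,              /* first precomposed syllable */
    L_BASE = 0x1100,              /* first leading consonant */
    V_BASE = 0x1161,              /* first vowel */
    T_BASE = 0x11A7,              /* one before the first trailing consonant */
    V_COUNT = 21,
    T_COUNT = 28,
    N_COUNT = V_COUNT * T_COUNT,  /* 588 */
    S_COUNT = 19 * N_COUNT        /* 11172 syllables in total */
};

/*
 * Hypothetical helper: decompose one precomposed Hangul syllable into
 * conjoining jamo. Returns the number of code points written to out
 * (2 or 3), or 0 if the input is not a precomposed syllable.
 */
size_t hangul_decompose(uint32_t syllable, uint32_t out[3])
{
    assert(out != NULL);
    if (syllable < S_BASE || syllable >= S_BASE + S_COUNT)
        return 0;

    uint32_t s_index = syllable - S_BASE;
    uint32_t t_index = s_index % T_COUNT;

    out[0] = L_BASE + s_index / N_COUNT;
    out[1] = V_BASE + (s_index % N_COUNT) / T_COUNT;
    if (t_index == 0)
        return 2;                 /* no trailing consonant */
    out[2] = T_BASE + t_index;
    return 3;
}
```

For 가 (U+AC00) this yields U+1100 U+1161. In UTF-8 the typed form is the
three bytes EA B0 80, while the decomposed form is the six bytes
E1 84 80 E1 85 A1, so a byte-wise comparison of the two never matches.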
 
Of course this patch affects not only Korean but any language that
has decomposable characters. For example, if you have a file named über
in the current directory, with the current zsh (without the patch):

$ ls u<TAB>	# completes to über (useful for some users?)
$ ls ü<TAB>	# fails to complete

and u* matches über while ü* doesn't.
With the patch, we get the opposite behavior.
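
The über behavior can be reproduced outside zsh with a small sketch. It
assumes that prefix matching is essentially a byte-wise comparison of the
UTF-8 encodings, which is a simplification of what completion and globbing
actually do. The names uber_nfc, uber_nfd and has_prefix are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* "über" spelled out as UTF-8 bytes in both normalization forms. */
static const char uber_nfc[] = "\xc3\xbc" "ber";   /* U+00FC, then "ber" */
static const char uber_nfd[] = "u\xcc\x88" "ber";  /* U+0075 U+0308, then "ber" */

/*
 * Hypothetical helper: byte-wise prefix test, standing in for the way a
 * typed prefix is compared against a directory entry.
 * Returns 1 if name starts with prefix, 0 otherwise.
 */
int has_prefix(const char *name, const char *prefix)
{
    assert(name != NULL && prefix != NULL);
    return strncmp(name, prefix, strlen(prefix)) == 0;
}
```

Against the decomposed name, the prefix "u" matches and a typed (composed)
"ü" does not; against the composed name it is the other way around, which
is the swap the patch produces.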

Jun



diff --git a/Src/utils.c b/Src/utils.c
index 9439227..86b61f1 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -4270,6 +4270,13 @@ mod_export char *
 zreaddir(DIR *dir, int ignoredots)
 {
     struct dirent *de;
+#if defined(HAVE_ICONV) && defined(__APPLE__)
+    static iconv_t conv_ds = (iconv_t)NULL;
+    static char *conv_name = (char *)NULL;
+    char *temp_name;
+    char *temp_name_ptr, *orig_name_ptr;
+    size_t temp_name_len, orig_name_len;
+#endif
 
     do {
 	de = readdir(dir);
@@ -4278,6 +4285,23 @@ zreaddir(DIR *dir, int ignoredots)
     } while(ignoredots && de->d_name[0] == '.' &&
 	(!de->d_name[1] || (de->d_name[1] == '.' && !de->d_name[2])));
 
+#if defined(HAVE_ICONV) && defined(__APPLE__)
+    if (!conv_ds)
+	conv_ds = iconv_open("UTF-8", "UTF-8-MAC");
+    if (conv_ds != (iconv_t)-1) {
+	orig_name_ptr = de->d_name;
+	orig_name_len = strlen(de->d_name);
+	conv_name = zrealloc(conv_name, orig_name_len+1);
+	temp_name_ptr = conv_name;
+	temp_name_len = orig_name_len;
+	if (iconv(conv_ds, &orig_name_ptr, &orig_name_len, &temp_name_ptr, &temp_name_len) != (size_t)-1) {
+	    *temp_name_ptr = '\0';
+	    temp_name = conv_name;
+	    return metafy(temp_name, -1, META_STATIC);
+	}
+    }
+#endif
+
     return metafy(de->d_name, -1, META_STATIC);
 }
 



