Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: PATCH: Fix inverted condition for unique completions



On 3/15/22, Mikael Magnusson <mikachu@xxxxxxxxx> wrote:
> This change makes most-recent-file[1] completion take a few milliseconds
> instead of close to a minute in a directory of 30000 files. I'm not sure
> exactly what's going on with these flags, but surely if they are not set,
> then we should not deduplicate the set of matches. Most likely this was
> a thinko-copy-paste-o from the first NOSORT case which does have to be
> negated. I'm not sure why regular tab-press was always fast, but it is
> still fast after this change for me.
>
> I don't know when these flags are supposed to be set, so I haven't tested
> that case.
>
> [1]
> zstyle ':completion:(*-|)most-recent-file:*' match-original both
> zstyle ':completion:(*-|)most-recent-file:*' file-sort modification
> zstyle ':completion:(*-|)most-recent-file:*' file-patterns '*:all\ files'
> zstyle ':completion:most-recent-file:*' hidden all
> zstyle ':completion:(*-|)most-recent-file:*' completer _files

Okay, I set out to test the case when the flags are set, and waded
through some confusion but I think the conclusion is that the comments
in the header are wrong, and the names of the constants are
misleading, and _path_files should probably use -V -2 since files
can't be duplicated anyway. Is there a useful case when the same file
could be added twice? (eg not just running _files twice by mistake).

Here's the wading through confusion part:
compcore.c has code like this (without my patch)
	    if (!(flags & CGF_UNIQCON)) {
		int dup;

		/* And delete the ones that occur more than once. */
...
	    if (!(flags & CGF_UNIQALL) && !(flags & CGF_UNIQCON)) {
                int dup;
	    } else if (!(flags & CGF_UNIQCON)) {
		int dup;
...

Checking the comments for these flags:
#define CGF_UNIQALL  8		/* remove all duplicates */
#define CGF_UNIQCON 16		/* remove consecutive duplicates */

So far it clearly seems like the checks are inverted.

addmatches() does this:
        gflags = (((dat->aflags & CAF_NOSORT ) ? CGF_NOSORT  : 0) |
...
                  ((dat->aflags & CAF_UNIQALL) ? CGF_UNIQALL : 0) |
                  ((dat->aflags & CAF_UNIQCON) ? CGF_UNIQCON : 0));

Okay, so we've copied over some other flags (note the CAF vs CGF),
what do those say?
#define CAF_UNIQCON  8    /* compadd -2: don't deduplicate */
#define CAF_UNIQALL 16    /* compadd -1: deduplicate */
okay... that's somewhat contradictory. Let's check the manpage:
  -1   If  given  together  with the -V option, makes only consecutive
       duplicates in the group be removed. If  combined  with  the  -J
       option,  this  has no visible effect. Note that groups with and
       without this flag are in different name spaces.

  -2   If given together with the -J or -V option,  makes  all  dupli‐
       cates  be kept. Again, groups with and without this flag are in
       different name spaces.


Okay, so UNIQALL means to remove only consecutive duplicates, and
UNIQCON means keep all duplicates. Right. That doesn't make any sense,
but it does match what the original code does (I think).

With that in mind, I have the following patch instead, any objections
to this? (Again, it makes completion be instant instead of taking
minutes in large directories). If there are objections, we can
probably improve the deduplication algorithm to from n^2 to nlog n
(sort array of pointers and use that to manipulate the original array
keeping relative order of kept elements. In fact, the input data to
the function is a list of pointers which we copy to an array of
elements which is then manipulated).

PS the manpage says -V is required for -1/-2 but -J with -o nosort
works as well which is what happens below.

-- 8< --

Subject: PATCH: _path_files: keep duplicates when not sorting

The deduplication algorithm is too slow to run on very large sets of
matches, and files can't be duplicate anyway.
---
 Completion/Unix/Type/_path_files | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Completion/Unix/Type/_path_files b/Completion/Unix/Type/_path_files
index c09ca6fa9e..2ee3030785 100644
--- a/Completion/Unix/Type/_path_files
+++ b/Completion/Unix/Type/_path_files
@@ -168,7 +168,7 @@ if zstyle -s ":completion:${curcontext}:"
file-sort tmp1; then
   if [[ "$sort" = on ]]; then
     sort=
   else
-    mopts=( -o nosort "${mopts[@]}" )
+    mopts=( -o nosort -2 "${mopts[@]}" )

     tmp2=()
     for tmp1 in "$pats[@]"; do

-- 
Mikael Magnusson




Messages sorted by: Reverse Date, Date, Thread, Author