Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

There is a serious inefficiency in the way zsh handles wildcards



  Hello,

  I received a bug report whose summary is the subject of this message. The
  test case given was:

---%<---
$ time zsh -c "ls /tmp/*****.*****"

real 0m0.006s
user 0m0.004s
sys 0m0.002s

$ time zsh -c "ls /tmp/******.******"

real 0m0.032s
user 0m0.031s
sys 0m0.001s

$ time zsh -c "ls /tmp/*******.*******"

real 0m0.127s
user 0m0.125s
sys 0m0.003s

$ time zsh -c "ls /tmp/********.********"

real 0m0.485s
user 0m0.484s
sys 0m0.002s

$ time zsh -c "ls /tmp/**********.**********"

real 0m5.933s
user 0m5.937s
sys 0m0.002s
---%<---

I looked through the zsh sources and wrote the patch below. It should not
interfere with the special handling of **/ and ***/; it just avoids the very
deep recursion that consumes a huge amount of CPU and apparently yields nothing.

---%<---
diff -up zsh-5.0.2/Src/pattern.c.orig zsh-5.0.2/Src/pattern.c
--- zsh-5.0.2/Src/pattern.c.orig 2014-09-03 12:21:44.673792750 -0300
+++ zsh-5.0.2/Src/pattern.c 2014-09-03 12:22:28.069303587 -0300
@@ -2911,6 +2911,10 @@ patmatch(Upat prog)
     break;
  case P_STAR:
     /* Handle specially for speed, although really P_ONEHASH+P_ANY */
+    while (P_OP(next) == P_STAR) {
+        scan = next;
+        next = PATNEXT(scan);
+    }
  case P_ONEHASH:
  case P_TWOHASH:
     /*
---%<---
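
To illustrate why I think skipping consecutive stars is safe and why it helps,
here is a minimal standalone sketch. It is not zsh's patmatch(): it is a toy
backtracking matcher in which '*' matches any run of characters and everything
else is literal, and all the names in it (naive_match, collapsed_match,
count_calls, calls) are made up for this example. In this toy, a run of stars
matches exactly what a single star matches, so the second function skips the
redundant ones before it starts backtracking.

```c
#include <stddef.h>

/* Counts recursive calls, only to make the cost visible. */
static unsigned long calls;

/* Toy matcher: every '*' backtracks on its own, as in the slow case. */
static int naive_match(const char *pat, const char *str)
{
    calls++;
    if (*pat == '\0')
        return *str == '\0';
    if (*pat == '*') {
        /* Try every possible length for this star, including zero. */
        for (const char *s = str; ; s++) {
            if (naive_match(pat + 1, s))
                return 1;
            if (*s == '\0')
                return 0;
        }
    }
    return *str == *pat && naive_match(pat + 1, str + 1);
}

/* Same matcher, but a run of stars is reduced to its last star first. */
static int collapsed_match(const char *pat, const char *str)
{
    calls++;
    if (*pat == '\0')
        return *str == '\0';
    if (*pat == '*') {
        while (pat[1] == '*')
            pat++;              /* skip redundant consecutive stars */
        for (const char *s = str; ; s++) {
            if (collapsed_match(pat + 1, s))
                return 1;
            if (*s == '\0')
                return 0;
        }
    }
    return *str == *pat && collapsed_match(pat + 1, str + 1);
}

/* Hypothetical helper: run one matcher and report how many calls it made. */
static unsigned long count_calls(int (*fn)(const char *, const char *),
                                 const char *pat, const char *str)
{
    calls = 0;
    (void)fn(pat, str);
    return calls;
}
```

Calling count_calls() with each matcher on a pattern such as "******.******"
and a name that contains no dot shows the difference: the naive version tries
every way of splitting the name among the stars before giving up, while the
collapsed version makes one pass per run of stars. Both return the same result
in this toy.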

Do you believe this patch is OK?

The user reports those patterns are generated by one of their scripts.

Thanks,
Paulo


