Zsh Mailing List Archive

Re: Emulating 'locate'

X-seq: zsh-users 6633
From: Lloyd Zusman <ljz@xxxxxxxxxx>
To: zsh-users@xxxxxxxxxx
Subject: Re: Emulating 'locate'
Date: Sat, 04 Oct 2003 09:48:43 -0400
Mailing-list: contact zsh-users-help@xxxxxxxxxx; run by ezmlm
References: <20031001221753.GA23189@DervishD> <1031002023639.ZM22046@xxxxxxxxxxxxxxxxxxxxxxx> <20031002080358.GA23230@DervishD> <m365j6watm.fsf@xxxxxxxxxx> <20031004104844.GA50@DervishD>
Reply-to: ljz@xxxxxxxxxx
Sender: Lloyd Zusman <ljz@xxxxxxxxxx>

DervishD <raul@xxxxxxxxxxxx> writes:

>     Hi Lloyd :)
>
>  * Lloyd Zusman <ljz@xxxxxxxxxx> dixit:
>> >>     locate() { print -l /**/*${^*}*{,/**/*} }
>> >     Ok, it works like a charm... Thanks a lot, as always :)
>> I might have missed something about this in the first part of the thread
>> a couple weeks ago (those messages have already expired on my system),
>> but in case it wasn't mentioned before, I want to point out that this
>> function is _extremely_ slow in comparison to the standard 'locate'
>> command.  [ ... ]
>
>     Obviously: locate uses a database of names for doing the
> 'location'. Moreover, I don't know exactly if locate is faster than
> doing a grep in the same database (uncompressed, of course... The
> locate database is front-compressed, see find manual for details).
>
>     The 'locate' command doesn't do any magic for being fast: the
> price it pays is the need of a database, that may be outdated (so you
> will miss files, or find nonexistent ones...). If you want reliable
> results you have two options:
>
>     - Use the zsh version, or a version with 'find'.
>     - Update the database regularly. Very regularly, in fact. If files
> are created and destroyed frequently, you will have to update the
> database continuously... On the average system, anyway, this is not an
> issue, especially if you look for files that reside on 'stable' parts
> of the system.

Well, I generally use the 'locate' command when I want to do a global
search over my entire system.  I am always aware that its results might
be outdated, and I go back to 'find' when I want a search that is
up-to-the-moment accurate.  However, in that case, I target it at a
specific directory tree, and I rarely, if ever, recurse down from the
root directory unless I want to take a long coffee break waiting for
results and don't mind users screaming at me for slowing down the system.

Your locate function would be even better than it already is if you
could point it at a directory instead of having it always start at root.
That would be an interesting continuation of this exercise!

>> I'm not sure how it compares to this:
>>   locate() { find / -name "*${^*}*" -print }
>
>     This is faster, IMHO, because AFAIK find uses a non-recursive
> algorithm to recurse the hierarchy. Although I'm not sure about that
> glob pattern you use, since it will be interpreted by find, not the
> shell :?? The manual says you can use a shell pattern, but I'm not
> sure about who interprets it. If it is find who interprets, then
> ${^*} won't work as expected. Using more elaborate patterns is an
> advantage of using the zsh version.

zsh expands the ${^*} part and inserts it between the other two
asterisks when the shell function is invoked, and 'find'
interprets the result.  I think I should have left out the ^, however,
or probably only used ${1}.
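
To make the division of labor concrete, here is a minimal sketch of the
simpler ${1} variant.  The helper name 'show_find_pattern' is made up
for illustration; it only prints the argument that 'find -name' would
receive.  The shell substitutes the parameter, the double quotes keep
the asterisks literal, and the wildcard matching is left to 'find':

```shell
# Hypothetical helper: print the pattern that 'find -name' would be given.
# The shell replaces ${1} with the first argument; because the whole word
# is double-quoted, the surrounding asterisks are passed through untouched.
show_find_pattern() {
  printf '%s\n' "*${1}*"
}

show_find_pattern passwd    # prints *passwd*
```

So 'find' ends up matching file names against the shell-style pattern
*passwd* on its own, without any further help from zsh.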

I just ran a timing test, and unfortunately, 'find' fares better than
your locate function, which I named 'xlocate' on my system.  Here are
the results:

  find / -name specific-file -print   # 15 min 19 sec elapsed
  xlocate specific-file               # 28 min 40 sec elapsed

Of course, your function provides zsh's much richer set of matching
capabilities.

>> Figuring this out is a very good learning experience for zsh. 
>> However, I would not recommend installing this function for
>> everyday use on a reasonably sized system.
>
>     Of course ;))) But on small systems or when searching on a
> limited set of directories, the zsh version, although slower, permits
> more elaborate searches, IMHO. And it doesn't find false positives
> (nonexistent files) nor miss files ;) But you're right, this is more
> a learning experience than a function of real use. For it to be
> useful, it must be rewritten to use a database, or something like
> that...

Well, I think that there is a way to make it quite good for everyday use
without having to go so far as to create a database: just come up with a
way to target the search from a specific directory instead of always
having to start from root.  If your shell function could take an
additional first argument, namely the directory under which to start
searching, it would be great, IMHO.  For example:

  # look under my HOME directory and find all
  # files whose names match the x*.c pattern
  locate ~ 'x*.c'

  # I know that 'lost-file-name' is located under
  # /usr/share, but I can't for the life of me
  # remember where it is
  locate /usr/share lost-file-name

  # Give me a list of every GIF, JPEG, and PNG
  # on my entire system.  I don't mind taking
  # a coffee break while waiting for the results
  locate / '(#i)*.{gif,jp{,e}g,png}'

Here's my first try at it (I call it 'xlocate' so as not to conflict
with the 'locate' command on my system):

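  xlocate() {
    # localoptions keeps the option changes from leaking into the
    # calling shell once the function returns.
    setopt localoptions nullglob extendedglob
    # eval lets braces and glob flags inside the pattern arguments be
    # expanded; ${^...} applies the prefix and suffix to each pattern.
    eval print -l ${argv[1]%/}'/**/'${^argv[2,-1]}'{,/**/*}'
  }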

I removed the asterisks before and after the ${^argv[2,-1]} so I don't
lose the ability to do the following:

  xlocate ~ '*.c'   # only matches *.c files under HOME
  xlocate ~ c       # only matches files named 'c' under HOME
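
For comparison, a rough 'find'-based counterpart with the same calling
convention might look like the sketch below.  The name 'flocate' is
invented for this example.  It assumes that each pattern is meant to
match a whole file name, which is how 'find -name' behaves anyway.
Unlike xlocate, it does not list the contents of matching directories,
and it understands only the patterns that 'find' supports, not zsh glob
flags such as (#i):

```shell
# flocate DIR PATTERN...
# Hypothetical find-based counterpart to xlocate: search below DIR for
# files whose names match any of the given shell-style patterns.
flocate() {
  if [ "$#" -lt 2 ]; then
    echo "usage: flocate DIR PATTERN..." >&2
    return 2
  fi
  local dir="$1"
  shift
  local pat
  for pat in "$@"; do
    # Quote the pattern so that 'find', not the shell, does the matching.
    find "$dir" -name "$pat" -print
  done
}
```

With this, flocate ~ '*.c' lists only the *.c files under HOME, and
flocate ~ c lists only files named exactly 'c'.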

>     Raúl Núñez de Arenas Coronado

-- 
 Lloyd Zusman
 ljz@xxxxxxxxxx
