Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [RFC or so] Add HASH_LOOKUP option



On 23 August 2010 19:46, Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
> On Aug 23,  4:09pm, Mikael Magnusson wrote:
> }
> } When this is unset, external commands are always resolved with a full
> } path search, but still inserted into the hash for spell correction if
> } those options are on.
> }
> } diff --git a/Src/exec.c b/Src/exec.c
> } index 93d1b26..9a488fe 100644
> } --- a/Src/exec.c
> } +++ b/Src/exec.c
> } @@ -754,7 +754,7 @@ findcmd(char *arg0, int docopy)
> }           }
> }           break;
> }       }
> } -    if (cn) {
> } +    if (cn && isset(HASHLOOKUP)) {
> }       char nn[PATH_MAX];
> }
> }       if (cn->node.flags & HASHED)
>
> I think there may be  problem with this in the event that the "hash"
> command has been used to deliberately insert an entry into the hash
> table.  That's a documented mechanism for overriding path search.
> Also I think you've missed that execute() also uses the hash table;
> did you intend that "command foo" ignores NO_HASHLOOKUP?

As I said, I didn't check everywhere yet to see if there are more
duplicated codepaths, so no, that is not the intent. However, command
foo for me doesn't appear to ignore it for me.

> In looking more closely, there are a number of things about findcmd()
> that look a bit fishy to me.  Correct me where I've gone wrong?

A lot of things looked fishy to me while poking at this, but I
couldn't be sure which were actually fishy and which were just arcane
reasons.

> It starts (if the command is not already in the hash table and HASHCMDS
> is set) by doing hashcmd(), which performs a search of only absolute
> directory names in the path (not, e.g., ".", "..", or other relative
> names).  That search itself seems a bit off because if *arg0 == '/'
> then it's going to prepend the path component anyway. [*]

That's the first fishy thing I noticed, why is this for loop over
$path copied to at least four places, two of which are in the same
function, with slight variations?

The fact that 'command' supposedly does a hash lookup while 'command
-p' doesn't seems to be rather undocumented too.

> It then ignores what it got back from the search and, if arg0 contains
> a '/', checks whether arg0 is executable without path traversal.  This
> seems a bit sideways to me; why not do that first?  HASHDIRS, I guess?
> [This disagrees with execute(), which does the directly-executable test
> first.]  In any case this can return NULL, leaving the ignored results
> from the previous search in the hash table, unless PATHDIRS is set.
>
> Then, if it DID find the command either in the hashtable already or
> via hashcmd() but the hashtable entry does not have the HASHED bit [**]
> it performs a search of only relative directory names in the path ahead
> of the previously-found absolute directory.

I guess in this specific function I could move my isset(HASHLOOKUP) to
be in the & HASHED if() instead of the cn one. Which is to say if the
whole thing isn't broken in the first place.

> Finally, if all of the above failed, we do a full path search of both
> relative and absolute directories again, which is redundant in the
> case of HASHCMDS unless by some unlikely race condition the command
> has appeared in one of the directories in the path in the meantime.
>
>
> [*] And indeed, from zsh -f:
>
> torch% hash
> torch% print =/X11/xterm
> zsh: /X11/xterm not found
> torch% hash | grep X11
> /X11/xterm=/usr/bin//X11/xterm

Even funner:
% print =./ls
/bin/./ls

> [**] The HASHED bit means the command was deliberately inserted into
> the hash table with the "hash" builtin, rather than found by search.
> In this case the both findcmd() and execute() are forced to believe
> what the hash table tells them.

Could this be used, theoretically, to still allow users to override
the path, and still do a full path search for search hashed entries? I
note you didn't include execcmd() in the function listing there, but
I'm not clear on how/if execute() and execcmd() relate to eachother.
When I looked at the definition for the struct hashnode earlier when
writing the patch, I found the rather helpful
  int flags;        /* various flags      */
and proceeded to ignore it :P.

Another thing I was vaguely wondering about is how is HASH_CMDS forced
on by CORRECT? I was unable to grep up anything about it.

-- 
Mikael Magnusson



Messages sorted by: Reverse Date, Date, Thread, Author