Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [PATCH] vcs_info: add 'find-deepest' zstyle

On Sat, Oct 24, 2020 at 5:49 AM Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx> wrote:
> Aleksandr Mezin wrote on Fri, 23 Oct 2020 14:34 +0600:
> > Currently, vcs_info iterates over enabled VCS backends and outputs repository
> > status from the first backend that works (i.e. first backend that can find a
> > repository).
> >
> > But I prefer a different behavior: I want to see the status of the repository
> > closest to the current working directory. For example: if there is a Mercurial
> > repository inside a Git repository, and I 'cd' into the Mercurial repository,
> > I want to see the status of Mercurial repo, even if 'git' comes before 'hg' in
> > the 'enable' list.
> I suppose that's the Right thing to do in most cases: e.g., it's
> context-sensitive whereas the 'enable' style's value doesn't usually
> depend on the current directory.
> (Separately, in your example it might be nice to indicate that there's
> an "outside" Git repository, but that'd be a separate feature request.)
> > Because some people, apparently, want the old behavior of vcs_info,
> What's those people's argument?  The default behaviour should be decided
> on on this list, not simply announced to this list as a _fait accompli_.

I can change the default to 'yes', no problem.

I think I should have mentioned that this is a "v2" of this patch.
Here are some important quotes from comments for the v1:

Bart Schaefer wrote on Mon, Nov 25, 2019 at 01:13:21 -0800:
> I have several directories that are BOTH git and CVS repositories, used for
> keeping two different origins in sync.
> This is admittedly an unusual situation -- we have a legacy process that
> expects to fetch sources from CVS, and developers collaborating through git
> -- but I would be annoyed if vcs didn't find the git data because of the
> CVS subdir.

Previously, you proposed the following solution:
> How about showing the information from the worktree whose root is deeper
> (= closer to cwd)?  Users can show the information for the other worktree
> by setting the 'enable' style in a :vcs_info:foo:* context and calling
> «vcs_info foo».
> In my proposal, if {foo,foo/bar,foo/bar/baz}/CVS
> and foo/bar/.git all exist, and cwd is foo/bar/baz/, git info would be shown
> because the git root, foo/bar/, is closer (deeper) than the CVS root, foo/ —
> notwithstanding that foo/bar/baz/CVS exists.

I thought this patch implements what you suggested, except for CVS.
After re-reading the comments I think I made a mistake with
VCS_INFO_detect_cvs, and should revert it to what it was in v1:

> -[[ -d "./CVS" ]] && [[ -r "./CVS/Repository" ]] && return 0
> -return 1
> +[[ -d "./CVS" ]] && [[ -r "./CVS/Repository" ]] || return 1
> +
> +local cvsbase="."
> +while [[ -d "${cvsbase}/../CVS" ]]; do
> +    cvsbase="${cvsbase}/.."
> +done
> +vcs_comm[basedir]=${cvsbase:P}
> +
> +return 0

(and maybe fix that potentially infinite loop, yes)

It will work for Bart's use case this way, and it's what
VCS_INFO_get_data_cvs currently does.

> > I made it configurable, and, by default, old behavior is kept. To
> > enable new behavior, enable 'find-deepest' zstyle:
> >
> >     zstyle ':vcs_info:*' find-deepest yes
> >
> I'd probably set this in my dotfiles: in the rare cases that I nest
> repositories, I don't add files in the deeper repository to the
> shallower repository.
> > After this change, vcs_info will try all enabled backends, and choose the one
> > that returns the deepest basedir path. Usually it will be the repository
> > closest to the current directory - if all basedirs are ancestors of the
> > current directory. I do not care about cases when a basedir isn't an ancestor
> > (for example, if GIT_WORK_TREE is pointing to an unrelated directory).
> Would you clarify the last sentence (not the parenthetical but the part
> before it)?

I do not have a strong opinion on how vcs_info should behave when some
basedir isn't an ancestor of pwd.

When some basedir is a subdirectory in pwd, vcs_info should likely
choose that backend (because it should mean that the user specified
that basedir explicitly - for example GIT_WORK_TREE).

But when a basedir is neither ancestor nor subdir of pwd, simply
choosing the backend based on path depth doesn't look quite correct to
me. But I don't know what the correct way is, so I ignore the fact
that vcs_info could behave weirdly (from the user's POV) in this case.
Maybe vcs_info should actually prefer the basedir that's not an
ancestor of pwd (i. e. choose it immediately when encountered, before
all depth comparisons)? Because "basedir isn't an ancestor of pwd" is
"the user somehow specified this directory explicitly". As far as I

> > Also, this patch adjusts VCS_INFO_detect_git and VCS_INFO_detect_cvs to make
> > them set vcs_comm[basedir]. They were the only VCS_INFO_detect_* functions not
> > setting it.
> Thanks.  I don't recall if there's "How to write a backend"
> documentation

I haven't found anything like that.

> but so long as all existing backends get it right, we're
> probably covered well enough in practice.  Going forward, we could
> heavily comment one of the backends and declare it documentation, as
> with Test/B01cd.ztst.
> > Signed-off-by: Aleksandr Mezin <mezin.alexander@xxxxxxxxx>
> > ---
> >  Doc/Zsh/contrib.yo                            |  9 +++++++
> >  .../VCS_Info/Backends/VCS_INFO_detect_cvs     |  2 +-
> >  .../VCS_Info/Backends/VCS_INFO_detect_git     |  1 +
> >  .../VCS_Info/Backends/VCS_INFO_get_data_git   |  2 +-
> >  Functions/VCS_Info/vcs_info                   | 25 ++++++++++++++++---
> Please update _zstyle.
> >  5 files changed, 33 insertions(+), 6 deletions(-)
> >
> > diff --git a/Doc/Zsh/contrib.yo b/Doc/Zsh/contrib.yo
> > index 00f693664..88d00c0ac 100644
> > --- a/Doc/Zsh/contrib.yo
> > +++ b/Doc/Zsh/contrib.yo
> > @@ -1238,6 +1238,14 @@ of unapplied patches (for example with Mercurial Queue patches).
> >
> >  Used by the tt(quilt), tt(hg), and tt(git) backends.
> >  )
> > +kindex(find-deepest)
> > +item(tt(find-deepest))(
> > +By default, tt(vcs_info) tries backends in the order of 'enable' list, and
> > +returns the output of the first backend that can find a repository. When
> > +tt(find-deepest) is set to tt(true), tt(vcs_info) will try all enabled backends,
> > +and will choose the one that
> Multiple tweaks here.  How about:
>     By default, tt(vcs_info) tries backends in the order given by the tt(enable) style, and
>     operates on the repository detected by the first backend to successfully detect a repository.
>     When this style (tt(find-deepest)) is set to tt(true), tt(vcs_info) will try all enabled backends,
>     and will operate on [...]
> Wordier, yes, but "the output of a backend" didn't parse for me.

If I add the documentation written by you to my patch, I guess I
should mention you in the commit message somehow?

> > +returns the deepest tt(basedir) path.
> "basedir" is an implementation term; readers won't know what it means.
> Keeping it would be like having zshparam(1) document $? as "The value
> of tt(lastval)".
> If /foo/.${vcs} and /foo/bar/.${vcs} both exist (same value of ${vcs} in
> both) the style is set, the latter should be used.  However, you don't
> seem to be handling this case in any way.  I guess that if this patch is
> to be accepted, we'll have to go through all backends and verify that
> they prefer /foo/bar/.${vcs} to /foo/.${vcs} when both exist and the
> style is set appropriately.  It's probably best if you go through
> backends you can install via your distro and ask -workers@ volunteers to
> go through any others.  (Some backends will probably be left over, but
> we can cross that bridge when we come to it.)

If the current directory is /foo/bar in your example, then, I think,
all backends, except CVS, will choose /foo/bar/.${vcs}
Most of them simply use VCS_INFO_bydir_detect, git will definitely
choose /foo/bar too.
CVS is special just because it has its subdirectory in every directory
of the working copy. VCS_INFO_get_data_cvs thinks that the last
successive parent directory that still has CVS subdirectory is the
root of the working copy. And I think it's correct.

> When /foo/.${vcs} and /foo/bar/.${vcs} both exist and «find-deepest» is
> set to _«false»_, might someone expect /foo/.${vcs} to be used?  (After
> all, one might reasonably presume flipping the style's value would
> _somehow_ affect vcs_info's behaviour.)  Perhaps we can name the style
> more unambiguously?  How about changing it from boolean to enum, as in:
> .
>     zstyle ':vcs_info:*' detection-algorithm enablement-order
>     zstyle ':vcs_info:*' detection-algorithm deepest-.vcs-dir
> .
> That'd also be more extensible.

Yep, I agree

> Separately, as implemented the style doesn't quite go by the _deepest_
> basedir path.  Rather, it goes by the one that _has more path
> components_.¹  That's not the same thing, at least if you take "X is
> deeper than Y" to imply "X is a (possibly nested) subdirectory of Y",
> which, I think, would be the appropriate semantics here.  For instance,
> given the pair of repository root paths {/foo/bar/baz, /lorem/ipsum},
> the patch will prefer /foo/bar/baz, even though it's not "deeper than"
> /lorem/ipsum.  I don't think setting find-deepest should affect the
> relative precedence of /foo/bar/baz and /lorem/ipsum…

I used the word 'deep/deepest' because I was thinking that "depth" in
a tree is the length of the path from the root. Which is a number, and
can be compared for any nodes. Is this incorrect?

Maybe I should iterate over basedirs, and just check if "current"
basedir is a subdir (maybe nested) of the current best match? I'm ok
with this behavior too. Though I don't find it any better in "weird"
cases - when some basedirs aren't ancestors of pwd

> … unless either of the two directories happens to be an ancestor of cwd,
> in which case there's an argument to be made that that directory should
> be preferred.  (I hope we don't have to worry about symlinks here.)
> Moreover, the above description wasn't entirely accurate.  The patch
> doesn't count path components; it counts slashes.  That's not
> equivalent, because of things like «/foo/bar», «/foo/bar/»,
> «/foo/./bar», «/foo//bar», and «bar» (the last one depends on cwd, of
> course).  For instance, «/foo///bar» will be considered deeper than
> «/foo/bar/baz».

I think vcs_comm[basedir] can never contain duplicate slashes - either
the value assigned to it is expanded with ':P' modifier, or the vcs
itself returns it in that form (for example, git rev-parse
--show-toplevel removes duplicate slashes from GIT_WORK_TREE). If
necessary, I can add an extra expansion with ':P' in vcs_info itself.

> I think a reasonable next step would be to nail down what repository
> should be chosen, given $PWD and a set of existing .${vcs} dirs
> (possibly more than one of those per value of ${vcs}).  For instance:
>     1. The .${vcs} dir whose parent is an ancestor of ${PWD}; if there's
>     more than one of these, the closest ancestor; if there's more than
>     one of _these_, the first enabled.
>     2. If there's no such .${vcs} dir, then a .${vcs} dir from the first
>     backend in enablement order.  If there's more than one of these,
>     then XXX.  (TODO: What if the first backend in enablement order
>     identifies two repository directories, and one of them is an
>     ancestor of the other?  Which one will be used?)
> > +Usually it will be the repository closest to the current directory.
> Documentation shouldn't say "usually" without also explaining what
> happens in the _unusual_ case.
> > +++ b/Functions/VCS_Info/Backends/VCS_INFO_detect_cvs
> > @@ -7,5 +7,5 @@ setopt localoptions NO_shwordsplit
> > -[[ -d "./CVS" ]] && [[ -r "./CVS/Repository" ]] && return 0
> > +[[ -d "./CVS" ]] && [[ -r "./CVS/Repository" ]] && vcs_comm[basedir]=.(:P) && return 0
> «.(:P)» won't be expanded here.


Also, again, it seems that I should revert to the previous version of
my patch here - find the root CVS directory, like
VCS_INFO_get_data_cvs does.

> Cheers,
> Daniel

Messages sorted by: Reverse Date, Date, Thread, Author