Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [bug] :P modifier and symlink loops



Daniel Shahaf wrote on Sun, 02 Feb 2020 08:10 +0000:
> Stephane Chazelas wrote on Sat, 01 Feb 2020 17:57 +0000:
> > Ping:  
> 
> Thanks for the ping.  I've added this to Etc/BUGS so we don't forget
> it.  I worked on :P before, so I've added this to my list to
> investigate further, but I don't know when I'll get to it.
> 
> > 2020-01-11 17:00:47 +0000, Stephane Chazelas:
> > Hi,
> > 
> > I've got the feeling it's been discussed before, but could not
> > find it in the archives.
> > 
> > $ ln -s loop /tmp/
> > $ f=/tmp/loop strace ~/install/cvs/zsh/Src/zsh -c '$f:P'
> > [...]
> > readlink("/tmp/loop", "loop", 4096)     = 4
> > readlink("/tmp/loop", "loop", 4096)     = 4
> > [...]
> > readlink("/tmp/loop", "loop", 4096)     = 4
> > readlink("/tmp/loop", "loop", 4096)     = 4
> > --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR,  
> > si_addr=0x7ffec7a345e0} ---    
> > +++ killed by SIGSEGV +++
> > 
> > possibly stack overflow caused by unbound recursion or buffer
> > overflow on /tmp/loop/loop... but the bigger question is what to
> > do here.
> > 
> > The ELOOP problem is usually addressed by giving up after an
> > arbitrary number of symlinks has been resolved (regardless of
> > whether there is indeed a loop or not) in the lookup of the
> > file, but here $f:P *has* to expand to something, so what should
> > that be?
> > 
> > For instance, for
> > 
> > cd /
> > file=bin/../tmp/loop/../foo/.. above?
> > 
> > The only thing I can think of is expand to:
> > 
> > /tmp/loop/../foo/..
> > 
> > (maybe done by first doing a stat(the-file); if it returns
> > ELOOP, do a stat() at each stage of the resolution and give up
> > on the first ELOOP).
> > 
> > Any other idea?  
> 
> The postcondition of :P is "no dot or dot-dot components and no symlinks".
> 
> When the loop is on the last path component (as in ${${:-/tmp/loop}:P}
> and ${${:-/tmp/trap}:P} after «ln -s loop /tmp/trap») we could still print
> a path to the loop symlink that meets the postcondition, except for the loop
> symlink in the last path component.
> 
> However, in ${${:-"/tmp/loop/../foo"}} we can't meet the postcondition.
> I think our options are either to throw an exception, like a glob with
> no matches does, or to keep the additional components verbatim, as you
> suggest.
> 
> Intuitively I lean towards the first option.  We aren't a CGI script,
> where PATH_INFO is to be expected.  If we can't return a path without
> dot and dot-dot components and without symlinks, we should raise an
> error rather than continue silently. However, I'm open to alternatives.
> 
> I think the first option could be implemented along the lines of:
> 
> 1. Call realpath($arg).
> 2. If it returns ELOOP, call realpath(${arg:h}) and append "/${arg:t}".
> 3. Otherwise, throw an exception (i.e., set errflag).

Patch series attached.

I ended up implementing the second option — keeping the trailing
components verbatim — for several reasons:

1. It's actually documented this way for :P.  (xsymlink() has other
callers too, but I didn't check whether any of them specifically relied
on this behaviour.)

2. After I made the code use the realpath() wrapper function,
chabspath(), rather than xsymlinks() (plural), that's the behaviour
I observed, and I didn't go out of my way to change it.

I suppose we could revisit :P's behaviour on symlink loops with
trailing components after the loop, but in the meantime, this at least
fixes the segfault.

WDYT?

Cheers,

Daniel
>From 286bd5549ab5b3e7ef769310152460dda77b27d1 Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx>
Date: Sat, 21 Mar 2020 18:40:37 +0000
Subject: [PATCH 1/8] Add tests for the segfault on resolving a symlink loop
 bug (workers/45282).

This is workers/45377, extended.
---
 Test/D02glob.ztst | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/Test/D02glob.ztst b/Test/D02glob.ztst
index 4e6dc2a7a..248cc7ff5 100644
--- a/Test/D02glob.ztst
+++ b/Test/D02glob.ztst
@@ -757,6 +757,42 @@
 -f:(workers/45367) modifier ':P' squashes multiple slashes
 >/dev  
 
+  ln -s loop glob.tmp/loop
+  ln -s loop glob.tmp/trap
+  { 
+    (set -- glob.tmp/trap; echo $1:P)
+    (set -- glob.tmp/loop; echo $1:P)
+  } always {
+    rm -f glob.tmp/trap glob.tmp/loop
+  }
+-f:the ':P' modifier handles symlink loops in the last path component
+*>*/(trap|loop)
+*>*/(trap|loop)
+
+  ln -s loop glob.tmp/loop
+  ln -s loop glob.tmp/trap
+  { 
+    (set -- glob.tmp/loop/trailing/components; echo $1:P)
+    (set -- glob.tmp/trap/trailing/components; echo $1:P)
+  } always {
+    rm -f glob.tmp/trap glob.tmp/loop
+  }
+-f:the ':P' modifier handles symlink loops before the last path component
+*>*/glob.tmp/loop/trailing/components
+*>*/glob.tmp/(loop|trap)/trailing/components
+
+  ln -s flip glob.tmp/flop
+  ln -s flop glob.tmp/flip
+  {
+    (set -- glob.tmp/flip; echo $1:P)
+    (set -- glob.tmp/flip/trailing/components; echo $1:P)
+  } always {
+    rm -f glob.tmp/flip glob.tmp/flop
+  }
+-f:the ':P' modifier handles symlink loops other than the trivial case
+*>*/glob.tmp/(flip|flop)
+*>*/glob.tmp/(flip|flop)/trailing/components
+
 %clean
 
  # Fix unreadable-directory permissions so ztst can clean up properly
>From 3d5dca6ae6c59eaf38bea5e2e6c3e2429553a8eb Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx>
Date: Sat, 21 Mar 2020 18:01:36 +0000
Subject: [PATCH 2/8] chrealpath: Make symlink resolution optional.

---
 Src/hist.c  | 21 ++++++++++++++++-----
 Src/subst.c |  4 ++--
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/Src/hist.c b/Src/hist.c
index 5281e8718..db2cc4ad7 100644
--- a/Src/hist.c
+++ b/Src/hist.c
@@ -842,7 +842,7 @@ histsubchar(int c)
 		break;
 
 	    case 'A':
-		if (!chrealpath(&sline)) {
+		if (!chrealpath(&sline, 'A')) {
 		    herrflush();
 		    zerr("modifier failed: A");
 		    return -1;
@@ -1922,9 +1922,18 @@ chabspath(char **junkptr)
     return 1;
 }
 
+/*
+ * Resolve symlinks in junkptr.
+ *
+ * If mode is 'A', resolve dot-dot before symlinks.  Else, mode should be 'P'.
+ * Refer to the documentation of the :A and :P modifiers for details.
+ *
+ * Return 0 for error, non-zero for success.
+ */
+
 /**/
 int
-chrealpath(char **junkptr)
+chrealpath(char **junkptr, char mode)
 {
     char *str;
 #ifdef HAVE_REALPATH
@@ -1936,12 +1945,14 @@ chrealpath(char **junkptr)
 # endif
 #endif
 
+    DPUTS1(mode != 'A' && mode != 'P', "chrealpath: mode='%c' is invalid", mode);
+
     if (!**junkptr)
 	return 1;
 
-    /* Notice that this means ..'s are applied before symlinks are resolved! */
-    if (!chabspath(junkptr))
-	return 0;
+    if (mode == 'A')
+	if (!chabspath(junkptr))
+	    return 0;
 
 #ifndef HAVE_REALPATH
     return 1;
diff --git a/Src/subst.c b/Src/subst.c
index 79efc9ad2..7b3222d6e 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -4399,7 +4399,7 @@ modify(char **str, char **ptr, int inbrace)
 			chabspath(&copy);
 			break;
 		    case 'A':
-			chrealpath(&copy);
+			chrealpath(&copy, 'A');
 			break;
 		    case 'c':
 		    {
@@ -4485,7 +4485,7 @@ modify(char **str, char **ptr, int inbrace)
 		    chabspath(str);
 		    break;
 		case 'A':
-		    chrealpath(str);
+		    chrealpath(str, 'A');
 		    break;
 		case 'c':
 		{
>From 164a90521397ab75e6c57b96e2ec4f7a732de6a9 Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx>
Date: Sat, 21 Mar 2020 18:06:48 +0000
Subject: [PATCH 3/8] chrealpath: Let caller decide how the return value should
 be allocated.

---
 Src/hist.c  | 11 +++++++----
 Src/subst.c |  4 ++--
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/Src/hist.c b/Src/hist.c
index db2cc4ad7..8ab7828e8 100644
--- a/Src/hist.c
+++ b/Src/hist.c
@@ -842,7 +842,7 @@ histsubchar(int c)
 		break;
 
 	    case 'A':
-		if (!chrealpath(&sline, 'A')) {
+		if (!chrealpath(&sline, 'A', 1)) {
 		    herrflush();
 		    zerr("modifier failed: A");
 		    return -1;
@@ -1928,12 +1928,14 @@ chabspath(char **junkptr)
  * If mode is 'A', resolve dot-dot before symlinks.  Else, mode should be 'P'.
  * Refer to the documentation of the :A and :P modifiers for details.
  *
+ * use_heap is 1 if the result is to be allocated on the heap, 0 otherwise.
+ *
  * Return 0 for error, non-zero for success.
  */
 
 /**/
 int
-chrealpath(char **junkptr, char mode)
+chrealpath(char **junkptr, char mode, int use_heap)
 {
     char *str;
 #ifdef HAVE_REALPATH
@@ -2000,14 +2002,15 @@ chrealpath(char **junkptr, char mode)
 	str++;
     }
 
+    use_heap = (use_heap ? META_HEAPDUP : META_DUP);
     if (real) {
-	*junkptr = metafy(str = bicat(real, nonreal), -1, META_HEAPDUP);
+	*junkptr = metafy(str = bicat(real, nonreal), -1, use_heap);
 	zsfree(str);
 #ifdef REALPATH_ACCEPTS_NULL
 	free(real);
 #endif
     } else {
-	*junkptr = metafy(nonreal, lastpos - nonreal + 1, META_HEAPDUP);
+	*junkptr = metafy(nonreal, lastpos - nonreal + 1, use_heap);
     }
 #endif
 
diff --git a/Src/subst.c b/Src/subst.c
index 7b3222d6e..94ddb9ceb 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -4399,7 +4399,7 @@ modify(char **str, char **ptr, int inbrace)
 			chabspath(&copy);
 			break;
 		    case 'A':
-			chrealpath(&copy, 'A');
+			chrealpath(&copy, 'A', 1);
 			break;
 		    case 'c':
 		    {
@@ -4485,7 +4485,7 @@ modify(char **str, char **ptr, int inbrace)
 		    chabspath(str);
 		    break;
 		case 'A':
-		    chrealpath(str, 'A');
+		    chrealpath(str, 'A', 1);
 		    break;
 		case 'c':
 		{
>From 0c733a9eb677fd1c786a37917cbfbdbc088039b5 Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx>
Date: Sat, 21 Mar 2020 18:45:35 +0000
Subject: [PATCH 4/8] Fix segfault on resolving symlink loops

---
 Etc/BUGS          |  3 ++-
 Src/utils.c       |  6 +++---
 Test/D02glob.ztst | 11 +++++------
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/Etc/BUGS b/Etc/BUGS
index 99a0d9753..2501d59a7 100644
--- a/Etc/BUGS
+++ b/Etc/BUGS
@@ -29,5 +29,6 @@ skipped when STTY=... is set for that command
 44007 - Martijn - exit in trap executes rest of function
 See test case in Test/C03traps.ztst.
 ------------------------------------------------------------------------
-45282: ${${:-foo}:P} where foo is a symlink that points to itself segfaults
+45282: xsymlinks() segfaults on symlink loops
+Fixed for some cases; need to audit remaining callers
 ------------------------------------------------------------------------
diff --git a/Src/utils.c b/Src/utils.c
index 339404489..6ad028c6b 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -1029,11 +1029,11 @@ xsymlink(char *s, int heap)
     if (*s != '/')
 	return NULL;
     *xbuf = '\0';
-    if (xsymlinks(s + 1, 1) < 0)
+    if (!chrealpath(&s, 'P', heap)) {
 	zwarn("path expansion failed, using root directory");
-    if (!*xbuf)
 	return heap ? dupstring("/") : ztrdup("/");
-    return heap ? dupstring(xbuf) : ztrdup(xbuf);
+    }
+    return s;
 }
 
 /**/
diff --git a/Test/D02glob.ztst b/Test/D02glob.ztst
index 248cc7ff5..041784310 100644
--- a/Test/D02glob.ztst
+++ b/Test/D02glob.ztst
@@ -690,10 +690,9 @@
  # This is a bit brittle as it depends on PATH_MAX.
  # We could use sysconf..
  bad_pwd="/${(l:16000:: :):-}"
- print ${bad_pwd:P}
+ print ${bad_pwd:P} | wc -c
 0:modifier ':P' with path too long
-?(eval):4: path expansion failed, using root directory
->/
+>16002
 
  foo=a
  value="ac"
@@ -765,7 +764,7 @@
   } always {
     rm -f glob.tmp/trap glob.tmp/loop
   }
--f:the ':P' modifier handles symlink loops in the last path component
+0:the ':P' modifier handles symlink loops in the last path component
 *>*/(trap|loop)
 *>*/(trap|loop)
 
@@ -777,7 +776,7 @@
   } always {
     rm -f glob.tmp/trap glob.tmp/loop
   }
--f:the ':P' modifier handles symlink loops before the last path component
+0:the ':P' modifier handles symlink loops before the last path component
 *>*/glob.tmp/loop/trailing/components
 *>*/glob.tmp/(loop|trap)/trailing/components
 
@@ -789,7 +788,7 @@
   } always {
     rm -f glob.tmp/flip glob.tmp/flop
   }
--f:the ':P' modifier handles symlink loops other than the trivial case
+0:the ':P' modifier handles symlink loops other than the trivial case
 *>*/glob.tmp/(flip|flop)
 *>*/glob.tmp/(flip|flop)/trailing/components
 
>From 03a662dfced2b4161b7c57eef3da3ea2ed3e6c70 Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx>
Date: Sat, 21 Mar 2020 19:04:09 +0000
Subject: [PATCH 5/8] Add a test for bin_whence's symlinks resolution.

---
 Test/B13whence.ztst | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
 create mode 100644 Test/B13whence.ztst

diff --git a/Test/B13whence.ztst b/Test/B13whence.ztst
new file mode 100644
index 000000000..b22363980
--- /dev/null
+++ b/Test/B13whence.ztst
@@ -0,0 +1,22 @@
+%prep
+
+  mkdir whence.tmp
+  pushd whence.tmp
+  ln -s real step3
+  ln -s step3 step2
+  ln -s step2 step1
+  touch real
+  chmod +x real
+  prefix=$PWD
+  popd
+
+%test
+
+  (
+    path=( $PWD/whence.tmp $path )
+    whence -S step1
+    whence -s step1
+  )
+0q:whence symlink resolution
+>$prefix/step1 -> $prefix/step2 -> $prefix/step3 -> $prefix/real
+>$prefix/step1 -> $prefix/real
>From 25f8aa5cad87ab716fb480d50ee40c38609c78ce Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx>
Date: Sat, 21 Mar 2020 19:09:04 +0000
Subject: [PATCH 6/8] Don't use xsymlinks() in 'whence -s'.

---
 Src/utils.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/Src/utils.c b/Src/utils.c
index 6ad028c6b..567df9222 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -910,7 +910,14 @@ slashsplit(char *s)
     return r;
 }
 
-/* expands symlinks and .. or . expressions */
+/* expands symlinks and .. or . expressions
+ *
+ * Puts the result in the global "xbuf"
+ *
+ * If "full" is true, resolve one level of symlinks only.
+ *
+ * WARNING: This will segfault on symlink loops (thread: workers/45282)
+ */
 
 /**/
 static int
@@ -1041,10 +1048,10 @@ void
 print_if_link(char *s, int all)
 {
     if (*s == '/') {
-	*xbuf = '\0';
 	if (all) {
 	    char *start = s + 1;
 	    char xbuflink[PATH_MAX+1];
+	    *xbuf = '\0';
 	    for (;;) {
 		if (xsymlinks(start, 0) > 0) {
 		    printf(" -> ");
@@ -1059,8 +1066,11 @@ print_if_link(char *s, int all)
 		}
 	    }
 	} else {
-	    if (xsymlinks(s + 1, 1) > 0)
-		printf(" -> "), zputs(*xbuf ? xbuf : "/", stdout);
+	    if (chrealpath(&s, 'P', 0)) {
+		printf(" -> ");
+		zputs(*s ? s : "/", stdout);
+		zsfree(s);
+	    }
 	}
     }
 }
>From eab2c99c415cc5ae53bde8c87e6d2e90e77eb482 Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx>
Date: Sat, 21 Mar 2020 19:12:55 +0000
Subject: [PATCH 7/8] Remove code that is now unreachable.

---
 Src/utils.c | 21 ++++-----------------
 1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/Src/utils.c b/Src/utils.c
index 567df9222..25579ba11 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -910,18 +910,16 @@ slashsplit(char *s)
     return r;
 }
 
-/* expands symlinks and .. or . expressions
+/* expands .. or . expressions and one level of symlinks
  *
  * Puts the result in the global "xbuf"
  *
- * If "full" is true, resolve one level of symlinks only.
- *
  * WARNING: This will segfault on symlink loops (thread: workers/45282)
  */
 
 /**/
 static int
-xsymlinks(char *s, int full)
+xsymlinks(char *s)
 {
     char **pp, **opp;
     char xbuf2[PATH_MAX*3+1], xbuf3[PATH_MAX*2+1];
@@ -970,7 +968,7 @@ xsymlinks(char *s, int full)
 	} else {
 	    ret = 1;
 	    metafy(xbuf3, t0, META_NOALLOC);
-	    if (!full) {
+	    {
 		/*
 		 * If only one expansion requested, ensure the
 		 * full path is in xbuf.
@@ -1005,17 +1003,6 @@ xsymlinks(char *s, int full)
 		 */
 		break;
 	    }
-	    if (*xbuf3 == '/') {
-		strcpy(xbuf, "");
-		if (xsymlinks(xbuf3 + 1, 1) < 0)
-		    ret = -1;
-		else
-		    xbuflen = strlen(xbuf);
-	    } else
-		if (xsymlinks(xbuf3, 1) < 0)
-		    ret = -1;
-		else
-		    xbuflen = strlen(xbuf);
 	}
     }
     freearray(opp);
@@ -1053,7 +1040,7 @@ print_if_link(char *s, int all)
 	    char xbuflink[PATH_MAX+1];
 	    *xbuf = '\0';
 	    for (;;) {
-		if (xsymlinks(start, 0) > 0) {
+		if (xsymlinks(start) > 0) {
 		    printf(" -> ");
 		    zputs(*xbuf ? xbuf : "/", stdout);
 		    if (!*xbuf)
>From 34c817f9c51ca22cf643866e3299c58b69074949 Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx>
Date: Sat, 21 Mar 2020 19:16:17 +0000
Subject: [PATCH 8/8] Extend tests to prove that what remains of xsymlinks()
 handles symlink loops gracefully.

---
 Etc/BUGS            | 3 ---
 Src/utils.c         | 2 --
 Test/B13whence.ztst | 9 +++++++++
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/Etc/BUGS b/Etc/BUGS
index 2501d59a7..8112299f5 100644
--- a/Etc/BUGS
+++ b/Etc/BUGS
@@ -29,6 +29,3 @@ skipped when STTY=... is set for that command
 44007 - Martijn - exit in trap executes rest of function
 See test case in Test/C03traps.ztst.
 ------------------------------------------------------------------------
-45282: xsymlinks() segfaults on symlink loops
-Fixed for some cases; need to audit remaining callers
-------------------------------------------------------------------------
diff --git a/Src/utils.c b/Src/utils.c
index 25579ba11..634470476 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -913,8 +913,6 @@ slashsplit(char *s)
 /* expands .. or . expressions and one level of symlinks
  *
  * Puts the result in the global "xbuf"
- *
- * WARNING: This will segfault on symlink loops (thread: workers/45282)
  */
 
 /**/
diff --git a/Test/B13whence.ztst b/Test/B13whence.ztst
index b22363980..ea0a4dae5 100644
--- a/Test/B13whence.ztst
+++ b/Test/B13whence.ztst
@@ -5,6 +5,9 @@
   ln -s real step3
   ln -s step3 step2
   ln -s step2 step1
+  ln -s loop loop
+  ln -s flip flop
+  ln -s flop flip
   touch real
   chmod +x real
   prefix=$PWD
@@ -20,3 +23,9 @@
 0q:whence symlink resolution
 >$prefix/step1 -> $prefix/step2 -> $prefix/step3 -> $prefix/real
 >$prefix/step1 -> $prefix/real
+
+  (
+    path=( $PWD/whence.tmp $path )
+    whence -S flip || whence -S loop || whence -s flip || whence -s loop
+  )
+1:whence deals with symlink loops gracefully


Messages sorted by: Reverse Date, Date, Thread, Author