Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: zsh got stuck without any message because of history lock file

On Sep 6, 11:03am, Mikael Magnusson wrote:
} > I'm not 100% sure it's because of this change, but recently, I've had
} > new terminals stuck forever on startup until I exit all open terminals
} > logged in as that user (or possibly just a specific one of them). It
} > happens very spuriously, maybe a few times per week. I have
} > histfcntllock set. On one occasion I tried moving the .history file
} > instead of exiting, and this also immediately unstuck all new
} > terminals.
} I'm still seeing reports of this every now and then in #zsh (and I
} have the option unset so I'm not sure if it happens to me anymore.) If
} nobody fixed this, it's probably still a problem.

So, a few things.

The unlockhistfile() operation assumes that closing the flock_fd is
enough to release the lock.  I can't quote chapter and verse for this
from e.g. the POSIX spec, but I believe that to be correct.

However, lockhistfile() does not call flockhistfile() when it has
already been locked (flock_fd >= 0), and in that case it falls through
to file-based locking again.  This isn't a problem except when hend()
is deciding whether to call savehistfile(); it only considers whether
the file is already locked in the case of SHARE_HISTORY, but may call
savehistfile() for both SHARE_HISTORY and INC_APPEND_HISTORY et al.,
and savehistfile() calls lockhistfile() again.

Additionally, lockhistfile() only increments the lockhistct counter
when file locking, not when fcntl locking, which leads to an incorrect
answer from histfileIsLocked().

So I think there's a potential problem where INC_APPEND_HISTORY could
apply both locking mechanisms at different times, leading to a race.

These were all assumptions about both kinds of locking being applied
that I missed before.  The "better performance" from HIST_FCNTL_LOCK
I guess just mean that it would fail faster on permission problems or
lock collisions, not that it would improve performance overall.

We can either apply the additional patch below, or back out some of
the last one to restore the "always locks both ways" behavior.

diff --git a/Src/hist.c b/Src/hist.c
index 770d559..d29a65a 100644
--- a/Src/hist.c
+++ b/Src/hist.c
@@ -2490,6 +2490,9 @@ flockhistfile(char *fn, int keep_trying)
     struct flock lck;
     int ctr = keep_trying ? 9 : 0;
+    if (flock_fd >= 0)
+	return 0; /* already locked */
     if ((flock_fd = open(unmeta(fn), O_RDWR | O_NOCTTY)) < 0)
 	return errno == ENOENT ? 0 : 2; /* "successfully" locked missing file */
@@ -2768,12 +2771,6 @@ lockhistfile(char *fn, int keep_trying)
     if (!fn && !(fn = getsparam("HISTFILE")))
 	return 1;
-#ifdef HAVE_FCNTL_H
-    if (isset(HISTFCNTLLOCK) && flock_fd < 0) {
-	return flockhistfile(fn, keep_trying);
-    }
     if (!lockhistct++) {
 	struct stat sb;
 	int fd;
@@ -2786,6 +2783,11 @@ lockhistfile(char *fn, int keep_trying)
 # endif
+#ifdef HAVE_FCNTL_H
+	if (isset(HISTFCNTLLOCK))
+	    return flockhistfile(fn, keep_trying);
 	lockfile = bicat(unmeta(fn), ".LOCK");
 	/* NOTE: only use symlink locking on a link()-having host in order to
 	 * avoid a change from open()-based locking to symlink()-based. */

Messages sorted by: Reverse Date, Date, Thread, Author