Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: zsh got stuck without any message because of history lock file



On Apr 24, 10:02am, Bart Schaefer wrote:
}
} The way the code is currently written the shell will wait until 10
} seconds after the modification time stamp of the file.  If for some
} reason the time stamp is wrong (clocks out of sync with NFS server,
} for example) then it may end up waiting for longer.

So ... here is a patch that checks for future timestamps.  If it finds
one, it leaves errno set to EEXIST, which results in:

zsh: locking failed for /home/schaefer/.zhistory: file exists

This isn't perfect, but at least it provides a clue.

There's a controversial bit in this patch:  It appears HIST_FCNTL_LOCK
may have been broken ever since workers/28047 in 2010.  That patch
changed flockhistfile() [which is misnamed, it uses fcntl] to return 0
on success, but then also this:

	ret = flockhistfile(fn, keep_trying);
	if (ret)
	    return ret;

That means that we only stop when flockhistfile() FAILS.  On success,
we go on and lock with BOTH fcntl AND a .LOCK file.  This completely
contradicts the documentation claim that "this may provide better
performance."  So the second hunk below makes flockhistfile() the
only lock applied when HIST_FCNTL_LOCK is set.

However, this would mean that every zsh that is accessing the same
HISTFILE has to be using the same locking options.  Is that OK?


diff --git a/Src/hist.c b/Src/hist.c
index 1624912..1182994 100644
--- a/Src/hist.c
+++ b/Src/hist.c
@@ -2720,6 +2720,25 @@ savehistfile(char *fn, int err, int writeflags)
 
 static int lockhistct;
 
+static int
+checklocktime(char *lockfile, time_t then)
+{
+    time_t now = time(NULL);
+
+    if (now + 10 < then) {
+	/* File is more than 10 seconds in the future? */
+	errno = EEXIST;
+	return -1;
+    }
+
+    if (now - then < 10)
+	sleep(1);
+    else
+	unlink(lockfile);
+
+    return 0;
+}
+
 /*
  * Lock history file.  Return 0 on success, 1 on failure to lock this
  * time, 2 on permanent failure (e.g. permission).
@@ -2737,9 +2756,7 @@ lockhistfile(char *fn, int keep_trying)
 
 #ifdef HAVE_FCNTL_H
     if (isset(HISTFCNTLLOCK) && flock_fd < 0) {
-	ret = flockhistfile(fn, keep_trying);
-	if (ret)
-	    return ret;
+	return flockhistfile(fn, keep_trying);
     }
 #endif
 
@@ -2776,10 +2793,10 @@ lockhistfile(char *fn, int keep_trying)
 		    continue;
 		break;
 	    }
-	    if (time(NULL) - sb.st_mtime < 10)
-		sleep(1);
-	    else
-		unlink(lockfile);
+	    if (checklocktime(lockfile, sb.st_mtime) < 0) {
+		ret = 1;
+		break;
+	    }
 	}
 	if (fd < 0)
 	    lockhistct--;
@@ -2804,10 +2821,10 @@ lockhistfile(char *fn, int keep_trying)
 			continue;
 		    ret = 2;
 		} else {
-		    if (time(NULL) - sb.st_mtime < 10)
-			sleep(1);
-		    else
-			unlink(lockfile);
+		    if (checklocktime(lockfile, sb.st_mtime) < 0) {
+			ret = 1;
+			break;
+		    }
 		    continue;
 		}
 		lockhistct--;
@@ -2832,10 +2849,10 @@ lockhistfile(char *fn, int keep_trying)
 		ret = 2;
 		break;
 	    }
-	    if (time(NULL) - sb.st_mtime < 10)
-		sleep(1);
-	    else
-		unlink(lockfile);
+	    if (checklocktime(lockfile, sb.st_mtime) < 0) {
+		ret = 1;
+		break;
+	    }
 	}
 	if (fd < 0)
 	    lockhistct--;



Messages sorted by: Reverse Date, Date, Thread, Author