Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [PATCH] Enable sub-second timeout in zsystem flock



	Hello,

About the patch I submitted last month enabling fractional seconds for
the timeout in zsystem flock, there was the following outstanding issue.


Daniel Shahaf (Sunday 2020-03-15):
> How likely is zsh 5.9 to be ported to a platform that doesn't have a
> 64-bit type?
> 
> If that's a concern, I'd recommend to issue a warning or disable the
> feature when we detect that case.
> 
> I'm not sure whether that's a concern.

If it is, here is a reimplementation of the patch that doesn't depend
on a 64-bit type.  Instead of counting time in microseconds, I've used
struct timespec and a function similar to zgettime() but that uses
CLOCK_MONOTONIC if available instead of CLOCK_REALTIME.

(I haven't managed to avoid some code duplication, because a common
function would require choosing between the two clocks, thus need a
parameter of type clockid_t, which is not always defined.  So I don't
know how to write the prototype of a common function.)

Another difference is that the timeout length is now limited to 2^30-1
seconds (34 years) so as to avoid overflowing time_t on 32-bit systems.
There is one case where it may overflow, but it can't be avoided: if
time_t is 32-bit signed and CLOCK_MONOTONIC isn't available or returns
the wall clock instead of e.g. the uptime.  Then, specifying a timeout
beyond year 2038 won't work, but then the system has a Y2038 problem
anyway.  I don't see how to avoid it short of checking the timeout
against Y2038, which may actually create a Y2038 problem on systems
that wouldn't otherwise have it.

Also, the retry interval is now limited to 0.999*LONG_MAX microseconds;
the 0.999 avoids a rounding problem that causes an overflow on 64-bit
systems (the value is given as a double, but a double can't represent
2^63-1 exactly).

I think that's all.  I believe I checked all code paths, though I don't
know how to do it systematically.

					Thanks, best regards,
					Cedric Ware.


diff --git a/Doc/Zsh/mod_system.yo b/Doc/Zsh/mod_system.yo
index 6292af071..be9a99f62 100644
--- a/Doc/Zsh/mod_system.yo
+++ b/Doc/Zsh/mod_system.yo
@@ -166,7 +166,7 @@ to the command, or 2 for an error on the write; no error message is
 printed in the last case, but the parameter tt(ERRNO) will reflect
 the error that occurred.
 )
-xitem(tt(zsystem flock) [ tt(-t) var(timeout) ] [ tt(-f) var(var) ] [tt(-er)] var(file))
+xitem(tt(zsystem flock) [ tt(-t) var(timeout) ] [ tt(-i) var(interval) ] [ tt(-f) var(var) ] [tt(-er)] var(file))
 item(tt(zsystem flock -u) var(fd_expr))(
 The builtin tt(zsystem)'s subcommand tt(flock) performs advisory file
 locking (via the manref(fcntl)(2) system call) over the entire contents
@@ -196,9 +196,16 @@ a safety check that the file descriptor is in use for file locking.
 
 By default the shell waits indefinitely for the lock to succeed.
 The option tt(-t) var(timeout) specifies a timeout for the lock in
-seconds; currently this must be an integer.  The shell will attempt
-to lock the file once a second during this period.  If the attempt
-times out, status 2 is returned.
+seconds; fractional seconds are allowed.  During this period, the
+shell will attempt to lock the file every var(interval) seconds
+if the tt(-i) var(interval) option is given, otherwise once a second.
+(This var(interval) is shortened before the last attempt if needed,
+so that the shell waits only until the var(timeout) and not longer.)
+If the attempt times out, status 2 is returned.
+
+(Note: var(timeout) must be less than 2^30-1 seconds (about 34 years);
+and var(interval) must be less than 0.999 * LONG_MAX microseconds
+(only about 35 minutes on 32-bit systems).)
 
 If the option tt(-e) is given, the file descriptor for the lock is
 preserved when the shell uses tt(exec) to start a new process;
diff --git a/Src/Modules/system.c b/Src/Modules/system.c
index fb3d80773..a60c18f5d 100644
--- a/Src/Modules/system.c
+++ b/Src/Modules/system.c
@@ -29,6 +29,7 @@
 
 #include "system.mdh"
 #include "system.pro"
+#include <math.h>
 
 #ifdef HAVE_POLL_H
 # include <poll.h>
@@ -531,7 +532,9 @@ static int
 bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func))
 {
     int cloexec = 1, unlock = 0, readlock = 0;
-    zlong timeout = -1;
+    double timeout = -1;
+    long timeout_interval = 1e6;
+    mnumber timeout_param;
     char *fdvar = NULL;
 #ifdef HAVE_FCNTL_H
     struct flock lck;
@@ -583,7 +586,51 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func))
 		} else {
 		    optarg = *args++;
 		}
-		timeout = mathevali(optarg);
+		timeout_param = matheval(optarg);
+		timeout = (timeout_param.type & MN_FLOAT)?
+		    timeout_param.u.d : (double)timeout_param.u.l;
+
+		/*
+		 * timeout must not overflow time_t, but little is known
+		 * about this type's limits.  Conservatively limit to 2^30-1
+		 * (34 years).  Then it can only overflow if time_t is only
+		 * a 32-bit int and CLOCK_MONOTONIC is not supported, in which
+		 * case there is a Y2038 problem anyway.
+		 */
+		if ((timeout < 1e-6) || (timeout > 1073741823.)) {
+		    zwarnnam(nam, "flock: invalid timeout value: '%s'",
+			     optarg);
+		    return 1;
+		}
+		break;
+
+	    case 'i':
+		/* retry interval in seconds */
+		if (optptr[1]) {
+		    optarg = optptr + 1;
+		    optptr += strlen(optarg) - 1;
+		} else if (!*args) {
+		    zwarnnam(nam,
+			     "flock: option %c requires "
+			     "a numeric retry interval",
+			     opt);
+		    return 1;
+		} else {
+		    optarg = *args++;
+		}
+		timeout_param = matheval(optarg);
+		if (!(timeout_param.type & MN_FLOAT)) {
+		    timeout_param.type = MN_FLOAT;
+		    timeout_param.u.d = (double)timeout_param.u.l;
+		}
+		timeout_param.u.d *= 1e6;
+		if ((timeout_param.u.d < 1)
+		    || (timeout_param.u.d > 0.999 * LONG_MAX)) {
+		    zwarnnam(nam, "flock: invalid interval value: '%s'",
+			     optarg);
+		    return 1;
+		}
+		timeout_interval = (long)timeout_param.u.d;
 		break;
 
 	    case 'u':
@@ -647,7 +694,24 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func))
     lck.l_len = 0;  /* lock the whole file */
 
     if (timeout > 0) {
-	time_t end = time(NULL) + (time_t)timeout;
+	/*
+	 * Get current time, calculate timeout time.
+	 * No need to check for overflow, already checked above.
+	 */
+	struct timespec now, end;
+	double timeout_s;
+	long remaining_us;
+	zgettime_monotonic_if_available(&now);
+	end.tv_sec = now.tv_sec;
+	end.tv_nsec = now.tv_nsec;
+	end.tv_nsec += modf(timeout, &timeout_s) * 1000000000L;
+	end.tv_sec += timeout_s;
+	if (end.tv_nsec >= 1000000000L) {
+	    end.tv_nsec -= 1000000000L;
+	    end.tv_sec += 1;
+	}
+
+	/* Try acquiring lock, loop until timeout. */
 	while (fcntl(flock_fd, F_SETLK, &lck) < 0) {
 	    if (errflag) {
                 zclose(flock_fd);
@@ -658,11 +722,16 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func))
 		zwarnnam(nam, "failed to lock file %s: %e", args[0], errno);
 		return 1;
 	    }
-	    if (time(NULL) >= end) {
+	    zgettime_monotonic_if_available(&now);
+	    remaining_us = timespec_diff_us(&now, &end);
+	    if (remaining_us <= 0) {
                 zclose(flock_fd);
 		return 2;
             }
-	    sleep(1);
+	    if (remaining_us <= timeout_interval) {
+		timeout_interval = remaining_us;
+	    }
+	    zsleep(timeout_interval);
 	}
     } else {
 	while (fcntl(flock_fd, timeout == 0 ? F_SETLK : F_SETLKW, &lck) < 0) {
diff --git a/Src/compat.c b/Src/compat.c
index 74e426fba..817bb4aaf 100644
--- a/Src/compat.c
+++ b/Src/compat.c
@@ -126,6 +126,32 @@ zgettime(struct timespec *ts)
     return ret;
 }
 
+/* Likewise with CLOCK_MONOTONIC if available. */
+
+/**/
+mod_export int
+zgettime_monotonic_if_available(struct timespec *ts)
+{
+    int ret = -1;
+
+#if defined(HAVE_CLOCK_GETTIME) && defined(CLOCK_MONOTONIC)
+    struct timespec dts;
+    if (clock_gettime(CLOCK_MONOTONIC, &dts) < 0) {
+	zwarn("unable to retrieve CLOCK_MONOTONIC time: %e", errno);
+	ret--;
+    } else {
+	ret++;
+	ts->tv_sec = (time_t) dts.tv_sec;
+	ts->tv_nsec = (long) dts.tv_nsec;
+    }
+#endif
+
+    if (ret) {
+	ret = zgettime(ts);
+    }
+    return ret;
+}
+
 
 /* compute the difference between two calendar times */
 
diff --git a/Src/utils.c b/Src/utils.c
index 69885fed3..db5626830 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -2748,6 +2748,42 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
     return (ret > 0);
 }
 
+/*
+ * Return the difference between 2 times, given as struct timespec*,
+ * expressed in microseconds, as a long.  If the difference doesn't fit
+ * into a long, return LONG_MIN or LONG_MAX so that the times can still
+ * be compared.
+ *
+ * Note: returns a long rather than a zlong because zsleep() below
+ * takes a long.
+ */
+
+/**/
+long
+timespec_diff_us(const struct timespec *t1, const struct timespec *t2)
+{
+    int reverse = (t1->tv_sec > t2->tv_sec);
+    time_t diff_sec;
+    long diff_usec, max_margin, res;
+
+    /* Don't just subtract t2-t1 because time_t might be unsigned. */
+    diff_sec = (reverse? t1->tv_sec - t2->tv_sec : t2->tv_sec - t1->tv_sec);
+    if (diff_sec > LONG_MAX / 1000000L) {
+	goto overflow;
+    }
+    res = diff_sec * 1000000L;
+    max_margin = LONG_MAX - res;
+    diff_usec = (reverse?
+		 t1->tv_nsec - t2->tv_nsec : t2->tv_nsec - t1->tv_nsec
+		 ) / 1000;
+    if (diff_usec <= max_margin) {
+	res += diff_usec;
+	return (reverse? -res : res);
+    }
+ overflow:
+    return (reverse? LONG_MIN : LONG_MAX);
+}
+
 /*
  * Sleep for the given number of microseconds --- must be within
  * range of a long at the moment, but this is only used for
diff --git a/Test/V14system.ztst b/Test/V14system.ztst
index e69de29bb..79a2c0a77 100644
--- a/Test/V14system.ztst
+++ b/Test/V14system.ztst
@@ -0,0 +1,150 @@
+# Test zsh/system module
+
+%prep
+
+  if zmodload -s zsh/system && zmodload -s zsh/zselect; then
+    tst_dir=V13.tmp
+    mkdir -p -- $tst_dir
+  else
+    ZTST_unimplemented='the zsh/system and zsh/zselect modules are not available'
+  fi
+  : > $tst_dir/file # File on which to acquire flock.
+
+%test
+
+  (
+    zsystem flock -t 0.1 -i 0.000001 $tst_dir/file
+  )
+0:zsystem flock valid time arguments
+
+  (
+    zsystem flock -t -1         $tst_dir/file ||
+    zsystem flock -t 0.49e-6    $tst_dir/file ||
+    zsystem flock -t 1073741824 $tst_dir/file ||
+    zsystem flock -t 1e100      $tst_dir/file ||
+    zsystem flock -i -1         $tst_dir/file ||
+    zsystem flock -i 0.49e-6    $tst_dir/file ||
+    zsystem flock -i 1e100      $tst_dir/file
+  )
+1:zsystem flock invalid time arguments
+?(eval):zsystem:2: flock: invalid timeout value: '-1'
+?(eval):zsystem:3: flock: invalid timeout value: '0.49e-6'
+?(eval):zsystem:4: flock: invalid timeout value: '1073741824'
+?(eval):zsystem:5: flock: invalid timeout value: '1e100'
+?(eval):zsystem:6: flock: invalid interval value: '-1'
+?(eval):zsystem:7: flock: invalid interval value: '0.49e-6'
+?(eval):zsystem:8: flock: invalid interval value: '1e100'
+
+  (
+    # Lock file for 1 second in the background.
+    lock_flag=$tst_dir/locked1
+    (zsystem flock $tst_dir/file \
+     && touch $lock_flag \
+     && zselect -t 100
+     mv $lock_flag $lock_flag.done) &
+    # Wait until sub-shell above has started.
+    while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do
+      zselect -t 1
+    done
+    if [[ -f $lock_flag.done ]]; then
+      echo "Background shell should not have completed already." 1>&2
+    else
+      # Attempt to lock file with 0.5 second timeout: must fail.
+      zsystem flock -t 0.5 $tst_dir/file
+    fi
+  )
+2:zsystem flock unsuccessful wait test
+F:This timing test might fail due to process scheduling issues unrelated to zsh.
+
+  (
+    # Lock file for 0.5 second in the background.
+    lock_flag=$tst_dir/locked2
+    (zsystem flock $tst_dir/file \
+      && touch $lock_flag \
+      && zselect -t 50
+      mv $lock_flag $lock_flag.done) &
+    # Wait until sub-shell above has started.
+    while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do
+      zselect -t 1
+    done
+    if [[ -f $lock_flag.done ]]; then
+      echo "Background shell should not have completed already." 1>&2
+    fi
+    typeset -F SECONDS
+    start=$SECONDS
+    # Attempt to lock file without a timeout:
+    # must succeed after sub-shell above releases it (0.5 second).
+    if zsystem flock $tst_dir/file; then
+      elapsed=$[ $SECONDS - $start ]
+      if [[ $elapsed -ge 0.3 && $elapsed -le 0.7 ]]; then
+        echo "elapsed time seems OK" 1>&2
+      else
+        echo "elapsed time $elapsed should be ~ 0.5 second" 1>&2
+      fi
+    fi
+  )
+0:zsystem flock successful wait test, no timeout
+?elapsed time seems OK
+F:This timing test might fail due to process scheduling issues unrelated to zsh.
+
+  (
+    # Lock file for 0.5 second in the background.
+    lock_flag=$tst_dir/locked3
+    (zsystem flock $tst_dir/file \
+      && touch $lock_flag \
+      && zselect -t 50
+      mv $lock_flag $lock_flag.done) &
+    # Wait until sub-shell above has started.
+    while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do
+      zselect -t 1
+    done
+    if [[ -f $lock_flag.done ]]; then
+      echo "Background shell should not have completed already." 1>&2
+    fi
+    typeset -F SECONDS
+    start=$SECONDS
+    # Attempt to lock file with 1-second timeout:
+    # must succeed 1 second after start because we retry every 1 second.
+    if zsystem flock -t 1 $tst_dir/file; then
+      elapsed=$[ $SECONDS - $start ]
+      if [[ $elapsed -ge 0.8 && $elapsed -le 1.2 ]]; then
+        echo "elapsed time seems OK" 1>&2
+      else
+        echo "elapsed time $elapsed should be ~ 1 second" 1>&2
+      fi
+    fi
+  )
+0:zsystem flock successful wait test, integral seconds
+?elapsed time seems OK
+F:This timing test might fail due to process scheduling issues unrelated to zsh.
+
+  (
+    # Lock file for 0.25 second in the background.
+    lock_flag=$tst_dir/locked4
+    (zsystem flock $tst_dir/file \
+      && touch $lock_flag \
+      && zselect -t 25
+      mv $lock_flag $lock_flag.done) &
+    # Wait until sub-shell above has started.
+    while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do
+      zselect -t 1
+    done
+    if [[ -f $lock_flag.done ]]; then
+      echo "Background shell should not have completed already." 1>&2
+    fi
+    typeset -F SECONDS
+    start=$SECONDS
+    # Attempt to lock file with 0.4-second timeout, retrying every 0.1 second:
+    # must succeed 0.3 second after start.
+    if zsystem flock -t 0.4 -i 0.1 $tst_dir/file; then
+      elapsed=$[ $SECONDS - $start ]
+      if [[ $elapsed -ge 0.2 && $elapsed -le 0.5 ]]; then
+        echo "elapsed time seems OK" 1>&2
+      else
+        echo "elapsed time $elapsed should be ~ 0.3 second" 1>&2
+      fi
+    fi
+  )
+0:zsystem flock successful wait test, fractional seconds
+?elapsed time seems OK
+F:This timing test might fail due to process scheduling issues unrelated to zsh.



Messages sorted by: Reverse Date, Date, Thread, Author