Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [PATCH] Enable sub-second timeout in zsystem flock



Cedric Ware wrote on Mon, 13 Apr 2020 23:34 +0200:
> 	Hello,
> 
> About the patch I submitted last month enabling fractional seconds for
> the timeout in zsystem flock, there was the following outstanding issue.
> 
> 
> Daniel Shahaf (Sunday 2020-03-15):
> > How likely is zsh 5.9 to be ported to a platform that doesn't have a
> > 64-bit type?
> > 
> > If that's a concern, I'd recommend to issue a warning or disable the
> > feature when we detect that case.
> > 
> > I'm not sure whether that's a concern.  
> 
> If it is, here is a reimplementation of the patch that doesn't depend
> on a 64-bit type.  Instead of counting time in microseconds, I've used
> struct timespec and a function similar to zgettime() but that uses
> CLOCK_MONOTONIC if available instead of CLOCK_REALTIME.
> 

Wow.  Thank you very much for the rewrite, Cedric.  I'm afraid I wasn't
clear, though.  All I wanted to say was: (1) _If_ zsh 5.9 is likely to
be ported to a 64-bit-type-less platform, we should add a preprocessor
check to catch that case (at either configure- or compile-time);
(2) I don't know what's the probability of the antecedent.  I didn't
mean to ask for a rewrite, not in the least — only for a quick,
three-liner «#if» block — but that's water under the bridge now.
Thanks for going the extra mile :-)

Cheers,

Daniel


> (I haven't managed to avoid some code duplication, because a common
> function would require choosing between the two clocks, thus need a
> parameter of type clockid_t, which is not always defined.  So I don't
> know how to write the prototype of a common function.)
> 
> Another difference is that the timeout length is now limited to 2^30-1
> seconds (34 years) so as to avoid overflowing time_t on 32-bit systems.
> There is one case where it may overflow, but it can't be avoided: if
> time_t is 32-bit signed and CLOCK_MONOTONIC isn't available or returns
> the wall clock instead of e.g. the uptime.  Then, specifying a timeout
> beyond year 2038 won't work, but then the system has a Y2038 problem
> anyway.  I don't see how to avoid it short of checking the timeout
> against Y2038, which may actually create a Y2038 problem on systems
> that wouldn't otherwise have it.
> 
> Also, the retry interval is now limited to 0.999*LONG_MAX microseconds;
> the 0.999 avoids a rounding problem that causes an overflow on 64-bit
> systems (the value is given as a double, but a double can't represent
> 2^63-1 exactly).
> 
> I think that's all.  I believe I checked all code paths, though I don't
> know how to do it systematically.
> 
> 					Thanks, best regards,
> 					Cedric Ware.
> 
> 
> diff --git a/Doc/Zsh/mod_system.yo b/Doc/Zsh/mod_system.yo
> index 6292af071..be9a99f62 100644
> --- a/Doc/Zsh/mod_system.yo
> +++ b/Doc/Zsh/mod_system.yo
> @@ -166,7 +166,7 @@ to the command, or 2 for an error on the write; no error message is
>  printed in the last case, but the parameter tt(ERRNO) will reflect
>  the error that occurred.
>  )
> -xitem(tt(zsystem flock) [ tt(-t) var(timeout) ] [ tt(-f) var(var) ] [tt(-er)] var(file))
> +xitem(tt(zsystem flock) [ tt(-t) var(timeout) ] [ tt(-i) var(interval) ] [ tt(-f) var(var) ] [tt(-er)] var(file))
>  item(tt(zsystem flock -u) var(fd_expr))(
>  The builtin tt(zsystem)'s subcommand tt(flock) performs advisory file
>  locking (via the manref(fcntl)(2) system call) over the entire contents
> @@ -196,9 +196,16 @@ a safety check that the file descriptor is in use for file locking.
>  
>  By default the shell waits indefinitely for the lock to succeed.
>  The option tt(-t) var(timeout) specifies a timeout for the lock in
> -seconds; currently this must be an integer.  The shell will attempt
> -to lock the file once a second during this period.  If the attempt
> -times out, status 2 is returned.
> +seconds; fractional seconds are allowed.  During this period, the
> +shell will attempt to lock the file every var(interval) seconds
> +if the tt(-i) var(interval) option is given, otherwise once a second.
> +(This var(interval) is shortened before the last attempt if needed,
> +so that the shell waits only until the var(timeout) and not longer.)
> +If the attempt times out, status 2 is returned.
> +
> +(Note: var(timeout) must be less than 2^30-1 seconds (about 34 years);
> +and var(interval) must be less than 0.999 * LONG_MAX microseconds
> +(only about 35 minutes on 32-bit systems).)
>  
>  If the option tt(-e) is given, the file descriptor for the lock is
>  preserved when the shell uses tt(exec) to start a new process;
> diff --git a/Src/Modules/system.c b/Src/Modules/system.c
> index fb3d80773..a60c18f5d 100644
> --- a/Src/Modules/system.c
> +++ b/Src/Modules/system.c
> @@ -29,6 +29,7 @@
>  
>  #include "system.mdh"
>  #include "system.pro"
> +#include <math.h>
>  
>  #ifdef HAVE_POLL_H
>  # include <poll.h>
> @@ -531,7 +532,9 @@ static int
>  bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func))
>  {
>      int cloexec = 1, unlock = 0, readlock = 0;
> -    zlong timeout = -1;
> +    double timeout = -1;
> +    long timeout_interval = 1e6;
> +    mnumber timeout_param;
>      char *fdvar = NULL;
>  #ifdef HAVE_FCNTL_H
>      struct flock lck;
> @@ -583,7 +586,51 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func))
>  		} else {  
>  		    optarg = *args++;
>  		}  
> -		timeout = mathevali(optarg);
> +		timeout_param = matheval(optarg);
> +		timeout = (timeout_param.type & MN_FLOAT)?
> +		    timeout_param.u.d : (double)timeout_param.u.l;
> +
> +		/*
> +		 * timeout must not overflow time_t, but little is known
> +		 * about this type's limits.  Conservatively limit to 2^30-1
> +		 * (34 years).  Then it can only overflow if time_t is only
> +		 * a 32-bit int and CLOCK_MONOTONIC is not supported, in which
> +		 * case there is a Y2038 problem anyway.
> +		 */
> +		if ((timeout < 1e-6) || (timeout > 1073741823.)) {
> +		    zwarnnam(nam, "flock: invalid timeout value: '%s'",
> +			     optarg);
> +		    return 1;
> +		}
> +		break;
> +
> +	    case 'i':
> +		/* retry interval in seconds */
> +		if (optptr[1]) {
> +		    optarg = optptr + 1;
> +		    optptr += strlen(optarg) - 1;
> +		} else if (!*args) {
> +		    zwarnnam(nam,
> +			     "flock: option %c requires "
> +			     "a numeric retry interval",
> +			     opt);
> +		    return 1;
> +		} else {
> +		    optarg = *args++;
> +		}
> +		timeout_param = matheval(optarg);
> +		if (!(timeout_param.type & MN_FLOAT)) {
> +		    timeout_param.type = MN_FLOAT;
> +		    timeout_param.u.d = (double)timeout_param.u.l;
> +		}
> +		timeout_param.u.d *= 1e6;
> +		if ((timeout_param.u.d < 1)
> +		    || (timeout_param.u.d > 0.999 * LONG_MAX)) {
> +		    zwarnnam(nam, "flock: invalid interval value: '%s'",
> +			     optarg);
> +		    return 1;
> +		}
> +		timeout_interval = (long)timeout_param.u.d;
>  		break;
>  
>  	    case 'u':
> @@ -647,7 +694,24 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func))
>      lck.l_len = 0;  /* lock the whole file */
>  
>      if (timeout > 0) {
> -	time_t end = time(NULL) + (time_t)timeout;
> +	/*
> +	 * Get current time, calculate timeout time.
> +	 * No need to check for overflow, already checked above.
> +	 */
> +	struct timespec now, end;
> +	double timeout_s;
> +	long remaining_us;
> +	zgettime_monotonic_if_available(&now);
> +	end.tv_sec = now.tv_sec;
> +	end.tv_nsec = now.tv_nsec;
> +	end.tv_nsec += modf(timeout, &timeout_s) * 1000000000L;
> +	end.tv_sec += timeout_s;
> +	if (end.tv_nsec >= 1000000000L) {
> +	    end.tv_nsec -= 1000000000L;
> +	    end.tv_sec += 1;
> +	}
> +
> +	/* Try acquiring lock, loop until timeout. */
>  	while (fcntl(flock_fd, F_SETLK, &lck) < 0) {
>  	    if (errflag) {
>                  zclose(flock_fd);
> @@ -658,11 +722,16 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func))
>  		zwarnnam(nam, "failed to lock file %s: %e", args[0], errno);
>  		return 1;
>  	    }  
> -	    if (time(NULL) >= end) {
> +	    zgettime_monotonic_if_available(&now);
> +	    remaining_us = timespec_diff_us(&now, &end);
> +	    if (remaining_us <= 0) {
>                  zclose(flock_fd);
>  		return 2;
>              }  
> -	    sleep(1);
> +	    if (remaining_us <= timeout_interval) {
> +		timeout_interval = remaining_us;
> +	    }
> +	    zsleep(timeout_interval);
>  	}
>      } else {  
>  	while (fcntl(flock_fd, timeout == 0 ? F_SETLK : F_SETLKW, &lck) < 0) {
> diff --git a/Src/compat.c b/Src/compat.c
> index 74e426fba..817bb4aaf 100644
> --- a/Src/compat.c
> +++ b/Src/compat.c
> @@ -126,6 +126,32 @@ zgettime(struct timespec *ts)
>      return ret;
>  }  
>  
> +/* Likewise with CLOCK_MONOTONIC if available. */
> +
> +/**/
> +mod_export int
> +zgettime_monotonic_if_available(struct timespec *ts)
> +{
> +    int ret = -1;
> +
> +#if defined(HAVE_CLOCK_GETTIME) && defined(CLOCK_MONOTONIC)
> +    struct timespec dts;
> +    if (clock_gettime(CLOCK_MONOTONIC, &dts) < 0) {
> +	zwarn("unable to retrieve CLOCK_MONOTONIC time: %e", errno);
> +	ret--;
> +    } else {
> +	ret++;
> +	ts->tv_sec = (time_t) dts.tv_sec;
> +	ts->tv_nsec = (long) dts.tv_nsec;
> +    }
> +#endif
> +
> +    if (ret) {
> +	ret = zgettime(ts);
> +    }
> +    return ret;
> +}  
> +
>  
>  /* compute the difference between two calendar times */
>  
> diff --git a/Src/utils.c b/Src/utils.c
> index 69885fed3..db5626830 100644
> --- a/Src/utils.c
> +++ b/Src/utils.c
> @@ -2748,6 +2748,42 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
>      return (ret > 0);
>  }  
>  
> +/*
> + * Return the difference between 2 times, given as struct timespec*,
> + * expressed in microseconds, as a long.  If the difference doesn't fit
> + * into a long, return LONG_MIN or LONG_MAX so that the times can still
> + * be compared.
> + *
> + * Note: returns a long rather than a zlong because zsleep() below
> + * takes a long.
> + */
> +
> +/**/
> +long
> +timespec_diff_us(const struct timespec *t1, const struct timespec *t2)
> +{
> +    int reverse = (t1->tv_sec > t2->tv_sec);
> +    time_t diff_sec;
> +    long diff_usec, max_margin, res;
> +
> +    /* Don't just subtract t2-t1 because time_t might be unsigned. */
> +    diff_sec = (reverse? t1->tv_sec - t2->tv_sec : t2->tv_sec - t1->tv_sec);
> +    if (diff_sec > LONG_MAX / 1000000L) {
> +	goto overflow;
> +    }
> +    res = diff_sec * 1000000L;
> +    max_margin = LONG_MAX - res;
> +    diff_usec = (reverse?
> +		 t1->tv_nsec - t2->tv_nsec : t2->tv_nsec - t1->tv_nsec
> +		 ) / 1000;
> +    if (diff_usec <= max_margin) {
> +	res += diff_usec;
> +	return (reverse? -res : res);
> +    }
> + overflow:
> +    return (reverse? LONG_MIN : LONG_MAX);
> +}  
> +
>  /*
>   * Sleep for the given number of microseconds --- must be within
>   * range of a long at the moment, but this is only used for
> diff --git a/Test/V14system.ztst b/Test/V14system.ztst
> index e69de29bb..79a2c0a77 100644
> --- a/Test/V14system.ztst
> +++ b/Test/V14system.ztst
> @@ -0,0 +1,150 @@
> +# Test zsh/system module
> +
> +%prep
> +
> +  if zmodload -s zsh/system && zmodload -s zsh/zselect; then
> +    tst_dir=V13.tmp
> +    mkdir -p -- $tst_dir
> +  else
> +    ZTST_unimplemented='the zsh/system and zsh/zselect modules are not available'
> +  fi
> +  : > $tst_dir/file # File on which to acquire flock.
> +
> +%test
> +
> +  (
> +    zsystem flock -t 0.1 -i 0.000001 $tst_dir/file
> +  )
> +0:zsystem flock valid time arguments
> +
> +  (
> +    zsystem flock -t -1         $tst_dir/file ||
> +    zsystem flock -t 0.49e-6    $tst_dir/file ||
> +    zsystem flock -t 1073741824 $tst_dir/file ||
> +    zsystem flock -t 1e100      $tst_dir/file ||
> +    zsystem flock -i -1         $tst_dir/file ||
> +    zsystem flock -i 0.49e-6    $tst_dir/file ||
> +    zsystem flock -i 1e100      $tst_dir/file
> +  )
> +1:zsystem flock invalid time arguments
> +?(eval):zsystem:2: flock: invalid timeout value: '-1'
> +?(eval):zsystem:3: flock: invalid timeout value: '0.49e-6'
> +?(eval):zsystem:4: flock: invalid timeout value: '1073741824'
> +?(eval):zsystem:5: flock: invalid timeout value: '1e100'
> +?(eval):zsystem:6: flock: invalid interval value: '-1'
> +?(eval):zsystem:7: flock: invalid interval value: '0.49e-6'
> +?(eval):zsystem:8: flock: invalid interval value: '1e100'
> +
> +  (
> +    # Lock file for 1 second in the background.
> +    lock_flag=$tst_dir/locked1
> +    (zsystem flock $tst_dir/file \
> +     && touch $lock_flag \
> +     && zselect -t 100
> +     mv $lock_flag $lock_flag.done) &
> +    # Wait until sub-shell above has started.
> +    while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do
> +      zselect -t 1
> +    done
> +    if [[ -f $lock_flag.done ]]; then
> +      echo "Background shell should not have completed already." 1>&2
> +    else
> +      # Attempt to lock file with 0.5 second timeout: must fail.
> +      zsystem flock -t 0.5 $tst_dir/file
> +    fi
> +  )
> +2:zsystem flock unsuccessful wait test
> +F:This timing test might fail due to process scheduling issues unrelated to zsh.
> +
> +  (
> +    # Lock file for 0.5 second in the background.
> +    lock_flag=$tst_dir/locked2
> +    (zsystem flock $tst_dir/file \
> +      && touch $lock_flag \
> +      && zselect -t 50
> +      mv $lock_flag $lock_flag.done) &
> +    # Wait until sub-shell above has started.
> +    while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do
> +      zselect -t 1
> +    done
> +    if [[ -f $lock_flag.done ]]; then
> +      echo "Background shell should not have completed already." 1>&2
> +    fi
> +    typeset -F SECONDS
> +    start=$SECONDS
> +    # Attempt to lock file without a timeout:
> +    # must succeed after sub-shell above releases it (0.5 second).
> +    if zsystem flock $tst_dir/file; then
> +      elapsed=$[ $SECONDS - $start ]
> +      if [[ $elapsed -ge 0.3 && $elapsed -le 0.7 ]]; then
> +        echo "elapsed time seems OK" 1>&2
> +      else
> +        echo "elapsed time $elapsed should be ~ 0.5 second" 1>&2
> +      fi
> +    fi
> +  )
> +0:zsystem flock successful wait test, no timeout
> +?elapsed time seems OK
> +F:This timing test might fail due to process scheduling issues unrelated to zsh.
> +
> +  (
> +    # Lock file for 0.5 second in the background.
> +    lock_flag=$tst_dir/locked3
> +    (zsystem flock $tst_dir/file \
> +      && touch $lock_flag \
> +      && zselect -t 50
> +      mv $lock_flag $lock_flag.done) &
> +    # Wait until sub-shell above has started.
> +    while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do
> +      zselect -t 1
> +    done
> +    if [[ -f $lock_flag.done ]]; then
> +      echo "Background shell should not have completed already." 1>&2
> +    fi
> +    typeset -F SECONDS
> +    start=$SECONDS
> +    # Attempt to lock file with 1-second timeout:
> +    # must succeed 1 second after start because we retry every 1 second.
> +    if zsystem flock -t 1 $tst_dir/file; then
> +      elapsed=$[ $SECONDS - $start ]
> +      if [[ $elapsed -ge 0.8 && $elapsed -le 1.2 ]]; then
> +        echo "elapsed time seems OK" 1>&2
> +      else
> +        echo "elapsed time $elapsed should be ~ 1 second" 1>&2
> +      fi
> +    fi
> +  )
> +0:zsystem flock successful wait test, integral seconds
> +?elapsed time seems OK
> +F:This timing test might fail due to process scheduling issues unrelated to zsh.
> +
> +  (
> +    # Lock file for 0.25 second in the background.
> +    lock_flag=$tst_dir/locked4
> +    (zsystem flock $tst_dir/file \
> +      && touch $lock_flag \
> +      && zselect -t 25
> +      mv $lock_flag $lock_flag.done) &
> +    # Wait until sub-shell above has started.
> +    while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do
> +      zselect -t 1
> +    done
> +    if [[ -f $lock_flag.done ]]; then
> +      echo "Background shell should not have completed already." 1>&2
> +    fi
> +    typeset -F SECONDS
> +    start=$SECONDS
> +    # Attempt to lock file with 0.4-second timeout, retrying every 0.1 second:
> +    # must succeed 0.3 second after start.
> +    if zsystem flock -t 0.4 -i 0.1 $tst_dir/file; then
> +      elapsed=$[ $SECONDS - $start ]
> +      if [[ $elapsed -ge 0.2 && $elapsed -le 0.5 ]]; then
> +        echo "elapsed time seems OK" 1>&2
> +      else
> +        echo "elapsed time $elapsed should be ~ 0.3 second" 1>&2
> +      fi
> +    fi
> +  )
> +0:zsystem flock successful wait test, fractional seconds
> +?elapsed time seems OK
> +F:This timing test might fail due to process scheduling issues unrelated to zsh.



Messages sorted by: Reverse Date, Date, Thread, Author