Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [PATCH] zsh/random module [UPDATED]



dana wrote on Wed, 23 Nov 2022 21:42 +00:00:
> On Wed 23 Nov 2022, at 13:46, Daniel Shahaf wrote:
>>   + Why should -c default to 8 in the "random bytes" case?  Why
>>     shouldn't it default to some other value, or even be required to be
>>     specified explicitly by the user? ...
>> - Why should -c's default depend on anything at all?  You mentioned
>>   downthread you consider making -c1 the default in all cases; that'd be
>>   better.
>
> The defaults with this API are kind of weird, because if you make them
> dependent on the format (e.g. 8 for hex and 1 for everything else) it's kind
> of arbitrary, but if you keep them all the same (e.g. 1 or 8 for everything)
> they aren't generally useful — i think it's safe to assume that 'i would like
> exactly 1 random hex digit' is not going to be the most common use case
>

Well, agreed on that last sentence, but note that «-c 1» in the patch
means one byte, not one nibble.
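
To illustrate the byte/nibble distinction without the patch applied, here is a minimal sketch that reads COUNT bytes from /dev/urandom and prints them as hex. The helper name random_hex is hypothetical and is not part of the patch; it only shows that a count of 1 byte yields two hex digits.

```shell
# random_hex COUNT: print COUNT random bytes as 2*COUNT hex digits.
# Hypothetical helper for illustration; not the patch's getrandom.
random_hex() {
  # -An: no offsets, -tx1: one byte per hex pair, -N: byte count
  od -An -tx1 -N "$1" /dev/urandom | tr -d ' \n'
  echo
}

random_hex 1   # two hex digits (one byte)
random_hex 4   # eight hex digits (four bytes)
```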

> Requiring the user to explicitly specify it would address that, though you
> could say then that it goes the other way, e.g. again it's probably safe to
> assume that 90% of the time you're only going to want one integer value, and
> making people write that out every time, whilst expected in a lower-level API
> like a C function, is maybe annoying in a convenience shell built-in
>

But 1 /is/ the default for integer mode, and I don't think anyone
proposed to change that?  Rather, it was proposed to change the default
for bytes mode from 8 bytes to 1 byte.  Do you reckon requesting 8 bytes
should be the default for that mode, as opposed to, say, 1, 2, 4, or 64 bytes?

> But annoying is probably better than confusing, if those are the options
>

Heh :)

> On Wed 23 Nov 2022, at 13:46, Daniel Shahaf wrote:
>> Oh, and bump that 16 to something 3 or 4 times as big, because a 1/65536
>> chance isn't really enough in a world where automated builds (CI,
>> distros' QA, etc.) is a thing.
>
> I feel like it should be very nearly impossible for a test to fail just for
> randomness reasons. Maybe it's over-kill but in my draft reply to the patch i
> was going to suggest something like this:
>
>   () {
>     repeat $(( 10 ** 5 )); do
>       getrandom -L4 -U5 -c64 -a tmpa
>       [[ $tmpa[(r)5] == 5 ]] && return 0
>     done
>     return 1
>   }
>

No maybe about it :)

With these parameters, the probability of a false positive is 2 to the
power of minus the overall number of iterations (10**5 repeats times 64
values each), i.e., 2**(-6.4 million), which is 1/[a number that has
1.9M decimal digits].

To be clear, it's not 1/1.9M, which is about the probability of a random
Londoner being at 10 Downing Street right now.  It's 1/[10 ** 1.9M],
which is about the probability of correctly guessing the genders of all
Londoners.
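
The digit count is easy to check: 2**n has floor(n * log10(2)) + 1 decimal digits. A small sketch, assuming a POSIX awk is available (the helper name digits_of_pow2 is made up for this example):

```shell
# digits_of_pow2 N: number of decimal digits in 2**N,
# computed as floor(N * log10(2)) + 1.
digits_of_pow2() {
  awk -v n="$1" 'BEGIN { printf "%d\n", int(n * log(2) / log(10)) + 1 }'
}

iterations=$(( 100000 * 64 ))   # 10**5 repeats, 64 values per call
echo "$iterations"              # 6400000
digits_of_pow2 "$iterations"    # 1926592, i.e. about 1.9M digits
```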

If you converted the entire Earth's mass to CPUs and ran «getrandom -L4
-U5 -c64» on it repeatedly until Sol died, and the CPUs all operated at
4GHz, and there were no bugs in anything, the chance of getting a single
run to not return a 5 would still be something like a billion to one
(give or take several zeroes depending on CPU mass, the argument to -c,
and so on).

That's why in practice, if a single -c64 call ever doesn't return a 5,
it's safe to assume there's a bug.
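
For anyone who wants to play with the numbers without building the patch, here is a rough stand-in for a single «getrandom -L4 -U5 -c64» check. It is only a sketch: saw_upper_bound is a hypothetical name, and the values come from /dev/urandom via od rather than from the proposed builtin.

```shell
# saw_upper_bound COUNT: draw COUNT values uniformly from {4, 5} and
# succeed if at least one of them is 5. Stand-in for the patch's
# getrandom; each value is derived from one random byte's low bit.
saw_upper_bound() {
  for byte in $(od -An -tu1 -N "$1" /dev/urandom); do
    [ $(( byte % 2 + 4 )) -eq 5 ] && return 0
  done
  return 1
}

# With 64 draws, a miss has probability 2**-64.
if saw_upper_bound 64; then
  echo "saw a 5"
else
  echo "no 5 in 64 draws: suspect a bug"
fi
```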

Conversely, if you actually retain those 6.4 million iterations, what's
the probability that the outer loop will return 0 on the first iteration
and then a gamma ray will flip that to 1?



