Zsh Mailing List Archive

Re: [PATCH] zsh/random module [UPDATED]

X-seq: zsh-workers 51052
From: "Daniel Shahaf" <d.s@xxxxxxxxxxxxxxxxxx>
To: "Zsh hackers list" <zsh-workers@xxxxxxx>
Cc: "Clinton Bunch" <cdb_zsh@xxxxxxxxxxx>
Subject: Re: [PATCH] zsh/random module [UPDATED]
Date: Wed, 23 Nov 2022 23:54:06 +0000
Archived-at: <https://zsh.org/workers/51052>
Feedback-id: i425e4195:Fastmail
In-reply-to: <869f6d65-15d2-477f-b78b-02427a0c1395@app.fastmail.com>
List-id: <zsh-workers.zsh.org>
References: <1b2cafe6-b4b5-c59a-11f3-4dbc1e99e2bc@zentaur.org> <6275a5ac-3a47-f591-7b3c-380ec4fed5ac@zentaur.org> <Y3rPPc5eoZ13gqua@CptOrmolo.darkstar> <3423b634-a7c3-9efc-92cd-b9b995ac1c27@zentaur.org> <Y3rgtOsy8PnJdAlt@CptOrmolo.darkstar> <30a7e749-7f30-ecae-6479-a345b1682e7f@zentaur.org> <Y311+V0r6acebcMp@CptOrmolo.darkstar> <2df1001e-69a6-9785-70a6-8416fdcffd8d@zentaur.org> <Y32eP2PXtVgr7rLA@CptOrmolo.darkstar> <0a07afaf-1194-6752-8133-8aa6b689724d@zentaur.org> <20221123203329.GP27622@tarpaulin.shahaf.local2> <869f6d65-15d2-477f-b78b-02427a0c1395@app.fastmail.com>

dana wrote on Wed, 23 Nov 2022 21:42 +00:00:
> On Wed 23 Nov 2022, at 13:46, Daniel Shahaf wrote:
>>   + Why should -c default to 8 in the "random bytes" case?  Why
>>     shouldn't it default to some other value, or even be required to be
>>     specified explicitly by the user? ...
>> - Why should -c's default depend on anything at all?  You mentioned
>>   downthread you consider making -c1 the default in all cases; that'd be
>>   better.
>
> The defaults with this API are kind of weird, because if you make them
> dependent on the format (e.g. 8 for hex and 1 for everything else) it's kind
> of arbitrary, but if you keep them all the same (e.g. 1 or 8 for everything)
> they aren't generally useful — i think it's safe to assume that 'i would like
> exactly 1 random hex digit' is not going to be the most common use case
>

Well, agreed on that last sentence, but note that «-c 1» in the patch
means one byte, not one nibble.

> Requiring the user to explicitly specify it would address that, though you
> could say then that it goes the other way, e.g. again it's probably safe to
> assume that 90% of the time you're only going to want one integer value, and
> making people write that out every time, whilst expected in a lower-level API
> like a C function, is maybe annoying in a convenience shell built-in
>

But 1 /is/ the default for integer mode, and I don't think anyone
proposed to change that?  Rather, it was proposed to change the default
for bytes mode from 4 bytes (8 nibbles) to 1 byte.  Do you reckon requesting 4 bytes
should be the default for that mode, as opposed to, say, 1, 2, 8, or 64 bytes?

> But annoying is probably better than confusing, if those are the options
>

Heh :)

> On Wed 23 Nov 2022, at 13:46, Daniel Shahaf wrote:
>> Oh, and bump that 16 to something 3 or 4 times as big, because a 1/65536
>> chance isn't really enough in a world where automated builds (CI,
>> distros' QA, etc.) are a thing.
>
> I feel like it should be very nearly impossible for a test to fail just for
> randomness reasons. Maybe it's over-kill but in my draft reply to the patch i
> was going to suggest something like this:
>
>   () {
>     repeat $(( 10 ** 5 )); do
>       getrandom -L4 -U5 -c64 -a tmpa
>       [[ $tmpa[(r)5] == 5 ]] && return 0
>     done
>     return 1
>   }
>

No maybe about it :)

With these parameters, the probability of a false positive is 2 to the
power of minus the overall number of iterations, i.e., 2**(-6.4 million),
which is 1/[a number that has 1.9M decimal digits].
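
The digit count above can be checked with a rough sketch in plain shell
integer arithmetic.  It assumes that «-L4 -U5» yields 4 or 5 with equal
probability; the variable names are illustrative and not from the patch,
and log10(2) is approximated as 0.30103.

```shell
# Sketch only: integer arithmetic, no getrandom builtin required.
repeats=100000   # 10 ** 5 outer iterations, as in the quoted test
count=64         # -c64: values drawn per getrandom call
draws=$(( repeats * count ))

# Assumption: each draw is 4 or 5 with probability 1/2, so the chance
# that no draw at all is a 5 is 2 ** -draws.  The number of decimal
# digits of 2 ** draws is about draws * log10(2); log10(2) ~= 0.30103
# is scaled by 100000 here to stay within integers.
digits=$(( draws * 30103 / 100000 ))

echo "draws:  $draws"
echo "digits: about $digits"
```

This prints 6400000 draws and about 1926592 digits, i.e., the 1.9M
figure.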

To be clear, it's not 1/1.9M, which is about the probability of a random
Londoner being at 10 Downing Street right now.  It's 1/[10 ** 1.9M],
which is about the probability of correctly guessing the genders of all
Londoners.

If you converted the entire Earth's mass to CPUs and ran «getrandom -L4
-U5 -c64» on it repeatedly until Sol died, and the CPUs all operated at
4GHz, and there were no bugs in anything, the chance of getting a single
run to not return a 5 would still be something like a billion to one
(give or take several zeroes depending on CPU mass, the argument to -c,
and so on).

That's why in practice, if a single -c64 call ever doesn't return a 5,
it's safe to assume there's a bug.

Conversely, if you actually retain those 6.4 million iterations, what's
the probability that the outer loop will return 0 on the first iteration
and then a gamma ray will flip that to 1?
