!standard A.5.2(20) 16-01-28 AI12-0144-1/07
!standard A.5.2(32)
!standard A.5.2(41)
!class Amendment 14-12-04
!status ARG Approved 10-0-2 (By Letter Ballot) 16-02-26
!status work item 15-12-17
!status ARG Approved 7-0-0 15-10-18
!status Promising (10-0-0) 15-02-26
!status work item 14-12-04
!status received 14-11-06
!priority Low
!difficulty Easy
!subject Make Discrete_Random more flexible
!summary
Add a more general function Random to Discrete_Random.
!problem
A recent StackOverflow question asked how to write a function:
function generate_random_number (n: in Positive) return Integer is
to generate one random number in the range 0..n. n could be different
for each call.
It looks like this isn't easily possible using Ada.Numerics.Discrete_Random,
since a new instantiation of Discrete_Random must be declared inside
generate_random_number, and a new generator variable of type
Instantiation.Generator needs to be declared. The questioner tried Reset
to reset the generator to a time-dependent state each time, but this failed
because the procedure was being called faster than the clock was changing.
The most direct way to accomplish something like
this seems to be to use Float_Random, reset the generator, and then
perform the needed computations to yield a discrete random number.
This is unfortunate, because the Discrete_Random packages are no doubt
already performing these or related computations, and hopefully doing
it in a way that produces the best results, which is not trivial; but
these computations aren't available to a program (unless a compiler
vendor makes them available through a non-standard package).
!proposal
(See wording.)
!wording
Add after A.5.2(20):
function Random (Gen : Generator;
First : Result_Subtype;
Last : Result_Subtype) return Result_Subtype
with Post => Random'Result in First .. Last;
Modify A.5.2(32):
Obtains the “next” random number from the given generator, relative to its
current state, according to an implementation-defined algorithm.[ The result
of the function in Numerics.Float_Random is delivered as a value of the
subtype Uniformly_Distributed, which is a subtype of the predefined type
Float having a range of 0.0 .. 1.0. The result of the function in an
instantiation of Numerics.Discrete_Random is delivered as a value of the
generic formal subtype Result_Subtype.]
[Editor's note: We delete the last two sentences here as all they say is
that the function result is in the result subtype, which is always true for
a function (in the absence of exceptions or erroneous execution, both of
which have to be explicitly stated). The actual description of the result
is in the Implementation Requirements that follow later; the existing text
promises to give that without doing so. We remove this text now so that we
don't have to give similar text in the description of the new subprogram,
where it would be substantially more complicated.]
Add after A.5.2(32):
function Random (Gen : Generator;
First : Result_Subtype;
Last : Result_Subtype) return Result_Subtype
with Post => Random'Result in First .. Last;
Obtains the "next" random number from the given generator, relative to its
current state, according to an implementation-defined algorithm. If the
range First .. Last is a null range, Constraint_Error is raised.
[Editor's notes: During the ARG phone meeting of February 26, 2015, it was
suggested that the null range case raise Program_Error. However,
Constraint_Error is the existing exception for that (A.5.2.(39)), so we
used that here.
Postconditions are assumed to impose a requirement on the body of a
subprogram, as proposed in AI12-0179-1. In the unlikely event that AI is not
approved, then we should not use a postcondition here; we'd need some
wording instead.
End Editor's notes.]
Modify A.5.2(41):
{Each call of a Random function has a *result range*; this is the range
First .. Last for the version of Random with First and Last parameters and
the range of the result subtype of the function otherwise.}
A sufficiently long sequence of random numbers obtained by {consecutive}
[successive] calls to Random {that have the same generator and result range}
is approximately uniformly distributed over the {result} range [of the
result subtype].
{AARM Discussion: In this rule, "consecutive" means at least that there are
no intervening explicit calls involving the same generator. This restricts
the rule to only applying to cases where just the Random function changes
the generator. We don't mean to impose a requirement if there are
intervening calls to Reset, to Random with same generator but a different
result range, or any other case that would affect the sequence of values
returned. Operations which use the resulting random values (for instance,
to store them somewhere) are not considered in determining if calls are
consecutive.}
Modify A.5.2(42):
{A}[The] Random function in an instantiation of Numerics.Discrete_Random is
guaranteed to yield each value in its result {range}[subtype] in a
finite number of calls, provided that the number of such values does not
exceed 2 15.
[Editor's note: the above "15" is a superscript, which doesn't show up in this
plain text rendering.]
!discussion
The author ran into this problem with a hobby solitaire program. It was
necessary to use a reproducible generator (thus using a single generator
created using Reset with an Initiator parameter). The algorithm for dealing
cards is to create an array containing all of the cards, then select one from
that array using a random generator. Then close up the array, and repeat the
selection on the array (minus the selected card). This means you want to
select with ranges 1..52, then 1..51, then 1..50, etc.
That's not possible with the Ada 95 definition of Discrete_Random.
Users often accomplish this in ways that do not provide a truly uniform
distribution. It's easy to come up with methods that don't work (see
the !appendix for several such algorithms), but it's difficult to get right.
One algorithm that is correct is to start with a uniform random generator
that produces an integer in the range 0 .. Large. (Usually Large is some
power of 2 - 1). Then, for a call to Random with bounds First and Last, we
determine Len := (Last-First+1) and then Mult, which is the largest multiple
of Len that is less than Large. (Len cannot be greater than Large.) We then
loop, getting a random value, and exiting the loop when the random value is
less than Mult. Finally, we mod the random value (now in the range 0 ..
Mult-1) by Len and add First.
This loop will usually exit on the first iteration, and even in the worst
case (Len = Large/2+1), there is at least a 50% chance that the loop will
exit on each iteration. So it is unlikely that many iterations will be needed.
If this was 1994, we should have simply defined Random in Discrete_Random as
follows:
function Random (Gen : Generator;
First : Result_Subtype := Result_Subtype'First;
Last : Result_Subtype := Result_Subtype'Last) return Result_Subtype;
This allows leaving out the First parameter if it matches that of the formal
subtype (which it usually will).
Unfortunately, that is more incompatible, as renames and access-to-subprogram
matching would fail (whereas the proposed solution is only incompatible if
there are use-clause conflicts with the profile, which is unlikely).
!ASIS
No impact.
!ACATS test
An ACATS test should be constructed to check that the new function exists and
works appropriately. Perhaps the card-dealing example mentioned in the
discussion could be used.
!appendix
!topic Allow random numbers for different discrete subtypes using same generator
!reference RM12 A.5.2
!from Adam Beneschan 14-11-06
!discussion
This arose from a StackOverflow question in which a questioner wanted
to write a function:
function generate_random_number (n: in Positive) return Integer is
to generate one random number in the range 0..n. n could be different
for each call.
It looks like this isn't easily possible using
Ada.Numerics.Discrete_Random, since a new instantiation of
Discrete_Random must be declared inside generate_random_number, and a
new generator variable of type Instantiation.Generator needs to be
declared. The questioner tried Reset to reset the generator to a
time-dependent state each time, but this failed because the procedure
was being called faster than the clock was changing.
Perhaps there are ways to get this to work with an initiator parameter
on Reset. However, the most direct way to accomplish something like
this seems to be to use Float_Random, reset the generator, and then
perform the needed computations to yield a discrete random number.
This is unfortunate, because the Discrete_Random packages are no doubt
already performing these or related computations, and hopefully doing
it in a way that produces the best results, which is not trivial; but
these computations aren't available to a program (unless a compiler
vendor makes them available through a non-standard package).
I think there should be a way to generate random numbers for different
discrete ranges using the same generator. Several possibilities come
to mind:
(1) Add to Discrete_Random:
function Random (Gen : Ada.Numerics.Float_Random.Generator)
return Result_Subtype;
which would allow the program to set up one common generator using
Float_Random.
(2) Add to Float_Random:
generic
type Result_Subtype is (<>);
function Random (Gen : Generator) return Result_Subtype;
Basically the same as #1.
(3) Add a new non-generic package that defines a common generator type
that could be used in Random functions in Discrete_Random and
Float_Random. This package would also define a common State type
and provided the Reset, Save, Image, and Value subprograms
currently provided by Float_Random and Discrete_Random.
Since this question came from a StackOverflow poster who could have
just been experimenting, I can't say how useful in practice this
feature would be. It does seem like a flaw that it's missing,
however.
***************************************************************
From: Jean-Pierre Rosen
Sent: Thursday, November 6, 2014 3:17 PM
> It looks like this isn't easily possible using
> Ada.Numerics.Discrete_Random
Isn't it possible to use a random generator on Natural and return
Value mod n + 1 ?
(the quality of the random numbers might not be as good for big n's,
but it may be sufficient in practice)
***************************************************************
From: Adam Beneschan
Sent: Thursday, November 6, 2014 4:47 PM
Sure, if you don't want a uniform distribution. Unless the number of values
in Natural (2**m for some m) is divisible by (n + 1), doing this will result
in some values in the range being generated with a higher probability than
others.
There are ways to make things uniform. The point is that the implementors
of Discrete_Random have, I presume, already figured out how to do this (I
think they would have to in order to satisfy Annex G); why should we require
that those with the kind of problem I wrote about have to reinvent the wheel?
***************************************************************
From: Pascal Obry
Sent: Thursday, November 6, 2014 4:10 PM
> Isn't it possible to use a random generator on Natural and return
> Value mod n + 1 ?
>
> (the quality of the random numbers might not be as good for big n's,
> but it may be sufficient in practice)
>
Or better, just use Ada.Numerics.Float_Random and multiply by N.
***************************************************************
From: Jean-Pierre Rosen
Sent: Friday, November 7, 2014 2:32 AM
> Sure, if you don't want a uniform distribution. Unless the number of
> values in Natural (2**m for some m) is divisible by (n + 1), doing
> this will result in some values in the range being generated with a
> higher probability than others.
Well, correct my maths if I'm wrong, but it seems to me that the average
unbalance is (2**m rem (n+1)) / 2**m.
If what is needed is a dice-4, dice-6 and dice-12, the unbalance would be
neglectible (even up to n=2**20 for a 32 bits generator).
***************************************************************
From: Randy Brukardt
Sent: Friday, November 7, 2014 6:44 PM
This problem happened to me in a hobby program: I needed to assign playing
cards to a Solitare layout. I wanted to use a reproducible generator (thus
using a single generator created using Reset with an Initiator parameter).
To properly deal cards, it's best to create an array containing all of the
cards, then select one from that array using a random generator. Then close
up the array, and repeat the selection on the array (minus the selected card).
This means you want to select with ranges 1..52, then 1..51, then 1..50, etc.
Since m in this case is 52!, I'm pretty sure that no 32-bit or even 64-bit
generator is going to be uniform using your formula.
I ended up using Float_Random, using a loop to avoid having to do any sort of
conversion out of float. (I couldn't find any way to make that work and still
keep a uniform distribution.)
It would have been nice to be able to use Discrete_Random for this case,
especially as every simulation problem that I've ever dealt with has had this
form. (None of which ever were work related, so I've never thought to bring
this up at the ARG level.) What is the value of having a package that never
can be used on actual problems??
Interestingly, I ended up with a better generator having used Float_Random,
since our Discrete_Random is not going to generate an exactly uniform
distribution -- it has the same conversion problems that I couldn't find any
way to avoid with the card dealer. It's "good enough" for small ranges (like
dice or cards) [as noted by J-P's formula], but some large ranges would have
a noticeable non-uniformity. Probably should fix that someday. (And we all
know "Someday Never Comes" - thanks to Credence Clearwater Revival. :-)
***************************************************************
From: Jean-Pierre Rosen
Sent: Friday, November 7, 2014 11:59 PM
>> Isn't it possible to use a random generator on Natural and return
>> Value mod n + 1 ?
>>
>> (the quality of the random numbers might not be as good for big n's,
>> but it may be sufficient in practice)
>>
>
> Or better, just use Ada.Numerics.Float_Random and multiply by N.
This doesn't work.
During the making of Ada95, there was a huge discussion between numerician
about whether it was possible to make a discrete generator out of a floating
one, and after megabytes of e-mail, the conclusion was that it was not
possible.
I'm not enough of a numerician to explain why,
***************************************************************
From: Jean-Pierre Rosen
Sent: Saturday, November 8, 2014 12:12 AM
> This means you want to select with ranges 1..52, then 1..51, then
> 1..50, etc.
>
> Since m in this case is 52!, I'm pretty sure that no 32-bit or even
> 64-bit generator is going to be uniform using your formula.
?? m (if you refer to the number used as a parameter to the function we are
discussing) is 52, then 51, ... Certainly a "small number" for the generator
bias.
As I use to say, I'm not a numerician, I just know enough of numerics to
realize how easy it is to make incorrect statements about numerics when you
are not a numerician. So I'm asking:
1) Is it a real problem? (i.e. worth spending time on it - we should take
advice of real numericians)
2) What do other languages do? (to be fair, having a functionality is not
enough if there is no guarantee of uniformity - anybody can write a bad random
generator).
***************************************************************
From: Jeffery R. Carter
Sent: Saturday, November 8, 2014 12:36 AM
>>> Isn't it possible to use a random generator on Natural and return
>>> Value mod n + 1 ?
>>>
>>> (the quality of the random numbers might not be as good for big n's,
>>> but it may be sufficient in practice)
For some applications (games, for example) the statistics of the RNG may not
be important, but for others, they are very important.
Given an RNG that returns uniformly distributed values in 0 .. 2 ** M - 1
(Unsigned_32, for example, where M=32), taking R rem (N + 1) gives uniformly
distributed results only when M rem (N + 1) = 0. All other cases will not be
satisfactory for strict uses.
>> Or better, just use Ada.Numerics.Float_Random and multiply by N.
>>
> This doesn't work.
Right. Rounding R * N to the nearest integer results in zero and N appearing
about 1/2 as often as 1 .. N - 1. That may not be acceptable even for some of
the less strict uses.
A better approach is to truncate R * (N + 1), but even that isn't acceptable
for strict uses. This is the approach taken by the Random_Int function in
PragmARC.Universal_Random.
***************************************************************
From: Randy Brukardt
Sent: Saturday, November 8, 2014 1:26 AM
> > This means you want to select with ranges 1..52, then 1..51, then
> > 1..50, etc.
> >
> > Since m in this case is 52!, I'm pretty sure that no 32-bit or even
> > 64-bit generator is going to be uniform using your formula.
>
> ?? m (if you refer to the number used as a parameter to the function
> we are discussing) is 52, then 51, ... Certainly a "small number" for
> the generator bias.
To use the *existing* Discrete_Random, you have to use an m that is a factor
of all of them, as you can't create a new generator for each step (and you'd
have to do that to use different instantiations).
To use the float generator, you don't use "m" at all.
> As I use to say, I'm not a numerician, I just know enough of numerics
> to realize how easy it is to make incorrect statements about numerics
> when you are not a numerician.
Same here. But I know that I couldn't get even acceptable results out of the
existing random generators without extreme care (and I don't have critical
requirements) -- essentially I had to never convert the float value to any
discrete type. (And I can't be certain I got it right as it is.) You noted
that multiplying by N isn't enough, nor is truncating (that caused values to
occassionally be out of range). I used a loop and a float compare to avoid
that problem, but it might have left an extra value into one of the end
intervals. I don't know of any way to avoid that problem (which is at least
small enough for my purposes, given the float value is a long float, so we
have n/2**53, which is pretty darn small).
Note that Jeff Carter said the same thing:
> Right. Rounding R * N to the nearest integer results in zero and N appearing
> about 1/2 as often as 1 .. N - 1. That may not be acceptable even for some of
> the less strict uses.
It wasn't for my card dealing application, for which I wanted to simulate more
than 100,000 deals (sometimes running into millions of deals).
> A better approach is to truncate R * (N + 1), but even that isn't acceptable
> for strict uses. This is the approach taken by the Random_Int function in
> PragmARC.Universal_Random.
That didn't even work for me, as there was a non-zero chance of R = 1.0; in
that case the result ended up indexing out of the range of the upper end of
the selection array (that's how I found out it was still wrong, a bizarre
Constraint_Error after simulating several thousand deals). I ended up putting
that value into the uppermost bin; that's makes the uppermost bin slightly more
likely to be selected, but never more often than 52/2**53, which seemed small
enough for my application. (That would take roughly 2**40 simulations to be
significant.) I don't know of any way to avoid that problem, perhaps that's
the source of the issue that J-P remembered reading about.
But how to do this perfectly I don't know. The implementation of
Discrete_Random in Janus/Ada is probably worse than that, because using a loop
isn't practical if the range is more than a few hundred items, and I probably
didn't have a real truncation operation available when I wrote it.
Which I believe I had to do, because no "real numerician" provided an
implementation that actually would work for Janus/Ada as it existed in the
1990s. The Float_Random was from Ken Dritz, but I don't think there was ever a
matching Discrete_Random implementation (and I could find none in the
literature). Probably it would be easier to find these things today (no Google
in 1996), but of course rewriting and retesting that is a low priority in the
absence of some customer demand.
> So I'm asking:
>
> 1) Is it a real problem? (i.e. worth spending time on it - we should
> take advice of real numericians)
It's certainly a real problem in that there is a class of applications where
neither Discrete_Random nor Float_Random (as currently defined) can be used to
provide completely uniform behavior. Perhaps there is some numerical reason
that the problem can never be solved, but we will need to be convinced of that.
> 2) What do other languages do? (to be fair, having a functionality is
> not enough if there is no guarantee of uniformity - anybody can write
> a bad random generator).
I don't think that other languages even had an equivalent for Discrete_Random
back in the 1990s. Whether that's changed, I don't know; none of the languages
I've ever used has had anything other than a float version.
As such, this seems to be a self-made Ada problem. Whether or not the landscape
for other languages has changed, it seems amazingly restrictive to have to
select every random number from the same range. There are a lot of problems
where that simply doesn't work.
Perhaps there is some numerical reason for Discrete_Random (I don't recall the
discussions that J-P mentioned, but that doesn't mean much for something that
happened over 20 years ago) that would prevent it from being useful for a large
class of problems. If that's true, it should be documented in the AARM so that
at least us language lawyers will know why it can't be useful.
I think this means that we'll need an AI and someone will have to be tasked to
go off and scour the Internet for information. (We're never going to have
enough "real numericians" in the ARG.)
***************************************************************
From: Bob Duff
Sent: Saturday, November 8, 2014 9:26 AM
> For some applications (games, for example) the statistics of the RNG
> may not be important, but for others, they are very important.
Depends on whether you're playing for money. I'd like to play blackjack
against a program that uses the "mod N" method. ;-)
I thought the standard way to get a random integer in the range
(say) 1 .. 6 is to get a random integer in the range (say) 0 .. 2**31-1, and
throw away results >= 2_147_483_646. 2_147_483_646 is the largest multiple of
6 <= 2**31-1. Do this in a loop until you get a number in 0 .. 2_147_483_645.
THEN do the "mod 6" and add 1.
That loop terminates, and usually goes round just once.
It's not much harder to program than the plain "mod 6" method.
J-P wrote:
> As I use to say, I'm not a numerician, ...
Is that a French word? I've never heard it. When I put "define numerician"
in google, it defines "numeration". ;-) I assume you mean "numerical analyst".
I'm not one, either! Neither are most programmers, which is why it's good to
have this sort of thing in the predefined libraries.
Randy wrote:
> I don't think that other languages even had an equivalent for
> Discrete_Random back in the 1990s. Whether that's changed, I don't
> know; none of the languages I've ever used has had anything other than
> a float version.
Lots of languages have pseudo-random generators that produce integers.
Few have any strict requirements, which makes them useless for serious work
if portability is desired.
For example, C has rand(), which returns an int in the range 0 .. RAND_MAX.
RAND_MAX = 2**31-1 on my machine. Some implementations of rand() are pretty
bad, or so I've heard. I'm pretty sure rand() dates back to the early days of
C, so before 1990s.
> I think this means that we'll need an AI and someone will have to be
> tasked to go off and scour the Internet for information. (We're never
> going to have enough "real numericians" in the ARG.)
Or ask somebody who knows what they're talking about (unlike me).
For example, Ken Dritz or Paul Hilfinger.
***************************************************************
From: Jeffrey R. Carter
Sent: Saturday, November 8, 2014 11:56 AM
>> A better approach is to truncate R * (N + 1), but even that isn't acceptable
>> for strict uses. This is the approach taken by the Random_Int function in
>> PragmARC.Universal_Random.
>
> That didn't even work for me, as there was a non-zero chance of R =
> 1.0; in that case the result ended up indexing out of the range of the
> upper end of the selection array (that's how I found out it was still
> wrong, a bizarre Constraint_Error after simulating several thousand
> deals). I ended up putting that value into the uppermost bin; that's
> makes the uppermost bin slightly more likely to be selected, but never
> more often than 52/2**53, which seemed small enough for my
> application. (That would take roughly 2**40 simulations to be
> significant.) I don't know of any way to avoid that problem, perhaps
> that's the source of the issue that J-P remembered reading about.
Other than Ada.Numerics.Float_Random, I've never used an RNG that produces
1.0, and I tend to forget that Float_Random may. My comment was for RNGs that
don't. Universal_Random, for example, produces all multiples of 2**(-24) from
zero to 1 - 2**(-24), so the truncation approach works fine with it.
There is still a problem, similar to the problem with the remainder method for
a discrete generator. The RNG produces M unique values in 0 .. 1-e. If
M rem N /= 0, then the results of the truncation method are not uniformly
distributed. (Universal_Random, for example, produces 2**24 unique values, and
2**24 rem 52 /= 0.)
***************************************************************
From: Jeffrey R. Carter
Sent: Saturday, November 8, 2014 2:43 PM
> Depends on whether you're playing for money. I'd like to play
> blackjack against a program that uses the "mod N" method. ;-)
Then it's a business for someone. Not entirely a game.
> I thought the standard way to get a random integer in the range
> (say) 1 .. 6 is to get a random integer in the range (say) 0 ..
> 2**31-1, and throw away results >= 2_147_483_646.
> 2_147_483_646 is the largest multiple of 6 <= 2**31-1.
> Do this in a loop until you get a number in 0 .. 2_147_483_645. THEN
> do the "mod 6" and add 1.
>
> That loop terminates, and usually goes round just once.
> It's not much harder to program than the plain "mod 6" method.
Are we sure the results are uniformly distributed?
> Some implementations of rand() are pretty bad, or so I've heard.
The Turbo Pascal random number generator was very bad. If you plotted
consecutive values as (x, y) coordinates you got vertical lines! But on one
project I worked on that had pretty strict requirements for its random values,
we combined it with another, not-great linear-congruential generator and got
acceptable results.
***************************************************************
From: Tero Koskinen
Sent: Saturday, November 8, 2014 3:21 PM
> 2) What do other languages do? (to be fair, having a functionality is
> not enough if there is no guarantee of uniformity - anybody can write
> a bad random generator).
OpenBSD uses arc4random and arc4random_uniform(uint32_t upper_bound) functions
for random numbers in LibreSSL (and other places also).
Implementation for arc4random_uniform (which returns 32-bit unsigned integer
below the specified value) is found from
https://github.com/libressl-portable/openbsd/blob/master/src/lib/libc/crypt/arc4random_uniform.c
And pasted here, in case the link does not work:
uint32_t
arc4random_uniform(uint32_t upper_bound) {
uint32_t r, min;
if (upper_bound < 2)
return 0;
/* 2**32 % x == (2**32 - x) % x */
min = -upper_bound % upper_bound;
/*
* This could theoretically loop forever but each retry has
* p > 0.5 (worst case, usually far better) of selecting a
* number inside the range we need, so it should rarely need
* to re-roll.
*/
for (;;) {
r = arc4random();
if (r >= min)
break;
}
return r % upper_bound;
}
arc4random itself is found from
https://github.com/libressl-portable/openbsd/blob/master/src/lib/libc/crypt/arc4random.c
Operating system specific helper functions are at
https://github.com/libressl-portable/openbsd/tree/master/src/lib/libcrypto/crypto
They (OpenBSD/LibreSSL developers) have spent some time tuning their random
generator, so I would assume it is good enough for most purposes.
***************************************************************
From: Randy Brukardt
Sent: Saturday, November 8, 2014 3:36 PM
> Other than Ada.Numerics.Float_Random, I've never used an RNG that
> produces 1.0, and I tend to forget that Float_Random may. My comment
> was for RNGs that don't.
I think that they all don't effectively. My recollection is that Float_Random
generates a 64-bit random integer (2**64-1), and when that is scaled into a
float value, it essentially rounds to 1.0. The definition of Float_Random
means that we don't have to avoid that. (Although I do wonder if 0.0 and 1.0
are as likely to occur as any other bit pattern.) In any case, I used the
implementation provided by Ken Dritz (and his associated test program), which
was part of the Ada 9x work. [The test program isn't in the ACATS as its
nature is to sometimes fail a couple of trials; he used 40 trials and it's
OK if it passes at least 38. But it still can be OK and fail more often,
since it's supposed to be random, after all. I've seen 4 failures on a single
run (the next run typically has none). As an ACATS test, that wouldn't be of
much use, since it would be theortically possible to fail all 40 trials and
still be good. Note that a generator that *always* passed all of the trials
would be insufficiently random, it would have a pattern to keep the Chi-squares
low.] I *could* have done something wrong, but I doubt it.
Discrete_Random is a different kettle of fish (I don't think there was a
reference implementation, if there was, I didn't use it, or it was wrong).
***************************************************************
From: Pascal Pignard
Sent: Monday, November 10, 2014 2:39 PM
Let's take the initial proposal from Adam:
(1) Add to Discrete_Random:
function Random (Gen : Ada.Numerics.Float_Random.Generator)
return Result_Subtype;
My understanding of the given problem is how to produce a "random" integer
in a range specified at each call from a pseudo random generator of a known
quality which produce for example 32 bits with each bit having the same
probability at each draw.
So for range as 0..(2**n)-1, it's trivial -> take n bits in the random number.
For the other ranges, I'll take the example of a dice of 20 sides thus in
range 1..20.
Let's admit that adding an integer to a random number keeps the quality of
this number. It's like a translation.
So we can translate to 0..19.
Take the greatest power of two minus one less than 19, that is 15 or (2**4)-1,
so take four bits of the random number.
Note: whatever the position of the bits as the have the same probability for
each. Thus we have a random u in range 0..15.
Take now two others bits, so we have a random v in range 0..3.
Take one more bit, so we have a random w in range 0..1.
Add then all x = u + v + w, so we have a random x in range 0..19.
Plus one, we have a random x in range 1..20.
Of course we have to care if the number of random bits fits in the range of
the random generator and maybe others things like to get a high quality n bits
pseudo random generator.
It seems then feasible to get such a function.
***************************************************************
From: Gustaf Thorslund
Sent: Monday, November 10, 2014 3:39 PM
> Add then all x = u + v + w, so we have a random x in range 0..19.
> Plus one, we have a random x in range 1..20.
No, I'm afraid you cannot do that. If you have two dice and add the numbers of
them you will not get an even distribution (7 being the most common sum). It
should not be too much work calculating the distribution of your suggestion.
If you on the other hand (like Bob Duff already suggested) throw away the
numbers outside the range, and retry until you get a value inside the range,
you should get an even distribution over that range. If you for example would
roll a six sided dice, throw away and retry every time you get a six, you will
basically get a five sided dice since those values will still be an even
distribution.
As Bob also said you would throw away the values outside an even multiple of
the range, else you'll need to loop for awhile if you want to make a dice out
of a 64 bit number.
Looks like the code Tero posted also loops until it finds a value inside a
range.
***************************************************************
From: Randy Brukardt
Sent: Tuesday, November 11, 2014 12:16 AM
> > Add then all x = u + v + w, so we have a random x in range 0..19.
> > Plus one, we have a random x in range 1..20.
>
> No, I'm afraid you cannot do that. If you have two dice and add the
> numbers of them you will not get an even distribution
> (7 being the most common sum). It should not be too much work
> calculating the distribution of your suggestion.
Right. When dealing with probabilities and distributions and the like, the
operative phrase is that "everything you know is wrong". In particular,
ordinary mathematics can't used to combine probabilities, and in fact little
can be done at all if the probabilities aren't independent.
> If you on the other hand (like Bob Duff already suggested) throw away
> the numbers outside the range, and retry until you get a value inside
> the range, you should get an even distribution over that range. If you
> for example would roll a six sided dice, throw away and retry every
> time you get a six, you will basically get a five sided dice since
> those values will still be an even distribution.
>
> As Bob also said you would throw away the values outside an even
> multiple of the range, else you'll need to loop for awhile if you want
> to make a dice out of a 64 bit number.
>
> Looks like the code Tero posted also loops until it finds a value
> inside a range.
Clearly, we can write such a function.
The question is how to add it to the existing Ada libraries.
I don't like any of Adam's suggestions, because they all cause
cross-contamination of the float and discrete libraries (which should be
independent, IMHO). Moreover, they all continue needing to use one
instantiation per range of values, which is trying to shoehorn something
dynamic into a compile-time construct. It can be done, but it's clunky at
best.
I'd go with a much more straightforward approach:
function Random (Gen : Generator;
First : Result_Subtype := Result_Subtype'First;
Last : Result_Subtype := Result_Subtype'Last) return Result_Subtype;
Obtains the "next" random number from the given generator, relative to its
current state, according to an implementation-defined algorithm. The result
of the function in an instantiation of Numerics.Discrete_Random is within the
range First .. Last, and is delivered as a value of the generic formal subtype
Result_Subtype.
---
The question with this is whether we can accept the level of incompatibility
that's associated with changing the profile of a library-defined function.
The suggested change would allow calls to be unchanged, but would break any
renames or access-to-subprogram uses.
We could avoid that by using overloading instead, leaving the existing function
alone and dropping the default values of the parameters. That's slightly less
flexible (usually, you'd only want to give Last), but it's more compatible.
(It's still not 100% compatible, because of use-visibility and the possibility
of user-defined Random functions.)
function Random (Gen : Generator;
First : Result_Subtype;
Last : Result_Subtype) return Result_Subtype;
Obtains the "next" random number from the given generator, relative to its
current state, according to an implementation-defined algorithm. The result
of the function in an instantiation of Numerics.Discrete_Random is within the
range First .. Last, and is delivered as a value of the generic formal subtype
Result_Subtype.
***************************************************************
From: Randy Brukardt
Sent: Friday, December 11, 2015 9:33 PM
The draft minutes for this AI say:
AI12-0144-1/02 Make Discrete_Random more flexible
Bob: It is hard to implement this right, users usually get it wrong. (The e-mail
thread proves that multiple times.) So we should provide it.
Add the correct algorithm to random generation for integers to the discussion.
Steve Baird says that the wording in A.5.2(41) require a uniformly distributed
result over the result subtype. But that's not what we want when we have the
First and Last parameters.
So add to A.5.2 (41): ..., or in the case of the version of Random with First
and Last parameters, over the range First .. Last.
Erhard suggests: "...uniformly distributed over the range provided by the
parameters First and Last, or the range of the result subtype." He then
withdraws that.
Straw poll: Use a postcondition 6-1-0.
Approve AI with changes: 7-0-0.
----------------------------------
Two problems with these minutes.
First, I have no recollection what "use a postcondition" means. Jeff says that
he thinks Steve Baird made the suggestion, but that's all he recalls. I edited
the AI without using a postcondition, but then the second problem came up.
Second, Jeff claims that we also are supposed to change the wording of A.5.2(42)
like that of A.5.2(41). One clear problem with that is that at least I voted on
this proposal assuming that the change of A.5.2(41) was the only one. I find the
proposed wording to be an ugly patch, and I didn't object to it mainly because I
couldn't think of anything better assuming only one use. One ugly hack is OK,
but duplicating it in successive paragraphs is not acceptable, IMHO. To wit:
A sufficiently long sequence of random numbers obtained by successive calls to
Random is approximately uniformly distributed over the range of the result
subtype{, or in the case of the version of Random with First and Last
parameters, over the range First .. Last.}.
The Random function in an instantiation of Numerics.Discrete_Random is
guaranteed to yield each value in its result subtype{, or in the case of the
version of Random with First and Last parameters, over the range First ..
Last,} in a finite number of calls, provided that the number of such values
does not exceed 2**15.
I don't know if I was the only one who didn't know that a change was intended to
A.5.2(42), but the fact that I didn't put it into my notes suggests that it
wasn't done very clearly. Thus I have to think the vote to approve was invalid;
I never, *ever* would have voted for the above, and I would have objected to
voting for "whatever the editor comes up with" on general principles.
----------------------------------
With these two problems, we will have to rewrite and revote this AI. Since the
problems are relatively minor, we can agree to wording here and then do a letter
ballot to approve the AI (we don't have to spend meeting time on it assuming
that we can agree enough that no one feels obligated to vote against the AI).
I have no suggestion about a postcondition; someone will have to explain what
was meant (and whether that intended to subsume some of the English wording in
the AI).
As far as the wording for A.5.2(41-2), I think we need to define once what the
result ranges are and then use that in the other wording:
{The *result range* of a Random function, is the range First .. Last for the
version of Random with First and Last parameters, and the range of the result
subtype otherwise.}
A sufficiently long sequence of random numbers obtained by successive calls to
Random is approximately uniformly distributed over the {result} range [of the
result subtype].
{A}[The] Random function in an instantiation of Numerics.Discrete_Random is
guaranteed to yield each value in its result {range}[subtype] in a
finite number of calls, provided that the number of such values does not
exceed 2 15.
[Note: "15" in this last sentence is superscript. We're not changing that; it
doesn't show in plain text.]
Thoughts???
***************************************************************
From: John Barnes
Sent: Saturday, December 12, 2015 4:35 AM
A tricky subject. A good idea to define the range first.
The actual requirement is pretty vague and so unwise to use a postcondition. And
we would have to define uniformly distributed which is hardly the point. There
are lots of ways of generating a useless sequence that visits every value the
same number of times. Even G.2.5 is vague since it doesn't define what test
suite it is talking about or what conformance means.
So apart from defining the range first I would just leave it alone but please
ensure that the poor superscript 15 is on the same line as the 2. It's not in
the LNCS version which has a line feed between the 2 and 15. Gosh.
Cheers and Happy Holidays
John
Don't use a postcondition.
***************************************************************
From: Tucker Taft
Sent: Saturday, December 12, 2015 9:32 AM
I generally agree with John. The only thing I could imagine the postcondition
might do is define the minimum and maximum possible values returned. You
probably couldn't say anything more than that about the distribution with a
postcondition.
So, e.g.,
function Random (Gen : Generator;
First : Result_Subtype;
Last : Result_Subtype) return Result_Subtype
with Post => Random'Result in First .. Last;
***************************************************************
From: Bob Duff
Sent: Saturday, December 12, 2015 11:14 AM
> with Post => Random'Result in First .. Last;
I think that's the postcondition that was discussed at the meeting, and I think
somebody said that if we have that, then we don't need to say in English "the
result is in the range ..."
***************************************************************
From: Randy Brukardt
Sent: Saturday, December 12, 2015 12:34 PM
Randy’s proposed introduction of a definition of “result range” just for local
usage in those two paragraphs seems a good simplification to me.
And now that Tuck has written out the post-condition then yes, I think that’s
all that was meant.
Until people are used to looking for post-conditions, I think the (improved)
wording is still needed.
***************************************************************
From: John Barnes
Sent: Friday, December 18, 2015 7:26 AM
I am fretting over that postcondition.
The text says that the value returned will be in the range. Fact. There is
absolutely no hint that it might try to be outside the range and so raise some
sort of exception.
But if we express this idea with a postcondition, then by the dynamic nature of
postconditions in Ada (as opposed to good old Spark) there is an implication
that it might fail and so raise Assertion_Error or whatever.
So for me the postcondition seems weaker. I don't like it.
[Editor's note: This topic was split into AI12-0179-1; the mail thread continues
there.]
***************************************************************
From: Randy Brukardt
Sent: Tuesday, January 5, 2016 12:10 AM
Getting back to the original reason for this thread...
Here's the revised proposed wording section for this AI. I have a question at
the end...
Add after A.5.2(20):
function Random (Gen : Generator;
First : Result_Subtype;
Last : Result_Subtype) return Result_Subtype
with Post => Random'Result in First .. Last;
Add after A.5.2(32):
function Random (Gen : Generator;
First : Result_Subtype;
Last : Result_Subtype) return Result_Subtype
with Post => Random'Result in First .. Last;
Obtains the "next" random number from the given generator, relative to its
current state, according to an implementation-defined algorithm. The result
of the function in an instantiation of Numerics.Discrete_Random is delivered
as a value of the generic formal subtype Result_Subtype that makes the
postcondition True. If the range First .. Last is a null range, Constraint_Error
is raised.
Modify A.5.2(41):
{The *result range* of a Random function, is the range First .. Last for the
version of Random with First and Last parameters, and the range of the result
subtype otherwise.}
A sufficiently long sequence of random numbers obtained by successive calls to
Random is approximately uniformly distributed over the {result} range [of the
result subtype].
Modify A.5.2(42):
{A}[The] Random function in an instantiation of Numerics.Discrete_Random is
guaranteed to yield each value in its result {range}[subtype] in a
finite number of calls, provided that the number of such values does not
exceed 2 15.
-----------
In the new text for function Random, we have "The result of the function in an
instantiation of Numerics.Discrete_Random is delivered as a value of the
generic formal subtype Result_Subtype that makes the postcondition True."
Assuming that we have the blanket rule for postconditions (now proposed in
AI12-0179-1, not yet posted), this sentence says only that the function meets
its specification. We *assume* that everywhere in the Standard, so this is
just noise.
I've continued to include it because this paragraph immediately follows
A.5.2(32), which has similar wording, which is similarly useful (not at all).
[In all of these cases, the important information is in the Implementation
Requirements portion.]
But it would be better to omit this junk wording -- that was the intent of
adding a postcondition as I understand it. To make the wording consistent,
we then would have to delete the existing wording from A.5.2(32).
Specifically, the 2nd and 3rd sentences of the following paragraph would also
be dropped as marked:
Obtains the "next" random number from the given generator, relative to its
current state, according to an implementation-defined algorithm. [The result
of the function in Numerics.Float_Random is delivered as a value of the
subtype Uniformly_Distributed, which is a subtype of the predefined type Float
having a range of 0.0 .. 1.0. The result of the function in an instantiation
of Numerics.Discrete_Random is delivered as a value of the generic formal
subtype Result_Subtype.]
I didn't unilaterally make this change as it seems like more change than is
necessary, even though I can hardly imagine any more empty verbiage in the
Standard. (What else would the result of Random be? A lake in southern
Sheboygan county? Santa Claus?? :-) Who would expect something other than a
value of its result subtype? Whose details can be read in the specification
that precedes this text!)
Should I make this slightly larger change, or forget it?
P.S. I'd like to put this AI to bed someday soon. Or does Someday Never
Come?? :-)
***************************************************************
From: Tucker Taft
Sent: Tuesday, January 5, 2016 2:32 PM
> ... I didn't unilaterally make this change as it seems like more
> change than is necessary, even though I can hardly imagine any more
> empty verbiage in the Standard. (What else would the result of Random
> be? A lake in southern Sheboygan county? Santa Claus?? :-) Who would
> expect something other than a value of its result subtype? Whose
> details can be read in the specification that precedes this text!)
I have no problem with deleting clearly superfluous wording. The danger
of blatantly redundant wording is that the user might think it means more
than it says, and try to read something extra in it. So I would agree,
heave ho!
> Should I make this slightly larger change, or forget it?
OK with me.
***************************************************************
From: Edmond Schonberg
Sent: Tuesday, January 5, 2016 2:43 PM
>> Should I make this slightly larger change, or forget it?
>
> OK with me.
That’s what I would call a Talmudic response ...
***************************************************************
From: Tucker Taft
Sent: Tuesday, January 5, 2016 2:53 PM
Oops. Deleting the clearly redundant wording is fine if it is not clarifying
things for the user.
***************************************************************
From: Bob Duff
Sent: Tuesday, January 5, 2016 4:06 PM
> Assuming that we have the blanket rule for postconditions (now
> proposed in AI12-0179-1, not yet posted), this sentence says only that
> the function meets its specification. We *assume* that everywhere in
> the Standard, so this is just noise.
I agree. We should never say that a particular procedure obeys its
postcondition, because that would lead readers to think that maybe other
procedures don't obey their postcondition. Better to rely on the blanket
statement. So I agree with your suggested change.
***************************************************************
From: Jeff Cousins
Sent: Wednesday, January 6, 2016 3:48 AM
I agree for the same reason.
***************************************************************
From: Tucker Taft
Sent: Thursday, January 7, 2016 8:48 AM
[Editor's note: Letter Ballot stuff stripped out here.]
This new sentence preceding A.5.2(41) has some grammatical issues:
The result range of a Random function, is the
range First .. Last for the version of Random
with First and Last parameters, and the range
of the result subtype otherwise.
That first comma doesn't make sense to me, but then that second comma has a
problem if you leave it out.
How about the following:
Each Random function has a /result range/; this
is the range First .. Last for the version of Random
with First and Last parameters and the range
of the result subtype of the function otherwise.
***************************************************************
From: Jean-Pierre Rosen
Sent: Thursday, January 7, 2016 9:14 AM
[Editor's note: Letter Ballot stuff stripped out here.]
It seems there has been some aggressive "substitute all" in the text.
After "Add after A.5.2(32):", the text is in the code font, and all Ada
keywords are in bold in the editor's note. Under the !discussion, in the
phrase "see the appendix", appendix has been turned into a !appendix
***************************************************************
From: Randy Brukardt
Sent: Thursday, January 7, 2016 1:46 PM
Those are the effects of the auto-formatter. The first defies any correction,
since the only way to prevent the formatter from treating the text as part
of the code would be to not indent it (the autoformatter assumes code
continues until the end of the indented section). But the style of the RM
is to have such text indented. Presumably the !corrigendum will be correct
one it gets added to the AI. I saw this when I posted it, but as noted, I
couldn't think of any fix.
The second occurred because "!appendix" happened to start a line, and thus
the formatter took it as the start of the !appendix section. Since the other
tools make the same assumption, I will change that by moving line breaks
around, but note that the formatting is not part of the official AI, so this
is not a real change at all.
***************************************************************
From: Randy Brukardt
Sent: Thursday, January 7, 2016 1:55 PM
> How about the following:
>
> Each Random function has a /result range/; this
> is the range First .. Last for the version of Random
> with First and Last parameters and the range
> of the result subtype of the function otherwise.
You couldn't have made this comment when I proposed this wording
back on December 11th?? That's kinda the reason I *send* wording suggestions
to the ARG list in the first place.
This does seem like an improvement, I'll use it. I'll wait for any other
suggestions before posting an updated AI.
***************************************************************
From: Steve Baird
Sent: Thursday, January 7, 2016 1:50 PM
The proposed wording includes.
> Modify A.5.2(41):
> {The result range of a Random function, is the range First .. Last for
> the version of Random with First and Last parameters, and the range of
> the result subtype otherwise.} A sufficiently long sequence of random
> numbers obtained by successive calls to Random is approximately
> uniformly distributed over the {result} range [of the result subtype].
When we say "is approximately uniformly distributed over the result range",
what single range are we talking about? We are talking about "result range"
as though it is a property of the function, but two calls to the same function
can have different result ranges (in the case where First and Last are passed
in).
To be correct, I think we really need to say that the result range differs
from one call to the next (i.e., it is a property of the call, not of the
function) and then require something like
"A sufficiently long sequence of random numbers obtained by
successive calls to Random which have the same result range is ..."
The idea is that you have lots of calls to the Random which takes a First and
Last parameter, then you partition that set of calls so that two calls are in
the same partition element if and only if their First/Last values match (which
means that they have the same result range). We then impose the "sufficiently
long sequence" requirement on partition elements (where the notion of a single
result range makes sense).
Or am I being too pedantic?
***************************************************************
From: Randy Brukardt
Sent: Thursday, January 7, 2016 2:23 PM
> Or am I being too pedantic?
Usually. ;-)
Why didn't you point this out when you were complaining about this
wording in Bennington? The wording proposed there had a similar problem. Or
back in December when the current version was proposed??
OK, now that I that I got *that* off of my chest, I'm understanding that you
are proposing the following change to this wording, folding in Tucker's
change:
Modify A.5.2(41):
{Each call of a Random function has a *result range*; this is the range
First .. Last for the version of Random with First and Last parameters and
the range of the result subtype of the function otherwise.}
A sufficiently long sequence of random numbers obtained by successive calls
to Random {which have the same result range} is approximately uniformly
distributed over the {result} range [of the result subtype].
(The changes are to add "a call of" in the first paragraph, and "which have
the same result range" to the second paragraph.
One effect of this wording is that the requirement would hold for mixed calls
to the Random without First and Last parameters and calls to Random where
First = Result_Subtype'First and Last = Result_Subtype'Last (since these
result ranges would be the same). That's probably OK, we'd want the routines
to operate in the same way in that case.
Should we make these changes? (I lean toward doing so, as they clarify the
model.) [I think we'd restart the Letter Ballot if we do, as this is a pretty
substantive change.]
***************************************************************
From: Bob Duff
Sent: Thursday, January 7, 2016 3:39 PM
[Editor's note: Letter Ballot stuff stripped out here.]
...
> The revised AI can be found on-line in the normal place:
> http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0144-1.txt (make
> sure you use the most recent version; I mistakenly posted a version
> with a blank space where the change for A.5.2(32) was supposed to go,
> so there are two /05 versions in the VCS).
Sounds naughty to have to same-numbered versions. It's not like we're going
to run out of positive integers. ;-)
...
I agree with the comments that led to this:
> Modify A.5.2(41):
>
> {Each call of a Random function has a *result range*; this is the range
> First .. Last for the version of Random with First and Last parameters and
> the range of the result subtype of the function otherwise.}
>
> A sufficiently long sequence of random numbers obtained by
> successive calls to
> Random {which have the same result range} is approximately uniformly
> distributed
> over the {result} range [of the result subtype].
I suggest deleting this from the AI:
(unless a compiler
vendor makes them available through a non-standard package).
You could say that about ANY new feature.
Obtains the ?next? random number from the given generator, relative to its
current state, according to an implementation-defined algorithm.[ The result
of the function in Numerics.Float_Random is delivered as a value of the
subtype Uniformly_Distributed, which is a subtype of the predefined type
Float having a range of 0.0 .. 1.0. The result of the function in an
instantiation of Numerics.Discrete_Random is delivered as a value of the
generic formal subtype Result_Subtype.]
[Editor's note: We delete the last two paragraphs here as all they say is
^^^^^^^^^^
Those things we're deleting are sentences, not paragraphs.
***************************************************************
From: Gary Dismukes
Sent: Thursday, January 7, 2016 4:11 PM
> Should we make these changes? (I lean toward doing so, as they clarify the
> model.) [I think we'd restart the Letter Ballot if we do, as this is a
> pretty substantive change.]
Yes, I'm in favor of going ahead with those changes. With one minor tweak:
change "which" to "that" in "{which have the same result range}". Other than
that it looks good. :)
If that means restarting the ballot, will hold off on my vote.
***************************************************************
From: Randy Brukardt
Sent: Thursday, January 7, 2016 4:30 PM
...
> http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0144-1.txt (make
> > sure you use the most recent version; I mistakenly posted a version
> > with a blank space where the change for A.5.2(32) was supposed to
> > go, so there are two /05 versions in the VCS).
>
> Sounds naughty to have to same-numbered versions. It's not like we're
> going to run out of positive integers. ;-)
I try to ensure that we never have two versions on the same date, so we're
limited to one per day in usual circumstances. In a situation like this,
where an incomplete AI gets stored, I just make the correction as the previous
version shouldn't exist. (But one can't delete versions from the VCS, so my
mistakes will live forever.)
I also don't use new version numbers for presentation changes (that is, moving
around the whitespace), for posting mail, and (rarely) for fixing obvious
typos in nonnormative parts of an AI.
...
> I suggest deleting this from the AI:
>
> (unless a compiler
> vendor makes them available through a non-standard package).
>
> You could say that about ANY new feature.
That's from the original posting by Adam; as I frequently tell John, we try to
use as much of the original question as possible, and as such we edit it as
little as possible. (In John's case, that means we might leave some iffy
grammar and some unclear constructions.) I will trim out extraneous
information, so it's a close call as to whether to leave it or remove it. I
view it as harmless, which is why I left it initially.
It fairly frequent that I have to create !problem or !question out of thin air,
in which case all changes are on the table. But I don't want to put words in
other people's mouths unless absolutely necessary (and it's best to let as much
of their original thought remain as possible).
...
> [Editor's note: We delete the last two paragraphs here as all they say is
> ^^^^^^^^^^
>
> Those things we're deleting are sentences, not paragraphs.
Arrggghh!!
***************************************************************
From: Steve Baird
Sent: Monday, January 11, 2016 12:36 PM
[Significant part of a longer message - Editor.]
I'm seeing
"does not exceed 2 15"
as opposed to, perhaps,
"does not exceed 2 ** 15" . Is there a typo here?
***************************************************************
From: Gary Dismukes
Sent: Monday, January 11, 2016 1:11 PM
I believe that's a actually a superscript, and it doesn't get rendered
properly in the online AI web page.
[Added Editor's note to clarify - Editor.]
***************************************************************
From: John Barnes
Sent: Tuesday, January 12, 2016 4:09 AM
[Significant part of a longer message - Editor.]
The sentence "postconditions are assumed to be a requirement on the body of
a subprogram" upset me I could read that to mean that every subprogram body
has to have a postcondition.
[Changed "be" to "impose" - Editor.]
***************************************************************
From: Erhard Ploedereder
Sent: Tuesday, January 19, 2016 8:00 PM
[Significant part of a longer message - Editor.]
For a letter ballot I am casting a "reject" vote (although I agree with the
techbnical intent), because
1. The sentence
> A sufficiently long sequence of random numbers obtained by successive
> calls to Random is approximately uniformly distributed over the
> {result} range [of the result subtype].
is at best seriously incomplete, at worst plain wrong. It should read:
A sufficiently long sequence of random numbers obtained by successive calls to
Random with the same actual parameter for Gen and the same result_subtype is
approximately uniformly distributed over the {result} range [of the result
subtype].
Obviously, both qualifications, especially the Gen bit, are important.
2. Looking at the interface in context, I am not sure that it is a good
idea to have one interface in the package use postconditions, while
the others do not. As much as I like postconditions, this looks odd.
Maybe we could (re)do this package in its entirety with
postconditions?
Also, there are editorial bugs:
Formatting is somehow off (courier fonts for parts of the text.) In the
Semantics section of (32): ". it" -> ". It"
***************************************************************
From: Randy Brukardt
Sent: Wednesday, January 20, 2016 8:28 PM
[Significant part of a longer message - Editor.]
...
> Obviously, both qualifications, especially the Gen bit, are important.
I suppose you're right, but this is Ada 95 wording, and it's hard to imagine
how important it could be to be precise about the generator parameter now
when it never was before.
> 2. Looking at the interface in context, I am not sure that it is a good
> idea to have one interface in the package use postconditions, while
> the others do not. As much as I like postconditions, this looks odd.
> Maybe we could (re)do this package in its entirety with
> postconditions?
The postcondition on the new routine is pretty minimal (it just defines the
range); the operative stuff is in the Implementation Requirements that can't
be described in a postcondition. There's no point in adding a postcondition
of True to the other Random functions, and there isn't anything about the
result which can be described with a postcondition. I don't see any useful
postconditions that could be added to any of the other routines, either
(Reset, Save, Image, Value). That's rather typical: the containers most
likely will only have a handful of routines that have postconditions (based
on our previous discussion on AI12-0112-1), the majority of routines will not.
While this was one of the reasons that I've been luke-warm at best at adding
a postcondition rather than words for this routine, this is a discussion that
I clearly lost and I don't see much point in revisiting it lest we enter an
infinite loop of no progress.
> Also, there are editorial bugs:
> Formatting is somehow off (courier fonts for parts of the
> text.) In the Semantics section of (32): ". it" -> ". It"
Sigh. You're (like several others) looking at the autoformatted version, which
should *never* be used for editorial review of an AI, as the format is
generated by guessing what each piece of text is. It will *always* have errors.
These particular errors (assuming you meant "if" instead of "it", since there
is no "it" in (32)) come from something that was discussed twice previously.
***************************************************************
From: Erhard Ploedereder
Sent: Friday, January 22, 2016 6:22 PM
Returning to more technical arguments ....
What got me going here was the fact that a user can easily misjudge the
situation casued by a local instantiation of the generic package, which par
force, because the generator type is inside the package (-- maybe stupidly --),
causes his seemingly sequential calls on Random to NOT return random numbers,
because the Generators differ. (Wasn't this even the scenario why this AI came
up in the first place?)
Worth mentioning? I think so.
Worth a shitstorm? No.
***************************************************************
From: Erhard Ploedereder
Sent: Sunday, January 24, 2016 5:09 PM
[Significant part of a longer message - Editor.]
I have yet another technical issue:
Now that there are two Random functions, what do the semantics say if both of
them get called in some order? Clearly the uniform distribution sentence is
then all screwed up.
(A safer version would duplicate the surrounding package, so that the
Generators keep the two kinds of calls apart, but that is a lot of text or
the manual.)
In exchange, I take back the one about a singular interface with
postconditions. The other functions don't really have specifiable ones.
***************************************************************
From: Randy Brukardt
Sent: Sunday, January 24, 2016 5:32 PM
[Significant part of a longer message - Editor.]
> I have yet another technical issue:
>
> Now that there are two Random functions, what do the semantics say if
> both of them get called in some order?
> Clearly the uniform distribution sentence is then all screwed up.
My intent (I thought that it was written somewhere, but I can't find it now)
is that it doesn't matter. That is, only the result range (and, as you point
out, the generator) matters, not the form of the call.
Specifically, if one interleaved calls to the two random functions:
... Random (Gen, Result_Subtype'First, Result_Subtype'Last) ...
... Random (Gen) ...
... Random (Gen, Result_Subtype'First, Result_Subtype'Last) ...
... Random (Gen) ...
The uniform distribution requirement would still apply. The expectation is
that these really different views of the same logical function; we have two
separate specifications for compatibility.
As the AI says, if this was 1994, we just would have defined the function
as:
function Random (Gen : Generator;
First : Result_Subtype := Result_Subtype'First;
Last : Result_Subtype := Result_Subtype'Last) return Result_Subtype;
and there would be no question about the handling of "different" random
functions. (This specification is incompatible on the margins -- specifically
for renaming and formal subprogram matching -- and it's too late to break
those things.)
I think the wording reflects that. We have (with your generator change):
A sufficiently long sequence of random numbers obtained by successive calls
to Random that have the same generator and result range is approximately
uniformly distributed over the result range.
And the definition of result range is in the (new) previous paragraph:
Each call of a Random function has a *result range*; this is the range
First .. Last for the version of Random with First and Last parameters and
the range of the result subtype of the function otherwise.}
> (A safer version would duplicate the surrounding package, so that the
> Generators keep the two kinds of calls apart, but that is a lot of
> text for the manual.)
I don't see the point. We want to think of this as two views of a single
random function, not two different functions.
> In exchange, I take back the one about a singular interface with
> postconditions. The other functions don't really have specifiable
> ones.
Yup, that's what I noticed when I looked at it.
***************************************************************
From: Erhard Ploedereder
Sent: Monday, January 25, 2016 6:43 AM
> Specifically, if one interleaved calls to the two random functions:
> ... Random (Gen, Result_Subtype'First, Result_Subtype'Last) ...
> ... Random (Gen) ...
> ... Random (Gen, Result_Subtype'First, Result_Subtype'Last) ...
> ... Random (Gen) ...
>
> The uniform distribution requirement would still apply. The
> expectation is that these really different views of the same logical
> function; we have two separate specifications for compatibility.
True for the above. False for:
> ... Random (Gen, Result_Subtype'First+5, Result_Subtype'Last-5)
> ... Random (Gen) ...
> ... Random (Gen, Result_Subtype'First+5, Result_Subtype'Last-5)
> ... Random (Gen) ...
Now the 10 outliers get chosen half as often, since the distribution in the
one routine gets narrowed to the First..Last range, and in the other routine
to Result_Subtype'Range. And, within each category, the series is not
reproducible, due to possibly varying numbers of the "other" calls.
(What one could promise is that the float-version with the uniform
distribution is underneath it all, and these versions merely map its uniform
distribution faithfully onto discrete values within the result ranges.)
***************************************************************
From: Randy Brukardt
Sent: Monday, January 25, 2016 6:12 PM
> Now the 10 outliers get chosen half as often, since the distribution
> in the one routine gets narrowed to the First..Last range, and in the
> other routine to Result_Subtype'Range.
Sure, but the requirement doesn't apply to these calls because the ranges are
different. The same would be true for:
... Random (Gen, 1, 10) ... -- (A)
... Random (Gen, 1, 5) ... -- (B)
... Random (Gen, 1, 10) ... -- (C)
... Random (Gen, 1, 5) ... -- (D)
> And, within each category, the series is not reproducible, due to
> possibly varying numbers of the "other" calls.
I couldn't parse this, but I do see a problem with the wording (and probably
it's the same one you're seeing).
Specifically, the requirement wouldn't apply to calls (A) and (B) above,
because the ranges are different. But the requirement does appear to apply to
calls (A) and (C) above; the calls are "successive" and meet the other
requirements. And that's not going to be easy to ensure, as the changes that
call (B) makes to the generator Gen might make it difficult to preserve
uniformity. Particularly if the pattern of calls is irregular. Similarly, a
call to Reset or one of the other operations on Gen would mess up the
requirement. Worst case:
Reset (Gen, 1);
A(1) := Random (Gen, 1, 10);
Reset (Gen, 1);
A(2) := Random (Gen, 1, 10);
...
Arguably, the requirement applies to the two calls to Random here, but they'll
most likely both return the same value.
I think we can fix the wording to cover these cases by simply changing
"successive" to "consecutive" in the proposed wording:
A sufficiently long sequence of random numbers obtained by {consecutive}
[successive] calls to Random that have the same generator and result range
is approximately uniformly distributed over the result range.
Maybe it would help to add an AARM note to spell out that "consecutive" here
means that there are no intervening explicit calls (to anything) between the
calls to Random. That prevents any issues because the generator is changed by
calling Reset, other calls on Random with the same generator, and probably
other things that we haven't thought of yet.
You (or someone else) might complain that this version of the requirement
would hardly ever apply. That's true, but it's OK, since the only real purpose
of this requirement is to give users and implementers a baseline for a "good"
random generator. And so long as the requirement *could* happen in a program
(and it certainly could, just put the values in an array and analyze them
later), the random function will have to behave properly. I'm not worried
about someone constructing a random generator that only works right if there
are no intervening calls -- one would have to be actively evil for a call to
some unrelated function to make the random generator malfunction. (And the RM
does not worry about actively evil implementers anyway, they can't be stopped
by a mere Standard.)
> (What one could promise is that the float-version with the uniform
> distribution is underneath it all, and these versions merely map its
> uniform distribution faithfully onto discrete values within the result
> ranges.)
That exposes too much mechanism; arguably, there wouldn't even have to be a
float version.
I think the above change does the trick. One could argue that the intent of
"successive" was to not allow non-matching calls in between, but it's not
obvious and at a minimum deserves an AARM note.
Thoughts??
***************************************************************
From: Erhard Ploedereder
Sent: Tuesday, January 26, 2016 6:10 AM
> Sure, but the requirement doesn't apply to these calls because the
> ranges are different. The same would be true for:
>
> ... Random (Gen, 1, 10) ... -- (A)
> ... Random (Gen, 1, 5) ... -- (B)
> ... Random (Gen, 1, 10) ... -- (C)
> ... Random (Gen, 1, 5) ... -- (D)
>
>> > And, within each category, the series is not reproducible, due to
>> > possibly varying numbers of the "other" calls.
> I couldn't parse this, but I do see a problem with the wording (and
> probably it's the same one you're seeing).
>
> Specifically, the requirement wouldn't apply to calls (A) and (B)
> above, because the ranges are different. But the requirement does
> appear to apply to calls (A) and (C) above; the calls are "successive"
> and meet the other requirements. And that's not going to be easy to
> ensure, as the changes that call (B) makes to the generator Gen might
> make it difficult to preserve uniformity. Particularly if the pattern
> of calls is irregular. Similarly, a call to Reset or one of the other
> operations on Gen would mess up the requirement.
Since the calls go to the same generator, I would expect
- indeed, each one, within their range, would distribute uniformly
- however, in general, not reproducibly, i.e., for cases, where
the number of intervening other calls to the same Gen is
dynamically determined, the sequence will not be reproducibly
the same.
So, one requirement on Random is met, the other is not. The counter example
is:
Reset (Gen, 1);
A(1) := Random (Gen, 1, 10);
for I in 1..changingvalue loop
... Random(Gen, 1, 5);
end loop;
A(2) := Random (Gen, 1, 10);
***************************************************************
From: Randy Brukardt
Sent: Tuesday, January 26, 2016 4:49 PM
Yes, that's what I said (or at least meant). I went on to point out that we
don't really care if the uniform requirement only applies in narrow
circumstances, since an implementer would have to get it right in those
circumstances and only an actively evil implementer would manage to do
something bad in a case like the above if they've really met the uniform
requirement.
But you didn't answer my question: does changing the wording to "consecutive"
(rather than "successive") and then adding an AARM note to clarify further
address your concern???
And if not, what *would* address your concern???
P.S. To repeat yesterday's message, here's exactly what I suggested (with an
explicit AARM note to follow):
A sufficiently long sequence of random numbers obtained by consecutive
calls to Random that have the same generator and result range
is approximately uniformly distributed over the result range.
AARM Discussion: In this rule, "consecutive" means that there are no
intervening explicit calls (to anything) between the calls to Random. This
restricts the rule to only applying to cases where just the Random function
changes the generator. We don't mean to impose a requirement if there are
intervening calls to Reset, to Random with same generator but a different
result range, or any other case that would affect the sequence of values
returned.
[I don't want to get into formally defining what and what cannot appear
between two consecutive calls, as that way seems to lead to madness. If the
generator is hardware-based, simply wasting differing amounts of time between
calls could change the behavior to something non-uniform. But we at least have
to allow the program to store the value somewhere between calls to Random.
Vague is probably better here.]
***************************************************************
From: Tucker Taft
Sent: Wednesday, January 27, 2016 8:42 AM
> But you didn't answer my question: does changing the wording to
> "consecutive" (rather than "successive") and then adding an AARM note
> to clarify further address your concern???
Saying "consecutive" helps me, at least.
***************************************************************
From: Erhard Ploedereder
Sent: Wednesday, January 27, 2016 10:55 AM
> But you didn't answer my question: does changing the wording to
> "consecutive" (rather than "successive") and then adding an AARM note
> to clarify further address your concern???
>
> And if not, what *would* address your concern???
I am o.k. with the new wording, especially with the idea expressed by the
AARM note. It is more user-understandable than what I had in mind.
Maybe the AARM note could be relaxed by saying "...means that there are no
other intervening calls involving the same generator." Tying it to a
statically checkable and very restrictive property seems unnecessary.
***************************************************************
From: Randy Brukardt
Sent: Wednesday, January 27, 2016 2:20 PM
In thinking about it more, I think you're right. I was trying to cover for
the *next* problem someone thought of, but that's probably taking CYA too far.
P.S. CYA = Cover your a**!
***************************************************************
From: Randy Brukardt
Sent: Friday, February 26, 2016 8:00 PM
Voting results for Letter Ballot on AI12-0144-1/07. This AI is approved by a
10-0-2 vote.
Approve as is: Randy Brukardt, Steve Baird, Bob Duff, Tucker Taft, Jean-Pierre
Rosen, Jeff Cousins, Ed Schonberg, Tullio Vardanega, Gary Dismukes, and Brad
Moore.
Voting, but Abstaining: John Barnes and Alan Burns.
Not voting: Pascal Leroy, Erhard Ploedereder, and Florian Schanda.
****************************************************************