Version 1.6 of ai12s/ai12-0144-1.txt

Unformatted version of ai12s/ai12-0144-1.txt version 1.6
Other versions for file ai12s/ai12-0144-1.txt

!standard A.5.2(20)          16-01-06 AI12-0144-1/05
!standard A.5.2(32)
!standard A.5.2(41)
!class Amendment 14-12-04
!status ARG Approved 7-0-0 15-10-18
!status Promising (10-0-0) 15-02-26
!status work item 14-12-04
!status received 14-11-06
!priority Low
!difficulty Easy
!subject Make Discrete_Random more flexible
!summary
Add a more general function Random to Discrete_Random.
!problem
A recent StackOverflow question asked how to write a function:
function generate_random_number (n: in Positive) return Integer is
to generate one random number in the range 0..n. n could be different for each call.
It looks like this isn't easily possible using Ada.Numerics.Discrete_Random, since a new instantiation of Discrete_Random must be declared inside generate_random_number, and a new generator variable of type Instantiation.Generator needs to be declared. The questioner tried Reset to reset the generator to a time-dependent state each time, but this failed because the procedure was being called faster than the clock was changing.
The most direct way to accomplish something like this seems to be to use Float_Random, reset the generator, and then perform the needed computations to yield a discrete random number. This is unfortunate, because the Discrete_Random packages are no doubt already performing these or related computations, and hopefully doing it in a way that produces the best results, which is not trivial; but these computations aren't available to a program (unless a compiler vendor makes them available through a non-standard package).
!proposal
(See wording.)
!wording
Add after A.5.2(20):
function Random (Gen : Generator; First : Result_Subtype; Last : Result_Subtype) return Result_Subtype with Post => Random'Result in First .. Last;
Modify A.5.2(32):
[Editor's note: We delete the last two paragraphs here as all they say is that the function result is in the result subtype, which goes without saying. The actual description of the result is in the Implementation Requirements follow later; the existing text promises to give that without doing so. We remove this text now so that we don't have to give similar text in the description of the new subprogram, where it would be substantially more complicated.]
Add after A.5.2(32):
function Random (Gen : Generator; First : Result_Subtype; Last : Result_Subtype) return Result_Subtype with Post => Random'Result in First .. Last;
Obtains the "next" random number from the given generator, relative to its current state, according to an implementation-defined algorithm. If the range First .. Last is a null range, Constraint_Error is raised.
[Editor's notes: During the ARG phone meeting of February 26, 2015, it was suggested that the null range case raise Program_Error. However, Constraint_Error is the existing exception for that (A.5.2.(39)), so we used that here.
Postconditions are assumed to be a requirement on the body of a subprogram, as proposed in AI12-0179-1. If that AI is not approved, then we should not use a postcondition here; we'd need some wording instead. End Editor's notes.]
Modify A.5.2(41):
{The result range of a Random function, is the range First .. Last for the version of Random with First and Last parameters, and the range of the result subtype otherwise.}
A sufficiently long sequence of random numbers obtained by successive calls to Random is approximately uniformly distributed over the {result} range [of the result subtype].
Modify A.5.2(42):
{A}[The] Random function in an instantiation of Numerics.Discrete_Random is guaranteed to yield each value in its result {range}[subtype] in a finite number of calls, provided that the number of such values does not exceed 2 15.
!discussion
The author ran into this problem with a hobby solitaire program. It was necessary to use a reproducible generator (thus using a single generator created using Reset with an Initiator parameter). The algorithm for dealing cards is to create an array containing all of the cards, then select one from that array using a random generator. Then close up the array, and repeat the selection on the array (minus the selected card). This means you want to select with ranges 1..52, then 1..51, then 1..50, etc.
That's not possible with the Ada 95 definition of Discrete_Random.
Users often accomplish this in ways that do not provide a truly uniform distribution. It's easy to come up with methods that don't work (see the
!appendix

One algorithm that is correct is to start with a uniform random generator
that produces an integer in the range 0 .. Large. (Usually Large is some
power of 2 - 1). Then, for a call to Random with bounds First and Last, we
determine Len := (Last-First+1) and then Mult, which is the largest multiple
of Len that is less than Large. (Len cannot be greater than Large.) We then
loop, getting a random value, and exiting the loop when the random value is
less than Mult. Finally, we mod the random value (now in the range 0 ..
Mult-1) by Len and add First.

This loop will usually exit on the first iteration, and even in the worst
case (Len = Large/2+1), there is at least a 50% chance that the loop will
exit on each iteration. So it is unlikely that many iterations will be needed.


If this was 1994, we should have simply defined Random in Discrete_Random as
follows:

function Random (Gen   : Generator;
                 First : Result_Subtype := Result_Subtype'First;
                 Last  : Result_Subtype := Result_Subtype'Last) return Result_Subtype;

This allows leaving out the First parameter if it matches that of the formal
subtype (which it usually will).

Unfortunately, that is more incompatible, as renames and access-to-subprogram
matching would fail (whereas the proposed solution is only incompatible if
there are use-clause conflicts with the profile, which is unlikely).

!ASIS

No impact.

!ACATS test

An ACATS test should be constructed to check that the new function exists and
works appropriately. Perhaps the card-dealing example mentioned in the
discussion could be used.

!appendix

!topic Allow random numbers for different discrete subtypes using same generator
!reference RM12 A.5.2
!from Adam Beneschan 14-11-06
!discussion

This arose from a StackOverflow question in which a questioner wanted
to write a function:

    function generate_random_number (n: in Positive) return Integer is

to generate one random number in the range 0..n.  n could be different
for each call.

It looks like this isn't easily possible using
Ada.Numerics.Discrete_Random, since a new instantiation of
Discrete_Random must be declared inside generate_random_number, and a
new generator variable of type Instantiation.Generator needs to be
declared.  The questioner tried Reset to reset the generator to a
time-dependent state each time, but this failed because the procedure
was being called faster than the clock was changing.

Perhaps there are ways to get this to work with an initiator parameter
on Reset.  However, the most direct way to accomplish something like
this seems to be to use Float_Random, reset the generator, and then
perform the needed computations to yield a discrete random number.
This is unfortunate, because the Discrete_Random packages are no doubt
already performing these or related computations, and hopefully doing
it in a way that produces the best results, which is not trivial; but
these computations aren't available to a program (unless a compiler
vendor makes them available through a non-standard package).

I think there should be a way to generate random numbers for different
discrete ranges using the same generator.  Several possibilities come
to mind:

(1) Add to Discrete_Random:

    function Random (Gen : Ada.Numerics.Float_Random.Generator)
        return Result_Subtype;

    which would allow the program to set up one common generator using
    Float_Random.

(2) Add to Float_Random:

    generic
        type Result_Subtype is (<>);
    function Random (Gen : Generator) return Result_Subtype;

    Basically the same as #1.

(3) Add a new non-generic package that defines a common generator type
    that could be used in Random functions in Discrete_Random and
    Float_Random.  This package would also define a common State type
    and provided the Reset, Save, Image, and Value subprograms
    currently provided by Float_Random and Discrete_Random.

Since this question came from a StackOverflow poster who could have
just been experimenting, I can't say how useful in practice this
feature would be.  It does seem like a flaw that it's missing,
however.

***************************************************************

From: Jean-Pierre Rosen
Sent: Thursday, November 6, 2014  3:17 PM

> It looks like this isn't easily possible using
> Ada.Numerics.Discrete_Random

Isn't it possible to use a random generator on Natural and return
Value mod n + 1 ?

(the quality of the random numbers might not be as good for big n's,
but it may be sufficient in practice)

***************************************************************

From: Adam Beneschan
Sent: Thursday, November 6, 2014  4:47 PM

Sure, if you don't want a uniform distribution.  Unless the number of values
in Natural (2**m for some m) is divisible by (n + 1), doing this will result
in some values in the range being generated with a higher probability than
others.

There are ways to make things uniform.  The point is that the implementors
of Discrete_Random have, I presume, already figured out how to do this (I
think they would have to in order to satisfy Annex G); why should we require
that those with the kind of problem I wrote about have to reinvent the wheel?

***************************************************************

From: Pascal Obry
Sent: Thursday, November 6, 2014  4:10 PM

> Isn't it possible to use a random generator on Natural and return
> Value mod n + 1 ?
>
> (the quality of the random numbers might not be as good for big n's,
> but it may be sufficient in practice)
>

Or better, just use Ada.Numerics.Float_Random and multiply by N.

***************************************************************

From: Jean-Pierre Rosen
Sent: Friday, November 7, 2014  2:32 AM

> Sure, if you don't want a uniform distribution.  Unless the number of
> values in Natural (2**m for some m) is divisible by (n + 1), doing
> this will result in some values in the range being generated with a
> higher probability than others.

Well, correct my maths if I'm wrong, but it seems to me that the average
unbalance is (2**m rem (n+1)) / 2**m.

If what is needed is a dice-4, dice-6 and dice-12, the unbalance would be
neglectible (even up to n=2**20 for a 32 bits generator).

***************************************************************

From: Randy Brukardt
Sent: Friday, November 7, 2014  6:44 PM

This problem happened to me in a hobby program: I needed to assign playing
cards to a Solitare layout. I wanted to use a reproducible generator (thus
using a single generator created using Reset with an Initiator parameter).
To properly deal cards, it's best to create an array containing all of the
cards, then select one from that array using a random generator. Then close
up the array, and repeat the selection on the array (minus the selected card).
This means you want to select with ranges 1..52, then 1..51, then 1..50, etc.

Since m in this case is 52!, I'm pretty sure that no 32-bit or even 64-bit
generator is going to be uniform using your formula.

I ended up using Float_Random, using a loop to avoid having to do any sort of
conversion out of float. (I couldn't find any way to make that work and still
keep a uniform distribution.)

It would have been nice to be able to use Discrete_Random for this case,
especially as every simulation problem that I've ever dealt with has had this
form. (None of which ever were work related, so I've never thought to bring
this up at the ARG level.) What is the value of having a package that never
can be used on actual problems??

Interestingly, I ended up with a better generator having used Float_Random,
since our Discrete_Random is not going to generate an exactly uniform
distribution -- it has the same conversion problems that I couldn't find any
way to avoid with the card dealer. It's "good enough" for small ranges (like
dice or cards) [as noted by J-P's formula], but some large ranges would have
a noticeable non-uniformity. Probably should fix that someday. (And we all
know "Someday Never Comes" - thanks to Credence Clearwater Revival. :-)

***************************************************************

From: Jean-Pierre Rosen
Sent: Friday, November 7, 2014  11:59 PM

>> Isn't it possible to use a random generator on Natural and return
>> Value mod n + 1 ?
>>
>> (the quality of the random numbers might not be as good for big n's,
>> but it may be sufficient in practice)
>>
>
> Or better, just use Ada.Numerics.Float_Random and multiply by N.

This doesn't work.

During the making of Ada95, there was a huge discussion between numerician
about whether it was possible to make a discrete generator out of a floating
one, and after megabytes of e-mail, the conclusion was that it was not
possible.

I'm not enough of a numerician to explain why,

***************************************************************

From: Jean-Pierre Rosen
Sent: Saturday, November 8, 2014  12:12 AM

> This means you want to select with ranges 1..52, then 1..51, then
> 1..50, etc.
>
> Since m in this case is 52!, I'm pretty sure that no 32-bit or even
> 64-bit generator is going to be uniform using your formula.

?? m (if you refer to the number used as a parameter to the function we are
discussing) is 52, then 51, ... Certainly a "small number" for the generator
bias.

As I use to say, I'm not a numerician, I just know enough of numerics to
realize how easy it is to make incorrect statements about numerics when you
are not a numerician. So I'm asking:

1) Is it a real problem? (i.e. worth spending time on it - we should take
advice of real numericians)

2) What do other languages do? (to be fair, having a functionality is not
enough if there is no guarantee of uniformity - anybody can write a bad random
generator).

***************************************************************

From: Jeffery R. Carter
Sent: Saturday, November 8, 2014  12:36 AM

>>> Isn't it possible to use a random generator on Natural and return
>>> Value mod n + 1 ?
>>>
>>> (the quality of the random numbers might not be as good for big n's,
>>> but it may be sufficient in practice)

For some applications (games, for example) the statistics of the RNG may not
be important, but for others, they are very important.

Given an RNG that returns uniformly distributed values in 0 .. 2 ** M - 1
(Unsigned_32, for example, where M=32), taking R rem (N + 1) gives uniformly
distributed results only when M rem (N + 1) = 0. All other cases will not be
satisfactory for strict uses.

>> Or better, just use Ada.Numerics.Float_Random and multiply by N.
>>
> This doesn't work.

Right. Rounding R * N to the nearest integer results in zero and N appearing
about 1/2 as often as 1 .. N - 1. That may not be acceptable even for some of
the less strict uses.

A better approach is to truncate R * (N + 1), but even that isn't acceptable
for strict uses. This is the approach taken by the Random_Int function in
PragmARC.Universal_Random.

***************************************************************

From: Randy Brukardt
Sent: Saturday, November 8, 2014  1:26 AM

> > This means you want to select with ranges 1..52, then 1..51, then
> > 1..50, etc.
> >
> > Since m in this case is 52!, I'm pretty sure that no 32-bit or even
> > 64-bit generator is going to be uniform using your formula.
>
> ?? m (if you refer to the number used as a parameter to the function
> we are discussing) is 52, then 51, ... Certainly a "small number" for
> the generator bias.

To use the *existing* Discrete_Random, you have to use an m that is a factor
of all of them, as you can't create a new generator for each step (and you'd
have to do that to use different instantiations).

To use the float generator, you don't use "m" at all.

> As I use to say, I'm not a numerician, I just know enough of numerics
> to realize how easy it is to make incorrect statements about numerics
> when you are not a numerician.

Same here. But I know that I couldn't get even acceptable results out of the
existing random generators without extreme care (and I don't have critical
requirements) -- essentially I had to never convert the float value to any
discrete type. (And I can't be certain I got it right as it is.) You noted
that multiplying by N isn't enough, nor is truncating (that caused values to
occassionally be out of range). I used a loop and a float compare to avoid
that problem, but it might have left an extra value into one of the end
intervals. I don't know of any way to avoid that problem (which is at least
small enough for my purposes, given the float value is a long float, so we
have n/2**53, which is pretty darn small).

Note that Jeff Carter said the same thing:

> Right. Rounding R * N to the nearest integer results in zero and N appearing
> about 1/2 as often as 1 .. N - 1. That may not be acceptable even for some of
> the less strict uses.

It wasn't for my card dealing application, for which I wanted to simulate more
than 100,000 deals (sometimes running into millions of deals).

> A better approach is to truncate R * (N + 1), but even that isn't acceptable
> for strict uses. This is the approach taken by the Random_Int function in
> PragmARC.Universal_Random.

That didn't even work for me, as there was a non-zero chance of R = 1.0; in
that case the result ended up indexing out of the range of the upper end of
the selection array (that's how I found out it was still wrong, a bizarre
Constraint_Error after simulating several thousand deals). I ended up putting
that value into the uppermost bin; that's makes the uppermost bin slightly more
likely to be selected, but never more often than 52/2**53, which seemed small
enough for my application. (That would take roughly 2**40 simulations to be
significant.) I don't know of any way to avoid that problem, perhaps that's
the source of the issue that J-P remembered reading about.

But how to do this perfectly I don't know. The implementation of
Discrete_Random in Janus/Ada is probably worse than that, because using a loop
isn't practical if the range is more than a few hundred items, and I probably
didn't have a real truncation operation available when I wrote it.

Which I believe I had to do, because no "real numerician" provided an
implementation that actually would work for Janus/Ada as it existed in the
1990s. The Float_Random was from Ken Dritz, but I don't think there was ever a
matching Discrete_Random implementation (and I could find none in the
literature). Probably it would be easier to find these things today (no Google
in 1996), but of course rewriting and retesting that is a low priority in the
absence of some customer demand.

> So I'm asking:
>
> 1) Is it a real problem? (i.e. worth spending time on it - we should
> take advice of real numericians)

It's certainly a real problem in that there is a class of applications where
neither Discrete_Random nor Float_Random (as currently defined) can be used to
provide completely uniform behavior. Perhaps there is some numerical reason
that the problem can never be solved, but we will need to be convinced of that.

> 2) What do other languages do? (to be fair, having a functionality is
> not enough if there is no guarantee of uniformity - anybody can write
> a bad random generator).

I don't think that other languages even had an equivalent for Discrete_Random
back in the 1990s. Whether that's changed, I don't know; none of the languages
I've ever used has had anything other than a float version.

As such, this seems to be a self-made Ada problem. Whether or not the landscape
for other languages has changed, it seems amazingly restrictive to have to
select every random number from the same range. There are a lot of problems
where that simply doesn't work.

Perhaps there is some numerical reason for Discrete_Random (I don't recall the
discussions that J-P mentioned, but that doesn't mean much for something that
happened over 20 years ago) that would prevent it from being useful for a large
class of problems. If that's true, it should be documented in the AARM so that
at least us language lawyers will know why it can't be useful.

I think this means that we'll need an AI and someone will have to be tasked to
go off and scour the Internet for information. (We're never going to have
enough "real numericians" in the ARG.)

***************************************************************

From: Bob Duff
Sent: Saturday, November 8, 2014  9:26 AM

> For some applications (games, for example) the statistics of the RNG
> may not be important, but for others, they are very important.

Depends on whether you're playing for money.  I'd like to play blackjack
against a program that uses the "mod N" method.  ;-)

I thought the standard way to get a random integer in the range
(say) 1 .. 6 is to get a random integer in the range (say) 0 .. 2**31-1, and
throw away results >= 2_147_483_646. 2_147_483_646 is the largest multiple of
6 <= 2**31-1. Do this in a loop until you get a number in 0 .. 2_147_483_645.
THEN do the "mod 6" and add 1.

That loop terminates, and usually goes round just once.
It's not much harder to program than the plain "mod 6" method.

J-P wrote:

> As I use to say, I'm not a numerician, ...

Is that a French word?  I've never heard it.  When I put "define numerician"
in google, it defines "numeration".  ;-) I assume you mean "numerical analyst".
I'm not one, either! Neither are most programmers, which is why it's good to
have this sort of thing in the predefined libraries.

Randy wrote:

> I don't think that other languages even had an equivalent for
> Discrete_Random back in the 1990s. Whether that's changed, I don't
> know; none of the languages I've ever used has had anything other than
> a float version.

Lots of languages have pseudo-random generators that produce integers.
Few have any strict requirements, which makes them useless for serious work
if portability is desired.

For example, C has rand(), which returns an int in the range 0 .. RAND_MAX.
RAND_MAX = 2**31-1 on my machine. Some implementations of rand() are pretty
bad, or so I've heard.  I'm pretty sure rand() dates back to the early days of
C, so before 1990s.

> I think this means that we'll need an AI and someone will have to be
> tasked to go off and scour the Internet for information. (We're never
> going to have enough "real numericians" in the ARG.)

Or ask somebody who knows what they're talking about (unlike me).
For example, Ken Dritz or Paul Hilfinger.

***************************************************************

From: Jeffrey R. Carter
Sent: Saturday, November 8, 2014  11:56 AM

>> A better approach is to truncate R * (N + 1), but even that isn't acceptable
>> for strict uses. This is the approach taken by the Random_Int function in
>> PragmARC.Universal_Random.
>
> That didn't even work for me, as there was a non-zero chance of R =
> 1.0; in that case the result ended up indexing out of the range of the
> upper end of the selection array (that's how I found out it was still
> wrong, a bizarre Constraint_Error after simulating several thousand
> deals). I ended up putting that value into the uppermost bin; that's
> makes the uppermost bin slightly more likely to be selected, but never
> more often than 52/2**53, which seemed small enough for my
> application. (That would take roughly 2**40 simulations to be
> significant.) I don't know of any way to avoid that problem, perhaps
> that's the source of the issue that J-P remembered reading about.

Other than Ada.Numerics.Float_Random, I've never used an RNG that produces
1.0, and I tend to forget that Float_Random may. My comment was for RNGs that
don't. Universal_Random, for example, produces all multiples of 2**(-24) from
zero to 1 - 2**(-24), so the truncation approach works fine with it.

There is still a problem, similar to the problem with the remainder method for
a discrete generator. The RNG produces M unique values in 0 .. 1-e. If
M rem N /= 0, then the results of the truncation method are not uniformly
distributed. (Universal_Random, for example, produces 2**24 unique values, and
2**24 rem 52 /= 0.)

***************************************************************

From: Jeffrey R. Carter
Sent: Saturday, November 8, 2014  2:43 PM

> Depends on whether you're playing for money.  I'd like to play
> blackjack against a program that uses the "mod N" method.  ;-)

Then it's a business for someone. Not entirely a game.

> I thought the standard way to get a random integer in the range
> (say) 1 .. 6 is to get a random integer in the range (say) 0 ..
> 2**31-1, and throw away results >= 2_147_483_646.
> 2_147_483_646 is the largest multiple of 6 <= 2**31-1.
> Do this in a loop until you get a number in 0 .. 2_147_483_645.  THEN
> do the "mod 6" and add 1.
>
> That loop terminates, and usually goes round just once.
> It's not much harder to program than the plain "mod 6" method.

Are we sure the results are uniformly distributed?

> Some implementations of rand() are pretty bad, or so I've heard.

The Turbo Pascal random number generator was very bad. If you plotted
consecutive values as (x, y) coordinates you got vertical lines! But on one
project I worked on that had pretty strict requirements for its random values,
we combined it with another, not-great linear-congruential generator and got
acceptable results.

***************************************************************

From: Tero Koskinen
Sent: Saturday, November 8, 2014  3:21 PM

> 2) What do other languages do? (to be fair, having a functionality is
> not enough if there is no guarantee of uniformity - anybody can write
> a bad random generator).

OpenBSD uses arc4random and arc4random_uniform(uint32_t upper_bound) functions
for random numbers in LibreSSL (and other places also).

Implementation for arc4random_uniform (which returns 32-bit unsigned integer
below the specified value) is found from
https://github.com/libressl-portable/openbsd/blob/master/src/lib/libc/crypt/arc4random_uniform.c

And pasted here, in case the link does not work:
uint32_t
arc4random_uniform(uint32_t upper_bound) {
  uint32_t r, min;
  if (upper_bound < 2)
    return 0;
  /* 2**32 % x == (2**32 - x) % x */
  min = -upper_bound % upper_bound;
  /*
   * This could theoretically loop forever but each retry has
   * p > 0.5 (worst case, usually far better) of selecting a
   * number inside the range we need, so it should rarely need
   * to re-roll.
   */
  for (;;) {
    r = arc4random();
    if (r >= min)
      break;
  }
  return r % upper_bound;
}

arc4random itself is found from
https://github.com/libressl-portable/openbsd/blob/master/src/lib/libc/crypt/arc4random.c

Operating system specific helper functions are at
https://github.com/libressl-portable/openbsd/tree/master/src/lib/libcrypto/crypto

They (OpenBSD/LibreSSL developers) have spent some time tuning their random
generator, so I would assume it is good enough for most purposes.

***************************************************************

From: Randy Brukardt
Sent: Saturday, November 8, 2014  3:36 PM

> Other than Ada.Numerics.Float_Random, I've never used an RNG that
> produces 1.0, and I tend to forget that Float_Random may. My comment
> was for RNGs that don't.

I think that they all don't effectively. My recollection is that Float_Random
generates a 64-bit random integer (2**64-1), and when that is scaled into a
float value, it essentially rounds to 1.0. The definition of Float_Random
means that we don't have to avoid that. (Although I do wonder if 0.0 and 1.0
are as likely to occur as any other bit pattern.) In any case, I used the
implementation provided by Ken Dritz (and his associated test program), which
was part of the Ada 9x work. [The test program isn't in the ACATS as its
nature is to sometimes fail a couple of trials; he used 40 trials and it's
OK if it passes at least 38. But it still can be OK and fail more often,
since it's supposed to be random, after all. I've seen 4 failures on a single
run (the next run typically has none). As an ACATS test, that wouldn't be of
much use, since it would be theortically possible to fail all 40 trials and
still be good. Note that a generator that *always* passed all of the trials
would be insufficiently random, it would have a pattern to keep the Chi-squares
low.] I *could* have done something wrong, but I doubt it.

Discrete_Random is a different kettle of fish (I don't think there was a
reference implementation, if there was, I didn't use it, or it was wrong).

***************************************************************

From: Pascal Pignard
Sent: Monday, November 10, 2014  2:39 PM

Let's take the initial proposal from Adam:
(1) Add to Discrete_Random:
   function Random (Gen : Ada.Numerics.Float_Random.Generator)
       return Result_Subtype;

My understanding of the given problem is how to produce a "random" integer
in a range specified at each call from a pseudo random generator of a known
quality which produce for example 32 bits with each bit having the same
probability at each draw.

So for range as 0..(2**n)-1, it's trivial -> take n bits in the random number.

For the other ranges, I'll take the example of a dice of 20 sides thus in
range 1..20.

Let's admit that adding an integer to a random number keeps the quality of
this number. It's like a translation.

So we can translate to 0..19.

Take the greatest power of two minus one less than 19, that is 15 or (2**4)-1,
so take four bits of the random number.

Note: whatever the position of the bits as the have the same probability for
each. Thus we have a random u in range 0..15.

Take now two others bits, so we have a random v in range 0..3.

Take one more bit, so we have a random w  in range 0..1.

Add then all x = u + v + w, so we have a random x  in range 0..19.

Plus one, we have a random x in range 1..20.

Of course we have to care if the number of random bits fits in the range of
the random generator and maybe others things like to get a high quality n bits
pseudo random generator.

It seems then feasible to get such a function.

***************************************************************

From: Gustaf Thorslund
Sent: Monday, November 10, 2014  3:39 PM

> Add then all x = u + v + w, so we have a random x  in range 0..19.
> Plus one, we have a random x in range 1..20.

No, I'm afraid you cannot do that. If you have two dice and add the numbers of
them you will not get an even distribution (7 being the most common sum). It
should not be too much work calculating the distribution of your suggestion.

If you on the other hand (like Bob Duff already suggested) throw away the
numbers outside the range, and retry until you get a value inside the range,
you should get an even distribution over that range. If you for example would
roll a six sided dice, throw away and retry every time you get a six, you will
basically get a five sided dice since those values will still be an even
distribution.

As Bob also said you would throw away the values outside an even multiple of
the range, else you'll need to loop for awhile if you want to make a dice out
of a 64 bit number.

Looks like the code Tero posted also loops until it finds a value inside a
range.

***************************************************************

From: Randy Brukardt
Sent: Tuesday, November 11, 2014  12:16 AM

> > Add then all x = u + v + w, so we have a random x  in range 0..19.
> > Plus one, we have a random x in range 1..20.
>
> No, I'm afraid you cannot do that. If you have two dice and add the
> numbers of them you will not get an even distribution
> (7 being the most common sum). It should not be too much work
> calculating the distribution of your suggestion.

Right. When dealing with probabilities and distributions and the like, the
operative phrase is that "everything you know is wrong". In particular,
ordinary mathematics can't used to combine probabilities, and in fact little
can be done at all if the probabilities aren't independent.

> If you on the other hand (like Bob Duff already suggested) throw away
> the numbers outside the range, and retry until you get a value inside
> the range, you should get an even distribution over that range. If you
> for example would roll a six sided dice, throw away and retry every
> time you get a six, you will basically get a five sided dice since
> those values will still be an even distribution.
>
> As Bob also said you would throw away the values outside an even
> multiple of the range, else you'll need to loop for awhile if you want
> to make a dice out of a 64 bit number.
>
> Looks like the code Tero posted also loops until it finds a value
> inside a range.

Clearly, we can write such a function.

The question is how to add it to the existing Ada libraries.

I don't like any of Adam's suggestions, because they all cause
cross-contamination of the float and discrete libraries (which should be
independent, IMHO). Moreover, they all continue needing to use one
instantiation per range of values, which is trying to shoehorn something
dynamic into a compile-time construct. It can be done, but it's clunky at
best.

I'd go with a much more straightforward approach:

function Random (Gen   : Generator;
                 First : Result_Subtype := Result_Subtype'First;
                 Last  : Result_Subtype := Result_Subtype'Last) return Result_Subtype;

Obtains the "next" random number from the given generator, relative to its
current state, according to an implementation-defined algorithm. The result
of the function in an instantiation of Numerics.Discrete_Random is within the
range First .. Last, and is delivered as a value of the generic formal subtype
Result_Subtype.

---

The question with this is whether we can accept the level of incompatibility
that's associated with changing the profile of a library-defined function.
The suggested change would allow calls to be unchanged, but would break any
renames or access-to-subprogram uses.

We could avoid that by using overloading instead, leaving the existing function
alone and dropping the default values of the parameters. That's slightly less
flexible (usually, you'd only want to give Last), but it's more compatible.
(It's still not 100% compatible, because of use-visibility and the possibility
of user-defined Random functions.)

function Random (Gen   : Generator;
                 First : Result_Subtype;
                 Last  : Result_Subtype) return Result_Subtype;

Obtains the "next" random number from the given generator, relative to its
current state, according to an implementation-defined algorithm. The result
of the function in an instantiation of Numerics.Discrete_Random is within the
range First .. Last, and is delivered as a value of the generic formal subtype
Result_Subtype.

***************************************************************

From: Randy Brukardt
Sent: Friday, December 11, 2015  9:33 PM

The draft minutes for this AI say:

AI12-0144-1/02 Make Discrete_Random more flexible

Bob: It is hard to implement this right, users usually get it wrong. (The e-mail
thread proves that multiple times.) So we should provide it.

Add the correct algorithm to random generation for integers to the discussion.

Steve Baird says that the wording in A.5.2(41) require a uniformly distributed
result over the result subtype. But that's not what we want when we have the
First and Last parameters.

So add to A.5.2 (41): ..., or in the case of the version of Random with First
and Last parameters, over the range First .. Last.

Erhard suggests: "...uniformly distributed over the range provided by the
parameters First and Last, or the range of the result subtype." He then
withdraws that.

Straw poll: Use a postcondition 6-1-0.

Approve AI with changes: 7-0-0.

----------------------------------

Two problems with these minutes.

First, I have no recollection what "use a postcondition" means. Jeff says that
he thinks Steve Baird made the suggestion, but that's all he recalls. I edited
the AI without using a postcondition, but then the second problem came up.


Second, Jeff claims that we also are supposed to change the wording of A.5.2(42)
like that of A.5.2(41). One clear problem with that is that at least I voted on
this proposal assuming that the change of A.5.2(41) was the only one. I find the
proposed wording to be an ugly patch, and I didn't object to it mainly because I
couldn't think of anything better assuming only one use. One ugly hack is OK,
but duplicating it in successive paragraphs is not acceptable, IMHO. To wit:

  A sufficiently long sequence of random numbers obtained by successive calls to
  Random is approximately uniformly distributed over the range of the result
  subtype{, or in the case of the version of Random with First and Last
  parameters, over the range First .. Last.}.

  The Random function in an instantiation of Numerics.Discrete_Random is
  guaranteed to yield each value in its result subtype{, or in the case of the
  version of Random with First and Last parameters, over the range First ..
  Last,} in a finite number of calls, provided that the number of such values
  does not exceed 2**15.

I don't know if I was the only one who didn't know that a change was intended to
A.5.2(42), but the fact that I didn't put it into my notes suggests that it
wasn't done very clearly. Thus I have to think the vote to approve was invalid;
I never, *ever* would have voted for the above, and I would have objected to
voting for "whatever the editor comes up with" on general principles.

----------------------------------

With these two problems, we will have to rewrite and revote this AI. Since the
problems are relatively minor, we can agree to wording here and then do a letter
ballot to approve the AI (we don't have to spend meeting time on it assuming
that we can agree enough that no one feels obligated to vote against the AI).

I have no suggestion about a postcondition; someone will have to explain what
was meant (and whether that intended to subsume some of the English wording in
the AI).

As far as the wording for A.5.2(41-2), I think we need to define once what the
result ranges are and then use that in the other wording:

  {The *result range* of a Random function, is the range First .. Last for the
  version of Random with First and Last parameters, and the range of the result
  subtype otherwise.}

  A sufficiently long sequence of random numbers obtained by successive calls to
  Random is approximately uniformly distributed over the {result} range [of the
  result subtype].

  {A}[The] Random function in an instantiation of Numerics.Discrete_Random is
  guaranteed to yield each value in its result {range}[subtype] in a
  finite number of calls, provided that the number of such values does not
  exceed 2 15.

[Note: "15" in this last sentence is superscript. We're not changing that; it
doesn't show in plain text.]

Thoughts???

***************************************************************

From: John Barnes
Sent: Saturday, December 12, 2015  4:35 AM

A tricky subject. A good idea to define the range first.

The actual requirement is pretty vague and so unwise to use a postcondition. And
we would have to define uniformly distributed which is hardly the point. There
are lots of ways of generating a useless sequence that visits every value the
same number of times. Even G.2.5 is vague since it doesn't define what test
suite it is talking about or what conformance means.

So apart from defining the range first I would just leave it alone but please
ensure that the poor superscript 15 is on the same line as the 2. It's not in
the LNCS version which has a line feed between the 2 and 15. Gosh.

Cheers and Happy Holidays

John


Don't use a postcondition.

***************************************************************

From: Tucker Taft
Sent: Saturday, December 12, 2015  9:32 AM

I generally agree with John.  The only thing I could imagine the postcondition
might do is define the minimum and maximum possible values returned.  You
probably couldn't say anything more than that about the distribution with a
postcondition.

So, e.g.,

function Random (Gen   : Generator;
                 First : Result_Subtype;
                 Last  : Result_Subtype) return Result_Subtype
     with Post => Random'Result in First .. Last;

***************************************************************

From: Bob Duff
Sent: Saturday, December 12, 2015  11:14 AM

>      with Post => Random'Result in First .. Last;

I think that's the postcondition that was discussed at the meeting, and I think
somebody said that if we have that, then we don't need to say in English "the
result is in the range ..."

***************************************************************

From: Randy Brukardt
Sent: Saturday, December 12, 2015  12:34 PM

Randy’s proposed introduction of a definition of “result range” just for local
usage in those two paragraphs seems a good simplification to me.

And now that Tuck has written out the post-condition then yes, I think that’s
all that was meant.

Until people are used to looking for post-conditions, I think the (improved)
wording is still needed.

***************************************************************

From: John Barnes
Sent: Friday, December 18, 2015  7:26 AM

I am fretting over that postcondition.

The text says that the value returned will be in the range. Fact. There is
absolutely no hint that it might try to be outside the range and so raise some
sort of exception.

But if we express this idea with a postcondition, then by the dynamic nature of
postconditions in Ada (as opposed to good old Spark) there is an implication
that it might fail and so raise Assertion_Error or whatever.

So for me the postcondition seems weaker. I don't like it.

[Editor's note: This topic was split into AI12-0179-1; the mail thread continues
there.]

***************************************************************

From: Randy Brukardt
Sent: Tuesday, January 5, 2016 12:10 AM

Getting back to the original reason for this thread...

Here's the revised proposed wording section for this AI. I have a question at
the end...

Add after A.5.2(20):

function Random (Gen   : Generator;
                 First : Result_Subtype;
                 Last  : Result_Subtype) return Result_Subtype
     with Post => Random'Result in First .. Last;

Add after A.5.2(32):

function Random (Gen   : Generator;
                 First : Result_Subtype;
                 Last  : Result_Subtype) return Result_Subtype
     with Post => Random'Result in First .. Last;

  Obtains the "next" random number from the given generator, relative to its
  current state, according to an implementation-defined algorithm. The result
  of the function in an instantiation of Numerics.Discrete_Random is delivered
  as a value of the generic formal subtype Result_Subtype that makes the
  postcondition True. If the range First .. Last is a null range, Constraint_Error
  is raised.

Modify A.5.2(41):

  {The *result range* of a Random function, is the range First .. Last for the
  version of Random with First and Last parameters, and the range of the result
  subtype otherwise.}

  A sufficiently long sequence of random numbers obtained by successive calls to
  Random is approximately uniformly distributed over the {result} range [of the
  result subtype].

Modify A.5.2(42):

  {A}[The] Random function in an instantiation of Numerics.Discrete_Random is
  guaranteed to yield each value in its result {range}[subtype] in a
  finite number of calls, provided that the number of such values does not
  exceed 2 15.

-----------

In the new text for function Random, we have "The result of the function in an
instantiation of Numerics.Discrete_Random is delivered as a value of the
generic formal subtype Result_Subtype that makes the postcondition True."

Assuming that we have the blanket rule for postconditions (now proposed in
AI12-0179-1, not yet posted), this sentence says only that the function meets
its specification. We *assume* that everywhere in the Standard, so this is
just noise.

I've continued to include it because this paragraph immediately follows
A.5.2(32), which has similar wording, which is similarly useful (not at all).
[In all of these cases, the important information is in the Implementation 
Requirements portion.]

But it would be better to omit this junk wording -- that was the intent of
adding a postcondition as I understand it. To make the wording consistent,
we then would have to delete the existing wording from A.5.2(32).
Specifically, the 2nd and 3rd sentences of the following paragraph would also
be dropped as marked:

  Obtains the "next" random number from the given generator, relative to its
  current state, according to an implementation-defined algorithm. [The result
  of the function in Numerics.Float_Random is delivered as a value of the
  subtype Uniformly_Distributed, which is a subtype of the predefined type Float
  having a range of 0.0 .. 1.0. The result of the function in an instantiation
  of Numerics.Discrete_Random is delivered as a value of the generic formal
  subtype Result_Subtype.]

I didn't unilaterally make this change as it seems like more change than is
necessary, even though I can hardly imagine any more empty verbiage in the
Standard. (What else would the result of Random be? A lake in southern
Sheboygan county? Santa Claus?? :-) Who would expect something other than a
value of its result subtype? Whose details can be read in the specification
that precedes this text!)

Should I make this slightly larger change, or forget it?

P.S. I'd like to put this AI to bed someday soon. Or does Someday Never
Come?? :-)

***************************************************************

From: Tucker Taft
Sent: Tuesday, January 5, 2016  2:32 PM

> ... I didn't unilaterally make this change as it seems like more 
> change than is necessary, even though I can hardly imagine any more 
> empty verbiage in the Standard. (What else would the result of Random 
> be? A lake in southern Sheboygan county? Santa Claus?? :-) Who would 
> expect something other than a value of its result subtype? Whose 
> details can be read in the specification that precedes this text!)

I have no problem with deleting clearly superfluous wording.  The danger
of blatantly redundant wording is that the user might think it means more
than it says, and try to read something extra in it.  So I would agree,
heave ho!

> Should I make this slightly larger change, or forget it?

OK with me.

***************************************************************

From: Edmond Schonberg
Sent: Tuesday, January 5, 2016  2:43 PM

>> Should I make this slightly larger change, or forget it?
> 
> OK with me.

That’s what I would call a Talmudic response ...

***************************************************************

From: Tucker Taft
Sent: Tuesday, January 5, 2016  2:53 PM

Oops.  Deleting the clearly redundant wording is fine if it is not clarifying
things for the user.

***************************************************************

From: Bob Duff
Sent: Tuesday, January 5, 2016  4:06 PM

> Assuming that we have the blanket rule for postconditions (now 
> proposed in AI12-0179-1, not yet posted), this sentence says only that 
> the function meets its specification. We *assume* that everywhere in 
> the Standard, so this is just noise.

I agree.  We should never say that a particular procedure obeys its
postcondition, because that would lead readers to think that maybe other
procedures don't obey their postcondition. Better to rely on the blanket
statement. So I agree with your suggested change.

***************************************************************

From: Jeff Cousins
Sent: Wednesday, January 6, 2016  3:48 AM

I agree for the same reason.

***************************************************************

Questions? Ask the ACAA Technical Agent