!standard A.5.2(20) 16-01-28 AI12-0144-1/07 !standard A.5.2(32) !standard A.5.2(41) !standard A.5.2(42) !class Amendment 14-12-04 !status Amendment 1-2012 16-02-29 !status WG9 Approved 16-06-13 !status ARG Approved 10-0-2 (By Letter Ballot) 16-02-26 !status work item 15-12-17 !status ARG Approved 7-0-0 15-10-18 !status Promising (10-0-0) 15-02-26 !status work item 14-12-04 !status received 14-11-06 !priority Low !difficulty Easy !subject Make Discrete_Random more flexible !summary Add a more general function Random to Discrete_Random. !problem A recent StackOverflow question asked how to write a function: function generate_random_number (n: in Positive) return Integer is to generate one random number in the range 0..n. n could be different for each call. It looks like this isn't easily possible using Ada.Numerics.Discrete_Random, since a new instantiation of Discrete_Random must be declared inside generate_random_number, and a new generator variable of type Instantiation.Generator needs to be declared. The questioner tried Reset to reset the generator to a time-dependent state each time, but this failed because the procedure was being called faster than the clock was changing. The most direct way to accomplish something like this seems to be to use Float_Random, reset the generator, and then perform the needed computations to yield a discrete random number. This is unfortunate, because the Discrete_Random packages are no doubt already performing these or related computations, and hopefully doing it in a way that produces the best results, which is not trivial; but these computations aren't available to a program (unless a compiler vendor makes them available through a non-standard package). !proposal (See wording.) !wording Add after A.5.2(20): function Random (Gen : Generator; First : Result_Subtype; Last : Result_Subtype) return Result_Subtype with Post => Random'Result in First .. Last; Modify A.5.2(32): Obtains the “next” random number from the given generator, relative to its current state, according to an implementation-defined algorithm.[ The result of the function in Numerics.Float_Random is delivered as a value of the subtype Uniformly_Distributed, which is a subtype of the predefined type Float having a range of 0.0 .. 1.0. The result of the function in an instantiation of Numerics.Discrete_Random is delivered as a value of the generic formal subtype Result_Subtype.] [Editor's note: We delete the last two sentences here as all they say is that the function result is in the result subtype, which is always true for a function (in the absence of exceptions or erroneous execution, both of which have to be explicitly stated). The actual description of the result is in the Implementation Requirements that follow later; the existing text promises to give that without doing so. We remove this text now so that we don't have to give similar text in the description of the new subprogram, where it would be substantially more complicated.] Add after A.5.2(32): function Random (Gen : Generator; First : Result_Subtype; Last : Result_Subtype) return Result_Subtype with Post => Random'Result in First .. Last; Obtains the "next" random number from the given generator, relative to its current state, according to an implementation-defined algorithm. If the range First .. Last is a null range, Constraint_Error is raised. [Editor's notes: During the ARG phone meeting of February 26, 2015, it was suggested that the null range case raise Program_Error. However, Constraint_Error is the existing exception for that (A.5.2.(39)), so we used that here. Postconditions are assumed to impose a requirement on the body of a subprogram, as proposed in AI12-0179-1. In the unlikely event that AI is not approved, then we should not use a postcondition here; we'd need some wording instead. End Editor's notes.] Modify A.5.2(41): {Each call of a Random function has a *result range*; this is the range First .. Last for the version of Random with First and Last parameters and the range of the result subtype of the function otherwise.} A sufficiently long sequence of random numbers obtained by {consecutive} [successive] calls to Random {that have the same generator and result range} is approximately uniformly distributed over the {result} range [of the result subtype]. {AARM Discussion: In this rule, "consecutive" means at least that there are no intervening explicit calls involving the same generator. This restricts the rule to only applying to cases where just the Random function changes the generator. We don't mean to impose a requirement if there are intervening calls to Reset, to Random with same generator but a different result range, or any other case that would affect the sequence of values returned. Operations which use the resulting random values (for instance, to store them somewhere) are not considered in determining if calls are consecutive.} Modify A.5.2(42): {A}[The] Random function in an instantiation of Numerics.Discrete_Random is guaranteed to yield each value in its result {range}[subtype] in a finite number of calls, provided that the number of such values does not exceed 2 15. [Editor's note: the above "15" is a superscript, which doesn't show up in this plain text rendering.] !discussion The author ran into this problem with a hobby solitaire program. It was necessary to use a reproducible generator (thus using a single generator created using Reset with an Initiator parameter). The algorithm for dealing cards is to create an array containing all of the cards, then select one from that array using a random generator. Then close up the array, and repeat the selection on the array (minus the selected card). This means you want to select with ranges 1..52, then 1..51, then 1..50, etc. That's not possible with the Ada 95 definition of Discrete_Random. Users often accomplish this in ways that do not provide a truly uniform distribution. It's easy to come up with methods that don't work (see the !appendix for several such algorithms), but it's difficult to get right. One algorithm that is correct is to start with a uniform random generator that produces an integer in the range 0 .. Large. (Usually Large is some power of 2 - 1). Then, for a call to Random with bounds First and Last, we determine Len := (Last-First+1) and then Mult, which is the largest multiple of Len that is less than Large. (Len cannot be greater than Large.) We then loop, getting a random value, and exiting the loop when the random value is less than Mult. Finally, we mod the random value (now in the range 0 .. Mult-1) by Len and add First. This loop will usually exit on the first iteration, and even in the worst case (Len = Large/2+1), there is at least a 50% chance that the loop will exit on each iteration. So it is unlikely that many iterations will be needed. If this was 1994, we should have simply defined Random in Discrete_Random as follows: function Random (Gen : Generator; First : Result_Subtype := Result_Subtype'First; Last : Result_Subtype := Result_Subtype'Last) return Result_Subtype; This allows leaving out the First parameter if it matches that of the formal subtype (which it usually will). Unfortunately, that is more incompatible, as renames and access-to-subprogram matching would fail (whereas the proposed solution is only incompatible if there are use-clause conflicts with the profile, which is unlikely). !corrigendum A.5.2(20) @dinsa @xcode< @b Random (Gen : Generator) @b Result_Subtype;> @dinst @xcode< @b Random (Gen : Generator; First : Result_Subtype; Last : Result_Subtype) @b Result_Subtype @b Post =@> Random'Result @b First .. Last;> !corrigendum A.5.2(32) @drepl @xindent @dby @xindent @xcode<@b Random (Gen : Generator; First : Result_Subtype; Last : Result_Subtype) @b Result_Subtype @b Post =@> Random'Result @b First .. Last;> @xindent !corrigendum A.5.2(41) @drepl A sufficiently long sequence of random numbers obtained by successive calls to Random is approximately uniformly distributed over the range of the result subtype. @dby Each call of a Random function has a @i; this is the range First .. Last for the version of Random with First and Last parameters and the range of the result subtype of the function otherwise. A sufficiently long sequence of random numbers obtained by consecutive calls to Random that have the same generator and result range is approximately uniformly distributed over the result range. !corrigendum A.5.2(42) @drepl The Random function in an instantiation of Numerics.Discrete_Random is guaranteed to yield each value in its result subtype in a finite number of calls, provided that the number of such values does not exceed 2 15. @dby A Random function in an instantiation of Numerics.Discrete_Random is guaranteed to yield each value in its result range in a finite number of calls, provided that the number of such values does not exceed 2 15. !ASIS No impact. !ACATS test An ACATS test should be constructed to check that the new function exists and works appropriately. Perhaps the card-dealing example mentioned in the discussion could be used. !appendix !topic Allow random numbers for different discrete subtypes using same generator !reference RM12 A.5.2 !from Adam Beneschan 14-11-06 !discussion This arose from a StackOverflow question in which a questioner wanted to write a function: function generate_random_number (n: in Positive) return Integer is to generate one random number in the range 0..n. n could be different for each call. It looks like this isn't easily possible using Ada.Numerics.Discrete_Random, since a new instantiation of Discrete_Random must be declared inside generate_random_number, and a new generator variable of type Instantiation.Generator needs to be declared. The questioner tried Reset to reset the generator to a time-dependent state each time, but this failed because the procedure was being called faster than the clock was changing. Perhaps there are ways to get this to work with an initiator parameter on Reset. However, the most direct way to accomplish something like this seems to be to use Float_Random, reset the generator, and then perform the needed computations to yield a discrete random number. This is unfortunate, because the Discrete_Random packages are no doubt already performing these or related computations, and hopefully doing it in a way that produces the best results, which is not trivial; but these computations aren't available to a program (unless a compiler vendor makes them available through a non-standard package). I think there should be a way to generate random numbers for different discrete ranges using the same generator. Several possibilities come to mind: (1) Add to Discrete_Random: function Random (Gen : Ada.Numerics.Float_Random.Generator) return Result_Subtype; which would allow the program to set up one common generator using Float_Random. (2) Add to Float_Random: generic type Result_Subtype is (<>); function Random (Gen : Generator) return Result_Subtype; Basically the same as #1. (3) Add a new non-generic package that defines a common generator type that could be used in Random functions in Discrete_Random and Float_Random. This package would also define a common State type and provided the Reset, Save, Image, and Value subprograms currently provided by Float_Random and Discrete_Random. Since this question came from a StackOverflow poster who could have just been experimenting, I can't say how useful in practice this feature would be. It does seem like a flaw that it's missing, however. *************************************************************** From: Jean-Pierre Rosen Sent: Thursday, November 6, 2014 3:17 PM > It looks like this isn't easily possible using > Ada.Numerics.Discrete_Random Isn't it possible to use a random generator on Natural and return Value mod n + 1 ? (the quality of the random numbers might not be as good for big n's, but it may be sufficient in practice) *************************************************************** From: Adam Beneschan Sent: Thursday, November 6, 2014 4:47 PM Sure, if you don't want a uniform distribution. Unless the number of values in Natural (2**m for some m) is divisible by (n + 1), doing this will result in some values in the range being generated with a higher probability than others. There are ways to make things uniform. The point is that the implementors of Discrete_Random have, I presume, already figured out how to do this (I think they would have to in order to satisfy Annex G); why should we require that those with the kind of problem I wrote about have to reinvent the wheel? *************************************************************** From: Pascal Obry Sent: Thursday, November 6, 2014 4:10 PM > Isn't it possible to use a random generator on Natural and return > Value mod n + 1 ? > > (the quality of the random numbers might not be as good for big n's, > but it may be sufficient in practice) > Or better, just use Ada.Numerics.Float_Random and multiply by N. *************************************************************** From: Jean-Pierre Rosen Sent: Friday, November 7, 2014 2:32 AM > Sure, if you don't want a uniform distribution. Unless the number of > values in Natural (2**m for some m) is divisible by (n + 1), doing > this will result in some values in the range being generated with a > higher probability than others. Well, correct my maths if I'm wrong, but it seems to me that the average unbalance is (2**m rem (n+1)) / 2**m. If what is needed is a dice-4, dice-6 and dice-12, the unbalance would be neglectible (even up to n=2**20 for a 32 bits generator). *************************************************************** From: Randy Brukardt Sent: Friday, November 7, 2014 6:44 PM This problem happened to me in a hobby program: I needed to assign playing cards to a Solitare layout. I wanted to use a reproducible generator (thus using a single generator created using Reset with an Initiator parameter). To properly deal cards, it's best to create an array containing all of the cards, then select one from that array using a random generator. Then close up the array, and repeat the selection on the array (minus the selected card). This means you want to select with ranges 1..52, then 1..51, then 1..50, etc. Since m in this case is 52!, I'm pretty sure that no 32-bit or even 64-bit generator is going to be uniform using your formula. I ended up using Float_Random, using a loop to avoid having to do any sort of conversion out of float. (I couldn't find any way to make that work and still keep a uniform distribution.) It would have been nice to be able to use Discrete_Random for this case, especially as every simulation problem that I've ever dealt with has had this form. (None of which ever were work related, so I've never thought to bring this up at the ARG level.) What is the value of having a package that never can be used on actual problems?? Interestingly, I ended up with a better generator having used Float_Random, since our Discrete_Random is not going to generate an exactly uniform distribution -- it has the same conversion problems that I couldn't find any way to avoid with the card dealer. It's "good enough" for small ranges (like dice or cards) [as noted by J-P's formula], but some large ranges would have a noticeable non-uniformity. Probably should fix that someday. (And we all know "Someday Never Comes" - thanks to Credence Clearwater Revival. :-) *************************************************************** From: Jean-Pierre Rosen Sent: Friday, November 7, 2014 11:59 PM >> Isn't it possible to use a random generator on Natural and return >> Value mod n + 1 ? >> >> (the quality of the random numbers might not be as good for big n's, >> but it may be sufficient in practice) >> > > Or better, just use Ada.Numerics.Float_Random and multiply by N. This doesn't work. During the making of Ada95, there was a huge discussion between numerician about whether it was possible to make a discrete generator out of a floating one, and after megabytes of e-mail, the conclusion was that it was not possible. I'm not enough of a numerician to explain why, *************************************************************** From: Jean-Pierre Rosen Sent: Saturday, November 8, 2014 12:12 AM > This means you want to select with ranges 1..52, then 1..51, then > 1..50, etc. > > Since m in this case is 52!, I'm pretty sure that no 32-bit or even > 64-bit generator is going to be uniform using your formula. ?? m (if you refer to the number used as a parameter to the function we are discussing) is 52, then 51, ... Certainly a "small number" for the generator bias. As I use to say, I'm not a numerician, I just know enough of numerics to realize how easy it is to make incorrect statements about numerics when you are not a numerician. So I'm asking: 1) Is it a real problem? (i.e. worth spending time on it - we should take advice of real numericians) 2) What do other languages do? (to be fair, having a functionality is not enough if there is no guarantee of uniformity - anybody can write a bad random generator). *************************************************************** From: Jeffery R. Carter Sent: Saturday, November 8, 2014 12:36 AM >>> Isn't it possible to use a random generator on Natural and return >>> Value mod n + 1 ? >>> >>> (the quality of the random numbers might not be as good for big n's, >>> but it may be sufficient in practice) For some applications (games, for example) the statistics of the RNG may not be important, but for others, they are very important. Given an RNG that returns uniformly distributed values in 0 .. 2 ** M - 1 (Unsigned_32, for example, where M=32), taking R rem (N + 1) gives uniformly distributed results only when M rem (N + 1) = 0. All other cases will not be satisfactory for strict uses. >> Or better, just use Ada.Numerics.Float_Random and multiply by N. >> > This doesn't work. Right. Rounding R * N to the nearest integer results in zero and N appearing about 1/2 as often as 1 .. N - 1. That may not be acceptable even for some of the less strict uses. A better approach is to truncate R * (N + 1), but even that isn't acceptable for strict uses. This is the approach taken by the Random_Int function in PragmARC.Universal_Random. *************************************************************** From: Randy Brukardt Sent: Saturday, November 8, 2014 1:26 AM > > This means you want to select with ranges 1..52, then 1..51, then > > 1..50, etc. > > > > Since m in this case is 52!, I'm pretty sure that no 32-bit or even > > 64-bit generator is going to be uniform using your formula. > > ?? m (if you refer to the number used as a parameter to the function > we are discussing) is 52, then 51, ... Certainly a "small number" for > the generator bias. To use the *existing* Discrete_Random, you have to use an m that is a factor of all of them, as you can't create a new generator for each step (and you'd have to do that to use different instantiations). To use the float generator, you don't use "m" at all. > As I use to say, I'm not a numerician, I just know enough of numerics > to realize how easy it is to make incorrect statements about numerics > when you are not a numerician. Same here. But I know that I couldn't get even acceptable results out of the existing random generators without extreme care (and I don't have critical requirements) -- essentially I had to never convert the float value to any discrete type. (And I can't be certain I got it right as it is.) You noted that multiplying by N isn't enough, nor is truncating (that caused values to occassionally be out of range). I used a loop and a float compare to avoid that problem, but it might have left an extra value into one of the end intervals. I don't know of any way to avoid that problem (which is at least small enough for my purposes, given the float value is a long float, so we have n/2**53, which is pretty darn small). Note that Jeff Carter said the same thing: > Right. Rounding R * N to the nearest integer results in zero and N appearing > about 1/2 as often as 1 .. N - 1. That may not be acceptable even for some of > the less strict uses. It wasn't for my card dealing application, for which I wanted to simulate more than 100,000 deals (sometimes running into millions of deals). > A better approach is to truncate R * (N + 1), but even that isn't acceptable > for strict uses. This is the approach taken by the Random_Int function in > PragmARC.Universal_Random. That didn't even work for me, as there was a non-zero chance of R = 1.0; in that case the result ended up indexing out of the range of the upper end of the selection array (that's how I found out it was still wrong, a bizarre Constraint_Error after simulating several thousand deals). I ended up putting that value into the uppermost bin; that's makes the uppermost bin slightly more likely to be selected, but never more often than 52/2**53, which seemed small enough for my application. (That would take roughly 2**40 simulations to be significant.) I don't know of any way to avoid that problem, perhaps that's the source of the issue that J-P remembered reading about. But how to do this perfectly I don't know. The implementation of Discrete_Random in Janus/Ada is probably worse than that, because using a loop isn't practical if the range is more than a few hundred items, and I probably didn't have a real truncation operation available when I wrote it. Which I believe I had to do, because no "real numerician" provided an implementation that actually would work for Janus/Ada as it existed in the 1990s. The Float_Random was from Ken Dritz, but I don't think there was ever a matching Discrete_Random implementation (and I could find none in the literature). Probably it would be easier to find these things today (no Google in 1996), but of course rewriting and retesting that is a low priority in the absence of some customer demand. > So I'm asking: > > 1) Is it a real problem? (i.e. worth spending time on it - we should > take advice of real numericians) It's certainly a real problem in that there is a class of applications where neither Discrete_Random nor Float_Random (as currently defined) can be used to provide completely uniform behavior. Perhaps there is some numerical reason that the problem can never be solved, but we will need to be convinced of that. > 2) What do other languages do? (to be fair, having a functionality is > not enough if there is no guarantee of uniformity - anybody can write > a bad random generator). I don't think that other languages even had an equivalent for Discrete_Random back in the 1990s. Whether that's changed, I don't know; none of the languages I've ever used has had anything other than a float version. As such, this seems to be a self-made Ada problem. Whether or not the landscape for other languages has changed, it seems amazingly restrictive to have to select every random number from the same range. There are a lot of problems where that simply doesn't work. Perhaps there is some numerical reason for Discrete_Random (I don't recall the discussions that J-P mentioned, but that doesn't mean much for something that happened over 20 years ago) that would prevent it from being useful for a large class of problems. If that's true, it should be documented in the AARM so that at least us language lawyers will know why it can't be useful. I think this means that we'll need an AI and someone will have to be tasked to go off and scour the Internet for information. (We're never going to have enough "real numericians" in the ARG.) *************************************************************** From: Bob Duff Sent: Saturday, November 8, 2014 9:26 AM > For some applications (games, for example) the statistics of the RNG > may not be important, but for others, they are very important. Depends on whether you're playing for money. I'd like to play blackjack against a program that uses the "mod N" method. ;-) I thought the standard way to get a random integer in the range (say) 1 .. 6 is to get a random integer in the range (say) 0 .. 2**31-1, and throw away results >= 2_147_483_646. 2_147_483_646 is the largest multiple of 6 <= 2**31-1. Do this in a loop until you get a number in 0 .. 2_147_483_645. THEN do the "mod 6" and add 1. That loop terminates, and usually goes round just once. It's not much harder to program than the plain "mod 6" method. J-P wrote: > As I use to say, I'm not a numerician, ... Is that a French word? I've never heard it. When I put "define numerician" in google, it defines "numeration". ;-) I assume you mean "numerical analyst". I'm not one, either! Neither are most programmers, which is why it's good to have this sort of thing in the predefined libraries. Randy wrote: > I don't think that other languages even had an equivalent for > Discrete_Random back in the 1990s. Whether that's changed, I don't > know; none of the languages I've ever used has had anything other than > a float version. Lots of languages have pseudo-random generators that produce integers. Few have any strict requirements, which makes them useless for serious work if portability is desired. For example, C has rand(), which returns an int in the range 0 .. RAND_MAX. RAND_MAX = 2**31-1 on my machine. Some implementations of rand() are pretty bad, or so I've heard. I'm pretty sure rand() dates back to the early days of C, so before 1990s. > I think this means that we'll need an AI and someone will have to be > tasked to go off and scour the Internet for information. (We're never > going to have enough "real numericians" in the ARG.) Or ask somebody who knows what they're talking about (unlike me). For example, Ken Dritz or Paul Hilfinger. *************************************************************** From: Jeffrey R. Carter Sent: Saturday, November 8, 2014 11:56 AM >> A better approach is to truncate R * (N + 1), but even that isn't acceptable >> for strict uses. This is the approach taken by the Random_Int function in >> PragmARC.Universal_Random. > > That didn't even work for me, as there was a non-zero chance of R = > 1.0; in that case the result ended up indexing out of the range of the > upper end of the selection array (that's how I found out it was still > wrong, a bizarre Constraint_Error after simulating several thousand > deals). I ended up putting that value into the uppermost bin; that's > makes the uppermost bin slightly more likely to be selected, but never > more often than 52/2**53, which seemed small enough for my > application. (That would take roughly 2**40 simulations to be > significant.) I don't know of any way to avoid that problem, perhaps > that's the source of the issue that J-P remembered reading about. Other than Ada.Numerics.Float_Random, I've never used an RNG that produces 1.0, and I tend to forget that Float_Random may. My comment was for RNGs that don't. Universal_Random, for example, produces all multiples of 2**(-24) from zero to 1 - 2**(-24), so the truncation approach works fine with it. There is still a problem, similar to the problem with the remainder method for a discrete generator. The RNG produces M unique values in 0 .. 1-e. If M rem N /= 0, then the results of the truncation method are not uniformly distributed. (Universal_Random, for example, produces 2**24 unique values, and 2**24 rem 52 /= 0.) *************************************************************** From: Jeffrey R. Carter Sent: Saturday, November 8, 2014 2:43 PM > Depends on whether you're playing for money. I'd like to play > blackjack against a program that uses the "mod N" method. ;-) Then it's a business for someone. Not entirely a game. > I thought the standard way to get a random integer in the range > (say) 1 .. 6 is to get a random integer in the range (say) 0 .. > 2**31-1, and throw away results >= 2_147_483_646. > 2_147_483_646 is the largest multiple of 6 <= 2**31-1. > Do this in a loop until you get a number in 0 .. 2_147_483_645. THEN > do the "mod 6" and add 1. > > That loop terminates, and usually goes round just once. > It's not much harder to program than the plain "mod 6" method. Are we sure the results are uniformly distributed? > Some implementations of rand() are pretty bad, or so I've heard. The Turbo Pascal random number generator was very bad. If you plotted consecutive values as (x, y) coordinates you got vertical lines! But on one project I worked on that had pretty strict requirements for its random values, we combined it with another, not-great linear-congruential generator and got acceptable results. *************************************************************** From: Tero Koskinen Sent: Saturday, November 8, 2014 3:21 PM > 2) What do other languages do? (to be fair, having a functionality is > not enough if there is no guarantee of uniformity - anybody can write > a bad random generator). OpenBSD uses arc4random and arc4random_uniform(uint32_t upper_bound) functions for random numbers in LibreSSL (and other places also). Implementation for arc4random_uniform (which returns 32-bit unsigned integer below the specified value) is found from https://github.com/libressl-portable/openbsd/blob/master/src/lib/libc/crypt/arc4random_uniform.c And pasted here, in case the link does not work: uint32_t arc4random_uniform(uint32_t upper_bound) { uint32_t r, min; if (upper_bound < 2) return 0; /* 2**32 % x == (2**32 - x) % x */ min = -upper_bound % upper_bound; /* * This could theoretically loop forever but each retry has * p > 0.5 (worst case, usually far better) of selecting a * number inside the range we need, so it should rarely need * to re-roll. */ for (;;) { r = arc4random(); if (r >= min) break; } return r % upper_bound; } arc4random itself is found from https://github.com/libressl-portable/openbsd/blob/master/src/lib/libc/crypt/arc4random.c Operating system specific helper functions are at https://github.com/libressl-portable/openbsd/tree/master/src/lib/libcrypto/crypto They (OpenBSD/LibreSSL developers) have spent some time tuning their random generator, so I would assume it is good enough for most purposes. *************************************************************** From: Randy Brukardt Sent: Saturday, November 8, 2014 3:36 PM > Other than Ada.Numerics.Float_Random, I've never used an RNG that > produces 1.0, and I tend to forget that Float_Random may. My comment > was for RNGs that don't. I think that they all don't effectively. My recollection is that Float_Random generates a 64-bit random integer (2**64-1), and when that is scaled into a float value, it essentially rounds to 1.0. The definition of Float_Random means that we don't have to avoid that. (Although I do wonder if 0.0 and 1.0 are as likely to occur as any other bit pattern.) In any case, I used the implementation provided by Ken Dritz (and his associated test program), which was part of the Ada 9x work. [The test program isn't in the ACATS as its nature is to sometimes fail a couple of trials; he used 40 trials and it's OK if it passes at least 38. But it still can be OK and fail more often, since it's supposed to be random, after all. I've seen 4 failures on a single run (the next run typically has none). As an ACATS test, that wouldn't be of much use, since it would be theortically possible to fail all 40 trials and still be good. Note that a generator that *always* passed all of the trials would be insufficiently random, it would have a pattern to keep the Chi-squares low.] I *could* have done something wrong, but I doubt it. Discrete_Random is a different kettle of fish (I don't think there was a reference implementation, if there was, I didn't use it, or it was wrong). *************************************************************** From: Pascal Pignard Sent: Monday, November 10, 2014 2:39 PM Let's take the initial proposal from Adam: (1) Add to Discrete_Random: function Random (Gen : Ada.Numerics.Float_Random.Generator) return Result_Subtype; My understanding of the given problem is how to produce a "random" integer in a range specified at each call from a pseudo random generator of a known quality which produce for example 32 bits with each bit having the same probability at each draw. So for range as 0..(2**n)-1, it's trivial -> take n bits in the random number. For the other ranges, I'll take the example of a dice of 20 sides thus in range 1..20. Let's admit that adding an integer to a random number keeps the quality of this number. It's like a translation. So we can translate to 0..19. Take the greatest power of two minus one less than 19, that is 15 or (2**4)-1, so take four bits of the random number. Note: whatever the position of the bits as the have the same probability for each. Thus we have a random u in range 0..15. Take now two others bits, so we have a random v in range 0..3. Take one more bit, so we have a random w in range 0..1. Add then all x = u + v + w, so we have a random x in range 0..19. Plus one, we have a random x in range 1..20. Of course we have to care if the number of random bits fits in the range of the random generator and maybe others things like to get a high quality n bits pseudo random generator. It seems then feasible to get such a function. *************************************************************** From: Gustaf Thorslund Sent: Monday, November 10, 2014 3:39 PM > Add then all x = u + v + w, so we have a random x in range 0..19. > Plus one, we have a random x in range 1..20. No, I'm afraid you cannot do that. If you have two dice and add the numbers of them you will not get an even distribution (7 being the most common sum). It should not be too much work calculating the distribution of your suggestion. If you on the other hand (like Bob Duff already suggested) throw away the numbers outside the range, and retry until you get a value inside the range, you should get an even distribution over that range. If you for example would roll a six sided dice, throw away and retry every time you get a six, you will basically get a five sided dice since those values will still be an even distribution. As Bob also said you would throw away the values outside an even multiple of the range, else you'll need to loop for awhile if you want to make a dice out of a 64 bit number. Looks like the code Tero posted also loops until it finds a value inside a range. *************************************************************** From: Randy Brukardt Sent: Tuesday, November 11, 2014 12:16 AM > > Add then all x = u + v + w, so we have a random x in range 0..19. > > Plus one, we have a random x in range 1..20. > > No, I'm afraid you cannot do that. If you have two dice and add the > numbers of them you will not get an even distribution > (7 being the most common sum). It should not be too much work > calculating the distribution of your suggestion. Right. When dealing with probabilities and distributions and the like, the operative phrase is that "everything you know is wrong". In particular, ordinary mathematics can't used to combine probabilities, and in fact little can be done at all if the probabilities aren't independent. > If you on the other hand (like Bob Duff already suggested) throw away > the numbers outside the range, and retry until you get a value inside > the range, you should get an even distribution over that range. If you > for example would roll a six sided dice, throw away and retry every > time you get a six, you will basically get a five sided dice since > those values will still be an even distribution. > > As Bob also said you would throw away the values outside an even > multiple of the range, else you'll need to loop for awhile if you want > to make a dice out of a 64 bit number. > > Looks like the code Tero posted also loops until it finds a value > inside a range. Clearly, we can write such a function. The question is how to add it to the existing Ada libraries. I don't like any of Adam's suggestions, because they all cause cross-contamination of the float and discrete libraries (which should be independent, IMHO). Moreover, they all continue needing to use one instantiation per range of values, which is trying to shoehorn something dynamic into a compile-time construct. It can be done, but it's clunky at best. I'd go with a much more straightforward approach: function Random (Gen : Generator; First : Result_Subtype := Result_Subtype'First; Last : Result_Subtype := Result_Subtype'Last) return Result_Subtype; Obtains the "next" random number from the given generator, relative to its current state, according to an implementation-defined algorithm. The result of the function in an instantiation of Numerics.Discrete_Random is within the range First .. Last, and is delivered as a value of the generic formal subtype Result_Subtype. --- The question with this is whether we can accept the level of incompatibility that's associated with changing the profile of a library-defined function. The suggested change would allow calls to be unchanged, but would break any renames or access-to-subprogram uses. We could avoid that by using overloading instead, leaving the existing function alone and dropping the default values of the parameters. That's slightly less flexible (usually, you'd only want to give Last), but it's more compatible. (It's still not 100% compatible, because of use-visibility and the possibility of user-defined Random functions.) function Random (Gen : Generator; First : Result_Subtype; Last : Result_Subtype) return Result_Subtype; Obtains the "next" random number from the given generator, relative to its current state, according to an implementation-defined algorithm. The result of the function in an instantiation of Numerics.Discrete_Random is within the range First .. Last, and is delivered as a value of the generic formal subtype Result_Subtype. *************************************************************** From: Randy Brukardt Sent: Friday, December 11, 2015 9:33 PM The draft minutes for this AI say: AI12-0144-1/02 Make Discrete_Random more flexible Bob: It is hard to implement this right, users usually get it wrong. (The e-mail thread proves that multiple times.) So we should provide it. Add the correct algorithm to random generation for integers to the discussion. Steve Baird says that the wording in A.5.2(41) require a uniformly distributed result over the result subtype. But that's not what we want when we have the First and Last parameters. So add to A.5.2 (41): ..., or in the case of the version of Random with First and Last parameters, over the range First .. Last. Erhard suggests: "...uniformly distributed over the range provided by the parameters First and Last, or the range of the result subtype." He then withdraws that. Straw poll: Use a postcondition 6-1-0. Approve AI with changes: 7-0-0. ---------------------------------- Two problems with these minutes. First, I have no recollection what "use a postcondition" means. Jeff says that he thinks Steve Baird made the suggestion, but that's all he recalls. I edited the AI without using a postcondition, but then the second problem came up. Second, Jeff claims that we also are supposed to change the wording of A.5.2(42) like that of A.5.2(41). One clear problem with that is that at least I voted on this proposal assuming that the change of A.5.2(41) was the only one. I find the proposed wording to be an ugly patch, and I didn't object to it mainly because I couldn't think of anything better assuming only one use. One ugly hack is OK, but duplicating it in successive paragraphs is not acceptable, IMHO. To wit: A sufficiently long sequence of random numbers obtained by successive calls to Random is approximately uniformly distributed over the range of the result subtype{, or in the case of the version of Random with First and Last parameters, over the range First .. Last.}. The Random function in an instantiation of Numerics.Discrete_Random is guaranteed to yield each value in its result subtype{, or in the case of the version of Random with First and Last parameters, over the range First .. Last,} in a finite number of calls, provided that the number of such values does not exceed 2**15. I don't know if I was the only one who didn't know that a change was intended to A.5.2(42), but the fact that I didn't put it into my notes suggests that it wasn't done very clearly. Thus I have to think the vote to approve was invalid; I never, *ever* would have voted for the above, and I would have objected to voting for "whatever the editor comes up with" on general principles. ---------------------------------- With these two problems, we will have to rewrite and revote this AI. Since the problems are relatively minor, we can agree to wording here and then do a letter ballot to approve the AI (we don't have to spend meeting time on it assuming that we can agree enough that no one feels obligated to vote against the AI). I have no suggestion about a postcondition; someone will have to explain what was meant (and whether that intended to subsume some of the English wording in the AI). As far as the wording for A.5.2(41-2), I think we need to define once what the result ranges are and then use that in the other wording: {The *result range* of a Random function, is the range First .. Last for the version of Random with First and Last parameters, and the range of the result subtype otherwise.} A sufficiently long sequence of random numbers obtained by successive calls to Random is approximately uniformly distributed over the {result} range [of the result subtype]. {A}[The] Random function in an instantiation of Numerics.Discrete_Random is guaranteed to yield each value in its result {range}[subtype] in a finite number of calls, provided that the number of such values does not exceed 2 15. [Note: "15" in this last sentence is superscript. We're not changing that; it doesn't show in plain text.] Thoughts??? *************************************************************** From: John Barnes Sent: Saturday, December 12, 2015 4:35 AM A tricky subject. A good idea to define the range first. The actual requirement is pretty vague and so unwise to use a postcondition. And we would have to define uniformly distributed which is hardly the point. There are lots of ways of generating a useless sequence that visits every value the same number of times. Even G.2.5 is vague since it doesn't define what test suite it is talking about or what conformance means. So apart from defining the range first I would just leave it alone but please ensure that the poor superscript 15 is on the same line as the 2. It's not in the LNCS version which has a line feed between the 2 and 15. Gosh. Cheers and Happy Holidays John Don't use a postcondition. *************************************************************** From: Tucker Taft Sent: Saturday, December 12, 2015 9:32 AM I generally agree with John. The only thing I could imagine the postcondition might do is define the minimum and maximum possible values returned. You probably couldn't say anything more than that about the distribution with a postcondition. So, e.g., function Random (Gen : Generator; First : Result_Subtype; Last : Result_Subtype) return Result_Subtype with Post => Random'Result in First .. Last; *************************************************************** From: Bob Duff Sent: Saturday, December 12, 2015 11:14 AM > with Post => Random'Result in First .. Last; I think that's the postcondition that was discussed at the meeting, and I think somebody said that if we have that, then we don't need to say in English "the result is in the range ..." *************************************************************** From: Randy Brukardt Sent: Saturday, December 12, 2015 12:34 PM Randy’s proposed introduction of a definition of “result range” just for local usage in those two paragraphs seems a good simplification to me. And now that Tuck has written out the post-condition then yes, I think that’s all that was meant. Until people are used to looking for post-conditions, I think the (improved) wording is still needed. *************************************************************** From: John Barnes Sent: Friday, December 18, 2015 7:26 AM I am fretting over that postcondition. The text says that the value returned will be in the range. Fact. There is absolutely no hint that it might try to be outside the range and so raise some sort of exception. But if we express this idea with a postcondition, then by the dynamic nature of postconditions in Ada (as opposed to good old Spark) there is an implication that it might fail and so raise Assertion_Error or whatever. So for me the postcondition seems weaker. I don't like it. [Editor's note: This topic was split into AI12-0179-1; the mail thread continues there.] *************************************************************** From: Randy Brukardt Sent: Tuesday, January 5, 2016 12:10 AM Getting back to the original reason for this thread... Here's the revised proposed wording section for this AI. I have a question at the end... Add after A.5.2(20): function Random (Gen : Generator; First : Result_Subtype; Last : Result_Subtype) return Result_Subtype with Post => Random'Result in First .. Last; Add after A.5.2(32): function Random (Gen : Generator; First : Result_Subtype; Last : Result_Subtype) return Result_Subtype with Post => Random'Result in First .. Last; Obtains the "next" random number from the given generator, relative to its current state, according to an implementation-defined algorithm. The result of the function in an instantiation of Numerics.Discrete_Random is delivered as a value of the generic formal subtype Result_Subtype that makes the postcondition True. If the range First .. Last is a null range, Constraint_Error is raised. Modify A.5.2(41): {The *result range* of a Random function, is the range First .. Last for the version of Random with First and Last parameters, and the range of the result subtype otherwise.} A sufficiently long sequence of random numbers obtained by successive calls to Random is approximately uniformly distributed over the {result} range [of the result subtype]. Modify A.5.2(42): {A}[The] Random function in an instantiation of Numerics.Discrete_Random is guaranteed to yield each value in its result {range}[subtype] in a finite number of calls, provided that the number of such values does not exceed 2 15. ----------- In the new text for function Random, we have "The result of the function in an instantiation of Numerics.Discrete_Random is delivered as a value of the generic formal subtype Result_Subtype that makes the postcondition True." Assuming that we have the blanket rule for postconditions (now proposed in AI12-0179-1, not yet posted), this sentence says only that the function meets its specification. We *assume* that everywhere in the Standard, so this is just noise. I've continued to include it because this paragraph immediately follows A.5.2(32), which has similar wording, which is similarly useful (not at all). [In all of these cases, the important information is in the Implementation Requirements portion.] But it would be better to omit this junk wording -- that was the intent of adding a postcondition as I understand it. To make the wording consistent, we then would have to delete the existing wording from A.5.2(32). Specifically, the 2nd and 3rd sentences of the following paragraph would also be dropped as marked: Obtains the "next" random number from the given generator, relative to its current state, according to an implementation-defined algorithm. [The result of the function in Numerics.Float_Random is delivered as a value of the subtype Uniformly_Distributed, which is a subtype of the predefined type Float having a range of 0.0 .. 1.0. The result of the function in an instantiation of Numerics.Discrete_Random is delivered as a value of the generic formal subtype Result_Subtype.] I didn't unilaterally make this change as it seems like more change than is necessary, even though I can hardly imagine any more empty verbiage in the Standard. (What else would the result of Random be? A lake in southern Sheboygan county? Santa Claus?? :-) Who would expect something other than a value of its result subtype? Whose details can be read in the specification that precedes this text!) Should I make this slightly larger change, or forget it? P.S. I'd like to put this AI to bed someday soon. Or does Someday Never Come?? :-) *************************************************************** From: Tucker Taft Sent: Tuesday, January 5, 2016 2:32 PM > ... I didn't unilaterally make this change as it seems like more > change than is necessary, even though I can hardly imagine any more > empty verbiage in the Standard. (What else would the result of Random > be? A lake in southern Sheboygan county? Santa Claus?? :-) Who would > expect something other than a value of its result subtype? Whose > details can be read in the specification that precedes this text!) I have no problem with deleting clearly superfluous wording. The danger of blatantly redundant wording is that the user might think it means more than it says, and try to read something extra in it. So I would agree, heave ho! > Should I make this slightly larger change, or forget it? OK with me. *************************************************************** From: Edmond Schonberg Sent: Tuesday, January 5, 2016 2:43 PM >> Should I make this slightly larger change, or forget it? > > OK with me. That’s what I would call a Talmudic response ... *************************************************************** From: Tucker Taft Sent: Tuesday, January 5, 2016 2:53 PM Oops. Deleting the clearly redundant wording is fine if it is not clarifying things for the user. *************************************************************** From: Bob Duff Sent: Tuesday, January 5, 2016 4:06 PM > Assuming that we have the blanket rule for postconditions (now > proposed in AI12-0179-1, not yet posted), this sentence says only that > the function meets its specification. We *assume* that everywhere in > the Standard, so this is just noise. I agree. We should never say that a particular procedure obeys its postcondition, because that would lead readers to think that maybe other procedures don't obey their postcondition. Better to rely on the blanket statement. So I agree with your suggested change. *************************************************************** From: Jeff Cousins Sent: Wednesday, January 6, 2016 3:48 AM I agree for the same reason. *************************************************************** From: Tucker Taft Sent: Thursday, January 7, 2016 8:48 AM [Editor's note: Letter Ballot stuff stripped out here.] This new sentence preceding A.5.2(41) has some grammatical issues: The result range of a Random function, is the range First .. Last for the version of Random with First and Last parameters, and the range of the result subtype otherwise. That first comma doesn't make sense to me, but then that second comma has a problem if you leave it out. How about the following: Each Random function has a /result range/; this is the range First .. Last for the version of Random with First and Last parameters and the range of the result subtype of the function otherwise. *************************************************************** From: Jean-Pierre Rosen Sent: Thursday, January 7, 2016 9:14 AM [Editor's note: Letter Ballot stuff stripped out here.] It seems there has been some aggressive "substitute all" in the text. After "Add after A.5.2(32):", the text is in the code font, and all Ada keywords are in bold in the editor's note. Under the !discussion, in the phrase "see the appendix", appendix has been turned into a !appendix *************************************************************** From: Randy Brukardt Sent: Thursday, January 7, 2016 1:46 PM Those are the effects of the auto-formatter. The first defies any correction, since the only way to prevent the formatter from treating the text as part of the code would be to not indent it (the autoformatter assumes code continues until the end of the indented section). But the style of the RM is to have such text indented. Presumably the !corrigendum will be correct one it gets added to the AI. I saw this when I posted it, but as noted, I couldn't think of any fix. The second occurred because "!appendix" happened to start a line, and thus the formatter took it as the start of the !appendix section. Since the other tools make the same assumption, I will change that by moving line breaks around, but note that the formatting is not part of the official AI, so this is not a real change at all. *************************************************************** From: Randy Brukardt Sent: Thursday, January 7, 2016 1:55 PM > How about the following: > > Each Random function has a /result range/; this > is the range First .. Last for the version of Random > with First and Last parameters and the range > of the result subtype of the function otherwise. You couldn't have made this comment when I proposed this wording back on December 11th?? That's kinda the reason I *send* wording suggestions to the ARG list in the first place. This does seem like an improvement, I'll use it. I'll wait for any other suggestions before posting an updated AI. *************************************************************** From: Steve Baird Sent: Thursday, January 7, 2016 1:50 PM The proposed wording includes. > Modify A.5.2(41): > {The result range of a Random function, is the range First .. Last for > the version of Random with First and Last parameters, and the range of > the result subtype otherwise.} A sufficiently long sequence of random > numbers obtained by successive calls to Random is approximately > uniformly distributed over the {result} range [of the result subtype]. When we say "is approximately uniformly distributed over the result range", what single range are we talking about? We are talking about "result range" as though it is a property of the function, but two calls to the same function can have different result ranges (in the case where First and Last are passed in). To be correct, I think we really need to say that the result range differs from one call to the next (i.e., it is a property of the call, not of the function) and then require something like "A sufficiently long sequence of random numbers obtained by successive calls to Random which have the same result range is ..." The idea is that you have lots of calls to the Random which takes a First and Last parameter, then you partition that set of calls so that two calls are in the same partition element if and only if their First/Last values match (which means that they have the same result range). We then impose the "sufficiently long sequence" requirement on partition elements (where the notion of a single result range makes sense). Or am I being too pedantic? *************************************************************** From: Randy Brukardt Sent: Thursday, January 7, 2016 2:23 PM > Or am I being too pedantic? Usually. ;-) Why didn't you point this out when you were complaining about this wording in Bennington? The wording proposed there had a similar problem. Or back in December when the current version was proposed?? OK, now that I that I got *that* off of my chest, I'm understanding that you are proposing the following change to this wording, folding in Tucker's change: Modify A.5.2(41): {Each call of a Random function has a *result range*; this is the range First .. Last for the version of Random with First and Last parameters and the range of the result subtype of the function otherwise.} A sufficiently long sequence of random numbers obtained by successive calls to Random {which have the same result range} is approximately uniformly distributed over the {result} range [of the result subtype]. (The changes are to add "a call of" in the first paragraph, and "which have the same result range" to the second paragraph. One effect of this wording is that the requirement would hold for mixed calls to the Random without First and Last parameters and calls to Random where First = Result_Subtype'First and Last = Result_Subtype'Last (since these result ranges would be the same). That's probably OK, we'd want the routines to operate in the same way in that case. Should we make these changes? (I lean toward doing so, as they clarify the model.) [I think we'd restart the Letter Ballot if we do, as this is a pretty substantive change.] *************************************************************** From: Bob Duff Sent: Thursday, January 7, 2016 3:39 PM [Editor's note: Letter Ballot stuff stripped out here.] ... > The revised AI can be found on-line in the normal place: > http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0144-1.txt (make > sure you use the most recent version; I mistakenly posted a version > with a blank space where the change for A.5.2(32) was supposed to go, > so there are two /05 versions in the VCS). Sounds naughty to have to same-numbered versions. It's not like we're going to run out of positive integers. ;-) ... I agree with the comments that led to this: > Modify A.5.2(41): > > {Each call of a Random function has a *result range*; this is the range > First .. Last for the version of Random with First and Last parameters and > the range of the result subtype of the function otherwise.} > > A sufficiently long sequence of random numbers obtained by > successive calls to > Random {which have the same result range} is approximately uniformly > distributed > over the {result} range [of the result subtype]. I suggest deleting this from the AI: (unless a compiler vendor makes them available through a non-standard package). You could say that about ANY new feature. Obtains the ?next? random number from the given generator, relative to its current state, according to an implementation-defined algorithm.[ The result of the function in Numerics.Float_Random is delivered as a value of the subtype Uniformly_Distributed, which is a subtype of the predefined type Float having a range of 0.0 .. 1.0. The result of the function in an instantiation of Numerics.Discrete_Random is delivered as a value of the generic formal subtype Result_Subtype.] [Editor's note: We delete the last two paragraphs here as all they say is ^^^^^^^^^^ Those things we're deleting are sentences, not paragraphs. *************************************************************** From: Gary Dismukes Sent: Thursday, January 7, 2016 4:11 PM > Should we make these changes? (I lean toward doing so, as they clarify the > model.) [I think we'd restart the Letter Ballot if we do, as this is a > pretty substantive change.] Yes, I'm in favor of going ahead with those changes. With one minor tweak: change "which" to "that" in "{which have the same result range}". Other than that it looks good. :) If that means restarting the ballot, will hold off on my vote. *************************************************************** From: Randy Brukardt Sent: Thursday, January 7, 2016 4:30 PM ... > http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0144-1.txt (make > > sure you use the most recent version; I mistakenly posted a version > > with a blank space where the change for A.5.2(32) was supposed to > > go, so there are two /05 versions in the VCS). > > Sounds naughty to have to same-numbered versions. It's not like we're > going to run out of positive integers. ;-) I try to ensure that we never have two versions on the same date, so we're limited to one per day in usual circumstances. In a situation like this, where an incomplete AI gets stored, I just make the correction as the previous version shouldn't exist. (But one can't delete versions from the VCS, so my mistakes will live forever.) I also don't use new version numbers for presentation changes (that is, moving around the whitespace), for posting mail, and (rarely) for fixing obvious typos in nonnormative parts of an AI. ... > I suggest deleting this from the AI: > > (unless a compiler > vendor makes them available through a non-standard package). > > You could say that about ANY new feature. That's from the original posting by Adam; as I frequently tell John, we try to use as much of the original question as possible, and as such we edit it as little as possible. (In John's case, that means we might leave some iffy grammar and some unclear constructions.) I will trim out extraneous information, so it's a close call as to whether to leave it or remove it. I view it as harmless, which is why I left it initially. It fairly frequent that I have to create !problem or !question out of thin air, in which case all changes are on the table. But I don't want to put words in other people's mouths unless absolutely necessary (and it's best to let as much of their original thought remain as possible). ... > [Editor's note: We delete the last two paragraphs here as all they say is > ^^^^^^^^^^ > > Those things we're deleting are sentences, not paragraphs. Arrggghh!! *************************************************************** From: Steve Baird Sent: Monday, January 11, 2016 12:36 PM [Significant part of a longer message - Editor.] I'm seeing "does not exceed 2 15" as opposed to, perhaps, "does not exceed 2 ** 15" . Is there a typo here? *************************************************************** From: Gary Dismukes Sent: Monday, January 11, 2016 1:11 PM I believe that's a actually a superscript, and it doesn't get rendered properly in the online AI web page. [Added Editor's note to clarify - Editor.] *************************************************************** From: John Barnes Sent: Tuesday, January 12, 2016 4:09 AM [Significant part of a longer message - Editor.] The sentence "postconditions are assumed to be a requirement on the body of a subprogram" upset me I could read that to mean that every subprogram body has to have a postcondition. [Changed "be" to "impose" - Editor.] *************************************************************** From: Erhard Ploedereder Sent: Tuesday, January 19, 2016 8:00 PM [Significant part of a longer message - Editor.] For a letter ballot I am casting a "reject" vote (although I agree with the techbnical intent), because 1. The sentence > A sufficiently long sequence of random numbers obtained by successive > calls to Random is approximately uniformly distributed over the > {result} range [of the result subtype]. is at best seriously incomplete, at worst plain wrong. It should read: A sufficiently long sequence of random numbers obtained by successive calls to Random with the same actual parameter for Gen and the same result_subtype is approximately uniformly distributed over the {result} range [of the result subtype]. Obviously, both qualifications, especially the Gen bit, are important. 2. Looking at the interface in context, I am not sure that it is a good idea to have one interface in the package use postconditions, while the others do not. As much as I like postconditions, this looks odd. Maybe we could (re)do this package in its entirety with postconditions? Also, there are editorial bugs: Formatting is somehow off (courier fonts for parts of the text.) In the Semantics section of (32): ". it" -> ". It" *************************************************************** From: Randy Brukardt Sent: Wednesday, January 20, 2016 8:28 PM [Significant part of a longer message - Editor.] ... > Obviously, both qualifications, especially the Gen bit, are important. I suppose you're right, but this is Ada 95 wording, and it's hard to imagine how important it could be to be precise about the generator parameter now when it never was before. > 2. Looking at the interface in context, I am not sure that it is a good > idea to have one interface in the package use postconditions, while > the others do not. As much as I like postconditions, this looks odd. > Maybe we could (re)do this package in its entirety with > postconditions? The postcondition on the new routine is pretty minimal (it just defines the range); the operative stuff is in the Implementation Requirements that can't be described in a postcondition. There's no point in adding a postcondition of True to the other Random functions, and there isn't anything about the result which can be described with a postcondition. I don't see any useful postconditions that could be added to any of the other routines, either (Reset, Save, Image, Value). That's rather typical: the containers most likely will only have a handful of routines that have postconditions (based on our previous discussion on AI12-0112-1), the majority of routines will not. While this was one of the reasons that I've been luke-warm at best at adding a postcondition rather than words for this routine, this is a discussion that I clearly lost and I don't see much point in revisiting it lest we enter an infinite loop of no progress. > Also, there are editorial bugs: > Formatting is somehow off (courier fonts for parts of the > text.) In the Semantics section of (32): ". it" -> ". It" Sigh. You're (like several others) looking at the autoformatted version, which should *never* be used for editorial review of an AI, as the format is generated by guessing what each piece of text is. It will *always* have errors. These particular errors (assuming you meant "if" instead of "it", since there is no "it" in (32)) come from something that was discussed twice previously. *************************************************************** From: Erhard Ploedereder Sent: Friday, January 22, 2016 6:22 PM Returning to more technical arguments .... What got me going here was the fact that a user can easily misjudge the situation casued by a local instantiation of the generic package, which par force, because the generator type is inside the package (-- maybe stupidly --), causes his seemingly sequential calls on Random to NOT return random numbers, because the Generators differ. (Wasn't this even the scenario why this AI came up in the first place?) Worth mentioning? I think so. Worth a shitstorm? No. *************************************************************** From: Erhard Ploedereder Sent: Sunday, January 24, 2016 5:09 PM [Significant part of a longer message - Editor.] I have yet another technical issue: Now that there are two Random functions, what do the semantics say if both of them get called in some order? Clearly the uniform distribution sentence is then all screwed up. (A safer version would duplicate the surrounding package, so that the Generators keep the two kinds of calls apart, but that is a lot of text or the manual.) In exchange, I take back the one about a singular interface with postconditions. The other functions don't really have specifiable ones. *************************************************************** From: Randy Brukardt Sent: Sunday, January 24, 2016 5:32 PM [Significant part of a longer message - Editor.] > I have yet another technical issue: > > Now that there are two Random functions, what do the semantics say if > both of them get called in some order? > Clearly the uniform distribution sentence is then all screwed up. My intent (I thought that it was written somewhere, but I can't find it now) is that it doesn't matter. That is, only the result range (and, as you point out, the generator) matters, not the form of the call. Specifically, if one interleaved calls to the two random functions: ... Random (Gen, Result_Subtype'First, Result_Subtype'Last) ... ... Random (Gen) ... ... Random (Gen, Result_Subtype'First, Result_Subtype'Last) ... ... Random (Gen) ... The uniform distribution requirement would still apply. The expectation is that these really different views of the same logical function; we have two separate specifications for compatibility. As the AI says, if this was 1994, we just would have defined the function as: function Random (Gen : Generator; First : Result_Subtype := Result_Subtype'First; Last : Result_Subtype := Result_Subtype'Last) return Result_Subtype; and there would be no question about the handling of "different" random functions. (This specification is incompatible on the margins -- specifically for renaming and formal subprogram matching -- and it's too late to break those things.) I think the wording reflects that. We have (with your generator change): A sufficiently long sequence of random numbers obtained by successive calls to Random that have the same generator and result range is approximately uniformly distributed over the result range. And the definition of result range is in the (new) previous paragraph: Each call of a Random function has a *result range*; this is the range First .. Last for the version of Random with First and Last parameters and the range of the result subtype of the function otherwise.} > (A safer version would duplicate the surrounding package, so that the > Generators keep the two kinds of calls apart, but that is a lot of > text for the manual.) I don't see the point. We want to think of this as two views of a single random function, not two different functions. > In exchange, I take back the one about a singular interface with > postconditions. The other functions don't really have specifiable > ones. Yup, that's what I noticed when I looked at it. *************************************************************** From: Erhard Ploedereder Sent: Monday, January 25, 2016 6:43 AM > Specifically, if one interleaved calls to the two random functions: > ... Random (Gen, Result_Subtype'First, Result_Subtype'Last) ... > ... Random (Gen) ... > ... Random (Gen, Result_Subtype'First, Result_Subtype'Last) ... > ... Random (Gen) ... > > The uniform distribution requirement would still apply. The > expectation is that these really different views of the same logical > function; we have two separate specifications for compatibility. True for the above. False for: > ... Random (Gen, Result_Subtype'First+5, Result_Subtype'Last-5) > ... Random (Gen) ... > ... Random (Gen, Result_Subtype'First+5, Result_Subtype'Last-5) > ... Random (Gen) ... Now the 10 outliers get chosen half as often, since the distribution in the one routine gets narrowed to the First..Last range, and in the other routine to Result_Subtype'Range. And, within each category, the series is not reproducible, due to possibly varying numbers of the "other" calls. (What one could promise is that the float-version with the uniform distribution is underneath it all, and these versions merely map its uniform distribution faithfully onto discrete values within the result ranges.) *************************************************************** From: Randy Brukardt Sent: Monday, January 25, 2016 6:12 PM > Now the 10 outliers get chosen half as often, since the distribution > in the one routine gets narrowed to the First..Last range, and in the > other routine to Result_Subtype'Range. Sure, but the requirement doesn't apply to these calls because the ranges are different. The same would be true for: ... Random (Gen, 1, 10) ... -- (A) ... Random (Gen, 1, 5) ... -- (B) ... Random (Gen, 1, 10) ... -- (C) ... Random (Gen, 1, 5) ... -- (D) > And, within each category, the series is not reproducible, due to > possibly varying numbers of the "other" calls. I couldn't parse this, but I do see a problem with the wording (and probably it's the same one you're seeing). Specifically, the requirement wouldn't apply to calls (A) and (B) above, because the ranges are different. But the requirement does appear to apply to calls (A) and (C) above; the calls are "successive" and meet the other requirements. And that's not going to be easy to ensure, as the changes that call (B) makes to the generator Gen might make it difficult to preserve uniformity. Particularly if the pattern of calls is irregular. Similarly, a call to Reset or one of the other operations on Gen would mess up the requirement. Worst case: Reset (Gen, 1); A(1) := Random (Gen, 1, 10); Reset (Gen, 1); A(2) := Random (Gen, 1, 10); ... Arguably, the requirement applies to the two calls to Random here, but they'll most likely both return the same value. I think we can fix the wording to cover these cases by simply changing "successive" to "consecutive" in the proposed wording: A sufficiently long sequence of random numbers obtained by {consecutive} [successive] calls to Random that have the same generator and result range is approximately uniformly distributed over the result range. Maybe it would help to add an AARM note to spell out that "consecutive" here means that there are no intervening explicit calls (to anything) between the calls to Random. That prevents any issues because the generator is changed by calling Reset, other calls on Random with the same generator, and probably other things that we haven't thought of yet. You (or someone else) might complain that this version of the requirement would hardly ever apply. That's true, but it's OK, since the only real purpose of this requirement is to give users and implementers a baseline for a "good" random generator. And so long as the requirement *could* happen in a program (and it certainly could, just put the values in an array and analyze them later), the random function will have to behave properly. I'm not worried about someone constructing a random generator that only works right if there are no intervening calls -- one would have to be actively evil for a call to some unrelated function to make the random generator malfunction. (And the RM does not worry about actively evil implementers anyway, they can't be stopped by a mere Standard.) > (What one could promise is that the float-version with the uniform > distribution is underneath it all, and these versions merely map its > uniform distribution faithfully onto discrete values within the result > ranges.) That exposes too much mechanism; arguably, there wouldn't even have to be a float version. I think the above change does the trick. One could argue that the intent of "successive" was to not allow non-matching calls in between, but it's not obvious and at a minimum deserves an AARM note. Thoughts?? *************************************************************** From: Erhard Ploedereder Sent: Tuesday, January 26, 2016 6:10 AM > Sure, but the requirement doesn't apply to these calls because the > ranges are different. The same would be true for: > > ... Random (Gen, 1, 10) ... -- (A) > ... Random (Gen, 1, 5) ... -- (B) > ... Random (Gen, 1, 10) ... -- (C) > ... Random (Gen, 1, 5) ... -- (D) > >> > And, within each category, the series is not reproducible, due to >> > possibly varying numbers of the "other" calls. > I couldn't parse this, but I do see a problem with the wording (and > probably it's the same one you're seeing). > > Specifically, the requirement wouldn't apply to calls (A) and (B) > above, because the ranges are different. But the requirement does > appear to apply to calls (A) and (C) above; the calls are "successive" > and meet the other requirements. And that's not going to be easy to > ensure, as the changes that call (B) makes to the generator Gen might > make it difficult to preserve uniformity. Particularly if the pattern > of calls is irregular. Similarly, a call to Reset or one of the other > operations on Gen would mess up the requirement. Since the calls go to the same generator, I would expect - indeed, each one, within their range, would distribute uniformly - however, in general, not reproducibly, i.e., for cases, where the number of intervening other calls to the same Gen is dynamically determined, the sequence will not be reproducibly the same. So, one requirement on Random is met, the other is not. The counter example is: Reset (Gen, 1); A(1) := Random (Gen, 1, 10); for I in 1..changingvalue loop ... Random(Gen, 1, 5); end loop; A(2) := Random (Gen, 1, 10); *************************************************************** From: Randy Brukardt Sent: Tuesday, January 26, 2016 4:49 PM Yes, that's what I said (or at least meant). I went on to point out that we don't really care if the uniform requirement only applies in narrow circumstances, since an implementer would have to get it right in those circumstances and only an actively evil implementer would manage to do something bad in a case like the above if they've really met the uniform requirement. But you didn't answer my question: does changing the wording to "consecutive" (rather than "successive") and then adding an AARM note to clarify further address your concern??? And if not, what *would* address your concern??? P.S. To repeat yesterday's message, here's exactly what I suggested (with an explicit AARM note to follow): A sufficiently long sequence of random numbers obtained by consecutive calls to Random that have the same generator and result range is approximately uniformly distributed over the result range. AARM Discussion: In this rule, "consecutive" means that there are no intervening explicit calls (to anything) between the calls to Random. This restricts the rule to only applying to cases where just the Random function changes the generator. We don't mean to impose a requirement if there are intervening calls to Reset, to Random with same generator but a different result range, or any other case that would affect the sequence of values returned. [I don't want to get into formally defining what and what cannot appear between two consecutive calls, as that way seems to lead to madness. If the generator is hardware-based, simply wasting differing amounts of time between calls could change the behavior to something non-uniform. But we at least have to allow the program to store the value somewhere between calls to Random. Vague is probably better here.] *************************************************************** From: Tucker Taft Sent: Wednesday, January 27, 2016 8:42 AM > But you didn't answer my question: does changing the wording to > "consecutive" (rather than "successive") and then adding an AARM note > to clarify further address your concern??? Saying "consecutive" helps me, at least. *************************************************************** From: Erhard Ploedereder Sent: Wednesday, January 27, 2016 10:55 AM > But you didn't answer my question: does changing the wording to > "consecutive" (rather than "successive") and then adding an AARM note > to clarify further address your concern??? > > And if not, what *would* address your concern??? I am o.k. with the new wording, especially with the idea expressed by the AARM note. It is more user-understandable than what I had in mind. Maybe the AARM note could be relaxed by saying "...means that there are no other intervening calls involving the same generator." Tying it to a statically checkable and very restrictive property seems unnecessary. *************************************************************** From: Randy Brukardt Sent: Wednesday, January 27, 2016 2:20 PM In thinking about it more, I think you're right. I was trying to cover for the *next* problem someone thought of, but that's probably taking CYA too far. P.S. CYA = Cover your a**! *************************************************************** From: Randy Brukardt Sent: Friday, February 26, 2016 8:00 PM Voting results for Letter Ballot on AI12-0144-1/07. This AI is approved by a 10-0-2 vote. Approve as is: Randy Brukardt, Steve Baird, Bob Duff, Tucker Taft, Jean-Pierre Rosen, Jeff Cousins, Ed Schonberg, Tullio Vardanega, Gary Dismukes, and Brad Moore. Voting, but Abstaining: John Barnes and Alan Burns. Not voting: Pascal Leroy, Erhard Ploedereder, and Florian Schanda. ****************************************************************