--- ai12s/ai12-0144-1.txt 2015/12/18 02:07:00 1.4 +++ ai12s/ai12-0144-1.txt 2016/01/05 06:11:12 1.5 @@ -1,4 +1,4 @@ -!standard A.5.2(20) 15-11-19 AI12-0144-1/03 +!standard A.5.2(20) 16-01-04 AI12-0144-1/04 !standard A.5.2(32) !standard A.5.2(41) !class Amendment 14-12-04 @@ -28,7 +28,7 @@ generate_random_number, and a new generator variable of type Instantiation.Generator needs to be declared. The questioner tried Reset to reset the generator to a time-dependent state each time, but this failed -because the procedure was being called faster than the clock was changing. +because the procedure was being called faster than the clock was changing. The most direct way to accomplish something like this seems to be to use Float_Random, reset the generator, and then @@ -49,32 +49,57 @@ function Random (Gen : Generator; First : Result_Subtype; - Last : Result_Subtype) return Result_Subtype; + Last : Result_Subtype) return Result_Subtype + with Post => Random'Result in First .. Last; Add after A.5.2(32): function Random (Gen : Generator; First : Result_Subtype; - Last : Result_Subtype) return Result_Subtype; + Last : Result_Subtype) return Result_Subtype + with Post => Random'Result in First .. Last; -Obtains the "next" random number from the given generator, relative to its -current state, according to an implementation-defined algorithm. The result -of the function in an instantiation of Numerics.Discrete_Random is within the -range First .. Last, and is delivered as a value of the generic formal subtype -Result_Subtype. If the range First .. Last is a null range, Constraint_Error -is raised. + Obtains the "next" random number from the given generator, relative to its + current state, according to an implementation-defined algorithm. The result + of the function in an instantiation of Numerics.Discrete_Random is delivered + as a value of the generic formal subtype Result_Subtype that makes the + postcondition True. If the range First .. Last is a null range, Constraint_Error + is raised. + + [Editor's notes: During the ARG phone meeting of February 26, 2015, it was + suggested that the null range case raise Program_Error. However, + Constraint_Error is the existing exception for that (A.5.2.(39)), so we + used that here. + + The "delivered as a value" wording is intended to echo that of A.5.2(32); it + would be odd to leave it out. But it doesn't say anything that isn't required + by the specification (and the later implementation requirements). Possibly + a better thing to do would be to delete the two sentences of noise from + A.5.2(32), and then we don't need it here, either. I didn't do that as it + would be more change than necessary, but we could/should consider doing so. + + Postconditions are assumed to be a requirement on the body of a subprogram, + as proposed in AI12-0179-1. If that AI is not approved, then we should not + use a postcondition here. + End Editor's notes.] -[Editor's note: During the ARG phone meeting of February 26, 2015, it was -suggested that the null range case raise Program_Error. However, -Constraint_Error is the existing exception for that (A.5.2.(39)), so we -used that here.] - Modify A.5.2(41): + + {The *result range* of a Random function, is the range First .. Last for the + version of Random with First and Last parameters, and the range of the result + subtype otherwise.} + + A sufficiently long sequence of random numbers obtained by successive calls to + Random is approximately uniformly distributed over the {result} range [of the + result subtype]. + +Modify A.5.2(42): + + {A}[The] Random function in an instantiation of Numerics.Discrete_Random is + guaranteed to yield each value in its result {range}[subtype] in a + finite number of calls, provided that the number of such values does not + exceed 2 15. -A sufficiently long sequence of random numbers obtained by successive calls to -Random is approximately uniformly distributed over the range of the result -subtype{, or in the case of the version of Random with First and Last -parameters, over the range First .. Last}. !discussion @@ -148,7 +173,7 @@ new generator variable of type Instantiation.Generator needs to be declared. The questioner tried Reset to reset the generator to a time-dependent state each time, but this failed because the procedure -was being called faster than the clock was changing. +was being called faster than the clock was changing. Perhaps there are ways to get this to work with an initiator parameter on Reset. However, the most direct way to accomplish something like @@ -166,11 +191,11 @@ (1) Add to Discrete_Random: - function Random (Gen : Ada.Numerics.Float_Random.Generator) + function Random (Gen : Ada.Numerics.Float_Random.Generator) return Result_Subtype; which would allow the program to set up one common generator using - Float_Random. + Float_Random. (2) Add to Float_Random: @@ -189,14 +214,14 @@ Since this question came from a StackOverflow poster who could have just been experimenting, I can't say how useful in practice this feature would be. It does seem like a flaw that it's missing, -however. +however. *************************************************************** From: Jean-Pierre Rosen Sent: Thursday, November 6, 2014 3:17 PM -> It looks like this isn't easily possible using +> It looks like this isn't easily possible using > Ada.Numerics.Discrete_Random Isn't it possible to use a random generator on Natural and return @@ -226,12 +251,12 @@ From: Pascal Obry Sent: Thursday, November 6, 2014 4:10 PM -> Isn't it possible to use a random generator on Natural and return +> Isn't it possible to use a random generator on Natural and return > Value mod n + 1 ? -> -> (the quality of the random numbers might not be as good for big n's, +> +> (the quality of the random numbers might not be as good for big n's, > but it may be sufficient in practice) -> +> Or better, just use Ada.Numerics.Float_Random and multiply by N. @@ -240,9 +265,9 @@ From: Jean-Pierre Rosen Sent: Friday, November 7, 2014 2:32 AM -> Sure, if you don't want a uniform distribution. Unless the number of -> values in Natural (2**m for some m) is divisible by (n + 1), doing -> this will result in some values in the range being generated with a +> Sure, if you don't want a uniform distribution. Unless the number of +> values in Natural (2**m for some m) is divisible by (n + 1), doing +> this will result in some values in the range being generated with a > higher probability than others. Well, correct my maths if I'm wrong, but it seems to me that the average @@ -290,13 +315,13 @@ From: Jean-Pierre Rosen Sent: Friday, November 7, 2014 11:59 PM ->> Isn't it possible to use a random generator on Natural and return +>> Isn't it possible to use a random generator on Natural and return >> Value mod n + 1 ? >> ->> (the quality of the random numbers might not be as good for big n's, +>> (the quality of the random numbers might not be as good for big n's, >> but it may be sufficient in practice) >> -> +> > Or better, just use Ada.Numerics.Float_Random and multiply by N. This doesn't work. @@ -313,13 +338,13 @@ From: Jean-Pierre Rosen Sent: Saturday, November 8, 2014 12:12 AM -> This means you want to select with ranges 1..52, then 1..51, then +> This means you want to select with ranges 1..52, then 1..51, then > 1..50, etc. -> -> Since m in this case is 52!, I'm pretty sure that no 32-bit or even +> +> Since m in this case is 52!, I'm pretty sure that no 32-bit or even > 64-bit generator is going to be uniform using your formula. -?? m (if you refer to the number used as a parameter to the function we are +?? m (if you refer to the number used as a parameter to the function we are discussing) is 52, then 51, ... Certainly a "small number" for the generator bias. @@ -339,10 +364,10 @@ From: Jeffery R. Carter Sent: Saturday, November 8, 2014 12:36 AM ->>> Isn't it possible to use a random generator on Natural and return +>>> Isn't it possible to use a random generator on Natural and return >>> Value mod n + 1 ? >>> ->>> (the quality of the random numbers might not be as good for big n's, +>>> (the quality of the random numbers might not be as good for big n's, >>> but it may be sufficient in practice) For some applications (games, for example) the statistics of the RNG may not @@ -370,14 +395,14 @@ From: Randy Brukardt Sent: Saturday, November 8, 2014 1:26 AM -> > This means you want to select with ranges 1..52, then 1..51, then +> > This means you want to select with ranges 1..52, then 1..51, then > > 1..50, etc. -> > -> > Since m in this case is 52!, I'm pretty sure that no 32-bit or even +> > +> > Since m in this case is 52!, I'm pretty sure that no 32-bit or even > > 64-bit generator is going to be uniform using your formula. -> -> ?? m (if you refer to the number used as a parameter to the function -> we are discussing) is 52, then 51, ... Certainly a "small number" for +> +> ?? m (if you refer to the number used as a parameter to the function +> we are discussing) is 52, then 51, ... Certainly a "small number" for > the generator bias. To use the *existing* Discrete_Random, you have to use an m that is a factor @@ -386,8 +411,8 @@ To use the float generator, you don't use "m" at all. -> As I use to say, I'm not a numerician, I just know enough of numerics -> to realize how easy it is to make incorrect statements about numerics +> As I use to say, I'm not a numerician, I just know enough of numerics +> to realize how easy it is to make incorrect statements about numerics > when you are not a numerician. Same here. But I know that I couldn't get even acceptable results out of the @@ -438,8 +463,8 @@ absence of some customer demand. > So I'm asking: -> -> 1) Is it a real problem? (i.e. worth spending time on it - we should +> +> 1) Is it a real problem? (i.e. worth spending time on it - we should > take advice of real numericians) It's certainly a real problem in that there is a class of applications where @@ -447,8 +472,8 @@ provide completely uniform behavior. Perhaps there is some numerical reason that the problem can never be solved, but we will need to be convinced of that. -> 2) What do other languages do? (to be fair, having a functionality is -> not enough if there is no guarantee of uniformity - anybody can write +> 2) What do other languages do? (to be fair, having a functionality is +> not enough if there is no guarantee of uniformity - anybody can write > a bad random generator). I don't think that other languages even had an equivalent for Discrete_Random @@ -475,7 +500,7 @@ From: Bob Duff Sent: Saturday, November 8, 2014 9:26 AM -> For some applications (games, for example) the statistics of the RNG +> For some applications (games, for example) the statistics of the RNG > may not be important, but for others, they are very important. Depends on whether you're playing for money. I'd like to play blackjack @@ -501,9 +526,9 @@ Randy wrote: -> I don't think that other languages even had an equivalent for -> Discrete_Random back in the 1990s. Whether that's changed, I don't -> know; none of the languages I've ever used has had anything other than +> I don't think that other languages even had an equivalent for +> Discrete_Random back in the 1990s. Whether that's changed, I don't +> know; none of the languages I've ever used has had anything other than > a float version. Lots of languages have pseudo-random generators that produce integers. @@ -515,8 +540,8 @@ bad, or so I've heard. I'm pretty sure rand() dates back to the early days of C, so before 1990s. -> I think this means that we'll need an AI and someone will have to be -> tasked to go off and scour the Internet for information. (We're never +> I think this means that we'll need an AI and someone will have to be +> tasked to go off and scour the Internet for information. (We're never > going to have enough "real numericians" in the ARG.) Or ask somebody who knows what they're talking about (unlike me). @@ -530,16 +555,16 @@ >> A better approach is to truncate R * (N + 1), but even that isn't acceptable >> for strict uses. This is the approach taken by the Random_Int function in >> PragmARC.Universal_Random. -> -> That didn't even work for me, as there was a non-zero chance of R = -> 1.0; in that case the result ended up indexing out of the range of the -> upper end of the selection array (that's how I found out it was still -> wrong, a bizarre Constraint_Error after simulating several thousand -> deals). I ended up putting that value into the uppermost bin; that's -> makes the uppermost bin slightly more likely to be selected, but never -> more often than 52/2**53, which seemed small enough for my -> application. (That would take roughly 2**40 simulations to be -> significant.) I don't know of any way to avoid that problem, perhaps +> +> That didn't even work for me, as there was a non-zero chance of R = +> 1.0; in that case the result ended up indexing out of the range of the +> upper end of the selection array (that's how I found out it was still +> wrong, a bizarre Constraint_Error after simulating several thousand +> deals). I ended up putting that value into the uppermost bin; that's +> makes the uppermost bin slightly more likely to be selected, but never +> more often than 52/2**53, which seemed small enough for my +> application. (That would take roughly 2**40 simulations to be +> significant.) I don't know of any way to avoid that problem, perhaps > that's the source of the issue that J-P remembered reading about. Other than Ada.Numerics.Float_Random, I've never used an RNG that produces @@ -558,18 +583,18 @@ From: Jeffrey R. Carter Sent: Saturday, November 8, 2014 2:43 PM -> Depends on whether you're playing for money. I'd like to play +> Depends on whether you're playing for money. I'd like to play > blackjack against a program that uses the "mod N" method. ;-) Then it's a business for someone. Not entirely a game. > I thought the standard way to get a random integer in the range -> (say) 1 .. 6 is to get a random integer in the range (say) 0 .. +> (say) 1 .. 6 is to get a random integer in the range (say) 0 .. > 2**31-1, and throw away results >= 2_147_483_646. > 2_147_483_646 is the largest multiple of 6 <= 2**31-1. -> Do this in a loop until you get a number in 0 .. 2_147_483_645. THEN +> Do this in a loop until you get a number in 0 .. 2_147_483_645. THEN > do the "mod 6" and add 1. -> +> > That loop terminates, and usually goes round just once. > It's not much harder to program than the plain "mod 6" method. @@ -588,8 +613,8 @@ From: Tero Koskinen Sent: Saturday, November 8, 2014 3:21 PM -> 2) What do other languages do? (to be fair, having a functionality is -> not enough if there is no guarantee of uniformity - anybody can write +> 2) What do other languages do? (to be fair, having a functionality is +> not enough if there is no guarantee of uniformity - anybody can write > a bad random generator). OpenBSD uses arc4random and arc4random_uniform(uint32_t upper_bound) functions @@ -635,8 +660,8 @@ From: Randy Brukardt Sent: Saturday, November 8, 2014 3:36 PM -> Other than Ada.Numerics.Float_Random, I've never used an RNG that -> produces 1.0, and I tend to forget that Float_Random may. My comment +> Other than Ada.Numerics.Float_Random, I've never used an RNG that +> produces 1.0, and I tend to forget that Float_Random may. My comment > was for RNGs that don't. I think that they all don't effectively. My recollection is that Float_Random @@ -665,7 +690,7 @@ Let's take the initial proposal from Adam: (1) Add to Discrete_Random: - function Random (Gen : Ada.Numerics.Float_Random.Generator) + function Random (Gen : Ada.Numerics.Float_Random.Generator) return Result_Subtype; My understanding of the given problem is how to produce a "random" integer @@ -736,29 +761,29 @@ > > Add then all x = u + v + w, so we have a random x in range 0..19. > > Plus one, we have a random x in range 1..20. -> -> No, I'm afraid you cannot do that. If you have two dice and add the +> +> No, I'm afraid you cannot do that. If you have two dice and add the > numbers of them you will not get an even distribution -> (7 being the most common sum). It should not be too much work +> (7 being the most common sum). It should not be too much work > calculating the distribution of your suggestion. Right. When dealing with probabilities and distributions and the like, the operative phrase is that "everything you know is wrong". In particular, ordinary mathematics can't used to combine probabilities, and in fact little can be done at all if the probabilities aren't independent. - -> If you on the other hand (like Bob Duff already suggested) throw away -> the numbers outside the range, and retry until you get a value inside -> the range, you should get an even distribution over that range. If you -> for example would roll a six sided dice, throw away and retry every -> time you get a six, you will basically get a five sided dice since + +> If you on the other hand (like Bob Duff already suggested) throw away +> the numbers outside the range, and retry until you get a value inside +> the range, you should get an even distribution over that range. If you +> for example would roll a six sided dice, throw away and retry every +> time you get a six, you will basically get a five sided dice since > those values will still be an even distribution. -> -> As Bob also said you would throw away the values outside an even -> multiple of the range, else you'll need to loop for awhile if you want +> +> As Bob also said you would throw away the values outside an even +> multiple of the range, else you'll need to loop for awhile if you want > to make a dice out of a 64 bit number. -> -> Looks like the code Tero posted also loops until it finds a value +> +> Looks like the code Tero posted also loops until it finds a value > inside a range. Clearly, we can write such a function. @@ -782,7 +807,7 @@ current state, according to an implementation-defined algorithm. The result of the function in an instantiation of Numerics.Discrete_Random is within the range First .. Last, and is delivered as a value of the generic formal subtype -Result_Subtype. +Result_Subtype. --- @@ -805,8 +830,192 @@ current state, according to an implementation-defined algorithm. The result of the function in an instantiation of Numerics.Discrete_Random is within the range First .. Last, and is delivered as a value of the generic formal subtype -Result_Subtype. +Result_Subtype. *************************************************************** +From: Randy Brukardt +Sent: Friday, December 11, 2015 9:33 PM + +The draft minutes for this AI say: + +AI12-0144-1/02 Make Discrete_Random more flexible + +Bob: It is hard to implement this right, users usually get it wrong. (The e-mail +thread proves that multiple times.) So we should provide it. + +Add the correct algorithm to random generation for integers to the discussion. + +Steve Baird says that the wording in A.5.2(41) require a uniformly distributed +result over the result subtype. But that's not what we want when we have the +First and Last parameters. + +So add to A.5.2 (41): ..., or in the case of the version of Random with First +and Last parameters, over the range First .. Last. + +Erhard suggests: "...uniformly distributed over the range provided by the +parameters First and Last, or the range of the result subtype." He then +withdraws that. + +Straw poll: Use a postcondition 6-1-0. + +Approve AI with changes: 7-0-0. + +---------------------------------- + +Two problems with these minutes. + +First, I have no recollection what "use a postcondition" means. Jeff says that +he thinks Steve Baird made the suggestion, but that's all he recalls. I edited +the AI without using a postcondition, but then the second problem came up. + + +Second, Jeff claims that we also are supposed to change the wording of A.5.2(42) +like that of A.5.2(41). One clear problem with that is that at least I voted on +this proposal assuming that the change of A.5.2(41) was the only one. I find the +proposed wording to be an ugly patch, and I didn't object to it mainly because I +couldn't think of anything better assuming only one use. One ugly hack is OK, +but duplicating it in successive paragraphs is not acceptable, IMHO. To wit: + + A sufficiently long sequence of random numbers obtained by successive calls to + Random is approximately uniformly distributed over the range of the result + subtype{, or in the case of the version of Random with First and Last + parameters, over the range First .. Last.}. + + The Random function in an instantiation of Numerics.Discrete_Random is + guaranteed to yield each value in its result subtype{, or in the case of the + version of Random with First and Last parameters, over the range First .. + Last,} in a finite number of calls, provided that the number of such values + does not exceed 2**15. + +I don't know if I was the only one who didn't know that a change was intended to +A.5.2(42), but the fact that I didn't put it into my notes suggests that it +wasn't done very clearly. Thus I have to think the vote to approve was invalid; +I never, *ever* would have voted for the above, and I would have objected to +voting for "whatever the editor comes up with" on general principles. + +---------------------------------- + +With these two problems, we will have to rewrite and revote this AI. Since the +problems are relatively minor, we can agree to wording here and then do a letter +ballot to approve the AI (we don't have to spend meeting time on it assuming +that we can agree enough that no one feels obligated to vote against the AI). + +I have no suggestion about a postcondition; someone will have to explain what +was meant (and whether that intended to subsume some of the English wording in +the AI). + +As far as the wording for A.5.2(41-2), I think we need to define once what the +result ranges are and then use that in the other wording: + + {The *result range* of a Random function, is the range First .. Last for the + version of Random with First and Last parameters, and the range of the result + subtype otherwise.} + + A sufficiently long sequence of random numbers obtained by successive calls to + Random is approximately uniformly distributed over the {result} range [of the + result subtype]. + + {A}[The] Random function in an instantiation of Numerics.Discrete_Random is + guaranteed to yield each value in its result {range}[subtype] in a + finite number of calls, provided that the number of such values does not + exceed 2 15. + +[Note: "15" in this last sentence is superscript. We're not changing that; it +doesn't show in plain text.] + +Thoughts??? + +*************************************************************** + +From: John Barnes +Sent: Saturday, December 12, 2015 4:35 AM + +A tricky subject. A good idea to define the range first. + +The actual requirement is pretty vague and so unwise to use a postcondition. And +we would have to define uniformly distributed which is hardly the point. There +are lots of ways of generating a useless sequence that visits every value the +same number of times. Even G.2.5 is vague since it doesn't define what test +suite it is talking about or what conformance means. + +So apart from defining the range first I would just leave it alone but please +ensure that the poor superscript 15 is on the same line as the 2. It's not in +the LNCS version which has a line feed between the 2 and 15. Gosh. + +Cheers and Happy Holidays + +John + + +Don't use a postcondition. + +*************************************************************** + +From: Tucker Taft +Sent: Saturday, December 12, 2015 9:32 AM + +I generally agree with John. The only thing I could imagine the postcondition +might do is define the minimum and maximum possible values returned. You +probably couldn't say anything more than that about the distribution with a +postcondition. + +So, e.g., + +function Random (Gen : Generator; + First : Result_Subtype; + Last : Result_Subtype) return Result_Subtype + with Post => Random'Result in First .. Last; + +*************************************************************** + +From: Bob Duff +Sent: Saturday, December 12, 2015 11:14 AM + +> with Post => Random'Result in First .. Last; + +I think that's the postcondition that was discussed at the meeting, and I think +somebody said that if we have that, then we don't need to say in English "the +result is in the range ..." + +*************************************************************** + +From: Randy Brukardt +Sent: Saturday, December 12, 2015 12:34 PM + +Randy’s proposed introduction of a definition of “result range” just for local +usage in those two paragraphs seems a good simplification to me. + +And now that Tuck has written out the post-condition then yes, I think that’s +all that was meant. + +Until people are used to looking for post-conditions, I think the (improved) +wording is still needed. + +*************************************************************** + +From: Randy Brukardt +Sent: Friday, December 11, 2015 9:33 PM + +*************************************************************** + +From: John Barnes +Sent: Friday, December 18, 2015 7:26 AM + +I am fretting over that postcondition. + +The text says that the value returned will be in the range. Fact. There is +absolutely no hint that it might try to be outside the range and so raise some +sort of exception. + +But if we express this idea with a postcondition, then by the dynamic nature of +postconditions in Ada (as opposed to good old Spark) there is an implication +that it might fail and so raise Assertion_Error or whatever. + +So for me the postcondition seems weaker. I don't like it. + +[Editor's note: This topic was split into AI12-0179-1; the mail thread continues +there.] + +***************************************************************

Questions? Ask the ACAA Technical Agent