!standard 3.2(8) 07-12-06 AC95-00152/01

!class confirmation 07-12-06

!status received no action 07-12-06

!status received 07-11-21

!subject Using floating point correctly is hard

!summary

!appendix

!topic Constraint_Error for FP value mathematically equal to type bound
!reference Ada 2005 RM 3.2(8)
!from Brian Dobbing 07-11-21
!discussion

Constraint_Error may be raised for a failed range check when comparing a
computed float result with the 'First and 'Last values of its type (which
has a range constraint) in the case where the computed result is
mathematically equal to one of the bounds, to the number of decimal digits
of accuracy of the type, but the result is beyond the bound in FP machine
representation terms due to rounding in the contribution of the
"Ada-insignificant" mantissa bits.

The specific project example that highlighted the issue is shown by:

package P is
   type Feet is range 0 .. 67000;
   Feet_To_Metres : constant := 0.3048;
   Max_Metres : constant := Float(Feet'Last) * Feet_To_Metres;
   type Metres is digits 6 range 0.0 .. Max_Metres;
   M : Metres;
   function Conv (F : Feet) return Metres;
end P;

package body P is
   function Conv (F : Feet) return Metres is
   begin
      return Metres (Float (F) * Feet_To_Metres);
   end Conv;
end P;

with P;
with Ada.Text_IO;
procedure Main is
begin
   P.M := P.Conv (P.Feet'First);
   Ada.Text_IO.Put_Line ("OK");
exception
   when Constraint_Error =>
      Ada.Text_IO.Put_Line ("C_E raised");
end Main;

This raises Constraint_Error in certain (but not all) implementations
despite:
- 67000.0 being accurate to digits 6 (Float)
- 0.3048 being accurate to digits 6 (Float)
- 20421.6 (67000.0 * 0.3048) being accurate to digits 6 (Float)

The problem is made more acute in the scenario of a large safety-related
system where there is no obvious systematic way to determine which precise
characteristics of an FP type would give rise to this effect.

Various workarounds were proposed:

(a) Why are you using range constraints for FP types, rather than just
"digits"?
The design model we use requires types to be specified as completely and
accurately as possible, and using static constraints, so that we can
eliminate application-illegal values/computations by static analysis and
proof of absence of run-time exceptions (via SPARK). In this context, all of
our types, including the FP ones, have precise static range constraints that
reflect the real world entities that they represent.

(b) Artificially inflate the range so that C_E is not raised

This doesn't work in the case where conversion between two types must be
performed in both directions and the ranges of each type are derived from
each other (e.g. Degrees and Radians).

(c) Add explicit dynamic checks against T'First and T'Last prior to doing
every mixed-type computation

This is unrealistic for large-scale existing code and is anyway dangerous,
since a truly illegal value may be changed to a legal type bound. It may be
possible if we could reliably detect all type conversion pairings that might
give rise to this problem (see (d)).

(d) Check that the (static) bounds of every FP type are model numbers of the
type

We tried a few experiments using T'Model, but were not able to find the
expression style to detect the potentially-failing type, e.g.

   pragma Assert (Metres'Model(Metres'Last) = Metres'Last)

did not fail.

(e) Suppress range checks for all FP types

Unacceptable for a safety-related project.

****************************************************************

From: Adam Beneschan
Sent: Monday, November 26, 2007 3:46 PM

...

> with P;
> with Ada.Text_IO;
> procedure Main is
> begin
>    P.M := P.Conv (P.Feet'First);

The operand should be P.Feet'Last, I presume?
>    Ada.Text_IO.Put_Line ("OK");
> exception
>    when Constraint_Error =>
>       Ada.Text_IO.Put_Line ("C_E raised");
> end Main;
>
> This raises Constraint_Error in certain (but not all) implementations
> despite:
> - 67000.0 being accurate to digits 6 (Float)
> - 0.3048 being accurate to digits 6 (Float)
> - 20421.6 (67000.0 * 0.3048) being accurate to digits 6 (Float)

I wonder if this isn't an error in the Ada implementation. The definition of
Max_Metres is

   Max_Metres : constant := Float(Feet'Last) * Feet_To_Metres;

If it were

   Max_Metres : constant := 67000.0 * Feet_To_Metres;

then the multiplication would be the result of multiplying two
universal_real values, which would be exactly 67000.0 and 0.3048; the
multiplication would have to be mathematically exact, and Max_Metres would
be the result, which would later be converted to Float when the Metres type
is defined. But in this declaration:

   Max_Metres : constant := Float(Feet'Last) * Feet_To_Metres;

the multiplication is the multiplication of two Float values. The right
operand, Feet_To_Metres, will be converted to a Float; thus, it will not be
exactly 0.3048 but will be slightly less or more than that. Thus, the
mathematical result of the multiplication will be slightly different, and
this could result in Float(Max_Metres) being different. If the
implementation is instead using exact arithmetic to compute Max_Metres, I'd
guess that it's incorrect, and could produce the wrong result.

In theory, computing the value of Max_Metres and computing the value of the
result of the multiplication inside Conv should involve the exact same
operations: converting 0.3048 to a Float, converting the integer Feet to a
Float (which will produce an exact result), and then using the
multiplication operation predefined for Float. So it seems like the same
result should be produced, and you shouldn't get C_E.
Even so, I don't think it can be trusted; if you're running the compiler on
one processor and generating code for a different processor, the compiler
will probably do the multiplication resulting in Max_Metres itself, while
the target processor will do the multiplication inside Conv, and it's
possible that the results will not be the same in the last bit.

I could be wrong about any of the above... I'm not sure what the language
says about this (it seems to say that any result is OK as long as it's in
the model interval, but I would think that the exact same operations on the
exact same model numbers should produce the exact same result). Also, I
don't know if there are any IEEE standards dictating what the result of the
multiplication should look like.

However, I wonder if you need to program things to be a bit more tolerant of
roundoff errors.

   function Conv (F : Feet) return Metres is
      Result : Metres'Base;
   begin
      Result := Metres'Base (Float (F) * Feet_To_Metres);
      if Result > Metres'Last then
         if Result = Metres'Succ (Metres'Last) then
            return Metres'Last;
         else
            raise Constraint_Error;
         end if;
      else
         return Result;
      end if;
   end Conv;

I haven't tried this (except to verify that it appears to be legal Ada).
Will this work? Is it acceptable? Is it a good idea? Any thoughts?
Floating-point programming isn't exactly my forte'. I'm hoping someone with
more knowledge in this area will respond.
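[The tolerance-check pattern above is language-independent; the following is
a rough Python sketch of the same idea, not from the thread. The names are
hypothetical, and math.nextafter plays the role of Metres'Succ applied to
the upper bound.]

```python
import math

FEET_TO_METRES = 0.3048                 # named constant, as in package P
MAX_METRES = 67000.0 * FEET_TO_METRES   # the "static" upper bound

def conv(feet: int) -> float:
    """Convert feet to metres, tolerating a result that is exactly one
    machine number past the upper bound (analogue of Metres'Succ)."""
    result = float(feet) * FEET_TO_METRES
    if result > MAX_METRES:
        # Accept a value one representable step past the bound;
        # anything further out is a genuine range violation.
        if result == math.nextafter(MAX_METRES, math.inf):
            return MAX_METRES
        raise ValueError("out of range")
    return result
```

As with the Ada version, this accepts at most one machine number past the
bound and rejects anything further out.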
****************************************************************

From: Randy Brukardt
Sent: Monday, November 26, 2007 9:53 PM

> Constraint_Error may be raised for a failed range check when comparing a
> computed float result with the 'First and 'Last values of its type (which
> has a range constraint) in the case where the computed result is
> mathematically equal to one of the bounds, to the number of decimal digits
> of accuracy of the type, but the result is beyond the bound in FP machine
> representation terms due to rounding in the contribution of the
> "Ada-insignificant" mantissa bits.

I'm no numerical expert, but this is precisely what I'd expect to happen.
It's always necessary to take potential calculation errors into account when
using floating point.

...

> This raises Constraint_Error in certain (but not all) implementations
> despite:
> - 67000.0 being accurate to digits 6 (Float)
> - 0.3048 being accurate to digits 6 (Float)
> - 20421.6 (67000.0 * 0.3048) being accurate to digits 6 (Float)

None of this is relevant. What is relevant is the error that occurs when you
do a floating point multiplication. Unlike fixed point error, it depends on
the values being multiplied.

One of the first things I learned in programming (in my introductory Fortran
class in 1976) is that you can't compare two independently derived floating
point expressions and expect to get equality. That applies to bounds just as
well as to any calculations.

Of course, I've learned more about floating point since then. The main thing
is that the model for Ada floating point operations is interval arithmetic
(unless the value is exactly a model number). That is, for a particular
mathematical value V, the floating point representation must be in the
interval V - E1 .. V + E2. The magnitude of the E's depends on the digits
value *and* the value in question; in this example they are about V*1E-6.
When you multiply two numbers, you have to do an interval multiply,
multiplying all of the interval ends and taking the largest and smallest
answers to provide the final interval. For instance, if you multiply F*C,
you have to calculate the intervals based on the values (assuming both C and
F are positive):

   F-E1 .. F+E2  *  C-E3 .. C+E4 =>
      F*C - C*E1 - F*E3 + E1*E3 .. F*C + C*E2 + F*E4 + E2*E4

E1*E3 and E2*E4 are going to be much smaller than the other terms, so
they're usually ignored. We can plug in the estimates for E1, E2, E3, E4 and
we'll get:

   F*C - C*F*1E-6 - F*C*1E-6 .. F*C + C*F*1E-6 + F*C*1E-6 =>
      F*C*(1 - 2*1E-6) .. F*C*(1 + 2*1E-6)

Here, the error is doubling with every multiply. I believe you'll see
similar effects with "+" ("-" can generate much larger errors, and divide is
messy!)

On the other hand, with the static expression that calculates the bounds,
you will always end up with a model number (that is, a degenerate interval
of F*C*(1 - 1E-6) .. F*C*(1 - 1E-6)), so *of course* you're going to get
Constraint_Error from time-to-time. The only way to avoid that is to expand
the range enough to allow for the error from all of the operations that you
are going to do on the values -- and that is going to require extensive
analysis of the expressions that are involved.

> The problem is made more acute in the scenario of a large safety-related
> system where there is no obvious systematic way to determine which precise
> characteristics of an FP type would give rise to this effect.

I'd expect similar effects for *any* floating point use. Why would you
expect to be able to use floating point in a safety-critical system without
doing error analysis? That is a fool's game, as it is trivial to make a
mistake (such as subtracting two nearly equal numbers) which would turn your
entire calculation into meaningless numbers.

> Various workarounds were proposed: ...
I'd suggest two more workarounds:

(f) Use fixed point instead; the error analysis is much easier and you'd
probably get the results you want; or

(g) Do the error analysis for the *entire* system! That's probably easier if
you have tools support. Noting your e-mail address, it seems that that
should be the better solution. ;-) Once you have that, use the 'Adjacent
attribute to ensure that you are getting model numbers that expand the range
as needed.

----

In particular, because of the issues I noted previously, I don't believe
that there is anything that the language can do to help here. You need to be
able to "push out" the bounds enough to encompass any possible calculated
value, and the amount that needs to be done depends completely on the
expressions involved. I can't imagine a generalized way to handle that
(perhaps that is a lack of imagination on my part, but I'd like a clear
description of what can be done in order to change my mind).

I could imagine an attribute that would make it easier to define the bounds
by specifying how many model intervals past a static expression you need to
go; it would work much like 'Adjacent except that it would additionally take
an error expression. But I suspect that could be done just as well with the
existing attributes.

The need to convert both ways makes the problem completely impossible in my
view: the expressions on both sides of the conversions will require wider
bounds, and you'll end up in an infinite regress until you have no bounds at
all (which you've said that you are unwilling to do). Perhaps you've left
something out here, but I think your methodology is seriously flawed.

****************************************************************

From: Randy Brukardt
Sent: Monday, November 26, 2007 10:15 PM

One more quick comment:

> ... but the result is beyond the bound in FP machine representation terms
> due to rounding in the contribution of the "Ada-insignificant" mantissa
> bits.
It is not necessarily "Ada-insignificant" bits; nothing in the language
requires two similar operations to get the same model number when an
interval is the specified result. In particular, the conversion of
Float(67000.0*0.3048) to a model number (of which there are two
possibilities) might provide a different result than
Float(67000.0)*Float(0.3048), even if you assume that 67000.0 is in fact a
model number (giving the same two model numbers as possible results). The
same is true for two independent multiplies of the same values. (Of course,
the *reason* for such differences might be "Ada-insignificant" bits, but
that has nothing to do with the language model.)

It might be possible to make this work in very limited circumstances (say
one "+" or "*" operation where one value was a model number), but I don't
think it would be appropriate to add features to the language that can only
solve the problems of a handful of users. (Individual vendors might feel
differently, of course.)

****************************************************************

From: Jeffrey R. Carter
Sent: Monday, November 26, 2007 10:54 PM

> package P is
>    type Feet is range 0 .. 67000;
>    Feet_To_Metres : constant := 0.3048;
>    Max_Metres : constant := Float(Feet'Last) * Feet_To_Metres;

Note that this is an expression of type Float, not a universal expression.

>    type Metres is digits 6 range 0.0 .. Max_Metres;
>    M : Metres;
>    function Conv (F : Feet) return Metres;
> end P;
>
> package body P is
>    function Conv (F : Feet) return Metres is
>    begin
>       return Metres (Float (F) * Feet_To_Metres);
>    end Conv;
> end P;

I think you should not be using Float at all in this code. I would have done
something like:

   Max_Feet : constant := 67_000;

   type Feet is range 0 .. Max_Feet;

   Metres_Per_Foot : constant := 0.3048;
   Max_Metres : constant := Metres_Per_Foot * Max_Feet; -- ARM 4.5.5

   type Metres is digits 6 range 0.0 ..
Max_Metres;

Finally, Conv would be built around

   Metres_Per_Foot * Metres'Base (F)

perhaps with some range checking on this value to avoid the exception.
However, I don't know if this would fix the problem without adding some
range checking to Conv.

****************************************************************

From: Adam Beneschan
Sent: Tuesday, November 27, 2007 9:40 AM

> It is not necessarily "Ada-insignificant" bits; nothing in the language
> requires two similar operations to get the same model number when an
> interval is the specified result. In particular, the conversion of
> Float(67000.0*0.3048) to a model number (of which there are two
> possibilities) might provide a different result than
> Float(67000.0)*Float(0.3048), even if you assume that 67000.0 is in fact
> a model number (giving the same two model numbers as possible
> results).

I agree completely; however, looking at Brian's code, I didn't see anywhere
where Float(67000.0*0.3048) should have been computed at all, the way his
example was written. It seemed to me that if the compiler was doing this, in
any fashion, the compiler was erroneous. Maybe there's something in the
language that would allow Float(67000.0*0.3048) to be computed, but I doubt
it.

A more interesting issue that might be a language issue is: Since the
language defines the result of Float(67000.0)*Float(0.3048) in terms of an
interval, and (I think) specifies that any machine number in that interval
is an acceptable result, is it required that a computation using those
values return the exact same machine number every time? Specifically, if
the compiler is run on a different processor than the executable will be run
on, and the compiler can determine statically that a certain value will be
Float(67000.0)*Float(0.3048) and can therefore compute the result itself, is
it required to compute the exact same machine number that the target
processor would compute if it performs the same calculation?
I suspect the answer is "No" --- is this correct? Should it be? This issue
could have an impact on Brian's code, although it's probably moot since I
agree that the code needs to be more error-tolerant anyway.

****************************************************************

From: Pascal Leroy
Sent: Tuesday, November 27, 2007 12:29 PM

> The only way to avoid that is to expand the range enough to allow for the
> error from all of the operations that you are going to do on the values --
> and that is going to require extensive analysis of the expressions that
> are involved.

> I'd expect similar effects for *any* floating point use. Why would you
> expect to be able to use floating point in a safety-critical system
> without doing error analysis? That is a fool's game, as it is trivial to
> make a mistake (such as subtracting two nearly equal numbers) which would
> turn your entire calculation into meaningless numbers.

Randy is right of course.

I believe that in the example at hand, nudging the upper bound a bit using
Float'Succ(Max_Metres) would be sufficient to eliminate the C_E (I didn't
try, though). The reason is that the IEEE multiplication might end up
rounding to the machine number above the mathematical value, and that number
is precisely what is returned by Float'Succ.

However this is not a satisfactory solution, because if you happen to
perform one extra floating-point operation the result might still move above
the upper bound. And as you point out if you convert back and forth between
Degrees and Radians you may have to widen the range substantially.

The C_E is actually very helpful here: it tells you that you might get
values that exceed the bounds that you expected, and that you must do
something about it. It might be tempting to add checks that the values that
you are dealing with don't exceed the range of the subtype, but, ignoring
the fact that the performance would be bad, this would be a *horrible*
solution.
The root of the problem is that the error intervals are growing as you do
floating-point calculations, and that happens everywhere, not only close to
the bounds. For instance if you keep converting the same number back and
forth between Degrees and Radians you might end up with a totally bogus
value (although in practice the unbiased rounding mode of IEEE will keep the
error relatively small).

The bottom line is: there is no way that you can meaningfully pick the
bounds of your subtype without a solid error analysis. I must say btw that I
am quite shocked that anyone would try to program safety-critical systems
without a minimum of understanding of numerics.

****************************************************************

From: Tucker Taft
Sent: Tuesday, November 27, 2007 3:14 PM

With the back-and-forth conversion problem, it would seem you might need
multiple types, one for values that arise naturally in a given unit, and one
that is the result of converting from some other unit. E.g.:

   package P is
      type Feet is range 0 .. 67000;
      subtype Feet_From_Metres is
        Feet'Base range 0 .. Feet'Succ(Feet'Last);

      Feet_To_Metres : constant := 0.3048;
      Max_Metres : constant := Float(Feet'Last) * Feet_To_Metres;

      type Metres is digits 6 range 0.0 .. Max_Metres;
      subtype Metres_From_Feet is
        Metres'Base range 0.0 .. Metres'Succ(Metres'Last);

      M : Metres;

      function Conv (F : Feet) return Metres_From_Feet;
      function Conv (M : Metres) return Feet_From_Metres;
   end P;

After conversion, you would need to check whether the result is in the
smaller subrange, and if not, decide how to handle this borderline case.

I can't quite imagine how the language could be altered to make this tricky
situation any easier to handle.
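[The effect underlying the whole thread -- the stored constant is not
exactly 0.3048, so a run-time product need not land on the exact
mathematical value that the static bound was rounded from -- can be
observed directly. The following Python illustration is not from the
thread; it simulates Ada's 32-bit Float by rounding through IEEE single
precision with struct.]

```python
import struct
from fractions import Fraction

def to_f32(x: float) -> float:
    # Round a double to the nearest IEEE single-precision machine number.
    return struct.unpack('f', struct.pack('f', x))[0]

exact = Fraction(3048, 10000) * 67000   # mathematically exact: 20421.6
machine_const = to_f32(0.3048)          # what a 32-bit Float actually stores

# One rounded multiply, as Conv would perform it at run time:
product = to_f32(machine_const * 67000.0)

# The stored constant is not exactly 0.3048, so the run-time product is
# not exactly 20421.6 either; whether it still rounds to the same machine
# number as the statically computed bound depends on the arithmetic used,
# which is precisely the implementation variation reported.
print(machine_const, product, float(exact))
```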
****************************************************************

From: Pascal Leroy
Sent: Tuesday, November 27, 2007 3:57 PM

> With the back-and-forth conversion problem, it would seem
> you might need multiple types, one for values that
> arise naturally in a given unit, and one that is
> the result of converting from some other unit.
>
> ...
>
> After conversion, you would need to check whether the result
> is in the smaller subrange, and if not, decide how to
> handle this borderline case.

But this is sidestepping the issue. The problem is *not* what to do with the
borderline case.

First, note that it's unclear what your options are for handling the
borderline value. The only sensible choices seem to be (1) to raise an
exception, but then the original subtype did just that; or (2) to return
Metres'Last, but then this can be done by having the feet-to-metres
conversion function return Metres'Min (Metres'Last, Metres'Base (F *
Feet_To_Metres)). So it's not clear what an extra subtype buys you.

The real problem is that, if you convert back-and-forth, the error interval
will grow. That will happen for all values, not only close to the bounds. If
you don't have an unbiased rounding, the values are actually likely to drift
and to contain a lot of noise in the least significant bits of the mantissa.
So rather than focusing on what to do near the bounds, it would seem more
important to wonder whether these values are still meaningful after a bunch
of conversions have taken place.

Oh, and btw, this is definitely not a language problem. This is a problem
with the user's understanding of floating-point arithmetic, and there is not
much that Ada can do about that.

****************************************************************

From: Tucker Taft
Sent: Tuesday, November 27, 2007 7:01 PM

> The problem is *not* what to do with the borderline case. First, note
> that it's unclear what your options are for handling the borderline
> value.
> The only sensible choices seem to be (1) to raise an exception,
> but then the original subtype did just that; or (2) to return
> Metres'Last, but then this can be done by having the feet-to-metres
> conversion function return Metres'Min (Metres'Last, Metres'Base (F *
> Feet_To_Metres)). So it's not clear what an extra subtype buys you.

Using 'Base directly in the 'Min seems like a straightforward way to
truncate toward zero, so I agree the extra subtype probably doesn't buy you
much. I guess where I was headed was that it seemed weird to have values
that are converted in one direction and then the other, so it might be
better to think of converted values as being of not just a different
subtype, but actually a different type. Values that are the result of
conversion would be used sparingly, and not something that should ever be
converted back to their original type.

> The real problem is that, if you convert back-and-forth, the error
> interval will grow. That will happen for all values, not only close to
> the bounds. If you don't have an unbiased rounding, the values are
> actually likely to drift and to contain a lot of noise in the least
> significant bits of the mantissa. So rather than focusing on what to do
> near the bounds, it would seem more important to wonder whether these
> values are still meaningful after a bunch of conversions have taken place.

I think we are on the same wavelength, namely, that you really shouldn't
very often have values that get converted one way and then back the other.

> Oh, and btw, this is definitely not a language problem. This is a
> problem with the user's understanding of floating-point arithmetic, and
> there is not much that Ada can do about that.

It was never clear to me how this could be fixed in the language. It seems
like your use of 'Min and 'Base is a nice way to avoid drifting past the
upper bound. Using 'Machine or 'Model or 'Succ doesn't seem as clean as the
simple approach you suggest.
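[The back-and-forth conversion concern can be tried empirically; this quick
Python experiment -- not from the thread, double precision, hypothetical
constants -- repeats a degrees/radians round trip and measures the drift.
Every operation rounds, but round-to-nearest-even keeps the accumulated
error small.]

```python
import math

DEG_TO_RAD = math.pi / 180.0
RAD_TO_DEG = 180.0 / math.pi   # note: not the exact reciprocal of DEG_TO_RAD

x = 45.0
for _ in range(10_000):
    x = (x * DEG_TO_RAD) * RAD_TO_DEG   # degrees -> radians -> degrees

drift = abs(x - 45.0)
print(drift)  # stays tiny here because IEEE rounding is unbiased
```

With a biased rounding mode, or with each intermediate value stored through
a narrower type, the value would be more likely to walk past a tight
subtype bound.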
****************************************************************

From: Geert Bosch
Sent: Tuesday, November 27, 2007 5:57 PM

> But in this declaration:
>
>    Max_Metres : constant := Float(Feet'Last) * Feet_To_Metres;
>
> the multiplication is the multiplication of two Float values.  The
> right operand, Feet_To_Metres, will be converted to a Float; thus, it
> will not be exactly 0.3048 but will be slightly less or more than
> that.

I don't see any support for the reasoning that the conversion to float will
round to a model number or machine number. In effect, 4.9(38) essentially
states the opposite, namely that "for a real static expression that is not
part of a larger static expression", the value is rounded to the nearest
machine number. Together with the fact that no other clause gives permission
to round, and that paragraph 4.9(33) states evaluation is performed exactly,
I don't see any ambiguity.

If the user wants to round a static expression to a machine number,
Float'Machine (expression) should be used.

****************************************************************

From: Adam Beneschan
Sent: Tuesday, November 27, 2007 6:45 PM

Drat, it looks like you're right. I forgot about that. The "*" in the static
expressions looks like it's supposed to be the predefined "*" that takes two
Floats and returns a Float, but I guess in this context, it isn't (because
of 4.9) --- it's a different operation that takes two mathematically exact
real numbers and returns an exact real number. I had forgotten that this
section essentially overrides the normal meaning of "*". That negates some
of what I said in my earlier posts, and I'll have to take back what I said
about a possible implementation error.

In looking into this, I looked at the wording of 4.9 carefully, and there
seem to be some interesting ramifications I wasn't aware of before.
4.9 basically says that a static expression is evaluated at compile time,
without any restrictions on what context the expression occurs in (except
for one involving short circuits). So say you have code that looks like
this:

   Const1 : constant Float := 0.2222222222;
   Const2 : constant Float := 0.3333333333;
   X1, X2, X3 : Float;
   ...
   X2 := Const1 * Const2 * X1;
   X3 := (Const1 * Const2) * X1;

From what I can tell, in the assignment to X2, the only expression involved
is Const1 * Const2 * X1, and that is not a static expression since it
doesn't fit any of the categories listed in 4.9. In this example, looking at
the syntax of <expression> in 4.4, this <expression> is a <relation>, which
is a <simple_expression>, which is a <term>, which is a <factor> followed by
a <multiplying_operator> followed by another <factor> followed by another
<multiplying_operator> followed by another <factor>. There isn't anything in
the syntax that would allow us to consider "Const1 * Const2" to be an
<expression>. Therefore, "Const1 * Const2" can't be a static expression
since it isn't an expression; thus this multiplication should not be
performed exactly, but rather, Const1 and Const2 should be converted to
Float and then the predefined multiplication operation on Floats (the one
that uses two machine numbers) is performed, assuming the implementation
doesn't rearrange the order as permitted by 4.5(13).

But in the assignment to X3, Const1 * Const2 is an expression, since, in
4.4(7), the syntax indicates that the value in parentheses is,
syntactically, an <expression>. Therefore, Const1 * Const2 is a static
expression in this case, and the compiler needs to compute it exactly.

The result, I suspect, is that X2 and X3 could be different. (I don't know
whether it's mathematically possible with these specific numbers; it may
require a more complex example to produce a case where the result could be
different.)
Actually, now that I think about it, there seems to be a slight hole in the
language definition. Superficially, if C1, C2, and C3 are all constants of
type Integer, one might think that these are static expressions that the
compiler is required to evaluate at compile time:

   X : constant Integer := C1 + C2 + C3;
   Y : constant Integer := C1 * C2 + C3;

But I'm not so sure. Actually, I think the second expression is a static
expression, but the first is not; in the second case, the expression is a
<term> <binary_adding_operator> <term>, and the first <term> is a <factor>,
and 4.4(1) says that an "expression" refers to any of the syntactic
categories listed below, including <factor>. But the first expression is a
function call to "+" whose first operand is C1 + C2 and whose second operand
is C3; and there's no syntactic category listed in 4.4(2-7) that covers the
tokens "C1 + C2", and therefore this is not an expression and therefore not
a static expression, and therefore 4.9(6) doesn't apply to the entire
expression since the left operand of "+" is not a static expression.

Not that I expect anyone to spend any time trying to fix the language
definition for this.

****************************************************************

From: Gary Dismukes
Sent: Tuesday, November 27, 2007 7:17 PM

> Actually, now that I think about it, there seems to be a slight hole
> in the language definition. Superficially, if C1, C2, and C3 are all
> constants of type Integer, one might think that these are static
> expressions that the compiler is required to evaluate at compile time:
>
>    X : constant Integer := C1 + C2 + C3;
>    Y : constant Integer := C1 * C2 + C3;
>
> But I'm not so sure.
> Actually, I think the second
> expression is a static expression, but the first is not; in the second
> case, the expression is a <term> <binary_adding_operator> <term>, and
> the first <term> is a <factor>, and 4.4(1) says that an "expression"
> refers to any of the syntactic categories listed below, including
> <factor>. But the first expression is a function call to "+" whose
> first operand is C1 + C2 and whose second operand is C3; and there's no
> syntactic category listed in 4.4(2-7) that covers the tokens "C1 + C2",
> and therefore this is not an expression and therefore not a static
> expression, and therefore 4.9(6) doesn't apply to the entire
> expression since the left operand of "+" is not a static expression.

Not so fast. You seem to be assuming that a "static expression" is an
"expression", but "static expression" is defined by 4.9(2-13)! (Remember,
Ada's a language where a generic package is not a package. ;-)

One of the things that's categorized as a static expression is a function
call where the denoted function is a static function (which includes
predefined ops of course), and it doesn't matter whether that occurs as a
primary, factor, or whatever, it's still defined to be a static expression.
(This is assuming the standard equivalence with infix form of course.)

> Not that I expect anyone to spend any time trying to fix the language
> definition for this.

Fortunately it doesn't look like that will be necessary in this case. :-)

****************************************************************

From: Tucker Taft
Sent: Tuesday, November 27, 2007 7:20 PM

> ...
>
> Drat, it looks like you're right. I forgot about that. The "*" in
> the static expressions looks like it's supposed to be the predefined
> "*" that takes two Floats and returns a Float, but I guess in this
> context, it isn't (because of 4.9) --- it's a different operation that
> takes two mathematically exact real numbers and returns an exact real
> number.
> I had forgotten that this section essentially overrides the
> normal meaning of "*".

There is nothing that precludes the "normal" "*" from using infinite
precision at run time. In fact, the only time you are *required* to produce
a machine number is when crossing from the static domain to the non-static
domain, and when the programmer explicitly applies a 'Model or 'Machine
attribute. Ada permits the preservation of extra precision pretty much
everywhere else.

In Ada 2005 we made a change that *permitted* the compiler to produce the
"correct" machine number when crossing from the static domain to the
non-static domain, whereas Ada 95 sometimes required it to produce the
"incorrect" value with respect to the rounding mode.

> ... That negates some of what I said in my earlier
> posts, and I'll have to take back what I said about a possible
> implementation error.
>
> In looking into this, I looked at the wording of 4.9 carefully, and
> there seem to be some interesting ramifications I wasn't aware of
> before. 4.9 basically says that a static expression is evaluated at
> compile time, without any restrictions on what context the expression
> occurs in (except for one involving short circuits). So say you have
> code that looks like this:
>
>    Const1 : constant Float := 0.2222222222;
>    Const2 : constant Float := 0.3333333333;
>    X1, X2, X3 : Float;
>    ...
>    X2 := Const1 * Const2 * X1;
>    X3 := (Const1 * Const2) * X1;
>
> From what I can tell, in the assignment to X2, the only expression
> involved is Const1 * Const2 * X1, and that is not a static expression
> since it doesn't fit any of the categories listed in 4.9. In this
> example, looking at the syntax of <expression> in 4.4, this
> <expression> is a <relation>, which is a <simple_expression>, which is
> a <term>, which is a <factor> followed by a <multiplying_operator>
> followed by another <factor> followed by another
> <multiplying_operator> followed by another <factor>.
> There isn't anything in the syntax that would allow us to consider
> "Const1 * Const2" to be an <expression>. Therefore, "Const1 * Const2"
> can't be a static expression since it isn't an expression;

Now that is the first time I have *ever* seen that interpretation of
the meaning of "expression." I'll admit that technically you are right
due to the funky way the syntax rules are written and 4.4(1) is worded,
but I think even if you ask 100 Ada language lawyers (presuming you
could find that many), 99.5 of them would say "Const1 * Const2" is a
"subexpression" of "Const1 * Const2 * X1" and all of the rules about
static evaluation apply to it. 4.5(8) makes it clear that the operators
are associated left to right. Furthermore, 6.6(2) says that each use of
an operator is equivalent to a function call, and clearly a function
call is an "expression." So to preserve the equivalence between:

    Const1 * Const2 * X1

and

    "*"("*"(Const1, Const2), X1)

clearly "Const1 * Const2" (or equivalently "*"(Const1, Const2)) must be
evaluated statically.

> ... thus this multiplication should not be performed exactly, but
> rather, Const1 and Const2 should be converted to Float and then the
> predefined multiplication operation on Floats (the one that uses two
> machine numbers) is performed, assuming the implementation doesn't
> rearrange the order as permitted by 4.5(13).
>
> But in the assignment to X3, Const1 * Const2 is an expression, since,
> in 4.4(7), the syntax indicates that the value in parentheses is,
> syntactically, an <expression>. Therefore, Const1 * Const2 is a
> static expression in this case, and the compiler needs to compute it
> exactly.
>
> The result, I suspect, is that X2 and X3 could be different. (I don't
> know whether it's mathematically possible with these specific numbers;
> it may require a more complex example to produce a case where the
> result could be different.)
I think you are barking up a very strange tree with the above argument,
and I can't imagine any compiler implementation following this logic.

> Actually, now that I think about it, there seems to be a slight hole
> in the language definition. Superficially, if C1, C2, and C3 are all
> constants of type Integer, one might think that these are static
> expressions that the compiler is required to evaluate at compile time:
>
>     X : constant Integer := C1 + C2 + C3;
>     Y : constant Integer := C1 * C2 + C3;
>
> But I'm not so sure. Actually, I think the second expression is a
> static expression, but the first is not; in the second case, the
> expression is a <term> <binary_adding_operator> <term>, and the first
> <term> is a <factor>, and 4.4(1) says that an "expression" refers to
> any of the syntactic categories listed below, including <factor>. But
> the first expression is a function call to "+" whose first operand is
> C1 + C2 and whose second operand is C3; and there's no syntactic
> category listed in 4.4(2-7) that covers the tokens "C1 + C2", and
> therefore this is not an expression and therefore not a static
> expression, and therefore 4.9(6) doesn't apply to the entire
> expression since the left operand of "+" is not a static expression.

Again, that would break the equivalence with function notation, and so
can't possibly be "right" (following the famous Robert's rule that
states: "any attempt to interpret the reference manual in a way that
leads to a silly conclusion is patently wrong").

> Not that I expect anyone to spend any time trying to fix the language
> definition for this.

And please don't make your compiler follow the above tortured logic.
Trust me on this one: in C1 * C2 * X1, "C1 * C2" is an expression for
the purposes of static evaluation and pretty much any other rule in the
language, notwithstanding what 4.4(1) says.
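[Editor's illustration, not part of the thread.] The X2/X3 distinction
discussed above can be made concrete. Below is a rough Python sketch of
the two evaluation routes, where IEEE single precision stands in for
type Float, Fraction models exact static arithmetic, and the helper
name to_f32 is purely illustrative:

```python
import struct
from fractions import Fraction

def to_f32(x):
    """Round a value to the nearest IEEE single-precision number (modelling Float)."""
    return struct.unpack('f', struct.pack('f', float(x)))[0]

C1 = Fraction('0.2222222222')
C2 = Fraction('0.3333333333')

# X3's route: (Const1 * Const2) is a static expression, computed exactly
# and rounded only once, when crossing into the non-static domain.
static_route = to_f32(C1 * C2)

# X2's route (under the "not static" reading): each constant is first
# rounded to a machine number, then the predefined Float "*" rounds again.
runtime_route = to_f32(to_f32(C1) * to_f32(C2))

# The two routes agree to Float's precision, but can differ in the last bit.
assert abs(static_route - runtime_route) <= 1e-6 * static_route
```

Whether the last bit actually differs for these particular constants
depends on the rounding of the intermediate values; the sketch only
shows that the two computations are structurally different.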
****************************************************************

From: Adam Beneschan
Sent: Tuesday, November 27, 2007  7:23 PM

> Not so fast. You seem to be assuming that a "static expression" is
> an "expression", but "static expression" is defined by 4.9(2-13)!
> (Remember, Ada's a language where a generic package is not a package;-)

4.9(2) says that a static expression must, by definition, be an
expression. (It actually says "scalar or string expression", but
there's no separate definition for "scalar expression" or "string
expression", so the only possible meaning is an "expression" whose type
is a scalar type or string type respectively.) So if it's not an
expression, it can't be a static expression, even if it happens to be a
static function call whose operands are static expressions. Sorry.

****************************************************************

From: Adam Beneschan
Sent: Tuesday, November 27, 2007  7:37 PM

> > Not that I expect anyone to spend any time trying to fix the
> > language definition for this.
>
> And please don't make your compiler follow the above tortured
> logic. Trust me on this one: in C1 * C2 * X1, "C1 * C2" is
> an expression for the purposes of static evaluation and pretty
> much any other rule in the language, notwithstanding what
> 4.4(1) says.

Believe me, I won't. I definitely have better things to do with my
time, although spending time composing a post on how there's a hole in
the language was probably not one of them. I certainly agree with the
"sensible" interpretation---I just thought it was interesting in a
perverse way.

I presume that in C1 * C2 * X1, however, "C2 * X1" is not an expression
for this purpose? Thus, in a case like:

    C1 : constant Integer := Integer'Last;
    C2 : constant Integer := Integer'First;
    X, Y : Integer;

    X := 5;
    Y := X + C1 + C2;

the last statement *could* raise Constraint_Error in a correct
implementation (although it's not required to).
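[Editor's illustration, not part of the thread.] The point above, that
left-to-right association makes the intermediate (X + C1) overflow even
though the full sum is in range, can be sketched with a hypothetical
range-checked model in Python, where add32 and OverflowError stand in
for Ada's Constraint_Error check on a 32-bit Integer:

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def add32(a, b):
    """Addition with an Ada-style range check on a 32-bit Integer."""
    r = a + b
    if not INT32_MIN <= r <= INT32_MAX:
        raise OverflowError("Constraint_Error")  # models Ada's run-time check
    return r

C1, C2, X = INT32_MAX, INT32_MIN, 5

# Left-to-right association: (X + C1) + C2 -- the intermediate overflows.
try:
    add32(add32(X, C1), C2)
    raised = False
except OverflowError:
    raised = True

assert raised                          # (X + C1) exceeds Integer'Last
assert add32(X, add32(C1, C2)) == 4    # the reordered form stays in range
```

This mirrors the permission discussed next: an implementation may
reorder only when the results are unaffected, so it is not required to
rescue the left-to-right form from overflow.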
****************************************************************

From: Tucker Taft
Sent: Tuesday, November 27, 2007  8:34 PM

> I presume that in C1 * C2 * X1, however, "C2 * X1" is not an
> expression for this purpose?

The language defines left-to-right association for an unparenthesized
sequence of operators at the same precedence level, though 4.5(13)
allows such sequences without parentheses to be reordered if the
results are unaffected (ignoring the possibility of overflow). Since in
this case, C1 * C2 must be evaluated exactly, it would be pretty hard
to take advantage of this permission in this case. More generally, as
4.5(13.a) says, this permission isn't much use for floating point
unless you do enough analysis to guarantee that the reordered result is
no less precise than that required for the original order. In any case,
for the purpose of static evaluation, in C1 * C2 * X1, it is
interpreted as (C1 * C2) * X1.

> ... Thus, in a case like:
>
>     C1 : constant Integer := Integer'Last;
>     C2 : constant Integer := Integer'First;
>     X, Y : Integer;
>
>     X := 5;
>     Y := X + C1 + C2;
>
> the last statement *could* raise Constraint_Error in a correct
> implementation (although it's not required to).

Yes, there is no requirement to reorder this to X + (C1 + C2), and
clearly (X + C1) is likely to raise Constraint_Error.

****************************************************************

From: Pascal Leroy
Sent: Wednesday, November 28, 2007  2:40 AM

> I don't see any support for the reasoning that the conversion to
> float will round to a model number or machine number. In effect,
> 4.9(38) essentially states the opposite, namely that "for a real
> static expression that is not part of a larger static expression",
> the value is rounded to the nearest machine number.

But 4.6(32) says that "the result is within the accuracy of the target
type", and I believe that this applies to static as well as dynamic
evaluation.
I guess it says that the result of the conversion can be any number not
too far from the argument, and therefore that returning the argument
would be a legal choice.

I have always thought that it would be sensible for a static conversion
to have an effect similar to that of a dynamic conversion, but I can't
find support for this in the RM at the moment.

****************************************************************

From: Geert Bosch
Sent: Wednesday, November 28, 2007 10:03 PM

> But 4.6(32) says that "the result is within the accuracy of the
> target type", and I believe that this applies to static as well as
> dynamic evaluation.

That still doesn't contradict anything I said, as RM 3.5.7(8) states:

    The set of values for a floating point type is the (infinite) set
    of rational numbers.

****************************************************************
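[Editor's illustration, not part of the thread.] The example that
started this thread can be checked with exact rational arithmetic. In
the Python sketch below (illustrative only; IEEE single precision
stands in for the digits-6 Float, and to_f32 is a made-up helper), the
nearest machine number to 0.3048 lies above it, so the exact product of
the machine operands lands beyond the mathematically exact static bound
20421.6:

```python
import struct
from fractions import Fraction

def to_f32(x):
    """Round to the nearest IEEE single-precision value (a digits-6 Float)."""
    return struct.unpack('f', struct.pack('f', x))[0]

feet_to_metres = to_f32(0.3048)        # machine number, slightly above 0.3048
exact_bound    = Fraction('20421.6')   # static Max_Metres, evaluated exactly

# Infinite-precision product of the machine operands, as a rational number:
exact_product = 67000 * Fraction(feet_to_metres)

assert feet_to_metres > 0.3048         # the representation rounded upward
assert exact_product > exact_bound     # the result lies beyond the type bound
```

Whether Constraint_Error is actually raised then depends on how much
precision the implementation carries through the multiplication and the
range check, which is consistent with only some implementations raising
it for the original example.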
