!standard 13.09.01 (12)                               03-05-03  AI95-00167/04
!class binding interpretation 98-03-18
!status Amendment 200Y 03-02-19
!status ARG Approved 6-2-0  03-02-09
!status work item 98-03-18
!status received 96-11-16
!priority Low
!difficulty Hard
!subject Scalar unchecked conversion can be used with 'Valid to prevent erroneousness

!summary

Assigning the result of a call on Unchecked_Conversion or an imported
function to an object and immediately testing it with 'Valid is not
erroneous.

!question

13.9.2(1) says, "The Valid attribute can be used to check the validity of
data produced by unchecked conversion, input, interface to foreign
languages, and the like," but 13.9.1(12) says, "A call to an imported
function or an instance of Unchecked_Conversion is erroneous if the result
is scalar, and the result object has an invalid representation."

How can the Valid attribute be used to check the validity of an
unchecked-conversion result without rendering execution erroneous in the
case that the result is invalid?

!recommendation

If a call to an imported function or an instance of Unchecked_Conversion
returns a scalar object that has an invalid representation, paragraph
13.9.1(12) makes any use of such a value erroneous. This paragraph should
have made a special case for a use as the right-hand side of an assignment
or the prefix of 'Valid. Otherwise, there is no way to make use of the
'Valid attribute to check the validity of the result, as implied by
13.9.2(1).

!wording

Replace 13.9.1(12) by:

   A call to an imported function or an instance of Unchecked_Conversion is
   erroneous if the result is scalar, the result object has an invalid
   representation, and the result is used other than as the expression of an
   assignment_statement or an object_declaration, or as the prefix of a
   Valid attribute.
   If such a result object is used as the source of an assignment, and the
   assigned value is an invalid representation for the target of the
   assignment, then any use of the target object prior to a further
   assignment to the target object, other than as the prefix of a Valid
   attribute reference, is erroneous.

!discussion

Implementations that already provide the desired behavior (i.e.,
disregarding the fact that execution is formally erroneous in the case of an
invalid unchecked-conversion result, and performing a meaningful validity
test for the Valid attribute) need not change to conform to this
interpretation.

Consider the following declarations:

   type Setting is (Off, Low, Medium, High);
   for Setting use (2#000#, 2#001#, 2#010#, 2#100#);
   for Setting'Size use 3;

   type Three_Bits is array (0 .. 2) of Boolean;
   for Three_Bits'Component_Size use 1;
   for Three_Bits'Size use 3;

   function Bits_To_Setting is
      new Ada.Unchecked_Conversion (Three_Bits, Setting);

   Raw_Input           : Three_Bits;
   Unvalidated_Setting : Setting;
   Validated_Setting   : Setting;
   Input_Error         : exception;

According to 13.9.1(12), execution of

   Unvalidated_Setting := Bits_To_Setting (Raw_Input);
   if Unvalidated_Setting'Valid then
      Validated_Setting := Unvalidated_Setting;
   else
      raise Input_Error;
   end if;

is erroneous if Raw_Input does not contain one of the four bit patterns that
are valid representations of Setting values. Execution is rendered erroneous
by the function call in the first assignment statement. Even though an
implementation is likely in practice to behave as expected (raising
Input_Error), execution is, formally, unpredictable from this point on. In
theory, it is permissible to generate code for an attribute X'Valid, where X
is known to be the result of an unchecked conversion, that always yields
True (since the only case in which the attribute would yield False is the
case in which execution is erroneous, and any behavior is permissible in
that case)!
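As a purely illustrative, non-normative sketch (not part of this AI), the
validity test that an implementation might generate for the sparse Setting
type above can be modeled in C; all names below are hypothetical. The
unchecked conversion is a raw reinterpretation of the source bits, and
'Valid is a membership test against the type's table of valid
representations.

```c
#include <stddef.h>

/* Hypothetical C model of the Ada example above. Setting's valid
   representations are 2#000#, 2#001#, 2#010#, and 2#100#, so the
   3-bit patterns 3, 5, 6, and 7 represent no Setting value. */
static const unsigned valid_reps[] = { 0u, 1u, 2u, 4u };

/* Model of Bits_To_Setting: an unchecked conversion merely
   reinterprets the source bits; no range check is performed. */
unsigned bits_to_setting(unsigned raw_bits)
{
    return raw_bits & 0x7u;   /* keep the low three bits */
}

/* Model of Unvalidated_Setting'Valid: search the table of valid
   representations (for a large sparse enumeration a compiler might
   generate a binary search instead of this linear scan). */
int setting_valid(unsigned rep)
{
    for (size_t i = 0; i < sizeof valid_reps / sizeof valid_reps[0]; i++)
        if (valid_reps[i] == rep)
            return 1;
    return 0;
}
```

In this model the example's if-statement corresponds to
setting_valid(bits_to_setting(raw_input)); the point of the AI is that, in
Ada as 13.9.1(12) stands, execution is formally erroneous before the test is
ever reached.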
13.9(10) and 13.9(11) stipulate that if the representation of the actual
parameter of an unchecked conversion is not "the representation of an object
of the target subtype," then "the effect is implementation-defined; in
particular the result can be abnormal." In the case of a scalar target type,
assuming that the unchecked conversion produces an object with the same bit
pattern as the actual parameter, the result will be invalid, as defined in
13.9.1(2) ("the object's representation does not represent any value of the
object's subtype"). It is the intent of the Standard that a scalar
unchecked-conversion result holding an invalid representation is not
abnormal.

Abnormality is a graver condition than invalidity. By 13.9.1(9), it is a
bounded error to "evaluate the value" of an object with an invalid
representation. This bounded error may result in an exception or in the use
of the invalid representation value, but not in arbitrary behavior. In
contrast, an abnormal object is considered so seriously corrupted that it is
erroneous even to evaluate its name, even as the prefix of some enclosing
name. Such serious corruption can occur in some composite objects (for
example, dope vectors, discriminants, or internal offsets may be corrupted,
causing run-time checks themselves to misbehave). However, the only forms of
corrupt scalar data are:

 o a representation for an integer type, enumeration type, or floating
   point type object that is outside the range of the object's subtype

 o a representation for an enumeration type object that is not the
   representation of any value in the type

 o a representation for a floating point type object that is not the
   representation of any floating point value

It is feasible to check for each of these forms of corruption, and the
evaluation of the Valid attribute is expected to do so. (The check for an
invalid representation in an enumeration type with gaps may entail a binary
search of a table of valid representations.
The check for an invalid floating-point representation may entail loading a
value into a floating-point register or adding 0.0 to it, and responding to
a resulting hardware trap.) As explained by the note in 13.9.2(12),
evaluation of the attribute X'Valid does not entail "evaluating the value"
of X. Therefore, the Valid attribute of a scalar unchecked-conversion result
can always be evaluated without generating a bounded error.

We considered repealing 13.9.1(12) altogether. In this case, 13.9.1(9) would
apply, and use of the invalid value would cause a bounded error. This
solution was rejected because it would have imposed distributed overhead
costs on implementations. An implementation would no longer be able to
assume that an object, once initialized to a valid value, would stay valid.
This would prevent optimization of many constraint checks.

An alternative would be to effectively do a 'Valid check on the result of
every call to Unchecked_Conversion or an imported function, raising
Program_Error on failure. Such an implementation would both have extra
overhead and would effectively take control of data validity out of the
user's hands. (Users may have their own recovery mechanism for invalid
data.)

Therefore, we declare any use other than storing the invalid value and
testing its validity to be erroneous. It is necessary to limit storing of
invalid values to assignment_statements and object_declarations to prevent
distributed overhead. For instance, passing the invalid value as a parameter
is still erroneous; otherwise, static analysis could not assume valid values
in most cases.

It has been suggested that, since 13.9.1(12) applies only to scalars, a
programmer can avoid erroneous execution by having the unchecked conversion
return a one-element record containing the scalar:

   type Setting_Container is record
      Only_Component : Setting;
   end record;
   for Setting_Container use record
      Only_Component at 0 range 0 .. 2;
   end record;
   for Setting_Container'Size use 3;

   function Bits_To_Setting_Container is
      new Ada.Unchecked_Conversion (Three_Bits, Setting_Container);

   Unvalidated_Setting_Container : Setting_Container;
   ...

   Unvalidated_Setting_Container := Bits_To_Setting_Container (Raw_Input);
   if Unvalidated_Setting_Container.Only_Component'Valid then
      Validated_Setting := Unvalidated_Setting_Container.Only_Component;
   else
      raise Input_Error;
   end if;

However, by 13.9(11), Bits_To_Setting_Container is permitted to return an
abnormal object if Raw_Input does not contain the representation of a
Setting_Container value. Then evaluation of the call on
Bits_To_Setting_Container is erroneous by 13.9.1(8).

!corrigendum 13.9.1(12)

@drepl
A call to an imported function or an instance of Unchecked_Conversion is
erroneous if the result is scalar, and the result object has an invalid
representation.
@dby
A call to an imported function or an instance of Unchecked_Conversion is
erroneous if the result is scalar, the result object has an invalid
representation, and the result is used other than as the @fa<expression> of
an @fa<assignment_statement> or an @fa<object_declaration>, or as the prefix
of a Valid attribute. If such a result object is used as the source of an
assignment, and the assigned value is an invalid representation for the
target of the assignment, then any use of the target object prior to a
further assignment to the target object, other than as the prefix of a Valid
attribute reference, is erroneous.

!ACATS Test

An ACATS C-test could only check that the unchecked conversion and 'Valid
work; it could not verify the absence of erroneousness. This does not seem
valuable.

!appendix

!section 13.9.1(12)
!subject Erroneous scalar Unchecked_Conversion?
!reference RM95-13.9.1(12)
!from Keith Thompson 96-10-07
!reference 96-5719.a Keith Thompson 96-10-7>>
!discussion

RM95-13.9.1(12) says:

   A call to an imported function or an instance of Unchecked_Conversion is
   erroneous if the result is scalar, and the result object has an invalid
   representation.
This is followed in the AARM by several paragraphs recommending that
implementations should behave sensibly. The last sentence of 12.a says:

   We considered requiring such sensible behavior, but it resulted in too
   much arcane verbiage, and since implementations have little incentive to
   behave irrationally, such verbiage is not important to have.

Unfortunately, recommending that implementations behave sensibly is not
sufficient. The best policy for a user trying to write good portable Ada is
to avoid erroneous execution altogether, even if some specific
implementations happen to document the particular form of undefined behavior
that they implement.

It should be possible, for example, to use Unchecked_Conversion to convert
an integer value to a sparse enumeration type and apply the 'Valid attribute
to the result without invoking erroneous execution.

In any case, this paragraph should have been in 13.9, not 13.9.1.

****************************************************************

From the minutes of the April 1997 ARG meeting in Henley:

AI-167 Erroneous scalar Unchecked_Conversion?

There is a dilemma here -- the user can't check the validity of the
resulting value before the program is *defined* to be erroneous. Sparse
enumerations are a particular source of problems for this AI.

One obvious portable solution is to somehow *promptly* test the validity of
the resulting value before the user does anything else to the value. Making
this work seems too messy to define. A non-portable solution is to make the
situation implementation-defined rather than erroneous. Another non-portable
solution is to raise an exception when an invalid value is detected during
the conversion. This was rejected during the language design process.

The user can directly handle this problem by doing the unchecked conversion
by either:

 - Converting to an integer, then using a case statement to check for valid
   values

 - Wrapping the designated result in a record as the sole component.
The user can then perform a validity check of the component value. This is
possible because no component validity checking is performed when the
assignment is to a record type.

The group reached consensus that the only option is to confirm the language
on this issue and to expect the user to do the sensible thing to avoid this
problem.

****************************************************************

From the minutes of the November 1997 ARG meeting in St. Louis:

The group revisits the major points already discussed at the Henley meeting.
The discussion centered on a model that makes the conversion in question a
bounded error, possibly rendering (for conversion to a sparse enumeration)
the result a value with an invalid representation, eligible for testing by
'Valid.

Staying with erroneousness, there is no opportunity to do anything
reasonable, formally speaking. The sentiment is voiced that that's fine
because we know that, in reality, this conversion and follow-on 'Valid query
would work. The counter-argument is that users want more assurance. This AI
applies to Unchecked_Conversion and imports.

Bob recalls that the erroneous execution for this case was originally
selected for optimization reasons. It is certainly strange that unchecked
conversion of a wrapper record is the way around this erroneous behavior or
any related optimization.

John recommends that the most straightforward way out of this dilemma is to
simply state by fiat that 'Valid, when applied [immediately] after an
Unchecked_Conversion whose target is a sparse enumerated type, produces a
usable result and not an erroneous execution. Alternatively, Bob recommends
that 13.9.1(12) be changed to return a value with an invalid representation
(which can subsequently be checked with 'Valid applied to the object holding
this value).

Norm will produce the write-up of this AI.
****************************************************************

Editor's note: The priority of this AI was changed based on ARG discussion
in November 2000.

****************************************************************

From: Steve Baird
Date: Thursday, November 8, 2001  8:18 PM

Two points were made in the discussions of AI-167 at the 10/01 ARG meeting
in Minneapolis:

 1) A solution to the problem described in the AI is needed.

 2) Changing the definition of Ada.Unchecked_Conversion is not the right
    solution. The change that was being considered would have imposed a
    performance penalty on programs which use Unchecked_Conversion
    "correctly" (i.e., consistently with the existing language rules).

One can come close to solving the problem using the language as it stands
today, but the solution is both awkward and obscure. Consider the following
attempt:

   generic
      type Source (<>) is limited private;
      type Target is (<>);
   function My_Unchecked_Conversion (S : Source) return Target;

   with Ada.Unchecked_Conversion;
   function My_Unchecked_Conversion (S : Source) return Target is
      type Target_Record is record
         F : aliased Target;
      end record;
      pragma Pack (Target_Record);
      function Convert is
         new Ada.Unchecked_Conversion (Source, Target_Record);
      Result : Target renames Convert (S).F;
   begin
      if Result'Valid then
         return Result;
      else
         raise Program_Error;
      end if;
   end;

The one-field record skin is used to get around 13.9.1(12), and the function
result rename is used in order to meet the restrictions of 13.9.1(8).

Even ignoring the issues of awkwardness and obscurity, this solution is not
completely satisfactory. It depends on Target and Target_Record having the
same representation (this is the motivation for the Pack pragma and for
declaring the component to be aliased). Corresponding values of the two
types would typically have the same representation, but relying on this
assumption when trying to write portable code is undesirable.
To address these problems, I propose adding the following language-defined
generic functions:

   generic
      type Source (<>) is limited private;
      type Target is (<>);
   function Ada.Validated_Discrete_Conversion (S : Source) return Target;
   pragma Convention (Intrinsic, Ada.Validated_Discrete_Conversion);
   pragma Pure (Ada.Validated_Discrete_Conversion);

   generic
      type Source (<>) is limited private;
      type Target is digits <>;
   function Ada.Validated_Float_Conversion (S : Source) return Target;
   pragma Convention (Intrinsic, Ada.Validated_Float_Conversion);
   pragma Pure (Ada.Validated_Float_Conversion);

   generic
      type Source (<>) is limited private;
      type Target is delta <>;
   function Ada.Validated_Fixed_Conversion (S : Source) return Target;
   pragma Convention (Intrinsic, Ada.Validated_Fixed_Conversion);
   pragma Pure (Ada.Validated_Fixed_Conversion);

   generic
      type Source (<>) is limited private;
      type Target is delta <> digits <>;
   function Ada.Validated_Decimal_Conversion (S : Source) return Target;
   pragma Convention (Intrinsic, Ada.Validated_Decimal_Conversion);
   pragma Pure (Ada.Validated_Decimal_Conversion);

These would be defined to return the same value as would be produced by a
corresponding instance of Ada.Unchecked_Conversion, except in the case where
that result would be invalid; in that case, Program_Error is raised (this
definition would require formalizing).

Most (all?) existing implementations could provide these units without any
special compiler support; simply copy the body given in the preceding
example, changing the unit name appropriately. The key advantage of having
this unit be language-defined is portability; the burden is on the compiler
vendor, not on the user, to ensure that the unit is implemented correctly.

****************************************************************

From: Robert Dewar
Date: Thursday, November 8, 2001  9:28 PM

Why are these packages any different from doing an unchecked conversion
followed by a validity check?
****************************************************************

From: Pascal Leroy
Date: Friday, November 9, 2001  1:45 AM

Because as the language stands today you may become erroneous as soon as you
do the unchecked conversion, even before you have a chance to do the
validity check. I suggest reading the AI for details of the problem.

And this is not just an angels-on-a-pinhead issue; it shows up in real life
because optimizers do make assumptions that may make the erroneousness very
visible (e.g., leading to 'Valid being optimized away).

****************************************************************

From: Robert Dewar
Date: Friday, November 9, 2001  5:53 AM

OK, I understand, but I must say that a compiler that optimizes away 'Valid
under any circumstances seems broken from a pragmatic point of view to me.

****************************************************************

From: Robert Dewar
Date: Friday, November 9, 2001  6:01 AM

By the way, in GNAT we handle this by guaranteeing that the sequence of an
unchecked conversion followed by a validity check is always correct, and we
never ever optimize validity checks away. (It seems really, really wrong to
me to optimize a validity check away; after a validity check you should be
able to assume the data is valid, and that includes checking for validity
errors caused by random data clobbering etc. The compiler is never ever
justified in optimizing away a validity check.) Yes, I know we have no
formal way of saying this, but in pragmatic terms this is a very important
informal requirement.

****************************************************************

From: Randy Brukardt
Date: Wednesday, November 14, 2001  7:52 PM

I think the units are fine, and you should go ahead and complete the
write-up of the AI, including wording. :-)

> Most (all?) existing implementations could provide these units without any
> special compiler support; simply copy the body given in the preceding
> example, changing the unit name appropriately.
Well, the body wouldn't work for Janus/Ada, because the size of a component
of a generic formal integer type is the size of the largest possible integer
type, and pragma Pack is ineffective on such types. (The results of generic
code sharing.) Thus, you probably would get a type mismatch error on the
Unchecked_Conversion instantiation.

But I don't think that the burden of building these into the compiler would
be high enough to be a problem. (I don't think our optimizer reasons from
erroneousness; we wanted to be able to detect random data values that came
from any source. If we were redoing it for Ada 95, I think we'd take a tack
much like Robert outlined -- only worry about random values at a 'Valid
check.)

Robert said:
> I must say that a compiler that optimizes away 'Valid under any
> circumstances seems broken from a pragmatic point of view to me.

I think that is a little bit too strong: if the compiler can determine that
the value being tested is known to be set to one or more valid, static
values in the current extended basic block (via flow analysis, for example),
then it can remove the check. Certainly if the value is known to be static.
But those cases are likely to be rare enough that it may not be worth it to
allow them.

****************************************************************

From: Robert Dewar
Date: Friday, November 15, 2001  9:11 AM

<>

This implementation is arguably incorrect; I see no reason to allow the UC
to fail in this case.

<>

You are right legalistically, but I will repeat my assertion that it is a
bad implementation choice to ever optimize away 'Valid. This should be a
defence even in an erroneous program with memory overwriting etc. (or
hardware failure).

****************************************************************

From: David Emery
Date: Friday, November 15, 2001  11:30 AM

Speaking as J. Random User here :-). If I used 'Valid, then I want to be
guaranteed a check, regardless of what the compiler can "prove".
If I didn't want it, I wouldn't ask for it. Besides Robert's example, this
could well be part of an external "proof system" (e.g., SPARK) where the
correctness of the proof depends on the check occurring at exactly this
point.

****************************************************************

From: Mike Kamrad
Date: Friday, November 15, 2001  1:20 PM

I would agree with Robert's assertion too, as M. Random-User. Is this the
one attribute that we do not want to optimize away?

****************************************************************

From: Randy Brukardt
Date: Friday, October 25, 2002  8:50 PM

Some comments on the AI-167 write-up that was passed out at the ARG meeting
(this will be version /02 next time I upload AIs):

There was an incorrect paragraph reference in the !response (it should be
13.9.1(12), not 13.9.1(2)), which I've fixed in the to-be-posted version.

The !discussion section (unchanged from the previous version) should include
an example like the one we discussed in Cupertino (where an assignment
discards bits). (It's in the minutes of that meeting.)

The discussion spends a lot of words discussing whether a UC with a scalar
result type ever produces an abnormal result. It would seem that this is
allowed by 13.9(11). Do we need a wording change to insure that doesn't
happen?

It's not clear to me that we actually can make such a statement
(implementation-defined is just that; if the implementor wants to say that
all such UCs are erroneous, that seems to be an acceptable answer). But we
could make an AARM note to the effect that scalar UCs should either produce
a legitimate value or an invalid representation. Or perhaps Implementation
Advice to that effect?

****************************************************************