!standard 13.09.01 (12)                               01-01-29  AI95-00167/01
!class binding interpretation 98-03-18
!status work item 98-03-18
!status received 96-11-16
!priority Low
!difficulty Hard
!subject Scalar unchecked conversion is never erroneous

!summary 98-03-18

Scalar results produced by unchecked conversions or calls on imported
subprograms may be invalid, but not abnormal. The execution of such a call
is not inherently erroneous.

!question 98-03-18

13.9.2(1) says, "The Valid attribute can be used to check the validity of
data produced by unchecked conversion, input, interface to foreign languages,
and the like," but 13.9.1(12) says, "A call to an imported function or an
instance of Unchecked_Conversion is erroneous if the result is scalar, and
the result object has an invalid representation." How can the Valid attribute
be used to check the validity of an unchecked-conversion result without
rendering execution erroneous in the case that the result is invalid?

!response 98-03-18

If a call to an imported function or an instance of Unchecked_Conversion
returns a scalar object whose representation is not the representation of any
value in the return subtype, or if a call to an imported procedure causes a
scalar actual parameter to hold a representation that is not the
representation of any value in the parameter subtype, the function result or
actual parameter holds an invalid representation, but is not abnormal.
Notwithstanding 13.9.1(12), the fact that the subprogram call produced an
invalid representation does not make execution of the call erroneous.

!discussion 98-03-18

Implementations that already provide the desired behavior (i.e., disregarding
the fact that execution is formally erroneous in the case of an invalid
unchecked-conversion result, and performing a meaningful validity test for
the Valid attribute) need not change to conform to this interpretation.

Consider the following declarations:

   type Setting is (Off, Low, Medium, High);
   for Setting use (2#000#, 2#001#, 2#010#, 2#100#);
   for Setting'Size use 3;

   type Three_Bits is array (0 .. 2) of Boolean;
   for Three_Bits'Component_Size use 1;
   for Three_Bits'Size use 3;

   function Bits_To_Setting is
      new Ada.Unchecked_Conversion (Three_Bits, Setting);

   Raw_Input           : Three_Bits;
   Unvalidated_Setting : Setting;
   Validated_Setting   : Setting;
   Input_Error         : exception;

According to 13.9.1(12), execution of

   Unvalidated_Setting := Bits_To_Setting (Raw_Input);
   if Unvalidated_Setting'Valid then
      Validated_Setting := Unvalidated_Setting;
   else
      raise Input_Error;
   end if;

is erroneous if Raw_Input does not contain one of the four bit patterns that
are valid representations of Setting values. Execution is rendered erroneous
by the function call in the first assignment statement. Even though an
implementation is likely in practice to behave as expected, raising
Input_Error, execution is, formally, unpredictable from this point on. In
theory, it is permissible to generate code for an attribute X'Valid, where X
is known to be the result of an unchecked conversion, that always yields True
(since the only case in which the attribute would yield False is the case in
which execution is erroneous, and any behavior is permissible in that case)!

13.9(10) and 13.9(11) stipulate that if the representation of the actual
parameter of an unchecked conversion is not "the representation of an object
of the target subtype," then "the effect is implementation-defined; in
particular the result can be abnormal."
In the case of a scalar target type, assuming that the unchecked conversion
produces an object with the same bit pattern as the actual parameter, the
result will be invalid, as defined in 13.9.1(2) ("the object's representation
does not represent any value of the object's subtype"). It is the intent of
the Standard that a scalar unchecked-conversion result holding an invalid
representation is not abnormal.

Abnormality is a graver condition than invalidity. By 13.9.1(1), it is a
bounded error to "evaluate the value" of an object with an invalid
representation. This bounded error may result in an exception or in the use
of the invalid representation value, but not in arbitrary behavior. In
contrast, an abnormal object is considered so seriously corrupted that it is
erroneous even to evaluate its name, even as the prefix of some enclosing
name. Such serious corruption can occur in some composite objects (for
example, dope vectors, discriminants, or internal offsets may be corrupted,
causing run-time checks themselves to misbehave). However, the only forms of
corrupt scalar data are:

   o a representation for an integer-type, enumeration-type, or
     floating-point-type object that is outside the range of the object's
     subtype

   o a representation for an enumeration-type object that is not the
     representation of any value in the type

   o a representation for a floating-point-type object that is not the
     representation of any floating-point value

It is feasible to check for each of these forms of corruption, and the
evaluation of the Valid attribute is expected to do so. (The check for an
invalid representation in an enumeration type with gaps may entail a binary
search of a table of valid representations. The check for an invalid
floating-point representation may entail loading a value into a
floating-point register or adding 0.0 to it, and responding to a resulting
hardware trap.)

As explained by the note in 13.9.2(12), evaluation of the attribute X'Valid
does not entail "evaluating the value" of X. Therefore, the Valid attribute
of a scalar unchecked-conversion result can always be evaluated without
generating a bounded error. More importantly, an unchecked conversion that
returns an invalid result does not render execution erroneous. The use of an
invalid unchecked-conversion result in other contexts may result in a bounded
error, but not in erroneous execution. The following example illustrates the
importance of this distinction:

   case Bits_To_Setting (Raw_Input) is
      when Off    => ...
      when Low    => ...
      when Medium => ...
      when High   => ...
   end case;

Suppose Raw_Input does not contain the representation of any Setting value.
If the execution of the unchecked conversion were considered erroneous, it
would be permissible for the implementation to ignore the possibility of an
invalid result. That is, the implementation could optimize away the check
called for by 5.4(13), which verifies that the value of a case-statement
expression is covered by one of the case statement's discrete choice lists.
The justification for eliminating the check is that the check can only fail
during erroneous executions, and any behavior is permissible during erroneous
execution. If the check is eliminated, an invalid result could cause a branch
to an arbitrary address, with catastrophic results.

In fact, the unchecked conversion is not erroneous, but the object it returns
contains an invalid representation. The execution of the case statement
entails the evaluation of this object to obtain its value, and this
evaluation is a bounded error.
The possible consequences of this bounded error are enumerated in 13.9.1(9):
Constraint_Error or Program_Error can be raised, or execution can continue
using the invalid representation. If execution continues using the invalid
representation, the check stipulated by 5.4(13) is performed, raising
Constraint_Error.

(It has been suggested that, since 13.9.1(12) applies only to scalars, a
programmer can avoid erroneous execution by having the unchecked conversion
return a one-element record containing the scalar:

   type Setting_Container is
      record
         Only_Component : Setting;
      end record;
   for Setting_Container use
      record
         Only_Component at 0 range 0 .. 2;
      end record;
   for Setting_Container'Size use 3;

   function Bits_To_Setting_Container is
      new Ada.Unchecked_Conversion (Three_Bits, Setting_Container);

   Unvalidated_Setting_Container : Setting_Container;
   ...
   Unvalidated_Setting_Container := Bits_To_Setting_Container (Raw_Input);
   if Unvalidated_Setting_Container.Only_Component'Valid then
      Validated_Setting := Unvalidated_Setting_Container.Only_Component;
   else
      raise Input_Error;
   end if;

However, by 13.9(11), Bits_To_Setting_Container is permitted to return an
abnormal object if Raw_Input does not contain the representation of a
Setting_Container value. Then evaluation of the call on
Bits_To_Setting_Container is erroneous by 13.9.1(8).)

!appendix 96-11-16

!section 13.9.1(12)
!subject Erroneous scalar Unchecked_Conversion?
!reference RM95-13.9.1(12)
!from Keith Thompson 96-10-07
!reference 96-5719.a Keith Thompson 96-10-7>>
!discussion

RM95-13.9.1(12) says:

   A call to an imported function or an instance of Unchecked_Conversion is
   erroneous if the result is scalar, and the result object has an invalid
   representation.

This is followed in the AARM by several paragraphs recommending that
implementations should behave sensibly. The last sentence of 12.a says:

   We considered requiring such sensible behavior, but it resulted in too
   much arcane verbiage, and since implementations have little incentive to
   behave irrationally, such verbiage is not important to have.

Unfortunately, recommending that implementations behave sensibly is not
sufficient. The best policy for a user trying to write good portable Ada is
to avoid erroneous execution altogether, even if some specific
implementations happen to document the particular form of undefined behavior
that they implement. It should be possible, for example, to use
Unchecked_Conversion to convert an integer value to a sparse enumeration type
and apply the 'Valid attribute to the result without invoking erroneous
execution.

In any case, this paragraph should have been in 13.9, not 13.9.1.

****************************************************************

From the minutes of the April 1997 ARG meeting in Henley:

AI-167 Erroneous scalar Unchecked_Conversion?

There is a dilemma here -- the user can't get to check the validity of the
resulting value before the program is *defined* to be erroneous. Sparse
enumerations are a particular source of problems for this AI.

One obvious portable solution is to somehow *promptly* test the validity of
the resulting value before the user does anything else to the value. Making
this work seems too messy to define. A non-portable solution is to make the
situation implementation-defined rather than erroneous. Another non-portable
solution is to raise an exception when an invalid value is detected during
the conversion. This was rejected during the language design process.
The user can directly handle this problem by doing the unchecked conversion
by either:

   - Converting to an integer, then using a case statement to check for
     valid values

   - Wrapping the designated result in a record as the sole component. The
     user can then perform a validity check of the component value. This is
     because there is no component type checking performed when the
     assignment is to a record type.

The group reached consensus that the only option is to confirm the language
on this issue and to expect the user to do the sensible thing to avoid this
problem.

****************************************************************

From the minutes of the November 1997 ARG meeting in St. Louis:

The group revisits the major points already discussed at the Henley meeting.
The discussion centered on a model that makes the conversion in question a
bounded error, possibly rendering (for conversion to a sparse enumeration)
the result to be a value with an invalid representation, eligible for testing
by 'Valid. Staying with erroneousness, there is no opportunity to do anything
reasonable, formally speaking. The sentiment is voiced that that's fine
because we know that, in reality, this conversion and follow-on 'Valid query
would work. The counter-argument is that users want more assurance. This AI
applies to Unchecked_Conversion and imports.

Bob recalls that the erroneous execution for this case was originally
selected for optimization reasons. It is certainly strange that unchecked
conversion of a wrapper record is the way around this erroneous behavior or
any related optimization.

John recommends that the most straightforward way out of this dilemma is to
simply state by fiat that 'Valid, when applied [immediately] after an
Unchecked_Conversion whose target is a sparse enumerated type, produces a
usable result and not an erroneous execution. Alternatively, Bob recommends
that 13.9.1(12) be changed to return a value with an invalid representation
(which can subsequently be checked with 'Valid applied to the object holding
this value).

Norm will produce the write-up of this AI.

****************************************************************

Editor's note: The priority of this AI was changed based on ARG discussion in
November 2000.

****************************************************************

From: Steve Baird
Date: Thursday, November 8, 2001  8:18 PM

Two points were made in the discussions of AI-167 at the 10/01 ARG meeting in
Minneapolis:

   1) A solution to the problem described in the AI is needed.

   2) Changing the definition of Ada.Unchecked_Conversion is not the right
      solution. The change that was being considered would have imposed a
      performance penalty on programs which use Unchecked_Conversion
      "correctly" (i.e. consistently with the existing language rules).

One can come close to solving the problem using the language as it stands
today, but the solution is both awkward and obscure.
Consider the following attempt:

   generic
      type Source (<>) is limited private;
      type Target is (<>);
   function My_Unchecked_Conversion (S : Source) return Target;

   with Ada.Unchecked_Conversion;
   function My_Unchecked_Conversion (S : Source) return Target is
      type Target_Record is
         record
            F : aliased Target;
         end record;
      pragma Pack (Target_Record);

      function Convert is
         new Ada.Unchecked_Conversion (Source, Target_Record);

      Result : Target renames Convert (S).F;
   begin
      if Result'Valid then
         return Result;
      else
         raise Program_Error;
      end if;
   end;

The one-field record skin is used to get around 13.9.1(12), and the rename of
the function result's component is used in order to meet the restrictions of
13.9.1(8).

Even ignoring the issues of awkwardness and obscurity, this solution is not
completely satisfactory. It depends on Target and Target_Record having the
same representation (this is the motivation for the Pack pragma and for
declaring the component to be aliased). Corresponding values of the two types
would typically have the same representation, but relying on this assumption
when trying to write portable code is undesirable.

To address these problems, I propose adding the following language-defined
generic functions:

   generic
      type Source (<>) is limited private;
      type Target is (<>);
   function Ada.Validated_Discrete_Conversion (S : Source) return Target;
   pragma Convention (Intrinsic, Ada.Validated_Discrete_Conversion);
   pragma Pure (Ada.Validated_Discrete_Conversion);

   generic
      type Source (<>) is limited private;
      type Target is digits <>;
   function Ada.Validated_Float_Conversion (S : Source) return Target;
   pragma Convention (Intrinsic, Ada.Validated_Float_Conversion);
   pragma Pure (Ada.Validated_Float_Conversion);

   generic
      type Source (<>) is limited private;
      type Target is delta <>;
   function Ada.Validated_Fixed_Conversion (S : Source) return Target;
   pragma Convention (Intrinsic, Ada.Validated_Fixed_Conversion);
   pragma Pure (Ada.Validated_Fixed_Conversion);

   generic
      type Source (<>) is limited private;
      type Target is delta <> digits <>;
   function Ada.Validated_Decimal_Conversion (S : Source) return Target;
   pragma Convention (Intrinsic, Ada.Validated_Decimal_Conversion);
   pragma Pure (Ada.Validated_Decimal_Conversion);

These would be defined to return the same value as would be produced by a
corresponding instance of Ada.Unchecked_Conversion, except in the case where
that result would be invalid; in that case, Program_Error is raised (this
definition would require formalizing).

Most (all?) existing implementations could provide these units without any
special compiler support; simply copy the body given in the preceding
example, changing the unit name appropriately. The key advantage of having
this unit be language-defined is portability; the burden is on the compiler
vendor, not on the user, to ensure that the unit is implemented correctly.

****************************************************************

From: Robert Dewar
Date: Thursday, November 8, 2001  9:28 PM

Why are these packages any different from doing an unchecked conversion
followed by a validity check?

****************************************************************

From: Pascal Leroy
Date: Friday, November 9, 2001  1:45 AM

Because as the language stands today you may become erroneous as soon as you
do the unchecked conversion, even before you have a chance to do the validity
check. I suggest reading the AI for details of the problem.
And this is not just an angel-on-a-pinhead issue; it shows up in real life
because optimizers do make assumptions that may make the erroneousness very
visible (e.g., leading to 'Valid being optimized away).

****************************************************************

From: Robert Dewar
Date: Friday, November 9, 2001  5:53 AM

OK, I understand, but I must say that a compiler that optimizes away 'Valid
under any circumstances seems broken from a pragmatic point of view to me.

****************************************************************

From: Robert Dewar
Date: Friday, November 9, 2001  6:01 AM

By the way, in GNAT we handle this by guaranteeing that the sequence of an
unchecked conversion followed by a validity check is always correct, and we
never ever optimize valid checks away. (It seems really really wrong to me to
optimize a valid check away; after a validity check you should be able to
assume the data is valid, and that includes checking for validity errors
caused by random data clobbering etc. The compiler is never ever justified in
optimizing away a valid check.) Yes, I know we have no formal way of saying
this, but in pragmatic terms this is a very important informal requirement.

****************************************************************

From: Randy Brukardt
Date: Wednesday, November 14, 2001  7:52 PM

I think the units are fine, and you should go ahead and complete the write-up
of the AI, including wording. :-)

> Most (all?) existing implementations could provide these units without any
> special compiler support; simply copy the body given in the preceding
> example, changing the unit name appropriately.

Well, the body wouldn't work for Janus/Ada, because the size of a component
of a generic formal integer type is the size of the largest possible integer
type. And pragma Pack is ineffective on such types. (The results of generic
code sharing.) Thus, you probably would get a type mismatch error on the
Unchecked_Conversion instantiation.

But I don't think that the burden of building these into the compiler would
be high enough to be a problem. (I don't think our optimizer reasons from
erroneousness; we wanted to be able to detect random data values that came
from any source. If we were redoing it for Ada 95, I think we'd take a tack
much like Robert outlined - only worry about random values at a 'Valid
check.)

Robert said:

> I must say that a compiler that optimizes away 'Valid
> under any circumstances seems broken from a pragmatic point of view to me.

I think that is a little bit too strong: if the compiler can determine that
the value being tested is known to be set to one or more valid, static values
in the current extended basic block (via flow analysis, for example), then it
can remove the check. Certainly if the value is known to be static. But those
cases are likely to be rare enough that it may not be worth it to allow them.

****************************************************************

From: Robert Dewar
Date: Friday, November 15, 2001  9:11 AM

<>

This implementation is arguably incorrect; I see no reason to allow the UC to
fail in this case.

<>

You are right legalistically, but I will repeat my assertion that it is a bad
implementation choice to ever optimize away 'Valid. This should be a defence
even in an erroneous program with memory overwriting etc. (or hardware
failure).

****************************************************************

From: David Emery
Date: Friday, November 15, 2001  11:30 AM

Speaking as J. Random User here :-).
If I used 'Valid, then I want to be guaranteed a check, regardless of what
the compiler can "prove". If I didn't want it, I wouldn't ask for it. Besides
Robert's example, this could well be part of an external 'proof system' (e.g.
SPARK) where the correctness of the proof depends on the check occurring at
exactly this point.

****************************************************************

From: Mike Kamrad
Date: Friday, November 15, 2001  1:20 PM

I would agree with Robert's assertion too, as M. Random-User. Is this
attribute the only one we do not want to optimize away?

****************************************************************
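
For illustration only, here is a minimal sketch of the "convert to an
integer, then use a case statement" workaround mentioned in the Henley
minutes, applied to the Setting example from the !discussion section. The
package and subprogram names (Raw_To_Setting, Unsigned_3, To_Bits,
To_Setting) are hypothetical and are not part of this AI; the point is that
every 3-bit pattern is a valid value of the modular target type, so the
unchecked conversion itself cannot produce an invalid result, and the
validity test is performed by an explicit case statement rather than by
'Valid.

   with Ada.Unchecked_Conversion;
   package Raw_To_Setting is

      type Setting is (Off, Low, Medium, High);
      for Setting use (2#000#, 2#001#, 2#010#, 2#100#);
      for Setting'Size use 3;

      type Three_Bits is array (0 .. 2) of Boolean;
      for Three_Bits'Component_Size use 1;
      for Three_Bits'Size use 3;

      Input_Error : exception;

      function To_Setting (Raw : Three_Bits) return Setting;

   end Raw_To_Setting;

   package body Raw_To_Setting is

      --  Every 3-bit pattern represents a value of this modular type, so an
      --  unchecked conversion to it cannot yield an invalid representation,
      --  and 13.9.1(12) does not come into play.
      type Unsigned_3 is mod 2**3;
      for Unsigned_3'Size use 3;

      function To_Bits is
         new Ada.Unchecked_Conversion (Three_Bits, Unsigned_3);

      function To_Setting (Raw : Three_Bits) return Setting is
      begin
         --  The case statement plays the role of the 'Valid test: only the
         --  four codes given in the enumeration representation clause for
         --  Setting are accepted.
         case To_Bits (Raw) is
            when 2#000# => return Off;
            when 2#001# => return Low;
            when 2#010# => return Medium;
            when 2#100# => return High;
            when others => raise Input_Error;
         end case;
      end To_Setting;

   end Raw_To_Setting;

Unlike the record-wrapper workaround discussed above, the conversion target
here is a scalar type with no invalid representations, so neither 13.9.1(12)
nor the abnormality permission of 13.9(11) applies.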