Version 1.12 of ais/ai-00167.txt

Unformatted version of ais/ai-00167.txt version 1.12
Other versions for file ais/ai-00167.txt

!standard 13.09.01 (12)          03-05-03 AI95-00167/04
!class binding interpretation 98-03-18
!status Amendment 200Y 03-02-19
!status WG9 approved 03-06-20
!status ARG Approved 6-2-0 03-02-09
!status work item 98-03-18
!status received 96-11-16
!priority Low
!difficulty Hard
!subject Scalar unchecked conversion can be used with 'Valid to prevent erroneousness
!summary
Assigning the result of a call on Unchecked_Conversion or an imported function to an object and immediately testing it with 'Valid is not erroneous.
!question
13.9.2(1) says, "The Valid attribute can be used to check the validity of data produced by unchecked conversion, input, interface to foreign languages, and the like," but 13.9.1(12) says, "A call to an imported function or an instance of Unchecked_Conversion is erroneous if the result is scalar, and the result object has an invalid representation." How can the Valid attribute be used to check the validity of an unchecked-conversion result without rendering execution erroneous in the case that the result is invalid?
!recommendation
If a call to an imported function or an instance of Unchecked_Conversion returns a scalar object that has an invalid representation, paragraph 13.9.1(12) makes any use of such a value erroneous. This paragraph should have made a special case for a use as the right hand side of an assignment or the prefix of 'Valid. Otherwise, there is no way to make use of the 'Valid attribute to check the validity of the result, as implied by 13.9.2(1).
!wording
Replace 13.9.1(12) by:
A call to an imported function or an instance of Unchecked_Conversion is erroneous if the result is scalar, the result object has an invalid representation, and the result is used other than as the expression of an assignment_statement or an object_declaration, or as the prefix of a Valid attribute. If such a result object is used as the source of an assignment, and the assigned value is an invalid representation for the target of the assignment, then any use of the target object prior to a further assignment to the target object, other than as the prefix of a Valid attribute reference, is erroneous.
!discussion
Implementations that already provide the desired behavior (i.e., disregarding the fact that execution is formally erroneous in the case of an invalid unchecked-conversion result, and performing a meaningful validity test for the Valid attribute) need not change to conform to this interpretation.
Consider the following declarations:
type Setting is (Off, Low, Medium, High); for Setting use (2#000#, 2#001#, 2#010#, 2#100#); for Setting'Size use 3;
type Three_Bits is array (Natural range <>) of Boolean; for Three_Bits'Component_Size use 1; for Three_Bits'Size use 3;
function Bits_To_Setting is new Ada.Unchecked_Conversion(Three_Bits, Setting);
Raw_Input : Three_Bits; Unvalidated_Setting : Setting; Validated_Setting : Setting;
Input_Error: exception;
According to 13.9.1(12), execution of
Unvalidated_Setting := Bits_To_Setting(Raw_Input); if Unvalidated_Setting'Valid then Validated_Setting := Unvalidated_Setting; else raise Input_Error; end if;
is erroneous if Raw_Input does not contain one of the four bit patterns that are valid representations of Setting values. Execution is rendered erroneous by the function call in the first assignment statement. Even though an implementation is likely in practice to behave as expected (raising Input_Error), execution is, formally, unpredictable from this point on. In theory, it is permissible to generate code for an attribute X'Valid, where X is known to be the result of an unchecked conversion, that always yields True (since the only case in which the attribute would yield False is the case in which execution is erroneous, and any behavior is permissible in that case)!
13.9(10) and 13.9(11) stipulate that if the representation of the actual parameter of an unchecked conversion is not "the representation of an object of the target subtype," then "the effect is implementation-defined; in particular the result can be abnormal." In the case of a scalar target type, assuming that the unchecked conversion produces an object with the same bit pattern as the actual parameter, the result will be invalid, as defined in 13.9.1(2) ("the object's representation does not represent any value of the object's subtype").
It is the intent of the Standard that a scalar unchecked-conversion result holding an invalid representation is not abnormal. Abnormality is a graver condition than invalidity. By 13.9.1(1), it is a bounded error to "evaluate the value" of an object with an invalid representation. This bounded error may result in an exception or in the use of the invalid representation value, but not in arbitrary behavior. In contrast, an abnormal object is considered so seriously corrupted that it is erroneous even to evaluate its name, even as the prefix of some enclosing name. Such serious corruption can occur in some composite objects (for example, dope vectors, discriminants, or internal offsets may be corrupted, causing run-time checks themselves to misbehave). However, the only forms of corrupt scalar data are:
o a representation for an integer type, enumeration type, or
floating point type object that is outside the range of the object's subtype
o a representation for an enumeration type object that is not the
representation of any value in the type
o a representation for a floating point type object that is not the
representation of any floating point value
It is feasible to check for each of these forms of corruption, and the evaluation of the Valid attribute is expected to do so. (The check for an invalid representation in an enumeration type with gaps may entail a binary search of a table of valid representations. The check for an invalid floating-point representation may entail loading a value into a floating-point register or adding 0.0 to it, and responding to a resulting hardware trap.)
As explained by the note in 13.9.2(12), evaluation of the attribute X'Valid does not entail "evaluating the value" of X. Therefore, the Valid attribute of a scalar unchecked-conversion result can always be evaluated without generating a bounded error.
We considered repealing 13.9.1(12) altogether. In this case, 13.9.1(9) would apply, and use of the invalid value would cause a bounded error.
This solution was rejected because it would have distributed overhead costs on implementations. An implementation would no longer be able to assume that an object, once initialized to a valid value, would stay valid. This would prevent optimization of many constraint checks. An alternative to that would be to effectively do a 'Valid check on the result of every call to Unchecked_Conversion or imported function, raising Program_Error on failure. Such an implementation would have both extra overhead and would effectively be taking control of data validity out of the user's hands. (They may have their own recovery mechanism for invalid data.)
Therefore, we declare any use other than storing the invalid value and testing its validity to be erroneous. It is necessary to limit storing of invalid values to assignment_statements and object_declarations to prevent distributed overhead. For instance, passing the invalid value as a parameter is still erroneous; otherwise, static analysis could not assume valid values in most cases.
It has been suggested that, since 13.9.1(12) applies only to scalars, a programmer can avoid erroneous execution by having the unchecked conversion return a one-element record containing the scalar:
type Setting_Container is record Only_Component: Setting; end record;
for Setting_Container use record Only_Component at 0 range 0 .. 2; end record;
for Setting_Container'Size use 3;
function Bits_To_Setting_Container is new Ada.Unchecked_Conversion(Three_Bits, Setting_Container);
Unvalidated_Setting_Container: Setting_Container;
...
Unvalidated_Setting_Container := Bits_To_Setting_Container(Raw_Input); if Unvalidated_Setting_Container.Only_Component'Valid then Validated_Setting := Unvalidated_Setting_Container.Only_Component; else raise Input_Error; end if;
However, by 13.9(11), Bits_To_Setting_Container is permitted to return an abnormal object if Raw_Input does not contain the representation of a Setting_Container value. Then evaluation of the call on Bits_To_Setting_Container is erroneous by 13.9.1(8).
!corrigendum 13.9.1(12)
Replace the paragraph:
A call to an imported function or an instance of Unchecked_Conversion is erroneous if the result is scalar, and the result object has an invalid representation.
by:
A call to an imported function or an instance of Unchecked_Conversion is erroneous if the result is scalar, the result object has an invalid representation, and the result is used other than as the expression of an assignment_statement or an object_declaration, or as the prefix of a Valid attribute. If such a result object is used as the source of an assignment, and the assigned value is an invalid representation for the target of the assignment, then any use of the target object prior to a further assignment to the target object, other than as the prefix of a Valid attribute reference, is erroneous.
!ACATS Test
An ACATS C-test could only check that the U_C and 'Valid work; it could not verify the absence of erroneousness. This does not seem valuable.
!appendix

!section 13.9.1(12)
!subject Erroneous scalar Unchecked_Conversion?
!reference RM95-13.9.1(12)
!from Keith Thompson 96-10-07
!reference 96-5719.a Keith Thompson 96-10-7>>
!discussion

RM95-13.9.1(12) says:

        A call to an imported function or an instance of
        Unchecked_Conversion is erroneous if the result
        is scalar, and the result object has an invalid
        representation.

This is followed in the AARM by several paragraphs recommending that
implementations should behave sensibly.  The last sentence of 12.a says:

        We considered requiring such sensible behavior, but
        it resulted in too much arcane verbiage, and since
        implementations have little incentive to behave
        irrationally, such verbiage is not important to have.

Unfortunately, recommending that implementations behave sensibly is not
sufficient.  The best policy for a user trying to write good portable
Ada is to avoid erroneous execution altogether, even if some specific
implementations happen to document the particular form of undefined
behavior that they implement.

It should be possible, for example, to use Unchecked_Conversion to
convert an integer value to a sparse enumeration type and apply the
'Valid attribute to the result without invoking erroneous execution.

In any case, this paragraph should have been in 13.9, not 13.9.1.

****************************************************************

>From the minutes of the April 1997 ARG meeting in Henley:

AI-167 Erroneous scalar Unchecked_Conversion?

There is a dilemma here -- the user can't get to check the validity of the
resulting value before the program is *defined" to be erroneous.  Sparse
enumerations are a particular source of problems for this AI.  One
obvious portable solution is to somehow *promptly" test the validity of
the resulting value before the user does anything else to the value.
Making this work seems too messy to define.  A non-portable solution is
to make the situation implementation-defined rather than erroneous.
Another non-portable solution is to raise an exception when an invalid
value is detected during the conversion.  This was rejected during the
language design process.
The user can directly handle this problem by doing the unchecked
conversion by either:
_ Converting to an integer, then using a case statement to check
for valid values
_ Wrapping the designated result in a record as the sole
component.  The user can then perform a validity check of the
component value.  This is due to the fact that there is no
component type checking performed when the assignment is
to a record type.

The group reached consensus that the only option is to confirm the
language on this issue and to expect the user to do the sensible thing to
avoid this problem.

****************************************************************

>From the minutes of the November 1997 ARG meeting in St. Louis:

The group revisits the major points already discussed at the Henley
meeting.  The discussion centered on a model that makes the conversion
in question a bounded error, possibly rendering (for conversion to
sparse enumeration) the result to be a value with invalid
representation, eligible to testing by 'Valid. Staying with
erroneousness, there is no opportunity to do anything reasonable,
formally speaking. The sentiment is voiced that that's fine because we
know that, in reality, this conversion and follow-on 'Valid query
would work. The counter-argument is that users want more assurance.
This AI applies to Unchecked_Conversion and imports.

Bob recalls that the erroneous execution for this case was originally
selected for optimization reasons.  It is certainly strange that
unchecked conversion of a wrapper record is the way around this
erroneous behavior or any related optimization.

John recommends that the most straightforward way out of this dilemma
is to simply state by fiat than 'Valid when applied [immediately]
after an Unchecked_Conversion whose target is a sparse enumerated type
produces a useable result and not an erroneous execution.
Alternatively, Bob recommends that 13.9.1(12) be changed to returning
a value with invalid representation (which can be subsequently checked
with 'Valid applied to the object with this value).

Norm will produce the write-up of this AI.

****************************************************************

Editor's note:

The priority of this AI was changed based on ARG discussion in November 2000.

****************************************************************

From: Steve Baird
Date: Thursday, November 8, 2001  8:18 PM

Two points were made in the discussions of AI-167 at the 10/01 ARG meeting
in Minneapolis:

    1) A solution to the problem described in the AI is needed.

    2) Changing the definition of Ada.Unchecked_Conversion is not
       the right solution. The change that was being considered would
       have imposed a performance penalty on programs which use
       Unchecked_Conversion "correctly" (i.e. consistently with the
       existing language rules).

One can come close to solving the problem using the language
as it stands today, but the solution is both awkward and obscure.

Consider the following attempt:

     generic
         type Source (<>) is limited private;
         type Target is (<>);
     function My_Unchecked_Conversion (S : Source) return Target;

     with Ada.Unchecked_Conversion;
     function My_Unchecked_Conversion (S : Source) return Target is
         type Target_Record is
              record
                  F : aliased Target;
              end record;
              pragma Pack (Target_Record);

         function Convert is new Ada.Unchecked_Conversion (Source,
Target_Record);
         Result : Target renames Convert (S).F;
     begin
         if Result'Valid then
             return Result;
         else
             raise Program_Error;
         end if;
     end;

The one-field record skin is used to get around 13.9.1(12) and the function
result rename is used in order to meet the restrictions of 13.9.1(8).

Even ignoring the issues of awkwardness and obscurity, this solution is not
completely satisfactory. It depends on Target and Target_Record having the
same representation (this is the motivation for the Pack pragma and for
declaring
the component to be aliased). Corresponding values of the two types would
typically have the same representation, but relying on this assumption when
trying to
write portable code is undesirable.

To address these problems, I propose adding the following language-defined
generic functions:

     generic
         type Source (<>) is limited private;
         type Target is (<>);
     function Ada.Validated_Discrete_Conversion (S : Source) return Target;
     pragma Convention (Intrinsic, Ada.Validated_Discrete_Conversion);
     pragma Pure (Ada.Validated_Discrete_Conversion);

     generic
         type Source (<>) is limited private;
         type Target is digits <>;
     function Ada.Validated_Float_Conversion (S : Source) return Target;
     pragma Convention (Intrinsic, Ada.Validated_Float_Conversion);
     pragma Pure (Ada.Validated_Float_Conversion);

     generic
         type Source (<>) is limited private;
         type Target is delta <>;
     function Ada.Validated_Fixed_Conversion (S : Source) return Target;
     pragma Convention (Intrinsic, Ada.Validated_Fixed_Conversion);
     pragma Pure (Ada.Validated_Fixed_Conversion);

     generic
         type Source (<>) is limited private;
         type Target is delta <> digits <>;
     function Ada.Validated_Decimal_Conversion (S : Source) return Target;
     pragma Convention (Intrinsic, Ada.Validated_Decimal_Conversion);
     pragma Pure (Ada.Validated_Decimal_Conversion);

These would be defined to return the same value as would be produced
by a corresponding instance of Ada.Unchecked_Conversion except in the
case where that result would be invalid; in that case, Program_Error
is raised (this definition would require formalizing).

Most (all?) existing implementations could provide these units without any
special compiler support; simply copy the body given in the preceding example,
changing the unit name appropriately. The key advantage of having this unit be
language-defined is portability; the burden is on the compiler vendor, not on
the user, to ensure that that the unit is implemented correctly.

****************************************************************

From: Robert Dewar
Date: Thursday, November 8, 2001  9:28 PM

why are these packages any different from doing an unchecked conversion
followed by a validity check?

****************************************************************

From: Pascal Leroy
Date: Friday, November 9, 2001  1:45 AM

Because as the language stands today you may become erroneous as soon as you
do the unchecked conversion, even before you have a chance to do the
validity check.  I suggest reading the AI for details of the problem.  And
this is not just an angel-on-a-pinhead issue, it shows up in real life
because optimizers do make assumptions that may make the erroneousness very
visible (e.g., leading to 'Valid being optimized away).

****************************************************************

From: Robert Dewar
Date: Friday, November 9, 2001  5:53 AM

OK, I understand, but I must say that a compiler that optimizes away 'Valid
under any circumstances seems broken from a pragmatic point of view to me.

****************************************************************

From: Robert Dewar
Date: Friday, November 9, 2001  6:01 AM

By the way, in GNAT we handle this by guaranteeing that the sequence of
an unchecked conversion followed by a validity check is always correct,
and we never ever optimize valid checks away (it seems really really
wrong to me to optimize a valid check away, after a validity check
you should be able to assume the data is valid, and that includes
checking for validity errors caused by random data clobbering etc,
the compiler is never ever justified in optimizing away a valid
check. Yes, I know we have no formal way of saying this, but in
pragmatic terms this is a very important informal requirement.

****************************************************************

From: Randy Brukardt
Date: Wednesday, November 14, 2001  7:52 PM

I think the units are fine, and you should go ahead and complete the write-up
of the AI, including wording. :-)

> Most (all?) existing implementations could provide these units without any
> special compiler support; simply copy the body given in the preceding example,
> changing the unit name appropriately.

Well, the body wouldn't work for Janus/Ada, because the size of component of a
generic formal integer type is the size of the largest possible integer type.
And pragma Pack is ineffective on such types. (The results of generic code
sharing.) Thus, you probably would get a type mismatch error on the
Unchecked_Conversion instantiation.

But I don't think that the burden of building these into the compiler would be
high enough to be a problem. (I don't think our optimizer reasons from
erroneousness; we wanted to be able to detect random data values that came from
any source. If we were redoing it for Ada 95, I think we'd take a tack much
like Robert outlined - only worry about random values at a 'Valid check.)

Robert said:

>I must say that a compiler that optimizes away 'Valid
>under any circumstances seems broken from a pragmatic point of view to me.

I think that is a little bit too strong: if the compiler can determine that the
value being tested is known to be set to one or more valid, static values in
the current extended basic block (via flow analysis, for example), then it can
remove the check. Certainly if the value is known to be static. But those cases
are likely to be rare enough that it may not be worth it to allow them.

****************************************************************

From: Robert Dewar
Date: Friday, November 15, 2001  9:11 AM

<<Well, the body wouldn't work for Janus/Ada, because the size of component of
a generic formal integer type is the size of the largest possible integer
type. And pragma Pack is ineffective on such types. (The results of generic
code sharing.) Thus, you probably would get a type mismatch error on the
Unchecked_Conversion instantiation.>>

This implementation is arguably incorrect, I see no reason to allow the UC to
fail in this case.

<<I think that is a little bit too strong: if the compiler can determine that
the value being tested is known to be set to one or more valid, static
values in the current extended basic block (via flow analysis, for example),
then it can remove the check. Certainly if the value is known to be static.
But those cases are likely to be rare enough that it may not be worth it to
allow them.
>>

You are right legalistically, but I wlil repeat my assertion that it is a bad
implementatoin choice to ever optimize away 'Valid. This should be a
defence even in an erroneous program with memory overwriting etc.
(or hardware failure)

****************************************************************

From: David Emery
Date: Friday, November 15, 2001  11:30 AM

Speaking a J. Random User here :-).  If I used 'Valid, then
I want to be guaranteed a check, regardless of what the compiler
can "prove".  If I didn't want it, I wouldn't ask for it.

Besides Robert's example, this could well be part of an
external 'proof system' (e.g. SPARK) where the correctness
of the proof depends on the check occuring at exactly this
point.

****************************************************************

From: Mike Kamrad
Date: Friday, November 15, 2001  1:20 PM

I would agree with Robert's assertion too, as M. Random-User.  Is this the
one attribute the only one way we do not want to optimize away?

****************************************************************

From: Randy Brukardt
Date: Friday, October 25, 2002  8:50 PM

Some comments on the AI-167 write-up that was passed out at the ARG meeting:
(this will be version /02 next time I upload AIs.)

There was an incorrect paragraph reference in the !response (it should be
13.9.1(12), not 13.9.1(2)), which I've fixed in the to-be-posted version.

The !discussion section (unchanged from the previous version) should include
an example like the one we discussed in Cupertino (where an assignment
discards bits). (It's in the minutes of that meeting.)

The discussion spends a lot of words discussing whether a UC with a scalar
result type ever produces an abnormal result. It would seem that this is
allowed by 13.9(11). Do we need a wording change to insure that doesn't
happen?

It's not clear to me that we actually can make such a statement
(implementation-defined is just that; if the implementor wants to say that
all such UCs are erroneous, that seems to be an acceptable answer). But we
could make an AARM note to the effect that scalar UCs should either produce
a legitimate value or an invalid representation. Or perhaps Implementation
Advice to that effect?

****************************************************************

From: Pascal Leroy
Date: Thursday, April 24, 2003  10:55 AM

AI 167 singles out assignment_statements and object_declarations as
constructs that can be used to store the value returned by
Unchecked_Conversion before testing for validity.  As part of his
editorial review, Kiyoshi commented:

"Is the special case handling limited to 'an assignment_statement or an
object declaration'?  For example, in AI-00287, similar special case
consideration is given also to default_expression, allocator, and formal
object declaration.  I agree that assignment_statement and object
declaration cover most of useful cases, but like to make the situation
clear."

I think that parameter passing should be excluded here, or we are
killing all optimizations.  So default_expressions for parameters ought
to be excluded too.  However, I am not sure what the right answer is for
the other constructs.  Surely at the Padua meeting we didn't look at all
the constructs that involve an assignment operation, so I am wondering
if we covered all the useful cases.

****************************************************************

From: Tucker Taft
Date: Thursday, April 24, 2003  11:59 AM

This seems like a case where we should err on the
side of limitation, namely, only allow the minimal
number of constructs we think are needed.
Assignment statements and object declarations
are sufficient.  No need to explore other obscure
alternatives in my view.  But for the record...

    generic formal -- bad idea for shared generics
       (equivalent to subprogram parameter in that case)
    parameter/default -- bad, since subprogram body doesn't
        "see" the unchecked conversion
    allocator -- obscure
    aggregate -- obscure

****************************************************************

From: Tucker Taft
Date: Thursday, April 24, 2003  12:29 PM

I agree, no need to include other constructs.

****************************************************************

From: Robert Dewar
Date: Friday, April 25, 2003  7:53 AM

I agree with this limitation.

****************************************************************

!topic Inconsistent explanatation of use of invalid data
!reference Draft 11 AARM05-13.9.1(12/2, 12.d, 12.e/2)
!from Grein 2005-06-15
!discussion
13.9.1(12/2) talks about erroneous usages of the target object of an
assignment statement.
(12.d, 12.e/2) say use of Y, i.e. the source of an assignment statement,
is already erroneous.

I think this should be:

Z : Position := Y; -- Erroneous to use Z

and "as soon as Z is used". (Of course other uses of Y are also
erroneous by (12/2).)

BTW: (12.c/2) should in fact be (12.c) and (12.d) should in fact be
(12.d/2) because the former para is not changed, whereas the latter is.
Is there a problem in some generator?

****************************************************************

From: Randy Brukardt
Date: Wednesday, June 15, 2005  7:45 PM

> and "as soon as Z is used". (Of course other uses of Y are also
> erroneous by (12/2).)

No, the intent was that the use of Y would be erroneous. (We've already
talked about the other case, why repeat it?) The problem is with example,
not the text. What the heck is "Position", anyway? Not my best work.

I changed it to:

Z : Positive := Y+1; -- Erroneous to use Y

which is probably what I meant.

> BTW: (12.c/2) should in fact be (12.c) and (12.d) should in fact be
> (12.d/2) because the former para is not changed, whereas the latter is.
> Is there a problem in some generator?

No, it just isn't obvious from the source there are two paragraphs there, I
put the Change Reference on the wrong one, and I didn't notice it when I
reviewed the changes. No biggie.

****************************************************************


Questions? Ask the ACAA Technical Agent