Version 1.8 of ais/ai-00426.txt

!standard 13.09.01 (6)          05-09-20 AI95-00426/05
!standard 13.09 (11)
!standard 13.09.02 (12)
!class binding interpretation 05-04-08
!status Amendment 200Y 05-05-05
!status WG9 Approved 06-06-09
!status ARG Approved 8-0-2 05-04-18
!status work item 05-04-08
!status received 05-04-08
!priority High
!difficulty Hard
!subject Language-defined routines returning abnormal and invalid values
!summary
Various language-defined functions and procedures can produce abnormal values for non-scalar types.
Unchecked_Conversion cannot produce abnormal values for scalar types (but it can produce invalid values).
13.9.1(9) applies to function results.
!question
13.9.1(6) seems to describe all of the operations that can return abnormal values. But what exactly does "language-defined input-output procedure" mean? Does it include a stream attribute like T'Read (which doesn't necessarily do any input - the stream may be a memory buffer, for instance)?
Moreover, this list doesn't include any functions (like T'Input or Unchecked_Conversion), which also can return abnormal values. But this seems like a complete list. Should this be updated? (Yes.)
AI-167 eliminates erroneous execution from various function results. But it does not eliminate exceptions from being raised by the bounded error 13.9.1(9) nor from the subtype conversion implicit in assignment. It's clear that the intent was that validity could be tested without having to worry about stray exceptions, but this was not accomplished. Should it be? (No.)
!recommendation
(See summary.)
!wording
Replace 13.9.1(6):
* The object is not scalar, and is passed to an in out or out parameter of an imported procedure, the Read procedure of an instance of Sequential_IO, Direct_IO, or Storage_IO, or the stream attribute T'Read, if after return from the procedure the representation of the parameter does not represent a value of the parameter's subtype.
* The object is the return object of a function call of a non-scalar type, and the function is an imported function, an instance of Unchecked_Conversion, or the stream attribute T'Input, if after return from the function the representation of the return object does not represent a value of the function's subtype.
AARM Notes: We explicitly list the routines involved in order to avoid future arguments. All possibilities are listed.
We did not include Stream_IO.Read in the list above. A Stream_Element should include all possible bit patterns, and thus it cannot be invalid. Therefore, the parameter will always represent a value of its subtype. By omitting this routine, we make it possible to write arbitrary I/O operations without any possibility of abnormal objects. End AARM Notes
For an imported object, it is the programmer's responsibility to ensure that the object remains in a normal state.
Change 13.9(11):
Otherwise, if the result type is scalar, the result of the function is implementation defined, and can have an invalid representation (see 13.9.1). If the result type is non-scalar, the effect is implementation defined; in particular, the result can be abnormal (see 13.9.1).
AARM Note: Note the difference between these sentences; the first only says that the bits returned are implementation-defined, while the latter allows any effect. The difference is because scalar objects should never be abnormal unless their assignment was disrupted or if they are a subcomponent of an abnormal composite object. Neither exception applies to instances of Unchecked_Conversion.
Add the following after 13.9.2(12):
21 The Valid attribute may be used to check the result of calling an instance of Unchecked_Conversion (or any other operation that can return invalid values). However, an exception handler should also be provided because implementations are permitted to raise Constraint_Error or Program_Error if they detect the use of an invalid representation (see 13.9.1).
!discussion
Once this area had been examined carefully, it became clear that we have a very confused situation. Therefore, we started from a list of goals that these rules are intended to accomplish:
1) An implementation need not protect itself against corrupted non-scalar objects. ("Abnormal objects", in the terms of the RM.)
2) An implementation should not allow erroneous execution (memory trashing or unpredictable execution) from uninitialized scalar objects ("invalid objects"). But precise semantics are not required.
3) The result of T'Input should follow similar rules to Unchecked_Conversion (these are both functions, and it is likely that T'Input will be implemented with Unchecked_Conversion).
4) T'Read and T'Input should follow similar rules. It is likely that one will be defined in terms of the other.
5) T'Read should follow similar rules to Sequential_IO.Read and similar routines. These are both used for similar purposes, and it is conceivable that one would replace the other in code during maintenance.
6) Imported subprograms should follow similar rules to the above, as the above routines are likely to be implemented with system calls defined as imported subprograms.
7) It should be possible to optimize away (some) range checks. (This implies that compilers have to be able to assume that objects at least have valid ranges.)
8) It should be possible to test (some) scalar objects that come from outside of the Ada universe for validity without having to handle exceptions.
Note that values do not have representations (technically), so these rules only apply to objects. Corrupted composite types and corrupted access types are awful, because they can point anywhere and thus damage any memory. (Composite types might be implemented non-contiguously, and have embedded pointers.)
OTOH, corrupted scalar objects are just a bag of bits. Their interpretation is not terribly messy.
The bounded error given in 13.9.1(9-11) applies to the results of functions in Ada 2006, because all functions return (anonymous) return objects. (See AI-318-2's changes to 6.5 for details.) That's a good thing, as range checks may need to interpret the value returned from a function. (For instance, a floating point range check means loading the value into a floating point register, which may cause a trap for an invalid object. Similarly, some implementations do sparse enumeration checks by position number, thus requiring a table lookup which would fail for an invalid object.)
The range check is necessary because an assignment does a subtype conversion, which always does a constraint check (see 4.6(51)). While compilers may optimize that out when the ranges match, Ada 95 specifically was designed not to allow that for invalid values.
There is Implementation Advice to avoid extra checks for instances of Unchecked_Conversion, but that is not required, does not cover all of the cases identified above, and does not necessarily apply to a later assignment.
Thus, we would need a special rule to allow us to check for validity without raising an exception. Many rules were considered, but all that work are complicated and limiting. Eventually, we decided that principle (8) was too hard to require (as opposed to allow).
Therefore, we added a note to 13.9.2 pointing out that exception handlers are required, even when explicit tests for validity are made. At least in this case, programs are not erroneous, so exceptions and validity tests can be handled without the program going off the rails. Moreover, implementations can support exception-free behavior in some cases if their users demand it.
This can be done by taking advantage of 13.9.1(12) (as modified by AI-167). It allows an implementation to assume that the result of an Unchecked_Conversion or imported function is valid, and makes any later use of an invalid value erroneous.
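Whether or not an implementation supports exception-free validity tests, a program can sidestep the question by receiving external data in a type for which every bit pattern is a value, which is the approach the appendix attributes to the Claw low-level bindings. The following is only a minimal sketch of that idea: the names Raw_Status and To_Status are hypothetical, and it assumes that the external encoding is simply the position number of the Status literal.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Full_Range_Example is
   type Status is (Off, Ready, On);

   --  Hypothetical full-range carrier type: all 256 bit patterns are
   --  values of the type, so an object of this type cannot be invalid.
   type Raw_Status is mod 2**8;

   --  Assumption: the external encoding is the position number of the
   --  Status literal. Anything else is mapped to Default.
   function To_Status (Raw : Raw_Status; Default : Status) return Status is
   begin
      if Raw <= Status'Pos (Status'Last) then
         return Status'Val (Integer (Raw));
      else
         return Default;
      end if;
   end To_Status;

begin
   Put_Line (Status'Image (To_Status (1, Default => Off)));    --  READY
   Put_Line (Status'Image (To_Status (200, Default => Off)));  --  OFF
end Full_Range_Example;
```

Because no object of type Status ever holds an unchecked bit pattern here, neither 'Valid nor an exception handler is needed; the cost is an explicit decoding step for each value that crosses the boundary.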
Implementations may want to restrict the cases where they support exception-free behavior, in order to avoid making programs less safe with a 'Valid check than without it (or worse, just making them unsafe generally). Consider the following example:
    with Ada.Text_IO; use Ada.Text_IO;
    with Ada.Unchecked_Conversion;

    procedure Example is
       type Status is (Off, Ready, On);
       S : Status := Off; -- Initialized to a valid value.
       function Convert is new Ada.Unchecked_Conversion (Integer, Status);
       A : array (Status) of Integer;

       procedure Get_Status is
       begin
          S := Convert(200); -- Imagine this value comes from the Internet.
          if not S'Valid then
             Put_Line("Bad status!!");
             -- Should have reset S here, but forgot.
          end if;
       end Get_Status;

    begin
       Get_Status;
       A(S) := 1000;
    end Example;
Because S is initialized and thus known to be valid, virtually all Ada compilers will omit the range check at A(S).
An implementation not taking advantage of 13.9.1(12) will see that the object S is one that is known to be valid, and thus will avoid de-initializing it by checking that the result of Convert is in fact a value of the subtype of S. Since it is not, either Program_Error or Constraint_Error will be raised at the point of the assignment to S. This should be relatively easy for the maintenance programmer to debug: the exception happens at the point of the error.
However, an implementation taking advantage of 13.9.1(12) will store the invalid value into S. The programmer tests its validity, but forgets to set it to a good value. The later use is erroneous by 13.9.1(12), so the program is now writing over memory it doesn't own. This is the sort of bug that gives C its well-deserved reputation.
The potential damage of such cases can be limited by restricting the application of 13.9.1(12). For instance, one possibility would be to apply it only to local variables.
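For comparison, here is a hedged sketch of how the example might be written so that it behaves sensibly on both kinds of implementation. It combines the 'Valid test with the exception handler recommended by the new note in 13.9.2, and it resets S on either path. The carrier type Raw_Byte, the size clause on Status, and the Raw parameter are assumptions added for illustration; they are not part of the example above.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Unchecked_Conversion;

procedure Checked_Example is
   type Status is (Off, Ready, On);
   for Status'Size use 8;  --  Assumption: match the size of Raw_Byte.

   type Raw_Byte is mod 2**8;  --  Hypothetical carrier for external data.

   function Convert is new Ada.Unchecked_Conversion (Raw_Byte, Status);

   S : Status := Off;  --  Initialized to a valid value.
   A : array (Status) of Integer := (others => 0);

   procedure Get_Status (Raw : Raw_Byte) is
   begin
      S := Convert (Raw);
      if not S'Valid then
         --  Path taken when the implementation stores the invalid value.
         Put_Line ("Bad status!!");
         S := Off;  --  The reset that the original example forgot.
      end if;
   exception
      when Constraint_Error | Program_Error =>
         --  Path taken when the implementation detects the invalid
         --  representation itself, as 13.9.1 permits.
         Put_Line ("Bad status (exception)!!");
         S := Off;
   end Get_Status;

begin
   Get_Status (200);
   A (S) := 1000;  --  S holds a valid value on either path.
   Put_Line (Status'Image (S));
end Checked_Example;
```

Which of the two messages appears is implementation dependent, but in both cases S is back to a valid value before it is used as an index.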
---
Notes (in homage to Steve Baird)
1) A.13(16) gives permission to omit scalar validity checks for input routines. Thus, it is unlikely that an implementation will check validity for sparse enumerations or floating point representations. If the check is omitted, the result can be invalid. Similarly, there is no requirement for stream attributes to check for validity.
2) The minutes mention Address_to_Access_Conversions as additional cases which should be abnormal. That doesn't seem necessary, since dereferencing of bad values is already covered by 13.9.1(13), and if they aren't dereferenced, nothing bad will happen.
3) It would be nice to get rid of abnormal composite objects when the constraints are all static and the components are all scalar or composites that meet this rule. These can't have any significant problems from I/O or importing, and it would be nice to be able to handle them without erroneousness. But it seems rather complicated, and it's too late.
4) We fix 13.9(11) to say that the result of an Unchecked_Conversion whose target type is scalar can have an invalid representation (rather than being abnormal). Only the bits of the result are implementation-defined. Scalar objects should only be abnormal if their assignments are disrupted or if they are a subcomponent of an abnormal composite object, neither of which applies to instances of Unchecked_Conversion.
!corrigendum 13.9(11)
Replace the paragraph:
Otherwise, the effect is implementation defined; in particular, the result can be abnormal (see 13.9.1).
by:
Otherwise, if the result type is scalar, the result of the function is implementation defined, and can have an invalid representation (see 13.9.1). If the result type is nonscalar, the effect is implementation defined; in particular, the result can be abnormal (see 13.9.1).
!corrigendum 13.9.1(6)
Replace the paragraph:
by:
For an imported object, it is the programmer's responsibility to ensure that the object remains in a normal state.
!corrigendum 13.9.2(12)
Insert after the paragraph:
20 X'Valid is not considered to be a read of X; hence, it is not an error to check the validity of invalid data.
the new paragraph:
22 The Valid attribute may be used to check the result of calling an instance of Unchecked_Conversion (or any other operation that can return invalid values). However, an exception handler should also be provided because implementations are permitted to raise Constraint_Error or Program_Error if they detect the use of an invalid representation (see 13.9.1).
!ACATS Test
It would be difficult (not impossible) to create ACATS C-Test(s) to check that either exceptions are raised or the program works. But they probably wouldn't have a lot of value.
!appendix

From: Randy Brukardt

How this AI came about.

I originally was assigned this AI when I discovered that stream attributes
were not properly covered by the standard when researching illegal cases of
tag use in Paris.

The problems with invalid representations appeared when I tried to figure out
how to revise the AARM notes associated with 13.9.1(12). This led me to
wondering whether AI-167 was intended to allow testing of validity without
raising exceptions, or whether it was intended just to eliminate erroneousness
(which would allow an implementation to omit the validity check).

Discussions with John (after false starts with Tucker and Pascal) led me to
believe the former was intended. But the rules do not come close to supporting
that, as outlined in the AI.

Since functions return objects in Ada 2006, 13.9.1(9) always applies. And
4.6(51) says that a constraint check is always performed for constrained
subtypes; there is no special case for when the subtypes happen to be the same.

I also wondered why this problem doesn't show up in Claw. The answer was
that all Claw low-level bindings use full-range types, such that no invalid
values exist. Thinking about that further led me to the solution given above.

Finally, in thinking about the original question, I wondered about scalar
values returned. Since A.13(16) gives a permission to omit validity checks
for Sequential_IO.Read and the like, what is the result if the check is omitted
and the value is invalid? The RM seems to assume that it is invalid without
any clear statement to that effect.

****************************************************************

From: Randy Brukardt
Sent: Never

Here is one idea from the original AI that deserves to be preserved, but not
in the AI itself.

---

The real problem with avoiding exceptions is requiring an invalid value
to be stored into an object that is otherwise assumed valid *and* is going to
live for a while. If we only wanted to make Unchecked_Conversion and imported
functions work, a better way to solve this would be with a new procedure
attribute:
    T'Valid_Copy (Target : in out T'Base; Source : in T'Base;
       Is_Valid : out Boolean);
The second parameter would have a notwithstanding rule that says we are not
evaluating the value of the actual parameter, even though it looks like it.
(It would not trigger 13.9.1(9)). The validity of Source would be checked, and
the result put into Is_Valid. If the value is valid, Source is copied into
Target. Otherwise, Target is unchanged.

This would avoid any problems with future uses of the value, as it would only
be assigned if it is valid. Moreover, any range checks needed would happen
normally (if Target had a subtype other than T).

Unfortunately, this wouldn't work for the Out parameters of T'Read and the
various Read routines. Those would still raise exceptions before they could
be tested.

****************************************************************

