Version 1.12 of ais/ai-00373.txt
!standard 03.03.01(08) 05-10-12 AI95-00373/08
!standard 03.03.01(18/1)
!standard 03.03.01(19)
!standard 03.03.01(20)
!standard 04.08(07)
!standard 04.08(08)
!standard 04.08(10)
!standard 07.06(10)
!standard 07.06(11)
!class binding interpretation 04-02-05
!status Amendment 200Y 04-12-02
!status WG9 Approved 06-06-09
!status ARG Approved 9-0-1 04-11-21
!status work item 04-06-07
!status received 04-01-17
!priority Low
!difficulty Hard
!subject Undefined discriminants caused by loose order of init requirements
!summary
The "arbitrary order" of component initialization referred to in 3.3.1(20)
gives the implementation too much freedom. In particular, it allows the
implementation to choose an ordering which leads to problems with
uninitialized discriminants, uninitialized access values, etc. This problem
is reduced, but not eliminated, by imposing additional restrictions on the
order that an implementation may choose.
"Initialized by default" is defined, and it is used to define when
Initialize is called. (It will also be used in other AIs to define the
initialization of other kinds of objects.)
!question
The rules of 3.3.1(20) are too lax -- they allow one to refer to
uninitialized discriminants, uninitialized access values, etc.
Was this intended? (No.)
We want Initialize always to be called for an object without an initialization
expression (other than aggregates), but this is done on a case-by-case basis.
Wouldn't it be better to do this with a global rule? (Yes.)
!recommendation
(See summary.)
!wording
Insert after 3.3.1(8):
A component of an object is said to require late initialization
if it has an access discriminant value constrained by a per-object
expression, or if it has an initialization expression that includes a name
denoting the current instance of the type or denoting an access discriminant.
Replace 3.3.1(18/1 - 20) with
3. The object is created, and, if there is not an initialization expression,
the object is initialized by default. When an object is initialized by
default, any per-object constraints (see 3.8) are elaborated and any implicit initial
values for the object or for its subcomponents are obtained as determined by
the nominal subtype. Any initial values (whether explicit or implicit) are
assigned to the object or to the corresponding subcomponents. As described
in 5.2 and 7.6, Initialize and Adjust procedures can be called.
For the third step above, evaluations and assignments are
performed in an arbitrary order subject to the following restrictions:
- Assignment to any part of the object is preceded
by the evaluation of the value that is to be assigned.
- The evaluation of a default_expression that includes the name of
a discriminant is preceded by the assignment to that discriminant.
- The evaluation of the default_expression for any component that
depends on a discriminant is preceded by the assignment to that
discriminant.
- The assignments to any components, including implicit components,
not requiring late initialization must precede the initial value
evaluations for any components requiring late initialization; if two
components both require late initialization, then assignments to parts
of the component occurring earlier in the order of the component
declarations must precede the initial value evaluations of the
component occurring later.
Replace 4.8(7-10) with
For the evaluation of an initialized allocator, the
evaluation of the qualified_expression is performed first. An
object of the designated type is created and the value of the
qualified_expression is converted to the designated subtype
and assigned to the object.
For the evaluation of an uninitialized allocator, the
elaboration of the subtype_indication is performed first. Then:
If the designated type is elementary, an object of the
designated subtype is created and any implicit initial value
is assigned;
If the designated type is composite, an object of the
designated type is created with tag, if any, determined by
the subtype_mark of the subtype_indication. This object is then
initialized by default (see 3.3.1) using the subtype_indication
to determine its nominal subtype. A check is made that the value of the
object belongs to the designated subtype. Constraint_Error is
raised if this check fails. This check and the initialization
of the object are performed in an arbitrary order.
Replace 7.6(10-11) with
During the elaboration or evaluation of a construct that causes an object to
be initialized by default, for every controlled subcomponent of the object
that is not assigned an initial value (as defined in 3.3.1), Initialize is
called on that subcomponent. Similarly, if the object that is initialized by
default as a whole is controlled, Initialize is called on the object.
For an extension_aggregate whose ancestor_part is a subtype_mark denoting
a controlled subtype, the Initialize procedure of the
ancestor type is called, unless that Initialize procedure is abstract.
!example
procedure Test is
type Inner;
type Outer;
function Flag_Init (X : access Inner) return Boolean;
type Inner (Discrim : access Outer) is limited
record
Flag : Boolean := Flag_Init (Inner'access);
end record;
type Type_With_Lots_Of_Interesting_Components is
--
--
... ;
type Outer is limited
record
F1 : Inner (Outer'access);
F2 : Type_With_Lots_Of_Interesting_Components;
F3 : Inner (Outer'access);
end record;
procedure Do_All_Sorts_Of_Things
(Interesting : in out
Type_With_Lots_Of_Interesting_Components) is separate;
--
--
function Flag_Init (X : access Inner) return Boolean is
begin
Do_All_Sorts_Of_Things (X.Discrim.all.F2);
return True;
end Flag_Init;
Problematic : Outer;
--
--
--
--
procedure Do_Something is separate;
begin
Do_Something;
end Test;
!discussion
An implementation is currently given too much freedom in choosing the order in
which components are initialized. The problem is illustrated by the preceding
example (see also Bob Duff's initial description of the problem in the
Appendix).
This proposal does not completely solve the problem of evaluating uninitialized
components (discriminants, access values, tasks, etc.), but it greatly
reduces the chances of inadvertently introducing such a problem, either by
writing new Ada source or by compiling existing source with another compiler.
The problem cannot be completely solved because, if a type has two components
which both "require late initialization", then necessarily
one of them is going to be initialized first and problems may arise if
this initialization involves evaluation of the second component (or any part
thereof). Portability is improved by nailing down the order in which such
components are initialized.
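As a rough, self-contained illustration of the ordering discussed above, the
following sketch declares one ordinary component between two components that
require late initialization. All names in it (Late_Init_Sketch, Peek, Owner,
Seen, Count, First, Second) are hypothetical and are not part of this AI; the
value 42 is likewise an arbitrary assumption.

```ada
with Ada.Text_IO;

procedure Late_Init_Sketch is
   type Inner;
   type Outer;

   --  Hypothetical helper: reads a component of the enclosing Outer object
   --  while that object is still being initialized by default.
   function Peek (X : access Inner) return Natural;

   type Inner (Owner : access Outer) is limited
      record
         --  Names the current instance, so Seen requires late initialization.
         Seen : Natural := Peek (Inner'Access);
      end record;

   type Outer is limited
      record
         --  First and Second have an access discriminant constrained by a
         --  per-object expression, so both require late initialization.
         First  : Inner (Outer'Access);
         --  Count does not require late initialization.
         Count  : Natural := 42;
         Second : Inner (Outer'Access);
      end record;

   function Peek (X : access Inner) return Natural is
   begin
      return X.Owner.Count;
   end Peek;

   Obj : Outer;  --  no initialization expression: initialized by default
begin
   Ada.Text_IO.Put_Line (Natural'Image (Obj.First.Seen));
   Ada.Text_IO.Put_Line (Natural'Image (Obj.Second.Seen));
end Late_Init_Sketch;
```

Under the revised rules, the assignment to Count must precede the initial
value evaluations for First and Second, so both calls to Peek are expected to
read 42 even though Count is declared after First. If Peek instead read
X.Owner.Second while First was being initialized, the proposal would not help:
First precedes Second in declaration order, so Second would still be
uninitialized at that point.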
Technical notes:
1) A discriminant never requires late initialization because RM 8.3(17) implies
that the current instance of a type cannot be named in the discriminant
part of the type.
2) The reference to "the order of the component declarations" is intended to
echo 7.6(12).
3) Consider:
type T (D : Some_Discrete_Type) is
limited record
F1 : T1 (T'access);
F2 : T2 (D) ;
F3 : T3 := F (D);
end record;
Here, F2 and F3 do not "require late initialization".
We want F1 initialized last in this case. Evaluation of a scalar
discriminant does not cause a component to "require late initialization".
4) Consider:
type Tt (Dd : Some_Type := Some_Value) is
record
Ff : Some_Type := Dd;
end record;
The Ada 95 Standard wording of 3.3.1(20) allows the following (bad) order:
a) Evaluate the initial value of the Dd component.
b) Evaluate the initial value of the Ff component.
c) Assign the value computed in step a to the Dd component.
d) Assign the value computed in step b to the Ff component.
The problem is that step b involves the evaluation of a component which
is not initialized until step c. No reasonable implementation would
choose this order, but the language definition shouldn't rely on that.
5) The former steps #3 and #4 (i.e. 3.3.1(18/1 and 19)) are now merged into
one step. It did not make sense to leave them as separate steps
because 3.3.1(15) explicitly states that the steps are to be performed
sequentially.
6) It does not need to be stated explicitly that the
evaluation-before-initialization cases that this proposal does not prevent
(i.e. cases involving two or more components which require late initialization)
result in erroneous execution. This is already a consequence of existing
rules (13.9.1(8) and the first sentence of 13.9.1(4), particularly the
clause "and any explicit or default initializations have been performed").
Still, an AARM note might be appropriate.
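Technical note 4 above can be made concrete with a small sketch. It assumes
Natural for Some_Type and 3 for Some_Value; both choices, and the name
Note_4_Sketch, are illustrative assumptions rather than anything taken from
this AI.

```ada
with Ada.Text_IO;

procedure Note_4_Sketch is
   --  Assumption: Natural stands in for Some_Type and 3 for Some_Value.
   type Tt (Dd : Natural := 3) is
      record
         Ff : Natural := Dd;  --  default_expression names discriminant Dd
      end record;

   Obj : Tt;  --  no initialization expression: initialized by default
begin
   Ada.Text_IO.Put_Line
     ("Dd =" & Natural'Image (Obj.Dd) & ", Ff =" & Natural'Image (Obj.Ff));
end Note_4_Sketch;
```

With the revised 3.3.1(20), the evaluation of the default_expression for Ff
must be preceded by the assignment to Dd, so Ff is expected to equal Dd. The
old wording permitted the order a, b, c, d listed in note 4, in which Ff's
default would be computed from a discriminant that had not yet been assigned.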
!corrigendum 3.3.1(8)
Insert after the paragraph:
The subtype_indication or full type definition of an
object_declaration defines the nominal subtype of the object. The
object_declaration declares an object of the type of the nominal subtype.
the new paragraph:
A component of an object is said to require late initialization
if it has an access discriminant value constrained by a per-object
expression, or if it has an initialization expression that includes a name
denoting the current instance of the type or denoting an access discriminant.
!corrigendum 3.3.1(18/1)
Replace the paragraph:
3. The object is created, and, if there is not an initialization
expression, any per-object constraints (see 3.8) are elaborated and any
implicit initial values for the object or for its subcomponents are obtained
as determined by the nominal subtype.
by:
3. The object is created, and, if there is not an initialization
expression, the object is initialized by default. When an object is
initialized by default, any per-object constraints (see 3.8) are elaborated and
any implicit initial values for the object or for its subcomponents are
obtained as determined by the nominal subtype. Any initial values (whether
explicit or implicit) are assigned to the object or to the corresponding
subcomponents. As described in 5.2 and 7.6, Initialize and Adjust procedures
can be called.
!corrigendum 3.3.1(19)
Delete the paragraph:
4. Any initial values (whether explicit or implicit) are assigned
to the object or to the corresponding subcomponents. As described in 5.2 and
7.6, Initialize and Adjust procedures can be called.
!corrigendum 3.3.1(20)
Replace the paragraph:
For the third step above, the object creation and any elaborations and
evaluations are performed in an arbitrary order, except that if the
default_expression for a discriminant is evaluated to obtain its initial
value, then this evaluation is performed before that of the
default_expression for any component that depends on the discriminant, and
also before that of any default_expression that includes the name of the
discriminant. The evaluations of the third step and the assignments of the
fourth step are performed in an arbitrary order, except that each evaluation is
performed before the resulting value is assigned.
by:
For the third step above, evaluations and assignments are
performed in an arbitrary order subject to the following restrictions:
- Assignment to any part of the object is preceded
by the evaluation of the value that is to be assigned.
- The evaluation of a default_expression that includes the name of
a discriminant is preceded by the assignment to that discriminant.
- The evaluation of the default_expression for any component that
depends on a discriminant is preceded by the assignment to that
discriminant.
- The assignments to any components, including implicit components,
not requiring late initialization must precede the initial value
evaluations for any components requiring late initialization; if two
components both require late initialization, then assignments to parts
of the component occurring earlier in the order of the component
declarations must precede the initial value evaluations of the
component occurring later.
!corrigendum 4.8(07)
Replace the paragraph:
For the evaluation of an allocator, the elaboration of the
subtype_indication or the evaluation of the qualified_expression is
performed first. For the evaluation of an initialized allocator, an object of
the designated type is created and the value of the qualified_expression
is converted to the designated subtype and assigned to the object.
by:
For the evaluation of an initialized allocator, the
evaluation of the qualified_expression is performed first. An
object of the designated type is created and the value of the
qualified_expression is converted to the designated subtype
and assigned to the object.
!corrigendum 4.8(08)
Replace the paragraph:
For the evaluation of an uninitialized allocator:
by:
For the evaluation of an uninitialized allocator, the
elaboration of the subtype_indication
is performed first. Then:
!corrigendum 4.8(10)
Replace the paragraph:
- If the designated type is composite, an object of the designated type
is created with tag, if any, determined by the subtype_mark of the
subtype_indication; any per-object constraints on subcomponents are
elaborated (see 3.8) and any implicit initial values for the subcomponents
of the object are obtained as determined by the subtype_indication
and assigned to the corresponding subcomponents. A check is made that the
value of the object belongs to the designated subtype. Constraint_Error is
raised if this check fails. This check and the initialization of the object are
performed in an arbitrary order.
by:
- If the designated type is composite, an object of the
designated type is created with tag, if any, determined by
the subtype_mark of the subtype_indication. This object is
then initialized by default (see 3.3.1) using the subtype_indication to
determine its nominal subtype. A check is made that the value of the
object belongs to the designated subtype. Constraint_Error is
raised if this check fails. This check and the initialization
of the object are performed in an arbitrary order.
!corrigendum 7.6(10)
Replace the paragraph:
During the elaboration of an object_declaration, for every controlled
subcomponent of the object that is not assigned an initial value (as defined in
3.3.1), Initialize is called on that subcomponent. Similarly, if the object as
a whole is controlled and is not assigned an initial value, Initialize is
called on the object. The same applies to the evaluation of an allocator,
as explained in 4.8.
by:
During the elaboration or evaluation of a construct that causes an object to
be initialized by default, for every controlled subcomponent of the object
that is not assigned an initial value (as defined in 3.3.1), Initialize is
called on that subcomponent. Similarly, if the object that is initialized by
default as a whole is controlled, Initialize is called on the object.
!corrigendum 7.6(11)
Replace the paragraph:
For an extension_aggregate whose ancestor_part is a subtype_mark,
for each controlled subcomponent of the ancestor part, either Initialize is
called, or its initial value is assigned, as appropriate; if the type of the
ancestor part is itself controlled, the Initialize procedure of the ancestor
type is called, unless that Initialize procedure is abstract.
by:
For an extension_aggregate whose ancestor_part is a subtype_mark
denoting a controlled subtype, the Initialize procedure
of the ancestor type is called, unless that Initialize procedure is abstract.
!ACATS test
Create an ACATS test to check that cases like the example evaluate in the
correct order.
!appendix
!subject Undefined discriminants caused by loose order of init requirements
!reference RM95-3.3.1(20)
!from Bob Duff
!problem
The rules of 3.3.1(20) are too lax -- they allow one to refer to
uninitialized discriminants. Here's an example:
type String_Ptr is access all String;
type Rec(Name: String_Ptr) is limited private;
function Init(X: access Rec) return Boolean;
type Rec(Name: String_Ptr) is limited
record
Comp: Boolean := Init(Rec'Access);
end record;
function Init(X: access Rec) return Boolean is
begin
Put_Line(X.all.Name.all); -- Is this erroneous?
return True;
end Init;
Thing: Rec(Name => new String'("Thing"));
This example is similar to one that came from real code (written by me).
It looks silly here, but I thought I had good reasons...
Anyway, the AdaMagic compiler generated code to call Init *before*
initializing Thing.Name. Init thus refers to that discriminant in its
raw undefined state. This bad behavior does not seem to be forbidden by
the RM. GNAT apparently chose a better order.
We plan to fix our compiler (in fact, I think Tuck already did so),
but it seems like this is a hole in the RM. You're not supposed to be
able to get at undefined discriminants without using chapter-13-ish
features.
The problem here is that 3.3.1(20) talks about direct references to
discriminants, forgetting that one can get at the discriminants via a
pointer as shown above. We should probably add something like "All
discriminants are initialized before evaluating any expression
containing the name of the current instance..."
The alternative would be to declare the above erroneous. I don't much
like that -- after all, I tripped over this by accident, and I wasn't
messing around with low-level chap-13 junk.
P.S. I'm impressed that Tucker tracked down the cause of this bug
quickly. It occurred in a 150,000-line program.
****************************************************************
From: Tucker Taft
Sent: Saturday, January 17, 2004 4:34 PM
Discriminants aren't the only problem. You have access to
the whole record. Various pointer fields might have stack
junk in them, and dereferencing them would presumably be erroneous.
To be really safe, we would have to initialize all pointer fields
to null before evaluating any default initial expressions that
involve enclosing-rec'access. But what about pointers that
are of a "not null" subtype? They can presumably be assumed to be
non-null under normal circumstances, so no access-check is required
before dereferencing them. And what about plain old integer fields
that are supposedly, say, integer range 1..100? Do we have to be
sure they are preinitialized to some in-range value before
passing blah'access?
The overall implications are pretty worrisome. Java "solves"
this problem by first initializing everything to 0, null, etc,
and then doing further initialization. But as indicated above,
that doesn't solve the problem for Ada because a zero-ish value
is not reasonable for all subtypes.
I think we almost have to say dereferencing a value produced by
passing enclosing_rec'Access may lead to erroneous execution
if the dereference occurs prior to the completion of default
initialization. We could special-case discriminants, but I'm
not convinced that is really doing the programmer a big favor.
They have to treat these access values with "kid" gloves in
any case. Unfortunately, I can't think of a way to protect
against the problem, short of disallowing the use of
enclosing_rec'access as an actual parameter in a function call in
a component default initial expression.
> ...
> The alternative would be to declare the above erroneous. I don't much
> like that -- after all, I tripped over this by accident, and I wasn't
> messing around with low-level chap-13 junk.
Unfortunately discriminants are only the tip of the iceberg...
> P.S. I'm impressed that Tucker tracked down the cause of this bug
> quickly. It occurred in a 150,000-line program.
Aw shucks.
****************************************************************
From: Robert A. Duff
Sent: Sunday, January 18, 2004 11:37 PM
> Discriminants aren't the only problem. You have access to
> the whole record.
Yes, I see that now, but I still think it's a good idea to special-case
discriminants. 3.3.1(20) already goes to some trouble to make sure you
can't refer to discriminants before they've been initialized. Other
components seem less worrisome, somehow.
I'm not sure I believe the above argument...
If I were the boss, initialization would happen in order (textual order
of the component declarations), for all record types. I doubt that's
going to fly. ;-)
At least the user could know which components must be initialized.
Maybe we could add such a rule, but only for records where 'Access (or
'Unchecked_Access) is used in a potentially damaging way.
We certainly don't want to make my example illegal. The main reason for
passing the 'Access was to create two records pointing at each other.
I.e., two record types declared in a decoupled way that actually
represent a single concept in the programmer's mind. That seems like
a legitimate thing to do. The problem was that in addition to saving
that pointer away, I took a quick peek at one of the discriminants.
We can't tell at compile time when that's going to happen.
****************************************************************
From: Gary Dismukes
Sent: Monday, January 19, 2004 4:11 PM
> Yes, I see that now, but I still think it's a good idea to special-case
> discriminants. 3.3.1(20) already goes to some trouble to make sure you
> can't refer to discriminants before they've been initialized. Other
> components seem less worrisome, somehow.
3.3.1(20) goes to some trouble, but it seems that the current wording
doesn't even handle the cases of discriminant defaults properly,
or at least the wording is sloppy, because the last sentence still
allows component and discriminant assignments to happen after all
of the evaluations. In any case, I agree there are problems here,
and it would be nice to fix the wording to address more cases.
> If I were the boss, initialization would happen in order (textual order
> of the component declarations), for all record types. I doubt that's
> going to fly. ;-)
> At least the user could know which components must be initialized.
> Maybe we could add such a rule, but only for records where 'Access (or
> 'Unchecked_Access) is used in a potentially damaging way.
I think it would be reasonable to add additional constraints on
evaluation and initialization along the lines of what's already
done for delaying Initialize calls for controlled components
constrained by per-object expressions as specified in 7.6(12).
The AARM says:
12.b The fact that Initialize is done for components with access
discriminants after other components allows the Initialize operation
for a component with a self-referential access discriminant to assume
that other components of the enclosing object have already been
properly initialized. For multiple such components, it allows some
predictability.
The case of normal component initializations seems similar to this
when per-object expressions are involved in defaults. So it would
seem reasonable to add rules requiring components initialized by
these special expressions to delay evaluation until earlier components
that don't involve per-object expressions have all been initialized.
****************************************************************
From: Randy Brukardt
Sent: Monday, June 7, 2004 7:40 PM
The AI doesn't contain a summary, question, or recommendation.
Recommendation can be "(See summary.)", but the other two are required.
[Editor's note: this is about version /01 of the AI.]
The last paragraph of the new wording refers to the "fourth step", but there
is no fourth step anymore.
I think that Note 6 needs to be stated somewhere, at least in the AARM. But
it really is something users ought to know, so explicit mention in the
Standard would be a good idea.
****************************************************************
From: Stephen W Baird
Sent: Tuesday, June 8, 2004 11:52 PM
> I think that Note 6 needs to be stated somewhere, at least in the AARM. But
> it really is something users ought to know, so explicit mention in the
> Standard would be a good idea.
I agree that it belongs at least in the AARM.
I'm not sure it belongs in the standard because it seems that 13.9.1 already
covers this. Note the words "and any explicit or default initializations have
been performed" in 13.9.1(4).
****************************************************************
From: Bob Duff
Sent: Monday, June 6, 2005 12:33 PM
I'm using draft 11.8 of the [A]ARM.
3.3.1(18.a) says:
18.a Discussion: For a per-object constraint that contains some
per-object expressions and some non-per-object expressions, the
values used for the constraint consist of the values of the
non-per-object expressions evaluated at the point of the
type_declaration, and the values of the per-object expressions
evaluated at the point of the creation of the object.
Is this still correct, given new rules about per-object constraints?
****************************************************************
From: Tucker Taft
Sent: Monday, June 6, 2005 4:33 PM
I would say it is still true. We might have changed the
order in which things are done, but the notion of what
is a "per-object constraint" hasn't changed.
****************************************************************
From: Pascal Leroy
Sent: Tuesday, June 7, 2005 3:41 AM
I agree.
****************************************************************
From: Bob Duff
Sent: Monday, June 6, 2005 12:34 PM
I'm using draft 11.8 of the [A]ARM.
3.3.1(20.h/2):
20.h/2 It is possible for there to be more than one component that
requires late initialization. In this case, the language can't
prevent problems, because all of the components can't be the last
one initialized. In this case, we specify the order of
initialization for components requiring late initialization; by
doing so, programmers can arrange their code to avoid accessing
uninitialized components, and such arrangements are portable. Note
that if the program accesses an uninitialized component, 13.9.1
defines the execution to be erroneous.
^^^^^^^^^
Shouldn't that be "bounded error"?
****************************************************************
From: Stephen W Baird
Sent: Monday, June 6, 2005 2:54 PM
> Shouldn't that be "bounded error"?
No; I think that "erroneous" is correct here.
13.9.1(4) states that an object is not guaranteed to be normal until "any
explicit or default initializations have been performed".
13.9.1(8) states that "it is erroneous to evaluate ... an abnormal
object".
Note that we're not just talking about invalid scalars here - we're
talking about uninitialized access values, tasks, discriminants, etc.
****************************************************************
From: Pascal Leroy
Sent: Tuesday, June 7, 2005 3:25 AM
> Shouldn't that be "bounded error"?
I believe it is definitely "erroneous", since we are talking about accessing
discriminants before they have been initialized and other very nasty
things here.
****************************************************************