Version 1.1 of ais/ai-00373.txt

Unformatted version of ais/ai-00373.txt version 1.1
Other versions for file ais/ai-00373.txt

!standard 03.03.01(20)          04-02-05 AI95-00373/00
!class binding interpretation 04-02-05
!status received 04-01-17
!priority Low
!difficulty Hard
!subject Undefined discriminants caused by loose order of init requirements
!summary
!question
!recommendation
!wording
!discussion
--!corrigendum
!example
!ACATS test
!appendix

!subject Undefined discriminants caused by loose order of init requirements
!reference RM95-3.3.1(20)
!from Bob Duff
!problem

The rules of 3.3.1(20) are too lax -- they allow one to refer to
uninitialized discriminants.  Here's an example:

    type String_Ptr is access all String;
    type Rec(Name: String_Ptr) is limited private;

    function Init(X: access Rec) return Boolean;

    type Rec(Name: String_Ptr) is limited
        record
            Comp: Boolean := Init(Rec'Access);
        end record;

    function Init(X: access Rec) return Boolean is
    begin
        Put_Line(X.all.Name.all); -- Is this erroneous?
        return True;
    end Init;

    Thing: Rec(Name => new String'("Thing"));

This example is similar to one that came from real code (written by me).
It looks silly here, but I thought I had good reasons...

Anyway, the AdaMagic compiler generated code to call Init *before*
initializing Thing.Name.  Init thus refers to that discriminant in its
raw undefined state.  This bad behavior does not seem to be forbidden by
the RM.  GNAT apparently chose a better order.

We plan to fix our compiler (in fact, I think Tuck already did so),
but it seems like this is a hole in the RM.  You're not supposed to be
able to get at undefined discriminants without using chapter-13-ish
features.

The problem here is that 3.3.1(20) talks about direct references to
discriminants, forgetting that one can get at the discriminants via a
pointer as shown above.  We should probably add something like "All
discriminants are initialized before evaluating any expression
containing the name of the current instance..."

The alternative would be to declare the above erroneous.  I don't much
like that -- after all, I tripped over this by accident, and I wasn't
messing around with low-level chap-13 junk.

P.S. I'm impressed that Tucker tracked down the cause of this bug
quickly.  It occurred in a 150,000-line program.

****************************************************************

From: Tucker Taft
Sent: Saturday, January 17, 2004  4:34 PM

Discriminants aren't the only problem.  You have access to
the whole record.  Various pointer fields might have stack
junk in them, and derefencing them would presumably be erroneous.
To be really safe, we would have to initialize all pointer fields
to null before evaluating any default initial expressions that
involve enclosing-rec'access.  But what about pointers that
are of a "not null" subtype?  They can presumably be assumed to be
non-null under normal circumstances, so no access-check is required
before derefencing them.  And what about plain old integer fields
that are supposedly, say, integer range 1..100?  Do we have to be
sure they are preeinitialized to some in-range value before
passing blah'access?

The overall implications are pretty worrisome.  Java "solves"
this problem by first initializing everything to 0, null, etc,
and then doing further initialization.  But as indicated above,
that doesn't solve the problem for Ada because a zero-ish value
is not reasonable for all subtypes.

I think we almost have to say derefencing a value produced by
passing enclosing_rec'Access may lead to erroneous execution
if the dereference occurs prior to the completion of default
initialization.  We could special-case discriminants, but I'm
not convinced that is really doing the programmer a big favor.
They have to treat these access values with "kid" gloves in
any case.  Unfortunately, I can't think of a way to protect
against the problem, short of disallowing the use of
enclosing_rec'access as an actual parameter in a function call in
a component default initial expression.

> ...
> The alternative would be to declare the above erroneous.  I don't much
> like that -- after all, I tripped over this by accident, and I wasn't
> messing around with low-level chap-13 junk.

Unfortunately discriminants are only the tip of the iceberg...

> P.S. I'm impressed that Tucker tracked down the cause of this bug
> quickly.  It occurred in a 150,000-line program.

Aw shucks.

****************************************************************

From: Robert A. Duff
Sent: Sunday, January 18, 2004  11:37 PM

> Discriminants aren't the only problem.  You have access to
> the whole record.

Yes, I see that now, but I still think it's a good idea to special-case
discriminants.  3.3.1(20) already goes to some trouble to make sure you
can't refer to discriminants before they've been initialized.  Other
components seem less worrisome, somehow.

I'm not sure I believe the above argument...

If I were the boss, initialization would happen in order (textual order
of the component declarations), for all record types.  I doubt that's
going to fly. ;-)
At least the user could know which components must be initialized.
Maybe we could add such a rule, but only for records where 'Access (or
'Unchecked_Accessed) is used in a potentially damaging way.

We certainly don't want to make my example illegal.  The main reason for
passing the 'Access was to create two records pointing at each other.
I.e., two record types declared in a decoupled way that actually
represent a single concept in the programmer's mind.  That seems like
a legitimate thing to do.  The problem was that in addition to saving
that pointer away, I took a quick peek at one of the discriminants.
We can't tell at compile time when that's going to happen.

****************************************************************

From: Gary Dismukes
Sent: Monday, January 19, 2004  4:11 PM

> Yes, I see that now, but I still think it's a good idea to special-case
> discriminants.  3.3.1(20) already goes to some trouble to make sure you
> can't refer to discriminants before they've been initialized.  Other
> components seem less worrisome, somehow.

3.3.1(20) goes to some trouble, but it seems that the current wording
doesn't even handle the cases of discriminant defaults properly,
or at least the wording is sloppy, because the last sentence still
allows component and discriminant assignments to happen after all
of the evaluations.  In any case, I agree there are problems here,
and it would be nice to fix the wording to address more cases.

> If I were the boss, initialization would happen in order (textual order
> of the component declarations), for all record types.  I doubt that's
> going to fly. ;-)
> At least the user could know which components must be initialized.
> Maybe we could add such a rule, but only for records where 'Access (or
> 'Unchecked_Accessed) is used in a potentially damaging way.

I think it would be reasonable to add additional constraints on
evaluation and initialization along the lines of what's already
done for delaying Initialize calls for controlled components
constrained by per-object expressions as specified in 7.6(12).
The AARM says:

  12.b   The fact that Initialize is done for components with access
  discriminants after other components allows the Initialize operation
  for a component with a self-referential access discriminant to assume
  that other components of the enclosing object have already been
  properly initialized.  For multiple such components, it allows some
  predictability.

The case of normal component initializations seems similar to this
when per-object expressions are involved in defaults.  So it would
seem reasonable to add rules requiring components initialized by
these special expressions to delay evaluation until earlier components
that don't involve per-object expressions have all been initialized.

****************************************************************

Questions? Ask the ACAA Technical Agent