AI22-0062-1

!standard 7.6.1(11)                                   23-03-21  AI22-0062-1/01

!class ramification 23-03-21

!status work item 23-03-21

!status received 23-03-14

!priority Low

!difficulty Easy

!subject Clarify “ceases to exist” definition

!summary

Clarify that an assignment which changes a discriminant of a variable can cause subcomponents of the object to come into existence or cease to exist.

!issue

The present wording is not as clear as it should be that a discriminant-dependent subcomponent of an object may “cease to exist” while the enclosing object continues to exist, although the intent is obvious.

Clarification of this point is needed in order to ensure that the 13.11.2 rule about "Evaluating a name that denotes a nonexistent object" (and the similar rule in 13.9.1) applies in the case of a reference to a component or subcomponent which no longer “exists” (in the informal English sense of the word).

Is execution of the following example erroneous? (Yes.)

   package Inst is new Address_To_Access_Conversions (T);

   type Optional_T (Has_T : Boolean := False) is
   record
      case Has_T is
         when False => null;
         when True => T_Component : aliased T;
      end case;
   end record;

   X : Optional_T := (Has_T => True, T_Component => ...);
   Ref : Inst.Object_Pointer := Inst.To_Pointer (X.T_Component'Address);
begin

   X := (Has_T => False);
   Do_Stuff (Ref.all); -- erroneous name evaluation

!recommendation

A “To Be Honest” note suffices here.  In particular, we do not need to explicitly enumerate which components come into existence and which ones go away when an assignment modifies the value(s) of an enclosing object’s discriminant(s).

!wording

Add after 7.6.1(11):

   To Be Honest:
   An assignment that changes the value of a discriminant of an unconstrained
   variable can cause some subcomponents of the variable to cease to exist
   and others to come into existence (for example, by changing the active
   variant of a variant part or the bounds of an array component).

!discussion

Existing legality rules already prevent constructing a dangling reference to a component by means of the Access or Unchecked_Access attributes. However, the example given here uses System.Address_To_Access_Conversions to demonstrate that there is a hole here that ought to be plugged.

This change should have no impact on any implementation; it just confirms that the Ada term “ceases to exist” is consistent with its informal English meaning.
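
For contrast with the example in the !issue section, the following is a minimal sketch (not part of the proposed wording) of the two paths that are already checked. It assumes Integer in place of T; the names Variant_Check_Demo, Optional_Int, Has_Value, Value, and Read are hypothetical. Taking 'Access of the component is rejected at compile time by 3.10.2(26), and selecting the component by name after the discriminant has changed fails the check of 4.1.3(15). Only the Address_To_Access_Conversions route escapes both.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Variant_Check_Demo is

   --  Integer stands in for the type T of the !issue example (assumption).
   type Optional_Int (Has_Value : Boolean := False) is
   record
      case Has_Value is
         when False => null;
         when True  => Value : aliased Integer;
      end case;
   end record;

   --  No discriminant constraint is given, so X is unconstrained and a
   --  whole-object assignment may change Has_Value.
   X : Optional_Int := (Has_Value => True, Value => 42);

   --  Illegal if uncommented: X.Value depends on a discriminant of an
   --  object that is not known to be constrained (3.10.2(26)).
   --  P : access Integer := X.Value'Access;

   --  Selecting Value performs the discriminant check of 4.1.3(15).
   function Read (Obj : Optional_Int) return Integer is (Obj.Value);

begin
   Put_Line ("Before:" & Integer'Image (Read (X)));

   X := (Has_Value => False);   --  X.Value ceases to exist here

   begin
      Put_Line ("After:" & Integer'Image (Read (X)));
   exception
      when Constraint_Error =>
         Put_Line ("Constraint_Error: X no longer has a Value component");
   end;
end Variant_Check_Demo;
```

With checks enabled, the second call to Read is expected to raise Constraint_Error rather than read stale memory, which is why a reference by name is not a concern for this AI.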

!example

See !Issue section.

!ACATS test

We don’t write ACATS tests to test erroneous execution.

!appendix

From: Randy Brukardt

Sent: Tuesday, March 21, 2023  4:00 AM

Steve Baird provided the AI now numbered AI22-0062-1. (It can be found in the usual Google Docs place now; it will get to Ada-Auth.org in a few days.)

I'm adding my notes for discussion here as I don't believe that the AI as constructed properly answers the question. If this was a small change, I probably would have made it to the AI, but as the AI class would need to change, along with the entire discussion, it would have changed the AI from Steve's submission. Additionally, I raised these concerns when Steve originally asked me about the question, so I presume that their omission was intentional (or possibly Tuck and Steve just got tired of reading my long-winded responses? ;-). Finally, since this is mainly a question of wording, it is not of much interest to the general public, so I didn't think giving it a Github Issue made much sense either. Thus I have put it here and we can discuss it when this AI comes up at a meeting.

Anyway, the !issue of the AI explains the problem pretty well. The reason it matters is that it is "cease to exist" that makes later accesses to an object erroneous. Once an object no longer exists, its memory can be recovered. If we have a way that an object continues to exist, then access to it is not erroneous and the memory cannot be reused.

My objection is to the !recommendation of the AI. It says "A 'To Be Honest' note suffices here." But no reason for that opinion is given either in the !recommendation or in the !discussion. The only point given is that we do not need (or want) to explicitly state which components come into existence and which go away, which makes some sense as that should already be defined elsewhere, but why that justifies giving no wording at all is mysterious at best.

During our initial private discussion of this point, Tucker made the point that 7.6.1(4) says we finalize objects that "still exist" and that 7.6.1(9/3) says that finalization only happens for components an object (currently) has. He seemed to think this was sufficient to explain the rules. I did not find rules about finalization to be compelling, since finalization is intentionally disconnected from memory management. In particular, one can access previously finalized objects without one's execution becoming erroneous -- that only happens once the objects cease to exist. Moreover, the "currently has" part of 7.6.1(9/3) is only found in an AARM To Be Honest note, and that note only talks about *extra* components, not ones that have disappeared. So here, we're basing the idea that we only need a To Be Honest note on a To Be Honest note that isn't even considering the case in question!

At some point, we need to come clean and actually give the rules. :-) Without doing that, we are depending on a pile of non-normative "everybody knows how this works". And Ada compilers do this sort of thing very differently (Janus/Ada may literally deallocate the memory making up the component); what one compiler actually does may differ enough to give different results in practice.

Finally, erroneous execution is extremely important to define well, as any sort of static analysis becomes invalid the moment erroneous execution occurs (after all, anything can be true if preceded by "if False"). So, without a clear definition of "cease to exist" for such components, we have the risk that static analysis might prove incorrect "facts" for discriminated types.

I don't think that anyone much disagrees with the intent that components can disappear during an assignment that changes a discriminant. But the language should be clear when that happens and thus that any later access is erroneous.

I had volunteered to help wordsmith a rule, but one was never proposed. I do not believe that a To Be Honest note alone is enough; the RM hints that components can appear and disappear but never really explains that.

Bottom line, somewhere in the RM (I'm not sure where would be best) there should be a statement like:

When an assignment changes the discriminants of a target object, some discriminant-dependent components could be created and others could cease to exist.

(I'd prefer to use "might" rather than "could", but "might" isn't allowed.) It would be better to be more specific, but of course that is riskier:

When an assignment changes the discriminants of a target object, components that depend on the previous values of the discriminants cease to exist, and components that depend on the new values of the discriminants are created.

This formulation begs the question of what happens if the components happen to exist for both the previous and new values of the discriminants. I would probably say that in that case the components (formally) cease to exist and then are recreated; that might matter if the component is in memory that gets reallocated in such a case. Since this has nothing to do with finalization (that is already happening to the entire target in any assignment) or initialization (that's happening as part of the assignment), compilers don't have to do anything for such a formal definition. But if avoiding deallocation is hard, that won't make their compiler incorrect.

I think I've said enough on this topic, and I should go to bed. :-)


 

From: Tucker Taft

Sent: Tuesday, March 21, 2023  7:34 AM

Thanks, Randy, for the in-depth analysis of this.  You have convinced me that we need some normative wording somewhere.


 

From: Randy Brukardt

Sent: Wednesday, March 22, 2023  12:27 AM

> ...It would be better to be more specific, but of course that is riskier:
>
>        When an assignment changes the discriminants of a target object,
> components that depend on the previous values of the discriminants
> cease to exist, and components that depend on the new values of the
> discriminants are created.
>
> This formulation begs the question of what happens if the components
> happen to exist for both the previous and new values of the
> discriminants. ...

To explain what I meant here better, consider the following expansion of Steve's original example:

---

  package Inst is new Address_To_Access_Conversions (T);

  type Array_of_T is array (Positive range <>) of aliased T;
  subtype Small_Natural is Natural range 0 .. 999;
  type Lengthed_T (Len : Small_Natural := 0) is
  record
     T_Components : Array_of_T (1 .. Len);
  end record;

  X : Lengthed_T := (Len => 2, T_Components => ...);
  Ref : Inst.Object_Pointer :=
      Inst.To_Pointer (X.T_Components(2)'Address);

begin

  X := (Len => 4, T_Components => ...);
  Do_Stuff (Ref.all); -- erroneous??

 

---

In this example, X has a T_Components(2) when Ref is created, and still has one after the assignment to X. But whether those are the same memory depends on the implementation model. If an implementation is reallocating memory on an assignment that changes discriminants, then the memory pointed at by the component (2) might in fact have been changed by the assignment.

An implementation could probably avoid reallocation if the new discriminant makes the needed memory smaller (Janus/Ada does not do that currently, but it would make sense in some cases), and possibly even if more memory is needed (since the memory needed is probably "rounded-up" in some way), but it's always possible to expand the needed memory enough to require reallocation (for instance, if the type of the discriminant had been an unconstrained subtype Natural).

Whatever wording we adopt should define whether or not Ref.all in the above is (formally) erroneous. (As in all such cases, something being erroneous does not mean that it cannot work on a particular implementation, but rather that it is not a portable construct that can be depended upon to work.) Given the end-run of Ada Legality Rules needed to even construct this case (by using 'Address rather than 'Access), I doubt that many real programs would be affected even if an implementation did change something based on the rule chosen.


 

From: Bob Duff

Sent: Wednesday, March 22, 2023  7:01 AM

> Given the end-run of Ada Legality Rules needed to even construct this
> case (by using 'Address rather than 'Access), I doubt that many real
> programs would be affected even if an implementation did change
> something based on the rule chosen.

Isn't the following enough to allow an implementation to declare the execution of your example to be erroneous?

  13.3 Operational and Representation Attributes
  ...
  12.c            The validity of a given address depends on the run-time model;
                  thus, in order to use Address clauses correctly, one needs
                  intimate knowledge of the run-time model.
  ...
  13/3 {AI05-0009-1} If an Address is specified, it is the programmer's
  responsibility to ensure that the address is valid and appropriate for the
  entity and its use; otherwise, program execution is erroneous.
  ...
  13.7.2 The Package System.Address_To_Access_Conversions
  ...
  6   An implementation may place restrictions on instantiations of
  Address_To_Access_Conversions.


 

From: Randy Brukardt

Sent: Wednesday, March 22, 2023  3:04 PM

The example contains no address clauses or specification. So I don't see how 13.3(12.c) applies (since it is about address clauses). I would argue that it doesn't mean to say anything about the act of taking/storing an address when no address clauses are present, and there is no normative justification for claiming that some addresses are invalid immediately after they are created.

13.3(13/3) does not apply since it starts with "If an Address is specified", and there is no specification of Address in this example. (Using the Address attribute is not a specification of the same!)

And 13.7.2(6) doesn't seem usable in this case. My understanding of the "restrictions" that an implementation may place is that such restrictions are compile-time rejections or run-time checks. There is no intent to allow implementations to declare anything erroneous by a whim. Given that erroneous execution makes anything after it unpredictable, we can't allow arbitrary injections of it (it would be trivial for an implementation to declare all execution erroneous and thus allow it to do anything it wants).

In this case, the instantiation of Address_to_Access_Conversions is perfectly usual; it is just instantiating with an arbitrary type T that could be simple enough to avoid any compile-time restriction (well, other than rejecting all such instantiations, which defeats the purpose of having the package). So there is no compile-time check that could help.

I don't see how one could identify addresses that point to non-existent components at runtime. The problem is in the combination of taking the address of a component (which by itself seems OK) and the later assignment that makes the component disappear/move, and identifying that at the point of use of the address seems impractical. So there is no runtime restriction that could reasonably be applied.

If we really wanted to avoid defining which components cease to exist in an assignment, we could say something about the use of 'Address in cases where renaming the prefix is illegal. We could make such cases illegal, or we could say that using such an address (via specification or via Address_to_Access_Conversions, or any other way) is erroneous. But I don't see why we would want to fix just the known problem rather than applying a more general fix. Especially given that the components to finalize depend on those that an object actually "has", it seems valuable to define that as clearly as possible.