AI22-0112-1
!standard 5.2(4/2) 24-09-04 AI22-0112-1/03
!standard 5.2.1(3/5)
!class binding interpretation 24-07-09
!status Amendment 1-2022 24-07-18
!status WG9 Approved 24-10-10
!status ARG Approved 9-0-0 24-07-18
!status work item 24-07-09
!status received 24-07-09
!submitter Steve Baird
!priority Low
!difficulty Medium
!qualifier Error
!subject Assignment to generalized references with a target name symbol
Reintroduce the Ada 95 resolution rule for assignment_statements such that objects of limited types are not considered as possible solutions.
From 4.1.5(3/3), a "reference object" is an object whose type has the Implicit_Dereference aspect. A use of a reference object that includes an implicit dereference of the discriminant of the object is called a generalized reference.
The name of a reference object refers to the object itself and also implicitly refers to a generalized reference of that object. Thus, there always are two interpretations of such a name in the absence of other information. Some of the ramifications of this point were explored in AI22-0082-1. However, that AI did not discuss interactions with target name symbols.
If one has:
    type Int_Ref (Ref : access Integer) is null record
       with Implicit_Dereference => Ref;
    Int : aliased Integer := 0;
    X : Int_Ref (Int'Access);
then an assignment like:
    X := X + 1;
is legal, as the only possible interpretation of the "+" operator here is as an integer operator (there is no "+" that operates on Int_Ref). However,
    X := @ + 1;
is ambiguous. That's because 5.2.1(3/5) says that the variable_name of the assignment_statement is a complete context, and X has two interpretations (as a reference object and as a generalized reference of that object), with nothing to choose between them. The operator in the source expression is not considered at all.
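The two interpretations can be seen separately by supplying an expected type, which is exactly what the bare target name lacks. The following is a minimal sketch, not part of the AI's wording; the procedure name and the names As_Object and As_Reference are hypothetical, and it assumes a compiler that supports Ada 2012 user-defined references.

```ada
procedure Reference_Interpretations is
   type Int_Ref (Ref : access Integer) is null record
      with Implicit_Dereference => Ref;

   Int : aliased Integer := 0;
   X   : Int_Ref (Int'Access);

   --  The subtype_mark of each renaming supplies an expected type,
   --  so each renaming selects one interpretation of the name X.
   As_Object    : Int_Ref renames X;  --  X as the reference object itself
   As_Reference : Integer renames X;  --  X as a generalized reference
begin
   --  Updates Int through the generalized reference.
   As_Reference := As_Reference + 1;

   pragma Assert (Int = 1);
   pragma Assert (As_Object.Ref.all = 1);
end Reference_Interpretations;
```

When X appears alone as the target of an assignment, no such expected type is available, so both interpretations remain candidates.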
This is a more significant problem than it appears at first because a Variable_Indexing is required to be a function that returns a reference object. So, consider the following:
    package My_Vector is new Ada.Containers.Vectors (Positive, Integer);
    My_Vect : My_Vector.Vector;
    My_Vect(1) := My_Vect(1) + 1; -- OK.
    My_Vect(1) := @ + 1; -- Illegal (in original Ada 2022).
The assignment with the target name symbol is illegal, again because the variable_name is ambiguous: it can be interpreted as an object of type My_Vector.Reference_Type or an object of type Integer. And as a complete context, no other information can be used to eliminate one of those possibilities.
In AI22-0082-1, we said "Users of a type with user-defined indexing do not want to have to think about the underlying implementation involving reference types." They just want to write reasonable things and have them work. The assignments above both appear reasonable, but one does not work. It appears that more work is needed for this problem.
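For comparison, here is one way a user could avoid the ambiguity without any rule change: rename the element with an explicit subtype, so that the target of the assignment is a plain Integer name. This is only a sketch of a possible workaround, not something the AI proposes; the procedure name, the package name Int_Vectors, and the name Elem are hypothetical, and the target name symbol requires an Ada 2022 compiler mode.

```ada
with Ada.Containers.Vectors;

procedure Target_Name_Workaround is
   package Int_Vectors is new Ada.Containers.Vectors (Positive, Integer);
   My_Vect : Int_Vectors.Vector;
begin
   My_Vect.Append (41);

   --  The explicit form resolves: "+" only applies to the Integer
   --  interpretation of My_Vect (1).
   My_Vect (1) := My_Vect (1) + 1;

   declare
      --  The subtype_mark Integer selects the generalized reference,
      --  so Elem has a single interpretation.
      Elem : Integer renames My_Vect (1);
   begin
      Elem := @ + 1;
   end;

   pragma Assert (My_Vect.Element (1) = 43);
end Target_Name_Workaround;
```

Having to introduce a renaming is exactly the kind of implementation detail that users of indexing should not have to think about.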
Reintroduce the Ada 95 resolution rule for assignment_statements. Specifically, the variable_name of an assignment_statement is expected to be of any {nonlimited} type. This rule then also applies to the complete context of an assignment_statement that includes one or more target name symbols.
Other solutions are possible (but they are more work for implementers); see the !discussion.
Modify 5.2(4/2):
The variable_name of an assignment_statement is expected to be of any {nonlimited} type. The expected type for the expression is the type of the target.
[Editor’s notes: The word “nonlimited” in 5.2(5/2) should be marked as Redundant. The rule is still necessary to require the target to be a variable (and also to define the term “target”), and leaving the word “nonlimited” is harmless and informative. So there is no normative change to that rule proposed.
The AARM Note 5.2(4.a) needs to be rewritten to exclude assignments that contain target name symbols; that’s not directly related to this AI, but it’s clearly wrong.]
Add after 5.2.1(3/5):
AARM Ramification: The other resolution rules that apply to all assignment_statements (see 5.2) still apply here. In particular, interpretations that have limited types are not considered in this complete context.
To understand this problem and potential solutions further, we need to start with why the "complete context" rule was adopted in the first place.
The most understandable way to resolve the target name symbol is to treat it as a copy of the target variable_name. This would be identical to syntactically replacing the '@' with the variable_name. Unfortunately, this opens up the possibility that the '@' would resolve differently than the target name does.
For instance:
    function F1 return access Integer;
    function F1 return access Float;
    function F2 (A : in Integer) return Float;
    begin
       F1.all := F2(@);
Here, using syntactic resolution, there is a unique solution where the target resolves to a dereference of a call of F1 returning an access to float, while the '@' resolves to a dereference of a call of F1 returning an access to integer.
There are similar examples that become ambiguous only when interpretations in which the '@' resolves differently from the target name are considered. For instance:
    function F1 return access Integer;
    function F1 return access Float;
    function F2 (A : in Integer) return Integer;
    function F2 (A : in Integer) return Float;
    begin
       F1.all := F2(@);
Using syntactic resolution, if we use the Float versions of F1 and F2, the @ would resolve to the Integer version of F1. That would be ambiguous with the intended solution where all of the F1 and F2s are the Integer versions.
(Both of these examples are illegal given the actual Ada 2022 rule, as the target name is ambiguous as a complete context.)
These cases re-introduce the problems that the target name symbol is intended to resolve. The target name symbol has to unambiguously denote the target object - it must never denote anything else. Otherwise, we've just made the possible confusions between the target and occurrences of similar names even worse by hiding them.
Therefore, syntactic resolution was rejected. Given concern about difficulty implementing some other scheme, we adopted the "target variable_name is a complete context" rule. It was felt that overloaded targets are rare (true in Ada 95, not true in Ada 2012 as shown here), and that we could compatibly relax this rule if the need arose (it appears that it has).
This choice makes both of the cases given here illegal (even though the second one has a reasonable meaning). More importantly, of course, it also makes the examples given in the !issue illegal.
We start exploring the solution space by looking at the "proper" solution. That is, to say that the target symbol represents the specific target object resolved to (and no other). The main concern about this solution was the difficulty of implementing it. (Good wording seems hard to find as well, but that is solvable.) Call this solution #1.
[Editor's note: We number each of the solutions so we can refer to them simply. There is an extensive analysis of how each of these solutions applies to various examples like the ones discussed in the !example section.]
There are some features of this problem which simplify it. The most important is that an assignment statement is always itself a complete context. So anything special that is needed to resolve it is confined to resolving assignment statements -- it need not be done in any other context. That opens the possibility to more complex solutions than would make sense for a constituent of an expression.
One way to resolve such an assignment would be to try each possible resolution of the target individually. For each possible resolution, a copy of the source expression would be made, giving any '@'s in it the properties of that possible target. Then a normal resolution would be performed on the copy of the source expression. After doing these steps on all of the possible resolutions, the assignment would resolve if and only if there is a single possible target that resolves unambiguously.
Every implementation has the operations to implement this approach already. Some mechanism to treat ‘@’ as a single solution is already needed for the existing rule, in order to resolve the source expression once the target entity has been identified. And a mechanism to copy expressions must exist in order to implement generic instances. Resolution itself is standard and unchanged.
Since targets are unlikely to be heavily overloaded, the extra expense of this approach is likely to be tolerable. And, of course, an implementation could find shortcuts to avoid making extra copies and extra tree walks, which might make it possible to reduce the cost. (For instance, if a trial resolution is non-destructive on an expression tree, it probably would not be necessary to make copies of the source expression.) This approach does however give a worst-case implementation that works and requires no significant new functionality, demonstrating that “proper” resolution is viable.
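The trial approach can be sketched with a toy model. This is an illustration under strong simplifying assumptions, not compiler code: types are reduced to an enumeration, the source expression is always a single call F(@) of a one-parameter function, and all names (Trial_Resolution, Type_Id, Profile, Resolve, and so on) are hypothetical. The three checks model the shapes of examples (M), (P), and (R) from the !example section.

```ada
procedure Trial_Resolution is
   type Type_Id is (Int_T, Float_T, Lim_T);

   --  Profile of a one-parameter function: Param -> Result.
   type Profile is record
      Param, Result : Type_Id;
   end record;

   type Profile_List is array (Positive range <>) of Profile;
   type Type_List    is array (Positive range <>) of Type_Id;
   type Outcome      is (Resolved, No_Solution, Ambiguous);

   --  Model of resolving "Target := F (@);": try each candidate target
   --  type in turn, give '@' that type, and resolve the call against it.
   function Resolve
     (Targets : Type_List; F : Profile_List) return Outcome
   is
      Solutions : Natural := 0;
   begin
      for T of Targets loop
         for P of F loop
            --  '@' has type T, so the parameter must be T; the expected
            --  type of the source expression is also T.
            if P.Param = T and then P.Result = T then
               Solutions := Solutions + 1;
            end if;
         end loop;
      end loop;

      case Solutions is
         when 0      => return No_Solution;
         when 1      => return Resolved;
         when others => return Ambiguous;
      end case;
   end Resolve;

   F1_Targets : constant Type_List := (Int_T, Float_T);
   F4_Targets : constant Type_List := (Int_T, Lim_T);

   F2 : constant Profile_List := (1 => (Int_T, Float_T));
   F3 : constant Profile_List := ((Int_T, Int_T), (Int_T, Float_T));
   F5 : constant Profile_List := ((Int_T, Int_T), (Lim_T, Lim_T));
begin
   pragma Assert (Resolve (F1_Targets, F2) = No_Solution);  --  shape of (M)
   pragma Assert (Resolve (F1_Targets, F3) = Resolved);     --  shape of (P)
   pragma Assert (Resolve (F4_Targets, F5) = Ambiguous);    --  shape of (R)
   null;
end Trial_Resolution;
```

A real implementation would reuse its existing expression resolver for each trial; the nested loop here only stands in for that step.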
A simpler solution would be to have the target of an assignment that contains some target name symbols continue to be resolved as a complete context, but to ignore any limited objects. That works for the containers as the reference types are (now, because of AI22-0082-1) limited types. (Making them limited types avoids problems in various other resolution cases.) This is solution #2. The rule would be something like:
If a target_name occurs in an assignment_statement A, the variable_name V of A is a complete context{ whose expected type is any nonlimited type}.
Note that the Ada 95 rule for assignment_statements was similar - the expected type was any nonlimited type. As such, compilers with an Ada 95 compatibility switch already have this capability, and probably would require little work.
Note that this change would fix the containers and other uses following our (new) advice to make most reference types limited (and especially indexing reference types). But it doesn't help with cases involving nonlimited reference types (including all of the Ada 2012 and original Ada 2022 examples).
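The effect of restricting the complete context to nonlimited types can be modeled in a few lines. This is a toy sketch under the assumption that each interpretation of the target name is characterized only by whether its type is limited; the names Nonlimited_Targets, Type_Kind, Kind_List, and Resolve_Target are hypothetical.

```ada
procedure Nonlimited_Targets is
   type Type_Kind is (Limited_Kind, Nonlimited_Kind);
   type Kind_List is array (Positive range <>) of Type_Kind;
   type Outcome   is (Resolved, No_Solution, Ambiguous);

   --  Model of resolving a target name as a complete context. When
   --  Nonlimited_Only is True, interpretations of a limited type are
   --  not considered at all.
   function Resolve_Target
     (Candidates : Kind_List; Nonlimited_Only : Boolean) return Outcome
   is
      Count : Natural := 0;
   begin
      for K of Candidates loop
         if not (Nonlimited_Only and then K = Limited_Kind) then
            Count := Count + 1;
         end if;
      end loop;

      case Count is
         when 0      => return No_Solution;
         when 1      => return Resolved;
         when others => return Ambiguous;
      end case;
   end Resolve_Target;

   --  My_Vect(1): a limited reference object or an Integer.
   Vector_Element : constant Kind_List := (Limited_Kind, Nonlimited_Kind);
   --  X : Int_Ref: a nonlimited reference object or an Integer.
   Nonlimited_Ref : constant Kind_List := (Nonlimited_Kind, Nonlimited_Kind);
begin
   pragma Assert (Resolve_Target (Vector_Element, False) = Ambiguous);
   pragma Assert (Resolve_Target (Vector_Element, True) = Resolved);
   pragma Assert (Resolve_Target (Nonlimited_Ref, True) = Ambiguous);
   null;
end Nonlimited_Targets;
```

The last check shows the limitation noted above: a nonlimited reference type leaves two candidates, so the target stays ambiguous.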
An alternative solution that would help nonlimited reference types as well would be to ignore reference objects (but not their associated generalized references) in the complete context of an assignment_statement with a target name symbol in the expression. Call this solution #3.
This would be a new kind of resolution (but one localized to assignment statements). It could be painful to implement for some compilers. In particular, for compilers that do not materialize implicit dereferences before resolution, it is the presence of the reference object that makes the possibility of the generalized reference exist.
In such a compiler, resolution checks both possibilities, using code something like:
    Resolve_Expected (Source, Target.Typ);
    if Is_Reference (Target.Typ) then
       Resolve_Expected (Source, Generalized_Reference_Type (Target.Typ));
    end if;
(We've ignored the result handling here; it can be complex.)
Simply not including the reference object in the solution set does not work in this environment, as that also removes the generalized reference. One could probably have special-case code that leaves out the first "Resolve_Expected" call above, but that could entail a lot of code duplication, depending on exactly where the various resolution rules are implemented.
Finally, it is uncomfortable to have the resolution for assignment statements with and without target name symbols be subtly different.
Luckily for solution #1, that is not a problem. The only resolution differences involve the '@' itself. That seems fundamental, as discussed previously.
For solution #2, we could easily revert the rule for all assignment statements to the Ada 95 version. Such a change would be compatible, since any assignment statement which would not resolve under the Ada 95 rule is illegal anyway. Only the error message would change. That would avoid needing to change the target symbol rule at all (the assignment_statement resolution would apply to that as well). So consider this expanded solution #2 as solution #4. Note that this change is easily justified: it doesn’t make a lot of sense to consider solutions that could never be legal. We generally don’t want resolution to be too smart, but it is worse having it be too dumb.
[Aside: It was AI95-00287 that moved "nonlimited" from the assignment_statement resolution rule to a legality rule. This was the AI that allowed limited aggregates. The AI removed a lot of rules requiring nonlimited objects. But it never explained why it did so for assignment statements (which are still not allowed to be limited). Nor is there anything in the mail. The minutes of the Mallorca meeting of June 2004 say "Tucker explains the wording changes. The change to 5.2 is not absolutely necessary, but it makes sense for these to be all legality rules, rather than name resolution rules." So it appears that there was no semantic requirement for this change, just a desire for consistency. Thus, there is no reason not to revert the rule to the original version.]
For solution #3, we probably would not want to have all assignments work this way. In the unlikely case that someone did want to assign to a reference object as a whole, there has to be some way to do it. Doing that for all assignments also would be incompatible. (Solution #3 itself is compatible: all such assignments are currently illegal, and making some of them legal is always going to be compatible.)
Editor's analysis of the solutions.
Here's my take on this problem. I rank the solutions in the order #4, #1, #2, and #3.
I prefer to keep the resolution for assignment statements with or without target symbols as close as possible. I think we want to minimize anomalies. Thus, I prefer #4 to #2, and prefer #1 over more kludgy changes to just the target symbol rules.
I view #3 as irregular (some expressions involving target symbols would resolve while the equivalent assignment expression would not), and as a completely new sort of restriction, potentially hard to implement. (It would be painful in Janus/Ada, which does not materialize implicit operations until after resolution.) If we're going to work hard to implement resolution, it makes the most sense to do so for the full resolution of #1 (which is most likely to match the user's intuition), not some kludge.
The analysis of the examples (below) shows that no solution would have assignment_statements resolve *exactly* the same with or without target symbols. There are always going to be unlikely corner cases that differ. That seems fundamental to the meaning of target symbols.
As such, the simpler, broader, compatible solution seems best. #4 is simply reverting the resolution for all assignments to the Ada 95 rule (which every serious compiler still has implemented under their Ada 95 switch). #1 allows a few more cases but is likely to be substantially more work, and seems too much for a Binding Interpretation. Option #2 is the same work as #4, but less consistent. And option #3 can be as complex as #1 in implementation, but fixes fewer cases.
As such, I’ve written the AI having selected solution #4. This does not fix all of the issues, but note that we decided not to fix all of the possible issues in AI22-0082-1 either. So this choice is consistent with our previous choices in this area.
@drepl
The @i{variable_}@fa{name} of an @fa{assignment_statement} is expected to be of any type. The expected type for the @fa{expression} is the type of the target.
@dby
The @i{variable_}@fa{name} of an @fa{assignment_statement} is expected to be of any nonlimited type. The expected type for the @fa{expression} is the type of the target.
Here are a number of examples and how they resolve with the current rules as well as with each of the various proposed solutions. There is a summary of the effects of the various solutions at the very end of this section.
    type Int_Ref (Ref : access Integer) is
       record
          Count : Natural := 1;
       end record
       with Implicit_Dereference => Ref;
    type Lim_Int_Ref (Ref : access Integer) is limited null record
       with Implicit_Dereference => Ref;
    Int : aliased Integer := 0;
    X : Int_Ref (Int'Access);
    Y : Lim_Int_Ref (Int'Access);
    package My_Vector is new Ada.Containers.Vectors (Positive, Integer);
    My_Vect : My_Vector.Vector;
    function F1 return access Integer;
    function F1 return access Float;
    function F2 (A : in Integer) return Float;
    function F3 (A : in Integer) return Integer;
    function F3 (A : in Integer) return Float;
    type Lim is limited null record;
    function F4 return access Integer;
    function F4 return access Lim;
    function F5 (A : in Integer) return Integer;
    function F5 (A : in Lim) return Lim;
    function F6 (A : in Lim) return Lim;
    begin
       X := X + 1; -- (A)
       X := @ + 1; -- (B)
       X := (X with delta Count => 10); -- (C)
       X := (@ with delta Count => 10); -- (D)
       Y := Y + 1; -- (E)
       Y := @ + 1; -- (F)
       My_Vect(1) := My_Vect(1) + 1; -- (G)
       My_Vect(1) := @ + 1; -- (H)
       My_Vect(1) := My_Vect(1); -- (J)
       My_Vect(1) := @; -- (K)
       F1.all := F2(F1.all); -- (L)
       F1.all := F2(@); -- (M)
       F1.all := F3(F1.all); -- (N)
       F1.all := F3(@); -- (P)
       F4.all := F5(F4.all); -- (Q)
       F4.all := F5(@); -- (R)
       F4.all := F4.all + 1; -- (S)
       F4.all := @ + 1; -- (T)
       F4.all := F6(F4.all); -- (U)
       F4.all := F6(@); -- (V)
The results of the examples are separated into groups: first, cases that are likely to appear in actual usage; then two groups of cases that might appear in actual usage; and finally cases that are unlikely to appear in actual usage (these are mostly thought experiments).
Cases that are likely to appear in actual usage (these involve limited reference types, as used in the containers):
(E) Current rules: OK; Solution #1: OK; Solution #2: OK; Solution #3: OK; Solution #4: OK.
(F) Current rules: Illegal (ambig); Solution #1: OK; Solution #2: OK; Solution #3: OK; Solution #4: OK.
(G) Current rules: OK; Solution #1: OK; Solution #2: OK; Solution #3: OK; Solution #4: OK.
(H) Current rules: Illegal (ambig); Solution #1: OK; Solution #2: OK; Solution #3: OK; Solution #4: OK.
Cases that might appear in actual usage (these involve nonlimited reference types, which could be useful for handles and other kinds of references):
(A) Current rules: OK; Solution #1: OK; Solution #2: OK; Solution #3: OK; Solution #4: OK.
(B) Current rules: Illegal (ambig); Solution #1: OK; Solution #2: Illegal (ambig); Solution #3: OK; Solution #4: Illegal (ambig).
(C) Current rules: OK; Solution #1: OK; Solution #2: OK; Solution #3: OK; Solution #4: OK.
(D) Current rules: Illegal (ambig); Solution #1: OK; Solution #2: Illegal (ambig); Solution #3: OK; Solution #4: Illegal (ambig).
Cases that might appear in actual usage (these involve overloaded functions with both limited and nonlimited types):
(Q) Current rules: Illegal (ambig); Solution #1: Illegal (ambig); Solution #2: Illegal (ambig); Solution #3: Illegal (ambig); Solution #4: OK.
(R) Current rules: Illegal (ambig); Solution #1: Illegal (ambig); Solution #2: OK; Solution #3: Illegal (ambig); Solution #4: OK.
(S) Current rules: OK; Solution #1: OK; Solution #2: OK; Solution #3: OK; Solution #4: OK.
(T) Current rules: Illegal (ambig); Solution #1: OK; Solution #2: OK; Solution #3: Illegal (ambig); Solution #4: OK.
(U) Current rules: Illegal (legality); Solution #1: Illegal (legality); Solution #2: Illegal (legality); Solution #3: Illegal (legality); Solution #4: Illegal (no solution).
(V) Current rules: Illegal (ambig); Solution #1: Illegal (legality); Solution #2: Illegal (no solution); Solution #3: Illegal (ambig); Solution #4: Illegal (no solution).
For these latter two cases, the assignments are always illegal; only the reason that the expressions are illegal changes.
Cases that are unlikely to appear in actual usage:
(J) Current rules: OK(*); Solution #1: OK; Solution #2: OK; Solution #3: OK; Solution #4: OK.
* This resolves because the target name is interpreted as a variable indexing and the source expression is interpreted as a constant indexing; these have different types, so only the generalized reference interpretations of both match. Solution #4, on the other hand, ignores the limited reference objects.
(K) Current rules: Illegal (ambig); Solution #1: Illegal (ambig*); Solution #2: OK; Solution #3: OK; Solution #4: OK.
* Solution #1 doesn't resolve here, unlike the explicit case, because the @ is a constant view of a variable indexing (rather than the constant indexing of the explicit case). We then have solutions using the reference type and using the generalized reference and no way to choose between them. This demonstrates that there is no solution where everything works the same. Luckily, this isn't a likely case. Note that solutions #2 and #4 work because the reference type is limited and thus is not considered.
(L) Current rules: OK(*); Solution #1: OK; Solution #2: OK; Solution #3: OK; Solution #4: OK.
* The source expression and target name resolve to different F1 functions.
(M) Current rules: Illegal (ambig); Solution #1: Illegal (no solution); Solution #2: Illegal (ambig); Solution #3: Illegal (ambig); Solution #4: Illegal (ambig).
(N) Current rules: Illegal (ambig*); Solution #1: Illegal (ambig); Solution #2: Illegal (ambig); Solution #3: Illegal (ambig); Solution #4: Illegal (ambig).
* The F1 in the source expression could be either F1 - both work.
(P) Current rules: Illegal (ambig); Solution #1: OK(*); Solution #2: Illegal (ambig); Solution #3: Illegal (ambig); Solution #4: Illegal (ambig).
* @ has to use the same F1 in the Source and Target; this gives a unique solution.
The goal would be for each member of a pair to resolve in the same way (either OK or illegal). That is, an assignment using a target name symbol and the equivalent assignment not using a target name symbol should resolve (or not) the same way, with the same solution.
The current rules fail this on 7 out of the 10 pairs (which is why we're considering this AI). All of the solutions pass for the most likely cases (but not the current rules, of course). Solution #1 fails only on 3 out of 10 pairs (all of which are unlikely). Solution #2 fails on 4 out of 10 pairs. Solution #3 also fails on 3 out of 10 pairs (but one might occur). Solution #4 also fails on 3 out of 10 pairs (two of these might occur).
Looking only at the cases that might occur, 5 out of 7 fail the current rules. Solution #1 is perfect (no failures); there is one case that does not work (in either type of assignment) but works with some other solutions. Solution #2 fails 3 out of 7. Solution #3 fails 1 out of 7 (and also has a case that does not work but works with some other solutions). Solution #4 fails 2 out of 7.
ACATS tests, both B-Tests and C-Tests, should be constructed using the examples.
This topic was originally raised privately by Steve Baird in a different context. The rules actually have worse results than he remembered, thus this AI was created.