Version 1.1 of ai05s/ai05-0142-4.txt
!standard 3.10(9/2) 09-05-17 AI05-0142-4/01
!class Amendment 09-05-17
!status work item 09-05-17
!status received 09-05-17
!priority Medium
!difficulty Medium
!subject Aliased parameters and accessors for Ada.Containers
!summary
(See proposal.)
!problem
Modifying a portion of a larger opaque object (such as a container) is not
well-supported in Ada. The Ada.Containers packages provide a procedure
Replace_Element for this purpose. But this procedure requires copying the
element (potentially in both directions). That could be very expensive if the
element is large.
The Ada.Containers packages also provide an procedure Update_Element for this
purpose; this provides a writable object as a parameter to a subprogram passed
into the procedure. This procedure avoids the need to copy the element, but it
is hardly convenient to define a procedure for every component of the element
that needs changing. The extra syntax needed obscures the real meaning of the
program.
An option that was rejected for the Ada.Containers packages was to return an
access to the element type. However, this is problematic as it is difficult to
control the accessibility and lifetime of the returned access. If the element is
removed from the container, the returned access could become dangling; continued
use of the access would make the program erroneous. Moreover, the accessibility
of the returned object (and thus what could be done with it) would depend on the
actual implementation of the container. Bounded containers would typically only
return access values with a very short lifetime, while unbounded containers
would typically return access values with a much longer lifetime. Converting
from an unbounded to bounded form could thus introduce new runtime errors - a
serious maintenance hazard.
!proposal
Add aliased parameters, which allow 'Access of parts to be returned as access
discriminants and anonymous access returns.
!wording
Modify the first sentence of 3.10(9/2):
A view of an object is defined to be aliased if it is defined by an object_declaration{,
parameter_specification, } or component_definition with the reserved word aliased, or
by renaming an aliased view.
Add somewhere in 3.10.2(6-16): [Q: Where??]
The accessibility level of a formal aliased parameter in a function body
is that that the return object will have after a return statement completes
normally.
[That level is defined by 3.10.2(10.1/2).].
[The intent is that this level means that the checks defined in 6.5 for return
statements will always succeed (or fail) without any need for any runtime
garbage. (See the Language Design Principles below.) But this depends on what
we end up doing for AI-51; this works best with the fully dynamic version of that
AI. If we use Bob's version, then we will have to make a hole in the rules to
allow this scenario. One way to do that would be use immutably limited results
only, in that case, the other rules given above would be changed similarly.
Another option would be a syntax marker on the return that we want a shorter
lifetime for the result. - RLB]
Replace 6.1(15/2) by:
parameter_specification ::=
defining_identifier_list : mode [aliased] [null_exclusion] subtype_mark [:= default_expression]
| defining_identifier_list : access_definition [:= default_expression]
[Q: Does [aliased] go before or after the mode? I thought it worked better after the mode, as then
the types are more like the types in an object_declaration. But it probably doesn't matter.]
Add after 6.1(23): [Static Semantics]
An aliased parameter is a formal parameter that includes the reserved word aliased.
Modify 6.3.1(16/2):
Two profiles are mode conformant if they are type-conformant, [and] corresponding parameters
have identical modes, {both or neither are aliased parameters}, and, for access parameters or
access result types, the designated subtypes statically match, or the designated profiles are
subtype conformant.
Add in 6.4.1 of the AARM:
Language Design Principles
For aliased parameters of functions, we will ensure at the call site that a part of it
can be returned as part of the function result without creating a dangling pointer.
We do this with accessibility checks at the call site that all actual objects
of aliased parameters live as long as the function result; then we can allow them
to be returned as access discriminants or anonymous access results, as those have the
master of the function result as well.
Add after 6.4.1(6): [Legality Rules]
If the formal parameter is an aliased parameter, the type of the actual parameter shall be
tagged or the actual parameter shall be an aliased view of an object.
AARM Ramification: Tagged objects (and tagged aggregates for in parameters) do not need
to be aliased. This matches the behavior of unaliased formal parameters, which allow
'Access to be taken of any actual parameter.
In a function call, the accessibility level of the actual object
for each aliased parameter shall not be statically deeper than
accessibility level of the master of the function result.
[We could make this rule apply to all calls, since it can only fail for functions that
initialize allocators, but that doesn't seem to add anything interesting. Note that we
don't change the lifetime of anonymous objects or the accessibility of aliased parameters
for procedures, so it would be somewhat weird to have this apply to procedures as well.]
AARM Discussion: Since aliased parameters are either tagged or required to be objects,
there is always an object (possibly anonymous) to talk about. This is discussing
the static accessibility level of the actual object; it does not depend on
any runtime information (for instance when the actual object is a formal parameter
to another call).
AARM Ramification: This accessibility check (and its dynamic cousin as well) can only fail
if the function call is used to directly initialize a built-in-place object with a master
different than that enclosing the call. The only place all of those conditions exist is
in the initializer of an allocator; in all other cases this check will always pass.
Add a new Dynamic Semantics after 6.4.1(15): (this would be a new, outer level, bullet)
In a function call, for each aliased parameter, a
check is made that the master of the accessibility level of the actual object is
the same as or includes that of master of the function result.
AARM To Be Honest: We're talking about the "nominal" level of the actual object.
If the actual object is a formal parameter of some function call F, we do not
intend to require dynamic checks that depend on the master of the actual
parameters to F, as that would cause nasty distributed overhead (all tagged and
aliased parameters would have to carry accessibility levels).
Modify 7.6.1(13/3): [This wording was modified by AI05-0066-1]
The master of an object is the master enclosing its creation whose accessibility level
(see 3.10.2) is equal to that of the object, except in the case of an anonymous object
representing the result of an aggregate or function call. {If such an anonymous object
is part of the actual parameter expression for an aliased parameter of a function call,
the master of the object is the innermost master enclosing the evaluation of the aggregate
or function call, excluding the aggregate or function call itself. Otherwise, the}[The]
master of such an anonymous object is the innermost master enclosing the evaluation of
the aggregate or function call, which may be the aggregate or function call itself.
AARM Reason: (Add after the existing ones)
The special case for aliased parameters of functions is needed for the same reason,
as access discriminants of the returned object may designate one of these parameters.
In that case, we want to lengthen the lifetime of the anonymous objects as long as the
possible lifetime of the result.
AARM Ramification:
Note that the master given to anonymous objects in aliased parameters of functions is
not necessarily as long as the master of the object being initialized (if the funciton
call is used to initialize an allocator, for instance). In that case, the accessibility
check on aliased parameters will necessarily fail if any such anonymous objects exist.
This is necessary to avoid requiring the objects to live as long as the access type or
the implementation complexity of an implicit coextension.
Add the following to each Ada.Containers package immediately after
Update_Element (with suitable substitutions for the names of types) [note that
the names of the type, discriminant, and function are TBD].
type Accessor_Type (Element : not null access Element_Type) is
tagged limited private;
Accessor_Type needs finalization.
A default-initialized object of type Accessor_Type propagates Program_Error.
AARM Reason: It is expected that Accessor_Type will be a controlled
type, for which finalization will have some action to terminate the tampering
check for the associated container. If the object is created by default,
however, there is no associated container. Since this is useless, and supporting
this case would take extra work, we define it to raise an exception.
function Accessor (Container : in out aliased Vector; Position : in Cursor)
return Accessor_Type;
If Position equals No_Element, then Constraint_Error is propagated; if Position
does not designate an element in Container, then Program_Error is propagated.
Otherwise, Accessor returns an object whose discriminant is an access to the
element designated by Position. Program_Error is propagated if any operation
tampers with the elements of Container while the object returned by Accessor
exists and has not been finalized.
The element designated by Position is not an empty element after successful
completion of this operation. [This is only needed for the Vectors case. - ED]
Add the following to Ada.Containers.Vectors and its relatives.
function Accessor (Container : in out aliased Vector; Index : in Index_Type)
return Accessor_Type;
If Index is not in the range First_Index (Container) .. Last_Index (Container),
then Constraint_Error is propagated. Otherwise,
Accessor returns an object whose discriminant is an access to the element at
position Index. Program_Error is propagated if any operation tampers with the
elements of Container while the object returned by Accessor exists and has not
been finalized.
The element designated by Position is not an empty element after successful
completion of this operation.
Add the following to the Erroneous Execution section of each container package:
Execution is erroneous if the vector associated with the result of a call to
Accessor is finalized before the result object returned by Accessor is
finalized.
AARM Reason: Each object of Accessor_Type probably contains some reference to
the originating container. If that container is prematurely finalized (only
possible via Unchecked_Deallocation, as accessibility checks prevent passing a
container to Accessor that will not live as long as the result), the
finalization of the object of Accessor_Type will try to access a non-existent
object. This is a normal case of a dangling pointer created by
Unchecked_Deallocation; we have to explicitly mention it here as the pointer in
question is not visible in the specification of the type. (This is the same
reason we have to say this for invalid cursors.)
[Q: Should we add the read-only accessors? They would look like:
type RO_Accessor_Type (Element : not null access constant Element_Type) is
tagged limited private;
function RO_Accessor (Position : in aliased Cursor) return RO_Accessor_Type;
function RO_Accessor (Container : in aliased Vector; Index : in Index_Type)
return RO_Accessor_Type;
The names of all of these functions need work.]
!discussion
The idea is that the accessibility level of aliased parameters of functions is defined
such that returning a part (or all) of the parameter as an anonymous access result or as an
access discriminant will always succeed. That means no checks (and associated failures) need
to be done within the function; we make any needed checks at the call site instead.
The call site checks can fail only in obscure cases, as the vast majority of objects that could be
passed to a function call will outlive the function result. However, if the function result is
built-in-place, the result could be made library-level simply by including a call in an allocator
for a library level access type. In that case, we do not want to be returning an access to a local
object. The check is normally a static one, and imposes overhead only in rare cases (such as
passing a dereference of an anonymous access parameter).
We include the aliased property of parameters in mode conformance, so that the need (or not)
to make accessibility checks is always known at compile-time. This means that it cannot be
hidden by renames or generic formal parameters, and most importantly, does not change for
dispatching routines. It would be a disaster for a dispatching call to call a function that
has aliased parameters without making necessary accessibility checks on those parameters;
that would be possible if "aliased" was not part of mode conformance.
---
Note that the accessibility of the returned object from a call to Accessor will prevent converting
the returned access discriminant to any type that lives longer than the returned object. That means
that we can assume that the discriminant value cannot be accessed after the object disappears.
It is possible to use Unchecked_Deallocation to destroy the returned Accessor_Type object
prematurely, while continuing to use the returned access. The accessibility check on the container
argument to the Accessor function means that the returned access can't live longer than the
container, but the use of Unchecked_Deallocation would allow tampering on the container. This could
look like:
declare
type at is access MV.Accessor_Type;
procedure Free is new Ada.Unchecked_Deallocation (MV.Accessor_Type, at);
PAT : at := new MV.Accessor_Type'(Vect.Accessor (1))
Element : access Element_Type := PAT.Element; --
begin
Free (PAT); --
Vect.Delete (1); --
... Element ... --
end;
Obviously, this is highly unlikely to happen by accident. This is more like shooting oneself in
the foot by attaching a laser target to your foot and then firing a laser-seeking missile. It
doesn't seem worth worrying about someone who wants to go to these lengths to avoid a tampering
check - they're just as likely to use Unchecked_Conversion or .all'Unchecked_Access or something
else nasty.
---
We don't want to treat all parameters of functions like aliased parameters for compatibility
and wording reasons.
We don't want to have to define accessibility checks on values, and we proved that trying
to tie this property to particular types causes maintenance hazards (see the earlier
alternative AI05-0142-3 to see why).
In addition, the accessibility checks needed could cause some functions used to initialize
an allocator to become illegal or raise Program_Error. While that is relatively rare, it
still would be hard to work around.
Finally, changing the master of all anonymous objects in parameters of functions would make
function parameters live a long time in some cases (such as when the result is renamed or
part of an allocator). That could be especially nasty if the parameter is an object that
contains tasks that otherwise would have been terminated or releases locks when finalized.
This would be the worst sort of incompatibility, where the program still works but does
something different.
---
We need to be able to define the period of existence of the return object of the accessor
as a time when tampering with elements is prohibited. If we didn't do this, we would
be opening the door to all manner of problems (up to and including erroneousness) if
an element is added or deleted to the container.
To see how this could happen, consider the following example:
Assume that Ada.Containers.Vectors has an accessor function defined as follows:
function Accessor (C : in out aliased Vector; I : Index_Type) return access Element_Type;
Now consider the following program fragments:
package Data is
type Node (D : Boolean := True) is record
case D is
when True =>
CT : Natural;
when False =>
CF : Float;
end case;
end record;
subtype True_Node is Node (D => True);
subtype False_Node is Node (D => False);
package Node_Vector is new Ada.Containers.Vectors (Element_Type => Node,
Index_Type => Positive);
Node_List : Node_Vector.Vector;
end Data;
with Data;
package Far_Away is
procedure Some_Complex_Operation (...);
end Far_Away;
package body Far_Away is
procedure Some_Complex_Operation (...) is
Some_Index : Positive;
begin
...
Some_Index := <lengthy-computation-resulting-in-value-of-1>;
...
Data.Node_List.Delete (Some_Index);
...
end Some_Complex_Operation;
end Far_Away;
with Data;
package Process is
procedure Process_True (NT : in out Data.True_Node);
end Process;
with Far_Away;
package body Process is
procedure Process_True (NT : in out Data.True_Node) is
Component : renames NT.CT; --
begin
...
Far_Away.Some_Complex_Operation (...);
if Component > 10 then --
...
end if;
end Process_True;
end Process;
with Data, Process;
procedure Main is
begin
Data.Node_List.Append (New_Item => (D => False, CF => 1.0)); --
Data.Node_List.Append (New_Item => (D => True, CT => 4)); --
Data.Node_List.Append (New_Item => (D => False, CF => 26.5)); --
...
Process.Process_True (Data.Node_List.Accessor (2).all);
...
end Main;
Now, in the canonical implementation for a vector, all of the data is stored in
an array of some size, and when a component is deleted, the rest of them slide
down. So, when the Far_Away operation (probably called through many levels of
calls) deletes the first element of the vector, all of the rest of them move.
That means that the second element passed to Process_True unexpectedly changes
its value to that which was in the third element. But that element doesn't match
the constraint and doesn't even have a CT component! So we get erroneous
behavior. Note that the same thing would happen if an element was inserted at
the front of the vector.
Note that in this example no one has copied the access value returned by the accessor.
So a rule preventing copying does no good. You can also get a similar
effect by renaming the access value itself and doing bad things while the
renames exists.
Note that the erroneousness can't happen if you use Update_Element instead,
because the tampering check will cause the deletion to raise Program_Error.
Similarly, cursors have rules allowing implementations to detect this sliding
and raise Program_Error.
This example code seems plausible; the record type is similar to many found in the
author's compiler and using a library-level container is not unusual. So it seems
reasonable that cases like this would happen in practice. Thus we conclude that
a tampering check is needed for the accessor functions.
!examples
The new Accessor function could be used (based on a recent example from comp.lang.ada):
with Ada.Containers.Vectors;
procedure Check is
package Integer_Vectors is
new Ada.Containers.Vectors
(Index_Type => Natural,
Element_Type => Integer);
package Nested_Vectors is
new Ada.Containers.Vectors
(Index_Type => Natural,
Element_Type => Integer_Vectors.Vector,
"=" => Integer_Vectors."=");
IV : Integer_Vectors.Vector;
NV : Nested_Vectors.Vector;
begin
IV.Append(42);
NV.Append(IV);
NV.Accessor(0).Element.Append(43);
end Check;
The dereference does not need to be explicitly given in this case, and the fact that
the object returned by Accessor is limited and (probably) controlled is not visible
in the code.
You could also save the entire object as long as needed:
declare
My_IV : Nested_Vectors.Accessor_Type := NV.Accessor(0);
begin
(since My_NV would be built-in-place).
Note that this design still works as intended even if the user renames only the limited access value:
declare
My_IV : Integer_Vectors.Vector renames NV.Accessor(0).Element.all;
begin
Even in this case, the wrapping finalizable object still is protecting the container, because
the master of the returned object is the master of the renames: it will stick around as long
as the renames does.
Moreover, splitting the access from the object usually isn't possible:
declare
My_IV : access Integer_Vectors.Vector := NV.Accessor(0).Element; --
begin
This is illegal because the accessibility check fails. The (anonymous) object returned here
will have the master of the expression, which is shorter than the master of the declaration
of My_IV. Thus the accessibility check preventing shorter lifetimes will fail.
It is OK to rename the access:
declare
My_IV : access Integer_Vectors.Vector renames NV.Accessor(0).Element;
begin
as in this case, the anonymous object will continue to exist until the renames is finalized.
This element can be passed as a parameter, as again the wrapping object will live as long as
the parameter:
Operate (NV.Accessor(0).Element);
or
Munge (NV.Accessor(0).Element.all);
!ACATS test
ACATS tests would be needed for this feature.
!appendix
****************************************************************
Questions? Ask the ACAA Technical Agent