Version 1.1 of ai05s/ai05-0142-4.txt

Unformatted version of ai05s/ai05-0142-4.txt version 1.1
Other versions for file ai05s/ai05-0142-4.txt

!standard 3.10(9/2)          09-05-17 AI05-0142-4/01
!class Amendment 09-05-17
!status work item 09-05-17
!status received 09-05-17
!priority Medium
!difficulty Medium
!subject Aliased parameters and accessors for Ada.Containers
!summary
(See proposal.)
!problem
Modifying a portion of a larger opaque object (such as a container) is not well-supported in Ada. The Ada.Containers packages provide a procedure Replace_Element for this purpose. But this procedure requires copying the element (potentially in both directions). That could be very expensive if the element is large.
The Ada.Containers packages also provide an procedure Update_Element for this purpose; this provides a writable object as a parameter to a subprogram passed into the procedure. This procedure avoids the need to copy the element, but it is hardly convenient to define a procedure for every component of the element that needs changing. The extra syntax needed obscures the real meaning of the program.
An option that was rejected for the Ada.Containers packages was to return an access to the element type. However, this is problematic as it is difficult to control the accessibility and lifetime of the returned access. If the element is removed from the container, the returned access could become dangling; continued use of the access would make the program erroneous. Moreover, the accessibility of the returned object (and thus what could be done with it) would depend on the actual implementation of the container. Bounded containers would typically only return access values with a very short lifetime, while unbounded containers would typically return access values with a much longer lifetime. Converting from an unbounded to bounded form could thus introduce new runtime errors - a serious maintenance hazard.
!proposal
Add aliased parameters, which allow 'Access of parts to be returned as access discriminants and anonymous access returns.
!wording
Modify the first sentence of 3.10(9/2):
A view of an object is defined to be aliased if it is defined by an object_declaration{, parameter_specification, } or component_definition with the reserved word aliased, or by renaming an aliased view.
Add somewhere in 3.10.2(6-16): [Q: Where??]
The accessibility level of a formal aliased parameter in a function body is that that the return object will have after a return statement completes normally.
[That level is defined by 3.10.2(10.1/2).].
[The intent is that this level means that the checks defined in 6.5 for return statements will always succeed (or fail) without any need for any runtime garbage. (See the Language Design Principles below.) But this depends on what we end up doing for AI-51; this works best with the fully dynamic version of that AI. If we use Bob's version, then we will have to make a hole in the rules to allow this scenario. One way to do that would be use immutably limited results only, in that case, the other rules given above would be changed similarly. Another option would be a syntax marker on the return that we want a shorter lifetime for the result. - RLB]
Replace 6.1(15/2) by:
parameter_specification ::= defining_identifier_list : mode [aliased] [null_exclusion] subtype_mark [:= default_expression] | defining_identifier_list : access_definition [:= default_expression]
[Q: Does [aliased] go before or after the mode? I thought it worked better after the mode, as then the types are more like the types in an object_declaration. But it probably doesn't matter.]
Add after 6.1(23): [Static Semantics]
An aliased parameter is a formal parameter that includes the reserved word aliased.
Modify 6.3.1(16/2):
Two profiles are mode conformant if they are type-conformant, [and] corresponding parameters have identical modes, {both or neither are aliased parameters}, and, for access parameters or access result types, the designated subtypes statically match, or the designated profiles are subtype conformant.
Add in 6.4.1 of the AARM:
Language Design Principles For aliased parameters of functions, we will ensure at the call site that a part of it can be returned as part of the function result without creating a dangling pointer. We do this with accessibility checks at the call site that all actual objects of aliased parameters live as long as the function result; then we can allow them to be returned as access discriminants or anonymous access results, as those have the master of the function result as well.
Add after 6.4.1(6): [Legality Rules]
If the formal parameter is an aliased parameter, the type of the actual parameter shall be tagged or the actual parameter shall be an aliased view of an object.
AARM Ramification: Tagged objects (and tagged aggregates for in parameters) do not need to be aliased. This matches the behavior of unaliased formal parameters, which allow 'Access to be taken of any actual parameter.
In a function call, the accessibility level of the actual object for each aliased parameter shall not be statically deeper than accessibility level of the master of the function result.
[We could make this rule apply to all calls, since it can only fail for functions that initialize allocators, but that doesn't seem to add anything interesting. Note that we don't change the lifetime of anonymous objects or the accessibility of aliased parameters for procedures, so it would be somewhat weird to have this apply to procedures as well.]
AARM Discussion: Since aliased parameters are either tagged or required to be objects, there is always an object (possibly anonymous) to talk about. This is discussing the static accessibility level of the actual object; it does not depend on any runtime information (for instance when the actual object is a formal parameter to another call).
AARM Ramification: This accessibility check (and its dynamic cousin as well) can only fail if the function call is used to directly initialize a built-in-place object with a master different than that enclosing the call. The only place all of those conditions exist is in the initializer of an allocator; in all other cases this check will always pass.
Add a new Dynamic Semantics after 6.4.1(15): (this would be a new, outer level, bullet)
In a function call, for each aliased parameter, a check is made that the master of the accessibility level of the actual object is the same as or includes that of master of the function result.
AARM To Be Honest: We're talking about the "nominal" level of the actual object. If the actual object is a formal parameter of some function call F, we do not intend to require dynamic checks that depend on the master of the actual parameters to F, as that would cause nasty distributed overhead (all tagged and aliased parameters would have to carry accessibility levels).
Modify 7.6.1(13/3): [This wording was modified by AI05-0066-1]
The master of an object is the master enclosing its creation whose accessibility level (see 3.10.2) is equal to that of the object, except in the case of an anonymous object representing the result of an aggregate or function call. {If such an anonymous object is part of the actual parameter expression for an aliased parameter of a function call, the master of the object is the innermost master enclosing the evaluation of the aggregate or function call, excluding the aggregate or function call itself. Otherwise, the}[The] master of such an anonymous object is the innermost master enclosing the evaluation of the aggregate or function call, which may be the aggregate or function call itself.
AARM Reason: (Add after the existing ones) The special case for aliased parameters of functions is needed for the same reason, as access discriminants of the returned object may designate one of these parameters. In that case, we want to lengthen the lifetime of the anonymous objects as long as the possible lifetime of the result.
AARM Ramification: Note that the master given to anonymous objects in aliased parameters of functions is not necessarily as long as the master of the object being initialized (if the funciton call is used to initialize an allocator, for instance). In that case, the accessibility check on aliased parameters will necessarily fail if any such anonymous objects exist. This is necessary to avoid requiring the objects to live as long as the access type or the implementation complexity of an implicit coextension.
Add the following to each Ada.Containers package immediately after Update_Element (with suitable substitutions for the names of types) [note that the names of the type, discriminant, and function are TBD].
type Accessor_Type (Element : not null access Element_Type) is tagged limited private;
Accessor_Type needs finalization.
A default-initialized object of type Accessor_Type propagates Program_Error.
AARM Reason: It is expected that Accessor_Type will be a controlled type, for which finalization will have some action to terminate the tampering check for the associated container. If the object is created by default, however, there is no associated container. Since this is useless, and supporting this case would take extra work, we define it to raise an exception.
function Accessor (Container : in out aliased Vector; Position : in Cursor) return Accessor_Type;
If Position equals No_Element, then Constraint_Error is propagated; if Position does not designate an element in Container, then Program_Error is propagated. Otherwise, Accessor returns an object whose discriminant is an access to the element designated by Position. Program_Error is propagated if any operation tampers with the elements of Container while the object returned by Accessor exists and has not been finalized.
The element designated by Position is not an empty element after successful completion of this operation. [This is only needed for the Vectors case. - ED]
Add the following to Ada.Containers.Vectors and its relatives.
function Accessor (Container : in out aliased Vector; Index : in Index_Type) return Accessor_Type;
If Index is not in the range First_Index (Container) .. Last_Index (Container), then Constraint_Error is propagated. Otherwise,
Accessor returns an object whose discriminant is an access to the element at position Index. Program_Error is propagated if any operation tampers with the elements of Container while the object returned by Accessor exists and has not been finalized.
The element designated by Position is not an empty element after successful completion of this operation.
Add the following to the Erroneous Execution section of each container package:
Execution is erroneous if the vector associated with the result of a call to Accessor is finalized before the result object returned by Accessor is finalized.
AARM Reason: Each object of Accessor_Type probably contains some reference to the originating container. If that container is prematurely finalized (only possible via Unchecked_Deallocation, as accessibility checks prevent passing a container to Accessor that will not live as long as the result), the finalization of the object of Accessor_Type will try to access a non-existent object. This is a normal case of a dangling pointer created by Unchecked_Deallocation; we have to explicitly mention it here as the pointer in question is not visible in the specification of the type. (This is the same reason we have to say this for invalid cursors.)
[Q: Should we add the read-only accessors? They would look like:
type RO_Accessor_Type (Element : not null access constant Element_Type) is tagged limited private;
function RO_Accessor (Position : in aliased Cursor) return RO_Accessor_Type;
function RO_Accessor (Container : in aliased Vector; Index : in Index_Type) return RO_Accessor_Type;
The names of all of these functions need work.]
!discussion
The idea is that the accessibility level of aliased parameters of functions is defined such that returning a part (or all) of the parameter as an anonymous access result or as an access discriminant will always succeed. That means no checks (and associated failures) need to be done within the function; we make any needed checks at the call site instead.
The call site checks can fail only in obscure cases, as the vast majority of objects that could be passed to a function call will outlive the function result. However, if the function result is built-in-place, the result could be made library-level simply by including a call in an allocator for a library level access type. In that case, we do not want to be returning an access to a local object. The check is normally a static one, and imposes overhead only in rare cases (such as passing a dereference of an anonymous access parameter).
We include the aliased property of parameters in mode conformance, so that the need (or not) to make accessibility checks is always known at compile-time. This means that it cannot be hidden by renames or generic formal parameters, and most importantly, does not change for dispatching routines. It would be a disaster for a dispatching call to call a function that has aliased parameters without making necessary accessibility checks on those parameters; that would be possible if "aliased" was not part of mode conformance.
---
Note that the accessibility of the returned object from a call to Accessor will prevent converting the returned access discriminant to any type that lives longer than the returned object. That means that we can assume that the discriminant value cannot be accessed after the object disappears.
It is possible to use Unchecked_Deallocation to destroy the returned Accessor_Type object prematurely, while continuing to use the returned access. The accessibility check on the container argument to the Accessor function means that the returned access can't live longer than the container, but the use of Unchecked_Deallocation would allow tampering on the container. This could look like:
declare type at is access MV.Accessor_Type; procedure Free is new Ada.Unchecked_Deallocation (MV.Accessor_Type, at);
PAT : at := new MV.Accessor_Type'(Vect.Accessor (1))
Element : access Element_Type := PAT.Element; -- OK.
begin Free (PAT); -- Tampering checking ends here!
Vect.Delete (1); -- No check here. ... Element ... -- Oops, gone. end;
Obviously, this is highly unlikely to happen by accident. This is more like shooting oneself in the foot by attaching a laser target to your foot and then firing a laser-seeking missile. It doesn't seem worth worrying about someone who wants to go to these lengths to avoid a tampering check - they're just as likely to use Unchecked_Conversion or .all'Unchecked_Access or something else nasty.
---
We don't want to treat all parameters of functions like aliased parameters for compatibility and wording reasons.
We don't want to have to define accessibility checks on values, and we proved that trying to tie this property to particular types causes maintenance hazards (see the earlier alternative AI05-0142-3 to see why).
In addition, the accessibility checks needed could cause some functions used to initialize an allocator to become illegal or raise Program_Error. While that is relatively rare, it still would be hard to work around.
Finally, changing the master of all anonymous objects in parameters of functions would make function parameters live a long time in some cases (such as when the result is renamed or part of an allocator). That could be especially nasty if the parameter is an object that contains tasks that otherwise would have been terminated or releases locks when finalized. This would be the worst sort of incompatibility, where the program still works but does something different.
---
We need to be able to define the period of existence of the return object of the accessor as a time when tampering with elements is prohibited. If we didn't do this, we would be opening the door to all manner of problems (up to and including erroneousness) if an element is added or deleted to the container.
To see how this could happen, consider the following example:
Assume that Ada.Containers.Vectors has an accessor function defined as follows:
function Accessor (C : in out aliased Vector; I : Index_Type) return access Element_Type;
Now consider the following program fragments:
package Data is type Node (D : Boolean := True) is record case D is when True => CT : Natural; when False => CF : Float; end case; end record; subtype True_Node is Node (D => True); subtype False_Node is Node (D => False);
package Node_Vector is new Ada.Containers.Vectors (Element_Type => Node, Index_Type => Positive);
Node_List : Node_Vector.Vector; end Data;
with Data; package Far_Away is procedure Some_Complex_Operation (...); end Far_Away;
package body Far_Away is procedure Some_Complex_Operation (...) is Some_Index : Positive; begin ... Some_Index := <lengthy-computation-resulting-in-value-of-1>; ... Data.Node_List.Delete (Some_Index); ... end Some_Complex_Operation; end Far_Away;
with Data; package Process is procedure Process_True (NT : in out Data.True_Node); end Process;
with Far_Away; package body Process is procedure Process_True (NT : in out Data.True_Node) is Component : renames NT.CT; -- OK, NT is constrained. begin ... Far_Away.Some_Complex_Operation (...); if Component > 10 then -- ** Oh-oh!! Component has moved. ... end if; end Process_True; end Process;
with Data, Process; procedure Main is begin Data.Node_List.Append (New_Item => (D => False, CF => 1.0)); -- Element 1. Data.Node_List.Append (New_Item => (D => True, CT => 4)); -- Element 2. Data.Node_List.Append (New_Item => (D => False, CF => 26.5)); -- Element 3. ... Process.Process_True (Data.Node_List.Accessor (2).all); ... end Main;
Now, in the canonical implementation for a vector, all of the data is stored in an array of some size, and when a component is deleted, the rest of them slide down. So, when the Far_Away operation (probably called through many levels of calls) deletes the first element of the vector, all of the rest of them move. That means that the second element passed to Process_True unexpectedly changes its value to that which was in the third element. But that element doesn't match the constraint and doesn't even have a CT component! So we get erroneous behavior. Note that the same thing would happen if an element was inserted at the front of the vector.
Note that in this example no one has copied the access value returned by the accessor. So a rule preventing copying does no good. You can also get a similar effect by renaming the access value itself and doing bad things while the renames exists.
Note that the erroneousness can't happen if you use Update_Element instead, because the tampering check will cause the deletion to raise Program_Error. Similarly, cursors have rules allowing implementations to detect this sliding and raise Program_Error.
This example code seems plausible; the record type is similar to many found in the author's compiler and using a library-level container is not unusual. So it seems reasonable that cases like this would happen in practice. Thus we conclude that a tampering check is needed for the accessor functions.
!examples
The new Accessor function could be used (based on a recent example from comp.lang.ada):
with Ada.Containers.Vectors; procedure Check is package Integer_Vectors is new Ada.Containers.Vectors (Index_Type => Natural, Element_Type => Integer);
package Nested_Vectors is new Ada.Containers.Vectors (Index_Type => Natural, Element_Type => Integer_Vectors.Vector, "=" => Integer_Vectors."=");
IV : Integer_Vectors.Vector; NV : Nested_Vectors.Vector; begin IV.Append(42); NV.Append(IV); NV.Accessor(0).Element.Append(43); end Check;
The dereference does not need to be explicitly given in this case, and the fact that the object returned by Accessor is limited and (probably) controlled is not visible in the code.
You could also save the entire object as long as needed:
declare My_IV : Nested_Vectors.Accessor_Type := NV.Accessor(0); begin
(since My_NV would be built-in-place).
Note that this design still works as intended even if the user renames only the limited access value:
declare My_IV : Integer_Vectors.Vector renames NV.Accessor(0).Element.all; begin
Even in this case, the wrapping finalizable object still is protecting the container, because the master of the returned object is the master of the renames: it will stick around as long as the renames does.
Moreover, splitting the access from the object usually isn't possible:
declare My_IV : access Integer_Vectors.Vector := NV.Accessor(0).Element; -- Illegal! begin
This is illegal because the accessibility check fails. The (anonymous) object returned here will have the master of the expression, which is shorter than the master of the declaration of My_IV. Thus the accessibility check preventing shorter lifetimes will fail.
It is OK to rename the access:
declare My_IV : access Integer_Vectors.Vector renames NV.Accessor(0).Element; begin
as in this case, the anonymous object will continue to exist until the renames is finalized.
This element can be passed as a parameter, as again the wrapping object will live as long as the parameter:
Operate (NV.Accessor(0).Element);
or
Munge (NV.Accessor(0).Element.all);
!ACATS test
ACATS tests would be needed for this feature.
!appendix

****************************************************************

Questions? Ask the ACAA Technical Agent