Version 1.3 of ai05s/ai05-0212-1.txt
!standard A.18.2(9/2) 11-01-28 AI05-0212-1/03
!class Amendment 10-04-26
!status work item 10-04-26
!status received 10-02-27
!priority Medium
!difficulty Medium
!subject Accessors and Iterators for Ada.Containers
!summary
(See proposal.)
!problem
We've defined a number of new features that potentially can improve the use
of the containers libraries in AI05-0139-2, AI05-0142-4, AI05-0143-1, and
AI05-0162-1. The containers ought to be updated to use these new features.
!proposal
Add accessors and iterators to the containers packages.
!wording
[Editor's note: The following is based on AI05-0139-2/07.]
Add the following with to each container package:
with Ada.Iterator_Interfaces;
Modify the definition of the container types for each of the containers
(with suitable substitutions for the names of types) as follows:
type Vector is tagged private
with
Constant_Indexing => Constant_Reference,
Variable_Indexing => Reference,
Default_Iterator => Iterate,
Iterator_Element => Element_Type;
pragma Preelaborable_Initialization(Vector);
Add after constant No_Element:
package Vector_Iterator_Interfaces is new
Ada.Iterator_Interfaces (Cursors, No_Element);
Add the following to each Ada.Containers package immediately after
Update_Element (with suitable substitutions for the names of types).
type Constant_Reference_Type (Element : not null access constant Element_Type) is
with private
with
Implicit_Dereference => Element;
type Reference_Type (Element : not null access Element_Type) is
with private
with
Implicit_Dereference => Element;
Constant_Reference_Type and Reference_Type need finalization.
The default initialization of an object of type Constant_Reference_Type or Reference_Type
propagates Program_Error.
AARM Reason: It is expected that Reference_Type (and Constant_Reference_Type)
will be a controlled type, for which finalization will have some action to terminate
the tampering check for the associated container. If the object is created by default,
however, there is no associated container. Since this is useless, and supporting
this case would take extra work, we define it to raise an exception.
function Constant_Reference (Container : aliased in Vector; Position : in Cursor)
return Constant_Reference_Type;
This function (combined with the Constant_Indexing and Implicit_Dereference aspects)
provides a convinient way to get read access to the individual elements of
a container starting with a cursor.
If Position equals No_Element, then Constraint_Error is propagated; if Position
does not designate an element in Container, then Program_Error is propagated.
Otherwise, Constant_Reference returns an object whose discriminant is an access
value that designates the element designated by Position. Program_Error is propagated
if any operation tampers with the elements of Container while the object returned
by Constant_Reference exists and has not been finalized.
function Reference (Container : aliased in out Vector; Position : in Cursor)
return Reference_Type;
This function (combined with the Variable_Indexing and Implicit_Dereference aspects)
provides a convinient way to get read and write access to the individual elements of
a container starting with a cursor.
If Position equals No_Element, then Constraint_Error is propagated; if Position
does not designate an element in Container, then Program_Error is propagated.
Otherwise, Reference returns an object whose discriminant is an access value
that designates the element designated by Position. Program_Error is propagated if
any operation tampers with the elements of Container while the object returned
by Reference exists and has not been finalized.
The element designated by Position is not an empty element after successful
completion of this operation. [This is only needed for the Vectors case. - ED]
Add the following to Ada.Containers.Vectors and its relatives:
function Constant_Reference (Container : aliased in Vector; Index : in Index_Type)
return Constant_Reference_Type;
This routine (combined with the Constant_Indexing and Implicit_Dereference aspects)
provides a convinient way to get read access to the individual elements of
a container starting with an index value.
If Index is not in the range First_Index (Container) .. Last_Index (Container),
then Constraint_Error is propagated. Otherwise, Constant_Reference returns an object
whose discriminant is an access value that designates the element at position Index.
Program_Error is propagated if any operation tampers with the elements of Container
while the object returned by Constant_Reference exists and has not been finalized.
function Reference (Container : aliased in out Vector; Index : in Index_Type)
return Reference_Type;
This function (combined with the Variable_Indexing and Implicit_Dereference aspects)
provides a convinient way to get read and write access to the individual elements of
a container starting with an index value.
If Index is not in the range First_Index (Container) .. Last_Index (Container),
then Constraint_Error is propagated. Otherwise, Reference returns an object whose
discriminant is an access value that designates the element at position Index.
Program_Error is propagated if any operation tampers with the elements of Container
while the object returned by Reference exists and has not been finalized.
The element designated by Position is not an empty element after successful
completion of this operation.
Add the following to Ada.Containers.Hashed_Maps and the various other Map containers,
as well as inside of the Generic_Keys package for the various Set containers:
function Constant_Reference (Container : aliased in Map; Key : in Key_Type)
return Constant_Reference_Type;
This routine (combined with the Constant_Indexing and Implicit_Dereference aspects)
provides a convinient way to get read access to the individual elements of
a container starting with a key value.
Equivalent to Constant_Reference (Container, Find (Container, Key)).
function Reference (Container : aliased in out Map; Key : in Key_Type)
return Reference_Type;
This function (combined with the Variable_Indexing and Implicit_Dereference aspects)
provides a convinient way to get read and write access to the individual elements of
a container starting with an index value.
Equivalent to Reference (Container, Find (Container, Key)).
[Note: The versions for the set containers are unfortunately not primitive operations;
therefore they cannot be used via the _Indexing magic nor can they be called with
prefix notation. However, the reference semantics is operational, so the discriminant
and .all can be omitted; and they're probably still easier to use than Update_Element.
That seems like enough to justify defining them.
The Reference function for a Set and Key may need to work like the
Update_Element_Preserving_Key routine. More thought needed.]
After the existing procedure Reverse_Iterate of each container that has Reverse_Iterate,
add a new function:
function Iterate (Container : in Vector) return Iterators.Reversible_Iterator'Class;
Iterate returns an object that can be used in a loop_statement to loop
with a loop parameter designating each node in Container, starting with
the first node and moving the cursor as per the Next function if the reserved word
*reverse* is not used in the iterator_specification, and starting
with the last node and moving the cursor as per the Previous function otherwise.
Program_Error is propagated if any operation (in particular, the sequence_of_statements
of the loop_statement whose iterator_specification denotes this object) tampers
with the cursors of Container while the returned object exists. The returned
object from Iterate needs finalization.
[Note: This wording will need to be adjusted for containers that don't have Next, like
Maps. The wording will be parallel to that of the existing Iterate procedure.]
After the above routine for various Vector and List containers, add a new function:
function Iterate (Container : in Vector; Start : in Cursor) return Iterators.Reversible_Iterator'Class;
if Start is not No_Element and does not designate an item in Container, then Program_Error
is propagated. if Start is No_Element, the call is equivalent to Iterate (Container).
Otherwise, Iterate returns an object that can be used in a loop_statement to loop
with a loop parameter designating each node in Container, starting with
the node designated by From and moving the cursor as per the Next function if the reserved word
*reverse* is not used in the iterator_specification, or moving the cursor as per the Previous
function otherwise. Program_Error is propagated if any operation (in particular, the
sequence_of_statements of the loop_statement whose iterator_specification denotes this
object) tampers with the cursors of Container while the returned object exists. The returned
object from Iterate needs finalization.
AARM Note: Exits are allowed from the loops created using the iterator objects. in
particular, to stop the iteration at a particalar cursor, just add
exit when Cur = Stop;
in the body of the loop.
After the existing procedure Iterate of each container that does not have a procedure
Reverse_Iterate (for example, Hashed_Sets), add a new function:
function Iterate (Container : in Set) return Iterators.Forward_Iterator'Class;
Iterate returns an object that can be used in a loop_statement to loop
with a loop parameter designating each node in Container, starting
with the first node and moving the cursor as per the Next function.
Program_Error is propagated if any operation (in particular, the sequence_of_statements
of the loop_statement whose iterator_specification denotes this object) tampers
with the cursors of Container while the returned object exists. The returned
object from Iterate needs finalization.
Instead of the above, the Multiway_Tree containers would have:
function Iterate (Container : in Tree) return Iterators.Forward_Iterator'Class;
Iterate returns an object that can be used in a loop_statement to loop
with a loop parameter designating each node in Container, starting with the root node and
proceeding in a depth-first order. Program_Error is propagated if any operation (in particular,
the sequence_of_statements of the loop_statement whose iterator_specification denotes
this object) tampers with the cursors of Container while the returned object exists. The returned
object from Iterate needs finalization.
function Iterate_Subtree (Position : in Cursor) return Iterators.Forward_Iterator'Class;
Iterate returns an object that can be used in a loop_statement to loop
with a loop parameter designating each element in the subtree rooted by the node designated by
Position, starting with the node designated by Position and
proceeding in a depth-first order. if Position equals No_Element, then Constraint_Error is
propagated. Program_Error is propagated if any operation (in particular,
the sequence_of_statements of the loop_statement whose iterator_specification denotes
this object) tampers with the cursors of the container in which the node designated by Position
exists while the returned object exists. The returned
object from Iterate needs finalization.
function Iterate_Children (Container : in Tree; Parent : in Cursor) return Iterators.Reversible_Iterator'Class;
Iterate returns an object that can be used in a loop_statement to loop
with a loop parameter designating each child node of Parent. if Parent equals No_Element, then
Constraint_Error is propagated. if Parent does not designate a node in Container, then
Program_Error is propagated. Otherwise, if the reserved word *reverse* is not used in the
iterator_specification, the nodes are designated starting with the first child node and
moving the cursor as per the function Next_Sibling; if reserved word *reverse* is used, the
nodes are designated starting with the last child node and moving the cursor as per the
function Previous_Sibling. Program_Error is propagated if any operation (in particular,
the sequence_of_statements of the loop_statement whose iterator_specification denotes
this object) tampers with the cursors of Container while the returned object exists. The returned
object from Iterate needs finalization.
For each bounded container, delete the bullet and associated implementation note that says:
The pragma Preelaborate is replaced with pragma Pure.
Add an AARM note after the bullet about when the container needs finalization:
Implementation Note: The type Vector cannot depend on package Ada.Finalization unless the element
type depends on that package. The objects returned from the Iterator and Reference functions
probably do depend on package Ada.Finalization. Restricted environments may need to avoid
use of those functions and their associated types.
Add the following to the Erroneous Execution section of each container package:
Execution is erroneous if the vector associated with the result of a call to
Reference or Constant_Reference is finalized before the result object returned
by the call to Reference or Constant_Reference is finalized.
AARM Reason: Each object of Reference_Type and Constant_Reference_Type probably
contains some reference to the originating container. If that container is prematurely
finalized (which is only possible via Unchecked_Deallocation, as accessibility checks
prevent passing a container to Reference that will not live as long as the result), the
finalization of the object of Reference_Type will try to access a non-existent
object. This is a normal case of a dangling pointer created by
Unchecked_Deallocation; we have to explicitly mention it here as the pointer in
question is not visible in the specification of the type. (This is the same
reason we have to say this for invalid cursors.)
Add a new clause at the end of the containers section: (Currently A.18.33)
A.18.33 Example of container use.
[** TBD - bigger, better examples are supposed to be coming from Ed and Steve.
Examples of iterator use need to be included. **]
For the following declarations:
type Point is record
X, Y : Natural;
end record;
subtype Short is Natural range 1 .. 100;
package Point_Vectors is new Ada.Containers.Vectors (Short, Point);
use Point_Vectors;
Obj : Point_Vectors.Vector;
Cur : Point_Vectors.Cursor;
the effect of the Variable_Indexing and Implicit_Dereference aspects is that
Obj(Cur).X := 10;
has the effect of calling the Reference function and dereferencing the result, as in the
following expression:
Point_Vectors.Reference (Obj, Cur).Element.all.X := 10;
!discussion
Note that the accessibility of the returned object from a call to Reference will prevent converting
the returned access discriminant to any type that lives longer than the returned object. That means
that we can assume that the discriminant value cannot be accessed after the object disappears (thus
we can tie the tampering check to the lifetime of the returned object).
It is possible to use Unchecked_Deallocation to destroy the returned Reference_Type object
prematurely, while continuing to use the returned access. The accessibility check on the container
argument to the Reference function means that the returned access can't live longer than the
container, but the use of Unchecked_Deallocation would allow tampering on the container. This could
look like:
declare
type AR is access MV.Reference_Type;
procedure Free is new Ada.Unchecked_Deallocation (MV.Reference_Type, AR);
PAR : AR := new MV.Reference_Type'(Vect.Reference (1))
Element : access Element_Type := PAR.Element; --
begin
Free (PAR); --
Vect.Delete (1); --
... Element.all ... --
end;
Obviously, this is highly unlikely to happen by accident. This is more like shooting oneself in
the foot by attaching a laser target to your foot and then firing a laser-seeking missile. It
doesn't seem worth worrying about someone who wants to go to these lengths to avoid a tampering
check - they're just as likely to use Unchecked_Conversion or .all'Unchecked_Access or something
else nasty.
---
We need to be able to define the period of existence of the return object of the accessor
as a time when tampering with elements is prohibited. If we didn't do this, we would
be opening the door to all manner of problems (up to and including erroneousness) if
an element is added or deleted to the container.
To see how this could happen, consider the following example:
Assume that Ada.Containers.Vectors has an accessor function defined as follows:
function Reference (C : aliased in out Vector; I : Index_Type) return access Element_Type;
Now consider the following program fragments:
package Data is
type Node (D : Boolean := True) is record
case D is
when True =>
CT : Natural;
when False =>
CF : Float;
end case;
end record;
subtype True_Node is Node (D => True);
subtype False_Node is Node (D => False);
package Node_Vector is new Ada.Containers.Vectors (Element_Type => Node,
Index_Type => Positive);
Node_List : Node_Vector.Vector;
end Data;
with Data;
package Far_Away is
procedure Some_Complex_Operation (...);
end Far_Away;
package body Far_Away is
procedure Some_Complex_Operation (...) is
Some_Index : Positive;
begin
...
Some_Index := <lengthy-computation-resulting-in-value-of-1>;
...
Data.Node_List.Delete (Some_Index);
...
end Some_Complex_Operation;
end Far_Away;
with Data;
package Process is
procedure Process_True (NT : in out Data.True_Node);
end Process;
with Far_Away;
package body Process is
procedure Process_True (NT : in out Data.True_Node) is
Component : renames NT.CT; --
begin
...
Far_Away.Some_Complex_Operation (...);
if Component > 10 then --
...
end if;
end Process_True;
end Process;
with Data, Process;
procedure Main is
begin
Data.Node_List.Append (New_Item => (D => False, CF => 1.0)); --
Data.Node_List.Append (New_Item => (D => True, CT => 4)); --
Data.Node_List.Append (New_Item => (D => False, CF => 26.5)); --
...
Process.Process_True (Data.Node_List.Reference (2).all);
...
end Main;
In the canonical implementation for a vector, all of the data is stored in
an array of some size, and when a component is deleted, the rest of them slide
down. So, when the Far_Away operation (probably called through many levels of
calls) deletes the first element of the vector, all of the rest of them move.
That means that the second element passed to Process_True unexpectedly changes
its value to that which was in the third element. But that element doesn't match
the constraint and doesn't even have a CT component! So we get erroneous
behavior. Note that the same thing would happen if an element was inserted at
the front of the vector.
Note that in this example no one has copied the access value returned by the accessor.
So a rule preventing copying does no good. You can also get a similar
effect by renaming the access value itself and doing bad things while the
renames exists.
Note that the erroneousness can't happen if you use Update_Element instead,
because the tampering check will cause the deletion to raise Program_Error.
Similarly, cursors have rules allowing implementations to detect this sliding
and raise Program_Error.
This example code seems plausible; the record type is similar to many found in
the author's compiler and using a library-level container is not unusual. So it
seems reasonable that cases like this would happen in practice. Thus we conclude
that a tampering check is needed for the accessor functions.
!examples
The new Reference function can be used to update an element directly.
For example (based on a recent example from comp.lang.ada):
with Ada.Containers.Vectors;
procedure Check is
package Integer_Vectors is
new Ada.Containers.Vectors
(Index_Type => Natural,
Element_Type => Integer);
package Nested_Vectors is
new Ada.Containers.Vectors
(Index_Type => Natural,
Element_Type => Integer_Vectors.Vector,
"=" => Integer_Vectors."=");
IV : Integer_Vectors.Vector;
NV : Nested_Vectors.Vector;
begin
IV.Append(42);
NV.Append(IV);
NV.Reference(0).Element.all.Append(43);
end Check;
The dereference does not need to be explicitly given in this case, and the fact
that the object returned by Reference is limited and (probably) controlled is not
visible in the code as the function call is used directly as a prefix. So the call
can be just:
NV.Reference(0).Element.Append(43);
The "Implicit_Dereference" aspect of the returned object allows us to drop the
".Element" as well. That leave us with:
NV.Reference(0).Append(43);
Since containers have the "Variable_Indexing" aspect set to the Reference function,
directly indexing the container is equivalent to a call to Reference. So we also
could have written:
NV(0).Append(43);
This is essentially the same as what we could have written had the type of NV been
an array type.
You could also save the entire object as long as needed:
declare
My_IV : Nested_Vectors.Reference_Type := NV.Reference(0); --
begin
(since My_NV would be built-in-place).
Note that this design still works as intended even if the user renames only the
discriminant access value:
declare
My_IV : Integer_Vectors.Vector renames NV.Reference(0).Element.all;
begin
Even in this case, the wrapping (probably controlled) object still is protecting the
container, because the master of the returned object is the master of the
renames: it will stick around as long as the renames does.
Moreover, splitting the access from the object usually isn't possible:
declare
My_IV : access Integer_Vectors.Vector := NV.Reference(0).Element; --
begin
This is illegal because the accessibility check fails. The (anonymous) object
returned here will have the master of the expression (as a prefix, the special
"directly initializing" rule does not apply to this function call), which is shorter
than the master of the declaration of My_IV. Thus the accessibility check preventing
shorter lifetimes will fail.
It is OK to rename the access:
declare
My_IV : access Integer_Vectors.Vector renames NV.Reference(0).Element;
begin
as in this case, the anonymous object will continue to exist until the renames
is finalized.
This element can be passed as a parameter, as again the wrapping object will
live as long as the parameter:
Operate (NV.Reference(0).Element);
or
Munge (NV.Reference(0).Element.all);
For Maps, we define indexing by the key as well as by cursors. So if we have:
type Point is record
X, Y : Natural;
end record;
package Point_Maps is new Ada.Containers.Indefinite_Ordered_Maps (String, Point);
Locations : Point_Maps.Map;
Locations.Insert ("Tucker", (X | Y => 0));
Locations.Insert ("Randy", (X => 10, Y => 5));
Locations.Insert ("Steve", (X | Y => 20));
Locations.Insert ("John", (X => 50, Y => 100));
we can reference the elements directly via the keys (again we're using the
Variable_Indexing aspect to simplify the calls):
if Locations("Randy").X > 10 then ...
if Locations("Tucker").all = (X | Y => 0) then ...
Locations ("Steve").X := 15;
Locations ("John").all := (X | Y => 60);
The iterator functions can be used directly in a for loop:
for C in My_Vector.Iterator loop
if My_Vector(C).B then
My_Vector(C).Cnt := My_Vector(C).Cnt + 1;
end if;
end loop;
Note that the Reference and Constant_Reference functions (implicitly called above)
are very helpful in writing the body of this sort of loop.
When reasonable, the keyword "reverse" is supported with the usual meaning:
for C in reverse My_Vector.Iterator loop
Containers have the "Iterator" function as their default iterator. This is
always the iterator that unconditionally iterates the entire contents of the
container. These iterators can be used with the "of" syntax to directly access
the elements of a container. This means that the first loop above can be written:
for E of My_Vector loop
if E.B then
E.Cnt := E.Cnt + 1;
end if;
end loop;
Note however that the cursor form is more powerful. For instance, for a Map, the
default iterator can only access the elements and not the keys, while the cursor
form can access both. Similarly, for a tree, the element form does not allow the
loop body to know where in the tree the element exists; the cursor form allows
querying Depth, Parent, and many other useful functions.
!ACATS test
ACATS tests would be needed for these changes.
!appendix
From: Randy Brukardt
Sent: Never (Written January 28, 2011)
Notes on version /03 (compared to the /02 discussed at the phone meeting):
The declaration of the container type does not need an interface because we
replaced that by an aspect several versions back in AI05-0139-2. Sorry about
having completely forgotten that. That simplifies the description and eliminates
the incompatibility that we talked about.
Used the name "Vector_Iterator_Interfaces" for the magic instance, as the
suggested name "Iterator_Implementation" doesn't make much sense.
Added notes about the Set implementation of the Reference and Constant_Reference
routines for Keys.
Added a sentence about the object returned from the various iterator functions
needing finalization.
Added a note that all of the bounded containers are Preelaborated, not Pure.
(This drops A.18.19(3/3) and similar bullets.)
****************************************************************
Questions? Ask the ACAA Technical Agent