Version 1.6 of ai05s/ai05-0212-1.txt

Unformatted version of ai05s/ai05-0212-1.txt version 1.6
Other versions for file ai05s/ai05-0212-1.txt

!standard A.18.2(9/2)          11-03-08 AI05-0212-1/06
!standard A.18.33(0)
!class Amendment 10-04-26
!status work item 10-04-26
!status received 10-02-27
!priority Medium
!difficulty Medium
!subject Accessors and Iterators for Ada.Containers
!summary
(See proposal.)
!problem
We've defined a number of new features that potentially can improve the use of the containers libraries in AI05-0139-2, AI05-0142-4, AI05-0143-1, and AI05-0162-1. The containers ought to be updated to use these new features.
!proposal
Add accessors and iterators to the containers packages.
!wording
[Editor's note: The following is based on AI05-0139-2/07.]
Replace 7.6(4/1) with:
package Finalization is pragma Pure(Finalization);
[This changes this to a Pure package, and drops "Remote_Types" (which is implicit in Pure).]
Add the following with to each container package:
with Ada.Iterator_Interfaces;
Modify the definition of the container types for each of the containers (with suitable substitutions for the names of types) as follows [note: Set does not have Variable_Indexing as it does not allow in-place modification of elements]:
type Vector is tagged private with Constant_Indexing => Constant_Reference, Variable_Indexing => Reference, Default_Iterator => Iterate, Iterator_Element => Element_Type; pragma Preelaborable_Initialization(Vector);
Add after constant No_Element:
package Vector_Iterator_Interfaces is new Ada.Iterator_Interfaces (Cursors, No_Element);
Add the following to each Ada.Containers package (other than the Queues packages) immediately after Update_Element (with suitable substitutions for the names of types):
type Constant_Reference_Type (Element : not null access constant Element_Type) is private with Implicit_Dereference => Element;
type Reference_Type (Element : not null access Element_Type) is private with Implicit_Dereference => Element;
Constant_Reference_Type and Reference_Type need finalization.
The default initialization of an object of type Constant_Reference_Type or Reference_Type propagates Program_Error.
AARM Reason: It is expected that Reference_Type (and Constant_Reference_Type) will be a controlled type, for which finalization will have some action to terminate the tampering check for the associated container. If the object is created by default, however, there is no associated container. Since this is useless, and supporting this case would take extra work, we define it to raise an exception.
Add the following to each Ada.Containers package that has a Cursor type immediately after the above (with suitable substitutions for the names of types) - exception: there is no Reference in package Sets [meta-rule: There is a Constant_Reference for each Query_Element; a Reference for each Update_Element]:
function Constant_Reference (Container : aliased in Vector; Position : in Cursor) return Constant_Reference_Type;
This function (combined with the Constant_Indexing and Implicit_Dereference aspects) provides a convenient way to get read access to the individual elements of a container starting with a cursor.
If Position equals No_Element, then Constraint_Error is propagated; if Position does not designate an element in Container, then Program_Error is propagated. Otherwise, Constant_Reference returns an object whose discriminant is an access value that designates the element designated by Position. Program_Error is propagated if any operation tampers with the elements of Container while the object returned by Constant_Reference exists and has not been finalized.
function Reference (Container : aliased in out Vector; Position : in Cursor) return Reference_Type;
This function (combined with the Variable_Indexing and Implicit_Dereference aspects) provides a convenient way to get read and write access to the individual elements of a container starting with a cursor.
If Position equals No_Element, then Constraint_Error is propagated; if Position does not designate an element in Container, then Program_Error is propagated. Otherwise, Reference returns an object whose discriminant is an access value that designates the element designated by Position. Program_Error is propagated if any operation tampers with the elements of Container while the object returned by Reference exists and has not been finalized.
The element designated by Position is not an empty element after successful completion of this operation. [This is only needed for the Vectors case. - ED]
Add the following to Ada.Containers.Vectors and its relatives:
function Constant_Reference (Container : aliased in Vector; Index : in Index_Type) return Constant_Reference_Type;
This routine (combined with the Constant_Indexing and Implicit_Dereference aspects) provides a convenient way to get read access to the individual elements of a container starting with an index value.
If Index is not in the range First_Index (Container) .. Last_Index (Container), then Constraint_Error is propagated. Otherwise, Constant_Reference returns an object whose discriminant is an access value that designates the element at position Index. Program_Error is propagated if any operation tampers with the elements of Container while the object returned by Constant_Reference exists and has not been finalized.
function Reference (Container : aliased in out Vector; Index : in Index_Type) return Reference_Type;
This function (combined with the Variable_Indexing and Implicit_Dereference aspects) provides a convenient way to get read and write access to the individual elements of a container starting with an index value.
If Index is not in the range First_Index (Container) .. Last_Index (Container), then Constraint_Error is propagated. Otherwise, Reference returns an object whose discriminant is an access value that designates the element at position Index. Program_Error is propagated if any operation tampers with the elements of Container while the object returned by Reference exists and has not been finalized.
The element designated by Position is not an empty element after successful completion of this operation.
Add the following to Ada.Containers.Hashed_Maps and the various other Map containers:
function Constant_Reference (Container : aliased in Map; Key : in Key_Type) return Constant_Reference_Type;
This routine (combined with the Constant_Indexing and Implicit_Dereference aspects) provides a convenient way to get read access to the individual elements of a container starting with a key value.
Equivalent to Constant_Reference (Container, Find (Container, Key)).
function Reference (Container : aliased in out Map; Key : in Key_Type) return Reference_Type;
This function (combined with the Variable_Indexing and Implicit_Dereference aspects) provides a convenient way to get read and write access to the individual elements of a container starting with an key value.
Equivalent to Reference (Container, Find (Container, Key)).
Add the following as inside of the Generic_Keys package for the various Set containers:
[Note: Reference_Type is also declared inside of package Generic_Keys.]
function Reference_Preserving_Key (Container : aliased in out Set; Position : in Cursor) return Reference_Type;
This function (combined with the Implicit_Dereference aspect) provides a convenient way to get read and write access to the individual elements of a container starting with an key value.
If Position equals No_Element, then Constraint_Error is propagated; if Position does not designate an element in Container, then Program_Error is propagated. Otherwise, Reference_Preserving_Key uses Key to save the key value K; then returns an object whose discriminant is an access value that designates the element designated by Position. Program_Error is propagated if any operation tampers with the elements of Container while the object returned by Reference_Preserving_Key exists and has not been finalized. When the object returned by Reference_PReserving_Key is finalized, a check is made if K determines the same equivalence class as that for the new element; if not, the element is removed from the set and Program_Error is propagated.
function Constant_Reference (Container : aliased in Set; Key : in Key_Type) return Constant_Reference_Type;
This routine (combined with the Implicit_Dereference aspect) provides a convenient way to get read access to the individual elements of a container starting with a key value.
Equivalent to Constant_Reference (Container, Find (Container, Key)).
function Reference_Preserving_Key (Container : aliased in out Set; Key : in Key_Type) return Reference_Type;
This function (combined with the Implicit_Dereference aspect) provides a convenient way to get read and write access to the individual elements of a container starting with an key value.
Equivalent to Reference_Preserving_Key (Container, Find (Container, Key)).
[Note: The versions for the set containers are unfortunately not primitive operations; therefore they cannot be used via the _Indexing magic nor can they be called with prefix notation. However, the reference semantics is operational, so the discriminant and .all can be omitted; and they're probably still easier to use than Update_Element. That seems like enough to justify defining them.]
Add the following to Ada.Containers.Indefinite_Holders and its relatives:
function Constant_Reference (Container : aliased in Holder) return Constant_Reference_Type;
This routine (combined with the Implicit_Dereference aspect) provides a convenient way to get read access to the contained element of a holder container.
If Container is empty, Constraint_Error is propagated. Otherwise, Constant_Reference returns an object whose discriminant is an access value that designates the contained element. Program_Error is propagated if any operation tampers with the elements of Container while the object returned by Constant_Reference exists and has not been finalized.
function Reference (Container : aliased in out Holder) return Reference_Type;
This function (combined with the Implicit_Dereference aspects) provides a convenient way to get read and write access to the contained element of a holder container.
If Container is empty, Constraint_Error is propagated. Otherwise, Reference returns an object whose discriminant is an access value that designates the contained element. Program_Error is propagated if any operation tampers with the elements of Container while the object returned by Reference exists and has not been finalized.
After the existing procedure Reverse_Iterate of each container that has Reverse_Iterate, add a new function:
function Iterate (Container : in Vector) return Iterators.Reversible_Iterator'Class;
Iterate returns an object that can be used in a loop_statement to loop with a loop parameter designating each node in Container, starting with the first node and moving the cursor as per the Next function if the reserved word reverse is not used in the iterator_specification, and starting with the last node and moving the cursor as per the Previous function otherwise. Program_Error is propagated if any operation (in particular, the sequence_of_statements of the loop_statement whose iterator_specification denotes this object) tampers with the cursors of Container while the returned object exists. The returned object from Iterate needs finalization.
[Note: This wording will need to be adjusted for containers that don't have Next, like Maps. The wording will be parallel to that of the existing Iterate procedure.]
After the above routine for various Vector and List containers, add a new function:
function Iterate (Container : in Vector; Start : in Cursor) return Iterators.Reversible_Iterator'Class;
if Start is not No_Element and does not designate an item in Container, then Program_Error is propagated. if Start is No_Element, the call is equivalent to Iterate (Container). Otherwise, Iterate returns an object that can be used in a loop_statement to loop with a loop parameter designating each node in Container, starting with the node designated by From and moving the cursor as per the Next function if the reserved word *reverse* is not used in the iterator_specification, or moving the cursor as per the Previous function otherwise. Program_Error is propagated if any operation (in particular, the sequence_of_statements of the loop_statement whose iterator_specification denotes this object) tampers with the cursors of Container while the returned object exists. The returned object from Iterate needs finalization.
AARM Note: Exits are allowed from the loops created using the iterator objects. In particular, to stop the iteration at a particalar cursor, just add
exit when Cur = Stop;
in the body of the loop.
After the existing procedure Iterate of each container that does not have a procedure Reverse_Iterate (for example, Hashed_Sets), add a new function:
function Iterate (Container : in Set) return Iterators.Forward_Iterator'Class;
Iterate returns an object that can be used in a loop_statement to loop with a loop parameter designating each node in Container, starting with the first node and moving the cursor as per the Next function. Program_Error is propagated if any operation (in particular, the sequence_of_statements of the loop_statement whose iterator_specification denotes this object) tampers with the cursors of Container while the returned object exists. The returned object from Iterate needs finalization.
Instead of the above, the Multiway_Tree containers would have:
function Iterate (Container : in Tree) return Iterators.Forward_Iterator'Class;
Iterate returns an object that can be used in a loop_statement to loop with a loop parameter designating each node in Container, starting with the root node and proceeding in a depth-first order. Program_Error is propagated if any operation (in particular, the sequence_of_statements of the loop_statement whose iterator_specification denotes this object) tampers with the cursors of Container while the returned object exists. The returned object from Iterate needs finalization.
function Iterate_Subtree (Position : in Cursor) return Iterators.Forward_Iterator'Class;
Iterate returns an object that can be used in a loop_statement to loop with a loop parameter designating each element in the subtree rooted by the node designated by Position, starting with the node designated by Position and proceeding in a depth-first order. If Position equals No_Element, then Constraint_Error is propagated. Program_Error is propagated if any operation (in particular, the sequence_of_statements of the loop_statement whose iterator_specification denotes this object) tampers with the cursors of the container in which the node designated by Position exists while the returned object exists. The returned object from Iterate needs finalization.
function Iterate_Children (Container : in Tree; Parent : in Cursor) return Iterators.Reversible_Iterator'Class;
Iterate returns an object that can be used in a loop_statement to loop with a loop parameter designating each child node of Parent. If Parent equals No_Element, then Constraint_Error is propagated. If Parent does not designate a node in Container, then Program_Error is propagated. Otherwise, if the reserved word reverse is not used in the iterator_specification, the nodes are designated starting with the first child node and moving the cursor as per the function Next_Sibling; if reserved word reverse is used, the nodes are designated starting with the last child node and moving the cursor as per the function Previous_Sibling. Program_Error is propagated if any operation (in particular, the sequence_of_statements of the loop_statement whose iterator_specification denotes this object) tampers with the cursors of Container while the returned object exists. The returned object from Iterate needs finalization.
Add the following to the Erroneous Execution section of each (unbounded) container package:
Execution is erroneous if the vector associated with the result of a call to Reference or Constant_Reference is finalized before the result object returned by the call to Reference or Constant_Reference is finalized.
AARM Reason: Each object of Reference_Type and Constant_Reference_Type probably contains some reference to the originating container. If that container is prematurely finalized (which is only possible via Unchecked_Deallocation, as accessibility checks prevent passing a container to Reference that will not live as long as the result), the finalization of the object of Reference_Type will try to access a non-existent object. This is a normal case of a dangling pointer created by Unchecked_Deallocation; we have to explicitly mention it here as the pointer in question is not visible in the specification of the type. (This is the same reason we have to say this for invalid cursors.)
For each bounded container, delete AARM implementation note that says:
Package Containers.Bounded_Vectors cannot depend on package Ada.Finalization (because that package has Preelaborate categorization).
Add an AARM note after the bullet about when the container needs finalization:
Implementation Note: The type Vector cannot depend on package Ada.Finalization unless the element type depends on that package. The objects returned from the Iterator and Reference functions probably do depend on package Ada.Finalization. Restricted environments may need to avoid use of those functions and their associated types.
Add a new clause at the end of the containers section: (Currently A.18.33)
A.18.33 Example of container use
The following example using the containers to calculate the shortest paths between each pair of nodes in a set.
with Ada.Containers.Hashed_Sets; with Ada.Containers.Vectors; with Ada.Containers.Doubly_Linked_Lists; use Ada.Containers; generic type Node is range <>; package Shortest_Paths is type Distance is new Float range 0.0 .. Float'Last; type Edge is record To, From : Node; Length : Distance; end record;
package Node_Sets is new Vectors (Node, Boolean); package Node_Maps is new Vectors (Node, Node);
package Adjacency_Lists is new Doubly_Linked_Lists (Edge); use Adjacency_Lists;
package Graphs is new Vectors (Node, Adjacency_Lists.List);
package Paths is new Doubly_Linked_Lists (Node);
package Distances is new Vectors (Node, Distance);
function Shortest_Path (G : Graphs.Vector; Source : Node; Target : Node) return Paths.List; end Shortest_Paths;
package body Shortest_Paths is function Shortest_Path (G : Graphs.Vector; Source : Node; Target : Node) return Paths.List is use Node_Sets, Node_Maps, Paths, Distances, Graphs; Reached : Node_Sets.Vector; Reached_From : Node_Maps.Vector; So_Far : Distances.Vector; The_Path : Paths.List := Paths.Empty_List; Nearest : Node; Nearest_Distance : Distance; Next : Node; begin Reserve_Capacity (So_Far, Length (G)); Reserve_Capacity (Reached, Length (G));
-- Initially all nodes are unreachable. for N in G.Iterate loop So_Far (N) := Distance'Last; Reached (N) := False; end loop;
Reached (Source) := True;
while not Reached (Target) loop Nearest_Distance := Distance'Last;
-- Find closest node not reached yet.
for N in G.Iterate loop if not Reached (N) and then So_Far (N) < Nearest_Distance then Next := N; Nearest_Distance := So_Far (N); end if; end loop;
Reached (Next) := True;
-- Update minimum distance to newly reachable nodes.
for E of G (Next) loop if not Reached (E.To) then Nearest_Distance := Distance'Min (So_Far (E.To) + So_Far (Nearest), So_Far (E.To));
if Nearest_Distance < So_Far (E.To) then Reached_From (E.To) := Next; So_Far (E.To) := Nearest_Distance; end if; end if; end loop; end loop;
-- Rebuild path from target to source.
declare N : Node := Target; begin while N /= Source loop N := Reached_From (N); Prepend (The_Path, N); end loop; end; return The_Path; end; end Shortest_Paths;
Note that the effect of the Variable_Indexing aspect (on type Vector) and the Implicit_Dereference aspect (on type Reference_Type) is that
So_Far (N) := Distance'Last;
is a convenient short hand for
So_Far.Reference (N).Element.all := Distance'Last;
Similarly, the effect of the loop:
for E of G (Next) loop if not Reached (E.To) then ... end if; end loop;
is the same as:
for C in G (Next).Iterate loop declare E : Edge renames G (Next)(C).all; begin if not Reached (E.To) then ... end if; end; end loop;
which is the same as:
declare L : Adjecency_Lists.List renames G (Next); C : Adjacency_Lists.Cursor := L.First; begin while C /= No_Element loop declare E : Edge renames L(C).all; begin if not Reached (E.To) then ... end if; end; C := L.Next (C); end loop; end;
!discussion
Note that the accessibility of the returned object from a call to Reference will prevent converting the returned access discriminant to any type that lives longer than the returned object. That means that we can assume that the discriminant value cannot be accessed after the object disappears (thus we can tie the tampering check to the lifetime of the returned object).
It is possible to use Unchecked_Deallocation to destroy the returned Reference_Type object prematurely, while continuing to use the returned access. The accessibility check on the container argument to the Reference function means that the returned access can't live longer than the container, but the use of Unchecked_Deallocation would allow tampering on the container. This could look like:
declare type AR is access MV.Reference_Type; procedure Free is new Ada.Unchecked_Deallocation (MV.Reference_Type, AR);
PAR : AR := new MV.Reference_Type'(Vect.Reference (1))
Element : access Element_Type := PAR.Element; -- OK.
begin Free (PAR); -- Tampering checking ends here!
Vect.Delete (1); -- No check here. ... Element.all ... -- Oops, gone. end;
Obviously, this is highly unlikely to happen by accident. This is more like shooting oneself in the foot by attaching a laser target to your foot and then firing a laser-seeking missile. It doesn't seem worth worrying about someone who wants to go to these lengths to avoid a tampering check - they're just as likely to use Unchecked_Conversion or .all'Unchecked_Access or something else nasty.
---
We need to be able to define the period of existence of the return object of the accessor as a time when tampering with elements is prohibited. If we didn't do this, we would be opening the door to all manner of problems (up to and including erroneousness) if an element is added or deleted to the container.
To see how this could happen, consider the following example:
Assume that Ada.Containers.Vectors has an accessor function defined as follows:
function Reference (C : aliased in out Vector; I : Index_Type) return access Element_Type;
Now consider the following program fragments:
package Data is type Node (D : Boolean := True) is record case D is when True => CT : Natural; when False => CF : Float; end case; end record; subtype True_Node is Node (D => True); subtype False_Node is Node (D => False);
package Node_Vector is new Ada.Containers.Vectors (Element_Type => Node, Index_Type => Positive);
Node_List : Node_Vector.Vector; end Data;
with Data; package Far_Away is procedure Some_Complex_Operation (...); end Far_Away;
package body Far_Away is procedure Some_Complex_Operation (...) is Some_Index : Positive; begin ... Some_Index := <lengthy-computation-resulting-in-value-of-1>; ... Data.Node_List.Delete (Some_Index); ... end Some_Complex_Operation; end Far_Away;
with Data; package Process is procedure Process_True (NT : in out Data.True_Node); end Process;
with Far_Away; package body Process is procedure Process_True (NT : in out Data.True_Node) is Component : renames NT.CT; -- OK, NT is constrained. begin ... Far_Away.Some_Complex_Operation (...); if Component > 10 then -- ** Oh-oh!! Component has moved. ... end if; end Process_True; end Process;
with Data, Process; procedure Main is begin Data.Node_List.Append (New_Item => (D => False, CF => 1.0)); -- Element 1. Data.Node_List.Append (New_Item => (D => True, CT => 4)); -- Element 2. Data.Node_List.Append (New_Item => (D => False, CF => 26.5)); -- Element 3. ... Process.Process_True (Data.Node_List.Reference (2).all); ... end Main;
In the canonical implementation for a vector, all of the data is stored in an array of some size, and when a component is deleted, the rest of them slide down. So, when the Far_Away operation (probably called through many levels of calls) deletes the first element of the vector, all of the rest of them move. That means that the second element passed to Process_True unexpectedly changes its value to that which was in the third element. But that element doesn't match the constraint and doesn't even have a CT component! So we get erroneous behavior. Note that the same thing would happen if an element was inserted at the front of the vector.
Note that in this example no one has copied the access value returned by the accessor. So a rule preventing copying does no good. You can also get a similar effect by renaming the access value itself and doing bad things while the renames exists.
Note that the erroneousness can't happen if you use Update_Element instead, because the tampering check will cause the deletion to raise Program_Error. Similarly, cursors have rules allowing implementations to detect this sliding and raise Program_Error.
This example code seems plausible; the record type is similar to many found in the author's compiler and using a library-level container is not unusual. So it seems reasonable that cases like this would happen in practice. Thus we conclude that a tampering check is needed for the accessor functions.
!examples
The new Reference function can be used to update an element directly. For example (based on a recent example from comp.lang.ada):
with Ada.Containers.Vectors; procedure Check is package Integer_Vectors is new Ada.Containers.Vectors (Index_Type => Natural, Element_Type => Integer);
package Nested_Vectors is new Ada.Containers.Vectors (Index_Type => Natural, Element_Type => Integer_Vectors.Vector, "=" => Integer_Vectors."=");
IV : Integer_Vectors.Vector; NV : Nested_Vectors.Vector; begin IV.Append(42); NV.Append(IV); NV.Reference(0).Element.all.Append(43); end Check;
The dereference does not need to be explicitly given in this case, and the fact that the object returned by Reference is limited and (probably) controlled is not visible in the code as the function call is used directly as a prefix. So the call can be just:
NV.Reference(0).Element.Append(43);
The "Implicit_Dereference" aspect of the returned object allows us to drop the ".Element" as well. That leave us with:
NV.Reference(0).Append(43);
Since containers have the "Variable_Indexing" aspect set to the Reference function, directly indexing the container is equivalent to a call to Reference. So we also could have written:
NV(0).Append(43);
This is essentially the same as what we could have written had the type of NV been an array type.
You could also save the entire object as long as needed:
declare My_IV : Nested_Vectors.Reference_Type := NV.Reference(0); -- Or just NV(0). begin
(since My_NV would be built-in-place).
Note that this design still works as intended even if the user renames only the discriminant access value:
declare My_IV : Integer_Vectors.Vector renames NV.Reference(0).Element.all; begin
Even in this case, the wrapping (probably controlled) object still is protecting the container, because the master of the returned object is the master of the renames: it will stick around as long as the renames does.
Moreover, splitting the access from the object usually isn't possible:
declare My_IV : access Integer_Vectors.Vector := NV.Reference(0).Element; -- Illegal! begin
This is illegal because the accessibility check fails. The (anonymous) object returned here will have the master of the expression (as a prefix, the special "directly initializing" rule does not apply to this function call), which is shorter than the master of the declaration of My_IV. Thus the accessibility check preventing shorter lifetimes will fail.
It is OK to rename the access:
declare My_IV : access Integer_Vectors.Vector renames NV.Reference(0).Element; begin
as in this case, the anonymous object will continue to exist until the renames is finalized.
This element can be passed as a parameter, as again the wrapping object will live as long as the parameter:
Operate (NV.Reference(0).Element);
or
Munge (NV.Reference(0).Element.all);
For Maps, we define indexing by the key as well as by cursors. So if we have:
type Point is record X, Y : Natural; end record; package Point_Maps is new Ada.Containers.Indefinite_Ordered_Maps (String, Point);
Locations : Point_Maps.Map;
Locations.Insert ("Tucker", (X | Y => 0)); Locations.Insert ("Randy", (X => 10, Y => 5)); Locations.Insert ("Steve", (X | Y => 20)); Locations.Insert ("John", (X => 50, Y => 100));
we can reference the elements directly via the keys (again we're using the Variable_Indexing aspect to simplify the calls):
if Locations("Randy").X > 10 then ... if Locations("Tucker").all = (X | Y => 0) then ... Locations ("Steve").X := 15; Locations ("John").all := (X | Y => 60);
The iterator functions can be used directly in a for loop:
for C in My_Vector.Iterator loop if My_Vector(C).B then My_Vector(C).Cnt := My_Vector(C).Cnt + 1; end if; end loop;
Note that the Reference and Constant_Reference functions (implicitly called above) are very helpful in writing the body of this sort of loop.
When reasonable, the keyword "reverse" is supported with the usual meaning:
for C in reverse My_Vector.Iterator loop
Containers have the "Iterator" function as their default iterator. This is always the iterator that unconditionally iterates the entire contents of the container. These iterators can be used with the "of" syntax to directly access the elements of a container. This means that the first loop above can be written:
for E of My_Vector loop if E.B then E.Cnt := E.Cnt + 1; end if; end loop;
Note however that the cursor form is more powerful. For instance, for a Map, the default iterator can only access the elements and not the keys, while the cursor form can access both. Similarly, for a tree, the element form does not allow the loop body to know where in the tree the element exists; the cursor form allows querying Depth, Parent, and many other useful functions.
!ACATS test
ACATS tests would be needed for these changes.
!appendix

From: Randy Brukardt
Sent: Never (Written January 28, 2011)

Notes on version /03 (compared to the /02 discussed at the phone meeting):

The declaration of the container type does not need an interface because we
replaced that by an aspect several versions back in AI05-0139-2. Sorry about
having completely forgotten that. That simplifies the description and eliminates
the incompatibility that we talked about.

Used the name "Vector_Iterator_Interfaces" for the magic instance, as the
suggested name "Iterator_Implementation" doesn't make much sense.

Added notes about the Set implementation of the Reference and Constant_Reference
routines for Keys.

Added a sentence about the object returned from the various iterator functions
needing finalization.

Added a note that all of the bounded containers are Preelaborated, not Pure.
(This drops A.18.19(3/3) and similar bullets.)

****************************************************************

From: Randy Brukardt
Sent: Monday, February  7, 2011  7:28 PM

Steve brings up an interesting question in some private mail: should the Ada
strings packages have changes to incorporate iterators  and indexing?
Presumably, we'd make those changes in this same AI.

Here is a list of the various changes and how they might be used in strings
packages:

Aspect Implicit_Dereference
   This doesn't make sense for the strings packages; we don't want to be making
   pointers at individual characters. Requiring the individual characters to be
   aliased to support this strategy could cause some implementations to waste a
   lot of memory (on a word-addressable machine in particular).

Aspect Constant_Indexing
   This maps nicely on the existing function Element for the Bounded and
   Unbounded strings. (Fixed strings, of course, already have this as an array
   type.) So I think we should consider adding this.

Aspect Variable_Indexing
   This maps to procedure Replace_Element, but not sensibly. The current scheme
   imagines that it maps to a function returning an access to an element (using
   reference) -- but using an access type doesn't make much sense in this
   context. Rather, we'd rather pass a value to a procedure to set the element.
   That of course would require some sort of change to the currently defined
   mechanism for Variable_Indexing. I'd guess that is too complicated to define,
   so I think it probably would make more sense to just forget this capability.
   (Whether this means we should forget about providing *read* access I can't
   say.)

Iterator interfaces
   These seem like overkill here, since the indexes are integers and iterating
   an entire object S just requires
       for I in 1..Length(S) loop
   But see the next item.

Aspects Default_Iterator and Iterator_Element
   These would be useful, as they would enable the string to be iterated with
   the container syntax:
       for Ch of My_Unb_String loop

   There are a couple of issues:
     (1) First, without Variable_Indexing, Ch could only be used in a read-only
	 way. That might be pretty limiting.
     (2) We'd have to define an iterator function much like we do in the
	 containers to be used here. That adds a bit of complication (no big
	 deal, though).

   It's not clear if this is worth it.

So, the only slam-dunk is Constant_Indexing. It's not clear if that alone is
worth defining (it could be confusing that writing elements isn't supported). Or
we could consider trying to come up with an extra mapping for Variable_Indexing
that would map:
    S(N) := C;
into
    Replace_Element (S, N, C);

In which case we would want to define all of these things.

Thoughts??

****************************************************************

From: Tucker Taft
Sent: Monday, February  7, 2011  8:06 PM

> Aspect Implicit_Dereference
>     This doesn't make sense for the strings packages; we don't want to
> be making pointers at individual characters. Requiring the individual
> characters to be aliased to support this strategy could cause some
> implementations to waste a lot of memory (on a word-addressable
> machine in particular).

We could generalize Implicit_Dereference a bit by having a different
equivalence, such as "Ref" by itself is equivalent to
"Ref.Discrim.all(Ref.Index)" or "Ref.Discrim.all.Field" and thereby avoid the
need to have the individual elements be aliased, so long as some enclosing
composite object is aliased.

...
> Aspect Variable_Indexing
>     This maps to procedure Replace_Element, but not sensibly. The
> current scheme imagines that it maps to a function returning an access
> to an element (using reference) -- but using an access type doesn't
> make much sense in this context. Rather, we'd rather pass a value to a
> procedure to set the element. That of course would require some sort
> of change to the currently defined mechanism for Variable_Indexing.
> I'd guess that is too complicated to define, so I think it probably
> would make more sense to just forget this capability. (Whether this
> means we should forget about providing *read* access I can't say.)

For Variable_Indexing, if we generalize Implicit_Dereference, then we can also
generalize Variable_Indexing.

...
>     There are a couple of issues:
>       (1) First, without Variable_Indexing, Ch could only be used in a
> read-only way. That might be pretty limiting.

But see above.

...
> Thoughts??

Might be worth investigating a generalization of Implicit_Dereference which
works with unaliased subcomponents of an aliased composite object.

****************************************************************

From: Randy Brukardt
Sent: Monday, February  7, 2011  8:22 PM

...
> We could generalize Implicit_Dereference a bit by having a different
> equivalence, such as "Ref" by itself is equivalent to
> "Ref.Discrim.all(Ref.Index)" or "Ref.Discrim.all.Field"
> and thereby avoid the need to have the individual elements be aliased,
> so long as some enclosing composite object is aliased.

I don't see how this helps. The public view of an Unbounded_String is a
featureless partial view. Whether or not there is any enclosing composite object
that is aliased is not something that the public can know about. OTOH, the
aspects are all about the *public* view of something.

I don't think it is a good idea to expose anything about the implementation of
this type, because there probably are implementations that violate it.

For instance, you could make the scheme you suggest here work if you were
willing to require that all implementations store unbounded strings as an access
to a string. [Then an operation that returned such a pointer could be used as an
implicit dereference.] But that would break any implementations that use a
chunked implementation (which might make sense if assignments/modifications are
more common in programs than the reading operations).

****************************************************************

From: Tucker Taft
Sent: Monday, February  7, 2011  10:28 PM

> I don't see how this helps. The public view of an Unbounded_String is
> a featureless partial view. Whether or not there is any enclosing
> composite object that is aliased is not something that the public can know about.
> OTOH, the aspects are all about the *public* view of something.

Hmmm...  Getting creative here.  Suppose we define a new aspect
"Private_Implicit_Dereference" which indicates that the associated type is a
reference type which when implicitly dereferenced has the specified type:

    type Element_Reference(<>) is private
      with
        Private_Implicit_Dereference => Character;

    ...
private

    type Unbounded_String is new Finalization.Controlled with record
        Chars : access String;
    end record;

    type Element_Reference(UStr : access Unbounded_String) is record
        Index : Positive := 1;
    end record
      with
        Implicit_Dereference => UStr.Chars(Index);

Then we extend the Implicit_Dereference aspect to take not just a name of an
access discriminant, but allow a more general name as exemplified by the above.

> I don't think it is a good idea to expose anything about the
> implementation of this type, because there probably are implementations that
> violate it...

Agreed.

****************************************************************

From: Randy Brukardt
Sent: Monday, February  7, 2011  10:50 PM

> Hmmm...  Getting creative here.  Suppose we define a new aspect
> "Private_Implicit_Dereference" which indicates that the associated
> type is a reference type which when implicitly dereferenced has the
> specified type:
...
> Then we extend the Implicit_Dereference aspect to take not just a name
> of an access discriminant, but allow a more general name as
> exemplified by the above.

That seems like a lot of mechanism for not much gain. We *really* want to be
able to do assignments of values directly into elements of this type; I don't
see much point in even faking a by-reference mechanism. (Note that there can be
components that you can't usefully take an address of; this seems similar.)

That's why I suggested changing Variable_Indexing so that it could work by
calling the existing procedure Replace_Element. That also seems like quite a lot
of mechanism, but at least it doesn't require faking something no one wants in
the first place.

That is, allow Variable_Indexing to match a procedure with three parameters,
then convert
    S(N) := C;
to
    Replace_Element (S, N, C);

This would also require a bit more care with the definition of the two cases. In
particular, for "in out" parameter passing by-copy, you would have to use
Constant_Indexing on the way in and Variable_Indexing on the way out.

The only unusual thing is that S(N) would not be the name of an object (rather
the name of a value), so it couldn't be renamed.

=============

In any case, I don't want this corner case to screw up the nice proposal. So
let's not try too hard with this one...

****************************************************************

From: Tucker Taft
Sent: Monday, February  7, 2011  11:14 PM

> That seems like a lot of mechanism for not much gain. We *really* want
> to be able to do assignments of values directly into elements of this
> type; I don't see much point in even faking a by-reference mechanism.
> (Note that there can be components that you can't usefully take an
> address of; this seems similar.)

I don't understand the above paragraph.  You say we want to be able to do
assignments directly into elements of this type, but then you don't see much
point in faking a by-reference mechanism. These seem to be contradictory
statements (at least to me).

> That's why I suggested changing Variable_Indexing so that it could
> work by calling the existing procedure Replace_Element. That also
> seems like quite a lot of mechanism, but at least it doesn't require
> faking something no one wants in the first place.

This would only work for elements of a by-copy type when passing an OUT or
IN-OUT parameter.

> That is, allow Variable_Indexing to match a procedure with three
> parameters, then convert
>      S(N) := C;
> to
>      Replace_Element (S, N, C);
>
> This would also require a bit more care with the definition of the two
> cases. In particular, for "in out" parameter passing by-copy, you
> would have to use Constant_Indexing on the way in and Variable_Indexing on the way out.
>
> The only unusual thing is that S(N) would not be the name of an object
> (rather the name of a value), so it couldn't be renamed.

Sounds pretty ugly, particularly since it wouldn't work at all for non-by-copy
element types.

> =============
>
> In any case, I don't want this corner case to screw up the nice
> proposal. So let's not try too hard with this one...

Agreed.  We have much bigger fish to fry...

****************************************************************

From: Randy Brukardt
Sent: Monday, February  7, 2011  11:33 PM

> I don't understand the above paragraph.  You say we want to be able to
> do assignments directly into elements of this type, but then you don't
> see much point in faking a by-reference
mechanism.
> These seem to be contradictory statements (at least to me).

I don't see any reason that assignment necessarily has anything to do with a
by-reference mechanism. The obvious counter-example is by-copy parameter passing
-- you still can assign into such parameters; even if you can get a useful
address for them (and you don't have to be able to), it doesn't correspond to
any other object (and certainly not the actual).

In implementation terms, Janus/Ada has the concept of assignable components that
you can't take an address of. For instance, most bit-packed components don't
have a machine address per se (some container does, of course). Such things will
always be passed by copy, and assignments always require changing the entire
"container" (in the case of bit operations, you have to do a read, some masking,
and then a write). Setting a single bit in a packed array always falls into this
model.

I was trying to abstract this model into something that makes sense at the
language level. For such a thing, you can *only* talk about the container and
operations on it; it doesn't make sense to even think about references to
individual elements because you cannot make an assignment that way, anymore than
you can talk about setting a single bit in a word (or whatever your addressible
unit is).

Anyway, it seems obvious that this isn't going to work very easily, so it's
probably best to leave it for Ada 2020.

Which leaves the question of whether to use Constant_Indexing in Bounded_Strings
and Unbounded_Strings now, or forget it as insufficiently helpful.

****************************************************************

From: Tucker Taft
Sent: Monday, February  7, 2011  11:58 PM

> I don't see any reason that assignment necessarily has anything to do
> with a by-reference mechanism. The obvious counter-example is by-copy
> parameter passing -- you still can assign into such parameters; even
> if you can get a useful address for them (and you don't have to be
> able to), it doesn't correspond to any other object (and certainly not the
> actual).

I think we are using the term "reference" in two different ways.
The idea of the "Private_Implicit_Dereference" is that a pointer-like thing can
be converted into a name of a component of a data structure. It doesn't mean it
is converted into an *address*, so it would work just fine for a packed array.
It necessarily *starts* with going through a "true" pointer, but then it might
refer to a subcomponent of some packed structure.

> In implementation terms, Janus/Ada has the concept of assignable
> components that you can't take an address of. For instance, most
> bit-packed components don't have a machine address per se (some
> container does, of course). Such things will always be passed by copy,
> and assignments always require changing the entire "container" (in the
> case of bit operations, you have to do a read, some masking, and then
> a write). Setting a single bit in a packed array always falls into this model.
>
> I was trying to abstract this model into something that makes sense at
> the language level. For such a thing, you can *only* talk about the
> container and operations on it; it doesn't make sense to even think
> about references to individual elements because you cannot make an
> assignment that way, anymore than you can talk about setting a single
> bit in a word (or whatever your addressible unit is).

Here is where you are using the term "reference" and seem to be implying that it
requires an address.  I was thinking that Ref.Discrim.all.Chars(Index) was a
"reference" to a particular component of the Chars array, but there is no
requirement that Ref.Discrim.all.Chars(Index)'Address be well defined.  Merely
that "Ref.Discrim.all.Chars(Index)" be a variable name that can be used on the
LHS of an assignment or passed as an OUT parameter.  Presumably it will be
passed by copy if necessary, in the normal way.

> Anyway, it seems obvious that this isn't going to work very easily, so
> it's probably best to leave it for Ada 2020.

Agreed.

>
> Which leaves the question of whether to use Constant_Indexing in
> Bounded_Strings and Unbounded_Strings now, or forget it as
> insufficiently helpful.

I think it could be confusing to be able to use these in some contexts but not
others.

****************************************************************

From: Jean-Pierre Rosen
Sent: Tuesday, February  8, 2011  3:37 AM

> Which leaves the question of whether to use Constant_Indexing in
> Bounded_Strings and Unbounded_Strings now, or forget it as
> insufficiently helpful.

I'm hesitant (like anybody I guess), but here are some thoughts FWIW:

1) This is all syntactic sugar with no new functionality. I know "it's the sugar
   that helps the medicine go down", but I don't think we heard much complaints
   that the string packages were too complicated to be usable, or anything like
   that.

2) We have no idea how people will receive the new forms of iterators.
Will they adopt it immediately, or just stick to the old-fashioned way on the
ground that it is closer to what is actually happening? No one knows.

3) Extending the new iterator mechanism in the next revision of Ada if there is
   demand for it will be easy. Undoing a last-minute extension that missed some
   important point will not.

So, I'm inclined to saying: "nice idea; take a note of it for the next
revision".

****************************************************************

From: Edmond Schonberg
Sent: Tuesday, February  8, 2011  2:42 PM

I think that iteration over unbounded strings is adequately supported currently,
and there is no need to extend the new iterator machinery to cover them. We know
the lower bound is 1, we have the function Length, the function Element and the
procedure Replace_Element. No one has complained that there is no direct
indexing syntax for them, and they have not been treated as containers so far.
I agree with Jean-Pierre that this can wait to see whether there is a perceived
need (none so far).

****************************************************************

From: Randy Brukardt
Sent: Thursday, February 10, 2011  2:12 AM

...
> 1) This is all syntactic sugar with no new functionality. I know "it's
> the sugar that helps the medicine go down", but I don't think we heard
> much complaints that the string packages were too complicated to be
> usable, or anything like that.

Well, I HAVE heard that. But what we can do easily won't help any of the real
problems with these packages enough to make it worth the effort. We would need
to tackle the string literal problem in order to do that, but we're definitely
not going to do that this time.
  (Which only leaves users with the
         function "+" (S : String) return Unbounded_String renames Ada.Strings.Unbounded.To_Unbounded_String;
   "solution", it's considered too ugly to put into the standard package. Why
   NO solution is preferable escapes me. If I don't do the above, my programs
   look something like:

    Header := new Line_Rec'(Number => 1,
                            Value => Ada.Strings.Unbounded.To_Unbounded_String ("============"),
                            Next => null);

which is not great for readability...)

****************************************************************

From: Jean-Pierre Rosen
Sent: Thursday, February 10, 2011  3:15 AM

Are you saying that the ban on use clauses hampers readability? I can't agree
any more ;-). And I have also no problem with the renaming as "+", which was
intended by JDI. YMMV ...

****************************************************************

From: Robert Dewar
Sent: Thursday, February 10, 2011  5:22 AM

Well to me Ada.Strings.Unbounded.To_Unbounded is plain horrible compared to just
To_Unbounded. This package is designed to be used without use clauses. People
who ban use clauses shoot themselves in the foot in a case like this, no
sympathy :-)

BTW, I really think it is a pity that we did not add a unary operator intended
for conversion, the use of unary "+" is a bit bogus (always fascinating to me
how operators bring anti-use folks to their senses, almost no one wants to write
Algebra."+"(A, B) :-) :-))

****************************************************************

From: Randy Brukardt
Sent: Thursday, February 10, 2011  1:28 PM

You both (maybe intentionally) missed my point. I'll use the renames here if there are more than one or two uses of "To_Unbounded_String". This is the same conditions under which I would consider using a use clause. So this sort of thing only shows up in t
he "rare usage" scenario.

In any case, I don't find:

      Header := new Line_Rec'(Number =>  1,
                              Value => To_Unbounded_String ("============"),
                              Next =>  null);

to be any improvement. It is still way too long and wordy (for something that is
logically an identity function).

Also note that it is not unusual to have both Ada.Strings.Fixed and
Ada.Strings.Unbounded in the same code; you can't "use" both or you'll never
figure out resolution errors. (I've been there!)

> BTW, I really think it is a pity that we did not add a unary operator
> intended for conversion, the use of unary "+" is a bit bogus (always
> fascinating to me how operators bring anti-use folks to their senses,
> almost no one wants to write Algebra."+"(A, B) :-) :-))

Sorry, I sometimes write exactly that, for two reasons:
   (1) If there is only one operator, it takes longer to find a nearby
       declarative part and figure out the name of the appropriate type than it
       does to just write this in prefix notation.
   (2) If there are resolution problems, this is good way to tell the compiler
       exactly what you mean. Trial-and-error is often the only way to fix these
       things, because the typical compiler error messages aren't much help for
       fixing non-resolving expressions involving operations (there are just too
       many of them; the typical program that I debug has more than 50 "="
       defined and the compiler simply can't guess effectively as to the
       intended meaning).

It also should be noted that I have much less objection to use clauses when they
are applied to overloadable entities. It's the non-overloadable entities that
cause all of the trouble, requiring you to put *in* as much dot notation
elsewhere as you take out locally. "use all type" will definitely let me use
"use" clauses more often. Unfortunately, that doesn't help Ada.Strings.Unbounded
much -- as previously noted, I think To_Unbounded_String is ridiculous with or
without a use clause.

****************************************************************


Questions? Ask the ACAA Technical Agent