Version 1.2 of ai05s/ai05-0139-2.txt

Unformatted version of ai05s/ai05-0139-2.txt version 1.2
Other versions for file ai05s/ai05-0139-2.txt

!standard 5.5          09-12-28 AI05-0139-2/02
!class Amendment 09-02-13
!status work item 09-02-13
!status received 09-01-19
!priority Medium
!difficulty Medium
!subject Syntactic sugar for accessors, containers, and iterators
!summary
(See proposal.)
!problem
Using an iterator over one of the Ada 2005 containers libraries requires writing a subprogram which gets called repeatedly with each element of the container. That effectively requires writing the loop body out-of-line from the loop, making the code inside-out of its normal flow, requiring the writing of a lot of extra text, and thus making it much harder to read and understand.
A better mechanism is required.
In addition, using the notion of "accessors" or "references" as defined in AI05-0142 is syntactically awkward. Finally, it is often clearer to iterate over the elements of a container (or array) directly, rather than iterating over a cursor (or index) into the container (or array).
Using accessors for user-defined dereference (rather than simple access values as proposed in AI05-141) has the potential to provide a safer mechanism that might be used for persistence, relocating garbage collection, detection of dangling references, etc. However, using accessors would be syntactically painful compared to a simple ".all."
In all these cases, a modest amount of "syntactic sugar" might make the use of accessors, containers, and iterators more natural and easier to read.
!proposal
Define a set of "magic" generic units which are linked to some appropriate syntactic sugar for accessors, containers, and iterators:
generic package Iterator_Interfaces defines a pair of interfaces:
generic type Cursor is private; No_Element : in Cursor; package Ada.Iterator_Interfaces is
type Basic_Iterator is limited interface; function First (Object : Basic_Iterator) return Cursor; function Next (Object : Basic_Iterator; Position : Cursor) return Cursor;
type Reversible_Iterator is limited interface and Basic_Iterator; function Last (Object : Reversible_Iterator) return Cursor; function Previous (Object : Reversible_Iterator; Position : Cursor) return Cursor;
end Ada.Iterator_Interfaces;
Note that the routines defined in the magic generic generally match those used in iteration for the Ada.Containers packages, and their meaning is the same. The only difference is that we've added an iterator object parameter to the Next and Previous functions (which is necessary to make them part of the interface).
The interface Basic_Iterator defines a user-defined iterator for the forward direction; the interface Reversible_Iterator defines a user-defined iterator for both the forward and reverse direction.
The function First is expected to deliver the first cursor value for the iterator iterating in the forward direction, or return No_Element if no cursor value is available. The function Next is expected to deliver the next cursor value for the iterator iterating in the forward direction given the previous cursor value returned from a call to First or Next, or return No_Element if no additional cursor values are available.
The function Last is expected to deliver the last cursor value for the iterator iterating in the reverse direction, or return No_Element if no cursor value is available. The function Previous is expected to deliver the next cursor value for the iterator iterating in the reverse direction given the previous cursor value returned from a call to Last or Previous, or return No_Element if no additional cursor values are available.
The generic package Element_References defines two tagged types which can be used for accessing elements of a container or similar structure:
generic type Element_Type(<>) is limited private; package Ada.Element_References is type Element_Reference(Data : access Element_Type) is new Ada.Finalization.Limited_Controlled with private; type Element_Constant_Reference(Data : access constant Element_Type) is new Ada.Finalization.Limited_Controlled with private; end Ada.Element_References;
A function that returns a result of a type descended from Element_Reference or Element_Constant_Reference has ".Data.all" implicitly applied to its result at the point of call, if necessary to satisfy overload resolution. This allows the convenient use of accessor functions both on the left-hand-side and right-hand-side of an assignment, and in other contexts. For example:
function Lookup(Tab : Hash_Table; Key : String) return Element_Reference;
Lookup(T1, "hello") := Lookup(T1, "goodbye");
This last line is equivalent to:
Lookup(T1, "hello").Data.all := Lookup(T1, "goodbye").Data.all;
Our final "magic" generic package, Container_Interfaces, defines an interface to be used in conjunction with the Reversible_Iterator defined in Interator_Interfaces, and the element reference types defined in Element_References, as the basis for a container that takes full advantage of our new syntactic sugar (which is further defined below):
generic with package Iterators is new Ada.Iterator_Interfaces(<>); type Iterator_Type is new Iterators.Reversible_Iterator with private; with package References is new Ada.Element_References(<>); type Element_Reference(Data : access References.Element_Type) is new Element_References.Element_Reference with private; type Element_Constant_Reference(Data : access constant Element_Type) is new Element_References.Element_Constant_Reference with private; package Ada.Container_Interfaces is type Container is limited interface; function Default_Iterator(C : Container) return Iterator_Type is abstract; function Element_Ref(C : aliased in out Container; Cursor : Iterators.Cursor) return Element_Reference is abstract; function Element_CRef(C : aliased in Container; Cursor : Iterators.Cursor) return Element_Constant_Reference is abstract; end Ada.Container_Interfaces;
Indexed_component syntax may be used with any object of a type descended from the Container interface, as though it were a single-dimensional array indexed by the Cursor type. It is equivalent to a call on Element_Ref in any context that requires a variable; otherwise it is equivalent to a call on Element_CRef. For example:
V : Int_Vectors.Vector; C1, C2 : Int_Vectors.Cursor; ... C1 := ... C2 := ... V(C1) := V(C2);
The last line is equivalent to:
Element_Ref(V, C1).Data.all := Element_CRef(V, C2).Data.all;
These three generic packages would typically be instantiated as follows:
type Cursor_Type is ... package Iterators is new Ada.Iterator_Interfaces(Cursor_Type, No_Element); type Iterator_Type is new Iterators.Reversible_Iterator with ...
package References is new Ada.Element_References(Element_Type); type Element_Reference is new References.Element_Reference with ... type Element_Constant_Reference is new References.Element_Constant_Reference with ...
package Containers is new Ada.Container_Interfaces(Iterators, Iterator_Type, References, Element_Reference, Element_Constant_Reference); type Container_Type is new Containers.Container with ...
NEW ITERATOR SYNTAX
Given these three generic packages and their associated types, we then add two new kinds of "for" loop parameter specifications which provide for iteration over containers (and arrays) in a natural way:
defining_identifier IN [REVERSE] iterator_expression defining_identifier [: subtype_indication] OF container_expression
The first requires an expression returning an object of a type descended from Basic_Iterator, or Reversible_Iterator if REVERSE appears. The defining_identifier is of the corresponding Cursor type, and can be used to index into the container.
The second requires an expression returning an array, or an object of a type descended from the Container interface defined above.
An example of the first iterator would be:
V : Int_Vectors.Vector; ... for C in V.Iterate loop V(C) := V(C) + 1; -- Note that this is equivalent to: -- Element_Ref(V, C).Data.all := Element_CRef(V, C).Data.all + 1 end loop;
Some examples of the second iterator would be:
Arr : array(1..10) of Natural := ... ... for X : Natural of Arr loop -- The ": Natural" part is optional. -- X renames Arr(<index>) where <index> in Arr'Range. -- X is a variable if Arr is a variable.
X := X + 1; end loop;
--------
Vec : Int_Vectors.Vector; ... for X of Vec loop -- X renames Vec(<cursor>) where <cursor> in Default_Iterator(Vec). -- and Vec(<cursor>) is equiv to Element_[C]Ref(Vec, <cursor>).Data.all. -- X is a variable if Vec is a variable.
X := X + 1; end loop;
The "of" syntax, suggested by J.P. Rosen, seems to help identify X as an element of the array/container rather than as an index or cursor in the sequence defined by the discrete range or iterator.
!wording [REMOVED FOR NOW]
!discussion [NOT UPDATED -- SEE PROPOSAL FOR SOME NEWER DISCUSSION]
This is based on the earlier proposal and discussion recorded in AC-0112.TXT. This proposal was modified to address the comments made on that one, and has been simplified considerably.
Note that the iterator interface can be directly applied to a container type (allowing it to be directly iterated) or it can be part of a separate function (which would allow adding additional parameters to the iterator and/or adding additional storage for the iterators). By using interfaces, we give maximum flexibility to the programmer that is using these iterators.
If the creator of the iterator needs extra storage to accomplish the iteration, they can include it in the type of the iterator.
If the creator of the iterator needs to get control when the loop is exited prematurely, they can define a controlled iterator type. In that case, (presuming that the iterator_expression was a function call) when the loop is exited, the Finalization routine for the iterator object will be called.
We provide a form for reverse iterators. Reverse iteration makes sense for many kinds of containers (vectors and lists, for instance). Given that the syntax already exists and every Ada programmer is familiar with it, not providing it would quickly become one of the most common frustrations of programming in Ada. That seems silly.
We could have simplified the proposal by requiring that all iterators support Last and Previous, but iterators where reverse iteration does not make sense would have to raise an exception when Last is called. The proposed features, in contrast, make a compile-time determination as to whether reverse is allowed. That will detect errors earlier; it generally is better to detect errors at compile-time if possible.
It might appear that this proposal would require some solution to the instantiation for private types problem, but as the examples below show, this is not true.
Note that the actual iterator type can be either limited or non-limited. The value of a limited iterator object is that an iterator implementation can assume (presuming that it only exports functions and not types) that the iterator object is created and destroyed with the loop. This allows safe cleanup operations when the loop is exited (as in the examples below). If that capability is not needed, an iterator implementation allow reuse of iterator objects (especially if no temporary storage is needed).
Note that if temporary storage or loop exit actions are needed, the iterator object must be separate from the container object that it is iterating on. Otherwise nested loops on the same container would not work.
for I in My_Container.Iterator loop for J in My_Container.Iterator loop ... end loop; end loop;
The I and J loops need separate temporary storage, which must be provided by the iterator object (the loop can only provide the loop parameter cursor object).
Issues:
This proposal requires some solution to the modify-in-place problem for the containers packages (see the family of AI05-0142 proposals). Without that, modifying an element in an iterator would require writing an extra subprogram, meaning the new syntax would not have gained anything.
The use of a controlled type for an iterator in this proposal would require bounded containers to be preelaborable units, rather than pure units. We add a prohibition on the actual container from being controlled, so that finalization adverse users could still use the bounded containers (but obviously not the iterators). This seems OK, given that the current accessor proposal (AI05-0142-4) has similar requirements.
It would be nice for an iterator implementation to be able to prevent the creation of iterator objects not associated with a loop. For instance, given the implementation of the example below:
My_Iter : Some_Instance.Basic_Iterator'Class := My_List.Iterator;
would have the effect setting tampering on My_List and keeping it set until until the object is destroyed (potentially a long time in the future). But this is not the natural thing to do -- it requires a lot more typing than the intended usage -- and no major definitional problems are encountered in this case. We would need a lot more mechanism to prevent the creation of objects outside of the loop context, so we do not try to prevent this misuse.
Alternatives considered:
One option considered was including the name of the cursor type somewhere in the syntax. Something like:
defining_identifier : subtype_mark in [reverse] iterator_expression
That was not done mainly because differs from the existing syntax for no clear benefit. It would make the type of the identifier clearer, but it also would mean that switching from a discrete iterator to a container one would require some syntax changes that don't really fit into the model. (One solution to that problem would be to optionally add the type name to the discrete for loop as well, but that seems like a bunch of additional changes for little value.)
It should be noted that with this formulation, the existing discrete iterators could be considered just a predefined instantiation of this magic generic for discrete types. The analogy breaks down a bit for ranges (they don't look like a function call), but it isn't hard to imagine such a unification.
A more important consideration was allowing elements of a container to be directly iterated on. That is, the type of the loop parameter could be the element of a container (that is, anything at all) rather than the cursor to the item.
That would require the element type to be included in the generic somewhere, as well as an accessor function.
But the important consideration would be how the iterated element actually is handled. The programmer may only want read-only access to the elements, or they may want to change a copy of the element, or they my need to change the element in place. It is difficult to make these decisions for the programmer, and providing a large family of iterators quickly gets complicated.
Given all of this, and that existing Ada iteration is on read-only index values (which are can be considered a restricted form of cursors), we only provide iterator on cursors, which are treated as constants within the loop. This way the programmer can pick the most appropriate way to access the actual element: via Element, Update_Element, or (hopefully) some form of Modifiable_Element.
!examples
This feature could easily be used in the existing Ada.Containers packages. We give a suggested implementation for Ada.Containers.Doubly_Linked_Lists; the other containers could be similar.
The start of the visible part of Ada.Containers.Doubly_Linked_Lists would be modified as follows:
with Ada.Iterator_Interfaces; generic type Element_Type is private; with function "=" (Left, Right : Element_Type) return Boolean is <>; package Ada.Containers.Doubly_Linked_Lists is pragma Preelaborate(Doubly_Linked_Lists); pragma Remote_Types(Doubly_Linked_Lists);
package Cursors is type Cursor is private; pragma Preelaborable_Initialization(Cursor);
No_Element : constant Cursor; private ... -- not specified by the language end Cursors; subtype Cursor is Cursors.Cursor; No_Element : Cursor renames Cursors.No_Element;
package Iterators is new Ada.Iterator_Interfaces (Cursors, No_Element);
type List is new Iterators with private; pragma Preelaborable_Initialization(List);
Empty_List : constant List;
...
Note that the changes here are the addition of the nested package Cursors, the addition of the instantiation, and the subtype and renames to eliminate the effect of the nested package. The majority of the package is unchanged.
After the existing procedure Reverse_Iterate, we would add a new function:
function Iterate (Container : in List) return List_Iterators.Reversible_Iterator'Class;
This would be described as
Iterate returns an object that can be used in a loop_statement to loop with a loop parameter designating each node in Container, starting with the first node and moving the cursor as per the Next function if the reserved word reverse is not used in the user_loop_parameter_statement, and starting with the last node and moving the cursor as per the Previous function otherwise. Program_Error is propagated if any operation (in particular, the sequence_of_statements of the loop_statement whose iterator_expression denotes this object) tampers with the cursors of Container while the returned object exists.
This could be implemented by defining a controlled type in the private part of Ada.Containers.Doubly_Linked_Lists:
type Our_List_Iterator is new Ada.Controlled.Limited_Controlled and List_Iterators.Reversible_Iterator with record Our_List : access List; end record; function First (Object : Our_List_Iterator) return Cursor; function Next (Object : Our_List_Iterator; Position : Cursor) return Cursor; function Last (Object : Our_List_Iterator) return Cursor; function Previous (Object : Our_List_Iterator; Position : Cursor) return Cursor; procedure Finalize (Object : in out Our_List_Iterator);
The implementation of these operations would be something like:
function Iterate (Container : in List) return List_Iterators.Reversible_Iterator'Class is begin return Iter:Our_List_Iterator do Iter.Our_List := Container'Unchecked_Access; -- This is safe unless someone does an Unchecked_Deallocation on -- the container in the loop; we don't try to protect against that -- in other cases for Ada.Containers, so we won't here, either. -- Set a tampering context for Iter.Our_List.all. end return; end Iterate;
function First (Object : Our_List_Iterator) return Cursor is begin return Object.Our_List.First; end First;
function Next (Object : Our_List_Iterator; Position : Cursor) return Cursor is begin return Next (Position); end Next;
-- Similar implementations for Last and Previous.
procedure Finalize (Object : in out Our_List_Iterator) is begin -- Clear the tampering context for Object.Our_List. end Finalize;
Note that we didn't add these interfaces to the actual container types. That was because we wanted these iterators to be a tampering context (thus ensuring that they terminate and do not lead to erroneous execution). Users would be very surprised if a "safe" construct like a for loop caused erroneous execution (which will happen if the node designated by the loop parameter is deleted).
The function Iterate would normally be used directly in a loop:
for C in My_List.Iterate loop ... Element (C) ... end loop;
We could also define a partial list iterator:
function Partial_Iterate (Container : in List; Start : in Cursor; Finish : in Cursor := No_Element) return List_Iterators.Reversible_Iterator'Class;
where Start and Finish (if not No_Element) would have to be cursors for Container. In this case, the iteration would run from Start to Finish (or No_Element, if Finish is not on the list after/before Start) forward or in reverse. (This would also be a tampering context as above.)
We would use Partial_Iterator directly in a loop:
for C in Partial_Iterate (My_List, Start, Finish) loop ... Element (C) ... end loop;
Some reviewers have said that they would prefer the iterator to work directly on the container object. This would only eliminate a single selected name, which does not seem worth the problems that would be introduced. The example above means that an iterator is written:
for C in My_List.Iterate loop
Only the truly lazy would find a significant difference from:
for C in My_List loop
Moreover, we are planning (as of this writing) to use a similar mechanism to provide safe accessors. That mechanism also requires writing an extra selector; it would seem bizarre to work hard to eliminate that here, but still handle use it for accessors.
Finally, a direct syntax would not allow nested loops (as different termination code is needed for each loop) unless significant additional magic is defined. Essentially, something like function Iterate would have to be defined internally to the interface to provide the needed temporary space/termination object. This would require far more mechanism than the current proposal.
About the only thing going for the direct syntax is that it would be a bit easier to understand. However, usage of the current proposal is so easy that once a programmer sees an example, they will immediately understand how it works. Understanding the details would rarely be needed.
!ACATS test
!appendix

From: Tucker Taft

Thoughts on Iterators, 8-Nov-2009, for ARG meeting in St. Pete

Relevant players in the game:

  Container, Cursor, Iterator, Element.
  
-------------

    for Element in [reverse] Container loop
        ... -- body of loop
    end loop;
    
is equiv to:

    declare
	Iterator : Iterator_Type := Iterate(Container);
	Cursor : Cursor_Type := First(Iterator);
    begin
	while Has_Element(Cursor) loop
	  declare
	    Element : Element_Type renames Container(Cursor);
	  begin
	      ... -- body of loop	      
	    Cursor := Next(Iterator, Cursor);
	  end;
	end loop;
    end;

or...

    declare
	Iterator : Iterator_Type := Iterate(Container);
    begin
	while More(Iterator) loop
	  declare
	    Cursor : Cursor_Type := Get_Next(Iterator'Access);
	    Element : Element_Type renames Container(Cursor);
	  begin
	      ... -- body of loop	      
	  end;
	end loop;
    end;   
    
---------

    Container(Cursor)
    
  is equiv to:
  
    Element_{Ref,CRef}(Container, Cursor).Data.all
    
    for Cursor in Special_Iterator(Container) loop
        Put(Container(Cursor));
    end loop;
    
---------
    
    for Element in Container loop
        Put(Element);
    end loop;
    
 is equiv to:
      
    for Cursor in Default_Iterator(Container) loop
      declare
        Element : Element_Type renames Container(Cursor);
      begin
        Put(Element);
      end;
    end loop;
    
 similarly:
 
    for Element in Array_Obj loop
       Put(Element);
    end loop;
    
 is equiv to:
 
    for I in Array_Obj'Range(1) loop
       for J in Array_Obj'Range(2) loop
           ...
         declare
           Element : Array_Obj_Component_Type renames Array_Obj(I, J, ...);
         begin
           Put(Element);
         end;
           ...
       end loop;
   end loop;
       

---------

    package Iterators is new Ada.Iterator_Interfaces(Cursor_Type, No_Element);
    type Iterator_Type is new Iterators.Reversible_Iterator with ...
    
    package Containers is new 
      Ada.Container_Interfaces(Iterators, Iterator_Type, Element_Type, Element_Reference);
    type Container_Type is new Containers.Container with ...

   generic
       with package Iterators is new Ada.Iterator_Interfaces(<>);
       type Iterator_Type is new Iterators.Reversible_Iterator with private;
       type Element_Type(<>) is limited private;
       type Element_Reference(Data : access Element_Type) is limited private;
       type Element_Constant_Reference(Data : access constant Element_Type) is limited private;
   package Ada.Container_Interfaces is
       type Container is limited interface;
       function Default_Iterator(C : Container) return Iterator_Type is abstract;
       function Element_Ref(C : aliased in out Container; Cursor : Iterators.Cursor) 
         return Element_Reference is abstract;
       function Element_CRef(C : aliased in Container; Cursor : Iterators.Cursor) 
         return Element_Constant_Reference is abstract;
   end Ada.Container_Interfaces;
   
----------

   generic
       type Cursor is private;
       No_Element : in Cursor;
   package Ada.Iterator_Interfaces is

       type Basic_Iterator is limited interface;
       function First (Object : Basic_Iterator) return Cursor;
       function Next (Object : Basic_Iterator; Position : Cursor) return Cursor;

       type Reversible_Iterator is limited interface and Basic_Iterator;
       function Last (Object : Reversible_Iterator) return Cursor;
       function Previous (Object : Reversible_Iterator; Position : Cursor) return Cursor;

   end Ada.Iterator_Interfaces;
   
****************************************************************

From: Robert Dewar
Sent: Sunday, November  8, 2009  10:26 AM

Equivalent always makes me nervous. Are we sure we can make this strong a statement?

****************************************************************

From: Tucker Taft
Sent: Sunday, November  8, 2009  11:40 AM

This is more like the definition of "new T" when T has a user-defined storage pool.
It is equivalent to a call on Allocate because its semantics are defined as calling
Allocate.

For the proposed syntax for iterators and indexing, their semantics will be defined
in terms of calls on certain subprograms, etc.

****************************************************************

From: Robert Dewar
Sent: Sunday, December  8, 2009  1:14 PM

> This is more like the definition of "new T" when T has a user-defined 
> storage pool.  It is equivalent to a call on Allocate because its 
> semantics are defined as calling Allocate.

I think it is fine to say "the behavior is as though xyz was called bla bla".
I just think we often get ourselves in trouble if the formal definition uses the
word equivalent, it's such a strong word from a formal point of view.

****************************************************************

From: Tucker Taft
Sent: Tuesday, December  8, 2009  5:01 PM

Here is a new variant for AI05-139 on user-defined iterators, etc.  [This is
version /01 of this AI - ED.] It is based on all the earlier proposals,
and incorporates the notes on iterators I passed around at the last ARG meeting.

I removed the wording, and didn't update the discussion, so you will just
have to focus on the !proposal section, which includes some simple examples.

****************************************************************


Questions? Ask the ACAA Technical Agent