Version 1.1 of ai05s/ai05-0139-1.txt
!standard 5.5 09-02-13 AI05-0139-1/01
!class Amendment 09-02-13
!status work item 09-02-13
!status received 09-01-19
!priority Medium
!difficulty Medium
!subject User-defined iterators
!summary
(See proposal.)
!problem
Using an iterator over one of the Ada 2005 containers libraries requires writing
a subprogram which gets called repeatedly with each element of the container.
That effectively requires writing the loop body out-of-line from the loop,
making the code inside-out of its normal flow, requiring the writing of a lot of
extra text, and thus making it much harder to read and understand.
A better mechanism is required.
!proposal
Define a magic generic unit that defines a pair of interfaces:
generic
type Cursor is private;
No_Element : in Cursor;
package Ada.Iterator_Interfaces is
type Basic_Iterator is limited interface;
function First (Object : Basic_Iterator) return Cursor;
function Next (Object : Basic_Iterator; Position : Cursor) return Cursor;
type Reversible_Iterator is new Basic_Iterator with null record;
function Last (Object : Reversible_Iterator) return Cursor;
function Previous (Object : Reversible_Iterator; Position : Cursor) return Cursor;
end Ada.Iterator_Interfaces;
Note that the routines defined in the magic generic generally match those used
in iteration for the Ada.Containers packages, and their meaning is the same. The
only difference is that we've added an iterator object parameter to the Next and
Previous functions (which is necessary to make them part of the interface).
We then add a new kind of loop parameter specification:
defining_identifier in [reverse] iterator_expression
The rules for this are:
* The iterator_expression is expected to be of an (instance of
Iterator_Interfaces).Basic_Iterator'Class [the parens here mean
that we're talking about (any) type declared in an instance of
Iterator_Interfaces];
* The type of the defining_identifier is the appropriate Cursor type
for the instance of Iterator_Interfaces;
* the keyword "reverse" can only appear if the type is of an (instance
of Iterator_Interfaces).Reversible_Iterator'Class;
* The iterator_expression is a build-in-place context;
* the master of the iterator_expression is the (entire) loop; the
iterator_expression continues to exist until the loop is completed
(normally or abnormally).
The loop executes by calling First (or Last if the keyword "reverse" is
present), followed by calling Next (or Previous if the keyword "reverse" is
present) for each following iteration until the function called returns
No_Element. The loop body is executed once for each cursor value (other than
No_Element) obtained from the functions with the loop parameter set to that
cursor value.
!wording
** TBD **
!discussion
This is based on the earlier proposal and discussion recorded in AC-0112.TXT.
This proposal was modified to address the comments made on that one, and has
been simplified considerably.
Note that the iterator interface can be directly applied to a container type
(allowing it to be directly iterated) or it can be part of a separate function
(which would allow adding additional parameters to the iterator and/or adding
additional storage for the iterators). By using interfaces, we give maximum
flexibility to the programmer that is using these iterators.
If the creator of the iterator needs extra storage to accomplish the iteration,
they can include it in the type of the iterator.
If the creator of the iterator needs to get control when the loop is exited
prematurely, they can define a controlled iterator type. In that case,
(presuming that the iterator_expression was a function call) when the loop is
exited, the Finalization routine for the iterator object will be called.
We provide a form for reverse iterators. Reverse iteration makes sense for many
kinds of containers (vectors and lists, for instance). Given that the syntax
already exists and every Ada programmer is familiar with it, not providing it
would quickly become one of the most common frustrations of programming in Ada.
That seems silly.
We could have simplified the proposal by requiring that all iterators support
Last and Previous, but iterators where reverse iteration does not make sense
would have to raise an exception when Last is called. The proposed features, in
contrast, make a compile-time determination as to whether reverse is allowed.
That will detect errors earlier; it generally is better to detect errors at
compile-time if possible.
Issues:
This proposal requires some solution to the instantiation for private types
problem (see the many alternatives for AI05-0074) in order to be useful in the
Ada.Containers packages. As this is a "magic" generic (really to provide a
signature), we could relax the rules for it (only), but that is pretty
distasteful.
This proposal requires some solution to the modify-in-place problem for the
containers packages (see earlier proposals). Without that, modifying an element
in an iterator would require writing an extra subprogram, meaning the new syntax
would not have gained anything.
The rule about an iterator_expression being a build-in-place context is intended
to allow iterators to get control on premature loop exit. If we didn't have
that, it would not be possible to prevent reuse of iterator objects and thus
the finalization of the objects would not necessarily occur when the loop
is exited. This is somewhat of a hack - there is no intrinsic reason why an
iterator object should not be allowed to be reused. Note, however, that reusable
iterator types can be defined: they just have to be non-limited (and presumably
not controlled). So perhaps we can justify this limitation in the service of
additional control.
The use of a controlled type for an iterator in this proposal would require
bounded containers to be preelaborable units, rather than pure units. We add a
prohibition on the actual container from being controlled, so that finalization
adverse users could still use the bounded containers (but obviously not the
iterators).
Alternatives considered:
One option considered was including the name of the cursor type somewhere in the
syntax. Something like:
defining_identifier : subtype_mark in [reverse] iterator_expression
That was not done mainly because differs from the existing syntax for no clear
benefit. It would make the type of the identifier clearer, but it also would
mean that switching from a discrete iterator to a container one would require
some syntax changes that don't really fit into the model. (One solution to that
problem would be to optionally add the type name to the discrete for loop as
well, but that seems like a bunch of additional changes for little value.)
It should be noted that with this formulation, the existing discrete iterators
could be considered just a predefined instantiation of this magic generic for
discrete types. The analogy breaks down a bit for ranges (they don't look like a
function call), but it isn't hard to imagine such a unification.
A more important consideration was allowing elements of a container to be
directly iterated on. That is, the type of the loop parameter could be the
element of a container (that is, anything at all) rather than the cursor to the
item.
That would require the element type to be included in the generic somewhere, as
well as an accessor function.
But the important consideration would be how the iterated element actually is
handled. The programmer may only want read-only access to the elements, or they
may want to change a copy of the element, or they my need to change the element
in place. It is difficult to make these decisions for the programmer, and
providing a large family of iterators quickly gets complicated.
Given all of this, and that existing Ada iteration is on read-only index values
(which are can be considered a restricted form of cursors), we only provide
iterator on cursors, which are treated as constants within the loop. This way
the programmer can pick the most appropriate way to access the actual element:
via Element, Update_Element, or (hopefully) some form of Modifiable_Element.
!examples
This feature could easily be used in the existing Ada.Containers packages. We
give a suggested implementation for Ada.Containers.Doubly_Linked_Lists; the
other containers could be similar.
Somewhere in the visible part of Ada.Containers.Doubly_Linked_Lists we would add
an instantiation of Iterator_Interfaces:
package List_Iterators is Ada.Iterator_Interfaces (Cursor, No_Element);
And a function Iterate:
function Iterate (Container : in List) return List_Iterators.Reversible_Iterator'Class;
In the private part of Ada.Containers.Doubly_Linked_Lists, we'd define a
controlled type:
type Our_List_Iterator is new Ada.Controlled.Limited_Controlled and
List_Iterators.Reversible_Iterator with record
Our_List : access List;
end record;
function First (Object : Our_List_Iterator) return Cursor;
function Next (Object : Our_List_Iterator; Position : Cursor) return Cursor;
function Last (Object : Our_List_Iterator) return Cursor;
function Previous (Object : Our_List_Iterator; Position : Cursor) return Cursor;
procedure Finalize (Object : in out Our_List_Iterator);
The implementation would be something like:
function Iterate (Container : in List) return List_Iterators.Reversible_Iterator'Class is
begin
return Iter:Our_List_Iterator do
Iter.Our_List := Container'Unchecked_Access;
--
--
--
--
end return;
end Iterate;
function First (Object : Our_List_Iterator) return Cursor is
begin
return Object.Our_List.First;
end First;
function Next (Object : Our_List_Iterator; Position : Cursor) return Cursor is
begin
return Next (Position);
end Next;
--
procedure Finalize (Object : in out Our_List_Iterator) is
begin
--
end Finalize;
Note that we didn't add these interfaces to the actual container types. That was
because we wanted these iterators to be a tampering context (thus ensuring that
they terminate and do not lead to erroneous execution). This is important from
an analysis perspective: proving that a loop terminates is equivalent to proving
that it is equivalent to a for loop. For this reason, we don't want for loops
that don't terminate (at least logically; if the container is very large, the
loop might not terminate for a long time).
We could also define a partial list iterator:
function Partial_Iterate (Container : in List;
Start : in Cursor;
Finish : in Cursor := No_Element)
return List_Iterators.Reversible_Iterator'Class;
where Start and Finish (if not No_Element) would have to be cursors for
Container. In this case, the iteration would run from Start to Finish (or
No_Element, if Finish is not on the list after/before Start) forward or in
reverse. (This would also be a tampering context as above.)
We would use Partial_Iterator directly in a loop:
for C in Partial_Iterate (My_List, Start, Finish) loop
... Element (C) ...
end loop;
!ACATS test
!appendix
****************************************************************
From: Randy Brukardt
Date: Monday, January 19, 2009 11:31 PM
Here's the third of my trial balloons. For those following the recent
comp.lang.ada discussion, this hopefully explains why I said that defining
proper syntactic iterators was "trivial". It least I think this is a very simple
proposal that really has no reason not to be adopted other than the needed
prerequisites.
[This is version /01 of the AI - ED.]
****************************************************************
From: Randy Brukardt
Date: Tuesday, January 20, 2009 12:27 PM
There's a couple of points that I left out of this one yesterday:
---
The use of a controlled type for an iterator would require bounded containers to
be preelaborable units, rather than pure units. We'd still have a prohibition on
the actual container from being controlled, so that finalization adverse users
could still use the bounded containers (but obviously not the iterators).
---
I had said "This proposal requires some solution to the modify-in-place problem
for the containers packages." That was true in my original draft, which tried
put the iterator interface directly on the container type. Since I gave up that
model for Ada.Containers (in order to support tampering), it would be possible
to hide an unsafe accessor in the private part and allow it to be used only in
an element iterator (which, being a tampering context, is safe from most bad
operations). So an access-to-element alternative could in fact be accomplished
safely and practically without having to make unsafe operations available. Doing
so would definitely require a solution to the instantiation on partial types
problem (the current solution could be put into a child package although that
would make the usage unnecessarily messy).
****************************************************************
Questions? Ask the ACAA Technical Agent