!standard 5.5 09-06-08 AI05-0139-1/02 !class Amendment 09-02-13 !status work item 09-02-13 !status received 09-01-19 !priority Medium !difficulty Medium !subject User-defined iterators !summary (See proposal.) !problem Using an iterator over one of the Ada 2005 containers libraries requires writing a subprogram which gets called repeatedly with each element of the container. That effectively requires writing the loop body out-of-line from the loop, making the code inside-out of its normal flow, requiring the writing of a lot of extra text, and thus making it much harder to read and understand. A better mechanism is required. !proposal Define a magic generic unit that defines a pair of interfaces: generic type Cursor is private; No_Element : in Cursor; package Ada.Iterator_Interfaces is type Basic_Iterator is limited interface; function First (Object : Basic_Iterator) return Cursor; function Next (Object : Basic_Iterator; Position : Cursor) return Cursor; type Reversible_Iterator is limited interface and Basic_Iterator; function Last (Object : Reversible_Iterator) return Cursor; function Previous (Object : Reversible_Iterator; Position : Cursor) return Cursor; end Ada.Iterator_Interfaces; Note that the routines defined in the magic generic generally match those used in iteration for the Ada.Containers packages, and their meaning is the same. The only difference is that we've added an iterator object parameter to the Next and Previous functions (which is necessary to make them part of the interface). We then add a new kind of loop parameter specification: defining_identifier in [reverse] iterator_expression See the wording below for the detailed rules. !wording Replace 5.5(3) by: iteration_scheme := while condition | for loop_parameter_specification | for user_loop_parameter_specification [Author's note: We need a different kind of loop parameter syntax so that 5.5(6) doesn't get triggered for user-defined iterators, as that would give it the wrong sort of type; this also reduces forward references just to syntax, which is generally accepted in the Standard.] Modify 5.5(9): For execution of a loop_statement with a for iteration_scheme {with a loop_parameter_specification}, the ... [Author's note: More detail to eliminate confusion between the two kinds of for loop.] 5.5.1 The generic package Iterator_Interfaces The generic package Iterator_Interfaces provide a template for the definition of user-defined iterators. generic type Cursor is private; No_Element : in Cursor; package Ada.Iterator_Interfaces is type Basic_Iterator is limited interface; function First (Object : Basic_Iterator) return Cursor; function Next (Object : Basic_Iterator; Position : Cursor) return Cursor; type Reversible_Iterator is limited interface and Basic_Iterator; function Last (Object : Reversible_Iterator) return Cursor; function Previous (Object : Reversible_Iterator; Position : Cursor) return Cursor; end Ada.Iterator_Interfaces; The interface Basic_Iterator defines a user-defined iterator for the forward direction; the interface Reversible_Iterator defines a user-defined iterator for both the forward and reverse direction. The function First is expected to deliver the first cursor value for the iterator iterating in the forward direction, or return No_Element if no cursor value is available. The function Next is expected to deliver the next cursor value for the iterator iterating in the forward direction given the previous cursor value returned from a call to First or Next, or return No_Element if no additional cursor values are available. The function Last is expected to deliver the last cursor value for the iterator iterating in the reverse direction, or return No_Element if no cursor value is available. The function Previous is expected to deliver the next cursor value for the iterator iterating in the reverse direction given the previous cursor value returned from a call to Last or Previous, or return No_Element if no additional cursor values are available. 5.5.2 User-defined iteration [Author's note: This is a separate clause, so that we don't need a forward reference to the generic package.] Syntax user_loop_parameter_specification ::= defining_identifier in [reverse] iterator_expression Name Resolution Rules The iterator_expression of a user_loop_parameter_specification is expected to be of any tagged type. Legality Rules The iterator_expression shall be of a type covered by exactly one type Basic_Iterator'Class of an instance of Iterator_Interfaces. AARM Reason: It would be possible for a single type to inherit from multiple instances of Iterator_Interfaces. However, since the type of the loop parameter is determined by the instance of the type Basic_Iterator'Class, we can have only one such interface in order for the loop parameter to have an unambiguous type. If the reserved word *reverse* is present, the iterator_expression shall be of a type covered by type Reversible_Iterator'Class of an instance of Iterator_Interfaces. Static Semantics A user_loop_parameter_specification declares a loop parameter, which is an object whose subtype is that of the actual type given for the instance of Iterator_Interfaces that contains the Basic_Iterator interface from which the type of the iterator_expression is descended. [Author's note: Wording is as similar as possible to 5.5(6).] Dynamic Semantics For the execution of a loop_statement with a for iteration_scheme with a user_loop_parameter_specification, the iterator_expression is first evaluated; the result is treated as an implicit by-reference formal parameter *I* of the loop_statement. [Author's note: I don't like this wording much, but it has the correct semantics. The intent is that this evaluation work just like the evaluation of a by-reference parameter - see below.] AARM Ramification: If the iterator_expression is a function call, a temporary object will be created here. A statement is a master (see 7.6.1), so the master of this temporary object (if any) is the loop statement. That means that the temporary object is finalized when the loop is left (including by transfers of control), and designers of user-defined iterators can depend on this behavior. The sequence_of_statements of statements is executed once for each value returned from the primitive operations of the type of the iterator_expression (or until the loop is left as the consequence of a transfer of control). Prior to each such iteration, a value is assigned to the loop parameter as follows: * For the first iteration of a loop that does not include the reserved word *reverse*, the loop parameter is initialized from the result of a call of *I*.First; * For following iterations of a loop that does not include the reserved word *reverse*, the loop parameter is initialized from the result of a call of *I*.Next passed the current value of the loop parameter; * For the first iteration of a loop that includes the reserved word *reverse*, the loop parameter is initialized from the result of a call of *I*.Last; * For following iterations of a loop that includes the reserved word *reverse*, the loop parameter is initialized from the result of a call of *I*.Previous passed the current value of the loop parameter. If any of these function calls return the value No_Element (as specified in the appropriate instance of Iterator_Interfaces), the loop is completed and left without executing the sequence_of_statements. The loop_statement propagates any exception propagated by these function calls. [Author's note: I did not provide any special case if the inherited operations are not implemented as suggested in 5.5.1. That means that an implementation must actually call the routines as described (in case they do something weird); only as-if optimizations are allowed. Since the functions can only return a cursor (including No_Element) or raise an exception, there is no need to have a special case for misbehaving functions.] Add after 7.5(2.7/2): * the iterator_expression of a user_loop_parameter_specification of a loop_statement (see 5.5.2); ** Containers updates TBD ** (see !examples) !discussion This is based on the earlier proposal and discussion recorded in AC-0112.TXT. This proposal was modified to address the comments made on that one, and has been simplified considerably. Note that the iterator interface can be directly applied to a container type (allowing it to be directly iterated) or it can be part of a separate function (which would allow adding additional parameters to the iterator and/or adding additional storage for the iterators). By using interfaces, we give maximum flexibility to the programmer that is using these iterators. If the creator of the iterator needs extra storage to accomplish the iteration, they can include it in the type of the iterator. If the creator of the iterator needs to get control when the loop is exited prematurely, they can define a controlled iterator type. In that case, (presuming that the iterator_expression was a function call) when the loop is exited, the Finalization routine for the iterator object will be called. We provide a form for reverse iterators. Reverse iteration makes sense for many kinds of containers (vectors and lists, for instance). Given that the syntax already exists and every Ada programmer is familiar with it, not providing it would quickly become one of the most common frustrations of programming in Ada. That seems silly. We could have simplified the proposal by requiring that all iterators support Last and Previous, but iterators where reverse iteration does not make sense would have to raise an exception when Last is called. The proposed features, in contrast, make a compile-time determination as to whether reverse is allowed. That will detect errors earlier; it generally is better to detect errors at compile-time if possible. It might appear that this proposal would require some solution to the instantiation for private types problem, but as the examples below show, this is not true. Note that the actual iterator type can be either limited or non-limited. The value of a limited iterator object is that an iterator implementation can assume (presuming that it only exports functions and not types) that the iterator object is created and destroyed with the loop. This allows safe cleanup operations when the loop is exited (as in the examples below). If that capability is not needed, an iterator implementation allow reuse of iterator objects (especially if no temporary storage is needed). Note that if temporary storage or loop exit actions are needed, the iterator object *must be* separate from the container object that it is iterating on. Otherwise nested loops on the same container would not work. for I in My_Container.Iterator loop for J in My_Container.Iterator loop ... end loop; end loop; The I and J loops need separate temporary storage, which must be provided by the iterator object (the loop can only provide the loop parameter cursor object). Issues: This proposal requires some solution to the modify-in-place problem for the containers packages (see the family of AI05-0142 proposals). Without that, modifying an element in an iterator would require writing an extra subprogram, meaning the new syntax would not have gained anything. The use of a controlled type for an iterator in this proposal would require bounded containers to be preelaborable units, rather than pure units. We add a prohibition on the actual container from being controlled, so that finalization adverse users could still use the bounded containers (but obviously not the iterators). This seems OK, given that the current accessor proposal (AI05-0142-4) has similar requirements. It would be nice for an iterator implementation to be able to prevent the creation of iterator objects not associated with a loop. For instance, given the implementation of the example below: My_Iter : Some_Instance.Basic_Iterator'Class := My_List.Iterator; would have the effect setting tampering on My_List and keeping it set until until the object is destroyed (potentially a long time in the future). But this is not the natural thing to do -- it requires a lot more typing than the intended usage -- and no major definitional problems are encountered in this case. We would need a lot more mechanism to prevent the creation of objects outside of the loop context, so we do not try to prevent this misuse. Alternatives considered: One option considered was including the name of the cursor type somewhere in the syntax. Something like: defining_identifier : subtype_mark in [reverse] iterator_expression That was not done mainly because differs from the existing syntax for no clear benefit. It would make the type of the identifier clearer, but it also would mean that switching from a discrete iterator to a container one would require some syntax changes that don't really fit into the model. (One solution to that problem would be to optionally add the type name to the discrete for loop as well, but that seems like a bunch of additional changes for little value.) It should be noted that with this formulation, the existing discrete iterators could be considered just a predefined instantiation of this magic generic for discrete types. The analogy breaks down a bit for ranges (they don't look like a function call), but it isn't hard to imagine such a unification. A more important consideration was allowing elements of a container to be directly iterated on. That is, the type of the loop parameter could be the element of a container (that is, anything at all) rather than the cursor to the item. That would require the element type to be included in the generic somewhere, as well as an accessor function. But the important consideration would be how the iterated element actually is handled. The programmer may only want read-only access to the elements, or they may want to change a copy of the element, or they my need to change the element in place. It is difficult to make these decisions for the programmer, and providing a large family of iterators quickly gets complicated. Given all of this, and that existing Ada iteration is on read-only index values (which are can be considered a restricted form of cursors), we only provide iterator on cursors, which are treated as constants within the loop. This way the programmer can pick the most appropriate way to access the actual element: via Element, Update_Element, or (hopefully) some form of Modifiable_Element. !examples This feature could easily be used in the existing Ada.Containers packages. We give a suggested implementation for Ada.Containers.Doubly_Linked_Lists; the other containers could be similar. The start of the visible part of Ada.Containers.Doubly_Linked_Lists would be modified as follows: with Ada.Iterator_Interfaces; generic type Element_Type is private; with function "=" (Left, Right : Element_Type) return Boolean is <>; package Ada.Containers.Doubly_Linked_Lists is pragma Preelaborate(Doubly_Linked_Lists); pragma Remote_Types(Doubly_Linked_Lists); package Cursors is type Cursor is private; pragma Preelaborable_Initialization(Cursor); No_Element : constant Cursor; private ... -- not specified by the language end Cursors; subtype Cursor is Cursors.Cursor; No_Element : Cursor renames Cursors.No_Element; package Iterators is new Ada.Iterator_Interfaces (Cursors, No_Element); type List is new Iterators with private; pragma Preelaborable_Initialization(List); Empty_List : constant List; ... Note that the changes here are the addition of the nested package Cursors, the addition of the instantiation, and the subtype and renames to eliminate the effect of the nested package. The majority of the package is unchanged. After the existing procedure Reverse_Iterate, we would add a new function: function Iterate (Container : in List) return List_Iterators.Reversible_Iterator'Class; This would be described as Iterate returns an object that can be used in a loop_statement to loop with a loop parameter designating each node in Container, starting with the first node and moving the cursor as per the Next function if the reserved word *reverse* is not used in the user_loop_parameter_statement, and starting with the last node and moving the cursor as per the Previous function otherwise. Program_Error is propagated if any operation (in particular, the sequence_of_statements of the loop_statement whose iterator_expression denotes this object) tampers with the cursors of Container while the returned object exists. This could be implemented by defining a controlled type in the private part of Ada.Containers.Doubly_Linked_Lists: type Our_List_Iterator is new Ada.Controlled.Limited_Controlled and List_Iterators.Reversible_Iterator with record Our_List : access List; end record; function First (Object : Our_List_Iterator) return Cursor; function Next (Object : Our_List_Iterator; Position : Cursor) return Cursor; function Last (Object : Our_List_Iterator) return Cursor; function Previous (Object : Our_List_Iterator; Position : Cursor) return Cursor; procedure Finalize (Object : in out Our_List_Iterator); The implementation of these operations would be something like: function Iterate (Container : in List) return List_Iterators.Reversible_Iterator'Class is begin return Iter:Our_List_Iterator do Iter.Our_List := Container'Unchecked_Access; -- This is safe unless someone does an Unchecked_Deallocation on -- the container in the loop; we don't try to protect against that -- in other cases for Ada.Containers, so we won't here, either. -- Set a tampering context for Iter.Our_List.all. end return; end Iterate; function First (Object : Our_List_Iterator) return Cursor is begin return Object.Our_List.First; end First; function Next (Object : Our_List_Iterator; Position : Cursor) return Cursor is begin return Next (Position); end Next; -- Similar implementations for Last and Previous. procedure Finalize (Object : in out Our_List_Iterator) is begin -- Clear the tampering context for Object.Our_List. end Finalize; Note that we didn't add these interfaces to the actual container types. That was because we wanted these iterators to be a tampering context (thus ensuring that they terminate and do not lead to erroneous execution). Users would be very surprised if a "safe" construct like a for loop caused erroneous execution (which will happen if the node designated by the loop parameter is deleted). The function Iterate would normally be used directly in a loop: for C in My_List.Iterate loop ... Element (C) ... end loop; We could also define a partial list iterator: function Partial_Iterate (Container : in List; Start : in Cursor; Finish : in Cursor := No_Element) return List_Iterators.Reversible_Iterator'Class; where Start and Finish (if not No_Element) would have to be cursors for Container. In this case, the iteration would run from Start to Finish (or No_Element, if Finish is not on the list after/before Start) forward or in reverse. (This would also be a tampering context as above.) We would use Partial_Iterator directly in a loop: for C in Partial_Iterate (My_List, Start, Finish) loop ... Element (C) ... end loop; Some reviewers have said that they would prefer the iterator to work directly on the container object. This would only eliminate a single selected name, which does not seem worth the problems that would be introduced. The example above means that an iterator is written: for C in My_List.Iterate loop Only the truly lazy would find a significant difference from: for C in My_List loop Moreover, we are planning (as of this writing) to use a similar mechanism to provide safe accessors. That mechanism also requires writing an extra selector; it would seem bizarre to work hard to eliminate that here, but still handle use it for accessors. Finally, a direct syntax would not allow nested loops (as different termination code is needed for each loop) unless significant additional magic is defined. Essentially, something like function Iterate would have to be defined internally to the interface to provide the needed temporary space/termination object. This would require far more mechanism than the current proposal. About the only thing going for the direct syntax is that it would be a bit easier to understand. However, usage of the current proposal is so easy that once a programmer sees an example, they will immediately understand how it works. Understanding the details would rarely be needed. !ACATS test !appendix From: Randy Brukardt Date: Monday, January 19, 2009 11:31 PM Here's the third of my trial balloons. For those following the recent comp.lang.ada discussion, this hopefully explains why I said that defining proper syntactic iterators was "trivial". It least I think this is a very simple proposal that really has no reason not to be adopted other than the needed prerequisites. [This is version /01 of the AI - ED.] **************************************************************** From: Randy Brukardt Date: Tuesday, January 20, 2009 12:27 PM There's a couple of points that I left out of this one yesterday: --- The use of a controlled type for an iterator would require bounded containers to be preelaborable units, rather than pure units. We'd still have a prohibition on the actual container from being controlled, so that finalization adverse users could still use the bounded containers (but obviously not the iterators). --- I had said "This proposal requires some solution to the modify-in-place problem for the containers packages." That was true in my original draft, which tried put the iterator interface directly on the container type. Since I gave up that model for Ada.Containers (in order to support tampering), it would be possible to hide an unsafe accessor in the private part and allow it to be used only in an element iterator (which, being a tampering context, is safe from most bad operations). So an access-to-element alternative could in fact be accomplished safely and practically without having to make unsafe operations available. Doing so would definitely require a solution to the instantiation on partial types problem (the current solution could be put into a child package although that would make the usage unnecessarily messy). **************************************************************** From: Randy Brukardt Date: Saturday, February 28, 2009 9:10 PM At the recent meeting, we had a rather forceful discussion about "safe" vs. "unsafe" iterators. I said that I would not stand for unsafe iterators on the predefined containers. (The existing containers have safe iterators as they do not allow "tampering with cursors" and require checks to prevent that.) The discussion about what "safe" meant concentrated on the fact that a "safe" iterator has guaranteed termination. That probably led some of you to think I was overstating the case for "safe" iterators. Unfortunately, I was too busy taking notes to be able to come up with the other part of safe iterators: they can't become erroneous as long as the container object exists. The following loop (in the absence of a tampering check) is erroneous: for C in My_List loop if then Delete_First (My_List); end if; end loop; The problem is that the cursor C has become invalid (A.18.3(153-156/2)); the iterator will then do an Next operation on it, which is erroneous by A.18.3(157/2). This is really hard to see in this formulation (which is why it is so dangerous), so lets look at the loop written explicitly: declare C : Cursor := First (My_List); begin while C /= No_Element loop if then Delete_First (My_List); end if; C := Next (C); -- (!) end loop; end; Here we can see the use of C after it becomes dangling, but even here it isn't obvious that it is dangling. Recall that one could also cause this problem using Delete, Splice, and various other operations. (Many of those would require making a copy of the cursor, which hopefully would be a red flag -- but not necessarily.) And the bad operation could have been called in a subprogram called from three levels of calls started from this loop - it is not necessarily right in the loop. Since this form of loop will be the most common in practice (no one is going to write an old passive iterator if they have this syntax available), it should be reasonably safe and should not become erroneous easily. Thus a tampering check (as in the existing procedure Iterate) is needed. Two caveats to this discussion: (1) It is possible for the container in an iterator to be destroyed (usually via Unchecked_Deallocation) while an iterator is running. That is clearly erroneous by existing language rules, and there is nothing practical that can be done to prevent it. So even a "safe" iterator is not completely safe. But it seems unusual to be destroying a container that way; deleting an element is a much more likely operation. (2) Another way to prevent this particular problem would be to require the use of "invalid" cursors that reference a container that still exists to raise Program_Error rather than be erroneous. That, however, is a distributed overhead that may not be appropriate to require on all containers. Most existing containers implementations do some form of such a check, but it is not clear that any cheap scheme for such a check would be sufficient to implement a language requirement. That's because a cheap check would use some sort of serial number scheme, and it would always be possible to do a set of operations that would cause the check to fail to be made. (It would be pretty unlikely that that would happen.) Requiring such checks would have the effect of making all containers usage safer (such as the user-written loop in the second example), and would reduce the iterators issues solely to termination (for which I still would disagree about avoiding checks but would not be required to bang my shoes on the desk and stalk out... ;-) Note that we could only make this requirement in the case covered by A.18.3(156/2); the other cases have to remain erroneous. So we'd probably have to define a new cursor state (say "dangling") which was guaranteed to raise Program_Error. **************************************************************** From: Matthew Heaney Date: Tuesday, March 3, 2009 5:48 PM > Since this form of loop will be the most common in practice (no one is > going to write an old passive iterator if they have this syntax > available), it should be reasonably safe and should not become > erroneous easily. Thus a tampering check (as in the existing procedure > Iterate) is needed. I don't know whether this is relevant, but in GNAT this error is detected: with Ada.Containers.Doubly_Linked_Lists; procedure Test_Iter is package Lists is new Ada.Containers.Doubly_Linked_Lists (Positive); use Lists; L : List; C : Cursor; begin L.Append (1); L.Append (2); L.Append (3); C := L.First; while C /= No_Element loop if Element (C) = 1 then L.Delete_First; end if; C := Next (C); end loop; end Test_Iter; $ gnatmake -gnata -gnat05 test_iter gcc -c -gnata -gnat05 test_iter.adb gnatbind -x test_iter.ali gnatlink test_iter.ali $ ./test_iter raised SYSTEM.ASSERTIONS.ASSERT_FAILURE : bad cursor in Next You have to compile with assertions enabled to turn on the detection, but otherwise it just works, without requiring any extra tampering checks. In any event, I don't know how you'd implement a tampering check when you're performing an active iteration. There is no "stack" as you have when performing a passive iteration. **************************************************************** From: Randy Brukardt Date: Tuesday, March 3, 2009 6:03 PM > I don't know whether this is relevant, but in GNAT this error is > detected: ... Well, one option obviously would be require the detection of this error. But then there is a distributed overhead (at least I presume so, since you allow turning the check off). ... > In any event, I don't know how you'd implement a tampering check when > you're performing an active iteration. There is no "stack" as you > have when performing a passive iteration. True, but a syntactical "for" loop is much more like a passive iterator than an active one. The mechanism of iteration is hidden in it (like a passive iterator but unlike an active one). So the user of it has no reason to know what operations will cause it to fail to work. (Yes, they could "lift the hood" and poke around at how the implementation is defined, but surely we want to people to be able to use the abstraction without doing that.) My net takeway is that we want "for loop" syntax to work as much like the passive iterator as possible (with the added bonus that exits are much easier with the "for loop" syntax). After all, no sane person will ever call the passive iterator again once the "for loop" syntax is available. Getting much less safe is going in the wrong direction. **************************************************************** From: Matthew Heaney Date: Tuesday, March 3, 2009 6:37 PM > Well, one option obviously would be require the detection of this > error. But then there is a distributed overhead (at least I presume > so, since you allow turning the check off. There's a function we call anytime a cursor is passed as a parameter (well, called when assertions are enabled). If there's a problem the condition is usually detectable right away (e.g. when deallocating a node, we set the prev and next pointer to the value of the node being deleted, and later test for this case). > True, but a syntactical "for" loop is much more like a passive > iterator than an active one. The mechanism of iteration is hidden in > it (like a passive iterator but unlike an active one). So the user of > it has no reason to know what operations will cause it to fail to > work. (Yes, they could "lift the hood" and poke around at how the > implementation is defined, but surely we want to people to be able to > use the abstraction without doing that.) If you're talking about adding a first-class language construct for iterating over containers, then yes, manipulating the tampering bits would work. (I assume that the mechanism would give the implementor some kind of hook a la Controlled.Initialize to indicate that iteration has started.) The VB(script) runtime works this way. The container object has a distinguished _NewEnum method that is called when the loop "elaborates", returning an object that is sort of a context for the iteration, and the script run-time uses this to maintain iterator state: Dim objFiles, objFile; Set objFiles = objFolder.Files For Each objFile In objFiles 'do something If objFile.Predicate Then Exit For End If Next The iteration object ("enumerator") is automatically destroyed when the loop terminates. **************************************************************** From: Bob Duff Date: Tuesday, March 3, 2009 6:34 PM > At the recent meeting, we had a rather forceful discussion about "safe" vs. > "unsafe" iterators. I said that I would not stand for unsafe iterators > on the predefined containers. (The existing containers have safe > iterators as they do not allow "tampering with cursors" and require > checks to prevent > that.) > > The discussion about what "safe" meant concentrated on the fact that a > "safe" iterator has guaranteed termination. I presume you mean that they should terminate for the predefined containers. And allow me to create iterators that don't terminate, right? I mean, infinite data structures can be useful. I think I agree with your other comments, on tampering and so forth. **************************************************************** From: Randy Brukardt Date: Wednesday, March 4, 2009 1:39 PM ... > > The discussion about what "safe" meant concentrated on the fact that > > a "safe" iterator has guaranteed termination. > > I presume you mean that they should terminate for the predefined containers. Correct, I was only talking about the predefined containers. > And allow me to create iterators that don't terminate, right? > I mean, infinite data structures can be useful. There is nothing that the language could do to guarantee termination. Personally, I find the use of the keyword "for" with something that does not terminate abhorrent (see below), but this is clearly no different than defining "*" to do a divide. So you would be able to do what you want. One of the great features of Ada is that you can't modify the for loop parameter. Thus, when you see the keyword "for", you know automatically that the loop terminates without any further analysis. Indeed, I was taught that the only practical way to prove loop termination is to show that the loop is equivalent to a for loop -- and that is a technique that I use (informally) periodically. (Since 30 years have gone by since I learned that, perhaps some better technique has been invented.) That technique would not be very useful if for loops themselves were not guaranteed to terminate. While it's clear that we can't *force* iterators to be written so they always terminate (that would require proving the absence of bugs in the definition of an iterator, which is surely beyond what Ada can do - maybe SPARK could do it, but it would have to learn about exceptions to do so), surely we want anything that is language-defined to have that property. And personally, I'll consider any non-terminating iterators to be broken, but YMMV. **************************************************************** From: Robert I. Eachus Date: Wednesday, March 4, 2009 9:30 PM >Indeed, I was taught that the only practical way to prove loop termination is >to show that the loop is equivalent to a for loop -- and that is a technique >that I use (informally) periodically. (Since 30 years have gone by since I >learned that, perhaps some better technique has been invented.) That technique >would not be very useful if for loops themselves were not guaranteed to >terminate. Off topic, but:I thought it might be of interest to more than Randy: Yes, there are better methods. The generalization of the for loop approach is to associate the problem space with a finite set. If the algorithm visits another member of the set after a finite step, and never revisits a set member, the program will always terminate.(There are a lot of cases where allowing the final node to be visited twice simplifies the proof, but that is a detail.) For a trivial example, say you want to find the shortest route through a graph containing n nodes so that each node is visited once.. The problem space is the n factorial permutations of n. Any algorithm that visits members of the problem space only once, and either terminates there or moves to a new member within a finite time, always terminates. This proof works for the brute force algorithm that visits all permutations, and also works for much more complex (and faster) algorithms that eliminate entire sets of permutations at each step. (At least for graphs with non-negative distances.) If a solution was allowed to pass through some nodes more than once, I guess you could generalize to visiting each point in the solution set at most n times, but an alternative is to renumber the nodes. Try adding a count of untraversed adjoining edges to the nodes to get a larger problem set. There is another way to prove termination, that became somewhat popular around 1980 with respect to linear programming and other optimization techniques. Unfortunately for actual computer codes, instead of pure math, it is often very painful. Associate a real (not integer or floating-point) variable with the loop, and find a definition such that both that the value of variable has a limit, and that each iteration of the loop monotonically increases (or decreases) the variable. Now if the termination condition requires only that the variable reach a value close to* the limit, the loop terminates. Many geometric methods to solve linear programming problems use some variation on this to show termination. More practically, some actually use it! Prior to starting the program, you have no clue as to what the final, correct answer is. But you do know from the initial conditions that you can specify a volume such that any volume that small contains a single potential solution. (To put in LP terms, when the volume is that small the set of slack variables can be easily read off.) So you have two loops, one which reduces the volume known to contain the solution at a rate with a polynomial bound in the parameters, and a second short loop in the number of inequalities which reads out the solution. ;-) *Yes, I know I am being mathematically imprecise here. Any real proof will define both the behavior of the variable near the limit, and the distance between the termination condition and the actual limit in terms of some common epsilon. Obviously, using floating point of some sort rather than real numbers is where the pain comes in. The distance between the termination condition and the actual limit has to be large relative to the numerical accuracy of the representation, and no two steps in the loop can be closer together than about twice the epsilon of the representation. So a proof using real numbers is just a starting point. **************************************************************** From: Randy Brukardt Sent: Wednesday, May 27, 2009 10:06 PM [The following is a partial message from another thread, see AI05-0135-1 for the entire thread. - Editor.] One example that I've been thinking about is the proposed generic to provide iterators. If we do decide to define that "magic" generic, we will need to rearrange the Ada.Containers packages a lot to use it. But that rearrangement would seem to be incompatible in a number of ways, and limiting on implementation strategies to boot. To refresh, assume that we have a magic iterator generic (this is simplified for this example, as are the following packages) and we have Steve's "integrated packages": generic type Cursor is private; No_Element : in Cursor; package Ada.Iterator_Interfaces is type Basic_Iterator is limited interface; function First (Object : Basic_Iterator) return Cursor; function Next (Object : Basic_Iterator; Position : Cursor) return Cursor; end Ada.Iterator_Interfaces; [Note that the exact form of this generic doesn't matter much in terms of what rearrangements need to be done to the Ada.Containers packages.] Now, we currently have: generic type Element_Type is private; package Ada.Containers.Doubly_Linked_Lists is type List is tagged private; type Cursor is private; Empty_List : constant List; No_Element : constant Cursor; function Is_Empty (Container : List) return Boolean; procedure Clear (Container : in out List); function Element (Position : Cursor) return Element_Type; ... private ... -- not specified by the language end Ada.Containers.Doubly_Linked_Lists; In order to add Basic_Iterator visibly to type List, we'll have to rearrange this into using at least one nested package. The simplest I can come up with is: with Ada.Iterator_Interfaces; generic type Element_Type is private; package Ada.Containers.Doubly_Linked_Lists is use package Cursors is type Cursor is private; No_Element : constant Cursor; private ... -- not specified by the language end Cursors; package Iterators is new Ada.Iterator_Interfaces (Cursors, No_Element); type List is new Iterators with private; Empty_List : constant List; function Is_Empty (Container : List) return Boolean; procedure Clear (Container : in out List); function Element (Position : Cursor) return Element_Type; ... private ... -- not specified by the language end Ada.Containers.Doubly_Linked_Lists; Limited with incompatibilities don't matter in this case, as you can't get a limited view of a generic unit. Primitiveness incompatibilities are definitely in evidence, but I do have to wonder if it matters. In particular, if you derive from type Cursor currently, you will inherit function Element. But you will no longer inherit that routine with this rearrangement. I don't know how significant that is (I would suggest it is not very significant). Naming incompatibilities don't seem to be a problem with this example. They could be if we wanted to remove the "integrated" nested package (as someone could have named it explicitly in a use clause or in an expanded name). That's not likely to be an issue in this case. Since there is only two entities (plus the inherited "=") in the nested packages, we could just rename them and not need any special construct at all. But we'd still have the incompatibilities. (I did not expect this result when I started writing this message.) But there still is a significant issue. While we are not specifying what goes into the private parts, we do need to be able to write them. A cursor needs a reference to a node and to the full container (the List in this example). The node probably could be defined in the private part of package Cursors, but we can't define a reference to List in the private part -- it hasn't been declared yet! And it *can't* be declared yet, because it depends on the interface that we haven't yet created. One way to workaround this problem is to use a Taft-amendment type in the private part of Cursors. But we can't complete that with the type List itself; we'd have to use a derived type. And that is obnoxious, forcing many junk type conversions. Even using class-wide types won’t get rid of those conversions, because they would be toward the root of the tree. Another possible workaround would be to clutter the spec with an incomplete type List before the nested package. But that could not be completed with the private extension List, so we would still end up with the same problem. In conclusion, this rearrangement works, has some client incompatibility (but it probably isn't important), but forces a major amount of junk type conversions in the body of the package. Of course, this is only one example, and it would be nice to look at additional examples in the future. **************************************************************** From: Tucker Taft Sent: Thursday, May 28, 2009 2:12 PM > Another possible workaround would be to clutter the spec with an > incomplete type List before the nested package. But that could not be > completed with the private extension List, so we would still end up with the > same problem. This particular problem seems fixable. The requirement that an incomplete type declaration be completed with a full_type_declaration seems arbitrary. There seems no inherent reason that it couldn't be completed with a private type or private extension declaration. The incomplete types produced from a "limited with" allow that, so the mechanism will be needed in Ada 2005 to handle such a completion anyway. > In conclusion, this rearrangement works, has some client > incompatibility (but it probably isn't important), but forces a major > amount of junk type conversions in the body of the package. Of course, > this is only one example, and it would be nice to look at additional examples > in the future. Normally I would have expected that almost the entire package would be included in an "integrated" package, and then this big nested package would be followed by various generic instantiations. But you have illustrated a case where that doesn't work, because you want one of the types to be built on the result of a generic instantiation. I wonder how common that is. How would you do it without this feature? As you point out, the integrated package is mostly just a short-hand for a bunch of renames, and in this case, there aren't that many. Hiding the nested package name automatically from use-visibility might be a helpful advantage of the proposed feature ;-). It does seem the ability to define an incomplete type that is completed by a partial view would be helpful, independent of this proposal. I wonder if your "magic iterator" generic has this problem in general, and perhaps it needs to be *defining* a cursor type, rather than taking it as a parameter. Or is it the fact that we want the container to be usable as an iterator? I think that is probably a bad idea in general, as it is inherently problematic to have the container carry some state about a particular iterator over the container. I understand the desire for a *syntax* approximating: for Element in Container loop ... end loop; But I see no requirement that this means we somehow have an iterator built into the Container itself. Also, we would certainly want to allow multiple tasks to iterate over a Container in parallel, I would think, or to allow nested iterators such as: for Left in Container loop for Right in Container loop Try_Pair(Left, Right); end loop; end loop; I think the above *syntax* should always presume there is an anonymous iterator object created, and that is what carries the state of the iteration. In any case, I wonder whether it is the strange nature of this generic that is creating the problem, or is there a common need to be able to define in a single package two recursively-dependent types one of which is based on a generic instantiation of the other. I suppose this is a very hard question to answer... Thanks for the example! **************************************************************** From: Randy Brukardt Sent: Thursday, May 28, 2009 2:49 PM ... > This particular problem seems fixable. The requirement that an > incomplete type declaration be completed with a full_type_declaration > seems arbitrary. There seems no inherent reason that it couldn't be > completed with a private type or private extension declaration. > The incomplete types produced from a "limited with" allow that, so the > mechanism will be needed in Ada 2005 to handle such a completion > anyway. Yes, I in fact wrote this up originally assuming that such completions were allowed. But before making a fool out of myself, I looked up the actual rules and changed this part accordingly. It would still clutter the spec a bit with an extra type declaration that doesn't really have anything to do with anything other than the implementation, but not a major problem overall. > > In conclusion, this rearrangement works, has some client > > incompatibility (but it probably isn't important), but forces a > > major amount of junk type conversions in the body of the package. Of > > course, this is only one example, and it would be nice to look at > additional examples in the future. > > Normally I would have expected that almost the entire package would be > included in an "integrated" package, and then this big nested package > would be followed by various generic instantiations. But you have > illustrated a case where that doesn't work, because you want one of > the types to be built on the result of a generic instantiation. I > wonder how common that is. No idea. It seems to need interfaces to make much sense, so it would mainly come up in new code. > How would you do it without this feature? As you point out, the > integrated package is mostly just a short-hand for a bunch of renames, > and in this case, there aren't that many. > Hiding the nested package name automatically from use-visibility might > be a helpful advantage of the proposed feature ;-). I was surprised to find out that it doesn't seem to matter much either way; the key is to use the nested package. Making it "integrated" is just icing. > It does seem the ability to define an incomplete type that is > completed by a partial view would be helpful, independent of this > proposal. Yes, I think so. > I wonder if your "magic iterator" generic has this problem in general, > and perhaps it needs to be *defining* a cursor type, rather than > taking it as a parameter. I don't know how that could work, given that container cursors include a reference to the container object. I suppose the entire thing could be created as a mix-in generic (adding to the container directly), but that seems annoying (isn't that exactly the sort of thing that interfaces were created to avoid?). > Or is it the fact that > we want the container to be usable as an iterator? I think that is > probably a bad idea in general, as it is inherently problematic to > have the container carry some state about a particular iterator over > the container. I strongly agree with this, but you need to convince Ed. :-) Maybe the rest of your message will help. > I understand the desire for a *syntax* approximating: > > for Element in Container loop > ... > end loop; I had proposed for Element in Container.Iterator loop ... end loop; (where "Iterator" is a function returning an anonymous iterator object). Since we're using that same sort of model for accessors (there is an "extra" discriminant reference in those), I don't see the problem. But of course, I was always happy with this formulation, it is other people that were not. > But I see no requirement that this means we somehow have an iterator > built into the Container itself. Also, we would certainly want to > allow multiple tasks to iterate over a Container in parallel, I would > think, or to allow nested iterators such as: > > for Left in Container loop > for Right in Container loop > Try_Pair(Left, Right); > end loop; > end loop; > > I think the above *syntax* should always presume there is an anonymous > iterator object created, and that is what carries the state of the > iteration. With the accessor proposal, you have to write a single extra selection; that seems fine to me for iterators, too. Since it would not be surprising to use both of them together, it would actually be kinda weird to have simpler syntax for the iterator and not the accessor: for Cursor in Container.Iterator loop Container.Accessor(Cursor).Element.Component := 10; end loop; > In any case, I wonder whether it is the strange nature of this generic > that is creating the problem, or is there a common need to be able to > define in a single package two recursively-dependent types one of > which is based on a generic instantiation of the other. I suppose > this is a very hard question to answer... I would expect a similar problem any time that you wanted to define an interface that needs customization with some properties. But it is hard to tell for sure, other than to say that designers are clever and come up with structures that no one expects. ****************************************************************