Version 1.4 of ai05s/ai05-0139-2.txt
!standard 5.5 10-02-11 AI05-0139-2/04
!class Amendment 09-02-13
!status work item 09-02-13
!status received 09-01-19
!priority Medium
!difficulty Medium
!subject Syntactic sugar for accessors, containers, and iterators
!summary
(See proposal.)
!problem
Using an iterator over one of the Ada 2005 containers libraries requires writing
a subprogram which gets called repeatedly with each element of the container.
That effectively requires writing the loop body out-of-line from the loop,
making the code inside-out of its normal flow, requiring the writing of a lot of
extra text, and thus making it much harder to read and understand.
A better mechanism is required.
In addition, using the notion of "accessors" or "references" as defined in AI05-0142
is syntactically awkward. Finally, it is often clearer to iterate over the elements
of a container (or array) directly, rather than iterating over a cursor (or index)
into the container (or array).
Using accessors for user-defined dereference (rather than simple access values
as proposed in AI05-141) has the potential to provide a safer mechanism that
might be used for persistence, relocating garbage collection, detection of
dangling references, etc. However, using accessors would be syntactically
painful compared to a simple ".all."
In all these cases, a modest amount of "syntactic sugar" might make the use
of accessors, containers, and iterators more natural and easier to read.
!proposal
Define a "magic" interface, a new operator, a "magic" generic unit,
and some additional "aspects" for the aspect_specification, all of
which are linked to some appropriate syntactic sugar for accessors,
containers, and iterators:
1) SUPPORT FOR ACCESSORS/REFERENCES
We propose to add a "magic" interface called Reference. Any type
descended from Reference with exactly one access discriminant is
considered a "reference" type, and provides implicit dereference in all
contexts, with the "implicit dereference" of an object Ref of such a
type being equivalent to "Ref.<discrim>.all," where "<discrim>" is the
name of the access discriminant. If the access discriminant is an
access constant, then of course this dereference is a constant view. If
not, then it is a variable view, and can be used on the left-hand-side
of an assignment, as the actual for an [in] out parameter, etc. These
reference types are particularly useful because they can be controlled
types or have a controlled part, so that when finalized, a user-defined
finalization action will be invoked. This allows, for example, the
"tampering" state of a container to be reset, a persistent object to be
"unpinned," a reference count to be decremented, etc.
The Reference interface is defined in its own package:
package Ada.References is
type Reference is limited interface;
end Ada.References;
Here is an example of use. We imagine type T and a type T_Table which
is some kind of container with T elements in it, indexed by a string.
with Ada.References;
with Ada.Finalization;
package Whatever is
type T is private;
Null_T : constant T; --
function New_T(...) return T;
type T_Table is tagged limited private;
type Ref_T(Data : access T) is
new Ada.Finalization.Limited_Controlled
and Ada.References.Reference with ...
--
procedure Enter(Tab : in out T_Table; Key : String; Value : T);
--
function Lookup_Or_Create(Tab : access T_Table; Key : String) return access T;
--
--
--
function Element(Tab : access T_Table; Key : String) return Ref_T;
--
private
...
end Whatever;
package body Whatever is
function Element(Tab : access T_Table; Key : String) return Ref_T is
--
begin
--
return Ref_T'(Data => Lookup(Tab, Key));
end Element;
...
procedure Do_Something(Tab : access T_Table) is
--
--
--
begin
Tab.Element("Apple") := New_T(...);
if Tab.Element("Banana") = Null_T then
Tab.Element("Banana") := New_T(...);
end if;
...
end Do_Something;
My_Tab : aliased T_Table;
begin
Do_Something(My_Tab'access);
end Whatever;
As you can see, references provides a very natural interface for dealing with
elements of a container.
2) SUPPORT FOR CONTAINER INDEXING
We now take this one step further by allowing the definition of a new
operator, "()", which allows the elimination of the function name in
uses like that above, simplifying the notation to simply
"Tab("Banana")". This works as follows:
package Whatever is
... as above
function "()"(Tab : access T_Table; Key : String) return Ref_T;
...
end Whatever;
Now the body of Do_Something can be simply:
procedure Do_Something(Tab : access T_Table) is
--
--
--
begin
Tab("Apple") := New_T(...);
if Tab("Banana") = Null_T then
Tab("Banana") := New_T(...);
end if;
...
end Do_Something;
Essentially we are allowing a container to be "indexed" in the
same way as an array is indexed, with the key (or potentially
various different key types) being arbitrary.
The basic rule with "()" is that it must be suitable for prefix
notation, meaning that it is either a primitive operation or a
class-wide operation of some tagged type (via its first parameter), and
if class-wide, is declared immediately within a package where some
tagged type covered by the class-wide type is immediately declared. It
must have at least two parameters. There seems no particular reason it
couldn't have more than two parameters. There are no predefined "()"
operators, though one could imagine that all array types have such
operators, if it helps in understanding the model.
One relatively important rule for "()" is that one can define two
operators, one that requires the controlling operand to be a variable,
and one that allows it to be a constant, and overload resolution will
resolve to the one that requires a variable if the actual is a variable,
and otherwise it will resolve to the one that allows a constant. This
rule is important so that the same notation, such as "Container(Key)",
can be used when Container is a constant as when it is a variable, and
the variableness of the result can be determined by the variableness of
the Container.
Hence, we would more likely in a package such as the one used in
the above example, define an additional reference type:
type Ref_Const_T(Data : access constant T) is
new Ada.Finalization.Limited_Controlled
and Ada.References.Reference with ...
and an additional "()" operator:
function "()"(Tab : access constant Table; Key : String) return Ref_Const_T;
With these two additional declarations, we can now use the "Tab(Key)"
notation whether "Tab" is a constant or a variable, and the appropriate
overloading of "()" will be invoked.
3) SUPPORT FOR ITERATORS
Given these two capabilities, we are now ready to define a more general
"iterator" syntax, that is, a variant of the "for" loop statement which
works naturally on containers. In fact we propose two distinct iterator
syntaxes:
a) The first syntax is intended for iterating over the set of cursors or
indices that select elements of a container. This is most analogous to
what is already available for arrays by writing "for I in Arr'Range
loop":
for Cursor in Iterate(Container) loop
Container(Cursor) := Container(Cursor) + 1;
end loop;
What the above presumes is that the function Iterate returns an object
of a type derived from a "magic" iterator type (see below). The
iterator type provides a First and Next operation which return a Cursor
type, and a No_Element cursor value to indicate end of iteration. The above
expands into:
declare
_Iter : Iterator_Type := Iterate(Container);
Cursor : Cursor_Type := First(_Iter);
begin
while Cursor /= No_Element loop
Container(Cursor) := Container(Cursor) + 1;
--
--
Cursor := Next(_iter, Cursor);
end loop;
end;
The generic package Iterator_Interfaces defines a pair of "magic" interfaces
used to "enable" the above iterator syntax for a given cursor type:
generic
type Cursor is private;
No_Element : in Cursor;
package Ada.Iterator_Interfaces is
type Basic_Iterator is limited interface;
function First (Object : Basic_Iterator) return Cursor;
function Next (Object : Basic_Iterator; Position : Cursor) return Cursor;
type Reversible_Iterator is limited interface and Basic_Iterator;
function Last (Object : Reversible_Iterator) return Cursor;
function Previous (Object : Reversible_Iterator; Position : Cursor) return Cursor;
end Ada.Iterator_Interfaces;
By instantiating this package with a given cursor type, and then
deriving one or more type(s) from the Basic_Iterator and/or
Reversible_Iterator interfaces, this first iterator syntax is enabled for
the given iterator type(s). See above for an example.
Note that the routines defined in this magic generic generally match
those used in iteration for the Ada.Containers packages, and their
meaning is the same. The only difference is that we've added an iterator
object parameter to the Next and Previous functions (which is necessary
to make them part of the interface).
The interface Basic_Iterator defines a user-defined iterator for the
forward direction; the interface Reversible_Iterator defines a
user-defined iterator for both the forward and reverse direction.
Descendants of Reversible_Iterator can be used in loops with the
REVERSE reserved word.
The function First is expected to deliver the first cursor value for the
iterator iterating in the forward direction, or return No_Element if no
cursor value is available. The function Next is expected to deliver the
next cursor value for the iterator iterating in the forward direction
given the previous cursor value returned from a call to First or Next,
or return No_Element if no additional cursor values are available.
The function Last is expected to deliver the last cursor value for the
iterator iterating in the reverse direction, or return No_Element if no
cursor value is available. The function Previous is expected to deliver
the next cursor value for the iterator iterating in the reverse
direction given the previous cursor value returned from a call to Last
or Previous, or return No_Element if no additional cursor values are
available.
b) The second syntax is intended for iterating directly over the elements
of a container or an array:
for Element of Container loop
Element := Element + 1;
end loop;
or
for Element of Arr loop
Element := Element + 1;
end loop;
This second syntax is more convenient, but less flexible, in that there
is no way to specify the particular iterator to use, so it requires
that there be a "default" iterator.
Two new "aspects" for an aspect_specification on a container type are
used to enable the second syntax (Default_Iterator and
Iterator_Element). If the aspects Default_Iterator and Iterator_Element
are specified, then the above loop over Container expands into:
declare
_Iter : Iterator_Type := Container_Type'Default_Iterator(Container);
_Cursor : Cursor := First(_Iter);
begin
while _Cursor /= No_Element loop
declare
Element : Container_Type'Iterator_Element renames Container(_Cursor);
--
begin
Element := Element + 1;
_Cursor := Next(_Iter, _Cursor);
end;
end loop;
end;
Here is how this generic and the iterator-related aspects might be used:
type Cursor_Type is ...
package Iterators is new Ada.Iterator_Interfaces(Cursor_Type, No_Element);
type Container is ...
with
Default_Iterator => Iterate,
Iterator_Element => Element;
type Element_Ref(E : access Element) is limited
new Ada.References.Reference with private;
type Element_CRef(E : access constant Element) is limited
new Ada.References.Reference with private;
type Iterator_Type is new Iterators.Reversible_Iterator with ...
function Iterate(Con : Container) return Iterator_Type;
function "()"(Con : access Container; Cur : Cursor_Type) return Element_Ref;
function "()"(Con : access constant Container; Cur : Cursor_Type) return Element_CRef;
...
These two new kinds of "for" loops would require two new kinds of
"for" loop parameter specifications:
defining_identifier IN [REVERSE] iterator_expression
defining_identifier [: subtype_indication] OF container_expression
The first requires an expression returning an object of a type
descended from Basic_Iterator, or Reversible_Iterator if REVERSE appears.
The defining_identifier is of the corresponding Cursor type, and can
be used to index into the container.
The second requires an expression returning an array, or an object
of a type which has aspects Default_Iterator and Iterator_Element
specified. The subtype_indication is optional, since the Iterator_Element
aspect of the container type determines the Element type. Similarly for an
array, the Element type is simply the array component type. One purpose of
the subtype_indication might be to tighten the requirements on the
elements of the container/array, acting essentially as an assertion
about the values stored in the container/array at the time of
the iteration. Another purpose is simply to help the reader,
and provide an extra check that the Iterator_Element type of the
container is as expected.
Here are some additional examples of iterators.
An example of the first iterator would be:
V : Int_Vectors.Vector;
...
for C in V.Iterate loop
V(C) := V(C) + 1;
--
--
end loop;
Some examples of the second iterator would be:
Arr : array(1..10) of Natural := ...
...
for X : Natural of Arr loop
--
--
--
X := X + 1;
end loop;
--
Vec : Int_Vectors.Vector;
...
for X of Vec loop
--
--
--
X := X + 1;
end loop;
The "of" syntax, suggested by J.P. Rosen, seems to help identify X
as an element of the array/container rather than as an index or cursor
in the sequence defined by the discrete range or iterator.
!wording
** TBD **
!discussion [NOT FULLY UPDATED -- SEE PROPOSAL FOR SOME NEWER DISCUSSION]
This is based on the earlier proposal and discussion recorded in AC-0112.TXT.
This proposal was modified to address the comments made on that one, and has
been simplified considerably.
Note that the iterator interface can be directly applied to a container type
(allowing it to be directly iterated) or it can be part of a separate function
(which would allow adding additional parameters to the iterator and/or adding
additional storage for the iterators). By using interfaces, we give maximum
flexibility to the programmer that is using these iterators.
If the creator of the iterator needs extra storage to accomplish the iteration,
they can include it in the type of the iterator.
If the creator of the iterator needs to get control when the loop is exited
prematurely, they can define a controlled iterator type. In that case,
(presuming that the iterator_expression was a function call) when the loop is
exited, the Finalization routine for the iterator object will be called.
We provide a form for reverse iterators. Reverse iteration makes sense for many
kinds of containers (vectors and lists, for instance). Given that the syntax
already exists and every Ada programmer is familiar with it, not providing it
would quickly become one of the most common frustrations of programming in Ada.
That seems silly.
We could have simplified the proposal by requiring that all iterators support
Last and Previous, but iterators where reverse iteration does not make sense
would have to raise an exception when Last is called. The proposed features, in
contrast, make a compile-time determination as to whether reverse is allowed.
That will detect errors earlier; it generally is better to detect errors at
compile-time if possible.
It might appear that this proposal would require some solution to the
instantiation for private types problem, but as the examples below show, this
is not true.
Note that the actual iterator type can be either limited or non-limited. The
value of a limited iterator object is that an iterator implementation can
assume (presuming that it only exports functions and not types) that the
iterator object is created and destroyed with the loop. This allows safe
cleanup operations when the loop is exited (as in the examples below).
If that capability is not needed, an iterator implementation allow reuse of
iterator objects (especially if no temporary storage is needed).
Note that if temporary storage or loop exit actions are needed, the iterator
object must be separate from the container object that it is iterating
on. Otherwise nested loops on the same container would not work.
for I in My_Container.Iterator loop
for J in My_Container.Iterator loop
...
end loop;
end loop;
The I and J loops need separate temporary storage, which must be provided by
the iterator object (the loop can only provide the loop parameter cursor object).
Issues:
The use of a controlled type for an iterator in this proposal would require
bounded containers to be preelaborable units, rather than pure units. We add a
prohibition on the actual container from being controlled, so that finalization
averse users could still use the bounded containers (but obviously not the
iterators). This seems OK, given that the current accessor proposal (AI05-0142-4)
has similar requirements.
It would be nice for an iterator implementation to be able to prevent the
creation of iterator objects not associated with a loop. For instance, given
the implementation of the example below:
My_Iter : Some_Instance.Basic_Iterator'Class := My_List.Iterator;
would have the effect setting tampering on My_List and keeping it set until
the object is destroyed (potentially a long time in the future). But this
is not the natural thing to do -- it requires a lot more typing than the intended
usage -- and no major definitional problems are encountered in this case. We
would need a lot more mechanism to prevent the creation of objects outside of
the loop context, so we do not try to prevent this misuse.
Alternatives considered:
One option considered was including the name of the cursor type somewhere in the
syntax. Something like:
defining_identifier : subtype_mark in [reverse] iterator_expression
That was not done mainly because differs from the existing syntax for no clear
benefit. It would make the type of the identifier clearer, but it also would
mean that switching from a discrete iterator to a container one would require
some syntax changes that don't really fit into the model. (One solution to that
problem would be to optionally add the type name to the discrete for loop as
well, but that seems like a bunch of additional changes for little value.)
It should be noted that with this formulation, the existing discrete iterators
could be considered just a predefined instantiation of this magic generic for
discrete types. The analogy breaks down a bit for ranges (they don't look like a
function call), but it isn't hard to imagine such a unification.
!examples [NOT FULLY UPDATED -- SEE PROPOSAL FOR ADDITIONAL EXAMPLES]
This feature could easily be used in the existing Ada.Containers packages. We
give a suggested implementation for Ada.Containers.Doubly_Linked_Lists; the
other containers could be similar.
The start of the visible part of Ada.Containers.Doubly_Linked_Lists would be
modified as follows:
with Ada.Iterator_Interfaces;
generic
type Element_Type is private;
with function "=" (Left, Right : Element_Type) return Boolean is <>;
package Ada.Containers.Doubly_Linked_Lists is
pragma Preelaborate(Doubly_Linked_Lists);
pragma Remote_Types(Doubly_Linked_Lists);
package Cursors is
type Cursor is private;
pragma Preelaborable_Initialization(Cursor);
No_Element : constant Cursor;
private
... --
end Cursors;
subtype Cursor is Cursors.Cursor;
No_Element : Cursor renames Cursors.No_Element;
package Iterators is new
Ada.Iterator_Interfaces (Cursors, No_Element);
type List is new Iterators with private;
pragma Preelaborable_Initialization(List);
Empty_List : constant List;
...
Note that the changes here are the addition of the nested package Cursors,
the addition of the instantiation, and the subtype and renames to eliminate the
effect of the nested package. The majority of the package is unchanged.
After the existing procedure Reverse_Iterate, we would add a new function:
function Iterate (Container : in List) return List_Iterators.Reversible_Iterator'Class;
This would be described as
Iterate returns an object that can be used in a loop_statement to loop
with a loop parameter designating each node in Container, starting with the
first node and moving the cursor as per the Next function if the reserved word
reverse is not used in the user_loop_parameter_statement, and starting
with the last node and moving the cursor as per the Previous function otherwise.
Program_Error is propagated if any operation (in particular, the sequence_of_statements
of the loop_statement whose iterator_expression denotes this object) tampers with
the cursors of Container while the returned object exists.
This could be implemented by defining a controlled type in the private part of
Ada.Containers.Doubly_Linked_Lists:
type Our_List_Iterator is new Ada.Controlled.Limited_Controlled and
List_Iterators.Reversible_Iterator with record
Our_List : access List;
end record;
function First (Object : Our_List_Iterator) return Cursor;
function Next (Object : Our_List_Iterator; Position : Cursor) return Cursor;
function Last (Object : Our_List_Iterator) return Cursor;
function Previous (Object : Our_List_Iterator; Position : Cursor) return Cursor;
procedure Finalize (Object : in out Our_List_Iterator);
The implementation of these operations would be something like:
function Iterate (Container : in List) return List_Iterators.Reversible_Iterator'Class is
begin
return Iter:Our_List_Iterator do
Iter.Our_List := Container'Unchecked_Access;
--
--
--
--
end return;
end Iterate;
function First (Object : Our_List_Iterator) return Cursor is
begin
return Object.Our_List.First;
end First;
function Next (Object : Our_List_Iterator; Position : Cursor) return Cursor is
begin
return Next (Position);
end Next;
--
procedure Finalize (Object : in out Our_List_Iterator) is
begin
--
end Finalize;
Note that we didn't add these interfaces to the actual container types. That was
because we wanted these iterators to be a tampering context (thus ensuring that
they terminate and do not lead to erroneous execution). Users would be very
surprised if a "safe" construct like a for loop caused erroneous execution
(which will happen if the node designated by the loop parameter is deleted).
The function Iterate would normally be used directly in a loop:
for C in My_List.Iterate loop
... Element (C) ...
end loop;
We could also define a partial list iterator:
function Partial_Iterate (Container : in List;
Start : in Cursor;
Finish : in Cursor := No_Element)
return List_Iterators.Reversible_Iterator'Class;
where Start and Finish (if not No_Element) would have to be cursors for
Container. In this case, the iteration would run from Start to Finish (or
No_Element, if Finish is not on the list after/before Start) forward or in
reverse. (This would also be a tampering context as above.)
We would use Partial_Iterator directly in a loop:
for C in Partial_Iterate (My_List, Start, Finish) loop
... Element (C) ...
end loop;
!ACATS test
!appendix
From: Tucker Taft
Thoughts on Iterators, 8-Nov-2009, for ARG meeting in St. Pete
Relevant players in the game:
Container, Cursor, Iterator, Element.
-------------
for Element in [reverse] Container loop
... -- body of loop
end loop;
is equiv to:
declare
Iterator : Iterator_Type := Iterate(Container);
Cursor : Cursor_Type := First(Iterator);
begin
while Has_Element(Cursor) loop
declare
Element : Element_Type renames Container(Cursor);
begin
... -- body of loop
Cursor := Next(Iterator, Cursor);
end;
end loop;
end;
or...
declare
Iterator : Iterator_Type := Iterate(Container);
begin
while More(Iterator) loop
declare
Cursor : Cursor_Type := Get_Next(Iterator'Access);
Element : Element_Type renames Container(Cursor);
begin
... -- body of loop
end;
end loop;
end;
---------
Container(Cursor)
is equiv to:
Element_{Ref,CRef}(Container, Cursor).Data.all
for Cursor in Special_Iterator(Container) loop
Put(Container(Cursor));
end loop;
---------
for Element in Container loop
Put(Element);
end loop;
is equiv to:
for Cursor in Default_Iterator(Container) loop
declare
Element : Element_Type renames Container(Cursor);
begin
Put(Element);
end;
end loop;
similarly:
for Element in Array_Obj loop
Put(Element);
end loop;
is equiv to:
for I in Array_Obj'Range(1) loop
for J in Array_Obj'Range(2) loop
...
declare
Element : Array_Obj_Component_Type renames Array_Obj(I, J, ...);
begin
Put(Element);
end;
...
end loop;
end loop;
---------
package Iterators is new Ada.Iterator_Interfaces(Cursor_Type, No_Element);
type Iterator_Type is new Iterators.Reversible_Iterator with ...
package Containers is new
Ada.Container_Interfaces(Iterators, Iterator_Type, Element_Type, Element_Reference);
type Container_Type is new Containers.Container with ...
generic
with package Iterators is new Ada.Iterator_Interfaces(<>);
type Iterator_Type is new Iterators.Reversible_Iterator with private;
type Element_Type(<>) is limited private;
type Element_Reference(Data : access Element_Type) is limited private;
type Element_Constant_Reference(Data : access constant Element_Type) is limited private;
package Ada.Container_Interfaces is
type Container is limited interface;
function Default_Iterator(C : Container) return Iterator_Type is abstract;
function Element_Ref(C : aliased in out Container; Cursor : Iterators.Cursor)
return Element_Reference is abstract;
function Element_CRef(C : aliased in Container; Cursor : Iterators.Cursor)
return Element_Constant_Reference is abstract;
end Ada.Container_Interfaces;
----------
generic
type Cursor is private;
No_Element : in Cursor;
package Ada.Iterator_Interfaces is
type Basic_Iterator is limited interface;
function First (Object : Basic_Iterator) return Cursor;
function Next (Object : Basic_Iterator; Position : Cursor) return Cursor;
type Reversible_Iterator is limited interface and Basic_Iterator;
function Last (Object : Reversible_Iterator) return Cursor;
function Previous (Object : Reversible_Iterator; Position : Cursor) return Cursor;
end Ada.Iterator_Interfaces;
****************************************************************
From: Robert Dewar
Sent: Sunday, November 8, 2009 10:26 AM
Equivalent always makes me nervous. Are we sure we can make this strong a statement?
****************************************************************
From: Tucker Taft
Sent: Sunday, November 8, 2009 11:40 AM
This is more like the definition of "new T" when T has a user-defined storage pool.
It is equivalent to a call on Allocate because its semantics are defined as calling
Allocate.
For the proposed syntax for iterators and indexing, their semantics will be defined
in terms of calls on certain subprograms, etc.
****************************************************************
From: Robert Dewar
Sent: Sunday, December 8, 2009 1:14 PM
> This is more like the definition of "new T" when T has a user-defined
> storage pool. It is equivalent to a call on Allocate because its
> semantics are defined as calling Allocate.
I think it is fine to say "the behavior is as though xyz was called bla bla".
I just think we often get ourselves in trouble if the formal definition uses the
word equivalent, it's such a strong word from a formal point of view.
****************************************************************
From: Tucker Taft
Sent: Tuesday, December 8, 2009 5:01 PM
Here is a new variant for AI05-139 on user-defined iterators, etc. [This is
version /01 of this AI - ED.] It is based on all the earlier proposals,
and incorporates the notes on iterators I passed around at the last ARG meeting.
I removed the wording, and didn't update the discussion, so you will just
have to focus on the !proposal section, which includes some simple examples.
****************************************************************
From: Tucker Taft
Sent: Tuesday, February 2, 2010 10:09 PM
The "iterator" AI (AI-139-2) which has morphed into a more general "syntactic
sugar" AI has some very complex "magic" generics that need to be instantiated
"just so" to get the desired effect. I was wondering whether we might simplify
things by simply having just one magic generic (essentially Randy's original one
for iterators) plus a set of "magic" interfaces, i.e.:
generic
type Cursor is private;
No_Element : in Cursor;
package Ada.Iterator_Interfaces is
type Basic_Iterator is limited interface;
function First (Object : Basic_Iterator) return Cursor;
function Next (Object : Basic_Iterator; Position : Cursor) return Cursor;
type Reversible_Iterator is limited interface and Basic_Iterator;
function Last (Object : Reversible_Iterator) return Cursor;
function Previous (Object : Reversible_Iterator; Position : Cursor) return Cursor;
end Ada.Iterator_Interfaces;
package Ada.Magic is
-- (Hopefully we could come up with a better package name!)
type Reference is limited interface;
type Container is limited interface;
end Ada.Magic;
There would be no operations defined on the magic interfaces.
Defining a type descended from one of them would simply signal the intention of
using the corresponding syntactic sugar.
The one I am most enamored of these days is Reference, which given a type that
is a descendant of Reference, and which has one access discriminant would
implicitly dereference that access discriminant when the type is the result
type of a function call. This shorthand has come up several times, not just in
this AI, but also in our thinking about user-defined dereference, and I am also
seeing it might be useful in a reworking of the "subpool" AI.
There seem to be several places where you want to return a short-lived reference
to an object, but then get control again when the reference goes away, and you
don't want to have to write "Func(...).Data.all" to use the result, but merely
write "Func(...)", with the ".Data.all" being implicit.
Since this is syntactic sugar, I'm not uncomfortable with it being "activated"
somewhat specially. But it should be simple. If I know I can easily take any
type that has one access discriminant and give it this nice implicit dereference
capability by just adding Reference as one of its progenitors, that is very
light weight.
I envision something similar for the Container interface. Here we want to be
able to index into the container and end up with a Reference to an element using
the notation "My_Container_Obj(I)". Again, I would just like to add Container as
a progenitor to some container-ish type, and then use a specially-named function
(e.g. "Index" or "Element") that returns a result of a type descended from
Reference, and have "My_Container_Obj(I)" expand into, approximately,
"Index(My_Container_Obj, I).Data.all".
In both of these cases, actually instantiating generics, etc., to get this
syntactic sugar is complex, while just adding "Reference" or "Container" as a
progenitor and following some naming conventions might be very straightforward.
Comments?
****************************************************************
From: Edmond Schonberg
Sent: Tuesday, February 2, 2010 10:50 PM
This is certainly more appealing that the levels of generics in the current AI. Is the access discriminant part of the "magic" or is it a
user-visible record definition somewhere? I take it that it is the
actual for type Cursor? Should Cursor then be declared as an extension of this
magic Reference interface, or am I misunderstanding this altogether?
****************************************************************
From: Tucker Taft
Sent: Tuesday, February 2, 2010 11:01 PM
Cursor would be an index type for some object of a Container type, not the
result type. For example, the Element function would take a Cursor, and return
a Reference.
I didn't include a discussion of the iterator stuff per-se. I was just focusing
on the accessors/references and the container syntactic sugar, and wasn't
worrying about the iterator "for ... in/of ... loop" stuff which is where
cursors mostly come into the game.
****************************************************************
From: Tucker Taft
Sent: Tuesday, February 2, 2010 11:11 PM
I forgot to answer your question about the access discriminant. The idea is
that any type derived from Reference which has exactly one access discriminant
would get the implicit dereference syntactic sugar.
****************************************************************
From: Bob Duff
Sent: Wednesday, February 3, 2010 8:09 AM
> package Ada.Magic is
;-)
> Comments?
I think it's a good approach. If we're going to have magic, we might as well go
all the way to Harry Potter style.
Programmers will need a "cook book" style explanation.
****************************************************************
From: Tucker Taft
Sent: Wednesday, February 3, 2010 8:31 AM
Ada.Magic, what a coincidence! ;-)
By the way, the more I have thought about it, the more it seems handy to be able
easily to define a "wrapper" for an access value (i.e. a descendant of
Reference) which has that access value as an access discriminant designating the
object of interest, plus other hidden stuff that might need finalization.
We have talked about persistent containers of some sort.
Returning a Reference to an element of a persistent container makes it easy to
"pin" the page containing the element in memory as long as the Reference exists.
The AI-111 distinction between subpools and handles on subpools also becomes
more straightforward, where what I was calling a "handle" is just a Reference,
that keeps the designated subpool from disappearing while it exists. A Reference
has the properties I wanted for a handle, namely it is immutable (since you
can't change the value of an access discriminant), can require finalization
(e.g. to decrement a reference count), and would be implicitly dereferenced when
used in a subpool_specification in an allocator, eliminating the need to include
"implicit_dereference" in the subpool_specification syntax, since the automatic
dereference associated with References would be part of the syntax for "name"
itself.
Similarly the other things I discussed in AI-111 also fit nicely with this idea
of a Reference. In particular, what I called a "strong reference" needed to be
"split" apart into a static handle and an access value to be useful, but it is
easier to simply provide a function that takes a "strong reference" and returns
a Reference designating the particular object, with the "static handle" on the
associated subpool buried inside that Reference, ensuring that the subpool
persists as long as the access value to the object in the subpool is floating
around, while also providing implicit dereference to avoid the annoying need to
write "ref.Data.all" everywhere.
I also have a feeling the reachability checks can be removed from the compiler's
concern, by burying them all in operations related to References.
Of course, as Randy will remind me, the devil is in the details...
****************************************************************
From: Bob Duff
Sent: Wednesday, February 3, 2010 8:43 AM
Or, "The devil is in the wording". ;-)
But I do think you're on the right track, here.
****************************************************************
From: Randy Brukardt
Sent: Wednesday, February 3, 2010 1:00 PM
I wish I did, because the idea is appealing. But the idea of some magic
connecting two visibly unrelated types to a third ought to cause Mr. Private to
come out of retirement to complain. :-)
To be more specific: In the containers packages as currently proposed for
AI-142-4, Containers, Cursors, and Reference_Type are all unrelated private
types. Without magic, a call:
Container_Object.Reference(My_Cursor).Element.all := ...
creates a reference. The specs of the interesting things are:
type Cursor is private;
type Vector is tagged private;
type Reference_Type (Element : not null access Element_Type) is
tagged limited private;
function Reference (Container : aliased in out Vector; Position : in Cursor)
return Reference_Type;
We want the "magic" to allow this call to be written as:
Container_Object(My_Cursor) := ...
Doing so requires the relationship between the three types to be exposed in the
interface. There is no way to do that with a single magic interface unless we
are willing to destroy privacy.
One could imagine requiring a magic interface on two or all three of the types.
(Requiring an interface on Cursors would require making Cursors a tagged type.
And that would require defining Cursors in a nested package so all of the
container operations are not primitive on cursors [if they were, they would be
illegal]. That's probably too much change.
The constant case could be worse:
function Constant_Reference (Position : in Cursor) return Reference_Type;
Here, there is no visible connection between the types at all. Luckily, we
aren't suggesting that for Ada.Containers, but it seems reasonable for
user-defined containers. This is, after all, the culmination of the "implicit
container" model that Matt likes so much. Here, you are either going to deny the
use of nice syntax or require privacy breaking.
So I think we need the magic generic for the accessors (and honestly, I don't
see what the concern is, the users will almost never instantiate these things,
and when they do, they'll copy the containers implementation). The incantation
for enabling magic can be as complex as necessary in order for it to work.
The only case where this idea might work is the "Reference" one. In that case,
the interface is directly placed on the Reference_Type implementation, and that
probably is enough to describe some semantics of omitting the .all. IMHO, that's
the least interesting of the three, given that Ada allows you to omit the .all
in most contexts anyway. (I do see how you could use that to implement most of
AI-111, which has been my point all along vis-a-vis that AI: we don't need the
complex mechanisms so long as we have a way to get finalization right). It's
surely the one I could live without (and I have serious doubts that it can be
resolved without ambiguity).
****************************************************************
From: Randy Brukardt
Sent: Wednesday, February 3, 2010 1:05 PM
...
> package Ada.Magic is
> -- (Hopefully we could come up with a better package name!)
> type Reference is limited interface;
> type Container is limited interface;
> end Ada.Magic;
>
> There would be no operations defined on the magic interfaces.
> Defining a type descended from one of them would simply signal the
> intention of using the corresponding syntactic sugar.
Why would we want to use an interface for that purpose? The reason I used an
interface was to tie operations on unrelated types together. If you are going to
do that some other way, just say so. Interfaces require the type they are
applied to be a tagged type, and I don't see any semantic reason that this magic
needs to be restricted to tagged types.
The only advantage I can see is that we probably wouldn't want to use a pragma
to mark this (a pragma that changes the acceptable syntax seems to go far beyond
"good taste", however you define that), and other options seems pretty heavy.
But then again, this is a "heavy" feature, given the impact it has on reading
code -- it needs to be clearly marked.
****************************************************************
From: Tucker Taft
Sent: Wednesday, February 3, 2010 2:41 PM
...
> We want the "magic" to allow this call to be written as:
>
> Container_Object(My_Cursor) := ...
>
> Doing so requires the relationship between the three types to be
> exposed in the interface. There is no way to do that with a single
> magic interface unless we are willing to destroy privacy.
I don't follow your logic. Can you drag me through it?
> One could imagine requiring a magic interface on two or all three of
> the types. (Requiring an interface on Cursors would require making
> Cursors a tagged type. And that would require defining Cursors in a
> nested package so all of the container operations are not primitive on
> cursors [if they were, they would be illegal]. That's probably too much
> change.
I'm not recommending that.
> The constant case could be worse:
>
> function Constant_Reference (Position : in Cursor) return
> Reference_Type;
>
> Here, there is no visible connection between the types at all.
> Luckily, we aren't suggesting that for Ada.Containers, but it seems
> reasonable for user-defined containers. This is, after all, the
> culmination of the "implicit container" model that Matt likes so much.
> Here, you are either going to deny the use of nice syntax or require privacy
> breaking.
I don't understand where the privacy breaking comes into play.
I was suggesting that by declaring a container as a descendant of the
Magic.Container interface, you "activate" some of the syntactic sugar. I was
further suggesting that by declaring the Reference_Type as a descendant of
Magic.Reference, you "activate" the ".<acc_discrim>.all" syntactic sugar. I
don't see any privacy breaking here, so you should explain where you see it.
This idea of using an interface type as effectively a "marker" is used
successfully in other languages. You could use a pragma or aspect
clause/specification, but having the presence of a pragma or aspect clause
affect syntax seems in particularly bad taste. We already give some special
capabilities to types having Controlled or Storage_Pool as an ancestor, so this
feels like a modest step up from that.
> So I think we need the magic generic for the accessors (and honestly,
> I don't see what the concern is, the users will almost never
> instantiate these things, and when they do, they'll copy the
> containers implementation). The incantation for enabling magic can be
> as complex as necessary in order for it to work.
>
> The only case where this idea might work is the "Reference" one. In
> that case, the interface is directly placed on the Reference_Type
> implementation, and that probably is enough to describe some semantics of omitting the .all.
It's not just the ".all", it is omitting the ".Element.all", which is certainly
not currently allowed by Ada.
> IMHO, that's the least interesting of the three, given that Ada allows
> you to omit the .all in most contexts anyway. (I do see how you could
> use that to implement most of AI-111, which has been my point all
> along vis-a-vis that AI: we don't need the complex mechanisms so long
> as we have a way to get finalization right). It's surely the one I
> could live without (and I have serious doubts that it can be resolved without ambiguity).
Not surprisingly, I don't agree. Eliminating the need to write ".Element.all"
all over the place changes the dynamic significantly, and makes these References
actually usable and powerful.
****************************************************************
From: Randy Brukardt
Sent: Wednesday, February 3, 2010 3:59 PM
> > Doing so requires the relationship between the three types to be
> > exposed in the interface. There is no way to do that with a single
> > magic interface unless we are willing to destroy privacy.
>
> I don't follow your logic. Can you drag me through it?
I don't see how the fact that you've used the shorthand for a container
reference would lead the compiler to select the correct function to implement
that syntax. The only way I can think of is for the compiler to look inside the
representation of Reference_Type and see that it in fact references a Vector.
And even that is tenuous.
...
> I don't understand where the privacy breaking comes into play.
>
> I was suggesting that by declaring a container as a descendant of the
> Magic.Container interface, you "activate"
> some of the syntactic sugar.
This is the problem.
> I was further suggesting that
> by declaring the Reference_Type as a descendant of Magic.Reference,
> you "activate" the ".<acc_discrim>.all"
> syntactic sugar.
This is OK.
> I don't see any privacy breaking here, so you should explain where you
> see it.
I was thinking backwards, but so are you.
I presume that the effect of the Magic.Container interface would be to allow
writing the call:
Container_Object.<some_function>(params)
as
Container_Object(params)
(I'm presuming that your model is that the reference part is separate magic, so
these are separable magic items. That's good, IMHO).
But the problem here is that the function name is now gone. How can the compiler
decide what function is being called? The original interface idea made it clear:
it was named in the interface. But now you have left that out.
So you end up having to look at *every* primitive and class-wide function whose
first parameter has the right type (the latter because this is prefix notation,
and it ought to work like it). You can't do that by name (as you do for
everything else in Ada resolution), you have to pick up everything. That means a
wide symboltable search, through all of the appropriate packages.
And when does this apply? I have to think it has to apply to any function,
because you have nothing else to say. That means that you can replace:
Container_Object.Capacity
and
Container_Object.Length
by
Container_Object
That would appear to make Container_Object itself ambiguous. OK, so we could
insist on two parameters.
We can then replace:
Container_Object.To_Cursor (1)
by
Container_Object(1)
But that's going to be ambiguous with the intended use (result resolution should
bail us out most of the time, but not always).
And then the killer:
Container_Object.Element (1)
and
Container_Object.Reference (1).Element.all (assuming the use of
Magic.Reference on the latter)
would both be:
Container_Object(1)
And both have the same result type. So our new syntax is always ambiguous.
What's the point of that??
I suppose you could fix this by allowing the special "omission" syntax to apply
only to function returning a type with a Reference interface. But that seems
very limiting; the new Magic.Container interface becomes a "gazebo" (as Dmitry
put it on in a c.l.a. discussion); it only works in one particular case
specifically tied to the implementation of Ada.Containers. It would not be
usable in a different containers design.
For instance, if you willing to directly return an anonymous access, you
couldn't use this syntax. That seems insane.
So I think it is critical that the marker (currently a container interface)
declare the result type that is intended. It would be best if it could also
declare the function whose name can be omitted. The only sane ways to do that is
to use a magic generic interface, or mark using a pragma or new syntax (the
latter being too heavy). And, assuming we have one for iterators, I don't see
what the problem of having more magic generic interfaces. That is, you are
trying to solve a non-problem.
Note that using an interface to select the function is also less than ideal,
because it forces the subprogram to have a particular name, and only allows one
such function per container. Both of those are quite limiting as well. (This is
the real problem with interfaces: a single type can either have or not have a
particular interface. This is the same problem we have with stream attributes:
it makes perfect sense to have multiple hierarchies of attributes for different
purposes [say XML and binary streaming], but only one set can exist, and the
other has to be created completely manually without any composition. Sometimes
this is appropriate [finalization comes to mind], but it usually isn't
[shouldn't fence in your users]. At least if the interface is in a generic, you
can make multiple interfaces to cover different needs, and you can use renames
to get the intended names of your functions -- but it's all harder than
necessary.)
> This idea of using an interface type as effectively a "marker" is used
> successfully in other languages. You could use a pragma or aspect
> clause/specification, but having the presence of a pragma or aspect
> clause affect syntax seems in particularly bad taste.
> We already give some special capabilities to types having Controlled
> or Storage_Pool as an ancestor, so this feels like a modest step up
> from that.
If it was "modest", I'd agree. But the need to search for all primitive
functions that match some profile (rather than searching by name as I suspect
all Ada symboltables are currently set up) makes this a very expensive idea.
Plus it is very limited in utility *or*
Truly, the good thing that I see here is that you've proven that the interfaces
for the Reference part and the Accessor part are completely separate logically.
You had the pieces confused in the past, and that added to the complexity. But
the Accessor interface needs to have nothing at all to do with the Reference
interface, and indeed it is better if they are unrelated. (Presuming you believe
in the need for the Reference at all, I'm still dubious.)
...
> > The only case where this idea might work is the "Reference" one. In
> > that case, the interface is directly placed on the Reference_Type
> > implementation, and that probably is enough to describe some
> > semantics of omitting the .all.
>
> It's not just the ".all", it is omitting the ".Element.all", which is
> certainly not currently allowed by Ada.
That's true, but .Element is an access discriminant that is being used only
because the lack of some better way to get the right accessibility. I never
wanted the .Element part in the first place; the whole mechanism is about
getting rid of the .all and still having the right lifetime for the access
value. The .Element got introduced for the lifetime reasons only, and only to
avoid adding some other mechanism to get the right lifetime.
Anyone who uses access discriminants on purpose is either the most brilliant Ada
programmer possible or is delusional. You might in fact be the former, but I
don't think we need to invent language features just for you. :-)
For me, this is all about getting rid of the .all and having the right lifetime
if someone tries to abuse the reference. The existence of the access
discriminant is a (goofy) means to that end, and I'd prefer to never even think
it exists. And I can't imagine anyone doing this on purpose outside a container
library.
> > IMHO, that's the least interesting of the three, given that Ada
> > allows you to omit the .all in most contexts anyway. (I do see how
> > you could use that to implement most of AI-111, which has been my
> > point all along vis-a-vis that AI: we don't need the complex
> > mechanisms so long as we have a way to get finalization right). It's
> > surely the one I could live without (and I have serious doubts that
> > it can be resolved without ambiguity).
>
> Not surprisingly, I don't agree. Eliminating the need to write
> ".Element.all" all over the place changes the dynamic significantly,
> and makes these References actually usable and powerful.
Let's agree to disagree on this one. It doesn't matter anyway, I don't think
there is a problem with this particular idea, because no guessing/wide search is
required to figure out what is meant. I just don't think it works sanely for the
container part.
****************************************************************
From: Tucker Taft
Sent: Wednesday, February 3, 2010 4:26 PM
> But the problem here is that the function name is now gone. How can
> the compiler decide what function is being called? The original
> interface idea made it clear: it was named in the interface. But now
> you have left that out. ...
Ahh, I see your confusion. We (language designers) would still have to specify
the implicit function name that is used in the expansion of the syntactic sugar,
in the same way we choose a particular function name in any language-defined
package. I had suggested something like "Element" or "Index", but conceivably
we could define a new operator designator such as "()" for this purpose.
Suppose we choose "Gazebo" ;-). Then for a type descended from Magic.Container,
the syntax:
Container_Object(params)
would expand into
Container_Object.Gazebo(params)
And then if the result type were a Reference, it would expand further into
Container_Object.Gazebo(params).<accdim>.all
One possibility would be to use something quite explicit and unlikely to collide
with some existing function, such as "Container_Component" or
"Container_Element."
****************************************************************
From: Randy Brukardt
Sent: Wednesday, February 3, 2010 4:50 PM
> One possibility would be to use something quite explicit and unlikely
> to collide with some existing function, such as "Container_Component"
> or "Container_Element."
Humm, if we decided on the operator designator "()" as you put it, note that we
no longer need the interface at all. The use of "()" provides the syntactic
marker needed.
That is, if we have:
function "()" (Container : aliased in out Vector; Position : in Cursor)
return Reference_Type renames Reference;
we know we are allowing
Container_Object (My_Cursor) to stand for
Container_Object."()" (My_Cursor) or, if you prefer
Container_Object.Reference (My_Cursor)
And even better, this isn't tied to particular return types or names; you can
have as many as you want so long as the results are unambiguous. One would
probably have to require that the first parameter be one that is allowed in a
prefix notation call (otherwise, you couldn't use the special syntax, so what
would be the point).
This avoid having to come up with a magic name, as well. (OK, "()" is the magic
name, but it is completely magic, just as all other operator symbols are. Seems
right to me.)
****************************************************************
From: Tucker Taft
Sent: Wednesday, February 10, 2010 4:39 PM
Here is an update to AI-139-2, [This is version /03 - Editor] which proposes
syntactic "sugar" for using iterators, accessors, and containers. This new
revision grew out of a recent phone discussion. The major changes involve
providing a single "magic" Reference interface instead of a generic package, and
an "()" operator to support indexing into a Container as though it were an
array.
I'm not personally convinced the "default iterator"
stuff is worth the added complexity. It works easily for arrays, but it may be
overkill to try to get it to work for containers as well.
As usual, any and all comments welcome.
****************************************************************
Questions? Ask the ACAA Technical Agent