Version 1.13 of ais/ai-20302.txt
!standard A.18 (00) 04-11-01 AI95-00302-03/08
!standard A.18.1 (00)
!standard A.18.2 (00)
!standard A.18.3 (00)
!standard A.18.4 (00)
!standard A.18.5 (00)
!standard A.18.6 (00)
!standard A.18.7 (00)
!standard A.18.8 (00)
!standard A.18.9 (00)
!standard A.18.10 (00)
!standard A.18.11 (00)
!standard A.18.12 (00)
!standard A.18.13 (00)
!standard A.18.14 (00)
!standard A.18.15 (00)
!standard A.18.16 (00)
!standard A.4.9 (00)
!class amendment 04-01-14
!status work item 04-01-14
!status received 04-01-14
!priority Medium
!difficulty Hard
!subject Container library
!summary
The following API describes a standard container library for Ada. The library
comprises sequence containers (vectors and lists), for inserting elements at
specified positions, and associative containers (sets and maps), which position
elements in order by key. The library is general, flexible, and efficient, and
its design has been guided by the philosophy that a library should stay out of
the programmer's way.
!problem
If is often the case that the solution to a programming problem requires
a collection of elements. The Ada language of course provides a
built-in array type, but typical problems often require a data structure
with better time behavior or more sophisticated semantics. Even if you
can get by with using an array, any non-trivial array manipulation
quickly becomes unwieldy, and hence error-prone.
An array is also not general enough in that it only provides a mapping
from a discrete index to the element type. The developer typically
needs a container that can map an element to any arbitrary type. The
key type often needs to be a string, but of course this cannot be used
as an array index subtype.
With no general-purpose standard container library, a developer is left
to craft his own container type. The data structure invariably chosen
is some kind of linked-list, because it's relatively simple. However,
manipulation of raw access types tends to be error-prone, and is a
source of memory leaks. A linked-list also does not perform well as a
general-purpose set or map when the number of elements is large.
One argument for having a standard library is that it's nice to be able
to use a language right out-of-the-box. If a developer has to leave the
language to solve common problems then perhaps that is a sign the
language doesn't provide enough support for the developer. Other
languages with which Ada competes, such as C++ or Java, come with rich
standard container libraries.
Many existing container libraries are either badly designed or difficult
to use. One issue is that it should be simple and easy to instantiate a
component. If it requires more than a single instantiation to create a
useable container, then it's too much pain and few developers will bother.
If he does bother his experience will be unpleasant. Programming is
work, but it should be fun work, not drudgery.
Libraries often have the problem that they are not sufficiently general,
or simply not intended as industrial-strength libraries. The library
designer can't possibly predict all the myriad uses to which his library
will be put, and so the library must be general and flexible enough to
suit unimagined uses.
Another problem is that these libraries often don't scale well to large
numbers of elements. Their behavior is neither predictable nor is it
even specified. They are often both time and space inefficient for
various other reasons. In real-time programs especially, it's important
to have predictable run-time behavior. An industrial-strength,
general-purpose standard container library must provide containers that
perform better than linear time complexity. Searches in particular need
to be fast.
!proposal
This library API proposal is modeled on the Standard Template Library (STL), an
algorithms and data structure library popularized by Alexander Stepanov, and
included in the C++ standard library.
We can broadly characterize containers as being either "sequence
containers" or "associative containers." All sequence containers allow
insertion and deletion at any position in the container, but each one
optimizes differently for insertions at certain positions. Associative
containers associate elements with a key, which defines how elements are
ordered in the container.
A vector is a sequence container optimized for insertion at the back
end. It of course allows insertion at any position, but as you move
toward the front the cost of vector insertion approaches linear time
complexity. A vector supports constant-time random access of elements.
A linked-list is a sequence container that provides constant-time
insertion and deletion at any position, but does not provide random
access. The list container is doubly-linked, having nodes with next and
previous pointers, and thus supports both forward and reverse iteration.
All of the containers specify positions using a cursor,
which is similar to an access type in that it designates an element.
Selector functions and procedures allow access to the element. Many cursor
operations are common to all containers. Containers also support operations
specific to that container; for instance, vectors also specify positions via an
integer index subtype.
The associative containers order their elements by key. A map has an
explicit key type, but with a set the key is implicit in the element
itself.
The hashed map associative containers scatter keys in the container according
to a hash function. The size of the hash table is automatically increased when
the length of the container equals its capacity (which specifies the length
before which it is guaranteed that no automatic hash table resizing occurs),
thus preserving constant time complexity for insertions.
The set associative container maintains elements in sort order. Insertions
and searches have O(log N) time complexity even in the worst case.
All of the containers have alternate forms that accept an element type that is
indefinite. The indefinite hashed maps also accept an indefinite key type,
allowing (for example) type String to be used as the generic actual key type.
There are also library-level subprograms for returning the hash value of
strings, and generic procedures for sorting an array.
The design of these packages follow the principle that they can be implemented
in Ada without any special implementation-specific magic. This is important,
as we do not want to burden implementers with special purpose requirements.
We also want to follow the principle that any capability used for the
implementation of the language is made available to users.
This API is based on an existing container library called Charles. The
source code itself and a couple of papers about the design of Charles
can be found at:
<http://home.earthlink.net/~matthewjheaney/charles/index.html>
Following the API proper, this proposal concludes with an examples
section that illustrates the kinds of problems the library solves, and
the most effective way to use the library to solve them.
!wording
Add Ada.Strings.Hash and Ada.Strings.Unbounded.Hash to A.4.7(29).
[This gives us Wide_ versions of these functions.]
A.4.9 String Hashing
Static Semantics
The following library-level subprograms are defined:
with Ada.Containers;
function Ada.Strings.Hash (Key : String) return Containers.Hash_Type;
pragma Pure (Ada.Strings.Hash);
Return an implementation-defined value which is a function of the value of Key.
If A and B are strings such that A equals B, Hash(A) equals Hash(B).
with Ada.Containers;
function Ada.Strings.Unbounded.Hash (Key : Unbounded_String) return
Containers.Hash_Type;
pragma Preelaborate (Ada.Strings.Unbounded.Hash);
Return an implementation-defined value which is a function of the value of Key.
If A and B are unbounded strings such that A equals B, Hash(A) equals Hash(B).
Implementation Advice
The various Hash functions should be good hash functions, returning
a wide spread of values for different string values. It should be unlikely
for similar strings to return the same value.
A.18 Containers
This clause presents the specifications of the package Containers and
several child packages, which provide facilities for storing collections
of elements.
A variety of sequence and associative containers are provided. Each container
includes a cursor type. A cursor is a reference to an element within a
container. Many operations on cursors are common to all of the containers.
Within this clause we provide Implementation Advice for the desired
average or worst case time complexity of certain operations on a
container. This advice is expressed using a big-O notation.
A complexity of O(f(N)), presuming f is some function of a
length parameter N and t(N) is the time the operation takes
(on average or worst case, as specified) for the length N,
means that there exists a finite A such that for any N, t(N)/f(N) < A.
AARM Note: Of course, an implementation can do better than a specified
O(f(N)): for example, O(1) meets the requirements for O(log N).
If the advice suggests that the complexity should be less than
O(f(N)), then for any arbitrarily small positive real D, there
should exist a positive integer M such that for all N > M,
t(N)/f(N) < D.
AARM Text
Language Design Principles
This clause provides a number of useful containers for Ada. Only the most
useful containers are provided. Ones that are relatively easy to code,
redundant, or rarely used are omitted from this set, even if they are
generally included in containers libraries.
The containers packages are modeled on the Standard Template Library (STL), an
algorithms and data structure library popularized by Alexander Stepanov, and
included in the C++ standard library. The structure and terminology differ from
the STL where that better maps to common Ada usage. For instance, what the STL
calls "iterators" are called "cursors" here.
The following major non-limited containers are provided:
* (Expandable) Vectors of any non-limited type;
* Doubly-linked Lists of any non-limited type;
* Hashed Maps keyed by any non-limited type containing any
non-limited type;
* Ordered Sets of any non-limited type.
Separate versions for definite and indefinite element types are provided, as
those for definite types can be implemented more efficiently.
Each container includes a cursor, which is a reference to an element within
a container. Cursors generally remain valid as long as the container exists and
the element referenced is not deleted. Many operations on cursors are common to
all of the containers. This makes it possible to write generic algorithms that
work on any kind of container.
The containers packages are structured so that additional packages can be
added in the future. Indeed, we hope that these packages provide the basis
for a more extensive secondary standard for containers.
If containers with similar functionality (but different performance
characteristics) are provided, we suggest that a prefix be used to identify
the class of the functionality: "Ada.Containers.Bounded_Sets" (for a set
with a maximum number of elements); "Ada.Containers.Protected_Maps" (for a
map which can be accessed by multiple tasks at one time);
"Ada.Containers.Persistent_Vectors" (for a persistent vector which continues
to exist between executions of a program) and so on.
Note that the language already includes several requirements that are
important to the use of containers. First, library packages must be
reentrant - multiple tasks can use the packages as long as they operate on
separate containers. Thus, it is only necessary for a user to protect a
container if a single container needs to be used by multiple tasks.
Second, the language requires that language-defined types stream "properly".
That means that the stream attributes can be used to implement persistence
of containers when necessary, and containers can be passed between
partitions of a program.
Finally, the language requires that equality of language-defined types
compose "properly". This means that the version of "=" directly used by
users is the same one that will be used in generics and in predefined equality
operators of types with components of the containers and/or cursors. This
prevents the abstraction from breaking unexpectedly.
If a container's element type is controlled, the point at which the element
is finalized will depend on the implementation of the container. We do not
specify precisely where this will happen (it will happen no later than the
finalization of the container, of course) in order to give implementation's
flexibility to cache, block, or split the nodes of the container. In particular,
Delete does not necessarily finalize the element; the implementation may (or
may not) hold the space for reuse. (The reference implementations show this
well, as Delete for a Vector does not finalize the element, while Delete for
an Ordered_Set does.)
This is not likely to be a hardship, as the element type has to be non-limited.
Types used to manage scarce resources generally need to be limited. Otherwise,
the amount of resources needed is hard to control, as the language allows
a lot of variation in the number or order of adjusts/finalizations. For
common uses of non-limited controlled types such as managing storage, the
types already have to manage arbitrary copies.
The use of controlled type also brings up the possibility of failure of
finalization (and thus deallocation) of an element. This is a "serious bug",
as AI-179 puts it, so we don't try to specify what happens in that case.
The implementation should propagate the exception.
It is expected that exceptions propagated from these operations do not
damage containers. That is, if Storage_Error is propagated because of an
allocation failure, or Constraint_Error is propagated by the assignment of
elements, the container can continue to be used without further exceptions.
The intent is that it should be possible to recover from errors without
losing data. We don't try to state this formally in most cases, because it is
hard to define precisely what is and is not allowed behavior.
When this clause says that the behavior of something is unspecified, we
really mean that any result of executing Ada code short of erroneous
execution is allowed. We do not mean that memory not belonging to the
parameters of the operation can be trashed. When we mean to allow erroneous
behavior, we specifically say that execution is erroneous. All this means
if the containers are written in Ada is that checks should not be suppressed
or removed assuming some behavior of other code, and that the implementation
should take care to avoid creating internal dangling accesses by assuming
behavior from generic formals that can't be guaranteed. We don't
try to say this normatively because it would be fairly complex, and
implementers are unlikely to increase their support costs by fielding
implementations that are unstable if given buggy hash functions, et. al.
End AARM Text
A.18.1 The Package Containers
The package Containers is the root of the containers subsystem.
Static Semantics
The library package Containers has the following declaration:
package Ada.Containers is
pragma Pure;
type Hash_Type is mod <Implementation-Defined>;
type Count_Type is range 0 .. <implementation-defined>;
end Ada.Containers;
Hash_Type represents the range of the result of a hash function. Count_Type
represents the (potential or actual) number of elements of a container.
Implementation Advice
Hash_Type'Modulus should be at least 2**32. Count_Type'Last should be at least
2**31-1.
AARM Note: This is not a requirement so that these types can be declared
properly on machines with native sizes that are not 32 bits. For instance, a
24-bit target could use 2**24 for Hash_Type'Modulus.
A.18.2 The Package Containers.Vectors
The language-defined package Containers.Vectors provides a private type Vector
and a set of operations. A vector container allows insertion and deletion at
any position, but it is specifically optimized for insertion and deletion at
the high end (the end with the higher index) of the container. A vector
container also provides random access to its elements.
A vector container object manages an unconstrained internal array, which
expands as necessary as items are inserted. The length of a vector is
the number of elements that the vector contains. The capacity of a vector
corresponds to the number of elements that can be stored in the internal array,
and will always be no less than the length of the vector.
A vector container may contain empty elements. Empty elements do not have a
specified value.
AARM Notes:
Vectors are not intended to be sparse (that is, there are elements at all
defined positions). Users are expected to use other containers (like a Map)
when they need sparse structures (there is a Note to this effect at the
end of this subclause).
The internal array is a conceptual model of a vector. There is no requirement
for an implementation to be a single contiguous array.
End AARM Notes.
Static Semantics
The library package Containers.Vectors has the following declaration:
generic
type Index_Type is range <>;
type Element_Type is private;
with function "=" (Left, Right : Element_Type)
return Boolean is <>;
package Ada.Containers.Vectors is
pragma Preelaborate (Vectors);
subtype Extended_Index is
Index_Type'Base range Index_Type'First-1 .. Index_Type'Last +
Boolean'Pos (Index_Type'Base'Last > Index_Type'Last);
No_Index : constant Extended_Index := Extended_Index'First;
subtype Index_Subtype is Index_Type;
type Vector is tagged private;
type Cursor is private;
Empty_Vector : constant Vector;
No_Element : constant Cursor;
function To_Vector (Length : Count_Type) return Vector;
function To_Vector
(New_Item : Element_Type;
Length : Count_Type) return Vector;
function "&" (Left, Right : Vector) return Vector;
function "&" (Left : Vector;
Right : Element_Type) return Vector;
function "&" (Left : Element_Type;
Right : Vector) return Vector;
function "&" (Left, Right : Element_Type) return Vector;
function "=" (Left, Right : Vector) return Boolean;
function Capacity (Container : Vector) return Count_Type;
procedure Reserve_Capacity (Container : in out Vector;
Capacity : in Count_Type);
function Length (Container : Vector) return Count_Type;
function Is_Empty (Container : Vector) return Boolean;
procedure Clear (Container : in out Vector);
function To_Cursor (Container : Vector;
Index : Extended_Index) return Cursor;
function To_Index (Position : Cursor) return Extended_Index;
function Element (Container : Vector;
Index : Index_Type)
return Element_Type;
function Element (Position : Cursor) return Element_Type;
procedure Query_Element
(Container : in Vector;
Index : in Index_Type;
Process : not null access procedure (Element : in Element_Type));
procedure Query_Element
(Position : in Cursor;
Process : not null access procedure (Element : in Element_Type));
procedure Update_Element
(Container : in Vector;
Index : in Index_Type;
Process : not null access procedure (Element : in out Element_Type));
procedure Update_Element
(Position : in Cursor;
Process : not null access procedure (Element : in out Element_Type));
procedure Replace_Element (Container : in Vector;
Index : in Index_Type;
By : in Element_Type);
procedure Replace_Element (Position : in Cursor;
By : in Element_Type);
procedure Assign (Target : in out Vector;
Source : in Vector);
procedure Move (Target : in out Vector;
Source : in out Vector);
procedure Insert (Container : in out Vector;
Before : in Extended_Index;
New_Item : in Vector);
procedure Insert (Container : in out Vector;
Before : in Cursor;
New_Item : in Vector);
procedure Insert (Container : in out Vector;
Before : in Cursor;
New_Item : in Vector;
Position : out Cursor);
procedure Insert (Container : in out Vector;
Before : in Extended_Index;
New_Item : in Element_Type;
Count : in Count_Type := 1);
procedure Insert (Container : in out Vector;
Before : in Cursor;
New_Item : in Element_Type;
Count : in Count_Type := 1);
procedure Insert (Container : in out Vector;
Before : in Cursor;
New_Item : in Element_Type;
Position : out Cursor;
Count : in Count_Type := 1);
procedure Prepend (Container : in out Vector;
New_Item : in Vector);
procedure Prepend (Container : in out Vector;
New_Item : in Element_Type;
Count : in Count_Type := 1);
procedure Append (Container : in out Vector;
New_Item : in Vector);
procedure Append (Container : in out Vector;
New_Item : in Element_Type;
Count : in Count_Type := 1);
procedure Insert_Space (Container : in out Vector;
Before : in Extended_Index;
Count : in Count_Type := 1);
procedure Insert_Space (Container : in out Vector;
Before : in Cursor;
Position : out Cursor;
Count : in Count_Type := 1);
procedure Set_Length (Container : in out Vector;
Length : in Count_Type);
procedure Delete (Container : in out Vector;
Index : in Extended_Index;
Count : in Count_Type := 1);
procedure Delete (Container : in out Vector;
Position : in out Cursor;
Count : in Count_Type := 1);
procedure Delete_First (Container : in out Vector;
Count : in Count_Type := 1);
procedure Delete_Last (Container : in out Vector;
Count : in Count_Type := 1);
function First_Index (Container : Vector) return Index_Type;
function First (Container : Vector) return Cursor;
function First_Element (Container : Vector)
return Element_Type;
function Last_Index (Container : Vector) return Extended_Index;
function Last (Container : Vector) return Cursor;
function Last_Element (Container : Vector)
return Element_Type;
procedure Swap (Container : in Vector;
I, J : in Index_Type);
procedure Swap (I, J : in Cursor);
generic
with function "<" (Left, Right : Element_Type)
return Boolean is <>;
procedure Generic_Sort (Container : in Vector);
function Find_Index (Container : Vector;
Item : Element_Type;
Index : Index_Type := Index_Type'First)
return Extended_Index;
function Find (Container : Vector;
Item : Element_Type;
Position : Cursor := No_Element)
return Cursor;
function Reverse_Find_Index (Container : Vector;
Item : Element_Type;
Index : Index_Type := Index_Type'Last)
return Extended_Index;
function Reverse_Find (Container : Vector;
Item : Element_Type;
Position : Cursor := No_Element)
return Cursor;
function Contains (Container : Vector;
Item : Element_Type) return Boolean;
function Next (Position : Cursor) return Cursor;
function Previous (Position : Cursor) return Cursor;
procedure Next (Position : in out Cursor);
procedure Previous (Position : in out Cursor);
function Has_Element (Position : Cursor) return Boolean;
procedure Iterate
(Container : in Vector;
Process : not null access procedure (Position : in Cursor));
procedure Reverse_Iterate
(Container : in Vector;
Process : not null access procedure (Position : in Cursor));
private
... --
end Ada.Containers.Vectors;
The type Vector needs finalization (see 7.6).
Empty_Vector represents the empty vector container. It has a length of 0. If an
object of type Vector is not otherwise initialized, it will be initialized to
the same value as Empty_Vector.
No_Element represents a cursor that designates no element. If an object of type
Cursor is not otherwise initialized, it will be initialized to the same
value as No_Element.
No_Index represents a position that does not correspond to any element.
The subtype Extended_Index includes the indices covered by Index_Type
plus the value No_Index and, if it exists, the successor to the
Index_Type'Last.
AARM Note: We require the existence of Index_Type'First - 1, so that No_Index
and Last_Index of an empty vector is well-defined. We don't require the
existence of Index_Type'Last + 1, as it is only used as the position of
insertions (and needs to be allowed only when inserting an empty vector).
function To_Vector (Length : Count_Type) return Vector;
Returns a vector with a length of Length. The vector is filled with empty
elements.
function To_Vector
(New_Item : Element_Type;
Length : Count_Type) return Vector;
Returns a vector with a length of Length, and elements initialized to the
value New_Item.
function "&" (Left, Right : Vector) return Vector;
Returns a vector comprising the elements of Right appended in their
original order to the vector Left.
function "&" (Left : Vector;
Right : Element_Type) return Vector;
Returns a vector comprising the element Right appended to the vector Left.
function "&" (Left : Element_Type;
Right : Vector) return Vector;
Returns a vector comprising the element Left prepended to the vector Right.
function "&" (Left, Right : Element_Type) return Vector;
Returns a vector comprising the element Left followed by the element Right.
function "=" (Left, Right : Vector) return Boolean;
If Left denotes the same object as Right, then this function returns
True. Otherwise if Left and Right have different lengths, then this
function returns False. Otherwise, it compares each element in Left to
the corresponding element in Right using the generic formal equality
operator. If element equality returns False, then this function returns
False; otherwise, this function returns True. Any exception raised
during evaluation of element equality is propagated.
function Capacity (Container : Vector) return Count_Type;
Returns the length of the internal array.
procedure Reserve_Capacity (Container : in out Vector;
Capacity : in Count_Type);
Reserve_Capacity sets the capacity of Container to a value which is at least the
value Capacity and which is greater than the length of Container, expanding (or
contracting) the internal array to hold at least Capacity elements. Expansion
or contraction will require allocation, and possibly copying and deallocation
of elements. Any exceptions raised by these operations are propagated, leaving
the container with at least the original length and elements.
AARM Notes
Expanding the internal array can be done by allocating a new, longer array,
copying the elements, and deallocating the original array. This may raise
Storage_Error, or cause an exception from a controlled subprogram. We require
that a failed Reserve_Capacity does not lose any elements if an exception occurs,
but we do not require a specific order of evaluations or copying.
This routine is used to preallocate the internal array to the specified
capacity such that future Inserts do not require memory allocation overhead.
Therefore, the implementation should allocate the needed memory to make that
true at this point, even though the visible semantics could be preserved by
waiting until the memory is needed. This doesn't apply to the indefinite
element container, because elements will have to be allocated individually.
The implementation does not have to contract the internal array if the
capacity is reduced, as any capacity greater than or equal to the specified
capacity is allowed.
End AARM Notes
function Length (Container : Vector) return Count_Type;
The Length function returns the number of active elements in Container.
function Is_Empty (Container : Vector) return Boolean;
Equivalent to Length (Container) = 0.
procedure Clear (Container : in out Vector);
Sets the length of Vector to 0. Its capacity does not change.
function To_Cursor (Container : Vector;
Index : Extended_Index) return Cursor;
If the value of Index is not in the range First_Index (Container) .. Last_Index
(Container), then No_Element is returned. Otherwise, a cursor designating the
element currently at Index in Container is returned.
function To_Index (Position : Cursor) return Extended_Index;
If Position is No_Element, No_Index is returned. Otherwise, the
index (within its containing vector) of the element designated by Cursor is
returned.
AARM Note: This implies that the index is determinable from a bare cursor
alone. The basic model is that a vector cursor is implemented as a record
containing an access to the vector container and an index value. This does
constrain implementations, but it also allows all of the cursor operations
to be defined in terms of the corresponding index operation (which should be
primary for a vector).
function Element (Container : Vector;
Index : Index_Type)
return Element_Type;
If Index is not in the range First_Index (Container) .. Last_Index (Container),
then Constraint_Error is propagated. Otherwise this function returns the
element at position Index.
function Element (Position : Cursor) return Element_Type;
Returns the element designated by Position. If Position equals No_Element,
then Constraint_Error is propagated.
procedure Query_Element
(Container : in Vector;
Index : in Index_Type;
Process : not null access procedure (Element : in Element_Type));
If Index is not in the range First_Index (Container) .. Last_Index (Container),
then Constraint_Error is propagated. Otherwise, it calls Process.all with the
element at position Index as the parameter. Any exception raised by Process
is propagated.
procedure Query_Element
(Position : in Cursor;
Process : not null access procedure (Element : in Element_Type));
Calls Process.all with the element designated by Position as the parameter. If
Position equals No_Element, then Constraint_Error is propagated. Any exceptions
raised by Process are propagated.
procedure Update_Element
(Container : in Vector;
Index : in Index_Type;
Process : not null access procedure (Element : in out Element_Type));
If Index is not in the range First_Index (Container) .. Last_Index (Container),
then Constraint_Error is propagated. Otherwise, it calls Process.all with the
element at position Index as the parameter. Any exception raised by Process
is propagated.
If Element_Type is unconstrained and definite, then the Element parameter
of Process.all shall be unconstrained.
AARM Note: This means that the elements cannot be directly allocated from the
heap (nor aliased unless AI-363 is included in the Amendment); it must be
possible to change the discriminants of the element in place.
The element at position Index is not an empty element after successful
completion of this operation.
AARM Note: Since reading an empty element is a bounded error, attempting to
use this procedure to replace empty elements may fail. Use Replace_Element
to do that reliably.
procedure Update_Element
(Position : in Cursor;
Process : not null access procedure (Element : in out Element_Type));
Calls Process.all with the element designated by Position as the parameter. If
Position equals No_Element, then Constraint_Error is propagated. Any exceptions
raised by Process are propagated.
If Element_Type is unconstrained and definite, then the Element parameter
of Process.all shall be unconstrained.
The element designated by Position is not an empty element after successful
completion of this operation.
procedure Replace_Element (Container : in Vector;
Index : in Index_Type;
By : in Element_Type);
If Index does not specify a value in the range First_Index (Container) ..
Last_Index (Container), then Constraint_Error is propagated. Otherwise this
procedure assigns the value By to the element at position Index. Any exceptions
raised during the assignment are propagated. The element at position Index is
not an empty element after successful completion of this operation.
procedure Replace_Element (Position : in Cursor;
By : in Element_Type);
This procedure assigns the value By to the element designated by Position.
If Position equals No_Element, then Constraint_Error is propagated.
Any exception raised during the assignment is propagated. The element
designated by Position is not an empty element after successful completion of
this operation.
AARM Note: Replace_Element and Update_Element are the only ways that an element
can change from empty to non-empty.
procedure Assign (Target : in out Vector;
Source : in Vector);
If Target denotes the same object as Source, then the operation has no
effect. Otherwise, Assign first calls Clear (Target), then
Reserve_Capacity (Target, Length (Source)). It then assigns the active elements
of Source to the corresponding positions in Target, and then sets the length
of Target to the length of Source. Any exception raised during element
assignment is propagated.
procedure Move (Target : in out Vector;
Source : in out Vector);
If Target denotes the same object as Source, then the operation has no
effect. If Length (Target) is greater than 0, then it raises Constraint_Error.
Otherwise, the internal array is removed from Source (making its capacity 0)
and moved to Target (making its capacity the original capacity of Source). The
length of Target is set to the length of Source, and the length of Source is
set to 0.
AARM Note:
If Capacity (Target) /= 0, the previous internal array may need to be
deallocated. We don't mention this explicitly, because it is covered by the
"no memory loss" Implementation Requirement. If the deallocation raises an
exception, the implementation should leave both Vectors in a consistent state
(usually empty for Target and the original state for Source).
procedure Insert (Container : in out Vector;
Before : in Extended_Index;
New_Item : in Vector);
If Length(New_Item) is 0, then Insert does nothing. Otherwise, it calculates
the new length NL as the sum of the current length and Length (New_Item); if
the value of Last appropriate for length NL would be greater than
Index_Type'Last then Constraint_Error is propagated. If the value of Before is
not in the range First_Index (Container) .. Last_Index (Container) + 1, then
Constraint_Error is propagated.
If the current vector capacity is less than or equal to the new length,
Reserve_Capacity (Container, NL) is called to increase the vector capacity. Then
Insert slides the elements in the range Before .. Last_Index (Container) up by
Length(New_Item) positions, and then copies the elements of New_Item to the
positions starting at Before. Any exception raised during the copying is
propagated.
AARM Note:
Moving the elements does not necessarily involve copying. Similarly, since
Reserve_Capacity does not require the copying of elements, it does not need to be
explicitly called (the implementation can combine the operations if it wishes
to). [Note to reviewers: I didn't want to duplicate the messy wording and notes
about exceptions not losing anything.]
End AARM Note.
procedure Insert (Container : in out Vector;
Before : in Cursor;
New_Item : in Vector);
If Before is not No_Element, and does not designate an element in Container,
then Program_Error is propagated.
If Length(New_Item) is 0, then Insert does nothing.
If Before is No_Element, then is equivalent to Insert (Container,
Last_Index (Container) + 1), New_Item); otherwise is
equivalent to Insert (Container, To_Index (Before), New_Item);
AARM Note: The check on Before checks that the cursor does not belong to some
other Container. This check implies that a reference to the container is
included in the cursor value. This wording is not meant to require detection of
dangling cursors; such cursors are defined to be invalid, which means that
execution is erroneous, and any result is allowed (including not raising an
exception).
procedure Insert (Container : in out Vector;
Before : in Cursor;
New_Item : in Vector;
Position : out Cursor);
If Before is not No_Element, and does not designate an element in Container,
then Program_Error is propagated.
If Length(New_Item) is 0, then Insert sets Position to Before. Otherwise,
create a temporary (call it Temp_Index) and set it to Last_Index (Container) + 1
if Before equals No_Element, and To_Index (Before) otherwise. Then
Insert (Container, Before, New_Item) is called, and finally
Position is set to To_Cursor (Container, Temp_Index).
AARM Note: The messy wording because Before is invalidated by Insert, and we
don't want Position to be invalid after this call. An implementation probably
only needs to copy Before to Position.
procedure Insert (Container : in out Vector;
Before : in Extended_Index;
New_Item : in Element_Type;
Count : in Count_Type := 1);
Equivalent to Insert (Container, Before, To_Vector (New_Item, Count));
procedure Insert (Container : in out Vector;
Before : in Cursor;
New_Item : in Element_Type;
Count : in Count_Type := 1);
Equivalent to Insert (Container, Before, To_Vector (New_Item, Count));
procedure Insert (Container : in out Vector;
Before : in Cursor;
New_Item : in Element_Type;
Position : out Cursor;
Count : in Count_Type := 1);
Equivalent to Insert (Container, Before, To_Vector (New_Item, Count), Position);
procedure Prepend (Container : in out Vector;
New_Item : in Vector;
Count : in Count_Type := 1);
Equivalent to Insert (Container, Index_Type'First, New_Item).
procedure Prepend (Container : in out Vector;
New_Item : in Element_Type;
Count : in Count_Type := 1);
Equivalent to Insert (Container, Index_Type'First, New_Item, Count).
procedure Append (Container : in out Vector;
New_Item : in Vector);
Equivalent to Insert (Container, Last_Index (Container) + 1, New_Item).
procedure Append (Container : in out Vector;
New_Item : in Element_Type;
Count : in Count_Type := 1);
Equivalent to Insert (Container, Last_Index (Container) + 1, New_Item, Count).
procedure Insert_Space (Container : in out Vector;
Before : in Extended_Index;
Count : in Count_Type := 1);
Equivalent to Insert (Container, Before, New_Item, Count), with the
difference that the inserted elements are empty elements.
procedure Insert_Space (Container : in out Vector;
Before : in Cursor;
Position : out Cursor;
Count : in Count_Type := 1);
If Before is not No_Element, and does not designate an element in Container,
then Program_Error is propagated.
If Length(New_Item) is 0, then Insert does nothing. Otherwise,
create a temporary (call it Temp_Index) and set it to
Last_Index (Container) + 1 if Before equals No_Element, and
To_Index (Before) otherwise. Then Insert_Space (Container, Temp_Index,
Count) is called, and finally Position is set to To_Cursor (Container,
Temp_Index).
procedure Set_Length (Container : in out Vector;
Length : in Count_Type);
If Length is larger than the capacity of Container, calls
Reserve_Capacity (Container, Length), then sets the length of the
Container to Length. If Length is greater than the original length of
Container, the added elements are empty elements.
procedure Delete (Container : in out Vector;
Index : in Extended_Index;
Count : in Count_Type := 1);
If Count is 0, the operation has no effect. If Index does not specify a
value in the range First_Index (Container) .. Last_Index (Container), then
Constraint_Error is propagated. Otherwise Delete slides the active
elements (if any) starting Index plus Count down to Index. Any
exceptions raised during element assignment are propagated.
AARM Note: If Index + Count >= Last_Index(Container), this effectively
truncates the vector (setting Last_Index to Index - 1 and consequently
setting Length to Index - Index_Type'First).
procedure Delete (Container : in out Vector;
Position : in out Cursor;
Count : in Count_Type := 1);
If Position is No_Element, this operation has no effect.
If Position does not designate an element in Container, then Program_Error is
raised. If Count is 0, the operation has no effect. Otherwise is equivalent to
Delete (Container, To_Index (Position), Count).
procedure Delete_First (Container : in out Vector;
Count : in Count_Type := 1);
Equivalent to Delete (Container, Index_Type'First, Count).
procedure Delete_Last (Container : in out Vector;
Count : in Count_Type := 1);
If Length (Container) <= Count then is equivalent to
Clear (Container); otherwise is equivalent to Delete (Container,
Index_Type'Val(Index_Type'Pos(Last_Index(Container)) - Count + 1), Count).
function First_Index (Container : Vector) return Index_Type;
Returns the value Index_Type'First.
AARM Note: We'd rather call this "First", but then calling most routines in
here with First (Some_Vect) would be ambiguous.
function First (Container : Vector) return Cursor;
If Container is empty, First returns the value No_Element. Otherwise,
returns a cursor that designates the first element in Container.
function First_Element (Container : Vector) return Element_Type;
Equivalent to Element (Container, First_Index (Container)).
function Last_Index (Container : Vector) return Extended_Index;
If Container is empty, returns No_Index;
otherwise, returns the position of the last element in Container.
function Last (Container : Vector) return Cursor;
If Container is empty, Last returns the value No_Element. Otherwise,
returns a cursor that designates the last element in Container.
function Last_Element (Container : Vector) return Element_Type;
Equivalent to Element (Container, Last_Index (Container)).
procedure Swap (Container : in Vector;
I, J : in Index_Type);
If I or J does not specify a value in the range First_Index (Container) ..
Last_Index (Container), then Constraint_Error is propagated. Otherwise
Swap exchanges the values of the elements at positions I and J.
AARM Notes: To Be Honest: The implementation is not required to actually
copy the elements if it can do the swap some other way. But it is allowed
to copy the elements if needed.
procedure Swap (I, J : in Cursor);
If either I or J is No_Element, then Constraint_Error is propagated. If I and J
designate elements in different containers, then Program_Error is propagated.
Otherwise Swap exchanges the values of the elements designated by I and J.
AARM Notes:
After a call to Swap, I designates the element value previously
designated by J, and J designates the element value previously
designated by I. The cursors do not become ambiguous from this operation.
To Be Honest: The implementation is not required to actually
copy the elements if it can do the swap some other way. But it is allowed
to copy the elements if needed.
End AARM Notes.
generic
with function "<" (Left, Right : Element_Type) return Boolean is <>;
procedure Generic_Sort (Container : in Vector);
Reorders the elements of Container such that the elements are
sorted smallest first as determined by the generic formal "<" operator
provided. Any exception raised during evalution of "<" is propagated.
AARM Notes:
This implies swapping the elements, usually including an intermediate copy.
This means that the elements will usually be copied. (As with Swap,
if the implementation can do this some other way, it is allowed to.) Since
the elements are non-limited, this usually will not be a problem. Note that
there is Implementation Advice below that the implementation should use
a sort that minimizes copying of elements.
The sort is not required to be stable (and the fast algorithm required will
not be stable). If a stable sort is needed, the user can include the original
location of the element as an extra "sort key". We considered requiring the
implementation to do that, but it is mostly extra overhead -- usually there
is something already in the element that provides the needed stability.
End AARM Notes
function Find_Index (Container : Vector;
Item : Element_Type;
Index : Index_Type := Index_Type'First)
return Extended_Index;
Searches the elements of Container for an element equal to Item,
starting at position Index. If there are no elements in the
range Index .. Last_Index (Container) equal to Item, then Find_Index returns
No_Index. Otherwise, it returns the index of the matching
element.
function Find (Container : Vector;
Item : Element_Type;
Position : Cursor := No_Element)
return Cursor;
If Position is not No_Element, and does not designate an element in Container,
then Program_Error is propagated.
Searches the elements of Container for an element equal to Item,
starting at the first element if Cursor equals No_Element, and at
the element designated by Cursor otherwise, and searching to the last
element in Container. If an item equal to Item is found, Find returns a
cursor designating the first element found equal to Item. If no such item is
found, it returns No_Element.
function Reverse_Find_Index (Container : Vector;
Item : Element_Type;
Index : Index_Type := Index_Type'Last)
return Extended_Index;
Searches the elements of Container in reverse for an element equal to
Item, starting at position Index. If Index is greater than Last_Index
(Container), then the search begins at position Last_Index (Container).
If there are no elements in the range Index_Type'First .. Index equal
to Item, then Reverse_Find_Index returns No_Index.
Otherwise, it returns the index of the matching element.
function Reverse_Find (Container : Vector;
Item : Element_Type;
Position : Cursor := No_Element)
return Cursor;
If Position is not No_Element, and does not designate an element in Container,
then Program_Error is propagated.
Searches the elements of Container for an element equal to Item,
starting at the last element if Cursor equals No_Element, and at
the element designated by Cursor otherwise, and searching backwards to the
first element in Container. If an item equal to Item is found, Find returns a
cursor designating the first element found equal to Item. If no such item is
found, it returns No_Element.
function Contains (Container : Vector;
Item : Element_Type) return Boolean;
Equivalent to Has_Element (Find (Container, Item)).
function Next (Position : Cursor) return Cursor;
If Position equals No_Element or designates the last element of the container,
then Next returns the value No_Element. Otherwise, returns a cursor that
designates the element with index To_Index (Position) + 1 in the
same vector as Position.
function Previous (Position : Cursor) return Cursor;
If Position equals No_Element or designates the first element of the container,
then Previous returns the value No_Element. Otherwise, returns a cursor that
designates the element with index (To_Index (Position) - 1) in the
same vector as Position.
The procedure Next is equivalent to Position := Next (Position).
The procedure Previous is equivalent to Position := Previous (Position).
function Has_Element (Position : Cursor) return Boolean;
Returns True if Position designates an element, and returns False otherwise.
AARM Note: To Be Honest: This function may not detect cursors that
designate deleted elements; such cursors are invalid (see below) and any
use of them (including in this routine) is erroneous.
procedure Iterate
(Container : in Vector;
Process : not null access procedure (Position : in Cursor));
Invokes Process.all with a cursor that designates each element in Container, in
index order. Any exception raised by Process is propagated.
Program_Error is propagated if:
* Process.all attempts to insert or delete elements from Container; or
* Process.all finalizes Container; or
* Process.all calls Move with Container as a parameter.
AARM Note:
This check takes place when the operations that insert or delete elements, etc.
are called. There is no check needed if an attempt is made to insert or delete
nothing (that is, Count = 0 or Length(Item) = 0).
The check is easy to implement: each container needs a counter. The counter
is incremented when Iterate is called, and decremented when Iterate completes.
If the counter is nonzero when an operation that inserts or deletes is called,
Finalize is called, or one of the other operations in the list occurs,
Program_Error is raised.
Swap and Generic_Sort are not included here, as they only copy elements.
End AARM Notes.
procedure Reverse_Iterate
(Container : in Vector;
Process : not null access procedure (Position : in Cursor));
Iterates over the nodes in Container as per Iterate, except
that elements are traversed in reverse order.
Legality Rules
An instantiation of Containers.Vectors shall be at the library level.
AARM Note
A implementation will typically need to use controlled types to insure that
the Implementation Requirements are met. These would require all instantiations
to occur at the library level. We certainly do not want to require magic
for nested container instantiations, while not giving similar capabilities to
users. We've made this a legality rule to enhance portability. This rule will
be dropped if AI-344 or some other solution to nested controlled types is
adopted.
Bounded (Run-Time) Errors
Reading the value of an empty element by calling Element, Query_Element,
Update_Element, Generic_Sort, "=", Find, or Reverse_Find is a bounded error.
The implementation may treat the element as having any valid value of the
element type, or raise Constraint_Error or Program_Error before modifying the
vector.
AARM Notes: For instance, a default initialized element could be returned. Or
some previous value of an element. But returning random junk is not allowed
if the type has default initial value(s).
Assignment and streaming of empty elements are NOT bounded errors.
This is consistent with regular composite types, for which assignment and
streaming of uninitialized components do not cause a bounded error, but reading
the uninitialized component does cause a bounded error.
There are other operations which are defined in terms of the operations listed
above.
End AARM Notes.
A Cursor value is ambiguous if any of the following have occurred since it
was created:
* Insert or Delete has been called on the vector that contains the element the
cursor designates with an index value (or a cursor designating an element at
such an index value) less than or equal to the index value of the element
designated by the cursor;
* The vector that contains the element it designates has been passed to an
instance of Generic_Sort.
It is a bounded error to call any subprogram other than "=" or Has_Element
declared in Containers.Vectors with an ambiguous (but not invalid, see below)
cursor parameter. Possible results are:
* The cursor may be treated as if it was No_Element;
* The cursor may designate some element in the vector (but not necessarily
the element that it originally designated);
* Constraint_Error may be raised; or
* Program_Error may be raised.
AARM Note: Cursors are made ambiguous if an Insert or Delete occurs that moves
the elements in the internal array including the designated ones. After such an
operation, the cursor probably still designates an element (although it might
not after a deletion), but it is a different element. That violates the
definition of cursor -- it designates a particular element.
For "=" or Has_Element, the cursor works normally (it would not be No_Element).
We don't want to trigger an exception simply for comparing a bad cursor.
While it is possible to check for or prevent these cases, in many cases the
overhead necessary to make the check (or prevent the problems) is substantial
in time or space.
End AARM Notes.
Erroneous Execution
A Cursor value is invalid if any of the following have occurred since it was
created:
* The vector that contains the element it designates has been finalized;
* The vector that contains the element it designates has been used as the
Source or Target of a call to Move;
* The element it designates has been deleted.
The result of "=" or Has_Element is unspecified if it is called with an
invalid cursor parameter. Execution is erroneous if any other subprogram
declared in Containers.Vectors is called with an invalid cursor parameter.
Execution also is erroneous if the called Process procedure of a call to
Query_Element or Update_Element executes an operation that causes the Position
cursor or To_Cursor of the Index parameter of Query_Element or Update_Element
to become invalid or ambiguous.
AARM Notes: The list above (combined with the bounded error cases) is intended
to be exhaustive. In other cases, a cursor value continues to designate its
original element. For instance, cursor values survive the appending of new
elements.
End AARM Notes.
Implementation Requirements
No storage associated with a vector object shall be lost upon assignment or
scope exit.
Implementation Advice
Containers.Vectors should be implemented similarly to an array. In particular,
if the length of a vector is N, then
* the worst-case time complexity of Append with Count=1 and Element should
be O(log N);
* the worst-case time complexity of Prepend with Count=1 or Delete_First
with Count=1 of the vector should be O(N log N).
AARM Note
We do not mean to overly constrain implementation strategies here. However, it
is important for portability that the performance of large containers has
roughly the same factors on different implementations. If a program is moved
to an implementation that takes O(N) time to access elements, that program
could be unusable when the vectors are large. We allow O(log N) access because
the proportionality constant and caching effects are likely to be larger than
the log factor, and we don't want to discourage innovative implementations.
The worst-case time complexity of a call on an instantiation of
Containers.Vectors.Generic_Sort should be O(N**2), and
the average time complexity should be better than O(N**2).
AARM Note
In other words, we're requiring the use of a better than O(N**2) sorting
algorithm, such as Quicksort. No Bubble sorts allowed!
Containers.Vectors.Generic_Sort should minimize copying of elements.
AARM Note - To Be Honest
We do not mean "absolutely minimize" here; we're not intending to require a
single copy for each element. Rather, we want to suggest that the sorting
algorithm chosen is one that does not copy items unnecessarily. Bubble sort
would not meet this advice, for instance.
NOTE:
All elements of a vector occupy locations in the internal array.
If a sparse container is required, a Hashed_Map should be used rather than a
vector.
If Index_Type'Base'First = Index_Type'First an instantiation of
Ada.Containers.Vectors will raise Constraint_Error. A value below
Index_Type'First is required so that an empty vector has a meaningful
value of Last_Index.
AARM Note: This property is the main reason why only integer types (as opposed
to any discrete type) are allowed as the index type of a vector. An enumeration
or modular type would require a subtype in order to meet this requirement.
A.18.3 The Package Containers.Doubly_Linked_Lists
The language-defined package Containers.Doubly_Linked_Lists provides
private types List and Cursor, and a set of operations for each type. A
list container is optimized for insertion and deletion at any position.
A doubly-linked list container object manages a linked list of internal
storage nodes, each of which contains an element and pointers to the
next (successor) and previous (predecessor) internal nodes. A cursor
designates particular storage nodes (and thus elements) within a list.
Static Semantics
The library package Containers.Doubly_Linked_Lists has the following
declaration:
generic
type Element_Type is private;
with function "=" (Left, Right : Element_Type) return Boolean is <>;
package Ada.Containers.Doubly_Linked_Lists is
pragma Preelaborate (Doubly_Linked_Lists);
type List is tagged private;
type Cursor is private;
Empty_List : constant List;
No_Element : constant Cursor;
function "=" (Left, Right : List) return Boolean;
function Length (Container : List) return Natural;
function Is_Empty (Container : List) return Boolean;
procedure Clear (Container : in out List);
function Element (Position : Cursor)
return Element_Type;
procedure Query_Element
(Position : in Cursor;
Process : not null access procedure (Element : in Element_Type));
procedure Update_Element
(Position : in Cursor;
Process : not null access procedure (Element : in out Element_Type));
procedure Replace_Element (Position : in Cursor;
By : in Element_Type);
procedure Move (Target : in out List;
Source : in out List);
procedure Prepend (Container : in out List;
New_Item : in Element_Type;
Count : in Count_Type := 1);
procedure Append (Container : in out List;
New_Item : in Element_Type;
Count : in Count_Type := 1);
procedure Insert (Container : in out List;
Before : in Cursor;
New_Item : in Element_Type;
Count : in Count_Type := 1);
procedure Insert (Container : in out List;
Before : in Cursor;
New_Item : in Element_Type;
Position : out Cursor;
Count : in Count_Type := 1);
procedure Insert (Container : in out List;
Before : in Cursor;
Position : out Cursor;
Count : in Count_Type := 1);
procedure Delete (Container : in out List;
Position : in out Cursor;
Count : in Count_Type := 1);
procedure Delete_First (Container : in out List;
Count : in Count_Type := 1);
procedure Delete_Last (Container : in out List;
Count : in Count_Type := 1);
generic
with function "<" (Left, Right : Element_Type)
return Boolean is <>;
procedure Generic_Sort (Container : in out List);
generic
with function "<" (Left, Right : Element_Type)
return Boolean is <>;
procedure Generic_Merge (Target : in out List;
Source : in out List);
procedure Reverse_List (Container : in out List);
procedure Swap (I, J : in Cursor);
procedure Swap_Links (Container : in out List;
I, J : in Cursor);
procedure Splice (Target : in out List;
Before : in Cursor;
Source : in out List);
procedure Splice (Target : in out List;
Before : in Cursor;
Position : in Cursor);
procedure Splice (Target : in out List;
Before : in Cursor;
Source : in out List;
Position : in Cursor);
function First (Container : List) return Cursor;
function First_Element (Container : List)
return Element_Type;
function Last (Container : List) return Cursor;
function Last_Element (Container : List)
return Element_Type;
function Contains (Container : List;
Item : Element_Type) return Boolean;
function Find (Container : List;
Item : Element_Type;
Position : Cursor := No_Element)
return Cursor;
function Reverse_Find (Container : List;
Item : Element_Type;
Position : Cursor := No_Element)
return Cursor;
function Next (Position : Cursor) return Cursor;
function Previous (Position : Cursor) return Cursor;
procedure Next (Position : in out Cursor);
procedure Previous (Position : in out Cursor);
function Has_Element (Position : Cursor) return Boolean;
procedure Iterate
(Container : in List;
Process : not null access procedure (Position : in Cursor));
procedure Reverse_Iterate
(Container : in List;
Process : not null access procedure (Position : in Cursor));
private
... --
end Ada.Containers.Doubly_Linked_Lists;
The type List needs finalization (see 7.6).
Empty_List represents the empty list. If an object of type List is not
otherwise initialized, it will be initialized to the same value as
Empty_List.
No_Element represents a cursor that designates no element. If an object of type
Cursor is not otherwise initialized, it will be initialized to the same
value as No_Element.
function "=" (Left, Right : List) return Boolean;
If Left denotes the same object as Right, then this function returns
True. Otherwise if Left and Right have different lengths, then this
function returns False. Otherwise, it compares each element in Left to
the corresponding element in Right using the generic formal equality
operator. If element equality returns False, then this function returns
False; otherwise, this function returns True. Any exception raised
during evaluation of element equality is propagated.
The function Length returns the number of elements in Container.
The function Is_Empty is equivalent to Length (Container) = 0.
Clear removes all the nodes in Container, and sets the length to 0. Any
exceptions raised during deallocation of storage are propagated.
function Element (Position : Cursor) return Element_Type;
Returns the element stored on the node designated by Position. If
Position equals No_Element, then Constraint_Error is propagated.
procedure Query_Element
(Position : in Cursor;
Process : not null access procedure (Element : in Element_Type));
If Position equals No_Element, then Constraint_Error is propagated. Otherwise,
Query_Element invokes Process.all with the element on node designated by
Position as the argument.
procedure Update_Element
(Position : in Cursor;
Process : not null access procedure (Element : in out Element_Type));
If Position equals No_Element, then Constraint_Error is propagated. Otherwise,
Update_Element invokes Process.all with the element on node designated by
Position as the argument.
If Element_Type is unconstrained and definite, then the Element parameter
of Process.all shall be unconstrained.
AARM Note: This means that the elements cannot be directly allocated from the
heap (nor aliased unless AI-363 is included in the Amendment); it must be
possible to change the discriminants of the element in place.
procedure Replace_Element (Position : Cursor;
By : Element_Type);
Assigns the value By to the element stored on the node designated by
Position. If Position equals No_Element, then Constraint_Error is
propagated.
procedure Move (Target : in out List;
Source : in out List);
If Target denotes the same object as Source, then the operation has no
effect. If Length (Target) is greater than 0, then it raises
Constraint_Error. Otherwise, the nodes in Source are moved to Target. The
length of Target is set to the length of Source, and the length of Source is
set to 0.
Prepend is equivalent to Insert (Container, First (Container), New_Item, Count).
Append is equivalent to Insert (Container, No_Element, New_Item, Count).
procedure Insert (Container : in out List;
Before : in Cursor;
New_Item : in Element_Type;
Count : in Count_Type := 1);
If Before is not No_Element, and does not designate an element in Container,
then Program_Error is propagated.
Insert allocates Count new nodes whose element is initialized to the value
New_Item, and inserts them prior to the node designated by Before. If
Before equals No_Element, the new nodes are inserted immediately
following the last node (if any).
AARM Note: The check on Before checks that the cursor does not belong to some
other Container. This check implies that a reference to the container is
included in the cursor value. This wording is not meant to require detection of
dangling cursors; such cursors are defined to be invalid, which means that
execution is erroneous, and any result is allowed (including not raising an
exception).
procedure Insert (Container : in out List;
Before : in Cursor;
New_Item : in Element_Type;
Position : out Cursor;
Count : in Count_Type := 1);
If Before is not No_Element, and does not designate an element in Container,
then Program_Error is propagated.
Insert allocates Count new nodes whose element is initialized to the value
New_Item, and inserts them prior to the node designated by Before. If
Before equals No_Element, the new nodes are inserted immediately
following the last node (if any). Position designates the first
newly-inserted node. Any exception raised during allocation of
internal storage is propagated, and Container is not modified.
procedure Insert (Container : in out List;
Before : in Cursor;
Position : out Cursor;
Count : in Count_Type := 1);
If Before is not No_Element, and does not designate an element in Container,
then Program_Error is propagated.
Allocates Count new nodes with elements initialized with any implicit initial
values for any part (as for an object_declaration with no initialization
expression - see 3.3.1), and inserts them prior to the node designated by
Before. If Before equals No_Element, the new node is inserted immediately
following the last node (if any). Position designates the first newly-inserted
node. Any exception raised during allocation of internal storage is
propagated, and Container is not modified.
procedure Delete (Container : in out List;
Position : in out Cursor;
Count : in Count_Type := 1);
If Position equals No_Element, this operation has no effect. If Position does
not designate an element in Container, then Program_Error is propagated. Otherwise
Delete removes Count nodes starting at the node designated by Position from
Container (or all of the nodes if there are less than Count nodes starting at
Position). Any exception raised during deallocation of internal storage is
propagated.
procedure Delete_First (Container : in out List;
Count : in Count_Type := 1);
If Length(Container) >= Count, Delete_First removes the first Count nodes from
Container; otherwise, all of the nodes in Container are removed. Any exceptions
raised during deallocation of storage are propagated.
procedure Delete_Last (Container : in out List;
Count : in Count_Type := 1);
If Length(Container) >= Count, Delete_Last removes the last Count nodes from
Container; otherwise, all of the nodes in Container are removed. Any exceptions
raised during deallocation of storage are propagated.
generic
with function "<" (Left, Right : Element_Type) return Boolean is <>;
procedure Generic_Sort (Container : in out List);
Reorders the nodes of Container such that the elements are
sorted smallest first as determined by the generic formal "<" operator
provided. The sort must be stable. Any exception raised during evaluation of
"<" is propagated.
AARM Notes
Unlike array sorts, we do require stable sorts here. That's because
algorithms in the merge sort family (as described by Knuth) can be both
fast and stable. Such sorts use the extra memory as offered by
the links to provide better performance.
Note that list sorts never copy elements; it is the nodes, not the elements,
that are reordered.
End AARM Notes
generic
with function "<" (Left, Right : Element_Type) return Boolean is <>;
procedure Generic_Merge (Target : in out List;
Source : in out List);
If Source denotes the same object as Target, the operation has no
effect. Otherwise this operation reorders nodes such that they are
removed from Source and moved to Target. Target and Source are assumed to be
sorted smallest first as determined by the generic formal "<" operator. The
nodes in Source containing elements that are less than elements in Target are
spliced in before the elements already in Target. The nodes in Source container
elements that are equal to or greater than elements in Target are spliced in
after the elements already in Target. Any exception raised during evaluation
of "<" is propagated, immediately terminating the merge operation.
procedure Reverse_List (Container : in out List);
Reorders the nodes of Container in reverse order.
procedure Swap (I, J : in Cursor);
If either I or J is No_Element, then Constraint_Error is propagated. If I and J
designate elements in different containers, then Program_Error is propagated.
Otherwise Swap exchanges the values of the elements designated by I and J.
AARM Notes:
After a call to Swap, I designates the element value previously
designated by J, and J designates the element value previously
designated by I. The cursors do not become ambiguous from this operation.
AARM Notes: To Be Honest: The implementation is not required to actually
copy the elements if it can do the swap some other way. But it is allowed
to copy the elements if needed.
procedure Swap_Links (Container : in out List;
I, J : in Cursor);
If either I or J is No_Element, then Constraint_Error is propagated. If I or J
do not designate an element in Container, then Program_Error is propagated.
Otherwise, Swap_Links exchanges the nodes designated by I and J.
AARM Note: Unlike Swap, this exchanges the nodes, not the
elements. No copying is performed. I and J designate the same elements after
this call as they did before it. This operation can provide better performance
than Swap if the element size is large.
procedure Splice (Target : in out List;
Before : in Cursor;
Source : in out List);
If Before is not No_Element, and does not designate an element in Target,
then Program_Error is propagated.
If Source denotes the same object as Target, the operation has no
effect. Otherwise, Splice reorders nodes such that they are removed from
Source and moved to Target, just prior to Before. If Before equals
No_Element, the nodes of Source are spliced after the last node of
Target. The length of Target is incremented by the number of
nodes in Source, and the length of Source is set to 0.
procedure Splice (Target : in out List;
Before : in Cursor;
Position : in Cursor);
If either of Before or Position is not No_Element, and does not designate an
element in Target, then Program_Error is propagated.
If Position equals No_Element, or if Position equals Before, or if the
successor of Position equals Before, the operation has no effect.
Otherwise the node designated by Position becomes the predecessor of Before.
If Before equals No_Element, then Position is moved after the last node.
procedure Splice (Target : in out List;
Before : in Cursor;
Source : in out List;
Position : in Cursor);
If Before does not equal No_Element, and does not designate a node in
Target, then Program_Error is propagated. If Position does not equal
No_Element, and does not designate a node in Source, then Program_Error
is propagated.
If Source denotes the same object as Target, then Splice is
equivalent to the Splice operation sans Source parameter. If Position
equals No_Element, the operation has no effect. Otherwise the node
designated by Position is removed from Source and moved to Target, immediately
prior to Before. If Before equals No_Element, then Position is moved after the
last node of Target. The length of Target is incremented, and the length of
Source is decremented.
function First (Container : List) return Cursor;
Returns a cursor that designates the first node. If Container is empty,
First returns the value No_Element.
function First_Element (Container : List) return Element_Type;
If Container is empty, then Constraint_Error is propagated. Otherwise,
it returns Element (First (Container)).
function Last (Container : List) return Cursor;
Returns a cursor that designates the last node. If Container is empty,
Last returns the value No_Element.
function Last_Element (Container : List) return Element_Type;
If Container is empty, then Constraint_Error is propagated. Otherwise,
it returns Element (Last (Container)).
function Contains (Container : List;
Item : Element_Type) return Boolean;
Equivalent to Find (Container, Item) /= No_Element.
function Find (Container : List;
Item : Element_Type;
Position : Cursor := No_Element)
return Cursor;
If Position is not No_Element, and does not designate an element
in Container, then Program_Error is propagated.
Searches the nodes of Container for an element equal to Item. The
search starts at the node designated by Position. If Position equals
No_Element, then the search begins at the first node. If no element is
found that matches Item, then Find returns the value No_Element.
function Reverse_Find (Container : List;
Item : Element_Type;
Position : Cursor := No_Element)
return Cursor;
If Position is not No_Element, and does not designate an element
in Container, then Program_Error is propagated.
Searches the nodes of Container in reverse for an element equal to Item.
The search starts at the node designated by Position. If Position
equals No_Element, then the search begins at the last node. If no
element is found that matches Item, then Find returns the value
No_Element.
function Next (Position : Cursor) return Cursor;
If Position equals No_Element or designates the last node of the container,
then Next returns the value No_Element. Otherwise, returns a cursor that
designates the successor of the node designated by Position.
function Previous (Position : Cursor) return Cursor;
If Position equals No_Element or designates the first node of the container,
then Previous returns the value No_Element. Otherwise, returns a cursor that
designates the predecessor of the node designated by Position.
The procedure Next is equivalent to Position := Next (Position).
The procedure Previous is equivalent to Position := Previous (Position).
function Has_Element (Position : Cursor) return Boolean;
Returns True if Position designates an element, and returns False otherwise.
AARM Note: To Be Honest: This function may not detect cursors that
designate deleted elements; such cursors are invalid (see below) and any
use of them (including in this routine) is erroneous.
procedure Iterate
(Container : in List;
Process : not null access procedure (Position : in Cursor));
Invokes Process.all with a cursor that designates each node in Container. Any
exceptions raised during Process are propagated.
Program_Error is propagated if:
* Process.all attempts to insert or delete elements from Container; or
* Process.all calls a routine that reorders the elements of Container
(Swap_Links, Splice, Generic_Sort, or Generic_Merge); or
* Process.all finalizes Container; or
* Process.all calls Move with Container as a parameter.
AARM Note:
This check takes place when the operations that insert or delete elements, etc.
are called. There is no check needed if an attempt is made to insert or delete
nothing (that is, Count = 0).
See Iterate for vectors for a suggested implementation of the check.
Swap is not included here, as it only copies elements.
End AARM Notes.
procedure Reverse_Iterate
(Container : in List;
Process : not null access procedure (Position : in Cursor));
Iterates over the nodes in Container as per Iterate, except
that elements are traversed in reverse order.
Erroneous Execution
A Cursor value is invalid if any of the following have occurred since it was
created:
* The list that contains the element it designates has been finalized;
* The list that contains the element it designates has been used as the
Source or Target of a call to Move;
* The element it designates has been deleted.
The result of "=" or Has_Element is unspecified if it is called with an invalid
cursor parameter. Execution is erroneous if any other subprogram declared in
Containers.Doubly_Linked_Lists is called with an invalid cursor parameter.
Execution also is erroneous if the called Process procedure of a call to
Query_Element or Update_Element executes an operation that causes the Position
cursor of Query_Element or Update_Element to become invalid.
AARM Notes: The list above is intended to be exhaustive. In other cases, a
cursor value continues to designate its original element. For instance,
cursor values survive the insertion and deletion of other nodes.
While it is possible to check for these cases, in many cases the overhead
necessary to make the check is substantial in time or space. Implementations
are encouraged to check for as many of these cases as possible and raise
Constraint_Error if detected.
End AARM Notes.
Implementation Requirements
No storage associated with a doubly-linked List object shall be lost
upon assignment or scope exit.
Implementation Advice
Containers.Doubly_Linked_Lists should be implemented similarly to a linked
list. In particular, if N is the length of a list, then the worst-case time
complexity of Element, Insert with Count=1, and Delete with Count=1 should be
O(log N).
AARM Note
We do not mean to overly constrain implementation strategies here. However, it
is important for portability that the performance of large containers has
roughly the same factors on different implementations. If a program is moved to
an implementation that takes O(N) time to access elements, that program could
be unusable when the lists are large. We allow O(log N) access because the
proportionality constant and caching effects are likely to be larger than the
log factor, and we don't want to discourage innovative implementations.
The worst-case time complexity of a call on an instantiation of
Containers.Doubly_Linked_Lists.Generic_Sort should be O(N**2), and
the average time complexity should be better than O(N**2).
AARM Note
In other words, we're requiring the use of a better than O(N**2) sorting
algorithm, such as Quicksort. No Bubble sorts allowed!
NOTE
Sorting a list never copies elements, and is a stable sort (equal elements
remain in the original order). This is different than sorting an array or
vector, which may need to copy elements, and is probably not a stable sort.
A.18.4 Maps
The language-defined packages Containers.Hashed_Maps and Containers.Ordered_Maps
provide private types Map and Cursor, and a set of operations for each type.
A map container allows an arbitrary type to be used as a key to find the
element associated with that key. A hashed map uses a hash function to organize
the keys, while an ordered map orders the keys per a specified relation.
This section describes the declarations that are common to both kinds of maps.
See A.18.5 for a description of the semantics specific to Containers.Hashed_Maps
and A.18.6 for a description of the semantics specific to
Containers.Ordered_Maps.
The type Map is used to represent maps. The type Map needs finalization (see
7.6).
A map contains pairs of keys and elements, called nodes. Map cursors
designate nodes, but also can be thought of as designating an element (the
element contained in the node) for consistency with the other containers. There
exists an equivalence relation on keys, whose definition is different for hashed
maps and ordered maps. A map never contains two or more nodes with equivalent
keys. The length of a map is the number of nodes it contains.
Each nonempty map has two particular nodes called the *first node* and the *last
node* (which may be the same). Each node except for the last node has a
successor node. If there are no other intervening operations, starting with
the first node and repeatedly going to the successor node will visit each node
in the map exactly once until the last node is reached. The exact definition of
these terms is different for hashed maps and ordered maps.
Empty_Map represents the empty Map object. It has a length of 0. If an object
of type Map is not otherwise initialized, it is initialized to the same
value as Empty_Map.
No_Element represents a cursor that designates no node. If an object of type
Cursor is not otherwise initialized, it is initialized to the same
value as No_Element.
function "=" (Left, Right : Map) return Boolean;
If Left and Right denote the same map object, then the function returns True. If
Left and Right have different lengths, then the function returns False.
Otherwise, for each key K in Left, the function returns False if:
* a key equivalent to K is not present in Right; or
* the element associated with K in Left is not equal to the element associated
with K in Right (using the generic formal equality operator for elements).
If the function has not returned a result after checking all of the keys, it
returns True. Any exception raised during evaluation of key equivalence or
element equality is propagated.
function Length (Container : Map) return Count_Type;
Returns the number of nodes in Container.
function Is_Empty (Container : Map) return Boolean;
Equivalent to Length (Container) = 0.
procedure Clear (Container : in out Map);
Removes all the nodes from Container. Any exception raised during deallocation
of
storage is propagated.
function Key (Position : Cursor) return Key_Type;
If Position equals No_Element, then Constraint_Error is propagated. Otherwise,
Key returns the key component of the node designated by Position.
function Element (Position : Cursor) return Element_Type;
If Position equals No_Element, then Constraint_Error is propagated. Otherwise,
Element returns the element component of the node designated by Position.
procedure Query_Element
(Position : in Cursor;
Process : not null access procedure (Key : in Key_Type;
Element : in Element_Type));
If Position equals No_Element, then Constraint_Error is propagated. Otherwise,
Query_Element calls Process.all with the key and element from the node
designated by Position as the arguments.
procedure Update_Element
(Position : in Cursor;
Process : not null access procedure (Key : in Key_Type;
Element : in out Element_Type));
If Position equals No_Element, then Constraint_Error is propagated. Otherwise,
Update_Element calls Process.all with the key and element from the node
designated by Position as the arguments.
If Element_Type is unconstrained and definite, then the Element parameter
of Process.all shall be unconstrained.
AARM Note: This means that the elements cannot be directly allocated from the
heap (nor aliased unless AI-363 is included in the Amendment); it must be
possible to change the discriminants of the element in place.
procedure Replace_Element (Position : in Cursor;
By : in Element_Type);
If Position equals No_Element, then Constraint_Error is propagated.
Otherwise this operation assigns By to the element of the node designated by
Position.
procedure Move (Target : in out Map;
Source : in out Map);
Move first clears Target. Then each node from Source is removed from Source and
inserted into Target. The length of Source is 0 after a successful call to
Move.
procedure Insert (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type;
Position : out Cursor;
Inserted : out Boolean);
Insert checks if a node with a key equivalent to Key is already present in
Container. If a match is found, Inserted is set to False and Position
designates the element with the matching key. Otherwise, Insert allocates a new
node, initializes it to Key and New_Item, and adds it to Container; Inserted is
set to True and Position designates the newly-inserted node. Any exception
raised during allocation is propagated and Container is not modified.
procedure Insert (Container : in out Map;
Key : in Key_Type;
Position : out Cursor;
Inserted : out Boolean);
Insert inserts Key into Container as per the five-parameter Insert, with the
difference that an element initialized with any implicit initial values for any
part (as for an object_declaration with no initialization expression - see
3.3.1) is inserted.
procedure Insert (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type);
Insert inserts Key and New_Item into Container as per the five-parameter
Insert, with the difference that if a node with a key equivalent to Key is
already in the map, then Constraint_Error is propagated.
AARM Note: This is equivalent to:
declare
Inserted : Boolean; C : Cursor;
begin
Insert (Container, Key, New_Item, C, Inserted);
if not Inserted then
raise Constraint_Error;
end if;
end;
but doesn't require the hassle of out parameters.
procedure Include (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type);
Include inserts Key and New_Item into Container as per the five-parameter
Insert, with the difference that if a node with a key equivalent to Key is
already in the map, then this operation assigns Key and New_Item to the
matching node. Any exception raised during assignment is propagated.
AARM Note: This is equivalent to:
declare
C : Cursor := Find (Container, Key);
begin
if C = No_Element then
Insert (Container, Key, New_Item);
else
Replace (Container, Key, New_Item);
end if;
end;
but this avoids doing the search twice.
procedure Replace (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type);
Replace checks if a node with a key equivalent to Key is present in Container.
If a match is found, Replace assigns Key and New_Item to the matching node;
otherwise, Constraint_Error is propagated.
AARM Note: We update the key as well as the element, as the key might include
additional information that does not participate in equivalence. If only the
element needs to be updated, use
Replace_Element (Find (Container, Key), New_Element).
procedure Delete (Container : in out Map;
Key : in Key_Type);
Delete checks if a node with a key equivalent to Key is present in Container.
If a match is found, Delete removes the node from the map; otherwise,
Constraint_Error is propagated.
procedure Delete (Container : in out Map;
Position : in out Cursor);
If Position equals No_Element, Delete has no effect. If Position does not
designate an element in Container, then Program_Error is propagated. Otherwise,
Delete removes the node designated by Position from the map. Position is set to
No_Element on return.
AARM Note: The check on Position checks that the cursor does not belong to some
other map. This check implies that a reference to the map is included in the
cursor value. This wording is not meant to require detection of dangling
cursors; such cursors are defined to be invalid, which means that execution is
erroneous, and any result is allowed (including not raising an exception).
procedure Exclude (Container : in out Map;
Key : in Key_Type);
Exclude checks if a node with a key equivalent to Key is present in Container.
If a match is found, Exclude removes the node from the map and then
deallocates the node.
function Contains (Container : Map;
Key : Key_Type) return Boolean;
Equivalent to Find (Container, Key) /= No_Element.
function Find (Container : Map;
Key : Key_Type) return Cursor;
If Length (Container) equals 0, then Find returns No_Element. Otherwise, Find
checks if a node with a key equivalent to Key is present in Container. If a
match is found, a cursor designating the matching node is returned; otherwise,
No_Element is returned.
function Element (Container : Map;
Key : Key_Type) return Element_Type;
Equivalent to Element (Find (Container, Key)).
function First (Container : Map) return Cursor;
If Length (Container) = 0, then First returns No_Element. Otherwise,
First returns a cursor that designates the first node in Container.
function Next (Position : Cursor) return Cursor;
Returns a cursor that designates the successor of the node designated by
Position. If Position designates the last node, then No_Element is returned. If
Position equals No_Element, then No_Element is returned.
procedure Next (Position : in out Cursor);
Equivalent to Position := Next (Position).
function Has_Element (Position : Cursor) return Boolean;
Returns True if Position designates a node, and returns False otherwise.
AARM Note: To Be Honest: This function may not detect cursors that designate
deleted elements; such cursors are invalid (see below).
procedure Iterate
(Container : in Map;
Process : not null access procedure (Position : in Cursor));
Iterate calls Process.all with a cursor that designates each node in Container,
starting with the first node and moving the cursor according to the successor
relation. Any exception raised by Process.all is propagated. Program_Error is
propagated if:
* Process.all attempts to insert or delete nodes from Container; or
* Process.all finalizes Container; or
* Process.all calls Move with Container as a parameter.
AARM Note:
This check takes place when the operations that insert or delete elements, etc.
are called.
See Iterate for vectors for a suggested implementation of the check.
End AARM Notes.
Erroneous Execution
A Cursor value is invalid if any of the following have occurred since it was
created:
* The map that contains the node it designates has been finalized;
* The map that contains the node it designates has been used as the
Source or Target of a call to Move;
* The node it designates has been deleted from the map.
The result of "=" or Has_Element is unspecified if these function are called
with an invalid cursor parameter. Execution is erroneous if any other subprogram
declared in Containers.Hashed_Maps or Containers.Ordered_Maps is called with an
invalid cursor parameter.
Execution also is erroneous if the called Process.all procedure of a call to
Query_Element or Update_Element executes an operation that causes the Position
cursor of Query_Element or Update_Element to become invalid.
AARM Notes: The list above is intended to be exhaustive. In other cases, a
cursor value continues to designate its original element. For instance, cursor
values survive the insertion and deletion of other nodes.
While it is possible to check for these cases, in many cases the overhead
necessary to make the check is substantial in time or space. Implementations
are encouraged to check for as many of these cases as possible and raise
Program_Error if detected.
End AARM Notes.
Implementation Requirements
No storage associated with a Map object shall be lost upon assignment or
scope exit.
A.18.5 The Package Containers.Hashed_Maps
Static Semantics
The library package Containers.Hashed_Maps has the following
declaration:
generic
type Key_Type is private;
type Element_Type is private;
with function Hash (Key : Key_Type) return Hash_Type;
with function Equivalent_Keys (Left, Right : Key_Type)
return Boolean;
with function "=" (Left, Right : Element_Type)
return Boolean is <>;
package Ada.Containers.Hashed_Maps is
pragma Preelaborate (Hashed_Maps);
type Map is tagged private;
type Cursor is private;
Empty_Map : constant Map;
No_Element : constant Cursor;
function "=" (Left, Right : Map) return Boolean;
function Length (Container : Map) return Count_Type;
function Is_Empty (Container : Map) return Boolean;
procedure Clear (Container : in out Map);
function Key (Position : Cursor) return Key_Type;
function Element (Position : Cursor) return Element_Type;
procedure Query_Element
(Position : in Cursor;
Process : not null access procedure (Key : in Key_Type;
Element : in Element_Type));
procedure Update_Element
(Position : in Cursor;
Process : not null access procedure (Key : in Key_Type;
Element : in out Element_Type));
procedure Replace_Element (Position : in Cursor;
By : in Element_Type);
procedure Move (Target : in out Map;
Source : in out Map);
procedure Insert (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type;
Position : out Cursor;
Inserted : out Boolean);
procedure Insert (Container : in out Map;
Key : in Key_Type;
Position : out Cursor;
Inserted : out Boolean);
procedure Insert (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type);
procedure Include (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type);
procedure Replace (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type);
procedure Delete (Container : in out Map;
Key : in Key_Type);
procedure Delete (Container : in out Map;
Position : in out Cursor);
procedure Exclude (Container : in out Map;
Key : in Key_Type);
function Contains (Container : Map;
Key : Key_Type) return Boolean;
function Find (Container : Map;
Key : Key_Type)
return Cursor;
function Element (Container : Map;
Key : Key_Type)
return Element_Type;
function First (Container : Map)
return Cursor;
function Next (Position : Cursor) return Cursor;
procedure Next (Position : in out Cursor);
function Has_Element (Position : Cursor) return Boolean;
function Equivalent_Keys (Left, Right : Cursor)
return Boolean;
function Equivalent_Keys (Left : Cursor;
Right : Key_Type)
return Boolean;
function Equivalent_Keys (Left : Key_Type;
Right : Cursor)
return Boolean;
procedure Iterate
(Container : in Map;
Process : not null access procedure (Position : in Cursor));
function Capacity (Container : Map) return Count_Type;
procedure Reserve_Capacity (Container : in out Map;
Capacity : in Count_Type);
private
... --
end Ada.Containers.Hashed_Maps;
An object of type Map contains an expandable hash table, which is used to
provide direct access to nodes. The capacity of an object of type Map is the
maximum number of nodes that can be inserted into the hash table prior to it
being automatically expanded.
AARM Notes
The expected implementation for a Map uses a hash table which is grown when it
is too small, with linked lists hanging off of each bucket. Note that in that
implementation a cursor needs a back pointer to the Map object to implement
iteration; that could either be in the nodes, or in the cursor object. To
provide an average O(1) access time, capacity would typically equal the number
of buckets in such an implementation, so that the average bucket linked list
length would be no more than 1.0.
There is no defined relationship between elements in a hashed map. Typically,
iteration will return elements in the order that they are hashed in.
End AARM Notes
Two keys K1 and K2 are defined to be equivalent if Equivalent_Keys (K1, K2)
returns True.
Function Hash is expected to return the same value each time it is called with a
particular key value. For any two equivalent key values, Hash is expected to
return the same value. If Hash behaves in some other manner, the behavior of
this package is unspecified. Which subprograms of this package call Hash, and
how many times they call it, is unspecified.
AARM Notes
The implementation is not required to protect against Hash raising an exception,
or returning random numbers, or any other "bad" behavior. It's not practical to
do so, and a broken Hash function makes the container unusable.
The implementation can call Hash whenever it is needed; we don't want to
specify how often that happens. The result must remain the same (this is
logically a pure function), or the behavior is unspecified.
End AARM Notes
Function Equivalent_Keys is expected to return the same value each time it is
called with a particular pair of key values. For any two keys K1 and K2, the
boolean values Equivalent_Keys (K1, K2) and Equivalent_Key (K2, K1) are expected
to be equal. If Equivalent_Keys behaves in some other manner, the behavior of
this package is unspecified. Which subprograms of this package call
Equivalent_Keys, and how many times they call it, is unspecified.
AARM Note
As with Hash, the implementation is not required to protect against
Equivalent_Keys raising an exception or returning random results. Similarly, the
implementation can call this operation whenever it is needed. The result must
remain the same (this is a logically pure function), or the behavior is
unspecified.
If the value of a key stored in a node of a map is changed other than by an
operation in this package such that at least one of Hash or Equivalent_Keys give
different results, the behavior of this package is unspecified.
AARM Notes
The implementation is not required to protect against changes to key values
other than via the operations declared in the Hashed_Maps package.
To see how this could happen, imagine an instantiation of Hashed_Maps where
the key type is an access-to-variable type and Hash returns a value derived
from the components of the designated object. Then, any operation that has a
key value could modify those components and change the hash value:
Key (Map).Some_Component := New_Value;
This is really a design error on the part of the user of the map; it shouldn't
be possible to modify keys stored in a map. But we can't prevent this error
anymore than we can prevent someone passing as Hash a random number generator.
End AARM Notes
Which nodes are the first node and the last node of a map, and which node is the
successor of a given node, are unspecified, other than the general semantics
described in A.18.4.
AARM Note
Typically the first node will be the first node in the first bucket, the last
node will be the last node in the last bucket, and the successor will be
obtained by following the collision list, and going to the next bucket at the
end of each bucket.
procedure Clear (Container : in out Map);
In addition to the semantics described in A.18.4, Clear does not affect the
capacity of Container.
AARM Note:
procedure Move (Target : in out Map;
Source : in out Map);
The intended implementation is that the internal hash table of Target is first
deallocated; then the internal hash table is removed from Source and moved to
Target.
End AARM Note.
procedure Insert (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type;
Position : out Cursor;
Inserted : out Boolean);
In addition to the semantics described in A.18.4, if Length (Container) equals
Capacity (Container), then Insert first calls Reserve_Capacity to increase the
capacity of Container to some larger value.
AARM Notes:
Insert should only compare keys that hash to the same bucket in the hash table.
We specify when Reserve_Capacity is called to bound the overhead of capacity
expansion operations (which are potentially expensive). Moreover, expansion can
be predicted by comparing Capacity(Map) to Length(Map). Since we don't specify
by how much the hash table is expanded, this only can be used to predict the
next expansion, not later ones.
End AARM Notes.
AARM Notes:
procedure Delete (Container : in out Map;
Key : in Key_Type);
Delete should only compare keys that hash to the same bucket in the hash
table. The node containing the element may be deallocated now,
or it may be saved and reused later.
End AARM Notes.
AARM Notes:
procedure Exclude (Container : in out Map;
Key : in Key_Type);
Exclude should only compare keys that hash to the same bucket in the hash
table. Exclude should work on an empty map; nothing happens in that case.
End AARM Notes.
AARM Note:
function Find (Container : Map;
Key : Key_Type) return Cursor;
Find should only compare keys that hash to the same bucket in the hash table.
End AARM Notes.
AARM Note:
function First (Container : Map) return Cursor;
In a typical implementation, this will be the first node in the lowest numbered
hash bucket that contains a node.
End AARM Notes.
AARM Note:
function Next (Position : Cursor) return Cursor;
In a typical implementation, this will return the next node in a bucket; if
Position is the last node in a bucket, this will return the first node in the
next non-empty bucket.
A typical implementation will need to a keep a pointer at the map container
in the cursor in order to implement this function.
End AARM Note.
function Equivalent_Keys (Left, Right : Cursor)
return Boolean;
Equivalent to Equivalent_Keys (Key (Left), Key (Right)).
function Equivalent_Keys (Left : Cursor;
Right : Key_Type) return Boolean;
Equivalent to Equivalent_Keys (Key (Left), Right).
function Equivalent_Keys (Left : Key_Type;
Right : Cursor) return Boolean;
Equivalent to Equivalent_Keys (Left, Key (Right)).
procedure Iterate
(Container : in Map;
Process : not null access procedure (Position : in Cursor));
In addition to the semantics described in A.18.4, Program_Error is propagated
if Process.all calls Reserve_Capacity.
AARM Note:
See Iterate for vectors for a suggested implementation of the check.
We have to include Reserve_Capacity here, as rehashing probably will change
the order that elements are stored in the map.
End AARM Notes.
function Capacity (Container : Map) return Count_Type;
Returns the capacity of Container.
procedure Reserve_Capacity (Container : in out Map;
Capacity : in Count_Type);
Reserve_Capacity allocates a new hash table such that the length of the
resulting map can become at least the value Capacity without requiring an
additional call to Reserve_Capacity, and is large enough to hold the current
length of Map. Reserve_Capacity then rehashes the nodes in Container onto the
new hash table. It replaces the old hash table with the new hash table, and
then deallocates the old hash table. Any exception raised during allocation is
propagated and Container is not modified.
AARM Notes: This routine is used to preallocate the internal hash table to the
specified capacity such that future Inserts do not require expansion of the
hash table. Therefore, the implementation should allocate the needed memory to
make that true at this point, even though the visible semantics could be
preserved by waiting until enough elements are inserted.
While Reserve_Capacity can be used to reduce the capacity of a map, we do not
specify whether an implementation actually supports reduction of the capacity.
Since the actual capacity can be anything greater than or equal to Count, an
implementation never has to reduce the capacity.
End AARM Notes
Implementation Advice
If N is the length of a map, the average time complexity of the subprograms
Element, Insert, Include, Replace, Delete, Exclude and Find that take a key
parameter should be O(log N). The average time complexity of the subprograms
that take a cursor parameter should be O(1). The average time complexity of
Reserve_Capacity should be O(N).
AARM Note
We do not mean to overly constrain implementation strategies here. However, it
is important for portability that the performance of large containers has
roughly the same factors on different implementations. If a program is moved to
an implementation for which Find is O(N), that program could be unusable when
the maps are large. We allow O(log N) access because the proportionality
constant and caching effects are likely to be larger than the log factor, and
we don't want to discourage innovative implementations.
A.18.6 The Package Containers.Ordered_Maps
Static Semantics
The package Containers.Ordered_Maps has the following declaration:
generic
type Key_Type is private;
type Element_Type is private;
with function "<" (Left, Right : Key_Type) return Boolean is <>;
with function "=" (Left, Right : Element_Type) return Boolean is <>;
package Ada.Containers.Ordered_Maps is
pragma Preelaborate (Ordered_Maps);
type Map is tagged private;
type Cursor is private;
Empty_Map : constant Map;
No_Element : constant Cursor;
function "=" (Left, Right : Map) return Boolean;
function Length (Container : Map) return Count_Type;
function Is_Empty (Container : Map) return Boolean;
procedure Clear (Container : in out Map);
function Key (Position : Cursor) return Key_Type;
function Element (Position : Cursor) return Element_Type;
procedure Query_Element
(Position : in Cursor;
Process : not null access procedure (Key : in Key_Type;
Element : in Element_Type));
procedure Update_Element
(Position : in Cursor;
Process : not null access procedure (Key : in Key_Type;
Element : in out Element_Type));
procedure Replace_Element (Position : in Cursor;
By : in Element_Type);
procedure Move (Target : in out Map;
Source : in out Map);
procedure Insert (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type;
Position : out Cursor;
Inserted : out Boolean);
procedure Insert (Container : in out Map;
Key : in Key_Type;
Position : out Cursor;
Inserted : out Boolean);
procedure Insert (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type);
procedure Include (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type);
procedure Replace (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type);
procedure Delete (Container : in out Map;
Key : in Key_Type);
procedure Delete (Container : in out Map;
Position : in out Cursor);
procedure Delete_First (Container : in out Map);
procedure Delete_Last (Container : in out Map);
procedure Exclude (Container : in out Map;
Key : in Key_Type);
function Contains (Container : Map;
Key : Key_Type) return Boolean;
function Find (Container : Map;
Key : Key_Type) return Cursor;
function Element (Container : Map;
Key : Key_Type) return Element_Type;
function Floor (Container : Map;
Key : Key_Type) return Cursor;
function Ceiling (Container : Map;
Key : Key_Type) return Cursor;
function First (Container : Map) return Cursor;
function First_Key (Container : Map) return Key_Type;
function First_Element (Container : Map) return Element_Type;
function Last (Container : Map) return Cursor;
function Last_Key (Container : Map) return Key_Type;
function Last_Element (Container : Map) return Element_Type;
function Next (Position : Cursor) return Cursor;
procedure Next (Position : in out Cursor);
function Previous (Position : Cursor) return Cursor;
procedure Previous (Position : in out Cursor);
function Has_Element (Position : Cursor) return Boolean;
function "<" (Left, Right : Cursor) return Boolean;
function ">" (Left, Right : Cursor) return Boolean;
function "<" (Left : Cursor; Right : Key_Type) return Boolean;
function ">" (Left : Cursor; Right : Key_Type) return Boolean;
function "<" (Left : Key_Type; Right : Cursor) return Boolean;
function ">" (Left : Key_Type; Right : Cursor) return Boolean;
procedure Iterate
(Container : in Map;
Process : not null access procedure (Position : in Cursor));
procedure Reverse_Iterate
(Container : in Map;
Process : not null access procedure (Position : in Cursor));
private
... --
end Ada.Containers.Ordered_Maps;
Two keys K1 and K2 are equivalent if both K1 < K2 and K2 < K1 return False,
using the generic formal less-than operator for keys.
Functions "<" and "=" on Key_Type values are expected to return the same result
value each time they are called with a particular pair of key values. If A = B
returns True, then B = A is expected to also return True. If A < B returns
True, then B < A is expected to return False. For any two equivalent elements,
"=" is expected to return True. If "<" or "=" behaves in some other manner, the
behavior of this package is unspecified. Which subprograms of this package call
"<" and "=", and how many times these functions are called, is unspecified.
AARM Notes
The implementation is not required to protect against "<" or "=" raising an
exception, or returning random results, or any other "bad" behavior. It's not
practical to do so, and a broken "<" or "=" function makes the container
unusable.
The implementation can call "<" and "=" whenever it is needed; we don't want to
specify how often that happens. The result must remain the same (these are
logically pure functions), or the behavior is unspecified.
End AARM Notes
If the value of a key stored in a map is changed other than by an operation in
this package such that at least one of "<" or "=" give different results, the
behavior of this package is unspecified.
AARM Notes
The implementation is not required to protect against changes to key values
other than via the operations declared in the Ordered_Maps package.
To see how this could happen, imagine an instantiation of Ordered_Maps package
where the key type is an access-to-variable type and "<" returns a value
derived from comparing the components of the designated objects. Then, any
operation that has a key value (even if the key value is constant) could modify
those components and change the result of "<":
Key (Map).Some_Component := New_Value;
This is really a design error on the part of the user of the map; it shouldn't
be possible to modify keys stored in a map such that "<" changes. But we
can't prevent this error anymore than we can prevent someone passing as "<"
a routine that produces random answers.
End AARM Notes
The first node of a nonempty map is the one whose key is less than the key of
all the other nodes in the map. The last node of a nonempty map is the one
whose key is greater than the key of all the other elements in the map. The
successor of a node is the node with the smallest key that is larger than the
key of the given node. The predecessor of a node is the node with the largest
key that is smaller than the key of the given node. All comparisons are done
using the generic formal less-than operator for keys.
procedure Delete_First (Container : in out Map);
If Container is empty, Delete_First has no effect. Otherwise the node
designated by First (Container) is removed from Container.
procedure Delete_Last (Container : in out Map);
If Container is empty, Delete_Last has no effect. Otherwise the node designated
by Last (Container) is removed from Container.
function Floor (Container : Map;
Key : Key_Type) return Cursor;
Floor searches for the last node whose key is not greater than Key. If such a
node is found, a cursor that designates it is returned. Otherwise No_Element is
returned.
function Ceiling (Container : Map;
Key : Key_Type) return Cursor;
Ceiling searches for the first node whose key is not less than Key, using the
generic formal less-than operator for keys. If such a node is found, a cursor
that designates it is returned. Otherwise No_Element is returned.
function First_Key (Container : Map) return Key_Type;
Equivalent to Key (First (Container)).
function First_Element (Container : Map) return Element_Type;
Equivalent to Element (First (Container)).
function Last (Container : Map) return Cursor;
Returns a cursor that designates the last node in Container. If Container is
empty, returns No_Element.
function Last_Key (Container : Map) return Key_Type;
Equivalent to Key (Last (Container)).
function Last_Element (Container : Map) return Element_Type;
Equivalent to Element (Last (Container)).
function Previous (Position : Cursor) return Cursor;
If Position equals No_Element, then Previous returns No_Element. Otherwise
Previous returns the a cursor designating the node that precedes the one
designated by Position. If Position designates the first element, then Previous
returns No_Element.
procedure Previous (Position : in out Cursor);
Equivalent to Position := Previous (Position).
function "<" (Left, Right : Cursor) return Boolean;
Equivalent to Key (Left) < Key (Right).
function ">" (Left, Right : Cursor) return Boolean;
Equivalent to Key (Right) < Key (Left).
function "<" (Left : Cursor; Right : Key_Type) return Boolean;
Equivalent to Key (Left) < Right.
function ">" (Left : Cursor; Right : Key_Type) return Boolean;
Equivalent to Right < Key (Left).
function "<" (Left : Key_Type; Right : Cursor) return Boolean;
Equivalent to Left < Key (Right).
function ">" (Left : Key_Type; Right : Cursor) return Boolean;
Equivalent to Key (Right) < Left.
procedure Reverse_Iterate
(Container : in Map;
Process : not null access procedure (Position : in Cursor));
Iterates over the nodes in Container as per Iterate, with the difference that
the nodes are traversed in predecessor order, starting with the last node.
Implementation Advice
If N is the length of a map, then the worst-case time complexity of the
Element, Insert, Include, Replace, Delete, Exclude and Find operations that take
a key parameter should be O((log N)**2) or better. The worst-case time
complexity of the subprograms that take a cursor parameter should be O(1).
AARM Note
A balanced (red-black) tree for keys has O(log N) worst-case performance. Note
that a O(N) worst-case implementation (like a list) would be wrong. We do not
mean to overly constrain implementation strategies here. However, it is
important for portability that the performance of large containers has roughly
the same factors on different implementations. If a program is moved to an
implementation that takes O(N) to find elements, that program could be unusable
when the maps are large. We allow the extra log N factors because the
proportionality constant and caching effects are likely to be larger than the
log factor, and we don't want to discourage innovative implementations.
A.18.7 Sets
The language-defined packages Containers.Hashed_Sets and
Containers.Ordered_Sets provide private types Set and Cursor, and a set of
operations for each type. A set container allows elements of an arbitrary type
to be stored without duplication. A hashed set uses a hash function to organize
elements, while an ordered set orders its element per a specified relation.
This section describes the declarations that are common to both kinds of sets.
See A.18.8 for a description of the semantics specific to
Containers.Hashed_Sets and A.18.9 for a description of the semantics specific
to Containers.Ordered_Sets.
The type Set is used to represent sets. The type Set needs finalization (see
7.6).
A set contains elements. Set cursors designate elements. There exists an
equivalence relation on elements, whose definition is different for hashed sets
and ordered sets. A set never contains two or more equivalent elements. The
length of a set is the number of elements it contains.
Each nonempty set has two particular elements called the first element and
the last element (which may be the same). Each element except for the last
element has a successor element. If there are no other intervening
operations, starting with the first element and repeatedly going to the
successor element will visit each element in the map exactly once until the
last element is reached. The exact definition of these terms is different for
hashed sets and ordered sets.
Empty_Set represents the empty Set object. It has a length of 0. If an object
of type Set is not otherwise initialized, it is initialized to the same
value as Empty_Set.
No_Element represents a cursor that designates no element. If an object of type
Cursor is not otherwise initialized, it is initialized to the same
value as No_Element.
function "=" (Left, Right : Set) return Boolean;
If Left and Right denote the same set object, then the function returns True.
If Left and Right have different lengths, then the function returns False.
Otherwise, for each element E in Left, the function returns False if an element
equivalent to E is not present in Right. If the function has not returned a
result after checking all of the elements, it return True. Any exception raised
during evaluation of element equivalence is propagated.
function Length (Container : Set) return Count_Type;
Returns the number of elements in Container.
function Is_Empty (Container : Set) return Boolean;
Equivalent to Length (Container) = 0.
procedure Clear (Container : in out Set);
Removes all the elements from Container. Any exception raised during
deallocation of storage is propagated.
function Element (Position : Cursor) return Element_Type;
If Position equals No_Element, then Constraint_Error is propagated. Otherwise,
Element returns the element designated by Position.
procedure Query_Element
(Position : in Cursor;
Process : not null access procedure (Element : in Element_Type));
If Position equals No_Element, then Constraint_Error is propagated. Otherwise,
Query_Element calls Process.all with the element designated by Position as the
argument.
procedure Replace_Element (Container : in Set;
Position : in Cursor;
By : in Element_Type);
If Position equals No_Element, then Constraint_Error is propagated.
If Position does not designate an element in Container, then Program_Error
is propagated. Otherwise, the element designated by Position is tested for
equivalence to By; if equivalent Replace assigns By to the element designated
by Position. Otherwise, the element designated by Position is removed from the
container, then By is inserted into the container. If the insertion fails,
Program_Error is propagated.
procedure Move (Target : in out Set;
Source : in out Set);
Move first clears Target. Then each element from Source is removed from Source
and inserted into Target. The length of Source is 0 after a successful call to
Move.
procedure Insert (Container : in out Set;
New_Item : in Element_Type;
Position : out Cursor;
Inserted : out Boolean);
Insert checks if an element equivalent to New_Item is already present in
Container. If a match is found, Inserted is set to False and Position
designates the matching element. Otherwise, Insert adds New_Item to Container;
Inserted is set to True and Position designates the newly-inserted element. Any
exception raised during allocation is propagated and Container is not modified.
procedure Insert (Container : in out Set;
New_Item : in Element_Type);
Insert inserts New_Item into Container as per the four-parameter Insert, with
the difference that if an element equivalent to New_Item is already in the set,
then Constraint_Error is propagated.
AARM Note: This is equivalent to:
declare
Inserted : Boolean; C : Cursor;
begin
Insert (Container, New_Item, C, Inserted);
if not Inserted then
raise Constraint_Error;
end if;
end;
but doesn't require the hassle of out parameters.
procedure Include (Container : in out Set;
New_Item : in Element_Type);
Include inserts New_Item into Container as per the four-parameter Insert, with
the difference that if an element equivalent to New_Item is already in the set,
then it is replaced. Any exception raised during assignment is propagated.
procedure Replace (Container : in out Set;
New_Item : in Element_Type);
Replace checks if an element equivalent to New_Item is already in the set. If a
match is found, that element is replaced with New_Item; otherwise,
Constraint_Error is propagated.
procedure Delete (Container : in out Set;
Item : in Element_Type);
Delete checks if an element equivalent to Item is present in Container. If a
match is found, Delete removes the element from the set; otherwise,
Constraint_Error is propagated.
procedure Delete (Container : in out Set;
Position : in out Cursor);
If Position equals No_Element, Delete has no effect. If Position does not
designate an element in Container, then Program_Error is propagated. Otherwise,
Delete removes the node designated by Position from the set. Position is set to
No_Element on return.
AARM Note: The check on Position checks that the cursor does not belong to some
other set. This check implies that a reference to the set is included in the
cursor value. This wording is not meant to require detection of dangling
cursors; such cursors are defined to be invalid, which means that execution is
erroneous, and any result is allowed (including not raising an exception).
procedure Exclude (Container : in out Set;
Item : in Element_Type);
Exclude checks if an element equivalent to Item is present in Container. If a
match is found, Exclude removes the element from the set.
function Contains (Container : Set;
Item : Element_Type) return Boolean;
Equivalent to Find (Container, Item) /= No_Element.
function Find (Container : Set;
Item : Element_Type) return Cursor;
If Length (Container) equals 0, then Find returns No_Element. Otherwise, Find
checks if an element equivalent to Item is present in Container. If a match is
found, a cursor designating the matching element is returned; otherwise,
No_Element is returned.
function First (Container : Set) return Cursor;
If Length (Container) = 0, then First returns No_Element. Otherwise, First
returns a cursor that designates the first node in Container.
function Next (Position : Cursor) return Cursor;
Returns a cursor that designates the successor of the element designated by
Position. If Position designates the last element, then No_Element is returned.
If Position equals No_Element, then No_Element is returned.
procedure Next (Position : in out Cursor);
Equivalent to Position := Next (Position).
function Has_Element (Position : Cursor) return Boolean;
Returns True if Position designates an element, and returns False otherwise.
AARM Note: To Be Honest: This function may not detect cursors that designate
deleted elements; such cursors are invalid (see below).
procedure Iterate
(Container : in Set;
Process : not null access procedure (Position : in Cursor));
Iterate calls Process.all with a cursor that designates each element in
Container, starting with the first node and moving the cursor according to the
successor relation. Any exception raised by Process.all is propagated.
Program_Error is propagated if:
* Process.all attempts to insert or delete elements from Container; or
* Process.all finalizes Container; or
* Process.all calls Move with Container as a parameter.
AARM Note:
This check takes place when the operations that insert or delete elements, etc.
are called.
See Iterate for vectors for a suggested implementation of the check.
End AARM Notes.
procedure Union (Target : in out Set;
Source : in Set);
Union inserts into Target the elements of Source that are not equivalent to
some element already in Target.
AARM Note: If the objects are the same, the result is the same as the
original object. The implementation needs to take care so that aliasing effects
do not make the result trash; Union (S, S); must work.
function Union (Left, Right : Set) return Set;
Returns a set comprising all of the elements of Left, and the elements of Right
that are not equivalent to some element of Left.
procedure Intersection (Target : in out Set;
Source : in Set);
Union deletes from Target the elements of Target that are not equivalent to
some element of Source.
AARM Note: If the objects are the same, the result is the same as the
original object. The implementation needs to take care so that aliasing effects
do not make the result trash; Intersection (S, S); must work.
function Intersection (Left, Right : Set) return Set;
Returns a set comprising all the elements of Left that are equivalent to
the some element of Right.
procedure Difference (Target : in out Set;
Source : in Set);
If Target denotes the same object as Source, then Difference clears Target.
Otherwise, it deletes from Target the elements that are equivalent to some
element of Source.
function Difference (Left, Right : Set) return Set;
Returns a set comprising the elements of Left that are not equivalent to some
element of Right.
procedure Symmetric_Difference (Target : in out Set;
Source : in Set);
If Target denotes the same object as Source, then Symmetric_Difference clears
Target. Otherwise, it deletes from Target the elements that are equivalent to
some element of Source, and inserts into Target the elements of Source that are
not equivalent to some element of Target.
function Symmetric_Difference (Left, Right : Set) return Set;
Returns a set comprising the elements of Left that are not equivalent to some
element of Right, and the elements of Right that are not equivalent to some
element of Left.
function Overlap (Left, Right : Set) return Boolean;
If an element of Left is equivalent to some element of Right, then Overlap
returns True. Otherwise it returns False.
AARM Notes: This operation is commutative. If Overlap returns False, the two
sets are disjoint.
function Is_Subset (Subset : Set;
Of_Set : Set) return Boolean;
If an element of Subset is not equivalent to some element of Of_Set, then
Is_Subset returns False. Otherwise it returns True.
AARM Note: This operation is not commutative, so we use parameter names that
make it clear in named notation which set is which.
Both Containers.Hashed_Set and Containers.Ordered_Set declare a nested generic
package Generic_Keys, which provides operations that allow set manipulation in
terms of a key (typically, a portion of an element) instead of a complete
element. The formal function Key of Generic_Keys extracts a key value from an
element. It is expected to return the same value each time it is called with a
particular element. The behavior of Generic_Keys is unspecified if Key behaves
in some other manner.
A key is expected to unambiguously determine one equivalence class for elements.
The behavior of Generic_Keys is unspecified if the formal parameters of this
package behave in some other manner.
The subprograms in package Generic_Keys named Contains, Find, Element, Delete,
and Exclude, are equivalent to the corresponding subprograms in the parent
package, with the difference that the Key parameter is used locate an element in
the set.
procedure Replace (Container : in out Set;
Key : in Key_Type;
New_Item : in Element_Type);
Equivalent to Replace (Container, Find (Container, Key), New_Item).
function Key (Position : Cursor) return Key_Type;
Equivalent to Key (Element (Position)).
procedure Checked_Update_Element
(Container : in out Set;
Position : in Cursor;
Process : not null access procedure (Element : in out Element_Type));
If Position equals No_Element, then Constraint_Error is propagated. If Position
does not designate an element in Container, then Program_Error is propagated.
Otherwise, Checked_Update_Element uses Key to save the key value K of the
element designated by Position. Checked_Update_Element then calls Process.all
with that element as the argument. After Process.all returns,
Checked_Update_Element checks if K determines the same equivalence class as
that for the new element; if not, the element is removed from the set, then
inserted back into the set. If the insertion fails, Program_Error is
propagated.
AARM Note: The key check insures that the invariants of the set are preserved
by the modification. If not, we try to re-insert the element at the correct
place. If that fails, we have no choice other than to drop the element
completely, because saving a value to roll back to would be expensive and
defeat the purpose of this routine (which is cheap, in-place modifications).
If Element_Type is unconstrained and definite, then the Element parameter
of Process.all shall be unconstrained.
AARM Note: This means that the elements cannot be directly allocated from the
heap (nor aliased unless AI-363 is included in the Amendment); it must be
possible to change the discriminants of the element in place.
Erroneous Execution
A Cursor value is invalid if any of the following have occurred since it was
created:
* The set that contains the element it designates has been finalized;
* The set that contains the element it designates has been used as the
Source or Target of a call to Move;
* The element it designates has been deleted from the set.
The result of "=" or Has_Element is unspecified if these functions are called
with an invalid cursor parameter. Execution is erroneous if any other
subprogram declared in Containers.Hashed_Sets or Containers.Ordered_Sets is
called with an invalid cursor parameter.
Execution also is erroneous if the called Process.all procedure of a call to
Query_Element or Checked_Update_Element executes an operation that causes the
Position cursor of Query_Element or Checked_Update_Element to become invalid.
AARM Notes: The list above is intended to be exhaustive. In other cases, a
cursor value continues to designate its original element. For instance, cursor
values survive the insertion and deletion of other elements.
While it is possible to check for these cases, in many cases the overhead
necessary to make the check is substantial in time or space. Implementations
are encouraged to check for as many of these cases as possible and raise
Program_Error if detected.
End AARM Notes.
Implementation Requirements
No storage associated with a Set object shall be lost upon assignment or scope
exit.
A.18.8 The Package Containers.Hashed_Sets
Static Semantics
The package Containers.Hashed_Sets has the following declaration:
generic
type Element_Type is private;
with function Hash (Element : Element_Type) return Hash_Type;
with function Equivalent_Elements (Left, Right : Element_Type)
return Boolean;
package Ada.Containers.Hashed_Sets is
pragma Preelaborate (Hashed_Sets);
type Set is tagged private;
type Cursor is private;
Empty_Set : constant Set;
No_Element : constant Cursor;
function "=" (Left, Right : Set) return Boolean;
function Length (Container : Set) return Count_Type;
function Is_Empty (Container : Set) return Boolean;
procedure Clear (Container : in out Set);
function Element (Position : Cursor) return Element_Type;
procedure Query_Element
(Position : in Cursor;
Process : not null access procedure (Element : in Element_Type));
procedure Replace_Element (Container : in Set;
Position : in Cursor;
By : in Element_Type);
procedure Move (Target : in out Set;
Source : in out Set);
procedure Insert (Container : in out Set;
New_Item : in Element_Type;
Position : out Cursor;
Inserted : out Boolean);
procedure Insert (Container : in out Set;
New_Item : in Element_Type);
procedure Include (Container : in out Set;
New_Item : in Element_Type);
procedure Replace (Container : in out Set;
New_Item : in Element_Type);
procedure Delete (Container : in out Set;
Item : in Element_Type);
procedure Delete (Container : in out Set;
Position : in out Cursor);
procedure Exclude (Container : in out Set;
Item : in Element_Type);
function Contains (Container : Set;
Item : Element_Type) return Boolean;
function Find (Container : Set;
Item : Element_Type) return Cursor;
function First (Container : Set) return Cursor;
function Next (Position : Cursor) return Cursor;
procedure Next (Position : in out Cursor);
function Has_Element (Position : Cursor) return Boolean;
function Equivalent_Elements (Left, Right : Cursor)
return Boolean;
function Equivalent_Elements (Left : Cursor;
Right : Element_Type)
return Boolean;
function Equivalent_Elements (Left : Element_Type;
Right : Cursor)
return Boolean;
procedure Iterate
(Container : in Set;
Process : not null access procedure (Position : in Cursor));
procedure Union (Target : in out Set;
Source : in Set);
function Union (Left, Right : Set) return Set;
function "or" (Left, Right : Set) return Set renames Union;
procedure Intersection (Target : in out Set;
Source : in Set);
function Intersection (Left, Right : Set) return Set;
function "and" (Left, Right : Set) return Set renames Intersection;
procedure Difference (Target : in out Set;
Source : in Set);
function Difference (Left, Right : Set) return Set;
function "-" (Left, Right : Set) return Set renames Difference;
procedure Symmetric_Difference (Target : in out Set;
Source : in Set);
function Symmetric_Difference (Left, Right : Set) return Set;
function "xor" (Left, Right : Set) return Set
renames Symmetric_Difference;
function Overlap (Left, Right : Set) return Boolean;
function Is_Subset (Subset : Set;
Of_Set : Set) return Boolean;
function Capacity (Container : Set) return Count_Type;
procedure Reserve_Capacity (Container : in out Set;
Capacity : in Count_Type);
generic
type Key_Type (<>) is limited private;
with function Key (Element : in Element_Type) return Key_Type;
with function Hash (Key : Key_Type) return Hash_Type;
with function Equivalent_Keys (Left : Key_Type; Right : Element_Type)
return Boolean;
package Generic_Keys is
function Contains (Container : Set;
Key : Key_Type)
return Boolean;
function Find (Container : Set;
Key : Key_Type)
return Cursor;
function Key (Position : Cursor) return Key_Type;
function Element (Container : Set;
Key : Key_Type)
return Element_Type;
procedure Replace (Container : in out Set;
Key : in Key_Type;
New_Item : in Element_Type);
procedure Delete (Container : in out Set;
Key : in Key_Type);
procedure Exclude (Container : in out Set;
Key : in Key_Type);
procedure Checked_Update_Element
(Container : in out Set;
Position : in Cursor;
Process : not null access procedure (Element : in out Element_Type));
function Equivalent_Keys (Left : Cursor;
Right : Key_Type)
return Boolean;
function Equivalent_Keys (Left : Key_Type;
Right : Cursor)
return Boolean;
end Generic_Keys;
private
... --
end Ada.Containers.Hashed_Sets;
An object of type Set contains an expandable hash table, which is used to
provide direct access to elements. The capacity of an object of type Set is
the maximum number of elements that can be inserted into the hash table prior
to it being automatically expanded.
Two elements E1 and E2 are defined to be equivalent if Equivalent_Elements
(E1, E2) returns True.
Function Hash is expected to return the same value each time it is called with
a particular element value. For any two equivalent elements, Hash is expected
to return the same value. If Hash behaves in some other manner, the behavior of
this package is unspecified. Which subprograms of this package call Hash, and
how many times they call it, is unspecified.
Function Equivalent_Elements is expected to return the same value each time
it is called with a particular pair of element values. For any two elements E1
and E2, the boolean values Equivalent_Keys (E1, E2) and Equivalent_Key (E2, E1)
are expected to be equal. If Equivalent_Elements behaves in some other manner,
the behavior of this package is unspecified. Which subprograms of this package
call Equivalent_Elements, and how many times they call it, is unspecified.
If the value of an element stored in a set is changed other than by an
operation in this package such that at least one of Hash or Equivalent_Elements
give different results, the behavior of this package is unspecified.
AARM Note
See Hashed_Maps for a suggested implementation, and for justification of the
restrictions regarding Hash and Equivalent_Elements. Note that sets only need
to store elements, not key/element pairs.
Which elements are the first element and the last element of a set, and which
element is the successor of a given element, are unspecified, other than the
general semantics described in A.18.7.
procedure Clear (Container : in out Set);
In addition to the semantics described in A.18.7, Clear does not affect the
capacity of Container.
procedure Insert (Container : in out Set;
New_Item : in Element_Type;
Position : out Cursor;
Inserted : out Boolean);
In addition to the semantics described in A.18.7, if Length (Container) equals
Capacity (Container), then Insert first calls Reserve_Capacity to increase the
capacity of Container to some larger value.
function First (Container : Set) return Cursor;
If Length (Container) = 0, then First returns No_Element. Otherwise, First
returns a cursor that designates the first hashed element in Container.
function Equivalent_Elements (Left, Right : Cursor)
return Boolean;
Equivalent to Equivalent_Elements (Element (Left), Element (Right)).
function Equivalent_Elements (Left : Cursor;
Right : Element_Type) return Boolean;
Equivalent to Equivalent_Elements (Element (Left), Right).
function Equivalent_Elements (Left : Element_Type;
Right : Cursor) return Boolean;
Equivalent to Equivalent_Elements (Left, Element (Right)).
procedure Iterate
(Container : in Set;
Process : not null access procedure (Position : in Cursor));
In addition to the semantics described in A.18.7, Program_Error is propagated
if Process.all calls Reserve_Capacity.
function Capacity (Container : Set) return Count_Type;
Returns the capacity of Container.
procedure Reserve_Capacity (Container : in out Set;
Capacity : in Count_Type);
Reserve_Capacity allocates a new hash table such that the length of the
resulting set can become at least the value Capacity without requiring an
additional call to Reserve_Capacity, and is large enough to hold the current
length of Set. Reserve_Capacity then rehashes the nodes in Container onto the
new hash table. It replaces the old hash table with the new hash table, and
then deallocates the old hash table. Any exception raised during allocation is
propagated and Container is not modified.
For any element E, the function Generic_Keys.Hash must be such that Hash (E) =
Generic_Keys.Hash (Key (E)). If Key or Generic_Keys.Hash behave in some other
manner, the behavior of Generic_Keys is unspecified. Which subprograms of
Generic_Keys call Generic_Keys.Hash, and how many times they call it, is
unspecified.
For any two elements E1 and E2, the boolean values Equivalent_Element (E1, E2),
Equivalent_Keys (Key (E1), E2) and Equivalent_Keys (Key (E2), E1) are all
expected to be equal. If Key or Equivalent behave in some other manner, the
behavior of Generic_Keys is unspecified. Which subprograms of Generic_Keys call
Equivalent, and how many times they call it, is unspecified.
function Equivalent_Keys (Left : Cursor;
Right : Key_Type) return Boolean;
Equivalent to Equivalent_Keys (Key (Left), Right).
function Equivalent_Keys (Left : Key_Type;
Right : Cursor) return Boolean;
Equivalent to Equivalent_Keys (Left, Key (Right)).
Implementation Advice
If N is the length of a set, the average time complexity of the subprograms
Insert, Include, Replace, Delete, Exclude and Find that take an element
parameter should be O(log N). The average time complexity of the subprograms
that take a cursor parameter should be O(1). The average time complexity of
Reserve_Capacity should be O(N).
AARM Note:
See Hashed_Maps for implementation notes regarding some of the operations of
this package.
A.18.9 The Package Containers.Ordered_Sets
Static Semantics
The package Containers.Ordered_Sets has the following declaration:
generic
type Element_Type is private;
with function "<" (Left, Right : Element_Type) return Boolean is <>;
with function "=" (Left, Right : Element_Type) return Boolean is <>;
package Ada.Containers.Ordered_Sets is
pragma Preelaborate (Ordered_Sets);
type Set is tagged private;
type Cursor is private;
Empty_Set : constant Set;
No_Element : constant Cursor;
function "=" (Left, Right : Set) return Boolean;
function Length (Container : Set) return Count_Type;
function Is_Empty (Container : Set) return Boolean;
procedure Clear (Container : in out Set);
function Element (Position : Cursor) return Element_Type;
procedure Query_Element
(Position : in Cursor;
Process : not null access procedure (Element : in Element_Type));
procedure Replace_Element (Container : in Set;
Position : in Cursor;
By : in Element_Type);
procedure Move (Target : in out Set;
Source : in out Set);
procedure Insert (Container : in out Set;
New_Item : in Element_Type;
Position : out Cursor;
Inserted : out Boolean);
procedure Insert (Container : in out Set;
New_Item : in Element_Type);
procedure Include (Container : in out Set;
New_Item : in Element_Type);
procedure Replace (Container : in out Set;
New_Item : in Element_Type);
procedure Delete (Container : in out Set;
Item : in Element_Type);
procedure Exclude (Container : in out Set;
Item : in Element_Type);
procedure Delete (Container : in out Set;
Position : in out Cursor);
procedure Delete_First (Container : in out Set);
procedure Delete_Last (Container : in out Set);
procedure Union (Target : in out Set;
Source : in Set);
function Union (Left, Right : Set) return Set;
function "or" (Left, Right : Set) return Set renames Union;
procedure Intersection (Target : in out Set;
Source : in Set);
function Intersection (Left, Right : Set) return Set;
function "and" (Left, Right : Set) return Set renames Intersection;
procedure Difference (Target : in out Set;
Source : in Set);
function Difference (Left, Right : Set) return Set;
function "-" (Left, Right : Set) return Set renames Difference;
procedure Symmetric_Difference (Target : in out Set;
Source : in Set);
function Symmetric_Difference (Left, Right : Set) return Set;
function "xor" (Left, Right : Set) return Set renames
Symmetric_Difference;
function Overlap (Left, Right : Set) return Boolean;
function Is_Subset (Subset : Set;
Of_Set : Set) return Boolean;
function Contains (Container : Set;
Item : Element_Type) return Boolean;
function Find (Container : Set;
Item : Element_Type)
return Cursor;
function Floor (Container : Set;
Item : Element_Type)
return Cursor;
function Ceiling (Container : Set;
Item : Element_Type)
return Cursor;
function First (Container : Set) return Cursor;
function First_Element (Container : Set) return Element_Type;
function Last (Container : Set) return Cursor;
function Last_Element (Container : Set) return Element_Type;
function Next (Position : Cursor) return Cursor;
procedure Next (Position : in out Cursor);
function Previous (Position : Cursor) return Cursor;
procedure Previous (Position : in out Cursor);
function Has_Element (Position : Cursor) return Boolean;
function "<" (Left, Right : Cursor) return Boolean;
function ">" (Left, Right : Cursor) return Boolean;
function "<" (Left : Cursor; Right : Element_Type)
return Boolean;
function ">" (Left : Cursor; Right : Element_Type)
return Boolean;
function "<" (Left : Element_Type; Right : Cursor)
return Boolean;
function ">" (Left : Element_Type; Right : Cursor)
return Boolean;
procedure Iterate
(Container : in Set;
Process : not null access procedure (Position : in Cursor));
procedure Reverse_Iterate
(Container : in Set;
Process : not null access procedure (Position : in Cursor));
generic
type Key_Type (<>) is limited private;
with function Key (Element : Element_Type) return Key_Type;
with function "<" (Left : Key_Type; Right : Element_Type)
return Boolean is <>;
with function ">" (Left : Key_Type; Right : Element_Type)
return Boolean is <>;
package Generic_Keys is
function Contains (Container : Set;
Key : Key_Type) return Boolean;
function Find (Container : Set;
Key : Key_Type)
return Cursor;
function Floor (Container : Set;
Item : Key_Type)
return Cursor;
function Ceiling (Container : Set;
Item : Key_Type)
return Cursor;
function Key (Position : Cursor) return Key_Type;
function Element (Container : Set;
Key : Key_Type)
return Element_Type;
procedure Replace (Container : in out Set;
Key : in Key_Type;
New_Item : in Element_Type);
procedure Delete (Container : in out Set;
Key : in Key_Type);
procedure Exclude (Container : in out Set;
Key : in Key_Type);
function "<" (Left : Cursor; Right : Key_Type)
return Boolean;
function ">" (Left : Cursor; Right : Key_Type)
return Boolean;
function "<" (Left : Key_Type; Right : Cursor)
return Boolean;
function ">" (Left : Key_Type; Right : Cursor)
return Boolean;
procedure Checked_Update_Element
(Container : in out Set;
Position : in Cursor;
Process : not null access procedure
(Element : in out Element_Type));
end Generic_Keys;
private
... --
end Ada.Containers.Ordered_Sets;
Two elements E1 and E2 are equivalent if both E1 < E2 and E2 < E1 return
False, using the generic formal less-than operator for elements.
Functions "<" and "=" on Element_Type values are expected to return the same
result value each time they are called with a particular pair of element values.
If A = B returns True, then B = A is expected to also return True. If A < B
returns True, then B < A is expected to return False. For any two equivalent
elements, "=" is expected to return True. If "<" or "=" behaves in some other
manner, the behavior of this package is unspecified. Which subprograms of this
package call "<" and "=", and how many times these functions are called, is
unspecified.
If the value of an element stored in a set is changed other than by an
operation in this package such that at least one of "<" or "=" give different
results, the behavior of this package is unspecified.
AARM Note
See Ordered_Maps for a suggested implementation, and for justification of the
restrictions regarding "<" and "=". Note that sets only need to store elements,
not key/element pairs.
The first element of a nonempty set is the one which is less than all the other
elements in the set. The last element of a nonempty set is the one which is
greater than all the other elements in the set. The successor of an element is
the smallest element that is larger than the given element. The predecessor of
an element is the largest element that is smaller than the given element. All
comparisons are done using the generic formal less-than operator for elements.
procedure Delete_First (Container : in out Set);
If Container is empty, Delete_First has no effect. Otherwise the element
designated by First (Container) is removed from Container.
procedure Delete_Last (Container : in out Set);
If Container is empty, Delete_Last has no effect. Otherwise the element
designated by Last (Container) is removed from Container.
function Floor (Container : Set;
Item : Element_Type) return Cursor;
Floor searches for the last element which is not greater than Item. If such an
element is found, a cursor that designates it is returned. Otherwise No_Element
is returned.
function Ceiling (Container : Set;
Item : Element_Type) return Cursor;
Ceiling searches for the first element which is not less than Item. If such an
element is found, a cursor that designates it is returned. Otherwise No_Element
is returned.
function First_Element (Container : Set) return Element_Type;
Equivalent to Element (First (Container)).
function Last (Container : Set) return Cursor;
Returns a cursor that designates the last node in Container. If Container is
empty, returns No_Element.
function Last_Element (Container : Set) return Element_Type;
Equivalent to Element (Last (Container)).
function Previous (Position : Cursor) return Cursor;
If Position equals No_Element, then Previous returns No_Element. Otherwise
Previous returns a cursor designating the element that precedes the one
designated by Position. If Position designates the first element, then Previous
returns No_Element.
procedure Previous (Position : in out Cursor);
Equivalent to Position := Previous (Position).
function "<" (Left, Right : Cursor) return Boolean;
Equivalent to Element (Left) < Element (Right).
function ">" (Left, Right : Cursor) return Boolean;
Equivalent to Element (Right) < Element (Left).
function "<" (Left : Cursor; Right : Element_Type) return Boolean;
Equivalent to Element (Left) < Right.
function ">" (Left : Cursor; Right : Element_Type) return Boolean;
Equivalent to Right < Element (Left).
function "<" (Left : Element_Type; Right : Cursor) return Boolean;
Equivalent to Left < Element (Right).
function ">" (Left : Element_Type; Right : Cursor) return Boolean;
Equivalent to Element (Right) < Left.
procedure Reverse_Iterate
(Container : in Set;
Process : not null access procedure (Position : in Cursor));
Iterates over the elements in Container as per Iterate, with the difference
that the elements are traversed in predecessor order, starting with the last
node.
For any two elements E1 and E2, the boolean values (E1 < E2), (Key(E1) < E2),
and (Key(E2) > E1) are all expected to be equal. If Key, Generic_Keys."<", or
Generic_Keys.">" behave in some other manner, the behavior of this package is
unspecified. Which subprograms of this package call Key, Generic_Keys."<" and
Generic_Keys.">", and how many times the functions are called, is unspecified.
In addition to the semantics described in A.18.7, the subprograms in package
Generic_Keys named Floor, Ceiling, "<", and ">", are equivalent to the
corresponding subprograms in the parent package, with the difference that the
Key subprogram parameter is compared to elements in the container using the
Generic_Keys generic formal relational operators.
Implementation Advice
If N is the length of a set, then the worst-case time complexity of the
Insert, Include, Replace, Delete, Exclude and Find operations that take an
element parameter should be O((log N)**2) or better. The worst-case time
complexity of the subprograms that take a cursor parameter should be O(1).
AARM Note:
See Ordered_Maps for implementation notes regarding some of the operations of
this package.
A.18.10 The Package Containers.Indefinite_Vectors
The language-defined package Containers.Indefinite_Vectors provides a
private type Vector and a set of operations. It provides the same
operations as the package Vectors does, with the difference that the
generic formal Element_Type is indefinite.
Static Semantics
The declaration of the library package Containers.Indefinite_Vectors has
the same contents as Containers.Vectors except:
* The generic formal Element_Type is indefinite.
* The Element parameter of access subprogram Process of Update_Element
may be constrained even if Element_Type is unconstrained.
A.18.11 The Package Containers.Indefinite_Doubly_Linked_Lists
The language-defined package Containers.Indefinite_Doubly_Linked_Lists
provides private types List and Cursor, and a set of operations for each
type. It provides the same operations as the package Doubly_Linked_Lists
does, with the difference that the generic formal Element_Type is
indefinite.
Static Semantics
The declaration of the library package
Containers.Indefinite_Doubly_Linked_Lists has the same contents as
Containers.Doubly_Linked_Lists except:
* The generic formal Element_Type is indefinite.
* The procedure with the profile:
procedure Insert (Container : in out List;
Before : in Cursor;
Position : out Cursor;
Count : in Count_Type := 1);
is omitted.
AARM Note: This procedure is omitted because there is no way to create a
default-initialized object of an indefinite type. We considered having this
routine insert an empty element similar to the empty elements of a vector,
but rejected this possibility because the semantics are fairly complex and
very different from the existing case. That would make it more error-prone
to convert a container from a definite type to an indefinite type; by
omitting the routine completely, any problems will be diagnosed by the
compiler.
* The Element parameter of access subprogram Process of Update_Element
may be constrained even if Element_Type is unconstrained.
A.18.12 The Package Containers.Indefinite_Hashed_Maps
The language-defined package Containers.Indefinite_Hashed_Maps provides
a map with the same operations as the package Hashed_Maps, with the
difference that the generic formal types Key_Type and Element_Type are
indefinite.
Static Semantics
The declaration of the library package Containers.Indefinite_Hashed_Maps
has the same contents as Containers.Hashed_Maps except:
* The generic formal Key_Type is indefinite.
* The generic formal Element_Type is indefinite.
* The procedure with the profile:
procedure Insert (Container : in out Map;
Key : in Key_Type;
Position : out Cursor;
Inserted : out Boolean);
is omitted.
AARM Note: This procedure is omitted because there is no way to create a
default-initialized object of an indefinite type. We considered having this
routine insert an empty element similar to the empty elements of a vector,
but rejected this possibility because the semantics are fairly complex and
very different from the existing case. That would make it more error-prone
to convert a container from a definite type to an indefinite type; by
omitting the routine completely, any problems will be diagnosed by the
compiler.
* The Element parameter of access subprogram Process of Update_Element
may be constrained even if Element_Type is unconstrained.
A.18.13 The Package Containers.Indefinite_Ordered_Maps
The language-defined package Containers.Indefinite_Ordered_Maps provides
a map with the same operations as the package Ordered, with the
difference that the generic formal types Key_Type and Element_Type are
indefinite.
Static Semantics
The declaration of the library package Containers.Indefinite_Ordered_Maps
has the same contents as Containers.Ordered_Maps except:
* The generic formal Key_Type is indefinite.
* The generic formal Element_Type is indefinite.
* The procedure with the profile:
procedure Insert (Container : in out Map;
Key : in Key_Type;
Position : out Cursor;
Inserted : out Boolean);
is omitted.
AARM Note: This procedure is omitted because there is no way to create a
default-initialized object of an indefinite type. We considered having this
routine insert an empty element similar to the empty elements of a vector,
but rejected this possibility because the semantics are fairly complex and
very different from the existing case. That would make it more error-prone
to convert a container from a definite type to an indefinite type; by
omitting the routine completely, any problems will be diagnosed by the
compiler.
* The Element parameter of access subprogram Process of Update_Element
may be constrained even if Element_Type is unconstrained.
A.18.14 The Package Containers.Indefinite_Hashed_Sets
The language-defined package Containers.Indefinite_Hashed_Sets provides
a set with the same operations as the package Hashed_Sets, with the
difference that the generic formal type Element_Type is indefinite.
Static Semantics
The declaration of the library package Containers.Indefinite_Hashed_Sets
has the same contents as Containers.Hashed_Sets except:
* The generic formal Element_Type is indefinite.
* The Element parameter of access subprogram Process of
Checked_Update_Element may be constrained even if Element_Type is
unconstrained.
A.18.15 The Package Containers.Indefinite_Ordered_Sets
The language-defined package Containers.Indefinite_Ordered_Sets provides
a set with the same operations as the package Ordered_Sets, with the
difference that the generic formal type Element_Type is indefinite.
Static Semantics
The declaration of the library package Containers.Indefinite_Ordered_Sets
has the same contents as Containers.Ordered_Sets except:
* The generic formal Element_Type is indefinite.
* The Element parameter of access subprogram Process of
Checked_Update_Element may be constrained even if Element_Type is
unconstrained.
A.18.16 Array Sorting
The language-defined procedures Containers.Generic_Array_Sort and
Containers.Generic_Constrained_Array_Sort provide sorting on
arbitrary array types.
Static Semantics
The procedure Containers.Generic_Array_Sort has the following declaration:
generic
type Index_Type is (<>);
type Element_Type is private;
type Array_Type is array (Index_Type range <>) of Element_Type;
with function "<" (Left, Right : Element_Type)
return Boolean is <>;
procedure Ada.Containers.Generic_Array_Sort (Container : in out Array_Type);
pragma Pure (Ada.Containers.Generic_Array_Sort);
Reorders the elements of Container such that the elements are
sorted smallest first as determined by the generic formal less-than operator
provided. Any exception raised during evaluation of less-than is propagated.
AARM Notes
This implies swapping the elements, usually including an intermediate copy.
This of course means that the elements will be copied. Since the elements are
non-limited, this usually will not be a problem. Note that there is
Implementation Advice below that the implementation should use
a sort that minimizes copying of elements.
The sort is not required to be stable (and the fast algorithm required will
not be stable). If a stable sort is needed, the user can include the original
location of the element as an extra "sort key". We considered requiring the
implementation to do that, but it is mostly extra overhead -- usually there
is something already in the element that provides the needed stability.
The procedure Containers.Generic_Constrained_Array_Sort has the following
declaration:
generic
type Index_Type is (<>);
type Element_Type is private;
type Array_Type is array (Index_Type) of Element_Type;
with function "<" (Left, Right : Element_Type)
return Boolean is <>;
procedure Ada.Containers.Generic_Constrained_Array_Sort
(Container : in out Array_Type);
pragma Pure (Ada.Containers.Generic_Constrained_Array_Sort);
Reorders the elements of Container such that the elements are
sorted smallest first as determined by the generic formal less-than operator
provided. Any exception raised during evaluation of less-than is propagated.
Implementation Advice
The worst-case time complexity of a call on an instantiation of
Containers.Generic_Array_Sort or Containers.Generic_Constrained_Array_Sort
should be O(N**2) or better, and the average time complexity should be better
than O(N**2).
AARM Note
In other words, we're requiring the use of a sorting algorithm better than
O(N**2), such as Quicksort. No Bubble sorts allowed!
Containers.Generic_Array_Sort and Containers.Generic_Constrained_Array_Sort
should minimize copying of elements.
AARM Note - To Be Honest
We do not mean "absolutely minimize" here; we're not intending to require a
single copy for each element. Rather, we want to suggest that the sorting
algorithm chosen is one that does not copy items unnecessarily. Bubble sort
would not meet this advice, for instance.
!example
A.18.2 The Package Containers.Vectors
Append is the canonical method for inserting items into a vector
container:
procedure Copy (A : Array_Subtype) is
V : Vector;
begin
Reserve_Capacity (V, Capacity => A'Length);
for I in A'range loop
Append (V, New_Item => A (I));
end loop;
...
end Copy;
The Reserve_Capacity operation tells the vector object to preallocate an internal
array having at least the capacity specified. If you need to perform many
repeated insertions, then if you know the ultimate length apriori you
should always call Reserve_Capacity beforehand. This is more efficient because it
allocates the internal array once, and therefore avoids the repeated
reallocation, copying, and deallocation cycles that might be necessary
otherwise as the array is expanded.
Instead of appending new items to the back of the vector, another technique
is to declare the vector object with the requisite length immediately:
procedure Copy (A : Array_Subtype) is
V : Vector := To_Vector (Count => A'Length);
J : Index_Subtype'Base := Index_Subtype'First;
begin
for I in A'range loop
Replace_Element (V, Index => J, By => A (I));
J := J + 1;
end loop;
...
end Copy;
Here the elements of the vector are initialized to an "empty" value.
When the element value is replaced using Replace_Element, the state of
the element changes from empty to non-empty.
The procedure Set_Length can be used to change the length of a vector
outside of a declarative region. (One could use To_Vector too, of
course, but it's more efficient to use Set_Length, since that operation
allocates a new array only if the capacity of the existing array is too
small.)
You can use a vector to implement a stack in the traditional way:
package Stacks is new Ada.Containers.Vectors (ET);
use Stacks;
Stack : Stacks.Vector;
procedure Push (E : in ET) is
begin
Append (Stack, New_Item => E);
end;
function Top return ET is
begin
return Last_Element (Stack);
end;
procedure Pop is
begin
Delete_Last (Stack);
end;
The Insert_Space operation essentially opens up a "hole" in the middle
of the internal array. It's more efficient to do it this way than
inserting items one-at-a-time, because the sliding is done only once.
For example, we can copy an array (or any other container) into a vector
at some arbitrary position like this:
procedure Copy
(A : in Array_Subtype;
V : in out Vector;
I : in Extended_Index) is
J : Extended_Index := I;
begin
Insert_Space (V, Before => I, Count => A'Length); --
for Index in A'range loop
Replace_Element (V, J, By => A (Index)); --
J := J + 1;
end loop;
...
end Copy;
You can think of Clear as "removing" the elements in the container, but
of course it does not really remove them. The elements that were
formerly active simply now become inactive. In particular, the internal
array is not altered, and no "finalization" of the active elements
occurs. (Of course, the elements are finalized when the master of the
vector object is left.) If this is required, then then user must effect
this himself prior to clearing the vector. Here is one way to do that:
procedure Finalize (Element : in out My_Element_Type) is ...;
procedure My_Clear (V : in out Vector) is
begin
for I in First_Index (V) .. Last_Index (V) loop
Update_Element (V, I, Finalize'access);
end loop;
Clear (V);
end My_Clear;
Here we use the Update_Element modifier, and pass Finalize as the Process
parameter.
The internal array is managed by the implementation, and there is no
requirement that it has any particular capacity; the only requirement is that
the actual capacity is the same or larger than the most recent call to
Reserve_Capacity. Thus Reserve_Capacity (V, 0) might, but might not, deallocate the
internal array. If you want to clear the vector and also deallocate the
internal array, you can use Move:
procedure Clear_And_Deallocate (V : in out Vector) is
V2 : Vector; --
begin
Clear (V); --
Move (Target => V, Source => V2); --
end;
The internal array that belonged to V is deallocated, and the null (or
otherwise small) array of V2 is moved into V.
These, however, are exceptional cases. Usually it is best to ignore the
existence of the internal array altogther, and just let the container
implementation manage memory (other than possibly setting an initial capacity).
If some sort of finalization of the last element is necessary prior to
its "removal" by a deletion operation, the programmer is responsible for
effecting this action prior to calling the operation. As an example,
suppose we have a vector whose elements are access objects, and we need
to deallocate the element when it is "popped" from the vector using
Delete_Last. We can do that like this:
procedure Pop (V : in out Vector) is
procedure Free is --
new Ada.Unchecked_Deallocation (T, T_Access);
procedure Process (V : in out Vector) is --
begin
Free (V);
end Process;
begin
Update_Element (V, Index => Last_Index (V), Process => Process'access);
Delete_Last (V);
end Pop;
The First_Index and Last_Index selectors allow iteration over a vector
analogous to iteration over an array, using the loop machinery provided
by the language:
procedure Op (V : in Vector) is
procedure Process (E : in Element_Subtype) is ...;
begin
for I in First_Index (V) .. Last_Index (V) loop
Process (E => Element (V, I));
end loop;
end Op;
We could also use the cursor-based operations to do the same thing.
procedure Op (V : in Vector) is
procedure Process (E : in Element_Subtype) is ...;
I : Cursor := First (V);
begin
while Has_Element (I) loop
Process (E => Element (I));
Next (I);
end loop;
end Op;
Here we iterate over all of the vector elements. Alternatively we could
use a passive iterator:
procedure Op (V : in Vector) is
procedure Process (E : in Element_Subtype) is ...;
procedure Process (I : in Cursor) is
begin
Process (E => Element (I));
end;
begin
Iterate (V, Process'access);
end Op;
The Update_Element operation is very important, as it allows
in-place modification of elements. For example, suppose we have a vector
whose elements are lists, and we want to append an item to the list
element at a specified vector position. We can do that as follows:
procedure Append
(V : Vector_of_Lists.Vector;
I : Index_Subtype;
E : Element_Subtype) is
procedure Process (L : in out List) is
begin
Append (L, New_Item => E);
end;
begin
Update_Element (V, Index => I, Process => Process'access);
end;
If we have a container whose elements are vectors, we can use
Update_Element in combination with Move to insert a vector onto the
container without actually copying the vector. (Actually, this is true
for all containers -- not just vectors.) Suppose that we have a list of
vectors:
procedure Op (L : in List_of_Vectors.List) is
V : Vector;
procedure Move_V (E : in out Vector) is
begin
Clear (E); --
Move (Target => E, Source => V);
end;
begin
Append (V, New_Item => E);
... --
Append (L, New_Item => Empty_Vector);
--
Update_Element (Last (L), Move_V'access);
end;
A new, default-initialized vector element is appended to L by
Insert, and then immediately replaced by moving the internal array
of vector V into that new, empty element, without any copying.
If ordinary assignment of elements is acceptable, then Replace_Element
allows array-like modification of vector elements:
procedure Op (V : in out Vector) is
I : Index_Subtype := ...;
E : Element_Subtype := ...;
begin
Insert (V, Before => I, New_Item => E);
... --
Replace_Element (V, Index => I, By => E); --
end;
All containers are non-limited, and hence allow ordinary assignment. In
the unique case of a vector, there is a separate assignment procedure:
Assign (Target => V1, Source => V2);
The reason is that the model for a vector is that it's implemented using
an unconstrained array. During ordinary assignment, the internal array
is deallocated (during controlled finalization), and then a new internal
array is allocated (during controlled adjustment) to store a copy of the
elements from the source vector. However, assignment would probably be
more efficient if it were able to hold on to the existing array, and
simply copy the elements of the source onto that array, instead of
having to allocate a new one. This is exactly how Assign works.
A.18.3 The Package Containers.Doubly_Linked_Lists
You can use a doubly-linked list to implement a queue in the traditional
way:
package Queues is new Ada.Containers.Doubly_Linked_Lists (ET);
use Queues;
Queue : Queues.List;
procedure Push (E : in ET) is
begin
Append (Queue, New_Item => E);
end;
function Top return ET is
begin
return First_Element (Queue);
end;
procedure Pop is
begin
Delete_First (Queue);
end;
The doubly-linked list container allows iteration in both directions.
To iterate forward you start at first and increment the cursor:
procedure Op (L : in List) is
C : Cursor := First (L);
begin
while Has_Element (C) loop
Process (C);
Next (C);
end loop;
end;
To iterate in reverse you start at last and decrement the cursor:
procedure Op (L : in List) is
C : Cursor := Last (C);
begin
while Has_Element (C) loop
Process (C);
Previous (C);
end loop;
end;
Note that in both cases the iteration terminates by falling off the end
of the list, at which point the cursor assumes the distinguished value
No_Element, and Has_Element returns False.
All of the containers have an operation to swap a pair of elements in
the container:
procedure Swap
(V : in out Vector_Of_Lists.Vector;
I, J : in Extended_Index) is
begin
Vector_Of_Lists.Swap (V, I, J); --
end;
For the definite form (the case here), this will make a copy of the
element in order to perform the swap. However, we would prefer not to
make a copy of this element, which is a list and hence potentially large
(or otherwise expensive to copy). To avoid making a copy, we can use
Move and nested processing routines:
procedure Swap
(V : in out Vector_Of_Lists.Vector;
I, J : in Extended_Index) is
procedure Process_I (IE : in out List) is
procedure Process_J (JE : in out List) is
IE_Temp : List;
begin
Move (Target => IE_Temp, Source => IE);
Move (Target => IE, Source => JE);
Move (Target => JE, Source => IE_Temp);
end;
begin
Update_Element (V, Index => J, Process => Process_J'access);
end;
begin
Update_Element (V, Index => I, Process => Process_I'access);
end Swap;
To do the swap, we need direct visibility to both objects, so nested
process procedures are required. We first move the list element at index I
of the vector into a temporary. This allows another vector to by copied
into that index position. We do that, moving the list element at index
J into (empty) list element at index I. We then move the temporary
(holding the list that was originally at index I) into the vector at
index position J.
It's often the case that during an insertion you don't have an item
value to assign to the new element, and instead you want simply to
insert a new element (initialized to the default value for its type, if
applicable) and then modify it directly. For example:
procedure Op (L : in out List) is
procedure Process (E : in out ET) is
begin
... --
end;
C : Cursor;
begin
Insert --
(Container => L,
Before => No_Element, --
Position => C); --
--
Update_Element (C, Process'access);
end Op;
A.18.5 The Package Containers.Hashed_Maps
It's often the case that you know apriori the total number of elements
you intend to insert into the map. Under these circumstances you should
always Reserve_Capacity the map first (similar to a vector container), and then
perform the insertions. This preallocates a hash table that has the proper
capacity, and thus avoids the automatic rehashing that occurs
during normal insertion to preserve the load factor. For example:
procedure Op (N : Count_Type) is
M : Map_Types.Map; --
Position : Map_Types.Cursor;
Inserted : Boolean;
begin
Reserve_Capacity (M, Capacity => N); --
for I in 1 .. N loop
Insert --
(Container => Map,
Key => New_Key (I),
New_Item => New_Element (I),
Position => Position,
Inserted => Inserted);
end loop;
...
end Op;
Note that Clear doesn't delete the internal hash table -- it just
deletes the nodes hanging off the hash table.
The simplest and fastest way to iterate over all the elements in the map
is to use a passive iterator:
procedure Op (M : in Map_Types.Map) is
procedure Process (C : in Map_Types.Cursor) is
K : Key_Subtype := Key (C);
E : Element_Subtype := Element (C);
begin
... --
end;
begin
Iterate (M, Process'access);
end;
You could of course implement this function yourself, by iterating over
the items in the map:
procedure Op (M : in Map_Types.Map) is
procedure Process (C : in Map_Types.Cursor) is ...;
C : Map_Types.Cursor := First (M);
begin
while Has_Element (C) loop
Process (C);
Next (C);
end loop;
end Op;
However, a manual loop probably isn't as efficient as a passive iterator
(especially for a hashed map), because the passive iterator can store
implementation-specific context in order to find elements in sequence
more quickly.
generic algorithms are typically written to work with iterators this way:
generic
type Cursor is private;
with function Next (C : Cursor) return Cursor is <>;
with procedure Process (C : Cursor) is <>;
with function Has_Element (C : Cursor) return Boolean is <>;
procedure Generic_Algorithm (First : in Cursor);
The implementation would look something like this:
procedure Generic_Algorithm (First : in Cursor) is
C : Cursor := First;
begin
while Has_Element (C) loop
...
Process (C);
...
C := Next (C);
end loop;
end Generic_Algorithm;
The benefit is that this algorithm will work with any "sequence of
items," which just means any container with a cursor having the
requisite properties, as specified in the generic formal region. The
virtue of this approach is that it abstracts-away the container. The
generic algorithm above (and others like it) works with all the
containers in the library -- it even works for built-in array types.
To make this work with a map, we can just instantiate with an appropriate
Process operation:
procedure Op (M : in Map_Types.Map) is
procedure Process (C : Cursor) is ...;
procedure Algorithm is
new Generic_Algorithm (Map_Types.Cursor); --
begin
Algorithm (First (Map));
end;
In a POSIX OS that supports real-time signals, the OS will deliver a
payload-carrying signal to the app. In the case of a socket, when I/O
completes asynchronously, the OS delivers an RT signal that specifies
the file descriptor of the socket whose I/O completed.
The problem is that I typically declare the socket as part the
representation of some abstraction that gets allocated dynamically, and
therefore I have no idea which object the socket belonged to, so I have
no idea how to act on the information the OS is providing me.
The abstraction I have in mind looks like this:
package Sessions is
type Session_Type (<>) is limited private;
function Session_Access is access all Session_Type;
function Setup_Session return Session_Type; --
procedure Play
(Session : access Session_Type;
Stream : in String);
...
procedure IO_Completion (Session : access Session_Type);
private
type Session_Type is limited record
Socket : Socket_Type;
...;
end Session_Type;
end Sessions;
What I need to do is correlate the file descriptor reported in the RT
signal to a session object. With a map it's almost trivial. In the body
I can instantiate the map as follows. First we make a hash function for
socket descriptors:
function Hash_FD (fd : in C.int) return Ada.Containers.Hash_Type is
begin
return Ada.Containers.Hash_Type (fd); --
end;
Next we instantiate the hashed map package using our hash function:
package FD_Map_Types is
new Ada.Containers.Hashed_Maps
(Key_Type => C.int,
Element_Type => Session_Access,
Hash => Hash_FD,
Equivalent_Keys => C."=");
Now I can declare a map object in the body:
package Sessions is
...
FD_Map : FD_Map_Types.Map;
When I allocate a new session object, this opens the socket. A socket
object has a selector function to return its file descriptor. I use
this as the key to insert the session object into the map:
function Setup_Session return Session_Access is
Session : constant Session_Access := new Session_Type;
Position : FD_Map_Types.Cursor;
Inserted : Boolean;
begin
Open (Session.Socket, ...);
Insert
(Container => FD_Map,
Key => FD (Session.Socket),
New_Item => Session,
Position => Position,
Inserted => Inserted);
...
return Session;
end;
Now that the session object has inserted itself into the map, I can use
map lookup to find that session when I receive a signal. Something
like:
procedure Handle_Signal (Siginfo : in Siginfo_Type)
FD : constant C.int := Siginfo.FD;
C : constant Cursor := Find (FD_Map, Key => FD);
begin
if C /= No_Element then --
IO_Completion (Element (C));
end if;
end Handle_Signal;
and then the session object reacts to the I/O completion accordingly.
Hashed maps with type String as the key are nearly ubiquitous. The
canonical example is of course the word-frequency problem, in which
"words" (using some suitable definition of delimiter) are counted as
they are scanned in the input stream.
We can solve this problem easily using the indefinite form of the hashed
map, with string as the key and subtype Natural as the element:
with Ada.Strings.Hash_String;
package Wordcount_Maps is
new Ada.Containers.Indefinite_Hashed_Maps
(Key_Type => String,
Element_Type => Natural,
Hash => Ada.Strings.Hash, --
Equivalent_Keys => "="); --
Wordcount_Map : Wordcount_Maps.Map;
Here we've specified the hash function for strings provided by the
library. The scanning phase looks like this:
Open (File, In_File, Name);
Scan_Lines:
while not End_Of_File (File) loop
Get_Line (File, Line, Line_Last);
Line_First := Line'First;
Scan_Line:
loop
Find_Token (Line (Line_First .. Line_Last), ...);
exit when Word_Last = 0;
--
Insert (Word => Line (Word_First .. Word_Last));
Line_First := Word_Last + 1;
end loop Scan_Line;
end loop Scan_Lines;
Now all we have to do is implement Insert. That function looks like this:
procedure Insert (Word : String) is
procedure Increment (Word : in String;
Count : in out Natural) is
begin
Count := Count + 1;
end;
Position : Wordcount_Maps.Cursor;
Inserted : Boolean;
begin --
Insert
(Container => Wordcount_Map,
Key => To_Lower (Word),
New_Item => 0, --
Position => Position,
Inserted => Inserted);
Update_Element (Position, Increment'access);
end Insert;
Map (and set) insertion works conditionally. It searches the container
to determine whether there is an equivalent key already in the map.
Note that in the example above, the New_Item parameter has the value 0.
This is deliberate. What happens is that if the word is already in the
map, then the insertion "fails" in the sense that no new node is
allocated. Rather, Insert reports the fact that the key was already in
the map (by returning the value False for Inserted), and a cursor that
designates the node with the matching key.
But not inserting a new node is exactly the behavior we want. In the
case of a word already in the map, the cursor returned designates an
existing word/count pair, whose count is non-zero. When we update the
count object, we simply increment its value.
However, the word might not be in the map, in which case the insertion
"succeeds," which means a new node is inserted whose element is
initialized to the value of New_Item, which here is 0. Position
designates the newly-inserted element (really, it designates the node
containing that key/element pair). When we update the element, the count
has the value 0, and so by incrementing it the count gets set to the
correct value 1.
Conditional insertion is a necessary feature of any efficient map
abstraction. It makes no sense to search for the element (via Find, say)
to determine whether it's in the map, and if it's not in the map call a
separate operation to insert it. This would be horribly inefficient
because the lookup done by insert would only duplicate the lookup just
done by the search.
To dump the contents of the map, you can use the passive iterator:
declare
procedure Process (C : in Wordcount_Maps.Cursor) is
begin
Put (Key (C)); Put (':'); Put (Element (C)); New_Line;
end;
begin
Iterate (Map, Process'access);
end;
This would display the words in their order in the hashed map. That's
probably not what you want (especially for a well-performing hash table,
which would scatter keys everywhere), which is to display them in order
by frequency. We can do that easily enough by populating an array with
map cursors, and then sorting the array according to element value:
procedure Print_Results (Histogram : in Wordcount_Map) is
type Cursor_Array is
array (Count_Type range <>) of Wordcount_Maps.Cursor;
Cursors : Cursor_Array (1 .. Length (Histogram));
I : Count_Type := Cursors'First;
procedure Process (C : in Wordcount_Maps.Cursor) is
begin
Cursors (I) := C;
I := Count_Type'Succ (I);
end;
begin --
Iterate (Histogram, Process'access); --
... --
end Print_Results;
Here we use the passive iterator for maps to populate the array. As
with all containers, it's usually simpler and more efficient to use a
passive iterator if you're going to traverse all the elements in the
container.
We now need to sort the array of cursors, and to do that we need an
order relation for cursors. We want to sort the elements in reverse
order, so that largest histogram count is listed first in the output. We
can define the order relation like this:
Sort_Array:
declare
function "<" (L, R : Wordcount_Maps.Cursor)
return Boolean is
begin
return Element (L) > Element (R); --
end;
procedure Sort is
new Generic_Array_Sort
(Index_Type => Count_Type,
Element_Type => Word_Count_Maps.Cursor,
Array_Type => Cursor_Array); --
begin
Sort (Cursors);
end Sort_Array;
We can do better though: suppose that for counts that are equal, we want
break the tie by listing the items in alphabetic order of the words. We
only have fix our order relation to compare keys, too:
Sort_Array:
declare
function "<" (L, R : Wordcount_Maps.Cursor)
return Boolean is
begin
if Element (L) = Element (R) then
return Key (L) < Key (R); --
else
return Element (L) > Element (R); --
end if;
end;
procedure Sort is
new Generic_Array_Sort
(Index_Type => Count_Type,
Element_Type => Word_Count_Maps.Cursor,
Array_Type => Cursor_Array); --
begin
Sort (Cursors);
end Sort_Array;
To display the results, we iterate through the sorted cursor array:
Print_Array:
for Index in Cursors'range loop
declare
C : constant Wordcount_Maps.Cursor := Cursors (Index);
begin
Put (Element (C), Width => 0); --
Put (':');
Put (Key (C)); --
New_Line;
end;
end loop Print_Array;
As another example, consider a client that connects to a streaming media
server to play a video. The server opens the file and then transmits
frames to the client. Ultra-efficient file I/O is usually done by
memory-mapping the sections on disk. Typically, a server maintains its
own file cache, so that many clients streaming the same file can share
the memory-mapped sections.
When the client request comes in, the server must look up the file by
name in order to see if it's in cache. If it's already in cache, then
we increment a reference count. If it's not, we create some context for
the file and create a new entry for it in the cache.
Suppose the context type looks like this:
type Context_Type is limited record
File : File_Type;
Ref_Count : Natural;
...;
end record;
type Context_Access is access Context_Type;
Our cache is just a map, indexed by file name:
package File_Cache_Types is
new Ada.Containers.Indefinite_Hashed_Maps
(Key_Type => String,
Element_Type => Context_Access,
Hash => Hash_String_Case_Insensitive,
Equivalent_Keys => Equal_String_Case_Insensitive);
File_Cache : File_Cache_Types.Map;
The naive way to implement the lookup is:
procedure Setup (Name : in String) is
Position : File_Cache_Types.Cursor :=
Find (File_Cache, Key => Name);
Inserted : Boolean;
Context : Context_Access;
begin --
if Position = No_Element then
Context := new Context_Type;
Context.Ref_Count := 0;
... --
Insert
(Container => File_Cache,
Key => Name,
New_Item => Context,
Position => Cursor,
Inserted => Inserted);
else
Context := Element (Position);
end if;
Context.Ref_Count := Context.Ref_Count + 1;
... --
end Setup;
The problem is that we're duplicating work: we first searched for the
file context in the cache, and if it wasn't found we insert a new entry,
which just searches again. The correct way to do this is as follows:
procedure Setup (Name : in String) is
Position : File_Cache_Types.Cursor;
Not_In_Cache : Boolean;
Context : Context_Access;
begin
Insert
(Container => File_Cache,
Key => Name,
New_Item => null, --
Position => Position,
Inserted => Not_In_Cache);
if Not_In_Cache then
pragma Assert (Element (Position) = null);
Context := new Context_Type;
Context.Ref_Count := 0;
... --
Replace_Element (Position, By => Context);
else
Context := Element (Position);
end if;
Context.Ref_Count := Context.Ref_Count + 1;
... --
end Setup;
Here we make an insertion attempt, by trying to insert a null context
into the map. If it's already in the map, then the insertion fails, but
that's just what we want to happen, because we wish to share the file
already in cache. If it's not in the map, the insertion succeeds, by
creating a slot for this file (the context is null), which we just fill
in with a newly-allocated context object.
In the RTSP protocol, requests are sent to a server in (clear) text. To
create a session, a client connects to the server and sends it a SETUP
request, e.g.
SETUP rtsp://mheaney/thematrix.avi RTSP/1.0
Transport: ...
Each RTSP session has a "session id," which is a random string of
characters at least 8 characters long. When a SETUP request doesn't
specify a session id, this is interpreted to mean that the client wishes
to create a new session.
On the server side, it allocates a session object (described above), and
generates a session id for it. The representation of the session object
looks like this:
type Session_Type is limited record
Id : String (1 .. 8);
...;
end record;
And the session constructor looks like:
function Setup_Session return Session_Access is
Session : constant Session_Access := new Session_Type;
begin
Generate_Id (Session.Id);
... --
end;
The server responds by including the session id in the response:
RTSP/1.0 200 OK
Session: HAiye8-r
And thereafter, a client sends requests to a session by specifying a
session id header:
PLAY rtsp://mheaney/thematrix.avi RTSP/1.0
Range: npt=101.42-
Session: HAiye8-r
The server-side problem is this. When the server receives the request
from the client, it parses the request and looks for a session id. The
problem then becomes finding the session object that is associated with
that unique id. There might very well be hundreds of session objects,
so whatever method we use has to run fast. (The is a real-time
streaming server, and RTSP request/response overhead must be kept to a
minimum.)
What we do is declare a string-key map object that uses the session id
as the key, and a Session_Access as the element, like this:
package Id_Maps is
new Ada.Containers.Indefinite_Hashed_Maps
(Key_Type => String,
Element_Type => Session_Access,
Hash => Ada.Strings.Hash, --
Equivalent_Keys => "="); --
Id_Map : Id_Maps.Map;
use Id_Maps;
When the session is allocated, we insert the id/session pair into the
map like this:
function Setup_Session return Session_Access is
Session : constant Session_Access := new Session_Type;
Position : Id_Map_Types.Cursor;
Inserted : Boolean;
begin
Generate_Id (Session.Id);
Insert
(Container => Id_Map,
Key => Session.Id,
New_Item => Session,
Position => Position,
Inserted => Inserted);
...
return Session;
end;
When a client issues a request, we parse out the session id, and then
forward the command to the session object associated with that session
id key:
procedure Play
(Session_Id : in String;
NPT_Range : in NPT_Range_Type;
RTSP_Status : out RTSP_Status_Type) is
Position : constant Cursor :=
Find (Id_Map, Key => Session_Id);
Session : Session_Access;
begin
if Position = No_Element then
RTSP_Status := RTSP.Session_Not_Found;
return;
end if;
Session := Element (Position);
Play (Session, NPT_Range, RTSP_Status);
end;
When we create a session object, we insert a pointer to the session
object in the Id_Map. The complementary problem is how to handle
deletion of the session object. Suppose we have a function like this:
procedure Free (X : in out Session_Access);
What's the best way to remove the object from the map? Since the
session knows its own Id, it can use the key-form of Delete:
procedure Free (X : in out Session_Access) is
begin
if X /= null then
Delete (Id_Map, Key => X.Id);
Deallocate (X);
end if;
end;
Another option is to use the cursor-form of Delete. What we can do is
cache the cursor returned by Insert as part of the state of the
session object:
function Setup_Session return Session_Access is
Session : constant Session_Access := new Session_Type;
Inserted : Boolean;
begin
Generate_Id (Session.Id);
Insert
(Container => Id_Map,
Key => Session.Id,
New_Item => Session,
Position => Session.Id_Map_Position,
Inserted => Inserted);
pragma Assert (Inserted);
pragma Assert (Key (Session.Id_Map_Position) = Session.Id);
pragma Assert (Element (Session.Id_Map_Position) = Session);
...
return Session;
end Setup_Session;
Now we can implement Free this way:
procedure Free (X : in out Session_Access) is
begin
if X /= null then
Delete (Id_Map, Position => X.Id_Map_Position);
Deallocate (X);
end if;
end;
This turns out to be a very common idiom. In the body of a package that
declares a type, you declare a set or map container to keep track of
instances of the type. As part of its representation, the type includes
a cursor that designates the node that contains this instance. When the
object is finalized, it deletes itself from the container.
In the examples we've seen, the session object inserts itself into the
map during the constructor function Setup_Session, and deletes itself
during the destructor operation Free. If the type is controlled, another
possibility is to do the insertion during Initialize, and the deletion during
Finalize. This technique would work even for stack-allocated (controlled)
objects.
Let's pretend Free doesn't remove the designated session object from the
map, but rather has its traditional semantics of merely deallocating the
object. To shutdown all the sessions, we could do this:
Id_Map : Id_Map_Types.Map;
...
procedure Shutdown_Sessions is
procedure Process (C : in Id_Map_Types.Cursor) is
begin
Update_Element (C, Free'access); --
end;
begin --
Iterate (Id_Map, Process'access); --
Clear (Id_Map);
end Shutdown_Sessions;
The passive iterator visits all the sessions in the map and Free's them;
this also sets all of the map elements to null. We then Clear the map,
which sets its length to 0.
In other examples, we have been silent about the operation of
Generate_Id, which makes a session id for us comprising a sequence of 8
random characters. One of our requirements for a session id is that it
must be unique among all the sessions currently being streamed by this
server.
Even if we use a random number generator to synthesize characters, we
still must check to ensure that this session id is unique. The obvious
way is to use Find to see if it's in the map already:
procedure Generate_Id (Id : out Id_Subtype) is
Position : Id_Maps.Cursor;
begin
loop
Synthesize_Random_String (Id);
Position := Find (Id_Map, Key => Id);
exit when Position = No_Element; --
end loop;
end Generate_Id;
The constructor for session objects generates a unique session id, and
then uses the id as the key when inserting the new session object:
function Setup_Session return Session_Access is
I : Id_Maps.Cursor;
B : Boolean;
Session : Session_Access := new Session_Type;
begin
Generate_Id (Session.Id); --
Insert
(Container => Id_Map,
Key => Session.Id,
Element => Session,
Position => I,
Inserted => B);
...
end Setup_Session;
One issue with this algorithm is that Insert duplicates the work done
just earlier by Find, when checking the newly-synthesized id for
uniqueness. A simpler way to efficiently generate a unique session id
and insert it into the map is to just check whether the insertion
succeeded:
function Setup_Session return Session_Access is
Position : Id_Maps.Cursor;
Inserted : Boolean;
Id : Id_Subtype;
begin --
loop
Generate_Id (Id);
Insert
(Container => Id_Map,
Key => Id,
Position => Position,
Inserted => Inserted);
exit when Inserted;
end loop;
...
end Setup_Session;
Here we don't need a separate call to Find, because the regular Insert
operation does a search anyway.
A.18.9 The Package Containers.Ordered_Sets
Suppose we have a set and want to copy the elements from the set into an
array. Here's one way to do it:
procedure Op (S : in Integer_Sets.Set) is
A : Integer_Array (1 .. Length (S));
I : Integer_Sets.Cursor := First (S);
J : Positive := A'First;
begin
while Has_Element (I) loop
A (J) := Element (I);
Next (I);
J := J + 1;
end loop;
...
end Op;
Here we're incrementing both the array index and the set cursor
manually. However, when you're iterating over two containers
simultaneously, you can often let one or the other drive the iteration,
so that only one position needs to be incremented manually.
We can change the example use a built-in for loop:
procedure Op (S : in Integer_Sets.Set) is
A : Integer_Array (1 .. Length (S));
I : Integer_Sets.Cursor := First (S);
begin
for J in A'range loop
A (J) := Element (I);
Next (I);
end loop;
...
end Op;
This lets the array drive the iteration. In general, however, you should
use a passive iterator in preference to an explicit loop. The reason is
that if a container knows that its elements are being traversed
sequentially, then it can use a traversal algorithm that takes advantage
of that container's unique representation. The algorithm above might run
faster if we do this:
procedure Op (S : in Integer_Sets.Set) is
A : Integer_Array (1 .. Length (S));
J : Positive := A'First;
procedure Fill_Element_of_A (I : Integer_Sets.Cursor) is
begin
A (J) := Element (I);
J := J + 1;
end;
begin
Iterate (S, Fill_Element_of_A'access);
...
end Op;
This lets the set drive the iteration.
Let's continue the streaming server examples from earlier. In TCP, a
client "connects" to a server, who is "listening" on a port known to the
client. When the server "accepts" the connection, this establishes a
dedicated connection with that client. We can reify this in code as
follows:
package Connections is
type Connection_Type (<>) is limited private;
type Connection_Access is access Connection_Type;
function Accept_Connection (Socket : Socket_Type)
return Connection_Access; --
...
end Connections;
We have a need to keep track of all of our current connections. So when
we create a new connection object, what we do is insert it in a set:
package body Connections is
function "<" (L, R : Connection_Access) return Boolean is
begin
return L.all'Address < R.all'Address;
end;
package Connection_Sets is
new Ada.Containers.Ordered_Sets (Connection_Access);
Connection_Set : Connection_Sets.Set;
function Accept_Connection (Socket : Socket_Type)
return Connection_Access is
Connection : constant Connection_Access :=
new Connect_Type;
Position : Connection_Sets.Cursor;
Inserted : Boolean;
begin
Insert
(Container => Connection_Set,
New_Item => Connection,
Position => Position,
Inserted => Inserted);
...
return Connection;
end;
...
end Connections;
When a new connection object is allocated, it is inserted into the
Connection_Set. Here insertions will always succeed because each
allocated object has a unique access value.
Now let's suppose we want to shutdown a specific connection. We can do
it like this:
procedure Shutdown (X : in out Connection_Access) is
begin
if X /= null then
Delete (Connection_Set, Item => X);
Free (X);
end if;
end Shutdown;
Here we use the item-form of Delete to simply remove the item from the
set, and then deallocate it.
Now let's suppose we want to shutdown the entire system, and so we need
to clear out all of the connection objects. We could do it like this:
procedure Shutdown_Connections is
X : Connection_Access;
begin
while not Is_Empty (Connection_Set) loop
X := First_Element (Connection_Set);
Shutdown (X); --
end loop;
end Shutdown_Connections;
Another technique would be to use an active iterator, like this:
procedure Shutdown_Connections is
I : Cursor;
X : Connection_Access;
begin
while not Is_Empty (Connection_Set) loop
I := First (Connect_Set);
X := Element (I);
Delete (Connection_Set, Position => I);
Free (X);
end loop;
end Shutdown_Connections;
Here we use the cursor-form of Delete. This is probably more efficient
than using the item-form of Delete, since the cursor-form doesn't have
to search for the item.
There are special set operations to operate on the first element. Using those
would simplify this example:
procedure Shutdown_Connections is
X : Connection_Access;
begin
while not Is_Empty (Connection_Set) loop
X := First_Element (Connect_Set);
Delete_First (Connection_Set);
Free (X);
end loop;
end Shutdown_Connections;
One of the canonical container examples is the set-of-employees.
Suppose we have an employee type defined this way:
type Employee_Type is record
SSN : SSN_Type; --
Name : Name_Type;
Home_Address : Home_Address_Type;
...;
end record;
To make a set, we need to establish an order relationship for its
elements. Since each social security number is presumably unique
(unless your identity has been stolen), we can use that to define an
order for Employee_Type:
function "<" (L, R : Employee_Type) return Boolean is
begin
return L.SSN < R.SSN;
end;
This allows us to instantiate the set in the normal way:
package Employee_Sets is
new Ada.Containers.Ordered_Sets (Employee_Type);
Employees : Employee_Sets.Set;
When someone gets a job at our firm, we add them to our little database
as follows:
procedure Hire (Name : Name_Type; SSN : SSN_Type; ...) is
Employee : Employee_Type;
Position : Employee_Sets.Cursor;
Inserted : Boolean;
begin
Employee.SSN := SSN;
Employee.Name := Name;
...
Insert (Employees, Employee, Position, Inserted);
end;
Now let's suppose that we need to modify some information for an
employee. Like a map, a set orders its elements in key order, except
that for a set the element is its own key. In the example here, the key
is really the SSN part of the employee object.
Suppose we only know the employee's social security number. How do we
find that employee? Remember that the "key" of a set is just the
element itself. One way is to synthesize a dummy employee object, and
then look it up by element type:
procedure Change_Address
(SSN : SSN_Type;
New_Home : Home_Address_Type) is
Position : Cursor;
begin
declare
Dummy : Employee_Type;
begin
Dummy.SSN := SSN;
Position := Find (Employees, Item => Dummy);
end;
if Has_Element (Position) then ...;
end;
But this is kind of a hack. I don't really want to make a dummy element
just to look up the real element. For many types synthesizing a dummy
object this way might not even be possible (say, because the type is private
and you only have visibility to the partial view of the type).
A much more elegant technique is to use the nested generic package
Generic_Keys, which allows you to explicitly name the key-part of the
element. We can instantiate that package like this:
function "<"
(Employee : Employee_Type;
SSN : SSN_Type) return Boolean is
begin
return Employee.SSN < SSN;
end;
function ">"
(Employee : Employee_Type;
SSN : SSN_Type) return Boolean is
begin
return Employee.SSN > SSN;
end;
function SSN (Employee : Employee_Type) return SSN_Type is
begin
return Employee.SSN;
end;
package SSN_Keys is
new Employee_Sets.Generic_Keys (SSN_Type, Key => SSN);
--
With this new package we can now look up the employee by his SSN
directly:
procedure Change_Address
(SSN : SSN_Type;
New_Home : Home_Address_Type) is
Position : Cursor := Find (Employees, Key => SSN);
begin
if Has_Element (Position) then ...;
end;
To actually change the employee's address in the example above, we use
the special element modifier operation:
procedure Change_Address
(SSN : SSN_Type;
New_Home : Home_Address_Type) is
procedure Set_Home
(Employee : in out Employee_Type) is
begin
Employee.Home := New_Home;
end;
Position : Cursor := Find (Employees, Key => SSN);
begin
if Has_Element (Position) then
SSN_Keys.Checked_Update_Element
(Container => Employees,
Position => Position,
Process => Set_Home'Address);
...
end if;
end Change_Address;
Modifying a set element requires special handling, since a set orders
its elements by key. Indiscriminate changes to element values are
therefore not allowed, as this would break the set abstraction if the
order relation were changed. The operation Checked_Update_Element
allows the element to be modified, but it checks to determine whether
the key has changed, and if so it moves the element to a new position
that preserves the order relation among existing elements. In this
example we only modify the Home component of the Element_Type record, so
Checked_Update_Element would not need to take any further action.
Now let's say that the employee's wallet was stolen, which contained his
social security card. In order to prevent identity theft, he needs to
apply for a new social security number, and then change his entry in the
database.
One way to do this would be to first copy the employee object and remove
it from the set, then change the value of its SSN field, and finally
(re)insert the element:
procedure Change_SSN
(Old_SSN : SSN_Type;
New_SSN : SSN_Type) is
Old_Position, New_Position : Cursor;
Inserted : Boolean;
begin
if New_SSN = Old_SSN then
return;
end if;
Old_Position := Find (Employees, Key => Old_SSN);
if not Has_Element (Old_Position) then
return;
end if;
New_Position := Find (Employees, Key => New_SSN);
if Has_Element (New_Position) then
raise Duplicate_SSN;
end if;
declare
Employee : Employee_Type := Element (Old_Position);
begin
Delete (Employees, Old_Position);
Employee.SSN := New_SSN;
Insert (Employees, Employee, New_Position, Inserted);
pragma Assert (Inserted);
end;
end Change_SSN;
Another technique is to use Checked_Update_Element, which allows the
element's key to be modified, and then moves then element to its new
relative location in the set:
procedure Change_SSN
(Old_SSN : SSN_Type;
New_SSN : SSN_Type) is
Old_Position, New_Position : Cursor;
Inserted : Boolean;
begin
if New_SSN = Old_SSN then
return;
end if;
Old_Position := Find (Employees, Key => Old_SSN);
if not Has_Element (Old_Position) then
return;
end if;
New_Position := Find (Employees, Key => New_SSN);
if Has_Element (New_Position) then
raise Duplicate_SSN;
end if;
declare
procedure Set_SSN (Employee : in out Employee_Type) is
begin
Employee.SSN := New_SSN;
end;
begin
SSN_Keys.Checked_Update_Element
(Container => Employees,
Position => Old_Position,
Process => Set_SSN'access);
end;
end Change_SSN;
Changing the key this way is probably more efficient than deleting and
then immediately re-inserting the element, since the element doesn't
need to be copied, and there is no deallocation and re-allocation of the
element's internal node. When it detects that the element's key has
been changed, Checked_Update_Element works by first removing the node
from the tree and then inserting the same node back into the tree.
Suppose now we want a list all the employees in the firm. One way to do
it is like this:
procedure Display is
procedure Print (I : in Employee_Sets.Cursor) is
procedure Do_Print (E : in Employee_Type) is
begin
Put ("Name: "); Put (E.Name);
Put ("SSN: "); Put (E.SSN);
...;
end;
begin
Query_Element (Position => I, Process => Do_Print'access);
end;
begin
Iterate (Employees, Print'access);
end;
However, this will list employees in order of their social security
number. This is probably not what we want, which is to print employees
in order of name.
One way would be to copy the elements into some other container, which
is sorted according to the criterion we desire. However, if elements
are large or otherwise not easily copied, then this is not not really an
option.
A much better way is not to copy elements directly but rather to copy
cursors that designate those elements:
procedure Display_Employees_In_Name_Order is
function "<" (L, R : Employee_Sets.Cursor)
return Boolean is
Result : Boolean;
procedure Process_LE (LE : in Employee_Type) is
procedure Process_RE (RE : in Employee_Type) is
begin
Result := LE.Name < RE.Name;
end;
begin
Query_Element (R, Process_RE'access);
end Process_LE;
begin
Query_Element (L, Process_LE'access);
return Result;
end;
type Cursor_Array is
array (Count_Type range <>) of Employee_Sets.Cursor;
procedure Sort is
new Generic_Array_Sort (Count_Type, Cursor, Cursor_Array);
procedure Do_Print (E : in Employee_Type) is
begin
Put ("Name: "); Put (E.Name);
Put ("SSN: "); Put (E.SSN);
...;
end;
C : Employee_Sets.Cursor := First (Employee_Sets);
function Get_Cursor return Employee_Sets.Cursor is
Result : Cursor := C;
begin
Next (C);
return Result;
end;
Cursors : Cursor_Array (1 .. Length (Employee_Set)) :=
(others => Get_Cursor);
begin
Sort (Cursors);
for Index in Cursors'range loop
C := Cursors (Index);
Query_Element (Position => C, Process => Do_Print'access);
end loop;
end Display_Employees_In_Name_Order;
First we use an active iterator for sets to insert a cursor designating
every employee into the array. Next we define an order for relation for
the array elements, which here are just set cursors. We wish to print
employees in order of name, so the order relation for cursors is defined
in terms of the names of the employees designated by the cursors.
Implementing the sort order relation turns out to slightly tricky,
because we don't want to make a copy of the employee just do get its
name. We use nested process routines for Query_Element to create a
context in which both employee objects are directly visible, and then
compare employee names by querying the employee elements directly.
Now that the employees (really, cursors that designate the employees)
have been sorted, we loop to traverse all the set cursors, and print
each employee (in name order) in turn.
In you were paying very careful attention to the Id_Map hashed-map
example, you might have realized that since the session object (which
was the element of the map) had an Id field, we were in fact duplicating
the Id object, since it's also stored as the key-part of the map entry.
In turns out we didn't really need to use a map. We could have used a
set, and instantiated the generic package Generic_Keys using type
String as the formal Key_Type.
function "<" (L, R : Session_Access) return Boolean is
begin
return L.Id < R.Id;
end;
package Session_Set_Types is
new Ada.Containers.Ordered_Sets (Session_Access);
--
Session_Set : Session_Set_Types.Set;
function "<"
(Session : Session_Access;
Id : String) return Boolean is
begin
return Session.Id < Id;
end;
function ">"
(Session : Session_Access;
Id : String) return Boolean is
begin
return Session.Id > Id;
end;
function Id (Session : Session_Access) return String is
begin
return Session.Id;
end;
package Id_Keys is
new Session_Set_Types.Generic_Keys (String, Key => Id);
This lets us perform session lookups based on the session identifier:
procedure Play
(Session_Id : in String;
NPT_Range : in NPT_Range_Type;
RTSP_Status : out RTSP_Status_Type) is
Position : constant Session_Set_Types.Cursor :=
Id_Keys.Find (Session_Set, Key => Session_Id);
Session : Session_Access;
begin
if Position = No_Element then
RTSP_Status := RTSP.Session_Not_Found;
return;
end if;
Session := Element (Position);
Play (Session, NPT_Range, RTSP_Status);
end;
We can insert a session object into the set in the normal way, using the
item-form of insertion:
function Setup_Session return Session_Access is
Session : constant Session_Access := new Session_Type; --
Position : Session_Set_Types.Cursor;
Inserted : Boolean;
begin
Generate_Id (Session.Id);
Insert
(Container => Sessions_Set,
New_Item => Session, --
Position => Position,
Inserted => Inserted);
...
return Session;
end;
This example also illustrates that sets and maps are essentially the
same. The only real difference is where the key lives.
--!corrigendum A.18
!ACATS Test
ACATS tests will be needed for this library.
!appendix
Report of the ARG Select Committee on Containers
February 3, 2004
Executive Summary
The committee selected the second proposal as a starting point for a
standard containers library, with a number of simple changes. The
changes were simple enough that we produced a version of the library with
the changes made (AI-00302-3/01).
The resulting proposal is not much larger than the Vector and Matrix
libraries already adopted for the standard. It also should be a good seed
for a more encompassing secondary standard.
Therefore, we recommend that the ARG adopt this alternative for the standard.
By the ARG Select Committee on Containers:
Randy Brukardt
Bob Duff
Tucker Taft
Full Report
Goals
A core library of containers is an important addition to Ada. Other competitive
programming languages include standard sets of containers, and these are
widely used. Users often note that a standard set of containers is a missing
piece of Ada. In addition, adding such containers to the standard is not
a large burden on implementers.
However, the resources available for work on the standard preclude adding a
large container library to the standard. If the library is too large, it will
be insufficiently reviewed, and that has the danger of providing something
useless.
Therefore, the committee settled on a limited set of goals:
(1) To provide a number of the most useful containers to Ada users in
a standard fashion;
(2) To provide a framework for future work in this area (hopefully leading
to a secondary or de-facto standard).
We considered other goals as well. Performance issues were deemed of secondary
importance. Most uses of containers (indeed, most software) do not have
critical performance requirements. To provide a library with the variety of
components needed to meet critical requirements (bounded and unbounded forms,
array and list implementations, etc.) would be beyond the resources available
to work on the standard. Moreover, the existence of many components actually
makes construction of simple applications harder: the programmer has to choose
a component based on performance considerations that are simply irrelevant for
the application.
Evaluation of existing proposals
We determined that the most important containers are the following:
* extensible "vectors" (like an array, indexed by any discrete type);
* (hashed) "maps" (or "hash table", with arbitrary keys);
* (sorted) "sets" (set of arbitrary items).
The names "map", "set", and "vector" are those used in the Java containers.
We evaluated the two proposals for their support of these components.
Alternative 1 (AI-302-1/07) contains a number of low level data structure
components such as Lists, Bags, Queues, etc. These can be used to create
"vector", "map", and "set" containers, but the containers themselves are
absent. Moreover, most of these components are relatively easy to create
when needed.
Alternative 2 (AI-302-2/02) contains mainly five containers: vector, list,
map, set, and multiset. These include the abstractions mentioned above. We
also determined that the basic design was consistent and sound.
Therefore, we discarded alternative 1, and concentrated on improving and
simplifying alternative 2.
We decided the sorts of changes that we would consider. The great value to
having containers in the standard is that they are standard: everybody has
them and can use them. Perfection is not required of the standard components.
Moreover, what is one person's "improvement" is another's "mistake". In
addition, we run the risk of introducing real errors by further fiddling.
Therefore, we decided to simplify the interfaces by deleting unnecessary
capabilities, by systematic substitutions, and by introducing missing
capabilities (along with general wordsmithing). In particular, we avoided
changing existing interfaces unless there was a clear error.
The specific improvements and simplifications are detailed in the Appendix.
Performance issues
For the purposes of components in the standard, the precise performance of them
is not important. Whatever the performance is will be good enough for the vast
majority of uses - in prototyping, quick and dirty programs, and the majority
of programs that aren't performance critical. Therefore, we provide only a
single version of each component. We don't, for instance, provide both Vectors
and Lists, which are really the same abstraction with the different performance
characteristics.
However, it is important that the performance characteristics of the components
be specified. That is, if searches are expected to be no worse than O(N), we
need to say that. That's because we want programs using the components to be
portable. That wouldn't be true for programs using components with large
numbers of items if the performance characteristics vary widely between
implementations. Consider a Vector component. It could in theory be implemented
with an array or with a linked list. The cost of an arbitrary insertion is O(N)
for the array implementation and O(1) for the list implementation. If a program
using a large vector is moved from a list implementation to an array
implementation, the performance change could be so large as to make the program
non-functional. That is unacceptable, so we specify minimum performance
characteristics. But those characteristics are not intended to specify a
particular implementation, only to insure that some characteristics can be
relied upon.
Therefore, the containers library needs to suggest some performance
characteristics. We believe Implementation Advice is best for this purpose,
as we don't have to be as precise in the language defining the characteristics,
and implementations are required to document deviations from the given
advice.
Appendix
Detailed changes made to the Alternative 2 proposal
The Unchecked_Modification packages were dropped. These are just a hack to
avoid copying keys - a solely performance-based concern. Other stuff does
not logically belong in keys, and modifying the key value itself it a disaster
waiting to happen.
The Vector, List, and Multiset abstractions are essentially the same
abstraction with differences in performance and details. When performance
is not critical, only one is needed.
The package structure has many levels of empty packages for organization.
These are unnecessary when there are only a few packages. Moreover, related
packages can be given similar names (i.e. "Bounded_Set", "Protected_Vector"),
which provides all of the organization needed. The extra empty packages
were eliminated. Similarly, "Unbounded" was dropped; these are the most
general forms, and should be the ones used for general-purpose programming.
Other forms (in a secondary standard) would be more specialized.
We discussed dropping the special string maps. We eventually decided to keep
them, because string maps are common, and a Map cannot be instantiated with
"String" (the key type must be definite).
We also discussed whether Sets should be sorted. We concluded that the extra
cost of sorted insertions is fairly small, and thus there is little advantage
to using unsorted sets other than when performance is critical (which again
is not the purpose of the standard). We did, however, name the package
"Sorted_Sets" so that a basic unsorted set could be provided in a secondary
standard without contorted naming.
We added a modular Hash_Type to Ada.Containers. The choice of Integer'Base is
a horrible one on compilers that use 16-bit Integer (as allowed by the
standard), and in any case, the hashing type should be modular.
The string hash and comparison functions were moved to be part of the
Ada.Strings hierarchy. It would be a bad idea to have the hash functions
of all types gathered in one place (in the containers library).
Unbounded string hash and comparison functions were added.
We changed the type names to the more descriptive "Vector_Type", "Map_Type",
and "Set_Type". These are much better for users who use Use clauses. The
argument that having a common name makes it easier to change between containers
is mostly irrelevant: changing between the provided containers is going to be
rare. Moreover, qualifying "Container_Type" (that is, "Vector.Container_Type")
would be necessary in any unit with more than one container -- eliminating any
advantage for using the same name.
The proposal confused the meaning of "iterator", using it both for the
code that visits each element of a container (the conventional meaning) and
the "handle" or "cursor" used to access an element of a container. We decided
to use "cursor" for the second meaning to make the interfaces clearer.
We added a sort routine to Vector. For some reason, this was only present
in the (removed) List abstraction. Having a simple sort available can
simplify many programming problems.
We added legality rules that these packages must be instantiated at the
library level. The requirement that these packages do not
leak memory (like Ada.Strings.Unbounded) imply that they are implemented with
controlled types (or something like controlled types). We do not want to
implicitly require implementers to support nested controlled types without
making that support available to users. (If AI-344 or AC-50 were adopted, we
could drop these rules.)
The proposal was completely missing definitions for the string hash and
compare functions.
Performance requirements were moved from the !proposal into Implementation
Advice. As much as possible, mention of specific implementation strategies
was moved into AARM notes following that Advice. (We're not going to specify
red-black trees!).
An Assert pragma was added to the Vector package to prevent instantiation with
a type for which Index_Type'Base'First = Index_Type'First. For such a type,
the initial value of Last and the value of Front must necessarily raise
Constraint_Error. It's better to fail an assertion immediately, rather than
during some later operation.
The Map container in the proposal is far too specific to a particular
implementation. It exposes that implementation in the interface, and as a
result makes iteration operations harder. That seems like a bad choice for
a simple abstraction; it's fine to suggest an implementation, but bad to
make it part of the interface. We therefore simplified the interface and the
description. (We consulted with the author of the proposal on this and other
changes.)
****************************************************************
From: Marius Amado Alves
Sent: Wednesday, February 4, 2004 5:13 AM
>Report of the ARG Select Committee on Containers
>February 3, 2004
>...
Sorry for my poor knowledge of ARG procedure.
Does this step mean the library is secured for Ada 2005?
Thanks.
****************************************************************
From: Martin Dowie
Sent: Wednesday, February 4, 2004 5:30 AM
> Sorry for my poor knowledge of ARG procedure.
> Does this step mean the library is secured for Ada 2005?
> Thanks.
Nope - it's still a "Work Item", see:
http://www.ada-auth.org/cgi-bin/cvsweb.cgi/AIs/AI-20302.TXT?rev=1.1
Also, in the text of the AI :-
with Ada.Containers;
package Ada.Strings.Case_Insensitive is
pragma Pure (Case_Insensitive);
function "=" (Left, Right : String) return Boolean;
function "/=" (Left, Right : String) return Boolean;
^^^^
Guess this wasn't really meant.
****************************************************************
From: Martin Dowie
Sent: Wednesday, February 4, 2004 8:11 AM
1) Couple of typos in package Ada.Containers.Maps
generic
with procedure Process (Cursor : in Cursor_Type) is <>;
procedure Generic_Iteration (Map : in Map_Type);
- description refers to 'Generic_Cursor'
function Length (Map : Map_Type) return Natural;
- description refers to 'Container' when it should be 'Map'
2) For routines like 'Generic_Iteration' shouldn't the 'Process'
generic subprogram parameter not have a 'Stop : out Boolean'
parameter? To allow early exit of the iteration, without
having to raise exceptions?
****************************************************************
From: Martin Dowie
Sent: Wednesday, February 4, 2004 8:21 AM
Is package Ada.Containers.Maps.Strings[ACMS] really what is
intended, as Ada.Containers.Maps[ACM] is generic this means
to use ACMS a user must first instantiate ACM and then
instantiate ACMS.
Charles didn't suffer from this problem as Unbounded maps (~ACM)
and String Maps (~ACMS) were siblings not parent/child.
****************************************************************
From: Matthew Heaney
Sent: Wednesday, February 4, 2004 8:57 AM
>2) For routines like 'Generic_Iteration' shouldn't the 'Process'
> generic subprogram parameter not have a 'Stop : out Boolean'
> parameter? To allow early exit of the iteration, without
> having to raise exceptions?
Just use an active iterator.
****************************************************************
From: Matthew Heaney
Sent: Wednesday, February 4, 2004 9:52 AM
Note that that's not really the correct mode anyway: it should be inout,
not just out, like this:
generic
with procedure Process
(Cursor : in Cursor_Type;
Done : in out Boolean) is <>;
procedure Generic_Iteration (Map : in out Map_Type);
The problem with just out-mode is that you always have to give the
parameter a value. But this is wrong, since you shouldn't be compelled
to say anything if you merely want to continue. You should only have
say something when you want to stop.
If you only want to visit some of the items, then just use an active
iterator, and exit the loop when you need to:
declare
I : Cursor_Type := First (M);
J : constant Cursor_Type := Back (M);
begin
while I /= J loop
declare
E : Element_Type := Element (I);
begin
--do something with E
exit when Predicate (E);
end;
Increment (I);
end loop;
end;
****************************************************************
From: Martin Dowie
Sent: Wednesday, February 4, 2004 10:06 AM
[snip]
> If you only want to visit some of the items, then just use an active
> iterator, and exit the loop when you need to:
[snip]
I could but wasn't part of the purpose of the library to allow us to
do common things more easily? And I'd have to say I'd use a 'Quit'
version a _lot_ more than the current process everything,
every time one.
I'd be delighted if both versions could be included! :-)
****************************************************************
From: Matthew Heaney
Sent: Wednesday, February 4, 2004 11:16 AM
Dowie, Martin (UK) wrote:
> I could but wasn't part of the purpose of the library to allow us to
> do common things more easily? And I'd have to say I'd use a 'Quit'
> version a _lot_ more than the current process everything,
> every time one.
It would be helpful if you could be specific about what kind of
container you were using.
The vector has neither active nor passive iterators, which means that
for a vector you have to use a loop anyway.
For the hashed map, I would find it very odd if you needed to traverse
only some of its elements, since elements are stored in hash order.
What would be the nature of the predicate?
The sorted set is the borderline case.
****************************************************************
From: Peter Hermann
Sent: Wednesday, February 4, 2004 5:57 AM
> package Ada.Strings.Case_Insensitive is
indeed useful.
expected to be overloaded for fixed and (un)bounded strings.
****************************************************************
From: Matthew Heaney
Sent: Wednesday, February 4, 2004 9:02 AM
>Is package Ada.Containers.Maps.Strings[ACMS] really what is
>intended, as Ada.Containers.Maps[ACM] is generic this means
>to use ACMS a user must first instantiate ACM and then
>instantiate ACMS.
That's definitely a bug in the report. The string-key map is not a
child of a generic. Maybe we should do this:
package Ada.Containers.Maps
package Ada.Containers.String_Maps
****************************************************************
From: Marius Amado Alves
Sent: Wednesday, February 4, 2004 9:07 AM
Yes, please change that. There is a steady requirement that a single
instantiation must be enough to get a container.
****************************************************************
From: Pascal Obry
Sent: Wednesday, February 4, 2004 10:10 AM
> The problem with just out-mode is that you always have to give the
> parameter a value. But this is wrong, since you shouldn't be compelled
> to say anything if you merely want to continue. You should only have
> say something when you want to stop.
Agreed, this is the way iterators are designed in the POSIX 1003.5 standard
for example.
****************************************************************
From: Marius Amado Alves
Sent: Wednesday, February 4, 2004 9:02 AM
> 2) For routines like 'Generic_Iteration' shouldn't the 'Process'
> generic subprogram parameter not have a 'Stop : out Boolean'
> parameter? To allow early exit of the iteration, without
> having to raise exceptions?
Indeed some people ban the use of exceptions for control flow. I guess they
are not a majority in the committee. Fortunately ;-)
/* However to take the exception route the exception should be defined.
(Exit/Terminate_Immediately, _Now, _Prematurely?) Or a specification be made
of what exceptions the iterator is guaranteed to propagate. Simply "all"
would do. Maybe this is already there. I'm sorry, I didn't had time to read
the AI fully yet. */
****************************************************************
From: Randy Brukardt
Sent: Wednesday, February 4, 2004 8:55 PM
Marius Amado Alves wrote:
> Sorry for my poor knowledge of ARG procedure.
> Does this step mean the library is secured for Ada 2005?
What it means is that the study committee has issued a report. No more, and
no less. I would hope that we know more after the March ARG meeting, but
there is no guarantee that we'll work on it (we never seem to get to
everything on the agenda - we didn't work on AI-351, Time Ops in San Diego,
for instance).
Primarily, we just "cleaned up" Matt Heaney's proposal. We didn't change (as
opposed to remove) functionality, with the exception of the Map container
(where we reverted to a design more like the one Charles actually uses -
with Matt's input). So the vast majority of design decisions are Matt's --
we'd prefer to avoid design-by-committee.
Martin Dowie wrote:
> Is package Ada.Containers.Maps.Strings[ACMS] really what is
> intended, as Ada.Containers.Maps[ACM] is generic this means
> to use ACMS a user must first instantiate ACM and then
> instantiate ACMS.
Nope, that's clearly a bug. String_Maps ought to be usable by itself (it
doesn't depend on the other package at all). (And this one is my fault, for
not noticing the effect of the change.)
And later, replying to Matt:
>[snip]
>> If you only want to visit some of the items, then just use an active
>> iterator, and exit the loop when you need to:
>[snip]
>I could but wasn't part of the purpose of the library to allow us to
>do common things more easily? And I'd have to say I'd use a 'Quit'
>version a _lot_ more than the current process everything,
>every time one.
My understanding of Matt's design is that you use the passive iterator when
you want to process everything (which is by far the most common), and you
use an active iterator when you want to process part of the items. You might
use an exception to terminate iteration in an error case, but not if you
intended only to process part of the items. (Of course, there is no law
requiring that, so YMMV!)
I hadn't noticed that there is no passive iterator for vectors until Matt
pointed it out last night (about 20 minutes before we released the report!).
Consistency would suggest that there should be one, but note that it is
easier to write an active iterator for a vector than it is to write a
passive one:
for I in First(Vect) .. Last(Vect) loop
-- Do whatever.
end loop;
versus
declare
procedure Process (I : in Index_Subtype) is
begin
-- Do whatever.
end Process;
procedure Do_It_All is new Generic_Iterator (Process);
begin
Do_It_All (Vect);
end;
Besides being longer and harder to read, you have to know or look up the
index subtype for the vector in order to write this. So we reached no
conclusion about that in the 20 minutes we had to think about it.
Marius Amado Alves wrote:
> /* However to take the exception route the exception should be defined.
> (Exit/Terminate_Immediately, _Now, _Prematurely?) Or a specification be
made
> of what exceptions the iterator is guaranteed to propagate. Simply "all"
> would do. Maybe this is already there. I'm sorry, I didn't had time to
read
> the AI fully yet. */
The wording for Generic_Iteration for a Map says:
Generic_Iteration calls Process with a cursor that designates each
node in the Map. Any exceptions raised during Process are propagated.
So it's covered. This is important, because it means that the implementation
must be able to clean itself up (if any is needed) when an exception
propagates - it can't leave the Map in an unstable state.
****************************************************************
From: Jeffrey Carter
Sent: Wednesday, February 4, 2004 8:53 PM
AI-302-03 asks
> Anybody got better wording [for the quality of the String hashing
> function]? Matt was nice enough to ignore these definitions
> completely!
See
P. K. Pearson, "Fast Hashing of Variable-Length Text Strings," Comm.
ACM, 1990 Jun
It describes a "hashing function specifically tailored to
variable-length text strings." It says that "similar strings are not
likely to collide." (An implementation can be found in
PragmARC.Hash_Fast_Variable_Length.) Perhaps you might think this last
quote is "better wording".
The actual algorithm produces 8-bit hash values, which may no longer be
considered adequate, given
> Hash_Type'Modulus shall be at least as large as the smaller of
> System.Max_Binary_Modulus and 2**32.
I have some comments on the proposal:
The proposal has a structure called a "Vector" which is actually a list,
which is a sequence that allows insertions and deletions at any point.
"Vector" refers to a mathematical concept related to matrices to most
software engineers. It may be that the STL refers to lists as vectors,
but I hope we do not have to follow C++'s mistakes.
Further, the proposal requires an inefficient array implementation, and
several of the operations refer to this implementation. I think this is
a mistake. Specify an general, unbounded list and let the implementor
choose the implementation (which could be an array). As the proposal
points out, correctly implementing a general list is not trivial, so it
makes sense for a standard library to provide a list.
Maps and sets also specify a specific implementation.
If the intention is to have an extensible array structure, then I
suggest that they be called Extensible_Arrays.
Vector should have an iterator, in addition to allowing the user to
explicitly iterate over the structure.
> Open issue: This function returns a value that doesn't depend on it's
> parameter. It possibility could be removed in favor of just saying
> Index_Type'Pred(Index_Type'First) appropriately. Committee discussion
> with the original proposal's author was inconclusive.
I'd say that it should be a constant, not a function. The same seems to
hold for First.
Given that Back is defined as Index_Type'Succ (Last (Vector) ), and Last
(Vector) could be Index_Type'Last, there seems to be a problem. There
should be an assertion that Index_Type'Base'Last > Index_Type'Last.
All the problems with Index_Type disappear with a general list, which
would use a cursor.
I would propose that the sort algorithm be made available to users for
normal array types as well as for vectors. That would involve putting it
in its own library unit and refering to that unit in Vectors.
The Map structure is required to be implemented with a hash table. If
we're going to have such a requirement, it should at least be named
Hashed_Maps.
An important thing about maps is that they provide fast searching,
typically based on a lower-level structure such as a hash table or
balanced tree. Such structures have uses of their own in addition to
creating maps, and independent of the key/value concept of a map. For
example, an application may collect a number of values and then need to
quickly determine if a value is in that collection, and a searchable
structure with a Get_First operation can be used for a priority queue.
None of these applications use key/value pairs. Therefore, I think it's
important to provide the underlying searchable structure to users.
(Indeed, given the ease with which a user can wrap a key/value pair in a
record, define comparison operations for that record that only use the
key part, and create a map structure, given the existence of a
searchable structure, it could be argued, since the proposal states that
easily implemented structures should not be part of the library, that
the library should only supply searchable structures, and not maps.)
Do we really need Maps.[Wide_]Strings, given that an Unbounded_String
can be used for the key type, and that this library should not be used
for applications in which the use of Unbounded_Strings is not acceptable?
The Sets package is mostly incomprehensible. Sets deal with elements,
and operations must include testing if an element is in a set, creating
a set from a list of elements (set "literals"), and set union,
intersection, difference, and symmetric difference. Except for the
membership test, these are missing from the package, so I don't see what
it has to do with sets. It appears to be a searchable structure, not a
set. This is corroborated by the package Generic_Keys, which allows the
structure to be used as a map.
The discussion of the package begins by talking about nodes, which is an
undefined term. The reader has no idea what it has to do with the
package, which is not specified in terms of nodes.
"Sans" is a French word. Since the ARM is in English, we should use the
English "without" instead. "No" might also be acceptable.
I'd like to thank the select committee for their work. No library will
completely please everyone. I will welcome any standard container
library in Ada 0X.
****************************************************************
From: Tucker Taft
Sent: Wednesday, February 4, 2004 9:24 PM
The term "vector" for extensible array is used in Java
as well. I think we should strive to use terminology
that has become widely used in the programming community.
I personally consider an extensible array (i.e. a vector) a useful and
important standard container. I don't feel the same way about a linked
list, because it is so easy to implement what you want, and there
are so many options when it comes to how to link the objects
together that having a standard container for that hardly
seems worthwhile (IMHO).
So we settled on Vector, Map, and Set as three basic yet
important abstractions that will help lift the level of
programming above arrays and records. In my experience
with using languages that have large container libraries,
it is these three that are used more widely than all
the others combined.
****************************************************************
From: Randy Brukardt
Sent: Wednesday, February 4, 2004 9:29 PM
I agree with one caveat: we're already adding something else called "Vector"
to the standard (see AI-296), and two might just be too confusing.
But, the container vector is more useful than the list container (because of
the calculated O(1) access to elements). And they're too similar to support
both when we're trying to support something managable.
****************************************************************
From: Randy Brukardt
Sent: Wednesday, February 4, 2004 9:39 PM
Jeffrey Carter said:
...
> Further, the proposal requires an inefficient array implementation, and
> several of the operations refer to this implementation. I think this is
> a mistake. Specify an general, unbounded list and let the implementor
> choose the implementation (which could be an array). As the proposal
> points out, correctly implementing a general list is not trivial, so it
> makes sense for a standard library to provide a list.
>
> Maps and sets also specify a specific implementation.
No, an implementation is suggested (in AARM notes), as are performance
characteristics. That was one of the larger changes to Matt's original
proposal. If we made that change incompletely somewhere, that needs to be
fixed.
That said, the most important thing is that all implementations have
consistent performance characteristics (so that porting a program from GNAT
to ObjectAda doesn't fail for performance reasons). If GNAT used an array
implementation and ObjectAda used a list implementation for a Vector, access
to elements (which would be O(N) on the imagined OA implementation) could be
too slow for the port to be viable. That needs to be avoided. OTOH,
specifying too much about the implementation would prevent using a better
one -- in that case, we might as well just specify the source code of the
entire library (including the bodies!), and we don't need all of this
wording!
> I would propose that the sort algorithm be made available to users for
> normal array types as well as for vectors. That would involve putting it
> in its own library unit and refering to that unit in Vectors.
Bad idea. To do that, you'd need provide generic formal accessor functions;
that would have a huge overhead of function calls for both Vectors and
Arrays. On a code shared implementation like Janus/Ada, it probably would
run ten times slower than the specified one.
If we want an array sort, we should declare one:
generic
type Index_Type is (<>);
type Element_Type is private;
function "<" (Left, Right : Element_Type) return Boolean is <>;
type Array_Type is array (Index_Type) of Element_Type;
procedure Ada.Generic_Sort (Arr : in out Array_Type);
(We'd need an unconstrained version, too.) But keep it separate from the
Vector one (or any List one, for that matter).
****************************************************************
From: Matthew Heaney
Sent: Thursday, February 5, 2004 9:31 AM
I have hosted a reference implementation at my Earthlink home page:
<http://home.earthlink.net/~matthewjheaney/charles/ai302-20040205.zip>
For now it only includes the vector. There's a test_sort program in
there too, so you have something you can run.
I'll have the set and maps done in a few days.
****************************************************************
From: Robart A. Duff
Sent: Thursday, February 5, 2004 10:13 AM
Thanks, Matt!
****************************************************************
From: Jeffery Carter
Sent: Thursday, February 5, 2004 10:58 AM
Randy Brukardt wrote:
> No, an implementation is suggested (in AARM notes), as are performance
> characteristics. That was one of the larger changes to Matt's original
> proposal. If we made that change incompletely somewhere, that needs to be
> fixed.
The normative text for vectors says
"A vector container object manages an unconstrained internal array"
That specifies an array implementation.
> Bad idea. To do that, you'd need provide generic formal accessor functions;
> that would have a huge overhead of function calls for both Vectors and
> Arrays. On a code shared implementation like Janus/Ada, it probably would
> run ten times slower than the specified one.
Given that an array implementation is specified, there is no need for
formal accessor functions. The vector can simply call an instantiation
of the sort with the appropriate slice of its internal array. Since we
require such an algorithm to exist, and it is useful to many users, it
makes sense for it to be available outside the vector package.
> If we want an array sort, we should declare one:
>
> generic
> type Index_Type is (<>);
> type Element_Type is private;
> function "<" (Left, Right : Element_Type) return Boolean is <>;
> type Array_Type is array (Index_Type) of Element_Type;
> procedure Ada.Generic_Sort (Arr : in out Array_Type);
>
> (We'd need an unconstrained version, too.) But keep it separate from the
> Vector one (or any List one, for that matter).
If we only have one, I'd prefer it to be unconstrained. That allows
operations such as the vector sort discussed above, where the size of
the slice may change from call to call, without repeated instantiations.
Sort for a list is a different creature. Merge sort is a good choice
there, since a list already has the O(N) additional space that merge
sort requires for array sorting (in the links), provided you have access
to the list internals. Thus you get O(N log N) time in all cases and
O(1) space.
****************************************************************
From: Randy Brukardt
Sent: Thursday, February 5, 2004 3:23 PM
Jeff Carter wrote:
> The normative text for vectors says
>
> "A vector container object manages an unconstrained internal array"
>
> That specifies an array implementation.
Precisely my point. That is intended to say that there is a logical array in
the container, but not necessarly an actual one. Matt's descriptions were
too implementation-specific, and we moved most of that. But I'm not
surprised that some was missed.
...
> Given that an array implementation is specified, there is no need for
> formal accessor functions. The vector can simply call an instantiation
> of the sort with the appropriate slice of its internal array. Since we
> require such an algorithm to exist, and it is useful to many users, it
> makes sense for it to be available outside the vector package.
There is no intent that an array implementation is specified (it certainly
won't be implemented that way on Janus/Ada); only that the performance
characteristics are similar (or better) than that of an array
implementation.
In any case, I have no idea how an external generic would be able to mess
around with the internal array - it certainly can't see it! You'd have to
put the sort into the spec in order to do that -- and that's whats proposed
and what you're objecting to.
****************************************************************
From: Matthew Heaney
Sent: Thursday, February 5, 2004 3:40 PM
Randy Brukardt wrote:
> Precisely my point. That is intended to say that there is a logical array in
> the container, but not necessarly an actual one.
Yes, exactly. This allows the implementor to leave the type system in
order choose the optimal implementation for the vector container.
An implementor can use any implementation that satisfies the property
that insertion at the back end is (amortized) constant time, and the
property that random access is constant time.
****************************************************************
From: Jeffrey Carter
Sent: Thursday, February 5, 2004 6:52 PM
Randy Brukardt wrote:
>>The normative text for vectors says
>>
>>"A vector container object manages an unconstrained internal array"
>>
>>That specifies an array implementation.
>
> Precisely my point. That is intended to say that there is a logical array in
> the container, but not necessarly an actual one. Matt's descriptions were
> too implementation-specific, and we moved most of that. But I'm not
> surprised that some was missed.
I read it as specifying an implementation. I suggest the wording be
revised to make it clear that the discussion is of a logical array, not
a requirement for an actual array.
> In any case, I have no idea how an external generic would be able to mess
> around with the internal array - it certainly can't see it! You'd have to
> put the sort into the spec in order to do that -- and that's whats proposed
> and what you're objecting to.
I guess I wasn't clear. You would provide the external sort, and also
specify the sort in the spec, with wording that the sort has the same
characteristics as the external sort. This is based on the assumption
that an array implementation is specified, so the sort algorithm, useful
on arrays, must exist anyway.
I'm reminded of my surprise that Ada-83 compilers had to support
inifinte-precision arithmetic, but the language did not require that it
made available to users. If the compiler writers have to implement the
functionality, why not make it available to users? Case-insensitive
string comparison is a similar thing: compilers have to recognize that
frog, Frog, and FROG are the same identifier, but are (were) not
required to make such comparisons available to users.
****************************************************************
From: Jeffrey Carter
Sent: Thursday, February 5, 2004 11:38 AM
Tucker Taft wrote:
> The term "vector" for extensible array is used in Java
> as well. I think we should strive to use terminology
> that has become widely used in the programming community.
I disagree, even though I know that's dangerous when discussing Ada with
STT. An application that uses both extensible arrays and mathematical
vectors will be very confusing if both are called vectors. Since an
explicit design goal of Ada is to emphasize ease of reading, calling an
extensible array a vector seems inappropriate.
> I personally consider an extensible array (i.e. a vector) a useful and
> important standard container. I don't feel the same way about a linked
> list, because it is so easy to implement what you want, and there
> are so many options when it comes to how to link the objects
> together that having a standard container for that hardly
> seems worthwhile (IMHO).
I have no objections to an extensible array, provided it's clearly
identified as such. I think it should look different from the proposal,
but that's mainly a taste issue. I'd want direct analogs to indexing,
both LHS and RHS (Put and Get?); slices, both LHS and RHS (Replace_Slice
and Slice?); and 'First, 'Last, and 'Length (though 'First is a constant
for an EA). An equivalent to 'range would be nice, but impossible. The
only difference to a normal array would be that Put and Replace_Slice
can accept indices not in First .. Last. I haven't given it a great deal
of thought, so I'm sure I'm missing some subtleties, but I don't see a
need for Front, Back, Insert, Delete, and so on.
The proposal says that containers "that are relatively easy to code,
redundant, or rarely used are omitted". It also says that lists are
difficult to implement correctly. Given a list, structures such as
deques, stacks, and especially queues are easy to implement. Since
queues are common structures and not redundant (none of the proposed
containers provides an efficient implementation of a queue), the
proposal itself seems to argue that lists should be provided, since they
are not easy to code correctly, and provide a basis for the user to
easily code queues.
> So we settled on Vector, Map, and Set as three basic yet
> important abstractions that will help lift the level of
> programming above arrays and records. In my experience
> with using languages that have large container libraries,
> it is these three that are used more widely than all
> the others combined.
There was an article by Mills [Harlan D. Mills, Richard C. Linger: Data
Structured Programming: Program Design without Arrays and Pointers. IEEE
Trans. Software Eng. 12(2): 192-197 (1986)] that proposed that
applications only use queues, stacks, and sets (real sets, with union,
intersection, and such operations). It's an interesting concept, and I
agree with the aim of programs using appropriate abstractions and hiding
lower level implementation details, especially use of pointers.
****************************************************************
From: Alexandre E. Kopilovitch
Sent: Thursday, February 5, 2004 9:04 AM
Tucker Taft wrote:
> The term "vector" for extensible array is used in Java
> as well. I think we should strive to use terminology
> that has become widely used in the programming community.
So call it Java_Vector - that will be at least consistent.
Do you think that Java meaning for "vector" is more significant for Ada than
mathematical meaning of this term (which never implied extensibility) ?
Why not call that thing Flexible_Array (after Algol-68, I think) - this name
will directly reflect the essense.
****************************************************************
From: Robert A. Duff
Sent: Thursday, February 5, 2004 1:37 PM
Bill Wulf and other professors at CMU circa late 1970's were using the
term "vector" to mean "array" (not necessarily extensible); that's the
first time *I* heard it. So it's not a Java-ism.
I think this meaning of "vector" derives from the maths meaning,
even if it's not precisely the same thing.
****************************************************************
From: Stephen Leake
Sent: Thursday, February 5, 2004 2:18 PM
Jeffrey Carter <jrcarter@acm.org> writes:
> Tucker Taft wrote:
>
> > The term "vector" for extensible array is used in Java
> > as well. I think we should strive to use terminology
> > that has become widely used in the programming community.
>
> I disagree, even though I know that's dangerous when discussing Ada
> with STT. An application that uses both extensible arrays and
> mathematical vectors will be very confusing if both are called
> vectors. Since an explicit design goal of Ada is to emphasize ease of
> reading, calling an extensible array a vector seems inappropriate.
I agree with Tucker. I have code that uses both Cartesian vectors and
extensible arrays. One is SAL.Math_Double.DOF_3.Cart_Vector_Type, the
other is SAL.Poly.Unbounded_Arrays. Obviously, I have different names
for them, as Carter wants. But if I called them
SAL.Math_Double.DOF_3.Vector and SAL.Poly.Vector, I would have no
chance of confusion. That's what package hierarchies are for.
Since both Java and C++ use the term "vector" for an extensible array,
I think Ada should also. Part of the point of the OY revision is to
make the language more attractive to current users of other languages.
This is an easy way to do that.
> (Replace_Slice and Slice?); and 'First, 'Last, and 'Length (though
> 'First is a constant for an EA).
'First is not constant for SAL.Poly.Unbounded_Arrays; I provide both
Append and Prepend operations. I don't think I've ever used Prepend,
though; it was really just an exercise in what was possible.
> .. I don't see
> a need for Front, Back, Insert, Delete, and so on.
I use Insert and Delete in real applications.
> The proposal says that containers "that are relatively easy to code,
> redundant, or rarely used are omitted". It also says that lists are
> difficult to implement correctly. Given a list, structures such as
> deques, stacks, and especially queues are easy to implement. Since
> queues are common structures and not redundant (none of the proposed
> containers provides an efficient implementation of a queue), the
> proposal itself seems to argue that lists should be provided, since
> they are not easy to code correctly, and provide a basis for the
> user to easily code queues.
I agree. A lists package would be nice.
But I also agree with Tucker, that it is difficult to come up with one
list package that really meets a wide range of needs.
Perhaps one list package, that meets a narrow range of needs, would
still be useful. It would set a style standard for other list packages.
****************************************************************
From: Matthew Heaney
Sent: Thursday, February 5, 2004 2:48 PM
Alexandre E. Kopilovitch wrote:
> So call it Java_Vector - that will be at least consistent.
>
> Do you think that Java meaning for "vector" is more significant for Ada than
> mathematical meaning of this term (which never implied extensibility) ?
>
> Why not call that thing Flexible_Array (after Algol-68, I think) - this name
> will directly reflect the essense.
Tucker T. and Bob D. are both correct: the container is a "vector."
Alexandre K. and Jeff C. are both incorrect. The container is not a
list, not a Java_Vector, not an Extensible_Array, and not a Flexible_Array.
It is a vector. It has the same semantics as the identically-named
container in the STL. The one named "vector."
The container whose name is vector does not have array semantics. There
is no slicing for example.
The container whose name is vector has the following important properties:
o inserting at the back end is amortized constant time
o supports random access of elements, in constant time
Yes, internally a vector is implemented as an array. The Size function
returns the length of this internal array, and Resize can be used to
expand its length.
But it is not an array. It is a container. Whose name is "vector".
Just like the one in the STL.
****************************************************************
From: Alexandre E. Kopilovitch
Sent: Thursday, February 5, 2004 3:46 PM
No problem with all that if another term was chosen. Now, with "vector", this
is name squatting (well, participation in name squatting in Ada case), which
is fully appropriate for Java, somehow understandable for C++, but seems
(still) inappropriate for Ada, especially taking into account that the involved
term belongs to some Ada-friendly domain.
****************************************************************
From: Robert A. Duff
Sent: Thursday, February 5, 2004 3:38 PM
I wrote:
> Bill Wulf and other professors at CMU circa late 1970's were using the
> term "vector" to mean "array" (not necessarily extensible); that's the
> first time *I* heard it. So it's not a Java-ism.
Actually, the meaning was "one-dimensional array". But there was no
implication that they could grow.
> I think this meaning of "vector" derives from the maths meaning,
> even if it's not precisely the same thing.
I mean, what's a vector in 3-space? Basically, a one-dimensional array
of 3 real numbers -- the X, Y, and Z coordinates.
Matt wrote:
> It is a vector. It has the same semantics as the identically-named
> container in the STL. The one named "vector."
This stuff comes from the C++ STL. I think gratuitous differences from
that are unhelpful. (But I admit that I was one of the folks pushing
for "cursor" instead of "iterator".)
> The container whose name is vector does not have array semantics. There
> is no slicing for example.
Well, "no slicing" is hardly fundamental. It could be added, or
programmed by the client.
> The container whose name is vector has the following important properties:
>
> o inserting at the back end is amortized constant time
> o supports random access of elements, in constant time
I think "random access" is the essence of array semantics. After all,
anything you can do with an array you can do with a linked list, and
vice versa -- the only fundamental difference is the efficiency
properties.
****************************************************************
From: Matthew Heaney
Sent: Thursday, February 5, 2004 9:31 AM
Robert A Duff wrote:
> This stuff comes from the C++ STL. I think gratuitous differences from
> that are unhelpful. (But I admit that I was one of the folks pushing
> for "cursor" instead of "iterator".)
Yes. The world has settled on the name "vector." Let's use the terms
everyone else is using, unless we have a good reason not to.
(BTW, that's also why I used the name "Iterator_Type". But I have no
issues with the name "Cursor_Type".)
> I think "random access" is the essence of array semantics. After all,
> anything you can do with an array you can do with a linked list, and
> vice versa -- the only fundamental difference is the efficiency
> properties.
But that's the essence of the argument!
Yes, it's *possible* to seek to specific elements in a linked list, but
I would hardly call that "random access."
If you need fast random access to the elements in a container, and the
number of elements in the container is large, then you can effectively
rule out using a linked list as the container.
Of course you could make the argument the other way. If you need
constant-time insertion of elements at any position, then that
effectively rules out a vector, in favor of a list.
****************************************************************
From: Alexandre E. Kopilovitch
Sent: Thursday, February 5, 2004 3:21 PM
Robert A Duff wrote:
> Bill Wulf and other professors at CMU circa late 1970's were using the
> term "vector" to mean "array" (not necessarily extensible); that's the
> first time *I* heard it.
Yes, CMU always was (as far as I know) primarily engineering educational
facility, and I know well that engineers (not software engineers, but rather
general kind of engineers) often called "vector" any column or row of numbers.
(not bothering themselves with the question how the components of that "vector"
transform with a change of coordinate system). But apparently they never used
this term for arrays of any other objects, and I almost never seen a case
(even in engineering) where "vector" was used for extensible array - except
Java and perhaps some C++ libraries.
A notable exception is APL, in which "vector" is the basic term, and that
"vector" is extensible. But in APL that "vector" is equipped with vast
nomenclature of functions, many of them associated with genuine mathematical
vectors, so the entire balance for the term was acceptable.
> So it's not a Java-ism.
Yes, not exactly - there were other precedents of sloppy usage of this term.
But nevertheless a strong impression remains that it is exactly Java, which
is a real reason, ground and reference for proposing this term for extensible
arrays *now and for Ada0Y*.
> I think this meaning of "vector" derives from the maths meaning,
> even if it's not precisely the same thing.
No, not at all - it lacks the primary mathematical meaning of it, and adds
the primary feature, which meaning is totally non-mathematical (that is, there
is no attempt to bring any mathematical meaning to it... and it will not be
simple, if attempted).
****************************************************************
From: Matthew Heaney
Sent: Thursday, February 5, 2004 5:11 PM
Jeffrey Carter wrote:
> The actual algorithm produces 8-bit hash values, which may no longer be
> considered adequate, given
>
>> Hash_Type'Modulus shall be at least as large as the smaller of
>> System.Max_Binary_Modulus and 2**32.
In Charles I copied the hash function from GNAT. I figure if it's good
enough for Robert Dewar it's good enough for me...
> Vector should have an iterator, in addition to allowing the user to
> explicitly iterate over the structure.
No. Vector iterators are fragile, and hence very error prone.
They are fragile because the (logical) internal array gets thrown away
during expansion, which invalidates the iterator. It's too hard to keep
track of whether a vector iterator is still valid, and most of the time
you end up with a dangling reference.
The STL has vector iterators in order to provide the infrastructure
necessary to support generic algorithms.
In Ada they are not necessary, because you can use locally-declared
subprograms to fit within such a framework.
>> Open issue: This function returns a value that doesn't depend on it's
>> parameter. It possibility could be removed in favor of just saying
>> Index_Type'Pred(Index_Type'First) appropriately. Committee discussion
>> with the original proposal's author was inconclusive.
>
>
> I'd say that it should be a constant, not a function. The same seems to
> hold for First.
Front can probably go away. First is there for consistency with other
containers.
> Given that Back is defined as Index_Type'Succ (Last (Vector) ), and Last
> (Vector) could be Index_Type'Last, there seems to be a problem. There
> should be an assertion that Index_Type'Base'Last > Index_Type'Last.
That's not really possible for generic actual index types such as
Natural or Positive.
We could get rid of the assertion, but this would impact implementors.
That's why it's still an open issue.
In my reference implementation, I don't think the generic actual type
has to have IT'Base'First < IT'First, since internally I use Integer
subtypes for everything.
http://home.earthlink.net/~matthewjheaney/charles/ai302-20040205.zip
> All the problems with Index_Type disappear with a general list, which
> would use a cursor.
The original proposal included list containers, but they were not
included in the subcommittee report, in order to keep the size of the
report more manageable.
> An important thing about maps is that they provide fast searching,
> typically based on a lower-level structure such as a hash table or
> balanced tree.
My original proposal had both sorted and hashed maps, but in order to
keep the subcommittee report small support for sorted maps was removed.
> Such structures have uses of their own in addition to
> creating maps, and independent of the key/value concept of a map. For
> example, an application may collect a number of values and then need to
> quickly determine if a value is in that collection, and a searchable
> structure with a Get_First operation can be used for a priority queue.
That's what the sorted set is for.
> None of these applications use key/value pairs.
So use the sorted set.
> Therefore, I think it's
> important to provide the underlying searchable structure to users.
Just use the sorted set container. If guarantees that searches only
take O (log N) even in the worst case.
> (Indeed, given the ease with which a user can wrap a key/value pair in a
> record, define comparison operations for that record that only use the
> key part, and create a map structure, given the existence of a
> searchable structure, it could be argued, since the proposal states that
> easily implemented structures should not be part of the library, that
> the library should only supply searchable structures, and not maps.)
The (hashed) map stores the key and element as separate components of
the internal node of storage.
If you have a record like that, containing a key-part component, then
use the sorted set, and instantiate the nested generic package Generic_Keys.
> Do we really need Maps.[Wide_]Strings, given that an Unbounded_String
> can be used for the key type, and that this library should not be used
> for applications in which the use of Unbounded_Strings is not acceptable?
Yes, we really need string-key maps.
> The Sets package is mostly incomprehensible. Sets deal with elements,
> and operations must include testing if an element is in a set, creating
> a set from a list of elements (set "literals"), and set union,
> intersection, difference, and symmetric difference. Except for the
> membership test, these are missing from the package, so I don't see what
> it has to do with sets. It appears to be a searchable structure, not a
> set. This is corroborated by the package Generic_Keys, which allows the
> structure to be used as a map.
A "set" is really any sorted sequence of items. If you want set
intersection, symmetric difference, etc, then just use a generic
algorithm. See the Charles library for such algorithms.
Of course, if you want target of a set union operation to be the set
itself, then just use Insert to insert the items.
The subcommittee report has several examples of how sets are used, and
there's at least one example showing how to use the nested generic package.
See the last two slides in my AE-2003 paper presentation for an example
of how to take the union of a set and a (sorted) list:
http://home.earthlink.net/~matthewjheaney/charles/charlesppt.htm
My original proposal has the same example at the very end:
http://home.earthlink.net/~matthewjheaney/charles/ai302.txt
> "Sans" is a French word. Since the ARM is in English, we should use the
> English "without" instead. "No" might also be acceptable.
Je crois que non. C'est une bonne idea.
The name for Delete_Sans_Increment comes from Emacs lisp, which has the
functions file-name-sans-extension and file-name-sans-versions.
It was also in homage to Ada's French history, given that her original
designer was French, and worked for a French company.
Why do you think "rendevous" was named that way?
> I'd like to thank the select committee for their work. No library will
> completely please everyone. I will welcome any standard container
> library in Ada 0X.
If you don't immediately grok how vectors and sets and maps work, then I
suggest familiarizing yourself with the STL. There are lots of tutorials
on the WWW.
I also recommend Stanley Lippman's little book Essential C++. That was
my introduction to the STL, and what originally convinced me that
Stepanov's approach was the correct one.
You might also like Accelerated C++ by Andrew Koenig and Barbara Moo,
which uses the STL as a basis for teaching C++.
****************************************************************
From: Randy Brukardt
Sent: Thursday, February 5, 2004 5:49 PM
Matt's too modest. The tutorial that makes up the !example section is
actually quite good. I learned a lot about how the packages work (and how to
use them) from reading it carefully, and I recommend that everyone do that
to better understand Matt's work.
****************************************************************
From: Randy Brukardt
Sent: Thursday, February 5, 2004 3:48 PM
Jeffrey Carter wrote:
...
> > I personally consider an extensible array (i.e. a vector) a useful and
> > important standard container. I don't feel the same way about a linked
> > list, because it is so easy to implement what you want, and there
> > are so many options when it comes to how to link the objects
> > together that having a standard container for that hardly
> > seems worthwhile (IMHO).
>
> I have no objections to an extensible array, provided it's clearly
> identified as such. I think it should look different from the proposal,
> but that's mainly a taste issue. I'd want direct analogs to indexing,
> both LHS and RHS (Put and Get?); slices, both LHS and RHS (Replace_Slice
> and Slice?); and 'First, 'Last, and 'Length (though 'First is a constant
> for an EA). An equivalent to 'range would be nice, but impossible. The
> only difference to a normal array would be that Put and Replace_Slice
> can accept indices not in First .. Last. I haven't given it a great deal
> of thought, so I'm sure I'm missing some subtleties, but I don't see a
> need for Front, Back, Insert, Delete, and so on.
Let's see:
- direct analogs to indexing, both LHS and RHS (Element, Replace_Element);
- slices (nope);
- 'First (First), 'Last (Last), 'Length (Length);
Looks like pretty much everything is in there. And slicing will be expensive
if the implementation is not a straight array, so it's somewhat dubious.
Insert and Delete provide easier ways of adding or removing items than
slices - and how often do you use a slice of a non-string type for something
other than inserting or deleting elements anyway??
Ada doesn't (and isn't) going to support user-defined indexing or
user-defined attributes, so this is about the best you can do. So what's the
complaint (other than the name)??
> The proposal says that containers "that are relatively easy to code,
> redundant, or rarely used are omitted". It also says that lists are
> difficult to implement correctly.
I think that's a mistake; only very rare operations are difficult to code.
We didn't update every piece of the original text, and that one is
misleading.
> Given a list, structures such as
> deques, stacks, and especially queues are easy to implement. Since
> queues are common structures and not redundant (none of the proposed
> containers provides an efficient implementation of a queue), the
> proposal itself seems to argue that lists should be provided, since they
> are not easy to code correctly, and provide a basis for the user to
> easily code queues.
The user can easily code a queue in terms of a Vector (that's one of the
uses of Insert!). We dropped the list component because it had an identical
interface to the Vector component, but was less flexible (no computed O(1)
access).
In any case efficiency is not a goal of the standard containers. It would be
incorrect for the standard to specify performance to the point that only a
single implementation would be possible. Moreover, we anticipate a secondary
standard that *does* try to provide more control over performance (by adding
lists, bounded forms, etc.)
In my view, it is a mistake for projects to depend on standard containers
where there are critical performance requirements (not just time, but also
space as well). In that case, you really have to have control of the
implementation -- you really need *all* of the source code. You can't trust
something provided by the standard (or your compiler vendor) in those cases.
In any case, the purpose of these containers is to provide a seed and a
standard direction. I would hope that they would reduce the tower of babel
that Ada containers are nowdays - by providing a style for other containers
to follow. No one is suggesting that these are sufficient to solve all
programming problems - just 80% of them, especially in prototypes and in Q&D
programs.
****************************************************************
From: Martin Dowie
Sent: Thursday, February 5, 2004 5:50 PM
> Dowie, Martin (UK) wrote:
> > I could but wasn't part of the purpose of the library to allow us to
> > do common things more easily? And I'd have to say I'd use a 'Quit'
> > version a _lot_ more than the current process everything,
> > every time one.
>
> It would be helpful if you could be specific about what kind of
> container you were using.
I was thinking, primarily, of a project that used single (bounded) lists to
hold commands (a basic, domain-specific, scripting language I guess),
one of which was 'stop this sequence of commands'.
This pattern has since shown itself to be quite common in embedded
systems - for either domain-specific scripting languages or graphics.
There is the other idiom where one is processing an iteration of items
and an external event occurs that stops the processing - e.g. the 'stop'
button is pushed on a GUI-search window, but it could equally be a
50Hz message over a 1553.
****************************************************************
From: Randy Brukardt
Sent: Thursday, February 5, 2004 6:14 PM
> I was thinking, primarily, of a project that used single (bounded) lists to
> hold commands (a basic, domain-specific, scripting language I guess),
> one of which was 'stop this sequence of commands'.
My understanding of the model is that passive iterators are only for cases
where you want to iterate over the entire container. Thus, this is clearly a
use for an active iterator. Indeed, given the iteration model of packages,
there's hardly any reason to use a passive iterator. They're harder to write
(a subprogram and instantiation are required), and (especially if a Quit
parameter is provided), harder to understand.
We dropped the passive iterator from the Ada.Directories package precisely
because even ARG members were confused about how it worked. Even though it
was a classic passive iterator with a Quit parameter. Perhaps the confusion
really was the Quit parameter (I thought it was the whole idea), but in any
case, you've got to keep them simple.
> This pattern has since shown itself to be quite common in embedded
> systems - for either domain-specific scripting languages or graphics.
>
> There is the other idiom where one is processing an iteration of items
> and an external event occurs that stops the processing - e.g. the 'stop'
> button is pushed on a GUI-search window, but it could equally be a
> 50Hz message over a 1553.
It seems to me that an abort situation is best handled by propagating an
exception. Otherwise, you end up distributing termination code/flags
everywhere in the application. But YMMV.
****************************************************************
From: Jeffery Carter
Sent: Thursday, February 5, 2004 6:39 PM
Matthew Heaney wrote:
> Alexandre K. and Jeff C. are both incorrect. The container is not a
> list, not a Java_Vector, not an Extensible_Array, and not a
> Flexible_Array.
Matthew H. is incorrect. The data structure is not a vector.
I am at least as qualified as Matthew H. to make such pronouncements.
****************************************************************
From: Jeffery Carter
Sent: Friday, February 6, 2004 1:05 PM
A comment on type names.
Ada 83, with the unfortunate* exception of File_Type, did not use
"_Type" on the end of predefined type names. We have Address and Count,
not Address_Type and Count_Type. Ada 95 adhered to this principle, so we
have Storage_Element and Unbounded_String, not Storage_Element_Type and
Unbounded_String_Type.
For consistency, I think the Ada-0X process should also adhere to this
principle. The use of "_Type" on type names in the proposal should be
eliminated. This takes some time and thought to do well; I am willing to
volunteer for the effort if the Committee cannot spare the time and
cannot find anyone preferable.
This is a matter of consistently. While it is not my style, and not
recommended by the Quality and Style Guide, I have used libraries that
use the "_Type" convention without problem. I am concerned that the ARM
be consistent far more than I am about what convention the ARM uses.
*"Unfortunate" because it is inconsistent.
****************************************************************
From: Matthew Heaney
Sent: Friday, February 6, 2004 9:33 AM
I have updated the reference implementation, which now has the sorted
set container, too.
There's also a test_sets.adb, so you have something to run. You can
pass a seed on the command line.
<http://home.earthlink.net/~matthewjheaney/charles/ai302-20040206.zip>
I'll take care of the hashed map containers this weekend, and post Mon AM.
****************************************************************
From: Matthew Heaney
Sent: Friday, February 6, 2004 3:36 PM
Martin Dowie wrote:
> I was thinking, primarily, of a project that used single (bounded) lists to
> hold commands (a basic, domain-specific, scripting language I guess),
> one of which was 'stop this sequence of commands'.
It sounds like you have a sequence container, that you traverse from
front to back.
The only sequence container in the proposal is a vector, which doesn't
have a passive iterator. Again, I recommend just using a loop:
for Index in First (V) .. Last (V) loop
declare
Command : Command_Type := Element (V, Index);
begin
exit when Is_Stop (Command);
-- process command
end;
end loop;
If these are commands that have an order (say, each command has a
timestamp, and commands are executed in timestamp order), then you can
use the sorted set. Again, an explicit loop is appropriate:
declare
I : Cursor_Type := First (S);
J : constant Cursor_Type := Back (S);
begin
while I /= J loop
declare
Command : Command_Type := Element (I);
begin
exit when Is_Stop (Command);
-- process command
end;
Increment (I);
end loop;
end;
****************************************************************
From: Alexandre E. Kopilovitch
Sent: Friday, February 6, 2004 4:24 PM
> The only sequence container in the proposal is a vector,
Ah, yes, it's Sequence - quite right name for that container (and not Vector).
****************************************************************
From: Jeffrey Carter
Sent: Friday, February 6, 2004 7:17 PM
Randy Brukardt wrote:
> Let's see:
> - direct analogs to indexing, both LHS and RHS (Element, Replace_Element);
> - slices (nope);
> - 'First (First), 'Last (Last), 'Length (Length);
>
> Looks like pretty much everything is in there. And slicing will be expensive
> if the implementation is not a straight array, so it's somewhat dubious.
> Insert and Delete provide easier ways of adding or removing items than
> slices - and how often do you use a slice of a non-string type for something
> other than inserting or deleting elements anyway??
Slicing isn't included because C++ doesn't have slices, so it's a
foreign concept to its library and users. If we want to attract users of
inferior languages to Ada, it should be because Ada is better. Ada's
slices are a way that Ada is better; Ada's standard extensible array
component should be better than its competition by also offering them. I
do not see mimicking C++'s shortcomings as advisable.
Insertion and deletion are basic operations of lists, but not of arrays.
That's why the list and vector components had the same set of
operations: they both specify lists with different implementations.
Since String is an array, and [Un]Bounded_String is an extensible array,
and we're now told the correct name is Vector, shouldn't these be
renamed to something like Character_Vector?
> Ada doesn't (and isn't) going to support user-defined indexing or
> user-defined attributes, so this is about the best you can do. So what's the
> complaint (other than the name)??
I don't expect user-defined indexing, slices, or attributes, which is
why I talked about "analogs" to them. Missing slices is one complaint.
And, yes, the name is unarguably wrong.
In the C family of languages, users are accustomed to having to look at
implementations in order to understand how to use something. Subprogram
"prototypes" (yet another misused term to add to the collection) are
generally insufficient, and appropriate comments are often lacking. So
it comes as no surprise to me that C++ expects newcomers to its library,
looking for an extensible array, and not finding anothing with an
appropriate name, to have to look at the operations of the components to
find that the inappropriately named "vector" is really an extensible array.
However, this is not the Ada way, and I think it completely
inappropriate to mimick this mistake. Looking at other languages'
library to select useful components is fine; insisting that an Ada
version must be identical to that of another language, including
mistakes, is not.
> The user can easily code a queue in terms of a Vector (that's one of the
> uses of Insert!). We dropped the list component because it had an identical
> interface to the Vector component, but was less flexible (no computed O(1)
> access).
The perfomance of a queue based on an extensible array is likely to be
just as objectionable as extracting an element from an extensible array
based on a list. That the vector and list components both had the same
interface is further evidence that mimicking the STL is a bad idea.
Insert and delete are as foreign to an extensible array as indexing and
slicing should be to a list.
> In my view, it is a mistake for projects to depend on standard containers
> where there are critical performance requirements (not just time, but also
> space as well). In that case, you really have to have control of the
> implementation -- you really need *all* of the source code. You can't trust
> something provided by the standard (or your compiler vendor) in those cases.
I agree. That doesn't mean that the standard shouldn't provide a basis
for queues with performance characteristics suitable for performance
non-critical applications, which an extensible array does not provide.
****************************************************************
From: Randy Brukardt
Sent: Friday, February 6, 2004 8:24 PM
Jeff Carter wrote:
...
> I agree. That doesn't mean that the standard shouldn't provide a basis
> for queues with performance characteristics suitable for performance
> non-critical applications, which an extensible array does not provide.
Huh? You've said, in effect, that the performance isn't good enough for
applications where the performance doesn't matter. That's a pretty goofy
statement!
My opinion has not changed: if you care about performance *at all*, you
*cannot* depend on *any* standard containers. But usually the performance
does not matter at all (or so little as to be equivalent to not at all): the
number of elements in the container is small (which would be true for
virtually all queues), and/or it is used infrequently, and/or the
application is a throw-away.
Otherwise, if you are writing portable code, you shouldn't use a predefined
container library at all -- the performance is likely to vary much more
across implementations than code you write yourself. For instance, on
Janus/Ada, any generic list container is going run 2-5 times slower than the
same list created yourself -- that's just the effect of the extra call
overhead and the shared body (which means the elements will be dynamically
allocated - separately - in any case - at least doubling the allocation
overhead). I'd expeect that effect to be much less on GNAT, for example,
because they don't share generic bodies and thus don't have the double
allocation overhead.
If your application doesn't care about the component being 5 times slower,
then it is highly unlikely that it is going to care about whether the
Vector/Sequence/List component is implemented as an array, as a list, as a
tree, as a hash table, or as something else.
My preference with these components would be to say absolutely nothing about
performance or implementation (because anything said is as meaningless as
real-time metrics are). But others believe that that would cause real
portability problems, and I'm willing to go along with that.
The problem I see is a lot of people are looking far too closely at tiny
pieces of abstractions. You might have a queue or a list as part of a large
abstraction, but they're pretty much useless by themselves. And given that
creating a queue or stack (both of which have only two operations, both
trivial!) would take 3 minutes max, it makes no sense to use a complex (and
necessarily slow) container library for just that -- indeed, it probably
would be more work to use a container than the 3 minutes.
I much prefer the vision of this containers library, where the only
containers included are those that are large, complex, multi-purpose, and
have a clear abstraction.
****************************************************************
From: Jeffrey Carter
Sent: Friday, February 6, 2004 7:39 PM
Matthew Heaney wrote:
> No. Vector iterators are fragile, and hence very error prone.
Modifying a structure from an iterator should be a bounded error.
> They are fragile because the (logical) internal array gets thrown away
> during expansion, which invalidates the iterator. It's too hard to keep
> track of whether a vector iterator is still valid, and most of the time
> you end up with a dangling reference.
You can only talk about what happens internally during an operation if a
specific implementation is required, which Randy assures us is not the case.
> A "set" is really any sorted sequence of items. If you want set
> intersection, symmetric difference, etc, then just use a generic
> algorithm. See the Charles library for such algorithms.
I've used sets for decades, in discrete math, in specification languages
such as Z, and in programming. A set is an unordered collection of
elements from a universe that provides operations such as membership,
union, intersection, and the like, represented by mathematical symbols
that I can't reliably represent in an e-mail.
An implementation of a set may be sorted to speed up operations, but
that's a feature of the implementation, not of the concept implemented.
That's a distinction that many users of C-family languages seem unable
to make, but that I expect from those who embrace Ada.
> The name for Delete_Sans_Increment comes from Emacs lisp, which has the
> functions file-name-sans-extension and file-name-sans-versions.
Yet another case of mimicking others' errors.
> It was also in homage to Ada's French history, given that her original
> designer was French, and worked for a French company.
>
> Why do you think "rendevous" was named that way?
"Rendezvous" is not a predefined indentifier in the ARM. It was chosen
because no English word has the precise meaning intended, and Ada's
designers understood the importance of precise terminology.
> If you don't immediately grok how vectors and sets and maps work, then I
> suggest familiarizing yourself with the STL. There are lots of tutorials
> on the WWW.
I've been using arrays, including extensible arrays, sets, and maps for
decades. I've also been using vectors for decades, having done a lot of
scientific programming that required matrix math. I doubt that a study
of C++ mistakes would have any effect besides raising my blood pressure.
****************************************************************
From: Jeffrey Carter
Sent: Friday, February 6, 2004 7:22 PM
Randy Brukardt wrote:
> Precisely my point. That is intended to say that there is a logical array in
> the container, but not necessarly an actual one. Matt's descriptions were
> too implementation-specific, and we moved most of that. But I'm not
> surprised that some was missed.
On closer inspection, the Size and Resize operations certainly imply an
array implementation; they are meaningless otherwise.
****************************************************************
From: Randy Brukardt
Sent: Friday, February 6, 2004 9:09 PM
Huh? Resize tells the container a reasonable size to use; what the container
does with that information is up to it. Size simply returns that
information.
That's no different than many of the attributes in Ada, which (if set),
always return the values that they were set to. But what the compiler does
with those values is (almost) completely implementation-defined.
The only real requirement here is O(1) element access (which prevents the
use of a straight linked list).
Janus/Ada will probably use an array of pointers (or possibly array of
arrays of pointers); we're going to be (implicitly) allocating the elements
anyway, we might as well do it explicitly and take advantage of that to make
Insert/Delete/Sort (and any expansions) much cheaper (presuming the elements
are bigger than scalar types). An array of arrays of pointers is even
better, because insertion cost is bounded by the maximum size of an array
chunk -- but there is more overhead and complexity, so I'd like to see some
real uses before deciding on an implementation.
Note that a pure list component has no real opportunity for "better"
implementations, and indeed, any implementation on Janus/Ada would suffer
from "double" allocation.
****************************************************************
From: Martin Dowie
Sent: Saturday, February 7, 2004 4:02 AM
> We dropped the passive iterator from the Ada.Directories package precisely
> because even ARG members were confused about how it worked. Even though it
> was a classic passive iterator with a Quit parameter. Perhaps the confusion
> really was the Quit parameter (I thought it was the whole idea), but in any
> case, you've got to keep them simple.
I didn't find it confusing so I provided an extra child
Ada.Directories.Iterate - and I've used it repeatedly!
> > This pattern has since shown itself to be quite common in embedded
> > systems - for either domain-specific scripting languages or graphics.
> >
> > There is the other idiom where one is processing an iteration of items
> > and an external event occurs that stops the processing - e.g. the 'stop'
> > button is pushed on a GUI-search window, but it could equally be a
> > 50Hz message over a 1553.
>
> It seems to me that an abort situation is best handled by propagating an
> exception. Otherwise, you end up distributing termination code/flags
> everywhere in the application. But YMMV.
I have tended to work in deeply enbedded systems, where exceptions (in
any language!) are at best frowned upon and quite often forbidden! :-(
****************************************************************
From: Martin Dowie
Sent: Saturday, February 7, 2004 4:25 AM
> > I was thinking, primarily, of a project that used single (bounded) lists to
> > hold commands (a basic, domain-specific, scripting language I guess),
> > one of which was 'stop this sequence of commands'.
>
> It sounds like you have a sequence container, that you traverse from
> front to back.
Pretty much, although we also read in where each 'First' is as the whole
contained many 'subroutines'.
> The only sequence container in the proposal is a vector, which doesn't
> have a passive iterator. Again, I recommend just using a loop:
I suspect the first thing I will do is add an extra child generic subprogram
Ada.Containers.Vectors.Iterate! :-)
****************************************************************
From: Martin Krischik
Sent: Saturday, February 7, 2004 6:16 AM
> I suspect the first thing I will do is add an extra child generic
> subprogram Ada.Containers.Vectors.Iterate! :-)
Well, guess don't use GNAT. GNAT gets quite upset if you try to add something
to the Ada packages.
****************************************************************
From: Marius Amado Alves
Sent: Saturday, February 7, 2004 7:45 PM
I'd expect *any* compiler to get really upset with this ;-)
****************************************************************
From: Martin Dowie
Sent: Sunday, February 8, 2004 2:08 AM
"gcc -gnatg" or "gnatmake -a" will stop any warnings :-)
****************************************************************
From: Martin Krischik
Sent: Saturday, February 7, 2004 5:09 AM
> Jeffrey Carter wrote:
> > Given a list, structures such as
> > deques, stacks, and especially queues are easy to implement. Since
> > queues are common structures and not redundant (none of the proposed
> > containers provides an efficient implementation of a queue), the
> > proposal itself seems to argue that lists should be provided, since they
> > are not easy to code correctly, and provide a basis for the user to
> > easily code queues.
> The user can easily code a queue in terms of a Vector (that's one of the
> uses of Insert!). We dropped the list component because it had an identical
> interface to the Vector component, but was less flexible (no computed O(1)
> access).
True enough. But if you wanted a build generic queue on top of the vector the
tag should not be hidden from view. Otherwise one need to repeat all the
access methods instead of just renaming the one provided from the parent
package.
In fact the hidden tag is the one feature which I realey dislike in charles.
****************************************************************
From: Stephen Leake
Sent: Saturday, February 7, 2004 8:40 AM
"Randy Brukardt" <randy@rrsoftware.com> writes:
> Report of the ARG Select Committee on Containers
> February 3, 2004
Thanks for the committee's hard work on this.
What is the rationale for making the Map Key_Type definite, as opposed
to indefinite? Since an indefinite Key_Type is required for
Containers.Maps.Strings, why not make that capability available to the
users?
I don't see a discussion of this in AI-302-03/01.
Another point: Containers.Vectors.Size should return Index_Type'Base,
and the Size parameter in Resize should also be Index_Type'Base. It's
confusing to have different types for Size and Index.
There's also a problem if Natural'Last < Index_Type'Last; you
can't have a vector that contains every index!
****************************************************************
From: Randy Brukardt
Sent: Saturday, February 7, 2004 6:03 PM
> What is the rationale for making the Map Key_Type definite, as opposed
> to indefinite?
The 'committee' primarily adopted the existing proposal submitted by Matt
Heaney. We decided not to change any of the major design decisions of that
proposal - because no package will suit everyone or every need, and we felt
it was more important to standardize something coherently designed for most
needs than to fiddle endlessly with it and risk introducing serious bugs.
Which is to say, I don't know. :-)
> Since an indefinite Key_Type is required for
> Containers.Maps.Strings, why not make that capability available to the
> users?
We definitely expect that the strings container will use a purpose-built
data structure for storing strings, not some general indefinite item
capability. Ways to compactly and efficiently store sets of varying size
strings are well known and commonly used.
Such algorithms could be extended to a general "unconstrained array of
elementary", but that hardly seems to be a worthwhile definition for keys.
...
> Another point: Containers.Vectors.Size should return Index_Type'Base,
> and the Size parameter in Resize should also be Index_Type'Base. It's
> confusing to have different types for Size and Index.
>
> There's also a problem if Natural'Last < Index_Type'Last; you
> can't have a vector that contains every index!
Yes, that's a serious problem on Janus/Ada (Integer is 16-bit). However, you
want the Size and Resize operations to take a numeric type that contains
zero -- and certainly Index_Type is not that. Index_Type could be a subtype
of an enumeration type or a subtype of a modular type (neither of which can
contain zero) or a subtype of an integer type not containing zero.
We had a short, inconclusive discussion about whether the index type ought
to be range <> rather than (<>) (because enumeration and modular types fail
the assertion and thus aren't directly usable), but that still doesn't
guarantee a zero. Moreover, if the integer type has negative numbers, then
the Length of the vector could be larger than Index_Type'Last.
So I don't see a great solution. I wondered about using "Hash_Type" here (it
has the correct properties), but that seems like a misuse of the type (and a
bad idea in a library that most Ada programmers will read - you want to show
them good style in standard libraries).
****************************************************************
From: Martin Krischik
Sent: Saturday, February 7, 2004 5:15 AM
> The perfomance of a queue based on an extensible array is likely to be
> just as objectionable as extracting an element from an extensible array
> based on a list. That the vector and list components both had the same
> interface is further evidence that mimicking the STL is a bad idea.
> Insert and delete are as foreign to an extensible array as indexing and
> slicing should be to a list.
Well, depends. Most queues are not supposed to grow indefinetly so an using a
vector with an modular type as index will give you good perfomace. Every Ada
tutorial contains a expample on how to do it.
****************************************************************
From: Martin Krischik
Sent: Saturday, February 7, 2004 6:14 AM
> The committee selected the second proposal as a starting point for a
> standard containers library, with a number of simple changes. The
> changes were simple enough that we produced a version of the library with
> the changes made (AI-00302-3/01).
Any place where I can actualy read the draft?
Anyway, looking at the reference impementation vom Matthew Heaney (thanks for
the quick responce) I have an improvements to suggest:
type Element_Type is private;
I said this bevore that is too limiting. With that signature you can't even
store strings. And more important you cant store Element'Class. In fact I
predict that with that signature 80% of all data stored will be "access to
something".
I have often heard Ada does not need garbage collection since a good container
library should take care of memory management - and now I ready to follow
that point. But taking that argument, vector is not a good container.
Since vector will need heap storrage anyway and performace is only a minor
issue I suggest:
type Element_Type (<>) is private;
****************************************************************
From: Randy Brukardt
Sent: Saturday, February 7, 2004 6:05 PM
> Any place where I can actualy read the draft?
The same place that you can read any other AI: www.ada-auth.org.
****************************************************************
From: Martin Krischik
Sent: Sunday, February 8, 2004 4:58 AM
I looked there but I only found a very long discussion but not the
actual concluding decision.
****************************************************************
From: Randy Brukardt
Sent: Monday, February 9, 2004 6:03 PM
Don't know what you're looking for, but certainly the entire AI is posted
there. As with all AIs, the !wording section is what goes into the standard.
****************************************************************
From: Martin Krischik
Sent: Saturday, February 7, 2004 6:24 AM
> > The only sequence container in the proposal is a vector,
>
> Ah, yes, it's Sequence - quite right name for that container (and not
> Vector).
No, in my book elements in a Sequence have only a relative positions, or at
least the relative position is the primary position and absolut position is
only the secondary.
That is: Get_Next (V); is faster or as fast as Get (V, 5);
****************************************************************
From: Martin Krischik
Sent: Saturday, February 7, 2004 6:32 AM
> My understanding of the model is that passive iterators are only for cases
> where you want to iterate over the entire container.
Yes.
> Indeed, given the iteration model of packages,
> there's hardly any reason to use a passive iterator.
Passive Iterators should allways provide the fastes mean to iterate over the
hole container. They should do so by knowing the internals of the container.
Of course it only matters in advanced container with B-Trees or AVL-Trees as
as internal structure. But I have only seen those in IBM's Open Class Library
(which is far better the the STL).
But there are no advanced containers in AI 302.
****************************************************************
From: Randy Brukardt
Sent: Saturday, February 7, 2004 6:21 PM
> Passive Iterators should allways provide the fastes mean to iterate over the
> hole container. They should do so by knowing the internals of the
> container.
That might be true in a language with a built-in iterator construct, but it
is certainly not true in Ada because of the overhead of calling the generic
formal subprogram for each element. In Janus/Ada, the overhead of calling a
formal subprogram is at least double of a normal subprogram (we have to save
and restore display information, because you could be calling into a more
nested scope than the generic body -- something that normally isn't possible
in Ada).
Other compilers may not have that overhead, but they'll certainly have call
overhead. Whereas, the explicit loop iterator for Vectors only needs to call
Element. So the call overhead is at best a wash, and at worst much worse for
the passive iterator. Moreover, the compiler is a lot more likely to be able
to in-line the call to Element (which likely has a pretty simple
implementation and thus will meet the in-lining qualifications), than the
bunch of arbitrary code in the Process formal routine.
So, a passive iterator will only be faster in complex containers (where you
have to separate the Element and Successor functions). For a Vector (where
the language already has the needed iteration mechanism built-in), it's
going to be slower (or, if you're really lucky, the same speed) and it
certainly is a lot harder to write.
So I think having it on Vector would simply be for consistency; you'd never
actually use it if you know you're dealing with a Vector.
****************************************************************
From: Robert A. Duff
Sent: Saturday, February 7, 2004 7:22 PM
> Other compilers may not have that overhead, but they'll certainly have call
> overhead. Whereas, the explicit loop iterator for Vectors only needs to call
> Element. So the call overhead is at best a wash, and at worst much worse for
> the passive iterator. Moreover, the compiler is a lot more likely to be able
> to in-line the call to Element (which likely has a pretty simple
> implementation and thus will meet the in-lining qualifications), than the
> bunch of arbitrary code in the Process formal routine.
I don't see why the compiler shouldn't inline the Process routine,
assuming the compiler isn't doing shared generics. They're usually
small, but anyway, the Process routine is typically called exactly
once, so it shouldn't matter how big it is.
****************************************************************
From: Randy Brukardt
Sent: Saturday, February 7, 2004 7:33 PM
Most compilers have limitations on what can be inlined; Process (which
contains arbitrary code) is far more likely to violate one of those
limitations than Element (which never changes and is likely to be very
simple). In addition, many compilers only inline when you give pragma
Inline, and you can't do that on a generic formal.
****************************************************************
From: Robert A. Duff
Sent: Saturday, February 7, 2004 7:43 PM
If Process violates whatever these arbitrary restrictions are, then
sure, you can't get it inlined. But typically Process is very simple --
often just one line of code that calls some other procedure to do the
real work, passing some additional parameters. Process isn't a "real"
procedure, conceptually -- it's just the body of a loop.
In my current project, we make heavy use of the generic iterator
pattern, and I think that in many many cases, Process is just
a line or two of code. (And if it's more, inlining is relatively
less important.)
>... In addition, many compilers only inline when you give pragma
> Inline, and you can't do that on a generic formal.
You give the inline on the actual. In non-sharing implementations,
that should apply inside the instance. And the iterator procedure
itself can be inlined, too.
****************************************************************
From: Randy Brukardt
Sent: Saturday, February 7, 2004 8:04 PM
Certainly it's not real (which is one thing I dislike about passive
iterators in Ada - but we've discussed that before), but if it is very short
(or the bodies of your loops are typically very short), then you're
programming style must be very different from mine. The only loops that I
write that are very short are those that I probably shouldn't have written
in the first place (like the one finding the last '.' in a string) --
there's a routine somewhere in Ada.Strings that will do the job, but looking
it up is more work than writing the loop. (And a lot of them would be
replaced by a Vector/List/Sequence container if I had one.)
But just looking at the spam filter I'm working on at this moment: The
average loop length is about 25 lines, the mean is around 8 lines. (There
are more short loops than I would have guessed. But most of them wouldn't
exist if I had a container to use instead - most of them are insert-at-end
or delete-specific-item from a list.)
...
> You give the inline on the actual. In non-sharing implementations,
> that should apply inside the instance. And the iterator procedure
> itself can be inlined, too.
At which point, you *equal* the performance of the active iterator. And only
if *everything* goes right. The OP claimed that the passive iterator would
always have better performance, and that's certainly not true for the vector
container. I doubt that it would be true for the Map container, either. It
could be true for a complex container, but those aren't commonly used.
****************************************************************
From: Alexandre E. Kopilovitch
Sent: Saturday, February 7, 2004 7:55 PM
Martin Krischik wrote:
> > > The only sequence container in the proposal is a vector,
> >
> > Ah, yes, it's Sequence - quite right name for that container (and not Vector).
>
> No, in my book elements in a Sequence have only a relative positions, or at
> least the relative position is the primary position and absolut position is
> only the secondary.
I don't know in which domain your book was grown up, but I can assure you that
in mathematics (and by extension in physics and other natural sciences as they
use mathematical apparatus) elements of a sequence are commonly indexed, and
those indices are always treated as absolute position (which may be zero or
even negative). By the way, your book is also certainly not from Biology/Genetics,
where term "sequence" is used heavily, and they often speak about both absolute
and relative positions in sequences.
We have clearly different usage of terms "vector" and "sequence": substantial
part of today's software engineering (tools and books) use them one way, while
mathematics (and all natural sciences that use it heavily) always use them another
way.
So all the argument here about Vector/Sequence here is about Ada's choice of
preference: will Ada choose software engineering (effectively, Java and C++
libraries) side or mathematical/scientific side on this issue.
I suppose (or hope) that the thesis "Ada is for problem space, not for solution
space" implies the latter.
****************************************************************
From: Martin Krischik
Sent: Sunday, February 8, 2004 11:40 AM
> I don't know in which domain your book was grown up, but I can assure you
It's the english dictornary: "Aufeinanderfolge, Reihenfolge, Szene,
Zeitfolge". Ah, you don't speak german. Well let's look for "Reihenfolge" in
a rushian dictornary (and have a fight with my wives rushian keyboard):
"???????????".
Asking my wives what it means she said "one after the other, queue".
> that in mathematics (and by extension in physics and other natural sciences
> as they use mathematical apparatus) elements of a sequence are commonly
> indexed, and those indices are always treated as absolute position (which
> may be zero or even negative). By the way, your book is also certainly not
> from Biology/Genetics, where term "sequence" is used heavily, and they
> often speak about both absolute and relative positions in sequences.
I have spend 4 years in Great Britain I am shure if I ask anyone on the street
there "what is a sequence" he or she will answer somthing like "one after the
other" - and that is relativ positioning.
> We have clearly different usage of terms "vector" and "sequence":
> substantial part of today's software engineering (tools and books) use them
> one way, while mathematics (and all natural sciences that use it heavily)
> always use them another way.
Even when it comes done to software engineering: IBM's Open Class Library has
a Sequence - for relativ positioning getFirst, getNext, insertAfter. Usualy
used to fill listboxes.
> So all the argument here about Vector/Sequence here is about Ada's choice
> of preference: will Ada choose software engineering (effectively, Java and
> C++ libraries) side or mathematical/scientific side on this issue.
I don't like the STL that much. So I am not realy defending "vector".
> I suppose (or hope) that the thesis "Ada is for problem space, not for
> solution space" implies the latter.
I agree with you on that too.
But I think we are off topic here.
****************************************************************
From: Marius Amado Alves
Sent: Saturday, February 7, 2004 8:41 PM
Randy Brukardt wrote:
>The 'committee' primarily adopted the existing proposal submitted by Matt
>Heaney. We decided not to change any of the major design decisions of that
>proposal - because no package will suit everyone or every need, and we felt
>it was more important to standardize something coherently designed for most
>needs than to fiddle endlessly with it and risk introducing serious bugs.
>
>Which is to say, I don't know. :-)
I do: there is none (except perhaps the implicit one: ease of
implementation). On the other hand, there is a rationale for indefinite
elements. This requirement has been largely felt and voiced since ever,
and I included it in my Bases document (I think stored in alternative
1), and even formulated it as an Annex (stored in alternative 2 but
applicable to any alternative). But I've always seemed to feel some
resistance from Matt and the ARG. Which resistance I find inexplicable.
I really don't see how making the element type indefinite may
"compromise coherence" or "introduce bugs". Sure it complicates the
implementation. But the increase in power for the user is a quantum
leap, as it frees him from doing tricky memory management in many
situations. In my proposed Annex I included this passage from someone
who should be dear to at least one person in that group--perhaps in the
hope of making those strange walls of resistance just shiver a bit:
<<If I ask a student whether her design is as good as Chartres, she often smiles tolerantly
at me as if to say, "Of course not, that isnt't what I am trying to do.... I could never do
that." Then, I express my disagreement, and tell her: "That standard *must* be our
standard. If you are going to be a builder, no other standard is worthwhile.">>
-- Cristopher Alexander, Foreword to [Gabriel 1996]
****************************************************************
From: Randy Brukardt
Sent: Saturday, February 7, 2004 9:20 PM
> I do: there is none (except perhaps the implicit one: ease of
> implementation). On the other hand, there is a rationale for indefinite
> elements.
Perhaps. But that wasn't the question. The question was why aren't there
indefinite *keys*.
...
> But I've always seemed to feel some
> resistance from Matt and the ARG.
Given that the "ARG" (other than the subcommittee) has not yet looked at
these proposals, that's a pretty bizarre statement.
...
> I really don't see how making the element type indefinite may
> "compromise coherence" or "introduce bugs". Sure it complicates the
> implementation.
And, on most implementations, I would expect it to make it *many* times
slower. (It wouldn't have any effect on Janus/Ada, I don't think, because we
already have to allocate an element at a time anyway.) I would guess that it
is that efficiency concern that Matt is responding to. But I'll let him
respond himself...
****************************************************************
From: Marius Amado Alves
Sent: Sunday, February 8, 2004 6:26 AM
>... that wasn't the question. The question was why aren't there
>indefinite *keys*.
>
Oops... sorry.
Curiously enough if you have indefinite elements the requirement for
indefinite keys looses strength: you can then use elementary containers
or indefinite element positions as keys.
>...
>
>>But I've always seemed to feel some
>>resistance from Matt and the ARG.
>
>Given that the "ARG" (other than the subcommittee) has not yet looked at
>these proposals, that's a pretty bizarre statement.
Just a feeling. The proposals are there in the AI, and there was some
discussion.
>>I really don't see how making the element type indefinite may
>>"compromise coherence" or "introduce bugs". Sure it complicates the
>>implementation.
>
>And, on most implementations, I would expect it to make it *many* times
>slower....
No. The system should chose at compile time a specific body according to
the 'Definite attribute of the actual element type.
Aside. Of course there is still no standard means to do this, but it
would be a nice extension. Conditional compilation of generic bodies
based on instantiation properties. Variant units :-)
generic
type T is private;
...
package G is
when T'Definite =>
...;
when others =>
...;
end;
(On the subject of conditional compilation, see also the recent Ada
Preprocessor thread on CLA.)
In the meanwhile, there is no requirement that Ada.Containers be
implemented strictly in Ada, is there? I doubt any Ada 95 container
(arrays, files) is.
End of aside.
So no coherence problem, nor bugs, nor efficiency problem :-)
****************************************************************
[Editor's note: For continuing mail on this AI, see AI-00302-04.]
****************************************************************
Possible improvements from the original author (Matt Heaney), February 18, 2004
o The vector container declares a subtype of its generic formal index type:
subtype Index_Subtype is Index_Type;
This turns out to be very useful when you need to keep track of what is
the range of the vector container index type. I had a real headache
when writing an app when I switched from Positive to Natural, and it's
because I didn't use an index subtype.
We could generalize this for all the containers. For example the map
container would look like this:
subtype Key_Subtype is Key_Type;
subtype Element_Subtype is Element_Type;
This turns out to be useful when you need to instantiate the
Generic_Element function, which you can do using just the subtypes:
type Element_Access is
access all XXX_Maps.Element_Subtype;
function To_Access is
new XXX_Maps.Generic_Element (Element_Access);
o There is an open issue about what value Sorted_Sets.Find should return
if Find fails to find the search item. It could either be Null_Cursor
or it could be the value Back (Set). (The API now says that it's the
value Null_Cursor.)
The benefit of Null_Cursor is that detecting accidental deferences of
the return value of a failed search is easy, since the internal access
object is null. However, that breaks symmetry with searches over a
sequence, which fall off the end onto Back when the search fails.
If we were interested in trying to prevent any dereference of the Back
sentinel node of a set, one possibility would be to give the sentinel
node a special color (assuming that the set is implemented as a
red-black tree, of course), and then test that in the dereference
operations Element and Generic_Element.
But this might be too paranoid, and there are probably many wrong ways
to dereference a cursor that I can't even imagine. And an implementor
might use some other data structure that makes the detection difficult
or impossible.
So we could just return Back and not worry about accidental
dereferences. (I don't have any data to suggest that accidental
deferences would even be a problem.) I'm beginning to think that I was
being a too conservative when I said Find should return Null_Cursor
instead of Back, and I'm concerned now about the inconsistency.
o This also illustrates the fact that support for aggregate-style
operations in the current spec is weak, mostly supporting manipulation
of elements one-at-a-time.
If we are interesting in a vector being a component for manipulation of
unbounded arrays then there are operations in Ada.Strings.Unbounded that
might be useful, such as Replace_Slice and Overwrite.
One of the errors in Ada.Strings.Unbounded was that to insert an
aggregate of elements (characters) into the string you have to insert an
array type. But suppose you only have another Unbounded_String? We can
generalize insert to allow insertion of a vector or a vector slice into
another vector, like this:
procedure Insert (Vector : in out Vector_Type;
Before : in Index_Type'Base;
New_Item : in Vector_Type);
procedure Insert (Vector : in out Vector_Type;
Before : in Index_Type'Base;
New_Item : in Vector_Type;
First : in Index_Type'Base;
Last : in Index_Type'Base);
That's just insertion. Slice assignment could be done using:
procedure Replace_Elements (Vector : in Vector_Type;
Low : in Index_Type'Base;
High : in Index_Type'Base;
By : in Element_Type);
which provides the vector analog of
V (I .. J) := (others => E);
We have also discussed assignment operations. A vector is non-limited
so of course you can say:
V1 := V2;
This replaces the target with the entire range of source. Another
possibility is to assign to the target just a slice of the source:
V1 := V2 (I .. J);
Both of these operations would look like this:
procedure Assign (Target : in out Vector_Type;
Source : in Vector_Type);
procedure Assign (Target : in out Vector_Type;
Source : in Vector_Type;
Low : in Index_Type'Base;
High : in Index_Type'Base);
We could generalize further still. The operation
procedure Assign (Target : in out Vector_Type;
Source : in Element_Type;
Count : in Element_Count);
is the vector analog of
subtype Array_Subtype is Array_Type (1 .. Count);
V1 := Array_Subtype'(others => Source);
Assign operations might also be more efficient than constructor
functions, since the target can be built in place, without any
controlled finalization and adjustment.
Assignment operations are probably more efficient but it might make
sense to have constructor functions too:
function To_Vector (Length : Element_Count)
return Vector_Type;
function To_Vector (Source : Vector_Type;
Low : Index_Type'Base;
High : Index_Type'Base)
return Vector_Type;
function To_Vector (Source : Element_Type;
Count : Element_Count)
return Vector_Type;
Another possibility is to provide concatenation operators, too.
o The aggregate Delete operation looks like this:
procedure Delete (Vector : in out Vector_Type;
First : in Index_Type'Base;
Count : in Natural);
But this also feels wrong, since it specifies the first index and a
count. Count should probably be reserved for aggregate-style
insertions. The analog in Ada.Strings.Unbounded is declared like this:
procedure Delete (Vector : in out Vector_Type;
From : in Index_Type'Base;
Through : in Index_Type'Base);
This seems better since it specifies the range using a more traditional
Ada closed-range style.
o The Resize operation is the analog of the reserve() operation in the
STL std::vector class. However, there is no real analog of the resize()
operation in std::vector, a member function which expands the internal
array as necessary, and sets the number of elements to the specified
value.
You can do this using other operations but its awkward. It might be
better to provide an operation like:
procedure Set_Length
(Vector : in out Vector_Type;
Length : in Element_Count);
which sets the length directly.
Randy asked whether this would leave "holes" in the vector. But there
are no real wholes, since it's just a contiguous array under the hood
and so the normal initialization rules for array components apply.
o There has been interest in passive iterators for the vector. We
should probably include them. Here's what they look like:
generic
with procedure Process
(Element : in Element_Type) is <>;
procedure Generic_Constant_Iteration
(Vector : in Vector_Type);
generic
with procedure Process
(Element : in out Element_Type) is <>;
procedure Generic_Iteration
(Vector : in Vector_Type);
generic
with procedure Process
(Element : in Element_Type) is <>;
procedure Generic_Constant_Reverse_Iteration
(Vector : in Vector_Type);
generic
with procedure Process
(Element : in out Element_Type) is <>;
procedure Generic_Reverse_Iteration
(Vector : in Vector_Type);
o I discussed aggregate-style operations for vectors above. Another
possibility is to declare an array type, and provide operations that
operation that have an array parameter. Something like:
type Vector_Elements is
array (Index_Type range <>) of aliased Element_Type;
[Editor's note: This only works for definite elements, of course.]
and then provide operations like:
function To_Vector (Source : Vector_Elements)
return Vector_Type;
function To_Array (Source : Vector_Type)
return Vector_Elements;
function Slice (Vector : Vector_Type;
Low : Index_Type'Base;
High : Index_Type'Base)
return Array_Type;
procedure Assign (Target : in out Vector_Type;
Source : in Array_Type);
procedure Copy (Source : in Vector_Type;
Target : out Array_Type;
Last : out Index_Type'Base);
etc.
Yet another possibility is to declare a nested generic that declares a
generic formal array type, so the user can specify his own array type:
generic
type Array_Type is
array (Index_Type range <>) of Element_Type;
package Generic_Arrays is
function To_Vector (Source : Array_Type)
return Vector_Type;
function To_Array (Source : Vector_Type)
return Array_Type;
...
end Generic_Arrays;
o There is still some debate about the exact nature and purpose of the
vector container. My model (and I think Bob Duff's model) is that a
vector is implemented internally using an unconstrained array. A vector
allows insertion at any position, but it is specifically optimized for
insertion at the back end.
One of the benefits of the STL std::vector is that you can do this:
vector<HANDLE> v;
v.push_back(h1);
v.push_back(h2);
const HANDLE* const ph = &v[0];
const DWORD n = v.size();
WaitForMultipleObjects(n, ph, INFINITE);
In other words under the hood a vector is just a normal C-style array.
You're allowed to take the address of a vector element and the language
guarantees that the elements are in contiguous memory, as if they had
been declared as elements of a plain array.
The Ada vector container should follow the same model. It doesn't make
any sense to try to improve (say) insertions in the middle of a vector,
since there are other perfectly-good containers for that (like a list).
o There was some discussion about providing a stable sort, too. We
could even provide a partial sort.
o There seems to be interest in being able to sort arrays, too. A
generic operation for sorting an array would look like:
generic
type Index_Type is (<>);
type Element_Type is private;
type Array_Type is array (Index_Type range <>) of Element_Type;
with function "<" (Left, Right : Element_Type)
return Boolean is <>;
procedure Ada.Containers.Generic_Sort_Unconstrained_Array
(Source : in out Array_Type);
pragma Pure (Ada.Containers.Generic_Sort_Unconstrained_Array);
There would be an analogous operation for sorting a constrained array.
You could generalize further still, and provide an operation for sorting
any container having a cursor with the requisite properties (such as a
difference operator, etc).
o We could generalize the vector sort by allowing the user to pass in a
Swap operation for vector elements. The generic operation could have a
named default (that does swapping using normal assignment):
procedure Swap
(Vector : in Vector_Type;
I, J : in Index_Type'Base);
generic
with function "<" (Left, Right : Element_Type)
return Boolean is <>;
with procedure Swap
(Vector : in Vector_Type;
I, J : in Index_Type'Base) is Vectors.Swap;
procedure Generic_Sort (Vector : in Vector_Type);
This would be useful for elements that are expensive to copy, say that
are implemented as a controlled type that contains a value by
reference. The swap for that element could then just swap the pointers,
instead of finalizing and then adjusting controlled objects.
****************************************************************
Comments on parameter names, Randy Brukardt, April 16, 2004.
Most of the parameter names in the containers libraries are "Container".
We considered using more functional names. We started with Insert:
procedure Insert (Container : in out Vector;
Before : in Cursor;
New_Item : in Element_Type;
Count : in Size_Type := 1);
One possibility would be to use what Ada.Strings.Unbounded uses. The parameter
there is named "Source". That appears to be for compatibility with the
function version of the routine, but it is clearly inappropriate for a
procedure. So no help there.
Another possibility would be to use "Into":
procedure Insert (Into : in out Vector;
Before : in Cursor;
New_Item : in Element_Type;
Count : in Size_Type := 1);
which makes nice named calls:
Insert (Into => My_Vector, Before => Blob, New_Item => An_Item);
Similarly, Append and Prepend could use "To":
procedure Append (To : in out Vector;
Before : in Cursor;
New_Item : in Element_Type;
Count : in Size_Type := 1);
And Delete could use "From":
procedure Delete (From : in out Vector;
Position : in Cursor;
Count : in Size_Type := 1);
but then "Position" should really be "At" -- except that it's reserved in Ada.
And Replace_Element and Find should use "In", but that is clearly not going
to work. We could use "Inside", but that's a stretch.
Then, we'd have to find something to use for the various functions. The
existing:
function Last (Container : Vector) return Cursor;
makes as much sense as anything I can come up with.
I don't think that the above buy enough to change Matt's choices (except for
a couple obvious mistakes, already fixed).
****************************************************************
From: Randy Brukardt
Sent: Friday, September 3, 2004 7:43 PM
This document lists significant changes between AI-302-3/05 (which for the
most part reflects the Palma meetings conclusions) and AI-302-3/06 (which
reflects more recent e-mail discussions and experience). Simple typos aren't
listed. I've numbered the changes for easy reference. Most of these require
no discussion. A few questions about issues that require some discussion are
at the end.
1) !summary: Changed (vectors) => (vectors and lists) to reflect both kinds
of sequence containers.
2) !proposal: The summary of the semantics of hashed containers is
simplified to be less precise (it was too specific to a particular
implementation).
3) The name of the Count parameter of To_Vector was changed to Length,
given that that is what it signifies.
4) The name of the Count parameter of Set_Capacity for Vectors was changed
to Capacity, given that that is what it signifies. (Note that the Map one
already called it Capacity in the spec., so there also was a consistency
issue.)
5) The container to Generic_Sort has been made class-wide. Generic units
are not primitive operations, and it is usually a mistake to use specific
tagged types in non-primitive operations. By making it class-wide,
instantiations are directly usable by any extensions. Note that
implementations should take care not to dispatch to primitive operations of
Vectors and Lists in the implementations of these routines; the routines are
defined in terms of semantics, not primitive routines, and thus any
overridings could have the wrong semantics.
6) For vectors, the two Find operations could be ambiguous if the default
parameters are used. For instance:
Op (Find (V, E)); -- index-based or cursor-based Find?
Therefore, the index-returning Find was renamed to Find_Index. This is
similar to the way the ambiguity was handled for First and Last. A similar
change was made for Reverse_Find. (Note: We change the index version, so
that the cursor-based version has the same name and profile for all of the
containers.)
7) function Is_In (Item : Element_Type;
Container : Vector) return Boolean;
was replaced by
function Contains (Container : Vector; Item : Element_Type) return
Boolean;
This new profile allows users to use prefix notation on calls:
if Obj.Contains (Element) then ...
This was discussed on Ada-Comment; the only advantage of the first profile
is that it is similar to that used in Ada.Strings. But most people thought
"Contains" sounds more natural, and they want to be able to use prefix
notation.
8) The names Iteration and Reverse_Iteration were changed to Iterate and
Reverse_Iterate; they should be a verb. The process routines should have a
parameter of Cursor. The latter was purely a cut-and-paste error (repeated
everywhere). Similarly, the text about Elements following Iteration for maps
should be after Update_Element for maps.
9) Added text to insure that inserting a null vector into a vector never
raises an exception. This is needed to insure that all of the inserts are
consistent on this point. (The base Insert already has this wording.)
Otherwise, the shorter Inserts
would raise an exception for an index out of range, and the basic one would
not. This behavior is needed so that inserting a empty vector into another
empty vector does not fail (we don't want to have to require testing for
non-empty before doing an Insert).
10) The description of Delete_Last for Vector was simplified to use Clear
instead of Delete when everything will be deleted.
11) The AARM note after Delete for Vector was corrected to use Last_Index
rather than length; the note as written presumed that First_Index = 1 (the
classic incorrect Ada assumption about arrays).
12) Added wording to say that Last_Index returns Index_Type'First - 1 if the
container is empty. That makes it clear what the result is, as the position
of the last element isn't defined in that case (since there are no elements
in the container).
13) Insert for lists had a wording correction to plural so that it is
obvious that multiple elements can be inserted (the wording originally was
copied from the single element version, which was dropped in favor of a
default parameter of 1 for the Count).
14) Corrected AARM notes for Insert, Delete, Find for Hash to say "should
only compare keys" instead of "should only compare elements", as it is the
keys that are compared, not the elements.
15) function Key (Position : Cursor) return Key_Type was added to the
specification
of Generic_Keys inside of Ordered_Sets. This operation matches one declared
in Hashed_Maps, and we want these containers to be as similar as possible.
The operation was originally omitted because there was no key read function
in the generic spec; now that one has been added, there is no reason to omit
this function.
16) Added wording so that if something outside of the package Hashed_Maps
changes a
key value, the behavior of the package is unspecified. If a key includes an
access-to-variable component, and the designated object of the access
component takes part in hashing or equality, then it is impossible to
guarantee the behavior of the package. The key could be changed even without
any call into the package at all (if a copy of the access is saved
somewhere).
This really is a user error, similar to passing in a hash function that
doesn't provide the same results when called with the same key. Moreover,
implementations can't protect against it even if they wanted to, since it
can happen without any access to the Hashed_Maps package.
Similar wording was added to Ordered_Sets, as elements could have similar
problems.
17) Changed the name of the parameters of Is_Disjoint to Left and Right.
This operation is communative, and might was well have parameter names
similar to Union and Intersection. OTOH, Is_Subset is neither communative
nor "common knowledge". Thus, its parameters were named "Subset" and
"Of_Set", so it is clear which is the subset that is being tested. This is
clearly preferable in named prefix calls:
Obj.Is_Subset (Of_Set => Obj2);
which is crystal-clear which is being tested to be a subset of which set.
(This operation is not common enough for readers of code to "know" which
operand is which.)
18) Pragma Pure was added to the array sort subprograms.
19) Matt made a number of improvements to the !examples section, adding and
reordering examples.
20) The following was added to all of the containers:
procedure Query_Element
(Position : in Cursor;
Process : not null access procedure (Element : in Element_Type));
This is valuable for reading large elements without copying them. For Sets,
this also avoids the need to check if the key was modified (meaning it can
be used even without defining a key). Since the key check could be
expensive, this is a significant savings. The routine was added to all of
the containers for consistency.
21) Nick Roberts points out that now that Index_Type is an integer type, it
is a lot simpler and clearer to say Index_Type'First-1 rather than
Index_Type'Pred(Index_Type'First). I've made this change generally.
22) Nick also pointed out that "any exceptions raised ... are propagated"
should be singular, as only one exception can be raised by a routine.
(Unless, of course, there are multiple operations in the "...").
23) Nick asked for an AARM note to clarify the purpose of the pragma Assert
in the
Vectors package. This was added.
24) Nick asked for the introductory description of length and capacity for
vectors to
be clarified (as it used "length" in the description of capacity, among
other sins).
25) Nick asked that the list of containers in Language Design Principles be
put in the order that they're defined in the text. He also asked that the
statement about
separate definite and indefinite versions be stated more clearly.
26) Update_Element for Sets has been renamed Checked_Update_Element, as it
is different from the other Update_Elements both in profile (the Set is
included in the
call) and in semantics. This is necessary because the routine may be forced
to delete the element and raise Constraint_Error if it is modified to match
an existing key. In that case, the set itself will need to be modified;
potentially including pointers in the implementation such as First or Last.
Thus the profile must be different.
27) At Palma, we decided that Swap should always swap elements, not nodes.
(Thus, the cursors would point at the changed elements afterwards). However,
this suffers from two major problems: First, it isn't necessarily possible
for indefinite types. It is not possible to copy the elements between nodes
(because they are potentially different shapes or even types), and it is
possible (although one would have to leave Ada to do it) to put the elements
directly in the nodes. Second, the purpose of the swap operations is to
provide better performance than the user can get from manually copying the
elements. But, that clearly requires swapping the nodes for a list. Only a
novice programmer would write a list swap that *copied* the designated
objects.
Similarly, if there is inherent indirectness in a container implementation,
we want to allow Swap to take advantage of it.
Therefore, we didn't change the semantics of Swap for lists. Rather, because
the vector operation is different, we renamed it to Swap_Elements. This will
prevent problems if a container is switched from a vector to a list or vice
versa.
I added some AARM notes to explain the difference and the reasons.
Matt pointed out that the Container parameter to Swap_Elements for cursors
is unnecessary; it violates our meta-rule that Container parameters are only
included on cursor operations if the cardinality of the Container may
change. It was removed, since there is no longer a need to be the same as
the list Swap.
28) The Success parameter of Insert for Maps and Sets was changed to
Inserted. Success is a boolean parameter, and Success = False implies some
sort of failure or error. But it is not an error (necessarily) for Insert to
return this value; it just means that the item was previously in the Map or
Set. The caller of Insert must decide if Inserted = False is an error
(failure) or if it is intended behavior. Thus we select a name that doesn't
have a strong connotation.
29) The Replace operation of Maps was renamed Insert_Or_Replace. This
operation has insertion semantics (if the key is not in the map) or
replacement semantics otherwise. Thus the name change; it's confusing
without it. (The suggestion of renaming it to Insert has the same problem,
in the other direction). It's a useful shorthand, especially as it does not
have the Out parameters of regular Insert. If all you need to do is stick
something into the map, and only care that it ends up there, this is the
operation to use.
A similar operation was defined for Sets. This operation is named Insert,
because it has no replacement function (the element is unchanged if it is
already in the map). Again, it is useful if you want to add an element to a
Set, and have no need to worry about whether it is already there or where it
ended up. (This operation was requested by users on Ada-Comment, noting that
defining 'dead' objects to receive return values that they don't care about
is ugly and possibly dangerous.) For more on this subject, see Question Q5.
Questions:
Q1) Find_Index returns Last_Index (Container) + 1 if the element is not
found. This seems consistent to me (it's past the end of the container in a
forward search), but Matt worries that First_Index (Container) - 1 might be
thought of as better. The trouble with First_Index (Container - 1 is that
you can't put it into an object:
declare
I : Index_Type := Index_Type'First;
begin
I := Find_Index (Vect, Item, I);
while I <= Last_Index (Vect) loop
-- Do something to the element I.
I := Find_Index (Vect, Item, I+1);
end loop;
end;
If Find_Index returned Index_Type'First - 1, saving the result of Find_Index
would raise Constraint_Error if the item is not found. That's not what we
want, I think.
Q2) The parameters to Generic_Merge have not been made class-wide (even
though the comments about non-primitive operations with specific tagged
parameters mentioned for Generic_Sort hold here, too). That's because both
parameters need to be the same type. An alternative would be to make them
class-wide, and then have a runtime check (of the actual tags) that they
actually are the same type. But that is not very O-O. A third possibility
would be to repeat the type in the generic spec:
generic
type List_Type is new List with private;
with function "<" (Left, Right : Element_Type)
return Boolean is <>;
procedure Generic_Merge (Target : in out List_Type;
Source : in out List_Type);
But that is not very consistent with the rest of the specification. Some
guidance would be helpful here.
Q3) The generic formal part for maps has:
with function "=" (Left, Right : Key_Type)
return Boolean is <>;
with function Is_Equal_Key (Left, Right : Key_Type)
return Boolean is "=";
Matt wonders why both operations are needed; his reference implementation
never uses "=". This came from the Palma meeting; it's mentioned explicitly
in the meeting minutes. The intent is that Map equality use "=" rather than
Is_Equal_Key. The idea is that Maps that compare equal have no visible
differences. Consider a map with keys that are strings representing numbers.
Is_Equal_Key probably would compare the 'Values of the keys (so "0046" and
"46" would be the same), but "=" would just compare the strings.
The other reason is that Is_Equal_Key often will not be native equality (as
in the example above). But it will be often enough that we want to be able
to provide "=" as a result.
However, Map equality is not an important operation; it exists simply
because the type is non-limited and thus we have to define it. Arguably, the
extra operation doesn't buy much. It doesn't cost much, either, as it will
never need to be specified in an instantiation as long as named notation is
used.
Matt points out that the reason this operation is called "Is_Equal_Key" is
that otherwise there would be two "=" operators in the formal part, and that
would prevent using named notation for either. Moreover, it is not
unreasonable (see above) for the user to override the key equality. So, if
one is dropped, it would have to be the "=" operation.
I did add some text specifying the required relationship between "=" and
Is_Equal_Key: if "=" returns True for some operands, then Is_Equal_Key also
returns True for those operands. I also added some text about the design
need to give key equality a different name than "=" in order to allow named
notation in instantiations.
Q4) Set_Capacity is defined to raise Constraint_Error if Length (Container)
> Count. Matt would prefer that this case is not separately handled. He
would like
Set_Capacity (M, 0)
to be a shorthand for setting the Map or Vector to the smallest reasonable
size. (I find this a bit odd, as Matt never wanted this routine to even
allow smaller values. But whatever.) Note that just dropping the check would
not be enough; we'd have to redo the description of the operation to say
that the capacity is set to at least Count_Type'Max (Count, Length
(Container)) -- because we don't want this operation to drop elements. I'm
unsure that the benefit is worth the change, and it seems like a bug to me
to try to set the capacity of a container to be smaller than the number of
elements it holds.
Q5) There is a lot of confusion about the meaning of the Insert operations
for maps and sets. The changes outlined in (29) above should help.
However, the semantics of the Insert that does not return an Inserted value
remain controversial. Five possible behaviors were identified for Maps:
1) Insert raises an exception when the key is already in the map;
2) Insert does nothing when the key is already in the map;
3) Insert replaces the existing element when the key is already in the
map;
4) Replace raises an exception when the key is not already in the map;
5) Replace does nothing when the key is not already in the map.
(One could imagine a Replace that inserts when not found, but that is just
choice 3.)
Convincing examples were shown for virtually all of these semantics. Note
that replacement semantics (choices 3 through 5) make less sense for Sets,
as you would be replacing an equivalent element. However, since it certainly
possible for an element to have components that do not participate in the
equivalence test, it is easy to imagine examples where such replacements
make sense.
It was suggested to provide all of these operations with different names,
but that requires coming up with five sets of names, with the resultant
clutter and confusion. (And this is on top of the Insert that returns a
Position and Inserted result.) Moreover, all of these routines can be
written using other primitives. We therefore decided to provide (3) for Maps
and (2) for Sets.
Another suggestion was to use an enumeration similar to the ones in
Ada.Strings. Then we'd have:
type Exists_Action is (Error, Ignore, Replace);
procedure Insert (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type;
If_Exists : in Exists_Action := Error);
type Nonexistent_Action is (Error, Ignore, Insert);
procedure Replace (Container : in out Map;
Key : in Key_Type;
New_Item : in Element_Type;
If_Nonexistent : in Nonexistent_Action := Error);
The objections to this idea boiled down to two: this is control-coupling,
and that some people hate the design of Ada.Strings; they find it hard to
find or remember the names of the parameters.
Several people suggested that control-coupling was preferable to a forest of
routines, or one routine with non-intutive semantics. Moreover,
control-coupling is mainly a problem for the implementer of a routine; it is
not bad for the user of a routine, and make in fact increase the options
available to them.
I personally am in favor of this sort of solution, because (1) it allows
using simple names for the routines, without surprise semantics; (2) it
allows dodgy cases to be errors by default, while allowing the specification
of other behaviors when needed (which is makes using the library safer, by
cutting surprise behavior); (3) it doesn't force users into using the
Position-returning Insert when they don't need the Position for a later
operation.
However, it does slightly complicate the specifications of the packages, and
it only got lukewarm support, so the decision has been left to the ARG.
Q6) Tucker has mentioned that he often has components in the key of a map
beyond the actual key participating ones. (This is similar to the behavior
of a set; if we had a Hashed_Set this probably would be less of an issue.)
For that to be effective, it would be necessary to change a key that is
already in a map. Currently, neither Replace_Element nor Insert_or_Replace
change the value of a key that is in the map; only the element is changed.
In order to get the sort of semantics that Tucker seems to be suggesting,
we'd need a way to change the value of a key. But such an operation would
potentially change the location of the element, so it could be fairly
expensive. Moreover, it would likely require allocation even if the hash
didn't change for the indefinite form of the container.
Finally, whether or not the key is replaced would seem to be another
(orthogonal) option for the Insert routine "6) Insert replaces the key and
the element when the key is already in the map; 7) Insert replaces the key,
leaving the element unchanged when the key is already in the map".
This complication doesn't seem worth it to me, but as it came up very late,
the entire ARG needs to discuss the issue.
****************************************************************
From: Randy Brukardt
Sent: Monday, October 3, 2004 xx:xx PM
Here is a listing of the AI-302 updates that I made beyond those discussed at
the meeting. These mostly have come up in more recent e-mail discussions.
1) Moved the AARM noted describing the now removed Assert to a (user) "Note".
The fact that instantiating Ada.Containers.Vectors with a type for which
Index_Type'Base'First = Index_Type'First will raise Constraint_Error is
obvious from reading the spec. carefully -- but it seems to me few users
will read that carefully. The note reads:
If Index_Type'Base'First = Index_Type'First an instantiation of
Ada.Containers.Vectors will raise Constraint_Error. A value below
Index_Type'First is required so that an empty vector has a meaningful
value of Last_Index.
2) Matt pointed out that Insert expects to be passed Index_Type'Last + 1 in
some circumstances. In particular, inserting nothing into a full vector is
not expected to raise an exception (to be similar to similar array
operations). We want to include that value in Extended_Index as long as it
actually exists. (Unlike the lower bound, we don't want to insist that this
value exists, because if we did, we wouldn't be able to instantiate this
generic with Positive or Natural.) So, we changed the declaration of
Extended_Index to:
subtype Extended_Index is
Index_Type'Base range Index_Type'First-1 .. Index_Type'Last +
Boolean'Pos (Index_Type'Base'Last > Index_Type'Last);
This declaration allows both special values (if they exist) and no others.
3) Replace and Exclude operations matching the ones in the Ordered_Sets were
added to the Generic_Keys generic, as it is odd that Delete was in there and
not the others. (The intent is that this package closely match Hashed_Maps
[and Ordered_Maps, if it ever is defined] - as many operations on keys in
Hashed_Maps should be represented here as possible.) Insert and Include were
omitted, as there could be no guarentee that the Key passed in matches the
one in the Element passed in. (We could check, of course, but that seems
like going too far; moreover, it's hard to imagine how these could be used.)
Replace simply doesn't worry about it; it is defined in terms of Replace
(see below), replacing the element referred to by the Key. Thus it works
similarly to Checked_Update_Element.
Replace_Element (Container, Cursor, New_Item) also has been added to the
Set itself, as there is not a Replace_Element for a set. This tries to
replace in place, but will do an insert/delete if necessary. [Should this
have a different name? The profile and semantics are a bit different than
the normal Replace_Element?]
4) The definition of *cursor* was buried in the introductory AARM text. I moved
that into the introductory paragraphs of A.18, so that it appears in the
normative text. Readers of the standard should be presented the basic model
of the containers.
5) The index forms of Element, Replace_Element, Query_Element, and
Update_Element took Index_Type'Base for some reason, but passing No_Index
raises Constraint_Error. So I changed these to Index_Type, so that the
specification doesn't allow No_Index to be passed.
6) Added an erroneous case for abuse of the Process procedure of Query_Element
and Update_Element. This usually looks like:
Execution also is erroneous if the called Process procedure of a call to
Query_Element or Update_Element executes an operation that causes the
Position cursor of Query_Element or Update_Element to become invalid.
For lists, maps, and sets, the only problem occurs if the element is deleted
directly, or if the container is finalized (via Unchecked_Deallocation).
Insertions and other Deletions don't matter, as the nodes are logically
separate.
For vectors, the rule also includes ambiguous cursors. An insert or delete to
the left of the cursor will move the elements; if the element is passed by
reference, that will clobber the element being operated on with unknown
effects. We don't want to require that optimization is off in Process
subprograms! The vector version also requires wording to cover the index
version of the routines.
I'd like to suggest that we consider adding a check that the element being
processed is not deleted by the Process procedure. This check requires only a
bit per node (or a short list of elements in process), and covers all of the
new dangerous cases for most of the containers. (Bad use of
Unchecked_Deallocation is hardly new to the containers, and Move will not
actually cause problems in practice, as the nodes are not changed, just the
container that they belong to.) Deleting yourself requires contortions (the
Process routine does not have a cursor to use for this operation), and, since
it damages the element parameter, the effects could be widespread. The check
also would prevent calling Update_Element on the same element again, which
would have different results depending on the parameter passing mode (and which
makes the check cheaper). The overhead of the check would only apply to the
various Deletes and Update_Element; no other routines would need to check. The
text would be:
If the Process procedure deletes the element designated by Cursor, or calls
Update_Element on Cursor, Program_Error is raised.
AARM Note: This check has to be done in the code for Delete and Update_Element,
of course.
Making vector Update_Element safe would also require checking for any
operations that would make the cursor ambigious. (That's a bounded error in
other cases.)
7) Matt had previously made a number of suggestions about improving/correcting
the examples. These have been integrated.
8) Delete for cursors does nothing if the cursor is No_Element for Lists, Maps,
and Sets. (Matt says this was intended to model the effect of
Unchecked_Deallocation.) Delete for cursors in Vectors, on the other hand,
raised Constraint_Error in this case. I changed the wording for Delete for
cursors in Vectors to be consistent with the other three.
(Note, however, that Delete for Keys is now defined to raise an exception if
the key is not found; the routine Exclude is defined not to raise an exception.
Thus, this is a bit inconsistent. It would be more consistent if Delete for
cursors for all containers raised an exception when given No_Element. This
doesn't seem particularly important to change, but it should be considered.)
9) Nick pointed out that Insert for Vectors was leaving the Out parameter
Position undefined if the length of the insertion was 0. I fixed that
wording to make it equal to Before.
10) Added an AARM note to the effect that when we say "unspecified" in this
clause (A.18), we don't mean "erroneous". If we meant "erroneous", we said
that. And included some ramifications of that (checking must not be suppressed;
don't create dangling pointers by assuming behavior of generic formals).
This intent should be written down somewhere, even if it is too complex and
error-prone to try to do so normatively.
11) Removed some implementation details from the !proposal section, which
should be a high-level description. (Thanks to Matt for picking those up.)
12) Ada.Strings.Unbounded.Hash needs to be Preelaborated, not Pure, as its
parent is only Preelaborated (not Pure).
13) Changed Set_Length's wording to:
If Length is larger than the capacity of Container, calls
Reserve_Capacity (Container, Length), then sets the length of the
Container to Length. If Length is greater than the original length of
Container, the added elements are empty elements.
so that this operation doesn't shrink the capacity of a vector. (It wasn't
right in the previous version, either, as it would have raised Constraint_Error
in that case.)
14) Changed the name of Ensure_Capacity to Reserve_Capacity, as "Reserve" is
the name the STL uses, and it seems more descriptive. (See e-mail.)
15) Wording was added to Iterate for each container to say that Program_Error
is raised if the Process routine calls an operation that will modify or reorder
the container. Each container needs slightly different wording for various
reasons (nodes can be reordered in Lists; rehashing in a Map would change the
order).
This decision grew out of a discussion between Matt and me as to what exactly
the passive iterator should allow.
We both agreed that trying to implement a passive iterator that could stand
insertions and deletions of elements was hard. Morevoer, if the user needs to
do that, they can use an active iterator (that is, a loop with explicit
cursors) to do so. So, we agreed that inserting or deleting elements from
within a passive iterator was bad, and there is no need or intent support it.
The main undecided issue is what to do if the user does indeed make a mistake
and insert or delete an element from the container during a passive iterator.
There seem to be 4 possibilities:
1) Specified results (it works in some specified way);
2) Unspecified results (it works, but what it does isn't specified);
3) Erroneous (anything goes);
4) Check for bad cases and raise an exception.
(1) is clearly too burdensome on the implementation, and besides, we don't
want it.
(2) would insure that the program wouldn't crash, but otherwise the results
wouldn't be portable.
(3) would allow anything, implementers could ignore the possibility.
(4) would be the most portable, but there are concerns about overhead.
I originally wrote (2) using the wording: "Which cursors are presented to
Process is unspecified if..." But that seems to be a burden on implementations
for little benefit.
I object to (3), because users *will* make this mistake, and likely
implementations of the iterators would have very bad effects. If the node that
the iterator was holding onto was deleted, it probably would be
Unchecked_Deallocated, the memory might be reused, and when the pointers are
walked, just about anything could happen.
(4) seemed to have too much overhead, but once we stopped trying to support any
insertion or deletion into the container, the cost became quite reasonable. All
the implementation of the check would need is a counter (8 bits probably is
enough) in each container. When an Iterate starts, the counter is incremented;
when it completes, the counter is decremented. Each of the operations on the
list of problem operations check that the counter is zero, raising
Program_Error if the counter is nonzero.
(We don't have to worry about tasking issues, as the container object is inside
of the Iterate call the entire time. If some other task makes a call during
that time, we have bad use of shared variables, and we don't care what happens.
In fact, what will happen is that Program_Error would be raised, which is
probably a good thing.)
That has very little overhead, because virtually all of the operations in
question allocate or deallocate memory, and thus are expensive anyway, an
additional compare and branch will have no visible impact on performance.
(Sorting and Merging are also expensive; Swap_Links and Splice are the only
exceptions.) Operations that don't modify the container don't need to make any
check.
This has the advantage of making passive iterators completely safe against
problems caused by what container operations are invoked in Process. (Yes,
calling Unchecked_Deallocation on the container could still cause problems, but
that is covered by other rules of the language -- and even it would raise
Program_Error.) It also means that uses of passive iterators are safely
portable (whereas active iterators could have problems if a dangling cursor was
used) -- which gives them a clear advantage.
This check is another one that could be dropped in an "unchecked" container.
Thus, I've worded this check into all of the passive iterators.
The wording enumerates the reasons that a check is needed:
"if Process attempts to insert or delete elements into Container; or"
"modifies Container" would be too broad, as it could include replacing the
value of an element.
We need also to talk about finalization and about calling Move, as the current
wording only talks about cursors being passed to operations, not something that
happens *during* an operation. Moreover, once we decide to have a check,
including that check in the body of Finalize and Move is not difficult.
****************************************************************
Questions? Ask the ACAA Technical Agent