Version 1.1 of acs/ac-00257.txt

Unformatted version of acs/ac-00257.txt version 1.1
Other versions for file acs/ac-00257.txt

!standard A.18          14-02-09 AC95-00257/00
!class confirmation 14-02-09
!status received no action 14-02-09
!status received 13-11-12
!subject Thoughts on containers
!summary
!appendix

From: Jeff Cousins
Sent: Tuesday, November 12, 2013  12:28 AM

Hello John,

Just some thoughts about what would be useful to add to your book with regards
to Containers.

Maybe for the first example of a Container add a reminder that if "=" is
undefined then the default is used.

For Doubly_Linked_Lists and Multiway_Trees point out that cursors have the
important property that they stay pointing to the same element even if other
elements are inserted or deleted earlier in the list/tree.

On the other hand, for Vectors there is the concept of "ambiguous" - if other
elements are inserted or deleted earlier in the vector then the cursor might
still be pointing to the same element (as for a list or tree), or it might be if
that it previously pointed to the old nth element it now points to the new nth
element (which is what happens with GNAT, perhaps not too surprising for an
indexable structure), or it might point anywhere.  I think that in practice if
cursors are to be used with vectors they will often need to be preceded by a
Find, and if there is multi-tasking then the Find and the operation using the
cursor will need encapsulating within a single protected operation.

For Swap and Swap_Links maybe contrast that with the former the value of the
element pointed to by the cursor changes, whereas for the latter the value of
the element pointed to by the cursor is unchanged.

Because Maps don't allow a new node to be inserted if a node with an equivalent
key already exists, keys have the useful property that they are unique.
(Probably the main thing we use Maps for is to remove duplicates from a list (in
the general sense, I don't mean Doubly_Linked_List)).

Point out that for Hashed Maps the order is pseudo-random (depending on the
hash), there will be no obvious relationship to either the order of the keys or
the order of insertion.

Point out that for Sets, updating an element using Replace_Element may
immediately re-order the set, so if the user is attempting to step through a set
using First then repeated Nexts to step the cursor, it may skip over (or repeat)
elements.  So for a set containing 12, 23, 34, 45, ... and trying to double all
the values, doubling 12 gives 24, 24 is after 23 so the first element becomes
the second element, so Next will give a cursor pointing to 34, skipping over the
23.  Quite interesting to watch for test code that you don't have to actually
deliver!  (Trying to double a set containing 12, 24, 36, 48, ... would of course
soon fail the uniqueness test).  It's curious that Update_Element was banned but
Replace_Element still allowed.  It's a bit dubious whether cursors are of any
use for sets.

For Indefinite Containers, Update_Element is not terribly useful if the element
is a string, say, rather than a nested container - the "Process" is supplied by
the user and subject to the usual run-time checks, so if it tries to update the
element to something of a different length it will fail the length check.

So...

if you had a definite Container of numbers one could iterate around doubling all
the numbers using Update_Element,

if you had an indefinite Container of animal names you could update each
character of each name to the next letter of the alphabet using Update_Element,
but...
   you couldn't update all of the animal names to that of their predators/prey
   using Update_Element unless the lengths of the names matched, you would have
   to use Element to read then Replace_Element to write.

***************************************************************

From: Tucker Taft
Sent: Wednesday, November 20, 2013  11:50 AM

I would not recommend using "ambiguous" cursors at all.  More generally, before
doing an operation that "tampers with cursors" the user would be wise to null
out any cursors that happen to be "hanging around."

***************************************************************

From: Jeff Cousins
Sent: Friday, November 22, 2013  9:02 AM

> I would not recommend using "ambiguous" cursors at all.

Quite.  But in a multi-tasking system another task could "get in" between the
first task using Find, say, to obtain a cursor to an item, and the first task
then using the cursor to do something with the item, and the second task do an
insert or delete resulting in the cursor becoming ambiguous in-between.  Hence
the suggestion for a health warning.

***************************************************************

From: Geert Bosch
Sent: Monday, November 25, 2013  12:09 PM

> I think that in practice if cursors are to be used with vectors they will
> often need to be preceded by a Find, and if there is multi-tasking then the
> Find and the operation using the cursor will need encapsulating within a
> single protected operation.

Isn't it the case that any such unsynchronized use would be erroneous anyway?
That is, if you may access a container from two different tasks, they need to
synchronize anyway to avoid erroneous execution.

So, ALL operations (including read-only operations) would need to be implemented
as protected operations.

***************************************************************

From: Jeff Cousins
Sent: Monday, November 25, 2013  4:44 PM

I think many users wish that all Containers had been protected, not just the new
queues.  But Vectors do have that extra level of risk, compared with Doubly
Linked Lists and Multiway Trees, of a cursor becoming ambiguous, requiring
exclusion beyond that of individual subprograms.  And even in a single task
system a programmer could easily accidentally make a cursor ambiguous.

***************************************************************

From: Tucker Taft
Sent: Monday, November 21, 2013  8:50 PM

> I think many users wish that all Containers had been protected, not just the
> new queues.

This would introduce significant overhead for what at least should be a rare
situation. One historical note about Java was that the original "HashTable"
class in Java synchronized all access, but it was very slow, so the next release
of Java introduced an unsynchronized "HashMap" class and effectively made
HashTable obsolescent.

I could see adding (yet) another set of containers, such as Protected_...,
though this is beginning to turn into a nasty combinatorial situation
(indefinite, bounded, protected, ...).  Alternatively, you could just create
your own Protected_blah with just the operations you need exposed, and implement
it with the unsynchronized container of your choice.

> But Vectors do have that extra level of risk, compared with Doubly
> Linked Lists and Multiway Trees, of a cursor becoming ambiguous,
> requiring exclusion beyond that of individual subprograms.  And even
> in a single task system a programmer could easily accidentally make a cursor
> ambiguous.

Using cursors with Vectors seems like it is hardly worth it.  It seems simpler
and safer to just use indices.

***************************************************************

From: Randy Brukardt
Sent: Monday, November 25, 2013  9:15 PM

> I could see adding (yet) another set of containers, such as
> Protected_..., though this is beginning to turn into a nasty
> combinatorial situation (indefinite, bounded, protected, ...).
> Alternatively, you could just create your own Protected_blah with just
> the operations you need exposed, and implement it with the
> unsynchronized container of your choice.

Of course, we considered that several times in the past, but there are two
problems: one is that the sort of locking needed depends on the type of
multi-access. For many uses (where the structure is built up initially and just
read by multiple tasks), the locking could be much simpler. More importantly,
protected versions are problematical for any of the routines that use call-backs
and/or tampering. It quite possible that the protected package would be
reentered during such an operation; that either causes deadlock (bad) or allows
race conditions (bad) or much more complex locking (more overhead - and it's not
clear that this actually works).

Probably a protected container would have to do without those operations -- but
of course that means no iterators or accessors.

So we have a combinatorial explosion (especially if one has protected for
reading and protected for all operations), and a loss of functionality or
safety. So we gave up and only defined the queues. Perhaps that was punting on
2nd down, but those issues haven't gone anywhere.

***************************************************************

From: Brad Moore
Sent: Tuesday, November 26, 2013  9:43 AM

I think what we have now is the right mix.
Even though the containers do not have protection for multi-access it can still
be done.

For instance, I have examples where I create an
Ada.Containers.Doubly-Linked-Lists in parallel or iterate through a Vector in
parallel.

It should in theory be possible to iterate through any of the non-protected
containers in parallel.

Had there been protection around these containers, it would have probably ruled
out this parallelism, as there would have been too much synchronization
occurring between the tasks/threads.

***************************************************************

Questions? Ask the ACAA Technical Agent