Version 1.4 of ai05s/ai05-0021-1.txt

Unformatted version of ai05s/ai05-0021-1.txt version 1.4
Other versions for file ai05s/ai05-0021-1.txt

!standard A.18.3(102/2)          07-05-17 AI05-0021-1/03
!standard A.18.2(139/2)
!standard A.18.3(83/2)
!standard A.18.4(38/2)
!standard A.18.7(36/2)
!standard A.18.2(237/2)
!standard A.18.3(151/2)
!class binding interpretation 06-11-10
!status ARG Approved 10-0-1 06-11-18
!status work item 06-11-10
!status received 06-08-31
!priority Medium
!difficulty Easy
!qualifier Error
!subject Issues with containers
!summary
(1) The equivalence given for Delete_First is incorrect because it attempts to match the result of the function call First(Container) with the in out parameter Position. Nevertheless, the required behaviour is as expected and is spelled out in detail.
(2) Query_Element raises Program_Error if the elements of the container (vector, list, map, or set) that contains the element designated by the cursor are tampered with.
(3) Merge raises Program_Error if Source and Target are the same and non-empty.
!question
(1) A.18.3(102/2) says "Equivalent to Delete (Container, First (Container), Count)", but the second parameter of Delete has mode "in out", and thus cannot be passed a function call. Should this be fixed? (Yes.)
(2) A.18.2(139/2) has "Program_Error is propagated if Process.all tampers with the elements of Container". But there is no Container in sight; what container is intended?
(3) A.18.2(237/2) says for a Merge that "afterwards, Target contains the union of the elements that were initially in Source and Target; Source is left empty". What happens if Source and Target are the same? It's not possible in that case to have the prescribed result.
!recommendation
(See summary.)
!wording
(1) Replace A.18.3(102/2) with words:
If Container is empty, Constraint_Error is propagated. Otherwise, Delete_First removes (from Container) Count elements starting at the first element in Container (or all of the elements if there are fewer than Count elements in Container).
(2) Modify A.18.2(139/2) as follows:
.. Program_Error is propagated if Process.all tampers with the elements of {the vector that contains the element designated by Cursor}[Container].
Make similar modifications for A.18.3(83/2), A.18.4(38/2), and A.18.7(36/2).
(3) Modify A.18.2(237/2) as follows: {If Source is empty, then Merge does nothing. If Source and Target are the same non-empty container object, then Program_Error is propagated. Otherwise, }Merge removes elements from Source and inserts them into Target; afterwards, Target contains the union of the elements that were initially in Source and Target; Source is left empty. If Target and Source are initially sorted smallest first, then Target is ordered smallest first as determined by the generic formal "<" operator; otherwise, the order of elements in Target is unspecified. Any exception raised during evaluation of "<" is propagated.
The same change needs to be made to A.18.3(151/2).
!discussion
(1) An equivalence that is not quite equivalent is garbage; it's better to simply describe what we want in English.
(2) All of the Query_Element routines have this problem.
(3) With this change, Merge raises Program_Error in precisely those cases where the invariant on the result cannot be preserved. (It's not possible for a single object to be empty and non-empty at the same time.) The case in question (merging with yourself) is weird enough that it is not worthwhile to make programmers worry about different invariants for (only) that case.
!corrigendum A.18.2(139/2)
Replace the paragraph:
If Position equals No_Element, then Constraint_Error is propagated. Otherwise, Query_Element calls Process.all with the element designated by Position as the argument. Program_Error is propagated if Process.all tampers with the elements of Container. Any exception raised by Process.all is propagated.
by:
If Position equals No_Element, then Constraint_Error is propagated. Otherwise, Query_Element calls Process.all with the element designated by Position as the argument. Program_Error is propagated if Process.all tampers with the elements of the vector that contains the element designated by Cursor. Any exception raised by Process.all is propagated.
!corrigendum A.18.2(237/2)
Replace the paragraph:
Merge removes elements from Source and inserts them into Target; afterwards, Target contains the union of the elements that were initially in Source and Target; Source is left empty. If Target and Source are initially sorted smallest first, then Target is ordered smallest first as determined by the generic formal "<" operator; otherwise, the order of elements in Target is unspecified. Any exception raised during evaluation of "<" is propagated.
by:
If Source is empty, then Merge does nothing. If Source and Target are the same non-empty container object, then Program_Error is propagated. Otherwise, Merge removes elements from Source and inserts them into Target; afterwards, Target contains the union of the elements that were initially in Source and Target; Source is left empty. If Target and Source are initially sorted smallest first, then Target is ordered smallest first as determined by the generic formal "<" operator; otherwise, the order of elements in Target is unspecified. Any exception raised during evaluation of "<" is propagated.
!corrigendum A.18.3(83/2)
Replace the paragraph:
If Position equals No_Element, then Constraint_Error is propagated. Otherwise, Query_Element calls Process.all with the element designated by Position as the argument. Program_Error is propagated if Process.all tampers with the elements of Container. Any exception raised by Process.all is propagated.
by:
If Position equals No_Element, then Constraint_Error is propagated. Otherwise, Query_Element calls Process.all with the element designated by Position as the argument. Program_Error is propagated if Process.all tampers with the elements of the list that contains the element designated by Cursor. Any exception raised by Process.all is propagated.
!corrigendum A.18.3(102/2)
Replace the paragraph:
Equivalent to Delete (Container, First (Container), Count).
by:
If Container is empty, Constraint_Error is propagated. Otherwise, Delete_First removes (from Container) Count elements starting at the first element in Container (or all of the elements if there are fewer than Count elements in Container).
!corrigendum A.18.3(151/2)
Replace the paragraph:
Merge removes elements from Source and inserts them into Target; afterwards, Target contains the union of the elements that were initially in Source and Target; Source is left empty. If Target and Source are initially sorted smallest first, then Target is ordered smallest first as determined by the generic formal "<" operator; otherwise, the order of elements in Target is unspecified. Any exception raised during evaluation of "<" is propagated.
by:
If Source is empty, then Merge does nothing. If Source and Target are the same non-empty container object, then Program_Error is propagated. Otherwise, Merge removes elements from Source and inserts them into Target; afterwards, Target contains the union of the elements that were initially in Source and Target; Source is left empty. If Target and Source are initially sorted smallest first, then Target is ordered smallest first as determined by the generic formal "<" operator; otherwise, the order of elements in Target is unspecified. Any exception raised during evaluation of "<" is propagated.
!corrigendum A.18.4(38/2)
Replace the paragraph:
If Position equals No_Element, then Constraint_Error is propagated. Otherwise, Query_Element calls Process.all with the key and element from the node designated by Position as the arguments. Program_Error is propagated if Process.all tampers with the elements of Container. Any exception raised by Process.all is propagated.
by:
If Position equals No_Element, then Constraint_Error is propagated. Otherwise, Query_Element calls Process.all with the key and element from the node designated by Position as the arguments. Program_Error is propagated if Process.all tampers with the elements of the map that contains the element designated by Cursor. Any exception raised by Process.all is propagated.
!corrigendum A.18.7(36/2)
Replace the paragraph:
If Position equals No_Element, then Constraint_Error is propagated. Otherwise, Query_Element calls Process.all with the element designated by Position as the argument. Program_Error is propagated if Process.all tampers with the elements of Container. Any exception raised by Process.all is propagated.
by:
If Position equals No_Element, then Constraint_Error is propagated. Otherwise, Query_Element calls Process.all with the element designated by Position as the argument. Program_Error is propagated if Process.all tampers with the elements of the set that contains the element designated by Cursor. Any exception raised by Process.all is propagated.
!ACATS test
!appendix

From: Pascal Leroy
Date: Thursday, August 31, 2006  1:03 AM

I'm sure that you'll like the buglet below, uncovered by the engineer who works
on the implementation of the Containers at IBM.

---

At least for now, I'm implementing the various suprograms in Ada.Containers
which are specified to be equivalent to a call to another suprogram with
specific parameters just as stated in the manual.  This has unconvered the
following problem for Delete_First (Delete_Last) in
Ada.Containers.Doubly_Linked_Lists (RM05 A.18.3 (102)):

procedure Delete_First (Container : in out List; Count :  in Count_Type := 1); 
{AI95-00302-03} Equivalent to Delete (Container, First (Container), Count). 

and the spec for Delete looks like 

procedure Delete (Container : in out List; Position : in out Cursor; Count : in Count_Type := 1); 

Since when can a function result be used as an in out mode parameter? 

****************************************************************

From: Randy Brukardt
Date: Friday, September  1, 2006 11:40 PM

I know about that, or at least some similar cases. I made the executive decision
to ignore the problem, because the alternatives aren't pretty. Surely no one
wants to see:
 
Equivalent to 
    declare
       Temp : Cursor := First (Container);
   begin
       Delete (Container, Temp, Count);
   end;
 
The alternative is to dump the equivalence entirely. That wouldn't be
horrible in this case:
 
If Container is empty, Constraint_Error is propagated. Otherwise, Delete_First
removes (from Container) Count elements starting at the first element in Container
(or all of the elements if there are fewer than Count elements in Container).
 
But it's more words, more interpretation by implementors and users alike, and I'm
not certain if all of the phrases (like "first element") are well-defined.

****************************************************************

From: Pascal Leroy
Date: Monday, September 4, 2006  4:41 AM

If we say "equivalent" we must mean "equivalent".  I hate it when the RM says
"foo is equivalent to bar" and when you scratch the surface you see that it
means "foo is equivalent to bar unless it contains tasks or has access
discriminants or...".  So this rule should be written in English.  And yes,
it's going to be a bit longer, but who cares given the size of the containers.

****************************************************************

From: Pascal Leroy
Date: Monday, August 14, 2006  3:35 AM

A.18.2(139) has "Program_Error is propagated if Process.all tampers with
the elements of Container".  Nice try, but there is no container in sight.
It should say "the container designated by Cursor" or something like that.

****************************************************************

From: Pascal Leroy
Date: Thursday, September 28, 2006  1:26 AM

The containers have an operation named Merge with profiles like:

	procedure Merge (Target  : in out Vector;
	                 Source  : in out Vector);

The description of the semantics says that "afterwards, Target contains
the union of the elements that were initially in Source and Target; Source
is left empty" (see A.18.2(236/2-237/2)).

However if Merge is called with the same container for Target and Source:

	Merge (V, V);

It is obviously not possible to comply with the above post-condition.
What should happen?  Should P_E be raised?  Or should the operation be a
no-op?

****************************************************************

From: Tucker Taft
Date: Thursday, September 28, 2006  8:28 AM

Another option would be a bounded error.  Since it is already
a bounded error to call Merge on unsorted vectors, it seems
reasonable to make this presumably less-likely problem also
a bounded error.  This would allow Program_Error to be raised,
but also allow the operation to be a no-op, or to end up with
Target and Source both twice as long, or both empty.

****************************************************************

From: Pascal Leroy
Date: Thursday, September 28, 2006  9:13 AM

A bounded error is a bad idea here.  I'm not sure if this problem is
likely or not, but at any rate a key difference with a merge on unsorted
vectors is that checking if the vectors are sorted defeats the purpose,
while checking if the containers are identical is easy (containers are
tagged and therefore passed by reference).

So I don't see why we would want to have a non-portability here, and
Matt's proposal seems fine to me.  I could also go with P_E if someone had
a strong preference for that.  Ending up with both containers empty one
the other hand is not acceptable as it would mean dropping elements on the
floor, and we try hard to avoid doing that.

****************************************************************

From: Matthew Heaney
Date: Thursday, September 28, 2006  9:45 AM

> So I don't see why we would want to have a non-portability here, and Matt's
> proposal seems fine to me.  I could also go with P_E if someone had a strong
> preference for that.

I answer questions like this by asking myself "What would Dijkstra do?".

In his reviews of the red, blue, green, and yellow candidate language
proposals, he noted that one of the languages (I don't remember which color it
was) didn't allow you to declare a zero-length array.  He thought this was
ridiculous! His argument was that Western Civilization required centuries to
invent the concept of zero and to incorporate it into our number system.  And
his conclusion was that the language designer who disallowed zero-length arrays
hadn't learned this lesson.

So the merge issue reminds me of the zero-length array issue. The argument that
Merge should raise PE if Target and Source are the same is equivalent to the
argument that zero-length arrays shouldn't be allowed.  And my answer is the
same as Dijkstra's: you always need to allow the 0 case.  The graceful behavior
when Target and Source are the same is to do nothing.

One of the things we did during the container library design was to be able to
gracefully handle all weird zero cases, e.g. appending 0 items to the end of a
vector that was already at its maximum capacity, etc.  We should follow that
same philosophy here, and allow Target and Source to be the same object (and
for Merge to do nothing when that's the case).  This also makes merge
consistent with the behavior for the union of sets.

> Ending up with both containers empty one the other hand is not acceptable as
> it would mean dropping elements on the floor, and we try hard to avoid doing
> that.

Right, but in this case clearing both containers would be contrary to spirit of
the merge operation (it would have just the opposite semantics, in fact).  The
intent is to *move* elements from one container to another.  The fact that the
source list is empty following the operation doesn't mean the container has
been *cleared*, in the sense that its elements have been destroyed.  Its
elements have been *moved* somewhere else, in this case to the target
container.  Merge never destroys elements.  Following the operation target has
all of the elements from both the target and source, and that's true (er,
should be true) irrespective of whether target and source happen to denote the
same object.  (You always need a zero...)

So my Official Position is that any ambiguity in the RM language should be
removed, by saying the if Target and Source denote the same object then Merge
does nothing.

****************************************************************
From: Tucker Taft
Date: Thursday, September 28, 2006  9:36 AM

> A bounded error is a bad idea here.  I'm not sure if this problem is
> likely or not, but at any rate a key difference with a merge on unsorted
> vectors is that checking if the vectors are sorted defeats the purpose,
> while checking if the containers are identical is easy (containers are
> tagged and therefore passed by reference).

Good point.  The check is pretty trivial.  I don't generally
like taking 'Address of parameters, but as you point out these
are tagged, so 'Address is well-defined and can be used reliably
to determine whether the actuals are in fact the same container.

> So I don't see why we would want to have a non-portability here, and
> Matt's proposal seems fine to me.

I seem to have missed Matt's proposal.  Can you resend it?

 > ... I could also go with P_E if someone had
> a strong preference for that.  

I would recommend Program_Error since in my mind it seems
like a clear bug.  Perhaps Matt had an example, but I can't
think of a case where you would end up with Merge(X,X) as
a result of some "degenerate" case of sorting or merging.
If the only time this could arise is because of a mistake
on the programmer's part, then it would make sense to
raise Program_Error.  Since it violates the general expectations
that Length(Target'out) = Length(Target'in) + Length(Source'in)
and Length(Source'out) = 0, it seems like allowing it to
act as a no-op would likely propagate problems further and
break lots of other things, not doing the programmer any
favors.

> ... Ending up with both containers empty one
> the other hand is not acceptable as it would mean dropping elements on the
> floor, and we try hard to avoid doing that.

You've convinced me that bounded error is unwise,
so now I favor Program_Error.

****************************************************************

From: Matthew Heaney
Date: Thursday, September 28, 2006  10:32 AM

> Good point.  The check is pretty trivial.  I don't generally
> like taking 'Address of parameters, but as you point out these
> are tagged, so 'Address is well-defined and can be used reliably
> to determine whether the actuals are in fact the same container.

Right, object identity for tagged objects is determined by taking the 'Address
of the parameter.  (I realize that "identity" for non-limited types is a
delicate issue, but I am ignoring the limited vs. non-limited distinction for
now.)

There are several cases where the RM requires you to handle the case when
parameters denote the same container object, and the gcc implementation of the
container library does this using 'Address of the parameters.

> > So I don't see why we would want to have a non-portability here, and
> > Matt's proposal seems fine to me.
> 
> I seem to have missed Matt's proposal.  Can you resend it?

Robert sent Pascal's email to me (I'm not on the ARG list), and I responded to
both Pascal and Robert.  I cc'd the arg list so maybe you have it already.
(Let me know if you don't.)

I give a more substantive discussion of why I think PE is not proper in my
second response, citing EWD as my source.

> I would recommend Program_Error since in my mind it seems like a clear bug.

Great minds think, er, different.  My argument is that it's not a bug.  Rather
it's just the terminus of considering all of the objects that can be passed as
the Source parameter.  When Source happens to denote the same object as Target,
then the work of merge is already done and so it does nothing else.

> Perhaps Matt had an example, but I can't
> think of a case where you would end up with Merge(X,X) as
> a result of some "degenerate" case of sorting or merging.

But you don't necessarily know which objects happen to get passed to
merge. Requiring developers to keep track of objects ("is object target the
same as object source"?) is only adding complexity.

Saying that PE should be raised is just going to force developers to say:

  if Target'Address /= Source'Address then
    Target.Merge (Source);
  end if;

In other words, the postcondition is exactly the same (when T is the same as S
then don't change the state of the program).  So let's save them the trouble,
and move that identity test into Merge itself.

> If the only time this could arise is because of a mistake
> on the programmer's part, 

It's not a mistake, it's just mitigation of complexity.  You're asking the
developer here to assume this extra burden of keeping track of whether objects
are identical.  

The library should handle limiting cases gracefully, on the programmers'
behalf.  That's exactly what we did throughout the library, say by allowing 0
items to be inserted into a full vector.  

If I have to worry about all the limiting cases myself, then using the library
is like walking on eggshells.  So I have to do everything very gingerly, in
abject fear that some limiting case I forgot is going to trigger PE.  Testing
doesn't give me confidence that I've properly handled all the limiting cases.

So let the library assume that burden, because the developer has other things
to worry about.  Sometimes the best thing to do is ... nothing.

> then it would make sense to
> raise Program_Error.  

We should weaken the precondition to include the limiting case (meaning that
Target is same as Source should be allowed, and Merge should do nothing).

> Since it violates the general expectations
> that Length(Target'out) = Length(Target'in) + Length(Source'in)
> and Length(Source'out) = 0, 

You're looking at this too abstractly.  Merge is basically a special kind of
Move operation.  What Merge does is to move (not copy and not destroy) elements
from Source to Target.  If Target is the same object as Source then it already
has all of the elements from Source and do there's nothing else for Merge to
do.

> it seems like allowing it to
> act as a no-op would likely propagate problems further and
> break lots of other things, not doing the programmer any
> favors.

I would argue just the opposite.  If Merge were to raise PE here then all
you're doing is creating a trap door for an unhandled exception. The limiting
case here should be handled the same way we handle limiting cases everywhere;
that is, by doing nothing.

> You've convinced me that bounded error is unwise,
> so now I favor Program_Error.

Well hopefully I have convinced you that PE is unwise too...

****************************************************************

From: Stephen Michell
Date: Thursday, September 28, 2006  11:43 AM

[Responding to the original question - Ed.]

Good point. I vote for idempotent.

****************************************************************

From: Robert Dewar
Date: Thursday, September 28, 2006  2:50 PM

> So I don't see why we would want to have a non-portability here, and
> Matt's proposal seems fine to me.  I could also go with P_E if someone had
> a strong preference for that.  Ending up with both containers empty one
> the other hand is not acceptable as it would mean dropping elements on the
> floor, and we try hard to avoid doing that.

I also support Matt's proposal. The only justification for a bounded
error is that the check would be expensive, but that is not the case here.

****************************************************************

From: Matthew Heaney
Date: Thursday, September 28, 2006  3:44 PM

> Sort/Merge is a bit tricky, and I can see that a typical
> *bug* might be to confuse an input with an output, but
> I can't see why you would do it intentionally, or how
> it would somehow simplify the algorithm.

My issue with this line of argument is that it's totally open-ended.  You're
saying "Well I can't imagine why someone would want to do that..."  But just
because we can't see a reason for someone to do it doesn't therefore mean that
it should be prohibited.  It might just be a poverty of our imagination.  

Indeed the whole building-blocks approach to design is precisely to get out of
developer's way so so he can do stuff we didn't think of.  That's called
flexibility and it's a good thing.

The more general issue is whether we're solving a real problem or an imaginary
one.  It would be one thing if, say, a user attempted to delete an element from
a container using a cursor that designated an element in a different container.
It would be disaster if the node were unlinked from the one container but the
element count from the other container were decremented.  We correctly raise PE
in that case because the carrying out the action would cause definite harm (in
this case because representation invariants would be violated).

But the merge case is different.  It's not a matter of a container becoming
corrupted.  We're guessing that something is broken with the user's algorithm,
and raising PE to state that we don't like what he's doing.  I would argue,
when we're wearing our library design hats, that detecting errors in some
higher-level algorithm is not the business we should be in.  Our responsibility
as library designers is to ensure that containers can never get broken, so that
algorithm designers only have to worry about correctness of their algorithms.  

It might very well be the case that passing the same object as the source and
target parameters is a bug in the algorithm of the library user, but from our
narrow perspective as library writers we can't know that.  I am inclined to
give the user the benefit of the doubt and assume that he knows what he's
doing.  Perhaps what he's doing is wacky (I don't happen to think so, but
Tucker disagrees), but it's not our business as library writers to judge.

I'm not sure whether Tucker's argument even makes any sense.  Suppose we assume
that passing the same object as source and target indicates a problem (in the
higher-level algorithm).  So we raise PE to alert the user.  However, all we've
done is to push the problem back one level: how do we know that raising PE
doesn't cause some other kind of problem?  We're in no position to know what's
worse: allowing target and source to be the same, and doing nothing; or, not
allowing target and source to be the same, and raising PE.  The issue is that
we raise PE because we *think* there's a problem with the algorithm.  But if
we're wrong (and doing nothing would have a benign effect on the algorithm, as
I would argue) then the cure is worse than the disease, since we *know* that PE
would cause problems.  (Or at least require action by the developer.  PE is
rather draconian since it means "this program is meaningless," and we require
that developers modify their programs to eliminate that exception.)

P. S.  I just grep'd the STL source in my development environment, and this is what it
looks like:

  void merge(_Myt& _Right)
  {       // merge in elements from _Right, both ordered by operator<
       if (&_Right != this)
       {       // safe to merge, do it
           iterator _First1 = begin(), _Last1 = end();
           iterator _First2 = _Right.begin(), _Last2 = _Right.end();

           while (_First1 != _Last1 && _First2 != _Last2)
               if (*_First2 < *_First1)
                       {       // splice in an element from _Right
                       iterator _Mid2 = _First2;
                       _Splice(_First1, _Right, _First2, ++_Mid2, 1);
                       _First2 = _Mid2;
                       }
               else
                       ++_First1;

           if (_First2 != _Last2)
                _Splice(_Last1, _Right, _First2, _Last2,
                        _Right._Mysize);  // splice remainder of _Right
       }
   }


In other words, merge does ... nothing if the target and source denote the same
object.  So great minds do think alike after all!!!

****************************************************************

From: Robert I. Eachus
Date: Thursday, September 28, 2006  4:13 PM

> I also support Matt's proposal. The only justification for a bounded 
> error is that the check would be expensive, but that is not the case here. 

Ditto.  Tucker's position is that Merging a Container with itself should
not come up. However, right now we have a very well diffined case, in that
Merging any two empty Containers 'works' (for some definition of work ;-).
Also, in either case, it is somewhat trivial for a user to write a wrapper
function if he wants/needs different semantics. For 90% or more of users
there will be no need to do this.  So the question becomes whether one or
the other case is more likely to occur in real code.

As a strawman consider a situation where you have an array of keys and
you want to do an insertion of items into a set of per key Containers,
then later merge some of the keys.  There are three choices at the margin.
The first is to have an array of access values initialized to Null. The
second is to have an array of Containers intialized to the same (empty)
Container. (If you do this, you have to do a check when putting the first
item in an array entry and create a new container.)  The last choice is
to initialize the list to different empty Containers.

For a small list of keys the last choice will almost certainly be the
most efficient approach.  But where there are lots of keys, say you are
sorting by 5-digit Zip Code, creating lots of unused Containers will probably
result in significantly worse performance.  (Notice that this is definitely
a real case, given the Post Office rules for third class bulk mailings.
Some groups will need to be merged, and the right choices will probably not
known until the addresses have been sorted by Zip Codes.)

So is it appropriate to deprecate the middle alternative? I don't think so.
It is obviously an unusual set of circumstances, and the extra code to make
the the middle case work with a Merge that raises Program_Error is a couple
of lines. You could try to argue that it could prevent an error when the
programmer forgets to create new Containers when appropriate. But the error
will occur well before Merge is ever called...

****************************************************************

From: Randy Brukardt
Date: Thursday, September 28, 2006  4:47 PM

I don't have much an opinion on the issue at hand (just that it is decided),
but I think Matt is (as often seems to happen) has lost sight of what Ada is
about.

He said:

> Indeed the whole building-blocks approach to design is precisely to get out of
> developer's way so so he can do stuff we didn't think of.  That's called
> flexibility and it's a good thing.

True enough, but that has to be balanced with Ada's desire to detect errors
early. Surely, for every check in Ada, there is some case where you might
actually want some specific behavior. But the checks exist because we (the
designers) believe that it is more likely that the user made a mistake
rather than having done something intended. Think about integer overflow
checks, for instance -- certainly, we could have decided what happens
without raising an exception (C does that), but it was decided it was more
likely to be a bug than intended. And surely C is more flexible than Ada in
this respect, but a C program is also a lot more likely to get a bogus
answer on the margins than an Ada program.

> The more general issue is whether we're solving a real problem or an imaginary
> one.  It would be one thing if, say, a user attempted to delete an element from
> a container using a cursor that designated an element in a different container.
> It would be disaster if the node were unlinked from the one container but the
> element count from the other container were decremented.  We correctly raise PE
> in that case because the carrying out the action would cause definite harm (in
> this case because representation invariants would be violated).
>
> But the merge case is different.  It's not a matter of a container becoming
> corrupted.  We're guessing that something is broken with the user's algorithm,
> and raising PE to state that we don't like what he's doing.  I would argue,
> when we're wearing our library design hats, that detecting errors in some
> higher-level algorithm is not the business we should be in.

I strongly disagree with this statement. We should be in the business of
detecting all errors that can reasonably be detected. If that means that
some unusual cases require more work by the programmer, so be it. (Of
course, we shouldn't take this too far -- it's clearly a balancing act.)

> Our responsibility as library designers is to ensure that containers can never get
> broken, so that algorithm designers only have to worry about correctness of their
> algorithms.

Surely this is important, but if we can detect errors in the higher-level
use of the library, that is should be done as well.

...
> I'm not sure whether Tucker's argument even makes any sense.  Suppose we assume
> that passing the same object as source and target indicates a problem (in the
> higher-level algorithm).  So we raise PE to alert the user.  However, all we've
> done is to push the problem back one level: how do we know that raising PE
> doesn't cause some other kind of problem?

Of course it causes other problems, that's the idea -- usually it causes the
program to crash or reset, and that will hopefully show up in testing
(leading to the bug getting fixed, rather than causing trouble after
deployment, as would happen if the problem is undetected).

...
> P. S.  I just grep'd the STL source in my development
> environment, and this is what it
> looks like:
...
> In other words, merge does ... nothing if the target and source denote the same
> object.  So great minds do think alike after all!!!

Well, you had almost convinced me, but if the C++ folks do it, it must be
wrong. ;-) Seriously, so far as I know, they make little attempt to detect
errors in the STL (nothing beyond the most obvious ones - most misuses are
erroneous), so I don't think that proves anything much at all (other than
that they didn't want to corrupt the container in this case).

****************************************************************

From: Tucker Taft
Date: Thursday, September 28, 2006  5:04 PM

> As a strawman consider a situation where you have an array of keys and 
> you want to do an insertion of items into a set of per key Containers, 
> then later merge some of the keys.  There are three choices at the 
> margin.  The first is to have an array of access values initialized to 
> Null.  The second is to have an array of Containers intialized to the 
> /same/ (empty) Container.  (If you do this, you have to do a check when 
> putting the first item in an array entry and create a new container.)  
> The last choice is to initialize the list to different empty Containers.

I'm not sure I understand your setup here.  If you have an array
of *Containers* then you don't need to initialize them to anything.
They default initialize to being empty.  It is meaningless to talk
about initializing two different containers to the "same" Container
value.  Container objects are either the same object, or completely
independent (at least from the client perspective).

On the other hand, if you have an array of access values, I presume you are
initializing the access values to something non-null so you can
*read* from them (but *not* write to them) without first
checking for null.  But now we come to the mystery.  How do
you plan to merge these?  You can't merge a non-empty container
into this "shared" empty container, because then suddenly all of
the formerly empty elements of the array would become non-empty.
So that means that at least you have to check whether the
"source" of any merge is non-empty, at which point if you discover it
*is* empty, you presumably have no need to do the merge at all.

> ...
> So is it appropriate to deprecate the middle alternative?   I don't 
> think so.  It is obviously an unusual set of circumstances, and the 
> extra code to make the the middle case  work with a Merge that raises 
> Program_Error is a couple of lines.  You could try to argue that it 
> could prevent an error when the programmer forgets to create new 
> Containers when appropriate.  But the error will occur well before Merge 
> is ever called...

I'm not sure why you say that.  If anything, this seems like
a perfect example of how the Program_Error could catch a bug.

I suppose we could define the semantics to
first say, if the Source is empty, then the operation is
a no-op.  If the Source and Target are the same
non-empty container object, then Program_Error is raised.
Otherwise, Source is merged in with Target, and Source is
set to empty.

This preserves the rule that Length(Target'out) = Length(Target'in) +
Length(Source'in) and Length(Source'out) = 0.
Anything that doesn't preserve the above two things seems broken
and deserves a Program_Error.  If you can't state a simple
postcondition like the above, then I don't think the behavior
in a potentially "degenerate" case can be considered a "smooth"
extension of the normal case.

I still await a convincing example...

****************************************************************

From: Tucker Taft
Date: Thursday, September 28, 2006  5:20 PM

I might add that another reason we have in the past made
something illegal or have it raise an exception is because
there are more than one potentially reasonable answers,
and there is no obvious way to choose between them.

From my perspective, if we are trying to guess what
Merge(X, X) is supposed to produce, then that implies
it is probably better that the programmer decide
explicitly.  For example, I could imagine that the
result should be a double-length vector, since merge is
effectively a concatenate and a sort combined into one.
Alternatively, I could imagine we end up with X being
empty, if we believe the semantics of Merge end with
an unconditional setting of the Source to empty.
And of course, there is the third alternative of it
being a no-op, the one which is presumably the least
useful in terms of it doing something of value, since
the programmer can easily code their own "no-op" ;-).

****************************************************************

From: Joyce L. Tokar
Date: Thursday, September 28, 2006  5:52 PM

I like the bounded error option simply because is provides
an alert, via Program_Error, that the operation has been called
in an erroneous way.  This could be highlighting a program error
that would not be caught at compile time. And could easily 
be overlooked at a code walk-through (if these are still done 
in these days of auto-code generation ;-) )

****************************************************************

From: Matthew Heaney
Date: Thursday, September 28, 2006  6:12 PM

> I'm not sure why you say that.  If anything, this seems like
> a perfect example of how the Program_Error could catch a bug.

The problem here is that you're cherry-picking the data to confirm a conclusion
made apriori.  For every example where PE would catch bug in a higher-level
algorithm, it would be simple enough to find a counter-example in which raising
PE would cause bugs (e.g. any program that doesn't have a handler for PE).

> This preserves the rule that Length(Target'out) = Length(Target'in) +
> Length(Source'in) and Length(Source'out) = 0.

But that rule only applies when Target and Source denote different objects.  To
preserve the rule we could just rewrite the rule.

We're really talking about a set of entities (the container elements), that
preserve their existence after the merge.  Discussion about "length" is just
our short-hand way of refering to those entities.  But that particular language
is getting in the way of denoting the actual semantics.  (Shades of the Whorf
Hypothesis?)  In other words it doesn't make any sense to say that
"Length(Source'out) = 0" if Source denotes the same object as Target.

One of these days I've got to sit down and learn predicate calculus...

> Anything that doesn't preserve the above two things seems broken
> and deserves a Program_Error.  If you can't state a simple
> postcondition like the above, then I don't think the behavior
> in a potentially "degenerate" case can be considered a "smooth"
> extension of the normal case.

Here's one way to look at it.  Consider the set of all sorted list containers.
What is the meaning of the program when you apply the merge operation to any
pair of containers in the set?  You said earlier that "0 is the limit of all
numbers greater than 0," so another way to look at it is to say "Target is the
limit of all containers that are not Target."

> I still await a convincing example...

Yes, me too.  ;^)

****************************************************************

From: Tucker Taft
Date: Thursday, September 28, 2006  7:16 PM

>> I'm not sure why you say that.  If anything, this seems like
>> a perfect example of how the Program_Error could catch a bug.
> 
> The problem here is that you're cherry-picking the data to confirm a conclusion
> made apriori.  For every example where PE would catch bug in a higher-level
> algorithm, it would be simple enough to find a counter-example in which raising
> PE would cause bugs (e.g. any program that doesn't have a handler for PE).

I think we aren't really adding much light anymore, mostly
just heat.  But I think we have substantially different
views of what an exception like "Program_Error" is for.
In my view it is to help debug the program during the
testing phase.  I would never expect a program to have
a handler for Program_Error, unless it was at a very
high level, and it was intended to be a program that
soldiered on even if there were semi-fatal coding errors.

In any case, I certainly don't consider the raising of
Program_Error to be *causing* a bug, even if there is
no handler.  It is announcing the presence of a bug.
It is just like the Constraint_Error raised by an
out of bounds array index.

****************************************************************

From: Matthew Heaney
Date: Thursday, September 28, 2006  7:43 PM

> I might add that another reason we have in the past made
> something illegal or have it raise an exception is because
> there are more than one potentially reasonable answers,
> and there is no obvious way to choose between them.

But isn't that what designers do: analyze alternatives and then pick one?  You
yourself had this role during the Ada95 design to decide whether to allow
out-parameters for functions.

Raising PE isn't that act that says you couldn't agree on the semantics.  It is
specifying semantics, stating that "we think you're doing something wrong
here", and stating it in the strongest manner possible by raising PE.

In general my exception philosophy is that you raise an exception when you're
unable to satisfy the postcondition.  PE is saying "we don't know what this
program means."  For example if you have an entry barrier that raises an
exception then PE makes sense, since you have no result at all.

But merge isn't like that, because we have a choice, and so we can just pick
one alternative.  We can just decide by fiat what merge means when target and
source denote the same object.


> And of course, there is the third alternative of it
> being a no-op, the one which is presumably the least
> useful in terms of it doing something of value, since
> the programmer can easily code their own "no-op" ;-).

But you have framed this in a way that advances your point of view!  I could
just as easily say:

  And of course, there is the third alternative of it
  raising PE, the one which is presumably the least
  useful in terms of it doing something of value, since
  the programmer can easily code their own "program error".



> Randy Brukardt wrote:
> > I don't have much an opinion on the issue at hand (just that it is decided),
> > but I think Matt is (as often seems to happen) has lost sight of what Ada is
> > about.

Oh dear, now I am being branded as a yahoo!  Maybe next you'll accuse me of
being unpatriotic!!  You're either with us or you're against us!!!

There is no Platonic realm which describes in plain text "what Ada is about."
Moses did not descend from Mt. Sinai carrying the RM.  This is not a language
issue.  We are not debating "what Ada means."  We are debating the semantics of
a container library that happens to be written in Ada.  This is a debate we
would have no matter what the language it was written in.

Systems of any kind are built in layers of abstraction.  The issue here of
algorithm correctness has nothing to do with the issue of correctness of
container libraries, since they describe different levels in the hierarchy.


> > He said:
> > 
> >> Indeed the whole building-blocks approach to design is precisely to get
> > out of
> >> developer's way so so he can do stuff we didn't think of.  That's called
> >> flexibility and it's a good thing.
> > 
> > True enough, but that has to be balanced with Ada's desire to detect errors
> > early. 

Hmmm, remind me again, how exactly is raising PE "detecting errors early"?

The issue is not about detecting a mere error.  It is about detecting an error
in an algorithm that uses the library.  I have already argued that this is
impossible for the library writer, since that's at a completely different level
of abstraction.

Indeed, a developer's algorithm could be wrong for any number of reasons.  How
are we supposed to detect that?  You argue that "if target and source denote
the same object then this obviously means his algorithm is broken" but how do
you know this?


> > Surely, for every check in Ada, there is some case where you might
> > actually want some specific behavior. But the checks exist because we
> > (the designers) believe that it is more likely that the user made a mistake
> > rather than having done something intended.

No, the checks exist because you're unable to deliver a result (you're unable
to satisfy the postcondition).  That's not the case here, and merge certainly
can deliver a result when target and source denote the same object.


>>> But the merge case is different.  It's not a matter of a container becoming
>>> corrupted.  We're guessing that something is broken with the user's
>>> algorithm, and raising PE to state that we don't like what he's doing.  I
>>> would argue, when we're wearing our library design hats, that detecting
>>> errors in some higher-level algorithm is not the business we should be in.

> > I strongly disagree with this statement. We should be in the business of
> > detecting all errors that can reasonably be detected. 

Motherhood and apple pie.  Yes of course we should detect all errors that can
reasonably be detected, and no one has argued otherwise.  My argument is that
the container library designer should only be responsible for ensuring the
correctness of the container library.  We can't possibly know whether a user of
the library has a bug in his algorithm.  Raising PE because we *think* there's
a bug in his algorithm doesn't raise my confidence in its correctness.  (In
fact my confidence would probably be lowered.)


> > If that means that
> > some unusual cases require more work by the programmer, so be it. (Of
> > course, we shouldn't take this too far -- it's clearly a balancing act.)

But I could just as easily frame this as:

  If that means that some unusual cases require more work by the container, so
  be it.

> >> Our responsibility as library designers is to ensure that containers can
> > never get
> >> broken, so that algorithm designers only have to worry about correctness
> > of their
> >> algorithms.
> > 
> > Surely this is important, but if we can detect errors in the higher-level
> > use of the library, that is should be done as well.

Well let's just cure world hunger while we're at it!  Maybe we can solve the
energy crisis too!!

You can't possibly know whether passing the same container for both source and
target is an error.  The algorithm might be wrong for any number of reasons,
even when source and target are different.

> > ...
> >> I'm not sure whether Tucker's argument even makes any sense.  Suppose we
> > assume
> >> that passing the same object as source and target indicates a problem (in
> > the
> >> higher-level algorithm).  So we raise PE to alert the user.  However, all
> > we've
> >> done is to push the problem back one level: how do we know that raising PE
> >> doesn't cause some other kind of problem?
> > 
> > Of course it causes other problems, that's the idea -- usually it causes the
> > program to crash or reset, and that will hopefully show up in testing
> > (leading to the bug getting fixed, rather than causing trouble after
> > deployment, as would happen if the problem is undetected).


So you're saying that a crash or reset is better than doing nothing?  How do
you know that?

This demonstrates perfectly the point I was trying to make.  Saying that the
algorithm must be wrong because the same object was passed for both source and
target is an imaginary problem.  A crash or reset is a real problem, one that
we created.

****************************************************************

From: Robert A. Duff
Date: Thursday, September 28, 2006  8:27 PM

I don't know what the right answer is, here.  But I wonder if it makes sense
to ask ourselves, what would the semantics of a Merge function be, which takes
two 'in' parameters, and returns the result?  Does that shed any light on what
the Merge procedure ought to do?

Or just heat?  ;-)

****************************************************************

From: Tucker Taft
Date: Thursday, September 28, 2006  9:05 PM

> I don't know what the right answer is, here.  But I wonder if it makes sense
> to ask ourselves, what would the semantics of a Merge function be, which takes
> two 'in' parameters, and returns the result?  Does that shed any light on what
> the Merge procedure ought to do?

With Merge(A, B) => C, presumably Length(C) = Length(A) + Length(B) always.

> Or just heat?  ;-)

We'll see.

****************************************************************

From: Tucker Taft
Date: Thursday, September 28, 2006  8:52 PM

Matthew Heaney wrote:
> Tucker Taft <stt@sofcheck.com> writes:
> 
>> I might add that another reason we have in the past made
>> something illegal or have it raise an exception is because
>> there are more than one potentially reasonable answers,
>> and there is no obvious way to choose between them.
> 
> But isn't that what designers do: analyze alternatives and then pick one?  You
> yourself had this role during the Ada95 design to decide whether to allow
> out-parameters for functions...

I'm not sure the analogy is very good, since this was just
a simple yes or no (although there were some creative
middle grounds that were suggested at some point in
the past year that might be worth revisiting, FWIW).

The fact is that we relatively frequently either postponed
a decision or decided to eliminate a feature altogether
when there was no clearly superior alternative.  It is
my strong belief that when it comes to language design,
you want the final choice to be the "intuitively obvious"
one, or at least feel that way after the fact.  If you
don't create a pretty strong criterion like this, you
have the real danger of settling on a half-baked solution,
when with a little more hard work there is a "right" solution
out there that can be found.


...

>> Randy Brukardt wrote:
>>> I don't have much an opinion on the issue at hand (just that it is decided),
>>> but I think Matt is (as often seems to happen) has lost sight of what Ada is
>>> about.
> 
> Oh dear, now I am being branded as a yahoo!  Maybe next you'll accuse me of
> being unpatriotic!!  You're either with us or you're against us!!!
> ...

Randy was admittedly being a bit hyperbolic.  But I do think
there is a philosophical difference of opinion here about
what run-time checks and the associated exceptions are
for.  In my experience in Ada, exceptions are mostly
used for debugging.  There are cases where exceptions indicate
a problem that should be handled locally, such as Name_Error,
where you don't want to separate the check for existence
of a file from the actual opening of the file, to avoid race
conditions.  But in most cases, the value of exceptions
is their ability to pinpoint where there is a program
bug.

I would claim that one of the main reasons Ada is so productive
is that by the time you get an Ada program to compile,
and then run through a few test cases without raising
any exceptions, the program is very close to being
correct.  By contrast, one of the things that makes C/C++
so much of a pain to use is that the symptoms of a bug (e.g.
crashes, blue screens, etc.) are often separated from the cause
by millions of instructions because there are few if
any run-time consistency checks being performed automatically.

So I guess I don't agree with your model that the only point
of exceptions is to preserve the integrity of the data
managed by a library.  The exceptions are supposed to
help the programmer detect bugs in their algorithm, where
what they have asked the library to do doesn't make sense,
or is somehow ambiguous.  You don't need exceptions
to preserve integrity.  Just silently return when the
programmer has asked for something that could be problematic.
But that's not going to help them find the bugs close
to where the bug resides.

For what it is worth, there are several places in Ada 95
and Ada 2005 where we decided to raise an exception
rather than make some non-obvious choice among various
alternatives, for example, the tag check associated
with calling a binary dispatching operator with mismatched
tags.  We considered using the nearest common ancestor
implementation of the operation, but concluded that in
many cases, that would not be what the programmer wanted,
and in any case there was probably some underlying confusion.

Similar decisions were made in specifying the semantics
for Unchecked_Union, where Program_Error is raised in
a number of circumstances where it might have been possible
to choose some non-exception-raising alternative, but
the choice would have been relatively arbitrary.

And interestingly, we made the decision in Ada 2005 to change
what happened when Raise_Exception is passed a Null_Id
for the Exception_Id, to now raise Constraint_Error
rather than just silently return.

Your approach is certainly reasonable, but I believe it
*does* differ from what has been our approach in doing
the Ada 95 and Ada 2005 language designs.  Rather than
choosing among alternatives where there is no clear winner,
we have more than once chosen to raise an exception.

****************************************************************

From: Randy Brukardt
Date: Thursday, September 28, 2006  9:07 PM

...
> But isn't that what designers do: analyze alternatives and then pick one? You
> yourself had this role during the Ada95 design to decide whether to allow
> out-parameters for functions.

I don't think bringing up one of Ada's all-time worst decisions (and one made
by Jean, not Tucker) is likely to help your case any...

> Randy Brukardt wrote:
> > > I don't have much an opinion on the issue at hand (just that it is decided),
> > > but I think Matt is (as often seems to happen) has lost sight of what Ada is
> > > about.
>
> Oh dear, now I am being branded as a yahoo!  Maybe next you'll accuse me of
> being unpatriotic!!  You're either with us or you're against us!!!

Excuse me?? Let's not descend into imaginary name-calling!! I've just
pointed out that you have a blind spot about the role of checking in Ada --
one that we spent a lot of effort overcoming in the Containers work. That's
doesn't make you "against us", just potentially wrong when it comes to
checking in Ada.

You must realize by now that most of us have more expansive views of the
role of checking in Ada than you do. Reacting to someone pointing that out
by accusing the author of name-calling (which this author certainly did not
do, made clear by the quotes above) is not cool, and does not advance the
discourse.

> > > He said:
> > >
> > >> Indeed the whole building-blocks approach to design is precisely to get out of
> > >> developer's way so so he can do stuff we didn't think of.  That's called
> > >> flexibility and it's a good thing.
> > >
> > > True enough, but that has to be balanced with Ada's desire to detect errors
> > > early.
>
> Hmmm, remind me again, how exactly is raising PE "detecting errors early"?

Since PE should not be handled in a program, the resulting crash will point
out the bug in early testing, when it still can be fixed.

> The issue is not about detecting a mere error.  It is about detecting an error
> in an algorithm that uses the library.  I have already argued that this is
> impossible for the library writer, since that's at a completely different level
> of abstraction.

That's an overly narrow view of things. It's like saying that you have to
put up with whatever stupidity your government tosses at you, "because it's
a different level of abstraction".

> Indeed, a developer's algorithm could be wrong for any number of reasons. How
> are we supposed to detect that?  You argue that "if target and source denote
> the same object then this obviously means his algorithm is broken" but how do
> you know this?

Because no one has shown a remotely plausible reason for it happening in a
correct program. (And in any case, I'm speaking more generally here -- as I
said before, I really don't care about the case at hand. I *do* really care
that you think the best solution to virtually any unusual circumstance is to
do nothing. I think that generally should be the *last* option, only used
when it clearly is likely to occur in real programs.)

> > > Surely, for every check in Ada, there is some case where you might
> > > actually want some specific behavior. But the checks exist because we
> > > (the designers) believe that it is more likely that the user made a mistake
> > > rather than having done something intended.
>
> No, the checks exist because you're unable to deliver a result (you're unable
> to satisfy the postcondition).  That's not the case here, and merge certainly
> can deliver a result when target and source denote the same object.

Again, that is an overly narrow view of the purpose of checks. In Claw, we
try to always raise an exception unless the *correct* result is going to
occur. Remember the analogy to integer overflow checking: you always get a
result after multiplying two numbers, but it might not be the correct
result.

Moreover, we try to prevent misuse of the library. That's important, because
programmers are human and they make mistakes. If it is easy to make a
mistake without detection, the errors can persist for a long time, and might
even be depended on (causing a maintenance hazard).

...
> > > I strongly disagree with this statement. We should be in the business of
> > > detecting all errors that can reasonably be detected.
>
> Motherhood and apple pie.  Yes of course we should detect all errors that can
> reasonably be detected, and no one has argued otherwise.  My argument is that
> the container library designer should only be responsible for ensuring the
> correctness of the container library.

And that argument is (IMHO), wrong. The designer of the library is
responsible for designing the library so that it is as hard as possible to
misuse. Detection of misuse is an important part of that.

In any case, this isn't "just" a library -- we're talking about a
building-block piece of the Ada Standard itself. It has to be held to higher
standards than a mere library would be.

> We can't possibly know whether a user of
> the library has a bug in his algorithm.

Again, this is a rather narrow viewpoint. It's often obvious to the library
that there is a bug in the user's algorithm.

...
> > >> Our responsibility as library designers is to ensure that containers can never get
> > >> broken, so that algorithm designers only have to worry about correctness of their
> > >> algorithms.
> > > Surely this is important, but if we can detect errors in the higher-level
> > > use of the library, that is should be done as well.
>
> Well let's just cure world hunger while we're at it!  Maybe we can solve the
> energy crisis too!!

Why not? Thinking too locally is a common mistake. There are a lot of
interdependencies between things, and keeping focused on a tiny corner is
unlikely to work. It's like optimizing a bad algorithm - things might speed
up a bit, but replacing the algorithm is better.

In any case, I'm not saying that you have to solve hunger, I'm saying that
you have to consider how a feature (in this case a library) is going to be
used, and detect obvious misuses. Having as few errors as possible is *not*
a good thing.

[Besides, those problems are trivially solved (implementing the solution is
somewhat harder ;-).]

> You can't possibly know whether passing the same container for both source and
> target is an error.  The algorithm might be wrong for any number of reasons,
> even when source and target are different.

Huh? That's backwards - we're surely not going to detect *all* user errors,
but that's hardly a reason for not detecting *some* user errors. The case in
question appears to be a user error.

...
> > > Of course it causes other problems, that's the idea -- usually it causes the
> > > program to crash or reset, and that will hopefully show up in testing
> > > (leading to the bug getting fixed, rather than causing trouble after
> > > deployment, as would happen if the problem is undetected).
>
> So you're saying that a crash or reset is better than doing
> nothing?  How do you know that?
>
> This demonstrates perfectly the point I was trying to make.  Saying that the
> algorithm must be wrong because the same object was passed for both source and
> target is an imaginary problem.  A crash or reset is a real problem, one that
> we created.

It's not imaginary at all. A crash or reset is *always* better than doing
nothing *if the user program is incorrect*. Ignoring a problem is never a
good idea.

The only question is whether, in this particular case, there is any case
where passing the same object for the source and target makes sense. A
single example that is even mildly plausible would do the trick. But instead
of providing that, you accuse me of name-calling. I can only presume that
there is no such example, in which case I must side with Tucker.

****************************************************************

From: Randy Brukardt
Date: Thursday, September 28, 2006  9:15 PM

> Randy was admittedly being a bit hyperbolic.

I'm afraid I don't see where I was being hyperbolic; I surely did not mean
to be so. I was just disagreeing (for the umpteenth time) with Matt's view
of checks.

> But I do think
> there is a philosophical difference of opinion here about
> what run-time checks and the associated exceptions are
> for. <followed by lots more good stuff>

As usual, better said than I can -- I wish I'd have gotten this an hour ago
so I wouldn't have wasted time writing my own less-valuable response.

****************************************************************

From: Robert I. Eachus
Date: Thursday, September 28, 2006  9:17 PM

> I'm not sure I understand your setup here.  If you have an array
> of *Containers* then you don't need to initialize them to anything.
> They default initialize to being empty.  It is meaningless to talk
> about initializing two different containers to the "same" Container
> value.  Container objects are either the same object, or completely
> independent (at least from the client perspective).

I am looking at things from the perspective of the initial algorithm 
designer, and the question of whether or not the abstract design can be 
easily mapped to Ada Containers.  If Merge has the property that merging 
an empty source into a target is a no-op, it is easier for the 
designer.  Yes, the difference is slight, but it is there.

Think of my example as follows:   First take a set of objects and 
distribute them into set of Containers.  Then in a second part of the 
application, merge some Containers.  If it is the case that the author 
of the second part has to use a Merge which can balk at merging two 
empty containers, the choices are to use a wrapper, or to make the 
*identity* part of the defined interface between the parts..

> I'm not sure why you say that.  If anything, this seems like
> a perfect example of how the Program_Error could catch a bug.

Remember the (possible) Program_Error will occur in the second part of 
the program, while the error is in the first part or in the interface 
documentation.  But even then Program_Error might not be raised until 
the code had been in production use for a while.

> I suppose we could define the semantics to
> first say, if the Source is empty, then the operation is
> a no-op.  If the Source and Target are the same
> non-empty container object, then Program_Error is raised.
> Otherwise, Source is merged in with Target, and Source is
> set to empty.

I don't think of this as a compromise, but it does work to eliminate, 
not just possibility of Program_Error in my example, but the possibility 
of a no-op merge ever actually being an error.  As for when the Source 
is non-empty, I don't feel strongly about Program_Error or no-op 
semantics.  Yes, in that case if there was an actual problem in the 
coding of part one, say using an array of access values all pointing to 
the same Container, you would get Program_Error.

From a succeeding message...

> In any case, I certainly don't consider the raising of
> Program_Error to be *causing* a bug, even if there is
> no handler.  It is announcing the presence of a bug.
> It is just like the Constraint_Error raised by an
> out of bounds array index. 

And now I see why so much heat here.  I don't see, especially at the 
design level, the idea that merging the wrong *empty* Container can be a 
bug. (If it is the source, or both source and target are empty.)  As far 
as I am concerned Containers should have the property that adding 
(merging) nothing into a Container has no effect.  A gratuitous 
Program_Error is obviously very different from no effect.

I can argue from the point of view of the abstraction that merging two 
Containers with the same contents should have no effect on the target.  
But similarly I can argue that the source should always be empty after a 
merge, so the non-empty case has a conflict that needs to be resolved.  
I am arguing for the minium necessary fix here.  And the minimum 
necessary fix doen't need to change the case where the Source is any 
empty Container.

****************************************************************

From: Robert Dewar
Date: Thursday, September 28, 2006  11:07 PM

> I don't have much an opinion on the issue at hand (just that it is decided),
> but I think Matt is (as often seems to happen) has lost sight of what Ada is
> about.

I must say I find this comment totally bizarre. Matt's argument seems 
sound to me (I also find the ad hominem style of the comment a pity!)

> True enough, but that has to be balanced with Ada's desire to detect errors
> early. Surely, for every check in Ada, there is some case where you might
> actually want some specific behavior. But the checks exist because we (the
> designers) believe that it is more likely that the user made a mistake
> rather than having done something intended. Think about integer overflow
> checks, for instance -- certainly, we could have decided what happens
> without raising an exception (C does that), but it was decided it was more
> likely to be a bug than intended. And surely C is more flexible than Ada in
> this respect, but a C program is also a lot more likely to get a bogus
> answer on the margins than an Ada program.

OK, so this is a bit of general stuff that we agree on, but how does
it apply here?

> Surely this is important, but if we can detect errors in the higher-level
> use of the library, that is should be done as well.

Why do you think it is an error? That's the point you need to argue.

****************************************************************

From: Robert Dewar
Date: Thursday, September 28, 2006  11:13 PM

Joyce L. Tokar wrote:
> I like the bounded error option simply because is provides
> an alert, via Program_Error, that the operation has been called
> in an erroneous way. 

If we want a Program_Error let's get a program error, there is
no point in making it a bounded error if we think it should raise
an exception!

I really see NO good argument for implementation defined behavior
here.

****************************************************************

From: Robert Dewar
Date: Thursday, September 28, 2006  11:12 PM

> Excuse me?? Let's not descend into imaginary name-calling!! I've just
> pointed out that you have a blind spot about the role of checking in Ada --
> one that we spent a lot of effort overcoming in the Containers work. That's
> doesn't make you "against us", just potentially wrong when it comes to
> checking in Ada.

Randy, in my opinion, your comment *was* highly improper, and Matt was
right to call you on it!

****************************************************************

From: Randy Brukardt
Date: Thursday, September 28, 2006 11:43 PM

> Randy, in my opinion, your comment *was* highly improper, and Matt was
> right to call you on it!

OK, I respect your opinion. It certainly wasn't meant to be inflammatory, it
was just a statement of opinion. But I don't have any idea why; it is the
sort of thing I would say about anyone.

Here's the original comment again:

> I don't have much an opinion on the issue at hand (just that it is decided),
> but I think Matt is (as often seems to happen) has lost sight of what Ada is
> about.

There's an extra word, but that's it. And I explained specifically what I
meant further down in the note. "what Ada is about" could be construed to
cover more ground than I meant it to (just the properties given it by the
designers, and specifically those properties that differ from other common
programming languages), but I don't know of any other way to express that.

Matt's argument doesn't work for me because it depends on a narrow view of
the purpose of checks. (Tucker explained that better than I could.) We've
repeatedly told Matt that we (at least the vocal subset of the ARG) want a
more expansive view of checks. To me, this is the fundamental difference
between Ada and other languages; if you abandon that, you've got C and Ada
has little reason to exist.

What would you have said instead??

****************************************************************

From: Matthew Heaney
Date: Thursday, September 28, 2006  11:47 PM

> The only question is whether, in this particular case, there is any case
> where passing the same object for the source and target makes sense. A
> single example that is even mildly plausible would do the trick. 

I have no example to offer, since I do not care to speculate about when this
would make sense!  It's completely up the user, to whom I defer.

> But instead of providing that, you accuse me of name-calling.

Well I can't tell you how many angels can dance on the head of the pin, either!

> I can only presume that there is no such example, in which case I must side
> with Tucker.

OK, the tribe has spoken.  Our philosophies wrt the use of exceptions just
differ, that's all...

****************************************************************

From: Matthew Heaney
Date: Thursday, September 28, 2006 11:50 PM

> > Randy was admittedly being a bit hyperbolic.
> 
> I'm afraid I don't see where I was being hyperbolic; I surely did not mean
> to be so. I was just disagreeing (for the umpteenth time) with Matt's view
> of checks.

Randy and I are just arguing, so what else is new...

> > But I do think
> > there is a philosophical difference of opinion here about
> > what run-time checks and the associated exceptions are
> > for. <followed by lots more good stuff>
> 
> As usual, better said than I can -- I wish I'd have gotten this an hour ago
> so I wouldn't have wasted time writing my own less-valuable response.

Yeah, that pretty much sums it up.  Our philosophies are different, that's all.
Vive la difference, as they say...

****************************************************************

From: Tucker Taft
Date: Friday, September 29, 2006  1:28 AM

> I have no example to offer, since I do not care to speculate about when this
> would make sense!  It's completely up the user, to whom I defer.

It may seem silly that I am interested in having a legitimate
example to be convinced, and I understand your point of view that
you don't want to limit the user, but part of the language
design philosophy of Ada is that some limitations can be a good
thing, creating a more, rather than a less, productive language.

I also admit that you can pretty easily go overboard, and one
person's helpful limitation is another person's silly restriction.
The Ada 83 restriction against variable declarations after nested
bodies comes to mind...

The best arbiter is a set of examples, in my experience.  A couple
of good examples can take the wind out of the best crafted sail.
But the examples can't be the kind that use one-character variable
names with no motivation.  There needs to be a recognizable,
reasonable intent behind the example to make it a powerful one.

> OK, the tribe has spoken.  Our philosophies wrt the use of exceptions just
> differ, that's all...

Actually, I think this case justifies a Program_Error even
with your philosophy, if you accept that the most "intuitive"
postconditions that seem to go with Merge (in my view at least ;-)
are not satisfiable if Source and Target are the same object (unless
they are the empty container).

You described some "higher level" postcondition for Merge, but
it left me feeling pretty cold.  Bob's query about a Merge
function rather than a procedure added to my feeling.  It
seems clear to me that the result of a merge must have a
length that is the sum of the lengths of its inputs.  Anything
else isn't a "merge" from my perspective.

I think Robert Eachus gave at least one example why it makes
sense for Merge(Target => Blah, Source => Empty) to always be a no-op,
even if "Blah" and "Empty" happen to be the same object, so
I would go for the slightly more complicated semantics
discussed, where empty source is a no-op, non-empty
source & target the same object is a Program_Error, and
otherwise the result of merge is that target grows in length
by the length of source, and source ends up empty.

This means that all cases where the "intuitive" postconditions
are satisfiable will succeed, and other cases raise Program_Error.

****************************************************************

From: Pascal Leroy
Date: Friday, September 29, 2006  2:03 AM

Gosh, we haven't had such a heated discussion on this list for a looong
time!

If anything, reading this thread has convinced me that the right answer is
a Program_Error.  I can't think of a good reason why anyone would want to
merge a non-empty container with itself.  This seems much more likely to
be a programmer mistake than a useful operation.  In the unlikely case
where the programmer absolutely wants to do that, it's easy enough to do
the check before calling Merge (presumably they wouldn't't need to use
'Address, because they would have access values in their hand).

To me this is just another case where Matt and the rest of us have
different views of what exceptions are for.  If you look at the minutes of
the 2004 Atlanta meeting, you'll see a very similar discussion regarding
the version of Delete that takes a cursor: should it raise an exception or
be a no-op if the cursor is null?  Matt wanted a no-op because the
post-condition (the element designated by the cursor is no longer in the
container) is true.  We decided for C_E on the ground that such a usage
pattern was more likely to be a bug, and also because we thought that the
post-condition didn't make much sense anyway (what is the element
designated by a null cursor anyway?).

Robert E. made an excellent point though with respect to merging null
containers.  The approach he describes (sharing null containers until
there is a need to actually store something) is one that shows up often in
practice, especially in sparse data structures.  And in this case the
post-condition clearly makes sense and is verified 'cause 0+0=0.

So I vote for P_E except that if source is empty the operations should be
a no-op.

****************************************************************

From: Robert Dewar
Date: Friday, September 29, 2006  5:20 AM

> OK, the tribe has spoken.  Our philosophies wrt the use of exceptions just
> differ, that's all...

Right, and it is a perfectly legitimate difference of opinion, and
it is not a matter of one party or the other having no appreciation
of Ada as a language, despite what Randy might think.

Perhaps we might stick to the technical issue here. It is really
not helpful at all, and does not bolster your argument, to accuse
those who disagree with you of not having lost sight of what Ada
is about.

For the record, I guess I must also fall in the category of people
that Randy thinks have lost sight of what Ada is about.

For me, you only want to raise exceptions in clear error situations
where you cannot come up with a reasonable definition of behavior.
That's because raising exceptions does NOT necessarily improve
safety (Ariane-5 should have been a sufficient lesson!)

Yes, you do want to raise an error if there are no clear semantics
that you can describe, or the semantics is clearly undesirable (e.g.
Java decision to silently wrap signed arithmetic).

But every time you add an exception condition, you increase the burden
of proof that the application is exception free. The practical ability
to prove applications are exception free is a valuable tool for high
integrity applications (see Praxis papers in this area, which are
from real applications).

In this case, I find the raising of an exception to be a mistake,
though of the three possibilites.

Detect and nop
Raise PE
Bounded Error

The bounded error case is by FAR the worst choice I think.

****************************************************************

From: Robert Dewar
Date: Friday, September 29, 2006  5:27 AM

> I also admit that you can pretty easily go overboard, and one
> person's helpful limitation is another person's silly restriction.
> The Ada 83 restriction against variable declarations after nested
> bodies comes to mind...

As does the restriction on out parameters for functions .. sorry
off topic but could not resist.

> The best arbiter is a set of examples, in my experience.  A couple
> of good examples can take the wind out of the best crafted sail.
> But the examples can't be the kind that use one-character variable
> names with no motivation.  There needs to be a recognizable,
> reasonable intent behind the example to make it a powerful one.

That I agree with. I think we do need an example where the lack of
an exception is useful.

Making limiting cases be errors is always a risky area. I once worked
on a machine where zero shift counts were don't care conditions, since
the engineer had decided that obviously no one would have any use for
zero bit shifts (and this applied to the variable shift case). A good
example would have stopped that horrible decision.

>> OK, the tribe has spoken.  Our philosophies wrt the use of exceptions just
>> differ, that's all...
> 
> Actually, I think this case justifies a Program_Error even
> with your philosophy, if you accept that the most "intuitive"
> postconditions that seem to go with Merge (in my view at least ;-)
> are not satisfiable if Source and Target are the same object (unless
> they are the empty container).

I must say that I have trouble seeing an example, Matt says he has
no example to offer because he does not care to speculate, but if
we can't even imagine an example where the proposed semantics are
useful, then that's worrisome. TO be able to speculate on appropriate
behavior without being able to speculate on an example is an
inconsistency that is problematic.

After all if some Java designer said to us: I don't think integer
overflow should cause an error condition, it should just wrap, even
though this violates obvious post conditions. And we asked for an
example to justify this, and he replied that he did not care to
speculate, we would be unlikely to be convinced.

> You described some "higher level" postcondition for Merge, but
> it left me feeling pretty cold.  Bob's query about a Merge
> function rather than a procedure added to my feeling.  It
> seems clear to me that the result of a merge must have a
> length that is the sum of the lengths of its inputs.  Anything
> else isn't a "merge" from my perspective.

That really is a strong argument

> I think Robert Eachus gave at least one example why it makes
> sense for Merge(Target => Blah, Source => Empty) to always be a no-op,
> even if "Blah" and "Empty" happen to be the same object, so
> I would go for the slightly more complicated semantics
> discussed, where empty source is a no-op, non-empty
> source & target the same object is a Program_Error, and
> otherwise the result of merge is that target grows in length
> by the length of source, and source ends up empty.

That seems right to me

> This means that all cases where the "intuitive" postconditions
> are satisfiable will succeed, and other cases raise Program_Error.

And no bounded error .. right now, that seems the right choice to me.
Matt, I really think you have to come up with at least one example
where the semantics you propose make sense.

Violating the post condition on sum-of-lengths seems likely to
increase the proof burden more than raising an exception.

****************************************************************

From: Joyce L. Tokar
Date: Friday, September 29, 2006  2:56 PM

>> I like the bounded error option simply because is provides
>> an alert, via Program_Error, that the operation has been called
>> in an erroneous way. 

>If we want a Program_Error let's get a program error, there is
>no point in making it a bounded error if we think it should raise
>an exception!

>I really see NO good argument for implementation defined behavior
>here.

Fine by me...

****************************************************************

From: Joyce L. Tokar
Date: Friday, September 29, 2006  3:02 PM

Of course, by making this particular behavior raise PE, it is easy
for tools like SPARK to do the analysis/layin the requirements that
the two parameters can not be the same object. Thus, analytical
tools, like SPARK, would be able to capture the error condition
early in the analysis phase, and resulting in exception-free code
...what the end game is in some camps.

****************************************************************

From: Robert Dewar
Date: Friday, September 29, 2006  5:09 PM

I am not sure what capture means here.

The issue is the extra proof rules this requirement generates ...
which seem non-trivial to me ...

****************************************************************

From: Robert A. Duff
Date: Friday, September 29, 2006  5:45 PM

Matthew Heaney writes:
> Pascal Leroy writes:
> 
> > So I don't see why we would want to have a non-portability here, and Matt's
> > proposal seems fine to me.  I could also go with P_E if someone had a strong
> > preference for that.
> 
> I answer questions like this by asking myself "What would Dijkstra do?".

Dijkstra would say something like, "Anyone who thinks Merge should do X should
be jailed for corrupting the minds of innocent children", for some X.  ;-)

Let me push a little harder on my function idea.

Tucker said:

> With Merge(A, B) => C, presumably Length(C) = Length(A) + Length(B) always.

Right.  And for Merge(X,X) => Y, we have:

    Length(Y) = Length(X) + Length(X) = 2*Length(X)

I can't imagine such a function being designed to raise Program_Error.

It seems to me that the important postcondition for the Merge procedure is:

    Length(Target) = Length(old Target) + Length(old Source)

The part about "Length(Source) = 0" seems like an efficiency hack that allows
us to destroy the Source.  Does one actually want to look at Source after the
Merge?

What's wrong with thinking that Merge(X,Y) is equivalent to:

    X := Merge(X,Y); -- the Merge function
    if X and Y are distict objects then
        Y := Empty;
    end if;

In other words, just tweak the postcondition so that the part about
Length(Source) = 0 is only true if they're not the same.

That seems more natural to me that making it a no-op.  It preserves Robert
Eachus' desire about empty sources.

I could also live with Program_Error.

I think "bounded error" is a bad idea here.

****************************************************************

From: Randy Brukardt
Date: Friday, September 29, 2006  9:25 PM

> That seems more natural to me that making it a no-op.  It preserves Robert
> Eachus' desire about empty sources.

Maybe, but it means that the "Merge" would be *creating* elements if (and
only if!) the two parameters were the same. In all other cases, it just is
moving elements around. That's pretty weird. Indeed, that is the difference
between function forms and procedure forms - the functions make copies of
everything, the procedures try to work in place.

> I could also live with Program_Error.

One of Tucker's criteria was that we should raise an exception when we
didn't have a clear idea of the right answer - lest we surprise the user of
the routine. We have two reasonable meaning here (double everything or
no-op), and both have built-in user surprises (double means that extra
elements are created by a routine that doesn't already do so; no-op means
that the obvious post-condition for the target is violated). And no one
seems to know why this should happen in practice.

So I think Program_Error is the best solution. That's especially true
because if some important use comes to light in the future, we can adopt one
of the above behaviors and remain compatible. We can't go the other way in
the future.

> I think "bounded error" is a bad idea here.

Yes, I agree. It should be detected somehow and have a defined result.

****************************************************************

From: Tucker Taft
Date: Friday, September 29, 2006  9:30 PM

...
> In other words, just tweak the postcondition so that the part about
> Length(Source) = 0 is only true if they're not the same.

One problem is that this doesn't really work very well
at all for linked lists, and we have tried to preserve an
approximate equivalence.  Merging two lists presumably does not
involve creating any new list elements.  So there is no
way you could satisfy the postcondition on the Length
if the two operands are the same non-empty list.

One could also imagine that if vectors are implemented
with a chunked representation, merge might be implemented
without having to allocate new chunks.
 
> That seems more natural to me that making it a no-op.  It preserves Robert
> Eachus' desire about empty sources.
> 
> I could also live with Program_Error.

I really think this is the right answer.

Robert makes the good point that simple postconditions
means simpler verification of an algorithm.
I think we want to raise Program Error if
we can't say that when done, Target
contains the merge of Source and Target, and
Source is empty.  Otherwise, Merge adds
unnecessary complexity to the verification of
an algorithm that uses it.

> I think "bounded error" is a bad idea here.

I don't think anyone is pushing for that anymore.

****************************************************************

From: Tucker Taft
Date: Friday, September 29, 2006  9:37 PM

Robert Dewar wrote:
> Joyce L. Tokar wrote:
>> Of course, by making this particular behavior raise PE, it is easy
>> for tools like SPARK to do the analysis/layin the requirements that
>> the two parameters can not be the same object. Thus, analytical
>> tools, like SPARK, would be able to capture the error condition
> 
> I am not sure what capture means here.

I think what Joyce means is that a tool like SPARK
could be used to prove that the error condition
(that the two parameters are the same object) can't
occur at run-time, much like it does for other
run-time checks.  So I think she is agreeing that
raising Program_Error is fine, even for safety-critical
systems, since an effort is made to prove for such
systems (with help from tools like SPARK) that none
of the potential exceptions could in fact occur.

> The issue is the extra proof rules this requirement generates ...
> which seem non-trivial to me ...

Agreed. And I suspect Joyce would agree as well.

****************************************************************

From: Joyce L. Tokar
Date: Saturday, September 30, 2006  9:50 AM

Thanks Tucker, I couldn't have said it better.

****************************************************************

From: Robert Dewar
Date: Saturday, September 30, 2006  10:28 AM

Just for the record

SPARK is a well defined programming language, related to Ada

The SPARK examiner is a static analyzer that checks that the
static assertions in a SPARK program are consistent.

Proof checking is the domain of a separate set of Praxis tools

****************************************************************

From: Matthew Heaney
Date: Saturday, September 30, 2006  9:47 AM

> > I could also live with Program_Error.
> 
> I really think this is the right answer.

Well, here's another tack against PE and for no-op.

We already have operations that move elements from one list to another.  Each
of these cases:

X.Move (X);
X.Splice (No_Element, X);

is a no-op.  So why isn't 

X.Merge (X);

a no-op too?

> Robert makes the good point that simple postconditions
> means simpler verification of an algorithm.

Whenever there's a container operation that accepts another container as an
argument, you have to do something special to test whether the container
parameters denote the same object.  So whatever verification you would need to
do for Merge is exactly same as the verification you're already doing
throughout the library for lots of other operations.

> I think we want to raise Program Error if
> we can't say that when done, Target
> contains the merge of Source and Target, and
> Source is empty.

No, of course you can't say that Source is empty, if Target and Source are the
same object -- so you don't say that.  You handle that case the same way you
handle it every other time:

Move says:

  If Target denotes the same object as Source, then Move has no effect.

Splice says:

  Otherwise, if Source denotes the same object as Target, the operation has no
  effect

The set operations Difference and Symmetric_Difference begin with

   If Target denotes the same object as Source,

There are probably others...

> Otherwise, Merge adds
> unnecessary complexity to the verification of
> an algorithm that uses it.

But that (putative) complexity already exists for several other operations.

Furthermore, you seem to be arguing that:

  do nothing when Source denotes same object as Target

is more complex than:

  raise PE when Source is non-empty and denotes same object as Target

The term "complexity" is a pejorative, so again you're framing the argument in
a way leads to your preferred conclusion.  Remember what Jean Ichbiah said (I'm
paraphrasing):

  When someone doesn't like your computer language, they don't say so directly.
  Instead they say that it's too complex.

So having considered your arguments, I stand by my original position: that
no-op is entirely proper here.  The description for Merge simply needs to be
modified slightly to use the same language we already use for Move and Splice,
to handle the case when Target and Source denote the same object.

****************************************************************

From: Robert Dewar
Date: Saturday, September 30, 2006  10:27 AM

> So having considered your arguments, I stand by my original position: that
> no-op is entirely proper here.  The description for Merge simply needs to be
> modified slightly to use the same language we already use for Move and Splice,
> to handle the case when Target and Source denote the same object.

Well that also seems a very reasonable argument.

****************************************************************

From: Robert I. Eachus
Date: Saturday, September 30, 2006  12:28 PM

> I think Robert Eachus gave at least one example why it makes
> sense for Merge(Target => Blah, Source => Empty) to always be a no-op,
> even if "Blah" and "Empty" happen to be the same object, so
> I would go for the slightly more complicated semantics
> discussed, where empty source is a no-op, non-empty
> source & target the same object is a Program_Error, and
> otherwise the result of merge is that target grows in length
> by the length of source, and source ends up empty.
>
> This means that all cases where the "intuitive" postconditions
> are satisfiable will succeed, and other cases raise Program_Error.

Obviously I am comfortable with that resolution. It is worth pointing 
out though, that my original thinking is almost identical to Matt's.  At 
the implementation level (actual written code) I find it hard to justify 
not raising Program_Error in even the empty Source and Target case.  But 
doing things this way does a much better job of preserving the 
abstractions at the design level, where the most important property of 
any Container is what it contains.  So it feels ugly to me to insist 
that two different empty Containers can have different effects when 
passed as a Target.

As I intended to say in my example (but didn't say all that clearly).  
The real effect of deciding that Merge(Empty, Empty) can raise 
Program_Error is that either the implementor of the second part of the 
application has to put in a two line wrapper--not a big deal--or the 
interface between the parts of the application has to specify the 
identity of the empty Containers in the array.  It's that last that puts 
my nose out of joint.  If the interface is not specified correctly and 
the Merge wrapper is not present in the second part of the application, 
maintenance of the first half of the application can create a 
Program_Error in the second half.  Thus by trying to be too helpful to 
the programmer instead of the designer, we have introduced a real bug 
that may be discovered years later.

****************************************************************

From: Jean-Pierre Rosen
Date: Sunday, October 1, 2006  1:40 AM

Randy Brukardt a écrit :
> Maybe, but it means that the "Merge" would be *creating* elements if (and
> only if!) the two parameters were the same. In all other cases, it just is
> moving elements around. That's pretty weird. Indeed, that is the difference
> between function forms and procedure forms - the functions make copies of
> everything, the procedures try to work in place.
> 
So, the model for Merge is:

for every element in Source loop
    Remove Element from Source
    Add it to Target
end loop;

Applying this to the case where Source = Target would argue for the no-op.

****************************************************************

From: Robert Dewar
Date: Sunday, October 1, 2006  5:11 AM

That seems a nice consistent formulation to me!

****************************************************************

From: Edmond Schonberg
Date: Sunday, October 1, 2006  2:14 PM

Except that the iteration over a changing domain is suspicious, in  
particular if source and target coincide. A more natural formulation  
might be:

    while source is not empty
       remove element from source
       insert it into target
    end loop

which of course is an infinite loop when source and target  
coincide :-)!  I vote for program_error., given the multiple meanings  
we can find for this case!

****************************************************************

From: Matthew Heaney
Date: Sunday, October 1, 2006  9:22 PM

Would you reach the same conclusion about Move and Splice?

****************************************************************

From: Pascal Leroy
Date: Monday, October 2, 2006  2:27 AM

> I must say that I have trouble seeing an example, Matt says 
> he has no example to offer because he does not care to 
> speculate, but if we can't even imagine an example where the 
> proposed semantics are useful, then that's worrisome. TO be 
> able to speculate on appropriate behavior without being able 
> to speculate on an example is an inconsistency that is problematic.

Absolutely.

Note that the situation is not symmetrical here: we can all see how
merging a container with itself could be an error (the silly programmer
just mistyped Merge(X,X) instead of Merge(X,Y)) so finding an example that
supports P_E is trivial.  On the other hand, it seems hard to construct an
example where this operation would be useful; at least, none has been
presented so far (excepting the case mentioned by Eachus where a null
container is merged with itself).

General philosophical considerations don't help: I would really need a
semi-concrete example to believe that merging a container with itself can
be handy in some situations.

****************************************************************

From: Robert Dewar
Date: Monday, October 2, 2006  2:43 AM

> General philosophical considerations don't help: I would really need a
> semi-concrete example to believe that merging a container with itself can
> be handy in some situations.

I have to agree with this. The sort of thing we need is an example where
the self-merge just falls out as a special case rather than an 
intentional deliberately written case. Similar to justifying the
need to support comparing an item with itself, not because people
write

   if X = X then

but because people do write

   if X (J) = X (I) then

where at some point J and I just happen to be equal.

I must say I have tried to think of an example for the merge
case and have not succeeded.

****************************************************************

From: Pascal Leroy
Date: Monday, October 2, 2006  3:14 AM

...
> It seems to me that the important postcondition for the Merge 
> procedure is:
> 
>     Length(Target) = Length(old Target) + Length(old Source)
> 
> The part about "Length(Source) = 0" seems like an efficiency 
> hack that allows us to destroy the Source.

You cannot be serious!

It has been argued that passing the same container twice was a mistake
(Tuck's viewpoint).  It has been argued that passing the same container
twice was a limiting case (Matt's viewpoint).  But so far no-one has
argued that it was a frequent occurrence.  Now you are saying: look, this
operation which occurs once in a blue moon is not a mistake and it should
succeed, but by the way it has the potential to silently consume
considerable quantities of storage.

It's always useful to consider what happens if the containers are used to
store big objects.  One example that Matt used time and again during the
design of the containers is that of a video stream, where each element of
the container would be a frame.  Say that you are merging two streams
(perhaps the odd- and even-numbered frames of a sequence that is being
de-interlaced).  Now you're telling me that if I happen to pass the same
video stream twice (which looks like an error to me, but that's not the
point) it is going to duplicate each frame of the source stream, thereby
consuming gigabytes of data that is likely to only be recovered when the
main program exits!

That seems like a sure recipe for disaster: if you're lucky it will raise
Storage_Error hic et nunc, but if you are not it's going to raise it a
million light years away.  I hope you like debugging.

> Does one actually 
> want to look at Source after the Merge?

Remember that the case we are discussing is probably the case where the
containers are referenced by access values (I realize that this is not the
only way that you can pass the same container twice).  So chances are that
the containers are allocated from some storage pool, and I would argue
that the first thing you want to do after the Merge is to
unchecked-deallocate Source (e.g. to free bookkeeping information held
internally by the container).  You'd be sorry if that caused Target to be
deallocated, and cursors pointing into Target to become invalid.

> What's wrong with thinking that Merge(X,Y) is equivalent to:
> 
>     X := Merge(X,Y); -- the Merge function
>     if X and Y are distict objects then
>         Y := Empty;
>     end if;
> 
> In other words, just tweak the postcondition so that the part about
> Length(Source) = 0 is only true if they're not the same.

What is wrong is the if_statement.  By definition, if_statements introduce
discontinuities, and you want the post-condition to be as "continuous" as
possible otherwise it's very hard to reason about it.  The postcondition
of the Merge function is just fine because it doesn't have any special
case, but by introducing a special case for the Merge procedure (and one
that is unlikely and fairly hard to detect) you adding unnecessary
complexity. 

****************************************************************

From: Pascal Leroy
Date: Monday, October 2, 2006  3:27 AM

> We already have operations that move elements from one list 
> to another.  Each of these cases:
> 
> X.Move (X);
> X.Splice (No_Element, X);
> 
> is a no-op.  So why isn't 
> 
> X.Merge (X);
> 
> a no-op too?

Maybe because the name doesn't convey that meaning?

I have always found the semantics of Splice confusing, and I re-read it
today and it still seems obscure, but I lost that battle long ago.  It's
intuitively clear to me what Splice does if the containers are distinct
and the cursors are non-null.  For other cases, I have to look at the
wording of the RM to know what's supposed to happen.  (Exercise for the
readers: look at the Ada specs for the various Splices and try to guess
under what circumstances C_E is *not* raised when a null cursor is given,
and what the subprogram does in that case.)

Move, on the other hand, is intuitively clear: if I move some piece of
data from one location to the same location, nothing happens, that much is
clear.

Merge only has an intuitive meaning to me when the containers are
distinct: if I merge two sorted decks of cards, I end up with one sorted
deck containing all the cards.  I never try to merge a deck with itself.

Intuition matters and the identifiers matter to guide intuition.  We can
invent any semantics we like, but it's not going to help users if they
cannot have a simple mental model of what's going on.

****************************************************************

From: Robert Dewar
Date: Monday, October 2, 2006  3:40 AM

> Merge only has an intuitive meaning to me when the containers are
> distinct: if I merge two sorted decks of cards, I end up with one sorted
> deck containing all the cards.  I never try to merge a deck with itself.

Actually I find the deck of cards analogy leans slightly in the opposite
direction.

The idea of merging seems to be to add in some new stuff that was not
there before, if you are merging from the same source, then there is
no new stuff, so there is no effect.

****************************************************************

From: Brad Moore
Date: Tuesday, October 3, 2006  12:07 AM

I have another analogy to consider.
 
If I say to someone:
 
"I plan to merge the coins in your pocket in with the coins in my pocket"
 
It is very easy to conceptualize what I am proposing, regardless whether
either party has any pocketed coins or not before this transaction takes place.
 
It does not however make sense for me to say:
 
"I plan to merge the coins in my pocket"
 
It takes two to tango, and it takes two to merge.
 
Now consider if I decide that I must give all the coins in my pocket to
charity so I say:
 
"I plan to donate the coins in my pocket to be shared equally amonst all
members of the Royal Family living below the poverty line"
 
When my brain starts to jump through hoops to figure what it really means
to merge a single item, or to divide x items amongst zero people, the same
thing happens. A mental hiccup. In other words, my brain raises an exception.
Perhaps Ada should do the same.
 
Also, if finding a programming example of merging an object into itself
is proving to be elusive, try finding an English sentence using the word
"merge" where there truly is only one item being merged. I'm not saying it
can't be done, but I am not having any luck trying.
 
eg. "She merged the ingredients to make the cake."  vs "She merged the
ingredient to make the cake" "They merged two companies" vs "They merged a company"
 
****************************************************************


Questions? Ask the ACAA Technical Agent