CVS difference for ais/ai-30302.txt

Differences between 1.12 and version 1.13
Log of other versions for file ais/ai-30302.txt

--- ais/ai-30302.txt	2004/11/03 00:53:43	1.12
+++ ais/ai-30302.txt	2004/11/14 06:37:26	1.13
@@ -24157,33 +24157,569 @@
 
 ****************************************************************
 
-From: Matthew Heaney
-Sent: Friday, September 2, 2004  9:31 AM
+From: Randy Brukardt
+Sent: Tuesday, November 2, 2004  7:04 AM
+
+This update (AI-302-3/08) mainly reorganizes the text to add Hashed_Sets and
+Ordered_Maps to the containers. Hopefully, this will be more consistent than
+the previous definition.
+
+Most of the work on this was done by Pascal Leroy, although I did make some
+editorial fixes to it.
+
+Note that the version to be considered at the next ARG meeting in Atlanta
+needs to be finished before November 15th, so try to make comments soon.
+(But no need to repeat existing ones.)
 
+This AI is in the usual place at http://www.ada-auth.org/ais.html.
+
 ****************************************************************
 
 From: Matthew Heaney
-Sent: Friday, September 2, 2004  9:31 AM
+Sent: Saturday, November 13, 2004  12:31 PM
+
+In the interest of fully exploring all the issues at hand, I offer a brief
+rebuttal to Randy's points below. --MJH
+
+
+> -----Original Message-----
+> From: Randy Brukardt [mailto:randy@rrsoftware.com] 
+> Sent: Wednesday, October 27, 2004 11:58 PM
+> To: Ada Comment
+> Subject: [Ada-Comment] Matt's review of ai-302-3/07 draft (resp. 1)
+> 
+> 
+> ...
+> > A.17 Containers
+> >
+> > ...
+> >
+> 
+
+Matt said:
+
+> > In particular, it is perfectly acceptable (in fact, the API is
+> > designed to facilitate this) for multiple tasks to be 
+> iterating over a 
+> > same container object, using either cursors or the passive iterator.
+
+
+Randy replied:
+
+> Maybe, but it violates A(3).
+
+MJH:
+
+Does A(3) apply when the calls denote the same container ("overlapping
+objects")?
+
+ENDMJH:
+
+
+> I would be very opposed to trying to repeal A(3) for these 
+> packages. There's a number of reasons for that:
+> 
+>   1) It would be different than all other Ada-defined 
+> packages. That would add to user confusion.
+
+MJH:
+
+Well, I could argue that what is confusing is the fact that a container can
+be modified during a read-only call.  What exactly does an in-mode parameter
+even mean?
+
+ENDMJH.
+
+
+>   2) It would prevent implementations from doing any sort of 
+> modifications on reading. For instance, the common technique 
+> of making most recently referenced elements to be more 
+> accessible in a hash table couldn't be used. Nor could caches 
+> or reference counts. We had similar restrictions in some 
+> operations in Claw, and it proved to be very constraining.
+
+MJH:
+
+Fair enough, but realize that that permission will more than likely violate
+users' expectations, if the container is passed as an in-mode parameter.  At
+a minimum, you're going to have to use some kind of indirection to do this
+(a la function Random).
+
+The problem with the internal counter technique you want to use to keep
+track of invocations of Iterate is that Iterate passes the container as an
+in-mode parameter.  Either some sort of trickery will be involved, or you'll
+have to allocate the counter object (yech!).
+
+ENDMJH.
+
+
+>   3) It would require a lot of wording to implement. We'd 
+> have to define precisely which operations are reading and 
+> which are writing; that would be complex to do.
+
+MJH:
+
+But defining those operations is required in any case, since if you want to
+require that an implementation raise Program_Error when an operation is
+called during Iterate, you must define which specific operations are
+required to raise PE.
+
+ENDMJH.
 
+
+>   4) It would make it far more likely for users to try to 
+> access containers from multiple tasks and would lead to 
+> errors. For instance, your proposed semantics wouldn't allow 
+> one task to write a container and another to read it, but 
+> it's likely that users would try to do so.
+
+
+MJH:
+
+My counter-argument is that changing some state behind the scenes (meaning
+that there are state changes even when the container is passed as an in-mode
+parameter) is going to cause you even greater problems, since most users
+will assume (in spite of A(3)) that since the container is passed as an
+in-mode parameter, that there are no state changes and therefore it's safe
+for multiple tasks to call in-mode operations.  Passing the container as
+in-mode and making a state change is like false advertising.
+
+ENDMJH.
+
+
+> The current rule is quite simple: if you want to use multiple 
+> tasks on a single container (or any other entity of a 
+> predefined type), wrap it in a protected object. Any other 
+> rule is going to be far more complex, both to use and to understand.
+
+MJH:
+
+As I argue above, prohibiting simultaneous reads by multiple tasks is
+already hard to understand.  Most developers will assume that multiple
+readers are allowed, and they will therefore most likely corrupt the
+internal state that we don't advertise can change.
+
+ENDMJH.
+
+
+> We've always expected that a secondary standard would 
+> consider task-safe "protected" containers. But the locking 
+> needed for that is quite expensive, and it certainly 
+> shouldn't be mandated.
+
+MJH:
+
+My main argument is that if there is "physical" state that can change (even
+if the "logical" state does not change), then we need to advertise this
+fact.
+
+ENDMJH.
+
 ****************************************************************
 
 From: Matthew Heaney
-Sent: Friday, September 2, 2004  9:31 AM
+Sent: Saturday, November 13, 2004  6:39 PM
+
+> Another response to Matt's review:
+> 
+> > procedure Iterate
+> >    (Container : in List;
+> >     Process : not null access procedure (Position : in Cursor));
+
+...
+
+> > MJH:
+> >
+> > This requirement is redundant, since we already have a 
+> meta-rule that 
+> > says container behavior isn't specified if the container object is 
+> > simultaneously read from and written to. This rule applies even if 
+> > it's the same task doing the reading and writing (as would 
+> be the case 
+> > when a container is modified during passive iteration).
+> 
+> There is no such rule that I know of. A(3) of course covers 
+> it for multiple tasks, but for a single task, the only rules 
+> are the ones I just added to Query_Element, Update_Element, 
+> and Iterate.
+> 
+> Moreover, such a rule would require careful definition of 
+> exactly which operations are reading and which are writing. 
+> (You can't assume that that is "obvious" in the Standard!) 
+> There certainly is nothing that complex in the current AI. 
+
+MJH:
+
+My assumption was that it's perfectly clear which operations are reading and
+which are writing, since that's what the parameter mode indicates.  Any time
+the container is modified, we pass the container as the first parameter,
+with parameter mode inout.
+
+If you do add the check to determine whether passive iteration is in
+progress, then you're still going to have to specify which operations are
+supposed to raise PE.  For example, is calling function Length allowed
+during passive iteration?  What about First or Last?
+
+ENDMJH.
+
+
+> > Note that the suggested implementation of the check doesn't 
+> work when 
+> > there are multiple reader tasks.  A container must support 
+> > simultaneous reading by multiple tasks.
+> 
+> A(3) only requires that multiple tasks work for disjoint 
+> objects. I'm very opposed to going further (see my other note).
+
+
+MJH:
+
+That's fine, but how do you intend to implement the counter?  The container
+is passed as an in-mode parameter.
+
+An in-mode parameter is telling me that there is no (logical) state change.
+You want to allow a physical state change, even when the parameter is
+in-mode.  Fair enough, I'm all in favor of maximizing vendor freedom, but
+only to a point, since otherwise the problem becomes intractable, as you
+attempt to divine all the myriad things a vendor *might* do, were he allowed
+to change the (physical) state for any operation.
+
+One potential human-engineering issue is that if a user isn't aware that a
+container can always change state (even when it's passed as an in-mode
+parameter), he may inadvertently corrupt the container, even if he calls
+only in-mode operations (using multiple tasks).  But that might just mean
+we'll have to educate him.
+
+ENDMJH.
+
 
+> Since virtually everything that Matt has to say is based on 
+> the two above fallacies, I'll just reiterate the important points:
+> 
+> The Ada philosophy is that unspecified (resp. erroneous 
+> execution and bounded error) behavior is bad. Even if there 
+> was a "meta-rule" (whatever that is), it would be a bad 
+> thing. We only allow less than completely specified behavior 
+> when it is too expensive to check.
+> 
+> If we do allow less than completely specified behavior, we 
+> try to specify it as tightly as possible. (That is a bounded 
+> error if we can.) We don't rely on "meta-rules" that say 
+> anything goes; that's for the C and C++ folks.
+> 
+> In this case, we want to define what the passive iterator 
+> does in all cases. That helps code be portable and safe. The 
+> only real advantage of the passive iterator is that it is 
+> safe -- you'll only get valid cursors from it, and you'll get 
+> each one exactly once, and so on.
+
+MJH:
+
+But we do define portable and safe behavior, but only when the user promises
+not to modify the container while iterating.  If he modifies the container
+while iterating, then he can suffer the consequences.  This is just basic
+economics.
+
+ENDMJH.
+
+
+> If we don't protect against deletion of elements, there is no 
+> chance for the iterator to be implemented safely (certainly 
+> not if you use one node per element, as is likely for most of 
+> the containers, like lists and maps). Indeed, it would be 
+> hard for it to be implemented without being erroneous in some 
+> case, as Matt gives examples to describe. I find "erroneous" 
+> to be completely unacceptable for the passive iterator. It 
+> should *never* give the Process routine a dangling pointer or 
+> run over random memory. If it can do that, we shouldn't have 
+> it at all, because it them offers nothing at all over the 
+> active case (it's harder to use, and *less* safe?).
+> 
+> I tried to define this as a bounded error, with it being 
+> unspecified what cursors are returned, but Matt convinced me 
+> that that was too hard to implement, and that Insert and 
+> Delete in a passive iterator is a mistake anyway. Fine; then 
+> simply checking that these don't happen is the better option.
+
+
+MJH:
+
+I'd love it if we could find a way for this to be a bounded error.  (Hey,
+I'd accept erroneous behavior, but I appear to be losing that battle.)
+
+ENDMJH.
+
+ 
+> In any case, we should err on the side of safe and 
+> predictable. The containers have far too much "unspecified" 
+> behavior for my taste already. (I'd prefer checks on 
+> Query_Element and Update_Element, but these would have some 
+> space overhead on every element, which might be too much - 
+> which is why I didn't write them in.)
+> 
+> Matt seems to have little interest in portable, predicable 
+> behavior. I don't understand that, unless he thinks everyone 
+> is going to use his implementations anyway (so it doesn't 
+> matter what the standard says). I can say with certainty that 
+> that will not be the case; the standard has to provide the 
+> predictability.
+
+
+MJH:
+
+Similar to what Dijkstra said, my brain can only tolerate a certain
+threshold of complexity, so I have to accept my limitations, and find some
+way to solve hard problems.  Yes, I tend to view this specification through
+the lens of one particular implementation, but that's the only way I know
+how to make this problem tractable.  It is certainly not because I don't
+care about portability and predictability.
+
+Our disagreement might simply reflect our difference in problem solving
+approaches: I tend to be a bottom-up designer, and you tend to be a top-down
+designer.  Vive la difference...
+
+ENDMJH.
+
 ****************************************************************
 
 From: Matthew Heaney
-Sent: Friday, September 2, 2004  9:31 AM
+Sent: Saturday, November 13, 2004  7:05 PM
+
+> > ...
+> >
+> >    generic
+> > ...
+> >    package Generic_Keys is
+> >
+> > ...
+> >        procedure Replace (Container : in out Set;
+> >                           Key       : in     Key_Type;
+> >                           New_Item  : in     Element_Type);
+> >
+> > MJH:
+> >
+> > This operation needs to be removed from this API.
+> ...
+> (followed by a lot of irrelevant discussion)
+> > The bottom line is that Generic_Keys.Replace must be 
+> removed from this 
+> > API.  It provides no new functionality, and only adds unnecessary 
+> > clutter.
+> 
+> For someone who is always proposing unnecessary operations, 
+> this is quite a stretch! I don't understand why you feel so 
+> strongly about this operation.
+
+MJH:
+
+Because it has goofy semantics, and no sensible person will use it.
+
+ENDMJH.
+
+
+> Certainly, this isn't the most useful routine, except in one 
+> case: when a project decides to move from a Map to a Set 
+> representation.
+
+MJH:
+
+The purpose of Generic_Keys is to allow key-based manipulation of a set, not
+to make it easy to change from a map to a set.  Even if a project does use
+Generic_Keys to facilitate such a change, they still have work to do: at a
+minimum they'll have to instantiate Generic_Keys.  If they need a key-based
+Replace too, then they can easily write it themselves as a local procedure.
+
+ENDMJH.
+
+
+> Therefore, it should be the case that as many operations as 
+> possible from the Map be available in the Set, and in 
+> particular in the Generic_Keys (since that is the direct 
+> counterpart of the Map). Forcing half the operations to be 
+> rewritten or moved is simply not acceptable.
+
+MJH:
+
+Fine, they can use Generic_Keys, and if they choose, write their own
+Replace.
+
+ENDMJH.
+
+
+> Thus, the Replace operation is declared here simply because a 
+> similar operation is declared in the Maps. I would have 
+> preferred to declare an Insert and Include as well, but these 
+> require goofy checks that the key is compatible with the 
+> element. Note that I would not expect any of these operations 
+> to be used in new (hand-written) code; they'd only be used 
+> when a Map is converted into a Set. So the fact that their 
+> semantics is a little weird is irrelevant -- they're simply 
+> an aid for changing between containers.
 
+MJH:
+
+But this particular aid doesn't need to be provided by us.  It can be
+implemented using other container primitives.
+
+ENDMJH.
+
+
+> Anyway, I want Generic_Keys to be as similar as possible to 
+> Maps (and indeed, our style guide will recommend avoiding 
+> operations that are not common, like Update_Element - even if 
+> it is renamed, it won't have the same parameter list or 
+> semantics). That means that there will be a few operations 
+> with odd semantics. So be it.
+
+MJH:
+
+That the key-based Replace operation has "odd semantics" hardly bolsters
+your case.  It needs to go...
+
+ENDMJH.
+
 ****************************************************************
 
-From: Matthew Heaney
-Sent: Friday, September 2, 2004  9:31 AM
+From: Randy Brukardt
+Sent: Saturday, November 13, 2004 11:55 PM
+
+General remark: I wish you'd say something only once. Repeating yourself just bulks
+up our mail files, and makes me repeat myself, ad nausem.
+
+Matt wrote:
+
+> > Certainly, this isn't the most useful routine, except in one 
+> > case: when a project decides to move from a Map to a Set 
+> > representation.
+> 
+> MJH:
+> 
+> The purpose of Generic_Keys is to allow key-based manipulation of a set, not
+> to make it easy to change from a map to a set.  Even if a project does use
+> Generic_Keys to facilitate such a change, they still have work to do: at a
+> minimum they'll have to instantiate Generic_Keys.
+
+Of course. And change the names of the types. But that's all that they should have to
+do.
+
+> If they need a key-based
+> Replace too, then they can easily write it themselves as a local 
+> procedure.
+
+That's not much an argument, for two reasons: (a) it applies to virtually everything
+in Generic_Keys, and (b) such a subprogram would be in the wrong package, and you'd
+still have to change every call. (At least those of us who never use use clauses
+would.)
+
+...
+> > Thus, the Replace operation is declared here simply because a 
+> > similar operation is declared in the Maps. I would have 
+> > preferred to declare an Insert and Include as well, but these 
+> > require goofy checks that the key is compatible with the 
+> > element. Note that I would not expect any of these operations 
+> > to be used in new (hand-written) code; they'd only be used 
+> > when a Map is converted into a Set. So the fact that their 
+> > semantics is a little weird is irrelevant -- they're simply 
+> > an aid for changing between containers.
+> 
+> But this particular aid doesn't need to be provided by us.  It can be
+> implemented using other container primitives.
 
+Sure, as can virtually everything else in Generic_Keys. That argument could be used
+to suggest that most of the operations in Generic_Keys be dropped.
+
+Certainly the Delete, Exclude, and Equivalent_Keys operations should be dropped:
+they're easy to write yourself and hardly ever would be used.
+
+Indeed, the only operations that would be a problem to write yourself are
+Checked_Update_Element and Find (both for performance reasons). The rest of them are
+unnecessary (if you take the view you have above).
+
+Let me say it again. I want to make moving between a Map and Set as easy as possible.
+Rewriting nearly every call doesn't work!
+
+If we can't have that, (and your view holds sway), then lets get rid of all of the
+junk operations that exist mainly to make it easier to port (or are just junk
+operations): the ones named above, "Swap" in lists, "Equivalent_Elements" in
+Hashed_Sets, "Equivalent_Keys" everywhere, ordering operators in ordered containers,
+etc. And let's name all of the operations that have different semantics/profiles
+differently.
+
 ****************************************************************
 
-From: Matthew Heaney
-Sent: Friday, September 2, 2004  9:31 AM
+From: Randy Brukardt
+Sent: Sunday, November 14, 2004 12:31 AM
+
+Matt said:
+> Matt said:
+>
+> > > In particular, it is perfectly acceptable (in fact, the API is
+> > > designed to facilitate this) for multiple tasks to be
+> > iterating over a
+> > > same container object, using either cursors or the passive iterator.
+>
+>
+> Randy replied:
+>
+> > Maybe, but it violates A(3).
+>
+> Does A(3) apply when the calls denote the same container ("overlapping
+> objects")?
+
+Yes, of course. Note that the parameter mode isn't mentioned in A(3): it
+*always* applies to by-reference parameters.
+
+...
+> > The current rule is quite simple: if you want to use multiple
+> > tasks on a single container (or any other entity of a
+> > predefined type), wrap it in a protected object. Any other
+> > rule is going to be far more complex, both to use and to understand.
+>
+> As I argue above, prohibiting simultaneous reads by multiple tasks is
+> already hard to understand.  Most developers will assume that multiple
+> readers are allowed, and they will therefore most likely corrupt the
+> internal state that we don't advertise can change.
+
+I would presume that these "most developers" aren't Ada programmers used to
+using tasking, because experienced Ada programmers know about avoid multiple
+access to the same object. Most of them stumble over it the first time they
+write a tasking program (using Text_IO), and ought not to be surprised
+afterwards.
+
+> > We've always expected that a secondary standard would
+> > consider task-safe "protected" containers. But the locking
+> > needed for that is quite expensive, and it certainly
+> > shouldn't be mandated.
+>
+> My main argument is that if there is "physical" state that can change
+(even
+> if the "logical" state does not change), then we need to advertise this
+fact.
+
+I'd have no problem making the container parameters to Iterate,
+Query_Elements, and Update_Elements in out. (Indeed, our style guide here
+says to do that always for primitive operations of extensible types, in
+order to not fence in extensions. It's unfortunate that that cannot be done
+for functions...)
+
+But I disagree with your fundimental point. An 'in' parameter of a composite
+type is really advertising that the logical properties of the type don't
+change. The programmer should never be concerned about what actually happens
+behind the scenes of the implementation. Writing code that depends on the
+implementation not modifying something is simply wrong, IMHO.
+
+After, 'in' parameters that aren't fully constant have a long history in
+Ada. Text_IO doesn't advertise physical (or even logical) state changes.
+Claw certainly does not. Ada.Numerics.Float_Random certainly has state
+changes in its generator parameters. Even Ada.Strings.Unbounded could change
+its 'in' parameters (I don't think any of the  vendor implementations
+actually do that, but I've seen a couple of replacement packages that do.)
+The Rosen trick is widely used.
+
+So you shouldn't put too much meaning into 'in' parameters. I'd certainly
+not trust any tasking behavior unless the parameter was by-copy or the
+routine was explicitly documented to be task-safe. And predefined Ada
+packages are not required to be task-safe in this way.
 
 ****************************************************************
 

Questions? Ask the ACAA Technical Agent