CVS difference for ai12s/ai12-0266-1.txt

Differences between 1.8 and version 1.9

--- ai12s/ai12-0266-1.txt	2018/10/17 22:33:36	1.8
+++ ai12s/ai12-0266-1.txt	2019/01/04 04:24:24	1.9
@@ -1415,3 +1415,396 @@
 
 ****************************************************************
 
+From: Randy Brukardt
+Sent: Monday, November 19, 2018  11:18 PM
+
+The (very drafty) minutes for AI12-0266-1 say:
+
+> Can Split return zero chunks? Randy thought not, since something has
+> to be iterated, but others think that is also silly. So we need to
+> adjust the subtypes.
+
+OTOH, approved AI12-0251-1 (which defines a chunk_specification) says:
+
+> When the chunk_specification is elaborated, a check is made that the
+> determined maximum number of logical threads of control is greater
+> than zero. If this check fails, Program_Error is raised.
+
+So here we're not allowing zero chunks. It seems weird to me that a parallel
+iterator would allow zero chunks even though the underlying loop would not.
+
+Brad had used a subtype to ensure that the routines couldn't return no chunks,
+and it seems to me that that would be more consistent with the
+chunk_specification.
+
+Thoughts??
+
+****************************************************************
+
+From: Tucker Taft
+Sent: Tuesday, November 20, 2018  9:19 AM
+
+> So here we're not allowing zero chunks. It seems weird to me that a
+> parallel iterator would allow zero chunks even though the underlying loop
+> would not.
+
+I see these as somewhat different.  One is specifying the maximum number of
+chunks.  The other is returning the actual number of chunks.  It seems weird to
+ever specify a maximum of zero, but it seems fine to have an actual number of
+chunks being zero.  I suppose we could allow a maximum of zero if and only if
+the underlying iteration has zero elements.  This would allow a computation that
+divided the number of iterations by the chunk size, always rounding up.
+
+> Brad had used a subtype to ensure that the routines couldn't return no
+> chunks, and it seems to me that that would be more consistent with the
+> chunk_specification.
+>
+> Thoughts??
+
+I think it should be OK for the Split routine to return zero chunks if there are
+no elements.  I don't believe there is a promise that it returns the maximum
+number of chunks, but only that the number of chunks is no greater than the
+specified maximum.  So I think things are OK as is, perhaps modulo allowing a
+zero maximum if the number of elements is known to be zero as well.
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Tuesday, November 20, 2018  1:21 PM
+
+The number of chunks is also the number of logical threads of control. And I
+can't see how anything can be executed with zero logical threads of control,
+even if there are no elements. (It usually takes some execution to determine
+that there are zero elements.)
+
+Perhaps your mental model is that the actual number of logical threads of
+control is the number of chunks plus one for loop control? I don't see how that
+could be derived from the wording as given; if that's your intent, there needs
+to be at least an AARM note to that effect.
+
+****************************************************************
+
+From: Tucker Taft
+Sent: Tuesday, November 20, 2018  1:54 PM
+
+> The number of chunks is also the number of logical threads of control.
+> And I can't see how anything can be executed with zero logical threads
+> of control, even if there are no elements. (It usually takes some
+> execution to determine that there are zero elements.)
+
+I think that was probably why the original AI disallowed ever having a maximum
+of zero.  But I think the obvious thing would be that if the maximum number of
+chunks is <= 1, then you don't bother spawning any new threads.
+
+> Perhaps your mental model is that the actual number of logical threads
+> of control is the number of chunks plus one for loop control? I don't
+> see how that could be derived from the wording as given; if that's
+> your intent, there needs to be at least an AARM note to that effect.
+
+I certainly don't feel that strongly about allowing a maximum of zero.  On the
+other hand, you already have a thread which is elaborating the chunk
+specification, so there is nothing preventing that thread from at least noticing
+that the maximum is zero, and performing the check that in fact there are no
+iterations to perform.  So I don't see it as a big deal either way, and I don't
+really see the need for an AARM note, since the check that there is in fact
+nothing to do is reasonably done before we start spawning any additional
+threads.
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Wednesday, November 21, 2018  12:00 AM
+
+If that's the logic (and that's fine), then there doesn't seem to be any reason
+to disallow a maximum of zero (it just would mean that the original thread
+executes the loop, if any execution is needed). I'd expect that 1 chunk would
+mean the same thing (why spawn one thread and have the original thread sitting
+around waiting for it?), but at least logically it would be different. Certainly
+we don't want a negative maximum, so we need some sort of check in any case.
+
+The idea would be that the number is the maximum of *extra* logical threads
+(some of which might get mapped to the existing thread that is elaborating the
+loop). If that's the case, then zero makes sense -- not a lot, as it essentially
+would cancel the "parallel" keyword of the loop (of course, any Legality checks
+would remain), but it wouldn't be nonsense.
+
+Anyway, if we want to change this to make these consistent (and I do, one way
+or the other), then we need some proposed revised wording to put into a clean-up
+AI. A suggestion?
+
+****************************************************************
+
+From: Tucker Taft
+Sent: Wednesday, November 21, 2018  10:26 AM
+
+On further thought, I don't really think it works to have a zero maximum in the
+case where there is a chunk parameter if the number of iterations is non-zero,
+because what value would the chunk parameter have inside the loop?  We could
+allow zero chunks when there is no specified chunk parameter, or when a check
+indicates there are no iterations, but allowing zero chunks in general and just
+saying the loop should be treated as a sequential loop doesn't work in the
+presence of a chunk parameter.
+
+I am just about convinced to go back to having a minimum of one chunk determined
+by the chunk specification, and let the programmer use Integer'Min or equivalent
+to prevent zero showing up.  I also still think it is fine for Split to return
+fewer chunks than the maximum, including zero chunks.  I don't see the
+contradiction.  I would say the "Maximum_Split" parameter (or whatever it is now
+called) should be required to be greater than zero for consistency with this AI,
+but it seems fine if Chunk_Count returns zero when there are zero elements in
+the container.
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Wednesday, November 21, 2018  2:59 PM
+
+> ... I also still think it is fine for Split to return fewer chunks
+> than the maximum, including zero chunks.
+> I don't see the contradiction.  I would say the "Maximum_Split"
+> parameter (or whatever it is now called) should be required to be
+> greater than zero for consistency with this AI, but it seems fine if
+> Chunk_Count returns zero when there are zero elements in the
+> container.
+
+That only makes sense to me if we ban empty chunks otherwise. With the anything
+goes scheme you are proposing, you have the overhead to check for empty loops
+*and* overhead to check for zero chunks. That seems silly. (In almost all cases,
+null ranges and the like require dedicated code at the machine level.)
+
+My understanding of Brad's original proposal is that empty chunks were allowed
+but there had to be at least one chunk. That makes the most sense to me, since
+for some data structures it might be expensive to guarantee that elements exist
+in a particular range (think graphs or trees). In that case, allowing no chunks
+is just extra overhead, as that has to be checked for explicitly before
+starting, and you're doing a similar check anyway for each chunk.
+
+OTOH, if you were to not allow empty chunks, then it makes sense for an empty
+container to return no chunks. In that case, the overhead of checking for empty
+moves to the chunk part of the code (and not in the individual chunk loops). I
+think this is not as good of a model because it would require expensive complete
+walks of some data structures -- but it's not a critical issue as I don't think
+it matters for any of the existing containers. (It matters mainly for
+hypothetical user-defined containers, clearly a less important problem.)
+
+Requiring both is madness -- there should only be one way to present an empty
+container to the parallel iteration. But that seems to be what you have
+suggested.
+
+****************************************************************
+
+From: Tucker Taft
+Sent: Wednesday, November 21, 2018  4:19 PM
+
+> ...
+> Requiring both is madness -- there should only be one way to present
+> an empty container to the parallel iteration. But that seems to be what
+> you have suggested.
+
+I guess I don't see it that way.  Both seem reasonable.  One or more empty
+chunks are fine, even if there are elements in other chunks, and it is fine to
+have no chunks at all when there are no elements.  They seem orthogonal.
+
+Obviously you feel differently.  Perhaps a straw vote?
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Wednesday, November 21, 2018  6:44 PM
+
+> I guess I don't see it that way.  Both seem reasonable.  One or more
+> empty chunks are fine, even if there are elements in other chunks, and
+> it is fine to have no chunks at all when there are no elements.  They
+> seem orthogonal.
+
+"Parallel" exists to improve performance. Adding extra code (and thus execution
+time) into the iteration scheme just because you feel it is "fine" for the
+items to be empty doesn't make any sense to me. Either one or the other should
+be required to be non-empty so the extra overhead of the test that is
+inevitably required for dealing with the empty case is eliminated.
+That does seem to argue that the chunks should be non-empty (as that overhead is
+repeated for each chunk, while the other test only occurs once), but I don't
+care a lot which is chosen.
+
+If we were just defining an interface without any performance implications, then
+I'd probably agree with you (more flexible is better). But that's not the case
+here. This is the same reason that I got Brad to invert the interface so that
+the iterator queries for each chunk rather than trying to return them all at
+once -- we have to wring out every bit of avoidable overhead. And this is
+overhead that can be avoided.
+
+> Obviously you feel differently.  Perhaps a straw vote?
+
+I don't think we can effectively conduct a straw poll here, as only a few people
+will have followed the conversation enough to have an informed opinion. Most
+likely, Brad will choose some approach and then you or I or both can argue at the
+upcoming meeting for a different approach. It's only a couple of weeks from now
+anyway.
+
+****************************************************************
+
+From: Brad Moore
+Sent: Thursday, December 6, 2018  12:22 AM
+
+I have a somewhat different perspective to offer here, as it appears that apples
+and oranges are being compared, or rather, Apples and Apple slices.
+
+I think the semantics of the chunk_specification syntax in AI12-0251-1 is about
+specifying the number of logical threads of control being applied to a loop,
+whereas the Split function in AI12-0266-1 is about splitting a container into
+separate chunks of elements. The Advised_Split parameter to the Split call
+logically makes sense to me as the recommended number of splits to be applied to
+the container.
+
+So if Advised_Split is specified as 0, then to me, it makes sense to think of
+that as zero splits are applied, i.e. one chunk. (Which could be an empty chunk
+if there are no elements in the container or iterator to be iterated.)
+
+From this, an Advised_Split of 1 would mean split the container once (i.e. two
+chunks), and an Advised_Split of 2 would mean split the container twice (i.e.
+three chunks), and so on.
+
+Thus, if N is the number of logical threads specified by the chunk_specification
+in a parallel loop, it would equate to N - 1 being passed as the Advised_Split
+parameter in a call to Split.
+
+This feels like it would be simpler to explain and understand.
+
+Otherwise it feels messy and kludgy to me to try to rationalize that 0 and 1 for
+Advised_Split mean the same thing.
+
+****************************************************************
+
+From: Jean-Pierre Rosen
+Sent: Thursday, December 6, 2018  1:58 AM
+
+> So if Advised_Split is specified as 0, then to me, it makes sense to
+> think of that as zero splits are applied, i.e. one chunk. (Which could
+> be an empty chunk if there are no elements in the container or iterator
+> to be iterated.)
+>
+> From this, an Advised_Split of 1 would mean split the container once
+> (i.e. two chunks), and an Advised_Split of 2 would mean split the
+> container twice (i.e. three chunks), and so on.
+
+This looks to me as opening the door to endless discussions and explanations to
+users; although I'm one of those who hardly followed the discussions, it seems
+to me that if the name is that ambiguous, it should be changed: call it
+Advised_Chunks, there will be no discussion.
+
+****************************************************************
+
+From: Tucker Taft
+Sent: Thursday, December 6, 2018  7:17 AM
+
+I prefer Max_Chunks.  "Advised" is a weird term in this context, and implies
+that it is a "hint" rather than a maximum.
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Thursday, December 6, 2018  4:54 PM
+
+That's not the question. (At least, it wasn't *my* question.) AI12-0251-1 makes
+it clear that the maximum number of chunks cannot be zero, ergo 0 cannot be
+passed to Max_Chunks (or whatever name for the parameter was decided upon in
+Lexington). If one tries to set the maximum number of chunks to 0, Program_Error
+is raised.
+
+My concern is Split *returning* zero chunks. I see absolutely no reason to allow
+that, given that we are allowing chunks to be empty. If we do allow returning
+zero chunks, the loop code would necessarily need a special case to deal with
+that (just like dealing with null slices and empty iterations burdens the
+generated code). Typically, in Ada we have no choice about allowing zero of
+something, but in this case, we do. We have to allow *something* to be empty,
+but there's no need to allow *everything* to be empty. It doesn't make sense for
+a loop to execute with zero threads, regardless of whether that loop does
+anything. Thus, I don't believe that Split should be allowed to return zero
+chunks (nor should the maximum be allowed to be zero).
+
+****************************************************************
+
+From: Tucker Taft
+Sent: Thursday, December 6, 2018  8:27 PM
+
+Last time we went around on this I came to agree that we should at least
+disallow 0 for the "determined maximum number of chunks" in the latest update of
+the AI12-0251-1 AI.
+
+So hopefully this means we never have to allow zero for the "Max_Chunks"
+parameter (or whatever we end up calling it) to Split.
+
+As far as Split, I am fine if others believe it is better to disallow Split from
+returning zero chunks.  It does seem marginally more efficient at the point of
+use to not have to worry about this special case.
+
+****************************************************************
+
+From: Richard Wai
+Sent: Wednesday, November 21, 2018  12:10 PM
+
+I've been trying to get up to speed on the various AIs regarding parallel
+iteration, so please forgive me if I missed something!
+
+I generally take a more application-specific view on Ada. Direct use of the
+generalized iteration and indexing facilities by the programmer can be extremely
+useful in a lot of cases. We have actually used these quite a bit in our work.
+So with that perspective, I'm pretty interested in continuity/clarity in
+implementing parallelized iteration.
+
+It's probably safe to say that Chunk_Finish is related to Has_Element. Yet its
+name suggests a disconnect, which could be confusing, especially for direct
+programmer application. Where one iterates until Has_Element returns False*, it
+could seem that one iterates on a Chunk until Chunk_Finished returns True.
+
+* I note here that the original AI does suggest that Chunk_Finished returns
+False at the end of a Chunk, and I don't see any specific amendment that changes
+this. This is obviously confusing, as one would expect Chunk_Finish to return
+True when the Chunk is finished, though I recognise that we iterate until
+Has_Element returns False, so that would be consistent.
+
+So here is my humble suggestion:
+
+Change Chunk_Finished to Chunk_Has_Element, and specify that it returns True if
+and only if Position is a member of the chunk indexed by Chunk.
+
+This also makes implementation of Chunk_Index a bit easier, as it can simply
+specify a first/last Cursor, which Chink_Has_Element can check against easily
+enough.
+
+-- As a further suggestion, in this case, I think Start_Of_Chunk should then
+also be renamed to Chunk_First, for consistency.
+
+****************************************************************
+
+From: Tucker Taft
+Sent: Wednesday, November 21, 2018  12:21 PM
+
+Good suggestions, Richard!
+
+****************************************************************
+
+From: Richard Wai
+Sent: Wednesday, November 21, 2018  12:52 PM
+
+If only I had run my email through the Ada compiler first, it would have found
+all of my typos!
+
+: "Chunk_Finish" is undefined
+: "Chink_Has_Element" is undefined
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Wednesday, November 21, 2018  2:46 PM
+
+> Good suggestions, Richard!
+
+I concur, but a caution: many AIs were extensively changed by the discussions at
+the October ARG meeting, but neither the minutes nor the updated AIs (in the
+case where they were approved) are available yet. (I'm working on those as hard
+as I can with the holiday week.) So you're necessarily looking at obsolete
+versions of the AIs.
+
+In this particular case, we changed many of the names of the routines, as well
+as many of the parameters, but not the specific one you are concerned about.
+
+****************************************************************
+
