Version 1.1 of ai12s/ai12-0251-2.txt


!standard 5.5.2(2/3)          18-03-28 AI12-0251-2/01
!standard 5.5.2(5/4)
!standard 5.5.2(7/3)
!class Amendment 18-03-28
!status work item 18-03-28
!status received 18-03-28
!priority Low
!difficulty Medium
!subject Parallel loop chunking libraries

!summary

A library package is proposed that gives the programmer a mechanism to explicitly break a range of values to be iterated over into a set of non-overlapping chunks, which may then be executed in parallel by a parallel loop statement.

!problem

There are certain parallel loop algorithms where the programmer needs to explicitly split the iteration of the loop into smaller "chunks", where each chunk can execute in parallel with the others. For instance, there are cases where each chunk has its own initialization or finalization needs. In some cases, the programmer does not need to know or specify how many chunks are created, and would prefer that the implementation suggest how the iterations should be broken down into chunks.

In many cases, it does not matter whether the user's chunking maps directly to the chunking that the implementation would apply for a parallel loop. That is, the compiler simply applies parallelism to the iteration over the chunks, which may result in the chunks themselves being further chunked. In other words, the compiler does not need to treat such a loop any differently from any other parallel loop.

Dividing a set of iterations into evenly sized chunks can be non-trivial and error-prone. The programmer should not be expected to write logic to do this every time it is needed.

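To illustrate the bookkeeping a programmer would otherwise write by hand, here is a minimal sketch in plain Ada 2012. The names Even_Split, Bounds, and Bounds_Array are hypothetical and are not part of this or any other proposal; the sketch only shows one way to spread a remainder across chunks so that chunk lengths differ by at most one.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Even_Split_Demo is

   type Bounds is record
      Start, Finish : Integer;
   end record;

   type Bounds_Array is array (Positive range <>) of Bounds;

   --  Hypothetical helper (not part of the proposal): split First .. Last
   --  into Chunks contiguous, non-overlapping pieces whose lengths differ
   --  by at most one.
   function Even_Split
     (First, Last : Integer;
      Chunks      : Positive) return Bounds_Array
     with Pre => Last >= First and then Chunks <= Last - First + 1;

   function Even_Split
     (First, Last : Integer;
      Chunks      : Positive) return Bounds_Array
   is
      Total  : constant Positive := Last - First + 1;
      Base   : constant Natural  := Total / Chunks;    --  minimum length
      Extra  : constant Natural  := Total rem Chunks;  --  chunks with one more
      Result : Bounds_Array (1 .. Chunks);
      Next   : Integer := First;
   begin
      for I in Result'Range loop
         declare
            --  The first Extra chunks absorb the remainder.
            Len : constant Positive :=
              Base + (if I <= Extra then 1 else 0);
         begin
            Result (I) := (Start => Next, Finish => Next + Len - 1);
            Next := Next + Len;
         end;
      end loop;
      return Result;
   end Even_Split;

   Parts : constant Bounds_Array := Even_Split (1, 10, Chunks => 3);

begin
   --  The chunks must cover the whole range with no gaps or overlaps.
   pragma Assert (Parts (Parts'First).Start = 1);
   pragma Assert (Parts (Parts'Last).Finish = 10);
   for I in Parts'First .. Parts'Last - 1 loop
      pragma Assert (Parts (I).Finish + 1 = Parts (I + 1).Start);
   end loop;

   for I in Parts'Range loop
      Put_Line (Integer'Image (Parts (I).Start) & " .."
                & Integer'Image (Parts (I).Finish));
   end loop;
end Even_Split_Demo;
```

For the range 1 .. 10 and three chunks, this yields the boundaries 1 .. 4, 5 .. 7, and 8 .. 10. Even in this small form, the off-by-one opportunities in the length and remainder arithmetic are easy to see.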
!proposal

To provide this capability, a library package is proposed that gives the user a mechanism to determine the number of chunks and their iteration boundaries. This proposal depends on AI12-0??? (Parallel container iterators), which defines the chunk array that can be returned for a parallel iterator.

!wording

Add a new subclause 5.5.3

5.5.3 User-Defined Parallel Loop Chunking for Discrete Types

The following language-defined generic library package exists:

   with Ada.Iterator_Interfaces;
   with Ada.Containers;

   generic
      type Loop_Index is (<>);
   package Ada.Discrete_Chunking is

      pragma Preelaborate;
      pragma Remote_Types;

      function Has_Element (Position : Loop_Index) return Boolean is (True);

      package Chunk_Array_Iterator_Interfaces is new
        Ada.Iterator_Interfaces (Loop_Index, Has_Element);

      function Split
        (Chunks : Natural    := 0;
         From   : Loop_Index := Loop_Index'First;
         To     : Loop_Index := Loop_Index'Last)
         return Chunk_Array_Iterator_Interfaces.Chunk_Array'Class
         with Pre         => Chunks <= Loop_Index'Pos(To) - Loop_Index'Pos(From) + 1,
              Nonblocking => False;

   private
      ... -- not specified by the language
   end Ada.Discrete_Chunking;

   function Split
     (Chunks : Natural    := 0;
      From   : Loop_Index := Loop_Index'First;
      To     : Loop_Index := Loop_Index'Last)
      return Chunk_Array_Iterator_Interfaces.Chunk_Array'Class
      with Pre => Chunks <= Loop_Index'Pos(To) - Loop_Index'Pos(From) + 1;

Split returns a Chunk_Array object (see 5.5.1) for user-defined parallel iteration over a range of values of a discrete subtype, starting with the value From and ending with the value To. The value of Chunks indicates the number of elements in the Chunk_Array result. If the value of Chunks is zero, then the number of elements of the Chunk_Array result is determined by the implementation.

   declare
      package Manual_Chunking is new Ada.Discrete_Chunking (Color); -- See 3.5.1

      Chunks : constant
        Manual_Chunking.Chunk_Array_Iterator_Interfaces.Chunk_Array'Class :=
          Manual_Chunking.Split (Chunks => 3, From => White, To => Black);
   begin
      parallel for Chunk in 1 .. Chunks.Length loop
         declare
            File : Text_IO.File_Type;
         begin
            Text_IO.Create (File => File,
                            Name => "Team" & Natural'Image(Chunk) & ".txt");

            -- Write each color of this chunk to the chunk's own file
            for Hue in Chunks.Start(Chunk) .. Chunks.Finish(Chunk) loop
               Text_IO.Put_Line (File, Color'Image(Hue));
            end loop;

            Text_IO.Close (File);
         end;
      end loop;
   end;

!discussion

The proposal for container iterators (AI12-0266-1) provides a mechanism to apply chunking to containers. Since all the containers have an Iterate primitive that returns a parallel iterator object from which a chunk array can be obtained, it was felt that the same capability should exist for discrete types. This proposal provides a simple mechanism to extend that capability to a range of values of a discrete subtype.

Some loops need initialization or finalization for each executor that would be too expensive to perform on every iteration, but is worthwhile if performed once per chunk of execution. An example is a loop that writes results to a file, where the file needs to be opened for each executor and closed after the executor completes its chunk. Other possible uses include memory allocation for temporary data structures.

Such loops require explicit user-defined chunking, where the user code explicitly calls the Split function to obtain a Chunk_Array object. This allows the user to express the parallelism more explicitly by writing an outer loop that iterates over the chunks and an inner loop that iterates over the elements of each chunk.

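For comparison, the same two-level structure can be approximated today in Ada 2012 with one task per chunk. The following is only a sketch under stated assumptions: the names Worker, Run, Partial, and Bounds are hypothetical, the chunk boundaries are hard-coded as a stand-in for what Split would return, and a running sum stands in for the per-chunk file or temporary data structure.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Per_Chunk_Demo is

   type Bounds is record
      Start, Finish : Integer;
   end record;

   type Bounds_Array is array (Positive range <>) of Bounds;

   --  Hard-coded stand-in for the chunk boundaries Split would return.
   Chunks : constant Bounds_Array (1 .. 3) := ((1, 4), (5, 7), (8, 10));

   --  One slot per chunk; each worker writes only its own slot.
   Partial : array (Chunks'Range) of Integer := (others => 0);

   task type Worker is
      entry Run (Chunk : Positive);
   end Worker;

   task body Worker is
      Mine : Positive;
      Sum  : Integer := 0;  --  per-chunk initialization, done once
   begin
      accept Run (Chunk : Positive) do
         Mine := Chunk;
      end Run;

      --  Inner loop: the iterations belonging to this chunk.
      for I in Chunks (Mine).Start .. Chunks (Mine).Finish loop
         Sum := Sum + I;
      end loop;

      Partial (Mine) := Sum;  --  per-chunk finalization, done once
   end Worker;

   Total : Integer := 0;

begin
   declare
      Workers : array (Chunks'Range) of Worker;
   begin
      --  Outer loop: hand one chunk to each worker.
      for C in Workers'Range loop
         Workers (C).Run (C);
      end loop;
   end;  --  leaving the block waits for every worker to terminate

   for P of Partial loop
      Total := Total + P;
   end loop;

   pragma Assert (Total = 55);
   Put_Line ("Total =" & Integer'Image (Total));
end Per_Chunk_Demo;
```

The task declarations, the rendezvous, and the result array are all scaffolding that a parallel loop over a Chunk_Array would make unnecessary, leaving only the per-chunk setup, the inner loop, and the per-chunk teardown.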
It was considered whether syntax should be used to provide this same capability, but that seemed like a heavy solution to a problem that can be easily solved with a simple library.

!ASIS

None should be needed for a library.

!ACATS test

ACATS C-Tests are needed to check that the new library is supported. Generally, we do not write B-Tests for libraries, as they would just test basic Ada rules that are already tested by the use of any package.

!appendix

From: Brad Moore
Sent: Wednesday, March 28, 2018 12:19 AM

This is yet another new AI carved off from AI12-0119-1 (Parallel Operations)
[This is version /01 of the AI - ED]

This one is a library-only alternative to AI12-0251-1 (Explicit chunk
definition for parallel loops), which is a syntax-based solution.

This solution piggybacks onto the parallel container iterator capability.
That AI defines a parallel iterator interface that returns a chunk array for
a container.

This proposal simply inherits that same interface but applies it to a discrete
range and provides a chunk array that the programmer can use to iterate
through chunks.


From: Tucker Taft
Sent: Wednesday, March 28, 2018  3:17 PM

This should probably be identified as AI12-0251-2, to make clear this is an
alternative to that proposal.  I would doubt that we want to end up with both. 

[The rest is about AI12-0251-1, and is filed there. - ED]


From: Randy Brukardt
Sent: Thursday, March 29, 2018  12:38 AM

Your editor does read and follow the minutes, even if not everyone else
does. ;-)

