Version 1.2 of ai12s/ai12-0251-2.txt
!standard 5.5.2(2/3) 18-03-28 AI12-0251-2/01
!standard 5.5.2(5/4)
!standard 5.5.2(7/3)
!class Amendment 18-03-28
!status No Action (7-0-0) 18-10-21
!status work item 18-03-28
!status received 18-03-28
!priority Low
!difficulty Medium
!subject Parallel loop chunking libraries
!summary
A library package is proposed to provide a programmer with a mechanism to
explicitly break a range of values to be iterated over into a group of
non-overlapping iteration chunks that may be executed in parallel via a
parallel loop statement.
!problem
There are certain parallel loop algorithms where the programmer needs to
explicitly split the iteration of the loop into smaller "chunks" of iteration
where each chunk can execute in parallel with the others. For instance, there
are cases where each chunk may have initialization or finalization needs.
In some cases, the programmer does not need to know or specify how many chunks
are created, and would prefer that the implementation provide a mechanism to
suggest how the iterations should be broken down into chunks.
In many cases, it does not matter if the user's chunking directly maps to
chunking that would be applied by the implementation for a parallel loop.
That is, to the compiler, it is just applying the parallelism to iterate over
the chunks, which may result in the chunks themselves being further chunked.
In other words, the compiler does not need to treat such a loop any differently
than any other parallel loop.
Dividing a set of iterations into evenly sized chunks can be non-trivial and
error-prone. The programmer should not be expected to provide logic to do this
every time it is needed.
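As an illustration of the bookkeeping involved, the following is a minimal sketch in Ada 2012 of one way to compute evenly sized chunk bounds for an integer range. The names Chunk_Bounds and Bounds are hypothetical and are not part of this proposal; the scheme shown (giving the first Count mod Chunks chunks one extra iteration) is only one possible choice, and an implementation would be free to divide the range differently.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Even_Chunks_Demo is

   --  Hypothetical helper type (not part of the proposal):
   --  the inclusive bounds of a single chunk.
   type Chunk_Bounds is record
      First, Last : Integer;
   end record;

   --  Bounds of chunk number Index (counted from 1) when From .. To is
   --  divided into Chunks pieces whose sizes differ by at most one.
   function Bounds
     (From, To : Integer;
      Chunks   : Positive;
      Index    : Positive) return Chunk_Bounds
     with Pre => From <= To
                 and then Chunks <= To - From + 1
                 and then Index <= Chunks;

   function Bounds
     (From, To : Integer;
      Chunks   : Positive;
      Index    : Positive) return Chunk_Bounds
   is
      Count : constant Natural := To - From + 1;
      Small : constant Natural := Count / Chunks;    --  minimum chunk size
      Extra : constant Natural := Count mod Chunks;  --  chunks with one more
      --  Chunks before this one contribute Small iterations each, plus one
      --  extra iteration for each of the first Extra chunks.
      First : constant Integer :=
        From + (Index - 1) * Small + Integer'Min (Index - 1, Extra);
      Size  : constant Natural :=
        Small + (if Index <= Extra then 1 else 0);
   begin
      return (First => First, Last => First + Size - 1);
   end Bounds;

begin
   --  Divide 1 .. 10 into three chunks and print each chunk's bounds.
   for I in 1 .. 3 loop
      declare
         B : constant Chunk_Bounds := Bounds (1, 10, 3, I);
      begin
         Put_Line
           ("Chunk" & Integer'Image (I) & ":"
            & Integer'Image (B.First) & " .." & Integer'Image (B.Last));
      end;
   end loop;
end Even_Chunks_Demo;
```

With these inputs the three chunks cover 1 .. 4, 5 .. 7, and 8 .. 10. The off-by-one risks in First and Size are the kind of detail that a library routine would spare the programmer.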
!proposal
To provide this capability, a library package is proposed to provide the user
with a mechanism to determine the number of chunks and their iteration
boundaries. This proposal depends on AI12-0??? (Parallel container iterators)
which defines the chunk array that can be returned for a parallel iterator.
!wording
Add a new subclause 5.5.3
5.5.3 User-Defined Parallel Loop Chunking for Discrete Types
The following language-defined generic library package exists:
   with Ada.Iterator_Interfaces;
   with Ada.Containers;
   generic
      type Loop_Index is (<>);
   package Ada.Discrete_Chunking is
      pragma Preelaborate;
      pragma Remote_Types;
      function Has_Element (Position : Loop_Index) return Boolean is (True);
      package Chunk_Array_Iterator_Interfaces is new
        Ada.Iterator_Interfaces (Loop_Index, Has_Element);
      function Split
        (Chunks : Natural    := 0;
         From   : Loop_Index := Loop_Index'First;
         To     : Loop_Index := Loop_Index'Last)
         return Chunk_Array_Iterator_Interfaces.Chunk_Array'Class
         with Pre         => Chunks <= Loop_Index'Pos(To) - Loop_Index'Pos(From) + 1,
              Nonblocking => False;
   private
      ... --
   end Ada.Discrete_Chunking;
   function Split
     (Chunks : Natural    := 0;
      From   : Loop_Index := Loop_Index'First;
      To     : Loop_Index := Loop_Index'Last)
      return Chunk_Array_Iterator_Interfaces.Chunk_Array'Class
      with Pre => Chunks <= Loop_Index'Pos(To) - Loop_Index'Pos(From) + 1;
Split returns a Chunk_Array object (see 5.5.1) for user-defined parallel
iteration over a range of values of a discrete subtype starting with the value
From and ending with the value To. The Chunks value indicates the number of
elements returned in the Chunk_Array result. If the Chunks value equals zero,
then the number of elements of the Chunk_Array result is determined by the
implementation.
Examples
   declare
      package Manual_Chunking is new Ada.Discrete_Chunking (Color); --
      Chunks : constant
        Manual_Chunking.Chunk_Array_Iterator_Interfaces.Chunk_Array'Class :=
          Manual_Chunking.Split (Chunks => 3, From => White, To => Black);
   begin
      parallel
      for Chunk in 1 .. Chunks.Length loop
         declare
            File : Text_IO.File_Type;
         begin
            Text_IO.Create(File => File,
                           Name => "Team" & Natural'Image(Chunk) & ".txt");
            for Hue in Chunks.Start(Chunk) .. Chunks.Finish(Chunk) loop
               Text_IO.Put_Line (File, Color'Image(Hue));
            end loop;
            Text_IO.Close(File);
         end;
      end loop;
   end;
!discussion
The proposal for container iterators (AI12-0266-1) provides a mechanism to apply
chunking to containers. Since this capability exists for the containers, as all
the containers have an Iterate primitive that returns a parallel iterator object
that can be used to obtain a chunk array, it was felt the same capability
should exist for discrete types. This proposal provides a simple mechanism to
extend that same capability to a range of values of a discrete subtype.
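To make the discrete-type case concrete, the following is a rough sketch in Ada 2012 of how a Split-like function could be written for an arbitrary discrete type using 'Pos and 'Val. It does not use the proposed Chunk_Array interface: the package Simple_Chunking, the types Chunk and Chunk_List, and the enumeration literals between White and Black are all assumptions made for illustration. The sketch also assumes that the position numbers of the range fit in Integer and, unlike the proposed Split, it requires an explicit chunk count.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Discrete_Split_Demo is

   --  Hypothetical stand-in for the proposed package; not its actual API.
   generic
      type Loop_Index is (<>);
   package Simple_Chunking is
      type Chunk is record
         Start, Finish : Loop_Index;
      end record;
      type Chunk_List is array (Positive range <>) of Chunk;

      function Split
        (Chunks : Positive;
         From   : Loop_Index := Loop_Index'First;
         To     : Loop_Index := Loop_Index'Last) return Chunk_List
        with Pre => Chunks <= Loop_Index'Pos (To) - Loop_Index'Pos (From) + 1;
   end Simple_Chunking;

   package body Simple_Chunking is
      function Split
        (Chunks : Positive;
         From   : Loop_Index := Loop_Index'First;
         To     : Loop_Index := Loop_Index'Last) return Chunk_List
      is
         --  Work in position numbers so that any discrete type is handled.
         Base   : constant Integer := Loop_Index'Pos (From);
         Count  : constant Integer := Loop_Index'Pos (To) - Base + 1;
         Small  : constant Integer := Count / Chunks;
         Extra  : constant Integer := Count mod Chunks;
         Result : Chunk_List (1 .. Chunks);
         Next   : Integer := Base;  --  position of the next chunk's start
      begin
         for I in Result'Range loop
            declare
               --  The first Extra chunks receive one additional value.
               Size : constant Integer :=
                 Small + (if I <= Extra then 1 else 0);
            begin
               Result (I) :=
                 (Start  => Loop_Index'Val (Next),
                  Finish => Loop_Index'Val (Next + Size - 1));
               Next := Next + Size;
            end;
         end loop;
         return Result;
      end Split;
   end Simple_Chunking;

   --  Assumed enumeration; only White and Black appear in the AI's example.
   type Color is (White, Red, Orange, Yellow, Green, Blue, Black);

   package Color_Chunking is new Simple_Chunking (Color);

   Parts : constant Color_Chunking.Chunk_List :=
     Color_Chunking.Split (Chunks => 3);

begin
   for I in Parts'Range loop
      Put_Line
        ("Chunk" & Integer'Image (I) & ": "
         & Color'Image (Parts (I).Start) & " .. "
         & Color'Image (Parts (I).Finish));
   end loop;
end Discrete_Split_Demo;
```

For the seven-value Color type assumed here, three chunks come out as WHITE .. ORANGE, YELLOW .. GREEN, and BLUE .. BLACK.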
Some loops need initialization or finalization for each executor that would be
too expensive to perform on every iteration but worthwhile if performed once
per chunk of execution. An example might be where
a file needs to be opened for each executor where the loop writes results to
the file, and after the executor completes its chunk, the file needs to be closed.
Other possible uses include memory allocation for temporary data structures.
Such loops would require explicit user-defined chunking, where the user code
explicitly calls the Split function to obtain a Chunk_Array object. This
allows the user to express the parallelism more explicitly by writing an outer
loop that iterates through the number of chunks and an inner loop that iterates
through the elements of each chunk.
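The outer/inner loop structure described above can be sketched in Ada 2012 without any new syntax, although the outer loop then runs sequentially. In this illustration the chunk table is written by hand as a stand-in for the result of Split, and the names Bounds, Chunk_Table, and Partial are hypothetical. Under the parallel loop proposal, the outer loop is the one that would be marked parallel, with each chunk writing only to its own element of Partial.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Per_Chunk_Setup_Demo is

   type Bounds is record
      First, Last : Integer;
   end record;

   --  Hypothetical, hand-written chunk table standing in for the result
   --  of splitting the range 1 .. 10 into three chunks.
   Chunk_Table : constant array (1 .. 3) of Bounds :=
     ((1, 4), (5, 7), (8, 10));

   --  One result slot per chunk, so chunks never share a variable.
   Partial : array (Chunk_Table'Range) of Integer := (others => 0);
   Total   : Integer := 0;

begin
   --  Outer loop: one iteration per chunk.
   for C in Chunk_Table'Range loop
      declare
         --  Per-chunk initialization: done once per chunk rather than
         --  once per element.
         Sum : Integer := 0;
      begin
         --  Inner loop: the elements of this chunk.
         for I in Chunk_Table (C).First .. Chunk_Table (C).Last loop
            Sum := Sum + I * I;
         end loop;
         --  Per-chunk finalization: publish this chunk's result.
         Partial (C) := Sum;
      end;
   end loop;

   --  Combine the per-chunk results after all chunks have completed.
   for C in Partial'Range loop
      Total := Total + Partial (C);
   end loop;

   Put_Line ("Sum of squares 1 .. 10 =" & Integer'Image (Total));
end Per_Chunk_Setup_Demo;
```

The per-chunk partial sums are 30, 110, and 245, giving a total of 385.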
It was considered whether syntax should be used to provide this same capability,
but that seemed like a heavy solution to a problem that can be easily solved
with a simple library.
!ASIS
None should be needed for a library.
!ACATS test
ACATS C-Tests are needed to check that the new library is supported.
Generally, we do not write B-Tests for libraries, as they just would test
basic Ada rules tested by the use of any package.
!appendix
From: Brad Moore
Sent: Wednesday, March 28, 2018 12:19 AM
This is yet another new AI carved off from AI12-0119-1 (Parallel Operations)
[This is version /01 of the AI - ED]
This one is a library only, alternative to AI12-0251-1 (Explicit chunk
definition for parallel loops), which is a syntax based solution.
This solution piggy backs onto the parallel container iterator capability.
That AI defines a parallel iterator interface that returns a chunk array for
a container.
This proposal simply inherits that same interface but applies it to a discrete
range and provides a chunk array that the programmer can use to iterate
through chunks.
****************************************************************
From: Tucker Taft
Sent: Wednesday, March 28, 2018 3:17 PM
This should probably be identified as AI12-0251-2, to make clear this is an
alternative to that proposal. I would doubt that we want to end up with both.
[The rest is about AI12-0251-1, and is filed there. - ED]
****************************************************************
From: Randy Brukardt
Sent: Thursday, March 29, 2018 12:38 AM
Your editor does read and follow the minutes, even if not everyone else
does. ;-)
****************************************************************