Version 1.2 of ai05s/ai05-0174-1.txt

Unformatted version of ai05s/ai05-0174-1.txt version 1.2
Other versions for file ai05s/ai05-0174-1.txt

!standard D.10.1 (00)          09-10-23 AI05-0174-1/01
!class Amendment 09-10-23
!status work item 09-10-23
!status received 09-10-23
!priority High
!difficulty Medium
!subject Implement Task barriers in Ada
!summary
Add a Simple_Barrier type to allow many tasks to be blocked and be released at once.
!problem
As general purpose computing is moving to parallel architectures and eventually to massively parallel machines, there is a need to effeciently schedule many (hundreds or thousands) of tasks using barrier primitives. The POSIX OS and a number of machines provide a barrier where N tasks wait on a barrier and are released simultaneously when all N are ready to execute. Functionality like this should be added to Ada.
!proposal
There are many situations where the release of N tasks is required to execute an algorithm in parallel. Often the calculation is relatively small for each task on each iteration, but the number of tasks is high and the cost of linearly scheduling them and releasing them would remove almost any gains made through parallelizing them in the first place.
The simple barrier release paradigm is that N-1 tasks are blocked on a common place waiting for release. One task is executing the sequential code prior to the parallel release. When that task suspends, the release condition is satisfied and all callers can execute. One of the callers is designated as the special task, which can perform sequential actions as the other tasks queue again for their next activity.
We implement this functionality as an Ada package, a child of Ada.Synchronous_Task_Control:
package Ada.Synchronous_Task_Control.Barriers is
type Simple_Barrier(Number_Waiting : Positive) is limited private;
procedure Wait_For_Release (The_Barrier : in out Simple_Barrier; Last_Released : out Boolean);
end Ada.Synchronous_Task_Control.Barriers;
When a variable of type Simple_Barrier is created with Number_Waiting = N, there are no waiting tasks and the barrier is set to block tasks. When the count reaches N, all tasks are simultaneously released and the "Last_Released" out parameter is set in [an arbitrary] one of the callers.
When the scope closing the barrier is closed, the Simple_Barrier finalization is completed after all tasks waiting on the barrier have been removed from the barrier [via abort, termination, asynchronous transfer of control]
Since it is expected that this functionality will often be mapped to hardware with a maximum number of release gates in a barrier construct, the maximum number of tasks that can be released using simple_barriers is implementation dependent. A system parameter, Maximum_Parallel_Release, provides this limit.
!discussion
This paradigm was discussed at IRTAW 14, Portovenere, Italy, 7-9 October 2009. We examined alternative syntax and semantics, such as - default setting of the barrier to True or to False, - explicit releasing of the tasks via Set_True as is done in suspension objects.
We agreed that alternative functionality is needed to permit more complex interactions, but these require sharing some state and this is best done in the context of a highly optimized protected object. The extremely simple syntax and semantics provided for Simple_Barriers exactly matches the most common paradigm of N threads executing a single loop iteration and being released in a "for" loop from the loop evaluation release point. One of the threads will execute sequential parts of the program after the parallel portion is complete, but will execute its parallel portion prior to the sequential execution. The task that executes this part is determined by the return of a Boolean TRUE at the release of the barrier. It is implementation dependant whether or not the same task executes the sequential portion on different executions of a release from a single barrier.
For the case of the protected object, we agreed on the following:
New syntax is required. Protected object semantics must be preserved and be identical for the cases of completely parallel readers and sequential readers (i.e. on a single CPU). Blocking tasks must be able to be released once the count of blocked tasks on a single entry has been reached. Blocking tasks must be able to be released based on any arbitrary set of conditions. Where there is a single "final" task in the set of released tasks, it must block at a blockage point (such as a "then" in a "select ... then ... end select" until the reader count becomes 0 (or 1 if we started at N), then that final task executes the portion that writes to the protected state, including setting the lock for the entry subprogram barrier condition. In a sequential version of the program, the final task performs this action and does not need to wait.
!example
** TBD **
--!corrigendum D.10.1(0)
!ACATS test
Add an ACATS C-Test to test the new package.
!appendix

****************************************************************


Questions? Ask the ACAA Technical Agent