!standard D.10.1 (00) 10-02-24 AI05-0174-1/02 !standard 13.7(10) !standard 13.7(30) !class Amendment 09-10-23 !status work item 09-10-23 !status received 09-10-23 !priority High !difficulty Medium !subject Implement Task barriers in Ada !summary Add a Synchronous_Barrier type to allow many tasks to be blocked and be released at once. !problem As general purpose computing is moving to parallel architectures and eventually to massively parallel machines, there is a need to efficiently schedule many (hundreds or thousands) of tasks using barrier primitives. The POSIX OS interface provides a barrier primitive where N tasks wait on a barrier and are released simultaneously when all N are ready to execute. Functionality like this should be added to Ada. !proposal There are many situations where the release of N tasks is required to execute an algorithm in parallel. Often the calculation is relatively small for each task on each iteration, but the number of tasks is high and the cost of linearly scheduling them and releasing them would remove almost any gains made through parallelizing them in the first place. The synchronous barrier release paradigm is that N-1 tasks are blocked on a common place waiting for release. One task is executing the sequential code prior to the parallel release. When that task suspends, the release condition is satisfied and all callers can execute. One of the callers is designated as the special task, which can perform sequential actions as the other tasks queue again for their next activity. We implement this functionality as an Ada package, a child of Ada: package Ada.Synchronous_Barriers is pragma Pure; subtype Barrier_Limit is Positive range 1 .. System.Max_Parallel_Release; type Synchronous_Barrier (Number_Waiting : Barrier_Limit) is limited private; procedure Wait_For_Release (The_Barrier : in out Synchronous_Barrier; Released_Last : out Boolean); end Ada.Synchronous_Barriers; When a variable of type Synchronous_Barrier is created with Number_Waiting = N, there are no waiting tasks and the barrier is set to block tasks. When the count reaches N, all tasks are simultaneously released and the "Released_Last" out parameter is set to True in an arbitrary one of the callers. All other callers result in "Released_Last" being set to False upon returning from the call. Since it is expected that this functionality will often be mapped to hardware with a maximum number of release gates in a barrier construct, the maximum number of tasks that can be released using Synchronous_Barriers is implementation-defined. A system parameter, Max_Parallel_Release, provides this limit. !wording Add after 13.7(10) (in package System): Max_Parallel_Release : constant := implementation-defined; Add after 13.7(30): Max_Parallel_Release The maximum number of tasks that can be released using an Ada.Synchronous_Barriers.Synchronous_Barrier object. D.10.1 Barriers This clause introduces a language-defined package to synchronously release a group of tasks after the number of blocked tasks reaches a specified count value. Static Semantics The following language-defined library package exists: with System; package Ada.Synchronous_Barriers is pragma Pure; subtype Barrier_Limit is Positive range 1 .. System.Max_Parallel_Release; type Synchronous_Barrier (Number_Waiting : Barrier_Limit) is limited private; procedure Wait_For_Release (The_Barrier : in out Synchronous_Barrier; Released_Last : out Boolean); end Ada.Synchronous_Barriers; Dynamic Semantics Each call to Wait_For_Release blocks the calling task until the number of blocked tasks associated with the Synchronous_Barrier object is equal to Number_Waiting, at which time all blocked tasks are released. Released_Last is set to True for one of the released tasks, and set to False for all other released tasks. The selection for determining which task sets Released_Last to True is implementation defined. Once all tasks have been released, a Synchronous_Barrier object may be reused to block another Number_Waiting number of tasks. As the first step of the finalization of a Synchronous_Barrier, each blocked task is unblocked and Program_Error is raised at the place of the call to Wait_For_Release. Bounded (Run-Time) Errors It is a bounded error to call Wait_For_Release on a Synchronous_Barrier object after that object is finalized. If the error is detected, Program_Error is raised. Otherwise, the call proceeds normally, which may leave a task blocked forever. !discussion This paradigm was discussed at IRTAW 14, Portovenere, Italy, 7-9 October 2009. We examined alternative syntax and semantics, such as - default setting of the barrier to True or to False, - explicit releasing of the tasks via Set_True as is done in suspension objects. We agreed that alternative functionality is needed to permit more complex interactions, but these require sharing some state and this is best done in the context of a highly optimized protected object. The extremely simple syntax and semantics provided for Synchronous_Barriers exactly matches the most common paradigm of N threads executing a single loop iteration and being released in a "for" loop from the loop evaluation release point. One of the threads will execute sequential parts of the program after the parallel portion is complete, but will execute its parallel portion prior to the sequential execution. The task that executes this part is determined by the return of a Boolean TRUE at the release of the barrier. It is implementation defined whether or not the same task executes the sequential portion on different executions of a release from a single barrier. For the case of the protected object, we agreed on the following: New syntax is required. Protected object semantics must be preserved and be identical for the cases of completely parallel readers and sequential readers (i.e. on a single CPU). Blocking tasks must be able to be released once the count of blocked tasks on a single entry has been reached. Blocking tasks must be able to be released based on any arbitrary set of conditions. Where there is a single "final" task in the set of released tasks, it must block at a blockage point (such as a "then" in a "select ... then ... end select" until the reader count becomes 0 (or 1 if we started at N), then that final task executes the portion that writes to the protected state, including setting the lock for the entry subprogram barrier condition. In a sequential version of the program, the final task performs this action and does not need to wait. Another AI is needed to address this need. This AI is only focussed on the Synchronous_Barriers package. !example with Ada.Synchronous_Barriers; use Ada.Synchronous_Task_Control; with Ada.Text_IO; use Ada.Text_IO; with Ada.Task_Identification; use Ada; procedure Test_Barriers is Number_Of_Tasks : constant := 1_000; Barrier : Barriers.Synchronous_Barrier (Number_Waiting => Number_Of_Tasks); task type Worker is end Worker; task body Worker is Released_Last : Boolean := False; begin Barriers.Wait_For_Release (Barrier, Released_Last); if Released_Last then Put_Line ("Released Last!" & Task_Identification.Image (T => Task_Identification.Current_Task)); end if; end Worker; Worker_Array : array (1 .. Number_Of_Tasks - 1) of Worker; Released_Last : Boolean := False; begin delay 5.0; Barriers.Wait_For_Release (Barrier, Released_Last); if Released_Last then Put_Line ("Released Last!" & Task_Identification.Image (T => Task_Identification.Current_Task)); end if; end Test_Barriers; --!corrigendum D.10.1(0) !ACATS test Add an ACATS C-Test to test the new package. !appendix From: Brad Moore Sent: Wednesday, February 24, 2010 12:33 AM Here is the other part of my homework. I have collaborated with Stephen Michell and Luke Wong to update AI-174 having to do with providing a facility for synchronous barriers. Please see the attached. **************************************************************** From: Tucker Taft Sent: Wednesday, February 24, 2010 12:56 AM The term "implementation dependent" is not used withing the Ada standard. You probably should use "implementation defined" instead. **************************************************************** From: Randy Brukardt Sent: Wednesday, February 24, 2010 7:30 PM Note that it is never used in normative wording (it's in the proposal and discussion). But I think it would be better to use "implementation defined"; I made that change. I also fixed the type of the parameter to Wait_for_Release (it wasn't changed from "Simple_Barrier") - in both packages. Also simple_barriers was still used in the last paragraph of the Proposal, I fixed that, too. **************************************************************** From: Bob Duff Sent: Wednesday, February 24, 2010 8:00 PM Doesn't "implementation defined" imply some sort of documentation requirement? If we don't want to require that, we're supposed to say "not defined by This International Standard" or some such obnoxious verbiage. But in non-normative places, "implementation dependent" is just fine. **************************************************************** From: Randy Brukardt Sent: Wednesday, February 24, 2010 8:56 PM Yes, but in this case the term is used in relation to the constant System.Max_Parallel_Release. I think it would be good (and easy) for an implementation to document that constant. And in non-normative wording, it doesn't really matter anyway. ****************************************************************