!standard D.10.1 (00) 10-08-12 AI05-0174-1/04 !class Amendment 09-10-23 !status Amendment 2012 10-08-12 !status ARG Approved 10-0-0 10-06-19 !status work item 10-03-13 !status ARG Approved 9-0-2 10-02-26 !status work item 09-10-23 !status received 09-10-23 !priority High !difficulty Medium !subject Implement Task barriers in Ada !summary Add a Synchronous_Barrier type to allow many tasks to be blocked and be released at once. !problem As general purpose computing is moving to parallel architectures and eventually to massively parallel machines, there is a need to efficiently schedule many (hundreds or thousands) of tasks using barrier primitives. The POSIX OS interface provides a barrier primitive where N tasks wait on a barrier and are released simultaneously when all N are ready to execute. Functionality like this should be added to Ada. !proposal There are many situations where the release of N tasks is required to execute an algorithm in parallel. Often the calculation is relatively small for each task on each iteration, but the number of tasks is high and the cost of linearly scheduling them and releasing them would remove almost any gains made through parallelizing them in the first place. The synchronous barrier release paradigm is that N-1 tasks are blocked on a common place waiting for release. One task is executing the sequential code prior to the parallel release. When that task suspends, the release condition is satisfied and all callers can execute. One of the callers is designated as the special task, which can perform sequential actions as the other tasks queue again for their next activity. We implement this functionality as an Ada package, a child of Ada: package Ada.Synchronous_Barriers is pragma Pure(Synchronous_Barriers); subtype Barrier_Limit is Positive range 1 .. ; type Synchronous_Barrier (Number_Waiting : Barrier_Limit) is limited private; procedure Wait_For_Release (The_Barrier : in out Synchronous_Barrier; Notified : out Boolean); end Ada.Synchronous_Barriers; When a variable of type Synchronous_Barrier is created with Number_Waiting = N, there are no waiting tasks and the barrier is set to block tasks. When the count reaches N, all tasks are simultaneously released and the "Released_Last" out parameter is set to True in an arbitrary one of the callers. All other callers result in "Released_Last" being set to False upon returning from the call. Since it is expected that this functionality will often be mapped to hardware with a maximum number of release gates in a barrier construct, the maximum number of tasks that can be released using Synchronous_Barriers is implementation-defined. A system parameter, Max_Parallel_Release, provides this limit. !wording D.10.1 Synchronous Barriers This clause introduces a language-defined package to synchronously release a group of tasks after the number of blocked tasks reaches a specified count value. Static Semantics The following language-defined library package exists: package Ada.Synchronous_Barriers is pragma Preelaborate(Synchronous_Barriers); subtype Barrier_Limit is Positive range 1 .. ; type Synchronous_Barrier (Number_Waiting : Barrier_Limit) is limited private; procedure Wait_For_Release (The_Barrier : in out Synchronous_Barrier; Notified : out Boolean); private -- not specified by the language end Ada.Synchronous_Barriers; Type Synchronous_Barrier needs finalization (see 7.6). Dynamic Semantics Each call to Wait_For_Release blocks the calling task until the number of blocked tasks associated with the Synchronous_Barrier object is equal to Number_Waiting, at which time all blocked tasks are released. Notified is set to True for one of the released tasks, and set to False for all other released tasks. The mechanism for determining which task sets Notified to True is implementation defined. Once all tasks have been released, a Synchronous_Barrier object may be reused to block another Number_Waiting number of tasks. As the first step of the finalization of a Synchronous_Barrier, each blocked task is unblocked and Program_Error is raised at the place of the call to Wait_For_Release. It is implementation-defined whether an abnormal task which is waiting on a Synchronous_Barrier object is aborted immediately or aborted when the tasks waiting on the object are released. Wait_For_Release is a potentially blocking operation (see 9.5.1). Bounded (Run-Time) Errors It is a bounded error to call Wait_For_Release on a Synchronous_Barrier object after that object is finalized. If the error is detected, Program_Error is raised. Otherwise, the call proceeds normally, which may leave a task blocked forever. !discussion This paradigm was discussed at IRTAW 14, Portovenere, Italy, 7-9 October 2009. We examined alternative syntax and semantics, such as - default setting of the barrier to True or to False, - explicit releasing of the tasks via Set_True as is done in suspension objects. We agreed that alternative functionality is needed to permit more complex interactions, but these require sharing some state and this is best done in the context of a highly optimized protected object. The extremely simple syntax and semantics provided for Synchronous_Barriers exactly matches the most common paradigm of N threads executing a single loop iteration and being released in a "for" loop from the loop evaluation release point. One of the threads will execute sequential parts of the program after the parallel portion is complete, but will execute its parallel portion prior to the sequential execution. The task that executes this part is determined by the return of a Boolean TRUE at the release of the barrier. It is implementation defined whether or not the same task executes the sequential portion on different executions of a release from a single barrier. We do not define when abort of a task which is waiting on a synchronous barrier takes effect, so long as it is no later than the release of the tasks on that barrier. We want this feature to be efficiently implementable using features of the target system (if it has appropriate operations). Since synchronous barriers cannot be safely used in the presence of abort (if too few tasks show up at a barrier, they will all be blocked forever), what happens is not important. For the case of the protected object, we agreed on the following: New syntax is required. Protected object semantics must be preserved and be identical for the cases of completely parallel readers and sequential readers (i.e. on a single CPU). Blocking tasks must be able to be released once the count of blocked tasks on a single entry has been reached. Blocking tasks must be able to be released based on any arbitrary set of conditions. Where there is a single "final" task in the set of released tasks, it must block at a blockage point (such as a "then" in a "select ... then ... end select" until the reader count becomes 0 (or 1 if we started at N), then that final task executes the portion that writes to the protected state, including setting the lock for the entry subprogram barrier condition. In a sequential version of the program, the final task performs this action and does not need to wait. Another AI is needed to address this need. This AI is only focused on the Synchronous_Barriers package. !example with Ada.Synchronous_Barriers; use Ada.Synchronous_Task_Control; with Ada.Text_IO; use Ada.Text_IO; with Ada.Task_Identification; use Ada; procedure Test_Barriers is Number_Of_Tasks : constant := 1_000; Barrier : Barriers.Synchronous_Barrier (Number_Waiting => Number_Of_Tasks); task type Worker is end Worker; task body Worker is Released_Last : Boolean := False; begin Barriers.Wait_For_Release (Barrier, Released_Last); if Released_Last then Put_Line ("Released Last!" & Task_Identification.Image (T => Task_Identification.Current_Task)); end if; end Worker; Worker_Array : array (1 .. Number_Of_Tasks - 1) of Worker; Released_Last : Boolean := False; begin delay 5.0; Barriers.Wait_For_Release (Barrier, Released_Last); if Released_Last then Put_Line ("Released Last!" & Task_Identification.Image (T => Task_Identification.Current_Task)); end if; end Test_Barriers; !corrigendum D.10.1(0) @dinsc This clause introduces a language-defined package to synchronously release a group of tasks after the number of blocked tasks reaches a specified count value. @s8<@i> The following language-defined library package exists: @xcode<@b Ada.Synchronous_Barriers @b @b Preelaborate(Synchronous_Barriers); @b Barrier_Limit @b Positive @b 1 .. @ft<@i>; @b Synchronous_Barrier (Number_Waiting : Barrier_Limit) @b; @b Wait_For_Release (The_Barrier : @b Synchronous_Barrier; Notified : @b Boolean); @b -- @ft<@i> @b Ada.Synchronous_Barriers;> Type Synchronous_Barrier needs finalization (see 7.6). @s8<@i> Each call to Wait_For_Release blocks the calling task until the number of blocked tasks associated with the Synchronous_Barrier object is equal to Number_Waiting, at which time all blocked tasks are released. Notified is set to True for one of the released tasks, and set to False for all other released tasks. The mechanism for determining which task sets Notified to True is implementation defined. Once all tasks have been released, a Synchronous_Barrier object may be reused to block another Number_Waiting number of tasks. As the first step of the finalization of a Synchronous_Barrier, each blocked task is unblocked and Program_Error is raised at the place of the call to Wait_For_Release. It is implementation-defined whether an abnormal task which is waiting on a Synchronous_Barrier object is aborted immediately or aborted when the tasks waiting on the object are released. Wait_For_Release is a potentially blocking operation (see 9.5.1). @s8<@i> It is a bounded error to call Wait_For_Release on a Synchronous_Barrier object after that object is finalized. If the error is detected, Program_Error is raised. Otherwise, the call proceeds normally, which may leave a task blocked forever. !ACATS test Add an ACATS C-Test to test the new package. !appendix From: Brad Moore Sent: Wednesday, February 24, 2010 12:33 AM Here is the other part of my homework. I have collaborated with Stephen Michell and Luke Wong to update AI-174 having to do with providing a facility for synchronous barriers. Please see the attached. **************************************************************** From: Tucker Taft Sent: Wednesday, February 24, 2010 12:56 AM The term "implementation dependent" is not used withing the Ada standard. You probably should use "implementation defined" instead. **************************************************************** From: Randy Brukardt Sent: Wednesday, February 24, 2010 7:30 PM Note that it is never used in normative wording (it's in the proposal and discussion). But I think it would be better to use "implementation defined"; I made that change. I also fixed the type of the parameter to Wait_for_Release (it wasn't changed from "Simple_Barrier") - in both packages. Also simple_barriers was still used in the last paragraph of the Proposal, I fixed that, too. **************************************************************** From: Bob Duff Sent: Wednesday, February 24, 2010 8:00 PM Doesn't "implementation defined" imply some sort of documentation requirement? If we don't want to require that, we're supposed to say "not defined by This International Standard" or some such obnoxious verbiage. But in non-normative places, "implementation dependent" is just fine. **************************************************************** From: Randy Brukardt Sent: Wednesday, February 24, 2010 8:56 PM Yes, but in this case the term is used in relation to the constant System.Max_Parallel_Release. I think it would be good (and easy) for an implementation to document that constant. And in non-normative wording, it doesn't really matter anyway. **************************************************************** From: Brad Moore Sent: Friday, March 12, 2010 4:47 PM A couple of further notes and questions regarding the Synchronous Barrier package that was discussed in Boston. I have come up with a couple of reference implementations for the package. (See the attached) The first implementation uses the POSIX calls for barriers (on a Linux desktop). The second implementation uses an Ada protected type to implement the package. The first thing I noted was that the package could not be Pure for the POSIX implementation, because the barrier type needs finalization. (So I changed pragma Pure to pragma Preelaborate) The Ada protected type implementation could be implemented with pragma Pure, but since the intent of this AI is to support the POSIX calls, we need to go to the lowest common denominator, pragma Preelaborate. Then it occurred to me that it could be quite useful to be able to use distributed barriers, as a RACW type. The barrier could be declared in a partition, and other partitions could call Wait_For_Release remotely. To support that capability, a pragma Remote_Types is needed in the package. Also, the Synchronous_Barrier type would need to be a tagged type, since a RACW type needs to designate a class wide type. It might be nice to declare; type Distributed_Barrier is access all Synchronous_Barrier'Class; in the Synchronous_Barriers package as a convenience for those wishing to use barriers in a distributed manner. This also helps point out the possibility of distributed use for the package. Interestingly, this however rules out my implementation as a regular Ada protected type, because a tagged type cannot be completed by a protected type. I believe it should be possible to wrap the protected object as a component in a tagged type however. To summarize, my questions are; 1) Should this package be a Remote_Types package? 2) Should we declare Distributed_Barrier as a RACW type in the package? 3) (Minor question) The parameter for Wait_For_Release is "The_Barrier". It strikes me as inconsistent to have "The_" as a preface to the parameter name. Do we do that anywhere else? I tried searching the RM for "The_", but it returns every occurrence of the word "the" in the RM. I am not sure I have the patience to check out all these occurrences to see if any are used as a preface on a parameter name. Should the parameter just be "Barrier" instead of "The_Barrier"? Finally, I did some timing tests to see the difference in execution times using the POSIX calls vs an Ada protected type. My test program calls a procedure using the barrier package a 100000 times. The time for the POSIX implementation came fairly consistently around 4.8 seconds to execute, while the Ada implementation took typically around 5.3 seconds to implement on my Linux desktop. This is a crude measurement, since it includes times needed for procedure calls, task initialization, etc, however it does show that the POSIX version is slightly faster. The tasks were essentially executing NOP's. I think that if each task had significant workloads to perform, then the time spent executing the barrier calls would fall into the noise level for either implementation. Regarding the reference implementations. Note that the POSIX implementation is a tagged type, while the Ada implementation is not tagged. --- -- POSIX Barrier implementation private with Ada.Finalization; package Ada.Synchronous_Barriers is pragma Preelaborate; pragma Remote_Types; subtype Barrier_Limit is Positive range 1 .. System.Max_Parallel_Release; type Synchronous_Barrier (Number_Waiting : Barrier_Limit) is tagged limited private; procedure Wait_For_Release (The_Barrier : in out Synchronous_Barrier; Released_Last : out Boolean); type Distributed_Barrier is access all Synchronous_Barrier'Class; private use Interfaces; use Ada.Finalization; SIZEOF_PTHREAD_BARRIER_T : constant := 20; type pthread_barrier_t_view is (size_based, align_based); -- POSIX barriers data type. type pthread_barrier_t (Kind : pthread_barrier_t_view := size_based) is record case Kind is when size_based => size : C.char_array (1 .. SIZEOF_PTHREAD_BARRIER_T); when align_based => align : C.long; end case; end record; pragma Unchecked_Union (pthread_barrier_t); type Synchronous_Barrier (Number_Waiting : Barrier_Limit) is new Limited_Controlled with record POSIX_Barrier : aliased pthread_barrier_t; end record; overriding procedure Initialize (Barrier : in out Synchronous_Barrier); overriding procedure Finalize (Barrier : in out Synchronous_Barrier); end Ada.Synchronous_Barriers; -- POSIX Implementation Body with System; package body Ada.Synchronous_Barriers is EAGAIN : constant C.int := 11; -- Try again EBUSY : constant C.int := 16; -- Device or resource Busy EFAULT : constant C.int := 14; -- Bad address EINVAL : constant C.int := 22; -- Invalid Argument EOK : constant C.int := 0; -- Success use type C.int; PTHREAD_BARRIER_SERIAL_THREAD : constant C.int := -1; function pthread_barrier_destroy (barrier : not null access pthread_barrier_t) return C.int; pragma Import (Convention => C, Entity => pthread_barrier_destroy, External_Name => "pthread_barrier_destroy"); function pthread_barrier_init (barrier : not null access pthread_barrier_t; attr : System.Address := System.Null_Address; count : C.unsigned) return C.int; pragma Import (Convention => C, Entity => pthread_barrier_init, External_Name => "pthread_barrier_init"); function pthread_barrier_wait (barrier : not null access pthread_barrier_t) return C.int; pragma Import (Convention => C, Entity => pthread_barrier_wait, External_Name => "pthread_barrier_wait"); overriding procedure Finalize (Barrier : in out Synchronous_Barrier) is begin case pthread_barrier_destroy (barrier => Barrier.POSIX_Barrier'Access) is when EOK => -- Success. null; when EBUSY => -- The barrier is in use. raise Program_Error with "Barrier in use"; when EINVAL => -- Invalid Barrier raise Program_Error with "Invalid Barrier"; when others => raise Program_Error with "Unexpected return code"; end case; end Finalize; overriding procedure Initialize (Barrier : in out Synchronous_Barrier) is begin case pthread_barrier_init (barrier => Barrier.POSIX_Barrier'Access, attr => System.Null_Address, count => C.unsigned (Barrier.Number_Waiting)) is when EOK => -- Success. null; when EAGAIN => -- The system lacks the necessary resources to -- initialize another barrier. raise Storage_Error; when EBUSY => -- Attempt to reinitialize a barrier while it's in use. raise Program_Error with "Barrier in Use"; when EFAULT => -- A fault occurred when the kernel tried to -- access barrier or attr. raise Program_Error with "Kernel Fault"; when EINVAL => -- Invalid value specified by attr. raise Program_Error with "Invalid attr Argument"; when others => raise Program_Error with "Unexpected return code"; end case; end Initialize; ---------------------- -- Wait_For_Release -- ---------------------- procedure Wait_For_Release (The_Barrier : in out Synchronous_Barrier; Released_Last : out Boolean) is Result : C.int; begin Result := pthread_barrier_wait (barrier => The_Barrier.POSIX_Barrier'Access); case Result is when EINVAL => -- Invalid value specified by attr. raise Program_Error with "Barrier isn't initialized"; when others => Released_Last := (Result = PTHREAD_BARRIER_SERIAL_THREAD); end case; end Wait_For_Release; end Ada.Synchronous_Barriers; ------------------------------------------------------------------------------- -- Ada Protected Type implementation with System; package Ada.Synchronous_Barriers is pragma Preelaborate; pragma Remote_Types; package System renames brSystem; subtype Barrier_Limit is Positive range 1 .. System.Max_Parallel_Release; type Synchronous_Barrier (Number_Waiting : Barrier_Limit) is limited private; procedure Wait_For_Release (The_Barrier : in out Synchronous_Barrier; Released_Last : out Boolean); type Distributed_Barrier is access all Synchronous_Barrier'Class; private protected type Synchronous_Barrier (Number_Waiting : Barrier_Limit) is entry Wait_For_All_Tasks_To_Block (Released_Last : out Boolean); private Done : Boolean := False; end Synchronous_Barrier; end Ada.Synchronous_Barriers; package body Ada.Synchronous_Barriers is ---------------------- -- Wait_For_Release -- ---------------------- procedure Wait_For_Release (The_Barrier : in out Synchronous_Barrier; Released_Last : out Boolean) is begin The_Barrier.Wait_For_All_Tasks_To_Block (Released_Last); end Wait_For_Release; protected body Synchronous_Barrier is --------------------------------- -- Wait_For_All_Tasks_To_Block -- --------------------------------- entry Wait_For_All_Tasks_To_Block (Released_Last : out Boolean) when Wait_For_All_Tasks_To_Block'Count = Number_Waiting or Done is begin Done := True; if Wait_For_All_Tasks_To_Block'Count = 0 then Released_Last := True; Done := False; else Released_Last := False; end if; end Wait_For_All_Tasks_To_Block; end Synchronous_Barrier; end Ada.Synchronous_Barriers; **************************************************************** From: Randy Brukardt Sent: Saturday, March 13, 2010 12:05 AM > A couple of further notes and questions regarding the Synchronous > Barrier package that was discussed in Boston. We already approved this AI at the last meeting. Are you asking to reopen it, create a second AI, or something else?? **************************************************************** From: Randy Brukardt Sent: Saturday, March 13, 2010 12:25 AM Now for the technical response: > It might be nice to declare; > > type Distributed_Barrier is access all Synchronous_Barrier'Class; > > in the Synchronous_Barriers package as a convenience for those wishing > to use barriers in a distributed manner. This also helps point out the > possibility of distributed use for the package. I *detest* sticking random types and objects into packages because "they might be convenient for someone". Especially when said types have nothing to do with the operations in the package. These things turn into maintenance hazards down the road. After all, the reason that we can't make Unbounded_Strings Remote_Types is that someone thought that sticking a random access-to-string type into the package would be "convenient". That's despite the fact that the about the only time you could be sure that you wouldn't want to use such a type is when you are using real Unbounded_Strings! And if you *aren't* using Unbounded_Strings, why would you want to drag a lot of finalization stuff into your program?? Just friggin' stupid. Grumble. Anyway, let's not repeat that silliness. If someone needs an RACW type, let them declare it. > To summarize, my questions are; > > 1) Should this package be a Remote_Types package? This I can't answer. The original version is Pure, so it would work that way. If we change it to Preelaborate, Remote_Types tends to go along as much as possible. But I can't say if there is an implementation burden (say on a 'real' multiprocessor with separate CPUs). > 2) Should we declare Distributed_Barrier as a RACW type in the > package? No. See above. > 3) (Minor question) The parameter for Wait_For_Release is "The_Barrier". > It strikes me as inconsistent to have "The_" as a preface to the > parameter name. Do we do that anywhere else? I tried searching the RM > for "The_", but it returns every occurrence of the word "the" in the > RM. I solved this problem by searching the Bob Duff special text-only version of the RM (an ancient one, I haven't regenerated it in forever) with a most complex tool, the command-line Find utility that come built-in to Windows (an MS-DOS holdover). ;-) There appears to be exactly one such parameter, The_Tag found in Ada.Tags.Generic_Dispatching_Constructor (3.9(18.2/2)). So it surely isn't common, but there is precedent (and even in something I wrote, although I would have said I never use parameters named like that -- so go figure). Whether that is enough to leave or change the naming I can't say. **************************************************************** From: Robert Dewar Sent: Saturday, March 13, 2010 8:22 AM ... > After all, the reason that we can't make Unbounded_Strings > Remote_Types is that someone thought that sticking a random > access-to-string type into the package would be "convenient". That's > despite the fact that the about the only time you could be sure that > you wouldn't want to use such a type is when you are using real > Unbounded_Strings! And if you *aren't* using Unbounded_Strings, why > would you want to drag a lot of finalization stuff into your program?? > Just friggin' stupid. Grumble. I agree that putting Accrss_String was a bad idea for Unbounded_Strings, but NOT having a standard Access_String type *somewhere* is an error that is even worse! The result is a proliferation of incompatible string access types around a program or system > Anyway, let's not repeat that silliness. If someone needs an RACW > type, let them declare it. Not clear to me that the analogy is right here, that type looks like an advantageous addition to me, I don't agree with dismissing it as "silliness". >> To summarize, my questions are; >> >> 1) Should this package be a Remote_Types package? > > This I can't answer. The original version is Pure, so it would work > that way. If we change it to Preelaborate, Remote_Types tends to go > along as much as possible. But I can't say if there is an > implementation burden (say on a 'real' multiprocessor with separate CPUs). How can there be an implementation burden, no one says you have to implement this in pure Ada (surely you don't think the finalization package is implemented in pure Ada :-)) **************************************************************** From: Tucker Taft Sent: Saturday, March 13, 2010 4:00 PM Thanks for doing some trial implementations. I agree that we don't normally start parameter names with "The_". The most interesting thing for me was that the "pure" Ada implementation was very nearly as fast as the Posix implementation. That means to me that protected objects are pretty flexible, if they can nearly match the performance of a special purpose O/S feature. **************************************************************** From: Bob Duff Sent: Saturday, March 13, 2010 4:53 PM > The time for the POSIX implementation came fairly consistently around > 4.8 seconds to execute, while the Ada implementation took typically > around 5.3 seconds to implement on my Linux desktop. ^^^^^^^^^ Wow, that's impressive! How long did it take to execute? ;-) ;-) **************************************************************** From: Brad Moore Sent: Sunday, March 14, 2010 1:14 AM I don't usually like to mince words, however I am happy to put that particular instance of that word in the shredder. It passed my spell-checker, and my grammar checker. If only I had a semantics-checker. :) **************************************************************** From: Brad Moore Sent: Saturday, March 13, 2010 9:11 PM > We already approved this AI at the last meeting. Are you asking to > reopen it, create a second AI, or something else?? Whichever makes the most sense. Though to me, the most sense would be to reopen the AI. The pragma Pure is a problem with the current version of the AI, since it seems to rule out the Ada implementation that utilizes the POSIX barrier calls. That suggests reopening the existing AI. Perhaps the package could be a non-Ada implementation, but it seems a shame to not be able to implement the package in Ada. On the other hand, I could see the distributed barrier feature as a new AI, since it involves a new capability. But since that would require making the Synchronous_Barrier type a tagged type, I wouldn't want to see AI 174 go through, and then try to address the second AI as a post amendment AI, because that would involve a backwards incompatibility. If we think distributed barriers could be useful, we should try to fix it now while we have the chance to make the change without introducing any incompatibilities. Because a new AI would be so tightly coupled with AI 174 anyway, we might as well address all this under the existing AI, IMHO. It's unfortunate that the pragma Pure issue wasn't caught before Boston, though I do recall a question was raised on whether the package could be pure or not. At the time, nobody came up with a reason why it couldn't be Pure. It's one of those things that one might not detect until one actually tries to implement the package. **************************************************************** From: Robert Dewar Sent: Sunday, March 14, 2010 5:08 AM > The pragma Pure is a problem with the current version of the AI, since > it seems to rule out the Ada implementation that utilizes the POSIX > barrier calls. That suggests reopening the existing AI. > Perhaps the package could be a non-Ada implementation, but it seems a > shame to not be able to implement the package in Ada. with GNAT, we disable the categorization checks throughout the run-time, (more accurately we make them warnings, and then supress the warning where needed) and this suppression is required in several situations, so we would not find it a problem at all. **************************************************************** From: Brad Moore Sent: Sunday, March 14, 2010 1:02 PM Hmmm, maybe the package should stay Pure then, since it can be implemented in Ada with a Protected Type (using pragma Pure), and the protected type version performs comparably to the POSIX barrier implementation. If a compiler vendor wishes to use the POSIX calls, they can still do so. Regarding the changed to a tagged type. I think this is still worth considering, since it allows for distributed usage. However, if we declare the Distributed_Barrier access type in a pure package, then the Storage_Size for the type needs to be specified as 0. That seems to not be a problem though, since an instance of a RACW type is never allocated. (I think). There is one other consideration though. The POSIX call to initialize the barrier has a parameter to specify attributes of the barrier. function pthread_barrier_init (barrier : not null access pthread_barrier_t; attr : System.Address := System.Null_Address; count : C.unsigned) return C.int; If attr is NULL, the default barrier attributes are used; the effect is the same as passing the address of a default barrier attributes object. In my reference implementations, I use these default barrier attributes. Currently, only the process-shared attribute is provided in POSIX, and the pthread_barrierattr_getpshared() and pthread_barrierattr_setpshared() C functions are used to get and set the attribute. The process-shared attribute can have the following values: PTHREAD_PROCESS_PRIVATE The barrier can only be operated upon by threads created within the same process as the thread that initialized the barrier. This is the default value of the process-shared attribute. PTHREAD_PROCESS_SHARED The barrier can be operated upon by any thread that has access to the memory where the barrier is allocated. The Ada package interface as currently defined does not provide a way to control this attribute. By having the Distributed_Barrier type though, we effectively get this capability and more since the barrier is not only available to other partitions on the same computer, they can be accessed by remote computers, if the compiler implementation supports the distributed annex. I wonder though whether we should be providing a way to specify barrier attributes in the synchronous barrier package as some sort of system defined set of attributes. An implementation that uses a protected type instead of the POSIX calls might only provide the default attribute (equivalent to PTHREAD_PROCESS_PRIVATE). On another note, I have attached a third implementation of the Synchronous_Barriers package that implements the barrier as a tagged type that contains a protected object component. I ran the same timing tests, and found it interesting to note that this Ada implementation shaves a tenth of a second off the original Ada implementation. I cannot offer an explanation for why this version is faster over the first one. I would have expected it to be slightly slower if anything, since the protected type is wrapped inside a tagged type. i.e. Original Ada => 5.30 - 5.36 seconds New Ada => 5.23 - 5.26 seconds POSIX => 4.94 - 4.97 seconds --- with System; package Ada.Synchronous_Barriers is pragma Pure; subtype Barrier_Limit is Positive range 1 .. System.Max_Parallel_Release; type Synchronous_Barrier (Number_Waiting : Barrier_Limit) is tagged limited private; procedure Wait_For_Release (Barrier : in out Synchronous_Barrier; Released_Last : out Boolean); type Distributed_Barrier is access all Synchronous_Barrier'Class; private for Distributed_Barrier'Storage_Size use 0; protected type Barrier_Type (Number_Waiting : Barrier_Limit) is entry Wait_For_All_Tasks_To_Block (Released_Last : out Boolean); private Done : Boolean := False; end Barrier_Type; type Synchronous_Barrier (Number_Waiting : Barrier_Limit) is tagged limited record The_Barrier : Barrier_Type (Number_Waiting); end record; end Ada.Synchronous_Barriers; package body Ada.Synchronous_Barriers is ---------------------- -- Wait_For_Release -- ---------------------- procedure Wait_For_Release (Barrier : in out Synchronous_Barrier; Released_Last : out Boolean) is begin Barrier.The_Barrier.Wait_For_All_Tasks_To_Block (Released_Last); end Wait_For_Release; protected body Barrier_Type is --------------------------------- -- Wait_For_All_Tasks_To_Block -- --------------------------------- entry Wait_For_All_Tasks_To_Block (Released_Last : out Boolean) when Wait_For_All_Tasks_To_Block'Count = Number_Waiting or Done is begin Done := True; if Wait_For_All_Tasks_To_Block'Count = 0 then Released_Last := True; Done := False; else Released_Last := False; end if; end Wait_For_All_Tasks_To_Block; end Barrier_Type; end Ada.Synchronous_Barriers; **************************************************************** From: Bob Duff Sent: Sunday, March 14, 2010 1:43 PM > I wonder though whether we should be providing a way to specify > barrier attributes in the synchronous barrier package as some sort of > system defined set of attributes. An implementation that uses a > protected type instead of the POSIX calls might only provide the > default attribute (equivalent to PTHREAD_PROCESS_PRIVATE). Seems better to keep it portable. If implementations want to provide implementation dependent bells and whistles they can do so (e.g. put additional operations in a child package). **************************************************************** From: Tucker Taft Sent: Sunday, March 14, 2010 2:44 PM I would vote for making it Preelaborate. It seems likely that this package should be implementable without special treatment by the compiler, and it seems pretty natural to use finalization. I think we want to make it a Remote_Types package so that it *can* be used with a RACW type, but not actually declare such a type inside it. That does leave open the question of whether it should be declared as a visibly tagged type, or let the user wanting to use it with an RACW to wrap it in a tagged type. I would probably let the user wrap it, since the whole idea of distributed barriers seems a bit unlikely to me, given how easily a barrier will freeze up if even a single task fails to "show up." **************************************************************** From: Brad Moore Sent: Sunday, March 14, 2010 11:14 PM > I would vote for making it Preelaborate. > It seems likely that this package should be implementable without > special treatment by the compiler, and it seems pretty natural to use > finalization. I could go either way, but I think it would be preferable to use Preelaborate for the same reasons you mention. > I think we want to make it a Remote_Types package so that it *can* be > used with a RACW type, but not actually declare such a type inside it. > That does leave open the question of whether it should be declared as > a visibly tagged type, or let the user wanting to use it with an RACW > to wrap it in a tagged type. > I would probably let the user wrap it, since the whole idea of > distributed barriers seems a bit unlikely to me, given how easily a > barrier will freeze up if even a single task fails to "show up." Good points. Do you think there is any likelihood that users might want to extend the type for other reasons than remote usage? Perhaps to include state information to assist the final "cleanup" task that gets the "Released_Last" out parameter set to true? I suppose you could argue that the user could just as easily wrap the type in a tagged type for that purpose. If it seems that this would be fairly common usage though, then having a tagged type would be more convenient. If that scenario seems unlikely, then leaving it as untagged could make more sense. **************************************************************** From: Tucker Taft Sent: Sunday, March 14, 2010 11:04 PM Hi Jon, I am forwarding this to the ARG mailing list. Hopefully Brad will see it. Jon S. Squire. wrote: > Please send a test case. > > I tried to write a test for both versions, no luck. > Could not get them to compile. > > I have used barriers with both Java and Python. > I really need them for Ada. **************************************************************** From: Brad Moore Sent: Monday, March 15, 2010 12:05 AM You probably had problems getting the implementations to compile because the System package currently does not define Max_Parallel_Release constant. That does not exist yet, until AI05-174 becomes official, as proposed for the Ada 2012 amendment. Actually, I recall there was some change related to that constant, but I will need to see the minutes of the Boston ARG meeting (which haven't been prepared yet) to see what was decided upon. [The rest of the message describes a sample implementation, complete with compilation instructions. It's too large to include here. Contact Brad or the ARG Editor to get a copy.] **************************************************************** From: Tucker Taft Sent: Monday, March 15, 2010 7:30 AM I believe we decided to eliminate Max_Parallel_Release from System. It is too specialized a value to appear there. There seems no reason not to put it directly in the new package. In fact I think we decided to eliminate it completely, by simply saying the index high bound was an implementation-defined value, which you could always determine using the 'Last attribute. **************************************************************** From: Bob Duff Sent: Monday, March 15, 2010 10:43 AM My recollection is that I suggested that, but you said it was a stupid idea (well, you were more polite than that ;-)), because the 'Last is intended to be something like 2**31-1, whereas Max_Parallel_Release might be much smaller. So I thought we ended up agreeing to keep Max_Parallel_Release, but move it into this package (instead of System). But we'll have to look at Randy's minutes to be sure. **************************************************************** From: Randy Brukardt Sent: Monday, March 15, 2010 12:43 PM My notes reflect what Tucker said (get rid of it completely and just use 'Last). I don't have any of your "recollections". You may have had a stupid idea during the meeting :-), but I don't think this was it. Specifically, my notes (which need some fleshing out) have: subtype Barrier_Limit is Positive range 1 .. <>; Users can use Barrier_Limit'Last. Get rid of the System change and with of System. **************************************************************** From: Bob Duff Sent: Monday, March 15, 2010 4:56 PM > My notes reflect what Tucker said (get rid of it completely and just > use 'Last). I don't have any of your "recollections". Your notes should be trusted far more than my vague memory. >... You may have had a stupid > idea during the meeting :-), ... Well, I'm pretty sure I at least made some silly jokes. >...but I don't think this was it. > > Specifically, my notes (which need some fleshing out) have: > > subtype Barrier_Limit is Positive range 1 .. <>; > > Users can use Barrier_Limit'Last. > > Get rid of the System change and with of System. Thanks for taking good notes! I know it's hard, especially when you're also trying to participate in the discussion. **************************************************************** From: Tucker Taft Sent: Monday, March 15, 2010 5:07 PM Actually, I vaguely remember the same thing that Bob vaguely remembered. But perhaps we went back and forth a couple of times, and you only recorded the final decision. **************************************************************** From: Randy Brukardt Sent: Monday, March 15, 2010 5:42 PM Or maybe I simply never heard that part of the discussion because I was too busy recording the first part. (It happens.) I definitely don't recall any discussion about *not* making the value 'Last -- and I can't think of any advantage to having a separate constant for barriers (this value reflects a target OS - or implementation - limit, so it isn't going to change often). Maybe you are thinking about the similar value for CPUs, which is likely to be determined at runtime by the target hardware. There, a separate function to get the limit makes sense, with a larger value for 'Last for the largest number of CPUs that the target OS supports. **************************************************************** From: Tucker Taft Sent: Monday, March 15, 2010 7:26 PM I think you might be right that I was confusing the number of CPUs and the barrier limit. ****************************************************************