!standard 5.1(1)                                   18-07-20  AI12-0267-1/06
!standard 9.5(57/5)
!standard 9.10(11)
!standard 9.10(15)
!standard 11.5(19.2/2)
!standard H.5(0)
!standard H.5(1/2)
!standard H.5(5/5)
!standard H.5(6/2)
!class Amendment 18-03-29
!status work item 18-03-29
!status received 18-03-29
!priority Medium
!difficulty Hard
!subject Data race and non-blocking checks for parallel constructs

!summary

Restrictions against potential data races and blocking are defined for
parallel constructs. A policy-based mechanism is provided to control what
level of restrictions are imposed for potential data races.

!problem

Parallel execution can be a source of erroneous execution in the form of
data races and deadlocking. As multicore computing becomes more prevalent,
concerns for improved safety with regard to parallel execution are
expected to become more commonplace.

The proposals for the Global and Nonblocking aspects should provide the
compiler with semantic information to facilitate detection of these
errors, but additional checking may be needed to determine whether a
parallelism construct is data-race free. The focus of this AI is to
consider any additional restrictions that may be needed to support these
features.

Similarly, if a compiler cannot determine whether the parallelism is safe,
there ought to be a way for the programmer to explicitly override the
conservative approach of the compiler and insist that parallelism be
applied, even if data races or deadlocking problems can potentially occur.

Ada needs mechanisms whereby the compiler is given the necessary semantic
information to enable the implicit and explicit parallelization of code.
After all, the current Ada language only allows parallel execution when it
is semantically neutral (see 1.1.4(18) and 11.6(3/3)) or when it is
expressed explicitly in the code as a task.

!proposal

This proposal depends on the facilities for aspect Global (AI12-0079-1)
and for aspect Nonblocking (AI12-0064-2).
Those proposals allow the compiler to statically determine where
parallelism may be introduced without introducing data races or
deadlocking.

Note that we distinguish "data race" from the more general term "race
condition": "data race" refers to the case where two concurrent
activities attempt to access the same data object without appropriate
synchronization, where at least one of the accesses updates the object.
Such "conflicting" concurrent activities are considered erroneous in Ada,
per section 9.10 of the reference manual. The more general term "race
condition" includes other kinds of situations where the effect depends on
the precise ordering of the actions of concurrent activities. Race
conditions can be benign, or even intentional, and cannot easily be
detected in many cases. We make no particular attempt to prevent the more
general notion of "race conditions," but we do hope to minimize data
races.

An important part of this model is that the compiler will complain if it
is not able to verify that the parallel computations are independent (see
AI12-0079-1 and AI12-0064-2 for ways in which this can happen). Note that
in this model the compiler will identify code where a potential data race
(herein called a "potential conflict") occurs (following the rules for
access to shared variables as specified in 9.10), and will point out
where objects cannot be guaranteed to be independently addressable.

We considered using run-time checks to detect data overlap when
compile-time checks were not feasible, but the consensus was that it
would be better to define a range of compile-time levels of checking, and
forgo any use of run-time checks. Parallel constructs are only useful if
they speed up a program, and complex run-time checks are almost certain
to undermine any attempt at a speedup.
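To illustrate the kind of potential conflict the compiler would be
expected to identify, here is a hypothetical sketch using the parallel
loop syntax proposed by AI12-0119-1 (all identifiers are invented for the
example):

```ada
procedure Race_Example is
   Arr : array (1 .. 100) of Integer := (others => 1);
   Sum : Integer := 0;
begin
   parallel for I in Arr'Range loop
      --  Concurrent iterations both read and update the unsynchronized
      --  global Sum, so this is a potential conflict that the compiler
      --  would be expected to flag at compile time.
      Sum := Sum + Arr (I);
   end loop;
end Race_Example;
```

Such a summation would instead be written using the 'Reduce attribute of
AI12-0242-1, or by accumulating into a protected or atomic object, either
of which provides the needed synchronization.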
This model also disallows potentially blocking operations within parallel
block statements, parallel loop statements, and parallel reduce
attributes, to simplify the implementation of these constructs and to
eliminate the possibility of deadlock.

We propose a Conflict_Check_Policy pragma to control the level of
compile-time checking performed for potentially conflicting accesses to
shared data.

!wording

Modify 9.5(57/5):

A {parallel construct or a} nonblocking program unit shall not contain,
other than within nested units with Nonblocking specified as statically
False, a call on a callable entity for which the Nonblocking aspect is
statically False, nor shall it contain any of the following:

Between 9.10(10) and 9.10(11): Remove "Erroneous Execution" subtitle (it
will re-appear later).

Modify 9.10(11):

[Given an action of assigning to an object, and an action of reading or
updating a part of the same object (or of a neighboring object if the two
are not independently addressable), then the execution of the actions is
erroneous unless the actions are sequential.] Two actions are {defined to
be} sequential if one of the following is true:

Add after 9.10(15):

Two actions that are not sequential are defined to be /concurrent/
actions. Two actions are defined to /conflict/ if one action assigns to
an object, and the other action reads or assigns to a part of the same
object (or of a neighboring object if the two are not independently
addressable). The action comprising a call on a subprogram or an entry is
defined to /potentially conflict/ with another action if the Global
aspect (or Global'Class aspect, in the case of a dispatching call) of the
called subprogram or entry is such that a conflicting action would be
possible during the execution of the call.
Similarly, two calls are considered to potentially conflict if they each
have Global (or Global'Class, in the case of a dispatching call) aspects
such that conflicting actions would be possible during the execution of
the calls.

A /synchronized/ object is an object of a task or protected type, an
atomic object (see C.6), a suspension object (see D.10), or a synchronous
barrier (see D.10.1). [Redundant: Operations on such objects are
necessarily sequential with respect to one another, and hence are never
considered to conflict.]

Erroneous Execution

The execution of two concurrent actions is erroneous if the actions make
conflicting uses of a shared variable (or neighboring variables that are
not independently addressable).

Add new section:

9.10.1 Conflict Check Policies

This subclause determines what checks are performed relating to possible
concurrent conflicting actions (see 9.10).

The form of a Conflict_Check_Policy pragma is as follows:

   pragma Conflict_Check_Policy (/policy_/identifier);

A pragma Conflict_Check_Policy is allowed only immediately within a
declarative_part, a package_specification, or as a configuration pragma.

Legality Rules

The /policy_/identifier shall be one of Unchecked, Known_Conflict_Checks,
Parallel_Conflict_Checks, All_Conflict_Checks, or an
implementation-defined conflict check policy.

Static Semantics

A pragma Conflict_Check_Policy given in a declarative_part or immediately
within a package_specification applies from the place of the pragma to
the end of the innermost enclosing declarative region. The region for a
pragma Conflict_Check_Policy given as a configuration pragma is the
declarative region for the entire compilation unit (or units) to which it
applies. If a pragma Conflict_Check_Policy applies to a
generic_instantiation, then the pragma Conflict_Check_Policy applies to
the entire instance.
If multiple Conflict_Check_Policy pragmas apply to a given construct, the
conflict check policy is determined by the one in the innermost enclosing
region. If no such Conflict_Check_Policy pragma exists, the policy is
Parallel_Conflict_Checks (see below).

Implementation Requirements

The implementation shall impose restrictions related to possible
concurrent conflicting actions, according to which conflict check
policies apply at the place where the action or actions occur, as
follows:

* Unchecked

  This policy imposes no restrictions.

* Known_Conflict_Checks

  If this policy applies to two concurrent actions, they are disallowed
  if they are known to denote the same object (see 6.4.1) with uses that
  potentially conflict. For the purposes of this check, any parallel loop
  may be presumed to involve multiple concurrent iterations, and any task
  type may be presumed to have multiple instances.

* Parallel_Conflict_Checks

  This policy disallows a parallel construct from reading or updating a
  variable that is global to the construct, unless it is a synchronized
  object, or unless the construct is a parallel loop and the global
  variable is a part of a component of an array denoted by an indexed
  component with at least one index expression that statically denotes
  the loop parameter of the loop_parameter_specification or the chunk
  index parameter of the parallel loop.

* All_Conflict_Checks

  This policy includes the restrictions imposed by the
  Parallel_Conflict_Checks policy, and in addition disallows a task body
  from reading or updating a variable that is global to the task body,
  unless it is a synchronized object.

Implementation Permissions

When the applicable conflict check policy is Known_Conflict_Checks, the
implementation may disallow two concurrent actions if the implementation
can prove that they will denote the same object at run time, with uses
that potentially conflict.
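The intended effect of the default policy can be sketched as follows
(this is an illustration, not proposed wording; the identifiers are
invented, and the parallel loop syntax is that of AI12-0119-1):

```ada
pragma Conflict_Check_Policy (Parallel_Conflict_Checks);

procedure Policy_Example is
   A     : array (1 .. 100) of Integer := (others => 0);
   Total : Integer := 0;
begin
   parallel for I in A'Range loop
      --  Allowed: the index expression statically denotes the loop
      --  parameter, so each iteration updates a distinct component.
      A (I) := I * 2;
   end loop;

   parallel for I in A'Range loop
      --  Disallowed under this policy: Total is an unsynchronized
      --  variable global to the parallel construct.
      Total := Total + A (I);
   end loop;
end Policy_Example;
```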
Modify H.5(1/2):

The following pragma [forces] {requires} an implementation to detect
potentially blocking operations [within] {during the execution of} a
protected operation{ or a parallel construct}.

Modify H.5(5/5):

An implementation is required to detect a potentially blocking operation
{that occurs} during {the execution of} a protected operation{ or a
parallel construct defined within a compilation unit to which the pragma
applies}, and to raise Program_Error (see 9.5[.1]).

Modify H.5(6/2):

An implementation is allowed to reject a compilation_unit {to which a
pragma Detect_Blocking applies} if a potentially blocking operation is
present directly within an entry_body{,} [or] the body of a protected
subprogram{, or a parallel construct occurring within the compilation
unit}.

!discussion

It is important for the programmer to receive an indication from the
compiler whether the desired parallelism is safe.

We considered whether blocking calls should be allowed in a parallel
block statement. We felt that allowing that could add significant
complexity for the implementor, as well as introduce safety concerns
about potential deadlocking. While supporting blocking within a parallel
construct is feasible, it was felt that it should be disallowed for now.
If the demand and need are felt in the future, the capability could be
added then, but it is better not to standardize it until we know it is
needed. In general, to support blocking, a logical thread of control must
get its own physical thread of control, which implies overhead that is
not normally desired for relatively light-weight parallel constructs.

We have largely chosen to simply piggy-back on the existing rules in 9.5
about potentially blocking operations to disallow blocking in parallel
constructs. To maximize safety, we consider all parallel constructs to be
implicitly "nonblocking" constructs, thereby moving the non-blocking
check to compile time.
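For example, under this model a parallel construct containing a
potentially blocking operation would be rejected at compile time (a
hypothetical sketch, using the parallel loop syntax of AI12-0119-1):

```ada
procedure Blocking_Example is
begin
   parallel for I in 1 .. 10 loop
      --  Illegal: a delay_statement is potentially blocking (see 9.5),
      --  and parallel constructs are implicitly nonblocking.
      delay 1.0;
   end loop;
end Blocking_Example;
```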
As for data races, we have chosen to introduce the term "concurrent
actions" in contrast to "sequential actions," to avoid the somewhat
awkward term "non-sequential actions." We have defined the terms
"conflicting" and "potentially conflicting" to simplify other wording
that talks about data races. The distinction is that directly
"conflicting" actions involve assignments and reads, while "potentially
conflicting" actions involve calls where the Global aspect (or
Global'Class aspect, for dispatching calls) implies that a conflicting
action would be allowed during the execution of the call. Note that these
definitions apply no matter how the concurrency is introduced, so they
apply to concurrency associated with "normal" Ada tasking.

We define four "conflict check policies" to control the level of checking
that is performed to prevent data races. The most restrictive policies
ensure that individual actions do not read or update variables that are
visible to other logical threads of control. This is quite conservative,
as it will complain even if the given variable is manipulated by only one
logical thread of control. The Known_Conflict_Checks policy is less
restrictive, and is intended to disallow only conflicts that are clearly
unsafe. The Unchecked policy imposes no restrictions whatsoever.

It was felt that the default should be upward compatible with existing
code, but completely safe for new uses of parallel constructs. As such,
it disallows all uses of unsynchronized globals, unless the reference is
to an element of an array using either the loop parameter or the chunk
index as the index into the array.

!ASIS

** TBD.

!ACATS test

ACATS B- and C-Tests are needed to check that the new capabilities are
supported.

!appendix

From: Brad Moore
Sent: Wednesday, March 28, 2018  12:12 AM

This is a new AI extracted from AI12-0119-1 (Parallel Operations).

[This is version /01 of this AI - ED.]
This contains the bits that were related to detecting data races and
deadlocking issues relating to blocking calls for parallel constructs.

The wording hasn't changed on this, but it probably needs a bit of work
to better define what checks are performed, when they are performed, and
how those checks can be enabled/disabled.

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 28, 2018  3:07 PM

We might want to have these rules be associated with a restriction or
restrictions, rather than as basic legality rules. Erhard makes a strong
case for allowing programmers to bypass these rules when appropriate.
Randy argues for introducing the notion of an "Allowance" and making
these in effect unless an Allowance overrules them. In any case, to
provide finer control, we might want to have some way to turn them on and
off, which might argue for both a restriction and an allowance, with
"No_Data_Races" and "Allow_Data_Races" (though that name is a bit
strange). Perhaps we just need a way to locally disable a restriction,
e.g.:

   pragma Bypass_Restrictions ("No_Data_Races");

analogous to Unsuppress. We already have a way to specify a restriction
partition-wide, and implementations could choose to provide certain
restrictions by default, so that seems like an alternative to having to
introduce the distinct concept of allowances. In particular, support for
Annex H might require that certain restrictions be on by default, but
then we clearly would need a way to turn them off.

Also, you seem to have left these rules in the basic AI on parallel
operations. They should probably be removed from there.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 29, 2018  3:07 PM

One thing that is clearly missing from this AI is a statement that it
depends on AI12-0119-1 (parallel blocks and loops) and probably on
AI12-0242-1 (reduction).
It does mention the contract AIs (64 & 79), so it's odd not to mention
the AIs to which it is adding Legality Rules.

****************************************************************

From: Tucker Taft
Sent: Tuesday, June 12, 2018  12:20 PM

Here is an AI started by Brad, but essentially re-written by Tuck, to
capture the checks for blocking and data races associated with parallel
constructs. [This is version /02 of the AI - Editor.]

In fact, as you will see, the data-race detection applies to any
concurrency, whether it is introduced by tasking or parallel constructs.
Furthermore, the detection of potential blocking is simply a refinement
of what is already performed for protected operations.

We introduce a new pragma Detect_Conflicts and a new "check"
Conflict_Check, along with some new terminology such as "concurrent
actions," "conflicting actions," and "potentially conflicting actions."
Hopefully these terms are relatively intuitive.

Comments as usual welcome!

****************************************************************

From: Randy Brukardt
Sent: Tuesday, June 12, 2018  8:29 PM

> Here is an AI started by Brad, but essentially re-written by Tuck, to
> capture the checks for blocking and data races associated with
> parallel constructs. ...
> Comments as usual welcome!

Thanks for getting this done. The use of a dynamic check that can be
suppressed is clever and eliminates certain issues.

---

Sigh. I wanted to start with something positive, but sadly I don't see
much positive in this specific proposal. I'd best start with a pair of
philosophies:

(1) Ada is at its heart a safe language. We apply checking by default
other than in cases where we can't do it in order to keep compatibility
with existing code. (For instance, for these checks in tasks, and for
Nonblocking checks in protected types -- adding the checks would be
incompatible with existing code -- so we don't do it.) The parallel block
and loop are new constructs, so checks should be done by default.
(If the checks are so unreliable that doing so is a problem, then we
shouldn't have the checks at all.)

(2) Ada already is a fine language for writing parallel programs. Brad's
various Paraffin library efforts have proved that you can already write
fine parallel programs using the existing facilities of the language.
Various improvements to Ada 2020 that we've already finished or are close
to finishing (in particular, AI12-0189-1 for procedure body iterators)
would give virtually the same syntactic and semantic power to using
libraries rather than a complex set of new features. (Indeed, a library
is more flexible, as it can more easily provide tuning control.)

As such, the value of the parallel loop and block features boils down to
two things: (A) the ability to more safely construct programs using them;
and (B) the marketing advantages of having such features. (A) is
important since it allows average programmers to correctly use parallel
programming. If we're not going to make any serious effort to get (A) and
default to be safe, then the value of the parallel block and loop has
(for me) declined to nearly zero (just being (B), and I don't put a ton
of value on marketing). Without static, safe checks, we could get the
benefits of parallel iterators simply by defining them and not requiring
any special syntax to use them. And we could provide some libraries like
Brad's for more free-form iteration. That would be much less work than
what we've been doing, and with no loss of capability.

---

Given the above, here are some specific criticisms:

(1)
> Change H.5 title to: Pragmas Detect_Blocking and Detect_Conflicts

Putting Detect_Conflicts into Annex H and not having it on by default
ensures that most people will never use it (like Detect_Blocking),
ensures that many users won't even have access to it (since many vendors
don't support Annex H), and worst, ensures that Ada newbies won't get the
benefit of correctness checking for parallel loops.
It makes sense to let experts turn these checks off (they're not intended
for people like Alan!), but they need to be made unless/until a user
knows they don't need them.

(2)
> An implementation is required to detect a potentially blocking
> operation {that occurs} during {the execution of} a protected
> operation{ or a parallel construct defined within a compilation unit
> to which the pragma applies}, and to raise Program_Error (see 9.5[.1]).

I don't understand this at all. AI12-0119-1 already includes the
following:

   During the execution of one of the iterations of a parallel loop, it
   is a bounded error to invoke an operation that is potentially
   blocking (see 9.5).

I expect that most implementations of parallel loops will detect the
error, since that will allow a substantial simplification in the
implementation (a thread that doesn't block doesn't need data structures
to manage blocking or delays - just the ability to complete or busy-wait
for a protected action). For such implementations, this buys nothing.
Moreover, the presence of the bounded error ensures that no portable code
can ever depend on blocking in a parallel loop or block.

Moreover, this formulation completely ignores the substantial work we did
defining static checks for nonblocking. Since no portable code can depend
on blocking, I don't even see much reason to be able to turn the check
off. We could simply have a Legality Rule for each statement preventing
the use of any operation that allows blocking. If we really need a way to
turn the check off, we should declare a pragma for this purpose. (It
doesn't have to be a general feature.)

Aside: If we're in the business of defining configuration pragmas to get
additional checks, we should consider adding a pragma to force all
protected declarations to default to having Nonblocking => True. That
would make the nonblocking checks static in those declarations.
We can't do that by default for compatibility reasons, but we can surely
make it easy to do that if the user wants.

(3) A pragma Detect_Conflicts is a configuration pragma.

Detect_Conflicts should always be on for parallel loops and blocks (to
get (A), above). The suppression form ought to be sufficient for the
hopefully rare cases where it needs to be turned off. For compatibility,
we need a pragma like Detect_Conflicts if we want to detect these uses
for tasks.

(4)
> Implementation Permissions
>
> An implementation is allowed to reject a compilation_unit to which
> the pragma Detect_Conflicts applies, if concurrent actions within the
> compilation unit are known to refer to the same object (see 6.4.1)
> with uses that potentially conflict, and neither action is within the
> scope of a Suppress pragma for the Conflict_Check (see 11.5).

This ought to be a requirement. What is the point of knowing this
statically (which we do) and then not reporting it to the user? I doubt
that many vendors would take advantage of a permission that simply adds
work, especially early on. Perhaps we should limit any such requirement
to obvious places (such as within a parallel block or loop) as opposed to
the rather general catch-all. But then see below.

(5)
> An implementation is required to detect that two concurrent actions
> make potentially conflicting uses of the same shared variable (or of
> neighboring variables that are not independently addressable) if the
> actions occur within a compilation unit to which the pragma applies
> (see 9.5). Program_Error is raised if such potentially conflicting
> concurrent actions are detected.

I'm not sure how this could be implemented in Janus/Ada, since we don't
process anything larger than a simple statement at a time. Trying to keep
track of every possible concurrent access over an entire unit sounds like
a nightmare, given that nearly every object referenced in the compilation
unit could be used that way.
The check for a single parallel block or loop seems tractable since the
scope of the check is very limited. But if any possible concurrent access
is involved, that can happen in any code anywhere. That seems way too
broad. It seems especially painful in the case of array indexing, which
already requires a very complex dynamic check even for a parallel loop. I
haven't a clue how one could implement such a check for all tasks in a
program unit.

It might be implementable if such checks were allowed in (and limited to)
task bodies as well as the parallel constructs. But even then the
implementation would be quite painful, not just the local check in a
single construct as previously envisioned.

=======================================================

So, let me outline an alternative proposal using a few of the features of
Tucker's proposal.

(1) The use of an operation that allows blocking (see 9.5) is not allowed
in a parallel block or loop unless pragma Allow_Blocking applies to the
block or loop. Pragma Allow_Blocking gives permission to allow blocking
in a parallel block or loop. It applies to a scope similarly to pragma
Suppress. (Detailed wording TBD, of course.)

(2) Pragma Statically_Check_Blocking is a configuration pragma. It causes
the value of Nonblocking for a protected declaration (that is, a
protected type or single protected object) to be True, unless it is
directly specified. (That is, the value is not inherited from the
enclosing package.) It also causes pragma Detect_Blocking to be invoked.
The effect of this is to cause protected types to be rejected if a
potentially blocking operation can be executed (note that we include
Detect_Blocking so that the deadlock case, which can't be statically
detected, raises Program_Error). One can turn off the effect of this
pragma on an individual declaration by explicitly specifying
Nonblocking => False.
(3) Conflict checks are defined essentially as Tucker has defined them,
but they only apply to parallel blocks and loops. And they're always on
unless specifically suppressed.

(4) The check Conflict_Check is defined essentially as Tucker has it.

(5) There is a Legality Rule to reject the parallel construct for the
reasons Tucker noted:

   A parallel construct is illegal if concurrent actions within the
   construct are known to refer to the same object (see 6.4.1) with uses
   that potentially conflict, and neither action is within the scope of
   a Suppress pragma for the Conflict_Check (see 11.5).

We could call this a "requirement" if it gives people heartburn to attach
a Legality Rule to pragma Suppress. I don't see much benefit to insisting
that a run-time check is made if it is known to be problematic at
compile time, but whichever way we describe that doesn't matter much.

P.S. Hope I wasn't too negative in this; I'm trying to make progress on
this topic.

****************************************************************

From: Tucker Taft
Sent: Tuesday, June 12, 2018  9:06 PM

I agree with much of what you say. I tried to define something that was
internally consistent, as well as consistent with existing features of
the language. However, I agree with you that we might want to have some
of these checks on by default. I guess I should have made that clear. I
did indicate in the discussion that placement of the Detect_Conflicts
pragma in H.5 is one approach, but 9.10 and C.6.1 are reasonable
alternative places.

To some extent I was reacting to Erhard's big concern that we might be
making parallelism too hard to use, by enforcing all checks by default,
when we know that Global annotations will not be very precise in many
situations.

As far as implementation of conflict checks, I tried to make it clear
that the checks can be implemented locally to individual subprograms,
since the Global/Global'Class aspect is used when analyzing a call on
some other subprogram.
So really all you need to worry about are places where a task is
initiated, or a parallel construct occurs. You don't need to look across
the "entire" compilation unit for possible conflicts. The checks can be
performed very locally.

Despite Brad's success with his Paraffin library, I believe it is totally
unrealistic to expect typical Ada programmers to build their own
light-weight threading that takes advantage of multicore hardware. We
clearly need to provide a new standard way to create light-weight threads
of control, with appropriate work-stealing-based scheduling. It could be
in a "magic" generic package of some sort, but I really don't think that
would be very friendly.

****************************************************************

From: Randy Brukardt
Sent: Tuesday, June 12, 2018  9:53 PM

> I agree with much of what you say. I tried to define something that
> was internally consistent, as well as consistent with existing
> features of the language. However, I agree with you that we might
> want to have some of these checks on by default. I guess I should
> have made that clear.
> I did indicate in the discussion that placement of the
> Detect_Conflicts pragma in H.5 is one approach, but 9.10 and
> C.6.1 are reasonable alternative places.

Saw that, but I thought it was important to remind everyone that Annex H
is *optional*!

> To some extent I was reacting to Erhard's big concern that we might be
> making parallelism too hard to use, by enforcing all checks by
> default, when we know that Global annotations will not be very precise
> in many situations.

Parallel programming *is* hard, and papering that over doesn't really
serve anyone. Anyone below the "guru" level will appreciate having help
getting it right (even if that is restrictive). After all, parallelism is
not truly safe unless everything is Global => null or in synchronized
(with a few additional cases for blocks, which can have different
subprograms in each branch).
You don't need Ada to write unsafe parallel code!

I viewed Erhard's concern as being more oriented to the problem that no
existing code will have Nonblocking or Global specified. So to make some
existing code parallel, it might be necessary to add those all over the
place. That does bother me a bit, but I view that as a one-time pain, and
since the contracts have other benefits beyond just these checks, it's
well worth the effort.

> As far as implementation of conflict checks, I tried to make it clear
> that the checks can be implemented locally to individual subprograms,
> since the Global/Global'Class aspect is used when analyzing a call on
> some other subprogram. So really all you need to worry about are
> places where a task is initiated, or a parallel construct occurs. You
> don't need to look across the "entire" compilation unit for possible
> conflicts. The checks can be performed very locally.

I don't quite understand this. The checks are about conflicts between
logical threads, and as such there have to be at least two threads
involved. In the case of task initiation, you can find out the global
uses for the new task, but you don't know that for the activating task --
you can activate a task anywhere. You would know if there are any local
conflicts in the scope where the task is going to end up -- but how could
you possibly do that for a task initiated by an allocator or returned
from a function?

> Despite Brad's success with his Paraffin library, I believe it is
> totally unrealistic to expect typical Ada programmers to build their
> own light-weight threading that takes advantage of multicore hardware.

I agree, they should download Paraffin. :-)

More seriously, we could provide a light-weight version of the library in
Standard and let implementers enhance it.

> We clearly need to provide a new
> standard way to create light-weight threads of control, with
> appropriate work-stealing-based scheduling.
> It could be in a "magic"
> generic package of some sort, but I really don't think that would be
> very friendly.

The friendliness comes from AI12-0189-1 and similar syntactic sugar. We
didn't need much extra syntax to get generalized iterators, generalized
indexing, or generalized references. The only reason this case is
different is that we (some of us, at least) want to attach static checks
to this case. Otherwise, the general syntax is good enough.

P.S. Sorry, I think that I inherited the need to always have the last
word from Robert. :-)

****************************************************************

From: Randy Brukardt
Sent: Thursday, June 14, 2018  9:48 PM

> I don't quite understand this. The checks are about conflicts between
> logical threads, and as such there have to be at least two threads
> involved.
> In the case of task initiation, you can find out the global uses for
> the new task, but you don't know that for the activating task -- you
> can activate a task anywhere. You would know if there are any local
> conflicts in the scope where the task is going to end up -- how could
> you possibly do that for a task initiated by an allocator or returned
> from a function?

Having thought about this some more, I think you are trying to use the
Global of the activator for this check. That works for the activation of
local tasks, but it doesn't seem to work in the most important case:
library-level tasks (or tasks that are allocated for a library-level
access type). That's because you have to compare against the Global for
the environment task. What's that? Well, it has to be effectively
Global => in out all, because the environment task elaborates all of the
global objects. So you have to allow it to write them! But that means
that there would be a conflict with every task that uses any global data
-- even if that task is the *only* (regular) task to access that global
data. That doesn't seem helpful in any way.
As I noted yesterday, I also don't see how the check could be done in the case where a task is returned by a function -- you don't know the activator inside the function, and you don't know much about the task at the function call site. You probably could rig up a runtime version of the static check, but that would be very complex (the static check being complex, since the data structures needed are very complex). Seems like a huge amount of runtime overhead for not much benefit (as noted above, almost every task has a conflict; the whole reason that programming Ada tasks is hard is managing that conflict). **************************************************************** From: Tucker Taft Sent: Friday, June 15, 2018 3:12 AM I am working on a new version. Hopefully it will address some of these issues. **************************************************************** From: Tucker Taft Sent: Friday, June 15, 2018 4:44 AM Here is a new version that attempts to address most of Randy's comments. Both non-blocking checks and compile-time conflict checks are on by default for parallel constructs. [This is version /03 of the AI - Editor.] **************************************************************** From: Randy Brukardt Sent: Friday, June 15, 2018 8:17 PM > Here is a new version that attempts to address most of Randy's > comments. Both non-blocking checks and compile-time conflict checks > are on by default for parallel constructs. Thank you very much; I like this version much better. For the record, I deleted a couple of extra spaces and added a missing period when posting. There might be some value to describing the model of the data race checks for tasks, since it didn't seem obvious to me. Also, there might be value in describing the difference between a "data race" and a "race condition" (you did this once for me in e-mail, which I happened to re-read last night when working on AI12-0240-2; I doubt most people understand the difference). 
And in particular, that there is no detection of race conditions (which isn't really possible, and as you pointed out, aren't even erroneous in an Ada sense). No rush on those though. **************************************************************** From: Erhard Ploedereder Sent: Saturday, June 23, 2018 4:16 PM Here are a few relevant examples for determining data races. The AI should have good answers to the questions raised below. -------------- Example 1: Are the rules really conservative? type IP is access all Integer; AP: IP; A,B,C: aliased Integer; procedure A_Read is -- with Globals in: AP, Heap; Globals out: C begin C := AP.all; end; procedure A_Write is -- with Globals out: A begin A := 42; end; and somewhere in the universe: if Random then AP := A'access; else AP := B'access; end if; There is a data race if Random=true. Is this data race detected? If so, what about the data-race-freeness of: procedure C_Write is -- with Globals out: C begin C := 42; end; -------------- Example 1a: and, dropping the "aliased", I missed rules that handle the race between A_Write and Write(A) for procedure Write(X: out Integer) is -- Globals = {} begin X := 21; end Write; ----------------------------------------------- Example 2: do the rules prevent legitimate programs? The following idiom (minus the B, which I added just for fun) to protect the uses of A gets taught in C++ classes: Ready: Boolean := False with Atomic, Volatile; -- I never remember which one implies the other in Ada -- A and B of arbitrary type, here Integer A, B: Integer := 0; procedure Write is -- Globals out: A -- Globals in: B -- Globals in out: Ready begin while Ready loop null; end loop; A := 42 + B; Ready := False; end Write; procedure Read is -- with Globals out: B -- Globals in: A -- Globals in out: Ready begin while not Ready loop null; end loop; B := A - 42; Ready := True; end; This code does not have a data race, not even a general race condition. Illegal in Ada? 
or legal only, if all checks are off ??? Neither will be good advertising for Ada. **************************************************************** From: Tucker Taft Sent: Saturday, June 23, 2018 7:27 PM OK, here is a nearly complete re-write of AI12-0267 which checks for data races. The stuff relating to nonblocking is not particularly changed, though we did decide to say that a parallel construct inside a protected action is legal, but does not actually create any new logical threads of control. [This is version /04 of the AI - Editor.] **************************************************************** From: Randy Brukardt Sent: Friday, June 29, 2018 8:52 PM > Here are a few relevant examples for determining data races. > The AI should have good answers to the questions raised below. My opinion is that examples writing global objects are nearly irrelevant. Only tasking gurus can figure out if such things are safe -- we don't even have tools that can do it. And our goal ought to be to make it possible for the largest number of programmers to write working parallel code. If that requires a tasking guru, we've gained nothing. It makes sense to have an "anything goes" mode so people who think that they actually understand tasking can use these features unfettered -- that will be valuable for a few experts to write libraries for everyone else to use -- but it probably will cause countless others to shoot themselves in the foot. ... > The following idiom (minus the B, which I added just for fun) to > protect the uses of A gets taught in C++ classes: This of course is precisely the problem I'm concerned about. 
> Ready: Boolean := False with Atomic, Volatile; > -- I never remember which one implies the other in Ada > -- A and B of arbitrary type, here Integer > A, B: Integer := 0; > > procedure Write is -- Globals out: A > -- Globals in: B > -- Globals in out: Ready > begin > while Ready loop null; end loop; > A := 42 + B; > Ready := False; > end Write; > > procedure Read is -- with Globals out: B > -- Globals in: A > -- Globals in out: Ready > begin > while not Ready loop null; end loop; > B := A - 42; > Ready := True; > end; > > This code does not have a data race, not even a general race > condition. No one but a true tasking guru can prove that. (It would likely take me 4 hours of reading rules and simulating code to verify that you wrote the above example correctly.) And using an "idiom" rather than simply using the language abstractions provided for that purpose to protect objects seems like a terrible inversion. In cases like this, A and B should be in a protected object. That does put pressure on vendors to provide multiple implementations of protected objects, especially lightweight ones to use when there are no queues. But using a safe abstraction will adapt better to whatever the next hardware innovation turns out to be, unlike depending on a particular implementation of a lock. > Illegal in Ada? or legal only, if all ckecks are off ??? > Neither will be good advertising for Ada. Only among people who want to work a lot harder than necessary to get parallel code working. I'd expect it to be better advertising for Ada to tell them that they don't have to write "idioms" that are easy to get wrong, but rather can write simple, safe abstractions for inter-thread communication -- and the Ada compiler will check for most common mistakes. I know from experience that it is nearly impossible to get the simplest piece of code like the above correct, and even identifying problems to worry about is a difficult problem. 
That can never be the way for the majority of programmers to work. Ada is not going to grow if only a few elite programmers can use it! Even restricting usage to POs (and reading) isn't guaranteed to be perfect, but it seems to be the best that can be done statically. Perhaps by Ada 2028 we can give some help in the design and use of such POs. On one point I do agree with you: we're spending a lot of time getting these rules right, but they're not really going to help. I'd say that's because including atomic objects is going to allow a variety of badly designed code -- personally, I'd ONLY allow writing of global POs in parallel operations, and no other writing. I want to put the work into allowing cases of reading that can't naturally be allowed (like cursor operations in the containers). **************************************************************** From: Jean-Pierre Rosen Sent: Saturday, June 30, 2018 12:57 AM > procedure Write is -- Globals out: A > -- Globals in: B > -- Globals in out: Ready > begin > while Ready loop null; end loop; > A := 42 + B; > Ready := False; := True, I guess > end Write; > > procedure Read is -- with Globals out: B > -- Globals in: A > -- Globals in out: Ready > begin > while not Ready loop null; end loop; > B := A - 42; > Ready := True; := False, I guess > end; > > This code does not have a data race, not even a general race > condition. Hmmm... as long as you can prove there is only one task calling Read, and one task calling Write **************************************************************** From: Randy Brukardt Sent: Saturday, June 30, 2018 1:22 AM ...and the error demonstrates just how hard it is to get these things right. And this is only toy code. It's often much harder in real situations. Yes, these errors will cause some pain. That's not unlike the pain caused by other Ada errors and checks (strong typing; strong hiding; range checks). Ada programmers soon learn that the gain is worth the pain. 
The same is likely to be true here - reducing thread communication means code that is more easily parallelized automatically as well as manually. **************************************************************** From: Erhard Ploedereder Sent: Sunday, July 1, 2018 5:46 AM > In cases like this, A and B should be in a protected object I could not agree more in theory. However ... > No one but a true tasking guru can prove that. No. In more than 10 years of analyzing code that is running in cars, I have not seen a single use of a construct like a semaphore, mutex, or the like. And it is not just single programmers, it is an entire application domain that works this way. The programmers are not gurus and yet our cars work (mostly). They manage to write the code in such a way that a deadlock is provably impossible, since there are no primitives used that could cause them. (The possibility of livelocks is blissfully ignored or dealt with at code review level.) The synchronization is done either via flags (as in my example) or, more often, via states in the path predicates (the system cannot be in states 15 and 31 at the same time) plus appropriate state management. **************************************************************** From: Erhard Ploedereder Sent: Sunday, July 1, 2018 5:58 AM >> This code does not have a data race, not even a general race >> condition. > Hmmm... as long as you can prove there is only one task calling Read, > and one task calling Write I stand corrected wrt the comment in the code. **************************************************************** From: Brad Moore Sent: Sunday, July 1, 2018 11:12 AM ... > Example 2: do the rules prevent legitimate programs? 
> > The following idiom (minus the B, which I added just for fun) to > protect the uses of A gets taught in C++ classes: > > Ready: Boolean := False with Atomic, Volatile; > -- I never remember which one implies the other in Ada > -- A and B of arbitrary type, here Integer > A, B: Integer := 0; > > procedure Write is -- Globals out: A > -- Globals in: B > -- Globals in out: Ready > begin > while Ready loop null; end loop; > A := 42 + B; > Ready := False; > end Write; > > procedure Read is -- with Globals out: B > -- Globals in: A > -- Globals in out: Ready > begin > while not Ready loop null; end loop; > B := A - 42; > Ready := True; > end; > > This code does not have a data race, not even a general race condition. > Illegal in Ada? or legal only, if all ckecks are off ??? Neither will > be good advertising for Ada. I'm not sure I understand the purpose of this example. It appears that one of these two routines, if called, will exit without changing the state of Ready (but assigning it a confirming value), and the other one is guaranteed to get caught in an infinite loop if called. Or did you intend Write to set Ready to True, and Read to set Ready to False? I suspect this is the case, in which case as Jean-Pierre suggests, this only works for a single reader + single writer scenario. If this is the case, it is a case in point (logic bug not easily spotted) for showing better support for using safer constructs such as protected objects, as Randy was suggesting. Or are you suggesting or asking if the rules of the AI should detect this logic problem? Or am I missing something? **************************************************************** From: Erhard Ploedereder Sent: Sunday, July 1, 2018 7:17 PM > ..and the error demonstrates just how hard it is to get these things right. > And this is only toy code. It's often much harder in real situations. Well, it was not an error. 
It was an idiom for a one-to-one situation that would indeed not work in a many-to-one/many situation. For many-to-X you normally need CAS and family, or else higher-level constructs. **************************************************************** From: Randy Brukardt Sent: Sunday, July 1, 2018 9:33 PM Getting the True and False flags reversed is an error! Moral: use protected objects, not "idioms". **************************************************************** From: Erhard Ploedereder Sent: Sunday, July 1, 2018 9:06 PM > Or did you intend Write to set Ready to True, and Read to set Ready to > False? I suspect this is the case, in which case as Jean-Pierre suggests, > this only works for a single reader + single writer scenario. Yes, indeed. So, I messed up the coding by inverting the two assignments. But the point remains that code like that (without stupid coding mistakes) does work for the one-to-one purpose intended, and that the automotive community has traditionally not been using higher-level constructs. Ramming them down their throat will not help. We need to allow such code without pooh-poohing it. That was the purpose of the example. **************************************************************** From: Randy Brukardt Sent: Sunday, July 1, 2018 9:46 PM But are these sorts of organizations really customers for Ada? You can write low-level code (that may or may not work) in any language -- the benefits of Ada aren't really available for such code in any case. To get the benefits of Ada, one has to write higher-level code in a different style than they're accustomed to. Is there any benefit to advertising that Ada is no more safe than any other language? If some organization wants to write unchecked Ada code, they can by setting the policy appropriately. They won't get any benefit from using Ada this way, but perhaps they'll find some benefit in other existing parts of Ada. 
For the rest of us, who aren't capable of correctly writing tasking communication using idioms (which apparently includes you :-), there is a default safe mode. Seems like a win-win to me. (And maybe the "automotive community" will learn something about structuring multi-threaded code. But I'm skeptical.) **************************************************************** From: Tucker Taft Sent: Monday, July 2, 2018 2:39 PM I have lost the thread here, I fear, but I believe it is important that Ada provide low-level coding primitives, including unsafe ones (at least under some kind of "Unchecked" banner), because it is much more painful to be forced to write such low-level stuff in some other language (e.g. C) when most of the system is written in Ada, since you then end up having to worry about multiple compilers, potentially incompatible calling conventions, etc. Ada certainly has better-than-average "import" and "export" capabilities, but having to get multiple compilers working smoothly together, potentially across multiple targets, with your build system and run-time libraries, can be a real pain, and they will often be upgraded on different schedules, etc. Clearly smart users will minimize and isolate their use of low-level coding, which is simpler in Ada due to its good modularization and information hiding, but we shoot ourselves in the foot if we get too insistent that all portable Ada code must be completely safe and high level. **************************************************************** From: Erhard Ploedereder Sent: Tuesday, July 3, 2018 6:23 AM Exactly my point and view. My mails are/were about the possible consequence: "If you want your "unsafe" program compiled, you must delete the Global specs, or repent and use the "safe" concepts instead." I believe we resolved this in Lisbon in principle by adding a "do not check race conds based on Global specs"-option. (The examples were written prior to that resolution.) 
**************************************************************** From: Peter Chapin Sent: Tuesday, July 3, 2018 12:06 PM I don't have any experience in the automotive industry, but I have to say that it disturbs me a little to learn that automotive software developers routinely use such low-level methods to manage multi-tasking! **************************************************************** From: Jeff Cousins Sent: Saturday, July 7, 2018 4:15 PM Some minor comments that I didn’t raise at the meeting as the AI was going back for a re-write anyway... !proposal Missing new-lines before the first two goals. !wording There is some over-zealous changing of singular to plural: Modify 5.5.2(7/5) last two sentences The loop parameters may be plural but the subtype is still singular. Modify AARM 5.5.2(8.a/5) last sentence The loop parameters may be plural but the accessibility is still singular. Modify 5.5.2(8/3) first and last sentences I would prefer to change “constant” to a plural to keep the word as a noun rather than changing to using it as an adjective. Modify A.18.2(230.2/3) Modify A.18.2(230.4/3) Modify A.18.3(144.2/3) Modify A.18.3(144.4/3) Modify A.18.6(61.2/3) Modify A.18.6(94.2/3) Modify A.18.6(94.4/3) Modify A.18.8(85.2/3) Modify A.18.9(113.2/3) Modify A.18.9(113.4/3) Modify A.18.10(157/4) Modify A.18.10(159/4) I would prefer to change “a loop parameter” to “the loop parameters” rather than “loop parameters”. !discussion 2nd para – capitalise the first letter. ***************************************************************