CVS difference for ai12s/ai12-0234-1.txt

Differences between 1.3 and version 1.4
Log of other versions for file ai12s/ai12-0234-1.txt

--- ai12s/ai12-0234-1.txt	2018/03/30 07:55:08	1.3
+++ ai12s/ai12-0234-1.txt	2018/04/14 04:51:07	1.4
@@ -911,3 +911,724 @@
 to find anything that explains why.
 
 ****************************************************************
+
+From: Erhard Ploedereder
+Sent: Friday, March 30, 2018  5:11 PM
+
+One of the best references is
+Alan Burns, Andy Wellings: Real-Time Systems and Programming Languages,
+Addison-Wesley Longman, ISBN: 978-0-321-41745-9
+
+
+Here is a canonical scenario for a priority inversion:
+
+4 tasks T1 - T4, numbers indicate priority. High number = high priority.
+The tasks are periodic, i.e., released at some time intervals.
+
+T1 runs, grabs said lock to perform the software-emulated compare-and-swap,
+but before it is done with it, T4 is released by time-trigger, preempts T1, 
+runs a bit - meanwhile T2 and T3 are released, too.
+
+Then T4 asks for the same lock, but can't get it. Shucks. Needs to wait until
+T1 finishes with the CAS. But T2 and T3 have higher priority than T1. So, T1 
+needs to wait until T2 and T3 have finished their work. Then, much later, T1 
+gets to run, completes the CAS and releases the lock.
+Then, finally, T4 gets to do its CAS.
+
+Now, there was a reason for T4 having high priority, namely, the fact that it 
+has the tightest deadline (a general principle in fixed-priority embedded 
+scheduling, known to be optimal). Which is likely to be long past in the 
+scenario above.
+
+If T4 controls the brakes in your car, you no longer perceive this as being 
+merely a performance issue. Dead people do not reflect on such things any 
+more.
+
+You just saw a priority inversion in action (named so, because T4 behaves for 
+a while as if it had lowest priority 1). There are scheduling schemes that 
+avoid priority inversion, but only if the locks are a concept provided by the
+run-time system and well understood by the scheduler  (ICPP, OCPP, Priority 
+Inheritance, ... Deadline-floor protocol, etc.)
+
+You can't find these buggers by testing, because they are highly intermittent, 
+i.e., things need to happen at just the right time to cause the priority 
+inversion.
+
+CAS and friends in the ISA use the synchronization of the memory bus over each 
+memory access instruction to avoid the need for a lock to make the operation 
+atomic, even in the case of multicore.
+
+What makes them dangerous is when users apply them to build their own locks to
+protect some data, because these locks are then unknown to the scheduler. => 
+severe risk of priority inversions, if these locks cause waits.
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Friday, March 30, 2018  5:58 PM
+
+> Here is a canonical scenario for a priority inversion:
+> 
+> 4 tasks T1 - T4, numbers indicate priority. High number = high 
+> priority.
+> The tasks are periodic, i.e., released at some time intervals.
+> 
+> T1 runs, grabs said lock to perform the software-emulated 
+> compare-and-swap, but before it is done with it, T4 is released by 
+> time-trigger, preempts T1, runs a bit - meanwhile
+> T2 and T3 are released, too.
+> 
+> Then T4 asks for the same lock, but can't get it. Shucks. 
+> Needs to wait until T1 finishes with the CAS. But T2 and T3 have 
+> higher priority than T1. So, T1 needs to wait until T2 and T3 have 
+> finished their work. Then, much later, T1 gets to run, completes the 
+> CAS and releases the lock.
+> Then, finally, T4 gets to do its CAS.
+
+Thanks. It's clear the problem here is the fact that T1 gets preempted (I knew
+there was a reason I dislike preemption :-).
+
+I also note that this doesn't happen if the lock is part of a protected object,
+as a protected action can't be preempted (caused via ceiling priority or 
+whatever) unless no higher priority task can use it.
+
+> Now, there was a reason for T4 having high priority, namely, the fact 
+> that it has the tightest deadline (a general principle in 
+> fixed-priority embedded scheduling, known to be optimal). Which is 
+> likely to be long past in the scenario above.
+> 
+> If T4 controls the brakes in your car, you no longer perceive this as 
+> being merely a performance issue. Dead people do not reflect on such 
+> things any more.
+
+This is of course why I want checking on the introduction of parallel 
+execution. Mere mortals cannot see these sorts of issues; the easier it is to
+introduce parallelism, the more likely it is for these sorts of effects to 
+occur. (I'm happy to have such checking turned off by experts; it necessarily 
+has to be quite conservative and it wouldn't do to force many things to be 
+written as tasks -- which are even less structured.)
+ 
+> You just saw a priority inversion in action (named so, because T4 
+> behaves for a while as if it had lowest priority 1).
+> There are scheduling schemes that avoid priority inversion, but only 
+> if the locks are a concept provided by the run-time system and well 
+> understood by the scheduler  (ICPP, OCPP, Priority Inheritance, ...
+> Deadline-floor protocol, etc.)
+> 
+> You can't find these buggers by testing, because they are highly 
+> intermittent, i.e., things need to happen at just the right time to 
+> cause the priority inversion.
+
+Right. Tasking issues in general are impossible to find, because of that fact
+-- even if you get them to happen, you can't reproduce them. I seriously have
+no idea how people do that -- even debugging Janus/Ada's cooperative 
+multitasking is very difficult -- and it's repeatable if you can get rid of
+any timing effects.
+
+> CAS and friends in the ISA use the synchronization of the memory bus 
+> over each memory access instruction to avoid the need for a lock to 
+> make the operation atomic, even in the case of multicore.
+> 
+> What makes them dangerous is when users apply them to build their own 
+> locks to protect some data, because these locks are then unknown to 
+> the scheduler. => severe risk of priority inversions, if these locks 
+> cause waits.
+
+Makes sense. This suggests that you would prefer that anyone that needs 
+portable synchronization avoid atomic objects altogether (one presumes that
+the compiler has selected an implementation [of protected objects - ED] known
+to the scheduler and/or truly atomic -- if the compiler implementer is 
+clueless to these issues you  have no hope anyway). Is that a fair conclusion??
+
+I'm primarily interested here in the effect on "checked" synchronization for
+parallel execution. That needs to be defined so that any moderately 
+competent Ada programmer can do the right thing. Since "parallel" is often 
+used as an optimization, it will often be introduced after the fact, so the 
+only thing preventing problems is the checking.
+
+****************************************************************
+
+From: Erhard Ploedereder
+Sent: Friday, March 30, 2018  6:59 PM
+
+> I also note that this doesn't happen if the lock is part of a 
+> protected object, as a protected action can't be preempted (caused via 
+> ceiling priority or whatever) unless no higher priority task can use it.
+
+True only under a scheduling based on ceiling protocols. Under "plain"
+fixed-priority preemptive scheduling or even with priority inheritance, the 
+preemption can happen.
+
+****************************************************************
+
+From: Edward Fish
+Sent: Thursday, March 29, 2018  10:16 PM
+
+> This is of course why I want checking on the introduction of parallel
+> execution.
+
+But the issue here (preemption of execution) is purely a sequential issue: 
+this is to say, if you have Task-1 and Task-2 where Task-2 is executing and 
+there's a free processor for Task-1, there is no issue. (This issue w/ locks 
+is something different, at least how I learned it [preemption having to do 
+strictly with execution].)
+
+> Mere mortals cannot see these sorts of issues; the easier it is
+> to introduce parallelism, the more likely it is for these sorts of effects
+> to occur. (I'm happy to have such checking turned off by experts; it
+> necessarily has to be quite conservative and it wouldn't do to force many
+> things to be written as tasks -- which are even less structured.)
+
+I was really impressed by the Thesis that I referenced in an earlier email 
+-- "Reducing the cost of real-time software through a cyclic task 
+abstraction for Ada" -- I thought it did a great job with increasing the 
+accuracy of schedulability analysis, at least in theory.
+ 
+> Right. Tasking issues in general are impossible to find, because of that
+> fact -- even if you get them to happen, you can't reproduce them. I
+> seriously have no idea how people do that -- even debugging Janus/Ada's
+> cooperative multitasking is very difficult -- and it's repeatable if you can
+> get rid of any timing effects.
+
+Are they? In the very 'generalest' it may be like the halting-problem and thus
+impossible... but I don't know that that necessarily translates into some 
+usable subset. Just like how just because Ada's generics are not 
+Turing-complete doesn't mean they're unusable. (Indeed, I'd argue that 
+Turing-completeness in a generic- or template-system hurts usability.)
+
+>> CAS and friends in the ISA use the synchronization of the 
+>> memory bus over each memory access instruction to avoid the 
+>> need for a lock to make the operation atomic, even in the 
+>> case of multicore.
+
+>> What makes them dangerous is when users apply them to build 
+>> their own locks to protect some data, because these locks are 
+>> then unknown to the scheduler. => severe risk of priority 
+>> inversions, if these locks cause waits.
+
+> Makes sense. This suggests that you would prefer that anyone that needs
+> portable synchronization avoid atomic objects altogether (one presumes that
+> the compiler has selected an implementation known to the scheduler and/or
+> truly atomic -- if the compiler implementer is clueless to these issues you
+> have no hope anyway). Is that a fair conclusion??
+
+Seems a fair conclusion to me, but the reverse may be interesting: when the 
+synchronization constructs present enough information to the scheduler to 
+make such guarantees -- this honestly seems right up Ada's alley or, if not, 
+certainly SPARK's.
+
+****************************************************************
+
+From: Jean-Pierre Rosen
+Sent: Friday, March 30, 2018  11:50 PM
+
+>> I also note that this doesn't happen if the lock is part of a 
+>> protected object, as a protected action can't be preempted (caused 
+>> via ceiling priority or whatever) unless no higher priority task can use it.
+> 
+> True only under a scheduling based on ceiling protocols. Under "plain"
+> fixed-priority preemptive scheduling or even with priority 
+> inheritance, the preemption can happen.
+> 
+More precisely: preemption can always happen on tasks in protected actions,
+but in the case of the priority ceiling protocol, a task can be preempted 
+only by a task that is not allowed to call the same PO, thus preventing 
+priority inversion.
+
+****************************************************************
+
+From: Erhard Ploedereder
+Sent: Saturday, March 31, 2018  11:57 AM
+
+>> This is of course why I want checking on the introduction of parallel 
+>> execution.
+
+> But the issue here (preemption of execution) is purely a sequential
+> issue: this is to say, if you have Task-1 and Task-2 where
+> Task-2 is executing and there's a free processor for Task-1 there is 
+> no issue. (This issue w/ locks is something different, at least how I 
+> learned it [preemption having to do strictly with execution].)
+
+Yes and No. In your scenario, the inversion goes away. But what about 
+T2 and T3? They would preempt T1 if run on the same core. Welcome back
+to priority inversion for T4. You can get rid of it only if you have as 
+many cores as tasks (not likely), or not use preemptive scheduling (not 
+real-time), or use the scheduling/lock protocol schemes talked about in
+the earlier mail.
+
+****************************************************************
+
+From: Brad Moore
+Sent: Saturday, March 31, 2018  6:58 PM
+
+>> He sure would hate not to be told that his compare-and-swap is not 
+>> mapped to such an operation. If the compiler is mum about this, there 
+>> is a good chance that the issue is not discovered during porting of 
+>> the software.
+> 
+> Compilers that are mum violate the Standard. Unfortunately, there's no 
+> good way to enforce that rule.
+
+This sounds like another case where having a suppressible error might be
+useful. A language-mandated "soft" error sounds like a better way to force a
+compiler not to be mum.
+
+>> So, I would have less problems with a set of procedures that 
+>> encapsulated the particular primitives, combined with a mandate to 
+>> support all that match a bus-synchronized operation and a mandate to 
+>> REJECT calls to any for which said operation does not exist.
+> 
+> The problem is I don't think we can do that in the Standard, since 
+> there is zero chance that we could formally define what a 
+> "bus-synchronized operation" is. And any Legality Rule requires that.
+
+If it were a soft error, would it still be considered a legality rule? If 
+not, maybe it could be added without requiring definitions for such things
+as "bus-synchronized", and perhaps not require an ACATS test.
+
+****************************************************************
+
+From: Brad Moore
+Sent: Saturday, March 31, 2018  7:28 PM
+
+> operation when building lock-free data structures, and it is quite 
+> annoying that this cannot be done portably in Ada.  And yes, this 
+> operation is for experts only, but that doesn't mean such experts don't
+> want to write in Ada, while still achieving some level of portability.
+> 
+> Below is the gcc API, extracted from:
+>   https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
+
+I have been investigating this suggestion to create some sort of Ada interface
+to the gcc API.
+
+> I think we don't need all the options.
+
+By options, I'm not sure if you were referring to memory order options, or 
+primitives.
+
+Here's a quick summary of what I've found so far:
+
+The gcc interface appears to map to the C++11 memory model.
+
+There are basically 3 main flavors of memory ordering supported.
+
+The most restrictive mode is Sequentially Consistent, which means updates are 
+synchronised across all threads, and all threads see the same order of update 
+to atomic variables. I believe this corresponds closely to Ada's definition of
+Volatile and Atomic objects. This is the safest mode, but also the least 
+efficient since it requires a higher level of synchronisation. We could 
+probably support primitives that are in sequentially consistent mode most
+easily, since we mostly already have this, and it is the safest mode with the
+least amount of unexpected behaviours and pitfalls.
+
+The next level down in terms of synchronisation requirements is a mode where 
+only the threads involved in the access to a particular atomic variable are 
+synchronised with each other. This capability, however, affects which 
+optimisations can be applied to the code.
+
+For example, hoisting or sinking of code across boundaries of access to atomic
+variables is disabled. To support this, the compiler needs to be intimately 
+aware of where this is used when it is applying optimisations.
+I am only guessing, but I suspect there may not be enough appetite to 
+introduce such a memory mode if it requires the compiler to mesh well with it.
+
+Below that is a relaxed mode where there is no synchronisation between 
+threads; the only guarantee is that a thread won't see previous values of a
+variable if it has seen a newer value. This is the most efficient mode, but
+also the most dangerous.
+
+Someone who knows what they are doing could in theory use this to write more 
+efficient code. This mode might be easier to implement than the previous mode,
+since it doesn't enforce "happens-before" constraints.  Maybe we could create 
+a new type of Volatile aspect, such as Relaxed_Volatile for this purpose?
+
+The other threads of this email chain seem to be suggesting however that the 
+use of atomic primitives in user code to create lock free abstractions should
+be discouraged, to avoid problems such as priority inversions when the 
+implementation is in software instead of hardware.
+
+This has me thinking that the best option for Ada to add capability in this 
+area may be to go back to the Lock_Free aspect idea.
+
+That way, the implementation of the lock is provided by the compiler 
+implementation, and fits into the synchronisation model we already have for 
+protected objects. The implementation can choose between multiple 
+implementation techniques, such as transactional memory. A protected object 
+also guides the programmer to write code inside the lock that is better
+formed, such as by disallowing potentially blocking operations.
+
+Here is a quote from the introduction of Geert's paper that seems relevant.
+
+"The use of atomic primitives, memory barriers or transactional memory are 
+implementation details after all, that should not appear in actual user 
+code [1]."
+
+Where the reference is:
+
+H.-J. Boehm. Transactional memory should be an implementation technique, not 
+a programming interface.
+In Proceedings of the First USENIX conference on Hot topics in parallelism, 
+HotPar’09, pages 15–15, Berkeley, CA, USA, 2009. USENIX Association.
+
+
+I did go through the exercise of creating an Ada package spec for the functions
+described in Tucker's link and came up with a generic package which I've 
+attached if anyone is interested.
+
+Perhaps many of the primitives would not be needed, such as the arithmetic 
+ones, and the basic load and store routines.
+
+----
+
+atomic_operations.ads
+
+----
+
+with Interfaces;
+
+generic
+   type Atomic_Type is mod <>;
+package Atomic_Operations is
+
+   pragma Assert (Atomic_Type'Size = 1 or else Atomic_Type'Size = 2
+                  or else Atomic_Type'Size = 4 or else Atomic_Type'Size = 8);
+
+   type Memory_Orders is
+     (Relaxed,
+      --  Implies no inter-thread ordering constraints.
+
+      Consume,
+      --  This is currently implemented using the stronger Acquire memory
+      --  order because of a deficiency in C++11's semantics for
+      --  memory_order_consume.
+
+      Acquire,
+      --  Creates an inter-thread happens-before constraint from the Release
+      --  (or stronger) semantic store to this acquire load. Can prevent
+      --  hoisting of code to before the operation.
+      --  Note: This implies the compiler needs to be pretty aware of this
+      --  setting, as it affects optimisation. i.e. Calls that use this order
+      --  are not just regular library calls
+
+      Release,
+      --  Creates an inter-thread happens-before-constraint to acquire (or
+      --  stronger) semantic loads that read from this release store. Can
+      --  prevent sinking of code to after the operation.
+      --  Note: This implies the compiler needs to be pretty aware of this
+      --  setting, as it affects optimisation. i.e. Calls that use this order
+      --  are not just regular library calls
+
+      Acquire_Release,
+      --  Combines the effects of both Acquire and Release.
+      --  Note: This implies the compiler needs to be pretty aware of this
+      --  setting, as it affects optimisation. i.e. Calls that use this order
+      --  are not just regular library calls
+
+      Sequentially_Consistent
+      --  Enforces total ordering with all other Sequentially_Consistent
+      --  operations.  This is basically equivalent to Ada's Volatile and
+      --  Atomic semantics
+     );
+
+   function Atomic_Load
+     (From         : aliased Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic,
+          Pre        => Memory_Order = Relaxed or else
+                        Memory_Order = Acquire or else
+                        Memory_Order = Consume or else
+                        Memory_Order = Sequentially_Consistent;
+   --
+   --  Returns the value of From
+
+   procedure Atomic_Load
+     (From         : aliased Atomic_Type;
+      Result       : out Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+     with Convention => Intrinsic,
+          Pre        => Memory_Order = Relaxed or else
+                        Memory_Order = Acquire or else
+                        Memory_Order = Consume or else
+                        Memory_Order = Sequentially_Consistent;
+   --
+   --  Returns the value of From into Result.
+
+   procedure Atomic_Store
+     (Into         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+       with Convention => Intrinsic,
+            Pre        => Memory_Order = Relaxed or else
+                          Memory_Order = Release or else
+                          Memory_Order = Sequentially_Consistent;
+   --
+   --  Writes Value into Into
+
+   function Atomic_Exchange
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+     return Atomic_Type
+     with Convention => Intrinsic,
+          Pre        => Memory_Order = Relaxed or else
+                        Memory_Order = Acquire or else
+                        Memory_Order = Release or else
+                        Memory_Order = Acquire_Release or else
+                        Memory_Order = Sequentially_Consistent;
+   --
+   --  Writes Value into Item, and returns the previous value of Item.
+
+   procedure Atomic_Exchange
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Result       : out Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+     with Convention => Intrinsic,
+          Pre        => Memory_Order = Relaxed or else
+                        Memory_Order = Acquire or else
+                        Memory_Order = Release or else
+                        Memory_Order = Acquire_Release or else
+                        Memory_Order = Sequentially_Consistent;
+   --
+   --  Stores the value of Value into Item. The original value of Item is
+   --  copied into Result.
+
+   function Atomic_Compare_And_Exchange
+     (Item                 : aliased in out Atomic_Type;
+      Expected             : aliased Atomic_Type;
+      Desired              : Atomic_Type;
+      Weak                 : Boolean;
+      Success_Memory_Order : Memory_Orders;
+      Failure_Memory_Order : Memory_Orders) return Boolean
+     with Convention => Intrinsic,
+          Pre        => Failure_Memory_Order /= Release and then
+                        Failure_Memory_Order /= Acquire_Release and then
+                        Failure_Memory_Order <= Success_Memory_Order;
+   --
+   --  Compares the value of Item with the value of Expected. If equal, the
+   --  operation is a read-modify-write operation that writes Desired into
+   --  Item. If they are not equal, the operation is a read and the current
+   --  contents of Item are written into Expected.
+   --  Weak is true for weak compare_and_exchange, which may fail spuriously,
+   --  and false for the strong variation, which never fails spuriously. Many
+   --  targets only offer the strong variation and ignore the parameter.
+   --  When in doubt, use the strong variation.
+   --
+   --  If desired is written into Item then True is returned and memory is
+   --  affected according to the memory order specified by
+   --  Success_Memory_Order. There are no restrictions on what memory order can
+   --  be used here.
+   --
+   --  Otherwise, False is returned and memory is affected according to
+   --  Failure_Memory_Order.
+
+   --------------------------------------------------------------------
+
+   --------------------------
+   --  Arithmetic Operations
+   --
+   --  The following functions perform the operation suggested by the name,
+   --  and return the result of the operation.
+   --
+   -- i.e. Item := Item op Value; return Item;
+   --
+   --  All memory orders are valid.
+
+   function Atomic_Add_And_Fetch
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic;
+
+   function Atomic_Subtract_And_Fetch
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic;
+
+   function Atomic_Bitwise_And_And_Fetch
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic;
+
+   function Atomic_Bitwise_Or_And_Fetch
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic;
+
+   function Atomic_Xor_And_Fetch
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic;
+
+   function Atomic_Nand_And_Fetch
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic;
+
+   -------------------------------------------------------------------------
+   --  The following functions perform the operation suggested by the name,
+   --  and return the value that had previously been in Item.
+   --
+   --  i.e.  Tmp := Item; Item := Item op Value; return Tmp;
+   --
+   --  All memory orders are valid.
+   ------------------------------------------------------------------------
+
+   function Atomic_Fetch_And_Add
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic;
+
+   function Atomic_Fetch_And_Subtract
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic;
+
+   function Atomic_Fetch_And_Bitwise_And
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic;
+
+   function Atomic_Fetch_And_Bitwise_Or
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic;
+
+   function Atomic_Fetch_And_Xor
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic;
+
+   function Atomic_Fetch_And_Nand
+     (Item         : aliased in out Atomic_Type;
+      Value        : Atomic_Type;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Atomic_Type
+     with Convention => Intrinsic;
+
+   function Atomic_Test_And_Set
+     (Item         : aliased in out Interfaces.Unsigned_8;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+      return Boolean
+     with Convention => Intrinsic;
+   --
+   --  Performs an atomic test-and-set operation on Item. Item is set to some
+   --  implementation defined nonzero "set" value and the return value is true
+   --  if and only if the previous contents were "set".
+   --  All memory orders are valid.
+
+   procedure Atomic_Clear
+     (Item         : aliased in out Interfaces.Unsigned_8;
+      Memory_Order : Memory_Orders := Sequentially_Consistent)
+     with Convention => Intrinsic,
+          Pre        => Memory_Order = Relaxed or else
+                        Memory_Order = Release or else
+                        Memory_Order = Sequentially_Consistent;
+   --  Performs an atomic clear operation on Item. After the operation, Item
+   --  contains 0. This call should be used in conjunction with
+   --  Atomic_Test_And_Set.
+
+   procedure Atomic_Thread_Fence
+     (Memory_Order : Memory_Orders := Sequentially_Consistent)
+     with Convention => Intrinsic;
+   --  This procedure acts as a synchronization fence between threads based on
+   --  the specified memory order. All memory orders are valid.
+
+   procedure Atomic_Signal_Fence
+     (Memory_Order : Memory_Orders := Sequentially_Consistent)
+     with Convention => Intrinsic;
+   --  This procedure acts as a synchronization fence between a thread and
+   --  signal handlers in the same thread. All memory orders are valid.
+
+   function Atomic_Always_Lock_Free return Boolean
+     with Convention => Intrinsic;
+   --  Returns True if objects always generate lock-free atomic instructions
+   --  for the target architecture.
+
+   function Atomic_Always_Lock_Free
+     (Item : aliased Atomic_Type) return Boolean
+     with Convention => Intrinsic;
+   --  Returns True if objects always generate lock-free atomic instructions
+   --  for the target architecture. Item may be used to determine alignment.
+   --  The compiler may also ignore this parameter.
+
+   function Atomic_Is_Lock_Free
+     (Item : aliased Atomic_Type) return Boolean
+     with Convention => Intrinsic;
+   --
+   --  This function returns True if objects always generate lock-free atomic
+   --  instructions for the target architecture. If the intrinsic function is
+   --  not known to be lock-free, a call is made to a runtime routine to
+   --  determine the answer. Item may be used to determine alignment.
+   --  The compiler may also ignore this parameter.
+
+end Atomic_Operations;
+
+****************************************************************
+
+From: Erhard Ploedereder
+Sent: Saturday, March 31, 2018  7:43 PM
+
+>> So, I would have less problems with a set of procedures that 
+>> encapsulated the particular primitives, combined with a mandate to 
+>> support all that match a bus-synchronized operation and a mandate to 
+>> REJECT calls to any for which said operation does not exist.
+> The problem is I don't think we can do that in the Standard, since 
+> there is zero chance that we could formally define what a 
+> "bus-synchronized operation" is. And any Legality Rule requires that.
+
+Not so difficult:
+
+"A call on xyz is illegal if the compiler does not map it to a single 
+atomic instruction of the target architecture." (Or rewrite this as an 
+implementation requirement, with an "otherwise produce an 
+error/warning/diagnostic message" clause.)
+
+Incidentally, C.1(11) already has the recommendation that intrinsic 
+subprograms for the set of these atomic memory operations be provided.
+(Atomic increment is one of them - this was mentioned in another mail as 
+being useful when available.) The way C.1 is written, one would expect the 
+respective target-specific subset of atomic memory operations. This would be 
+a good place to standardize their signatures and be done with the AI.
+
+****************************************************************
+
+From: Tucker Taft
+Sent: Sunday, April 1, 2018  4:39 PM
+
+I don't think we should delve into the various memory ordering options.  It 
+just seems like overkill given how rarely these will be used.  Remember C++ 
+doesn't have an equivalent to the protected object (at least not yet! -- never
+say never in C++ ;-).  So we just need to address the main usage, I believe.
+So I would go with Sequentially Consistent, and those who are desperate for 
+some looser synchronization can write their own primitives, or pay their 
+vendor to provide it.
+
+Standardizing the Lock_Free aspect might be worth doing, but I don't believe 
+that addresses the main goal here, which is to provide atomic operations on 
+atomic objects.
+
+****************************************************************

Questions? Ask the ACAA Technical Agent