Version 1.7 of ai05s/ai05-0117-1.txt


!standard C.6(16)          10-11-23 AI05-0117-1/03
!class Amendment 08-10-16
!status Amendment 2012 10-11-23
!status ARG Approved 10-0-1 10-10-31
!status work item 08-10-16
!status received 08-07-29
!priority Medium
!difficulty Hard
!subject Memory barriers and Volatile objects
!summary
Pragma Volatile has its implementation advice changed to ensure that it supports the required efficient serialisation on multiprocessor platforms.
!problem
Some lock-free and wait-free algorithms rely on a specific order of loads and stores of some variables. However, modern multi-core architectures do not guarantee this ordering between processors unless special instructions are used. Ada should require some behavior which will allow these algorithms to be programmed in a straightforward manner.
!proposal
Weaken the definition of Volatile so that it provides the required ordering but does not require all reads and writes to be performed to actual memory locations.
!wording
Replace C.6(16) with:
All tasks of the program (on all processors) that read or update volatile variables see the same order of updates to the variables.
Replace Implementation Note C.6(16.a) with:
AARM Implementation Note: To ensure this, on a multiprocessor, any read or update of a volatile object should involve the use of an appropriate memory barrier.
!discussion
After considering a new pragma (Coherent), it was decided to modify the current definition of pragma Volatile to remove the requirement that caches etc. cannot be used with volatile variables. This brings the Ada definition of volatile closer to that used in other languages.
The original Paragraph C.6(16) is removed as 1.1.3 (13) already makes it clear that any read or update of a volatile object has an external effect; beyond that it is overspecification. It is replaced by a new paragraph that explains the requirements rather than an implementation.
!example
The following will ensure that task 2 does get the value 42.
   Data : Integer;
   pragma Volatile (Data);
   Flag : Boolean := False;
   pragma Volatile (Flag);
in task 1:
   Data := 42;
   Flag := True;
in task 2:
   loop
      exit when Flag;
   end loop;
   Copy := Data;
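A minimal sketch of the example as a complete program, assuming an implementation that follows the revised C.6(16). The task names Writer and Reader, the procedure name Handoff, and the initial value of Data are illustrative and not part of the wording.

```ada
with Ada.Text_IO;

procedure Handoff is

   Data : Integer := 0;
   pragma Volatile (Data);
   Flag : Boolean := False;
   pragma Volatile (Flag);

   task Writer;   --  plays the role of "task 1" in the example
   task Reader;   --  plays the role of "task 2" in the example

   task body Writer is
   begin
      Data := 42;      --  update the payload first ...
      Flag := True;    --  ... then signal that it is ready
   end Writer;

   task body Reader is
      Copy : Integer;
   begin
      loop
         exit when Flag;   --  spin until the writer signals
      end loop;
      Copy := Data;        --  must observe the update made before Flag
      Ada.Text_IO.Put_Line ("Copy =" & Integer'Image (Copy));
   end Reader;

begin
   null;   --  the main procedure waits here until both tasks terminate
end Handoff;
```

The example relies only on the ordering of the two updates as seen by the reader; nothing in it requires the accesses to bypass a cache.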
!corrigendum C.6(16)
Replace the paragraph:
For a volatile object all reads and updates of the object as a whole are performed directly to memory.
by:
All tasks of the program (on all processors) that read or update volatile variables see the same order of updates to the variables.
!ACATS test
This is probably not testable within Ada, beyond the very simple tests already within the suite. The absence of an error in a test program doesn't tell much about whether the implementation is correct, since there is a very small window for an error compared to a much larger window for correct behavior.
!ASIS
This has no impact on ASIS.
!appendix

!topic Memory barriers and pragma Atomic / Volatile
!reference Ada 2005 RM RM95-1.1.3(13)
!from Santiago Urueña-Pascual 2008-07-28
!keywords sequential consistency, relaxed consistency, transactional memory
!discussion

Some lock-free and wait-free algorithms rely on a specific order of
loads and stores of some variables:

Data : Integer_64;
pragma Volatile (Data);

Flag : Boolean;
pragma Atomic (Flag);


task 1:

 Data := ...;
 Flag := True;


task 2:

 loop
   exit when Flag;
 end loop;
 Copy := Data;


This code is guaranteed to work on a single processor. However, in a
multiprocessor, virtually all modern CPU architectures allow a relaxed
memory consistency model, a microarchitecture optimization to reorder
internally some loads and stores.[1] For example, in the CPU executing
task 1 the second assignment could be written to L1 cache before the
first assignment. The CPU views internally its own execution sequentially,
but other CPUs of the system could view a different order, and thus
task 2 could read an obsolete value of 'Data'.

Those architectures provide specific machine instructions to disable
that optimization, called memory barriers or memory fences.[2][3] For
example, inserting an adequate memory barrier between the two assignments
of task 1 and before reading 'Data' in task 2 means that the programmer
explicitly needs a specific order, at the cost of reduced performance.
It is worth noting that not all lock-free or wait-free algorithms require
memory barriers (highly architecture dependent).
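
A minimal sketch of that barrier placement, assuming a hypothetical Machine_Code package with Write_Memory_Fence and Read_Memory_Fence procedures. Their bodies are empty placeholders here; a real port would emit the target's barrier instruction, for example through machine-code insertions. Long_Long_Integer stands in for the Integer_64 type used above.

```ada
with Ada.Text_IO;

procedure Fence_Placement is

   --  Hypothetical barrier API; the bodies below do nothing.
   package Machine_Code is
      procedure Write_Memory_Fence;
      procedure Read_Memory_Fence;
   end Machine_Code;

   package body Machine_Code is
      procedure Write_Memory_Fence is
      begin
         null;   --  placeholder for the target's store barrier
      end Write_Memory_Fence;

      procedure Read_Memory_Fence is
      begin
         null;   --  placeholder for the target's load barrier
      end Read_Memory_Fence;
   end Machine_Code;

   Data : Long_Long_Integer := 0;
   pragma Volatile (Data);
   Flag : Boolean := False;
   pragma Atomic (Flag);

   task Task_1;
   task Task_2;

   task body Task_1 is
   begin
      Data := 42;
      Machine_Code.Write_Memory_Fence;   --  Data must be visible before Flag
      Flag := True;
   end Task_1;

   task body Task_2 is
      Copy : Long_Long_Integer;
   begin
      loop
         exit when Flag;
      end loop;
      Machine_Code.Read_Memory_Fence;    --  do not read Data ahead of Flag
      Copy := Data;
      Ada.Text_IO.Put_Line ("Copy =" & Long_Long_Integer'Image (Copy));
   end Task_2;

begin
   null;   --  wait for both tasks to terminate
end Fence_Placement;
```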

It seems that the Ada Reference Manual requires that the order of any read
or update of an Atomic or Volatile object is guaranteed. That means that
memory barriers must be inserted by the compiler. Is that intended?

Current programming languages have taken different approaches. For example,
the C keyword 'volatile' just disables some compiler optimizations,[4] so
either the programmer should use the adequate OS synchronization API, like
POSIX mutex, or insert the adequate memory barriers (e.g. when programming
a specific spinlock implementation). It seems that future versions of C++
will provide sequential consistency by default, but a new low-level
mechanism is added for expert programmers mastering the relaxed consistency
model.[5]

Some projects, like LLVM or the Linux kernel, provide a portable API for
memory barriers:

package Machine_Code is
   procedure Memory_Fence;         -- x86: mfence   Alpha: mb   PowerPC: sync
   procedure Read_Memory_Fence;    -- x86: lfence   Alpha: mb   PowerPC: sync
   procedure Write_Memory_Fence;   -- x86: sfence   Alpha: wmb  PowerPC: sync
   procedure IO_Memory_Fence;      -- PowerPC: eieio

   procedure Read_Dependence_Memory_Fence;  -- Alpha only: mb
end;

It seems impossible for the compiler to decide whether a memory barrier is
actually needed, depending on the semantics of the specific algorithm, so
probably a brute force approach is required (inserting memory barriers before
and after all volatile accesses, even if not required) even if this kills
performance in current CPUs. It should be noted that there are some
architectures that require a different type of memory barrier for hardware
devices.

However, a sophisticated compiler could try to generate code for some protected
objects using no locks, just atomic instructions. In this case, the compiler
could probably be able to analyse all the protected subprograms, and decide
whether memory barrier instructions are required (and, if needed, just the
minimum number of them). It is also worth noting that there is a separate
paradigm in modern architectures called 'transactional memory'.[6] A
sophisticated compiler could also generate code for protected objects without
any lock, with mutual exclusion directly guaranteed by this very new hardware.
But, does the Reference Manual allow these advanced protected object
implementations?

== References ==

[1] Shared Memory Consistency Models: A Tutorial
     -- http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-95-7.pdf
[2] Memory Ordering in Modern Microprocessors, Part I
     -- http://www.linuxjournal.com/article/8211
[3] Memory Ordering in Modern Microprocessors, Part II
     -- http://www.linuxjournal.com/article/8212
[4] volatile -- http://www.airs.com/blog/archives/154
[5] Memory Consistency Models: Convergence at last!
     -- http://www.cse.iitk.ac.in/users/mtworkshop07/workshop/DayIII/adve.sarita.pdf
[6] Sun slots transactional memory into Rock
     -- http://www.theregister.co.uk/2007/08/21/sun_transactional_memory_rock/


****************************************************************

From: Robert A. Duff
Date: Thursday, July 31, 2008  10:34 AM

...
> This code is guaranteed to work on a single processor.

Atomic and volatile are hard to reason about, but I think the above is
guaranteed to work, no matter how many processors.

Ada doesn't even have a built-in way to query or control the number of
processors.

...
> It seems that the Ada Reference Manual requires that the order of any
> read or update of an Atomic or Volatile object is guaranteed. That
> means that memory barriers must be inserted by the compiler. Is that intended?

I think so.

...
> It seems impossible for the compiler to decide whether a memory
> barrier is actually needed, depending on the semantics of the specific
> algorithm, so probably a brute force approach is required (inserting
> memory barriers before and after all volatile accesses, even if not
> required) even if this kills performance in current CPUs. It should be
> noted that there are some architectures that require a different type
> of memory barrier for hardware devices.

I think any interfacing to hardware devices has to be implementation dependent.

> However, a sophisticated compiler could try to generate code for some
> protected objects using no locks, just atomic instructions. In this
> case, the compiler could probably be able to analyse all the protected
> subprograms, and decide whether memory barrier instructions are
> required (and, if needed, just the minimum number of them). It is also
> worth noting that there is a separate paradigm in modern architectures
> called 'transactional memory'.[6] A sophisticated compiler could also
> generate code for protected objects without any lock, with mutual
> exclusion directly guaranteed by this very new hardware. But, does the
> Reference Manual allow these advanced protected object implementations?

I think so.

> == References ==

Thanks for the interesting refs.

****************************************************************

From: Alan Burns
Date: Thursday, February 18, 2010  7:15 AM

I was charged with looking at AI-117 - Memory barriers and Volatile objects

AI05-0117 asks the question - on a modern multicore architecture can lock-free
and wait-free algorithms be implemented in a straightforward way using the
current provisions of Ada?

My opinion is a guarded yes. But a weaker form of Volatile would be useful.

Lock-free and wait-free algorithms do not use protected types or other forms of
synchronisation but employ shared variables.

To work correctly these shared variables must not have their load and store
instructions reordered. Reordering, in general, is however used extensively to
improve performance. It is not however necessary for load and store operations
to be performed directly on memory as any intermediate cache will ensure
consistency. Reordering can be the result of compiler or hardware optimisations.

Hardware provides memory barriers (MBs) to prevent reordering.
Different forms of MBs are supported on different hardware. There are readMBs
and writeMBs - the readMB being more efficient if the shared variable is only
read.

Ada provides Volatile and Atomic pragmas:

"For a volatile object all reads and updates of the object as a whole are
performed directly to memory.

Implementation Note: This precludes any use of register temporaries, caches, and
other similar optimizations for that object."

It is clear that this will prevent reordering.

So, if the compiler controls its own reordering and places MB instructions after
each access to a volatile object the correct behaviour will ensue.

There is a question of efficiency. Most lock-free and wait-free algorithms make
use of only a few shared variables, and local variables can be used where
necessary to reduce the actual updates to the volatile ones. The only
inefficiency would therefore seem to be that Volatile prohibits the use of
caching, whereas this is not strictly required for these algorithms. A new
pragma, No_Reordering, could be provided that is weaker than Volatile.

The AI noted the possibility of providing an API to the operations on the MBs.
This is not recommended as there is no single set of operations; hardware
solutions differ. Other languages provide various forms of volatile (eg C, C++
and Java).

****************************************************************

From: Alan Burns
Date: Friday, April 23, 2010  2:37 AM

A question about where to place the definition of the pragma.

To recap - the point of this pragma is to ensure that the compiler/processor
does not reorder memory accesses (so that various useful algorithms will work on
multiprocessors).

This seems easy to ensure by defining any read or write on a coherent object to
be an 'external effect' (1.1.3(13)). This with 1.1.3(15) about ordering would
seem to be sufficient.

My reading of this is that if the program needs to ensure that:
write(x);
write(y);
does indeed occur in that order then BOTH x and y would need to be defined as
coherent - agree? or is it sufficient just to define x as coherent?

This pragma could be defined in C.6 with atomic and volatile, but to give it
equal status would require considerable rewriting. But as coherence is only
really needed with multiprocessor systems then I feel it would be easier to
define it in the new Multiprocessor section of the real-time annex - agree?

****************************************************************

From: Jean-Pierre Rosen
Date: Friday, April 23, 2010  2:53 AM

> This seems easy to ensure by defining any read or write on a coherent
> object to be an 'external effect' (1.1.3(13)).

Why isn't volatile sufficient (it is an external effect for volatile)?

****************************************************************

From: Alan Burns
Date: Friday, April 23, 2010  4:25 AM

This was discussed at last meeting - volatile is too strong, yes it gives what
is required, but it also requires all writes to be direct to memory (effectively
bypassing cache). This is not necessary as cache coherence is fine, the key is
that the compiler does not reorder. See discussion in AI.

****************************************************************

From: Robert Dewar
Date: Friday, April 23, 2010  11:49 AM

> A question about where to place the definition of the pragma.

I definitely think this pragma must be optional, i.e. the compiler is free to
reject it if it cannot be guaranteed, like pragma Atomic. I am pretty sure we
would just reject it all the time. Because we have no way of constraining the
code generator in this respect.

****************************************************************

From: Robert Dewar
Date: Friday, April 23, 2010  11:50 AM

> This was discussed at last meeting - volatile is too strong, yes it
> gives what is required, but it also requires all writes to be direct
> to memory (effectively bypassing cache). This is not necessary as
> cache coherence is fine, the key is that the compiler does not
> reorder. See discussion

Ah, OK, so in fact this is easy to implement, we just treat it as Volatile, so I
withdraw my objection.

****************************************************************

From: Randy Brukardt
Date: Friday, April 23, 2010  1:01 PM

Right, that works, but as we've determined, that requires generating memory
barriers in many places for multicore/multiprocessor architectures. Which is a
real performance drag.

I don't know about GNAT specifically, but I would guess that what the majority
of compilers have been calling "Volatile" is really "Coherent". So I think you
are right that Coherent can be implemented as (current) Volatile, but Volatile
will often need more work to be implemented correctly. Of course, that depends
on whether customers care; I don't think there is any practical way for portable
tests (like the ACATS) to verify if this is done properly -- it requires
examining the emitted code for each architecture. So customer demand will be the
only incentive to implement Volatile correctly.

****************************************************************

From: Robert Dewar
Date: Saturday, April 24, 2010  11:47 AM

To me, if the Ada notion of volatile does not match C's notion of volatile then

a) it is junk

b) GNAT will not implement it, regardless of customer demand!

****************************************************************

From: Bob Duff
Date: Friday, April 23, 2010  9:09 PM

I'm not a C language lawyer, but my impression is that nobody really understands
precisely what "volatile" in C means. And I've heard that the committee
deliberately left it vague, so implementations can do more-or-less what they
like.

I also read a paper where they tested various compilers, and found that nobody
implements volatile in C properly (for whatever semantics the authors of that
paper think is "proper").

I had a conversation with somebody who was griping about the lack of clarity in
the C standard w.r.t. volatile, and I pointed him to the Ada definition, and I
think he said the Ada version was much superior.

The C definition has something to do with setjmp/longjmp (in part).  And C
doesn't have threads.

This is all rumor and hearsay -- sorry.

****************************************************************

From: Robert Dewar
Date: Sunday, April 25, 2010  4:56 AM

In practice GNAT does (and will for ever) give the same semantics for Volatile
that gcc does. People use volatile all the time both in GNAT and GNU C, without
any problems or reports of difficulty for 15 years, so I don't see that there is
a problem here that needs solving, or that a more precise solution will be of
any use.

I am not aware at all of the confusion over volatile of which Bob Duff presents
rumor and hearsay :-)

However, I am not concerned too much, sounds like 10 mins work to add pragma
Coherent and make it mean the same as Volatile.

****************************************************************

From: Bob Duff
Date: Sunday, April 25, 2010  8:14 AM

> I am not aware at all of the confusion over volatile of which Bob Duff
> presents rumor and hearsay :-)

The paper I mentioned is here:

    http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf

    "Volatiles Are Miscompiled, and What to Do about It"

The rest of what I said remains rumor and hearsay.

****************************************************************

From: Alan Burns
Date: Monday, April 26, 2010  5:21 AM

> However, I am not concerned too much, sounds like 10 mins work to add
> pragma Coherent and make it mean the same as Volatile.

Just to recap - leaving aside 'atomic' there are two properties that 'volatile'
is aimed at:

1 - all writes/reads go directly to the memory location - needed (only) when the
memory location is a register for some external device.

2 - control reordering (bound what reordering can take place due to compiler or
processor behaviour).

Not sure what GNAT does, but C seems to be aimed more at the second property.
Ada's 'volatile' is concerned with the first property. So GNAT C may not give
Ada's semantics - it will ensure no reordering but will leave it to memory
manager to decide when, for example, values can just stay in cache.

With multicore there is a real need to get control over reordering. Hence the
introduction, in Ada, of Coherent - this would be easy to implement as it just
requires the insertion of calls to a memory barrier which all such hardware
seems to support.

So I would guess that GNAT 'Volatile' implements Ada's new 'Coherent'.
But this leaves open the question of how to get Ada's Volatile.

I'm told that Java's volatile is defined in terms of Lamport's 'happens before'
operation - hence it is concerned with ordering.

PS the discussions did not however tell me where to define Coherent?

****************************************************************

From: Robert Dewar
Date: Monday, April 26, 2010  12:01 PM

> 1 - all writes/reads go directly to the memory location - needed (only)
> when the memory location is a register for some external device.

it is completely bogus to use pragma Volatile for this purpose, since in
practice you will be making the non-portable assumption that a load or store
uses a single appropriate machine instruction. Even the use of Atomic is dubious
for this purpose! Device registers need to be accessed with very specific
instructions. The ONLY legitimate way to ensure this is to write the appropriate
ASM inserts. In practice Atomic may work fine for most cases, but I would NEVER
use Volatile for this purpose.

For example, if you have a 32-bit hardware register, and you set one bit, then
several sequences of instructions are possible on a 386 (use bit instructions,
use load/store, load/store byte rather than word etc). The use of Atomic pretty
much guarantees a 32-bit read followed by a 32-bit write. If that's what you
want it will probably work, but you cannot guarantee this from the RM. If you
use Volatile, all bets are off as to the exact instructions generated.
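
A minimal sketch of the Atomic case, assuming a hypothetical 32-bit control register that is modelled here by an ordinary variable; a real driver would bind it to the device address with an address clause. The names Control and Enable_Bit are made up for illustration. Whether each access really becomes a single 32-bit load or store remains implementation dependent, as noted above.

```ada
with Interfaces;

procedure Set_Bit is
   use type Interfaces.Unsigned_32;

   --  Hypothetical device register; an ordinary variable stands in for it.
   Control : Interfaces.Unsigned_32 := 0;
   pragma Atomic (Control);

   Enable_Bit : constant Interfaces.Unsigned_32 := 2#1000#;
   Snapshot   : Interfaces.Unsigned_32;
begin
   Snapshot := Control;                  --  one read of the whole object
   Control  := Snapshot or Enable_Bit;   --  one update of the whole object
   pragma Assert (Control = 2#1000#);    --  only bit 3 is set
end Set_Bit;
```

The read and the update are each indivisible, but the read-modify-write sequence as a whole is not.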

> 2 - control reordering (bound what reordering can take place due to
> compiler or processor behaviour).

Actually one of the most common uses of Volatile is to make sure that two
independent tasks correctly access a common version without making private
copies, e.g. a circular buffer where the input and output pointers are atomic,
and the actual buffer contents is volatile.
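
A minimal sketch of such a circular buffer, assuming exactly one producer task and one consumer task. The names Ring_Demo, In_Ptr, Out_Ptr, Size and Count are illustrative, and the sketch depends on the ordering of volatile and atomic accesses discussed in this AI.

```ada
with Ada.Text_IO;

procedure Ring_Demo is

   Size  : constant := 8;
   Count : constant := 100;

   type Index is mod Size;
   type Slots is array (Index) of Integer;

   Buffer : Slots := (others => 0);
   pragma Volatile (Buffer);       --  buffer contents are volatile

   In_Ptr  : Index := 0;           --  updated only by the producer
   pragma Atomic (In_Ptr);
   Out_Ptr : Index := 0;           --  updated only by the consumer
   pragma Atomic (Out_Ptr);

   task Producer;
   task Consumer;

   task body Producer is
      Next : Index;
   begin
      for Item in 1 .. Count loop
         Next := In_Ptr + 1;           --  modular arithmetic wraps around
         while Next = Out_Ptr loop     --  full: one slot is always left empty
            delay 0.0;
         end loop;
         Buffer (In_Ptr) := Item;      --  store the element first ...
         In_Ptr := Next;               --  ... then publish it
      end loop;
   end Producer;

   task body Consumer is
      Sum : Integer := 0;
   begin
      for I in 1 .. Count loop
         while Out_Ptr = In_Ptr loop   --  empty: nothing published yet
            delay 0.0;
         end loop;
         Sum := Sum + Buffer (Out_Ptr);
         Out_Ptr := Out_Ptr + 1;       --  release the slot to the producer
      end loop;
      Ada.Text_IO.Put_Line ("Sum =" & Integer'Image (Sum));
   end Consumer;

begin
   null;   --  wait for both tasks to terminate
end Ring_Demo;
```

Each pointer has a single writer, so no lock is needed; correctness depends on the element store becoming visible before the pointer update that publishes it.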

> Not sure what GNAT does, but C seems to be aimed more at the second
> property. Ada's 'volatile' is concerned with the first property. So
> GNAT C may not give Ada's semantics - it will ensure no reordering but
> will leave it to memory manager to decide when, for example, values
> can just stay in cache.

As I said, to me if Volatile in Ada does not do what Volatile in C does, then it
is confusing and useless.

> With multicore there is a real need to get control over reordering.
> Hence the
> introduction, in Ada, of Coherent - this would be easy to implement as
> it just requires the insertion of calls to a memory barrier which all
> such hardware seems to support.

Please do not assume that "instructions available on hardware" = "easy to
implement". This equation is simply false in large modern compiler systems.

> So I would guess that GNAT 'Volatile' implements Ada's new 'Coherent'.
> But this leaves open the question of how to get Ada's Volatile.

If Volatile is supposed to mean something different from Volatile in C, then I
would say, you can't expect to "get" it at all. As I noted before, we never had
a customer query or complaint in this area for the entire history of GNAT.

> PS the discussions did not however tell me where to define Coherent?

Since it's virtually the same issue as Volatile, I would put it in the same
place.

****************************************************************

From: Tucker Taft
Date: Monday, April 26, 2010  12:22 PM

>> 1 - all writes/reads go directly to the memory location - needed
>> (only) when the memory location is a register for some external device.
>
> it is completely bogus to use pragma Volatile for this purpose, since
> in practice you will be making the non-portable assumption that a load
> or store uses a single appropriate machine instruction. Even the use
> of Atomic is dubious for this purpose! Device registers need to be
> accessed with very specific instructions. The ONLY legitimate way to
> ensure this is to write the appropriate ASM inserts. In practice
> Atomic may work fine for most cases, but I would NEVER use Volatile
> for this purpose.

I agree.  You generally need to use pragma Atomic if you want to guarantee a
single instruction.  Volatile operations are allowed to result in multiple
instructions.

> For example, if you have a 32-bit hardware register, and you set one
> bit, then several sequences of instructions are possible on a 386 (use
> bit instructions, use load/store, load/store byte rather than word
> etc). The use of Atomic pretty much guarantees a 32-bit read followed
> by a 32-bit write. If that's what you want it will probably work, but
> you cannot guarantee this from the RM. If you use Volatile, all bets
> are off as to the exact instructions generated...

I don't happen to agree that it is always necessary to use special instructions.
A lot of device drivers have been written in the "Unix" era without ever using
assembler. I suppose some of these depended on the "|=" C operator, so Ada comes
up short there...

Of course if the device is not memory mapped, but instead uses some kind of
built-in "port," then you clearly need to resort to a special "intrinsic"
operation.

****************************************************************

From: Bob Duff
Date: Monday, April 26, 2010  12:48 PM

> As I said, to me if Volatile in Ada does not do what Volatile in C
> does, then it is confusing and useless.

I think you mean that if Volatile in GNAT does not do what Volatile in gcc C
does, then it is confusing and useless.  Because last time I looked at this part
of the C standard it was confusing and useless, but that doesn't mean a
particular C compiler can't do something useful.

Anyway, why are we adding a new feature, if in practice it will be implemented
the same as Volatile?

****************************************************************

From: Robert Dewar
Date: Monday, April 26, 2010  1:01 PM

> I think you mean that if Volatile in GNAT does not do what Volatile in
> gcc C does, then it is confusing and useless.  Because last time I
> looked at this part of the C standard it was confusing and useless,
> but that doesn't mean a particular C compiler can't do something
> useful.

Whether C volatile in the standard does anything useful, I have no idea.
But C volatile in practice is very useful, and works fine, and Ada need do no
better (in fact Ada already does better, from having Atomic -- a typical
implementation in C gets around this by treating volatile the same as atomic in
Ada if the type is suitable for that purpose (that of course is implementation
dependent, as it is in Ada)).

> Anyway, why are we adding a new feature, if in practice it will be
> implemented the same as Volatile?

indeed.

****************************************************************

From: Tucker Taft
Date: Monday, April 26, 2010  1:13 PM

In practice, volatile works fine for mono-processors.
As we move toward multicore/multi-processors being the norm, volatile will
probably be insufficient.  I can believe GNAT will necessarily wait until
something akin to Coherent is handled by the GCC back end, but that doesn't mean
the distinction isn't important in the timeframe of Ada 2012.

****************************************************************

From: Santiago Uruena Pascual
Date: Wednesday, January 12, 2011  2:29 AM

Hello again, and Happy New Year!,

sorry if this is not the adequate channel / way for discussing Ada Issues, but
I have further comments about AI05-0117.

While in Java 5.0 it was OK to require inserting memory barriers in volatile
objects because this language is not meant for writing device drivers (RTSJ
adds the RawMemoryAccess class for this purpose, though), I think in a systems
programming language like Ada plain volatile / atomic accesses (with no
barriers) are a must for hardware interfacing. It is probably better to continue
with the pragma Coherent solution as in the previous version of this AI, which
would be compatible with the atomic types being developed for the future C++0X
and C1X standards [1][2][3].

Furthermore, the current proposal is not enough for properly coding wait-free
algorithms (even in uniprocessor machines): besides coherent loads and stores
(specifying indivisible accesses, visibility to other threads, and ordering
from other CPUs), it is also required to provide coherent operations
(test-and-set/atomic add/atomic xor/compare and swap...), and to relax memory
operations (not just sequential consistency which is extremely expensive in
some arch like Power or ARM [4], but also Acquire / Release / Consume /
Relaxed semantics like in Java and C++0X / C1X).

Defining the memory model for Java and C/C++ was a huge effort due to the
subtle memory semantics of different hardware microarchitectures, and bugs are
still being found in the specifications [4][5]. So as protected objects are
perfectly OK for SMP/multicore (the memory barriers are already coded in the
underlying OS locks), and as inserting barrier instructions for pragma
Volatile / Atomic as proposed in the current AI version (rev 1.6) is an
incompatibility that would certainly break existing code in some device
drivers or code interfacing with C, and is not enough for programming
wait-free algorithms, IMHO it may be better to delay this AI to the next
standard revision after Ada 2012. This could leverage the experience gained
in the standardization and implementation of future C / C++ specs, so Ada can
have a compatible approach that takes advantage of the existing
implementations and eases interfacing with the C atomic types.

Thanks for your patience, all the best.

Santi


References:

[1] "Should volatile Acquire Atomicity and Thread Visibility Semantics?", by
    Hans Boehm & Nick Maclaren
    http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2016.html

[2] "C++0x November 2010 Working Draft" (chapter 29: Atomic operations library)
    http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf

[3] "C1X -- October 2010 Committee Draft" (section 7.17: Atomics)
    http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1516.pdf

[4] "Mathematizing C++ Concurrency", by Mark Batty et al.
    http://www.cl.cam.ac.uk/~pes20/cpp/test.pdf

[5] "On Validity of Program Transformations in the Java Memory Model",
    by Jaroslav Sevcik & David Aspinall
    http://homepages.inf.ed.ac.uk/s0566973/jmmtrans.pdf

****************************************************************

From: Randy Brukardt
Date: Wednesday, January 12, 2011  3:29 PM

> sorry if this is not the adequate channel / way for discussing Ada
> Issues, but I have further comments about AI05-0117.

The ARG is discussing this (again) privately. The early consensus is that we
have it right in the current AI05-0117-1.

The (previous) decision reflected in the current write-up of AI05-0117-1 is that
it is premature to go beyond the existing Ada pragmas. Processor and memory
architectures are rapidly evolving, and it seems unlikely that there is a
general solution that will work now and in the future. We don't want Ada to be
burdened with obsolete features. So this is best left for Ada 2020 when we
should be able to clearly see what is needed (pun intended).

One thing that we did notice is that some people were misreading the intent of
the current language for Volatile. Therefore, we reworded that definition to
match the actual intent (from what we can tell, implementations have implemented
the real intent and not the stricter reading that some readers have had). We
don't expect this wording change to have any impact on actual implementations,
or on how programs work.

There is nothing stopping an implementation from providing additional facilities
if needed for a particular target system. Indeed, the provision of such features
would provide an example of an approach to use in future Ada standards.

The above is heavily colored by my own opinions on this topic, and should not be
taken as an "official" pronouncement. Only the contents of AIs and (ultimately)
the Ada standard can do that.

****************************************************************

From: Alan Burns
Date: Wednesday, January 12, 2011  7:48 AM

It seems that our view that weakening the definition of volatile would suffice
is not supported by everyone. Shall I try and discuss this topic further with
Santi?

[Followed by a reposting of the Ada-Comment message sent at 2:29 AM - Editor.]

****************************************************************

From: Bob Duff
Date: Wednesday, January 12, 2011  1:18 PM

Sounds like a good idea to me.  The Java and C++ folks have apparently been
doing a lot of work in this area, so it would be a good idea for at least some
folks on ARG to pay attention to it.

****************************************************************

From: Robert Dewar
Date: Wednesday, January 12, 2011  1:25 PM

Let's not spend TOO much effort on this, anything we do won't have much effect
on any implementations I suspect (certainly GNAT will just borrow whatever is
provided by the gcc back end, which presumably is the same as what the C
standard has now).

****************************************************************

From: Tucker Taft
Date: Wednesday, January 12, 2011  1:49 PM

I think we all pretty much felt that the new wording did a better job of
capturing what implementations already did, or at least what was the original
intent.  The subtleties that Santi is mentioning seem worth investigating, but I
wonder whether any implementation intends to make any real changes as a result
of this new wording.  Do you think that the old wording was better in some way?

****************************************************************

From: Robert Dewar
Date: Wednesday, January 12, 2011  1:56 PM

I know that GNAT ignores what the RM says and assumes that

a) volatile means the same as in C

b) C has it right

:-)

****************************************************************

From: Randy Brukardt
Date: Wednesday, January 12, 2011  3:15 PM

AI05-0117-1 already says that no change is intended. Beyond that, we already
have agreement on what to do (which is make the RM say what we meant and what is
actually implemented, and go no further). I would very much doubt that we would
get any support for doing more.

I know we wouldn't get such support from me. Processor and memory architectures
change frequently, and it would be a huge mistake to standardize features that
depend on a particular architecture that probably will get replaced by something
better in the next few generations. If processor vendors continue on their path
to thousands of cores, I suspect that cache coherence will simply become
impossible to implement, and anything that depends on it simply will not work.
So I don't want to make any silly requirements in the Ada standard, either by
mistake (as with the current wording) or by some sort of new features.

****************************************************************

From: Robert Dewar
Date: Wednesday, January 12, 2011  3:22 PM

As a pragmatic point, hooking ourselves to whatever C decides makes good sense
:-)

****************************************************************

From: Randy Brukardt
Date: Wednesday, January 12, 2011  3:39 PM

Right. And trying to guess ahead of time what they are going to decide is a
fool's game. It almost certainly would lead to junk in the Ada Standard. Not
worth it.

****************************************************************

From: Robert Dewar
Date: Wednesday, January 12, 2011  3:53 PM

Could we just reference the ISO C standard? :-)

****************************************************************

From: Randy Brukardt
Date: Wednesday, January 12, 2011  4:06 PM

We could, but we'd probably have a versioning problem. That is, we probably want
to use the *next* version of the C standard for this purpose, not the current
one. But we wouldn't have a way to do that until that version is decided. And if
we used the current version, Ada would then be behind when C gets updated.

[Yes, I noticed the smiley. But I was semi-seriously thinking the same way
myself, and thus thought that the implications of doing so would be worth
considering.]

****************************************************************

From: Robert Dewar
Date: Wednesday, January 12, 2011  5:20 PM

Well that's a problem that would only concern language lawyers. The version of C
referenced is actually totally irrelevant (to all but aforementioned language
lawyers). The point is that if we did this xref, we would clearly indicate our
intent to be the same as C, which is the important message.

****************************************************************

From: Bob Duff
Date: Wednesday, January 12, 2011  6:59 PM

> Right. And trying to guess ahead of time what they are going to decide
> is a fool's game. It almost certainly would lead to junk in the Ada
> Standard. Not worth it.

I think it's unwise to rely on the C standard.
It is hopelessly ill-defined in the area of "volatile".
I don't know if they're planning to fix that for C1X, but as far as I can tell,
hardly anybody implements even the full C99 standard.

Has anybody on ARG even read what the C standard says?
If not, it hardly makes sense to blindly agree with it.

Perhaps C++ or Java standards would be more fruitful.
I don't know, but those folks appear to think about this area.

> I know that GNAT ignores what the RM says and assumes that
>
> a) volatile means the same as in C
>
> b) C has it right

I think you need to replace "C" with "the gcc dialect of C"
for the above to be true.  Nothing wrong with GNAT following gcc, here, but gcc
/= C.

****************************************************************

From: Robert Dewar
Date: Wednesday, January 12, 2011  7:12 PM

> I think it's unwise to rely on the C standard.
> It is hopelessly ill-defined in the area of "volatile".
> I don't know if they're planning to fix that for C1X, but as far as I
> can tell, hardly anybody implements even the full C99 standard.

Who really cares? I think it is a waste of time for Ada language lawyers to
think they can do better than the C folks in this area.

> Has anybody on ARG even read what the C standard says?
> If not, it hardly makes sense to blindly agree with it.

it makes even less sense to try to spin our own definition.

>> I know that GNAT ignores what the RM says and assumes that
>>
>> a) volatile means the same as in C
>>
>> b) C has it right
>
> I think you need to replace "C" with "the gcc dialect of C"
> for the above to be true.  Nothing wrong with GNAT following gcc,
> here, but gcc /= C.

Again, only a language lawyer cares about this issue, to me it is a waste of
time. If you try to come up with a "really good correct definition" of volatile

a) implementors will ignore it

b) so will everyone else

So you will have wasted your time.

****************************************************************

From: Tucker Taft
Date: Wednesday, January 12, 2011  8:48 PM

I'm not convinced that only language lawyers care about this issue.  People who
write device drivers or interrupt/signal handlers in C or Ada care about the
meaning of volatile. Also, those who use setjmp/longjmp in C/C++ care about
volatile, since only volatile variables are guaranteed to preserve their value
across setjmp/longjmp.
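
As an illustration of the setjmp/longjmp point (a minimal sketch, not part of
the original message; the function name count_after_longjmp is hypothetical):
in C, an automatic variable that is modified between the setjmp call and the
longjmp has an indeterminate value afterwards unless it is declared volatile.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;

/* Returns the value of a local variable as observed after a longjmp. */
int count_after_longjmp(void)
{
    /* volatile: the variable is modified between setjmp and longjmp, so
       without the qualifier its value after the jump would be
       indeterminate (it might live in a register that gets restored). */
    volatile int counter = 0;

    if (setjmp(env) == 0) {
        counter = 42;      /* modified after setjmp was called ...      */
        longjmp(env, 1);   /* ... then control returns to setjmp above  */
    }
    return counter;        /* well defined only because of volatile     */
}
```

Calling count_after_longjmp() is expected to yield 42; dropping the volatile
qualifier makes the result depend on the compiler and optimization level.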

I think the big problem with the C and C++ standards is that they don't deal
with multiple threads, but instead talk about what is true at "sequence" points,
when you enter a signal handler, and when you do a longjmp. Furthermore, there
is a presumption you are talking about a single processor executing a single C
or C++ program.

I think the upcoming standards may start to talk about multiprocessors, but the
older ones don't address the issue at all.

So I recommend we avoid referring to the C/C++ standards for this purpose in
anything normative.  I see no harm in adding some kind of implementation advice
that encourages consistency with the C/C++ definition of "volatile," as it
evolves.

****************************************************************

From: Ben Brosgol
Date: Wednesday, January 12, 2011 11:36 PM

Here's the C# approach, as a data point in how languages treat volatile:

(From C# 4.0 Language Specification, Section 10.5.3) << For volatile fields,
[certain] reordering optimizations are restricted:

* A read of a volatile field is called a _volatile read_. A volatile read has
  "acquire semantics"; that is, it is guaranteed to occur prior to any
  references to memory that occur after it in the instruction sequence.

* A write of a volatile field is called a _volatile write_. A volatile write has
  "release semantics"; that is, it is guaranteed to happen after any memory
  references prior to the write instruction in the instruction sequence.
 >>

As noted by Albahari & Albahari, "C# 4.0 in a Nutshell", p. 829, these rules
can give counterintuitive results because they allow the compiler to swap a
write with a subsequent read.  The following example, attributed to Joe Duffy,
illustrates the problem:

class IfYouThinkYouUnderstandVolatile{
   volatile int x, y;

   void Test1(){  // Executed on one thread
      x = 1;      // volatile write to x
      int a = y;  // volatile read from y
      ...
   }

   void Test2(){  // Executed on another thread
      y = 1;      // volatile write to y
      int b = x;  // volatile read from x
      ...
   }
}

With C# semantics, variables a and b may both end up with the value 0 -- i.e., the
compiler may interchange the order of the assignment statements in each method
-- even though both fields x and y are specified as volatile.
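
For comparison, the same pattern can be sketched with C11 atomics (this is an
illustration only, not part of the original message; the function names are
hypothetical, and each function is meant to be the body of one thread). With
release stores and acquire loads, which correspond to the C# rules quoted
above, each store may still be reordered after the following load, so both
results may be 0. With the default sequentially consistent operations that
outcome is not allowed.

```c
#include <assert.h>
#include <stdatomic.h>

/* Static storage duration: both start out as 0. */
static atomic_int x, y;

/* Release/acquire variant: the outcome a == 0 && b == 0 is permitted,
   because a release store may be reordered after a later acquire load
   of a different object. */
int test1_release_acquire(void)
{
    atomic_store_explicit(&x, 1, memory_order_release);
    return atomic_load_explicit(&y, memory_order_acquire);   /* a */
}

int test2_release_acquire(void)
{
    atomic_store_explicit(&y, 1, memory_order_release);
    return atomic_load_explicit(&x, memory_order_acquire);   /* b */
}

/* Sequentially consistent variant: all four operations fall into one
   total order, so at least one of the two loads must observe 1. */
int test1_seq_cst(void)
{
    atomic_store(&x, 1);
    return atomic_load(&y);   /* a */
}

int test2_seq_cst(void)
{
    atomic_store(&y, 1);
    return atomic_load(&x);   /* b */
}
```

The sketch deliberately leaves thread creation out; running test1 and test2
on two threads with whatever threading facility is available is what exposes
the difference between the two variants.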

In light of such surprises (well, to most programmers the above result would be a
surprise), Albahari & Albahari's advice was to avoid using the volatile keyword.

(Last summer at the AdaCore company meeting I showed the Ada version of this
example to some ARG members -- Bob Duff and Steve Baird and a few others -- I've
forgotten what they thought an Ada compiler was allowed to do :-)

****************************************************************

From: Robert Dewar
Date: Thursday, January 13, 2011  2:58 AM

> I'm not convinced that only language lawyers care about this issue.

that's NOT what I said,

I said that only language lawyers care about what the Ada standard has to say
about this issue.

That's a *totally* different point!

In practice, I think the behavior of compilers is going to be driven far more by
what users/customers need/expect than verbiage in standards, and for sure the
verbiage in the Ada standard is not going to have any influence.

****************************************************************

From: Robert Dewar
Date: Thursday, January 13, 2011  3:04 AM

I actually am all in favor of trying to define volatile (in all languages) more
clearly, I just don't see much point in spending much time trying to come up
with Ada's idiosyncratic viewpoint on this subject.

The problem is basically the same in all these languages, so it is something
that should be addressed in a more global way than just working on an individual
standard.

Is there an ISO group more suitable to this work?

****************************************************************

From: Tucker Taft
Date: Thursday, January 13, 2011  5:29 AM

> In practice, I think the behavior of compilers is going to be driven
> far more by what users/customers need/expect than verbiage in
> standards, and for sure the verbiage in the Ada standard is not going
> to have any influence.

Perhaps, though I have been surprised at the number of people who seem to study
these descriptions of volatile in other standards, so I wouldn't be surprised if
the words in the Ada standard are also being read by people with influence in
the embedded-systems community.

****************************************************************

From: Alan Burns
Date: Thursday, January 13, 2011  6:48 AM

I'll pass back to Santi the essence of these emails.

Basically we are not really making any change to volatile but loosening what is
said in the AARM and pointing to the need for control over ordering.

Also that evolving C solution is in practice what will be available in Ada. I'll
not get involved in any Ada-Comment discussions, but if Santi gets back to me
with anything that I feel should be shared, I'll do just that.

****************************************************************

From: Randy Brukardt
Date: Thursday, January 13, 2011  1:19 PM

> Basically we are not really making any change to volatile but
> loosening what is said in the AARM and pointing to the need for
> control over ordering.

I posted yesterday a public reply thanking him for his input and pointing this
out.

> Also that evolving C solution is in practice what will be available in
> Ada.

I didn't discuss this, because to me the important point is that "Robert Dewar
says that GNAT will ignore what the Ada standard says in any case and do
whatever GCC does for Volatile." And that seems pretty dangerous, because I
don't want to put wording in Robert's mouth or make any representations about
what AdaCore will do (a company I don't even work for). I'd rather leave that to
AdaCore.

The C standard is a red herring here (at least presently) because it doesn't
really say anything. And we cannot reference standards that might happen in the
future. And defining things without saying what they mean is bogus at best.

I like Robert's idea of getting multiple language groups together to figure out
what "volatile" should mean, and if that happens Ada's standard surely should
take advantage of it, but that clearly will be too late for Ada 2012.

> I'll not get involved in any Ada-Comment discussions, but if Santi
> gets back to me with anything that I feel should be shared, I'll do
> just that.

I would hope that he posts any observations on Ada-Comment, so that they too get
in the public record. Every level of indirectness adds misinterpretations to
people's opinions.

****************************************************************

From: Erhard Ploedereder
Date: Thursday, January 13, 2011  3:07 PM

> Again, only a language lawyer cares about this issue, to me it is a
> waste of time. If you try to come up with a "really good correct
> definition" of volatile

Au contraire. It is THE major concern that gets voiced when manufacturing
industry worries about the safety of its C-code in the context of shared-memory
parallelism. They all want to implement wait-free communication via atomic
accesses to "shared variables" to be guaranteed to work, which to them means
  - no caching of the value in registers or temps by the compiler,
  - atomic write/read-through to memory or, alternatively, atomicity and
    guaranteed cache coherency,
  - no "relevant" instruction swapping around the access
    (with varying views on "relevant", usually starting at "any")

And they want close to zero direct overhead; they usually do not understand the
notion of distributed overhead. Implicit mutexes are obviously out.
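
The kind of wait-free communication described above might look like the
following in C11 (a minimal sketch, not part of the original message; the
names payload, ready, publish and try_consume are hypothetical, and it assumes
one producer that publishes a single value). The release store and acquire
load provide the "no relevant instruction swapping" guarantee without any
mutex, and neither operation ever loops or blocks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int payload;          /* ordinary, non-atomic data             */
static atomic_bool ready;    /* static storage: starts out as false   */

/* Producer: write the data, then publish it. The release store keeps
   the payload write from being moved after the flag write. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: a single non-blocking attempt. The acquire load keeps the
   payload read from being moved before the flag read. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;        /* nothing published yet */
    *out = payload;
    return true;
}
```

A consumer would typically call try_consume from its own thread and simply
carry on with other work when it returns false.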

True, they won't (try to) find the answers in the reference manual.
They will instead (try to) force the compiler writers to do right by them, "so
that the existing code works". And right they are if compiler writers ignore
their standard. It is not gcc that they are trying to force, so gcc is not the
gold standard in this realm.

As a deja-vu, I see customers who are trying to leverage the language standards
against compiler writers in lieu of paying for the right implementation. We've
been there before with Ada.

So, for a language like Ada, it is not good to neglect this concern by saying
"whatever happens, happens".

Note incidentally the contradiction:
"Look at Java, C and C++ and how hard they try." (Maybe they have a reason!)
"Therefore Ada can neglect the issue, let the others figure it out." (It is not
that important.)

Ada is a bit special, since its integration of parallelism comes with an
obligation to make it work "right". (Unlike C, where it is the job of the
library providers to do the right things, beyond the elementary notion of
read/write-through or any equivalent implementation, which perforce involves
the compiler and hence usually the language.)

****************************************************************

From: Robert Dewar
Date: Thursday, January 13, 2011  3:56 PM

I still think that if you want to address this issue, you must do it in a
cross-language setting. An Ada-only solution in the standard will be ignored by
users and implementors.

Implementors such as us, certainly want to do the right thing, but we will pay
attention to what users want/need, and of course this will be discussed in the
language independent context of gcc, since it makes no sense to discuss it just
for Ada. And users similarly are interested in getting the compiler to behave in
a reasonable manner, independent of what the standard has to say.

When I say GNAT will pay no attention to what the Ada standard says, it is not
that I think the issue is unimportant, rather we have to find the useful
solutions for users in a language independent context in gcc. If we can't
convince C to do a particular thing, then gcc/Ada won't do it either.

****************************************************************

From: Alan Burns
Date: Friday, January 14, 2011  2:59 AM

Just for the record, Santi seems to be happy with my reply clarifying what we
are doing (or indeed not doing).

---

Thanks Alan for your detailed answer, I finally understand the intent of this AI
(didn't notice that it was just an AARM note). Sorry for the noise.


>  first that volatile need not necessarily mean that the ultimate
> memory location of the variable is updated - this could happen, but
> it may be sufficient to put the value out to coherent cache.

Agreed


>  The second change is to draw attention to the fact that operations on
> volatile variables should not get reordered - and that appropriate
> memory barriers be used where necessary.

If "memory barrier" means "compiler memory barrier" directive (i.e. the compiler
must not move accesses after or before that point) I completely agree.

What would depart from the C / C++ behavior would be to suggest that the
compiler must insert "hardware memory barrier" instructions for volatile
accesses, as done in Java.
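
The distinction drawn here can be sketched in C11 (an illustration only, not
part of the original message; the variable and function names are
hypothetical). atomic_signal_fence constrains only the compiler and emits no
fence instruction, while atomic_thread_fence additionally emits a hardware
fence on targets that need one.

```c
#include <assert.h>
#include <stdatomic.h>

static int data;             /* ordinary variable                      */
static volatile int flag;    /* volatile in the C sense                */

/* "Compiler memory barrier": the compiler may not move the store to
   data across the fence, but no fence instruction is generated, so the
   processor may still make the two stores visible in another order. */
void write_with_compiler_barrier(int v)
{
    data = v;
    atomic_signal_fence(memory_order_seq_cst);
    flag = 1;
}

/* "Hardware memory barrier": same compiler constraint, plus a fence
   instruction where the target architecture requires one. */
void write_with_hardware_barrier(int v)
{
    data = v;
    atomic_thread_fence(memory_order_seq_cst);
    flag = 1;
}
```

Comparing the generated assembly of the two functions on a given target is a
quick way to see which of the two readings a compiler implements.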

>  Ada approach is really to follow what is happening to C (&C++) and
> give the same level of control.

I'm not a C fan... :-)  but in the case of the library of atomic operations I
think it is very good news!

****************************************************************

