Version 1.1 of ai05s/ai05-0012-1.txt


!standard 13.1(24/2)          06-03-31 AI05-0012-1/01
!standard 13.1(25/2)
!standard 13.1(26/2)
!standard 13.2(6.1/2)
!standard C.6(10)
!class binding interpretation 06-03-31
!status work item 06-03-31
!status received 06-03-30
!priority Medium
!difficulty Medium
!qualifier Omission
!subject Independence and Representation clauses for atomic objects
!summary
The recommended level of support is adjusted to say that it is not required to support a non-confirming representation clause that would cause an atomic object to have a size or alignment different than the default size or alignment for that object.
The components of an object covered by an Atomic_Components pragma are always independent.
!question
The Recommended Level of Support implies that it is required to support pragma Pack on types that have Atomic_Components, even to the bit level. Is this the intent? (No.) Should it be required to
!recommendation
13.1(24-26) needs to cover atomic objects as well.
!wording
Insert "or atomic" into "aliased object" in 13.1(24-26).
Replace 13.2(6.1/2) with:
The component of a packed type need not be aligned according to the Alignment of its subtype; in particular it need not be allocated on a storage element boundary.
Add "and independent" after "indivisible" in C.6(10).
!discussion
13.2(6.1/2) conflicts with the Recommended Level of Support. Changing the Recommended Level of Support was rejected as that could cause the representation of types with representation clauses to change silently when "aliased" is added or deleted. The pragma Pack can be rejected anyway based on 13.1(24-26); there is no reason to duplicate those permissions here.
This change makes C.6(11) redundant. [Should it be deleted? - RLB]
[Potentially, the change to C.6(10) could be removed if the resolution of AI05-0009 covers it. - RLB.]
!corrigendum 13.1(24/2)
Replace the paragraph:
by:
!corrigendum 13.1(25/2)
Replace the paragraph:
by:
!corrigendum 13.1(26/2)
Replace the paragraph:
by:
!corrigendum 13.2(6.1/2)
Replace the paragraph:
If a packed type has a component that is not of a by-reference type and has no aliased part, then such a component need not be aligned according to the Alignment of its subtype; in particular it need not be allocated on a storage element boundary.
by:
The component of a packed type need not be aligned according to the Alignment of its subtype; in particular it need not be allocated on a storage element boundary.
!corrigendum C.6(10)
Replace the paragraph:
It is illegal to apply either an Atomic or Atomic_Components pragma to an object or type if the implementation cannot support the indivisible reads and updates required by the pragma (see below).
by:
It is illegal to apply either an Atomic or Atomic_Components pragma to an object or type if the implementation cannot support the indivisible and independent reads and updates required by the pragma (see below).
!ACATS test
Since this only allows (rather than requires) an implementation to reject something, it would be hard to usefully test.
!appendix

From: Jean-Pierre Rosen
Sent: Friday, February 17, 2006  6:34 AM

A question that arose while designing a rule for AdaControl about shared 
variables.

If a variable is subject to a pragma Atomic_Components, is it safe for 
two tasks to update *different* components without synchronization?

C.6 talks only about indivisibility, not independent addressing. Of 
course, you have to throw 9.10 in...

The whole issue is with the "(or of a neighboring object if the two are 
not independently addressable)" in 9.10(11), while C.6 (17) says that 
"Two actions are sequential (see 9.10) if each is the read or update of 
the same atomic object", but doesn't mention neighboring objects.

In a sense, indivisibility guarantees only that there cannot be 
temporary incorrect values in a variable due to the fact that the 
variable is written by more than one memory cycle. The issue *is* 
different from independent addressability. OTOH, Atomic_Components 
without independent addressability seems pretty much useless...
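
An illustrative sketch of the scenario in question (not part of the
original message; the names Counter_Array, Shared and Worker are
hypothetical). Two tasks update different components of an array that
has Atomic_Components and no packing or other representation items,
which is the case where 9.10 is generally read as giving independent
addressability:

```ada
with Ada.Text_IO;

procedure Atomic_Components_Sketch is
   type Counter is mod 2**32;
   type Counter_Array is array (1 .. 2) of Counter;
   pragma Atomic_Components (Counter_Array);

   Shared : Counter_Array := (others => 0);

   task type Worker (Index : Positive);

   task body Worker is
   begin
      for Step in 1 .. 1_000 loop
         --  Each worker reads and updates only its own component.
         Shared (Index) := Shared (Index) + 1;
      end loop;
   end Worker;
begin
   declare
      First  : Worker (1);
      Second : Worker (2);
   begin
      null;  --  Leaving this block waits for both workers to terminate.
   end;

   Ada.Text_IO.Put_Line ("Component 1:" & Counter'Image (Shared (1)));
   Ada.Text_IO.Put_Line ("Component 2:" & Counter'Image (Shared (2)));
end Atomic_Components_Sketch;
```

If the components are independent, neither worker can disturb the
other's count. The open question is whether that still holds once a
pragma Pack or a Component_Size clause is added to Counter_Array.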

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  5:55 AM

Answer seems clear, yes it is safe, provided that independence is
assured, which means that there is no rep clause that would disturb
the independence.

If you are suggesting that Atomic Components should guarantee such
independence, and result in the rejection of rep clauses that would
compromise it, that seems reasonable, e.g. you have a packed array
of bits with atomic components, that's definitely peculiar, and
it seems reasonable to reject it.

****************************************************************

From: Pascal Leroy
Sent: Thursday, March 30, 2006  6:07 AM

> If a variable is subject to a pragma Atomic_Components, is it safe for 
> two tasks to update *different* components without synchronization?

I think that 9.10(1) is quite clear: distinct objects are independently
addressable unless "packing, record layout or Component_Size is
specified".

So regardless of atomicity, it is always safe to read/update two distinct
components of an object (in the absence of packing, etc.).  What
Atomic_Components buys you is that reads/updates of the same component are
sequential.

****************************************************************

From: Jean-Pierre Rosen
Sent: Thursday, March 30, 2006  6:17 AM

Of course, my question was in the case of the presence of packing etc.

The answer seems to be no, there is no *additional* implication on 
addressability due to atomic_components. Correct?

****************************************************************

From: Pascal Leroy
Sent: Thursday, March 30, 2006  6:25 AM

> Of course, my question was in the case of the presence of packing etc.

In the presence of packing, 9.10(1) says that independent addressability
is "implementation defined", which is not too helpful.  (This topic was
discussed a few weeks ago as part of another thread, btw.)

> The answer seems to be no, there is no *additional* implication on 
> addressability due to atomic_components. Correct?

Right.

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006  6:57 AM

The ARG recently disallowed combining a pair of atomic operations
on distinct objects into a single operation, I believe.
I would certainly support saying that array-of-aliased
and array-of-atomic would ensure independence between
components, even in the presence of other rep-clauses.
That seems like a reasonable interpretation of what
atomic means, and "aliased" implies that you can
have multiple access paths that make no visible use
of indexing, and hence you would certainly want independence.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  7:58 AM

> The ARG recently disallowed combining a pair of atomic operations
> on distinct objects into a single operation, I believe.
> I would certainly support saying that array-of-aliased
> and array-of-atomic would ensure independence between
> components, even in the presence of other rep-clauses.

Wait a moment, then you have to give permission to reject
these "other rep clauses", you can't insist that they be
recognized and independence be preserved!

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  8:02 AM

> In the presence of packing, 9.10(1) says that independent addressability
> is "implementation defined", which is not too helpful.  (This topic was
> discussed a few weeks ago as part of another thread, btw.)

It seems *really* nasty to make this implementation defined, I hate
erroneousness being imp defined. Is this a new change, I missed it.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  8:08 AM

> So regardless of atomicity, it is always safe to read/update two distinct
> components of an object (in the absence of packing, etc.).  What
> Atomic_Components buys you is that reads/updates of the same component are
> sequential.

.. and atomic!

But there is still the issue of something like this

    type X is array (1 .. 8) of Boolean;
    pragma Pack (X);
    pragma Atomic_Components (X);

Should one of the two pragmas be ignored, or should one of
them be rejected, or what? In GNAT we get:


a.ads:4:30: warning: Pack canceled, cannot pack atomic components

is that behavior OK? forbidden? mandated?
(not clear to me at any rate)

****************************************************************

From: Pascal Leroy
Sent: Thursday, March 30, 2006  8:17 AM

> It seems *really* nasty to make this implementation defined, 
> I hate erroneousness being imp defined. Is this a new change, 
> I missed it.

This is not new, it has been like that since Ada 95, and the last time
this was discussed (around Feb. 24th, thread titled "Independence and
confirming rep. clauses"), the two of us (at least) agreed that it was
poor language design.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  8:25 AM

OK, so I just misremembered here, sorry!

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  8:25 AM

> is that behavior OK? forbidden? mandated?
> (not clear to me at any rate)

It's certainly OK to reject any representation item that you don't like.
However, it appears that the implementation advice about pragma Pack does
not mention atomicity, so you are not following the advice, and you don't
comply with Annex C.

On a machine that could independently address bits, the two pragmas could
well coexist, so there is some amount of implementation dependence here.

For the record Apex also ignores Pack in this example, although it doesn't
emit a warning.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  8:41 AM

> It's certainly OK to reject any representation item that you don't like.
> However, it appears that the implementation advice about pragma Pack does
> not mention atomicity, so you are not following the advice, and you don't
> comply with Annex C.

Yes, but it is impossible to comply on virtually all machines
 
> On a machine that could independently address bits, the two pragmas could
> well coexist, so there is some amount of implementation dependence here.

There are almost no such machines!

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006  8:21 AM

> Wait a moment, then you have to give permission to reject
> these "other rep clauses", you can't insist that they be
> recognized and independence be preserved!

I believe there are already rules that effectively allow that,
once we make it clear that being atomic also implies being
independent of neighboring objects.  E.g. C.6(10-11):

    It is illegal to apply either an Atomic or Atomic_Components pragma
    to an object or type if the implementation cannot support the
    indivisible reads and updates required by the pragma (see below).

    It is illegal to specify the Size attribute of an atomic object,
    the Component_Size attribute for an array type with atomic
    components, or the layout attributes of an atomic component,
    in a way that prevents the implementation from performing the
    required indivisible reads and updates.

Probably would want to change "indivisible" to
"indivisible and independent" in both of the above paragraphs.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  1:30 PM

So I guess you would consider my packed example illegal, and the
warning should be a real illegality?

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006  2:24 PM

> So I guess you would consider my packed example illegal, and the
> warning should be a real illegality?

Pragma Pack is a little different.  It says "pack as
tightly as you can, subject to all the other requirements
imposed on the type."  So you never need to reject a
pragma Pack.  I could imagine that in the absence of
a pragma Pack, some implementations might make the following
array 32-bits/element:

    type Very_Short is new Integer range 0..7;
    type VS_Array is array(Positive range <>) of Very_Short;
    pragma Atomic_Components(VS_Array);

but if we add a pragma Pack(VS_Array), I would expect it to be shrunk
down to 8 bits per component on machines that allow atomic
reference to bytes.  In the absence of the pragma Atomic_Components,
I would expect it to be shrunk down to 3 or 4 bits/component.
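
One way to observe what a given compiler actually does with these
combinations is to print Component_Size for each variant. This is a
sketch only, not part of the original message; the array type names
and the Report helper are hypothetical, and the values printed are
implementation-dependent. A compiler may also reject or warn about
the last declaration instead of accepting it:

```ada
with Ada.Text_IO;

procedure Component_Size_Probe is
   type Very_Short is new Integer range 0 .. 7;

   type Plain_Array is array (Positive range <>) of Very_Short;

   type Packed_Array is array (Positive range <>) of Very_Short;
   pragma Pack (Packed_Array);

   type Atomic_Array is array (Positive range <>) of Very_Short;
   pragma Atomic_Components (Atomic_Array);

   type Packed_Atomic_Array is array (Positive range <>) of Very_Short;
   pragma Pack (Packed_Atomic_Array);
   pragma Atomic_Components (Packed_Atomic_Array);

   --  Prints the component size chosen for one of the array types.
   procedure Report (Name : String; Bits : Integer) is
   begin
      Ada.Text_IO.Put_Line (Name & Integer'Image (Bits) & " bits/component");
   end Report;
begin
   Report ("no pragmas:       ", Plain_Array'Component_Size);
   Report ("Pack only:        ", Packed_Array'Component_Size);
   Report ("Atomic only:      ", Atomic_Array'Component_Size);
   Report ("Pack and Atomic:  ", Packed_Atomic_Array'Component_Size);
end Component_Size_Probe;
```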

****************************************************************

From: Gary Dismukes
Sent: Thursday, March 30, 2006  3:05 PM

> Pragma Pack is a little different.  It says "pack as
> tightly as you can, subject to all the other requirements
> imposed on the type."  So you never need to reject a
> pragma Pack.  I could imagine that in the absence of
> a pragma Pack, some implementations might make the following
> array 32-bits/element:

But in the case of Annex C compliance you have to follow the
recommended level of support, which requires tight packing
of things like Boolean arrays as I understand it.  There's
nothing about "subject to other requirements", so it seems
that one of the pragmas would have to be rejected.

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006  3:31 PM

> But in the case of Annex C compliance you have to follow the
> recommended level of support, which requires tight packing
> of things like Boolean arrays as I understand it.  There's
> nothing about "subject to other requirements", so it seems
> that one of the pragmas would have to be rejected.

Good point.  But an existing AARM note implies there is
some interplay between a component being aliased and
the "size of the component subtype":

    Ramification: If a component subtype is aliased, its Size will
    generally be a multiple of Storage_Unit, so it probably won't
    get packed very tightly.

This AARM ramification seems totally unjustified, unless we
presumed that there was some kind of implicit "widening"
that was occurring on the Size of a component subtype if
necessary to satisfy other requirements, such as "aliased,"
"atomic," etc.  But that really doesn't fit with the model,
since the *subtype* is not aliased, nor is the component
*subtype* atomic in the case of an Atomic_Components pragma.

So I think we will definitely need to change the words here
if that is what we want, namely the "tight" packing is not
required if the components are aliased, by-reference, or
atomic.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 30, 2006  3:37 PM

> But in the case of Annex C compliance you have to follow the
> recommended level of support, which requires tight packing
> of things like Boolean arrays as I understand it.  There's
> nothing about "subject to other requirements", so it seems
> that one of the pragmas would have to be rejected.

As much as I hate to, I agree with Gary. Indeed, I don't see anything about
"subject to other requirements" anywhere in 13.2. Here's what the definition
of Pack is (this has nothing to do with recommended level of support):

"If a type is packed, then the implementation should try to minimize storage
allocated to objects of the type, possibly at the expense of speed of
accessing components, subject to reasonable complexity in addressing
calculations."

I don't see that "reasonable complexity" has anything whatsoever to do with
"other requirements". And then the Recommended Level of Support pretty much
defines what "reasonable complexity" means (by allowing rounding up to avoid
crossing boundaries).

So I agree that one of the pragmas has to be rejected. (I don't think that
any language change is needed to make that a requirement, either, although
it would make sense to clarify this so there is no doubt.) A warning (as
GNAT gives) is wrong for a compiler following Annex C, and unfriendly
otherwise. Silently doing nothing...I better not go there. :-)

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 30, 2006  3:50 PM

> So I think we will definitely need to change the words here
> if that is what we want, namely the "tight" packing is not
> required if the components are aliased, by-reference, or
> atomic.

The note was unjustified in Ada 95, but in Ada 2005, we added a blanket
permission to reject rep. clauses for components of by-reference and aliased
types unless they are confirming. See 13.1(26/2). Remember that pragma Pack
is never confirming, so this is the same as saying that it can be rejected
(but not required to be rejected) for any aliased or by-reference type.

There is even an AARM note (carried over from Ada 95) which notes that
Atomic_Components has similar restrictions. But it doesn't look like we ever
considered the interaction of Atomic_Components and other rep. clauses.
Perhaps it should be included in 13.1(26/2)? (That is, it shouldn't be
required to support any non-confirming rep. clauses on such a type, but of
course you can if you want.)

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006  4:00 PM

> As much as I hate to, I agree with Gary. Indeed, I don't see anything about
> "subject to other requirements" anywhere in 13.2....

The new paragraph 13.2(6.1) says:

    If a packed type has a component that is not of a by-reference
    type and has no aliased part, then such a component need not
    be aligned according to the Alignment of its subtype; in
    particular it need not be allocated on a storage element boundary.

This is the part that implies that packing is "subject to
other requirements."  If we changed "aliased" to "aliased or atomic"
in the above, I think it would accomplish roughly what I was
suggesting.  I think you will agree that the above paragraph,
combined with 13.3(26.3):

     For an object X of subtype S, if S'Alignment is not zero, then
     X'Alignment is a nonzero integral multiple of S'Alignment unless
     specified otherwise by a representation item.

implies that in:

     type Aliased_Bit_Vector is
       array (Positive range <>) of aliased Boolean;
     pragma Pack(Boolean);

the components should be aligned on Boolean'Alignment boundaries.
I would think the same thing should apply if Atomic_Components
is applied to a boolean array.

I admit that these paragraphs seem to contradict the recommended
level of support, but I think the bug is there, not in the above
two paragraphs.

 > ...
> So I agree that one of the pragmas has to be rejected. (I don't think that
> any language change is needed to make that a requirement, either, although
> it would make sense to clarify this so there is no doubt.) A warning (as
> GNAT gives) is wrong for a compiler following Annex C, and unfriendly
> otherwise. Silently doing nothing...I better not go there. :-)

I suppose it depends on your interpretation of "Pack."  I have
always taken it as "do as well as you can."  If you really have
a specific size you need, then specify that with Component_Size,
or be sure that there is nothing inhibiting the packing, such
as aliased, by-reference, or atomic components.

I agree it is friendly to inform the user if the pack has *no*
effect, but I wouldn't want to disallow pragma Pack completely
in the above example, because array of Boolean might use
32-bits/component in its absence, if byte-at-a-time access is
significantly slower than word-at-a-time access on the given
hardware.
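
A minimal sketch of the Component_Size alternative mentioned above
(not part of the original message; the package name Bit_Vectors is
hypothetical). Unlike pragma Pack read as "do as well as you can", an
explicit clause names the exact size wanted, so an implementation is
expected either to honor it or to reject it:

```ada
package Bit_Vectors is
   type Bit_Vector is array (Positive range <>) of Boolean;

   --  A nonconfirming clause: a request for exactly one bit per component.
   for Bit_Vector'Component_Size use 1;
end Bit_Vectors;
```

If the component were later changed to "aliased Boolean", or
Atomic_Components were added, an implementation that cannot address
single bits would presumably reject the clause rather than quietly
choose a different layout.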

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  4:32 PM

> I suppose it depends on your interpretation of "Pack."  I have
> always taken it as "do as well as you can."  If you really have
> a specific size you need, then specify that with Component_Size,
> or be sure that there is nothing inhibiting the packing, such
> as aliased, by-reference, or atomic components.

Well you can interpret it that way if you like, but it is not
the definition in the language, which says that for arrays with
1,2,4 bit components, pragma Pack works as expected!
 
> I agree it is friendly to inform the user if the pack has *no*
> effect, but I wouldn't want to disallow pragma Pack completely
> in the above example, because array of Boolean might use
> 32-bits/component in its absence, if byte-at-a-time access is
> significantly slower than word-at-a-time access on the given
> hardware.

I think that is wrong in this case, since pragma Pack for Boolean
has precise well defined semantics, and must make the component
size 1, it does not mean, do-as-well-as-you-can.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 30, 2006  5:18 PM

>      type Aliased_Bit_Vector is
>        array (Positive range <>) of aliased Boolean;
>      pragma Pack(Boolean);
>
> the components should be aligned on Boolean'Alignment boundaries.
> I would think the same thing should apply if Atomic_Components
> is applied to a boolean array.

Well, in your example, the pragma should be rejected because the type isn't
local. But I presume you meant "pragma Pack(Aliased_Bit_Vector);".

I see your point, but all it says to me is that the new paragraph shouldn't
be conditional. The needed escape is provided by 13.1(26/2) anyway.
13.1(26/2) says that there is no requirement to even support pragma Pack for
such a type.

> I admit that these paragraphs seem to contradict the recommended
> level of support, but I think the bug is there, not in the above
> two paragraphs.

And I disagree; I think the RLS is correct and the above should simply read:

     The component of a packed type need not be aligned according to the
     Alignment of its subtype; in particular it need not be allocated on
     a storage element boundary.

This doesn't require misalignment, it just allows it. The RLS requires it in
some cases, but in those cases there is no requirement to support pragma
Pack.

...
> I suppose it depends on your interpretation of "Pack."  I have
> always taken it as "do as well as you can."  If you really have
> a specific size you need, then specify that with Component_Size,
> or be sure that there is nothing inhibiting the packing, such
> as aliased, by-reference, or atomic components.

Pack is defined to "minimize storage, within reason". No exceptions for
goofy component types; for those you can't minimize storage.

> I agree it is friendly to inform the user if the pack has *no*
> effect, but I wouldn't want to disallow pragma Pack completely
> in the above example, because array of Boolean might use
> 32-bits/component in its absence, if byte-at-a-time access is
> significantly slower than word-at-a-time access on the given
> hardware.

Such hardware is possible, I suppose, but it seems unlikely since it would
perform poorly on C code and thus on standard benchmarks. Moreover, there is
more to overall performance than just the byte access time; all of the
wasted space would cause extra cache pressure and usually would cause the
overall run time to be longer.

After all, the default representation should be best for "typical"
conditions. If your use of a particular type is atypical (you need storage
minimization or performance maximization), then you need to declare the type
appropriately. For storage minimization, that's pragma Pack. For performance
maximization, you have to noodle with 'Alignment and/or 'Component_Size,
which is difficult; it would be useful if Ada had a pragma Fastest (...)
that worked like Pack in reverse (sort of like Pascal unpack) -- space be
damned, give me the fastest possible access to these components.

So, I don't see any value to pragma Pack in your example; if anything, it is
misleading because it does nothing. One of our goals with this amendment,
after all, was to reduce the effects of adding or removing "aliased". I
don't think that adding or removing "aliased" should change representation
if there are rep. clauses (although it might make the rep. clauses
illegal) -- otherwise, a simple maintenance change can introduce
hard-to-find bugs.

Specifically, you're saying that changing:

    type Bit_Vector is array (Positive range <>) of Boolean;
    pragma Pack(Bit_Vector);

to

    type Bit_Vector is array (Positive range <>) of aliased Boolean;
    pragma Pack(Bit_Vector);

will *silently* change the representation. Yuk. I'm pretty sure that we'll
never do that in our compiler...

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  5:31 PM

> will *silently* change the representation. Yuk. I'm pretty sure that we'll
> never do that in our compiler...

So how *will* your compiler handle these two cases?

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  5:47 PM

> So how *will* your compiler handle these two cases?

I presume you're asking about the Ada 2005 update, not the current practice
(without the new 13.1(26/2), we just give warnings that nothing will
happen).

Anyway, in Ada 2005, the first will be accepted, and the second rejected
(based on 13.1(26/2) - this is not confirming). The rejection of the second
one will make the maintenance programmer remove the pragma, and that will
make the change of representation crystal clear.

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006  6:00 PM

> Anyway, in Ada 2005, the first will be accepted, and the second rejected
> (based on 13.1(26/2) - this is not confirming). The rejection of the second
> one will make the maintenance programmer remove the pragma, and that will
> make the change of representation crystal clear.

I'm convinced.  And I think pragma Atomic_Components ought
to work very much like adding "aliased".  So perhaps the only
real change is needed in 13.1(24/2):

     An implementation need not support a nonconfirming representation
     item if it could cause an aliased object or an object of a
     by-reference type to be allocated at a nonaddressable location or,
     when the alignment attribute of the subtype of such an object is
     nonzero, at an address that is not an integral multiple of that
     alignment.

We should probably change "aliased" above to "aliased or atomic."

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  6:15 PM

> We should probably change "aliased" above to "aliased or atomic."

or volatile, you don't want extra reads/writes there either.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 30, 2006  6:21 PM

> We should probably change "aliased" above to "aliased or atomic."

I think we'd want to make that change to 13.1(25/2) and 13.1(26/2), too. We
don't want to force compilers to handle 4-bit atomic record components,
either. (Those could be aligned correctly and still have a size that's too
small.)

****************************************************************

From: Robert I. Eachus
Sent: Thursday, March 30, 2006  7:27 PM

>> On a machine that could independently address bits, the two pragmas 
>> could
>> well coexist, so there is some amount of implementation dependence here.
>
>
> There are almost no such machines!

I totally agree with the language part of this discussion, but many 
hardware ISAs allow read-modify-write access.  If you can do an AND or 
an OR as an RMW instruction, then ORing 16#10# sets the fourth bit of the 
byte, and ANDing 16#EF# resets it.  (There are often advantages to 
doing 32 or 64-bit wide operations instead of byte wide operations, 
especially with modern CPUs, but that is a detail.) Is the RMW 
instruction atomic?  The most interesting case is in the x86 case.  If 
you have a single CPU (or today CPU core) the retirement rules make the 
instructions atomic from the CPU's point of view.  (If an interrupt 
occurs, either the write has completed, or the instruction will be 
restarted.)  What if you have multiple CPUs, multiple cores, or are 
interfacing with an I/O device?  Better mark the memory as UC 
(uncacheable) and use the LOCK prefix on the AND or OR instruction, but 
then it is guaranteed to work.

So I would say that the majority of computers in use do support 
bit-addressable atomic access--as long as the component values 
don't cross quad-word boundaries. (There are lots of other CISC CPU 
designs where this works as well.  The first microprocessor I used it on 
was the M68000, but I had used this trick on many mainframes before then.)

****************************************************************

From: Robert I. Eachus
Sent: Thursday, March 30, 2006  7:53 PM

> So I would say that the majority of computers in use do support 
> bit-addressable atomic access--as long as the component values 
> don't cross quad-word boundaries.

Whoops! I got a bit carried away.  In the x86 ISA you can only do atomic 
loads and stores of a set of all one bits or all zero bits.  Some other 
ISAs do allow arbitrary bit patterns to be substituted.  You can always 
use a locked XOR iff each entry in an array is 'owned' by a different 
thread.

So the changes being discussed are needed for the non-boolean cases.  
However, I would hope that at least the AARM should explain the special 
nature of atomic bit arrays.

****************************************************************

From: Bibb Latting
Sent: Thursday, March 30, 2006 11:46 PM

> So I would say that the majority of computers in use do support
> bit-addressable atomic access--as long as the component values
> don't cross quad-word boundaries. (There are lots of other CISC CPU
> designs where this works as well.  The first microprocessor I used it on
> was the M68000, but I had used this trick on many mainframes before then.)

This is a molecular operation, not an atomic operation for:

   type packed_bits is array (1..N) of boolean;
   pragma pack (packed_bits);
   pragma atomic_components (packed_bits);

    1) RMW assumes that the contents on read are the same as write.  When
       dealing with I/O interfaces, this is not always true.

    2)  Without a data source for the other bits, the operation is not 
        atomic.
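
To make the read-modify-write point concrete, here is a sketch (not
part of the original message; Byte, Bits and Mask are hypothetical
names) of what updating one bit of a packed byte amounts to at the
source level. The whole byte is read, masked, and written back, so
the neighboring bits take part in every update:

```ada
with Ada.Text_IO;

procedure Bit_Update_Sketch is
   type Byte is mod 2**8;

   --  Stands in for eight packed Boolean components; bit 0 starts set.
   Bits : Byte := 2#0000_0001#;

   --  Selects bit 4 of the byte.
   Mask : constant Byte := 16#10#;
begin
   --  Set one component: read the byte, OR in the mask, write it back.
   Bits := Bits or Mask;
   Ada.Text_IO.Put_Line ("after set:  " & Byte'Image (Bits));

   --  Clear it again: read the byte, AND with the complement (16#EF#),
   --  write it back.
   Bits := Bits and not Mask;
   Ada.Text_IO.Put_Line ("after clear:" & Byte'Image (Bits));
end Bit_Update_Sketch;
```

Nothing in this source makes the read and the write a single
indivisible action; that would have to come from the hardware and the
code the compiler generates.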

> Probably would want to change "indivisible" to
> "indivisible and independent" in both of the above paragraphs.

I think this change is worth considering.

****************************************************************

From: Jean-Pierre Rosen
Sent: Friday, March 31, 2006  2:07 AM

Just to spread a little more oil on the fire...
What happens here?

type Tab is array (positive range <>) of boolean;
pragma pack (Tab);

X : Tab (1 ..32);
pragma Atomic_Components (X);

i.e. when a *type* is packed, but an individual *variable* has atomic 
components?

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  5:05 AM

An error message I trust:

> The array_local_name in an Atomic_Components or
> Volatile_Components pragma shall resolve to denote the declaration of an
> array type or an array object of an anonymous type.

Tab don't look anonymous to me :-)

****************************************************************

From: Robert I. Eachus
Sent: Friday, March 31, 2006 11:27 AM

> This is a molecular operation, not an atomic operation for:
>
>   type packed_bits is array (1..N) of boolean;
>   pragma pack (packed_bits);
>   pragma atomic_components (packed_bits);
>
>    1) RMW assumes that the contents on read are the same as write.  When
> dealing with I/O interfaces, this is not always true.

No, you have to follow the prescription exactly.  And although it is 
possible that some chipsets get this wrong, the ISA specifies what is 
done exactly because it is used in interfacing between multiple CPUs, and 
between CPUs and I/O devices.  Oh, and it is about 50 times faster on a Hammer 
(AMD Athlon64, Turion, or Opteron) CPU because all memory access goes 
through CPU caches.  So if the memory is local to the CPU, it just has 
to do the RMW in cache, and any other writes to the location can't 
interrupt.  Technically the cache line containing the array is Owned by 
the thread that executes the locked RMW instruction.  This means that 
the data migrates to the local cache, and the CPU connected to the 
memory has a Shared copy in cache.  (Reads are not an issue, they either 
see the previous  state of the array, or the final state.)

To repeat, on x86, you must use an AND or OR instruction where the first 
argument is the bit array you want treated as atomic.  (The second 
argument--the mask--can be a register or an immediate constant.) You 
must use the LOCK prefix byte, and the page containing the array must be 
marked as uncacheable.  (Yes, Hammer chips cache them anyway, but 
enforce the atomicity rules.  In fact they go a bit further, and don't 
even allow other reads during the few CPU clocks the cycle takes.  If 
you read a Shared cache line, the read causes a cache snoop that can 
invalidate the read, and cause the instruction to be retried.)

>    2)  Without a data source for the other bits, the operation is not 
> atomic. 

Did you miss the fact that you have to use an AND or OR instruction with 
a memory address as the first argument to use the LOCK prefix?  This 
ensures that the read and write are seen as 
atomic by the CPU.  Marking the memory as uncacheable is necessary if 
there are other CPUs and/or I/O devices involved.  This ensures that the 
memory line is locked with Intel CPUs and must be locally Owned by AMD CPUs.

If you really think this doesn't work, look at some driver code.  I've 
avoided giving example programs, because I'd also need to supply 
hardware to test the code.
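
For readers without suitable hardware, the locked AND/OR update described in this message can be approximated in portable C11. This is only a sketch under stated assumptions: the names `packed_bits`, `set_bit`, `clear_bit` and `get_bit` are hypothetical, the array is limited to 32 Boolean components held in one word, and the object lives in ordinary memory. It does not model uncacheable pages or memory-mapped devices. On x86, compilers typically translate `atomic_fetch_or` and `atomic_fetch_and` into LOCK-prefixed OR and AND instructions when the result is unused.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a packed array of 32 Booleans: one atomic word. */
typedef _Atomic uint32_t packed_bits;

/* Set component i with a single indivisible read-modify-write (OR with a mask). */
static void set_bit(packed_bits *a, unsigned i)
{
    assert(i < 32);
    atomic_fetch_or(a, UINT32_C(1) << i);
}

/* Clear component i with a single indivisible read-modify-write (AND with the inverted mask). */
static void clear_bit(packed_bits *a, unsigned i)
{
    assert(i < 32);
    atomic_fetch_and(a, (uint32_t)~(UINT32_C(1) << i));
}

/* Read component i; a reader sees the word either before or after any update. */
static bool get_bit(packed_bits *a, unsigned i)
{
    assert(i < 32);
    return ((atomic_load(a) >> i) & 1u) != 0;
}
```

Each update touches only the bit selected by the mask, but it still has to read the whole word first. That works here only because the memory returns the value last written.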

****************************************************************

From: Bibb Latting
Sent: Friday, March 31, 2006  4:44 PM

> If you really think this doesn't work, look at some driver code.  I've
> avoided giving example programs, because I'd also need to supply hardware 
> to test the code.

I *really* think that this doesn't *always* work.  I understand the 
mechanization of memory access that you describe: indeed today there are 
usually adequate means to obtain exclusive access to a memory element, which 
when combined with suitable cache management allows implementation of 
volatile/atomic accesses.

However, the underlying assumption is that the address  referenced returns 
the last value written.  I'm saying that this isn't always true for memory 
mapped I/O.  An example I encountered was the SCC2692 a number of years ago. 
It was a really *cheap* chip with 16 bytes of address space.  The problem is 
that the chip doesn't have enough address space to provide both read-back of 
control registers and adequate status.  To work around the problem, the 
Read/Write line was multiplexed: when you write to the chip you're accessing 
one register; when you read, you're accessing a different register.  So, 
there are two objects, one for write and another for read, at the *same 
address*.  In terms of C.6, I'm treating (perhaps incorrectly) every 
addressable element as a variable, which becomes "shared" by application of 
volatile/atomic.

****************************************************************

From: Robert I. Eachus
Sent: Friday, March 31, 2006  7:59 PM

Ah!  I guess I mixed you up by going from the general to the specific 
case.  The Intel 8086, 8088, and 80186, were not designed to support 
(demand paged) virtual memory, although it could be done. The Intel 
80286 was designed to do so, but to call the support a kludge is an 
insult to most kludges.  Since the 80386, and in chip years that is a 
long time ago, the mechanism I described has been supported as part of 
the ISA.  Right now the AMD and Intel implementations are very 
different, but the same code will work on all PC compatible CPUs.

There may be non-x86 compatible hardware out there that is not capable 
of correctly doing the (single) bit flipping.  But I think that from a 
language design point of view, we should realize that most CPUs out 
there will support the packed array of Boolean special case.  I would 
rather have the RM require it for Real-Time Annex support, and allow 
compilers for non-conforming hardware to document that. For example, 
there is an erratum for the Itanium2 IA-32 execution layer (#14 on page 
67 of  http://download.intel.com/design/Itanium2/specupdt/25114140.pdf)  
But that just means you shouldn't try to run real-time code in IA-32 
emulation mode on an Itanium2 CPU.  ;-)

Incidentally, notice that there is a lot of magic that goes on in operating 
systems that may prevent a program from doing this bit-twiddling.  
That's fine.  If a program that uses the Real-Time Annex needs special 
permissions, document them and move on.  I personally think that there 
is no reason for an OS not to satisfy a user request for an uncacheable 
(UC) page.  It is necessary for real-time code, and harmless otherwise.  
Especially on the AMD Hammer CPUs, there is no reason to restrict user 
access to UC pages and/or the LOCK prefix.  The actual locking lasts a 
few nanoseconds. (The memory location will be read and ownership, if 
necessary, transferred to the correct CPU and process.  Then the locked 
RMW cycle takes place in the L1 data cache. Unlocked writes to the bit 
array can occur during the change of ownership, but the copy used in the 
RMW cycle is the latest version.)

****************************************************************

From: Randy Brukardt
Sent: Friday, March 31, 2006  8:36 PM

> Ah!  I guess I mixed you up by going from the general to the specific
> case.

No, you missed his point altogether. It doesn't have anything to do with
the CPU!

The point is that memory-mapped hardware often doesn't act like memory at
all; in particular a location may not be readable or writable or (worse)
may return something different when read after writing.

You can't make bit-mapped atomic writing work at all in such circumstances,
no matter what CPU locking is provided. You are suggesting using

    Lock
    Or [Mem],16#10#

to set just the fifth bit atomically, but this cannot work on memory-mapped
hardware that doesn't allow reading! You'll set the other bits to whatever
random junk the read returns, not the correct values.

Now, the question is what this has to do with the language. You seem to want
to insist that compilers support this. But compiler vendors have no control
over what hardware their customers build/use. If your rule was adopted,
about all vendors could do is put "don't use Atomic_Components with
memory-mapped hardware that can only be written" in their manual.

But this is nasty; Atomic and Atomic_Components exist in large part because
of memory-mapped hardware, and here you're trying to tell people to not use
one of them exactly when they are most likely to do so. That doesn't seem to
be a good policy.

It seems better to me to require users to read/write full storage units in
this case, using an appropriate record or array type. There's much less risk
of problems in that case. Funny hardware seems to be quite prevalent
(remember that we had a long discussion on whether an atomic read/write
could read two bytes instead of one word); we have to recognize that.
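
The failure mode described in this message can be illustrated with a small C simulation. It is a sketch, not driver code. The `sim_reg` type and all function names are hypothetical; the register is modelled as two separate values behind one address, as with the multiplexed chip mentioned earlier in the thread. The software-held `shadow` copy is an assumption added for illustration, as one possible way of writing a full storage unit without reading the device.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device register: a write lands in a control latch, while a
   read of the same address returns an unrelated status value. */
typedef struct {
    uint8_t control;  /* last value written; software cannot read it back */
    uint8_t status;   /* value returned by any read of this address */
} sim_reg;

static uint8_t reg_read(const sim_reg *r) { return r->status; }
static void reg_write(sim_reg *r, uint8_t v) { r->control = v; }

/* Read-modify-write, the shape of "Lock / Or [Mem],16#10#": the unmasked
   bits written back come from the read, i.e. from the status value. */
static void rmw_set_bits(sim_reg *r, uint8_t mask)
{
    assert(r);
    reg_write(r, (uint8_t)(reg_read(r) | mask));
}

/* Whole-storage-unit write built from a software-held copy (an assumption),
   so the device is never read. */
static void shadow_set_bits(sim_reg *r, uint8_t *shadow, uint8_t mask)
{
    assert(r && shadow);
    *shadow = (uint8_t)(*shadow | mask);
    reg_write(r, *shadow);
}
```

With `control` holding 16#03# and `status` returning 16#A0#, `rmw_set_bits` with mask 16#10# writes 16#B0# and loses the two control bits, however indivisible the bus cycle is. `shadow_set_bits` writes 16#13#.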

****************************************************************


