CVS difference for ai05s/ai05-0012-1.txt

Differences between 1.3 and version 1.4

--- ai05s/ai05-0012-1.txt	2007/05/24 01:17:55	1.3
+++ ai05s/ai05-0012-1.txt	2011/11/09 01:13:52	1.4
@@ -1,9 +1,10 @@
-!standard 13.2(6.1/2)                                      07-05-20    AI05-0012-1/02
+!standard 13.2(6.1/2)                                      11-11-08    AI05-0012-1/03
 !standard 13.2(7)
 !standard C.6(10)
 !standard C.6(11)
 !standard C.6(21)
 !class binding interpretation 06-03-31
+!status deleted 11-11-08
 !status work item 06-03-31
 !status received 06-03-30
 !priority Medium
@@ -13,1487 +14,17 @@
 
 !summary
 
-This action item resolves the difference in recommended level of support
-for atomic and volatile objects by making the alignment implementation
-advice recommended support and adding a rejection statement for those
-array objects that are packed to a different alignment than that of
-the component's subtype.
+This AI has been moved to Ada 2012 (AI12-0001-1).
 
 !question
 
-The Recommended Level of Support implies that it is required to support pragma
-Pack on types that have Atomic_Components, even to the bit level. Is this the
-intent? (No.) 
-
 !recommendation
 
-Resolve the difference by eliminating C.6 (21) and changing 13.2 (6.1/2)
-to be a recommended level of support where by-reference, aliased, atomic
-and volatile objects must be aligned according to subtype.
-
-Change 13.2(9) to reject packed arrays that require independent addressability,
-but are packed to a different or no alignment.
-
-In C.6(10-11), add "and Independent" after indivisible.
-
-Delete C.6 (21) as it is no longer required.
-
 !wording
 
-13.2 (6.1/2) is moved after 13.2 (7) and changed to:
-
-For a packed type that has a component that is of a by-reference type,
-aliased, volatile, or atomic, the component shall be aligned according to
-the alignment of its subtype; in particular it shall be aligned on a
-storage element boundary.
-
-13.2 (9) append:
-
-If the array component is required to be aligned according to its subtype
-and the results of packing are not so aligned, pragma pack should be rejected.
-
-C.6 (10-11)
-
-Add "and independent" after indivisible.
-
-C.6 (21, AARM 21.a) Delete.
-
 !discussion
 
-Addition of atomic (and volatile) to 13.1 (24-26) was discarded because
-neither pragma is confirming.
-
-Making 13.2 (6.1/2) a Recommended Level of Support makes it a requirement
-when Annex C is supported.  This covers volatile and atomic and eliminates
-the conflict between the Recommended Level of Support and this rule.
-
-Similarly, C.6(21) conflicts with the Recommended Level of Support. We don't want
-the representation of a packed array of Boolean to depend on other keywords
-(like aliased) or pragmas that apply to the type. (That could cause silent
-representation changes during maintenance.) Thus, this rule is deleted.
-
-
-!corrigendum 13.2(6.1/2)
-
-@ddel
-If a packed type has a component that  is not of a by-reference type and has
-no aliased part, then such a component need not be aligned according to the
-Alignment of its subtype; in particular it need not be allocated on
-a storage element boundary.
-
-!corrigendum 13.2(7)
-
-@dinst
-The recommended level of support for pragma Pack is:
-@dinsa
-@xindent<For a packed type that has a component that is of a by-reference type,
-aliased, volatile, or atomic, the component shall be aligned according to
-the alignment of its subtype; in particular it shall be aligned on a
-storage element boundary.>
-
-!corrigendum 13.2(9)
-
-@drepl
-@xbullet<For a packed array type, if the component subtype's Size is less than
-or equal to the word size, and Component_Size is not specified for the type,
-Component_Size should be less than or equal to the Size of the component subtype,
-rounded up to the nearest factor of the word size.>
-@dby
-@xbullet<For a packed array type, if the component subtype's Size is less than
-or equal to the word size, and Component_Size is not specified for the type,
-Component_Size should be less than or equal to the Size of the component subtype,
-rounded up to the nearest factor of the word size. If the array component is
-required to be aligned according to its subtype and the results of packing are
-not so aligned, pragma pack should be rejected.>
-
-!corrigendum C.6(10)
-
-@drepl
-It is illegal to apply either an Atomic or Atomic_Components pragma to an object or type
-if the implementation cannot support the indivisible reads and updates required by the
-pragma (see below).
-@dby
-It is illegal to apply either an Atomic or Atomic_Components pragma to an object or type
-if the implementation cannot support the indivisible and independent reads and updates
-required by the pragma (see below).
-
-!corrigendum C.6(11)
-
-@drepl
-It is illegal to specify the Size attribute of an atomic object, the Component_Size
-attribute for an array type with atomic components, or the layout attributes of an
-atomic component, in a way that prevents the implementation from performing the
-required indivisible reads and updates.
-@dby
-It is illegal to specify the Size attribute of an atomic object, the Component_Size
-attribute for an array type with atomic components, or the layout attributes of an
-atomic component, in a way that prevents the implementation from performing the
-required indivisible and independent reads and updates.
-
-
-!corrigendum C.6(21)
-
-@ddel
-If a pragma Pack applies to a type any of whose subcomponents are atomic, the
-implementation shall not pack the atomic subcomponents more tightly than that
-for which it can support indivisible reads and updates.
-
 !ACATS test
 
-ACATS tests confirming rejection of pragma Pack combined with Atomic_Components
-for small types like Boolean on all targets but bit addressable targets should
-be implemented.
-
 !appendix
-
-From: Jean-Pierre Rosen
-Sent: Friday, February 17, 2006  6:34 AM
-
-A question that arose while designing a rule for AdaControl about shared 
-variables.
-
-If a variable is subject to a pragma Atomic_Components, is it safe for 
-two tasks to update *different* components without synchronization?
-
-C.6 talks only about indivisibility, not independent addressing. Of 
-course, you have to throw 9.10 in...
-
-The whole issue is with the "(or of a neighboring object if the two are 
-not independently addressable)" in 9.10(11), while C.6 (17) says that 
-"Two actions are sequential (see 9.10) if each is the read or update of 
-the same atomic object", but doesn't mention neighboring objects.
-
-In a sense, indivisibility guarantees only that there cannot be 
-temporary incorrect values in a variable due to the fact that the 
-variable is written by more than one memory cycle. The issue *is* 
-different from independent addressability. OTOH, Atomic_Components 
-without independent addressability seems pretty much useless...
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Thursday, March 30, 2006  5:55 AM
-
-Answer seems clear, yes it is safe, provided that independence is
-assured, which means that there is no rep clause that would disturb
-the independence.
-
-If you are suggesting that Atomic Components should guarantee such
-independence, and result in the rejection of rep clauses that would
-compromise it, that seems reasonable, e.g. you have a packed array
-of bits with atomic components, that's definitely peculiar, and
-it seems reasonable to reject it.
-
-****************************************************************
-
-From: Pascal Leroy
-Sent: Thursday, March 30, 2006  6:07 AM
-
-> If a variable is subject to a pragma Atomic_Components, is it safe for 
-> two tasks to update *different* components without synchronization?
-
-I think that 9.10(1) is quite clear: distinct objects are independently
-addressable unless "packing, record layout or Component_Size is
-specified".
-
-So regardless of atomicity, it is always safe to read/update two distinct
-components of an object (in the absence of packing, etc.).  What
-Atomic_Component buys you is that reads/updates of the same component are
-sequential.
-
-****************************************************************
-
-From: Jean-Pierre Rosen
-Sent: Thursday, March 30, 2006  6:17 AM
-
-Of course, my question was in the case of the presence of packing etc.
-
-The answer seems to be no, there is no *additional* implication on 
-addressability due to atomic_components. Correct?
-
-****************************************************************
-
-From: Pascal Leroy
-Sent: Thursday, March 30, 2006  6:25 AM
-
-> Of course, my question was in the case of the presence of packing etc.
-
-In the presence of packing, 9.10(1) says that independent addressability
-is "implementation defined", which is not too helpful.  (This topic was
-discussed a few weeks ago as part of another thread, btw.)
-
-> The answer seems to be no, there is no *additional* implication on 
-> addressability due to atomic_components. Correct?
-
-Right.
-
-****************************************************************
-
-From: Tucker Taft
-Sent: Thursday, March 30, 2006  6:57 AM
-
-The ARG recently disallowed combining a pair of atomic operations
-on distinct objects into a single operation, I believe.
-I would certainly support saying that array-of-aliased
-and array-of-atomic would ensure independence between
-components, even in the presence of other rep-clauses.
-That seems like a reasonable interpretation of what
-atomic means, and "aliased" implies that you can
-have multiple access paths that make no visible use
-of indexing, and hence you would certainly want independence.
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Thursday, March 30, 2006  7:58 AM
-
-> The ARG recently disallowed combining a pair of atomic operations
-> on distinct objects into a single operation, I believe.
-> I would certainly support saying that array-of-aliased
-> and array-of-atomic would ensure independence between
-> components, even in the presence of other rep-clauses.
-
-Wait a moment, then you have to give permission to reject
-these "other rep clauses", you can't insist that they be
-recognized and independence be preserved!
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Thursday, March 30, 2006  8:02 AM
-
-> In the presence of packing, 9.10(1) says that independent addressability
-> is "implementation defined", which is not too helpful.  (This topic was
-> discussed a few weeks ago as part of another thread, btw.)
-
-It seems *really* nasty to make this implementation defined, I hate
-erroneousness being imp defined. Is this a new change, I missed it.
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Thursday, March 30, 2006  8:08 AM
-
-> So regardless of atomicity, it is always safe to read/update two distinct
-> components of an object (in the absence of packing, etc.).  What
-> Atomic_Component buys you is that reads/updates of the same component are
-> sequential.
-
-.. and atomic!
-
-But there is still the issue of something like this
-
-    type X is array (1 .. 8) of Boolean;
-    pragma Pack (X);
-    pragma Atomic_Components (X);
-
-Should one of the two pragmas be ignored, or should one of
-them be rejected, or what? In GNAT we get:
-
-
-a.ads:4:30: warning: Pack canceled, cannot pack atomic components
-
-is that behavior OK? forbidden? mandated?
-(not clear to me at any rate)
-
-****************************************************************
-
-From: Pascal Leroy
-Sent: Thursday, March 30, 2006  8:17 AM
-
-> It seems *really* nasty to make this implementation defined, 
-> I hate erroneousness being imp defined. Is this a new change, 
-> I missed it.
-
-This is not new, it has been like that since Ada 95, and the last time
-this was discussed (around Feb. 24th, thread titled "Independence and
-confirming rep. clauses"), the two of us (at least) agreed that it was
-poor language design.
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Thursday, March 30, 2006  8:25 AM
-
-OK, so I just misremembered here, sorry!
-
-****************************************************************
-
-From: Pascal Leroy
-Sent: Thursday, March 30, 2006  8:25 AM
-
-> is that behavior OK? forbidden? mandated?
-> (not clear to me at any rate)
-
-It's certainly OK to reject any representation item that you don't like.
-However, it appears that the implementation advice about pragma Pack does
-not mention atomicity, so you are not following the advice, and you don't
-comply with Annex C.
-
-On a machine that could independently address bits, the two pragmas could
-well coexist, so there is some amount of implementation dependence here.
-
-For the record Apex also ignores Pack in this example, although it doesn't
-emit a warning.
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Thursday, March 30, 2006  8:41 AM
-
-> It's certainly OK to reject any representation item that you don't like.
-> However, it appears that the implementation advice about pragma Pack does
-> not mention atomicity, so you are not following the advice, and you don't
-> comply with Annex C.
-
-Yes, but it is impossible to comply on virtually all machines
- 
-> On a machine that could independently address bits, the two pragmas could
-> well coexist, so there is some amount of implementation dependence here.
-
-There are almost no such machines!
-
-****************************************************************
-
-From: Tucker Taft
-Sent: Thursday, March 30, 2006  8:21 AM
-
-> Wait a moment, then you have to give permission to reject
-> these "other rep clauses", you can't insist that they be
-> recognized and independence be preserved!
-
-I believe there are already rules that effectively allow that,
-once we make it clear that being atomic also implies being
-independent of neighboring objects.  E.g. C.6(10-11):
-
-    It is illegal to apply either an Atomic or Atomic_Components pragma
-    to an object or type if the implementation cannot support the
-    indivisible reads and updates required by the pragma (see below).
-
-    It is illegal to specify the Size attribute of an atomic object,
-    the Component_Size attribute for an array type with atomic
-    components, or the layout attributes of an atomic component,
-    in a way that prevents the implementation from performing the
-    required indivisible reads and updates.
-
-Probably would want to change "indivisible" to
-"indivisible and independent" in both of the above paragraphs.
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Thursday, March 30, 2006  1:30 PM
-
-SO I guess you would consider my packed example illegal, and the
-warning should be a real illegality?
-
-****************************************************************
-
-From: Tucker Taft
-Sent: Thursday, March 30, 2006  2:24 PM
-
-> SO I guess you would consider my packed example illegal, and the
-> warning should be a real illegality?
-
-Pragma Pack is a little different.  It says "pack as
-tightly as you can, subject to all the other requirements
-imposed on the type."  So you never need to reject a
-pragma Pack.  I could imagine that in the absence of
-a pragma Pack, some implementations might make the following
-array 32-bits/element:
-
-    type Very_Short is new Integer range 0..7;
-    type VS_Array is array(Positive range <>) of Very_Short;
-    pragma Atomic_Components(VS_Array);
-
-but if we add a pragma Pack(VS_Array), I would expect it to be shrunk
-down to 8 bits per component on machines that allow atomic
-reference to bytes.  In the absence of the pragma Atomic_Components,
-I would expect it to be shrunk down to 3 or 4 bits/component.
-
-****************************************************************
-
-From: Gary Dismukes
-Sent: Thursday, March 30, 2006  3:05 PM
-
-> Pragma Pack is a little different.  It says "pack as
-> tightly as you can, subject to all the other requirements
-> imposed on the type."  So you never need to reject a
-> pragma Pack.  I could imagine that in the absence of
-> a pragma Pack, some implementations might make the following
-> array 32-bits/element:
-
-But in the case of Annex C compliance you have to follow the
-recommended level of support, which requires tight packing
-of things like Boolean arrays as I understand it.  There's
-nothing about "subject to other requirements", so it seems
-that one of the pragmas would have to be rejected.
-
-****************************************************************
-
-From: Tucker Taft
-Sent: Thursday, March 30, 2006  3:31 PM
-
-> But in the case of Annex C compliance you have to follow the
-> recommended level of support, which requires tight packing
-> of things like Boolean arrays as I understand it.  There's
-> nothing about "subject to other requirements", so it seems
-> that one of the pragmas would have to be rejected.
-
-Good point.  But an existing AARM note implies there is
-some interplay between a component being aliased and
-the "size of the component subtype":
-
-    Ramification: If a component subtype is aliased, its Size will
-    generally be a multiple of Storage_Unit, so it probably won't
-    get packed very tightly.
-
-This AARM ramification seems totally unjustified, unless we
-presumed that there was some kind of implicit "widening"
-that was occurring on the Size of a component subtype if
-necessary to satisfy other requirements, such as "aliased,"
-"atomic," etc.  But that really doesn't fit with the model,
-since the *subtype* is not aliased, nor is the component
-*subtype* atomic in the case of an Atomic_Components pragma.
-
-So I think we will definitely need to change the words here
-if that is what we want, namely the "tight" packing is not
-required if the components are aliased, by-reference, or
-atomic.
-
-****************************************************************
-
-From: Randy Brukardt
-Sent: Thursday, March 30, 2006  3:37 PM
-
-> But in the case of Annex C compliance you have to follow the
-> recommended level of support, which requires tight packing
-> of things like Boolean arrays as I understand it.  There's
-> nothing about "subject to other requirements", so it seems
-> that one of the pragmas would have to be rejected.
-
-As much as I hate to, I agree with Gary. Indeed, I don't see anything about
-"subject to other requirements" anywhere in 13.2. Here's what the definition
-of Pack is (this has nothing to do with recommended level of support):
-
-"If a type is packed, then the implementation should try to minimize storage
-allocated to objects of the type, possibly at the expense of speed of
-accessing components, subject to reasonable complexity in addressing
-calculations."
-
-I don't see that "reasonable complexity" has anything whatsoever to do with
-"other requirements". And then the Recommended Level of Support pretty much
-defines what "reasonable complexity" means (by allowing rounding up to avoid
-crossing boundaries).
-
-So I agree that one of the pragmas has to be rejected. (I don't think that
-any language change is needed to make that a requirement, either, although
-it would make sense to clarify this so there is no doubt.) A warning (as
-GNAT gives) is wrong for a compiler following Annex C, and unfriendly
-otherwise. Silently doing nothing...I better not go there. :-)
-
-****************************************************************
-
-From: Randy Brukardt
-Sent: Thursday, March 30, 2006  3:50 PM
-
-> So I think we will definitely need to change the words here
-> if that is what we want, namely the "tight" packing is not
-> required if the components are aliased, by-reference, or
-> atomic.
-
-The note was unjustified in Ada 95, but in Ada 2005, we added a blanket
-permission to reject rep. clauses for components of by-reference and aliased
-types unless they are confirming. See 13.1(26/2). Remember that pragma Pack
-is never confirming, so this is the same as saying that it can be rejected
-(but not required to be rejected) for any aliased or by-reference type.
-
-There is even an AARM note (carried over from Ada 95) which notes that
-Atomic_Components has similar restrictions. But it doesn't look like we ever
-considered the interaction of Atomic_Components and other rep. clauses.
-Perhaps it should be included in 13.1(26/2)? (That is, it shouldn't be
-required to support any non-confirming rep. clauses on such a type, but of
-course you can if you want.)
-
-****************************************************************
-
-From: Tucker Taft
-Sent: Thursday, March 30, 2006  4:00 PM
-
-> As much as I hate to, I agree with Gary. Indeed, I don't see anything about
-> "subject to other requirements" anywhere in 13.2....
-
-The new paragraph 13.2(6.1) says:
-
-    If a packed type has a component that is not of a by-reference
-    type and has no aliased part, then such a component need not
-    be aligned according to the Alignment of its subtype; in
-    particular it need not be allocated on a storage element boundary.
-
-This is the part that implies that packing is "subject to
-other requirements."  If we changed "aliased" to "aliased or atomic"
-in the above, I think it would accomplish roughly what I was
-suggesting.  I think you will agree that the above paragraph,
-combined with 13.3(26.3):
-
-     For an object X of subtype S, if S'Alignment is not zero, then
-     X'Alignment is a nonzero integral multiple of S'Alignment unless
-     specified otherwise by a representation item.
-
-implies that in:
-
-     type Aliased_Bit_Vector is
-       array (Positive range <>) of aliased Boolean;
-     pragma Pack(Boolean);
-
-the components should be aligned on Boolean'Alignment boundaries.
-I would think the same thing should apply if Atomic_Components
-is applied to a boolean array.
-
-I admit that these paragraphs seem to contradict the recommended
-level of support, but I think the bug is there, not in the above
-two paragraphs.
-
- > ...
-> So I agree that one of the pragmas has to be rejected. (I don't think that
-> any language change is needed to make that a requirement, either, although
-> it would make sense to clarify this so there is no doubt.) A warning (as
-> GNAT gives) is wrong for a compiler following Annex C, and unfriendly
-> otherwise. Silently doing nothing...I better not go there. :-)
-
-I suppose it depends on your interpretation of "Pack."  I have
-always taken it as "do as well as you can."  If you really have
-a specific size you need, then specify that with Component_Size,
-or be sure that there is nothing inhibiting the packing, such
-as aliased, by-reference, or atomic components.
-
-I agree it is friendly to inform the user if the pack has *no*
-effect, but I wouldn't want to disallow pragma Pack completely
-in the above example, because array of Boolean might use
-32-bits/component in its absence, if byte-at-a-time access is
-significantly slower than word-at-a-time access on the given
-hardware.
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Thursday, March 30, 2006  4:32 PM
-
-> I suppose it depends on your interpretation of "Pack."  I have
-> always taken it as "do as well as you can."  If you really have
-> a specific size you need, then specify that with Component_Size,
-> or be sure that there is nothing inhibiting the packing, such
-> as aliased, by-reference, or atomic components.
-
-Well you can interpret it that way if you like, but it is not
-the definition in the language, which says that for arrays with
-1,2,4 bit components, pragma Pack works as expected!
- 
-> I agree it is friendly to inform the user if the pack has *no*
-> effect, but I wouldn't want to disallow pragma Pack completely
-> in the above example, because array of Boolean might use
-> 32-bits/component in its absence, if byte-at-a-time access is
-> significantly slower than word-at-a-time access on the given
-> hardware.
-
-I think that is wrong in this case, since pragma Pack for Boolean
-has precise well defined semantics, and must make the component
-size 1, it does not mean, do-as-well-as-you-can.
-
-****************************************************************
-
-From: Randy Brukardt
-Sent: Thursday, March 30, 2006  5:18 PM
-
->      type Aliased_Bit_Vector is
->        array (Positive range <>) of aliased Boolean;
->      pragma Pack(Boolean);
->
-> the components should be aligned on Boolean'Alignment boundaries.
-> I would think the same thing should apply if Atomic_Components
-> is applied to a boolean array.
-
-Well, in your example, the pragma should be rejected because the type isn't
-local. But I presume you meant "pragma Pack(Aliased_Bit_Vector);".
-
-I see your point, but all it says to me is that the new paragraph shouldn't
-be conditional. The needed escape is provided by 13.1(26/2) anyway.
-13.1(26/2) says that there is no requirement to even support pragma Pack for
-such a type.
-
-> I admit that these paragraphs seem to contradict the recommended
-> level of support, but I think the bug is there, not in the above
-> two paragraphs.
-
-And I disagree; I think the RLS is correct and the above should simply read:
-
-     The component of a packed type need not be aligned according to the
-     Alignment of its subtype; in particular it need not be allocated on
-     a storage element boundary.
-
-This doesn't require misalignment, it just allows it. The RLS requires it in
-some cases, but in those cases there is no requirement to support pragma
-Pack.
-
-...
-> I suppose it depends on your interpretation of "Pack."  I have
-> always taken it as "do as well as you can."  If you really have
-> a specific size you need, then specify that with Component_Size,
-> or be sure that there is nothing inhibiting the packing, such
-> as aliased, by-reference, or atomic components.
-
-Pack is defined to "minimize storage, within reason". No exceptions for
-goofy component types; for those you can't minimize storage.
-
-> I agree it is friendly to inform the user if the pack has *no*
-> effect, but I wouldn't want to disallow pragma Pack completely
-> in the above example, because array of Boolean might use
-> 32-bits/component in its absence, if byte-at-a-time access is
-> significantly slower than word-at-a-time access on the given
-> hardware.
-
-Such hardware is possible, I suppose, but it seems unlikely since it would
-perform poorly on C code and thus on standard benchmarks. Moreover, there is
-more to overall performance than just the byte access time; all of the
-wasted space would cause extra cache pressure and usually would cause the
-overall run time to be longer.
-
-After all, the default representation should be best for "typical"
-conditions. If your use of a particular type is atypical (you need storage
-minimization or performance maximization), then you need to declare the type
-appropriately. For storage minimization, that's pragma Pack. For time
-minimization, you have to noodle with 'Alignment and/or 'Component_Size,
-which is difficult; it would be useful if Ada had a pragma Fastest (...)
-that worked like Pack in reverse (sort of like Pascal unpack) -- space be
-damned, give me the fastest possible access to these components.
-
-So, I don't see any value to pragma Pack in your example; if anything, it is
-misleading because it does nothing. One of our goals with this amendment,
-after all, was to reduce the effects of adding or removing "aliased". I
-don't think that adding or removing "aliased" should change representation
-if there are rep. clauses (although it might make the rep. clauses
-illegal) -- otherwise, a simple maintenance change can introduce
-hard-to-find bugs.
-
-Specifically, you're saying that changing:
-
-    type Bit_Vector is array (Positive range <>) of Boolean;
-    pragma Pack(Bit_Vector);
-
-to
-
-    type Bit_Vector is array (Positive range <>) of aliased Boolean;
-    pragma Pack(Bit_Vector);
-
-will *silently* change the representation. Yuk. I'm pretty sure that we'll
-never do that in our compiler...
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Thursday, March 30, 2006  5:31 PM
-
-> will *silently* change the representation. Yuk. I'm pretty sure that we'll
-> never do that in our compiler...
-
-So how *will* your compiler handle these two cases?
-
-****************************************************************
-
-From: Randy Brukardt
-Sent: Thursday, March 30, 2006  5:47 PM
-
-> So how *will* your compiler handle these two cases?
-
-I presume you're asking about the Ada 2005 update, not the current practice
-(without the new 13.1(26/2), we just give warnings that nothing will
-happen).
-
-Anyway, in Ada 2005, the first will be accepted, and the second rejected
-(based on 13.1(26/2) - this is not confirming). The rejection of the second
-one will make the maintenance programmer remove the pragma, and that will
-make the change of representation crystal clear.
-
-****************************************************************
-
-From: Tucker Taft
-Sent: Thursday, March 30, 2006  6:00 PM
-
-> Anyway, in Ada 2005, the first will be accepted, and the second rejected
-> (based on 13.1(26/2) - this is not confirming). The rejection of the second
-> one will make the maintenance programmer remove the pragma, and that will
-> make the change of representation crystal clear.
-
-I'm convinced.  And I think pragma Atomic_Components ought
-to work very much like adding "aliased".  So perhaps the only
-real change is needed in 13.1(24/2):
-
-     An implementation need not support a nonconfirming representation
-     item if it could cause an aliased object or an object of a
-     by-reference type to be allocated at a nonaddressable location or,
-     when the alignment attribute of the subtype of such an object is
-     nonzero, at an address that is not an integral multiple of that
-     alignment.
-
-We should probably change "aliased" above to "aliased or atomic."
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Thursday, March 30, 2006  6:15 PM
-
-> We should probably change "aliased" above to "aliased or atomic."
-
-or volatile, you don't want extra reads/writes there either.
-
-****************************************************************
-
-From: Randy Brukardt
-Sent: Thursday, March 30, 2006  6:21 PM
-
-> We should probably change "aliased" above to "aliased or atomic."
-
-I think we'd want to make that change to 13.1(25/2) and 13.1(26/2), too. We
-don't want to force compilers to handle 4-bit atomic record components,
-either. (Those could be aligned correctly and still have a size that's too
-small.)
-
-****************************************************************
-
-From: Robert I. Eachus
-Sent: Thursday, March 30, 2006  7:27 PM
-
->> On a machine that could independently address bits, the two pragmas 
->> could
->> well coexist, so there is some amount of implementation dependence here.
->
->
-> There are almost no such machines!
-
-I totally agree with the language part of this discussion, but many 
-hardware ISAs allow read-modify-write access.  If you can do an AND or 
-an OR as an RMW instruction, then ORing 16#10# sets the fourth bit of the 
-byte, and ANDing 16#EF# resets it.  (There are often advantages to 
-doing 32 or 64-bit wide operations instead of byte wide operations, 
-especially with modern CPUs, but that is a detail.) Is the RMW 
-instruction atomic?  The most interesting case is in the x86 case.  If 
-you have a single CPU (or today CPU core) the retirement rules make the 
-instructions atomic from the CPU's point of view.  (If an interrupt 
-occurs, either the write has completed, or the instruction will be 
-restarted.)  What if you have multiple CPUs, multiple cores, or are 
-interfacing with an I/O device?  Better mark the memory as UC 
-(uncacheable) and use the LOCK prefix on the AND or OR instruction, but 
-then it is guaranteed to work.
-
-So I would say that the majority of  computers in use do support  
-bit-addressable atomic access support--as long as the component values 
-don't cross quad-word boundaries. (There are lots of other CISC CPU 
-designs where this works as well.  The first microprocessor I used it on 
-was the M68000, but I had used this trick on many mainframes before then.)
-
-****************************************************************
-
-From: Robert I. Eachus
-Sent: Thursday, March 30, 2006  7:53 PM
-
-> So I would say that the majority of  computers in use do support  
-> bit-addressable atomic access support--as long as the component values 
-> don't cross quad-word boundaries.
-
-Whoops! I got a bit carried away.  In the x86 ISA you can only do atomic 
-loads and stores of a set of all one bits or all zero bits.  Some other 
-ISAs do allow arbitrary bit patterns to be substituted.  You can always 
-use a locked XOR iff each entry in an array is 'owned' by a different 
-thread.
-
-So the changes being discussed are needed for the non-boolean cases.  
-However, I would hope that at least the AARM should explain the special 
-nature of atomic bit arrays.
-
-****************************************************************
-
-From: Bibb Latting
-Sent: Thursday, March 30, 2006 11:46 PM
-
-> So I would say that the majority of  computers in use do support
-> bit-addressable atomic access support--as long as the component values
-> don't cross quad-word boundaries. (There are lots of other CISC CPU
-> designs where this works as well.  The first microprocessor I used it on
-> was the M68000, but I had used this trick on many mainframes before then.)
-
-This is a molecular operation, not an atomic operation for:
-
-   type packed_bits is array (1..N) of boolean;
-   pragma pack (packed_bits);
-   pragma atomic_components (packed_bits);
-
-    1) RMW assumes that the contents on read are the same as write.  When
-       dealing with I/O interfaces, this is not always true.
-
-    2)  Without a data source for the other bits, the operation is not 
-        atomic.
-
-> Probably would want to change "indivisible" to
-> "indivisible and independent" in both of the above paragraphs.
-
-I think this change is worth considering.
-
-****************************************************************
-
-From: Jean-Pierre Rosen
-Sent: Friday, March 31, 2006  2:07 AM
-
-Just to spread a little more oil on the fire...
-What happens here?
-
-type Tab is array (positive range <>) of boolean;
-pragma pack (Tab);
-
-X : Tab (1 ..32);
-pragma Atomic_Components (X);
-
-i.e. when a *type* is packed, but an individual *variable* has atomic 
-components?
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Thursday, March 30, 2006  5:05 AM
-
-An error message I trust:
-
-> The array_local_name in an Atomic_Components or
-> Volatile_Components pragma shall resolve to denote the declaration of an
-> array type or an array object of an anonymous type.
-
-Tab don't look anonymous to me :-)
-
-****************************************************************
-
-From: Robert I. Eachus
-Sent: Friday, March 31, 2006 11:27 AM
-
-> This is a molecular operation, not an atomic operation for:
->
->   type packed_bits is array (1..N) of boolean;
->   pragma pack (packed_bits);
->   pragma atomic_components (packed_bits);
->
->    1) RMW assumes that the contents on read are the same as write.  When
-> dealing with I/O interfaces, this is not always true.
-
-No, you have to follow the prescription exactly.  And although it is 
-possible that some chipsets get this wrong, the ISA specifies what is 
-done exactly because it is used in interfacing between multiple CPUs and 
-CPUs and I/O devices.  Oh, and it is about 50 times faster on a Hammer 
-(AMD Athlon64, Turion, or Opteron) CPU because all memory access goes 
-through CPU caches.  So if the memory is local to the CPU, it just has 
-to do the RMW in cache, and any other writes to the location can't 
-interrupt.  Technically the cache line containing the array is Owned by
-the thread that executes the locked RMW instruction.  This means that 
-the data migrates to the local cache, and the CPU connected to the 
-memory has a Shared copy in cache.  (Reads are not an issue, they either 
-see the previous  state of the array, or the final state.)
-
-To repeat, on x86, you must use an AND or OR instruction where the first 
-argument is the bit array you want treated as atomic.  (The second 
-argument--the mask--can be a register or an immediate constant.) You 
-must use the LOCK prefix byte, and the page containing the array must be 
-marked as uncacheable.  (Yes, Hammer chips cache them anyway, but 
-enforce the atomicity rules.  In fact they go a bit further, and don't 
-even allow other reads during the few CPU clocks the cycle takes.  If 
-you read a Shared cache line, the read causes a cache snoop that can 
-invalidate the read, and cause the instruction to be retried.)
-
->    2)  Without a data source for the other bits, the operation is not 
-> atomic. 
-
-Did you miss the fact that you have to use an AND or OR instruction with 
-a memory address as the first argument to
-use the LOCK prefix?   This ensures that the read and write are seen as
-atomic by the CPU.  Marking the memory as uncacheable is necessary if 
-there are other CPUs and/or I/O devices involved.  This ensures that the 
-memory line is locked with Intel CPUs and must be locally Owned by AMD CPUs.
-
-If you really think this doesn't work, look at some driver code.  I've
-avoided giving example programs, because I'd also need to supply 
-hardware to test the code.
-
-****************************************************************
-
-From: Bibb Latting
-Sent: Friday, March 31, 2006  4:44 PM
-
-> If you really think this doesn't work, look at some driver code.  I've
-> avoided giving example programs, because I'd also need to supply hardware 
-> to test the code.
-
-I *really* think that this doesn't *always* work.  I understand the 
-mechanization of memory access that you describe: indeed today there are 
-usually adequate means to obtain exclusive access to a memory element, which 
-when combined with suitable cache management allows implementation of 
-volatile/atomic accesses.
-
-However, the underlying assumption is that the address  referenced returns 
-the last value written.  I'm saying that this isn't always true for memory 
-mapped I/O.  An example I encountered was the SCC2692 a number of years ago. 
-It was a really *cheap* chip with 16 bytes of address space.  The problem is 
-that the chip doesn't have enough address space to provide both read-back of 
-control registers and adequate status.  To work around the problem, the 
-Read/Write line was multiplexed: when you write to the chip you're accessing 
-one register; when you read, you're accessing a different register.  So, 
-there are two objects, one for write and another for read, at the *same 
-address*.  In terms of C.6, I'm treating (perhaps incorrectly) every 
-addressable element as a variable, which becomes "shared" by application of 
-volatile/atomic.
-
-****************************************************************
-
-From: Robert I. Eachus
-Sent: Friday, March 31, 2006  7:59 PM
-
-Ah!  I guess I mixed you up by going from the general to the specific 
-case.  The Intel 8086, 8088, and 80186, were not designed to support 
-(demand paged) virtual memory, although it could be done. The Intel 
-80286 was designed to do so, but to call the support a kludge is an 
-insult to most kludges.  Since the 80386, and in chip years that is a 
-long time ago, the mechanism I described has been supported as part of 
-the ISA.  Right now the AMD and Intel implementations are very 
-different, but the same code will work on all PC compatible CPUs.
-
-There may be non-x86 compatible hardware out there that is not capable 
-of correctly doing the (single) bit flipping.  But I think that from a 
-language design point of view, we should realize that most CPUs out 
-there will support the packed array of Boolean special case.  I would
-rather have the RM require it for Real-Time Annex support, and allow 
-compilers for non-conforming hardware to document that. For example, 
-there is an errata for the Itanium2 IA-32 execution layer (#14 on page 
-67 of  http://download.intel.com/design/Itanium2/specupdt/25114140.pdf)  
-But that just means you shouldn't try to run real-time code in IA-32 
-emulation mode on an Itanium2 CPU.  ;-)
-
-Incidentally, notice that there is a lot of magic that goes on in operating
-systems that may prevent a program from doing this bit-twiddling.  
-That's fine.  If a program that uses the Real-Time Annex needs special 
-permissions, document them and move on.  I personally think that there 
-is no reason for an OS not to satisfy a user request for an uncacheable 
-(UC) page.  It is necessary for real-time code, and harmless otherwise.  
-Especially on the AMD Hammer CPUs, there is no reason to restrict user 
-access to UC pages and/or the LOCK prefix.  The actual locking lasts a 
-few nanoseconds. (The memory location will be read, ownership, if 
-necessary transferred to the correct CPU and process.  Then the locked 
-RMW cycle takes place in the L1 data cache. Unlocked writes to the bit 
-array can occur during the change of ownership, but the copy used in the 
-RMW cycle is the latest version.)
-
-****************************************************************
-
-From: Randy Brukardt
-Sent: Friday, March 31, 2006  8:36 PM
-
-> Ah!  I guess I mixed you up by going from the general to the specific
-> case.
-
-No, you missed his point altogether. It doesn't have anything to do with
-the CPU!
-
-The point is that memory-mapped hardware often doesn't act like memory at
-all; in particular a location may not be readable or writable or (worse)
-may return something different when read after writing.
-
-You can't make bit-mapped atomic writing work at all in such circumstances,
-no matter what CPU locking is provided. You are suggesting using
-
-    Lock
-    Or [Mem],16#10#
-
-to set just the fifth bit atomically, but this cannot work on memory-mapped
-hardware that doesn't allow reading! You'll set the other bits to whatever
-random junk, not the correct values.
-
-Now, the question is what this has to do with the language. You seem to want
-to insist that compilers support this. But compiler vendors have no control
-over what hardware their customers build/use. If your rule was adopted,
-about all vendors could do is put "don't use Atomic_Components with
-memory-mapped hardware that can only be written" in their manual.
-
-But this is nasty; Atomic and Atomic_Components exist in large part because
-of memory-mapped hardware, and here you're trying to tell people to not use
-one of them exactly when they are most likely to do so. That doesn't seem to
-be a good policy.
-
-It seems better to me to require users to read/write full storage units in
-this case, using an appropriate record or array type. There's much less risk
-of problems in that case. Funny hardware seems to be quite prevalent
-(remember that we had a long discussion on whether an atomic read/write
-could read two bytes instead of one word), we have to recognize that.
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Saturday, April  1, 2006  3:05 AM
-
-> There may be non-x86 compatible hardware out there that is not capable 
-> of correctly doing the (single) bit flipping.  But I think that from a 
-> language design point of view, we should realize that most CPUs out 
-> there will support the packed array of Boolean special case.
-
-I must say I am puzzled, what code do you have in mind for
-supporting
-
-    type x is array (1 .. 8) of Boolean;
-    pragma Pack (x);
-    pragma Atomic_Components (x);
-    ...
-    ...
-    x (j) := k;
-
-this seems really messy to me
-
-****************************************************************
-
-From: Robert A. Duff
-Sent: Saturday, April  1, 2006  8:41 AM
-
-> I *really* think that this doesn't *always* work.  I understand the
-> mechanization of memory access that you describe: indeed today there are
-> usually adequate means to obtain exclusive access to a memory element, which
-> when combined with suitable cache management allows implementation of
-> volatile/atomic accesses.
-
-That makes sense.  I never thought packed bitfields could be atomic.
-
-But I'm confused.
-
-Atomic implies volatile, by C.6(8), "...In addition, every atomic type or
-object is also defined to be volatile."  Then C.6(20) says:
-
-  20    {external effect (volatile/atomic objects) [partial]} The external
-  effect of a program (see 1.1.3) is defined to include each read and update of
-  a volatile or atomic object. The implementation shall not generate any memory
-  reads or updates of atomic or volatile objects other than those specified by
-  the program.
-
-(where "volatile or atomic" means "volatile [or atomic]").
-
-Packed bitfields CAN be volatile.  But if we want to write upon a packed
-bitfield, we must read a whole word first, on most hardware (whether by an
-explicit load into a register, or an implicit read like the "LOCK OR"
-instruction Robert Eachus mentioned).  Right?  So how can one implement
-volatile bitfields in the way required by C.6(20)?
-
-Then C.6(22/2) says:
-
-                            Implementation Advice
-
-  22/2  {AI95-00259-01} A load or store of a volatile object whose size is a
-  multiple of System.Storage_Unit and whose alignment is nonzero, should be
-  implemented by accessing exactly the bits of the object and no others.
-
-(where "volatile" means "volatile [or atomic]", this time ;-)).
-
-Is this not implied by C.6(20)?  Obviously, I misunderstand what C.6(20) is
-(intended to) mean.
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Saturday, April  1, 2006  8:49 AM
-
-> Packed bitfields CAN be volatile.  But if we want to write upon a packed
-> bitfield, we must read a whole word first, on most hardware (whether by an
-> explicit load into a register, or an implicit read like the "LOCK OR"
-> instruction Robert Eachus mentioned).  Right?  So how can one implement
-> volatile bitfields in the way required by C.6(20)?
-
-If C.6(20) requires volatile bit-fields, it is just junk. Implementors
-don't pay attention to junk :-)
-
-****************************************************************
-
-From: Tucker Taft
-Sent: Saturday, April  1, 2006  9:38 AM
-
-My interpretation of C.6(20) would be:
-
-   If the program includes an update to a bit field, and
-   that requires a read/modify/write sequence on the given
-   hardware, then that is not a violation of the requirement
-   that:
-
-       The implementation shall not generate any memory
-       reads or updates of atomic or volatile objects other
-       than those specified by the program.
-
-   The read/modify/write sequence has been "specified" by the
-   program.  If the bit fields were atomic, then that would
-   require that the read/modify/write sequence be "indivisible."
-
-To take advantage of C.6(20) to deal with "active" memory
-locations, I think the programmer has to know whether the
-hardware requires a read/modify/write sequence for the given
-size of object.  If so, then they better be sure that that
-sequence works for their memory-mapped device.  It is not
-clear how you can say enough in the reference manual to make
-all of this portable.  Hardware differs enough that this
-will require some issues that can't realistically be addressed
-without hardware-specific documentation.
-
-****************************************************************
-
-From: Robert A. Duff
-Sent: Saturday, April  1, 2006  9:43 AM
-
-> If C.6(20) requires volatile bit-fields, it is just junk. Implementors
-> don't pay attention to junk :-)
-
-Well, it's apparently the intent, given this AARM annotation:
-
-    22.b/2 Reason: Since any object can be a volatile object, including packed
-          array components and bit-mapped record components, we require the
-          above only when it is reasonable to assume that the machine can
-          avoid accessing bits outside of the object.
-
-I also just noticed:
-
-21    If a pragma Pack applies to a type any of whose subcomponents are
-atomic, the implementation shall not pack the atomic subcomponents more
-tightly than that for which it can support indivisible reads and updates.
-
-which seems to answer the original question.  (Sorry if somebody already
-pointed this out, and I missed it.)  Note that (21) is for atomic, not
-volatile.
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Saturday, April  1, 2006  10:12 AM
-
->   The read/modify/write sequence has been "specified" by the
->   program.  If the bit fields were atomic, then that would
->   require that the read/modify/write sequence be "indivisible."
-
-I really think that's strange, to me if you have a volatile
-variable, then reads should be reads and writes should be
-writes.
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Saturday, April  1, 2006  10:13 AM
-
->> If C.6(20) requires volatile bit-fields, it is just junk. Implementors
->> don't pay attention to junk :-)
-> 
-> Well, it's apparently the intent, given this AARM annotation:
-> 
->     22.b/2 Reason: Since any object can be a volatile object, including packed
->           array components and bit-mapped record components, we require the
->           above only when it is reasonable to assume that the machine can
->           avoid accessing bits outside of the object.
-
-How does this compare with the C rules, for interest? It seems obvious
-to me that volatile in Ada should mean the same as volatile in C.
-
-****************************************************************
-
-From: Robert A. Duff
-Sent: Saturday, April  1, 2006 10:55 AM
-
-I agree.  C doesn't have packed arrays, but it does have arrays of bytes
-(char), which might require a read to write deep down in the hardware.
-It has bitfields in structs.  I'm not sure what the rules are for "volatile",
-but I have heard people claim that whatever they are, even the C language
-lawyers can't understand them and/or don't agree on what they mean, neither
-formally nor informally.  ;-)
-
-****************************************************************
-
-From: Tucker Taft
-Sent: Saturday, April  1, 2006 11:57 AM
-
-Here's what the GNU C reference manual says about volatile:
-
-     The volatile qualifier tells the compiler to not optimize
-     use of the variable by storing its value in a cache, but
-     rather to fetch its value afresh each time it is used.
-     Depending on the application, volatile variables may be
-     modified autonomously by external hardware devices.
-
-So they are focusing on requiring that no caching is performed.
-They make no mention of reading or writing *more* than is specified
-by the program.  They want to be sure you don't read any *less*
-than specified.
-
-As far as atomic, some versions of C have sig_atomic_t, which is
-an integer type that is atomic with respect to asynchronous
-interrupts (i.e. signals).  As far as I know, there is no
-such thing as an atomic bit field in C.
-
-****************************************************************
-
-From: Robert A. Duff
-Sent: Saturday, April  1, 2006  3:34 PM
-
-Thanks for looking that up.  Interesting.
-
-Of course "GNU C" is not "the C standard".  And of course, there are different
-versions of the C standard that might be relevant.  I'm too lazy to look it
-up, and anyway, I suppose I'd have to fork over hundreds of dollars to ISO to
-do so?
-
-I agree with your earlier comment, that given all the myriad hardware out
-there, we cannot hope to nail down every detail in the definitions of atomic
-and volatile.
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Saturday, April  1, 2006  4:25 PM
-
-I disagree, we can have a clear semantic model (especially critical for
-atomic), and if hardware cannot accommodate this model, then the pragma
-must be rejected. So I think that is far too pessimistic.
-
-****************************************************************
-
-From: Randy Brukardt
-Sent: Saturday, April  1, 2006  6:01 PM
-
-That's certainly true for Atomic. But Volatile must always be accepted (there is
-no rule that it can be rejected based on the characteristics of the type), and
-the model is that compilers do their best to implement it, whatever that is.
-
-We added Implementation Advice (in Ada 2005) to avoid reading/writing extra bits,
-so any cases where that happens have to be documented. That should be enough
-encouragement to avoid it when possible. But we still want to allow any object
-to be volatile. (This was all discussed extensively with AI-259.) Indeed, this is
-the only significant difference between Atomic and Volatile -- otherwise there
-wouldn't be a need for both.
-
-****************************************************************
-
-From: Robert Dewar
-Sent: Saturday, April  1, 2006  7:32 PM
-
-> That's certainly true for Atomic. But Volatile must always be accepted (there is
-> no rule that it can be rejected based on the characteristics of the type),
-
-well then that's an obvious mistake, and sure there is such a rule,
-you don't have to do anything if it's not practical to do so.
-
-> and the model is that compilers do their best to implement it, whatever that is.
-
-I find that model absurd
- 
-> We added Implementation Advice (in Ada 2005) to avoid reading/writing extra
-> bits,
-
-well of course this should be a fundamental requirement of volatile to me
-
-> so any cases where that happens have to be documented. That should be enough
-> encouragement to avoid it when possible. But we still want to allow any object
-> to be volatile. (This was all discussed extensively with AI-259.) Indeed, this is
-> the only significant difference between Atomic and Volatile -- otherwise there
-> wouldn't be a need for both.
-
-I cannot believe you just said that!! Of course there is a need for
-both, they serve totally different functions.
-
-The point of atomic is that the read or write can be done in a single
-instruction. *That's* what distinguishes volatile from atomic. This
-allows various synchronization algorithms based on shared variables.
-See Norm Shulman's PhD thesis for a very thorough treatment of
-this subject.
-
-So for example, an array of ten integers can be volatile, but it
-takes ten reads to read it, so it cannot be atomic.
-
-Or for a concrete example, if you have a bounded buffer with a
-reader and a writer not explicitly synchronized, then the buffer
-must be volatile, otherwise the algorithm obviously fails, but
-the head and tail pointers must be atomic (otherwise the
-algorithm fails because of race conditions). These two needs
-are quite quite different.
-
-The idea that a single bit in a bit array that has to be assigned
-with a read/mask/store sequence can be called volatile seems
-completely silly to me.
-
-Fortunately, as far as I can tell, this nonsense language lawyering
-has zero effect on an implementation.
-
-****************************************************************
-
-From: Robert I. Eachus
-Sent: Sunday, April  2, 2006  3:32 AM
-
-> The point of atomic is that the read or write can be done in a single
-> instruction. *That's* what distinguishes volatile from atomic. This
-> allows various synchronization algorithms based on shared variables.
-> See Norm Shulman's PhD thesis for a very thorough treatment of
-> this subject.
-
-I seem to have missed a day of Sturm und Drang.  But Robert Dewar put
-his finger on the semantic disconnects.  With modern hardware, bit
-vectors can be *atomic*--updated with a single, uninterruptible CPU
-instruction, that is also atomic from the point of view of the memory 
-system.   Note that the cache manipulations that go on to cause this to 
-occur may be complex, but from our language lawyer point of view, all 
-that matters is the result.  On modern hardware, a read may result in 
-256 bytes being loaded into cache.  Not an issue for atomic, as long as 
-changes to the object are atomic from the point of view of the 
-programmer.  That means that the meaning of atomic may be different in
-a compiler that supports Annex D.  Of course, now that dual-core CPUs
-are becoming more common, all compilers may have to ensure that atomic
-works in the presence of multiple CPUs or CPU cores.  (And I/O devices 
-as well.)
-
-I may have started the confusion by saying that to get atomic behavior 
-in any x86 multiple core environment, you have to ensure that the bit 
-array is stored in UC (uncacheable) memory.  But in this case, that has
-nothing to do with volatile--and on the AMD Hammer processors nothing to
-do with whether or not the bit array can be cached!  It is just that the
-ISA only requires uninterruptible semantics for UC memory.  Or to turn
-that around, not all memory need support atomic updates, but memory must 
-be marked UC for the LOCK prefix to have the expected semantics.  (Well, 
-there are circumstances where the OS will handle the exception and 
-provide the expected semantics, but that is more likely to involve server
-virtualization than memory that is actually unlockable.)
-
-> So for example, an array of ten integers can be volatile, but it
-> takes ten reads to read it, so it cannot be atomic.
->
-> Or for a concrete example, if you have a bounded buffer with a
-> reader and a writer not explicitly synchronized, then the buffer
-> must be volatile, otherwise the algorithm obviously fails, but
-> the head and tail pointers must be atomic (otherwise the
-> algorithm fails because of race conditions). These two needs
-> are quite quite different.
-
-I hope everyone now understands atomic, because this example shows how 
-complex volatile has become!  There is the type of hardware volatile
-memory that Bibb Latting was talking about.  However, modern hardware 
-doesn't do single-bit reads and writes.  Hardware switches and status 
-bits are collected into registers.  A particular register may have bits 
-that are not writeable, and when you write a (32-bit?) word to that 
-location, only the settable bits are changed.  Where these registers are
-internal to the CPU, they usually require special instructions to read 
-or write them. With I/O devices, the registers will be addressable as 
-memory, but again the semantics of reading from and/or writing to those 
-locations is going to be hardware specific.  In a perfect world, these 
-operations will all be provided as well documented code-inserts or 
-intrinsic functions.
-
-In the case above, volatile has a much different--but also
-necessary--meaning.  Whether or not the data is cached is not 
-important--well it is important if you need speed.  What is important is 
-that all CPU cores, (Ada tasks, and) hardware processes see the same
-data.  At this point I really need to talk about cache coherency 
-strategies.  AMD uses MOESI (Modified, Owned, Exclusive, Shared, 
-Invalid), while Intel uses MESI (skip the Owner state).  What Robert 
-Dewar's example above needs (in the MESI case) is that the bounded 
-buffer *and* the head and tail pointers must be marked as Shared, or as
-Modified in one cache, and Invalid in all others.  The MOESI protocol 
-allows one copy to be marked as Owned, and the others to be either 
-Shared or Invalid.
-
-In the AMD MOESI implementation, updating the owner's copy causes any 
-other copies to first be marked Invalid before the write to the Owned 
-copy completes, then the new value will be broadcast to the other chips 
-and cores.  Those that have a (now Invalid) cached copy will update it 
-and mark it again Shared.  What if you want to write to a Shared copy?  
-You must first take Ownership. MESI is faster if the next CPU to update 
-the Shared data is random, MOESI Owner state is much, much faster if 
-most updates are localized.  (In other words, the CPU (core) that last 
-updated the object is most likely to be the next updater.)
-
-Maybe we need to resurrect pragma Shared for this case, and use Volatile
-to imply the hardware case.  Notice that with modern hardware, if all 
-you need is the Shared cache state, then you will often get much better 
-performance, if you write the code that way.  (Using Volatile where
-Shared is appropriate will generate correct but pessimistic code.)  This 
-is a case where the hardware is evolving and we need the language to 
-evolve to match.  Right now, you need an AMD Hammer CPU to get major 
-speedups, but Intel's Conroe will have a shared L2 cache between cores, 
-and each core will be able to access data in the L1 data cache of the 
-other core.  In fact, it may be worthwhile to create real code for 
-Robert Dewar's example, and time it in various hardware configurations.  
-The difference can be a factor of thirty or more.
-
-And by the way, since modern CPUs manage data in cache lines, it is 
-worth knowing the sizes of those lines.  Intel uses 256 byte lines in 
-their L2 and L3 caches, but some Intel CPUs have 64-byte L1 data cache 
-lines.  AMD uses 64 byte cache lines throughout.  However, in practice 
-there is little if any difference.  AMD's CPUs typically request two 
-cache lines (128 bytes) and only terminate the request after the first 
-line if there is another pending request.  Intel requests 256 bytes, but 
-will stop after 128 bytes if there is a pending request.  (Intel's L2 
-cache lines can store a half-line, with the other half empty.)
-
-Both AMD and Intel support 'uncached' reads and writes intended to avoid 
-cache pollution.  But the smallest guaranteed read or write amount is 
-128 bits (16 bytes). So any x86 compiler that allows pragma Volatile for
-in-memory objects smaller than 16 bytes is probably living in a state of
-sin. ;-)
-
-****************************************************************
-
-From: Jean-Pierre Rosen
-Sent: Sunday, April  2, 2006  4:25 PM
-
-> I also just noticed:
-> 
-> 21    If a pragma Pack applies to a type any of whose subcomponents are
-> atomic, the implementation shall not pack the atomic subcomponents more
-> tightly than that for which it can support indivisible reads and updates.
-> 
-> which seems to answer the original question.  
-Not really.
-The question was about independent addressability. You can have 
-indivisible updates without independent addressability.
-
-****************************************************************
-
-From: Tucker Taft
-Sent: Monday, May 21, 2007  8:11 AM
-
-You must not use "must" in an ISO standard.
-You shall use "shall" instead... ;-)
-
-(Although you didn't violate this one,
-you may not use "may not" either.  You
-shall use "shall not" or you might use
-"might not" instead.)
-
-> ...
-> !wording
-> 
-> 13.2 (6.1/2) is renumbered 13.2 (7.1/3) and reads:
-> 
-> For a packed type that has a component that is of a by-reference type,
-> aliased, volatile or atomic, the component must be aligned according to
-
-Please fully "comma-ize" lists of more than two elements.  Hence,
-"... volatile, or atomic, ..."
-
-> the alignment of its subtype; in particular it must be aligned on a
-> storage element boundary.
-
-Why does this last part follow?  Can't a subtype have an alignment
-of zero?
-
-> 
-> 13.2 (9) append:
-> 
-> If the array component must be aligned according to its subtype and the
-> results of packing are not so aligned, pragma pack should be rejected.
-
-This is worded somewhat ambiguously, here using "must" when probably
-some other word would make more sense.
-
-
-[Editor's note: These editorial changes were made in version /02 of the AI.]
-
-****************************************************************
 
