!standard 13.2(6.1/2)                                  13-08-27  AI12-0001-1/06
!standard 13.2(7)
!standard 13.2(8)
!standard 13.2(9/3)
!standard C.6(8.1/3)
!standard C.6(10)
!standard C.6(11)
!standard C.6(21)
!standard C.6(24)
!class binding interpretation 06-03-31
!status Corrigendum 2015 13-07-08
!status WG9 Approved 13-11-15
!status ARG Approved 6-0-3  13-06-14
!status work item 06-03-31
!status received 06-03-30
!priority Medium
!difficulty Medium
!qualifier Omission
!subject Independence and Representation clauses for atomic objects

!summary

[Editor's note: This AI was carried over from Ada 2005.]

Pack doesn't require tight packing in infeasible cases (atomic, aliased, by-reference types, independent addressability).

!question

The Recommended Level of Support implies that it is required to support pragma Pack on types that have Atomic_Components, even to the bit level. Is this the intent? (No.)

!recommendation

Resolve the difference by eliminating C.6(21) and changing 13.2(6.1/2) to be a recommended level of support where by-reference, aliased, and atomic objects must be aligned according to subtype. Change 13.2(8 and 9) to add an exception for components that have alignment requirements as detailed above.

In C.6(8.1/3), add "and aliased objects" after "atomic objects". In C.6(10-11), add "and independent" after "indivisible".

Delete C.6(21) as it is no longer required.

!wording

Modify AARM 9.10(1.d/3):

Ramification: An atomic object (including atomic components) is always independently addressable from any other nonoverlapping object. {Aspect_specifications and representation items cannot change that fact.} [Any aspect_specification or representation item which would prevent this from being true should be rejected, notwithstanding what this Standard says elsewhere.] Note, however, that the components of an atomic object are not necessarily atomic.

Add an AARM Language Design Principle after 13.2(1):

If the default representation already uses minimal storage for a particular type, aspect Pack may not cause any representation change. It follows that aspect Pack should always be allowed, even when it has no effect on representation.

As a consequence, the chosen representation for a packed type may change during program maintenance even if the type is unchanged (in particular, if other representation aspects change on a part of the type). This is different than the behavior of most other representation aspects, whose properties remain guaranteed no matter what changes are made to other aspects.

Therefore, aspect Pack should not be used to achieve a representation required by external criteria. For instance, setting Component_Size to 1 should be preferred over using aspect Pack to ensure an array of bits. If future maintenance would make the array components aliased, independent, or atomic, the program would become illegal if Component_Size is used (immediately identifying a problem) while the aspect Pack version would simply change representations (probably causing a hard-to-find bug).

End Language Design Principle.

Delete 13.2(6.1/2), which currently says:

If a packed type has a component that is not of a by-reference type and has no aliased part, then such a component need not be aligned according to the Alignment of its subtype; in particular it need not be allocated on a storage element boundary.
Add a new bullet after 13.2(7/3):

* Any component of a packed type that is of a by-reference type, that is specified as independently addressable, or that contains an aliased part, shall be aligned according to the alignment of its subtype.

AARM Ramification: This also applies to atomic components. "Atomic" implies "specified as independently addressable", so we don't need to mention atomic here.

Other components do not have to follow the alignment of the subtype when packed; in many cases, the Recommended Level of Support will require the alignment to be ignored. End AARM Ramification.

Modify 13.2(8):

* For a packed record type, the components should be packed as tightly as possible subject to {the above alignment requirements,} the Sizes of the component subtypes, and [subject to] any record_representation_clause that applies to the type; the implementation may, but need not, reorder components or cross aligned word boundaries to improve the packing. A component whose Size is greater than the word size may be allocated an integral number of words.

Modify 13.2(9/3):

* For a packed array type, if the Size of the component subtype is less than or equal to the word size, Component_Size should be less than or equal to the Size of the component subtype, rounded up to the nearest factor of the word size {, unless this would violate the above alignment requirements}.

Delete AARM 13.2(9.a), because the new alignment requirement above makes it clear:

Ramification: If a component subtype is aliased, its Size will generally be a multiple of Storage_Unit, so it probably won't get packed very tightly.

Modify C.6(8.1/3):

When True, the aspects Independent and Independent_Components specify as independently addressable the named object or component(s), or in the case of a type, all objects or components of that type. All atomic objects {and aliased objects} are considered to be specified as independently addressable.

Add "and independent" to C.6(10/3-11), twice:

It is illegal to specify either of the aspects Atomic or Atomic_Components to have the value True for an object or type if the implementation cannot support the indivisible {and independent} reads and updates required by the aspect (see below).

It is illegal to specify the Size attribute of an atomic object, the Component_Size attribute for an array type with atomic components, or the layout attributes of an atomic component, in a way that prevents the implementation from performing the required indivisible {and independent} reads and updates.

Delete C.6(21/3) and the associated AARM note, because the new alignment requirement above covers this case:

If the Pack aspect is True for a type any of whose subcomponents are atomic, the implementation shall not pack the atomic subcomponents more tightly than that for which it can support indivisible reads and updates.

Implementation Note: Usually, specifying aspect Pack for such a type will be illegal as the Recommended Level of Support cannot be achieved; otherwise, a warning might be appropriate if no packing whatsoever can be achieved.

Add a new note after C.6(24):

Specifying the Pack aspect cannot override the effect of specifying an Atomic or Atomic_Components aspect.

!discussion

The idea of Pack is that if it's infeasible to pack a given component tightly (because it is atomic, aliased, of a by-reference type, or has independent addressability), then Pack is not illegal; it just doesn't pack as tightly as it might without the atomic, volatile, etc.
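[Editor's note: The following sketch is not part of the proposed wording; it merely illustrates the intended model with hypothetical declarations. The representations actually chosen are target-dependent.]

   type Plain_Bits is array (1 .. 32) of Boolean
      with Pack;                 -- Expected to give Component_Size = 1.

   type Aliased_Bits is array (1 .. 32) of aliased Boolean
      with Pack;                 -- Legal, but the new 13.2 bullet requires each component
                                 -- to be aligned per Boolean'Alignment, so little or no
                                 -- packing occurs.

   type Atomic_Flags is array (1 .. 32) of Boolean
      with Pack, Atomic_Components;
                                 -- Legal; Pack may have no effect, since each component
                                 -- must still allow indivisible and independent access
                                 -- (typically at least one storage element per component).

   type Forced_Bits is array (1 .. 32) of Boolean
      with Atomic_Components, Component_Size => 1;
                                 -- Illegal on typical hardware by C.6(11): 1-bit components
                                 -- cannot be read and updated indivisibly and independently.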
This was always the intent, but the Recommended Level of Support (RLS) contradicted it. Making the alignment requirement part of the Recommended Level of Support eliminates the conflict between the RLS and the intent.

Note that we require that aliased objects are always independently addressable. We want dereferences to always be task-safe in this way; modifying an object through a dereference will never clobber some adjacent component (even momentarily).

!corrigendum 13.2(6.1/2)

@ddel
If a packed type has a component that is not of a by-reference type and has no aliased part, then such a component need not be aligned according to the Alignment of its subtype; in particular it need not be allocated on a storage element boundary.

!corrigendum 13.2(7)

@dinsa
The recommended level of support for pragma Pack is:
@dinst
@xbullet<Any component of a packed type that is of a by-reference type, that is specified as independently addressable, or that contains an aliased part, shall be aligned according to the alignment of its subtype.>

!corrigendum 13.2(8)

@drepl
@xbullet<For a packed record type, the components should be packed as tightly as possible subject to the Sizes of the component subtypes, and subject to any @fa<record_representation_clause> that applies to the type; the implementation may, but need not, reorder components or cross aligned word boundaries to improve the packing. A component whose Size is greater than the word size may be allocated an integral number of words.>
@dby
@xbullet<For a packed record type, the components should be packed as tightly as possible subject to the above alignment requirements, the Sizes of the component subtypes, and any @fa<record_representation_clause> that applies to the type; the implementation may, but need not, reorder components or cross aligned word boundaries to improve the packing. A component whose Size is greater than the word size may be allocated an integral number of words.>

!corrigendum 13.2(9/3)

@drepl
@xbullet<For a packed array type, if the Size of the component subtype is less than or equal to the word size, Component_Size should be less than or equal to the Size of the component subtype, rounded up to the nearest factor of the word size.>
@dby
@xbullet<For a packed array type, if the Size of the component subtype is less than or equal to the word size, Component_Size should be less than or equal to the Size of the component subtype, rounded up to the nearest factor of the word size, unless this would violate the above alignment requirements.>

!corrigendum C.6(8.1/3)

@drepl
When True, the aspects Independent and Independent_Components @i<specify as independently addressable> the named object or component(s), or in the case of a type, all objects or components of that type. All atomic objects are considered to be specified as independently addressable.
@dby
When True, the aspects Independent and Independent_Components @i<specify as independently addressable> the named object or component(s), or in the case of a type, all objects or components of that type. All atomic objects and aliased objects are considered to be specified as independently addressable.

!corrigendum C.6(10)

@drepl
It is illegal to specify either of the aspects Atomic or Atomic_Components to have the value True for an object or type if the implementation cannot support the indivisible reads and updates required by the aspect (see below).
@dby
It is illegal to specify either of the aspects Atomic or Atomic_Components to have the value True for an object or type if the implementation cannot support the indivisible and independent reads and updates required by the aspect (see below).

!corrigendum C.6(11)

@drepl
It is illegal to specify the Size attribute of an atomic object, the Component_Size attribute for an array type with atomic components, or the layout attributes of an atomic component, in a way that prevents the implementation from performing the required indivisible reads and updates.
@dby
It is illegal to specify the Size attribute of an atomic object, the Component_Size attribute for an array type with atomic components, or the layout attributes of an atomic component, in a way that prevents the implementation from performing the required indivisible and independent reads and updates.

!corrigendum C.6(21)

@ddel
If a pragma Pack applies to a type any of whose subcomponents are atomic, the implementation shall not pack the atomic subcomponents more tightly than that for which it can support indivisible reads and updates.
!corrigendum C.6(24)

@dinsa
@xindent<@s9>
@dinst
@xindent<@s9<10 Specifying the Pack aspect cannot override the effect of specifying an Atomic or Atomic_Components aspect.>>

!ACATS test

There might be value in checking that Pack is allowed in all cases, even when it has no effect on the representation. For instance, combining aspect Pack with Atomic_Components for small types like Boolean should always work (but do nothing on most targets). (Test CXC6003 included such a case; this case has been removed from the test pending the outcome of this AI, and most likely this should be a separate test.)

!ASIS

No ASIS impact.

!appendix

From: Jean-Pierre Rosen
Sent: Friday, February 17, 2006 6:34 AM

A question that arose while designing a rule for AdaControl about shared variables. If a variable is subject to a pragma Atomic_Components, is it safe for two tasks to update *different* components without synchronization?

C.6 talks only about indivisibility, not independent addressing. Of course, you have to throw 9.10 in... The whole issue is with the "(or of a neighboring object if the two are not independently addressable)" in 9.10(11), while C.6 (17) says that "Two actions are sequential (see 9.10) if each is the read or update of the same atomic object", but doesn't mention neighboring objects.

In a sense, indivisibility guarantees only that there cannot be temporary incorrect values in a variable due to the fact that the variable is written by more than one memory cycle. The issue *is* different from independent addressability. OTOH, Atomic_Components without independent addressability seems pretty much useless...

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006 5:55 AM

Answer seems clear, yes it is safe, provided that independence is assured, which means that there is no rep clause that would disturb the independence.

If you are suggesting that Atomic_Components should guarantee such independence, and result in the rejection of rep clauses that would compromise it, that seems reasonable, e.g. you have a packed array of bits with atomic components, that's definitely peculiar, and it seems reasonable to reject it.

****************************************************************

From: Pascal Leroy
Sent: Thursday, March 30, 2006 6:07 AM

> If a variable is subject to a pragma Atomic_Components, is it safe for
> two tasks to update *different* components without synchronization?

I think that 9.10(1) is quite clear: distinct objects are independently addressable unless "packing, record layout or Component_Size is specified". So regardless of atomicity, it is always safe to read/update two distinct components of an object (in the absence of packing, etc.). What Atomic_Components buys you is that reads/updates of the same component are sequential.

****************************************************************

From: Jean-Pierre Rosen
Sent: Thursday, March 30, 2006 6:17 AM

Of course, my question was in the case of the presence of packing etc. The answer seems to be no, there is no *additional* implication on addressability due to atomic_components. Correct?

****************************************************************

From: Pascal Leroy
Sent: Thursday, March 30, 2006 6:25 AM

> Of course, my question was in the case of the presence of packing etc.

In the presence of packing, 9.10(1) says that independent addressability is "implementation defined", which is not too helpful.
(This topic was discussed a few weeks ago as part of another thread, btw.)

> The answer seems to be no, there is no *additional* implication on
> addressability due to atomic_components. Correct?

Right.

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006 6:57 AM

The ARG recently disallowed combining a pair of atomic operations on distinct objects into a single operation, I believe.

I would certainly support saying that array-of-aliased and array-of-atomic would ensure independence between components, even in the presence of other rep-clauses. That seems like a reasonable interpretation of what atomic means, and "aliased" implies that you can have multiple access paths that make no visible use of indexing, and hence you would certainly want independence.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006 7:58 AM

> The ARG recently disallowed combining a pair of atomic operations
> on distinct objects into a single operation, I believe.
> I would certainly support saying that array-of-aliased
> and array-of-atomic would ensure independence between
> components, even in the presence of other rep-clauses.

Wait a moment, then you have to give permission to reject these "other rep clauses", you can't insist that they be recognized and independence be preserved!

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006 8:02 AM

> In the presence of packing, 9.10(1) says that independent addressability
> is "implementation defined", which is not too helpful. (This topic was
> discussed a few weeks ago as part of another thread, btw.)

It seems *really* nasty to make this implementation defined, I hate erroneousness being imp defined. Is this a new change, I missed it.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006 8:08 AM

> So regardless of atomicity, it is always safe to read/update two distinct
> components of an object (in the absence of packing, etc.). What
> Atomic_Components buys you is that reads/updates of the same component are
> sequential.

.. and atomic!

But there is still the issue of something like this

   type X is array (1 .. 8) of Boolean;
   pragma Pack (X);
   pragma Atomic_Components (X);

Should one of the two pragmas be ignored, or should one of them be rejected, or what?

In GNAT we get:

   a.ads:4:30: warning: Pack canceled, cannot pack atomic components

is that behavior OK? forbidden? mandated? (not clear to me at any rate)

****************************************************************

From: Pascal Leroy
Sent: Thursday, March 30, 2006 8:17 AM

> It seems *really* nasty to make this implementation defined,
> I hate erroneousness being imp defined. Is this a new change,
> I missed it.

This is not new, it has been like that since Ada 95, and the last time this was discussed (around Feb, 24th, thread titled "Independence and confirming rep. clauses"), the two of us (at least) agreed that it was poor language design.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006 8:25 AM

OK, so I just misremembered here, sorry!

****************************************************************

From: Pascal Leroy
Sent: Thursday, March 30, 2006 8:25 AM

> is that behavior OK? forbidden? mandated?
> (not clear to me at any rate)

It's certainly OK to reject any representation item that you don't like.
However, it appears that the implementation advice about pragma Pack does not mention atomicity, so you are not following the advice, and you don't comply with Annex C. On a machine that could independently address bits, the two pragmas could well coexist, so there is some amount of implementation dependence here. For the record Apex also ignores Pack in this example, although it doesn't emit a warning. **************************************************************** From: Robert Dewar Sent: Thursday, March 30, 2006 8:41 AM > It's certainly OK to reject any representation item that you don't like. > However, it appears that the implementation advice about pragma Pack does > not mention atomicity, so you are not following the advice, and you don't > comply with Annex C. Yes, but it is impossible to comply on virtually all machines > On a machine that could independently address bits, the two pragmas could > well coexist, so there is some amount of implementation dependence here. There are almost no such machines! **************************************************************** From: Tucker Taft Sent: Thursday, March 30, 2006 8:21 AM > Wait a moment, then you have to give permission to reject > these "other rep clauses", you can't insist that they be > recognized and independence be preserved! I believe there are already rules that effectively allow that, once we make it clear that being atomic also implies being independent of neighboring objects. E.g. C.6(10-11): It is illegal to apply either an Atomic or Atomic_Components pragma to an object or type if the implementation cannot support the indivisible reads and updates required by the pragma (see below). It is illegal to specify the Size attribute of an atomic object, the Component_Size attribute for an array type with atomic components, or the layout attributes of an atomic component, in a way that prevents the implementation from performing the required indivisible reads and updates. Probably would want to change "indivisible" to "indivisible and independent" in both of the above paragraphs. **************************************************************** From: Robert Dewar Sent: Thursday, March 30, 2006 1:30 PM SO I guess you would consider my packed example illegal, and the warning should be a real illegality? **************************************************************** From: Tucker Taft Sent: Thursday, March 30, 2006 2:24 PM > SO I guess you would consider my packed example illegal, and the > warning should be a real illegality? Pragma Pack is a little different. It says "pack as tightly as you can, subject to all the other requirements imposed on the type." So you never need to reject a pragma Pack. I could imagine that in the absence of a pragma Pack, some implementations might make the following array 32-bits/element: type Very_Short is new Integer range 0..7; type VS_Array is array(Positive range <>) of Very_Short; pragma Atomic_Components(VS_Array); but if we add a pragma Pack(VS_Array), I would expect it to be shrunk down to 8 bits per component on machines that allow atomic reference to bytes. In the absence of the pragma Atomic_Components, I would expect it to be shrunk down to 3 or 4 bits/component. **************************************************************** From: Gary Dismukes Sent: Thursday, March 30, 2006 3:05 PM > Pragma Pack is a little different. It says "pack as > tightly as you can, subject to all the other requirements > imposed on the type." So you never need to reject a > pragma Pack. 
> I could imagine that in the absence of
> a pragma Pack, some implementations might make the following
> array 32-bits/element:

But in the case of Annex C compliance you have to follow the recommended level of support, which requires tight packing of things like Boolean arrays as I understand it. There's nothing about "subject to other requirements", so it seems that one of the pragmas would have to be rejected.

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006 3:31 PM

> But in the case of Annex C compliance you have to follow the
> recommended level of support, which requires tight packing
> of things like Boolean arrays as I understand it. There's
> nothing about "subject to other requirements", so it seems
> that one of the pragmas would have to be rejected.

Good point. But an existing AARM note implies there is some interplay between a component being aliased and the "size of the component subtype":

Ramification: If a component subtype is aliased, its Size will generally be a multiple of Storage_Unit, so it probably won't get packed very tightly.

This AARM ramification seems totally unjustified, unless we presumed that there was some kind of implicit "widening" that was occurring on the Size of a component subtype if necessary to satisfy other requirements, such as "aliased," "atomic," etc. But that really doesn't fit with the model, since the *subtype* is not aliased, nor is the component *subtype* atomic in the case of an Atomic_Components pragma.

So I think we will definitely need to change the words here if that is what we want, namely the "tight" packing is not required if the components are aliased, by-reference, or atomic.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 30, 2006 3:37 PM

> But in the case of Annex C compliance you have to follow the
> recommended level of support, which requires tight packing
> of things like Boolean arrays as I understand it. There's
> nothing about "subject to other requirements", so it seems
> that one of the pragmas would have to be rejected.

As much as I hate to, I agree with Gary. Indeed, I don't see anything about "subject to other requirements" anywhere in 13.2. Here's what the definition of Pack is (this has nothing to do with recommended level of support):

"If a type is packed, then the implementation should try to minimize storage allocated to objects of the type, possibly at the expense of speed of accessing components, subject to reasonable complexity in addressing calculations."

I don't see that "reasonable complexity" has anything whatsoever to do with "other requirements". And then the Recommended Level of Support pretty much defines what "reasonable complexity" means (by allowing rounding up to avoid crossing boundaries).

So I agree that one of the pragmas has to be rejected. (I don't think that any language change is needed to make that a requirement, either, although it would make sense to clarify this so there is no doubt.) A warning (as GNAT gives) is wrong for a compiler following Annex C, and unfriendly otherwise. Silently doing nothing...I better not go there. :-)

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 30, 2006 3:50 PM

> So I think we will definitely need to change the words here
> if that is what we want, namely the "tight" packing is not
> required if the components are aliased, by-reference, or
> atomic.
The note was unjustified in Ada 95, but in Ada 2005, we added a blanket permission to reject rep. clauses for components of by-reference and aliased types unless they are confirming. See 13.1(26/2). Remember that pragma Pack is never confirming, so this is the same as saying that it can be rejected (but not required to be rejected) for any aliased or by-reference type. There is even an AARM note (carried over from Ada 95) which notes that Atomic_Components has similar restrictions. But it doesn't look like we ever considered the interaction of Atomic_Components and other rep. clauses. Perhaps it should be included in 13.1(26/2)? (That is, it shouldn't be required to support any non-confirming rep. clauses on such a type, but of course you can if you want.) **************************************************************** From: Tucker Taft Sent: Thursday, March 30, 2006 4:00 PM > As much as I hate to, I agree with Gary. Indeed, I don't see anything about > "subject to other requirements" anywhere in 13.2.... The new paragraph 13.2(6.1) says: If a packed type has a component that is not of a by-reference type and has no aliased part, then such a component need not be aligned according to the Alignment of its subtype; in particular it need not be allocated on a storage element boundary. This is the part that implies that packing is "subject to other requirements." If we changed "aliased" to "aliased or atomic" in the above, I think it would accomplish roughly what I was suggesting. I think you will agree that the above paragraph, combined with 13.3(26.3): For an object X of subtype S, if S'Alignment is not zero, then X'Alignment is a nonzero integral multiple of S'Alignment unless specified otherwise by a representation item. implies that in: type Aliased_Bit_Vector is array (Positive range <>) of aliased Boolean; pragma Pack(Boolean); the components should be aligned on Boolean'Alignment boundaries. I would think the same thing should apply if Atomic_Components is applied to a boolean array. I admit that these paragraphs seem to contradict the recommended level of support, but I think the bug is there, not in the above two paragraphs. > ... > So I agree that one of the pragmas has to be rejected. (I don't think that > any language change is needed to make that a requirement, either, although > it would make sense to clarify this so there is no doubt.) A warning (as > GNAT gives) is wrong for a compiler following Annex C, and unfriendly > otherwise. Silently doing nothing...I better not go there. :-) I suppose it depends on your interpretation of "Pack." I have always taken it as "do as well as you can." If you really have a specific size you need, then specify that with Component_Size, or be sure that there is nothing inhibiting the packing, such as aliased, by-reference, or atomic components. I agree it is friendly to inform the user if the pack has *no* effect, but I wouldn't want to disallow pragma Pack completely in the above example, because array of Boolean might use 32-bits/component in its absence, if byte-at-a-time access is significantly slower than word-at-a-time access on the given hardware. **************************************************************** From: Robert Dewar Sent: Thursday, March 30, 2006 4:32 PM > I suppose it depends on your interpretation of "Pack." I have > always taken it as "do as well as you can." 
If you really have > a specific size you need, then specify that with Component_Size, > or be sure that there is nothing inhibiting the packing, such > as aliased, by-reference, or atomic components. Well you can interpret it that way if you like, but it is not the definition in the language, which says that for arrays with 1,2,4 bit components, pragma Pack works as expected! > I agree it is friendly to inform the user if the pack has *no* > effect, but I wouldn't want to disallow pragma Pack completely > in the above example, because array of Boolean might use > 32-bits/component in its absence, if byte-at-a-time access is > significantly slower than word-at-a-time access on the given > hardware. I think that is wrong in this case, since pragma Pack for Boolean has precise well defined semantics, and must make the component size 1, it does not mean, do-as-well-as-you-can. **************************************************************** From: Randy Brukardt Sent: Thursday, March 30, 2006 5:18 PM > type Aliased_Bit_Vector is > array (Positive range <>) of aliased Boolean; > pragma Pack(Boolean); > > the components should be aligned on Boolean'Alignment boundaries. > I would think the same thing should apply if Atomic_Components > is applied to a boolean array. Well, in your example, the pragma should be rejected because the type isn't local. But I presume you meant "pragma Pack(Aliased_Bit_Vector);". I see your point, but all it says to me is that the new paragraph shouldn't be conditional. The needed escape is provided by 13.1(26/2) anyway. 13.1(26/2) says that there is no requirement to even support pragma Pack for such a type. > I admit that these paragraphs seem to contradict the recommended > level of support, but I think the bug is there, not in the above > two paragraphs. And I disagree; I think the RLS is correct and the above should simply read: The component of a packed type need not be aligned according to the Alignment of its subtype; in particular it need not be allocated on a storage element boundary. This doesn't require misalignment, it just allows it. The RLS requires it in some cases, but in those cases there is no requirement to support pragma Pack. ... > I suppose it depends on your interpretation of "Pack." I have > always taken it as "do as well as you can." If you really have > a specific size you need, then specify that with Component_Size, > or be sure that there is nothing inhibiting the packing, such > as aliased, by-reference, or atomic components. Pack is defined to "minimize storage, within reason". No exceptions for goofy component types; for those you can't minimize storage. > I agree it is friendly to inform the user if the pack has *no* > effect, but I wouldn't want to disallow pragma Pack completely > in the above example, because array of Boolean might use > 32-bits/component in its absence, if byte-at-a-time access is > significantly slower than word-at-a-time access on the given > hardware. Such hardware is possible, I suppose, but it seems unlikely since it would perform poorly on C code and thus on standard benchmarks. Moreover, there is more to overall performance than just the byte access time; all of the wasted space would cause extra cache pressure and usually would cause the overall run time to be longer. After all, the default representation should be best for "typical" conditions. If your use of a particular type is atypical (you need storage minimization or performance maximization), then you need to declare the type appropriately. 
For storage minimization, that's pragma Pack. For time maximization, you have to noodle with 'Alignment and/or 'Component_Size, which is difficult; it would be useful if Ada had a pragma Fastest (...) that worked like Pack in reverse (sort of like Pascal unpack) -- space be damned, give me the fastest possible access to these components. So, I don't see any value to pragma Pack in your example; if anything, it is misleading because it does nothing. One of our goals with this amendment, after all, was to reduce the effects of adding or removing "aliased". I don't think that adding or removing "aliased" should change representation if there are rep. clauses (although it might make the rep. clauses illegal) -- otherwise, a simple maintenance change can introduce hard-to-find bugs. Specifically, you're saying that changing: type Bit_Vector is array (Positive range <>) of Boolean; pragma Pack(Bit_Vector); to type Bit_Vector is array (Positive range <>) of aliased Boolean; pragma Pack(Bit_Vector); will *silently* change the representation. Yuk. I'm pretty sure that we'll never do that in our compiler... **************************************************************** From: Robert Dewar Sent: Thursday, March 30, 2006 5:31 PM > will *silently* change the representation. Yuk. I'm pretty sure that we'll > never do that in our compiler... So how *will* your compiler handle these two cases? **************************************************************** From: Randy Brukardt Sent: Thursday, March 30, 2006 5:47 PM > So how *will* your compiler handle these two cases? I presume you're asking about the Ada 2005 update, not the current practice (without the new 13.1(26/2), we just give warnings that nothing will happen). Anyway, in Ada 2005, the first will be accepted, and the second rejected (based on 13.1(26/2) - this is not confirming). The rejection of the second one will make the maintenance programmer remove the pragma, and that will make the change of representation crystal clear. **************************************************************** From: Tucker Taft Sent: Thursday, March 30, 2006 6:00 PM > Anyway, in Ada 2005, the first will be accepted, and the second rejected > (based on 13.1(26/2) - this is not confirming). The rejection of the second > one will make the maintenance programmer remove the pragma, and that will > make the change of representation crystal clear. I'm convinced. And I think pragma Atomic_Components ought to work very much like adding "aliased". So perhaps the only real change is needed in 13.1(24/2): An implementation need not support a nonconfirming representation item if it could cause an aliased object or an object of a by-reference type to be allocated at a nonaddressable location or, when the alignment attribute of the subtype of such an object is nonzero, at an address that is not an integral multiple of that alignment. We should probably change "aliased" above to "aliased or atomic." **************************************************************** From: Robert Dewar Sent: Thursday, March 30, 2006 6:15 PM > We should probably change "aliased" above to "aliased or atomic." or volatile, you don't want extra reads/writes there either. **************************************************************** From: Randy Brukardt Sent: Thursday, March 30, 2006 6:21 PM > We should probably change "aliased" above to "aliased or atomic." I think we'd want to make that change to 13.1(25/2) and 13.1(26/2), too. 
We don't want to force compilers to handle 4-bit atomic record components, either. (Those could be aligned correctly and still have a size that's too small.)

****************************************************************

From: Robert I. Eachus
Sent: Thursday, March 30, 2006 7:27 PM

>> On a machine that could independently address bits, the two pragmas could
>> well coexist, so there is some amount of implementation dependence here.
>
> There are almost no such machines!

I totally agree with the language part of this discussion, but many hardware ISAs allow read-modify-write access. If you can do an AND or an OR as an RMW instruction, then ORing 16#10# sets the fourth bit of the byte, and ANDing 16#EF# resets it. (There are often advantages to doing 32 or 64-bit wide operations instead of byte wide operations, especially with modern CPUs, but that is a detail.)

Is the RMW instruction atomic? The most interesting case is in the x86 case. If you have a single CPU (or today CPU core) the retirement rules make the instructions atomic from the CPU's point of view. (If an interrupt occurs, either the write has completed, or the instruction will be restarted.) What if you have multiple CPUs, multiple cores, or are interfacing with an I/O device? Better mark the memory as UC (uncacheable) and use the LOCK prefix on the AND or OR instruction, but then it is guaranteed to work.

So I would say that the majority of computers in use do support bit-addressable atomic access--as long as the component values don't cross quad-word boundaries. (There are lots of other CISC CPU designs where this works as well. The first microprocessor I used it on was the M68000, but I had used this trick on many mainframes before then.)

****************************************************************

From: Robert I. Eachus
Sent: Thursday, March 30, 2006 7:53 PM

> So I would say that the majority of computers in use do support
> bit-addressable atomic access--as long as the component values
> don't cross quad-word boundaries.

Whoops! I got a bit carried away. In the x86 ISA you can only do atomic loads and stores of a set of all one bits or all zero bits. Some other ISAs do allow arbitrary bit patterns to be substituted. You can always use a locked XOR iff each entry in an array is 'owned' by a different thread.

So the changes being discussed are needed for the non-boolean cases. However, I would hope that at least the AARM should explain the special nature of atomic bit arrays.

****************************************************************

From: Bibb Latting
Sent: Thursday, March 30, 2006 11:46 PM

> So I would say that the majority of computers in use do support
> bit-addressable atomic access--as long as the component values
> don't cross quad-word boundaries. (There are lots of other CISC CPU
> designs where this works as well. The first microprocessor I used it on
> was the M68000, but I had used this trick on many mainframes before then.)

This is a molecular operation, not an atomic operation for:

   type packed_bits is array (1..N) of boolean;
   pragma pack (packed_bits);
   pragma atomic_components (packed_bits);

1) RMW assumes that the contents on read are the same as write. When dealing with I/O interfaces, this is not always true.

2) Without a data source for the other bits, the operation is not atomic.

> Probably would want to change "indivisible" to
> "indivisible and independent" in both of the above paragraphs.

I think this change is worth considering.
****************************************************************

From: Jean-Pierre Rosen
Sent: Friday, March 31, 2006 2:07 AM

Just to spread a little more oil on the fire... What happens here?

   type Tab is array (positive range <>) of boolean;
   pragma pack (Tab);

   X : Tab (1 .. 32);
   pragma Atomic_Components (X);

i.e. when a *type* is packed, but an individual *variable* has atomic components?

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006 5:05 AM

An error message I trust:

> The array_local_name in an Atomic_Components or
> Volatile_Components pragma shall resolve to denote the declaration of an
> array type or an array object of an anonymous type.

Tab don't look anonymous to me :-)

****************************************************************

From: Robert I. Eachus
Sent: Friday, March 31, 2006 11:27 AM

> This is a molecular operation, not an atomic operation for:
>
>    type packed_bits is array (1..N) of boolean;
>    pragma pack (packed_bits);
>    pragma atomic_components (packed_bits);
>
> 1) RMW assumes that the contents on read are the same as write. When
> dealing with I/O interfaces, this is not always true.

No, you have to follow the prescription exactly. And although it is possible that some chipsets get this wrong, the ISA specifies what is done exactly because it is used in interfacing between multiple CPUs and CPUs and I/O devices. Oh, and it is about 50 times faster on a Hammer (AMD Athlon64, Turion, or Opteron) CPU because all memory access goes through CPU caches. So if the memory is local to the CPU, it just has to do the RMW in cache, and any other writes to the location can't interrupt. Technically the cache line containing the array is Owned by the thread that executes the locked RMW instruction. This means that the data migrates to the local cache, and the CPU connected to the memory has a Shared copy in cache. (Reads are not an issue, they either see the previous state of the array, or the final state.)

To repeat, on x86, you must use an AND or OR instruction where the first argument is the bit array you want treated as atomic. (The second argument--the mask--can be a register or an immediate constant.) You must use the LOCK prefix byte, and the page containing the array must be marked as uncacheable. (Yes, Hammer chips cache them anyway, but enforce the atomicity rules. In fact they go a bit further, and don't even allow other reads during the few CPU clocks the cycle takes. If you read a Shared cache line, the read causes a cache snoop that can invalidate the read, and cause the instruction to be retried.)

> 2) Without a data source for the other bits, the operation is not
> atomic.

Did you miss the fact that you have to use an AND or OR instruction with a memory address as the first argument to use the LOCK prefix? This ensures that the read and write are seen as atomic by the CPU. Marking the memory as uncacheable is necessary if there are other CPUs and/or I/O devices involved. This ensures that the memory line is locked with Intel CPUs and must be locally Owned by AMD CPUs.

If you really think this doesn't work, look at some driver code. I've avoided giving example programs, because I'd also need to supply hardware to test the code.

****************************************************************

From: Bibb Latting
Sent: Friday, March 31, 2006 4:44 PM

> If you really think this doesn't work, look at some driver code.
> I've avoided giving example programs, because I'd also need to supply hardware
> to test the code.

I *really* think that this doesn't *always* work. I understand the mechanization of memory access that you describe: indeed today there are usually adequate means to obtain exclusive access to a memory element, which when combined with suitable cache management allows implementation of volatile/atomic accesses. However, the underlying assumption is that the address referenced returns the last value written. I'm saying that this isn't always true for memory mapped I/O.

An example I encountered was the SCC2692 a number of years ago. It was a really *cheap* chip with 16 bytes of address space. The problem is that the chip doesn't have enough address space to provide both read-back of control registers and adequate status. To work around the problem, the Read/Write line was multiplexed: when you write to the chip you're accessing one register; when you read, you're accessing a different register. So, there are two objects, one for write and another for read, at the *same address*.

In terms of C.6, I'm treating (perhaps incorrectly) every addressable element as a variable, which becomes "shared" by application of volatile/atomic.

****************************************************************

From: Robert I. Eachus
Sent: Friday, March 31, 2006 7:59 PM

Ah! I guess I mixed you up by going from the general to the specific case. The Intel 8086, 8088, and 80186 were not designed to support (demand paged) virtual memory, although it could be done. The Intel 80286 was designed to do so, but to call the support a kludge is an insult to most kludges. Since the 80386, and in chip years that is a long time ago, the mechanism I described has been supported as part of the ISA. Right now the AMD and Intel implementations are very different, but the same code will work on all PC compatible CPUs.

There may be non-x86 compatible hardware out there that is not capable of correctly doing the (single) bit flipping. But I think that from a language design point of view, we should realize that most CPUs out there will support the packed array of Boolean special case. I would rather have the RM require it for Real-Time Annex support, and allow compilers for non-conforming hardware to document that. For example, there is an errata for the Itanium2 IA-32 execution layer (#14 on page 67 of http://download.intel.com/design/Itanium2/specupdt/25114140.pdf) But that just means you shouldn't try to run real-time code in IA-32 emulation mode on an Itanium2 CPU. ;-)

Incidentally, notice that there is a lot of magic that goes on in operating systems that may prevent a program from doing this bit-twiddling. That's fine. If a program that uses the Real-Time Annex needs special permissions, document them and move on. I personally think that there is no reason for an OS not to satisfy a user request for an uncacheable (UC) page. It is necessary for real-time code, and harmless otherwise. Especially on the AMD Hammer CPUs, there is no reason to restrict user access to UC pages and/or the LOCK prefix. The actual locking lasts a few nanoseconds. (The memory location will be read, and ownership, if necessary, transferred to the correct CPU and process. Then the locked RMW cycle takes place in the L1 data cache. Unlocked writes to the bit array can occur during the change of ownership, but the copy used in the RMW cycle is the latest version.)
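[Editor's note: The following sketch is not part of the correspondence; it spells out in source form roughly what an update of one component of a packed Boolean array amounts to on byte-addressable hardware that lacks the locked bit operations described above. All names are purely illustrative.]

   type Bits is array (0 .. 7) of Boolean with Pack;   -- Component_Size = 1

   type Byte is mod 2**8;

   procedure Set_Bit_5 (B : in out Bits) is
      Whole : Byte with Import, Address => B'Address;  -- view the whole array as one byte
   begin
      --  An assignment such as "B (5) := True;" typically becomes a read-modify-write
      --  of the containing byte (the exact bit position is implementation-defined):
      Whole := Whole or 2**5;   -- read the byte, set one bit, write the byte back
   end Set_Bit_5;

   --  A concurrent update of, say, B (4) by another task can be lost between the read
   --  and the write, and a memory-mapped device register need not read back what was
   --  last written; so the bit components are neither independently addressable nor
   --  atomic unless the target provides locked bit-set/bit-clear instructions.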
****************************************************************

From: Randy Brukardt
Sent: Friday, March 31, 2006 8:36 PM

> Ah! I guess I mixed you up by going from the general to the specific
> case.

No, you missed his point altogether. It doesn't have anything to do with the CPU!

The point is that memory-mapped hardware often doesn't act like memory at all; in particular a location may not be readable or writable or (worse) may return something different when read after writing. You can't make bit-mapped atomic writing work at all in such circumstances, no matter what CPU locking is provided. You are suggesting using Lock Or [Mem],16#10# to set just the fifth bit atomically, but this cannot work on memory-mapped hardware that doesn't allow reading! You'll set the other bits to whatever random junk, not the correct values.

Now, the question is what this has to do with the language. You seem to want to insist that compilers support this. But compiler vendors have no control over what hardware their customers build/use. If your rule was adopted, about all vendors could do is put "don't use Atomic_Components with memory-mapped hardware that can only be written" in their manual. But this is nasty; Atomic and Atomic_Components exist in large part because of memory-mapped hardware, and here you're trying to tell people to not use one of them exactly when they are most likely to do so. That doesn't seem to be a good policy.

It seems better to me to require users to read/write full storage units in this case, using an appropriate record or array type. There's much less risk of problems in that case. Funny hardware seems to be quite prevalent (remember that we had a long discussion on whether an atomic read/write could read two bytes instead of one word); we have to recognize that.

****************************************************************

From: Robert Dewar
Sent: Saturday, April 1, 2006 3:05 AM

> There may be non-x86 compatible hardware out there that is not capable
> of correctly doing the (single) bit flipping. But I think that from a
> language design point of view, we should realize that most CPUs out
> there will support the packed array of Boolean special case.

I must say I am puzzled, what code do you have in mind for supporting

   type x is array (1 .. 8) of Boolean;
   pragma Pack (x);
   pragma Atomic_Components (x);
   ...
   ...
   x (j) := k;

this seems really messy to me

****************************************************************

From: Robert A. Duff
Sent: Saturday, April 1, 2006 8:41 AM

> I *really* think that this doesn't *always* work. I understand the
> mechanization of memory access that you describe: indeed today there are
> usually adequate means to obtain exclusive access to a memory element, which
> when combined with suitable cache management allows implementation of
> volatile/atomic accesses.

That makes sense. I never thought packed bitfields could be atomic.

But I'm confused. Atomic implies volatile, by C.6(8), "...In addition, every atomic type or object is also defined to be volatile." Then C.6(20) says:

20 {external effect (volatile/atomic objects) [partial]} The external effect of a program (see 1.1.3) is defined to include each read and update of a volatile or atomic object. The implementation shall not generate any memory reads or updates of atomic or volatile objects other than those specified by the program.

(where "volatile or atomic" means "volatile [or atomic]").

Packed bitfields CAN be volatile.
But if we want to write upon a packed bitfield, we must read a whole word first, on most hardware (whether by an explicit load into a register, or an implicit read like the "LOCK OR" instruction Robert Eachus mentioned). Right? So how can one implement volatile bitfields in the way required by C.6(20)?

C.6(22/2) says:

Implementation Advice

22/2 {AI95-00259-01} A load or store of a volatile object whose size is a multiple of System.Storage_Unit and whose alignment is nonzero, should be implemented by accessing exactly the bits of the object and no others.

(where "volatile" means "volatile [or atomic]", this time ;-)).

Is this not implied by C.6(20)? Obviously, I misunderstand what C.6(20) is (intended to) mean.

****************************************************************

From: Robert Dewar
Sent: Saturday, April 1, 2006 8:49 AM

> Packed bitfields CAN be volatile. But if we want to write upon a packed
> bitfield, we must read a whole word first, on most hardware (whether by an
> explicit load into a register, or an implicit read like the "LOCK OR"
> instruction Robert Eachus mentioned). Right? So how can one implement
> volatile bitfields in the way required by C.6(20)?

If C.6(20) requires volatile bit-fields, it is just junk. Implementors don't pay attention to junk :-)

****************************************************************

From: Tucker Taft
Sent: Saturday, April 1, 2006 9:38 AM

My interpretation of C.6(20) would be: If the program includes an update to a bit field, and that requires a read/modify/write sequence on the given hardware, then that is not a violation of the requirement that:

The implementation shall not generate any memory reads or updates of atomic or volatile objects other than those specified by the program.

The read/modify/write sequence has been "specified" by the program. If the bit fields were atomic, then that would require that the read/modify/write sequence be "indivisible."

To take advantage of C.6(20) to deal with "active" memory locations, I think the programmer has to know whether the hardware requires a read/modify/write sequence for the given size of object. If so, then they better be sure that that sequence works for their memory-mapped device. It is not clear how you can say enough in the reference manual to make all of this portable. Hardware differs enough that there will be some issues that can't realistically be addressed without hardware-specific documentation.

****************************************************************

From: Robert A. Duff
Sent: Saturday, April 1, 2006 9:43 AM

> If C.6(20) requires volatile bit-fields, it is just junk. Implementors
> don't pay attention to junk :-)

Well, it's apparently the intent, given this AARM annotation:

22.b/2 Reason: Since any object can be a volatile object, including packed array components and bit-mapped record components, we require the above only when it is reasonable to assume that the machine can avoid accessing bits outside of the object.

I also just noticed:

21 If a pragma Pack applies to a type any of whose subcomponents are atomic, the implementation shall not pack the atomic subcomponents more tightly than that for which it can support indivisible reads and updates.

which seems to answer the original question. (Sorry if somebody already pointed this out, and I missed it.) Note that (21) is for atomic, not volatile.
****************************************************************

From: Robert Dewar
Sent: Saturday, April 1, 2006 10:12 AM

> The read/modify/write sequence has been "specified" by the
> program. If the bit fields were atomic, then that would
> require that the read/modify/write sequence be "indivisible."

I really think that's strange; to me, if you have a volatile variable, then reads should be reads and writes should be writes.

****************************************************************

From: Robert Dewar
Sent: Saturday, April 1, 2006 10:13 AM

>> If C.6(20) requires volatile bit-fields, it is just junk. Implementors
>> don't pay attention to junk :-)
>
> Well, it's apparently the intent, given this AARM annotation:
>
> 22.b/2 Reason: Since any object can be a volatile object, including packed
> array components and bit-mapped record components, we require the
> above only when it is reasonable to assume that the machine can
> avoid accessing bits outside of the object.

How does this compare with the C rules, out of interest? It seems obvious to me that volatile in Ada should mean the same as volatile in C.

****************************************************************

From: Robert A. Duff
Sent: Saturday, April 1, 2006 10:55 AM

I agree. C doesn't have packed arrays, but it does have arrays of bytes (char), which might require a read to write deep down in the hardware. It has bitfields in structs.

I'm not sure what the rules are for "volatile", but I have heard people claim that whatever they are, even the C language lawyers can't understand them and/or don't agree on what they mean, neither formally nor informally. ;-)

****************************************************************

From: Tucker Taft
Sent: Saturday, April 1, 2006 11:57 AM

Here's what the GNU C reference manual says about volatile:

The volatile qualifier tells the compiler to not optimize use of the variable by storing its value in a cache, but rather to fetch its value afresh each time it is used. Depending on the application, volatile variables may be modified autonomously by external hardware devices.

So they are focusing on requiring that no caching is performed. They make no mention of reading or writing *more* than is specified by the program. They want to be sure you don't read any *less* than specified.

As far as atomic, some versions of C have sig_atomic_t, which is an integer type that is atomic with respect to asynchronous interrupts (i.e. signals). As far as I know, there is no such thing as an atomic bit field in C.

****************************************************************

From: Robert A. Duff
Sent: Saturday, April 1, 2006 3:34 PM

Thanks for looking that up. Interesting.

Of course "GNU C" is not "the C standard". And of course, there are different versions of the C standard that might be relevant. I'm too lazy to look it up, and anyway, I suppose I'd have to fork over hundreds of dollars to ISO to do so?

I agree with your earlier comment, that given all the myriad hardware out there, we cannot hope to nail down every detail in the definitions of atomic and volatile.

****************************************************************

From: Robert Dewar
Sent: Saturday, April 1, 2006 4:25 PM

I disagree, we can have a clear semantic model (especially critical for atomic), and if hardware cannot accommodate this model, then the pragma must be rejected. So I think that is far too pessimistic.
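[Editor's note: An illustrative sketch, not part of the correspondence, of the kind of rejection being argued for here; the type, the layout, and the target properties are hypothetical.]

   type Nibble is mod 2**4;

   type Status_Word is record
      Mode  : Nibble with Atomic;
      Flags : Nibble;
   end record;

   for Status_Word use record
      Mode  at 0 range 0 .. 3;
      Flags at 0 range 4 .. 7;
   end record;

   --  With this layout, the Atomic aspect on Mode is illegal on typical byte-addressable
   --  targets (C.6(11)): the layout prevents the indivisible and independent reads and
   --  updates that Atomic requires, so the program must be rejected rather than given
   --  best-effort semantics.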
****************************************************************

From: Randy Brukardt
Sent: Saturday, April 1, 2006 6:01 PM

That's certainly true for Atomic. But Volatile must always be accepted (there is no rule that it can be rejected based on the characteristics of the type), and the model is that compilers do their best to implement it, whatever that is.

We added Implementation Advice (in Ada 2005) to avoid reading/writing extra bits, so any cases where that happens have to be documented. That should be enough encouragement to avoid it when possible. But we still want to allow any object to be volatile. (This was all discussed extensively with AI-259.) Indeed, this is the only significant difference between Atomic and Volatile -- otherwise there wouldn't be a need for both.

****************************************************************

From: Robert Dewar
Sent: Saturday, April 1, 2006 7:32 PM

> That's certainly true for Atomic. But Volatile must always be accepted (there is
> no rule that it can be rejected based on the characteristics of the type),

well then that's an obvious mistake, and sure there is such a rule, you don't have to do anything if it's not practical to do so.

> and the model is that compilers do their best to implement it, whatever that is.

I find that model absurd

> We added Implementation Advice (in Ada 2005) to avoid reading/writing extra
> bits,

well of course this should be a fundamental requirement of volatile to me

> so any cases where that happens have to be documented. That should be enough
> encouragement to avoid it when possible. But we still want to allow any object
> to be volatile. (This was all discussed extensively with AI-259.) Indeed, this is
> the only significant difference between Atomic and Volatile -- otherwise there
> wouldn't be a need for both.

I cannot believe you just said that!! Of course there is a need for both, they serve totally different functions.

The point of atomic is that the read or write can be done in a single instruction. *That's* what distinguishes volatile from atomic. This allows various synchronization algorithms based on shared variables. See Norm Shulman's PhD thesis for a very thorough treatment of this subject.

So for example, an array of ten integers can be volatile, but it takes ten reads to read it, so it cannot be atomic.

Or for a concrete example, if you have a bounded buffer with a reader and a writer not explicitly synchronized, then the buffer must be volatile, otherwise the algorithm obviously fails, but the head and tail pointers must be atomic (otherwise the algorithm fails because of race conditions). These two needs are quite different.

The idea that a single bit in a bit array that has to be assigned with a read/mask/store sequence can be called volatile seems completely silly to me. Fortunately, as far as I can tell, this nonsense language lawyering has zero effect on an implementation.

****************************************************************

From: Robert I. Eachus
Sent: Sunday, April 2, 2006 3:32 AM

> The point of atomic is that the read or write can be done in a single
> instruction. *That's* what distinguishes volatile from atomic. This
> allows various synchronization algorithms based on shared variables.
> See Norm Shulman's PhD thesis for a very thorough treatment of
> this subject.

I seem to have missed a day of Sturm und Drang. But Robert Dewar put his finger on the semantic disconnects.
With modern hardware, bit vectors can be *atomic*--updated with a single, uninterruptible CPU instruction that is also atomic from the point of view of the memory system. Note that the cache manipulations that go on to cause this to occur may be complex, but from our language lawyer point of view, all that matters is the result. On modern hardware, a read may result in 256 bytes being loaded into cache. Not an issue for atomic, as long as changes to the object are atomic from the point of view of the programmer. That means that the meaning of atomic may be different in a compiler that supports Annex D. Of course, now that dual-core CPUs are becoming more common, all compilers may have to ensure that atomic works in the presence of multiple CPUs or CPU cores. (And I/O devices as well.) I may have started the confusion by saying that to get atomic behavior in any x86 multiple core environment, you have to ensure that the bit array is stored in UC (uncacheable) memory. But in this case, that has nothing to do with volatile--and on the AMD Hammer processors nothing to do with whether or not the bit array can be cached! It is just that the ISA only requires uninterruptible semantics for UC memory. Or to turn that around, not all memory need support atomic updates, but memory must be marked UC for the LOCK prefix to have the expected semantics. (Well, there are circumstances where the OS will handle the exception and provide the expected semantics, but that is more likely to involve server virtualization than memory that is actually unlockable.) > So for example, an array of ten integers can be volatile, but it > takes ten reads to read it, so it cannot be atomic. > > Or for a concrete example, if you have a bounded buffer with a > reader and a writer not explicitly synchronized, then the buffer > must be volatile, otherwise the algorithm obviously fails, but > the head and tail pointers must be atomic (otherwise the > algorithm fails because of race conditions). These two needs > are quite, quite different. I hope everyone now understands atomic, because this example shows how complex volatile has become! There is the type of hardware volatile memory that Bibb Latting was talking about. However, modern hardware doesn't do single-bit reads and writes. Hardware switches and status bits are collected into registers. A particular register may have bits that are not writeable, and when you write a (32-bit?) word to that location, only the settable bits are changed. Where these registers are internal to the CPU, they usually require special instructions to read or write them. With I/O devices, the registers will be addressable as memory, but again the semantics of reading from and/or writing to those locations is going to be hardware specific. In a perfect world, these operations will all be provided as well documented code-inserts or intrinsic functions. In the case above, volatile has a much different--but also necessary--meaning. Whether or not the data is cached is not important--well it is important if you need speed. What is important is that all CPU cores, (Ada tasks, and) hardware processes see the same data. At this point I really need to talk about cache coherency strategies. AMD uses MOESI (Modified, Owned, Exclusive, Shared, Invalid), while Intel uses MESI (skip the Owner state). What Robert Dewar's example above needs (in the MESI case) is that the bounded buffer *and* the head and tail pointers must be marked as Shared, or as Modified in one cache, and Invalid in all others. 
The MOESI protocol allows one copy to be marked as Owned, and the others to be either Shared or Invalid. In the AMD MOESI implementation, updating the owner's copy causes any other copies to first be marked Invalid before the write to the Owned copy completes, then the new value will be broadcast to the other chips and cores. Those that have a (now Invalid) cached copy will update it and mark it again Shared. What if you want to write to a Shared copy? You must first take Ownership. MESI is faster if the next CPU to update the Shared data is random; the MOESI Owner state is much, much faster if most updates are localized. (In other words, the CPU (core) that last updated the object is most likely to be the next updater.) Maybe we need to resurrect pragma Shared for this case, and use Volatile to imply the hardware case. Notice that with modern hardware, if all you need is the Shared cache state, then you will often get much better performance if you write the code that way. (Using Volatile where Shared is appropriate will generate correct but pessimistic code.) This is a case where the hardware is evolving and we need the language to evolve to match. Right now, you need an AMD Hammer CPU to get major speedups, but Intel's Conroe will have a shared L2 cache between cores, and each core will be able to access data in the L1 data cache of the other core. In fact, it may be worthwhile to create real code for Robert Dewar's example, and time it in various hardware configurations. The difference can be a factor of thirty or more. And by the way, since modern CPUs manage data in cache lines, it is worth knowing the sizes of those lines. Intel uses 256-byte lines in their L2 and L3 caches, but some Intel CPUs have 64-byte L1 data cache lines. AMD uses 64-byte cache lines throughout. However, in practice there is little if any difference. AMD's CPUs typically request two cache lines (128 bytes) and only terminate the request after the first line if there is another pending request. Intel requests 256 bytes, but will stop after 128 bytes if there is a pending request. (Intel's L2 cache lines can store a half-line, with the other half empty.) Both AMD and Intel support 'uncached' reads and writes intended to avoid cache pollution. But the smallest guaranteed read or write amount is 128 bits (16 bytes). So any x86 compiler that allows pragma Volatile for in-memory objects smaller than 16 bytes is probably living in a state of sin. ;-) **************************************************************** From: Jean-Pierre Rosen Sent: Sunday, April 2, 2006 4:25 PM > I also just noticed: > > 21 If a pragma Pack applies to a type any of whose subcomponents are > atomic, the implementation shall not pack the atomic subcomponents more > tightly than that for which it can support indivisible reads and updates. > > which seems to answer the original question. Not really. The question was about independent addressability. You can have indivisible updates without independent addressability. **************************************************************** From: Tucker Taft Sent: Monday, May 21, 2007 8:11 AM You must not use "must" in an ISO standard. You shall use "shall" instead... ;-) (Although you didn't violate this one, you may not use "may not" either. You shall use "shall not" or you might use "might not" instead.) > ... 
> !wording > > 13.2 (6.1/2) is renumbered 13.2 (7.1/3) and reads: > > For a packed type that has a component that is of a by-reference type, > aliased, volatile or atomic, the component must be aligned according to Please fully "comma-ize" lists of more than two elements. Hence, "... volatile, or atomic, ..." > the alignment of its subtype; in particular it must be aligned on a > storage element boundary. Why does this last part follow? Can't a subtype have an alignment of zero? > > 13.2 (9) append: > > If the array component must be aligned according to its subtype and the > results of packing are not so aligned, pragma pack should be rejected. This is worded somewhat ambiguously, here using "must" when probably some other word would make more sense. [Editor's note: These editorial changes were made in version /02 of the AI05; this is version /01 of the AI12.] **************************************************************** From: Bob Duff Sent: Sunday, February 3, 2013 3:47 PM Here's a new version of AI12-0001-1, "Independence and Representation clauses for atomic objects". [This is version /03 of the AI - Editor.] This completes my homework. Meta comment: The term "reject" as in "reject a compilation unit because it's illegal" is Ada-83-speak. But this term keeps creeping into wording in AIs/RM/AARM. I ask that people please try to remember to quit doing that. Instead, say something like "so and so is illegal" or "an implementation may make so-and-so illegal". You know who you are, Steve. ;-) See AARM-1.1.3: 4 * Identify all programs or program units that contain errors whose detection is required by this International Standard; 4.a Discussion: Note that we no longer use the term "rejection" of programs or program units. We require that programs or program units with errors or that exceed some capacity limit be "identified". The way in which errors or capacity problems are reported is not specified. Here's the draft minutes, with some of my comments: > AI12-0001-1/02 Independence and Representation Clauses for atomic > objects (Other AI versions) Bob and Tuck argue that the Recommended > Level of Support is wrong as it does not match the AARM Ramifications. > [Editor's note: I didn't record what > ramification(s) they referred to. I can't find any that clearly > conflict with the Recommended Level of Support; the only one that > might be read that way is 13.2(9.a), which says that an aliased > component won't get packed very tightly because "its Size will > generally be a multiple of Storage_Unit". But this statement appears > to be circular, as the Size of a component is determined by the amount > of packing applied, so it essentially says that an aliased component > won't get packed tightly because it won't get packed tightly. It would > make some logical sense if it was meant to refer to the Size of the > subtype of the component, but then it is just wrong because the Size > of a subtype is not affected by aliasedness.] Yes, that's the one. Never mind the above circularity; what it's trying to say is that if you have an aliased component of type Boolean, the Pack isn't illegal -- it just doesn't pack that component tightly. It might pack other components. > Geert notes that Pack indicates that the components are not > independent; that makes no sense with Atomic. We scurry to the > Standard to see what it actually says. > 9.10(1/3) discusses independence, and it says that specified > independence wins over representation aspects, so there is no problem there. Agreed. 
> C.6(8.1/3) should include aliased in things that cause ``specified as > independent''. I don't think so. "Aliased" has nothing to do with task safety. It just means the thing can have access values pointing to it. Consider early versions of the Alpha 21064. An address points at an 8-bit byte, but you can't load and store bytes; you have to load a 64-bit word and do shifting and masking. If you have a packed array of bytes on that machine, you want it packed; you don't want 64-bits per byte. If you want independence, you should specify Independent (or Atomic, or...). > Tucker thinks C.6(13.2/3) is misleading, as it seems to imply packing > is not allowed when independence is specified. Check. > The Recommended Level of Support for Pack needs to be weakened to > allow atomic, aliased, and so on to make the packing less than otherwise > required. Check. > Bob will take this AI. > Approve intent: 9-0-1. [Followed by version /03 of the AI - Editor.] **************************************************************** From: Randy Brukardt Sent: Monday, February 4, 2013 2:37 PM ... > > C.6(8.1/3) should include aliased in things that cause > ``specified as > > independent''. > > I don't think so. "Aliased" has nothing to do with task safety. > It just means the thing can have access values pointing to it. > Consider early versions of the Alpha 21064. An address points at an > 8-bit byte, but you can't load and store bytes; you have to load a > 64-bit word and do shifting and masking. > If you have a packed array of bytes on that machine, you want it > packed; you don't want 64-bits per byte. If you want independence, > you should specify Independent (or Atomic, or...). [I presume you are missing the word "aliased" in your "If you have a package array of {aliased} bytes...", because otherwise the example has nothing to do with the issue at hand.] The problem with this is then there is no guarantee that designated objects are independent. And worse, there is no language means to make such a guarantee. One could add one by allowing Independent to apply to access types, and then requiring 'Access to check that the aliased objects are independent, but that seems like a lot of language mechanism to solve a problem of our own creation (allowing pack and other rep clauses to kill independence of aliased components, something that has never been true in Ada so far as I can tell), especially as I doubt anyone is clamoring for this capability (who needs aliased bytes anyway, much less packed arrays of them). I didn't look in detail at the body of the AI, I just wanted to point out that this is clearly misguided. **************************************************************** From: Robert Dewar Sent: Monday, February 4, 2013 2:42 PM > The problem with this is then there is no guarantee that designated > objects are independent. And worse, there is no language means to make > such a guarantee. Don't worry too much, of COURSE all designated objects are independent in practice, no matter what the language has to say about it. **************************************************************** From: Bob Duff Sent: Monday, February 4, 2013 3:17 PM > ... > > > C.6(8.1/3) should include aliased in things that cause ``specified as > > > independent''. > > > > I don't think so. "Aliased" has nothing to do with task safety. > > It just means the thing can have access values pointing to it. > > Consider early versions of the Alpha 21064. 
An address points at an > > 8-bit byte, but you can't load and store bytes; you have to load a > > 64-bit word and do shifting and masking. > > If you have a packed array of bytes on that machine, you want it > > packed; you don't want 64-bits per byte. If you want independence, > > you should specify Independent (or Atomic, or...). > > [I presume you are missing the word "aliased" in your "If you have a package ^^^^^^^ I can't spell "packed" either. > array of {aliased} bytes...", because otherwise the example has > nothing to do with the issue at hand.] Right, my claim is that you want 'Component_Size = 8, whether or not the components are aliased. > The problem with this is then there is no guarantee that designated > objects are independent. And worse, there is no language means to make > such a guarantee. What sort of designated objects do you mean? Distinct heap objects are always independently addressable (see 9.10). The only way two (nonoverlapping) objects can fail to be independently addressable is if they're both subcomponents of the same object. And you control independence of those using pragmas Independent[_Components]. >...One could add one by allowing Independent to apply to access types, >and then requiring 'Access to check that the aliased objects are >independent, but that seems like a lot of language mechanism to solve a >problem of our own creation (allowing pack and other rep clauses to >kill independence of aliased components, something that has never been >true in Ada so far as I can tell), ... My understanding is the opposite: 'aliased' never was intended to imply independent addressability -- just plain old addressability. If 'aliased' implies independent addressability, then why were pragmas Independent[_Components] added? What do others think? >...especially as I doubt anyone is clamoring for this capability (who >needs aliased bytes anyway, much less packed arrays of them). I find "who needs aliased bytes anyway" to be a strange attitude. Why shouldn't bytes be aliased? And on the Alpha 21064 (admittedly obsolete), you'd want to pack such a thing so you don't get Component_Size = 64 (unless you're sharing the array across tasks). > I didn't look in detail at the body of the AI, I just wanted to point > out that this is clearly misguided. I strongly disagree -- it's not clearly anything (guided nor misguided). ;-) I'm not sure I fully understand pragmas Independent[_Components]. Correct me if I'm wrong: If you give these pragmas for a type that has a record rep clause, a Component_Size clause, or a Convention, then the only possible effect is to make the program illegal. If you give these pragmas for a packed type, the only possible effect is to reduce the amount of packing. For any other type (which is 99.9% of all types), these pragmas have no effect. Am I right? If so, Independent[_Components] seems like a pretty marginally useful feature. I don't understand why it was added, and AI05-0009-1 does not enlighten me. **************************************************************** From: Steve Baird Sent: Monday, February 4, 2013 3:37 PM > My understanding is the opposite: 'aliased' never was intended to > imply independent addressability -- just plain old addressability. > > If 'aliased' implies independent addressability, then why were pragmas > Independent[_Components] added? For unaliased components. You have an array of 32 (unaliased) Booleans. You also have 32 tasks and you want to allow each of them to manipulate one of the array elements. 
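[Editor's note: a small sketch of Steve's example, with invented names. The point is that Independent_Components states the requirement explicitly, so the concurrent updates below stay safe even if someone later adds packing or a Component_Size clause, which would then be illegal if it conflicted.]

   procedure Demo is
      type Flag_Array is array (1 .. 32) of Boolean;
      pragma Independent_Components (Flag_Array);

      Flags : Flag_Array := (others => False);

      task type Worker (My_Flag : Integer);
      task body Worker is
      begin
         Flags (My_Flag) := True;   --  each task writes only its own component
      end Worker;

      W1 : Worker (1);
      W2 : Worker (2);
      --  ... one Worker per component, up to Worker (32)
   begin
      null;   --  Demo then waits for its Worker tasks to terminate
   end Demo;
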
It is a bit odd that we never state that "aliased" implies "independent addressability" for components. I can imagine an array with aliased-but-not-independently-addressable components (e.g., a bit packed array of aliased booleans for an implementation which implements access-to-boolean types as bit pointers), but this seems pretty contrived. **************************************************************** From: Randy Brukardt Sent: Monday, February 4, 2013 3:45 PM ... > > The problem with this is then there is no guarantee that designated > > objects are independent. And worse, there is no language means to > > make such a guarantee. > > What sort of designated objects do you mean? Distinct heap objects > are always independently addressable (see 9.10). The only way two > (nonoverlapping) objects can fail to be independently addressable is > if they're both subcomponents of the same object. And you control > independence of those using pragmas Independent[_Components]. The designated object of a general access type, of course. The client of such a type cannot know where the designated objects come from. And you're suggesting to eliminate the guarantee that these designated objects are independent. In your hypothetical array of bytes, some components are not going to be independent. Thus, you can also have designated objects that are not independent. That's something new; there is no possibility of that in current Ada (especially if you believe 13.2(9.a)). > >...One could add one by allowing Independent to apply to access > >types, and then requiring 'Access to check that the aliased objects > >are independent, but that seems like a lot of language mechanism to > >solve a problem of our own creation (allowing pack and other rep > >clauses to kill independence of aliased components, something that > >has never been true in Ada so far as I can tell), ... > > My understanding is the opposite: 'aliased' never was intended to > imply independent addressability -- just plain old addressability. The two are intimately linked (except on some broken obsolete hardware - I wouldn't have guessed that any such hardware could have existed -- indeed, I can't quite figure out how that machine is supposed to have worked -- not that relevant anyway). > If 'aliased' implies independent addressability, then why were pragmas > Independent[_Components] added? Something can be independent without being addressable, and in any case, "aliased" turns off a lot of optimizations, which isn't necessary if all you need is independent. > What do others think? > > >...especially as I doubt anyone is clamoring for this capability > >(who needs aliased bytes anyway, much less packed arrays of them). > > I find "who needs aliased bytes anyway" to be a strange attitude. > Why shouldn't bytes be aliased? Why should the language guarantees be broken for something that no one needs? > And on the Alpha 21064 (admittedly obsolete), you'd want to pack such > a thing so you don't get Component_Size = 64 (unless you're sharing > the array across tasks). Read Robert's response to see that no implementation would ever take advantage of this ability, so why even contemplate it? > > I didn't look in detail at the body of the AI, I just wanted to > > point out that this is clearly misguided. > > I strongly disagree -- it's not clearly anything (guided nor > misguided). ;-) Heck, there is a lot misguided about this discussion. Precisely *who* is misguided is a matter of opinion. 
:-) > I'm not sure I fully understand pragmas Independent[_Components]. > Correct me if I'm wrong: If you give these pragmas for a type that > has a record rep clause, a Component_Size clause, or a Convention, > then the only possible effect is to make the program illegal. If you > give these pragmas for a packed type, the only possible effect is to > reduce the amount of packing. > For any other type (which is 99.9% of all types), these pragmas have > no effect. > > Am I right? If so, Independent[_Components] seems like a pretty > marginally useful feature. I don't understand why it was added, and > AI05-0009-1 does not enlighten me. The whole point is that you can give this aspect to ensure that items are independently addressable (when you are depending on that), without invoking the other costs of Volatile or aliased. Yes, it's marginal in that it hardly ever is going to have an effect (and hardly anyone will understand it enough to use it), but it's part of the Ada pattern of declaring exactly what you need and no more. In that sense, it is similar to declaring a range of -32768..32767 on a 16-bit integer type -- this won't change anything but it makes it even more clear what the expectations are. **************************************************************** From: Randy Brukardt Sent: Monday, February 4, 2013 3:47 PM ... > It is a bit odd that we never state that "aliased" implies > "independent addressability" for components. Bob was supposed to add that statement into the AI he is working on (that's what we decided in Boston), but he's resisting. For very marginal capabilities. **************************************************************** From: Bob Duff Sent: Monday, February 4, 2013 4:06 PM > The designated object of a general access type, of course. The client > of such a type cannot know where the designated objects come from. And > you're suggesting to eliminate the guarantee that these designated > objects are independent. I'm not eliminating a guarantee -- such a guarantee doesn't exist, and I'm trying to understand why we want to add it. > In your hypothetical array of bytes, some components are not going to > be independent. Thus, you can also have designated objects that are > not independent. That's something new; there is no possibility of that > in current Ada (especially if you believe 13.2(9.a)). 13.2(9.a) doesn't say anything about independent addressability. (I assume in these discussions, "independent" is being used as an abbreviation for the official RM term, "independent addressability", right?) > Something can be independent without being addressable, and in any > case, So you're saying something can be independently addressable without being addressable. Yet another case where the RM reads like Alice in Wonderland. ;-) > "aliased" turns off a lot of optimizations, which isn't necessary if > all you need is independent. I suppose, but that seems pretty marginal. Remember, we're only talking about components of packed types. > > I'm not sure I fully understand pragmas Independent[_Components]. > > Correct me if I'm wrong: If you give these pragmas for a type that > > has a record rep clause, a Component_Size clause, or a Convention, > > then the only possible effect is to make the program illegal. If > > you give these pragmas for a packed type, the only possible effect > > is to reduce the amount of packing. > > For any other type (which is 99.9% of all types), these pragmas have > > no effect. > > > > Am I right? 
If so, Independent[_Components] seems like a pretty > > marginally useful feature. I don't understand why it was added, and > > AI05-0009-1 does not enlighten me. Please answer my question, "Am I right?". Then we can discuss "The whole point...". > The whole point is that you can give this aspect to ensure that items > are independently addressable (when you are depending on that), > without invoking the other costs of Volatile or aliased. Yes, it's > marginal in that it hardly ever is going to have an effect (and hardly > anyone will understand it enough to use it), but it's part of the Ada > pattern of declaring exactly what you need and no more. In that sense, > it is similar to declaring a range of > -32768..32767 on a 16-bit integer type -- this won't change anything > but it makes it even more clear what the expectations are. **************************************************************** From: Bob Duff Sent: Monday, February 4, 2013 3:55 PM > > My understanding is the opposite: 'aliased' never was intended to > > imply independent addressability -- just plain old addressability. > > > > If 'aliased' implies independent addressability, then why were > > pragmas Independent[_Components] added? > > For unaliased components. You can always add "aliased". > You have an array of 32 (unaliased) Booleans. > You also have 32 tasks and you want to allow each of them to > manipulate one of the array elements. If the array is not packed, then Independent_Components has no effect. If it's packed, the Independent_Components turns off the packing. I don't get it. Perhaps if somebody answered the later part of my previous email, the part starting "Correct me if I'm wrong"... > It is a bit odd that we never state that "aliased" implies > "independent addressability" for components. So you agree with Randy. But I don't understand this -- "independent addressability" is about tasking, whereas aliasedness is about allowing access values. Why should one have anything to do with the other? **************************************************************** From: Bob Duff Sent: Monday, February 4, 2013 4:10 PM > ... > > It is a bit odd that we never state that "aliased" implies > > "independent addressability" for components. > > Bob was supposed to add that statement into the AI he is working on > (that's what we decided in Boston), but he's resisting. For very > marginal capabilities. Please don't make this into a battle, Randy. I'm not "resisting" anything; I'm just trying to understand why we should add this implication. And please stop accusing me of REMOVING this implication. It's not there now, and ARG wants to ADD it to the RM, and I want to know why. The minutes don't say. Robert's comment (about the way hardware behaves in practice) argues against making any change, because it won't actually change anything. **************************************************************** From: Bob Duff Sent: Monday, February 4, 2013 4:18 PM If you say: pragma Independent_Components (Some_Array); is there some implication that the compiler should allocate the components of Some_Array on separate cache lines (and align it to a cache line boundary)? That would make more sense than what I've heard so far. **************************************************************** From: Randy Brukardt Sent: Monday, February 4, 2013 4:22 PM ... > > "aliased" turns off a lot of optimizations, which isn't necessary if > > all you need is independent. > > I suppose, but that seems pretty marginal. 
Remember, we're only > talking about components of packed types. Well, we're also talking about other representation clauses. It would be pretty silly if you could get non-independent aliased components via packing but not via other representation clauses. (And I worry much more about other clauses, because only the slightly insane use Pack.) > > > I'm not sure I fully understand pragmas Independent[_Components]. > > > Correct me if I'm wrong: If you give these pragmas for a type > > > that has a record rep clause, a Component_Size clause, or a > > > Convention, then the only possible effect is to make the program > > > illegal. If you give these pragmas for a packed type, the only > > > possible effect is to reduce the amount of packing. > > > For any other type (which is 99.9% of all types), these pragmas > > > have no effect. > > > > > > Am I right? If so, Independent[_Components] seems like a pretty > > > marginally useful feature. I don't understand why it was added, > > > and > > > AI05-0009-1 does not enlighten me. > > Please answer my question, "Am I right?". Then we can discuss "The > whole point...". Probably, but I don't know for sure (I'd have to go back and re-read all of the rules). But why is it relevant? There are lots of pragmas in Ada that have no effect most of the time (Pack immediately comes to mind, and you want to make that more likely). The only thing that matters is the intent expressed. This strikes me as the setup for a bait-and-switch argument. I would hope we're more adult than that... > > The whole point is that you can give this aspect to ensure that > > items are independently addressable (when you are depending on > > that), without invoking the other costs of Volatile or aliased. Yes, > > it's marginal in that it hardly ever is going to have an effect (and > > hardly anyone will understand it enough to use it), but it's part of > > the Ada pattern of declaring exactly what you need and no more. In > > that sense, it is similar to declaring a range of > > -32768..32767 on a 16-bit integer type -- this won't change anything > > but it makes it even more clear what the expectations are. The above is the only thing relevant, not the detailed effects or lack thereof. **************************************************************** From: Randy Brukardt Sent: Monday, February 4, 2013 4:38 PM > > > My understanding is the opposite: 'aliased' never was intended to > > > imply independent addressability -- just plain old addressability. > > > > > > If 'aliased' implies independent addressability, then why were > > > pragmas Independent[_Components] added? > > > > For unaliased components. > > You can always add "aliased". Not without performance implications. You could have also said "you can always add Volatile_Components", with the same caveat. > > You have an array of 32 (unaliased) Booleans. > > You also have 32 tasks and you want to allow each of them to > > manipulate one of the array elements. > > If the array is not packed, then Independent_Components has no effect. > If it's packed, the Independent_Components turns off the packing. I > don't get it. "Clearly", pack should be illegal in this case. Tucker once proposed having pack be illegal if it does nothing at all, which would certainly go a ways toward eliminating my opposition to this change. > Perhaps if somebody answered the later part of my previous email, the > part starting "Correct me if I'm wrong"... It's irrelevant as to precisely what happens; it's about declaring your intentions. 
If you can't see the value of declaring your intentions, we don't have much to talk about. > > It is a bit odd that we never state that "aliased" implies > > "independent addressability" for components. > > So you agree with Randy. But I don't understand this -- "independent > addressability" is about tasking, whereas aliasedness is about > allowing access values. Why should one have anything to do with the > other? Because there is a presumption that designated objects are always independently addressable. We didn't think there was a need to be able to declare that an access type has only independently addressable designated objects because it wasn't necessary, but you claim that it is not true in general and thus such a thing needs to be added. > > I can imagine an array with > > aliased-but-not-independently-addressable > > components (e.g., a bit packed array of aliased booleans for an > > implementation which implements access-to-boolean types as bit > > pointers), but this seems pretty contrived. And this is the real objection: any such examples seem contrived, and to support such examples you want to eliminate the existing guarantee that designated objects are always independently addressable. (Whether that guarantee is a ramification or direct result of the wording is a separate issue - as Robert says, no one ever has or likely will invalidate that.) The only alternative would be to extend Independent to cover access types, and then have a check on 'Access that the object actually has Independent specified (or must be by 9.10). Which seems like too much mechanism. **************************************************************** From: Tucker Taft Sent: Monday, February 4, 2013 4:40 PM > If you say: > > pragma Independent_Components (Some_Array); > > is there some implication that the compiler should allocate the > components of Some_Array on separate cache lines (and align it to a > cache line boundary)? That would make more sense than what I've heard > so far. No, I think the only point was to avoid erroneousness due to simultaneous access, which the presence of other representation items might imply. Efficiency is a different issue. I will admit I have lost track of the issue. Is it whether "aliased" implies "independent"? I would say yes that should be true. Does independent imply aliased? Clearly not. **************************************************************** From: Robert Dewar Sent: Monday, February 4, 2013 4:44 PM > Robert's comment (about the way hardware behaves in practice) argues > against making any change, because it won't actually change anything. 90% of the delicate arguments about wording also have this property (they won't actually change anything), but there is still an inclination to get the words right :-) **************************************************************** From: Robert Dewar Sent: Monday, February 4, 2013 4:49 PM > If you say: > > pragma Independent_Components (Some_Array); > > is there some implication that the compiler should allocate the > components of Some_Array on separate cache lines (and align it to a > cache line boundary)? That would make more sense than what I've > heard so far. Absolutely no such implication, and indeed this would be disastrous for two reasons: data representation would depend on the particular target configuration, since cache line size is part of that; and cache lines are huge in many machines, e.g. 256 bytes. 
No, that's not the idea of independence (a concept I think I can take at least partial credit for since I had my PhD student Norman Schulman study this in detail) at all. The idea is that separate tasks can operate independently on separate elements. Whether this includes the case of separate tasks on separate processors being able to access independent objects depends on the target, but in practice virtually all systems implement full cache coherence with cache snooping (where you watch bus traffic to make sure your cached view is current). Allocating on separate cache lines in a system with cache coherence would make no sense at all! To see a practical effect of > type Some_Array is array (1 .. 10) of Character; > pragma Independent_Components (Some_Array); Consider the old Alpha, which did not have byte load/store instructions. On one of these old Alphas, you had to do 32-bit loads and stores, so the above declarations would require (on that machine) Some_Array'Component_Size = 32. **************************************************************** From: Robert Dewar Sent: Monday, February 4, 2013 4:50 PM > If you say: > > pragma Independent_Components (Some_Array); By the way, in Ada 83, you were required to provide independence for all composites, including packed bit arrays. Nonsense of course, but a consequence of the infamous "big change at the last minute" to chapter 9 of the RM :-) **************************************************************** From: Randy Brukardt Sent: Monday, February 4, 2013 4:49 PM > > ... > > > It is a bit odd that we never state that "aliased" implies > > > "independent addressability" for components. > > > > Bob was supposed to add that statement into the AI he is working on > > (that's what we decided in Boston), but he's resisting. For very > > marginal capabilities. > > Please don't make this into a battle, Randy. I'm not "resisting" > anything; I'm just trying to understand why we should add this > implication. The reason is obvious: non-independent designated objects is nonsense that cannot be allowed (unless we add an additional way to declare independent addressability of designated objects). > And please stop accusing me of REMOVING this implication. > It's not there now, and ARG wants to ADD it to the RM, and I want to > know why. The minutes don't say. Because we all have assumed for decades that it IS there; the design of Independent makes that clear (it would make no sense without such an implication). If one can't read that between the lines of 9.10, then it needs to be made explicit somehow (either with extra wording or an expansion of Independent). > Robert's comment (about the way hardware behaves in practice) argues > against making any change, because it won't actually change anything. If you mean about the behavior of Pack, I would agree. It appears to be illegal to pack aliased components, and that's a good thing (as pack would have no effect, and that's almost certainly some sort of mistake). If you want to change Pack so that it is not illegal for aliased components, and then "make no change" to aliased, that I don't understand, because we need some rule to allow aliased components to be packed less tightly. (There is no exception for aliased in the Recommended Level of Support.) And once you add such an exception, you need to describe what it means. I still think the previous version of the AI is closer to what we want (I believe it mostly matches GNAT, as well). But YMMV. 
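[Editor's note: a sketch of the case Randy and Robert are arguing about -- a packed record with an aliased component -- under the behavior this AI is heading toward. The type and components are invented.]

   type Mixed is record
      A, B, C : Boolean;           --  candidates for bit-level packing
      D       : aliased Integer;   --  has an aliased part
   end record;
   pragma Pack (Mixed);
   --  Intended effect being discussed: Pack remains legal; A, B, and C may
   --  be packed to single bits, while D keeps the alignment of its subtype
   --  rather than being squeezed onto an arbitrary bit boundary.
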
**************************************************************** From: Tucker Taft Sent: Monday, February 4, 2013 4:49 PM >> ... >> If the array is not packed, then Independent_Components has no >> effect. If it's packed, the Independent_Components turns off the >> packing. I don't get it. > > "Clearly", pack should be illegal in this case. Tucker once proposed > having pack be illegal if it does nothing at all, which would > certainly go a ways toward eliminating my opposition to this change. That must have been Tucker # 42. I don't remember this suggestion. My feeling has generally been that "pack" means use space optimization over time optimization, subject to all of the other representation requirements. It should never be illegal to say "pack," though in the absence of an "independent" (or aliased) specification, it can create erroneousness. In my view, pragma Pack is really just a special case of "pragma Optimize(Space)" which is a pretty non-specific request. The only extra bit of semantics for Pack is the pesky "non-independence" implication, and pragma/aspect Independent/Independent_Components can be used to overcome that bit. If you really want to control representation, Component_Size or a record rep clause are the way to go. Pack is not really for controlling representation, it is for establishing a space bias in the default representation selection mechanism. **************************************************************** From: Robert Dewar Sent: Monday, February 4, 2013 4:51 PM > ... >>> "aliased" turns off a lot of optimizations, which isn't necessary if >>> all you need is independent. >> >> I suppose, but that seems pretty marginal. Remember, we're only >> talking about components of packed types. > > Well, we're also talking about other representation clauses. It would > be pretty silly if you could get non-independent aliased components > via packing but not via other representation clauses. (And I worry > much more about other clauses, because only the slightly insane use > Pack.) I don't believe in the tooth fairy. And I don't believe in non-independent aliased components :-) This is independent of whatever wording you come up with! **************************************************************** From: Bob Duff Sent: Monday, February 4, 2013 4:58 PM > > If you say: > > > > pragma Independent_Components (Some_Array); > > > > is there some implication that the compiler should allocate the > > components of Some_Array on separate cache lines (and align it to a > > cache line boundary)? That would make more sense than what I've > > heard so far. > > No, I think the only point was to avoid erroneousness due to > simultaneous access, which the presence of other representation items > might imply. > > Efficiency is a different issue. OK. A feature that provided such an efficiency hint would be useful, IMHO. (No, I'm not proposing to add one.) > I will admit I have lost track of the issue. Is it whether "aliased" > implies "independent"? Yes, that's the main issue. The minutes say "yes, we should add such an implication", and I'm wondering why. A side issue is that if we added such an implication, the pragmas Indep[_Comp] seem VERY marginally useful, so I wonder why they were added. Randy says "to declare one's intentions". Well, that's nice, I suppose, but if you just declare normal arrays without any Pack or 'Component_Size clauses, you get independent addressability, and that was good enough for the first decades of Ada's life. >...I would say yes that should be true. 
And Randy and Steve agree with you. But I still don't understand why. Sorry if I'm being dense. > Does independent imply aliased? Clearly not. Yes, I think we all agree on that. **************************************************************** From: Steve Baird Sent: Monday, February 4, 2013 5:02 PM >>> If 'aliased' implies independent addressability, then why were >>> pragmas Independent[_Components] added? >> >> For unaliased components. > > You can always add "aliased". > So you agree with Randy. But I don't understand this -- "independent > addressability" is about tasking, whereas aliasedness is about > allowing access values. Why should one have anything to do with the > other? > The bugs associated with concurrent access to non-independent components are a case of the implementation showing through in an ugly way, exposing a low-level detail that a high-level language would ideally hide. I want this to happen as infrequently as possible. Anytime I can define a component to be independently addressable without giving up something useful (and without forcing existing compilers to change), I want to do so. My point is that in the case of an aliased component, I don't see that I am giving up anything useful (e.g., the freedom to have it share a byte with a neighboring component) by adding an "aliasing => I.A." rule. Although I think this rule would be a good thing, I'd agree that it is not a big deal (which is why I don't feel strongly about it) because we already have I.A. most of the time. **************************************************************** From: Bob Duff Sent: Monday, February 4, 2013 5:07 PM > To see a practical effect of > > > type Some_Array is array (1 .. 10) of Character; > > pragma Independent_Components (Some_Array); Sorry, now I'm getting MORE confused. There's no Pack or "for Some_Array'Component_Size use..." above. So Ada requires the components of Some_Array to be independently addressable even without that pragma. So I fail to see any "practical effect" of the pragma. > Consider the old Alpha, which did not have byte load/store > instructions. On one of these old Alpha's, you had to do 32-bit loads > and stores, so the above declarations would require (on that machine) > Some_Array'Component_Size = 32. I stated earlier in this thread that you had to do 64-bit loads and stores on that machine, but you're right -- you could do 32. By the way, type String has pragma Pack, which I always thought was so String'Component_Size can be 8 even on such weird machines as the old Alpha! **************************************************************** From: Robert Dewar Sent: Monday, February 4, 2013 5:08 PM > I can imagine an array with aliased-but-not-independently-addressable > components (e.g., a bit packed array of aliased booleans for an > implementation which implements access-to-boolean types as bit > pointers), but this seems pretty contrived. Imagine away, because Ada implementations that implement access-to-boolean types as bit pointers are likely to be about as common as Loch Ness Monsters :-) Seriously, it is not worth spending much time worrying about bizarre implementation possibilities. If you did have such an implementation, then it could do all sorts of peculiar things (if necessary under control of a switch). **************************************************************** From: Randy Brukardt Sent: Monday, February 4, 2013 5:14 PM ... > I will admit I have lost track of the issue. Is it whether "aliased" > implies "independent"? 
I would say yes that should be true. That's the issue. And Bob wants to know why. I've tried to explain, but probably not very well. Perhaps you can take a stab at it. **************************************************************** From: Tucker Taft Sent: Monday, February 4, 2013 5:17 PM >>> If you say: >>> >>> pragma Independent_Components (Some_Array); >>> >>> is there some implication that the compiler should allocate the >>> components of Some_Array on separate cache lines (and align it to a >>> cache line boundary)? That would make more sense than what I've >>> heard so far. >> >> No, I think the only point was to avoid erroneousness due to >> simultaneous access, which the presence of other representation items >> might imply. >> >> Efficiency is a different issue. > > OK. A feature that provided such an efficiency hint would be useful, > IMHO. (No, I'm not proposing to add one.) > >> I will admit I have lost track of the issue. Is it whether "aliased" >> implies "independent"? >> ...I would say yes that should >> be true. > > And Randy and Steve agree with you. > But I still don't understand why. > Sorry if I'm being dense. My sense is that in ARG meetings over the past several years, we have all agreed we need something like Independent to overcome the slightly odd rule about erroneousness coming from any non-confirming rep clause. And of course there is nothing in the source that says "this is a confirming rep clause" so pragma Independent is one way to ensure that whether or not a rep-clause is confirming, it doesn't cause the loss of independence. Aliased seems to serve a completely different purpose, namely the ability to use 'Access. The argument for making "aliased" imply independence is simply that once you create an access value, it is pretty much impossible to keep track of where it came from. And clearly independence is guaranteed for distinct dynamically-allocated objects. But I think it would be odd to tell someone that if they want to overcome the loss of independence due to a rep-clause, they should add "aliased." That seems non-intuitive and somewhat user unfriendly. **************************************************************** From: Robert Dewar Sent: Monday, February 4, 2013 5:16 PM >> My understanding is the opposite: 'aliased' never was intended to >> imply independent addressability -- just plain old addressability. >> >> If 'aliased' implies independent addressability, then why were >> pragmas Independent[_Components] added? Independent Components is weaker than aliased: aliased means you can have a pointer to the object; Independent Components means you can access them independently. Example: On the x86, you can test, set, and clear individual bits, so a bit packed array can have independent components (if your code generator can handle this; GNAT for one cannot, so we can't allow that). But that does not mean you can have a pointer to an individual bit! **************************************************************** From: Robert Dewar Sent: Monday, February 4, 2013 5:19 PM > The reason is obvious: non-independent designated objects is nonsense > that cannot be allowed (unless we add an additional way to declare > independent addressability of designated objects). I really don't think it matters two hoots whether the language allows non-independent designated objects. The semantics is clear, though peculiar, and on 100% of targets you can't have such things, so they wouldn't be implemented anyway! 
> Because we all have assumed for decades that it IS there; the design > of Independent makes that clear (it would make no sense without such > an implication). If one can't read that between the lines of 9.10, > then it needs to be made explicit somehow (either with extra wording > or an expansion of Independent). > >> Robert's comment (about the way hardware behaves in practice) argues >> against making any change, because it won't actually change anything. > If you want to change Pack so that it is not illegal for aliased > components, and then "make no change" to aliased, that I don't > understand, because we need some rule to allow aliased components to > be packed less tightly. (There is no exception for aliased in the > Recommended Level of Support.) And once you add such an exception, you need to describe what it means. I don't see any reason to disallow pack for a record that has some aliased components. I agree that pack for an array of aliased components is unlikely to do anything! **************************************************************** From: Robert Dewar Sent: Monday, February 4, 2013 5:24 PM > In my view, pragma Pack is really just a special case of "pragma > Optimize(Space)" which is a pretty non-specific request. The only > extra bit of semantics for Pack is the pesky "non-independence" > implication, and pragma/aspect Independent/Independent_Components can > be used to overcome that bit. Not quite; packed arrays of boolean are guaranteed to work as expected. > If you really want to control representation, Component_Size or a > record rep clause are the way to go. Pack is not really for > controlling representation, it is for establishing a space bias in the > default representation selection mechanism. Well, in practice everyone uses it for boolean arrays to control representation: type B is array (0 .. 31) of Boolean; pragma Pack (B); is *VERY* standard Ada, and even advising, let alone insisting, that people use Component_Size of 1 in such a case seems bogus to me. **************************************************************** From: Robert Dewar Sent: Monday, February 4, 2013 5:43 PM >> To see a practical effect of >> >> > type Some_Array is array (1 .. 10) of Character; >> > pragma Independent_Components (Some_Array); > > Sorry, now I'm getting MORE confused. There's no Pack or "for > Some_Array'Component_Size use..." above. > So Ada requires the components of Some_Array to be independently > addressable even without that pragma. > So I fail to see any "practical effect" of the pragma. OK, add the pragma Pack to the example. > By the way, type String has pragma Pack, which I always thought was so > String'Component_Size can be 8 even on such weird machines as the old > Alpha! That's exactly right. **************************************************************** From: Robert Dewar Sent: Monday, February 4, 2013 5:45 PM I have an idea: how about we make sure that whatever we say makes sense on 8-bit byte-addressable machines, with independent 8-bit bytes, and byte pointers? That covers almost all machines on which Ada is used or likely to be used. On machines that do not meet these criteria, you simply appeal to the normal argument that you can't do things if the architecture does not permit it. It really is a silly waste of time to try to write rules that cover all such possible weird machines. 
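[Editor's note: the two styles contrasted above, side by side. As Tucker notes, Pack is a space-bias request, while Component_Size (or a record representation clause) is the way to pin down a specific layout. The type names are invented.]

   --  The "*VERY* standard Ada" form Robert defends:
   type Flags_Packed is array (0 .. 31) of Boolean;
   pragma Pack (Flags_Packed);             --  normally yields 1 bit per component

   --  The explicit form, for when the layout itself is a requirement:
   type Flags_Explicit is array (0 .. 31) of Boolean;
   for Flags_Explicit'Component_Size use 1;
   for Flags_Explicit'Size use 32;
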
**************************************************************** From: Bob Duff Sent: Monday, February 4, 2013 7:18 PM > My sense is that in ARG meetings over the past several years, we have > all agreed we need something like Independent to overcome the slightly > odd rule about erroneousness coming from any non-confirming rep > clause. Well, it doesn't really say "erroneous", it says it's implementation-defined whether there is erroneousness. To me, that means that on "normal" machines, it won't be erroneous, and that should be good enough. But I'm willing to just give in on this point. I'm not entirely convinced, but everybody seems against me on this, and it's just not important enough to keep arguing about. **************************************************************** From: Bob Duff Sent: Monday, February 4, 2013 7:41 PM > That must have been Tucker # 42. Never heard of that Tucker. >...I don't remember this > suggestion. My feeling has generally been that "pack" means use >space optimization over time optimization, subject to all of the other >representation requirements. I think I understand what you mean, but I think that's a wrong (or obsolete) way to express it. Bit-level packing improves the speed of whole-object operations like "=" and ":=" and parameter passing. And it improves speed by making things smaller and therefore more cache-friendly. It damages the speed of operating on individual components. So Pack tells the compiler how to make that speed trade-off. On a small-memory embedded system, Pack might be about space efficiency, but on a 64-bit computer, it's purely about time efficiency -- it means "reduce space of this data structure in order to improve time overall" (as opposed to "reduce space because I might run out of memory"). It's not "space versus speed", it's "speed of these operations versus speed of those operations". >..It should never be illegal to say "pack,"... Yes, that's the important point about this AI. **************************************************************** From: Randy Brukardt Sent: Monday, February 4, 2013 7:54 PM > > My sense is that in ARG meetings over the past several years, we > > have all agreed we need something like Independent to overcome the > > slightly odd rule about erroneousness coming from any non-confirming > > rep clause. > > Well, it doesn't really say "erroneous", it says it's > implementation-defined whether there is erroneousness. > To me, that means that on "normal" machines, it won't be erroneous, > and that should be good enough. I don't follow. One expects packed array of Boolean to pack to bits on a "normal" machine, and that surely would cause the potential for erroneousness. Moreover, this is a particularly nasty kind of erroneousness, as the failure doesn't happen unless two tasks happen to access two components in the same word at the same time, something that is unlikely even when an obvious mistake happens. It's unlikely that one would find this erroneous case by testing, and that leads to the possibility of the problem not happening until the application is fielded -- which is not good. By having a declaration of Independence ;-), we can more easily write tools to check that in fact no such accesses are possible. I suppose what you say might be true for aliased components, perhaps you meant that (but Tucker was clearly talking about this in general). 
Even then, it seems dangerous because this is a problem that can't reasonably be tested away, and we don't want tools to be making checks just for "normal" machines (nor complaining too much about cases that can't happen on the machines that the user is using). > But I'm willing to just give in on this point. I'm not entirely > convinced, but everybody seems against me on this, and it's just not > important enough to keep arguing about. Glad you are seeing reason. ;-) I'm happy to stop discussing this as well. **************************************************************** From: Randy Brukardt Sent: Monday, February 4, 2013 8:04 PM ... > Bit-level packing improves the speed of whole-object operations like > "=" and ":=" and parameter passing. And it improves speed by making > things smaller and therefore more cache-friendly. > It damages the speed of operating on individual components. > So Pack tells the compiler how to make that speed trade-off. > > On a small-memory embedded system, Pack might be about space > efficiency, but on a 64-bit computer, it's purely about time > efficiency -- it means "reduce space of this data structure in order > to improve time overall" (as opposed to "reduce space because I might > run out of memory"). > > It's not "space versus speed", it's "speed of these operations versus > speed of those operations". I think it is *just* about reducing the space (and should almost never be used in a modern system as a consequence). If it was really about reducing the time, why don't we have a counterpart that goes in the other direction ("use as much space as you want to maximize the performance of accessing individual components")? Given cache pressure concerns, the normal case is usually somewhere in the middle (particularly on machines with somewhat irregular access instructions like the x86). Occasionally, you want to go all one way or the other. **************************************************************** From: Robert Dewar Sent: Tuesday, February 5, 2013 4:58 AM > I don't follow. One expects packed array of Boolean to pack to bits on > a "normal" machine, and that surely would cause the potential for > erroneousness. Actually, this is a good example. If we are on the x86, and we write: type R is array (0 .. 31) of Boolean; pragma Pack (R); for R'Size use 32; a very typical set of declarations, then it is really a tossup whether we can rely on separate tasks being able to fiddle with different bits. There are three cases: a) the code generator always uses the memory bit instructions to access the bits. Probably a bad choice, but would assure independence for the above case. b) the code generator is incapable of using these memory bit instructions, in which case such simultaneous access is not possible, and a pragma Independent_Components would be rejected. c) the code generator can generate these instructions, but only if told to do so. In this case the pragma Independent_Components is a perfect way of providing this instruction. Really a perfect example of the value of the pragma! > Moreover, this is a particularly nasty kind of erroneousness, as the > failure doesn't happen unless two tasks happen to access two > components in the same word at the same time, something that is > unlikely even when an obvious mistake happens. It's unlikely that one > would find this erroneous case by testing, and that leads to the > possibility of the problem not happening until the application is > fielded -- which is not good. 
****************************************************************

From: Robert Dewar
Sent: Tuesday, February 5, 2013  5:11 AM

> On a small-memory embedded system, Pack might be about space
> efficiency, but on a 64-bit computer, it's purely about time
> efficiency -- it means "reduce space of this data structure in order
> to improve time overall" (as opposed to "reduce space because I might
> run out of memory").

No, it's about space efficiency too. Even on a 64-bit machine, it's just impractical to have programs with giant working sets. Yes, one can see this as a time issue, but really most people would think of it as a space issue.

Randy's view that pragma Pack makes no sense on such machines is just incomprehensible!

Let me give an example.

The GNAT compiler allocates a 192-byte block for entities. We recently increased this from 160 bytes to add more fields and flags for Ada 2012 and new pragmas etc., and we saw a significant but tolerable effect on compiler space and time.

Now these entities include space for about 325 Boolean flags (taking up 40 of those 192 bytes), packed of course.

Randy thinks we should remove the pragma Pack on modern machines. This would increase the size of entities by 280 bytes, giving a total of nearly 500 bytes. The impact on compiler space and time would be completely disastrous! Makes no sense at all.

And gains nothing! Most accesses to flags are reads, and the extra cost of reading packed bits is negligible.

I think a lot of people underestimate cache effects. Again you can consider these as space or time issues, but it is more reasonable to think of them as space issues: you need to have things small to keep things in cache.

With modern processors, out-of-cache memory references are disastrously slow, and many programs are memory bound.

If you want to get maximum performance, you have to really think about cache effects.

For example, if you do matrix multiplication with the familiar three nested loops, you get absolutely disastrous performance if the matrix does not fit in the primary cache (a huge proportion of the references will be out of cache in this case). Instead you have to tile, adjusting the tiling parameters to match the size of the primary cache. The difference in performance is staggering. [Editor's note: a sketch of this tiling appears at the end of this message.]

So once again, packed bit arrays are a really important aspect of Ada (another familiar use of such bit arrays is in optimization in a compiler -- I assume I don't need to give details to this audience!). The space taken up by these representations of sets can be huge even when they are packed.

And of course we have not even talked of the use of bit packing to match external data structures, a common situation.
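[Editor's note: A sketch of the tiling described above, for square matrices of Float; the names, the matrix dimension, and the tile size are invented for illustration, and the best tile size depends on the primary cache of the actual target. Each (I0, J0, K0) step works on three small blocks that fit in cache together, instead of streaming whole rows and columns through the cache for every element of the result.

   N    : constant := 1024;   -- matrix dimension (assumed)
   Tile : constant := 64;     -- tile size; tune to the primary cache

   type Matrix is array (0 .. N - 1, 0 .. N - 1) of Float;

   procedure Multiply (A, B : in Matrix; C : out Matrix) is
   begin
      C := (others => (others => 0.0));
      for I0 in 0 .. N / Tile - 1 loop
         for J0 in 0 .. N / Tile - 1 loop
            for K0 in 0 .. N / Tile - 1 loop
               --  Multiply one Tile-by-Tile block of A by one block of B,
               --  accumulating into the corresponding block of C.
               for I in I0 * Tile .. I0 * Tile + Tile - 1 loop
                  for J in J0 * Tile .. J0 * Tile + Tile - 1 loop
                     for K in K0 * Tile .. K0 * Tile + Tile - 1 loop
                        C (I, J) := C (I, J) + A (I, K) * B (K, J);
                     end loop;
                  end loop;
               end loop;
            end loop;
         end loop;
      end loop;
   end Multiply;

The familiar three-loop version streams whole rows and columns of A and B through the cache for every element of C; once the matrices no longer fit, most of those references miss.]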
****************************************************************

From: Randy Brukardt
Sent: Wednesday, February 6, 2013  1:45 PM

> No, it's about space efficiency too. Even on a 64-bit machine, it's
> just impractical to have programs with giant working sets. Yes, one
> can see this as a time issue, but really most people would think of it
> as a space issue.
>
> Randy's view that pragma Pack makes no sense on such machines is just
> incomprehensible!

I don't think I said anything about "no sense". I said that Pack was solely about space management (which you agree with above), and I said that as a consequence it would "almost never" be used in systems where execution time is the primary criterion. "Almost never" is probably a bit stronger than I meant ("rarely" would have been a better choice), but I stand by my comment.

> Let me give an example.
>
> The GNAT compiler allocates a 192-byte block for entities.
> We recently increased this from 160 bytes to add more fields and flags
> for Ada 2012 and new pragmas etc., and we saw a significant but
> tolerable effect on compiler space and time.
>
> Now these entities include space for about 325 Boolean flags (taking
> up 40 of those 192 bytes), packed of course.
>
> Randy thinks we should remove the pragma Pack on modern machines. This
> would increase the size of entities by 280 bytes, giving a total of
> nearly 500 bytes.

Not at all. You most likely don't want to manage such a situation with aspect Pack, because it is such a blunt instrument. In the example you gave (and I expect most similar cases), using pragma Pack indiscriminately would increase the execution time more than using representation clauses carefully.

I don't know exactly how GNAT organizes that "block", but in Janus/Ada it is a large variant record. The effect of applying Pack to that record would be to force all components to their minimum size, potentially including forcing pointers and large integers to be misaligned. But that doesn't make much sense, because all that you are trying to do is fit the data into a space of a particular size. You don't want to shrink components that aren't part of the "critical path" for that space -- for instance, the smaller variant limbs. And you probably don't want to pack components that are very commonly used, either, even if you could (the overall discriminant that controls this record is an example; it gets used in lots of discriminant checks even when it isn't explicitly referenced).

So in this case, Pack just does too much. You have to use a record rep. clause to shrink just the components that matter and leave the rest of them alone. [We also need to keep binary compatibility as much as possible, so we don't want components moving around unnecessarily -- but that's not the primary reason to avoid Pack in the general case.]
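[Editor's note: A sketch of the selective approach described above, with hypothetical component names; it is not the actual GNAT or Janus/Ada layout, and it assumes a 32-bit Integer. Only the flags are squeezed into single bits; the commonly read Kind component keeps a full storage element, and Count keeps its natural size and alignment -- a distinction aspect Pack alone cannot express.

   type Node_Kind is (N_Variable, N_Constant, N_Subprogram);

   type Node is record
      Kind      : Node_Kind;
      Is_Static : Boolean;
      Is_Frozen : Boolean;
      Count     : Integer;
   end record;

   for Node use record
      Kind      at 0 range 0 .. 7;   -- a full byte: cheap to read and test
      Is_Static at 1 range 0 .. 0;   -- single bits where the space matters
      Is_Frozen at 1 range 1 .. 1;
      Count     at 4 range 0 .. 31;  -- natural size and alignment (32-bit Integer assumed)
   end record;
   for Node'Size use 64;

Only the components on the "critical path" for space are shrunk; the others keep their natural size and alignment.]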
Now, I realize you can restructure your data structures with sub-records and sub-arrays so that those parts can use Pack (that would be the "rare" case I was referring to originally), but unless that happened for other reasons, I think that is putting the cart before the horse: your program structure shouldn't be determined by what representation clauses you want to apply! (And using arrays of Booleans rather than records of Booleans makes little sense in the absence of outside considerations; these bits had better be named, and once they are, record components are easier to read and write.)

> The impact on compiler space and time would be completely disastrous!
> Makes no sense at all.
>
> And gains nothing! Most accesses to flags are reads, and the extra cost
> of reading packed bits is negligible.

I agree that reading packed Booleans (if they're in records or the array index is static) doesn't cost much, mainly because they're almost always immediately tested, and you can combine the bit extract and the test so that you can eliminate the shift step. That's less true for other enumerations, though, and they too take up a lot of extra space.

> I think a lot of people underestimate cache effects.
> Again you can consider these as space or time issues, but it is more
> reasonable to think of them as space issues: you need to have things
> small to keep things in cache.
>
> With modern processors, out-of-cache memory references are
> disastrously slow, and many programs are memory bound.
>
> If you want to get maximum performance, you have to really think about
> cache effects.
>
> For example, if you do matrix multiplication with the familiar three
> nested loops, you get absolutely disastrous performance if the matrix
> does not fit in the primary cache (a huge proportion of the references
> will be out of cache in this case).
> Instead you have to tile, adjusting the tiling parameters to match the
> size of the primary cache.
> The difference in performance is staggering.
>
> So once again, packed bit arrays are a really important aspect of Ada
> (another familiar use of such bit arrays is in optimization in a
> compiler -- I assume I don't need to give details to this audience!).
> The space taken up by these representations of sets can be huge even
> when they are packed.

We didn't use bit arrays in our optimizer, specifically because the sets get too large to be practical (especially when your original host was MS-DOS with 640K RAM). Instead we used property lists (with a very compact representation for the properties).

> And of course we have not even talked of the use of bit packing to
> match external data structures, a common situation.

Huh? Pack does whatever it does; if you have to match some specific data structure, you have to be more specific, using 'Component_Size and record representation clauses. I'm only talking about aspect Pack (capital P), not the general idea of bit packing!!

If you have to match something with 'Component_Size = 1, write that; don't assume that Pack will do the right thing (especially as Bob and Tucker want to weaken those guarantees; if the component turns out to be aliased or atomic, no packing will happen and that's probably not what was meant).

****************************************************************
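[Editor's note: A sketch of the final point above, with invented type names. When the layout is dictated by external criteria, state the layout directly; Pack is only a request for tight packing, and if a component turns out to be aliased or atomic, no packing of that component will happen.

   --  Layout required by an external interface: exactly one bit per flag.
   type Flag_Array is array (0 .. 31) of Boolean
      with Component_Size => 1, Size => 32;

   --  Merely a request for tight packing; the representation may change
   --  if the components later become aliased, atomic, or independent.
   type Flag_Set is array (0 .. 31) of Boolean
      with Pack;

With Component_Size specified, a later change that makes the components aliased or atomic is rejected at compile time instead of silently changing the representation.]

****************************************************************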