!standard 13.2(6.1/2)          13-07-08 AI12-0001-1/05
!standard 13.2(7)
!standard 13.2(8)
!standard 13.2(9/3)
!standard C.6(8.1/3)
!standard C.6(10)
!standard C.6(11)
!standard C.6(21)
!standard C.6(24)
!class binding interpretation 06-03-31
!status Amendment 202x 13-07-08
!status ARG Approved 6-0-3 13-06-14
!status work item 06-03-31
!status received 06-03-30
!priority Medium
!difficulty Medium
!qualifier Omission
!subject Independence and Representation clauses for atomic objects
!summary
[Editor's note: This AI was carried over from Ada 2005.]
Aspect Pack does not require tight packing of components for which it is infeasible (components that are atomic or aliased, of a by-reference type, or that must be independently addressable).
!question
The Recommended Level of Support implies that it is required to support pragma Pack on types that have Atomic_Components, even to the bit level. Is this the intent? (No.)
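For example, the case in question (an editorial sketch adapted from the discussion in the !appendix; the type name is arbitrary) looks like:
    type Bit_Array is array (1 .. 8) of Boolean;
    pragma Pack (Bit_Array);
    pragma Atomic_Components (Bit_Array);
On a typical byte-addressable target, 1-bit components cannot be read and updated indivisibly and independently, so tight packing is not feasible here.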
!recommendation
Resolve the difference by eliminating C.6(21) and changing 13.2(6.1/2) into a Recommended Level of Support rule under which by-reference, aliased, and atomic components of packed types must be aligned according to the alignment of their subtype.
Change 13.2(8 and 9) to add an exception for components that have alignment requirements as detailed above.
In C.6(8.1/3), add "and aliased objects" after "atomic objects".
In C.6(10-11), add "and independent" after "indivisible".
Delete C.6(21) as it is no longer needed.
!wording
Modify AARM 9.10(1.d/3):
Ramification: An atomic object (including atomic components) is always independently addressable from any other nonoverlapping object. {Aspect_specifications and representation items cannot change that fact.} [Any aspect_specification or representation item which would prevent this from being true should be rejected, notwithstanding what this Standard says elsewhere.] Note, however, that the components of an atomic object are not necessarily atomic.
Add an AARM Language Design Principle after 13.2(1):
If the default representation already uses minimal storage for a particular type, aspect Pack might not cause any representation change. It follows that aspect Pack should always be allowed, even when it has no effect on the representation.
As a consequence, the chosen representation for a packed type may change during program maintenance even if the type itself is unchanged (in particular, if other representation aspects change on a part of the type). This is different from the behavior of most other representation aspects, whose properties remain guaranteed no matter what changes are made to other aspects.
Therefore, aspect Pack should not be used to achieve a representation required by external criteria. For instance, setting Component_Size to 1 should be preferred over using aspect Pack to ensure an array of bits. If future maintenance were to make the array components aliased, independent, or atomic, the Component_Size version would become illegal (immediately identifying the problem), while the aspect Pack version would simply change representation (probably causing a hard-to-find bug).
End Language Design Principle.
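To illustrate the preceding principle, consider this editorial sketch (type names are arbitrary):
    type Packed_Bits is array (1 .. 32) of Boolean
       with Pack;                 -- representation may drift during maintenance
    type Exact_Bits is array (1 .. 32) of Boolean
       with Component_Size => 1;  -- layout is guaranteed, or the clause is rejected
If the component definition is later changed to "aliased Boolean", the Component_Size specification typically becomes illegal (making the problem visible immediately), while Packed_Bits remains legal but quietly falls back to a looser representation.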
Delete 13.2(6.1/2) which currently says:
If a packed type has a component that is not of a by-reference type and has no aliased part, then such a component need not be aligned according to the Alignment of its subtype; in particular it need not be allocated on a storage element boundary.
Add a new bullet after 13.2(7/3):
* Any component of a packed type that is of a by-reference type, that is specified as independently addressable, or that contains an aliased part, shall be aligned according to the alignment of its subtype.
AARM Ramification: This also applies to atomic components. "Atomic" implies "specified as independently addressable", so we don't need to mention atomic here.
Other components do not have to follow the alignment of the subtype when packed; in many cases, the Recommended Level of Support will require the alignment to be ignored. End AARM Ramification.
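As an editorial illustration of the new bullet (a sketch; the consequences described assume a typical byte-addressable target):
    type Aliased_Bits is array (1 .. 8) of aliased Boolean
       with Pack;
    --  Each component contains an aliased part, so it must be aligned
    --  according to Boolean'Alignment; Pack therefore cannot reduce
    --  Component_Size below one storage element here, but the
    --  declaration remains legal.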
Modify 13.2(8):
* For a packed record type, the components should be packed as tightly as possible subject to {the above alignment requirements,} the Sizes of the component subtypes, and [subject to] any record_representation_clause that applies to the type; the implementation may, but need not, reorder components or cross aligned word boundaries to improve the packing. A component whose Size is greater than the word size may be allocated an integral number of words.
Modify 13.2(9/3):
* For a packed array type, if the Size of the component subtype is less than or equal to the word size, Component_Size should be less than or equal to the Size of the component subtype, rounded up to the nearest factor of the word size {, unless this would violate the above alignment requirements}.
Delete AARM 13.2(9.a), because the new alignment requirement above makes it redundant:
Ramification: If a component subtype is aliased, its Size will generally be a multiple of Storage_Unit, so it probably won't get packed very tightly.
Modify C.6(8.1/3):
When True, the aspects Independent and Independent_Components specify as independently addressable the named object or component(s), or in the case of a type, all objects or components of that type. All atomic objects {and aliased objects} are considered to be specified as independently addressable.
Add "and independent" to C.6(10/3-11), twice:
It is illegal to specify either of the aspects Atomic or Atomic_Components to have the value True for an object or type if the implementation cannot support the indivisible {and independent} reads and updates required by the aspect (see below).
It is illegal to specify the Size attribute of an atomic object, the Component_Size attribute for an array type with atomic components, or the layout attributes of an atomic component, in a way that prevents the implementation from performing the required indivisible {and independent} reads and updates.
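As an editorial illustration of the effect of adding "and independent" (a sketch; legality depends on the target):
    type Flag_Array is array (1 .. 8) of Boolean
       with Atomic_Components, Component_Size => 1;
    --  On a typical target that cannot read and update single bits
    --  indivisibly and independently, this Component_Size specification
    --  is illegal by C.6(11) as modified above.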
Delete C.6(21/3) and the associated AARM note, because the new alignment requirement above covers this case:
If the Pack aspect is True for a type any of whose subcomponents are atomic, the implementation shall not pack the atomic subcomponents more tightly than that for which it can support indivisible reads and updates.
Implementation Note: Usually, specifying aspect Pack for such a type will be illegal as the Recommended Level of Support cannot be achieved; otherwise, a warning might be appropriate if no packing whatsoever can be achieved.
Add a new note after C.6(24):
Specifying the Pack aspect cannot override the effect of specifying an Atomic or Atomic_Components aspect.
!discussion
The idea of Pack is that if it is infeasible to pack a given component tightly (because it is atomic, aliased, of a by-reference type, or must be independently addressable), then Pack is not illegal; it just does not pack that component as tightly as it could if those properties were absent.
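For example (an editorial sketch adapted from the !appendix; the sizes mentioned are typical, not required):
    type Very_Short is range 0 .. 7;
    type VS_Array is array (Positive range <>) of Very_Short
       with Pack, Atomic_Components;
    --  Pack can typically shrink the components only to 8 bits each,
    --  the smallest size supporting indivisible, independent access on
    --  most byte-addressable machines; without Atomic_Components,
    --  3 or 4 bits per component would be expected.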
This was always the intent, but the Recommended Level of Support (RLS) contradicted it.
Making the alignment requirement part of the Recommended Level of Support eliminates the conflict between the RLS and the intent.
Note that we require aliased objects to always be independently addressable. We want dereferences to be task-safe in this way: modifying an object through a dereference will never clobber an adjacent component (even momentarily).
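A sketch of the aliased case (editorial; names are arbitrary):
    type Byte is mod 2**8;
    type Buffer is array (1 .. 4) of aliased Byte;
    type Byte_Ref is access all Byte;
    B  : Buffer := (others => 0);
    P1 : constant Byte_Ref := B (1)'Access;
    P2 : constant Byte_Ref := B (2)'Access;
    --  Because aliased components are independently addressable, one
    --  task writing P1.all while another writes P2.all cannot clobber
    --  the other component, even momentarily.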
!corrigendum 13.2(6.1/2)
Delete the paragraph:
If a packed type has a component that is not of a by-reference type and has no aliased part, then such a component need not be aligned according to the Alignment of its subtype; in particular it need not be allocated on a storage element boundary.
!corrigendum 13.2(7)
Insert after the paragraph:
The recommended level of support for pragma Pack is:
the new paragraph:
Any component of a packed type that is of a by-reference type, that is specified as independently addressable, or that contains an aliased part, shall be aligned according to the alignment of its subtype.
!corrigendum 13.2(8)
Replace the paragraph:
For a packed record type, the components should be packed as tightly as possible subject to the Sizes of the component subtypes, and subject to any record_representation_clause that applies to the type; the implementation may, but need not, reorder components or cross aligned word boundaries to improve the packing. A component whose Size is greater than the word size may be allocated an integral number of words.
by:
For a packed record type, the components should be packed as tightly as possible subject to the above alignment requirements, the Sizes of the component subtypes, and any record_representation_clause that applies to the type; the implementation may, but need not, reorder components or cross aligned word boundaries to improve the packing. A component whose Size is greater than the word size may be allocated an integral number of words.
!corrigendum 13.2(9/3)
Replace the paragraph:
For a packed array type, if the Size of the component subtype is less than or equal to the word size, Component_Size should be less than or equal to the Size of the component subtype, rounded up to the nearest factor of the word size.
by:
For a packed array type, if the Size of the component subtype is less than or equal to the word size, Component_Size should be less than or equal to the Size of the component subtype, rounded up to the nearest factor of the word size, unless this would violate the above alignment requirements.
!corrigendum C.6(8.1/3)
Replace the paragraph:
When True, the aspects Independent and Independent_Components specify as independently addressable the named object or component(s), or in the case of a type, all objects or components of that type. All atomic objects are considered to be specified as independently addressable.
by:
When True, the aspects Independent and Independent_Components specify as independently addressable the named object or component(s), or in the case of a type, all objects or components of that type. All atomic objects and aliased objects are considered to be specified as independently addressable.
!corrigendum C.6(10)
Replace the paragraph:
It is illegal to specify either of the aspects Atomic or Atomic_Components to have the value True for an object or type if the implementation cannot support the indivisible reads and updates required by the aspect (see below).
by:
It is illegal to specify either of the aspects Atomic or Atomic_Components to have the value True for an object or type if the implementation cannot support the indivisible and independent reads and updates required by the aspect (see below).
!corrigendum C.6(11)
Replace the paragraph:
It is illegal to specify the Size attribute of an atomic object, the Component_Size attribute for an array type with atomic components, or the layout attributes of an atomic component, in a way that prevents the implementation from performing the required indivisible reads and updates.
by:
It is illegal to specify the Size attribute of an atomic object, the Component_Size attribute for an array type with atomic components, or the layout attributes of an atomic component, in a way that prevents the implementation from performing the required indivisible and independent reads and updates.
!corrigendum C.6(21)
Delete the paragraph:
If a pragma Pack applies to a type any of whose subcomponents are atomic, the implementation shall not pack the atomic subcomponents more tightly than that for which it can support indivisible reads and updates.
!corrigendum C.6(24)
Insert after the paragraph:
NOTES:
9 An imported volatile or atomic constant behaves as a constant (i.e. read-only) with respect to other parts of the Ada program, but can still be modified by an "external source."
the new paragraph:
10 Specifying the Pack aspect cannot override the effect of specifying an Atomic or Atomic_Components aspect.
!ACATS test
There might be value in checking that Pack is allowed in all cases, even when it has no effect on the representation. For instance, combining aspect Pack with Atomic_Components for small types like Boolean should always work (but do nothing on most targets). (Test CXC6003 included such a case; this case has been removed from the test pending the outcome of this AI, and most likely this should be a separate test.)
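A possible test fragment (an editorial sketch only; an actual ACATS test would differ):
    type Small_Flags is array (1 .. 4) of Boolean
       with Pack, Atomic_Components;
    Obj : Small_Flags := (others => False);
    --  These declarations must be accepted; on most targets Pack has no
    --  effect here and Small_Flags'Component_Size remains at least one
    --  storage element.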
!appendix

From: Jean-Pierre Rosen
Sent: Friday, February 17, 2006  6:34 AM

A question that arose while designing a rule for AdaControl about shared
variables.

If a variable is subject to a pragma Atomic_Components, is it safe for
two tasks to update *different* components without synchronization?

C.6 talks only about indivisibility, not independent addressing. Of
course, you have to throw 9.10 in...

The whole issue is with the "(or of a neighboring object if the two are
not independently addressable)" in 9.10(11), while C.6 (17) says that
"Two actions are sequential (see 9.10) if each is the read or update of
the same atomic object", but doesn't mention neighboring objects.

In a sense, indivisibility guarantees only that there cannot be
temporary incorrect values in a variable due to the fact that the
variable is written by more than one memory cycle. The issue *is*
different from independent addressability. OTOH, Atomic_Components
without independent addressability seems pretty much useless...

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  5:55 AM

Answer seems clear, yes it is safe, provided that independence is
assured, which means that there is no rep clause that would disturb
the independence.

If you are suggesting that Atomic Components should guarantee such
independence, and result in the rejection of rep clauses that would
compromise it, that seems reasonable, e.g. you have a packed array
of bits with atomic components, that's definitely peculiar, and
it seems reasonable to reject it.

****************************************************************

From: Pascal Leroy
Sent: Thursday, March 30, 2006  6:07 AM

> If a variable is subject to a pragma Atomic_Components, is it safe for
> two tasks to update *different* components without synchronization?

I think that 9.10(1) is quite clear: distinct objects are independently
addressable unless "packing, record layout or Component_Size is
specified".

So regardless of atomicity, it is always safe to read/update two distinct
components of an object (in the absence of packing, etc.).  What
Atomic_Component buys you is that reads/updates of the same component are
sequential.

****************************************************************

From: Jean-Pierre Rosen
Sent: Thursday, March 30, 2006  6:17 AM

Of course, my question was in the case of the presence of packing etc.

The answer seems to be no, there is no *additional* implication on
addressability due to atomic_components. Correct?

****************************************************************

From: Pascal Leroy
Sent: Thursday, March 30, 2006  6:25 AM

> Of course, my question was in the case of the presence of packing etc.

In the presence of packing, 9.10(1) says that independent addressability
is "implementation defined", which is not too helpful.  (This topic was
discussed a few weeks ago as part of another thread, btw.)

> The answer seems to be no, there is no *additional* implication on
> addressability due to atomic_components. Correct?

Right.

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006  6:57 AM

The ARG recently disallowed combining a pair of atomic operations
on distinct objects into a single operation, I believe.
I would certainly support saying that array-of-aliased
and array-of-atomic would ensure independence between
components, even in the presence of other rep-clauses.
That seems like a reasonable interpretation of what
atomic means, and "aliased" implies that you can
have multiple access paths that make no visible use
of indexing, and hence you would certainly want independence.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  7:58 AM

> The ARG recently disallowed combining a pair of atomic operations
> on distinct objects into a single operation, I believe.
> I would certainly support saying that array-of-aliased
> and array-of-atomic would ensure independence between
> components, even in the presence of other rep-clauses.

Wait a moment, then you have to give permission to reject
these "other rep clauses", you can't insist that they be
recognized and independence be preserved!

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  8:02 AM

> In the presence of packing, 9.10(1) says that independent addressability
> is "implementation defined", which is not too helpful.  (This topic was
> discussed a few weeks ago as part of another thread, btw.)

It seems *really* nasty to make this implementation defined, I hate
erroneousness being imp defined. Is this a new change, I missed it.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  8:08 AM

> So regardless of atomicity, it is always safe to read/update two distinct
> components of an object (in the absence of packing, etc.).  What
> Atomic_Component buys you is that reads/updates of the same component are
> sequential.

.. and atomic!

But there is still the issue of something like this

    type X is array (1 .. 8) of Boolean;
    pragma Pack (X);
    pragma Atomic_Components (X);

Should one of the two pragmas be ignored, or should one of
them be rejected, or what? In GNAT we get:


a.ads:4:30: warning: Pack canceled, cannot pack atomic components

is that behavior OK? forbidden? mandated?
(not clear to me at any rate)

****************************************************************

From: Pascal Leroy
Sent: Thursday, March 30, 2006  8:17 AM

> It seems *really* nasty to make this implementation defined,
> I hate erroneousness being imp defined. Is this a new change,
> I missed it.

This is not new, it has been like that since Ada 95, and the last time
this was discussed (around Feb, 24th, thread titled "Independence and
confirming rep. clauses"), the two of us (at least) agreed that it was
poor language design.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  8:25 AM

OK, so I just misremembered here, sorry!

****************************************************************

From: Pascal Leroy
Sent: Thursday, March 30, 2006  8:25 AM

> is that behavior OK? forbidden? mandated?
> (not clear to me at any right)

It's certainly OK to reject any representation item that you don't like.
However, it appears that the implementation advice about pragma Pack does
not mention atomicity, so you are not following the advice, and you don't
comply with Annex C.

On a machine that could independently address bits, the two pragmas could
well coexist, so there is some amount of implementation dependence here.

For the record Apex also ignores Pack in this example, although it doesn't
emit a warning.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  8:41 AM

> It's certainly OK to reject any representation item that you don't like.
> However, it appears that the implementation advice about pragma Pack does
> not mention atomicity, so you are not following the advice, and you don't
> comply with Annex C.

Yes, but it is impossible to comply on virtually all machines

> On a machine that could independently address bits, the two pragmas could
> well coexist, so there is some amount of implementation dependence here.

There are almost no such machines!

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006  8:21 AM

> Wait a moment, then you have to give permission to reject
> these "other rep clauses", you can't insist that they be
> recognized and independence be preserved!

I believe there are already rules that effectively allow that,
once we make it clear that being atomic also implies being
independent of neighboring objects.  E.g. C.6(10-11):

    It is illegal to apply either an Atomic or Atomic_Components pragma
    to an object or type if the implementation cannot support the
    indivisible reads and updates required by the pragma (see below).

    It is illegal to specify the Size attribute of an atomic object,
    the Component_Size attribute for an array type with atomic
    components, or the layout attributes of an atomic component,
    in a way that prevents the implementation from performing the
    required indivisible reads and updates.

Probably would want to change "indivisible" to
"indivisible and independent" in both of the above paragraphs.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  1:30 PM

SO I guess you would consider my packed example illegal, and the
warning should be a real illegality?

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006  2:24 PM

> SO I guess you would consider my packed example illegal, and the
> warning should be a real illegality?

Pragma Pack is a little different.  It says "pack as
tightly as you can, subject to all the other requirements
imposed on the type."  So you never need to reject a
pragma Pack.  I could imagine that in the absence of
a pragma Pack, some implementations might make the following
array 32-bits/element:

    type Very_Short is new Integer range 0..7;
    type VS_Array is array(Positive range <>) of Very_Short;
    pragma Atomic_Components(VS_Array);

but if we add a pragma Pack(VS_Array), I would expect it to be shrunk
down to 8 bits per component on machines that allow atomic
reference to bytes.  In the absence of the pragma Atomic_Components,
I would expect it to be shrunk down to 3 or 4 bits/component.

****************************************************************

From: Gary Dismukes
Sent: Thursday, March 30, 2006  3:05 PM

> Pragma Pack is a little different.  It says "pack as
> tightly as you can, subject to all the other requirements
> imposed on the type."  So you never need to reject a
> pragma Pack.  I could imagine that in the absence of
> a pragma Pack, some implementations might make the following
> array 32-bits/element:

But in the case of Annex C compliance you have to follow the
recommended level of support, which requires tight packing
of things like Boolean arrays as I understand it.  There's
nothing about "subject to other requirements", so it seems
that one of the pragmas would have to be rejected.

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006  3:31 PM

> But in the case of Annex C compliance you have to follow the
> recommended level of support, which requires tight packing
> of things like Boolean arrays as I understand it.  There's
> nothing about "subject to other requirements", so it seems
> that one of the pragmas would have to be rejected.

Good point.  But an existing AARM note implies there is
some interplay between a component being aliased and
the "size of the component subtype":

    Ramification: If a component subtype is aliased, its Size will
    generally be a multiple of Storage_Unit, so it probably won't
    get packed very tightly.

This AARM ramification seems totally unjustified, unless we
presumed that there was some kind of implicit "widening"
that was occuring on the Size of a component subtype if
necessary to satisfy other requirements, such as "aliased,"
"atomic," etc.  But that really doesn't fit with the model,
since the *subtype* is not aliased, nor is the component
*subtype* atomic in the case of an Atomic_Components pragma.

So I think we will definitely need to change the words here
if that is what we want, namely the "tight" packing is not
required if the components are aliased, by-reference, or
atomic.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 30, 2006  3:37 PM

> But in the case of Annex C compliance you have to follow the
> recommended level of support, which requires tight packing
> of things like Boolean arrays as I understand it.  There's
> nothing about "subject to other requirements", so it seems
> that one of the pragmas would have to be rejected.

As much as I hate to, I agree with Gary. Indeed, I don't see anything about
"subject to other requirements" anywhere in 13.2. Here's what the definition
of Pack is (this has nothing to do with recommended level of support):

"If a type is packed, then the implementation should try to minimize storage
allocated to objects of the type, possibly at the expense of speed of
accessing components, subject to reasonable complexity in addressing
calculations."

I don't see that "reasonable complexity" has anything whatsoever to do with
"other requirements". And then the Recommended Level of Support pretty much
defines what "reasonable complexity" means (by allowing rounding up to avoid
crossing boundaries).

So I agree that one of the pragmas has to be rejected. (I don't think that
any language change is needed to make that a requirement, either, although
it would make sense to clarify this so there is no doubt.) A warning (as
GNAT gives) is wrong for a compiler following Annex C, and unfriendly
otherwise. Silently doing nothing...I better not go there. :-)

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 30, 2006  3:50 PM

> So I think we will definitely need to change the words here
> if that is what we want, namely the "tight" packing is not
> required if the components are aliased, by-reference, or
> atomic.

The note was unjustified in Ada 95, but in Ada 2005, we added a blanket
permission to reject rep. clauses for components of by-reference and aliased
types unless they are confirming. See 13.1(26/2). Remember that pragma Pack
is never confirming, so this is the same as saying that it can be rejected
(but not required to be rejected) for any aliased or by-reference type.

There is even an AARM note (carried over from Ada 95) which notes that
Atomic_Components has similar restrictions. But it doesn't look like we ever
considered the interaction of Atomic_Components and other rep. clauses.
Perhaps it should be included in 13.1(26/2)? (That is, it shouldn't be
required to support any non-confirming rep. clauses on such a type, but of
course you can if you want.)

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006  4:00 PM

> As much as I hate to, I agree with Gary. Indeed, I don't see anything about
> "subject to other requirements" anywhere in 13.2....

The new paragraph 13.2(6.1) says:

    If a packed type has a component that is not of a by-reference
    type and has no aliased part, then such a component need not
    be aligned according to the Alignment of its subtype; in
    particular it need not be allocated on a storage element boundary.

This is the part that implies that packing is "subject to
other requirements."  If we changed "aliased" to "aliased or atomic"
in the above, I think it would accomplish roughly what I was
suggesting.  I think you will agree that the above paragraph,
combined with 13.3(26.3):

     For an object X of subtype S, if S'Alignment is not zero, then
     X'Alignment is a nonzero integral multiple of S'Alignment unless
     specified otherwise by a representation item.

implies that in:

     type Aliased_Bit_Vector is
       array (Positive range <>) of aliased Boolean;
     pragma Pack(Boolean);

the components should be aligned on Boolean'Alignment boundaries.
I would think the same thing should apply if Atomic_Components
is applied to a boolean array.

I admit that these paragraphs seem to contradict the recommended
level of support, but I think the bug is there, not in the above
two paragraphs.

 > ...
> So I agree that one of the pragmas has to be rejected. (I don't think that
> any language change is needed to make that a requirement, either, although
> it would make sense to clarify this so there is no doubt.) A warning (as
> GNAT gives) is wrong for a compiler following Annex C, and unfriendly
> otherwise. Silently doing nothing...I better not go there. :-)

I suppose it depends on your interpretation of "Pack."  I have
always taken it as "do as well as you can."  If you really have
a specific size you need, then specify that with Component_Size,
or be sure that there is nothing inhibiting the packing, such
as aliased, by-reference, or atomic components.

I agree it is friendly to inform the user if the pack has *no*
effect, but I wouldn't want to disallow pragma Pack completely
in the above example, because array of Boolean might use
32-bits/component in its absence, if byte-at-a-time access is
significantly slower than word-at-a-time access on the given
hardware.

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  4:32 PM

> I suppose it depends on your interpretation of "Pack."  I have
> always taken it as "do as well as you can."  If you really have
> a specific size you need, then specify that with Component_Size,
> or be sure that there is nothing inhibiting the packing, such
> as aliased, by-reference, or atomic components.

Well you can interpret it that way if you like, but it is not
the definition in the language, which says that for arrays with
1,2,4 bit components, pragma Pack works as expected!

> I agree it is friendly to inform the user if the pack has *no*
> effect, but I wouldn't want to disallow pragma Pack completely
> in the above example, because array of Boolean might use
> 32-bits/component in its absence, if byte-at-a-time access is
> significantly slower than word-at-a-time access on the given
> hardware.

I think that is wrong in this case, since pragma Pack for Boolean
has precise well defined semantics, and must make the component
size 1, it does not mean, do-as-well-as-you-can.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 30, 2006  5:18 PM

>      type Aliased_Bit_Vector is
>        array (Positive range <>) of aliased Boolean;
>      pragma Pack(Boolean);
>
> the components should be aligned on Boolean'Alignment boundaries.
> I would think the same thing should apply if Atomic_Components
> is applied to a boolean array.

Well, in your example, the pragma should be rejected because the type isn't
local. But I presume you meant "pragma Pack(Aliased_Bit_Vector);".

I see your point, but all it says to me is that the new paragraph shouldn't
be conditional. The needed escape is provided by 13.1(26/2) anyway.
13.1(26/2) says that there is no requirement to even support pragma Pack for
such a type.

> I admit that these paragraphs seem to contradict the recommended
> level of support, but I think the bug is there, not in the above
> two paragraphs.

And I disagree; I think the RLS is correct and the above should simply read:

     The component of a packed type need not be aligned according to the
     Alignment of its subtype; in particular it need not be allocated on
     a storage element boundary.

This doesn't require misalignment, it just allows it. The RLS requires it in
some cases, but in those cases there is no requirement to support pragma
Pack.

...
> I suppose it depends on your interpretation of "Pack."  I have
> always taken it as "do as well as you can."  If you really have
> a specific size you need, then specify that with Component_Size,
> or be sure that there is nothing inhibiting the packing, such
> as aliased, by-reference, or atomic components.

Pack is defined to "minimize storage, within reason". No exceptions for
goofy component types; for those you can't minimize storage.

> I agree it is friendly to inform the user if the pack has *no*
> effect, but I wouldn't want to disallow pragma Pack completely
> in the above example, because array of Boolean might use
> 32-bits/component in its absence, if byte-at-a-time access is
> significantly slower than word-at-a-time access on the given
> hardware.

Such hardware is possible, I suppose, but it seems unlikely since it would
perform poorly on C code and thus on standard benchmarks. Moreover, there is
more to overall performance than just the byte access time; all of the
wasted space would cause extra cache pressure and usually would cause the
overall run time to be longer.

After all, the default representation should be best for "typical"
conditions. If your use of a particular type is atypical (you need storage
minimization or performance maximization), then you need to declare the type
appropriately. For storage minimization, that's pragma Pack. For time
maximization, you have to noodle with 'Alignment and/or 'Component_Size,
which is difficult; it would be useful if Ada had a pragma Fastest (...)
that worked like Pack in reverse (sort of like Pascal unpack) -- space be
damned, give me the fastest possible access to these components.

So, I don't see any value to pragma Pack in your example; if anything, it is
misleading because it does nothing. One of our goals with this amendment,
after all, was to reduce the effects of adding or removing "aliased". I
don't think that adding or removing "aliased" should change representation
if there are rep. clauses (although it might make the rep. clauses
illegal) -- otherwise, a simple maintenance change can introduce
hard-to-find bugs.

Specifically, you're saying that changing:

    type Bit_Vector is array (Positive range <>) of Boolean;
    pragma Pack(Bit_Vector);

to

    type Bit_Vector is array (Positive range <>) of aliased Boolean;
    pragma Pack(Bit_Vector);

will *silently* change the representation. Yuk. I'm pretty sure that we'll
never do that in our compiler...

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  5:31 PM

> will *silently* change the representation. Yuk. I'm pretty sure that we'll
> never do that in our compiler...

So how *will* your compiler handle these two cases?

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 30, 2006  5:47 PM

> So how *will* your compiler handle these two cases?

I presume you're asking about the Ada 2005 update, not the current practice
(without the new 13.1(26/2), we just give warnings that nothing will
happen).

Anyway, in Ada 2005, the first will be accepted, and the second rejected
(based on 13.1(26/2) - this is not confirming). The rejection of the second
one will make the maintenance programmer remove the pragma, and that will
make the change of representation crystal clear.

****************************************************************

From: Tucker Taft
Sent: Thursday, March 30, 2006  6:00 PM

> Anyway, in Ada 2005, the first will be accepted, and the second rejected
> (based on 13.1(26/2) - this is not confirming). The rejection of the second
> one will make the maintenance programmer remove the pragma, and that will
> make the change of representation crystal clear.

I'm convinced.  And I think pragma Atomic_Components ought
to work very much like adding "aliased".  So perhaps the only
real change is needed in 13.1(24/2):

     An implementation need not support a nonconfirming representation
     item if it could cause an aliased object or an object of a
     by-reference type to be allocated at a nonaddressable location or,
     when the alignment attribute of the subtype of such an object is
     nonzero, at an address that is not an integral multiple of that
     alignment.

We should probably change "aliased" above to "aliased or atomic."

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  6:15 PM

> We should probably change "aliased" above to "aliased or atomic."

or volatile, you don't want extra reads/writes there either.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 30, 2006  6:21 PM

> We should probably change "aliased" above to "aliased or atomic."

I think we'd want to make that change to 13.1(25/2) and 13.1(26/2), too. We
don't want to force compilers to handle 4-bit atomic record components,
either. (Those could be aligned correctly and still have a size that's too
small.)

****************************************************************

From: Robert I. Eachus
Sent: Thursday, March 30, 2006  7:27 PM

>> On a machine that could independently address bits, the two pragmas
>> could
>> well coexist, so there is some amount of implementation dependence here.
>
>
> There are almost no such machines!

I totally agree with the language part of this discussion, but many
hardware ISAs allow read-modify-write access.  If you can do an AND or
an OR as an RMW instruction, then ORing with 16#10# sets the fourth bit of the
byte, and ANDing with 16#EF# resets it.  (There are often advantages to
doing 32 or 64-bit wide operations instead of byte wide operations,
especially with modern CPUs, but that is a detail.) Is the RMW
instruction atomic?  The most interesting case is in the x86 case.  If
you have a single CPU (or today CPU core) the retirement rules make the
instructions atomic from the CPUs point of view.  (If an interrupt
occurs, either the write has completed, or the instruction will be
restarted.)  What if you have multiple CPUs, multiple cores, or are
interfacing with an I/O device?  Better mark the memory as UC
(uncacheable) and use the LOCK prefix on the AND or OR instruction, but
then it is guaranteed to work.

So I would say that the majority of  computers in use do support
bit-addressable atomic access support--as long as the component values
don't cross quad-word boundaries. (There are lots of other CISC CPU
designs where this works as well.  The first microprocessor I used it on
was the M68000, but I had used this trick on many mainframes before then.)

****************************************************************

From: Robert I. Eachus
Sent: Thursday, March 30, 2006  7:53 PM

> So I would say that the majority of  computers in use do support
> bit-addressable atomic access support--as long as the component values
> don't cross quad-word boundaries.

Whoops! I got a bit carried away.  In the x86 ISA you can only do atomic
loads and stores of a set of all one bits or all zero bits.  Some other
ISAs do allow arbitrary bit patterns to be substituted.  You can always
use a locked XOR iff each entry in an array is 'owned' by a different
thread.

So the changes being discussed are needed for the non-boolean cases.
However, I would hope that at least the AARM should explain the special
nature of atomic bit arrays.

****************************************************************

From: Bibb Latting
Sent: Thursday, March 30, 2006 11:46 PM

> So I would say that the majority of  computers in use do support
> bit-addressable atomic access support--as long as the component values
> don't cross quad-word boundaries. (There are lots of other CISC CPU
> designs where this works as well.  The first microprocessor I used it on
> was the M68000, but I had used this trick on many mainframes before then.)

This is a molecular operation, not an atomic operation for:

   type packed_bits is array (1..N) of boolean;
   pragma pack (packed_bits);
   pragma atomic_components (packed_bits);

    1) RMW assumes that the contents on read are the same as write.  When
       dealing with I/O interfaces, this is not always true.

    2)  Without a data source for the other bits, the operation is not
        atomic.

> Probably would want to change "indivisible" to
> "indivisible and independent" in both of the above paragraphs.

I think this change is worth considering.

****************************************************************

From: Jean-Pierre Rosen
Sent: Friday, March 31, 2006  2:07 AM

Just to spread a little more oil on the fire...
What happens here?

type Tab is array (positive range <>) of boolean;
pragma pack (Tab);

X : Tab (1 ..32);
pragma Atomic_Components (X);

i.e. when a *type* is packed, but an individual *variable* has atomic
components?

****************************************************************

From: Robert Dewar
Sent: Thursday, March 30, 2006  5:05 AM

An error message I trust:

> The array_local_name in an Atomic_Components or
> Volatile_Components pragma shall resolve to denote the declaration of an
> array type or an array object of an anonymous type.

Tab don't look anonymous to me :-)

****************************************************************

From: Robert I. Eachus
Sent: Friday, March 31, 2006 11:27 AM

> This is a molecular operation, not an atomic operation for:
>
>   type packed_bits is array (1..N) of boolean;
>   pragma pack (packed_bits);
>   pragma atomic_components (packed_bits);
>
>    1) RMW assumes that the contents on read are the same as write.  When
> dealing with I/O interfaces, this is not always true.

No, you have to follow the prescription exactly.  And although it is
possible that some chipsets get this wrong, the ISA specifies what is
done exactly because it is used in interfacing between multiple CPUs and
CPUs and I/O devices.  Oh, and it is about 50 times faster on a Hammer
(AMD Athlon64, Turion, or Opteron) CPU because all memory access goes
through CPU caches.  So if the memory is local to the CPU, it just has
to do the RMW in cache, and any other writes to the location can't
interrupt.  Technically the cache line containing the array is Owned by
the thread that executes the locked RMW instruction.  This means that
the data migrates to the local cache, and the CPU connected to the
memory has a Shared copy in cache.  (Reads are not an issue, they either
see the previous  state of the array, or the final state.)

To repeat, on x86, you must use an AND or OR instruction where the first
argument is the bit array you want treated as atomic.  (The second
argument--the mask--can be a register or an immediate constant.) You
must use the LOCK prefix byte, and the page containing the array must be
marked as uncacheable.  (Yes, Hammer chips cache them anyway, but
enforce the atomicity rules.  In fact they go a bit further, and don't
even allow other reads during the few CPU clocks the cycle takes.  If
you read a Shared cache line, the read causes a cache snoop that can
invalidate the read, and cause the instruction to be retried.)

>    2)  Without a data source for the other bits, the operation is not
> atomic.

Did you miss the fact that you have to use an AND or OR instruction with
a memory address as the first argument to
use the LOCK prefix?   This insures that the read and write are seen as
atomic by the CPU.  Marking the memory as uncacheable is necessary if
there are other CPUs and/or I/O devices involved.  This ensures that the
memory line is locked with Intel CPUs and must be locally Owned by AMD CPUs.

If you really think this doesn't work, look at some driver code.  I've
avoided giving example programs, because I'd also need to supply
hardware to test the code.

****************************************************************

From: Bibb Latting
Sent: Friday, March 31, 2006  4:44 PM

> If you really think this doesn't work, look at some driver code.  I've
> avoided giving example programs, because I'd also need to supply hardware
> to test the code.

I *really* think that this doesn't *always* work.  I understand the
mechanization of memory access that you describe: indeed today there are
usually adequate means to obtain exclusive access to a memory element, which
when combined with suitable cache management allows implementation of
volatile/atomic accesses.

However, the underlying assumption is that the address  referenced returns
the last value written.  I'm saying that this isn't always true for memory
mapped I/O.  An example I encountered was the SCC2692 a number of years ago.
It was a really *cheap* chip with 16 bytes of address space.  The problem is
that the chip doesn't have enough address space to provide both read-back of
control registers and adequate status.  To work around the problem, the
Read/Write line was multiplexed: when you write to the chip you're accessing
one register; when you read, you're accessing a different register.  So,
there are two objects, one for write and another for read, at the *same
address*.  In terms of C.6, I'm treating (perhaps incorrectly) every
addressable element as a variable, which becomes "shared" by application of
volatile/atomic.

****************************************************************

From: Robert I. Eachus
Sent: Friday, March 31, 2006  7:59 PM

Ah!  I guess I mixed you up by going from the general to the specific
case.  The Intel 8086, 8088, and 80186, were not designed to support
(demand paged) virtual memory, although it could be done. The Intel
80286 was designed to do so, but to call the support a kludge is an
insult to most kludges.  Since the 80386, and in chip years that is a
long time ago, the mechanism I described has been supported as part of
the ISA.  Right now the AMD and Intel implementations are very
different, but the same code will work on all PC compatible CPUs.

There may be non-x86 compatible hardware out there that is not capable
of correctly doing the (single) bit flipping.  But I think that from a
language design point of view, we should realize that most CPUs out
there will support the packed array of Boolean special case.  I would
rather have the RM require it for Real-Time Annex support, and allow
compilers for non-conforming hardware to document that. For example,
there is an errata for the Itanium2 IA-32 execution layer (#14 on page
67 of  http://download.intel.com/design/Itanium2/specupdt/25114140.pdf)
But that just means you shouldn't try to run real-time code in IA-32
emulation mode on an Itanium2 CPU.  ;-)

Incidentally, notice that there is a lot of magic that goes on in operating
systems that may prevent a program from doing this bit-twiddling.
That's fine.  If a program that uses the Real-Time Annex needs special
permissions, document them and move on.  I personally think that there
is no reason for an OS not to satisfy a user request for an uncacheable
(UC) page.  It is necessary for real-time code, and harmless otherwise.
Especially on the AMD Hammer CPUs, there is no reason to restrict user
access to UC pages and/or the LOCK prefix.  The actual locking lasts a
few nanoseconds. (The memory location will be read, ownership, if
necessary transferred to the correct CPU and process.  Then the locked
RMW cycle takes place in the L1 data cache. Unlocked writes to the bit
array can occur during the change of ownership, but the copy used in the
RMW cycle is the latest version.)

****************************************************************

From: Randy Brukardt
Sent: Friday, March 31, 2006  8:36 PM

> Ah!  I guess I mixed you up by going from the general to the specific
> case.

No, you missed his point altogether. It doesn't have anything to do with
the CPU!

The point is that memory-mapped hardware often doesn't act like memory at
all; in particular a location may not be readable or writable or (worse)
may return something different when read after writing.

You can't make bit-mapped atomic writing work at all in such circumstances,
no matter what CPU locking is provided. You are suggesting using

    Lock
    Or [Mem],16#10#

to set just the fifth bit atomically, but this cannot work on memory-mapped
hardware that doesn't allow reading! You'll set the other bits to whatever
random junk, not the correct values.

Now, the question is what this has to do with the language. You seem to want
to insist that compilers support this. But compiler vendors have no control
over what hardware their customers build/use. If your rule was adopted,
about all vendors could do is put "don't use Atomic_Components with
memory-mapped hardware that can only be written" in their manual.

But this is nasty; Atomic and Atomic_Components exist in large part because
of memory-mapped hardware, and here you're trying to tell people to not use
one of them exactly when they are most likely to do so. That doesn't seem to
be a good policy.

It seems better to me to require users to read/write full storage units in
this case, using an appropriate record or array type. There's much less risk
of problems in that case. Funny hardware seems to be quite prevalent
(remember that we had a long discussion on whether an atomic read/write
could read two bytes instead of one word), we have to recognize that.

****************************************************************

From: Robert Dewar
Sent: Saturday, April  1, 2006  3:05 AM

> There may be non-x86 compatible hardware out there that is not capable
> of correctly doing the (single) bit flipping.  But I think that from a
> language design point of view, we should realize that most CPUs out
> there will support the packed array of Boolean special case.

I must say I am puzzled, what code do you have in mind for
supporting

    type x is array (1 .. 8) of Boolean;
    pragma Pack (x);
    pragma Atomic_Components (x);
    ...
    ...
    x (j) := k;

this seems really messy to me

****************************************************************

From: Robert A. Duff
Sent: Saturday, April  1, 2006  8:41 AM

> I *really* think that this doesn't *always* work.  I understand the
> mechanization of memory access that you describe: indeed today there are
> usually adequate means to obtain exclusive access to a memory element, which
> when combined with suitable cache management allows implementation of
> volatile/atomic accesses.

That makes sense.  I never thought packed bitfields could be atomic.

But I'm confused.

Atomic implies volatile, by C.6(8), "...In addition, every atomic type or
object is also defined to be volatile."  Then C.6(20) says:

  20    {external effect (volatile/atomic objects) [partial]} The external
  effect of a program (see 1.1.3) is defined to include each read and update of
  a volatile or atomic object. The implementation shall not generate any memory
  reads or updates of atomic or volatile objects other than those specified by
  the program.

(where "volatile or atomic" means "volatile [or atomic]").

Packed bitfields CAN be volatile.  But if we want to write upon a packed
bitfield, we must read a whole word first, on most hardware (whether by an
explicit load into a register, or an implicit read like the "LOCK OR"
instruction Robert Eachus mentioned).  Right?  So how can one implement
volatile bitfields in the way required by C.6(20)?

The C.6(22/2) says:

                            Implementation Advice

  22/2  {AI95-00259-01} A load or store of a volatile object whose size is a
  multiple of System.Storage_Unit and whose alignment is nonzero, should be
  implemented by accessing exactly the bits of the object and no others.

(where "volatile" means "volatile [or atomic]", this time ;-)).

Is this not implied by C.6(20)?  Obviously, I misunderstand what C.6(20) is
(intended to) mean

****************************************************************

From: Robert Dewar
Sent: Saturday, April  1, 2006  8:49 AM

> Packed bitfields CAN be volatile.  But if we want to write upon a packed
> bitfield, we must read a whole word first, on most hardware (whether by an
> explicit load into a register, or an implicit read like the "LOCK OR"
> instruction Robert Eachus mentioned).  Right?  So how can one implement
> volatile bitfields in the way required by C.6(20)?

If C.6(20) requires volatile bit-fields, it is just junk. Implementors
don't pay attention to junk :-)

****************************************************************

From: Tucker Taft
Sent: Saturday, April  1, 2006  9:38 AM

My interpretation of C.6(20) would be:

   If the program includes an update to a bit field, and
   that requires a read/modify/write sequence on the given
   hardware, then that is not a violation of the requirement
   that:

       The implementation shall not generate any memory
       reads or updates of atomic or volatile objects other
       than those specified by the program.

   The read/modify/write sequence has been "specified" by the
   program.  If the bit fields were atomic, then that would
   require that the read/modify/write sequence be "indivisible."

To take advantage of C.6(20) to deal with "active" memory
locations, I think the programmer has to know whether the
hardware requires a read/modify/write sequence for the given
size of object.  If so, then they better be sure that that
sequence works for their memory-mapped device.  It is not
clear how you can say enough in the reference manual to make
all of this portable.  Hardware differs enough that this
will require some issues that can't realistically be addressed
without hardware-specific documentation.

****************************************************************

From: Robert A. Duff
Sent: Saturday, April  1, 2006  9:43 AM

> If C.6(20) requires volatile bit-fields, it is just junk. Implementors
> don't pay attention to junk :-)

Well, it's apparently the intent, given this AARM annotation:

    22.b/2 Reason: Since any object can be a volatile object, including packed
          array components and bit-mapped record components, we require the
          above only when it is reasonable to assume that the machine can
          avoid accessing bits outside of the object.

I also just noticed:

21    If a pragma Pack applies to a type any of whose subcomponents are
atomic, the implementation shall not pack the atomic subcomponents more
tightly than that for which it can support indivisible reads and updates.

which seems to answer the original question.  (Sorry if somebody already
pointed this out, and I missed it.)  Note that (21) is for atomic, not
volatile.

****************************************************************

From: Robert Dewar
Sent: Saturday, April  1, 2006  10:12 AM

>   The read/modify/write sequence has been "specified" by the
>   program.  If the bit fields were atomic, then that would
>   require that the read/modify/write sequence be "indivisible."

I really think that's strange, to me if you have a volatile
variable, then reads should be reads and writes should be
writes.

****************************************************************

From: Robert Dewar
Sent: Saturday, April  1, 2006  10:13 AM

>> If C.6(20) requires volatile bit-fields, it is just junk. Implementors
>> don't pay attention to junk :-)
>
> Well, it's apparently the intent, given this AARM annotation:
>
>     22.b/2 Reason: Since any object can be a volatile object, including packed
>           array components and bit-mapped record components, we require the
>           above only when it is reasonable to assume that the machine can
>           avoid accessing bits outside of the object.

How does this compare with the C rules for interest. It seems obvious
to me that volatile in Ada should mean the same as volatile in C.

****************************************************************

From: Robert A. Duff
Sent: Saturday, April  1, 2006 10:55 AM

I agree.  C doesn't have packed arrays, but it does have arrays of bytes
(char), which might require a read to write deep down in the hardware.
It has bitfields in structs.  I'm not sure what the rules are for "volatile",
but I have heard people claim that whatever they are, even the C language
lawyers can't understand them and/or don't agree on what they mean, neither
formally nor informally.  ;-)

****************************************************************

From: Tucker Taft
Sent: Saturday, April  1, 2006 11:57 AM

Here's what the GNU C reference manual says about volatile:

     The volatile qualifier tells the compiler to not optimize
     use of the variable by storing its value in a cache, but
     rather to fetch its value afresh each time it is used.
     Depending on the application, volatile variables may be
     modified autonomously by external hardware devices.

So they are focusing on requiring that no caching is performed.
They make no mention of reading or writing *more* than is specified
by the program.  They want to be sure you don't read any *less*
than specified.

As far as atomic, some versions of C have sig_atomic_t, which is
an integer type that is atomic with respect to asynchronous
interrupts (i.e. signals).  As far as I know, there is no
such thing as an atomic bit field in C.

****************************************************************

From: Robert A. Duff
Sent: Saturday, April  1, 2006  3:34 PM

Thanks for looking that up.  Interesting.

Of course "GNU C" is not "the C standard".  And of course, there are different
versions of the C standard that might be relevant.  I'm too lazy to look it
up, and anyway, I suppose I'd have to fork over hundreds of dollars to ISO to
do so?

I agree with your earlier comment, that given all the myriad hardware out
there, we cannot hope to nail down every detail in the definitions of atomic
and volatile.

****************************************************************

From: Robert Dewar
Sent: Saturday, April  1, 2006  4:25 PM

I disagree, we can have a clear semantic model (especially critical for
atomic), and if hardware cannot accommodate this model, then the pragma
must be rejected. So I think that is far too pessimistic.

****************************************************************

From: Randy Brukardt
Sent: Saturday, April  1, 2006  6:01 PM

That's certainly true for Atomic. But Volatile must always be accepted (there is
no rule that it can be rejected based on the characteristics of the type), and
the model is that compilers do their best to implement it, whatever that is.

We added Implementation Advice (in Ada 2005) to avoid reading/writing extra bits,
so any cases where that happens has to be documented. That should be enough
encouragement to avoid it when possible. But we still want to allow any object
to be volatile. (This was all discussed extensively with AI-259.) Indeed, this is
the only significant difference between Atomic and Volatile -- otherwise there
wouldn't be a need for both.

****************************************************************

From: Robert Dewar
Sent: Saturday, April  1, 2006  7:32 PM

> That's certainly true for Atomic. But Volatile must always be accepted (there is
> no rule that it can be rejected based on the characteristics of the type),

Well, then that's an obvious mistake, and sure there is such a rule:
you don't have to do anything if it's not practical to do so.

> and the model is that compilers do their best to implement it, whatever that is.

I find that model absurd.

> We added Implementation Advice (in Ada 2005) to avoid reading/writing extra
> bits,

Well, of course; to me this should be a fundamental requirement of volatile.

> so any cases where that happens has to be documented. That should be enough
> encouragement to avoid it when possible. But we still want to allow any object
> to be volatile. (This was all discussed extensively with AI-259.) Indeed, this is
> the only significant difference between Atomic and Volatile -- otherwise there
> wouldn't be a need for both.

I cannot believe you just said that!! Of course there is a need for
both; they serve totally different functions.

The point of atomic is that the read or write can be done in a single
instruction. *That's* what distinguishes volatile from atomic. This
allows various synchronization algorithms based on shared variables.
See Norm Shulman's PhD thesis for a very thorough treatment of
this subject.

So for example, an array of ten integers can be volatile, but it
takes ten reads to read it, so it cannot be atomic.

Or for a concrete example, if you have a bounded buffer with a
reader and a writer not explicitly synchronized, then the buffer
must be volatile, otherwise the algorithm obviously fails, but
the head and tail pointers must be atomic (otherwise the
algorithm fails because of race conditions). These two needs
are quite different.

The idea that a single bit in a bit array that has to be assigned
with a read/mask/store sequence can be called volatile seems
completely silly to me.

Fortunately, as far as I can tell, this nonsense language lawyering
has zero effect on an implementation.
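
[Editor's note: a minimal Ada sketch of the bounded-buffer shape described
above; all names and sizes are invented for illustration.

   type Buffer_Index is mod 16;
   type Buffer_Array is array (Buffer_Index) of Integer;

   Buffer : Buffer_Array;
   pragma Volatile (Buffer);    -- every element read/write goes to memory

   Head, Tail : Buffer_Index := 0;
   pragma Atomic (Head);        -- each index is read and written in a
   pragma Atomic (Tail);        -- single indivisible access

The buffer needs Volatile so that the unsynchronized reader sees the data the
writer stored; Head and Tail need Atomic so that the indices are never seen
half-updated, which is exactly the distinction drawn above.]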

****************************************************************

From: Robert I. Eachus
Sent: Sunday, April  2, 2006  3:32 AM

> The point of atomic is that the read or write can be done in a single
> instruction. *That's* what distinguishes volatile from atomic. This
> allows various synchronization algorithms based on shared variables.
> See Norm Shulman's PhD thesis for a very thorough treatment of
> this subject.

I seem to have missed a day of Sturm und Drang.  But Robert Dewar put
his finger on the semantic disconnects.  With modern hardware, bit
vectors can be *atomic*--updated with a single, uninterruptible CPU
instruction that is also atomic from the point of view of the memory
system.  Note that the cache manipulations that go on to cause this to
occur may be complex, but from our language lawyer point of view, all
that matters is the result.  On modern hardware, a read may result in
256 bytes being loaded into cache.  Not an issue for atomic, as long as
changes to the object are atomic from the point of view of the
programmer.  That means that the meaning of atomic may be different in
a compiler that supports Annex D.  Of course, now that dual-core CPUs
are becoming more common, all compilers may have to ensure that atomic
works in the presence of multiple CPUs or CPU cores.  (And I/O devices
as well.)

I may have started the confusion by saying that to get atomic behavior
in any x86 multiple-core environment, you have to ensure that the bit
array is stored in UC (uncacheable) memory.  But in this case, that has
nothing to do with volatile--and on the AMD Hammer processors nothing to
do with whether or not the bit array can be cached!  It is just that the
ISA only requires uninterruptible semantics for UC memory.  Or to turn
that around, not all memory need support atomic updates, but memory must
be marked UC for the LOCK prefix to have the expected semantics.  (Well,
there are circumstances where the OS will handle the exception and
provide the expected semantics, but that is more likely to involve server
virtualization than memory that is actually unlockable.)

> So for example, an array of ten integers can be volatile, but it
> takes ten reads to read it, so it cannot be atomic.
>
> Or for a concrete example, if you have a bounded buffer with a
> reader and a writer not explicitly synchronized, then the buffer
> must be volatile, otherwise the algorithm obviously fails, but
> the head and tail pointers must be atomic (otherwise the
> algorithm fails because of race conditions). These two needs
> are quite different.

I hope everyone now understands atomic, because this example shows how
complex volatile has become!  There is the type of hardware volatile
memory that Bibb Latting was talking about.  However, modern hardware
doesn't do single-bit reads and writes.  Hardware switches and status
bits are collected into registers.  A particular register may have bits
that are not writable, and when you write a (32-bit?) word to that
location, only the settable bits are changed.  Where these registers are
internal to the CPU, they usually require special instructions to read
or write them.  With I/O devices, the registers will be addressable as
memory, but again the semantics of reading from and/or writing to those
locations is going to be hardware specific.  In a perfect world, these
operations will all be provided as well-documented code-inserts or
intrinsic functions.
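
[Editor's note: for reference, the usual Ada shape for the memory-mapped
device registers described here is a Volatile object placed with an Address
clause; the package, register name, and address below are all invented.

   with System.Storage_Elements;
   with Interfaces;

   package Device is
      Control_Register : Interfaces.Unsigned_32;
      pragma Volatile (Control_Register);
      for Control_Register'Address use
        System.Storage_Elements.To_Address (16#1000_0010#);
   end Device;

Volatile forces each read and write of Control_Register to actually reach the
device; which bits are readable, writable, or self-clearing remains hardware
specific, as noted above.]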

In the case above, volatile has a much different--but also
necessary--meaning.  Whether or not the data is cached is not
important--well, it is important if you need speed.  What is important is
that all CPU cores (and Ada tasks) and hardware processes see the same
data.  At this point I really need to talk about cache coherency
strategies.  AMD uses MOESI (Modified, Owned, Exclusive, Shared,
Invalid), while Intel uses MESI (skip the Owned state).  What Robert
Dewar's example above needs (in the MESI case) is that the bounded
buffer *and* the head and tail pointers must be marked as Shared, or as
Modified in one cache and Invalid in all others.  The MOESI protocol
allows one copy to be marked as Owned, and the others to be either
Shared or Invalid.

In the AMD MOESI implementation, updating the owner's copy causes any
other copies to first be marked Invalid before the write to the Owned
copy completes, then the new value will be broadcast to the other chips
and cores.  Those that have a (now Invalid) cached copy will update it
and mark it again Shared.  What if you want to write to a Shared copy?
You must first take Ownership. MESI is faster if the next CPU to update
the Shared data is random, MOESI Owner state is much, much faster if
most updates are localized.  (In other words, the CPU (core) that last
updated the object is most likely to be the next updater.)

Maybe we need to resurrect pragma Shared for this case, and use Volatile
to imply the hardware case.  Notice that with modern hardware, if all
you need is the Shared cache state, then you will often get much better
performance if you write the code that way.  (Using Volatile where
Shared is appropriate will generate correct but pessimistic code.)  This
is a case where the hardware is evolving and we need the language to
evolve to match.  Right now, you need an AMD Hammer CPU to get major
speedups, but Intel's Conroe will have a shared L2 cache between cores,
and each core will be able to access data in the L1 data cache of the
other core.  In fact, it may be worthwhile to create real code for
Robert Dewar's example, and time it in various hardware configurations.
The difference can be a factor of thirty or more.

And by the way, since modern CPUs manage data in cache lines, it is
worth knowing the sizes of those lines.  Intel uses 256 byte lines in
their L2 and L3 caches, but some Intel CPUs have 64-byte L1 data cache
lines.  AMD uses 64 byte cache lines throughout.  However, in practice
there is little if any difference.  AMD's CPUs typically request two
cache lines (128 bytes) and only terminate the request after the first
line if there is another pending request.  Intel requests 256 bytes, but
will stop after 128 bytes if there is a pending request.  (Intel's L2
cache lines can store a half-line, with the other half empty.)

Both AMD and Intel support 'uncached' reads and writes intended to avoid
cache pollution.  But the smallest guaranteed read or write amount is
128 bits (16 bytes).  So any x86 compiler that allows pragma Volatile for
in-memory objects smaller than 16 bytes is probably living in a state of
sin. ;-)

****************************************************************

From: Jean-Pierre Rosen
Sent: Sunday, April  2, 2006  4:25 PM

> I also just noticed:
>
> 21    If a pragma Pack applies to a type any of whose subcomponents are
> atomic, the implementation shall not pack the atomic subcomponents more
> tightly than that for which it can support indivisible reads and updates.
>
> which seems to answer the original question.
Not really.
The question was about independent addressability. You can have
indivisible updates without independent addressability.

****************************************************************

From: Tucker Taft
Sent: Monday, May 21, 2007  8:11 AM

You must not use "must" in an ISO standard.
You shall use "shall" instead... ;-)

(Although you didn't violate this one,
you may not use "may not" either.  You
shall use "shall not" or you might use
"might not" instead.)

> ...
> !wording
>
> 13.2 (6.1/2) is renumbered 13.2 (7.1/3) and reads:
>
> For a packed type that has a component that is of a by-reference type,
> aliased, volatile or atomic, the component must be aligned according to

Please fully "comma-ize" lists of more than two elements.  Hence,
"... volatile, or atomic, ..."

> the alignment of its subtype; in particular it must be aligned on a
> storage element boundary.

Why does this last part follow?  Can't a subtype have an alignment
of zero?

>
> 13.2 (9) append:
>
> If the array component must be aligned according to its subtype and the
> results of packing are not so aligned, pragma pack should be rejected.

This is worded somewhat ambiguously, here using "must" when probably
some other word would make more sense.


[Editor's note: These editorial changes were made in version /02 of the AI05;
this is version /01 of the AI12.]

****************************************************************

From: Bob Duff
Sent: Sunday, February  3, 2013  3:47 PM

Here's a new version of AI12-0001-1,
"Independence and Representation clauses for atomic objects". [This is version
/03 of the AI - Editor.] This completes my homework.

Meta comment:  The term "reject" as in "reject a compilation unit because it's
illegal" is Ada-83-speak.  But this term keeps creeping into wording in
AIs/RM/AARM.  I ask that people please try to remember to quit doing that.
Instead, say something like "so and so is illegal" or "an implementation may
make so-and-so illegal".

You know who you are, Steve.  ;-)

See AARM-1.1.3:

4     * Identify all programs or program units that contain errors whose
        detection is required by this International Standard;

4.a         Discussion: Note that we no longer use the term "rejection" of
            programs or program units. We require that programs or program
            units with errors or that exceed some capacity limit be
            "identified". The way in which errors or capacity problems are
            reported is not specified.

Here's the draft minutes, with some of my comments:

> AI12-0001-1/02 Independence and Representation Clauses for atomic
> objects (Other AI versions) Bob and Tuck argue that the Recommended
> Level of Support is wrong as it does not match the AARM Ramifications.
> [Editor's note: I didn't record what
> ramification(s) they referred to. I can't find any that clearly
> conflict with the Recommended Level of Support; the only one that
> might be read that way is 13.2(9.a), which says that an aliased
> component won't get packed very tightly because "its Size will
> generally be a multiple of Storage_Unit". But this statement appears
> to be circular, as the Size of a component is determined by the amount
> of packing applied, so essentially says that an aliased component
> won't get packed tightly because it won't get packed tightly. It would
> make some logical sense if it was meant to refer to the Size of the
> subtype of the component, but then it is just wrong because the Size
> of a subtype is not affected by aliasedness.]

Yes, that's the one.  Never mind the above circularity; what it's trying to say
is that if you have an aliased component of type Boolean, the Pack isn't illegal
-- it just doesn't pack that component tightly.  It might pack other components.

> Geert notes that Pack indicates that the components are not
> independent; that makes no sense with Atomic. We scurry to the
> Standard to see what it actually says.
> 9.10(1/3) discusses independence, and it says that specified
> independence wins over representation aspects, so there is no problem there.

Agreed.

> C.6(8.1/3) should include aliased in things that cause ``specified as
> independent''.

I don't think so.  "Aliased" has nothing to do with task safety. It just means
the thing can have access values pointing to it. Consider early versions of the
Alpha 21064.  An address points at an 8-bit byte, but you can't load and store
bytes; you have to load 64-bit words and do shifting and masking.  If you have
a packed array of bytes on that machine, you want it packed; you don't want
64-bits per byte.  If you want independence, you should specify Independent (or
Atomic, or...).

> Tucker thinks C.6(13.2/3) is misleading, as it seems to imply packing
> is not allowed when independence is specified.

Check.

> The Recommended Level of Support for Pack needs to be weakened to
> allow atomic, aliased, and so on to make the packing less than otherwise
> required.

Check.

> Bob will take this AI.
> Approve intent: 9-0-1.

[Followed by version /03 of the AI - Editor.]

****************************************************************

From: Randy Brukardt
Sent: Monday, February  4, 2013  2:37 PM

...
> > C.6(8.1/3) should include aliased in things that cause
> ``specified as
> > independent''.
>
> I don't think so.  "Aliased" has nothing to do with task safety.
> It just means the thing can have access values pointing to it.
> Consider early versions of the Alpha 21064.  An address points at an
> 8-bit byte, but you can't load and store bytes; you have to load
> 64-bit words and do shifting and masking.
> If you have a packed array of bytes on that machine, you want it
> packed; you don't want 64-bits per byte.  If you want independence,
> you should specify Independent (or Atomic, or...).

[I presume you are missing the word "aliased" in your "If you have a package
array of {aliased} bytes...", because otherwise the example has nothing to do
with the issue at hand.]

The problem with this is then there is no guarantee that designated objects are
independent. And worse, there is no language means to make such a guarantee. One
could add one by allowing Independent to apply to access types, and then
requiring 'Access to check that the aliased objects are independent, but that
seems like a lot of language mechanism to solve a problem of our own creation
(allowing pack and other rep clauses to kill independence of aliased components,
something that has never been true in Ada so far as I can tell), especially as I
doubt anyone is clamoring for this capability (who needs aliased bytes anyway,
much less packed arrays of them).

I didn't look in detail at the body of the AI, I just wanted to point out that this is clearly misguided.

****************************************************************

From: Robert Dewar
Sent: Monday, February  4, 2013  2:42 PM

> The problem with this is then there is no guarantee that designated
> objects are independent. And worse, there is no language means to make
> such a guarantee.

Don't worry too much, of COURSE all designated objects are independent in
practice, no matter what the language has to say about it.

****************************************************************

From: Bob Duff
Sent: Monday, February  4, 2013  3:17 PM

> ...
> > > C.6(8.1/3) should include aliased in things that cause ``specified as
> > > independent''.
> >
> > I don't think so.  "Aliased" has nothing to do with task safety.
> > It just means the thing can have access values pointing to it.
> > Consider early versions of the Alpha 21064.  An address points at an
> > 8-bit byte, but you can't load and store bytes; you have to load
> > 64-bit words and do shifting and masking.
> > If you have a packed array of bytes on that machine, you want it
> > packed; you don't want 64-bits per byte.  If you want independence,
> > you should specify Independent (or Atomic, or...).
>
> [I presume you are missing the word "aliased" in your "If you have a package
                                                                       ^^^^^^^
                                                 I can't spell "packed" either.

> array of {aliased} bytes...", because otherwise the example has
> nothing to do with the issue at hand.]

Right, my claim is that you want 'Component_Size = 8, whether or not the
components are aliased.

> The problem with this is then there is no guarantee that designated
> objects are independent. And worse, there is no language means to make
> such a guarantee.

What sort of designated objects do you mean?  Distinct heap objects are always
independently addressable (see 9.10).  The only way two (nonoverlapping) objects
can fail to be independently addressable is if they're both subcomponents of the
same object.  And you control independence of those using pragmas
Independent[_Components].

>...One could add one by allowing Independent to apply to access  types,
>and then requiring 'Access to check that the aliased objects are
>independent, but that seems like a lot of language mechanism to solve a
>problem of our own creation (allowing pack and other rep clauses to
>kill  independence of aliased components, something that has never been
>true in  Ada so far as I can tell), ...

My understanding is the opposite:  'aliased' never was intended to imply
independent addressability -- just plain old addressability.

If 'aliased' implies independent addressability, then why were pragmas
Independent[_Components] added?

What do others think?

>...especially as I doubt anyone is clamoring for  this capability (who
>needs aliased bytes anyway, much less packed arrays of  them).

I find "who needs aliased bytes anyway" to be a strange attitude.
Why shouldn't bytes be aliased?

And on the Alpha 21064 (admittedly obsolete), you'd want to pack such a thing so
you don't get Component_Size = 64 (unless you're sharing the array across
tasks).

> I didn't look in detail at the body of the AI, I just wanted to point
> out that this is clearly misguided.

I strongly disagree -- it's not clearly anything (guided nor misguided).  ;-)

I'm not sure I fully understand pragmas Independent[_Components].
Correct me if I'm wrong:  If you give these pragmas for a type that has a record
rep clause, a Component_Size clause, or a Convention, then the only possible
effect is to make the program illegal.  If you give these pragmas for a packed
type, the only possible effect is to reduce the amount of packing. For any other
type (which is 99.9% of all types), these pragmas have no effect.

Am I right?  If so, Independent[_Components] seems like a pretty marginally
useful feature.  I don't understand why it was added, and AI05-0009-1 does not
enlighten me.
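
[Editor's note: a sketch of the two cases Bob describes, assuming an Ada 2012
compiler on an ordinary byte-addressed target; the type names are invented.

   type Flags is array (0 .. 31) of Boolean;
   for Flags'Component_Size use 1;
   pragma Independent_Components (Flags);
   --  Expected to be illegal on such a target: one-bit components
   --  cannot be independently addressable.

   type Loose_Flags is array (0 .. 31) of Boolean;
   pragma Pack (Loose_Flags);
   pragma Independent_Components (Loose_Flags);
   --  Legal; the only effect is that the packing is reduced to whole
   --  storage elements.

Whether such a marginal effect justifies the pragmas is the question Bob is
raising.]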

****************************************************************

From: Steve Baird
Sent: Monday, February  4, 2013  3:37 PM

> My understanding is the opposite:  'aliased' never was intended to
> imply independent addressability -- just plain old addressability.
>
> If 'aliased' implies independent addressability, then why were pragmas
> Independent[_Components] added?

For unaliased components.

You have an array of 32 (unaliased) Booleans.
You also have 32 tasks and you want to allow each of them to manipulate one of
the array elements.

It is a bit odd that we never state that "aliased" implies "independent
addressability" for components.

I can imagine an array with aliased-but-not-independently-addressable
components (e.g., a bit packed array of aliased booleans for an implementation
which implements access-to-boolean types as bit pointers), but this seems pretty
contrived.
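
[Editor's note: a minimal sketch of the situation Steve describes; the names
are invented.

   type Worker_Flags is array (1 .. 32) of Boolean;
   pragma Independent_Components (Worker_Flags);

   Done : Worker_Flags := (others => False);

   task type Worker (Id : Integer);
   task body Worker is
   begin
      --  ... do the work for element Id ...
      Done (Id) := True;   -- touches only this element's storage
   end Worker;

As declared, the pragma confirms what the default representation already
provides; its value is that if Pack or another representation item is later
applied to Worker_Flags, the components either stay independently addressable
or the combination becomes illegal, rather than the independence being
silently lost.]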

****************************************************************

From: Randy Brukardt
Sent: Monday, February  4, 2013  3:45 PM

...
> > The problem with this is then there is no guarantee that designated
> > objects are independent. And worse, there is no language means to
> > make such a guarantee.
>
> What sort of designated objects do you mean?  Distinct heap objects
> are always independently addressable (see 9.10).  The only way two
> (nonoverlapping) objects can fail to be independently addressable is
> if they're both subcomponents of the same object.  And you control
> independence of those using pragmas Independent[_Components].

The designated object of a general access type, of course. The client of such a
type cannot know where the designated objects come from. And you're suggesting
to eliminate the guarantee that these designated objects are independent.

In your hypothetical array of bytes, some components are not going to be
independent. Thus, you can also have designated objects that are not
independent. That's something new; there is no possibility of that in current
Ada (especially if you believe 13.2(9.a)).

> >...One could add one by allowing Independent to apply to access
> >types, and then requiring 'Access to check that the aliased objects
> >are independent, but that seems like a lot of language mechanism to
> >solve a problem of our own creation (allowing pack and other rep
> >clauses to kill  independence of aliased components, something that
> >has never been true in  Ada so far as I can tell), ...
>
> My understanding is the opposite:  'aliased' never was intended to
> imply independent addressability -- just plain old addressability.

The two are intimately linked (except on some broken obsolete hardware - I
wouldn't have guessed that any such hardware could have existed -- indeed, I
can't quite figure out how that machine is supposed to have worked -- not that
relevant anyway).

> If 'aliased' implies independent addressability, then why were pragmas
> Independent[_Components] added?

Something can be independent without being addressable, and in any case,
"aliased" turns off a lot of optimizations, which isn't necessary if all you
need is independent.

> What do others think?
>
> >...especially as I doubt anyone is clamoring for  this capability
> >(who needs aliased bytes anyway, much less packed arrays of  them).
>
> I find "who needs aliased bytes anyway" to be a strange attitude.
> Why shouldn't bytes be aliased?

Why should the language guarantees be broken for something that no one needs?

> And on the Alpha 21064 (admittedly obsolete), you'd want to pack such
> a thing so you don't get Component_Size = 64 (unless you're sharing
> the array across tasks).

Read Robert's response to see that no implementation would ever take advantage
of this ability, so why even contemplate it?

> > I didn't look in detail at the body of the AI, I just wanted to
> > point out that this is clearly misguided.
>
> I strongly disagree -- it's not clearly anything (guided nor
> misguided).  ;-)

Heck, there is a lot misguided about this discussion. Precisely *who* is
misguided is a matter of opinion. :-)

> I'm not sure I fully understand pragmas Independent[_Components].
> Correct me if I'm wrong:  If you give these pragmas for a type that
> has a record rep clause, a Component_Size clause, or a Convention,
> then the only possible effect is to make the program illegal.  If you
> give these pragmas for a packed type, the only possible effect is to
> reduce the amount of packing.
> For any other type (which is 99.9% of all types), these pragmas have
> no effect.
>
> Am I right?  If so, Independent[_Components] seems like a pretty
> marginally useful feature.  I don't understand why it was added, and
> AI05-0009-1 does not enlighten me.

The whole point is that you can give this aspect to ensure that items are
independently addressable (when you are depending on that), without invoking the
other costs of Volatile or aliased. Yes, it's marginal in that it hardly ever is
going to have an effect (and hardly anyone will understand it enough to use it),
but it's part of the Ada pattern of declaring exactly what you need and no more.
In that sense, it is similar to declaring a range of -32768..32767 on a 16-bit
integer type -- this won't change anything but it makes it even more clear what
the expectations are.
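
[Editor's note: a small illustration of the analogy, with invented names and
assuming Ada 2012 aspect syntax.

   --  An explicit range, even where it changes nothing on a 16-bit machine:
   type Sensor_Value is range -32768 .. 32767;

   --  Explicit independence, even where the default representation already
   --  provides it:
   type Status_Word is record
      Ready : Boolean with Independent;
      Error : Boolean;
   end record;

Neither declaration need change the generated code; both record the
programmer's expectation so that a later representation change cannot
silently violate it.]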

****************************************************************

From: Randy Brukardt
Sent: Monday, February  4, 2013  3:47 PM

...
> It is a bit odd that we never state that "aliased" implies
> "independent addressability" for components.

Bob was supposed to add that statement into the AI he is working on (that's what
we decided in Boston), but he's resisting. For very marginal capabilities.

****************************************************************

From: Bob Duff
Sent: Monday, February  4, 2013  4:06 PM

> The designated object of a general access type, of course. The client
> of such a type cannot know where the designated objects come from. And
> you're suggesting to eliminate the guarantee that these designated
> objects are independent.

I'm not eliminating a guarantee -- such a guarantee doesn't exist, and I'm
trying to understand why we want to add it.

> In your hypothetical array of bytes, some components are not going to
> be independent. Thus, you can also have designated objects that are
> not independent. That's something new; there is no possibility of that
> in current Ada (especially if you believe 13.2(9.a)).

13.2(9.a) doesn't say anything about independent addressability.
(I assume in these discussions, "independent" is being used as an abbreviation
for the official RM term, "independent addressability", right?)

> Something can be independent without being addressable, and in any
> case,

So you're saying something can be independently addressable without being
addressable.  Yet another case where the RM reads like Alice in Wonderland.  ;-)

> "aliased" turns off a lot of optimizations, which isn't necessary if
> all you need is independent.

I suppose, but that seems pretty marginal.  Remember, we're only talking about
components of packed types.

> > I'm not sure I fully understand pragmas Independent[_Components].
> > Correct me if I'm wrong:  If you give these pragmas for a type that
> > has a record rep clause, a Component_Size clause, or a Convention,
> > then the only possible effect is to make the program illegal.  If
> > you give these pragmas for a packed type, the only possible effect
> > is to reduce the amount of packing.
> > For any other type (which is 99.9% of all types), these pragmas have
> > no effect.
> >
> > Am I right?  If so, Independent[_Components] seems like a pretty
> > marginally useful feature.  I don't understand why it was added, and
> > AI05-0009-1 does not enlighten me.

Please answer my question, "Am I right?".  Then we can discuss "The whole
point...".

> The whole point is that you can give this aspect to ensure that items
> are independently addressable (when you are depending on that),
> without invoking the other costs of Volatile or aliased. Yes, it's
> marginal in that it hardly ever is going to have an effect (and hardly
> anyone will understand it enough to use it), but it's part of the Ada
> pattern of declaring exactly what you need and no more. In that sense,
> it is similar to declaring a range of
> -32768..32767 on a 16-bit integer type -- this won't change anything
> but it makes it even more clear what the expectations are.

****************************************************************

From: Bob Duff
Sent: Monday, February  4, 2013  3:55 PM

> > My understanding is the opposite:  'aliased' never was intended to
> > imply independent addressability -- just plain old addressability.
> >
> > If 'aliased' implies independent addressability, then why were
> > pragmas Independent[_Components] added?
>
> For unaliased components.

You can always add "aliased".

> You have an array of 32 (unaliased) Booleans.
> You also have 32 tasks and you want to allow each of them to
> manipulate one of the array elements.

If the array is not packed, then Independent_Components has no effect.  If it's
packed, the Independent_Components turns off the packing.  I don't get it.

Perhaps if somebody answered the later part of my previous email, the part
starting "Correct me if I'm wrong"...

> It is a bit odd that we never state that "aliased" implies
> "independent addressability" for components.

So you agree with Randy.  But I don't understand this -- "independent
addressability" is about tasking, whereas aliasedness is about allowing access
values.  Why should one have anything to do with the other?

****************************************************************

From: Bob Duff
Sent: Monday, February  4, 2013  4:10 PM

> ...
> > It is a bit odd that we never state that "aliased" implies
> > "independent addressability" for components.
>
> Bob was supposed to add that statement into the AI he is working on
> (that's what we decided in Boston), but he's resisting. For very
> marginal capabilities.

Please don't make this into a battle, Randy.  I'm not "resisting"
anything; I'm just trying to understand why we should add this implication.

And please stop accusing me of REMOVING this implication.  It's not there now,
and ARG wants to ADD it to the RM, and I want to know why.  The minutes don't
say.

Robert's comment (about the way hardware behaves in practice) argues against
making any change, because it won't actually change anything.

****************************************************************

From: Bob Duff
Sent: Monday, February  4, 2013  4:18 PM

If you say:

    pragma Independent_Components (Some_Array);

is there some implication that the compiler should allocate the components of
Some_Array on separate cache lines (and align it to a cache line boundary)?
That would make more sense than what I've heard so far.

****************************************************************

From: Randy Brukardt
Sent: Monday, February  4, 2013  4:22 PM

...
> > "aliased" turns off a lot of optimizations, which isn't necessary if
> > all you need is independent.
>
> I suppose, but that seems pretty marginal.  Remember, we're only
> talking about components of packed types.

Well, we're also talking about other representation clauses. It would be pretty
silly if you could get non-independent aliased components via packing but not
via other representation clauses. (And I worry much more about other clauses,
because only the slightly insane use Pack.)

> > > I'm not sure I fully understand pragmas Independent[_Components].
> > > Correct me if I'm wrong:  If you give these pragmas for a type
> > > that has a record rep clause, a Component_Size clause, or a
> > > Convention, then the only possible effect is to make the program
> > > illegal.  If you give these pragmas for a packed type, the only
> > > possible effect is to reduce the amount of packing.
> > > For any other type (which is 99.9% of all types), these pragmas
> > > have no effect.
> > >
> > > Am I right?  If so, Independent[_Components] seems like a pretty
> > > marginally useful feature.  I don't understand why it was  added,
> > > and
> > > AI05-0009-1 does not enlighten me.
>
> Please answer my question, "Am I right?".  Then we can discuss "The
> whole point...".

Probably, but I don't know for sure (I'd have to go back and re-read all of the
rules). But why is it relevant? There are lots of pragmas in Ada that have no
effect most of the time (Pack immediately comes to mind, and you want to make
that more likely). The only thing that matters is the intent expressed.

This strikes me as the setup for a bait-and-switch argument. I would hope we're
more adult than that...

> > The whole point is that you can give this aspect to ensure that
> > items are independently addressable (when you are depending on
> > that), without invoking the other costs of Volatile or aliased. Yes,
> > it's marginal in that it hardly ever is going to have an effect (and
> > hardly anyone will understand it enough to use it), but it's part of
> > the Ada pattern of declaring exactly what you need and no more. In
> > that sense, it is similar to declaring a range of
> > -32768..32767 on a 16-bit integer type -- this won't change anything
> > but it makes it even more clear what the expectations are.

The above is the only thing relevant, not the detailed effects or lack thereof.

****************************************************************

From: Randy Brukardt
Sent: Monday, February  4, 2013  4:38 PM

> > > My understanding is the opposite:  'aliased' never was intended to
> > > imply independent addressability -- just plain old addressability.
> > >
> > > If 'aliased' implies independent addressability, then why were
> > > pragmas Independent[_Components] added?
> >
> > For unaliased components.
>
> You can always add "aliased".

Not without performance implications. You could have also said "you can always
add Volatile_Components", with the same caveat.

> > You have an array of 32 (unaliased) Booleans.
> > You also have 32 tasks and you want to allow each of them to
> > manipulate one of the array elements.
>
> If the array is not packed, then Independent_Components has no effect.
> If it's packed, the Independent_Components turns off the packing.  I
> don't get it.

"Clearly", pack should be illegal in this case. Tucker once proposed having pack
be illegal if it does nothing at all, which would certainly go a ways toward
eliminating my opposition to this change.

> Perhaps if somebody answered the later part of my previous email, the
> part starting "Correct me if I'm wrong"...

It's irrelevant as to precisely what happens; it's about declaring your
intentions. If you can't see the value of declaring your intentions, we don't
have much to talk about.

> > It is a bit odd that we never state that "aliased" implies
> > "independent addressability" for components.
>
> So you agree with Randy.  But I don't understand this -- "independent
> addressability" is about tasking, whereas aliasedness is about
> allowing access values.  Why should one have anything to do with the
> other?

Because there is a presumption that designated objects are always independently
addressable. We didn't think there was a need to be able to declare that an
access type has only independently addressable designated objects, because that
was always the case; but you claim that it is not true in general, and thus such
a thing needs to be added.

> > I can imagine an array with
> > aliased-but-not-independently-addressable
> > components (e.g., a bit packed array of aliased booleans for an
> > implementation which implements access-to-boolean types as bit
> > pointers), but this seems pretty contrived.

And this is the real objection: any such examples seem contrived, and to support
such examples you want to eliminate the existing guarantee that designated
objects are always independently addressable. (Whether that guarantee is a
ramification or direct result of the wording is a separate issue - as Robert
says, no one ever has or likely will invalidate that.)

The only alternative would be to extend Independent to cover access types, and
then have a check on 'Access that the object actually has Independent specified
(or must be by 9.10). Which seems like too much mechanism.

****************************************************************

From: Tucker Taft
Sent: Monday, February  4, 2013  4:40 PM

> If you say:
>
>      pragma Independent_Components (Some_Array);
>
> is there some implication that the compiler should allocate the
> components of Some_Array on separate cache lines (and align it to a
> cache line boundary)?  That would make more sense than what I've heard
> so far.

No, I think the only point was to avoid erroneousness due to simultaneous
access, which the presence of other representation items might imply.

Efficiency is a different issue.

I will admit I have lost track of the issue.  Is it whether "aliased" implies
"independent"?  I would say yes that should be true.

Does independent imply aliased?  Clearly not.

****************************************************************

From: Robert Dewar
Sent: Monday, February  4, 2013  4:44 PM

> Robert's comment (about the way hardware behaves in practice) argues
> against making any change, because it won't actually change anything.

90% of the delicate arguments about wording also have this property (they won't
actually change anything), but there is still an inclination to get the words
right :-)

****************************************************************

From: Robert Dewar
Sent: Monday, February  4, 2013  4:49 PM

> If you say:
>
>      pragma Independent_Components (Some_Array);
>
> is there some implication that the compiler should allocate the
> components of Some_Array on separate cache lines (and align it to a
> cache line boundary)?  That would make more sense than what I've heard
> so far.

Absolutely no such implication, and indeed this would be disastrous for two
reasons:

Data representation would depend on the particular target configuration, since
cache line size is part of that.

Cache lines are huge on many machines, e.g. 256 bytes.

No, that's not the idea of independence (a concept I think I can take at least
partial credit for since I had my PhD student Norman Schulman study this in
detail) at all. The idea is that separate tasks can operate independently on
separate elements.

Whether this includes the case of separate tasks on separate processors being
able to access independent objects depends on the target, but in practice
virtually all systems implement full cache coherence with cache snooping (where
you watch bus traffic to make sure your cached view is current).

Allocating on separate cache lines in a system with cache coherence would make
no sense at all!

To see a practical effect of

 >      type Some_Array is array (1 .. 10) of Character;
 >      pragma Independent_Components (Some_Array);

Consider the old Alpha, which did not have byte load/store instructions. On one
of these old Alphas, you had to do 32-bit loads and stores, so the above
declarations would require (on that machine) Some_Array'Component_Size = 32.

****************************************************************

From: Robert Dewar
Sent: Monday, February  4, 2013  4:50 PM

> If you say:
>
>      pragma Independent_Components (Some_Array);

By the way, in Ada 83 you were required to provide independence for all
composites, including packed bit arrays.

Nonsense of course, but a consequence of the infamous "big change at the last
minute" to chapter 9 of the RM :-)

****************************************************************

From: Randy Brukardt
Sent: Monday, February  4, 2013  4:49 PM

> > ...
> > > It is a bit odd that we never state that "aliased" implies
> > > "independent addressability" for components.
> >
> > Bob was supposed to add that statement into the AI he is working on
> > (that's what we decided in Boston), but he's resisting. For very
> > marginal capabilities.
>
> Please don't make this into a battle, Randy.  I'm not "resisting"
> anything; I'm just trying to understand why we should add this
> implication.

The reason is obvious: non-independent designated objects are nonsense that
cannot be allowed (unless we add an additional way to declare independent
addressability of designated objects).

> And please stop accusing me of REMOVING this implication.
> It's not there now, and ARG wants to ADD it to the RM, and I want to
> know why.  The minutes don't say.

Because we all have assumed for decades that it IS there; the design of
Independent makes that clear (it would make no sense without such an
implication). If one can't read that between the lines of 9.10, then it needs to
be made explicit somehow (either with extra wording or an expansion of
Independent).

> Robert's comment (about the way hardware behaves in practice) argues
> against making any change, because it won't actually change anything.

If you mean about the behavior of Pack, I would agree. It appears to be illegal
to pack aliased components, and that's a good thing (as pack would have no
effect, and that's almost certainly some sort of mistake).

If you want to change Pack so that it is not illegal for aliased components, and
then "make no change" to aliased, that I don't understand, because we need some
rule to allow aliased components to be packed less tightly. (There is no
exception for aliased in the Recommended Level of Support.) And once you add
such an exception, you need to describe what it means.

I still think the previous version of the AI is closer to what we want (I
believe it mostly matches GNAT, as well). But YMMV.

****************************************************************

From: Tucker Taft
Sent: Monday, February  4, 2013  4:49 PM

>> ...
>> If the array is not packed, then Independent_Components has no
>> effect.  If it's packed, the Independent_Components turns off the
>> packing.  I don't get it.
>
> "Clearly", pack should be illegal in this case. Tucker once proposed
> having pack be illegal if it does nothing at all, which would
> certainly go a ways toward eliminating my opposition to this change.

That must have been Tucker # 42.  I don't remember this suggestion.  My feeling
has generally been that "pack" means use space optimization over time
optimization, subject to all of the other representation requirements.  It
should never be illegal to say "pack," though in the absence of an "independent"
(or aliased) specification, it can create erroneousness.

In my view, pragma Pack is really just a special case of "pragma
Optimize(Space)" which is a pretty non-specific request.  The only extra bit of
semantics for Pack is the pesky "non-independence" implication, and
pragma/aspect Independent/Independent_Components can be used to overcome that
bit.

If you really want to control representation, Component_Size or a record rep
clause are the way to go.  Pack is not really for controlling representation, it
is for establishing a space bias in the default representation selection
mechanism.
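
[Editor's note: a side-by-side sketch of the distinction Tucker draws; the
type names are invented.

   --  A space bias only; the compiler chooses the actual layout:
   type Flag_Set is array (0 .. 31) of Boolean;
   pragma Pack (Flag_Set);

   --  The representation is actually specified; each component occupies
   --  exactly one bit, or the clause is illegal if that cannot be supported:
   type Bit_Set is array (0 .. 31) of Boolean;
   for Bit_Set'Component_Size use 1;

Pack expresses only a preference; Component_Size (or a record representation
clause) pins the layout down.]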

****************************************************************

From: Robert Dewar
Sent: Monday, February  4, 2013  4:51 PM

> ...
>>> "aliased" turns off a lot of optimizations, which isn't necessary if
>>> all you need is independent.
>>
>> I suppose, but that seems pretty marginal.  Remember, we're only
>> talking about components of packed types.
>
> Well, we're also talking about other representation clauses. It would
> be pretty silly if you could get non-independent aliased components
> via packing but not via other representation clauses. (And I worry
> much more about other clauses, because only the slightly insane use
> Pack.)

I don't believe in the tooth fairy
And I don't believe in non-independent aliased components

:-)

This is independent of whatever wording you come up with!

****************************************************************

From: Bob Duff
Sent: Monday, February  4, 2013  4:58 PM

> > If you say:
> >
> >      pragma Independent_Components (Some_Array);
> >
> > is there some implication that the compiler should allocate the
> > components of Some_Array on separate cache lines (and align it to a
> > cache line boundary)?  That would make more sense than what I've
> > heard so far.
>
> No, I think the only point was to avoid erroneousness due to
> simultaneous access, which the presence of other representation items
> might imply.
>
> Efficiency is a different issue.

OK.  A feature that provided such an efficiency hint would be useful, IMHO. (No,
I'm not proposing to add one.)

> I will admit I have lost track of the issue.  Is it whether "aliased"
> implies "independent"?

Yes, that's the main issue.  The minutes say "yes, we should add such an
implication", and I'm wondering why.

A side issue is that if we added such an implication, the pragmas Indep[_Comp]
seem VERY marginally useful, so I wonder why they were added.  Randy says "to
declare one's intentions".  Well, that's nice, I suppose, but if you just
declare normal arrays without any Pack or 'Component_Size clauses, you get
independent addressability, and that was good enough for the first decades of
Ada's life.

>...I would say yes that should
> be true.

And Randy and Steve agree with you.
But I still don't understand why.
Sorry if I'm being dense.

> Does independent imply aliased?  Clearly not.

Yes, I think we all agree on that.

****************************************************************

From: Steve Baird
Sent: Monday, February  4, 2013  5:02 PM

>>> If 'aliased' implies independent addressability, then why were
>>> pragmas Independent[_Components] added?
>>
>> For unaliased components.
>
> You can always add "aliased".

> So you agree with Randy.  But I don't understand this -- "independent
> addressability" is about tasking, whereas aliasedness is about
> allowing access values.  Why should one have anything to do with the
> other?
>

The bugs associated with concurrent access to non-independent components are a
case of the implementation showing through in an ugly way, exposing a low-level
detail that a high-level language would ideally hide.

I want this to happen as infrequently as possible.

Anytime I can define a component to be independently addressable without giving
up something useful (and without forcing existing compilers to change), I want
to do so.

My point is that in the case of an aliased component, I don't see that I am
giving up anything useful (e.g., the freedom to have it share a byte with a
neighboring component) by adding an "aliasing => I.A." rule.

Although I think this rule would be a good thing, I'd agree that it is not a big
deal (which is why I don't feel strongly about it) because we already have I.A.
most of the time.

****************************************************************

From: Bob Duff
Sent: Monday, February  4, 2013  5:07 PM

> To see a practical effect of
>
>  >      type Some_Array is array (1 .. 10) of Character;
>  >      pragma Independent_Components (Some_Array);

Sorry, now I'm getting MORE confused.  There's no Pack or "for
Some_Array'Component_Size use..." above. So Ada requires the components of
Some_Array to be independently addressable even without that pragma. So I fail
to see any "practical effect" of the pragma.

> Consider the old Alpha, which did not have byte load/store
> instructions. On one of these old Alpha's, you had to do 32-bit loads
> and stores, so the above declarations would require (on that machine)
> Some_Array'Component_Size = 32.

I stated earlier in this thread that you had to do 64-bit loads and stores on
that machine, but you're right -- you could do 32.

By the way, type String has pragma Pack, which I always thought was so
String'Component_Size can be 8 even on such weird machines as the old Alpha!

****************************************************************

From: Robert Dewar
Sent: Monday, February  4, 2013  5:08 PM

> I can imagine an array with aliased-but-not-independently-addressable
> components (e.g., a bit packed array of aliased booleans for an
> implementation which implements access-to-boolean types as bit
> pointers), but this seems pretty contrived.

Imagine away, because Ada implementations that implement access-to-boolean types
as bit pointers are likely to be about as common as Loch Ness Monsters :-)

Seriously, it is not worth spending much time worrying about bizarre
implementation possibilities.

If you did have such an implementation, then it could do all sorts of peculiar
things (if necessary under control of a switch).

****************************************************************

From: Randy Brukardt
Sent: Monday, February  4, 2013  5:14 PM

...
> I will admit I have lost track of the issue.  Is it whether "aliased"
> implies "independent"?  I would say yes that should be true.

That's the issue. And Bob wants to know why. I've tried to explain, but probably
not very well. Perhaps you can take a stab at it.

****************************************************************

From: Tucker Taft
Sent: Monday, February  4, 2013  5:17 PM

>>> If you say:
>>>
>>>       pragma Independent_Components (Some_Array);
>>>
>>> is there some implication that the compiler should allocate the
>>> components of Some_Array on separate cache lines (and align it to a
>>> cache line boundary)?  That would make more sense than what I've
>>> heard so far.
>>
>> No, I think the only point was to avoid erroneousness due to
>> simultaneous access, which the presence of other representation items
>> might imply.
>>
>> Efficiency is a different issue.
>
> OK.  A feature that provided such an efficiency hint would be useful,
> IMHO. (No, I'm not proposing to add one.)
>
>> I will admit I have lost track of the issue.  Is it whether "aliased"
>> implies "independent"?
>> ...I would say yes that should
>> be true.
>
> And Randy and Steve agree with you.
> But I still don't understand why.
> Sorry if I'm being dense.

My sense is that in ARG meetings over the past several years, we have all agreed
we need something like Independent to overcome the slightly odd rule about
erroneousness coming from any non-confirming rep clause.  And of course there is
nothing in the source that says "this is a confirming rep clause" so pragma
Independent is one way to ensure that whether or not a rep-clause is confirming,
it doesn't cause the loss of independence.

Aliased seems to serve a completely different purpose, namely the ability to use
'Access.

The argument for making "aliased" imply independence is simply that once you
create an access value, it is pretty much impossible to keep track of where it
came from.  And clearly independence is guaranteed for distinct
dynamically-allocated objects.

But I think it would be odd to tell someone that if they want to overcome the
loss of independence due to a rep-clause, they should add "aliased." That seems
non-intuitive and somewhat user unfriendly.

****************************************************************

From: Robert Dewar
Sent: Monday, February  4, 2013  5:16 PM

>> My understanding is the opposite:  'aliased' never was intended to
>> imply independent addressability -- just plain old addressability.
>>
>> If 'aliased' implies independent addressability, then why were
>> pragmas Independent[_Components] added?

Independent Components is weaker than aliased

Aliased means you can have a pointer to the object

Independent Components means you can access them independently

Example:

On the x86, you can test, set, and clear individual bits, so a bit-packed array
can have independent components (if your code generator can handle this; GNAT
for one cannot, so we can't allow that).

But that does not mean you can have a pointer to an individual bit!
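
[Editor's note: a sketch of the distinction Robert draws, with invented names;
as he notes, the combination is conceivable only on a target with bit
test/set/clear instructions and a code generator that uses them.

   type Bit_Vector is array (0 .. 63) of Boolean;
   pragma Pack (Bit_Vector);
   pragma Independent_Components (Bit_Vector);
   --  Conceivably supportable on such a target (GNAT, per the above, does
   --  not support it), since each bit can be updated without disturbing
   --  its neighbors.

   V : Bit_Vector := (others => False);
   --  The components are still not aliased, so there is no way to point
   --  at an individual bit:
   --     Bit : access Boolean := V (3)'Access;   -- illegal

Independence of components and aliasedness of components are separate
properties.]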

****************************************************************

From: Robert Dewar
Sent: Monday, February  4, 2013  5:19 PM

> The reason is obvious: non-independent designated objects are nonsense
> that cannot be allowed (unless we add an additional way to declare
> independent addressability of designated objects).

I really don't think it matters two hoots whether the language allows
non-independent designated objects. The semantics is clear, though peculiar, and
on 100% of targets you can't have such things, so they wouldn't be implemented
anyway!

> Because we all have assumed for decades that it IS there; the design
> of Independent makes that clear (it would make no sense without such
> an implication). If one can't read that between the lines of 9.10,
> then it needs to be made explicit somehow (either with extra wording
> or an expansion of Independent).
>
>> Robert's comment (about the way hardware behaves in practice) argues
>> against making any change, because it won't actually change anything.

> If you want to change Pack so that it is not illegal for aliased
> components, and then "make no change" to aliased, that I don't
> understand, because we need some rule to allow aliased components to
> be packed less tightly. (There is no exception for aliased in the
> Recommended Level of Support.) And once you add such an exception, you need
> to describe what it means.

I don't see any reason to disallow pack for a record that has some aliased
components. I agree that pack for an array of aliased components is unlikely to
do anything!

****************************************************************

From: Robert Dewar
Sent: Monday, February  4, 2013  5:24 PM

> In my view, pragma Pack is really just a special case of "pragma
> Optimize(Space)" which is a pretty non-specific request.  The only
> extra bit of semantics for Pack is the pesky "non-independence"
> implication, and pragma/aspect Independent/Independent_Components can
> be used to overcome that bit.

Not quite: packed arrays of Boolean are guaranteed to work as expected.

> If you really want to control representation, Component_Size or a
> record rep clause are the way to go.  Pack is not really for
> controlling representation, it is for establishing a space bias in the
> default representation selection mechanism.

Well, in practice everyone uses it for Boolean arrays to control representation:

type B is array (0 .. 31) of Boolean;
pragma Pack (B);

is *VERY* standard Ada, and even advising, let alone insisting, that people use
a Component_Size of 1 in such a case seems bogus to me.

****************************************************************

From: Robert Dewar
Sent: Monday, February  4, 2013  5:43 PM

>> To see a practical effect of
>>
>>   >      type Some_Array is array (1 .. 10) of Character;
>>   >      pragma Independent_Components (Some_Array);
>
> Sorry, now I'm getting MORE confused.  There's no Pack or "for
> Some_Array'Component_Size use..." above.
> So Ada requires the components of Some_Array to be independently
> addressable even without that pragma.
> So I fail to see any "practical effect" of the pragma.

OK, add the pragma Pack to the example

> By the way, type String has pragma Pack, which I always thought was so
> String'Component_Size can be 8 even on such weird machines as the old
> Alpha!

That's exactly right

****************************************************************

From: Robert Dewar
Sent: Monday, February  4, 2013  5:45 PM

I have an idea: how about we make sure that whatever we say makes sense on
8-bit byte-addressable machines, with independent 8-bit bytes and byte pointers.

That covers almost all machines on which Ada is used or likely to be used. On
machines that do not meet these criteria, you simply appeal to the normal
argument that you can't do things if the architecture does not permit them.

It really is a silly waste of time to try to write rules that cover all such
possible weird machines.

****************************************************************

From: Bob Duff
Sent: Monday, February  4, 2013  7:18 PM

> My sense is that in ARG meetings over the past several years, we have
> all agreed we need something like Independent to overcome the slightly
> odd rule about erroneousness coming from any non-confirming rep
> clause.

Well, it doesn't really say "erroneous"; it says it's implementation-defined
whether there is erroneousness. To me, that means that on "normal" machines, it
won't be erroneous, and that should be good enough.

But I'm willing to just give in on this point. I'm not entirely convinced, but
everybody seems against me on this, and it's just not important enough to keep
arguing about.

****************************************************************

From: Bob Duff
Sent: Monday, February  4, 2013  7:41 PM

> That must have been Tucker # 42.

Never heard of that Tucker.

>...I don't remember this
> suggestion.  My feeling has generally been that "pack" means  use
>space optimization over time optimization, subject to  all of the other
>representation requirements.

I think I understand what you mean, but I think that's a wrong (or obsolete) way
to express it.

Bit-level packing improves the speed of whole-object operations like "=" and
":=" and parameter passing.  And it improves speed by making things smaller and
therefore more cache-friendly. It damages the speed of operating on individual
components. So Pack tells the compiler how to make that speed trade-off.

On a small-memory embedded system, Pack might be about space efficiency, but on
a 64-bit computer, it's purely about time efficiency -- it means "reduce space
of this data structure in order to improve time overall" (as opposed to "reduce
space because I might run out of memory").

It's not "space versus speed", it's "speed of these operations versus speed of
those operations".

>..It should never be illegal to say "pack,"...

Yes, that's the important point about this AI.

****************************************************************

From: Randy Brukardt
Sent: Monday, February  4, 2013  7:54 PM

> > My sense is that in ARG meetings over the past several years, we
> > have all agreed we need something like Independent to overcome the
> > slightly odd rule about erroneousness coming from any non-confirming
> > rep clause.
>
> Well, it doesn't really say "erroneous", it says it's
> implementation-defined whether there is erroneousness.
> To me, that means that on "normal" machines, it won't be erroneous,
> and that should be good enough.

I don't follow. One expects packed array of Boolean to pack to bits on a
"normal" machine, and that surely would cause the potential for erroneousness.
Moreover, this is a particularly nasty kind of erroneousness, as the failure
doesn't happen unless two tasks happen to access two components in the same word
at the same time, something that is unlikely even when an obvious mistake
happens. It's unlikely that one would find this erroneous case by testing, and
that leads to the possibility of the problem not happening until the application
is fielded -- which is not good. By having a declaration of Independence ;-), we
can more easily write tools to check that in fact no such accesses are possible.
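
For concreteness, a sketch of the failure mode (the task names are made up;
assume the array packs to bits within a single word and no independence
pragma is given):

    type Flag_Array is array (0 .. 31) of Boolean;
    pragma Pack (Flag_Array);

    Flags : Flag_Array := (others => False);

    task T1;
    task T2;

    task body T1 is
    begin
       Flags (0) := True;  --  typically a read-modify-write of the shared word
    end T1;

    task body T2 is
    begin
       Flags (1) := True;  --  can overlap T1's read-modify-write
    end T2;

If both updates are compiled as load/modify/store of the containing word, one
of the two True values can be silently lost -- and only when the two tasks hit
the window at the same time, which is exactly the "unlikely to show up in
testing" problem described above.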

I suppose what you say might be true for aliased components, perhaps you meant
that (but Tucker was clearly talking about this in general). Even then, it seems
dangerous because this is a problem that can't reasonably be tested away, and we
don't want tools to be making checks just for "normal" machines (nor complaining
too much about cases that can't happen on the machines that the user is using).

> But I'm willing to just give in on this point. I'm not entirely
> convinced, but everybody seems against me on this, and it's just not
> important enough to keep arguing about.

Glad you are seeing reason. ;-) I'm happy to stop discussing this as well.

****************************************************************

From: Randy Brukardt
Sent: Monday, February  4, 2013  8:04 PM

...
> Bit-level packing improves the speed of whole-object operations like
> "=" and ":=" and parameter passing.  And it improves speed by making
> things smaller and therefore more cache-friendly.
> It damages the speed of operating on individual components.
> So Pack tells the compiler how to make that speed trade-off.
>
> On a small-memory embedded system, Pack might be about space
> efficiency, but on a 64-bit computer, it's purely about time
> efficiency -- it means "reduce space of this data structure in order
> to improve time overall" (as opposed to "reduce space because I might
> run out of memory").
>
> It's not "space versus speed", it's "speed of these operations versus
> speed of those operations".

I think it is *just* about reducing the space (and should almost never be used
in a modern system as a consequence).

If it were really about reducing the time, why don't we have a counterpart that
goes in the other direction ("use as much space as you want to maximize the
performance of accessing individual components")? Given cache pressure concerns,
the normal case is usually somewhere in the middle (particularly on machines
with somewhat irregular access instructions like the x86). Occasionally, you
want to go all one way or the other.

****************************************************************

From: Robert Dewar
Sent: Tuesday, February  5, 2013  4:58 AM

> I don't follow. One expects packed array of Boolean to pack to bits on
> a "normal" machine, and that surely would cause the potential for
> erroneousness.

Actually, this is a good example. If we are on the x86 and we
write:

    type R is array (0 .. 31) of Boolean;
    pragma Pack (R);
    for R'Size use 32;

(a very typical set of declarations), then it is really a tossup whether we can
rely on separate tasks being able to fiddle with different bits. There are three
cases:

a) the code generator always uses the memory bit instructions to access the
   bits. Probably a bad choice, but would assure independence for the above
   case.

b) the code generator is incapable of using these memory bit instructions, in
   which case such simultaneous access is not possible, and a pragma
   Independent_Components would be rejected.

c) the code generator can generate these instructions, but only if told to do
   so. In this case the pragma Independent_Components is a perfect way of
   providing this instruction.

Really a perfect example of the value of the pragma!
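
For case (c), the request would look something like this (a sketch, adding the
pragma to the declarations above):

    type R is array (0 .. 31) of Boolean;
    pragma Pack (R);
    for R'Size use 32;
    pragma Independent_Components (R);
    --  Requires each Boolean to be independently addressable; in case (c)
    --  the compiler achieves that with the memory bit instructions, and in
    --  case (b) the pragma would simply be rejected.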

> Moreover, this is a particularly nasty kind of erroneousness, as the
> failure doesn't happen unless two tasks happen to access two
> components in the same word at the same time, something that is
> unlikely even when an obvious mistake happens. It's unlikely that one
> would find this erroneous case by testing, and that leads to the
> possibility of the problem not happening until the application is
> fielded -- which is not good. By having a declaration of Independence
> ;-), we can more easily write tools to check that in fact no such accesses are
> possible.

Well, this is really just a special case of shared-variable erroneousness, and
indeed it is a nasty case. Shared variables are always risky.

****************************************************************

From: Robert Dewar
Sent: Tuesday, February  5, 2013  5:11 AM

> On a small-memory embedded system, Pack might be about space
> efficiency, but on a 64-bit computer, it's purely about time
> efficiency -- it means "reduce space of this data structure in order
> to improve time overall" (as opposed to "reduce space because I might
> run out of memory").

No, it's about space efficiency too. Even on a 64-bit machine, it's just
impractical to have programs with giant working sets. Yes, one can see this as a
time issue, but really most people would think of it as a space issue.

Randy's view that pragma Pack makes no sense on such machines is just
incomprehensible!

Let me give an example.

The GNAT compiler allocates a 192-byte block for entities.
We recently increased this from 160 bytes to add more fields and flags for Ada
2012, new pragmas, etc., and we saw a significant but tolerable effect on
compiler space and time.

Now these entities include space for about 325 Boolean flags (taking up 40 of
those 192 bytes, packed of course).
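
(For concreteness, a sketch of the arithmetic with made-up names, not GNAT's
actual declarations:)

    type Flag_Id is range 1 .. 325;
    type Flag_Set is array (Flag_Id) of Boolean;
    pragma Pack (Flag_Set);
    --  Packed:   325 bits, i.e. about 41 storage elements (the "40 bytes"
    --            above).
    --  Unpacked: typically one storage element per Boolean, about 325
    --            bytes, i.e. roughly 280 bytes more per entity.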

Randy thinks we should remove the pragma Pack on modern machines. This would
increase the size of entities by 280 bytes, giving a total of nearly 500 bytes.

The impact on compiler space and time would be completely disastrous! Makes no
sense at all.

And gains nothing! Most accesses to flags are reads, and the extra cost of reading
packed bits is negligible.

I think a lot of people underestimate cache effects.
Again, you can consider these as space or time issues, but it is more reasonable
to think of them as space issues: you need to keep things small so that they
stay in cache.

With modern processors, out-of-cache memory references are disastrously slow,
and many programs are memory bound.

If you want to get maximum performance, you have to really think about cache
effects.

For example, if you do matrix multiplication with the familiar three nested
loops, you get absolutely disastrous performance if the matrix does not fit in
the primary cache (a huge proportion of the references will be out of cache in
this case). Instead you have to tile, adjusting the tiling parameters to match
the size of the primary cache. The difference in performance is staggering.
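
(For illustration, a minimal sketch of the tiled loops; N, Tile, and all other
names here are made up, and Tile would be tuned to the primary cache size:)

    N    : constant := 1_000;
    Tile : constant := 64;   --  tune to the primary cache size

    type Matrix is array (1 .. N, 1 .. N) of Float;

    procedure Multiply (A, B : Matrix; C : out Matrix) is
    begin
       C := (others => (others => 0.0));
       --  Walk the matrices tile by tile so that the working set of
       --  Tile x Tile blocks stays resident in the primary cache.
       for IT in Integer range 0 .. (N - 1) / Tile loop
          for JT in Integer range 0 .. (N - 1) / Tile loop
             for KT in Integer range 0 .. (N - 1) / Tile loop
                for I in IT * Tile + 1 .. Integer'Min (N, (IT + 1) * Tile) loop
                   for J in JT * Tile + 1 .. Integer'Min (N, (JT + 1) * Tile) loop
                      for K in KT * Tile + 1 .. Integer'Min (N, (KT + 1) * Tile) loop
                         C (I, J) := C (I, J) + A (I, K) * B (K, J);
                      end loop;
                   end loop;
                end loop;
             end loop;
          end loop;
       end loop;
    end Multiply;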

So once again, packed bit arrays are a really important aspect of Ada (another
familiar use of such bit arrays is in optimization in a compiler -- I assume I
don't need to give details to this audience!). The space taken up by these
representations of sets can be huge even when they are packed.

And of course we have not even talked of the use of bit packing to match
external data structures, a common situation.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, February  6, 2013  1:45 PM

> No, it's about space efficiency too. Even on a 64-bit machine, it's
> just impractical to have programs with giant working sets. Yes, one
> can see this as a time issue, but really most people would think of it
> as a space issue.
>
> Randy's view that pragma Pack makes no sense on such machines is just
> incomprehensible!

I don't think I said anything about "no sense". I said that Pack was solely
about space management (which you agree with above), and I said that as a
consequence it would "almost never" be used in systems where execution time is
the primary criterion. "Almost never" is probably a bit stronger than I meant
("rarely" would have been a better choice), but I stand by my comment.

> Let me give an example.
>
> The GNAT compiler allocate a 192-byte block for entities.
> We recently increased this from 160 bytes to add more fields and flags
> for Ada 2012 and new pragmas etc, and we saw a significant but
> tolerable effect on compiler space and time.
>
> Now these entities include space for about 325 boolean flags (taking
> up 40 of those 192 bytes, packed of course.
>
> Randy thinks we should remove the pragma Pack on modern machines. This
> would increase the size of entities by 280 bytes, giving a total of
> nearly 500 byte.

Not at all. You most likely don't want to manage such a situation with aspect
Pack, because it is such a blunt instrument.

In the example you gave (and I expect most similar cases), using pragma Pack
indiscriminately would increase the execution time more than using
representation clauses carefully.

I don't know exactly how GNAT organizes that "block", but in Janus/Ada it is a
large variant record. The effect of applying Pack to that record would be to
force all components to their minimum size, potentially including forcing
pointers and large integers to be misaligned. But that doesn't make much sense,
because all you are trying to do is fit the data into a space of a particular
size. You don't want to shrink components that aren't part of the "critical
path" for that space -- for instance, the smaller variant limbs. And you
probably don't want to pack components that are very commonly used, either, even
if you could (the overall discriminant that controls this record is an example;
it gets used in lots of discriminant checks even when it isn't explicitly
referenced).

So in this case, Pack just does too much. You have to use a record rep. clause
to shrink just the components that matter and leave the rest of them alone. [We
also need to keep binary compatibility as much as possible, so we don't want
components moving around unnecessarily -- but that's not the primary reason to
avoid Pack in the general case.]
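
(A sketch of that approach, with made-up names; only the components on the
critical path get squeezed, and anything alignment-sensitive is left alone:)

    type Node_Kind is range 0 .. 15;

    type Node;
    type Node_Ptr is access Node;

    type Node is record
       Kind   : Node_Kind;   --  squeezed into 4 bits below
       Flag_A : Boolean;     --  1 bit
       Flag_B : Boolean;     --  1 bit
       Link   : Node_Ptr;    --  deliberately left alone
    end record;

    for Node use record
       Kind   at 0 range 0 .. 3;
       Flag_A at 0 range 4 .. 4;
       Flag_B at 0 range 5 .. 5;
       --  Link gets no component clause, so the compiler places it with
       --  its normal size and alignment (and it can move if the record
       --  grows, unlike the components pinned down above).
    end record;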

Now, I realize you can restructure your data structures with sub-records and
sub-arrays so that those parts can use Pack (that would be the "rare" case I was
referring to originally), but unless that happened for other reasons, I think
that is putting the cart before the horse: your program structure shouldn't be
determined by what representation clauses you want to apply!

(And using arrays of Booleans rather than records of Booleans makes little sense
in the absence of outside considerations; these bits had better be named, and
once they are, record components are easier to read and write.)

> The impact on compiler space and time would be completely disastrous!
> Makes no sense at all.
>
> And gains nothing! Most access tto flags is reads, and the extra cost
> of reading packed bits is negligible.

I agree that reading packed Booleans (if they're in records or the array index
is static) doesn't cost much, mainly because they're almost always immediately
tested, and you can combine the bit extract and the test so that you can
eliminate the shift step. That's less true for other enumerations, though, and
they too take up a lot of extra space.

> I think a lot of people underestimate cache effects.
> Again you can consider these as space or time issues, but it is more
> reasonable to think of them as space issues, you need to have things
> small to keep things in cache.
>
> With modern processors, out of cache memory references are
> disastrously slow, and many programs are memory bound.
>
> If you want to get maximum performance, you have to really think about
> cache effects.
>
> For example, if you do matrix multiplication with the familiar three
> nested loops, you get absolutely disastrous performance if the matrix
> does not fit in the primary cache (a huge proportion of the references
> will be out of cache in this case).
> Instead you have to tile, adjusting the tiling parameters to match the
> size of the primary cache.
> The difference in performance is staggering.
>
> So once again, packed bit arrays are a really important aspect of Ada
> (another familiar use of such bit arrays is in optimization in a
> compiler, I assume I don't need to give details to this audience! the
> space taken up by these representations of sets can be huge even when
> they are packed.

We didn't use bit arrays in our optimizer specifically because the sets get too
large to be practical (especially when your original host was MS-DOS with 640K
of RAM). Instead we used property lists (with a very compact representation for the
properties).

> And of course we have not even talked of the use of bit packing to
> match external data structures, a common situation.

Huh? Pack does whatever it does; if you have to match some specific data
structure, you have to be more specific, using 'Component_Size and record
representation clauses. I'm only talking about aspect Pack (capital P), not the
general idea of bit packing!! If you have to match something with
'Component_Size = 1, write that, don't assume that Pack will do the right thing
(especially as Bob and Tucker want to weaken those guarantees; if the component
turns out to be aliased or atomic, no packing will happen and that's probably
not what was meant).
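
(That is, something like the following sketch, which pins the layout down
explicitly instead of relying on Pack:)

    type Bit_Vector is array (0 .. 31) of Boolean;
    for Bit_Vector'Component_Size use 1;
    for Bit_Vector'Size use 32;
    --  If a component later becomes aliased, independent, or atomic, these
    --  clauses are rejected outright rather than silently ignored, so the
    --  mismatch with the external layout is caught at compile time.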

****************************************************************


