Version 1.17 of ais/ai-00133.txt


!standard 13.05.03 (08)          05-08-09 AI95-00133/06
!standard 13.03 (08)
!standard 13.05.01 (10)
!standard 13.05.01 (13)
!standard 13.05.01 (17)
!standard 13.05.02 (02)
!standard 13.05.02 (03)
!standard 13.05.02 (04)
!class binding interpretation 96-05-07
!status Amendment 200Y 04-09-27
!status WG9 approved 04-11-18
!status ARG Approved 8-0-1 04-09-18
!status work item 96-05-07
!status received 96-05-07
!priority Medium
!difficulty Medium
!subject Controlling bit ordering
!summary
Bit_Order clauses are concerned with the numbering of bits and not concerned with data flipping interoperability.
The interpretation of component_clauses in the nondefault bit order is based on machine scalars, which are chunks of storage that can be natively loaded and stored by the machine. All of the component_clauses at a given offset are considered to be part of the same machine scalar, and the first_bit and last_bit are interpreted as bit offsets within that machine scalar. This makes it possible to write endian-independent record_representation_clauses.
The recommended level of support for Bit_Order clauses is modified to include support for the nondefault bit order in all cases.
!question
What problem is the Bit_Order attribute supposed to solve? Is it intended to solve:
1) the "compiler uniformity problem" where a single program on a single processor wishes to use unchecked_conversion to convert a scalar (e.g. float) object to a record type (e.g. in order to extract the sign, exponent, and mantissa), and wishes to use a single portable record representation clause regardless of the default endianness of the target computer;
or
2) the "data interoperability problem" where two processors with different bit orders need to access shared memory, files, devices, or network channels, so one processor has to do cumbersome byte flipping?
(The former.)
!recommendation
(See Summary.)
!wording
Add at the end of 13.3(8):
A machine scalar is an amount of storage that can be conveniently and efficiently loaded, stored, or operated upon by the hardware. Machine scalars consist of an integral number of storage elements. The set of machine scalars is implementation defined, but must include at least the storage element and the word. [Machine scalars are used to interpret component_clauses when the nondefault bit ordering applies.]
Add after 13.5.1(10)
If the nondefault bit ordering applies to the type, then either:
o the value of last_bit shall be less than the size of the largest machine scalar; or
o the value of first_bit shall be zero and the value of last_bit + 1 shall be a multiple of System.Storage_Unit.
Replace 13.5.1(13) by:
A record_representation_clause (without the mod_clause) specifies the layout.
If the default bit ordering applies to the type, the position, first_bit, and last_bit of each component_clause directly specify the position and size of the corresponding component.
If the nondefault bit ordering applies to the type then the layout is determined as follows:
o the component_clauses for which the value of last_bit is greater than or equal to the size of the largest machine scalar directly specify the position and size of the corresponding component;
o for other component_clauses, all of the components having the same value of position are considered to be part of a single machine scalar, located at that position; this machine scalar has a size which is the smallest machine scalar size larger than the largest last_bit for all component_clauses at that position; the first_bit and last_bit of each component_clause are then interpreted as bit offsets in this machine scalar.
Add after 13.5.1(17):
o An implementation should support machine scalars that correspond to all of the integer, floating point, and address formats supported by the machine.
Replace 13.5.1(20) by:
o For a component with a subtype whose Size is less than the word size, any storage place that does not cross an aligned word boundary should be supported.
Replace 13.5.2(2-4) by:
R.C'Position
If the nondefault bit ordering applies to the composite type, and if a component_clause specifies the placement of C, denotes the value given for the position of the component_clause; otherwise, denotes the same value as R.C'Address - R'Address. The value of this attribute is of the type universal_integer.
R.C'First_Bit
If the nondefault bit ordering applies to the composite type, and if a component_clause specifies the placement of C, denotes the value given for the first_bit of the component_clause; otherwise, denotes the offset, from the start of the first of the storage elements occupied by C, of the first bit occupied by C. This offset is measured in bits. The first bit of a storage element is numbered zero. The value of this attribute is of the type universal_integer.
R.C'Last_Bit
If the nondefault bit ordering applies to the composite type, and if a component_clause specifies the placement of C, denotes the value given for the last_bit of the component_clause; otherwise, denotes the offset, from the start of the first of the storage elements occupied by C, of the last bit occupied by C. This offset is measured in bits. The value of this attribute is of the type universal_integer.
[Author's Note: This is a situation where the fact that a representation item (the component_clause) is specified has a visible effect. That's unfortunate, but I see no alternative, because in the absence of a component_clause for C there is no way that we can conjure up the machine scalars out of thin air.]
Replace 13.5.3(8) by:
o The implementation should support the nondefault bit ordering in addition to the default bit ordering.
Add after 13.5.3(8):
NOTE: Bit_Order clauses make it possible to write record_representation_clauses that can be ported between machines having different bit ordering. They do not guarantee transparent exchange of data between such machines.
!discussion
NOTE: This AI is largely based on Norman Cohen's paper "A Proposal for Endian-Portable Record Representation Clauses", which can be found at http://www.ada-auth.org/ai-files/grab_bag/bitorder.pdf. This paper contains figures that would be hard to reproduce in a text-only format, so the interested reader is invited to consult the PDF version. The most important definitions and conclusions of this paper are repeated here for convenience.
There are at least two "endian problems". One is the run-time problem of transferring data between a big-endian machine and a little-endian machine (or between big-endian and little-endian processes of a bi-endian machine). Another is the compile-time problem of specifying a data layout portably, so that a single version of an Ada source file will produce the same layout regardless of the compiler's default bit ordering. This is a proposal to solve the second problem.
Most machine instructions operate on small values that we shall call machine scalars. Typical machine scalars include an 8-bit byte, a 16-bit halfword, and a 32-bit word, as well as 32- and 64-bit floating point formats. A machine scalar generally fits in a register or a pair of registers.
The only difference between big-endian and little-endian execution is the correspondence between a sequence of two or more bytes in memory, starting at a given address and extending to higher-addressed bytes, and machine-scalar values.
A program that never views the same bits as belonging to two different types (and never performs binary I/O, which is tantamount to viewing the raw contents of a file as the representation of some type) is inherently endian-independent. That is, the program is portable between big-endian and little-endian machines. A source program that views the same storage as belonging to more than one type can also be endian-independent, provided that an appropriate programming discipline is followed and provided that object code is generated from the source code in a manner consistent with the target execution bit ordering.
A program viewing the same storage as belonging to more than one type will typically depend on properties like the following:
o A given machine scalar occurs at a specified offset within a record.
o The bits of a machine scalar can be subdivided into contiguous fields of specified sizes, occurring in a specified order from most significant bits to least significant bits.
For example the IEEE 32-bit floating point format has a 1-bit sign field, followed by an 8-bit exponent field, followed by a 23-bit mantissa field. The word "followed" in this description pertains to the machine scalar representation, i.e. the register representation. On a big-endian target machine, this format might be represented as follows:
type IEEE_32 is
  record
    Sign : Integer range 0 .. 1;
    Exponent : Integer range 0 .. 2**8 - 1;
    Mantissa : Integer range 0 .. 2**23 - 1;
  end record;

for IEEE_32 use
  record
    Sign at 0 range 0 .. 0;
    Exponent at 0 range 1 .. 8;
    Mantissa at 0 range 9 .. 31;
  end record;
On a little-endian target machine, the record representation clause would have to be written as follows:
for IEEE_32 use
  record
    Sign at 0 range 31 .. 31;
    Exponent at 0 range 23 .. 30;
    Mantissa at 0 range 0 .. 22;
  end record;
We would prefer to be able to write and maintain a single record-representation clause that would produce the appropriate memory mapping for the target machine whether that machine is big-endian or little-endian. At first glance, the Bit_Order clause appears to provide a solution. Unfortunately, the recommended level of support for the Bit_Order clause, as defined by 13.5.3(8) is very weak:
"If Word_Size = Storage_Unit, then the implementation should support the
nondefault bit ordering in addition to the default bit ordering."
In other words, there is no requirement to support the non-default bit ordering if Word_Size > Storage_Unit. On most machines in existence today Storage_Unit is 8 and Word_Size is typically 32 or more, so the recommended level of support is vacuous.
The reason why 13.5.3(8) is so weak has to do with the meaning of large bit numbers, i.e., bit numbers exceeding System.Storage_Unit - 1. The meaning of large bit numbers in the default bit ordering is well understood: assuming 8-bit bytes, bit 8+b of byte a is the same bit as bit b of byte a+1. Consequently, there are redundant ways to specify the same storage layout. For example, the big-endian record representation clause above could have been written equivalently:
for IEEE_32 use
  record
    Sign at 0 range 0 .. 0;
    Exponent at 0 range 1 .. 8;
    Mantissa at 1 range 1 .. 23;
  end record;
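The equivalence between "at 0 range 9 .. 31" and "at 1 range 1 .. 23" is ordinary arithmetic on the bit numbers in the default bit ordering. The following is a minimal sketch of that normalization, assuming 8-bit storage elements; the Place type and the Normalize function are hypothetical helpers introduced only for illustration and are not part of the RM.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Normalize_Demo is
   --  Assumption: 8-bit storage elements, as in the surrounding text.
   Storage_Unit : constant := 8;

   --  Hypothetical helper type describing one component_clause.
   type Place is record
      Position  : Natural;
      First_Bit : Natural;
      Last_Bit  : Integer;
   end record;

   --  Default bit ordering only: bit 8+b of byte a is bit b of byte a+1,
   --  so whole storage elements can be moved from First_Bit into Position.
   function Normalize (P : Place) return Place is
      Shift : constant Natural := P.First_Bit / Storage_Unit;
   begin
      return (Position  => P.Position + Shift,
              First_Bit => P.First_Bit - Shift * Storage_Unit,
              Last_Bit  => P.Last_Bit - Shift * Storage_Unit);
   end Normalize;

   --  "Mantissa at 0 range 9 .. 31" from the big-endian clause.
   N : constant Place :=
     Normalize ((Position => 0, First_Bit => 9, Last_Bit => 31));
begin
   Put_Line (Natural'Image (N.Position)
             & Natural'Image (N.First_Bit)
             & Integer'Image (N.Last_Bit));
end Normalize_Demo;
```

For the Mantissa component this yields position 1, first_bit 1 and last_bit 23, which is the form used in the rewritten clause.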
While the meaning of large bit numbers is obvious in the default bit ordering, it is not obvious in the nondefault bit ordering. Suppose we compile a big-endian record-representation clause, together with a big-endian Bit_Order clause, for a little-endian target. The record-representation clause includes the component clause:
Exponent at 0 range 1 .. 8;
specifying that the Exponent component includes bits 1 to 8 of byte 0. If we adhere to the definition that bit 8+b of byte a is the same bit as bit b of byte a+1, then this is equivalent to bits 1 to 7 of byte 0 and bit 0 of byte 1. Under big-endian bit numbering rules, these are the 7 least significant bits of byte 0 together with the most significant bit of byte 1. Unfortunately, on a little-endian target, these two groups of bits will not be adjacent in a machine scalar corresponding to bytes 0 and 1. This would make it difficult to extract and update the Exponent component. Furthermore, it is not clear that there is a need for supporting non-contiguous components in the language, but it is clear that there is a need for supporting endian-independent record representation clauses.
It is important to notice that contiguity of bits is a meaningful notion within a machine scalar, or within a single byte of memory, but not between bits in different bytes of memory. The reason is that, on different target machines, bytes are loaded in different order to compose a machine scalar. If we want to be able to write endian-independent record representation clauses, we cannot interpret large bit numbers with respect to the memory representation. We can only interpret them with respect to machine scalars.
Therefore, we adopt the following convention: in a record-representation clause for the nondefault bit ordering, there is a one-to-one correspondence between byte offsets (i.e., the numbers appearing between the words "at" and "range") and machine scalars. That is, all components whose positions are specified with the same byte offset are assumed to be part of the same machine scalar (so that in typical implementations they will be loaded into a register together, ignoring alignment issues); and any two components required to reside within the same machine scalar have their positions specified in terms of the same byte offset. The length of a machine scalar is inferred from the highest bit number specified along with its byte position in some component clause, rounded up as appropriate.
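Under this convention, a single declaration along the following lines would be expected to serve both big-endian and little-endian targets. This is only a sketch: the package name and the component type names are assumptions made for illustration, and the example presumes that the implementation provides a 32-bit machine scalar and accepts the nondefault bit ordering.

```ada
with System;

package IEEE_Layout is
   --  Illustrative component types; the names are assumptions.
   type Sign_Bit    is range 0 .. 1;
   type Exponent_8  is range 0 .. 2**8 - 1;
   type Mantissa_23 is range 0 .. 2**23 - 1;

   type IEEE_32 is record
      Sign     : Sign_Bit;
      Exponent : Exponent_8;
      Mantissa : Mantissa_23;
   end record;

   --  Bit 0 is the most significant bit, whatever the default is.
   for IEEE_32'Bit_Order use System.High_Order_First;

   --  All three components share position 0, so they belong to one
   --  machine scalar. The largest last_bit is 31, so that machine
   --  scalar is 32 bits, and the bit numbers are offsets within it.
   for IEEE_32 use record
      Sign     at 0 range 0 .. 0;
      Exponent at 0 range 1 .. 8;
      Mantissa at 0 range 9 .. 31;
   end record;

   for IEEE_32'Size use 32;
end IEEE_Layout;
```

Because every component_clause names byte offset 0, no component is split across machine scalars, which is what keeps the Exponent field contiguous on a little-endian target.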
Note that the set of machine scalars is implementation-dependent, so there is no guarantee that two compilers targeting the same machine will support the same set of machine scalars, and therefore the same set of record representation clauses. We give implementation advice to support all integer, floating-point and address formats, though, as there is really no run-time complexity associated with supporting machine scalars: they only play a role when interpreting component_clauses. The reason why we have them is to reduce the likelihood that people will be surprised by the effect of some representation clauses, notably in the presence of holes.
Also note that a representation clause written for, say, a 32-bit target machine, may not port to a 16-bit target machine (that could be the case for the above example) as there may not exist a 32-bit machine scalar on the latter target.
The proposed interpretation is of course incompatible in the sense that, for the nondefault bit ordering, it breaks the current rule that bit 8+b of byte a is the same bit as bit b of byte a+1. However, the existing RM is sufficiently vague and muddled that it's hard to believe that there is a lot of code out there depending on record_representation_clauses with nondefault bit ordering.
!corrigendum 13.3(8)
Insert after the paragraph:
A storage element is an addressable element of storage in the machine. A word is the largest amount of storage that can be conveniently and efficiently manipulated by the hardware, given the implementation's run-time model. A word consists of an integral number of storage elements.
the new paragraph:
A machine scalar is an amount of storage that can be conveniently and efficiently loaded, stored, or operated upon by the hardware. Machine scalars consist of an integral number of storage elements. The set of machine scalars is implementation defined, but must include at least the storage element and the word. Machine scalars are used to interpret component_clauses when the nondefault bit ordering applies.
!corrigendum 13.5.1(10)
Insert after the paragraph:
The position, first_bit, and last_bit shall be static expressions. The value of position and first_bit shall be nonnegative. The value of last_bit shall be no less than first_bit - 1.
the new paragraphs:
If the nondefault bit ordering applies to the type, then either:
o the value of last_bit shall be less than the size of the largest machine scalar; or
o the value of first_bit shall be zero and the value of last_bit + 1 shall be a multiple of System.Storage_Unit.
!corrigendum 13.5.1(13)
Replace the paragraph:
A record_representation_clause (without the mod_clause) specifies the layout. The storage place attributes (see 13.5.2) are taken from the values of the position, first_bit, and last_bit expressions after normalizing those values so that first_bit is less than Storage_Unit.
by:
A record_representation_clause (without the mod_clause) specifies the layout.
If the default bit ordering applies to the type, the position, first_bit, and last_bit of each component_clause directly specify the position and size of the corresponding component.
If the nondefault bit ordering applies to the type then the layout is determined as follows:
o the component_clauses for which the value of last_bit is greater than or equal to the size of the largest machine scalar directly specify the position and size of the corresponding component;
o for other component_clauses, all of the components having the same value of position are considered to be part of a single machine scalar, located at that position; this machine scalar has a size which is the smallest machine scalar size larger than the largest last_bit for all component_clauses at that position; the first_bit and last_bit of each component_clause are then interpreted as bit offsets in this machine scalar.
!corrigendum 13.5.1(17)
Insert after the paragraph:
The recommended level of support for record_representation_clauses is:
the new paragraph:
o An implementation should support machine scalars that correspond to all of the integer, floating point, and address formats supported by the machine.
!corrigendum 13.5.1(20)
Replace the paragraph:
by:
o For a component with a subtype whose Size is less than the word size, any storage place that does not cross an aligned word boundary should be supported.
!corrigendum 13.5.2(02)
Replace the paragraph:
R.C'Position
Denotes the same value as R.C'Address - R'Address. The value of this attribute is of the type universal_integer.
by:
R.C'Position
If the nondefault bit ordering applies to the composite type, and if a component_clause specifies the placement of C, denotes the value given for the position of the component_clause; otherwise, denotes the same value as R.C'Address - R'Address. The value of this attribute is of the type universal_integer.
!corrigendum 13.5.2(03)
Replace the paragraph:
R.C'First_Bit
Denotes the offset, from the start of the first of the storage elements occupied by C, of the first bit occupied by C. This offset is measured in bits. The first bit of a storage element is numbered zero. The value of this attribute is of the type universal_integer.
by:
R.C'First_Bit
If the nondefault bit ordering applies to the composite type, and if a component_clause specifies the placement of C, denotes the value given for the first_bit of the component_clause; otherwise, denotes the offset, from the start of the first of the storage elements occupied by C, of the first bit occupied by C. This offset is measured in bits. The first bit of a storage element is numbered zero. The value of this attribute is of the type universal_integer.
!corrigendum 13.5.2(04)
Replace the paragraph:
R.C'Last_Bit
Denotes the offset, from the start of the first of the storage elements occupied by C, of the last bit occupied by C. This offset is measured in bits. The value of this attribute is of the type universal_integer.
by:
R.C'Last_Bit
If the nondefault bit ordering applies to the composite type, and if a component_clause specifies the placement of C, denotes the value given for the last_bit of the component_clause; otherwise, denotes the offset, from the start of the first of the storage elements occupied by C, of the last bit occupied by C. This offset is measured in bits. The value of this attribute is of the type universal_integer.
!corrigendum 13.5.3(8)
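Replace the paragraph:
o If Word_Size = Storage_Unit, then the implementation should support the nondefault bit ordering in addition to the default bit ordering.
by:
o The implementation should support the nondefault bit ordering in addition to the default bit ordering.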
NOTES
13 Bit_Order clauses make it possible to write record_representation_clauses that can be ported between machines having different bit ordering. They do not guarantee transparent exchange of data between such machines.
!ACATS Test
Create a test to check that the non-default bit order is supported and creates the correct layout (for IEEE floats, for instance.)
!appendix

!section 13.5.3(00)
!subject controlling bit ordering
!reference RM95-13.5.3
!from Dan Eilers 96-04-24
!reference 96-5511.a Dan Eilers  96-4-24>>
!discussion

There seems to be confusion (as evidenced by recent c.l.a. discussion)
as to which problem the Bit_Order attribute is supposed to solve.
Is it intended to solve:
   1) the "compiler uniformity problem" where a single program on
      a single processor wishes to use unchecked_conversion to
      convert a scalar (e.g. float) object to a record type (e.g.
      in order to extract the sign, exponent, and mantissa), and
      wishes to use a single portable record representation clause
      regardless of the default endianness of the target computer;
or
   2) the "data interoperability problem" where two processors with
      different bit orders need to access shared memory, files,
      devices, or network channels, so one processor has to do
      cumbersome byte flipping?

Problem #1 was accepted as Revision Requirement 2.4 "Controlling
Implementation-Dependent Choices", referring to RR-0137 "Standardize
bit storage/order conventions" and RR-0411 "Express record representation
clauses in a machine-independent way".

Problem #2 was accepted in the Aug 27, 1990 Draft 3.3 as Revision
Requirement 6.2 "Data Interoperability" under Study Topic 25.1
"Data Interoperability":
     For example, there is no control over the bit/byte ordering --
     such control is required in order to deal with the conflicting
     representations between "little-endian" and "big-endian"
     representations.
However, the endianness aspect of Revision Requirement 6.2 was dropped
in the final Revision Requirements document.

Evidence that Problem #1 was intended is: that it is the one
that was accepted as a final requirement; it is the easiest
of the two solutions to implement; and 13.5.3 says nothing about
byte flipping.  Presumably problem #1 is implemented simply
by counting bit offsets from the end of the record rather
than from the front of the record.

Evidence that Problem #2 was intended is that support for
nondefault bit ordering is optional, apparently due to presumed
implementation difficulties.

Note that it isn't possible for a single attribute to solve
both problems, since the solutions are mutually exclusive in
the case where a record field spans a byte boundary.  Such a
field would get flipped for solution #2, and not in Solution #1.

        -- Dan Eilers

****************************************************************

[* Editor's note: This paper does not translate well to text form. To see the
paper in its original form with diagrams, download the file bitorder.pdf from
the ACAA web site - www.ada-auth.org/~acats/grab_bag.html *]


A Proposal for Endian-Portable
Record Representation Clauses
Norman H. Cohen

What problem are we solving?

There are at least two "endian problems". One is the run-time problem of
transferring data between a big-endian machine and a little-endian machine (or
between big-endian and little-endian processes of a biendian machine). Another
is the compile-time problem of specifying a data layout portably, so that a
single version of an Ada source file will produce the same layout regardless of
the compiler's default bit ordering. This is a proposal to solve the second
problem.

What is the semantic difference between big-endian and little-endian
execution?

Most machine instructions operate on small values that we shall call machine
scalars. Typical machine scalars include an 8-bit byte, a 16-bit halfword, and a
32-bit word. (To simplify the presentation, this document will, without loss of
generality, be phrased in terms of an architecture supporting at least these
three kinds of machine scalars.) A machine scalar generally fits in a register
or a pair of registers. In a RISC architecture, all machine scalars are loaded
into registers before being operated upon, while in a CISC architecture, one or
more machine-scalar operands may reside in storage.

The only difference between big-endian and little-endian execution is the
correspondence between a sequence of two or more bytes in memory, starting at a
given address and extending to higher-addressed bytes, and machine-scalar
values. (A sequence of bytes in memory is mapped to a machine-scalar value upon
being loaded into a register or upon being used as an operand of a CISC
register-storage instruction, and a machine-scalar value is mapped to a sequence
of bytes in memory upon being stored.) In little-endian execution, the
lowest-addressed byte corresponds to the low-order eight bits of the
machine-scalar value, while in big-endian execution, the lowest-addressed byte
corresponds to the high-order eight bits of the machine-scalar value.
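The correspondence described above can be observed directly. The following is a minimal sketch, assuming 8-bit storage elements and an implementation that provides Interfaces.Unsigned_8 and Interfaces.Unsigned_16; the procedure and type names are illustrative only.

```ada
with Ada.Text_IO;  use Ada.Text_IO;
with Ada.Unchecked_Conversion;
with Interfaces;   use Interfaces;
with System;

procedure Show_Byte_Order is
   --  Two consecutive bytes, viewed in order of increasing address.
   type Byte_Pair is array (0 .. 1) of Unsigned_8;
   for Byte_Pair'Component_Size use 8;
   for Byte_Pair'Size use 16;

   function To_Bytes is
     new Ada.Unchecked_Conversion (Unsigned_16, Byte_Pair);

   --  High-order byte is 16#01#, low-order byte is 16#02#.
   Bytes : constant Byte_Pair := To_Bytes (16#0102#);
begin
   Put_Line ("Default_Bit_Order: "
             & System.Bit_Order'Image (System.Default_Bit_Order));
   Put_Line ("Lowest-addressed byte:" & Unsigned_8'Image (Bytes (0)));
end Show_Byte_Order;
```

In little-endian execution the lowest-addressed byte would be the low-order byte (2), while in big-endian execution it would be the high-order byte (1).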

What do we mean by the "same" layout, and why is it important?

A program that never views the same bits as belonging to two different types
(and never performs binary I/O, which is tantamount to viewing the raw contents
of a file as the representation of some type) is inherently endian-independent.
That is, barring impediments to portability unrelated to bit order, the program
is portable between big-endian and little-endian machines. A source program that
views the same storage as belonging to more than one type can also be
endian-independent, provided that an appropriate programming discipline is
followed and provided that object code is generated from the source code in a
manner consistent with the target execution bit order.

A program viewing the same storage as belonging to more than one type will
typically depend on properties like the following:
* A given machine scalar occurs at a specified offset within a record.
* The bits of a machine scalar can be subdivided into contiguous fields of
  specified sizes, occurring in a specified order from most significant bits to
  least significant bits.

For example, the DOS 32-bit representation of a date and time can be described
as follows:
* The representation consists of a halfword at offset 0 and a halfword at
  offset 2. (For consistency throughout this document, we refer to a 16-bit
  machine scalar as a "halfword", even though it is called a "word" in the Intel
  architecture.)
* In the halfword at offset 0, the high-order seven bits give the number of
  years since 1980, the middle four bits give the month of the year, and the
  low-order five bits give the day of the month.
* In the halfword at offset 2, the high-order five bits give the hour of the
  day, the middle six bits give the minute of the hour, and the low-order five
  bits give the number of two-second units within the minute.

Notice that we have described the properties of this data representation in a
manner that is independent of big-endian and little-endian conventions. We shall
refer to sets of properties that can be described in this way as
endian-independent layout specifications. Suppose a compiler could be
instructed, using notation independent of the bit ordering on the target
machine, to produce a storage layout that obeys a given set of
endian-independent layout specifications on the target machine. Then the source
file of a program that depended only upon those specifications could be compiled
for either a big-endian target or a little-endian target.

There are many ways in which programs can exploit endian-independent layout
specifications. A program might depend on the date information residing at the
lower-addressed halfword so that pointers to date-and-time structures could be
passed to subprograms expecting only pointers to structures containing the three
date components. (This is a form of homegrown polymorphism through record
extension.) A program might depend on the left-to-right ordering of components
within a halfword so that two dates, or two times, could be compared by a single
16-bit unsigned-integer comparison.
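The single-comparison idea can be sketched as follows. This is an illustration under stated assumptions, not part of the original paper: it presumes that the compiler accepts the Bit_Order clause and interprets the bit numbers as offsets within a 16-bit machine scalar, and that Interfaces.Unsigned_16 is available. The type and procedure names are hypothetical.

```ada
with Ada.Text_IO;  use Ada.Text_IO;
with Ada.Unchecked_Conversion;
with Interfaces;   use Interfaces;
with System;

procedure Compare_Dates is
   type Years_Type is range 0 .. 127;  --  7 bits
   type Month_Type is range 1 .. 12;   --  4 bits
   type Day_Type   is range 1 .. 31;   --  5 bits

   type Date_Type is record
      Years_Since_1980 : Years_Type;
      Month            : Month_Type;
      Day_Of_Month     : Day_Type;
   end record;

   --  Number bits from the most significant end of the halfword.
   for Date_Type'Bit_Order use System.High_Order_First;
   for Date_Type use record
      Years_Since_1980 at 0 range 0 .. 6;
      Month            at 0 range 7 .. 10;
      Day_Of_Month     at 0 range 11 .. 15;
   end record;
   for Date_Type'Size use 16;

   function To_U16 is
     new Ada.Unchecked_Conversion (Date_Type, Unsigned_16);

   Earlier : constant Date_Type := (15, 12, 31);  --  31 Dec 1995
   Later   : constant Date_Type := (16, 1, 1);    --   1 Jan 1996
begin
   --  One unsigned comparison replaces three field comparisons.
   Put_Line (Boolean'Image (To_U16 (Earlier) < To_U16 (Later)));
end Compare_Dates;
```

The comparison is meaningful only because the year occupies the most significant bits of the halfword, followed by the month and then the day; if the layout follows that specification, the earlier date compares as smaller.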

Endian-independent layout specifications can be represented graphically by
vertically stacking machine scalars that must occur at specified byte offsets
(drawing machine scalars with lower offsets at the top) and by drawing fields
within machine scalars so that fields with more significant bits are to the left
of fields with less significant bits. For example, the DOS 32-bit representation
of a date and time can be depicted as follows:

Endian-Independent Layout Specification 1:

(* Diagram omitted *)

When we say that a storage representation is "the same" on both a big-endian and
little-endian machine, we mean, in effect, that it is described in each case by
the same such picture.

This picture expresses not the physical positions of components in memory, but
the endian-independent layout specifications upon which the program depends. For
example, we might also have drawn the DOS representation of a date and time as
follows:

Endian-Independent Layout Specification 2:

(* Diagram omitted *)

This picture corresponds to exactly the same memory mapping on a big-endian
machine, but it specifies a different set of layout specifications to be
preserved across bit orders, and thus a different memory mapping on a
little-endian machine:

Little-endian physical representation of
Endian-Independent Layout Specification 1:

(* Diagram omitted *)

Big-endian physical representation of
Endian-Independent Layout Specification 1:

(* Diagram omitted *)

Big-endian physical representation of
Endian-Independent Layout Specification 2:

(* Diagram omitted *)

Little-endian physical representation of
Endian-Independent Layout Specification 2:

(* Diagram omitted *)

Endian-Independent Layout Specification 2 specifies the left-to-right ordering
of all six components within a single 32-bit machine scalar, and thus allows two
time-and-date values to be compared by a single 32-bit unsigned-integer
comparison. However, Endian-Independent Layout Specification 2 does not
stipulate that the three date components reside in the halfword at offset 0, so
a program that uses pointers to time-and-date structures as if they were
pointers to date structures will not be portable to little-endian machines.

As this example illustrates, certain combinations of layout specifications can
be maintained consistently when changing bit order, and others cannot. It is
impossible to preserve both the left-to-right ordering of all six components and
the offset of the halfword containing the date components across both bit
orderings. Thus one cannot write an endian-independent program depending on all
of these properties. However, layout specifications that specify only the size
and location of nonoverlapping machine scalars, plus the size and left-to-right
ordering of fields within those machine scalars, can be maintained consistently
when changing bit order.

Each endian-independent layout specification determines one big-endian memory
mapping and one little-endian memory mapping. A given memory mapping on a given
machine may satisfy many different endian-independent layout specifications.
Distinct endian-independent layout specifications that are satisfied by the same
memory mapping in one bit ordering are satisfied by distinct memory mappings in
the opposite bit ordering.

(* Diagram omitted *)

Bit numbers within record-representation clauses

A storage layout for a record type can be specified in Ada by a
record-representation clause. Such a clause specifies the location of each
component in memory in terms of a specified range of bits, numbered relative to
bit zero of the storage unit (on typical architectures, the byte) at a specified
offset. However, whether bits are numbered left-to-right or right-to-left by
default depends on the compiler. Thus, Endian-Independent Layout Specification 1
might be specified by the record-representation clause

for Date_And_Time_Type use
  record
    Years_Since_1980 at 0 range 0 .. 6;
    Month at 0 range 7 .. 10;
    Day_Of_Month at 0 range 11 .. 15;
    Hour at 2 range 0 .. 4;
    Minute at 2 range 5 .. 10;
    Seconds at 2 range 11 .. 15;
  end record;

for a big-endian target machine, and by the record-representation clause

for Date_And_Time_Type use
  record
    Years_Since_1980 at 0 range 9 .. 15;
    Month at 0 range 5 .. 8;
    Day_Of_Month at 0 range 0 .. 4;
    Hour at 2 range 11 .. 15;
    Minute at 2 range 5 .. 10;
    Seconds at 2 range 0 .. 4;
  end record;

for a little-endian target machine.

We would prefer to be able to write and maintain a single record-representation
clause that would produce the appropriate memory mapping for the target machine
whether that machine is big-endian or little-endian. At first glance, the
bit-order clause, a new feature of Ada 95, appears to provide a solution: The
bit-order clause
  for Date_And_Time_Type'Bit_Order use High_Order_First;
specifies that bit numbers should be interpreted according to big-endian
conventions in a record-representation clause for Date_And_Time_Type (i.e., with
bit 0 being the high-order bit of a byte), regardless of the compiler's default
bit order, while the bit-order clause
  for Date_And_Time_Type'Bit_Order use Low_Order_First;
specifies that bit numbers should be interpreted according to little-endian
conventions. Thus, either the first bit-order clause followed by the first
record-representation clause, or the second bit-order clause followed by the
second record-representation clause, should produce the appropriate storage
layout for the target, regardless of whether the target is big-endian or
little-endian.

Unfortunately, matters are not so simple. The Ada standard gives compilers for
most machines permission to reject a bit-order clause that specifies the
nondefault bit order. In the nondefault bit order, there are multiple, distinct
meanings we can ascribe to bit numbers greater than or equal to the number of
bits in a byte. The drafters of the Ada 95 standard implicitly ascribed a
meaning that allows the specification of impractical memory mappings. To avoid
requiring compilers to support these impractical mappings, they chose not to
require compilers to support nondefault bit orders.

The meaning of large bit numbers in record representation clauses

The meaning of large bit numbers in the default bit order is well understood:
Assuming eight-bit bytes, bit 8+b of byte a is the same bit as bit b of byte
a+1. Consequently, there are redundant ways to specify the same storage layout.
For example, the big-endian record-representation clause could have been written
equivalently (for a compiler whose default bit order is big-endian) as follows:

for Date_And_Time_Type use
  record
    Years_Since_1980 at 0 range 0 .. 6;
    Month at 0 range 7 .. 10;
    Day_Of_Month at 1 range 3 .. 7; -- was 0 range 11 .. 15
    Hour at 2 range 0 .. 4;
    Minute at 2 range 5 .. 10;
    Seconds at 3 range 3 .. 7; -- was 2 range 11 .. 15
  end record;

Our proposal exploits this redundancy to express endian-independent layout
specifications. Under this proposal, the two big-endian record-representation
clauses specify the same memory mapping on a big-endian target, but they
correspond to different endian-independent layout specifications.
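
The default-bit-order rule can be checked mechanically. The following is a
minimal sketch, assuming eight-bit storage units; the procedure names
Normalize_Demo and Show are hypothetical and are not part of the proposal. It
rewrites a component clause so that its first bit number is less than 8, using
only the rule that bit 8+b of byte a is bit b of byte a+1.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Normalize_Demo is

   Bits_Per_Byte : constant := 8;  --  assumption: eight-bit storage units

   --  Print the clause "at Offset range First .. Last" in the form whose
   --  first bit number is below Bits_Per_Byte (default bit order only).
   procedure Show (Name : String; Offset, First, Last : Natural) is
      Byte : constant Natural := Offset + First / Bits_Per_Byte;
      Lo   : constant Natural := First mod Bits_Per_Byte;
      Hi   : constant Natural := Lo + (Last - First);
   begin
      Put_Line (Name & " at" & Natural'Image (Byte)
                & " range" & Natural'Image (Lo)
                & " .." & Natural'Image (Hi));
   end Show;

begin
   Show ("Day_Of_Month", 0, 11, 15);  --  Day_Of_Month at 1 range 3 .. 7
   Show ("Seconds", 2, 11, 15);       --  Seconds at 3 range 3 .. 7
   Show ("Month", 0, 7, 10);          --  Month at 0 range 7 .. 10 (unchanged)
end Normalize_Demo;
```

The first two calls reproduce the rewritten Day_Of_Month and Seconds clauses
shown above; Month is unchanged because its first bit already lies in byte 0.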

While the meaning of large bit numbers is obvious in the default bit order, it
is not obvious in the nondefault bit order. Suppose we compile a big-endian
record-representation clause, together with a big-endian bit-order clause, for a
little-endian target. The record-representation clause includes the component
clause
  Minute at 2 range 5 .. 10;
specifying that the Minute component is to occupy the following six bits:
* bit 5 (big endian) of byte 2
* bit 6 (big endian) of byte 2
* bit 7 (big endian) of byte 2
* "bit 8" (big endian) of byte 2
* "bit 9" (big endian) of byte 2
* "bit 10" (big endian) of byte 2

If we adhere to the definition that bit 8+b of byte a is the same bit as bit b
of byte a+1, then this is equivalent to the following bits:
* bit 5 (big endian) of byte 2
* bit 6 (big endian) of byte 2
* bit 7 (big endian) of byte 2
* bit 0 (big endian) of byte 3
* bit 1 (big endian) of byte 3
* bit 2 (big endian) of byte 3

Under big-endian bit-numbering rules, these are the three rightmost (i.e., least
significant) bits of byte 2 and the three leftmost (i.e., most significant) bits
of byte 3. However, on a little-endian target, these two groups of bits will not
be adjacent in a machine scalar corresponding to bytes 2 and 3. This would make
it difficult to extract and update the Minute component. (An analogous problem
arises if we try to compile a little-endian record-representation clause with a
little-endian bit-order clause for a big-endian machine.)
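
The gap can be made concrete with a little arithmetic. The sketch below assumes
eight-bit bytes and a little-endian target that loads bytes 2 and 3 as one
16-bit machine scalar, with byte 2 as the low-order byte. The names
Noncontiguous_Demo and Significance are hypothetical. For each big-endian bit
number of the Minute component, it prints the power of two that the bit
represents in that machine scalar.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Noncontiguous_Demo is

   Bits_Per_Byte : constant := 8;  --  assumption: eight-bit bytes

   --  Significance, within a little-endian machine scalar starting at
   --  byte 2, of big-endian bit number Bit counted from byte 2, using the
   --  rule that bit 8+b of byte a is bit b of byte a+1.
   function Significance (Bit : Natural) return Natural is
      Byte_Index : constant Natural := Bit / Bits_Per_Byte;
      In_Byte    : constant Natural := Bit mod Bits_Per_Byte;
   begin
      --  Big-endian bit b of a byte has weight 2 ** (7 - b) in that byte;
      --  each higher-addressed byte is 8 bits more significant.
      return Bits_Per_Byte * Byte_Index + (Bits_Per_Byte - 1 - In_Byte);
   end Significance;

begin
   for Bit in 5 .. 10 loop
      Put_Line ("bit" & Natural'Image (Bit)
                & " -> 2 **" & Natural'Image (Significance (Bit)));
   end loop;
end Noncontiguous_Demo;
```

Bits 5, 6, and 7 map to significances 2, 1, and 0, while bits 8, 9, and 10 map
to 15, 14, and 13, so the six bits form two separate groups at opposite ends of
the machine scalar.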

To avoid this problem, the drafters of the Ada 95 standard chose to reject
mandatory support for nondefault bit orders. We prefer to reject the definition
of bit 8+b of byte a to be the same bit as bit b of byte a+1 in the nondefault
bit order. We adopt a different definition that turns out to be equivalent to
that definition in the default bit order, but different, and more useful, in the
nondefault bit order.

Avoiding noncontiguous bit fields

Contiguity of bits is a meaningful notion within a machine scalar, or within a
single byte of memory, but not between bits in different bytes of memory. The
notion of bit position within a byte is inherent in the binary representation of
the numeric value contained in the byte. Our view of bit contiguity within
memory is conveyed by depicting memory with bits arranged left-to-right in bytes
that are stacked vertically:

		byte 0
		byte 1
		byte 2

Although bits in different bytes may correspond to adjacent bits in a machine
scalar (i.e., the bits become adjacent when loaded into a register), there is no
notion of two bits in different bytes of memory being adjacent. (In the context
of a single machine with a known byte order, it is common practice to depict
bytes side by side, with addresses increasing left-to-right for big-endian
machines and right-to-left for little-endian machines. These depictions are
convenient because bits are depicted in memory in the same left-to-right order
as in the corresponding machine scalar. However, such depictions are meaningless
for machines of opposite byte order.)

Consequently, the notion of a range of bits only makes sense with respect to a
machine scalar. When we say
  Minute at 2 range 5 .. 10;
we are referring to a range of bits in a machine scalar corresponding to the
memory beginning at offset 2. This definition refers to a contiguous range of
bits in the machine scalar regardless of the bit ordering.

In the default bit order, we number bits from zero starting in the
lowest-addressed byte of the machine scalar. Thus we can readily determine the
number assigned to a bit position at a given distance from the low-address end
of the machine scalar. However, in the nondefault bit order, we number bits from
zero starting in the highest-addressed byte of the machine scalar, so the number
assigned to a bit position at a given distance from the low-address end of a
machine scalar depends on the length of the machine scalar.

Therefore, we adopt the following convention: In a record-representation clause
for the nondefault bit order, there is a one-to-one correspondence between byte
offsets (i.e., the numbers appearing between the words "at" and "range") and
machine scalars. That is, all components whose positions are specified with the
same byte offset are assumed to be part of the same machine scalar (so that in
typical implementations they will be loaded into a register together); and any
two components required to reside within the same machine scalar have their
positions specified in terms of the same byte offset. The length of a machine
scalar is inferred from the highest bit number specified along with its byte
position in some component clause, rounded up to the next multiple of
System.Storage_Unit.
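
The convention can be sketched in a few lines of Ada. The example below is
illustrative only: it assumes System.Storage_Unit is 8, and the names
Machine_Scalar_Demo, Clause, and Scalar_Size are hypothetical. It takes the
component clauses of the High_Order_First example, infers the length of the
machine scalar at each byte offset, and prints the range each component
occupies when the same bits are numbered in the opposite bit order (bit n
becomes bit Size - 1 - n).

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Machine_Scalar_Demo is

   Storage_Unit : constant := 8;  --  assumption: System.Storage_Unit = 8

   type Clause is record
      Offset, First, Last : Natural;
   end record;

   type Clause_List is array (Positive range <>) of Clause;

   --  Years_Since_1980, Month, Day_Of_Month, Hour, Minute, Seconds,
   --  as written in the High_Order_First record-representation clause.
   Clauses : constant Clause_List :=
     ((0, 0, 6), (0, 7, 10), (0, 11, 15),
      (2, 0, 4), (2, 5, 10), (2, 11, 15));

   --  Length of the machine scalar at Offset: one more than the highest
   --  bit number used with that offset, rounded up to a multiple of
   --  Storage_Unit.
   function Scalar_Size (Offset : Natural) return Natural is
      Highest : Natural := 0;
   begin
      for I in Clauses'Range loop
         if Clauses (I).Offset = Offset
           and then Clauses (I).Last > Highest
         then
            Highest := Clauses (I).Last;
         end if;
      end loop;
      return (Highest / Storage_Unit + 1) * Storage_Unit;
   end Scalar_Size;

begin
   for I in Clauses'Range loop
      declare
         C    : constant Clause  := Clauses (I);
         Size : constant Natural := Scalar_Size (C.Offset);
      begin
         Put_Line ("at" & Natural'Image (C.Offset)
                   & " range" & Natural'Image (Size - 1 - C.Last)
                   & " .." & Natural'Image (Size - 1 - C.First));
      end;
   end loop;
end Machine_Scalar_Demo;
```

Both machine scalars are inferred to be 16 bits long, and the printed ranges
(9 .. 15, 5 .. 8, 0 .. 4 at offset 0; 11 .. 15, 5 .. 10, 0 .. 4 at offset 2)
agree with the little-endian record-representation clause given earlier.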

(This approach requires the explicit declaration of "filler" fields when the
byte of a machine scalar containing the high-numbered bits is to be left unused.
If no use is made of the record component Filler, the declarations

  for R'Bit_Order use X;
  for R use
    record
      C at 0 range 0 .. 23;
    end record;
  for R'Size use 32;

and the declarations

  for R'Bit_Order use X;
  for R use
    record
      C at 0 range 0 .. 23;
      Filler at 0 range 24 .. 31;
    end record;
  for R'Size use 32;

are equivalent on compilers whose default bit order is X. However, on compilers
with the opposite default bit order, the first set of declarations places C in
the bytes at offsets 0, 1, and 2, while the second set of declarations places C
in the bytes at offsets 1, 2, and 3.)
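
The effect of the filler field follows from the same arithmetic. This is a
rough sketch, assuming System.Storage_Unit is 8 and a compiler whose default
bit order is the opposite of X; the names Filler_Demo and Show are
hypothetical. It infers the machine-scalar length from the highest bit number
used at offset 0 and reports which bytes hold component C (bits 0 .. 23 in the
nondefault numbering).

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Filler_Demo is

   Storage_Unit : constant := 8;  --  assumption: System.Storage_Unit = 8

   procedure Show (Label : String; Highest_Bit : Natural) is
      --  Machine-scalar length inferred from the highest bit number.
      Size : constant Natural :=
        (Highest_Bit / Storage_Unit + 1) * Storage_Unit;
      --  C is "range 0 .. 23" in the nondefault order; renumber its end
      --  points in the default order, where bit n lies in byte n / 8.
      First_Default : constant Natural := Size - 1 - 23;
      Last_Default  : constant Natural := Size - 1 - 0;
   begin
      Put_Line (Label & ": C occupies bytes"
                & Natural'Image (First_Default / Storage_Unit)
                & " .." & Natural'Image (Last_Default / Storage_Unit));
   end Show;

begin
   Show ("without Filler", 23);  --  without Filler: C occupies bytes 0 .. 2
   Show ("with Filler", 31);     --  with Filler: C occupies bytes 1 .. 3
end Filler_Demo;
```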

This interpretation of bit numbers makes it feasible to require compiler support
for the nondefault bit order. We can then write portable record-representation
clauses. These representation clauses correspond directly to endian-independent
layout specifications: Endian-Independent Layout Specification 1 can be written
portably either as

for Date_And_Time_Type'Bit_Order use High_Order_First;
for Date_And_Time_Type use
  record
    Years_Since_1980 at 0 range 0 .. 6;
    Month at 0 range 7 .. 10;
    Day_Of_Month at 0 range 11 .. 15;
    Hour at 2 range 0 .. 4;
    Minute at 2 range 5 .. 10;
    Seconds at 2 range 11 .. 15;
  end record;

or as

for Date_And_Time_Type'Bit_Order use Low_Order_First;
for Date_And_Time_Type use
  record
    Years_Since_1980 at 0 range 9 .. 15;
    Month at 0 range 5 .. 8;
    Day_Of_Month at 0 range 0 .. 4;
    Hour at 2 range 11 .. 15;
    Minute at 2 range 5 .. 10;
    Seconds at 2 range 0 .. 4;
  end record;

Endian-Independent Layout Specification 2 can be written portably either as

for Date_And_Time_Type'Bit_Order use High_Order_First;
for Date_And_Time_Type use
  record
    Years_Since_1980 at 0 range 0 .. 6;
    Month at 0 range 7 .. 10;
    Day_Of_Month at 0 range 11 .. 15;
    Hour at 0 range 16 .. 20;
    Minute at 0 range 21 .. 26;
    Seconds at 0 range 27 .. 31;
  end record;

or as

for Date_And_Time_Type'Bit_Order use Low_Order_First;
for Date_And_Time_Type use
  record
    Years_Since_1980 at 0 range 25 .. 31;
    Month at 0 range 21 .. 24;
    Day_Of_Month at 0 range 16 .. 20;
    Hour at 0 range 11 .. 15;
    Minute at 0 range 5 .. 10;
    Seconds at 0 range 0 .. 4;
  end record;

In all four record-representation clauses, each distinct byte-offset value
corresponds to a distinct machine scalar (i.e., to a line of an
endian-independent-layout-specification picture), and the ranges associated with
a given byte offset correspond to the position of a field within that machine
scalar.

****************************************************************

From: Robert Dewar
Sent: Tuesday, July 3, 2001  7:42 PM

We suddenly had two of our large customers ask, just days apart, whether
there was a way of controlling bit ordering in arrays.

The answer of course is no, but it does seem that it would be perfectly
reasonable to allow the specification of the Bit_Order attribute for an
array type ...

Thoughts?

****************************************************************

From: Randy Brukardt
Sent: Tuesday, July 3, 2001  8:14 PM

Could you be a bit more specific on what the need/problem is? Off-hand, I
can't think of anything having to do with arrays for which the bit ordering
would matter. In particular, you don't specify bit numbers of arrays as you
do with records.

****************************************************************

From: Robert Dewar
Sent: Tuesday, July 3, 2001  8:47 PM

   type x is array (0 .. 7) of Boolean;
   pragma Pack (x);

now, which bit is x(0)?

****************************************************************

From: Robert Duff
Sent: Thursday, July 5, 2001  11:52 AM

So it matters if you unchecked_convert to a type T is range 0..2**7-1.
The Bit_Order indicates whether x(0) is the low- or high-order bit
of the integer.  Right?

Now what if it's not packed?  Eg:

    X: array (0 .. 15) of Boolean;

Does Bit_Order control whether the array is stored backwards in memory
(i.e., whether X(0)'Address = X'Address + 15)?

If we have:

    X: array (1 .. 100) of Character;
    Y: array (1 .. 100) of Character;

and X and Y are of different Bit_Order, and we unchecked convert X to Y,
will Y(100) = X(1), and Y(99) = X(2)?

Or what if it's packed, and bigger than a storage unit, or bigger than a
word?

I guess I'm confused about what the semantics should be when crossing
byte or word boundaries.  I'm not sure I understand the issues for
records, either.  :-(

****************************************************************

From: Robert Dewar
Sent: Thursday, July 5, 2001  12:26 PM

Well bit order for records simply controls the numbering of bits WITHIN
a storage unit, it has no effect on numbering of bytes. If you have fields
that cross storage unit boundaries, there are two cases:

1. The easy case, where the field occupies an integral number of bytes and
completely occupies these bytes, e.g. a 32 bit field occupying four bytes.
In this case, bit order has no relevance in any case, since the field will
simply occupy these four bytes.

2. The hard case, where the field occupies part of a byte and crosses
byte boundaries. In this case the specification of a non-standard bit
order results in non-contiguous fields, and is a mess. GNAT simply disallows
the specification of bit order for any record with such a field.

For arrays, I think it only makes sense to worry about the case of 1,2,4
bits where every element lies entirely within a storage unit, and in that
case bit order makes perfectly good sense.

****************************************************************

From: Robert Dewar
Sent: Wednesday, February 25, 2004  5:31 PM

> Is this AI waiting for someone to do some further work?
> My understanding is that Norm Cohen wrote up a good solution
> several years ago, but nothing much seems to have happened since then.

I remember Norm describing how to control bit ordering within
the current language, but I do not remember any suggestions of
language features.

What customers want of course is some magic incantation to allow
their big endian dependent apps to run on little endian machines
without any change. They are not going to get this :-)

There are some things that would be useful, though whether we
should mandate them, I don't know.

It would be useful to be able to apply Bit_Order to a bit packed
array to number the bits in opposite order. Perhaps this could be
a more general feature of indexing arrays backwards (interface to
Fortran-2 anyone? :-)

It might also be useful to be able to apply Bit_Order to a discrete
type meaning that you allow Little-endian integers for example on
a big-endian machine. Note that bit order and byte order must always
be consistent, so controlling the bit order as a way of talking about
controlling byte order is fine.

It is definitely mysterious to people that if they declare a record

    type R is record
       A, B : Integer;
    end record;

    for R'Bit_Order use ...

the Bit_Order spec has no effect at all. In fact this is sufficiently
odd that in GNAT we warn that the pragma has no effect on fields A
and B.

****************************************************************

From: Dan Eilers
Sent: Thursday, February 25, 2004  12:30 PM


Robert Dewar wrote:
> I remember Norm describing how to control bit ordering within
> the current language, but I do not remember any suggestions of
> language features.
>
> What customers want of course is some magic incantation to allow
> their big endian dependent apps to run on little endian machines
> without any change. They are not going to get this :-)

AI-133 points out that there are two different things a user might
want, one of which is as you describe, wanting an app running on
a little endian machine to magically do some sort of byte flipping
so that it can deal with data represented in big-endian order.
And yes, they are not going to get this.

The other thing that a user might want is to be able to write a
record rep clause, for example describing the layout of an
IEEE 32-bit float, or other "machine scalar" type, and want
to specify the record layout in little-endian format using the
existing Ada95 bit-order attribute, and be able to compile such a
program on either a big- or little- endian machine and be able to
use such record type to extract the bit fields of such an object.
In this case, there is no byte flipping going on.

Norm Cohen's paper, which was added to the AI in June, 1999,
(see http://www.ada-auth.org/ai-files/grab_bag/bitorder.pdf)
shows that it is reasonable to expect a compiler to support
endian-independent record rep clauses for machine scalar types,
using a corrected definition of the existing bit-order attribute.

> It would be useful to be able to apply Bit_Order to a bit packed
> array to number the bits in opposite order. Perhaps this could be
> a more general feature of indexing arrays backwards (interface to
> Fortran-2 anyone? :-)

Yes, this would also be useful (VHDL does this, for example).
But that would be a separate AI.

****************************************************************

From: Dan Eilers
Sent: Monday, March  1, 2004  12:57 PM


AI-133 has a note:
> [Author's note: Randy, would it be possible to run a compiler survey to see
> what compilers do with Bit_Order clauses that specify the nondefault bit
> ordering? I suppose that you could use the above example of the IEEE 32-bit
> format.]


Here is a proposed test program:



with system;
with unchecked_conversion;
with text_io; use text_io;
procedure ai133 is

   type flt is digits 6;
   for flt'size use 32;

   type IEEE_32_big_endian is
      record
         Sign : Integer range 0 .. 1;
         Exponent : Integer range 0 .. 2**8 - 1;
         Mantissa : Integer range 0 .. 2**23 - 1;
      end record;

   for IEEE_32_big_endian'bit_order use system.high_order_first;
   for IEEE_32_big_endian use
      record
         Sign at 0 range 0 .. 0;
         Exponent at 0 range 1 .. 8;
         Mantissa at 0 range 9 .. 31;
      end record;

   type IEEE_32_little_endian is
      record
         Sign : Integer range 0 .. 1;
         Exponent : Integer range 0 .. 2**8 - 1;
         Mantissa : Integer range 0 .. 2**23 - 1;
      end record;

   for IEEE_32_little_endian'bit_order use system.low_order_first;
   for IEEE_32_little_endian use
      record
         Sign at 0 range 31 .. 31;
         Exponent at 0 range 23 .. 30;
         Mantissa at 0 range 0 .. 22;
      end record;

   function conv is new unchecked_conversion(flt, IEEE_32_big_endian);
   function conv is new unchecked_conversion(flt, IEEE_32_little_endian);

   data: constant array(integer range <>) of flt :=
           (0.0, 0.0005, -1.0, 2.5, -37.25, 12345.75);
   ok: boolean := true;

begin
   for i in data'range loop
      declare
         x: flt := data(i);
         y_b: IEEE_32_big_endian := conv(x);
         y_l: IEEE_32_little_endian := conv(x);
      begin
         if y_b.sign /= y_l.sign then
            put_line("FAILED, problem with sign");
            ok := false;
         end if;
         if y_b.exponent /= y_l.exponent then
            put_line("FAILED, problem with exponent");
            put_line(integer'image(y_b.exponent));
            put_line(integer'image(y_l.exponent));
            ok := false;
         end if;
         if y_b.mantissa /= y_l.mantissa then
            put_line("FAILED, problem with mantissa");
            ok := false;
         end if;
         if flt'exponent(x)+126 /= y_b.exponent and not
            (y_b.exponent = 0 and flt'exponent(x) = 0) then
            put_line("FAILED, wrong exponent");
            ok := false;
         end if;
         if (y_b.sign = 0 and flt'fraction(x) < 0.0) or
            (y_b.sign = 1 and flt'fraction(x) >= 0.0) then
            put_line("FAILED, wrong sign");
            ok := false;
         end if;
       end;
   end loop;
   if ok then put_line("PASSED"); end if;
end;

****************************************************************

From: Robert Dewar
Sent: Monday, March  1, 2004  6:04 PM

GNAT says

gcc -c ai133.adb
ai133.adb:20:19: attempt to specify non-contiguous field not permitted
ai133.adb:20:19: (caused by non-standard Bit_Order specified)
ai133.adb:21:19: attempt to specify non-contiguous field not permitted
ai133.adb:21:19: (caused by non-standard Bit_Order specified)
gnatmake: "ai133.adb" compilation error

This is as we expect. The RM specifically does not require support of
non-contiguous fields, and this seems like a reasonable choice to me!

****************************************************************

From: Dan Eilers
Sent: Monday, March  1, 2004  6:13 PM

The AI does _not_ call for support of non-contiguous fields!
It calls for "proper" interpretation of component clauses in
non-default bit order, such that consecutive bit offsets are
contiguous.  This results in useful endian-independent rep clauses
rather than un-useful errors about supposed attempts to specify
non-contiguous fields.

****************************************************************

From: Robert Dewar
Sent: Monday, March  1, 2004  6:59 PM

OK, but GNAT implements the current definition in the RM, and the test
program you have clearly tries to make non-contiguous fields according
to that definition if you have a byte addressed machine.

So if you have some other meaning in mind, it would definitely be
seriously incompatible (we find lots of our customers using bit order
clauses with the meaning in the RM, so if this is changed, I would want
to be very sure that there is not some serious incompatibility
introduced).

****************************************************************

From: Dan Eilers
Sent: Monday, March  1, 2004  7:58 PM

I haven't found any incompatibility problem.
GNAT rejects non-default order rep clauses where fields span a
byte boundary (as in the test case I sent), so these aren't a
compatibility concern.

When the fields don't cross byte boundaries, as in the following
example, GNAT interprets the component clauses consistent with AI-133.



with system;
with unchecked_conversion;
with text_io; use text_io;
procedure ai133b is

   type uint32 is range 0..2**32-1;
   for uint32'size use 32;

   type byte is range 0..255;
   for byte'size use 8;

   type rec is record
     a,b,c,d: byte;
   end record;
   for rec'size use 32;

   type rec_big_endian is
      record
         a,b,c,d: byte;
      end record;

   for rec_big_endian'bit_order use system.high_order_first;
   for rec_big_endian use
      record
         a at 0 range 0 .. 7;
         b at 0 range 8 ..15;
         c at 0 range 16..23;
         d at 0 range 24..31;
      end record;

   type rec_little_endian is
      record
         a,b,c,d: byte;
      end record;

   for rec_little_endian'bit_order use system.low_order_first;
   for rec_little_endian use
      record
         a at 0 range 24..31;
         b at 0 range 16..23;
         c at 0 range  8..15;
         d at 0 range  0.. 7;
      end record;

   function conv is new unchecked_conversion(uint32, rec_big_endian);
   function conv is new unchecked_conversion(uint32, rec_little_endian);

   ok: boolean := true;

   x: uint32 := 3*(1+2**8+2**16+2**24);
   y_b: rec_big_endian := conv(x);
   y_l: rec_little_endian := conv(x);
begin
   if y_b.a /= y_l.a then
      put_line("FAILED, problem with a");
      ok := false;
   end if;
   if y_b.b /= y_l.b then
      put_line("FAILED, problem with b");
      ok := false;
   end if;
   if y_b.c /= y_l.c then
      put_line("FAILED, problem with c");
      ok := false;
   end if;
   if y_b.d /= y_l.d then
      put_line("FAILED, problem with d");
      ok := false;
   end if;

   if ok then put_line("PASSED"); end if;
end;

****************************************************************

From: Dan Eilers
Sent: Monday, March  1, 2004  8:12 PM

I spoke too soon.
There _is_ a compatibility problem for GNAT when the fields don't
cross byte boundaries.  My example just didn't expose it.
This one does.



with system;
with unchecked_conversion;
with text_io; use text_io;
procedure ai133c is

   type uint32 is range 0..2**32-1;
   for uint32'size use 32;

   type byte is range 0..255;
   for byte'size use 8;

   type rec is record
     a,b,c,d: byte;
   end record;
   for rec'size use 32;

   type rec_big_endian is
      record
         a,b,c,d: byte;
      end record;

   for rec_big_endian'bit_order use system.high_order_first;
   for rec_big_endian use
      record
         a at 0 range 0 .. 7;
         b at 0 range 8 ..15;
         c at 0 range 16..23;
         d at 0 range 24..31;
      end record;

   type rec_little_endian is
      record
         a,b,c,d: byte;
      end record;

   for rec_little_endian'bit_order use system.low_order_first;
   for rec_little_endian use
      record
         a at 0 range 24..31;
         b at 0 range 16..23;
         c at 0 range  8..15;
         d at 0 range  0.. 7;
      end record;

   function conv is new unchecked_conversion(uint32, rec_big_endian);
   function conv is new unchecked_conversion(uint32, rec_little_endian);

   ok: boolean := true;

   x: uint32 := 1+3*2**8+5*2**16+7*2**24;
   y_b: rec_big_endian := conv(x);
   y_l: rec_little_endian := conv(x);
begin
   if y_b.a /= y_l.a then
      put_line("FAILED, problem with a");
      ok := false;
   end if;
   if y_b.b /= y_l.b then
      put_line("FAILED, problem with b");
      ok := false;
   end if;
   if y_b.c /= y_l.c then
      put_line("FAILED, problem with c");
      ok := false;
   end if;
   if y_b.d /= y_l.d then
      put_line("FAILED, problem with d");
      ok := false;
   end if;

   if ok then put_line("PASSED"); end if;
end;

****************************************************************

From: Robert Dewar
Sent: Monday, March  1, 2004  9:02 PM

Well I will have to analyze this more carefully. Do you think GNAT
is wrong with respect to the definition in the Ada 95 RM -- it seems
right to me. All the attribute does in Ada 95 is to renumber the
bits in an addressing unit ...

****************************************************************

From: Randy Brukardt
Sent: Monday, March  1, 2004  9:38 PM

...
> Here is a proposed test program:
...
>    type IEEE_32_big_endian is
>       record
>          Sign : Integer range 0 .. 1;
>          Exponent : Integer range 0 .. 2**8 - 1;
>          Mantissa : Integer range 0 .. 2**23 - 1;
>       end record;

Minor nit: This assumes that Integer is 32-bit type, which is not guaranteed
by the language and isn't true on at least one compiler...

****************************************************************

From: Dan Eilers
Sent: Tuesday, March  2, 2004  11:58 AM

Robert Dewar wrote:
> Well I will have to analyze this more carefully. Do you think GNAT
> is wrong with respect to the definition in the Ada 95 RM -- it seems
> right to me. All the attribute does in Ada 95 is to renumber the
> bits in an addressing unit ...

The existing RM wording is muddled, and subject to multiple interpretations.
RM 13.5.3 paragraph 2, says that bit_orders of high_order_first and
low_order_first correspond to big endian and little endian, respectively.

GNAT gives the warning:
   warning: Bit_Order clause does not affect byte ordering

But it is nonsensical to interpret "big endian" and "little endian"
in RM 13.5.3(2) as not referring to byte ordering.

If this turns out to be too big an incompatibility for GNAT users,
then it might make sense to add a new attribute byte_order, which
does what AI 133 prescribes for bit_order, and then make bit_order
obsolete.

But I don't think the incompatibility is particularly bad.
For cases where fields do cross byte boundaries there is no
incompatibility because GNAT rejects those.  And GNAT users
are unlikely to use non-default bit_order rep clauses where
fields don't cross byte boundaries, because GNAT warns that
such rep clauses have no effect.

****************************************************************

From: Robert Dewar
Sent: Tuesday, March  2, 2004  4:09 PM

> But it is nonsensical to interpret "big endian" and "little endian"
> in RM 13.5.3(2) as not referring to byte ordering.

I strongly disagree that it is nonsensical. It makes perfect sense
and is what was intended by the design. Big-endian and Little-endian
clearly refer to both bit and byte order (since the two always go
together -- yes yes, except on the 68K -- read my book to find out
how idiotic that decision was :-)

Saying that something is nonsensical is not an argument, in fact
it is a lack of argument :-)

Yes, that means it only works for an addressing unit, which means
that it only is really useful if the addressing unit is large, hence
the limitation in the RM, but a lot of our users have found it useful
for ordering of bits within a byte, and indeed Randall Anderson and
I worked out a scheme for completely handling big-little endian
compatibility that works very nicely, but depends on having this
renumbering of bits.

> If this turns out to be too big an incompatibility for GNAT users,
> then it might make sense to add a new attribute byte_order, which
> does what AI 133 prescribes for bit_order, and then make bit_order
> obsolete.

Requiring compilers to manipulate byte order is too large a change
in my view, and in any case is not what the original AI was about!

> But I don't think the incompatibility is particularly bad.
> For cases where fields do cross byte boundaries there is no
> incompatibility because GNAT rejects those.  And GNAT users
> are unlikely to use non-default bit_order rep clauses where
> fields don't cross byte boundaries, because GNAT warns that
          ^^^^^
You mean *do* not *don't*

> such rep clauses have no effect.

Wrong, the warning is on a field by field basis ...

By the way, the approach Randall and I worked out is to have the
big-endian machine and little-endian machine order the fields in
reverse order, which is easily done with a parametrized rep clause,
then use bit order to deal with stuff within a byte, then byte
flip the entire record -- yes I know, more detail needed :-)

****************************************************************

From: Dan Eilers
Sent: Tuesday, March  2, 2004  6:57 PM

>                                         Big-endian and Little-endian
> clearly refer to both bit and byte order ...

On this we agree completely.
When the bit_order attribute is used, it should be used to specify
the interpretation of both the bit and byte order.

> Requiring compilers to manipulate byte order is too large a change
> in my view, and in any case is not what the original AI was about!

Correct.  AI-133 has absolutely nothing to do with requiring
compilers to manipulate byte order at run-time.  Instead, it is
about endian-independent rep clauses, with absolutely no run-time
byte flipping or discontiguous fields.

> Yes, that means it only works for an addressing unit.

No, as shown by Norm Cohen's write up, a proper interpretation
of the bit_order attribute makes it trivially easy to support
endian-independent rep clauses that work for more than an addressing
unit.

> > But I don't think the incompatibility is particularly bad.
> > For cases where fields do cross byte boundaries there is no
> > incompatibility because GNAT rejects those.  And GNAT users
> > are unlikely to use non-default bit_order rep clauses where
> > fields don't cross byte boundaries, because GNAT warns that
>          ^^^^^
> You mean *do* not *don't*

No, I meant exactly what I said.  The only incompatibility is
when fields don't cross byte boundaries (the other case is
rejected by GNAT, so no inconsistency), and such use is unlikely
due to GNAT's warning that the attribute is interpreted not to do
what the user wants.

> By the way, the approach Randall and I worked out is to have the
> big-endian machine and little-endian machine order the fields in
> reverse order, which is easily done with a parametrized rep clause,
> then use bit order to deal with stuff within a byte, then byte
> flip the entire record -- yes I know, more detail needed :-)

No such standing on one's head is required.  Read the AI.

****************************************************************

From: Robert Dewar
Sent: Tuesday, March  2, 2004  8:22 PM

> On this we agree completely.
> When the bit_order attribute is used, it should be used to specify
> the interpretation of both the bit and byte order.

Well in fact it only specifies bit order, and AI133 does not change
that.

>>Requiring compilers to manipulate byte order is too large a change
>>in my view, and in any case is not what the original AI was about!

> Correct.  AI-133 has absolutely nothing to do with requiring
> compilers to manipulate byte order at run-time.  Instead, it is
> about endian-independent rep clauses, with absolutely no run-time
> byte flipping or discontiguous fields.

Right, but I think it is useless to solve this sub problem. The
only interesting problem to solve is interchange information
between LE and BE machines, and the approach of this AI is
simply not helpful in that regard.

>>Yes, that means it only works for an addressing unit.

> No, as shown by Norm Cohen's write up, a proper interpretation
> of the bit_order attribute makes it trivially easy to support
> endian-independent rep clauses that work for more than an addressing
> unit.

I find the NC approach unconvincing, I cannot think of a single
customer problem that we have had in this area (and there are many)
where this would have helped at all.

> No, I meant exactly what I said.  The only incompatibility is
> when fields don't cross byte boundaries (the other case is
> rejected by GNAT, so no inconsistency), and such use is unlikely
> due to GNAT's warning that the attribute is interpreted not to do
> what the user wants.

You miss entirely the usual use of this attribute which is something
like the following.

I want to transfer a byte between a LE and BE machine that has
field A in the 4 ms bits, and field B in the 4 ls bits.

All I have to do is

    type R is record
       A, B : Integer range 0 .. 15;
    end record;

    for R'Bit_Order use System.High_Order_First;
    for R use record
       A at 0 range 0 .. 3;
       B at 0 range 4 .. 7;
    end record;

This works perfectly and allows the same rep clause on the LE and
BE machines, and allows data to be interchanged.

No warning, because the rep clause does exactly what is expected.
And this is a case where fields do not cross byte boundaries.

So I really have no idea what you meant to say in the above para.
What you did say makes no sense.

>>By the way, the approach Randall and I worked out is to have the
>>big-endian machine and little-endian machine order the fields in
>>reverse order, which is easily done with a parametrized rep clause,
>>then use bit order to deal with stuff within a byte, then byte
>>flip the entire record -- yes I know, more detail needed :-)

> No such standing on one's head is required.

Yes it is!

> Read the AI.

The whole point of the AI is to solve the problem of consistent
layout of fields, *WITHOUT PROVIDING* LE/BE data interchange. That's
clear in the AI.

To solve the LE/BE data interchange issue, you HAVE to have byte
flipping. There is no avoiding this, it's fundamental.

I find the AI in its current form worse than useless. It further
complicates the understanding of this attribute in a bizarre and
complex manner without providing any additional useful functionality.
The only problem worth solving here is interchange of data between
the two machine types. The current definition does this within bytes
but does not deal with fields going over byte boundaries or fields
occupying more than one byte. The AI solves neither of these problems.

****************************************************************

From: Dan Eilers
Sent: Wednesday, March  3, 2004  11:35 AM

Today, Robert Dewar wrote:
> The only interesting problem to solve is the interchange of information
> between LE and BE machines, and the approach of this AI is
> simply not helpful in that regard.

Last week, Robert Dewar wrote:
> What customers want of course is some magic incantation to allow
> their big endian dependent apps to run on little endian machines
> without any change. They are not going to get this :-)

Yes it is true that AI-133 is not the least bit helpful with regard
to the data interchange problem, and yes it is true that Ada0Y is
not going to provide a solution to that problem.

What AI-133 does provide is a solution to the endian-independent
rep clause problem.  This problem is not some unimportant subproblem.
It has been the subject of more than one published technical article,
it is the subject of repeated c.l.a. inquiries, it was the subject of
more than one Ada9x revision request, and such revision requests
were vetted by experts and accepted in the Ada9x revision requirements
document.

However, Ada95's solution to endian-independent rep clauses was botched.
AI-133 fixes that in a way that does not cause any runtime overhead or
any compile time complexity, or any difficulties in understanding the
attribute.  Consecutive bit offsets in non-default bit order are simply
interpreted as specifying contiguous storage locations, just as they do
in default bit order.
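
[Editor's sketch, not part of the original message.] The interpretation
described above can be illustrated with a field that crosses a byte
boundary. The names Bit_Order_Sketch, Header, Kind, Value and Flags are
hypothetical, and the sketch assumes 8-bit storage elements and a
16-bit machine scalar. Under the machine-scalar reading summarized in
the AI, the three component_clauses at offset 0 would share one 16-bit
machine scalar, so Value would occupy bits 4 .. 11 contiguously,
counted from the most significant bit, whatever the default bit order
of the target. A compiler that does not implement the AI may reject or
warn about this clause on a little-endian target.

```ada
with System;

package Bit_Order_Sketch is

   --  Hypothetical component types, chosen only for illustration.
   type Nibble   is range 0 .. 15;
   type Byte_Val is range 0 .. 255;

   type Header is record
      Kind  : Nibble;
      Value : Byte_Val;
      Flags : Nibble;
   end record;

   --  Bit 0 is the most significant bit, whatever the target's default.
   for Header'Bit_Order use System.High_Order_First;

   --  All three clauses use offset 0, so (assuming the AI's rules) they
   --  are interpreted within a single 16-bit machine scalar.
   for Header use record
      Kind  at 0 range  0 ..  3;
      Value at 0 range  4 .. 11;   --  crosses the byte boundary
      Flags at 0 range 12 .. 15;
   end record;

   for Header'Size use 16;

end Bit_Order_Sketch;
```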

We have implemented Norm's solution.  It was trivial to do, and
it works great.  Our users don't get any nonsensical error messages
about purported attempts to specify non-contiguous fields, or any
nonsensical warnings about the bit_order attribute not controlling
byte order.

> You miss entirely the usual use of this attribute which is something
> like the following.
>
> I want to transfer a byte between a LE and BE machine that has
> field A in the 4 ms bits, and field B in the 4 ls bits.

The AI makes it crystal clear that the bit_order attribute is _not_
intended to solve the problem of data interchange between BE and LE
machines.

****************************************************************

From: Robert Dewar
Sent: Wednesday, March  3, 2004  6:49 PM

> What AI-133 does provide is a solution to the endian-independent
> rep clause problem.  This problem is not some unimportant subproblem.
> It has been the subject of more than one published technical article,
> it is the subject of repeated c.l.a. inquiries, it was the subject of
> more than one Ada9x revision request, and such revision requests
> were vetted by experts and accepted in the Ada9x revision requirements
> document.

The Ada 95 RM itself was vetted by experts, and what it says is
reasonable, and in fact the facilities there DO solve the endian
transfer problem for a limited set of cases.

> However, Ada95's solution to endian-independent rep clauses was botched.

I do not agree that this is botched.

> AI-133 fixes that in a way that does not cause any runtime overhead or
> any compile time complexity, or any difficulties in understanding the
> attribute.  Consecutive bit offsets in non-default bit order are simply
> interpreted as specifying contiguous storage locations, just as they do
> in default bit order.

I do not regard AI-133 as useful.

> We have implemented Norm's solution.  It was trivial to do, and
> it works great.

I trust this is under some switch, since I think it is clearly
non-conforming behavior otherwise.

> Our users don't get any nonsensical error messages
> about purported attempts to specify non-contiguous fields, or any
> nonsensical warnings about the bit_order attribute not controlling
> byte order.

Those warnings are extremely important in practice. Every user who
has queried the warnings was in fact trying to solve the problem of
moving data between machines of different endianness.

>>You miss entirely the usual use of this attribute which is something
>>like the following.
>>
>>I want to transfer a byte between a LE and BE machine that has
>>field A in the 4 ms bits, and field B in the 4 ls bits.

> The AI makes it crystal clear that the bit_order attribute is _not_
> intended to solve the problem of data interchange between BE and LE
> machines.

Sorry, but the AI is irrelevant. The feature as described in the RM
today *IS* useful in solving the data interchange problem, and is
used that way routinely. You may read the AI this way, but even if
this reading of the AI is correct, the AI has no normative effect
on the standard.

Actually what I think the AI is saying is that Bit_Order cannot be
used to solve all such problems. And that I agree with!

****************************************************************

From: Dan Eilers
Sent: Wednesday, March  3, 2004  7:05 PM

Robert Dewar wrote:
> The Ada 95 RM itself was vetted by experts, and what it says is
> reasonable, and in fact the facilities there DO solve the endian
> transfer problem for a limited set of cases.

We can agree to disagree that what the RM currently says about
bit_order is reasonable.  Certainly there is no dispute that the
current RM wording does not solve the problem of endian-independent
rep clauses.

> I do not regard AI-133 as useful.

We can agree to disagree on this.  But certainly AI-133 _does_ solve
the problem of endian-independent rep clauses.

> > We have implemented Norm's solution.  It was trivial to do, and
> > it works great.
>
> I trust this is under some switch, since I think it is clearly
> non-conforming behavior otherwise.

No, it's in standard mode, using Dewar's rule that the RM does
not require nonsense, and discontiguous fields are nonsense.

> > Our users don't get any nonsensical error messages
> > about purported attempts to specify non-contiguous fields, or any
> > nonsensical warnings about the bit_order attribute not controlling
> > byte order.
>
> Those warnings are extremely important in practice.

I agree that under your interpretation of the bit_order attribute,
it is extremely important to warn users of this unexpected behavior.

> Actually what I think the AI is saying is that Bit_Order cannot be
> used to solve all such problems. And that I agree with!

The AI says much more than this.  It proposes a perfectly fine
solution to the endian-independent rep clause problem.

****************************************************************

From: Robert Dewar
Sent: Wednesday, March  3, 2004  7:38 PM

Dan Eilers wrote:

> No, it's in standard mode, using Dewar's rule that the RM does
> not require nonsense, and discontiguous fields are nonsense.

Not at all. Non-contiguous fields definitely arise when transferring
data from LE to BE machines. It would indeed be useful to support
them generally. We have this on a list of useful (but difficult)
enhancements.

> I agree that under your interpretation of the bit_order attribute,
> it is extremely important to warn users of this unexpected behavior.

You miss the point. In *all* the cases where we have discussed this
warning, customers wanted to solve the problem of transferring data
between machines. We would still give the warning even if the AI
were implemented, since in our experience, the interpretation that
the AI gives by default is NOT what the customer needed, wanted,
or expected.

> The AI says much more than this.  It proposes a perfectly fine
> solution to the endian-independent rep clause problem.

First, I do not agree that this is a significant issue, divorced
from the data independence issue. Second, the feature as it exists
now IS useful for solving data independence, and is best seen as
a limited facility for this purpose.

That's the way the RM is written now, and I prefer the way it is
written now to your preferred rewriting.

If you need a solution to the problem of endian independent rep
clauses (they are quite easy to write now with appropriate
parametrization), I think it is a bad idea to hijack this
feature in this manner, and I find the fundamental trick in
this AI to be an unpleasant and confusing kludge.
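
[Editor's sketch, not part of the original message.] The thread does
not show the parametrization being referred to. One commonly seen form,
offered here only as an assumption about what is meant, derives the bit
positions from System.Default_Bit_Order instead of using a Bit_Order
clause. The names Parametrized_Sketch, LE, BE, Nibble and R are
hypothetical, and the sketch assumes that the implementation treats
System.Default_Bit_Order as a static constant, which Ada 95 does not
guarantee.

```ada
with System;

package Parametrized_Sketch is

   use type System.Bit_Order;

   --  1 on a target whose default bit order is Low_Order_First, else 0.
   --  Assumes Default_Bit_Order is static on this implementation.
   LE : constant :=
     Boolean'Pos (System.Default_Bit_Order = System.Low_Order_First);
   BE : constant := 1 - LE;

   type Nibble is range 0 .. 15;

   type R is record
      A, B : Nibble;
   end record;

   --  No Bit_Order clause: positions are in the default bit order and
   --  are computed so that A lands in the 4 most significant bits.
   for R use record
      A at 0 range 4 * LE .. 4 * LE + 3;   --  4 .. 7 on LE, 0 .. 3 on BE
      B at 0 range 4 * BE .. 4 * BE + 3;   --  0 .. 3 on LE, 4 .. 7 on BE
   end record;

   for R'Size use 8;

end Parametrized_Sketch;
```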

****************************************************************

From: Dan Eilers
Sent: Wednesday, March  3, 2004  8:29 PM

Robert Dewar wrote:
> Non-contiguous fields definitely arise when transferring
> data from LE to BE machines. It would indeed be useful to support
> them generally. We have this on a list of useful (but difficult)
> enhancements.

I fully agree that it would be useful for Ada at some point to solve
the data interoperability problem.   But since nobody has proposed
a solution, I don't see any chance for 200Y.   Feel free to propose
an AI for that, but don't hijack AI-133, and don't dismiss AI-133
on the grounds that it doesn't solve that problem.

I continue to believe that the bit-order attribute is not intended
as a solution to the data interoperability problem.  This was an
early Ada9x study topic, but it got dropped from the final revision
requirements document, although solving the endian-independent
rep clauses problem was retained as an Ada9x requirement.

> We would still give the warning even if the AI
> were implemented, since in our experience, the interpretation that
> the AI gives by default is NOT what the customer needed, wanted,
> or expected.

Well, if AI-133 were implemented, the revised 200Y RM would
presumably make it clear to users what to expect from the feature.
If you still wanted to give a warning, that would be fine with me.

> First, I do not agree that this is a significant issue, divorced
> from the data independence issue.

We can agree to disagree on this.  Certainly users raise this
issue repeatedly, particularly since Ada doesn't support
conditional compilation.  Endian-independent rep clauses also
happen to come in handy for implementing the floating point
attributes, such as 'Exponent.

> Second, the feature as it exists
> now IS useful for solving data independence, and is best seen as
> a limited facility for this purpose.

We can agree to disagree on this.

****************************************************************

From: Robert Dewar
Sent: Wednesday, March  3, 2004  9:00 PM

> We can agree to disagree on this.

Not really, the disagreement has consequences (whether or
not to proceed on this AI), so an agreement of this nature
settles nothing.

****************************************************************

From: Robert Dewar
Sent: Wednesday, March  3, 2004  9:02 PM

> I fully agree that it would be useful for Ada at some point to solve
> the data interoperability problem.   But since nobody has proposed
> a solution, I don't see any chance for 200Y.   Feel free to propose
> an AI for that, but don't hijack AI-133, and don't dismiss AI-133
> on the grounds that it doesn't solve that problem.

I repeat I find the "solution" to be a horrible and confusing
kludge. It also prevents a more useful implementation of Bit_Order
(GNAT warns now, but it could do things properly instead).

****************************************************************

From: Dan Eilers
Sent: Wednesday, March  3, 2004  10:46 PM


Robert Dewar wrote:
> > We can agree to disagree on this.
>
> Not really, the disagreement has consequences (whether or
> not to proceed on this AI), so an agreement of this nature
> settles nothing.

We can agree to let the ARG settle the dispute.
Clearly we are not going to change each other's
opinion by repeatedly saying we think the feature
is useful or we think it's not.

> I repeat I find the "solution" to be a horrible and confusing
> kludge. It also prevents a more useful implementation of Bit_Order
> (GNAT warns now, but it could do things properly instead).

It would move the discussion along if you said what it was
about the proposed "solution" that you find to be so horrible.
Is it simply that the solution gets in the way of future
GNAT plans for solving the data interoperability problem?
Is it the way the solution is worded?
Surely you don't find it horrible and confusing that a
user would want to write portable rep clauses.  It's
certainly less of a kludge than the "parameterized"
rep clauses you suggested as an alternative.
When IEEE defines the layout of floating point types
do they use "parameterized" descriptions?

****************************************************************

