!standard 13.05.03 (08) 04-09-28 AI95-00133/05
!standard 13.03 (08)
!standard 13.05.01 (10)
!standard 13.05.01 (13)
!standard 13.05.01 (17)
!standard 13.05.02 (02)
!standard 13.05.02 (03)
!standard 13.05.02 (04)
!class binding interpretation 96-05-07
!status Amendment 200Y 04-09-27
!status WG9 approved 04-11-18
!status ARG Approved 8-0-1 04-09-18
!status work item 96-05-07
!status received 96-05-07
!priority Medium
!difficulty Medium
!subject Controlling bit ordering

!summary

Bit_Order clauses are concerned with the numbering of bits and not concerned with data flipping interoperability.

The interpretation of component_clauses in the nondefault bit order is based on *machine scalars*, which are chunks of storage that can be natively loaded and stored by the machine. All the component_clauses at a given offset are considered to be part of the same machine scalar, and the first_bit and last_bit are interpreted as bit offsets within that machine scalar. This makes it possible to write endian-independent record_representation_clauses.

The recommended level of support for Bit_Order clauses is modified to include support for the nondefault bit order in all cases.

!question

What problem is the Bit_Order attribute supposed to solve? Is it intended to solve:

1) the "compiler uniformity problem" where a single program on a single processor wishes to use unchecked_conversion to convert a scalar (e.g. float) object to a record type (e.g. in order to extract the sign, exponent, and mantissa), and wishes to use a single portable record representation clause regardless of the default endianness of the target computer; or

2) the "data interoperability problem" where two processors with different bit orders need to access shared memory, files, devices, or network channels, so one processor has to do cumbersome byte flipping?

(The former.)

!recommendation

(See Summary.)
!wording Add at the end of 13.3(8): A *machine scalar* is an amount of storage that can be conveniently and efficiently loaded, stored, or operated upon by the hardware. Machine scalars consist of an integral number of storage elements. The set of machine scalars is implementation defined, but must include at least the storage element and the word. [Machine scalars are used to interpret component_clauses when the nondefault bit ordering applies.] Add after 13.5.1(10) If the nondefault bit ordering applies to the type, then either: o the value of last_bit shall be less than the size of the largest machine scalar; or o the value of first_bit shall be zero and the value of last_bit + 1 shall be a multiple of System.Storage_Unit. Replace 13.5.1(13) by: A record_representation_clause (without the mod_clause) specifies the layout. If the default bit ordering applies to the type, the position, first_bit, and last_bit of each component_clause directly specify the position and size of the corresponding component. If the nondefault bit ordering applies to the type then the layout is determined as follows: o the component_clauses for which the value of last_bit is greater than or equal to the size of the largest machine scalar directly specify the position and size of the corresponding component; o for other component_clauses, all the components having the same value of position are considered to be part of a single machine scalar, located at that position; this machine scalar has a size which is the smallest machine scalar size larger than the largest last_bit for all component_clauses at that position; the first_bit and last_bit of each component_clause are then interpreted as bit offsets in this machine scalar. Add after 13.5.1(17): o An implementation should support machine scalars that correspond to all the integer, floating point, and address formats supported by the machine. 
Replace 13.5.2(2-4) by: R.C'Position If the nondefault bit ordering applies to the composite type, and if a component_clause specifies the placement of C, denotes the value given for the position of the component_clause; otherwise, denotes the same value as R.C'Address - R'Address. The value of this attribute is of the type universal_integer. R.C'First_Bit If the nondefault bit ordering applies to the composite type, and if a component_clause specifies the placement of C, denotes the value given for the first_bit of the component_clause; otherwise, denotes the offset, from the start of the first of the storage elements occupied by C, of the first bit occupied by C. This offset is measured in bits. The first bit of a storage element is numbered zero. The value of this attribute is of the type universal_integer. R.C'Last_Bit If the nondefault bit ordering applies to the composite type, and if a component_clause specifies the placement of C, denotes the value given for the last_bit of the component_clause; otherwise, denotes the offset, from the start of the first of the storage elements occupied by C, of the last bit occupied by C. This offset is measured in bits. The value of this attribute is of the type universal_integer. [Author's Note: This is a situation where the fact that a representation item (the component_clause) is specified has a visible effect. That's unfortunate, but I see no alternative, because in the absence of a component_clause for C there is no way that we can conjure up the machine scalars out of thin air.] Replace 13.5.3(8) by: o The implementation should support the nondefault bit ordering in addition to the default bit ordering. Add after 13.5.3(8): NOTE: Bit_Order clauses make it possible to write record_representation_clauses that can be ported between machines having different bit ordering. They do not guarantee transparent exchange of data between such machines. 
!discussion

NOTE: This AI is largely based on Norman Cohen's paper "A Proposal for Endian-Portable Record Representation Clauses", which can be found at http://www.ada-auth.org/ai-files/grab_bag/bitorder.pdf. This paper contains figures that would be hard to reproduce in a text-only format, so the interested reader is invited to consult the PDF version. The most important definitions and conclusions of this paper are repeated here for convenience.

There are at least two "endian problems". One is the run-time problem of transferring data between a big-endian machine and a little-endian machine (or between big-endian and little-endian processes of a bi-endian machine). Another is the compile-time problem of specifying a layout portably, so that a single version of an Ada source file will produce the same layout regardless of the compiler's default bit ordering. This is a proposal to solve the second problem.

Most machine instructions operate on small values that we shall call machine scalars. Typical machine scalars include an 8-bit byte, a 16-bit halfword, and a 32-bit word, as well as 32- and 64-bit floating point formats. A machine scalar generally fits in a register or a pair of registers.

The only difference between big-endian and little-endian execution is the correspondence between a sequence of two or more bytes in memory, starting at a given address and extending to higher-addressed bytes, and machine-scalar values.

A program that never views the same bits as belonging to two different types (and never performs binary I/O, which is tantamount to viewing the raw contents of a file as the representation of some type) is inherently endian-independent. That is, the program is portable between big-endian and little-endian machines.
A source program that views the same storage as belonging to more than one type can also be endian-independent, provided that an appropriate programming discipline is followed and provided that object code is generated from the source code in a manner consistent with the target execution bit ordering. A program viewing the same storage as belonging to more than one type will typically depend on properties like the following:

   o A given machine scalar occurs at a specified offset within a record.

   o The bits of a machine scalar can be subdivided into contiguous fields of specified sizes, occurring in a specified order from most significant bits to least significant bits.

For example the IEEE 32-bit floating point format has a 1-bit sign field, followed by an 8-bit exponent field, followed by a 23-bit mantissa field. The word "followed" in this description pertains to the machine scalar representation, i.e. the register representation. On a big-endian target machine, this format might be represented as follows:

   type IEEE_32 is record
      Sign     : Integer range 0 .. 1;
      Exponent : Integer range 0 .. 2**8 - 1;
      Mantissa : Integer range 0 .. 2**23 - 1;
   end record;

   for IEEE_32 use record
      Sign     at 0 range 0 .. 0;
      Exponent at 0 range 1 .. 8;
      Mantissa at 0 range 9 .. 31;
   end record;

On a little-endian target machine, the record representation clause would have to be written as follows:

   for IEEE_32 use record
      Sign     at 0 range 31 .. 31;
      Exponent at 0 range 23 .. 30;
      Mantissa at 0 range 0 .. 22;
   end record;

We would prefer to be able to write and maintain a single record-representation clause that would produce the appropriate memory mapping for the target machine whether that machine is big-endian or little-endian. At first glance, the Bit_Order clause appears to provide a solution.
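As a sketch of what such a single clause might look like, the big-endian numbering above can be paired with a Bit_Order clause. The package name IEEE_Layout is hypothetical, and whether a compiler whose default bit order is Low_Order_First accepts this text depends on its support for the nondefault bit order, which is the subject of the paragraphs that follow.

```ada
with System;

--  Hypothetical package name; the type and the component clauses are
--  the big-endian ones shown above, unchanged.
package IEEE_Layout is

   type IEEE_32 is record
      Sign     : Integer range 0 .. 1;
      Exponent : Integer range 0 .. 2**8 - 1;
      Mantissa : Integer range 0 .. 2**23 - 1;
   end record;

   --  Ask for big-endian bit numbering regardless of the default.
   --  The clause is placed before the record representation clause
   --  so that it is known when the component clauses are interpreted.
   for IEEE_32'Bit_Order use System.High_Order_First;

   for IEEE_32 use record
      Sign     at 0 range 0 .. 0;
      Exponent at 0 range 1 .. 8;
      Mantissa at 0 range 9 .. 31;
   end record;

end IEEE_Layout;
```

On a target whose default bit order is High_Order_First the Bit_Order clause merely confirms the default, so the interesting case is the little-endian target.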
Unfortunately, the recommended level of support for the Bit_Order clause, as defined by 13.5.3(8), is very weak: "If Word_Size = Storage_Unit, then the implementation should support the nondefault bit ordering in addition to the default bit ordering." In other words, there is no requirement to support the non-default bit ordering if Word_Size > Storage_Unit. On most machines in existence today Storage_Unit is 8 and Word_Size is typically 32 or more, so the recommended level of support is vacuous.

The reason why 13.5.3(8) is so weak has to do with the meaning of large bit numbers, i.e., bit numbers exceeding System.Storage_Unit - 1. The meaning of large bit numbers in the default bit ordering is well understood: assuming 8-bit bytes, bit 8+b of byte a is the same bit as bit b of byte a+1. Consequently, there are redundant ways to specify the same storage layout. For example, the big-endian record representation clause above could have been written equivalently:

   for IEEE_32 use record
      Sign     at 0 range 0 .. 0;
      Exponent at 0 range 1 .. 8;
      Mantissa at 1 range 1 .. 23;
   end record;

While the meaning of large bit numbers is obvious in the default bit ordering, it is not obvious in the nondefault bit ordering. Suppose we compile a big-endian record-representation clause, together with a big-endian Bit_Order clause, for a little-endian target. The record-representation clause includes the component clause:

   Exponent at 0 range 1 .. 8;

specifying that the Exponent component includes bits 1 to 8 of byte 0. If we adhere to the definition that bit 8+b of byte a is the same bit as bit b of byte a+1, then this is equivalent to bits 1 to 7 of byte 0 and bit 0 of byte 1. Under big-endian bit numbering rules, these are the 7 least significant bits of byte 0 together with the most significant bit of byte 1. Unfortunately, on a little-endian target, these two groups of bits will not be adjacent in a machine scalar corresponding to bytes 0 and 1.
This would make it difficult to extract and update the Exponent component. Furthermore, it is not clear that there is a need for supporting non-contiguous components in the language, but it is clear that there is a need for supporting endian-independent record representation clauses.

It is important to notice that contiguity of bits is a meaningful notion within a machine scalar, or within a single byte of memory, but not between bits in different bytes of memory. The reason is that, on different target machines, bytes are loaded in different order to compose a machine scalar. If we want to be able to write endian-independent record representation clauses, we cannot interpret large bit numbers with respect to the memory representation. We can only interpret them with respect to machine scalars.

Therefore, we adopt the following convention: in a record-representation clause for the nondefault bit ordering, there is a one-to-one correspondence between byte offsets (i.e., the numbers appearing between the words "at" and "range") and machine scalars. That is, all components whose positions are specified with the same byte offset are assumed to be part of the same machine scalar (so that in typical implementations they will be loaded into a register together, ignoring alignment issues); and any two components required to reside within the same machine scalar have their positions specified in terms of the same byte offset. The length of a machine scalar is inferred from the highest bit number specified along with its byte position in some component clause, rounded up as appropriate.

Note that the set of machine scalars is implementation-dependent, so there is no guarantee that two compilers targeting the same machine will support the same set of machine scalars, and therefore the same set of record representation clauses.
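The convention can be illustrated with a little arithmetic. The following is only a sketch: the names Bit_Range, Scalar_Sizes, Scalar_Size_For and Renumber are hypothetical, and the set of machine scalar sizes (8, 16, 32 and 64 bits) is an assumption, since the actual set is implementation defined. It picks the smallest machine scalar size larger than the largest last_bit at a position, and then expresses the same field of that machine scalar with bits numbered from the opposite end.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Machine_Scalar_Demo is

   --  A range of bits within one machine scalar (hypothetical type).
   type Bit_Range is record
      First, Last : Natural;
   end record;

   --  Assumed set of machine scalar sizes, in ascending order.
   type Size_List is array (Positive range <>) of Positive;
   Scalar_Sizes : constant Size_List := (8, 16, 32, 64);

   --  Smallest machine scalar size larger than the largest last_bit
   --  used at a given position.
   function Scalar_Size_For (Largest_Last_Bit : Natural) return Positive is
   begin
      for I in Scalar_Sizes'Range loop
         if Scalar_Sizes (I) > Largest_Last_Bit then
            return Scalar_Sizes (I);
         end if;
      end loop;
      raise Constraint_Error;  --  no machine scalar is large enough
   end Scalar_Size_For;

   --  Same field of the machine scalar, numbered from the other end.
   function Renumber (R : Bit_Range; Scalar_Size : Positive) return Bit_Range is
   begin
      return (First => Scalar_Size - 1 - R.Last,
              Last  => Scalar_Size - 1 - R.First);
   end Renumber;

   --  All three IEEE_32 components are at position 0; the largest
   --  last_bit is 31, so the machine scalar is 32 bits.
   Size : constant Positive := Scalar_Size_For (31);

   procedure Show (Name : String; R : Bit_Range) is
      Flipped : constant Bit_Range := Renumber (R, Size);
   begin
      Put_Line (Name & ":" & Natural'Image (R.First) & " .."
                & Natural'Image (R.Last) & " becomes"
                & Natural'Image (Flipped.First) & " .."
                & Natural'Image (Flipped.Last));
   end Show;

begin
   Show ("Sign", (0, 0));       --  0 .. 0  becomes 31 .. 31
   Show ("Exponent", (1, 8));   --  1 .. 8  becomes 23 .. 30
   Show ("Mantissa", (9, 31));  --  9 .. 31 becomes 0 .. 22
end Machine_Scalar_Demo;
```

The renumbered ranges agree with the little-endian record representation clause for IEEE_32 given earlier, which is the outcome the convention is meant to produce.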
We give implementation advice to support all integer, floating-point and address formats, though, as there is really no run-time complexity associated with supporting machine scalars: they only play a role when interpreting component_clauses. The reason why we have them is to reduce the likelihood that people will be surprised by the effect of some representation clauses, notably in the presence of holes.

Also note that a representation clause written for, say, a 32-bit target machine, may not port to a 16-bit target machine (that could be the case for the above example) as there may not exist a 32-bit machine scalar on the latter target.

The proposed interpretation is of course incompatible in the sense that, for the nondefault bit ordering, it breaks the current rule that bit 8+b of byte a is the same bit as bit b of byte a+1. However, the existing RM is sufficiently vague and muddled that it's hard to believe that there is a lot of code out there depending on record_representation_clauses with nondefault bit ordering.

!corrigendum 13.3(8)

@dinsa
A @i<storage element> is an addressable element of storage in the machine. A @i<word> is the largest amount of storage that can be conveniently and efficiently manipulated by the hardware, given the implementation's run-time model. A word consists of an integral number of storage elements.
@dinst
A @i<machine scalar> is an amount of storage that can be conveniently and efficiently loaded, stored, or operated upon by the hardware. Machine scalars consist of an integral number of storage elements. The set of machine scalars is implementation defined, but must include at least the storage element and the word. Machine scalars are used to interpret @fa<component_clause>s when the nondefault bit ordering applies.

!corrigendum 13.5.1(10)

@dinsa
The @fa<position>, @fa<first_bit>, and @fa<last_bit> shall be static expressions. The value of @fa<position> and @fa<first_bit> shall be nonnegative. The value of @fa<last_bit> shall be no less than @fa<first_bit> - 1.
@dinss
If the nondefault bit ordering applies to the type, then either:

@xbullet<the value of @fa<last_bit> shall be less than the size of the largest machine scalar; or>

@xbullet<the value of @fa<first_bit> shall be zero and the value of @fa<last_bit> + 1 shall be a multiple of System.Storage_Unit.>

!corrigendum 13.5.1(13)

@drepl
A @fa<record_representation_clause> (without the @fa<mod_clause>) specifies the layout. The storage place attributes (see 13.5.2) are taken from the values of the @fa<position>, @fa<first_bit>, and @fa<last_bit> expressions after normalizing those values so that first_bit is less than Storage_Unit.
@dby
A @fa<record_representation_clause> (without the @fa<mod_clause>) specifies the layout.

If the default bit ordering applies to the type, the @fa<position>, @fa<first_bit>, and @fa<last_bit> of each @fa<component_clause> directly specify the position and size of the corresponding component.

If the nondefault bit ordering applies to the type then the layout is determined as follows:

@xbullet<the @fa<component_clause>s for which the value of @fa<last_bit> is greater than or equal to the size of the largest machine scalar directly specify the position and size of the corresponding component;>

@xbullet<for other @fa<component_clause>s, all the components having the same value of @fa<position> are considered to be part of a single machine scalar, located at that @fa<position>; this machine scalar has a size which is the smallest machine scalar size larger than the largest @fa<last_bit> for all @fa<component_clause>s at that @fa<position>; the @fa<first_bit> and @fa<last_bit> of each @fa<component_clause> are then interpreted as bit offsets in this machine scalar.>

!corrigendum 13.5.1(17)

@dinsa
The recommended level of support for @fa<record_representation_clause>s is:
@dinst
@xbullet<An implementation should support machine scalars that correspond to all the integer, floating point, and address formats supported by the machine.>

!corrigendum 13.5.2(02)

@drepl
@xhang<@xterm<R.C'Position>
Denotes the same value as R.C'Address @endash R'Address. The value of this attribute is of the type @i<universal_integer>.>
@dby
@xhang<@xterm<R.C'Position>
If the nondefault bit ordering applies to the composite type, and if a @fa<component_clause> specifies the placement of C, denotes the value given for the @fa<position> of the @fa<component_clause>; otherwise, denotes the same value as R.C'Address @endash R'Address.
The value of this attribute is of the type @i<universal_integer>.>

!corrigendum 13.5.2(03)

@drepl
@xhang<@xterm<R.C'First_Bit>
Denotes the offset, from the start of the first of the storage elements occupied by C, of the first bit occupied by C. This offset is measured in bits. The first bit of a storage element is numbered zero. The value of this attribute is of the type @i<universal_integer>.>
@dby
@xhang<@xterm<R.C'First_Bit>
If the nondefault bit ordering applies to the composite type, and if a @fa<component_clause> specifies the placement of C, denotes the value given for the @fa<first_bit> of the @fa<component_clause>; otherwise, denotes the offset, from the start of the first of the storage elements occupied by C, of the first bit occupied by C. This offset is measured in bits. The first bit of a storage element is numbered zero. The value of this attribute is of the type @i<universal_integer>.>

!corrigendum 13.5.2(04)

@drepl
@xhang<@xterm<R.C'Last_Bit>
Denotes the offset, from the start of the first of the storage elements occupied by C, of the last bit occupied by C. This offset is measured in bits. The value of this attribute is of the type @i<universal_integer>.>
@dby
@xhang<@xterm<R.C'Last_Bit>
If the nondefault bit ordering applies to the composite type, and if a @fa<component_clause> specifies the placement of C, denotes the value given for the @fa<last_bit> of the @fa<component_clause>; otherwise, denotes the offset, from the start of the first of the storage elements occupied by C, of the last bit occupied by C. This offset is measured in bits. The value of this attribute is of the type @i<universal_integer>.>

!corrigendum 13.5.3(8)

@drepl
@xbullet<If Word_Size = Storage_Unit, then the implementation should support the nondefault bit ordering in addition to the default bit ordering.>
@dby
@xbullet<The implementation should support the nondefault bit ordering in addition to the default bit ordering.>

@xindent<@s9<NOTE: Bit_Order clauses make it possible to write @fa<record_representation_clause>s that can be ported between machines having different bit ordering. They do not guarantee transparent exchange of data between such machines.>>

!ACATS Test

Create a test to check that the non-default bit order is supported and creates the correct layout (for IEEE floats, for instance.)

!appendix

!section 13.5.3(00)
!subject controlling bit ordering
!reference RM95-13.5.3
!from Dan Eilers 96-04-24
!reference 96-5511.a Dan Eilers 96-4-24>>
!discussion

There seems to be confusion (as evidenced by recent c.l.a.
discussion) as to which problem the Bit_Order attribute is supposed to solve. Is it intended to solve:

1) the "compiler uniformity problem" where a single program on a single processor wishes to use unchecked_conversion to convert a scalar (e.g. float) object to a record type (e.g. in order to extract the sign, exponent, and mantissa), and wishes to use a single portable record representation clause regardless of the default endianness of the target computer; or

2) the "data interoperability problem" where two processors with different bit orders need to access shared memory, files, devices, or network channels, so one processor has to do cumbersome byte flipping?

Problem #1 was accepted as Revision Requirement 2.4 "Controlling Implementation-Dependent Choices", referring to RR-0137 "Standardize bit storage/order conventions" and RR-0411 "Express record representation clauses in a machine-independent way".

Problem #2 was accepted in the Aug 27, 1990 Draft 3.3 as Revision Requirement 6.2 "Data Interoperability" under Study Topic 25.1 "Data Interoperability":

   For example, there is no control over the bit/byte ordering -- such control is required in order to deal with the conflicting representations between "little-endian" and "big-endian" representations.

However, the endianness aspect of Revision Requirement 6.2 was dropped in the final Revision Requirements document.

Evidence that Problem #1 was intended is: that it is the one that was accepted as a final requirement; it is the easier of the two solutions to implement; and 13.5.3 says nothing about byte flipping. Presumably problem #1 is implemented simply by counting bit offsets from the end of the record rather than from the front of the record.

Evidence that Problem #2 was intended is that support for nondefault bit ordering is optional, apparently due to presumed implementation difficulties.
Note that it isn't possible for a single attribute to solve both problems, since the solutions are mutually exclusive in the case where a record field spans a byte boundary. Such a field would get flipped for solution #2, and not in Solution #1. -- Dan Eilers **************************************************************** [* Editor's note: This paper does not translate well to text form. To see the paper in its original form with diagrams, download the file bitorder.pdf from the ACAA web site - www.ada-auth.org/~acats/grab_bag.html *] A Proposal for Endian-Portable Record Representation Clauses Norman H. Cohen What problem are we solving? There are at least two "endian problems". One is the run-time problem of transferring data between a big-endian machine and a little-endian machine (or between big-endian and little-endian processes of a biendian machine). Another is the compile-time problem of specifying a data layout portably, so that a single version of an Ada source file will produce the same layout regardless of the compiler's default bit ordering. This is a proposal to solve the second problem. What is the semantic difference between big-endian and little-endian execution? Most machine instructions operate on small values that we shall call machine scalars. Typical machine scalars include an 8-bit byte, a 16-bit halfword, and a 32-bit word. (To simplify the presentation, this document will, without loss of generality, be phrased in terms of an architecture supporting at least these three kinds of machine scalars.) A machine scalar generally fits in a register or a pair of registers. In a RISC architecture, all machine scalars are loaded into registers before being operated upon, while in a CISC architecture, one or more machine-scalar operands may reside in storage. 
The only difference between big-endian and little-endian execution is the correspondence between a sequence of two or more bytes in memory, starting at a given address and extending to higher-addressed bytes, and machine-scalar values. (A sequence of bytes in memory is mapped to a machine-scalar value upon being loaded into a register or upon being used as an operand of a CISC register-storage instruction, and a machine-scalar value is mapped to a sequence of bytes in memory upon being stored.) In little-endian execution, the lowest-addressed byte corresponds to the low-order eight bits of the machine-scalar value, while in big-endian execution, the lowest-addressed byte corresponds to the high-order eight bits of the machine-scalar value. What do we mean by the "same" layout, and why is it important? A program that never views the same bits as belonging to two different types (and never performs binary I/O, which is tantamount to viewing the raw contents of a file as the representation of some type) is inherently endian-independent. That is, barring impediments to portability unrelated to bit order, the program is portable between big-endian and little-endian machines. A source program that views the same storage as belonging to more than one type can also be endian-independent, provided that an appropriate programming discipline is followed and provided that object code is generated from the source code in a manner consistent with the target execution bit order. A program viewing the same storage as belonging to more than one type will typically depend on properties like the following: * A given machine scalar occurs at a specified offset within a record. * The bits of a machine scalar can be subdivided into contiguous fields of specified sizes, occurring in a specified order from most significant bits to least significant bits. 
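The correspondence between a sequence of bytes in memory and a machine-scalar value, described above for the two kinds of execution, can be sketched as follows for a 16-bit machine scalar. The names Byte_Pair, Little_Endian_Value and Big_Endian_Value are hypothetical and chosen only for this illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Interfaces;  use Interfaces;

procedure Byte_Order_Demo is

   --  Two bytes as they sit in memory; the index is the byte offset.
   type Byte_Pair is array (0 .. 1) of Unsigned_8;

   --  Little-endian execution: the lowest-addressed byte supplies the
   --  low-order eight bits of the machine-scalar value.
   function Little_Endian_Value (B : Byte_Pair) return Unsigned_16 is
   begin
      return Unsigned_16 (B (1)) * 256 + Unsigned_16 (B (0));
   end Little_Endian_Value;

   --  Big-endian execution: the lowest-addressed byte supplies the
   --  high-order eight bits of the machine-scalar value.
   function Big_Endian_Value (B : Byte_Pair) return Unsigned_16 is
   begin
      return Unsigned_16 (B (0)) * 256 + Unsigned_16 (B (1));
   end Big_Endian_Value;

   Bytes : constant Byte_Pair := (16#12#, 16#34#);

begin
   --  Same two bytes, two different machine-scalar values:
   --  16#3412# under little-endian and 16#1234# under big-endian rules.
   Put_Line ("little-endian:" & Unsigned_16'Image (Little_Endian_Value (Bytes)));
   Put_Line ("big-endian:   " & Unsigned_16'Image (Big_Endian_Value (Bytes)));
end Byte_Order_Demo;
```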
For example, the DOS 32-bit representation of a date and time can be described as follows:

* The representation consists of a halfword at offset 0 and a halfword at offset 2. (For consistency throughout this document, we refer to a 16-bit machine scalar as a "halfword", even though it is called a "word" in the Intel architecture.)

* In the halfword at offset 0, the high-order seven bits give the number of years since 1980, the middle four bits give the month of the year, and the low-order five bits give the day of the month.

* In the halfword at offset 2, the high-order five bits give the hour of the day, the middle six bits give the minute of the hour, and the low-order five bits give the number of two-second units within the minute.

Notice that we have described the properties of this data representation in a manner that is independent of big-endian and little-endian conventions. We shall refer to sets of properties that can be described in this way as endian-independent layout specifications. Suppose a compiler could be instructed, using notation independent of the bit ordering on the target machine, to produce a storage layout that obeys a given set of endian-independent layout specifications on the target machine. Then the source file of a program that depended only upon those specifications could be compiled for either a big-endian target or a little-endian target.

There are many ways in which programs can exploit endian-independent layout specifications. A program might depend on the date information residing at the lower-addressed halfword so that pointers to date-and-time structures could be passed to subprograms expecting only pointers to structures containing the three date components. (This is a form of homegrown polymorphism through record extension.) A program might depend on the left-to-right ordering of components within a halfword so that two dates, or two times, could be compared by a single 16-bit unsigned-integer comparison.
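The last point can be made concrete with a small sketch. The function Date_Halfword and the subtype names are hypothetical; the function builds the value of the halfword at offset 0 from its three fields as laid out above (years in the high-order seven bits, month in the middle four, day in the low-order five), so that the usual unsigned "<" orders dates chronologically.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Interfaces;  use Interfaces;

procedure Date_Compare_Demo is

   subtype Year_Offset  is Natural  range 0 .. 127;  --  7 bits
   subtype Month_Number is Positive range 1 .. 12;   --  4 bits
   subtype Day_Number   is Positive range 1 .. 31;   --  5 bits

   --  Value of the 16-bit machine scalar holding the date fields.
   function Date_Halfword
     (Years_Since_1980 : Year_Offset;
      Month            : Month_Number;
      Day              : Day_Number) return Unsigned_16 is
   begin
      return Unsigned_16 (Years_Since_1980) * 2**9   --  bits 15 .. 9
           + Unsigned_16 (Month) * 2**5              --  bits 8 .. 5
           + Unsigned_16 (Day);                      --  bits 4 .. 0
   end Date_Halfword;

   New_Years_Eve : constant Unsigned_16 := Date_Halfword (16, 12, 31);
   New_Years_Day : constant Unsigned_16 := Date_Halfword (17, 1, 1);

begin
   --  A single unsigned comparison orders the two dates: prints TRUE.
   Put_Line (Boolean'Image (New_Years_Eve < New_Years_Day));
end Date_Compare_Demo;
```

The comparison works only because the most significant field is also the most significant part of the machine scalar, which is exactly the property an endian-independent layout specification is meant to preserve.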
Endian-independent layout specifications can be represented graphically by vertically stacking machine scalars that must occur at specified byte offsets (drawing machine scalars with lower offsets at the top) and by drawing fields within machine scalars so that fields with more significant bits are to the left of fields with less significant bits. For example, the DOS 32-bit representation of a date and time can be depicted as follows:

Endian-Independent Layout Specification 1:
(* Diagram omitted *)

When we say that a storage representation is "the same" on both a big-endian and little-endian machine, we mean, in effect, that it is described in each case by the same such picture. This picture expresses not the physical positions of components in memory, but the endian-independent layout specifications upon which the program depends. For example, we might also have drawn the DOS representation of a date and time as follows:

Endian-Independent Layout Specification 2:
(* Diagram omitted *)

This picture corresponds to exactly the same memory mapping on a big-endian machine, but it specifies a different set of layout specifications to be preserved across bit orders, and thus a different memory mapping on a little-endian machine:

Little-endian physical representation of Endian-Independent Layout Specification 1:
(* Diagram omitted *)

Big-endian physical representation of Endian-Independent Layout Specification 1:
(* Diagram omitted *)

Big-endian physical representation of Endian-Independent Layout Specification 2:
(* Diagram omitted *)

Little-endian physical representation of Endian-Independent Layout Specification 2:
(* Diagram omitted *)

Endian-Independent Layout Specification 2 specifies the left-to-right ordering of all six components within a single 32-bit machine scalar, and thus allows two time-and-date values to be compared by a single 32-bit unsigned-integer comparison.
However, Endian-Independent Layout Specification 2 does not stipulate that the three date components reside in the halfword at offset 0, so a program that uses pointers to time-and-date structures as if they were pointers to date structures will not be portable to little-endian machines. As this example illustrates, certain combinations of layout specifications can be maintained consistently when changing bit order, and others cannot. It is impossible to preserve both the left-to-right ordering of all six components and the offset of the halfword containing the date components across both bit orderings. Thus one cannot write an endian-independent program depending on all of these properties. However, layout specifications that specify only the size and location of nonoverlapping machine scalars, plus the size and left-to-right ordering of fields within those machine scalars, can be maintained consistently when changing bit order. Each endian-independent layout specification determines one big-endian memory mapping and one little-endian memory mapping. A given memory mapping on a given machine may satisfy many different endian-independent layout specifications. Distinct endian-independent layout specifications that are satisfied by the same memory mapping in one bit ordering are satisfied by distinct memory mappings in the opposite bit ordering. (* Diagram omitted *) Bit numbers within record-representation clauses A storage layout for a record type can be specified in Ada by a record-representation clause. Such a clause specifies the location of each component in memory in terms of a specified range of bits, numbered relative to bit zero of the storage unit (on typical architectures, the byte) at a specified offset. However, whether bits are numbered left-to-right or right-to-left by default depends on the compiler. 
Thus, Endian-Independent Layout Specification 1 might be specified by the record-representation clause

   for Date_And_Time_Type use record
      Years_Since_1980 at 0 range 0 .. 6;
      Month            at 0 range 7 .. 10;
      Day_Of_Month     at 0 range 11 .. 15;
      Hour             at 2 range 0 .. 4;
      Minute           at 2 range 5 .. 10;
      Seconds          at 2 range 11 .. 15;
   end record;

for a big-endian target machine, and by the record-representation clause

   for Date_And_Time_Type use record
      Years_Since_1980 at 0 range 9 .. 15;
      Month            at 0 range 5 .. 8;
      Day_Of_Month     at 0 range 0 .. 4;
      Hour             at 2 range 11 .. 15;
      Minute           at 2 range 5 .. 10;
      Seconds          at 2 range 0 .. 4;
   end record;

for a little-endian target machine. We would prefer to be able to write and maintain a single record-representation clause that would produce the appropriate memory mapping for the target machine whether that machine is big-endian or little-endian.

At first glance, the bit-order clause, a new feature of Ada 95, appears to provide a solution: The bit-order clause

   for Date_And_Time_Type'Bit_Order use High_Order_First;

specifies that bit numbers should be interpreted according to big-endian conventions in a record-representation clause for Date_And_Time_Type (i.e., with bit 0 being the high-order bit of a byte), regardless of the compiler's default bit order, while the bit-order clause

   for Date_And_Time_Type'Bit_Order use Low_Order_First;

specifies that bit numbers should be interpreted according to little-endian conventions. Thus, either the first bit-order clause followed by the first record-representation clause, or the second bit-order clause followed by the second record-representation clause, should produce the appropriate storage layout for the target, regardless of whether the target is big-endian or little-endian.

Unfortunately, matters are not so simple. The Ada standard gives compilers for most machines permission to reject a bit-order clause that specifies the nondefault bit order.
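For concreteness, here is a sketch of the second combination (the little-endian bit-order clause followed by the little-endian record-representation clause) as a complete package. The package name DOS_Time and the six component types are assumptions made for this illustration, since the paper does not show the declaration of Date_And_Time_Type. A compiler whose default bit order is High_Order_First is, as just noted, permitted by Ada 95 to reject it.

```ada
with System;

--  Hypothetical package and component types; only the two
--  representation items come from the text above.
package DOS_Time is

   type Years_Type   is range 0 .. 127;  --  years since 1980, 7 bits
   type Month_Type   is range 1 .. 12;   --  4 bits
   type Day_Type     is range 1 .. 31;   --  5 bits
   type Hour_Type    is range 0 .. 23;   --  5 bits
   type Minute_Type  is range 0 .. 59;   --  6 bits
   type Seconds_Type is range 0 .. 29;   --  two-second units, 5 bits

   type Date_And_Time_Type is record
      Years_Since_1980 : Years_Type;
      Month            : Month_Type;
      Day_Of_Month     : Day_Type;
      Hour             : Hour_Type;
      Minute           : Minute_Type;
      Seconds          : Seconds_Type;
   end record;

   --  Little-endian bit numbering, whatever the compiler's default.
   for Date_And_Time_Type'Bit_Order use System.Low_Order_First;

   for Date_And_Time_Type use record
      Years_Since_1980 at 0 range 9 .. 15;
      Month            at 0 range 5 .. 8;
      Day_Of_Month     at 0 range 0 .. 4;
      Hour             at 2 range 11 .. 15;
      Minute           at 2 range 5 .. 10;
      Seconds          at 2 range 0 .. 4;
   end record;

end DOS_Time;
```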
In the nondefault bit order, there are multiple, distinct meanings we can ascribe to bit numbers greater than or equal to the number of bits in a byte. The drafters of the Ada-95 standard implicitly ascribed a meaning that allows the specification of impractical memory mappings. To avoid requiring compilers to support these impractical mappings, they chose not to require compilers to support nondefault bit orders.

The meaning of large bit numbers in record representation clauses

The meaning of large bit numbers in the default bit order is well understood: Assuming eight-bit bytes, bit 8+b of byte a is the same bit as bit b of byte a+1. Consequently, there are redundant ways to specify the same storage layout. For example, the big-endian record-representation clause could have been written equivalently (for a compiler whose default bit order is big-endian) as follows:

   for Date_And_Time_Type use record
      Years_Since_1980 at 0 range 0 .. 6;
      Month            at 0 range 7 .. 10;
      Day_Of_Month     at 1 range 3 .. 7;   -- was 0 range 11 .. 15
      Hour             at 2 range 0 .. 4;
      Minute           at 2 range 5 .. 10;
      Seconds          at 3 range 3 .. 7;   -- was 2 range 11 .. 15
   end record;

Our proposal exploits this redundancy to express endian-independent layout specifications. Under this proposal, the two big-endian record-representation clauses specify the same memory mapping, but a different set of endian-independent layout specifications.

While the meaning of large bit numbers is obvious in the default bit order, it is not obvious in the nondefault bit order. Suppose we compile a big-endian record-representation clause, together with a big-endian bit-order clause, for a little-endian target. The record-representation clause includes the component clause Minute at 2 range 5 ..
10; specifying that the Minute component is to occupy the following six bits: * bit 5 (big endian) of byte 2 * bit 6 (big endian) of byte 2 * bit 7 (big endian) of byte 2 * "bit 8" (big endian) of byte 2 * "bit 9" (big endian) of byte 2 * "bit 10" (big endian) of byte 2 If we adhere to the definition that bit 8+b of byte a is the same bit as bit b of byte a+1, hen this is equivalent to the following bits: * bit 5 (big endian) of byte 2 * bit 6 (big endian) of byte 2 * bit 7 (big endian) of byte 2 * bit 0 (big endian) of byte 3 * bit 1 (big endian) of byte 3 * bit 2 (big endian) of byte 3 Under big-endian bit-numbering rules, these are the three rightmost (i.e., least significant) bits of byte 2 and the three leftmost (i.e., most significant) bits of byte 3. However, on a little-endian target, these two groups of bits will not be adjacent in a machine scalar corresponding to bytes 2 and 3. This would make it difficult to extract and update the Minute component. (An analogous problem arises if we try to compile a little-endian record-representation clause with a little-endian bit-order clause for a big-endian machine.) To avoid this problem, the drafters of the Ada-95 standard chose to reject mandatory support for nondefault bit orders. We prefer to reject the definition of bit 8+b of byte a to be the same bit as bit b of byte a+1 in the nondefault bit order. We adopt a different definition that turns out to be equivalent to that definition in the default bit order, but different, and more useful, in the nondefault bit order. Avoiding noncontiguous bit fields Contiguity of bits is a meaningful notion within a machine scalar, or within a single byte of memory, but not between bits in different bytes of memory. The notion of bit position within a byte is inherent in the binary representation of the numeric value contained in the byte. 
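As a rough illustration of the noncontiguity problem with the Minute component (a sketch that is not part of the original paper), the following program computes where the six bits of "Minute at 2 range 5 .. 10" would fall within a 16-bit value loaded from bytes 2 and 3 on a little-endian machine, if the bit numbers are read big-endian style and bit 8+b of byte a is taken to be bit b of byte a+1. The names Show_Noncontiguous and Significance are hypothetical, and eight-bit bytes are assumed.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Show_Noncontiguous is

   Bits_Per_Byte : constant := 8;  --  assumption: eight-bit bytes

   --  Significance (0 = least significant) of one bit within a 16-bit
   --  value loaded from two consecutive bytes on a little-endian machine.
   --  Byte_Index is 0 for the lower-addressed byte and 1 for the next one.
   --  BE_Bit is the bit number within that byte, counted big-endian style
   --  (0 = most significant bit of the byte).
   function Significance (Byte_Index, BE_Bit : Natural) return Natural is
   begin
      return Byte_Index * Bits_Per_Byte + (Bits_Per_Byte - 1 - BE_Bit);
   end Significance;

begin
   --  Bits 5 .. 10 relative to byte 2: bits 8 .. 10 spill into byte 3
   --  under the rule "bit 8+b of byte a is bit b of byte a+1".
   for Bit in 5 .. 10 loop
      Put_Line
        ("big-endian bit" & Integer'Image (Bit)
         & " -> significance"
         & Natural'Image
             (Significance (Bit / Bits_Per_Byte, Bit mod Bits_Per_Byte)));
   end loop;
end Show_Noncontiguous;
```

Under these assumptions the six bits land at significance 2, 1, 0, 15, 14 and 13, that is, at opposite ends of the 16-bit value, which is the noncontiguity described in the text.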
Our view of bit contiguity within memory is conveyed by depicting memory with bits arranged left-to-right in bytes that are stacked vertically:

   byte 0
   byte 1
   byte 2

Although bits in different bytes may correspond to adjacent bits in a machine scalar (i.e., the bits become adjacent when loaded into a register), there is no notion of two bits in different bytes of memory being adjacent. (In the context of a single machine with a known byte order, it is common practice to depict bytes side by side, with addresses increasing left-to-right for big-endian machines and right-to-left for little-endian machines. These depictions are convenient because bits are depicted in memory in the same left-to-right order as in the corresponding machine scalar. However, such depictions are meaningless for machines of opposite byte order.)

Consequently, the notion of a range of bits only makes sense with respect to a machine scalar. When we say

   Minute at 2 range 5 .. 10;

we are referring to a range of bits in a machine scalar corresponding to the memory beginning at offset 2. This definition refers to a contiguous range of bits in the machine scalar regardless of the bit ordering.

In the default bit order, we number bits from zero starting in the lowest-addressed byte of the machine scalar. Thus we can readily determine the number assigned to a bit position at a given distance from the low-address end of the machine scalar. However, in the nondefault bit order, we number bits from zero starting in the highest-addressed byte of the machine scalar, so the number assigned to a bit position at a given distance from the low-address end of a machine scalar depends on the length of the machine scalar.

Therefore, we adopt the following convention: In a record-representation clause for the nondefault bit order, there is a one-to-one correspondence between byte offsets (i.e., the numbers appearing between the words "at" and "range") and machine scalars.
That is, all components whose positions are specified with the same byte offset are assumed to be part of the same machine scalar (so that in typical implementations they will be loaded into a register together); and any two components required to reside within the same machine scalar have their positions specified in terms of the same byte offset. The length of a machine scalar is inferred from the highest bit number specified along with its byte position in some component clause, rounded up to the next multiple of System.Storage_Unit. (This approach requires the explicit declaration of "filler" fields when the byte of a machine scalar containing the high-numbered bits is to be left unused. If no use is made of the record component Filler, the declarations

   for R'Bit_Order use X;
   for R use
      record
         C at 0 range 0 .. 23;
      end record;
   for R'Size use 32;

and the declarations

   for R'Bit_Order use X;
   for R use
      record
         C      at 0 range 0 .. 23;
         Filler at 0 range 24 .. 31;
      end record;
   for R'Size use 32;

are equivalent on compilers whose default bit order is X. However, on compilers with the opposite default bit order, the first set of declarations places C in the bytes at offsets 0, 1, and 2, while the second set of declarations places C in the bytes at offsets 1, 2, and 3.)

This interpretation of bit numbers makes it feasible to require compiler support for the nondefault bit order. We can then write portable record-representation clauses. These representation clauses correspond directly to endian-independent layout specifications: Endian-Independent Layout Specification 1 can be written portably either as

   for Date_And_Time_Type'Bit_Order use High_Order_First;
   for Date_And_Time_Type use
      record
         Years_Since_1980 at 0 range 0 .. 6;
         Month            at 0 range 7 .. 10;
         Day_Of_Month     at 0 range 11 .. 15;
         Hour             at 2 range 0 .. 4;
         Minute           at 2 range 5 .. 10;
         Seconds          at 2 range 11 .. 15;
      end record;

or as

   for Date_And_Time_Type'Bit_Order use Low_Order_First;
   for Date_And_Time_Type use
      record
         Years_Since_1980 at 0 range 9 .. 15;
         Month            at 0 range 5 .. 8;
         Day_Of_Month     at 0 range 0 .. 4;
         Hour             at 2 range 11 .. 15;
         Minute           at 2 range 5 .. 10;
         Seconds          at 2 range 0 .. 4;
      end record;

Endian-Independent Layout Specification 2 can be written portably either as

   for Date_And_Time_Type'Bit_Order use High_Order_First;
   for Date_And_Time_Type use
      record
         Years_Since_1980 at 0 range 0 .. 6;
         Month            at 0 range 7 .. 10;
         Day_Of_Month     at 0 range 11 .. 15;
         Hour             at 0 range 16 .. 20;
         Minute           at 0 range 21 .. 26;
         Seconds          at 0 range 27 .. 31;
      end record;

or as

   for Date_And_Time_Type'Bit_Order use Low_Order_First;
   for Date_And_Time_Type use
      record
         Years_Since_1980 at 0 range 25 .. 31;
         Month            at 0 range 21 .. 24;
         Day_Of_Month     at 0 range 16 .. 20;
         Hour             at 0 range 11 .. 15;
         Minute           at 0 range 5 .. 10;
         Seconds          at 0 range 0 .. 4;
      end record;

In all four record-representation clauses, each distinct byte-offset value corresponds to a distinct machine scalar (i.e., to a line of an endian-independent-layout-specification picture), and the ranges associated with a given byte offset correspond to the position of a field within that machine scalar.

****************************************************************

From: Robert Dewar
Sent: Tuesday, July 3, 2001 7:42 PM

We suddenly had two of our large customers ask, just days apart, whether there was a way of controlling bit ordering in arrays. The answer of course is no, but it does seem that it would be perfectly reasonable to allow the specification of the Bit_Order attribute for an array type ...

Thoughts?

****************************************************************

From: Randy Brukardt
Sent: Tuesday, July 3, 2001 8:14 PM

Could you be a bit more specific on what the need/problem is? Off-hand, I can't think of anything having to do with arrays for which the bit ordering would matter.
In particular, you don't specify bit numbers of arrays as you do with records.

****************************************************************

From: Robert Dewar
Sent: Tuesday, July 3, 2001 8:47 PM

   type x is array (0 .. 7) of Boolean;
   pragma Pack (x);

now, which bit is x(0)?

****************************************************************

From: Robert Duff
Sent: Thursday, July 5, 2001 11:52 AM

So it matters if you unchecked_convert to a type T is range 0..2**7-1. The Bit_Order indicates whether x(0) is the low- or high-order bit of the integer. Right?

Now what if it's not packed? Eg:

   X: array (0 .. 15) of Boolean;

Does Bit_Order control whether the array is stored backwards in memory (i.e., whether X(0)'Address = X'Address + 15)?

If we have:

   X: array (1 .. 100) of Character;
   Y: array (1 .. 100) of Character;

and X and Y are of different Bit_Order, and we unchecked convert X to Y, will Y(100) = X(1), and Y(99) = X(2)?

Or what if it's packed, and bigger than a storage unit, or bigger than a word?

I guess I'm confused about what the semantics should be when crossing byte or word boundaries. I'm not sure I understand the issues for records, either. :-(

****************************************************************

From: Robert Dewar
Sent: Thursday, July 5, 2001 12:26 PM

Well bit order for records simply controls the numbering of bits WITHIN a storage unit, it has no effect on numbering of bytes. If you have fields that cross storage unit boundaries, there are two cases:

1. The easy case, where the field occupies an integral number of bytes and completely occupies these bytes, e.g. a 32 bit field occupying four bytes. In this case, bit order has no relevance in any case, since the field will simply occupy these four bytes.

2. The hard case, where the field occupies part of a byte and crosses byte boundaries. In this case the specification of a non-standard bit order results in non-contiguous fields, and is a mess.
GNAT simply disallows the specification of bit order for any record with such a field.

For arrays, I think it only makes sense to worry about the case of 1,2,4 bits where every element lies entirely within a storage unit, and in that case bit order makes perfectly good sense.

****************************************************************

From: Robert Dewar
Sent: Wednesday, February 25, 2004 5:31 PM

> Is this AI waiting for someone to do some further work?
> My understanding is that Norm Cohen wrote up a good solution
> several years ago, but nothing much seems to have happened since then.

I remember Norm describing how to control bit ordering within the current language, but I do not remember any suggestions of language features.

What customers want of course is some magic incantation to allow their big endian dependent apps to run on little endian machines without any change. They are not going to get this :-)

There are some things that would be useful, though whether we should mandate them, I don't know.

It would be useful to be able to apply Bit_Order to a bit packed array to number the bits in opposite order. Perhaps this could be a more general feature of indexing arrays backwards (interface to Fortran-2 anyone? :-)

It might also be useful to be able to apply Bit_Order to a discrete type meaning that you allow Little-endian integers for example on a big-endian machine. Note that bit order and byte order must always be consistent, so controlling the bit order as a way of talking about controlling byte order is fine.

It is definitely mysterious to people that if they declare a record

   type R is record
      A, B : Integer;
   end record;
   for R'Bit_Order use ...

the Bit_Order spec has no effect at all. In fact this is sufficiently odd that in GNAT we warn that the pragma has no effect on fields A and B.
****************************************************************

From: Dan Eilers
Sent: Thursday, February 25, 2004 12:30 PM

Robert Dewar wrote:
> I remember Norm describing how to control bit ordering within
> the current language, but I do not remember any suggestions of
> language features.
>
> What customers want of course is some magic incantation to allow
> their big endian dependent apps to run on little endian machines
> without any change. They are not going to get this :-)

AI-133 points out that there are two different things a user might want, one of which is as you describe, wanting an app running on a little endian machine to magically do some sort of byte flipping so that it can deal with data represented in big-endian order. And yes, they are not going to get this.

The other thing that a user might want is to be able to write a record rep clause, for example describing the layout of an IEEE 32-bit float, or other "machine scalar" type, and want to specify the record layout in little-endian format using the existing Ada95 bit-order attribute, and be able to compile such a program on either a big- or little- endian machine and be able to use such record type to extract the bit fields of such an object. In this case, there is no byte flipping going on.

Norm Cohen's paper, which was added to the AI in June, 1999, (see http://www.ada-auth.org/ai-files/grab_bag/bitorder.pdf) shows that it is reasonable to expect a compiler to support endian-independent record rep clauses for machine scalar types, using a corrected definition of the existing bit-order attribute.

> It would be useful to be able to apply Bit_Order to a bit packed
> array to number the bits in opposite order. Perhaps this could be
> a more general feature of indexing arrays backwards (interface to
> Fortran-2 anyone? :-)

Yes, this would also be useful (VHDL does this, for example). But that would be a separate AI.
****************************************************************

From: Dan Eilers
Sent: Monday, March 1, 2004 12:57 PM

AI-133 has a note:

> [Author's note: Randy, would it be possible to run a compiler survey to see
> what compilers do with Bit_Order clauses that specify the nondefault bit
> ordering? I suppose that you could use the above example of the IEEE 32-bit
> format.]

Here is a proposed test program:

with system;
with unchecked_conversion;
with text_io; use text_io;
procedure ai133 is

   type flt is digits 6;
   for flt'size use 32;

   type IEEE_32_big_endian is
      record
         Sign : Integer range 0 .. 1;
         Exponent : Integer range 0 .. 2**8 - 1;
         Mantissa : Integer range 0 .. 2**23 - 1;
      end record;

   for IEEE_32_big_endian'bit_order use system.high_order_first;
   for IEEE_32_big_endian use
      record
         Sign at 0 range 0 .. 0;
         Exponent at 0 range 1 .. 8;
         Mantissa at 0 range 9 .. 31;
      end record;

   type IEEE_32_little_endian is
      record
         Sign : Integer range 0 .. 1;
         Exponent : Integer range 0 .. 2**8 - 1;
         Mantissa : Integer range 0 .. 2**23 - 1;
      end record;

   for IEEE_32_little_endian'bit_order use system.low_order_first;
   for IEEE_32_little_endian use
      record
         Sign at 0 range 31 .. 31;
         Exponent at 0 range 23 .. 30;
         Mantissa at 0 range 0 .. 22;
      end record;

   function conv is new unchecked_conversion(flt, IEEE_32_big_endian);
   function conv is new unchecked_conversion(flt, IEEE_32_little_endian);

   data: constant array(integer range <>) of flt :=
      (0.0, 0.0005, -1.0, 2.5, -37.25, 12345.75);

   ok: boolean := true;

begin
   for i in data'range loop
      declare
         x: flt := data(i);
         y_b: IEEE_32_big_endian := conv(x);
         y_l: IEEE_32_little_endian := conv(x);
      begin
         if y_b.sign /= y_l.sign then
            put_line("FAILED, problem with sign");
            ok := false;
         end if;
         if y_b.exponent /= y_l.exponent then
            put_line("FAILED, problem with exponent");
            put_line(integer'image(y_b.exponent));
            put_line(integer'image(y_l.exponent));
            ok := false;
         end if;
         if y_b.mantissa /= y_l.mantissa then
            put_line("FAILED, problem with mantissa");
            ok := false;
         end if;
         if flt'exponent(x)+126 /= y_b.exponent and not
            (y_b.exponent = 0 and flt'exponent(x) = 0) then
            put_line("FAILED, wrong exponent");
            ok := false;
         end if;
         if (y_b.sign = 0 and flt'fraction(x) < 0.0) or
            (y_b.sign = 1 and flt'fraction(x) >= 0.0) then
            put_line("FAILED, wrong sign");
            ok := false;
         end if;
      end;
   end loop;
   if ok then
      put_line("PASSED");
   end if;
end;

****************************************************************

From: Robert Dewar
Sent: Monday, March 1, 2004 6:04 PM

GNAT says

gcc -c ai133.adb
ai133.adb:20:19: attempt to specify non-contiguous field not permitted
ai133.adb:20:19: (caused by non-standard Bit_Order specified)
ai133.adb:21:19: attempt to specify non-contiguous field not permitted
ai133.adb:21:19: (caused by non-standard Bit_Order specified)
gnatmake: "ai133.adb" compilation error

This is as we expect. The RM specifically does not require support of non-contiguous fields, and this seems like a reasonable choice to me!

****************************************************************

From: Dan Eilers
Sent: Monday, March 1, 2004 6:13 PM

The AI does _not_ call for support of non-contiguous fields!
It calls for "proper" interpretation of component clauses in non-default bit order, such that consecutive bitoffsets are contiguous. This results in useful endian-independent rep clauses rather than un-useful errors about supposed attempts to specify non-contiguous fields.

****************************************************************

From: Robert Dewar
Sent: Monday, March 1, 2004 6:59 PM

OK, but GNAT implements the current definition in the RM, and the test program you have clearly tries to make non-contiguous fields according to that definition if you have a byte addressed machine.

So if you have some other meaning in mind, it would definitely be seriously incompatible (we find lots of our customers using bit order clauses with the meaning in the RM, so if this is changed, I would want to be very sure that there is not some serious incompatibility introduced).

****************************************************************

From: Dan Eilers
Sent: Monday, March 1, 2004 7:58 PM

I haven't found any incompatibility problem.

GNAT rejects non-default order rep clauses where fields span a byte boundary (as in the test case I sent), so these aren't a compatibility concern.

When the fields don't cross byte boundaries, as in the following example, GNAT interprets the component clauses consistent with AI-133.

with system;
with unchecked_conversion;
with text_io; use text_io;
procedure ai133b is

   type uint32 is range 0..2**32-1;
   for uint32'size use 32;

   type byte is range 0..255;
   for byte'size use 8;

   type rec is record
      a,b,c,d: byte;
   end record;
   for rec'size use 32;

   type rec_big_endian is record
      a,b,c,d: byte;
   end record;

   for rec_big_endian'bit_order use system.high_order_first;
   for rec_big_endian use record
      a at 0 range 0 .. 7;
      b at 0 range 8 ..15;
      c at 0 range 16..23;
      d at 0 range 24..31;
   end record;

   type rec_little_endian is record
      a,b,c,d: byte;
   end record;

   for rec_little_endian'bit_order use system.low_order_first;
   for rec_little_endian use record
      a at 0 range 24..31;
      b at 0 range 16..23;
      c at 0 range 8..15;
      d at 0 range 0.. 7;
   end record;

   function conv is new unchecked_conversion(uint32, rec_big_endian);
   function conv is new unchecked_conversion(uint32, rec_little_endian);

   ok: boolean := true;

   x: uint32 := 3*(1+2**8+2**16+2**24);
   y_b: rec_big_endian := conv(x);
   y_l: rec_little_endian := conv(x);

begin
   if y_b.a /= y_l.a then
      put_line("FAILED, problem with a");
      ok := false;
   end if;
   if y_b.b /= y_l.b then
      put_line("FAILED, problem with b");
      ok := false;
   end if;
   if y_b.c /= y_l.c then
      put_line("FAILED, problem with c");
      ok := false;
   end if;
   if y_b.d /= y_l.d then
      put_line("FAILED, problem with d");
      ok := false;
   end if;
   if ok then
      put_line("PASSED");
   end if;
end;

****************************************************************

From: Dan Eilers
Sent: Monday, March 1, 2004 8:12 PM

I spoke too soon. There _is_ a compatibility problem for GNAT when the fields don't cross byte boundaries. My example just didn't expose it. This one does.

with system;
with unchecked_conversion;
with text_io; use text_io;
procedure ai133c is

   type uint32 is range 0..2**32-1;
   for uint32'size use 32;

   type byte is range 0..255;
   for byte'size use 8;

   type rec is record
      a,b,c,d: byte;
   end record;
   for rec'size use 32;

   type rec_big_endian is record
      a,b,c,d: byte;
   end record;

   for rec_big_endian'bit_order use system.high_order_first;
   for rec_big_endian use record
      a at 0 range 0 .. 7;
      b at 0 range 8 ..15;
      c at 0 range 16..23;
      d at 0 range 24..31;
   end record;

   type rec_little_endian is record
      a,b,c,d: byte;
   end record;

   for rec_little_endian'bit_order use system.low_order_first;
   for rec_little_endian use record
      a at 0 range 24..31;
      b at 0 range 16..23;
      c at 0 range 8..15;
      d at 0 range 0.. 7;
   end record;

   function conv is new unchecked_conversion(uint32, rec_big_endian);
   function conv is new unchecked_conversion(uint32, rec_little_endian);

   ok: boolean := true;

   x: uint32 := 1+3*2**8+5*2**16+7*2**24;
   y_b: rec_big_endian := conv(x);
   y_l: rec_little_endian := conv(x);

begin
   if y_b.a /= y_l.a then
      put_line("FAILED, problem with a");
      ok := false;
   end if;
   if y_b.b /= y_l.b then
      put_line("FAILED, problem with b");
      ok := false;
   end if;
   if y_b.c /= y_l.c then
      put_line("FAILED, problem with c");
      ok := false;
   end if;
   if y_b.d /= y_l.d then
      put_line("FAILED, problem with d");
      ok := false;
   end if;
   if ok then
      put_line("PASSED");
   end if;
end;

****************************************************************

From: Robert Dewar
Sent: Monday, March 1, 2004 9:02 PM

Well I will have to analyze this more carefully. Do you think GNAT is wrong with respect to the definition in the Ada 95 RM -- it seems right to me. All the attribute does in Ada 95 is to renumber the bits in an addressing unit ...

****************************************************************

From: Randy Brukardt
Sent: Monday, March 1, 2004 9:38 PM

...
> Here is a proposed test program:
...
> type IEEE_32_big_endian is
>    record
>       Sign : Integer range 0 .. 1;
>       Exponent : Integer range 0 .. 2**8 - 1;
>       Mantissa : Integer range 0 .. 2**23 - 1;
>    end record;

Minor nit: This assumes that Integer is a 32-bit type, which is not guaranteed by the language and isn't true on at least one compiler...

****************************************************************

From: Dan Eilers
Sent: Tuesday, March 2, 2004 11:58 AM

Robert Dewar wrote:
> Well I will have to analyze this more carefully. Do you think GNAT
> is wrong with respect to the definition in the Ada 95 RM -- it seems
> right to me. All the attribute does in Ada 95 is to renumber the
> bits in an addressing unit ...

The existing RM wording is muddled, and subject to multiple interpretations.
RM 13.5.3 paragraph 2 says that bit_orders of high_order_first and low_order_first correspond to big endian and little endian, respectively.

GNAT gives the warning:

   warning: Bit_Order clause does not affect byte ordering

But it is nonsensical to interpret "big endian" and "little endian" in RM 13.5.3(2) as not referring to byte ordering.

If this turns out to be too big an incompatibility for GNAT users, then it might make sense to add a new attribute byte_order, which does what AI 133 prescribes for bit_order, and then make bit_order obsolete.

But I don't think the incompatibility is particularly bad. For cases where fields do cross byte boundaries there is no incompatibility because GNAT rejects those. And GNAT users are unlikely to use non-default bit_order rep clauses where fields don't cross byte boundaries, because GNAT warns that such rep clauses have no effect.

****************************************************************

From: Robert Dewar
Sent: Tuesday, March 2, 2004 4:09 PM

> But it is nonsensical to interpret "big endian" and "little endian"
> in RM 13.5.3(2) as not referring to byte ordering.

I strongly disagree that it is nonsensical. It makes perfect sense and is what was intended by the design. Big-endian and Little-endian clearly refer to both bit and byte order (since the two always go together -- yes yes, except on the 68K -- read my book to find out how idiotic that decision was :-)

Saying that something is nonsensical is not an argument, in fact it is a lack of argument :-)

Yes, that means it only works for an addressing unit, which means that it only is really useful if the addressing unit is large, hence the limitation in the RM, but a lot of our users have found it useful for ordering of bits within a byte, and indeed Randall Anderson and I worked out a scheme for completely handling big-little endian compatibility that works very nicely, but depends on having this renumbering of bits.

> If this turns out to be too big an incompatibility for GNAT users,
> then it might make sense to add a new attribute byte_order, which
> does what AI 133 prescribes for bit_order, and then make bit_order
> obsolete.

Requiring compilers to manipulate byte order is too large a change in my view, and in any case is not what the original AI was about!

> But I don't think the incompatibility is particularly bad.
> For cases where fields do cross byte boundaries there is no
> incompatibility because GNAT rejects those. And GNAT users
> are unlikely to use non-default bit_order rep clauses where
> fields don't cross byte boundaries, because GNAT warns that
         ^^^^^
You mean *do* not *don't*
> such rep clauses have no effect.

Wrong, the warning is on a field by field basis ...

By the way, the approach Randall and I worked out is to have the big-endian machine and little-endian machine order the fields in reverse order, which is easily done with a parametrized rep clause, then use bit order to deal with stuff within a byte, then byte flip the entire record -- yes I know, more detail needed :-)

****************************************************************

From: Dan Eilers
Sent: Tuesday, March 2, 2004 6:57 PM

> Big-endian and Little-endian
> clearly refer to both bit and byte order ...

On this we agree completely. When the bit_order attribute is used, it should be used to specify the interpretation of both the bit and byte order.

> Requiring compilers to manipulate byte order is too large a change
> in my view, and in any case is not what the original AI was about!

Correct. AI-133 has absolutely nothing to do with requiring compilers to manipulate byte order at run-time. Instead, it is about endian-independent rep clauses, with absolutely no run-time byte flipping or discontiguous fields.

> Yes, that means it only works for an addressing unit.
No, as shown by Norm Cohen's write up, a proper interpretation of the bit_order attribute makes it trivially easy to support endian-independent rep clauses that work for more than an addressing unit.

> > But I don't think the incompatibility is particularly bad.
> > For cases where fields do cross byte boundaries there is no
> > incompatibility because GNAT rejects those. And GNAT users
> > are unlikely to use non-default bit_order rep clauses where
> > fields don't cross byte boundaries, because GNAT warns that
>          ^^^^^
> You mean *do* not *don't*

No, I meant exactly what I said. The only incompatibility is when fields don't cross byte boundaries (the other case is rejected by GNAT, so no inconsistency), and such use is unlikely due to GNAT's warning that the attribute is interpreted not to do what the user wants.

> By the way, the approach Randall and I worked out is to have the
> big-endian machine and little-endian machine order the fields in
> reverse order, which is easily done with a parametrized rep clause,
> then use bit order to deal with stuff within a byte, then byte
> flip the entire record -- yes I know, more detail needed :-)

No such standing on one's head is required. Read the AI.

****************************************************************

From: Robert Dewar
Sent: Tuesday, March 2, 2004 8:22 PM

> On this we agree completely.
> When the bit_order attribute is used, it should be used to specify
> the interpretation of both the bit and byte order.

Well in fact it only specifies bit order, and AI133 does not change that.

>>Requiring compilers to manipulate byte order is too large a change
>>in my view, and in any case is not what the original AI was about!

> Correct. AI-133 has absolutely nothing to do with requiring
> compilers to manipulate byte order at run-time. Instead, it is
> about endian-independent rep clauses, with absolutely no run-time
> byte flipping or discontiguous fields.

Right, but I think it is useless to solve this sub problem.
The only interesting problem to solve is to interchange information between LE and BE machines, and the approach of this AI is simply not helpful in that regard.

>>Yes, that means it only works for an addressing unit.

> No, as shown by Norm Cohen's write up, a proper interpretation
> of the bit_order attribute makes it trivially easy to support
> endian-independent rep clauses that work for more than an addressing
> unit.

I find the NC approach unconvincing, I cannot think of a single customer problem that we have had in this area (and there are many) where this would have helped at all.

> No, I meant exactly what I said. The only incompatibility is
> when fields don't cross byte boundaries (the other case is
> rejected by GNAT, so no inconsistency), and such use is unlikely
> due to GNAT's warning that the attribute is interpreted not to do
> what the user wants.

You miss entirely the usual use of this attribute which is something like the following.

I want to transfer a byte between a LE and BE machine that has field A in the 4 ms bits, and field B in the 4 ls bits.

All I have to do is

   type R is record
      a,b : integer range 0 .. 15;
   end record;

   for R'Bit_Order use High_Order_First;

   for R use record
      A at 0 range 0 .. 3;
      B at 0 range 4 .. 7;
   end record;

This works perfectly and allows the same rep clause on the LE and BE machines, and allows data to be interchanged. No warning, because the rep clause does exactly what is expected. And this is a case where fields do not cross byte boundaries. So I really have no idea what you meant to say in the above para. What you did say makes no sense.

>>By the way, the approach Randall and I worked out is to have the
>>big-endian machine and little-endian machine order the fields in
>>reverse order, which is easily done with a parametrized rep clause,
>>then use bit order to deal with stuff within a byte, then byte
>>flip the entire record -- yes I know, more detail needed :-)

> No such standing on one's head is required.

Yes it is!

> Read the AI.

The whole point of the AI is to solve the problem of consistent layout of fields, *WITHOUT PROVIDING* LE/BE data interchange. That's clear in the AI. To solve the LE/BE data interchange issue, you HAVE to have byte flipping. There is no avoiding this, it's fundamental.

I find the AI in its current form worse than useless. It further complicates the understanding of this attribute in a bizarre and complex manner without providing any additional useful functionality.

The only problem worth solving here is interchange of data between the two machine types. The current definition does this within bytes but does not deal with fields going over byte boundaries or fields occupying more than one byte. The AI solves neither of these problems.

****************************************************************

From: Dan Eilers
Sent: Wednesday, March 3, 2004 11:35 AM

Today, Robert Dewar wrote:
> The only interesting problem to solve is to interchange information
> between LE and BE machines, and the approach of this AI is
> simply not helpful in that regard.

Last week, Robert Dewar wrote:
> What customers want of course is some magic incantation to allow
> their big endian dependent apps to run on little endian machines
> without any change. They are not going to get this :-)

Yes it is true that AI-133 is not the least bit helpful with regard to the data interchange problem, and yes it is true that Ada0Y is not going to provide a solution to that problem.

What AI-133 does provide is a solution to the endian-independent rep clause problem. This problem is not some unimportant subproblem. It has been the subject of more than one published technical article, it is the subject of repeated c.l.a. inquiries, it was the subject of more than one Ada9x revision request, and such revision requests were vetted by experts and accepted in the Ada9x revision requirements document.

However, Ada95's solution to endian-independent rep clauses was botched.
AI-133 fixes that in a way that does not cause any runtime overhead or any compile time complexity, or any difficulties in understanding the attribute. Consecutive bitoffsets in non-default bitorder are simply interpreted as specifying contiguous storage locations, just as they do in default bitorder. We have implemented Norm's solution. It was trivial to do, and it works great. Our user's don't get any nonsensical error messages about purported attempts to specify non-contiguous fields, or any nonsensical warnings about the bit_order attribute not controlling byte order. > You miss entirely the usual use of this attribute which is something > like the following. > > I want to transfer a byte between a LE and BE machine that has > field A in the 4 ms bits, and field B in the 4 ls bits. The AI makes it crystal clear that the bit_order attribute is _not_ intended to solve the problem of data interchange between BE and LE machines. **************************************************************** From: Robert Dewar Sent: Wednesday, March 3, 2004 6:49 PM > What AI-133 does provide is a solution to is the endian-independent > rep clause problem. This problem is not some unimportant subproblem. > It has been the subject of more than one published technical article, > it is the subject of repeated c.l.a. inquiries, it was the subject of > more than one Ada9x revision request, and such revision revision requests > were vetted by experts and accepted in the Ada9x revision requirements > document. The Ada 95 RM itself was vetted by experts, and what is says is reasonable, and in fact the facilities there DO solve the endian transfer problem for a limited set of cases. > However, Ada95's solution to endian-independent rep clauses was botched. I do not agree that this is botched. > AI-133 fixes that in a way that does not cause any runtime overhead or > any compile time complexity, or any difficulties in understanding the > attribute. 
> Consecutive bitoffsets in non-default bitorder are simply
> interpreted as specifying contiguous storage locations, just as they do
> in default bitorder.

I do not regard AI-133 as useful.

> We have implemented Norm's solution. It was trivial to do, and
> it works great.

I trust this is under some switch, since I think it is clearly non-conforming behavior otherwise.

> Our users don't get any nonsensical error messages
> about purported attempts to specify non-contiguous fields, or any
> nonsensical warnings about the bit_order attribute not controlling
> byte order.

Those warnings are extremely important in practice. Every user who has queried the warnings was in fact trying to solve the problem of moving data between machines of different endianness.

>>You miss entirely the usual use of this attribute which is something
>>like the following.
>>
>>I want to transfer a byte between a LE and BE machine that has
>>field A in the 4 ms bits, and field B in the 4 ls bits.

> The AI makes it crystal clear that the bit_order attribute is _not_
> intended to solve the problem of data interchange between BE and LE
> machines.

Sorry, but the AI is irrelevant. The feature as described in the RM today *IS* useful in solving the data interchange problem, and is used that way routinely. You may read the AI this way, but even if this reading of the AI is correct, the AI has no normative effect on the standard.

Actually what I think the AI is saying is that Bit_Order cannot be used to solve all such problems. And that I agree with!

****************************************************************

From: Dan Eilers
Sent: Wednesday, March 3, 2004 7:05 PM

Robert Dewar wrote:
> The Ada 95 RM itself was vetted by experts, and what it says is
> reasonable, and in fact the facilities there DO solve the endian
> transfer problem for a limited set of cases.

We can agree to disagree that what the RM currently says about bit_order is reasonable.
Certainly there is no dispute that the current RM wording does not solve the problem of endian-independent rep clauses.

> I do not regard AI-133 as useful.

We can agree to disagree on this. But certainly AI-133 _does_ solve the problem of endian-independent rep clauses.

> > We have implemented Norm's solution. It was trivial to do, and
> > it works great.
>
> I trust this is under some switch, since I think it is clearly
> non-conforming behavior otherwise.

No, it's in standard mode, using Dewar's rule that the RM does not require nonsense, and discontiguous fields are nonsense.

> > Our users don't get any nonsensical error messages
> > about purported attempts to specify non-contiguous fields, or any
> > nonsensical warnings about the bit_order attribute not controlling
> > byte order.
>
> Those warnings are extremely important in practice.

I agree that under your interpretation of the bit_order attribute, it is extremely important to warn users of this unexpected behavior.

> Actually what I think the AI is saying is that Bit_Order cannot be
> used to solve all such problems. And that I agree with!

The AI says much more than this. It proposes a perfectly fine solution to the endian-independent rep clause problem.

****************************************************************

From: Robert Dewar
Sent: Wednesday, March 3, 2004 7:38 PM

Dan Eilers wrote:
> No, it's in standard mode, using Dewar's rule that the RM does
> not require nonsense, and discontiguous fields are nonsense.

Not at all. Non-contiguous fields definitely arise when transferring data from LE to BE machines. It would indeed be useful to support them generally. We have this on a list of useful (but difficult) enhancements.

> I agree that under your interpretation of the bit_order attribute,
> it is extremely important to warn users of this unexpected behavior.

You miss the point.
In *all* the cases where we have discussed this warning, customers wanted to solve the problem of transferring data between machines. We would still give the warning even if the AI were implemented, since in our experience, the interpretation that the AI gives by default is NOT what the customer needed, wanted, or expected.

> The AI says much more than this. It proposes a perfectly fine
> solution to the endian-independent rep clause problem.

First, I do not agree that this is a significant issue, divorced from the data independence issue. Second, the feature as it exists now IS useful for solving data independence, and is best seen as a limited facility for this purpose. That's the way the RM is written now, and I prefer the way it is written now to your preferred rewriting. If you need a solution to the problem of endian independent rep clauses (they are quite easy to write now with appropriate parametrization), I think it is a bad idea to hijack this feature in this manner, and I find the fundamental trick in this AI to be an unpleasant and confusing kludge.

****************************************************************

From: Dan Eilers
Sent: Wednesday, March 3, 2004 8:29 PM

Robert Dewar wrote:
> Non-contiguous fields definitely arise when transferring
> data from LE to BE machines. It would indeed be useful to support
> them generally. We have this on a list of useful (but difficult)
> enhancements.

I fully agree that it would be useful for Ada at some point to solve the data interoperability problem. But since nobody has proposed a solution, I don't see any chance for 200Y. Feel free to propose an AI for that, but don't hijack AI-133, and don't dismiss AI-133 on the grounds that it doesn't solve that problem. I continue to believe that the bit-order attribute is not intended as a solution to the data interoperability problem.
This was an early Ada9x study topic, but it got dropped from the final revision requirements document, although solving the endian-independent rep clauses problem was retained as an Ada9x requirement.

> We would still give the warning even if the AI
> were implemented, since in our experience, the interpretation that
> the AI gives by default is NOT what the customer needed, wanted,
> or expected.

Well, if AI-133 were implemented, the revised 200Y RM would presumably make it clear to users what to expect from the feature. If you still wanted to give a warning, that would be fine with me.

> First, I do not agree that this is a significant issue, divorced
> from the data independence issue.

We can agree to disagree on this. Certainly users raise this issue repeatedly, particularly since Ada doesn't support conditional compilation. Endian-independent rep clauses also happen to come in handy for implementing the floating point attributes, such as 'exponent.

> Second, the feature as it exists
> now IS useful for solving data independence, and is best seen as
> a limited facility for this purpose.

We can agree to disagree on this.

****************************************************************

From: Robert Dewar
Sent: Wednesday, March 3, 2004 9:00 PM

> We can agree to disagree on this.

Not really, the disagreement has consequences (whether or not to proceed on this AI), so an agreement of this nature settles nothing.

****************************************************************

From: Robert Dewar
Sent: Wednesday, March 3, 2004 9:02 PM

> I fully agree that it would be useful for Ada at some point to solve
> the data interoperability problem. But since nobody has proposed
> a solution, I don't see any chance for 200Y. Feel free to propose
> an AI for that, but don't hijack AI-133, and don't dismiss AI-133
> on the grounds that it doesn't solve that problem.

I repeat I find the "solution" to be a horrible and confusing kludge.
It also prevents a more useful implementation of Bit_Order (GNAT warns now, but it could do things properly instead).

****************************************************************

From: Dan Eilers
Sent: Wednesday, March 3, 2004 10:46 PM

Robert Dewar wrote:
> > We can agree to disagree on this.
>
> Not really, the disagreement has consequences (whether or
> not to proceed on this AI), so an agreement of this nature
> settles nothing.

We can agree to let the ARG settle the dispute. Clearly we are not going to change each other's opinion by repeatedly saying we think the feature is useful or we think it's not.

> I repeat I find the "solution" to be a horrible and confusing
> kludge. It also prevents a more useful implementation of Bit_Order
> (GNAT warns now, but it could do things properly instead).

It would move the discussion along if you said what it was about the proposed "solution" that you find to be so horrible. Is it simply that the solution gets in the way of future GNAT plans for solving the data interoperability problem? Is it the way the solution is worded? Surely you don't find it horrible and confusing that a user would want to write portable rep clauses. It's certainly less of a kludge than the "parameterized" rep clauses you suggested as an alternative. When IEEE defines the layout of floating point types do they use "parameterized" descriptions?

****************************************************************
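
The machine-scalar interpretation argued over in the messages above can be sketched numerically. The Python model below is an illustration only: it is not text from the AI and not code from any compiler. The function names, the set of machine scalar sizes (8, 16, 32 bits), and the renumbering formula (bit k in the nondefault order taken to be bit size-1-k in the default order) are all assumptions made for the sketch. It covers only component_clauses whose last_bit is smaller than the largest machine scalar.

```python
# Illustrative model only; names and sizes are assumptions, not taken from the AI.
MACHINE_SCALAR_SIZES = (8, 16, 32)  # assumed sizes in bits (implementation defined)


def machine_scalar_size(last_bits, sizes=MACHINE_SCALAR_SIZES):
    """Return the smallest machine scalar size larger than the largest last_bit."""
    largest = max(last_bits)
    for size in sorted(sizes):
        if size > largest:
            return size
    # Clauses this large are outside the scope of this sketch.
    raise ValueError("last_bit is not smaller than the largest machine scalar")


def layout_nondefault(clauses, sizes=MACHINE_SCALAR_SIZES):
    """Map component clauses written in the nondefault bit order to bit ranges
    numbered in the default bit order within their machine scalar.

    clauses: {name: (position, first_bit, last_bit)}
    returns: {name: (position, first, last, scalar_size)}
    """
    # All clauses sharing a position are treated as one machine scalar.
    last_bits_at = {}
    for position, _first, last in clauses.values():
        last_bits_at.setdefault(position, []).append(last)

    result = {}
    for name, (position, first, last) in clauses.items():
        size = machine_scalar_size(last_bits_at[position], sizes)
        # Assumed renumbering: nondefault bit k is default bit size - 1 - k.
        result[name] = (position, size - 1 - last, size - 1 - first, size)
    return result


if __name__ == "__main__":
    # A 12-bit field followed by a 4-bit field, both at position 0.
    example = {"X": (0, 0, 11), "Y": (0, 12, 15)}
    for name, placement in layout_nondefault(example).items():
        print(name, placement)
```

In this model the two clauses at position 0 share one 16-bit machine scalar, and X and Y each come out as a single contiguous bit range in the default numbering. That is the endian-independent rep clause property claimed for the AI. The model does no byte flipping, so it says nothing about data interchange between machines.

****************************************************************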