Version 1.1 of acs/ac-00023.txt

!standard B.01(30)          02-01-23 AC95-00023/01
!standard B.01(41)
!class confirmation 02-01-06
!status received no action 02-01-06
!subject Component alignment precedence: pragma Convention vs compiler default
!summary
!appendix

From: Marc A. Criley
Sent: Sunday, January 6, 2002  4:15 PM

While porting a large code base that involved passing messages (records)
between Ada and C, I encountered a record component layout issue
that startled me.  I've been adding pragma Convention (C, ...) to avoid such
problems.

I see three possibilities:
  1) I am misunderstanding something or making unwarranted assumptions.
  2) There is a compiler bug.
  3) There is an implementation freedom for compilers that I feel should
be restricted or clarified (and therefore is the reason I bring it to
this forum).

Here are the particulars:

The C structs, wherein "int" occupies 32 bits:

  typedef struct
  {
    int Seconds;
    int Milliseconds;
  } TimeType;

  typedef struct
  {
    int      ID;
    TimeType TLO_Time;
  } TLO_Data;

The analogous Ada definitions:

  type Time_Type is record
      Seconds      : Seconds_Type;  -- 32 bits
      Milliseconds : MS_Type;       -- 32 bits
  end record;

  type TLO_Data is record
      ID       : ID_Type;  -- 32 bits
      TLO_Time : Time_Type;
  end record;
  pragma Convention(C, TLO_Data);

The operating environment is Rational Apex 4.0, on Solaris 7, a "64 bit
operating system".

The problem arose with the C side placing each struct component in adjacent
words with 4 bytes of padding at the end, i.e.
ID/Seconds/Milliseconds/Pad32.

Apex laid the record out as ID/Pad32/Seconds/Milliseconds, which obviously
triggered unexpected effects.

No alignment clauses were applied to the Ada records, so component alignment
would be controlled by the compiler chosen default alignment (which for Apex
4.0 on Solaris 7 is "mod 8") and any layout requirements resulting from
specifying pragma Convention (C, ...).
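
The two layouts can be reproduced on the C side as a rough sketch.  It
assumes a C11 compiler and uses int32_t in place of int so that the 32-bit
width is explicit.  The types TimeTypeAligned8 and TLO_DataAsApex and the two
helper functions are hypothetical names, introduced only to imitate the
"mod 8" default alignment described above; they are not anything Apex itself
generates.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* int32_t */

/* The C side as given in the message: every field is 32 bits wide. */
typedef struct {
    int32_t Seconds;
    int32_t Milliseconds;
} TimeType;

typedef struct {
    int32_t  ID;
    TimeType TLO_Time;
} TLO_Data;

/* Hypothetical imitation of the Ada-side layout: the same fields, but the
   nested record is forced to 8-byte alignment, as a "mod 8" default would. */
typedef struct {
    _Alignas(8) int32_t Seconds;   /* raises the whole struct's alignment to 8 */
    int32_t Milliseconds;
} TimeTypeAligned8;

typedef struct {
    int32_t          ID;
    TimeTypeAligned8 TLO_Time;
} TLO_DataAsApex;

static_assert(_Alignof(TimeTypeAligned8) == 8,
              "imitated nested record should be 8-byte aligned");

/* Byte offset of TLO_Time in the plain C layout. */
size_t c_offset_of_time(void)
{
    return offsetof(TLO_Data, TLO_Time);
}

/* Byte offset of TLO_Time when the nested record is 8-byte aligned. */
size_t apex_like_offset_of_time(void)
{
    return offsetof(TLO_DataAsApex, TLO_Time);
}
```

On a typical target where int32_t is 4-byte aligned, the first function
reports offset 4 (ID/Seconds/Milliseconds) and the second reports offset 8
(ID/Pad32/Seconds/Milliseconds), which is the mismatch described above.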


Is this a situation where pragma Convention is required to be applied to
Time_Type as well?  I.e., programmer error?  It was believed, perhaps
incorrectly, that pragma Convention would ensure the record subcomponent
(Time_Type) would at least _start_ at the address subsequent to ID, as
occurs in the C code.

There is of course no notion that Convention should in any way be
recursively applied to a record's subcomponents, but is it reasonable to
expect that the layout of the record to which the Convention pragma has been
applied would at least lay out its components as monolithic blocks of data
that conform to the practices of the specified convention language?

If the answer to this is "No", meaning that either pragma Convention or a
suitable alignment clause must be specified for the subcomponent record
(Time_Type), then it is simply a matter of programmer error, though one
might ask for some stronger wording to this effect in the RM.

If the answer is "Yes", then where is the precedence of pragma Convention
over a compiler's default layout scheme stated in the RM?  If there is such
a reference, this becomes a compiler bug, otherwise clarification is
required in the RM.

Marc A. Criley
Consultant
Quadrus Corporation
www.quadruscorp.com

****************************************************************

From: Robert Dewar
Sent: Sunday, January 6, 2002  6:04 PM

If you have components in a record which are themselves not Convention C,
then you have no right to expect anything particular unless your compiler
documents additional constraints (e.g. in general, if you did not give
pragma Convention C for Seconds, then all bets are off, except that GNAT
guarantees that 32-bit integer types are in fact represented the same as
int in C).

****************************************************************

From: Robert Dewar
Sent: Sunday, January 6, 2002  6:04 PM

I really would suggest that ada-comment only be addressed for something like
this AFTER it has been discussed elsewhere. This should really NOT be a
general list for chatting about Ada issues.

****************************************************************

From: Pascal Leroy
Sent: Monday, January 7, 2002  4:40 AM

> I see three possibilities:
>   1) I am misunderstanding something or making unwarranted assumptions.
>   2) There is a compiler bug.
>   3) There is an implementation freedom for compilers that I feel should
> be restricted or clarified (and therefore is the reason I bring it to
> this forum).

As Robert mentioned, this is not the proper forum for discussing user
misunderstandings or compiler bugs.  You have to realize that every message
sent to Ada Comment is a comment on an ISO standard, and therefore requires
some formal handling.  More messages mean more administrative work and
therefore less time to do productive work.  Rational has both a support
organization and user forums where such issues can be addressed much more
productively.

This being said, let me explain what's going on here:

>   type Time_Type is record
>       Seconds      : Seconds_Type;  -- 32 bits
>       Milliseconds : MS_Type;       -- 32 bits
>   end record;
>
>   type TLO_Data is record
>       ID       : ID_Type;  -- 32 bits
>       TLO_Time : Time_Type;
>   end record;
>   pragma Convention(C, TLO_Data);
>
> The operating environment is Rational Apex 4.0, on Solaris 7, a "64 bit
> operating system".
>
> Apex laid the record out as ID/Pad32/Seconds/Milliseconds, which obviously
> triggered unexpected effects.
>
> No alignment clauses were applied to the Ada records, so component alignment
> would be controlled by the compiler chosen default alignment (which for Apex
> 4.0 on Solaris 7 is "mod 8") and any layout requirements resulting from
> specifying pragma Convention (C, ...).

In the absence of a pragma Convention (C) on Time_Type, Apex is free to choose
whatever representation it likes.  It exercises this freedom by selecting
8-byte alignment for this type (improving access time at the expense of space).
Now when laying out type TLO_Data, it uses textual order for the components
(this is required by convention C).  But other than that, each component
inherits the properties of its type.  In particular, the component TLO_Time
must be aligned on an 8-byte boundary, which requires the insertion of a 4-byte
gap after ID.
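
The size of the gap follows from rounding the next free offset up to the
component's alignment.  A minimal sketch of that arithmetic, with align_up
and gap_before as hypothetical helper names and power-of-two alignments
assumed:

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* size_t */

/* Round `offset` up to the next multiple of `alignment`.
   Assumption: `alignment` is a non-zero power of two. */
size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Padding bytes needed before a component whose first candidate
   position is `offset` and whose type requires `alignment`. */
size_t gap_before(size_t offset, size_t alignment)
{
    return align_up(offset, alignment) - offset;
}
```

With ID occupying bytes 0 to 3, the next free offset is 4.  An
8-byte-aligned TLO_Time therefore starts at align_up(4, 8), which is 8,
leaving a 4-byte gap; a 4-byte-aligned component would start at offset 4
with no gap.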

So the root of the problem is indeed a user misunderstanding.  However, I also
believe that there is a compiler bug here.  The pragma should really be
rejected because Time_Type doesn't have convention C (this is an
implementation-dependent decision, of course, but I am talking about the intent
of the person who made these implementation-dependent decisions); that's
because it is much more friendly to produce an error at compile time, and
require the user to write an extra pragma, than to silently produce a layout
which is very likely to be surprising.

In other words, I strongly encourage you to submit a problem report to your
favorite vendor ;-)

****************************************************************
