Version 1.3 of ais/ai-00216.txt
!standard 03.08.01 (01) 00-11-15 AI95-00216/03
!class amendment 99-03-23
!status work item 99-03-23
!status received 99-03-23
!priority Medium
!difficulty Hard
!subject Unchecked Unions -- Variant Records With No Run-Time Discriminant
!summary
We propose to formalize the rules for the pragma Unchecked_Union, which
is now supported by several Ada 95 compilers. The proposed rules
are intended to be flexible enough to accommodate all reasonable uses,
while preserving a modicum of safety.
!problem
The Ada 95 standard does not include a mechanism for mapping C unions to
Ada data structures. This capability is important enough that several Ada 95
compilers have defined a method to support unions.
!proposal
    pragma Unchecked_Union(first_subtype_mark);

This pragma is intended for use with a type with a variant part
(potentially with further nested variant parts), that is to be
manipulated in C using union(s). The pragma
specifies that the associated type should be given
a representation that leaves no space for the discriminant(s)
of the type (unless overridden by a component clause that specifies
a location for the discriminant). Furthermore, the effect of this pragma
includes an implicit suppression of Discriminant_Check on the specified type,
and an implicit convention of C (which may be overridden).
The specified type may have a non-variant part preceding the
variant part, which would correspond to a C struct containing
a union as its last component. A variant may have multiple
components, which would correspond to a C union with a struct
as one of its elements. A variant may have a nested variant part,
which would correspond to a nested C union. The type may
have multiple discriminants, to support the possibly
nested unions being selected along different dimensions.
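As a rough illustration of the C shapes this paragraph has in mind, the sketch below combines a non-variant part, a variant with several components, and a nested union. All names (shape, point, circle, style, circle_radius, union_offset) are hypothetical and do not come from the proposal; the two unions are selected along two independent conceptual dimensions, neither of which is stored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C type, not taken from the proposal. */
struct shape {
    int id;                      /* non-variant part, precedes the union   */
    union {                      /* outer union: first conceptual selector */
        struct {                 /* a variant with multiple components     */
            int x;
            int y;
        } point;
        struct {
            int radius;
            union {              /* nested union: second conceptual selector */
                int   solid_color;
                float dash_length;
            } style;
        } circle;
    } u;                         /* the union is the last component */
};

/* Reads a component of the "circle" variant; the caller must know,
   by some external means, that this variant is the active one. */
static int circle_radius(const struct shape *s)
{
    return s->u.circle.radius;
}

/* Offset of the union within the struct: it follows the non-variant part. */
static size_t union_offset(void)
{
    return offsetof(struct shape, u);
}
```

On the Ada side, a type like this would presumably need two discriminants, one governing the outer variant part and one governing the nested variant part.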
The Ada type may, but need not, have defaults for all discriminants.
All objects of the type, even if limited or allocated in the heap
(and hence effectively constrained by the initial discriminant value(s)),
should be allocated the size C would allocate for the corresponding
struct/union, which will be at least the maximum size required for any
discriminant value. This is because any whole-object assignments performed
to or from such an object by the C code will generally involve this maximum
size, even if they preserve the (conceptual) discriminant value.
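The size argument can be seen from the C side with a minimal sketch. The type and function names here (small_variant, large_variant, payload, check_payload_layout, copy_payload) are assumptions made for illustration only. A C union is at least as large as its largest member, and a whole-object assignment operates on the entire union regardless of which member is conceptually active.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variants of very different sizes. */
struct small_variant { char tag_byte; };
struct large_variant { double values[4]; };

union payload {
    struct small_variant small;
    struct large_variant large;
};

/* The union must be able to hold either variant. */
static void check_payload_layout(void)
{
    assert(sizeof(union payload) >= sizeof(struct small_variant));
    assert(sizeof(union payload) >= sizeof(struct large_variant));
}

/* Whole-object assignment: the entire union object is assigned,
   whichever member the program regards as active. */
static union payload copy_payload(const union payload *src)
{
    union payload dst = *src;
    return dst;
}
```

An Ada object that C code may overwrite in this way therefore needs room for the largest variant, even if its own (conceptual) discriminant selects a smaller one.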
[Note that requiring defaults is not really necessary, but it
seems benign, and suggests to the reader that objects of such a type
are always allocated the max size, and are generally "mutable."]
Each discriminant of an object of an unchecked_union type
must be specified (explicitly or implicitly) when the object is created,
even though its value is not stored (unless a location is specified via a
component clause). The value is needed to select which variant(s) receive
default initialization, and to determine which components should appear in a
record aggregate.
Within the definition of an unchecked_union type, the discriminant(s)
may not be used in a constraint in a component_definition, unless
the type of the component is itself an unchecked_union type. This
is to ensure that the size of the component does not depend on the
value of the discriminant. Note that the discriminant may be used
to govern a variant part, or as a default initial value for
a component, or within a component clause of a record representation
clause.
Outside the definition of the object's type, a discriminant
of an object of an unchecked_union type must not be read.
Constrained subtypes of an unchecked_union type are permitted,
as this may be necessary to properly specify the (initial) discriminant
value for a variable or subcomponent having the type. It is erroneous to
perform any operation (in C or Ada) that would have failed a discriminant
check had the discriminant been present at run-time.
The pragma Unchecked_Union may be applied to a derived type,
provided that its ultimate ancestor type meets the requirements for the pragma.
Converting the derived type to an unconstrained subtype of
an ancestor (checked) type raises Program_Error, because there is
no way to determine the values for the discriminants.
Converting to a constrained subtype is permitted, as the
discriminant values are implied by the constraint (as above,
the conversion is erroneous if it would have failed a discriminant check).
Converting from an ancestor (checked) type to the derived
type is permitted, and simply drops the discriminant(s) (and performs
whatever other representation adjustments are necessary).
If the target (unchecked) subtype is constrained, a constraint check
is performed on the value of the checked type prior to dropping the
discriminants. (These conversion rules are intended to allow
an Ada program to primarily manipulate a checked type, and then
convert to/from an unchecked type just before and after communicating
with C code.)
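The conversion rules can be pictured with a loose C analogy; this is only a sketch of the idea, not how an Ada compiler would implement the conversions. All names (sym_kind, checked_val, unchecked_val, drop_kind, attach_kind) are hypothetical. Going from the checked layout to the unchecked one simply drops the stored kind; going back is possible only when the kind is supplied from outside, which mirrors conversion to a constrained subtype.

```c
#include <assert.h>

enum sym_kind { OBJ_KIND, PKG_KIND };

/* "Checked" layout: the kind (discriminant) is stored. */
struct checked_val {
    enum sym_kind kind;
    union {
        int      obj_val;
        unsigned pkg_num_components;
    } u;
};

/* "Unchecked" layout: what the C code sees; no kind is stored. */
struct unchecked_val {
    union {
        int      obj_val;
        unsigned pkg_num_components;
    } u;
};

/* Checked to unchecked: copy the active variant and drop the kind. */
static struct unchecked_val drop_kind(const struct checked_val *c)
{
    struct unchecked_val r;
    if (c->kind == OBJ_KIND) {
        r.u.obj_val = c->u.obj_val;
    } else {
        r.u.pkg_num_components = c->u.pkg_num_components;
    }
    return r;
}

/* Unchecked to checked: the kind cannot be recovered from the value,
   so the caller must supply it. A wrong kind goes undetected here. */
static struct checked_val attach_kind(const struct unchecked_val *v,
                                      enum sym_kind kind)
{
    struct checked_val r;
    assert(kind == OBJ_KIND || kind == PKG_KIND);
    r.kind = kind;
    if (kind == OBJ_KIND) {
        r.u.obj_val = v->u.obj_val;
    } else {
        r.u.pkg_num_components = v->u.pkg_num_components;
    }
    return r;
}
```

There is deliberately no variant of attach_kind that takes only the unchecked value: that would correspond to converting to an unconstrained subtype, which the proposal says raises Program_Error.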
To avoid generic contract violations involving discriminant references in
the body of a generic, the following restrictions apply to an instantiation
whose actual type is an unchecked union: if the formal type is private, it
must not have known discriminants; if the formal type is derived, the
specified ancestor type must also be an unchecked union.
To avoid other kinds of generic contract violations, if the type is
non-limited, all of the normal operations of a non-limited type exist,
including assignment, equality, membership, 'Read, 'Write, 'Input,
'Output, etc. Assignment is defined in terms of a block copy on all bits
of the representation. The other operations all raise Program_Error,
because they generally require reading the value of the discriminant
to give a meaningful result.
Record representation clauses are permitted for unchecked unions.
By default, no space is given for a discriminant unless it
is mentioned explicitly in a component clause.
!wording
!example
Given the C type:

    struct sym {
        int id;
        char *name;
        union {
            struct {
                struct sym *obj_type;
                int obj_val_if_known;
            } obj;
            struct {
                struct sym *pkg_first_component;
                int pkg_num_components;
            } pkg;
        } u;
    };

This would map to the following unchecked_union type in Ada:

    type Sym;
    type Sym_Ptr is access Sym;
    type Sym_Kind_Enum is (Obj_Kind, Pkg_Kind);
    type Sym(Kind : Sym_Kind_Enum := Sym_Kind_Enum'First) is record
        Id : Integer := Assign_Id(Kind);
        Name : C_Ptr;
        case Kind is
            when Obj_Kind =>
                Obj_Type : Sym_Ptr;
                Obj_Val_If_Known : Integer := -1;
            when Pkg_Kind =>
                Pkg_First_Component : Sym_Ptr;
                Pkg_Num_Components : Integer := 0;
        end case;
    end record;
    pragma Unchecked_Union(Sym);

!discussion
Several Ada 95 compilers now support a pragma Unchecked_Union for
specifying that the discriminant of a variant record should not be
present at run-time, thereby matching a C union.
However, these compilers differ in the legality rules they enforce
relative to the type.
This proposal is designed to allow what in C is essentially
one (named) type definition to become, similarly, one type
definition in Ada. It is quite common to use anonymous
unions and structs nested within one another in C. In Ada, the
natural mapping for this is non-variant and variant parts,
and nested variants. The need to eliminate the storage for
the discriminant exists for these more complex uses of "union" in C
just as much as it exists for a very simple use of "union." Hence,
we propose to allow unchecked_union types to include non-variant
parts along with variant parts, to include multiple components
per variant, and to support nested variants, to maximize the likelihood
that there is a natural mapping possible between the C type
structure and the Ada type structure.
We have also defined rules that should eliminate generic contract
model violations related to unchecked_unions, as this seems important
if such types are going to be used in relatively portable C interfacing
code with compilers that may share generics.
!appendix
Randy Brukardt 99-03-29
I've reformatted Tucker's submission into the format discussed at
the recent ARG meeting. Note that I haven't made any of the changes
considered at that meeting.
*************************************************************
From: Jean-Pierre Rosen
Sent: Friday, October 1, 1999 at 2:09 AM
Agreed, just a suggestion: [Editor's note: this is a comment on version 2]
> To avoid other kinds of generic contract violations, if the type is
> non-limited, all of the normal operations of a non-limited type exist,
> including assignment, equality, membership, 'Read, 'Write, 'Input,
> 'Output, etc. Assignment is defined in terms of a block copy on all bits
> of the representation. The other operations all raise Program_Error,
> because they generally require reading the value of the discriminant
> to give a meaningful result.
>
It seems to me that completely forbidding the use of 'Read and 'Write is
too strict; there are certainly cases where you want to send such types over
a network!
'Read and 'Write could be defined like the case where there are no default
values (i.e. no discriminants written).
'input and 'output could raise Program_Error, or do the same.
OTOH, it is always possible for the user to redefine 'read and 'write...
Well, in that case, it is important to NOT define 'input and 'output as
raising P_E, since their normal behaviour (for discriminated defaulted
types) is simply to call 'Read and 'Write.
*************************************************************
From: Robert A Duff
Sent: Friday, October 1, 1999 at 8:57 AM
> It seems to me that completely forbidding the use of 'Read and 'Write is
> too strict; there are certainly cases where you want to send such types over
> a network!
> 'Read and 'Write could be defined like the case where there are no default
> values (i.e. no discriminants written).
> 'input and 'output could raise Program_Error, or do the same.
>
> OTOH, it is always possible for the user to redefine 'read and 'write...
As Tucker mentioned in the meeting, another way to get the operations
that raise Program_Error is to first define the variant record as a
normal type, then derive from it, and declare the derived type to be an
unchecked_union. Then you can convert values to the parent type, and do
all the normal stuff without getting Program_Error.
Oh, and by the way, we decided that you could convert to constrained
subtypes of the parent type (contrary to what Tuck's write-up says).
Of course you can't convert to an unconstrained subtype, because the
compiler can't know what the discriminant value should be.
> Well, in that case, it is important to NOT define 'input and 'output as
> raising P_E, since their normal behaviour (for discriminated defaulted
> types) is simply to call 'Read and 'Write.
Good point.
- Bob
*************************************************************