Version 1.3 of ais/ai-00216.txt


!standard 03.08.01 (01)          00-11-15 AI95-00216/03
!class amendment 99-03-23
!status work item 99-03-23
!status received 99-03-23
!priority Medium
!difficulty Hard
!subject Unchecked Unions -- Variant Records With No Run-Time Discriminant
!summary
We propose to formalize the rules for the pragma Unchecked_Union, which is now supported by several Ada 95 compilers. The proposed rules are intended to be flexible enough to accommodate all reasonable uses, while preserving a modicum of safety.
!problem
The Ada 95 standard does not include a mechanism for mapping C unions to Ada data structures. This capability is important enough that several Ada 95 compilers have defined a method to support unions.
!proposal
pragma Unchecked_Union(first_subtype_mark);
This pragma is intended for use with a type with a variant part (potentially with further nested variant parts), that is to be manipulated in C using union(s). The pragma specifies that the associated type should be given a representation that leaves no space for the discriminant(s) of the type (unless overridden by a component clause that specifies a location for the discriminant). Furthermore, the effect of this pragma includes an implicit suppress of Discriminant_Check on the specified type, and an implicit convention of C (which may be overridden).
The specified type may have a non-variant part preceding the variant part, which would correspond to a C struct containing a union as its last component. A variant may have multiple components, which would correspond to a C union with a struct as one of its elements. A variant may have a nested variant part, which would correspond to a nested C union. The type may have multiple discriminants, to support the possibly nested unions being selected along different dimensions.
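As a rough illustration of the C shapes this paragraph has in mind, the sketch below declares a struct with a non-variant member followed by a union, one variant with several members, and one variant containing a nested union. All names (node, box, cell, and the helper functions) are hypothetical and are not part of the proposal.

```c
#include <stddef.h>

struct node {
    int id;                          /* non-variant part */
    union {                          /* outer variant part, last member */
        struct {                     /* variant with several components */
            int width;
            int height;
        } box;
        struct {                     /* variant holding a nested variant part */
            int count;
            union {                  /* nested union, selected independently */
                long as_integer;
                double as_real;
            } value;
        } cell;
    } u;
};

/* Byte offset of the "box" variant from the start of the struct. */
size_t box_offset(void)
{
    struct node n;
    return (size_t)((char *)&n.u.box - (char *)&n);
}

/* Byte offset of the "cell" variant from the start of the struct. */
size_t cell_offset(void)
{
    struct node n;
    return (size_t)((char *)&n.u.cell - (char *)&n);
}
```

Both variants start at the same offset, and no member records which one is in use; in the Ada mapping that role would be played by the (unstored) discriminants.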
The Ada type may, but need not, have defaults for all discriminants. All objects of the type, even if limited or allocated in the heap (and hence effectively constrained by the initial discriminant value(s)), should be allocated the size C would allocate for the corresponding struct/union, which will be at least the maximum size required for any discriminant value. This is because any whole-object assignments performed to or from such an object by the C code will generally involve this maximum size, even if they preserve the (conceptual) discriminant value. [Note that requiring defaults is not really necessary, but it seems benign, and suggests to the reader that objects of such a type are always allocated the max size, and are generally "mutable."]
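The sizing rule can be pictured on the C side with a small sketch. The type and function names below are hypothetical; the only point is that a C union always occupies at least as much storage as its largest member, whichever member is currently meaningful.

```c
#include <stddef.h>

/* Hypothetical variants of clearly different sizes. */
struct small_variant { char flag; };
struct large_variant { double values[4]; };

/* No selector is stored; C reserves room for the largest member. */
union payload {
    struct small_variant small;
    struct large_variant large;
};

/* Storage occupied by every object of the union type. */
size_t payload_size(void)
{
    return sizeof(union payload);
}

/* Size of the largest individual member, computed by hand for this sketch. */
size_t largest_member_size(void)
{
    size_t a = sizeof(struct small_variant);
    size_t b = sizeof(struct large_variant);
    return a > b ? a : b;
}
```

An Ada object that held only the small variant but was allocated fewer bytes than payload_size() reports could be overrun by a whole-object assignment performed in C.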
Each discriminant of an object of an unchecked_union type must be specified (explicitly or implicitly) when the object is created, even though its value is not stored (unless a location is specified via a component clause). The value is needed to enable default initialization of the appropriate variant(s), or to determine which fields should appear in a record aggregate.
Within the definition of an unchecked_union type, the discriminant(s) may not be used in a constraint in a component_definition, unless the type of the component is itself an unchecked_union type. This is to ensure that the size of the component does not depend on the value of the discriminant. Note that the discriminant may be used to govern a variant part, or as a default initial value for a component, or within a component clause of a record representation clause.
Outside the definition of the object's type, a discriminant of an object of an unchecked_union type must not be read.
Constrained subtypes of an unchecked_union type are permitted, as this may be necessary to properly specify the (initial) discriminant value for a variable or subcomponent having the type. It is erroneous to perform any operation (in C or Ada) that would have failed a discriminant check had the discriminant been present at run-time.
The pragma Unchecked_Union may be applied to a derived type, presuming its ultimate ancestor type meets the requirements for the pragma. Converting the derived type to an unconstrained subtype of an ancestor (checked) type raises Program_Error, because there is no way to determine the values for the discriminants. Converting to a constrained subtype is permitted, as the discriminant values are implied by the constraint (as above, the conversion is erroneous if it would have failed a discriminant check). Converting from an ancestor (checked) type to the derived type is permitted, and simply drops the discriminant(s) (and performs whatever other representation adjustments are necessary). If the target (unchecked) subtype is constrained, a constraint check is performed on the value of the checked type prior to dropping the discriminants. (These conversion rules are intended to allow an Ada program to primarily manipulate a checked type, and then convert to/from an unchecked type just before and after communicating with C code.)
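The intent of these conversion rules can be sketched in C terms, with a tagged struct standing in for the checked type and a bare union standing in for the unchecked one. This is only an analogy under stated assumptions: the names are hypothetical, and a boolean result stands in for the constraint check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sym_kind { OBJ_KIND, PKG_KIND };

/* Untagged layout shared with C code: no selector stored. */
union raw_info {
    int obj_val_if_known;      /* meaningful when the kind is OBJ_KIND */
    int pkg_num_components;    /* meaningful when the kind is PKG_KIND */
};

/* Checked counterpart: the selector is stored alongside the data. */
struct checked_info {
    enum sym_kind kind;
    union raw_info data;
};

/* Checked to unchecked, unconstrained target: simply drop the selector. */
union raw_info to_raw(const struct checked_info *c)
{
    assert(c != NULL);
    return c->data;
}

/* Checked to unchecked, constrained target: verify the selector first,
   then drop it. Returns false where a constraint check would fail. */
bool to_raw_constrained(const struct checked_info *c,
                        enum sym_kind required,
                        union raw_info *out)
{
    assert(c != NULL && out != NULL);
    if (c->kind != required)
        return false;
    *out = c->data;
    return true;
}

/* Unchecked to checked: the caller must name the kind, because the raw
   value cannot reveal it. Nothing can verify that the claim is correct. */
struct checked_info to_checked(union raw_info raw, enum sym_kind kind)
{
    struct checked_info c;
    c.kind = kind;
    c.data = raw;
    return c;
}
```

The absence of a to_checked variant without a kind parameter mirrors the rule that conversion to an unconstrained checked subtype cannot succeed.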
To avoid contract violations involving discriminant references in the body of a generic, the following restrictions apply to an instantiation whose actual type is an unchecked union: if the formal type is private, it must not have known discriminants; if the formal type is derived, the specified ancestor type must also be an unchecked union.
To avoid other kinds of generic contract violations, if the type is non-limited, all of the normal operations of a non-limited type exist, including assignment, equality, membership, 'Read, 'Write, 'Input, 'Output, etc. Assignment is defined in terms of a block copy on all bits of the representation. The other operations all raise Program_Error, because they generally require reading the value of the discriminant to give a meaningful result.
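A C sketch may help show why assignment can be defined without the discriminant while equality cannot. The names are hypothetical, and the explicit kind parameter is an assumption of this sketch, not something the proposal provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum kind { INT_KIND, REAL_KIND };   /* selector, not stored in the union */

union number {
    int as_int;
    double as_real;
};

/* Assignment needs no selector: copy every byte of the representation. */
void number_assign(union number *dst, const union number *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}

/* A meaningful equality must know which member to compare, so the
   selector has to come from somewhere outside the objects. */
bool number_equal(const union number *a, const union number *b, enum kind k)
{
    assert(a != NULL && b != NULL);
    switch (k) {
    case INT_KIND:
        return a->as_int == b->as_int;
    case REAL_KIND:
        return a->as_real == b->as_real;
    }
    return false;
}
```

Without the third parameter, number_equal would have no principled way to choose a member to compare, which is the situation the predefined Ada operations would face.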
Record representation clauses are permitted for unchecked unions. By default, no space is given for a discriminant unless it is mentioned explicitly in a component clause.
!wording
!example
Given the C type:
    struct sym {
        int id;
        char *name;
        union {
            struct {
                struct sym *obj_type;
                int obj_val_if_known;
            } obj;
            struct {
                struct sym *pkg_first_component;
                int pkg_num_components;
            } pkg;
        } u;
    };
This would map to the following unchecked_union type in Ada:
    type Sym;
    type Sym_Ptr is access Sym;
    type Sym_Kind_Enum is (Obj_Kind, Pkg_Kind);
    type Sym(Kind : Sym_Kind_Enum := Sym_Kind_Enum'First) is record
       Id : Integer := Assign_Id(Kind);
       Name : C_Ptr;
       case Kind is
          when Obj_Kind =>
             Obj_Type : Sym_Ptr;
             Obj_Val_If_Known : Integer := -1;
          when Pkg_Kind =>
             Pkg_First_Component : Sym_Ptr;
             Pkg_Num_Components : Integer := 0;
       end case;
    end record;
    pragma Unchecked_Union(Sym);
!discussion
Several Ada 95 compilers now support a pragma Unchecked_Union for specifying that the discriminant of a variant record should not be present at run-time, thereby matching a C union. However, these compilers differ in the legality rules they enforce relative to the type.
This proposal is designed to allow what in C is essentially one (named) type definition, to similarly become one type definition in Ada. It is quite common to use anonymous unions and structs nested within one another in C. In Ada, the natural mapping for this is non-variant and variant parts, and nested variants. The need to eliminate the storage for the discriminant exists for these more complex uses of "union" in C just as much as it exists for a very simple use of "union." Hence, we propose to allow unchecked_union types to include non-variant parts along with variant parts, to include multiple components per variant, and to support nested variants, to maximize the likelihood that there is a natural mapping possible between the C type structure and the Ada type structure.
We have also defined rules that should eliminate generic contract model violations related to unchecked_unions, as this seems important if such types are going to be used in relatively portable C interfacing code with compilers that may share generics.
!appendix

Randy Brukardt  99-03-29

I've reformatted Tucker's submission into the format discussed at
the recent ARG meeting. Note that I haven't made any of the changes
considered at that meeting.

*************************************************************

From: Jean-Pierre Rosen
Sent: Friday, October 1, 1999 at 2:09 AM

Agreed, just a suggestion: [Editor's note: this is a comment on version 2]

> To avoid other kinds of generic contract violations, if the type is
> non-limited, all of the normal operations of a non-limited type exist,
> including assignment, equality, membership, 'Read, 'Write, 'Input,
> 'Output, etc.  Assignment is defined in terms of a block copy on all bits
> of the representation.  The other operations all raise Program_Error,
> because they generally require reading the value of the discriminant
> to give a meaningful result.
>
It seems to me that completely forbidding the use of  'Read and 'Write is
too strict; there are certainly cases where you want to send such types over
a network!
'Read and 'Write could be defined like the case where there are no default
values (i.e. no discriminants written).
'input and 'output could raise Program_Error, or do the same.

OTOH, it is always possible for the user to redefine 'read and 'write...
Well, in that case, it is important to NOT define 'input and 'output as
raising P_E, since their normal behaviour (for discriminated defaulted
types) is simply to call 'Read and 'Write.

*************************************************************

From: Robert A Duff
Sent: Friday, October 1, 1999 at 8:57 AM

> It seems to me that completely forbidding the use of  'Read and 'Write is
> too strict; there are certainly cases where you want to send such types over
> a network!
> 'Read and 'Write could be defined like the case where there are no default
> values (i.e. no discriminants written).
> 'input and 'output could raise Program_Error, or do the same.
>
> OTOH, it is always possible for the user to redefine 'read and 'write...

As Tucker mentioned in the meeting, another way to get the operations
that raise Program_Error is to first define the variant record as a
normal type, then derive from it, and declare the derived type to be an
unchecked_union.  Then you can convert values to the parent type, and do
all the normal stuff without getting Program_Error.

Oh, and by the way, we decided that you could convert to constrained
subtypes of the parent type (contrary to what Tuck's write-up says).
Of course you can't convert to an unconstrained subtype, because the
compiler can't know what the discriminant value should be.

> Well, in that case, it is important to NOT define 'input and 'output as
> raising P_E, since their normal behaviour (for discriminated defaulted
> types) is simply to call 'Read and 'Write.

Good point.

- Bob

*************************************************************


