Version 1.7 of ais/ai-00216.txt

!standard 03.08.01 (01)          02-02-04 AI95-00216/06
!class amendment 99-03-23
!status work item 99-03-23
!status received 99-03-23
!priority Medium
!difficulty Hard
!subject Unchecked Unions -- Variant Records With No Run-Time Discriminant
!summary
We propose to formalize the rules for the pragma Unchecked_Union, which is now supported by several Ada 95 compilers. The proposed rules are intended to be flexible enough to accommodate all reasonable uses, while preserving a modicum of safety.
!problem
The Ada 95 standard does not include a mechanism for mapping C unions to Ada data structures. This capability is important enough that several Ada 95 compilers have defined a method to support unions.
!proposal
pragma Unchecked_Union(first_subtype_mark);
This representation pragma is intended for use with a type with a variant part (potentially with further nested variant parts), that is to be manipulated in C using unions. The pragma specifies that the associated type should be given a representation that leaves no space for the discriminants of the type. Furthermore, the effect of this pragma includes an implicit suppress of Discriminant_Check on the specified type, and an implicit convention of C (which may be overridden).
The specified type may have a non-variant part preceding the variant part, which would correspond to a C struct containing a union as its last component. A variant may have multiple components, which would correspond to a C union with a struct as one of its elements. A variant may have a nested variant part, which would correspond to a nested C union. The type may have multiple discriminants, to support the possibly nested unions being selected along different dimensions.
The Ada type may, but need not, have defaults for all discriminants. All objects of the type, even if limited or allocated in the heap (and hence effectively constrained by the initial discriminant values), must be allocated the size C would allocate for the corresponding struct/union, which will be at least the maximum size required for any discriminant value. This is because any whole-object assignments performed to or from such an object by the C code will generally involve this maximum size, even if they preserve the (conceptual) discriminant value.
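The size requirement can be seen from the C side. The following is a minimal sketch, not part of the proposal: `union value`, `value_size`, and `copy_value` are hypothetical names chosen for illustration. It shows that a C union is at least as large as its largest member, and that a whole-object assignment copies that full size whichever member was last written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C counterpart of a two-variant unchecked union. */
union value {
    char   small;   /* variant selected by one discriminant value */
    double large;   /* variant selected by another discriminant value */
};

/* Every object has the same size, whichever variant is conceptually present. */
size_t value_size(void)
{
    /* A union is at least as large as each of its members. */
    assert(sizeof(union value) >= sizeof(char));
    assert(sizeof(union value) >= sizeof(double));
    return sizeof(union value);
}

/* Whole-object assignment copies sizeof(union value) bytes,
   regardless of which member was last written. */
void copy_value(union value *dst, const union value *src)
{
    *dst = *src;
}
```

An Ada object that was allocated only the space needed for its initial variant could be overrun by such an assignment, which is why the maximum size is required.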
Each discriminant of an object of an unchecked_union type must be specified (explicitly or implicitly) when the object is created, even though its value is not stored, to enable appropriate default initialization of the appropriate variants, or to determine which fields should appear in a record aggregate.
Within the definition of an unchecked_union type, the discriminants may not be used in a constraint in a component_definition, unless the type of the component is itself an unchecked_union type. This is to ensure that the size of the component does not depend on the value of the discriminant. Note that the discriminant may be used to govern a variant part or as a default initial value for a component.
Outside the definition of the object's type, a discriminant of an object of an unchecked_union type must not be read.
Constrained subtypes of an unchecked_union type are permitted, as this may be necessary to properly specify the (initial) discriminant value for a variable or subcomponent having the type. It is erroneous to perform any operation (in C or Ada) that would have failed a discriminant check had the discriminant been present at run-time.
The pragma Unchecked_Union may be applied to a derived type, presuming it meets the requirements for the pragma. Converting the derived type to an unconstrained subtype of an ancestor (checked) type raises Program_Error, because there is no way to determine the values for the discriminants. Converting to a constrained subtype is permitted, as the discriminant values are implied by the constraint (as above, the conversion is erroneous if it would have failed a discriminant check). Converting from an ancestor (checked) type to the derived type is permitted, and simply drops the discriminants (and performs whatever other representation adjustments are necessary). If the target (unchecked) subtype is constrained, a constraint check is performed on the value of the checked type prior to dropping the discriminants. (These conversion rules are intended to allow an Ada program to primarily manipulate a checked type, and then convert to/from an unchecked type just before and after communicating with C code.)
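As a rough C analogy of these conversion rules (a sketch under stated assumptions, not a description of any compiler's implementation), the checked type can be pictured as a struct that stores its discriminant and the unchecked type as a bare union. All names below (`struct checked`, `union unchecked`, `to_unchecked`, `to_checked`) are hypothetical.

```c
#include <assert.h>

/* Hypothetical "checked" form: the discriminant is stored. */
enum kind { KIND_INT, KIND_FLT };

struct checked {
    enum kind kind;
    union { int i; float f; } u;
};

/* Hypothetical "unchecked" form: same variants, no stored discriminant. */
union unchecked { int i; float f; };

/* Checked -> unchecked: read the discriminant one last time, then drop it. */
union unchecked to_unchecked(struct checked c)
{
    union unchecked r;
    assert(c.kind == KIND_INT || c.kind == KIND_FLT);
    if (c.kind == KIND_INT)
        r.i = c.u.i;
    else
        r.f = c.u.f;
    return r;
}

/* Unchecked -> checked: the discriminant cannot be recovered from the
   value, so the caller must supply it, much as a constrained target
   subtype supplies it in the Ada rules. */
struct checked to_checked(union unchecked v, enum kind known_kind)
{
    struct checked c;
    c.kind = known_kind;
    if (known_kind == KIND_INT)
        c.u.i = v.i;
    else
        c.u.f = v.f;
    return c;
}
```

The second function has no way to validate `known_kind`; this mirrors the rule that a conversion to a constrained subtype is erroneous if the (absent) discriminant would have failed the check.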
A private type is never an unchecked union (even if its full type is). As usual, a derived type inherits the unchecked union representation aspect from its parent type. A formal derived type is an unchecked union if its specified ancestor is an unchecked union.
If an unchecked union completes a private type or private extension, the partial view must not have known discriminants or it must be an unchecked union. For contract model reasons, in an instantiation of a generic, if the actual type is an unchecked union, the formal type must not have known discriminants, or it must be an unchecked union.
To avoid other kinds of generic contract violations, if an unchecked union is non-limited, all of the normal operations of a non-limited type exist, including assignment, equality, membership, 'Read, 'Write, 'Input, 'Output, etc. Assignment is defined in terms of a block copy on all bits of the representation. The other operations all raise Program_Error, because they generally require reading the value of the discriminant to give a meaningful result.
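The asymmetry between assignment and the other operations can be sketched in C. This is an illustration only; `union cell`, `cell_assign`, and `cell_equal` are hypothetical names. Assignment can be done as a byte copy without knowing which variant is present, whereas a meaningful equality test needs the variant to be supplied from elsewhere.

```c
#include <assert.h>
#include <string.h>

union cell { int i; float f; };

enum cell_kind { CELL_INT, CELL_FLT };

/* Assignment needs no discriminant: copy all bytes of the representation. */
void cell_assign(union cell *dst, const union cell *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}

/* Equality must compare the components of the variant that is present,
   so the caller has to say which one that is. */
int cell_equal(const union cell *a, const union cell *b, enum cell_kind kind)
{
    return kind == CELL_INT ? a->i == b->i : a->f == b->f;
}
```

Without the `kind` argument there is no correct way to write `cell_equal`, which corresponds to the proposal's choice of raising Program_Error for such operations.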
Record representation clauses are permitted for unchecked unions. No space is given for a discriminant; it is illegal to mention a discriminant explicitly in a component clause.
An implementation may support this pragma for tagged record types or record extensions, with implementation-defined semantics.
!wording
B.3.3 Pragma Unchecked_Union
[A pragma Unchecked_Union specifies an interface correspondence between a given discriminated type and some C union. The pragma specifies that the associated type shall be given a representation that leaves no space for its discriminant(s).]
Syntax
The form of pragma Unchecked_Union is as follows:
pragma Unchecked_Union (first_subtype_local_name);
Legality Rules
Unchecked_Union is a representation pragma, specifying the unchecked union aspect of representation.
The first_subtype_local_name of a pragma Unchecked_Union shall denote an unconstrained discriminated record subtype having a variant_part.
The type is said to be an unchecked union type. A subtype of an unchecked union type is said to be an unchecked union subtype. An object of an unchecked union type is said to be an unchecked union object.
All component subtypes of the type shall be C-compatible.
If a component subtype of the type is subject to a per-object constraint, then the component subtype shall be an unchecked union subtype.
The type shall not be a by-reference type.
Static Semantics
An unchecked union type is eligible for conventions C and C_Pass_By_Copy, as is any subtype of the type.
Discriminant_Check is suppressed for the type.
All objects of the type shall have the same size.
Discriminants of objects of the type shall be of size zero.
Any name which denotes a discriminant of an Unchecked_Union type shall occur within the declarative region of the type, the component_choice_list of an aggregate, or the selector_name of a discriminant_constraint.
An Unchecked_Union subtype shall not be passed as a generic actual parameter if the corresponding formal type has a known_discriminant_part or is a formal derived type which is not an unchecked union type.
Dynamic Semantics
A view of an unchecked union object [(including a type conversion or function call)] is said to have "inferable discriminants" if it has a constrained nominal subtype, unless the object is a component of an enclosing unchecked union object which is subject to a per-object constraint and the enclosing object lacks inferable discriminants.
An expression of an Unchecked_Union type is said to have "inferable discriminants" if it is either a name of an object with inferable discriminants or a qualified expression whose subtype_mark denotes a constrained subtype.
The predefined equality operator for an Unchecked_Union type raises Program_Error if either of the operands lacks inferable discriminants. [This includes the case where the equality operator is invoked implicitly by the equality operator for an enclosing composite type - if the Unchecked_Union component is unconstrained, Program_Error is raised].
Evaluation of a membership test raises Program_Error if the subtype_mark denotes a constrained Unchecked_Union subtype and the expression lacks inferable discriminants.
Conversion from an Unchecked_Union type to an unconstrained non-Unchecked_Union type raises Program_Error if the operand of the conversion lacks inferable discriminants.
Unless overridden by an attribute_definition_clause, execution of the Write or Read attribute of an Unchecked_Union type raises Program_Error. The same holds for the Output and Input attributes if the type lacks default discriminant values.
!example
Given the C type:
    struct sym {
        int id;
        char *name;
        union {
            struct {
                struct sym *obj_type;
                int obj_val_if_known;
            } obj;
            struct {
                struct sym *pkg_first_component;
                int pkg_num_components;
            } pkg;
        } u;
    };
This would map to the following unchecked_union type in Ada:
    type Sym;
    type Sym_Ptr is access Sym;

    type Sym_Kind_Enum is (Obj_Kind, Pkg_Kind);

    type Sym(Kind : Sym_Kind_Enum := Sym_Kind_Enum'First) is record
        Id : Integer := Assign_Id(Kind);
        Name : C_Ptr;
        case Kind is
            when Obj_Kind =>
                Obj_Type : Sym_Ptr;
                Obj_Val_If_Known : Integer := -1;
            when Pkg_Kind =>
                Pkg_First_Component : Sym_Ptr;
                Pkg_Num_Components : Integer := 0;
        end case;
    end record;
    pragma Unchecked_Union(Sym);
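For illustration only, C code operating on the struct sym above might look like the following sketch. The helper `sym_init_obj` is hypothetical and not part of the proposal. It fills in the "obj" variant and uses the same default (-1) that the Ada declaration gives Obj_Val_If_Known. Note that nothing in the C object records which variant was initialized.

```c
#include <assert.h>
#include <stddef.h>

struct sym {
    int id;
    char *name;
    union {
        struct { struct sym *obj_type; int obj_val_if_known; } obj;
        struct { struct sym *pkg_first_component; int pkg_num_components; } pkg;
    } u;
};

/* Hypothetical helper: initialize a symbol as an "obj" symbol.
   Which variant is in use is known only to the caller. */
void sym_init_obj(struct sym *s, int id, char *name, struct sym *type)
{
    assert(s != NULL);
    s->id = id;
    s->name = name;
    s->u.obj.obj_type = type;
    s->u.obj.obj_val_if_known = -1;  /* mirrors the Ada default */
}
```

On the Ada side, the corresponding object would be created with Kind => Obj_Kind so that the matching variant's defaults apply, even though Kind itself is not stored.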
!discussion
Several Ada 95 compilers now support a pragma Unchecked_Union for specifying that the discriminant of a variant record should not be present at run-time, thereby matching a C union. However, these compilers differ in the legality rules they enforce relative to the type.
This proposal is designed to allow what in C is essentially one (named) type definition, to similarly become one type definition in Ada. It is quite common to use anonymous unions and structs nested within one another in C. In Ada, the natural mapping for this is non-variant and variant parts, and nested variants. The need to eliminate the storage for the discriminant exists for these more complex uses of "union" in C just as much as it exists for a very simple use of "union." Hence, we propose to allow unchecked_union types to include non-variant parts along with variant parts, to include multiple components per variant, and to support nested variants, to maximize the likelihood that there is a natural mapping possible between the C type structure and the Ada type structure.
We have also defined rules that should eliminate generic contract model violations related to unchecked_unions, as this seems important if such types are going to be used in relatively portable C interfacing code with compilers that may share generics.
!appendix

Randy Brukardt  99-03-29

I've reformatted Tucker's submission into the format discussed at
the recent ARG meeting. Note that I haven't made any of the changes
considered at that meeting.

*************************************************************

From: Jean-Pierre Rosen
Sent: Friday, October 1, 1999 at 2:09 AM

Agreed, just a suggestion: [Editor's note, this is a comment on version 2]

> To avoid other kinds of generic contract violations, if the type is
> non-limited, all of the normal operations of a non-limited type exist,
> including assignment, equality, membership, 'Read, 'Write, 'Input,
> 'Output, etc.  Assignment is defined in terms of a block copy on all bits
> of the representation.  The other operations all raise Program_Error,
> because they generally require reading the value of the discriminant
> to give a meaningful result.
>
It seems to me that completely forbidding the use of  'Read and 'Write is
too strict; there are certainly cases where you want to send such types over
a network!
'Read and 'Write could be defined like the case where there are no default
values (i.e. no discriminants written).
'input and 'output could raise Program_Error, or do the same.

OTOH, it is always possible for the user to redefine 'read and 'write...
Well, in that case, it is important to NOT define 'input and 'output as
raising P_E, since their normal behaviour (for discriminated defaulted
types) is simply to call 'Read and 'Write.

*************************************************************

From: Robert A Duff
Sent: Friday, October 1, 1999 at 8:57 AM

> It seems to me that completely forbidding the use of  'Read and 'Write is
> too strict; there are certainly cases where you want to send such types over
> a network!
> 'Read and 'Write could be defined like the case where there are no default
> values (i.e. no discriminants written).
> 'input and 'output could raise Program_Error, or do the same.
>
> OTOH, it is always possible for the user to redefine 'read and 'write...

As Tucker mentioned in the meeting, another way to get the operations
that raise Program_Error is to first define the variant record as a
normal type, then derive from it, and declare the derived type to be an
unchecked_union.  Then you can convert values to the parent type, and do
all the normal stuff without getting Program_Error.

Oh, and by the way, we decided that you could convert to constrained
subtypes of the parent type (contrary to what Tuck's write-up says).
Of course you can't convert to an unconstrained subtype, because the
compiler can't know what the discriminant value should be.

> Well, in that case, it is important to NOT define 'input and 'output as
> raising P_E, since their normal behaviour (for discriminated defaulted
> types) is simply to call 'Read and 'Write.

Good point.

- Bob

*************************************************************

From: Steve Baird
Sent: Monday, January 28, 2002 at  2:40 PM

B.3.3 Pragma Unchecked_Union

[A pragma Unchecked_Union specifies an interface correspondence
between a given discriminated type and some C union. The pragma
specifies that the associated type shall be given a representation
that leaves no space for its discriminant(s).]

                Syntax

The form of pragma Unchecked_Union is as follows:

    pragma Unchecked_Union (first_subtype_local_name);

                Legality Rules

Unchecked_Union is a representation pragma, specifying the unchecked
union aspect of representation.

The first_subtype_local_name of a pragma Unchecked_Union shall denote
an unconstrained discriminated record subtype having a variant_part.

The type is said to be an unchecked union type. A subtype of an
unchecked union type is said to be an unchecked union subtype.
An object of an unchecked union type is said to be an unchecked union
object.

All component subtypes of the type shall be C-compatible.

If a component subtype of the type is subject to a per-object constraint,
then the component subtype shall be an unchecked union subtype.

The type shall not be a by-reference type.

                Static Semantics

An unchecked union type is eligible for conventions C and C_Pass_By_Copy,
as is any subtype of the type.

Discriminant_Check is suppressed for the type.

All objects of the type shall have the same size.

Discriminants of objects of the type shall be of size zero.

Any name which denotes a discriminant of an object of an
Unchecked_Union type shall occur within the declarative region of the type.

An Unchecked_Union subtype shall not be passed as a generic actual parameter
if the corresponding formal type has a known_discriminant_part or
is a formal derived type which is not an unchecked union type.

                Dynamic Semantics

A view of an unchecked union object [(including a type conversion or
function call)] is said to have "inferable discriminants" if it has a
constrained nominal subtype, unless the object is a component of an
enclosing unchecked union object which is subject to a per-object constraint
and the enclosing object lacks inferable discriminants.

An expression of an Unchecked_Union type is said to have "inferable
discriminants" if it is either a name of an object with inferable
discriminants or a qualified expression whose subtype_mark denotes a
constrained subtype.

The predefined equality operator for an Unchecked_Union type
raises Program_Error if either of the operands lacks inferable
discriminants. [This includes the case where the equality operator
is invoked implicitly by the equality operator for an enclosing composite
type - if the Unchecked_Union component is unconstrained, Program_Error
is raised].

Evaluation of a membership test raises Program_Error if the
subtype_mark denotes a constrained Unchecked_Union subtype and the
expression lacks inferable discriminants.

Conversion from a derived Unchecked_Union type to an unconstrained
non-Unchecked_Union ancestor type raises Program_Error if the operand
of the conversion lacks inferable discriminants.

Unless overridden by an attribute_definition_clause, execution of
the Write or Read attribute of an Unchecked_Union type
raises Program_Error. The same holds for the Output and Input attributes
if the type lacks default discriminant values.


-------- Discussion -------

1) Consider the following example:

     with Interfaces;
     package Uu is
       type Uu_32 (Is_Signed : Boolean) is
          record
            case Is_Signed is
                when False =>
                  Unsigned : Interfaces.Unsigned_32;
                when True =>
                  Signed : Interfaces.Integer_32;
            end case;
          end record;
      pragma Unchecked_Union (Uu_32);

       X : Uu_32 := (False, Interfaces.Unsigned_32'Last);
       Y : Interfaces.Integer_32 := X.Signed;
     end Uu;

   Elaboration of this package results in erroneous execution. If it is
   intended that this should behave as some sort of non-erroneous alternative
   form of Unchecked_Conversion, then the definition needs to be changed.

2) The various operations listed as raising Program_Error (i.e. certain
   equality tests, membership tests, conversions, and streaming operations)
   could be defined to be bounded errors, where the implementation would have
   the option of either raising Program_Error or returning the correct
   answer (typically, the latter alternative would only be selected
   if the implementation was somehow able to infer discriminant values
   for the object).

   Consider the following example:

     procedure Uu is
       type Uu_32 is <as above> ;

       subtype Signed_Uu is Uu_32 (Is_Signed => True);

       generic
         Unknown_Discrim : Uu_32;
       package G is
         Flag1 : constant Boolean := Unknown_Discrim in Signed_Uu;
       end G;

       package body G is
         Flag2 : constant Boolean := Unknown_Discrim in Signed_Uu;
       end G;

       Known_Discrim : constant Signed_Uu := (True, 33);

       package I is new G (Unknown_Discrim => Known_Discrim);
     begin
       null;
     end;

   An implementation (particularly one which does not share code for generics)
   might have to go out of its way to ignore known information in order to
   raise Program_Error instead of returning the correct answer for these
   membership tests.

3) It has been suggested that defining the dynamic semantics of Unchecked_Union
   in terms of check suppression leads to definitional problems in the
   cases of imported objects declared in C code, values created via unchecked
   conversion, values created via streaming, etc. What is the "correct"
   (albeit missing) discriminant value in these cases? What matters is that
   some appropriate discriminant value exists (even if an oracular consultation
   would be needed to ascertain its value) and, in particular, that
   the assumption that it exists does not lead to a contradiction.

4) Would the term "implied discriminants" be preferable to "inferable
   discriminants"?

5) Propagated discriminant constraints where both the enclosing type and
   the component type are unchecked union types introduce definitional
   complications. Some of the rules could be simpler if constructs such as
       type Uu_32 is <as above> ;

       type Another_Uu (Is_Signed : Boolean) is
         record
           F : Uu_32 (Is_Signed => Is_Signed);
           ...
         end record;
       pragma Unchecked_Union (Another_Uu);
   were illegal.

6) Requiring that the type not be By_Reference is just a convenient
   way of avoiding interactions with finalization, tags, view
   conversions, and limited types. For most implementations, this is
   already a consequence of the C-compatibility requirement.

7) The requirement that constrained objects must have the same
   size as unconstrained objects stems from C.

8) Checking is suppressed when converting from a non-Unchecked_Union
   ancestor type to a constrained Unchecked_Union derived subtype.
   This is a logical consequence of suppressing Discriminant_Check
   for the Unchecked_Union type, but contradicts the earlier AI.
   Adding a rule of the form "discriminant checks are suppressed,
   except that ... " does not seem like a good idea. Opening that can
   of worms would lead to questions like "what about converting an
   operand which has inferable discriminants? what about a variant
   field access where the prefix has inferable discriminants?".

9) If a mechanism is introduced for reinstating suppressed checks
   (see AI-224), it must be made clear that the check suppression
   implied by an Unchecked_Union pragma is irrevocable.

*************************************************************

From: Tucker Taft
Sent: Monday, January 28, 2002 at  4:22 PM

"Baird, Steve" wrote:

> B.3.3 Pragma Unchecked_Union
>
> [A pragma Unchecked_Union specifies an interface correspondence
> between a given discriminated type and some C union. The pragma
> specifies that the associated type shall be given a representation
> that leaves no space for its discriminant(s).]
>
>                 Syntax
>
> The form of pragma Unchecked_Union is as follows:
>
>     pragma Unchecked_Union (first_subtype_local_name);
>
>                 Legality Rules
>
> Unchecked_Union is a representation pragma, specifying the unchecked
> union aspect of representation.
>
> The first_subtype_local_name of a pragma Unchecked_Union shall denote
> an unconstrained discriminated record subtype having a variant_part.
>
> The type is said to be an unchecked union type. A subtype of an
> unchecked union type is said to be an unchecked union subtype.
> An object of an unchecked union type is said to be an unchecked union
> object.
>
> All component subtypes of the type shall be C-compatible.
>
> If a component subtype of the type is subject to a per-object constraint,
> then the component subtype shall be an unchecked union subtype.
>
> The type shall not be a by-reference type.

This seems heavy handed.  Is there a clear reason why
one shouldn't have an atomic or volatile component in an
unchecked-union?

>                 Static Semantics
>
> An unchecked union type is eligible for conventions C and C_Pass_By_Copy,
> as is any subtype of the type.

I thought the convention was by default "C" but that could be overridden.

>
> Discriminant_Check is suppressed for the type.
>
> All objects of the type shall have the same size.
>
> Discriminants of objects of the type shall be of size zero.
>
> Any name which denotes a discriminant of an object of an
> Unchecked_Union type shall occur within the declarative region of the type.

Discriminant names can also appear in named aggregates -- in fact they must.

Given your concept of "inferrable" discriminants, why not allow
referring to the discriminants of an object with inferrable discriminants?


>
> An Unchecked_Union subtype shall not be passed as a generic actual parameter
> if the corresponding formal type has a known_discriminant_part or
> is a formal derived type which is not an unchecked union type.
>
>                 Dynamic Semantics
>
> A view of an unchecked union object [(including a type conversion or
> function call)] is said to have "inferable discriminants" if it has a
> constrained nominal subtype, unless the object is a component of an
> enclosing unchecked union object which is subject to a per-object constraint
> and the enclosing object lacks inferable discriminants.
>
> An expression of an Unchecked_Union type is said to have "inferable
> discriminants" if it is either a name of an object with inferable
> discriminants or a qualified expression whose subtype_mark denotes a
> constrained subtype.
>
> The predefined equality operator for an Unchecked_Union type
> raises Program_Error if either of the operands lacks inferable
> discriminants. [This includes the case where the equality operator
> is invoked implicitly by the equality operator for an enclosing composite
> type - if the Unchecked_Union component is unconstrained, Program_Error
> is raised].
>
> Evaluation of a membership test raises Program_Error if the
> subtype_mark denotes a constrained Unchecked_Union subtype and the
> expression lacks inferable discriminants.
>
> Conversion from a derived Unchecked_Union type to an unconstrained
> non-Unchecked_Union ancestor type raises Program_Error if the operand
> of the conversion lacks inferable discriminants.

Why does it matter whether the other type is an ancestor?  Presuming
we are talking untagged types, all that is required is that they
have a common ancestor for them to be convertible.

I would drop the whole mention of "derived" and "ancestor" since that
is better handled by the normal legality rules.

*************************************************************

From: Steve Baird
Sent: Monday, January 28, 2002 at  6:33 PM

> From: Tucker Taft
> ...
> This seems heavy handed.  Is there a clear reason why
> one shouldn't have an atomic or volatile component in an
> unchecked-union?

Fair enough. I was just trying to avoid enumerating all the
constructs mentioned in discussion point #6.

> I thought the convention was by default "C" but that
> could be overridden.

This is a way of bypassing the C vs. C_Pass_By_Copy
question.

> Discriminant names can also appear in named aggregates

Good point; I agree.

> Given your concept of "inferrable" discriminants, why not
> allow referring to the discriminants of an object with
> inferrable discriminants?

That's certainly an option. I had thought of this concept
as only being part of the dynamic semantics of the language,
not the static semantics, but your idea seems reasonable.

> Why does it matter whether the other type is an ancestor?
Good point. My mistake.

*************************************************************

From: Randy Brukardt
Sent: Monday, February  4, 2002 at  8:22 PM

> 9) If a mechanism is introduced for reinstating suppressed checks
>    (see AI-224), it must be made clear that the check suppression
>    implied by an Unchecked_Union pragma is irrevocable.

I'm not sure suppressing of checks is the right model here. An implementation is
allowed to ignore check suppression anytime it wants. I'm not sure that an
implementation could do any useful checking, but it seems like the model ought
to be that there are no checks. That might be harder to define, though.

*************************************************************

From: Steve Baird
Sent: Tuesday, February  5, 2002 at  6:01 PM

I believe that suppression is the right model. An implementation
is allowed to ignore check suppression in the case of an unchecked_union
type. If an implementation is able to gather enough information from
somewhere (certainly not from the zero-sized discriminant fields) to
perform a check, it may do so.

What is new about this form of suppression is that ignoring it
is, in general, much harder (i.e. effectively impossible) than ignoring
the suppression of other checks.

I readily concede that the notion of a discriminant which exists in some
theoretical sense (e.g. its value may be inferred in some cases)
but occupies no storage is a bit odd.

To me, the compelling advantage of this approach is its simplicity; only
one line of RM text is needed because it uses an existing
language mechanism.

Inventing a similar-but-different set of rules expressly for
Unchecked_Union would be much more complicated.

*************************************************************

From: Randy Brukardt
Sent: Wednesday, February  6, 2002 at  6:05 PM

> To me, the compelling advantage of this approach is its simplicity; only
> one line of RM text is needed because it uses an existing language mechanism.

Except that it messes up Unsuppress further; as the one with that short straw,
I'm not amused :-)

*************************************************************

From: Robert Dewar
Sent: Wednesday, February  6, 2002 at  7:30 PM

You just say that Unsuppress unsuppresses checks previously suppressed by
Suppress, then that does not include the unchecked union stuff.

*************************************************************

