Version 1.10 of ais/ai-00318.txt


!standard 03.03.01 (02)          04-02-28 AI95-00318/03
!standard 06.05.00 (17)
!standard 06.05.00 (18)
!class amendment 02-10-09
!status No Action (9-0-0) 04-09-17
!status work item 03-05-23
!status received 02-10-09
!priority Medium
!difficulty Medium
!subject Returning [limited] objects without copying
!summary
A new extended syntax is proposed for the return statement, providing a name for the new object being created as a result of a call on the function.
This new syntax can be used to support returning limited objects from a function, to support returning objects of an anonymous access type, and more generally to reduce the copying that might be required when a function returns a complex object, a controlled object, etc.
The existing ability to return by reference is changed so that it would be permitted only if the function were declared with an extra keyword which distinguishes it syntactically from a "normal" function -- the result of a "normal" function is always a newly created object.
!problem
We already have a proposal for allowing aggregates of a limited type, by requiring that the aggregate be built directly in the target object, rather than being copied into the target.
But aggregates can only be used with non-private types. Objects of limited private types still could not be initialized at their declaration point. It would be natural to allow functions to return limited objects, so long as the object could be built directly in the "target" of the function call, which could be a newly created object being initialized, or simply a parameter to another subprogram call.
We have also considered allowing functions to return anonymous access types. In this case, if the function returned an allocator, it would be natural for the caller context to determine the storage pool to be used by the allocator.
Whether returning a limited type or an anonymous access type, it may be desirable to perform some other initialization of the object after it has been created, but before returning from the function. This is difficult to do while still creating the object directly in its "final" location.
Currently functions that return a limited private type may have an accessibility check performed on the object returned, depending on a property ("return-by-reference-ness") which is not generally visible based on the partial view of the type. This means that a function that works initially may stop working if the full type of the result type is changed to include, say, a limited tagged component, or some other component that is return-by-reference.
A function whose result type turns out to be return-by-reference cannot be allowed where a new object is required. However, there is nothing in the declaration of such a function that indicates it returns by reference.
!proposal
A new syntax for a function specification is proposed:
FUNCTION designator [parameter_profile] ALIASED RETURN subtype_mark
Such a function is defined to return by reference. Upon return from such a function, a check is made that the object associated with the return expression has an accessibility level that is no deeper than that of the function declaration. Program_Error is raised if this check fails.
A call of a return-by-reference function denotes an aliased constant view of the object associated with the return expression. The accessibility level of this view is that of the function.
[NOTE: One possibility is to eliminate this return-by-reference capability, in favor of functions with an anonymous access result type. See the discussion.]
-------------
An extended syntax for the return statement is proposed:
RETURN identifier : [ALIASED] subtype_indication [:= expression] [DO
    handled_sequence_of_statements
END RETURN];
Such an extended return statement is permitted only immediately within a function which is not a return-by-reference function. The specified identifier names the object that is the result of a call on the function. If the expression is present, it provides the initial value for the result object. If not, the result object is default initialized. If the handled_sequence_of_statements is present, it is executed after initializing the result object. Within the handled_sequence_of_statements, the identifier denotes a variable view of the result object with nominal subtype given by the subtype_indication. When the handled_sequence_of_statements completes, the function is complete.
[Question: Should an expression-less return statement be permitted within the handled_sequence_of_statements? That would be consistent with the way that accept statements work. Tentative ARG Answer: Yes.]
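The case raised in the question above can be sketched as follows: an expression-less return inside the handled_sequence_of_statements completes the function with the result object as it stands. This is only an illustration under the tentative answer; the type Counter and the names Make_Capped and Early_Return_Demo are hypothetical. The extended return form shown is the one proposed here, which compilers implementing the later Ada 2005 standard also accept.

```ada
with Ada.Text_IO;

procedure Early_Return_Demo is

   --  Hypothetical limited type, used only for illustration.
   type Counter is limited record
      Value : Natural := 0;
   end record;

   function Make_Capped (Start, Cap : Natural) return Counter is
   begin
      return Result : Counter do
         if Start > Cap then
            Result.Value := Cap;
            return;  --  expression-less return: leaves with Result as it is
         end if;
         Result.Value := Start;
      end return;
   end Make_Capped;

   C : Counter := Make_Capped (Start => 12, Cap => 10);

begin
   Ada.Text_IO.Put_Line (Natural'Image (C.Value));
end Early_Return_Demo;
```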
A call of a non-return-by-reference function with a limited result type may be used in the same contexts where we have proposed to allow aggregates of a limited type, namely contexts where a new object is being created (or can be).
1) Initializing a newly declared object (including a result object identified in an extended return statement)
2) Default initialization of a record component
3) Initialized allocator
4) Component of an aggregate
5) IN formal object in a generic instantiation (including as a default)
6) Expression of a return statement (though note that if the function is return-by-reference, it would surely fail the accessibility check)
7) IN parameter in a function call (including as a default expression)
In addition, since the result of a function call is a name in Ada 95, the following contexts would be permitted, with the same semantics as creating a new temporary constant object, and then creating a reference to it:
8) Declaring an object that is the renaming of a function call.
9) Use of the function call as a prefix to 'Address
In other words, it would be permitted in any context where limited types are permitted. With the new proposals, that is pretty much any context where a "name" that denotes an object or value is permitted, except as the right hand side of an assignment statement.
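Several of the contexts listed above can be illustrated with a small constructor function. The sketch below assumes a hypothetical limited type Counter and hypothetical names Make, Pair, Counter_Ptr and Value_Of; the comments refer to the numbered contexts. It is meant as an illustration of the intent, not as a definitive statement of the final rules.

```ada
with Ada.Text_IO;

procedure Limited_Contexts_Demo is

   --  Hypothetical limited type, used only for illustration.
   type Counter is limited record
      Value : Natural := 0;
   end record;

   --  Constructor function: limited result type, not return-by-reference.
   function Make (Start : Natural) return Counter is
   begin
      return Result : Counter do
         Result.Value := Start;
      end return;
   end Make;

   type Pair is limited record
      Left  : Counter := Make (1);   --  context 2: default initialization
      Right : Counter;
   end record;

   type Counter_Ptr is access Counter;

   function Value_Of (C : in Counter) return Natural is
   begin
      return C.Value;
   end Value_Of;

   Obj  : Counter     := Make (10);                  --  context 1
   Heap : Counter_Ptr := new Counter'(Make (20));    --  context 3
   Both : Pair        := (Left  => Make (30),        --  context 4
                          Right => Make (40));
   Ren  : Counter renames Make (60);                 --  context 8

begin
   Ada.Text_IO.Put_Line (Natural'Image (Obj.Value));
   Ada.Text_IO.Put_Line (Natural'Image (Heap.Value));
   Ada.Text_IO.Put_Line (Natural'Image (Both.Left.Value + Both.Right.Value));
   Ada.Text_IO.Put_Line (Natural'Image (Ren.Value));
   Ada.Text_IO.Put_Line (Natural'Image (Value_Of (Make (50))));  --  context 7
end Limited_Contexts_Demo;
```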
Note that a call of a return-by-reference function, because it represents a view of a preexisting object, is not permitted in contexts 1, 2, 3, 4, and 6 if the result type is limited. Such a call would be permitted in the expression of a return statement for another return-by-reference function.
-----------
[ASIDE (see AI-325): If we permit function result types to be anonymous access types (e.g. "function Blah return access T"), then we likely will want such functions, if they return the result of an allocator, to be able to use the context of the call to determine the storage pool for the allocator. This proposed syntax would allow the function to do the allocator in the "caller" context, but still be able to perform further initialization of the allocated object after the allocator. Essentially the "return" object would inherit the storage pool determined by the calling context, so that allocators that are used to initialize it, or that are assigned to it later, would use the caller-determined storage pool. END of ASIDE]
!wording
Change 6.1(13) to:
parameter_and_result_profile ::= [formal_part] [aliased] return subtype_mark
Modify 6.3.1(16) as follows:
Two profiles are mode conformant if they are type-conformant, {if for functions, both or neither are return-by-reference (see 6.5),} corresponding parameters have identical modes, and, for access parameters, the designated subtypes statically match.
Replace clause 6.5 with the following:
6.5 Return Statements
A return_statement is used to complete the execution of the innermost enclosing subprogram_body, entry_body, or accept_statement.
Syntax
return_statement ::= simple_return_statement | extended_return_statement
simple_return_statement ::= return [expression];
extended_return_statement ::= return identifier : [aliased] subtype_indication [:= expression] [do handled_sequence_of_statements end return];
Name Resolution Rules
The result subtype of a function is the subtype denoted by the subtype_mark after the reserved word RETURN in the profile of the function. If the reserved word ALIASED precedes RETURN, it is a return-by-reference function. The expression, if any, of a return_statement is called the return expression. The expected type for a return expression is the result type of the corresponding function.
Legality Rules
A return_statement shall be within a callable construct, and it applies to the innermost callable construct or extended_return_statement that contains it. A return_statement shall not be within a body that is within the construct to which the return_statement applies.
A function body shall contain at least one return_statement that applies to the function body, unless the function contains code_statements. A simple_return_statement shall include a return expression if and only if it applies to a function body. An extended_return_statement shall apply to a function body, and the function shall not be return-by-reference.
For a return-by-reference function, the return expression of the simple_return_statement shall be a name that denotes an aliased view of an object. For a non return-by-reference function with limited result type, the object associated with the return expression, if any, shall be an object defined by an aggregate or a function_call on a non return-by-reference function.
The type of the subtype_indication of an extended_return_statement shall be the result type of the function. If the result subtype of the function is constrained, then the subtype defined by the subtype_indication of the extended_return_statement shall also be constrained and shall statically match this result subtype. If the result subtype of the function is unconstrained, then the subtype defined by the subtype_indication shall be a definite subtype, or there shall be a return expression.
For a non return-by-reference function with limited result type, the result subtype shall be constrained.
AARM NOTE: This last paragraph is not a necessary restriction, but simplifies implementation dramatically, since it means the caller can always allocate space for the result object, perform all "implicit" initializations of task and protected components, worry about accessibility levels for access discriminants, etc. The restriction could be lifted in Ada 201Z.
AARM Ramification: Note that this rule is defined at the point of the return statement rather than at the point of the function declaration, to ensure we are talking about the type characteristics visible inside the function body, rather than the characteristics visible to the caller. Of course compilers are encouraged to signal the error as soon as possible.
Static Semantics
Within an extended_return_statement, the return object is declared with the given identifier, with subtype defined by the subtype_indication.
Dynamic Semantics
For the execution of a simple_return_statement, the expression (if any) is first evaluated and converted to the result subtype.
If the function is return-by-reference, then a check is made that the return expression is one of the following:
* a name that denotes an object view whose accessibility level is not deeper than that of the master that elaborated the function body; or
* a parenthesized expression or qualified_expression whose operand is one of these kinds of expressions.
The exception Program_Error is raised if this check fails.
If the function is not return-by-reference, the value of the return expression becomes the initial value of an anonymous return object.
For the execution of an extended_return_statement, the subtype_indication is elaborated. This creates the nominal subtype of the return object. If there is a return expression, it is evaluated and converted to the nominal subtype (which might raise Constraint_Error -- see 4.6) and becomes the initial value of the return object; otherwise, the return object is initialized by default as for a stand-alone object of its nominal subtype (see 3.3.1). The handled sequence of statements, if any, is then executed.
If the result type of a function is a specific tagged type:
* If it is limited, or the function is return-by-reference, a check is made that the tag of the value of the return expression, if any, identifies the result type. Constraint_Error is raised if this check fails;
* Otherwise, the tag of the return object is that of the result type.
AARM Ramification: This is true even if the tag of the return expression is different.
AARM Reason: These rules ensure that a function whose result type is a specific tagged type always returns an object whose tag is that of the result type. This is important for dispatching on controlling result, and, if not return-by-reference, allows the caller to allocate the appropriate amount of space to hold the value being returned (assuming there are no discriminants).
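For the nonlimited, non-return-by-reference case, the tag rule above can be illustrated with a short sketch. The types Base and Derived and the function Base_Part are hypothetical names introduced for the example. The return expression is a conversion of an object whose tag is that of Derived, yet the return object takes the tag of the result type.

```ada
with Ada.Tags;
with Ada.Text_IO;

procedure Result_Tag_Demo is

   use type Ada.Tags.Tag;

   --  Hypothetical tagged types, used only for illustration.
   type Base is tagged record
      Id : Integer := 0;
   end record;

   type Derived is new Base with record
      Extra : Integer := 0;
   end record;

   --  The result type is the specific (nonlimited) tagged type Base.
   function Base_Part (X : Derived) return Base is
   begin
      --  The operand of this conversion still carries Derived's tag.
      return Base (X);
   end Base_Part;

   D : Derived    := (Id => 1, Extra => 2);
   R : Base'Class := Base_Part (D);

begin
   --  Compares the tag of the returned object with that of the result type.
   Ada.Text_IO.Put_Line (Boolean'Image (R'Tag = Base'Tag));
end Result_Tag_Demo;
```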
For a return-by-reference function the result is returned by reference; that is, the function call denotes an aliased constant view of the (preexisting) object denoted by the return expression. For a non return-by-reference function, the return object is a newly created object, and the function call denotes that object.
Finally, a transfer of control is performed which completes the execution of the construct to which the return_statement applies, and returns to the caller.
Examples
Examples of return statements:
return;                         -- in a procedure body, entry_body,
                                -- accept_statement, or extended_return_statement
return Key_Value(Last_Index);   -- in a function body
Add after 8.1(4):
* an extended_return_statement;
!example
Here is an example of a function with a limited result type using an extended return statement:
    function Make_Obj(Len : Natural) return Lim_Type is
    begin
        return Result : Lim_Type(Discrim => Len) do   -- the "result" object
            -- Finish the initialization of the "return" object.
            for I in 1..Len loop
                Result.Data(I) := I;
            end loop;
        end return;
    end Make_Obj;
[ASIDE: See AI-325: Here is essentially the same function, but with an anonymous access type for its result type:
    function Make_Obj(Len : Natural) return access Lim_Type is
    begin
        return Result : access Lim_Type do   -- The "result" object
            Result := new Lim_Type(Discrim => Len);
              -- this uses the storage pool determined by the caller context
            -- Finish the initialization of the allocated object
            for I in 1..Len loop
                Result.Data(I) := I;
            end loop;
        end return;
    end Make_Obj;
By "caller context", we mean that the same rules as apply to an allocator would apply to calls on this function, where the expected (access) type would determine the storage pool:
    type My_Acc_Type is access Lim_Type;
    for My_Acc_Type'Storage_Pool use My_Amazing_Stg_Pool;

    P : My_Acc_Type;

    begin
        P := Make_Obj(3);  -- allocator inside Make_Obj uses My_Amazing_Stg_Pool
end of ASIDE]
!discussion
In meetings with Ada users, there has been a general sense that if limited aggregates are provided in Ada 200Y, it would be desirable to also provide limited function returns which could act as "constructor" functions.
Just allowing a function whose whole body is a return statement returning an aggregate (or another function call) does not give the programmer much flexibility. What they would like is to be able to create the object being returned and then initialize it further somehow, perhaps by calling a procedure, doing a loop (as in the examples above), etc. This requires a named object. However, to avoid copying, we need this object to be created in its final "resting place," i.e. in the target of the function call. This might be in the "middle" of some enclosing composite object the caller is initializing, or it might be in the heap, or it might be a stand-alone local object.
Because the implementation needs to create the result object in a place or a storage pool determined by the caller, it is important that the declaration of the object be distinguished in some way. By declaring it as part of an extended return statement, we have a way for the programmer to indicate that this is the object to be returned. Clearly we don't want to allow extended return statements to be nested.
Because it may be necessary to do some computing before deciding exactly how the result object should be declared, we permit the extended return statement to occur any place a normal return statement is permitted. So different branches of an if or case statement could have their own extended return statements, each with its own named result object.
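As a sketch of this point, the function below uses a separate extended return statement in each branch of an if statement, each declaring its own result object with a different constraint. The names Bar and Branch_Return_Demo are hypothetical; a nonlimited unconstrained result subtype (String) is used because this proposal requires limited result subtypes to be constrained.

```ada
with Ada.Text_IO;

procedure Branch_Return_Demo is

   function Bar (Width : Natural) return String is
   begin
      if Width = 0 then
         --  No statements needed: the initial expression gives the value
         --  (and the bounds) of the result object.
         return Result : String := "(empty)";
      else
         --  A differently constrained result object in this branch.
         return Result : String (1 .. Width) do
            for I in Result'Range loop
               Result (I) := '*';
            end loop;
         end return;
      end if;
   end Bar;

begin
   Ada.Text_IO.Put_Line (Bar (0));
   Ada.Text_IO.Put_Line (Bar (5));
end Branch_Return_Demo;
```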
Note that we have allowed the user to declare the result object as "aliased." This seems like a natural thing which might be wanted, so you could initialize a circularly-linked list header to point at itself, etc.
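The self-referencing list header mentioned above might look like the following sketch. Node, Node_Ptr and Make_Header are hypothetical names. 'Unchecked_Access is used inside the function because the result object is declared at a deeper level than the access type. The example relies on the limited result being built in place, so that the stored pointers designate the caller's object.

```ada
with Ada.Text_IO;

procedure Self_Link_Demo is

   --  Hypothetical list node, used only for illustration.
   type Node;
   type Node_Ptr is access all Node;

   type Node is limited record
      Next, Prev : Node_Ptr;
   end record;

   function Make_Header return Node is
   begin
      return Header : aliased Node do
         --  An empty circular list: the header points at itself.
         Header.Next := Header'Unchecked_Access;
         Header.Prev := Header'Unchecked_Access;
      end return;
   end Make_Header;

   List : aliased Node := Make_Header;
   Self : constant Node_Ptr := List'Access;

begin
   --  Checks whether the header's link designates the caller's object.
   Ada.Text_IO.Put_Line (Boolean'Image (List.Next = Self));
end Self_Link_Demo;
```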
Note that we had discussed various mechanisms where information from the calling context would be available inside the function at the language level. In particular, it would be possible to refer to the values of the discriminants or bounds of the object being initialized, presuming it was constrained, within the subtype indication and initializing expression, if any.
Ultimately this capability was not included in this proposal, as it created a series of somewhat complicated restrictions on usage and made the implementation that much more difficult. Note that the implementation may still need to pass in information from the calling context, depending on the run-time model, because if the type is "really" limited (e.g. it is limited tagged, or contains a task or a protected object), then the new object must be built in its final resting place. In many run-time models, that means the storage needs to be allocated at the call-site if the object being initialized is a component of some larger object.
However, by not allowing the programmer to refer to this contextual information at the language level, we give the implementation more flexibility in how it solves the build-in-place requirement for "really" limited objects. See the discussion below about implementation approaches.
The proposed syntax for extended return statements was discussed a year or so ago, but when this AI was first written up, we proposed instead a revised object declaration syntax where the word "return" was used almost like the word "constant," as a qualifier. This was somewhat more economical in terms of syntax and indenting, but was not felt to be as clear semantically as this current syntax.
POSSIBLE IMPLEMENTATION APPROACHES
[ASIDE (See AI-325): The implementation approach for anonymous access result types is very similar to that for limited result types. In the following, we will mostly talk about limited result types. Towards the end we will explain how it applies to anonymous access result types. Full accessibility level checking adds to the complexity. At the end we will show how to introduce restrictions that eliminate most of this complexity, in exchange for some loss in functionality. END of ASIDE]
The implementation of the return-by-reference function is the same as the existing capability for functions whose result type is return-by-reference. The difference is that this would be permitted for types that are not "return-by-reference," and perhaps not even limited. In particular, an accessibility check is performed at the point of the return statement, and a reference to the object associated with the return expression is returned to the caller.
The implementation of the extended return statement for non-limited types should minimize the number of copies, but may still require a copy in some implementation models and in some calling contexts.
The implementation of the extended return statement for limited result types is straightforward if the result subtype is constrained. It is essentially equivalent to a procedure with an OUT parameter -- the caller allocates space for the target object, and passes its address to the called routine, which uses it for the "return" object.
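The equivalence with an OUT parameter can be pictured with the sketch below. Make is what the programmer writes; Make_Into is a hand-written model of what one implementation might generate, and is an assumption for illustration rather than a description of any particular compiler. Buffer and both subprogram names are hypothetical.

```ada
with Ada.Text_IO;

procedure Out_Parameter_Model is

   --  Hypothetical constrained limited type, used only for illustration.
   type Buffer is limited record
      Data : String (1 .. 8) := (others => ' ');
      Used : Natural := 0;
   end record;

   --  Source form: the result object is named in an extended return.
   function Make return Buffer is
   begin
      return Result : Buffer do
         Result.Data (1 .. 2) := "ok";
         Result.Used := 2;
      end return;
   end Make;

   --  Conceptual model: the caller owns the storage and passes it in.
   procedure Make_Into (Result : out Buffer) is
   begin
      Result.Data (1 .. 2) := "ok";
      Result.Used := 2;
   end Make_Into;

   A : Buffer := Make;   --  what the programmer writes
   B : Buffer;           --  the caller allocates and default-initializes ...

begin
   Make_Into (B);        --  ... then hands the target to the called routine
   Ada.Text_IO.Put_Line (A.Data (1 .. A.Used));
   Ada.Text_IO.Put_Line (B.Data (1 .. B.Used));
end Out_Parameter_Model;
```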
If the result subtype is unconstrained, then there are two basic possibilities: (NOTE: We have disallowed this case for now.)
1) The target object's (nominal) subtype is definite, and either constrained or the size of the object is independent of the constraints (e.g. allocate-the-max is used for the object); the target object might be a component of a larger object.
2) The target object's nominal subtype is unconstrained, and its size is to be determined by the result returned from the function; the target object must be a stand-alone object, or an "entire" heap object.
In the first case, the caller determines the size of the target object and can allocate space for it; in the second, the caller cannot preallocate space for the target object, and must rely on the called routine allocating space for it in an "appropriate" place.
The code for the called routine must handle both of these cases. One reasonable way to do so is for the caller to provide a "storage pool" for the result. In the first case, this storage "pool" has space for exactly one object of a given maximum size. Its Allocate routine is trivial. It just checks to see if the size is no greater than the space available, and then returns the preallocated (target) address.
In the second case, the storage pool is either the storage pool associated with the initialized allocator at the call site, or a storage pool that represents a secondary stack, or equivalent, used for returning objects of unknown size from a function.
In either case, the function would return the address of the new object.
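The single-object storage "pool" described for the first case can be sketched with a user-defined pool derived from System.Storage_Pools.Root_Storage_Pool. The names Single_Slot_Pool, Target and Int_Ptr are hypothetical, and an ordinary allocator stands in for the allocation a called routine would perform. Alignment is ignored here on the assumption that the caller's target object is already suitably aligned.

```ada
with Ada.Text_IO;
with System;
with System.Storage_Elements; use System.Storage_Elements;
with System.Storage_Pools;

procedure Single_Slot_Demo is

   use type System.Address;

   package Pools is
      --  Hypothetical pool with room for exactly one preallocated object.
      type Single_Slot_Pool is
        new System.Storage_Pools.Root_Storage_Pool with record
           Target : System.Address := System.Null_Address;
           Size   : Storage_Count  := 0;
        end record;

      overriding procedure Allocate
        (Pool                     : in out Single_Slot_Pool;
         Storage_Address          : out System.Address;
         Size_In_Storage_Elements : in Storage_Count;
         Alignment                : in Storage_Count);

      overriding procedure Deallocate
        (Pool                     : in out Single_Slot_Pool;
         Storage_Address          : in System.Address;
         Size_In_Storage_Elements : in Storage_Count;
         Alignment                : in Storage_Count);

      overriding function Storage_Size
        (Pool : Single_Slot_Pool) return Storage_Count;
   end Pools;

   package body Pools is
      procedure Allocate
        (Pool                     : in out Single_Slot_Pool;
         Storage_Address          : out System.Address;
         Size_In_Storage_Elements : in Storage_Count;
         Alignment                : in Storage_Count) is
      begin
         --  Trivial: check the size, then hand back the target address.
         if Size_In_Storage_Elements > Pool.Size then
            raise Storage_Error;
         end if;
         Storage_Address := Pool.Target;
      end Allocate;

      procedure Deallocate
        (Pool                     : in out Single_Slot_Pool;
         Storage_Address          : in System.Address;
         Size_In_Storage_Elements : in Storage_Count;
         Alignment                : in Storage_Count) is
      begin
         null;  --  The caller owns the storage; nothing to release.
      end Deallocate;

      function Storage_Size
        (Pool : Single_Slot_Pool) return Storage_Count is
      begin
         return Pool.Size;
      end Storage_Size;
   end Pools;

   Target : aliased Integer := 0;      --  the caller's preallocated object
   Pool   : Pools.Single_Slot_Pool;

   type Int_Ptr is access Integer;
   for Int_Ptr'Storage_Pool use Pool;

   P : Int_Ptr;

begin
   Pool.Target := Target'Address;
   Pool.Size   := Integer'Max_Size_In_Storage_Elements;
   P := new Integer'(42);              --  "allocates" at Target's address
   Ada.Text_IO.Put_Line (Boolean'Image (P.all'Address = Target'Address));
end Single_Slot_Demo;
```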
A "bare" storage pool may not be enough in general. If the type has any task parts, then these tasks must be placed on an activation list determined by the calling context. They may also be linked onto a master record of some sort, unless this is deferred until activation occurs. Note that the tasks cannot be activated until after returning from the call, since they may have to be activated in conjunction with other tasks having the same master.
If the type has any controlled or protected parts, then the object as a whole, or the individual parts, may need to be added to a cleanup list determined by the calling context.
If the type has any access discriminants, then some kind of accessibility level will need to be provided, since the access discriminant may only be initialized to point to an object whose accessibility level is no deeper than that of the storage pool where the new object is being allocated.
What this means is that rather than passing just a reference to a storage pool, it is more likely the caller will pass a reference to a structure which in turn refers to:
- a storage pool,
- an accessibility level,
- an activation list,
- the associated master,
- a cleanup list
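Such a structure might be declared along the following lines. Every name in this sketch (Return_Context, Pool_Access, Accessibility_Level, and the placeholder types) is hypothetical, and the placeholder records stand in for run-time structures whose real layout is entirely implementation-defined.

```ada
with System.Storage_Pools;

package Return_Context_Sketch is

   type Pool_Access is
     access all System.Storage_Pools.Root_Storage_Pool'Class;

   --  Assumed representation of a (possibly uncollapsed) level.
   type Accessibility_Level is range 0 .. 2**15 - 1;

   --  Placeholders for implementation-specific run-time structures.
   type Activation_List is limited null record;
   type Master_Record   is limited null record;
   type Cleanup_List    is limited null record;

   --  What the caller might pass instead of a bare storage pool.
   type Return_Context is limited record
      Pool       : Pool_Access;
      Level      : Accessibility_Level := 0;
      Activation : access Activation_List;
      Master     : access Master_Record;
      Cleanup    : access Cleanup_List;
   end record;

end Return_Context_Sketch;
```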
TAKING ADVANTAGE OF RESTRICTION TO CONSTRAINED LIMITED RESULT SUBTYPES
In the current proposal, we have disallowed unconstrained limited result subtypes to simplify implementation. By "limited" we mean limited from the function body perspective, not from the caller perspective.
With this limitation, the caller can always preallocate space for the "return object" when limited, and treat the "result" somewhat like an OUT parameter. In addition, the caller can do default initialization for all limited subcomponents. Nonlimited controlled components can still require some fancy footwork, since they can be explicitly initialized, so default initializing them would be inappropriate. But compilers already have to deal with returning non-limited controlled objects, so presumably this won't create an insurmountable burden.
DEALING WITH EXCEPTIONS
There was some concern about what would happen if an exception was propagated by an extended return statement, and then the same or some other extended return statement was reentered. I don't see a real problem. The return object doesn't really exist outside the function until the function returns, so it can be restored to its initial state on call of the function if an exception is propagated from an extended return statement. Once restored to its initial state, there seems no harm in starting over in another extended_return_statement.
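The scenario can be sketched as follows: an exception propagates out of one extended return statement, is handled inside the function, and a second extended return statement then starts over. Counter, Make and Retry_Return_Demo are hypothetical names, and the sketch illustrates the intended behavior rather than any particular implementation.

```ada
with Ada.Text_IO;

procedure Retry_Return_Demo is

   --  Hypothetical limited type, used only for illustration.
   type Counter is limited record
      Value : Natural := 0;
   end record;

   function Make (Start : Integer) return Counter is
   begin
      begin
         return Result : Counter do
            --  Raises Constraint_Error when Start is negative.
            Result.Value := Start;
         end return;
      exception
         when Constraint_Error =>
            --  The first return object was never seen by the caller,
            --  so a second extended return statement can start over.
            return Result : Counter do
               Result.Value := 0;
            end return;
      end;
   end Make;

   C : Counter := Make (-5);

begin
   Ada.Text_IO.Put_Line (Natural'Image (C.Value));
end Retry_Return_Demo;
```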
IMPLEMENTATION APPROACH FOR ANONYMOUS ACCESS RESULT TYPE
For anonymous access result types, a similar approach would be taken. In this case, however, a new object is not required. It would be permissible to return an access value designating a preexisting object. The storage pool parameter would always be required, but the caller could always ignore it. An accessibility level would be needed associated with the storage pool, so the called routine would know the accessibility level of the result of an allocator that used the storage pool. An accessibility level would also need to be returned, so the caller would know the accessibility of the result.
Although the RM talks about accessibility levels in terms of dynamic levels of nesting, most implementations use accessibility levels that correspond to static levels of nesting, but adjust the level when passing a given (formal) access parameter to a more nested subprogram with an access parameter as well, by collapsing deeper static levels into a level that corresponds to the static level of the given formal access parameter's declaration. This is explained in AARM 3.10.2(22.x-22.ee).
Unfortunately, this "collapsing" of levels loses information. So when passing accessibility levels to and from a function with an anonymous access type result, it would be desirable to avoid "collapsing" such levels, and use the original accessibility levels. In some implementations it might be helpful for the caller to provide the called routine with a level for the called routine to use for its own locals, which is guaranteed to be deeper than any level number that the caller cares about.
LIMITED TYPES AND ACCESSIBILITY ISSUES
Because limited types can have access discriminants, and an accessibility check is required when an allocator for such a type is performed to be sure the allocated object doesn't outlive the object referenced by the access discriminant, some kind of accessibility level will also have to be provided to the called routine when a storage pool is provided, at least when the result type has access discriminants. Because the storage pool will often be local to the caller and the access discriminant might be specified via an access parameter to the function, the collapsing of accessibility levels mentioned above would have to be suppressed in this case as well.
Hence we end up with a general rule that when access parameters are passed to a function with a limited result type (with access discriminants), or with an anonymous access result type, no collapsing of accessibility levels is performed. The caller's accessibility levels are used in the access parameters, and in the storage pool. The called routine has to accommodate this somehow. Again, in some implementations it may be helpful for the caller to provide the called routine with an accessibility level it can use for its own locals that is certain to be deeper than any other level passed in from the caller.
POSSIBLE SIMPLIFICATIONS OF ACCESSIBILITY CHECKING
If we would like some of these capabilities, but would like to avoid dealing with uncollapsed accessibility levels, accessibility levels associated with storage pools, etc., then we could make some restrictions that might simplify the implementation (though of course it would complexify the user's model a bit):
1) If a limited result type has access discriminants, then the storage pool passed in must not outlive the function declaration. This would imply that the function could safely set the access discriminants to point to objects with an accessibility level no deeper than the function declaration. This is similar to the test performed on return-by-reference now (6.5(17-20)).
With this restriction, no accessibility level needs to be passed in with the storage pool for limited result types.
Note also that with this restriction, calls on local functions could not be used within initialized allocators for global access types, if the function's result type is a limited type with access discriminants (doesn't seem like much of a loss).
2) For anonymous access result types, if the storage pool were not used inside the function, the accessibility of the returned access value must be no deeper than that of the function declaration (e.g., it could not return the value of an access parameter passed in, unless the access parameter designated an object global to the function). Again, this is essentially the check performed now for return-by-reference types.
If the storage pool is used, then naturally the accessibility is that of the storage pool, so the caller knows that the maximum accessibility depth of the result is the depth of the storage pool or the depth of the function declaration, whichever is deeper.
If the designated type of the anonymous access type is a limited type with access discriminants, then the same restriction as (1) would apply to the storage pool, i.e. that the storage pool depth must be no less than that of the function declaration.
With these restrictions, no accessibility level needs to be passed in with the storage pool for anonymous access result types, and in turn no level would be returned (without the storage pool level passed in, it would be pretty much impossible to pass it back!).
Note that without the level being returned, a local function could not be used to create a value assigned to a variable of a global access type, since the function might return a pointer designating a local object, and it has no way of indicating that.
CONCLUSION
If we are willing to accept the restrictions of the above section, then the implementation burden is roughly the same for either returning limited types or returning anonymous access types, namely that a storage pool may need to be passed in. The called routine needs to use that storage pool when creating a limited return object, or evaluating an allocator whose target is the anonymous access return object. If the return object is itself initialized by a function call, then the storage pool needs to be passed into that function as well, presuming that function also returns a limited type or an anonymous access type.
If a user doesn't explicitly declare a return object, then each return statement is equivalent to a local block that declares a return object initialized from the return expression, and then returns it.
If we don't want to accept the restrictions given above, then accessibility levels need to be passed with the storage pool, and the accessibility levels passed with access parameters should not be "collapsed." An accessibility level would be returned from a function with an anonymous access result type.
Note that an additional advantage of the restricted form is that more accessibility checking can be performed at compile-time, and it will generally involve less run-time overhead.
Given this, it seems appropriate to consider the restricted (compile-time accessibility) form of the proposal first, and only if this is felt sufficiently valuable, to consider the unrestricted form of the proposal.
!ACATS test
ACATS test(s) need to be created for these features.
!appendix

From: Robert Duff
Sent: Wednesday, February 6, 2002,  6:05 PM

                    Limited Types Considered Limited

One of my homework assignments was to propose changes to "fix" limited
types.  This e-mail outlines the problems, and proposed solutions.
I haven't written down every detail -- I first want to find out if there
is any interest in going forward with these changes.  Some of these
ideas came from Tucker.

I apologize for doing my homework at the last minute.  I hope folks will
have a chance to read this before the meeting, and I hope Pascal is
willing to put it on the agenda.

Ada's limited types allow programmers to express the idea that "copying
values of this type does not make sense".  This is a very useful
capability; after all, the whole point of a compile-time type system is
to allow programmers to formally express which operations do and do not
make sense for each type.

Unfortunately, Ada places certain limitations on limited types that have
nothing to do with the prevention of copying.  The primary example is
aggregates: the programmer is forced to choose between the benefits of
aggregates (full coverage checking) and the benefits of limited types.
Forcing programmers to choose between two features that ought to be
orthogonal is one of the most frustrating aspects of Ada.

I consider the full coverage rules (for aggregates and case statements)
to be one of the primary benefits of Ada over many other languages,
especially with type extensions, where some components are inherited
from elsewhere.  I will refrain from further singing the praises of full
coverage; I assume I'm preaching to the choir.

My goals are:

    - Allow aggregates of limited types.

    - Allow constructor functions that return limited types.

    - Allow initialization of limited objects.

    - Allow limited constants.

    - Allow subtype_marks in aggregates more generally.
      (They are currently allowed only for the parent part in an
      extension aggregate.)

The basic idea is that there is nothing wrong with constructing a
limited object; *copying* is the evil thing.  One should be allowed to
create a new object (whether it be a standalone object, or a formal
parameter in a call, or whatever), and initialize that object with a
function call or an aggregate.  In implementation terms, the result
object of the function call, or the aggregate, is built in place in its
final destination -- no copying is necessary, or allowed.

All of the above goals except constructor functions are fairly trivial
to achieve, both in terms of language design and in terms of
implementation effort.  Constructor functions are somewhat more
involved.  However, I am against any language design that allows
aggregates where function calls are not allowed; subprogram calls are
perhaps the single most important tool of abstraction ever invented!
(There is at least one other such case in Ada, and I hate it.)

By "constructor function", I mean a function that returns an object
created local to the function, as opposed to an object that already
existed before the function was called.

Ada currently allows functions to return limited types in two cases,
neither of which achieves the goal here:

    If the limited type "becomes nonlimited" (for example, a limited
    private type whose full type is integer), then constructor functions
    are allowed, but the return involves a copy, thus defeating the
    purpose of limited types.  Anyway, this feature is not allowed for
    various types, such as tagged types.

    If the limited type does not become nonlimited, then it is returned
    by reference, and the returned object must exist prior to the
    function call; it cannot be created by the function.  In essence,
    these functions don't return limited objects at all; they simply
    return a pointer to a preexisting limited object (or perhaps
    a heap object).

We need a new kind of function that constructs a new limited object
inside of itself, and returns that object to be used in the
initialization of some outer object.  The run-time model is that the
caller allocates the object, and passes in a pointer to that object.
The function builds its result in that place; thus, no copying is done
on return.

Because the run-time model for calls to these constructor functions is
different from that of existing functions that return a limited type, we
need to indicate this syntactically on the spec of the function.  In
particular, change the syntax of functions so that "return" can be
replaced by "out", indicating a constructor function.  In addition,
change the syntax of object declarations to allow "out", as in
"X: out T;"; this marks the object as "the result object" of a limited
constructor function.  The reason for "out" is that these things behave
much like parameters of mode 'out'.

Examples:

    type T is tagged limited
        record
            X: ...;
            Y: ...;
            Z: ...;
        end record;

    type Ptr is access T'Class;

    Object_1: constant T := (X => ..., Y => ..., Z => ...);

    function F(X: ...; Y: ...) out T;

    function F(X: ...; Y: ...) out T is
        Result: out T := (X => X, Y => Y, Z => ...);
    begin
        ... -- possible modifications of Result.
        return Result;
    end F;

    Object_2: Ptr := new T'(F(X => ..., Y => ...));
        -- Build a limited object in the heap.

Rules:

Change the rules in 4.3 to allow limited aggregates.  This basically
means erasing the word "nonlimited" in a few places.

Change the rule in 3.3.1(5) about initializing objects to allow limited
types.  But require the expression to be an aggregate or a constructor
function.  ("X: T := Y;", where Y is a limited object, remains illegal,
because that would necessarily involve a copy.)  There are various
analogous rules (initialized allocators, subexpressions of aggregates,
&c) that need analogous changes.

Assignment statements remain illegal for limited types, even if the
right-hand side is an aggregate or limited constructor function.

Allowing constants falls out from the other rules.

Allow a component expression in an aggregate to be a subtype_mark.  This
means that the component is created as a default-initialized object.
It's essentially the same thing we already allow in an extension
aggregate; we're simply generalizing it to all components of all
aggregates.  This is important, in case some part of the type is
private.  There is no reason to limit this capability to limited types.

Specify that limited aggregates are built "in place"; there is always a
newly-created object provided by the context.  Note that we already have
one case where aggregates are built in place: (nonlimited) controlled
aggregates.  Similarly, the result of a limited constructor function is
built in place; the context of the call provides a newly-created object.
(In the case of "X: T := F(...);", where F says "return G(...);", F will
receive the address of X, and simply pass it on to G.)

If the result object of a limited constructor function contains tasks,
the master is the caller.

For a function whose result is declared "out T", T must be a limited
type; such a function is defined to be a "limited constructor function".

Subtype T must be definite.  This rule is not semantically necessary.
However, the run-time model calls for the caller to allocate the result
object, and this rule allows the caller to know its size before the
call.  Without this rule, a different run-time model would be required
for indefinite subtypes: the called function would have to allocate the
result in the heap, and return a pointer to it.  A design principle of
Ada is to avoid language rules that require implicit heap allocation;
hence this rule.  (An alternative rule would be that T must be
constrained if composite, thus eliminating defaulted discriminants.)

A limited constructor function must have exactly one return statement.
The expression must be one of the following:

    - An object local to the function (possibly nested in block
      statements), declared with "out".

    - A function call to a limited constructor function.

    - An aggregate.

    - A parenthesized or qualified expression of one of these.

An object declared "out" must be local to a limited constructor
function.

A constraint check is needed on creation of a local "out" object.
We have to do the check early (as opposed to the usual check on the
return statement), because we need to make sure the object fits in the
place where it belongs (at the call site).  If the return expression is
an aggregate, that needs a constraint check, as usual.  If the return
expression is a function call, then that function will do whatever
checking is necessary.

Is there an issue with dispatching-on-result functions?
I don't think so.


Compatibility:

This change is not upward compatible.  Consider:

    type Lim is limited
        record
            Comp: Integer;
        end record;

    type Not_Lim is
        record
            Comp: Integer;
        end record;

    procedure P(X: Lim);
    procedure P(X: Not_Lim);

    P((Comp => 123));

The call to P is currently legal, and calls P(Not_Lim).  In the new
language, this call will be ambiguous.

This seems like a tolerable incompatibility.  It is always caught at
compile time, and cases where nonlimitedness is used to resolve
overloading have got to be vanishingly rare.  The above program, though
legal, is highly confusing, and I can't imagine anybody wanting to do
that.  The current rule was a mistake in the first place: even if
limited aggregates *should* be illegal, that should not be a Name
Resolution Rule.


Other advantages:

One advantage of this change is that it makes the usage style of limited
types more uniform with nonlimited types, thus making the language more
accessible to beginners.

How do you construct an object in Ada?  You call a function.  Cool -- no
need for the kludginess of C++ constructors.  But if it's limited, you
have to fool about with discriminants -- not something that would
naturally occur to a beginner.  And discriminants have various annoying
restrictions when used for this purpose.

How do you capture the result of a function call?  You put it in a
constant: "X: constant T := F(...);".  But if it's limited, you have to
*rename* it: "X: T renames F(...);".  Again, that's not something that
would naturally occur to a beginner -- and the beginner would rightly
look upon it as a "trick" or a "workaround".

Another point is that the current rules force you into the heap,
unnecessarily.  You end up passing around pointers to limited objects,
either explicitly or implicitly, which tends to add complexity to one's
programs.

Limited types offer other advantages in addition to lack of copying:
access discriminants, and the ability to take 'Access of the "current
instance".  It seems a shame to require the programmer to choose between
these and aggregates.


Alternatives:

It is not strictly necessary to mark the result object with "out"; the
compiler could deduce this information by looking at the return
statement(s).  However, marking the object simplifies the compiler -- it
needs to treat this object specially by using the space allocated by the
caller.

It is not necessary to limit the number of return statements to 1.
However, it seems simplest.  We need to prevent things like this:

    function F(...) out T is
        Result_1: out T;
        Result_2: out T;
    begin
        Result_1.X := 123;
        Result_2.X := 456;
        if ... then
            return Result_1;
        else
            return Result_2;
        end if;
    end F;

because we can't allocate Result_1 and Result_2 in the same place!
On the other hand, the following could work:

    function F(...) out T is
        Result: out T;
    begin
        if ... then
            return Result;
        else
            return Result;
        end if;
    end F;

suggesting a rule that all return statements must refer to the *same*
object.  But this could work, too:

    function F(...) out T is
    begin
        if ... then
            return (...); -- aggregate
        elsif ... then
            return G(...); -- function call
        elsif ... then
            declare
                Result_1: out T;
            begin
                Result_1.X := 123;
                return Result_1; -- local
            end;
        else
            declare
                Result_2: out T;
            begin
                Result_2.X := 456;
                return Result_2; -- different local
            end;
        end if;
    end F;

because only one of the four different result objects exists at any
given time.  I'm not sure how much to relax this rule.  Perhaps some
rule about declaring only one of these special result objects in a given
region?

****************************************************************

From: Tucker Taft
Sent: Thursday, February 7, 2002,  6:15 AM

I would allow these "constructor" functions for any kind of
type.  I would require that at most one OUT local variable
be declared.  It must be in the outermost declarative part
(to avoid it going out of scope before it was returned), and
that all return statements must identify the OUT local variable
if present (or perhaps, all return statements must omit the
return expression completely if present).

Allowing the declaration of an "OUT" local variable might
be generalized to normal functions (with the same restrictions
as above).  For a "normal" function, the OUT local variable
would represent the returned object, but with the difference that
the space for it is allocated by the called routine, rather than
the caller.  This allows the Pascal "style" of assigning to
a return object, when it is appropriate.

****************************************************************

From: Robert Duff
Sent: Thursday, February 7, 2002,  7:52 AM

> I would allow these "constructor" functions for any kind of
> type.

You mean for nonlimited as well as limited?  Sounds OK.

I assume you do not mean to allow them for unknown-size subtypes.

>...  I would require that at most one OUT local variable
> be declared.  It must be in the outermost declarative part
> (to avoid it going out of scope before it was returned),

I don't see why that's necessary.  The OUT variable doesn't *really* go
away -- it's really just a view of the object created by the caller.
No action is required to return it -- it's already in the right spot,
so a return of it is simply a "goto end of function".

>... and
> that all return statements must identify the OUT local variable
> if present (or perhaps, all return statements must omit the
> return expression completely if present).

A less restrictive rule is that all return statements that refer to a
variable must refer to the OUT variable.  Does that not work?

> Allowing the declaration of an "OUT" local variable might
> be generalized to normal functions (with the same restrictions
> as above).

OK.

>...  For a "normal" function, the OUT local variable
> would represent the returned object, but with the difference that
> the space for it is allocated by the called routine, rather than
> the caller.  This allows the Pascal "style" of assigning to
> a return object, when it is appropriate.

One question is: what if the function doesn't execute a return
statement?  That's currently erroneous.  But if there's an OUT variable,
it would seem sensible to just let the function fall off the end, and
have the OUT variable define the result.

Which implies that we should eliminate the rule requiring at least one
return statement, in the case where there's an OUT variable.

I like it!

I presume the OUT object can actually be a constant or variable, by the
way.

****************************************************************

From: Tucker Taft
Sent: Thursday, February 7, 2002,  8:28 AM

> > I would allow these "constructor" functions for any kind of
> > type.
>
> You mean for nonlimited as well as limited?  Sounds OK.

Yes, that's what I meant.  And since some limited types
become non-limited, I think this may be necessary.

> I assume you do not mean to allow them for unknown-size subtypes.

Right.  They should have exactly the same restrictions,
independent of whether the type is limited or nonlimited,
and the caller should always allocate space for the
returned object.  That way for limited types that
become non-limited, we don't run into any weirdness.

This also means that one can switch between limited and non-limited
with minimal semantic disruption, and you don't have to remember
a lot of special cases for limited vs. non-limited (I presume
that is one of the key goals of this proposal).

> >...  I would require that at most one OUT local variable
> > be declared.  It must be in the outermost declarative part
> > (to avoid it going out of scope before it was returned),
>
> I don't see why that's necessary.  The OUT variable doesn't *really* go
> away -- it's really just a view of the object created by the caller.
> No action is required to return it -- it's already in the right spot,
> so a return of it is simply a "goto end of function".

I don't see how that could work.  Consider the following:

    type Lim_T(B : Boolean := False) is record
        case B is
            when True => X : Task_Type;
            when False => null;
        end case;
    end record;


    function Cons(...) out Lim_T is
    begin
        declare
             Result : out Lim_T(True);
        begin
             null;
        end;
        return Lim_T'(B => False);
    end Cons;

Are you going to require a "return" on all paths out of the declare block?
What if an exception is propagated from the declare block, and then in
the handler there is a return of something with a different discriminant?

> >... and
> > that all return statements must identify the OUT local variable
> > if present (or perhaps, all return statements must omit the
> > return expression completely if present).
>
> A less restrictive rule is that all return statements that refer to a
> variable must refer to the OUT variable.  Does that not work?

No; see above.  I think once you create the OUT local, that
must be the one that gets returned.  Hence, it is probably
simplest if there is an OUT local declared, to eliminate
the use of return expressions, or even the requirement for
an explicit "return" statement, so it would work like
a procedure with one OUT parameter.

> > Allowing the declaration of an "OUT" local variable might
> > be generalized to normal functions (with the same restrictions
> > as above).
>
> OK.
>
> >...  For a "normal" function, the OUT local variable
> > would represent the returned object, but with the difference that
> > the space for it is allocated by the called routine, rather than
> > the caller.  This allows the Pascal "style" of assigning to
> > a return object, when it is appropriate.
>
> One question is: what if the function doesn't execute a return
> statement?  That's currently erroneous.  But if there's an OUT variable,
> it would seem sensible to just let the function fall off the end, and
> have the OUT variable define the result.
>
> Which implies that we should eliminate the rule requiring at least one
> return statement, in the case where there's an OUT variable.
>
> I like it!

In my view, this would be desirable only if we eliminate all return
expressions from return statements for constructor functions with a
local OUT object, making them obey the same rules as procedures
thereafter.

> I presume the OUT object can actually be a constant or variable, by the
> way.

I would not allow it to be a constant.  The word "OUT" would take
the place of the word "CONSTANT" in the syntax, the way I see it.
It seems nearly useless to have it be a constant, and clearly,
the caller could use it to initialize a variable, so calling it a constant
could be misleading.

****************************************************************

From: Robert Duff
Sent: Thursday, February 7, 2002,  8:57 AM

> I don't see how that could work.

Neither do I.  ;-)

> No; see above.  I think once you create the OUT local, that
> must be the one that gets returned.  Hence, it is probably
> simplest if there is an OUT local declared, to eliminate
> the use of return expressions, or even the requirement for
> an explicit "return" statement, so it would work like
> a procedure with one OUT parameter.

Agreed.

> In my view, this would be desirable only if we eliminate all return
> expressions from return statements for constructor functions with a
> local OUT object, making them obey the same rules as procedures
> thereafter.

Yes, that makes sense.

> I would not allow it to be a constant.  The word "OUT" would take
> the place of the word "CONSTANT" in the syntax, the way I see it.
> It seems nearly useless to have it be a constant, and clearly,
> the caller could use it to initialize a variable, so calling it a constant
> could be misleading.

Yes, of course.  I wasn't thinking clearly.

****************************************************************

From: Robert Dewar
Sent: Thursday, February 7, 2002,  9:02 AM

I must say that for me, this entire proposal seems to be insufficiently
grounded in real requirements. I am concerned that the ARG is starting to
wander around in the realm of nice-to-have-neat-language-extensions which
are really rather irrelevant to the future success of Ada. I am not opposed
to a few extensions in areas where a really important marketplace need has
been demonstrated, but the burden for new extensions should be extremely
high in my view, and this extension seems to fall far short of meeting
that burden.

****************************************************************

From: Randy Brukardt
Sent: Thursday, February 7, 2002,  2:19 PM

I hate to be agreeing with Robert here :-), but he's right.

There is a problem worth solving here (the inability to have constants of
limited types), but that could adequately be solved simply by the 'in-place'
construction of aggregates (which we already require in similar contexts).
[I'll post a real-world example of the problem in my next message.] The problem
is relatively limited, and thus the solution also has to be limited, or it
isn't worth it. This whole business of constructor functions will only sink any
attempt to fix the real problem, because it is just too big of a change at this
point.

Bob's concerns about the purity of the language would make sense in a new
language design, but we're working with limited resources here, and simple
solutions are preferred over perfect ones.

****************************************************************

From: Randy Brukardt
Sent: Thursday, February 7, 2002,  3:05 PM

Here is an example that came up in Claw where we really wanted constants of a
limited type:

The Windows registry contains a bunch of predefined "keys", along with user
defined keys. Our original design for the key type was something like (these
types were all private, and the constants were deferred, but I've left that out
for clarity):

    type Key_Type is new Ada.Finalization.Limited_Controlled with record
      Handle : Claw.Win32.HKey := Claw.Win32.Null_HKey;
      Predefined_Key : Boolean := False; -- Is this a predefined key?
                                         -- (Only valid if Handle is not null)
      -- other components.
    end record;

    Classes_Root : constant Key_Type := (Ada.Finalization.Limited_Controlled with
       Handle => 16#80000000#, -- Windows magic number
       Predefined_Key => True, ...);
    Current_User : constant Key_Type := (Ada.Finalization.Limited_Controlled with
       Handle => 16#80000001#, -- Windows magic number
       Predefined_Key => True, ...);
    -- And several more like this.

    procedure Open (Key    : in out Key_Type;
                    Parent : in Key_Type;
                    Name   : in String);

    procedure Close (Key   : in out Key_Type);

    procedure Put (Root_Key   : in Key_Type;
                   Subkey     : in String;
                   Value_Name : in String;
                   Item       : in String);

    -- and so on..

Of course, our favorite compiler rejected the constants as illegal.

So, they were turned into functions.

   function Classes_Root return Key_Type;
   function Current_User return Key_Type;

However, these have the problem that they have to be overridden for any
extensions of the type (as they are primitive). We could have put them into a
child/nested package (to make them not primitive), but that would bend the
structure of the design even further and add an extra package for no good
reason. We also could have made them class-wide, but that would be a misleading
specification, as they can never return anything other than Key_Type. So we
left them in the main package.

Aside: we originally wanted to use these as default parameters for some of the
various primitive routines. However, that would be illegal by 3.9.2(11) unless
they are primitive functions. This rule exists so that the default makes sense
in inherited primitives. But we really would have preferred that the default
expressions weren't inherited; they only make sense on the base routines. That
is a problem that probably isn't worth solving though.

Of course, now that we had functions, we had to implement them. The first try
was:

   function Classes_Root return Key_Type is
   begin
       return (Ada.Finalization.Limited_Controlled with
           Handle => 16#80000000#, -- Windows magic number
           Predefined_Key => True, ...);
   end Classes_Root;

But our friendly compiler told us that THIS was illegal, because this is a
return-by-reference type, and the aggregate doesn't have the required
accessibility.

So we had to add a library package-level constant and return that:

   Standard_Classes_Root : constant Key_Type := (Ada.Finalization.Limited_Controlled with
       Handle => 16#80000000#, -- Windows magic number
       Predefined_Key => True, ...);

   function Classes_Root return Key_Type is
   begin
       return Standard_Classes_Root;
   end Classes_Root;

But of course THAT is illegal (it's the original problem all over again), so we
had to turn that into a variable and initialize it component-by-component in
the package body's elaboration code:

   Standard_Classes_Root : Key_Type;

   function Classes_Root return Key_Type is
   begin
       return Standard_Classes_Root;
   end Classes_Root;

begin
   Standard_Classes_Root.Handle := 16#80000000#; -- Windows magic number
   Standard_Classes_Root.Predefined_Key := True;
   ...

Which is essentially how it is today.

This turned into such a mess that we gave up deriving from it altogether, and
created an entirely new higher-level abstraction to provide the most commonly
used operations in an easy to use form. Thus, we ended up losing out on the
benefits of O-O programming here.

I certainly hope that newcomers to Ada don't run into a problem like this,
because it is a classic "stupid language" problem.

Simply having a way to initialize a limited constant with an aggregate would be
sufficient to fix this problem. "Constructor functions" might add
orthogonality, but seem unnecessary to solve the problem of being able to have
constants as part of an abstraction's specification.

****************************************************************

From: Robert Duff
Sent: Friday, February 8, 2002, 10:12 AM

> Simply having a way to initialize a limited constant with an aggregate would
> be sufficient to fix this problem. "Constructor functions" might add
> orthogonality, but seem unnecessary to solve the problem of being able to
> have constants as part of an abstraction's specification.

Surely you don't mean that we would allow limited aggregates only for
initializing stand-alone constants?!  Surely, you could use them to
initialize variables.  And if they can be used to initialize variables,
surely initialized allocators should be allowed.  And of course,
parameters.

In *my* programs, much of the data is heap-allocated.  I want to say:

    X: Some_Ptr := new T'(...);

when T is limited.  Allowing only constants would solve about 1% of *my*
problem.

Are you saying that this is illegal:

    P(T'(...));

and I have to instead write:

    Temp: constant T := (...);

    P(Temp);

?!  That sort of arbitrary restriction is what makes people laugh at the
language.

****************************************************************

From: Dan Eilers
Sent: Friday, February 8, 2002, 12:12 PM

Bob Duff wrote:
> My goals are:
>
>     - Allow aggregates of limited types.
>
>     - Allow constructor functions that return limited types.
>
>     - Allow initialization of limited objects.
>
>     - Allow limited constants.
>
>     - Allow subtype_marks in aggregates more generally.
>       (They are currently allowed only for the parent part in an
>       extension aggregate.)

Tuck wrote:
> I would allow these "constructor" functions for any kind of
> type.

I agree that the non-limited case is also important, and should be
listed as an explicit goal of the AI.  The non-limited case is an
efficiency issue, where a programmer wishes to prevent unnecessary
copying of large objects implied by the semantics of aggregates and
function calls.


Tuck wrote:
> ... so it would work like a procedure with one OUT parameter.

The proposal seems to go to a lot of trouble to define a new
kind of function that behaves exactly like a procedure with
one OUT parameter.  I think there may be a simpler solution
involving an extension to function renaming.  The constructor
function would be declared as a procedure with one OUT parameter,
and then renamed (allowing it to be called) as a function, using
the type of the OUT parameter as its return type.

Example:

        procedure p(x: some_type; y: out return_type);

        function f(x: some_type) return return_type renames p;


> > I assume you do not mean to allow them for unknown-size subtypes.
>
> Right.  They should have exactly the same restrictions,
> independent of whether the type is limited or nonlimited,
> and the caller should always allocate space for the
> returned object.  That way for limited types that
> become non-limited, we don't run into any weirdness.

There are many cases, such as the string concat function, where
the return type is declared to be unconstrained (unknown-size),
but really its size is a function of the sizes of the parameters
and therefore computable before the call.  It might facilitate
allocating space in the caller if Ada had a way of expressing the
size of the return type in terms of the parameters, possibly
using the proposed new assertion mechanism (AI-00286).

****************************************************************

From: Tucker Taft
Sent: Friday, February 8, 2002,  5:39 PM

This is an intriguing idea.  Clearly less syntax invention
than the "out" function return idea.  We actually perform the
transformation implied (from a procedure with an OUT parameter to
a function) as an optimization, when the OUT parameter is of an
elementary type.  Also there is a DEC-Ada pragma which imports an
external function as an Ada procedure, because the external function
had OUT parameters in addition to its return value.  I believe GNAT
supports this pragma.

Unfortunately, I'm not sure the above would work for the case when
you want to use an aggregate as the return expression.  Also, a procedure
presumes its OUT parameter is already at least default-initialized.
With the constructor function, the initialization was deferred until
entering the function, where it could be initialization from an aggregate
or from some other function call.

Another approach is to use the proposal for anonymous access
types as return types, which for (access-to) limited types has many of
the same advantages as the constructor function concept (see AI 231 for
details).

****************************************************************

From: Tucker Taft
Sent: Friday, February 8, 2002,  5:48 PM

Given the relative complexity of the constructor function
concept compared to the other de-limiting ideas, I would propose
we split the AI.  The simpler one would allow:

   1) Aggregates of limited objects, with use of a subtype_mark to mean
      default init of a component;
   2) Explicit initialization of limited objects, both in a declaration and
      an allocator, from an aggregate, with the aggregate built "in place" in
      the target.

The more complex one would address functions constructing limited objects
on behalf of the caller.

The aggregate one seems very straightforward.  Almost just eliminate
the existing restriction, presuming that compilers have already learned
how to build aggregates "in place" for controlled types.

The function one looks like a lot of work.

****************************************************************

From: Tucker Taft
Sent: Sunday, January 12, 2003,  7:55 PM

When we presented some of the Ada 200Y ideas
at SIGAda, there was a feeling that if we
added support for aggregates of a limited
type, we should also have function returns.
Bob and I don't feel the two need to be tied
that closely together, but they both go in the
category of making limited types less limited.

In any case, I got to thinking about the problem
more, and wrote the following note to Bob
describing a "brainstorm" I had a couple of nights
ago.  Bob said I might as well forward this to
the full ARG for comments.  He hasn't decided whether
he will incorporate it into an AI on limited function
returns....

So, fire away.
-Tuck

---------

Bob,
I realized sometime after I gave my quick
response on constructor functions, that I had
forgotten about one of the main challenges, namely
the desire to execute some statements (assignments,
procedure calls, etc.) to initialize the object
being returned, before actually returning it.
If the only thing you can do is return an
aggregate or another function call, you don't
have much flexibility, and there is no way to
insert a call on a procedure.

Which got me to thinking about the various special
naming conventions we had talked about for local
variables which *must* be returned, all of which
were unsatisfactory, kludgey, inelegant, etc.

And then suddenly the idea came to me that if you
could attach the statements directly to the
return statement, that would be nice.

E.g. something like:
     return blah with {statement} end return;

But then you need a name for the object being returned,
so that led to something like:

     return X : blah with {statement} end return;

And then I thought, what construct in Ada already has
an optional set of statements following it?  The
"accept" statement.  So why not try to make use of the
lonely "do" reserved word.  Also, it seems odd to have
a name on an expression, so let's make it a regular declaration
with a subtype indication as well.  This leads to:

    return_statement ::=
       RETURN ;
     | RETURN expr ;
     | RETURN identifier : subtype_indication [:= expr]
         [ DO
              handled_sequence_of_statements
           END [identifier] ;
         ]

For example:

     function Cool(B : Boolean) return Variant_Rec is
     begin
        if B then
            return Result : Variant_Rec(True) do
                Fixup(Result);
            end Result;
        else
            return Result : Variant_Rec(False) do
                Different_Fixup(Result);
            end Result;
        end if;
     end Cool;

With this construct, we could allow limited function
returns, where either the second form of "return" statement
is used and the expr is an aggregate or a function call
(or a reference to a long-lived existing object, per Ada 95),
or the third form of the "return" statement is used, and
pretty much anything goes, since you are clearly creating
a new object.

This construct would also make it possible to support
a result type that was an anonymous access type.  E.g.:

     function Cooler(Blah : access T) return access U is
     begin
         return Result : access U := new U(discrim => blah) do
             Result.Fum := Blah;
         end Result;
     end Cooler;

and the caller could determine the storage pool used for
"Result" from context:

     X : U_Ptr := Cooler(Something'Access);

In fact, limited function returns and anonymous access-type returns
could be seen as almost the same thing.  To implement both,
the caller has to pass in a storage pool, and an accessibility
level.  The called routine can either use that storage pool
(and its associated finalization/dependent-task list, if needed),
or it can return an access/reference to an existing object, so
long as it satisfies the accessibility check.  There might
be an accessibility level indication or some other flag that
means the return object *must* be newly allocated out of the
given storage pool.  The compiler would also have to implicitly
create a local storage pool to be passed in when the result
of the function call is used to initialize a local variable.

I suppose one could get even more radical, and allow this
"do ... end identifier;" at the end of any declaration that
is declaring only one object (i.e. isn't "X, Y : Blah").
This would solve the old problem of making sure any initialization
procedures that need to be called get connected tightly to
the declaration.  But that problem could probably be solved
better by having limited functions with the "return ... do ... end ID;"
construct, so I think I will keep this more radical suggestion
to myself ;-).

So, that was my great "brainstorm" last night (well, actually
this morning when I couldn't sleep...).  As they used to
say during the 9X project, I'll go don my flak jacket now,
so feel free to fire away.
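
[Editor's note: For readers who want to try the idea, the following is a
minimal compilable sketch of the Cool example. It assumes an Ada 2005 or
later compiler and uses the closing form "end return;" that Ada 2005
eventually standardized, rather than the "end Result;" form proposed above.
The declaration of Variant_Rec, its components, and the demo procedure name
are illustrative assumptions, not part of the original message.]

```ada
with Ada.Text_IO;

procedure Extended_Return_Demo is

   --  Assumed declaration: a limited variant record, so the result
   --  cannot be copied and has to be built in place.
   type Variant_Rec (Flag : Boolean) is limited record
      case Flag is
         when True  => Count : Natural   := 0;
         when False => Label : Character := ' ';
      end case;
   end record;

   function Cool (B : Boolean) return Variant_Rec is
   begin
      if B then
         --  Result names the object being returned; the statements
         --  between "do" and "end return" run before the return completes.
         return Result : Variant_Rec (True) do
            Result.Count := 1;
         end return;
      else
         return Result : Variant_Rec (False) do
            Result.Label := 'x';
         end return;
      end if;
   end Cool;

   --  The discriminant of Obj is taken from the function result.
   Obj : constant Variant_Rec := Cool (True);

begin
   Ada.Text_IO.Put_Line ("Count =" & Natural'Image (Obj.Count));
end Extended_Return_Demo;
```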

****************************************************************

From: Dan Eilers
Sent: Monday, January 13, 2003,  1:01 PM

Tuck's proposal looks interesting to me.

I particularly like the idea of somehow solving "the old problem
of making sure any initialization procedures that need to be called
get connected tightly to the declaration."

Being able to attach initialization code to declarations is useful
for a variety of reasons, including eliminating the overhead of
default initialization code where explicit initialization is
provided later, and making sure the declaration is initialized
before its first use.

****************************************************************

From: Tucker Taft
Sent: Friday, May 23, 2003,  8:05 AM

I believe I floated a "trial balloon" a month or so
ago about a syntax to support returning objects of
a limited type from a function.  Bob Duff pointed
out that it created yet another level of nesting in
simple cases.  Also, it involved a completely new
syntactic construct (return ... do ... end), which
seems excessive.  So here is a revamped proposal, now
structured as a "real" AI.

[Editor's note: this is version /01 of the AI.]

****************************************************************

From: Randy Brukardt
Sent: Friday, May 23, 2003,  1:22 PM

My gut reaction is that your trial balloon syntax is much preferred:
-- Using different syntax for return means that it is clear that this is not
the "normal" return with copying;
-- There is no need to look all over the source code to find the return
object;
-- There is no set of complex rules to guarantee that only one return object
is available in a given context.

The 'weight' of the extra syntax is pretty similar either way (an entire new
kind of object declaration doesn't seem "light" to me either).

So, given the advantages of the original syntax, and since the only
identified problem is an extra level of nesting (who cares?), I much prefer
that alternative. (I'm unconvinced that we can afford the complexity of any
of these proposals, but that's a different issue altogether. I don't like
the idea that the compiler has to be able to determine at run-time whether a
call is 'build-in-place' or 'existing object reference'; that seems like
substantial overhead, and I would expect 'build-in-place' to be commonly
used.)

****************************************************************

From: Tucker Taft
Sent: Friday, May 23, 2003,  1:50 PM

Can you explain this a bit more?  The called routine knows
whether it is returning an existing object or a new object,
so I don't see extra overhead there.  I suppose it has to
check whether the caller allowed returning an existing object,
but that is just a simple test, certainly cheaper than the
average constraint check.

The caller shouldn't care, since it can always use the address
returned from the called function, whether or not it created
a new object.

What is the source of the overhead that I am missing?

****************************************************************

From: Randy Brukardt
Sent: Friday, May 23, 2003,  3:26 PM

According to your write-up, the caller has to pass into the function some or
all of:
 -- A flag as to whether an object return is allowed;
 -- The address of the memory to build the return in;
 -- A storage pool;
 -- An accessibility level.
Obviously, there is going to be overhead to build and pass these things.
(Parameter passing isn't free!) Even if these are all packed into a
descriptor, initializing that descriptor is going to take a bunch of
instructions. That's especially true if a storage pool is created on the fly
for the call (which your writeup suggested in some cases).

So, such function calls look quite a bit more expensive than the similar
aggregates or the currently existing function calls. It's not likely to be
hundreds of times worse, but it's pretty complicated and certainly will slow
down these calls. That might matter for a few existing programs.

****************************************************************

From: Tucker Taft
Sent: Friday, May 23, 2003,  5:19 PM

If we adopt the restriction that eliminates run-time accessibility issues
for this proposal, then what the AI suggested was as follows:

1) If the function had a known-size result, then the caller would
   preallocate the space, and pass this address in the usual way for a
   function that returned a "large" result.  In addition, a flag
   would be passed to indicate whether the function was allowed
   to simply return the address of a preexisting object.  If so,
   then the caller would expect the function to return the address
   of the result, which could be the preallocated space, or the
   preexisting object's address.

   In this case, the only extra overhead for typical implementations
   would be the extra boolean flag, and the test against it.

2) If the function had an unknown-size result, and hence would normally
   have to allocate the result on a secondary stack or heap, then the
   caller would pass in a storage pool and a boolean flag (or a possibly-null
   storage pool).  The storage pool is one of the following:
     a) a "normal" storage pool, presuming the function call is used
        as the expression of an initialized allocator
     b) a special "secondary stack" storage pool, presumably which could
        be precreated by the run-time system
     c) an on-the-fly constructed "preallocated-space" storage pool, which
        at a minimum would consist of:
          i) a tag identifying it as one of these kinds of storage pools
          ii) an address of the preallocated space
          iii) the length of the preallocated space

Case (2c) seems like the only one that involves measurable extra work at the
call-site.  Presuming the storage pool is allocated on the primary stack
then at the call-site you would have at least 3 instructions to initialize
the storage pool (assignments of the tag, address, and length), probably
more like 6 for the typical RISC machine.  Then you would have to pass
the address of the storage pool as an implicit parameter.

So I agree there would be some overhead, but by using the storage-pool
"abstraction" for cases (2a,2b,2c) and a simple boolean flag for (1),
the total amount seems pretty modest.

Just to be more precise about the preallocated-space storage pool, here
is a sample implementation of such a beast:

   type Preallocated_Space_Storage_Pool(Addr : Integer_Address; Max_Size : Storage_Count) is
     new Root_Storage_Pool with null record;

   procedure Allocate(
     Pool : in out Preallocated_Space_Storage_Pool;
     Storage_Address : out Address;
     Size_In_Storage_Elements : in Storage_Count;
     Alignment : in Storage_Count) is
   begin
       -- The discriminants are reached through the Pool parameter.
       if Size_In_Storage_Elements > Pool.Max_Size then
           raise Storage_Error;
       end if;
       Storage_Address := To_Address(Pool.Addr);
   end Allocate;

   procedure Deallocate(...) is begin raise Program_Error; end;
   function Storage_Size(Pool : ...) return Storage_Count is begin return Pool.Max_Size; end;

A local object of this type would need to be created at the call-site and
passed as an implicit parameter when the space for the object is preallocated
by the caller (case 2c above).

> So, such function calls look quite a bit more expensive than the similar
> aggregates or the currently existing function calls. It's not likely to be
> hundreds of times worse, but it's pretty complicated and certainly will slow
> down these calls. That might matter for a few existing programs.

It doesn't look that expensive to me.  Calling functions with unknown-size
results is relatively expensive anyway.  Presuming case (2c) above is relatively
rare (limited type, unknown size result, caller preallocates), this doesn't
seem like a show-stopper.
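
[Editor's note: The following is a hedged, self-contained sketch of how case
(2c) might look if written out by hand in Ada 2005 or later. The type follows
the sample above; the names Preallocated_Demo, Preallocated_Pool, Buffer,
Counter, and Counter_Ptr are assumptions made for illustration, and an
ordinary allocator stands in for the implicit pool parameter a compiler would
pass. As in the sample, the Alignment argument is ignored.]

```ada
with Ada.Text_IO;
with System;                  use System;
with System.Storage_Elements; use System.Storage_Elements;
with System.Storage_Pools;    use System.Storage_Pools;

procedure Preallocated_Demo is

   --  A pool that hands out one caller-supplied block of storage.
   type Preallocated_Pool
     (Addr : Integer_Address; Max_Size : Storage_Count) is
     new Root_Storage_Pool with null record;

   overriding procedure Allocate
     (Pool                     : in out Preallocated_Pool;
      Storage_Address          : out Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count);

   overriding procedure Deallocate
     (Pool                     : in out Preallocated_Pool;
      Storage_Address          : Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count);

   overriding function Storage_Size
     (Pool : Preallocated_Pool) return Storage_Count;

   overriding procedure Allocate
     (Pool                     : in out Preallocated_Pool;
      Storage_Address          : out Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count) is
   begin
      --  Alignment is ignored here, as in the sample above.
      if Size_In_Storage_Elements > Pool.Max_Size then
         raise Storage_Error;
      end if;
      Storage_Address := To_Address (Pool.Addr);
   end Allocate;

   overriding procedure Deallocate
     (Pool                     : in out Preallocated_Pool;
      Storage_Address          : Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count) is
   begin
      raise Program_Error;  --  the space belongs to the caller
   end Deallocate;

   overriding function Storage_Size
     (Pool : Preallocated_Pool) return Storage_Count is
   begin
      return Pool.Max_Size;
   end Storage_Size;

   --  Space "preallocated by the caller".
   Buffer : aliased Storage_Array (1 .. 64);
   for Buffer'Alignment use Integer'Alignment;

   --  The on-the-fly pool: its discriminants hold the address and length.
   Pool : Preallocated_Pool
     (Addr     => To_Integer (Buffer'Address),
      Max_Size => Buffer'Length);

   type Counter is limited record
      Value : Integer := 0;
   end record;

   type Counter_Ptr is access Counter;
   for Counter_Ptr'Storage_Pool use Pool;

   --  The allocator calls Allocate, so the object lands in Buffer.
   P : constant Counter_Ptr := new Counter;

begin
   P.Value := 42;
   Ada.Text_IO.Put_Line
     ("Built in caller's buffer: "
      & Boolean'Image (P.all'Address = Buffer'Address));
end Preallocated_Demo;
```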

****************************************************************

From: Randy Brukardt
Sent: Friday, May 23, 2003,  5:46 PM

Tucker wrote:
...
> 1) If the function had a known-size result, then the caller would
>    preallocate the space, and pass this address in the usual way for a
>    function that returned a "large" result.  In addition, a flag
>    would be passed to indicate whether the function was allowed
>    to simply return the address of a preexisting object.  If so,
>    then the caller would expect the function to return the address
>    of the result, which could be the preallocated space, or the
>    preexisting object's address.
>
>    In this case, the only extra overhead for typical implementations
>    would be the extra boolean flag, and the test against it.

But then there is overhead to get rid of the extra 'preallocated' space if
it isn't used. And the overhead of figuring out if that needs to be done. If
the space is in a pool (because it's an anonymous access type, or an item
with a non-stack size), this will require calling pool operation(s). In the
stack case, this memory won't be recovered until the subprogram exits
(Janus/Ada might reuse it, but it cannot recover it). If the object is
controlled, it probably will have to be registered (not sure precisely when
that would have to happen for this case, but it certainly can't happen
inside the subprogram unless a finalization chain is passed into it, which
would add even more overhead).

In addition, this implementation means that you will end up allocating an
'extra' copy of the object in the return existing object case. If the object
is large (and certainly some of the objects we're talking about are), that
could be a problem, as it could cause an existing program to raise
Storage_Error.

> Presuming case (2c) above is relatively
> rare (limited type, unknown size result, caller preallocates), this doesn't
> seem like a show-stopper.

I don't think it's necessarily a show-stopper. But we have to do a
cost/benefit analysis on new features. Certainly, there is a benefit here,
but just like Interfaces, it is not at all clear to me that the benefit
outweighs the cost, which is considerable (and growing).

****************************************************************

From: Tucker Taft
Sent: Saturday, May 24, 2003,  12:34 PM

> But then there is overhead to get rid of the extra 'preallocated' space if
> it isn't used.

I'm not sure I understand what this means.  It may be something
about the way your compiler works.  In our compiler, if
the caller preallocates temporary space for a function result,
it is space that gets released automatically at the end of
the enclosing scope, so there is no point (or sometimes, no
way), to reclaim it earlier than then.

> ... And the overhead of figuring out if that needs to be done.

The caller would know at compile-time whether the function
returns a known-size result, and whether the result is
used in a context where a preexisting object would be
permitted, so I don't see any run-time overhead there.

> ... If
> the space is in a pool (because it's an anonymous access type, or an item
> with a non-stack size), this will require calling pool operation(s). In the
> stack case, this memory won't be recovered until the subprogram exits
> (Janus/Ada might reuse it, but it cannot recover it). If the object is
> controlled, it probably will have to be registered (not sure precisely when
> that would have to happen for this case, but it certainly can't happen
> inside the subprogram unless a finalization chain is passed into it, which
> would add even more overhead).

I am unclear now whether you are talking about overhead that is new
to limited function return, or is the same as what you would face
for non-limited function return.

> In addition, this implementation means that you will end up allocating an
> 'extra' copy of the object in the return existing object case. If the object
> is large (and certainly some of the objects we're talking about are), that
> could be a problem, as it could cause an existing program to raise
> Storage_Error.

You could set your upper limit for caller-preallocated space
relatively low for these kinds of functions, if this is a
significant concern.  That is, require use of the secondary
stack or heap even if the result size is known, if the known
size is so large as to be of concern.

>>Presuming case (2c) above is relatively
>>rare (limited type, unknown size result, caller preallocates), this doesn't
>>seem like a show-stopper.
>
>
> I don't think it's necessarily a show-stopper. But we have to do a
> cost/benefit analysis on new features. Certainly, there is a benefit here,
> but just like Interfaces, it is not at all clear to me that the benefit
> outweighs the cost, which is considerable (and growing).

Are you talking about implementation cost or run-time overhead?
I don't see the run-time overhead as being much greater than
function calls that return a non-limited type of similar complexity.
If the result might be controlled, or large, or of unknown-size,
then yes that adds to the run-time overhead, but that is true
for non-limited functions as well.

****************************************************************

From: Stephen W Baird
Sent: Tuesday, October 14, 2003,  3:51 PM

This is a discussion of the interaction between AI-318 and the
IBM-Rational Apex Ada compiler's implementation of finalization, as per
my homework assignment from the Sydney ARG meeting.

----

The Apex compiler manages pending finalization requirements (i.e.
finalization of controlled and protected objects, not tasks) at the
granularity of top-level (i.e. non-component) objects. The finalization
code generated for the enclosing "construct or entity" (7.6.1(2)) of a
given top-level object relies on the invariant that either all or none
of the subcomponents of the object require finalization.

This means, for example, that if an exception occurs while initializing an
object, then the initialization code (which knows how far it has
progressed) must handle the exception, finalize any components which were
successfully initialized, and then (typically) reraise the exception. If
an object cannot make it to the state where all of its subcomponents need
to be finalized, then it must revert to the state where none require
finalization before execution of the finalization code for the enclosing
"construct or entity".

This has proven to be a reasonable implementation model, but AI-318 might
be difficult to implement using this approach. Consider the case of a
return object which contains several controlled subcomponents. Suppose
that some, but not all, of these subcomponents have been successfully
initialized when an exception is raised. The code (in the callee) which
knows how far initialization has progressed would have to handle the
exception, perform any necessary finalization, and then (typically)
reraise the exception.

Unfortunately, the AI (as currently written) disallows this approach:
    "The return object would not be finalized prior to leaving the
function. The caller would be responsible for its finalization".

This problem could be resolved by having this provision of the AI apply
only in the case of a "normal completion" (7.6.1(2)) of the function, with
the callee responsible for finalization otherwise (or perhaps just by
adding an implementation permission allowing the callee to perform
finalization in this case).

It cannot always be known whether a function is going to return normally
until after any other finalization of objects declared by the function has
completed. Thus, the return object might have to be the last object to be
finalized. This could be accomplished either by requiring that it be the
first object with nontrivial finalization to be declared or by inventing a
special dynamic-semantics rule to handle this case (perhaps only an
implementation permission).

The Apex Ada compiler implements abortion (including ATC) by means of a
distinguished anonymous "exception". Thus, abortion while the callee is
executing introduces essentially the same problem for the caller as if the
callee propagated an exception. The distinction between normal and
abnormal completion proposed above would also help in resolving this
problem.

****************************************************************

From: Robert A Duff
Sent: Wednesday, October 15, 2003, 11:53 AM

> This problem could be resolved by having this provision of the AI apply
> only in the case of a "normal completion" (7.6.1(2)) of the function, with
> the callee responsible for finalization otherwise ...

That makes sense to me.  How can it make sense to let the caller do the
finalization of the result when the function is not returning a result,
but is propagating an exception instead?

Also, Steve is talking about the case where the returned object is "half
baked".  But what if it hasn't been initialized at all?  I believe there
would be trouble in that case, too.

>...(or perhaps just by
> adding an implementation permission allowing the callee to perform
> finalization in this case).

I'm not a big fan of implementation permissions, but I would have no
objection in this case.

****************************************************************

From: Dan Eilers
Sent: Thursday, October 30, 2003, 8:01 PM

The initial proposals for AI-318 involved changes to the syntax
of a function specification, such as using OUT instead of RETURN.
The current proposals don't.  The only proposed syntax changes are
in the body of a function.

This makes it impossible for the caller to know that this is a
special return-in-place function, which would seem to be necessary
in order to use different calling conventions.  Note that the caller
can't just go by the return type being limited, because the AI is
intended to also eliminate the copy-back for non-limited types.

****************************************************************

From: Randy Brukardt
Sent: Thursday, October 30, 2003, 8:32 PM

I *think* that's intentional. The majority of functions return by-copy
types, and for those, it makes no difference (at least, it better not). For
other types, most compilers already use an in-place function convention in
most cases; and those that don't (i.e. Janus/Ada) probably would be better
off changing to use one. So, it seems that for most calls, any performance
changes would be in the direction of faster (and possibly larger) code.

But any performance incompatibilities ought to be investigated thoroughly.
I've already complained about performance incompatibilities with this
proposal (see the mail thread of May 2003 in the AI); Tucker's response is
essentially that compilers will optimize the calling conventions, and the
ugly cases are rare. Since that *is* an incompatibility, it should be
discussed in the AI.

We've already asked for implementation reports on this AI, since several
implementors expressed concern about the cost of the convention. I'm sure
we'd welcome one from you as well.

****************************************************************

From: Tucker Taft
Sent: Thursday, October 30, 2003, 8:56 PM

It is true that a single calling convention must be used.
This implies some overhead on calling functions with a
return-by-reference type, but the presumption is that these
are very rare at the moment.  The presumed model is that the caller
specifies a storage "area" (or equivalent), and a flag indicating
whether the storage area *must* be used, or simply *may* be used.
(I believe this is discussed in the AI already.)

If used in a context where a new object is being initialized
(e.g. a component of an aggregate, initialization expression
for a limited object, or an initialized allocator), the specified
storage area must be used.  If used in another context
(e.g. in a renaming, as an IN parameter, or as an operand of some construct
like a membership test), then the storage area need not be
used, and returning an existing object is allowed.

Currently there is an accessibility check on returning a preexisting
return-by-reference object.  That check would be expanded to
include a check on whether returning an existing object is permitted.
Right now the check is officially a run-time check, but it is
generally easy to perform at compile-time (or instantiation time).
It would become a real run-time check with this change.

The underlying presumption behind all this is that the existing
capability to return existing objects by reference is of relatively
little use, and it is reasonable to largely ignore this capability,
and focus on being able to use functions with limited result types as
"constructors."  The existing capability would be preserved, but
perhaps might even deserve to be made obsolescent.

The existing capability provides very little value over what can
be done with returning an access value, whereas the new capability
provides significant value as part of making limited types more
useful.

> ... Note that the caller
> can't just go by the return type being limited, because the AI is
> intended to also eliminate the copy-back for non-limited types.

This is meant to be an optimization.  There is no guarantee
that copy-back is eliminated for non-limited types.
The presumption is that for most functions returning
large objects of known size, the caller already passes in
the address of a place where the return object should be
placed.  This new syntax would simplify using that space
directly within the function body, rather than doing another
copy.
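
[Editor's note: The "must use / may use" model described in this message can
be mimicked with explicit parameters. The sketch below is only a hand-written
model under stated assumptions, not what any compiler actually generates:
F_Lowered, Area, Must_Use_Area, and Want_Existing are hypothetical names, an
access value stands in for the storage area, and raising Program_Error for the
expanded check is an assumption based on the existing accessibility check.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Calling_Convention_Model is

   type T is limited record
      Value : Integer := 0;
   end record;

   type T_Ref is access all T;

   --  A long-lived object that the function may return by reference.
   Existing : aliased T;

   --  Hypothetical lowered form of "function F return T".  Area and
   --  Must_Use_Area model the implicit parameters; Want_Existing only
   --  selects which path the body takes in this demo.
   function F_Lowered
     (Area          : T_Ref;
      Must_Use_Area : Boolean;
      Want_Existing : Boolean) return T_Ref is
   begin
      if Want_Existing then
         --  The expanded check: is returning an existing object permitted?
         if Must_Use_Area then
            raise Program_Error;
         end if;
         return Existing'Access;
      else
         Area.Value := 7;  --  build the new object in the caller's area
         return Area;
      end if;
   end F_Lowered;

   Target : aliased T;
   Temp   : aliased T;

   Target_Ref : constant T_Ref := Target'Access;
   Temp_Ref   : constant T_Ref := Temp'Access;
   R          : T_Ref;

begin
   --  Initialization context: the area must be used.
   R := F_Lowered (Target_Ref, Must_Use_Area => True, Want_Existing => False);
   Put_Line ("Used caller's area: " & Boolean'Image (R = Target_Ref));

   --  Renaming or IN-parameter context: an existing object is acceptable.
   R := F_Lowered (Temp_Ref, Must_Use_Area => False, Want_Existing => True);
   Put_Line ("Used caller's area: " & Boolean'Image (R = Temp_Ref));

   --  Initialization context, but the body tries to return Existing.
   begin
      R := F_Lowered (Target_Ref, Must_Use_Area => True, Want_Existing => True);
   exception
      when Program_Error =>
         Put_Line ("Check failed: existing object not permitted here");
   end;
end Calling_Convention_Model;
```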

****************************************************************

From: Robert Dewar
Sent: Thursday, October 30, 2003, 8:42 PM

> It is true that a single calling convention must be used.
> This implies some overhead on calling functions with a
> return-by-reference type, but the presumption is that these
> are very rare at the moment.  The presumed model is that the caller
> specifies a storage "area" (or equivalent), and a flag indicating
> whether the storage area *must* be used, or simply *may* be used.
> (I believe this is discussed in the AI already.)

That seems a nasty incompatibility. I don't like to see a feature of relatively
minor importance (in my view) causing an implementation incompatibility of
this magnitude, potentially requiring recompilation of existing code that
does not use the new feature, and invalidating libraries.

****************************************************************

[Editor's note: Additional discussion on this topic can be found in AI-325.]

****************************************************************

From: Tucker Taft
Sent: Monday, December  8, 2003, 10:32 AM

There seem to be a lot of messages flying around about how best
to support function-like-things returning/constructing limited objects.

Using the "Getting to Yes" method of trying to focus on what we agree about,
here is a list of possibly desired features of the solution.
I will start with those that seem to already have a consensus.
I would most appreciate responses that indicate if I missed any "consensus"
statements, or if there are some that are clearly *not* a consensus.
Secondly, it would be good to have a prioritization of the nice-to-haves.
Finally, it would be good to get some feeling about the non-consensus
statements, and perhaps adjustments to them which might allow them to
become consensus statements.

-------------------------

Consensus statements about Ada 200Y if we were to approve AI-318:

   1) Should be possible to declare an object of a limited
      type and provide an initializing expression
   2) Should be possible to use an initialized allocator for
      an access-to-limited type
   3) Should be possible to provide an aggregate as the initializing expression
      for a declaration or an initialized allocator, or for a component
      of such an aggregate; such aggregates may use "<>" to represent default
      initialization of a component
   4) Should be possible to use a function call (or something that looks syntactically
      like a function call) as the initializing expression for a declaration,
      initialized allocator, or component of an aggregate that is of a limited type,
      including a limited private type.
   5) Should be possible to declare a function-like thing callable by such a function call
      for limited types whose first subtype is a definite subtype.
   6) Should be possible to use an aggregate or a function-call (-like-thing) as
      an actual IN parameter of a limited type
   7) Should *not* be possible to copy an existing limited object.  I.e.
      Should *not* be possible to have an assignment statement for a limited
      type, and should *not* be possible to use the name of a limited declared object
      nor a dereference of an access-to-limited type as a component of an aggregate.
   8) The compiler needs to know at the call-site whether a function-like thing
      is returning an existing object by reference, or returning/initializing
      a new object

Nice to have:
   9) Ability to declare and call a function-like thing for a limited type with
      non-defaulted discriminants
   10) Ability to declare and call a function-like thing for a limited type with
      unknown discriminants; such types would require an initializing expression --
      no default initialization is defined for them.
   11) Easy to implement

Other possible desirables:
   12) Should not require alteration in the way limited types are laid out
   13) Should allow us to still have function-like things
       that return by reference
   14) Should (or should not) use the word "constructor" somewhere
   15) Should (or should not) use the word "limited" somewhere
   16) Should (or should not) use the word "function" somewhere
   17) Should provide more efficient way for non-limited types to be returned/initialized
   18) Should not "orphan" existing language features
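
[Editor's note: As a rough illustration of consensus items 1 through 7, the
sketch below shows how those uses can be written in Ada 2005 or later. It is
one possible reading of the list, not the syntax under discussion at the time;
the package Locks and the names Lock, Make, Id_Of, Pair, and Lock_Ptr are
assumptions made for the example.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Limited_Init_Demo is

   package Locks is
      type Lock is limited private;
      function Make (Id : Natural) return Lock;
      function Id_Of (L : Lock) return Natural;
   private
      type Lock is limited record
         Id : Natural := 0;
      end record;
   end Locks;

   package body Locks is
      function Make (Id : Natural) return Lock is
      begin
         --  An aggregate of a limited type, built directly in the result.
         return (Id => Id);
      end Make;

      function Id_Of (L : Lock) return Natural is
      begin
         return L.Id;
      end Id_Of;
   end Locks;

   use Locks;

   type Pair is limited record
      First  : Lock;
      Second : Lock;
   end record;

   type Lock_Ptr is access Lock;

   --  Items 1 and 4: limited private object initialized by a function call.
   A : constant Lock := Make (1);

   --  Item 2: initialized allocator for an access-to-limited type.
   P : constant Lock_Ptr := new Lock'(Make (2));

   --  Item 3: aggregate with a function call and "<>" for a default.
   Q : constant Pair := (First => Make (3), Second => <>);

   --  Item 7: copying an existing limited object stays illegal, e.g.
   --     B : Lock := A;

begin
   Put_Line (Natural'Image (Id_Of (A)));
   Put_Line (Natural'Image (Id_Of (P.all)));
   Put_Line (Natural'Image (Id_Of (Q.First)));
   Put_Line (Natural'Image (Id_Of (Q.Second)));

   --  Item 6: function call as an actual IN parameter of a limited type.
   Put_Line (Natural'Image (Id_Of (Make (4))));
end Limited_Init_Demo;
```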

****************************************************************

From: Robert A. Duff
Sent: Monday, December  8, 2003,  2:48 PM

Thanks, Tuck.  This is a very helpful summary.  I was getting lost in
all those e-mails.

> -------------------------
>
> Consensus statements about Ada 200Y if we were to approve AI-318:
>
>    1) Should be possible to declare an object of a limited
>       type and provide an initializing expression
>    2) Should be possible to use an initialized allocator for
>       an access-to-limited type

I would add, "even when the storage pool is user-defined".

>    3) Should be possible to provide an aggregate as the initializing expression
>       for a declaration or an initialized allocator, or for a component
>       of such an aggregate; such aggregates may use "<>" to represent default
>       initialization of a component
>    4) Should be possible to use a function call (or something that looks syntactically
>       like a function call) as the initializing expression for a declaration,
>       initialized allocator, or component of an aggregate that is of a limited type,
>       including a limited private type.
>    5) Should be possible to declare a function-like thing callable by such a function call
>       for limited types whose first subtype is a definite subtype.
>    6) Should be possible to use an aggregate or a function-call (-like-thing) as
>       an actual IN parameter of a limited type
>    7) Should *not* be possible to copy an existing limited object.  I.e.
>       Should *not* be possible to have an assignment statement for a limited
>       type, and should *not* be possible to use the name of a limited declared object
>       nor a dereference of an access-to-limited type as a component of an aggregate.
>    8) The compiler needs to know at the call-site whether function-like thing
>       is returning an existing object by reference, or returning/initializing
>       a new object

I agree with 1..8 above.

Can we also get consensus on this:

    Every context that allows an initialization expression for
    nonlimited types should also allow it for limited types.

?  That subsumes 1,2,part-of-3,part-of-6.  It also includes
record-component-defaults, generic-formal-in's, and probably some
others I've forgotten.  The relevant AI's list all the cases.
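
As a rough illustration of what "every context" might cover, the sketch
below initializes limited objects in an object declaration, an
initialized allocator, and as components of an aggregate. It assumes the
limited-aggregate and function-result rules in the form later
standardized in Ada 2005, which is not necessarily the syntax being
debated in this thread. The names Lock, Pair, and Make are hypothetical.

```ada
with Ada.Text_IO;

procedure Limited_Init_Sketch is

   --  A truly limited type: no assignment, no copying.
   type Lock is limited record
      Owner : Natural := 0;
      Depth : Natural := 0;
   end record;

   type Lock_Ref is access Lock;

   type Pair is limited record
      A, B : Lock;
   end record;

   --  Assumption: Ada 2005 rules, where the result is built directly
   --  in the object being initialized by the caller.
   function Make (Owner : Natural) return Lock is
   begin
      return (Owner => Owner, Depth => <>);
   end Make;

   --  Context 1: object declaration with an initializing expression.
   L : Lock := Make (1);

   --  Context 2: initialized allocator for an access-to-limited type.
   R : Lock_Ref := new Lock'(Make (2));

   --  Context 3: components of a limited aggregate, given either by a
   --  function call or by a nested aggregate using "<>".
   P : Pair := (A => Make (3), B => (Owner => 4, Depth => <>));

begin
   Ada.Text_IO.Put_Line
     (Natural'Image (L.Owner + R.Owner + P.A.Owner + P.B.Owner));
end Limited_Init_Sketch;
```

An assignment statement such as "L := Make (5);" would still be rejected,
which matches consensus item 7.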

> Nice to have:
>    9) Ability to declare and call a function-like thing for a limited type with
>       non-defaulted discriminants
>    10) Ability to declare and call a function-like thing for a limited type with
>       unknown discriminants; such types would require an initializing expression --
>       no default initialization is defined for them.

I think 9 and 10 are important.  I'm not quite willing to kill the whole
idea if I can't have 9 and 10, but since these work already for the
nonlimited case, it would seem pretty kludgy to leave them out in the
limited case.

>    11) Easy to implement

Who could disagree with that?  But I'm willing to put up with some
implementation complexity to get 9 and 10.

> Other possible desirables:
>    12) Should not require alteration in the way limited types are laid out

I think I agree, but I'm not really sure what you're getting at.  Which
proposal(s) violate this?

>    13) Should allow us to still have function-like things
>        that return by reference

I don't much care about that for my own code, but I think it would be
irresponsible of us to be incompatible with folks who have used this
feature, even if we think perhaps it's a misguided feature.

>    14) Should (or should not) use the word "constructor" somewhere
>    15) Should (or should not) use the word "limited" somewhere
>    16) Should (or should not) use the word "function" somewhere

I have no strong opinion on the syntax, but I think these new kinds of
constructors are conceptually "functions".  They just happen to build
their result in the final resting place.

Viewing them as "procedures" seems like a compiler-writer viewpoint; I'd
rather take a user-oriented viewpoint.  The fact that function results
can take their discriminants from the function parameters, but you can't
do that for 'out' parameters, is accidental, not fundamental.

Viewing them as totally new animals seems like overkill.  To me, a
constructor is just a function that creates a new thing.  Number 8 above
implies that we need *some* sort of new syntax.  I would prefer to keep
it as close as possible to the existing function declaration syntax.
But I do not feel strongly about this.

>    17) Should provide more efficient way for non-limited types to be returned/initialized

Seems like a nice side effect.  Not important.

>    18) Should not "orphan" existing language features

This seems like a possible symptom of kludgery, but not a worthy goal in
its own right.  I mean, if I have to write "constructor" instead of
"function" all over the place in future code, that's not a disaster.

****************************************************************

From: Robert I. Eachus
Sent: Monday, December  8, 2003,  4:15 PM

I had to go out after sending my previous message, so effectively Tucker
and I crossed in the mail.  But Bob Duff did a good job of responding
to Tucker's excellent list of points:

Robert A Duff wrote:

>Thanks, Tuck.  This is a very helpful summary.  I was getting lost in
>all those e-mails.
>
>
Agreed, and I was writing some of them, and referring back to others to
keep everything straight.

...
>I agree with 1..8 above.
>
>Can we also get consensus on this:
>
>    Every context that allows an initialization expression for
>    nonlimited types should also allow it for limited types.
>
>?  That subsumes 1,2,part-of-3,part-of-6.  It also includes
>record-component-defaults, generic-formal-in's, and probably some
>others I've forgotten.  The relevant AI's list all the cases.

I like Tucker's breakdown better.  It makes it easier to say that 1,2,3,
and 6 are must haves, and some of the other cases are nice to haves.  I
certainly favor allowing record component defaults, and generic formal
in parameters likewise seem safe, and in most compilers I would expect
them to be implemented identically to required cases when the defaults
were actually used.  But I would certainly consider any objections from
implementors if some fringe case caused serious implementation problems.

>>Nice to have:
>>   9) Ability to declare and call a function-like thing for a limited type with
>>      non-defaulted discriminants
>>   10) Ability to declare and call a function-like thing for a limited type with
>>      unknown discriminants; such types would require an initializing expression --
>>      no default initialization is defined for them.
>
>I think 9 and 10 are important.  I'm not quite willing to kill the whole
>idea if I can't have 9 and 10, but since these work already for the
>nonlimited case, it would seem pretty kludgy to leave them out in the
>limited case.

Agreed.  I think 10 is more important than 9, but both will be very important.

>>   11) Easy to implement
>
>Who could disagree with that?  But I'm willing to put up with some
>implementation complexity to get 9 and 10.

Definitely agree.

>>Other possible desirables:
>>   12) Should not require alteration in the way limited types are laid out
>>
>>
>
>I think I agree, but I'm not really sure what you're getting at.  Which
>proposal(s) violate this?

No current proposal, AFAIK.  Doesn't mean a future new variation won't.
But there is a difference between require and permit that should be kept
in mind.  There will be cases where compilers can generate more
efficient layouts for types that are just not used today.  Remember that
my example code compiles cleanly today, the only problem is that it
exports ADTs that can't be created by users.

For example, it would be an optimization for a compiler that currently
places unbounded structures on the heap and uses 'hidden' pointers in
the structure to manage them for the compiler to allocate one contiguous
chunk of the heap, for all components of a record, have
pointers/offsets in the record structure, and just one heap object to
free when the whole record is freed.   That doesn't mean that all
compilers have to treat records with multiple constructors that way,
just that a compiler is allowed to do so.

To be honest, I expect that the objects in point 10 above will become
common in Ada 0Y.  The types are currently legal in Ada 95/2000, but
they just are not used.  (And not really very usable.)  So I don't know
if any compilers have horrible overhead if someone does  create one.  If
so, that compiler would probably need to change its layout policy for
limited types.

>>   13) Should allow us to still have function-like things
>>       that return by reference
>
>I don't much care about that for my own code, but I think it would be
>irresponsible of us to be incompatible with folks who have used this
>feature, even if we think perhaps it's a misguided feature.

Huh?  Oh.  You don't use it because you don't use limited tagged types.
This feature will become much more useful and less 'misguided' if we can
initialize objects of types derived from Limited_Controlled more easily.

>>   14) Should (or should not) use the word "constructor" somewhere
>>   15) Should (or should not) use the word "limited" somewhere
>>   16) Should (or should not) use the word "function" somewhere
>
>I have no strong opinion on the syntax, but I think these new kinds of
>constructors are conceptually "functions".  They just happen to build
>their result in the final resting place.
>
>Viewing them as "procedures" seems like a compiler-writer viewpoint; I'd
>rather take a user-oriented viewpoint.  The fact that function results
>can take their discriminants from the function parameters, but you can't
>do that for 'out' parameters, is accidental, not fundamental.
>
>Viewing them as totally new animals seems like overkill.  To me, a
>constructor is just a function that creates a new thing.  Number 8 above
>implies that we need *some* sort of new syntax.  I would prefer to keep
>it as close as possible to the existing function declaration syntax.
>But I do not feel strongly about this.
>
>
We really need Norman Cohen to take a look at the issue.  I'm sure he
could come up with something.  Seriously, I have no objection to
retaining the word function.  But I do want the syntax to indicate that
this is one of those special "constructor" things to both the user and
the compiler.  I don't like "limited function" because that implies that
there are also "not limited function" types. ;-)   Using "constructor
function" seems a bit wordy but otherwise fine.  Certainly whatever the
syntax, the RM should talk about them as constructor functions or
constructors.  I also tried out:

function Foo return new Bar;

But that seems to imply that a hidden pointer must be used.

function Foo return inplace Bar;

Is a bit better, but even with the precedent of goto, I don't like the
idea of reserved words that are not English words.

function Foo create Bar;

Might be acceptable to everyone?  I am certainly open to any good ideas.

>>   17) Should provide more efficient way for non-limited types to be returned/initialized
>>
>Seems like a nice side effect.  Not important.
>
Agree.  Well maybe more than just nice if we can improve the efficiency
of Unbounded_String in some cases.  But certainly not a requirement.

>>   18) Should not "orphan" existing language features
>
>This seems like a possible symptom of kludgery, but not a worthy goal in
>its own right.  I mean, if I have to write "constructor" instead of
>"function" all over the place in future code, that's not a disaster.

One last suggested inclusion:

19) If we adopt a partial solution, that partial solution shouldn't
limit a future extension to cover everything.

I am certainly willing to consider scope reduction of a complete
solution, as long as it doesn't preclude ever fixing the excluded cases.

****************************************************************

From: Tucker Taft
Sent: Monday, December  8, 2003, 5:52 PM

Robert A Duff wrote:
> ...
> Can we also get consensus on this:
>
>     Every context that allows an initialization expression for
>     nonlimited types should also allow it for limited types.

I agree with that.  I also believe that the reverse should be true,
namely there should be no contexts where calling these function-like
things is permitted, but calling good-old functions returning
non-limited types is not permitted.

So in other words, from a user point of view, these are all very
similar.  The limited-returning ones cannot be called in certain
contexts because those contexts would require copying the
result.  The only context I can think of off the top of my
head is as the right hand-side of an assignment statement,
though there are probably others.  Whether they can be
used as the expression of a return statement depends on the details
of how these limited-returning things are implemented (as opposed to
called).

It is the existing limited-returning-by-ref functions that are odd, because
they can only be called in very limited contexts.  In particular,
a call on one of these can be used as an IN parameter, in a renaming,
and as a prefix of a name (anyplace else?).

This is not so noticeable in Ada 95, because the limitedness of the result
and the absence of aggregates and initialization of limited types, means
that the by-ref-ness doesn't create much additional limitation.  *But*,
if we add aggregates and initialization of limited types, then suddenly
these kinds of functions have some odd-ball limitations which may be
hard to remember, especially if there are function-like things that
don't have these limitations.

Hence, I feel pretty strongly that if we are going to use syntax to make
these two kinds of limited-returning function-like things look different,
we should make the existing returning-by-ref functions look different
from non-limited-returning functions, and make the new more flexible
limited-returning functions look like good old non-limited returning functions,
since they have so much more in common (in terms of legal calling contexts).

This is why I would recommend we require something like the word "limited" on
a function if it will be returning by-ref, and can only be called in contexts
where by-ref makes sense.  This is of course incompatible, but it is easily
caught at compile-time, and compilers could start allowing the word "limited"
right away, even before they support the new capability.

> > Other possible desirables:
> >    12) Should not require alteration in the way limited types are laid out
>
> I think I agree, but I'm not really sure what you're getting at.  Which
> proposal(s) violate this?

There were a lot of different ideas thrown around, but at least one of
them implied that the caller might *not* know the size of the thing
being allocated, nor where it was being allocated.  Clearly if
you call one of these function-like things as a component of an aggregate,
and you lay out limited types contiguously (even if some component is
dynamic-sized), then the caller *must* specify where the object is allocated,
and *must* know the size before it goes out-of-line so it can add up all the sizes
of the components and do the one overall allocation in the appropriate place
(on the secondary stack, in some user-specified storage pool, as a component
of a yet larger limited object, etc.).

I got the sense that one solution being bandied about was that limited components
of dynamic size would *have* to use a level of indirection, precluding a contiguous
allocation for an enclosing limited record.  This is not the way many compilers
do things now, and so would imply a change in the way limited types are laid out.
I would hope (12) is a point of consensus, but I couldn't tell if that were true
based on the flurry of messages.

>
> >    13) Should allow us to still have function-like things
> >        that return by reference
>
> I don't much care about that for my own code, but I think it would be
> irresponsible of us to be incompatible with folks who have used this
> feature, even if we think perhaps it's a misguided feature.

But perhaps call these things "limited functions" because if we add
aggregates and initialized limited objects, these guys won't be callable
in those contexts.  Alternatively, require that they be recast as
functions returning anonymous access types, effectively moving the ".all" from the
return expression to the point of call (since in my experience, these
functions almost always return a reference to a heap object, due to accessibility
limitations).
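
A minimal sketch of that recasting, under stated assumptions: Ada 95 has
no anonymous access result types, so a named general access type
(Counter_Ref, hypothetical, as are Counters, Shared, Bump and Value)
stands in for one here. The ".all" that would have appeared in the return
expression of a return-by-reference function now appears at each call.

```ada
with Ada.Text_IO;

procedure Access_Result_Sketch is

   package Counters is
      type Counter is limited private;
      type Counter_Ref is access all Counter;

      --  Returns a reference to an existing object instead of
      --  returning the limited object itself by reference.
      function Shared return Counter_Ref;

      procedure Bump (C : in out Counter);
      function Value (C : Counter) return Natural;
   private
      type Counter is limited record
         N : Natural := 0;
      end record;
   end Counters;

   package body Counters is
      The_Counter : aliased Counter;

      function Shared return Counter_Ref is
      begin
         return The_Counter'Access;
      end Shared;

      procedure Bump (C : in out Counter) is
      begin
         C.N := C.N + 1;
      end Bump;

      function Value (C : Counter) return Natural is
      begin
         return C.N;
      end Value;
   end Counters;

begin
   --  The dereference is written by the caller.
   Counters.Bump (Counters.Shared.all);
   Counters.Bump (Counters.Shared.all);
   Ada.Text_IO.Put_Line
     (Natural'Image (Counters.Value (Counters.Shared.all)));
end Access_Result_Sketch;
```

Both calls to Bump update the same underlying object, so no new Counter
is ever created by Shared.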

> >    14) Should (or should not) use the word "constructor" somewhere
> >    15) Should (or should not) use the word "limited" somewhere
> >    16) Should (or should not) use the word "function" somewhere
>
> I have no strong opinion on the syntax, but I think these new kinds of
> constructors are conceptually "functions".  They just happen to build
> their result in the final resting place.

I agree (as is presumably obvious).  As indicated above, it is the return-by-ref
guys that will begin to look like oddballs, if we add limited aggregates and
initialization.

> Viewing them as "procedures" seems like a compiler-writer viewpoint; I'd
> rather take a user-oriented viewpoint.  The fact that function results
> can take their discriminants from the function parameters, but you can't
> do that for 'out' parameters, is accidental, not fundamental.

I'm not sure I followed that logic, but I agree that they should be *viewed*
as functions.  The question is how does one implement these.  I fear
that to achieve nice-to-have's (9) and (10), allowing the first subtype
to have non-defaulted or unknown discriminants, combined with (12), creates
a real challenge.  Renaming a procedure call as a function nicely solved
all the problems:
   a) the visible declaration is a function
   b) the renaming declaration can use the parameters to specify the
      discriminants for the returned (i.e. OUT) object (e.g. "(Disc => 3, others => <>)")
   c) the out of line code has a name for the pre-allocated object so
      it can refer to the discriminants.

If there is another solution that has all these capabilities that would be
great.  I have not found one.  The hardest problem is where the discriminants
are not explicitly determined by the caller, but are instead determined
by some computation on the IN parameters.

One suggested solution was:

   function Make_Text(Len : Natural) return Lim_Text(Len);

But that doesn't work if the discriminants of Lim_Text are not visible (i.e. "(<>)").
The renaming (of a procedure call) could work because the renaming can be
in the private part.
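
To make the difficulty concrete, here is a sketch of how such a type can
be handled in Ada 95 as it stands: the client cannot declare an object of
a limited type with unknown discriminants, so the package hands out heap
objects through an access type. This is only one possible workaround and
is not any of the proposals in this thread; Texts, Text_Ref and Length
are hypothetical names.

```ada
with Ada.Text_IO;

procedure Unknown_Discriminants_Sketch is

   package Texts is
      --  Discriminants are hidden from clients.
      type Lim_Text (<>) is limited private;
      type Text_Ref is access Lim_Text;

      --  The discriminant is computed from the IN parameter, out of
      --  sight of the caller.
      function Make_Text (Len : Natural) return Text_Ref;
      function Length (T : Lim_Text) return Natural;
   private
      type Lim_Text (Len : Natural) is limited record
         Data : String (1 .. Len) := (others => ' ');
      end record;
   end Texts;

   package body Texts is
      function Make_Text (Len : Natural) return Text_Ref is
      begin
         return new Lim_Text (Len);
      end Make_Text;

      function Length (T : Lim_Text) return Natural is
      begin
         return T.Len;
      end Length;
   end Texts;

   --  X : Texts.Lim_Text;
   --  Illegal: an indefinite subtype needs an initializing expression,
   --  and Ada 95 has none to offer for a limited type.

   R : constant Texts.Text_Ref := Texts.Make_Text (5);

begin
   Ada.Text_IO.Put_Line (Natural'Image (Texts.Length (R.all)));
end Unknown_Discriminants_Sketch;
```

The forced heap allocation and extra indirection are the cost that a
constructor building its result in place would be expected to remove.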

It may be that some mild restrictions could be added to deal with this
problem.  I would hope the restrictions can be enforced on the *declaration*
of the function rather than at the call site.  Otherwise I fear we
will get into the "applicable index constraint" game, which I don't
relish.  That is, certain calls would only be permitted when there is
an applicable discriminant constraint.


>
> Viewing them as totally new animals seems like overkill.  To me, a
> constructor is just a function that creates a new thing. ...

And except for the oddball return-by-ref functions, all
functions create a new thing.

****************************************************************

From: Randy Brukardt
Sent: Monday, December  8, 2003, 6:53 PM

Tucker said:

> Hence, I feel pretty strongly that if we are going to use syntax to make
> these two kinds of limited-returning function-like things look different,
> we should make the existing returning-by-ref functions look different
> from non-limited-returning functions, and make the new more flexible
> limited-returning functions look like good old non-limited returning functions,
> since they have so much more in common (in terms of legal calling contexts).
>
> This is why I would recommend we require something like the word "limited" on
> a function if it will be returning by-ref, and can only be called in contexts
> where by-ref makes sense.  This is of course incompatible, but it is easily
> caught at compile-time, and compilers could start allowing the word "limited"
> right away, even before they support the new capability.

I don't mind that in a vacuum, but I think that it means that either (1)
non-limited constructors are actually more expensive than current functions;
or (2) converting a limited type to non-limited requires checking all
functions for correct behavior.

The former occurs because (in one model) you get a call to Initialize that
generally can't be optimized away on top of the Adjust and Finalize calls
that we already have; the latter occurs (in another model) because limited
types call Initialize and non-limited types don't.

I don't much like either result.

> > > Other possible desirables:
> > >    12) Should not require alteration in the way limited types are laid out
> >
> > I think I agree, but I'm not really sure what you're getting at.  Which
> > proposal(s) violate this?
>
> There were a lot of different ideas thrown around, but at least one of
> them implied that the caller might *not* know the size of the thing
> being allocated, nor where it was being allocated.  Clearly if
> you call one of these function-like things as a component of an aggregate,
> and you lay out limited types contiguously (even if some component is
> dynamic-sized), then the caller *must* specify where the object is allocated,
> and *must* know the size before it goes out-of-line so it can add up all the sizes
> of the components and do the one overall allocation in the appropriate place
> (on the secondary stack, in some user-specified storage pool, as a component
> of a yet larger limited object, etc.).

Trying to lay out all possible record types contiguously is a fool's game.
Kinda like trying to implement universal generic sharing. :-) It's possible
to get it to work, but only with lots of standing on your head. And the
result is very user-unfriendly: objects of reasonable types like
   type Sane_Bounded_String (D : Natural := 0) is record
        Data :  String (1 .. D);
   end record;
raise Storage_Error unless constrained.
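
A small sketch of the two cases, assuming a compiler that lays out such
records contiguously and sizes an unconstrained object for the largest
possible discriminant value. The unconstrained declaration is left as a
comment because whether it raises Storage_Error depends on the
implementation.

```ada
with Ada.Text_IO;

procedure Bounded_Sketch is

   type Sane_Bounded_String (D : Natural := 0) is record
      Data : String (1 .. D);
   end record;

   --  U : Sane_Bounded_String;
   --  Unconstrained: with a contiguous layout the compiler must
   --  reserve room for D = Natural'Last, which is likely to raise
   --  Storage_Error at elaboration.

   --  Constrained: the size is fixed by the discriminant constraint.
   C : Sane_Bounded_String (5) := (D => 5, Data => "hello");

begin
   Ada.Text_IO.Put_Line (C.Data);
end Bounded_Sketch;
```

Some compilers may warn at the type declaration that creating an object
may raise Storage_Error; the constrained object C is unaffected.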

In any case, the vast majority of real types can be implemented
contiguously, with any of these proposals. (Most ADTs don't have
discriminants anyway, at least not on the top-level types.) If a few types
have to change representation in a few compilers (and only if there are
constructors defined) to make this work, I cannot get too excited. It can't
be incompatible: there are no constructors now.

> > Viewing them as totally new animals seems like overkill.  To me, a
> > constructor is just a function that creates a new thing. ...
>
> And except for the oddball return-by-ref functions, all
> functions create a new thing.

I guess I view these as a new thing because what they do is create a
user-defined "construction" of an object; they need to replace the
"initialization assignment" operation of Ada as well as the "initialization"
itself. Existing functions do not change the semantics of assignment. For
non-controlled types, the distinction doesn't really matter, but it is a big
deal for controlled types (of all stripes).

Also, I see a new thing as necessary, because I don't believe that a useful
constructor can be defined that won't force some representation changes in
compilers. (That is, (12) is an impossible goal; holding to it is a disaster
from a user perspective -- it forces unnatural separations of construction
code into parts. And the idea of somehow specifying an aggregate as the
argument of an In Out parameter seems goofy.) As long as the constructors
are explicit, then there isn't a problem in that existing code would not
have to change representation.

If we don't have the will to do this right this time, I don't think there is
any value to another partial band-aid solution. Especially if it cannot be
extended properly in the future. Which is why Tucker's procedure renaming
just isn't going to work.

****************************************************************

From: Robert I. Eachus
Sent: Monday, December  8, 2003, 7:52 PM

Tucker Taft wrote:

>Hence, I feel pretty strongly that if we are going to use syntax to make
>these two kinds of limited-returning function-like things look different,
>we should make the existing returning-by-ref functions look different
>from non-limited-returning functions, and make the new more flexible
>limited-returning functions look like good old non-limited returning functions,
>since they have so much more in common (in terms of legal calling contexts).
>
>This is why I would recommend we require something like the word "limited" on
>a function if it will be returning by-ref, and can only be called in contexts
>where by-ref makes sense.  This is of course incompatible, but it is easily
>caught at compile-time, and compilers could start allowing the word "limited"
>right away, even before they support the new capability.
>
>
It sounds like you are proposing to make detecting whether a function is
a constructor or a "normal" function depend on whether or not it returns
a limited type.

But that doesn't work.  The problem is that the compiler may not know
whether or not a function can be seen in contexts where its type must be
returned by reference.  For example,  a generic formal part may specify
a limited type, but the actual may be non-limited.  The reverse happens
as well.  Inside a package where a type is declared as limited private,
the type may or may not be limited.

So I have been assuming that 'flagging' constructors as such must be
done in syntax, and constructors for non-limited types must be allowed,
subject to the same rules and restrictions as for limited types.  The
"normal" case will be that a constructor is actually defined in a scope
where the return type is non-limited, at least for non-tagged types.

>There were a lot of different ideas thrown around, but at least one of
>them implied that the caller might *not* know the size of the thing
>being allocated, nor where it was being allocated.  Clearly if
>you call one of these function-like things as a component of an aggregate,
>and you lay out limited types contiguously (even if some component is
>dynamic-sized), then the caller *must* specify where the object is allocated,
>and *must* know the size before it goes out-of-line so it can add up all the sizes
>of the components and do the one overall allocation in the appropriate place
>(on the secondary stack, in some user-specified storage pool, as a component
>of a yet larger limited object, etc.).
>
>I got the sense that one solution being bandied about was that limited components
>of dynamic size would *have* to use a level of indirection, precluding a contiguous
>allocation for an enclosing limited record.  This is not the way many compilers
>do things now, and so would imply a change in the way limited types are laid out.
>I would hope (12) is a point of consensus, but I couldn't tell if that were true
>based on the flurry of messages.
>
>
Yes, it is not just being thrown around, it is the other proposal on the
table.  However, there is a solution which doesn't require the caller to
know the size of the object at the point of the call, and does not
require a level of indirection.  This requires the caller to pass a thunk
to the constructor.  When the constructor is ready to allocate the
actual object, it calls the thunk, giving the needed size, and the thunk
returns an address.  The thunk can be an allocator for some heap, or can
be code to add the size information to the size for some object on top
of the stack.  The case of an object with many constructors as part of
say an initial value aggregate can be accommodated by calling all the
tasks in sequence, if the object is being created on a different stack
than the one that contains the object being created, or in the case of a
heap object, you can get a large chunk of heap, and eventually return
what is not used.
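
The thunk idea can be modeled in ordinary Ada 95, as a sketch only. Here
the "storage" is a String used as an arena, the thunk is an
access-to-function value, and the constructor asks it for space once the
size is known. All names (Arena, Allocator_Thunk, Bump_Alloc, Make_Text)
are hypothetical, and a real implementation would pass addresses inside
the compiler's calling convention, not in user-visible code.

```ada
with Ada.Text_IO;

procedure Thunk_Sketch is

   --  Stand-in for the caller's storage (stack frame, pool, or an
   --  enclosing object).
   Arena     : String (1 .. 256) := (others => ' ');
   Next_Free : Positive := 1;

   --  The thunk: given a size, return where the object should go.
   type Allocator_Thunk is access function (Size : Natural) return Positive;

   --  One possible thunk: allocate by bumping a high-water mark.
   function Bump_Alloc (Size : Natural) return Positive is
      Start : constant Positive := Next_Free;
   begin
      Next_Free := Next_Free + Size;
      return Start;
   end Bump_Alloc;

   --  The "constructor": only it knows the size, so it calls the
   --  thunk and then builds the result directly at that location.
   procedure Make_Text
     (Source : String;
      Alloc  : Allocator_Thunk;
      First  : out Positive;
      Last   : out Natural) is
   begin
      First := Alloc.all (Source'Length);
      Last  := First + Source'Length - 1;
      Arena (First .. Last) := Source;
   end Make_Text;

   F1, F2 : Positive;
   E1, E2 : Natural;

begin
   --  Two constructors called in sequence share one contiguous area.
   Make_Text ("limited", Bump_Alloc'Access, F1, E1);
   Make_Text ("types", Bump_Alloc'Access, F2, E2);
   Ada.Text_IO.Put_Line (Arena (F1 .. E1) & " " & Arena (F2 .. E2));
end Thunk_Sketch;
```

The caller never needs the size before the call, yet both results end up
adjacent in storage that the caller chose.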

Like Randy, though, I consider that whether an implementation uses
indirection for some types should be left to the implementor.  There are
cases where it is more user friendly--and also more efficient at
run-time--to do that.  The particular case of arrays of
Unbounded_Strings is not an idle pastime, it comes up fairly frequently
so that it is worth looking at performance of various solutions in that
case.  (Unbounded_String is a weird case in general.  If you implement
Unbounded_Strings efficiently they are really limited objects that
cannot be copied.  What happens on assignment is a "deep copy" that
clones the object.)

>>>   13) Should allow us to still have function-like things
>>>       that return by reference
>>>
>>>
>But perhaps call these things "limited functions" because if we add
>aggregates and initialized limited objects, these guys won't be callable
>in those contexts.  Alternatively, require that they be recast as
>functions returning anonymous access types, effectively moving the ".all" from the
>return expression to the point of call (since in my experience, these
>functions almost always return a reference to a heap object, due to accessibility
>limitations).
>
>
I'm going to stay out of this argument, other than to say I will
probably recast those functions that are actually return by reference
using the new semantics.  But I don't want to have to do it at gunpoint.

>I'm not sure I followed that logic, but I agree that they should be *viewed*
>as functions.  The question is how does one implement these.  I fear
>that to achieve nice-to-have's (9) and (10), allowing the first subtype
>to have non-defaulted or unknown discriminants, combined with (12), creates
>a real challenge.  Renaming a procedure call as a function nicely solved
>all the problems:
>   a) the visible declaration is a function
>   b) the renaming declaration can use the parameters to specify the
>      discriminants for the returned (i.e. OUT) object (e.g. "(Disc => 3, others => <>)")
>   c) the out of line code has a name for the pre-allocated object so
>      it can refer to the discriminants.
>
>If there is another solution that has all these capabilities that would be
>great.  I have not found one.  The hardest problem is where the discriminants
>are not explicitly determined by the caller, but are instead determined
>by some computation on the IN parameters.
>
You have not found one, but Randy and I have.  I think it requires more
compiler implementation work than your approach, and in some cases it
will be less efficient (more information passed in the call).  But the
advantage of the approach is that it does cover all cases, including
those where neither the caller nor the constructor can know the size of
the returned object until the point of the return statement.  Yes, if
the compiler sees that for some types the constructors can return
objects larger than the largest stack available, it may decide to use
(hidden) indirection in such objects.  But I consider that to just be
the nature of Ada.  (Has anyone really thought about what will happen
when allocating the maximum-length unconstrained String doesn't
always raise Storage_Error?  In the early days, there were a few
compilers, including the one for the DPS6, that used 16-bits for
Integer, but a few years ago I ordered a machine with 4 Gig of memory.
Of course the OS wouldn't allow 2Gig to be allocated for one String, but
that day is coming.)

Is it worth this potential extra overhead to make declaring private
types with unknown discriminants work right?  I think so.  I also think
the syntax is easier to use in the easier cases which will make the new
constructs more popular.

>It may be that some mild restrictions could be added to deal with this
>problem.  I would hope the restrictions can be enforced on the *declaration*
>of the function rather than at the call site.  Otherwise I fear we
>will get into the "applicable index constraint" game, which I don't
>relish.  That is, certain calls would only be permitted when there is
>an applicable discriminant constraint.
>
>
I very much don't relish that either.

>And except for the oddball return-by-ref functions, all
>functions create a new thing.
>
>
That is a compiler implementor's view of return by value. ;-)  Users
talk all the time about functions returning this or returning that when
they are just returning a copy of an existing object.  From the user's
point of view, a constructor is different even for non-limited types.
It might be better to say that a constructor constructs a new value,
while many functions return existing values.  Of course, arithmetic
operations don't really fit this picture, but they are already special
in a different way.  But especially for non-limited ADTs I see a
semantic division between constructors, which build new records, and
selector functions that return existing records.

****************************************************************

From: Tucker Taft
Sent: Monday, December  8, 2003, 10:28 PM

> Tucker said:
>
> > Hence, I feel pretty strongly that if we are going to use syntax to make
> > these two kinds of limited-returning function-like things look different,
> > we should make the existing returning-by-ref functions look different
> > from non-limited-returning functions, and make the new more flexible
> > limited-returning functions look like good old non-limited returning
> > functions, since they have so much more in common (in terms of legal
> > calling contexts).
> >
> > This is why I would recommend we require something like the word "limited"
> > on a function if it will be returning by-ref, and can only be called in
> > contexts where by-ref makes sense.  This is of course incompatible, but it
> > is easily caught at compile-time, and compilers could start allowing the
> > word "limited" right away, even before they support the new capability.
>
> I don't mind that in a vacuum, but I think that it means that either (1)
> non-limited constructors are actually more expensive than current functions;
> or (2) converting a limited type to non-limited requires checking all
> functions for correct behavior.

Unfortunately, I have completely lost you.  I was trying to
focus on the "call" side of things first, before plunging
into the body/implementation side.  Once we know what we want
from the call side, we can start to figure out what we need
to provide on the body/implementation side.

So strictly from the call side, non-limited-returning functions
always create/initialize a new object.  Unfortunately, in Ada 95,
the only limited-returning functions are return-by-ref of
a preexisting object (when I say limited, I mean
"truly" limited).

Now what AI-318 is trying to provide is limited-returning
function-like things that create/initialize a new object,
very much like non-limited-returning functions.
This is important because we are now proposing to allow
limited objects to have initializing expressions, and
we want to allow a function-call-like thing for those
expressions.  Unfortunately, the existing limited-returning
functions are exactly the *wrong* thing for these new
contexts.

These by-ref functions didn't seem so odd when we didn't
allow limited initializing expressions.  There were no
contexts where they couldn't be called due to their by-ref-ness.
The limited-ness was enough to eliminate all such contexts.
But now we have proposed new contexts where limited types
are allowed, but the existing kinds of functions can't
be called in those contexts -- a definite pity.

> The former occurs because (in one model) you get a call to Initialize that
> generally can't be optimized away on top of the Adjust and Finalize calls
> that we already have; the latter occurs (in another model) because limited
> types call Initialize and non-limited types don't.

Let's just for a moment ignore this issue of whether
the object has to be default initialized and then re-initialized.
Notice that I didn't mention that in any of my
"consensus" lists, and that's not what I am focusing
on now.  I am happy to keep searching for a solution
that avoids the double initialization.  What I
want first is a good specification of what the solution
should look like on the *call* side.

> I don't much like either result.

I think you are talking about the implementation side,
but let's first try to agree about the call side.

> > > > Other possible desirables:
> > > >    12) Should not require alteration in the way limited types are
> > > >        laid out
> ...
> Trying to lay out all possible record types contiguously is a fool's game.
> Kinda like trying to implement universal generic sharing. :-) It's possible
> to get it to work, but only with lots of standing on your head. And the
> result is very user-unfriendly: objects of reasonable types like
>    type Sane_Bounded_String (D : Natural := 0) is record
>         Data :  String (1 .. D);
>    end record;
> raise Storage_Error unless constrained.
>
> In any case, the vast majority of real types can be implemented
> contiguously, with any of these proposals. (Most ADTs don't have
> discriminants anyway, at least not on the top-level types.) If a few types
> have to change representation in a few compilers (and only if there are
> constructors defined) to make this work, I cannot get too excited. It can't
> be incompatible: there are no constructors now.

Are you proposing that if a programmer writes a function-like
thing for a limited type, then the layout changes?  I really think
that is very bad news.  And despite your concern about laying out records
contiguously, I am pretty certain that GNAT, Rational, Green
Hills, and Aonix all lay out records contiguously (I am *very*
certain about Green Hills and Aonix ;-).  I think that
represents about 95% of the Ada market.

> > > Viewing them as totally new animals seems like overkill.  To me, a
> > > constructor is just a function that creates a new thing. ...
> >
> > And except for the oddball return-by-ref functions, all
> > functions create a new thing.
>
> I guess I view these as a new thing because what they do is create a
> user-defined "construction" of an object; they need to replace the
> "initialization assignment" operation of Ada as well as the "initialization"
> itself. Existing functions do not change the semantics of assignment. For
> non-controlled types, the distinction doesn't really matter, but it is a big
> deal for controlled types (of all stripes).

From the call-side, I don't see the big difference.
Even from the implementation side, it seems like we are
just trying to eliminate some extra "last minute" copying
that is currently part of non-limited-type function
semantics.  Many functions are written with a local
"Result" variable, which is then built up as desired,
and then returned.  Many other functions are little more
than the return of an aggregate.  Both of these are
clearly creating/initializing new objects.  All we need to
arrange is that in both cases for a limited type,
the object to be returned is built in its final
resting place.  And the discriminants, if any, are known
on the call side (at least to the generated code), before
the out-of-line code begins.

I suppose one (crazy?) possibility is that such functions must be
inlined if the compiler run-time model needs additional
information from the body, whereas they need not be
inlined if the compiler run-time model uses implicit
levels of indirection.  This would make it quite analogous
to the case with generics, where some compilers need the
body to be able to generate code for an instance, while
others don't, because their run-time model supports
sharing.

> Also, I see a new thing as necessary, because I don't believe that a useful
> constructor can be defined that won't force some representation changes in
> compilers. (That is, (12) is an impossible goal; holding to it is a disaster
> from a user perspective -- it forces unnatural separations of construction
> code into parts. And the idea of somehow specifying an aggregate as the
> argument of an In Out parameter seems goofy.) As long as the constructors
> are explicit, then there isn't a problem in that existing code would not
> have to change representation.

I think you are again implying that by writing a constructor-ish-thing,
the record representation would change.  This seems quite undesirable
to me.

> If we don't have the will to do this right this time, I don't think there is
> any value to another partial band-aid solution. Especially if it cannot be
> extended properly in the future. Which is why Tucker's procedure renaming
> just isn't going to work.

It would help if you had an example of the kind of extension you
had in mind.    I promise I am not wedded to the procedure
renaming approach, but I do think it satisfies the requirements,
except perhaps from an aesthetic point of view.  I think we still
might be able to make the "return ... do ..." approach work,
but there would probably be more limitations.  In any case,
I believe these things still have so much in common with
functions that calling them anything else would hurt more than
it would help.

>          Randy.

Here is a proposal that does not involve renaming:

  1) Require "limited" (or some such word) if the function is going to
     return its result by reference; all other functions must
     return/initialize "new" objects.
     By-ref functions can only be called in contexts that don't
     require a new object (e.g. as an IN parameter or a renaming).
     [Better long-term alternative: replace these oddball functions
      with functions that have anonymous access-to-limited result types,
      since that is what they really are.]

  2) Allow "return Result : Type := <expr> do ... end return;" as a way of
     having a name for the "new" object being returned/initialized.

  3) If a limited type has "normal" functions, then its full type
     must be definite (e.g., there must be defaults for its discriminants).
     Note that its partial view may be indefinite (i.e. "(<>)").
     Also note that the discriminants, though defaulted, may be given
     new values by the initializing <expr>, if the object to be returned is
     unconstrained (i.e. Result'Constrained is False).  This ensures
     that the discriminants have well-defined values coming into
     the function, though they may be changed if the new object
     is unconstrained.

  4) If the (full) result subtype is definite (and hence for
     all limited types), then the name given in the return ... do ...
     (e.g. "Result") can be used within the <expr> itself,
     but only as a prefix for discriminants and the 'Constrained
     attribute.  If 'Constrained is False, then within <expr>,
     Result.<discrim> will necessarily be equal to its default value.
     After <expr>, the discriminants will have the value that was
     determined by <expr>.

The above ensures that for run-time models that need it, the
discriminants and hence the size are known prior to going to
out-of-line code, allowing the caller to do the allocation,
and/or to include the object contiguously in an enclosing record or array, etc.

The only real restriction is that if a limited type is going
to allow objects to have their discriminants determined by an
initializing expression, the full type must have defaults for
the discriminants.  And this restriction is enforced when the
full type is declared, rather than when these functions are called.
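
Item 2 above can be sketched as follows.  This is only an illustration,
written in the form that Ada 2005 and later compilers accept for a named
return object; the names Counters, Counter, Create, Value and Limit are
hypothetical and do not come from the proposal.  The return object gets an
initial value from an aggregate, is adjusted in the "do" part, and is
meant to be built directly in the caller's object:

```ada
with Ada.Text_IO;

procedure Build_In_Place_Demo is

   package Counters is
      type Counter is limited private;
      function Create (Start : Natural) return Counter;
      function Value (C : Counter) return Natural;
   private
      type Counter is limited record
         Count : Natural := 0;
      end record;
   end Counters;

   package body Counters is

      Limit : constant := 100;  --  hypothetical upper bound, for illustration

      function Create (Start : Natural) return Counter is
      begin
         --  "Result" names the new object being returned/initialized.
         return Result : Counter := (Count => Start) do
            --  Further initialization before the function returns.
            if Result.Count > Limit then
               Result.Count := Limit;
            end if;
         end return;
      end Create;

      function Value (C : Counter) return Natural is
      begin
         return C.Count;
      end Value;

   end Counters;

   --  A limited object with an initializing expression that is a call.
   C : constant Counters.Counter := Counters.Create (Start => 5);

begin
   Ada.Text_IO.Put_Line (Natural'Image (Counters.Value (C)));
end Build_In_Place_Demo;
```

The sketch deliberately avoids discriminants, so items 3 and 4 (defaults
for discriminants, and use of the name inside <expr>) are not exercised.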

****************************************************************

From: Randy Brukardt
Sent: Monday, December  8, 2003, 11:13 PM

Tucker said:

> Unfortunately, I have completely lost you.  I was trying to
> focus on the "call" side of things first, before plunging
> into the body/implementation side.  Once we know what we want
> from the call side, we can start to figure out what we need
> to provide on the body/implementation side.

I have no idea what you mean by "call side". In any case, the only valid
subsetting is to look at it from the user's perspective rather than the
implementors. Other divisions just mean that you are ignoring half of the
issues, and you're bound to get the wrong answer then.

...
> Let's just for a moment ignore this issue of whether
> the object has to be default initialized and then re-initialized.
> Notice that I didn't mention that in any of my
> "consensus" lists, and that's not what I am focusing
> on now.  I am happy to keep searching for a solution
> that avoids the double initialization.  What I
> want first is a good specification of what the solution
> should look like on the *call* side.

Then you're not looking at the whole issue. That's not an implementation
detail, it's a very visible part of the user semantics.

The only question is "what makes sense to the user of Ada 2005"? Once we've
figured that out, we can look at whether some implementation restrictions
are needed. That's the only sensible approach.

...
> > I guess I view these as a new thing because what they do is create a
> > user-defined "construction" of an object; they need to replace the
> > "initialization assignment" operation of Ada as well as the "initialization"
> > itself. Existing functions do not change the semantics of assignment. For
> > non-controlled types, the distinction doesn't really matter, but it is a big
> > deal for controlled types (of all stripes).
>
> From the call-side, I don't see the big difference.
> Even from the implementation side, it seems like we are
> just trying to eliminate some extra "last minute" copying
> that is currently part of non-limited-type function
> semantics.

We're also trying to eliminate unnecessary calls on Initialize, Adjust, and
Finalize. Since those are very difficult to optimize without breaking the
user's code, semantics that call them less often and are still safe are
important.

> > Also, I see a new thing as necessary, because I don't believe that a useful
> > constructor can be defined that won't force some representation changes in
> > compilers. (That is, (12) is an impossible goal; holding to it is a disaster
> > from a user perspective -- it forces unnatural separations of construction
> > code into parts. And the idea of somehow specifying an aggregate as the
> > argument of an In Out parameter seems goofy.) As long as the constructors
> > are explicit, then there isn't a problem in that existing code would not
> > have to change representation.
>
> I think you are again implying that by writing a constructor-ish-thing,
> the record representation would change.  This seems quite undesirable
> to me.

Not "would", but "could" -- in unlikely cases. If the type is definite,
there is never a problem with any proposal.

> > If we don't have the will to do this right this time, I don't think there is
> > any value to another partial band-aid solution. Especially if it cannot be
> > extended properly in the future. Which is why Tucker's procedure renaming
> > just isn't going to work.
>
> It would help if you had an example of the kind of extension you
> had in mind.    I promise I am not wedded to the procedure
> renaming approach, but I do think it satisfies the requirements,
> except perhaps from an aesthetic point of view.  I think we still
> might be able to make the "return ... do ..." approach work,
> but there would probably be more limitations.  In any case,
> I believe these things still have so much in common with
> functions that calling them anything else would hurt more than
> it would help.

Sheesh, Tuck, I wrote up a rough description of syntax and semantics
yesterday. Do I have to do it again??

> Here is a proposal that does not involve renaming:
>
>   1) Require "limited" (or some such word) if the function is going to
>      return its result by reference; all other functions must
>      return/initialize "new" objects.
>      By-ref functions can only be called in contexts that don't
>      require a new object (e.g. as an IN parameter or a renaming).
>      [Better long-term alternative: replace these oddball functions
>       with functions that have anonymous access-to-limited result types,
>       since that is what they really are.]

That's a lousy better alternative, even if it is accurate. I don't think we
ever want to force the introduction of access types where there currently are
none.

>   2) Allow "return Result : Type := <expr> do ... end return;" as a way of
>      having a name for the "new" object being returned/initialized.

So (1) and (2) are pretty close to what I proposed Saturday night (except
for syntax). It's not clear to me what the user-level semantics for
non-limited types is supposed to be.

>   3) If a limited type has "normal" functions, then its full type
>      must be definite (e.g., there must be defaults for its discriminants).
>      Note that its partial view may be indefinite (i.e. "(<>)").
>      Also note that the discriminants, though defaulted, may be given
>      new values by the initializing <expr>, if the object to be returned is
>      unconstrained (i.e. Result'Constrained is False).  This ensures
>      that the discriminants have well-defined values coming into
>      the function, though they may be changed if the new object
>      is unconstrained.

Tagged types don't allow defaults for discriminants. So you're saying that
useful limited types either must not be tagged or must not have
discriminants. That seems pretty fierce.

And, in any case, I don't see what this has to do with the user perspective.
You said something about ignoring implementation details, and I can't see
any reason for this other than an implementation detail.

>   4) If the (full) result subtype is definite (and hence for
>      all limited types), then the name given in the return ... do ...
>      (e.g. "Result") can be used within the <expr> itself,
>      but only as a prefix for discriminants and the 'Constrained
>      attribute.  If 'Constrained is False, then within <expr>,
>      Result.<discrim> will necessarily be equal to its default value.
>      After <expr>, the discriminants will have the value that was
>      determined by <expr>.

Is this necessary? I haven't tried to work out full examples, but it adds
complications where none seems to be needed. Anything that needs to refer to
the name should be in the statement part, I would think. What would you need
to do in that expression that can't be deferred to the body??

> The above ensures that for run-time models that need it, the
> discriminants and hence the size are known prior to going to
> out-of-line code, allowing the caller to do the allocation,
> and/or to include the object contiguously in an enclosing record
> or array, etc.

That seems like an implementation detail, again. That's fair game, of
course, but not when you say "let's focus on the user view".

> The only real restriction is that if a limited type is going
> to allow objects to have their discriminants determined by an
> initializing expression, the full type must have defaults for
> the discriminants.  And this restriction is enforced when the
> full type is declared, rather than when these functions are called.

I agree that (hard) restrictions should be enforced when the constructor is
declared (it's not really a problem with the type). There's nothing wrong
with run-time checks if something is declared to be indefinite, but not
otherwise. But I'd prefer to avoid restrictions if we can.

I note that this proposal does seem to allow deferring or eliminating
default initialization. But it seems to imply that non-limited types still
require a copy (and Adjust call) afterwards. I'd much prefer to allow
build-in-place. Perhaps the new 7.6.1 would allow that?

---

My quicky summary of the user view of constructors:

1) These should be usable anywhere that an aggregate can be, with similar
semantics.
   (This implies that top-level Adjust or Initialize should not be called on
them in general.)
2) There should be as few as possible restrictions on the declarations and
use of constructors.
3) They shouldn't feel "weird".
   (This implies retaining the function call-like syntax -- ":= Create(...);",
   and that the declaration probably should look something like a function call.)

****************************************************************

From: Randy Brukardt
Sent: Tuesday, December  9, 2003, 12:07 AM

Tucker said, responding to me:
> > Is this necessary? I haven't tried to work out full examples, but it adds
> > complications where none seems to be needed. Anything that needs to refer to
> > the name should be in the statement part, I would think. What would you need
> > to do in that expression that can't be deferred to the body??
>
> If you want to specify the initial value as an aggregate, you have
> to specify the values for the discriminants.  If you want them
> to match the newly created object, you need to be able to refer
> to the discriminants of the newly created object.

OK. That seems like a nice-to-have rather than a requirement. Without it,
you still could pass the discriminants as parameters to the constructor if
you had to. Not as pretty, but workable. A lot of the time, you'd want to do
that anyway. I'd hate to kill the proposal with some funny visibility that
isn't strictly necessary.

> > I note that this proposal does seem to allow deferring or eliminating
> > default initialization. But it seems to imply that non-limited types still
> > require a copy (and Adjust call) afterwards. I'd much prefer to allow
> > build-in-place. Perhaps the new 7.6.1 would allow that?
>
> I don't think I was making any such requirement.  Many compilers
> currently preallocate space for the value returned by
> a function having a non-limited composite result type.
> The return ... do ... construct would allow that
> space to be used directly.  For functions returning values
> on the secondary stack, it would also be possible to build
> the returned value directly on the secondary stack,
> and avoid cutting back the secondary stack upon return
> and just use it where it was placed.  We already do that
> in certain circumstances.

I think you're relying on 11.6 and 7.6.1 to get there. Right? That's fine as
long as those sections have the right effect (and we all agree that is an
appropriate effect).

I don't think that the original 7.6.1 would have allowed that optimization
(even though compilers probably do it!), but the new one ought to.

> With limited types we need to create enough restrictions
> to ensure it can be built in place.  For non-limited types,
> the return ... do ... construct makes it more likely that
> no extra copies are required, but we can't impose enough
> restrictions to ensure that no copies are needed in all
> run-time models.

That's fair. It seems likely that a copy would have to be done sometime
(certainly for elementary types, although that's no real problem).

> > My quicky summary of the user view of constructors:
> >
> > 1) These should be usable anywhere that an aggregate can be, with similar
> > semantics.
> >    (This implies that top-level Adjust or Initialize should not be called on
> > them in general.)
> > 2) There should be as few as possible restrictions on the declarations and
> > use of constructors.
> > 3) They shouldn't feel "weird".
> >    (This implies retaining the function call-like syntax -- ":= Create(...);",
> >    and that the declaration probably should look something like a function call.)
>
> I agree with the above, and as indicated, I see no reason
> not to call them functions.  The "return ... do ..." construct
> might be called a "constructor statement" or some such thing
> if you want to get the term "constructor" into the language ;-).

It wouldn't hurt, but I'm certainly not going to be banging any shoes if it
doesn't happen. :-)

****************************************************************

From: Robert I. Eachus
Sent: Tuesday, December  9, 2003, 10:51 AM

Wow!  I think we are finally converging, but four long messages, two
each from Tucker and Randy, after I gave up for the night?  Is Tucker
already on California time? ;-)

But I would like to start pulling out one issue at a time to resolve.
In this particular case, I am going to set ground rules for this thread,
since it presumes that certain things will be adopted.  The issue here is
the RM requirements to be specified for initializing objects inside the
return <object declaration> do <sequence of statements> end;

My feeling is that in reality writing useful constructors for private
types outside the body of the package that defines the type (or one of
its children) is going to be a pretty useless capability.  Not totally
useless, but okay to ignore for now.  This means that if there are
discriminants, we can assume they are known and visible, and the same
with other fields of the target type.  (But the fields of the parent
type may not be visible in this context.)

The three options that seem worth discussing are:

1) Controlled types are initialized implicitly.
2) Default initialization only occurs if there is no explicit
initialization: (return Foo: Bar := (this, that, etc) do...)
3) We are all big boys here.  (The constructor is normally defined by
the same person who writes Initialize or assigns defaults.)  So there
are no default initializations, and no run-time checks.
4) There is a run-time check at the end of the do...end; and  an
exception is raised if any discriminants of the object were not initialized.

I think that all of these are technically viable given the way things
seem to be headed, so this is really a normative discussion   (What do
we want to happen?).  The one troubling case I see can occur if we
decide that if the target is constrained its discriminants, if any, get
'copied' in from the target object.  (In practice they will be the same
object, no copying needed.)  Then if a constructor that assumes it will
get the constraints from the target is used to create an object that
will get its constraints from the initial value, something should
happen.  This leads me to tend toward 4).  If the sequence of statements
checks 'Constrained and initializes the discriminants (or other bounds)
if false, then the check can be optimized away.  Otherwise the
constructor should raise Program_Error.  Compilers can provide warnings
if code for the check is generated, or if something that reads the
discriminants in the sequence of statements occurs where the
discriminants may not yet be initialized.

Notice though, that it might be nice if 'Constrained could be checked
before the return statement.  I think that this is the one detail that I
miss from my proposal in Randy's variation.  There will be cases where a
programmer would like to write:

if Foo'Constrained
then return ...;
else return ...;
end if;

Allowing the user to use the name of the constructor in this instance as
a prefix for 'Constrained would allow this.  I don't see any real need
for other attributes of the target outside the return.

Which brings up an interesting point.  Will there be a restriction that
prevents return statements inside the sequence of statements of a return
construct?  If not, then the above will work.  But I think the principle
of least surprise says that nested return statements should be illegal.

Rule 2), it seems to me, is the other contender.  If there is no explicit
initialization in the object declaration, implicit initialization
occurs.  This solves what to me is one troubling case with 4).  If some
fields of a record type have defaults, those default initializations may
not happen, and there will be no warning to the  programmer.

I guess I am comfortable with either 2) or 4).  I think for most types,
and most constructors, there will be an explicit aggregate initial
value, with the sequence of statements if present doing any fixup
needed.  As long as double initialization doesn't occur in that case, I
am happy.

****************************************************************

From: Randy Brukardt
Sent: Tuesday, December  9, 2003, 11:01 AM

Robert Eachus said:
> The three options that seem worth discussing are:
>
> 1) Controlled types are initialized implicitly.
> 2) Default initialization only occurs if there is no explicit
> initialization: (return Foo: Bar := (this, that, etc) do...)
> 3) We are all big boys here.  (The constructor is normally defined by
> the same person who writes Initialize or assigns defaults.)  So there
> are no default initializations, and no run-time checks.
> 4) There is a run-time check at the end of the do...end; and  an
> exception is raised if any discriminants of the object were not
> initialized.

"Three options"? I see four. :-)

...
> I guess I am comfortable with either 2) or 4).  I think for most types,
> and most constructors, there will be an explicit aggregate initial
> value, with the sequence of statements if present doing any fixup
> needed.  As long as double initialization doesn't occur in that case, I
> am happy.

I had proposed (2). I agree that we don't want objects floating about that
haven't been initialized at all. And, being able to write a specific
initialization expression allows overriding that (or pieces of it, given
the <> notation). That seems powerful enough to me.

(And nested returns aren't allowed. Once a return object is created, it has
to be returned. Otherwise, you wouldn't be able to build the object in
place, which is the whole point.)

****************************************************************

From: Tucker Taft
Sent: Tuesday, December  9, 2003, 11:01 AM

> The three options that seem worth discussing are:
      ^^^^^ "four" ;-)

> 1) Controlled types are initialized implicitly.
> 2) Default initialization only occurs if there is no explicit
> initialization: (return Foo: Bar := (this, that, etc) do...)
> 3) We are all big boys here.  (The constructor is normally defined by
> the same person who writes Initialize or assigns defaults.)  So there
> are no default initializations, and no run-time checks.
> 4) There is a run-time check at the end of the do...end; and  an
> exception is raised if any discriminants of the object were not initialized.

Only (2) makes sense to me.

****************************************************************

From: Robert I. Eachus
Sent: Tuesday, December  9, 2003,  4:57 PM

Tucker Taft wrote:

> > The three options that seem worth discussing are:
>
>  ^^^^^ "four" ;-)

"A foolish consistency is the hobgoblin of little minds."  -- Ralph
Waldo Emerson.
(Seriously, I decided to add case three, then didn't change the prefix.
Of course, I was originally intending to leave case three out as not
making much sense for Ada. ;-)

> > 2) Default initialization only occurs if there is no explicit initialization:
> > (return Foo: Bar := (this, that, etc) do...)
>
>  Only (2) makes sense to me.

Since 2 is acceptable to me also, shall we consider that issue
resolved?  Any other votes?

The object in the return statement gets initialized, including a call to
an explicit Initialize for controlled types, and default values for
record components, unless there is an initial value in the return statement.

This implies that the syntax for these special returns should allow:

return Foo: Bar := (some initial value aggregate); --without a do ... end;
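
A rough sketch of what option (2) would mean for a controlled type,
written in the form that Ada 2005 and later compilers accept.  The names
Handles, Handle, Init_Calls, Make_Default and Make_Explicit are
hypothetical.  Under this rule one would expect only Make_Default, which
gives no initial value, to invoke the user-defined Initialize on the
return object:

```ada
with Ada.Finalization;
with Ada.Text_IO;

procedure Init_Option_Demo is

   package Handles is
      Init_Calls : Natural := 0;  --  counts calls to Handle's Initialize

      type Handle is new Ada.Finalization.Limited_Controlled with record
         Id : Integer := 0;
      end record;

      overriding procedure Initialize (H : in out Handle);

      function Make_Default return Handle;
      function Make_Explicit return Handle;
   end Handles;

   package body Handles is

      overriding procedure Initialize (H : in out Handle) is
      begin
         Init_Calls := Init_Calls + 1;
         H.Id := -1;
      end Initialize;

      function Make_Default return Handle is
      begin
         --  No initial value: the return object is default initialized,
         --  then fixed up in the statement part.
         return Result : Handle do
            Result.Id := 1;
         end return;
      end Make_Default;

      function Make_Explicit return Handle is
      begin
         --  Explicit initial value, and no "do ... end" part.
         return Result : Handle :=
           (Ada.Finalization.Limited_Controlled with Id => 2);
      end Make_Explicit;

   end Handles;

   A : Handles.Handle := Handles.Make_Default;
   B : Handles.Handle := Handles.Make_Explicit;

begin
   Ada.Text_IO.Put_Line ("A.Id =" & Integer'Image (A.Id));
   Ada.Text_IO.Put_Line ("B.Id =" & Integer'Image (B.Id));
   Ada.Text_IO.Put_Line
     ("Initialize calls =" & Natural'Image (Handles.Init_Calls));
end Init_Option_Demo;
```

The type is limited here so that no Adjust call enters the picture; for
non-limited controlled types the copying question discussed earlier in
the thread still applies.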

****************************************************************

From: Robert A. Duff
Sent: Tuesday, December  9, 2003, 5:34 PM

Robert Eachus wrote:

> "A foolish consistency is the hobgoblin of little minds."  -- Ralph
> Waldo Emerson.

;-)

> > > 2) Default initialization only occurs if there is no explicit initialization:
> > > (return Foo: Bar := (this, that, etc) do...)
> >
> >  Only (2) makes sense to me.
>
> Since 2 is acceptable to me also, shall we consider that issue
> resolved?  Any other votes?

I agree with (2).

****************************************************************

From: Jean-Pierre Rosen
Sent: Tuesday, December  9, 2003, 10:31 AM

> Consensus statements about Ada 200Y if we were to approve AI-318:
>
>    1) Should be possible to declare an object of a limited
>       type and provide an initializing expression
>    2) Should be possible to use an initialized allocator for
>       an access-to-limited type
This is of course what is basically required. However, I think something
is missing from the list:
- the function-like thing should be able to access the characteristics
(discriminants, bounds, etc) of the object being constructed.

Seems to me that without this requirement, we could just allow
initialization of limited types (but not assignment, of course).

****************************************************************

From: Tucker Taft
Sent: Tuesday, December  9, 2003, 12:04 PM

Good point.  We clearly need to be able to query the constraints
of the "new" object.  I said these constraints should be visible
in the initializing expression.  They should also be visible
in the subtype indication of:

  return Result : <subtype_ind> [:= <expr>] do ...

I suppose if the <subtype_ind> is unconstrained, then the
constraints come from the calling context.
If the <subtype_ind> is constrained, then there must be
a check that the constraints are compatible with those coming
from the calling context.

> Seems to me that without this requirement, we could just allow
> initialization of limited types (but not assignment, of course).

I don't understand this sentence.

****************************************************************

From: Robert I. Eachus
Sent: Tuesday, December  9, 2003,  6:10 PM

 Emboldened by my success at settling the initialization issue (I
hope!), I'll see if I can separate out another normative issue
and resolve it.   This one may be more controversial though.  What
should the syntax be for the new declarations?  (As far as I am
concerned, I think this issue is distinct from Tucker's renaming
proposal, because the renaming would normally be in the private part, or
possibly in the body.)  I started by suggesting a change from function
to constructor as a placeholder for the new syntax.  But from what I
have seen, the idea of having a marker in the syntax is as popular as
replacing the reserved word function is unpopular.

I'll try to list the proposals which, to my knowledge, have not yet been
shouted down.  Of course, additional candidates are possible.  The
current syntax rules are in 6.1 and 3.2.2 and I use terms from there in
angle brackets:

1) constructor function <defining identifier> <formal_part> return
<subtype mark>;
2) function <defining identifier> <formal_part> return <defining
identifier> : <subtype indication>;
(This is my variation of a proposal by Dan Eilers.  I put the initial
value part below...)
3) function <defining identifier> <formal_part> return new <subtype mark>;
4) function <defining identifier> <formal_part> create <subtype mark>;

Also there are some cases that either get permitted or ruled out by the
syntax.  These are:

A) Should it be possible for a constructor function to be a child
program unit?
B) Should it be possible for a constructor function to be an operator?
(For example "+")
C) Should it be possible for a constructor to be declared abstract?
D) Is there a need to change the <subtype mark> to a <subtype indication>?
This would allow constraints in constructor declarations.  They are not
permitted as such in function declarations, because the value returned
is not checked against some constraints. (But others are checked, see
4.6 for details)  We may want to change this and/or these rules for
constructors.
E) In option 2, should an initial value be allowed after the subtype
mark in the object declaration?  (This would probably be instead of in
the return statement.)
F) Again in option 2, should aliased or constant be allowed before the
subtype declaration?
G) Also in option 2, what about allowing <array_type_definition> instead
of  <subtype indication>?

I should give my votes, I guess.  3) is a slight leader IMHO among the
syntax proposals, but I keep hoping for someone to come up with a better
option. (The new implies heap use to Ada programmers.)  A. Not
necessary,  B. Yes,  C. Yes,  D. No, E. Yes, F. No, G. Useless.

****************************************************************

From: Tucker Taft
Sent: Tuesday, December  9, 2003,  8:52 PM

Robert I. Eachus wrote:
>
> 1) constructor function <defining identifier> <formal_part> return
> <subtype mark>;
> 2) function <defining identifier> <formal_part> return <defining
> identifier> : <subtype indication>;
> (This is my variation of a proposal by Dan Eilers.  I put the initial
> value part below...)
> 3) function <defining identifier> <formal_part> return new <subtype mark>;
> 4) function <defining identifier> <formal_part> create <subtype mark>;

None of the above.  Functions always create new objects
(except for the funny return-by-ref ones which might
better be called "object selectors" or some such thing).
The only issue is whether you have a name for the
object being created, and whether the properties of
that object can be inherited from the calling context.
This can be a useful capability for any function that
returns a composite object, limited or non-limited.

As I suggested, I would argue for distinguishing the
existing return-by-reference functions.  These are the
ones that are going to be of limited use, and in particular,
a call on one of these will *not* be usable at the initialization
expression for a declared object or an aggregate component.

I would suggest for the return-by-reference "functions":

    limited function Ret_By_Ref(...) return Lim;
   or
    function Ret_By_Ref(...) return limited Lim;
   or
    function Ret_By_Ref(...) limited return Lim; -- probably my favorite
   or use a non-reserved keyword:
    function Ret_By_Ref(...) reference return Lim;

or drop this capability in favor of:

    function Ret_Acc(...) return access Lim;

which has the very nice property of saying exactly
what is happening, not having any funny restrictions
on use, carrying the idea of accessibility quite naturally
(currently the accessibility check on return by reference
is a bit odd), etc.

> Also there are some cases that either get permitted or ruled out by the
> syntax.  These are:
>
> A) Should it be possible for a constructor function to be a child
> program unit?
> B) Should it be possible for a constructor function to be an operator?
> (For example "+")
> C) Should it be possible for a constructor to be declared abstract?
> D) Is there a need to change the <subtype mark> to a <subtype indication>?
> This would allow constraints in constructor declarations.

I recommend that these look like regular functions at the
declaration point, and in any case should
be allowed to be abstract, operators, library level, generic, etc.

> ...
> E) In option 2, should an initial value be allowed after the subtype
> mark in the object declaration?  (This would probably be instead of in
> the return statement.)
> F) Again in option 2, should aliased or constant be allowed before the
> subtype declaration?
> G) Also in option 2, what about allowing <array_type_definition> instead
> of  <subtype indication>?

I don't like option 2 much, and certainly not
as the declaration.  I could see it as the body, perhaps,
in which case it is essentially equivalent to the return ... do ...,
where the expression, if given, is an initial value, which can
then be modified further in the body of the function.
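
[Editor's sketch, not part of the original message.  The
"return access Lim" alternative mentioned above can be illustrated as
follows, assuming anonymous access result types as they later appeared in
Ada 2005.  The names Lim, The_Object and Ret_Acc are hypothetical (the
last two borrowed from the message's own placeholders).]

```ada
procedure Return_Access_Demo is

   --  Hypothetical explicitly limited record, i.e. a type for which
   --  an Ada 95 function result would be returned by reference.
   type Lim is limited record
      Hits : Natural := 0;
   end record;

   --  A preexisting object; "aliased" makes 'Access legal.
   The_Object : aliased Lim;

   --  The declaration says outright that a reference to an existing
   --  object is handed back, rather than a newly created object.
   function Ret_Acc return access Lim is
   begin
      return The_Object'Access;
   end Ret_Acc;

begin
   --  The caller updates the designated object through the result.
   Ret_Acc.all.Hits := Ret_Acc.all.Hits + 1;
   pragma Assert (The_Object.Hits = 1);
end Return_Access_Demo;
```

Unlike an Ada 95 return-by-reference result, which denotes a constant
object, the access result here allows the designated object to be
modified by the caller.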

****************************************************************

From: Robert I. Eachus
Sent: Wednesday, December 10, 2003, 12:01 PM

Tucker Taft wrote:

> As I suggested, I would argue for distinguishing the
> existing return-by-reference functions.  These are the
> ones that are going to be of limited use, and in particular,
> a call on one of these will *not* be usable at the initialization
> expression for a declared object or an aggregate component.


I can't accept this.  To quote from Ada.Text_IO:

   type File_Type is limited private;
      ...
   function Standard_Input  return File_Type;
   function Standard_Output return File_Type;
   function Standard_Error  return File_Type;

   function Current_Input   return File_Type;
   function Current_Output  return File_Type;
   function Current_Error   return File_Type;

Where the current return by reference functions are appropriate, they
are currently used with no particular effort or thought by programmers.
I find them very handy when building software where the objects have
permanence because operations that change their value can also change
the value in a database.

The new initialization rules will make such types more useful, and more
common than they currently are.  Right now the problem with them is that
users must call an initialization procedure, so the designer of the ADT
has to worry about what happens if the object is used before it is
(non-default) initialized.  This tends to create additional run-time
overhead.  I have several ADTs where there is an explicit initialization
flag in the (limited) object  type.  This is the only field with a
default initial value, and its only use is to ensure that other
operations on the type can check the flag and raise an  exception.

With the new proposal and types with unknown discriminants, that flag
and all the associated code can go away.  A user of the ADT can't create
an object without initializing it.  This means that there will be many
more inquiry functions which return an existing value of the type, and
can and should only be used as parameters in other calls.  Keeping this
usage of the existing non-constructor limited return functions efficient
is a necessary goal of the current effort.  As with File_Type in
Ada.Text_IO, there will be several of these return by reference
functions for every one of the new constructors, whatever they are called.

And notice that the 'funky' return by reference functions are the
exception to the rubric that every function returns a new object.  They
are only useful where they do not do that, and they are useful and used
for exactly that property.

> I would suggest for the return-by-reference "functions":
>
>    limited function Ret_By_Ref(...) return Lim;
>   or
>    function Ret_By_Ref(...) return limited Lim;
>   or
>    function Ret_By_Ref(...) limited return Lim; -- probably my favorite
>   or use a non-reserved keyword:
>    function Ret_By_Ref(...) reference return Lim;
>
> or drop this capability in favor of:
>
>    function Ret_Acc(...) return access Lim;
>
> which has the very nice property of saying exactly
> what is happening, not having any funny restrictions
> on use, carrying the idea of accessibility quite naturally
> (currently the accessibility check on return by reference
> is a bit odd), etc.

The latter would work, and could be used in Ada.Text_IO.  I just don't
like the non-upward compatible change, and the distributed cost of doing
this.  Returning an access value is not that much less efficient than
the current situation.  Sometimes there will be an extra level of
indirection.  But I don't think that the methodological purity justifies
the forced changes to existing code.  Worse, the error messages will be
on the calls, not on the compilation of the defining package.

> I recommend that these look like regular functions at the
> declaration point, and in any case should
> be allowed to be abstract, operators, library level,
> generic, etc.

Noted.

****************************************************************

From: Tucker Taft
Sent: Wednesday, December 10, 2003,  1:27 PM

> Tucker Taft wrote:
>
> > As I suggested, I would argue for distinguishing the
> > existing return-by-reference functions.  These are the
> > ones that are going to be of limited use, and in particular,
> > a call on one of these will *not* be usable at the initialization
> > expression for a declared object or an aggregate component.
>
>
> I can't accept this.  To quote from Ada.Text_IO:
>
>    type File_Type is limited private;
>       ...
>    function Standard_Input  return File_Type;
>    function Standard_Output return File_Type;
>    function Standard_Error  return File_Type;
>
>    function Current_Input   return File_Type;
>    function Current_Output  return File_Type;
>    function Current_Error   return File_Type;

These are not return-by-reference in most (all?) implementations.
The full type for File_Type is almost always an access value.

Return-by-reference only happens if the full type is "very"
limited (e.g. contains a task, a protected object, or an
explicitly limited record).

> Where the current return by reference functions are appropriate, they
> are currently used with no particular effort or thought by programmers.

I don't think they are used very much at all.  They weren't generally
allowed in Ada 83 (returning a task outside of its scope was an error;
protected objects and limited records didn't exist).  To use them
now, you have to worry about the accessibility check.

> I find them very handy when building software where the objects have
> permanence because operations that change their value can also change
> the value in a database.

I don't really follow.  In any case, you would still be able
to either use return-by-reference, or return-by-access, but
you might have to add a reserved word to make it clear to
the caller that they cannot call this function as an initializer
or component of an aggregate.  Note that for limited types
with non-limited full types, there would be no such limitation --
functions returning such types *could* be used as initializers, etc.
In my mind, this argues further for saying that the default
should be that the function can be called in such contexts,
and you should have to say more if you are creating a function
which will *not* be usable as an initializer.

> The new initialization rules will make such types more useful, and more
> common than they currently are.  ...

Are you talking about "inherently" limited types, or a type
whose full type is non-limited, and copying is permitted inside
the defining package?

The new initialization rules will make it clearer that return-by-reference
functions are second class citizens.

> And notice that the 'funky' return by reference functions are the
> exception to the rubric that every function returns a new object.  They
> are only useful where they do not do that, and they are useful and used
> for exactly that property.

And it is important to highlight that property in my mind,
since it is so weird.  Because if a type is visibly limited,
but internally non-limited, then you can't even write
a return-by-reference function.

> > I would suggest for the return-by-reference "functions":
> >
> >    limited function Ret_By_Ref(...) return Lim;
> >   or
> >    function Ret_By_Ref(...) return limited Lim;
> >   or
> >    function Ret_By_Ref(...) limited return Lim; -- probably my favorite
> >   or use a non-reserved keyword:
> >    function Ret_By_Ref(...) reference return Lim;
> >
> > or drop this capability in favor of:
> >
> >    function Ret_Acc(...) return access Lim;
> >
> > which has the very nice property of saying exactly
> > what is happening, not having any funny restrictions
> > on use, carrying the idea of accessibility quite naturally
> > (currently the accessibility check on return by reference
> > is a bit odd), etc.
>
> The latter would work, and could be used in Ada.Text_IO.

As indicated above, this is probably irrelevant to Ada.Text_IO,
since File_Type, though limited, is almost certainly *not*
a return-by-reference type.

****************************************************************

From: Robert I. Eachus
Sent: Wednesday, December 10, 2003,  2:14 PM

Tucker Taft wrote:

>These are not return-by-reference in most (all?) implementations.
>The full type for File_Type is almost always an access value.
>
>Return-by-reference only happens if the full type is "very"
>limited (e.g. contains a task, a protected object, or an
>explicitly limited record).

I understand what you are saying, and realized after I posted that the
message may have seemed a bit too shrill.  My point really was that only
the implementation can know in some cases whether a function that
appears to be "return by reference" actually is, and in other cases the
programmer may know his intent but he has to communicate it to the compiler.

A better example is a tagged type (call it Foo) declared as limited
private.  Whether or not a function that returns Foo'Class needs to be
return by reference can depend on derived types not created yet.  So you
must treat current primitive functions that return Foo as return by
reference.  The issue is that even though for the derived type, call it
Foobar, the function becomes abstract, that function can be defined, and
will be return by reference for Foobar.  Then if there is a dynamic
dispatching case the compiler will have to be able to either select the
function for the parent type Foo, or the fully limited child type Foobar
at compile time.

But having said that, I think that it is an unwarranted assumption that
no current implementation of Ada.Text_IO has an inherently limited
File_Type.

>Are you talking about "inherently" limited types, or a type
>whose full type is non-limited, and copying is permitted inside
>the defining package?

Yes, I gather that that is what you (Tucker) are just not seeing.  Once
I have a limited type, since copying will break things, I have no reason
not to put a protected type component in the type to ensure that updates
are sequential. (Or just make the whole thing protected.)  If the type
was declared non-limited, I would have to resort to an access component
and do the storage management explicitly.  However, once I make the
visible type limited to prevent copying, why add the unnecessary code
and overhead of an explicit indirection?

>And it is important to highlight that property in my mind,
>since it is so weird.  Because if a type is visibly limited,
>but internally non-limited, then you can't even write
>a return-by-reference function.

But what if the programmer doesn't know whether the type is inherently
limited?  File_Type is a perfect example.  Right now all a user needs to
know is that he can't assign them or test for equality.  The compiler
takes care of the rest.

****************************************************************

From: Jean-Pierre Rosen
Sent: Thursday, December 11, 2003,  1:37 AM

> I don't think they are used very much at all.  They weren't generally
> allowed in Ada 83 (returning a task outside of its scope was an error;
No. It was a pathology (Rosen's pathology :-), but not an error...

****************************************************************

From: Robert Dewar
Sent: Thursday, December 11, 2003,  8:08 AM

Note that the designation pathology (was it ever used for anything
else ever?) means that implementations can ignore the situation and
treat it as erroneous.

Of course the damage to implementations had already been done by then
and I don't think many bothered to change.

****************************************************************

From: Robert A. Duff
Sent: Thursday, December 11, 2003,  1:57 PM

Tuck said:

> >>   13) Should allow us to still have function-like things
> >>       that return by reference

I replied:

> >I don't much care about that for my own code, but I think it would be
> >irresponsible of us to be incompatible with folks who have used this
> >feature, even if we think perhaps it's a misguided feature.

Robert Eachus replied:

> Huh?  Oh.  You don't use it because you don't use limited tagged types.
> This feature will become much more useful and less 'misguided' if we can
> initialize objects of types derived from Limited_Controlled more easily.

I don't get it.

I *do* use limited tagged types.  I also use tagged types that I wish
were limited, but I made them nonlimited so I could use aggregates and
the like.

These return-by-reference things seem pointless to me.  What they're
"really" doing is returning a pointer to a preexisting limited object.
If I want to do that, I tend to use an explicit access type.

Can you (or somebody) show me an example where a return-by-reference
function is better than a function returning an access value?

Note that return-by-reference always returns a constant object,
so you can't do much with it.

I think I've used this feature exactly once in the last few years,
and I can't remember why.  I do remember that I thought it necessary
to put a comment in, explaining that this depends on the weird
return-by-reference nature of that thing.

> One last suggested inclusion:
>
> 19) If we adopt a partial solution, that partial solution shouldn't
> limit a future extension to cover everything.

That seems like a worthy goal.  We might say "the result type shall be
definite".  It would be good if removing that rule didn't require any
other language changes -- just implementation changes.

Tuck said:

> > > Other possible desirables:
> > >    12) Should not require alteration in the way limited types are laid out

I replied:

> > I think I agree, but I'm not really sure what you're getting at.  Which
> > proposal(s) violate this?

Tuck replied:

> There were a lot of different ideas thrown around, but at least one of
> them implied that the caller might *not* know the size of the thing
> being allocated, nor where it was being allocated.  Clearly if you
> call one of these function-like things as a component of an aggregate,
> and you lay out limited types contiguously (even if some component is
> dynamic-sized), then the caller *must* specify where the object is
> allocated, and *must* know the size before it goes out-of-line so it
> can add up all the sizes of the components and do the one overall
> allocation in the appropriate place (on the secondary stack, in some
> user-specified storage pool, as a component of a yet larger limited
> object, etc.).

I don't understand this.  I presume you are talking about
implementations that allocate-the-max for mutable records
(i.e., records with defaulted discrims).  Record components
must have definite subtypes, so their size is known before the
call -- it cannot be calculated by the constructor function.
So the caller can always add up the component sizes and determine how
much to allocate for the outer record.

If the result subtype of the constructor is indefinite, it needs to be
passed a thunk that takes a size, and returns an address.  Robert Eachus
outlined this idea.  We also talked about the same idea earlier, saying
the constructor is passed a Storage_Pool.  In the case where the
constructor is initializing a record component, as in:

    X: T := (Discrim => 123, That => Construct(456));

the caller would allocate, and the thunk/pool would return the address
of X.That.  It would raise Constraint_Error if the size is wrong.

In the case of:

    new T'(Construct(456))

the thunk/pool actually does the allocation -- it's the storage pool
defined by the user for the access type.

I see why in the component case the size is known before the call, and
the memory is allocated before the call.  But I don't see why that needs
to be true in other cases, if we're willing to pass in a thunk, which
might do some allocation, or might return the address of some
preallocated memory.

The thunk is needed whenever the result subtype is indefinite.
Do you plan to disallow that case?

If the compiler is willing to generate multiple copies of the code for
constructor functions, it could optimize away the thunks.

Likewise, if the constructor is inlined (which I suspect will be
common), the thunk can be inlined, too.

----

Randy said:

> Trying to lay out all possible record types contiguously is a fool's game.

I believe that's a common implementation.  I think AdaMagic and GNAT
both do that, for example.  I agree with Tuck: we don't want to change
record layouts; the always-contiguous assumption is fundamental
(in the compilers that use it).

Tuck said:

> Let's just for a moment ignore this issue of whether
> the object has to be default initialized and then re-initialized.
> Notice that I didn't mention that in any of my
> "consensus" lists, and that's not what I am focusing
> on now.  I am happy to keep searching for a solution
> that avoids the double initialization.  What I
> want first is a good specification of what the solution
> should look like on the *call* side.

On the call side, it looks like a function call.
It can be used anywhere a function call is allowed,
except that in the limited case, not in an assignment_statement.
(I think every other "assignment operation" is an initialization.)
Or did you mean the run-time model at the call site?

****************************************************************

From: Dan Eilers
Sent: Thursday, December 11, 2003,  2:24 PM

I'm probably missing something obvious, but if we are going
to provide "in-place" functions, why can't they be used in
assignment statements?

The purpose of limited types is presumably to avoid _copying_
an object, which in-place functions seem designed to avoid,
regardless of where they are called.

****************************************************************

From: Robert I. Eachus
Sent: Friday, December 12, 2003, 12:10 AM

You could allow it, but we definitely shouldn't allow it.  The effect
would be to create a new object with the same name as the old one, and
you can do that now, for example by having a limited object (Bar)
declared inside a procedure (Foo).  Then which Foo.Bar you are referring
to depends on which call of Foo is active.

But now think about a limited object with a task component.  To avoid
having ten tasks floating around with the same name, you have to
finalize the old task before allowing the assignment.  I could go on,
but I think you get the idea.  We could allow for example overwriting
the value of an Ada.Text_IO.File_Type, but what happens to the file in
the old object?   All the current Ada rules involving limited types
depend on the object created surviving at least as long as the
containing scope.

****************************************************************

From: Robert A. Duff
Sent: Friday, December 12, 2003,  8:24 AM

I think limited types need to avoid copying out of objects (the fetch,
or Bliss dot), and also copying *into* objects (overwriting existing
data).  Maybe you don't call the latter "copying".  But I think it's
necessary to avoid this overwriting because otherwise, the previously
existing value gets lost.  If it contains tasks, it raises the issue
of when we await termination of those tasks.  If it contains finalizable
stuff, it raises the issue of when to finalize.  I suppose you could
finalize just before overwriting, but still, it seems like a violation
of the identity of limited types.  Initialization seems like something
that should only be done once.

****************************************************************

From: Dan Eilers
Sent: Friday, December 12, 2003, 11:21 AM

I wrote:
> I'm probably missing something obvious, but if we are going
> to provide "in-place" functions, why can't they be used in
> assignment statements?

Robert Eachus and Bob Duff both pointed out that the target
of the assignment statement would need to be finalized.  But
this is normal when assigning to objects that need finalization.
7.6.1(12) says "The target of an assignment statement is finalized
before copying in the new value".

Maybe we would want to wordsmith this a little to avoid the term
"copying" in the case of the proposed return-in-place functions.

Bob Duff wrote:
>                            If it contains tasks, it raises the issue
> of when we await termination of those tasks.

Is this really a serious problem?  It seems minor compared to the
benefit of making programming with limited types as similar as possible
to programming with non-limited types, which is important given
that a program written using non-limited types can flip over when
a low-level component of some record type is changed to be limited.

****************************************************************

From: Robert A. Duff
Sent: Friday, December 12, 2003,  3:15 PM

Dan Eilers said:

> Robert Eachus and Bob Duff both pointed out that the target
> of the assignment statement would need to be finalized.  But
> this is normal when assigning to objects that need finalization.

It's not normal when assigning to *limited* objects!  ;-)

I can't get my head around the idea that an assignment_statement could
wipe out a preexisting (already initialized) limited object.  It seems
to defeat the purpose of limitedness.  I admit that "Bob Duff doesn't
like it" is a fairly weak argument.  ;-)

> 7.6.1(12) says "The target of an assignment statement is finalized
> before copying in the new value".
> Maybe we would want to wordsmith this a little to avoid the term
> "copying" in the case of the proposed return-in-place functions.
>
> Bob Duff wrote:
> >                            If it contains tasks, it raises the issue
> > of when we await termination of those tasks.
>
> Is this really a serious problem?

I think so.  I mean, these objects already exist, so if we're going to
allow assignment statements, we have to say what it means.  And I can't
think of anything good to say about that.

In Ada 83, task objects contained pointers to tasks.  In Ada 95, task
objects contain tasks; there is no longer a semantic need for the extra
level of indirection (although some implementations choose to introduce
the extra indirection for various reasons).  I don't know if any
implementations actually do this, but it is certainly allowed for the
task object to contain all the various queue links and whatnot
-- the task object *is* the Task Control Block.
Overwriting that data would cause chaos.

(I think this change is an improvement in Ada 95, and I think avoiding
the level of indirection is the best implementation, unless you're
trying to interface to an existing threads package (which is common).)

Another point is that a running task can refer to its own
discriminants.

The above two points seem to imply that we can't overwrite running
tasks.  That leaves two alternatives: (1) abort the task, then overwrite
it.  Sounds rather harsh.  Also, aborting doesn't kill the task -- it
just gets the task *started* on the long process of going away.
(2) Have the assignment hang, waiting until the overwritten task
terminates.  Doesn't sound very useful.

We can't allow assignment statements on limited types, but disallow them
on types containing tasks, because generic contract problems would rear
their ugly heads.

We could raise an exception on assignment statements if the object
contains tasks.  Yuck.

I don't like any of the above possible rules, and I can't think of any
others.

>...It seems minor compared to the
> benefit of making programming with limited types as similar as possible
> to programming with non-limited types, which is important given
> that a program written using non-limited types can flip over when
> a low-level component of some record type is changed to be limited.

It seems to me that allowing initialization of limited objects is
sufficient to make the transition feasible (in cases where it makes any
sense at all).  I'm not sure about that, though, and I'm waiting to hear
evidence to the contrary.

----

I just thought of another point:

For nonlimited types, giving a default for a discriminant means, oddly,
"unconstrained objects of this type can exist".  I.e., it is possible
to declare objects that can change size.  Some compilers choose to
allocate the max size for these; others use a heap-based
implementation.  (Compiler writers who choose the latter sneer at the
former, because it means certain records will always raise
Storage_Error.  Compiler writers who choose the former decry implicit
heap usage.  Too bad -- it causes real portability problems.)

But for *limited* types, giving a default for a discriminant means
something totally different -- it means you want a default value
(surprise, surprise).  And in the limited case, there is no excuse for a
compiler to allocate the max size, because the size cannot change.
One of the suggested implementations of Exception_Occurrence in
AARM-11.4.1(19.a...) depends on this fact:

    type Exception_Occurrence(Message_Length: Natural := 200) is
        limited record
            Id: Exception_Id;
            Message: String(1..Message_Length);
        end record;

The type is private.  The client who says:

    X: Exception_Occurrence;

does not see the discriminant.  The implementation should allocate 200
bytes for the characters of X.Message.  An implementation that allocates
2**31 bytes is broken.  And there is no need for extra indirections.

My point is: if we allow assignment_statements on limited types,
the size of X could change, thus breaking various compilers.
And if the compiler fix is to allocate the max, it would break
various programs (and break the implementation's support for
Exception_Occurrences).
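
A minimal sketch of the contrast described above, using hypothetical
names (Small, Message, L_Message) that are not taken from any real
library.  For the nonlimited type, whole-record assignment can change
the discriminant and hence the size; for the limited type, the default
is just an initial value that can never change afterwards:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Mutable_Demo is
   --  Hypothetical bounded discriminant subtype, so that an
   --  "allocate the max" compiler needs only 200 characters.
   subtype Small is Natural range 0 .. 200;

   --  Nonlimited: the default makes unconstrained objects possible.
   type Message (Length : Small := 0) is record
      Text : String (1 .. Length);
   end record;

   --  Limited: the default is only an initial value.
   type L_Message (Length : Small := 0) is limited record
      Text : String (1 .. Length);
   end record;

   M : Message;     --  unconstrained; Length may change on assignment
   X : L_Message;   --  Length is 0 for the whole lifetime of X
begin
   M := (Length => 3, Text => "abc");
   M := (Length => 5, Text => "hello");   --  the size of M changes here
   Put_Line (Natural'Image (M.Length));
   Put_Line (Natural'Image (X.Length));
end Mutable_Demo;
```

Since no assignment to X is possible, a compiler is free to allocate
exactly the storage that the initial discriminant requires.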

****************************************************************

From: Dan Eilers
Sent: Friday, December 12, 2003,  6:27 PM

Bob Duff wrote:
> I can't get my head around the idea that an assignment_statement could
> wipe out a preexisting (already initialized) limited object.  It seems
> to defeat the purpose of limitedness.  I admit that "Bob Duff doesn't
> like it" is a fairly weak argument.  ;-)

Well, I don't think the purpose of limitedness is to prevent objects
from being "wiped out" before their scope ends.  Rather, it's to prevent
a single object from being cloned into two objects.

>...
>The above two points seem to imply that we can't overwrite running
>tasks.  That leaves two alternatives: (1) abort the task, then overwrite
>it.  Sounds rather harsh.  Also, aborting doesn't kill the task -- it
>just gets the task *started* on the long process of going away.
>(2) Have the assignment hang, waiting until the overwritten task
>terminates.  Doesn't sound very useful.

We definitely shouldn't overwrite running tasks!
It seems perfectly fine to me to do what happens at the end of
a declare block that declares a task object, which I believe
is your option (2).

The usefulness doesn't so much come from the nifty new things
you can do with tasks.  It comes from eliminating arbitrary
restrictions on what you can do with limited objects that may or
may not contain tasks, such as limited generic formal parameters.

> My point is: if we allow assignment_statements on limited types,
> the size of X could change, thus breaking various compilers.

How about if we say that a constraint error is raised at runtime
if assignment of limited types changes a discriminant, analogous
to the way indefinite objects currently behave:

    declare
       x: string := "abc";
    begin
       x := "def";   -- wipes out previous value of x
       x := "defg";  -- raises constraint_error at runtime
    end;

****************************************************************

From: Tucker Taft
Sent: Tuesday, December 16, 2003,  3:25 PM

Trying to define the semantics for assignment statements
for limited types seems doomed.  Fundamentally assignments
involve creating a copy of the right hand side, after first
destroying the left hand side.  This really can't be done,
since it is not possible to copy a task or a protected object,
and for limited private types, who knows what sort of thing
the type represents -- it might represent a physical device of
some sort, a window on a screen, etc.

Even destroying the left hand side is not easy, if there are
tasks or protected objects.  Even unchecked deallocation
doesn't stop a task from running, though it renders certain
future internal references to discriminants of the task
into exception-generating events.

Other problems include dealing properly with access discriminants.
For limited types, they are allowed to point to local objects,
if the limited object is local.  If you were to assign the
local limited object to a global limited object, you could
be creating a dangling reference.

The whole point of limited is to disallow copying, so it
really makes no sense to try to then define what copying
means.  The user can of course write an Assign procedure,
and in some conceivable future revision we could allow you
to call such a procedure using the ":=" syntax, but I'm
not holding my breath.  We've been down this path, and
the composability issues are thorny.
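
A minimal sketch of the user-written Assign approach, assuming a
hypothetical package Counters (none of these names come from the
proposal).  The "copy" is an ordinary procedure whose meaning the type's
author defines, so the language never has to say what copying a limited
object means:

```ada
package Counters is
   type Counter is limited private;

   procedure Increment (C : in out Counter);
   function Value (C : Counter) return Natural;

   --  User-defined "assignment": the author decides what is copied.
   procedure Assign (Target : in out Counter; Source : in Counter);
private
   type Counter is limited record
      Count : Natural := 0;
   end record;
end Counters;

package body Counters is
   procedure Increment (C : in out Counter) is
   begin
      C.Count := C.Count + 1;
   end Increment;

   function Value (C : Counter) return Natural is
   begin
      return C.Count;
   end Value;

   procedure Assign (Target : in out Counter; Source : in Counter) is
   begin
      --  Component-by-component copy, written out explicitly.
      Target.Count := Source.Count;
   end Assign;
end Counters;
```

A client then writes Counters.Assign (A, B) where a nonlimited type
would allow A := B.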

****************************************************************

From: Robert I. Eachus
Sent: Friday, December 12, 2003,  9:48 AM

Robert A Duff wrote:
> Robert Eachus replied:
>> Huh?  Oh.  You don't use it because you don't use limited tagged
>> types.  This feature will become much more useful and less
>> 'misguided' if we can initialize objects of types derived from
>> Limited_Controlled more easily.
>
> I don't get it.
>
> I *do* use limited tagged types.  I also use tagged types that I wish
> were limited, but I made them nonlimited so I could use aggregates and
> the like.
>
> These return-by-reference things seem pointless to me.  What they're
> "really" doing is returning a pointer to a preexisting limited object.
> If I want to do that, I tend to use an explicit access type.
>
> Can you (or somebody) show me an example where a return-by-reference
> function is better than a function returning an access value?

If you take that attitude I can't.   Seriously, the advantage is that
there doesn't need to be a user visible access type.

The area where I find them quite handy is in dealing with databases.
You can put the database "behind" limited objects, and have lookup
functions that are return by reference, and inquiry functions or
procedures that take the by reference parameter and produce a report or
whatever:  Print_Report(Lookup_Drivers(State => "MA", Sex => Male));

The advantage is that not having to have access types cuts the number of
types floating around in half.  Yes, you know that the inquiry is
returning an array of limited records, and there really is a pointer
involved.  But it is not explicit, so you both don't have to worry about
the type, and don't have to worry about deallocation issues.

> Note that return-by-reference always returns a constant object,
> so you can't do much with it.
>
> I think I've used this feature exactly once in the last few years,
> and I can't remember why.  I do remember that I thought it necessary
> to put a comment in, explaining that this depends on the weird
> return-by-reference nature of that thing.

As I said, it is the problems with initializing limited objects that
make them harder to use than they need to be.  But between constructor
functions that can be called where the object is declared, and unknown
discriminants that ensure that a constructor must be called, they
become much more convenient.  The fact that the actual value returned is
a constant is no problem, in fact if you are using limited types, you
want the return values to be constants.

> I see why in the component case the size is known before the call, and
> the memory is allocated before the call.  But I don't see why that needs
> to be true in other cases, if we're willing to pass in a thunk, which
> might do some allocation, or might return the address of some
> preallocated memory.
>
> The thunk is needed whenever the result subtype is indefinite.
> Do you plan to disallow that case?
>
> If the compiler is willing to generate multiple copies of the code for
> constructor functions, it could optimize away the thunks.
>
> Likewise, if the constructor is inlined (which I suspect will be
> common), the thunk can be inlined, too.

Don't declare success and go home too soon.  But I have probably been
guilty of not trying to explain what ends up going on.  In the normal
case, a record component can't be indefinite.  If it depends on a
discriminant, the discriminant is part of the object as a whole.  But
there is a special case where a component depends on a discriminant
which is not a discriminant of the containing object.  The simple
example is a record with a default discriminant, and a record with
several components of this type.  To give the gory example:

type Varying_String (Length: Integer := 0) is
record
   Contents: String(1..Length);
end record;

type Big_Object is record
 A, B, C: Varying_String;
end record;

Some compilers support such types only by "allocating the maximum."
These compilers generate Storage_Error if you declare an object of type
Big_Object.  These compilers don't have to change their implementation
for this proposal to work.  They will lay out such records as they
currently do, and some will raise Storage_Error.  But....

> Randy said:
>
>> Trying to lay out all possible record types contiguously is a fool's
>> game.
>>
>
> I believe that's a common implementation.  I think AdaMagic and GNAT
> both do that, for example.  I agree with Tuck: we don't want to change
> record layouts; the always-contiguous assumption is fundamental
> (in the compilers that use it).

Some compilers recognize such types as "special" and go the extra mile
in laying them out non-contiguously.  If you do that, as Randy says you
have to deal with the consequences.  So compilers that currently allow
such types can continue to do so.  The implementation of the new
proposal will be slightly more complex than for the "allocate the max"
type, but only because they have both contiguous records, and objects
which are allocated in pieces.

That's why we have been discussing those cases as well.  But unless Ada
0Y requires compilers not to use the allocate the max strategy, nothing
will need to change in the way compilers currently lay out records.

****************************************************************

From: Randy Brukardt
Sent: Thursday, January 22, 2004,  7:32 PM

Bob Duff said (eons ago): [Actually, December 12th]
> Can you (or somebody) show me an example where a return-by-reference
> function is better than a function returning an access value?
>
> Note that return-by-reference always returns a constant object,
> so you can't do much with it.

We had to use these guys in Claw. We needed constants of a limited
controlled type, and the only way to do it in Ada 95 is to declare variable
objects in the body, initialize them in the initialization part, and have
functions return them. All we really wanted was deferred constants of a
limited type. (The details were discussed in the appendix to AI-287.)

Of course, AI-287 gives us what we really wanted in the first place, so
there is no longer any need for the return-by-reference functions in Claw.

Bob's comment makes me wonder if *all* uses of such functions are
essentially work-arounds to get constants of limited types. In that case,
perhaps we should be so bold as to simply drop the capability completely if
we allow real constructor functions. After all, the solution we're currently
discussing is incompatible with Ada 95 -- it requires the user to modify
their source code anyway. If they simply modified it to an appropriate
constant declaration (possibly initialized by a function, or by an
aggregate), or to return an access to the object, they'd get what they need.
And we wouldn't have to add a useless new feature to the language solely for
compatibility with what looks like (in 20-20 hindsight) a useless feature.

****************************************************************

From: Tucker Taft
Sent: Tuesday, December 16, 2003,  3:09 PM

For those who didn't make it to the ARG meeting,
the proposal that was finally discussed relating to returning
limited types was as follows (a real AI will be forthcoming):

  1) Get the current capability of return-by-reference "out of the way"
     by giving it a special syntax:

     function Ret_By_Ref(X : Whatever) aliased return Lim_Type;

     Such an "aliased return" function can be used with any type, limited
     or non-limited, and it means that the function is returning
     a reference to an object whose accessibility level is no deeper
     than that of the function itself.

     An accessibility check is performed on the object associated with
     the expression of the return statement.

     A call on such a function provides a constant view of this object.

  2) Allow limited types to be returned by "normal" functions so long as
     the return expression is an aggregate, or a call on a (normal) function.
     If the type is "deep down" limited, then such a function must build
     the result in its final resting place.  If the underlying type is actually
     nonlimited, the implementation could use copying as part of returning, though of course
     minimizing copies is generally a good thing.

  3) Define a new construct called an "extended" return statement which can be used
     in "normal" functions, whether the result type is limited or non-limited.
     This construct allows the programmer to give a name to the object being
     returned, and includes a "do ... end" part where the object being returned
     can be further updated prior to return.  The syntax for an extended return is:

         RETURN identifier : [ALIASED] subtype_mark [:= expression] [DO
             handled_sequence_of_statements
         END RETURN];

     As mentioned in (2) above, this construct doesn't guarantee that no
     additional copies will be made unless the underlying result type
     is "really" limited.  But again, extra copies would be discouraged,
     and this construct is meant to aid in the goal of minimizing copies
     while still providing a chance to adjust the returned value using a sequence
     of statements.
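
As a sketch of how item (3) might look in use, assuming the syntax shown
above is adopted as written.  The package Devices, the type Device, and
the function Open are hypothetical names chosen for illustration:

```ada
package Devices is
   type Device is limited private;

   function Open (Id : Natural) return Device;
   function Id_Of (D : Device) return Natural;
private
   type Device is limited record
      Id     : Natural := 0;
      Opened : Boolean := False;
   end record;
end Devices;

package body Devices is
   function Open (Id : Natural) return Device is
   begin
      --  Result names the object being returned; the "do" part
      --  updates it before the function returns.
      return Result : Device do
         Result.Id     := Id;
         Result.Opened := True;
      end return;
   end Open;

   function Id_Of (D : Device) return Natural is
   begin
      return D.Id;
   end Id_Of;
end Devices;
```

A caller could then declare D : Devices.Device := Devices.Open (3);
and, because Device is "really" limited, the result would have to be
built directly in D rather than copied into it.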


The implementation burden of supporting returning limited types which must
be built "in place" was a concern.  It was concluded that imposing a restriction
that if the full type was "really" limited, then the result subtype must be
constrained, would help significantly.  Although this restriction is a bit
annoying, it is something which could be lifted in a future revision, or
a later stage of this revision process, if the benefit was felt to outweigh
the implementation cost.  The AI will be written up in a way that will
outline the wording both with and without this restriction, and hopefully
vendors will do a bit more analysis on the implementation burden associated
with build-in-place for unconstrained result subtypes, while potential
users will assess the relative value of the full versus restricted capability.

Note that the major advantage of constrained subtypes is that the caller
can always take care of allocation, and the caller can also do task
initialization, which includes presumably adding the task to an activation
list.  Another advantage is that it eliminates accessibility checks associated
with creating an object with an access discriminant, since any access discriminant
must be constrained ahead of time.  Of course the down side is also that
the discriminant values, if any, must be constrained ahead of time. ;-)

It turns out that the proposal of allowing anonymous access result types,
in its most general form (where the caller provides a storage pool if
the function allocates a new object), has essentially the same problem
as supporting unconstrained result subtypes for limited function returns,
in that it requires the caller to provide a storage pool, activation list, etc.
There is a subset capability for anonymous access result types as well,
where the storage pool is not provided by the caller, but instead
the function must qualify the return expression with a named access type
if it is going to return an allocator.

Probably it would make sense to support the full capability of both,
or the subset capability of both (or no capability at all), given
the similarity in implementation burden.  Note that it is not just
the storage pool, but also the activation list (and perhaps master)
for any tasks, and accessibility level checks for access discriminants,
that come into play with the fully general capability.  The AI will
lay out these implementation issues in so far as possible.

****************************************************************

From: Robert I. Eachus
Sent: Tuesday, December 16, 2003,  7:09 PM

Tucker Taft wrote:

>For those who didn't make it to the ARG meeting,
>the proposal that was finally discussed relating to returning
>limited types was as follows (a real AI will be forthcoming)...

Sounds like a lot of progress, and I first want to say that from a
functionality viewpoint it seems fine.  However, I think that it is
going to turn out to be more difficult (impossible is more difficult,
right?) to write a legality rule that implements the decision, than to
support all unconstrained cases.  And the discussions so far should have
made it clear that doing that has a potential distributed overhead.
The proposal can certainly be written up as allowing the implementation
to decide whether or not to allow unconstrained return values for types
that may be "deep down" limited.

The first painful case of course is a function that returns a class-wide
value of a limited tagged type.  It should be okay to allow the compiler
to look at the specific types that can be returned when compiling the
body and reject non-conforming return statements:

type Foo (<>) is tagged limited private;
...

type My_Foo is new Foo with....
...
function Create(Some_Param: Integer) return Foo'Class is
...
   if Some_Param = 7
   then return My_Foo(7); -- Ok.
   elsif Some_Param = 8
   then return Aliased_Return_Function(8); -- Oops!
   end if;
   ...

I think I can see how to word an Implementation Permission that covers this:

"An implementation may reject a function return with a limited result
type if the view of the result value is limited and the result subtype
is unconstrained or if the view is limited and the return value would
need to be copied."

Argh!  Tough, but at least we can give examples.  What if there are
several return statements in the function?  I think this takes care of
that.  If none of the values that could be returned are "really limited"
you are fine.  I don't see how to have some return values that are
"really limited" and others that are not, if the return subtype is
constrained.

The problem is that this doesn't cover what I see as the most painful
case.  If you have a component of a record which has a default
discriminant, and the component does not depend on a discriminant of the
record type, what then?  We are back to allocating the max before the
call, or Alsys style mutant records.  This is okay, as long as the
component itself is not inherently limited:

task type Bar is.... end Bar;

type Bar_Array is array(Natural range <>) of Bar;

type Bar_List (D: Natural := 3) is record
   Contents: Bar_Array(1..D);
end record;

type Problem_Record (X: Integer) is record
   ...
   List: Bar_List;
   ...
end record;

Now I am not suggesting that we require any compiler to provide a useful
implementation of this type, and I am not really worried about a
constructor for it.  But define "subtype Bounded is Integer range
-10..10;" and make the subtype of X Bounded, and we have to decide if
we want compilers to be forced to make this work.  Personally I wouldn't
mind outlawing this case and having the useful unconstrained
constructors allowed.  My point is that this example doesn't seem to be
outlawed by the current proposed rule.

Let's try again:  "An implementation may reject a function return
statement with a limited result type if the view of the result value is
limited and the result subtype, or the subtype of any limited
subcomponent, is unconstrained or if the view of the return subtype is
limited and the return value would need to be copied."

Enough to chew on for now...

****************************************************************

From: Tucker Taft
Sent: Tuesday, December 16, 2003,  8:09 PM

I don't think we want to make this an implementation permission.
Either we allow unconstrained result types or we don't.
Of course compiler implementors can experiment with extensions,
as they have already, but users should be protected from unintentionally
using extensions that will render their code non-portable.

****************************************************************

From: Robert I. Eachus
Sent: Tuesday, December 16, 2003,  9:03 PM

My point in writing it as an implementation permission was not to make
it possible for different compilers to support different realistic
cases.  My point was that judging the ragged edge really has to be done
by the compiler, not by the user, and at the point where the body of the
function is being compiled.  The idea was that this is an
"Implementation Permission" for compilers not to implement some
otherwise impossible to implement bodies.  Sure, if the compilers were
required to support these cases, they could pass extra parameters on all
function calls "just in case."  What we want, it seems to me, is to say
that compilers must implement the cases that can be done without
distributed overhead, and are allowed to reject bodies because that
overhead is missing.

Incidentally, I didn't mention this in my previous post, which was complex
enough.  But if we do go with this version of the proposal we need to
decide for all language defined limited types whether or not they are
"really" limited.   Not doing that would result in significant
differences between implementations.  The two most obvious cases are
File_Type in the various IO packages and Limited_Controlled.

****************************************************************

From: Tucker Taft
Sent: Tuesday, December 16, 2003, 10:31 PM

If we impose the restriction, it should be based
on properties of the type known at the end of the package
spec where the function is declared.  (The RM-ese would be
"somewhere within the immediate scope of the function declaration.")
It should not depend on the properties of components or ancestors
which are not visible at that point.  Hence, it should not be based on
some implementation-ish thing like return-by-reference.
We are trying to get away from that privacy-breaking rule.

Note that there is no need for it to be something
visible at the point of the *spec* of the function.
Any error can be deferred until the full type is completely
defined, presuming it is defined in the same package
as the function.

****************************************************************

From: Robert I. Eachus
Sent: Wednesday, December 17, 2003,  2:09 AM

No.  There will be legality rules, etc. for the function specification,
but I thought the first example I gave was sufficient to show that there
must be a rule on limited function return VALUES.  Go back and look at
the first example again.  If you have a 'funky' return by reference
function that is returning an existing 'really' limited object either as
the return value or as part of an aggregate or subcomponent of a record
type, you are stuck.

The second example is a similar case.  If you have a 'really' limited
subcomponent, and it is part of a record that depends on a discriminant
other than a discriminant of the return type, the object size, or even
the number of tasks it contains, cannot be determined by the caller.

What if you say, well we just won't allow any return subtype that may
have such a subcomponent?  You end up throwing most of the potential
useful functionality away.  In particular, you could never define a
constructor that returned a classwide type, because some unit not
visible from the unit where the constructor is declared could create a
"really limited" class member, even though the constructor as written
will never return a value of that nonexistent type.

You (Tucker) are right that there are going to be return subtypes where
the compiler can detect that it is not possible to have a legal return
statement according to this rule.  Fine, the compiler can print a
warning message.  But in 99% or more of the potentially outlawed cases,
you will have to look at the return statements to decide whether the
function is legal.  If you are really having trouble getting this, I can
probably gin up another example tomorrow where a function has two return
statements, one acceptable under this rule and one that is not.

I just don't want to lose all those cases due to some misguided
"contract model" issues.   The contract with the user of the type and
the constructor for the usual case where the type is (limited) private
is that the constructor will return a valid value of the return
subtype.  Period.  There are going to be legal values of that subtype
that cannot be returned without the infrastructure you are trying to
avoid.  We certainly don't want to require that overhead in all cases,
just to allow for return values that may never exist.   The contract
that the author of the constructor has is different.  He is the one that
has to design the type so that the constructor can return a legal (and
meaningful) value.  If this requires user written indirection which is
hidden from the user of the package (and the ADT) that should be fine.
In fact, that will be nice in many cases that now look pretty ugly.

Right now, the contract with the user is the result subtype.  There is
no requirement that all possible values of the result TYPE must be legal
in the return statement.  In fact most of the Dynamic Semantics of 6.5
talks about the process and the checks made at run-time to determine if
a return value of the return type matches the return subtype.  As I
see it, the proposal is to add an additional rule or rules that belong
in the same place, but check whether the return statement will require
copying of a "really limited" value.  A value of a type declared as
limited, especially in the case of limited tagged types, may or may not
be one of these "really limited" values.  I just can't imagine how you
can do that legality check in a package spec, instead of at run-time
when the body is executed in some cases.

****************************************************************

From: Robert A. Duff
Sent: Wednesday, December 17, 2003, 10:09 AM

> type My_Foo is new Foo with....
> ...
> function Create(Some_Param: Integer) return Foo'Class is

I wasn't at the meeting, but I just talked with Tuck about it, and
I believe the intent was to make the above function illegal (in the
more-restricted version of the proposal).  The goal of the restriction
is that the call site knows the discriminants/bounds, and therefore the
size of the result object, so it can allocate the object, fill in the
discriminants, put the tasks on their various lists, and so forth.

There is no need to break privateness or look at bodies to implement the
restriction.  And there is certainly no need for
implementation-definedness in this area.

****************************************************************

From: Robert I. Eachus
Sent: Wednesday, December 17, 2003, 11:30 AM

Let me pull out an earlier example to show why I see problems with this
restriction.  Right now this code compiles:

----------------------------------------------------------------------

with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

package Unbounded_String_Utilities is

  type String_Array(<>) is limited private;
  type String_List is limited private;

  function Size (Arr: String_Array) return Natural;
  function Size (List: String_List) return Natural;

  function To_String_Array(List: in String_List) return String_Array;

private

  type List_Node;
  type String_List is access List_Node;
  type String_Array is array(Natural range <>) of Unbounded_String;

  type List_Node is record
    Value: Unbounded_String;
    Next: String_List;
  end record;

end Unbounded_String_Utilities;

package body Unbounded_String_Utilities is

  function Size (Arr: String_Array) return Natural is
  begin return Arr'Length; end Size;

  function Size (List: String_List) return Natural is
    Count: Natural := 0;
    Temp: String_List := List;
  begin
    while Temp /= null loop
      Temp := Temp.Next;
      Count := Count + 1;
    end loop;
    return Count;
  end Size;

  function To_String_Array(List: in String_List) return String_Array is
    Result: String_Array(1..10);
    Temp: String_List := List;
  begin
    if Temp = null then return Result(1..0); end if;
    for I in 1..10 loop
      Result(I) := Temp.Value;
      if Temp.Next = null then return Result(1..I); end if;
      Temp := Temp.Next;  -- advance to the next node
    end loop;
    return Result & To_String_Array(Temp);
  end To_String_Array;

end Unbounded_String_Utilities;

----------------------------------------------------------------------


Of course, I can't currently call To_String_Array in a scope where the
return type is limited.  As I understand the new proposed rule, the
intent is that I will be able to do so.  String_Array is a type that is
not "really" limited, so the compiler can generate code for an
initialization expression which is a call to To_String_Array outside the
scope where String_Array is not limited.

If the new restriction makes it impossible to create limited objects of
a type with unknown discriminants, then I oppose it.  But as I
understand what Tucker is saying, the check in this case--and similar
cases--should be made on the function declaration.  If this currently
legal code becomes illegal that is a significant upward
incompatibility.  (Right now, the function is legal, but there are
limitations on where it can be used in initialization expressions.)

The case being discussed above is slightly different than this
example--the type with unknown discriminants would be a classwide type
instead of an array type in this example.  But the problem remains the same.

****************************************************************

From: Tucker Taft
Sent: Wednesday, December 17, 2003, 11:51 AM

> Let me pull out an earlier example to show why I see problems with this
> restriction.  Right now this code compiles:

And it will continue to do so.  As I indicated, the check would *not*
be made at the point of the function declaration, but rather at the
end of the enclosing package spec, where the full type is completely defined
(presuming the limited type is defined in the same package).

Since the full type isn't limited, there is no restriction.

...
> If the new restriction makes it impossible to create limited objects of
> a type with unknown discriminants, then I oppose it.

It doesn't, presuming the full type is not limited.

> ... But as I
> understand what Tucker is saying, the check in this case--and similar
> cases--should be made on the function declaration.

I said it would *not* be made at the function declaration, but rather
at the end of the enclosing package spec (or to be RM-ish,
it would be illegal if the result subtype is unconstrained and the
type is limited throughout the immediate scope of the function, or
something like that).

****************************************************************

