Version 1.6 of ais/ai-00318.txt

!standard 03.03.01 (02)          03-06-23 AI95-00318/01
!standard 06.05.00 (17)
!standard 06.05.00 (18)
!class amendment 02-10-09
!status work item 03-05-23
!status received 02-10-09
!priority Medium
!difficulty Medium
!subject Returning [limited] objects without copying
!summary
New syntax is proposed for identifying the object that will be returned from a function, allowing the object to be built in the context of the caller, without further copying required.
This could be used to support returning limited objects from a function, to support returning objects of an anonymous access type, and more generally to reduce the copying that might be required when a function returns a complex object, a controlled object, etc.
!problem
We already have a proposal for allowing aggregates of a limited type, by requiring that the aggregate be built directly in the target object, rather than being copied into the target.
But aggregates can only be used with non-private types, so objects of limited private types still could not be initialized at their point of declaration. It would be natural to allow functions to return limited objects, so long as the object could be built directly in the "target" of the function call, which could be a newly created object being initialized, or simply a parameter to another subprogram call.
We have also considered allowing functions to return anonymous access types. In this case, if the function returned an allocator, it would be natural for the caller context to determine the storage pool to be used by the allocator.
Whether the function returns a limited type or an anonymous access type, it may be desirable to perform some other initialization on the object after it has been created, but before returning from the function. This is difficult to do while still creating the object directly in its "final" location.
!proposal
When declaring a local variable inside a function (not including within a nested program unit), the variable may be declared to be a "return" object, using the following syntax (analogous to the syntax used for constants):
identifier : [ALIASED] RETURN subtype_indication [:= expression];
Within the scope of a return object (except within nested program units), no other return objects may be declared, and all return statements must have the name of the return object as their returned expression.
[Possible alternative: return statements in the scope of a "return" object must omit the returned expression, and be like the return statements of a procedure. One possible down side of omitting the name of the return object is that it makes the reader's job a bit harder; they have to look back to find the object being returned. One possible up side -- it is perhaps clearer that no copying is happening at the return statement.]
The return object would not be finalized prior to leaving the function. The caller would be responsible for its finalization.
This syntax would not be restricted to limited types. It could also be used for non-limited types. The implementation advice would be that the amount of copying, finalization, etc. should be reduced, if possible, as part of returning from the function. This could be particularly useful for functions that return large objects, or objects with controlled parts.
A call of a function with a limited result type could be used in the same contexts where we have proposed to allow aggregates of a limited type, namely contexts where a new object is being created (or can be).
1) Initializing a newly declared object (including a "return" object)
2) Default initialization of a record component
3) Initialized allocator
4) Component of an aggregate
5) IN formal object in a generic instantiation (including as a default)
6) Expression of a return statement
7) IN parameter in a function call (including as a default expression)
In addition, since the result of a function call is a name in Ada 95, the following contexts would be permitted, with the same semantics as creating a new temporary constant object, and then creating a reference to it:
8) Declaring an object that is the renaming of a function call.
9) Use of the function call as a prefix to 'Address
If we permit function result types to be anonymous access types (e.g. "function Blah return access T"), then we likely will want such functions, if they return the result of an allocator, to be able to use the context of the call to determine the storage pool for the allocator. This proposed syntax would allow the function to do the allocator in the "caller" context, but still be able to perform further initialization of the allocated object after the allocator. Essentially the "return" object would inherit the storage pool determined by the calling context, so that allocators that are used to initialize it, or that are assigned to it later, would use the caller-determined storage pool.
!wording
!example
Here is an example of a function with a limited result type using a "return" object:
    function Construct_Obj(Len : Natural) return Lim_Type is
        Result : return Lim_Type(Discrim => Len);  -- the "return" object
    begin
        -- Finish the initialization of the "return" object.
        for I in 1..Len loop
            Result.Data(I) := I;
        end loop;

        -- And now return it.
        return Result;
            -- [Alternative: omit "Result" (or entire return statement);
            --  "return Result;" would be implicit]
    end Construct_Obj;
Here is essentially the same function, but with an anonymous access type for its result type:
    function Construct_Obj(Len : Natural) return access Lim_Type is
        Result : return access Lim_Type;  -- The "return" object
    begin
        Result := new Lim_Type(Discrim => Len);
            -- this uses the storage pool determined by the caller context

        -- Finish the initialization of the allocated object
        for I in 1..Len loop
            Result.Data(I) := I;
        end loop;

        -- And now return it.
        return Result;
            -- [Alternative: omit "Result" (or entire return statement);
            --  "return Result;" would be implicit]
    end Construct_Obj;
By "caller context", we mean that the same rules as apply to an allocator would apply to calls on this function, where the expected (access) type would determine the storage pool:
    type My_Acc_Type is access Lim_Type;
    for My_Acc_Type'Storage_Pool use My_Amazing_Stg_Pool;

    P : My_Acc_Type;

    begin
        P := Construct_Obj(3);
            -- allocator inside Construct_Obj uses My_Amazing_Stg_Pool
!discussion
In meetings with Ada users, there has been a general sense that if limited aggregates are provided in Ada 200Y, it would be desirable to also provide limited function returns which could act as "constructor" functions.
Just allowing a function whose whole body is a return statement returning an aggregate (or another function call) does not give the programmer much flexibility. What they would like is to be able to create the object and then initialize it further somehow, perhaps by calling a procedure, doing a loop (as in the examples above), etc. This requires a named object. However, to avoid copying, we need this object to be created in its final "resting place," i.e. in the target of the function call. This might be in the "middle" of some enclosing composite object the caller is initializing, or it might be in the heap, or it might be a stand-alone local object.
Because the implementation needs to create the returned object in a place or a storage pool determined by the caller, it is important that the declaration of the object be distinguished in some way. By using the keyword "return" in its declaration, we have a fairly intuitive way for the programmer to indicate that this is the object to be returned. Clearly we only want to allow one of these at a time, and to require that all return statements within its scope explicitly (or perhaps implicitly) return that object.
Because it may be necessary to do some computing before deciding exactly how the return object should be declared, we permit the return object to be declared within nested blocks within the function so long as there is no return object for the function already in scope. So different branches of an if or case statement could declare their "own" return object if appropriate, for example.
Note that we have allowed the user to declare the return object as "aliased." This seems like a natural thing which might be wanted, so you could initialize a circularly-linked list header to point at itself, etc.
We had considered a different syntax for this before, namely a new kind of return statement, analogous to an accept statement, e.g.:
    return Result : T := blah do
        Result.Data(3) := 77;
        ...
    end Result;
However, Bob Duff pointed out that for simple cases you ended up with two levels of nesting which seemed excessive:
    function Fum() return T is
    begin
        return Result : T := blah do
            Result.Data(3) := 77;
            ...
        end Result;
    end Fum;
Making a smaller change to the object declaration syntax seemed a simpler approach.
POSSIBLE IMPLEMENTATION APPROACHES
The implementation approach for anonymous access result types is very similar to that for limited result types. In the following, we will mostly talk about limited result types. Towards the end we will explain how it applies to anonymous access result types.
Full accessibility level checking adds to the complexity. At the end we will show how to introduce restrictions that eliminate most of this complexity, in exchange for some loss in functionality.
The implementation of this for limited result types is straightforward if the size of the result is known to the caller. It is essentially equivalent to a procedure with an OUT parameter -- the caller allocates space for the target object, and passes its address to the called routine, which uses it for the "return" object.
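The equivalence with an OUT parameter can be written out by hand in Ada 95 as it stands today. The following is only a sketch of the run-time model, not of the proposed syntax: Lim_Type and Int_Array are assumed declarations chosen to match the example in the !example section, and Build_In_Place_Demo is a hypothetical name. The caller declares the target, so it knows the size, and the called routine fills it in without any copy.

```ada
procedure Build_In_Place_Demo is

   --  Assumed declarations, modeled on the !example section.
   type Int_Array is array (Positive range <>) of Integer;

   type Lim_Type (Discrim : Natural) is limited record
      Data : Int_Array (1 .. Discrim);
   end record;

   --  Hand-written stand-in for "function Construct_Obj ... return Lim_Type":
   --  the result object is the caller's object, passed by reference.
   procedure Construct_Obj (Result : out Lim_Type) is
   begin
      for I in 1 .. Result.Discrim loop
         Result.Data (I) := I;
      end loop;
   end Construct_Obj;

   --  The caller allocates the target; its size is known here.
   Target : Lim_Type (Discrim => 3);

begin
   Construct_Obj (Target);  --  Target is initialized in place
end Build_In_Place_Demo;
```

What the procedure form cannot express is the use of the call as an initialization expression, which is the point of the proposal.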
If the size of the function result is not known to the caller (i.e. the function result subtype is unconstrained, and perhaps indefinite), then there are two basic possibilities:
1) The target object's (nominal) subtype is constrained (or at least
"definite"), even though the function result subtype is unconstrained; the target object might be a component of a larger object.
2) The target object's nominal subtype is unconstrained, and its size
is to be determined by the result returned from the function; the target object must be a stand-alone object, or an "entire" heap object.
In the first case, the caller determines the size of the target object and can allocate space for it; in the second, the caller cannot preallocate space for the target object, and must rely on the called routine allocating space for it in an "appropriate" place.
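The two cases already arise today for nonlimited types, which may make the distinction easier to see. The sketch below uses String purely as a familiar unconstrained subtype; since String is nonlimited, copying is permitted here, so this illustrates only who knows the size of the target, not the no-copy requirement. Two_Cases and Make are hypothetical names.

```ada
procedure Two_Cases is

   --  The result subtype (String) is unconstrained.
   function Make (Len : Natural) return String is
      Result : String (1 .. Len) := (others => 'x');
   begin
      return Result;
   end Make;

   --  Case 1: the target's nominal subtype is constrained, so the
   --  caller knows the size and can set aside the space beforehand.
   Fixed : String (1 .. 5) := Make (5);

   --  Case 2: the target's nominal subtype is unconstrained; its
   --  bounds, and hence its size, come from the function result.
   Sized_By_Result : constant String := Make (8);

begin
   null;
end Two_Cases;
```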
The code for the called routine must handle both of these cases. One reasonable way to do so is for the caller to provide a "storage pool" for the result. In the first case, this storage "pool" has space for exactly one object of a given maximum size. Its allocate routine is trivial. It just checks to see if the size is no greater than the space available, and then returns the preallocated (target) address.
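A one-object "pool" of this kind could be written as an ordinary user-defined storage pool. The sketch below is an illustration only: the names One_Slot_Pools, One_Slot_Pool, Capacity and Target are hypothetical, and an implementation would presumably do the equivalent internally rather than through System.Storage_Pools.

```ada
with System;
with System.Storage_Elements; use System.Storage_Elements;
with System.Storage_Pools;

package One_Slot_Pools is

   --  Hypothetical pool wrapping space the caller has already set aside.
   --  The client sets Target to the address of that space.
   type One_Slot_Pool (Capacity : Storage_Count) is
     new System.Storage_Pools.Root_Storage_Pool with
   record
      Target : System.Address;
   end record;

   procedure Allocate
     (Pool                     : in out One_Slot_Pool;
      Storage_Address          : out System.Address;
      Size_In_Storage_Elements : in Storage_Count;
      Alignment                : in Storage_Count);

   procedure Deallocate
     (Pool                     : in out One_Slot_Pool;
      Storage_Address          : in System.Address;
      Size_In_Storage_Elements : in Storage_Count;
      Alignment                : in Storage_Count);

   function Storage_Size (Pool : One_Slot_Pool) return Storage_Count;

end One_Slot_Pools;

package body One_Slot_Pools is

   procedure Allocate
     (Pool                     : in out One_Slot_Pool;
      Storage_Address          : out System.Address;
      Size_In_Storage_Elements : in Storage_Count;
      Alignment                : in Storage_Count) is
   begin
      --  Check that the object fits in the space available...
      if Size_In_Storage_Elements > Pool.Capacity then
         raise Storage_Error;
      end if;
      --  ...and hand back the preallocated (target) address.
      Storage_Address := Pool.Target;
   end Allocate;

   procedure Deallocate
     (Pool                     : in out One_Slot_Pool;
      Storage_Address          : in System.Address;
      Size_In_Storage_Elements : in Storage_Count;
      Alignment                : in Storage_Count) is
   begin
      null;  --  The space belongs to the caller; nothing to release.
   end Deallocate;

   function Storage_Size (Pool : One_Slot_Pool) return Storage_Count is
   begin
      return Pool.Capacity;
   end Storage_Size;

end One_Slot_Pools;
```

This sketch ignores the Alignment parameter and does not guard against a second call to Allocate; a real one-slot pool would have to decide what to do in both situations.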
In the second case, the storage pool is either the storage pool associated with the initialized allocator at the call site, or a storage pool that represents a secondary stack, or equivalent, used for returning objects of unknown size from a function.
For upward compatibility, we would need to accommodate functions that return pre-existing objects by reference. One way to do this would be for the caller to provide an additional implicit boolean parameter which would indicate whether the called routine must create a new object, or could return a reference to an existing object.
Of the nine places identified above where calls on functions with limited result type would be permitted, the cases where the called routine must create a new object are (1)-(5). Cases (6)-(9) allow the use of preexisting objects, so the storage pool provided would generally be the secondary stack if the size is unknown to the caller, or a preallocated primary stack area, if the size of the object returned is always the same. Case (6), where a return statement returns the result of a function call, is a bit of a halfway situation. For (6), the storage pool provided as part of the call in the return statement would be the same storage pool passed to the function.
When the boolean flag indicates that a new object is not required, the called routine could return a reference to a preexisting object, and ignore the storage pool or target address provided. As a possible optimization, this case could be indicated by simply providing a null storage pool parameter, rather than a separate boolean flag. The called routine would take this to mean that the secondary stack, or equivalent, should be used if a new object is being created, but that it may return a reference to a preexisting object. For the simplest implementation model where the size of the result is always known to the caller, and no storage pool parameter is provided, a separate flag would probably be necessary. The net effect is that there would be one implicit parameter in both situations, a boolean flag for the known-size function result, and a possibly-null storage pool for the unknown-size function result.
In all cases, the called routine would return the address of the result, whether newly created or preexisting. The caller would use this returned address in all cases where the function result might be a preexisting object (cases (6)-(9)), or in cases where the caller didn't preallocate space for the target.
IMPLEMENTATION APPROACH FOR ANONYMOUS ACCESS RESULT TYPE
For anonymous access result types, a very similar approach would be taken. In this case, however, a new object is never required. It would always be permissible to return an access value designating a preexisting object. The storage pool parameter would always be required, but the caller could always ignore it. An accessibility level would be needed associated with the storage pool, so the called routine would know the accessibility level of the result of an allocator that used the storage pool. An accessibility level would also need to be returned, so the caller would know the accessibility of the result.
Although the RM talks about accessibility levels in terms of dynamic levels of nesting, most implementations use accessibility levels that correspond to static levels of nesting, but adjust the level when passing a given (formal) access parameter to a more nested subprogram with an access parameter as well, by collapsing deeper static levels into a level that corresponds to the static level of the given formal access parameter's declaration. This is explained in AARM 3.10.2(22.x-22.ee).
Unfortunately, this "collapsing" of levels loses information. So when passing accessibility levels to and from a function with an anonymous access type result, it would be desirable to avoid "collapsing" such levels, and use the original accessibility levels. In some implementations it might be helpful for the caller to provide the called routine with a level for the called routine to use for its own locals, which is guaranteed to be deeper than any level number that the caller cares about.
LIMITED TYPES AND ACCESSIBILITY ISSUES
Because limited types can have access discriminants, and an accessibility check is required when an allocator for such a type is performed to be sure the allocated object doesn't outlive the object referenced by the access discriminant, some kind of accessibility level will also have to be provided to the called routine when a storage pool is provided, at least when the result type has access discriminants. Because the storage pool will often be local to the caller and the access discriminant might be specified via an access parameter to the function, the collapsing of accessibility levels mentioned above would have to be suppressed in this case as well.
Hence we end up with a general rule that when access parameters are passed to a function with a limited result type (with access discriminants), or with an anonymous access result type, no collapsing of accessibility levels is performed. The caller's accessibility levels are used in the access parameters, and in the storage pool. The called routine has to accommodate this somehow. Again, in some implementations it may be helpful for the caller to provide the called routine with an accessibility level it can use for its own locals that is certain to be deeper than any other level passed in from the caller.
POSSIBLE SIMPLIFICATIONS OF ACCESSIBILITY CHECKING
If we would like some of these capabilities, but would like to avoid dealing with uncollapsed accessibility levels, accessibility levels associated with storage pools, etc., then we could make some restrictions that might simplify the implementation (though of course it would complexify the user's model a bit):
1) If a limited result type has access discriminants, then
the storage pool passed in must not outlive the function declaration. This would imply that the function could safely set the access discriminants to point to objects with an accessibility level no deeper than the function declaration. This is similar to the test performed on return-by-reference now (6.5(17-20)).
With this restriction, no accessibility level needs to be passed in with the storage pool for limited result types.
Note also that with this restriction, calls on local functions could not be used within initialized allocators for global access types, if the function's result type is a limited type with access discriminants (doesn't seem like much of a loss).
2) For anonymous access result types, if the storage pool were
not used inside the function, the accessibility of the returned access value must be no deeper than that of the function declaration (e.g., it could not return the value of an access parameter passed in, unless the access parameter designated an object global to the function). Again, this is essentially the check performed now for return-by-reference types.
If the storage pool is used, then naturally the accessibility is that of the storage pool, so the caller knows that the maximum accessibility depth of the result is the depth of the storage pool or the depth of the function declaration, whichever is deeper.
If the designated type of the anonymous access type is a limited type with access discriminants, then the same restriction as (1) would apply to the storage pool, i.e. that the storage pool depth must be no less than that of the function declaration.
With these restrictions, no accessibility level needs to be passed in with the storage pool for anonymous access result types, and in turn no level would be returned (without the storage pool level passed in, it would be pretty much impossible to pass it back!).
Note that without the level being returned, a local function could not be used to create a value assigned to a variable of a global access type, since the function might return a pointer designating a local object, and it has no way of indicating that.
CONCLUSION
If we are willing to accept the restrictions of the above section, then the implementation burden is roughly the same for either returning limited types or returning anonymous access types, namely that a storage pool may need to be passed in. The called routine needs to use that storage pool when creating a limited return object, or evaluating an allocator whose target is the anonymous access return object. If the return object is itself initialized by a function call, then the storage pool needs to be passed into that function as well, presuming that function also returns a limited type or an anonymous access type.
If a user doesn't explicitly declare a return object, then each return statement is equivalent to a local block that declares a return object initialized from the return expression, and then returns it.
If we don't want to accept the restrictions given above, then accessibility levels need to be passed with the storage pool, and the accessibility levels passed with access parameters should not be "collapsed." An accessibility level would be returned from a function with an anonymous access result type.
Note that an additional advantage of the restricted form is that more accessibility checking can be performed at compile-time, and it will generally involve less run-time overhead.
Given this, it seems appropriate to consider the restricted (compile-time accessibility) form of the proposal first, and only if this is felt sufficiently valuable, to consider the unrestricted form of the proposal.
!ACATS test
!appendix

From: Robert Duff
Sent: Wednesday, February 6, 2002,  6:05 PM

                    Limited Types Considered Limited

One of my homework assignments was to propose changes to "fix" limited
types.  This e-mail outlines the problems, and proposed solutions.
I haven't written down every detail -- I first want to find out if there
is any interest in going forward with these changes.  Some of these
ideas came from Tucker.

I apologize for doing my homework at the last minute.  I hope folks will
have a chance to read this before the meeting, and I hope Pascal is
willing to put it on the agenda.

Ada's limited types allow programmers to express the idea that "copying
values of this type does not make sense".  This is a very useful
capability; after all, the whole point of a compile-time type system is
to allow programmers to formally express which operations do and do not
make sense for each type.

Unfortunately, Ada places certain limitations on limited types that have
nothing to do with the prevention of copying.  The primary example is
aggregates: the programmer is forced to choose between the benefits of
aggregates (full coverage checking) and the benefits of limited types.
Forcing programmers to choose between two features that ought to be
orthogonal is one of the most frustrating aspects of Ada.

I consider the full coverage rules (for aggregates and case statements)
to be one of the primary benefits of Ada over many other languages,
especially with type extensions, where some components are inherited
from elsewhere.  I will refrain from further singing the praises of full
coverage; I assume I'm preaching to the choir.

My goals are:

    - Allow aggregates of limited types.

    - Allow constructor functions that return limited types.

    - Allow initialization of limited objects.

    - Allow limited constants.

    - Allow subtype_marks in aggregates more generally.
      (They are currently allowed only for the parent part in an
      extension aggregate.)

The basic idea is that there is nothing wrong with constructing a
limited object; *copying* is the evil thing.  One should be allowed to
create a new object (whether it be a standalone object, or a formal
parameter in a call, or whatever), and initialize that object with a
function call or an aggregate.  In implementation terms, the result
object of the function call, or the aggregate, is built in place in its
final destination -- no copying is necessary, or allowed.

All of the above goals except constructor functions are fairly trivial
to achieve, both in terms of language design and in terms of
implementation effort.  Constructor functions are somewhat more
involved.  However, I am against any language design that allows
aggregates where function calls are not allowed; subprogram calls are
perhaps the single most important tool of abstraction ever invented!
(There is at least one other such case in Ada, and I hate it.)

By "constructor function", I mean a function that returns an object
created local to the function, as opposed to an object that already
existed before the function was called.

Ada currently allows functions to return limited types in two cases,
neither of which achieves the goal here:

    If the limited type "becomes nonlimited" (for example, a limited
    private type whose full type is integer), then constructor functions
    are allowed, but the return involves a copy, thus defeating the
    purpose of limited types.  Anyway, this feature is not allowed for
    various types, such as tagged types.

    If the limited type does not become nonlimited, then it is returned
    by reference, and the returned object must exist prior to the
    function call; it cannot be created by the function.  In essence,
    these functions don't return limited objects at all; they simply
    return a pointer to a preexisting limited object (or perhaps
    a heap object).

We need a new kind of function that constructs a new limited object
inside of itself, and returns that object to be used in the
initialization of some outer object.  The run-time model is that the
caller allocates the object, and passes in a pointer to that object.
The function builds its result in that place; thus, no copying is done
on return.

Because the run-time model for calls to these constructor functions is
different from that of existing functions that return a limited type, we
need to indicate this syntactically on the spec of the function.  In
particular, change the syntax of functions so that "return" can be
replaced by "out", indicating a constructor function.  In addition,
change the syntax of object declarations to allow "out", as in
"X: out T;"; this marks the object as "the result object" of a limited
constructor function.  The reason for "out" is that these things behave
much like parameters of mode 'out'.

Examples:

    type T is tagged limited
        record
            X: ...;
            Y: ...;
            Z: ...;
        end record;

    type Ptr is access T'Class;

    Object_1: constant T := (X => ..., Y => ..., Z => ...);

    function F(X: ...; Y: ...) out T;

    function F(X: ...; Y: ...) out T is
        Result: out T := (X => X, Y => Y, Z => ...);
    begin
        ... -- possible modifications of Result.
        return Result;
    end F;

    Object_2: Ptr := new T'(F(X => ..., Y => ...));
        -- Build a limited object in the heap.

Rules:

Change the rules in 4.3 to allow limited aggregates.  This basically
means erasing the word "nonlimited" in a few places.

Change the rule in 3.3.1(5) about initializing objects to allow limited
types.  But require the expression to be an aggregate or a constructor
function.  ("X: T := Y;", where Y is a limited object, remains illegal,
because that would necessarily involve a copy.)  There are various
analogous rules (initialized allocators, subexpressions of aggregates,
&c) that need analogous changes.

Assignment statements remain illegal for limited types, even if the
right-hand side is an aggregate or limited constructor function.

Allowing constants falls out from the other rules.

Allow a component expression in an aggregate to be a subtype_mark.  This
means that the component is created as a default-initialized object.
It's essentially the same thing we already allow in an extension
aggregate; we're simply generalizing it to all components of all
aggregates.  This is important, in case some part of the type is
private.  There is no reason to limit this capability to limited types.

Specify that limited aggregates are built "in place"; there is always a
newly-created object provided by the context.  Note that we already have
one case where aggregates are built in place: (nonlimited) controlled
aggregates.  Similarly, the result of a limited constructor function is
built in place; the context of the call provides a newly-created object.
(In the case of "X: T := F(...);", where F says "return G(...);", F will
receive the address of X, and simply pass it on to G.)

If the result object of a limited constructor function contains tasks,
the master is the caller.

For a function whose result is declared "out T", T must be a limited
type; such a function is defined to be a "limited constructor function".

Subtype T must be definite.  This rule is not semantically necessary.
However, the run-time model calls for the caller to allocate the result
object, and this rule allows the caller to know its size before the
call.  Without this rule, a different run-time model would be required
for indefinite subtypes: the called function would have to allocate the
result in the heap, and return a pointer to it.  A design principle of
Ada is to avoid language rules that require implicit heap allocation;
hence this rule.  (An alternative rule would be that T must be
constrained if composite, thus eliminating defaulted discriminants.)

A limited constructor function must have exactly one return statement.
The expression must be one of the following:

    - An object local to the function (possibly nested in block
      statements), declared with "out".

    - A function call to a limited constructor function.

    - An aggregate.

    - A parenthesized or qualified expression of one of these.

An object declared "out" must be local to a limited constructor
function.

A constraint check is needed on creation of a local "out" object.
We have to do the check early (as opposed to the usual check on the
return statement), because we need to make sure the object fits in the
place where it belongs (at the call site).  If the return expression is
an aggregate, that needs a constraint check, as usual.  If the return
expression is a function call, then that function will do whatever
checking is necessary.

Is there an issue with dispatching-on-result functions?
I don't think so.


Compatibility:

This change is not upward compatible.  Consider:

    type Lim is limited
        record
            Comp: Integer;
        end record;

    type Not_Lim is
        record
            Comp: Integer;
        end record;

    procedure P(X: Lim);
    procedure P(X: Not_Lim);

    P((Comp => 123));

The call to P is currently legal, and calls P(Not_Lim).  In the new
language, this call will be ambiguous.

This seems like a tolerable incompatibility.  It is always caught at
compile time, and cases where nonlimitedness is used to resolve
overloading have got to be vanishingly rare.  The above program, though
legal, is highly confusing, and I can't imagine anybody wanting to do
that.  The current rule was a mistake in the first place: even if
limited aggregates *should* be illegal, that should not be a Name
Resolution Rule.


Other advantages:

One advantage of this change is that it makes the usage style of limited
types more uniform with nonlimited types, thus making the language more
accessible to beginners.

How do you construct an object in Ada?  You call a function.  Cool -- no
need for the kludginess of C++ constructors.  But if it's limited, you
have to fool about with discriminants -- not something that would
naturally occur to a beginner.  And discriminants have various annoying
restrictions when used for this purpose.

How do you capture the result of a function call?  You put it in a
constant: "X: constant T := F(...);".  But if it's limited, you have to
*rename* it: "X: T renames F(...);".  Again, that's not something that
would naturally occur to a beginner -- and the beginner would rightly
look upon it as a "trick" or a "workaround".

Another point is that the current rules force you into the heap,
unnecessarily.  You end up passing around pointers to limited objects,
either explicitly or implicitly, which tends to add complexity to one's
programs.

Limited types offer other advantages in addition to lack of copying:
access discriminants, and the ability to take 'Access of the "current
instance".  It seems a shame to require the programmer to choose between
these and aggregates.


Alternatives:

It is not strictly necessary to mark the result object with "out"; the
compiler could deduce this information by looking at the return
statement(s).  However, marking the object simplifies the compiler -- it
needs to treat this object specially by using the space allocated by the
caller.

It is not necessary to limit the number of return statements to 1.
However, it seems simplest.  We need to prevent things like this:

    function F(...) out T is
        Result_1: out T;
        Result_2: out T;
    begin
        Result_1.X := 123;
        Result_2.X := 456;
        if ... then
            return Result_1;
        else
            return Result_2;
        end if;
    end F;

because we can't allocate Result_1 and Result_2 in the same place!
On the other hand, the following could work:

    function F(...) out T is
        Result: out T;
    begin
        if ... then
            return Result;
        else
            return Result;
        end if;
    end F;

suggesting a rule that all return statements must refer to the *same*
object.  But this could work, too:

    function F(...) out T is
    begin
        if ... then
            return (...); -- aggregate
        elsif ... then
            return G(...); -- function call
        elsif ... then
            declare
                Result_1: out T;
            begin
                Result_1.X := 123;
                return Result_1; -- local
            end;
        else
            declare
                Result_2: out T;
            begin
                Result_2.X := 456;
                return Result_2; -- different local
            end;
        end if;
    end F;

because only one of the four different result objects exists at any
given time.  I'm not sure how much to relax this rule.  Perhaps some
rule about declaring only one of these special result objects in a given
region?

****************************************************************

From: Tucker Taft
Sent: Thursday, February 7, 2002,  6:15 AM

I would allow these "constructor" functions for any kind of
type.  I would require that at most one OUT local variable
be declared.  It must be in the outermost declarative part
(to avoid it going out of scope before it was returned), and
that all return statements must identify the OUT local variable
if present (or perhaps, all return statements must omit the
return expression completely if present).

Allowing the declaration of an "OUT" local variable might
be generalized to normal functions (with the same restrictions
as above).  For a "normal" function, the OUT local variable
would represent the returned object, but with the difference that
the space for it is allocated by the called routine, rather than
the caller.  This allows the Pascal "style" of assigning to
a return object, when it is appropriate.

****************************************************************

From: Robert Duff
Sent: Thursday, February 7, 2002,  7:52 AM

> I would allow these "constructor" functions for any kind of
> type.

You mean for nonlimited as well as limited?  Sounds OK.

I assume you do not mean to allow them for unknown-size subtypes.

>...  I would require that at most one OUT local variable
> be declared.  It must be in the outermost declarative part
> (to avoid it going out of scope before it was returned),

I don't see why that's necessary.  The OUT variable doesn't *really* go
away -- it's really just a view of the object created by the caller.
No action is required to return it -- it's already in the right spot,
so a return of it is simply a "goto end of function".

>... and
> that all return statements must identify the OUT local variable
> if present (or perhaps, all return statements must omit the
> return expression completely if present).

A less restrictive rule is that all return statements that refer to a
variable must refer to the OUT variable.  Does that not work?

> Allowing the declaration of an "OUT" local variable might
> be generalized to normal functions (with the same restrictions
> as above).

OK.

>...  For a "normal" function, the OUT local variable
> would represent the returned object, but with the difference that
> the space for it is allocated by the called routine, rather than
> the caller.  This allows the Pascal "style" of assigning to
> a return object, when it is appropriate.

One question is: what if the function doesn't execute a return
statement?  That's currently erroneous.  But if there's an OUT variable,
it would seem sensible to just let the function fall off the end, and
have the OUT variable define the result.

Which implies that we should eliminate the rule requiring at least one
return statement, in the case where there's an OUT variable.

I like it!

I presume the OUT object can actually be a constant or variable, by the
way.

****************************************************************

From: Tucker Taft
Sent: Thursday, February 7, 2002,  8:28 AM

> > I would allow these "constructor" functions for any kind of
> > type.
>
> You mean for nonlimited as well as limited?  Sounds OK.

Yes, that's what I meant.  And since some limited types
become non-limited, I think this may be necessary.

> I assume you do not mean to allow them for unknown-size subtypes.

Right.  They should have exactly the same restrictions,
independent of whether the type is limited or nonlimited,
and the caller should always allocate space for the
returned object.  That way for limited types that
become non-limited, we don't run into any weirdness.

This also means that one can switch between limited and non-limited
with minimal semantic disruption, and you don't have to remember
a lot of special cases for limited vs. non-limited (I presume
that is one of the key goals of this proposal).

> >...  I would require that at most one OUT local variable
> > be declared.  It must be in the outermost declarative part
> > (to avoid it going out of scope before it was returned),
>
> I don't see why that's necessary.  The OUT variable doesn't *really* go
> away -- it's really just a view of the object created by the caller.
> No action is required to return it -- it's already in the right spot,
> so a return of it is simply a "goto end of function".

I don't see how that could work.  Consider the following:

    type Lim_T(B : Boolean := False) is record
        case B is
            when True => X : Task_Type;
            when False => null;
        end case;
    end record;


    function Cons(...) out Lim_T is
    begin
        declare
             Result : out Lim_T(True);
        begin
             null;
        end;
        return Lim_T'(B => False);
    end Cons;

Are you going to require a "return" on all paths out of the declare block?
What if an exception is propagated from the declare block, and then in
the handler there is a return of something with a different discriminant?

> >... and
> > that all return statements must identify the OUT local variable
> > if present (or perhaps, all return statements must omit the
> > return expression completely if present).
>
> A less restrictive rule is that all return statements that refer to a
> variable must refer to the OUT variable.  Does that not work?

No; see above.  I think once you create the OUT local, that
must be the one that gets returned.  Hence, it is probably
simplest if there is an OUT local declared, to eliminate
the use of return expressions, or even the requirement for
an explicit "return" statement, so it would work like
a procedure with one OUT parameter.

> > Allowing the declaration of an "OUT" local variable might
> > be generalized to normal functions (with the same restrictions
> > as above).
>
> OK.
>
> >...  For a "normal" function, the OUT local variable
> > would represent the returned object, but with the difference that
> > the space for it is allocated by the called routine, rather than
> > the caller.  This allows the Pascal "style" of assigning to
> > a return object, when it is appropriate.
>
> One question is: what if the function doesn't execute a return
> statement?  That's currently erroneous.  But if there's an OUT variable,
> it would seem sensible to just let the function fall off the end, and
> have the OUT variable define the result.
>
> Which implies that we should eliminate the rule requiring at least one
> return statement, in the case where there's an OUT variable.
>
> I like it!

In my view, this would be desirable only if we eliminate all return
expressions from return statements for constructor functions with a
local OUT object, making them obey the same rules as procedures
thereafter.

> I presume the OUT object can actually be a constant or variable, by the
> way.

I would not allow it to be a constant.  The word "OUT" would take
the place of the word "CONSTANT" in the syntax, the way I see it.
It seems nearly useless to have it be a constant, and clearly,
the caller could use it to initialize a variable, so calling it a constant
could be misleading.

****************************************************************

From: Robert Duff
Sent: Thursday, February 7, 2002,  8:57 AM

> I don't see how that could work.

Neither do I.  ;-)

> No; see above.  I think once you create the OUT local, that
> must be the one that gets returned.  Hence, it is probably
> simplest if there is an OUT local declared, to eliminate
> the use of return expressions, or even the requirement for
> an explicit "return" statement, so it would work like
> a procedure with one OUT parameter.

Agreed.

> In my view, this would be desirable only if we eliminate all return
> expressions from return statements for constructor functions with a
> local OUT object, making them obey the same rules as procedures
> thereafter.

Yes, that makes sense.

> I would not allow it to be a constant.  The word "OUT" would take
> the place of the word "CONSTANT" in the syntax, the way I see it.
> It seems nearly useless to have it be a constant, and clearly,
> the caller could use it to initialize a variable, so calling it a constant
> could be misleading.

Yes, of course.  I wasn't thinking clearly.

****************************************************************

From: Robert Dewar
Sent: Thursday, February 7, 2002,  9:02 AM

I must say that for me, this entire proposal seems to be insufficiently
grounded in real requirements. I am concerned that the ARG is starting to
wander around in the realm of nice-to-have-neat-language-extensions which
are really rather irrelevant to the future success of Ada. I am not opposed
to a few extensions in areas where a really important marketplace need has
been demonstrated, but the burden for new extensions should be extremely
high in my view, and this extension seems to fall far short of meeting
that burden.

****************************************************************

From: Randy Brukardt
Sent: Thursday, February 7, 2002,  2:19 PM

I hate to be agreeing with Robert here :-), but he's right.

There is a problem worth solving here (the inability to have constants of
limited types), but that could adequately be solved simply by the 'in-place'
construction of aggregates (which we already require in similar contexts).
[I'll post a real-world example of the problem in my next message.] The problem
is relatively limited, and thus the solution also has to be limited, or it
isn't worth it. This whole business of constructor functions will only sink any
attempt to fix the real problem, because it is just too big of a change at this
point.

Bob's concerns about the purity of the language would make sense in a new
language design, but we're working with limited resources here, and simple
solutions are preferred over perfect ones.

****************************************************************

From: Randy Brukardt
Sent: Thursday, February 7, 2002,  3:05 PM

Here is an example that came up in Claw where we really wanted constants of a
limited type:

The Windows registry contains a bunch of predefined "keys", along with user
defined keys. Our original design for the key type was something like (these
types were all private, and the constants were deferred, but I've left that out
for clarity):

    type Key_Type is new Ada.Finalization.Limited_Controlled with record
      Handle : Claw.Win32.HKey := Claw.Win32.Null_HKey;
      Predefined_Key : Boolean := False; -- Is this a predefined key?
                                         -- (Only valid if Handle is not null)
      -- other components.
    end record;

    Classes_Root : constant Key_Type := (Ada.Finalization.Limited_Controlled with
       Handle => 16#80000000#, -- Windows magic number
       Predefined_Key => True, ...);
    Current_User : constant Key_Type := (Ada.Finalization.Limited_Controlled with
       Handle => 16#80000001#, -- Windows magic number
       Predefined_Key => True, ...);
    -- And several more like this.

    procedure Open (Key    : in out Key_Type;
                    Parent : in Key_Type;
                    Name   : in String);

    procedure Close (Key   : in out Key_Type);

    procedure Put (Root_Key   : in Key_Type;
                   Subkey     : in String;
                   Value_Name : in String;
                   Item       : in String);

    -- and so on..

Of course, our favorite compiler rejected the constants as illegal.

So, they were turned into functions.

   function Classes_Root return Key_Type;
   function Current_User return Key_Type;

However, these have the problem that they have to be overridden for any
extensions of the type (as they are primitive). We could have put them into a
child/nested package (to make them not primitive), but that would bend the
structure of the design even further and add an extra package for no good
reason. We also could have made them class-wide, but that would be a misleading
specification, as they can never return anything other than Key_Type. So we
left them in the main package.

Aside: we originally wanted to use these as default parameters for some of the
various primitive routines. However, that would be illegal by 3.9.2(11) unless
they are primitive functions. This rule exists so that the default makes sense
in inherited primitives. But we really would have preferred that the default
expressions weren't inherited; they only make sense on the base routines. That
is a problem that probably isn't worth solving though.

Of course, now that we had functions, we had to implement them. The first try
was:

   function Classes_Root return Key_Type is
   begin
       return (Ada.Finalization.Limited_Controlled with
           Handle => 16#80000000#, -- Windows magic number
           Predefined_Key => True, ...);
   end Classes_Root;

But our friendly compiler told us that THIS was illegal, because this is a
return-by-reference type, and the aggregate doesn't have the required
accessibility.

So we had to add a library package-level constant and return that:

   Standard_Classes_Root : constant Key_Type := (Ada.Finalization.Limited_Controlled with
       Handle => 16#80000000#, -- Windows magic number
       Predefined_Key => True, ...);

   function Classes_Root return Key_Type is
   begin
       return Standard_Classes_Root;
   end Classes_Root;

But of course THAT is illegal (it's the original problem all over again), so we
had to turn that into a variable and initialize it component-by-component in
the package body's elaboration code:

   Standard_Classes_Root : Key_Type;

   function Classes_Root return Key_Type is
   begin
       return Standard_Classes_Root;
   end Classes_Root;

begin
   Standard_Classes_Root.Handle := 16#80000000#; -- Windows magic number
   Standard_Classes_Root.Predefined_Key := True;
   ...

Which is essentially how it is today.

This turned into such a mess that we gave up deriving from it altogether, and
created an entirely new higher-level abstraction to provide the most commonly
used operations in an easy to use form. Thus, we ended up losing out on the
benefits of O-O programming here.

I certainly hope that newcomers to Ada don't run into a problem like this,
because it is a classic "stupid language" problem.

Simply having a way to initialize a limited constant with an aggregate would be
sufficient to fix this problem. "Constructor functions" might add
orthogonality, but seem unnecessary to solve the problem of being able to have
constants as part of an abstraction's specification.

****************************************************************

From: Robert Duff
Sent: Friday, February 8, 2002, 10:12 AM

> Simply having a way to initialize a limited constant with an aggregate would
> be sufficient to fix this problem. "Constructor functions" might add
> orthogonality, but seem unnecessary to solve the problem of being able to
> have constants as part of an abstraction's specification.

Surely you don't mean that we would allow limited aggregates only for
initializing stand-alone constants?!  Surely, you could use them to
initialize variables.  And if they can be used to initialize variables,
surely initialized allocators should be allowed.  And of course,
parameters.

In *my* programs, much of the data is heap-allocated.  I want to say:

    X: Some_Ptr := new T'(...);

when T is limited.  Allowing only constants would solve about 1% of *my*
problem.

Are you saying that this is illegal:

    P(T'(...));

and I have to instead write:

    Temp: constant T := (...);

    P(Temp);

?!  That sort of arbitrary restriction is what makes people laugh at the
language.
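
[Editor's note: a minimal sketch of the three uses argued for above: a
limited constant, an initialized allocator, and an aggregate passed directly
as a parameter. The type T, the access type T_Ptr and the procedure P are
hypothetical stand-ins for the elided declarations, and the sketch assumes a
compiler that accepts limited aggregates built in place (Ada 2005 or later);
it is not legal Ada 95.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Limited_Aggregate_Demo is

   --  Hypothetical limited type, for illustration only.
   type T is limited record
      Id : Integer;
   end record;

   type T_Ptr is access T;

   --  Hypothetical consumer of a limited object.
   procedure P (Item : in T) is
   begin
      Put_Line ("Id =" & Integer'Image (Item.Id));
   end P;

   --  A limited constant initialized from an aggregate.
   C : constant T := (Id => 1);

   --  An initialized allocator of a limited type.
   X : constant T_Ptr := new T'(Id => 2);

begin
   P (C);
   P (X.all);
   --  The aggregate is passed directly; no named temporary is needed.
   P (T'(Id => 3));
end Limited_Aggregate_Demo;
```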

****************************************************************

From: Dan Eilers
Sent: Friday, February 8, 2002, 12:12 PM

Bob Duff wrote:
> My goals are:
>
>     - Allow aggregates of limited types.
>
>     - Allow constructor functions that return limited types.
>
>     - Allow initialization of limited objects.
>
>     - Allow limited constants.
>
>     - Allow subtype_marks in aggregates more generally.
>       (They are currently allowed only for the parent part in an
>       extension aggregate.)

Tuck wrote:
> I would allow these "constructor" functions for any kind of
> type.

I agree that the non-limited case is also important, and should be
listed as an explicit goal of the AI.  The non-limited case is an
efficiency issue, where a programmer wishes to prevent unnecessary
copying of large objects implied by the semantics of aggregates and
function calls.


Tuck wrote:
> ... so it would work like a procedure with one OUT parameter.

The proposal seems to go to a lot of trouble to define a new
kind of function that behaves exactly like a procedure with
one OUT parameter.  I think there may be a simpler solution
involving an extension to function renaming.  The constructor
function would be declared as a procedure with one OUT parameter,
and then renamed (allowing it to be called) as a function, using
the type of the OUT parameter as its return type.

Example:

        procedure p(x: some_type; y: out return_type);

        function f(x: some_type) return return_type renames p;


> > I assume you do not mean to allow them for unknown-size subtypes.
>
> Right.  They should have exactly the same restrictions,
> independent of whether the type is limited or nonlimited,
> and the caller should always allocate space for the
> returned object.  That way for limited types that
> become non-limited, we don't run into any weirdness.

There are many cases, such as the string concat function, where
the return type is declared to be unconstrained (unknown-size),
but really its size is a function of the sizes of the parameters
and therefore computable before the call.  It might facilitate
allocating space in the caller if Ada had a way of expressing the
size of the return type in terms of the parameters, possibly
using the proposed new assertion mechanism (AI-00286).

****************************************************************

From: Tucker Taft
Sent: Friday, February 8, 2002,  5:39 PM

This is an intriguing idea.  Clearly less syntax invention
than the "out" function return idea.  We actually perform the
transformation implied (from a procedure with an OUT parameter to
a function) as an optimization, when the OUT parameter is of an
elementary type.  Also there is a DEC-Ada pragma which imports an
external function as an Ada procedure, because the external function
had OUT parameters in addition to its return value.  I believe GNAT
supports this pragma.

Unfortunately, I'm not sure the above would work for the case when
you want to use an aggregate as the return expression.  Also, a procedure
presumes its OUT parameter is already at least default-initialized.
With the constructor function, the initialization was deferred until
entering the function, where it could be initialization from an aggregate
or from some other function call.

Another approach is to use the proposal for anonymous access
types as return types, which for (access-to) limited types has many of
the same advantages as the constructor function concept (see AI 231 for
details).

****************************************************************

From: Tucker Taft
Sent: Friday, February 8, 2002,  5:48 PM

Given the relative complexity of the constructor function
concept compared to the other de-limiting ideas, I would propose
we split the AI.  The simpler one would allow:

   1) Aggregates of limited objects, with use of a subtype_mark to mean
      default init of a component;
   2) Explicit initialization of limited objects, both in a declaration and
      an allocator, from an aggregate, with the aggregate built "in place" in
      the target.

The more complex one would address functions constructing limited objects
on behalf of the caller.

The aggregate one seems very straightforward.  Almost just eliminate
the existing restriction, presuming that compilers have already learned
how to build aggregates "in place" for controlled types.

The function one looks like a lot of work.

****************************************************************

From: Tucker Taft
Sent: Sunday, January 12, 2003,  7:55 PM

When we presented some of the Ada 200Y ideas
at SIGAda, there was a feeling that if we
added support for aggregates of a limited
type, we should also have function returns.
Bob and I don't feel the two need to be tied
that closely together, but they both go in the
category of making limited types less limited.

In any case, I got to thinking about the problem
more, and wrote the following note to Bob
describing a "brainstorm" I had a couple of nights
ago.  Bob said I might as well forward this to
the full ARG for comments.  He hasn't decided whether
he will incorporate it into an AI on limited function
returns....

So, fire away.
-Tuck

---------

Bob,
I realized sometime after I gave my quick
response on constructor functions, that I had
forgotten about one of the main challenges, namely
the desire to execute some statements (assignments,
procedure calls, etc.) to initialize the object
being returned, before actually returning it.
If the only thing you can do is return an
aggregate or another function call, you don't
have much flexibility, and there is no way to
insert a call on a procedure.

Which got me to thinking about the various special
naming conventions we had talked about for local
variables which *must* be returned, all of which
were unsatisfactory, kludgey, inelegant, etc.

And then suddenly the idea came to me that if you
could attach the statements directly to the
return statement, that would be nice.

E.g. something like:
     return blah with {statement} end return;

But then you need a name for the object being returned,
so that led to something like:

     return X : blah with {statement} end return;

And then I thought, what construct in Ada already has
an optional set of statements following it?  The
"accept" statement.  So why not try to make use of the
lonely "do" reserved word.  Also, it seems odd to have
a name on an expression, so let's make it a regular declaration
with a subtype indication as well.  This leads to:

    return_statement ::=
       RETURN ;
     | RETURN expr ;
     | RETURN identifier : subtype_indication [:= expr]
         [ DO
              handled_sequence_of_statements
           END [identifier] ;
         ]

For example:

     function Cool(B : Boolean) return Variant_Rec is
     begin
        if B then
            return Result : Variant_Rec(True) do
                Fixup(Result);
            end Result;
        else
            return Result : Variant_Rec(False) do
                Different_Fixup(Result);
            end Result;
        end if;
     end Cool;

With this construct, we could allow limited function
returns, where either the second form of "return" statement
is used and the expr is an aggregate or a function call
(or a reference to a long-lived existing object, per Ada 95),
or the third form of the "return" statement is used, and
pretty much anything goes, since you are clearly creating
a new object.

This construct would also make it possible to support
a result type that was an anonymous access type.  E.g.:

     function Cooler(Blah : access T) return access U is
     begin
         return Result : access U := new U(discrim => blah) do
             Result.Fum := Blah;
         end Result;
     end Cooler;

and the caller could determine the storage pool used for
"Result" from context:

     X : U_Ptr := Cooler(Something'Access);

In fact, limited function returns and anonymous access-type returns
could be seen as almost the same thing.  To implement both,
the caller has to pass in a storage pool, and an accessibility
level.  The called routine can either use that storage pool
(and its associated finalization/dependent-task list, if needed),
or it can return an access/reference to an existing object, so
long as it satisfies the accessibility check.  There might
be an accessibility level indication or some other flag that
means the return object *must* be newly allocated out of the
given storage pool.  The compiler would also have to implicitly
create a local storage pool to be passed in when the result
of the function call is used to initialize a local variable.

I suppose one could get even more radical, and allow this
"do ... end identifier;" at the end of any declaration that
is declaring only one object (i.e. isn't "X, Y : Blah").
This would solve the old problem of making sure any initialization
procedures that need to be called get connected tightly to
the declaration.  But that problem could probably be solved
better by having limited functions with the "return ... do ... end ID;"
construct, so I think I will keep this more radical suggestion
to myself ;-).

So, that was my great "brainstorm" last night (well, actually
this morning when I couldn't sleep...).  As they used to
say during the 9X project, I'll go don my flak jacket now,
so feel free to fire away.
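
[Editor's note: a minimal, self-contained sketch of the third form of return
statement described above, applied to a limited type. It assumes the
spelling later standardized in Ada 2005, which closes the statement with
"end return" rather than "end Result" as proposed in this message. The type
Counter and the subprograms Fixup and Make are hypothetical names chosen for
illustration.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Build_In_Place_Demo is

   --  Hypothetical limited type; it cannot be copied by assignment.
   type Counter is limited record
      Value : Integer := 0;
   end record;

   --  Stands in for the "Fixup" procedure of the example above.
   procedure Fixup (C : in out Counter) is
   begin
      C.Value := C.Value + 1;
   end Fixup;

   function Make (Start : Integer) return Counter is
   begin
      --  Result names the object being returned; the statements after
      --  "do" run before the function returns.
      return Result : Counter do
         Result.Value := Start;
         Fixup (Result);
      end return;
   end Make;

   --  The function result initializes C directly.
   C : Counter := Make (41);

begin
   Put_Line (Integer'Image (C.Value));
end Build_In_Place_Demo;
```

The statements attached to the return play the role that a separate
initialization procedure call would otherwise have to play after the
declaration of C.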

****************************************************************

From: Dan Eilers
Sent: Monday, January 13, 2003,  1:01 PM

Tuck's proposal looks interesting to me.

I particularly like the idea of somehow solving "the old problem
of making sure any initialization procedures that need to be called
get connected tightly to the declaration."

Being able to attach initialization code to declarations is useful
for a variety of reasons, including eliminating the overhead of
default initialization code where explicit initialization is
provided later, and making sure the declaration is initialized
before its first use.

****************************************************************

From: Tucker Taft
Sent: Friday, May 23, 2003,  8:05 AM

I believe I floated a "trial balloon" a month or so
ago about a syntax to support returning objects of
a limited type from a function.  Bob Duff pointed
out that it created yet another level of nesting in
simple cases.  Also, it involved a completely new
syntactic construct (return ... do ... end), which
seems excessive.  So here is a revamped proposal, now
structured as a "real" AI.

[Editor's note: this is version /01 of the AI.]

****************************************************************

From: Randy Brukardt
Sent: Friday, May 23, 2003,  1:22 PM

My gut reaction is that your trial balloon syntax is much preferred:
-- Using different syntax for return means that it is clear that this is not
the "normal" return with copying;
-- There is no need to look all over the source code to find the return
object;
-- There is no set of complex rules to guarantee that only one return object
is available in a given context.

The 'weight' of the extra syntax is pretty similar either way (an entire new
kind of object declaration doesn't seem "light" to me either).

So, given the advantages of the original syntax, and since the only
identified problem is an extra level of nesting (who cares?), I much prefer
that alternative. (I'm unconvinced that we can afford the complexity of any
of these proposals, but that's a different issue altogether. I don't like
the idea that the compiler has to be able to determine at run-time whether a
call is 'build-in-place' or 'existing object reference'; that seems like
substantial overhead, and I would expect 'build-in-place' to be commonly
used.)

****************************************************************

From: Tucker Taft
Sent: Friday, May 23, 2003,  1:50 PM

Can you explain this a bit more?  The called routine knows
whether it is returning an existing object or a new object,
so I don't see extra overhead there.  I suppose it has to
check whether the caller allowed returning an existing object,
but that is just a simple test, certainly cheaper than the
average constraint check.

The caller shouldn't care, since it can always use the address
returned from the called function, whether or not it created
a new object.

What is the source of the overhead that I am missing?

****************************************************************

From: Randy Brukardt
Sent: Friday, May 23, 2003,  3:26 PM

According to your write-up, the caller has to pass into the function some or
all of:
 -- A flag as to whether an object return is allowed;
 -- The address of the memory to build the return in;
 -- A storage pool;
 -- An accessibility level.
Obviously, there is going to be overhead to build and pass these things.
(Parameter passing isn't free!) Even if these are all packed into a
descriptor, initializing that descriptor is going to take a bunch of
instructions. That's especially true if a storage pool is created on the fly
for the call (which your writeup suggested in some cases).

So, such function calls look quite a bit more expensive than the similar
aggregates or the currently existing function calls. It's not likely to be
hundreds of times worse, but it's pretty complicated and certainly will slow
down these calls. That might matter for a few existing programs.

****************************************************************

From: Tucker Taft
Sent: Friday, May 23, 2003,  5:19 PM

If we adopt the restriction that eliminates run-time accessibility issues
for this proposal, then what the AI suggested was as follows:

1) If the function had a known-size result, then the caller would
   preallocate the space, and pass this address in the usual way for a
   function that returned a "large" result.  In addition, a flag
   would be passed to indicate whether the function was allowed
   to simply return the address of a preexisting object.  If so,
   then the caller would expect the function to return the address
   of the result, which could be the preallocated space, or the
   preexisting object's address.

   In this case, the only extra overhead for typical implementations
   would be the extra boolean flag, and the test against it.

2) If the function had an unknown-size result, and hence would normally
   have to allocate the result on a secondary stack or heap, then the
   caller would pass in a storage pool and a boolean flag (or a possibly-null
   storage pool).  The storage pool is one of the following:
     a) a "normal" storage pool, presuming the function call is used
        as the expression of an initialized allocator
     b) a special "secondary stack" storage pool, presumably which could
        be precreated by the run-time system
     c) an on-the-fly constructed "preallocated-space" storage pool, which
        at a minimum would consist of:
          i) a tag identifying it as one of these kinds of storage pools
          ii) an address of the preallocated space
          iii) the length of the preallocated space

Case (2c) seems like the only one that involves measurable extra work at the
call-site.  Presuming the storage pool is allocated on the primary stack
then at the call-site you would have at least 3 instructions to initialize
the storage pool (assignments of the tag, address, and length), probably
more like 6 for the typical RISC machine.  Then you would have to pass
the address of the storage pool as an implicit parameter.

So I agree there would be some overhead, but by using the storage-pool
"abstraction" for cases (2a,2b,2c) and a simple boolean flag for (1),
the total amount seems pretty modest.
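
[Editor's note: The following is a minimal sketch of case (1), written by
hand as ordinary Ada rather than as compiler-generated code. The names
Big, Existing, Make, Result_Space and May_Return_Existing are hypothetical
and do not come from the AI; the address overlay merely stands in for
"build the result in the space the caller preallocated".]

```ada
with Ada.Text_IO; use Ada.Text_IO;
with System;      use System;

procedure Known_Size_Convention_Demo is

   type Big is record
      Id   : Integer;
      Data : String (1 .. 16);
   end record;

   --  A preexisting object that the function may hand back by address.
   Existing : Big := (Id => 1, Data => (others => 'E'));

   --  Hand-written stand-in for what a compiler might generate:
   --  Result_Space is the caller's preallocated space, and the flag says
   --  whether returning the address of an existing object is allowed.
   function Make
     (Result_Space        : Address;
      May_Return_Existing : Boolean) return Address is
   begin
      if May_Return_Existing then
         return Existing'Address;
      end if;

      declare
         --  Overlay the result on the caller's space, so no copy is needed.
         Result : Big;
         for Result'Address use Result_Space;
      begin
         Result := (Id => 2, Data => (others => 'N'));
      end;
      return Result_Space;
   end Make;

   Temp  : Big;      --  space preallocated by the "caller"
   Where : Address;

begin
   --  Context that needs a new object: the preallocated space is used.
   Where := Make (Temp'Address, May_Return_Existing => False);
   Put_Line ("result is in caller's space: "
             & Boolean'Image (Where = Temp'Address));

   --  Context that tolerates an existing object: its address comes back.
   Where := Make (Temp'Address, May_Return_Existing => True);
   Put_Line ("result is the existing object: "
             & Boolean'Image (Where = Existing'Address));
end Known_Size_Convention_Demo;
```

In this model the only per-call cost beyond the usual "large result"
convention is the extra Boolean parameter and the test against it, which
is the overhead described for case (1).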

Just to be more precise about the preallocate-space storage pool, here
is a sample implementation of such a beast:

   type Preallocated_Space_Storage_Pool(Addr : Integer_Address; Max_Size : Storage_Count) is
     new Root_Storage_Pool with null record;

   procedure Allocate(
     Pool : in out Preallocated_Space_Storage_Pool;
     Storage_Address : out Address;
     Size_In_Storage_Elements : in Storage_Count;
     Alignment : in Storage_Count) is
   begin
       if Size_In_Storage_Elements > Pool.Max_Size then
           raise Storage_Error;
       end if;
       Storage_Address := To_Address(Pool.Addr);
   end;

   procedure Deallocate(...) is begin raise Program_Error; end;
   function Storage_Size(Pool : ...) return Storage_Count is
   begin return Pool.Max_Size; end;

A local object of this type would need to be created at the call-site and
passed as an implicit parameter when the space for the object is preallocated
by the caller (case 2c above).
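
[Editor's note: The following is a compilable sketch of how such a pool
might be set up at a call site. It assumes an Ada 2005 or later compiler
(the pool type is declared in a nested scope), and it binds the pool to an
ordinary access type instead of passing it as an implicit parameter, since
the latter is something only a compiler could do. The names Pools, Buffer,
Result_Pool and Integer_Access are hypothetical.]

```ada
with Ada.Text_IO;             use Ada.Text_IO;
with System;                  use System;
with System.Storage_Elements; use System.Storage_Elements;
with System.Storage_Pools;    use System.Storage_Pools;

procedure Preallocated_Pool_Demo is

   package Pools is
      type Preallocated_Space_Storage_Pool
        (Addr : Integer_Address; Max_Size : Storage_Count) is
        new Root_Storage_Pool with null record;

      procedure Allocate
        (Pool                     : in out Preallocated_Space_Storage_Pool;
         Storage_Address          : out Address;
         Size_In_Storage_Elements : in Storage_Count;
         Alignment                : in Storage_Count);

      procedure Deallocate
        (Pool                     : in out Preallocated_Space_Storage_Pool;
         Storage_Address          : in Address;
         Size_In_Storage_Elements : in Storage_Count;
         Alignment                : in Storage_Count);

      function Storage_Size
        (Pool : Preallocated_Space_Storage_Pool) return Storage_Count;
   end Pools;

   package body Pools is
      procedure Allocate
        (Pool                     : in out Preallocated_Space_Storage_Pool;
         Storage_Address          : out Address;
         Size_In_Storage_Elements : in Storage_Count;
         Alignment                : in Storage_Count) is
      begin
         --  The request must fit in the space the caller set aside.
         if Size_In_Storage_Elements > Pool.Max_Size then
            raise Storage_Error;
         end if;
         Storage_Address := To_Address (Pool.Addr);
      end Allocate;

      procedure Deallocate
        (Pool                     : in out Preallocated_Space_Storage_Pool;
         Storage_Address          : in Address;
         Size_In_Storage_Elements : in Storage_Count;
         Alignment                : in Storage_Count) is
      begin
         --  The space belongs to the caller; it is never freed here.
         raise Program_Error;
      end Deallocate;

      function Storage_Size
        (Pool : Preallocated_Space_Storage_Pool) return Storage_Count is
      begin
         return Pool.Max_Size;
      end Storage_Size;
   end Pools;

   --  Space preallocated by the "caller".
   Buffer : Storage_Array (1 .. 64);
   for Buffer'Alignment use Integer'Alignment;

   --  The on-the-fly pool: in effect a tag, an address and a length.
   Result_Pool : Pools.Preallocated_Space_Storage_Pool
     (Addr => To_Integer (Buffer'Address), Max_Size => Buffer'Length);

   type Integer_Access is access Integer;
   for Integer_Access'Storage_Pool use Result_Pool;

   P : constant Integer_Access := new Integer'(42);

begin
   Put_Line ("allocated in the preallocated space: "
             & Boolean'Image (P.all'Address = Buffer'Address));
end Preallocated_Pool_Demo;
```

Creating Result_Pool corresponds to the call-site work counted above:
storing the tag, the address and the length before the call is made.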

> So, such function calls look quite a bit more expensive than the similar
> aggregates or the currently existing function calls. It's not likely to be
> hundreds of times worse, but it's pretty complicated and certainly will slow
> down these calls. That might matter for a few existing programs.

It doesn't look that expensive to me.  Calling functions with unknown-size
results is relatively expensive anyway.  Presuming case (2c) above is relatively
rare (limited type, unknown size result, caller preallocates), this doesn't
seem like a show-stopper.

****************************************************************

From: Randy Brukardt
Sent: Friday, May 23, 2003,  5:46 PM

Tucker wrote:
...
> 1) If the function had a known-size result, then the caller would
>    preallocate the space, and pass this address in the usual way for a
>    function that returned a "large" result.  In addition, a flag
>    would be passed to indicate whether the function was allowed
>    to simply return the address of a preexisting object.  If so,
>    then the caller would expect the function to return the address
>    of the result, which could be the preallocated space, or the
>    preexisting object's address.
>
>    In this case, the only extra overhead for typical implementations
>    would be the extra boolean flag, and the test against it.

But then there is overhead to get rid of the extra 'preallocated' space if
it isn't used. And the overhead of figuring out if that needs to be done. If
the space is in a pool (because it's an anonymous access type, or an item
with a non-stack size), this will require calling pool operation(s). In the
stack case, this memory won't be recovered until the subprogram exits
(Janus/Ada might reuse it, but it cannot recover it). If the object is
controlled, it probably will have to be registered (not sure precisely when
that would have to happen for this case, but it certainly can't happen
inside the subprogram unless a finalization chain is passed into it, which
would add even more overhead).

In addition, this implementation means that you will end up allocating an
'extra' copy of the object in the return existing object case. If the object
is large (and certainly some of the objects we're talking about are), that
could be a problem, as it could cause an existing program to raise
Storage_Error.

> Presuming case (2c) above is relatively
> rare (limited type, unknown size result, caller preallocates), this doesn't
> seem like a show-stopper.

I don't think it's necessarily a show-stopper. But we have to do a
cost/benefit analysis on new features. Certainly, there is a benefit here,
but just like Interfaces, it is not at all clear to me that the benefit
outweighs the cost, which is considerable (and growing).

****************************************************************

From: Tucker Taft
Sent: Saturday, May 24, 2003,  12:34 PM

> But then there is overhead to get rid of the extra 'preallocated' space if
> it isn't used.

I'm not sure I understand what this means.  It may be something
about the way your compiler works.  In our compiler, if
the caller preallocates temporary space for a function result,
it is space that gets released automatically at the end of
the enclosing scope, so there is no point (or sometimes, no
way), to reclaim it earlier than then.

> ... And the overhead of figuring out if that needs to be done.

The caller would know at compile-time whether the function
returns a known-size result, and whether the result is
used in a context where a preexisting object would be
permitted, so I don't see any run-time overhead there.

> ... If
> the space is in a pool (because it's an anonymous access type, or an item
> with a non-stack size), this will require calling pool operation(s). In the
> stack case, this memory won't be recovered until the subprogram exits
> (Janus/Ada might reuse it, but it cannot recover it). If the object is
> controlled, it probably will have to be registered (not sure precisely when
> that would have to happen for this case, but it certainly can't happen
> inside the subprogram unless a finalization chain is passed into it, which
> would add even more overhead).

I am unclear now whether you are talking about overhead that is new
to limited function return, or is the same as what you would face
for non-limited function return.

> In addition, this implementation means that you will end up allocating an
> 'extra' copy of the object in the return existing object case. If the object
> is large (and certainly some of the objects we're talking about are), that
> could be a problem, as it could cause an existing program to raise
> Storage_Error.

You could set your upper limit for caller-preallocated space
relatively low for these kinds of functions, if this is a
significant concern.  That is, require use of the secondary
stack or heap even if the result size is known, if the known
size is so large as to be of concern.

>>Presuming case (2c) above is relatively
>>rare (limited type, unknown size result, caller preallocates), this doesn't
>>seem like a show-stopper.
>
>
> I don't think it's necessarily a show-stopper. But we have to do a
> cost/benefit analysis on new features. Certainly, there is a benefit here,
> but just like Interfaces, it is not at all clear to me that the benefit
> outweighs the cost, which is considerable (and growing).

Are you talking about implementation cost or run-time overhead?
I don't see the run-time overhead as being much greater than
function calls that return a non-limited type of similar complexity.
If the result might be controlled, or large, or of unknown-size,
then yes that adds to the run-time overhead, but that is true
for non-limited functions as well.

****************************************************************

From: Stephen W Baird
Sent: Tuesday, October 14, 2003,  3:51 PM

This is a discussion of the interaction between AI-318 and the
IBM-Rational Apex Ada compiler's implementation of finalization, as per
my homework assignment from the Sydney ARG meeting.

----

The Apex compiler manages pending finalization requirements (i.e.
finalization of controlled and protected objects, not tasks) at the
granularity of top-level (i.e. non-component) objects. The finalization
code generated for the enclosing "construct or entity" (7.6.1(2)) of a
given top-level object relies on the invariant that either all or none
of the subcomponents of the object require finalization.

This means, for example, that if an exception occurs while initializing an
object, then the initialization code (which knows how far it has
progressed) must handle the exception, finalize any components which were
successfully initialized, and then (typically) reraise the exception. If
an object cannot make it to the state where all of its subcomponents need
to be finalized, then it must revert to the state where none require
finalization before execution of the finalization code for the enclosing
"construct or entity".

This has proven to be a reasonable implementation model, but AI-318 might
be difficult to implement using this approach. Consider the case of a
return object which contains several controlled subcomponents. Suppose
that some, but not all, of these subcomponents have been successfully
initialized when an exception is raised. The code (in the callee) which
knows how far initialization has progressed would have to handle the
exception, perform any necessary finalization, and then (typically)
reraise the exception.

Unfortunately, the AI (as currently written) disallows this approach:
    "The return object would not be finalized prior to leaving the
function. The caller would be responsible for its finalization".

This problem could be resolved by having this provision of the AI apply
only in the case of a "normal completion" (7.6.1(2)) of the function, with
the callee responsible for finalization otherwise (or perhaps just by
adding an implementation permission allowing the callee to perform
finalization in this case).
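
[Editor's note: The following is a minimal sketch of the "all or none"
model and of the resolution suggested above. It does not use real
controlled types; a counter named Live stands in for "components that
currently need finalization", and all names (Build_Result, Init_Component,
Finalize_Component, Init_Failure) are hypothetical. On normal completion
the callee leaves every component live for the caller to finalize; on an
exception the callee finalizes what it had initialized and reraises.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Partial_Init_Demo is

   Init_Failure : exception;

   --  Number of components that currently require finalization.
   Live : Natural := 0;

   procedure Init_Component (Fail : Boolean) is
   begin
      if Fail then
         raise Init_Failure;
      end if;
      Live := Live + 1;
   end Init_Component;

   procedure Finalize_Component is
   begin
      Live := Live - 1;
   end Finalize_Component;

   --  Stand-in for a function building a return object that has three
   --  controlled subcomponents; Fail_On = 0 means no failure.
   procedure Build_Result (Fail_On : Natural) is
      Done : Natural := 0;   --  how far initialization has progressed
   begin
      for I in 1 .. 3 loop
         Init_Component (Fail => I = Fail_On);
         Done := Done + 1;
      end loop;
      --  Normal completion: all components are left for the caller.
   exception
      when others =>
         --  Abnormal completion: the callee reverts to "none" and reraises.
         for I in reverse 1 .. Done loop
            Finalize_Component;
         end loop;
         raise;
   end Build_Result;

begin
   Build_Result (Fail_On => 0);
   Put_Line ("normal completion, caller must finalize:"
             & Natural'Image (Live));

   Live := 0;   --  the caller finalizes the result it received

   begin
      Build_Result (Fail_On => 3);
   exception
      when Init_Failure =>
         Put_Line ("exception propagated, left for caller:"
                   & Natural'Image (Live));
   end;
end Partial_Init_Demo;
```

The first call should report three components for the caller to finalize
and the second should report none, which is the invariant the enclosing
construct relies on.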

It cannot always be known whether a function is going to return normally
until after any other finalization of objects declared by the function has
completed. Thus, the return object might have to be the last object to be
finalized. This could be accomplished either by requiring that it be the
first object with nontrivial finalization to be declared or by inventing a
special dynamic-semantics rule to handle this case (perhaps only an
implementation permission).

The Apex Ada compiler implements abortion (including ATC) by means of a
distinguished anonymous "exception". Thus, abortion while the callee is
executing introduces essentially the same problem for the caller as if the
callee propagated an exception. The distinction between normal and
abnormal completion proposed above would also help in resolving this
problem.

****************************************************************

From: Robert A Duff
Sent: Wednesday, October 15, 2003, 11:53 AM

> This problem could be resolved by having this provision of the AI apply
> only in the case of a "normal completion" (7.6.1(2)) of the function, with
> the callee responsible for finalization otherwise ...

That makes sense to me.  How can it make sense to let the caller do the
finalization of the result when the function is not returning a result,
but is propagating an exception instead?

Also, Steve is talking about the case where the returned object is "half
baked".  But what if it hasn't been initialized at all?  I believe there
would be trouble in that case, too.

>...(or perhaps just by
> adding an implementation permission allowing the callee to perform
> finalization in this case).

I'm not a big fan of implementation permissions, but I would have no
objection in this case.

****************************************************************

From: Dan Eilers
Sent: Thursday, October 30, 2003, 8:01 PM

The initial proposals for AI-318 involved changes to the syntax
of a function specification, such as using OUT instead of RETURN.
The current proposals don't.  The only proposed syntax changes are
in the body of a function.

This makes it impossible for the caller to know that this is a
special return-in-place function, which would seem to be necessary
in order to use different calling conventions.  Note that the caller
can't just go by the return type being limited, because the AI is
intended to also eliminate the copy-back for non-limited types.

****************************************************************

From: Randy Brukardt
Sent: Thursday, October 30, 2003, 8:32 PM

I *think* that's intentional. The majority of functions return by-copy
types, and for those, it makes no difference (at least, it better not). For
other types, most compilers already use an in-place function convention in
most cases; and those that don't (i.e. Janus/Ada) probably would be better
off changing to use one. So, it seems that for most calls, any performance
changes would be in the direction of faster (and possibly larger) code.

But any performance incompatibilities ought to be investigated thoroughly.
I've already complained about performance incompatibilities with this
proposal (see the mail thread of May 2003 in the AI); Tucker's response is
essentially that compilers will optimize the calling conventions, and the
ugly cases are rare. Since that *is* an incompatibility, it should be
discussed in the AI.

We've already asked for implementation reports on this AI, since several
implementors expressed concern about the cost of the convention. I'm sure
we'd welcome one from you as well.

****************************************************************

From: Tucker Taft
Sent: Thursday, October 30, 2003, 8:56 PM

It is true that a single calling convention must be used.
This implies some overhead on calling functions with a
return-by-reference type, but the presumption is that these
are very rare at the moment.  The presumed model is that the caller
specifies a storage "area" (or equivalent), and a flag indicating
whether the storage area *must* be used, or simply *may* be used.
(I believe this is discussed in the AI already.)

If used in a context where a new object is being initialized
(e.g. a component of an aggregate, initialization expression
for a limited object, or an initialized allocator), the specified
storage area must be used.  If used in another context
(e.g. in a renaming, as an IN parameter, or as an operand of some construct
like a membership test), then the storage area need not be
used, and returning an existing object is allowed.

Currently there is an accessibility check on returning a preexisting
return-by-reference object.  That check would be expanded to
include a check on whether returning an existing object is permitted.
Right now the check is officially a run-time check, but it is
generally easy to perform at compile-time (or instantiation time).
It would become a real run-time check with this change.

The underlying presumption behind all this is that the existing
capability to return existing objects by reference is of relatively
little use, and it is reasonable to largely ignore this capability,
and focus on being able to use functions with limited result types as
"constructors."  The existing capability would be preserved, but
perhaps might even deserve to be made obsolescent.

The existing capability provides very little value over what can
be done with returning an access value, whereas the new capability
provides significant value as part of making limited types more
useful.

> ... Note that the caller
> can't just go by the return type being limited, because the AI is
> intended to also eliminate the copy-back for non-limited types.

This is meant to be an optimization.  There is no guarantee
that copy-back is eliminated for non-limited types.
The presumption is that for most functions returning
large objects of known size, the caller already passes in
the address of a place where the return object should be
placed.  This new syntax would simplify using that space
directly within the function body, rather than doing another
copy.

****************************************************************

From: Robert Dewar
Sent: Thursday, October 30, 2003, 8:42 PM

> It is true that a single calling convention must be used.
> This implies some overhead on calling functions with a
> return-by-reference type, but the presumption is that these
> are very rare at the moment.  The presumed model is that the caller
> specifies a storage "area" (or equivalent), and a flag indicating
> whether the storage area *must* be used, or simply *may* be used.
> (I believe this is discussed in the AI already.)

That seems a nasty incompatibility. I don't like to see a feature of relatively
minor importance (in my view) causing an implementation incompatibility of
this magnitude, potentially requiring recompilation of existing code that
does not use the new feature, and invalidating libraries.

****************************************************************

[Editor's note: Additional discussion on this topic can be found in AI-325.]

****************************************************************

From: Tucker Taft
Sent: Monday, December  8, 2003, 10:32 AM

There seem to be a lot of messages flying around about how best
to support function-like-things returning/constructing limited objects.

Using the "Getting to Yes" method of trying to focus on what we agree about,
here is a list of possibly desired features of the solution.
I will start with those that seem to already have a consensus.
I would most appreciate responses that indicate if I missed any "consensus"
statements, or if there are some that are clearly *not* a consensus.
Secondly, it would be good to have a prioritization of the nice-to-haves.
Finally, it would be good to get some feeling about the non-consensus
statements, and perhaps adjustments to them which might allow them to
become consensus statements.

-------------------------

Consensus statements about Ada 200Y if we were to approve AI-318:

   1) Should be possible to declare an object of a limited
      type and provide an initializing expression
   2) Should be possible to use an initialized allocator for
      an access-to-limited type
   3) Should be possible to provide an aggregate as the initializing expression
      for a declaration or an initialized allocator, or for a component
      of such an aggregate; such aggregates may use "<>" to represent default
      initialization of a component
   4) Should be possible to use a function call (or something that looks syntactically
      like a function call) as the initializing expression for a declaration,
      initialized allocator, or component of an aggregate that is of a limited type,
      including a limited private type.
   5) Should be possible to declare a function-like thing callable by such a function call
      for limited types whose first subtype is a definite subtype.
   6) Should be possible to use an aggregate or a function-call (-like-thing) as
      an actual IN parameter of a limited type
   7) Should *not* be possible to copy an existing limited object.  I.e.
      Should *not* be possible to have an assignment statement for a limited
      type, and should *not* be possible to use the name of a limited declared object
      nor a dereference of an access-to-limited type as a component of an aggregate.
   8) The compiler needs to know at the call-site whether a function-like thing
      is returning an existing object by reference, or returning/initializing
      a new object

Nice to have:
   9) Ability to declare and call a function-like thing for a limited type with
      non-defaulted discriminants
   10) Ability to declare and call a function-like thing for a limited type with
      unknown discriminants; such types would require an initializing expression --
      no default initialization is defined for them.
   11) Easy to implement

Other possible desirables:
   12) Should not require alteration in the way limited types are laid out
   13) Should allow us to still have function-like things
       that return by reference
   14) Should (or should not) use the word "constructor" somewhere
   15) Should (or should not) use the word "limited" somewhere
   16) Should (or should not) use the word "function" somewhere
   17) Should provide more efficient way for non-limited types to be returned/initialized
   18) Should not "orphan" existing language features
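
[Editor's note: The following sketch illustrates what consensus items 1,
2, 4, 6 and 7 would look like to a user. It is written with the extended
return statement of Ada 2005, which is not the syntax under discussion in
this thread; it is used here only so that the example compiles. The names
Counters, Counter, Make and Value are hypothetical.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Limited_Init_Demo is

   package Counters is
      type Counter is limited private;
      function Make (Start : Natural) return Counter;
      function Value (C : Counter) return Natural;
   private
      type Counter is limited record
         N : Natural := 0;
      end record;
   end Counters;

   package body Counters is
      function Make (Start : Natural) return Counter is
      begin
         --  The result is built directly in the object being initialized.
         return Result : Counter do
            Result.N := Start;
         end return;
      end Make;

      function Value (C : Counter) return Natural is
      begin
         return C.N;
      end Value;
   end Counters;

   use Counters;

   --  Items 1 and 4: a limited private object with an initializing call.
   C : Counter := Make (5);

   --  Item 2: an initialized allocator for an access-to-limited type.
   type Counter_Access is access Counter;
   P : constant Counter_Access := new Counter'(Make (7));

begin
   Put_Line ("declared object:" & Natural'Image (Value (C)));
   Put_Line ("allocated object:" & Natural'Image (Value (P.all)));

   --  Item 6: a function call as an actual IN parameter.
   Put_Line ("in parameter:" & Natural'Image (Value (Make (3))));

   --  Item 7: copying an existing limited object remains illegal, so the
   --  following assignment would be rejected if uncommented.
   --  C := Make (9);
end Limited_Init_Demo;
```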

****************************************************************

From: Robert A. Duff
Sent: Monday, December  8, 2003,  2:48 PM

Thanks, Tuck.  This is a very helpful summary.  I was getting lost in
all those e-mails.

> -------------------------
>
> Consensus statements about Ada 200Y if we were to approve AI-318:
>
>    1) Should be possible to declare an object of a limited
>       type and provide an initializing expression
>    2) Should be possible to use an initialized allocator for
>       an access-to-limited type

I would add, "even when the storage pool is user-defined".

>    3) Should be possible to provide an aggregate as the initializing expression
>       for a declaration or an initialized allocator, or for a component
>       of such an aggregate; such aggregates may use "<>" to represent default
>       initialization of a component
>    4) Should be possible to use a function call (or something that looks syntactically
>       like a function call) as the initializing expression for a declaration,
>       initialized allocator, or component of an aggregate that is of a limited type,
>       including a limited private type.
>    5) Should be possible to declare a function-like thing callable by such a function call
>       for limited types whose first subtype is a definite subtype.
>    6) Should be possible to use an aggregate or a function-call (-like-thing) as
>       an actual IN parameter of a limited type
>    7) Should *not* be possible to copy an existing limited object.  I.e.
>       Should *not* be possible to have an assignment statement for a limited
>       type, and should *not* be possible to use the name of a limited declared object
>       nor a dereference of an access-to-limited type as a component of an aggregate.
>    8) The compiler needs to know at the call-site whether a function-like thing
>       is returning an existing object by reference, or returning/initializing
>       a new object

I agree with 1..8 above.

Can we also get consensus on this:

    Every context that allows an initialization expression for
    nonlimited types should also allow it for limited types.

?  That subsumes 1,2,part-of-3,part-of-6.  It also includes
record-component-defaults, generic-formal-in's, and probably some
others I've forgotten.  The relevant AI's list all the cases.

> Nice to have:
>    9) Ability to declare and call a function-like thing for a limited type with
>       non-defaulted discriminants
>    10) Ability to declare and call a function-like thing for a limited type with
>       unknown discriminants; such types would require an initializing expression --
>       no default initialization is defined for them.

I think 9 and 10 are important.  I'm not quite willing to kill the whole
idea if I can't have 9 and 10, but since these work already for the
nonlimited case, it would seem pretty kludgy to leave them out in the
limited case.

>    11) Easy to implement

Who could disagree with that?  But I'm willing to put up with some
implementation complexity to get 9 and 10.

> Other possible desirables:
>    12) Should not require alteration in the way limited types are laid out

I think I agree, but I'm not really sure what you're getting at.  Which
proposal(s) violate this?

>    13) Should allow us to still have function-like things
>        that return by reference

I don't much care about that for my own code, but I think it would be
irresponsible of us to be incompatible with folks who have used this
feature, even if we think perhaps it's a misguided feature.

>    14) Should (or should not) use the word "constructor" somewhere
>    15) Should (or should not) use the word "limited" somewhere
>    16) Should (or should not) use the word "function" somewhere

I have no strong opinion on the syntax, but I think these new kinds of
constructors are conceptually "functions".  They just happen to build
their result in the final resting place.

Viewing them as "procedures" seems like a compiler-writer viewpoint; I'd
rather take a user-oriented viewpoint.  The fact that function results
can take their discriminants from the function parameters, but you can't
do that for 'out' parameters, is accidental, not fundamental.

Viewing them as totally new animals seems like overkill.  To me, a
constructor is just a function that creates a new thing.  Number 8 above
implies that we need *some* sort of new syntax.  I would prefer to keep
it as close as possible to the existing function declaration syntax.
But I do not feel strongly about this.

>    17) Should provide more efficient way for non-limited types to be returned/initialized

Seems like a nice side effect.  Not important.

>    18) Should not "orphan" existing language features

This seems like a possible symptom of kludgery, but not a worthy goal in
its own right.  I mean, if I have to write "constructor" instead of
"function" all over the place in future code, that's not a disaster.

****************************************************************

From: Robert I. Eachus
Sent: Monday, December  8, 2003,  4:15 PM

I had to go out after sending my previous message, so effectively Tucker
and I crossed in the mail.  But Bob Duff did a good job of responding
to Tucker's excellent list of points:

Robert A Duff wrote:

>Thanks, Tuck.  This is a very helpful summary.  I was getting lost in
>all those e-mails.
>
>
Agreed, and I was writing some of them, and referring back to others to
keep everything straight.

...
>I agree with 1..8 above.
>
>Can we also get consensus on this:
>
>    Every context that allows an initialization expression for
>    nonlimited types should also allow it for limited types.
>
>?  That subsumes 1,2,part-of-3,part-of-6.  It also includes
>record-component-defaults, generic-formal-in's, and probably some
>others I've forgotten.  The relevant AI's list all the cases.

I like Tucker's breakdown better.  It makes it easier to say that 1,2,3,
and 6 are must haves, and some of the other cases are nice to haves.  I
certainly favor allowing record component defaults, and generic formal
in parameters likewise seem safe, and in most compilers I would expect
them to be implemented identically to required cases when the defaults
were actually used.  But I would certainly consider any objections from
implementors if some fringe case caused serious implementation problems.

>>Nice to have:
>>   9) Ability to declare and call a function-like thing for a limited type with
>>      non-defaulted discriminants
>>   10) Ability to declare and call a function-like thing for a limited type with
>>      unknown discriminants; such types would require an initializing expression --
>>      no default initialization is defined for them.
>
>I think 9 and 10 are important.  I'm not quite willing to kill the whole
>idea if I can't have 9 and 10, but since these work already for the
>nonlimited case, it would seem pretty kludgy to leave them out in the
>limited case.

Agree.  I think 10 is more important than 9, but both will be very important.

>>   11) Easy to implement
>
>Who could disagree with that?  But I'm willing to put up with some
>implementation complexity to get 9 and 10.

Definitely agree.

>>Other possible desirables:
>>   12) Should not require alteration in the way limited types are laid out
>>
>>
>
>I think I agree, but I'm not really sure what you're getting at.  Which
>proposal(s) violate this?

No current proposal, AFAIK.  Doesn't mean a future new variation won't.
But there is a difference between require and permit that should be kept
in mind.  There will be cases where compilers can generate more
efficient layouts for types that are just not used today.  Remember that
my example code compiles cleanly today, the only problem is that it
exports ADTs that can't be created by users.

For example, it would be an optimization for a compiler that currently
places unbounded structures on the heap and uses 'hidden' pointers in
the structure to manage them for the compiler to allocate one contiguous
chunk of the heap, for all components of a record, have
pointers/offsets in the record structure, and just one heap object to
free when the whole record is freed.   That doesn't mean that all
compilers have to treat records with multiple constructors that way,
just that a compiler is allowed to do so.

To be honest, I expect that the objects in point 10 above will become
common in Ada 0Y.  The types are currently legal in Ada 95/2000, but
they just are not used.  (And not really very usable.)  So I don't know
if any compilers have horrible overhead if someone does  create one.  If
so, that compiler would probably need to change its layout policy for
limited types.

>>   13) Should allow us to still have function-like things
>>       that return by reference
>
>I don't much care about that for my own code, but I think it would be
>irresponsible of us to be incompatible with folks who have used this
>feature, even if we think perhaps it's a misguided feature.

Huh?  Oh.  You don't use it because you don't use limited tagged types.
This feature will become much more useful and less 'misguided' if we can
initialize objects of types derived from Limited_Controlled more easily.

>>   14) Should (or should not) use the word "constructor" somewhere
>>   15) Should (or should not) use the word "limited" somewhere
>>   16) Should (or should not) use the word "function" somewhere
>
>I have no strong opinion on the syntax, but I think these new kinds of
>constructors are conceptually "functions".  They just happen to build
>their result in the final resting place.
>
>Viewing them as "procedures" seems like a compiler-writer viewpoint; I'd
>rather take a user-oriented viewpoint.  The fact that function results
>can take their discriminants from the function parameters, but you can't
>do that for 'out' parameters, is accidental, not fundamental.
>
>Viewing them as totally new animals seems like overkill.  To me, a
>constructor is just a function that creates a new thing.  Number 8 above
>implies that we need *some* sort of new syntax.  I would prefer to keep
>it as close as possible to the existing function declaration syntax.
>But I do not feel strongly about this.
>
>
We really need Norman Cohen to take a look at the issue.  I'm sure he
could come up with something.  Seriously, I have no objection to
retaining the word function.  But I do want the syntax to indicate that
this is one of those special "constructor" things to both the user and
the compiler.  I don't like "limited function" because that implies that
there are also "not limited function" types. ;-)   Using "constructor
function" seems a bit wordy but otherwise fine.  Certainly whatever the
syntax, the RM should talk about them as constructor functions or
constructors.  I also tried out:

function Foo return new Bar;

But that seems to imply that a hidden pointer must be used.

function Foo return inplace Bar;

Is a bit better, but even with the precedent of goto, I don't like the
idea of reserved words that are not English words.

function Foo create Bar;

Might be acceptable to everyone?  I am certainly open to any good ideas.

>>   17) Should provide more efficient way for non-limited types to be returned/initialized
>>
>Seems like a nice side effect.  Not important.
>
Agree.  Well maybe more than just nice if we can improve the efficiency
of Unbounded_String in some cases.  But certainly not a requirement.

>>   18) Should not "orphan" existing language features
>
>This seems like a possible symptom of kludgery, but not a worthy goal in
>its own right.  I mean, if I have to write "constructor" instead of
>"function" all over the place in future code, that's not a disaster.

One last suggested inclusion:

19) If we adopt a partial solution, that partial solution shouldn't
limit a future extension to cover everything.

I am certainly willing to consider scope reduction of a complete
solution, as long as it doesn't preclude ever fixing the excluded cases.

****************************************************************

From: Tucker Taft
Sent: Monday, December  8, 2003, 5:52 PM

Robert A Duff wrote:
> ...
> Can we also get consensus on this:
>
>     Every context that allows an initialization expression for
>     nonlimited types should also allow it for limited types.

I agree with that.  I also believe that the reverse should be true,
namely there should be no contexts where calling these function-like
things is permitted, but calling good-old functions returning
non-limited types is not permitted.

So in other words, from a user point of view, these are all very
similar.  The limited-returning ones cannot be called in certain
contexts because those contexts would require copying the
result.  The only context I can think of off the top of my
head is as the right-hand side of an assignment statement,
though there are probably others.  Whether they can be
used as the expression of a return statement depends on the details
of how these limited-returning things are implemented (as opposed to
called).

It is the existing limited-returning-by-ref functions that are odd, because
they can only be called in very limited contexts.  In particular,
a call on one of these can be used as an IN parameter, in a renaming,
and as a prefix of a name (anyplace else?).

This is not so noticeable in Ada 95, because the limitedness of the result
and the absence of aggregates and initialization of limited types, means
that the by-ref-ness doesn't create much additional limitation.  *But*,
if we add aggregates and initialization of limited types, then suddenly
these kinds of functions have some odd-ball limitations which may be
hard to remember, especially if there are function-like things that
don't have these limitations.

Hence, I feel pretty strongly that if we are going to use syntax to make
these two kinds of limited-returning function-like things look different,
we should make the existing returning-by-ref functions look different
from non-limited-returning functions, and make the new more flexible
limited-returning functions look like good old non-limited returning functions,
since they have so much more in common (in terms of legal calling contexts).

This is why I would recommend we require something like the word "limited" on
a function if it will be returning by-ref, and can only be called in contexts
where by-ref makes sense.  This is of course incompatible, but it is easily
caught at compile-time, and compilers could start allowing the word "limited"
right away, even before they support the new capability.

> > Other possible desirables:
> >    12) Should not require alteration in the way limited types are laid out
>
> I think I agree, but I'm not really sure what you're getting at.  Which
> proposal(s) violate this?

There were a lot of different ideas thrown around, but at least one of
them implied that the caller might *not* know the size of the thing
being allocated, nor where it was being allocated.  Clearly if
you call one of these function-like things as a component of an aggregate,
and you lay out limited types contiguously (even if some component is
dynamic-sized), then the caller *must* specify where the object is allocated,
and *must* know the size before it goes out-of-line so it can add up all the sizes
of the components and do the one overall allocation in the appropriate place
(on the secondary stack, in some user-specified storage pool, as a component
of a yet larger limited object, etc.).

I got the sense that one solution being bandied about was that limited components
of dynamic size would *have* to use a level of indirection, precluding a contiguous
allocation for an enclosing limited record.  This is not the way many compilers
do things now, and so would imply a change in the way limited types are laid out.
I would hope (12) is a point of consensus, but I couldn't tell if that were true
based on the flurry of messages.

>
> >    13) Should allow us to still have function-like things
> >        that return by reference
>
> I don't much care about that for my own code, but I think it would be
> irresponsible of us to be incompatible with folks who have used this
> feature, even if we think perhaps it's a misguided feature.

But perhaps call these things "limited functions" because if we add
aggregates and initialized limited objects, these guys won't be callable
in those contexts.  Alternatively, require that they be recast as
functions returning anonymous access types, effectively moving the ".all" from the
return expression to the point of call (since in my experience, these
functions almost always return a reference to a heap object, due to accessibility
limitations).
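
The second alternative can be illustrated with a minimal sketch. It assumes
that functions may declare anonymous access result types, which is only
under consideration at this point in the discussion. The names Registry,
Device, Current and Registry_Demo are hypothetical.

```ada
--  Sketch only: a by-reference "accessor" recast as a function
--  returning an anonymous access type.  All names are hypothetical.
package Registry is
   type Device is limited record
      Id : Natural := 0;
   end record;

   --  Returns a reference to a preexisting object, not a new one.
   function Current return access Device;
private
   The_Device : aliased Device;
end Registry;

package body Registry is
   function Current return access Device is
   begin
      return The_Device'Access;
   end Current;
end Registry;

with Registry;
procedure Registry_Demo is
begin
   --  The ".all" now appears at the point of call rather than
   --  in the return expression.
   Registry.Current.all.Id := 42;
end Registry_Demo;
```

Under this reading the caller can see that no new object is created, which
is the property that keeps such a function out of initialization contexts.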

> >    14) Should (or should not) use the word "constructor" somewhere
> >    15) Should (or should not) use the word "limited" somewhere
> >    16) Should (or should not) use the word "function" somewhere
>
> I have no strong opinion on the syntax, but I think these new kinds of
> constructors are conceptually "functions".  They just happen to build
> their result in the final resting place.

I agree (as is presumably obvious).  As indicated above, it is the return-by-ref
guys that will begin to look like oddballs, if we add limited aggregates and
initialization.

> Viewing them as "procedures" seems like a compiler-writer viewpoint; I'd
> rather take a user-oriented viewpoint.  The fact that function results
> can take their discriminants from the function parameters, but you can't
> do that for 'out' parameters, is accidental, not fundamental.

I'm not sure I followed that logic, but I agree that they should be *viewed*
as functions.  The question is how does one implement these.  I fear
that to achieve nice-to-have's (10) and (11), allowing the first subtype
to have non-defaulted or unknown discriminants, combined with (12), creates
a real challenge.  Renaming a procedure call as a function nicely solved
all the problems:
   a) the visible declaration is a function
   b) the renaming declaration can use the parameters to specify the
      discriminants for the returned (i.e. OUT) object (e.g. "(Disc => 3, others => <>)")
   c) the out of line code has a name for the pre-allocated object so
      it can refer to the discriminants.

If there is another solution that has all these capabilities that would be
great.  I have not found one.  The hardest problem is where the discriminants
are not explicitly determined by the caller, but are instead determined
by some computation on the IN parameters.

One suggested solution was:

   function Make_Text(Len : Natural) return Lim_Text(Len);

But that doesn't work if the discriminants of Lim_Text are not visible (i.e. "(<>)").
The renaming (of a procedure call) could work because the renaming can be
in the private part.

It may be that some mild restrictions could be added to deal with this
problem.  I would hope the restrictions can be enforced on the *declaration*
of the function rather than at the call site.  Otherwise I fear we
will get into the "applicable index constraint" game, which I don't
relish.  That is, certain calls would only be permitted when there is
an applicable discriminant constraint.


>
> Viewing them as totally new animals seems like overkill.  To me, a
> constructor is just a function that creates a new thing. ...

And except for the oddball return-by-ref functions, all
functions create a new thing.

****************************************************************

From: Randy Brukardt
Sent: Monday, December  8, 2003, 6:53 PM

Tucker said:

> Hence, I feel pretty strongly that if we are going to use syntax to make
> these two kinds of limited-returning function-like things look different,
> we should make the existing returning-by-ref functions look different
> from non-limited-returning functions, and make the new more flexible
> limited-returning functions look like good old non-limited returning functions,
> since they have so much more in common (in terms of legal calling
> contexts).
>
> This is why I would recommend we require something like the word "limited" on
> a function if it will be returning by-ref, and can only be called in contexts
> where by-ref makes sense.  This is of course incompatible, but it is easily
> caught at compile-time, and compilers could start allowing the word "limited"
> right away, even before they support the new capability.

I don't mind that in a vacuum, but I think that it means that either (1)
non-limited constructors are actually more expensive than current functions;
or (2) converting a limited type to non-limited requires checking all
functions for correct behavior.

The former occurs because (in one model) you get a call to Initialize that
generally can't be optimized away on top of the Adjust and Finalize calls
that we already have; the latter occurs (in another model) because limited
types call Initialize and non-limited types don't.

I don't much like either result.

> > > Other possible desirables:
> > >    12) Should not require alteration in the way limited types are laid out
> >
> > I think I agree, but I'm not really sure what you're getting at.  Which
> > proposal(s) violate this?
>
> There were a lot of different ideas thrown around, but at least one of
> them implied that the caller might *not* know the size of the thing
> being allocated, nor where it was being allocated.  Clearly if
> you call one of these function-like things as a component of an aggregate,
> and you lay out limited types contiguously (even if some component is
> dynamic-sized), then the caller *must* specify where the object is allocated,
> and *must* know the size before it goes out-of-line so it can add up all the sizes
> of the components and do the one overall allocation in the appropriate place
> (on the secondary stack, in some user-specified storage pool, as a component
> of a yet larger limited object, etc.).

Trying to lay out all possible record types contiguously is a fool's game.
Kinda like trying to implement universal generic sharing. :-) It's possible
to get it to work, but only with lots of standing on your head. And the
result is very user-unfriendly: objects of reasonable types like
   type Sane_Bounded_String (D : Natural := 0) is record
        Data :  String (1 .. D);
   end record;
raise Storage_Error unless constrained.

In any case, the vast majority of real types can be implemented
contiguously, with any of these proposals. (Most ADTs don't have
discriminants anyway, at least not on the top-level types.) If a few types
have to change representation in a few compilers (and only if there are
constructors defined) to make this work, I cannot get too excited. It can't
be incompatible: there are no constructors now.

> > Viewing them as totally new animals seems like overkill.  To me, a
> > constructor is just a function that creates a new thing. ...
>
> And except for the oddball return-by-ref functions, all
> functions create a new thing.

I guess I view these as a new thing because what they do is create a
user-defined "construction" of an object; they need to replace the
"initialization assignment" operation of Ada as well as the "initialization"
itself. Existing functions do not change the semantics of assignment. For
non-controlled types, the distinction doesn't really matter, but it is a big
deal for controlled types (of all stripes).

Also, I see a new thing as necessary, because I don't believe that a useful
constructor can be defined that won't force some representation changes in
compilers. (That is, (12) is an impossible goal; holding to it is a disaster
from a user perspective -- it forces unnatural separations of construction
code into parts. And the idea of somehow specifying an aggregate as the
argument of an In Out parameter seems goofy.) As long as the constructors
are explicit, then there isn't a problem in that existing code would not
have to change representation.

If we don't have the will to do this right this time, I don't think there is
any value to another partial band-aid solution. Especially if it cannot be
extended properly in the future. Which is why Tucker's procedure renaming
just isn't going to work.

****************************************************************

From: Robert I. Eachus
Sent: Monday, December  8, 2003, 7:52 PM

Tucker Taft wrote:

>Hence, I feel pretty strongly that if we are going to use syntax to make
>these two kinds of limited-returning function-like things look different,
>we should make the existing returning-by-ref functions look different
>from non-limited-returning functions, and make the new more flexible
>limited-returning functions look like good old non-limited returning functions,
>since they have so much more in common (in terms of legal calling contexts).
>
>This is why I would recommend we require something like the word "limited" on
>a function if it will be returning by-ref, and can only be called in contexts
>where by-ref makes sense.  This is of course incompatible, but it is easily
>caught at compile-time, and compilers could start allowing the word "limited"
>right away, even before they support the new capability.
>
>
It sounds like you are proposing to make detecting whether a function is
a constructor or a "normal" function depend on whether or not it returns
a limited type.

But that doesn't work.  The problem is that the compiler may not know
whether or not a function can be seen in contexts where its type must be
returned by reference.  For example,  a generic formal part may specify
a limited type, but the actual may be non-limited.  The reverse happens
as well.  Inside a package where a type is declared as limited private,
the type may or may not be limited.

So I have been assuming that 'flagging' constructors as such must be
done in syntax, and constructors for non-limited types must be allowed,
subject to the same rules and restrictions as for limited types.  The
"normal" case will be that a constructor is actually defined in a scope
where the return type is non-limited, at least for non-tagged types.

>There were a lot of different ideas thrown around, but at least one of
>them implied that the caller might *not* know the size of the thing
>being allocated, nor where it was being allocated.  Clearly if
>you call one of these function-like things as a component of an aggregate,
>and you lay out limited types contiguously (even if some component is
>dynamic-sized), then the caller *must* specify where the object is allocated,
>and *must* know the size before it goes out-of-line so it can add up all the sizes
>of the components and do the one overall allocation in the appropriate place
>(on the secondary stack, in some user-specified storage pool, as a component
>of a yet larger limited object, etc.).
>
>I got the sense that one solution being bandied about was that limited components
>of dynamic size would *have* to use a level of indirection, precluding a contiguous
>allocation for an enclosing limited record.  This is not the way many compilers
>do things now, and so would imply a change in the way limited types are laid out.
>I would hope (12) is a point of consensus, but I couldn't tell if that were true
>based on the flurry of messages.
>
>
Yes, it is not just being thrown around, it is the other proposal on the
table.  However, there is a solution which doesn't require the caller to
know the size of the object at the point of the call, and does not
require a level of indirection.  This requires the caller to pass a thunk
to the constructor.  When the constructor is ready to allocate the
actual object, it calls the thunk, giving the needed size, and the thunk
returns an address.  The thunk can be an allocator for some heap, or can
be code to add the size information to the size for some object on top
of the stack.  The case of an object with many constructors as part of
say an initial value aggregate can be accommodated by calling all the
tasks in sequence, if the object is being created on a different stack
than the one that contains the object being created, or in the case of a
heap object, you can get a large chunk of heap, and eventually return
what is not used.
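
The thunk idea can be modelled in user-level code. This is only a sketch
under stated assumptions: a real compiler would pass the thunk implicitly,
whereas here it is an explicit access-to-function parameter. The names
Thunk_Sketch, Allocate_Thunk, Arena, Bump and Construct are hypothetical,
and the sketch assumes one Character occupies one storage element.

```ada
with System;
with System.Storage_Elements; use System.Storage_Elements;

procedure Thunk_Sketch is
   --  The caller-supplied thunk: given a size, return an address.
   type Allocate_Thunk is
     access function (Size : Storage_Count) return System.Address;

   --  A caller-owned region standing in for "some heap" or a stack area.
   Arena : Storage_Array (1 .. 1024);
   Next  : Storage_Offset := Arena'First;

   --  One possible thunk: a simple bump allocator over Arena.
   function Bump (Size : Storage_Count) return System.Address is
      Start : constant Storage_Offset := Next;
   begin
      if Start + Size - 1 > Arena'Last then
         raise Storage_Error;
      end if;
      Next := Next + Size;
      return Arena (Start)'Address;
   end Bump;

   --  The "constructor": only it knows the size, so it asks the
   --  thunk for storage once the size has been computed.
   procedure Construct (Len : Natural; Alloc : Allocate_Thunk) is
      Where : constant System.Address := Alloc (Storage_Count (Len));
      Text  : String (1 .. Len);
      for Text'Address use Where;  --  build directly in caller's storage
   begin
      Text := (others => '*');
   end Construct;
begin
   Construct (10, Bump'Access);
end Thunk_Sketch;
```

The caller never needs the size before the call; it only needs to supply a
policy for where storage comes from.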

Like Randy, though, I consider that whether an implementation uses
indirection for some types should be left to the implementor.  There are
cases where it is more user friendly--and also more efficient at
run-time--to do that.  The particular case of arrays of
Unbounded_Strings is not an idle pastime, it comes up fairly frequently
so that it is worth looking at performance of various solutions in that
case.  (Unbounded_String is a weird case in general.  If you implement
Unbounded_Strings efficiently they are really limited objects that
cannot be copied.  What happens on assignment is a "deep copy" that
clones the object.)

>>>   13) Should allow us to still have function-like things
>>>       that return by reference
>>>
>>>
>But perhaps call these things "limited functions" because if we add
>aggregates and initialized limited objects, these guys won't be callable
>in those contexts.  Alternatively, require that they be recast as
>functions returning anonymous access types, effectively moving the ".all" from the
>return expression to the point of call (since in my experience, these
>functions almost always return a reference to a heap object, due to accessibility
>limitations).
>
>
I'm going to stay out of this argument, other than to say I will
probably recast those functions that are actually return by reference
using the new semantics.  But I don't want to have to do it at gunpoint.

>I'm not sure I followed that logic, but I agree that they should be *viewed*
>as functions.  The question is how does one implement these.  I fear
>that to achieve nice-to-have's (10) and (11), allowing the first subtype
>to have non-defaulted or unknown discriminants, combined with (12), creates
>a real challenge.  Renaming a procedure call as a function nicely solved
>all the problems:
>   a) the visible declaration is a function
>   b) the renaming declaration can use the parameters to specify the
>      discriminants for the returned (i.e. OUT) object (e.g. "(Disc => 3, others => <>)")
>   c) the out of line code has a name for the pre-allocated object so
>      it can refer to the discriminants.
>
>If there is another solution that has all these capabilities that would be
>great.  I have not found one.  The hardest problem is where the discriminants
>are not explicitly determined by the caller, but are instead determined
>by some computation on the IN parameters.
>
You have not found one, but Randy and I have.  I think it requires more
compiler implementation work than your approach, and in some cases it
will be less efficient (more information passed in the call).  But the
advantage of the approach is that it does cover all cases, including
those where neither the caller nor the constructor can know the size of
the returned object until the point of the return statement.  Yes, if
the compiler sees that for some types the constructors can return
objects larger than the largest stack available, it may decide to use
(hidden) indirection in such objects.  But I consider that to just be
the nature of Ada.  (Has anyone really thought about what will happen
when allocating the maximum unconstrained String doesn't
always raise Storage_Error?  In the early days, there were a few
compilers, including the one for the DPS6, that used 16-bits for
Integer, but a few years ago I ordered a machine with 4 Gig of memory.
Of course the OS wouldn't allow 2Gig to be allocated for one String, but
that day is coming.)

Is it worth this potential extra overhead to make declaring private
types with unknown discriminants work right?  I think so.  I also think
the syntax is easier to use in the easier cases which will make the new
constructs more popular.

>It may be that some mild restrictions could be added to deal with this
>problem.  I would hope the restrictions can be enforced on the *declaration*
>of the function rather than at the call site.  Otherwise I fear we
>will get into the "applicable index constraint" game, which I don't
>relish.  That is, certain calls would only be permitted when there is
>an applicable discriminant constraint.
>
>
I very much don't relish that either.

>And except for the oddball return-by-ref functions, all
>functions create a new thing.
>
>
That is a compiler implementor's view of return by value. ;-)  Users
talk all the time about functions returning this or returning that when
they are just returning a copy of an existing object.  From the user's
point of view, a constructor is different even for non-limited types.
It might be better to say that a constructor constructs a new value,
while many functions return existing values.  Of course, arithmetic
operations don't really fit this picture, but they are already special
in a different way.  But especially for non-limited ADTs I see a
semantic division between constructors, which build new records, and
selector functions that return existing records.

****************************************************************

From: Tucker Taft
Sent: Monday, December  8, 2003, 10:28 PM

> Tucker said:
>
> > Hence, I feel pretty strongly that if we are going to use syntax to make
> > these two kinds of limited-returning function-like things look different,
> > we should make the existing returning-by-ref functions look different
> > from non-limited-returning functions, and make the new more flexible
> > limited-returning functions look like good old non-limited returning functions,
> > since they have so much more in common (in terms of legal calling
> > contexts).
> >
> > This is why I would recommend we require something like the word "limited" on
> > a function if it will be returning by-ref, and can only be called in contexts
> > where by-ref makes sense.  This is of course incompatible, but it is easily
> > caught at compile-time, and compilers could start allowing the word "limited"
> > right away, even before they support the new capability.
>
> I don't mind that in a vacuum, but I think that it means that either (1)
> non-limited constructors are actually more expensive than current functions;
> or (2) converting a limited type to non-limited requires checking all
> functions for correct behavior.

Unfortunately, I have completely lost you.  I was trying to
focus on the "call" side of things first, before plunging
into the body/implementation side.  Once we know what we want
from the call side, we can start to figure out what we need
to provide on the body/implementation side.

So strictly from the call side, non-limited-returning functions
always create/initialize a new object.  Unfortunately, in Ada 95,
the only limited-returning functions are return-by-ref of
a preexisting object (when I say limited, I mean
"truly" limited).

Now what AI-318 is trying to provide is limited-returning
function-like things that create/initialize a new object,
very much like non-limited-returning functions.
This is important because we are now proposing to allow
limited objects to have initializing expressions, and
we want to allow a function-call-like thing for those
expressions.  Unfortunately, the existing limited-returning
functions are exactly the *wrong* thing for these new
contexts.

These by-ref functions didn't seem so odd when we didn't
allow limited initializing expressions.  There were no
contexts where they couldn't be called due to their by-ref-ness.
The limited-ness was enough to eliminate all such contexts.
But now we have proposed new contexts where limited types
are allowed, but the existing kinds of functions can't
be called in those contexts -- a definite pity.

> The former occurs because (in one model) you get a call to Initialize that
> generally can't be optimized away on top of the Adjust and Finalize calls
> that we already have; the latter occurs (in another model) because limited
> types call Initialize and non-limited types don't.

Let's just for a moment ignore this issue of whether
the object has to be default initialized and then re-initialized.
Notice that I didn't mention that in any of my
"consensus" lists, and that's not what I am focusing
on now.  I am happy to keep searching for a solution
that avoids the double initialization.  What I
want first is a good specification of what the solution
should look like on the *call* side.

> I don't much like either result.

I think you are talking about the implementation side,
but let's first try to agree about the call side.

> > > > Other possible desirables:
> > > >    12) Should not require alteration in the way limited types are laid out
> ...
> Trying to lay out all possible record types contiguously is a fool's game.
> Kinda like trying to implement universal generic sharing. :-) It's possible
> to get it to work, but only with lots of standing on your head. And the
> result is very user-unfriendly: objects of reasonable types like
>    type Sane_Bounded_String (D : Natural := 0) is record
>         Data :  String (1 .. D);
>    end record;
> raise Storage_Error unless constrained.
>
> In any case, the vast majority of real types can be implemented
> contiguously, with any of these proposals. (Most ADTs don't have
> discriminants anyway, at least not on the top-level types.) If a few types
> have to change representation in a few compilers (and only if there are
> constructors defined) to make this work, I cannot get too excited. It can't
> be incompatible: there are no constructors now.

Are you proposing that if a programmer writes a function-like
thing for a limited type, then the layout changes?  I really think
that is very bad news.  And despite your concern about laying out records
contiguously, I am pretty certain that GNAT, Rational, Green
Hills, and Aonix all lay out records contiguously (I am *very*
certain about Green Hills and Aonix ;-).  I think that
represents about 95% of the Ada market.

> > > Viewing them as totally new animals seems like overkill.  To me, a
> > > constructor is just a function that creates a new thing. ...
> >
> > And except for the oddball return-by-ref functions, all
> > functions create a new thing.
>
> I guess I view these as a new thing because what they do is create a
> user-defined "construction" of an object; they need to replace the
> "initialization assignment" operation of Ada as well as the "initialization"
> itself. Existing functions do not change the semantics of assignment. For
> non-controlled types, the distinction doesn't really matter, but it is a big
> deal for controlled types (of all stripes).

From the call-side, I don't see the big difference.
Even from the implementation side, it seems like we are
just trying to eliminate some extra "last minute" copying
that is currently part of non-limited-type function
semantics.  Many functions are written with a local
"Result" parameter, which is then built up as desired,
and then returned.  Many other functions are little more
than the return of an aggregate.  Both of these are
clearly creating/initializing new objects.  All we need to
arrange is that in both cases for a limited type,
the object to be returned is built in its final
resting place.  And the discriminants, if any, are known
on the call side (at least to the generated code), before
the out-of-line code begins.

I suppose one (crazy?) possibility is that such functions must be
inlined if the compiler run-time model needs additional
information from the body, whereas they need not be
inlined if the compiler run-time model uses implicit
levels of indirection.  This would make it quite analogous
to the case with generics, where some compilers need the
body to be able to generate code for an instance, while
others don't, because their run-time model supports
sharing.

> Also, I see a new thing as necessary, because I don't believe that a useful
> constructor can be defined that won't force some representation changes in
> compilers. (That is, (12) is an impossible goal; holding to it is a disaster
> from a user perspective -- it forces unnatural separations of construction
> code into parts. And the idea of somehow specifying an aggregate as the
> argument of an In Out parameter seems goofy.) As long as the constructors
> are explicit, then there isn't a problem in that existing code would not
> have to change representation.

I think you are again implying that by writing a constructor-ish-thing,
the record representation would change.  This seems quite undesirable
to me.

> If we don't have the will to do this right this time, I don't think there is
> any value to another partial band-aid solution. Especially if it cannot be
> extended properly in the future. Which is why Tucker's procedure renaming
> just isn't going to work.

It would help if you had an example of the kind of extension you
had in mind.    I promise I am not wedded to the procedure
renaming approach, but I do think it satisfies the requirements,
except perhaps from an aesthetic point of view.  I think we still
might be able to make the "return ... do ..." approach work,
but there would probably be more limitations.  In any case,
I believe these things still have so much in common with
functions that calling them anything else would hurt more than
it would help.

>          Randy.

Here is a proposal that does not involve renaming:

  1) Require "limited" (or some such word) if the function is going to
     return its result by reference; all other functions must
     return/initialize "new" objects.
     By-ref functions can only be called in contexts that don't
     require a new object (e.g. as an IN parameter or a renaming).
     [Better long-term alternative: replace these oddball functions
      with functions that have anonymous access-to-limited result types,
      since that is what they really are.]

  2) Allow "return Result : Type := <expr> do ... end return;" as a way of
     having a name for the "new" object being returned/initialized.

  3) If a limited type has "normal" functions, then its full type
     must be definite (e.g., there must be defaults for its discriminants).
     Note that its partial view may be indefinite (i.e. "(<>)").
     Also note that the discriminants, though defaulted, may be given
     new values by the initializing <expr>, if the object to be returned is
     unconstrained (i.e. Result'Constrained is False).  This ensures
     that the discriminants have well-defined values coming into
     the function, though they may be changed if the new object
     is unconstrained.

  4) If the (full) result subtype is definite (and hence for
     all limited types), then the name given in the return ... do ...
     (e.g. "Result") can be used within the <expr> itself,
     but only as a prefix for discriminants and the 'Constrained
     attribute.  If 'Constrained is False, then within <expr>,
     Result.<discrim> will necessarily be equal to its default value.
     After <expr>, the discriminants will have the value that was
     determined by <expr>.

The above ensures that for run-time models that need it, the
discriminants and hence the size are known prior to going to
out-of-line code, allowing the caller to do the allocation,
and/or to include the object contiguously in an enclosing record or array, etc.

The only real restriction is that if a limited type is going
to allow objects to have their discriminants determined by an
initializing expression, the full type must have defaults for
the discriminants.  And this restriction is enforced when the
full type is declared, rather than when these functions are called.
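
A minimal sketch of how items 2) and 3) might look in practice, assuming
the "return ... do ... end return" form is accepted as written above. The
names Lim_Texts, Length, Lim_Text, Make_Text and Text_Demo are hypothetical;
the bounded Length subtype is an assumption made so that an unconstrained
object stays small.

```ada
--  Sketch only: the partial view is indefinite ("(<>)"), while the
--  full type has a defaulted discriminant, as item 3) requires.
package Lim_Texts is
   subtype Length is Natural range 0 .. 80;

   type Lim_Text (<>) is limited private;

   function Make_Text (Len : Length) return Lim_Text;
private
   type Lim_Text (Len : Length := 0) is limited record
      Data : String (1 .. Len);
   end record;
end Lim_Texts;

package body Lim_Texts is
   function Make_Text (Len : Length) return Lim_Text is
   begin
      --  Result names the new object; the initializing expression
      --  fixes its discriminant before the "do" part runs.
      return Result : Lim_Text := (Len => Len, Data => (others => ' ')) do
         if Len > 0 then
            Result.Data (1) := '*';  --  further initialization in place
         end if;
      end return;
   end Make_Text;
end Lim_Texts;

with Lim_Texts;
procedure Text_Demo is
   --  A limited object initialized by a function call, with no copy.
   T : Lim_Texts.Lim_Text := Lim_Texts.Make_Text (5);
begin
   null;
end Text_Demo;
```

Because the discriminant is a parameter of the call, a caller-allocates
run-time model can work out the size before entering the function body.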

****************************************************************

From: Randy Brukardt
Sent: Monday, December  8, 2003, 11:13 PM

Tucker said:

> Unfortunately, I have completely lost you.  I was trying to
> focus on the "call" side of things first, before plunging
> into the body/implementation side.  Once we know what we want
> from the call side, we can start to figure out what we need
> to provide on the body/implementation side.

I have no idea what you mean by "call side". In any case, the only valid
subsetting is to look at it from the user's perspective rather than the
implementor's. Other divisions just mean that you are ignoring half of the
issues, and you're bound to get the wrong answer then.

...
> Let's just for a moment ignore this issue of whether
> the object has to be default initialized and then re-initialized.
> Notice that I didn't mention that in any of my
> "consensus" lists, and that's not what I am focusing
> on now.  I am happy to keep searching for a solution
> that avoids the double initialization.  What I
> want first is a good specification of what the solution
> should look like on the *call* side.

Then you're not looking at the whole issue. That's not an implementation
detail, it's a very visible part of the user semantics.

The only question is "what makes sense to the user of Ada 2005"? Once we've
figured that out, we can look at whether some implementation restrictions
are needed. That's the only sensible approach.

...
> > I guess I view these as a new thing because what they do is create a
> > user-defined "construction" of an object; they need to replace the
> > "initialization assignment" operation of Ada as well as the "initialization"
> > itself. Existing functions do not change the semantics of assignment. For
> > non-controlled types, the distinction doesn't really matter, but it is a big
> > deal for controlled types (of all stripes).
>
> From the call-side, I don't see the big difference.
> Even from the implementation side, it seems like we are
> just trying to eliminate some extra "last minute" copying
> that is currently part of non-limited-type function
> semantics.

We're also trying to eliminate unnecessary calls on Initialize, Adjust, and
Finalize. Since those are very difficult to optimize without breaking the
user's code, semantics that call them less often and are still safe are
important.

> > Also, I see a new thing as necessary, because I don't believe that a useful
> > constructor can be defined that won't force some representation changes in
> > compilers. (That is, (12) is an impossible goal; holding to it is a disaster
> > from a user perspective -- it forces unnatural separations of construction
> > code into parts. And the idea of somehow specifying an aggregate as the
> > argument of an In Out parameter seems goofy.) As long as the constructors
> > are explicit, then there isn't a problem in that existing code would not
> > have to change representation.
>
> I think you are again implying that by writing a constructor-ish-thing,
> the record representation would change.  This seems quite undesirable
> to me.

Not "would", but "could" -- in unlikely cases. If the type is definite,
there is never a problem with any proposal.

> > If we don't have the will to do this right this time, I don't think there is
> > any value to another partial band-aid solution. Especially if it cannot be
> > extended properly in the future. Which is why Tucker's procedure renaming
> > just isn't going to work.
>
> It would help if you had an example of the kind of extension you
> had in mind.    I promise I am not wedded to the procedure
> renaming approach, but I do think it satisfies the requirements,
> except perhaps from an aesthetic point of view.  I think we still
> might be able to make the "return ... do ..." approach work,
> but there would probably be more limitations.  In any case,
> I believe these things still have so much in common with
> functions that calling them anything else would hurt more than
> it would help.

Sheesh, Tuck, I wrote up a rough description of syntax and semantics
yesterday. Do I have to do it again??

> Here is a proposal that does not involve renaming:
>
>   1) Require "limited" (or some such word) if the function is going to
>      return its result by reference; all other functions must
>      return/initialize "new" objects.
>      By-ref functions can only be called in contexts that don't
>      require a new object (e.g. as an IN parameter or a renaming).
>      [Better long-term alternative: replace these oddball functions
>       with functions that have anonymous access-to-limited result types,
>       since that is what they really are.]

That's a lousy better alternative, even if it is accurate. I don't think we
ever want to force the introduction of access types where there currently are
none.

>   2) Allow "return Result : Type := <expr> do ... end return;" as a way of
>      having a name for the "new" object being returned/initialized.

So (1) and (2) are pretty close to what I proposed Saturday night (except
for syntax). It's not clear to me what the user-level semantics for
non-limited types is supposed to be.

>   3) If a limited type has "normal" functions, then its full type
>      must be definite (e.g., there must be defaults for its discriminants).
>      Note that its partial view may be indefinite (i.e. "(<>)").
>      Also note that the discriminants, though defaulted, may be given
>      new values by the initializing <expr>, if the object to be returned is
>      unconstrained (i.e. Result'Constrained is False).  This ensures
>      that the discriminants have well-defined values coming into
>      the function, though they may be changed if the new object
>      is unconstrained.

Tagged types don't allow defaults for discriminants. So you're saying that
useful limited types either must not be tagged or must not have
discriminants. That seems pretty fierce.

And, in any case, I don't see what this has to do with the user perspective.
You said something about ignoring implementation details, and I can't see
any reason for this other than an implementation detail.

>   4) If the (full) result subtype is definite (and hence for
>      all limited types), then the name given in the return ... do ...
>      (e.g. "Result") can be used within the <expr> itself,
>      but only as a prefix for discriminants and the 'Constrained
>      attribute.  If 'Constrained is False, then within <expr>,
>      Result.<discrim> will necessarily be equal to its default value.
>      After <expr>, the discriminants will have the value that was
>      determined by <expr>.

Is this necessary? I haven't tried to work out full examples, but it adds
complications where none seems to be needed. Anything that needs to refer to
the name should be in the statement part, I would think. What would you need
to do in that expression that can't be deferred to the body??

> The above ensures that for run-time models that need it, the
> discriminants and hence the size are known prior to going to
> out-of-line code, allowing the caller to do the allocation,
> and/or to include the object contiguously in an enclosing record
> or array, etc.

That seems like an implementation detail, again. That's fair game, of
course, but not when you say "let's focus on the user view".

> The only real restriction is that if a limited type is going
> to allow objects to have their discriminants determined by an
> initializing expression, the full type must have defaults for
> the discriminants.  And this restriction is enforced when the
> full type is declared, rather than when these functions are called.

I agree that (hard) restrictions should be enforced when the constructor is
declared (it's not really a problem with the type). There's nothing wrong
with run-time checks if something is declared to be indefinite, but not
otherwise. But I'd prefer to avoid restrictions if we can.

I note that this proposal does seem to allow deferring or eliminating
default initialization. But it seems to imply that non-limited types still
require a copy (and Adjust call) afterwards. I'd much prefer to allow
build-in-place. Perhaps the new 7.6.1 would allow that?

---

My quicky summary of the user view of constructors:

1) These should be usable anywhere that an aggregate can be, with similar
semantics.
   (This implies that top-level Adjust or Initialize should not be called on
them in general.)
2) There should be as few as possible restrictions on the declarations and
use of constructors.
3) They shouldn't feel "weird".
   (This implies retaining the function call-like syntax -- ":= Create(...);",
   and that the declaration probably should look something like a function call.)

****************************************************************

From: Randy Brukardt
Sent: Tuesday, December  9, 2003, 12:07 AM

Tucker said, responding to me:
> > Is this necessary? I haven't tried to work out full examples, but it adds
> > complications where none seems to be needed. Anything that needs to refer to
> > the name should be in the statement part, I would think. What would you need
> > to do in that expression that can't be deferred to the body??
>
> If you want to specify the initial value as an aggregate, you have
> to specify the values for the discriminants.  If you want them
> to match the newly created object, you need to be able to refer
> to the discriminants of the newly created object.

OK. That seems like a nice-to-have rather than a requirement. Without it,
you still could pass the discriminants as parameters to the constructor if
you had to. Not as pretty, but workable. A lot of the time, you'd want to do
that anyway. I'd hate to kill the proposal with some funny visibility that
isn't strictly necessary.

> > I note that this proposal does seem to allow deferring or eliminating
> > default initialization. But it seems to imply that non-limited types still
> > require a copy (and Adjust call) afterwards. I'd much prefer to allow
> > build-in-place. Perhaps the new 7.6.1 would allow that?
>
> I don't think I was making any such requirement.  Many compilers
> currently preallocate space for the value returned by
> a function having a non-limited composite result type.
> The return ... do ... construct would allow that
> space to be used directly.  For functions returning values
> on the secondary stack, it would also be possible to build
> the returned value directly on the secondary stack,
> and avoid cutting back the secondary stack upon return
> and just use it where it was placed.  We already do that
> in certain circumstances.

I think you're relying on 11.6 and 7.6.1 to get there. Right? That's fine as
long as those sections have the right effect (and we all agree that is an
appropriate effect).

I don't think that the original 7.6.1 would have allowed that optimization
(even though compilers probably do it!), but the new one ought to.

> With limited types we need to create enough restrictions
> to ensure it can be built in place.  For non-limited types,
> the return ... do ... construct makes it more likely that
> no extra copies are required, but we can't impose enough
> restrictions to ensure that no copies are needed in all
> run-time models.

That's fair. It seems likely that a copy would have to be done sometime
(certainly for elementary types, although that's no real problem).

> > My quicky summary of the user view of constructors:
> >
> > 1) These should be usable anywhere that an aggregate can be, with similar
> > semantics.
> >    (This implies that top-level Adjust or Initialize should not be called on
> > them in general.)
> > 2) There should be as few as possible restrictions on the declarations and
> > use of constructors.
> > 3) They shouldn't feel "weird".
> >    (This implies retaining the function call-like syntax -- ":= Create(...);",
> >    and that the declaration probably should look something like a function call.)
>
> I agree with the above, and as indicated, I see no reason
> not to call them functions.  The "return ... do ..." construct
> might be called a "constructor statement" or some such thing
> if you want to get the term "constructor" into the language ;-).

It wouldn't hurt, but I'm certainly not going to be banging any shoes if it
doesn't happen. :-)

****************************************************************

From: Robert I. Eachus
Sent: Tuesday, December  9, 2003, 10:51 AM

Wow!  I think we are finally converging, but four long messages, two
each from Tucker and Randy, after I gave up for the night?  Is Tucker
already on California time? ;-)

But I would like to start pulling out one issue at a time to resolve.
In this particular case, I am going to set ground rules for this thread,
since it presumes that certain things will be adopted.  The issue here is
the RM requirements to be specified for initializing objects inside the
return <object declaration> do <sequence of statements> end;

My feeling is that in reality writing useful constructors for private
types outside the body of the package that defines the type (or one of
its children) is going to be a pretty useless capability.  Not totally
useless, but okay to ignore for now.  This means that if there are
discriminants, we can assume they are known and visible, and the same
with other fields of the target type.  (But the fields of the parent
type may not be visible in this context.)

The three options that seem worth discussing are:

1) Controlled types are initialized implicitly.
2) Default initialization only occurs if there is no explicit
initialization: (return Foo: Bar := (this, that, etc) do...)
3) We are all big boys here.  (The constructor is normally defined by
the same person who writes Initialize or assigns defaults.)  So there
are no default initializations, and no run-time checks.
4) There is a run-time check at the end of the do...end; and  an
exception is raised if any discriminants of the object were not initialized.

I think that all of these are technically viable given the way things
seem to be headed, so this is really a normative discussion   (What do
we want to happen?).  The one troubling case I see can occur if we
decide that if the target is constrained, its discriminants, if any, get
'copied' in from the target object.  (In practice they will be the same
object, no copying needed.)  Then if a constructor that assumes it will
get the constraints from the target is used to create an object that
will get its constraints from the initial value, something should
happen.  This leads me to tend toward 4).  If the sequence of statements
checks 'Constrained and initializes the discriminants (or other bounds)
if false, then the check can be optimized away.  Otherwise the
constructor should raise Program_Error.  Compilers can provide warnings
if code for the check is generated, or if something that reads the
discriminants in the sequence of statements occurs where the
discriminants may not yet be initialized.

Notice though, that it might be nice if 'Constrained could be checked
before the return statement.  I think that this is the one detail that I
miss from my proposal in Randy's variation.  There will be cases where a
programmer would like to write:

if Foo'Constrained
then return ...;
else return ...;
end if;

Allowing the user to use the name of the constructor in this instance as
a prefix for 'Constrained would allow this.  I don't see any real need
for other attributes of the target outside the return.

Which brings up an interesting point.  Will there be a restriction that
prevents return statements inside the sequence of statements of a return
construct?  If not, then the above will work.  But I think the principle
of least surprise says that nested return statements should be illegal.

Rule 2), it seems to me is the other contender.  If there is no explicit
initialization in the object declaration, implicit initialization
occurs.  This solves what to me is one troubling case with 4).  If some
fields of a record type have defaults, those default initializations may
not happen, and there will be no warning to the  programmer.

I guess I am comfortable with either 2) or 4).  I think for most types,
and most constructors, there will be an explicit aggregate initial
value, with the sequence of statements if present doing any fixup
needed.  As long as double initialization doesn't occur in that case, I
am happy.

****************************************************************

From: Randy Brukardt
Sent: Tuesday, December  9, 2003, 11:01 AM

Robert Eachus said:
> The three options that seem worth discussing are:
>
> 1) Controlled types are initialized implicitly.
> 2) Default initialization only occurs if there is no explicit
> initialization: (return Foo: Bar := (this, that, etc) do...)
> 3) We are all big boys here.  (The constructor is normally defined by
> the same person who writes Initialize or assigns defaults.)  So there
> are no default initializations, and no run-time checks.
> 4) There is a run-time check at the end of the do...end; and  an
> exception is raised if any discriminants of the object were not
> initialized.

"Three options"? I see four. :-)

...
> I guess I am comfortable with either 2) or 4).  I think for most types,
> and most constructors, there will be an explicit aggregate initial
> value, with the sequence of statements if present doing any fixup
> needed.  As long as double initialization doesn't occur in that case, I
> am happy.

I had proposed (2). I agree that we don't want objects floating about that
haven't been initialized at all. And, being able to write a specific
initialization expression allows overriding that (or pieces of it, given
the <> notation). That seems powerful enough to me.

(And nested returns aren't allowed. Once a return object is created, it has
to be returned. Otherwise, you wouldn't be able to build the object in
place, which is the whole point.)

****************************************************************

From: Tucker Taft
Sent: Tuesday, December  9, 2003, 11:01 AM

> The three options that seem worth discussing are:
      ^^^^^ "four" ;-)

> 1) Controlled types are initialized implicitly.
> 2) Default initialization only occurs if there is no explicit
> initialization: (return Foo: Bar := (this, that, etc) do...)
> 3) We are all big boys here.  (The constructor is normally defined by
> the same person who writes Initialize or assigns defaults.)  So there
> are no default initializations, and no run-time checks.
> 4) There is a run-time check at the end of the do...end; and  an
> exception is raised if any discriminants of the object were not initialized.

Only (2) makes sense to me.

****************************************************************

From: Robert I. Eachus
Sent: Tuesday, December  9, 2003,  4:57 PM

Tucker Taft wrote:

> > The three options that seem worth discussing are:
>
>  ^^^^^ "four" ;-)

"A foolish consistency is the hobgoblin of little minds."  -- Ralph
Waldo Emerson.
(Seriously, I decided to add case three, then didn't change the prefix.
Of course, I was originally intending to leave case three out as not
making much sense for Ada. ;-)

> > 2) Default initialization only occurs if there is no explicit initialization:
> > (return Foo: Bar := (this, that, etc) do...)
>
>  Only (2) makes sense to me.

Since 2 is acceptable to me also, shall we consider that issue
resolved?  Any other votes?

The object in the return statement gets initialized, including a call to
an explicit Initialize for controlled types, and default values for
record components, unless there is an initial value in the return statement.

This implies that the syntax for these special returns should allow:

return Foo: Bar := (some initial value aggregate); --without a do ... end;
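
A minimal sketch of what option (2) would look like, assuming the syntax
that Ada 2005 eventually adopted for this construct (it may not match every
detail still under discussion here).  The names Init_Demo, Make_Default,
Make_Explicit and Make_Plain are hypothetical; Foo and Bar are borrowed from
the fragment above.

```ada
with Ada.Text_IO;

procedure Init_Demo is

   type Bar is record
      This : Integer := 1;   --  component defaults
      That : Integer := 2;
   end record;

   --  No initial value: the components get their defaults first,
   --  then the statements adjust the object.
   function Make_Default return Bar is
   begin
      return Foo : Bar do
         Foo.That := 20;
      end return;
   end Make_Default;

   --  Explicit initial value: the aggregate supplies every component,
   --  and the statements only do fixup.
   function Make_Explicit return Bar is
   begin
      return Foo : Bar := (This => 10, That => 0) do
         Foo.That := Foo.This * 2;
      end return;
   end Make_Explicit;

   --  Initial value with no "do ... end" part at all.
   function Make_Plain return Bar is
   begin
      return Foo : Bar := (This => 7, That => 8);
   end Make_Plain;

   procedure Show (Label : String; B : Bar) is
   begin
      Ada.Text_IO.Put_Line
        (Label & Integer'Image (B.This) & Integer'Image (B.That));
   end Show;

begin
   Show ("default: ", Make_Default);
   Show ("explicit:", Make_Explicit);
   Show ("plain:   ", Make_Plain);
end Init_Demo;
```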

****************************************************************

From: Robert A. Duff
Sent: Tuesday, December  9, 2003, 5:34 PM

Robert Eachus wrote:

> "A foolish consistency is the hobgoblin of little minds."  -- Ralph
> Waldo Emerson.

;-)

> > > 2) Default initialization only occurs if there is no explicit initialization:
> > > (return Foo: Bar := (this, that, etc) do...)
> >
> >  Only (2) makes sense to me.
>
> Since 2 is acceptable to me also, shall we consider that issue
> resolved?  Any other votes?

I agree with (2).

****************************************************************

From: Jean-Pierre Rosen
Sent: Tuesday, December  9, 2003, 10:31 AM

> Consensus statements about Ada 200Y if we were to approve AI-318:
>
>    1) Should be possible to declare an object of a limited
>       type and provide an initializing expression
>    2) Should be possible to use an initialized allocator for
>       an access-to-limited type
This is of course what is basically required. However, I think something
is missing from the list:
- the function-like thing should be able to access the characteristics
(discriminants, bounds, etc) of the object being constructed.

Seems to me that without this requirement, we could just allow
initialization of limited types (but not assignment, of course).

****************************************************************

From: Tucker Taft
Sent: Tuesday, December  9, 2003, 12:04 PM

Good point.  We clearly need to be able to query the constraints
of the "new" object.  I said these constraints should be visible
in the initializing expression.  They should also be visible
in the subtype indication of:

  return Result : <subtype_ind> [:= <expr>] do ...

I suppose if the <subtype_ind> is unconstrained, then the
constraints come from the calling context.
If the <subtype_ind> is constrained, then there must be
a check that the constraints are compatible with those coming
from the calling context.

> Seems to me that without this requirement, we could just allow
> initialization of limited types (but not assignment, of course).

I don't understand this sentence.

****************************************************************

