Version 1.5 of ais/ai-00325.txt


!standard 3.10 (01)          03-02-05 AI95-00325/01
!class amendment 03-02-05
!status No Action (8-1-1) 04-03-07
!status work item 03-02-05
!status received 03-02-05
!priority Medium
!difficulty Hard
!subject Anonymous access types as function result types
!summary
This AI focuses strictly on issues relating to anonymous access types as function result types. It only makes sense if AI-00230 in some form is approved.
!problem
This proposal builds on AI-00230, by extending the advantages of anonymous access types to function result types.
Because Ada has local access types, there needs to be a way to remember the accessibility level of the object designated by a reference across function return.
Because Ada has multiple storage pools, it is important that an allocator of an anonymous access type allocate from the appropriate storage pool.
!proposal
The "access_definition" syntactic category (see AI-00231) is permitted for function result types. The type associated with the result type is an anonymous access type, which permits implicit conversions from other access types with appropriately compatible designated subtypes (as defined by 4.6(13-17)).
The accessibility level of an anonymous access type is determined at the time when the access_definition is elaborated. For a function result type, this happens at the time of the function return, and may possibly depend on information provided by the caller.
To be specific, the level of the returned value is the same as the level of the type of the returned value, which must be shallower than the level of the execution of the function (to avoid dangling references by the caller). If the return expression is an allocator, the accessibility level and storage pool for the allocator are determined by the context of the call on the function. In particular, the storage pool used is the one that would have been used had the allocator appeared at the point of the function call. (This rule is applied recursively for a chain of calls.)
This implies that a call on a function with an anonymous access return type must pass in a storage pool and an accessibility level, and the function must return both an access value and an accessibility level. [Implementation NOTE: The implementation may perform the accessibility level check either after returning from the call, or at the point of the return statement, since it has enough information to do it there as well.]
!wording
(See proposal.)
!example
!discussion
Implementing anonymous function result types is probably more work than the other proposed uses for anonymous access types. However, anonymous result types provide important capabilities that are currently very difficult to accomplish.
First of all, one can define a function that constructs an object (including an object of a limited type) in a storage pool determined by the caller, including a "stackish" storage pool (i.e. one that is cheap to reclaim). This kind of constructor is definitely missing in Ada now. Currently, the access type, and hence the storage pool, must be known inside the function that creates an object. Moreover, there is currently no way to create a single function that can be used both for creating a local object and a heap object with similar efficiencies. With this proposal, the "constructor" would look like this:

    function Construct(...) return access T is
    begin
       ...
       return new T'(...);
    end;

    X : constant access T := Construct(...); -- Create on the stack
    Y : constant T_Ptr := Construct(...);    -- Create in T_Ptr'Storage_Pool

Secondly, one could define a function that could return a reference to an aliased component of an object that is itself passed in using an access parameter. This would effectively allow a function to act as a "selector" that can be used on the left-hand side of an assignment when followed by ".all". E.g.:

       X : aliased Rec_With_Aliased_Component;
    begin
       Selected_Component(X'Access).all := Y;

where Selected_Component is declared:

    function Selected_Component(RP : access Rec_With_Aliased_Component)
      return access Component_Type is
    begin
       return RP.Component'Access;
    end Selected_Component;

And if we approve the "object.operation" syntax:
X.Selected_Component.all := Y;
presuming Selected_Component is declared in the same package as Rec_With_Aliased_Component.
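[A self-contained version of the selector example, offered as a sketch only. It assumes a compiler for a later revision of Ada that accepts an access_definition as a function result; the accessibility rules such a compiler applies may differ from the ones proposed in this AI. The procedure name Selector_Sketch, the integer-based Component_Type, and the value 42 are assumptions made for illustration. - Ed.]

```ada
with Ada.Text_IO;

procedure Selector_Sketch is
   --  Hypothetical component type; the AI does not say what it is.
   type Component_Type is new Integer;

   type Rec_With_Aliased_Component is record
      Component : aliased Component_Type := 0;
   end record;

   --  The "selector": returns a reference to a component of the
   --  object designated by the access parameter.
   function Selected_Component
     (RP : access Rec_With_Aliased_Component) return access Component_Type is
   begin
      return RP.Component'Access;
   end Selected_Component;

   X : aliased Rec_With_Aliased_Component;
   Y : constant Component_Type := 42;
begin
   --  The function call followed by ".all" denotes a variable,
   --  so it can be the target of an assignment.
   Selected_Component (X'Access).all := Y;
   Ada.Text_IO.Put_Line (Component_Type'Image (X.Component));
end Selector_Sketch;
```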
Implementation Issues:
There are some implementation issues associated with supporting function returns. Basically it means that such a function has to use run-time accessibility levels rather than static accessibility levels internally. Or more precisely, it needs to use accessibility levels that are comparable with those used by the caller. This need not be a huge burden. Probably the simplest approach is to have the caller pass in a "caller" accessibility level to the function, to which the function then adds one to obtain its own accessibility level.
[Some optimizations are possible: If there is only one access parameter, it could use the accessibility level associated with that parameter as the "caller" level (think about it ;-). If there are two or more access parameters, it could use the max of them. At that point, it would be more efficient to have the caller just pass in an extra parameter which is the "true" caller accessibility level. If it is nested in a function that also has access parameters, it is possible that it might have no access parameters, in which case it would probably need a "caller" accessibility level passed in as an additional parameter, though it could probably calculate it from the accessibility level of the access parameter passed into the enclosing function. Etc...]
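[The level-passing scheme described above can be modeled in ordinary Ada, as a rough sketch. Accessibility levels are represented here as plain integers, with a larger value meaning a more deeply nested entity; the names Level_Sketch, Callee, Caller_Level, and Designated_Level are hypothetical, and a real implementation would pass these values implicitly rather than as visible parameters. - Ed.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Level_Sketch is
   --  Hypothetical model: a larger value means a deeper (shorter-lived) level.
   subtype Level is Natural;

   --  Stand-in for a function with an anonymous access result.
   --  Caller_Level is the extra value the caller would pass in.
   --  Designated_Level is the level of the object being referenced.
   --  The result is the level returned along with the access value.
   function Callee
     (Caller_Level : Level; Designated_Level : Level) return Level
   is
      --  The function derives its own level from the caller's.
      Own_Level : constant Level := Caller_Level + 1;
   begin
      --  The returned value must be shallower than the execution of
      --  the function, or the caller would get a dangling reference.
      if Designated_Level >= Own_Level then
         raise Program_Error;
      end if;
      return Designated_Level;
   end Callee;

   Main_Level : constant Level := 1;
begin
   --  Object at the caller's level: acceptable.
   Put_Line (Level'Image (Callee (Main_Level, Designated_Level => 1)));

   --  Object local to the callee: the check fails.
   begin
      Put_Line (Level'Image (Callee (Main_Level, Designated_Level => 2)));
   exception
      when Program_Error =>
         Put_Line ("rejected: result would dangle");
   end;
end Level_Sketch;
```

[In this model the check is done at the point of the return; as the proposal notes, it could equally be done by the caller after the call, using the returned level. - Ed.]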
!ACATS test
Tests should be created to check on the implementation of this feature.
!appendix

[... much interesting discussion elided ...] [See the appendix of AI-00230
for the interesting discussion.]

*************************************************************

From: Tucker Taft
Sent: Friday, February 1, 2002  11:54 AM

Here is an update to ai-00230 on anonymous access types.
I was able to resolve the problems that I had with them last time,
while eliminating the mind-bending named subtypes of anonymous access types,
but adding back components of an anonymous access type.

I think it all works safely and usefully now, especially presuming
the syntax for "access_definition" is generalized (see ai-231) to allow
control over nullness, and constantness of the designated object.

[This is version /01 of the AI - Ed.]

*************************************************************

From: Tucker Taft
Sent: Saturday, November 15, 2003  10:35 PM

I am at least partly responsible for a couple of AIs (AI-318 and
AI-325) that suggest supporting functions returning (new) limited objects,
and functions returning anonymous access types.

These two are pretty closely related, can be used to solve
similar problems, and have some of the same issues.

So this is a bit of a personal brain storming on this topic,
which will be turned into AI discussion depending on the feedback
received from other ARG members (so speak up!).

First let me acknowledge some of the comments already received:

1) Robert Dewar made the point that the function-returning-new-limited-object
proposal (AI-318), if there is nothing on the function declaration indicating
it is one of these, would require a new run-time model for calling
*all* functions returning limited types, including those that were
allowed by Ada 95 (return-by-reference of *existing* limited objects).

This kind of argument is not very convincing if it just applies to
one particular implementation approach, but it seems that this
will be true pretty much for *all* implementation approaches.
This implies that all existing code will need to be recompiled if
a compiler were to start supporting this new kind of function,
and existing code would almost certainly slow down.

Hence, I accept Robert's point as establishing an additional relatively
important criterion for evaluating these proposals, namely that they should
not require a significant run-time model change for *existing* code.

2) Dan Eilers suggested that we consider allowing a function to be declared
as a renaming of a procedure with a single [in-]out parameter.
This is an interesting idea, but it has some tricky implications.
First, I would suggest some alternative syntax for the renaming
to make it clear what is happening, since otherwise the overload
resolution algorithm would have no clue to look at procedures
when working on a function renaming.  Hence, I might suggest:

    function Ret_Limited(X : Integer) return Lim_Type
      renames procedure Init_Limited;

presuming a directly visible declaration of:

    procedure Init_Limited(X : Integer; LT : [in] out Lim_Type);

The presence of the keyword "procedure" would cause overload
resolution to consider procedures rather than functions, and
match any procedure with one additional parameter of type Lim_Type,
be it first, last, or somewhere in the middle.

A second issue here is that the "call" on Ret_Limited would
have to be transformed into a call on Init_Limited.  This would
require that the target object of the function call be identified,
and that it be already created and initialized (presumably by default),
since we don't want the code of Init_Limited to be manipulating an
uninitialized object.  This implies that if the result subtype of the
function is indefinite, the context of the call has to provide
discriminants or bounds so the target object can be created and
default initialized.  This means that calling such a function would
be analogous to using an aggregate with "others": the context
would need to provide the constraint on the result.
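
[The transformation described above can be sketched in existing Ada. The renaming syntax itself is only proposed, so it appears below as a comment; what is shown is the form the call would presumably be turned into. The record definition of Lim_Type, the body of Init_Limited, and the name Rename_Sketch are assumptions made for illustration. - Ed.]

```ada
with Ada.Text_IO;

procedure Rename_Sketch is
   --  Hypothetical limited type with a default-initialized component.
   type Lim_Type is limited record
      Value : Integer := 0;
   end record;

   procedure Init_Limited (X : Integer; LT : in out Lim_Type) is
   begin
      LT.Value := X;
   end Init_Limited;

   --  Under the proposal (not legal Ada) a caller could write:
   --     function Ret_Limited (X : Integer) return Lim_Type
   --       renames procedure Init_Limited;
   --     Obj : Lim_Type := Ret_Limited (5);
   --
   --  The equivalent in existing Ada: the target object is created
   --  and default-initialized first ...
   Obj : Lim_Type;
begin
   --  ... and then passed to the procedure, which finishes the
   --  initialization in place without any copying.
   Init_Limited (5, Obj);
   Ada.Text_IO.Put_Line (Integer'Image (Obj.Value));
end Rename_Sketch;
```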

This would also imply that whether a given context was allowed
for calling such a function would depend on whether it was a renaming
of a procedure or not, meaning that this renaming could not be
in the private part acting as completion for some visible function
declaration.  It would have to be visible to the callers.
This also implies some generic contract model issues that might
be quite nasty.

Finally, if the function result subtype is "definite" but not the same as the
corresponding procedure parameter subtype, one would have to
decide which one would be relevant; normally the subtypes used in
the profile of a subprogram renaming are irrelevant.  Here it might
be nice if, when the procedure parameter subtype is unconstrained,
the result subtype of the function could be used to specify the
constraints for the target object; but if the target object had
its own constraints specified, would they have to agree with those
of the function result subtype?

3) Steve Baird made the point that if for either the limited
or the anonymous-access-type function we need to pass in a
parameter representing a "storage pool" or equivalent, this
would be the first situation where the storage pool is not
known statically at the point of allocation.  Of course the
shared-generics folks could scoff at this kind of belly-aching ;-),
but it does seem to be a relevant consideration.

4) For these kinds of functions, it is valuable to be able
to have a name for the object that is going to be returned
from the function.  We have proposed various ways to do so.
One proposal involved adding a "do...end" clause to a return
statement.  A second proposal involved declaring a variable
using the keyword "return" analogous to how "constant" is used.
The return statement must return this object if one is in scope.

Pascal has indicated a preference for the do...end approach, because
it links the special semantics more closely to the return
statement, rather than having to check every return statement
to see whether it happens to return the value of a specially
marked variable.

-------------------

All of this discussion to some extent begs the question of
what problem are we really trying to solve.

Here are various problems that we might or might not
be hoping to solve with these "funky" functions:

  - Limited types are hard to use.  Allowing aggregates
    of a limited type can provide some "completeness" checks,
    but doesn't help cases where the limited type is private.
    Allowing functions to specify the initial value of a limited
    object could make limited types safer, by ensuring all the
    needed initialization took place at the point of declaration,
    and generally make them more like non-limited types in the
    paradigms of use.

  - Anonymous access types help prevent proliferation of named access
    types (particularly relevant in the context of the "limited with"
    proposal).  However, having anonymous access types only as parameters
    doesn't address the whole problem, since declaring a function
    returning an access value will necessitate declaring a named access type.

  - It is sometimes desirable to create a single constructor operation that
    can be used to create an object either in the heap or as a local
    variable.  As things are now, you generally have to choose between
    producing a procedure for initializing a large or limited object,
    a function for returning a pointer to a heap object, or a function
    for returning a constructed value of a smaller, non-limited object.
    It would be preferable to have a single paradigm for creating
    a "constructor" operation that would work for small and large
    objects, for limited and non-limited types, and for stack-resident,
    heap-resident, and component objects.

  - It is not straightforward to write a function that allocates in a
    caller-determined storage pool.   Generally it requires
    declaring a new access type, associating the storage pool
    with that access type, doing the allocator, converting
    the result to some second named access type, and then having
    the caller convert yet again to the ultimate desired access type.
    Alternatively, each caller could instantiate a generic function
    and then call it.
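
[The generic-function alternative mentioned in the last item can be sketched in Ada 95 as follows. Each caller instantiates the generic with its own access type, and the allocator then uses that type's storage pool. The names Generic_Construct, Int_Ptr, Construct_Int, and Generic_Construct_Sketch are hypothetical. - Ed.]

```ada
with Ada.Text_IO;

procedure Generic_Construct_Sketch is

   generic
      type T is private;
      type T_Ptr is access T;
   function Generic_Construct (Init : T) return T_Ptr;

   function Generic_Construct (Init : T) return T_Ptr is
   begin
      --  Allocates from the storage pool of the actual access type
      --  supplied for T_Ptr at the point of instantiation.
      return new T'(Init);
   end Generic_Construct;

   --  One instantiation is needed per caller-chosen access type.
   type Int_Ptr is access Integer;
   function Construct_Int is new Generic_Construct (Integer, Int_Ptr);

   P : constant Int_Ptr := Construct_Int (42);
begin
   Ada.Text_IO.Put_Line (Integer'Image (P.all));
end Generic_Construct_Sketch;
```

[The cost of this workaround is the instantiation per access type, which is the burden the anonymous-result proposal is meant to remove. - Ed.]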

With regard to returning a limited object without copying it, there are
various possibilities:

  - The *caller* might want to determine the constraints of the created
    object, in which case there needs to be some way for the constraints
    to be provided to the called routine.  In this case, it is probably
    natural for the caller to preallocate the space for the result,
    even if it doesn't initialize it in any way.  (This case maps
    fairly well to a procedure renamed as a function, except that
    the renaming approach would require the object to be default initialized
    before whatever further initialization is performed out-of-line.)

  - Alternatively, the *called* routine might determine the constraints,
    perhaps based on parameters passed in, in which case the best the caller
    could do is provide some kind of storage pool or area in which the
    called routine does the allocation, and then it would have to effectively
    rename the result whose address would be returned.  For "stack" objects,
    this would be similar to functions returning objects of unknown size,
    except that no copying of the result would be permitted.  The caller
    would have to use it where the called routine allocated it, which
    might imply leaving holes in the (secondary) stack.  (This approach
    maps fairly well to an approach based on an "extended" return statement
    or a specially marked "return" object, where the declaration of the
    object to be returned can (or must?) include discriminants.)

  - There is presumably still some need for the existing capability, where
    the called routine returns a reference to a preexisting object, and the
    caller either passes it on to some other subprogram, or renames it
    for repeated local use.  (To avoid the run-time model change
    discussed above, some alternative syntax in the function declaration
    would seem to be necessary to distinguish the new-object case from
    this existing return-by-reference case.)

Much of the complexity associated with returning limited types arises
from cases where the result type is unconstrained, the worst being
the case where it is also indefinite (i.e. no defaulted discriminants).

It is therefore important to decide how valuable is the ability to
support returning limited objects where the size of the returned object is
not known just from the result subtype.

-----------------------------------------------------------------
------------ Procedure-renamed-as-function Redux ----------------
-----------------------------------------------------------------

In fact, if we restrict ourselves to cases where the result subtype is
*definite*, then the procedure-renamed-as-function solution for "returning"
limited objects looks more attractive.  The caller would always be able
to create the object using the result subtype if the "context" didn't
provide constraints, so the call would be permitted in any context,
eliminating the generic contract model problem and the need for new
legality rules resembling those for "others" in array aggregates.
The renaming could be in the private part as well (it could not
be postponed to the body, since the convention of such a function
would necessarily be intrinsic).
This also eliminates all the complexity associated with the caller
providing a storage pool or equivalent.  Instead, they just pass
in a reference to an already created and default-initialized object.
The procedure renamed as a function can treat it like any other
[in]-out parameter of a limited type.

It is somewhat interesting to note that the "Initialize" procedure
of limited controlled types can be thought of like one of these
procedures renamed as functions, where the default initialization
for controlled types is roughly equivalent to ":= Initialize;"
presuming:

   function Initialize return Lim_Cntrl_Type renames procedure Initialize;

In other words, the Initialize procedure for a controlled type is
called at the same point that one of these procedures-renamed-as-function
would be called for a limited object initialized by a call on
such a function.

Presumably if the limited object is both controlled
and has an initialization specified by a function call, then Initialize
is called immediately prior to calling the explicitly specified
procedure-renamed-as-function, to ensure the object gets properly
default initialized before being manipulated by "user" code.

--------------------------------------------------------------------------
------------ What about returning anonymous access types? ----------------
--------------------------------------------------------------------------

There is some advantage in having a name for the result object
being returned from a function returning an anonymous access
type, which gets us into the "extended" return statement.
However, most of the advantage comes in the case where you
are returning an allocator for an anonymous access-to-limited type,
and there is no way to write anything but an aggregate (if that) as the
initializer for the allocator.  However, if we posit the existence
of the above procedure-renamed-as-function as a solution for the
limited type problem, then we can call such a function in our initialized
allocator.

Given a reduced need for the extended return statement, then we can
go to a relatively simple proposal for returning anonymous access types,
namely require that the caller pass in a storage pool and an associated
accessibility level, which would be used only if the expression of a
return statement is an allocator, or a call on another such function.
An accessibility level would be included with the returned object.

The "only" new syntax for this would be allowing an access_definition to
appear for the result subtype of a function.

--------------------------

So... my brainstorming has led to the following:

   1) allow renaming a procedure with one [in] out parameter
      as a function, provided the [in] out parameter's nominal
      subtype is a definite subtype.  Require the use of
      a syntax like "function blah(...) return foo renames procedure blurfo;"
      to make it clear to the reader and the compiler that a procedure is
      being renamed.  If the parameter type is limited, then
      such a function can only be called where we have proposed to
      allow limited aggregates (i.e. initializing a declared object,
      initializing a component of an aggregate, a default expression
      for a component_definition, in an initialized allocator,
      as an actual IN parameter or parameter default, as an actual
      formal IN object or formal IN default).

   2) allow an access_definition for a function result subtype.
      If the expression of a return statement in such a function
      is an allocator (or a call on another such function), a storage
      pool and accessibility level provided by the caller is used
      (or passed on to the called function).

Any and all comments, flames, better brains, are welcomed...

I will wait a week or two and write up the results in time
for the San Diego meeting.

*************************************************************

From: Robert Dewar
Sent: Sunday, November 16, 2003  4:10 AM

> This kind of argument is not very convincing if it just applies to
> one particular implementation approach


For the record, this is a *very* convincing argument if the one implementation
approach that it applies to is in significant use today. If Ada 0Y (what a
terrible name) requires major changes in implementation approaches at any
point, the corresponding features will simply be ignored. The best hope for
seeing anyone implement these features is to make sure that they are not
disruptive in this sense.

I realize that Tuck goes on to say that this is not the case here, but it
is still an important point for the record.

I would guess that the GNAT situation is a typical one. We are indeed forging
ahead implementing many of the new proposals, but at this stage, that is only
practical if they do not involve any major shift in implementation (which
to a remarkable extent is the case with many/most/all? proposals so far).

After all, we avoided downward closures based on concerns from just a
couple of implementors in the Ada 9X process. The situation now even
more requires no major shifts.

*************************************************************

From: Tucker Taft
Sent: Sunday, November 16, 2003  7:50 AM

My real point was that any vendor could torpedo any
proposal they don't like by saying that it would require
changing their run-time model for existing code.

I feel that if we can suggest reasonable alternative approaches
that don't require changing the run-time model, then that
should counteract some of the concern.  However, if it seems that there
really is no way to avoid changing the run-time model of
existing code, then that is a significant problem.
And it seems pretty clear that having to support both return-by-reference
and return-new-object semantics with identical syntax for
the function declaration is such a case.

And if you promise not to overuse the argument, I'll promise
to take each one you identify seriously ;-).

*************************************************************

From: Robert Dewar
Sent: Sunday, November 16, 2003  8:36 AM

> My real point was that any vendor could torpedo any
> proposal they don't like by saying that it would require
> changing their run-time model for existing code.

Basically I would agree they can. After all if you don't have the major
vendors on board for a change at this stage, you might as well forget it.
In fact I think you can expect vendors to operate in good faith (otherwise
the whole process is broken).

> And if you promise not to overuse the argument, I'll promise
> to take each one you identify seriously ;-).

exactly :-) That's really the way things work.

An interesting question, in retrospect, did we really make the right
decision to accommodate use of a display, when in practice for Ada 95
static chains make more sense anyway? We certainly paid a price for this
accommodation!

Going back to the first paragraph, any vendor can torpedo any proposal
by simply not implementing it :-)

Yes, there will be some competitive pressure, which may be relevant for
some customers, but I would not count on this as a major factor.

That being said, I note again, that in the case of GNAT we are implementing
away, and have a lot of the new proposals working in our latest builds.

*************************************************************

From: Robert A. Duff
Sent: Sunday, November 16, 2003  7:00 AM

Robert Dewar says:

> An interesting question, in retrospect, did we really make the right
> decision to accommodate use of a display, when in practice for Ada 95
> static chains make more sense anyway? We certainly paid a price for this
> accommodation!

Well, *my* opinion on that point has not changed in 20 years. ;-)

The one case where Ada is clearly inferior to Pascal...

(I never found the "display" argument compelling, either, since
I know of at least one Pascal compiler that used displays,
and implemented procedural parameters properly.  Not the *easiest*
way to do it, but certainly doable.)

> Going back to the first paragraph, any vendor can torpedo any proposal
> by simply not implementing it :-)

Indeed.  This is an important point.

*************************************************************

From: Robert Dewar
Sent: Sunday, November 16, 2003  7:10 PM

> The one case where Ada is clearly inferior to Pascal...

Of course in GNAT, one has Unrestricted_Access to redress the balance, but
it is hardly elegant :-)

Unrestricted_Access is a real language extension. Are we going to fix this
in Ada 0Y?

*************************************************************

From: Tucker Taft
Sent: Sunday, November 16, 2003  9:46 PM

The AI on anonymous access-to-subprogram parameters has been approved
for intent by the ARG.  Unfortunately, it isn't quite ready
for forwarding to the WG9.

*************************************************************

From: John Barnes
Sent: Monday, November 17, 2003  1:42 AM

I just sent a new version to Randy to put on the database.

*************************************************************

From: Robert I. Eachus
Sent: Sunday, November 16, 2003  10:06 AM

Tucker Taft wrote:

>Any and all comments, flames, better brains, are welcomed...
>
>
I don't know about the better brains, and I'll try and avoid flames.

As far as I am concerned we are trying to create these special
subprograms that are neither fish nor fowl to solve a real problem in
Ada.  Why don't we focus on what the problem is, how to solve it
technically, and then invent a notation that works.  Yes, I said invent
a notation.  This is a language revision and if there is something
missing from the language--and I think we have all concluded that there
is--we need to come up with a language change that does minimal violence
to existing code and implementations while solving the real problem in a
way that is acceptable to users.  It will be nice if it doesn't look
like a kludge, but that is a goal, not a necessity.

So what problem are we trying to solve?  Primarily constructors for
limited types.  There are also issues for functions returning anonymous
access types, but I suspect that if we get the allocators right, that
will fall out along the way.  The hard problems seem to be the same. And
as far as I see, if we get allocators right, the need for functions
which return the problem access types goes down.

So what is the problem? We need to define something that looks
suspiciously like an assignment operator, except that it can only be
used with new objects. (Don't worry I am going to try to avoid anything
that looks like procedure ":=".  That is a can of worms I don't want to
touch.)  Now I am going to make a cut.  There are three cases to be
dealt with:

Easy:  The object and constructor have identical static bounds,
discriminants, and/or other constraints, and there are no run-time issues.

Constrained object:  The object is defined as constrained, the
constructor may take its constraints from the actual object.  (This is
the case that definitely, IMHO, needs new syntax--if we allow it.)

Constrained constructor:  The constructor allocates the object and
returns it.  The object is created by the constructor and that
determines any constraints necessary on the object.

(For the anonymous access type cases, add "the subtype designated by"
where necessary.)

Do we need to support the constrained object case?  Notice that the
rules allow constraints on the target object in the other cases.  The
difference is that in what I am calling the constrained object case, the
constraints back propagate into the constructor function.  In the easy
and constrained constructor case, there may be a constraint check at
compile time or after the object is constructed, and an error if it
fails.  I think that this case is a "nice to have" and as I said above
will require new syntax to avoid massive work in existing compilers.
What do others think?

Next, in what I call the constrained constructor case, the issues are
that the size of the returned object is determined by the constructor,
and that the object must be constructed in place.  I can come up with
several notational models for how this works, but they all come back to
"hidden" parameters.  The compiler has to pass the information on where
the object is to be created to the constructor, OR the constructor gets
a "thunk" as a parameter that it can call with the size of the object to
be created, and then later return the address of the created object.  (I
deliberately said address.  This may be an object of an anonymous type in
some of the cases of interest, and in others not.  It is really a return
"by reference" in the constructor case.)

As I see it we are back to the need for a new syntax, but with one
difference worth noting.  I don't see any simple way to combine the
constrained constructor and constrained object from the compiler's point
of view.  I also don't see any reason to confuse users by trying to mix
the two cases.

So as I see it, we can:

1) go with the easy cases and make (or keep) the other cases illegal.
2) allow the constrained object case with new syntax.
3) allow the constrained constructor case with new syntax.
4) allow both of the above with different syntax.

I guess I favor 3), with a syntax something like:

constructor <parameter list> returns <subtype indication>;

We can discuss where constructors can appear separately, but I think it
is pretty obvious:  in an object declaration after the :=,  and in an
allocator after new.

That limits the compiler work, and makes it clear that these subprograms
are special.

*************************************************************

From: Tucker Taft
Sent: Sunday, November 16, 2003  9:50 PM

I will admit it is a bit frustrating that most ARG notes
end up getting more comments off-topic than on.  ;-)
I sent three messages over the weekend.  There has
been only one truly on-topic response, I believe.
And that one was essentially a completely new
proposal. Oh well.  I'm as guilty as anyone...

*************************************************************

From: Robert I. Eachus
Sent: Monday, November 17, 2003  4:32 AM

If you are referring to my post, I guess you are right.  I was trying to
evaluate your proposal (renaming of procedures as functions) and thought
the wider context discussion was implicitly opened by your proposal, and
we should resolve the issue.  As far as I am concerned, you tabled a
proposal that solved what I called the "constrained object" case, and
either ignored or implicitly ruled the "constrained constructor" case out
of bounds.  (The intersection of the two, the "easy" cases, should fall
out of any workable solution.)

As part of drawing this distinction, I thought it was a good idea to
have a proposed notation on the floor for the "constrained constructor"
case as well.  So I put up a stalking horse. All I was really trying to
do was to make it clear that, in your proposal, the constraints on the
created object would be passed into the constructor function through the
out parameter.  I certainly like your mapping of the constrained object
case to something that current compilers support, so it is mostly
front-end work for compiler developers.  However, I do think that in a
final proposal, that mapping would be better left implicit.  I think
that the "extra" work that compilers would have to do to allow these
special-purpose procedures to be called AS procedures (with an already
initialized object) would not be doing users any favors.

But current compilers also support the case (for non-limited types) of
constructor functions where the constructor determines the bounds of the
object.  It would be nice to "open a window" so that when the full type
is non-limited constructor functions can be made visible.  For compilers
that currently support creating the return value "in place" such a
solution would work for types with say task components as well.

There were constrained constructor proposals discussed previously with
respect to AI-318, so I didn't see any new ground being opened.  And I
hope it was clear that I think it will place an undue burden on current
compilers unless any solution to the constrained object or constrained
constructor cases uses new syntax so that the implementor is clued in to
what is going on as early as possible.

*************************************************************

From: Randy Brukardt
Sent: Thursday, December 4, 2003  8:22 PM

In order to give Tucker the technical comment he craves:

He suggested 4 problems:

   - Limited types are hard to use.
   - It is sometimes desirable to create a single constructor operation

Sometimes? If it was possible, it would always be desirable. Certainly, that is
how Janus/Ada works internally (every record type has a single constructor
"thunk"), and it's nasty that users can't do that as well. Anyway, these two are
the same, given that these items would only be allowed in "constructor"
locations. This is the problem we're trying to solve.

   - Anonymous access types help prevent proliferation of named access types

True, but only worthwhile if there are no complications. AI-230 only got
finished once all of the complications were removed. The same holds here.

   - It is not straightforward to write a function that allocates in a
     caller-determined storage pool.

True, but is this a real problem? And if it is, I'd prefer to fix it with
better support for pools in generics, not messing around here.
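
For reference, current Ada already lets a caller choose the pool indirectly,
by passing the access type (and hence its storage pool) as a generic formal.
The following is only a minimal sketch of that workaround, not part of either
proposal; Pool_Demo, Make, Int_Access and Local_Ptr are illustrative names.

    with Ada.Text_IO;

    procedure Pool_Demo is

       --  The caller supplies the access type; the allocator in the body
       --  then allocates from whatever pool that access type uses.
       generic
          type Int_Access is access Integer;
       function Make (Value : Integer) return Int_Access;

       function Make (Value : Integer) return Int_Access is
       begin
          return new Integer'(Value);
       end Make;

       --  A caller-side access type; a Storage_Pool clause could be added
       --  here to select a user-defined pool.
       type Local_Ptr is access Integer;

       function Make_Local is new Make (Local_Ptr);

       P : constant Local_Ptr := Make_Local (42);

    begin
       Ada.Text_IO.Put_Line (Integer'Image (P.all));
    end Pool_Demo;

The cost of this approach is one instantiation per access type, which is
part of what the anonymous access result proposal is trying to avoid.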

> In fact, if we restrict ourselves to cases where the result subtype is
> *definite*, then the procedure-renamed-as-function solution for
> "returning" limited objects looks more attractive.

OK by me.

> It is somewhat interesting to note that the "Initialize" procedure
> of limited controlled types can be thought of like one of these
> procedures renamed as functions, where the default initialization
> for controlled types is roughly equivalent to ":= Initialize;"
> presuming:
>
>    function Initialize return Lim_Cntrl_Type renames procedure Initialize;
>
> In other words, the Initialize procedure for a controlled type is
> called at the same point that one of these procedures-renamed-as-function
> would be called for a limited object initialized by a call on
> such a function.
>
> Presumably if the limited object is both controlled
> and has an initialization specified by a function call, then Initialize
> is called immediately prior to calling the explicitly specified
> procedure-renamed-as-function, to ensure the object gets properly
> default initialized before being manipulated by "user" code.

Umm, no. These ought to work like aggregates, and those don't call Initialize.
We certainly want to be able to replace an aggregate with a constructor
function, and vice versa.

So if you have a user-defined constructor, it is the complete constructor;
Initialize is called only for default initialized objects.

...
> Given a reduced need for the extended return statement, then we can
> go to a relatively simple proposal for returning anonymous access types,
> namely require that the caller pass in a storage pool and an associated
> accessibility level, which would be used only if the expression of a
> return statement is an allocator, or a call on another such function.
> An accessibility level would be included with the returned object.
>
> The "only" new syntax for this would be allowing an access_definition to
> appear for the result subtype of a function.

I don't like the idea that a lot of runtime expense and compiler complication
is hidden behind the use of a single keyword ("access") which normally is quite
cheap. And I don't see the need (I don't see the need for any access
parameters, when you get right down to it.)

*************************************************************

From: Tucker Taft
Sent: Thursday, December 4, 2003  8:48 PM

Randy, Thanks for the feedback.

> ...
> > In fact, if we restrict ourselves to cases where the result subtype is
> > *definite*, then the procedure-renamed-as-function solution for
> > "returning" limited objects looks more attractive.
>
> OK by me.

Good.

> ...
> > Presumably if the limited object is both controlled
> > and has an initialization specified by a function call, then Initialize
> > is called immediately prior to calling the explicitly specified
> > procedure-renamed-as-function, to ensure the object gets properly
> > default initialized before being manipulated by "user" code.
>
> Umm, no. These ought to work like aggregates, and those don't call
> Initialize. We certainly want to be able to replace an aggregate with a
> constructor function, and vice versa.
>
> So if you have a user-defined constructor, it is the complete constructor;
> Initialize is called only for default initialized objects.

I don't think this really works.  If the type is limited private,
and there is an Initialize procedure, it should definitely be called
to initialize the object properly.  You can't rely on some arbitrary
user-written procedure to do the right thing as far as maintaining
reference counts, etc.  It is *not* the same case as with
an aggregate.  Those are only permitted on non-private types,
and hence can be treated as an "inside the abstraction" operation,
which can be relied upon to initialize reference counts, etc.,
properly.

The procedure-as-function can be declared anywhere, and so
must be treated as an "outside the abstraction" operation,
which cannot be relied on to preserve the invariant required
by Initialize/Finalize.

>
> ...
> > Given a reduced need for the extended return statement, then we can
> > go to a relatively simple proposal for returning anonymous access types,
> > namely require that the caller pass in a storage pool and an associated
> > accessibility level, which would be used only if the expression of a
> > return statement is an allocator, or a call on another such function.
> > An accessibility level would be included with the returned object.
> >
> > The "only" new syntax for this would be allowing an access_definition to
> > appear for the result subtype of a function.
>
> I don't like the idea that a lot of runtime expense and compiler
> complication is hidden behind the use of a single keyword ("access") which
> normally is quite cheap.


This doesn't seem like a lot more overhead than other functions.
Functions that return composite objects typically have implicit
parameters, unconstrained array parameters have implicit parameters,
and access parameters have implicit parameters, so
it seems not so big a surprise that anonymous access returns require
an implicit parameter.  This implicit parameter would presumably
be a reference to a storage pool and an accessibility level.
This can often be a statically initialized data structure, since
most storage pools are declared at the library level.

> ... And I don't see the need (I don't see the need for
> any access parameters, when you get right down to it.)

Well if you don't see the need for access parameters, then
obviously you don't see the need for access returns.  But
if you have access parameters, which we already have and
which are the basis for the whole AI on anonymous access types, and that
adds access components and access renames, then it seems to
leave a language hole to not include access returns.

>             Randy.

(I suspect this will get more juices flowing, so thanks
for starting off the discussion.)

*************************************************************

From: Randy Brukardt
Sent: Thursday, December 4, 2003  9:09 PM

> The procedure-as-function can be declared anywhere, and so
> must be treated as an "outside the abstraction" operation,
> which cannot be relied on to preserve the invariant required
> by Initialize/Finalize.

Humm. It's true that such a constructor function can be declared anywhere, but
I don't think that is a problem. The only thing that such a function (procedure
really) could do is to pass the object on to another function or procedure to
construct it. The *real* constructor function/procedure would have to be
declared inside of the abstraction, which of course could do all of the needed
initialization.

If this is a real issue, then Robert Eachus's solution of an explicitly
declared constructor is a better idea, because that could be required to be a
primitive of the type, and thus would have to be in the same place as
Initialize and any aggregates.

*************************************************************

From: Tucker Taft
Sent: Thursday, December 4, 2003  9:59 PM

I still don't think this works.  Let's take an example:

     type Handle is new Limited_Controlled with record
         Ptr : Ptr_Type;
         X : Integer := 0;
     end record;

     procedure Initialize(H : in out Handle);
     procedure Finalize(H : in out Handle);
     procedure Some_Op(H : in out Handle);
     procedure Share_Object(Existing : in Handle; New_Ref : in out Handle);

As things stand now, any primitive operation of the
type Handle other than Initialize can assume
that Initialize will have been called (let's presume
it initializes Ptr to point to some heap object,
and initializes the reference count of that to one).

These operations should still be able to make the
same assumption after this change.

If someone should rename Some_Op or Share_Object to be
a function, or declare their own procedure and rename
it to be a function, then that shouldn't change the fact
that Some_Op and Share_Object can safely assume that Ptr
is non-null in all objects.
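
The assumption being described can be sketched in current Ada as follows.
This is only an illustration, assuming Ada 2005 or later (so that the
controlled type may be declared in a nested scope); the type Shared, its
Ref_Count component, and the body of Some_Op are invented for the example,
and Finalize is omitted for brevity.

    with Ada.Text_IO;
    with Ada.Finalization;

    procedure Handle_Demo is

       package Handles is
          --  Hypothetical designated type carrying the reference count.
          type Shared is record
             Ref_Count : Natural := 0;
          end record;
          type Ptr_Type is access Shared;

          type Handle is new Ada.Finalization.Limited_Controlled with record
             Ptr : Ptr_Type;
             X   : Integer := 0;
          end record;

          overriding procedure Initialize (H : in out Handle);
          procedure Some_Op (H : in out Handle);
       end Handles;

       package body Handles is

          procedure Initialize (H : in out Handle) is
          begin
             --  Point Ptr at a heap object with a reference count of one.
             H.Ptr := new Shared'(Ref_Count => 1);
          end Initialize;

          procedure Some_Op (H : in out Handle) is
          begin
             --  Relies on Initialize having run: Ptr is dereferenced
             --  without a null check.
             H.X := H.X + 1;
             Ada.Text_IO.Put_Line
               ("Ref_Count =" & Natural'Image (H.Ptr.Ref_Count));
          end Some_Op;

       end Handles;

       H : Handles.Handle;  --  default initialized, then Initialize is called

    begin
       Handles.Some_Op (H);
    end Handle_Demo;

If Some_Op were ever handed an object for which Initialize had not run, the
dereference of Ptr would raise Constraint_Error.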

I presume we all agree that they can all presume that
the X component has been initialized to zero.  It seems
like the call on Initialize goes along with that.
The analogy with an aggregate doesn't work.  With
an aggregate, we know that the programmer is forced to
specify a value (possibly the "default" value) for
every component.  With the procedure-as-function
proposal, all we are saying is that given "L : Lim := Func;"
we know the object L is passed to the procedure renamed
as Func, before proceeding to the next declaration.
This by no means guarantees that all fields are properly
initialized, unlike an aggregate.  The renamed procedure
might just return immediately, or might just set one
component to have some special value.


>
> If this is a real issue, then Robert Eachus's solution of an explicitly
> declared constructor is a better idea, because that could be required to be
> a primitive of the type, and thus would have to be in the same place as
> Initialize and any aggregates.

Just making it a primitive doesn't fix the problem.  If you create
some special thing called a constructor, then presumably it can
only be called as part of a declaration, and when called it
must presume what?  The object is totally uninitialized?
The default initializations have occurred but Initialize has
not been called?  Pointers are nulled out but other default
initializations have not occurred?  I think you are wading into
a morass.

I believe this whole proposal flies only if it is kept very
simple, with no complicated new semantics.  I think
specifying that default initializations, including a call
on Initialize if appropriate, have occurred is quite reasonable.

A useful model is T'Input.  This is a function which
is effectively a rename of the procedure T'Read for
a type with a constrained first subtype.  And the default implementation
of T'Input must clearly default initialize the object before calling a
user-provided T'Read procedure, because we don't want user-written
code being handed an improperly initialized object.

So I think default initialization is the right model for these other
proposed cases of procedures renamed as functions.

*************************************************************

From: Randy Brukardt
Sent: Thursday, December 4, 2003 11:03 PM

> If someone should rename Some_Op or Share_Object to be
> a function, or declare their own procedure and rename
> it to be a function, then that shouldn't change the fact
> that Some_Op and Share_Object can safely assume that Ptr
> is non-null in all objects.

But you can't do that! It isn't reasonable for some random procedure to be
used as a constructor. Only a purpose-built routine that initializes
everything can be used as one, and it would never make sense to actually
call it as a procedure. So this again argues for the Eachus solution.

> I presume we all agree that they can all presume that
> the X component has been initialized to zero.  It seems
> like the call on Initialize goes along with that.
> The analogy with an aggregate doesn't work.  With
> an aggregate, we know that the programmer is forced to
> specify a value (possibly the "default" value) for
> every component.  With the procedure-as-function
> proposal, all we are saying is that given "L : Lim := Func;"
> we know the object L is passed to the procedure renamed
> as Func, before proceeding to the next declaration.
> This by no means guarantees that all fields are properly
> initialized, unlike an aggregate.  The renamed procedure
> might just return immediately, or might just set one
> component to have some special value.

I like this less and less. A constructor has to set everything, somehow.
Anything else is madness.

The problem seems to be that these aren't really constructors, they just can
be used as them.

You're saying that the default initializer does everything, and then the
constructor has to come along and undo it. Consider a constructor for
unbounded strings:

    function "+" (V : in String) return Unbounded_String....

(This is suspiciously similar to something that's been removed from AI-301.)

with the actual type being:
    type Unbounded_String is new Ada.Finalization.Controlled with record
        Str : String_Access := new String'("");
    end record;

With your semantics, this could be a rename of Set_Unbounded_String. But in
that case, there is no advantage to having it (or Set_Unbounded_String, for
that matter) - because the default sized memory string has been allocated.
"+" would have to deallocate the existing memory, then allocate a new one of
the right size. That doesn't sound like a constructor to me; we'd be doing
an extra allocation and deallocation every time. Might as well have just
used a regular function.

A real constructor would get the uninitialized object, examine its
parameters, then allocate the proper sized string. Once. No deallocation
involved.

> > If this is a real issue, then Robert Eachus's solution of an explicitly
> > declared constructor is a better idea, because that could be required to be
> > a primitive of the type, and thus would have to be in the same place as
> > Initialize and any aggregates.
>
> Just making it a primitive doesn't fix the problem.  If you create
> some special thing called a constructor, then presumably it can
> only be called as part of a declaration, and when called it
> must presume what?  The object is totally uninitialized?
> The default initializations have occurred but Initialize has
> not been called?  Pointers are nulled out but other default
> initializations have not occurred?  I think you are wading into
> a morass.

Been there, done that. This is precisely how Janus/Ada initializes all
objects; there's no problem. The constructor gets discriminants and bounds,
and everything else is uninitialized. This isn't a "morass", it's how
constructors work. For Janus/Ada, we have to do this to allocate memory for
any dynamically sized components (since we always allocate them to size).

There currently are the standard three cases: default construction (which
default initializes everything), aggregates (which of course do their own
initialization, bypassing the top-level constructor, but components use the
copy constructor), and (for non-limited types) copy construction (which gets
the values from the copied object).

Thinking about that, I don't immediately see how to implement either of
these proposals. I guess your proposal would do the full default
construction followed by Initialize before even calling the constructor.
That seems both limiting and time consuming (you're doing everything twice
including the allocation/deallocation of memory, and you're stuck with the
initial bounds/discriminants in all cases). The Eachus proposal doesn't
leave any place to allocate the memory - you can't do it until you know the
bounds/discriminants, but by the time that you do, you're in user-defined
code and it is too late to do anything. I suppose that means that we really
do have to limit ourselves to cases where the constraint is known. Then, the
old (now abandoned) "make" constructor (which only allocated memory) would
do the trick. The make constructor was abandoned because it didn't work for
components with mutable discriminants: the memory to allocate can only be
determined on the assignment. We had hacked around that for years, but I
finally got fed up with it, and blew it away completely last year. One point
for Tuck. :-)

> I believe this whole proposal flies only if it is kept very
> simple, with no complicated new semantics.  I think
> specifying that default initializations, including a call
> on Initialize if appropriate, have occurred is quite reasonable.

I think that we are talking inconsistent new semantics no matter what we
choose. Just because it is simple to describe to an implementor doesn't mean
that it makes much sense.

Consider a non-limited constructor:

    package P is
        type T is new Ada.Finalization.Controlled with ...;
        procedure Tuck_Constructor (Obj : in out T);
        function New_Constructor return T renames procedure Tuck_Constructor;
        function Old_Constructor return T;
        procedure Initialize (Obj : in out T);
        procedure Adjust (Obj : in out T);
    end P;

    O1 : T; -- Calls Initialize.
    O2 : T := (Controlled with ...); -- Calls nothing.
    O3 : T := Old_Constructor; -- Calls Adjust.
    O4 : T := New_Constructor; -- Calls Initialize???

O3 and O4 sure look the same; it doesn't look good to have them do vastly
different things.

But of course, we can't have O4 call Adjust (in part because if we change T
to be limited, there is no Adjust). Besides, Adjust would do the wrong
thing, because it is expecting a fully initialized object.

I don't think that there can be a solution to this confusion (unless we
abandon the ":=" notation for constructors, but that seems to be too large a
change).
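
The first three declarations can be tried with a current compiler. The
following is a minimal sketch, assuming Ada 2005 or later; Init_Demo and the
component X are illustrative, and New_Constructor is left out because
renaming a procedure as a function is only proposed syntax. Which calls are
traced for O2 and O3 may vary with the compiler and language version.

    with Ada.Text_IO;
    with Ada.Finalization;

    procedure Init_Demo is

       package P is
          type T is new Ada.Finalization.Controlled with record
             X : Integer := 0;
          end record;
          overriding procedure Initialize (Obj : in out T);
          overriding procedure Adjust (Obj : in out T);
          function Old_Constructor return T;
       end P;

       package body P is

          procedure Initialize (Obj : in out T) is
          begin
             Ada.Text_IO.Put_Line ("Initialize");
          end Initialize;

          procedure Adjust (Obj : in out T) is
          begin
             Ada.Text_IO.Put_Line ("Adjust");
          end Adjust;

          function Old_Constructor return T is
          begin
             return (Ada.Finalization.Controlled with X => 1);
          end Old_Constructor;

       end P;

       O1 : P.T;                                               --  default
       O2 : P.T := (Ada.Finalization.Controlled with X => 2);  --  aggregate
       O3 : P.T := P.Old_Constructor;                          --  function

    begin
       Ada.Text_IO.Put_Line ("Declarations elaborated");
    end Init_Demo;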

> A useful model is T'Input.  This is a function which
> is effectively a rename of the procedure T'Read for
> a type with a constrained first subtype.  And the default implementation
> of T'Input must clearly default initialize the object before calling a
> user-provided T'Read procedure, because we don't want user-written
> code being handed an improperly initialized object.

See, I don't agree with this at all. A T'Read that doesn't initialize all of
the non-discriminant components of whatever it is handed is just plain
wrong. It certainly shouldn't be depending on anything it is handed. Now, it
is true that the language cannot enforce a requirement that T'Read or a
constructor actually initializes everything, or that it doesn't read
anything it didn't set, but that is what they're intended to do.

T'Read is just a specific constructor, and it shouldn't be doing expensive
default initialization which will immediately be overwritten.

> So I think default initialization is the right model for these other
> proposed cases of procedures renamed as functions.

I won't argue that from an implementation perspective, but I think it would
be confusing as heck to users.

*************************************************************

From: Robert I. Eachus
Sent: Thursday, December 4, 2003  11:48 PM

Tucker Taft wrote:

> Randy Brukardt wrote:
>
>> If this is a real issue, then Robert Eachus's solution of an explicitly
>> declared constructor is a better idea, because that could be required
>> to be
>> a primitive of the type, and thus would have to be in the same place as
>> Initialize and any aggregates.
>
I am beginning to accept that you are right.  The renaming of procedures
as functions looks cleaner to start with, but it still requires syntax
changes.  But worse, you do have this issue that when the function body
is being compiled, there will be no explicit checks done. Users could
call other operations that assumed that the object had already been
initialized, with potentially damaging results.  We could just say that
the programmers should be careful in this case, but that is not the
normal Ada approach.

> Just making it a primitive doesn't fix the problem.  If you create
> some special thing called a constructor, then presumably it can
> only be called as part of a declaration, and when called it
> must presume what?  The object is totally uninitialized?
> The default initializations have occurred but Initialize has
> not been called?  Pointers are nulled out but other default
> initializations have not occurred?  I think you are wading into
> a morass.

I don't see a morass.  The constructor would create a value of the
limited type, but the object could not be accessed until after the
constructor returned.  Inside a function you have a value that you will
return 'in-place'.  In the case of a constructor, that would be the only
use allowed.  During the call to the constructor, the object being
initialized cannot be referenced, either by the constructor or
otherwise.  You can certainly create an object of the type inside the
constructor, and it will get default initialization.  But that object is
not the object being initialized.

With Tucker's renaming of a procedure, inside the body the out parameter
has a name, and can be accessed.  This is one major difference between
the two approaches.  (Or better, it is a side-effect of the fact that in
the renamed procedure case, the bounds and other attributes can come
from the target object, in my approach the object is only namable after
the constructor returns.)

As I said before, the two approaches are different in what they allow.
I personally can live with either set of constraints.  But I think this
discussion has shown a serious problem with the renamed procedure
approach.  We would need special rules to cover what can be done with
the out parameter inside the procedure to be renamed, or special
semantics for references to that object.  But allowing it to be passed
as a parameter to some other subprogram would seem to open an entire
Pandora's box.

Saying that some default initialization for the object occurs before the
user-defined code runs doesn't solve anything, or rather creates a new set
of issues.
Let's imagine a limited type that by default allocates some space on the
heap.  You want to extend this type, and create an explicit constructor
for it.  If the 'default' initialization occurs, and then you are going
to stick another value in, you are going to have to deallocate the
already assigned memory (or reuse it).

But the extension type may not be able to see all the fields of the
parent.  Take Limited_Controlled for a horrible example.  I keep coming
back to the idea that in the 'normal' constructor case, if there is such
a thing, the programmer will create an object of some ancestor type 'in
place' to do some of the initialization, and then do whatever special
bit fiddling is required in the parts of the type that he or she can name.

But which ancestor type needs to be used is a decision that needs to be
made by the programmer, not by the language. (For a type derived from
Limited_Controlled, this type might be Limited_Controlled, but it is
much more likely to be the parent, or possibly a grandparent.)   The
return statement of the constructor should not be required to be an
extension aggregate, but in most cases it will be anyway.

*************************************************************

From: Pascal Leroy
Sent: Friday, December 5, 2003  8:59 AM

> If this is a real issue, then Robert Eachus's solution of an
> explicitly declared constructor is a better idea, because
> that could be required to be a primitive of the type, and
> thus would have to be in the same place as Initialize and any
> aggregates.

If we think that adding constructors to the language is important, then,
yes, let's do something like what Eachus suggested.  I really don't like
Tucker's proposal which looks like a wart (or many warts) to me.  The
renaming of a procedure as a function is weird, and the dynamic
semantics of the constructor being called after initialization has taken
place doesn't make any sense to me, as the constructor would have to
undo the initialization.

I agree with Randy that constructors should in general work like
aggregates.

*************************************************************

From: Robert I. Eachus
Sent: Thursday, December 4, 2003  11:48 PM

>But you can't do that! It isn't reasonable for some random procedure to be
>used as a constructor. Only a purpose-built routine that initializes
>everything can be used as one, and it would never make sense to actually
>call it as a procedure. So this again argues for the Eachus solution.

Up until now, I have been viewing my role here as helping to define the
problem.  But now I think that I see a way to handle both difficult (and
of course more useful) cases.  First go with a special name/syntax for
constructors. Second, insist that constructors can only be used to
create objects.  It may be useful to permit a constructor to appear as
an (in) parameter in a call, but I really don't see that as all that
important or useful.  But the cases that have to be outlawed are the
uses of a constructor in the prefix of a name.

Next within a constructor, allow attributes of "return" to be queried.
(Or you can use some other name, for example the name of the constructor
itself, which does less damage to existing parsers, so I'll use that in
my examples.)

Now I can write (using String to make a point):

    constructor Blanks(Length: Integer := 0) return String is
    begin
        if Blanks'Constrained then
            declare
                Result: String(Blanks'Range) := (others => ' ');
            begin
                return Result;
            end;
        else
            declare
                Result: String(1..Length) := (others => ' ');
            begin
                return Result;
            end;
        end if;
    end Blanks;

The amount of work in the semantic phase of compilers would be
significant, but it has many advantages from a language point of view:

      Constructors are special and can be declared as such.
      Functions can serve as constructors where the special features are not
          needed.  (I could write a Blanks function that handled the
          unconstrained case and a parameterless constructor for the
          constrained case.  In fact, in this case I probably would, but
          I wrote this as an example of the syntax and semantics.)
      Programmers don't have to learn any 'weird' new syntax or
          semantics.  The only two things that are not intuitive about this
          proposal are the name to use as the prefix of attributes and
          components, and the new reserved word "constructor."
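
For comparison, the unconstrained half of Blanks can already be written as
an ordinary function in current Ada. A minimal sketch, with Blanks_Demo as an
illustrative wrapper and Natural used for Length so that a negative length is
rejected at the call:

    with Ada.Text_IO;

    procedure Blanks_Demo is

       --  The function itself determines the bounds of the result.
       function Blanks (Length : Natural := 0) return String is
       begin
          return (1 .. Length => ' ');
       end Blanks;

       S : constant String := Blanks (3);

    begin
       Ada.Text_IO.Put_Line (Integer'Image (S'Length));
    end Blanks_Demo;

What an ordinary function cannot do is the other half: take its bounds from
the object being initialized, which is what Blanks'Constrained and
Blanks'Range are meant to provide.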

>I like this less and less. A constructor has to set everything, somehow.
>Anything else is madness.
>
>The problem seems to be that these aren't really constructors, they just can
>be used as them.

I am in complete agreement.  I think it is possible to do constructors
right.  Whether the amount of work involved in doing them right can be
justified in this language revision is a different question.  I think
the answer is in the affirmative.  But there is no justification for
adding something that looks like a kludge to programmers and doesn't
solve the whole problem.  The renaming of a procedure as a function as
such probably doesn't reach the kluge level.  But the things we are now
discussing to make it 'work' certainly do.  It may be that there is a
better fix to be found, but I think allowing attributes of the result to
be used in constructors looks like the most elegant solution.

*************************************************************

From: Tucker Taft
Sent: Friday, December 5, 2003  5:03 PM

I think you are definitely killing this proposal with complexity.

Let's look at typical ways of writing "constructor" functions:

   function make_blah1(x, y : params) return blah is
      Result : blah;
   begin
      Result.x := x;
      Result.y := y;
      Massage(Result);
      return Result;
   end make_blah1;

   function make_blah2(x, y : params) return blah is
      Result : blah := (x => x, y => y, z => 0);
   begin
      Massage(Result);
      return Result;
   end make_blah2;

   function make_blah3(x, y : params) return blah is
      Result : blah := some_other_func(x);
   begin
      Result.y := y;
      Massage(Result);
      return Result;
   end make_blah3;

   function make_blah4(x, y : params) return blah is
   begin
       return (x => x, y => y, z => 0);
   end make_blah4;

I would tend to use make_blah1 if there are a lot of components,
and the default initial value of most of the components is fine.
I would tend to use make_blah2 if there are relatively few components,
and I want to be sure that if a new component is added, I am forced
to update my code.
I would tend to use make_blah3 if there is an existing constructor
that does about the right thing, but I want to tweak the result a bit.
I would tend to use make_blah4 in the same circumstances as make_blah2,
when there is no complex work to be done to build up the desired value.

Of course as things exist now, none of these functions could be
used with limited types.  For limited types we are forced to
use procedures to do all "construction", or possibly use discriminants
as effectively parameters to the Initialize procedure.
In all these cases, the object will undergo most if not all default initialization
before we get our hands on it.  Ada 2005 will hopefully allow us to
use aggregates, but that is no use if the limited type is also private,
which is exactly when you want to be able to have "constructor" operations.

So as things are now, programmers are used to the idea that limited types
get default initialized, and then you have to be sure to call the
constructing "procedure" as soon as possible so that no access to
the object occurs before it is initialized as intended.
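
A minimal sketch of that existing idiom, with invented names (Counters,
Counter, Set_Up, Value), might look like this:

    with Ada.Text_IO;

    procedure Limited_Demo is

       package Counters is
          type Counter is limited private;
          --  The "constructing" procedure; clients must remember to call it.
          procedure Set_Up (C : in out Counter; Start : in Integer);
          function Value (C : Counter) return Integer;
       private
          type Counter is limited record
             Count : Integer := 0;
          end record;
       end Counters;

       package body Counters is

          procedure Set_Up (C : in out Counter; Start : in Integer) is
          begin
             C.Count := Start;
          end Set_Up;

          function Value (C : Counter) return Integer is
          begin
             return C.Count;
          end Value;

       end Counters;

       C : Counters.Counter;  --  default initialized: Count = 0

    begin
       --  Nothing prevents a use of C here, before Set_Up has been called.
       Counters.Set_Up (C, Start => 10);
       Ada.Text_IO.Put_Line (Integer'Image (Counters.Value (C)));
    end Limited_Demo;

The gap between the declaration of C and the call on Set_Up is the window
in which an "early" access can occur.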

What this proposal was trying to create was a situation where the
desired initialization can be specified at the point of creation, to
ensure no inappropriate "early" access is possible.  This proposal
becomes even more important if we add limited aggregates, because
having to insert a call on a separate initialization procedure for each
of the limited components of a limited aggregate is going to be a real
pain.  We really want to have some way of specifying the initialization
at the place of the component in the aggregate.  Similarly, an initialized
allocator for limited types is very limiting if the only thing we can
use is an aggregate.  Something with the syntax of a function call would
be very useful, because it could be placed in all the contexts where
we want to specify the (extra) initialization that is to be performed
before further access is permitted.

Note that one of the things that makes a limited type different from
a non-limited type is the sense that it has identity, and it is connected
to perhaps some entity that can't easily be represented by a few bits
in a record (e.g. a thread of control, or a mutex, or some external
resource like a window on a screen).  These kinds of limited objects
must undergo their "default" initialization, before the programmer
gets their hands on them, and clearly the programmer isn't going to override
all of the state of the component.  They are just going to "tweak" the
state of the component in some way, perhaps, and a procedure is
just the thing for doing this.  E.g., an initialization procedure-as-function
for a task might call an entry with some initializing values.  It certainly
can't "fully initialize" the task.  That is not meaningful for limited
objects in general.

So... I really think the idea of default-initialization-plus-user-specified
procedure-to-tweak-the-initial-state is exactly what we want for limited
types.  I think we should allow these procedures renamed as functions
for non-limited types as well (contract model and all that), but clearly
they would normally only be used if the intended constructor was similar to
make_blah1 or make_blah3 when some_other_func was a similar constructor.
In that case, writing make_blah1, make_blah3, and some_other_func as procedures,
and then renaming them as functions, could produce identical semantics.

In any case, so long as it is clear that the semantics of a procedure
renamed as a function is that the returned object is the result
of applying the procedure to a default-initialized object, the semantics
are well-defined and easy to explain.  Having to define a new kind
of program unit for which, via data flow rules or whatever, we make sure
that no component is referenced before it is properly initialized reminds
me of the "out" parameter morass of Ada 83, on steroids.
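
The semantics described above (the result is what you get by applying a
procedure to a default-initialized object) can be sketched with the
extended return statement of Ada 2005. This is only an illustration, not
the renaming syntax under discussion: the type Blah, its components, and
the names Init_Blah and Make_Blah are hypothetical.

```ada
procedure Proc_As_Function_Demo is

   --  Hypothetical limited type with default-initialized components.
   type Blah is limited record
      X, Y : Integer := 0;
      Z    : Integer := 99;   --  keeps its default unless overridden
   end record;

   --  The "tweaking" procedure that would be renamed as a function.
   procedure Init_Blah (Obj : in out Blah; X, Y : Integer) is
   begin
      Obj.X := X;
      Obj.Y := Y;
   end Init_Blah;

   --  Result is default-initialized (no initializing expression is given),
   --  and then the procedure is applied to it before the caller sees it.
   function Make_Blah (X, Y : Integer) return Blah is
   begin
      return Result : Blah do
         Init_Blah (Result, X, Y);
      end return;
   end Make_Blah;

   B : constant Blah := Make_Blah (1, 2);

begin
   --  Checked only when assertions are enabled.
   pragma Assert (B.X = 1 and B.Y = 2 and B.Z = 99);
   null;
end Proc_As_Function_Demo;
```

The caller never sees B between default initialization and the call on
Init_Blah, which is the property being argued for.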

For aggressive optimizers, when a procedure is renamed as a function, it
could actually generate the function with a default-initialized "result"
object, and then inline the call on the procedure, and then proceed to remove
all redundant default initializations of the fields of the object.

So back to my initial point.  If we want to provide the ability to
specify how a limited object should be initialized at its point of
creation, let's not kill it with kindness and complexity.
The semantics of a function result being equivalent to applying
a procedure to a default-initialized object are well-defined,
no worse than what is available today for limited types, appropriate
to the underlying principle of limited types as having some amount
of unalterable state bound up with its identity, and
safer and friendlier for the user than forcing a separation between
creation and initialization.

*************************************************************

From: Dan Eilers
Sent: Friday, December 5, 2003  5:42 PM

Robert Eachus wrote:

> Now I can write (using String to make a point):
>
>     constructor Blanks(Length: Integer := 0) return String is
>     begin
>          if Blanks'Constrained
>          then
>             declare
>                 Result: String(Blanks'Range)  :=  (others => ' ');
>              begin
>                 return Result;
>              end;
>          else
>              declare
>                 Result: String(1..Length) := (others => ' ');
>              begin
>                  return Result;
>              end;
>          end if;
>      end Blanks;


It seems to me that in most cases, the bounds of the return type are
determined by the constructor's input parameters.  This permits always
allocating space for the return object prior to the call, even when the
constructor is used in a larger expression.

Your example of Blanks is such a constructor, as is concatenation,
matrix multiply, etc.

If the syntax of constructors allowed the return type to be constrained
by the parameters, then we could simplify your example to:

     constructor Blanks(Length: Integer := 0) return String(1..length) is
     begin
         return (others => ' ');
     end Blanks;

or for "&",

    constructor "&"(x,y: string) return result: String(1..x'length+y'length) is
    begin
        result(1..x'length) := x;
        result(x'length+1..result'last) := y;
        return result;
    end "&";
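
For comparison, a minimal sketch of how close one can get in Ada 2005,
where the result subtype itself stays unconstrained but the return object
is declared with bounds computed from the parameters. The function name
Concat is hypothetical, and this does not give the caller the advance
knowledge of the bounds that the syntax above is meant to provide.

```ada
procedure Concat_Demo is

   --  The bounds of Result depend only on the parameters, as in the
   --  "&" example, but they are stated on the return object rather
   --  than in the function's profile.
   function Concat (X, Y : String) return String is
   begin
      return Result : String (1 .. X'Length + Y'Length) do
         Result (1 .. X'Length) := X;
         Result (X'Length + 1 .. Result'Last) := Y;
      end return;
   end Concat;

begin
   --  Checked only when assertions are enabled.
   pragma Assert (Concat ("ab", "cde") = "abcde");
   pragma Assert (Concat ("", "") = "");
   null;
end Concat_Demo;
```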

*************************************************************

From: Robert A. Duff
Sent: Friday, December 5, 2003  7:04 PM

> It seems to me that in most cases, the bounds of the return type are
> determined by the constructor's input parameters.  This permits always
> allocating space for the return object prior to the call, even when the
> constructor is used in a larger expression.

This is something I've been wanting for years.

*************************************************************

From: Randy Brukardt
Sent: Friday, December 5, 2003  8:02 PM

> I think you are definitely killing this proposal with complexity.

I don't know who "you" refers to here, but I don't think the proposal is that
complex. Moreover, if it isn't complex enough to solve the problem effectively,
it isn't worth doing. If we're not willing to adopt AI-318 as it currently
stands (meaning the change in calling conventions), then I think we need to
step back and take a good look at the problem.

We started by thinking we wanted limited functions. But those make no sense
unless they are a constructor. They're a standard ADT feature that is poorly
supported in Ada, even for non-limited types.

So it makes good sense to look at this as a constructor. The problem is that
all Ada types give very little control over initialization. I'd like to be able
to write a version of Unbounded_Strings that doesn't repeatedly create useless
junk (or have distributed overhead to avoid the useless junk).

> So as things are now, programmers are used to the idea that limited types
> get default initialized, and then you have to be sure to call the
> constructing "procedure" as soon as possible so that no access to
> the object occurs before it is initialized as intended.

What programmers are used to is the idea that limited types sound good in
theory, but are useless in practice. That's one of the reasons that windows in
Claw are non-limited, even though it would make more sense for them to be
limited.

I don't think I've even declared a single limited type since Claw, since it is
so clear that they're useless for ADTs. If you have to trust the user of the
ADT to do something, either the language or the ADT (or both) is broken. That's
precisely the model we need to get away from.

Let's look specifically at the "complexity" of this counter proposal.

It needs a new form of subprogram, the constructor. This is clearly not a
function or procedure. But it reuses most of the syntax and rules from those.
Is this a big deal? I don't think so: AI-348 adds just such a program unit, the
null procedure. (Which is definitely not a "normal" procedure - it is an
entirely new kind of unit, at least as the rules are written. Just as an
abstract subprogram was in Ada 95 - which didn't cause an uproar.)

It might make sense to limit constructors to be primitive operations of the
result type, but I'm not certain that is necessary.

The constructor would use the return do construct proposed in AI-318. This
looks like:

     return identifier : subtype_indication [:= expr]
         [do
             handled_sequence_of_statements;
          end [identifier]];

This would only be allowed in a constructor. Initialization (default or
explicit) would take place when this (compound) statement is evaluated (not
before). Since AI-287 gives us not only limited aggregates, but default
initialized components in those aggregates, it is possible to use an aggregate
for any type if we need to avoid default initialization.
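
A minimal sketch of that combination, written with the extended return
statement as it appears in Ada 2005 (in an ordinary function and closed by
"end return", rather than in a constructor). The types Mutex and Resource
and the function Make are hypothetical.

```ada
procedure Limited_Aggregate_Demo is

   --  Hypothetical limited component that can only be default-initialized.
   protected type Mutex is
      procedure Touch;
      function Touches return Natural;
   private
      Count : Natural := 0;
   end Mutex;

   protected body Mutex is
      procedure Touch is
      begin
         Count := Count + 1;
      end Touch;

      function Touches return Natural is
      begin
         return Count;
      end Touches;
   end Mutex;

   type Resource is limited record
      Id   : Positive;
      Lock : Mutex;
   end record;

   function Make (Id : Positive) return Resource is
   begin
      --  Id is given explicitly; "<>" default-initializes the limited
      --  component; the statements after "do" massage the object.
      return Result : Resource := (Id => Id, Lock => <>) do
         Result.Lock.Touch;
      end return;
   end Make;

   R : Resource := Make (3);

begin
   --  Checked only when assertions are enabled.
   pragma Assert (R.Id = 3);
   pragma Assert (R.Lock.Touches = 1);
   null;
end Limited_Aggregate_Demo;
```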

[Aside: it would make sense to allow (others => <>) for any object, even for
private types. That would give a way to explicitly default initialize a
component without having to specify all of the other components as well.]

The sequence of statements would allow later massaging.

I'd prefer that if the initializing expression is omitted, top-level default
initializations are also omitted. But this isn't essential (and it does
complicate things a bit). I'm sure Tucker will feel better if it is avoided.

Calls on constructors would be limited to places where aggregates are allowed.

If we allowed this in the general case, then some memory allocation would have
to be supported at the point of the return statement. (This is a certainty for
Janus/Ada, no matter what.) That means that we'd probably need to pass in a
storage pool or some other representation, as previously outlined. But since
this is a new feature, there cannot be a compatibility problem. We could
mitigate this problem somewhat by limiting the result subtype to be constrained
(it would still allow everything if a wrapper record was used), but I doubt
that would help enough to justify the oddity.

> Let's look at typical ways of writing "constructor" functions:

OK, let's:

    constructor make_blah1(x, y : params) return blah is
    begin
        return Result : blah := (others => <>) do -- I'm explicitly showing the default
                                                -- initialization here, but that's not
                                                -- necessary.
           Result.x := x;
           Result.y := y;
           Massage(Result);
        end Result;
    end make_blah1;

    constructor make_blah2(x, y : params) return blah is
    begin
        return Result : blah := (x => x, y => y, z => 0) do
            Massage(Result);
        end Result;
    end make_blah2;

   constructor make_blah3(x, y : params) return blah is
   begin
        return Result : blah := some_other_func(x) do -- Some_other_func(x) better be a
                                                      -- constructor, at least if blah
                                                      -- is limited.
            Result.y := y;
            Massage(Result);
        end Result;
    end make_blah3;

   constructor make_blah4(x, y : params) return blah is
   begin
      return Result : blah := (x => x, y => y, z => 0);
   end make_blah4;

These all look good to me. Moreover, the example I gave last night would work
without double construction:

    type Unbounded_String is new Ada.Finalization.Controlled with record
        Str : String_Access := new String'("");
    end record;

    constructor "+" (V : in String) return Unbounded_String is
    begin
       return Result : Unbounded_String :=
           (Ada.Finalization.Controlled with Str => new String'(V));
    end "+";

> What this proposal was trying to create was a situation where the
> desired initialization can be specified at the point of creation, to
> ensure no inappropriate "early" access is possible.  This proposal
> becomes even more important if we add limited aggregates, because
> having to insert a call on a separate initialization procedure for each
> of the limited components of a limited aggregate is going to be a real
> pain.  We really want to have some way of specifying the initialization
> at the place of the component in the aggregate.  Similarly, an initialized
> allocator for limited types is very limiting if the only thing we can
> use is an aggregate.  Something with the syntax of a function call would
> be very useful, because it could be placed in all the contexts where
> we want to specify the (extra) initialization that is to be performed
> before further access is permitted.

Exactly. That's what constructors are for. But they're clearly not functions
(at least not "normal" functions), and they certainly aren't procedures.

> Note that one of the things that makes a limited type different from
> a non-limited type is the sense that it has identity, and it is connected
> to perhaps some entity that can't easily be represented by a few bits
> in a record (e.g. a thread of control, or a mutex, or some external
> resource like a window on a screen).  These kinds of limited objects
> must undergo their "default" initialization, before the programmer
> gets their hands on them, and clearly the programmer isn't going to override
> all of the state of the component.  They are just going to "tweak" the
> state of the component in some way, perhaps, and a procedure is
> just the thing for doing this.  E.g., an initialization procedure-as-function
> for a task might call an entry with some initializing values.  It certainly
> can't "fully initialize" the task.  That is not meaningful for limited
> objects in general.

Of course it is. There are some limited objects/components which have to be
default initialized. We've got a syntax for doing that; there is no problem.

Let me say right now that I'm really only interested in constructors for ADTs.
And I think that virtually all ADTs ought to be controlled. Given how
controlled works in Ada 95, that means I'm only interested in types that are
(ultimately) derived from one of the types in Ada.Finalization. Whatever we
come up with ought to make some sort of sense for other types, but it is not
at all important that it is useful. (Which is why I don't care if constraints
work; those shouldn't generally be visible anyway.)

In any case, what you described would be written as:

    constructor Run_It (My_Id : in String) return Tucks_Tasks is
    begin
       return Result : Tucks_Tasks := (others => <>) do
           Result.Start_Up (My_Id);
       end Result;
    end Run_It;

> So back to my initial point.  If we want to provide the ability to
> specify how a limited object should be initialized at its point of
> creation, let's not kill it with kindness and complexity.

No, but we at least had better be able to do the job. If we cannot avoid
default initialization,  ADTs will have to have a convoluted design internally
to avoid excessive costs -- but that adds a distributed overhead to the use of
them which may be unacceptable.

> The semantics of a function result being equivalent to applying
> a procedure to a default-initialized object are well-defined,

Sure.

> no worse than what is available today for limited types,

Boy, this is a strong endorsement. :-) I want to waste time implementing
something that is "no worse than what we have today".

> appropriate to the underlying principle of limited types as having some amount
> of unalterable state bound up with its identity,

Any proposal I've seen meets this requirement.

> and safer and friendlier for the user than forcing a separation between
> creation and initialization.

Huh? That's precisely what we're trying to avoid with constructors: to be able
to eliminate the separation between creation and initialization. Certainly,
making a separate procedure call at some unspecified later time is a horrible
separation. And doing the default initialization when it is inappropriate is
also an unnecessary separation.

*************************************************************

From: Jean-Pierre Rosen
Sent: Friday, December 5, 2003  3:59 PM

From: "Robert I. Eachus" <rieachus@comcast.net>
>[...]
>          and the new reserved word "constructor."
We could call this "limited function" and save a keyword...
Seems to carry the spirit.

*************************************************************

From: Robert A. Duff
Sent: Saturday, December 6, 2003  9:01 AM

Randy, in reply to Tuck, writes:

> > I think you are definitely killing this proposal with complexity.
>
> I don't know who "you" refers to here,

I didn't understand that either.  Because Tuck didn't say "... writes".
You (Randy) followed suit.  ;-)

> What programmers are used to is the idea that limited types sound good in
> theory, but are useless in practice.

I've heard this vague claim many times.  Could you be more specific?  My
feeling is that the claim is true exactly because of this initialization
problem, and if we solve that, limited types would be very useful
indeed.  Are there *other* issues that you or others think make limited
types useless in Ada?

(I also have limited types with default initialization, and/or kludgy
discriminant hackery, where constructor functions of some sort would be
cleaner.)

>... And I think that virtually all ADTs ought to be controlled.

Unfortunately, that's not feasible if you care about efficiency, given
the huge overhead most compilers have for controlled types.

You can stick to controlled types if you like, but I think it's a bad
idea to assume (for the language design) that the only types that need
constructors are controlled.  I have lots of limited types in my current
project, and lots of types that *would* be limited if only I could
initialize them nicely.

But I have very few controlled types; they're simply too inefficient.
(Actually, I guess I should say I have very few controlled *objects*.
That is, if I have a type where I'm creating and destroying thousands or
millions of objects of the type, I can't make it controlled, because
it's too slow.  The number of *types* is irrelevant to this efficiency
issue.)

*************************************************************

From: Tucker Taft
Sent: Saturday, December 6, 2003  9:39 AM

> Randy, in reply to Tuck, writes:
>
> > > I think you are definitely killing this proposal with complexity.
> >
> > I don't know who "you" refers to here,
>
> I didn't understand that either.  Because Tuck didn't say "... writes".
> You (Randy) followed suit.  ;-)

I thought it was obvious I was talking about Randy and Robert Eachus,
who were proposing a completely new kind of program unit, namely a constructor.

I could see adding a reserved word to make it clear that a given
function was designed to create a new object, and the caller must
allocate space for the object and initialize it at least to
some extent before the call.  I wouldn't limit the contexts in
which such a function could be called.

> > What programmers are used to is the idea that limited types sound good in
> > theory, but are useless in practice.
>
> I've heard this vague claim many times.  Could you be more specific?  My
> feeling is that the claim is true exactly because of this initialization
> problem, and if we solve that, limited types would be very useful
> indeed.  ...

The big question for me is then whether you feel the admittedly
limited (;-) capability provided by procedures renamed as functions
would be adequate.

*************************************************************

From: Robert A. Duff
Sent: Saturday, December 6, 2003  2:05 PM

Tuck says:

> > Randy, in reply to Tuck, writes:
> >
> > > > I think you are definitely killing this proposal with complexity.
> > >
> > > I don't know who "you" refers to here,
> >
> > I didn't understand that either.  Because Tuck didn't say "... writes".
> > You (Randy) followed suit.  ;-)
>
> I thought it was obvious I was talking about Randy and Robert Eachus,
> who were proposing a completely new kind of program unit, namely a
> constructor.

For those of us who might read these things out of order,
or weeks/months/years later, a line at the front saying
"so-and-so said, ..." would be useful.

> I could see adding a reserved word to make it clear that a given
> function was designed to create a new object, and the caller must
> allocate space for the object and initialize it at least to
> some extent before the call.  I wouldn't limit the contexts in
> which such a function could be called.
>
> > > What programmers are used to is the idea that limited types sound good in
> > > theory, but are useless in practice.
> >
> > I've heard this vague claim many times.  Could you be more specific?  My
> > feeling is that the claim is true exactly because of this initialization
> > problem, and if we solve that, limited types would be very useful
> > indeed.  ...
>
> The big question for me is then whether you feel the admittedly
> limited (;-) capability provided by procedures renamed as functions
> would be adequate.

Sorry, but I don't understand the details well enough to be sure.
I don't understand the limitations.  I've lost track.  I just went back
and re-read AI's 318 and 325, but they seem to have obsolete proposals.
I also re-read your original e-mail on this subject, and lots of others,
but I still don't have a good understanding of what all the proposals
are, in detail.

I suspect the answer is, "Yes, the func-renames-proc thing is good
enough".  But I don't understand the details well enough to distinguish
it from the "constructor(...) return ..." syntax.  Is it just an
argument as to which syntax is more sugary, and which more sour?

Despite my current ignorance, I'll offer some comments:

Constructors ought to be composable.  That is, clients should be able to
write constructors, given primitive constructors.  For example, if a
"Sequence" package gives the client a way to construct a singleton
sequence, given one element, and a way to concatenate Sequences, the
client ought to be able to write a constructor that takes two Elements
and produces a sequence of length 2.  This is common in non-limited
cases.  Why not in limited?

This implies that constructors cannot be required to be primitive ops.
And therefore that such constructors cannot see improperly initialized
objects.

----

I like the idea that these new constructor functions are recognized
somehow by the *spec*.

----

One way to write "bullet proof" abstractions is to forbid clients from
creating uninitialized objects, by saying "type T(<>) is limited private".
It would be nice if the abstraction could then export constructor
functions, and clients could compose them, and/or call them.
Clients cannot write "X: T;", so they ought to be able to write
"X: T := Primitive_Constructor(...);", or declare their own composed
constructor, and use that.  Either way, the package itself has total
control over object creation.  The unknown discriminants do not
*necessarily* mean the thing has truly unknown size.
I'm not sure how important this is.
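
A minimal sketch of such a "bullet proof" abstraction, assuming the Ada 2005
rules that let a function return a limited aggregate and let a limited
object be initialized by a function call. The package Handles and its
operations Create, Create_Next, and Id_Of are hypothetical; Create_Next
shows a client composing its own constructor from the exported one.

```ada
procedure Bulletproof_Demo is

   package Handles is
      --  Unknown discriminants: clients cannot declare "X : T;".
      type T (<>) is limited private;
      function Create (Id : Positive) return T;
      function Id_Of (X : T) return Positive;
   private
      --  The full type has a known, fixed size.
      type T is limited record
         Id : Positive;
      end record;
   end Handles;

   package body Handles is
      function Create (Id : Positive) return T is
      begin
         return (Id => Id);
      end Create;

      function Id_Of (X : T) return Positive is
      begin
         return X.Id;
      end Id_Of;
   end Handles;

   --  A client-written constructor composed from the primitive one.
   function Create_Next (After : Handles.T) return Handles.T is
   begin
      return Handles.Create (Handles.Id_Of (After) + 1);
   end Create_Next;

   --  Obj : Handles.T;   --  illegal: no initialization is given
   First  : constant Handles.T := Handles.Create (7);
   Second : constant Handles.T := Create_Next (First);

begin
   --  Checked only when assertions are enabled.
   pragma Assert (Handles.Id_Of (First) = 7);
   pragma Assert (Handles.Id_Of (Second) = 8);
   null;
end Bulletproof_Demo;
```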

----

I like Dan Eilers's idea, of allowing a function result to be definite,
but dependent on parameters, as in "function F(X: String) return
String(1..X'Length)".  It seems related to all this limited-type stuff,
in that it allows the caller to know sizes of function results that
would otherwise be "return String".

*************************************************************

From: Robert I. Eachus
Sent: Saturday, December 6, 2003  3:21 PM

Tucker Taft wrote:

>I thought it was obvious I was talking about Randy and Robert Eachus,
>who were proposing a completely new kind of program unit, namely a constructor.
>
>I could see adding a reserved word to make it clear that a given
>function was designed to create a new object, and the caller must
>allocate space for the object and initialize it at least to
>some extent before the call.  I wouldn't limit the contexts in
>which such a function could be called.
>
>
If you think the limitation I proposed--that a constructor cannot be
used as a prefix of a name--is a big deal, fine.  But procedures can't be
used in that context either, so I think that with the renaming approach
you probably need that restriction too.

To me the constructor proposal without the attributes is slightly
simpler than your (Tucker's) renaming proposal, from a compiler
viewpoint.  But they cover different cases of constructors.  The
renaming approach covers cases where the constraints on the object being
initialized come from the target, where in the new subprogram type
approach, the case where the bounds are determined by the constructor is
easily handled.  Adding the attributes, perhaps only specific
attributes, to the new subprogram approach covers all constructor cases,
but is probably more work than the renaming approach.  So, with my
implementor hat on, I'd definitely want the context where constructors
can appear limited.  This fits nicely with using the constructor name in
attributes.  If the name always refers to the containing instance, then
it is pretty clear that allowing the name as a call in prefixes would
cause ambiguity.

As a user, I very much like the idea of covering all the cases with one
new construct.  And I really think it is worth calling it "constructor"
not "limited function."  I also like the idea of adding the "return
Result: whatever do .. end;" construct.  It is clear to me that in most
constructors you will need a nested scope anyway, unless the constructor
is equivalent to assigning default values to fields.

As for the brouhaha about initialization, I definitely see a use for
constructors in replacing "junk" initializations.  But allowing the
default initializations to occur (unless there is an attribute
assignment) is no big deal.  The solution for the problems where the
initial values are not wanted will be to not provide the defaults, just
an Initialized: Boolean := False; if there is a potential problem.  The
Limited_Controlled example is perfect.  You want/need the
implementation's default initialization of any implementation specific
fields of objects derived from Controlled or Limited_Controlled.  But if
decent constructors are available, the creator of the type extension can
just use constructors instead of default values for the difficult stuff.

This does have one implication that needs watching though--it is not a
problem, but should be explicitly discussed.  What happens if someone
tries to assign a constructor to an already existing limited object?
The three reasonable answers are that it is illegal, Program_Error is
raised, and that the current value is finalized (if it is controlled)
and a new value/object is created.  I favor making it illegal for a
limited object, but Tucker seems to want something else.  (If it is
illegal, then any discussion of whether the object is initialized is out
of bounds.)

>The big question for me is then whether you feel the admittedly
>limited (;-) capability provided by procedures renamed as functions
>would be adequate.
>
At this point my answer is no.  I think that the constructors without
the attribute support are simpler to implement than the renaming
approach, and much easier on the user.  The big issue to me is whether
we need to support constructors that take their bounds from the target
object.  At this point I feel that it is worth doing, but a close call.
However, the partial functionality of constructors without the
attributes is to me significantly better than the renaming approach.  It
can end up requiring slightly more writing and overhead than the renamed
procedure approach to create a constrained object of an unconstrained
type with discriminants (and without a default for the discriminants):

     Fubar: Foo(Bar) := Make_Foo(Bar);

(You would have to explicitly repeat the discriminants once on the
object, once as parameters to the constructor.)  However, the renaming
approach does not handle the case where the constructor determines the
constraints at all.   To give an example I recently ran into, think
about a constructor that converts a linked list into an array.  For a
non-limited type, no problem at all, create a temporary array object,
and walk the list, if you run out of room in the temporary object, recur:

  type String_Array is array (Positive range <>) of Unbounded_String;

  type String_List is private;
...
private
  type String_Node;
  type String_List is access String_Node;

  type String_Node is record
     Value: Unbounded_String;
     Next: String_List;
  end record;
...
function To_String_Array(List: in String_List) return String_Array is
  Result: String_Array(1..10);
  Temp: String_List := List;
begin
    if Temp = null then return Result(1..0); end if;
    for I in 1..10 loop
       Result(I) := Temp.Value;
       if Temp.Next = null then return Result(1..I); end if;
       Temp := Temp.Next;  -- advance to the next element of the list
    end loop;
    return Result & To_String_Array(Temp);
end To_String_Array;

When writing this I was seriously concerned about the number of
Unbounded_String objects being created and assigned, so this is probably
a good test case for arguments about initializations.  (It is possible
to walk the list once counting and create the array "in place" if we add
a way to do that.)
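
A minimal sketch of the "count once, then build in place" alternative,
using the extended return statement of Ada 2005. The names Node,
Node_Access, and Length are hypothetical, and the list is kept visible
rather than private to keep the example short.

```ada
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure List_To_Array_Demo is

   type String_Array is array (Positive range <>) of Unbounded_String;

   type Node;
   type Node_Access is access Node;
   type Node is record
      Value : Unbounded_String;
      Next  : Node_Access;
   end record;

   --  First pass: walk the list once just to count its elements.
   function Length (List : Node_Access) return Natural is
      Count : Natural := 0;
      Temp  : Node_Access := List;
   begin
      while Temp /= null loop
         Count := Count + 1;
         Temp  := Temp.Next;
      end loop;
      return Count;
   end Length;

   --  Second pass: the return object gets its final bounds up front,
   --  so no temporary arrays are built or concatenated.
   function To_String_Array (List : Node_Access) return String_Array is
      Temp : Node_Access := List;
   begin
      return Result : String_Array (1 .. Length (List)) do
         for I in Result'Range loop
            Result (I) := Temp.Value;
            Temp := Temp.Next;
         end loop;
      end return;
   end To_String_Array;

   List : constant Node_Access :=
     new Node'(To_Unbounded_String ("a"),
               new Node'(To_Unbounded_String ("b"), null));

   Arr : constant String_Array := To_String_Array (List);

begin
   --  Checked only when assertions are enabled.
   pragma Assert (Arr'Length = 2);
   pragma Assert (To_String (Arr (1)) = "a");
   pragma Assert (To_String (Arr (2)) = "b");
   pragma Assert (To_String_Array (null)'Length = 0);
   null;
end List_To_Array_Demo;
```

Each element is still default-initialized and then assigned once, so this
reduces the copying but does not remove the default initialization being
debated.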

*************************************************************

From: Robert I. Eachus
Sent: Saturday, December 6, 2003  3:52 PM

Robert A Duff wrote:

>Sorry, but I don't understand the details well enough to be sure.
>I don't understand the limitations.  I've lost track.  I just went back
>and re-read AI's 318 and 325, but they seem to have obsolete proposals.
>I also re-read your original e-mail on this subject, and lots of others,
>but I still don't have a good understanding of what all the proposals
>are, in detail.
>
>I suspect the answer is, "Yes, the func-renames-proc thing is good
>enough".  But I don't understand the details well enough to distinguish
>it from the "constructor(...) return ..." syntax.  Is it just an
>argument as to which syntax is more sugary, and which more sour?
>
>
No, in the cases where there are no discriminants involved either
proposal suffices.  In the renaming approach the object has to be
created before the constructor is called, which means that the bounds
have to be known before (or during) the call.   I just sent in an
example where this is not really possible.  In the new subprogram
approach, we can either allow attributes to query the target, or the
constructor will construct a value and there may be a Constraint_Check
after the call.  This particular constraint check is ugly since it has
to be made after the object is created, and it works best if there is an
explicit copy of the result into the object.  This is why I like the
return...do construct.  Clearly the constraint check will occur inside
the constructor, at the point where the return value is created but
before it can be changed by the do...end; block.

>This implies that constructors cannot be required to be primitive ops.
>
I'm not sure what you are saying here.  Constructors should be allowed
to be primitive, and we will have to argue about whether they become
abstract when inherited unless overridden in any case.  But I don't feel
strongly about whether they can be other than primitive operations.

>And therefore that such constructors cannot see improperly initialized
>objects.
>
>
I tend to agree.  The best argument in favor of this position is that
of type extensions.  In the package that creates an extension, you may
not be able to see into the parent part of the object.  So its
initialization has to be done implicitly or by a call to a constructor
for the parent type.  As I said in my previous post, this is not too
onerous.  If you are the author of a type, you can decide to put any
initialization in the constructors instead of as initial values.

>I like the idea that these new constructor functions are recognized
>somehow by the *spec*.
>
>
>
I think it is necessary.  Otherwise we have a fairly heavy distributed
overhead.  Right now both approaches satisfy this requirement.

>One way to write "bullet proof" abstractions is to forbid clients from
>creating uninitialized objects, by saying "type T(<>) is limited private".
>It would be nice if the abstraction could then export constructor
>functions, and clients could compose them, and/or call them.
>Clients cannot write "X: T;", so they ought to be able to write
>"X: T := Primitive_Constructor(...);", or declare their own composed
>constructor, and use that.  Either way, the package itself has total
>control over object creation.  The unknown discriminants do not
>*necessarily* mean the thing has truly unknown size.
>I'm not sure how important this is.
>
>
You may have just killed the renaming approach.  I had forgotten about
declaring a type with unknown discriminants as it is not too useful
currently.  With the renaming approach you can't use such a type.  With
the constructor as new subprogram approach, even without the attributes,
such types will be used all over the place.  As you say, the only way
for a user to create one will be to call a constructor.

>I like Dan Eiler's idea, of allowing a function result to be definite,
>but dependent on parameters, as in "function F(X: String) return
>String(1..X'Length)".  It seems related to all this limited-type stuff,
>in that it allows the caller to know sizes of function results that
>would otherwise be "return String".
>
I like it too.  But I think it is more of a "nice to have" for other
reasons than a solution to this problem.

*************************************************************

From: Randy Brukardt
Sent: Saturday, December 6, 2003 10:09 PM

Bob Duff wrote:

> Randy, in reply to Tuck, writes:
>
> > > I think you are definitely killing this proposal with complexity.
> >
> > I don't know who "you" refers to here,
>
> I didn't understand that either.  Because Tuck didn't say "... writes".
> You (Randy) followed suit.  ;-)

I usually answer a number of messages at once (otherwise I have to file a lot
more messages). That makes it hard to do and still be understandable.

> > What programmers are used to is the idea that limited types sound good in
> > theory, but are useless in practice.
>
> I've heard this vague claim many times.  Could you be more specific?  My
> feeling is that the claim is true exactly because of this initialization
> problem, and if we solve that, limited types would be very useful
> indeed.  Are there *other* issues that you or others think make limited
> types useless in Ada?

The main problem is that you can't have a component of one in something that
you want non-limited. Of course that has to be the case. But the workaround
(use a pointer) flies in the face of my philosophy of never using pointers
unless dynamic allocation is needed.

The net effect is that either everything in your program has to be limited
types, or everything has to be non-limited. Mixed systems don't work very well,
because they don't compose very well.

I think that's fundamental to limited types. What we can do, however, is make
it more possible to make everything limited (which is often the right choice
anyway), and the constructors are part of them.

But please keep in mind that constructors are not just for limited types, and
in fact allow things for non-limited types that require doing the operations
twice. See my Unbounded_String example, for instance. (BTW, that type is not
one that I made up. That's the type definition in Janus/Ada and in GNAT (at
least when I last looked) for Unbounded_String.) You can work around double
initialization with flags, but that has a distributed overhead: every operation
has to check the flags and be prepared for a missing data component. That is a
lot harder to get right and hurts performance as well.

> >... And I think that virtually all ADTs ought to be controlled.
>
> Unfortunately, that's not feasible if you care about efficiency, given
> the huge overhead most compilers have for controlled types.

If Ada compilers have "huge overhead" for controlled types, they're strangling
proper use of the language.

For Janus/Ada, it takes roughly 8 instructions (not counting the user-defined
code) to finalize an object, and about twice that to initialize it. That's down
in the noise for virtually all uses. Compilers that use mapping solutions
should cost even less (if not, they shouldn't bother with them!!).

The space overhead is more of an issue. I agree that if you have very small
data types, and you need very many objects, then there may be an issue. But
there are very few such types in any project by definition.

> You can stick to controlled types if you like, but I think it's a bad
> idea to assume (for the language design) that the only types that need
> constructors are controlled.  I have lots of limited types in my current
> project, and lots of types that *would* be limited if only I could
> initialize them nicely.

No, I said the only types that need constructors are ADTs. And all ADTs should
be controlled (but not necessarily in the way that Ada 95 does controlled).

> But I have very few controlled types; they're simply too inefficient.
> (Actually, I guess I should say I have very few controlled *objects*.
> That is, if I have a type where I'm creating and destroying thousands or
> millions of objects of the type, I can't make it controlled, because
> it's too slow.  The number of *types* is irrelevant to this efficiency
> issue.)

I was worried about the efficiency of my spam filter, because it stores
everything in Unbounded_Strings -- which are just a ball of heap operations and
finalizations. But it turns out that the big expense is actually loading the
patterns - the actual filtering operations are down in the noise.

If you're creating and destroying millions of objects, you have an efficiency
problem from doing that. Whether the objects are controlled or just integers
doesn't matter at all. Heap operations are so much slower than type creation
that they are the bounding factor -- you'll need very careful caching to make
the thing usable at all.

Anyway, we have completely different philosophies of programming, so I'm not
surprised that our results differ. What I find important is that we solve this
problem in a way that works for as many users as possible.

*************************************************************

From: Randy Brukardt
Sent: Saturday, December 6, 2003 10:25 PM

Bob Duff wrote:

> Constructors ought to be composable.  That is, clients should be able to
> write constructors, given primitive constructors.  For example, if a
> "Sequence" package gives the client a way to construct a singleton
> sequence, given one element, and a way to concatenate Sequences, the
> client ought to be able to write a constructor that takes two Elements
> and produces a sequence of length 2.  This is common in non-limited
> cases.  Why not in limited?

I'm not completely sure what you mean. Certainly, the proposal I've put forward
allows setting components with a constructor (usually in an aggregate). Or did
you mean something else?

> This implies that constructors cannot be required to be primitive ops.
> And therefore that such constructors cannot see improperly initialized
> objects.

I think it is important that constructors don't see *any* objects. That is,
constructors are creating an object -- in no sense should an object be "passed
in" to it. Implementations might implement them that way, but the semantics
certainly should not be that way.

> I like the idea that these new constructor functions are recognized
> somehow by the *spec*.

Good. :-)

> One way to write "bullet proof" abstractions is to forbid clients from
> creating uninitialized objects, by saying "type T(<>) is limited private".
> It would be nice if the abstraction could then export constructor
> functions, and clients could compose them, and/or call them.
> Clients cannot write "X: T;", so they ought to be able to write
> "X: T := Primitive_Constructor(...);", or declare their own composed
> constructor, and use that.  Either way, the package itself has total
> control over object creation.  The unknown discriminants do not
> *necessarily* mean the thing has truly unknown size.
> I'm not sure how important this is.

Well, if you want to do this, I think Tucker's proposal is a non-starter.
(Robert Eachus explained why).

----

A few other observations on the (rough) proposal I put out yesterday:

The proposal is very similar to the AI-318 discussed at the Sydney meeting. The
only real difference is the use of the keyword "constructor". That means that
the implementation issues discussed for that proposal would pretty much hold.

Constructors would be allowed exactly where aggregates are allowed, with the
same semantics. That means a non-limited constructor used in a regular
assignment would create a temporary object, then assign it. (Obviously, an
implementation could optimize that in some cases.)

The AARM points out in several places that := used for initialization is very
different than := used for assignment. What this proposal is really doing is
allowing users to write their own := initialization operations. That's a
capability currently absent in Ada (you can get parts of it, but not the whole
thing).

One additional concern about Tucker's proposal. Many times in the past, we've
used the invariant that you can't call a procedure in a declarative part to
prove that some rule or other can't cause a problem (usually with "in out"
parameters). If that is no longer true, there are a number of rules that would
need to be revisited (freezing, incomplete types, who knows how many others). I
don't look forward to finding new holes caused by eliminating that.

*************************************************************

From: Tucker Taft
Sent: Saturday, December 6, 2003 10:19 PM

Here is an approach that might accomplish all of our goals.
Enhance the procedure-renamed-as-function by also
allowing the renaming of something that looks like a procedure *call*.

For example:

   type Lim(F1 : Integer) is limited private;

   procedure P(Y : Integer; Out_Parm : out Lim);
   function F(X : Integer) return Lim;

private
   type Lim(F1 : Integer) is limited record
      F2 : Task_Type(F1);
   end record;

   function F(X : Integer) return Lim
     renames procedure P(Y => X, Out_Parm => (F1 => X+2, F2 => <>));

The renamed thing can be simply the name of a procedure as in the
earlier proposal, in which case it is equivalent to providing a default
initialized object as the actual for the [IN] OUT parameter of the procedure,
with the other function parameters passed to the corresponding
(by position) procedure IN parameters.

Alternatively, the renamed thing can be a procedure *call,* with the [IN]
OUT parameter given an initializing expression rather than a variable
as the actual.  The other parameters (which would have to be IN parameters)
would be given expressions of the appropriate type.

In both cases, the "result" of calling the "function" is the
final value of the [IN] OUT parameter.

This latter form would clearly give a lot of flexibility, but
the *caller* would still be able to create the object (on the
stack, in the heap, as a component, etc.), perform
the specified initialization (rather than always performing default
initialization), register it for task waiting and/or
finalization, etc., as appropriate, and then call the designated
procedure.

This has a lot of nice properties.  The caller is still in charge
of getting the object to a state at which it can be safely registered
for task waiting and/or finalization, and it is still in charge of
allocation.  The discriminants of the object can be determined
by other parameters to the "function."  And the out-of-line
procedure can still do whatever it needs to do to finish the desired
initialization/construction.

If we consider this direction, we *might* want to allow the
obvious generalization of allowing a function to rename
another function *call,* specified using a similar syntax.

Anyway, this might be a way to kill several birds with one
relatively straightforward stone...

*************************************************************

From: Randy Brukardt
Sent: Saturday, December 6, 2003 10:19 PM

Tucker said:

> Here is an approach that might accomplish all of our goals.
> Enhance the procedure-renamed-as-function by also
> allowing the renaming of something that looks like a procedure *call*.

Well, I think this would solve my specific concern. But I think you're just
making the idea uglier.

The fundamental problem is that a constructor is not a function and it
certainly isn't a procedure. It is more a custom definition of ":=
initialization", and really should have properties appropriate to that.

I have to wonder if this approach would work for a type with unknown
discriminants (Bob indicated that he wants to do that.) I don't see how it
would be able to set them in that case.

...
> This latter form would clearly give a lot of flexibility, but
> the *caller* would still be able to create the object (on the
> stack, in the heap, as a component, etc.), perform
> the specified initialization (rather than always performing default
> initialization), register it for task waiting and/or
> finalization, etc., as appropriate, and then call the designated
> procedure.

I don't see this as much of an issue. The "constructor" proposal has all of
that localized to the return statement. And the cost of doing it there is
not really any different than it is now -- I already have to stand on my
head to keep the return object from being finalized in the function. Doing
the same for tasks is trivial (it's the same chain for us anyway).

I realize other implementors' mileage may differ somewhat, but they've
already got to face the problem of deferring finalization of the return
object.

> This has a lot of nice properties.  The caller is still in charge
> of getting the object to a state at which it can be safely registered
> for task waiting and/or finalization, and it is still in charge of
> allocation.  The discriminants of the object can be determined
> by other parameters to the "function."  And the out-of-line
> procedure can still do whatever it needs to do to finish the desired
> initialization/construction.

I really see no benefit to having the caller do that. That said, I don't see
much harm in it either.

> If we consider this direction, we *might* want to allow the
> obvious generalization of allowing a function to rename
> another function *call,* specified using a similar syntax.

Yes, it certainly would be more consistent that way.

> Anyway, this might be a way to kill several birds with one
> relatively straightforward stone...

The one problem with this seems to be that if you can write an appropriate
aggregate to initialize the object, you probably don't need the constructor
in the first place. It also means that the constructor is split into two
parts arbitrarily: the initial initialization, and the rest of it. That
seems ugly.

*************************************************************

From: Robert I. Eachus
Sent: Sunday, December 7, 2003 12:17 AM

>>Here is an approach that might accomplish all of our goals.
>>Enhance the procedure-renamed-as-function by also
>>allowing the renaming of something that looks like a procedure *call*.
>
>Well, I think this would solve my specific concern. But I think you're just
>making the idea uglier.

Once I bent my head around it, it isn't that bad, and adding functions
seems to cover the case of unknown discriminants nicely.  But there is
no need to subject users to this mind bending experience.  What is the
difference between renaming a procedure call as a function, and having a
constructor function that calls the procedure instead?  Just a lot of
mental gymnastics that users will complain about forever.  The way the
construction returns its result needs to be special because the value
has to be built in place.  Adding the special subprogram type which is
needed for many reasons to the "return ... do ... end;" construct which
should only be allowed in subprograms that are marked in the
specification in some way--and the name constructor does that
nicely--results in a solution that users will just use without
complaining about the need for renaming, or that there is no obvious way
to convert a C++ program with constructors to Ada.  This is a nice to have.

Note one other very nice to have about the "return...do...end;"
construct.  The object name after the return can be used as a prefix
within the sequence of statements.  The compiler has to do whatever
magic is required to match this to the place where the created object
belongs, and raise Constraint_Error if there is a mismatch.
(Constraints in both the return value and the object being created and
they don't match.)  But the normal cases work just fine.  If the type is
classwide, or the object is of an unconstrained subtype of a type with
discriminants, the values of the discriminants are supplied by the
object named in the return statement.  From that point on, from both the
user and compiler's perspective, all the magic is done.  And if you need
attributes of the actual object to do the initialization, they are
already there "for free."    There is a LOT of subtle semantics here.
Probably the most subtle part is what happens if  a constructor is
constructing an object of a class-wide type:

Foo: Bar'Class := Make_Foo(Param1, Param2);

This is actually an overload resolution issue.  If the compiler can
disambiguate which constructor to call from the parameters, fine.  If
not the user will have to change the object type to a specific type.  So
I think we do need the overload resolution rule that the declared return
type of a constructor must be a specific type. (I could see having a
constructor with several different return statements that returned
different specific types, while the declared return type was classwide.
It might be a fun idea to play with, but I think it should be out of
scope for this revision.)

I also feel, and I may be alone on this one, that constructors which are
predefined operations of a type should not be derived as abstract the
way functions are.  With constructors of tagged types there will be many
cases where calling the constructor for the parent type on a view
conversion is the magic you want.  (But if it creates compiler issues, I
can easily be talked out of it.)

>The fundamental problem is that a constructor is not a function and it
>certainly isn't a procedure. It is more a custom definition of ":=
>initialization", and really should have properties appropriate to that.

I agree with Randy.  This is a special thing from both the language
viewpoint and the user's viewpoint.  Trying to avoid that results in a
cognitive dissonance that is an invitation to headaches, both for
language lawyers and Ada programmers.

(Randy said:)

>I don't see this as much of an issue. The "constructor" proposal has all of
>that localized to the return statement. And the cost of doing it there is
>not really any different than it is now -- I already have to stand on my
>head to keep the return object from being finalized in the function. Doing
>the same for tasks is trivial (it's the same chain for us anyway).

Yep.

(Back to Tucker:)

>>If we consider this direction, we *might* want to allow the
>>obvious generalization of allowing a function to rename
>>another function *call,* specified using a similar syntax.

I am not sure that I like this new idea, but without renaming of
function calls, I think it is a non-starter.  And once you look at the
renaming of function calls case we quickly get back to the constructors
with special return values.

>The one problem with this seems to be that if you can write an appropriate
>aggregate to initialize the object, you probably don't need the constructor
>in the first place. It also means that the constructor is split into two
>parts arbitrarily: the initial initialization, and the rest of it. That
>seems ugly.

Worse.  From a user's point of view, the action of the constructor will
often have to be split into two parts, the initialization of the parent
part of the object (possibly done as part of an aggregate), then what is
special to this constructor.  However, the split that Tucker's approach
creates will only match this cognitive division by accident.  (Or by
very careful planning on the user's part.)  I just don't think we need
that additional hurdle to using this language feature.

I really keep coming back to the same thought.  The purpose of this
effort is to make limited types more usable in Ada.  (Some people would
take the "more" out of that statement.)  To do that is going to require
constructors which are pain free from a user's perspective.  If we don't
have that, we have junk that is not worth implementing.   Tucker's
approaches may be very clever tricks, but that is how they will be seen
by users.  It would be fine to implement the actual constructors that
way, but the user view needs to be simple and straightforward to use.

And as I said (by now yesterday my time), I think that the issue Bob
Duff identified with:

type T(<>) is limited private;

is crucial.  Why would I declare such a type?   Because I want to
control the creation of all objects of the type.  But unless I can
export constructors, the type is pretty useless.  Getting rid of that
special field, the only one with an initial value, that says this is
really an uninitialized object, would make me very happy.  But right now
that is the only way to have limited objects with control of creation.
(Actually, I have done nastier things with access discriminants, and
default initial values that call a protected object.  That is one big
hairy kludge.  But it did ensure that each object had a unique,
sequential ID.)

*************************************************************

From: Tucker Taft
Sent: Sunday, December 7, 2003  8:27 AM

Randy Brukardt wrote:
> ...
> I have to wonder if this approach would work for a type with unknown
> discriminants (Bob indicated that he wants to do that.) I don't see how it
> would be able to set them in that case....

This is straightforward:

    type Lim(<>) is limited private;
    procedure P(Y : Integer; Out_Parm : out Lim);
    function F(X : Integer) return Lim;
  private
    type Lim(F1 : Integer) is limited record
        F2 : Task_Type(F1);
    end record;

    function F(X : Integer) return Lim
      renames
        procedure P(Y => X, Out_Parm => (F1 => X+2, F2 => <>));

Now the user can't declare just "L : Lim;" but rather
must call a function like F at the declaration point
("L : Lim := F;").

---------

I understand the concern about the split between the functionality
specified at the rename, and the functionality buried in the procedure.
However, if we want the discriminants (and hence the size) known
to the caller, then you need to be able to write something
available to the caller (in Dan's idea, it was the function
result subtype) which is a function of the parameters.
But Dan's idea doesn't work for a "type Lim(<>) is ..." type
since the discriminants are not visible.  The renaming approach,
because of the possible separation between the initial
declaration in the visible part and the renaming in the
private part, allows for that.

Furthermore, the separation has other advantages.  A single
procedure can be used with several different renamings, with
the renamings differing in the expressions passed for some
of the IN parameters of the procedure, or the initializing
expression for the [IN] OUT parameter.  This is similar
to what is done now with renamings where you can create
several renamings of the same subprogram, with different
default expressions.  I would say this approach is
actually *clearer* than the "trick" of changing the default
expressions on renaming.  Presuming it is available for
function-to-function renaming and procedure-to-procedure
renaming, then it becomes a generally useful capability,
which can also be extended to handle procedure-to-function
renaming which is what limited types need.

I will say (again ;-) I am quite concerned about introducing a new
kind of program unit.  This brings up the issue of library
constructors, generic constructors, subunit constructors,
constructor stubs, etc.  Furthermore, from the caller's
point of view, is there any limitation on where they can
be called, or is it just like a function, and a call on
a constructor can be used anywhere a function can be used?

In my view, functions *are* Ada's constructors, and the fact
that we can name our functions anything we want is a step
up from C++.  The problem is that you can't write decent
functions for limited types.  I think "fixing" this by
inventing a completely new notion of a "constructor" leaves
the existing use of functions out in the cold.  Does this
mean that we should go back and change all the functions
we have created for non-limited types which are often
used as constructors, and make them "true" constructors?

I believe you are unnecessarily "orphaning" functions.

I would rather "orphan" the existing ability to return
by reference, which I think is of marginal use.  If we
wanted to call something a "limited function", I would
say a function that returns by reference is such a thing.

So perhaps the "right" fix is to say that if you want
to continue to use return-by-reference, you have to
label the function a "limited function."  All other
functions are "constructors" in that they create "new"
objects.  For limited types, the caller needs to
know enough to be able to allocate and at least partially
initialize the object, since this "new" object cannot
be copied as part of the function call, but must be
initialized in its final resting place.

Randy indicates he already deals with functions returning
controlled types, but I think for limited controlled,
it is a somewhat different problem, because *no* copying
is permitted (there is no "adjust" procedure).  We have
ourselves worked to minimize the number of copies involved,
but for types with discriminants, we end up with at least
one final copy even in "initializing" contexts.
To support limited types with discriminants, it seems
clear that something available to the caller (such as
a renaming declaration) has to provide an indication
of the value of the discriminants of the "new" object.

*************************************************************

From: Robert A. Duff
Sent: Sunday, December 7, 2003  10:26 AM

> > I've heard this vague claim many times.  Could you be more specific?  My
> > feeling is that the claim is true exactly because of this initialization
> > problem, and if we solve that, limited types would be very useful
> > indeed.  Are there *other* issues that you or others think make limited
> > types useless in Ada?
>
> The main problem is that you can't have a component of one in something that
> you want non-limited.

But that sort of begs the question.  I mean, if I ask, "why can't you
make type T1 limited?", and you say, "because I want T2 to have a
component of type T1, and T2 can't be limited," then I'll ask again,
"OK, why can't you make *T2* limited then?"

My feeling is that the reason you can't make T2 limited is because of
these initialization issues (including aggregates and constructor
functions), and if that were solved, then you would be happy to make
both T2 and T1 limited.  (Or maybe T2 is a component of T3, and so on
-- but somewhere down the line, there's got to be a real reason, other
than, "if I make *this* limited then I'd have to make *that* limited.")

>... Of course that has to be the case. But the workaround
> (use a pointer) flies in the face of my philosophy of never using pointers
> unless dynamic allocation is needed.
>
> The net effect is that either everything in your program has to be limited
> types, or everything has to be non-limited. Mixed systems don't work very
> well, because they don't compose very well.

I don't see any problem with that.  Why would you *want* to compose
them?  I mean, if you have something that's naturally limited (say,
Window_Handle), then I would want things containing Window_Handles to be
limited, too.

> I think that's fundamental to limited types. What we can do, however, is
> make it more possible to make everything limited (which is often the right
> choice anyway), and the constructors are part of them.

Yeah, that's what I think we should do.  My question was, if we do that,
will people *still* gripe that limited types "sound good but are useless
in practise"?  I think not (i.e. fixing the initialization problems is
sufficient to make limited types useful).

> But please keep in mind that constructors are not just for limited types,
> and in fact allow things for non-limited types that require doing the
> operations twice. See my Unbounded_String example, for instance. (BTW, that
> type is not one that I made up. That's the type definition in Janus/Ada and
> in GNAT (at least when I last looked) for Unbounded_String.) You can work
> around double initialization with flags, but that has a distributed
> overhead: every operation has to check the flags and be prepared for a
> missing data component. That is a lot harder to get right and hurts
> performance as well.

I agree that the double-initialization is not nice.  But I don't see a
better alternative.  (OTOH, as I said, I don't fully understand all the
details.)

> > >... And I think that virtually all ADTs ought to be controlled.
> >
> > Unfortunately, that's not feasible if you care about efficiency, given
> > the huge overhead most compilers have for controlled types.
>
> If Ada compilers have "huge overhead" for controlled types, they're
> strangling proper use of the language.

I agree.

> For Janus/Ada, it takes roughly 8 instructions (not counting the
> user-defined code) to finalize an object, and about twice that to initialize
> it.

Pretty impressive, I think.

Anyway, let's not argue about whether finalization is good or evil or
fast or slow.  The fact is, there are some programs (the one I'm working
on right now is an example), that would like to have lots of
non-controlled limited types, with constructors.  You said:

>...What I find important is that we solve
> this problem in a way that works for as many users as possible.

and I agree with *that*.  It implies that we should not create a
solution that works only for controlled types.

*************************************************************

From: Robert I. Eachus
Sent: Sunday, December 7, 2003  3:48 PM

This is a long post.  It is necessary that someone completely work
through these proposals to find any potential problems if they are
adopted.  Not everyone needs to check my work.  (At least the example
package as included here compiles. ;-)

The short version of this post is that Tucker's old or new approach may
cause problems that need to be resolved in the area of protected
objects.  I don't see it as a killer, but it deserves thought.  The new
constructor subprogram type can be allowed as a part of protected types,
but I don't see a pressing need.  Note that it is not the object being
created that needs protecting, it is parameters to the constructors that
may  raise concurrency issues.  'Allowing' a constructor of a type to be
part of a protected object to me is not needed even for orthogonality
reasons.  A constructor never needs a way to protect the object that it
is creating, and of course a constructor in a protected type would not
be creating objects of the protected type.

Tucker Taft wrote:

> Randy Brukardt wrote:
>
>> ...
>> I have to wonder if this approach would work for a type with unknown
>> discriminants (Bob indicated that he wants to do that.) I don't see
>> how it
>> would be able to set them in that case....
>
>
> This is straightforward:
>
>    type Lim(<>) is limited private;
>    procedure P(Y : Integer; Out_Parm : out Lim);
>    function F(X : Integer) return Lim;
>  private
>    type Lim(F1 : Integer) is limited record
>        F2 : Task_Type(F1);
>    end record;
>
>    function F(X : Integer) return Lim
>      renames
>        procedure P(Y => X, Out_Parm => (F1 => X+2, F2 => <>));
>
> Now the user can't declare just "L : Lim;" but rather
> must call a function like F at the declaration point
> ("L : Lim := F;").

Um, that is "L: Lim := F(3);" or some such.  There are two limitations
with this approach.  First, it is difficult to have really unknown
discriminants--the discriminant types probably have to be visible so
that the visible function declaration can have parameters of the type.
(Not a big deal.)   The other limitation is the one that bothers me.  Go
back and look at the example I posted of converting a linked list of
Unbounded_Strings to an array.  There is a workaround to use with this
approach, have a function which walks the list and counts the entries
and use a call to that to pass the array size to the function.  Let me
give a fully worked out (but currently useless) example:
---------------------------------------------------------------------------------------------------------
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
package Unbounded_String_Utilities is
   type String_Array(<>) is limited private;
   type String_List is limited private;
   function Size (Arr: String_Array) return Natural;
   function Size (List: String_List) return Natural;
   function To_String_Array(List: in String_List) return String_Array;
private
   type List_Node;
   type String_List is access List_Node;
   type String_Array is array(Natural range <>) of Unbounded_String;
   type List_Node is record
     Value: Unbounded_String;
     Next: String_List;
   end record;
end Unbounded_String_Utilities;

package body Unbounded_String_Utilities is

   function Size (Arr: String_Array) return Natural is
   begin return Arr'Length; end Size;

   function Size (List: String_List) return Natural is
     Count: Natural := 0;
     Temp: String_List := List;
   begin
     while Temp /= null loop
       Temp := Temp.Next;
       Count := Count + 1;
     end loop;
     return Count;
   end Size;

   function To_String_Array(List: in String_List) return String_Array is
     Result: String_Array(1..10);
     Temp: String_List := List;
   begin
     if Temp = null then return Result(1..0); end if;
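     for I in 1..10 loop
       Result(I) := Temp.Value;
       if Temp.Next = null then return Result(1..I); end if;
       Temp := Temp.Next;  -- advance to the next node
     end loop;
     -- Temp now designates the eleventh node; convert the rest recursively.
     return Result & To_String_Array(Temp);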
   end To_String_Array;

end Unbounded_String_Utilities;
---------------------------------------------------------------------------------------------------------
Right now, without the limited keywords, I can use this package.  But
with String_Array and String_List declared limited, I can't even declare
an object of type String_Array outside the package private part, body,
or child packages.  My proposed solution is to declare To_String_Array
as a constructor, and, if necessary change the return statements.  (But
it shouldn't be necessary in this case, since type String_Array is not
limited inside the body of To_String_Array.)

What do I have to do for Tucker's solution?  I have to replace the body
To_String_Array with a procedure, and a renaming of  a call to the
procedure:

   procedure To_Array(List: in String_List; Result: out String_Array) is
     Temp: String_List := List;
   begin
     for I in Result'Range loop
       Result(I) := Temp.Value;
       Temp := Temp.Next;  -- advance to the next node
     end loop;
   end To_Array;

   Junk: Unbounded_String;

   function To_String_Array(List: in String_List) return String_Array
     renames
       To_Array(List, (1..Size(List) => Junk));

Making all those copies of Junk just to throw them away is something the
compiler might figure out.  But the other problem is the one that
concerns me.  Specifying the size of the array in the renaming makes the
procedure somewhat simpler than the function it replaces.  Instead there
is a call to Size, I probably have to write that function anyway, but
what if someone appends something to the List, or worse, shortens the
list while I am working on it?  Ah, I'll just create a protected object,
and ensure that operations in the package have the appropriate locking
semantics.   Oops!   The problem is not that To_Array is a procedure, it
is that locking for Size then locking for To_Array is useless.  You have
to allow this funny renaming to be an operation of a protected type, and
that operation has to include the evaluation of the parameters of the
procedure call.  The renamed procedure probably also has to be an
operation of the protected type so that  internal calls are
appropriately recognized.  Of course, you could use semaphores and put
the P and V operations in different subprogram bodies.  But doing that
first requires that those particular subprograms not be visible to the
user of the package, and even then is a maintenance nightmare.

What about the new operation approach?   I don't think we need to touch
protected types at all.  I might want to have a per list lock for the
list type, but I can do that by making a new list head type which is the
public type, make it the protected object. (The public type is already
limited, right? ;-) When creating the array, I can have calls in the
constructor to lock and unlock the list head.  I don't need to worry
about protecting the array object during construction, it can't be
referenced before the declaration has been completely elaborated.  If I
had the need, I could do the same thing, in the other direction, lock an
array, copy the data, unlock the array and return.

Note that this does mean that the locking and unlocking operations need
to go into the sequence of statements of the return ... do ... end.
That should be no problem compared to all the cruft necessary to make
the other approach work with protected objects.

If everyone thinks that a constructor should be treated like a function
(or for that matter a procedure) within a protected type declaration for
orthogonality reasons, it can be done, but I don't see any real need
for it.

> I will say (again ;-) I am quite concerned about introducing a new
> kind of program unit.  This brings up the issue of library
> constructors, generic constructors, subunit constructors,
> constructor stubs, etc.  Furthermore, from the caller's
> point of view, is there any limitation on where they can
> be called, or is it just like a function, and a call on
> a constructor can be used anywhere a function can be used?

I think that I have the same concerns as Tucker does, but a different
viewpoint.  There is no real need, outside of orthogonality issues to
have library constructors, generic constructors, or subunit
constructors.  A generic package could certainly contain a constructor
for the ADT it defines, but a constructor as a library unit or generic
unit makes no real sense to me.   Separate bodies for constructors may
be a nice to have, but not if it kills the proposal.

As for where a constructor can be called, my worry is that with Tuck's
proposal, a procedure renamed in private to create a thing that publicly
looks like a function means that it can be called anywhere a function
name is legal.  Otherwise we have a major contract violation.  With my
proposal, my intent is to keep things  as restricted as possible.  If we
have the return ... do ... end construct as Randy advocates, it
certainly should be legal to call the constructor for the parent type in
an aggregate.  You also need to be able to use a constructor to create a
target of an allocator.  And, of course, you need to be able to call a
constructor to create the initial value in an object declaration.

What about as an in parameter  in a procedure or other call?   I don't
see any harm in requiring users to actually  create an object with a
declaration, then pass the object in a call.  The problem now is that in
some cases you can't create a useful object to pass.

In fact, this is probably the hardest part of what we are doing here.
We are not just fixing a problem in the language we are refining what
the properties of a limited object are.   I actually want to make it
more difficult to create a limited object outside the facilities provided
by the ADT.   But that requires  making constructors for limited types
work.

> In my view, functions *are* Ada's constructors, and the fact
> that we can name our functions anything we want is a step
> up from C++.  The problem is that you can't write decent
> functions for limited types.  I think "fixing" this by
> inventing a completely new notion of a "constructor" leaves
> the existing use of functions out in the cold.  Does this
> mean that we should go back and change all the functions
> we have created for non-limited types which are often
> used as constructors, and make them "true" constructors?
>
> I believe you are unnecessarily "orphaning" functions.

LOL!  Most current functions, even those that are part of an ADT, do not
return a value of a private or limited private type.  If you want to
change To_Unbounded_String, etc., to constructors, I see no problem with
that.  But most of the functions in Ada.Strings.Unbounded are actually
operators, and intended to be used as such.  In fact, if you think about
it there are several 'special' types of functions in Ada already, such
as character literals, string literals, operators, and attributes.

If it helps you to think in terms of "constructor function Foo return
Bar is...", fine.  Or perhaps "function Foo return in_place Bar;" is
even better.  I am trying to stay away from "limited function Foo is..."
because I feel that constructors will be used for both limited and
non-limited ADTs.  I also feel that avoiding the word constructor is
wrong.  If it walks like a duck, and quacks like a duck, we don't do
anyone any favors by saying that in Ada we call that a moose.

> I would rather "orphan" the existing ability to return
> by reference, which I think is of marginal use.  If we
> wanted to call something a "limited function", I would
> say a function that returns by reference is such a thing.

I definitely disagree.  It may be that the compiler you are familiar
with seldom returns values by reference, but at least in my code it is
pretty common.

> To support limited types with discriminants, it seems
> clear that something available to the caller (such as
> a renaming declaration) has to provide an indication
> of the value of the discriminants of the "new" object.

 It may be that this is the real crux of the matter.  I don't want the
discriminants made available to the caller. Tucker is probably using
this as shorthand for "made available to the compiler in the context
from which the call is made."   But there are many useful programming
idioms where it is the constructor that determines the properties of the
object created.  Most Ada compilers currently handle this (in the
unconstrained function return value case) very well, and very
efficiently.  What is different in the limited case is that the return
value has to be created in place.  I see this as a dialog between the
calling environment and the constructor which takes place in the return
statement.  Randy apparently sees this as well, and it fits very nicely
with the "return ... do ... end" construct.  The constructor does
whatever it must then reaches the return statement.  There it either
calls a thunk or whatever and gets the space it needs to create the
returned object.  In my mental model, it is fine for the limited case to
allocate space on the heap and the caller will deallocate the object
when the scope is left.  (Finalization, right?)  It is also possible to
have a separate call stack for functions so that return values can be
built directly in the calling procedure's context.  (Works very well in
practice because of the way that functions are used in Ada.)

But these compiler efficiency issues as far as I am concerned are low on
the priority list when discussing this issue.   I would have to put my
weightings at 40% usability, 30% ease of understanding/teaching, 20%
compiler implementation difficulty, and 10% efficiency issues.  I
thought about making a table of issues and comparing proposals  but this
message is long enough as it is.  Maybe someone else will give that a
try, possibly someone not associated with any current position. ;-)

Wish I could join you at the meeting, but right now even Route 128 is
outside my travel radius.

*************************************************************

From: Tucker Taft
Sent: Sunday, December 7, 2003  4:27 PM

> ... In my mental model, it is fine for the limited case to
> allocate space on the heap and the caller will deallocate the object
> when the scope is left.   ...

I don't see how that works for a component.  Only the caller
knows exactly where the object is to be allocated.  Trying
to communicate that to the called routine is not trivial.

*************************************************************

From: Robert A. Duff
Sent: Sunday, December 7, 2003  9:35 PM

Randy wrote:

> Bob Duff wrote:
>
> > Constructors ought to be composable.  That is, clients should be able to
> > write constructors, given primitive constructors.  For example, if a
> > "Sequence" package gives the client a way to construct a singleton
> > sequence, given one element, and a way to concatenate Sequences, the
> > client ought to be able to write a constructor that takes two Elements
> > and produces a sequence of length 2.  This is common in non-limited
> > cases.  Why not in limited?
>
> I'm not completely sure what you mean. Certainly, the proposal I've put
> forward allows setting components with a constructor (usually in an
> aggregate). Or did you mean something else?

I think one of your (or somebody's) e-mails suggested a legality rule
along the lines of "a constructor must be a primitive operation of the
constructed type".  What I meant was, I don't like that restriction,
because no such restriction exists for nonlimited types, and what we're
trying to accomplish is to initialize limited objects just like
nonlimited ones (or as close to that as we can manage).

> > This implies that constructors cannot be required to be primitive ops.
> > And therefore that such constructors cannot see improperly initialized
> > objects.
>
> I think it is important that constructors don't see *any* objects. That is,
> constructors are creating an object -- in no sense should an object be
> "passed in" to it. Implementations might implement them that way, but the
> semantics certainly should not be that way.

I'm not sure what you mean by that.  I was under the impression that all
of the proposals on the table had a named object inside the function
that acts as the function result (or constructor result or whatever),
and that this object *is* the object at the call site that is being
created.  Are you saying that it should be illegal to read from this
object (as in Ada 83 'out' parameters)?

I could be confused -- I admit I've lost track of the details in
the morass of e-mail.

*************************************************************

From: Randy Brukardt
Sent: Sunday, December 7, 2003 11:46 PM

Bob said, replying to me replying to him: (See I can remember to tag
these... :-)
> > > This implies that constructors cannot be required to be primitive ops.
> > > And therefore that such constructors cannot see improperly initialized
> > > objects.
> >
> > I think it is important that constructors don't see *any* objects. That is,
> > constructors are creating an object -- in no sense should an object be
> > "passed in" to it. Implementations might implement them that way, but the
> > semantics certainly should not be that way.
>
> I'm not sure what you mean by that.  I was under the impression that all
> of the proposals on the table had a named object inside the function
> that acts as the function result (or constructor result or whatever),
> and that this object *is* the object at the call site that is being
> created.  Are you saying that it should be illegal to read from this
> object (as in Ada 83 'out' parameters)?
>
> I could be confused -- I admit I've lost track of the details in
> the morass of e-mail.

No, I was simply saying that there shouldn't be any name for the object
being constructed until the constructor is actually ready to create it. That
is, in the proposal I sketched out, not until the return statement is
encountered. At that point, the object will be initialized -- but the
constructor can control what initialization is done (with an aggregate or
another constructor). In particular, the caller cannot and should not be
trying to allocate anything; it has to tell the constructor how to do that
(most likely on a secondary stack).

Tucker said:

> In my view, functions *are* Ada's constructors, and the fact
> that we can name our functions anything we want is a step
> up from C++.  The problem is that you can't write decent
> functions for limited types.  I think "fixing" this by
> inventing a completely new notion of a "constructor" leaves
> the existing use of functions out in the cold.  Does this
> mean that we should go back and change all the functions
> we have created for non-limited types which are often
> used as constructors, and make them "true" constructors?

The few functions which are constructors (like To_Unbounded_String) should
indeed be changed. But most things aren't constructors in the normal sense.

I see a function as returning a copy or reference of an existing object. A
constructor creates a new object. For limited types, that's in-place
construction, and that should be true for limited types as well.

In any case, if we had the will to change *all* functions this way, that
would be fine. But that's not going to happen - it would break too many
programs. So we have to introduce a new concept. That's the penalty for
getting it wrong the last time.

(Now, if constructors allowed "in out" parameters, then we could solve
another problem as well. And then if functions ended up orphaned, good
riddance. But I doubt that we have the will to do that.)

*************************************************************

From: Robert I. Eachus
Sent: Sunday, December 7, 2003 11:49 PM

I am more and more convinced that this will appear to most users as the
single largest change in Ada 0Y.  I don't think the compiler work is
going to be all that hard--after all C++ has constructors. ;-)  But  I
think we do need to devote time to getting the rules right.  If we do
almost all Ada programs will use this feature.   (I certainly know that
I am spending much too much time on this topic, if it is not as
important as I think it will be.)  -- RIE

Tucker Taft wrote:

>>... In my mental model, it is fine for the limited case to
>>allocate space on the heap and the caller will deallocate the object
>>when the scope is left.   ...
>>
>>
>
>I don't see how that works for a component.  Only the caller
>knows exactly where the object is to be allocated.  Trying
>to communicate that to the called routine is not trivial.
>
>
I understand what you are saying, I think.  This is why having a
separate function call stack is such a useful optimization.  If a
procedure (or for that matter a function) wants to build an object "in
place" then, if you can use another stack to contain the function's
return context and any local objects, then the return value just gets
built on top of the heap, or, if necessary in the heap.  If you can't do
this you end up doing things like  copying the return value lower down
the stack after the function returns.  I've seen several variations on
this approach.  I think  the ALS used a ping-ponging scheme, and other
compilers  have a separate stack for function calls.  But given the way
Ada is used, even dynamic calling depths are low, and it is possible to
have a stack for each nesting level most of the time.

Implicit heap use works, and is clearly a workable solution for most of
the cases we have been discussing.  The only tricky part is when a
component of a record being created is "big" and the size is
determined by the called function.  This is where Alsys used to go into
all that "mutant record" stuff.  (I don't know if they still do.)  But
this is just the nature of Ada record structures.  Most of the time it
is possible to allocate a record in one contiguous memory space with no
embedded pointers.  But for some of the extreme cases it is better to
use embedded pointers.  For example, if you have several components
whose size depends on the discriminants, you can pass thunks to
calculate the offsets the way DEC Ada did.  It is just faster (but less
memory efficient) to have offsets or  pointers as part of the structure.

But as I said in my previous message, I would rather have some
inefficiency in the memory management in complex cases if that makes it
easier for programmers to use limited types for the normal cases.  And
as I see it, there are two 'normal' cases here. In the first case,  from
the point of view of the constructor the object will have a constant
size header, which will come from the parent (or collective ancestor)
types, and a fixed size extension.  In the second case, the type will be
a container type with discriminants, and a size that is computed by the
constructor.  But the object will not normally be a component.

I guess if  you really feel strongly about it based on your compiler
implementation, it would be possible to restrict constructors to
creating objects not components.  I just don't see it as a problem.  Of
course, I did advocate (intentionally) that constructors need to be
allowed in aggregates, but omitted default values in records.  You could
add a rule that a constructor is illegal as a default value in a record
definition, but again I don't see why.  The compiler will be creating an
entire object containing the component, just as in the aggregate case.

But I do see a big mess if a constructor can be called to change the
contents of a limited record object.  As I said before, for me--and I
suspect everyone--that is no way, no how what we want. Getting the
limited case right is more important than covering the non-limited ADT
case.  However, I don't see a problem with writing the rule so
constructors for non-limited objects can be used to change components of
records.  (As far as I am concerned, I would expect the compiler to use
the constructor to create a--remember non-limited--object, then copy it
into the actual record.  An "extra" copy in some cases, but I don't see
it as a big deal if it is inefficient to use constructors as other than
constructors.)

*************************************************************

From: Robert I. Eachus
Sent: Monday, December 8, 2003 12:03 PM

Randy Brukardt wrote:

>No, I was simply saying that there shouldn't be any name for the object
>being constructed until the constructor is actually ready to create it. That
>is, in the proposal I sketched out, not until the return statement is
>encountered. At that point, the object will be initialized -- but the
>constructor can control what initialization is done (with an aggregate or
>another constructor). In particular, the caller cannot and should not be
>trying to allocate anything; it has to tell the constructor how to do that
>(most likely on a secondary stack).
>
>
I think Randy and I are in agreement here, even if we don't agree  on
which is the "secondary" stack. ;-)

>In any case, if we had the will to change *all* functions this way, that
>would be fine. But that's not going to happen - it would break too many
>programs. So we have to introduce a new concept. That's the penalty for
>getting it wrong the last time.
>
>
I don't know that we got it "wrong" last time.  I think we just didn't
think through the need at all.  Functions returning results by reference
is a neat trick in other circumstances, but it can't handle this
particular problem/need/programming model.

>(Now, if constructors allowed "in out" parameters, then we could solve
>another problem as well. And then if functions ended up orphaned, good
>riddance. But I doubt that we have the will to do that.)
>
The will, or the stomach?  I would rather have functions with in out
parameters than constructors.  But the Rosen trick will work for both if
needed, and I hope the need can be restricted to random number generator
seeds. ;-)

*************************************************************

From: Tucker Taft
Sent: Monday, December 8, 2003  6:13 AM

Robert Eachus wrote:
>
> Tucker Taft wrote:
>
> >>... In my mental model, it is fine for the limited case to
> >>allocate space on the heap and the caller will deallocate the object
> >>when the scope is left.   ...
> >>
> >>
> >
> >I don't see how that works for a component.  Only the caller
> >knows exactly where the object is to be allocated.  Trying
> >to communicate that to the called routine is not trivial.
> >
> >
> I understand what you are saying, I think.  ...

I'm not sure you do.  You certainly cannot assume that limited components
will be allocated on the heap.  Yes, I suppose some compilers,
like RR's, allocate all nested composite objects with a level
of indirection, but that is the exception, not the rule.  Almost
all the other compilers go out of their way to keep records
contiguous, whether they are limited or non-limited.
I presume you will be able to use a call on one of these
constructor functions to initialize a limited component of a
limited aggregate.  If not, you haven't solved the problem
in my view.  That is:

    X : Lim2 := Lim2'(F1 => 7, F2 => Lim1_Con_Func(3,4));

where Lim1_Con_Func is one of these constructor functions.
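
A compilable sketch of this case, assuming the Ada 2005 rules that later
allowed aggregates and function calls to initialize limited objects.  The
component layouts of Lim1 and Lim2 and the body of Lim1_Con_Func are
invented here purely for illustration; only the declaration of X comes from
the example above.

```ada
with Ada.Text_IO;

procedure Aggregate_Sketch is

   --  Hypothetical limited types; the field names are assumptions.
   type Lim1 is limited record
      A, B : Integer;
   end record;

   type Lim2 is limited record
      F1 : Integer;
      F2 : Lim1;
   end record;

   --  Hypothetical stand-in for one of the constructor functions.
   function Lim1_Con_Func (A, B : Integer) return Lim1 is
   begin
      --  Result names the object being created at the call site.
      return Result : Lim1 do
         Result.A := A;
         Result.B := B;
      end return;
   end Lim1_Con_Func;

   --  The function call initializes the limited component F2 directly.
   X : Lim2 := Lim2'(F1 => 7, F2 => Lim1_Con_Func (3, 4));

begin
   Ada.Text_IO.Put_Line (Integer'Image (X.F1 + X.F2.A + X.F2.B));
end Aggregate_Sketch;
```

Because Lim1 is limited, the result cannot be copied into X.F2, so the space
for the component has to be known to the caller before the call, which is the
point being argued here.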

> ...
> But as I said in my previous message, I would rather have some
> inefficiency in the memory management in complex cases if that makes it
> easier for programmers to use limited types for the normal cases.

I really don't see any compiler changing its record layout just to make
these possible.  That would be hugely disruptive.

> ...
> I guess if  you really feel strongly about it based on your compiler
> implementation, it would be possible to restrict constructors to
> creating objects not components.

That is a non-starter.

> ...
> But I do see a big mess if a constructor can be called to change the
> contents of a limited record object.  ...

No one is proposing that.  Except for the oddball return-by-reference
functions, all function calls are associated with the creation of a
new object.  The only issue is how this is *implemented.*
From the view of the user of the function, there is always a new object
produced.  The procedure-renamed-as-function proposal is suggesting
that the compiler does the allocation and basic initialization
inline before going out-of-line to the procedure.  In the
"return ... do ..." proposal, the allocation still needs to
be done prior to the call.  It could be raw, uninitialized storage,
but the space needs to exist before the call, if you are going
to allow components to be initialized by such a call.

To do the allocation, the compiler needs to know the size
of the result, meaning in general it needs to know the
discriminants if the result type is discriminated.  If the
caller knows the discriminants, then we need to be sure the
out-of-line code uses the same values for the discriminants.
This gets tricky if there is no name that refers to the
returned object until the out-of-line code declares one
(e.g. via the "return Result: T := <expr> do ..." construct), since they
can't refer to the values of the discriminants in the
initializing expression for the return object.

I suppose one way out of this conundrum is to allow references
to discriminants of "Result" in the "<expr>".  That is not the
way normal declarations work, but it could work here.

The other problem, as others have indicated, is that sometimes
we want the discriminants of the new object to be a function
of the parameters.  That seems harder in the
"return ... do ..." approach, since we would have to somehow
communicate the values to the calling context as well, since
they are needed *before* the object can be allocated,
and that needs to happen before the call (because of components).

> ... However, I don't see a problem with writing the rule so
> constructors for non-limited objects can be used to change components of
> records.

I don't know what you are talking about when you say
"change" a component.  Do you mean that for a non-limited
type, you would allow a call on a constructor function to be
the right hand side of an assignment, whose left-hand-side
is a component selection (or anything else, for that matter)?
Or do you mean the function would somehow be passed a reference
to a preexisting object, and treat it as an IN OUT parameter?
I don't understand what a call would look like in this latter
case, and it is not something I have any interest in.

In the procedure-renamed-as-function proposal, it is true
the *procedure* is passed an [IN] OUT parameter, but the
procedure is *not* the "constructor."  The constructor
is the function defined by the renaming.  The procedure
is just a useful hunk of code that the constructor
function reuses.  That same procedure might be used
for other things.  The function defined by the renaming,
on the other hand, has the properties of a "normal" function,
namely that it is always associated with the creation of
a new object.  It is different in that there is enough
information provided so that at the call-site, the compiler
can generate the code to do allocation and basic initialization,
and the out-of-line code need not worry about that.

*************************************************************

From: Pascal Leroy
Sent: Monday, December 8, 2003  7:22 AM

Randy wrote:

> Let's look specifically at the "complexity" of this counter proposal...

If we are going to add constructors to the language (and that's a big
if, given that we are already pretty late in the game) then I strongly
favor the Brukardt-Eachus proposal over the other ideas that have been
floated in this thread.

*************************************************************

From: Tucker Taft
Sent: Monday, December 8, 2003  9:52 AM

It would be helpful if you could augment this with some rationale,
to help understand your view of the strength and weaknesses of the proposals.

*************************************************************

From: Robert I. Eachus
Sent: Monday, December 8, 2003 11:44 AM

Tucker Taft wrote:

>Robert Eachus wrote:
>
>
>>I understand what you are saying, I think.  ...
>>
>>
>
>I'm not sure you do.  You certainly cannot assume that limited components
>will be allocated on the heap.  Yes, I suppose some compilers,
>like RR's, allocate all nested composite objects with a level
>of indirection, but that is the exception, not the rule.  Almost
>all the other compilers go out of their way to keep records
>contiguous, whether they are limited or non-limited.
>
>
I was not saying that a compiler should allocate all limited objects
built by constructors on the heap.  I was saying that it is one of the
cases that has to work:

Something:  Foo_Access := new Foo'(Constructor);

You are arguing for a solution where the caller allocates the space on
the heap, and passes the address to the constructor.  Randy and I are
heading more towards a dialog between the constructor and the caller, or
a thunk based approach.  The space on the heap would normally be
allocated in the return statement.

If the constructor returns an object which is constrained, there is no
problem with components, the caller can know the size of the object
before the call.  The problem with components is exactly the case which
Alsys called mutant records.  This is where allocating the maximum size
of the object doesn't work.

In Ada 83, many compilers used the "allocate the maximum" approach, and
some still do.  I was saying/accepting that for some objects,
Unbounded_String components are an excellent example, the normal case is
going to require a level of indirection.  In both Randy's compiler and
GNAT, that is the way Unbounded_String is declared.

Note that in cases where the compiler knows a reasonable maximum at
compile time, most compilers will, as you say, go out of their way to
allocate the record contiguously.  A good example is the Bounded_String
case.  For a Bounded_String I expect a compiler to allocate the maximum.

Where I think our mental pictures differ is that you are thinking that
since a limited object cannot change in size, this should never be a
problem.   But I want constructors to be able to handle the case where
the size of the object to be created is determined in the constructor.
That was what the To_String_Array example was about.  Ada currently
allows me to handle this case for non-limited objects; it should
actually be easier for limited objects, but right now it is illegal.

So my mental picture of how this works is that there is a dialog
between caller and constructor, probably mediated by a thunk.  When the
constructor gets to the return statement, it says "the object I am about
to create will be N bytes long," and the thunk responds with an address
of the memory space to put it in.  Constructors need this hidden
data/extra parameter, which is why I favor an explicit new subprogram
declaration form.  What exactly it looks like is almost irrelevant to
the implementation issues, but as a user, I think it is much nicer to
name a constructor as a constructor.
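
[Editor's note: The dialog described above can be mimicked in ordinary
Ada as a rough sketch.  Everything here is an assumption made for
illustration: the names Placement_Thunk, Allocate_In_Arena and Construct
are hypothetical, an integer array stands in for raw memory, and an
explicit access-to-function parameter stands in for the hidden parameter
a compiler would pass.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Thunk_Dialog_Demo is

   --  Stand-in for the caller's memory; slots play the role of bytes.
   Arena     : array (1 .. 64) of Integer := (others => 0);
   Next_Free : Positive := 1;

   --  The "thunk": given a size, it answers with a location.
   type Placement_Thunk is
     access function (Size : Positive) return Positive;

   --  Caller-side thunk (hypothetical): reserves Size slots of Arena.
   function Allocate_In_Arena (Size : Positive) return Positive is
      First : constant Positive := Next_Free;
   begin
      Next_Free := Next_Free + Size;
      return First;
   end Allocate_In_Arena;

   --  "Constructor" (hypothetical): it alone decides how big the object
   --  is.  It announces the size through the thunk, fills the space it
   --  is given, and hands back only the size as the "extra" result.
   function Construct
     (Count : Positive; Where : Placement_Thunk) return Positive
   is
      First : constant Positive := Where (Count);
   begin
      for Offset in 0 .. Count - 1 loop
         Arena (First + Offset) := Offset + 1;
      end loop;
      return Count;
   end Construct;

   Size : Positive;

begin
   Size := Construct (3, Allocate_In_Arena'Access);
   Put_Line ("object size:" & Integer'Image (Size));
   Put_Line ("last slot written:" & Integer'Image (Arena (Size)));
   Put_Line ("next free slot:" & Integer'Image (Next_Free));
end Thunk_Dialog_Demo;
```

In this model the caller never learns the size before the call; it only
supplies the policy for where the object goes, which is the property the
"constructor computes constraints" case relies on.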

Note that the hard case is when an object has several components
initialized by constructors, and the size of the object depends on many
of them.  A heroic compiler could create thunks that are co-routines,
and effectively call all the thunks in an aggregate in parallel.  I am
not advocating that a compiler support this, which is why I pointed out
the existence of the indirection bailout.  I would expect compilers to
handle aggregates with one size indeterminate component, and I would
expect aggregates that have more than one such component to be rare.
However, given the choice between a rule that only allows one
constructor in an aggregate, and what Robert Dewar would call "junk
code" when an object has more than one component which has an unbounded
size, I'm willing to allow the junk code in the difficult case so that
the much more common case of  types with one unbounded size component
will be handled efficiently.

Incidentally, this is what I currently get in existing compilers with
components of type Unbounded_String.  The advantage that many of us
expect from these AI's is not to eliminate the hidden indirection in
Unbounded_Strings, it is to eliminate the junk default initialization
that occurs in many cases.  (If you want to read "a limited type with
some components of type Unbounded_String" in the above, feel free.  That
is exactly the normal mapping for database objects that gets painful.)

>I presume you will be able to use a call on one of these
>constructor functions to initialize a limited component of a
>limited aggregate.  If not, you haven't solved the problem
>in my view.  That is:
>
>    X : Lim2 := Lim2'(F1 => 7, F2 => Lim1_Con_Func(3,4));
>
>where Lim1_Con_Func is one of these constructor functions.
>
>
Definitely.  Although I expect the more common/necessary case to be:

           X: Lim2 := Lim2'(Lim1_Con_Func(3,4) with F1 => 7);
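
[Editor's note: A minimal sketch of the component form of this example
and of the allocator case mentioned earlier, written with the extended
return statement as it was later spelled in Ada 2005 ("end return;"
rather than the "end;" used in this thread).  Lim1, Lim2, F1, F2 and
Lim1_Con_Func echo the names in the messages; the component layouts,
Int_Array and Lim1_Access are assumptions made only for illustration.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Limited_Constructor_Demo is

   type Int_Array is array (Positive range <>) of Integer;

   --  A limited type whose size depends on its discriminant.
   type Lim1 (Length : Natural) is limited record
      Data : Int_Array (1 .. Length);
   end record;

   --  Constructor function: the discriminant is computed from the
   --  parameters, and the result object is named and filled in by
   --  the extended return statement.
   function Lim1_Con_Func (A, B : Natural) return Lim1 is
   begin
      return Result : Lim1 (Length => A + B) do
         for I in Result.Data'Range loop
            Result.Data (I) := I * I;
         end loop;
      end return;
   end Lim1_Con_Func;

   --  A limited record with a constrained limited component.
   type Lim2 is limited record
      F1 : Integer;
      F2 : Lim1 (Length => 7);
   end record;

   type Lim1_Access is access Lim1;

   --  Constructor call initializing a component of a limited aggregate.
   X : Lim2 := (F1 => 7, F2 => Lim1_Con_Func (3, 4));

   --  Constructor call initializing a heap object.
   P : constant Lim1_Access := new Lim1'(Lim1_Con_Func (2, 3));

begin
   Put_Line ("X.F1 =" & Integer'Image (X.F1));
   Put_Line ("X.F2.Length =" & Natural'Image (X.F2.Length));
   Put_Line ("P.Length =" & Natural'Image (P.Length));
end Limited_Constructor_Demo;
```

Here the caller fixes the component's discriminant at 7, so the call
must produce a matching value (3 + 4); the allocator case leaves the
size entirely to the constructor.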

>I really don't see any compiler changing its record layout just to make
>these possible.  That would be hugely disruptive.
>
>
I don't see that either.  But I do see the compiler using exactly the
same record layouts for limited objects as non-limited objects with
otherwise identical declarations.  This is necessary anyway, since the
limited type may be non-limited in part of its scope.

>From the view of the user of the function, there is always a new object
>produced.  The procedure-renamed-as-function proposal is suggesting
>that the compiler does the allocation and basic initialization
>inline before going out-of-line to the procedure.
>
I understand that very clearly.  I think what you are missing is that it
is what I object to about the proposal.  I showed in my complete worked
out example the differences, and how the workarounds end up with a
constructor split notationally into two pieces.  Tolerable if there are
no locking constructs involved, but very painful when there are.

>                                                       In the
>"return ... do ..." proposal, the allocation still needs to
>be done prior to the call.  It could be raw, uninitialized storage,
>but the space needs to exist before the call, if you are going
>to allow components to be initialized by such a call.
>
No, what Randy and I are proposing is that the allocation occurs at the
point of the object creation in the return statement (possibly a "return
... do ... end;" construct).  This requires either a thunk, or for the
compiler to pass the address where the object is to be placed as a
hidden parameter, and the size of the object to be an "extra" return
parameter.  (I say "extra" because there is no need for the actual
return value to be returned, the caller knows where it is.)  So in the
common case where the size is known at the point of the call, no thunk
is required, and no return value either.

So back to my initial three cases:

Easy cases (all objects the same size):  pass address, no thunk needed.
Constructor initializing a constrained object:  pass address with
constraints in place, no thunk needed.
Constructor computes constraints:  pass thunk which can be called with
size to get address.

I personally think that given that all three cases are possible, most
uses will be cases one or three.  Except that it is nice to be able to
handle cases two and three with the same constructor code:

Handle all cases:  pass a constrained flag, and a thunk.

The user code then needs to be able to query this flag;  my suggestion
is that the 'Constrained attribute can be queried in the sequence of
statements in the do..end, and if true, then the code can look in the
memory returned by the thunk to see the discriminants.

Once you realize that this model works, it becomes clear that the right
choice from a user's point of view is to choose to permit the most
complex case, and allow compilers to recognize the other cases and
create more efficient code.  But the efficiency is in terms of the
number of parameters passed.  So when I say I am willing to tolerate
inefficiency to get full generality, the inefficiency I am talking about
is sometimes passing one or two "extra" parameters which the compiler
wasn't able to recognize as unnecessary.

>To do the allocation, the compiler needs to know the size
>of the result, meaning in general it needs to know the
>discriminants if the result type is discriminated.  If the
>caller knows the discriminants, then we need to be sure the
>out-of-line code uses the same values for the discriminants.
>This gets tricky if there is no name that refers to the
>returned object until the out-of-line code declares one
>(e.g. via the "return Result: T := <expr> do ..." construct), since they
>can't refer to the values of the discriminants in the
>initializing expression for the return object.
>
>I suppose one way out of this conundrum is to allow references
>to discriminants of "Result" in the "<expr>".  That is not the
>way normal declarations work, but it could work here.
>
>
Actually now that we are down to "bit fiddling", better IMHO is to allow
assignment to T in the sequence of statements following the do.  Then
the combined case above can be handled elegantly as:

        ...
        return Result: Object_Type
        do
            if Result'Constrained
            then
                Result :=  (Disc1 => Result.Disc1,
                            Disc2 => Result.Disc2,...);
            else
                Result :=  Default_Constructor(Param1, Param2);
            end if;
         end;

Of course, when the Object_Type is only limited by fiat, this is
possible for constructors declared in the same package as Object_Type
(or in a child package).  The case where the parent type is limited
should be covered by the aggregate rules so that constructors for types
derived from Limited_Controlled can be written this way.

>The other problem, as others have indicated, is that sometimes
>we want the discriminants of the new object to be a function
>of the parameters.  That seems harder in the
>"return ... do ..." approach, since we would have to somehow
>communicate the values to the calling context as well, since
>they are needed *before* the object can be allocated,
>and that needs to happen before the call (because of components).
>
>
I know that is how you (Tucker) are thinking, but I am going further and
allowing the discriminants to be something that doesn't depend on the
parameters in an obvious way.  The classic case would be a data entry
system.  The constructor may return one of many variants of a record
based on the data entered by the user's interaction with the
constructor.  I am not oblivious to the cost of getting all this
"right,"  which is why I have been indicating a willingness to subset
the functionality for now.

The problem I have with the renaming approach is that it is not obvious,
perhaps not possible, to extend it later to cover all the useful cases.
Originally I was proposing to cover a different subset of the useful
cases than the renaming approach, but Randy and I have converged on a
solution (actually with some help from Dan Eilers) that covers all cases.

>>... However, I don't see a problem with writing the rule so
>>constructors for non-limited objects can be used to change components of
>>records.
>>
>>
>
>I don't know what you are talking about when you say
>"change" a component.  Do you mean that for a non-limited
>type, you would allow a call on a constructor function to be
>the right hand side of an assignment, whose left-hand-side
>is a component selection (or anything else, for that matter)?
>
Yes.

>Or do you mean the function would somehow be passed a reference
>to a preexisting object, and treat it as an IN OUT parameter?
>I don't understand what a call would look like in this latter
>case, and it is not something I have any interest in.
>
>
Ah, but it is something that you do have an interest in. ;-)  This is
exactly the case that is described by your renamed procedure
approach.   As explained above, the constructor approach handles that
case as well.  To handle all cases when the caller doesn't know whether
the particular constructor called will expect existing discriminants or
not requires passing a constrained flag and a thunk.  And of course,
this will generally be the case in Ada, since the body of the package
containing the constructor will not be visible at the point of the
call.   However, the compiler can choose a more optimal calling sequence
when it knows that the object passed has no discriminants.

>In the procedure-renamed-as-function proposal, it is true
>the *procedure* is passed an [IN] OUT parameter, but the
>procedure is *not* the "constructor."  The constructor
>is the function defined by the renaming.  The procedure
>is just a useful hunk of code that the constructor
>function reuses.  That same procedure might be used
>for other things.  The function defined by the renaming,
>on the other hand, has the properties of a "normal" function,
>namely that it is always associated with the creation of
>a new object.  It is different in that there is enough
>information provided so that at the call-site, the compiler
>can generate the code to do allocation and basic initialization,
>and the out-of-line code need not worry about that.
>
>
Understood.  Incidentally, please don't think of what is going on here as
just an argument, or a bunch of people ganging up on Tucker.  What is
really going on is that there are two proposals on the table, and they
have different properties.  I am willing to accept the extra parameters
that the constructor approach requires in some cases to get the
additional functionality it allows.  Tucker's approach is not as
general, but it is somewhat more efficient.  I would argue that the
inefficiency of my approach is not inherent, but depends on the (often
private) type declaration.  There will be cases where Tucker's approach
is clearly more efficient, but not catastrophically so.

So neither solution dominates the other.  The choice becomes a normative
one, depending on how you weight different considerations.  And as I
said earlier, I think the time spent on examining both alternatives
is worth it, since I expect the result to be one of the most heavily
used features of Ada 0Y.

*************************************************************

[Editor's note: Additional discussion on this topic can be found in AI-318.]

*************************************************************

[Editor's note: At the March 2004 ARG Meeting, it was decided to fold parts
of this proposal into AI-318. The rest will be dropped.]

*************************************************************

