!standard 03.03.01 (02)                              03-06-23  AI95-00318/01
!standard 06.05.00 (17)
!standard 06.05.00 (18)
!class amendment 02-10-09
!status work item 03-05-23
!status received 02-10-09
!priority Medium
!difficulty Medium
!subject Returning [limited] objects without copying

!summary

New syntax is proposed for identifying the object that will be returned from a function, allowing the object to be built in the context of the caller, without further copying required. This could be used to support returning limited objects from a function, to support returning objects of an anonymous access type, and more generally to reduce the copying that might be required when a function returns a complex object, a controlled object, etc.

!problem

We already have a proposal for allowing aggregates of a limited type, by requiring that the aggregate be built directly in the target object, rather than being copied into the target. But aggregates can only be used with non-private types. Limited private types still could not be initialized at their declaration point.

It would be natural to allow functions to return limited objects, so long as the object could be built directly in the "target" of the function call, which could be a newly created object being initialized, or simply a parameter to another subprogram call.

We have also considered allowing functions to return anonymous access types. In this case, if the function returned an allocator, it would be natural for the caller context to determine the storage pool to be used by the allocator.

Whether returning a limited type or an anonymous access type, in both cases, it may be desirable to perform some other initialization to the object after it has been created, but before returning from the function. This is difficult to do while still creating the object directly in its "final" location.

!proposal

When declaring a local variable inside a function (not including within a nested program unit), the variable may be declared to be a "return" object, using the following syntax (analogous to the syntax used for constants):

    identifier : [ALIASED] RETURN subtype_indication [:= expression];

Within the scope of a return object (except within nested program units), no other return objects may be declared, and all return statements must have the name of the return object as their returned expression.

[Possible alternative: return statements in the scope of a "return" object must omit the returned expression, and be like the return statements of a procedure. One possible down side of omitting the name of the return object is that it makes the reader's job a bit harder; they have to look back to find the object being returned. One possible up side -- it is perhaps clearer that no copying is happening at the return statement.]

The return object would not be finalized prior to leaving the function. The caller would be responsible for its finalization.

This syntax would not be restricted to limited types. It could also be used for non-limited types. The implementation advice would be that the amount of copying, finalization, etc. should be reduced, if possible, as part of returning from the function. This could be particularly useful for functions that return large objects, or objects with controlled parts.

A call of a function with a limited result type could be used in the same contexts where we have proposed to allow aggregates of a limited type, namely contexts where a new object is being created (or can be):

    1) Initializing a newly declared object (including a "return" object)
    2) Default initialization of a record component
    3) Initialized allocator
    4) Component of an aggregate
    5) IN formal object in a generic instantiation (including as a default)
    6) Expression of a return statement
    7) IN parameter in a function call (including as a default expression)

In addition, since the result of a function call is a name in Ada 95, the following contexts would be permitted, with the same semantics as creating a new temporary constant object, and then creating a reference to it:

    8) Declaring an object that is the renaming of a function call.
    9) Use of the function call as a prefix to 'Address

If we permit function result types to be anonymous access types (e.g. "function Blah return access T"), then we likely will want such functions, if they return the result of an allocator, to be able to use the context of the call to determine the storage pool for the allocator. This proposed syntax would allow the function to do the allocator in the "caller" context, but still be able to perform further initialization of the allocated object after the allocator. Essentially the "return" object would inherit the storage pool determined by the calling context, so that allocators that are used to initialize it, or that are assigned to it later, would use the caller-determined storage pool.

!wording

!example

Here is an example of a function with a limited result type using a "return" object:

    function Construct_Obj(Len : Natural) return Lim_Type is
        Result : return Lim_Type(Discrim => Len);  -- the "return" object
    begin
        -- Finish the initialization of the "return" object.
        for I in 1..Len loop
            Result.Data(I) := I;
        end loop;
        -- And now return it.
        return Result;
          -- [Alternative: omit "Result" (or entire return statement);
          --  "return Result;" would be implicit]
    end Construct_Obj;

Here is essentially the same function, but with an anonymous access type for its result type:

    function Construct_Obj(Len : Natural) return access Lim_Type is
        Result : return access Lim_Type;  -- The "return" object
    begin
        Result := new Lim_Type(Discrim => Len);
          -- this uses the storage pool determined by the caller context

        -- Finish the initialization of the allocated object
        for I in 1..Len loop
            Result.Data(I) := I;
        end loop;
        -- And now return it.
        return Result;
          -- [Alternative: omit "Result" (or entire return statement);
          --  "return Result;" would be implicit]
    end Construct_Obj;

By "caller context", we mean that the same rules as apply to an allocator would apply to calls on this function, where the expected (access) type would determine the storage pool:

    type My_Acc_Type is access Lim_Type;
    for My_Acc_Type'Storage_Pool use My_Amazing_Stg_Pool;
    P : My_Acc_Type;
  begin
    P := Construct_Obj(3);
      -- allocator inside Construct_Obj uses My_Amazing_Stg_Pool

!discussion

In meetings with Ada users, there has been a general sense that if limited aggregates are provided in Ada 200Y, it would be desirable to also provide limited function returns which could act as "constructor" functions.

Just allowing a function whose whole body is a return statement returning an aggregate (or another function call) does not give the programmer much flexibility. What they would like is to be able to create the object and then initialize it further somehow, perhaps by calling a procedure, doing a loop (as in the examples above), etc. This requires a named object. However, to avoid copying, we need this object to be created in its final "resting place," i.e. in the target of the function call. This might be in the "middle" of some enclosing composite object the caller is initializing, or it might be in the heap, or it might be a stand-alone local object.

Because the implementation needs to create the returned object in a place or a storage pool determined by the caller, it is important that the declaration of the object be distinguished in some way. By using the keyword "return" in its declaration, we have a fairly intuitive way for the programmer to indicate that this is *the* object to be returned. Clearly we only want to allow one of these at a time, and to require that all return statements within its scope explicitly (or perhaps implicitly) return that object. Because it may be necessary to do some computing before deciding exactly how the return object should be declared, we permit the return object to be declared within nested blocks within the function so long as there is no return object for the function already in scope. So different branches of an if or case statement could declare their "own" return object if appropriate, for example. Note that we have allowed the user to declare the return object as "aliased." This seems like a natural thing which might be wanted, so you could initialize a circularly-linked list header to point at itself, etc. We had considered a different syntax for this before, namely a new kind of return statement, analogous to an accept statement, e.g.: return Result : T := blah do Result.Data(3) := 77; ... end Result; However, Bob Duff pointed out that for simple cases you ended up with two levels of nesting which seemed excessive: function Fum() return T is begin return Result : T := blah do Result.Data(3) := 77; ... end Result; end Fum; Making a smaller change to the object declaration syntax seemed a simpler approach. POSSIBLE IMPLEMENTATION APPROACHES The implementation approach for anonymous access result types is very similar to that for limited result types. In the following, we will mostly talk about limited result types. Towards the end we will explain how it applies to anonymous access result types. Full accessibility level checking adds to the complexity. 
At the end we will show how to introduce restrictions that eliminate most of this complexity, in exchange for some loss in functionality.

The implementation of this for limited result types is straightforward if the size of the result is known to the caller. It is essentially equivalent to a procedure with an OUT parameter -- the caller allocates space for the target object, and passes its address to the called routine, which uses it for the "return" object.

If the size of the function result is not known to the caller (i.e. the function result subtype is unconstrained, and perhaps indefinite), then there are two basic possibilities:

    1) The target object's (nominal) subtype is constrained (or at least "definite"), even though the function result subtype is unconstrained; the target object might be a component of a larger object.

    2) The target object's nominal subtype is unconstrained, and its size is to be determined by the result returned from the function; the target object must be a stand-alone object, or an "entire" heap object.

In the first case, the caller determines the size of the target object and can allocate space for it; in the second, the caller cannot preallocate space for the target object, and must rely on the called routine allocating space for it in an "appropriate" place.

The code for the called routine must handle both of these cases. One reasonable way to do so is for the caller to provide a "storage pool" for the result. In the first case, this storage "pool" has space for exactly one object of a given maximum size. Its Allocate routine is trivial: it just checks that the requested size is no greater than the space available, and then returns the preallocated (target) address. In the second case, the storage pool is either the storage pool associated with the initialized allocator at the call site, or a storage pool that represents a secondary stack, or equivalent, used for returning objects of unknown size from a function.

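
The one-object "pool" described for the first case can be sketched with the standard System.Storage_Pools interface. This is only an illustration and assumes an Ada 2005 or later compiler (for the nested type extension and the overriding indicators). The names One_Slot_Pool, Target and Capacity are hypothetical, and raising Storage_Error when the object does not fit is an assumption; the text above does not say which exception, if any, would be raised.

```ada
with Ada.Text_IO;
with System;
with System.Storage_Elements; use System.Storage_Elements;
with System.Storage_Pools;

procedure One_Slot_Demo is
   use type System.Address;

   package Slots is
      --  Hypothetical "pool" that wraps exactly one preallocated target.
      type One_Slot_Pool is
        new System.Storage_Pools.Root_Storage_Pool with
      record
         Target   : System.Address := System.Null_Address;
         Capacity : Storage_Count  := 0;
      end record;

      overriding procedure Allocate
        (Pool                     : in out One_Slot_Pool;
         Storage_Address          : out System.Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      overriding procedure Deallocate
        (Pool                     : in out One_Slot_Pool;
         Storage_Address          : System.Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      overriding function Storage_Size
        (Pool : One_Slot_Pool) return Storage_Count;
   end Slots;

   package body Slots is
      overriding procedure Allocate
        (Pool                     : in out One_Slot_Pool;
         Storage_Address          : out System.Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
      begin
         --  Trivial check: does the object fit in the caller's target?
         --  (Assumption: Storage_Error is raised if it does not.)
         if Size_In_Storage_Elements > Pool.Capacity then
            raise Storage_Error;
         end if;
         Storage_Address := Pool.Target;
      end Allocate;

      overriding procedure Deallocate
        (Pool                     : in out One_Slot_Pool;
         Storage_Address          : System.Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
      begin
         null;  --  The caller owns the target; nothing to release here.
      end Deallocate;

      overriding function Storage_Size
        (Pool : One_Slot_Pool) return Storage_Count is
      begin
         return Pool.Capacity;
      end Storage_Size;
   end Slots;

   --  The caller's preallocated target object.
   Target_Obj : aliased Integer := 0;
   Slot       : Slots.One_Slot_Pool;

   type Int_Ptr is access Integer;
   for Int_Ptr'Storage_Pool use Slot;

   P : Int_Ptr;
begin
   Slot.Target   := Target_Obj'Address;
   Slot.Capacity := Integer'Max_Size_In_Storage_Elements;

   --  This allocator stands in for the one evaluated by the called routine.
   P := new Integer'(42);

   if P.all'Address = Target_Obj'Address then
      Ada.Text_IO.Put_Line ("object was built in the caller's target");
   else
      Ada.Text_IO.Put_Line ("object was built elsewhere");
   end if;
end One_Slot_Demo;
```

In the second case the same Allocate call would instead be dispatched to the pool of the access type at the call site, or to a secondary-stack pool, without any change to the called routine.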
For upward compatibility, we would need to accommodate functions that return pre-existing objects by reference. One way to do this would be for the caller to provide an additional implicit boolean parameter which would indicate whether the called routine *must* create a new object, or could return a reference to an existing object. Of the nine places identified above where calls on functions with limited result type would be permitted, the cases where the called routine must create a new object are (1)-(5). Cases (6)-(9) allow the use of preexisting objects, so the storage pool provided would generally be the secondary stack if the size is unknown to the caller, or a preallocated primary stack area, if the size of the object returned is always the same. Case (6), where a return statement returns the result of a function call, is a bit of a halfway situation. For (6), the storage pool provided as part of the call in the return statement would be the same storage pool passed to the function. When the boolean flag indicates that a new object is not required, the called routine could return a reference to a preexisting object, and ignore the storage pool or target address provided. As a possible optimization, this case could be indicated by simply providing a null storage pool parameter, rather than a separate boolean flag. The called routine would take this to mean that the secondary stack, or equivalent, should be used if a new object is being created, but that it may return a reference to a preexisting object. For the simplest implementation model where the size of the result is always known to the caller, and no storage pool parameter is provided, a separate flag would probably be necessary. The net effect is that there would be one implicit parameter in both situations, a boolean flag for the known-size function result, and a possibly-null storage pool for the unknown-size function result. 
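
For the known-size case, the implicit boolean parameter can be pictured as follows. This is a hand-written model of what a compiler might generate, not proposed syntax: F_Lowered, Must_Create, Target and Existing are hypothetical names, and an explicit access value stands in for the target address and the returned result address.

```ada
with Ada.Text_IO;

procedure Implicit_Flag_Demo is

   type Lim is limited record
      Value : Integer := 0;
   end record;

   type Lim_Ref is access all Lim;

   --  A preexisting object that may be returned by reference.
   Existing : aliased Lim;

   --  Hypothetical lowered form of "function F return Lim".
   --  Must_Create models the implicit flag; Target models the
   --  address of the space preallocated by the caller.
   function F_Lowered
     (Must_Create : Boolean;
      Target      : Lim_Ref) return Lim_Ref is
   begin
      if Must_Create then
         --  Contexts (1)-(5): build the result in the caller's target.
         Target.Value := 77;
         return Target;
      else
         --  Contexts (6)-(9): a reference to an existing object is
         --  acceptable, and the target is ignored.
         return Existing'Access;
      end if;
   end F_Lowered;

   New_Obj : aliased Lim;
   R       : Lim_Ref;

begin
   Existing.Value := 5;

   R := F_Lowered (Must_Create => True, Target => New_Obj'Access);
   Ada.Text_IO.Put_Line ("built in place:" & Integer'Image (R.Value));

   R := F_Lowered (Must_Create => False, Target => null);
   Ada.Text_IO.Put_Line ("by reference:  " & Integer'Image (R.Value));
end Implicit_Flag_Demo;
```

In both calls the caller uses the returned reference, which matches the rule that the called routine always returns the address of the result, whether newly created or preexisting.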
In all cases, the called routine would return the address of the result, whether newly created or preexisting. The caller would use this returned address in all cases where the function result might be a preexisting object (cases (6)-(9)), or in cases where the caller didn't preallocate space for the target. IMPLEMENTATION APPROACH FOR ANONYMOUS ACCESS RESULT TYPE For anonymous access result types, a very similar approach would be taken. In this case, however, a new object is never required. It would always be permissible to return an access value designating a preexisting object. The storage pool parameter would always be required, but the caller could always ignore it. An accessibility level would be needed associated with the storage pool, so the called routine would know the accessibility level of the result of an allocator that used the storage pool. An accessibility level would also need to be returned, so the caller would know the accessibility of the result. Although the RM talks about accessibility levels in terms of dynamic levels of nesting, most implementations use accessibility levels that correspond to static levels of nesting, but adjust the level when passing a given (formal) access parameter to a more nested subprogram with an access parameter as well, by collapsing deeper static levels into a level that corresponds to the static level of the given formal access parameter's declaration. This is explained in AARM 3.10.2(22.x-22.ee). Unfortunately, this "collapsing" of levels loses information. So when passing accessibility levels to and from a function with an anonymous access type result, it would be desirable to avoid "collapsing" such levels, and use the original accessibility levels. In some implementations it might be helpful for the caller to provide the called routine with a level for the called routine to use for its own locals, which is guaranteed to be deeper than any level number that the caller cares about. 
LIMITED TYPES AND ACCESSIBILITY ISSUES

Because limited types can have access discriminants, and an accessibility check is required when an allocator for such a type is performed to be sure the allocated object doesn't outlive the object referenced by the access discriminant, some kind of accessibility level will also have to be provided to the called routine when a storage pool is provided, at least when the result type has access discriminants. Because the storage pool will often be local to the caller and the access discriminant might be specified via an access parameter to the function, the collapsing of accessibility levels mentioned above would have to be suppressed in this case as well.

Hence we end up with a general rule that when access parameters are passed to a function with a limited result type (with access discriminants), or with an anonymous access result type, no collapsing of accessibility levels is performed. The caller's accessibility levels are used in the access parameters, and in the storage pool. The called routine has to accommodate this somehow. Again, in some implementations it may be helpful for the caller to provide the called routine with an accessibility level it can use for its own locals that is certain to be deeper than any other level passed in from the caller.

POSSIBLE SIMPLIFICATIONS OF ACCESSIBILITY CHECKING

If we would like some of these capabilities, but would like to avoid dealing with uncollapsed accessibility levels, accessibility levels associated with storage pools, etc., then we could make some restrictions that might simplify the implementation (though of course it would complexify the user's model a bit):

1) If a limited result type has access discriminants, then the storage pool passed in must not outlive the function declaration. This would imply that the function could safely set the access discriminants to point to objects with an accessibility level no deeper than the function declaration.
This is similar to the test performed on return-by-reference now (6.5(17-20)). With this restriction, no accessibility level needs to be passed in with the storage pool for limited result types. Note also that with this restriction, calls on local functions could not be used within initialized allocators for global access types, if the function's result type is a limited type with access discriminants (doesn't seem like much of a loss).

2) For anonymous access result types, if the storage pool were not used inside the function, the accessibility of the returned access value must be no deeper than that of the function declaration (e.g., it could not return the value of an access parameter passed in, unless the access parameter designated an object global to the function). Again, this is essentially the check performed now for return-by-reference types. If the storage pool is used, then naturally the accessibility is that of the storage pool, so the caller knows that the maximum accessibility depth of the result is the depth of the storage pool or the depth of the function declaration, whichever is deeper. If the designated type of the anonymous access type is a limited type with access discriminants, then the same restriction as (1) would apply to the storage pool, i.e. that the storage pool depth must be no less than that of the function declaration.

With these restrictions, no accessibility level needs to be passed in with the storage pool for anonymous access result types, and in turn no level would be returned (without the storage pool level passed in, it would be pretty much impossible to pass it back!). Note that without the level being returned, a local function could not be used to create a value assigned to a variable of a global access type, since the function might return a pointer designating a local object, and it has no way of indicating that.

CONCLUSION

If we are willing to accept the restrictions of the above section, then the implementation burden is roughly the same for either returning limited types or returning anonymous access types, namely that a storage pool may need to be passed in. The called routine needs to use that storage pool when creating a limited return object, or evaluating an allocator whose target is the anonymous access return object. If the return object is itself initialized by a function call, then the storage pool needs to be passed into that function as well, presuming that function also returns a limited type or an anonymous access type.

If a user doesn't explicitly declare a return object, then each return statement is equivalent to a local block that declares a return object initialized from the return expression, and then returns it.

If we don't want to accept the restrictions given above, then accessibility levels need to be passed with the storage pool, and the accessibility levels passed with access parameters should not be "collapsed." An accessibility level would be returned from a function with an anonymous access result type.

Note that an additional advantage of the restricted form is that more accessibility checking can be performed at compile-time, and it will generally involve less run-time overhead. Given this, it seems appropriate to consider the restricted (compile-time accessibility) form of the proposal first, and only if this is felt sufficiently valuable, to consider the unrestricted form of the proposal.

!ACATS test

!appendix

From: Robert Duff
Sent: Wednesday, February 6, 2002, 6:05 PM

Limited Types Considered Limited

One of my homework assignments was to propose changes to "fix" limited types. This e-mail outlines the problems, and proposed solutions. I haven't written down every detail -- I first want to find out if there is any interest in going forward with these changes. Some of these ideas came from Tucker.
I apologize for doing my homework at the last minute. I hope folks will have a chance to read this before the meeting, and I hope Pascal is willing to put it on the agenda. Ada's limited types allow programmers to express the idea that "copying values of this type does not make sense". This is a very useful capability; after all, the whole point of a compile-time type system is to allow programmers to formally express which operations do and do not make sense for each type. Unfortunately, Ada places certain limitations on limited types that have nothing to do with the prevention of copying. The primary example is aggregates: the programmer is forced to choose between the benefits of aggregates (full coverage checking) and the benefits of limited types. Forcing programmers to choose between two features that ought to be orthogonal is one of the most frustrating aspects of Ada. I consider the full coverage rules (for aggregates and case statements) to be one of the primary benefits of Ada over many other languages, especially with type extensions, where some components are inherited from elsewhere. I will refrain from further singing the praises of full coverage; I assume I'm preaching to the choir. My goals are: - Allow aggregates of limited types. - Allow constructor functions that return limited types. - Allow initialization of limited objects. - Allow limited constants. - Allow subtype_marks in aggregates more generally. (They are currently allowed only for the parent part in an extension aggregate.) The basic idea is that there is nothing wrong with constructing a limited object; *copying* is the evil thing. One should be allowed to create a new object (whether it be a standalone object, or a formal parameter in a call, or whatever), and initialize that object with a function call or an aggregate. In implementation terms, the result object of the function call, or the aggregate, is built in place in its final destination -- no copying is necessary, or allowed. 
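
For comparison, here is a minimal sketch of how an object of a limited type is typically set up in Ada 95 today: the caller declares a default-initialized object and then passes it to an initialization procedure. The type Counter and the procedure Init are hypothetical names chosen for illustration. The object is never copied, but it cannot be a constant, and the initial values cannot be given at the point of declaration, which is what the goals listed above would change.

```ada
with Ada.Text_IO;

procedure Init_In_Place_Demo is

   type Counter is limited record
      Start : Integer := 0;
      Step  : Integer := 1;
   end record;

   --  Ada 95 workaround: the caller allocates the object, and this
   --  procedure fills it in where it already lives.
   procedure Init (Obj : in out Counter; Start, Step : Integer) is
   begin
      Obj.Start := Start;
      Obj.Step  := Step;
   end Init;

   C : Counter;  --  default-initialized first, then set up by Init

begin
   Init (C, Start => 10, Step => 2);
   Ada.Text_IO.Put_Line
     ("Start =" & Integer'Image (C.Start) &
      ", Step =" & Integer'Image (C.Step));
end Init_In_Place_Demo;
```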
All of the above goals except constructor functions are fairly trivial to achieve, both in terms of language design and in terms of implementation effort. Constructor functions are somewhat more involved. However, I am against any language design that allows aggregates where function calls are not allowed; subprogram calls are perhaps the single most important tool of abstraction ever invented! (There is at least one other such case in Ada, and I hate it.)

By "constructor function", I mean a function that returns an object created local to the function, as opposed to an object that already existed before the function was called. Ada currently allows functions to return limited types in two cases, neither of which achieves the goal here:

If the limited type "becomes nonlimited" (for example, a limited private type whose full type is integer), then constructor functions are allowed, but the return involves a copy, thus defeating the purpose of limited types. Anyway, this feature is not allowed for various types, such as tagged types.

If the limited type does not become nonlimited, then it is returned by reference, and the returned object must exist prior to the function call; it cannot be created by the function. In essence, these functions don't return limited objects at all; they simply return a pointer to a preexisting limited object (or perhaps a heap object).

We need a new kind of function that constructs a new limited object inside of itself, and returns that object to be used in the initialization of some outer object. The run-time model is that the caller allocates the object, and passes in a pointer to that object. The function builds its result in that place; thus, no copying is done on return.

Because the run-time model for calls to these constructor functions is different from that of existing functions that return a limited type, we need to indicate this syntactically on the spec of the function.
In particular, change the syntax of functions so that "return" can be replaced by "out", indicating a constructor function. In addition, change the syntax of object declarations to allow "out", as in "X: out T;"; this marks the object as "the result object" of a limited constructor function. The reason for "out" is that these things behave much like parameters of mode 'out'. Examples: type T is tagged limited record X: ...; Y: ...; Z: ...; end record; type Ptr is access T'Class; Object_1: constant T := (X => ..., Y => ..., Z => ...); function F(X: ...; Y: ...) out T; function F(X: ...; Y: ...) out T is Result: out T := (X => X, Y => Y, Z => ...); begin ... -- possible modifications of Result. return Result; end F; Object_2: Ptr := new T'(F(X => ..., Y => ...)); -- Build a limited object in the heap. Rules: Change the rules in 4.3 to allow limited aggregates. This basically means erasing the word "nonlimited" in a few places. Change the rule in 3.3.1(5) about initializing objects to allow limited types. But require the expression to be an aggregate or a constructor function. ("X: T := Y;", where Y is a limited object, remains illegal, because that would necessarily involve a copy.) There are various analogous rules (initialized allocators, subexpressions of aggregates, &c) that need analogous changes. Assignment statements remain illegal for limited types, even if the right-hand side is an aggregate or limited constructor function. Allowing constants falls out from the other rules. Allow a component expression in an aggregate to be a subtype_mark. This means that the component is created as a default-initialized object. It's essentially the same thing we already allow in an extension aggregate; we're simply generalizing it to all components of all aggregates. This is important, in case some part of the type is private. There is no reason to limit this capability to limited types. 
Specify that limited aggregates are built "in place"; there is always a newly-created object provided by the context. Note that we already have one case where aggregates are built in place: (nonlimited) controlled aggregates. Similarly, the result of a limited constructor function is built in place; the context of the call provides a newly-created object. (In the case of "X: T := F(...);", where F says "return G(...);", F will receive the address of X, and simply pass it on to G.) If the result object of a limited constructor function contains tasks, the master is the caller. For a function whose result is declared "out T", T must be a limited type; such a function is defined to be a "limited constructor function". Subtype T must be definite. This rule is not semantically necessary. However, the run-time model calls for the caller to allocate the result object, and this rule allows the caller to know its size before the call. Without this rule, a different run-time model would be required for indefinite subtypes: the called function would have to allocate the result in the heap, and return a pointer to it. A design principle of Ada is to avoid language rules that require implicit heap allocation; hence this rule. (An alternative rule would be that T must be constrained if composite, thus eliminating defaulted discriminants.) A limited constructor function must have exactly one return statement. The expression must be one of the following: - An object local to the function (possibly nested in block statements), declared with "out". - A function call to a limited constructor function. - An aggregate. - A parenthesized or qualified expression of one of these. An object declared "out" must be local to a limited constructor function. A constraint check is needed on creation of a local "out" object. 
We have to do the check early (as opposed to the usual check on the return statement), because we need to make sure the object fits in the place where it belongs (at the call site). If the return expression is an aggregate, that needs a constraint check, as usual. If the return expression is a function call, then that function will do whatever checking is necessary. Is there an issue with dispatching-on-result functions? I don't think so. Compatibility: This change is not upward compatible. Consider: type Lim is limited record Comp: Integer; end record; type Not_Lim is record Comp: Integer; end record; procedure P(X: Lim); procedure P(X: Not_Lim); P((Comp => 123)); The call to P is currently legal, and calls P(Not_Lim). In the new language, this call will be ambiguous. This seems like a tolerable incompatibility. It is always caught at compile time, and cases where nonlimitedness is used to resolve overloading have got to be vanishingly rare. The above program, though legal, is highly confusing, and I can't imagine anybody wanting to do that. The current rule was a mistake in the first place: even if limited aggregates *should* be illegal, that should not be a Name Resolution Rule. Other advantages: One advantage of this change is that it makes the usage style of limited types more uniform with nonlimited types, thus making the language more accessible to beginners. How do you construct an object in Ada? You call a function. Cool -- no need for the kludginess of C++ constructors. But if it's limited, you have to fool about with discriminants -- not something that would naturally occur to a beginner. And discriminants have various annoying restrictions when used for this purpose. How do you capture the result of a function call? You put it in a constant: "X: constant T := F(...);". But if it's limited, you have to *rename* it: "X: T renames F(...);". 
Again, that's not something that would naturally occur to a beginner -- and the beginner would rightly look upon it as a "trick" or a "workaround".

Another point is that the current rules force you into the heap, unnecessarily. You end up passing around pointers to limited objects, either explicitly or implicitly, which tends to add complexity to one's programs.

Limited types offer other advantages in addition to lack of copying: access discriminants, and the ability to take 'Access of the "current instance". It seems a shame to require the programmer to choose between these and aggregates.

Alternatives:

It is not strictly necessary to mark the result object with "out"; the compiler could deduce this information by looking at the return statement(s). However, marking the object simplifies the compiler -- it needs to treat this object specially by using the space allocated by the caller.

It is not necessary to limit the number of return statements to 1. However, it seems simplest. We need to prevent things like this:

    function F(...) out T is
        Result_1: out T;
        Result_2: out T;
    begin
        Result_1.X := 123;
        Result_2.X := 456;
        if ... then
            return Result_1;
        else
            return Result_2;
        end if;
    end F;

because we can't allocate Result_1 and Result_2 in the same place! On the other hand, the following could work:

    function F(...) out T is
        Result: out T;
    begin
        if ... then
            return Result;
        else
            return Result;
        end if;
    end F;

suggesting a rule that all return statements must refer to the *same* object. But this could work, too:

    function F(...) out T is
    begin
        if ... then
            return (...); -- aggregate
        elsif ... then
            return G(...); -- function call
        elsif ... then
            declare
                Result_1: out T;
            begin
                Result_1.X := 123;
                return Result_1; -- local
            end;
        else
            declare
                Result_2: out T;
            begin
                Result_2.X := 456;
                return Result_2; -- different local
            end;
        end if;
    end F;

because only one of the four different result objects exists at any given time. I'm not sure how much to relax this rule.
Perhaps some rule about declaring only one of these special result objects in a given region? **************************************************************** From: Tucker Taft Sent: Thursday, February 7, 2002, 6:15 AM I would allow these "constructor" functions for any kind of type. I would require that at most one OUT local variable be declared. It must be in the outermost declarative part (to avoid it going out of scope before it was returned), and that all return statements must identify the OUT local variable if present (or perhaps, all return statements must omit the return expression completely if present). Allowing the declaration of an "OUT" local variable might be generalized to normal functions (with the same restrictions as above). For a "normal" function, the OUT local variable would represent the returned object, but with the difference that the space for it is allocated by the called routine, rather than the caller. This allows the Pascal "style" of assigning to a return object, when it is appropriate. **************************************************************** From: Robert Duff Sent: Thursday, February 7, 2002, 7:52 AM > I would allow these "constructor" functions for any kind of > type. You mean for nonlimited as well as limited? Sounds OK. I assume you do not mean to allow them for unknown-size subtypes. >... I would require that at most one OUT local variable > be declared. It must be in the outermost declarative part > (to avoid it going out of scope before it was returned), I don't see why that's necessary. The OUT variable doesn't *really* go away -- it's really just a view of the object created by the caller. No action is required to return it -- it's already in the right spot, so a return of it is simply a "goto end of function". >... and > that all return statements must identify the OUT local variable > if present (or perhaps, all return statements must omit the > return expression completely if present). 
A less restrictive rule is that all return statements that refer to a variable must refer to the OUT variable. Does that not work? > Allowing the declaration of an "OUT" local variable might > be generalized to normal functions (with the same restrictions > as above). OK. >... For a "normal" function, the OUT local variable > would represent the returned object, but with the difference that > the space for it is allocated by the called routine, rather than > the caller. This allows the Pascal "style" of assigning to > a return object, when it is appropriate. One question is: what if the function doesn't execute a return statement? That's currently erroneous. But if there's an OUT variable, it would seem sensible to just let the function fall off the end, and have the OUT variable define the result. Which implies that we should eliminate the rule requiring at least one return statement, in the case where there's an OUT variable. I like it! I presume the OUT object can actually be a constant or variable, by the way. **************************************************************** From: Tucker Taft Sent: Thursday, February 7, 2002, 8:28 AM > > I would allow these "constructor" functions for any kind of > > type. > > You mean for nonlimited as well as limited? Sounds OK. Yes, that's what I meant. And since some limited types become non-limited, I think this may be necessary. > I assume you do not mean to allow them for unknown-size subtypes. Right. They should have exactly the same restrictions, independent of whether the type is limited or nonlimited, and the caller should always allocate space for the returned object. That way for limited types that become non-limited, we don't run into any weirdness. This also means that one can switch between limited and non-limited with minimal semantic disruption, and you don't have to remember a lot of special cases for limited vs. non-limited (I presume that is one of the key goals of this proposal). > >... 
I would require that at most one OUT local variable > > be declared. It must be in the outermost declarative part > > (to avoid it going out of scope before it was returned), > > I don't see why that's necessary. The OUT variable doesn't *really* go > away -- it's really just a view of the object created by the caller. > No action is required to return it -- it's already in the right spot, > so a return of it is simply a "goto end of function". I don't see how that could work. Consider the following: type Lim_T(B : Boolean := False) is record case B is when True => X : Task_Type; when False => null; end case; end record; function Cons(...) out Lim_T is begin declare Result : out Lim_T(True); begin null; end; return Lim_T'(B => False); end Cons; Are you going to require a "return" on all paths out of the declare block? What if an exception is propagated from the declare block, and then in the handler there is a return of something with a different discriminant? > >... and > > that all return statements must identify the OUT local variable > > if present (or perhaps, all return statements must omit the > > return expression completely if present). > > A less restrictive rule is that all return statements that refer to a > variable must refer to the OUT variable. Does that not work? No; see above. I think once you create the OUT local, that must be the one that gets returned. Hence, it is probably simplest if there is an OUT local declared, to eliminate the use of return expressions, or even the requirement for an explicit "return" statement, so it would work like a procedure with one OUT parameter. > > Allowing the declaration of an "OUT" local variable might > > be generalized to normal functions (with the same restrictions > > as above). > > OK. > > >... For a "normal" function, the OUT local variable > > would represent the returned object, but with the difference that > > the space for it is allocated by the called routine, rather than > > the caller. 
This allows the Pascal "style" of assigning to > > a return object, when it is appropriate. > > One question is: what if the function doesn't execute a return > statement? That's currently erroneous. But if there's an OUT variable, > it would seem sensible to just let the function fall off the end, and > have the OUT variable define the result. > > Which implies that we should eliminate the rule requiring at least one > return statement, in the case where there's an OUT variable. > > I like it! In my view, this would be desirable only if we eliminate all return expressions from return statements for constructor functions with a local OUT object, making them obey the same rules as procedures thereafter. > I presume the OUT object can actually be a constant or variable, by the > way. I would not allow it to be a constant. The word "OUT" would take the place of the word "CONSTANT" in the syntax, the way I see it. It seems nearly useless to have it be a constant, and clearly, the caller could use it to initialize a variable, so calling it a constant could be misleading. **************************************************************** From: Robert Duff Sent: Thursday, February 7, 2002, 8:57 AM > I don't see how that could work. Neither do I. ;-) > No; see above. I think once you create the OUT local, that > must be the one that gets returned. Hence, it is probably > simplest if there is an OUT local declared, to eliminate > the use of return expressions, or even the requirement for > an explicit "return" statement, so it would work like > a procedure with one OUT parameter. Agreed. > In my view, this would be desirable only if we eliminate all return > expressions from return statements for constructor functions with a > local OUT object, making them obey the same rules as procedures > thereafter. Yes, that makes sense. > I would not allow it to be a constant. The word "OUT" would take > the place of the word "CONSTANT" in the syntax, the way I see it. 
> It seems nearly useless to have it be a constant, and clearly, > the caller could use it to initialize a variable, so calling it a constant > could be misleading. Yes, of course. I wasn't thinking clearly. **************************************************************** From: Robert Dewar Sent: Thursday, February 7, 2002, 9:02 AM I must say that for me, this entire proposal seems to be insufficiently grounded in real requirements. I am concerned that the ARG is starting to wander around in the realm of nice-to-have-neat-language-extensions which are really rather irrelevant to the future success of Ada. I am not opposed to a few extensions in areas where a really important marketplace need has been demonstrated, but the burden for new extensions should be extremely high in my view, and this extension seems to fall far short of meeting that burden. **************************************************************** From: Randy Brukardt Sent: Thursday, February 7, 2002, 2:19 PM I hate to be agreeing with Robert here :-), but he's right. There is a problem worth solving here (the inability to have constants of limited types), but that could adequately be solved simply by the 'in-place' construction of aggregates (which we already require in similar contexts). [I'll post a real-world example of the problem in my next message.] The problem is relatively limited, and thus the solution also has to be limited, or it isn't worth it. This whole business of constructor functions only will sink any attempt to fix the real problem, because it is just too big of a change at this point. Bob's concerns about the purity of the language would make sense in a new language design, but we're working with limited resources here, and simple solutions are preferred over perfect ones. 
**************************************************************** From: Randy Brukardt Sent: Thursday, February 7, 2002, 3:05 PM Here is an example that came up in Claw where we really wanted constants of a limited type: The Windows registry contains a bunch of predefined "keys", along with user defined keys. Our original design for the key type was something like (these types were all private, and the constants were deferred, but I've left that out for clarity): type Key_Type is new Ada.Finalization.Limited_Controlled with record Handle : Claw.Win32.HKey := Claw.Win32.Null_HKey; Predefined_Key : Boolean := False; -- Is this a predefined key? -- (Only valid if Handle is not null) -- other components. end record; Classes_Root : constant Key_Type := (Ada.Finalization.Limited_Controlled with Handle => 16#80000000#, -- Windows magic number Predefined_Key => True, ...); Current_User : constant Key_Type := (Ada.Finalization.Limited_Controlled with Handle => 16#80000001#, -- Windows magic number Predefined_Key => True, ...); -- And several more like this. procedure Open (Key : in out Key_Type; Parent : in Key_Type; Name : in String); procedure Close (Key : in out Key_Type); procedure Put (Root_Key : in Key_Type; Subkey : in String; Value_Name : in String; Item : in String); -- and so on.. Of course, our favorite compiler rejected the constants as illegal. So, they were turned into functions. function Classes_Root return Key_Type; function Current_User return Key_Type; However, these have the problem that they have to be overridden for any extensions of the type (as they are primitive). We could have put them into a child/nested package (to make them not primitive), but that would bend the structure of the design even further and add an extra package for no good reason. We also could have made them class-wide, but that would be a misleading specification, as they can never return anything other than Key_Type. So we left them in the main package. 
Aside: we originally wanted to use these as default parameters for some of the various primitive routines. However, that would be illegal by 3.9.2(11) unless they are primitive functions. This rule exists so that the default makes sense in inherited primitives. But we really would have preferred that the default expressions weren't inherited; they only make sense on the base routines. That is a problem that probably isn't worth solving though.

Of course, now that we had functions, we had to implement them. The first try was:

    function Classes_Root return Key_Type is
    begin
        return (Ada.Finalization.Limited_Controlled with
                Handle => 16#80000000#, -- Windows magic number
                Predefined_Key => True, ...);
    end Classes_Root;

But our friendly compiler told us that THIS was illegal, because this is a return-by-reference type, and the aggregate doesn't have the required accessibility. So we had to add a library package-level constant and return that:

    Standard_Classes_Root : constant Key_Type :=
        (Ada.Finalization.Limited_Controlled with
         Handle => 16#80000000#, -- Windows magic number
         Predefined_Key => True, ...);

    function Classes_Root return Key_Type is
    begin
        return Standard_Classes_Root;
    end Classes_Root;

But of course THAT is illegal (it's the original problem all over again), so we had to turn that into a variable and initialize it component-by-component in the package body's elaboration code:

    Standard_Classes_Root : Key_Type;

    function Classes_Root return Key_Type is
    begin
        return Standard_Classes_Root;
    end Classes_Root;

    begin
        Standard_Classes_Root.Handle := 16#80000000#; -- Windows magic number
        Standard_Classes_Root.Predefined_Key := True;
        ...

Which is essentially how it is today. This turned into such a mess that we gave up deriving from it altogether, and created an entirely new higher-level abstraction to provide the most commonly used operations in an easy to use form. Thus, we ended up losing out on the benefits of O-O programming here.
I certainly hope that newcomers to Ada don't run into a problem like this, because it is a classic "stupid language" problem. Simply having a way to initialize a limited constant with an aggregate would be sufficient to fix this problem. "Constructor functions" might add orthogonality, but seem unnecessary to solve the problem of being able to have constants as part of an abstraction's specification. **************************************************************** From: Robert Duff Sent: Friday, February 8, 2002, 10:12 AM > Simply having a way to initialize a limited constant with an aggregate would > be sufficient to fix this problem. "Constructor functions" might add > orthogonality, but seem unnecessary to solve the problem of being able to > have constants as part of an abstraction's specification. Surely you don't mean that we would allow limited aggregates only for initializing stand-alone constants?! Surely, you could use them to initialize variables. And if they can be used to initialize variables, surely initialized allocators should be allowed. And of course, parameters. In *my* programs, much of the data is heap-allocated. I want to say: X: Some_Ptr := new T'(...); when T is limited. Allowing only constants would solve about 1% of *my* problem. Are you saying that this is illegal: P(T'(...)); and I have to instead write: Temp: constant T := (...); P(Temp); ?! That sort of arbitrary restriction is what makes people laugh at the language. **************************************************************** From: Dan Eilers Sent: Friday, February 8, 2002, 12:12 PM Bob Duff wrote: > My goals are: > > - Allow aggregates of limited types. > > - Allow constructor functions that return limited types. > > - Allow initialization of limited objects. > > - Allow limited constants. > > - Allow subtype_marks in aggregates more generally. > (They are currently allowed only for the parent part in an > extension aggregate.) 
Tuck wrote: > I would allow these "constructor" functions for any kind of > type. I agree that the non-limited case is also important, and should be listed as an explicit goal of the AI. The non-limited case is an efficiency issue, where a programmer wishes to prevent unnecessary copying of large objects implied by the semantics of aggregates and function calls. Tuck wrote: > ... so it would work like a procedure with one OUT parameter. The proposal seems to go to a lot of trouble to define a new kind of function that behaves exactly like a procedure with one OUT parameter. I think there may be a simpler solution involving an extension to function renaming. The constructor function would be declared as a procedure with one OUT parameter, and then renamed (allowing it to be called) as a function, using the type of the OUT parameter as its return type. Example: procedure p(x: some_type; y: out return_type); function f(x: some_type) return return_type renames p; > > I assume you do not mean to allow them for unknown-size subtypes. > > Right. They should have exactly the same restrictions, > independent of whether the type is limited or nonlimited, > and the caller should always allocate space for the > returned object. That way for limited types that > become non-limited, we don't run into any weirdness. There are many cases, such as the string concat function, where the return type is declared to be unconstrained (unknown-size), but really its size is a function of the sizes of the parameters and therefore computable before the call. It might facilitate allocating space in the caller if Ada had a way of expressing the size of the return type in terms of the parameters, possibly using the proposed new assertion mechanism (AI-00286). **************************************************************** From: Tucker Taft Sent: Friday, February 8, 2002, 5:39 PM This is an intriguing idea. Clearly less syntax invention than the "out" function return idea. 
We actually perform the transformation implied (from a procedure with an OUT parameter to a function) as an optimization, when the OUT parameter is of an elementary type. Also there is a DEC-Ada pragma which imports an external function as an Ada procedure, because the external function had OUT parameters in addition to its return value. I believe GNAT supports this pragma. Unfortunately, I'm not sure the above would work for the case when you want to use an aggregate as the return expression. Also, a procedure presumes its OUT parameter is already at least default-initialized. With the constructor function, the initialization was deferred until entering the function, where it could be initialization from an aggregate or from some other function call. Another approach is to use the proposal for anonymous access types as return types, which for (access-to) limited types has many of the same advantages as the constructor function concept (see AI 231 for details). **************************************************************** From: Tucker Taft Sent: Friday, February 8, 2002, 5:48 PM Given the relative complexity of the constructor function concept compared to the other de-limiting ideas, I would propose we split the AI. The simpler one would allow: 1) Aggregates of limited objects, with use of a subtype_mark to mean default init of a component; 2) Explicit initialization of limited objects, both in a declaration and an allocator, from an aggregate, with the aggregate built "in place" in the target. The more complex one would address functions constructing limited objects on behalf of the caller. The aggregate one seems very straightforward. Almost just eliminate the existing restriction, presuming that compilers have already learned how to build aggregates "in place" for controlled types. The function one looks like a lot of work. 
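
For illustration, the simpler half of the split (limited aggregates built in place in a declaration, an allocator, or an actual parameter) might look like the sketch below. It is written in the syntax later standardized in Ada 2005, where a box (`<>`) rather than a subtype_mark requests default initialization of a component. The names Limited_Aggregate_Demo, Extra, Lim_Ptr, C and Ptr are hypothetical; Lim, Comp and P echo the compatibility example earlier in this thread.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Limited_Aggregate_Demo is

   --  A limited record type; objects of this type cannot be copied.
   type Lim is limited record
      Comp  : Integer;
      Extra : Integer := 7;   --  hypothetical component with a default
   end record;

   type Lim_Ptr is access Lim;

   procedure P (X : in Lim) is
   begin
      Put_Line (Integer'Image (X.Comp) & Integer'Image (X.Extra));
   end P;

   --  Limited constant initialized from an aggregate built in place.
   --  "Extra => <>" asks for the component's default initialization.
   C : constant Lim := (Comp => 123, Extra => <>);

   --  Initialized allocator of a limited type.
   Ptr : constant Lim_Ptr := new Lim'(Comp => 456, Extra => 0);

begin
   P (C);
   P (Ptr.all);
   --  Limited aggregate passed directly as an actual parameter.
   P (Lim'(Comp => 789, Extra => <>));
end Limited_Aggregate_Demo;
```

None of the three uses involves a function returning a limited object, which is why this half can be separated from the constructor-function question.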
**************************************************************** From: Tucker Taft Sent: Sunday, January 12, 2003, 7:55 PM When we presented some of the Ada 200Y ideas at SIGAda, there was a feeling that if we added support for aggregates of a limited type, we should also have function returns. Bob and I don't feel the two need to be tied that closely together, but they both go in the category of making limited types less limited. In any case, I got to thinking about the problem more, and wrote the following note to Bob describing a "brainstorm" I had a couple of nights ago. Bob said I might as well forward this to the full ARG for comments. He hasn't decided whether he will incorporate it into an AI on limited function returns.... So, fire away. -Tuck --------- Bob, I realized sometime after I gave my quick response on constructor functions, that I had forgotten about one of the main challenges, namely the desire to execute some statements (assignments, procedure calls, etc.) to initialize the object being returned, before actually returning it. If the only thing you can do is return an aggregate or another function call, you don't have much flexibility, and there is no way to insert a call on a procedure. Which got me to thinking about the various special naming conventions we had talked about for local variables which *must* be returned, all of which were unsatisfactory, kludgey, inelegant, etc. And then suddenly the idea came to me that if you could attach the statements directly to the return statement, that would be nice. E.g. something like: return blah with {statement} end return; But then you need a name for the object being returned, so that led to something like: return X : blah with {statement} end return; And then I thought, what construct in Ada already has an optional set of statements following it? The "accept" statement. So why not try to make use of the lonely "do" reserved word. 
Also, it seems odd to have a name on an expression, so let's make it a regular declaration with a subtype indication as well. This leads to:

    return_statement ::=
        RETURN ;
      | RETURN expr ;
      | RETURN identifier : subtype_indication [:= expr] [ DO
            handled_sequence_of_statements
          END [identifier] ; ]

For example:

    function Cool(B : Boolean) return Variant_Rec is
    begin
        if B then
            return Result : Variant_Rec(True) do
                Fixup(Result);
            end Result;
        else
            return Result : Variant_Rec(False) do
                Different_Fixup(Result);
            end Result;
        end if;
    end Cool;

With this construct, we could allow limited function returns, where either the second form of "return" statement is used and the expr is an aggregate or a function call (or a reference to a long-lived existing object, per Ada 95), or the third form of the "return" statement is used, and pretty much anything goes, since you are clearly creating a new object.

This construct would also make it possible to support a result type that was an anonymous access type. E.g.:

    function Cooler(Blah : access T) return access U is
    begin
        return Result : access U := new U(discrim => blah) do
            Result.Fum := Blah;
        end Result;
    end Cooler;

and the caller could determine the storage pool used for "Result" from context:

    X : U_Ptr := Cooler(Something'Access);

In fact, limited function returns and anonymous access-type returns could be seen as almost the same thing. To implement both, the caller has to pass in a storage pool, and an accessibility level. The called routine can either use that storage pool (and its associated finalization/dependent-task list, if needed), or it can return an access/reference to an existing object, so long as it satisfies the accessibility check. There might be an accessibility level indication or some other flag that means the return object *must* be newly allocated out of the given storage pool.
The compiler would also have to implicitly create a local storage pool to be passed in when the result of the function call is used to initialize a local variable.

I suppose one could get even more radical, and allow this "do ... end identifier;" at the end of any declaration that is declaring only one object (i.e. isn't "X, Y : Blah"). This would solve the old problem of making sure any initialization procedures that need to be called get connected tightly to the declaration. But that problem could probably be solved better by having limited functions with the "return ... do ... end ID;" construct, so I think I will keep this more radical suggestion to myself ;-).

So, that was my great "brainstorm" last night (well, actually this morning when I couldn't sleep...). As they used to say during the 9X project, I'll go don my flak jacket now, so feel free to fire away.

****************************************************************

From: Dan Eilers
Sent: Monday, January 13, 2003, 1:01 PM

Tuck's proposal looks interesting to me. I particularly like the idea of somehow solving "the old problem of making sure any initialization procedures that need to be called get connected tightly to the declaration."

Being able to attach initialization code to declarations is useful for a variety of reasons, including eliminating the overhead of default initialization code where explicit initialization is provided later, and making sure the declaration is initialized before its first use.

****************************************************************

From: Tucker Taft
Sent: Friday, May 23, 2003, 8:05 AM

I believe I floated a "trial balloon" a month or so ago about a syntax to support returning objects of a limited type from a function. Bob Duff pointed out that it created yet another level of nesting in simple cases. Also, it involved a completely new syntactic construct (return ... do ... end), which seems excessive.

So here is a revamped proposal, now structured as a "real" AI.
[Editor's note: this is version /01 of the AI.] **************************************************************** From: Randy Brukardt Sent: Friday, May 23, 2003, 1:22 PM My gut reaction is that your trial balloon syntax is much preferred: -- Using different syntax for return means that it is clear that this is not the "normal" return with copying; -- There is no need to look all over the source code to find the return object; -- There is no set of complex rules to guarantee that only one return object is available in a given context. The 'weight' of the extra syntax is pretty similar either way (an entire new kind of object declaration doesn't seem "light" to me either). So, given the advantages of the original syntax, and since the only identified problem is an extra level of nesting (who cares?), I much prefer that alternative. (I'm unconvinced that we can afford the complexity of any of these proposals, but that's a different issue altogether. I don't like the idea that the compiler has to be able to determine at run-time whether a call is 'build-in-place' or 'existing object reference'; that seems like substantial overhead, and I would expect 'build-in-place' to be commonly used.) **************************************************************** From: Tucker Taft Sent: Friday, May 23, 2003, 1:50 PM Can you explain this a bit more. The called routine knows whether it is returning an existing object or a new object, so I don't see extra overhead there. I suppose it has to check whether the caller allowed returning an existing object, but that is just a simple test, certainly cheaper than the average constraint check. The caller shouldn't care, since it can always use the address returned from the called function, whether or not it created a new object. What is the source of the overhead that I am missing? 
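
For reference, the "trial balloon" form that Randy prefers here is close to what Ada 2005 later standardized as the extended return statement, with the difference that the statement ends in "end return" rather than "end Result". The sketch below adapts the Cool example from earlier in this thread to that standardized syntax. The declaration of Variant_Rec and the bodies of Fixup and Different_Fixup are hypothetical stand-ins, since the thread never shows them.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Extended_Return_Demo is

   --  Hypothetical limited type; the thread names it but does not declare it.
   type Variant_Rec (B : Boolean) is limited record
      Fixups : Natural := 0;
      case B is
         when True =>
            X : Integer := 0;
         when False =>
            null;
      end case;
   end record;

   --  Hypothetical initialization steps run before the object is returned.
   procedure Fixup (R : in out Variant_Rec) is
   begin
      R.Fixups := R.Fixups + 1;
   end Fixup;

   procedure Different_Fixup (R : in out Variant_Rec) is
   begin
      R.Fixups := R.Fixups + 2;
   end Different_Fixup;

   function Cool (B : Boolean) return Variant_Rec is
   begin
      if B then
         --  Result names the object being returned; it is not copied
         --  when the return statement completes.
         return Result : Variant_Rec (True) do
            Fixup (Result);
         end return;
      else
         return Result : Variant_Rec (False) do
            Different_Fixup (Result);
         end return;
      end if;
   end Cool;

   --  The function call initializes a new limited object.
   V : constant Variant_Rec := Cool (True);

begin
   Put_Line ("B = " & Boolean'Image (V.B)
             & ", Fixups =" & Natural'Image (V.Fixups));
end Extended_Return_Demo;
```

Because each return object is declared inside its own return statement, the two branches can use different discriminants without any rule about "only one return object per scope".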
****************************************************************

From: Randy Brukardt
Sent: Friday, May 23, 2003, 3:26 PM

According to your write-up, the caller has to pass into the function some or all of:
   -- A flag as to whether an object return is allowed;
   -- The address of the memory to build the return in;
   -- A storage pool;
   -- An accessibility level.

Obviously, there is going to be overhead to build and pass these things. (Parameter passing isn't free!) Even if these are all packed into a descriptor, initializing that descriptor is going to take a bunch of instructions. That's especially true if a storage pool is created on the fly for the call (which your writeup suggested in some cases).

So, such function calls look quite a bit more expensive than the similar aggregates or the currently existing function calls. It's not likely to be hundreds of times worse, but it's pretty complicated and certainly will slow down these calls. That might matter for a few existing programs.

****************************************************************

From: Tucker Taft
Sent: Friday, May 23, 2003, 5:19 PM

If we adopt the restriction that eliminates run-time accessibility issues for this proposal, then what the AI suggested was as follows:

1) If the function had a known-size result, then the caller would preallocate the space, and pass this address in the usual way for a function that returned a "large" result. In addition, a flag would be passed to indicate whether the function was allowed to simply return the address of a preexisting object. If so, then the caller would expect the function to return the address of the result, which could be the preallocated space, or the preexisting object's address.

In this case, the only extra overhead for typical implementations would be the extra boolean flag, and the test against it.
2) If the function had an unknown-size result, and hence would normally have to allocate the result on a secondary stack or heap, then the caller would pass in a storage pool and a boolean flag (or a possibly-null storage pool). The storage pool is one of the following:

   a) a "normal" storage pool, presuming the function call is used as the expression of an initialized allocator
   b) a special "secondary stack" storage pool, presumably which could be precreated by the run-time system
   c) an on-the-fly constructed "preallocated-space" storage pool, which at a minimum would consist of:
      i) a tag identifying it as one of these kinds of storage pools
      ii) an address of the preallocated space
      iii) the length of the preallocated space

Case (2c) seems like the only one that involves measurable extra work at the call-site. Presuming the storage pool is allocated on the primary stack then at the call-site you would have at least 3 instructions to initialize the storage pool (assignments of the tag, address, and length), probably more like 6 for the typical RISC machine. Then you would have to pass the address of the storage pool as an implicit parameter.

So I agree there would be some overhead, but by using the storage-pool "abstraction" for cases (2a,2b,2c) and a simple boolean flag for (1), the total amount seems pretty modest.

Just to be more precise about the preallocated-space storage pool, here is a sample implementation of such a beast:

    type Preallocated_Space_Storage_Pool(Addr : Integer_Address;
                                         Max_Size : Storage_Count) is
        new Root_Storage_Pool with null record;

    procedure Allocate(
        Pool : in out Preallocated_Space_Storage_Pool;
        Storage_Address : out Address;
        Size_In_Storage_Elements : in Storage_Count;
        Alignment : in Storage_Count) is
    begin
        if Size_In_Storage_Elements > Pool.Max_Size then
            raise Storage_Error;
        end if;
        Storage_Address := To_Address(Pool.Addr);
    end;

    procedure Deallocate(...) is
    begin
        raise Program_Error;
    end;

    function Storage_Size(Pool : ...) is
    begin
        return Pool.Max_Size;
    end;

A local object of this type would need to be created at the call-site and passed as an implicit parameter when the space for the object is preallocated by the caller (case 2c above).

> So, such function calls look quite a bit more expensive than the similar
> aggregates or the currently existing function calls. It's not likely to be
> hundreds of times worse, but it's pretty complicated and certainly will slow
> down these calls. That might matter for a few existing programs.

It doesn't look that expensive to me. Calling functions with unknown-size results is relatively expensive anyway. Presuming case (2c) above is relatively rare (limited type, unknown size result, caller preallocates), this doesn't seem like a show-stopper.

****************************************************************

From: Randy Brukardt
Sent: Friday, May 23, 2003, 5:46 PM

Tucker wrote:

...
> 1) If the function had a known-size result, then the caller would
> preallocate the space, and pass this address in the usual way for a
> function that returned a "large" result. In addition, a flag
> would be passed to indicate whether the function was allowed
> to simply return the address of a preexisting object. If so,
> then the caller would expect the function to return the address
> of the result, which could be the preallocated space, or the
> preexisting object's address.
>
> In this case, the only extra overhead for typical implementations
> would be the extra boolean flag, and the test against it.

But then there is overhead to get rid of the extra 'preallocated' space if it isn't used. And the overhead of figuring out if that needs to be done. If the space is in a pool (because it's an anonymous access type, or an item with a non-stack size), this will require calling pool operation(s). In the stack case, this memory won't be recovered until the subprogram exits (Janus/Ada might reuse it, but it cannot recover it).
If the object is controlled, it probably will have to be registered (not sure precisely when that would have to happen for this case, but it certainly can't happen inside the subprogram unless a finalization chain is passed into it, which would add even more overhead). In addition, this implementation means that you will end up allocating an 'extra' copy of the object in the return existing object case. If the object is large (and certainly some of the objects we're talking about are), that could be a problem, as it could cause an existing program to raise Storage_Error. > Presuming case (2c) above is relatively > rare (limited type, unknown size result, caller preallocates), this doesn't > seem like a show-stopper. I don't think it's necessarily a show-stopper. But we have to do a cost/benefit analysis on new features. Certainly, there is a benefit here, but just like Interfaces, it is not at all clear to me that the benefit outweighs the cost, which is considerable (and growing). **************************************************************** From: Tucker Taft Sent: Saturday, May 24, 2003, 12:34 PM > But then there is overhead to get rid of the extra 'preallocated' space if > it isn't used. I'm not sure I understand what this means. It may be something about the way your compiler works. In our compiler, if the caller preallocates temporary space for a function result, it is space that gets released automatically at the end of the enclosing scope, so there is no point (or sometimes, no way), to reclaim it earlier than then. > ... And the overhead of figuring out if that needs to be done. The caller would know at compile-time whether the function returns a known-size result, and whether the result is used in a context where a preexisting object would be permitted, so I don't see any run-time overhead there. > ... If > the space is in a pool (because it's an anonymous access type, or an item > with a non-stack size), this will require calling pool operation(s). 
In the > stack case, this memory won't be recovered until the subprogram exits > (Janus/Ada might reuse it, but it cannot recover it). If the object is > controlled, it probably will have to be registered (not sure precisely when > that would have to happen for this case, but it certainly can't happen > inside the subprogram unless a finalization chain is passed into it, which > would add even more overhead). I am unclear now whether you are talking about overhead that is new to limited function return, or is the same as what you would face for non-limited function return. > In addition, this implementation means that you will end up allocating an > 'extra' copy of the object in the return existing object case. If the object > is large (and certainly some of the objects we're talking about are), that > could be a problem, as it could cause an existing program to raise > Storage_Error. You could set your upper limit for caller-preallocated space relatively low for these kinds of functions, if this is a significant concern. That is, require use of the secondary stack or heap even if the result size is known, if the known size is so large as to be of concern. >>Presuming case (2c) above is relatively >>rare (limited type, unknown size result, caller preallocates), this doesn't >>seem like a show-stopper. > > > I don't think it's necessarily a show-stopper. But we have to do a > cost/benefit analysis on new features. Certainly, there is a benefit here, > but just like Interfaces, it is not at all clear to me that the benefit > outweighs the cost, which is considerable (and growing). Are you talking about implementation cost or run-time overhead? I don't see the run-time overhead as being much greater than function calls that return a non-limited type of similar complexity. If the result might be controlled, or large, or of unknown-size, then yes that adds to the run-time overhead, but that is true for non-limited functions as well. 
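
[Editor's note: The preallocated-space storage pool sketched earlier in this thread can be written out in compilable form. What follows is a minimal sketch and not part of the original correspondence. The unit names Preallocated_Pools and Call_Site_Demo, the 64-element buffer, and the explicit access type Int_Ptr are assumptions made purely for illustration; in the scheme under discussion the compiler would create the pool and pass it as an implicit parameter, rather than the programmer naming it in a representation clause.]

```ada
--  Hypothetical illustration: three compilation units shown together.
with System;
with System.Storage_Elements; use System.Storage_Elements;
with System.Storage_Pools;    use System.Storage_Pools;

package Preallocated_Pools is

   --  A pool that hands out one caller-supplied block of storage.
   type Preallocated_Space_Storage_Pool
     (Addr     : Integer_Address;
      Max_Size : Storage_Count)
   is new Root_Storage_Pool with null record;

   procedure Allocate
     (Pool                     : in out Preallocated_Space_Storage_Pool;
      Storage_Address          : out System.Address;
      Size_In_Storage_Elements : in Storage_Count;
      Alignment                : in Storage_Count);

   procedure Deallocate
     (Pool                     : in out Preallocated_Space_Storage_Pool;
      Storage_Address          : in System.Address;
      Size_In_Storage_Elements : in Storage_Count;
      Alignment                : in Storage_Count);

   function Storage_Size
     (Pool : Preallocated_Space_Storage_Pool) return Storage_Count;

end Preallocated_Pools;

package body Preallocated_Pools is

   procedure Allocate
     (Pool                     : in out Preallocated_Space_Storage_Pool;
      Storage_Address          : out System.Address;
      Size_In_Storage_Elements : in Storage_Count;
      Alignment                : in Storage_Count) is
   begin
      --  Refuse requests larger than the space the caller set aside.
      if Size_In_Storage_Elements > Pool.Max_Size then
         raise Storage_Error;
      end if;
      Storage_Address := To_Address (Pool.Addr);
   end Allocate;

   procedure Deallocate
     (Pool                     : in out Preallocated_Space_Storage_Pool;
      Storage_Address          : in System.Address;
      Size_In_Storage_Elements : in Storage_Count;
      Alignment                : in Storage_Count) is
   begin
      --  The space belongs to the caller, so it is never freed here.
      raise Program_Error;
   end Deallocate;

   function Storage_Size
     (Pool : Preallocated_Space_Storage_Pool) return Storage_Count is
   begin
      return Pool.Max_Size;
   end Storage_Size;

end Preallocated_Pools;

with Ada.Text_IO;
with System;
with System.Storage_Elements; use System.Storage_Elements;
with Preallocated_Pools;      use Preallocated_Pools;

procedure Call_Site_Demo is
   use type System.Address;

   --  Space preallocated by the "caller" (case 2c).
   Space : Storage_Array (1 .. 64);
   for Space'Alignment use Integer'Alignment;

   --  Setting up the pool amounts to storing a tag, an address, and a length.
   Pool : Preallocated_Space_Storage_Pool
     (Addr => To_Integer (Space'Address), Max_Size => Space'Length);

   type Int_Ptr is access Integer;
   for Int_Ptr'Storage_Pool use Pool;

   P : Int_Ptr;
begin
   P := new Integer'(42);  --  Calls Allocate on Pool
   if P.all'Address = Space'Address then
      Ada.Text_IO.Put_Line ("object was built in the caller's space");
   end if;
end Call_Site_Demo;
```

[With GNAT, the three units can be separated with gnatchop before building Call_Site_Demo. The sketch only shows that an allocator directed at such a pool lands in the caller's buffer; it says nothing about the cost of the implicit-parameter convention debated above.]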
**************************************************************** From: Stephen W Baird Sent: Tuesday, October 14, 2003, 3:51 PM This is a discussion of the interaction between AI-318 and the IBM-Rational Apex Ada compiler's implementation of finalization, as per my homework assignment from the Sydney ARG meeting. ---- The Apex compiler manages pending finalization requirements (i.e. finalization of controlled and protected objects, not tasks) at the granularity of top-level (i.e. non-component) objects. The finalization code generated for the enclosing "construct or entity" (7.6.1(2)) of a given top-level object relies on the invariant that either all or none of the subcomponents of the object require finalization. This means, for example, that if an exception occurs while initializing an object, then the initialization code (which knows how far it has progressed) must handle the exception, finalize any components which were successfully initialized, and then (typically) reraise the exception. If an object cannot make it to the state where all of its subcomponents need to be finalized, then it must revert to the state where none require finalization before execution of the finalization code for the enclosing "construct or entity". This has proven to be a reasonable implementation model, but AI-318 might be difficult to implement using this approach. Consider the case of a return object which contains several controlled subcomponents. Suppose that some, but not all, of these subcomponents have been successfully initialized when an exception is raised. The code (in the callee) which knows how far initialization has progressed would have to handle the exception, perform any necessary finalization, and then (typically) reraise the exception. Unfortunately, the AI (as currently written) disallows this approach: "The return object would not be finalized prior to leaving the function. The caller would be responsible for its finalization". 
This problem could be resolved by having this provision of the AI apply only in the case of a "normal completion" (7.6.1(2)) of the function, with the callee responsible for finalization otherwise (or perhaps just by adding an implementation permission allowing the callee to perform finalization in this case). It cannot always be known whether a function is going to return normally until after any other finalization of objects declared by the function has completed. Thus, the return object might have to be the last object to be finalized. This could be accomplished either by requiring that it be the first object with nontrivial finalization to be declared or by inventing a special dynamic-semantics rule to handle this case (perhaps only an implementation permission). The Apex Ada compiler implements abortion (including ATC) by means of a distinguished anonymous "exception". Thus, abortion while the callee is executing introduces essentially the same problem for the caller as if the callee propagated an exception. The distinction between normal and abnormal completion proposed above would also help in resolving this problem. **************************************************************** From: Robert A Duff Sent: Wednesday, October 15, 2003, 11:53 AM > This problem could be resolved by having this provision of the AI apply > only in the case of a "normal completion" (7.6.1(2)) of the function, with > the callee responsible for finalization otherwise ... That makes sense to me. How can it make sense to let the caller do the finalization of the result when the function is not returning a result, but is propagating an exception instead? Also, Steve is talking about the case where the returned object is "half baked". But what if it hasn't been initialized at all? I believe there would be trouble in that case, too. >...(or perhaps just by > adding an implementation permission allowing the callee to perform > finalization in this case). 
I'm not a big fan of implementation permissions, but I would have no objection in this case.

****************************************************************

From: Dan Eilers
Sent: Thursday, October 30, 2003, 8:01 PM

The initial proposals for AI-318 involved changes to the syntax of a function specification, such as using OUT instead of RETURN. The current proposals don't. The only proposed syntax changes are in the body of a function. This makes it impossible for the caller to know that this is a special return-in-place function, which would seem to be necessary in order to use different calling conventions. Note that the caller can't just go by the return type being limited, because the AI is intended to also eliminate the copy-back for non-limited types.

****************************************************************

From: Randy Brukardt
Sent: Thursday, October 30, 2003, 8:32 PM

I *think* that's intentional. The majority of functions return by-copy types, and for those, it makes no difference (at least, it better not). For other types, most compilers already use an in-place function convention in most cases; and those that don't (i.e. Janus/Ada) probably would be better off changing to use one.

So, it seems that for most calls, any performance changes would be in the direction of faster (and possibly larger) code.

But any performance incompatibilities ought to be investigated thoroughly. I've already complained about performance incompatibilities with this proposal (see the mail thread of May 2003 in the AI); Tucker's response is essentially that compilers will optimize the calling conventions, and the ugly cases are rare. Since that *is* an incompatibility, it should be discussed in the AI.

We've already asked for implementation reports on this AI, since several implementors expressed concern about the cost of the convention. I'm sure we'd welcome one from you as well.
**************************************************************** From: Tucker Taft Sent: Thursday, October 30, 2003, 8:56 PM It is true that a single calling convention must be used. This implies some overhead on calling functions with a return-by-reference type, but the presumption is that these are very rare at the moment. The presumed model is that the caller specifies a storage "area" (or equivalent), and a flag indicating whether the storage area *must* be used, or simply *may* be used. (I believe this is discussed in the AI already.) If used in a context where a new object is being initialized (e.g. a component of an aggregate, initialization expression for a limited object, or an initialized allocator), the specified storage area must be used. If used in another context (e.g. in a renaming, as an IN parameter, or as an operand of some construct like a membership test), then the storage area need not be used, and returning an existing object is allowed. Currently there is an accessibility check on returning a preexisting return-by-reference object. That check would be expanded to include a check on whether returning an existing object is permitted. Right now the check is officially a run-time check, but it is generally easy to perform at compile-time (or instantiation time). It would become a real run-time check with this change. The underlying presumption behind all this is that the existing capability to return existing objects by reference is of relatively little use, and it is reasonable to largely ignore this capability, and focus on being able to use functions with limited result types as "constructors." The existing capability would be preserved, but perhaps might even deserve to be made obsolescent. The existing capability provides very little value over what can be done with returning an access value, whereas the new capability provides significant value as part of making limited types more useful. > ... 
> Note that the caller
> can't just go by the return type being limited, because the AI is
> intended to also eliminate the copy-back for non-limited types.

This is meant to be an optimization. There is no guarantee that copy-back is eliminated for non-limited types. The presumption is that for most functions returning large objects of known size, the caller already passes in the address of a place where the return object should be placed. This new syntax would simplify using that space directly within the function body, rather than doing another copy.

****************************************************************

From: Robert Dewar
Sent: Thursday, October 30, 2003, 8:42 PM

> It is true that a single calling convention must be used.
> This implies some overhead on calling functions with a
> return-by-reference type, but the presumption is that these
> are very rare at the moment. The presumed model is that the caller
> specifies a storage "area" (or equivalent), and a flag indicating
> whether the storage area *must* be used, or simply *may* be used.
> (I believe this is discussed in the AI already.)

That seems a nasty incompatibility. I don't like to see a feature of relatively minor importance (in my view) causing an implementation incompatibility of this magnitude, potentially requiring recompilation of existing code that does not use the new feature, and invalidating libraries.

****************************************************************

[Editor's note: Additional discussion on this topic can be found in AI-325.]

****************************************************************

From: Tucker Taft
Sent: Monday, December 8, 2003, 10:32 AM

There seem to be a lot of messages flying around about how best to support function-like-things returning/constructing limited objects. Using the "Getting to Yes" method of trying to focus on what we agree about, here is a list of possibly desired features of the solution.
I will start with those that seem to already have a consensus. I would most appreciate responses that indicate if I missed any "consensus" statements, or if there are some that are clearly *not* a consensus. Secondly, it would be good to have a prioritization of the nice-to-haves. Finally, it would be good to get some feeling about the non-consensus statements, and perhaps adjustments to them which might allow them to become consensus statements. ------------------------- Consensus statements about Ada 200Y if we were to approve AI-318: 1) Should be possible to declare an object of a limited type and provide an initializing expression 2) Should be possible to use an initialized allocator for an access-to-limited type 3) Should be possible to provide an aggregate as the initializing expression for a declaration or an initialized allocator, or for a component of such an aggregate; such aggregates may use "<>" to represent default initialization of a component 4) Should be possible to use a function call (or something that looks syntactically like a function call) as the initializing expression for a declaration, initialized allocator, or component of an aggregate that is of a limited type, including a limited private type. 5) Should be possible to declare a function-like thing callable by such a function call for limited types whose first subtype is a definite subtype. 6) Should be possible to use an aggregate or a function-call (-like-thing) as an actual IN parameter of a limited type 7) Should *not* be possible to copy an existing limited object. I.e. Should *not* be possible to have an assignment statement for a limited type, and should *not* be possible to use the name of a limited declared object nor a dereference of an access-to-limited type as a component of an aggregate. 
8) The compiler needs to know at the call-site whether function-like thing is returning an existing object by reference, or returning/initializing a new object Nice to have: 9) Ability to declare and call a function-like thing for a limited type with non-defaulted discriminants 10) Ability to declare and call a function-like thing for a limited type with unknown discriminants; such types would require an initializing expression -- no default initialization is defined for them. 11) Easy to implement Other possible desirables: 12) Should not require alteration in the way limited types are laid out 13) Should allow us to still have function-like things that return by reference 14) Should (or should not) use the word "constructor" somewhere 15) Should (or should not) use the word "limited" somewhere 16) Should (or should not) use the word "function" somewhere 17) Should provide more efficient way for non-limited types to be returned/initialized 18) Should not "orphan" existing language features **************************************************************** From: Robert A. Duff Sent: Monday, December 8, 2003, 2:48 PM Thanks, Tuck. This is a very helpful summary. I was getting lost in all those e-mails. > ------------------------- > > Consensus statements about Ada 200Y if we were to approve AI-318: > > 1) Should be possible to declare an object of a limited > type and provide an initializing expression > 2) Should be possible to use an initialized allocator for > an access-to-limited type I would add, "even when the storage pool is user-defined". 
> 3) Should be possible to provide an aggregate as the initializing expression
>    for a declaration or an initialized allocator, or for a component
>    of such an aggregate; such aggregates may use "<>" to represent default
>    initialization of a component
> 4) Should be possible to use a function call (or something that looks syntactically
>    like a function call) as the initializing expression for a declaration,
>    initialized allocator, or component of an aggregate that is of a limited type,
>    including a limited private type.
> 5) Should be possible to declare a function-like thing callable by such a function call
>    for limited types whose first subtype is a definite subtype.
> 6) Should be possible to use an aggregate or a function-call (-like-thing) as
>    an actual IN parameter of a limited type
> 7) Should *not* be possible to copy an existing limited object. I.e.
>    Should *not* be possible to have an assignment statement for a limited
>    type, and should *not* be possible to use the name of a limited declared object
>    nor a dereference of an access-to-limited type as a component of an aggregate.
> 8) The compiler needs to know at the call-site whether function-like thing
>    is returning an existing object by reference, or returning/initializing
>    a new object

I agree with 1..8 above.

Can we also get consensus on this:

    Every context that allows an initialization expression for nonlimited types should also allow it for limited types.

? That subsumes 1,2,part-of-3,part-of-6. It also includes record-component-defaults, generic-formal-in's, and probably some others I've forgotten. The relevant AI's list all the cases.

> Nice to have:
> 9) Ability to declare and call a function-like thing for a limited type with
>    non-defaulted discriminants
> 10) Ability to declare and call a function-like thing for a limited type with
>    unknown discriminants; such types would require an initializing expression --
>    no default initialization is defined for them.
I think 9 and 10 are important. I'm not quite willing to kill the whole idea if I can't have 9 and 10, but since these work already for the nonlimited case, it would seem pretty kludgy to leave them out in the limited case.

> 11) Easy to implement

Who could disagree with that? But I'm willing to put up with some implementation complexity to get 9 and 10.

> Other possible desirables:
> 12) Should not require alteration in the way limited types are laid out

I think I agree, but I'm not really sure what you're getting at. Which proposal(s) violate this?

> 13) Should allow us to still have function-like things
>     that return by reference

I don't much care about that for my own code, but I think it would be irresponsible of us to be incompatible with folks who have used this feature, even if we think perhaps it's a misguided feature.

> 14) Should (or should not) use the word "constructor" somewhere
> 15) Should (or should not) use the word "limited" somewhere
> 16) Should (or should not) use the word "function" somewhere

I have no strong opinion on the syntax, but I think these new kinds of constructors are conceptually "functions". They just happen to build their result in the final resting place.

Viewing them as "procedures" seems like a compiler-writer viewpoint; I'd rather take a user-oriented viewpoint. The fact that function results can take their discriminants from the function parameters, but you can't do that for 'out' parameters, is accidental, not fundamental.

Viewing them as totally new animals seems like overkill. To me, a constructor is just a function that creates a new thing. Number 8 above implies that we need *some* sort of new syntax. I would prefer to keep it as close as possible to the existing function declaration syntax. But I do not feel strongly about this.

> 17) Should provide more efficient way for non-limited types to be returned/initialized

Seems like a nice side effect. Not important.
> 18) Should not "orphan" existing language features

This seems like a possible symptom of kludgery, but not a worthy goal in its own right. I mean, if I have to write "constructor" instead of "function" all over the place in future code, that's not a disaster.

****************************************************************

From: Robert I. Eachus
Sent: Monday, December 8, 2003, 4:15 PM

I had to go out after sending my previous message, so effectively Tucker and I crossed in the mail. But Bob Duff did a good job of responding to Tucker's excellent list of points:

Robert A Duff wrote:

>Thanks, Tuck. This is a very helpful summary. I was getting lost in
>all those e-mails.
>
>

Agreed, and I was writing some of them, and referring back to others to keep everything straight.

...

>I agree with 1..8 above.
>
>Can we also get consensus on this:
>
>    Every context that allows an initialization expression for
>    nonlimited types should also allow it for limited types.
>
>? That subsumes 1,2,part-of-3,part-of-6. It also includes
>record-component-defaults, generic-formal-in's, and probably some
>others I've forgotten. The relevant AI's list all the cases.

I like Tucker's breakdown better. It makes it easier to say that 1,2,3, and 6 are must haves, and some of the other cases are nice to haves. I certainly favor allowing record component defaults, and generic formal in parameters likewise seem safe, and in most compilers I would expect them to be implemented identically to required cases when the defaults were actually used. But I would certainly consider any objections from implementors if some fringe case caused serious implementation problems.
>>Nice to have:
>> 9) Ability to declare and call a function-like thing for a limited type with
>>    non-defaulted discriminants
>> 10) Ability to declare and call a function-like thing for a limited type with
>>    unknown discriminants; such types would require an initializing expression --
>>    no default initialization is defined for them.
>
>I think 9 and 10 are important. I'm not quite willing to kill the whole
>idea if I can't have 9 and 10, but since these work already for the
>nonlimited case, it would seem pretty kludgy to leave them out in the
>limited case.

Agree. I think 10 is more important than 9, but both will be very important.

>> 11) Easy to implement
>
>Who could disagree with that? But I'm willing to put up with some
>implementation complexity to get 9 and 10.

Definitely agree.

>>Other possible desirables:
>> 12) Should not require alteration in the way limited types are laid out
>>
>>
>
>I think I agree, but I'm not really sure what you're getting at. Which
>proposal(s) violate this?

No current proposal, AFAIK. Doesn't mean a future new variation won't. But there is a difference between require and permit that should be kept in mind. There will be cases where compilers can generate more efficient layouts for types that are just not used today. Remember that my example code compiles cleanly today, the only problem is that it exports ADTs that can't be created by users.

For example, it would be an optimization for a compiler that currently places unbounded structures on the heap and uses 'hidden' pointers in the structure to manage them for the compiler to allocate one contiguous chunk of the heap, for all components of a record, have pointers/offsets in the record structure, and just one heap object to free when the whole record is freed. That doesn't mean that all compilers have to treat records with multiple constructors that way, just that a compiler is allowed to do so.
To be honest, I expect that the objects in point 10 above will become common in Ada 0Y. The types are currently legal in Ada 95/2000, but they just are not used. (And not really very usable.) So I don't know if any compilers have horrible overhead if someone does create one. If so, that compiler would probably need to change its layout policy for limited types.

>> 13) Should allow us to still have function-like things
>>     that return by reference
>
>I don't much care about that for my own code, but I think it would be
>irresponsible of us to be incompatible with folks who have used this
>feature, even if we think perhaps it's a misguided feature.

Huh? Oh. You don't use it because you don't use limited tagged types. This feature will become much more useful and less 'misguided' if we can initialize objects of types derived from Limited_Controlled more easily.

>> 14) Should (or should not) use the word "constructor" somewhere
>> 15) Should (or should not) use the word "limited" somewhere
>> 16) Should (or should not) use the word "function" somewhere
>
>I have no strong opinion on the syntax, but I think these new kinds of
>constructors are conceptually "functions". They just happen to build
>their result in the final resting place.
>
>Viewing them as "procedures" seems like a compiler-writer viewpoint; I'd
>rather take a user-oriented viewpoint. The fact that function results
>can take their discriminants from the function parameters, but you can't
>do that for 'out' parameters, is accidental, not fundamental.
>
>Viewing them as totally new animals seems like overkill. To me, a
>constructor is just a function that creates a new thing. Number 8 above
>implies that we need *some* sort of new syntax. I would prefer to keep
>it as close as possible to the existing function declaration syntax.
>But I do not feel strongly about this.
>
>

We really need Norman Cohen to take a look at the issue. I'm sure he could come up with something.
Seriously, I have no objection to retaining the word function. But I do want the syntax to indicate that this is one of those special "constructor" things to both the user and the compiler. I don't like "limited function" because that implies that there are also "not limited function" types. ;-) Using "constructor function" seems a bit wordy but otherwise fine. Certainly whatever the syntax, the RM should talk about them as constructor functions or constructors. I also tried out:

    function Foo return new Bar;

But that seems to imply that a hidden pointer must be used.

    function Foo return inplace Bar;

Is a bit better, but even with the precedent of goto, I don't like the idea of reserved words that are not English words.

    function Foo create Bar;

Might be acceptable to everyone? I am certainly open to any good ideas.

>> 17) Should provide more efficient way for non-limited types to be returned/initialized
>>
>Seems like a nice side effect. Not important.
>

Agree. Well maybe more than just nice if we can improve the efficiency of Unbounded_String in some cases. But certainly not a requirement.

>> 18) Should not "orphan" existing language features
>
>This seems like a possible symptom of kludgery, but not a worthy goal in
>its own right. I mean, if I have to write "constructor" instead of
>"function" all over the place in future code, that's not a disaster.

One last suggested inclusion:

    19) If we adopt a partial solution, that partial solution shouldn't limit a future extension to cover everything.

I am certainly willing to consider scope reduction of a complete solution, as long as it doesn't preclude ever fixing the excluded cases.

****************************************************************

From: Tucker Taft
Sent: Monday, December 8, 2003, 5:52 PM

Robert A Duff wrote:
> ...
> Can we also get consensus on this:
>
>     Every context that allows an initialization expression for
>     nonlimited types should also allow it for limited types.

I agree with that.
I also believe that the reverse should be true, namely there should be no contexts where calling these function-like things are permitted, but calling good-old functions returning non-limited types are not permitted. So in other words, from a user point of view, these are all very similar. The limited-returning ones cannot be called in certain contexts because those contexts would require copying the result. The only context I can think of off the top of my head is as the right hand-side of an assignment statement, though there are probably others. Whether they can be used as the expression of a return statement depends on the details of how these limited-returning things are implemented (as opposed to called). It is the existing limited-returning-by-ref functions that are odd, because they can only be called in very limited contexts. In particular, a call on one of these can be used as an IN parameter, in a renaming, and as a prefix of a name (anyplace else?). This is not so noticeable in Ada 95, because the limitedness of the result and the absence of aggregates and initialization of limited types, means that the by-ref-ness doesn't create much additional limitation. *But*, if we add aggregates and initialization of limited types, then suddenly these kinds of functions have some odd-ball limitations which may be hard to remember, especially if there are function-like things that don't have these limitations. Hence, I feel pretty strongly that if we are going to use syntax to make these two kinds of limited-returning function-like things look different, we should make the existing returning-by-ref functions look different from non-limited-returning functions, and make the new more flexible limited-returning functions look like good old non-limited returning functions, since they have so much more in common (in terms of legal calling contexts). 
This is why I would recommend we require something like the word "limited" on a function if it will be returning by-ref, and can only be called in contexts where by-ref makes sense. This is of course incompatible, but it is easily caught at compile-time, and compilers could start allowing the word "limited" right away, even before they support the new capability. > > Other possible desirables: > > 12) Should not require alteration in the way limited types are laid out > > I think I agree, but I'm not really sure what you're getting at. Which > proposal(s) violate this? There were a lot of different ideas thrown around, but at least one of them implied that the caller might *not* know the size of the thing being allocated, nor where it was being allocated. Clearly if you call one of these function-like things as a component of an aggregate, and you lay out limited types contiguously (even if some component is dynamic-sized), then the caller *must* specify where the object is allocated, and *must* know the size before it goes out-of-line so it can add up all the sizes of the components and do the one overall allocation in the appropriate place (on the secondary stack, in some user-specified storage pool, as a component of a yet larger limited object, etc.). I got the sense that one solution being bandied about was that limited components of dynamic size would *have* to use a level of indirection, precluding a contiguous allocation for an enclosing limited record. This is not the way many compilers do things now, and so would imply a change in the way limited types are laid out. I would hope (12) is a point of consensus, but I couldn't tell if that were true based on the flurry of messages. 
> > > 13) Should allow us to still have function-like things > > that return by reference > > I don't much care about that for my own code, but I think it would be > irresponsible of us to be incompatible with folks have used this > feature, even if we think perhaps it's a misguided feature. But perhaps call these things "limited functions" because if we add aggregates and initialized limited objects, these guys won't be callable in those contexts. Alternatively, require that they be recast as functions returning anonymous access types, effectively moving the ".all" from the return expression to the point of call (since in my experience, these functions almost always return a reference to a heap object, due to accessibility limitations). > > 14) Should (or should not) use the word "constructor" somewhere > > 15) Should (or should not) use the word "limited" somewhere > > 16) Should (or should not) use the word "function" somewhere > > I have no strong opinion on the syntax, but I think these new kinds of > constructors are conceptually "functions". They just happen to build > their result in the final resting place. I agree (as is presumably obvious). As indicated above, it is the return-by-ref guys that will begin to look like oddballs, if we add limited aggregates and initialization. > Viewing them as "procedures" seems like a compiler-writer viewpoint; I'd > rather take a user-oriented viewpoint. The fact that function results > can take their discriminants from the function parameters, but you can't > do that for 'out' parameters, is accidental, not fundamental. I'm not sure I followed that logic, but I agree that they should be *viewed* as functions. The question is how does one implement these. I fear that to achieve nice-to-have's (10) and (11), allowing the first subtype to have non-defaulted or unknown discriminants, combined with (12), creates a real challenge. 
Renaming a procedure call as a function nicely solved all the problems: a) the visible declaration is a function b) the renaming declaration can use the parameters to specify the discriminants for the returned (i.e. OUT) object (e.g. "(Disc => 3, others => <>)") c) the out of line code has a name for the pre-allocated object so it can refer to the discriminants. If there is another solution that has all these capabilities that would be great. I have not found one. The hardest problem is where the discriminants are not explicitly determined by the caller, but are instead determined by some computation on the IN parameters. One suggested solution was: function Make_Text(Len : Natural) return Lim_Text(Len); But that doesn't work if the discriminants of Lim_Text are not visible (i.e. "(<>)"). The renaming (of a procedure call) could work because the renaming can be in the private part. It may be that some mild restrictions could be added to deal with this problem. I would hope the restrictions can be enforced on the *declaration* of the function rather than at the call site. Otherwise I fear we will get into the "applicable index constraint" game, which I don't relish. That is, certain calls would only be permitted when there is an applicable discriminant constraint. > > Viewing them as totally new animals seems like overkill. To me, a > constructor is just a function that creates a new thing. ... And except for the oddball return-by-ref functions, all functions create a new thing. 
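[Editor's illustration, not part of the original message.] The hard case described above, where the partial view hides the discriminants and they are computed from the IN parameters, can be sketched with the extended return statement of Ada 2005. This is not one of the proposals as written in the thread; the package name Texts and the Length function are hypothetical.

```ada
package Texts is
   type Lim_Text (<>) is limited private;   --  discriminants not visible
   function Make_Text (Len : Natural) return Lim_Text;
   function Length (T : Lim_Text) return Natural;
private
   type Lim_Text (Len : Natural) is limited record
      Data : String (1 .. Len);
   end record;
end Texts;

package body Texts is
   function Make_Text (Len : Natural) return Lim_Text is
   begin
      --  The discriminant is fixed from the IN parameter when the
      --  return object is created; no copy is made on return.
      return Result : Lim_Text (Len) do
         Result.Data := (others => ' ');
      end return;
   end Make_Text;

   function Length (T : Lim_Text) return Natural is
   begin
      return T.Len;
   end Length;
end Texts;

with Ada.Text_IO;
with Texts;
procedure Texts_Demo is
   T : constant Texts.Lim_Text := Texts.Make_Text (12);
begin
   Ada.Text_IO.Put_Line (Natural'Image (Texts.Length (T)));
end Texts_Demo;
```

The three compilation units are shown in one block for brevity; with GNAT they would need to be split into separate files (for example with gnatchop) before compiling.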
**************************************************************** From: Randy Brukardt Sent: Monday, December 8, 2003, 6:53 PM Tucker said: > Hence, I feel pretty strongly that if we are going to use syntax to make > these two kinds of limited-returning function-like things look different, > we should make the existing returning-by-ref functions look different > from non-limited-returning functions, and make the new more flexible > limited-returning functions look like good old non-limited returning functions, > since they have so much more in common (in terms of legal calling > contexts). > > This is why I would recommend we require something like the word "limited" on > a function if it will be returning by-ref, and can only be called in contexts > where by-ref makes sense. This is of course incompatible, but it is easily > caught at compile-time, and compilers could start allowing the word "limited" > right away, even before they support the new capability. I don't mind that in a vacuum, but I think that it means that either (1) non-limited constructors are actually more expensive than current functions; or (2) converting a limited type to non-limited requires checking all functions for correct behavior. The former occurs because (in one model) you get a call to Initialize that generally can't be optimized away on top of the Adjust and Finalize calls that we already have; the latter occurs (in another model) because limited types call Initialize and non-limited types don't. I don't much like either result. > > > Other possible desirables: > > > 12) Should not require alteration in the way limited types are laid out > > > > I think I agree, but I'm not really sure what you're getting at. Which > > proposal(s) violate this? > > There were a lot of different ideas thrown around, but at least one of > them implied that the caller might *not* know the size of the thing > being allocated, nor where it was being allocated. 
Clearly if > you call one of these function-like things as a component of an aggregate, > and you lay out limited types contiguously (even if some component is > dynamic-sized), then the caller *must* specify where the object is allocated, > and *must* know the size before it goes out-of-line so it can add up all the sizes > of the components and do the one overall allocation in the appropriate place > (on the secondary stack, in some user-specified storage pool, as a component > of a yet larger limited object, etc.). Trying to lay out all possible record types contiguously is a fool's game. Kinda like trying to implement universal generic sharing. :-) It's possible to get it to work, but only with lots of standing on your head. And the result is very user-unfriendly: objects of reasonable types like type Sane_Bounded_String (D : Natural := 0) is record Data : String (1 .. D); end record; raise Storage_Error unless constrained. In any case, the vast majority of real types can be implemented contiguously, with any of these proposals. (Most ADTs don't have discriminants anyway, at least not on the top-level types.) If a few types have to change representation in a few compilers (and only if there are constructors defined) to make this work, I cannot get too excited. It can't be incompatible: there are no constructors now. > > Viewing them as totally new animals seems like overkill. To me, a > > constructor is just a function that creates a new thing. ... > > And except for the oddball return-by-ref functions, all > functions create a new thing. I guess I view these as a new thing because what they do is create a user-defined "construction" of an object; they need to replace the "initialization assignment" operation of Ada as well as the "initialization" itself. Existing functions do not change the semantics of assignment. For non-controlled types, the distinction doesn't really matter, but it is a big deal for controlled types (of all stripes).
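[Editor's illustration, not part of the original message.] The Sane_Bounded_String point above can be tried with a small program. What happens at the unconstrained declaration is implementation-dependent: a compiler that allocates the maximum size contiguously may raise Storage_Error (or fail in some other way), while one that uses hidden indirection can allocate on demand. The procedure name and object names are hypothetical.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Bounded_Demo is
   type Sane_Bounded_String (D : Natural := 0) is record
      Data : String (1 .. D);
   end record;

   --  Constrained object: storage for exactly five characters.
   Fixed : constant Sane_Bounded_String (5) := (D => 5, Data => "hello");
begin
   Put_Line (Fixed.Data);
   declare
      --  Unconstrained object: D may later change to any Natural value,
      --  so a contiguous layout must reserve room for the largest one.
      Growable : Sane_Bounded_String;
   begin
      Growable := (D => 3, Data => "abc");
      Put_Line (Growable.Data);
   end;
exception
   when Storage_Error =>
      Put_Line ("Storage_Error: maximum-size allocation failed");
end Bounded_Demo;
```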
Also, I see a new thing as necessary, because I don't believe that a useful constructor can be defined that won't force some representation changes in compilers. (That is, (12) is an impossible goal; holding to it is a disaster from a user perspective -- it forces unnatural separations of construction code into parts. And the idea of somehow specifying an aggregate as the argument of an In Out parameter seems goofy.) As long as the constructors are explicit, then there isn't a problem in that existing code would not have to change representation. If we don't have the will to do this right this time, I don't think there is any value to another partial band-aid solution. Especially if it cannot be extended properly in the future. Which is why Tucker's procedure renaming just isn't going to work. **************************************************************** From: Robert I. Eachus Sent: Monday, December 8, 2003, 7:52 PM Tucker Taft wrote: >Hence, I feel pretty strongly that if we are going to use syntax to make >these two kinds of limited-returning function-like things look different, >we should make the existing returning-by-ref functions look different >from non-limited-returning functions, and make the new more flexible >limited-returning functions look like good old non-limited returning functions, >since they have so much more in common (in terms of legal calling contexts). > >This is why I would recommend we require something like the word "limited" on >a function if it will be returning by-ref, and can only be called in contexts >where by-ref makes sense. This is of course incompatible, but it is easily >caught at compile-time, and compilers could start allowing the word "limited" >right away, even before they support the new capability. > > It sounds like you are proposing to make detecting whether a function is a constructor or a "normal" function depend on whether or not it returns a limited type. But that doesn't work. 
The problem is that the compiler may not know whether or not a function can be seen in contexts where its type must be returned by reference. For example, a generic formal part may specify a limited type, but the actual may be non-limited. The reverse happens as well. Inside a package where a type is declared as limited private, the type may or may not be limited. So I have been assuming that 'flagging' constructors as such must be done in syntax, and constructors for non-limited types must be allowed, subject to the same rules and restrictions as for limited types. The "normal" case will be that a constructor is actually defined in a scope where the return type is non-limited, at least for non-tagged types. >There were a lot of different ideas thrown around, but at least one of >them implied that the caller might *not* know the size of the thing >being allocated, nor where it was being allocated. Clearly if >you call one of these function-like things as a component of an aggregate, >and you lay out limited types contiguously (even if some component is >dynamic-sized), then the caller *must* specify where the object is allocated, >and *must* know the size before it goes out-of-line so it can add up all the sizes >of the components and do the one overall allocation in the appropriate place >(on the secondary stack, in some user-specified storage pool, as a component >of a yet larger limited object, etc.). > >I got the sense that one solution being bandied about was that limited components >of dynamic size would *have* to use a level of indirection, precluding a contiguous >allocation for an enclosing limited record. This is not the way many compilers >do things now, and so would imply a change in the way limited types are laid out. >I would hope (12) is a point of consensus, but I couldn't tell if that were true >based on the flurry of messages. > > Yes, it is not just being thrown around, it is the other proposal on the table. 
However, there is a solution which doesn't require the caller to know the size of the object at the point of the call, and does not require a level of indirection. This requires the caller to pass a thunk to the constructor. When the constructor is ready to allocate the actual object, it calls the thunk, giving the needed size, and the thunk returns an address. The thunk can be an allocator for some heap, or can be code to add the size information to the size for some object on top of the stack. The case of an object with many constructors as part of, say, an initial value aggregate can be accommodated by calling all the tasks in sequence, if the object is being created on a different stack than the one that contains the object being created, or in the case of a heap object, you can get a large chunk of heap, and eventually return what is not used. Like Randy, though, I consider that whether an implementation uses indirection for some types should be left to the implementor. There are cases where it is more user friendly--and also more efficient at run-time--to do that. The particular case of arrays of Unbounded_Strings is not an idle pastime; it comes up fairly frequently, so it is worth looking at performance of various solutions in that case. (Unbounded_String is a weird case in general. If you implement Unbounded_Strings efficiently they are really limited objects that cannot be copied. What happens on assignment is a "deep copy" that clones the object.) >>> 13) Should allow us to still have function-like things >>> that return by reference >>> >>> >But perhaps call these things "limited functions" because if we add >aggregates and initialized limited objects, these guys won't be callable >in those contexts.
Alternatively, require that they be recast as >functions returning anonymous access types, effectively moving the ".all" from the >return expression to the point of call (since in my experience, these >functions almost always return a reference to a heap object, due to accessibility >limitations). > > I'm going to stay out of this argument, other than to say I will probably recast those functions that are actually return by reference using the new semantics. But I don't want to have to do it at gunpoint. >I'm not sure I followed that logic, but I agree that they should be *viewed* >as functions. The question is how does one implement these. I fear >that to achieve nice-to-have's (10) and (11), allowing the first subtype >to have non-defaulted or unknown discriminants, combined with (12), creates >a real challenge. Renaming a procedure call as a function nicely solved >all the problems: > a) the visible declaration is a function > b) the renaming declaration can use the parameters to specify the > discriminants for the returned (i.e. OUT) object (e.g. "(Disc => 3, others => <>)") > c) the out of line code has a name for the pre-allocated object so > it can refer to the discriminants. > >If there is another solution that has all these capabilities that would be >great. I have not found one. The hardest problem is where the discriminants >are not explicitly determined by the caller, but are instead determined >by some computation on the IN parameters. > You have not found one, but Randy and I have. I think it requires more compiler implementation work than your approach, and in some cases it will be less efficient (more information passed in the call). But the advantage of the approach is that it does cover all cases, including those where neither the caller nor the constructor can know the size of the returned object until the point of the return statement. 
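[Editor's illustration, not part of the original message.] The thunk scheme described earlier in this message can be modelled at source level. This is only a toy: a real compiler would pass the thunk implicitly and deal in machine addresses. Here storage is a character array, an "address" is an index into it, and all names (Arena, Alloc_Thunk, Bump, Make_Greeting) are hypothetical.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Thunk_Demo is
   --  Toy storage area standing in for a stack frame or a heap.
   Arena : String (1 .. 64) := (others => ' ');
   Next  : Positive := 1;

   --  The thunk: given a size, it returns where the object should live.
   type Alloc_Thunk is access function (Size : Natural) return Positive;

   --  Caller-supplied thunk: bump-allocate Size characters from Arena.
   function Bump (Size : Natural) return Positive is
      Start : constant Positive := Next;
   begin
      Next := Next + Size;
      return Start;
   end Bump;

   --  "Constructor": only it knows how large the result will be, so it
   --  asks the caller's thunk for storage once the size is known.
   procedure Make_Greeting
     (Name        : in  String;
      Alloc       : in  Alloc_Thunk;
      First, Last : out Natural)
   is
      Text : constant String := "Hello, " & Name;
   begin
      First := Alloc (Text'Length);
      Last  := First + Text'Length - 1;
      Arena (First .. Last) := Text;   --  build the object in place
   end Make_Greeting;

   F, L : Natural;
begin
   Make_Greeting ("Ada", Bump'Access, F, L);
   Put_Line (Arena (F .. L));
end Thunk_Demo;
```

The caller never learns the size before the call; it only decides where the storage comes from, which is the division of labour the message argues for.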
Yes, if the compiler sees that for some types the constructors can return objects larger than the largest stack available, it may decide to use (hidden) indirection in such objects. But I consider that to just be the nature of Ada. (Has anyone really thought about what will happen when creating or allocating the maximum unconstrained String doesn't always raise Storage_Error? In the early days, there were a few compilers, including the one for the DPS6, that used 16 bits for Integer, but a few years ago I ordered a machine with 4 Gig of memory. Of course the OS wouldn't allow 2 Gig to be allocated for one String, but that day is coming.) Is it worth this potential extra overhead to make declaring private types with unknown discriminants work right? I think so. I also think the syntax is easier to use in the easier cases, which will make the new constructs more popular. >It may be that some mild restrictions could be added to deal with this >problem. I would hope the restrictions can be enforced on the *declaration* >of the function rather than at the call site. Otherwise I fear we >will get into the "applicable index constraint" game, which I don't >relish. That is, certain calls would only be permitted when there is >an applicable discriminant constraint. > > I very much don't relish that either. >And except for the oddball return-by-ref functions, all >functions create a new thing. > > That is a compiler implementor's view of return by value. ;-) Users talk all the time about functions returning this or returning that when they are just returning a copy of an existing object. From the user's point of view, a constructor is different even for non-limited types. It might be better to say that a constructor constructs a new value, while many functions return existing values. Of course, arithmetic operations don't really fit this picture, but they are already special in a different way.
But especially for non-limited ADTs I see a semantic division between constructors, which build new records, and selector functions that return existing records. **************************************************************** From: Tucker Taft Sent: Monday, December 8, 2003, 10:28 PM > Tucker said: > > > Hence, I feel pretty strongly that if we are going to use syntax to make > > these two kinds of limited-returning function-like things look different, > > we should make the existing returning-by-ref functions look different > > from non-limited-returning functions, and make the new more flexible > > limited-returning functions look like good old non-limited returning > functions, > > since they have so much more in common (in terms of legal calling > > contexts). > > > > This is why I would recommend we require something like the word "limited" > on > > a function if it will be returning by-ref, and can only be called in > contexts > > where by-ref makes sense. This is of course incompatible, but it is > easily > > caught at compile-time, and compilers could start allowing the word > "limited" > > right away, even before they support the new capability. > > I don't mind that in a vacuum, but I think that it means that either (1) > non-limited constructors are actually more expensive than current functions; > or (2) converting a limited type to non-limited requires checking all > functions for correct behavior. Unfortunately, I have completely lost you. I was trying to focus on the "call" side of things first, before plunging into the body/implementation side. Once we know what we want from the call side, we can start to figure out what we need to provide on the body/implementation side. So strictly from the call side, non-limited-returning functions always create/initialize a new object. Unfortunately, in Ada 95, the only limited-returning functions are return-by-ref of a preexisting object (when I say limited, I mean "truly" limited). 
Now what AI-318 is trying to provide is limited-returning function-like things that create/initialize a new object, very much like non-limited-returning functions. This is important because we are now proposing to allow limited objects to have initializing expressions, and we want to allow a function-call-like thing for those expressions. Unfortunately, the existing limited-returning functions are exactly the *wrong* thing for these new contexts. These by-ref functions didn't seem so odd when we didn't allow limited initializing expressions. There were no contexts where they couldn't be called due their by-ref-ness. The limited-ness was enough to eliminate all such contexts. But now we have proposed new contexts where limited types are allowed, but the existing kinds of functions can't be called in those contexts -- a definite pity. > The former occurs because (in one model) you get a call to Initialize that > generally can't be optimized away on top of the Adjust and Finalize calls > that we already have; the latter occurs (in another model) because limited > types call Initialize and non-limited types don't. Let's just for a moment ignore this issue of whether the object has to be default initialized and then re-initialized. Notice that I didn't mention that in any of my "consensus" lists, and that's not what I am focusing on now. I am happy to keep searching for a solution that avoids the double initialization. What I want first is a good specification of what the solution should look like on the *call* side. > I don't much like either result. I think you are talking about the implementation side, but let's first try to agree about the call side. > > > > Other possible desirables: > > > > 12) Should not require alteration in the way limited types are laid > out > ... > Trying to lay out all possible record types contiguously is a fool's game. > Kinda like trying to implement universal generic sharing. 
:-) It's possible > to get it to work, but only with lots of standing on your head. And the > result is very use-unfriendly: objects of reasonable types like > type Sane_Bounded_String (D : Natural := 0) record > Data : String (1 .. D); > end record; > raise Storage_Error unless constrained. > > In any case, the vast majority of real types can be implemented > contiguously, with any of these proposals. (Most ADTs don't have > discriminants anyway, at least not on the top-level types.) If a few types > have to change representation in a few compilers (and only if there are > constructors defined) to make this work, I cannot get too excited. It can't > be incompatible: there are no constructors now. Are you proposing that if a programmer writes a function-like thing for a limited type, then the layout changes? I really think that is very bad news. And despite your concern about laying out records contiguously, I am pretty certain that GNAT, Rational, Green Hills, and Aonix all lay out records contiguously (I am *very* certain about Green Hills and Aonix ;-). I think that represents about 95% of the Ada market. > > > Viewing them as totally new animals seems like overkill. To me, a > > > constructor is just a function that creates a new thing. ... > > > > And except for the oddball return-by-ref functions, all > > functions create a new thing. > > I guess I view these as a new thing because what they do is create a > user-defined "construction" of an object; they need to replace the > "initialization assignment" operation of Ada as well as the "initialization" > itself. Existing functions do not change the semantics of assignment. For > non-controlled types, the distinction doesn't really matter, but it is a big > deal for controlled types (of all stripes). From the call-side, I don't see the big difference. 
Even from the implementation side, it seems like we are just trying to eliminate some extra "last minute" copying that is currently part of non-limited-type function semantics. Many functions are written with a local "Result" parameter, which is then built up as desired, and then returned. Many other functions are little more than the return of an aggregate. Both of these are clearly creating/initializing new objects. All we need to arrange is that in both cases for a limited type, the object to be returned is built in its final resting place. And the discriminants, if any, are known on the call side (at least to the generated code), before the out-of-line code begins. I suppose one (crazy?) possibility is that such functions must be inlined if the compiler run-time model needs additional information from the body, whereas they need not be inlined if the compiler run-time model uses implicit levels of indirection. This would make it quite analogous to the case with generics, where some compilers need the body to be able to generate code for an instance, while others don't, because their run-time model supports sharing. > Also, I see a new thing as necessary, because I don't believe that a useful > constructor can be defined that won't force some representation changes in > compilers. (That is, (12) is an impossible goal; holding to it is a disaster > from a user perspective -- it forces unnatural separations of construction > code into parts. And the idea of somehow specifying an aggregate as the > argument of an In Out parameter seems goofy.) As long as the constructors > are explicit, then there isn't a problem in that existing code would not > have to change representation. I think you are again implying that by writing a constructor-ish-thing, the record representation would change. This seems quite undesirable to me. > If we don't have the will to do this right this time, I don't think there is > any value to another partial band-aid solution. 
Especially if it cannot be > extended properly in the future. Which is why Tucker's procedure renaming > just isn't going to work. It would help if you had an example of the kind of extension you had in mind. I promise I am not wedded to the procedure renaming approach, but I do think it satisfies the requirements, except perhaps from an aesthetic point of view. I think we still might be able to make the "return ... do ..." approach work, but there would probably be more limitations. In any case, I believe these things still have so much in common with functions that calling them anything else would hurt more than it would help. > Randy. Here is a proposal that does not involve renaming: 1) Require "limited" (or some such word) if the function is going to return its result by reference; all other functions must return/initialize "new" objects. By-ref functions can only be called in contexts that don't require a new object (e.g. as an IN parameter or a renaming). [Better long-term alternative: replace these oddball functions with functions that have anonymous access-to-limited result types, since that is what they really are.] 2) Allow "return Result : Type := <expr> do ... end return;" as a way of having a name for the "new" object being returned/initialized. 3) If a limited type has "normal" functions, then its full type must be definite (e.g., there must be defaults for its discriminants). Note that its partial view may be indefinite (i.e. "(<>)"). Also note that the discriminants, though defaulted, may be given new values by the initializing <expr>, if the object to be returned is unconstrained (i.e. Result'Constrained is False). This ensures that the discriminants have well-defined values coming into the function, though they may be changed if the new object is unconstrained. 4) If the (full) result subtype is definite (and hence for all limited types), then the name given in the return ... do ... (e.g.
"Result") can be used within the <expr> itself, but only as a prefix for discriminants and the 'Constrained attribute. If 'Constrained is False, then within <expr>, Result.<discrim> will necessarily be equal to its default value. After <expr>, the discriminants will have the value that was determined by <expr>. The above ensures that for run-time models that need it, the discriminants and hence the size are known prior to going to out-of-line code, allowing the caller to do the allocation, and/or to include the object contiguously in an enclosing record or array, etc. The only real restriction is that if a limited type is going to allow objects to have their discriminants determined by an initializing expression, the full type must have defaults for the discriminants. And this restriction is enforced when the full type is declared, rather than when these functions are called. **************************************************************** From: Randy Brukardt Sent: Monday, December 8, 2003, 11:13 PM Tucker said: > Unfortunately, I have completely lost you. I was trying to > focus on the "call" side of things first, before plunging > into the body/implementation side. Once we know what we want > from the call side, we can start to figure out what we need > to provide on the body/implementation side. I have no idea what you mean by "call side". In any case, the only valid subsetting is to look at it from the user's perspective rather than the implementors. Other divisions just mean that you are ignoring half of the issues, and you're bound to get the wrong answer then. ... > Let's just for a moment ignore this issue of whether > the object has to be default initialized and then re-initialized. > Notice that I didn't mention that in any of my > "consensus" lists, and that's not what I am focusing > on now. I am happy to keep searching for a solution > that avoids the double initialization. What I > want first is a good specification of what the solution > should look like on the *call* side.
Then you're not looking at the whole issue. That's not an implementation detail, it's a very visible part of the user semantics. The only question is "what makes sense to the user of Ada 2005"? Once we've figured that out, we can look at whether some implementation restrictions are needed. That's the only sensible approach. ... > > I guess I view these as a new thing because what they do is create a > > user-defined "construction" of an object; they need to replace the > > "initialization assignment" operation of Ada as well as the "initialization" > > itself. Existing functions do not change the semantics of assignment. For > > non-controlled types, the distinction doesn't really matter, but it is a big > > deal for controlled types (of all stripes). > > From the call-side, I don't see the big difference. > Even from the implementation side, it seems like we are > just trying to eliminate some extra "last minute" copying > that is currently part of non-limited-type function > semantics. We're also trying to eliminate unnecessary calls on Initialize, Adjust, and Finalize. Since those are very difficult to optimize without breaking the user's code, semantics which call them less often and still is safe is important. > > Also, I see a new thing as necessary, because I don't believe that a useful > > constructor can be defined that won't force some representation changes in > > compilers. (That is, (12) is an impossible goal; holding to it is a disaster > > from a user perspective -- it forces unnatural separations of construction > > code into parts. And the idea of somehow specifying an aggregate as the > > argument of an In Out parameter seems goofy.) As long as the constructors > > are explicit, then there isn't a problem in that existing code would not > > have to change representation. > > I think you are again implying that by writing a constructor-ish-thing, > the record representation would change. This seems quite undesirable > to me. 
Not "would", but "could" -- in unlikely cases. If the type is definite, there is never a problem with any proposal. > > If we don't have the will to do this right this time, I don't think there is > > any value to another partial band-aid solution. Especially if it cannot be > > extended properly in the future. Which is why Tucker's procedure renaming > > just isn't going to work. > > It would help if you had an example of the kind of extension you > had in mind. I promise I am not wedded to the procedure > renaming approach, but I do think it satisfies the requirements, > except perhaps from an aesthetic point of view. I think we still > might be able to make the "return ... do ..." approach work, > but there would probably be more limitations. In any case, > I believe these things still have so much in common with > functions that calling them anything else would hurt more than > it would help. Sheesh, Tuck, I wrote up a rough description of syntax and semantics yesterday. Do I have to do it again?? > Here is a proposal that does not involve renaming: > > 1) Require "limited" (or some such word) if the function is going to > return its result by reference; all other functions must > return/initialize "new" objects. > By-ref functions can only be called in contexts that don't > require a new object (e.g. as an IN parameter or a renaming). > [Better long-term alternative: replace these oddball functions > with functions that have anonymous access-to-limited result types, > since that is what they really are.] That's a lousy better alternative, even if it is accurate. I don't think we ever want to force the introduction of access type where there currently are none. > 2) Allow "return Result : Type := <expr> do ... end return;" as a way of > having a name for the "new" object being returned/initialized. So (1) and (2) are pretty close to what I proposed Saturday night (except for syntax).
It's not clear to me what the user-level semantics for non-limited types is
supposed to be.

> 3) If a limited type has "normal" functions, then its full type
>    must be definite (e.g., there must be defaults for its discriminants).
>    Note that its partial view may be indefinite (i.e. "(<>)").
>    Also note that the discriminants, though defaulted, may be given
>    new values by the initializing <expression>, if the object to be
>    returned is unconstrained (i.e. Result'Constrained is False). This
>    ensures that the discriminants have well-defined values coming into
>    the function, though they may be changed if the new object
>    is unconstrained.

Tagged types don't allow defaults for discriminants. So you're saying that
useful limited types either must not be tagged or must not have discriminants.
That seems pretty fierce. And, in any case, I don't see what this has to do
with the user perspective. You said something about ignoring implementation
details, and I can't see any reason for this other than an implementation
detail.

> 4) If the (full) result subtype is definite (and hence for
>    all limited types), then the name given in the return ... do ...
>    (e.g. "Result") can be used within the <expression> itself,
>    but only as a prefix for discriminants and the 'Constrained
>    attribute. If 'Constrained is False, then within <expression>,
>    Result.<discriminant> will necessarily be equal to its default value.
>    After <expression>, the discriminants will have the value that was
>    determined by <expression>.

Is this necessary? I haven't tried to work out full examples, but it adds
complications where none seems to be needed. Anything that needs to refer to
the name should be in the statement part, I would think. What would you need
to do in that expression that can't be deferred to the body??
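
A minimal sketch of the alternative suggested in the reply above: the initial
expression never names the return object, every reference to it sits in the
statement part, and the discriminant value arrives as an ordinary parameter.
It assumes the "return ... do ... end return" form as it was later
standardized in Ada 2005. The type Text and the function Create are
hypothetical names chosen for illustration; they do not come from the thread.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Discriminant_Param_Demo is

   --  Hypothetical limited type whose size depends on a discriminant.
   type Text (Len : Natural) is limited record
      Data : String (1 .. Len);
   end record;

   --  The discriminant is supplied by the caller as a parameter, so the
   --  subtype indication of the return object does not refer to the
   --  object under construction.
   function Create (Len : Natural; Fill : Character) return Text is
   begin
      return Result : Text (Len) do
         --  References to Result by name appear only in the statement part.
         Result.Data := (others => Fill);
      end return;
   end Create;

   --  A limited object initialized by a function call.
   T : constant Text := Create (5, '*');

begin
   Put_Line (Natural'Image (T.Len) & " " & T.Data);
end Discriminant_Param_Demo;
```

The price of this style is that the caller has to state the discriminant
twice when the target object is already constrained, which is the "not as
pretty, but workable" trade-off discussed later in the thread.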
> The above ensures that for run-time models that need it, the
> discriminants and hence the size are known prior to going to
> out-of-line code, allowing the caller to do the allocation,
> and/or to include the object contiguously in an enclosing record
> or array, etc.

That seems like an implementation detail, again. That's fair game, of course,
but not when you say "let's focus on the user view".

> The only real restriction is that if a limited type is going
> to allow objects to have their discriminants determined by an
> initializing expression, the full type must have defaults for
> the discriminants. And this restriction is enforced when the
> full type is declared, rather than when these functions are called.

I agree that (hard) restrictions should be enforced when the constructor is
declared (it's not really a problem with the type). There's nothing wrong with
run-time checks if something is declared to be indefinite, but not otherwise.
But I'd prefer to avoid restrictions if we can.

I note that this proposal does seem to allow deferring or eliminating default
initialization. But it seems to imply that non-limited types still require a
copy (and Adjust call) afterwards. I'd much prefer to allow build-in-place.
Perhaps the new 7.6.1 would allow that?

---

My quicky summary of the user view of constructors:

1) These should be usable anywhere that an aggregate can be, with similar
   semantics. (This implies that top-level Adjust or Initialize should not be
   called on them in general.)
2) There should be as few as possible restrictions on the declarations and
   use of constructors.
3) They shouldn't feel "weird". (This implies retaining the function call-like
   syntax -- ":= Create(...);", and that the declaration probably should look
   something like a function call.)

****************************************************************

From: Randy Brukardt
Sent: Tuesday, December 9, 2003, 12:07 AM

Tucker said, responding to me:

> > Is this necessary?
> > I haven't tried to work out full examples, but it adds
> > complications where none seems to be needed. Anything that needs to refer to
> > the name should be in the statement part, I would think. What would you need
> > to do in that expression that can't be deferred to the body??
>
> If you want to specify the initial value as an aggregate, you have
> to specify the values for the discriminants. If you want them
> to match the newly created object, you need to be able to refer
> to the discriminants of the newly created object.

OK. That seems like a nice-to-have rather than a requirement. Without it, you
still could pass the discriminants as parameters to the constructor if you had
to. Not as pretty, but workable. A lot of the time, you'd want to do that
anyway. I'd hate to kill the proposal with some funny visibility that isn't
strictly necessary.

> > I note that this proposal does seem to allow deferring or eliminating
> > default initialization. But it seems to imply that non-limited types still
> > require a copy (and Adjust call) afterwards. I'd much prefer to allow
> > build-in-place. Perhaps the new 7.6.1 would allow that?
>
> I don't think I was making any such requirement. Many compilers
> currently preallocate space for the value returned by
> a function having a non-limited composite result type.
> The return ... do ... construct would allow that
> space to be used directly. For functions returning values
> on the secondary stack, it would also be possible to build
> the returned value directly on the secondary stack,
> and avoid cutting back the secondary stack upon return
> and just use it where it was placed. We already do that
> in certain circumstances.

I think you're relying on 11.6 and 7.6.1 to get there. Right? That's fine as
long as those sections have the right effect (and we all agree that is an
appropriate effect).
I don't think that the original 7.6.1 would have allowed that optimization
(even though compilers probably do it!), but the new one ought to.

> With limited types we need to create enough restrictions
> to ensure it can be built in place. For non-limited types,
> the return ... do ... construct makes it more likely that
> no extra copies are required, but we can't impose enough
> restrictions to ensure that no copies are needed in all
> run-time models.

That's fair. It seems likely that a copy would have to be done sometime
(certainly for elementary types, although that's no real problem).

> > My quicky summary of the user view of constructors:
> >
> > 1) These should be usable anywhere that an aggregate can be, with similar
> > semantics.
> > (This implies that top-level Adjust or Initialize should not be called on
> > them in general.)
> > 2) There should be as few as possible restrictions on the declarations and
> > use of constructors.
> > 3) They shouldn't feel "weird".
> > (This implies retaining the function call-like syntax -- ":= Create(...);", and that the
> > declaration probably should look something like a function call.)
>
> I agree with the above, and as indicated, I see no reason
> not to call them functions. The "return ... do ..." construct
> might be called a "constructor statement" or some such thing
> if you want to get the term "constructor" into the language ;-).

It wouldn't hurt, but I'm certainly not going to be banging any shoes if it
doesn't happen. :-)

****************************************************************

From: Robert I. Eachus
Sent: Tuesday, December 9, 2003, 10:51 AM

Wow! I think we are finally converging, but four long messages, two each from
Tucker and Randy, after I gave up for the night? Is Tucker already on
California time? ;-)

But I would like to start pulling out one issue at a time to resolve.
In this particular case, I am going to set ground rules for this thread, since
it presumes that certain things will be adopted. In this case, the RM
requirements to be specified for initializing objects inside the
return do end;

My feeling is that in reality writing useful constructors for private types
outside the body of the package that defines the type (or one of its children)
is going to be a pretty useless capability. Not totally useless, but okay to
ignore for now. This means that if there are discriminants, we can assume they
are known and visible, and the same with other fields of the target type. (But
the fields of the parent type may not be visible in this context.)

The three options that seem worth discussing are:

1) Controlled types are initialized implicitly.
2) Default initialization only occurs if there is no explicit
   initialization: (return Foo: Bar := (this, that, etc) do...)
3) We are all big boys here. (The constructor is normally defined by
   the same person who writes Initialize or assigns defaults.) So there
   are no default initializations, and no run-time checks.
4) There is a run-time check at the end of the do...end; and an
   exception is raised if any discriminants of the object were not
   initialized.

I think that all of these are technically viable given the way things seem to
be headed, so this is really a normative discussion (What do we want to
happen?). The one troubling case I see can occur if we decide that, if the
target is constrained, its discriminants, if any, get 'copied' in from the
target object. (In practice they will be the same object, no copying needed.)
Then if a constructor that assumes it will get the constraints from the target
is used to create an object that will get its constraints from the initial
value, something should happen. This leads me to tend toward 4). If the
sequence of statements checks 'Constrained and initializes the discriminants
(or other bounds) if false, then the check can be optimized away.
Otherwise the constructor should raise Program_Error. Compilers can provide
warnings if code for the check is generated, or if something that reads the
discriminants in the sequence of statements occurs where the discriminants may
not yet be initialized.

Notice though, that it might be nice if 'Constrained could be checked before
the return statement. I think that this is the one detail that I miss from my
proposal in Randy's variation. There will be cases where a programmer would
like to write:

    if Foo'Constrained then
       return ...;
    else
       return ...;
    end if;

Allowing the user to use the name of the constructor in this instance as a
prefix for 'Constrained would allow this. I don't see any real need for other
attributes of the target outside the return.

Which brings up an interesting point. Will there be a restriction that
prevents return statements inside the sequence of statements of a return
construct? If not, then the above will work. But I think the principle of
least surprise says that nested return statements should be illegal.

Rule 2), it seems to me, is the other contender. If there is no explicit
initialization in the object declaration, implicit initialization occurs. This
solves what to me is one troubling case with 4). If some fields of a record
type have defaults, those default initializations may not happen, and there
will be no warning to the programmer.

I guess I am comfortable with either 2) or 4). I think for most types, and
most constructors, there will be an explicit aggregate initial value, with the
sequence of statements if present doing any fixup needed. As long as double
initialization doesn't occur in that case, I am happy.

****************************************************************

From: Randy Brukardt
Sent: Tuesday, December 9, 2003, 11:01 AM

Robert Eachus said:

> The three options that seem worth discussing are:
>
> 1) Controlled types are initialized implicitly.
> 2) Default initialization only occurs if there is no explicit
>    initialization: (return Foo: Bar := (this, that, etc) do...)
> 3) We are all big boys here. (The constructor is normally defined by
>    the same person who writes Initialize or assigns defaults.) So there
>    are no default initializations, and no run-time checks.
> 4) There is a run-time check at the end of the do...end; and an
>    exception is raised if any discriminants of the object were not
>    initialized.

"Three options"? I see four. :-)

...

> I guess I am comfortable with either 2) or 4). I think for most types,
> and most constructors, there will be an explicit aggregate initial
> value, with the sequence of statements if present doing any fixup
> needed. As long as double initialization doesn't occur in that case, I
> am happy.

I had proposed (2). I agree that we don't want objects floating about that
haven't been initialized at all. And, being able to write a specific
initialization expression allows overriding that (or pieces of it, given the
<> notation). That seems powerful enough to me.

(And nested returns aren't allowed. Once a return object is created, it has to
be returned. Otherwise, you wouldn't be able to build the object in place,
which is the whole point.)

****************************************************************

From: Tucker Taft
Sent: Tuesday, December 9, 2003, 11:01 AM

> The three options that seem worth discussing are:
      ^^^^^ "four" ;-)

> 1) Controlled types are initialized implicitly.
> 2) Default initialization only occurs if there is no explicit
>    initialization: (return Foo: Bar := (this, that, etc) do...)
> 3) We are all big boys here. (The constructor is normally defined by
>    the same person who writes Initialize or assigns defaults.) So there
>    are no default initializations, and no run-time checks.
> 4) There is a run-time check at the end of the do...end; and an
>    exception is raised if any discriminants of the object were not initialized.
Only (2) makes sense to me.

****************************************************************

From: Robert I. Eachus
Sent: Tuesday, December 9, 2003, 4:57 PM

Tucker Taft wrote:

> > The three options that seem worth discussing are:
> >     ^^^^^ "four" ;-)

"A foolish consistency is the hobgoblin of little minds." -- Ralph Waldo
Emerson.

(Seriously, I decided to add case three, then didn't change the prefix. Of
course, I was originally intending to leave case three out as not making much
sense for Ada. ;-)

> > 2) Default initialization only occurs if there is no explicit initialization:
> > (return Foo: Bar := (this, that, etc) do...)
>
> Only (2) makes sense to me.

Since 2 is acceptable to me also, shall we consider that issue resolved? Any
other votes?

The object in the return statement gets initialized, including a call to an
explicit Initialize for controlled types, and default values for record
components, unless there is an initial value in the return statement. This
implies that the syntax for these special returns should allow:

    return Foo: Bar := (some initial value aggregate); --without a do ... end;

****************************************************************

From: Robert A. Duff
Sent: Tuesday, December 9, 2003, 5:34 PM

Robert Eachus wrote:

> "A foolish consistency is the hobgoblin of little minds." -- Ralph
> Waldo Emerson.

;-)

> > > 2) Default initialization only occurs if there is no explicit initialization:
> > > (return Foo: Bar := (this, that, etc) do...)
> >
> > Only (2) makes sense to me.
>
> Since 2 is acceptable to me also, shall we consider that issue
> resolved? Any other votes?

I agree with (2).
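
To make option (2) concrete, here is a minimal sketch written with the
extended return syntax as it was later standardized in Ada 2005. It shows
both cases agreed on above: a return object with no explicit initial value,
which gets its component defaults before the statement part runs, and a
return object with an explicit aggregate and no "do ... end" part. The type
Buffer and the functions Make_Default and Make_Sized are hypothetical names
used only for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Return_Object_Demo is

   --  Hypothetical limited type with default component values.
   type Buffer is limited record
      Size : Natural := 16;
      Used : Natural := 0;
   end record;

   --  No explicit initial value: the return object is default-initialized
   --  (Size = 16, Used = 0), then the statement part does the fixup.
   function Make_Default return Buffer is
   begin
      return Result : Buffer do
         Result.Used := 1;
      end return;
   end Make_Default;

   --  Explicit initial value: the aggregate replaces default
   --  initialization, and no statement part is needed.
   function Make_Sized (N : Natural) return Buffer is
   begin
      return Result : Buffer := (Size => N, Used => 0);
   end Make_Sized;

   A : constant Buffer := Make_Default;
   B : constant Buffer := Make_Sized (64);

begin
   Put_Line (Natural'Image (A.Size) & Natural'Image (A.Used));
   Put_Line (Natural'Image (B.Size) & Natural'Image (B.Used));
end Return_Object_Demo;
```

In Make_Sized the defaults of Buffer are never applied, which is the "no
double initialization" property asked for earlier in the thread.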
****************************************************************

From: Jean-Pierre Rosen
Sent: Tuesday, December 9, 2003, 10:31 AM

> Consensus statements about Ada 200Y if we were to approve AI-318:
>
> 1) Should be possible to declare an object of a limited
>    type and provide an initializing expression
> 2) Should be possible to use an initialized allocator for
>    an access-to-limited type

This is of course what is basically required. However, I think something is
missing from the list:
- the function-like thing should be able to access the characteristics
  (discriminants, bounds, etc) of the object being constructed.

Seems to me that without this requirement, we could just allow
initialization of limited types (but not assignment, of course).

****************************************************************

From: Tucker Taft
Sent: Tuesday, December 9, 2003, 12:04 PM

Good point. We clearly need to be able to query the constraints of the "new"
object. I said these constraints should be visible in the initializing
expression. They should also be visible in the subtype indication of:

    return Result : <subtype_indication> [:= <expression>] do ...

I suppose if the <subtype_indication> is unconstrained, then the constraints
come from the calling context. If the <subtype_indication> is constrained,
then there must be a check that the constraints are compatible with those
coming from the calling context.

> Seems to me that without this requirement, we could just allow
> initialization of limited types (but not assignment, of course).

I don't understand this sentence.

****************************************************************
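
The two consensus statements quoted in the message above can be sketched as
follows, assuming the rules as they were eventually adopted in Ada 2005: a
function call of a limited type may initialize a declared object (1) and may
appear as the operand of an initialized allocator (2). The type Lock, the
access type Lock_Access and the function Create are hypothetical names
introduced for this example. The sketch does not cover the additional wish
that the function be able to query constraints supplied by the caller.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Limited_Init_Demo is

   --  Hypothetical limited type.
   type Lock is limited record
      Owner : Natural := 0;
   end record;

   type Lock_Access is access Lock;

   function Create (Owner : Natural) return Lock is
   begin
      return Result : Lock do
         Result.Owner := Owner;
      end return;
   end Create;

   --  (1) Object of a limited type with an initializing expression.
   L : constant Lock := Create (1);

   --  (2) Initialized allocator for an access-to-limited type.
   P : constant Lock_Access := new Lock'(Create (2));

begin
   Put_Line (Natural'Image (L.Owner) & Natural'Image (P.Owner));
end Limited_Init_Demo;
```

Neither declaration copies a Lock value; in both cases the function result
is the new object itself, which is why these contexts can be allowed for
limited types while assignment stays illegal.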