!standard 3.10 (01) 03-02-05 AI95-00325/01
!class amendment 03-02-05
!status No Action (8-1-1) 04-03-07
!status work item 03-02-05
!status received 03-02-05
!priority Medium
!difficulty Hard
!subject Anonymous access types as function result types

!summary

This AI focuses strictly on issues relating to anonymous access types as function result types. It only makes sense if AI-00230 in some form is approved.

!problem

This proposal builds on AI-00230, by extending the advantages of anonymous access types to function result types. Because Ada has local access types, there needs to be a way to remember the accessibility level of the object designated by a reference across function return. Because Ada has multiple storage pools, it is important that an allocator of an anonymous access type allocate from the appropriate storage pool.

!proposal

The "access_definition" syntactic category (see AI-00231) is permitted for function result types. The type associated with the result type is an anonymous access type, which permits implicit conversions from other access types with appropriately compatible designated subtypes (as defined by 4.6(13-17)).

The accessibility level of an anonymous access type is determined at the time when the access_definition is elaborated. For a function result type, this happens at the time of the function return, and may possibly depend on information provided by the caller. To be specific, the level of the returned value is the same as the level of the type of the returned value, which must be shallower than the level of the execution of the function (to avoid dangling references by the caller).

If the return expression is an allocator, the accessibility level and storage pool for the allocator are determined by the context of the call on the function. In particular, the storage pool used is the one that would have been used had the allocator appeared at the point of the function call. (This rule is applied recursively for a chain of calls.)
This implies that a call on a function with an anonymous access return type must be passed a storage pool and an accessibility level, and must return both an access value and an accessibility level.

[Implementation NOTE: The implementation may implement the accessibility level check either after returning from the call, or at the point of the return statement, since it has enough information to do it there as well.]

!wording

(See proposal.)

!example

!discussion

Implementing anonymous function result types is probably more work than the other proposed uses for anonymous access types. However, anonymous result types provide important capabilities that are currently very difficult to accomplish.

First of all, one can define a function that constructs an object (including an object of a limited type) in a storage pool determined by the caller, including a "stackish" storage pool (i.e. one that is cheap to reclaim). This kind of constructor is definitely missing in Ada now. Currently, the access type, and hence the storage pool, must be known inside the function that creates an object.

Secondly, there is no way to create a single function that can be used both for creating a local object and a heap object with similar efficiencies. With this proposal, the "constructor" would look like this:

    function Construct(...) return access T is
    begin
        ...
        return new T'(...);
    end;

    X : constant access T := Construct(...);  -- Create on the stack
    Y : constant T_Ptr := Construct(...);     -- Create in T_Ptr'Storage_Pool

Thirdly, one could define a function that could return a reference to an aliased component of an object that is itself passed in using an access parameter. This would effectively allow a function to act as a "selector" that can be used on the left hand side of an assignment when followed by ".all".
E.g.:

        X : aliased Rec_With_Aliased_Component;
    begin
        Selected_Component(X'Access).all := Y;

where Selected_Component is declared:

    function Selected_Component(RP : access Rec_With_Aliased_Component)
      return access Component_Type is
    begin
        return RP.Component'Access;
    end Selected_Component;

And if we approve the "object.operation" syntax:

    X.Selected_Component.all := Y;

presuming Selected_Component is declared in the same package as Rec_With_Aliased_Component.

Implementation Issues:

There are some implementation issues associated with supporting function returns. Basically it means that such a function has to use run-time accessibility levels rather than static accessibility levels internally. Or more precisely, it needs to use accessibility levels that are comparable with those used by the caller. This need not be a huge burden. Probably the simplest is to have the caller pass in a "caller" accessibility level to the function, to which the function then adds one to obtain its own accessibility level.

[Some optimizations are possible: If there is only one access parameter, it could use the accessibility level associated with that parameter as the "caller" level (think about it ;-). If there are two or more access parameters, it could use the max of them. At that point, it would be more efficient to have the caller just pass in an extra parameter which is the "true" caller accessibility level. If it is nested in a function that also has access parameters, it is possible that it might have no access parameters, in which case it would probably need a "caller" accessibility level passed in as an additional parameter, though it could probably calculate it from the accessibility level of the access parameter passed into the enclosing function. Etc...]

!ACATS test

Tests should be created to check on the implementation of this feature.

!appendix

[... much interesting discussion elided ...]
[See the appendix of AI-00230 for the interesting discussion.]
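[Editor's sketch, not part of the AI text: the "selector" example from the !discussion section can be completed into a small stand-alone unit as shown below. It assumes an Ada 2005 or later compiler that accepts an access_definition as a function result; the definition of Component_Type, the record layout, the value 42 and the name Selector_Demo are illustrative assumptions, and the sketch does not demonstrate the storage pool or accessibility rules proposed here. - Ed.]

```ada
procedure Selector_Demo is

   --  Assumed definitions; the AI leaves these types unspecified.
   type Component_Type is new Integer;

   type Rec_With_Aliased_Component is record
      Component : aliased Component_Type := 0;
   end record;

   --  The selector from the discussion: returns a reference to an
   --  aliased component of the object designated by RP.
   function Selected_Component
     (RP : access Rec_With_Aliased_Component) return access Component_Type is
   begin
      return RP.Component'Access;
   end Selected_Component;

   X : aliased Rec_With_Aliased_Component;
   Y : constant Component_Type := 42;

begin
   --  The call followed by ".all" is used as the target of an assignment.
   Selected_Component (X'Access).all := Y;

   --  Checked only when assertions are enabled.
   pragma Assert (X.Component = Y);
end Selector_Demo;
```

X and Selected_Component are declared at the same nesting level here, so the designated component lives at least as long as the function's declaration, which is the situation the discussion has in mind.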
*************************************************************

From: Tucker Taft
Sent: Friday, February 1, 2002 11:54 AM

Here is an update to ai-00230 on anonymous access types. I was able to resolve the problems that I had with them last time, while eliminating the mind-bending named subtypes of anonymous access types, but adding back components of an anonymous access type. I think it all works safely and usefully now, especially presuming the syntax for "access_definition" is generalized (see ai-231) to allow control over nullness, and constantness of the designated object.

[This is version /01 of the AI - Ed.]

*************************************************************

From: Tucker Taft
Sent: Saturday, November 15, 2003 10:35 PM

I am at least partly responsible for a couple of AIs (AI-318 and AI-325) that suggest supporting functions returning (new) limited objects, and functions returning anonymous access types. These two are pretty closely related, can be used to solve similar problems, and have some of the same issues.

So this is a bit of personal brainstorming on this topic, which will be turned into AI discussion depending on the feedback received from other ARG members (so speak up!).

First let me acknowledge some of the comments already received:

1) Robert Dewar made the point that the function-returning-new-limited-object proposal (AI-318), if there is nothing on the function declaration indicating it is one of these, would require a new run-time model for calling *all* functions returning limited types, including those that were allowed by Ada 95 (return-by-reference of *existing* limited objects). This kind of argument is not very convincing if it just applies to one particular implementation approach, but it seems that this will be true pretty much for *all* implementation approaches.
This implies that all existing code will need to be recompiled if a compiler were to start supporting this new kind of function, and existing code would almost certainly slow down. Hence, I accept Robert's point as establishing an additional relatively important criterion for evaluating these proposals, namely they should not require a significant run-time model change for *existing* code.

2) Dan Eilers suggested that we consider allowing a function to be declared as a renaming of a procedure with a single [in-]out parameter. This is an interesting idea, but it has some tricky implications. First, I would suggest some alternative syntax for the renaming to make it clear what is happening, since otherwise the overload resolution algorithm would have no clue to look at procedures when working on a function renaming. Hence, I might suggest:

    function Ret_Limited(X : Integer) return Lim_Type
      renames procedure Init_Limited;

presuming a directly visible declaration of:

    procedure Init_Limited(X : Integer; LT : [in] out Lim_Type);

The presence of the keyword "procedure" would cause overload resolution to consider procedures rather than functions, and match any procedure with one additional parameter of type Lim_Type, be it first, last, or somewhere in the middle.

A second issue here is that the "call" on Ret_Limited would have to be transformed into a call on Init_Limited. This would require that the target object of the function call be identified, and that it be already created and initialized (presumably by default), since we don't want the code of Init_Limited to be manipulating an uninitialized object. This implies that if the result subtype of the function is indefinite, the context of the call has to provide discriminants or bounds so the target object can be created and default initialized. This means that calling such a function would be analogous to using an aggregate with "others": the context would need to provide the constraint on the result.
This would also imply that whether a given context was allowed for calling such a function would depend on whether it was a renaming of a procedure or not, meaning that this renaming could not be in the private part acting as completion for some visible function declaration. It would have to be visible to the callers. This also implies some generic contract model issues that might be quite nasty.

Finally, if the function result subtype is "definite" but not the same as the corresponding procedure parameter subtype, one would have to decide which one would be relevant; normally the subtypes used in the profile of a subprogram renaming are irrelevant. Here it might be nice if, where the procedure parameter subtype is unconstrained, the result subtype of the function could be used to specify the constraints for the target object; but if the target object had its own constraints specified, would they have to agree with those of the function result subtype?

3) Steve Baird made the point that if for either the limited or the anonymous-access-type function we need to pass in a parameter representing a "storage pool" or equivalent, this would be the first situation where the storage pool is not known statically at the point of allocation. Of course the shared-generics folks could scoff at this kind of belly-aching ;-), but it does seem to be a relevant consideration.

4) For these kinds of functions, it is valuable to be able to have a name for the object that is going to be returned from the function. We have proposed various ways to do so. One proposal involved adding a "do...end" clause to a return statement. A second proposal involved declaring a variable using the keyword "return" analogous to how "constant" is used. The return statement must return this object if one is in scope.
Pascal has indicated a preference for the do...end approach, because it links the special semantics more closely to the return statement, rather than having to check every return statement to see whether it happens to return the value of a specially marked variable.

-------------------

All of this discussion to some extent begs the question of what problem we are really trying to solve. Here are various problems that we might or might not be hoping to solve with these "funky" functions:

- Limited types are hard to use. Allowing aggregates of a limited type can provide some "completeness" checks, but doesn't help cases where the limited type is private. Allowing functions to specify the initial value of a limited object could make limited types safer, by ensuring all the needed initialization took place at the point of declaration, and generally make them more like non-limited types in the paradigms of use.

- Anonymous access types help prevent proliferation of named access types (particularly relevant in the context of the "limited with" proposal). However, having anonymous access types only as parameters doesn't address the whole problem, since declaring a function returning an access value will necessitate declaring a named access type.

- It is sometimes desirable to create a single constructor operation that can be used to create an object either in the heap or as a local variable. As things are now, you generally have to choose between producing a procedure for initializing a large or limited object, a function for returning a pointer to a heap object, or a function for returning a constructed value of a smaller, non-limited object. It would be preferable to have a single paradigm for creating a "constructor" operation that would work for small and large objects, for limited and non-limited types, and for stack-resident, heap-resident, and component objects.

- It is not straightforward to write a function that allocates in a caller-determined storage pool.
Generally it requires declaring a new access type, associating the storage pool with that access type, doing the allocator, converting the result to some second named access type, and then having the caller convert yet again to the ultimate desired access type. Alternatively, each caller could instantiate a generic function and then call it. With regard to returning a limited object without copying it, there are various possibilities: - The *caller* might want to determine the constraints of the created object, in which case there needs to be some way for the constraints to be provided to the called routine. In this case, it is probably natural for the caller to preallocate the space for the result, even if it doesn't initialize it in any way. (This case maps fairly well to a procedure renamed as a function, except that the renaming approach would require the object to be default initialized before whatever further initialization is performed out-of-line.) - Alternatively, the *called* routine might determine the constraints, perhaps based on parameters passed in, in which case the best the caller could do is provide some kind of storage pool or area in which the called routine does the allocation, and then it would have to effectively rename the result whose address would be returned. For "stack" objects, this would be similar to functions returning objects of unknown size, except that no copying of the result would be permitted. The caller would have to use it where the called routine allocated it, which might imply leaving holes in the (secondary) stack. (This approach maps fairly well to an approach based on an "extended" return statement or a specially marked "return" object, where the declaration of the object to be returned can (or must?) include discriminants.) 
- There is presumably still some need for the existing capability, where the called routine returns a reference to a preexisting object, and the caller either passes it on to some other subprogram, or renames it for repeated local use. (To avoid the run-time model change discussed above, some alternative syntax in the function declaration would seem to be necessary to distinguish the new-object case from this existing return-by-reference case.)

Much of the complexity associated with returning limited types arises from cases where the result type is unconstrained, the worst being the case where it is also indefinite (i.e. no defaulted discriminants). It is therefore important to decide how valuable is the ability to support returning limited objects where the size of the returned object is not known just from the result subtype.

-----------------------------------------------------------------
------------ Procedure-renamed-as-function Redux ----------------
-----------------------------------------------------------------

In fact, if we restrict ourselves to cases where the result subtype is *definite*, then the procedure-renamed-as-function solution for "returning" limited objects looks more attractive. The caller would always be able to create the object using the result subtype if the "context" didn't provide constraints, so the call would be permitted in any context, eliminating the generic contract model problem and the need for new legality rules resembling those for "others" in array aggregates. The renaming could be in the private part as well (it could not be postponed to the body, since the convention of such a function would necessarily be intrinsic).

This also eliminates all the complexity associated with the caller providing a storage pool or equivalent. Instead, they just pass in a reference to an already created and default-initialized object. The procedure renamed as a function can treat it like any other [in]-out parameter of a limited type.
It is somewhat interesting to note that the "Initialize" procedure of limited controlled types can be thought of like one of these procedures renamed as functions, where the default initialization for controlled types is roughly equivalent to ":= Initialize;" presuming:

    function Initialize return Lim_Cntrl_Type renames procedure Initialize;

In other words, the Initialize procedure for a controlled type is called at the same point that one of these procedures-renamed-as-function would be called for a limited object initialized by a call on such a function.

Presumably if the limited object is both controlled and has an initialization specified by a function call, then Initialize is called immediately prior to calling the explicitly specified procedure-renamed-as-function, to ensure the object gets properly default initialized before being manipulated by "user" code.

--------------------------------------------------------------------------
------------ What about returning anonymous access types? ----------------
--------------------------------------------------------------------------

There is some advantage in having a name for the result object being returned from a function returning an anonymous access type, which gets us into the "extended" return statement. However, most of the advantage comes in the case where you are returning an allocator for an anonymous access-to-limited type, and there is no way to write anything but an aggregate (if that) as the initializer for the allocator. However, if we posit the existence of the above procedure-renamed-as-function as a solution for the limited type problem, then we can call such a function in our initialized allocator.
Given a reduced need for the extended return statement, then we can go to a relatively simple proposal for returning anonymous access types, namely require that the caller pass in a storage pool and an associated accessibility level, which would be used only if the expression of a return statement is an allocator, or a call on another such function. An accessibility level would be included with the returned object. The "only" new syntax for this would be allowing an access_definition to appear for the result subtype of a function. -------------------------- So... my brainstorming has led to the following: 1) allow renaming a procedure with one [in] out parameter as a function, provided the [in] out parameter's nominal subtype is a definite subtype. Require the use of a syntax like "function blah(...) return foo renames procedure blurfo;" to make it clear to the reader and the compiler that a procedure is being renamed. If the parameter type is limited, then such a function can only be called where we have proposed to allow limited aggregates (i.e. initializing a declared object, initializing a component of an aggregate, a default expression for a component_definition, in an initialized allocator, as an actual IN parameter or parameter default, as an actual formal IN object or formal IN default). 2) allow an access_definition for a function result subtype. If the expression of a return statement in such a function is an allocator (or a call on another such function), a storage pool and accessibility level provided by the caller is used (or passed on to the called function). Any and all comments, flames, better brains, are welcomed... I will wait a week or two and write up the results in time for the San Diego meeting. 
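
[Editor's sketch, not part of the original message: item 2 above says that the caller supplies an accessibility level and that a level comes back with the returned access value. The hidden parameters can be modelled with explicit ones, as below. Everything here is an assumption made for illustration: the numeric encoding of levels (larger means more deeply nested), the names Level, Result, Construct and Check, and the use of the standard pool of a named access type in place of the caller-supplied storage pool, which is not modelled. - Ed.]

```ada
procedure Hidden_Parameter_Model is

   --  Assumed encoding: a larger number is a more deeply nested,
   --  shorter-lived master.
   subtype Level is Natural;

   type T is record
      Value : Integer;
   end record;

   type T_Ref is access all T;

   --  Models what a call hands back: the access value plus its level.
   type Result is record
      Ref   : T_Ref;
      Depth : Level;
   end record;

   --  Stand-in for "function Construct (...) return access T".
   --  Caller_Level plays the role of the hidden level parameter; the
   --  returned allocator takes its level from the caller.
   function Construct (V : Integer; Caller_Level : Level) return Result is
   begin
      return (Ref => new T'(Value => V), Depth => Caller_Level);
   end Construct;

   --  Models the run-time check made when the result is converted to
   --  a type whose accessibility level is Target_Level.
   procedure Check (R : Result; Target_Level : Level) is
   begin
      if R.Depth > Target_Level then
         raise Program_Error;
      end if;
   end Check;

   Outer_Level : constant Level := 1;
   Inner_Level : constant Level := 2;

   R      : Result;
   Raised : Boolean := False;

begin
   --  Result created for an outer-level caller, used at the outer level.
   R := Construct (10, Caller_Level => Outer_Level);
   Check (R, Target_Level => Outer_Level);

   --  Result created for an inner-level caller must not escape outward.
   R := Construct (20, Caller_Level => Inner_Level);
   begin
      Check (R, Target_Level => Outer_Level);
   exception
      when Program_Error =>
         Raised := True;
   end;

   --  Checked only when assertions are enabled.
   pragma Assert (Raised);
end Hidden_Parameter_Model;
```

In an actual implementation the level and pool would travel as compiler-generated parameters rather than as visible ones; the sketch only shows which side supplies each piece of information and where the check can be made.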
*************************************************************

From: Robert Dewar
Sent: Sunday, November 16, 2003 4:10 AM

> This kind of argument is not very convincing if it just applies to
> one particular implementation approach

For the record, this is a *very* convincing argument if the one implementation approach that it applies to is in significant use today.

If Ada 0Y (what a terrible name) requires major changes in implementation approaches at any point, the corresponding features will simply be ignored. The best hope for seeing anyone implement these features is to make sure that they are not disruptive in this sense.

I realize that Tuck goes on to say that this is not the case here, but it is still an important point for the record.

I would guess that the GNAT situation is a typical one. We are indeed forging ahead implementing many of the new proposals, but at this stage, that is only practical if they do not involve any major shift in implementation (which to a remarkable extent is the case with many/most/all? proposals so far).

After all, we avoided downward closures based on concerns from just a couple of implementors in the Ada 9X process. The situation now even more requires no major shifts.

*************************************************************

From: Tucker Taft
Sent: Sunday, November 16, 2003 7:50 AM

My real point was that any vendor could torpedo any proposal they don't like by saying that it would require changing their run-time model for existing code. I feel that if we can suggest reasonable alternative approaches that don't require changing the run-time model, then that should counteract some of the concern. However, if it seems that there really is no way to avoid changing the run-time model of existing code, then that is a significant problem. And it seems pretty clear that having to support both return-by-reference and return-new-object semantics with identical syntax for the function declaration is such a case.
And if you promise not to overuse the argument, I'll promise to take each one you identify seriously ;-).

*************************************************************

From: Robert Dewar
Sent: Sunday, November 16, 2003 8:36 AM

> My real point was that any vendor could torpedo any
> proposal they don't like by saying that it would require
> changing their run-time model for existing code.

Basically I would agree they can. After all if you don't have the major vendors on board for a change at this stage, you might as well forget it. In fact I think you can expect vendors to operate in good faith (otherwise the whole process is broken).

> And if you promise not to overuse the argument, I'll promise
> to take each one you identify seriously ;-).

exactly :-) That's really the way things work.

An interesting question, in retrospect, did we really make the right decision to accommodate use of a display, when in practice for Ada 95 static chains make more sense anyway? We certainly paid a price for this accommodation!

Going back to the first paragraph, any vendor can torpedo any proposal by simply not implementing it :-) Yes, there will be some competitive pressure, which may be relevant for some customers, but I would not count on this as a major factor.

That being said, I note again, that in the case of GNAT we are implementing away, and have a lot of the new proposals working in our latest builds.

*************************************************************

From: Robert A. Duff
Sent: Sunday, November 16, 2003 7:00 AM

Robert Dewar says:

> An interesting question, in retrospect, did we really make the right
> decision to accommodate use of a display, when in practice for Ada 95
> static chains make more sense anyway? We certainly paid a price for this
> accommodation!

Well, *my* opinion on that point has not changed in 20 years. ;-) The one case where Ada is clearly inferior to Pascal...
(I never found the "display" argument compelling, either, since I know of at least one Pascal compiler that used displays, and implemented procedural parameters properly. Not the *easiest* way to do it, but certainly doable.)

> Going back to the first paragraph, any vendor can torpedo any proposal
> by simply not implementing it :-)

Indeed. This is an important point.

*************************************************************

From: Robert Dewar
Sent: Sunday, November 16, 2003 7:10 PM

> The one case where Ada is clearly inferior to Pascal...

Of course in GNAT, one has Unrestricted_Access to redress the balance, but it is hardly elegant :-)

Unrestricted_Access is a real language extension. Are we going to fix this in Ada0Y?

*************************************************************

From: Tucker Taft
Sent: Sunday, November 16, 2003 9:46 PM

The AI on anonymous access-to-subprogram parameters has been approved for intent by the ARG. Unfortunately, it isn't quite ready for forwarding to the WG9.

*************************************************************

From: John Barnes
Sent: Monday, November 17, 2003 1:42 AM

I just sent a new version to Randy to put on the database.

*************************************************************

From: Robert I. Eachus
Sent: Sunday, November 16, 2003 10:06 AM

Tucker Taft wrote:

>Any and all comments, flames, better brains, are welcomed...
>
>

I don't know about the better brains, and I'll try and avoid flames. As far as I am concerned we are trying to create these special subprograms that are neither fish nor fowl to solve a real problem in Ada. Why don't we focus on what the problem is, how to solve it technically, and then invent a notation that works.

Yes, I said invent a notation.
This is a language revision and if there is something missing from the language--and I think we have all concluded that there is--we need to come up with a language change that does minimal violence to existing code and implementations while solving the real problem in a way that is acceptable to users. It will be nice if it doesn't look like a kludge, but that is a goal, not a necessity. So what problem are we trying to solve? Primarily constructors for limited types. There are also issues for functions returning anonymous access types, but I suspect that if we get the allocators right, that will fall out along the way. The hard problems seem to be the same. And as far as I see, if we get allocators right, the need for functions which return the problem access types goes down. So what is the problem? We need to define something that looks suspiciously like an assignment operator, except that it can only be used with new objects. (Don't worry I am going to try to avoid anything that looks like procedure ":=". That is a can of worms I don't want to touch.) Now I am going to make a cut. There are three cases to be dealt with: Easy: The object and constructor have identical static bounds, discriminants, and/or other constraints, and there are no run-time issues. Constrained object: The object is defined as constrained, the constructor may take its constraints from the actual object. (This is the case that definitely, IMHO, needs new syntax--if we allow it.) Constrained constructor: The constructor allocates the object and returns it. The object is created by the constructor and that determines any constraints necessary on the object. (For the anonymous access type cases, add "the subtype designated by" where necessary.) Do we need to support the constrained object case? Notice that the rules allow constraints on the target object in the other cases. 
The difference is that in what I am calling the constrained object case, the constraints back propagate into the constructor function. In the easy and constrained constructor case, there may be a constraint check at compile time or after the object is constructed, and an error if it fails. I think that this case is a "nice to have" and as I said above will require new syntax to avoid massive work in existing compilers. What do others think?

Next, in what I call the constrained constructor case, the issues are that the size of the returned object is determined by the constructor, and that the object must be constructed in place. I can come up with several notational models for how this works, but they all come back to "hidden" parameters. The compiler has to pass the information on where the object is to be created to the constructor, OR the constructor gets a "thunk" as a parameter that it can call with the size of the object to be created, and then later return the address of the created object. (I deliberately said address. This may be an object of an anonymous type in some of the cases of interest, and in others not. It is really a return "by reference" in the constructor case.)

As I see it we are back to the need for a new syntax, but with one difference worth noting. I don't see any simple way to combine the constrained constructor and constrained object from the compiler's point of view. I also don't see any reason to confuse users by trying to mix the two cases. So as I see it, we can:

1) go with the easy cases and make (or keep) the other cases illegal.

2) allow the constrained object case with new syntax.

3) allow the constrained constructor case with new syntax.

4) allow both of the above with different syntax.

I guess I favor 3), with a syntax something like:

    constructor returns ;

We can discuss where constructors can appear separately, but I think it is pretty obvious: in an object declaration after the :=, and in an allocator after new.
That limits the compiler work, and makes it clear that these subprograms are special.

*************************************************************

From: Tucker Taft
Sent: Sunday, November 16, 2003 9:50 PM

I will admit it is a bit frustrating that most ARG notes end up getting more comments off-topic than on. ;-) I sent three messages over the weekend. There has been only one truly on-topic response, I believe. And that one was essentially a completely new proposal. Oh well. I'm as guilty as anyone...

*************************************************************

From: Robert I. Eachus
Sent: Monday, November 17, 2003 4:32 AM

If you are referring to my post, I guess you are right. I was trying to evaluate your proposal (renaming of procedures as functions) and thought the wider context discussion was implicitly opened by your proposal, and we should resolve the issue.

As far as I am concerned, you tabled a proposal that solved what I called the "constrained object" case, and either ignored or implicitly ruled the "constrained constructor" case out of bounds. (The intersection of the two, the "easy" cases, should fall out of any workable solution.)

As part of drawing this distinction, I thought it was a good idea to have a proposed notation on the floor for the "constrained constructor" case as well. So I put up a stalking horse. All I was really trying to do was to make it clear that, in your proposal, the constraints on the created object would be passed into the constructor function through the out parameter.

I certainly like your mapping of the constrained object case to something that current compilers support, so it is mostly front-end work for compiler developers. However, I do think that in a final proposal, that mapping would be better left implicit. I think that the "extra" work that compilers would have to do to allow these special-purpose procedures to be called AS procedures (with an already initialized object) would not be doing users any favors.
But current compilers also support the case (for non-limited types) of constructor functions where the constructor determines the bounds of the object. It would be nice to "open a window" so that when the full type is non-limited constructor functions can be made visible. For compilers that currently support creating the return value "in place" such a solution would work for types with say task components as well. There were constrained constructor proposals discussed previously with respect to AI-318, so I didn't see any new ground being opened. And I hope it was clear that I think it will place an undue burden on current compilers unless any solution to the constrained object or constrained constructor cases uses new syntax so that the implementor is clued in to what is going on as early as possible. ************************************************************* From: Randy Brukardt Sent: Thursday, December 4, 2003 8:22 PM In order to give Tucker the technical comment he craves: He suggested 4 problems: - Limited types are hard to use. - It is sometimes desirable to create a single constructor operation Sometimes? If it was possible, it would always be desirable. Certainly, that is how Janus/Ada works internally (every record type has a single constructor "thunk"), and it's nasty that users can't do that as well. Anyway, these two are the same, given that these items would only be allowed in "constructor" locations. This is the problem we're trying to solve. - Anonymous access types help prevent proliferation of named access types True, but only worthwhile if there are no complications. AI-230 only got finished once all of the complications were removed. The same holds here. - It is not straightforward to write a function that allocates in a caller-determined storage pool. True, but is this a real problem? And if it is, I'd prefer to fix it with better support for pools in generics, not messing around here. 
> In fact, if we restrict ourselves to cases where the result subtype is > *definite*, then the procedure-renamed-as-function solution for > "returning" limited objects looks more attractive. OK by me. > It is somewhat interesting to note that the "Initialize" procedure > of limited controlled types can be thought of like one of these > procedures renamed as functions, where the default initialization > for controlled types is roughly equivalent to ":= Initialize;" > presuming: > > function Initialize return Lim_Cntrl_Type renames procedure Initialize; > > In other words, the Initialize procedure for a controlled type is > called at the same point that one of these procedures-renamed-as-function > would be called for a limited object initialized by a call on > such a function. > > Presumably if the limited object is both controlled > and has an initialization specified by a function call, then Initialize > is called immediately prior to calling the explicitly specified > procedure-renamed-as-function, to ensure the object gets properly > default initialized before being manipulate by "user" code. Umm, no. These ought to work like aggregates, and those don't call Initialize. We certainly want to be able to replace an aggregate with a constructor function, and vice versa. So if you have a user-defined constructor, it is the complete constructor; Initialize is called only for default initialized objects. ... > Given a reduced need for the extended return statement, then we can > go to a relatively simple proposal for returning anonymous access types, > namely require that the caller pass in a storage pool and an associated > accessibility level, which would be used only if the expression of a > return statement is an allocator, or a call on another such function. > An accessibility level would be included with the returned object. > > The "only" new syntax for this would be allowing an access_definition to > appear for the result subtype of a function. 
I don't like the idea that a lot of runtime expense and compiler complication is hidden behind the use of a single keyword ("access") which normally is quite cheap. And I don't see the need (I don't see the need for any access parameters, when you get right down to it.) ************************************************************* From: Tucker Taft Sent: Thursday, December 4, 2003 8:48 PM Randy, Thanks for the feedback. > ... > > In fact, if we restrict ourselves to cases where the result subtype is > > *definite*, then the procedure-renamed-as-function solution for > > "returning" limited objects looks more attractive. > > OK by me. Good. > ... > > Presumably if the limited object is both controlled > > and has an initialization specified by a function call, then Initialize > > is called immediately prior to calling the explicitly specified > > procedure-renamed-as-function, to ensure the object gets properly > > default initialized before being manipulate by "user" code. > > Umm, no. These ought to work like aggregates, and those don't call > Initialize. We certainly want to be able to replace an aggregate with a > constructor function, and vice versa. > > So if you have a user-defined constructor, it is the complete constructor; > Initialize is called only for default initialized objects. I don't think this really works. If the type is limited private, and there is an Initialize procedure, it should definitely be called to initialize the object properly. You can't rely on some arbitrary user-written procedure to do the right thing as far as maintaining reference counts, etc. It is *not* the same case as with an aggregate. Those are only permitted on non-private types, and hence can be treated as an "inside the abstraction" operation, which can be relied upon to initialize reference counts, etc., properly. 
The procedure-as-function can be declared anywhere, and so must be treated as an "outside the abstraction" operation, which cannot be relied on to preserve the invariant required by Initialize/Finalize. > > ... > > Given a reduced need for the extended return statement, then we can > > go to a relatively simple proposal for returning anonymous access types, > > namely require that the caller pass in a storage pool and an associated > > accessibility level, which would be used only if the expression of a > > return statement is an allocator, or a call on another such function. > > An accessibility level would be included with the returned object. > > > > The "only" new syntax for this would be allowing an access_definition to > > appear for the result subtype of a function. > > I don't like the idea that a lot of runtime expense and compiler > complication is hidden behind the use of a single keyword ("access") which > normally is quite cheap. This doesn't seem like a lot more overhead than other functions. Functions that return composite objects typically have implicit parameters, unconstrained array parameters have implicit parameters, and access parameters have implicit parameters, so it seems not so big a surprise that anonymous access returns require an implicit parameter. This implicit parameter would presumably be a reference to a storage pool and an accessibility level. This can often be a statically initialized data structure, since most storage pools are declared at the library level. > ... And I don't see the need (I don't see the need for > any access parameters, when you get right down to it.) Well if you don't see the need for access parameters, then obviously you don't see the need for access returns. But if you have access parameters, which we already have and which are the basis for the whole AI on anonymous access types, and that adds access components and access renames, then it seems to leave a language hole to not include access returns. 
> Randy. (I suspect this will get more juices flowing, so thanks for starting off the discussion.) ************************************************************* From: Randy Brukardt Sent: Thursday, December 4, 2003 9:09 PM > The procedure-as-function can be declared anywhere, and so > must be treated as an "outside the abstraction" operation, > which cannot be relied on to preserve the invariant required > by Initialize/Finalize. Humm. It's true that such a constructor function can be declared anywhere, but I don't think that is a problem. The only thing that such a function (procedure really) could is to pass the object on to another function or procedure to construct it. The *real* constructor function/procedure would have to be declared inside of the abstraction, which of course could do all of the needed initialization. If this is a real issue, then Robert Eachus's solution of an explicitly declared constructor is a better idea, because that could be required to be a primitive of the type, and thus would have to be in the same place as Initialize and any aggregates. ************************************************************* From: Tucker Taft Sent: Thursday, December 4, 2003 9:59 PM I still don't think this works. Let's take an example: type Handle is new Limited_Controlled with record Ptr : Ptr_Type; X : Integer := 0; end record; procedure Initialize(H : in out Handle); procedure Finalize(H : in out Handle); procedure Some_Op(H : in out Handle); procedure Share_Object(Existing : in Handle; New_Ref : in out Handle); As things stand now, any primitive operation of the type Handle other than Initialize can assume that Initialize will have been called (let's presume it initializes Ptr to point to some heap object, and initializes the reference count of that to one). These operations should still be able to make the same assumption after this change. 
If someone should rename Some_Op or Share_Object to be a function, or declare their own procedure and rename it to be a function, then that shouldn't change the fact that Some_Op and Share_Object can safely assume that Ptr is non-null in all objects. I presume we all agree that they can all presume that the X component has been initialized to zero. It seems like the call on Initialize goes along with that. The analogy with an aggregate doesn't work. With an aggregate, we know that the programmer is forced to specify a value (possibly the "default" value) for every component. With the procedure-as-function proposal, all we are saying is that given "L : Lim := Func;" we know the object L is passed to the procedure renamed as Func, before proceeding to the next declaration. This by no means guarantees that all fields are properly initialized, unlike an aggregate. The renamed procedure might just return immediately, or might just set one component to have some special value. > > If this is a real issue, then Robert Eachus's solution of an explicitly > declared constructor is a better idea, because that could be required to be > a primitive of the type, and thus would have to be in the same place as > Initialize and any aggregates. Just making it a primitive doesn't fix the problem. If you create some special thing called a constructor, then presumably it can only be called as part of a declaration, and when called it must presume what? The object is totally uninitialized? The default initializations have occurred but Initialize has not been called? Pointers are nulled out but other default initializations have not occurred? I think you are wading into a morass. I believe this whole proposal flies only if it is kept very simple, with no complicated new semantics. I think specifying that default initializations, including a call on Initialize if appropriate, have occurred is quite reasonable. A useful model is T'Input. 
This is a function which is effectively a rename of the procedure T'Read for a type with a constrained first subtype. And the default implementation of T'Input must clearly default initialize the object before calling a user-provided T'Read procedure, because we don't want user-written code being handed an improperly initialized object. So I think default initialization is the right model for these other proposed cases of procedures renamed as functions. ************************************************************* From: Randy Brukardt Sent: Thursday, December 4, 2003 11:03 PM > If someone should rename Some_Op or Share_Object to be > a function, or declare their own procedure and rename > it to be a function, then that shouldn't change the fact > that Some_Op and Share_Object can safely assume that Ptr > is non-null in all objects. But you can't do that! It isn't reasonable for some random procedure to be used as a constructor. Only a purpose-built routine that initializes everything can be used as one, and it would never make sense to actually call it as a procedure. So this again argues for the Eachus solution. > I presume we all agree that they can all presume that > the X component has been initialized to zero. It seems > like the call on Initialize goes along with that. > The analogy with an aggregate doesn't work. With > an aggregate, we know that the programmer is forced to > specify a value (possibly the "default" value) for > every component. With the procedure-as-function > proposal, all we are saying is that given "L : Lim := Func;" > we know the object L is passed to the procedure renamed > as Func, before proceeding to the next declaration. > This by no means guarantees that all fields are properly > initialized, unlike an aggregate. The renamed procedure > might just return immediately, or might just set one > component to have some special value. I like this less and less. A constructor has to set everything, somehow. Anything else is madness. 
The problem seems to be that these aren't really constructors, they just can be used as them. You're saying that the default initializer does everything, and then the constructor has to come along and undo it. Consider a constructor for unbounded strings:

    function "+" (V : in String) return Unbounded_String....

(This is suspiciously similar to something that's been removed from AI-301.) with the actual type being:

    type Unbounded_String is new Ada.Finalization.Controlled with record
       Str : String_Access := new String'("");
    end record;

With your semantics, this could be a rename of Set_Unbounded_String. But in that case, there is no advantage to having it (or Set_Unbounded_String, for that matter) - because the default sized memory string has been allocated. "+" would have to deallocate the existing memory, then allocate a new one of the right size. That doesn't sound like a constructor to me; we'd be doing an extra allocation and deallocation every time. Might as well have just used a regular function. A real constructor would get the uninitialized object, examine its parameters, then allocate the proper sized string. Once. No deallocation involved. > > If this is a real issue, then Robert Eachus's solution of an explicitly > > declared constructor is a better idea, because that could be required to be > > a primitive of the type, and thus would have to be in the same place as > > Initialize and any aggregates. > > Just making it a primitive doesn't fix the problem. If you create > some special thing called a constructor, then presumably it can > only be called as part of a declaration, and when called it > must presume what? The object is totally uninitialized? > The default initializations have occurred but Initialize has > not been called? Pointers are nulled out but other default > initializations have not occurred? I think you are wading into > a morass. Been there, done that. This is precisely how Janus/Ada initializes all objects; there's no problem. 
The constructor gets discriminants and bounds, and everything else is uninitialized. This isn't a "morass", it's how constructors work. For Janus/Ada, we have to do this to allocate memory for any dynamically sized components (since we always allocate them to size). There currently are the standard three cases: default construction (which default initializes everything), aggregates (which of course do their own initialization, bypassing the top-level constructor, but components use the copy constructor), and (for non-limited types) copy construction (which gets the values from the copied object). Thinking about that, I don't immediately see how to implement either of these proposals. I guess your proposal would do the full default construction followed by Initialize before even calling the constructor. That seems both limiting and time consuming (you're doing everything twice including the allocation/deallocation of memory, and you're stuck with the initial bounds/discriminants in all cases). The Eachus proposal doesn't leave any place to allocate the memory - you can't do it until you know the bounds/discriminants, but by the time that you do, you're in user-defined code and it is too late to do anything. I suppose that means that we really do have to limit ourselves to cases where the constraint is known. Then, the old (now abandoned) "make" constructor (which only allocated memory) would do the trick. The make constructor was abandoned because it didn't work for components with mutable discriminants: the memory to allocate can only be determined on the assignment. We had hacked around that for years, but I finally got fed up with it, and blew it away completely last year. One point for Tuck. :-) > I believe this whole proposal flies only if it is kept very > simple, with no complicated new semantics. I think > specifying that default initializations, including a call > on Initialize if appropriate, have occurred is quite reasonable. 
I think that we are talking inconsistent new semantics no matter what we choose. Just because it is simple to describe to an implementor doesn't mean that it makes much sense. Consider a non-limited constructor:

    package P is
       type T is new Ada.Finalization.Controlled with ...;
       procedure Tuck_Constructor (Obj : in out T);
       function New_Constructor return T renames procedure Tuck_Constructor;
       function Old_Constructor return T;
       procedure Initialize (Obj : in out T);
       procedure Adjust (Obj : in out T);
    end P;

    O1 : T; -- Calls Initialize.
    O2 : T := (Controlled with ...); -- Calls nothing.
    O3 : T := Old_Constructor; -- Calls Adjust.
    O4 : T := New_Constructor; -- Calls Initialize???

O3 and O4 sure look the same; it doesn't look good to have them do vastly different things. But of course, we can't have O4 call Adjust (in part because if we change T to be limited, there is no Adjust). Besides, Adjust would do the wrong thing, because it is expecting a fully initialized object. I don't think that there can be a solution to this confusion (unless we abandon the ":=" notation for constructors, but that seems to be too large a change). > A useful model is T'Input. This is a function which > is effectively a rename of the procedure T'Read for > a type with a constrained first subtype. And the default implementation > of T'Input must clearly default initialize the object before calling a > user-provided T'Read procedure, because we don't want user-written > code being handed an improperly initialized object. See, I don't agree with this at all. A T'Read that doesn't initialize all of the non-discriminant components of whatever it is handed is just plain wrong. It certainly shouldn't be depending on anything it is handed. Now, it is true that the language cannot enforce a requirement that T'Read or a constructor actually initializes everything, or that it doesn't read anything it didn't set, but that is what they're intended to do. 
T'Read is just a specific constructor, and it shouldn't be doing expensive default initialization which will immediately be overwritten. > So I think default initialization is the right model for these other > proposed cases of procedures renamed as functions. I won't argue that from an implementation perspective, but I think it would be confusing as heck to users. ************************************************************* From: Robert I. Eachus Sent: Thursday, December 4, 2003 11:48 PM Tucker Taft wrote: > Randy Brukardt wrote: > >> If this is a real issue, then Robert Eachus's solution of an explicitly >> declared constructor is a better idea, because that could be required >> to be >> a primitive of the type, and thus would have to be in the same place as >> Initialize and any aggregates. > I am beginning to accept that you are right. The renaming of procedures as functions looks cleaner to start with, but it still requires syntax changes. But worse, you do have this issue that when the function body is being compiled, there will be no explicit checks done. Users could call other operations that assumed that the object had already been initialized, with potentially damaging results. We could just say that the programmers should be careful in this case, but that is not the normal Ada approach. > Just making it a primitive doesn't fix the problem. If you create > some special thing called a constructor, then presumably it can > only be called as part of a declaration, and when called it > must presume what? The object is totally uninitialized? > The default initializations have occurred but Initialize has > not been called? Pointers are nulled out but other default > initializations have not occurred? I think you are wading into > a morass. I don't see a morass. The constructor would create a value of the limited type, but the object could not be accessed until after the constructor returned. Inside a function you have a value that you will return 'in-place'. 
In the case of a constructor, that would be the only use allowed. During the call to the constructor, the object being initialized cannot be referenced, either by the constructor or otherwise. You can certainly create an object of the type inside the constructor, and it will get default initialization. But that object is not the object being initialized. With Tucker's renaming of a procedure, inside the body the out parameter has a name, and can be accessed. This is the major difference between the two approaches. (Or better, it is a side-effect of the fact that in the renamed procedure case, the bounds and other attributes can come from the target object, in my approach the object is only namable after the constructor returns.) As I said before, the two approaches are different in what they allow. I personally can live with either set of constraints. But I think this discussion has shown a serious problem with the renamed procedure approach. We would need special rules to cover what can be done with the out parameter inside the procedure to be renamed, or special semantics for references to that object. But allowing it to be passed as a parameter to some other subprogram would seem to open an entire Pandora's box. Saying that some default initialization for the object occurs before the user-defined code doesn't solve anything, or creates a new set of issues. Let's imagine a limited type that by default allocates some space on the heap. You want to extend this type, and create an explicit constructor for it. If the 'default' initialization occurs, and then you are going to stick another value in, you are going to have to deallocate the already assigned memory (or reuse it). But the extension type may not be able to see all the fields of the parent. Take Limited_Controlled for a horrible example. 
I keep coming back to the idea that in the 'normal' constructor case, if there is such a thing, the programmer will create an object of some ancestor type 'in place' to do some of the initialization, and then do whatever special bit fiddling is required in the parts of the type that he or she can name. But which ancestor type needs to be used is a decision that needs to be made by the programmer, not by the language. (For a type derived from Limited_Controlled, this type might be Limited_Controlled, but it is much more likely to be the parent, or possibly a grandparent.) The return statement of the constructor should not be required to be an extension aggregate, but in most cases it will be anyway. ************************************************************* From: Pascal Leroy Sent: Friday, December 5, 2003 8:59 AM > If this is a real issue, then Robert Eachus's solution of an > explicitly declared constructor is a better idea, because > that could be required to be a primitive of the type, and > thus would have to be in the same place as Initialize and any > aggregates. If we think that adding constructors to the language is important, then, yes, let's do something like what Eachus suggested. I really don't like Tucker's proposal which looks like a wart (or many warts) to me. The renaming of a procedure as a function is weird, and the dynamic semantics of the constructor being called after initialization has taken place doesn't make any sense to me, as the constructor would have to undo the initialization. I agree with Randy that constructors should in general work like aggregates. ************************************************************* From: Robert I. Eachus Sent: Thursday, December 4, 2003 11:48 PM >But you can't do that! It isn't reasonable for some random procedure to be >used as a constructor. Only a purpose-built routine that initializes >everything can be used as one, and it would never make sense to actually >call it as a procedure. 
So this again argues for the Eachus solution. Up until now, I have been viewing my role here as helping to define the problem. But now I think that I see a way to handle both difficult (and of course more useful) cases. First go with a special name/syntax for constructors. Second, insist that constructors can only be used to create objects. It may be useful to permit a constructor to appear as an (in) parameter in a call, but I really don't see that as all that important or useful. But the cases that have to be outlawed are the uses of a constructor in the prefix of a name. Next within a constructor, allow attributes of "return" to be queried. (Or you can use some other name, for example the name of the constructor itself, which does less damage to existing parsers, so I'll use that in my examples.) Now I can write (using String to make a point):

    constructor Blanks(Length: Integer := 0) return String is
    begin
       if Blanks'Constrained
       then
          declare
             Result: String(Blanks'Range) := (others => ' ');
          begin
             return Result;
          end;
       else
          declare
             Result: String(1..Length) := (others => ' ');
          begin
             return Result;
          end;
       end if;
    end Blanks;

The amount of work in the semantic phase of compilers would be significant, but it has many advantages from a language point of view: Constructors are special and can be declared as such. Functions can serve as constructors where the special features are not needed. (I could write a Blanks function that handled the unconstrained case and a parameterless constructor for the constrained case. In fact, in this case I probably would, but I wrote this as an example of the syntax and semantics.) Programmers don't have to learn any 'weird' new syntax or semantics. The only two things that are not intuitive about this proposal are the name to use as the prefix of attributes and components, and the new reserved word "constructor." >I like this less and less. A constructor has to set everything, somehow. >Anything else is madness. 
> >The problem seems to be that these aren't really constructors, they just can >be used as them. I am in complete agreement. I think it is possible to do constructors right. Whether the amount of work involved in doing them right can be justified in this language revision is a different question. I think the answer is in the affirmative. But there is no justification for adding something that looks like a kludge to programmers and doesn't solve the whole problem. The renaming of a procedure as a function as such probably doesn't reach the kluge level. But the things we are now discussing to make it 'work' certainly do. It may be that there is a better fix to be found, but I think allowing attributes of the result to be used in constructors looks like the most elegant solution. ************************************************************* From: Tucker Taft Sent: Friday, December 5, 2003 5:03 PM I think you are definitely killing this proposal with complexity. Let's look at typical ways of writing "constructor" functions: function make_blah1(x, y : params) return blah is Result : blah; begin Result.x := x; Result.y := y; Massage(Result); return Result; end make_blah1; function make_blah2(x, y : params) return blah is Result : blah := (x => x, y => y, z => 0); begin Massage(Result); return Result; end make_blah2; function make_blah3(x, y : params) return blah is Result : blah := some_other_func(x); begin Result.y := y; Massage(Result); return Result; end make_blah3; function make_blah4(x, y : params) return blah is begin return (x => x, y => y, z => 0); end make_blah4; I would tend to use make_blah1 if there are a lot of components, and the default initial value of most of the components is fine. I would tend to use make_blah2 if there are relatively few components, and I want to be sure that if a new component is added, I am forced to update my code. 
I would tend to use make_blah3 if there is an existing constructor that does about the right thing, but I want to tweak the result a bit. I would tend to use make_blah4 in the same circumstances as make_blah2, when there is no complex work to be done to build up the desired value. Of course as things exist now, none of these functions could be used with limited types. For limited types we are forced to use procedures to do all "construction", or possibly use discriminants as effectively parameters to the Initialize procedure. In all these cases, the object will undergo most if not all default initialization before we get our hands on it. Ada 2005 will hopefully allow us to use aggregates, but that is no use if the limited type is also private, which is exactly when you want to be able to have "constructor" operations. So as things are now, programmers are used to the idea that limited types get default initialized, and then you have to be sure to call the constructing "procedure" as soon as possible so that no access to the object occurs before it is initialized as intended. What this proposal was trying to create was a situation where the desired initialization can be specified at the point of creation, to ensure no inappropriate "early" access is possible. This proposal becomes even more important if we add limited aggregates, because having to insert a call on a separate initialization procedure for each of the limited components of a limited aggregate is going to be a real pain. We really want to have some way of specifying the initialization at the place of the component in the aggregate. Similarly, an initialized allocator for limited types is very limiting if the only thing we can use is an aggregate. Something with the syntax of a function call would be very useful, because it could be placed in all the contexts where we want to specify the (extra) initialization that is to be performed before further access is permitted. 
Note that one of the things that makes a limited type different from a non-limited type is the sense that it has identity, and it is connected to perhaps some entity that can't easily be represented by a few bits in a record (e.g. a thread of control, or a mutex, or some external resource like a window on a screen). These kinds of limited objects must undergo their "default" initialization, before the programmer gets their hands on them, and clearly the programmer isn't going to override all of the state of the component. They are just going to "tweak" the state of the component in some way, perhaps, and a procedure is just the thing for doing this. E.g., an initialization procedure-as-function for a task might call an entry with some initializing values. It certainly can't "fully initialize" the task. That is not meaningful for limited objects in general. So... I really think the idea of default-initialization-plus-user-specified procedure-to-tweak-the-initial-state is exactly what we want for limited types. I think we should allow these procedures renamed as functions for non-limited types as well (contract model and all that), but clearly they would normally only be used if the intended constructor was similar to make_blah1 or make_blah3 when some_other_func was a similar constructor. In that case, writing make_blah1, make_blah3, and some_other_func as procedures, and then renaming them as functions, could produce identical semantics. In any case, so long as it is clear that the semantics of a procedure renamed as a function is that the returned object is the result of applying the procedure to a default-initialized object, the semantics are well-defined and easy to explain. Having to define a new kind of program unit that via data flow rules or whatever we make sure that no component is referenced before it is properly initialized, reminds me of the "out" parameter morass of Ada 83, on steroids. 
For aggressive optimizers, when a procedure is renamed as a function, it could actually generate the function with a default-initialized "result" object, and then inline the call on the procedure, and then proceed to remove all redundant default initializations of the fields of the object. So back to my initial point. If we want to provide the ability to specify how a limited object should be initialized at its point of creation, let's not kill it with kindness and complexity. The semantics of a function result being equivalent to applying a procedure to a default-initialized object are well-defined, no worse than what is available today for limited types, appropriate to the underlying principle of limited types as having some amount of unalterable state bound up with its identity, and safer and friendlier for the user than forcing a separation between creation and initialization. ************************************************************* From: Dan Eilers Sent: Friday, December 5, 2003 5:42 PM Robert Eachus wrote: > Now I can write (using String to make a point): > > constructor Blanks(Length: Integer := 0) return String is > begin > if Blanks'Constrained > then > declare > Result: String(Blanks'Range) := (others => ' '); > begin > return Result; > end; > else > declare > Result: String(1..Length) := (others => ' '); > begin > return Result; > end if; > end Blanks; It seems to me that in most cases, the bounds of the return type are determined by the constructor's input parameters. This permits always allocating space for the return object prior to the call, even when the constructor is used in a larger expression. Your example of Blanks is such a constructor, as is concatenation, matrix multiply, etc. 
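[Illustrative NOTE: a minimal sketch, in existing Ada 95 syntax, of what "allocating space for the return object prior to the call" amounts to when the bounds depend only on the inputs. The caller computes the bounds, declares the object, and passes it in to be filled in place. Concat_Into is a hypothetical helper, not part of any proposal.]

```ada
with Ada.Text_IO;

procedure Presized_Sketch is

   --  Hypothetical helper: fills a caller-provided result in place.
   --  Assumes Result'Length = X'Length + Y'Length.
   procedure Concat_Into (X, Y : in String; Result : out String) is
   begin
      Result (Result'First .. Result'First + X'Length - 1) := X;
      Result (Result'First + X'Length .. Result'Last)      := Y;
   end Concat_Into;

   A : constant String := "limited ";
   B : constant String := "types";

   --  The bounds are known from the inputs alone, so the space
   --  exists before the call is made.
   R : String (1 .. A'Length + B'Length);

begin
   Concat_Into (A, B, R);
   Ada.Text_IO.Put_Line (R);
end Presized_Sketch;
```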
If the syntax of constructors allowed the return type to be constrained by the parameters, then we could simplify your example to:

    constructor Blanks(Length: Integer := 0) return String(1..length) is
    begin
       return (others => ' ');
    end Blanks;

or for "&",

    constructor "&"(x,y: string) return result: String(1..x'length+y'length) is
    begin
       result(1..x'length) := x;
       result(x'length+1..result'last) := y;
       return result;
    end "&";

*************************************************************

From: Robert A. Duff
Sent: Friday, December 5, 2003 7:04 PM

> It seems to me that in most cases, the bounds of the return type are
> determined by the constructor's input parameters. This permits always
> allocating space for the return object prior to the call, even when the
> constructor is used in a larger expression.

This is something I've been wanting for years.

*************************************************************

From: Randy Brukardt
Sent: Friday, December 5, 2003 8:02 PM

> I think you are definitely killing this proposal with complexity.

I don't know who "you" refers to here, but I don't think the proposal is that complex. Moreover, if it isn't complex enough to solve the problem effectively, it isn't worth doing.

If we're not willing to adopt AI-318 as it currently stands (meaning the change in calling conventions), then I think we need to step back and take a good look at the problem.

We started by thinking we wanted limited functions. But those make no sense unless they are a constructor. They're a standard ADT feature that is poorly supported in Ada, even for non-limited types. So it makes good sense to look at this as a constructor.

The problem is that all Ada types give very little control over initialization. I'd like to be able to write a version of Unbounded_Strings that doesn't repeatedly create useless junk (or have distributed overhead to avoid the useless junk).
> So as things are now, programmers are used to the idea that limited types > get default initialized, and then you have to be sure to call the > constructing "procedure" as soon as possible so that no access to > the object occurs before it is initialized as intended. What programmers are used to is the idea that limited types sound good in theory, but are useless in practice. That's one of the reasons that windows in Claw are non-limited, even though it would make more sense for them to be limited. I don't think I've even declared a single limited type since Claw, since it is so clear that they're useless for ADTs. If you have to trust the user of the ADT to do something, either the language or the ADT (or both) is broken. That's precisely the model we need to get away from. Let's look specifically at the "complexity" of this counter proposal. It needs a new form of subprogram, the constructor. This is clearly not a function or procedure. But it reuses most of the syntax and rules from those. Is this a big deal? I don't think so: AI-348 adds just such a program unit, the null procedure. (Which is definitely not a "normal" procedure - it is an entirely new kind of unit, at least as the rules are written. Just as an abstract subprogram was in Ada 95 - which didn't cause an uproar.) It might make sense to limit constructors to be primitive operations of the result type, but I'm not certain that is necessary. The constructor would use the return do construct proposed in AI-318. This looks like: return identifier : subtype_indication [:= expr] [do handled_sequence_of_statements; end [identifier]]; This would only be allowed in a constructor. Initialization (default or explicit) would take place when this (compound) statement is evaluated (not before). Since AI-287 gives us not only limited aggregates, but default initialized components in those aggregates, it is possible to use an aggregate for any type if we need to avoid default initialization. 
[Aside: it would make sense to allow (others => <>) for any object, even for private types. That would give a way to explicitly default initialize a component without having to specify all of the other components as well.]

The sequence of statements would allow later massaging.

I'd prefer that if the initializing expression is omitted, top-level default initializations are also omitted. But this isn't essential (and it does complicate things a bit). I'm sure Tucker will feel better if it is avoided.

Calls on constructors would be limited to places where aggregates are allowed. If we allowed this in the general case, then some memory allocation would have to be supported at the point of the return statement. (This is a certainty for Janus/Ada, no matter what.) That means that we'd probably need to pass in a storage pool or some other representation, as previously outlined. But since this is a new feature, there cannot be a compatibility problem. We could mitigate this problem somewhat by limiting the result subtype to be constrained (it would still allow everything if a wrapper record was used), but I doubt that would help enough to justify the oddity.

> Let's look at typical ways of writing "constructor" functions:

OK, let's:

    constructor make_blah1(x, y : params) return blah is
    begin
       return Result : blah := (others => <>) do
          -- I'm explicitly showing the default
          -- initialization here, but that's not
          -- necessary.
          Result.x := x;
          Result.y := y;
          Massage(Result);
       end Result;
    end make_blah1;

    constructor make_blah2(x, y : params) return blah is
    begin
       return Result : blah := (x => x, y => y, z => 0) do
          Massage(Result);
       end Result;
    end make_blah2;

    constructor make_blah3(x, y : params) return blah is
    begin
       return Result : blah := some_other_func(x) do
          -- Some_other_func(x) better be a
          -- constructor, at least if blah
          -- is limited.
          Result.y := y;
          Massage(Result);
       end Result;
    end make_blah3;

    constructor make_blah4(x, y : params) return blah is
    begin
       return Result : blah := (x => x, y => y, z => 0);
    end make_blah4;

These all look good to me. Moreover, the example I gave last night would work without double construction:

    type Unbounded_String is new Ada.Finalization.Controlled with record
       Str : String_Access := new String'("");
    end record;

    constructor "+" (V : in String) return Unbounded_String is
    begin
       return Result : Unbounded_String :=
          (Ada.Finalization.Controlled with Str => new String'(V));
    end "+";

> What this proposal was trying to create was a situation where the
> desired initialization can be specified at the point of creation, to
> ensure no inappropriate "early" access is possible. This proposal
> becomes even more important if we add limited aggregates, because
> having to insert a call on a separate initialization procedure for each
> of the limited components of a limited aggregate is going to be a real
> pain. We really want to have some way of specifying the initialization
> at the place of the component in the aggregate. Similarly, an initialized
> allocator for limited types is very limiting if the only thing we can
> use is an aggregate. Something with the syntax of a function call would
> be very useful, because it could be placed in all the contexts where
> we want to specify the (extra) initialization that is to be performed
> before further access is permitted.

Exactly. That's what constructors are for. But they're clearly not functions (at least not "normal" functions), and they certainly aren't procedures.

> Note that one of the things that makes a limited type different from
> a non-limited type is the sense that it has identity, and it is connected
> to perhaps some entity that can't easily be represented by a few bits
> in a record (e.g. a thread of control, or a mutex, or some external
> resource like a window on a screen).
> These kinds of limited objects
> must undergo their "default" initialization, before the programmer
> gets their hands on them, and clearly the programmer isn't going to override
> all of the state of the component. They are just going to "tweak" the
> state of the component in some way, perhaps, and a procedure is
> just the thing for doing this. E.g., an initialization procedure-as-function
> for a task might call an entry with some initializing values. It certainly
> can't "fully initialize" the task. That is not meaningful for limited
> objects in general.

Of course it is. There are some limited objects/components which have to be default initialized. We've got a syntax for doing that; there is no problem.

Let me say right now that I'm really only interested in constructors for ADTs. And I think that virtually all ADTs ought to be controlled. Given how controlled works in Ada 95, that means I'm only interested in types that are (ultimately) derived from one of the types in Ada.Finalization. Whatever we come up with ought to make some sort of sense for other types, but it is not at all important that it is useful. (Which is why I don't care if constraints work; those shouldn't generally be visible anyway.)

In any case, what you described would be written as:

    constructor Run_It (My_Id : in String) return Tucks_Tasks is
    begin
       return Result : Tucks_Tasks := (others => <>) do
          Result.Start_Up (My_Id);
       end Result;
    end Run_It;

> So back to my initial point. If we want to provide the ability to
> specify how a limited object should be initialized at its point of
> creation, let's not kill it with kindness and complexity.

No, but we at least had better be able to do the job. If we cannot avoid default initialization, ADTs will have to have a convoluted design internally to avoid excessive costs -- but that adds a distributed overhead to the use of them which may be unacceptable.
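[Illustrative NOTE: a minimal sketch, in existing Ada 95 syntax, of the kind of internal workaround meant here, under the assumption that the type defers its allocation by defaulting to null. Every operation then has to test for the missing data, which is the distributed overhead. Lazy_String and its operations are hypothetical names; the type is deliberately not controlled, so this sketch never frees its storage.]

```ada
with Ada.Text_IO;

procedure Lazy_Sketch is

   type String_Access is access String;

   --  Default initialization allocates nothing, so no junk is created.
   type Lazy_String is record
      Str : String_Access := null;
   end record;

   function To_Lazy (V : in String) return Lazy_String is
   begin
      return (Str => new String'(V));
   end To_Lazy;

   --  Each operation must be prepared for a missing data component.
   function Length (S : in Lazy_String) return Natural is
   begin
      if S.Str = null then
         return 0;
      end if;
      return S.Str'Length;
   end Length;

   function To_String (S : in Lazy_String) return String is
   begin
      if S.Str = null then
         return "";
      end if;
      return S.Str.all;
   end To_String;

   Empty : Lazy_String;
   Full  : constant Lazy_String := To_Lazy ("abc");

begin
   Ada.Text_IO.Put_Line
     (Natural'Image (Length (Empty)) & Natural'Image (Length (Full)));
   Ada.Text_IO.Put_Line (To_String (Full));
end Lazy_Sketch;
```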
> The semantics of a function result being equivalent to applying
> a procedure to a default-initialized object are well-defined,

Sure.

> no worse than what is available today for limited types,

Boy, this is a strong endorsement. :-) I want to waste time implementing something that is "no worse than what we have today".

> appropriate to the underlying principle of limited types as having some amount
> of unalterable state bound up with its identity,

Any proposal I've seen meets this requirement.

> and safer and friendlier for the user than forcing a separation between
> creation and initialization.

Huh? That's precisely what we're trying to avoid with constructors: to be able to eliminate the separation between creation and initialization. Certainly, making a separate procedure call at some unspecified later time is a horrible separation. And doing the default initialization when it is inappropriate is also an unnecessary separation.

*************************************************************

From: Jean-Pierre Rosen
Sent: Friday, December 5, 2003 3:59 PM

From: "Robert I. Eachus"
>[...]
> and the new reserved word "constructor."

We could call this "limited function" and save a keyword... Seems to carry the spirit.

*************************************************************

From: Robert A. Duff
Sent: Saturday, December 6, 2003 9:01 AM

Randy, in reply to Tuck, writes:

> > I think you are definitely killing this proposal with complexity.
>
> I don't know who "you" refers to here,

I didn't understand that either. Because Tuck didn't say "... writes". You (Randy) followed suit. ;-)

> What programmers are used to is the idea that limited types sound good in
> theory, but are useless in practice.

I've heard this vague claim many times. Could you be more specific? My feeling is that the claim is true exactly because of this initialization problem, and if we solve that, limited types would be very useful indeed.
Are there *other* issues that you or others think make limited type useless in Ada? (I also have limited types with default initialization, and/or kludgy discriminant hackery, where constructor functions of some sort would be cleaner.) >... And I think that virtually all ADTs ought to be controlled. Unfortunately, that's not feasible if you care about efficiency, given the huge overhead most compilers have for controlled types. You can stick to controlled types if you like, but I think it's a bad idea to assume (for the language design) that the only types that need constructors are controlled. I have lots of limited types in my current project, and lots of types that *would* be limited if only I could initialize them nicely. But I have very few controlled types; they're simply too inefficient. (Actually, I guess I should say I have very few controlled *objects*. That is, if I have a type where I'm creating and destroying thousands or millions of objects of the type, I can't make it controlled, because it's too slow. The number of *types* is irrelevant to this efficiency issue.) ************************************************************* From: Tucker Taft Sent: Saturday, December 6, 2003 9:39 AM > Randy, in reply to Tuck, writes: > > > > I think you are definitely killing this proposal with complexity. > > > > I don't know who "you" refers to here, > > I didn't understand that either. Because Tuck didn't say "... writes". > You (Randy) followed suit. ;-) I thought it was obvious I was talking about Randy and Robert Eachus, who were proposing a completely new kind of program unit, namely a constructor. I could see adding a reserved word to make it clear that a given function was designed to create a new object, and the caller must allocate space for the object and initialize it at least to some extent before the call. I wouldn't limit the contexts in which such a function could be called. 
> > What programmers are used to is the idea that limited types sound good in > > theory, but are useless in practice. > > I've heard this vague claim many times. Could you be more specific? My > feeling is that the claim is true exactly because of this initialization > problem, and if we solve that, limited types would be very useful > indeed. ... The big question for me is then whether you feel the admittedly limited (;-) capability provided by procedures renamed as functions would be adequate. ************************************************************* From: Robert A. Duff Sent: Saturday, December 6, 2003 2:05 PM Tuck says: > > Randy, in reply to Tuck, writes: > > > > > > I think you are definitely killing this proposal with complexity. > > > > > > I don't know who "you" refers to here, > > > > I didn't understand that either. Because Tuck didn't say "... writes". > > You (Randy) followed suit. ;-) > > I thought it was obvious I was talking about Randy and Robert Eachus, > who were proposing a completely new kind of program unit, namely a > constructor. For those of us who might read these things out of order, or weeks/months/years later, a line at the front saying "so-and-so said, ..." would be useful. > I could see adding a reserved word to make it clear that a given > function was designed to create a new object, and the caller must > allocate space for the object and initialize it at least to > some extent before the call. I wouldn't limit the contexts in > which such a function could be called. > > > > What programmers are used to is the idea that limited types sound good in > > > theory, but are useless in practice. > > > > I've heard this vague claim many times. Could you be more specific? My > > feeling is that the claim is true exactly because of this initialization > > problem, and if we solve that, limited types would be very useful > > indeed. ... 
> > The big question for me is then whether you feel the admittedly > limited (;-) capability provided by procedures renamed as functions > would be adequate. Sorry, but I don't understand the details well enough to be sure. I don't understand the limitations. I've lost track. I just went back and re-read AI's 318 and 325, but they seem to have obsolete proposals. I also re-read your original e-mail on this subject, and lots of others, but I still don't have a good understanding of what all the proposals are, in detail. I suspect the answer is, "Yes, the func-renames-proc thing is good enough". But I don't understand the details well enough to distinguish it from the "constructor(...) return ..." syntax. Is it just an argument as to which syntax is more sugary, and which more sour? Despite my current ignorance, I'll offer some comments: Constructors ought to be composable. That is, clients should be able to write constructors, given primitive constructors. For example, if a "Sequence" package gives the client a way to construct a singleton sequence, given one element, and a way to concatenate Sequences, the client ought to be able to write a constructor that takes two Elements and produces a sequence of length 2. This is common in non-limited cases. Why not in limited? This implies that constructors cannot be required to be primitive ops. And therefore that such constructors cannot see improperly initialized objects. ---- I like the idea that these new constructor functions are recognized somehow by the *spec*. ---- One way to write "bullet proof" abstractions is to forbid clients from creating uninitialized objects, by saying "type T(<>) is limited private". It would be nice if the abstraction could then export constructor functions, and clients could compose them, and/or call them. Clients cannot write "X: T;", so they ought to be able to write "X: T := Primitive_Constructor(...);", or declare their own composed constructor, and use that. 
Either way, the package itself has total control over object creation. The unknown discriminants do not *necessarily* mean the thing has truly unknown size. I'm not sure how important this is.

----

I like Dan Eilers' idea, of allowing a function result to be definite, but dependent on parameters, as in "function F(X: String) return String(1..X'Length)". It seems related to all this limited-type stuff, in that it allows the caller to know sizes of function results that would otherwise be "return String".

*************************************************************

From: Robert I. Eachus
Sent: Saturday, December 6, 2003 3:21 PM

Tucker Taft wrote:

>I thought it was obvious I was talking about Randy and Robert Eachus,
>who were proposing a completely new kind of program unit, namely a constructor.
>
>I could see adding a reserved word to make it clear that a given
>function was designed to create a new object, and the caller must
>allocate space for the object and initialize it at least to
>some extent before the call. I wouldn't limit the contexts in
>which such a function could be called.
>
>

If you think the limitation I proposed--that a constructor cannot be used as a prefix of a name--is a big deal, fine. But procedures can't be used in that context either, so I think that with the renaming approach you probably need that restriction too.

To me the constructor proposal without the attributes is slightly simpler than your (Tucker's) renaming proposal, from a compiler viewpoint. But they cover different cases of constructors. The renaming approach covers cases where the constraints on the object being initialized come from the target, whereas in the new subprogram type approach, the case where the bounds are determined by the constructor is easily handled. Adding the attributes, perhaps only specific attributes, to the new subprogram approach covers all constructor cases, but is probably more work than the renaming approach.
So, with my implementor hat on, I'd definitely want the context where constructors can appear limited. This fits nicely with using the constructor name in attributes. If the name always refers to the containing instance, then it is pretty clear that allowing the name as a call in prefixes would cause ambiguity.

As a user, I very much like the idea of covering all the cases with one new construct. And I really think it is worth calling it "constructor" not "limited function." I also like the idea of adding the "return Result: whatever do .. end;" construct. It is clear to me that in most constructors you will need a nested scope anyway, unless the constructor is equivalent to assigning default values to fields.

As for the brouhaha about initialization, I definitely see a use for constructors in replacing "junk" initializations. But allowing the default initializations to occur (unless there is an attribute assignment) is no big deal. The solution for the problems where the initial values are not wanted will be to not provide the defaults, just an Initialized: Boolean := False; if there is a potential problem.

The Limited_Controlled example is perfect. You want/need the implementation's default initialization of any implementation specific fields of objects derived from Controlled or Limited_Controlled. But if decent constructors are available, the creator of the type extension can just use constructors instead of default values for the difficult stuff.

This does have one implication that needs watching though--it is not a problem, but should be explicitly discussed. What happens if someone tries to assign a constructor to an already existing limited object? The three reasonable answers are that it is illegal, that Program_Error is raised, or that the current value is finalized (if it is controlled) and a new value/object is created. I favor making it illegal for a limited object, but Tucker seems to want something else.
(If it is illegal, then any discussion of whether the object is initialized is out of bounds.)

>The big question for me is then whether you feel the admittedly
>limited (;-) capability provided by procedures renamed as functions
>would be adequate.
>

At this point my answer is no. I think that the constructors without the attribute support are simpler to implement than the renaming approach, and much easier on the user.

The big issue to me is whether we need to support constructors that take their bounds from the target object. At this point I feel that it is worth doing, but a close call. However, the partial functionality of constructors without the attributes is to me significantly better than the renaming approach. It can end up requiring slightly more writing and overhead than the renamed procedure approach to create a constrained object of an unconstrained type with discriminants (and without a default for the discriminants):

    Fubar: Foo(Bar) := Make_Foo(Bar);

(You would have to explicitly repeat the discriminants once on the object, once as parameters to the constructor.)

However, the renaming approach does not handle the case where the constructor determines the constraints at all. To give an example I recently ran into, think about a constructor that converts a linked list into an array. For a non-limited type, no problem at all, create a temporary array object, and walk the list, if you run out of room in the temporary object, recur:

    type String_Array is array (Positive range <>) of Unbounded_String;
    type String_List is private;
    ...
    private
    type String_Pointer is access String_List;
    type String_List is record
       Value: Unbounded_String;
       Next: String_Pointer;
    end record;
    ...
    function To_String_Array(List: in String_List) return String_Array is
       Result: String_Array(1..10);
       Temp: String_List := List;
    begin
       -- List is a record, not a pointer, so it always holds at
       -- least one element; no test for an empty list is needed.
       for I in 1..10 loop
          Result(I) := Temp.Value;
          if Temp.Next = null then
             return Result(1..I);
          end if;
          Temp := Temp.Next.all; -- advance to the next node
       end loop;
       return Result & To_String_Array(Temp);
    end To_String_Array;

When writing this I was seriously concerned about the number of Unbounded_String objects being created and assigned, so this is probably a good test case for arguments about initializations. (It is possible to walk the list once counting and create the array "in place" if we add a way to do that.)

*************************************************************

From: Robert I. Eachus
Sent: Saturday, December 6, 2003 3:52 PM

Robert A Duff wrote:

>Sorry, but I don't understand the details well enough to be sure.
>I don't understand the limitations. I've lost track. I just went back
>and re-read AI's 318 and 325, but they seem to have obsolete proposals.
>I also re-read your original e-mail on this subject, and lots of others,
>but I still don't have a good understanding of what all the proposals
>are, in detail.
>
>I suspect the answer is, "Yes, the func-renames-proc thing is good
>enough". But I don't understand the details well enough to distinguish
>it from the "constructor(...) return ..." syntax. Is it just an
>argument as to which syntax is more sugary, and which more sour?
>
>

No, in the cases where there are no discriminants involved either proposal suffices. In the renaming approach the object has to be created before the constructor is called, which means that the bounds have to be known before (or during) the call. I just sent in an example where this is not really possible. In the new subprogram approach, we can either allow attributes to query the target, or the constructor will construct a value and there may be a Constraint_Check after the call.
This particular constraint check is ugly since it has to be made after the object is created, and it works best if there is an explicit copy of the result into the object. This is why I like the return...do construct. Clearly the constraint check will occur inside the constructor, at the point where the return value is created but before it can be changed by the do...end; block.

>This implies that constructors cannot be required to be primitive ops.
>

I'm not sure what you are saying here. Constructors should be allowed to be primitive, and we will have to argue about whether they become abstract when inherited unless overridden in any case. But I don't feel strongly about whether they can be other than primitive operations.

>And therefore that such constructors cannot see improperly initialized
>objects.
>
>

I tend to agree. The best argument in favor of this position is that of type extensions. In the package that creates an extension, you may not be able to see into the parent part of the object. So its initialization has to be done implicitly or by a call to a constructor for the parent type. As I said in my previous post, this is not too onerous. If you are the author of a type, you can decide to put any initialization in the constructors instead of as initial values.

>I like the idea that these new constructor functions are recognized
>somehow by the *spec*.
>
>
>

I think it is necessary. Otherwise we have a fairly heavy distributed overhead. Right now both approaches satisfy this requirement.

>One way to write "bullet proof" abstractions is to forbid clients from
>creating uninitialized objects, by saying "type T(<>) is limited private".
>It would be nice if the abstraction could then export constructor
>functions, and clients could compose them, and/or call them.
>Clients cannot write "X: T;", so they ought to be able to write
>"X: T := Primitive_Constructor(...);", or declare their own composed
>constructor, and use that.
>Either way, the package itself has total
>control over object creation. The unknown discriminants do not
>*necessarily* mean the thing has truly unknown size.
>I'm not sure how important this is.
>
>

You may have just killed the renaming approach. I had forgotten about declaring a type with unknown discriminants as it is not too useful currently. With the renaming approach you can't use such a type. With the constructor as new subprogram approach, even without the attributes, such types will be used all over the place. As you say, the only way for a user to create one will be to call a constructor.

>I like Dan Eilers' idea, of allowing a function result to be definite,
>but dependent on parameters, as in "function F(X: String) return
>String(1..X'Length)". It seems related to all this limited-type stuff,
>in that it allows the caller to know sizes of function results that
>would otherwise be "return String".
>

I like it too. But I think it is more of a "nice to have" for other reasons than a solution to this problem.

*************************************************************

From: Randy Brukardt
Sent: Saturday, December 6, 2003 10:09 PM

Bob Duff wrote:

> Randy, in reply to Tuck, writes:
>
> > > I think you are definitely killing this proposal with complexity.
> >
> > I don't know who "you" refers to here,
>
> I didn't understand that either. Because Tuck didn't say "... writes".
> You (Randy) followed suit. ;-)

I usually answer a number of messages at once (otherwise I have to file a lot more messages). That makes it hard to do and still be understandable.

> > What programmers are used to is the idea that limited types sound good in
> > theory, but are useless in practice.
>
> I've heard this vague claim many times. Could you be more specific? My
> feeling is that the claim is true exactly because of this initialization
> problem, and if we solve that, limited types would be very useful
> indeed.
Are there *other* issues that you or others think make limited > type useless in Ada? The main problem is that you can't have a component of one in something that you want non-limited. Of course that has to be the case. But the workaround (use a pointer) flies in the face of my philosophy of never using pointers unless dynamic allocation is needed. The net effect is that either everything in your program has to be limited types, or everything has to be non-limited. Mixed systems don't work very well, because they don't compose very well. I think that's fundamental to limited types. What we can do, however, is make it more possible to make everything limited (which is often the right choice anyway), and the constructors are part of them. But please keep in mind that constructors are not just for limited types, and in fact allow things for non-limited types that require doing the operations twice. See my Unbounded_String example, for instance. (BTW, that type is not one that I made up. That's the type definition in Janus/Ada and in GNAT (at least when I last looked) for Unbounded_String.) You can work around double initialization with flags, but that has a distributed overhead: every operation has to check the flags and be prepared for a missing data component. That is a lot harder to get right and hurts performance as well. > >... And I think that virtually all ADTs ought to be controlled. > > Unfortunately, that's not feasible if you care about efficiency, given > the huge overhead most compilers have for controlled types. If Ada compilers have "huge overhead" for controlled types, they're strangling proper use of the language. For Janus/Ada, it takes roughly 8 instructions (not counting the user-defined code) to finalize an object, and about twice that to initialize it. That's down in the noise for virtually all uses. Compilers that use mapping solutions should cost even less (if not, they shouldn't bother with them!!). The space overhead is more of an issue. 
I agree that if you have very small data types, and you need very many objects, then there may be an issue. But there are very few such types in any project by definition. > You can stick to controlled types if you like, but I think it's a bad > idea to assume (for the language design) that the only types that need > constructors are controlled. I have lots of limited types in my current > project, and lots of types that *would* be limited if only I could > initialize them nicely. No, I said the only types that need constructors are ADTs. And all ADTs should be controlled (but not necessarily in the way that Ada 95 does controlled). > But I have very few controlled types; they're simply too inefficient. > (Actually, I guess I should say I have very few controlled *objects*. > That is, if I have a type where I'm creating and destroying thousands or > millions of objects of the type, I can't make it controlled, because > it's too slow. The number of *types* is irrelevant to this efficiency > issue.) I was worried about the efficiency of my spam filter, because it stores everything in Unbounded_Strings -- which are just a ball of heap operations and finalizations. But it turns out that the big expense is actually loading the patterns - the actual filtering operations are down in the noise. If you're creating and destroying millions of objects, you have an efficiency problem from doing that. Whether the objects are controlled or just integers doesn't matter at all. Heap operations are so much slower than type creation that they are the bounding factor -- you'll need very careful caching to make the thing usable at all. Anyway, we have completely different philosophies of programming, so I'm not surprised that our results differ. What I find important is that we solve this problem in a way that works for as many users as possible. 
************************************************************* From: Randy Brukardt Sent: Saturday, December 6, 2003 10:25 PM Bob Duff wrote: > Constructors ought to be composable. That is, clients should be able to > write constructors, given primitive constructors. For example, if a > "Sequence" package gives the client a way to construct a singleton > sequence, given one element, and a way to concatenate Sequences, the > client ought to be able to write a constructor that takes two Elements > and produces a sequence of length 2. This is common in non-limited > cases. Why not in limited? I'm not completely sure what you mean. Certainly, the proposal I've put forward allows setting components with a constructor (usually in an aggregate). Or did you mean something else? > This implies that constructors cannot be required to be primitive ops. > And therefore that such constructors cannot see improperly initialized > objects. I think it is important that constructors don't see *any* objects. That is, constructors are creating an object -- in no sense should an object be "passed in" to it. Implementations might implement them that way, but the semantics certainly should not be that way. > I like the idea that these new constructor functions are recognized > somehow by the *spec*. Good. :-) > One way to write "bullet proof" abstractions is to forbid clients from > creating uninitialized objects, by saying "type T(<>) is limited private". > It would be nice if the abstraction could then export constructor > functions, and clients could compose them, and/or call them. > Clients cannot write "X: T;", so they ought to be able to write > "X: T := Primitive_Constructor(...);", or declare their own composed > constructor, and use that. Either way, the package itself has total > control over object creation. The unknown discriminants do not > *necessarily* mean the thing has truly unknown size. > I'm not sure how important this is. 
Well, if you want to do this, I think Tucker's proposal is a non-starter. (Robert Eachus explained why). ---- A few other observations on the (rough) proposal I put out yesterday: The proposal is very similar to the AI-318 discussed at the Sydney meeting. The only real difference is the use of the keyword "constructor". That means that the implementation issues discussed for that proposal would pretty much hold. Constructors would be allowed exactly where aggregates are allowed, with the same semantics. That means a non-limited constructor used in a regular assignment would create a temporary object, then assign it. (Obviously, an implementation could optimize that in some cases.) The AARM points out in several places that := used for initialization is very different than := used for assignment. What this proposal is really doing is allowing users to write their own := initialization operations. That's a capability currently absent in Ada (you can get parts of it, but not the whole thing). One additional concern about Tucker's proposal. Many times in the past, we've used the invariant that you can't call a procedure in a declarative part to prove that some rule or other can't cause a problem (usually with "in out" parameters). If that is no longer true, there are a number of rules that would need to be revisited (freezing, incomplete types, who knows how many others). I don't look forward to finding new holes caused by eliminating that. ************************************************************* From: Tucker Taft Sent: Saturday, December 6, 2003 10:19 PM Here is an approach that might accomplish all of our goals. Enhance the procedure-renamed-as-function by also allowing the renaming of something that looks like a procedure *call*.
For example:

    type Lim(F1 : Integer) is limited private;
    procedure P(Y : Integer; Out_Parm : out Lim);
    function F(X : Integer) return Lim;
  private
    type Lim(F1 : Integer) is limited record
        F2 : Task_Type(F1);
    end record;

    function F(X : Integer) return Lim
      renames
        procedure P(Y => X, Out_Parm => (F1 => X+2, F2 => <>));

The renamed thing can be simply the name of a procedure as in the earlier proposal, in which case it is equivalent to providing a default initialized object as the actual for the [IN] OUT parameter of the procedure, with the other function parameters passed to the corresponding (by position) procedure IN parameters.

Alternatively, the renamed thing can be a procedure *call,* with the [IN] OUT parameter given an initializing expression rather than a variable as the actual. The other parameters (which would have to be IN parameters) would be given expressions of the appropriate type.

In both cases, the "result" of calling the "function" is the final value of the [IN] OUT parameter.

This latter form would clearly give a lot of flexibility, but the *caller* would still be able to create the object (on the stack, in the heap, as a component, etc.), perform the specified initialization (rather than always performing default initialization), register it for task waiting and/or finalization, etc., as appropriate, and then call the designated procedure.

This has a lot of nice properties. The caller is still in charge of getting the object to a state at which it can be safely registered for task waiting and/or finalization, and it is still in charge of allocation. The discriminants of the object can be determined by other parameters to the "function." And the out-of-line procedure can still do whatever it needs to do to finish the desired initialization/construction.

If we consider this direction, we *might* want to allow the obvious generalization of allowing a function to rename another function *call,* specified using a similar syntax.
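The caller-side sequence described above (the caller creates the object, performs the initialization given in the renaming, then calls the designated procedure) can be written out by hand in Ada as it stands. What follows is only a sketch under assumptions: the enclosing procedure Caller_Side_Sketch, the package Demo, the task discriminant Id and the empty bodies are hypothetical, while Lim, P, F1 and F2 follow the example. It illustrates the effect that a call F(3) would be meant to have under the renaming idea, not how any compiler would implement it.

```ada
with Ada.Text_IO;

procedure Caller_Side_Sketch is

   package Demo is
      type Lim (F1 : Integer) is limited private;
      procedure P (Y : Integer; Out_Parm : out Lim);
   private
      task type Task_Type (Id : Integer);
      type Lim (F1 : Integer) is limited record
         F2 : Task_Type (F1);
      end record;
   end Demo;

   package body Demo is
      task body Task_Type is
      begin
         null;  --  hypothetical placeholder for the task's real work
      end Task_Type;

      procedure P (Y : Integer; Out_Parm : out Lim) is
      begin
         null;  --  hypothetical placeholder for the out-of-line construction
      end P;
   end Demo;

   X : constant Integer := 3;

   --  Steps 1 and 2: the caller allocates the object where it wants it and
   --  applies the initialization named in the renaming,
   --  (F1 => X + 2, F2 => <>), so F2 is default-initialized.
   L : Demo.Lim (F1 => X + 2);

begin
   --  Step 3: the caller passes the already-created object to the procedure.
   Demo.P (Y => X, Out_Parm => L);
   Ada.Text_IO.Put_Line ("F1 =" & Integer'Image (L.F1));
end Caller_Side_Sketch;
```

Note that the call to P has to sit in the statement part here; allowing the same effect at the point of declaration is exactly what the renaming would add.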
Anyway, this might be a way to kill several birds with one relatively straightforward stone... ************************************************************* From: Randy Brukardt Sent: Saturday, December 6, 2003 10:19 PM Tucker said: > Here is an approach that might accomplish all of our goals. > Enhance the procedure-renamed-as-function by also > allowing the renaming of something that looks like a procedure *call*. Well, I think this would solve my specific concern. But I think you're just making the idea uglier. The fundamental problem is that a constructor is not a function and it certainly isn't a procedure. It is more a custom definition of ":= initialization", and really should have properties appropriate to that. I have to wonder if this approach would work for a type with unknown discriminants (Bob indicated that he wants to do that.) I don't see how it would be able to set them in that case. ... > This latter form would clearly give a lot of flexibility, but > the *caller* would still be able to create the object (on the > stack, in the heap, as a component, etc.), perform > the specified initialization (rather than always performing default > initialization), register it for task waiting and/or > finalization, etc., as appropriate, and then call the designated > procedure. I don't see this as much of an issue. The "constructor" proposal has all of that localized to the return statement. And the cost of doing it there is not really any different than it is now -- I already have to stand on my head to keep the return object from being finalized in the function. Doing the same for tasks is trivial (it's the same chain for us anyway). I realize other implementors mileage may differ somewhat, but they've already got to face the problem of deferring finalization of the return object. > This has a lot of nice properties. 
The caller is still in charge > of getting the object to a state at which it can be safely registered > for task waiting and/or finalization, and it is still in charge of > allocation. The discriminants of the object can be determined > by other parameters to the "function." And the out-of-line > procedure can still do whatever it needs to do to finish the desired > initialization/construction. I really see no benefit to having the caller do that. That said, I don't see much harm in it either. > If we consider this direction, we *might* want to allow the > obvious generalization of allowing a function to rename > another function *call,* specified using a similar syntax. Yes, it certainly would be more consistent that way. > Anyway, this might be a way to kill several birds with one > relatively straightforward stone... The one problem with this seems to be that if you can write an appropriate aggregate to initialize the object, you probably don't need the constructor in the first place. It also means that the constructor is split into two parts arbitrarily: the initial initialization, and the rest of it. That seems ugly. ************************************************************* From: Robert I. Eachus Sent: Sunday, December 7, 2003 12:17 AM >>Here is an approach that might accomplish all of our goals. >>Enhance the procedure-renamed-as-function by also >>allowing the renaming of something that looks like a procedure *call*. > >Well, I think this would solve my specific concern. But I think you're just >making the idea uglier. Once I bent my head around it, it isn't that bad, and adding functions seems to cover the case of unknown discriminants nicely. But there is no need to subject users to this mind bending experience. What is the difference between renaming a procedure call as a function, and having a constructor function that calls the procedure instead? Just a lot of mental gymnastics that users will complain about forever.
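For comparison, a "constructor function that calls the procedure" can be sketched with the extended return statement that later appeared in Ada 2005. This is an assumption-laden illustration, not the syntax under discussion in this thread (there is no "constructor" keyword here): the enclosing procedure Constructor_Function_Sketch, the package Demo, the task discriminant Id and the empty bodies are hypothetical, while Lim, P, F, F1 and F2 follow Tucker's example.

```ada
procedure Constructor_Function_Sketch is

   package Demo is
      type Lim (<>) is limited private;
      function F (X : Integer) return Lim;
   private
      task type Task_Type (Id : Integer);
      type Lim (F1 : Integer) is limited record
         F2 : Task_Type (F1);
      end record;
      procedure P (Y : Integer; Out_Parm : out Lim);
   end Demo;

   package body Demo is
      task body Task_Type is
      begin
         null;  --  hypothetical placeholder for the task's real work
      end Task_Type;

      procedure P (Y : Integer; Out_Parm : out Lim) is
      begin
         null;  --  hypothetical placeholder for the rest of the construction
      end P;

      function F (X : Integer) return Lim is
      begin
         --  The discriminant is fixed where the result object is declared;
         --  the object is built in place, so no copy of Lim is ever made.
         return Result : Lim (F1 => X + 2) do
            P (Y => X, Out_Parm => Result);
         end return;
      end F;
   end Demo;

   --  Clients cannot write "L : Demo.Lim;" and must call the function.
   L : Demo.Lim := Demo.F (3);

begin
   null;
end Constructor_Function_Sketch;
```

In this form the discriminant expression and the call to P sit together in one body, instead of being divided between a renaming declaration and a procedure.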
The way the construction returns its result needs to be special because the value has to be built in place. Adding the special subprogram type which is needed for many reasons to the "return ... do ... end;" construct which should only be allowed in subprograms that are marked in the specification in some way--and the name constructor does that nicely--results in a solution that users will just use without complaining about the need for renaming, or that there is no obvious way to convert a C++ program with constructors to Ada. This is a nice to have. Note one other very nice to have about the "return...do...end;" construct. The object name after the return can be used as a prefix within the sequence of statements. The compiler has to do whatever magic is required to match this to the place where the created object belongs, and raise Constraint_Error if there is a mismatch. (Constraints in both the return value and the object being created and they don't match.) But the normal cases work just fine. If the type is classwide, or the object is of an unconstrained subtype of a type with discriminants, the values of the discriminants are supplied by the object named in the return statement. From that point on, from both the user and compiler's perspective, all the magic is done. And if you need attributes of the actual object to do the initialization, they are already there "for free." There is a LOT of subtle semantics here. Probably the most subtle part is what happens if a constructor is constructing an object of a class-wide type: Foo: Bar'Class := Make_Foo(Param1, Param2); This is actually an overload resolution issue. If the compiler can disambiguate which constructor to call from the parameters, fine. If not, the user will have to change the object type to a specific type. So I think we do need the overload resolution rule that the declared return type of a constructor must be a specific type.
(I could see having a constructor with several different return statements that returned different specific types, while the declared return type was classwide. It might be a fun idea to play with, but I think it should be out of scope for this revision.) I also feel, and I may be alone on this one, that constructors which are predefined operations of a type should not be derived as abstract the way functions are. With constructors of tagged types there will be many cases where calling the constructor for the parent type on a view conversion is the magic you want. (But if it creates compiler issues, I can easily be talked out of it.) >The fundamental problem is that a constructor is not a function and it >certainly isn't a procedure. It is more a custom definition of ":= >initialization", and really should have properties appropriate to that. I agree with Randy. This is a special thing from both the language viewpoint and the user's viewpoint. Trying to avoid that results in a cognitive dissonance that is an invitation to headaches, both for language lawyers and Ada programmers. (Randy said:) >I don't see this as much of an issue. The "constructor" proposal has all of >that localized to the return statement. And the cost of doing it there is >not really any different than it is now -- I already have to stand on my >head to keep the return object from being finalized in the function. Doing >the same for tasks is trivial (it's the same chain for us anyway). Yep. (Back to Tucker:) >>If we consider this direction, we *might* want to allow the >>obvious generalization of allowing a function to rename >>another function *call,* specified using a similar syntax. I am not sure that I like this new idea, but without renaming of function calls, I think it is a non-starter. And once you look at the renaming of function calls case we quickly get back to the constructors with special return values.
>The one problem with this seems to be that if you can write an appropriate >aggregate to initialize the object, you probably don't need the constructor >in the first place. It also means that the constructor is split into two >parts arbitrarily: the initial initialization, and the rest of it. That >seems ugly. Worse. From a user's point of view, the action of the constructor will often have to be split into two parts, the initialization of the parent part of the object (possibly done as part of an aggregate), then what is special to this constructor. However, the split that Tucker's approach creates will only match this cognitive division by accident. (Or by very careful planning on the user's part.) I just don't think we need that additional hurdle to using this language feature. I really keep coming back to the same thought. The purpose of this effort is to make limited types more usable in Ada. (Some people would take the "more" out of that statement.) To do that is going to require constructors which are pain free from a user's perspective. If we don't have that, we have junk that is not worth implementing. Tucker's approaches may be very clever tricks, but that is how they will be seen by users. It would be fine to implement the actual constructors that way, but the user view needs to be simple and straightforward to use. And as I said (by now yesterday my time), I think that the issue Bob Duff identified with: type T(<>) is limited private; is crucial. Why would I declare such a type? Because I want to control the creation of all objects of the type. But unless I can export constructors, the type is pretty useless. Getting rid of that special field, the only one with an initial value, that says this is really an uninitialized object, would make me very happy. But right now that is the only way to have limited objects with control of creation.
(Actually, I have done nastier things with access discriminants, and default initial values that call a protected object. That is one big hairy kludge. But it did ensure that each object had a unique, sequential ID.)

*************************************************************

From: Tucker Taft
Sent: Sunday, December 7, 2003 8:27 AM

Randy Brukardt wrote:
> ...
> I have to wonder if this approach would work for a type with unknown
> discriminants (Bob indicated that he wants to do that.) I don't see how it
> would be able to set them in that case....

This is straightforward:

    type Lim(<>) is limited private;
    procedure P(Y : Integer; Out_Parm : out Lim);
    function F(X : Integer) return Lim;
  private
    type Lim(F1 : Integer) is limited record
        F2 : Task_Type(F1);
    end record;

    function F(X : Integer) return Lim
      renames
        procedure P(Y => X, Out_Parm => (F1 => X+2, F2 => <>));

Now the user can't declare just "L : Lim;" but rather must call a function like F at the declaration point ("L : Lim := F;").

---------

I understand the concern about the split between the functionality specified at the rename, and the functionality buried in the procedure. However, if we want the discriminants (and hence the size) known to the caller, then you need to be able to write something available to the caller (in Dan's idea, it was the function result subtype) which is a function of the parameters. But Dan's idea doesn't work for a "type Lim(<>) is ..." type since the discriminants are not visible. The renaming approach, because of the possible separation between the initial declaration in the visible part and the renaming in the private part, allows for that.

Furthermore, the separation has other advantages. A single procedure can be used with several different renamings, with the renamings differing in the expressions passed for some of the IN parameters of the procedure, or the initializing expression for the [IN] OUT parameter.
This is similar to what is done now with renamings where you can create several renamings of the same subprogram, with different default expressions. I would say this approach is actually *clearer* than the "trick" of changing the default expressions on renaming. Presuming it is available for function-to-function renaming and procedure-to-procedure renaming, then it becomes a generally useful capability, which can also be extended to handle procedure-to-function renaming which is what limited types need. I will say (again ;-) I am quite concerned about introducing a new kind of program unit. This brings up the issue of library constructors, generic constructors, subunit constructors, constructor stubs, etc. Furthermore, from the caller's point of view, is there any limitation on where they can be called, or is it just like a function, and a call on a constructor can be used anywhere a function can be used? In my view, functions *are* Ada's constructors, and the fact that we can name our functions anything we want is a step up from C++. The problem is that you can't write decent functions for limited types. I think "fixing" this by inventing a completely new notion of a "constructor" leaves the existing use of functions out in the cold. Does this mean that we should go back and change all the functions we have created for non-limited types which are often used as constructors, and make them "true" constructors? I believe you are unnecessarily "orphaning" functions. I would rather "orphan" the existing ability to return by reference, which I think is of marginal use. If we wanted to call something a "limited function", I would say a function that returns by reference is such a thing. So perhaps the "right" fix is to say that if you want to continue to use return-by-reference, you have to label the function a "limited function." All other functions are "constructors" in that they create "new" objects.
For limited types, the caller needs to know enough to be able to allocate and at least partially initialize the object, since this "new" object cannot be copied as part of the function call, but must be initialized in its final resting place. Randy indicates he already deals with functions returning controlled types, but I think for limited controlled, it is a somewhat different problem, because *no* copying is permitted (there is no "adjust" procedure). We have ourselves worked to minimize the number of copies involved, but for types with discriminants, we end up with at least one final copy even in "initializing" contexts. To support limited types with discriminants, it seems clear that something available to the caller (such as a renaming declaration) has to provide an indication of the value of the discriminants of the "new" object. ************************************************************* From: Robert A. Duff Sent: Sunday, December 7, 2003 10:26 AM > > I've heard this vague claim many times. Could you be more specific? My > > feeling is that the claim is true exactly because of this initialization > > problem, and if we solve that, limited types would be very useful > > indeed. Are there *other* issues that you or others think make limited > > type useless in Ada? > > The main problem is that you can't have a component of one in something that > you want non-limited. But that sort of begs the question. I mean, if I ask, "why can't you make type T1 limited?", and you say, "because I want T2 to have a component of type T1, and T2 can't be limited," then I'll ask again, "OK, why can't you make *T2* limited then?" My feeling is that the reason you can't make T2 limited is because of these initialization issues (including aggregates and constructor functions), and if that were solved, then you would be happy to make both T2 and T1 limited. 
(Or maybe T2 is a component of T3, and so on -- but somewhere down the line, there's got to be a real reason, other than, "if I make *this* limited then I'd have to make *that* limited.") >... Of course that has to be the case. But the workaround > (use a pointer) flies in the face of my philosophy of never using pointers > unless dynamic allocation is needed. > > The net effect is that either everything in your program has to be limited > types, or everything has to be non-limited. Mixed systems don't work very > well, because they don't compose very well. I don't see any problem with that. Why would you *want* to compose them? I mean, if you have something that's naturally limited (say, Window_Handle), then I would want things containing Window_Handles to be limited, too. > I think that's fundamental to limited types. What we can do, however, is > make it more possible to make everything limited (which is often the right > choice anyway), and the constructors are part of them. Yeah, that's what I think we should do. My question was, if we do that, will people *still* gripe that limited types "sound good but are useless in practise"? I think not (i.e. fixing the initialization problems is sufficient to make limited types useful). > But please keep in mind that constructors are not just for limited types, > and in fact allow things for non-limited types that require doing the > operations twice. See my Unbounded_String example, for instance. (BTW, that > type is not one that I made up. That's the type definition in Janus/Ada and > in GNAT (at least when I last looked) for Unbounded_String.) You can work > around double initialization with flags, but that has a distributed > overhead: every operation has to check the flags and be prepared for a > missing data component. That is a lot harder to get right and hurts > performance as well. I agree that the double-initialization is not nice. But I don't see a better alternative. 
(OTOH, as I said, I don't fully understand all the details.) > > >... And I think that virtually all ADTs ought to be controlled. > > > > Unfortunately, that's not feasible if you care about efficiency, given > > the huge overhead most compilers have for controlled types. > > If Ada compilers have "huge overhead" for controlled types, they're > strangling proper use of the language. I agree. > For Janus/Ada, it takes roughly 8 instructions (not counting the > user-defined code) to finalize an object, and about twice that to initialize > it. Pretty impressive, I think. Anyway, let's not argue about whether finalization is good or evil or fast or slow. The fact is, there are some programs (the one I'm working on right now is an example), that would like to have lots of non-controlled limited types, with constructors. You said: >...What I find important is that we solve > this problem in a way that works for as many users as possible. and I agree with *that*. It implies that we should not create a solution that works only for controlled types. ************************************************************* From: Robert I. Eachus Sent: Sunday, December 7, 2003 3:48 PM This is a long post. It is necessary that someone completely work through these proposals to find any potential problems if they are adopted. Not everyone needs to check my work. (At least the example package as included here compiles. ;-) The short version of this post is that Tucker's old or new approach may cause problems that need to be resolved in the area of protected objects. I don't see it as a killer, but it deserves thought. The new constructor subprogram type can be allowed as a part of protected types, but I don't see a pressing need. Note that it is not the object being created that needs protecting, it is parameters to the constructors that may raise concurrency issues. 'Allowing' a constructor of a type to be part of a protected object to me is not needed even for orthogonality reasons.
A constructor never needs a way to protect the object that it is creating, and of course a constructor in a protected type would not be creating objects of the protected type. Tucker Taft wrote: > Randy Brukardt wrote: > >> ... >> I have to wonder if this approach would work for a type with unknown >> discriminants (Bob indicated that he wants to do that.) I don't see >> how it >> would be able to set them in that case.... > > > This is straightforward: > > type Lim(<>) is limited private; > procedure P(Y : Integer; Out_Parm : out Lim); > function F(X : Integer) return Lim; > private > type Lim(F1 : Integer) is limited record > F2 : Task_Type(F1); > end record; > > function F(X : Integer) return Lim > renames > procedure P(Y => X, Out_Parm => (F1 => X+2, F2 => <>)); > > Now the user can't declare just "L : Lim;" but rather > must call a function like F at the declaration point > ("L : Lim := F;"). Um, that is "L: Lim := F(3);" or some such. There are two limitations with this approach. First, it is difficult to have really unknown discriminants--the discriminant types probably have to be visible so that the visible function declaration can have parameters of the type. (Not a big deal.) The other limitation is the one that bothers me. Go back and look at the example I posted of converting a linked list of Unbounded_Strings to an array. There is a workaround to use with this approach: have a function which walks the list and counts the entries, and use a call to that to pass the array size to the function.
Let me give a fully worked out (but currently useless) example:

---------------------------------------------------------------------------------------------------------
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
package Unbounded_String_Utilities is

   type String_Array(<>) is limited private;
   type String_List is limited private;

   function Size (Arr: String_Array) return Natural;
   function Size (List: String_List) return Natural;

   function To_String_Array(List: in String_List) return String_Array;

private

   type List_Node;
   type String_List is access List_Node;
   type String_Array is array(Natural range <>) of Unbounded_String;
   type List_Node is record
      Value: Unbounded_String;
      Next: String_List;
   end record;

end Unbounded_String_Utilities;

package body Unbounded_String_Utilities is

   function Size (Arr: String_Array) return Natural is
   begin
      return Arr'Length;
   end Size;

   function Size (List: String_List) return Natural is
      Count: Natural := 0;
      Temp: String_List := List;
   begin
      while Temp /= null loop
         Temp := Temp.Next;
         Count := Count + 1;
      end loop;
      return Count;
   end Size;

   function To_String_Array(List: in String_List) return String_Array is
      Result: String_Array(1..10);
      Temp: String_List := List;
   begin
      if Temp = null then return Result(1..0); end if;
      for I in 1..10 loop
         Result(I) := Temp.Value;
         if Temp.Next = null then return Result(1..I); end if;
         Temp := Temp.Next;  -- advance to the next node
      end loop;
      -- Ten entries copied and more remain: convert the rest recursively.
      return Result & To_String_Array(Temp);
   end To_String_Array;

end Unbounded_String_Utilities;
---------------------------------------------------------------------------------------------------------

Right now, without the limited keywords, I can use this package. But with String_Array and String_List declared limited, I can't even declare an object of type String_Array outside the package private part, body, or child packages.

My proposed solution is to declare To_String_Array as a constructor, and, if necessary, change the return statements.
(But it shouldn't be necessary in this case, since type String_Array is not limited inside the body of To_String_Array.)

What do I have to do for Tucker's solution? I have to replace the body of To_String_Array with a procedure, and a renaming of a call to the procedure:

   procedure To_Array(List: in String_List; Result: out String_Array) is
      Temp: String_List := List;
   begin
      for I in Result'Range loop
         Result(I) := Temp.Value;
         Temp := Temp.Next;  -- advance to the next node
      end loop;
   end To_Array;

   Junk: Unbounded_String;

   function To_String_Array(List: in String_List) return String_Array
      renames To_Array(List, (1..Size(List) =>Junk));

Making all those copies of Junk just to throw them away is something the compiler might figure out. But the other problem is the one that concerns me. Specifying the size of the array in the renaming makes the procedure somewhat simpler than the function it replaces. Instead there is a call to Size, I probably have to write that function anyway, but what if someone appends something to the List, or worse, shortens the list while I am working on it? Ah, I'll just create a protected object, and ensure that operations in the package have the appropriate locking semantics.

Oops! The problem is not that To_Array is a procedure, it is that locking for Size then locking for To_Array is useless. You have to allow this funny renaming to be an operation of a protected type, and that operation has to include the evaluation of the parameters of the procedure call. The renamed procedure probably also has to be an operation of the protected type so that internal calls are appropriately recognized.

Of course, you could use semaphores and put the P and V operations in different subprogram bodies. But doing that first requires that those particular subprograms not be visible to the user of the package, and even then is a maintenance nightmare.

What about the new operation approach? I don't think we need to touch protected types at all.
I might want to have a per list lock for the list type, but I can do that by making a new list head type which is the public type, and making it the protected object. (The public type is already limited, right? ;-) When creating the array, I can have calls in the constructor to lock and unlock the list head. I don't need to worry about protecting the array object during construction; it can't be referenced before the declaration has been completely elaborated. If I had the need, I could do the same thing in the other direction: lock an array, copy the data, unlock the array and return. Note that this does mean that the locking and unlocking operations need to go into the sequence of statements of the return ... do ... end. That should be no problem compared to all the cruft necessary to make the other approach work with protected objects. If everyone thinks that a constructor should be treated like a function (or, for that matter, a procedure) within a protected type declaration for orthogonality reasons, it can be done, but I don't see any real need for it.

> I will say (again ;-) I am quite concerned about introducing a new
> kind of program unit. This brings up the issue of library
> constructors, generic constructors, subunit constructors,
> constructor stubs, etc. Furthermore, from the caller's
> point of view, is there any limitation on where they can
> be called, or is it just like a function, and a call on
> a constructor can be used anywhere a function can be used?

I think that I have the same concerns as Tucker does, but a different viewpoint. There is no real need, outside of orthogonality issues, to have library constructors, generic constructors, or subunit constructors. A generic package could certainly contain a constructor for the ADT it defines, but a constructor as a library unit or generic unit makes no real sense to me. Separate bodies for constructors may be a nice to have, but not if it kills the proposal.
As for where a constructor can be called, my worry is that with Tuck's proposal, a procedure renamed in private to create a thing that publicly looks like a function means that it can be called anywhere a function name is legal. Otherwise we have a major contract violation. With my proposal, my intent is to keep things as restricted as possible. If we have the return ... do ... end construct as Randy advocates, it certainly should be legal to call the constructor for the parent type in an aggregate. You also need to be able to use a constructor to create a target of an allocator. And, of course, you need to be able to call a constructor to create the initial value in an object declaration.

What about as an in parameter in a procedure or other call? I don't see any harm in requiring users to actually create an object with a declaration, then pass the object in a call. The problem now is that in some cases you can't create a useful object to pass. In fact, this is probably the hardest part of what we are doing here. We are not just fixing a problem in the language; we are refining what the properties of a limited object are. I actually want to make it more difficult to create a limited object outside the facilities provided by the ADT. But that requires making constructors for limited types work.

> In my view, functions *are* Ada's constructors, and the fact
> that we can name our functions anything we want is a step
> up from C++. The problem is that you can't write decent
> functions for limited types. I think "fixing" this by
> inventing a completely new notion of a "constructor" leaves
> the existing use of functions out in the cold. Does this
> mean that we should go back and change all the functions
> we have created for non-limited types which are often
> used as constructors, and make them "true" constructors?
>
> I believe you are unnecessarily "orphaning" functions.

LOL!
Most current functions, even those that are part of an ADT, do not return a value of a private or limited private type. If you want to change To_Unbounded_String, etc., to constructors, I see no problem with that. But most of the functions in Ada.Strings.Unbounded are actually operators, and intended to be used as such. In fact, if you think about it there are several 'special' types of functions in Ada already, such as character literals, string literals, operators, and attributes. If it helps you to think in terms of "constructor function Foo return Bar is...", fine. Or perhaps "function Foo return in_place Bar;" is even better. I am trying to stay away from "limited function Foo is..." because I feel that constructors will be used for both limited and non-limited ADTs. I also feel that avoiding the word constructor is wrong. If it walks like a duck, and quacks like a duck, we don't do anyone any favors by saying that in Ada we call that a moose. > I would rather "orphan" the existing ability to return > by reference, which I think is of marginal use. If we > wanted to call something a "limited function", I would > say a function that returns by reference is such a thing. I definitely disagree. It may be that the compiler you are familiar with seldom returns values by reference, but at least in my code it is pretty common. > To support limited types with discriminants, it seems > clear that something available to the caller (such as > a renaming declaration) has to provide an indication > of the value of the discriminants of the "new" object. It may be that this is the real crux of the matter. I don't want the discriminants made available to the caller. Tucker is probably using this as shorthand for "made available to the compiler in the context from which the call is made." But there are many useful programming idioms where it is the constructor that determines the properties of the object created. 
Most Ada compilers currently handle this (in the unconstrained function return value case) very well, and very efficiently. What is different in the limited case is that the return value has to be created in place. I see this as a dialog between the calling environment and the constructor which takes place in the return statement. Randy apparently sees this as well, and it fits very nicely with the "return ... do ... end" construct. The constructor does whatever it must, then reaches the return statement. There it either calls a thunk or whatever and gets the space it needs to create the returned object. In my mental model, it is fine for the limited case to allocate space on the heap and the caller will deallocate the object when the scope is left. (Finalization, right?) It is also possible to have a separate call stack for functions so that return values can be built directly in the calling procedure's context. (Works very well in practice because of the way that functions are used in Ada.) But these compiler efficiency issues, as far as I am concerned, are low on the priority list when discussing this issue. I would have to put my weightings at 40% usability, 30% ease of understanding/teaching, 20% compiler implementation difficulty, and 10% efficiency issues.

I thought about making a table of issues and comparing proposals, but this message is long enough as it is. Maybe someone else will give that a try, possibly someone not associated with any current position. ;-)

Wish I could join you at the meeting, but right now even Route 128 is outside my travel radius.

*************************************************************

From: Tucker Taft
Sent: Sunday, December 7, 2003 4:27 PM

> ... In my mental model, it is fine for the limited case to
> allocate space on the heap and the caller will deallocate the object
> when the scope is left. ...

I don't see how that works for a component. Only the caller knows exactly where the object is to be allocated.
Trying to communicate that to the called routine is not trivial.

*************************************************************

From: Robert A. Duff
Sent: Sunday, December 7, 2003 9:35 PM

Randy wrote:

> Bob Duff wrote:
>
> > Constructors ought to be composable. That is, clients should be able to
> > write constructors, given primitive constructors. For example, if a
> > "Sequence" package gives the client a way to construct a singleton
> > sequence, given one element, and a way to concatenate Sequences, the
> > client ought to be able to write a constructor that takes two Elements
> > and produces a sequence of length 2. This is common in non-limited
> > cases. Why not in limited?
>
> I'm not completely sure what you mean. Certainly, the proposal I've put
> forward allows setting components with a constructor (usually in an
> aggregate). Or did you mean something else?

I think one of your (or somebody's) e-mails suggested a legality rule along the lines of "a constructor must be a primitive operation of the constructed type". What I meant was, I don't like that restriction, because no such restriction exists for nonlimited types, and what we're trying to accomplish is to initialize limited objects just like nonlimited ones (or as close to that as we can manage).

> > This implies that constructors cannot be required to be primitive ops.
> > And therefore that such constructors cannot see improperly initialized
> > objects.
>
> I think it is important that constructors don't see *any* objects. That is,
> constructors are creating an object -- in no sense should an object be
> "passed in" to it. Implementations might implement them that way, but the
> semantics certainly should not be that way.

I'm not sure what you mean by that.
I was under the impression that all of the proposals on the table had a named object inside the function that acts as the function result (or constructor result or whatever), and that this object *is* the object at the call site that is being created. Are you saying that it should be illegal to read from this object (as in Ada 83 'out' parameters)?

I could be confused -- I admit I've lost track of the details in the morass of e-mail.

*************************************************************

From: Randy Brukardt
Sent: Sunday, December 7, 2003 11:46 PM

Bob said, replying to me replying to him: (See I can remember to tag these... :-)

> > > This implies that constructors cannot be required to be primitive ops.
> > > And therefore that such constructors cannot see improperly initialized
> > > objects.
> >
> > I think it is important that constructors don't see *any* objects. That is,
> > constructors are creating an object -- in no sense should an object be
> > "passed in" to it. Implementations might implement them that way, but the
> > semantics certainly should not be that way.
>
> I'm not sure what you mean by that. I was under the impression that all
> of the proposals on the table had a named object inside the function
> that acts as the function result (or constructor result or whatever),
> and that this object *is* the object at the call site that is being
> created. Are you saying that it should be illegal to read from this
> object (as in Ada 83 'out' parameters)?
>
> I could be confused -- I admit I've lost track of the details in
> the morass of e-mail.

No, I was simply saying that there shouldn't be any name for the object being constructed until the constructor is actually ready to create it. That is, in the proposal I sketched out, not until the return statement is encountered. At that point, the object will be initialized -- but the constructor can control what initialization is done (with an aggregate or another constructor).
In particular, the caller cannot and should not be trying to allocate anything; it has to tell the constructor how to do that (most likely on a secondary stack).

Tucker said:

> In my view, functions *are* Ada's constructors, and the fact
> that we can name our functions anything we want is a step
> up from C++. The problem is that you can't write decent
> functions for limited types. I think "fixing" this by
> inventing a completely new notion of a "constructor" leaves
> the existing use of functions out in the cold. Does this
> mean that we should go back and change all the functions
> we have created for non-limited types which are often
> used as constructors, and make them "true" constructors?

The few functions which are constructors (like To_Unbounded_String) should indeed be changed. But most things aren't constructors in the normal sense. I see a function as returning a copy or reference of an existing object. A constructor creates a new object. For limited types, that's in-place construction, and that should be true for non-limited types as well.

In any case, if we had the will to change *all* functions this way, that would be fine. But that's not going to happen - it would break too many programs. So we have to introduce a new concept. That's the penalty for getting it wrong the last time.

(Now, if constructors allowed "in out" parameters, then we could solve another problem as well. And then if functions ended up orphaned, good riddance. But I doubt that we have the will to do that.)

*************************************************************

From: Robert I. Eachus
Sent: Sunday, December 7, 2003 11:49 PM

I am more and more convinced that this will appear to most users as the single largest change in Ada 0Y. I don't think the compiler work is going to be all that hard--after all C++ has constructors. ;-) But I think we do need to devote time to getting the rules right. If we do, almost all Ada programs will use this feature.
(I certainly know that I am spending much too much time on this topic, if it is not as important as I think it will be.)

-- RIE

Tucker Taft wrote:

>>... In my mental model, it is fine for the limited case to
>>allocate space on the heap and the caller will deallocate the object
>>when the scope is left. ...
>>
>
>I don't see how that works for a component. Only the caller
>knows exactly where the object is to be allocated. Trying
>to communicate that to the called routine is not trivial.
>

I understand what you are saying, I think. This is why having a separate function call stack is such a useful optimization. If a procedure (or for that matter a function) wants to build an object "in place" then, if you can use another stack to contain the function's return context and any local objects, then the return value just gets built on top of the stack, or, if necessary, in the heap. If you can't do this you end up doing things like copying the return value lower down the stack after the function returns.

I've seen several variations on this approach. I think the ALS used a ping-ponging scheme, and other compilers have a separate stack for function calls. But given the way Ada is used, even dynamic calling depths are low, and it is possible to have a stack for each nesting level most of the time.

Implicit heap use works, and is clearly a workable solution for most of the cases we have been discussing. The only tricky part is when a component of a record being created is "big" and the size is determined by the called function. This is where Alsys used to go into all that "mutant record" stuff. (I don't know if they still do.) But this is just the nature of Ada record structures. Most of the time it is possible to allocate a record in one contiguous memory space with no embedded pointers. But for some of the extreme cases it is better to use embedded pointers.
For example, if you have several components whose size depends on the discriminants, you can pass thunks to calculate the offsets the way DEC Ada did. It is just faster (but less memory efficient) to have offsets or pointers as part of the structure.

But as I said in my previous message, I would rather have some inefficiency in the memory management in complex cases if that makes it easier for programmers to use limited types for the normal cases. And as I see it, there are two 'normal' cases here. In the first case, from the point of view of the constructor the object will have a constant size header, which will come from the parent (or collective ancestor) types, and a fixed size extension. In the second case, the type will be a container type with discriminants, and a size that is computed by the constructor. But the object will not normally be a component.

I guess if you really feel strongly about it based on your compiler implementation, it would be possible to restrict constructors to creating objects not components. I just don't see it as a problem. Of course, I did advocate (intentionally) that constructors need to be allowed in aggregates, but omitted default values in records. You could add a rule that a constructor is illegal as a default value in a record definition, but again I don't see why. The compiler will be creating an entire object containing the component, just as in the aggregate case.

But I do see a big mess if a constructor can be called to change the contents of a limited record object. As I said before, for me--and I suspect everyone--that is no way, no how what we want. Getting the limited case right is more important than covering the non-limited ADT case. However, I don't see a problem with writing the rule so constructors for non-limited objects can be used to change components of records.
(As far as I am concerned, I would expect the compiler to use the constructor to create a--remember non-limited--object, then copy it into the actual record. An "extra" copy in some cases, but I don't see it as a big deal if it is inefficient to use constructors as other than constructors.) ************************************************************* From: Robert I. Eachus Sent: Monday, December 8, 2003 12:03 PM Randy Brukardt wrote: >No, I was simply saying that there shouldn't be any name for the object >being constructed until the constructor is actually ready to create it. That >is, in the proposal I sketched out, not until the return statement is >encountered. At that point, the object will be initialized -- but the >constructor can control what initialization is done (with an aggregate or >another constructor). In particular, the caller cannot and should not be >trying to allocate anything; it has to tell the constructor how to do that >(most likely on a secondary stack). > > I think Randy and I are in agreement here, even if we don't agree on which is the "secondary" stack. ;-) >In any case, if we had the will to change *all* functions this way, that >would be fine. But that's not going to happen - it would break too many >programs. So we have to introduce a new concept. That's the penalty for >getting it wrong the last time. > > I don't know that we got it "wrong" last time. I think we just didn't think through the need at all. Functions returning results by reference is a neat trick in other circumstances, but it can't handle this particular problem/need/programming model. >(Now, if constructors allowed "in out" parameters, then we could solve >another problem as well. And then if functions ended up orphaned, good >riddance. But I doubt that we have the will to do that.) > The will, or the stomach? I would rather have functions with in out parameters than constructors. 
But the Rosen trick will work for both if needed, and I hope the need can be restricted to random number generator seeds. ;-)

*************************************************************

From: Tucker Taft
Sent: Monday, December 8, 2003 6:13 AM

Robert Eachus wrote:
>
> Tucker Taft wrote:
>
> >>... In my mental model, it is fine for the limited case to
> >>allocate space on the heap and the caller will deallocate the object
> >>when the scope is left. ...
> >>
> >
> >I don't see how that works for a component. Only the caller
> >knows exactly where the object is to be allocated. Trying
> >to communicate that to the called routine is not trivial.
> >
> I understand what you are saying, I think. ...

I'm not sure you do. You certainly cannot assume that limited components will be allocated on the heap. Yes, I suppose some compilers, like RR's, allocate all nested composite objects with a level of indirection, but that is the exception, not the rule. Almost all the other compilers go out of their way to keep records contiguous, whether they are limited or non-limited.

I presume you will be able to use a call on one of these constructor functions to initialize a limited component of a limited aggregate. If not, you haven't solved the problem in my view. That is:

   X : Lim2 := Lim2'(F1 => 7, F2 => Lim1_Con_Func(3,4));

where Lim1_Con_Func is one of these constructor functions.

> ...
> But as I said in my previous message, I would rather have some
> inefficiency in the memory management in complex cases if that makes it
> easier for programmers to use limited types for the normal cases.

I really don't see any compiler changing its record layout just to make these possible. That would be hugely disruptive.

> ...
> I guess if you really feel strongly about it based on your compiler
> implementation, it would be possible to restrict constructors to
> creating objects not components.

That is a non-starter.

> ...
> But I do see a big mess if a constructor can be called to change the
> contents of a limited record object. ...

No one is proposing that. Except for the oddball return-by-reference functions, all function calls are associated with the creation of a new object. The only issue is how this is *implemented.*

From the view of the user of the function, there is always a new object produced. The procedure-renamed-as-function proposal is suggesting that the compiler does the allocation and basic initialization inline before going out-of-line to the procedure.

In the "return ... do ..." proposal, the allocation still needs to be done prior to the call. It could be raw, uninitialized storage, but the space needs to exist before the call, if you are going to allow components to be initialized by such a call.

To do the allocation, the compiler needs to know the size of the result, meaning in general it needs to know the discriminants if the result type is discriminated. If the caller knows the discriminants, then we need to be sure the out-of-line code uses the same values for the discriminants. This gets tricky if there is no name that refers to the returned object until the out-of-line code declares one (e.g. via the "return Result: T := do ..." construct), since they can't refer to the values of the discriminants in the initializing expression for the return object.

I suppose one way out of this conundrum is to allow references to discriminants of "Result" in the "". That is not the way normal declarations work, but it could work here.

The other problem, as others have indicated, is that sometimes we want the discriminants of the new object to be a function of the parameters. That seems harder in the "return ... do ..." approach, since we would have to somehow communicate the values to the calling context as well, since they are needed *before* the object can be allocated, and that needs to happen before the call (because of components).

> ...
However, I don't see a problem with writing the rule so > constructors for non-limited objects can be used to change components of > records. I don't know what you are talking about when you say "change" a component. Do you mean that for a non-limited type, you would allow a call on a constructor function to be the right hand side of an assignment, whose left-hand-side is a component selection (or anything else, for that matter)? Or do you mean the function would somehow be passed a reference to a preexisting object, and treat it as an IN OUT parameter? I don't understand what a call would look like in this latter case, and it is not something I have any interest in. In the procedure-renamed-as-function proposal, it is true the *procedure* is passed an [IN] OUT parameter, but the procedure is *not* the "constructor." The constructor is the function defined by the renaming. The procedure is just a useful hunk of code that the constructor function reuses. That same procedure might be used for other things. The function defined by the renaming, on the other hand, has the properties of a "normal" function, namely that it is always associated with the creation of a new object. It is different in that there is enough information provided so that at the call-site, the compiler can generate the code to do allocation and basic initialization, and the out-of-line code need not worry about that. ************************************************************* From: Pascal Leroy Sent: Monday, December 8, 2003 7:22 AM Randy wrote: > Let's look specifically at the "complexity" of this counter proposal... If we are going to add constructors to the language (and that's a big if, given that we are already pretty late in the game) then I strongly favor the Brukardt-Eachus proposal over the other ideas that have been floated in this thread. 
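
For reference, Tucker's aggregate example earlier in this exchange (X : Lim2 := Lim2'(F1 => 7, F2 => Lim1_Con_Func(3,4));) can be written out as a complete program. This is only a sketch under stated assumptions: it relies on the limited aggregates and limited function results provided by Ada 2005 and later, and the component names A and B, the parameter names, and the record layouts are invented here, since the thread never defines Lim1 or Lim2.

```ada
with Ada.Text_IO;

procedure Limited_Aggregate_Demo is

   --  Hypothetical declarations; the thread only names the types.
   type Lim1 is limited record
      A, B : Integer;
   end record;

   type Lim2 is limited record
      F1 : Integer;
      F2 : Lim1;   --  limited component stored inline in the enclosing record
   end record;

   --  A "constructor function" for Lim1. The result is given as an
   --  aggregate, so no existing limited object is ever copied.
   function Lim1_Con_Func (Left, Right : Integer) return Lim1 is
   begin
      return (A => Left, B => Right);
   end Lim1_Con_Func;

   --  The component F2 is initialized directly by the function call.
   X : constant Lim2 := Lim2'(F1 => 7, F2 => Lim1_Con_Func (3, 4));

begin
   Ada.Text_IO.Put_Line (Integer'Image (X.F1 + X.F2.A + X.F2.B));
end Limited_Aggregate_Demo;
```

Because Lim1 is fully constrained, the caller knows the size of F2 before the call, which is the easy case in the discussion above; the difficult case is a component whose size is chosen by the called function.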
*************************************************************

From: Tucker Taft
Sent: Monday, December 8, 2003 9:52 AM

It would be helpful if you could augment this with some rationale, to help understand your view of the strengths and weaknesses of the proposals.

*************************************************************

From: Robert I. Eachus
Sent: Monday, December 8, 2003 11:44 AM

Tucker Taft wrote:

>Robert Eachus wrote:
>
>>I understand what you are saying, I think. ...
>>
>
>I'm not sure you do. You certainly cannot assume that limited components
>will be allocated on the heap. Yes, I suppose some compilers,
>like RR's, allocate all nested composite objects with a level
>of indirection, but that is the exception, not the rule. Almost
>all the other compilers go out of their way to keep records
>contiguous, whether they are limited or non-limited.
>

I was not saying that a compiler should allocate all limited objects built by constructors on the heap. I was saying that it is one of the cases that has to work:

   Something: Foo_Access := new Foo'(Constructor);

You are arguing for a solution where the caller allocates the space on the heap, and passes the address to the constructor. Randy and I are heading more towards a dialog between the constructor and the caller, or a thunk based approach. The space on the heap would normally be allocated in the return statement.

If the constructor returns an object which is constrained, there is no problem with components; the caller can know the size of the object before the call. The problem with components is exactly the case which Alsys called mutant records. This is where allocating the maximum size of the object doesn't work. In Ada 83, many compilers used the "allocate the maximum" approach, and some still do. I was saying/accepting that for some objects (Unbounded_String components are an excellent example) the normal case is going to require a level of indirection.
In both Randy's compiler and GNAT, that is the way Unbounded_String is declared. Note that in cases where the compiler knows a reasonable maximum at compile time, most compilers will, as you say, go out of their way to allocate the record contiguously. A good example is the Bounded_String case. For a Bounded_String I expect a compiler to allocate the maximum.

Where I think our mental pictures differ is that you are thinking that since a limited object cannot change in size, this should never be a problem. But I want constructors to be able to handle the case where the size of the object to be created is determined in the constructor. That was what the To_String_Array example was about. Ada currently allows me to handle this case for non-limited objects; it should actually be easier for limited objects, but right now it is illegal.

So my mental picture of how this works is that there is a dialog between caller and constructor, probably mediated by a thunk. When the constructor gets to the return statement, it says "the object I am about to create will be N bytes long," and the thunk responds with an address of the memory space to put it in. Constructors need this hidden data/extra parameter, which is why I favor an explicit new subprogram declaration form. What exactly it looks like is almost irrelevant to the implementation issues, but as a user, I think it is much nicer to name a constructor as a constructor.

Note that the hard case is when an object has several components initialized by constructors, and the size of the object depends on many of them. A heroic compiler could create thunks that are co-routines, and effectively call all the thunks in an aggregate in parallel. I am not advocating that a compiler support this, which is why I pointed out the existence of the indirection bailout. I would expect compilers to handle aggregates with one size indeterminate component, and I would expect aggregates that have more than one such component to be rare.
However, given the choice between a rule that only allows one constructor in an aggregate, and what Robert Dewar would call "junk code" when an object has more than one component which has an unbounded size, I'm willing to allow the junk code in the difficult case so that the much more common case of types with one unbounded size component will be handled efficiently.

Incidentally, this is what I currently get in existing compilers with components of type Unbounded_String. The advantage that many of us expect from these AIs is not to eliminate the hidden indirection in Unbounded_Strings; it is to eliminate the junk default initialization that occurs in many cases. (If you want to read for a limited type with some components of type Unbounded String in the above feel free. That is exactly the normal mapping for database objects that gets painful.)

>I presume you will be able to use a call on one of these
>constructor functions to initialize a limited component of a
>limited aggregate. If not, you haven't solved the problem
>in my view. That is:
>
> X : Lim2 := Lim2'(F1 => 7, F2 => Lim1_Con_Func(3,4));
>
>where Lim1_Con_Func is one of these constructor functions.
>

Definitely. Although I expect the more common/necessary case to be:

   X: Lim2 := Lim2'(Lim1_Con_Func(3,4) with F1 => 7);

>I really don't see any compiler changing its record layout just to make
>these possible. That would be hugely disruptive.
>

I don't see that either. But I do see the compiler using exactly the same record layouts for limited objects as non-limited objects with otherwise identical declarations. This is necessary anyway, since the limited type may be non-limited in part of its scope.

>From the view of the user of the function, there is always a new object
>produced. The procedure-renamed-as-function proposal is suggesting
>that the compiler does the allocation and basic initialization
>inline before going out-of-line to the procedure.
>

I understand that very clearly.
I think what you are missing is that it is what I object to about the proposal. I showed in my complete worked out example the differences, and how the workarounds end up with a constructor split notationally into two pieces. Tolerable if there are no locking constructs involved, but very painful when there are.

> In the
>"return ... do ..." proposal, the allocation still needs to
>be done prior to the call. It could be raw, uninitialized storage,
>but the space needs to exist before the call, if you are going
>to allow components to be initialized by such a call.
>

No, what Randy and I are proposing is that the allocation occurs at the point of the object creation in the return statement (possibly a "return ... do ... end;" construct). This requires either a thunk, or for the compiler to pass the address where the object is to be placed as a hidden parameter, and the size of the object to be an "extra" return parameter. (I say "extra" because there is no need for the actual return value to be returned; the caller knows where it is.) So in the common case where the size is known at the point of the call, no thunk is required, and no return value either.

So back to my initial three cases:

   Easy cases (all objects the same size): pass address, no thunk needed.

   Constructor initializing a constrained object: pass address with constraints in place, no thunk needed.

   Constructor computes constraints: pass thunk which can be called with size to get address.

I personally think that given that all three cases are possible, most uses will be cases one or three. Except that it is nice to be able to handle cases two and three with the same constructor code:

   Handle all cases: pass a constrained flag, and a thunk.

The user code then needs to be able to query this flag. My suggestion is that the 'Constrained attribute can be queried in the sequence of statements in the do..end, and if true, then the code can look in the memory returned by the thunk to see the discriminants.
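
The third case, in which the constructor computes the size and then asks the caller for space, can be modeled in ordinary Ada by making the thunk an explicit access-to-function parameter. This is only a sketch of the dialog, not a description of how any compiler implements it: in the proposal the thunk would be a hidden parameter, and every name below (Int_Array, Allocator_Thunk, Heap_Thunk, Squares_Up_To) is hypothetical.

```ada
with Ada.Text_IO;

procedure Thunk_Dialog_Demo is

   type Int_Array is array (Positive range <>) of Integer;
   type Int_Array_Access is access Int_Array;

   --  The "thunk": supplied by the caller, called by the constructor once
   --  the constructor knows how large the object must be.
   type Allocator_Thunk is
     access function (Length : Natural) return Int_Array_Access;

   --  One possible caller policy: take the space from the heap.
   function Heap_Thunk (Length : Natural) return Int_Array_Access is
   begin
      return new Int_Array (1 .. Length);
   end Heap_Thunk;

   --  The "constructor": computes the size, asks for space, fills it in place.
   function Squares_Up_To
     (Limit     : Natural;
      Get_Space : Allocator_Thunk) return Int_Array_Access
   is
      Count : Natural := 0;
   begin
      --  Only the constructor knows how many elements are needed.
      while (Count + 1) * (Count + 1) <= Limit loop
         Count := Count + 1;
      end loop;

      declare
         --  "The object I am about to create has Count elements."
         Target : constant Int_Array_Access := Get_Space (Count);
      begin
         for I in Target'Range loop
            Target (I) := I * I;
         end loop;
         return Target;
      end;
   end Squares_Up_To;

   R : constant Int_Array_Access := Squares_Up_To (30, Heap_Thunk'Access);

begin
   Ada.Text_IO.Put_Line ("Length:" & Integer'Image (R'Length));
   for I in R'Range loop
      Ada.Text_IO.Put_Line (Integer'Image (R (I)));
   end loop;
end Thunk_Dialog_Demo;
```

A different caller could pass a thunk that hands out space from a stack-like pool or from inside an enclosing record; the constructor body would not change, which is the point of routing the allocation through the caller.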
Once you realize that this model works, it becomes clear that the right choice from a user's point of view is to choose to permit the most complex case, and allow compilers to recognize the other cases and create more efficient code. But the efficiency is in terms of the number of parameters passed. So when I say I am willing to tolerate inefficiency to get full generality, the inefficiency I am talking about is sometimes passing one or two "extra" parameters which the compiler wasn't able to recognize as unnecessary.

>To do the allocation, the compiler needs to know the size
>of the result, meaning in general it needs to know the
>discriminants if the result type is discriminated. If the
>caller knows the discriminants, then we need to be sure the
>out-of-line code uses the same values for the discriminants.
>This gets tricky if there is no name that refers to the
>returned object until the out-of-line code declares one
>(e.g. via the "return Result: T := do ..." construct), since they
>can't refer to the values of the discriminants in the
>initializing expression for the return object.
>
>I suppose one way out of this conundrum is to allow references
>to discriminants of "Result" in the "". That is not the
>way normal declarations work, but it could work here.

Actually now that we are down to "bit fiddling", better IMHO is to allow assignment to T in the sequence of statements following the do. Then the combined case above can be handled elegantly as:

    ...
    return Result: Object_Type do
       if Result'Constrained then
          Result := (Disc1 => Result.Disc1, Disc2 => Result.Disc2,...);
       else
          Result := Default_Constructor(Param1, Param2);
       end if;
    end;

Of course, when the Object_Type is only limited by fiat, this is possible for constructors declared in the same package as Object_Type (or in a child package). The case where the parent type is limited should be covered by the aggregate rules so that constructors for types derived from Limited_Controlled can be written this way.
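For comparison, here is a minimal sketch of a constructor written in the extended return syntax that Ada 2005 later adopted, where the statement is closed with "end return;" rather than "end;". It does not use Result'Constrained or whole-object assignment to Result, which belong to the suggestion above and are not assumed here. Buffer, Make_Buffer and Holder are hypothetical names chosen for illustration:

```ada
with Ada.Text_IO;

procedure Constructor_Demo is

   --  A limited type whose discriminant is chosen by the constructor.
   type Buffer (Size : Natural) is limited record
      Data : String (1 .. Size);
   end record;

   function Make_Buffer (Fill : Character; Count : Natural) return Buffer is
   begin
      --  The return object is declared, and then filled in, in place.
      return Result : Buffer (Size => Count) do
         Result.Data := (others => Fill);
      end return;
   end Make_Buffer;

   --  A limited record with a limited component.
   type Holder is limited record
      F1 : Integer;
      F2 : Buffer (Size => 3);
   end record;

   --  The constructor call initializes a limited component of an aggregate.
   X : Holder := (F1 => 7, F2 => Make_Buffer ('x', 3));

begin
   Ada.Text_IO.Put_Line (X.F2.Data);
end Constructor_Demo;
```

Here the discriminant is a function of the parameters, and the call appears directly as a component of a limited aggregate, which is the usage asked about earlier in this thread.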
>The other problem, as others have indicated, is that sometimes
>we want the discriminants of the new object to be a function
>of the parameters. That seems harder in the
>"return ... do ..." approach, since we would have to somehow
>communicate the values to the calling context as well, since
>they are needed *before* the object can be allocated,
>and that needs to happen before the call (because of components).

I know that is how you (Tucker) are thinking, but I am going further and allowing the discriminants to be something that doesn't depend on the parameters in an obvious way. The classic case would be a data entry system. The constructor may return one of many variants of a record based on the data entered by the user's interaction with the constructor.

I am not oblivious to the cost of getting all this "right," which is why I have been indicating a willingness to subset the functionality for now. The problem I have with the renaming approach is that it is not obvious, perhaps not possible, to extend it later to cover all the useful cases. Originally I was proposing to cover a different subset of the useful cases than the renaming approach, but Randy and I have converged on a solution (actually with some help from Dan Eilers) that covers all cases.

>>... However, I don't see a problem with writing the rule so
>>constructors for non-limited objects can be used to change components of
>>records.
>
>I don't know what you are talking about when you say
>"change" a component. Do you mean that for a non-limited
>type, you would allow a call on a constructor function to be
>the right hand side of an assignment, whose left-hand-side
>is a component selection (or anything else, for that matter)?

Yes.

>Or do you mean the function would somehow be passed a reference
>to a preexisting object, and treat it as an IN OUT parameter?
>I don't understand what a call would look like in this latter
>case, and it is not something I have any interest in.
>

Ah, but it is something that you do have an interest in. ;-) This is exactly the case that is described by your renamed procedure approach. As explained above, the constructor approach handles that case as well. To handle all cases when the caller doesn't know whether the particular constructor called will expect existing discriminants or not requires passing a constrained flag and a thunk. And of course, this will generally be the case in Ada, since the body of the package containing the constructor will not be visible at the point of the call. However, the compiler can choose a more optimal calling sequence when it knows that the object passed has no discriminants.

>In the procedure-renamed-as-function proposal, it is true
>the *procedure* is passed an [IN] OUT parameter, but the
>procedure is *not* the "constructor." The constructor
>is the function defined by the renaming. The procedure
>is just a useful hunk of code that the constructor
>function reuses. That same procedure might be used
>for other things. The function defined by the renaming,
>on the other hand, has the properties of a "normal" function,
>namely that it is always associated with the creation of
>a new object. It is different in that there is enough
>information provided so that at the call-site, the compiler
>can generate the code to do allocation and basic initialization,
>and the out-of-line code need not worry about that.

Understood. Incidentally, please don't think of what is going on here as just an argument, or a bunch of people ganging up on Tucker. What is really going on is that there are two proposals on the table, and they have different properties. I am willing to accept the extra parameters that the constructor approach requires in some cases to get the additional functionality it allows. Tucker's approach is not as general, but it is somewhat more efficient.
I would argue that the inefficiency of my approach is not inherent, but depends on the (often private) type declaration. There will be cases where Tucker's approach is clearly more efficient, but not catastrophically so.

So neither solution dominates the other. The choice becomes a normative one, depending on how you weight different considerations. And as I said earlier, I think the time spent on examining both alternatives is worth it, since I expect the result to be one of the most heavily used features of Ada 0Y.

*************************************************************

[Editor's note: Additional discussion on this topic can be found in AI-318.]

*************************************************************

[Editor's note: At the March 2004 ARG Meeting, it was decided to fold parts of this proposal into AI-318. The rest will be dropped.]

*************************************************************