!standard 03.10.02(10) 05-12-09 AI95-00318-02/12 !standard 03.10.02(13) !standard 03.01(06) !standard 03.08(14) !standard 03.09(24) !standard 04.03.03(11) !standard 05(02) !standard 05.01(04) !standard 05.01(05) !standard 05.01(14) !standard 06.01(13) !standard 06.01(23) !standard 06.01(24) !standard 06.01(28) !standard 06.03.01(16) !standard 06.04(11) !standard 06.05(01) !standard 06.05(02) !standard 06.05(03) !standard 06.05(04) !standard 06.05(05) !standard 06.05(06) !standard 06.05(07) !standard 06.05(08) !standard 06.05(09) !standard 06.05(10) !standard 06.05(11) !standard 06.05(12) !standard 06.05(13) !standard 06.05(14) !standard 06.05(15) !standard 06.05(16) !standard 06.05(17) !standard 06.05(18) !standard 06.05(19) !standard 06.05(20) !standard 06.05(21) !standard 06.05(22) !standard 06.05(24) !standard 07.03(19) !standard 07.05(02) !standard 07.05(08) !standard 07.05(09) !standard 07.05(23) !standard 07.06(17.1) !standard 07.06.01(02) !standard 07.06.01(18) !standard 08.01(4) !standard 09.05.02(29) !standard 13.08(10) !class amendment 02-10-09 !status Amendment 200Y 04-09-23 !status WG9 approved 04-11-18 !status ARG Approved 5-0-4 04-09-23 !status work item 04-07-28 !status ARG Approved 9-0-2 04-06-17 !status work item 03-05-23 !status received 02-10-09 !priority Medium !difficulty Medium !subject Limited and anonymous access return types !summary A new extended syntax is proposed for the return statement, providing a name for the new object being created as a result of a call on the function. This new syntax can be used to support returning limited objects from a function and more generally to reduce the copying that might be required when a function returns a complex object, a controlled object, etc. The existing ability to return by reference is replaced by an ability to have an anonymous access type as a return type. !problem We already have a proposal (AI-287) for allowing aggregates of a limited type, by requiring that the aggregate be built directly in the target object rather than being copied into the target. But aggregates can only be used with non-private types. Limited private types could not be initializable at their declaration point. It would be natural to allow functions to return limited objects, so long as the object could be built directly in the "target" of the function call, which could be a newly created object being initialized, or simply a parameter to another subprogram call. When returning a limited type it may be desirable to perform some other initialization to the object after it has been created, but before returning from the function. This is difficult to do while still creating the object directly in its "final" location. Currently functions that return a limited private type may have an accessibility check performed on the object returned, depending on a property ("return-by-reference-ness") which is not generally visible based on the partial view of the type. This means that a function that works initially may stop working if the full type of the result type is changed to include, say, a limited tagged component, or some other component that is return-by-reference. A function whose result type turns out to be return-by-reference cannot be allowed where a new object is required. However, there is nothing in the declaration of such a function that indicates it returns by reference. 
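As a concrete illustration of the last point (all names in this sketch are invented; the behavior described is that of the Ada 95 return-by-reference rules in 6.5(17-21)):

   package Counters is
      type Counter is limited private;
      function Make (Initial : Natural) return Counter;
   private
      type Counter is record        -- nonlimited full view: Counter returns by copy
         Value : Natural := 0;
      end record;
   end Counters;

   package body Counters is
      function Make (Initial : Natural) return Counter is
         Result : Counter;
      begin
         Result.Value := Initial;
         return Result;             -- fine: the local object is copied back
      end Make;
   end Counters;

If the full view is later changed to, say,

      type Counter is limited record
         Lock  : Mutex;             -- Mutex is some protected type (invented here)
         Value : Natural := 0;
      end record;

then Counter becomes a return-by-reference type, and the unchanged "return Result;" now fails the accessibility check of 6.5(17-20) and raises Program_Error, because the local Result is deeper than the master that elaborated the function body -- even though neither the visible part of Counters nor the text of Make changed.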
The capability to return-by-reference could be useful for nonlimited types, but it becomes even more useful if a call on such a function could be treated as a variable, so it could be used on the left-hand side of an assignment. These capabilities can be provided in the language without introducing the conceptual oddity of return-by-reference. A function returning an access value allows the effect of return-by-reference, and doesn't require changing the language, at the cost of a bit of extra verbosity in some cases (the need to dereference the result).

!proposal

Anonymous access types are permitted for a function result type:

   parameter_and_result_profile ::=
      [formal_part] RETURN subtype_mark
    | [formal_part] RETURN access_definition

An anonymous access type used as the result type of a function is called an *access result type*. The accessibility level of an access result type is that of the declaration containing the parameter_and_result_profile.

-------------

An extended syntax for the return statement is proposed:

   RETURN defining_identifier : [ALIASED] return_subtype_indication [:= expression]
      [DO handled_sequence_of_statements END RETURN];

Such an extended return statement is permitted only immediately within a function. The specified defining_identifier names the object that is the result of a call on the function. If the expression is present, it provides the initial value for the result object. If not, the result object is default initialized.

If the handled_sequence_of_statements is present, it is executed after initializing the result object. Within the handled_sequence_of_statements, the defining_identifier denotes a variable view of the result object with nominal subtype given by the return_subtype_indication. When the handled_sequence_of_statements completes, the function is complete.

Note: An expression-less return statement is permitted within the handled_sequence_of_statements, similar to the way that accept statements work.

A call of a function with a limited result type may be used in the same contexts where we have proposed to allow aggregates of a limited type, namely contexts where a new object is being created (or can be):

 1) Initializing a newly declared object (including a result object identified in an extended return statement)
 2) Default initialization of a record component
 3) Initialized allocator
 4) Component of an aggregate
 5) IN formal object in a generic instantiation (including as a default)
 6) Expression of a return statement
 7) IN parameter in a function call (including as a default expression)

In addition, since the result of a function call is a name in Ada 95, the following contexts would be permitted, with the same semantics as creating a new temporary constant object, and then creating a reference to it:

 8) Declaring an object that is the renaming of a function call.
 9) Use of the function call as a prefix to 'Address

In other words, it would be permitted in *any* context where limited types are permitted. With the new proposals, that is pretty much *any* context where a "name" that denotes an object or value is permitted, except as the right-hand side of an assignment statement.

This proposal assumes that AI-287 is adopted; it does not repeat the changes needed to allow function_calls in the contexts listed above.

!wording

Add at the end of 3.1(6):

   In addition, an extended_return_statement is a declaration of its defining_identifier.
Add before 3.8(14): If a record_type_declaration includes the reserved word limited, the type is called an *explicitly limited record* type. In 3.9(24), change "return expression" to "return object". Replace 3.10.2(10) with: For any function, the accessibility level of the result object is that of the execution of the called function. Add after 3.10.2(13): * The accessibility level of the anonymous access type of an access result type (see 6.5) is the same as that of the associated function or access-to-subprogram type. Change 4.3.3(11) as follows: For an ..., the expression of a [return_statement]{return statement}, the initialization expression in an object_declaration, or ... function [result]{return object}, object, or ... Replace "return_statement" with "return statement" in 5(2). Change "return_statement" in 5.1(4) to "simple_return_statement". Add "extended_return_statement" to 5.1(5). Replace "return_statement" with "return statement" in 5.1(14). Change 6.1(13) to: parameter_and_result_profile ::= [formal_part] RETURN subtype_mark | [formal_part] RETURN access_definition Add the following to 6.1(23): The nominal subtype of a function result is the subtype denoted by the subtype_mark, or defined by the access_definition, in the parameter_and_result_profile. Change 6.1(24) to: An *access parameter* is a formal in* parameter specified by an access_definition. An *access result type* is a function result type specified by an access_definition. An access parameter or result type is of an anonymous general access-to-variable type (see 3.10). Access parameters allow dispatching calls to be controlled by access values. Change 6.1(28) to: * For any non-access result, the nominal subtype of the function result. * For any access result type of an access-to-object type, the designated subtype of the result type. * For any access result type of an access-to-subprogram type, the subtypes of the profile of the result type. Modify 6.3.1(16) as follows: Two profiles are mode conformant if they are type-conformant, and corresponding parameters have identical modes, and, for access parameters {or access result types}, the designated subtypes statically match. Replace "return_statement" with "return statement" in 6.4(11). Replace clause 6.5 with the following: 6.5 Return Statements A simple_return_statement or extended_return_statement (collectively called a *return statement*) is used to complete the execution of the innermost enclosing subprogram_body, entry_body, or accept_statement. Syntax simple_return_statement ::= return [expression]; extended_return_statement ::= RETURN defining_identifier : [ALIASED] return_subtype_indication [:= expression] [DO handled_sequence_of_statements END RETURN]; return_subtype_indication ::= subtype_indication | access_definition Name Resolution Rules The *result subtype* of a function is the subtype denoted by the subtype_mark, or defined by the access_definition, after the reserved word RETURN in the profile of the function. The expected type for the expression, if any, of a simple_return_statement is the result type of the corresponding function. The expected type for the expression of an extended_return_statement is that of the return_subtype_indication. Legality Rules A return statement shall be within a callable construct, and it applies to the innermost callable construct or extended_return_statement that contains it. A return statement shall not be within a body that is within the construct to which the return statement applies. 
A function body shall contain at least one return statement that applies to the function body, unless the function contains code_statements. A simple_return_statement shall include an expression if and only if it applies to a function body. An extended_return_statement shall apply to a function body. For an extended_return_statement that applies to a function body: * If the result subtype of the function is defined by a subtype_mark, the return_subtype_indication shall be a subtype_indication. The type of the subtype_indication shall be the result type of the function. If the result subtype of the function is constrained, then the subtype defined by the subtype_indication shall also be constrained and shall statically match this result subtype. If the result subtype of the function is unconstrained, then the subtype defined by the subtype_indication shall be a definite subtype, or there shall be an expression. * If the result subtype of the function is defined by an access_definition, the return_subtype_indication shall be an access_definition. The subtype defined by the access_definition shall statically match the result subtype of the function. The accessibility level of this anonymous access subtype is that of the result subtype. For any return statement that applies to a function body: * If the result subtype of the function is limited, then the expression of the return statement (if any) shall be an aggregate, a function call (or equivalent use of an operator), or a qualified_expression or parenthesized expression whose operand is one of these. AARM Note: In other words, if limited, the expression must produce a "new" object, rather than being the name of a preexisting object (which would imply copying). Static Semantics Within an extended_return_statement, the *return object* is declared with the given defining_identifier, with the nominal subtype defined by the return_subtype_indication. Dynamic Semantics For the execution of an extended_return_statement, the subtype_indication or access_definition is elaborated. This creates the nominal subtype of the return object. If there is an expression, it is evaluated and converted to the nominal subtype (which might raise Constraint_Error -- see 4.6), and the converted value is assigned to the return object. Otherwise, the return object is initialized by default as for a stand-alone object of its nominal subtype (see 3.3.1). If the nominal subtype is indefinite, the return object is constrained by its initial value. For the execution of a simple_return_statement, the expression (if any) is first evaluated, converted to the result subtype, and then assigned to the anonymous *return object*. If the result type of a function is a specific tagged type, the tag of the return object is that of the result type. If the result type is class-wide, the tag of the return object is that of the value of the expression. AARM Ramification: The first sentence is true even if the tag of the expression is different, which could happen if the expression were a view conversion or a dereference of an access value. Note that for a limited type, because of the restriction to aggregates and function calls (and no conversions), the tag will already match. AARM Reason: The first rule ensures that a function whose result type is a specific tagged type always returns an object whose tag is that of the result type. 
This is important for dispatching on controlling result, and allows the caller to allocate the appropriate amount of space to hold the value being returned (assuming there are no discriminants).

For the execution of an extended_return_statement, the handled_sequence_of_statements is executed. Within this handled_sequence_of_statements, the execution of a simple_return_statement that applies to the extended_return_statement causes a transfer of control that completes the extended_return_statement.

Upon completion of a return statement that applies to a callable construct, a transfer of control is performed which completes the execution of the callable construct, and returns to the caller. In the case of a function, the function_call denotes a constant view of the return object.

Examples

Examples of return statements:

   return;                         -- in a procedure body, entry_body,
                                   -- accept_statement, or extended_return_statement

   return Key_Value(Last_Index);   -- in a function body

   return Node : Cell do           -- in a function body, see 3.10.1 for Cell
      Node.Value := Result;
      Node.Succ := Next_Node;
   end return;

Delete all but the first sentence of 7.3(19).

The new rule added before 7.5(2) by AI-287 should say:

   ...unless it is an aggregate, a function_call, or a parenthesized...

A new bullet should be added to the list here:

   * the expression of a return statement (see 6.5)

The following should be added to the new rule added after 7.5(8) by AI-287:

   For a function_call of a type with a part that is of a task, protected, or explicitly limited record type that is used to initialize an object as allowed above, the implementation shall not create a separate return object (see 6.5) for the function_call. The function_call shall be constructed directly in the new object.

Similarly, the replacement note of 7.5(9) of AI-287 should say "aggregate or function_call" in each occurrence.

Replace 7.5(23) by:

   The fact that the full view of File_Name is explicitly declared limited means that parameter passing will always be by reference and function results will always be built directly in the result object (see 6.2 and 6.5).

Replace 7.6(17.1) with:

   For an aggregate of a controlled type whose value is assigned, other than by an assignment_statement, the implementation shall not create a separate anonymous object for the aggregate. The aggregate value shall be constructed directly in the target of the assignment operation and Adjust is not called on the target object.

Replace part of 7.6.1(2) and 7.6.1(18) as follows:

   "...[exit_, return_, goto_]{exit_statement, return statement, goto_statement},..."

[Editor's note: removing the "cute" wording here improves searchability of the Standard; we should avoid using abbreviations of technical terms.]

Add after 8.1(4):

   * an extended_return_statement;

Replace "return_statement" with "return statement" in 9.5.2(29).

!example

Here is an example of a function with a limited result type using an extended return statement:

function Make_Obj(Param : Natural) return Lim_Type is
begin
   return Result : Lim_Type do  -- the "return" object
      -- Finish the initialization of the "return" object.
      Further_Processing(Result, Param);
   end return;
end Make_Obj;

Here is a similar function that returns an access-to-limited type:

function Make_Obj(Param : Natural) return access Lim_Type is
begin
   return Result : access Lim_Type do  -- The "return" object
      Result := new Lim_Type;  -- storage pool associated with scope where
                               -- function declared
      Further_Processing(Result.all, Param);
   end return;
end Make_Obj;

Here is an abstraction which uses functions with access result types, to support an extensible array abstraction (aka vector):

generic
   type Element is private;
   type Index is (<>);
package Extensible_Arrays is
   pragma Assert(Index'First > Index'Base'First);
      -- so can have empty arrays

   type Ext_Array is private;
      -- Extensible array, initially Last(EA) = Index'First-1

   procedure Set_Elem(EA : in out Ext_Array; I : Index; Elem : Element);
      -- Set element, extend array if necessary
      -- Postcondition: Last(EA) >= I

   function Last(EA : Ext_Array) return Index'Base;
      -- Returns index of current last element of array

   function Elem(EA : Ext_Array; I : Index) return access Element;
      -- Refer to existing element
      -- Precondition: I in Index'First .. Last(EA)
      -- Result can be implicitly dereferenced.

   procedure Set_Empty(EA : in out Ext_Array);
      -- Set array back to empty
      -- Postcondition: Last(EA) = Index'First - 1

private
   type Elem_Array is array(Index range <>) of aliased Element;
      -- We define an array-of-aliased so can implement "Elem"

   type Elem_Array_Ptr is access Elem_Array;
      -- We want a named access type so can use unchecked deallocation

   type Ext_Array is record
      Last : Index'Base := Index'Base'Pred(Index'First);
      Data : Elem_Array_Ptr;
         -- This is reallocated as necessary to accommodate at least
         -- Index'First .. Last elements
   end record;
end Extensible_Arrays;

with Extensible_Arrays, Ada.Text_IO;
procedure Ext_Array_Test(Max : Positive) is
   package Ext_Int_Arrays is new Extensible_Arrays(Element => Integer, Index => Positive);
   type Ext_Int_Array is new Ext_Int_Arrays.Ext_Array;

   X : Ext_Int_Array;  -- Initially empty
begin
   -- Initialize table of squares, extending as necessary
   for I in 1..Max loop
      Set_Elem(X, I, Elem => I**2);
   end loop;

   -- Add one to each of the elements with indices up to Max/2
   for I in 1..Max/2 loop
      Elem(X, I).all := Elem(X, I).all + 1;
   end loop;

   -- Now print out the table
   for I in 1..Last(X) loop
      Ada.Text_IO.Put_Line(Integer'Image(I) & " => " &
                           Integer'Image(Elem(X, I).all));
   end loop;

   Set_Empty(X);  -- All done
end Ext_Array_Test;

!discussion

In meetings with Ada users, there has been a general sense that if limited aggregates are provided in Ada 200Y, it would be desirable to also provide limited function returns which could act as "constructor" functions. Just allowing a function whose whole body is a return statement returning an aggregate (or another function call) does not give the programmer much flexibility. What they would like is to be able to create the object being returned and then initialize it further somehow, perhaps by calling a procedure, doing a loop (as in the examples above), etc. This requires a named object.

However, to avoid copying, we need this object to be created in its final "resting place", i.e. in the target of the function call. This might be in the "middle" of some enclosing composite object the caller is initializing, or it might be in the heap, or it might be a stand-alone local object. Because the implementation needs to create the result object in a place determined by the caller, it is important that the declaration of the object be distinguished in some way.
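For instance, under this proposal a single constructor such as Make_Obj from the example above could be asked to build its result in any of these places (the record type Wrapper and the access type Lim_Ptr are invented for this sketch):

   type Lim_Ptr is access Lim_Type;

   type Wrapper is limited record
      Inner : Lim_Type;
      Count : Natural;
   end record;

   Standalone : Lim_Type := Make_Obj(10);                 -- a stand-alone local object
   On_Heap    : Lim_Ptr  := new Lim_Type'(Make_Obj(20));  -- an "entire" heap object
   Composite  : Wrapper  := (Inner => Make_Obj(30),       -- the "middle" of an enclosing
                             Count => 0);                 -- composite object

In each case it is the caller, not Make_Obj, that decides where the result object lives.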
By declaring it as part of an extended return statement, we have a way for the programmer to indicate that this is *the* object to be returned. Clearly we don't want to allow extended return statements to be nested. Because it may be necessary to do some computing before deciding exactly how the result object should be declared, we permit the extended return statement to occur wherever a normal return statement is permitted. So different branches of an if or case statement could have their own extended return statements, each with its own named result object. Note that we have allowed the user to declare the result object as "aliased." This seems like a natural thing which might be wanted, so you could initialize a circularly-linked list header to point at itself, etc. Note that we had discussed various mechanisms where information from the calling context would be available inside the function at the *language* level. In particular, it would be possible to refer to the values of the discriminants or bounds of the object being initialized, presuming it was constrained, *within* the subtype indication and initializing expression, if any. Ultimately this capability was not included in this proposal, as it created a series of somewhat complicated restrictions on usage and made the implementation that much more difficult. Note that the implementation may still need to pass in information from the calling context, depending on the run-time model, because if the type is "really" limited (e.g. it is limited tagged, or contains a task or a protected object), then the new object must be built in its final resting place. In many run-time models, that means the storage needs to be allocated at the call-site if the object being initialized is a component of some larger object. However, by not allowing the *programmer* to refer to this contextual information at the language level, we give the implementation more flexibility in how it solves the build-in-place requirement for "really" limited objects. See the discussion below about implementation approaches. The syntax for extended return statements was initially proposed early on, but when this AI was first written up, we proposed instead a revised object declaration syntax where the word "return" was used almost like the word "constant," as a qualifier. This was somewhat more economical in terms of syntax and indenting, but was not felt to be as clear semantically as this current syntax. We have eliminated the capability for returning by reference, in favor of returning a value of an anonymous access type. A rejected alternative proposal (AI-318-1) proposed to make return-by-reference a separate capability, triggered by the presence of the reserved word "ALIASED" in the function profile. This was felt by some reviewers to be enshrining the confusing notion of return-by-reference, which earlier had been buried in a discussion of certain limited types. Furthermore, the implementation model of return by reference was clearly to return a "reference" (effectively an access value) to the result object. Making this explicit presumably makes the feature easier to understand, and we can also piggyback on the usual accessibility checks, rather than have to invent special ones associated with a return by reference. The capability to return an anonymous access type goes well with the other changes allowing anonymous access types in more contexts. 
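To illustrate the recoding this implies, consider an Ada 95 return-by-reference function (Buffer, Table, and Fill are invented names; Buffer is assumed to be some limited type):

   Table : array (1 .. 100) of aliased Buffer;

   function Nth (I : Positive) return Buffer is  -- Ada 95: returns by reference,
   begin                                         -- since Buffer is limited
      return Table(I);
   end Nth;

   ...
   Fill (Nth(3));

Under this proposal the return expression of a limited function must be an aggregate or a function call, so a function like Nth is instead written with its reference made explicit:

   function Nth (I : Positive) return access Buffer is
   begin
      return Table(I)'Access;
   end Nth;

   ...
   Fill (Nth(3).all);   -- the occasional extra dereference mentioned in the !problem section

The access value that the old feature returned implicitly is now visible in the profile, and the ordinary accessibility rules for anonymous access types apply to it.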
We have kept the implementation simple by making the accessibility level of the result type the same as that of the associated function (or access-to-subprogram type).

POSSIBLE IMPLEMENTATION APPROACHES

The implementation of the extended return statement for nonlimited types should minimize the number of copies, but may still require a copy in some implementation models and in some calling contexts.

The implementation of the extended return statement for limited result types is straightforward if the result subtype is constrained. It is essentially equivalent to a procedure with an OUT parameter -- the caller allocates space for the target object, perhaps does some of the "implicit" initialization for tags, discriminants, tasks, or protected components, etc., and passes its address to the called routine, which uses it for the "return" object. Nonlimited controlled components can still require some fancy footwork, since they can be explicitly initialized, so default initializing them would be inappropriate. But compilers already have to deal with returning nonlimited controlled objects, so presumably this won't create an insurmountable burden.

If the result subtype is unconstrained, then there are two basic possibilities:

1) The target object's (nominal) subtype is definite, and either constrained or the size of the object is independent of the constraints (e.g. allocate-the-max is used for the object); the target object might be a component of a larger object.

2) The target object's nominal subtype is unconstrained, and its size is to be determined by the result returned from the function; the target object must be a stand-alone object, or an "entire" heap object.

In the first case, the caller determines the size of the target object and can allocate space for it; in the second, the caller cannot preallocate space for the target object, and must rely on the called routine allocating space for it in an "appropriate" place. The code for the called routine must handle both of these cases. One reasonable way to do so is for the caller to provide a "storage pool" for the result.

In the first case, this storage "pool" has space for exactly one object of a given maximum size. Its Allocate routine is trivial. It just checks to see if the size is no greater than the space available, and then returns the preallocated (target) address. In the second case, the storage pool is either the storage pool associated with the initialized allocator at the call site, or a storage pool that represents a secondary stack, or equivalent, used for returning objects of unknown size from a function. In either case, the function would return the address of the new object.

A "bare" storage pool may not be enough in general. If the type has any task parts, then these tasks must be placed on an activation list determined by the calling context. They may also be linked onto a master record of some sort, unless this is deferred until activation occurs. Note that the tasks cannot be activated until after returning from the call, since they may have to be activated in conjunction with other tasks having the same master.

If the type has any controlled or protected parts, then the object as a whole, or the individual parts, may need to be added to a cleanup list determined by the calling context.
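Before going on, here is a minimal sketch of the single-object "pool" described a few paragraphs above, written as if it were ordinary Ada (a real compiler would supply something like this inside its run-time support rather than as user code; all names here are invented):

   with System;                  use System;
   with System.Storage_Pools;    use System.Storage_Pools;
   with System.Storage_Elements; use System.Storage_Elements;

   package Build_In_Place_Support is

      type Single_Object_Pool is new Root_Storage_Pool with record
         Target   : Address       := Null_Address;  -- slot preallocated by the caller
         Max_Size : Storage_Count := 0;             -- size of the caller's slot
      end record;

      procedure Allocate
        (Pool                     : in out Single_Object_Pool;
         Storage_Address          : out Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      procedure Deallocate
        (Pool                     : in out Single_Object_Pool;
         Storage_Address          : Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      function Storage_Size (Pool : Single_Object_Pool) return Storage_Count;

   end Build_In_Place_Support;

   package body Build_In_Place_Support is

      procedure Allocate
        (Pool                     : in out Single_Object_Pool;
         Storage_Address          : out Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
      begin
         -- Alignment is assumed to be satisfied already by the caller's slot.
         if Size_In_Storage_Elements > Pool.Max_Size then
            raise Storage_Error;          -- the result does not fit the caller's slot
         end if;
         Storage_Address := Pool.Target;  -- just hand back the preallocated address
      end Allocate;

      procedure Deallocate
        (Pool                     : in out Single_Object_Pool;
         Storage_Address          : Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
      begin
         null;  -- the caller owns the storage; nothing to free
      end Deallocate;

      function Storage_Size (Pool : Single_Object_Pool) return Storage_Count is
      begin
         return Pool.Max_Size;
      end Storage_Size;

   end Build_In_Place_Support;

In the second (unconstrained) case, the caller would instead pass the pool of the initialized allocator or a pool representing the secondary stack, with the same Allocate interface.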
If the type has any access discriminants, then some kind of accessibility level will need to be provided, since the access discriminant may only be initialized to point to an object whose accessibility level is no deeper than that of the storage pool where the new object is being allocated.

What this means is that rather than passing just a reference to a storage pool, it is more likely the caller will pass a reference to a structure which in turn refers to:

   - a storage pool,
   - an accessibility level,
   - an activation list,
   - the associated master,
   - a cleanup list

Supporting a function result of an anonymous access type presents no special challenges since we have defined the accessibility level of the result type to be the same as that of the associated function or access-to-subprogram declaration. Hence, it is as though a named access type were declared and then used as the result type, from a run-time model point of view. There is no need for any (new) run-time accessibility checking.

DEALING WITH EXCEPTIONS

There was some concern about what would happen if an exception were propagated by an extended return statement, and then the same or some other extended return statement were reentered. There doesn't seem to be a real problem. The return object doesn't really exist outside the function until the function returns, so it can be restored to its initial state on call of the function if an exception is propagated from an extended return statement. Once restored to its initial state, there seems no harm in starting over in another extended_return_statement.

THE BUILD-IN-PLACE IMPLEMENTATION REQUIREMENT

The intent of this feature is that there never is copying of a "really" limited object. We have added an Implementation Requirement to ensure that this is really the case. It has been argued that this requirement is not needed because any such copies are semantically neutral. But no copies of a self-referencing object could ever really be semantically neutral. Moreover, the definition of object creation in 3.3(19) says that the subcomponents are assigned from the expression already evaluated. This clearly must be superseded. In addition, we want to tell the reader (Ada user and implementers alike) that function calls have changed. In Ada up to this point, function calls were always about copying (at least logically, 7.6(21) allows omitting the copy). That is emphatically not the case in the Amendment; indeed in a similar case for aggregates, we included such a requirement in the Corrigendum.

!corrigendum 3.1(6)

@drepl
Each of the following is defined to be a declaration: any @fa; an @fa; a @fa; a @fa; a @fa; a @fa; a @fa; an @fa; an @fa; a @fa; a @fa.
@dby
Each of the following is defined to be a declaration: any @fa; an @fa; a @fa; a @fa; a @fa; a @fa; a @fa; an @fa; an @fa; a @fa; a @fa. In addition, an @fa is a declaration of its @fa.

!corrigendum 3.8(14)

@dinsb
The @fa of a @fa defines the (nominal) subtype of the component. If the reserved word @b appears in the @fa, then the component is aliased (see 3.10).
@dinst
If a @fa includes the reserved word @b, the type is called an @i type.

!corrigendum 3.9(24)

@drepl
@xbullet
@dby
@xbullet

!corrigendum 3.10.2(10)

@drepl
For a function whose result type is a return-by-reference type, the accessibility level of the result object is the same as that of the master that elaborated the function body. For any other function, the accessibility level of the result object is that of the execution of the called function.
@dby For any function, the accessibility level of the result object is that of the execution of the called function. !corrigendum 3.10.2(13) @dinsa @xbullet, this is the accessibility level of the execution of the called subprogram.> @dinst @xbullet !corrigendum 4.3.3(11) @drepl For an @fa, an @fa, the @fa of a @fa, the initialization expression in an @fa, or a @fa (for a parameter or a component), when the nominal subtype of the corresponding formal parameter, generic formal parameter, function result, object, or component is a constrained array subtype, the applicable index constraint is the constraint of the subtype; @dby For an @fa, an @fa, the @fa of a return statement, the initialization expression in an @fa, or a @fa (for a parameter or a component), when the nominal subtype of the corresponding formal parameter, generic formal parameter, function return object, object, or component is a constrained array subtype, the applicable index constraint is the constraint of the subtype; !corrigendum 5(2) @drepl This section describes the general rules applicable to all @fas. Some @fas are discussed in later sections: @fas and @fas are described in 6, "Subprograms". @fas, @fas, @fas, @fas, @fas, and @fas are described in 9, "Tasks and Synchronization". @fas are described in 11, "Exceptions", and @fas in 13. The remaining forms of @fas are presented in this section. @dby This section describes the general rules applicable to all @fas. Some @fas are discussed in later sections: @fas and return statements are described in 6, "Subprograms". @fas, @fas, @fas, @fas, @fas, and @fas are described in 9, "Tasks and Synchronization". @fas are described in 11, "Exceptions", and @fas in 13. The remaining forms of @fas are presented in this section. !corrigendum 5.1(4) @drepl @xcode<@fa> @dby @xcode<@fa> !corrigendum 5.1(5) @drepl @xcode<@fa> @dby @xcode<@fa> !corrigendum 5.1(14) @drepl A @i is the run-time action of an @fa, @fa, @fa, or @fa, selection of a @fa, raising of an exception, or an abort, which causes the next action performed to be one other than what would normally be expected from the other rules of the language. As explained in 7.6.1, a transfer of control can cause the execution of constructs to be completed and then left, which may trigger finalization. @dby A @i is the run-time action of an @fa, return statement, @fa, or @fa, selection of a @fa, raising of an exception, or an abort, which causes the next action performed to be one other than what would normally be expected from the other rules of the language. As explained in 7.6.1, a transfer of control can cause the execution of constructs to be completed and then left, which may trigger finalization. !corrigendum 6.1(13) @drepl @xcode<@fa@ft<@b>@fa< subtype_mark>> @dby @xcode<@fa@ft<@b>@fa< subtype_mark | [formal_part] >@ft<@b>@fa< access_definition>> !corrigendum 6.1(23) @drepl The nominal subtype of a formal parameter is the subtype denoted by the @fa, or defined by the @fa, in the @fa. @dby The nominal subtype of a formal parameter is the subtype denoted by the @fa, or defined by the @fa, in the @fa. The nominal subtype of a function result is the subtype denoted by the @fa, or defined by the @fa, in the @fa. !corrigendum 6.1(24) @drepl An @i is a formal @b parameter specified by an @fa. An access parameter is of an anonymous general access-to-variable type (see 3.10). Access parameters allow dispatching calls to be controlled by access values. @dby An @i is a formal @b parameter specified by an @fa. 
An @i is a function result type specified by an @fa. An access parameter or result type is of an anonymous general access-to-variable type (see 3.10). Access parameters allow dispatching calls to be controlled by access values. !corrigendum 6.1(28) @drepl @xbullet @dby @xbullet @xbullet @xbullet !corrigendum 6.3.1(16) @drepl Two profiles are @i if they are type-conformant, and corresponding parameters have identical modes, and, for access parameters, the designated subtypes statically match. @dby Two profiles are @i if they are type-conformant, and corresponding parameters have identical modes, and, for access parameters or access result types, the designated subtypes statically match. !corrigendum 6.4(11) @drepl The exception Program_Error is raised at the point of a @fa if the function completes normally without executing a @fa. @dby The exception Program_Error is raised at the point of a @fa if the function completes normally without executing a return statement. !corrigendum 6.5(1) @drepl A @fa is used to complete the execution of the innermost enclosing @fa, @fa, or @fa. @dby A @fa or @fa (collectively called a @i) is used to complete the execution of the innermost enclosing @fa, @fa, or @fa. !corrigendum 6.5(2) @drepl @xcode<@fa@ft<@b>@fa< [expression];>> @dby @xcode<@fa@ft<@b>@fa< [expression];>> @xcode<@fa@ft<@b>@fa< defining_identifier : [>@ft<@b>@fa<] return_subtype_indication [:= expression] [>@ft<@b>@fa< handled_sequence_of_statements >@ft<@b>@fa<];>> @xcode<@fa> !corrigendum 6.5(03) @drepl The @fa, if any, of a @fa is called the @i. The @i of a function is the subtype denoted by the @fa after the reserved word @b in the profile of the function. The expected type for a return expression is the result type of the corresponding function. @dby The @i of a function is the subtype denoted by the @fa, or defined by the @fa, after the reserved word @b in the profile of the function. The expected type for the @fa, if any, of a @fa is the result type of the corresponding function. The expected type for the @fa of an @fa is that of the @fa. !corrigendum 6.5(04) @drepl A @fa shall be within a callable construct, and it @i the innermost one. A @fa shall not be within a body that is within the construct to which the @fa applies. @dby A return statement shall be within a callable construct, and it @i the innermost callable construct or @fa that contains it. A return statement shall not be within a body that is within the construct to which the return statement applies. !corrigendum 6.5(05) @drepl A function body shall contain at least one @fa that applies to the function body, unless the function contains @fas. A @fa shall include a return expression if and only if it applies to a function body. @dby A function body shall contain at least one return statement that applies to the function body, unless the function contains @fas. A @fa shall include an @fa if and only if it applies to a function body. An @fa shall apply to a function body. For an @fa that applies to a function body: @xbullet, the @fa shall be a @fa. The type of the @fa shall be the result type of the function. If the result subtype of the function is constrained, then the subtype defined by the @fa shall also be constrained and shall statically match this result subtype. If the result subtype of the function is unconstrained, then the subtype defined by the @fa shall be a definite subtype, or there shall be an @fa.> @xbullet, the @fa shall be an @fa. 
The subtype defined by the @fa shall statically match the result subtype of the function. The accessibility level of this anonymous access subtype is that of the result subtype.> For any return statement that applies to a function body: @xbullet of the return statement (if any) shall be an @fa, a function call (or equivalent use of an operator), or a @fa or parenthesized expression whose operand is one of these.> @i<@s8> Within an @fa, the @i is declared with the given @fa, with the nominal subtype defined by the @fa. !corrigendum 6.5(06) @drepl For the execution of a @fa, the @fa (if any) is first evaluated and converted to the result subtype. @dby For the execution of an @fa, the @fa or @fa is elaborated. This creates the nominal subtype of the return object. If there is an @fa, it is evaluated and converted to the nominal subtype (which might raise Constraint_Error -- see 4.6) and then the converted value becomes the initial value of the return object. Otherwise, the return object is initialized by default as for a stand-alone object of its nominal subtype (see 3.3.1). If the nominal subtype is indefinite, the return object is constrained by its initial value. For the execution of a @fa, the @fa (if any) is first evaluated, converted to the result subtype, and then assigned to the anonymous @i. !corrigendum 6.5(07) @ddel If the result type is class-wide, then the tag of the result is the tag of the value of the @fa. !corrigendum 6.5(08) @drepl If the result type is a specific tagged type: @dby If the result type of a function is a specific tagged type, the tag of the return object is that of the result type. If the result type is class-wide, the tag of the return object is that of the value of the expression. A check is made that the accessibility level of the type identified by the tag of the result is not deeper than that of the master that elaborated the function body. If this check fails, Program_Error is raised. !corrigendum 6.5(09) @ddel @xbullet !corrigendum 6.5(10) @ddel @xbullet !corrigendum 6.5(11) @ddel A type is a @i type if it is a descendant of one of the following: !corrigendum 6.5(12) @ddel @xbullet !corrigendum 6.5(13) @ddel @xbullet !corrigendum 6.5(14) @ddel @xbullet in its declaration;> !corrigendum 6.5(15) @ddel @xbullet !corrigendum 6.5(16) @ddel @xbullet !corrigendum 6.5(17) @ddel If the result type is a return-by-reference type, then a check is made that the return expression is one of the following: !corrigendum 6.5(18) @ddel @xbullet that denotes an object view whose accessibility level is not deeper than that of the master that elaborated the function body; or> !corrigendum 6.5(19) @ddel @xbullet whose operand is one of these kinds of expressions.> !corrigendum 6.5(20) @ddel The exception Program_Error is raised if this check fails. !corrigendum 6.5(21) @ddel For a function with a return-by-reference result type the result is returned by reference; that is, the function call denotes a constant view of the object associated with the value of the return expression. For any other function, the result is returned by copy; that is, the converted value is assigned into an anonymous constant created at the point of the @fa, and the function call denotes that object. !corrigendum 6.5(22) @drepl Finally, a transfer of control is performed which completes the execution of the callable construct to which the @fa applies, and returns to the caller. @dby For the execution of an @fa, the @fa is executed. 
Within this @fa, the execution of a @fa that applies to the @fa causes a transfer of control that completes the @fa. Upon completion of a return statement that applies to a callable construct, a transfer of control is performed which completes the execution of the callable construct, and returns to the caller. In the case of a function, the @fa denotes a constant view of the return object. !corrigendum 6.5(24) @drepl @xcode<@b; --@ft<@i< in a procedure body, >>@fa@ft<@i<, or >>@fa @b Key_Value(Last_Index); --@ft<@i< in a function body>>> @dby @xcode<@b; --@ft<@i< in a procedure body, >>@fa@ft<@i<,>> -- @fa@ft<@i<, or >>@fa> @xcode<@b Key_Value(Last_Index); --@ft<@i< in a function body>>> @xcode<@b Node : Cell @b --@ft<@i< in a function body, see 3.10.1 for Cell>> Node.Value := Result; Node.Succ := Next_Node; @b;> !corrigendum 7.3(19) @drepl Declaring a private type with an @fa is a way of preventing clients from creating uninitialized objects of the type; they are then forced to initialize each object by calling some operation declared in the visible part of the package. If such a type is also limited, then no objects of the type can be declared outside the scope of the @fa, restricting all object creation to the package defining the type. This allows complete control over all storage allocation for the type. Objects of such a type can still be passed as parameters, however. @dby Declaring a private type with an @fa is a way of preventing clients from creating uninitialized objects of the type; they are then forced to initialize each object by calling some operation declared in the visible part of the package. !corrigendum 7.5(2) !comment This rule only talks about function_calls, because those are only !comment appropriate here. The conflict text handles the combination of !comment function_calls and aggregates. @dinsb If a tagged record type has any limited components, then the reserved word @b shall appear in its @fa. @dinst In the following contexts, an @fa of a limited type is not permitted unless it is a @fa or a parenthesized @fa or @fa whose operand is permitted by this rule: @xbullet of an @fa (see 3.3.1)> @xbullet of a @fa (see 3.8)> @xbullet of a @fa (see 4.3.1)> @xbullet for an @fa of an @fa (see 4.3.2)> @xbullet of a @fa or the @fa of an @fa (see 4.3.3)> @xbullet of an initialized allocator (see 4.8)> @xbullet of a return statement (see 6.5)> @xbullet or actual parameter for a formal object of mode @b (see 12.4)> !corrigendum 7.5(8) @dinsa There are no predefined equality operators for a limited type. @dinst @i<@s8> For a @fa of a type with a part that is of a task, protected, or explicitly limited record type that is used to initialize an object as allowed above, the implementation shall not create a separate return object (see 6.5) for the @fa. The @fa shall be constructed directly in the new object. !corrigendum 7.5(9) @drepl @xindent<@s9<13 The following are consequences of the rules for limited types: >> @dby @xindent<@s9<13 While it is allowed to write initializations of limited objects, such initializations never copy a limited object. 
The source of such an assignment operation must be a @fa, and such @fas must be built directly in the target object.>> !corrigendum 7.5(23) @drepl @xindent<@s9 means that parameter passing and function return will always be by reference (see 6.2 and 6.5).>> @dby @xindent<@s9 means that parameter passing will always be by reference and function results will always be built directly in the result object (see 6.2 and 6.5).>> !corrigendum 7.6(17.1) @drepl For an @fa of a controlled type whose value is assigned, other than by an @fa or a @fa, the implementation shall not create a separate anonymous object for the @fa. The aggregate value shall be constructed directly in the target of the assignment operation and Adjust is not called on the target object. @dby For an @fa of a controlled type whose value is assigned, other than by an @fa, the implementation shall not create a separate anonymous object for the @fa. The aggregate value shall be constructed directly in the target of the assignment operation and Adjust is not called on the target object. !corrigendum 7.6.1(2) @drepl The execution of a construct or entity is @i when the end of that execution has been reached, or when a transfer of control (see 5.1) causes it to be abandoned. Completion due to reaching the end of execution, or due to the transfer of control of an @fa, @fa, @fa, or @fa or of the selection of a @fa is @i. Completion is @i otherwise @emdash when control is transferred out of a construct due to abort or the raising of an exception. @dby The execution of a construct or entity is @i when the end of that execution has been reached, or when a transfer of control (see 5.1) causes it to be abandoned. Completion due to reaching the end of execution, or due to the transfer of control of an @fa, return statement, @fa, or @fa or of the selection of a @fa is @i. Completion is @i otherwise @emdash when control is transferred out of a construct due to abort or the raising of an exception. !corrigendum 7.6.1(18) @drepl For a Finalize invoked by the transfer of control of an @fa, @fa, @fa, or @fa, Program_Error is raised no earlier than after the finalization of the master being finalized when the exception occurred, and no later than the point where normal execution would have continued. Any other finalizations due to be performed up to that point are performed before raising Program_Error. @dby For a Finalize invoked by the transfer of control of an @fa, return statement, @fa, or @fa, Program_Error is raised no earlier than after the finalization of the master being finalized when the exception occurred, and no later than the point where normal execution would have continued. Any other finalizations due to be performed up to that point are performed before raising Program_Error. !corrigendum 8.1(4) @dinsa @xbullet;> @dinst @xbullet;> !corrigendum 9.5.2(29) @drepl @xindent<@s9<24 A @fa (see 6.5) or a @fa (see 9.5.4) may be used to complete the execution of an @fa or an @fa.>> @dby @xindent<@s9<24 A return statement (see 6.5) or a @fa (see 9.5.4) may be used to complete the execution of an @fa or an @fa.>> !corrigendum 13.8(10) @drepl @xindent<@s9<16 Machine code functions are exempt from the rule that a @fa is required. In fact, @fas are forbidden, since only @fas are allowed.>> @dby @xindent<@s9<16 Machine code functions are exempt from the rule that a return statement is required. In fact, return statements are forbidden, since only @fas are allowed.>> !ACATS test ACATS(s) tests need to be created for these features. 
!appendix From: Tucker Taft Sent: Thursday, April 1, 2004, 6:13 AM I have been asked to prepare an alternative to AI-318 which drops the notion of "aliased" return-by-reference functions, and replaces it with a simplified version of anonymous access type return. One thing that is being lost in this process is that return-by-reference eliminates the need for ".all" at the call site. However, it struck me that we already allow implicit dereference in a number of contexts, and since anonymous access types as return types is a new feature, it would be feasible to allow implicit dereference of calls of such functions in *any* context. Allowing implicit dereference has some advantages: 1) It provides better compatibility with the existing (albeit limited) return-by-reference capability, because call sites would not have to change, only the function would change to return X'access rather than X (or Y rather than Y.all). Implicit dereference would eliminate the need for a .all at the call sites. 2) C++ has a return-by-reference capability ("&" return type) which allows a natural way to use a call on a function as the left hand side of an assignment, allowing the implementation of "abstract" arrays, e.g.: Arr(X) := Arr(X) + 1; where "Arr" is actually a function that implements an array-like data structure. We could get much of this same capability by allowing functions declared to return an anonymous access type to be implicitly dereferenced in any context. Furthermore, since Ada uses "()" for both array indexing and function calling, this would actually get some value out of that syntactic unification (or as Robert might call it, "confusion" ;-). This is actually better than the "aliased" return-by-ref capability, since in that case the returned object was necessarily considered a constant. Of course if the writer of the function wanted the result to be access-to-constant, they could declare it that way. 3) Similar to above, but relevant to me because Bob Duff and I have recently been sparring over an issue that would be nicely solved by implicit dereference: As in many text- and language- processing tools, we convert all strings into unique IDs as soon as we read the source file. We call these unique IDs "spellings," LISP used to call them "symbols," and I have seen them called String-IDs and a number of other similar things. They significantly simplify further processing because string equality involves a simple ID equality comparison, and these IDs can be efficiently passed and returned from subprograms without any of the issues associated with passing and returning unconstrained arrays. *However*, when it comes to passing these IDs to subprograms that expect Strings, we have to convert the ID back to a String. The simplest way to do this is to write a function, say To_String, which takes an ID and returns a String. Unfortunately, that immediately gets you back into the inefficiencies of returning unconstrained arrays. An alternative is to expose the representation of the IDs, and allow the caller to explicitly use ".all" or a component selection to retrieve the String at the call site, but that clearly makes the "abstraction" a bit less abstract. By allowing implicit dereference of functions returning anonymous access types, we could have the best of both worlds. The To_String function could actually return "access constant String" instead of String, but it could still be used in any context that required a String without the overhead of returning unconstrained arrays. 
This would preserve both abstraction and performance. So, barring major objection, I am going to propose that calls on functions returning anonymous access types will permit implicit dereference in any context (instead of only in front of ".", "(", and "'"). Comments welcomed... **************************************************************** From: Pascal Leroy Sent: Monday, April 5, 2004, 10:47 AM Tuck wrote: > Here is an alternative proposal, which drops > "aliased return blah" (return-by-reference) in > favor of "return access blah." It still includes > functions returning limited types. It took me a while to realize that this AI really has two proposals: 1 - Functions returning anonymous access types. That includes implicit dereferencing, but as I see it the extended_return_statement is not necessary for this part. 2 - Improvements for functions returning limited types. This is the part that really needs the extended_return_statement. The more I look at the AI, the more I like #1 (especially with implicit dereferencing and the capability to have a function call on the LHS of an assignment) and the less convinced I am about #2. Yeah, it would be nice to improve the usability of limited types, but the baggage needed to do that (and the somewhat arbitrary restrictions that come with it) sounds clunky to me. What do others think? **************************************************************** From: Tucker Taft Sent: Monday, April 5, 2004, 11:13 AM Pascal Leroy wrote: > > Tuck wrote: > > > Here is an alternative proposal, which drops > > "aliased return blah" (return-by-reference) in > > favor of "return access blah." It still includes > > functions returning limited types. > > It took me a while to realize that this AI really has two proposals: > > 1 - Functions returning anonymous access types. That includes implicit > dereferencing, but as I see it the extended_return_statement is not > necessary for this part. > > 2 - Improvements for functions returning limited types. This is the > part that really needs the extended_return_statement. I believe I was directed to keep these two proposals as part of a single AI. > The more I look at the AI, the more I like #1 (especially with implicit > dereferencing and the capability to have a function call on the LHS of > an assignment) and the less convinced I am about #2. Yeah, it would be > nice to improve the usability of limited types, but the baggage needed > to do that (and the somewhat arbitrary restrictions that come with it) > sounds clunky to me. > > What do others think? I think this is the key thing to make limited types more useful. With this change, making a type limited allows the implementor to control all cases of copying, without dramatically undermining the usability of the type, and with almost no negative performance impact. **************************************************************** From: Randy Brukardt Sent: Tuesday, April 6, 2004, 3:53 PM The only reason that I would ever vote for (1) would be if it was the only way to get (2). If we don't want to handle the limited functions, then we need do nothing for return-by-reference. Visible access parameters and results in modern programs should be discouraged; used only when there is absolutely no other choice. (If we'd have "in out" on functions, there would never be a need for them.) As far as the implicit dereference goes, I've been waiting for the expected "April Fool" that goes with it. Since I've been waiting a week, I suppose it is actually a serious proposal. 
I find it completely bizarre, because it ruins the model of implicit dereference (it occurs only before '.' or '()'). Moreover, why type Int_Access is access all Integer; function anon return access all Integer; function IA return Int_Access; I := Anon; -- Legal. I := IA; -- Illegal. should behave differently is going to be just too goofy to explain. OTOH, (2) will not only eliminate arbitrary restrictions from limited types, but it also will make code more readable anytime that it takes multiple steps to create a result. (And should allow the generation of better code as well by building in place more often.) **************************************************************** From: Tucker Taft Sent: Thursday, April 15, 2004, 10:30 AM In my recent AI-318-2 proposal for functions returning an anonymous access type, I had specified that implicit dereference was provided for calls on such functions, in part to minimize the impact of eliminating return-by-reference functions. However, since then I noticed an additional use of implicit deref of such calls. I mentioned this in a response to ada-comment, but here I repeat it for those who don't follow that mailing list. There are often times where one wants to provide a read-only view of a "private" global variable. There are also times that you have a large global table, but you want to put its initialization in a package body so you don't suffer recompilation headaches every time you change the table slightly (e.g. a large parse table). Ada doesn't really have any good solution for this. In C/C++, you can declare a large constant (or variable) without giving its initialization in a spec (i.e. the ".h" file), and then in the body (i.e. the ".c" file) give the full initialization. With this new implicit deref proposal, it would make it pretty easy and efficient to solve this problem: In the spec: function Read_Only_View return access constant T; pragma Inline(Read_Only_View); In the body: Var : aliased T := (...); function Read_Only_View return access constant T is begin return Var'Access; end Read_Only_View; Similarly, if there were a large constant table that you just wanted to postpone to the body: In the spec: function Parse_Table return access constant Parse_Table_Type; pragma Inline(Parse_Table); In the body: Parse_Table_Obj : aliased constant Parse_Table_Type := (...); function Parse_Table return access constant Parse_Table_Type is begin return Parse_Table_Obj'Access; end Parse_Table; Since the existing uses of anonymous access types are quite limited right now (only parameters and discriminants), we could consider providing implicit dereference for *all* expressions of an anonymous access type, since that would be more uniform. But I also think it is not too bad to only provide this for calls, since there is already implicit "deproceduring" (as Algol 68 called it) for parameterless subprograms in Ada. Providing implicit deref for functions returning an anonymous access type seems like a natural progression of that. If we instead choose to extend implicit deref to all anonymous access values, then there is more of an upward compatibility concern, since there could be additional ambiguities created when an access parameter or discriminant is passed to an overloaded subprogram with one version having a param of type "T" and another a param of type "access T". 
Given the relatively modest use of access parameters and access discriminants at this point, this seems relatively unlikely, and in any case it will not silently change meaning -- you'll get a compile-time error. I could go either way. I think implicit deref of function calls at least is pretty important. It is a very nice way to return a reference to a large object without incurring significant overhead, and without having to change the syntax used at the call point (i.e., no need to insert ".all"). **************************************************************** From: Robert Dewar Sent: Thursday, April 15, 2004, 10:37 AM > Ada doesn't really have any good solution for this. I am missing something, it seems quite reasonable to return an appropriate constant access type. Yes you add some nice syntactic sugar below, but nothing fundamental. What am I missing? **************************************************************** From: Tucker Taft Sent: Thursday, April 15, 2004, 11:32 AM We have generalized the availability of anonymous access types, in part to go with the "limited with" proposal, since "limited with" doesn't solve the proliferation of access types problem. The one significant place where anonymous access types weren't permitted was as function result types. AI-318-2 was addressing that (as well as limited result types). Franco was extremely keen on getting this, because he felt that it was a clear hole when trying to explain the new anonymous access type paradigm. The implicit deref of calls of such functions may seem like a small point, but it can have a significant effect on usability in my experience. Having to add ".all" on the result of a function call is a pain and changes the perceived nature of the abstraction. What you really want is return by reference in some contexts, and having to add an explicit ".all" makes the abstraction feel less abstract. I gave examples of an "array" abstraction with the ability to assign to components of the array. **************************************************************** From: Robert I. Eachus Sent: Thursday, April 15, 2004, 4:22 PM I don't think you are missing anything. But that doesn't mean that what Tucker is trying to accomplish is not very useful. Right now you can convert a function to an array in some cases and not have to change the code that uses the abstraction. This does the same thing for anonymous access types. (Actually, Tucker goes further, but I think that the anonymous access cases are the high payoff.) If we can eliminate gratuitous uses of .all and 'Access, it makes programming in Ada easier. There is a potential problem in that overloadings will be possible where supplying the .all (or 'Access) will resolve the overloading but the direct call will always be ambiguous. If you really find that a problem, is should be easy enough to say that if a function call or argument is in parentheses, the .all or 'Access must be explicit. So if: X := Foo(Y, Bar); -- is ambiguous then; X := Foo(Y, (Bar)); and X := Foo(Y, Bar.all); Would resolve to the two different meanings. It might require some changes to existing code, but not that much, and it would of course, be caught at compile time. **************************************************************** From: Randy Brukardt Sent: Thursday, July 29, 2004, 1:06 AM AI-10318 includes the following rule: Legality Rules If the result subtype of a function is limited at the point where the function is frozen (see 13.14), the result subtype shall be constrained. 
This was intended to make the implementation of limited build-in-place functions easier. At the Palma meeting, Pascal pointed out that this is incompatible with existing generic units that have generic limited private type parameters. He asked me to add the example of ACATS foundation FDD2A00, which this rule makes illegal -- and it doesn't raise Program_Error in use. (This example is not quite as compelling as it could be, as the foundation in question was created by "PHL". :-) However, I noticed that this foundation is testing stream attributes. The problem is caused by 'Input being a function. Indeed, that shows that the above legality rule has additional compatibility problems and as well needs help to cover stream attributes. Let's look at an example. package Ugh is type Lim_Tagged is limited tagged private; function My_Input (Stream : access Root_Stream_Type'Class) return Lim_Tagged; -- Legal only if the type doesn't have any discriminants. for Lim_Tagged'Input use My_Input; function My_Class_Input (Stream : access Root_Stream_Type'Class) return Lim_Tagged'Class; -- Never legal (T'Class is never constrained). for Lim_Tagged'Class'Input use My_Class_Input; procedure Do_Something (Object : Lim_Tagged'Class); task type Tsk is ... function My_Tsk_Input (Stream : access Root_Stream_Type'Class) return Tsk; -- Legal. for Tsk'Input use My_Tsk_Input; type Tsk_Array is array (Positive range <>) of Tsk; -- Tsk_Array'Input is available by the Corrigendum and AI-195. procedure Do_Something_Else (Object : Tsk_Array); private ... end Ugh; with Ugh; package Factory is function Constructor (...) return Ugh.Lim_Tagged'Class; -- A "factory" constructor. -- Never legal (T'Class is never constrained) end Factory; with Ugh; procedure Test is Obj : Ugh.Lim_Tagged'Class := Ugh.Lim_Tagged'Class'Input (A_Stream); -- This is clearly legal if we don't try to redefine the -- attribute. But it is returning a clearly unconstrained (new) -- object, and it will require build-in-place semantics. TObj : Ugh.Tsk_Array := Ugh.Tsk_Array'Input (A_Stream); -- This also is clearly legal. begin Ugh.Do_Something (Ugh.Lim_Tagged'Class'Input (A_Stream)); -- Legal now in Ada 95+Corr. Ugh.Do_Something_Else (Ugh.Tsk_Array'Input (A_Stream)); -- Legal now in Ada 95+Corr if Tsk_Array'Input overridden. end Test; The first problem to note is incompatibility. With this legality rule, we aren't allowed to declare a function to be used as a user defined 'Input for Tsk_Array and Lim_Tagged'Class. Ada 95 certainly allows that. Tucker has argued privately that it would be nearly impossible to write a useful user-defined routine in this case, because of the return-by-reference rules -- returning an existing object is never what you want. The second problem is the language defines stream attributes for all types - including unconstrained limited types. Moreover, we've made (limited) function calls legal in many circumstances with build-in-place semantics. So, these stream attributes are legal in calls like the above (and the Do_Something calls are perfectly useful in existing Ada 95+Corr code). That means that we still have to implement unconstrained function calls, just the user can't write them. That's obviously silly. We're also giving much less than meets the eye here. It's not allowed to write a class-wide constructor (of which T'Class'Input is just a single example). That's a significant loss. The workaround of using an anonymous (or named) access type puts the burden of storage management on the user - reducing a key advantage of Ada. 
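A sketch of that workaround, building on the Ugh package above (the access type and unit name here are invented for illustration):

   with Ugh;
   package Factory_Workaround is
      type Lim_Tagged_Ptr is access Ugh.Lim_Tagged'Class;
      function Constructor return Lim_Tagged_Ptr;
      -- Legal, but the caller now has to decide when (and whether) to
      -- free the designated object -- exactly the storage-management
      -- burden that a build-in-place class-wide function would avoid.
   end Factory_Workaround;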
And it means that it won't be possible to convert non-limited tagged types which were non-limited only to get functions and constructors to limited tagged types. --- Anyway, the second problem has to be solved somehow. It's of no help to implementers to limit users from writing unconstrained functions if the system still has to do it. There are three basic solutions. The more ugly rules solution: We could make T'Input "unavailable" if T is limited and unconstrained. This is messy, though, because we wouldn't want to prevent the use of T'Input for (untagged) types that aren't really limited (and thus allow the declaration of functions); that would be a bigger compatibility problem. Unfortunately, figuring out whether types are "really" limited breaks privateness, and Mr. Private is sure to object to a legality rule depending on privateness. And, of course, this rule would completely undo all of the work which makes streaming for limited tagged types useful, leaving such types as second-class citizens. An alternative is the New ugly rule solution: The real problem is that we need to prevent making *calls* to functions that we can't handle, not the functions themselves. So we could drop the above rule altogether (possibly leaving an implementation permission for an implementation to reject a function that cannot be called), and replace it by rule on calls: If the result subtype of a function_call is limited at the point of the call, the result subtype shall be constrained. This solves the problem cleanly, and also reduces the problems with generics (as now only internal calls would be made illegal; calls from outside the instance would determine from the actual type if they are legal. But again this covers too much. And trying to make it tighter again will break privateness and also be a contract problem in generics. The final alternative is the Too much work solution: drop these silly restrictions intended to make the implementation easier (and which by definition harm the user) and get to work. Tucker explained how to implement this long ago, and although it looks painful, it's only a short-term pain -- rather than inflicting this pain on users forever. It also is much less incompatible, as there are no problems with existing generics with limited private formal types. Still, this didn't fly in the past, and I don't expect it to fly now. --- To me, this looks like a giant tease. We tease programmers by claiming that you can now do almost everything with limited types that you can do with non-limited types, but in return some of your generics are now going to be illegal (with no meaningful workaround), you can't use class-wide functions (meaning no factories of any kind), and even class-wide streaming isn't allowed. This, to steal one of Tucker's favorite phrases, is just moving the bump under the rug. This is a self-inflicted bump; if we were willing to do the additional work to move the furniture, this bump would be gone. (The analogy comes true in the carpet here in my office; since we didn't move all of my furniture in the last flood, the carpet repair guy left a large bump next to my desk...) There is also a safety issue here that we haven't really discussed. Objects returned by return-by-reference functions have a limited lifetime: it's not possible for the caller to hang onto them forever. That makes it possible to do storage management and resource locking (although the language doesn't support this well). 
However, anonymous access types can be converted to other types, and via 'Unchecked_Access, held onto forever. (Sure, the programmer is responsible that the 'Unchecked_Access value is destroyed before the object is. But if the programmer *knows* [by peeking at the source] that the actual objects are at library-level or in a heap, the use is OK vis-a-vis the language -- even though it may not be part of the "contract" of the function. And 'Unchecked_Access is so common that it isn't going to be a red flag to anyone - I don't know if I've ever found a useful case where I could use 'Access on an object -- I don't even try anymore.) So we've reduced the safety of the language a bit.

Anyway, to draw an actual conclusion. :-)

I don't believe that an AI that purports to give limited types equal footing with non-limited ones can deny a major benefit of OOP programming to limited types. Restrictions on class-wide programming are simply unacceptable in my view -- if limited tagged types are going to remain useless, we might as well not bother with lots of work and incompatibility. So I would vote for accepting the short-term pain and making this useful to all.

However, if that is unacceptable to the majority, then (in the absence of a better idea) I would rather drop the AI completely rather than give users another (but different) crippled version of limited types.

****************************************************************

From: Stephen W. Baird
Sent: Thursday, January 13, 2005 3:31 PM

In the course of reviewing section 6, some questions came up about extended return statements (AI-318). Initially this discussion involved only Pascal, Randy, and the section 6 reviewers (Steve B. and Jean-Pierre). It has become clear that a broader discussion is warranted.

The discussion so far (with some minor editing) is attached.

----

Stephen W Baird/Cupertino/IBM wrote on 01/11/2005 03:09:16 PM:

If we get as far as entering the handled_sequence of statements of an extended_return_statement (see AI-318), then clearly the object associated with the return statement has been successfully initialized and it will need to be finalized at some point.

If the return statement executes "normally" (i.e., if the final transfer of control described in the dynamic semantics of a return statement is executed), then the caller is responsible for this finalization. Otherwise, the callee has to take care of this finalization (right?). This could occur if the extended_return_statement is exited via a goto or exit statement, or if an exception flies out of the handled_sequence_of_statements, or if execution of the extended_return_statement is aborted (either by an abort statement or via ATC).

Thus, the finalization rules for the return object are quite different depending on how the extended_return_statement is exited. It appears that the RM is missing a description of this distinction.

There also may be related problems in defining the master and accessibility level of the function return object of an extended_return_statement (see AI-162).

One approach would be to define an extended_return_statement to be a master (in 7.6.1(3)), and therefore it would be the master of the return object. This would prohibit the following example

    type R1 is record
       F : aliased Integer;
    end record;

    function F1 return R1 is
       type Ref is access all Integer;
       Dangling : Ref;
    begin
       return Result : R1 do
          Dangling := Result.F'Access; -- should not be legal
          goto L;
       end return;
    <<L>>
       Dangling.all := 123;
       ...;
    end F1;

, and would result in the appropriate task-termination-awaiting in the following variation

    type R2 is record
       F : Some_Task_Type;
    end record;

    function F2 return R2 is
    begin
       return Result : R2 do
          goto L; -- must wait for Result.F to terminate
       end return;
    <<L>>
       ...;
    end F2;

. On the other hand, this might introduce confusion (or worse) in the case of a "normal" return and might require adding some special rules to handle that case.

Another approach (the "black hole" model) would be to prohibit exiting an extended_return statement without exiting the enclosing function. This would require
  - prohibiting goto statements and exit statements which would transfer control out of an Extended_Return_Statement
  - a dynamic semantics rule to the effect that an exception propagated out of the Handled_Sequence_Of_Statements of an Extended_Return_Stmt is propagated to the caller.
In this example,

    function F return Integer is
       E1, E2 : exception;
    begin
       return Result : Integer do
          raise E1;
       end return;
    exception
       when others =>
          raise E2;
    end F;

, this would mean that E1 would be propagated to the caller, not E2. This seems unintuitive.

This approach has the advantage that there is no need to define what happens if an extended_return_statement is exited without exiting its enclosing function, but it does not eliminate the need for finalization rules in the case where an extended_return_statement propagates an exception to the caller.

========

Pascal Leroy/France/IBM wrote on 01/12/2005 03:52:45 AM:

I don't like the "black hole" model because of its implication for exception handlers. On the other hand the other approach you mention is even less palatable as it seems to imply that the master of the return object might change during its lifetime.

Now that I reread this, it also seems strange that initialization and finalization of the return object are done by different masters (in the normal case). Could this be the root of the problem? What if the caller always did initialization and finalization of the return object? Would that make any sense?

========

Jean-Pierre Rosen wrote on 01/12/2005 04:44:04 AM:

Actually, the "black hole" model is what I had in mind when I read this remark. I don't like the idea of "canceling" a return statement in the middle, and if an exception is raised, I would expect it to be propagated to the caller, not inside the function.

In short, I would expect these to be equivalent (assuming the return type is not limited):

 1) declare
       function F return T is
          -- do something
       end F;
    begin
       return F;
    end;

 2) return X : T do
       -- Do the same thing
    end return;

(Note that in 1), there is no goto or exit issue, and that an exception is propagated).
...
If the model is that the caller creates the returned object before calling the function, yes, Pascal's idea of having the caller perform initialization and finalization of the return object would seem to make sense.

========

Stephen W Baird/Cupertino/IBM wrote on 01/12/2005 10:25:49 AM:

I don't like the black hole model either, mostly because it would be very confusing for users.

However, I don't think it would be very difficult to implement: Upon entering an extended_return_statement, a flag is set. The handler for any handled_sequence_of_statements which encloses an extended_return_statement would then query the flag and do the right thing if the flag is set.

Still, it is probably a bad idea.
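For reference, the flag scheme sketched above amounts to roughly this source-level pattern (names invented; the extended return is shown as the plain return it stands for):

   function F (N : Integer) return Integer is
      In_Extended_Return : Boolean := False;
   begin
      In_Extended_Return := True;  -- set on entering "return Result : ... do"
      return 1000 / N;             -- stands for the work done on the return
                                   -- object; may raise Constraint_Error
   exception
      when others =>
         if In_Extended_Return then
            raise;                 -- "black hole": propagate to the caller
         else
            return 0;              -- the function's handler applies only to
         end if;                   -- failures before the return was entered
   end F;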
Your idea of having the caller perform default initialization is appealing, but it seems like it would be impossible to implement in case of an indefinite function result subtype. As a minor point, there is also the case where default initialization is not supposed to be performed: return X : T := Explicit_Initial_Value do ... end return; Finally, there is the problem of modifying the function result object, exiting the extended_return_statement (e.g. via a goto) and then executing another extended_return_statement. ======== Pascal Leroy/France/IBM wrote on 01/12/2005 10:59:57 AM: > Your idea of having the caller perform default initialization is > appealing, but it seems like it would be impossible > to implement in case of an indefinite function result subtype. Yes, that part bothers me, although I haven't given it enough thought yet. > As a minor point, there is also the case where default initialization > is not supposed to be performed: > return X : T := Explicit_Initial_Value do ... end return; Good point. This is somewhat related to the previous issue, as the only case where this capability is important is for indefinite types (boxy discriminants or class-wide). > Finally, there is the problem of modifying the function result object, > exiting the extended_return_statement (e.g. via a goto) and then > executing another extended_return_statement. If the result object is initialized/finalized by the caller, this is not an issue. Exiting an extended_return_statement doesn't cause finalization of the result object, and re-entering another extended_return_statement (or the same one) doesn't cause it to be initialized. This is all well-defined. But again the indefinite case is problematic. ======== "Randy Brukardt" wrote on 01/12/2005 11:19:29 AM: I completely agree with Jean-Pierre. I don't think any transfers out of an extended return statement should be allowed, and exceptions should be propagated directly to the caller. The master of the object is that of the caller (and yes, you may have to pass it in -- the initialization is done in the return statement, but it doesn't use the master that is textually there). I thought that was all obvious, it can't work any other way -- and I thought that the wording made that clear (I see I was wrong about that). Once you start a return statement, you have to return, not do other random junk. I don't see why that should be confusing to users (with the possible exception of the function's exception handler not working). ======== Stephen W Baird/Cupertino/IBM wrote on 01/12/2005 12:51:05 PM: > I don't see why that should be confusing to users (with the possible > exception of the function's exception handler not working). When I said that this approach would be confusing for users, I was talking about the behavior of exceptions. You say "The master of the object is that of the caller". I don't see how it can be that simple, even with the "black hole" model, because of the possibility that an extended_return_statement can propagate an exception (albeit directly to the caller). If an extended return statement propagates an exception, then a) if the function result object has component tasks, who waits for them to terminate? b) if the function result object requires termination, when is it performed? It would be very odd to require the caller to iterate over the components of the function result object (e.g. 
to perform finalization) in the case where the called function propagated an exception, particularly in the case where the function result subtype is indefinite. I suppose you could view this case as being a lot like an Unchecked_Deallocation of the function result object. That would mean that the callee would perform finalization but the caller would wait for tasks. We would also have to deal, one way or another, with the case of erroneous execution of "deallocated" discriminated tasks (see 13.11.2(11-15)). There is also the question of the static accessibility level of the function return object of an extended_return_statement. You certainly don't want to allow Local_Variable'Access as an access discriminant value for the function result object. ======== "Randy Brukardt" wrote on 01/12/2005 01:24:59 PM: > You say "The master of the object is that of the caller". > > I don't see how it can be that simple, even with the "black hole" model, > because of the possibility that an extended_return_statement can propagate > an exception (albeit directly to the caller). Of course it's that simple. It's the same rules that you use for a "regular" function return that has finalizable components. > If an extended return statement propagates an exception, then > > a) if the function result object has component tasks, who waits > for them to terminate? The caller, of course. > b) if the function result object requires termination, when is it > performed? At the point that it would have happened if the function call had been successful. > It would be very odd to require the caller to iterate over the components > of the function result object (e.g. to perform finalization) in the case > where the called function propagated an exception, particularly in the > case where the function result subtype is indefinite. I don't think so. The only sensible rule is that the finalization/waiting (they are the same thing in my view) takes place at the same point whether the call returns normally or raises an exception during the return statement. Anything else would be madness - you'd have to move the object in the finalization chain and/or change its master partway through the evaluation of the return statement. That would be something that never happens in Ada 95, and it seems insane. > I suppose you could view this case as being a lot like an > Unchecked_Deallocation of the function result object. That would mean > that the callee would perform > finalization but the caller would wait for tasks. We would also have to > deal, one way or another, with the case of erroneous execution of > "deallocated" discriminated tasks (see 13.11.2(11-15)). No, finalization and task waiting take place at the same place. Unchecked_Deallocation is weird because the master is somewhere else, but that's a bug in my view - one that we can't change, of course. Task waiting and finalization are closely related things, and they should happen at the same place (even for Unchecked_Deallocation). > There is also the question of the static accessibility level of the > function return object of an extended_return_statement. You certainly don't > want to allow Local_Variable'Access as an access discriminant value for the > function result object. Right, but that should be obvious, and the rule needed is quite simple (the function object is less nested than the function). 
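For instance, using the proposed extended return syntax (the types here are invented for the sketch), this is the sort of thing that rule has to reject:

   type Holder (D : access Integer) is limited null record;

   function Make return Holder is
      Local : aliased Integer := 0;
   begin
      return Result : Holder (D => Local'Access) do  -- must be rejected:
         null;                                       -- the return object is
      end return;                                    -- less nested than the
   end Make;                                         -- function, so D would
                                                     -- dangle in the caller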
**************************************************************** From: Gary Dismukes Sent: Thursday, January 13, 2005 4:20 PM > It has become clear that a broader discussion is warranted. After an initial reading of the exchange I agree with Randy's view that you can't exit out of an extended return statement (i.e., you have to return to the caller, whether normally or by exception) and that the caller has to perform all termination and finalization of the object. I haven't thought through all the implications of this, but it seems like the only reasonable model to me. Now, where are the gotchas? **************************************************************** From: Stephen W. Baird Sent: Thursday, January 13, 2005 6:49 PM If the odd exception propagation rules don't bother you, then I don't think there are any big problems with the rule that you can't exit the handled_sequence_of_statements of an extended_return_statement without also exiting the enclosing function (i.e., the "black hole" rule). I just thought of another hole in this area that would need to be plugged, function Bad_Transfer_Of_Control return T is Return_Statement_Was_Entered : Boolean := False; begin select delay 1.0; then abort return X : T do Return_Statement_Was_Entered := True; delay 10.0; end return; end select; if Return_Statement_Was_Entered then Put_Line ("Houston - we've got a problem"); end if; ... ; end Bad_Transfer_Of_Control; , but that's ok; we can ban extended_return_statements within the abortable part of an ATC statement (or make them abort-deferred, or ...). The "gotcha" as I see it is in the rule that "the caller has to perform all termination and finalization of the object". In the case where the function result subtype is, say, an unconstrained array subtype, and the function propagates an exception, this would mean that the callee would have to simultaneously propagate an exception and return an array for the caller to finalize. I don't see how to implement this without imposing distributed overhead on functions that don't use extended_return_statements. **************************************************************** From: Randy Brukardt Sent: Thursday, January 13, 2005 7:37 PM > The "gotcha" as I see it is in the rule that "the > caller has to perform all termination and finalization of the object". > In the case where the function result subtype is, say, an unconstrained > array subtype, and the function propagates an exception, this would mean > that the callee would have to simultaneously propagate an exception and > return an array for the caller to finalize. I don't see how to implement > this without imposing distributed overhead on functions that don't use > extended_return_statements. I don't see this as an issue with extended_return_statements, but rather with functions returning limited unconstrained subtypes. The rules require build-in-place, even for these sorts of functions. They have to, or limited functions don't work. The overhead has to be there even for a regular return statement, because they too are build-in-place. I view these to be more like procedures with a convenient syntax than functions, at lease in implementation. So, I think some sort of expensive special calling convention will be required for these things, so that the object can be created in the right place. That's going to be unpleasant, but its always necessary (even if a regular return is used to return an aggregate, for example -- you'll have precisely the same problems). 
I would expect that some implementations would have to pass a thunk to do that, or at least a package of storage pool/task master/finalization thunk. Anyway, I don't think that limited functions are going to be returning anything; the object (or a holder descriptor for it, for unconstrained subtypes) will be passed in, and the object will be constructed there. There's nothing to return (ever); the object gets finalized as for any other object constructed at that place. I could imagine an implementation returning a pointer to this object in the normal case (just so that it works like other functions), but that certainly wouldn't have anything to do with its finalization.

I do think that there is an obvious transformation of such a function into constructs that we already understand. If LC is a limited controlled type, then:

    type LC_Array is array (Positive range <>) of LC;
    function Constructor return LC_Array is
    begin
        return (1..10 => <>);
    end Constructor;
    ...
    Obj : LC_Array := Constructor;

can be transformed into (with the same finalization meaning):

    type LC_Array is array (Positive range <>) of LC;
    type LC_Array_Access is access LC_Array;
    type LC_Holder is new Ada.Finalization.Limited_Controlled with record
        Item : LC_Array_Access;
    end record;
    procedure Constructor (Holder : in out LC_Holder) is
    begin
        Holder.Item := new LC_Array'(1..10 => <>);
    end Constructor;
    procedure Finalize (Holder : in out LC_Holder) is
    begin
        Free (Holder.Item);
    end Finalize;
    ...
    Obj : LC_Holder;
    Constructor (Obj);

(Finalization of Obj forces the deallocation and finalization of the "Item" component.)

Obviously, a compiler vendor can probably do better than this (without the explicit declarations, for instance), and probably would want to use a storage pool other than the default heap for this allocation.

For an extended_return, the "body" of the return statement would be operating on the allocated item:

    function Constructor return LC_Array is
    begin
        return D : LC_Array := (1..10 => <>) do
            D(5) := ...;
        end return;
    end Constructor;

would turn into:

    procedure Constructor (Holder : in out LC_Holder) is
    begin
        Holder.Item := new LC_Array'(1..10 => <>);
        begin
            Holder.Item.all (5) := ...;
        end;
    end Constructor;

Note that this transformation suggests that there is no real problem with transfers of control. But I still think it is weird to initiate a return statement and then goto out of it. So I think that should be illegal irrespective of any semantic issues. Exceptions seem to matter less, but I suspect that it would be easier to generate better code if you couldn't handle them locally.

****************************************************************

From: Tucker Taft
Sent: Thursday, January 13, 2005 9:18 PM

I think I have a somewhat different model. Here is what the AI says that is related to this issue:

> DEALING WITH EXCEPTIONS
>
> There was some concern about what would happen if an exception were
> propagated by an extended return statement, and then the same or some
> other extended return statement were reentered. There doesn't seem to be
> a real problem. The return object doesn't really exist outside the
> function until the function returns, so it can be restored to its
> initial state on call of the function if an exception is propagated from
> an extended return statement. Once restored to its initial state, there
> seems no harm in starting over in another extended_return_statement.
In my view, the extended return statement should be treated like returning an aggregate, where all of the various statements after the "do" may be thought of as being squeezed into the middle of the aggregate. You can have arbitrary expressions in the middle of an aggregate, which can raise exceptions, etc., and these do *not* cause control to go to the caller. The exception is propagated to the enclosing exception handler, not directly to the caller. And until you hit the final right paren of the aggregate, or the "end" of the extended return, the object doesn't really "exist" as far as the outside world is concerned. The object can be returned to the initial state it had when the function was first called. I don't agree with Randy that finalization is performed by the caller in case of a failed extended return. I think a good model is a record that contains a controlled component and a regular component, where the regular component is initialized *after* the controlled component, and its initialization fails. In this case, we finalize the controlled component right away, even if, say, the record is being initialized as part of an allocator which wouldn't normally be finalized until much later. An interesting issue is how to handle task components. The question is when do they get added to the appropriate "activation list" which is presumably walked when the caller hits the point to activate the tasks. It seems simplest to let the caller take care of adding any task components to the activation list. This can presumably be done after the function returns, by walking the object to find all the task components and add them to the activation list. This would only happen if and when the function returns successfully. This saves having to pass in such an activation list, and avoids having to remove tasks from the list if the extended return statement fails in the middle. By the way, did we ever specify what happens if we have a limited aggregate as an actual parameter, and it has a task component? I would presume that all task components of such actual parameters are activated after evaluating all the parameters, immediately prior to invoking the body of the subprogram. It seems unwise to activate them piecemeal as the parameters are evaluated, as there is no defined order of parameter evaluation. It is sort of like the actual parameters are the components of a heap object, and the call represents the allocator. **************************************************************** From: Robert I. Eachus Sent: Thursday, January 13, 2005 10:11 PM Randy Brukardt wrote: >I don't see this as an issue with extended_return_statements, but rather >with functions returning limited unconstrained subtypes. The rules require >build-in-place, even for these sorts of functions. They have to, or limited >functions don't work. The overhead has to be there even for a regular return >statement, because they too are build-in-place. I view these to be more like >procedures with a convenient syntax than functions, at lease in >implementation... I agree. We have to get this case right for any of this to make sense doing. >Note that this transformation suggests that there is no real problem with >transfers of control. But I still think it is weird to initiate a return >statement and then goto out of it. So I think that should be illegal >irrespective of any semantic issues. Exceptions seem to matter less, but I >suspect that it would be easier to generate better code if you couldn't >handle them locally. 
It is possible to make a transfer out of a return statement work, but I don't see much point to making implementors deal with it. The simplest rule seems to me to be that return statements create a scope for statement identifiers, and we fix the wording in 5.1(12&13) and 5.8(4) to match. As for exceptions, the rule should be similar, we don't care if an exception is raised AND handled inside the return statement. So the return statement should be outside the scope of exception handlers local to the function, but it may contain exception handlers. It is probably worth adding an example to the RM of a function call with nested exception handler just to show how to do the 'hard' case: function Constructor return Limited_Array is begin return D: Limited_Array(1..10 => <>) do for I in D'Range loop begin D(I) := ...; exception when others => -- fix D(I) for some I. end; end loop. end return; end Constructor; with of course, an explanation pointing out that errors when allocating D will be handled by the caller, while errors in computing D(I) can be handled locally. **************************************************************** From: Randy Brukardt Sent: Thursday, January 13, 2005 10:24 PM Tucker wrote: ... > In my view, the extended return statement should be treated > like returning an aggregate, where all of the various statements > after the "do" may be thought of as being squeezed into the middle > of the aggregate. You can have arbitrary expressions in the > middle of an aggregate, which can raise exceptions, etc., and > these do *not* cause control to go to the caller. The exception > is propagated to the enclosing exception handler, not directly > to the caller. And until you hit the final right paren of the > aggregate, or the "end" of the extended return, the object > doesn't really "exist" as far as the outside world is concerned. > The object can be returned to the initial state it had when > the function was first called. I certainly agree with you that the model is the same, but I don't agree with the conclusions that you draw from it. This model works in Ada 95 because the aggregate is necessarily non-limited, so the aggregate is created into a temporary, and it isn't copied into the final object until after the function returns (or immediately before - in the context of the caller, in any case. A Program_Error raised by the Adjust copying into the final object will certainly not be caught by the function). The problem is, for limited functions, we have to build the aggregate (or extended_return_statement) in place. In *either* case, raising an exception in the middle is a problem, because the object is already partially constructed. And, at least in Janus/Ada, an object belongs to a particular master, and that is assumed to never change -- so the object gets finalized whenever that master goes away (unless it happens earlier, of course). So the object is constructed with an owner of the ultimate master - and thus won't get finalized until that master goes away. Your model would require changing the master of the object after it is created. While I suppose that there is some way to implement that, it would require a distributed overhead -- both adding extra nodes into the finalization chains to allow safe deletion from the main finalization chain, and of thunks that would have to be defined for all record types (or at least all limited record types). > I don't agree with Randy that finalization is performed by > the caller in case of a failed extended return. 
I think > a good model is a record that contains a controlled component > and a regular component, where the regular component is > initialized *after* the controlled component, and its initialization > fails. In this case, we finalize the controlled component > right away, even if, say, the record is being initialized > as part of an allocator which wouldn't normally be finalized > until much later. I think you're confused: there is no such rule in the Standard that I can find. The rules that exist pertain to a failed Adjust, and that rule (at the insistence of one S. Tucker Taft, as I recall) was relaxed to "might or might not be finalized". Finalization occurs when the object's master is left. In the case of a failed allocator, that could indeed be a long time in the future. I think the model you describe would be wrong, because it would finalize the object too soon. The overhead of the model that you espouse here would be severe -- it could only be implemented by putting an exception handler around *any* initialization that could possibly fail. Unless the implementation has zero-cost exception handlers, that's going to be a lot of expense. (We have to do that for Finalize calls, and it is by far the largest overhead of finalization in normal use. In the case of finalization, there really isn't a choice (because we can't let something failed poison an unrelated abstraction), but I don't see any special issue with initialization (as long as the finalization actually happens eventually). > An interesting issue is how to handle task components. > The question is when do they get added to the appropriate > "activation list" which is presumably walked when the > caller hits the point to activate the tasks. It seems > simplest to let the caller take care of adding any task > components to the activation list. This can presumably > be done after the function returns, by walking the object > to find all the task components and add them to the > activation list. This would only happen if and when > the function returns successfully. This saves having to > pass in such an activation list, and avoids having to > remove tasks from the list if the extended return > statement fails in the middle. You seem to be separating tasks and finalizable objects. I think that is serious mistake -- they should follow the same rules. Moreover, it is too late; tasks are created belonging to a master in Janus/Ada (so that they can be cleaned up if activation fails), so we'd have to pass in the master and activation list into the function. And, in any case, walking components of a record is that very expensive operation requiring a custom thunk for *all* record types (to avoid contract problems). You're telling us we need two of them? You're insane! > By the way, did we ever specify what happens if we have > a limited aggregate as an actual parameter, and it > has a task component? Yes, it's the AI-162 that you were the last person to rewrite. I seriously think that you are working too hard! (Now go review your sections of the AARM. :-) We redefined masters so that expressions and subprogram calls have them. > I would presume that all task > components of such actual parameters are activated > after evaluating all the parameters, immediately > prior to invoking the body of the subprogram. > It seems unwise to activate them piecemeal as the > parameters are evaluated, as there is no defined > order of parameter evaluation. 
It is sort of like > the actual parameters are the components of a heap > object, and the call represents the allocator. God, I hope not. Activating tasks is complicated enough without deciding to treat parameters specially. Why the heck would anyone care what order they're activated in anyway? It's usually unspecified, so you couldn't depend on anything about it anyway. And, as you say, the order of the parameters is unspecified; so how could one parameter even be able to determine if another parameter has it's tasks started? It certainly can't see them! I do agree that a single parameter would have to be activated like an allocator (it has to be activated somewhere). But we evaluate each parameter individually, and trying to tie them together would certainly add additional complexity for no benefit. I'm pretty much ready to give up on the entire idea of limited aggregates and functions, because it simply isn't worth the headaches that you guys keep coming up with. It would be tempting to disallow aggregates and functions that contain tasks, but we know that that sort of restrictions never work. You've certainly convinced me that it couldn't possibly be worth trying to implement these. (Probably ever.) Sigh. **************************************************************** From: Tucker Taft Sent: Thursday, January 13, 2005 11:00 PM > ... Exceptions seem to matter less, but I > suspect that it would be easier to generate better code if you couldn't > handle them locally. I don't agree. I think we want things to work as similarly as possible between limited and non-limited, and between normal and extended returns. We know that in Ada 95, if you return an aggregate, and something fails in the middle of creating the aggregate, you can handle that inside the function. This should also be true if the type being returned happens to be limited, and should be true if the aggregate is turned into an extended return statement with statements initializing the components, and finally, it should be true if it is a limited extended return. **************************************************************** From: Randy Brukardt Sent: Friday, January 14, 2005 12:05 AM I have to agree, but I come to the opposite conclusion: the way non-limited returns work currently is unacceptable, and if you really want them to be exactly the same, we'll need to change non-limited to match limited. The reason is that the current semantics essentially require a temporary; build-in-place is not allowed for non-limited types is not allowed. That's because build-in-place has to be undone if an exception occurs after the evaluation of the aggregate starts. We deal with that for regular assignments by checking the aggregate for the possibility of raising an exception before doing build-in-place, but for any type with controlled components, such a check must necessarily fail (we cannot know what happens in Adjust). My intent was to use build-in-place (that is, the new calling convention) for all record types. That would get rid of the obnoxious temporary that can't be optimized away, and which makes composite functions something to avoid whenever you care about performance. But you're saying that build-in-place can never be done for a non-limited type with controlled components, because you can tell if the target object is finalized before the function starts evaluating the aggregate or other expression (the function can handle the exception and then access the target). 
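A sketch of how that difference could be observed (everything here is invented for illustration; Finalize is assumed simply to record that it ran):

   with Ada.Finalization;
   package Observe is
      type T is new Ada.Finalization.Controlled with record
         Valid : Boolean := True;
      end record;
      procedure Finalize (X : in out T);  -- sets X.Valid to False
      Target : T;
      function F return T;
      -- If something in F's return expression raises, and F's handler
      -- reads Target.Valid before re-raising, then a compiler that had
      -- built the result of "Target := F;" directly in Target (after
      -- finalizing it) would be caught out: the handler would see
      -- Valid = False where today's semantics guarantee True.
   end Observe;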
So you always have to use a temporary (backing out a user-defined Finalize is impossible). As someone who believes that the vast majority of types should be controlled, I cannot justify a significant required overhead that has virtually no user benefit. If a user really wants an explicit temporary, they can write one. In any case, we're pretty much required to use the same convention for limited and non-limited functions, because otherwise generics wouldn't work. And being forced to make a temporary at each call site is worse than the current situation, because the function might decide it has to make a copy too. In any case, the point of this exercise (from my perspective) was to get better performance for all controlled types. If the performance is going to be worse, I'd be better off forgetting I ever heard about AI-318 (because there is no reason to do a lot of work to end up with worse performance). Or, perhaps, forgetting about the Ada standard and doing it right would make the most sense. Neither would help Ada in the long run. Anyway, I've wasted far too much time on this discussion. I've made my position clear; I would rather drop AI-318 than be forced into Tucker's semantics. **************************************************************** From: Tucker Taft Sent: Friday, January 14, 2005 9:10 AM > I have to agree, but I come to the opposite conclusion: the way non-limited > returns work currently is unacceptable, and if you really want them to be > exactly the same, we'll need to change non-limited to match limited.... That seems like a potentially serious incompatibility. I can imagine there is a non-trivial amount of code that looks like: function Blah... is begin return Fum(...); exception when ... => raise Different_Exception; end Blah; If exceptions propagated by Fum are not caught by the exception handler of Blah, we may break a lot of code which assumes that only "Different_Exception" is propagated from Blah. I understand your implementation concerns, but I think they are all manageable, though certainly not trivial. Yes, you may have to figure out how to finalize a partially initialized object sooner than normal, but unchecked-deallocation knows how to do that, and I believe most compilers clean up partially initialized allocated objects right away, rather than waiting until the heap as a whole is finalized. It would be significantly worse, in my view, to have to go back and change the way these things work now, while also subjecting existing code to an incompatible, inconsistent change in run-time semantics. **************************************************************** From: Bob Duff Sent: Friday, January 14, 2005 9:57 AM Tucker wrote: > function Blah... is > begin > return Fum(...); > exception > when ... => > raise Different_Exception; > end Blah; Or how about: function Blah... is begin return Fum(...); exception when ... => return Alternate_Result(...); end Blah; It does seem that a function should be able to handle an exception raised anywhere within it. (Well, ahem, cough... except in the handler itself.) **************************************************************** From: Gary Dismukes Sent: Friday, January 14, 2005 11:48 AM > It does seem that a function should be able to handle an exception > raised anywhere within it. (Well, ahem, cough... except in the handler > itself.) And, ahem, cough, in the function's declarative part. 
;) **************************************************************** From: Bob Duff Sent: Friday, January 14, 2005 3:22 PM > And, ahem, cough, in the function's declarative part. ;) Well, sure, but you can always put that code into a block statement. You can put the exception handlers in a block statement, too, but that leads to infinite regress. **************************************************************** From: Jean-Pierre Rosen Sent: Friday, January 14, 2005 11:43 AM From the text of the AI: The syntax for extended return statements was initially proposed early on, but when this AI was first written up, we proposed instead a revised object declaration syntax where the word "return" was used almost like the word "constant," as a qualifier. This was somewhat more economical in terms of syntax and indenting, but was not felt to be as clear semantically as this current syntax. Given the current number of worms struggling to get out of the can, would it be appropriate to reconsider this solution? **************************************************************** From: Tucker Taft Sent: Friday, January 14, 2005 1:59 PM I believe Randy has made a number of good arguments that indicate this has relatively little to do with the extended return statement, and even less to do with the specifics of its syntax. It is mostly related to returning limited objects, whether they be specified by an aggregate, function call, or extended return statement. In all cases you need to specify what happens if an exception is propagated before the return statement is completed. **************************************************************** From: Bob Duff Sent: Friday, January 14, 2005 4:06 PM > Your model would require changing the master of the object after it is > created. I don't see that. Consider an uninit allocator of an array, in Ada 95: type A is array(Positive range <>) of Some_Lim_Controlled; type P is access A; ... new A(1..10) ... If an exception is raised in the middle of initializing components, the implementation is required to finalize the ones for which Initialize completed successfully, and is forbidden from finalizing the others. I think the easiest way to implement this is to catch any exceptions that occur in the middle of initialization, finalize as necessary, and then reraise. This requires keeping track of how far through the array you got -- but that's necessary anyway -- this is just the loop index. Are you saying this implementation is wrong, and that finalization must wait until the collection is finalized? If so, I think we'd better change the rules to allow this, because it's the easiest and most efficient, and I believe many implementations already do it. (It's more efficient, because otherwise, you'd have to store the index of how far you got in the heap, for later use.) In other words, it's already the case (at least on many implementations) that finalization takes place earlier than the master, and this does not involve "changing masters". Therefore, the same could apply to these new kinds of returns. ---------------- I think your other concern was doing: X := F(...); in the nonlimited controlled case. You don't want to have to make temporaries. I agree. Why don't we simply allow that? That is, an implementation is free to finalize X first, then pass its address to F, which can build-in-place -- if the implementation so chooses. 
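Going back to the allocator case: the cleanup described there amounts to roughly the following sketch (it reuses the declaration of A above, and assumes Initialize and Finalize are the visible Limited_Controlled primitives of Some_Lim_Controlled):

   procedure Initialize_All (Obj : in out A) is
      Done : Natural := 0;            -- the "loop index" bookkeeping
   begin
      for I in Obj'Range loop
         Initialize (Obj (I));        -- user-defined Initialize may raise
         Done := Done + 1;
      end loop;
   exception
      when others =>
         for J in Obj'First .. Obj'First + Done - 1 loop
            Finalize (Obj (J));       -- finalize only the completed ones
         end loop;
         raise;                       -- then re-raise to the allocator
   end Initialize_All;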
****************************************************************

From: Tucker Taft
Sent: Friday, January 14, 2005 4:47 PM

> I think your other concern was doing:
>
>     X := F(...);
>
> in the nonlimited controlled case. You don't want to have to make
> temporaries. I agree. Why don't we simply allow that? That is, an
> implementation is free to finalize X first, then pass its address to F,
> which can build-in-place -- if the implementation so chooses.

This seems like another potentially dangerous incompatibility. Suppose you have:

    X := F(X(1..3));

or X is visible up-level to F. You can't finalize X if F might be able to see part or all of X during its execution.

Perhaps you meant we should allow pre-finalization of X if its value is not needed to evaluate the right hand side, and there is no chance it will be visible in an exception handler if F propagates an exception.

The current language allows aborts to occur between the finalize and the copy-and-adjust steps. Allowing the evaluation of the right hand side to occur then is probably also OK, given the above provisos. But your compiler will have to do the analysis to determine it is safe.

Our compiler essentially already does this analysis for *non* controlled types, I believe, to decide whether it is safe to pass in the address of the left hand side as the result temp for the function. We could certainly do this for controlled objects as well, if we could perform finalization before the call when safe.

But I suspect Randy was looking for a single approach that was always allowed, without having to do any analysis at the call site. In that case, he is stuck for an assignment statement. Of course, if the function call is used to initialize a new object, then you can always safely pass in the address of the new object.

****************************************************************

From: Randy Brukardt
Sent: Friday, January 14, 2005 9:47 PM

> If an exception is raised in the middle of initializing components, the
> implementation is required to finalize the ones for which Initialize
> completed successfully, and is forbidden from finalizing the others.

Correct.

> I think the easiest way to implement this is to catch any exceptions
> that occur in the middle of initialization, finalize as necessary, and
> then reraise. This requires keeping track of how far through the array
> you got -- but that's necessary anyway -- this is just the loop index.

This I totally disagree with. Certainly, you *could* try to implement it that way, but it would be a lousy implementation for the majority of compilers. First, it requires a "virtual" exception handler, and that has a significant cost on most systems. Second, figuring out where you are is possible in simple cases (like this), but it doesn't generalize in any sensible way. When you have discriminated types with variants and arrays, and multiple controlled components nested to several levels (which in fact happens in some Claw example programs), it just becomes a nightmare.

It makes much more sense for each controlled part (with the technical meaning of part) to be initialized and finalized individually. Each gets registered when its initialization finishes successfully, so you can only finalize those that have finished.

> Are you saying this implementation is wrong, and that finalization must
> wait until the collection is finalized? If so, I think we'd better
> change the rules to allow this, because it's the easiest and most
> efficient, and I believe many implementations already do it.
It's clearly wrong: the allocated object belongs to the collection, and shouldn't be finalized until the collection goes away. For a declared object, you can't tell the difference. And you can't tell the difference in Ada 95, either, because the type necessarily must be non-limited, and you can't tell the difference between the object being created in a temporary (which is finalized immediately, before the allocated object is even created) or this implementation. But for a limited object, the object has to be built-in-place, and thus the implementation is clearly wrong (*only* for a limited type). Whether that should be relaxed is an open question. If you want to require the above behavior, I'll fight you until the ends of the earth - it's clearly requiring a horrible implementation, and buys nothing for users. But if you just want to allow it, I don't particularly care. OTOH, we have generally specified finalization behavior of limited types in Ada without allowing much optimization, so I would tend to be conservative with these rules. > (It's more efficient, because otherwise, you'd have to store the index > of how far you got in the heap, for later use.) You have to store *something* in the heap in order to know to finalize these objects anyway (and usually that needs to be quite a bit - a subprogram address and static link and a chain, at a minimum), so I don't see much reason to worry about another word. For a chained implementation, there is no extra cost at all (just make sure its linked on the right chain). > In other words, it's already the case (at least on many implementations) > that finalization takes place earlier than the master, and this does not > involve "changing masters". Therefore, the same could apply to these > new kinds of returns. Such implementations are wrong for limited types (certainly with the rules as written). It's an "as-if" optimization for non-limited types, so it's fine to use it in Ada 95 and in Ada 2005. To even allow your implementation would require writing quite a bit of really tricky wording - I don't even know all of the places where it would have to be allowed. To implement something like what you describe in Janus/Ada certainly is possible (in the sense that anything is possible), but it would be quite a hit in performance, because you'd have to do everything twice. "Changing masters" would certainly be the cheapest way to do it, easiest would be a handler and explicit finalize in the sense of Unchecked_Deallocation (but that would be far more expensive because of exception handling overhead). But the latter only works on the "collections" created for access types, because those don't have pointers into them. The main finalization chain is full of pointers into it. The only way to allow early finalization of stack objects or changing a master oof a stack object would be to add additional dummy nodes for the pointers to point at. That would add overhead to all programs with finalization (with a lot of extra work, the extra overhead could be mitigated in blocks that don't do anything nasty, but that would be very expensive to check with any degree of accuracy). I prefer to keep the model simple, which is to finalize at the master of the object. **************************************************************** From: Randy Brukardt Sent: Friday, January 14, 2005 9:59 PM ... > This seems like another potentially dangerous incompatibility. > Suppose you have: > > X := F(X(1..3)); > > or X is visible up-level to F. 
> You can't finalize X if > F might be able to see part or all of X during its execution. Right. I thought about that on the way home last night. It would have to work like optimizing slices or (non-limited) aggregate assignments. ... > Our compiler essentially already does this analysis for > *non* controlled types, I believe, to decide whether it > is safe to pass in the address of the left hand side > as the result temp for the function. We could certainly > do this for controlled objects as well, if we could perform > finalization before the call when safe. Exactly what I had in mind. > But I suspect Randy was looking for a single approach > that was always allowed, without having to do any analysis > at the call site. In that case, he is stuck for an assignment > statement. Of course, if the function call is used to initialize a > new object, then you can always safely pass in the address > of the new object. Not really. It's clearly going to be necessary to be able to use a temporary (for all types), because the function or aggregate could be directly passed as a parameter, thus there being no object to assign into. Once you have that, doing it for function calls if needed is fine. It's the "if needed" that matters; you don't want simple functions: Ten := To_Unbounded_String ("Ten"); to have to use temporaries solely to get the finalization "right". We already have lots of rules allowing optimizations in this area, so one more is unlikely to be harmful. It should never be necessary to use a temporary for: Ten : Unbounded_String := To_Unbounded_String ("Ten"); and I certainly want that to be true for *all* types, not just limited types. **************************************************************** From: Bob Duff Sent: Saturday, January 15, 2005 9:35 AM Randy wrote: > But for a limited object, the object has to be built-in-place, and thus the > implementation is clearly wrong (*only* for a limited type). My example showed a limited type. What you seem to be saying is that the suggested implementation is wrong only for limited types, *and* only for heap objects. That seems insane. Why should heap objects be different from stack objects, here? (Again, I'm talking about the limited case.) The heap object in question is inaccessible (the allocator failed!), so *requiring* it to remain in existence until collection finalization seems to be of zero benefit to the user. [I realize there are sneaky ways to make this half-baked object accessible, but those are bugs waiting to happen. My point is that the pointer returned by "new" never arrives in this case.] >... Whether that > should be relaxed is an open question. If you want to require the above > behavior, I'll fight you until the ends of the earth - it's clearly > requiring a horrible implementation, and buys nothing for users. But if you > just want to allow it, I don't particularly care. I said "allow", not "require". Certainly, if an implementation has expensive exception handlers, that changes the trade-offs. We need not argue about which methods are "best" for all compilers. As I said, I'm surprised it's not already allowed, and I believe there are implementations that do this sort of thing in some or all cases. **************************************************************** From: Tucker Taft Sent: Saturday, January 15, 2005 12:43 PM ... >>I think the easiest way to implement this is to catch any exceptions >>that occur in the middle of initialization, finalize as necessary, and >>then reraise.
>>This requires keeping track of how far through the array >>you got -- but that's necessary anyway -- this is just the loop index. > > This I totally disagree with. Certainly, you *could* try to implement it > that way, but it would be a lousy implementation for the majority of > compilers. Be that as it may, we implement it this way, and I believe so does Rational. I'm not sure about GNAT. > ... > It makes much more sense for each controlled part (with the technical > meaning of part) to be initialized and finalized individually. Each gets > registered when its initialization finishes successfully, so you can only > finalize those that have finished. I believe Rational keeps careful track of what components have been initialized, and then finalizes just those. I believe Bob was involved in implementing that. The AdaMagic approach is to have a "components master" which we use temporarily for registering components that need finalization, and then at some point we convert to a single registration for the whole object, throwing away the components master. If things get interrupted in the middle, then the components master is the "innermost" master, and the components on it get cleaned up. After we make the switch, the cleanup of the components is embedded in a whole-object cleanup routine generated for types with multiple finalizable components. The components master is unlinked from the chain of masters, and the cleanup routine for the object as a whole is linked onto the appropriate master. I'm sure there are a million ways to implement this, and what makes the most sense will vary from one implementation to the next, depending on their run-time model, and on other tradeoffs they choose to make. > ... > It's clearly wrong: the allocated object belongs to the collection, and > shouldn't be finalized until the collection goes away.... It might be hard to find "clear" wording saying this. And it seems undesirable. It is also interesting that "garbage collection" is permitted by the language, but never defined in detail nor is there a clear explanation of how garbage collection relates to finalization. > But for a limited object, the object has to be built-in-place, and thus the > implementation is clearly wrong (*only* for a limited type). Whether that > should be relaxed is an open question. As far as I know, AdaMagic (and hence Green Hills and Aonix) and Rational both attempt to finalize partially initialized objects right away. I can't imagine any value to the user to postpone this finalization, and by finalizing right away, we can reclaim the space that much sooner. > ... > If you want to require the above > behavior, I'll fight you until the ends of the earth - it's clearly > requiring a horrible implementation, and buys nothing for users. This seems a bit of an overstatement, and reclaiming storage sooner seems more than "nothing." > ... > But if you > just want to allow it, I don't particularly care. OTOH, we have generally > specified finalization behavior of limited types in Ada without allowing > much optimization, so I would tend to be conservative with these rules. It sounds like we need some clarification of this issue. If an allocator fails before creating the access value or activating component tasks, it seems difficult to justify requiring deferring finalization, and based on your strong statement, perhaps also difficult to justify requiring immediate finalization. Also, garbage collection needs to be factored into the finalization rules. > ...
> Such implementations are wrong for limited types (certainly with the rules > as written). It would be interesting to identify these rules. We probably want to make them clearer, and given existing implementations, be permissive of either immediate or deferred finalization of partially initialized heap objects. **************************************************************** From: Bob Duff Sent: Saturday, January 15, 2005 1:22 PM > "garbage collection" is permitted by the language, but > never defined in detail nor is there a clear explanation > of how garbage collection relates to finalization. There is some discussion of this in 13.11.3, much of which was banished to the AARM because we didn't think it was worth putting in the RM, given the lack of Ada implementations supporting GC. We figured, if somebody wants to implement GC, let *them* figure out all the interactions with finalization, with some hints about the issues in the AARM. But it's clear that the intent was that a GC'ed implementation would finalize the collected objects "prematurely". There are Ada implementations on top of the Java Virtual Machine and the .NET virtual machine. I presume they deal with these interactions by letting the virtual machine do its thing. I firmly believe that Ada should *allow* GC, just like I firmly believe that Ada should *allow* generic code sharing, despite the fact that (sadly) not too many implementations do these nice things. **************************************************************** From: Tucker Taft Sent: Saturday, January 15, 2005 2:58 PM Here is a statement that I believe, at least in part, disallows premature finalization of heap objects (7.6.1(10)), and should be revised, probably:

   ... If an instance of Unchecked_Deallocation is never applied to an object created by an allocator, the object will still exist when the corresponding master completes, and it will be finalized then.

**************************************************************** From: Robert I. Eachus Sent: Saturday, January 15, 2005 9:07 PM I think that this argument is getting very far away from the original intent of this feature. The decision that has to be made IMHO is whether to keep the ability to initialize limited objects, or to let some notion of linguistic purity get in the way. In current Ada, if a function has an exception handler, that handler does not catch all exceptions raised between the call and return. If a user writes a function whose definition requires handling all exceptions that may be raised internally, then he knows how to use nested blocks and so forth to ensure this. At first it looks like the necessary rule here would create undue hardship. But it won't. It will hardly even come up. Why? Because within the function returning a limited object, the object may not be limited. More often the case will be that the object will be partially limited. The function will be defined in a location that can "see into" the limited object. The parent part of the object may still be limited in this view, but that is fine. The initialization function called for the parent part will initialize that part of the object, and handle any exceptions it should and can internally. This leaves only two potential concerns. First, an exception caused by the creation of the entire object being returned. But that is not a problem, at least as I see it. The object may be defined in the declarative part--and doesn't get handled locally anyway. More likely it gets defined before the keyword *do*.
We can discuss that particular case at length, but I don't see the errors that may occur there (as opposed to being propagated there) as being all that important. *Storage_Error* may occur, but this is Dave Emery's parachute that opens on impact. Not when the object is too large for the heap--that case should work. But when the function is called with only a few words left on the stack, predicting where *Storage_Error* will occur and which handler might see it is futile. If a programmer wants to handle Tucker's or Bob Duff's examples, he will write:

Tucker wrote:

   function Blah... is
   begin
      return Fum(...);
   exception
      when ... => raise Different_Exception;
   end Blah;

Or

   function Blah... is
   begin
      return Fum(...);
   exception
      when ... => return Alternate_Result(...);
   end Blah;

Both are legal today, and AFAIK we are not talking about changing that, with the possible exception of limited objects being built in place. But the potential problem case, as I see it, is when there is a sequence of statements within the return:

   function Constructor return Limited_Array is
   begin
      return D : Limited_Array(1..10 => <>) do
         for I in D'Range loop
            begin
               D(I) := ...;
            exception
               when others => -- fix D(I) for some I.
            end;
         end loop;
      end return;
   end Constructor;

And here there are no philosophical or other issues. The user can declare a handler inside the return, and it will do exactly what he wants. Of course, we do need to disallow gotos out of the scope and so on, but I hope we have already agreed on that. This means that the *creation* of the object to be returned is the bone of contention. But I guess that I don't see that either. If the object is being built in place, how can it matter to the user whether it is the creation of the 'temporary' or the 'target' that causes the exception? Certainly in:

   Foo : Limited_Bar := Constructor(...);

The user is not going to be surprised to get *Storage_Error* outside the constructor if there is not enough space to create Foo, or *Tasking_Error* if he has exceeded the limit on the number of tasks. So I just don't see a real problem here. Programmers will need to know that exceptions in the sequence_of_statements in a return statement occur after the enclosing scope has been left. But that is the sort of thing that is dealt with by an example in the RM. I could even see resolving the dispute between Tucker and Randy by allowing exceptions caused by creating and allocating the returned object to be handled in either the function or the caller as the implementation chooses. I don't really think it is as bad as all that though. There are some exceptions that will occur in the caller, others that can occur in the function, and a lot where either scope will be appropriate. I just can't see destroying a useful new functionality by getting pedantic about potentially obscure cases. Yes, we have to define it, but no we don't have to over-define it. **************************************************************** From: Robert Dewar Sent: Saturday, January 15, 2005 9:11 PM I would hesitate to revise this. Allowing arbitrary early finalization can be disastrous to fundamental semantics of currently correct programs that use finalization e.g. to properly unlock something at the proper point. Garbage collection can occur early in such a case, since very likely we are using a dummy variable that no one references. In GNAT we have a pragma Finalize_Storage_Only, which says that the only reason for a finalization routine is to free storage. We had in mind two purposes: 1.
Skip finalization at the outer level when program terminates (this is implemented in GNAT now). 2. Allow early finalization in the garbage collected case. I really believe it is essential not to introduce a giant upwards incompatibility here! **************************************************************** From: Tucker Taft Sent: Sunday, January 16, 2005 9:30 AM This is *not* talking about local variables, even unreferenced local variables. In any case we have existing rules that disallow removing local variables of a limited controlled type. This is talking about objects created by an allocator, and most specifically, the issue at hand is components created by an allocator that *fails* due to an exception raised during its evaluation. I believe that we want to encourage implementations to recover the space for a failed allocator as soon as possible, which implies finalizing the components of the as-yet-incomplete-heap-object immediately, rather than at the point where the access type goes out of scope. **************************************************************** From: Robert Dewar Sent: Sunday, January 16, 2005 9:42 AM It still worries me to assume that we are only talking about recovering space here. This would be a definite incompatibility. I just don't know how severe a one. If all the finalizer does is to deallocate, then that's not an issue, if it has unbounded interesting side effects, such as referencing something that does not exist until later on, then this seems a recipe for semantic confusion to me. **************************************************************** From: Jean-Pierre Rosen Sent: Sunday, January 16, 2005 11:51 AM > I believe Randy has made a number of good arguments that > indicate this has relatively little to do with the > extended return statement, and even less to do with > the specifics of its syntax. It is mostly related > to returning limited objects, whether they be specified > by an aggregate, function call, or extended return > statement. In all cases you need to specify what happens > if an exception is propagated before the return statement > is completed. Of course, finalization issues are here to stay. I just thought that this would get rid of the exit/goto/exception issues. **************************************************************** From: Robert I. Eachus Sent: Sunday, January 16, 2005 10:54 PM >It still worries me to assume that we are only talking about >recovering space here. This would be a definite incompatibility. >I just don't know how severe a one. If all the finalizer does >is to deallocate, then that's not an issue, if it has unbounded >interesting side effects, such as referencing something that >does not exist until later on, then this seems a recipe for >semantic confusion to me. I tend to agree with Tucker that the concern you have here is misplaced. There are some tough issues involving creation of a complex object containing tasks, but other rules seem to me to prevent those from ever surfacing. None of the tasks which are part of the object will ever be activated. A task designated by an access variable in the object could be created by an allocator, and be activated even though the object as a whole was never created. But such a task could never be referenced, and should be considered a program design error (or an ACVC* test case ;-) independent of what goes on here. I see the other issue that Robert raises as again, a possible problem, but one which is larger than what could arise in this case. 
If you create a finalization routine which assumes that all objects of the type have the same lifetime, then finalizing one of these objects early due to a failure during creation of a larger object is a potential problem. But that exception will occur in a declarative part (for any limited object). So *any* object created in that declarative part can be finalized early, or out of the expected order. It doesn't matter what is decided here. The program designer (or tester) is going to have to think about such things in every case where there is a complex initialization in a declarative part. This was true in Ada 83, and is true in Ada 95. Any potential exception in a declarative part has to be treated and tested as adding an additional path through the unit. (What I like about Ada compared to PL/I here is that each such case only adds one path. You don't get exponential explosions.) *If anyone reading this is too young to remember certain ACVC tests that didn't make it into the ACATS, consider yourself lucky. **************************************************************** From: Randy Brukardt Sent: Monday, January 17, 2005 5:10 PM Tucker Taft wrote, replying to Robert Dewar: > Robert Dewar wrote: > > Tucker Taft wrote: > > > >> Here is a statement that I believe, at least in part, > >> disallows premature finalization of heap objects (7.6.1(10)), > >> and should be revised, probably: > >> > >> ... If an instance of Unchecked_Deallocation is never > >> applied to an object created by an allocator, the object > >> will still exist when the corresponding master completes, > >> and it will be finalized then. > > > > I would hesitate to revise this. Allowing arbitrary early > > finalization can be disastrous to fundamental semantics of > > currently correct programs that use finalization e.g. to > > properly unlock something at the proper point. Garbage > > collection can occur early in such a case, since very > > likely we are using a dummy variable that no one references... > > This is *not* talking about local variables, even unreferenced > local variables. In any case we have existing rules that disallow > removing local variables of a limited controlled type. > > This is talking about objects created by an allocator, and > most specifically, the issue at hand is components created > by an allocator that *fails* due to an exception raised > during its evaluation. I believe that we want to encourage > implementations to recover the space for a failed allocator > as soon as possible, which implies finalizing the components > of the as-yet-incomplete-heap-object immediately, rather than > at the point where the access type goes out of scope. I don't know what we want to do, but I thought I'd try to shed a bit of light on this topic. So I wrote an example program of the case that we're talking about.
-----
with Ada.Finalization;
package Checkit2 is
   Created_Yet : Boolean := False;
   Finalized_Yet : Boolean := False;
   type Check_Type is new Ada.Finalization.Limited_Controlled with null record;
   procedure Initialize (Obj : in out Check_Type);
   procedure Finalize (Obj : in out Check_Type);
   function Raise_P_E return Integer;
end Checkit2;

package body Checkit2 is
   procedure Finalize (Obj : in out Check_Type) is
   begin
      Finalized_Yet := True;
   end Finalize;
   procedure Initialize (Obj : in out Check_Type) is
   begin
      Created_Yet := True;
   end Initialize;
   function Raise_P_E return Integer is
   begin
      raise Program_Error;
      return 10;
   end Raise_P_E;
end Checkit2;

with Ada.Text_IO;
with Checkit2;
procedure Check2 is
   -- Check when finalization of a failed limited allocator occurs.
   type Record_Type is limited record
      Cont : Checkit2.Check_Type;
      Oops : Integer := Checkit2.Raise_P_E;
   end record;
begin
   Ada.Text_IO.Put_Line ("--- Check when an allocated object is" &
      " finalized when the initializer fails");
   declare
      Early : Boolean;
   begin
      declare
         type Acc_Record_Type is access Record_Type;
         Obj : Acc_Record_Type;
         function Test return Acc_Record_Type is
         begin
            return (new Record_Type);
         exception
            when Program_Error =>
               Early := Checkit2.Finalized_Yet;
               if Checkit2.Finalized_Yet then
                  Ada.Text_IO.Put_Line ("%% Failed allocated object " &
                     "finalized inside of function");
               elsif Checkit2.Created_Yet then
                  Ada.Text_IO.Put_Line ("%% Failed allocated object " &
                     "created but not finalized inside of function");
               else
                  Ada.Text_IO.Put_Line ("%% Failed allocated object " &
                     "controlled component not created");
               end if;
               raise;
         end Test;
      begin
         Obj := Test;
         Ada.Text_IO.Put_Line ("** Test failed to raise exception.");
      end;
   exception
      when Program_Error =>
         if Checkit2.Created_Yet then
            if Checkit2.Finalized_Yet then
               if not Early then
                  Ada.Text_IO.Put_Line ("%% Allocated object finalized " &
                     "when type goes out of scope");
               -- else already reported on finalization.
               end if;
            else
               Ada.Text_IO.Put_Line ("** Allocated controlled component " &
                  "created but not finalized!");
            end if;
         else
            Ada.Text_IO.Put_Line ("%% Allocated controlled component " &
               "never created");
         end if;
   end;
   Ada.Text_IO.Put_Line ("--- Check complete");
end Check2;
----

Unfortunately, it didn't shed much light. Janus/Ada worked as I expected (printing "created but not finalized inside of function" and "finalized when type goes out of scope"). The GNAT version I tried failed outright, printing "created but not finalized inside of function", then "component created but not finalized!". I didn't check if it just finalized the object too late, or never. I don't have ObjectAda installed on this OS right now, so I didn't try it. And I still haven't gotten around to trying to get the Rational Apex working again (the license manager just refuses to work on our network). I'm sure others will try those.

----

My personal opinion on this is rather split. Certainly, Initialize and Finalize routines can link objects into other data structures (Claw works this way), so it's important that we don't require any magic "going away". OTOH, any controlled type that allows allocation of its objects and can't handle an arbitrary Finalize call (from Unchecked_Deallocation, for example) is pretty dubious. OT3H, AI-179 decides not to decide on what happens with Unchecked_Deallocations that fail, and it would be odd to make similar requirements on allocations. Moreover, deallocation of storage is not semantically neutral -- especially if the storage comes from a user-defined pool.
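An illustrative sketch of the "linking into other data structures" pattern just mentioned (the names Registry, Tracked, and Head are invented here; this is not Claw's actual code): Initialize links each object onto a package-level list and Finalize unlinks it, so the exact points at which Finalize is called are visible to the rest of the program, not just to the storage manager.

   with Ada.Finalization;
   package Registry is
      type Tracked;
      type Tracked_Access is access all Tracked;
      -- A controlled type whose Initialize/Finalize maintain an external,
      -- package-level list of the currently existing objects.
      type Tracked is new Ada.Finalization.Limited_Controlled with record
         Next, Prev : Tracked_Access;
      end record;
      procedure Initialize (Obj : in out Tracked);
      procedure Finalize   (Obj : in out Tracked);
   end Registry;

   package body Registry is
      Head : Tracked_Access;  -- head of the list of live objects

      procedure Initialize (Obj : in out Tracked) is
      begin
         -- Register the object once its initialization reaches this point.
         Obj.Next := Head;
         Obj.Prev := null;
         if Head /= null then
            Head.Prev := Obj'Unchecked_Access;
         end if;
         Head := Obj'Unchecked_Access;
      end Initialize;

      procedure Finalize (Obj : in out Tracked) is
      begin
         -- Unlink the object; an extra or skipped Finalize call leaves the
         -- list inconsistent, which is why the timing of finalization
         -- matters to this kind of type.
         if Obj.Prev /= null then
            Obj.Prev.Next := Obj.Next;
         elsif Head = Obj'Unchecked_Access then
            Head := Obj.Next;
         end if;
         if Obj.Next /= null then
            Obj.Next.Prev := Obj.Prev;
         end if;
         Obj.Next := null;
         Obj.Prev := null;
      end Finalize;
   end Registry;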
I do think it is crystal clear that the current model in the RM does not allow such early finalizations (7.6.1 defines when finalizations are done, and there is nothing about exceptions raised by allocators or return statements in 7.6.1!). I suspect implementers got caught making an "as-if" optimization that doesn't quite work. In any case, I'm not planning to run out and create an ACATS test for this case. It's hard to imagine a legitimate use for this, as raising an exception in an initializer is clearly a bug (not a feature!). If some programmers want to play Russian Roulette and leave those bugs in their production systems, that's their business, but I don't have a lot of sympathy. **************************************************************** From: Pascal Leroy Sent: Tuesday, January 18, 2005 6:14 AM Apex produces:

   --- Check when an allocated object is finalized when the initializer fails
   %% Failed allocated object created but not finalized inside of function
   ** Allocated controlled component created but not finalized!
   --- Check complete

which surprises me and doesn't seem quite right. At least, it's GNAT-compatible ;-) **************************************************************** From: Tucker Taft Sent: Tuesday, January 18, 2005 1:45 PM If an allocator's initialization fails, the user can't reclaim the space using Unchecked_Deallocation. For a long-running application, it would presumably be desirable if the implementation would recover the space. Perhaps for this and for garbage collection, we should permit an implementation to implicitly perform an "Unchecked_Deallocation" and the associated finalization as soon as the object is no longer accessible. This would tie into the wording of 7.6.1(10), because it talks in terms of Unchecked_Deallocation. Of course, if the pragma Controlled applies to the access type, then the garbage collection option would be disallowed, though implicitly deallocating the failed allocator might still be desirable to avoid a storage leak. **************************************************************** From: Tucker Taft Sent: Tuesday, January 18, 2005 4:21 PM Interestingly, the second sentence of 7.6.1(10), which indicates that if Unchecked_Deallocation isn't used, the object is finalized at the end of the access type scope, is bracketed in the AARM, as though it is redundant. But later in the AARM, it says that if the implementation does garbage collection, then it "should" finalize the object before reclaiming its storage. These two seem to be inconsistent, unless we hypothesize that garbage collection is implicitly invoking Unchecked_Deallocation. **************************************************************** From: Bob Duff Sent: Monday, June 6, 2005 12:35 PM I'm using draft 11.8 of the [A]ARM. 3.8(13.1/2):

   13.1/2 {AI95-00318-02} If a record_type_declaration includes the reserved word limited, the type is called a limited record type.

So "limited record type" is not synonymous with "record type that is limited"?! That seems rather confusing. How about renaming this concept "explicitly limited record type"? **************************************************************** From: Tucker Taft Sent: Monday, June 6, 2005 7:08 PM Sounds reasonable. I can't imagine this special term is used very much, and it would be wise to be as "explicit" as possible...
;-) **************************************************************** From: Randy Brukardt Sent: Monday, June 6, 2005 9:18 PM That seems OK to me, although finding where it is used is going to be tricky. (Which, I suppose, is the point.) **************************************************************** From: Bob Duff Sent: Tuesday, June 7, 2005 6:46 AM

   3.7(10.f/2, 10.i/2)
   3.8(13.1/2, 31.i/2)
   7.5(8.1/2)
   7.6(17.1/2)
   10.2.1(28.e/2)
   D.10(5.a)

I don't know which of the above are old wording that was intended to mean "record type that is limited". The term does not appear in the Index. **************************************************************** From: Randy Brukardt Sent: Saturday, June 11, 2005 12:35 AM Bob gives a list of places to change: > 3.7(10.f/2, 10.i/2) Interestingly, the notes use the new term, but the normative wording here uses the old gobbledygook. Should 3.7(10) be changed to use "explicitly limited record type" like all of the new wording?? It would seem to be more consistent. > 3.8(13.1/2, 31.i/2) > > 7.5(8.1/2) > > 7.6(17.1/2) > > 10.2.1(28.e/2) These are all uses of the new term. > D.10(5.a) This is an old use of the new term; the meaning is exactly what we mean. So I guess there wasn't any confusion. :-) > I don't know which of the above are old wording that was intended to > mean "record type that is limited". None, amazingly. But you didn't look for "limited record", which might find more hits. > The term does not appear in the Index. It does now. **************************************************************** From: Pascal Leroy Sent: Saturday, June 11, 2005 3:52 AM > > 3.7(10.f/2, 10.i/2) > > Interestingly, the notes use the new term, the normative > wording here uses the old gobbledygook. Should 3.7(10) be > changed to use "explicitly limited record type" like all of > the new wording?? It would seem to be more consistent. It should be changed to use the new terminology, but it must still talk about ancestors and all that, so the sentence will remain rather convoluted. **************************************************************** From: Randy Brukardt Sent: Monday, June 13, 2005 10:51 PM Turns out that we have to change it, because a "type containing the reserved word limited" clearly includes derived types that explicitly include limited, but we certainly don't want them to (it wouldn't necessarily be a "really limited" type).

   type L (...) is limited private;
   type D (...) is limited new L;

D meets the letter of the old rule, but shouldn't be included if L is actually completed by Integer (say). **************************************************************** From: Sergey I. Rybin Sent: Friday, December 2, 2005 1:29 PM
!topic question about the definition of extended_return_statement syntax
!reference RM06-6.5(2.1/2)
!from Sergey Rybin 2005-12-02
!keywords extended_return_statement, identifier, defining identifier
!discussion
RM06-6.5(2.1/2) defines the syntax of extended_return_statement as:

   extended_return_statement ::=
      return identifier : [aliased] return_subtype_indication [:= expression] [do
             ^^^^^^^^^^
         handled_sequence_of_statements end return];

The question is about the use of the identifier notion. On the one hand, the construct identifier : [aliased] return_subtype_indication [:= expression] looks exactly like a declaration, except that it uses 'identifier', but not 'DEFINING_identifier'.
Moreover, RM06-6.5(5.7/2) says:

   Within an extended_return_statement, the return object is declared
                                                          ^^^^^^^^^^^
   with the given identifier, with the nominal subtype defined by the
   return_subtype_indication.

But in all the other cases, if something "is declared", then the corresponding syntax construct contains 'defining_identifier', but not 'identifier'! This looks confusing, and I got really confused by this situation when trying to analyze how the existing ASIS Standard should be extended for Ada 2005. So, in my opinion, either 'identifier' should be replaced with 'defining_identifier' here (and the corresponding note should be added to that part of RM06-3.1 that discusses various forms of declarations), or RM06-6.5 should explicitly explain why it uses 'identifier' in the syntax structure of extended_return_statement. And as an ASIS person, I'm pretty sure that the definition of extended_return_statement syntax structure should use 'defining_identifier', but not 'identifier'. **************************************************************** From: Randy Brukardt Sent: Friday, December 2, 2005 2:16 PM ... > So, in my opinion, either 'identifier' should be replaced with > 'defining_identifier' here (and the corresponding note should be > added to that part of RM06-3.1 that > discusses various forms of declarations), or RM06-6.5 should > explicitly explain > why it uses 'identifier' in the syntax structure of > extended_return_statement My initial reaction to this was "what's a defining_identifier"? Which might explain the syntax. :-) The closest corollary to extended return is for loops (that is, a statement that declares something), and it uses "defining_identifier". So I think you are right. But I'd like to hear from Tucker if there is some subtle reason for this difference, or if he, like me, just plain forgot about "defining_identifier". **************************************************************** From: Gary Dismukes Sent: Friday, December 2, 2005 2:17 PM This looks like an oversight to me. It certainly seems that this should be a defining_identifier. **************************************************************** From: Tucker Taft Sent: Friday, December 2, 2005 3:08 PM I agree. defining_identifier is the right syntactic construct. **************************************************************** From: Robert A. Duff Sent: Friday, December 2, 2005 4:30 PM Yes. I suspect there are dozens of rules that apply to defining_identifiers (and defining names, which are defined in terms of defining_identifiers), and that _should_ apply to these return objects. ****************************************************************
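For reference, a small illustrative example of the construct under discussion (the package Counters and all names in it are invented for this sketch, not taken from the AI): the identifier R after "return" plays the role of the defining_identifier in question; it declares the return object, which is visible as a variable inside the do part, and the limited result is built in place at the call site.

   with Ada.Finalization;
   package Counters is
      type Counter is new Ada.Finalization.Limited_Controlled with record
         Count : Natural := 0;
      end record;
      function Make (Start : Natural) return Counter;
   end Counters;

   package body Counters is
      function Make (Start : Natural) return Counter is
      begin
         -- R names the return object declared by the extended return
         -- statement; the do part can update it before the function
         -- completes.
         return R : Counter do
            R.Count := Start;
         end return;
      end Make;
   end Counters;

A caller can then write, for example, C : Counters.Counter := Counters.Make (10); and the limited result object is created directly in C.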