CVS difference for ais/ai-00318.txt

Differences between 1.6 and version 1.7

--- ais/ai-00318.txt	2003/12/10 00:14:04	1.6
+++ ais/ai-00318.txt	2004/01/08 04:16:38	1.7
@@ -1,4 +1,4 @@
-!standard 03.03.01  (02)                              03-06-23  AI95-00318/01
+!standard 03.03.01  (02)                              03-12-10  AI95-00318/02
 !standard 06.05.00  (17)
 !standard 06.05.00  (18)
 !class amendment 02-10-09
@@ -10,16 +10,20 @@
 
 !summary
 
-New syntax is proposed for identifying the object that
-will be returned from a function, allowing the
-object to be built in the context of the caller,
-without further copying required.
+A new extended syntax is proposed for the return statement,
+providing a name for the new object being created as a result
+of a call on the function.
 
-This could be used to support returning limited objects from a
+This new syntax can be used to support returning limited objects from a
 function, to support returning objects of an anonymous access type,
 and more generally to reduce the copying that might be required
 when a function returns a complex object, a controlled object, etc.
 
+The existing ability to return by reference is changed so that it
+would be permitted only if the function were declared with an extra
+keyword which distinguishes it syntactically from a "normal"
+function -- the result of a "normal" function is always a newly created object.
+
 !problem
 
 We already have a proposal for allowing aggregates of a limited type,
@@ -44,47 +48,72 @@
 from the function. This is difficult to do while still creating
 the object directly in its "final" location.
 
+Currently functions that return a limited private type may have an
+accessibility check performed on the object returned, depending on a
+property ("return-by-reference-ness") which is not generally visible
+based on the partial view of the type. This means that a function
+that works initially may stop working if the full type of the result
+type is changed to include, say, a limited tagged component, or some
+other component that is return-by-reference.
+
+A function whose result type turns out to be return-by-reference
+cannot be allowed where a new object is required. However, there
+is nothing in the declaration of such a function that indicates it
+returns by reference.
+
 !proposal
+
+A new syntax for a function specification is proposed:
 
-When declaring a local variable inside a function (not including
-within a nested program unit), the variable may be declared to
-be a "return" object, using the following syntax (analogous to
-the syntax used for constants):
-
-    identifier : [ALIASED] RETURN subtype_indication [:= expression];
-
-Within the scope of a return object (except within nested program units), no
-other return objects may be declared, and all return statements
-must have the name of the return object as their returned expression.
-
-[Possible alternative: return statements in the scope of a "return" object
-must omit the returned expression, and be like the return statements
-of a procedure. One possible down side of omitting the name
-of the return object is that it makes the reader's job a bit
-harder; they have to look back to find the object being returned.
-One possible up side -- it is perhaps clearer that no copying is happening
-at the return statement.]
-
-The return object would not be finalized prior to leaving the function.
-The caller would be responsible for its finalization.
-
-This syntax would not be restricted to limited types. It could
-also be used for non-limited types. The implementation advice would
-be that the amount of copying, finalization, etc. should be reduced,
-if possible, as part of returning from the function. This could be
-particularly useful for functions that return large objects, or objects
-with controlled parts.
-
-A call of a function with a limited result type could be used in the
-same contexts where we have proposed to allow aggregates of a limited
-type, namely contexts where a new object is being created (or can be).
+    FUNCTION designator [parameter_profile] ALIASED RETURN subtype_mark
 
-  1) Initializing a newly declared object (including a "return" object)
+Such a function is defined to return by reference. Upon return from such a
+function, a check is made that the object associated with the return expression
+has an accessibility level that is no deeper than that of the function
+declaration. Program_Error is raised if this check fails.
+
+A call of a return-by-reference function denotes an aliased constant view of the object
+associated with the return expression. The accessibility level of this view
+is that of the function.
+
+[NOTE: One possibility is to eliminate this return-by-reference capability,
+in favor of functions with an anonymous access result type. See the discussion.]
+
+-------------
+
+An extended syntax for the return statement is proposed:
+
+    RETURN identifier : [ALIASED] subtype_indication [:= expression] [DO
+      handled_sequence_of_statements
+    END RETURN];
+
+Such an extended return statement is permitted only immediately
+within a function which is not a return-by-reference function.
+The specified identifier names the object that is the result of
+a call on the function. If the expression is present, it provides
+the initial value for the result object. If not, the result object
+is default initialized. If the handled_sequence_of_statements is
+present, it is executed after initializing the result object. Within
+the handled_sequence_of_statements, the identifier denotes a variable
+view of the result object with nominal subtype given by the subtype_indication.
+When the handled_sequence_of_statements completes, the function is complete.
+
+[Question: Should an expression-less return statement be permitted
+within the handled_sequence_of_statements?  That would be consistent
+with the way that accept statements work.]
+
+A call of a non-return-by-reference function with a limited result type may
+be used in the same contexts where we have proposed to allow aggregates of a
+limited type, namely contexts where a new object is being created (or can be).
+
+  1) Initializing a newly declared object (including a result object identified
+     in an extended return statement)
   2) Default initialization of a record component
   3) Initialized allocator
   4) Component of an aggregate
   5) IN formal object in a generic instantiation (including as a default)
-  6) Expression of a return statement
+  6) Expression of a return statement (though note that if the function
+     is return-by-reference, it would surely fail the accessibility check)
   7) IN parameter in a function call (including as a default expression)
 
 In addition, since the result of a function call is a name in Ada 95,
@@ -94,9 +123,22 @@
 
   8) Declaring an object that is the renaming of a function call.
   9) Use of the function call as a prefix to 'Address
+
+In other words, it would be permitted in *any* context where limited types
+are permitted.  With the new proposals, that is pretty much *any* context
+where a "name" that denotes an object or value is permitted, except as the
+right hand side of an assignment statement.
 
-If we permit function result types to be anonymous access types
-(e.g. "function Blah return access T"), then we likely will want such
+Note that a call of a return-by-reference function, because it represents
+a view of a preexisting object, is not permitted in contexts 1, 2, 3, 4, and 6
+if the result type is limited.  Such a call *would* be permitted in the
+expression of a return statement for another return-by-reference function.
+
+
+-----------
+
+[ASIDE (see AI-325): If we permit function result types to be anonymous access
+types (e.g. "function Blah return access T"), then we likely will want such
 functions, if they return the result of an allocator, to be able to use the
 context of the call to determine the storage pool for the allocator.
 This proposed syntax would allow the function to do the allocator in the
@@ -105,48 +147,40 @@
 object would inherit the storage pool determined by the calling context,
 so that allocators that are used to initialize it, or that are assigned
 to it later, would use the caller-determined storage pool.
+END of ASIDE]
 
 !wording
 
 !example
 
 Here is an example of a function with a limited result type
-using a "return" object:
+using an extended return statement:
 
-    function Construct_Obj(Len : Natural) return Lim_Type is
-	Result : return Lim_Type(Discrim => Len);  -- the "return" object
+    function Make_Obj(Len : Natural) return Lim_Type is
     begin
-	-- Finish the initialization of the "return" object.
-	for I in 1..Len loop
-	    Result.Data(I) := I;
-	end loop;
-
-	-- And now return it.
-	return Result;
-           -- [Alternative: omit "Result" (or entire return statement);
-	   --  "return Result;" would be implicit]
-    end Construct_Obj;
+        return Result : Lim_Type(Discrim => Len) do -- the "result" object
+            -- Finish the initialization of the "result" object.
+            for I in 1..Len loop
+                Result.Data(I) := I;
+            end loop;
+        end return;
+    end Make_Obj;
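
For readers who want to try the example, here is a minimal complete sketch
built around Make_Obj. It assumes a compiler that accepts the extended
return syntax proposed here (it is not Ada 95). The declarations of
Int_Array and Lim_Type and the enclosing procedure Extended_Return_Demo are
hypothetical; the AI does not show them.

```ada
with Ada.Text_IO;

procedure Extended_Return_Demo is

   --  Hypothetical declarations; the AI only uses the name Lim_Type.
   type Int_Array is array (Positive range <>) of Integer;

   type Lim_Type (Discrim : Natural) is limited record
      Data : Int_Array (1 .. Discrim);
   end record;

   function Make_Obj (Len : Natural) return Lim_Type is
   begin
      return Result : Lim_Type (Discrim => Len) do
         --  Result is a variable view of the result object.
         for I in 1 .. Len loop
            Result.Data (I) := I;
         end loop;
      end return;
   end Make_Obj;

   --  Initializing a newly declared limited object from a function call.
   Obj : Lim_Type := Make_Obj (3);

begin
   Ada.Text_IO.Put_Line (Integer'Image (Obj.Data (3)));
end Extended_Return_Demo;
```

Because Lim_Type is limited, the declaration of Obj relies on context 1 in
the list of permitted contexts: the call initializes a newly declared
object, so no assignment of a limited value takes place.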
 
-Here is essentially the same function, but with an anonymous access
+[ASIDE (see AI-325): Here is essentially the same function, but with an anonymous access
 type for its result type:
 
-    function Construct_Obj(Len : Natural) return access Lim_Type is
-	Result : return access Lim_Type; -- The "return" object
+    function Make_Obj(Len : Natural) return access Lim_Type is
     begin
-        Result := new Lim_Type(Discrim => Len);
-          -- this uses the storage pool determined by the caller context
+        return Result : access Lim_Type do -- The "result" object
+            Result := new Lim_Type(Discrim => Len);
+                -- this uses the storage pool determined by the caller context
+            -- Finish the initialization of the allocated object
+            for I in 1..Len loop
+                Result.Data(I) := I;
+            end loop;
+        end return;
+    end Make_Obj;
 
-	-- Finish the initialization of the allocated object
-	for I in 1..Len loop
-	    Result.Data(I) := I;
-	end loop;
-
-	-- And now return it.
-	return Result;
-           -- [Alternative: omit "Result" (or entire return statement);
-	   --  "return Result;" would be implicit]
-    end Construct_Obj;
-
 By "caller context", we mean that the same rules as apply to an
 allocator would apply to calls on this function, where the
 expected (access) type would determine the storage pool:
@@ -156,8 +190,10 @@
 
     P : My_Acc_Type;
   begin
-    P := Construct_Obj(3);
-     -- allocator inside Construct_Obj uses My_Amazing_Stg_Pool
+    P := Make_Obj(3);
+     -- allocator inside Make_Obj uses My_Amazing_Stg_Pool
+
+END of ASIDE]
 
 !discussion
 
@@ -169,8 +205,8 @@
 Just allowing a function whose whole body is a return statement
 returning an aggregate (or another function call) does not give the
 programmer much flexibility. What they would like is to be able
-to create the object and then initialize it further somehow, perhaps
-by calling a procedure, doing a loop (as in the examples above),
+to create the object being returned and then initialize it further somehow,
+perhaps by calling a procedure, doing a loop (as in the examples above),
 etc. This requires a named object. However, to avoid copying,
 we need this object to be created in its final "resting place,"
 i.e. in the target of the function call. This might be in the
@@ -178,77 +214,92 @@
 or it might be in the heap, or it might be a stand-alone local
 object.
 
-Because the implementation needs to create the returned object in a place
+Because the implementation needs to create the result object in a place
 or a storage pool determined by the caller, it is important that
 the declaration of the object be distinguished in some way.
-By using the keyword "return" in its declaration, we have
-a fairly intuitive way for the programmer to indicate that
-this is *the* object to be returned. Clearly we only want
-to allow one of these at a time, and to require that all
-return statements within its scope explicitly (or perhaps
-implicitly) return that object.
+By declaring it as part of an extended return statement, we have
+a way for the programmer to indicate that
+this is *the* object to be returned.  Clearly we don't
+want to allow extended return statements to be nested.
 
 Because it may be necessary to do some computing before deciding
-exactly how the return object should be declared, we permit
-the return object to be declared within nested blocks within
-the function so long as there is no return object
-for the function already in scope. So different
-branches of an if or case statement could declare their "own"
-return object if appropriate, for example.
+exactly how the result object should be declared, we permit
+the extended return statement to occur any place a normal return
+statement is permitted. So different branches of an if or case statement
+could have their own extended return statements, each with its own named
+result object.
 
-Note that we have allowed the user to declare the return object
-as "aliased."  This seems like a natural thing which might be
+Note that we have allowed the user to declare the result object
+as "aliased." This seems like a natural thing which might be
 wanted, so you could initialize a circularly-linked list header
 to point at itself, etc.
 
-We had considered a different syntax for this before, namely a
-new kind of return statement, analogous to an accept statement,
-e.g.:
-
-    return Result : T := blah do
-        Result.Data(3) := 77;
-        ...
-    end Result;
+Note that we had discussed various mechanisms where information
+from the calling context would be available inside the function
+at the *language* level. In particular, it would be possible to refer
+to the values of the discriminants or bounds of the object being
+initialized, presuming it was constrained, *within* the subtype
+indication and initializing expression, if any.
+
+Ultimately this capability was not included in this proposal, as it
+created a series of somewhat complicated restrictions on usage and made the
+implementation that much more difficult. Note that the implementation
+may still need to pass in information from the calling context, depending
+on the run-time model, because if the type is "really" limited (e.g.
+it is limited tagged, or contains a task or a protected object), then
+the new object must be built in its final resting place. In many run-time
+models, that means the storage needs to be allocated at the call-site if the
+object being initialized is a component of some larger object.
+
+However, by not allowing the *programmer* to refer to this contextual
+information at the language level, we give the implementation more
+flexibility in how it solves the build-in-place requirement for
+"really" limited objects. See the discussion below about implementation
+approaches.
+
+The proposed syntax for extended return statements was discussed a year or so
+ago, but when this AI was first written up, we proposed instead a revised
+object declaration syntax where the word "return" was used almost like the word
+"constant," as a qualifier.  This was somewhat more economical in terms of
+syntax and indenting, but was not felt to be as clear semantically as this
+current syntax.
 
-However, Bob Duff pointed out that for simple cases you ended up
-with two levels of nesting which seemed excessive:
-
-    function Fum() return T is
-    begin
-       return Result : T := blah do
-          Result.Data(3) := 77;
-          ...
-       end Result;
-    end Fum;
-
-Making a smaller change to the object declaration syntax seemed
-a simpler approach.
-
 POSSIBLE IMPLEMENTATION APPROACHES
 
-The implementation approach for anonymous access result types is very
-similar to that for limited result types. In the following,
+[ASIDE (See AI-325): The implementation approach for anonymous access result
+types is very similar to that for limited result types. In the following,
 we will mostly talk about limited result types. Towards the end
 we will explain how it applies to anonymous access result types.
-
 Full accessibility level checking adds to the complexity.
 At the end we will show how to introduce restrictions
-that eliminate most of this complexity, in exchange for
-some loss in functionality.
-
-The implementation of this for limited result types is straightforward if
-the size of the result is known to the caller. It is essentially equivalent
-to a procedure with an OUT parameter -- the caller allocates
+that eliminate most of this complexity, in exchange for some loss in
+functionality.
+END of ASIDE]
+
+The implementation of the return-by-reference function is the same as
+the existing capability for functions whose result type is return-by-reference.
+The difference is that this would be permitted for types that are not
+"return-by-reference," and perhaps not even limited.  In particular,
+an accessibility check is performed at the point of the return statement,
+and a reference to the object associated with the return expression is
+returned to the caller.
+
+The implementation of the extended return statement for non-limited types
+should minimize the number of copies, but may still require a copy in some
+implementation models and in some calling contexts.
+
+The implementation of the extended return statement for limited result types is
+straightforward if the result subtype is constrained. It is essentially
+equivalent to a procedure with an OUT parameter -- the caller allocates
 space for the target object, and passes its address to the called
 routine, which uses it for the "return" object.
+
+If the result subtype is unconstrained, then there are two basic possibilities:
 
-If the size of the function result is not known to the caller (i.e.
-the function result subtype is unconstrained, and perhaps indefinite), then
-there are two basic possibilities:
-
-   1) The target object's (nominal) subtype is constrained (or at least
-      "definite"), even though the function result subtype is unconstrained;
-      the target object might be a component of a larger object.
+   1) The target object's (nominal) subtype is definite, and either constrained or
+      the size of the object is independent of the constraints (e.g. allocate-the-max
+      is used for the object); the target object might be a component of a
+      larger object.
 
    2) The target object's nominal subtype is unconstrained, and its size
       is to be determined by the result returned from the function;
@@ -264,7 +315,7 @@
 One reasonable way to do so is for the caller to provide a
 "storage pool" for the result. In the first case, this storage
 "pool" has space for exactly one object of a given maximum size.
-Its allocate routine is trivial. It just checks to see if the
+Its Allocate routine is trivial. It just checks to see if the
 size is no greater than the space available, and then returns the
 preallocated (target) address.
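
The one-object "pool" described above can be sketched with the Ada 95
System.Storage_Pools interface. The package name One_Slot_Pools, the type
One_Slot_Pool, and its Target and Capacity components are all hypothetical;
an implementation would more likely do this inside its run-time system than
with a user-visible pool type.

```ada
with System;
with System.Storage_Elements;
with System.Storage_Pools;

package One_Slot_Pools is
   use System.Storage_Elements;

   --  Hypothetical pool with room for exactly one object, located at an
   --  address that the caller has already set aside for the target.
   type One_Slot_Pool is new System.Storage_Pools.Root_Storage_Pool with record
      Target   : System.Address := System.Null_Address;
      Capacity : Storage_Count  := 0;
   end record;

   procedure Allocate
     (Pool                     : in out One_Slot_Pool;
      Storage_Address          : out System.Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count);

   procedure Deallocate
     (Pool                     : in out One_Slot_Pool;
      Storage_Address          : System.Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count);

   function Storage_Size (Pool : One_Slot_Pool) return Storage_Count;
end One_Slot_Pools;

package body One_Slot_Pools is

   procedure Allocate
     (Pool                     : in out One_Slot_Pool;
      Storage_Address          : out System.Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count) is
   begin
      --  Check that the request fits, then hand back the preallocated
      --  target address. Alignment is assumed to be the caller's concern.
      if Size_In_Storage_Elements > Pool.Capacity then
         raise Storage_Error;
      end if;
      Storage_Address := Pool.Target;
   end Allocate;

   procedure Deallocate
     (Pool                     : in out One_Slot_Pool;
      Storage_Address          : System.Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count) is
   begin
      null;  --  The caller owns the storage; nothing to release here.
   end Deallocate;

   function Storage_Size (Pool : One_Slot_Pool) return Storage_Count is
   begin
      return Pool.Capacity;
   end Storage_Size;

end One_Slot_Pools;
```

The block holds two compilation units (specification and body); with GNAT
they would normally go in separate files. The point of the sketch is only
that Allocate performs a single size check before returning the target
address.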
 
@@ -272,53 +323,44 @@
 associated with the initialized allocator at the call site,
 or a storage pool that represents a secondary stack, or equivalent,
 used for returning objects of unknown size from a function.
+
+In either case, the function would return the address of the new
+object.
 
-For upward compatibility, we would need to accommodate
-functions that return pre-existing objects by reference.
-One way to do this would be for the caller to provide an
-additional implicit boolean parameter which would indicate
-whether the called routine *must* create a new object, or
-could return a reference to an existing object.
-
-Of the nine places identified above where calls on functions with
-limited result type would be permitted, the cases where the called
-routine must create a new object are (1)-(5). Cases (6)-(9)
-allow the use of preexisting objects, so the storage pool provided
-would generally be the secondary stack if the size is unknown
-to the caller, or a preallocated primary stack area, if
-the size of the object returned is always the same. Case (6),
-where a return statement returns the result of a function call,
-is a bit of a halfway situation. For (6), the storage pool
-provided as part of the call in the return statement would be
-the same storage pool passed to the function.
-
-When the boolean flag indicates that a new object is not required,
-the called routine could return a reference to a preexisting object,
-and ignore the storage pool or target address provided.
-As a possible optimization, this case could be indicated by simply
-providing a null storage pool parameter, rather than a separate
-boolean flag. The called routine would take this to mean that
-the secondary stack, or equivalent, should be used if a new
-object is being created, but that it may return a reference to a
-preexisting object. For the simplest implementation model where
-the size of the result is always known to the caller, and no
-storage pool parameter is provided, a separate flag would probably
-be necessary. The net effect is that there would be one
-implicit parameter in both situations, a boolean flag for the
-known-size function result, and a possibly-null storage pool
-for the unknown-size function result.
-
-In all cases, the called routine would return the address of the result,
-whether newly created or preexisting. The caller would use
-this returned address in all cases where the function result might
-be a preexisting object (cases (6)-(9)), or in cases where the caller
-didn't preallocate space for the target.
+A "bare" storage pool may not be enough in general.  If the type
+has any task parts, then these tasks must be placed on an activation
+list determined by the calling context.  They may also be linked onto a
+master record of some sort, unless this is deferred until
+activation occurs.  Note that the tasks cannot be activated
+until after returning from the call, since they may have
+to be activated in conjunction with other tasks having the
+same master.
+
+If the type has any controlled or protected parts, then the object
+as a whole, or the individual parts, may need to be added to
+a cleanup list determined by the calling context.
+
+If the type has any access discriminants, then some kind of
+accessibility level will need to be provided, since the access
+discriminant may only be initialized to point to an object
+whose accessibility level is no deeper than that of the
+storage pool where the new object is being allocated.
+
+What this means is that rather than passing just a reference
+to a storage pool, it is more likely the caller will pass
+a reference to a structure which in turn refers to:
+
+  - a storage pool,
+  - an accessibility level,
+  - an activation list,
+  - the associated master,
+  - a cleanup list
 
 IMPLEMENTATION APPROACH FOR ANONYMOUS ACCESS RESULT TYPE
 
-For anonymous access result types, a very similar approach would
-be taken. In this case, however, a new object is never required.
-It would always be permissible to return an access value designating
+For anonymous access result types, a similar approach would
+be taken. In this case, however, a new object is not required.
+It would be permissible to return an access value designating
 a preexisting object. The storage pool parameter would always
 be required, but the caller could always ignore it. An accessibility
 level would be needed associated with the storage pool, so the called
