Version 1.15 of ais/ai-10318.txt
!standard 03.10.02(10) 05-09-20 AI95-00318-02/11
!standard 03.10.02(13)
!standard 03.08(14)
!standard 03.09(24)
!standard 04.03.03(11)
!standard 05(02)
!standard 05.01(04)
!standard 05.01(05)
!standard 05.01(14)
!standard 06.01(13)
!standard 06.01(23)
!standard 06.01(24)
!standard 06.01(28)
!standard 06.03.01(16)
!standard 06.04(11)
!standard 06.05(01)
!standard 06.05(02)
!standard 06.05(03)
!standard 06.05(04)
!standard 06.05(05)
!standard 06.05(06)
!standard 06.05(07)
!standard 06.05(08)
!standard 06.05(09)
!standard 06.05(10)
!standard 06.05(11)
!standard 06.05(12)
!standard 06.05(13)
!standard 06.05(14)
!standard 06.05(15)
!standard 06.05(16)
!standard 06.05(17)
!standard 06.05(18)
!standard 06.05(19)
!standard 06.05(20)
!standard 06.05(21)
!standard 06.05(22)
!standard 06.05(24)
!standard 07.03(19)
!standard 07.05(02)
!standard 07.05(08)
!standard 07.05(09)
!standard 07.05(23)
!standard 07.06(17.1)
!standard 07.06.01(02)
!standard 07.06.01(18)
!standard 08.01(4)
!standard 09.05.02(29)
!standard 13.08(10)
!class amendment 02-10-09
!status Amendment 200Y 04-09-23
!status WG9 approved 04-11-18
!status ARG Approved 5-0-4 04-09-23
!status work item 04-07-28
!status ARG Approved 9-0-2 04-06-17
!status work item 03-05-23
!status received 02-10-09
!priority Medium
!difficulty Medium
!subject Limited and anonymous access return types
!summary
A new extended syntax is proposed for the return statement,
providing a name for the new object being created as a result
of a call on the function.
This new syntax can be used to support returning limited objects from a
function and more generally to reduce the copying that might be required
when a function returns a complex object, a controlled object, etc.
The existing ability to return by reference is replaced by an
ability to have an anonymous access type as a return type.
!problem
We already have a proposal (AI-287) for allowing aggregates of a limited type,
by requiring that the aggregate be built directly in the target object rather
than being copied into the target.
But aggregates can only be used with non-private types. Objects of limited
private types still could not be initialized at their declaration point.
It would be natural to allow functions to return limited objects,
so long as the object could be built directly in the "target"
of the function call, which could be a newly created object being
initialized, or simply a parameter to another subprogram call.
When returning a limited type it may be desirable to perform some other
initialization on the object after it has been created, but before returning
from the function. This is difficult to do while still creating the object
directly in its "final" location.
Currently functions that return a limited private type may have an
accessibility check performed on the object returned, depending on a
property ("return-by-reference-ness") which is not generally visible
based on the partial view of the type. This means that a function
that works initially may stop working if the full type of the result
type is changed to include, say, a limited tagged component, or some
other component that is return-by-reference.
A function whose result type turns out to be return-by-reference
cannot be allowed where a new object is required. However, there
is nothing in the declaration of such a function that indicates it
returns by reference.
The capability to return-by-reference could be useful for nonlimited types,
but it becomes even more useful if a call on such a function could be treated
as a variable, so it could be used on the left-hand side of an assignment.
These capabilities exist in the language without introducing the
conceptual oddity of return-by-reference. A function returning
an access value allows the effect of return-by-reference, and doesn't
require changing the language, at the cost of a bit of extra verbosity in
some cases (the need to dereference the result).
!proposal
Anonymous access types are permitted for a function result type:
parameter_and_result_profile ::=
[formal_part] return subtype_mark
| [formal_part] return access_definition
An anonymous access type used as the result type of a function is called an
access result type. The accessibility level of an access result type is that
of the declaration containing the parameter_and_result_profile.
-------------
An extended syntax for the return statement is proposed:
RETURN identifier : [ALIASED] return_subtype_indication [:= expression] [DO
handled_sequence_of_statements
end return];
Such an extended return statement is permitted only immediately within a
function. The specified identifier names the object that is the result of
a call on the function. If the expression is present, it provides
the initial value for the result object. If not, the result object
is default initialized. If the handled_sequence_of_statements is
present, it is executed after initializing the result object. Within
the handled_sequence_of_statements, the identifier denotes a variable
view of the result object with nominal subtype given by the subtype_indication.
When the handled_sequence_of_statements completes, the function is complete.
Note: An expression-less return statement is permitted
within the handled_sequence_of_statements, similar to the way
that accept statements work.
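As a rough illustration of the extended return statement and of the
expression-less return permitted inside it, here is a minimal sketch. The names
Buffer and Fill_Squares are invented for this illustration and are not part of
the proposal.

```ada
with Ada.Text_IO;

procedure Extended_Return_Demo is

   --  Buffer and Fill_Squares are hypothetical names used only here.
   type Buffer is array (1 .. 8) of Natural;

   --  Fills the result with successive squares, stopping as soon as a
   --  square would exceed Limit. The remaining elements keep the value
   --  given by the initializing expression.
   function Fill_Squares (Limit : Natural) return Buffer is
   begin
      return Result : Buffer := (others => 0) do
         for I in Result'Range loop
            if I * I > Limit then
               --  Expression-less return: completes the extended return
               --  statement, and with it the function call.
               return;
            end if;
            Result (I) := I * I;
         end loop;
      end return;
   end Fill_Squares;

   B : constant Buffer := Fill_Squares (Limit => 20);

begin
   for I in B'Range loop
      Ada.Text_IO.Put (Natural'Image (B (I)));
   end loop;
   Ada.Text_IO.New_Line;
end Extended_Return_Demo;
```

The inner "return;" plays the same role as a return inside an accept statement:
it leaves the statement early, and the object named Result is what the caller
receives.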
A call of a function with a limited result type may
be used in the same contexts where we have proposed to allow aggregates of a
limited type, namely contexts where a new object is being created (or can be).
1) Initializing a newly declared object (including a result object identified
in an extended return statement)
2) Default initialization of a record component
3) Initialized allocator
4) Component of an aggregate
5) IN formal object in a generic instantiation (including as a default)
6) Expression of a return statement
7) IN parameter in a function call (including as a default expression)
In addition, since the result of a function call is a name in Ada 95,
the following contexts would be permitted, with the same semantics
as creating a new temporary constant object, and then creating a
reference to it:
8) Declaring an object that is the renaming of a function call.
9) Use of the function call as a prefix to 'Address
In other words, it would be permitted in any context where limited types
are permitted. With the new proposals, that is pretty much any context
where a "name" that denotes an object or value is permitted, except as the
right hand side of an assignment statement.
This proposal assumes that AI-287 is adopted; it does not repeat the changes
needed to allow function_calls in the contexts listed above.
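The following sketch shows how several of the contexts listed above might look
in practice, assuming both this proposal and AI-287 are adopted. Counter, Pair,
Make, and Total are hypothetical names chosen for the illustration.

```ada
procedure Limited_Contexts_Demo is

   --  All names below are hypothetical; they only illustrate the contexts.
   type Counter is limited record
      Value : Natural := 0;
   end record;

   type Counter_Ptr is access Counter;

   type Pair is limited record
      Left, Right : Counter;
   end record;

   function Make (Start : Natural) return Counter is
   begin
      return Result : Counter do
         Result.Value := Start;
      end return;
   end Make;

   function Total (C : Counter) return Natural is
   begin
      return C.Value;
   end Total;

   A : Counter := Make (1);                            --  context 1
   P : Counter_Ptr := new Counter'(Make (2));          --  context 3
   Q : Pair := (Left => Make (3), Right => Make (4));  --  context 4
   N : constant Natural := Total (Make (5));           --  context 7
   R : Counter renames Make (6);                       --  context 8

begin
   pragma Assert
     (A.Value + P.Value + Q.Left.Value + Q.Right.Value + N + R.Value = 21);

   --  A := Make (7);
   --  The line above would remain illegal: assignment statements are
   --  still not available for limited types.
   null;
end Limited_Contexts_Demo;
```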
!wording
Add before 3.8(14):
If a record_type_declaration includes the reserved word limited, the type
is called an explicitly limited record type.
In 3.9(24), change "return expression" to "return object".
Replace 3.10.2(10) with:
For any function, the accessibility level of the result object is that of the
execution of the called function.
Add after 3.10.2(13):
* The accessibility level of the anonymous access type of an access result
type (see 6.5) is the same as that of the associated function or
access-to-subprogram type.
Change 4.3.3(11) as follows:
For an ..., the
expression of a [return_statement]{return statement},
the initialization expression in an object_declaration, or ...
function [result]{return object}, object, or ...
Replace "return_statement" with "return statement" in 5(2).
Change "return_statement" in 5.1(4) to "simple_return_statement".
Add "extended_return_statement" to 5.1(5).
Replace "return_statement" with "return statement" in 5.1(14).
Change 6.1(13) to:
parameter_and_result_profile ::=
[formal_part] return subtype_mark
| [formal_part] return access_definition
Add the following to 6.1(23):
The nominal subtype of a function result is the subtype
denoted by the subtype_mark, or defined by the access_definition,
in the parameter_and_result_profile.
Change 6.1(24) to:
An access parameter is a formal in parameter specified by an
access_definition.
An access result type is a function result type specified by an
access_definition.
An access parameter or result type is of an anonymous general
access-to-variable type (see 3.10). Access parameters allow dispatching
calls to be controlled by access values.
Change 6.1(28) to:
* For any non-access result, the nominal subtype of the function result.
* For any access result type of an access-to-object type, the
designated subtype of the result type.
* For any access result type of an access-to-subprogram type, the
subtypes of the profile of the result type.
Modify 6.3.1(16) as follows:
Two profiles are mode conformant if they are type-conformant, and
corresponding parameters have identical modes, and, for access parameters
{or access result types}, the designated subtypes statically match.
Replace "return_statement" with "return statement" in 6.4(11).
Replace clause 6.5 with the following:
6.5 Return Statements
A simple_return_statement or extended_return_statement (collectively called
a return statement) is used to complete the execution of the innermost
enclosing subprogram_body, entry_body, or accept_statement.
Syntax
simple_return_statement ::= return [expression];
extended_return_statement ::=
return identifier : [aliased] return_subtype_indication [:= expression] [do
handled_sequence_of_statements
end return];
return_subtype_indication ::= subtype_indication | access_definition
Name Resolution Rules
The result subtype of a function is the subtype denoted by the
subtype_mark, or defined by the access_definition, after the reserved word
RETURN in the profile of the function. The expected type for the
expression, if any, of a simple_return_statement is the result type
of the corresponding function.
The expected type for the expression of an extended_return_statement is
that of the return_subtype_indication.
Legality Rules
A return statement shall be within a callable construct, and it applies to
the innermost callable construct or extended_return_statement that contains
it. A return statement shall not be within a body that is within the
construct to which the return statement applies.
A function body shall contain at least one return statement that applies
to the function body, unless the function contains code_statements. A
simple_return_statement shall include an expression if and only if
it applies to a function body. An extended_return_statement shall apply to
a function body.
For an extended_return_statement that applies to a function body:
* If the result subtype of the function is defined by a subtype_mark, the
return_subtype_indication shall be a subtype_indication. The type of the
subtype_indication shall be the result type of the function. If the
result subtype of the function is constrained, then the subtype defined
by the subtype_indication shall also be constrained and shall statically
match this result subtype. If the result subtype of the function is
unconstrained, then the subtype defined by the subtype_indication shall
be a definite subtype, or there shall be an expression.
* If the result subtype of the function is defined by an
access_definition, the return_subtype_indication shall be an
access_definition. The subtype defined by the access_definition shall
statically match the result subtype of the function. The accessibility
level of this anonymous access subtype is that of the result subtype.
For any return statement that applies to a function body:
* If the result subtype of the function is limited, then the expression of
the return statement (if any) shall be an aggregate, a function call (or
equivalent use of an operator), or a qualified_expression or
parenthesized expression whose operand is one of these.
AARM Note:
In other words, if limited, the expression must produce a
"new" object, rather than being the name of a preexisting object
(which would imply copying).
Static Semantics
Within an extended_return_statement, the return object is declared with
the given identifier, with the nominal subtype defined by the
return_subtype_indication.
Dynamic Semantics
For the execution of an extended_return_statement, the subtype_indication
or access_definition is elaborated. This creates the nominal subtype of the
return object. If there is an expression, it is evaluated and
converted to the nominal subtype (which might raise Constraint_Error --
see 4.6), and the converted value is assigned to the return object.
Otherwise, the return object is initialized by default as for a stand-alone
object of its nominal subtype (see 3.3.1). If the nominal subtype is
indefinite, the return object is constrained by its initial value.
For the execution of a simple_return_statement, the expression (if any) is
first evaluated, converted to the result subtype, and then assigned to
the anonymous return object.
If the result type of a function is a specific tagged type, the
tag of the return object is that of the result type. If the result type
is class-wide, the tag of the return object is that of the value of
the expression.
AARM Ramification:
The first sentence is true even if the tag of the expression is
different, which could happen if the expression were a view conversion or
a dereference of an access value. Note that for a limited type, because
of the restriction to aggregates and function calls (and no conversions),
the tag will already match.
AARM Reason:
The first rule ensures that a function whose result type is a specific
tagged type always returns an object whose tag is that of the result
type. This is important for dispatching on controlling result, and allows
the caller to allocate the appropriate amount of space to hold the value
being returned (assuming there are no discriminants).
For the execution of an extended_return_statement, the
handled_sequence_of_statements is executed. Within this
handled_sequence_of_statements, the execution of a simple_return_statement
that applies to the extended_return_statement causes a transfer of control
that completes the extended_return_statement. Upon completion of a return
statement that applies to a callable construct, a transfer of control is
performed which completes the execution of the callable construct, and
returns to the caller.
In the case of a function, the function_call denotes a constant view
of the return object.
Examples
Examples of return statements:
return; --
--
return Key_Value(Last_Index); --
return Node : Cell do --
Node.Value := Result;
Node.Succ := Next_Node;
end return;
Delete all but the first sentence of 7.3(19).
The new rule added before 7.5(2) by AI-287 should say:
...unless it is an aggregate, a function_call, or a parenthesized...
A new bullet should be added to the list here:
* the expression of a return statement (see 6.5)
The following should be added to the new rule added after 7.5(8) by AI-287:
For a function_call of a type with a part that is of a task, protected, or
explicitly limited record type that is used to initialize an object as allowed
above, the implementation shall not create a separate return object (see 6.5)
for the function_call. The function_call shall be constructed directly in the
new object.
Similarly, the replacement note of 7.5(9)
of AI-287 should say "aggregate or function_call" in each occurrence.
Replace 7.5(23) by:
The fact that the full view of File_Name is explicitly declared limited
means that parameter passing will always be by reference and function
results will always be built directly in the result object (see 6.2 and
6.5).
Replace 7.6(17.1) with:
For an aggregate of a controlled type whose value is assigned, other than by an
assignment_statement, the
implementation shall not create a separate anonymous object for the aggregate.
The aggregate value shall be constructed directly in the target of the
assignment operation and Adjust is not called on the target object.
Replace part of 7.6.1(2) and 7.6.1(18) as follows:
"...[exit_, return_, goto_]{exit_statement, return statement,
goto_statement},..."
[Editor's note: removing the "cute" wording here improves searchability of
the Standard; we should avoid using abbreviations of technical terms.]
Add after 8.1(4):
* an extended_return_statement;
Replace "return_statement" with "return statement" in 9.5.2(29).
!example
Here is an example of a function with a limited result type
using an extended return statement:
function Make_Obj(Param : Natural) return Lim_Type is
begin
return Result : Lim_Type do --
--
Further_Processing(Result, Param);
end return;
end Make_Obj;
Here is a similar function that returns an access-to-limited type:
function Make_Obj(Param : Natural) return access Lim_Type is
begin
return Result : access Lim_Type do --
Result := new Lim_Type; --
--
Further_Processing(Result.all, Param);
end return;
end Make_Obj;
Here is an abstraction which uses functions with access result types, to
support an extensible array abstraction (aka vector):
generic
type Element is private;
type Index is (<>);
package Extensible_Arrays is
pragma Assert(Index'First > Index'Base'First);
--
type Ext_Array is private;
--
procedure Set_Elem(EA : in out Ext_Array; I : Index; Elem : Element);
--
--
function Last(EA : Ext_Array) return Index'Base;
--
function Elem(EA : Ext_Array; I : Index) return access Element;
--
--
--
procedure Set_Empty(EA : in out Ext_Array);
--
--
private
type Elem_Array is array(Index range <>) of aliased Element;
--
type Elem_Array_Ptr is access Elem_Array;
--
type Ext_Array is record
Last : Index'Base := Index'First - 1;
Data : Elem_Array_Ptr;
--
--
end record;
end Extensible_Arrays;
procedure Ext_Array_Test(Max : Positive) is
package Ext_Int_Arrays is
new Extensible_Arrays(Element => Integer, Index => Positive);
type Ext_Int_Array is new Ext_Int_Arrays.Ext_Array;
X : Ext_Int_Array; --
begin
--
for I in 1..Max loop
Set_Elem(X, I, Elem => I**2);
end loop;
--
for I in 1..Max/2 loop
Elem(X, I).all := Elem(X, I).all + 1;
end loop;
--
for I in 1..Last(X) loop
Ada.Text_IO.Put_Line(Integer'Image(I) & " => " &
Integer'Image(Elem(X, I).all));
end loop;
Set_Empty(X); --
end Ext_Array_Test;
!discussion
In meetings with Ada users, there has been a general sense
that if limited aggregates are provided in Ada 200Y, it would be desirable
to also provide limited function returns which could act
as "constructor" functions.
Just allowing a function whose whole body is a return statement
returning an aggregate (or another function call) does not give the
programmer much flexibility. What they would like is to be able
to create the object being returned and then initialize it further somehow,
perhaps by calling a procedure, doing a loop (as in the examples above),
etc. This requires a named object. However, to avoid copying,
we need this object to be created in its final "resting place",
i.e. in the target of the function call. This might be in the
"middle" of some enclosing composite object the caller is initializing,
or it might be in the heap, or it might be a stand-alone local
object.
Because the implementation needs to create the result object in a place
determined by the caller, it is important that the declaration of the
object be distinguished in some way. By declaring it as part of an
extended return statement, we have a way for the programmer to indicate
that this is the object to be returned. Clearly we don't want to allow
extended return statements to be nested.
Because it may be necessary to do some computing before deciding
exactly how the result object should be declared, we permit
the extended return statement to occur wherever a normal return
statement is permitted. So different branches of an if or case statement
could have their own extended return statements, each with its own named
result object.
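A small sketch of this point, where each branch of an if statement declares its
own result object with a different subtype. Pad is a hypothetical function
invented for the illustration.

```ada
procedure Branch_Return_Demo is

   --  Pad is a hypothetical example: it returns S unchanged when it is
   --  already wide enough, and otherwise a blank-padded copy of S.
   function Pad (S : String; Width : Natural) return String is
   begin
      if S'Length >= Width then
         --  No "do" part: the result object is just initialized from S
         --  and takes its bounds from that initial value.
         return Result : String := S;
      else
         --  A different result object, with bounds computed in this branch.
         return Result : String (1 .. Width) := (others => ' ') do
            Result (1 .. S'Length) := S;
         end return;
      end if;
   end Pad;

begin
   pragma Assert (Pad ("ab", 5) = "ab   ");
   pragma Assert (Pad ("abcdef", 3) = "abcdef");
   null;
end Branch_Return_Demo;
```

Because the result subtype is unconstrained, each return_subtype_indication
either is definite or comes with an initializing expression.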
Note that we have allowed the user to declare the result object
as "aliased." This seems like a natural thing which might be
wanted, so you could initialize a circularly-linked list header
to point at itself, etc.
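A sketch of the circularly-linked header case follows. Node, Node_Ptr, and
Empty_List are hypothetical names. The sketch assumes the result is built in
place in the caller's object, which is what this proposal intends for an
explicitly limited record type, and it uses 'Unchecked_Access because the
return object's accessibility level is that of the function execution.

```ada
procedure Self_Reference_Demo is

   --  Hypothetical list-header type used only for this illustration.
   type Node;
   type Node_Ptr is access all Node;

   type Node is limited record
      Next, Prev : Node_Ptr;
   end record;

   function Empty_List return Node is
   begin
      return Header : aliased Node do
         --  Assumption: Header is built directly in the caller's object,
         --  so these pointers remain valid after the return.
         Header.Next := Header'Unchecked_Access;
         Header.Prev := Header'Unchecked_Access;
      end return;
   end Empty_List;

   L    : aliased Node := Empty_List;
   Self : constant Node_Ptr := L'Unchecked_Access;

begin
   --  With build-in-place, the header designates itself.
   pragma Assert (L.Next = Self and then L.Prev = Self);
   null;
end Self_Reference_Demo;
```

If an implementation copied the result instead, L.Next would designate a
temporary that no longer exists, which is why copying cannot be treated as
semantically neutral for such types.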
Note that we had discussed various mechanisms where information
from the calling context would be available inside the function
at the language level. In particular, it would be possible to refer
to the values of the discriminants or bounds of the object being
initialized, presuming it was constrained, within the subtype
indication and initializing expression, if any.
Ultimately this capability was not included in this proposal, as it
created a series of somewhat complicated restrictions on usage and made the
implementation that much more difficult. Note that the implementation
may still need to pass in information from the calling context, depending
on the run-time model, because if the type is "really" limited (e.g.
it is limited tagged, or contains a task or a protected object), then
the new object must be built in its final resting place. In many run-time
models, that means the storage needs to be allocated at the call-site if the
object being initialized is a component of some larger object.
However, by not allowing the programmer to refer to this contextual
information at the language level, we give the implementation more
flexibility in how it solves the build-in-place requirement for
"really" limited objects. See the discussion below about implementation
approaches.
The syntax for extended return statements was initially proposed early on,
but when this AI was first written up, we proposed instead a revised
object declaration syntax where the word "return" was used almost like the word
"constant," as a qualifier. This was somewhat more economical in terms of
syntax and indenting, but was not felt to be as clear semantically as this
current syntax.
We have eliminated the capability for returning by reference, in favor
of returning a value of an anonymous access type. A rejected alternative
proposal (AI-318-1) would have made return-by-reference a separate capability,
triggered by the presence of the reserved word "ALIASED" in the function
profile. This was felt by some reviewers to be enshrining the confusing notion
of return-by-reference, which earlier had been buried in a discussion
of certain limited types. Furthermore, the implementation model of
return by reference was clearly to return a "reference" (effectively an access
value) to the result object. Making this explicit presumably makes
the feature easier to understand, and we can also piggyback on the
usual accessibility checks, rather than have to invent special ones
associated with a return by reference.
The capability to return an anonymous access type goes well with
the other changes allowing anonymous access types in more contexts.
We have kept the implementation simple by making the accessibility
level of the result type the same as that of the associated function
(or access-to-subprogram type).
POSSIBLE IMPLEMENTATION APPROACHES
The implementation of the extended return statement for nonlimited
types should minimize the number of copies, but may still require a copy
in some implementation models and in some calling contexts.
The implementation of the extended return statement for limited result
types is straightforward if the result subtype is constrained. It is
essentially equivalent to a procedure with an OUT parameter -- the
caller allocates space for the target object, perhaps does some of the
"implicit" initialization for tags, discriminants, tasks, or protected
components, etc., and passes its address to the called routine, which
uses it for the "return" object. Nonlimited controlled components can
still require some fancy footwork, since they can be explicitly
initialized, so default initializing them would be inappropriate. But
compilers already have to deal with returning nonlimited controlled
objects, so presumably this won't create an insurmountable burden.
If the result subtype is unconstrained, then there are two basic possibilities:
1) The target object's (nominal) subtype is definite, and either constrained
or the size of the object is independent of the constraints (e.g.
allocate-the-max is used for the object); the target object might be a
component of a larger object.
2) The target object's nominal subtype is unconstrained, and its size
is to be determined by the result returned from the function;
the target object must be a stand-alone object, or an "entire"
heap object.
In the first case, the caller determines the size of the target object and
can allocate space for it; in the second, the caller cannot
preallocate space for the target object, and must rely on the called
routine allocating space for it in an "appropriate" place.
The code for the called routine must handle both of these cases.
One reasonable way to do so is for the caller to provide a
"storage pool" for the result. In the first case, this storage
"pool" has space for exactly one object of a given maximum size.
Its Allocate routine is trivial. It just checks to see if the
size is no greater than the space available, and then returns the
preallocated (target) address.
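The one-object "pool" described above could be modeled at the source level
roughly as follows. This is only a sketch of the idea using the standard
System.Storage_Pools interface; a compiler would do the equivalent internally.
Single_Slot_Pool, Slots, and the 64-element Slot buffer are assumptions made
for the illustration, and the sketch does not check the requested alignment.

```ada
with System;
with System.Storage_Elements; use System.Storage_Elements;
with System.Storage_Pools;    use System.Storage_Pools;

procedure Single_Slot_Pool_Demo is

   package Slots is
      --  Hypothetical pool with room for exactly one preallocated object.
      type Single_Slot_Pool is new Root_Storage_Pool with record
         Target   : System.Address := System.Null_Address;
         Capacity : Storage_Count  := 0;
      end record;

      overriding procedure Allocate
        (Pool                     : in out Single_Slot_Pool;
         Storage_Address          : out System.Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      overriding procedure Deallocate
        (Pool                     : in out Single_Slot_Pool;
         Storage_Address          : System.Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      overriding function Storage_Size
        (Pool : Single_Slot_Pool) return Storage_Count;
   end Slots;

   package body Slots is
      overriding procedure Allocate
        (Pool                     : in out Single_Slot_Pool;
         Storage_Address          : out System.Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
      begin
         --  The trivial check: does the result fit in the caller's space?
         --  (Alignment is ignored in this sketch.)
         if Size_In_Storage_Elements > Pool.Capacity then
            raise Storage_Error;
         end if;
         Storage_Address := Pool.Target;
      end Allocate;

      overriding procedure Deallocate
        (Pool                     : in out Single_Slot_Pool;
         Storage_Address          : System.Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
      begin
         null;  --  The caller owns the storage; nothing to release.
      end Deallocate;

      overriding function Storage_Size
        (Pool : Single_Slot_Pool) return Storage_Count is
      begin
         return Pool.Capacity;
      end Storage_Size;
   end Slots;

   --  Stands in for the space the caller preallocated for the target.
   Slot : Storage_Array (1 .. 64);
   for Slot'Alignment use Integer'Alignment;

   Pool : Slots.Single_Slot_Pool;

   type Int_Ptr is access Integer;
   for Int_Ptr'Storage_Pool use Pool;

   P : Int_Ptr;

   use type System.Address;

begin
   Pool.Target   := Slot'Address;
   Pool.Capacity := Slot'Length;

   P := new Integer'(42);  --  Calls Allocate on the single-slot pool.
   pragma Assert (P.all'Address = Slot'Address);
   pragma Assert (P.all = 42);
end Single_Slot_Pool_Demo;
```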
In the second case, the storage pool is either the storage pool
associated with the initialized allocator at the call site,
or a storage pool that represents a secondary stack, or equivalent,
used for returning objects of unknown size from a function.
In either case, the function would return the address of the new
object.
A "bare" storage pool may not be enough in general. If the type
has any task parts, then these tasks must be placed on an activation
list determined by the calling context. They may also be linked onto a
master record of some sort, unless this is deferred until
activation occurs. Note that the tasks cannot be activated
until after returning from the call, since they may have
to be activated in conjunction with other tasks having the
same master.
If the type has any controlled or protected parts, then the object
as a whole, or the individual parts, may need to be added to
a cleanup list determined by the calling context.
If the type has any access discriminants, then some kind of
accessibility level will need to be provided, since the access
discriminant may only be initialized to point to an object
whose accessibility level is no deeper than that of the
storage pool where the new object is being allocated.
What this means is that rather than passing just a reference
to a storage pool, it is more likely the caller will pass
a reference to a structure which in turn refers to:
- a storage pool,
- an accessibility level,
- an activation list,
- the associated master,
- a cleanup list
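Expressed as an Ada declaration, such a structure might look like the sketch
below. Every name here is hypothetical, and the run-time entities (activation
list, master, cleanup list) are shown as opaque addresses because their real
representation is specific to each implementation.

```ada
with System;
with System.Storage_Pools;

package Return_Contexts is

   --  All declarations in this package are illustrative assumptions,
   --  not part of any actual run-time system.
   type Pool_Ref is
     access all System.Storage_Pools.Root_Storage_Pool'Class;

   type Accessibility_Level is new Natural;

   type Return_Context is record
      --  Where the called function allocates the result object.
      Pool            : Pool_Ref;
      --  Level against which access discriminants are checked.
      Level           : Accessibility_Level := 0;
      --  Task parts are chained here; activation happens after return.
      Activation_List : System.Address := System.Null_Address;
      --  Master on which task parts of the result depend.
      Master          : System.Address := System.Null_Address;
      --  Controlled or protected parts are registered for finalization.
      Cleanup_List    : System.Address := System.Null_Address;
   end record;

end Return_Contexts;
```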
Supporting a function result of an anonymous access type presents no
special challenges since we have defined the accessibility level of
the result type to be the same as that of the associated function
or access-to-subprogram declaration. Hence, it is as though a named
access type were declared and then used as the result type, from
a run-time model point of view. There is no need for any (new) run-time
accessibility checking.
DEALING WITH EXCEPTIONS
There was some concern about what would happen if an exception were
propagated by an extended return statement, and then the same or some
other extended return statement were reentered. There doesn't seem to be
a real problem. The return object doesn't really exist outside the
function until the function returns, so it can be restored to its
initial state on call of the function if an exception is propagated from
an extended return statement. Once restored to its initial state, there
seems no harm in starting over in another extended_return_statement.
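The scenario can be sketched as follows: an exception leaves the first extended
return statement, is handled inside the function, and a second extended return
statement then starts again from a freshly initialized return object. Item and
Build are hypothetical names used only for this illustration.

```ada
procedure Return_Retry_Demo is

   --  Hypothetical limited type with a default-initialized component.
   type Item is limited record
      Value : Natural := 0;
   end record;

   function Build (Fail_First : Boolean) return Item is
   begin
      begin
         return R : Item do
            R.Value := 1;
            if Fail_First then
               --  Abandons this extended return statement.
               raise Constraint_Error;
            end if;
         end return;
      exception
         when Constraint_Error =>
            null;  --  Fall through and try again below.
      end;

      --  Reached only after the exception above was handled. R is
      --  default initialized again, so Value starts from 0, not 1.
      return R : Item do
         R.Value := R.Value + 2;
      end return;
   end Build;

   X : Item := Build (Fail_First => True);
   Y : Item := Build (Fail_First => False);

begin
   pragma Assert (X.Value = 2);
   pragma Assert (Y.Value = 1);
   null;
end Return_Retry_Demo;
```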
THE BUILD-IN-PLACE IMPLEMENTATION REQUIREMENT
The intent of this feature is that there never is copying of a "really" limited
object. We have added an Implementation Requirement to ensure that this is
really the case.
It has been argued that this requirement is not needed because any such copies
are semantically neutral. But no copies of a self-referencing object could
ever really be semantically neutral. Moreover, the definition of object
creation in 3.3(19) says that the subcomponents are assigned from the
expression already evaluated. This clearly must be superseded.
In addition, we want to tell the reader (Ada user and implementers alike) that
function calls have changed. In Ada up to this point, function calls were
always about copying (at least logically; 7.6(21) allows omitting the copy).
That is emphatically not the case in the Amendment; indeed in a similar case
for aggregates, we included such a requirement in the Corrigendum.
!corrigendum 3.8(14)
Insert before the paragraph:
The component_definition of a component_declaration defines the
(nominal) subtype of the component. If the reserved word aliased appears
in the component_definition, then the component is aliased (see 3.10).
the new paragraph:
If a record_type_declaration includes the reserved word limited, the
type is called an explicitly limited record type.
!corrigendum 3.9(24)
Replace the paragraph:
- The tag of the result returned by a function with a class-wide result
type is that of the return expression.
by:
- The tag of the result returned by a function with a class-wide result
type is that of the return object.
!corrigendum 3.10.2(10)
Replace the paragraph:
For a function whose result type is a return-by-reference type, the
accessibility level of the result object is the same as that of the master that
elaborated the function body. For any other function, the accessibility level
of the result object is that of the execution of the called function.
by:
For any function, the accessibility level of the result object is that of the
execution of the called function.
!corrigendum 3.10.2(13)
Insert after the paragraph:
- The accessibility level of the anonymous access type of an access
parameter is the same as that of the view designated by the actual. If the
actual is an allocator, this is the accessibility level of the execution
of the called subprogram.
the new paragraph:
- The accessibility level of the anonymous access type of an access
result type (see 6.5) is the same as that of the associated function or
access-to-subprogram type.
!corrigendum 4.3.3(11)
Replace the paragraph:
For an explicit_actual_parameter, an
explicit_generic_actual_parameter, the
expression of a return_statement, the initialization expression in an
object_declaration, or a default_expression (for a parameter or a
component), when the nominal subtype of the corresponding formal parameter,
generic formal parameter, function result, object, or component is a
constrained array subtype, the applicable index constraint is the constraint of
the subtype;
by:
For an explicit_actual_parameter, an
explicit_generic_actual_parameter, the
expression of a return statement, the initialization expression in an
object_declaration, or a default_expression (for a parameter or a
component), when the nominal subtype of the corresponding formal parameter,
generic formal parameter, function return object, object, or component is a
constrained array subtype, the applicable index constraint is the constraint of
the subtype;
!corrigendum 5(2)
Replace the paragraph:
This section describes the general rules applicable to all statements. Some
statements are discussed in later sections: Procedure_call_statements and
return_statements are described in 6, "Subprograms".
Entry_call_statements, requeue_statements, delay_statements,
accept_statements, select_statements, and abort_statements are
described in 9, "Tasks and Synchronization". Raise_statements are
described in 11, "Exceptions", and code_statements in 13. The remaining
forms of statements are presented in this section.
by:
This section describes the general rules applicable to all statements. Some
statements are discussed in later sections: Procedure_call_statements and
return statements are described in 6, "Subprograms".
Entry_call_statements, requeue_statements, delay_statements,
accept_statements, select_statements, and abort_statements are
described in 9, "Tasks and Synchronization". Raise_statements are
described in 11, "Exceptions", and code_statements in 13. The remaining
forms of statements are presented in this section.
!corrigendum 5.1(4)
Replace the paragraph:
simple_statement ::= null_statement
| assignment_statement | exit_statement
| goto_statement | procedure_call_statement
| return_statement | entry_call_statement
| requeue_statement | delay_statement
| abort_statement | raise_statement
| code_statement
by:
simple_statement ::= null_statement
| assignment_statement | exit_statement
| goto_statement | procedure_call_statement
| simple_return_statement | entry_call_statement
| requeue_statement | delay_statement
| abort_statement | raise_statement
| code_statement
!corrigendum 5.1(5)
Replace the paragraph:
compound_statement ::=
if_statement | case_statement
| loop_statement | block_statement
| accept_statement | select_statement
by:
compound_statement ::=
if_statement | case_statement
| loop_statement | block_statement
| extended_return_statement
| accept_statement | select_statement
!corrigendum 5.1(14)
Replace the paragraph:
A transfer of control is the run-time action of an exit_statement,
return_statement, goto_statement, or requeue_statement,
selection of a terminate_alternative, raising of an exception, or an
abort, which causes the next action performed to be one other than what would
normally be expected from the other rules of the language. As explained in
7.6.1, a transfer of control can cause the execution of constructs to be
completed and then left, which may trigger finalization.
by:
A transfer of control is the run-time action of an exit_statement,
return statement, goto_statement, or requeue_statement,
selection of a terminate_alternative, raising of an exception, or an
abort, which causes the next action performed to be one other than what would
normally be expected from the other rules of the language. As explained in
7.6.1, a transfer of control can cause the execution of constructs to be
completed and then left, which may trigger finalization.
!corrigendum 6.1(13)
Replace the paragraph:
parameter_and_result_profile ::= [formal_part] return subtype_mark
by:
parameter_and_result_profile ::=
[formal_part] return subtype_mark
| [formal_part] return access_definition
!corrigendum 6.1(23)
Replace the paragraph:
The nominal subtype of a formal parameter is the subtype denoted by the
subtype_mark, or defined by the access_definition, in the
parameter_specification.
by:
The nominal subtype of a formal parameter is the subtype denoted
by the subtype_mark, or
defined by the access_definition, in the parameter_specification.
The nominal subtype of a function result is the subtype
denoted by the subtype_mark, or
defined by the access_definition, in the parameter_and_result_profile.
!corrigendum 6.1(24)
Replace the paragraph:
An access parameter is a formal in parameter specified by an
access_definition.
An access parameter is of an anonymous general access-to-variable type (see
3.10). Access parameters allow dispatching calls to be controlled by access
values.
by:
An access parameter is a formal in parameter specified by an
access_definition.
An access result type is a function result type specified by an
access_definition.
An access parameter or result type is of an anonymous general
access-to-variable type (see 3.10). Access parameters allow dispatching
calls to be controlled by access values.
!corrigendum 6.1(28)
Replace the paragraph:
- For any result, the result subtype.
by:
- For any non-access result, the nominal subtype of the function result.
- For any access result type of an access-to-object type, the
designated subtype of the result type.
- For any access result type of an access-to-subprogram type, the
subtypes of the profile of the result type.
!corrigendum 6.3.1(16)
Replace the paragraph:
Two profiles are mode conformant if they are type-conformant, and
corresponding parameters have identical modes, and, for access parameters, the
designated subtypes statically match.
by:
Two profiles are mode conformant if they are type-conformant, and
corresponding parameters have identical modes, and, for access parameters or
access result types, the designated subtypes statically match.
!corrigendum 6.4(11)
Replace the paragraph:
The exception Program_Error is raised at the point of a function_call if
the function completes normally without executing a return_statement.
by:
The exception Program_Error is raised at the point of a function_call if
the function completes normally without executing a return statement.
!corrigendum 6.5(1)
Replace the paragraph:
A return_statement is used to complete the execution of the
innermost enclosing subprogram_body, entry_body, or
accept_statement.
by:
A simple_return_statement or extended_return_statement (collectively
called a return statement) is used to complete the execution of the
innermost enclosing subprogram_body, entry_body, or
accept_statement.
!corrigendum 6.5(2)
Replace the paragraph:
return_statement ::= return [expression];
by:
simple_return_statement ::= return [expression];
extended_return_statement ::=
return identifier : [aliased] return_subtype_indication [:= expression] [do
handled_sequence_of_statements
end return];
return_subtype_indication ::= subtype_indication | access_definition
!corrigendum 6.5(03)
Replace the paragraph:
The expression, if any, of a return_statement is called the return
expression. The result subtype of a function is the subtype denoted by the
subtype_mark after the reserved word return in the profile of the
function. The expected type for a return expression is the result type of the
corresponding function.
by:
The result subtype of a function is the subtype denoted by the
subtype_mark, or defined by the access_definition, after the reserved
word return in the profile of the function. The expected type for the
expression, if any, of a simple_return_statement is the result
type of the corresponding function.
The expected type for the expression of an extended_return_statement
is that of the return_subtype_indication.
!corrigendum 6.5(04)
Replace the paragraph:
A return_statement shall be within a callable construct, and it applies
to the innermost one. A return_statement shall not be within a body that
is within the construct to which the return_statement applies.
by:
A return statement shall be within a callable construct, and it applies
to the innermost callable construct or extended_return_statement that
contains it. A return statement shall not be within a body that is within
the construct to which the return statement applies.
!corrigendum 6.5(05)
Replace the paragraph:
A function body shall contain at least one return_statement that applies
to the function body, unless the function contains code_statements. A
return_statement shall include a return expression if and only if
it applies to a function body.
by:
A function body shall contain at least one return statement that applies
to the function body, unless the function contains code_statements. A
simple_return_statement shall include an expression if and only if
it applies to a function body. An extended_return_statement shall apply to
a function body.
For an extended_return_statement that applies to a function body:
- If the result subtype of the function is defined by a
subtype_mark, the return_subtype_indication shall be a
subtype_indication. The type of the subtype_indication shall be the
result type of the function. If the result subtype of the function is
constrained, then the subtype defined by the subtype_indication shall also
be constrained and shall statically match this result subtype. If the result
subtype of the function is unconstrained, then the subtype defined by the
subtype_indication shall be a definite subtype, or there shall be an
expression.
- If the result subtype of the function is defined by an
access_definition, the return_subtype_indication shall be an
access_definition. The subtype defined by the access_definition shall
statically match the result subtype of the function. The accessibility level of
this anonymous access subtype is that of the result subtype.
For any return statement that applies to a function body:
- If the result subtype of the function is limited, then the
expression of the return statement (if any) shall be an aggregate, a
function call (or equivalent use of an operator), or a
qualified_expression or parenthesized expression whose operand is one of
these.
Static Semantics
Within an extended_return_statement, the return object is declared
with the given identifier, with the nominal subtype defined by the
return_subtype_indication.
!corrigendum 6.5(06)
Replace the paragraph:
For the execution of a return_statement, the expression (if any) is
first evaluated and converted to the result subtype.
by:
For the execution of an extended_return_statement, the
subtype_indication or access_definition is elaborated. This creates
the nominal subtype of the return object. If there is an expression,
it is evaluated and converted
to the nominal subtype (which might raise Constraint_Error -- see 4.6) and then
the converted value becomes the initial value of the return object.
Otherwise, the return object is
initialized by default as for a stand-alone object of its nominal subtype
(see 3.3.1). If the nominal subtype is indefinite, the return object is
constrained by its initial value.
For the execution of a simple_return_statement, the expression
(if any) is first evaluated, converted to the result subtype, and then
assigned to the anonymous return object.
!corrigendum 6.5(07)
Delete the paragraph:
If the result type is class-wide, then the tag of the result is the tag of the
value of the expression.
!corrigendum 6.5(08)
Replace the paragraph:
If the result type is a specific tagged type:
by:
If the result type of a function is a specific tagged type, the
tag of the return object is that of the result type. If the result type
is class-wide, the tag of the return object is that of the value of
the expression. A check is made that the accessibility
level of the type identified by the tag of the result is not deeper than
that of the master that elaborated the function body. If this check fails,
Program_Error is raised.
!corrigendum 6.5(09)
Delete the paragraph:
- If it is limited, then a check is made that the tag of the value of
the return expression identifies the result type. Constraint_Error is raised
if this check fails.
!corrigendum 6.5(10)
Delete the paragraph:
- If it is nonlimited, then the tag of the result is that of the result
type.
!corrigendum 6.5(11)
Delete the paragraph:
A type is a return-by-reference type if it is a descendant of one of the
following:
!corrigendum 6.5(12)
Delete the paragraph:
- a tagged limited type;
!corrigendum 6.5(13)
Delete the paragraph:
- a task or protected type;
!corrigendum 6.5(14)
Delete the paragraph:
- a nonprivate type with the reserved word limited in its
declaration;
!corrigendum 6.5(15)
Delete the paragraph:
- a composite type with a subcomponent of a return-by-reference type;
!corrigendum 6.5(16)
Delete the paragraph:
- a private type whose full type is a return-by-reference type.
!corrigendum 6.5(17)
Delete the paragraph:
If the result type is a return-by-reference type, then a check is made that the
return expression is one of the following:
!corrigendum 6.5(18)
Delete the paragraph:
- a name that denotes an object view whose accessibility level is
not deeper than that of the master that elaborated the function body; or
!corrigendum 6.5(19)
Delete the paragraph:
- a parenthesized expression or qualified_expression whose operand
is one of these kinds of expressions.
!corrigendum 6.5(20)
Delete the paragraph:
The exception Program_Error is raised if this check fails.
!corrigendum 6.5(21)
Delete the paragraph:
For a function with a return-by-reference result type the result is returned by
reference; that is, the function call denotes a constant view of the object
associated with the value of the return expression. For any other function, the
result is returned by copy; that is, the converted value is assigned into an
anonymous constant created at the point of the return_statement, and the
function call denotes that object.
!corrigendum 6.5(22)
Replace the paragraph:
Finally, a transfer of control is performed which completes the
execution of the callable construct to which the return_statement applies,
and returns to the caller.
by:
For the execution of an extended_return_statement, the
handled_sequence_of_statements is executed. Within this
handled_sequence_of_statements, the execution of a
simple_return_statement that applies to the extended_return_statement
causes a transfer of control that completes the extended_return_statement.
Upon completion of a return statement that applies to a callable construct, a
transfer of control is performed which completes the execution of the callable
construct, and returns to the caller.
In the case of a function, the function_call denotes a constant view of
the return object.
!corrigendum 6.5(24)
Replace the paragraph:
return; -- in a procedure body, entry_body, or accept_statement
return Key_Value(Last_Index); -- in a function body
by:
return; -- in a procedure body, entry_body,
-- accept_statement, or extended_return_statement
return Key_Value(Last_Index); -- in a function body
return Node : Cell do -- in a function body, see 3.10.1 for Cell
Node.Value := Result;
Node.Succ := Next_Node;
end return;
!corrigendum 7.3(19)
Replace the paragraph:
Declaring a private type with an unknown_discriminant_part is a
way of preventing clients from creating uninitialized objects of the
type; they are then forced to initialize each object by calling some
operation declared in the visible part of the package.
If such a type is also limited, then no objects of the type can
be declared outside the scope of the full_type_declaration, restricting
all object creation to the package defining the type. This allows
complete control over all storage allocation for the type.
Objects of such a type can still be passed as parameters, however.
by:
Declaring a private type with an unknown_discriminant_part is a
way of preventing clients from creating uninitialized objects of the
type; they are then forced to initialize each object by calling some
operation declared in the visible part of the package.
!corrigendum 7.5(2)
!comment This rule only talks about function_calls, because those are only
!comment appropriate here. The conflict text handles the combination of
!comment function_calls and aggregates.
@dinsb
If a tagged record type has any limited components, then the reserved word
@b<limited> shall appear in its @fa<record_type_definition>.
@dinst
In the following contexts, an @fa<expression> of a limited
type is not permitted unless it is a @fa<function_call>
or a parenthesized @fa<expression> or @fa<qualified_expression> whose operand
is permitted by this rule:
@xbullet<the initialization @fa<expression> of an @fa<object_declaration> (see 3.3.1)>
@xbullet<the @fa<default_expression> of a @fa<component_declaration> (see 3.8)>
@xbullet<the @fa<expression> of a @fa<record_component_association> (see 4.3.1)>
@xbullet<the @fa<expression> for an @fa<ancestor_part> of an @fa<extension_aggregate> (see 4.3.2)>
@xbullet<an @fa<expression> of a @fa<positional_array_aggregate> or the
@fa<expression> of an @fa<array_component_association> (see 4.3.3)>
@xbullet<the @fa<qualified_expression> of an initialized allocator (see 4.8)>
@xbullet<the @fa<expression> of a return statement (see 6.5)>
@xbullet<the @fa<default_expression> or actual parameter for a formal object
of mode @b<in> (see 12.4)>
!corrigendum 7.5(8)
Insert after the paragraph:
There are no predefined equality operators for a limited type.
the new paragraph:
Implementation Requirements
For a function_call of a type with a part that is of a task, protected, or
explicitly limited record type that is used to initialize an object as allowed
above, the implementation shall not create a separate return object (see 6.5)
for the function_call. The function_call shall be constructed
directly in the new object.
!corrigendum 7.5(9)
Replace the paragraph:
13 The following are consequences of the rules for limited types:
by:
13 While it is allowed to write initializations of limited objects,
such initializations never copy a limited object. The source of such an
assignment operation must be a function_call, and such function_calls
must be built directly in the target object.
!corrigendum 7.5(23)
Replace the paragraph:
The fact that the full view of File_Name is explicitly declared
limited means that parameter passing and function return will always be
by reference (see 6.2 and 6.5).
by:
The fact that the full view of File_Name is explicitly declared
limited means that parameter passing will always be by reference and function
results will always be built directly in the result object (see 6.2 and
6.5).
!corrigendum 7.6(17.1)
Replace the paragraph:
For an aggregate of a controlled type whose value is assigned,
other than by an assignment_statement or a
return_statement, the implementation shall not create a separate
anonymous object for the aggregate. The aggregate value shall be
constructed directly in the target of the assignment operation and Adjust is
not called on the target object.
by:
For an aggregate of a controlled type whose value is assigned,
other than by an assignment_statement, the implementation shall not
create a separate
anonymous object for the aggregate. The aggregate value shall be
constructed directly in the target of the assignment operation and Adjust is
not called on the target object.
!corrigendum 7.6.1(2)
Replace the paragraph:
The execution of a construct or entity is complete when the end of that
execution has been reached, or when a transfer of control (see 5.1) causes it
to be abandoned. Completion due to reaching the end of
execution, or due to the transfer of control of an exit_, return_,
goto_, or requeue_statement or of the selection of a
terminate_alternative is normal completion. Completion is
abnormal otherwise — when control is transferred out
of a construct due to abort or the raising of an exception.
by:
The execution of a construct or entity is complete when the end of that
execution has been reached, or when a transfer of control (see 5.1) causes it
to be abandoned. Completion due to reaching the end of
execution, or due to the transfer of control of an exit_statement,
return statement, goto_statement, or requeue_statement or of the
selection of a terminate_alternative is normal completion. Completion
is abnormal otherwise — when control is transferred out of a construct due
to abort or the raising of an exception.
!corrigendum 7.6.1(18)
Replace the paragraph:
For a Finalize invoked by the transfer of control of an exit_,
return_, goto_, or requeue_statement, Program_Error is raised no
earlier than after the finalization of the master being finalized when the
exception occurred, and no later than the point where normal execution would
have continued. Any other finalizations due to be performed up to that point
are performed before raising Program_Error.
by:
For a Finalize invoked by the transfer of control of an exit_statement,
return statement, goto_statement, or requeue_statement, Program_Error
is raised no earlier than after the finalization of the master being finalized
when the exception occurred, and no later than the point where normal execution
would have continued. Any other finalizations due to be performed up to that
point are performed before raising Program_Error.
!corrigendum 8.1(4)
Insert after the paragraph:
- a loop_statement;
the new paragraph:
- an extended_return_statement;
!corrigendum 9.5.2(29)
Replace the paragraph:
24 A return_statement (see 6.5) or a requeue_statement
(see 9.5.4) may be used to complete the execution of an accept_statement
or an entry_body.
by:
24 A return statement (see 6.5) or a requeue_statement
(see 9.5.4) may be used to complete the execution of an accept_statement
or an entry_body.
!corrigendum 13.8(10)
Replace the paragraph:
16 Machine code functions are exempt from the rule that a
return_statement is required. In fact, return_statements are
forbidden, since only code_statements are allowed.
by:
16 Machine code functions are exempt from the rule that a
return statement is required. In fact, return statements are
forbidden, since only code_statements are allowed.
!ACATS test
ACATS test(s) need to be created for these features.
!appendix
From: Tucker Taft
Sent: Thursday, April 1, 2004, 6:13 AM
I have been asked to prepare an alternative to AI-318
which drops the notion of "aliased" return-by-reference
functions, and replaces it with a simplified version
of anonymous access type return. One thing that is
being lost in this process is that return-by-reference
eliminates the need for ".all" at the call site.
However, it struck me that we already allow implicit
dereference in a number of contexts, and since
anonymous access types as return types is a new
feature, it would be feasible to allow implicit
dereference of calls of such functions in *any* context.
Allowing implicit dereference has some advantages:
1) It provides better compatibility with the existing
(albeit limited) return-by-reference capability,
because call sites would not have to change, only
the function would change to return X'access rather
than X (or Y rather than Y.all). Implicit dereference
would eliminate the need for a .all at the call sites.
2) C++ has a return-by-reference capability ("&" return type)
which allows a natural way to use a call on a function as
the left hand side of an assignment, allowing the implementation
of "abstract" arrays, e.g.:
Arr(X) := Arr(X) + 1;
where "Arr" is actually a function that implements an array-like
data structure.
We could get much of this same capability by allowing
functions declared to return an anonymous access type to
be implicitly dereferenced in any context. Furthermore,
since Ada uses "()" for both array indexing and function
calling, this would actually get some value out of that
syntactic unification (or as Robert might call it,
"confusion" ;-).
This is actually better than the "aliased" return-by-ref
capability, since in that case the returned object was
necessarily considered a constant. Of course if the
writer of the function wanted the result to be access-to-constant,
they could declare it that way.
3) Similar to above, but relevant to me because Bob Duff and I
have recently been sparring over an issue that would be nicely
solved by implicit dereference: As in many text- and language-
processing tools, we convert all strings into unique IDs as soon
as we read the source file. We call these unique IDs "spellings,"
LISP used to call them "symbols," and I have seen them called
String-IDs and a number of other similar things. They
significantly simplify further processing because string equality
involves a simple ID equality comparison, and these IDs can
be efficiently passed and returned from subprograms without
any of the issues associated with passing and returning
unconstrained arrays.
*However*, when it comes to passing these IDs to
subprograms that expect Strings, we have to convert
the ID back to a String. The simplest way to do this is to
write a function, say To_String, which takes an ID and returns
a String. Unfortunately, that immediately gets you back into
the inefficiencies of returning unconstrained arrays. An
alternative is to expose the representation of the IDs, and
allow the caller to explicitly use ".all" or a component selection
to retrieve the String at the call site, but that clearly makes
the "abstraction" a bit less abstract.
By allowing implicit dereference of functions returning
anonymous access types, we could have the best of both worlds.
The To_String function could actually return "access constant
String" instead of String, but it could still be used in
any context that required a String without the overhead of
returning unconstrained arrays. This would preserve both
abstraction and performance.
So, barring major objection, I am going to propose that calls
on functions returning anonymous access types will permit
implicit dereference in any context (instead of only in front
of ".", "(", and "'").
Comments welcomed...
****************************************************************
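[Editor's note: The sketch below is not part of the original message. It
illustrates the "abstract array" idea from point 2 above, assuming a function
that returns an anonymous access type as proposed in this AI. The names
Abstract_Array_Demo, Index, Store, and Arr are hypothetical. The dereference
is written explicitly as ".all", so the sketch does not rely on the implicit
dereference that the message proposes.]

```ada
with Ada.Text_IO;

procedure Abstract_Array_Demo is

   type Index is range 1 .. 4;

   --  Hypothetical backing store.  A real abstraction might use a hash
   --  table or a sparse structure instead of a plain array.
   Store : array (Index) of aliased Integer := (others => 0);

   --  Returns a reference to the selected element rather than a copy.
   function Arr (X : Index) return access Integer is
   begin
      return Store (X)'Access;
   end Arr;

begin
   --  With an explicit dereference, the call can appear on the
   --  left-hand side of an assignment.
   Arr (2).all := Arr (2).all + 1;

   Ada.Text_IO.Put_Line (Integer'Image (Store (2)));
end Abstract_Array_Demo;
```

[Under the implicit dereference proposed in the message, the assignment could
presumably be written as "Arr (2) := Arr (2) + 1;" instead.]
****************************************************************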
From: Pascal Leroy
Sent: Monday, April 5, 2004, 10:47 AM
Tuck wrote:
> Here is an alternative proposal, which drops
> "aliased return blah" (return-by-reference) in
> favor of "return access blah." It still includes
> functions returning limited types.
It took me a while to realize that this AI really has two proposals:
1 - Functions returning anonymous access types. That includes implicit
dereferencing, but as I see it the extended_return_statement is not
necessary for this part.
2 - Improvements for functions returning limited types. This is the
part that really needs the extended_return_statement.
The more I look at the AI, the more I like #1 (especially with implicit
dereferencing and the capability to have a function call on the LHS of
an assignment) and the less convinced I am about #2. Yeah, it would be
nice to improve the usability of limited types, but the baggage needed
to do that (and the somewhat arbitrary restrictions that come with it)
sounds clunky to me.
What do others think?
****************************************************************
From: Tucker Taft
Sent: Monday, April 5, 2004, 11:13 AM
Pascal Leroy wrote:
>
> Tuck wrote:
>
> > Here is an alternative proposal, which drops
> > "aliased return blah" (return-by-reference) in
> > favor of "return access blah." It still includes
> > functions returning limited types.
>
> It took me a while to realize that this AI really has two proposals:
>
> 1 - Functions returning anonymous access types. That includes implicit
> dereferencing, but as I see it the extended_return_statement is not
> necessary for this part.
>
> 2 - Improvements for functions returning limited types. This is the
> part that really needs the extended_return_statement.
I believe I was directed to keep these two proposals as part of a single AI.
> The more I look at the AI, the more I like #1 (especially with implicit
> dereferencing and the capability to have a function call on the LHS of
> an assignment) and the less convinced I am about #2. Yeah, it would be
> nice to improve the usability of limited types, but the baggage needed
> to do that (and the somewhat arbitrary restrictions that come with it)
> sounds clunky to me.
>
> What do others think?
I think this is the key thing to make limited types more useful.
With this change, making a type limited allows the implementor to
control all cases of copying, without dramatically undermining
the usability of the type, and with almost no negative performance
impact.
****************************************************************
From: Randy Brukardt
Sent: Tuesday, April 6, 2004, 3:53 PM
The only reason that I would ever vote for (1) would be if it was the only
way to get (2). If we don't want to handle the limited functions, then we
need do nothing for return-by-reference.
Visible access parameters and results in modern programs should be
discouraged; used only when there is absolutely no other choice. (If we'd
have "in out" on functions, there would never be a need for them.)
As far as the implicit dereference goes, I've been waiting for the expected
"April Fool" that goes with it. Since I've been waiting a week, I suppose it
is actually a serious proposal. I find it completely bizarre, because it
ruins the model of implicit dereference (it occurs only before '.' or '()').
Moreover, why
type Int_Access is access all Integer;
function anon return access all Integer;
function IA return Int_Access;
I := Anon; -- Legal.
I := IA; -- Illegal.
should behave differently is going to be just too goofy to explain.
OTOH, (2) will not only eliminate arbitrary restrictions from limited types,
but it also will make code more readable anytime that it takes multiple
steps to create a result. (And should allow the generation of better code as
well by building in place more often.)
****************************************************************
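[Editor's note: The sketch below is not part of the original message. It
illustrates proposal (2), assuming the extended_return_statement syntax given
in the 6.5(2) wording of this AI. The names Limited_Return_Demo, Counter, and
Make_Counter are hypothetical.]

```ada
with Ada.Text_IO;

procedure Limited_Return_Demo is

   --  Hypothetical limited type: objects cannot be copied or assigned.
   type Counter is limited record
      Value : Integer := 0;
      Step  : Integer := 1;
   end record;

   --  The extended return statement names the return object, so the
   --  result can be set up over several statements.
   function Make_Counter (Start, Step : Integer) return Counter is
   begin
      return Result : Counter do
         Result.Value := Start;
         Result.Step  := Step;
      end return;
   end Make_Counter;

   --  Initializing a limited object from a function call is allowed;
   --  the result is intended to be built directly in C, with no copy.
   C : Counter := Make_Counter (Start => 10, Step => 5);

begin
   C.Value := C.Value + C.Step;
   Ada.Text_IO.Put_Line (Integer'Image (C.Value));
end Limited_Return_Demo;
```

[An assignment_statement such as "C := Make_Counter (1, 1);" would remain
illegal, since Counter is limited; only initialization is affected.]
****************************************************************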
From: Tucker Taft
Sent: Thursday, April 15, 2004, 10:30 AM
In my recent AI-318-2 proposal for functions returning
an anonymous access type, I had specified that
implicit dereference was provided for calls on
such functions, in part to minimize the impact
of eliminating return-by-reference functions.
However, since then I noticed an additional use
of implicit deref of such calls. I mentioned this
in a response to ada-comment, but here I repeat
it for those who don't follow that mailing list.
There are often times where one wants to provide
a read-only view of a "private" global variable.
There are also times that you have a large global
table, but you want to put its initialization in
a package body so you don't suffer recompilation
headaches every time you change the table slightly
(e.g. a large parse table).
Ada doesn't really have any good solution for this.
In C/C++, you can declare a large constant (or
variable) without giving its initialization in a
spec (i.e. the ".h" file), and then in the body
(i.e. the ".c" file) give the full initialization.
With this new implicit deref proposal, it would
make it pretty easy and efficient to solve this
problem:
In the spec:
function Read_Only_View return access constant T;
pragma Inline(Read_Only_View);
In the body:
Var : aliased T := (...);
function Read_Only_View return access constant T is
begin
return Var'Access;
end Read_Only_View;
Similarly, if there were a large constant table that you just
wanted to postpone to the body:
In the spec:
function Parse_Table return access constant Parse_Table_Type;
pragma Inline(Parse_Table);
In the body:
Parse_Table_Obj : aliased constant Parse_Table_Type := (...);
function Parse_Table return access constant Parse_Table_Type is
begin
return Parse_Table_Obj'Access;
end Parse_Table;
Since the existing uses of anonymous access types are
quite limited right now (only parameters and discriminants),
we could consider providing implicit dereference for *all*
expressions of an anonymous access type, since that would
be more uniform. But I also think it is not too bad to
only provide this for calls, since there is already
implicit "deproceduring" (as Algol 68 called it) for parameterless
subprograms in Ada. Providing implicit deref for functions
returning an anonymous access type seems like a natural
progression of that.
If we instead choose to extend implicit deref to all anonymous
access values, then there is more of an upward compatibility
concern, since there could be additional ambiguities created
when an access parameter or discriminant is passed to an
overloaded subprogram with one version having a param
of type "T" and another a param of type "access T".
Given the relatively modest use of access parameters
and access discriminants at this point, this seems relatively
unlikely, and in any case it will not silently change meaning --
you'll get a compile-time error.
I could go either way. I think implicit deref of function
calls at least is pretty important. It is a very nice way
to return a reference to a large object without incurring
significant overhead, and without having to change the syntax
used at the call point (i.e., no need to insert ".all").
****************************************************************
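[Editor's note: The sketch below is not part of the original message. It fills
in the Parse_Table fragments above as one compilable unit, assuming anonymous
access result types as proposed in this AI. The names Read_Only_Demo, Tables,
and Table_Type, and the table contents, are hypothetical. The dereference is
written explicitly as ".all", so the sketch does not rely on the implicit
dereference discussed in the message.]

```ada
with Ada.Text_IO;

procedure Read_Only_Demo is

   package Tables is
      type Table_Type is array (1 .. 3) of Integer;

      --  Clients see only a read-only reference to the table.
      function Parse_Table return access constant Table_Type;
   end Tables;

   package body Tables is
      --  The initialization lives in the body, so changing it does not
      --  affect the specification.
      Parse_Table_Obj : aliased constant Table_Type := (10, 20, 30);

      function Parse_Table return access constant Table_Type is
      begin
         return Parse_Table_Obj'Access;
      end Parse_Table;
   end Tables;

begin
   --  Explicit dereference of the call result, then indexing.
   Ada.Text_IO.Put_Line (Integer'Image (Tables.Parse_Table.all (2)));
end Read_Only_Demo;
```

[In a real program Tables would more likely be a library-level package with a
separate spec and body; it is nested here only to keep the sketch in one
unit.]
****************************************************************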
From: Robert Dewar
Sent: Thursday, April 15, 2004, 10:37 AM
> Ada doesn't really have any good solution for this.
I am missing something, it seems quite reasonable to return
an appropriate constant access type. Yes you add some nice
syntactic sugar below, but nothing fundamental.
What am I missing?
****************************************************************
From: Tucker Taft
Sent: Thursday, April 15, 2004, 11:32 AM
We have generalized the availability of
anonymous access types, in part to go with
the "limited with" proposal, since "limited with"
doesn't solve the proliferation of access types
problem. The one significant place where
anonymous access types weren't permitted was
as function result types. AI-318-2 was addressing that
(as well as limited result types). Franco was
extremely keen on getting this, because he felt
that it was a clear hole when trying to explain
the new anonymous access type paradigm.
The implicit deref of calls of such functions may
seem like a small point, but it can have a significant
effect on usability in my experience. Having to add
".all" on the result of a function call is a pain
and changes the perceived nature of the abstraction.
What you really want is return by reference in some
contexts, and having to add an explicit ".all" makes
the abstraction feel less abstract. I gave examples
of an "array" abstraction with the ability to
assign to components of the array.
****************************************************************
From: Robert I. Eachus
Sent: Thursday, April 15, 2004, 4:22 PM
I don't think you are missing anything. But that doesn't mean that what
Tucker is trying to accomplish is not very useful. Right now you can
convert a function to an array in some cases and not have to change the
code that uses the abstraction. This does the same thing for anonymous
access types. (Actually, Tucker goes further, but I think that the
anonymous access cases are the high payoff.) If we can eliminate
gratuitous uses of .all and 'Access, it makes programming in Ada
easier. There is a potential problem in that overloadings will be
possible where supplying the .all (or 'Access) will resolve the
overloading but the direct call will always be ambiguous.
If you really find that a problem, it should be easy enough to say that
if a function call or argument is in parentheses, the .all or 'Access
must be explicit. So if:
    X := Foo(Y, Bar); -- is ambiguous
then:
    X := Foo(Y, (Bar));
and
    X := Foo(Y, Bar.all);
would resolve to the two different meanings. It might require some
changes to existing code, but not that much, and it would, of course, be
caught at compile time.
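[Editor's sketch, not part of the original message.] The overloading
situation described above can be pictured with a minimal example. The names
T, Item, Bar, and Foo are hypothetical and chosen only for illustration. As
written, with no implicit dereference, the first call can only select the
"access T" version of Foo; if calls of functions returning an anonymous
access type were implicitly dereferenced, the same call would also match the
"T" version, which is the ambiguity under discussion.

```ada
procedure Overload_Sketch is
   type T is record
      Value : Integer := 0;
   end record;

   Item : aliased T;   -- hypothetical object returned by reference
   X    : Integer;

   -- Function with an anonymous access result type (the feature this
   -- AI proposes).
   function Bar return access T is
   begin
      return Item'Access;
   end Bar;

   -- Version taking the object itself.
   function Foo (Y : Integer; Z : T) return Integer is
   begin
      return Y + Z.Value;
   end Foo;

   -- Version taking an access value.
   function Foo (Y : Integer; Z : access T) return Integer is
   begin
      return Y + Z.Value + 1;
   end Foo;
begin
   -- Without implicit dereference only the "access T" version applies.
   -- With implicit dereference of calls, both versions would match.
   X := Foo (1, Bar);

   -- An explicit ".all" always selects the "T" version.
   X := Foo (1, Bar.all);
end Overload_Sketch;
```

The explicit ".all" in the second call is what the proposal would make
unnecessary in the non-overloaded case.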
****************************************************************
From: Randy Brukardt
Sent: Thursday, July 29, 2004, 1:06 AM
AI-10318 includes the following rule:
Legality Rules
If the result subtype of a function is limited at the point where the
function is frozen (see 13.14), the result subtype shall be constrained.
This was intended to make the implementation of limited build-in-place
functions easier.
At the Palma meeting, Pascal pointed out that this is incompatible with
existing generic units that have generic limited private type parameters. He
asked me to add the example of ACATS foundation FDD2A00, which this rule
makes illegal -- and it doesn't raise Program_Error in use. (This example is
not quite as compelling as it could be, as the foundation in question was
created by "PHL". :-)
However, I noticed that this foundation is testing stream attributes. The
problem is caused by 'Input being a function. Indeed, that shows that the
above legality rule has additional compatibility problems and also needs
help to cover stream attributes.
Let's look at an example.
    package Ugh is
        type Lim_Tagged is limited tagged private;
        function My_Input (Stream : access Root_Stream_Type'Class)
            return Lim_Tagged;
            -- Legal only if the type doesn't have any discriminants.
        for Lim_Tagged'Input use My_Input;
        function My_Class_Input (Stream : access Root_Stream_Type'Class)
            return Lim_Tagged'Class;
            -- Never legal (T'Class is never constrained).
        for Lim_Tagged'Class'Input use My_Class_Input;
        procedure Do_Something (Object : Lim_Tagged'Class);

        task type Tsk is ...
        function My_Tsk_Input (Stream : access Root_Stream_Type'Class)
            return Tsk;
            -- Legal.
        for Tsk'Input use My_Tsk_Input;
        type Tsk_Array is array (Positive range <>) of Tsk;
            -- Tsk_Array'Input is available by the Corrigendum and AI-195.
        procedure Do_Something_Else (Object : Tsk_Array);
    private
        ...
    end Ugh;

    with Ugh;
    package Factory is
        function Constructor (...) return Ugh.Lim_Tagged'Class;
            -- A "factory" constructor.
            -- Never legal (T'Class is never constrained)
    end Factory;

    with Ugh;
    procedure Test is
        Obj : Ugh.Lim_Tagged'Class := Ugh.Lim_Tagged'Class'Input (A_Stream);
            -- This is clearly legal if we don't try to redefine the
            -- attribute. But it is returning a clearly unconstrained (new)
            -- object, and it will require build-in-place semantics.
        TObj : Ugh.Tsk_Array := Ugh.Tsk_Array'Input (A_Stream);
            -- This also is clearly legal.
    begin
        Ugh.Do_Something (Ugh.Lim_Tagged'Class'Input (A_Stream));
            -- Legal now in Ada 95+Corr.
        Ugh.Do_Something_Else (Ugh.Tsk_Array'Input (A_Stream));
            -- Legal now in Ada 95+Corr if Tsk_Array'Input overridden.
    end Test;
The first problem to note is incompatibility. With this legality rule, we
aren't allowed to declare a function to be used as a user defined 'Input for
Tsk_Array and Lim_Tagged'Class. Ada 95 certainly allows that. Tucker has
argued privately that it would be nearly impossible to write a useful
user-defined routine in this case, because of the return-by-reference
rules -- returning an existing object is never what you want.
The second problem is the language defines stream attributes for all types -
including unconstrained limited types. Moreover, we've made (limited)
function calls legal in many circumstances with build-in-place semantics.
So, these stream attributes are legal in calls like the above (and the
Do_Something calls are perfectly useful in existing Ada 95+Corr code). That
means that we still have to implement unconstrained function calls, just the
user can't write them. That's obviously silly.
We're also giving much less than meets the eye here. It's not allowed to
write a class-wide constructor (of which T'Class'Input is just a single
example). That's a significant loss. The workaround of using an anonymous
(or named) access type puts the burden of storage management on the user -
reducing a key advantage of Ada. And it means that it won't be possible to
convert non-limited tagged types which were non-limited only to get
functions and constructors to limited tagged types.
---
Anyway, the second problem has to be solved somehow. It's of no help to
implementers to limit users from writing unconstrained functions if the
system still has to do it. There are three basic solutions.
The more ugly rules solution: We could make T'Input "unavailable" if T is
limited and unconstrained. This is messy, though, because we wouldn't want
to prevent the use of T'Input for (untagged) types that aren't really
limited (and thus allow the declaration of functions); that would be a
bigger compatibility problem. Unfortunately, figuring out whether types are
"really" limited breaks privateness, and Mr. Private is sure to object to a
legality rule depending on privateness.
And, of course, this rule would completely undo all of the work which makes
streaming for limited tagged types useful, leaving such types as
second-class citizens.
An alternative is the
New ugly rule solution: The real problem is that we need to prevent making
*calls* to functions that we can't handle, not the functions themselves. So
we could drop the above rule altogether (possibly leaving an implementation
permission for an implementation to reject a function that cannot be
called), and replace it by a rule on calls:
If the result subtype of a function_call is limited at the point of the
call, the result subtype shall be constrained.
This solves the problem cleanly, and also reduces the problems with generics
(as now only internal calls would be made illegal; calls from outside the
instance would determine from the actual type if they are legal).
But again this covers too much. And trying to make it tighter again will
break privateness and also be a contract problem in generics.
The final alternative is the
Too much work solution: drop these silly restrictions intended to make the
implementation easier (and which by definition harm the user) and get to
work. Tucker explained how to implement this long ago, and although it looks
painful, it's only a short-term pain -- rather than inflicting this pain on
users forever. It also is much less incompatible, as there are no problems
with existing generics with limited private formal types.
Still, this didn't fly in the past, and I don't expect it to fly now.
---
To me, this looks like a giant tease. We tease programmers by claiming that
you can now do almost everything with limited types that you can do with
non-limited types, but in return some of your generics are now going to be
illegal (with no meaningful workaround), you can't use class-wide functions
(meaning no factories of any kind), and even class-wide streaming isn't
allowed. This, to steal one of Tucker's favorite phrases, is just moving the
bump under the rug. This is a self-inflicted bump; if we were willing to do
the additional work to move the furniture, this bump would be gone. (The
analogy comes true in the carpet here in my office; since we didn't move all
of my furniture in the last flood, the carpet repair guy left a large bump
next to my desk...)
There is also a safety issue here that we haven't really discussed. Objects
returned by return-by-reference functions have a limited lifetime: it's not
possible for the caller to hang onto them forever. That makes it possible to
do storage management and resource locking (although the language doesn't
support this well). However, anonymous access types can be converted to
other types, and via 'Unchecked_Access, held onto forever. (Sure, the
programmer is responsible that the 'Unchecked_Access value is destroyed
before the object is. But if the programmer *knows* [by peeking at the
source] that the actual objects are at library-level or in a heap, the use
is OK vis-a-vis the language -- even though it may not be part of the
"contract" of the function. And 'Unchecked_Access is so common that it isn't
going to be a red flag to anyone - I don't know if I've ever found a useful
case where I could use 'Access on an object -- I don't even try anymore.) So
we've reduced the safety of the language a bit.
Anyway, to draw an actual conclusion. :-)
I don't believe that an AI that purports to give limited types equal footing
with non-limited ones can deny a major benefit of OOP programming to limited
types. Restrictions on class-wide programming are simply unacceptable in my
view -- if limited tagged types are going to remain useless, we might as
well not bother with lots of work and incompatibility.
So I would vote for accepting the short-term pain and making this useful to
all. However, if that is unacceptable to the majority, then (in the absence
of a better idea) I would rather drop the AI completely than give
users another (but different) crippled version of limited types.
****************************************************************
From: Stephen W. Baird
Sent: Thursday, January 13, 2005 3:31 PM
In the course of reviewing section 6, some questions came up about
extended return statements (AI-318).
Initially this discussion involved only Pascal, Randy, and the section 6
reviewers (Steve B. and Jean-Pierre).
It has become clear that a broader discussion is warranted.
The discussion so far (with some minor editing) is attached.
----
Stephen W Baird/Cupertino/IBM wrote on 01/11/2005 03:09:16 PM:
If we get as far as entering the handled_sequence of statements of an
extended_return_statement (see AI-318), then clearly the object associated
with the return statement has been successfully initialized and it
will need to be finalized at some point.
If the return statement executes "normally" (i.e., if the final transfer
of control described in the dynamic semantics of a return statement is
executed), then the caller is responsible for this finalization.
Otherwise, the callee has to take care of this finalization (right?).
This could occur if the extended_return_statement is exited via a goto or exit
statement, or if an exception flies out of the handled_sequence_of_statements,
or if execution of the extended_return_statement is aborted (either by an
abort statement or via ATC).
Thus, the finalization rules for the return object are quite different
depending on how the extended_return_statement is exited.
It appears that the RM is missing a description of this distinction.
There also may be related problems in defining the master and
accessibility level of the function return object of an
extended_return_statement (see AI-162).
One approach would be to define an extended_return_statement to be
a master (in 7.6.1(3)), and therefore it would be the master of the
return object.
This would prohibit the following example
    type R1 is record F : aliased Integer; end record;
    function F1 return R1 is
        type Ref is access all Integer;
        Dangling : Ref;
    begin
        return Result : R1 do
            Dangling := Result.F'Access; -- should not be legal
            goto L;
        end return;
        <<L>> Dangling.all := 123;
        ... ;
    end F1;
, and would result in the appropriate task-termination-awaiting in
the following variation
    type R2 is record F : Some_Task_Type; end record;
    function F2 return R2 is
    begin
        return Result : R2 do
            goto L; -- must wait for Result.F to terminate
        end return;
        <<L>> ... ;
    end F2;
. On the other hand, this might introduce confusion (or worse) in the
case of a "normal" return and might require adding some
special rules to handle that case.
Another approach (the "black hole" model) would be to prohibit exiting
an extended_return statement without exiting the enclosing function. This
would require
- prohibiting goto statements and exit statements which would
transfer control out of an Extended_Return_Statement
- a dynamic semantics rule to the effect that an exception propagated
out of the Handled_Sequence_Of_Statements of an Extended_Return_Stmt
is propagated to the caller
. In this example,
    function F return Integer is
        E1, E2 : exception;
    begin
        return Result : Integer do
            raise E1;
        end return;
    exception
        when others => raise E2;
    end F;
, this would mean that E1 would be propagated to the caller, not E2.
This seems unintuitive.
This approach has the advantage that there is no need to define what
happens if an extended_return_statement is exited without exiting
its enclosing function, but it does not eliminate the need for finalization
rules in the case where an extended_return_statement propagates an exception
to the caller.
========
Pascal Leroy/France/IBM wrote on 01/12/2005 03:52:45 AM:
I don't like the "black hole" model
because of its implication for exception handlers. On the other
hand the other approach you mention is even less palatable as it
seems to imply that the master of the return object might change
during its lifetime.
Now that I reread this, it also seems strange that initialization
and finalization of the return object are done by different masters
(in the normal case). Could this be the root of the problem? What
if the caller always did initialization and finalization of the
return object? Would that make any sense?
========
Jean-Pierre Rosen <rosen@adalog.fr> wrote on 01/12/2005 04:44:04 AM:
Actually, the "black hole" model is what I had in mind when I read this
remark. I don't like the idea of "canceling" a return statement in the
middle, and if an exception is raised, I would expect it to be
propagated to the caller, not inside the function.
In short, I would expect these to be equivalent (assuming the return
type is not limited):
1)
    declare
        function F return T is
            -- do something
        end F;
    begin
        return F;
    end;
2)
    return X : T do
        -- Do the same thing
    end return;
(Note that in 1), there is no goto or exit issue, and that an exception
is propagated).
...
If the model is that the caller creates the returned object before
calling the function, yes, Pascal's idea of having the
caller perform initialization and finalization of the return object
would seem to make sense.
========
Stephen W Baird/Cupertino/IBM wrote on 01/12/2005 10:25:49 AM
<responding to Pascal's message>:
I don't like the black hole model either, mostly because it would be very
confusing for users.
However, I don't think it would be very difficult to implement:
Upon entering an extended_return_statement, a flag is set.
The handler for any handled_sequence_of_statements which encloses
an extended_return_statement would then query the flag and
do the right thing if the flag is set.
Still, it is probably a bad idea.
Your idea of having the caller perform default initialization is
appealing, but it seems like it would be impossible
to implement in case of an indefinite function result subtype.
As a minor point, there is also the case where default initialization
is not supposed to be performed:
return X : T := Explicit_Initial_Value do ... end return;
Finally, there is the problem of modifying the function result object,
exiting the extended_return_statement (e.g. via a goto) and then
executing another extended_return_statement.
========
Pascal Leroy/France/IBM wrote on 01/12/2005 10:59:57 AM:
> Your idea of having the caller perform default initialization is
> appealing, but it seems like it would be impossible
> to implement in case of an indefinite function result subtype.
Yes, that part bothers me, although I haven't given it enough thought yet.
> As a minor point, there is also the case where default initialization
> is not supposed to be performed:
> return X : T := Explicit_Initial_Value do ... end return;
Good point. This is somewhat related to the previous issue, as the
only case where this capability is important is for indefinite types
(boxy discriminants or class-wide).
> Finally, there is the problem of modifying the function result object,
> exiting the extended_return_statement (e.g. via a goto) and then
> executing another extended_return_statement.
If the result object is initialized/finalized by the caller, this is
not an issue. Exiting an extended_return_statement doesn't cause
finalization of the result object, and re-entering another
extended_return_statement (or the same one) doesn't cause it to be
initialized. This is all well-defined. But again the indefinite
case is problematic.
========
"Randy Brukardt" <randy@rrsoftware.com> wrote on 01/12/2005 11:19:29 AM:
I completely agree with Jean-Pierre. I don't think any transfers out of an
extended return statement should be allowed, and exceptions should be
propagated directly to the caller. The master of the object is that of the
caller (and yes, you may have to pass it in -- the initialization is done in
the return statement, but it doesn't use the master that is textually
there). I thought that was all obvious, it can't work any other way -- and I
thought that the wording made that clear (I see I was wrong about that).
Once you start a return statement, you have to return, not do other random
junk. I don't see why that should be confusing to users (with the possible
exception of the function's exception handler not working).
========
Stephen W Baird/Cupertino/IBM wrote on 01/12/2005 12:51:05 PM:
> I don't see why that should be confusing to users (with the possible
> exception of the function's exception handler not working).
When I said that this approach would be confusing for users,
I was talking about the behavior of exceptions.
You say "The master of the object is that of the caller".
I don't see how it can be that simple, even with the "black hole" model,
because of the possibility that an extended_return_statement can propagate
an exception (albeit directly to the caller).
If an extended return statement propagates an exception, then
a) if the function result object has component tasks, who waits
for them to terminate?
b) if the function result object requires termination, when is
it performed?
It would be very odd to require the caller to iterate over the components of
the function result object (e.g. to perform finalization) in the case where
the called function propagated an exception, particularly in the case where
the function result subtype is indefinite.
I suppose you could view this case as being a lot like an Unchecked_Deallocation
of the function result object. That would mean that the callee would perform
finalization but the caller would wait for tasks. We would also have to deal,
one way or another, with the case of erroneous execution of "deallocated"
discriminated tasks (see 13.11.2(11-15)).
There is also the question of the static accessibility level of the function
return object of an extended_return_statement. You certainly don't want to
allow Local_Variable'Access as an access discriminant value for the function
result object.
========
"Randy Brukardt" <randy@rrsoftware.com> wrote on 01/12/2005 01:24:59 PM:
> You say "The master of the object is that of the caller".
>
> I don't see how it can be that simple, even with the "black hole" model,
> because of the possibility that an extended_return_statement can propagate
> an exception (albeit directly to the caller).
Of course it's that simple. It's the same rules that you use for a "regular"
function return that has finalizable components.
> If an extended return statement propagates an exception, then
>
> a) if the function result object has component tasks, who waits
> for them to terminate?
The caller, of course.
> b) if the function result object requires termination, when is it
> performed?
At the point that it would have happened if the function call had been
successful.
> It would be very odd to require the caller to iterate over the components
> of the function result object (e.g. to perform finalization) in the case
> where the called function propagated an exception, particularly in the
> case where the function result subtype is indefinite.
I don't think so. The only sensible rule is that the finalization/waiting
(they are the same thing in my view) takes place at the same point whether
the call returns normally or raises an exception during the return
statement. Anything else would be madness - you'd have to move the object in
the finalization chain and/or change its master partway through the
evaluation of the return statement. That would be something that never
happens in Ada 95, and it seems insane.
> I suppose you could view this case as being a lot like an
> Unchecked_Deallocation of the function result object. That would mean
> that the callee would perform
> finalization but the caller would wait for tasks. We would also have to
> deal, one way or another, with the case of erroneous execution of
> "deallocated" discriminated tasks (see 13.11.2(11-15)).
No, finalization and task waiting take place at the same place.
Unchecked_Deallocation is weird because the master is somewhere else, but
that's a bug in my view - one that we can't change, of course. Task waiting
and finalization are closely related things, and they should happen at the
same place (even for Unchecked_Deallocation).
> There is also the question of the static accessibility level of the
> function return object of an extended_return_statement. You certainly don't
> want to allow Local_Variable'Access as an access discriminant value for the
> function result object.
Right, but that should be obvious, and the rule needed is quite simple (the
function object is less nested than the function).
****************************************************************
From: Gary Dismukes
Sent: Thursday, January 13, 2005 4:20 PM
> It has become clear that a broader discussion is warranted.
After an initial reading of the exchange I agree with Randy's view that
you can't exit out of an extended return statement (i.e., you have to
return to the caller, whether normally or by exception) and that the
caller has to perform all termination and finalization of the object.
I haven't thought through all the implications of this, but it seems
like the only reasonable model to me. Now, where are the gotchas?
****************************************************************
From: Stephen W. Baird
Sent: Thursday, January 13, 2005 6:49 PM
If the odd exception propagation rules don't bother you, then I don't
think there are any big problems with the rule that you can't
exit the handled_sequence_of_statements of an extended_return_statement
without also exiting the enclosing function (i.e., the "black hole" rule).
I just thought of another hole in this area that would need to be plugged,
    function Bad_Transfer_Of_Control return T is
        Return_Statement_Was_Entered : Boolean := False;
    begin
        select
            delay 1.0;
        then abort
            return X : T do
                Return_Statement_Was_Entered := True;
                delay 10.0;
            end return;
        end select;
        if Return_Statement_Was_Entered then
            Put_Line ("Houston - we've got a problem");
        end if;
        ... ;
    end Bad_Transfer_Of_Control;
, but that's ok; we can ban extended_return_statements within the abortable
part of an ATC statement (or make them abort-deferred, or ...).
The "gotcha" as I see it is in the rule that "the
caller has to perform all termination and finalization of the object".
In the case where the function result subtype is, say, an unconstrained
array subtype, and the function propagates an exception, this would mean
that the callee would have to simultaneously propagate an exception and
return an array for the caller to finalize. I don't see how to implement
this without imposing distributed overhead on functions that don't use
extended_return_statements.
****************************************************************
From: Randy Brukardt
Sent: Thursday, January 13, 2005 7:37 PM
> The "gotcha" as I see it is in the rule that "the
> caller has to perform all termination and finalization of the object".
> In the case where the function result subtype is, say, an unconstrained
> array subtype, and the function propagates an exception, this would mean
> that the callee would have to simultaneously propagate an exception and
> return an array for the caller to finalize. I don't see how to implement
> this without imposing distributed overhead on functions that don't use
> extended_return_statements.
I don't see this as an issue with extended_return_statements, but rather
with functions returning limited unconstrained subtypes. The rules require
build-in-place, even for these sorts of functions. They have to, or limited
functions don't work. The overhead has to be there even for a regular return
statement, because they too are build-in-place. I view these to be more like
procedures with a convenient syntax than functions, at least in
implementation.
So, I think some sort of expensive special calling convention will be
required for these things, so that the object can be created in the right
place. That's going to be unpleasant, but it's always necessary (even if a
regular return is used to return an aggregate, for example -- you'll have
precisely the same problems). I would expect that some implementations would
have to pass a thunk to do that, or at least a package of storage pool/task
master/finalization thumb.
Anyway, I don't think that limited functions are going to be returning
anything; the object (or a holder descriptor for it, for unconstrained
subtypes) will be passed in, and the object will be constructed there.
There's nothing to return (ever); the object gets finalized as for any other
object constructed at that place. I could imagine an implementation
returning a pointer to this object in the normal case (just so that it works
like other functions), but that certainly wouldn't have anything to do with
its finalization.
I do think that there is an obvious transformation of such a function into
constructs that we already understand. If LC is a limited controlled type,
then:
    type LC_Array is array (Positive range <>) of LC;
    function Constructor return LC_Array is
    begin
        return (1..10 => <>);
    end Constructor;
    ...
    Obj : LC_Array := Constructor;
can be transformed into (with the same finalization meaning):
    type LC_Array is array (Positive range <>) of LC;
    type LC_Array_Access is access LC_Array;
    procedure Free is new Ada.Unchecked_Deallocation (LC_Array, LC_Array_Access);
    type LC_Holder is new Ada.Finalization.Limited_Controlled with record
        Item : LC_Array_Access;
    end record;
    procedure Constructor (Holder : in out LC_Holder) is
    begin
        Holder.Item := new LC_Array'(1..10 => <>);
    end Constructor;
    procedure Finalize (Holder : in out LC_Holder) is
    begin
        Free (Holder.Item);
    end Finalize;
    ...
    Obj : LC_Holder;
    Constructor (Obj);
(Finalization of Obj forces the deallocation and finalization of the "Item"
component.) Obviously, a compiler vendor can probably do better than this
(without the explicit declarations, for instance), and probably would want
to use a storage pool other than the default heap for this allocation.
For an extended_return, the "body" of the return statement would be
operating on the allocated item:
    function Constructor return LC_Array is
    begin
        return D : LC_Array := (1..10 => <>) do
            D(5) := ...;
        end return;
    end Constructor;
would turn into:
    procedure Constructor (Holder : in out LC_Holder) is
    begin
        Holder.Item := new LC_Array'(1..10 => <>);
        begin
            Holder.Item.all (5) := ...;
        end;
    end Constructor;
Note that this transformation suggests that there is no real problem with
transfers of control. But I still think it is weird to initiate a return
statement and then goto out of it. So I think that should be illegal
irrespective of any semantic issues. Exceptions seem to matter less, but I
suspect that it would be easier to generate better code if you couldn't
handle them locally.
****************************************************************
From: Tucker Taft
Sent: Thursday, January 13, 2005 9:18 PM
I think I have a somewhat different model. Here is what the AI
says that is related to this issue:
> DEALING WITH EXCEPTIONS
>
> There was some concern about what would happen if an exception were
> propagated by an extended return statement, and then the same or some
> other extended return statement were reentered. There doesn't seem to be
> a real problem. The return object doesn't really exist outside the
> function until the function returns, so it can be restored to its
> initial state on call of the function if an exception is propagated from
> an extended return statement. Once restored to its initial state, there
> seems no harm in starting over in another extended_return_statement.
In my view, the extended return statement should be treated
like returning an aggregate, where all of the various statements
after the "do" may be thought of as being squeezed into the middle
of the aggregate. You can have arbitrary expressions in the
middle of an aggregate, which can raise exceptions, etc., and
these do *not* cause control to go to the caller. The exception
is propagated to the enclosing exception handler, not directly
to the caller. And until you hit the final right paren of the
aggregate, or the "end" of the extended return, the object
doesn't really "exist" as far as the outside world is concerned.
The object can be returned to the initial state it had when
the function was first called.
I don't agree with Randy that finalization is performed by
the caller in case of a failed extended return. I think
a good model is a record that contains a controlled component
and a regular component, where the regular component is
initialized *after* the controlled component, and its initialization
fails. In this case, we finalize the controlled component
right away, even if, say, the record is being initialized
as part of an allocator which wouldn't normally be finalized
until much later.
An interesting issue is how to handle task components.
The question is when do they get added to the appropriate
"activation list" which is presumably walked when the
caller hits the point to activate the tasks. It seems
simplest to let the caller take care of adding any task
components to the activation list. This can presumably
be done after the function returns, by walking the object
to find all the task components and add them to the
activation list. This would only happen if and when
the function returns successfully. This saves having to
pass in such an activation list, and avoids having to
remove tasks from the list if the extended return
statement fails in the middle.
By the way, did we ever specify what happens if we have
a limited aggregate as an actual parameter, and it
has a task component? I would presume that all task
components of such actual parameters are activated
after evaluating all the parameters, immediately
prior to invoking the body of the subprogram.
It seems unwise to activate them piecemeal as the
parameters are evaluated, as there is no defined
order of parameter evaluation. It is sort of like
the actual parameters are the components of a heap
object, and the call represents the allocator.
****************************************************************
From: Robert I. Eachus
Sent: Thursday, January 13, 2005 10:11 PM
Randy Brukardt wrote:
>I don't see this as an issue with extended_return_statements, but rather
>with functions returning limited unconstrained subtypes. The rules require
>build-in-place, even for these sorts of functions. They have to, or limited
>functions don't work. The overhead has to be there even for a regular return
>statement, because they too are build-in-place. I view these to be more like
>procedures with a convenient syntax than functions, at least in
>implementation...
I agree. We have to get this case right for any of this to be worth
doing.
>Note that this transformation suggests that there is no real problem with
>transfers of control. But I still think it is weird to initiate a return
>statement and then goto out of it. So I think that should be illegal
>irrespective of any semantic issues. Exceptions seem to matter less, but I
>suspect that it would be easier to generate better code if you couldn't
>handle them locally.
It is possible to make a transfer out of a return statement work, but I
don't see much point to making implementors deal with it. The simplest
rule seems to me to be that return statements create a scope for
statement identifiers, and we fix the wording in 5.1(12&13) and 5.8(4)
to match.
As for exceptions, the rule should be similar, we don't care if an
exception is raised AND handled inside the return statement. So the
return statement should be outside the scope of exception handlers local
to the function, but it may contain exception handlers. It is probably
worth adding an example to the RM of a function call with nested
exception handler just to show how to do the 'hard' case:
function Constructor return Limited_Array is
begin
   return D: Limited_Array(1..10 => <>) do
      for I in D'Range loop
         begin
            D(I) := ...;
         exception when others =>
            -- fix D(I) for some I.
         end;
      end loop;
   end return;
end Constructor;
with of course, an explanation pointing out that errors when allocating
D will be handled by the caller, while errors in computing D(I) can be
handled locally.
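A compilable variant of the example above, as a minimal sketch only: it
assumes a non-limited integer array so that the component assignment is
legal, and the names Constructor_Demo, Int_Array, and Risky are hypothetical.
It illustrates just the locally handled case, where a failure while computing
one component is repaired inside the extended return statement.

```ada
with Ada.Text_IO;

procedure Constructor_Demo is

   type Int_Array is array (Positive range <>) of Integer;

   --  Hypothetical computation that fails for one index.
   function Risky (I : Positive) return Integer is
   begin
      if I = 3 then
         raise Constraint_Error;
      end if;
      return I * 10;
   end Risky;

   function Constructor return Int_Array is
   begin
      return D : Int_Array (1 .. 5) do
         for I in D'Range loop
            begin
               D (I) := Risky (I);
            exception
               when Constraint_Error =>
                  D (I) := 0;  --  repair this component locally
            end;
         end loop;
      end return;
   end Constructor;

   Result : constant Int_Array := Constructor;

begin
   for I in Result'Range loop
      Ada.Text_IO.Put_Line (Integer'Image (Result (I)));
   end loop;
end Constructor_Demo;
```

In this sketch the exception raised for the third component never leaves the
inner block, so the function still returns normally and that component
holds 0.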
****************************************************************
From: Randy Brukardt
Sent: Thursday, January 13, 2005 10:24 PM
Tucker wrote:
...
> In my view, the extended return statement should be treated
> like returning an aggregate, where all of the various statements
> after the "do" may be thought of as being squeezed into the middle
> of the aggregate. You can have arbitrary expressions in the
> middle of an aggregate, which can raise exceptions, etc., and
> these do *not* cause control to go to the caller. The exception
> is propagated to the enclosing exception handler, not directly
> to the caller. And until you hit the final right paren of the
> aggregate, or the "end" of the extended return, the object
> doesn't really "exist" as far as the outside world is concerned.
> The object can be returned to the initial state it had when
> the function was first called.
I certainly agree with you that the model is the same, but I don't agree
with the conclusions that you draw from it. This model works in Ada 95
because the aggregate is necessarily non-limited, so the aggregate is
created into a temporary, and it isn't copied into the final object until
after the function returns (or immediately before; in the context of the
caller, in any case). A Program_Error raised by the Adjust copying into the
final object will certainly not be caught by the function.
The problem is, for limited functions, we have to build the aggregate (or
extended_return_statement) in place. In *either* case, raising an exception
in the middle is a problem, because the object is already partially
constructed. And, at least in Janus/Ada, an object belongs to a particular
master, and that is assumed to never change -- so the object gets finalized
whenever that master goes away (unless it happens earlier, of course). So
the object is constructed with an owner of the ultimate master - and thus
won't get finalized until that master goes away.
Your model would require changing the master of the object after it is
created. While I suppose that there is some way to implement that, it would
require a distributed overhead -- both adding extra nodes into the
finalization chains to allow safe deletion from the main finalization chain,
and of thunks that would have to be defined for all record types (or at
least all limited record types).
> I don't agree with Randy that finalization is performed by
> the caller in case of a failed extended return. I think
> a good model is a record that contains a controlled component
> and a regular component, where the regular component is
> initialized *after* the controlled component, and its initialization
> fails. In this case, we finalize the controlled component
> right away, even if, say, the record is being initialized
> as part of an allocator which wouldn't normally be finalized
> until much later.
I think you're confused: there is no such rule in the Standard that I can
find. The rules that exist pertain to a failed Adjust, and that rule (at the
insistence of one S. Tucker Taft, as I recall) was relaxed to "might or
might not be finalized". Finalization occurs when the object's master is
left. In the case of a failed allocator, that could indeed be a long time in
the future. I think the model you describe would be wrong, because it would
finalize the object too soon.
The overhead of the model that you espouse here would be severe -- it could
only be implemented by putting an exception handler around *any*
initialization that could possibly fail. Unless the implementation has
zero-cost exception handlers, that's going to be a lot of expense. (We have
to do that for Finalize calls, and it is by far the largest overhead of
finalization in normal use. In the case of finalization, there really isn't
a choice (because we can't let a failure poison an unrelated
abstraction), but I don't see any special issue with initialization (as long
as the finalization actually happens eventually).)
> An interesting issue is how to handle task components.
> The question is when do they get added to the appropriate
> "activation list" which is presumably walked when the
> caller hits the point to activate the tasks. It seems
> simplest to let the caller take care of adding any task
> components to the activation list. This can presumably
> be done after the function returns, by walking the object
> to find all the task components and add them to the
> activation list. This would only happen if and when
> the function returns successfully. This saves having to
> pass in such an activation list, and avoids having to
> remove tasks from the list if the extended return
> statement fails in the middle.
You seem to be separating tasks and finalizable objects. I think that is a
serious mistake -- they should follow the same rules. Moreover, it is too
late; tasks are created belonging to a master in Janus/Ada (so that they can
be cleaned up if activation fails), so we'd have to pass in the master and
activation list into the function. And, in any case, walking components of a
record is that very expensive operation requiring a custom thunk for *all*
record types (to avoid contract problems). You're telling us we need two of
them? You're insane!
> By the way, did we ever specify what happens if we have
> a limited aggregate as an actual parameter, and it
> has a task component?
Yes, it's the AI-162 that you were the last person to rewrite. I seriously
think that you are working too hard! (Now go review your sections of the
AARM. :-) We redefined masters so that expressions and subprogram calls have
them.
> I would presume that all task
> components of such actual parameters are activated
> after evaluating all the parameters, immediately
> prior to invoking the body of the subprogram.
> It seems unwise to activate them piecemeal as the
> parameters are evaluated, as there is no defined
> order of parameter evaluation. It is sort of like
> the actual parameters are the components of a heap
> object, and the call represents the allocator.
God, I hope not. Activating tasks is complicated enough without deciding to
treat parameters specially. Why the heck would anyone care what order
they're activated in anyway? It's usually unspecified, so you couldn't
depend on anything about it anyway. And, as you say, the order of the
parameters is unspecified; so how could one parameter even be able to
determine if another parameter has its tasks started? It certainly can't
see them! I do agree that a single parameter would have to be activated like
an allocator (it has to be activated somewhere). But we evaluate each
parameter individually, and trying to tie them together would certainly add
additional complexity for no benefit.
I'm pretty much ready to give up on the entire idea of limited aggregates
and functions, because it simply isn't worth the headaches that you guys
keep coming up with. It would be tempting to disallow aggregates and
functions that contain tasks, but we know that that sort of restriction
never works. You've certainly convinced me that it couldn't possibly be worth
trying to implement these. (Probably ever.) Sigh.
****************************************************************
From: Tucker Taft
Sent: Thursday, January 13, 2005 11:00 PM
> ... Exceptions seem to matter less, but I
> suspect that it would be easier to generate better code if you couldn't
> handle them locally.
I don't agree. I think we want things to work as similarly
as possible between limited and non-limited, and between
normal and extended returns. We know that in Ada 95,
if you return an aggregate, and something fails in the
middle of creating the aggregate, you can handle that
inside the function. This should also be true if the
type being returned happens to be limited, and should
be true if the aggregate is turned into an extended
return statement with statements initializing the components,
and finally, it should be true if it is a limited extended
return.
****************************************************************
From: Randy Brukardt
Sent: Friday, January 14, 2005 12:05 AM
I have to agree, but I come to the opposite conclusion: the way non-limited
returns work currently is unacceptable, and if you really want them to be
exactly the same, we'll need to change non-limited to match limited.
The reason is that the current semantics essentially require a temporary;
build-in-place is not allowed for non-limited types. That's
because build-in-place has to be undone if an exception occurs after the
evaluation of the aggregate starts. We deal with that for regular
assignments by checking the aggregate for the possibility of raising an
exception before doing build-in-place, but for any type with controlled
components, such a check must necessarily fail (we cannot know what happens
in Adjust).
My intent was to use build-in-place (that is, the new calling convention)
for all record types. That would get rid of the obnoxious temporary that
can't be optimized away, and which makes composite functions something to
avoid whenever you care about performance. But you're saying that
build-in-place can never be done for a non-limited type with controlled
components, because you can tell if the target object is finalized before
the function starts evaluating the aggregate or other expression (the
function can handle the exception and then access the target). So you always
have to use a temporary (backing out a user-defined Finalize is impossible).
As someone who believes that the vast majority of types should be
controlled, I cannot justify a significant required overhead that has
virtually no user benefit. If a user really wants an explicit temporary,
they can write one.
In any case, we're pretty much required to use the same convention for
limited and non-limited functions, because otherwise generics wouldn't work.
And being forced to make a temporary at each call site is worse than the
current situation, because the function might decide it has to make a copy
too.
In any case, the point of this exercise (from my perspective) was to get
better performance for all controlled types. If the performance is going to
be worse, I'd be better off forgetting I ever heard about AI-318 (because
there is no reason to do a lot of work to end up with worse performance).
Or, perhaps, forgetting about the Ada standard and doing it right would make
the most sense. Neither would help Ada in the long run.
Anyway, I've wasted far too much time on this discussion. I've made my
position clear; I would rather drop AI-318 than be forced into Tucker's
semantics.
****************************************************************
From: Tucker Taft
Sent: Friday, January 14, 2005 9:10 AM
> I have to agree, but I come to the opposite conclusion: the way non-limited
> returns work currently is unacceptable, and if you really want them to be
> exactly the same, we'll need to change non-limited to match limited....
That seems like a potentially serious incompatibility.
I can imagine there is a non-trivial amount of code that
looks like:
function Blah... is
begin
   return Fum(...);
exception
   when ... =>
      raise Different_Exception;
end Blah;
If exceptions propagated by Fum are not caught by the exception
handler of Blah, we may break a lot of code which assumes
that only "Different_Exception" is propagated from Blah.
I understand your implementation concerns, but I think they
are all manageable, though certainly not trivial. Yes, you
may have to figure out how to finalize a partially initialized
object sooner than normal, but unchecked-deallocation knows
how to do that, and I believe most compilers clean up partially
initialized allocated objects right away, rather than waiting
until the heap as a whole is finalized.
It would be significantly worse, in my view, to have
to go back and change the way these things work now, while
also subjecting existing code to an incompatible, inconsistent
change in run-time semantics.
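To make the compatibility concern concrete, here is a minimal sketch of the
pattern above. The bodies of Fum and Blah, the choice of Constraint_Error,
and the name Handler_Demo are assumptions made up for illustration.

```ada
with Ada.Text_IO;

procedure Handler_Demo is

   Different_Exception : exception;

   --  Hypothetical Fum: fails for negative input.
   function Fum (N : Integer) return Integer is
   begin
      if N < 0 then
         raise Constraint_Error;
      end if;
      return N;
   end Fum;

   function Blah (N : Integer) return Integer is
   begin
      return Fum (N);
   exception
      when Constraint_Error =>
         --  Translate the failure, as in the pattern above.
         raise Different_Exception;
   end Blah;

   Value : Integer;

begin
   Value := Blah (-1);
   Ada.Text_IO.Put_Line ("no exception:" & Integer'Image (Value));
exception
   when Different_Exception =>
      Ada.Text_IO.Put_Line ("caller saw Different_Exception");
   when Constraint_Error =>
      Ada.Text_IO.Put_Line ("caller saw Constraint_Error");
end Handler_Demo;
```

Under the Ada 95 rules the handler in Blah covers the evaluation of the
return expression, so the caller sees Different_Exception. If exceptions from
Fum bypassed that handler, the second branch would be taken instead, which is
the kind of silent change in behavior at issue.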
****************************************************************
From: Bob Duff
Sent: Friday, January 14, 2005 9:57 AM
Tucker wrote:
> function Blah... is
> begin
>    return Fum(...);
> exception
>    when ... =>
>       raise Different_Exception;
> end Blah;
Or how about:
function Blah... is
begin
   return Fum(...);
exception
   when ... =>
      return Alternate_Result(...);
end Blah;
It does seem that a function should be able to handle an exception
raised anywhere within it. (Well, ahem, cough... except in the handler
itself.)
****************************************************************
From: Gary Dismukes
Sent: Friday, January 14, 2005 11:48 AM
> It does seem that a function should be able to handle an exception
> raised anywhere within it. (Well, ahem, cough... except in the handler
> itself.)
And, ahem, cough, in the function's declarative part. ;)
****************************************************************
From: Bob Duff
Sent: Friday, January 14, 2005 3:22 PM
> And, ahem, cough, in the function's declarative part. ;)
Well, sure, but you can always put that code into a block statement.
You can put the exception handlers in a block statement, too,
but that leads to infinite regress.
****************************************************************
From: Jean-Pierre Rosen
Sent: Friday, January 14, 2005 11:43 AM
From the text of the AI:
The syntax for extended return statements was initially proposed early
on, but when this AI was first written up, we proposed instead a revised
object declaration syntax where the word "return" was used almost like
the word "constant," as a qualifier. This was somewhat more economical
in terms of syntax and indenting, but was not felt to be as clear
semantically as this current syntax.
Given the current number of worms struggling to get out of the can,
would it be appropriate to reconsider this solution?
****************************************************************
From: Tucker Taft
Sent: Friday, January 14, 2005 1:59 PM
I believe Randy has made a number of good arguments that
indicate this has relatively little to do with the
extended return statement, and even less to do with
the specifics of its syntax. It is mostly related
to returning limited objects, whether they be specified
by an aggregate, function call, or extended return
statement. In all cases you need to specify what happens
if an exception is propagated before the return statement
is completed.
****************************************************************
From: Bob Duff
Sent: Friday, January 14, 2005 4:06 PM
> Your model would require changing the master of the object after it is
> created.
I don't see that.
Consider an uninit allocator of an array, in Ada 95:
type A is array(Positive range <>) of Some_Lim_Controlled;
type P is access A;
... new A(1..10) ...
If an exception is raised in the middle of initializing components, the
implementation is required to finalize the ones for which Initialize
completed successfully, and is forbidden from finalizing the others.
I think the easiest way to implement this is to catch any exceptions
that occur in the middle of initialization, finalize as necessary, and
then reraise. This requires keeping track of how far through the array
you got -- but that's necessary anyway -- this is just the loop index.
Are you saying this implementation is wrong, and that finalization must
wait until the collection is finalized? If so, I think we'd better
change the rules to allow this, because it's the easiest and most
efficient, and I believe many implementations already do it.
(It's more efficient, because otherwise, you'd have to store the index
of how far you got in the heap, for later use.)
In other words, it's already the case (at least on many implementations)
that finalization takes place earlier than the master, and this does not
involve "changing masters". Therefore, the same could apply to these
new kinds of returns.
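A small test program along these lines, offered only as a sketch, shows what
a given implementation actually does. The package Tracked, the Id component,
the Init_Failed exception, and the choice to fail on the fourth component are
all hypothetical. It relies on the Ada 2005 relaxation that allows a
controlled type to be declared at a nested level.

```ada
with Ada.Finalization;
with Ada.Text_IO;

procedure Partial_Init_Demo is

   package Tracked is
      Init_Failed : exception;

      type Some_Lim_Controlled is
        new Ada.Finalization.Limited_Controlled with record
           Id : Natural := 0;
        end record;

      procedure Initialize (Obj : in out Some_Lim_Controlled);
      procedure Finalize   (Obj : in out Some_Lim_Controlled);
   end Tracked;

   package body Tracked is
      Count : Natural := 0;  --  components initialized so far

      procedure Initialize (Obj : in out Some_Lim_Controlled) is
      begin
         if Count = 3 then
            raise Init_Failed;  --  fail on the fourth component
         end if;
         Count  := Count + 1;
         Obj.Id := Count;
         Ada.Text_IO.Put_Line ("Initialize" & Natural'Image (Obj.Id));
      end Initialize;

      procedure Finalize (Obj : in out Some_Lim_Controlled) is
      begin
         Ada.Text_IO.Put_Line ("Finalize" & Natural'Image (Obj.Id));
      end Finalize;
   end Tracked;

   type A is array (Positive range <>) of Tracked.Some_Lim_Controlled;
   type P is access A;  --  collection is finalized when the procedure exits

   X : P;

begin
   begin
      X := new A (1 .. 10);
   exception
      when Tracked.Init_Failed =>
         Ada.Text_IO.Put_Line ("allocator failed");
   end;
   Ada.Text_IO.Put_Line ("leaving");
end Partial_Init_Demo;
```

Whether the Finalize lines appear before "allocator failed" or only after
"leaving" is exactly the implementation choice under discussion, so no
particular ordering is claimed here.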
----------------
I think your other concern was doing:
X := F(...);
in the nonlimited controlled case. You don't want to have to make
temporaries. I agree. Why don't we simply allow that? That is, an
implementation is free to finalize X first, then pass its address to F,
which can build-in-place -- if the implementation so chooses.
****************************************************************
From: Tucker Taft
Sent: Friday, January 14, 2005 4:47 PM
> I think your other concern was doing:
>
> X := F(...);
>
> in the nonlimited controlled case. You don't want to have to make
> temporaries. I agree. Why don't we simply allow that? That is, an
> implementation is free to finalize X first, then pass its address to F,
> which can build-in-place -- if the implementation so chooses.
This seems like another potentially dangerous incompatibility.
Suppose you have:
X := F(X(1..3));
or X is visible up-level to F. You can't finalize X if
F might be able to see part or all of X during its execution.
Perhaps you meant we should allow pre-finalization of X
if its value is not needed to evaluate the right hand side,
and there is no chance it will be visible in an exception
handler if F propagates an exception. The current language
allows aborts to occur between the finalize and the
copy-and-adjust steps. Allowing the evaluation of the
right hand side to occur then is probably also OK, given
the above provisos. But your compiler will have to do
the analysis to determine it is safe.
Our compiler essentially already does this analysis for
*non* controlled types, I believe, to decide whether it
is safe to pass in the address of the left hand side
as the result temp for the function. We could certainly
do this for controlled objects as well, if we could perform
finalization before the call when safe.
But I suspect Randy was looking for a single approach
that was always allowed, without having to do any analysis
at the call site. In that case, he is stuck for an assignment
statement. Of course, if the function call is used to initialize a
new object, then you can always safely pass in the address
of the new object.
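The aliasing hazard can be seen even without controlled types. The following
is a minimal sketch with hypothetical names (Alias_Demo, Triple, Reversed),
using a plain fixed-length string in place of a controlled object.

```ada
with Ada.Text_IO;

procedure Alias_Demo is

   subtype Triple is String (1 .. 3);

   X : Triple := "abc";

   --  Reads S while filling in the result object R.
   function Reversed (S : Triple) return Triple is
   begin
      return R : Triple do
         for I in R'Range loop
            R (I) := S (S'Last - I + 1);
         end loop;
      end return;
   end Reversed;

begin
   --  The target X is also the actual parameter.
   X := Reversed (X);
   Ada.Text_IO.Put_Line (X);
end Alias_Demo;
```

A conforming implementation prints "cba". If R were built directly in X while
S was passed by reference, the first store would overwrite a character that
is still to be read and the result would be "cbc", which is why the call-site
analysis is needed before the target's address is passed in for an assignment
statement.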
****************************************************************
From: Randy Brukardt
Sent: Friday, January 14, 2005 9:47 PM
> If an exception is raised in the middle of initializing components, the
> implementation is required to finalize the ones for which Initialize
> completed successfully, and is forbidden from finalizing the others.
Correct.
> I think the easiest way to implement this is to catch any exceptions
> that occur in the middle of initialization, finalize as necessary, and
> then reraise. This requires keeping track of how far through the array
> you got -- but that's necessary anyway -- this is just the loop index.
This I totally disagree with. Certainly, you *could* try to implement it
that way, but it would be a lousy implementation for the majority of
compilers. First, it requires a "virtual" exception handler, and that has a
significant cost on most systems. Second, figuring out where you are is
possible in simple cases (like this), but it doesn't generalize in any
sensible way. When you have discriminated types with variants and arrays,
and multiple controlled components nested to several levels (which in fact
happens in some Claw example programs), it just becomes a nightmare.
It makes much more sense for each controlled part (with the technical
meaning of part) to be initialized and finalized individually. Each gets
registered when its initialization finishes successfully, so you can only
finalize those that have finished.
> Are you saying this implementation is wrong, and that finalization must
> wait until the collection is finalized? If so, I think we'd better
> change the rules to allow this, because it's the easiest and most
> efficient, and I believe many implementations already do it.
It's clearly wrong: the allocated object belongs to the collection, and
shouldn't be finalized until the collection goes away. For a declared
object, you can't tell the difference. And you can't tell the difference in
Ada 95, either, because the type necessarily must be non-limited, and you
can't tell the difference between the object being created in a temporary
(which is finalized immediately, before the allocated object is even
created) or this implementation.
But for a limited object, the object has to be built-in-place, and thus the
implementation is clearly wrong (*only* for a limited type). Whether that
should be relaxed is an open question. If you want to require the above
behavior, I'll fight you until the ends of the earth - it's clearly
requiring a horrible implementation, and buys nothing for users. But if you
just want to allow it, I don't particularly care. OTOH, we have generally
specified finalization behavior of limited types in Ada without allowing
much optimization, so I would tend to be conservative with these rules.
> (It's more efficient, because otherwise, you'd have to store the index
> of how far you got in the heap, for later use.)
You have to store *something* in the heap in order to know to finalize these
objects anyway (and usually that needs to be quite a bit - a subprogram
address and static link and a chain, at a minimum), so I don't see much
reason to worry about another word. For a chained implementation, there is
no extra cost at all (just make sure it's linked on the right chain).
> In other words, it's already the case (at least on many implementations)
> that finalization takes place earlier than the master, and this does not
> involve "changing masters". Therefore, the same could apply to these
> new kinds of returns.
Such implementations are wrong for limited types (certainly with the rules
as written). It's an "as-if" optimization for non-limited types, so it's
fine to use it in Ada 95 and in Ada 2005.
To even allow your implementation would require writing quite a bit of
really tricky wording - I don't even know all of the places where it would
have to be allowed.
To implement something like what you describe in Janus/Ada certainly is
possible (in the sense that anything is possible), but it would be quite a
hit in performance, because you'd have to do everything twice. "Changing
masters" would certainly be the cheapest way to do it, easiest would be a
handler and explicit finalize in the sense of Unchecked_Deallocation (but
that would be far more expensive because of exception handling overhead).
But the latter only works on the "collections" created for access types,
because those don't have pointers into them. The main finalization chain is
full of pointers into it. The only way to allow early finalization of stack
objects or changing a master of a stack object would be to add additional
dummy nodes for the pointers to point at. That would add overhead to all
programs with finalization (with a lot of extra work, the extra overhead
could be mitigated in blocks that don't do anything nasty, but that would be
very expensive to check with any degree of accuracy).
I prefer to keep the model simple, which is to finalize at the master of the
object.
****************************************************************
From: Randy Brukardt
Sent: Friday, January 14, 2005 9:59 PM
...
> This seems like another potentially dangerous incompatibility.
> Suppose you have:
>
> X := F(X(1..3));
>
> or X is visible up-level to F. You can't finalize X if
> F might be able to see part or all of X during its execution.
Right. I thought about that on the way home last night. It would have to
work like optimizing slices or (non-limited) aggregate assignments.
...
> Our compiler essentially already does this analysis for
> *non* controlled types, I believe, to decide whether it
> is safe to pass in the address of the left hand side
> as the result temp for the function. We could certainly
> do this for controlled objects as well, if we could perform
> finalization before the call when safe.
Exactly what I had in mind.
> But I suspect Randy was looking for a single approach
> that was always allowed, without having to do any analysis
> at the call site. In that case, he is stuck for an assignment
> statement. Of course, if the function call is used to initialize a
> new object, then you can always safely pass in the address
> of the new object.
Not really. It's clearly going to be necessary to be able to use a temporary
(for all types), because the function or aggregate could be directly passed
as a parameter, thus there being no object to assign into. Once you have
that, doing it for function calls if needed is fine. It's the "if needed"
that matters; you don't want simple functions:
Ten := To_Unbounded_String ("Ten");
to have to use temporaries solely to get the finalization "right". We already
have lots of rules allowing optimizations in this area, so one more is
unlikely to be harmful.
It should never be necessary to use a temporary for:
Ten : Unbounded_String := To_Unbounded_String ("Ten");
and I certainly want that to be true for *all* types, not just limited types.
****************************************************************
From: Bob Duff
Sent: Saturday, January 15, 2005 9:35 AM
Randy wrote:
> But for a limited object, the object has to be built-in-place, and thus the
> implementation is clearly wrong (*only* for a limited type).
My example showed a limited type.
What you seem to be saying is that the suggested implementation is wrong
only for limited types, *and* only for heap objects. That seems insane.
Why should heap objects be different from stack objects, here?
(Again, I'm talking about the limited case.)
The heap object in question is inaccessible (the allocator failed!),
so *requiring* it to remain in existence until collection finalization
seems to be of zero benefit to the user. [I realize there are sneaky
ways to make this half-baked object accessible, but those are bugs
waiting to happen. My point is that the pointer returned by "new" never
arrives in this case.]
>... Whether that
> should be relaxed is an open question. If you want to require the above
> behavior, I'll fight you until the ends of the earth - it's clearly
> requiring a horrible implementation, and buys nothing for users. But if you
> just want to allow it, I don't particularly care.
I said "allow", not "require". Certainly, if an implementation has
expensive exception handlers, that changes the trade-offs. We need not
argue about which methods are "best" for all compilers.
As I said, I'm surprised it's not already allowed, and I believe there
are implementations that do this sort of thing in some or all cases.
****************************************************************
From: Tucker Taft
Sent: Saturday, January 15, 2005 12:43 PM
...
>>I think the easiest way to implement this is to catch any exceptions
>>that occur in the middle of initialization, finalize as necessary, and
>>then reraise. This requires keeping track of how far through the array
>>you got -- but that's necessary anyway -- this is just the loop index.
>
> This I totally disagree with. Certainly, you *could* try to implement it
> that way, but it would be a lousy implementation for the majority of
> compilers.
Be that as it may, we implement it this way, and I believe so
does Rational. I'm not sure about GNAT.
> ...
> It makes much more sense for each controlled part (with the technical
> meaning of part) to be initialized and finalized individually. Each gets
> registered when its initialization finishes successfully, so you can only
> finalize those that have finished.
I believe Rational keeps careful track of what components
have been initialized, and then finalizes just those. I believe
Bob was involved in implementing that. The AdaMagic approach is to
have a "components master" which we use temporarily for registering
components that need finalization, and then at some point we
convert to a single registration for the whole object, throwing
away the components master. If things get interrupted in the
middle, then the components master is the "innermost" master,
and the components on it get cleaned up. After we make the
switch, the clean up of the components is embedded in a
whole-object clean up routine generated for types with multiple
finalizable components. The components master is unlinked
from the chain of masters, and the cleanup routine for
the object as a whole is linked onto the appropriate master.
I'm sure there are a million ways to implement this, and what makes
the most sense will vary from one implementation to the next,
depending on their run-time model, and on other tradeoffs they
choose to make.
> ...
> It's clearly wrong: the allocated object belongs to the collection, and
> shouldn't be finalized until the collection goes away....
It might be hard to find "clear" wording saying this.
And it seems undesirable. It is also interesting that
"garbage collection" is permitted by the language, but
never defined in detail nor is there a clear explanation
of how garbage collection relates to finalization.
> But for a limited object, the object has to be built-in-place, and thus the
> implementation is clearly wrong (*only* for a limited type). Whether that
> should be relaxed is an open question.
As far as I know, AdaMagic (and hence Green Hills and Aonix) and
Rational both attempt to finalize partially initialized objects
right away. I can't imagine any value to the user to postpone
this finalization, and by finalizing right away, we can reclaim
the space that much sooner.
> ...
> If you want to require the above
> behavior, I'll fight you until the ends of the earth - it's clearly
> requiring a horrible implementation, and buys nothing for users.
This seems a bit of an overstatement, and reclaiming storage
sooner seems more than "nothing."
> ...
> But if you
> just want to allow it, I don't particularly care. OTOH, we have generally
> specified finalization behavior of limited types in Ada without allowing
> much optimization, so I would tend to be conservative with these rules.
It sounds like we need some clarification of this issue.
If an allocator fails before creating the access value or
activating component tasks, it seems difficult to justify
requiring deferring finalization, and based on your strong statement,
perhaps also difficult to justify requiring immediate
finalization. Also, garbage collection needs to be factored
into the finalization rules.
> ...
> Such implementations are wrong for limited types (certainly with the rules
> as written).
It would be interesting to identify these rules. We probably
want to make them clearer, and given existing implementations,
be permissive of either immediate or deferred finalization of
partially initialized heap objects.
****************************************************************
From: Bob Duff
Sent: Saturday, January 15, 2005 1:22 PM
> "garbage collection" is permitted by the language, but
> never defined in detail nor is there a clear explanation
> of how garbage collection relates to finalization.
There is some discussion of this in 13.11.3, much of which was banished
to the AARM because we didn't think it was worth putting in the RM,
given the lack of Ada implementations supporting GC. We figured, if
somebody wants to implement GC, let *them* figure out all the
interactions with finalization, with some hints about the issues in the
AARM. But it's clear that the intent was that a GC'ed implementation
would finalize the collected objects "prematurely".
There are Ada implementations on top of the Java Virtual Machine and
the .NET virtual machine. I presume they deal with these interactions
by letting the virtual machine do its thing.
I firmly believe that Ada should *allow* GC, just like I firmly believe
that Ada should *allow* generic code sharing, despite the fact that
(sadly) not too many implementations do these nice things.
****************************************************************
From: Tucker Taft
Sent: Saturday, January 15, 2005 2:58 PM
Here is a statement that I believe, at least in part,
disallows premature finalization of heap objects (7.6.1(10)),
and should be revised, probably:
    ... If an instance of Unchecked_Deallocation is never
    applied to an object created by an allocator, the object
    will still exist when the corresponding master completes,
    and it will be finalized then.
****************************************************************
From: Robert I. Eachus
Sent: Saturday, January 15, 2005 9:07 PM
I think that this argument is getting very far away from the original
intent of this feature. The decision that has to be made IMHO is whether to
keep the ability to initialize limited objects, or to let some notion of
linguistic purity get in the way.
In current Ada, if a function has an exception handler, that handler
does not catch all exceptions raised between the call and return. If a
user is writing a function whose definition requires handling all
exceptions that may be raised internally, then he knows how to use
nested blocks and so forth to ensure this. At first it looks like the
necessary rule here would create undue hardship. But it won't. It will
hardly even come up.
Why? Because within the function returning a limited object, the object
may not be limited. More often the case will be that the object will be
partially limited. The function will be defined in a location that can
"see into" the limited object. The parent part of the object may still
be limited in this view, but that is fine. The initialization function
called for the parent part will initialize that part of the object, and
handle any exceptions it should and can internally.
This leaves only two potential concerns. First an exception caused by
the creation of the entire object being returned. But that is not a
problem, at least as I see it. The object may be defined in the
declarative part--and doesn't get handled locally anyway. More likely
it gets defined before the keyword *do*. We can discuss that particular
case at length, but I don't see the errors that may occur there (as
opposed to being propagated there) as being all that important.
*Storage_Error* may occur, but this is Dave Emery's parachute that opens
on impact. Not when the object is too large for the heap--that case
should work. But when the function is called with only a few words left
on the stack, predicting where *Storage_Error* will occur and which
handler might see it, is futile.
If a programmer wants to handle Tucker's or Bob Duff's examples, he will
write:
Tucker wrote:
function Blah... is
begin
    return Fum(...);
exception
    when ... =>
        raise Different_Exception;
end Blah;
Or
function Blah... is
begin
    return Fum(...);
exception
    when ... =>
        return Alternate_Result(...);
end Blah;
Both are legal today, and AFAIK we are not talking about changing that,
with the possible exception of limited objects being built in place.
But the potential problem case as I see it, is when there is a sequence
of statements within the return:
function Constructor return Limited_Array is
begin
    return D: Limited_Array(1..10 => <>) do
        for I in D'Range loop
            begin
                D(I) := ...;
            exception when others =>
                -- fix D(I) for some I.
            end;
        end loop;
    end return;
end Constructor;
And here there are no philosophical or other issues. The user can declare a
handler inside the return, and it will do exactly what he wants. Of course, we
do need to disallow gotos out of the scope and so on, but I hope we have
already agreed on that.
This means that the *creation* of the object to be returned is the bone of
contention. But I guess that I don't see that either. If the object is being
built in place, how can it matter to the user whether it is the creation of the
'temporary' or the 'target' that causes the exception? Certainly in:
Foo: Limited_Bar := Constructor(...);
The user is not going to be surprised to get *Storage_Error* outside the
constructor if there is not enough space to create Foo, or *Tasking_Error* if
he has exceeded the limit on the number of tasks. So I just don't see a real
problem here.
Programmers will need to know that exceptions in the sequence_of_statements in
a return statement occur after the enclosing scope has been left. But that is
the sort of thing that is dealt with by an example in the RM. I could even see
resolving the dispute between Tucker and Randy by allowing exceptions caused by
creating and allocating the returned object to be handled in either the
function or the caller as the implementation chooses. I don't really think it
is as bad as all that though. There are some exceptions that will occur in the
caller, others that can occur in the function, and a lot where either scope
will be appropriate. I just can't see destroying a useful new functionality by
getting pedantic about potentially obscure cases. Yes, we have to define it,
but no we don't have to over define it.
****************************************************************
From: Robert Dewar
Sent: Saturday, January 15, 2005 9:11 PM
I would hesitate to revise this. Allowing arbitrary early
finalization can be disastrous to fundamental semantics of
currently correct programs that use finalization e.g. to
properly unlock something at the proper point. Garbage
collection can occur early in such a case, since very
likely we are using a dummy variable that no one references.
In GNAT we have a pragma Finalize_Storage_Only, which says
that the only reason for a finalization routine is to free
storage. We had in mind two purposes
1. Skip finalization at the outer level when program terminates
(this is implemented in GNAT now).
2. Allow early finalization in the garbage collected case.
I really believe it is essential not to introduce a giant
upwards incompatibility here!
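To illustrate the concern, here is a minimal sketch of a program that
depends on finalization happening at the proper point. The names
Lock_Guards, Guard, Locked and Guard_Demo are hypothetical, and a Boolean
stands in for a real lock; this is an assumption-laden illustration, not
code from any existing application.

```ada
with Ada.Finalization;
package Lock_Guards is
    -- Hypothetical stand-in for a real lock.
    Locked : Boolean := False;
    type Guard is new Ada.Finalization.Limited_Controlled with null record;
    procedure Initialize (G : in out Guard);  -- acquires the "lock"
    procedure Finalize (G : in out Guard);    -- releases the "lock"
end Lock_Guards;

package body Lock_Guards is
    procedure Initialize (G : in out Guard) is
    begin
        Locked := True;
    end Initialize;
    procedure Finalize (G : in out Guard) is
    begin
        Locked := False;
    end Finalize;
end Lock_Guards;

with Ada.Text_IO;
with Lock_Guards;
procedure Guard_Demo is
begin
    declare
        -- A dummy object that is never referenced again; it exists
        -- only so that Finalize runs when the block is left.
        Hold : Lock_Guards.Guard;
    begin
        -- Critical section: correct only if Hold is not finalized early.
        Ada.Text_IO.Put_Line ("Inside block, Locked = " &
           Boolean'Image (Lock_Guards.Locked));
    end;
    Ada.Text_IO.Put_Line ("After block, Locked = " &
       Boolean'Image (Lock_Guards.Locked));
end Guard_Demo;
```

With finalization at the end of the block, this reports TRUE inside the
block and FALSE after it. If Hold could be finalized as soon as it was no
longer referenced, the critical section would run unlocked.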
****************************************************************
From: Tucker Taft
Sent: Sunday, January 16, 2005 9:30 AM
This is *not* talking about local variables, even unreferenced
local variables. In any case we have existing rules that disallow
removing local variables of a limited controlled type.
This is talking about objects created by an allocator, and
most specifically, the issue at hand is components created
by an allocator that *fails* due to an exception raised
during its evaluation. I believe that we want to encourage
implementations to recover the space for a failed allocator
as soon as possible, which implies finalizing the components
of the as-yet-incomplete-heap-object immediately, rather than
at the point where the access type goes out of scope.
****************************************************************
From: Robert Dewar
Sent: Sunday, January 16, 2005 9:42 AM
It still worries me to assume that we are only talking about
recovering space here. This would be a definite incompatibility.
I just don't know how severe a one. If all the finalizer does
is to deallocate, then that's not an issue, if it has unbounded
interesting side effects, such as referencing something that
does not exist until later on, then this seems a recipe for
semantic confusion to me.
****************************************************************
From: Jean-Pierre Rosen
Sent: Sunday, January 16, 2005 11:51 AM
> I believe Randy has made a number of good arguments that
> indicate this has relatively little to do with the
> extended return statement, and even less to do with
> the specifics of its syntax. It is mostly related
> to returning limited objects, whether they be specified
> by an aggregate, function call, or extended return
> statement. In all cases you need to specify what happens
> if an exception is propagated before the return statement
> is completed.
Of course, finalization issues are here to stay.
I just thought that this would get rid of the exit/goto/exception issues.
****************************************************************
From: Robert I. Eachus
Sent: Sunday, January 16, 2005 10:54 PM
>It still worries me to assume that we are only talking about
>recovering space here. This would be a definite incompatibility.
>I just don't know how severe a one. If all the finalizer does
>is to deallocate, then that's not an issue, if it has unbounded
>interesting side effects, such as referencing something that
>does not exist until later on, then this seems a recipe for
>semantic confusion to me.
I tend to agree with Tucker that the concern you have here is
misplaced. There are some tough issues involving creation of a complex
object containing tasks, but other rules seem to me to prevent those
from ever surfacing. None of the tasks which are part of the object
will ever be activated. A task designated by an access variable in the
object could be created by an allocator, and be activated even though
the object as a whole was never created. But such a task could never be
referenced, and should be considered a program design error (or an ACVC*
test case ;-) independent of what goes on here.
I see the other issue that Robert raises as again, a possible problem,
but one which is larger than what could arise in this case. If you
create a finalization routine which assumes that all objects of the type
have the same lifetime, then finalizing one of these objects early due
to a failure during creation of a larger object is a potential problem.
But that exception will occur in a declarative part (for any
limited object), so *any* object created in that declarative part can
be finalized early, or out of the expected order. Doesn't matter what
is decided here. The program designer (or tester) is going to have to
think about such things in every case where there is a complex
initialization in a declarative part. This was true in Ada 83, and is
true in Ada 95. Any potential exception in a declarative part has to be
treated and tested as adding an additional path through the unit. (What
I like about Ada compared to PL/I here is that each such case only adds
one path. You don't get exponential explosions.)
*If anyone reading this is too young to remember certain ACVC tests that
didn't make it into the ACATS, consider yourself lucky.
****************************************************************
From: Randy Brukardt
Sent: Monday, January 17, 2005 5:10 PM
Tucker Taft wrote, replying to Robert Dewar:
> Robert Dewar wrote:
> > Tucker Taft wrote:
> >
> >> Here is a statement that I believe, at least in part,
> >> disallows premature finalization of heap objects (7.6.1(10)),
> >> and should be revised, probably:
> >>
> >> ... If an instance of Unchecked_Deallocation is never
> >> applied to an object created by an allocator, the object
> >> will still exist when the corresponding master completes,
> >> and it will be finalized then.
> >
> > I would hesitate to revise this. Allowing arbitrary early
> > finalization can be disastrous to fundamental semantics of
> > currently correct programs that use finalization e.g. to
> > properly unlock something at the proper point. Garbage
> > collection can occur early in such a case, since very
> > likely we are using a dummy variable that no one references...
>
> This is *not* talking about local variables, even unreferenced
> local variables. In any case we have existing rules that disallow
> removing local variables of a limited controlled type.
>
> This is talking about objects created by an allocator, and
> most specifically, the issue at hand is components created
> by an allocator that *fails* due to an exception raised
> during its evaluation. I believe that we want to encourage
> implementations to recover the space for a failed allocator
> as soon as possible, which implies finalizing the components
> of the as-yet-incomplete-heap-object immediately, rather than
> at the point where the access type goes out of scope.
I don't know what we want to do, but I thought I'd try to shed a bit of
light on this topic. So I wrote an example program of the case that we're
talking about.
-----
with Ada.Finalization;
package Checkit2 is
    Created_Yet : Boolean := False;
    Finalized_Yet : Boolean := False;
    type Check_Type is new Ada.Finalization.Limited_Controlled with null record;
    procedure Initialize (Obj : in out Check_Type);
    procedure Finalize (Obj : in out Check_Type);
    function Raise_P_E return Integer;
end Checkit2;

package body Checkit2 is
    procedure Finalize (Obj : in out Check_Type) is
    begin
        Finalized_Yet := True;
    end Finalize;
    procedure Initialize (Obj : in out Check_Type) is
    begin
        Created_Yet := True;
    end Initialize;
    function Raise_P_E return Integer is
    begin
        raise Program_Error;
        return 10;
    end Raise_P_E;
end Checkit2;

with Ada.Text_IO;
with Checkit2;
procedure Check2 is
    -- Check when finalization of a failed limited allocator occurs.
    type Record_Type is limited record
        Cont : Checkit2.Check_Type;
        Oops : Integer := Checkit2.Raise_P_E;
    end record;
begin
    Ada.Text_IO.Put_Line ("--- Check when an allocated object is" &
        " finalized when the initializer fails");
    declare
        Early : Boolean;
    begin
        declare
            type Acc_Record_Type is access Record_Type;
            Obj : Acc_Record_Type;
            function Test return Acc_Record_Type is
            begin
                return (new Record_Type);
            exception
                when Program_Error =>
                    Early := Checkit2.Finalized_Yet;
                    if Checkit2.Finalized_Yet then
                        Ada.Text_IO.Put_Line ("%% Failed allocated object " &
                            "finalized inside of function");
                    elsif Checkit2.Created_Yet then
                        Ada.Text_IO.Put_Line ("%% Failed allocated object " &
                            "created but not finalized inside of function");
                    else
                        Ada.Text_IO.Put_Line ("%% Failed allocated object " &
                            "controlled component not created");
                    end if;
                    raise;
            end Test;
        begin
            Obj := Test;
            Ada.Text_IO.Put_Line ("** Test failed to raise exception.");
        end;
    exception
        when Program_Error =>
            if Checkit2.Created_Yet then
                if Checkit2.Finalized_Yet then
                    if not Early then
                        Ada.Text_IO.Put_Line ("%% Allocated object finalized " &
                            "when type goes out of scope");
                    -- else already reported on finalization.
                    end if;
                else
                    Ada.Text_IO.Put_Line ("** Allocated controlled component " &
                        "created but not finalized!");
                end if;
            else
                Ada.Text_IO.Put_Line ("%% Allocated controlled component " &
                    "never created");
            end if;
    end;
    Ada.Text_IO.Put_Line ("--- Check complete");
end Check2;
----
Unfortunately, it didn't shed much light.
Janus/Ada worked as I expected (printing "created but not finalized inside
of function" and "finalized when type goes out of scope").
The GNAT version I tried failed outright, printing "created but not
finalized inside of function", then "component created but not finalized!".
I didn't check if it just finalized the object too late, or never.
I don't have ObjectAda installed on this OS right now, so I didn't try it.
And I still haven't gotten around to trying to get the Rational Apex working
again (the license manager just refuses to work on our network). I'm sure
others will try those.
----
My personal opinion on this is rather split. Certainly, Initialize and
Finalize routines can link objects into other data structures (Claw works
this way), so it's important that we don't require any magic "going away".
OTOH, any controlled type that allows allocation of its objects and can't
handle an arbitrary Finalize call (from Unchecked_Deallocation, for example)
is pretty dubious. OT3H, AI-179 decides not to decide on what happens with
Unchecked_Deallocations that fail, and it would be odd to make similar
requirements on allocations. Moreover, deallocation of storage is not
semantically neutral -- especially if the storage comes from a user-defined
pool.
I do think it is crystal clear that the current model in the RM does not
allow such early finalizations (7.6.1 defines when finalizations are done,
and there is nothing about exceptions raised by allocators or return
statements in 7.6.1!) I suspect implementers got caught making an "as-if"
optimization that doesn't quite work.
In any case, I'm not planning to run out and create an ACATS test for this
case. It's hard to imagine a legitimate use for this, as raising an
exception in an initializer is clearly a bug (not a feature!). If some
programmers want to play Russian Roulette and leave those bugs in their
production systems, that's their business, but I don't have a lot of
sympathy.
****************************************************************
From: Pascal Leroy
Sent: Tuesday, January 18, 2005 6:14 AM
Apex produces:
--- Check when an allocated object is finalized when the initializer fails
%% Failed allocated object created but not finalized inside of function
** Allocated controlled component created but not finalized!
--- Check complete
which surprises me and doesn't seem quite right. At least, it's
GNAT-compatible ;-)
****************************************************************
From: Tucker Taft
Sent: Tuesday, January 18, 2005 1:45 PM
If an allocator's initialization fails, the user can't reclaim the space
using Unchecked_Deallocation. For a long-running application,
it would presumably be desirable if the implementation would
recover the space. Perhaps for this and for garbage collection,
we should permit an implementation to implicitly perform
an "Unchecked_Deallocation" and the associated finalization
as soon as the object is no longer accessible. This would
tie into the wording of 7.6.1(10), because it talks in
terms of Unchecked_Deallocation.
Of course if the pragma Controlled applies to the access type,
then the garbage collection option would be disallowed, though
implicitly deallocating the failed allocator might still be desirable
to avoid a storage leak.
****************************************************************
From: Tucker Taft
Sent: Tuesday, January 18, 2005 4:21 PM
Interestingly, the second sentence of 7.6.1(10), which
indicates that if Unchecked_Deallocation isn't used,
the object is finalized at the end of the access type scope, is
bracketed in the AARM, as though it is redundant.
But later in the AARM, it says that if the implementation
does garbage collection, then it "should" finalize the
object before reclaiming its storage. These two seem
to be inconsistent, unless we hypothesize that
garbage collection is implicitly invoking Unchecked_Deallocation.
****************************************************************
From: Bob Duff
Sent: Monday, June 6, 2005 12:35 PM
I'm using draft 11.8 of the [A]ARM.
3.8(13.1/2):
13.1/2 {AI95-00318-02} If a record_type_declaration includes the reserved word
limited, the type is called a limited record type.
So "limited record type" is not synonymous with "record type that is limited"?!
That seems rather confusing. How about renaming this concept "explicitly
limited record type"?
****************************************************************
From: Tucker Taft
Sent: Monday, June 6, 2005 7:08 PM
Sounds reasonable. I can't imagine this special
term is used very much, and it would be wise
to be as "explicit" as possible... ;-)
****************************************************************
From: Randy Brukardt
Sent: Monday, June 6, 2005 9:18 PM
That seems OK to me, although finding where it is used is going to be
tricky. (Which, I suppose, is the point.)
****************************************************************
From: Bob Duff
Sent: Tuesday, June 7, 2005 6:46 AM
3.7(10.f/2, 10.i/2)
3.8(13.1/2, 31.i/2)
7.5(8.1/2)
7.6(17.1/2)
10.2.1(28.e/2)
D.10(5.a)
I don't know which of the above are old wording that was intended to
mean "record type that is limited".
The term does not appear in the Index.
****************************************************************
From: Randy Brukardt
Sent: Saturday, June 11, 2005 12:35 AM
Bob gives a list of places to change:
> 3.7(10.f/2, 10.i/2)
Interestingly, the notes use the new term, the normative wording here uses
the old gobbledygook. Should 3.7(10) be changed to use "explicitly limited
record type" like all of the new wording?? It would seem to be more
consistent.
> 3.8(13.1/2, 31.i/2)
>
> 7.5(8.1/2)
>
> 7.6(17.1/2)
>
> 10.2.1(28.e/2)
These are all uses of the new term.
> D.10(5.a)
This is an old use of the new term; the meaning is exactly what we mean. So
I guess there wasn't any confusion. :-)
> I don't know which of the above are old wording that was intended to
> mean "record type that is limited".
None, amazingly. But you didn't look for "limited record", which might find
more hits.
> The term does not appear in the Index.
It does now.
****************************************************************
From: Pascal Leroy
Sent: Saturday, June 11, 2005 3:52 AM
> > 3.7(10.f/2, 10.i/2)
>
> Interestingly, the notes use the new term, the normative
> wording here uses the old gobbledygook. Should 3.7(10) be
> changed to use "explicitly limited record type" like all of
> the new wording?? It would seem to be more consistent.
It should be changed to use the new terminology, but it must still talk
about ancestors and all that, so the sentence will remain rather
convoluted.
****************************************************************
From: Randy Brukardt
Sent: Monday, June 13, 2005 10:51 PM
Turns out that we have to change it, because a "type containing the reserved
word limited" clearly includes derived types that explicitly include limited,
but we certainly don't want them to (it wouldn't necessarily be a "really
limited" type).
type L (...) is limited private;
type D (...) is new limited L;
D meets the letter of the old rule, but shouldn't be included if L is actually
completed by Integer (say).
****************************************************************