!standard 4.3.2(5/2) 08-06-13 AI05-0067-1/05 !standard 7.5(8.1/2) !standard 7.6(0) !standard 7.6(16) !standard 7.6(17/2) !standard 7.6(17.1/2) !standard 7.6(18) !standard 7.6(21/2) !class binding interpretation 07-10-22 !status work item 07-10-22 !status received 07-09-03 !priority High !difficulty Hard !qualifier Error !subject Build-in-place objects !summary The rules for build-in-place function calls and aggregates are clarified. Rather than trying to view the two objects as "the same", the model is that there are two distinct objects: the anonymous object (function or aggregate result) and the newly-created object (the one being initialized). The anonymous object "morphs into" the newly-created object. The intended implementation model is still the same: both objects occupy the same storage, so this "morphing" is implemented as a no-op in most cases. For an extension aggregate of the form "(F(...) with ...)", the Tag of the anonymous function result object is that of the function's result type. The Tag of the newly-created aggregate object is that of its type. Thus, in the case where the result of F is built in place, the Tag of the anonymous object is different from the Tag of the newly-created object it morphs into. For an extension aggregate of the form "(T with ...)" there is only one Tag -- that of the aggregate. Other properties are preserved by the "morphing" operation -- for example, access values that designate the anonymous object are updated to designate the newly-created object. !question Is a type derived from a build-in-place type also a build-in-place type? (Yes.) In an extension aggregate of a build-in-place type, what is the tag of the ancestor part? (See summary.) What happens to self-referential access values when the object is built in place? (They end up pointing to the newly-created object.) The assignment operation is defined for all types in the Amendment. But 7.6(16) and 7.6(18) have parenthetical remarks suggesting that they apply only to nonlimited types. Should that be corrected? (Yes.) !recommendation (See summary.) !wording Remove 7.5(8.1/2) and 7.6(17.1/2), which are "Implementation Requirements". Rename section 7.6 from "User-Defined Assignment and Finalization" to "Assignment and Finalization". Add after 7.6(17), the following (this is under "Dynamic Semantics"): When a function call or aggregate is used to initialize an object, the result of the function call or aggregate is an anonymous object, which is assigned into the newly-created object. Under certain circumstances, the anonymous object is "built in place", in which case the assignment does not involve any copying. In particular: - If the full type of any part of the object is immutably limited, the anonymous object is built in place. - In the case of an aggregate, if the type of the newly-created object is controlled, the anonymous object is built in place. - In other cases, it is unspecified whether the anonymous object is built in place. ["Immutably limited type" is defined in AI05-0052-1.] AARM To Be Honest: We say assignment to build-in-place objects does not involve copying, which matches the intended implementation (see below). Of course, the implementation can do any copying it likes, if it can make such copying semantically invisible (by patching up access values to point to the copy, and so forth). AARM Reason: We talk about the full types being immutably limited, as (like parameter passing), this is independent of the view of a type. That is, privacy is ignored for this purpose. Notwithstanding what this International Standard says elsewhere, if an object is built in place: - Upon successful completion of the return statement or aggregate, the anonymous object "morphs into" the newly-created object; that is, the anonymous object ceases to exist, and the newly-created object appears in its place. - Finalization is not performaned on the anonymous object. - Adjustment is not performed on the newly-created object. - All access values that designate parts of the anonymous object now designate the corresponding parts of the newly-created object. - All renamings of parts of the anonymous object now denote views of the corresponding parts of the newly-created object. - Coextensions of the anonymous object morph into coextensions of the newly-created object. AARM notes: The intended implementation is that the anonymous object is allocated at the same address as the newly-created object. Thus, no run-time action is required to cause all the access values and renamings to point to the right place. They just point to the newly-created object, which is what the return object has magically "morphed into". There is no requirement that 'Address of the return object is equal to 'Address of the newly-created object, but that will be true in this implementation. For a function call, if the size of the newly-created object is known at the call site, the object is allocated there, and the address is implicitly passed to the function; the return object is created at that address. Otherwise, a storage pool is implicitly passed to the function; the size is determined at the point of the return statement, and passed to the Allocate procedure. The address returned by the storage pool is returned from the function, and the newly-created object uses that same address. If the return statement is left without returning (via an exception or a goto, for example), then Deallocate is called. The storage pool might be a dummy pool that represents "allocate on the stack". The Tag of the newly-created object may be different from that of the result object. Likewise, the master and accessibility level may be different. An alternative implementation model might allow objects to move around to different addresses. In this case, access values and renamings would need to be modified at run time. It seems that this model requires the full power of tracing garbage collection. AARM To Be Honest: This "morphing" does not necessarily happen atomically with respect to abort and other tasks. For example, if a function call is used as the parent part of an extension aggregate, then the tag of the anonymous object (the function result) will be different from the tag of the newly-created object (the parent part of the extension aggregate). In implementation terms, this involves modifying the tag field. If the current task is aborted during this modification, the object might become abnormal. Likewise, if some other task accesses the tag field during this modification, it constitutes improper use of shared variables, and is erroneous. Modify 7.6(21/2) as follows: [For an aggregate or function call whose value is assigned into a target object, the implementation need not create a separate anonymous object if it can safely create the value of the aggregate or function call directly in the target object. Similarly, for]{For} an assignment_statement, ... 3.9.1(8.a): After first sentence, add "It also makes it easier to implement extension aggregates in the case where the type is limited, the object is allocated from a user-defined storage pool, and the discriminants are inherited from the parent type." The first sentence of 7.6(16) should be modified: To adjust the value of a [(nonlimited)] composite object, the values of the components of the object are first adjusted in an arbitrary order, and then, if the object is {nonlimited} controlled, Adjust is called. Delete "nonlimited" from 7.6(18). Add after 4.3.2(5/2): If the ancestor_part is a function call and the type requires build-in-place (see 7.6), then the ancestor_part shall have a constrained nominal subtype, unless the record_component_association_list is *null record*. AARM Reason: This restriction simplifies implementation, because it ensures that either the caller or the callee knows the size to allocate for the aggregate. Without this restriction, information from both caller and callee would have to be combined to determine the appropriate size. The (F(...) with null record) case is exempt from this rule, because (1) it is not needed in that case, and (2) such extension aggregates are created internally for inherited functions returning null-extension types -- we can't very well make those illegal. [end AARM Reason] If the ancestor_part is a function call to a generic formal function, and the extension_aggregate occurs in a generic body, then the ancestor_part shall have a constrained nominal subtype, unless the record_component_association_list is *null record*. AARM Reason: This is essentially an "assume the worst" version of the previous paragraph. !discussion Consider the following example. We have an allocator with a limited heap object initialized with an extension aggregate, where the parent part is given by a build-in-place function call to Make_T1. The expected implementation strategy is to pass T1_Ref'Storage_Pool as an implicit parameter to Make_T1, along with the size and alignment of the extension part as determined by Comp2. The size of the parent part is determined at the return statement in Make_T1. Make_T1 must calculate the size and alignment to allocate based on the two sizes and two alignments, and pass that on to the storage pool's Allocate procedure. Make_T1 will return the address of returned by Allocate, and that will be used at the call site to calculate the address of Comp2. package P1 is type T1 (D1 : Integer) is tagged limited private; type T1_Ref is access all T1'Class; for T1_Ref'Storage_Pool use ...; package Nested_Pkg is function Make_T1 (D1 : Integer) return T1; end Nested_Pkg; private type T1 (D1 : Integer) is tagged limited record Comp1 : String (1..D1); end record; end P1; with P1; use P1; package P2 is type T2 is new T1 with private; function Make return T1_Ref; N : Integer := 7; private type T2 is new T1 with record Comp2 : String (1..N); end record; end P2; package body P1 is package body Nested_Pkg is function Make_T1 (D1 : Integer) return T1 is begin return Result : T1 (D1) do Result.Comp1 := (others => 'x'); end return; end Make_T1; end Nested_Pkg; end P1; package body P2 is function Make return T1_Ref is begin return Result : T1_Ref := new T2' (Nested_Pkg.Make_T1 (D1 => 5) with Comp2 => "ABCDEFG") do null; end return; end Make; end P2; with P1; use P1; with P2; use P2; procedure Main is Ptr : T1_Ref := Make; begin null; end Main; Note that the size of Comp2 (based on N) need not be known at compile time, but it cannot depend on the discriminant. This is fortunate -- otherwise we would need to pass a size-calculating thunk to Make_T1. On the other hand, if T2 had a discriminant part (for example, "type T2 (D1, D2 : Integer) is new T1 (D1) with...") then we can calculate the size to allocate at the call site. In that case, we would pass the address of the parent part to Make_T1, instead of the storage pool and the size/alignment information. The Tag of the return object in Make_T1 is T1'Tag. The Tag of the return object in Make is T2'Tag. Semantically, we have two separate objects, but in implementation terms, this means the Tag will be overwritten in Make, since the two objects' Tags are at the same address. This is complicated enough that we make cases like this illegal. --- Paragraphs 7.6(16) and 7.6(18) apply to all types, and surely should not claim to apply to nonlimited types only. In addition, 7.6(16) should say that Adjust is called only for nonlimited controlled types, so that the canonical semantics is well-defined (limited controlled types do not have Adjust). --!corrigendum 7.6.1(17.1/1) - Many corrigendum sections to re-insert. !corrigendum 7.6(16) @drepl To adjust the value of a (nonlimited) composite object, the values of the components of the object are first adjusted in an arbitrary order, and then, if the object is controlled, Adjust is called. Adjusting the value of an elementary object has no effect, nor does adjusting the value of a composite object with no controlled parts. @dby To adjust the value of a composite object, the values of the components of the object are first adjusted in an arbitrary order, and then, if the object is nonlimited controlled, Adjust is called. Adjusting the value of an elementary object has no effect, nor does adjusting the value of a composite object with no controlled parts. !corrigendum 7.6(18) @drepl An implementation is allowed to relax the above rules (for nonlimited controlled types) in the following ways: @dby An implementation is allowed to relax the above rules (for controlled types) in the following ways: !ACATS Test This just clarifies the rules, so that additional tests are not needed. However, it could change the semantics of a few existing tests. Also, there may be value to testing an example using extension aggregates. !appendix !topic Limited type extensions and build-in-place !reference RM 3.2(6), 3.8(13.1), 7.5(8.1) !from Adam Beneschan 07-06-25 !discussion package Pack1 is type Typ1 is tagged limited record F1 : Integer; F2 : Integer; end record; end Pack1; with Pack1; package Pack2 is type Typ2 is new Pack1.Typ1 with record F3 : Integer; end record; function Func (Z : Integer) return Typ2; end Pack2; with Pack2; procedure Proc is X : Pack2.Typ2 := Pack2.Func (123); -- built-in-place? begin ... end; Does the language require Func to build its result in place? It would seem that it should; if an anonymous object were created, you'd be copying a Typ1 along with F3, which should be a no-no. But, since I have this ridiculous habit of trying to read the RM literally, it seems to me that the rules as written don't actually require the function result to be built in place. Here's how I read it: The Implementation Requirement in 7.5(8.1) says that when a function call is used to initialize an object, and the function call has a type "with a part that is of a task, protected, or explicitly limited record type", build-in-place is required. Typ2, however, is not an explicitly limited record type. This is defined in 3.8(13.1) to be a record_type_definition that includes the reserved word "limited", and the definition of Typ2 doesn't include the word "limited" in it anywhere. I can't find anything in the RM that says a type extension derived from an "explicitly limited" record type is also explicitly limited. (It's not in 3.9.1; if it's somewhere else, the index entry for "explicitly limited" doesn't point to it.) Also, I don't think Typ2 has a *part* that is an explicitly limited record type. "Part" is defined in 3.2(6): "Similarly, a _part_ of an object or value is used to mean the whole object or value, or any set of its subcomponents." Even though an object of type Typ2 contains a portion has Typ1, I don't think this meets the definition of "part". A Typ2 doesn't have a component or subcomponent of type Typ1; and while the "set of its subcomponents" phrase means that the set {F1, F2} would be a "part" of a Typ2 object, to me there is a definite distinction between the type Typ1 and the subcomponent set {F1, F2}. In my opinion, it would be a real stretch to use that phrase to claim that a Typ1 is a "part" of a Typ2. So based on that, an object of thpe Typ2 doesn't meet the language of 7.5(8.1) and therefore the Implementation Requirement does not apply. (I haven't looked at types derived from untagged limited types very closely, but might those not have the same problem?) Assuming that I haven't missed something and that my analysis is correct, and that it was the intent that the Implementation Requirement should apply to such type extensions, the best solution is probably to change 7.5(8.1) to say "with a part that is of a task or protected type or a type that is a descendant of an explicitly limited record type". (I haven't checked to make sure this would be good enough to deal with generic formal limited types.) This would still conform to the rationale given in AARM 7.5(8.a/2): "For a function_call, we only require build-in-place for a limited type that would have been a return-by-reference type in Ada 95", since types descended from explicitly limited types were return-by-reference (RM95 6.5(11)). **************************************************************** From: Robert A. Duff Sent: Saturday June 30, 2007 9:22 AM ... > Does the language require Func to build its result in place? > > It would seem that it should; Certainly true. ... > So based on that, an object of thpe Typ2 doesn't meet the language of > 7.5(8.1) and therefore the Implementation Requirement does not apply. This does seem like a hole in the wording. > (I haven't looked at types derived from untagged limited types very > closely, but might those not have the same problem?) > > Assuming that I haven't missed something and that my analysis is > correct, and that it was the intent that the Implementation > Requirement should apply to such type extensions, the best solution is > probably to change 7.5(8.1) to say "with a part that is of a task or > protected type or a type that is a descendant of an explicitly limited > record type". (I haven't checked to make sure this would be good > enough to deal with generic formal limited types.) Generics are irrelevant. Building in place happens at run-time, so the type we're talking about is the actual type passed in to the generic. One instance might use b-i-p, and the other not. >... This would still > conform to the rationale given in AARM 7.5(8.a/2): "For a > function_call, we only require build-in-place for a limited type that > would have been a return-by-reference type in Ada 95", since types > descended from explicitly limited types were return-by-reference (RM95 > 6.5(11)). I think that rationale is intended to be stronger -- the b-i-p types are intended to be exactly the types that were ret-by-ref in Ada 95. (Ret-by-ref was a nasty language design mistake, by the way. I wish JDI had invented b-i-p in 1980.) **************************************************************** From: Stephen W. Baird Sent: Tuesday, August 7, 2007 6:10 PM When an aggregate is used to initialize an object, initialization is performed "in place" in some cases. In other words, instead of creating a separate anonymous object for the aggregate and then assigning it to the object being initialized, the aggregate is built in place. This is required in some cases (7.5(8.1/2), 7.6(17.1/2)) and permitted in others (7.6(21/2)). When a function call is used as the ancestor expression of a build-in-place extension aggregate, the RM seems to impose contradictory requirements on the initialization of the tag component of the object being initialized. Consider the following example: with Ada.Tags; procedure Test is type T0 is tagged limited null record; type Enclosed (Enclosing : access T0'Class) is limited null record; type T1 (N : Natural) is new T0 with record Self_Ref : Enclosed (T1'Access); Discrim_Dependent : String (1 .. N); end record; type T2 is new T1 with record Ptr : Some_Access_Type; Lim : Some_Limited_Type; end record; F_Result_Tag : Ada.Tags.Tag; function F return T1 is begin return Result : T1 (Ident_Int (1234)) do F_Result_Tag := Result.Self_Ref.Enclosing.all'Tag; -- What value is assigned here? Some_Procedure (Result); end return; end F; type T0_Ref is access T0'Class; for T0_Ref'Storage_Pool use ... ; Allocator_Value : constant T0_Ref := new T2'(F with Ptr => Some_Function, Lim => Some_Other_Function); Allocated : T2 renames T2 (Allocator_Value.all); begin ...; end Test; Is F_Result_Tag assigned the value T1'Tag or T2'Tag? The first sentence of 6.5(8/2) seems to require T1'Tag. The first sentence of 3.9(3) seems to require T2'Tag. Assigning T1'Tag would introduce a violation of the fundamental rule that a tagged object's tag is immutable over the lifetime of the object. If the call to F is to be viewed as initializing a portion of an object of type T2, then it seems that T2'Tag should be assigned. If F.Result is used as the controlling operand of a dispatching call, one might equivalently ask which subprogram body is executed - T1's or T2's? Suppose that F_Result_Tag is assigned T2'Tag and this dispatching call executes the subprogram body associated with T2. This leads to another question about the state of the T2-specific portion of F.Result at the time of the dispatching call. Might this portion of F.Result be uninitialized at this point, resulting in an access-before-initialization problem? Certainly AI95-00373 (the "requires late initialization" AI) is only an incomplete solution to the problem of references to an object whose initialization is in progress. This part of the language definition is not completely bulletproof. Still, a user might reasonably conclude that it is safe, for example, to pass F.Result off to a procedure which then makes a dispatching function call using the operand it has been passed. The memory fault that results when the called function attempts to dereference a T2-specific access component which happens to contain the value "37" will probably not amuse the user. Within the statement list of an extended return statement, it seems quite reasonable to assume that the return object's initialization is complete. One solution would be to ignore the problem and hope that users won't do this sort of thing very often. Another solution would be to introduce a requirement that the initialization of an extended return object must be complete before beginning execution of the statement list of an extended return statement. This might sound like an obvious consequence of existing language rules, but the substance of this requirement stems from the inclusion of extension components which are not components of the function result type. Is this implementable? Consider an implementation which passes a storage-allocating callback routine into F in order to allocate storage for the function result. In the case of this particular example, that function might be called with the discriminant value of the result. It would then apply T2's (not T1's) discriminant-to-size mapping function and allocate the appropriate amount of storage out of the right storage_pool. It could also, before returning to F, initialize the tag component to T2'Tag and then initialize the T2-specific portions of the allocated object by calling Some_Function and Some_Other_Function. This would meet the proposed requirement and solve the aforementioned access-before-initialization problem, but it would introduce a somewhat peculiar possibility. Suppose that F repeatedly enters the extended return statement and exits it (e.g., via a goto statement) before finally returning a value. This would result in multiple calls to the storage-allocating callback routine. This would in turn mean that the initialization of the T2-specific portion of F.Result would be executed repeatedly. The RM does not currently permit this sort of oddness. Note that this only arises in the presumably rare case of exiting an extended return statement without returning. Showing how this requirement might be satisfied for one particular example does not demonstrate that it can always be satisfied, but the approach of initializing the extension components before, rather than after, the ancestor portion of the extension aggregate seems to generalize well. Returning to the original question about F_Result_Tag, the other alternative is that F_Result_Tag is assigned T1'Tag and a dispatching call executes the subprogram body associated with T1. After F has returned, the tag of the allocated object is changed from T1'Tag to T2'Tag. This form of mutable tags doesn't seem to introduce any *definitional* problems. There would be a problem if the language included some form of a general access type which could only designate objects with one particular tag, but it doesn't. The major problem with this approach seems to be a safety issue associated with usage patterns. Suppose, for example, that a customer has a pair of dispatching operations, Acquire and Release, that are supposed to match up in some way. A dispatching call to Acquire is executed while an object has one tag value; later, after the object's tag value has changed, a dispatching call to Release is executed. This is likely to lead to inconsistencies. This issue alone seems like enough to rule out the "mutable tags" model. Mutable tags would also introduce some peculiar corner cases, most of them involving the Unchecked_Access attribute. If F passes Result'Unchecked_Access off to a task which then renames The_Pointer_I_Was_Passed.Some_Dispatching_Operation, it will be odd when two calls to that rename end up executing different subprogram bodies. An implementation which resolved the call address at the point of the rename declaration would presumably be in error. In the case of optional build-in-place aggregates, the mutable tag model does introduce fewer portability problems. Dispatching calls will execute the same subprogram bodies if an aggregate is built in a a separate object vs. being built in place with a mutable tag. If it is built in place with an immutable tag, then different subprogram bodies may be executed. This is probably not a major concern as long as no access-before-initialization problems are introduced. Pascal Leroy has pointed out (in private correspondence) that there is an analogous issue in Ada95 because of RM95 7.6(17.1/1). An Initialize procedure plays the role of the Ada05 build-in-place function. Consider the following Ada95 example: with Ada.Finalization; package Pkg is type T1 is new Ada.Finalization.Controlled with null record; procedure Initialize (X1 : in out T1); type T2 is new T1 with ...; end Pkg; with Ada.Tags; with Text_Io; package body Pkg is procedure Initialize (X1 : in out T1) is begin Text_Io.Put_Line (Ada.Tags.Expanded_Name (T1'Class (X1)'Tag)); end Initialize; end Pkg; procedure Pkg.Maine is X : T2 := (T1 with null record); begin null; end Pkg.Maine; It seems clear that a call to Pkg.Maine should output the line "PKG.T2". It is true that RM95 4.3.2(7) talks about initialization "as for an object of the ancestor type", but this initialization does not include the tag component itself. Ada95 also has an analogous access-before-initialization issue - the Initialize procedure in the above example might make a dispatching call and end up accessing uninitialized T2-specific components. It seems that while this was possible in Ada95, it did not come up in practice. With the introduction of extended return statements and build-in-place function calls, this may be a more serious problem in Ada05. **************************************************************** From: Randy Brukardt Sent: Tuesday, August 7, 2007 6:55 PM Replying to Steve Baird (whose message is in some funny format that I can't seem to quote): As you point out in toward the end of this message, this is nothing new. The issue seems to revolve around (re)dispatch and direct access to tags. Re-dispatch is generally considered bad design. Indeed, Ada recognizes this by making it hard to write: you have to insert an explicit conversion to a class-wide type in order to force a redispatch. It seems even more dangerous in an initialization context, because many of the components may not yet be initialized. Direct uses of tag values are rare, and doing things based on them is also considered bad practice. Mostly likely, this issue has never come up because it is hard to write a redispatch by mistake, and the "bad practice" thoughts noted above keep people from doing it on purpose. All that seems to be new here is to note that an extended_return_statement is also an initialization context where redispatch is a very bad idea. The only real question in my mind is whether we ought to make a language change to discourage redispatch even further. Ideally, it should be banned outright in initialization contexts. But I don't think that can be done without compatibility issues (especially inside of Initialize). Steve's "solution" to the problem of putting even more requirements on initialization order doesn't really solve the problem of redispatch, and it seems to make finalization more risky. (If the components are finalized in reverse order, a redispatch in a Finalize routine has the same problem of potentially accessing components that have already been finalized; if the "arbitrary order" of the RM is used, you again cannot reason about what a redispatch will do in terms of accessing components. I realize that the Finalize case probably wouldn't make a program erroneous, but if you can't reason about the behavior of Finalize, the program might as well be erroneous because you can't write anything that will be guaranteed to work.) In any case, I don't think this will be a problem in extended_return_statements any more than in Initialize routines. It's unlikely that anyone will be writing redispatching calls in these contexts unless they're seriously confused (if you need to do something to the object as a whole, you have to do it via the whole-object Initialize/Adjust). This all leads me to think that the problem is with 6.5(8/2). This rule does not make sense for build-in-place, because we don't know the tag of the ultimate object. So I think we need to drop it or modify it such that it does not apply inside the function or to build-in-place objects. (Not sure which.) Steve didn't tell us why this issue came up; I doubt it was a user problem. It seems most likely to be an implementation problem which he is then trying to make into a mountain. **************************************************************** From: Tucker Taft Sent: Tuesday, August 7, 2007 8:22 PM My view is that you should not take "build in place" too literally from a semantic point of view. It is more of an implementation requirement, where the the semantics should generally work out as they would without build-in-place. In particular, the "outer" object of an extension aggregate doesn't really exist until the aggregate is fully evaluated. While evaluating the parent part, the parent part is a separate object, that just happens to overlap with the ultimate resting place of the outer object. So in my view, until you return from the function, the return object it is building is definitely of the result type of the function, and so the tag will be T1 until that moment. I also don't think 3.9(3) is really violated by this model, because the outer object doesn't really exist until all of the components of the aggregate have been evaluated. You certainly don't want anyone manipulating an object corresponding to the result of an aggregate until these evaluations have occurred. **************************************************************** From: Randy Brukardt Sent: Wednesday, August 8, 2007 12:19 PM > So in my view, until you return from the function, > the return object it is building is definitely of the > result type of the function, and so the tag will > be T1 until that moment. That's a more consistent view than whatever I was babbling about last night. I like it better, so I'm going to switch my answer to agree with Tucker, as it means that "optional" build-in place won't change the semantics of a return statement (and that would be very bad). OTOH, I don't think we're going to be seeing any tests of redispatching in extended_return_statements, so I doubt much that whatever implementations do will matter much in practice. **************************************************************** From: Stephen W. Baird Sent: Tuesday, August 14, 2007 7:58 PM > Steve didn't tell us why this issue came up; I doubt it was a user problem. It was an implementation question. I'm just trying to figure out how build-in-place extension aggregates are supposed to work. > .. the "outer" > object of an extension aggregate doesn't really exist > until the aggregate is fully evaluated. While > evaluating the parent part, the parent part is > a separate object, that just happens to overlap > with the ultimate resting place of the outer > object. > > So in my view, until you return from the function, > the return object it is building is definitely of the > result type of the function, and so the tag will > be T1 until that moment. > Given that Randy and I were unable to deduce this two-object model from the RM, it seems that some clarification is needed. Viewing an extension aggregate as two distinct objects residing at the same location certainly seems more appealing than viewing it as a single object whose tag value changes midway through its lifetime. This definition does, however, seem to involve some handwaving. In Ada terms, "residing at the same location" means at least the following: a) The metamorphosis of the first object into (a portion of) the second is accomplished without any Adjust/Finalize calls or any other user-visible side effects. b) An access value which refers to (a part of) the first object implicitly becomes a reference to the (corresponding part of) the second object at the end of the lifetime of the first object (i.e., at the time of the assignment of the first object to the second). This is an exception to the usual rule that access values which refer to an object after the end of the object's lifetime are dangling references. Similarly, references derived from such access values (e.g., a rename of Outstanding_Access_Value.all.Some_Component) are similarly transformed. c) Coextension ownership is handed off from the first object to the second. Should all of this be spelled out in an AARM note? The two-object approach also opens the door for a violation of the last sentence of 3.9.3(1/2): Because objects of an abstract type cannot be created, a dispatching call to an abstract subprogram always dispatches to some overriding body. If the ancestor type of an extension aggregate is abstract, then a dispatching call may end up attempting to execute an abstract subprogram: procedure Abstract_Call is package Pkg1 is type T1 is abstract tagged limited null record; function Abstract_Function (X1 : T1) return Integer is abstract; end Pkg1; function Dispatcher (Ref : access Pkg1.T1'Class) return Integer is begin return Pkg1.Abstract_Function (Ref.all); end Dispatcher; package Pkg2 is type T2 is abstract new Pkg1.T1 with record F1 : Integer := Dispatcher (T2'Access); end record; end Pkg2; package Pkg3 is type T3 is new Pkg2.T2 with record F2, F3 : Integer; end record; function Abstract_Function (X3 : T3) return Integer; end Pkg3; package body Pkg3 is function Abstract_Function (X3 : T3) return Integer is begin return 123; end Abstract_Function; end Pkg3; X : Pkg3.T3 := (Pkg2.T2 with 456, 789); begin null; end Abstract_Call; How is this supposed to work? When this example is compiled with our compiler, the tag field of the aggregate is never set to anything other than T3'Tag and Dispatcher's call dispatches to Pkg3.Abstract_Function. Is our implementation in error? **************************************************************** From: Tucker Taft Sent: Tuesday, August 14, 2007 8:08 AM ... As far as your question about an ancestor part that is specified by an abstract ancestor subtype_mark, the dynamic semantics are actually pretty different. There is no separate object created in that case. It simply says that the components that are not given by the association list are initialized by default. By contrast, when the ancestor part is specified by an expression, the expression is evaluated, which creates a separate object (with its own tag) that is then assigned to the ancestor part of the aggregate object. So I don't see the need to ever have a tag that identifies an abstract type, since there is no separate object when you specify an ancestor subtype_mark rather than an ancestor expression. I believe that the key thing is that in build-in-place, the assignment and associated adjust/finalize become no-ops at run-time. But if there are two objects in the "normal" semantics, then there are still two objects conceptually in the build-in-place semantics. It probably would be worth describing in more detail what sort of transformation happens upon a build-in-place "assignment," especially to an ancestor part. I think most of it could be in AARM implementation notes. It is important probably in normative words somewhere to say that access values remain valid, though for the ancestor part, they now denote a view of the "new" object whose type is that of the ancestor part. I.e., it is roughly equivalent to 'Access applied to a view conversion, e.g. T2(New_Obj_Of_Type_T3)'Access. **************************************************************** From: Randy Brukardt Sent: Friday, August 24, 2007 10:38 PM Not wanting to do anything tough, I'm working on test objectives for 7.6 this evening before leaving. I noticed a very minor glitch in 7.6(16) and 7.6(18). 7.6(13-15) talks about the canonical implementation of the assignment operation. That is now defined for limited types. 7.6(16) defines adjustment for, including a call to Adjust. The paragraph parenthetically claims to apply only to non-limited types, but that begs the question of what the meaning of this is for limited types. (Yes, I know that later requirements are intended to make this moot, but should the canonical semantics really be undefined??) So I suggest removing the parenthetical non-limited, and adding a word to make it clear that adjusting a limited type is a no-op (and surely does not call a routine that does not exist!): "To adjust the value of a [(nonlimited)] composite object, the values of the components of the object are first adjusted in an arbitrary order, and then, if the object is {nonlimited} controlled, Adjust is called. Adjusting the value of an elementary object has no effect, nor does adjusting the value of a composite object with no controlled parts." We could add something to the last phrase about limited objects, but since it is already considered redundant, it isn't necessary (and might be considered conflicting with the requirements following). The AARM note at 7.6(16.a) also needs a bit of repair. There is a similar problem with 7.6(18). It has a parenthetical "for nonlimited controlled types", but 7.6(21/2) is the rule that 7.6(17.1/2) is requiring to be implemented -- for both limited and nonlimited aggregates (and functions for that matter). It's quite clear that 7.6(21/2) *can* happen for limited types, the so-called proof of 7.6(18.a) notwithstanding. So I propose dropping "nonlimited" from 7.6(18) and fixing 7.6(18.a) to read something like: Ramification: The rules below about assignment_statements apply only to nonlimited controlled types, as assignment_statements are not allowed for limited types. The other rule applies to both limited and nonlimited types, and in fact is required for all assignment operations involving aggregates and function calls. This is important because the programmer can count on a stricter semantics for limited controlled types. As both of these are parenthetical remarks, they don't affect the correctness of the language. As such, I propose to add these changes to the presentation AI. **************************************************************** !topic Effect of build-in-place on "current instance" of a type !reference 7.5(8.1), 4.3(5), 8.6(17) !from Adam Beneschan 07-08-31 !discussion This is kind of a nitpick, but it's a bit disturbing to me. My understanding of the RM is that the "Dynamic Semantics" sections define what the effect of a legal Ada program will be when it runs. There are "Implementation Requirements" sections that impose additional constraints how things work, but those generally aren't supposed to affect the semantics, except perhaps that certain checks not take place, or things may not work exactly according to the Dynamic Semantics in some error or boundary cases, or some redundant Adjust/Finalize pairs may be eliminated. But is that the basic idea? If so, I'm not sure what to make of this: type T is limited record F1 : Integer; F2 : access T := T'Unchecked_Access; end record; R : aliased T := (F1 => 5, F2 => <>); I can't find anywhere in the RM that says that R.F2 will be R'Access after R is created. The semantics of the aggregate are that an anonymous object is created and the components are assigned (4.3(5)); based on 4.3.1(19.1) and 8.6(17), the initial value of F2 should then be an access to an anonymous object. 7.5(8.1) says that for an aggregate of a limited type, the *implementation* shall not create a separate anonymous object, but that the aggregate shall be constructed directly in the target object. In my opinion, this isn't quite strong enough; while this tells something about how the operation will be implemented (and implies that there is no extra assignment that would cause an adjust/finalize operation to be performed), it doesn't say anything to the effect that the anonymous object is "identified" with the target object, or the target object is treated as the anonymous object. The semantics of 4.3(5), 4.3.1(19.1) and 8.6(17) seem to require that F2 be initialized to point to some anonymous object, and while 7.5(8.1) says we're not supposed to create an anonymous object, it doesn't say what will replace the anonymous object for the purposes of the above RM sections. I'm sure we all know what's *supposed* to happen, and that R.F2 will point to R. But I'm not quite comfortable with that sort of fuzziness, especially given all the effort that those of you who work on the RM put into making sure things are precise. I think there's a little bit of language missing. Perhaps something needs to be added to 7.5(8.1) like, "The aggregate or function_call shall be constructed directly in the new object; and for the purposes of 8.6(17), the new object [or a corresponding subcomponent of the new object?] is considered to be the object 'associated with the execution that evaluates the usage name', when an expression that contains the current instance of the type [or the current instance of the type of a subcomponent?] is evaluated". **************************************************************** From: Tucker Taft Sent: Tuesday, September 4, 2007 4:12 PM Some discomfort with this area has already been expressed by some members of the ARG. The conclusion seems to be that we should be more explicit about what sorts of things are preserved by the "in-place" assignment. In particular, access values designating the "temporary" object will also designate the more "permanent" object. We'll factor in some of your issues when we tackle that. **************************************************************** From: Robert A. Duff Sent: Friday, September 28, 2007 1:03 PM > My view is that you should not take "build in place" > too literally from a semantic point of view. It > is more of an implementation requirement, where the > the semantics should generally work out as they would > without build-in-place. In particular, the "outer" > object of an extension aggregate doesn't really exist > until the aggregate is fully evaluated. While > evaluating the parent part, the parent part is > a separate object, that just happens to overlap > with the ultimate resting place of the outer object. > > So in my view, until you return from the function, > the return object it is building is definitely of the > result type of the function, and so the tag will > be T1 until that moment. FWIW, GNAT agrees with Tucker's interpretation. In the example below, A_Function returns T1, so it creates an object with T1'Tag. When a call to A_Function is used as the parent part of a T2 aggregate, that's still the case, but the final T2 object has T2'Tag. Makes sense to me. But dispatching within the return statement seems questionable. I wonder if it makes sense to make that a run-time error. (I can think of various ways to implement that without distributed overhead.) ---------------- with Ada.Text_IO; use Ada.Text_IO; with Ada.Tags; use Ada.Tags; package Ext_Agg_Test is type T0 is tagged limited null record; procedure Prim (X : T0); procedure Class_Wide (X : T0'Class); type T1 is new T0 with null record; procedure Prim (X : T1); type T2 is new T1 with null record; procedure Prim (X : T2); end Ext_Agg_Test; package body Ext_Agg_Test is procedure Prim (X : T0) is begin Put_Line ("Prim (T0)"); end Prim; procedure Class_Wide (X : T0'Class) is begin Put_Line ("Class_Wide (T0'Class)"); Put_Line (Expanded_Name (X'Tag)); Prim (X); Put_Line ("Class_Wide (T0'Class) done"); end Class_Wide; procedure Prim (X : T1) is begin Put_Line ("Prim (T1)"); end Prim; procedure Prim (X : T2) is begin Put_Line ("Prim (T2)"); end Prim; end Ext_Agg_Test; procedure Ext_Agg_Test.Test is function A_Function return T1 is begin return Result : T1 do if True then Class_Wide (Result); end if; end return; end A_Function; begin Put_Line ("Calling A_Function"); declare T1_Obj : T1 := A_Function; begin Put_Line ("A_Function returned"); Class_Wide (T1_Obj); end; New_Line; Put_Line ("Calling A_Function again"); declare T2_Obj : T2 := (A_Function with null record); begin Put_Line ("A_Function returned again"); Class_Wide (T2_Obj); end; end Ext_Agg_Test.Test; Here's the output: Calling A_Function Class_Wide (T0'Class) EXT_AGG_TEST.T1 Prim (T1) Class_Wide (T0'Class) done A_Function returned Class_Wide (T0'Class) EXT_AGG_TEST.T1 Prim (T1) Class_Wide (T0'Class) done Calling A_Function again Class_Wide (T0'Class) EXT_AGG_TEST.T1 Prim (T1) Class_Wide (T0'Class) done A_Function returned again Class_Wide (T0'Class) EXT_AGG_TEST.T2 Prim (T2) Class_Wide (T0'Class) done **************************************************************** From: Robert A. Duff Sent: Monday, January 4, 2008 1:11 PM Here's wording for AI05-67. [This is version /02 of the AI - ED.] [Wording omitted.] **************************************************************** From: Tucker Taft Sent: Monday, January 4, 2008 1:31 PM I think we use the term "unspecified" rather than "implementation dependent" in the RM. You start talking about a "thunk" in the AARM note somewhat out of the blue. Is there really a need for a thunk? If so, I think you need to explain what it does. I suspect that some implementations might prefer to create "pseudo" storage pools rather than thunks, though that is hopefully not semantically significant. Overall, the solution you propose seems reasonable. It will probably require some rewording in AI-66 (which I sent around last night). **************************************************************** From: Robert A. Duff Sent: Monday, January 4, 2008 6:25 PM > I think we use the term "unspecified" rather than > "implementation dependent" in the RM. OK. > You start talking about a "thunk" in the AARM note > somewhat out of the blue. Is there really a need > for a thunk? If so, I think you need to explain > what it does. I suspect that some implementations > might prefer to create "pseudo" storage pools > rather than thunks, though that is hopefully not > semantically significant. That "thunk" is an editing mistake. I think I thunk about the GNAT implementation details too much ;-), and then eliminated those details, but missed that one. The basic idea is that a storage pool is passed in. One common "storage pool" is the "allocate-on-secondary-stack" storage pool. I think this is optimized in the GNAT case by passing various flags that bypass the actual pool -- e.g. a flag might say "allocate on secondary stack". I don't remember the details of the GNAT implementation, but anyway, the AARM doesn't need to talk too much about those optimization details. Maybe just mention the possibility. There's one more occurrence of the term "thunk", which I really mean. > Overall, the solution you propose seems reasonable. > It will probably require some rewording in AI-66 > (which I sent around last night). Right, I noticed that, but didn't say anything. I'm glad you say "reasonable". I'm thinking there are two objects, the return object and the newly-created object that it turns into. In my AI writeup, I said "becomes", but I wish there were a more evocative word -- "morphs into" or "magically transforms itself" or (from the minutes) "poofs"? **************************************************************** From: Randy Brukardt Sent: Monday, January 4, 2008 10:29 PM > > I think we use the term "unspecified" rather than > > "implementation dependent" in the RM. > > OK. Either that or "implementation-defined". In this case, "unspecified" is better, I think. > > You start talking about a "thunk" in the AARM note > > somewhat out of the blue. Is there really a need > > for a thunk? If so, I think you need to explain > > what it does. I suspect that some implementations > > might prefer to create "pseudo" storage pools > > rather than thunks, though that is hopefully not > > semantically significant. > > That "thunk" is an editing mistake. I think I thunk about the GNAT > implementation details too much ;-), and then eliminated those details, but > missed that one. The basic idea is that a storage pool is passed in. One > common "storage pool" is the "allocate-on-secondary-stack" storage pool. > I think this is optimized in the GNAT case by passing various flags that > bypass the actual pool -- e.g. a flag might say "allocate on secondary > stack". I don't remember the details of the GNAT implementation, but > anyway, the AARM doesn't need to talk too much about those optimization > details. Maybe just mention the possibility. OK. > There's one more occurrence of the term "thunk", which I really mean. > > > Overall, the solution you propose seems reasonable. > > It will probably require some rewording in AI-66 > > (which I sent around last night). > > Right, I noticed that, but didn't say anything. We'll have to reconcile a number of these AIs when I integrate this stuff. That's what I get paid for, usually. :-) > I'm glad you say "reasonable". I'm thinking there are two objects, the > return object and the newly-created object that it turns into. In my AI > writeup, I said "becomes", but I wish there were a more evocative word > -- "morphs into" or "magically transforms itself" or (from the minutes) > "poofs"? I prefer "morphs into". As long as it's in the notes. "Becomes" sounds too much like normal assignment (":=" is supposed to be read "becomes" after all - 2.2(13)). I'm made these minor changes to the AI that we'll discuss in Florida. (And reworded to use "inherently limited type", otherwise we'd still have the bug from AI05-0059-1.) **************************************************************** From: Edmond Schonberg Sent: Friday, June 13, 2008 1:51 PM [Commenting on version /05 of the AI - ED.] Neat example, package P2 crashes gnat right now :-)! **************************************************************** From: Gary Dismukes Sent: Friday, June 13, 2008 6:11 PM It's not too surprising to me that GNAT chokes on the aggregate in P2, since I believe it's just the sort of case that we discussed in Florida as causing major implementation headaches, enough that we decided to make it illegal. In fact, AI-67 includes the wording change to make this illegal: > Add after 4.3.2(5/2): > > If the ancestor_part is a function call and the type requires > build-in-place (see 7.6), then the ancestor_part shall have a > constrained nominal subtype, unless the record_component_association_list > is *null record*. What's puzzling to me is that this example from the !discussion uses just this very kind of extension aggregate, but there doesn't seem to be any mention that it's illegal by the AI. (The nominal subtype of the ancestor part is P1.T1, which is unconstrained.) Bob, am I missing something? **************************************************************** From: Robert A. Duff Sent: Friday, June 13, 2008 6:51 PM No, just that I worked mainly on the RM wording. I admit I missed that example, and neglected to change it from "horror for implementers" to "illegal". **************************************************************** From: Randy Brukardt Sent: Friday, June 13, 2008 6:51 PM ... > Bob, am I missing something? Perhaps that part of the discussion predates Bob's work? It's a common problem with multiple-author AIs. Or he just forgot to update that part (I saw that with an AI I was working on yesterday, and there were a number of such problems uncovered in the recent editorial review). Anyway, what I remember most about the rule we talked about in Florida was that someone announced at lunch on the veranda that there was some problem with the rule that was likely to cause major problems. I don't know if Bob has adequately covered that. I do know that the rule Bob proposed doesn't work, because there is no such rule that applies in generic private parts (he didn't use the boilerplate, so there is no recheck), and thus the supposedly significant expense still exists in that case. For a rule that is supposed to be eliminating expensive cases, there can be no holes. I'm still dubious about having a complex rule whose sole purpose is to avoid a supposedly annoying implementation. Bob's discussion shows that there a reasonable (maybe annoying) ways to implement this case, and I am dubious that we need this wart. (Of course, Janus/Ada, which doesn't insist on allocating objects contigiously anyway would have no problem with this case. The base memory of an object is always fixed size and never depends on any bounds or discriminants; the rest is allocated when we know the appropriate constraint.) [Note that I'm only talking about the quoted rule; the primary intent of the AI, the early raising of certain errors is fine by me and probably should be allowed in more cases anyway.] **************************************************************** From: Gary Dismukes Sent: Friday, June 13, 2008 7:36 PM > Perhaps that part of the discussion predates Bob's work? It's a common > problem with multiple-author AIs. Or he just forgot to update that > part (I saw that with an AI I was working on yesterday, and there were > a number of such problems uncovered in the recent editorial review). Yes, based on Bob's recent message sounds like it just didn't get updated. > Anyway, what I remember most about the rule we talked about in Florida > was that someone announced at lunch on the veranda that there was some > problem with the rule that was likely to cause major problems. I don't > know if Bob has adequately covered that. It was more than just a lunch discussion (btw, Steve B. was the one who brought it up at lunch). We discussed it during the meeting as part of AI-67, and took a straw poll in favor of adding the restriction (5-0-2). Not sure there was a decision to put it in AI-67, but I guess Bob took the initiative. > I do know that the rule Bob proposed doesn't work, because there is no > such rule that applies in generic private parts (he didn't use the > boilerplate, so there is no recheck), and thus the supposedly > significant expense still exists in that case. For a rule that is > supposed to be eliminating expensive cases, there can be no holes. Doesn't seem like it would be too hard to correct that oversight. :) > I'm still dubious about having a complex rule whose sole purpose is to > avoid a supposedly annoying implementation. Bob's discussion shows > that there a reasonable (maybe annoying) ways to implement this case, > and I am dubious that we need this wart. (Of course, Janus/Ada, which > doesn't insist on allocating objects contigiously anyway would have no > problem with this case. The base memory of an object is always fixed > size and never depends on any bounds or discriminants; the rest is > allocated when we know the appropriate constraint.) There appear to be real hardships for implementing this in both GNAT and the IBM compiler (due to different difficulties in the two compilers). There seemed to be broad agreement during the meeting that this restriction would help, and not be a signficant loss of user functionality (though I think you may have been one of the two abstainers, I can't recall clearly). > [Note that I'm only talking about the quoted rule; the primary intent > of the AI, the early raising of certain errors is fine by me and > probably should be allowed in more cases anyway.] Right, except the r-t checks are AI-50; this is AI-67 we're discussing here. :-) **************************************************************** From: Randy Brukardt Sent: Friday, June 13, 2008 8:15 PM > It was more than just a lunch discussion (btw, Steve B. was the one > who brought it up at lunch). We discussed it during the meeting as > part of AI-67, and took a straw poll in favor of adding the > restriction (5-0-2). Not sure there was a decision to put it in > AI-67, but I guess Bob took the initiative. No, no, no. The discussion I remember is someone stating *after* the rule was adopted that it would not work. I note that the minutes has a note that Steve Baird said something "on Sunday" about generic bodies; perhaps that is the concern that was raised that I am remembering. Perhaps it was at breakfast on Sunday and not lunch (we were on that veranda a lot). > > I do know that the rule Bob proposed doesn't work, because there is > > no such rule that applies in generic private parts (he didn't use > > the boilerplate, so there is no recheck), and thus the supposedly > > significant expense still exists in that case. For a rule that is > > supposed to be eliminating expensive cases, there can be no holes. > > Doesn't seem like it would be too hard to correct that oversight. :) I'm not certain that the boilerplate is the right solution here; I think it would be good for someone actually in favor of the rule to check out whether it works in generics (specs and bodies). In bodies, in particular, it would seem to pretty much disallow extension aggregates (as anything that *might* be build-in-place -- which is any type and surely any formal type -- would trigger the rule). Surely it should apply to any function that returns a descendant of a generic formal type, for instance (I don't understand why Bob only discusses formal functions in his assume-the-worst rule). Such a function could have been inherited from an interface or formal derived type, or be a local function. The problem would still exist for all of those. Anyway, I'm sure we'll discuss this more next week. **************************************************************** From: Gary Dismukes Sent: Saturday, June 13, 2008 1:07 AM > No, no, no. The discussion I remember is someone stating *after* the > rule was adopted that it would not work. I note that the minutes has a > note that Steve Baird said something "on Sunday" about generic bodies; > perhaps that is the concern that was raised that I am remembering. > Perhaps it was at breakfast on Sunday and not lunch (we were on that veranda a lot). OK, sorry. I was thinking of a discussion during Saturday lunch. > I'm not certain that the boilerplate is the right solution here; I > think it would be good for someone actually in favor of the rule to > check out whether it works in generics (specs and bodies). In bodies, > in particular, it would seem to pretty much disallow extension > aggregates (as anything that *might* be build-in-place -- which is any > type and surely any formal type -- would trigger the rule). Surely it > should apply to any function that returns a descendant of a generic > formal type, for instance (I don't understand why Bob only discusses > formal functions in his assume-the-worst rule). Such a function could > have been inherited from an interface or formal derived type, or be a > local function. The problem would still exist for all of those. Well, it couldn't be any formal type, it would have to be limited, but you may be right that it has to be pretty restrictive in generics. > Anyway, I'm sure we'll discuss this more next week. Sorry I won't be there for the discussion -- as it happens I'll be in Wisconsin at that time. I hope the flooding's mostly over by the time we arrive (and that you don't end up getting stuck there!). Have a good meeting! **************************************************************** From: Randy Brukardt Sent: Sunday, June 15, 2008 12:11 AM A tiny nitpick on the following wording: >If the ancestor_part is a function call and the type requires build-in-place >(see 7.6), then the ancestor_part shall have a constrained nominal >subtype, unless the record_component_association_list is *null record*. This means that (F with null record) is always legal, but (F with others => <>) might not be legal, even if <> is not associated with any components. We've worked hard to ensure that these two are always equivalent, and this wording needs to reflect that. We probably need to add: "... is *null record* {or is *others* => <> with no associated components}." **************************************************************** From: Edmond Schonberg Sent: Sunday, June 15, 2008 1:42 AM > We've worked hard to ensure that these two are always equivalent, and > this wording needs to reflect that. We probably need to add: "... is > *null record* {or is *others* => <> with no associated components}." Why not just say "unless the type is a null extension"? **************************************************************** From: Randy Brukardt Sent: Sunday, June 15, 2008 2:01 AM I was thinking there was some reason for this to be syntactical. If not, "unless the type of the extension aggregate is a null extension" would work. (I think it is necessary to identify which type, because it isn't clear whether we're talking about the type of the ancestor part or the type of the aggregate as a whole. The first reference can afford to be sloppy (either works), but this one cannot.) **************************************************************** From: Tucker Taft Sent: Sunday, June 15, 2008 2:32 AM Just because a type is a null extension, the extension aggregate might still have lots of components specified, since the ancestor part need not be of the immediate parent type, but may be a more distant ancestor. I think you could say something like "... unless there are no components needed in the record_component_association_list of the extension aggregate," since that is exactly when *null record* is allowed. ****************************************************************