!standard A.18.2(239/2)                               08-05-15  AI05-0027-1/06
!standard A.18.3(152/2)
!standard A.18.4(75/2)
!standard A.18.7(96/2)
!class binding interpretation 06-11-13
!status Amendment 201Z 08-11-26
!status WG9 Approved 08-06-20
!status ARG Approved 9-0-0 06-11-09
!status work item 06-11-13
!status received 06-11-03
!priority Medium
!difficulty Easy
!qualifier Omission
!subject Behavior of container operations when passed a finalized container object

!summary

It is a bounded error if an operation is passed a container object that has
been finalized. If the operation is read-only, then either the operation
proceeds as though the container were empty, or Constraint_Error or
Program_Error is raised. If a finalized container is passed to an operation
that would update the container, Constraint_Error or Program_Error is raised.

!question

What happens in container operations when passed a finalized container object?

!recommendation

(See Summary.)

!wording

Add the following after A.18.2(239/2):

    It is a bounded error to call any subprogram declared in the visible part
    of Containers.Vectors when the associated container has been finalized.
    If the operation takes Container as an IN OUT parameter, then it raises
    Constraint_Error or Program_Error. Otherwise, the operation either
    proceeds as it would for an empty container, or it raises
    Constraint_Error or Program_Error.

Make corresponding additions to all other container packages, after
A.18.3(152/2), after A.18.4(75/2) [needs "Bounded Error" label as well],
and after A.18.7(96/2) [needs "Bounded Error" label].

!discussion

It requires some circumlocution (though see the example for a plausible
situation) to create a situation where an operation can be applied to a
finalized container. Nevertheless, we don't want such an operation to
de-finalize the container, and thereby create storage leaks or worse.
Hence, we do not allow any operation that updates the container to proceed,
and any other operation either raises an exception, or proceeds as though the
container were empty. This minimizes any distributed overhead associated with
handling the finalized case (because all non-updating operations need not
distinguish a finalized state from an empty state), while ensuring that no
storage leak or potentially erroneous execution is introduced by such "late"
operations.

We allow Constraint_Error rather than Program_Error for efficiency reasons as
well, since many of the operations that attempt to update an empty container
will already raise Constraint_Error. It is only the updating operations that
allow for the container to start out empty that will need an explicit check
for the finalized state.

We are suggesting adding this paragraph *before* the discussion of ambiguous
cursors, as we want that discussion to flow right into the discussion of
invalid cursors.

!example

Access to a finalized container can happen if a Finalize routine accesses a
container object declared in the same master as the Finalize routine. This is
more likely than it appears at first glance.

Imagine for a moment that someone needs to instrument a controlled type to
collect some usage information. The program might be structured like:

    package Pack is
       type T is new Ada.Finalization.Controlled with ...
       overriding
       procedure Finalize (Object : in out T);
       type Acc_T is access all T'Class;
       A_List : Acc_T;
    end Pack;

    package body Pack is
       package Usage_History is new Ada.Containers.Vectors (...);
       Finalization_Usage: Usage_History.Vector;

       procedure Finalize (Object : in out T) is
       begin
          -- operations on Finalization_Usage to collect data on the
          -- finalization of objects of T.
       end Finalize;
    end Pack;

When the program finalizes, any allocated objects remaining on A_List will be
finalized when the finalization of the collection for type Acc_T occurs.
However, since finalization occurs in reverse order of declaration, the object
Finalization_Usage will be finalized before the collection for Acc_T. So those
finalizations will be accessing an object that is already finalized.

Note that it doesn't matter where A_List is declared; it only matters where
the access type is declared relative to the collection object. It's not
unusual for that type to be declared visibly (this program structure is taken
from the CLAW Windows interface library), and good Ada practice is to hide
things in the body when possible. But doing so can easily cause finalization
anomalies (see AI95-00280 for some examples).

!corrigendum A.18.2(239/2)

@dinsa
Calling Merge in an instance of Generic_Sorting with either Source or Target
not ordered smallest first using the provided generic formal "<" operator is a
bounded error. Either Program_Error is raised after Target is updated as
described for Merge, or the operation works as defined.
@dinst
It is a bounded error to call any subprogram declared in the visible part of
Containers.Vectors when the associated container has been finalized. If the
operation takes Container as an @b<in out> parameter, then it raises
Constraint_Error or Program_Error. Otherwise, the operation either proceeds as
it would for an empty container, or it raises Constraint_Error or
Program_Error.

!corrigendum A.18.3(152/2)

@dinsa
Calling Merge in an instance of Generic_Sorting with either Source or Target
not ordered smallest first using the provided generic formal "<" operator is a
bounded error. Either Program_Error is raised after Target is updated as
described for Merge, or the operation works as defined.
@dinst
It is a bounded error to call any subprogram declared in the visible part of
Containers.Doubly_Linked_Lists when the associated container has been
finalized. If the operation takes Container as an @b<in out> parameter, then
it raises Constraint_Error or Program_Error.
Otherwise, the operation either proceeds as it would for an empty container,
or it raises Constraint_Error or Program_Error.

!corrigendum A.18.4(75/2)

@dinsa
@xindent<Iterate calls Process.@b<all> with a cursor that designates each node
in Container, starting with the first node and moving the cursor according to
the successor relation. Program_Error is propagated if Process.@b<all> tampers
with the cursors of Container. Any exception raised by Process.@b<all> is
propagated.>
@dinss
@s8<@i<Bounded (Run-Time) Errors>>

It is a bounded error to call any subprogram declared in the visible part of a
map package when the associated container has been finalized. If the operation
takes Container as an @b<in out> parameter, then it raises Constraint_Error or
Program_Error. Otherwise, the operation either proceeds as it would for an
empty container, or it raises Constraint_Error or Program_Error.

!corrigendum A.18.7(96/2)

@dinsa
If Element_Type is unconstrained and definite, then the actual Element
parameter of Process.@b<all> shall be unconstrained.
@dinss
@s8<@i<Bounded (Run-Time) Errors>>

It is a bounded error to call any subprogram declared in the visible part of a
set package when the associated container has been finalized. If the operation
takes Container as an @b<in out> parameter, then it raises Constraint_Error or
Program_Error. Otherwise, the operation either proceeds as it would for an
empty container, or it raises Constraint_Error or Program_Error.

!ACATS test

An ACATS C-Test could be constructed for this rule, using code similar to the
example.

!appendix

From: Pascal Leroy
Date: Wednesday, November 8, 2006  6:02 AM

The RM doesn't seem to say what happens when you operate on a finalized
container. I believe that any subprogram call that involves a finalized
container should raise P_E: there doesn't seem to be any reason to make this
unspecified or erroneous (it's easy enough to detect: just keep a Finalized
bit in the container header), and there doesn't seem to be any reason to try
and define sensible semantics (who plays with finalized containers anyway?).
Such an occurrence is likely to be a bug, and the user will be glad to hear
about it.

This should be nailed down in the RM.

****************************************************************

From: Robert Dewar
Date: Wednesday, November 8, 2006  6:09 AM

I would prefer this to be erroneous, to avoid the extra overhead of the test.
If an implementation decides unconditionally or under control of some flag
that it should make the check, it is free to do so.

****************************************************************

From: Robert I. Eachus
Date: Wednesday, November 8, 2006  3:56 PM

>This should be nailed down in the RM.

At best this is a bad idea, at worst, a disaster for existing code.

Let's think this through, you want a bit, let's call it 'Finalized in every
container. Shades of 'Terminated in tasks returned from inside functions. (I
guess that was before your involvement with the ARG.) The AI gave the
legalistic obvious answer, and as a result tasking in the DEC compiler in
particular became a lot less useful. We invented the term pathology to
describe that kind of AI, but it as you would expect took years for the damage
to be undone.

Let's look at the properties of a hypothetical 'Finalized attribute. Without
the proposed rule, it is harmless. But add the requirement to raise P_E and it
becomes a disaster. A call to Foo'Finalized will either return True, or raise
Program_Error. Oops! Similarly no procedure can finalize a container passed as
an in out parameter, or call any procedure or function that would finalize a
container passed in as a parameter.

What do we gain for this? Safety? Hardly. Dave Emery referred to Storage_Error
as a parachute that opens on impact. With this rule finalized containers
become hand grenades that explode when touched.

Adding the attribute, or better, a pair of Finalized functions to
Ada.Controlled might be a nice improvement for the next version of Ada.
That would allow subprograms that could be called with a finalized object to
determine the state of the object. Of course, I can create a package
My_Controlled, that does all that for types descended from it. That, in some
cases would be a safety improvement, and is otherwise harmless.

Actually I have done something similar to this in a couple of instances. If
you are counting the (maximum) number of controlled objects of a particular
type, you don't want to decrement the count twice when an object is finalized
twice, or decrement the count when an object is finalized before
initialization. Putting a bit in the object that is set during initialization
and reset if set by finalization fixes all that.

Why do you want to count the number of objects of a type? Usually for
optimization purposes. If you know how large the collection gets, you can use
a (fixed size) array with a free list. Yes, that change means changing the
code later on, but the actual change is small, in number of lines, and often
necessary to meet real-time requirements. (Including those that require all
allocation to be done during initialization.)

****************************************************************

From: Matthew Heaney
Date: Wednesday, November 8, 2006  4:44 PM

> At best this is a bad idea, at worst, a disaster for existing code.

I can't see how this follows from Pascal's suggestion. All he is advocating is
using a state variable to remember whether finalization has occurred for this
container instance. Containers already have state variables to detect
tampering, so what he's asking for is very modest.

In fact, you could use the state variables that already exist. For example in
GNAT each container object has two counters ("busy" and "lock" if you've
perused the sources) to detect tampering with elements and cursors. Right now
when a container object is initialized, those counters get set to 0.
When the container is finalized, we check to ensure that they're 0 (to detect
finalization during Query_Element, etc), and if not to raise Program_Error.

To implement Pascal's suggestion, all you'd have to do is set one or both of
the counters to some nonce value (-1, say) in Finalize. Other container
operations would simply check the counter on entry, and raise PE if the
counter were less than 0.

****************************************************************

From: Robert Dewar
Date: Wednesday, November 8, 2006  6:41 PM

Sounds reasonable to me ... makes me change my mind on this issue, I think
raising PE makes sense.

****************************************************************

From: Matthew Heaney
Date: Wednesday, November 8, 2006  9:19 PM

I am curious, though, about what motivated Pascal's suggestion: under what
circumstances would someone be able to refer to an object that has been
finalized? If finalization happens immediately prior to the object being
destroyed, then how can someone refer to the object at all?

****************************************************************

From: Randy Brukardt
Date: Wednesday, November 8, 2006  9:39 PM

Actually, it's quite easy to create such cases, although the code is pretty
unlikely in practice. (OTOH, cases like it occurred in early versions of Claw,
so it's not impossible for them to happen in real code.) AI-280 discusses some
related cases (that AI fixed several bugs in this area) and has an example.
Typically, the access occurs in the finalization routine of some object
declared in the same declarative_part. Remember that all of the objects of a
declarative part are finalized before any of them are destroyed (the language
is quite clear that the memory isn't deallocated for each object
individually).

****************************************************************

From: Robert I. Eachus
Date: Thursday, November 9, 2006  1:36 AM

Matt said:

I can't see how this follows from Pascal's suggestion.
All he is advocating is using a state variable to remember whether
finalization has occurred for this container instance. Containers already have
state variables to detect tampering, so what he's asking for is very modest.

I thought I made it clear that the problem is not in marking finalized objects
as finalized. Pascal also wants passing a finalized container as a parameter
to raise Program_Error. The problem is that if you have complex finalization
code--the only case I can think of where this would matter--then you end up
juggling hand grenades.

But I just realized that raising Program_Error is silly, and thus falls under
the Dewar rule. Right now there are cases where an object gets finalized
twice. Should a second call to Finalize, raise Program_Error for a container?
After all Finalize is a subprogram with the object being finalized as a
parameter.

That, of course, is not the case I really care about. Often when finalizing
components, especially those with access discriminants, you need to reference
the containing object. The AARM says this is intentionally supported:

    Implementation Note: An implementation has to ensure that the storage for
    an object is not reclaimed when references to the object are still
    possible (unless, of course, the user explicitly requests reclamation via
    an instance of Unchecked_Deallocation). This implies, in general, that
    objects cannot be deallocated one by one as they are finalized; a
    subsequent finalization might reference an object that has been
    finalized, and that object had better be in its (well-defined) finalized
    state.

Should we allow direct references to components of containers when passing the
container as a parameter will raise Program_Error? That is my problem. Writing
error free finalization code is tough. Why make it tougher? This is why I said
that a class of controlled objects with such a flag would be fine, but raising
Program_Error when passed as a parameter is nasty.
For example, with Pascal's full proposal, I can write:

    begin
       if Some_Container.Finalized
       then null; -- never reached
       else Do_Something;
       end if;
    exception
       when Program_Error => Do_Something_Else;
    end;

But note that I can't wrap this in a procedure. Well I can, but I can't have
the container as a parameter. That is what bothers me about Pascal's proposal.

Figuring out how to write finalization code for types which need some
finalization operations to be done after the components are finalized is just
one case I have run into. (Yes it can be done, and it isn't all that bad. But
you end up finalizing the components inside the finalization routine for the
parent, then letting the 'normal' finalization call later do nothing.) This is
why I worry about what happens in the multiple finalization case. I may have
to dig up a mailbox implementation where I did just this, and in fact needed a
flag in the components to keep the (real) finalization from doing things
twice.

Randy Brukardt wrote:

>Actually, it's quite easy to create such cases, although the code is pretty
>unlikely in practice. (OTOH, cases like it occurred in early versions of
>Claw, so it's not impossible for them to happen in real code.) AI-280
>discusses some related cases (that AI fixed several bugs in this area) and
>has an example. Typically, the access occurs in the finalization routine of
>some object declared in the same declarative_part. Remember that all of the
>objects of a declarative part are finalized before any of them are destroyed
>(the language is quite clear that the memory isn't deallocated for each
>object individually).

Exactly. AFAIK, this is only an issue where you have either multiple objects
being finalized when the parent subprogram is left, or when a controlled
object has controlled components.
Note that the controlled components of a controlled object are finalized in
some (unspecified) order (consistent with the requirement that components with
access discriminants be finalized first). I had one case where I added an
access discriminant just to force the finalization order. Yes, I could have
checked whether the other component was already finalized--but that would
require an access discriminant. ;-)

****************************************************************

From: Pascal Leroy
Date: Thursday, November 9, 2006  1:50 AM

> To implement Pascal's suggestion, all you'd have to do is set
> one or both of the counters to some nonce value (-1, say) in
> Finalize. Other container operations would simply check the
> counter on entry, and raise PE if the counter were less than 0.

Exactly. Furthermore, we have to keep in mind that any object may be finalized
multiple times, so the first finalization has to set some piece of information
in the container header so that subsequent finalizations are just no-ops. As
you point out, this could be done by setting a magic value in the lock field.

****************************************************************

From: Pascal Leroy
Date: Thursday, November 9, 2006  2:10 AM

> Actually, it's quite easy to create such cases, although the
> code is pretty unlikely in practice. (OTOH, cases like it
> occurred in early versions of Claw, so it's not impossible
> for them to happen in real code.)

Well, it can happen, and we probably agree that only the mildly insane would
do such a thing on purpose. Hence the notion that users would love to hear
about that case.

****************************************************************

From: Pascal Leroy
Date: Thursday, November 9, 2006  3:30 AM

> Should a second call to Finalize, raise Program_Error
> for a container? After all Finalize is a subprogram with the object
> being finalized as a parameter.

I am interested in calls to subprograms exported by the container packages,
and Finalize is not one of them (heck, the container may not even be a
controlled type, it might just be "magic"). If an implementation raises P_E
when Finalize is called twice, it just has a bug because we don't allow
exceptions to fly out of random places.

> Should we allow direct references to components of containers when
> passing the container as a parameter will raise
> Program_Error? That is
> my problem. Writing error free finalization code is tough.
> Why make it
> tougher? This is why I said that a class of controlled objects with
> such a flag would be fine, but raising Program_Error when passed as a
> parameter is nasty. For example, with Pascal's full
> proposal, I can write:
>
>    begin
>       if Some_Container.Finalized
>       then null; -- never reached
>       else Do_Something;
>       end if;
>    exception
>       when Program_Error => Do_Something_Else;
>    end;

This doesn't make any sense to me. Containers are private types, so they don't
export components. Therefore, user code cannot do anything like the above. Now
if that code is in the body of a container package you can obviously access
the Finalized component, and if you do that improperly you might have a bug.
Shrug.

****************************************************************

From: Robert Dewar
Date: Thursday, November 9, 2006  5:59 AM

> But I just realized that raising Program_Error is silly, and thus falls
> under the Dewar rule. Right now there are cases where an object gets
> finalized twice. Should a second call to Finalize, raise Program_Error
> for a container? After all Finalize is a subprogram with the object
> being finalized as a parameter.

No, a second finalize call should be ignored

****************************************************************

From: Robert I. Eachus
Date: Thursday, November 9, 2006  10:44 AM

>Furthermore, we have to keep in mind that any object may be finalized
>multiple times, so the first finalization has to set some piece of
>information in the container header so that subsequent finalizations are
>just no-ops. As you point out, this could be done by setting a magic
>value in the lock field.

Pascal and I apparently crossed in the mail, but I see now that he is only
proposing that operations in the container packages raise Program_Error, with
a finalized parameter, not all subprograms.

However, I think my point still stands. The extra code to raise Program_Error
will only have an effect inside Finalization routines. There will be cases
where calling operations on containers makes sense in (user) finalization
routines, usually on user defined components of containers, probably with
access discriminants. (There is also the case of someone calling
Unchecked_Deallocation on a container, or more realistically on an object that
contains a container. But I think if that happens outside the container
package bodies, anything goes.)

****************************************************************

From: Randy Brukardt
Date: Thursday, December 14, 2006  12:31 AM

Ages ago, Matt Heaney wrote:

> I am curious, though, about what motivated Pascal's suggestion: under what
> circumstances would someone be able to refer to an object that has been
> finalized? If finalization happens immediately prior to the object being
> destroyed, then how can someone refer to the object at all?

I was working on the minutes for that AI that came out of this discussion, and
I remembered that I wasn't convinced that it could actually happen for a
container object (because you can't control what happens inside the
container). So I spent a bit of time trying to figure out how it could happen.
It turns out to be more likely than you might expect.

Access to a finalized container can happen if a Finalize routine accesses a
container object declared in the same master as the Finalize routine. How
could that happen in real code?

Imagine for a moment that someone needs to instrument a controlled type to
collect some usage information. The program might be structured like:

    package Pack is
       type T is new Ada.Finalization.Controlled with ...
       overriding
       procedure Finalize (Object : in out T);
       type Acc_T is access all T'Class;
       A_List : Acc_T;
    end Pack;

    package body Pack is
       package Usage_History is new Ada.Containers.Vectors (...);
       Finalization_Usage: Usage_History.Vector;

       procedure Finalize (Object : in out T) is
       begin
          -- operations on Finalization_Usage to collect data on the
          -- finalization of objects of T.
       end Finalize;
    end Pack;

When the program finalizes, any allocated objects remaining on A_List will be
finalized when the finalization of the collection for type Acc_T occurs.
However, since finalization occurs in reverse order of declaration, the object
Finalization_Usage will be finalized before the collection for Acc_T. So those
finalizations will be accessing an object that is already finalized.
["Finalization of the collection" for an access type is defined in
7.6.1(11/2).]

Note that it doesn't matter where A_List is declared; it only matters where
the access type is declared relative to the container object. It's not unusual
for that type to be declared visibly (this program structure is taken from the
CLAW Windows interface library), and good Ada practice is to hide things in
the body when possible. But doing so can easily cause finalization anomalies
(some of the cases in AI95-00280 were originally encountered in CLAW).

This argues strongly for at *least* the resolution decided on by the ARG in
ABQ: Program_Error for tampering operations, bounded error (Program_Error or
acts as empty) for operations that just read from the container.

I've added this example to the otherwise empty AI, so that Tucker doesn't have
to remember to add it when he writes it up 90 minutes before the next ARG
meeting starts. ;-)

****************************************************************