Version 1.4 of ai05s/ai05-0027-1.txt
!standard A.18.2(239/2) 07-10-28 AI05-0027-1/03
!standard A.18.3(152/2)
!standard A.18.4(75/2)
!standard A.18.7(96/2)
!class binding interpretation 06-11-13
!status work item 06-11-13
!status received 06-11-03
!priority Medium
!difficulty Easy
!qualifier Omission
!subject Behavior of container operations when passed finalized container objects
!summary
If an operation is passed a container object that has been finalized,
it is a bounded error. If the operation is read-only, then either
the operation proceeds as though the container were empty, or
Constraint_Error or Program_Error is raised.
If a finalized container is passed to an operation that would update
the container, Constraint_Error or Program_Error is raised.
!question
What happens in container operations when passed a finalized container object?
!recommendation
(See Summary.)
Add the following after A.18.2(239):
It is a bounded error to call any subprogram declared in Containers.Vectors
when the associated container has been finalized. If the operation takes
Container as an IN OUT parameter, then it raises Constraint_Error or
Program_Error. Otherwise, the operation either proceeds as it would
for an empty container, or it raises Constraint_Error or Program_Error.
Make corresponding additions to all other container packages,
after A.18.3(152), after A.18.4(75) [needs "Bounded Error" label
as well], and after A.18.7(96) [needs "Bounded Error" label].
!discussion
It requires some circumlocution (though see the example for a plausible
situation) to create a situation where an operation can be applied to a
finalized container. Nevertheless, we don't want such an operation
to de-finalize the container, and thereby create storage
leaks or worse. Hence, we do not allow any operation that updates the
container to proceed, and any other operation either raises an exception,
or proceeds as though the container were empty. This minimizes any
distributed overhead associated with handling the finalized case (because
all non-updating operations need not distinguish a finalized state
from an empty state), while ensuring that no storage leak or potentially
erroneous execution is introduced by such "late" operations.
We allow Constraint_Error rather than Program_Error for efficiency reasons as
well, since many of the operations that attempt to update an empty
container will already raise Constraint_Error. It is only the updating
operations that allow for the container to start out empty that will need an
explicit check for the finalized state.
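As a minimal sketch of the distinction drawn above, consider a toy stack
package. It is not the design of any actual Ada.Containers implementation;
the names Toy, Stack, Push, Pop, Length, and the Finalized flag are all
hypothetical. Only Push, which is legal on an empty container, carries an
explicit finalized check. Pop relies on its existing empty check, and Length
simply reports the finalized container as empty.

```ada
with Ada.Finalization;
with Ada.Text_IO;
with Ada.Unchecked_Deallocation;

procedure Finalized_Check_Demo is

   --  Hypothetical toy container, for illustration only.
   package Toy is
      type Stack is new Ada.Finalization.Limited_Controlled with private;
      procedure Push (Container : in out Stack; Item : Integer);
      procedure Pop (Container : in out Stack);
      function Length (Container : Stack) return Natural;
      overriding procedure Finalize (Container : in out Stack);
   private
      type Int_Array is array (Positive range <>) of Integer;
      type Int_Array_Access is access Int_Array;
      type Stack is new Ada.Finalization.Limited_Controlled with record
         Data      : Int_Array_Access;
         Last      : Natural := 0;
         Finalized : Boolean := False;
      end record;
   end Toy;

   package body Toy is
      procedure Free is
        new Ada.Unchecked_Deallocation (Int_Array, Int_Array_Access);

      procedure Push (Container : in out Stack; Item : Integer) is
      begin
         --  Push is legal on an empty container, so it needs the explicit
         --  check; otherwise it would reallocate Data and leak it.
         if Container.Finalized then
            raise Program_Error with "Push on finalized container";
         end if;
         if Container.Data = null then
            Container.Data := new Int_Array (1 .. 16);
         end if;
         if Container.Last = Container.Data'Last then
            raise Constraint_Error with "toy stack is full";
         end if;
         Container.Last := Container.Last + 1;
         Container.Data (Container.Last) := Item;
      end Push;

      procedure Pop (Container : in out Stack) is
      begin
         --  No finalized check: the existing empty check already fires,
         --  because Finalize leaves Last = 0.
         if Container.Last = 0 then
            raise Constraint_Error with "Pop on empty container";
         end if;
         Container.Last := Container.Last - 1;
      end Pop;

      function Length (Container : Stack) return Natural is
      begin
         --  Read-only: a finalized container simply looks empty.
         return Container.Last;
      end Length;

      overriding procedure Finalize (Container : in out Stack) is
      begin
         --  Idempotent, since an object may be finalized more than once.
         if not Container.Finalized then
            Free (Container.Data);
            Container.Last := 0;
            Container.Finalized := True;
         end if;
      end Finalize;
   end Toy;

   S : Toy.Stack;

begin
   Toy.Push (S, 1);
   Toy.Finalize (S);  --  stand-in for a "late" access after finalization
   Ada.Text_IO.Put_Line ("Length:" & Natural'Image (Toy.Length (S)));
   begin
      Toy.Pop (S);
   exception
      when Constraint_Error =>
         Ada.Text_IO.Put_Line ("Pop raised Constraint_Error");
   end;
   begin
      Toy.Push (S, 2);
   exception
      when Program_Error =>
         Ada.Text_IO.Put_Line ("Push raised Program_Error");
   end;
end Finalized_Check_Demo;
```

The explicit call to Finalize only simulates the late access; in real code
the access would come from another object's Finalize routine. The single
Boolean test in Push is the only added cost in this sketch.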
We have suggested adding this paragraph before the discussion of ambiguous
cursors, as we want that discussion to flow right into the discussion of
invalid cursors.
!example
Access to a finalized container can happen if a Finalize routine accesses
a container object declared in the same master as the Finalize routine.
This is more likely than it appears at first glance.
Imagine for a moment that someone needs to instrument a controlled type to
collect some usage information. The program might be structured like:
   package Pack is
      type T is new Ada.Finalization.Controlled with ...
      overriding
      procedure Finalize (Object : in out T);
      type Acc_T is access all T'Class;
      A_List : Acc_T;
   end Pack;
   package body Pack is
      package Usage_History is new Ada.Containers.Vectors (...);
      Finalization_Usage : Usage_History.Vector;
      procedure Finalize (Object : in out T) is
      begin
         -- operations on Finalization_Usage to collect data on the
         -- finalization of objects of T.
      end Finalize;
   end Pack;
When the program finalizes, any allocated objects remaining on A_List will be
finalized when the finalization of the collection for type Acc_T occurs. However,
since finalization occurs in reverse order of declaration, the object
Finalization_Usage will be finalized before the collection for Acc_T. So those
finalizations will be accessing an object that is already finalized.
Note that it doesn't matter where A_List is declared; it only matters where
the access type is declared relative to the container object. It's not unusual
for that type to be declared visibly (this program structure is taken from the
CLAW Windows interface library), and good Ada practice is to hide things in the
body when possible. But doing so can easily cause finalization anomalies (see
AI95-00280 for some examples).
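The outline above can be filled in as a single compilable unit, sketched here
under several assumptions: the packages are nested in a main procedure
(relying on the Ada 2005 rules that permit this), T has a null extension, the
vector holds Natural elements indexed by Positive, and Finalize makes only a
read-only Length call. The name Late_Access_Demo and these instantiation
parameters are hypothetical. By the ordering described above,
Finalization_Usage should already be finalized when Finalize runs for the
allocated object, so under the proposed rule the call either reports a length
of zero or raises Constraint_Error or Program_Error.

```ada
with Ada.Containers.Vectors;
with Ada.Finalization;
with Ada.Text_IO;

procedure Late_Access_Demo is

   package Pack is
      type T is new Ada.Finalization.Controlled with null record;
      overriding
      procedure Finalize (Object : in out T);
      type Acc_T is access all T'Class;
      A_List : Acc_T;
   end Pack;

   package body Pack is
      --  Assumed instantiation; the original elides the actual parameters.
      package Usage_History is new Ada.Containers.Vectors
        (Index_Type => Positive, Element_Type => Natural);
      Finalization_Usage : Usage_History.Vector;

      overriding
      procedure Finalize (Object : in out T) is
      begin
         --  Read-only access to a container declared after Acc_T, which
         --  is therefore finalized before the collection for Acc_T.
         Ada.Text_IO.Put_Line
           ("Length seen by Finalize:"
            & Ada.Containers.Count_Type'Image (Finalization_Usage.Length));
      end Finalize;
   end Pack;

begin
   --  Allocated and never freed, so it is finalized with the collection
   --  for Acc_T when this procedure is left.
   Pack.A_List := new Pack.T;
end Late_Access_Demo;
```

An updating call such as Append in place of Length is the case the proposed
wording requires to raise an exception. An exception propagated out of
Finalize would surface as Program_Error when the procedure is left.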
--!corrigendum A.18.2(239/2)
!ACATS test
!appendix
From: Pascal Leroy
Date: Wednesday, November 8, 2006 6:02 AM
The RM doesn't seem to say what happens when you operate on a finalized
container.
I believe that any subprogram call that involves a finalized container
should raise P_E: there doesn't seem to be any reason to make this
unspecified or erroneous (it's easy enough to detect: just keep a
Finalized bit in the container header), and there doesn't seem to be any
reason to try and define sensible semantics (who plays with finalized
containers anyway?). Such an occurrence is likely to be a bug, and the
user will be glad to hear about it.
This should be nailed down in the RM.
****************************************************************
From: Robert Dewar
Date: Wednesday, November 8, 2006 6:09 AM
I would prefer this to be erroneous, to avoid the extra overhead of
the test. If an implementation decides unconditionally or under
control of some flag that it should make the check, it is free to
do so.
****************************************************************
From: Robert I. Eachus
Date: Wednesday, November 8, 2006 3:56 PM
>This should be nailed down in the RM.
At best this is a bad idea, at worst, a disaster for existing code.
Let's think this through, you want a bit, let's call it 'Finalized in
every container. Shades of 'Terminated in tasks returned from inside
functions. (I guess that was before your involvement with the ARG.)
The AI gave the legalistic obvious answer, and as a result tasking in
the DEC compiler in particular became a lot less useful. We invented
the term pathology to describe that kind of AI, but it as you would
expect took years for the damage to be undone.
Let's look at the properties of a hypothetical 'Finalized attribute.
Without the proposed rule, it is harmless. But add the requirement to
raise P_E and it becomes a disaster. A call to Foo'Finalized will
either return True, or raise Program_Error. Oops! Similarly no
procedure can finalize a container passed as an in out parameter, or
call any procedure or function that would finalize a container passed in
as a parameter.
What do we gain for this? Safety? Hardly. Dave Emery referred to
Storage_Error as a parachute that opens on impact. With this rule
finalized containers become hand grenades that explode when touched.
Adding the attribute, or better, a pair of Finalized functions to
Ada.Controlled might be a nice improvement for the next version of Ada.
That would allow subprograms that could be called with a finalized
object to determine the state of the object. Of course, I can create a
package My_Controlled, that does all that for types descended from it.
That, in some cases would be a safety improvement, and is otherwise
harmless.
Actually I have done something similar to this in a couple of
instances. If you are counting the (maximum) number of controlled
objects of a particular type, you don't want to decrement the count
twice when an object is finalized twice, or decrement the count when an
object is finalized before initialization. Putting a bit in the object
that is set during initialization and reset if set by finalization fixes
all that. Why do you want to count the number of objects of a type?
Usually for optimization purposes. If you know how large the collection
gets, you can use a (fixed size) array with a free list. Yes, that
change means changing the code later on, but the actual change is small,
in number of lines, and often necessary to meet real-time requirements.
(Including those that require all allocation to be done during
initialization.)
****************************************************************
From: Matthew Heaney
Date: Wednesday, November 8, 2006 4:44 PM
> At best this is a bad idea, at worst, a disaster for existing code.
I can't see how this follows from Pascal's suggestion. All he's advocating
is using a state variable to remember whether finalization has occurred for
this container instance. Containers already have state variables to detect
tampering, so what he's asking for is very modest.
In fact, you could use the state variables that already exist. For example in
GNAT each container object has two counters ("busy" and "lock" if you've
perused the sources) to detect tampering with elements and cursors. Right now
when a container object is initialized, those counters get set to 0. When the
container is finalized, we check to ensure that they're 0 (to detect
finalization during Query_Element, etc), and if not to raise Program_Error.
To implement Pascal's suggestion, all you'd have to do is set one or both of
the counters to some nonce value (-1, say) in Finalize. Other container
operations would simply check the counter on entry, and raise PE if the counter
were less than 0.
****************************************************************
From: Robert Dewar
Date: Wednesday, November 8, 2006 6:41 PM
Sounds reasonable to me ... makes me change my mind on this issue, I
think raising PE makes sense.
****************************************************************
From: Matthew Heaney
Date: Wednesday, November 8, 2006 9:19 PM
I am curious, though, about what motivated Pascal's suggestion: under what
circumstances would someone be able to refer to an object that has been
finalized? If finalization happens immediately prior to the object being
destroyed, then how can someone refer to the object at all?
****************************************************************
From: Randy Brukardt
Date: Wednesday, November 8, 2006 9:39 PM
Actually, it's quite easy to create such cases, although the code is pretty
unlikely in practice. (OTOH, cases like it occurred in early versions of
Claw, so it's not impossible for them to happen in real code.) AI-280
discusses some related cases (that AI fixed several bugs in this area) and
has an example. Typically, the access occurs in the finalization routine of
some object declared in the same declarative_part. Remember that all of the
objects of a declarative part are finalized before any of them are destroyed
(the language is quite clear that the memory isn't deallocated for each
object individually).
****************************************************************
From: Robert I. Eachus
Date: Thursday, November 9, 2006 1:36 AM
Matt said:
> I can't see how this follows from Pascal's suggestion. All he's advocating
> is using a state variable to remember whether finalization has occurred for
> this container instance. Containers already have state variables to detect
> tampering, so what he's asking for is very modest.
I thought I made it clear that the problem is not in marking finalized
objects as finalized. Pascal also wants passing a finalized container as
a parameter to raise Program_Error. The problem is that if you have
complex finalization code--the only case I can think of where this would
matter--then you end up juggling hand grenades.
But I just realized that raising Program_Error is silly, and thus falls
under the Dewar rule. Right now there are cases where an object gets
finalized twice. Should a second call to Finalize, raise Program_Error
for a container? After all Finalize is a subprogram with the object
being finalized as a parameter.
That, of course, is not the case I really care about. Often when
finalizing components, especially those with access discriminants, you
need to reference the containing object. The AARM says this is
intentionally supported: Implementation Note: An implementation has
to ensure that the storage for an object is not reclaimed when
references to the object are still possible (unless, of course, the user
explicitly requests reclamation via an instance of
Unchecked_Deallocation). This implies, in general, that objects cannot
be deallocated one by one as they are finalized; a subsequent
finalization might reference an object that has been finalized, and that
object had better be in its (well-defined) finalized state.
Should we allow direct references to components of containers when
passing the container as a parameter will raise Program_Error? That is
my problem. Writing error free finalization code is tough. Why make it
tougher? This is why I said that a class of controlled objects with
such a flag would be fine, but raising Program_Error when passed as a
parameter is nasty. For example, with Pascal's full proposal, I can write:
   begin
      if Some_Container.Finalized
      then null; -- never reached
      else Do_Something;
      end if;
   exception
      when Program_Error => Do_Something_Else;
   end;
But note that I can't wrap this in a procedure. Well I can, but I can't
have the container as a parameter. That is what bothers me about
Pascal's proposal. Figuring out how to write finalization code for
types which need some finalization operations to be done after the
components are finalized is just one case I have run into. (Yes it can
be done, and it isn't all that bad. But you end up finalizing the
components inside the finalization routine for the parent, then letting
the 'normal' finalization call later do nothing. This is why I worry
about what happens in the multiple finalization case. I may have to dig
up a mailbox implementation where I did just this, and in fact needed a
flag in the components to keep the (real) finalization from doing things
twice.)
Randy Brukardt wrote:
>Actually, it's quite easy to create such cases, although the code is pretty
>unlikely in practice. (OTOH, cases like it occurred in early versions of
>Claw, so it's not impossible for them to happen in real code.) AI-280
>discusses some related cases (that AI fixed several bugs in this area) and
>has an example. Typically, the access occurs in the finalization routine of
>some object declared in the same declarative_part. Remember that all of the
>objects of a declarative part are finalized before any of them are destroyed
>(the language is quite clear that the memory isn't deallocated for each
>object individually).
Exactly. AFAIK, this is only an issue where you have either multiple
objects being finalized when the parent subprogram is left, or when a
controlled object has controlled components. Note that the controlled
components of a controlled object are finalized in some (unspecified)
order (consistent with the requirement that components with access
discriminants be finalized first). I had one case where I added an
access discriminant just to force the finalization order. Yes, I could
have checked whether the other component was already finalized--but that
would require an access discriminant. ;-)
****************************************************************
From: Pascal Leroy
Date: Thursday, November 9, 2006 1:50 AM
> To implement Pascal's suggestion, all you'd have to do is set
> one or both of the counters to some nonce value (-1, say) in
> Finalize. Other container operations would simply check the
> counter on entry, and raise PE if the counter were less than 0.
Exactly.
Furthermore, we have to keep in mind that any object may be finalized
multiple times, so the first finalization has to set some piece of
information in the container header so that subsequent finalizations are
just no-ops. As you point out, this could be done by setting a magic
value in the lock field.
****************************************************************
From: Pascal Leroy
Date: Thursday, November 9, 2006 2:10 AM
> Actually, it's quite easy to create such cases, although the
> code is pretty unlikely in practice. (OTOH, cases like it
> occurred in early versions of Claw, so it's not impossible
> for them to happen in real code.)
Well, it can happen, and we probably agree that only the mildly insane
would do such a thing on purpose. Hence the notion that users would love
to hear about that case.
****************************************************************
From: Pascal Leroy
Date: Thursday, November 9, 2006 3:30 AM
> Should a second call to Finalize, raise Program_Error
> for a container? After all Finalize is a subprogram with the object
> being finalized as a parameter.
I am interested in calls to subprograms exported by the container
packages, and Finalize is not one of them (heck, the container may not
even be a controlled type, it might just be "magic"). If an
implementation raises P_E when Finalize is called twice, it just has a bug
because we don't allow exceptions to fly out of random places.
> Should we allow direct references to components of containers when
> passing the container as a parameter will raise
> Program_Error? That is
> my problem. Writing error free finalization code is tough.
> Why make it
> tougher? This is why I said that a class of controlled objects with
> such a flag would be fine, but raising Program_Error when passed as a
> parameter is nasty. For example, with Pascal's full
> proposal, I can write:
>
>    begin
>       if Some_Container.Finalized
>       then null; -- never reached
>       else Do_Something;
>       end if;
>    exception
>       when Program_Error => Do_Something_Else;
>    end;
This doesn't make any sense to me. Containers are private types, so they
don't export components. Therefore, user code cannot do anything like the
above. Now if that code is in the body of a container package you can
obviously access the Finalized component, and if you do that improperly you
might have a bug. Shrug.
****************************************************************
From: Robert Dewar
Date: Thursday, November 9, 2006 5:59 AM
> But I just realized that raising Program_Error is silly, and thus falls
> under the Dewar rule. Right now there are cases where an object gets
> finalized twice. Should a second call to Finalize, raise Program_Error
> for a container? After all Finalize is a subprogram with the object
> being finalized as a parameter.
No, a second finalize call should be ignored
****************************************************************
From: Robert I. Eachus
Date: Thursday, November 9, 2006 10:44 AM
>Furthermore, we have to keep in mind that any object may be finalized
>multiple times, so the first finalization has to set some piece of
>information in the container header so that subsequent finalizations are
>just no-ops. As you point out, this could be done by setting a magic
>value in the lock field.
Pascal and I apparently crossed in the mail, but I see now that he is
only proposing that operations in the container packages raise
Program_Error, with a finalized parameter, not all subprograms.
However, I think my point still stands. The extra code to raise
Program_Error will only have an effect inside Finalization routines.
There will be cases where calling operations on containers makes
sense in (user) finalization routines, usually on user defined
components of containers, probably with access discriminants. (There is
also the case of someone calling Unchecked_Deallocation on a container,
or more realistically on an object that contains a container. But I
think if that happens outside the container package bodies, anything goes.)
****************************************************************
From: Randy Brukardt
Date: Thursday, December 14, 2006 12:31 AM
Ages ago, Matt Heaney wrote:
> I am curious, though, about what motivated Pascal's suggestion: under what
> circumstances would someone be able to refer to an object that has been
> finalized? If finalization happens immediately prior to the object being
> destroyed, then how can someone refer to the object at all?
I was working on the minutes for that AI that came out of this discussion,
and I remembered that I wasn't convinced that it could actually happen for a
container object (because you can't control what happens inside the
container). So I spent a bit of time trying to figure out how it could
happen. It turns out to be more likely than you might expect.
Access to a finalized container can happen if a Finalize routine accesses a
container object declared in the same master as the Finalize routine. How
could that happen in real code?
Imagine for a moment that someone needs to instrument a controlled type to
collect some usage information. The program might be structured like:
   package Pack is
      type T is new Ada.Finalization.Controlled with ...
      overriding
      procedure Finalize (Object : in out T);
      type Acc_T is access all T'Class;
      A_List : Acc_T;
   end Pack;
   package body Pack is
      package Usage_History is new Ada.Containers.Vectors (...);
      Finalization_Usage : Usage_History.Vector;
      procedure Finalize (Object : in out T) is
      begin
         -- operations on Finalization_Usage to collect data on the
         -- finalization of objects of T.
      end Finalize;
   end Pack;
When the program finalizes, any allocated objects remaining on A_List will
be finalized when the finalization of the collection for type Acc_T
occurs. However, since finalization occurs in reverse order of declaration,
the object Finalization_Usage will be finalized before the collection for
Acc_T. So those finalizations will be accessing an object that is already
finalized. ["Finalization of the collection" for an access type is defined
in 7.6.1(11/2).]
Note that it doesn't matter where A_List is declared; it only matters where
the access type is declared relative to the container object. It's not
unusual for that type to be declared visibly (this program structure is
taken from the CLAW Windows interface library), and good Ada practice is to
hide things in the body when possible. But doing so can easily cause
finalization anomalies (some of the cases in AI95-00280 were originally
encountered in CLAW).
This argues strongly for at *least* the resolution decided on by the ARG in
ABQ: Program_Error for tampering operations, bounded error (Program_Error or
acts as empty) for operations that just read from the container.
I've added this example to the otherwise empty AI, so that Tucker doesn't
have to remember to add it when he writes it up 90 minutes before the next
ARG meeting starts. ;-)
****************************************************************