Version 1.1 of acs/ac-00326.txt

Unformatted version of acs/ac-00326.txt version 1.1
Other versions for file acs/ac-00326.txt

!standard 13.11.4(35/4)          20-03-05 AC95-00326/00
!class confirmation 20-03-05
!status received no action 20-03-05
!status received 20-02-28
!subject Subpools and Deallocation
!summary
!appendix

From: Richard Wai
Sent: Friday, February 28, 2020  1:22 PM

So I was trying to make heads or tales in regards to the interactions between
Unchecked_Deallocate and Unchecked_Deallocate_Subpool.

Let's say you want to make a user-define pool with subpools, and you want to
enable Unchecked_Deallocate with storage reclamation for your subpools (as
permitted by 13.11.4 (35/3)) Root_Storage_Pool_With_Subpools gives us an
Allocate_From_Subpool, but no Deallocate_From_Subpool. The reasoning for this is
that "there is no efficient way for the implementation to determine the subpool
for an arbitrary object" (13.11.4 (10.c/3)). This sounds reasonable in the face
of it, but considered more broadly, this rationale lead me to something more
problematic.

Finalization (7.6.1 (11.1/3)) tells us that "Each nonderived access type T has
an associated collection, which is a set of objects created by allocators of T,
or of types derived from T. " and that "Unchecked_Deallocation removes an object
from its collection".

But there is also an implementation permission (7.6.1 (20.2/3)) that allows the
implementation to "finalize objects created by allocators for an access type
whose storage pool supports subpools (see 13.11.4) as if the objects were
created (in an arbitrary order) at the point where the storage pool was
elaborated instead of at the first freezing point of the access type."

The AARM elaborates that the permission allows that "objects will only need to
belong to the subpool and not also to the collection.". So what then should
happen if that object is subjected to Unchecked_Deallocate? As per 7.6.1
(11.1/3) above, Unchecked Deallocation "removes an object from its collection".
If the implementation permission is taken, and the object is not in the
collection, then it can't be removed. But yet it still must be finalized
(13.11.2 (9/3)), which could leave it belonging to a subpool. The user is
allowed to implement deallocation for a pool with subpools, and is therefore
permitted to reuse the memory reclaimed from Unchecked_Deallocate. So if that
memory is then reused as part of an allocation to a different subpool of the
same pool, what happens if Unchecked_Deallocate_Subpool is then called on the
first subpool? Would it not finalize the object that is now in a different
subpool, or worse attempt to finalize an object of a different type entirely?
That would be bad.

It seems this scenario is a valid case of erroneous execution, but is not
mentioned anywhere. It also seems an obvious problem, and so the implementor can
avoid it in two possible ways, both of which require some relation between the
collection and the subpools, since actually taking the implementation permission
and not including subpool allocations in the collection leads to potential
erroneous execution.

I see three possible implementation approaches that avoid this problem:
1. The implementation keeps all allocated objects in the appropriate collection,
   regardless of subpools, and so knows that when Unchecked_Deallocation is
   called on such an object, it no longer exists, and should not be finalized
   again. This means that the collection would need to track to which subpool an
   object in the collection belongs, so that Unchecked_Deallocate_Subpool can
   finalize the correct objects.

2. The implementation tracks all unfinalized subpools somehow, and uses this to
   remove objects from subpools when Unchecked_Deallocate is called on an object
   that "belongs" to a subpool.

In either case, the implementation will always know, at the point of
Unchecked_Deallocate, to which subpool the object belongs. It needs to know that
in order to avoid erroneous finalization during a subsequent
Unchecked_Deallocate_Subpool. So why not then provide Deallocate_From_Subpool?
If the user is permitted to reclaim memory from a subpool on a call to
Deallocate (which is only called from Unchecked_Deallocate) (13.11.4 (35/3)),
then we should allow both Unchecked_Deallocate and Unchecked_Deallocate_Subpool
to be used on a given pool that supports subpools.

So either we need to:
A.  Define Erroneous Execution to include the use of
    Unchecked_Deallocate_Subpool after a call to Unchecked_Deallocate on an
    object allocated from the same subpool; or,
B.  Require the implementation to actively disown objects from a subpool on a
    call to Unchecked_Deallocate (i.e approach 1/2 of above). If we do this, it
    makes sense to provide Deallocate_From_Subpool, since it would conceivably
    come at no additional cost.

I note that GNAT appears to "do the right thing", and so must be doing something
like "B", though I haven't looked into the specifics.

I'd rather see option B, since GNAT apparently already does this already, the
overhead seems reasonable, and we could get Deallocate_From_Subpool, which would
be really neat.

Am I missing something here?

***************************************************************

From: Tucker Taft
Sent: Friday, February 28, 2020  2:09 PM

A collection is an abstract concept, and it represents all of the objects
created by an allocator for a given access type (or any of its descendants).
There is only one collection per access type, even if there are multiple
subpools. Unfortunately, the AARM implementation note 7.6.1(20.e.3) uses
the term "collection" in a different sense, probably meant to convey a
linked list of objects that hangs off a header associated with the access
type, so the allocated objects that still exist can all be found and
finalized at the end of the scope of the access type.

...
> The AARM elaborates that the permission allows that "objects will only
> need to belong to the subpool and not also to the collection.". So
> what then should happen if that object is subjected to
> Unchecked_Deallocate? As per
> 7.6.1 (11.1/3) above, Unchecked Deallocation "removes an object from
> its collection". If the implementation permission is taken, and the
> object is not in the collection, then it can't be removed.

The problem, as mentioned above, is that this AARM note [7.6.1(20.e/3) -
Editor.] is not using the term "collection" properly, but rather using it as a
short-hand for something like a linked list of allocated objects.  The last
sentence of that note currently says:

   "This is expected to ease implementation, as the objects will only need to
   belong to the subpool and not also to the collection."

Instead, it should probably say:

   "This is expected to ease implementation, as the remaining un-deallocated
   objects will only need to be accessible at run time from the subpool header
   rather than from the overall access type collection header."

Does that resolve your concern?

***************************************************************

From: Richard Wai
Sent: Friday, February 28, 2020  2:38 PM

> A collection is an abstract concept, and it represents all of the objects
> created by an allocator for a given access type (or any of its
> descendants).

Exactly right, but the rules of the RM require that the implementation refer
to that collection to finalize allocated objects that have not been
deallocated, and to modify it (at runtime) when an object is deallocated,
so there must be a real list of some kind.

> There is
> only one collection per access type, even if there are multiple subpools.

Again, there in lies the problem. If the subpools have their own list of
objects to be finalized (besides the collection), and one calls
Unchecked_Deallocation, how does Unchecked_Deallocation know to remove that
object from the subpool's list? Even if it removes that object from the
collection, If it doesn't know how to find the subpool, then deallocating
the subpool will cause an erroneous finalization of an object that
potentially doesn't exist, or has been reused. Alternatively the subpool
at deallocation would need to check each of it's items against an assumed
duplicate entry in the collection, which seems very inefficient. It would
also need to handle the case where a new object was allocated (and thus
added to the collection) into the same piece of memory of a previous
deallocation, which would be hard.

So that brings me back to the crux of it - In order for the implementation to
know at run-time if an object of a subpool has already been deallocated,
Unchecked_Deallocation will need to know which subpool to which an object
belongs. And thus why does the RM also suggest that Deallocate_From_Subpool is
difficult for that reason, if we have made it clear that the implementer needs
to be able to do that anyways.

***************************************************************

From: Randy Brukardt
Sent: Friday, February 28, 2020  3:24 PM

...
> > There is
> > only one collection per access type, even if there are
> > multiple subpools.
>
> Again, there in lies the problem. If the subpools have their own list
> of objects to be finalized (besides the collection), and one calls
> Unchecked_Deallocation, how does Unchecked_Deallocation know to remove
> that object from the subpool's list?

This is not an issue, certainly not for a list-based implementation of
finalization. This is the case for *any* finalization (one does not know exactly
what list it is on). But that is not a problem, since one can remove an object
from a doubly linked list without knowing what list it appears on. So there is
no real problem in implementing a list removal without knowing the list.

Obviously, that can be more expensive than a singly-linked list, but the point
is that a reasonable implementation exists that doesn't need to know where the
object is linked (just that it is linked in exactly one place). Implementers are
free to come up with other schemes if they like.

> Even if it removes that object from the collection, If it doesn't know
> how to find the subpool, then deallocating the subpool will cause an
> erroneous finalization of an object that potentially doesn't exist, or
> has been reused. Alternatively the subpool at deallocation would need
> to check each of it's items against an assumed duplicate entry in the
> collection, which seems very inefficient. It would also need to handle
> the case where a new object was allocated (and thus added to the
> collection) into the same piece of memory of a previous deallocation,
> which would be hard.

Yes, neither of these implementations make any sense. Just use a doubly linked
list, put each object on exactly one list and you're done. The impl-perm is
specifically about allowing that implementation.

> So that brings me back to the crux of it - In order for the
> implementation to know at run-time if an object of a subpool has
> already been deallocated, Unchecked_Deallocation will need to know
> which subpool to which an object belongs. And thus why does the RM
> also suggest that Deallocate_From_Subpool is difficult for that
> reason, if we have made it clear that the implementer needs to be able
> to do that anyways.

As noted above, the implementer does *not* need to do that anyways. I suppose
the program could have a way to find out the subpool by walking the doubly
linked list to find the head, but *that* is potentially very expensive compared
to unlinking a doubly linked list item.

Perhaps the assumption of a doubly linked list as part of finalization should
have been outlined somewhere (it seems an obvious requirement to me, but
apparently not to everyone). Being able to remove arbitrary elements from the
list without walking all over the place or knowing exactly which sublist it is
linked on is a clear requirement (because of unchecked deallocation, but it also
comes up in other cases - at least in our implementation).

***************************************************************

From: Richard Wai
Sent: Friday, February 28, 2020  3:10 PM

So I went ahead and dug into the GNAT internals, and it simply has each
controlled object carry-along a header with a "FM_Node_Ptr" that points to a
particular "Finalization Master". This nodes points to the finalization master
of the particular controlled object. So deallocation of a controlled object
allow it to detach itself from whatever finalization master it happens to be on.


Subpools have their own finalization master, and deallocation always unlinks
from that master. Deallocation of the subpool simply iterates it's own list.

So this is what I thought - it would be trivial to also be able to identify a
particular subpool via the FM_Node_Ptr, if the FM_Node identified the subpool,
and it could since a subpool gets it's own FM_Node.

So I'm really saying, it seems like the omission of Deallocate_From_Subpool is
not very well justified.

***************************************************************

From: Randy Brukardt
Sent: Friday, February 28, 2020  3:34 PM

A basic doubly linked list has the same memory cost, but deletions are cheaper
(there's no need to find the object from the header) and doesn't have the
(cheap) ability to find the header. Indeed, there's no safe way to find the
header at all, since the header isn't a list object, but one could add a header
object to the head of the list in order to have something to find safely. That's
what Janus/Ada uses. (This is all in assembler, so we get to ignore strong
typing for efficiency.)

There's plenty of justification for not having Deallocate_from_Subpool, unless
you are assuming that GNAT's is the only possible implementation. Moreover, it's
not even a useful operation (why worry about what subpool a particular object is
in when you can delete an object from *any* subpool -- getting the subpool wrong
would necessarily be erroneous, which seems like a step in the wrong direction).

Morals: every implementation is slightly different, because there are many
trade-offs and everyone does them differently. And what is cheap in one is
expensive in another.

***************************************************************

From: Richard Wai
Sent: Friday, February 28, 2020  3:36 PM

...
> Perhaps the assumption of a doubly linked list as part of finalization should
> have been outlined somewhere (it seems an obvious requirement to me,
> but apparently not to everyone). Being able to remove arbitrary
> elements from the list without walking all over the place or knowing
> exactly which sublist it is linked on is a clear requirement (because
> of unchecked deallocation, but it also comes up in other cases - at
> least in our implementation).

What if you're de-linking yourself from the head (or tail) of the list.
There needs to be a pointer to the head of the list somewhere to handle
finalization. So if you're going to delink yourself from a list, you better know
where the list is actually held, incase you need to point that to your next. And
that is what GNAT does, the head of the list being the "Finalization_Master"
"node". All controlled objects know where that is, and each subpool has their
own, so an "arbitrary object" that needs to be finalized, can just as easily see
what subpool's list it is on.

And literally as I was writing this, it hit me - what if it's not an object that
needs explicit finalization, like say an Integer. In that case there's no way to
know what subpool it came from.

Talking to oneself in the mirror only gets you so far I guess..

***************************************************************

From: Richard Wai
Sent: Friday, February 28, 2020  3:48 PM

> There's plenty of justification for not having
> Deallocate_from_Subpool, unless you are assuming that GNAT's is the only
> possible implementation. Moreover, it's not even a useful operation (why
> worry about what subpool a particular object is in when you can delete an
> object from *any* subpool -- getting the subpool wrong would necessarily
> be erroneous, which seems like a step in the wrong direction).

Well this is helpful if you want to implement a bounded storage pool with
subpools. You want to be able to re-use memory from a given subpool. Say you
have a Tree where all nodes are allocated from a subpool. When you finalize the
Tree, you can deallocate the subpool. But you want to also be able to delete
nodes from the tree dynamically, and not run out of memory before you're done
with the tree. If you want each Subpool to be statically allocated, say as an
array of Storage_Elements, then you want a particular deallocation of a
particular object from a particular subpool to be freed from the right array, so
you need to know which subpool an allocation belongs. I'd be nice to not have to
scan through a list of address ranges every time, especially if the compiler had
some idea (which I now see is in fact not reasonable).

Regardless, you've helped me spot the errors in my ways ??

***************************************************************

From: Randy Brukardt
Sent: Friday, February 28, 2020  4:01 PM

You can still do this with the regular Unchecked_Deallocation. There's nothing
preventing the implementer of a subpool from including the information about
which subpool it is in the object somehow. We're just not requiring it of the
implementation.

***************************************************************

From: Randy Brukardt
Sent: Friday, February 28, 2020  4:22 PM

> What if you're de-linking yourself from the head (or tail) of the list.
> There needs to be a pointer to the head of the list somewhere to
> handle finalization.

No, not really. The Prev pointer points at the head of the list, which has
the same format as a linked object. (The tail of the list is just a null
ptr in the usual way). (This is why I noted that the implementation is in
assembler, because you can't do this in Ada.)

Every controlled object includes three words at the head: a tag, a next ptr,
and a prev ptr. Deleting a node is just:

    Node.Next.Prev := Node.Prev;
    Node.Prev.Next := Node.Next;

And this works even for the head of the list (assuming the head is laid out
properly). In such a case, Node.Prev points at the head, and the assignment
to Node.Prev.Next changes the head as needed. The actual code has no need to
treat the head as different from any other predecessor node.

We do this in-line, directly after the call to Finalize, for
Unchecked_Deallocation. So we don't want any conditional code there. (Jcode
has REGOBJ and DEREGOBJ opcodes which link and unlink controlled objects from
a chain; the entire finalization scheme is part of the Jcode "virtual
machine").

I don't think there is any special identification of the head, and there is no
Prev in the various heads (note that it will never be used in the above code),
so it's not safe to walk the chain to find the head as it is currently
constituted. That could be fixed if there was a need, but there isn't one
today.

> So if you're going to delink yourself from a list, you better know
> where the list is actually held, incase you need to point that to your
> next.

Nope, not necessary (see above). Now, I agree, if you're going to write this
in Ada or a high-level language, then the requirement is different. Although
you could simply make the head a typical node that is statically allocated
and thus is never deleted (in that case, you don't have to special case
accessing the head). In any case, this is compiler-generated code that we're
talking about, and it needs to be as efficient as possible.

> And that is what GNAT does, the head of the list being the
> "Finalization_Master" "node".

Sure, but that is not necessary unless you are going to make the list a
singly-linked list. Or if you have some other reason to get back to the
head of the list. (There isn't one in general, at least that I know of.
Every controlled object is somewhere on the main finalization list, but it
could be on any of many sublists, so finding it that way would be pretty
expensive.)

> All controlled objects know
> where that is, and each subpool has their own, so an "arbitrary
> object" that needs to be finalized, can just as easily see what
> subpool's list it is on.
>
> And literally as I was writing this, it hit me - what if it's not an
> object that needs explicit finalization, like say an Integer. In that
> case there's no way to know what subpool it came from.
>
> Talking to oneself in the mirror only gets you so far I guess..

There's that, too. ;-)

***************************************************************

Questions? Ask the ACAA Technical Agent