Version 1.6 of ai05s/ai05-0111-1.txt

Unformatted version of ai05s/ai05-0111-1.txt version 1.6
Other versions for file ai05s/ai05-0111-1.txt

!standard 4.8(2)          09-02-15 AI05-0111-1/04
!standard 4.8(3/2)
!standard 4.8(5.3/2)
!standard 4.8(7/2)
!standard 13.11(15)
!class Amendment 08-08-08
!status work item 08-08-08
!status received 08-03-12
!priority Medium
!difficulty Hard
!subject Specifying a pool on an allocator
!summary
(See proposal.)
!problem
One often wants to manage dynamically allocated objects in multiple heaps with different lifetimes. Ada provides this automatically if things can be arranged so that all of the objects created with a given access type have the same lifetime. The finalization of the storage pool associated with the access type provides for the reclamation of the objects allocated using these access types. However, it is common for the access types to be declared at the library level, while there need to be multiple heaps that are reclaimed at a more nested level.
One possible way to support multiple heaps is to allow the storage pool for a dynamically allocated object to be specified explicitly at the point of the allocator. However, it is desirable that there be protections against dangling references, and allowing storage pools of short lifetimes to be specified in an allocator for a long-lived access type clearly does not inherently provide such protection.
!proposal
Allow the storage pool associated with one or more access types to be split into multiple, separately reclaimable "subpools," and allow the particular subpool to be used for a given allocator to be specified at the point of the allocator, with the following syntax:
X := new (Subpool) T'(ABC);
To protect against a subpool being reclaimed while there are still access values outside the subpool designating objects residing within it, a reference count is maintained for each subpool. Various compile-time and run-time checks are used to ensure that as long as there might be access values outside the subpool that designate objects in the subpool, the reference count on the subpool is greater than zero. Once the reference count drops to zero, the subpool may be safely reclaimed. If a set of subpools have references to each other, but there are no external references to any of the subpools, the entire set of mutually-referencing subpools may be reclaimed.
For any given task, a default subpool may be established for a given access type, within a given scope. An allocator without specifying a subpool explicitly for the access type will use the innermost default subpool specification for the type.
DYNAMIC MASTERS
Associated with each subpool is a "dynamic master," similar to the "static" masters associated with the execution of various language constructs. The dynamic master associated with a subpool is the master for all objects residing in the subpool that need finalization. When a subpool is ready to be reclaimed, this is analogous to a static master being completed. Any tasks dependent on the master are awaited, and then once all the tasks are completed or terminable, the objects in the subpool are finalized and the subpool is reclaimed.
SUBPOOL REFERENCES
We propose four kinds of references to a subpool:
1) Static subpool handle 2) Dynamic subpool handle 3) Strong object reference 4) Weak object reference
A "static subpool handle" is a counted reference to a subpool represented by a controlled immutable object declared within a static master. A "dynamic subpool handle" is an immutable, counted reference to a subpool from another subpool. Both kinds of handles, once established, remain in place until the handle is finalized, and are immutable in the sense that they always refer to the same subpool for as long as they exist. Static subpool handles are the objects used in an allocator to identify the subpool for the allocation. A dynamic subpool handle is used to allow access values residing in one subpool to designate objects in another subpool. A dynamic subpool handle is created by calling a language-defined procedure "Allow_Reference (From_Subpool, To_Subpool)," where the subpools are identified by static handles.
We use the terminology that a handle "gives access" to a subpool, as well as its constituent objects and associated master, if it references that subpool directly, or it references a subpool that has a dynamic handle that gives access to the subpool. In other words "gives access" is essentially a closure over the "handle" relationship. Similarly we say that a subpool or master "has access" to another subpool, master, or object if it has a handle that gives access to that subpool, master, or object. A subpool or master also "has access" to itself and its constituent objects. Furthermore a static master "has access" to anything to which an enclosing static master has access. Because handles are immutable, the "gives access" and "has access" relationship are similarly immutable. Given this terminology, we can say that it is "safe" for an access value within a given master to designate an object in a subpool to which that master has access, because the access value cannot "outlive" the subpool.
A "strong object reference" is a combination of a counted reference on a subpool and an access value designating some object to which the subpool has access. For programmatic use, a strong object reference can be converted into a static handle and an access value. A strong object reference can be assigned a new value, given a static handle on a subpool and an access value designating an object to which the handle gives access. When a strong object reference is changed, the reference count is transferred from the original target subpool to the new target subpool. A strong object reference may also be set to be a null reference, designating no subpool or object. This is the default initial value of a strong object reference. A strong object reference may be stored anywhere that the associated access type is defined.
A "weak object reference" is a combination of a managed reference on a subpool and an access value designating some object to which the subpool has access. The reference is "managed" in the sense that if the subpool is about to be finalized, all weak object references designating the subpool are disconnected from the subpool (this can be accomplished efficiently using a level of indirection). For programmatic use, a weak object reference can be converted to a strong object reference, and from there to a handle and access value, though the resulting static handle and access value will be null if the weakly referenced pool has been finalized. Weak object references may be assigned from other weak object references or strong object references. A weak object reference may be stored anywhere that the associated access type is defined.
Subpools with only weak references are made available for reclamation, though they will not be reclaimed immediately if free storage is "plentiful" (exactly how "plentiful" is defined is TBD).
ACCESSIBILITY CHECKS
If an access type is associated with a storage pool that has subpools, then checks are performed on assignment and function return to make sure that an access value never outlives the subpool containing the object it designates (the "designated subpool"). On assignment of an access value to a stand-alone access object, or to a component of a composite object, an accessibility check will be performed to ensure that the master of the assigned object has access to the designated subpool. This includes the case of composite assignment where one or more of the components of the object being assigned are such access values.
When assigning to OUT or IN OUT parameters of an access type, the master of the object is presumed to be the innermost master enclosing the call, so the check ensures that that master has access to the designated subpool. Similarly on function return of an access value, the check is against the innermost master enclosing the call. On assignment to a component of a composite OUT or IN OUT parameter that is passed by reference, the check is against the master of the actual parameter (requiring an additional implicit parameter to identify the master, analogous to the accessibility level of an access parameter). Similarly, on assignment to a component of a composite return object that is built in place, the check is against the actual eventual master of the return object. If the composite parameter is passed by copy, or the return object is not built in place, the check is against the innermost master of the call.
COMPILE-TIME CHECKING
It is the intent that many of these checks can be performed at compile-time. This is enabled by the transitive nature of the "has access" relationship. For example, if a function returns an access value and the return statement is not "inside" some local master having a static handle on some subpool, then no accessibility check is necessary as part of the return. This is because, whatever is the designated subpool, the master immediately enclosing the call must have access to it, since only the presence of a static subpool handle can cause a nested master to have more access than its enclosing master. This property ensures that most access value manipulations need no checks.
It is only if an assignment or function return is "crossing" a master having a static subpool handle that any check need be performed. Furthermore, if the compiler can trace the origin of the access value to a point outside any handle being "crossed," then again, no check is required. For example, if the access value came from a formal parameter, perhaps after following various intermediate indirections, it is always safe to return it, even if the function did have local masters with static subpool handles.
Note that this means that existing Ada code that takes an access value, and follows one or more levels of indirection to retrieve another access value, and then returns it, could be used as is with no additional run-time checks required, since clearly existing code has no explicit use of subpool handles.
RUN-TIME CHECKING
Despite the goal of making most of these checks be performable at compile time, some run-time checking may be necessary. To enable such checks to be performed at run-time, the implementation of a storage pool with subpools must provide an operation "Originating_Subpool (Storage_Pool, Address)" that, given an address returned from "Allocate" from one of its subpools, will return a static handle on the subpool. This might be implemented by storing a reference to the subpool immediately preceding each object allocated from the subpool, or by using separate memory pages for separate subpools, and having a reference to the associated subpool at the start of each page, or by a combination of techniques. An attribute_function, perhaps <access_type>'Subpool(acc_val), might be applicable to a non-null access value of the type, and be equivalent to calling the appropriate Originating_Subpool function on acc_val.all'Address.
Once the designated subpool of a given access value is determined, the required check can be performed. Note that although the implementor of the storage pool is required to provide the Originating_Subpool operation, it is the underlying Ada implementation that actually maintains the reference counts associated with subpool handles and strong references, and the structures associated with weak references. It also performs whatever run-time checks are required. One way to think about it is that the Ada implemention worries about the masters, while the user code can take over allocation from and reclamation of the subpools.
!wording
[**TBD**]
!discussion
PERFORMANCE
To make run-time checks efficient, the Ada implementation may create auxiliary data structures and do additional computing as part of creating new dynamic subpool handles in response to Allow_Reference calls. The intent is that many checks can be done using a simple comparison based on an analogue of the static accessibility levels, and only if the simple check fails would a more complex check be necessary. For example, because subpool handles are immutable, once two subpools reference each other directly or indirectly, they can effectively be treated as a single combined subpool from the point of view of lifetime. That is, neither can be reclaimed until both can be. Using an algorithm such as Tarjan's "union find" algorithm, subpools participating in cyclic dependencies can be combined efficiently, with any one of them acting as the representative of them all. What this means is that an arbitrary directed graph of dependencies between subpools can be reduced to a forest of strict trees between subpool "combines," simplifying the job of relative lifetime determination.
For weak references, we would recommend using an intermediate structure that lives as long as there is any weak reference to a given subpool, even after the subpool itself is finalized. This intermediate structure would have a count of weak references to the subpool, and the subpool and this intermediate structure would have pointers to one another. When the first weak reference to a subpool is created, this intermediate structure could be allocated out of the "main" subpool. It would be released back to the main subpool when the last weak reference was finalized. When the target subpool was finalized, the intermediate structure would be marked so that any future conversions of a weak reference to a strong reference would result in null.
DEFAULT SUBPOOL
In the absence of an explicit subpool specified on an allocator for an access type with subpools, which subpool should be used? The "original" one (i.e. access_type'Storage_Pool). We would recommend allowing a static subpool handle to be marked as a "default allocation handle," and then the innermost such handle associated with the storage pool of the access type would be used. This latter approach would be more flexible, and allow existing code to be used without having to annotate existing allocators with explicit subpools. On the other hand, it can create possibilities for confusion, since the identification of the handle might be far removed from the actual allocator. Another alternative is to base the subpool on the target of the allocator, if it is being stored directly into a component of a subpool-resident object.
The other major advantage of defining a notion of "default subpool" is that implementations could provide a subpool capability without extending the syntax of the language.
!example
As an example of the use of subpools, imagine a long-running web server for a retail establishment, which builds up in-memory representations of historical customer information, current product information, and each active customer's current shopping cart. There is a desire to be able to reclaim this in-memory information when it has not been recently referenced. However, the various representations are interlinked, such that a shopping cart refers to the in-memory representation for the products it contains, and the historical customer information and the current shopping cart might refer to each other.
Although these representations might be made up of multiple heap objects, linked into hash tables, sets, lists, trees, etc., reclamation generally happens at the unit of an entire customer history, shopping cart, or product description, so there is a desire to avoid individual object-by-object deallocation. Hence, we would establish a separate subpool for each separately reclaimable item, namely one for each customer historical record, one for each product description, and one for each shopping cart.
Weak object references could be used in a table indexed by customer name or product name. Before retrieving information from the persistent database, a check would be made in the table to see if there is a weak reference, and whether upon converting into a strong reference it is non-null. If so, then the existing in-memory information can be reused.
If it is null, then the information must be retrieved from the
persistent database, and the appropriate in-memory data structures created, again depositing a weak reference in the table for future reference. Meanwhile strong references would be kept in the web server's table of active sessions, so the information in use by an active session would not be prematurely reclaimed. Finally there would be appropriate dynamic subpool handles or strong object refernces between the subpools representing related shopping carts, customers, and products, ensuring that, for example, as long as a shopping cart remained active the associated customer history record remained active.
Whenever a subprogram would actually need to "walk" through an in-memory data structure, a strong object reference, presumably to the "root" node of the designated data structure, would be converted into a static subpool handle and an access value to the root node. The static subpool handle would reside in a local master, protecting all local use of the access value, and anything reachable from the access value, until the local master was completed and the static subpool handle was finalized.
To avoid reclaiming subpools as soon as the last reference from an active session went away, since the information might be needed again relatively soon, a separate caching task could keep strong references in a list, ordered by recency of use. When the storage pool implementation detects that the total space occupied by all of its subpools exceeds some threshhold, it could request that the caching task start discarding strong references to the least recently used subpools. This presumes the application would notify the caching task on each "significant" use of the subpool, so the recency of significant use could be determined.
Alternatively, as we suggested above at the end of the section on "SUBPOOL REFERENCES," the implementation could automatically defer reclaiming subpools with outstanding weak references, as long as free memory is "plentiful." This would eliminate the need for each application to have its own "caching" task to accomplish the same goal.
!ACATS test
!appendix

From: Tucker Taft
Sent: Wednesday, March 12, 2008  8:03 AM

One thing I keep running into when trying to do more interesting things
with storage pools is the inability to specify the storage pool at the
point of the allocator.  Obviously this only makes sense for general
access types, but given that they can point to objects in different
pools, why not be able to specify the pool on allocation?

To work around this problem, we currently use some seriously groddy,
thread-unsafe mechanisms that essentially specify a "virtual" storage
pool on the access type, and then store into that storage pool a
pointer to the "real" storage pool to use the next time an allocator is
evaluated.

Wouldn't it be a whole lot more reasonable and thread safe to allow
something like:

     X : Gen_Acc_Type := new (pool-to-use) T'(initial_value);

That is, allow the specification of the pool as part of the syntax for
an allocator.

I'll create an AI on this if there is any sympathy for the approach.

****************************************************************

From: Robert Dewar
Sent: Wednesday, March 12, 2008  8:07 AM

I think this would definitely be useful, almost more useful than specifying
on types! In practice all access types are general (I regard pool specific
allocators as a bizarre corner feature of the language that is almosst never
used :-)

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008  9:12 AM

Glad to hear I am not the only one who might find this useful.  I suppose
there ought to be a corresponding variant of Unchecked_Deallocation that
takes a storage pool as an explicit parameter, e.g.:

   generic
      type Object(<>) is limited private;
      type Name is access all Object;
   procedure Ada.Unchecked_Deallocation_From_Pool
      (X : in out Name;
       Pool : in out System.Storage_Pools.Root_Storage_Pool'Class);

This is probably less critical, since it is more common for heap objects to
keep track of what pool they come from.  Furthermore, one of the main reasons
to take over allocation is so you can avoid ever calling unchecked
deallocation.  But it still seems like something like the above should be
available if one can specify an allocator-specific pool.

(Note that I have limited the actual access type to be a general access type
with the use of "all".)

****************************************************************

From: Robert A. Duff
Sent: Wednesday, March 12, 2008 11:41 AM

> I think this would definitely be useful,...

I agree with Tucker and Robert.

I also agree with Tucker's other note about Unchecked_Deallocation.

Maybe I'll be accused of making "best" the enemy of "good enough",   ;-)
but along these lines, I'd also like a way to define a default pool for
all access types in a scope, and/or a way to force all access types in a
scope to have a Storage_Pool clause.

>...almost
> more useful than specifying on types! In practice  all access types
>are general (I regard pool  specific allocators as a bizarre corner
>feature  of the language that is almosst never used :-)

Historical note: Tucker wanted to make all access types "general", so there
would be no need for the extra "all" syntax.
I convinced him to keep the Ada-83 kind.

I've since changed my mind: pool-specific access types are useful, but not
useful enough to warrant the added language complexity of having two kinds
of access types.  Tucker was right all along.

I am also responsible for the bogus term "pool-specific".
The issue is only vaguely related to pools.  But I confused myself into
thinking that pools are more-or-less the same thing as what Ada 83 calls a
"collection".  They're not.

My fault.

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008 12:08 PM

Your memory of history is better than mine.
One thing I do remember, though, was that from an optimizer point of view,
pool-specific access types are preferable, since they don't need to worry
about updates to (non-heap) aliased objects or updates through unrelated
access types.

And if you are going to be using unchecked-deallocation with your own
storage pools, if you only associate a given pool with pool-specific access
types, it gives you an extra level of comfort that the pointer passed to a
pool's deallocate routine was produced by a call on the allocate routine for
that pool.

I also know that the Rational compiler was able to use smaller integers to
represent pool-specific access values if they had an associated Storage_Size.

So I suppose I have changed *my* mind too, that pool-specific access types
aren't all bad... ;-)

But the bigger point in my mind is that associating a pool with an access type
isn't what I usually want to do.  I want to associate a pool with some set
of objects that all have the same lifetime (or more precisely, the same
moment of "death").  And that rarely matches the lifetime of the access
types involved.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  1:47 PM

...
> I am also responsible for the bogus term "pool-specific".
> The issue is only vaguely related to pools.  But I confused myself
> into thinking that pools are more-or-less the same thing as what Ada
> 83 calls a "collection".  They're not.

Would you care to elaboration on "they're not"? I've always viewed pools
as a more powerful version of collections. (Indeed, collections are
implemented as a special predefined kind of storage pool in Janus/Ada.)
If I'm wrong in some fundamental way, this would be as good of a time
as any to find out.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  2:36 PM

> One thing I keep running into when trying to do more interesting
> things with storage pools is the inability to specify the storage pool
> at the point of the allocator.  Obviously this only makes sense for
> general access types, but given that they can point to objects in
> different pools, why not be able to specify the pool on allocation?
...
>
> Wouldn't it be a whole lot more reasonable and
> thread safe to allow something like:
>
>      X : Gen_Acc_Type := new (pool-to-use) T'(initial_value);
>
> That is, allow the specification of the pool as part
> of the syntax for an allocator.

My "language purity" alarm went off with this one. Something about the
solution seems wrong; not sure I can explain it. It seems unsafe, in that
there would be no way to tell what pool something was allocated from, and
thus deallocation would be extremely dangerous.

I'm definitely dubious that the problem is the inability to specify a
storage pool on an allocator. That seems more like a possible solution to a
problem, rather than the problem itself. I'm sure there are problems in this
area, but I guess I'd like to lay out the problems before solving them (if
for no other reason than to ensure that we've really solved them all).

One technical question on this proposal: is this intended to work with
anonymous access types? That would seem to make allocators of them useful,
but also would introduce a number of interesting new issues (such as, can an
allocator with a specified pool be a coextension? If so, you can have
coextensions with parts in different pools, and if not, then it becomes even
less obvious if a particular object is a coextension).

> ... I suppose there ought
>to be a corresponding variant of Unchecked_Deallocation
>that takes a storage pool as an explicit parameter...
>
>This is probably less critical, since it is
>more common for heap objects to keep track of what
>pool they come from. ...

I've never done that, and this is the first time I can recall anyone saying
that they did that. So I'm dubious about the "more common" part. I've
considered having Janus/Ada do that internally for mutable types (it would
solve a number of obscure bugs and leaks), but that wouldn't help the user
any.

>...
>Furthermore, one of the main
>reasons to take over allocation is so you can avoid
>ever calling unchecked deallocation.

Humm, now that seems to be a problem, since it is impractical to do that in
Ada as it stands. (I've given up trying.) Either you end up with a huge
number of type conversions to some global general access type or you end up
with a global pool that isn't able to do any early deallocations. I've
generally used pools to catch double deallocations and the like, and/or to
make allocations more efficient by optimizing a particular size into an
array (other sizes being passed on to the main pool).

My first thought was that anonymous access types are the right solution for
this (as they are the sort of short-lived type needed) but that doesn't
actually work, because you'd still need all of the type conversions. (Which,
BTW, is one of the reasons I hate them so; they don't even solve the problem
that they were advertised to solve [eliminating excess type conversions].
And you can't specify a pool. So you can't actually use them in practice (or
they don't add anything), so why have them?)

So I'm not sure what the right answer is.

> And if you are going to be using unchecked-deallocation
> with your own storage pools, if you only associate
> a given pool with pool-specific access types, it gives you an
> extra level of comfort that the pointer passed
> to a pool's deallocate routine was produced by
> a call on the allocate routine for that pool.

It's a lot more than comfort -- in an absence of Unchecked_Conversions into
a pool-specific type, it is a guarantee. (And there shouldn't be any such
conversions; we'll raise Constraint_Error on the use of any such value if
the type uses the default pool - so it won't work anyway. I'd ban such U_Cs
outright if the language allowed me to do that.)

For the main pool, we can check for deallocations from the stack and raise
Program_Error if they're attempted; that provides (almost) the same level of
safety for a general access type using the default pool. But once you mix
user-defined pools into the type, you're playing Russian roulette if you are
using Unchecked_Deallocation.

Anyway, I don't like the idea, but perhaps there is no better solution to
the problem(s).

****************************************************************

From: Robert A. Duff
Sent: Wednesday, March 12, 2008  2:37 PM

> Could you care to elaboration on "they're not"? I've always viewed
> pools as a more powerful version of collections. (Indeed, collections
> are implemented as a special predefined kind of storage pool in
> Janus/Ada.) If I'm wrong in some fundamental way, this would be as
> good of a time as any to find out.

As I recall, in Ada 83, the "collection" of an access type just means
the set of all objects allocated by allocators returning that type.  (I'm
ignoring derived access types.)  "Collection" is a fairly high-level
concept.  If we had kept the term in Ada 95, I suppose we would use it
to talk about finalization.

"Storage pool", OTOH, is a low-level memory-management concept.  Many
unrelated access types can share the same storage pool; hence it's not
the same thing as "collection".  If we take Tucker's suggestion, then
the reverse is also true -- many storage pools per collection.  (In
fact, through torturous programming, one can already achieve that
in Ada 95, as Tucker hinted.)

I suspect you're using "collection" to refer to 'Storage_Size clauses.
As such, I don't think you're doing anything wrong (except perhaps
misusing the term ;-) ).

...all of which reminds me: if we take Tucker's suggestion, it's worth
thinking about finalization, and perhaps awaiting task termination.
One useful thing to do with pools is to avoid Unchecked_Dealloc on
individual objects.  The pool collects objects together into "memory
pages" or whatever, and the entire pool can be wiped out with one call
(either explicit, or triggered by finalization of the pool).  Makes
deallocation very efficient, and very simple (if all objects in the
pool have roughly the same lifetime).  But this doesn't quite work
if the objects have finalization.  (And of course formally, ALL
objects have finalization according to the RM -- one must hope
that the implementation doesn't actually do anything if the
finalization is a no-op.)

Not a big deal, but worth thinking about.

****************************************************************

From: Stephen W Baird
Sent: Wednesday, March 12, 2008  2:08 PM

> I also know that the Rational compiler was able to use smaller
> integers to represent pool-specific access values if they had an
> associated Storage_Size.

You're going way, way back with this one.
Back in the days when Rational's Ada development environment was
hosted on proprietary hardware, the default representation for a
value of an access type was as a (bit) offset relative to the
beginning of a chunk of memory associated with the access type.
A storage_size representation clause could restrict the range of
these offsets and would therefore allow a smaller representation
for access values. This, coupled with the R1000's aggressive bit
packing policies, meant that a storage_size specification for an
access type with no other accompanying black magic could result
in an array of access values having a component size of, say, 23 bits.

Getting back to the original question about specifying a pool for
an allocator, it might be worth noting that Rational felt the need
to invent an Ada83 solution to this problem. Like many pragma-based
solutions to problems which really require language support, it was
somewhat ugly. The solution included a pragma Heap, which occurred
immediately after a statement or declaration which included an
allocator, and which specified the "storage pool" (although that
was not the terminology that was used at the time) for the
allocator(s). There is no need to point out the various problems
with this approach.

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008  3:41 PM

> My "language purity" alarm went off with this one. Something about the
> solution seems wrong; not sure I can explain it. It seems unsafe, in
> that there would be no way to tell what pool something was allocated
> from, and thus deallocation would be extremely dangerous.

This is already an issue in Ada 95, because any general access value
could have originated from any allocator associated with any access type
that has a compatible designated type, or for that matter from 'Access
of an appropriate level aliased object.  Presuming we restrict this to
(named!) general access types, it is not introducing any more danger
than already exists.  Any work-around for this is going to be just as
dangerous, presuming you use some kind of level of indirection in your
storage pool, where you change the "real" storage pool being used during
execution.

> I'm definitely dubious that the problem is the inability to specify a
> storage pool on an allocator.

I'm not. ;-)  I am a big believer in using something approximating a
mark/release approach to allocation, or equivalently, a "region-based"
approach. That is, you allocate objects from a region or heap one at
a time, and then you throw them all away in "one swell foop."  This
approach may not be too useful for some kinds of applications, but
applications that are vaguely like a compiler (which most of the
ones I develop are), generally build up a tree one node at a time,
and then at some point have no need for the entire tree.  Walking the
tree just to free it one node at a time is a pain, and probably means
that you need to have decent a defragmenting storage management system,
etc.

When it comes to freeing the tree, I'm not freeing
*all* trees, just the entire tree associated, say, with a particular
compilation unit.  There is no natural way in Ada to associate that one
particular tree with a storage pool, unless you play thread-unsafe games
with levels of indirection, etc.

If this capability existed, then I could have a factory routine which
took the pool as a parameter, or perhaps retrieved it from the "header"
of the overall tree, or wherever.  It would then use that pool to do the
allocation.

> ...
> That seems more like a possible solution to a
> problem, rather than the problem itself. I'm sure there are problems
> in this area, but I guess I'd like to lay out the problems before
> solving them (if for no other reason than to ensure that we've really solved them all).

The problem is that you want the pool to be associated with a set of objects,
not with a set of access types.

Another place this comes up is with containers.  I really want to pass in the
container to use when I create a hash table.  Different hash tables originating
from the same generic instantiation might very well want to use different
storage pools.  Essentially I want the storage pool to be identified via
an access discriminant of the container.  When the hash table goes away,
I can easily reclaim all the storage used to represent it.

> One technical question on this proposal: is this intended to work with
> anonymous access types?

It depends on the kind of anonymous access types.  A coextension is allocated
from the same storage pool as that used for the enclosing object, so if that
object were allocated using this new capability, then all the coextensions
would be allocated from the pool specified in the allocator.  I suppose
specifying a storage pool an allocator used to initialize an access
discriminant would be a way of effectively breaking the coextension
relationship, just as though you had wrapped the allocator in a qualified
expression for some named access type.

If we are talking about an anonymous allocator passed to an access parameter,
then that uses a local storage pool that goes away when the subprogram
returns.  I can't see much value in overriding that, though I suppose there
might be reasons to do so.

If we are talking about the new kinds of anonymous access types introduced
in Ada 2005 for components, then I suppose it could be useful.

If we are talking about access return types, then the current AI proposes
that a return of an anonymous allocator uses the storage pool determined
by the context of use.  I suppose overriding the pool in that case again
would be roughly analogous to putting a qualified expression around the
allocator for some particular named access type, again breaking any
coextension relationship that might have been produced in the absence
of the explicit pool specification.

> ... That would seem to make allocators of them useful,
> but also would introduce a number of interesting new issues (such as,
> can an allocator with a specified pool be a coextension? If so, you
> can have coextensions with parts in different pools, and if not, then
> it becomes even less obvious if a particular object is a coextension).

Part and parcel of the definition of a coextension is that it has the
same lifetime of the object that designates it.
Hence, as suggested above, if you specify an explicit pool you would
be preventing a coextension relationship, just as if you had wrapped
the allocator in a qualified expression for a named access type.

>> ... I suppose there ought
>> to be a corresponding variant of Unchecked_Deallocation that takes a
>> storage pool as an explicit parameter...
>>
>> This is probably less critical, since it is more common for heap
>> objects to keep track of what pool they come from. ...
>
> I've never done that, and this is the first time I can recall anyone
> saying that they did that.

Well it is quite common in the C++ world that "delete" is designed to
work no matter what "arena" the object is allocated from.  In fact,
there isn't much choice, since "delete" can't be passed any extra
parameters in C++.  Somewhat analogously, "free" in C doesn't take a
length.  The length is squirreled away in front of the object usually.

> ...
> So I'm dubious about the "more common" part. I've
> considered having Janus/Ada do that internally for mutable types (it
> would solve a number of obscure bugs and leaks), but that wouldn't
> help the user any.

I'm not proposing to *require* it, but it is always an option for
a user-defined storage pool.  In fact, the whole point of proposing the
additional version of unchecked deallocation is to put the burden
on the user rather than the implementor of the storage pool when that
is appropriate.  In Ada 95, the user has to decide what type to use
for Unchecked_Deallocation, even if there is a lot of
access-type-to-access-type conversion being performed in the
application.  I know our implementation of finalization of
heap-allocated objects as part of Unchecked_Deallocation goes out
of its way to survive in the presence of access-type-to-access-type
conversion.

>> ...
>> Furthermore, one of the main
>> reasons to take over allocation is so you can avoid ever calling
>> unchecked deallocation.
>
> Humm, now that seems to be a problem, since it is impractical to do
> that in Ada as it stands. (I've given up trying.)

Maybe you are misinterpeting what I am saying.
We pretty much never call Unchecked_Deallocation in our static analysis
tool.  We rely on en-masse deallocation of an entire storage pool to
reclaim storage before the access type goes out of scope.

Of course this doesn't work if the objects need any sort of
finalization, so we have to be careful about that.  Sometimes we even
go to the trouble of calling Unchecked_Deallocation on each object
that might need finalization, even though we know the underlying pool
actually does nothing when you call "Deallocate."

> ...
> Anyway, I don't like the idea, but perhaps there is no better solution
> to the problem(s).

It may be you have never encountered the issue.

But in my experience, eliminating overhead in storage management is
absolutely critical to the performance of our kind of tool.  Garbage
collection is much too slow (even the high end, real-time, incremental,
blah, blah, blah versions), and even malloc/free are significantly
slower, due to the fragmentation problems, first-fit vs. best-fit
list walking, etc.

Only something approximating mark/release have produced the kind of
performance we require.

And then we end up wasting some of that on silly levels of indirection
through virtual storage pools.

****************************************************************

From: Robert A. Duff
Sent: Wednesday, March 12, 2008  6:48 PM

> That is, you allocate objects from a region or heap one at a time, and
> then you throw them all away in "one swell foop."  This approach may
> not be too useful for some kinds of applications, but applications
> that are vaguely like a compiler (which most of the ones I develop
> are), generally build up a tree one node at a time, and then at some
> point have no need for the entire tree.

Right.  I'd like to point out that this "tree" (or whatever data structure
it is) consists of many different record types, linked together via many
different access types.  And the same access types are used in
different trees -- that's why per-type pools don't work in this case.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  7:55 PM

> This is already an issue in Ada 95, because any general access value
> could have originated from any allocator associated with any access
> type that has a compatible designated type, or for that matter from
> 'Access of an appropriate level aliased object.  Presuming we restrict
> this to (named!) general access types, it is not introducing any more
> danger than already exists. Any work-around for this is going to be just
> as dangerous, presuming you use some kind of level of indirection in your
> storage pool, where you change the "real" storage pool being used
> during execution.

I agree that a workaround is just as dangerous, that's why I want to look
for a better solution that is not dangerous.

And while I agree that Ada 95 has this problem, it is one that is very
bounded to named general access types with a user-defined storage pool
(or with type conversions from a pool-specific access type that has such
a pool whose memory is allocated in the main heap) *and* that has an
instance of Unchecked_Deallocation.

As I discussed previously, in Janus/Ada, using Unchecked_Deallocation on
the default pool raises Program_Error if given an access value outside of
the main heap. Thus, using Unchecked_Deallocation on a general access type
(that uses the default pool) is almost as safe as using it in a
pool-specific type. (The only new issue is that a access-to-an-aliased-component
of a heap object would sneak through.)

[The reason for this check is to avoid heap corruption long enough to provide
an error message; otherwise, we've seen programs where the exception handling
mechanism fell over because of the corruption. Besides, we're already
tracking the heap bounds for the dereference check for pool-specific types
(an Ada 83 feature that we've preserved because we didn't want moving to
Ada 95 to lose functionality) and thus the check is essentially free.]

It would make perfect sense to warn on the declaration of such an instance
of Unchecked_Deallocation for a general access type that has a
user-defined pool. Indeed, I'm not sure why we don't have such a warning:
"Use of this deallocator is potentially unsafe as deallocation of objects
that were not originally allocated cannot be detected." Probably just
never thought about this in detail.

(Indeed, it would be possible to make many user-defined storage pools safe
in this way as well, but it would not be clear whether that sort of thing
is best handled by the compiler or by the user-defined pool.)

However, with your proposal, this danger now spreads to all (named) general
access types, because it would take full program analysis to prove or
disprove whether there is a danger. That means that this is no longer
something that the compiler can help with (and giving the above warning
on all general access types is not helpful).

...
> > One technical question on this proposal: is this intended to work
> > with anonymous access types?
>
> It depends on the kind of anonymous access types. ...
[Lots more text].

You could have just said is "Yes". ;-) I've pretty much convinced myself
that if it is a good idea at all, it ought to be allowed on anonymous
access types - it would make them a lot more flexible.

> > ... That would seem to make allocators of them useful, but also
> > would introduce a number of interesting new issues (such as, can an
> > allocator with a specified pool be a coextension? If so, you can
> > have coextensions with parts in different pools, and if not, then it
> > becomes even
> > less obvious if a particular object is a coextension).
>
> Part and parcel of the definition of a coextension is that it has the
> same lifetime of the object that designates it.
> Hence, as suggested above, if you specify an explicit pool you would
> be preventing a coextension relationship, just as if you had wrapped
> the allocator in a qualified expression for a named access type.

This is what I hate about coextensions. You mentioned at the last meeting
that the way Janus/Ada allocates objects means that all objects with
dynamic-sized array components act like coextensions. And that is true,
except for one major difference: *every* such object acts the same way.
There is no wishy-washy "maybe it is and maybe it isn't" and changing
a couple of characters here or there changes the status.

With the semantics you are suggesting, you couldn't even tell
syntactically whether it is or is not a coextension (it would depend on
what pool is given in the allocator). Yuck. (Yuck is not strong enough
of a word for this, but I don't want to spend 10 minutes finding a
thesaurus in order to find a better word...)

> > ... So I'm dubious about the "more common" part. I've considered
> > having Janus/Ada do that internally for mutable types (it
> > would solve a number of obscure bugs and leaks), but that wouldn't
> > help the user any.
>
> I'm not proposing to *require* it, but it is always an option for a
> user-defined storage pool.

I certainly would consider doing it as part of the implementation;
I can't stomach the lack of safety and at least that would reduce it.
(It's annoying that we'd have to check and raise an exception rather
than doing the right thing, but I suppose that would be for the best -
it would be more portable.)

> In fact,
> the whole point of proposing the additional version of unchecked
> deallocation is to put the burden on the user rather than the
> implementor of the storage pool when that is appropriate.

I think it makes just as much sense to put the burden on the compiler.
Why are we messing with all of this horrible accessibility stuff if
then we're going to immediately turn around and say "it's all erroneous
anyway!" If Ada is the language for safe programming, we need to act
that way.

It would make sense to allow the user to suppress such checks (as
with all language-defined checks), but to have the default to be
unsafe seems bad.

> In Ada 95, the user has
> to decide what type to use for Unchecked_Deallocation, even if there
> is a lot of access-type-to-access-type conversion being performed in
> the application.  I know our implementation of finalization of
> heap-allocated objects as part of Unchecked_Deallocation goes out of
> its way to survive in the presence of access-type-to-access-type
> conversion.

You have to do that! Not sure how that "is going out its way", unless
you want to appeal to erroneousness and allow unrelated stuff to fail.
(You'd be right vis-a-vis the letter of the standard, but users wouldn't
be too happy.)

> >> ...
> >> Furthermore, one of the main
> >> reasons to take over allocation is so you can avoid ever calling
> >> unchecked deallocation.
> >
> > Humm, now that seems to be a problem, since it is impractical to do
> > that in
> > Ada as it stands. (I've given up trying.)
>
> Maybe you are misinterpeting what I am saying.

No, you stopped reading before I explained what I meant. I think we
actually agree. I said that it takes heroic efforts to do that (and I'm not
willing to go to those efforts), and you said (elsewhere) that it takes you
heroic efforts to do so (and that's why you're proposing this language
change). The only difference is that you are willing to go to those efforts
and I'm not.

...
> > Anyway, I don't like the idea, but perhaps there is no better
> > solution to the problem(s).
>
> It may be you have never encountered the issue.

Say what? What does that have to do with not liking the idea? Just
because there is a problem that is hard to solve in Ada doesn't mean
that we should throw out all of the safety built-into the language and
particular implementations of the language in order to solve the problem.

I'm not against solving the problem. I just don't like this particular
form of the proposed solution to the problem. Sheesh.

> But in my experience, eliminating overhead in storage management is
> absolutely critical to the performance of our kind of tool.  Garbage
> collection is much too slow (even the high end, real-time,
> incremental, blah, blah, blah versions), and even malloc/free are
> significantly slower, due to the fragmentation problems, first-fit vs.
> best-fit list walking, etc.
> Only something approximating mark/release have produced the kind of
> performance we require.
> And then we end up wasting some of that on silly levels of indirection
> through virtual storage pools.

OK. But claiming that "you've never encountered the problem" has nothing to
do with this.

Besides, I *have* encountered the problem, way back in the early days of Ada
83 and Janus/Ada. I see that I solved it (in our particular case) on
October 18, 1985. The solution is the obvious one, in fact: if
Unchecked_Deallocation is too slow, then don't call it!

We put the nodes (more than 5 different kinds) on a bounded free chain
when we're done with them. (It's bounded so that an unusual program doesn't
soak up all of the memory in one kind of node, but that rarely happens.)
It only takes 5 machine instructions to do so, so it can hardly be a
noticeable overhead until there are billions of nodes. Moreover, allocation
off of the free list is actually faster than normal allocation, so there
is a win on that end as well. The bounds were set by experimentation
with lots of test programs, so that exceeding the bounds is rare.

When the compiler pass terminates, the entire program exits (each pass
is a separate program), so there is no need to every actually release
the nodes on the free chain. We do have an "emergency dump-all" routine that
dumps all of the free chains if we run out of memory allocating nodes
(that occurred frequently back on the MS-DOS host with 640K of RAM, but I
doubt it has ever happened on any of the larger hosts).

Indeed, we tried a mark-release scheme early on (it was built-into the
Pascal compiler we started with). It was just a huge tangle of bugs, we
were forever walking into stuff that had been deallocated. We eventually
gave it up as not worth the hassle.

Obviously, this scheme wouldn't work as well if we needed to do major mode
shifts during the compiler passes, but even then it would probably work
pretty well (it cut the number of deallocations in our compiler by a
factor of 100). And the reuse of the nodes reduces the number of crashes
due to dangling pointers, but probably increases the number of latent
bugs caused by using a reused node instead.

Still, obviously one size does not fit all when it comes to memory
allocation. I'm just not interested in sacrificing some safety to solve
this (or any other, for that matter) problem.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  8:06 PM

> Right.  I'd like to point out that this "tree" (or whatever data
> structure it is) consists of many different record types, linked
> together via many different
> access types.  And the same access types are used in different trees
> -- that's why per-type pools don't work in this case.

Same here. I detailed how I solved that (in 1985!) for Janus/Ada.
Perhaps if I would have had storage pools back then, I'd have
tried a different solution. But in all honesty, the simpler solution
is good enough, and I've reused it in a number of other contexts
since with success.

That is to say, I'm not sure that "storage pools" is the right
abstraction to solve this problem. I realize that once you have
them, the temptation is great to map everything into them. Anyway,
some food for thought.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  8:40 PM

...
> As I recall, in Ada 83, the "collection" of an access type just means
> the set > of all objects allocated by allocators returning that type.
> (I'm ignoring derived access types.)  "Collection" is a fairly high-level
> concept.

Humm, I didn't recall that. (The term is not in the Ada 83 index, which
doesn't help any.) The Ada 83 Rationale uses the term the way you describe,
but I can't find anything in the Ada 83 standard that does. (Indeed, the
only uses of the term I can find are in 4.8(9) and 13.2(8), which is
defining 'Storage_Size.)

> If we had kept the term in Ada 95, I suppose we would use it to talk
> about finalization.

We put it back (in a sense), to talk about finalization. See 7.6.1(11/2).

> "Storage pool", OTOH, is a low-level memory-management concept. Many
> unrelated access types can share the same storage pool; hence it's not
> the same thing as "collection".  If we take Tucker's suggestion, then
> the reverse is also true -- many storage pools per collection.
>  (In fact, through torturous programming,
> one can already achieve that in Ada 95, as Tucker hinted.)

I guess you're right if you take a narrow view of the meaning of
"collection". That is, that it is the objects of only a single type.
Period. Surely that is true in Ada 83 (no overlapping collections is
mentioned in the Rationale). But I see this as simply something that
was expanded in Ada 95 (multiple types can share their collections -
surely this is true for derived access types). And that the term was
changed to be more accurate.

> I suspect you're using "collection" to refer to 'Storage_Size clauses.
> As such, I don't think you're doing anything wrong (except perhaps
> misusing the term ;-) ).

That was the only visible meaning of the term in Ada 83.

> ...all of which reminds me: if we take Tucker's suggestion, it's worth
> thinking about finalization, and perhaps awaiting task termination.
> One useful thing to do with pools is to avoid Unchecked_Dealloc on
> individual objects. The pool collects objects together into
> "memory pages" or whatever, and the entire pool
> can be wiped out with one call (either explicit, or triggered by
> finalization of the pool).  Makes deallocation very efficient, and
> very simple (if all objects in the pool have roughly the same
> lifetime).  But this doesn't quite work if the objects have
> finalization.  (And of course formally, ALL objects
> have finalization according to the RM -- one must hope that the
> implementation doesn't actually do anything if the finalization is a
> no-op.)
>
> Not a big deal, but worth thinking about.

Actually, it is a very big deal (assuming non-trivial finalization).
The current rules ensure that the pool always outlives the type, so
finalization is a non-problem. More generally, the current Ada rules
require that the *memory* always outlives the Ada objects, so there
never is a problem with dangling pointers during finalization.

This change completely destroys that property (at least it does in
the intended use that Tucker is suggesting). Now, obviously, we can
simply declare that if the pool goes away early, the program is
erroneous. But that means that any attempt to solve Tucker's problem
is going to necessarily be erroneous. We could try to formally define
what "trivial finalization" is, and declare that if the pool goes
away early, the program is erroneous. But that's a maintenance
hazard and means that you can't use this mechanism generally (if you
are using types from some outside [independently developed] library,
you can't know or depend on the sort of finalization that it uses).

A better solution would be to define some mechanism to force early
finalization. (After all, I'm on record as saying that the vast
majority of new types should have non-trivial finalization.) But if
we have to do that, perhaps we could combine that with the
storage pool so that we don't have to lose the safety that we
currently have.

Not sure what the solution needs to be. But what I don't want to
see is lots of new ways to write erroneous programs. (Or old ways
that are ten times more likely to occur, either.)

One last point: we still need to fix Ada finalization to allow
garbage collection. Perhaps there is some way to do both (they
surely are closely related). At least if compilers were not
*required* to do something impossible (finalize an object after
the memory is gone), a "safe" compiler could take steps to
eliminate the problem. (Rather than crashing or raising an
exception that could not be avoided.)

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008  9:38 PM

> One last point: we still need to fix Ada finalization to allow garbage
> collection. Perhaps there is some way to do both (they surely are
> closely related). At least if compilers were not *required* to do
> something impossible (finalize an object after the memory is gone), a
> "safe" compiler could take steps to eliminate the problem. (Rather
> than crashing or raising an exception that could not be avoided.)

Good point.  One approach to avoiding the finalization issue is to
require that storage pools used with this new syntax be required to
have some additional operations related to finalization.  In our
Ada 95 compiler, for each access type that might designate objects
requiring finalization, we maintain a (circularly-linked) list of
allocated objects that need finalization.  An allocator adds the
object to the list if it needs finalization, and
Unchecked_Deallocation removes the object from the list if
it needs finalization.   When the access type itself is finalized,
we walk the list finalizing all of the objects that remain on it.

If any early memory reclamation is done on pools, in Ada 95 or with
this new proposal, you have to hope you don't have objects that
need finalization, because that circularly-linked list is going to
have dangling references.  If instead we expected storage pools to
provide operations to add and remove objects from such a list, and
we require the storage pool object to never outlive the access types
whose objects it stores, so that when the storage pool itself is
finalized, it can free any remaining objects, then we can avoid
this dangling reference problem.  Furthermore, if the storage pool
doesn't want to handle objects needing finalization, it can simply
raise an exception if the "add-to-finalization-list" operation is
called.  Mark/release can then be handled safely with respect to
finalization, by simply doing a "mark" on this finalization list,
and at release finalizing all of the objects added to the list
since the mark.

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008  9:13 PM

> I agree that a workaround is just as dangerous, that's why I want to
> look for a better solution that is not dangerous.
>
> And while I agree that Ada 95 has this problem, it is one that is very
> bounded to named general access types with a user-defined storage pool
> (or with type conversions from a pool-specific access type that has
> such a pool whose memory is allocated in the main heap) *and* that has
> an instance of Unchecked_Deallocation.

I'm not sure I follow.  In Ada 95 I can convert to any general access type
with the same designated subtype and compatible accessibility level,
whether or not it has a user-defined storage pool.  I guess I don't
see the added danger with this proposal.

Are you associating the added danger with the allocator or with
Unchecked_Deallocation_From_Pool?
Both of these would be explicit in the source code.
There is no chance a user could use them by mistake.
And it seems that both of these new capabilities are equivalent
(ignoring thread-safety) to something that could be accomplished in Ada 95
using a level of indirection in a storage pool, plus a couple of access
type conversions.

> ...
> It would make perfect sense to warn on the declaration of such an
> instance of Unchecked_Deallocation for a general access type that has
> a user-defined pool. Indeed, I'm not sure why we don't have such a
> warning: "Use of this deallocator is potentially unsafe as
> deallocation of objects that were not originally allocated cannot be
> detected." Probably just never thought about this in detail.
> ...
>
> However, with your proposal, this danger now spreads to all (named)
> general access types, because it would take full program analysis to
> prove or disprove whether there is a danger.

I'm not sure I see why you need more sophisticated analysis for this
proposal.  The type conversions can be buried anywhere in Ada 95.

> ...
> With the semantics you are suggesting, you couldn't even tell
> syntactically whether it is or is not a coextension (it would depend
> on what pool is given in the allocator). Yuck.

Sorry, that wasn't what I was trying to say.  If you specify a storage
pool explicitly for the allocator, it would *not* be a coextension, even
if it happened to specify the same pool.

> ...
> I'm not against solving the problem. I just don't like this particular
> form of the proposed solution to the problem....

For me the "problem" is that I can't associate a pool with a particular
allocator, without using a thread-unsafe approach.

How would you express the "problem" that you perceive in this area?

>> But in my experience, eliminating overhead in storage management is
>> absolutely critical to the performance of our kind of tool. ...
> ...
>
> Still, obviously one size does not fit all when it comes to memory
> allocation. I'm just not interested in sacrificing some safety to
> solve this (or any other, for that matter) problem.

I guess I don't see this as particularly affecting pointer safety
either way, since we have already said that Unchecked_Deallocation
is the users' responsibility to get right, both in terms of dangling
references and in terms of pools.

I am trying to solve a problem with thread-safety, without
significantly changing anything else.

If you could explain the problem as you perceive it, and at least
the general characteristics of a solution you would prefer, that
would help me better understand your view.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  10:51 PM

> Good point.  One approach to avoiding the finalization issue is to
> require that storage pools used with this new syntax be required to
> have some additional operations related to finalization.  In our Ada
> 95 compiler, for each access type that might designate objects
> requiring finalization, we maintain a (circularly-
> linked) list of allocated objects that need finalization.  An
> allocator adds the object to the list if it needs finalization, and
> Unchecked_Deallocation removes the object from the list if
> it needs finalization.   When the access type itself is finalized,
> we walk the list finalizing all of the objects that remain on it.

Yes, our compiler works similarly.

> If any early memory reclamation is done on pools, in Ada 95 or with
> this new proposal, you have to hope you don't have objects that need
> finalization, because that circularly-linked list is going to have
> dangling references.

"Hope" doesn't cut it for language design! :-)

> If instead we expected storage pools
> to provide operations to add and remove objects from such a list, and
> we require the storage pool object to never outlive the access types
> whose objects it stores, so that when the storage pool itself is
> finalized, it can free any remaining objects, then we can avoid this
> dangling reference problem.

Don't you mean the inverse? Currently, a pool has to outlive the access
type. Surely we can't invert that or we'd have nasty compatibility problems.

I'd presume that these "enhanced" pools would derived from
Root_Storage_Pool (as this is a nice abstract limited controlled type),
and it would be nasty to have different lifetime requirements for
Root_Storage_Pool and Enhanced_Storage_Pool. Especially consider the
problems making a legality check for Root_Storage_Pool'Class!

    package System.Enhanced_Storage_Pools is
         type Enhanced_Storage_Pool is new System.Storage_Pools.Root_Storage_Pool with private;

         procedure Save_Object_for_Finalization (Pool : in out Enhanced_Storage_Pool;
                  Storage_Address : in Address);

         procedure Finalize_Saved_Object (Pool : in out Enhanced_Storage_Pool;
                  Storage_Address : in Address);
    end System.Enhanced_Storage_Pools;

I would not be happy if these were unrelated types, since this is exactly
the sort of place where dispatching and inheritance works to maximum
advantage. That is, there would be no problem using "enhanced" storage pools
in contexts that also expect regular storage pools - so they should be related
types.

Not sure what the best solution is. If the type retained its own list, then there
is no problem having it call the pool operations as needed. But that seems like
overkill (both the user-defined pool and the language-defined type would have
to keep lists of objects). If the type tries to use the pool's list, then
objects no longer go away when the type does: that seems like a significant
change in semantics.

Maybe the answer is that if the allocator uses a "normal" pool, it goes onto
the type's list, and if the allocator uses an "enhanced" pool, it goes on the
pool's list. In the latter case, there is no defined relationship between
the access type and the object lifetime: that's the pools job.

Humm, this would seem to fix the garbage collection problem, too. Just specify
an "enhanced" pool for your access type (or explicitly in the allocators),
and you can finalize when you want (but no later than when the pool is
finalized). That couldn't be incompatible, as there are no "enhanced" pools
now.

> Furthermore, if the storage pool doesn't want to handle objects
> needing finalization, it can simply raise an exception if the
> "add-to-finalization-list" operation is called.

Yup. Probably would come in handy for some algorithms.

> Mark/release can then be handled safely with respect to finalization,
> by simply doing a "mark" on this finalization list, and at release
> finalizing all of the objects added to the list since the mark.

Yup. The only remaining issue is how to tie this to processes defined by
the standard that work like finalization (i.e. task waiting and protected
objects). Specifically, the pool has an address of an object to finalize:
how does it in fact finalize that object?

Implementation-wise, it seems easy. (For Janus/Ada, at least.) Just pass
it to the low-level library routine that finalizes an object. (That grabs
the tag out of location 0 [there must be one if it is a finalizable
object], and either dispatches appropriately, or if the tag is a
"magic marker" tag, do the operation required by that marker - such as
waiting for tasks or freeing a memory block.)

But how best to represent this operation so that the user code can call it?
Surely they must do so in order to finalize any remaining objects. All
I can think of is a magic routine in System.Enhanced_Storage_Pools:

      procedure Finalize_Object_at_Address (Storage_Address : in Address);
          -- Finalizes the object at Storage_Address, calling Finalize if
          -- needed, waits for tasks, does protected object finalization, etc.

This is not very satisfying.

P.S. This is more fun than fixing bugs in the Standard (like trying to
figure out how accessibility should work)! :-)

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  11:53 PM

> I'm not sure I follow.  In Ada 95 I can convert to any general access
> type with the same designated subtype and compatible accessibility
> level, whether or not it has a user-defined storage pool.  I guess I
> don't see the added danger with this proposal.

If I was worried about such conversions, I could trivially add a
warning to them as well. But they're not often a problem; many
user-defined pools in fact allocate memory from the stack and thus
the check would catch them, too. Only user-defined pools that use
memory allocated from the main heap can sneak through.

> Are you associating the added danger with the allocator or with
> Unchecked_Deallocation_From_Pool?
> Both of these would be explicit in the source code.
> There is no chance a user could use them by mistake.
> And it seems that both of these new capabilities are equivalent
> (ignoring thread-safety) to something that could be accomplished in
> Ada 95 using a level of indirection in a storage pool, plus a couple
> of access type conversions.

The added danger is that an Unchecked_Deallocation (any kind) could
never know what pool the object might have come from.

Currently, the object (in the absence of type conversions) can *only*
have come from the defined pool or from some other place (usually the
stack). Thus, a simple check on the deallocation that the address belongs
to the defined pool will detect 98% of the misuses.

Note that I'm am not saying that our current implementation is 100% safe,
because it's not. (Among other things, it will miss pointers created into
the middle of heap objects.) But it doesn't need to be in order to add
substantial value (this is the same as the dangling pointer checks that
we do on containers: they're 98% effective, but that in practice catches
virtually all real problems eventually, and they're a lot cheaper than
checks that are 100% effective).

Obviously, you still can write code that is unsafe (we can't detect
double deallocations, for instance), but by catching a significant
percentage of problems, the buggy code gets detected earlier in testing
-- usually before any significant damage occurs.

...
> > However, with your proposal, this danger now spreads to all (named)
> > general access types, because it would take full program analysis to
> > prove or disprove whether there is a danger.
>
> I'm not sure I see why you need more sophisticated analysis for this
> proposal.  The type conversions can be buried anywhere in Ada 95.

Type conversions don't cause a real problem, because hardly anyone
would go to the lengths that you did to use storage pools. If I have
to use a lot of type conversions, I generally give up and make
everything the same type.

In any case, you could warn on any type conversion that could cause a
problem (I think they are rare enough that wouldn't usually be a major
annoyance). It's only a problem for a conversion into a general access
type that has the default pool (and thus would not warn on the
instantiation of Unchecked_Deallocation). Other conversions would have
no reason to be warned. Warning on every use of an allocator that has
a storage pool would rightly be considered noise by a user (most would
say, of course I used the feature, if you can't suggest how to avoid the
warning, don't bother giving one!).

Of course, if there was some way to figure out that a particular
user-defined pool was "safe", the warning could be avoided, but I can't
imagine a good way to do that.

...
> > ...
> > I'm not against solving the problem. I just don't like this
> > particular form
> > of the proposed solution to the problem....
>
> For me the "problem" is that I can't associate a pool with a
> particular allocator, without using a thread-unsafe approach.

That's not a problem, that's a solution to a problem.

> How would you express the "problem" that you perceive in this area?

The user need that you described: the ability to conveniently manage
the memory lifetime of a data structure, possibly made up of multiple
types and multiple access types, separately from the lifetime of those
types.

I can grant that the solution might involve storage pools, but they
can only be part of a solution, never part of a problem.

...
> > Still, obviously one size does not fit all when it comes to memory
> > allocation. I'm just not interested in sacrificing some safety to
> > solve this (or any other, for that matter) problem.
>
> I guess I don't see this as particularly affecting pointer safety
> either way, since we have already said that Unchecked_Deallocation is
> the users' responsibility to get right, both in terms of dangling
> references and in terms of pools.

That's of course a cop out. This user rarely gets these "user's
responsibility" things exactly right, and this implementer tries to
help out this user find out where he screwed up. And neither of me
much like losing a tool in this particular battle.

To take a current example: my spam filter crashes with a bad memory
reference roughly once every 30 days. Because one of the versions
of the crash is the dereference of a null pointer in code that can't
have a null pointer, I suspect a memory management problem. (Whether
it is with Windows, with my code, or with the Janus/Ada runtime code,
I don't know). I'd like to find the problem, but how can you do that
with something that happens once every 30 days (which is typically
longer than the time before Windows needs to reboot!). The only
thing I can do is make it less likely that I made a mistake, thus
extra checking on Unchecked_Deallocation at all levels.

> I am trying to solve a problem with thread-safety, without
> significantly changing anything else.

Given the finalization problems that we've noted elsewhere, I don't
think that is possible. And if a solution is not at least as safe (and
preferably, much more safe) than existing code, than I'm not very
interested.

> If you could explain the problem as you perceive it, and at least the
> general characteristics of a solution you would prefer, that would
> help me better understand your view.

Well, the problem you described is being able to get a better handle
on memory management of large abstractions. It's not really about
objects per-se, but rather a collection of objects that make up
an instance of a single abstraction.

The specific deficiency of Ada in this area is that the only facilities
provided require tying of pools to the lifetime of access types. And
those are impossible to control with the needed granularity.

As far as solutions goes, I really don't know. (This isn't my problem,
after all.) I guess I would like to see the ability to tie pools to
an entire abstraction. One of the problems with your solution is that
you have to repeat it on every node of your abstraction -- passing
the pools everywhere, and putting them on every allocator. Forget one,
(or heaven forbid, use the wrong one), and the whole thing collapses
into a leaky mess (or worse).

One thought would be to have a way to specify a pool for the entire
abstraction. That would mean that that pool would be the default for
any allocations made as part of the abstraction. But I admit I don't
see how that would work in user-defined code.

---

On a detail level, I think you may have already addressed my concerns
when you suggested a special kind of storage pool to handle finalization
properly. It would be easy enough to add some implementation-defined
stuff into those fancier pools in order to preserve most of the
current level of safety. (In particular a check that you don't
deallocate an address that you didn't allocate.) Moreover, there
would be even less reason to interconvert general access types, so a
warning on conversions would be more effective.

And there would be less need to use Unchecked_Deallocation, so less
chance for problems. Still, there would be some loss of safety because
a user-defined pool allocating out of the general heap (and deallocating
an object of it by mistake) is a lot more likely than trying to
deallocate an object that is part of a larger heap object.

Which is all to say that I don't know. I still find that going from
98% accuracy to (say) 70% isn't a good thing. (If I could find a
cheap way without a memory footprint to detect dangling pointers, I'd
implement it immediately for similar reasons.)

****************************************************************

From: Robert A. Duff
Sent: Thursday, March 13, 2008  8:02 AM

> ...

(Somewhere you mentioned lifetimes, but I can't find the quote to reply
to.) Anyway, in Ada 95/2005, the pool object has a longer lifetime
than the access type(s).  But that does NOT prevent dangling pointers,
because you can have a pool operation that frees the memory prematurely.
For example, Tucker likes mark/release pools (type MR_Pool, which I
pronounce "Mister Pool" ;-)) -- Release frees memory.

> Type conversions don't cause a real problem, because hardly anyone
> would go to the lengths that you did to use storage pools. If I have
> to use a lot of type conversions, I generally give up and make
> everything the same type.

You can't make them the same type if they have different designated types.
That's a common case when using tagged/class-wides.

> To take a current example: my spam filter crashes with a bad memory
> reference roughly once every 30 days. Because one of the versions of
> the crash is the dereference of a null pointer in code that can't have
> a null pointer, I suspect a memory management problem. (Whether it is
> with Windows, with my code, or with the Janus/Ada runtime code, I
> don't know). I'd like to find the problem, but how can you do that
> with something that happens once every 30 days (which is typically
> longer than the time before Windows needs to reboot!). The only thing
> I can do is make it less likely that I made a mistake, thus extra
> checking on Unchecked_Deallocation at all levels.

You could try a language (or implementation?) with garbage collection.  ;-)

> Given the finalization problems that we've noted elsewhere, I don't
> think that is possible. And if a solution is not at least as safe (and
> preferably, much more safe) than existing code, than I'm not very interested.

I was surprised, at first, to find that dangling pointers and leaks were
LESS of a problem using the storage pools Tucker is talking about.  I
think this is because you don't have to worry about freeing individual
objects (tracking object "ownership" or whatever on a per-object basis).
You just decide which pool to put it in, which is a fairly coarse-grained
decision (all the objects that come from input file X belong in the
pool for X).  And then they get freed wholesale at some point (e.g. when
we notice that the user has deleted file X).

Those (Ada 95) pools require a global variable ("the current pool"),
which is bad for the usual reasons -- passing the pool to "new" would
solve those problems.

> As far as solutions goes, I really don't know. (This isn't my problem,
> after all.) I guess I would like to see the ability to tie pools to an
> entire abstraction. One of the problems with your solution is that you
> have to repeat it on every node of your abstraction -- passing the
> pools everywhere, and putting them on every allocator. Forget one, (or
> heaven forbid, use the wrong one), and the whole thing collapses into a
> leaky mess (or worse).

I briefly mentioned a solution to this earlier.  I want to say "for all
access types declared in this package, and all its children, the default
pool is X". I would make X be a pool that always raises an exception on
Allocate. (Added bonus -- make sure the compiler _knows_ that, so it can
warn. It's basically the same as the "for T'Storage_Size use 0" pool.)
Then every "new" had better have a specified pool.

> One thought would be to have a way to specify a pool for the entire
> abstraction.

Not sure what you mean by "abstraction", here.  Whatever it is, it needs
to be dynamic.

> On a detail level, I think you may have already addressed my concerns
> when you suggested a special kind of storage pool to handle
> finalization properly.

That's one approach.  Another acceptable approach would be to say that
allocating any object with nontrivial finalization (or with tasks) in
these prematurely-finalized pools is erroneous.  I don't think it's
hard to define "nontrivial finalization", but I think it needs to be
determined at run time for class-wide objects to work properly.

> Which is all to say that I don't know. I still find that going from
> 98% accuracy to (say) 70% isn't a good thing. (If I could find a cheap
> way without a memory footprint to detect dangling pointers, I'd
> implement it immediately for similar reasons.)

It's not hard to make a pool type such that you can find out which pool
a given address points into.  Use BiBOP ("Big Bag Of Pages").  Memory
footprint is one word per memory "page" (where "page" = 1K or something).
I.e. approximately zero.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 13, 2008  2:19 PM

> (Somewhere you mentioned lifetimes, but I can't find the quote to
> reply to.)
> Anyway, in Ada 95/2005, the pool object has a longer lifetime than the
> access type(s).  But that does NOT prevent dangling pointers, because
> you can have a pool operation that frees the memory prematurely.
> For example, Tucker likes mark/release pools (type MR_Pool, which I
> pronounce "Mister Pool" ;-)) -- Release frees memory.

Correct, but such use of a storage pool is explicitly erroneous by
13.11(21). As an implementer, I'd be justified in deleting Boston if
Tucker does that. (This, I presume, is a reason that Tucker likes to
use his own compiler. ;-)

> > Type conversions don't cause a real problem, because hardly anyone
> > would go to the lengths that you did to use storage pools. If I have
> > to use a lot of type conversions, I generally give up and make
> > everything the same type.
>
> You can't make them the same type if they have different designated types.
> That's a common case when using tagged/class-wides.

If they have different designated types, they're not convertable. As
far as tagged types go, I use a global "access to root class" for
everything, and explicitly convert into a particular specific/classwide
type if I need to access components or operations that are only available
on a derived type. That doesn't happen very often if your design is good
(dispatching does most of those conversions for you).

The net effect is, I rarely use access conversions; I only use view
conversions on the dereferenced access type. And I generally use only a
single global access type for the entire hierarchy. Other approaches seem
to cause a lot more explicit conversions.

> > To take a current example: my spam filter crashes with a bad memory
> > reference roughly once every 30 days. Because one of the versions of
> > the crash is the dereference of a null pointer in code that can't
> > have a null pointer, I suspect a memory management problem. (Whether
> > it is with Windows, with my code, or with the Janus/Ada runtime code,
> > I don't know). I'd like to find the problem, but how can you do that
> > with something that happens once
> > every 30 days (which is typically longer than the time before
> > Windows needs to reboot!). The only thing I can do is make it less
> > likely that I made a
> > mistake, thus extra checking on Unchecked_Deallocation at all levels.
>
> You could try a language (or implementation?) with garbage collection.
> ;-)

Rewriting 2 Megabytes of working Ada into another language is not practical
(or smart!). And Ada does not allow garbage collection on types with
non-trivial finalization (you'd have to finalize too early to collect
the garbage - that's a problem that ought to be fixed), so that's out, too.

...
> > As far as solutions goes, I really don't know. (This isn't my
> > problem, after
> > all.) I guess I would like to see the ability to tie pools to an
> > entire abstraction. One of the problems with your solution is that
> > you have to repeat it on every node of your abstraction -- passing
> > the pools everywhere,
> > and putting them on every allocator. Forget one, (or heaven forbid,
> > use the
> > wrong one), and the whole thing collapses into a leaky mess (or worse).
>
> I briefly mentioned a solution to this earlier.  I want to say "for all access
> types declared in this package, and all its children, the default pool is X".
> I would make X be a pool that always raises an exception on Allocate.
> (Added bonus -- make sure the compiler _knows_ that, so it can warn.
> It's basically the same as the "for T'Storage_Size use 0" pool.) Then
> every "new" had better have a specified pool.

That's the sort of idea I had in mind. Thanks for mentioning it again.

> > One thought would be to have a way to specify a pool for the entire
> > abstraction.
>
> Not sure what you mean by "abstraction", here.  Whatever it is, it
> needs to be dynamic.

I'm not sure, either! That's why I was so vague. I don't see an obvious
way to map "abstraction" into language-defined terms like object and type.

> > On a detail level, I think you may have already addressed my
> > concerns when
> > you suggested a special kind of storage pool to handle finalization
> > properly.
>
> That's one approach.  Another acceptable approach would be to say that
> allocating any object with nontrivial finalization (or with tasks) in
> these prematurely-finalized pools is erroneous.  I don't think it's
> hard to define "nontrivial finalization", but I think it needs to be
> determined at run time for class-wide objects to work properly.

No, that is NOT an acceptable approach. For two reasons:

(1) I don't think a significant segment of the Ada community would stand
for a feature adding more erroneousness to Ada, especially in a common case.
(We went through this with the containers, after all, which is why we added
various checks to them over Matt's objections.) Tucker's argument that
the basic feature doesn't add any more erroneousness for
Unchecked_Deallocation to the language is probably sound (I don't like it
because it adds more erroneousness to our implementation, but that's
a different issue). But making finalization erroneous where it previously
was safe seems like an extension of erroneousness, and that is bad.

(2) Many common types have non-trivial finalization. I like to say that
all new types should be controlled, which while going a bit too far, is
still true for many projects. Even if you remove memory management from
the equation (and I'm not convinced that is a good idea), there still
are many other reasons to do things when objects go away. One example:
Claw puts active top-level windows on a chain. When a top-level window
is destroyed, it better be removed from that chain!

Probably more importantly to you, the language-defined containers are
controlled types. A solution that doesn't let you use the containers seems
very bad.

> > Which is all to say that I don't know. I still find that going from
> > 98% accuracy to (say) 70% isn't a good thing. (If I could find a
> > cheap way without a memory footprint to detect dangling pointers,
> > I'd implement it immediately for similar reasons.)
>
> It's not hard to make a pool type such that you can find out which
> pool a given
> address points into.  Use BiBOP ("Big Bag Of Pages").  Memory
> footprint is one
> word per memory "page" (where "page" = 1K or something).  I.e.
> approximately zero.

I don't see how that would detect many dangling pointers (most of them
point into the main pool, after all, and almost certainly into pages
that are valid [since freeing pages is rare; we only do it for huge
single allocations that are deallocated]). Maybe I misunderstand the
algorithm.

It would help a bit on the deallocation from the wrong pool problem,
but what we do for the main heap right now would work nearly as well
(keeping track of the largest and smallest addresses previously
allocated). And that doesn't take much space, either. But since
dereferences don't know what pool they came from, you can't use that
on them (other than on pool-specific types).

Indeed, detecting dangling pointers in general access types is
very expensive, because you can only do it if you know what pool
the pointer supposedly was allocated from (user-defined pools would
need a checking routine as part of the pool, and aliased objects
couldn't be checked at all). That means carrying an indication
of the pool along with the access value.

****************************************************************

From: Tucker Taft
Sent: Thursday, March 13, 2008  3:28 PM

One thing that might alleviate some of the concerns you exoressed
would be to require some kind of rep clause on an access type to
allow it to be used with allocator-specific pools.  For example,
you might be required to specify the type of the pool objects being
passed, e.g.:

    for Acc_Type'Allocator_Pool_Type use My_Storage_Pool_Type;

Then you could only pass pools of type "My_Storage_Pool_Type"
to an allocator of type Acc_Type.  Of course you could say
"Root_Storage_Pool'Class" and you would have full flexibility (and
no checking).

If we want anonymous access types to fit into this scheme, we might
want to turn it around somewhat for that and specify whether a
particular storage pool type could be used with an anonymous allocator,
e.g.:

    pragma Allow_As_Anonymous_Allocator_Pool(My_Storage_Pool_Type);

or some such thing.

****************************************************************

From: Robert A. Duff
Sent: Thursday, March 13, 2008  5:08 PM

> I don't see how that would detect many dangling pointers...

Sorry, I wasn't clear.  The BiBOP technique can allow you to tell
(from the address) which pool it points into.  That may or may not
be part of dangling-pointer detection, or schemes for doing various
other things.

> Indeed, detecting dangling pointers in general access types is very
> expensive, because you can only do it if you know what pool the
> pointer supposedly was allocated from (user-defined pools would need a
> checking routine as part of the pool, and aliased objects couldn't be
> checked at all). That means carrying an indication of the pool along
> with the access value.

The BiBOP technique means you don't have to carry the pool with
each access value, in order to tell which pool it came from.  Of
course, it only works for suitably-programmed pools -- whether
user-defined or part of the implementation.

****************************************************************

From: Tucker Taft
Sent: Thursday, March 13, 2008  9:30 PM

>> (Somewhere you mentioned lifetimes, but I can't find the quote to
>> reply to.)
>> Anyway, in Ada 95/2005, the pool object has a longer lifetime than
>> the access type(s).  But that does NOT prevent dangling pointers,
>> because you can have a pool operation that frees the memory prematurely.
>> For example, Tucker likes mark/release pools (type MR_Pool, which I
>> pronounce "Mister Pool" ;-)) -- Release frees memory.
>
> Correct, but such use of a storage pool is explicitly erroneous by
> 13.11(21). As an implementer, I'd be justified in deleting Boston if
> Tucker does that. (This, I presume, is a reason that Tucker like to
> use his own compiler. ;-)

I don't quite follow this (and please don't delete Boston on my behalf ;-).
13.11(21) says you can't "... use the storage for any other purpose
while the pool element remains in existence."  Clearly after a "release"
on a pool, the pool elements whose storage is freed no longer exist,
just like with Unchecked_Deallocation.  So what makes it erroneous?
Are you saying it is erroneous for an object that "needs finalization"
to go out of existence without having finalization performed? That I
would agree with you, though I'm not sure I know what paragraph of
the manual would back us up.  I guess 7.6.1(10) implies that the
only way for an object to cease to exist early is via Unchecked_Deallocation.
That is actually in redundancy brackets, but it seems like a bit of a
stretch, given the admitted possibility of Mark/Release storage pools
as exemplified in 13.11.

This whole area seems a bit confused.

In any case, the manual clearly implies that Mark/Release pools are
not erroneous, since they are used in the Example in 13.11.

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  12:13 AM

> > Correct, but such use of a storage pool is explicitly erroneous by
> > 13.11(21). As an implementer, I'd be justified in deleting Boston if
Tucker
> > does that. (This, I presume, is a reason that Tucker like to use his
> > own compiler. ;-)
>
> I don't quite follow this (and please don't delete Boston on my behalf
> ;-).  13.11(21) says you can't "... use the storage for any other
> purpose while the pool element remains in existence."  Clearly after a
> "release" on a pool, the pool elements whose storage is freed no
> longer exist, just like with Unchecked_Deallocation.

Not according to Ada. You and I argued about the meaning of "exists"
in Ada some years ago, and you convinced me that I was wrong and that it
is in fact a technical term. You can't flip-flop now... (well, it is an
election year, so I suppose you can. ;-)

> So what makes it erroneous?
> Are you saying it is erroneous for an object that "needs finalization"
> to go out of existence without having finalization performed? That I
> would agree with you, though I'm not sure I know what paragraph of the
> manual would back us up.  I guess 7.6.1(10) implies that the only way
> for an object to cease to exist early is via Unchecked_Deallocation.
> That is actually in redundancy brackets, but it seems like a bit of a
> stretch, given the admitted possibility of Mark/Release storage pools
> as exemplified in 13.11.

7.6.1(11/2) says that objects only cease to exist *after* they are
finalized. No redundancy brackets there. Since all objects are (logically)
finalized, and the only permission to clobber an object early is via
Unchecked_Deallocation, clearly the objects continue exist until the
master goes away. Thus the storage pool violates the third sentence
of 13.11(21), and execution is erroneous.

Another way to say this is that the actions of a random user-defined
storage pool (which could be anything) should not change the underlying
semantics of the language. Surely it should not be changing the lifetime
of objects. Otherwise, you'll have a nightmare for program analysis and
understanding. (See P.S.)

> This whole area seems a bit confused.

There is nothing confused here, the semantics are clear and straightforward.
They may not be what you think they should be, but that's different.

> In any case, the manual clearly implies that Mark/Release pools are
> not erroneous, since they are used in the Example in 13.11.

I always thought that example meant that Release was called after the
type Acc went away. Anything else would clearly be erroneous. I see by
looking at it carefully that this is not clear. But in any case, this is
not normative.

This is another symptom of the "no garbage collection" problem. We have
no permission for early finalization of *any* object in Ada as it stands
 - anything that violates that is technically erroneous (even if not
erroneous in practice).

I've on a number of occasions suggested adding Implementation Permission
to allow early finalization in limited cases at limited points.
("Limited cases" means no early task waiting, and only for objects
that are otherwise inaccessible; "limited points" means at any point
where finalization would normally occur. Those limits should make it
possible to support garbage collection, mark/release pools, and anything
else in between without making program analysis intractable.) But there
never seems to be a lot of interest in tackling this problem (one of
the worst in Ada today, I think). [Keep in mind that recovering memory
from a failed allocator for an object with a controlled part is currently
wrong for this reason; I've written an ACATS-style test to test that --
I could issue it...]


P.S. Allowing existence to be controlled by storage pools would pretty much
mean that 7.6.1 can be thrown away and compilers can do anything they want.
Moreover, it would prevent a lot of static analysis, because it would be
hard to rule out a change in existence caused by a pool.

Consider the following:

    type Acc is access ...
    for Acc'Storage_Pool use MR_Pool;

    O : Acc;

    A := O.all; -- (1)
    null; -- (2)
    P (O.all); -- (3)


Can a tool or compiler assume that O.all is the same between uses (1)
and (3)? The answer is (amazingly) no. Another task could cause O.all to go
away at (2), and the program becomes erroneous.

Now, a compiler can use that erroneousness to allow it to make the assumption
anyway, because the "anything goes" rule of erroneous execution surely
includes assuming (3) is the same as (1). But I don't see that the same
applies to a static analysis tool. If P's parameter is in out, making
this assumption could overlook a major program disaster (overwriting
random memory). That would greatly damage the purpose of static analysis
in the first place.

Note that (2), it is perfectly OK for another task to access MR_Pool
and call Release. This task is not using it in any way at this point. Only
13.11(21) makes it erroneous to use the object later. (13.11.2(16) does
not apply - the object still exists.)

Currently, a static analysis tool could prove that there is no problem
by ensuring that no other task ever calls Unchecked_Deallocation on
a type that uses MR_Pool. While that wouldn't necessarily be easy,
it seems tractable. (Much of the time there would be no instance of UD
accessible in other tasks, which would prove the premise.)

However, allowing any random operation on a pool to change the existence
of objects means that a tool would need a much wider net: essentially
any possible touching of a pool (even a call to Allocate) could cause
some object elsewhere in the program to cease to exist. That would
seem to be very hard for a static analysis tool to disprove: it would
have to figure out the intent of any calls into the pool.

Consider:

    A := O.all; -- (1)
    B := Allocated_Size(MR_Pool); -- (2)
    P (O.all); -- (3)

A tool would have to assume that O.all could be killed by the user-defined
pool operation Allocated_Size. That's even though it is fairly obvious to
a human reader that the intent is just to get information from the pool.
The only way to avoid such an assumption would be analyze how
Allocated_Size worked, and since there are as many pool designs as
programmers, I'm not sure that would be very practical. (After all,
Release(MR_Pool) probably does nothing except set the value of a
ariable of type Address - it wouldn't do any memory operations itself.
How to tell that from a benign statistics gathering routine is not
obvious to me.)

Besides, such a rule is not addressing the real problem, which is that
early finalization is not allowed in Ada. Since all objects are
(logically) finalized, and surely the object has to exist in order that
it can be finalized, any attempt to release the memory early is going
to have be erroneous. I do agree that this (in the case of trivial
finalization) is an example of what Robert called "things that will
work fine in practice, but are still technically erroneous and cause
language lawyers to go tsk-tsk". The trouble is that trying to
separate trivial finalization from other finalization just makes
the problem more annoying: now you can do it legally with carefully
crafted types, but if maintenance adds a controlled part later on
(say by adding a container), the system deletes Boston (because it
now falls into Robert's second category "things that are genuinely
dangerous"). I think that's worse than the original situation (where
it never is legitimate).

****************************************************************

From: Tucker Taft
Sent: Friday, March 14, 2008  1:14 AM

>> So what makes it erroneous?
>>...
>
> 7.6.1(11/2) says that objects only cease to exist *after* they are
> finalized. No redundancy brackets there.

Well if we are getting technical, what 7.6.1(11/2) says is that it
finalizes objects *that still exist*.  If the object ceases to exist
earlier, then it is not finalized at the end of the scope of the access
type.

Now I admit that 7.6.1(10) implies (in redundancy brackets) that
Unchecked_Deallocation is the only way for an object to cease to exist
"early," but I have never taken that at face value, since the whole
point of a Mark/Release heap is to release storage *before* the whole
access type goes away, in the same way that Unchecked_Deallocation does.
Think of Release as a "grouped" Unchecked_Deallocation of a bunch of
objects.  In fact, it could be implemented that way, by keeping a list
of objects as a side-effect of Allocate, and then iterating over them
at the time of Release. In fact, if the objects need finalization,
then that is the only safe thing to do.  The actual "Deallocate"
routine could be a no-op, if the presumption is that the storage is
actually going to be reclaimed "en masse" by the Release operation.
Of course we would like a better way to accomplish this, so that
we don't have to have heap objects needing finalization on two lists,
one maintained by the compiler and one maintained by the pool.

> ... Since all objects are (logically)
> finalized, and the only permission to clobber an object early is via
> Unchecked_Deallocation, clearly the objects continue exist until the
> master goes away. Thus the storage pool violates the third sentence of
> 13.11(21), and execution is erroneous.

I think the statement in 7.6.1(10) will want to be generalized to allow
for Mark/Release pools, if we can figure out a safe way for them to get
any needed finalizations performed on the objects whose storage is being
reclaimed upon Release.

> Another way to say this is that the actions of a random user-defined
> storage pool (which could be anything) should not change the
> underlying semantics of the language. Surely it should not be changing
> the lifetime of objects. Otherwise, you'll have a nightmare for program
> analysis and understanding. (See P.S.)

I think you would agree that a call on Release could be implemented via
a bunch of calls on Unchecked_Deallocation.  Of course, part of the design
problem we are working on is how to make such pools be able to accomplish
the equivalent of Unchecked_Deallocation on a whole group of objects so
that needed finalizations are performed, but do it more efficiently.

>> This whole area seems a bit confused.
>
> There is nothing confused here, the semantics are clear and straightforward.
> They may not be what you think they should be, but that's different.

I think it is confused to say that Unchecked_Deallocation is the only way
for an object to cease to exist "early", given user-defined storage pools
and operations like "Release." But I agree that if we are going to be
storing objects needing finalization in user-defined mark/release storage
pools, we need to add more mechanism to enable them to do the needed
finalizations.

>> In any case, the manual clearly implies that Mark/Release pools are
>> not erroneous, since they are used in the Example in 13.11.
>
> I always thought that example meant that Release was called after the
> type Acc went away. Anything else would clearly be erroneous. I see by
> looking at it carefully that this is not clear. But in any case, this
> is not normative.
>
> This is another symptom of the "no garbage collection" problem. We
> have no permission for early finalization of *any* object in Ada as it
> stands - anything that violates that is technically erroneous (even if
> not erroneous in practice).

I think we are in violent agreement here, though I am more interested
in mark/release pools than in garbage collection. But it is basically
the same problem.  I suspect you might get more interest in the problem
if it was recognized that the current rules imply that mark/release pools
are erroneous. They are certainly used!

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  1:18 PM

> > Another way to say this is that the actions of a random user-defined storage
> > pool (which could be anything) should not change the underlying semantics of
> > the language. Surely it should not be changing the lifetime of objects.
> > Otherwise, you'll have a nightmare for program analysis and understanding.
> > (See P.S.)
>
> I think you would agree that a call on Release could be implemented
> via a bunch of calls on Unchecked_Deallocation.  Of course, part of
> the design problem we are working on is how to make such pools be able
> to accomplish the equivalent of Unchecked_Deallocation on a whole
> group of objects so that needed finalizations are performed, but do it
> more efficiently.

While I agree with your principle, I don't think that is actually
possible in practice. You could implement it as calls to Deallocate,
but it doesn't change the existence of objects. That is the magic part
of Unchecked_Deallocation.

All I'm saying is that random user code shouldn't be allowed to change
the lifetime of objects other than through strictly defined ways. Otherwise,
it would be near impossible to analyze anything. How could a tool tell
between Release (which supposely changes the lifetime of things) and
Is_Valid (which is just a query)?

Currently, the only defined way to change the lifetime of an object is
Unchecked_Deallocation (and that does *not* mean the implementation
of UD via a pool, but rather the actual call to a UD instance). It's
obvious that we need more than that, but whatever we have ought to be
very well defined so it is possible to write tools and tests that depend
on the lifetime of objects.

> >> This whole area seems a bit confused.
> >
> > There is nothing confused here, the semantics are clear and straightforward.
> > They may not be what you think they should be, but that's different.
>
> I think it is confused to say that Unchecked_Deallocation is the only
> way for an object to cease to exist "early", given user-defined
> storage pools and operations like "Release."
> But I agree that if we are going to be storing objects needing
> finalization in user-defined mark/release storage pools, we need to
> add more mechanism to enable them to do the needed finalizations.

I've always thought that a Mark/Release pool is erroneous unless the type
has gone away. If the designated type has trivial finalization, it is
what Robert called "language-lawyer erroneousness", in that it would work
on virtually all implementations, but it still is erroneous. (We don't
have a concept of "trivial finalization" in Ada, after all.)

...
> I think we are in violent agreement here, though I am more interested
> in mark/release pools than in garbage collection.
> But it is basically the same problem.  I suspect you might get more
> interest in the problem if it was recognized that the current rules
> imply that mark/release pools are erroneous.
> They are certainly used!

Some would say that Mark/Release is just a poor man's garbage collection.
(And they've never worked for me, although with multiple pools in Ada,
they might in fact work better.)

In any case, the issue is recognized now: at least by anyone who's read
this e-mail thread!

****************************************************************

From: Robert A. Duff
Sent: Friday, March 14, 2008  1:26 PM

Tuck, Randy is right -- the storage pool stuff in SofCheck Inspector is
utterly erroneous, from a formal point of view.  It relies on
implementation behavior that is not guaranteed by the RM.  Nonetheless,
it works just fine!

Anyway, I think we need a specific detailed proposal at this point.
It should address Randy's concerns, which I think are valid, but not
show-stopping.

I'm willing to volunteer to write it up, but I don't want to step on
Tucker's toes if he's planning to.

So far, we have Tucker, Robert, and me in favor of doing something in this
area. And some fairly negative comments from Randy.  I don't want to waste
my time, so I'd like to hear some positive comments from others, first.

So, who wants me to write up a proposal on the general topic of "more
flexible memory management"?

P.S. Re: Randy's issues with garbage collection.  I put some stuff about
it in the RM, but Tucker wanted it moved to the AARM.  It's in 13.11.3.
It's pretty stupid stuff -- I've learned a lot about GC and memory
management in general since those days.

****************************************************************

From: Tucker Taft
Sent: Friday, March 14, 2008  1:55 PM

> ...(We don't have a
> concept of "trivial finalization" in Ada, after all.

I thought that any object that doesn't
"need finalization" per RM 7.6(9.1) is presumed to have "trivial"
finalization in the sense we care about.

Class-wide objects are a bit
of a special case, since you could decide whether they
"need finalization" based on treating them as an object of a
class-wide type (and then the answer is "yes") or as an
object of the specific type determined by the tag (and then the
answer depends on the properties of the specific type).

It pretty much depends on whether you
are interested in a "static" (compile
or link-time) answer ("yes") or a run-time answer (it depends
on the tag).

****************************************************************

From: Tucker Taft
Sent: Friday, March 14, 2008  2:03 PM

> Anyway, I think we need a specific detailed proposal at this point.
> It should address Randy's concerns, which I think are valid, but not
> show-stopping.
>
> I'm willing to volunteer to write it up, but I don't want to step on
> Tucker's toes if he's planning to.
>
> So far, we have Tucker, Robert, and me in favor of doing something in this area.
> And some fairly negative comments from Randy.

Actually I think Randy and I are converging, albeit a bit slowly... ;-)
I think Randy is afraid we are creating a bunch of new holes while trying to
reduce the use of Unchecked_Deallocation.  My ultimate goal is to make this
all automatic, so you have both efficient and safe, region/pool-based storage
allocation, where you can eliminate the possibility of dangling references
statically.  But first I was looking for a primitive that would allow me
to make progress toward that longer-term goal without all kinds of nasty
thread-unsafe kludgery using indirect storage pools.

But I think Randy doesn't want to add a
new primitive that makes things less safe, even if there is some grand
and glorious vision for eventually making things safer.

I accept Randy's concerns, and am beginning to see a path to achieving my goals
while not setting off his erroneousness alarm.

I guess I am still interested in writing something up myself, but I would be
happy to collaborate with you on it, as it seems to be an area we both
care about a lot.  If you want to take the first shot, great.  I'm a
little overwhelmed with other things right now, though some of them involve
storage pool hackery, so that is why I decided to float this syntactic
trial balloon.

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  2:15 PM

...
> Actually I think Randy and I are converging, albeit a bit slowly...
> ;-)  I think Randy is afraid we are creating a bunch of new holes
> while trying to reduce the use of Unchecked_Deallocation.  My ultimate
> goal is to make this all automatic, so you have both efficient and
> safe, region/pool-based storage allocation, where you can eliminate
> the possibility of dangling references statically.  But first I was
> looking for a primitive that would allow me to make progress toward
> that longer-term goal without all kinds of nasty thread-unsafe
> kludgery using indirect storage pools.

I'd probably have been less negative if you were not stopping halfway
on this journey. If the path to Nirvana leads through an aligator-infested
swamp, I'm less likely to take it. ;-)

> But I think Randy doesn't want to add a new primitive that makes
> things less safe, even if there is some grand and glorious vision for
> eventually making things safer.

Especially since no one has laid out the "grand and glorious" vision.

> I accept Randy's concerns, and am beginning to see a path to achieving
> my goals while not setting off his erroneousness alarm.

I think I do too. It'll be interesting to see if they're the same path...

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  2:26 PM

> Tuck, Randy is right -- the storage pool stuff in SofCheck Inspector
> is utterly erroneous, from a formal point of view.  It relies on
> implementation behavior that is not guaranteed by the RM.  Nonetheless,
> it works just fine!

Glad to see I'm not crazy. (I worry sometimes.)

...
> So, who wants me to write up a proposal on the general topic of "more
> flexible memory management"?

I think you should go for it. I'm always more negative about pie-in-the-sky
ideas rather than a formal proposal.

I'm going to write up a proposed Implementation Permission to allow
garbage collection and Mark/Release pools for Ada. I think this is a
somewhat separate issue, in that it is an existing bug rather than
an extension of some sort. (I'd like to see it fixed now, rather than in
201x.)

I'd like to see a proposal with all of the elements we've previously discussed
(it's easier to delete them than add them later):

* A way to set the default pool for all types in a package;
* A way to set the class of pool types allowed for an access type (and that
  then allows specification of specific pools on allocators); [this also allows
  the allocators to depend on implementation-defined "enhanced pool" types,
  such as one that has a dereference check routine (which would allow pools
  that positively identify dangling references upon use)]
* A solution to the finalization problem when using specific pools in allocators;
* Probably a matching Unchecked_Deallocation for specific pools in allocators.

I hope I didn't miss any.

> P.S. Re: Randy's issues with garbage collection.  I put some stuff
> about it in the RM, but Tucker wanted it moved to the AARM.  It's in
> 13.11.3. It's pretty stupid stuff -- I've learned a lot about GC
> and memory management in general since those days.

It surely is, especially as it never really addresses the problem. Indeed,
it actually says that the implementation needs to defer garbage collection
of objects that need finalization -- which seems exactly wrong.

I suppose we ought to update those notes along with a new
Implementation Permission, but I don't have much interest in that. Maybe
I can talk you into that (I suspect you know a lot more about GC than I do).

****************************************************************

From: Robert A. Duff
Sent: Friday, March 14, 2008  4:22 PM

> I guess I am still interested in writing something up myself, but I
> would be happy to collaborate with you on it, as it seems to be an
> area we both care about a lot.

I'm happy to let you go first.

I just think the whole discussion would be more productive if we had
something specific to argue about.  Randy has spent a lot of energy
pointing out what the RM _currently_ says, but obviously you want to
_change_ what it says.

P.S. I find memory management (GC and otherwise) to be fascinating,
for some strange reason!  ;-)

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  4:25 PM

> * A way to set the class of pool types allowed for an access type (and
that
> then allows specification of specific pools on allocators); [this also
> allows the allocators to depend on implementation-defined "enhanced pool"
> types, such as one that has a dereference check routine (which would
> allow pools that positively identify dangling references upon use)]

Really, the dereference check probably ought to be language defined. The
idea is to add:

    procedure Check_Dereference (Pool : in out Some_Storage_Pool;
                                 Addr : in Address) is null;

to a derived storage pool, and then define that a dereference of a
pool-specific access value (can't use this on general access types,
unfortunately, because you don't know the pool) calls this routine
(if it exists) before actually doing the dereference.

The routine then can check if the address is one that was actually allocated
by the pool, and raise an exception (preferably with a message) if it
was not. This could be used to detect dangling pointers as well as
bad pointers that were streamed in or Unchecked_Converted, and probably
to do other things as well if you want to assume the the type of object
that is designated.

It would allow the development of portable debugging pools that would
really be useful in that they could detect virtually all dynamic memory
management errors. (And combined with the ability to change the default
pool suggested above, could be used to debug an entire system rather
simply).

Perhaps the idea could even be extended enough to allow portable
user-defined garbage collection. (But that's not as obvious.)

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  4:46 PM

> I'm happy to let you go first.

As editor of the ARG, I think that is a bad idea. Tucker has several
critical projects for the ARG: he's the critical path for the entire
ASIS Standard at this point. I can't recommend that he do anything
that would take away from those focuses.

You, on the other hand seem to have both the necessary interest and
only a small amount of ARG homework (and not critical to the schedule
of a standard that has a hard deadline coming up).

Moreover, Tucker said he'd be happy to collaborate. What I don't want
to see is Tucker working on these "fun" projects and not getting his
critical ARG work done (because I know he has a limited amount of spare
time).

So I think very strongly that you should do it, and preferably right
away while you remember the gist of this e-mail discussion. Especially
because...

> P.S. I find memory management (GC and otherwise) to be fascinating,
> for some strange reason!  ;-)

...which makes you the ideal person to work on this.

****************************************************************

From: Stephen W Baird
Sent: Friday, March 14, 2008  7:28 PM

> So, who wants me to write up a proposal on the general topic of "more
> flexible memory management"?

I do.

****************************************************************

From: Tucker Taft
Sent: Friday, March 14, 2008  8:57 PM

By all means, after you... ;-)

Randy is right, I am too busy to be getting all fired up about this issue.
But I can't resist tossing out a couple of more thoughts:

I think Randy is right, we need to capture the "special sauce" that
Unchecked_Deallocation now has and figure out a way to "sprinkle" it
(carefully) on user-defined code.

I had suggested giving the management of a list of to-be-finalized
objects over to the user-defined code, but that seems like too much
work, and too "dangerous" work, for user-defined code.  So instead
I would suggest that a user-defined pool that wants to "destroy" objects
 early should be an extension of a "Root_Extended_Storage_Pool"
(Root ESP?) type (or some such name) that the implementation must
provide. This Root ESP type would have the necessary
mechanism, presumably provided via private operations that aren't
overridable, to maintain a list of to-be- finalized objects
associated with the pool.  So an allocator that uses such a pool
(either because the access type is associated with it, or the
individual allocator specifies it) would link the object, presuming
it needs finalization, into the pool-resident list.  Finalizing such
a pool object would automatically walk this pool-resident list.
In addition there would be an explicit operation provided by this
Root ESP type to finalize the whole list, thereby destroying all of
the objects on the list, and allowing the pool to reclaim all of their
storage.

I suppose there is no particular reason we couldn't just specify that
an explicit call on the Finalize operation of the Root ESP would have
this effect, but it might be better to give the operation some other
name, such as "Reset_Pool" (or even, Release_Pool ;-).

An unchecked_deallocation would still be permitted on a pool type
derived from this Root ESP, and it would remove the object from this
pool-resident list before finalizing it and calling the Deallocate
operation.

For example:

     type Root_Extended_Storage_Pool is abstract
       new Root_Storage_Pool with private;
         -- This Root ESP type has a (private) component that
         -- the RTS uses to maintain a list of heap
         -- objects allocated from the pool that need
         -- finalization.  The finalization of this component
         -- automatically finalizes all of the objects on
         -- the list and resets the list to empty.
         -- An explicit operation is provided to do
         -- the same thing:

     procedure Reset(Pool : in out Root_Extended_Storage_Pool);
         -- This resets the finalization list to empty after
         -- finalizing all of the objects on the list.
         -- All of the objects that have been allocated
         -- from the pool cease to exist after this call.
         -- It is presumed that extenders of the Root ESP
         -- type will override Reset to first call this
         -- original Reset routine to get any necessary
         -- finalizations performed, and then reclaim
         -- the actual storage associated with the pool.
         -- This operation is implicitly called as part
         -- of finalizing any type derived from Root ESP.

     -- Note that Allocate and Deallocate are inherited from
     -- Root_Storage_Pool, and are called as part of allocators
     -- and Unchecked_Deallocation[_From_Pool].  An allocator
     -- for an object that needs finalization will implicitly call
     -- a private operation of Root ESP to add the object to
     -- the pool's finalization list, and a call on an instance
     -- of Unchecked_Deallocation that applies to such a pool will
     -- implicitly call a private operation of Root ESP to remove
     -- the object from the list and finalize it, before calling
     -- the Deallocate routine.

   Reclaiming storage from a pool that is in use by an object
   would be erroneous unless either Unchecked_Deallocation is
   performed on the object, or Reset is performed on the pool
   as a whole.  As usual, referencing an object that no longer
   exists would be erroneous.

   It is important that a Root ESP pool object not outlive the
   type of the objects that are stored in it if they need
   finalization, since the finalization of a Root ESP object
   implicitly calls the finalize operation of these objects.
   That is why I said that the accessibility level of such
   a pool object must be at least deep as that of the types
   stored in it.  I realize that is somewhat the opposite of
   pools that are not managed in a mark/release manner.

So that's one approach to solve the finalization problem...

****************************************************************

From: Tucker Taft
Sent: Saturday, March 15, 2008  8:42 AM

Well of course once I got started on this it will be hard to stop
thinking about it.  A couple of additional thoughts came to me relating
to it.

First, suppose we want to allow objects with task parts to be allocated
from such a pool. We might then want to reasonably be able to wait
for all such tasks to complete before performing a Reset.  Similarly,
we might just want to be able to check whether all such tasks are
terminated (e.g. an All_Task_Parts_Are_Terminated function defined on
Root_ESP), and we might want to be able to abort them all en masse and
then wait for them to terminate.

A second set of thoughts related to being able to check the accessibility
level of an ESP object. The intent is that you can't allocate an object
from an ESP if the type of the object doesn't outlive the ESP object.
But we don't really have a way of doing that, since objects don't
generally have any kind of indicator of their own accessibility.
Even access parameters can lie, since they might be the result of a
'Access of something whose accessibility level isn't precisely known
(such as a tagged in-out parameter).  Even a value of a named access
type can lie, since it might be the result of a conversion from a
longer-lived access type.  So, it would seem to do this check precisely,
the implementation must in an ESP object a component that faithfully
records the accessibility level of the object.  This would be similar
to what a "tag" must record so we can correctly check the accessibility
of a tagged type on function return and on allocators.

The above two thoughts led me to a third thought, namely that an ESP
object is very close to what we have called a "master."  If we allow
task parts, then it could act as the master for those tasks.  Similarly,
we presume a master has a well known accessibility level, since we need
to compare against it, particularly in the check on function return for
functions nested in accept statements as specified by AI05-24:

    A check is made that the master of the type
    identified by the tag of the result includes
    the elaboration of the master that elaborated
    the function body.

So if we begin to think of ESP objects as a kind of "dynamic" master,
that the user can declare and apply operations to, such as wait for all
tasks of the dynamic master, finalize all objects of the dynamic master,
abort all tasks of the master, then it begins to make a bit more sense
in terms of how it relates to the language as a whole.

Similarly, when this dynamic master is finalized, so are all of the
objects associated with the master.  It would also make sense to wait
for all the tasks of the dynamic master prior to finalizing it.

To limit the places where one would have to worry about dynamic masters,
we might require some specification on an access type to indicate that
some of its values might be pointing to objects managed by a dynamic
master.  I had suggested a rep-clause for something like that, to
indicate the *type* of the ESP object (aka dynamic master) that
an access type might specify in one of its allocators.  That might
carry over further to limiting the conversion from one access type
to another, namely requiring that when converting acc type A to
acc type B, if acc type A allows ESP pools of type E in its allocators,
then acc type B must specify a type that covers E for its allocators
(with Root_ESP'Class covering them all).  (This begins to sound more
like an "operational item" than a "rep-clause.")

I guess that's enough craziness for one message...

****************************************************************

From: Robert A. Duff
Sent: Saturday, March 15, 2008  9:12 AM

> The above two thoughts led me to a third thought, namely that an ESP
> object is very close to what we have called a "master."

So we should call it Master_Pool instead of Mr_Pool.

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  4:07 PM

As previously discussed, Ada does not currently allow garbage collection
and mark/release pools of objects, because Ada does not allow early
finalization. Thus, technically any such uses are erroneous. Moreover,
if the designated type in question has non-trivial finalization, any
such use is really likely to be erroneous, when the finalization hits
non-existent memory.

These advanced memory management techniques should surely be
explicitly allowed in Ada. So, I'm proposing an Implementation Permission
to allow them. I'm providing it in the form of half of an AI.

!wording

Add after 7.6.1(20):

Implementation Permissions

  An *inaccessible object* is one that is never, other than for
  finalization, accessed again by the partition. An object declared
  in a passive partition (see E.1) is never inaccessible.

    AARM Discussion: This definition seems to require predicting the
    future, which is usually a mistake. But that is intended in this
    case, because it is attempting to provide a definition of objects
    that can be garbage collected (or released from a Mark/Release
    pool). Typically, an implementation will apply this definition
    to local objects which have no remaining references in the code,
    and to objects accessed via access values in those objects.

    AARM Reason: The second sentence is just to exclude passive
    partitions (which can be accessed by many active partitions) from
    this definition. Otherwise, the objects of a passive partition
    would always be inaccessible (as it has no execution thread), and
    that's clearly not what we want.

  If a directly allocated object Q of a type that is not immutably
  limited is inaccessible, then if the object Q would normally be
  finalized by task T by the finalization of a master M, then during
  the finalization of any master of task T that occurs prior to the
  finalization of the master M the implementation may finalize the
  object Q. After such a finalization, the object no longer exists.

    [Editor's note: "directly allocated" means one that comes from
    "new", not a subcomponent of such an object. Better wording is
     requested.]

    AARM Ramification: After the object ceases to exist, the implementation may
    recover its storage. Or the user may recover its storage via an operation of
    its storage pool (as in a Mark/Release pool). Note that such storage
    reclaiming by a storage pool is erroneous until the object ceases to exist;
    this permission allows that to happen sooner than in the canonical
    semantics. If the finalization of the object has no effect, this early
    finalization does not even require any code. [Editor's note: An additional,
    separate proposal is expected to provide a mechanism to ensure that
    finalization occurs early for a Mark/Release pool, so that the use of such
    pools can be made portable.]

    AARM Discussion: We only allow early finalizations to occur at points where
    finalizations are expected to occur anyway, and only by the same task that
    would normally finalize the object. Truly asynchronous finalizations would
    cause many new issues for programmers and implementers; imagine a
    finalization being invoked during the assignment of a global composite
    variable that is also used by the finalization routine. We don't programmers
    to have to take extra precautions to protect such objects in single tasking
    programs.

There are a lot of AARM notes on Garbage Collection in 13.11.3 which should be
updated to reflect what the language actually allows and does not allow. I'll
try to get Bob to do that.

!discussion

We exclude "immutably limited" types from this permission for two reasons:
(1) In AI95-00147, we decided that we would not allow optimization of
Finalization for limited controlled types, as these are often used for locks and
the like. It would seem weird to water down that decision and allow that
Finalization to be moved to some other point in the program.
(2) Finalization of objects containing task parts includes waiting for those
tasks to terminate. This would mean that any finalization point could
potentially be a task waiting point and essentially a blocking point. Moreover,
it could introduce deadlock where none was previously present if the object
containing the task is finalized before some resource in the parent task that it
depends on is available. This seems like a likely scenario, as the whole idea of
child tasks is that they run independently. The wording as written doesn't take
into account that active nature of tasks.

There doesn't seem to be any technical problem preventing extending this
permission to limited controlled types and to protected objects if we wish to do
that. There would need to be some wording preventing problematic task scenarios
that might cause deadlocks.

There wouldn't be any implementation problem for Janus/Ada on these, either, as
tasks waiting and protected objects are handled much like controlled objects.
But that may not be true in other implementations, particularly for the task
waiting.

---

Note that this permission does not directly allow finalizing of objects during
an allocator. This means that invoking a garbage collector as part of an
allocator is not directly supported by the language. However, it appears to me
that this would be allowed in virtually all real cases, as it is highly likely
that a master finalization would directly precede the allocator. In that case,
an "as-if" optimization would allow doing the garbage collection as part of the
allocator. I've been unable to think of any non-pathological cases where this
would not be allowed (and it would matter), but I have not tried very hard to do
so.

On the other hand, an asynchronous garbage collector (that is, one that runs on
a separate thread) would take special care, even with this permission. It could
find garbage and collect garbage with trivial finalization, but it would have to
synchronize with the task that owns the object in order to finalize it. While
this is unpleasant, the alternative would be to require most existing Ada code
that uses finalization to be rewritten (to add additional locking). That seems
like too much of a burden.

---

I considered allowing this for stand-alone objects as well. That makes sense as
an implementation could implement local composite objects as if they are all
allocated from the heap (similar to Java), and the implementation could also
garbage collect them.

However, I eventually gave up on the idea as it is possible for the finalization
of an object to depend on the finalization of some other object in the same
scope (there are examples of this in AI95-00280). It doesn't make sense to
revisit that can of worms.

The definition of "inaccessible" is designed to cover all objects, in case we
want in the future to add additional permissions.

---

In case that there was a problem with defining "inaccessible object", I checked
for all uses of the term "inaccessible" in the AARM:

AARM 3.10.2(22.t) says that the term "inaccessible" is sometimes used in
relationship to accessibility. I can find no such uses in the AARM; I think this
text should be deleted.

13.11(25.2/2) uses "inaccessible object". Previously there was no definition of
that; now there is. I don't think that there is a problem with this, but if
someone finds one, we could reword that paragraph. (Or use a different term.)

AARM 3.9.1(4.a) uses "inaccessible subprogram" informally. No issue.

E.1(7) defines "inaccessible partition", which is then used in a number of other
places in E.1 and E.3. That's not a problem.

The automatically generated annex M uses "inaccessible" based on some of the
previously discussed rules. And that's it.

-----------------------

That seems like enough for now. Send in the brickbats!!

****************************************************************

From: Robert Dewar
Sent: Friday, March 14, 2008  4:37 PM

> These advanced memory management techniques should surely be
> explicitly allowed in Ada. So, I'm proposing an Implementation
> Permission to allow them. I'm providing it in the form of half of an AI.

I would say this is premature unless someone is actually interested in
trying to implement a garbage collector. Wait till that point, any
earlier work on the issue is wasted effort in my opinion.

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  4:52 PM

As noted in the previous thread, this also affects Tucker's
Mark/Release storage pools (which are erroneous as the standard
is written). It also affects the behavior of allocators that fail
to initialize (they can't be finalized until the type goes away,
which means that the memory has to stay around a long time). That
also has come up previously.

The point is that this is a general problem that comes up in Ada
if you want to do any sort of memory management at all. The form
of that memory management does not have to specifically be garbage
collection. And surely there are people that do want to do memory
management!

****************************************************************

From: Robert A. Duff
Sent: Friday, March 14, 2008  7:32 PM

Randy, I didn't read the whole thing carefully yet.  I will, but I
wonder if you could reformat it with short lines?  Somebody's mailer
(possibly mine) wrapped them in an ugly fashion.

> As previously discussed, Ada does not currently allow garbage
> collection and mark/release pools of objects, because Ada does not
> allow early finalization. Thus, technically any such uses are
> erroneous. Moreover, if the designated type in question has
> non-trivial finalization, any such use is really likely to be
> erroneous, when the finalization hits non-existent memory.
>
> These advanced memory management techniques should surely be
> explicitly allowed in Ada. So, I'm proposing an Implementation
> Permission to allow them. I'm providing it in the form of half of an AI.

Not sure if the above is intended to become the !summary or !question
or whatever of an AI, but it seems way too negative.  Normally, I
wouldn't care about that, but garbage collection is a contentious
issue, and an easy anti-Ada point.  The truth is, nobody ever
intended to outlaw GC in Ada!  Nor Mister Pools.

How about something like this:

  The design of Ada has always been intended to allow implementations that
  support garbage collection (GC).  However, certain paragraphs of the RM seem
  to accidentally imply that GC is impossible.  Is GC allowed in Ada?  (Yes.)

  Similarly, these paragraphs seem to accidentally render certain kinds of
  user-defined storage pools, such as mark/release, technically erroneous.
  Do mark/release-like pools work in Ada?  (Yes.)

More later...

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  7:57 PM

> Randy, I didn't read the whole thing carefully yet.  I will, but I
> wonder if
> you could reformat it with short lines?  Somebody's mailer (possibly
> mine) wrapped them in an ugly fashion.

I just looked, and it's a mess in mine, too. I think I forgot to not
manually wrap the lines like I have to in an AI.

I'll try to repost it.

...
> Not sure if the above is intended to become the !summary or !question
> or whatever of an AI, but it seems way too negative.

No, it is just an introduction for the ARG members who haven't been
following the other thread. It was supposed to be one sentence, but as
you probably know, I tend to write two paragraphs when a sentence will
do. :-)

> Normally, I wouldn't care about that, but garbage collection is a
> contentious issue, and an easy anti-Ada point.  The truth is, nobody
> ever intended to outlaw GC in Ada! Nor Mister Pools.

True enough.

> How about something like this:
>
>   The design of Ada has always been intended to allow implementations that
>   support garbage collection (GC).  However, certain paragraphs of the RM
>   seem to accidentally imply that GC is impossible.  Is GC allowed in Ada?
>   (Yes.)
>
>   Similarly, these paragraphs seem to accidentally render certain kinds of
>   user-defined storage pools, such as mark/release, technically erroneous.
>   Do mark/release-like pools work in Ada?  (Yes.)

Sounds good. Are you writing this AI, too? :-) I'm sure I can find
something else to work on. :-)

****************************************************************

From: Robert Dewar
Sent: Friday, March 14, 2008  8:12 PM

> I would say this is premature unless someone is actually interested in
> trying to implement a garbage collector. Wait till that point, any
> earlier work on the issue is wasted effort in my opinion.

Thanks Randy for clarifying that the intent goes beyond just garbage
collection.

Of course if you implement your own storage pool, it is allowed to do
arbitrary magic, since there is no requirement that it be programmed
in Ada, so I am not sure that any IA is necessary, but it is perhaps
useful as a guideline of how much magic is considered OK, though
ultimately such good-taste guidelines are of limited use (sort of like
trying to legislate what are OK pragmas).

****************************************************************

From: Robert Dewar
Sent: Friday, March 14, 2008  8:16 PM

> As noted in the previous thread, this also affects Tucker's
> Mark/Release storage pools (which are erroneous as the standard is
>  written).

Not really. They may look erroneous, but if you just declare them
to be written in Ada++ whose semantics are not well documented, they
are fine, since the standard has nothing to say about Ada++.

The status of such a package is that "it works using compiler bla"
and perhaps "it does not work using compiler bla". The proposed
implementation advice/permission won't affect that status either way.

I don't see that the ARG mulling over some implementation advice or
permission has any practical or useful effect on this pool.

> The point is that this is a general problem that comes up in Ada if
> you want to do any sort of memory management at all. The form of that
> memory management does not have to specifically be garbage collection.
> And surely there are people that do want to do memory management!

Rignt, but if all they have to depend on is implementation dependent
behavior, then the standard is somewhat irrelevant, what is important
is what the implementation does, and that's unlikely to be affected
in practice by what the standard says in this area.

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  8:13 PM

Here's the repost Bob wanted; I put his introduction at the beginning rather
than mine. (I did drop the word "accidentally" from it; no reason to talk about
intent directly ("seem to imply" says it more subtly anyway.) I added a few
sentences here and there to (hopefully) clarify things a bit; it's impossible to
re-read something and not see ways to improve it. - RLB

--------------

!question

The design of Ada has always been intended to allow implementations that support
garbage collection (GC). However, certain paragraphs of the RM seem to imply
that GC is impossible. Is GC allowed in Ada?  (Yes.)

Similarly, these paragraphs seem to render certain kinds of user-defined storage
pools, such as mark/release, technically erroneous. Do mark/release-like pools
work in Ada?  (Yes.)

!wording

Add after 7.6.1(20):

Implementation Permissions

An *inaccessible object* is one that is never, other than for finalization,
accessed again by the partition. An object declared in a passive partition (see
E.1) is never inaccessible.

  AARM Discussion: This definition seems to require predicting the future, which
  is usually a mistake. But that is intended in this case, because it is
  attempting to provide a definition of objects that can be garbage collected
  (or released from a Mark/Release pool). Typically, an implementation will
  apply this definition to local objects which have no remaining references in
  the code, and to objects accessed via access values in those objects.

  AARM Reason: The second sentence is just to exclude passive partitions (which
  can be accessed by many active partitions) from this definition. Otherwise,
  the objects of a passive partition would always be inaccessible (as it has no
  execution thread), and that's clearly not what we want.

If a directly allocated object Q of a type that is not immutably limited is
inaccessible, then if the object Q would normally be finalized by task T by the
finalization of a master M, then during the finalization of any master of task T
that occurs prior to the finalization of the master M the implementation may
finalize the object Q. After such a finalization, the object no longer exists.

  [Editor's note: "directly allocated" means one that comes from "new", not a
  subcomponent of such an object. Better wording is requested.]

  AARM Ramification: After the object ceases to exist, the implementation may
  recover its storage. Or the user may recover its storage via an operation of
  its storage pool (as in a Mark/Release pool). Note that such storage
  reclaiming by a storage pool is erroneous until the object ceases to exist;
  this permission allows that to happen sooner than in the canonical semantics.
  If the finalization of the object has no effect, this early finalization does
  not even require any code. [Editor's note: An additional, separate proposal is
  expected to provide a mechanism to ensure that finalization occurs early for a
  Mark/Release pool, so that the use of such pools can be made portable.]

  AARM Discussion: We only allow early finalizations to occur at points where
  finalizations are expected to occur anyway, and only by the same task that
  would normally finalize the object. Truly asynchronous finalizations would
  cause many new issues for programmers and implementers; imagine a finalization
  being invoked during the assignment of a global composite variable that is
  also used by the finalization routine. We don't programmers to have to take
  extra precautions to protect such objects in single tasking programs.

There are a lot of AARM notes on Garbage Collection in 13.11.3 which should be
updated to reflect what the language actually allows and does not allow. I'll
try to get Bob to do that.

!discussion

We exclude "immutably limited" types from this permission for three reasons:
(1) In AI95-00147, we decided that we would not allow optimization of
Finalization for limited controlled types, as these are often used for locks and
the like. It would seem weird to water down that decision and allow that
Finalization to be moved to some other point in the program.
(2) Finalization of objects containing task parts includes waiting for those
tasks to terminate. This would mean that any finalization point could
potentially be a task waiting point and essentially a blocking point. Moreover,
it could introduce deadlock where none was previously present if the object
containing the task is finalized before some resource in the parent task that it
depends on is available. This seems like a likely scenario, as the whole idea of
child tasks is that they run independently. The wording as written doesn't take
into account that active nature of tasks.
(3) The implementation of task waiting is often different than that for
finalization. An implementation burden may result from broadening the
permission.

There doesn't seem to be any technical problem preventing extending this
permission to limited controlled types and to protected objects if we wish to do
that. There would need to be some wording preventing problematic task scenarios
that might cause deadlocks should we want to go all the way.

There wouldn't be any implementation problem for Janus/Ada on these, either, as
tasks waiting and protected objects are handled much like controlled objects.
But that may not be true in other implementations, particularly for the task
waiting.

---

Note that this permission does not directly allow finalizing of objects during
an allocator. This means that invoking a garbage collector as part of an
allocator is not directly supported by the language. However, it appears to me
that this would be allowed in virtually all real cases, as it is highly likely
that a master finalization would directly precede the allocator. In that case,
an "as-if" optimization would allow doing the garbage collection as part of the
allocator. I've been unable to think of any non-pathological cases where this
would not be allowed (and it could be detected), but I have not tried very hard
to do so.

On the other hand, an asynchronous garbage collector (that is, one that runs on
a separate thread) would need special care, even with this permission. It could
find garbage and collect garbage with trivial finalization at any point, but it
would have to synchronize with the task that owns the object in order to
finalize it. While this is unpleasant, the alternative would be to require most
existing Ada code that uses finalization to be rewritten (to add additional
locking). That seems like too much of a burden.

---

I considered allowing this for stand-alone objects as well. That makes sense as
an implementation could implement local composite objects as if they are all
allocated from the heap (similar to Java), and the implementation could also
garbage collect them.

Also, it is uncomfortable that in:
    A : Some_Type;
    B : access Some_Type := new Some_Type;
the finalization of B can be moved and A's cannot.

However, I eventually gave up on the idea as it is possible for the finalization
of an object to depend on the finalization of some other object in the same
scope (there are examples of this in AI95-00280). It doesn't make sense to
revisit that can of worms.

Moreover, memory management is needed for object allocated from storage pools;
it is not needed for stand-alone objects. At least at this point, the only
problems that have been identified have to do with memory management.

The definition of "inaccessible" is designed to cover all objects, in case we
want in the future to add additional permissions.

---

In case that there was a problem with defining "inaccessible object", I checked
for all uses of the term "inaccessible" in the AARM:

AARM 3.10.2(22.t) says that the term "inaccessible" is sometimes used in
relationship to accessibility. I can find no such uses in the AARM (nor can I
recall anyone ever having done that); I think this text should be deleted from
the AARM.

13.11(25.2/2) uses "inaccessible object". Previously there was no definition of
that; now there is. I don't think that there is a problem with this, but if
someone finds one, we could reword that paragraph. (Or use a different term.)

AARM 3.9.1(4.a) uses "inaccessible subprogram" informally. No issue.

E.1(7) defines "inaccessible partition", which is then used in a number of other
places in E.1 and E.3. That's not a problem.

The automatically generated annex M uses "inaccessible" based on some of the
previously discussed rules. And that's it.

****************************************************************

From: Robert Dewar
Sent: Saturday, March 15, 2008  8:18 AM

Unless I misunderstand, this has the general form of implementation
permission allowing a compiler to do certain things for an erroneous
program.

But any compiler is always allowed to do anything it likes for an
erroneous program, including documenting that it does a certain thing
and that you can rely on it.

So I am not sure what the point is in practice.

****************************************************************

From: Randy Brukardt
Sent: Sunday, March 16, 2008  9:23 PM

Yes, of course, but that is not exactly the case here.

The permission allows a compiler to do something early for a program
that *might* later become erroneous. That's not quite the same thing,
because the early finalization could occur even if the program does
not in fact ever execute anything that would trigger erroneous
execution. (Remember that the executions are erroneous, not programs
per se.) The fact that the finalization was done before the program
became erroneous could be detected, and would be wrong (period).

There is also value to having the permission so that people that
are writing portable Ada code know what they have to support. In
particular, if you are writing a library like Claw that is supposed
to work on a variety of Ada systems (potentially including some not
yet constructed), it's useful to know the bounds of "optimizations"
that compilers are allowed to do on finalization.

Of course, there may be a better solution out there; we've got to
start somewhere and get the discussion rolling.

****************************************************************

From: Tucker Taft
Sent: Saturday, March 15, 2008  12:35 PM

It was actually persistence that reignited my interest in storage pools
recently.  About the closest Ada 2005 has to persistence is shared passive
library units.  Shared passive units don't allow tasks or entry-ful protected
types, and generally are used to hold objects or heaps that may outlast the
execution of a single active partition. Similarly, shared passive units don't
allow controlled types, since Ada.Finalization is neither Pure nor
Shared_Passive.

If one were to try to use the kinds of "allocator-specific" storage pools we
have been talking about for persistence, one might like to somehow incorporate
the rules for shared-passive library units.  One annoying restriction for
shared-passive units is that they don't permit the declaration of an
access-to-class-wide type (E.2.1(7/1)), which is now *more* restrictive than
what pure packages permit.  This rule should probably be changed to be a
restriction on allocators, namely that the type determined for the object
created by an allocator for an access type declared in a shared-passive package
must have preelaborable_initialization, and further, should itself be declared
in a shared-passive or pure package.

There is already a restriction on conversions implicit in E.2.1(8) that would
prevent a conversion from an access type declared outside a shared-passive
package being converted to one declared inside a shared-passive package.

Another bug in E.2.1(7/1) is that it says the access type can't designate a task
type or a protected type with entry_declarations, but there is nothing
prohibiting designating a type that has a task component.  This should probably
piggy-back on the definition of preelaborable_initialization, and disallow an
access type that designates a type that does not have
preelaborable_initialization.

Actually, if the class-wide type restriction of E.2.1(7/1) is shifted to being
enforced at the point of an allocator, the task/protected-type restriction could
as well, but perhaps doing it at both points is friendliest to the programmer.
No point in declaring a type that no one can use.

Now getting back to the topic of allocator-specific storage pools and
persistence...  Currently you can't specify a storage pool for an access type
declared in a shared-passive package, since System.Storage_Pools is neither pure
nor shared-passive.  We should consider declaring it shared-passive.  We might
then have a separate sub-hierarchy of Root_Storage_Pool, say
Root_Passive_Storage_Pool (or Root_Persistent_Storage_Pool, but in either case,
Root PSP), that is specifically designed for use with access types and
allocators associated with shared-passive packages.  Unlike ESPs, PSPs would
*not* need to worry about tasks and finalization, but they *would* need to worry
about being referenced from multiple active partitions, and persisting past the
end of their executions.

I suppose rather than making System.Storage_Pools shared-passive, we could
define a Root_Passive_Storage_Pool which is shared-passive, and then define
Root_Storage_Pool to be a private extension of that.  Presumably only
descendants of the Root PSP that are themselves declared in a shared-passive
package would be considered "passive" storage pools.

In any case, an allocator for a shared-passive access type would only be allowed
to specify a passive storage pool as the allocator-specific storage pool.

So the overall rule would be that an allocator creating an object that needs
finalization must specify a pool of a type that is derived from
Root_Extended_Storage_Pool, and an allocator for a shared-passive access type
must *not* be creating an object that needs finalization, and may only specify a
pool that is a passive storage pool.

Or something like that...

****************************************************************

From: Tucker Taft
Sent: Saturday, March 15, 2008  1:06 PM

By the way, we might consider changing the definition of "needs finalization" to
be consistent with the definition for "preelaborable_initialization" by
exempting entry-less protected types.  Since entry-less protected objects can be
placed in shared-passive packages, and since they are presumed to require no
run-time elaboration, it seems that they should not be considered to "need"
finalization.

****************************************************************

From: Randy Brukardt
Sent: Sunday, March 16, 2008  9:29 PM

> It was actually persistence that reignited my interest in storage
> pools recently.  About the closest Ada 2005 has to persistence is
> shared passive library units.  Shared passive units don't allow tasks
> or entry-ful protected types, and generally are used to hold objects
> or heaps that may outlast the execution of a single active partition.
> Similarly, shared passive units don't allow controlled types, since
> Ada.Finalization is neither Pure nor Shared_Passive.

I'm not convinced that shared passive units are the best model to use for
persistence. For one thing, omitting (non-limited) controlled types seems like a
non-starter to me: it would not allow any containers, for instance.

I've always thought that the place to start was with streaming and availability,
because the existence of available attributes mean that it is possible to
"flatten" the object to some sort of backing device (and also to allow
appropriate restoring). Non-limited controlled objects do have available stream
attributes. (And limited types with available attributes also could be made
persistent - in that case, the user is deciding how to handle the messy cases,
which seems appropriate.)

Anyway, more to think about. (This does seem to be getting more complex as we
go...)

****************************************************************

From: Randy Brukardt
Sent: Sunday, March 16, 2008  9:43 PM

> By the way, we might consider changing the definition of "needs
> finalization" to be consistent with the definition for
> "preelaborable_initialization" by exempting entry-less protected
> types.  Since entry-less protected objects can be placed in
> shared-passive packages, and since they are presumed to require no
> run-time elaboration, it seems that they should not be considered to
> "need" finalization.

I've never understood how that was supposed to work. Our PT's surely need
runtime elaboration: they're all controlled objects (even if the finalization
doesn't do anything explicit, it still has to clean up the locks). So I'd be
against spreading a mistake further (the current rules don't actually *require*
no run-time actions, only "advise" it).

****************************************************************

From: Tucker Taft
Sent: Sunday, March 16, 2008  10:57 PM

There are really at least two different approaches to persistence, one which is
based on a representation that can be paged in and out of memory, and has few if
any differences between the in-memory and the on-disk representation.  At a
minimum, the in-memory and the on-disk representations take up the same amount
of space, so you can do whatever "fixups" are needed on read/write of a page "in
place." As an example of such a persistence approach, take a look at the "Thor"
system described by Barbara Liskov et al. in "Distributed Object Management in
Thor":

   http://citeseer.ist.psu.edu/liskov93distributed.html

An alternative approach is the one you suggest, which is more based on
streaming/marshalling/ serializing, where the in-memory and on-disk
representations have little or nothing to do with one another.  These generally
involve swapping objects rather than, say, pages.

I am currently more interested in the former, which maps quite nicely to the
distribution passive partition model, where something very close to the
in-memory representation persists beyond the execution of a single active
partition. For this approach, finalization could be accommodated, but it would
only be to support unchecked-deallocation, since it is not generally practical
to have to invoke the finalization operations of persistent objects when the
passive partition finally dies (if it ever does).

The second model clearly relates closely to the streaming stuff, and I agree for
that model, having controlled types works fine.

I don't see much of a connection between this second model and storage pools
(though I may be missing something obvious), whereas for the first model,
persistent/passive storage pools make sense, and would almost be essential to
make shared-passive packages useful (I suspect you find shared-passive packages
useless, but I think at least for the purposes of supporting persistence, they
could be quite useful).

****************************************************************

From: Robert A. Duff
Sent: Monday, March 17, 2008  8:18 AM

> useless, but I think at least for the purposes of supporting
> persistence, they could be quite useful).

You might be interested in this:

http://www.adacore.com/2007/12/03/ada-gem-20/

which talks about how GNAT shared passive packages support persistence.

****************************************************************

From: Robert Dewar
Sent: Monday, March 17, 2008  8:25 AM

The interesting history here is that I needed to do something to support
shared passive, and I hit on this idea of using persistence in this way,
and then decided it was a nice useful feature :-)

****************************************************************

From: Robert I. Eachus
Sent: Monday, March 17, 2008  12:49 PM

> This does seem to be getting more complex as we go...

Yes, it does.  And as such, I think that modifying all of Ada to allow for
mark-release and garbage collected storage pools is taking way too large a bite.
I'm going to make a strawman proposal then see if it can be fit back into the RM
without major harm.  I'm going to start with a (somewhat too) simple idea, then
see what it takes to make it work.

with Ada.Finalization;

package Ada.Unchecked_Finalization is
    pragma Preelaborate(Unchecked_Finalization);


    type Controlled is new Ada.Finalization.Controlled;
    pragma Preelaborable_Initialization(Controlled);


    procedure Initialize (Object : in out Controlled) is null;
    procedure Adjust     (Object : in out Controlled) is null;
    procedure Finalize   (Object : in out Controlled) is null;
    procedure Unchecked_Finalize   (Object : in out Controlled) is null;



    type Limited_Controlled is new Ada.Finalization.Limited_Controlled;
    pragma Preelaborable_Initialization(Limited_Controlled);


    procedure Initialize (Object : in out Limited_Controlled) is null;
    procedure Finalize   (Object : in out Limited_Controlled) is null;
    procedure Unchecked_Finalize   (Object : in out Limited_Controlled) is null;
private
    ... -- not specified by the language
end Ada.Unchecked_Finalization;

What does this do for us?  The first thing it does, is that the semantics for
Unchecked_Finalization are that

  Unchecked_Finalization can be called while an object is accessible*
  Unchecked_Finalization does not call Finalization or Unchecked_Finalization
    on subcomponents, and in particular
    the compiler magic to finalize components before finalizing the object
    doesn't happen.
 Calling Unchecked_Finalization on an object means that
     1) Any future references to the object are erroneous.
     2) Implementations are expected not to call Finalize for objects that have
        been Unchecked_Finalized.

Now you can have mark-release and garbage-collected storage pools of objects
derived from Unchecked_Finalization.Controlled.

Does this do everything desired?  Not really.  There are two issues.  The first
is that the user is responsible for finalization of components of types derived
from Controlled or Limited_Controlled.  Fortunately defining Unchecked_Finalize
to make the necessary calls to Finalize, and in the right order is possible--and
probably a good idea during debugging.  But...

The hope is that storage pools for Unchecked_Finalization types can wave a magic
wand and do Unchecked_Finalization for all the objects that are about to be
released.  This is especially important for mark-release storage pools.  Garbage
collecting pools are probably going to do compaction.  If so, you would want a
different package which would only support non-limited types.  In practice an
implementation would probably provide a package with a different name indicating
the intent.

Let's go back to mark-release for a bit.  If the pool is intended to use
mark-release for a set of types all derived from the same root, I would probably
for efficiency reasons specify a size for the type and the same size for all its
children.  This gives you a pretty easy implementation for the pool. (If for no
other reason than to insure that the objects do not cross cache line
boundaries.) You create an array of objects of that size, and mark and release
just move an index.  If Unchecked_Finalization is non-null for any child type,
the compiler can give a warning at compile time, and set a flag that requires
the release procedure to step through the array calling Unchecked_Finalization.
Again, if a subcomponent requires finalization, it is the programmer's job to
make that call from Unchecked_Finalization.  But note that a type now has two
finalization procedures, one that assumes that linked lists and so on do not
need to be maintained, and regular finalization which does ll that work.

Is this a complete solution to the issue?  No.  But the amount of compiler
involvement is small enough that I can see making the changes and creating
mark-release and garbage collected packages.  Then we would have a real example
to work with, and something for users that does not change erroneousness for any
existing code.  Yes, it would result in erroneousness if a programmer extended a
type with unchecked finalization using a type that needs finalization, and
didn't put a call to Finalize in the Unchecked_Finalization routine.  But that
is a very localized extension of erroneousness.

* Big deal, you can do that with Finalize, and in some cases the implementation
will even call Finalize several times on the same object.

****************************************************************

From: Randy Brukardt
Sent: Monday, March 17, 2008  2:07 PM

...
> There are really at least two different approaches to persistence, one
> which is based on a representation that can be paged in and out of
> memory, and has few if any differences between the in-memory and the
> on-disk representation.  At a minimum, the in-memory and the on-disk
> representations take up the same amount of space, so you can do
> whatever "fixups"
> are needed on read/write of a page "in place."
> As an example of such a persistence approach, take a look at the
> "Thor" system described by Barbara Liskov et al. in "Distributed
> Object Management in Thor":
>
>    http://citeseer.ist.psu.edu/liskov93distributed.html

OK, but that also maps to a sort of streaming. In fact, Storage_IO provides that
streaming. (Indeed, I think that the streaming should have been done that way in
the first place.)

You do need access values to be implemented as relative references in order for
that to work practically (the alterntive of trying to record every access part
in the partition quickly becomes a nightmare, plus rather expensive to do
fixups). Which seems like a fairly big change.

...
> I am currently more interested in the former, which maps quite nicely
> to the distribution passive partition model, where something very
> close to the in-memory representation persists beyond the execution of
> a single active partition.
> For this approach, finalization could be accommodated, but it would
> only be to support unchecked-deallocation, since it is not generally
> practical to have to invoke the finalization operations of persistent
> objects when the passive partition finally dies (if it ever does).

There is more to controlled types than just finalization, of course. Initialize
and especially Adjust are just as important. And there is value to types that
work everywhere (Finalize is called when the object ceases to exist, which may
be never for a persistent object).

There doesn't seem to be any need to call Finalize on a persistent partition,
assuming that it is defined to live forever. (If an extra-implementation means,
such as a file delete, is used to get rid of it, we've not going have any reason
to worry about preserving the semantics.)

We've talked in the past at trying to figure out the difference between Finalize
routines that are purely memory management (and as such, don't really need to be
called at the library level when the program is going away anyway) and those
that do something critical (like release a resource). The latter kind shouldn't
be used in a persistent anything (don't want to lock a resource forever), but
the former is just fine.

The real problem with this approach seems to be that you have to have types that
are crafted to work for persistence (or are scalar). That limits the utility
compared to the streaming approach where the only requirement is that streaming
work (which is a good idea anyway).

...
> I don't see much of a connection between this second model and storage
> pools (though I may be missing something obvious), whereas for the
> first model, persistent/passive storage pools make sense, and would
> almost be essential to make shared-passive packages useful (I suspect
> you find shared-passive packages useless, but I think at least for the
> purposes of supporting persistence, they could be quite useful).

They *are* useless, as far as a language-defined persistence mechanism, because
they support neither controlled types nor OOP (no access-to-classwide). We have
way too many either you get this or you get this but not both things in Ada as
it is. Saying that you can have persistence OR OOP but not both is surely not
going to help advance Ada.

There would be value to a language-defined way to do persistence, but it would
have to be fairly general (at least all non-limited types). And I definitely
agree that storage pools have nothing much to do with persistence, because the
only things you could put into a persistent storage pool are those that are
self-flattened. That's way too limiting (or very burdensome on the
implementation).

If someone wants to create such pools for their own use, that's fine. But I
don't see much value into twisting the language definition around solely to make
it easier to create persistent pools that could never be part of the language
definition as they are so limiting.

****************************************************************

From: Tucker Taft
Sent: Friday, June 13, 2008  2:48 PM

Here is the start of an AI on per-allocator pools.
Still pretty sketchy in parts.

[This is version /01 of the AI - Editor.]

****************************************************************

From: Robert A. Duff
Sent: Friday, June 13, 2008  4:20 PM

Interesting ideas.

I see we've discussed similar stuff in these threads:

Subject: [ARG] Specifying pool on allocator
Subject: [ARG] Persistent storage pools; E.2.1(7/1)

[Editor's note: These threads are filed in this AI.]

****************************************************************

From: Tucker Taft
Sent: Staurday, June 14, 2008  7:17 AM

I had some better ideas on how to deal with "subpool"
access values that are components.  See below.

[This is version /02 of the AI - Editor.]

****************************************************************

From: Tucker Taft
Sent: Saturday, October 25, 2008  2:02 PM

I presented some of the ideas associated with per-allocator pools to the "container"
working group recently at a meeting we held to discuss new kinds of containers, and
as a result refined my thinking on this topic quite a bit.  I realize it is past the
deadline, but it seems worth providing this latest incarnation, so we don't end up
discussing out-of-date ideas.

This is presumably pretty low on the priority list in any case, so this might end up
being more of an over-lunch or in-the-bar discussion anyway.

I ended up throwing away all of the !wording section, since I felt it was
premature, so you will find a !proposal, a !discussion, and an Example.

[This is version /03 of the AI - Editor.]

****************************************************************

From: Tucker Taft
Sent: Sunday, February 15, 2009  5:29 PM

Here is an update to my AI on breaking up a storage pool into "subpools" with
different allocators for the same access type allocating from different
subpools. The changes are due to further reflection on comments provided
by Randy and Steve on an earlier version.
The main changes are:

    1) Composite assignment requires component-by-component
       accessibility checks for components having a multi-subpool
       access type (the earlier approach was too limiting).

    2) Composite types that are passed by reference that
       contain multi-subpool access values will need to
       pass an implicit parameter identifying their master.

    3) I have put more emphasis on the notion of a default
       subpool, so an implementation could provide much
       of the proposed capability without relying on the extended
       syntax for allocators.

    4) I have recommended that subpools with outstanding
       weak references not be reclaimed immediately, but
       rather remain unreclaimed as long as memory is
       "plentiful," to avoid each application having to
       provide its own "caching" task to prevent immediate
       reclamation of such subpools.

[This is version /04 of the AI - Editor.]

****************************************************************

Questions? Ask the ACAA Technical Agent