Version 1.1 of ai05s/ai05-0111-1.txt

Unformatted version of ai05s/ai05-0111-1.txt version 1.1
Other versions for file ai05s/ai05-0111-1.txt

!standard 11.6(5)          08-08-08 AI05-0111-1/00
!class Amendment 08-08-08
!status work item 08-08-08
!status received 08-03-12
!priority Medium
!difficulty Hard
!subject Specifying a pool on an allocator
!summary
(See proposal.)
!problem
!proposal
!wording
(** TBD **)
!discussion
!example
!ACATS test
!appendix

From: Tucker Taft
Sent: Wednesday, March 12, 2008  8:03 AM

One thing I keep running into when trying to do more interesting things
with storage pools is the inability to specify the storage pool at the
point of the allocator.  Obviously this only makes sense for general
access types, but given that they can point to objects in different
pools, why not be able to specify the pool on allocation?

To work around this problem, we currently use some seriously groddy,
thread-unsafe mechanisms that essentially specify a "virtual" storage
pool on the access type, and then store into that storage pool a
pointer to the "real" storage pool to use the next time an allocator is
evaluated.

Wouldn't it be a whole lot more reasonable and thread safe to allow
something like:

     X : Gen_Acc_Type := new (pool-to-use) T'(initial_value);

That is, allow the specification of the pool as part of the syntax for
an allocator.

I'll create an AI on this if there is any sympathy for the approach.

****************************************************************

From: Robert Dewar
Sent: Wednesday, March 12, 2008  8:07 AM

I think this would definitely be useful, almost more useful than specifying
on types! In practice all access types are general (I regard pool specific
allocators as a bizarre corner feature of the language that is almosst never
used :-)

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008  9:12 AM

Glad to hear I am not the only one who might find this useful.  I suppose
there ought to be a corresponding variant of Unchecked_Deallocation that
takes a storage pool as an explicit parameter, e.g.:

   generic
      type Object(<>) is limited private;
      type Name is access all Object;
   procedure Ada.Unchecked_Deallocation_From_Pool
      (X : in out Name;
       Pool : in out System.Storage_Pools.Root_Storage_Pool'Class);

This is probably less critical, since it is more common for heap objects to
keep track of what pool they come from.  Furthermore, one of the main reasons
to take over allocation is so you can avoid ever calling unchecked
deallocation.  But it still seems like something like the above should be
available if one can specify an allocator-specific pool.

(Note that I have limited the actual access type to be a general access type
with the use of "all".)

****************************************************************

From: Robert A. Duff
Sent: Wednesday, March 12, 2008 11:41 AM

> I think this would definitely be useful,...

I agree with Tucker and Robert.

I also agree with Tucker's other note about Unchecked_Deallocation.

Maybe I'll be accused of making "best" the enemy of "good enough",   ;-)
but along these lines, I'd also like a way to define a default pool for
all access types in a scope, and/or a way to force all access types in a
scope to have a Storage_Pool clause.

>...almost
> more useful than specifying on types! In practice  all access types 
>are general (I regard pool  specific allocators as a bizarre corner 
>feature  of the language that is almosst never used :-)

Historical note: Tucker wanted to make all access types "general", so there
would be no need for the extra "all" syntax.
I convinced him to keep the Ada-83 kind.

I've since changed my mind: pool-specific access types are useful, but not
useful enough to warrant the added language complexity of having two kinds
of access types.  Tucker was right all along.

I am also responsible for the bogus term "pool-specific".
The issue is only vaguely related to pools.  But I confused myself into
thinking that pools are more-or-less the same thing as what Ada 83 calls a
"collection".  They're not.

My fault.

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008 12:08 PM

Your memory of history is better than mine.
One thing I do remember, though, was that from an optimizer point of view,
pool-specific access types are preferable, since they don't need to worry
about updates to (non-heap) aliased objects or updates through unrelated
access types.

And if you are going to be using unchecked-deallocation with your own
storage pools, if you only associate a given pool with pool-specific access
types, it gives you an extra level of comfort that the pointer passed to a
pool's deallocate routine was produced by a call on the allocate routine for
that pool.

I also know that the Rational compiler was able to use smaller integers to
represent pool-specific access values if they had an associated Storage_Size.

So I suppose I have changed *my* mind too, that pool-specific access types
aren't all bad... ;-)

But the bigger point in my mind is that associating a pool with an access type
isn't what I usually want to do.  I want to associate a pool with some set
of objects that all have the same lifetime (or more precisely, the same
moment of "death").  And that rarely matches the lifetime of the access
types involved.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  1:47 PM

...
> I am also responsible for the bogus term "pool-specific".
> The issue is only vaguely related to pools.  But I confused myself 
> into thinking that pools are more-or-less the same thing as what Ada 
> 83 calls a "collection".  They're not.

Would you care to elaboration on "they're not"? I've always viewed pools
as a more powerful version of collections. (Indeed, collections are
implemented as a special predefined kind of storage pool in Janus/Ada.)
If I'm wrong in some fundamental way, this would be as good of a time
as any to find out.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  2:36 PM

> One thing I keep running into when trying to do more interesting 
> things with storage pools is the inability to specify the storage pool 
> at the point of the allocator.  Obviously this only makes sense for 
> general access types, but given that they can point to objects in 
> different pools, why not be able to specify the pool on allocation?
...
>
> Wouldn't it be a whole lot more reasonable and
> thread safe to allow something like:
>
>      X : Gen_Acc_Type := new (pool-to-use) T'(initial_value);
>
> That is, allow the specification of the pool as part
> of the syntax for an allocator.

My "language purity" alarm went off with this one. Something about the
solution seems wrong; not sure I can explain it. It seems unsafe, in that
there would be no way to tell what pool something was allocated from, and
thus deallocation would be extremely dangerous.

I'm definitely dubious that the problem is the inability to specify a
storage pool on an allocator. That seems more like a possible solution to a
problem, rather than the problem itself. I'm sure there are problems in this
area, but I guess I'd like to lay out the problems before solving them (if
for no other reason than to ensure that we've really solved them all).

One technical question on this proposal: is this intended to work with
anonymous access types? That would seem to make allocators of them useful,
but also would introduce a number of interesting new issues (such as, can an
allocator with a specified pool be a coextension? If so, you can have
coextensions with parts in different pools, and if not, then it becomes even
less obvious if a particular object is a coextension).

> ... I suppose there ought
>to be a corresponding variant of Unchecked_Deallocation
>that takes a storage pool as an explicit parameter...
>
>This is probably less critical, since it is
>more common for heap objects to keep track of what
>pool they come from. ...

I've never done that, and this is the first time I can recall anyone saying
that they did that. So I'm dubious about the "more common" part. I've
considered having Janus/Ada do that internally for mutable types (it would
solve a number of obscure bugs and leaks), but that wouldn't help the user
any.

>...
>Furthermore, one of the main
>reasons to take over allocation is so you can avoid
>ever calling unchecked deallocation.

Humm, now that seems to be a problem, since it is impractical to do that in
Ada as it stands. (I've given up trying.) Either you end up with a huge
number of type conversions to some global general access type or you end up
with a global pool that isn't able to do any early deallocations. I've
generally used pools to catch double deallocations and the like, and/or to
make allocations more efficient by optimizing a particular size into an
array (other sizes being passed on to the main pool).

My first thought was that anonymous access types are the right solution for
this (as they are the sort of short-lived type needed) but that doesn't
actually work, because you'd still need all of the type conversions. (Which,
BTW, is one of the reasons I hate them so; they don't even solve the problem
that they were advertised to solve [eliminating excess type conversions].
And you can't specify a pool. So you can't actually use them in practice (or
they don't add anything), so why have them?)

So I'm not sure what the right answer is.

> And if you are going to be using unchecked-deallocation
> with your own storage pools, if you only associate
> a given pool with pool-specific access types, it gives you an
> extra level of comfort that the pointer passed
> to a pool's deallocate routine was produced by
> a call on the allocate routine for that pool.

It's a lot more than comfort -- in an absence of Unchecked_Conversions into
a pool-specific type, it is a guarantee. (And there shouldn't be any such
conversions; we'll raise Constraint_Error on the use of any such value if
the type uses the default pool - so it won't work anyway. I'd ban such U_Cs
outright if the language allowed me to do that.)

For the main pool, we can check for deallocations from the stack and raise
Program_Error if they're attempted; that provides (almost) the same level of
safety for a general access type using the default pool. But once you mix
user-defined pools into the type, you're playing Russian roulette if you are
using Unchecked_Deallocation.

Anyway, I don't like the idea, but perhaps there is no better solution to
the problem(s).

****************************************************************

From: Robert A. Duff
Sent: Wednesday, March 12, 2008  2:37 PM

> Could you care to elaboration on "they're not"? I've always viewed 
> pools as a more powerful version of collections. (Indeed, collections 
> are implemented as a special predefined kind of storage pool in 
> Janus/Ada.) If I'm wrong in some fundamental way, this would be as
> good of a time as any to find out.

As I recall, in Ada 83, the "collection" of an access type just means
the set of all objects allocated by allocators returning that type.  (I'm
ignoring derived access types.)  "Collection" is a fairly high-level
concept.  If we had kept the term in Ada 95, I suppose we would use it
to talk about finalization.

"Storage pool", OTOH, is a low-level memory-management concept.  Many
unrelated access types can share the same storage pool; hence it's not
the same thing as "collection".  If we take Tucker's suggestion, then
the reverse is also true -- many storage pools per collection.  (In
fact, through torturous programming, one can already achieve that
in Ada 95, as Tucker hinted.)

I suspect you're using "collection" to refer to 'Storage_Size clauses.
As such, I don't think you're doing anything wrong (except perhaps
misusing the term ;-) ).

...all of which reminds me: if we take Tucker's suggestion, it's worth
thinking about finalization, and perhaps awaiting task termination.
One useful thing to do with pools is to avoid Unchecked_Dealloc on
individual objects.  The pool collects objects together into "memory
pages" or whatever, and the entire pool can be wiped out with one call
(either explicit, or triggered by finalization of the pool).  Makes
deallocation very efficient, and very simple (if all objects in the
pool have roughly the same lifetime).  But this doesn't quite work
if the objects have finalization.  (And of course formally, ALL
objects have finalization according to the RM -- one must hope
that the implementation doesn't actually do anything if the
finalization is a no-op.)

Not a big deal, but worth thinking about.

****************************************************************

From: Stephen W Baird
Sent: Wednesday, March 12, 2008  2:08 PM

> I also know that the Rational compiler was able to use smaller 
> integers to represent pool-specific access values if they had an 
> associated Storage_Size.

You're going way, way back with this one.
Back in the days when Rational's Ada development environment was
hosted on proprietary hardware, the default representation for a
value of an access type was as a (bit) offset relative to the
beginning of a chunk of memory associated with the access type.
A storage_size representation clause could restrict the range of
these offsets and would therefore allow a smaller representation
for access values. This, coupled with the R1000's aggressive bit
packing policies, meant that a storage_size specification for an
access type with no other accompanying black magic could result
in an array of access values having a component size of, say, 23 bits.

Getting back to the original question about specifying a pool for
an allocator, it might be worth noting that Rational felt the need
to invent an Ada83 solution to this problem. Like many pragma-based
solutions to problems which really require language support, it was
somewhat ugly. The solution included a pragma Heap, which occurred
immediately after a statement or declaration which included an
allocator, and which specified the "storage pool" (although that
was not the terminology that was used at the time) for the
allocator(s). There is no need to point out the various problems
with this approach.

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008  3:41 PM

> My "language purity" alarm went off with this one. Something about the 
> solution seems wrong; not sure I can explain it. It seems unsafe, in 
> that there would be no way to tell what pool something was allocated 
> from, and thus deallocation would be extremely dangerous.

This is already an issue in Ada 95, because any general access value
could have originated from any allocator associated with any access type
that has a compatible designated type, or for that matter from 'Access
of an appropriate level aliased object.  Presuming we restrict this to
(named!) general access types, it is not introducing any more danger
than already exists.  Any work-around for this is going to be just as
dangerous, presuming you use some kind of level of indirection in your
storage pool, where you change the "real" storage pool being used during
execution.

> I'm definitely dubious that the problem is the inability to specify a 
> storage pool on an allocator.

I'm not. ;-)  I am a big believer in using something approximating a
mark/release approach to allocation, or equivalently, a "region-based"
approach. That is, you allocate objects from a region or heap one at
a time, and then you throw them all away in "one swell foop."  This
approach may not be too useful for some kinds of applications, but
applications that are vaguely like a compiler (which most of the
ones I develop are), generally build up a tree one node at a time,
and then at some point have no need for the entire tree.  Walking the
tree just to free it one node at a time is a pain, and probably means
that you need to have decent a defragmenting storage management system,
etc.

When it comes to freeing the tree, I'm not freeing
*all* trees, just the entire tree associated, say, with a particular
compilation unit.  There is no natural way in Ada to associate that one
particular tree with a storage pool, unless you play thread-unsafe games
with levels of indirection, etc.

If this capability existed, then I could have a factory routine which
took the pool as a parameter, or perhaps retrieved it from the "header"
of the overall tree, or wherever.  It would then use that pool to do the
allocation.

> ...
> That seems more like a possible solution to a
> problem, rather than the problem itself. I'm sure there are problems 
> in this area, but I guess I'd like to lay out the problems before 
> solving them (if for no other reason than to ensure that we've really solved them all).

The problem is that you want the pool to be associated with a set of objects,
not with a set of access types.

Another place this comes up is with containers.  I really want to pass in the
container to use when I create a hash table.  Different hash tables originating
from the same generic instantiation might very well want to use different
storage pools.  Essentially I want the storage pool to be identified via
an access discriminant of the container.  When the hash table goes away,
I can easily reclaim all the storage used to represent it.

> One technical question on this proposal: is this intended to work with 
> anonymous access types?

It depends on the kind of anonymous access types.  A coextension is allocated
from the same storage pool as that used for the enclosing object, so if that
object were allocated using this new capability, then all the coextensions
would be allocated from the pool specified in the allocator.  I suppose
specifying a storage pool an allocator used to initialize an access
discriminant would be a way of effectively breaking the coextension
relationship, just as though you had wrapped the allocator in a qualified
expression for some named access type.

If we are talking about an anonymous allocator passed to an access parameter,
then that uses a local storage pool that goes away when the subprogram
returns.  I can't see much value in overriding that, though I suppose there
might be reasons to do so.

If we are talking about the new kinds of anonymous access types introduced
in Ada 2005 for components, then I suppose it could be useful.

If we are talking about access return types, then the current AI proposes
that a return of an anonymous allocator uses the storage pool determined
by the context of use.  I suppose overriding the pool in that case again
would be roughly analogous to putting a qualified expression around the
allocator for some particular named access type, again breaking any
coextension relationship that might have been produced in the absence
of the explicit pool specification.

> ... That would seem to make allocators of them useful,
> but also would introduce a number of interesting new issues (such as, 
> can an allocator with a specified pool be a coextension? If so, you 
> can have coextensions with parts in different pools, and if not, then 
> it becomes even less obvious if a particular object is a coextension).

Part and parcel of the definition of a coextension is that it has the
same lifetime of the object that designates it.
Hence, as suggested above, if you specify an explicit pool you would
be preventing a coextension relationship, just as if you had wrapped
the allocator in a qualified expression for a named access type.
 
>> ... I suppose there ought
>> to be a corresponding variant of Unchecked_Deallocation that takes a 
>> storage pool as an explicit parameter...
>>
>> This is probably less critical, since it is more common for heap 
>> objects to keep track of what pool they come from. ...
> 
> I've never done that, and this is the first time I can recall anyone 
> saying that they did that.

Well it is quite common in the C++ world that "delete" is designed to
work no matter what "arena" the object is allocated from.  In fact,
there isn't much choice, since "delete" can't be passed any extra
parameters in C++.  Somewhat analogously, "free" in C doesn't take a
length.  The length is squirreled away in front of the object usually.

> ...
> So I'm dubious about the "more common" part. I've
> considered having Janus/Ada do that internally for mutable types (it 
> would solve a number of obscure bugs and leaks), but that wouldn't 
> help the user any.

I'm not proposing to *require* it, but it is always an option for
a user-defined storage pool.  In fact, the whole point of proposing the
additional version of unchecked deallocation is to put the burden
on the user rather than the implementor of the storage pool when that
is appropriate.  In Ada 95, the user has to decide what type to use
for Unchecked_Deallocation, even if there is a lot of
access-type-to-access-type conversion being performed in the
application.  I know our implementation of finalization of
heap-allocated objects as part of Unchecked_Deallocation goes out
of its way to survive in the presence of access-type-to-access-type
conversion.
 
>> ...
>> Furthermore, one of the main
>> reasons to take over allocation is so you can avoid ever calling 
>> unchecked deallocation.
> 
> Humm, now that seems to be a problem, since it is impractical to do 
> that in Ada as it stands. (I've given up trying.)

Maybe you are misinterpeting what I am saying.
We pretty much never call Unchecked_Deallocation in our static analysis
tool.  We rely on en-masse deallocation of an entire storage pool to
reclaim storage before the access type goes out of scope.

Of course this doesn't work if the objects need any sort of
finalization, so we have to be careful about that.  Sometimes we even
go to the trouble of calling Unchecked_Deallocation on each object
that might need finalization, even though we know the underlying pool
actually does nothing when you call "Deallocate."

> ... 
> Anyway, I don't like the idea, but perhaps there is no better solution 
> to the problem(s).

It may be you have never encountered the issue.

But in my experience, eliminating overhead in storage management is
absolutely critical to the performance of our kind of tool.  Garbage
collection is much too slow (even the high end, real-time, incremental,
blah, blah, blah versions), and even malloc/free are significantly
slower, due to the fragmentation problems, first-fit vs. best-fit
list walking, etc.

Only something approximating mark/release have produced the kind of
performance we require.

And then we end up wasting some of that on silly levels of indirection
through virtual storage pools.

****************************************************************

From: Robert A. Duff
Sent: Wednesday, March 12, 2008  6:48 PM

> That is, you allocate objects from a region or heap one at a time, and 
> then you throw them all away in "one swell foop."  This approach may 
> not be too useful for some kinds of applications, but applications 
> that are vaguely like a compiler (which most of the ones I develop 
> are), generally build up a tree one node at a time, and then at some 
> point have no need for the entire tree.

Right.  I'd like to point out that this "tree" (or whatever data structure
it is) consists of many different record types, linked together via many
different access types.  And the same access types are used in
different trees -- that's why per-type pools don't work in this case.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  7:55 PM

> This is already an issue in Ada 95, because any general access value 
> could have originated from any allocator associated with any access 
> type that has a compatible designated type, or for that matter from 
> 'Access of an appropriate level aliased object.  Presuming we restrict 
> this to (named!) general access types, it is not introducing any more
> danger than already exists. Any work-around for this is going to be just
> as dangerous, presuming you use some kind of level of indirection in your 
> storage pool, where you change the "real" storage pool being used 
> during execution.

I agree that a workaround is just as dangerous, that's why I want to look
for a better solution that is not dangerous.

And while I agree that Ada 95 has this problem, it is one that is very
bounded to named general access types with a user-defined storage pool
(or with type conversions from a pool-specific access type that has such
a pool whose memory is allocated in the main heap) *and* that has an
instance of Unchecked_Deallocation.

As I discussed previously, in Janus/Ada, using Unchecked_Deallocation on
the default pool raises Program_Error if given an access value outside of
the main heap. Thus, using Unchecked_Deallocation on a general access type
(that uses the default pool) is almost as safe as using it in a
pool-specific type. (The only new issue is that a access-to-an-aliased-component
of a heap object would sneak through.)

[The reason for this check is to avoid heap corruption long enough to provide
an error message; otherwise, we've seen programs where the exception handling
mechanism fell over because of the corruption. Besides, we're already
tracking the heap bounds for the dereference check for pool-specific types
(an Ada 83 feature that we've preserved because we didn't want moving to
Ada 95 to lose functionality) and thus the check is essentially free.]

It would make perfect sense to warn on the declaration of such an instance
of Unchecked_Deallocation for a general access type that has a
user-defined pool. Indeed, I'm not sure why we don't have such a warning:
"Use of this deallocator is potentially unsafe as deallocation of objects
that were not originally allocated cannot be detected." Probably just
never thought about this in detail.

(Indeed, it would be possible to make many user-defined storage pools safe
in this way as well, but it would not be clear whether that sort of thing
is best handled by the compiler or by the user-defined pool.)

However, with your proposal, this danger now spreads to all (named) general
access types, because it would take full program analysis to prove or
disprove whether there is a danger. That means that this is no longer
something that the compiler can help with (and giving the above warning
on all general access types is not helpful).

...
> > One technical question on this proposal: is this intended to work 
> > with anonymous access types?
>
> It depends on the kind of anonymous access types. ...
[Lots more text].

You could have just said is "Yes". ;-) I've pretty much convinced myself
that if it is a good idea at all, it ought to be allowed on anonymous
access types - it would make them a lot more flexible.

> > ... That would seem to make allocators of them useful, but also 
> > would introduce a number of interesting new issues (such as, can an 
> > allocator with a specified pool be a coextension? If so, you can 
> > have coextensions with parts in different pools, and if not, then it 
> > becomes even
> > less obvious if a particular object is a coextension).
>
> Part and parcel of the definition of a coextension is that it has the 
> same lifetime of the object that designates it.
> Hence, as suggested above, if you specify an explicit pool you would 
> be preventing a coextension relationship, just as if you had wrapped 
> the allocator in a qualified expression for a named access type.

This is what I hate about coextensions. You mentioned at the last meeting
that the way Janus/Ada allocates objects means that all objects with
dynamic-sized array components act like coextensions. And that is true,
except for one major difference: *every* such object acts the same way.
There is no wishy-washy "maybe it is and maybe it isn't" and changing
a couple of characters here or there changes the status.

With the semantics you are suggesting, you couldn't even tell
syntactically whether it is or is not a coextension (it would depend on
what pool is given in the allocator). Yuck. (Yuck is not strong enough
of a word for this, but I don't want to spend 10 minutes finding a
thesaurus in order to find a better word...)

> > ... So I'm dubious about the "more common" part. I've considered 
> > having Janus/Ada do that internally for mutable types (it
> > would solve a number of obscure bugs and leaks), but that wouldn't
> > help the user any.
>
> I'm not proposing to *require* it, but it is always an option for a 
> user-defined storage pool.

I certainly would consider doing it as part of the implementation;
I can't stomach the lack of safety and at least that would reduce it.
(It's annoying that we'd have to check and raise an exception rather
than doing the right thing, but I suppose that would be for the best -
it would be more portable.)

> In fact,
> the whole point of proposing the additional version of unchecked 
> deallocation is to put the burden on the user rather than the 
> implementor of the storage pool when that is appropriate.

I think it makes just as much sense to put the burden on the compiler.
Why are we messing with all of this horrible accessibility stuff if
then we're going to immediately turn around and say "it's all erroneous
anyway!" If Ada is the language for safe programming, we need to act
that way.

It would make sense to allow the user to suppress such checks (as
with all language-defined checks), but to have the default to be
unsafe seems bad.

> In Ada 95, the user has
> to decide what type to use for Unchecked_Deallocation, even if there 
> is a lot of access-type-to-access-type conversion being performed in 
> the application.  I know our implementation of finalization of 
> heap-allocated objects as part of Unchecked_Deallocation goes out of 
> its way to survive in the presence of access-type-to-access-type 
> conversion.

You have to do that! Not sure how that "is going out its way", unless
you want to appeal to erroneousness and allow unrelated stuff to fail.
(You'd be right vis-a-vis the letter of the standard, but users wouldn't
be too happy.)

> >> ...
> >> Furthermore, one of the main
> >> reasons to take over allocation is so you can avoid ever calling 
> >> unchecked deallocation.
> >
> > Humm, now that seems to be a problem, since it is impractical to do 
> > that in
> > Ada as it stands. (I've given up trying.)
>
> Maybe you are misinterpeting what I am saying.

No, you stopped reading before I explained what I meant. I think we
actually agree. I said that it takes heroic efforts to do that (and I'm not
willing to go to those efforts), and you said (elsewhere) that it takes you
heroic efforts to do so (and that's why you're proposing this language
change). The only difference is that you are willing to go to those efforts
and I'm not.

...
> > Anyway, I don't like the idea, but perhaps there is no better 
> > solution to the problem(s).
>
> It may be you have never encountered the issue.

Say what? What does that have to do with not liking the idea? Just
because there is a problem that is hard to solve in Ada doesn't mean
that we should throw out all of the safety built-into the language and
particular implementations of the language in order to solve the problem.

I'm not against solving the problem. I just don't like this particular
form of the proposed solution to the problem. Sheesh.

> But in my experience, eliminating overhead in storage management is 
> absolutely critical to the performance of our kind of tool.  Garbage 
> collection is much too slow (even the high end, real-time, 
> incremental, blah, blah, blah versions), and even malloc/free are 
> significantly slower, due to the fragmentation problems, first-fit vs. 
> best-fit list walking, etc.
> Only something approximating mark/release have produced the kind of 
> performance we require.
> And then we end up wasting some of that on silly levels of indirection 
> through virtual storage pools.

OK. But claiming that "you've never encountered the problem" has nothing to
do with this.

Besides, I *have* encountered the problem, way back in the early days of Ada
83 and Janus/Ada. I see that I solved it (in our particular case) on
October 18, 1985. The solution is the obvious one, in fact: if
Unchecked_Deallocation is too slow, then don't call it!

We put the nodes (more than 5 different kinds) on a bounded free chain
when we're done with them. (It's bounded so that an unusual program doesn't
soak up all of the memory in one kind of node, but that rarely happens.)
It only takes 5 machine instructions to do so, so it can hardly be a
noticeable overhead until there are billions of nodes. Moreover, allocation
off of the free list is actually faster than normal allocation, so there
is a win on that end as well. The bounds were set by experimentation
with lots of test programs, so that exceeding the bounds is rare.

When the compiler pass terminates, the entire program exits (each pass
is a separate program), so there is no need to every actually release
the nodes on the free chain. We do have an "emergency dump-all" routine that
dumps all of the free chains if we run out of memory allocating nodes
(that occurred frequently back on the MS-DOS host with 640K of RAM, but I
doubt it has ever happened on any of the larger hosts).

Indeed, we tried a mark-release scheme early on (it was built-into the
Pascal compiler we started with). It was just a huge tangle of bugs, we
were forever walking into stuff that had been deallocated. We eventually
gave it up as not worth the hassle.

Obviously, this scheme wouldn't work as well if we needed to do major mode
shifts during the compiler passes, but even then it would probably work
pretty well (it cut the number of deallocations in our compiler by a
factor of 100). And the reuse of the nodes reduces the number of crashes
due to dangling pointers, but probably increases the number of latent
bugs caused by using a reused node instead.

Still, obviously one size does not fit all when it comes to memory
allocation. I'm just not interested in sacrificing some safety to solve
this (or any other, for that matter) problem.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  8:06 PM

> Right.  I'd like to point out that this "tree" (or whatever data 
> structure it is) consists of many different record types, linked
> together via many different
> access types.  And the same access types are used in different trees 
> -- that's why per-type pools don't work in this case.

Same here. I detailed how I solved that (in 1985!) for Janus/Ada.
Perhaps if I would have had storage pools back then, I'd have
tried a different solution. But in all honesty, the simpler solution
is good enough, and I've reused it in a number of other contexts
since with success.

That is to say, I'm not sure that "storage pools" is the right
abstraction to solve this problem. I realize that once you have
them, the temptation is great to map everything into them. Anyway,
some food for thought.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  8:40 PM

...
> As I recall, in Ada 83, the "collection" of an access type just means 
> the set > of all objects allocated by allocators returning that type.
> (I'm ignoring derived access types.)  "Collection" is a fairly high-level 
> concept.

Humm, I didn't recall that. (The term is not in the Ada 83 index, which
doesn't help any.) The Ada 83 Rationale uses the term the way you describe,
but I can't find anything in the Ada 83 standard that does. (Indeed, the
only uses of the term I can find are in 4.8(9) and 13.2(8), which is
defining 'Storage_Size.)

> If we had kept the term in Ada 95, I suppose we would use it to talk 
> about finalization.

We put it back (in a sense), to talk about finalization. See 7.6.1(11/2).

> "Storage pool", OTOH, is a low-level memory-management concept. Many
> unrelated access types can share the same storage pool; hence it's not
> the same thing as "collection".  If we take Tucker's suggestion, then
> the reverse is also true -- many storage pools per collection.
>  (In fact, through torturous programming,
> one can already achieve that in Ada 95, as Tucker hinted.)

I guess you're right if you take a narrow view of the meaning of
"collection". That is, that it is the objects of only a single type.
Period. Surely that is true in Ada 83 (no overlapping collections is
mentioned in the Rationale). But I see this as simply something that
was expanded in Ada 95 (multiple types can share their collections -
surely this is true for derived access types). And that the term was
changed to be more accurate.

> I suspect you're using "collection" to refer to 'Storage_Size clauses.  
> As such, I don't think you're doing anything wrong (except perhaps 
> misusing the term ;-) ).

That was the only visible meaning of the term in Ada 83.

> ...all of which reminds me: if we take Tucker's suggestion, it's worth
> thinking about finalization, and perhaps awaiting task termination.
> One useful thing to do with pools is to avoid Unchecked_Dealloc on
> individual objects. The pool collects objects together into
> "memory pages" or whatever, and the entire pool
> can be wiped out with one call (either explicit, or triggered by
> finalization of the pool).  Makes deallocation very efficient, and
> very simple (if all objects in the pool have roughly the same
> lifetime).  But this doesn't quite work if the objects have
> finalization.  (And of course formally, ALL objects
> have finalization according to the RM -- one must hope that the
> implementation doesn't actually do anything if the finalization is a
> no-op.)
>
> Not a big deal, but worth thinking about.

Actually, it is a very big deal (assuming non-trivial finalization).
The current rules ensure that the pool always outlives the type, so
finalization is a non-problem. More generally, the current Ada rules
require that the *memory* always outlives the Ada objects, so there
never is a problem with dangling pointers during finalization.

This change completely destroys that property (at least it does in
the intended use that Tucker is suggesting). Now, obviously, we can
simply declare that if the pool goes away early, the program is
erroneous. But that means that any attempt to solve Tucker's problem
is going to necessarily be erroneous. We could try to formally define
what "trivial finalization" is, and declare that if the pool goes
away early, the program is erroneous. But that's a maintenance
hazard and means that you can't use this mechanism generally (if you
are using types from some outside [independently developed] library,
you can't know or depend on the sort of finalization that it uses).

A better solution would be to define some mechanism to force early
finalization. (After all, I'm on record as saying that the vast
majority of new types should have non-trivial finalization.) But if
we have to do that, perhaps we could combine that with the
storage pool so that we don't have to lose the safety that we
currently have.

Not sure what the solution needs to be. But what I don't want to
see is lots of new ways to write erroneous programs. (Or old ways
that are ten times more likely to occur, either.)

One last point: we still need to fix Ada finalization to allow
garbage collection. Perhaps there is some way to do both (they
surely are closely related). At least if compilers were not
*required* to do something impossible (finalize an object after
the memory is gone), a "safe" compiler could take steps to
eliminate the problem. (Rather than crashing or raising an
exception that could not be avoided.)

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008  9:38 PM

> One last point: we still need to fix Ada finalization to allow garbage 
> collection. Perhaps there is some way to do both (they surely are 
> closely related). At least if compilers were not *required* to do 
> something impossible (finalize an object after the memory is gone), a 
> "safe" compiler could take steps to eliminate the problem. (Rather 
> than crashing or raising an exception that could not be avoided.)

Good point.  One approach to avoiding the finalization issue is to
require that storage pools used with this new syntax be required to
have some additional operations related to finalization.  In our
Ada 95 compiler, for each access type that might designate objects
requiring finalization, we maintain a (circularly-linked) list of
allocated objects that need finalization.  An allocator adds the
object to the list if it needs finalization, and
Unchecked_Deallocation removes the object from the list if
it needs finalization.   When the access type itself is finalized,
we walk the list finalizing all of the objects that remain on it.

If any early memory reclamation is done on pools, in Ada 95 or with
this new proposal, you have to hope you don't have objects that
need finalization, because that circularly-linked list is going to
have dangling references.  If instead we expected storage pools to
provide operations to add and remove objects from such a list, and
we require the storage pool object to never outlive the access types
whose objects it stores, so that when the storage pool itself is
finalized, it can free any remaining objects, then we can avoid
this dangling reference problem.  Furthermore, if the storage pool
doesn't want to handle objects needing finalization, it can simply
raise an exception if the "add-to-finalization-list" operation is
called.  Mark/release can then be handled safely with respect to
finalization, by simply doing a "mark" on this finalization list,
and at release finalizing all of the objects added to the list
since the mark.

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008  9:13 PM

> I agree that a workaround is just as dangerous, that's why I want to 
> look for a better solution that is not dangerous.
> 
> And while I agree that Ada 95 has this problem, it is one that is very 
> bounded to named general access types with a user-defined storage pool 
> (or with type conversions from a pool-specific access type that has 
> such a pool whose memory is allocated in the main heap) *and* that has 
> an instance of Unchecked_Deallocation.

I'm not sure I follow.  In Ada 95 I can convert to any general access type
with the same designated subtype and compatible accessibility level,
whether or not it has a user-defined storage pool.  I guess I don't
see the added danger with this proposal.

Are you associating the added danger with the allocator or with
Unchecked_Deallocation_From_Pool?
Both of these would be explicit in the source code.
There is no chance a user could use them by mistake.
And it seems that both of these new capabilities are equivalent
(ignoring thread-safety) to something that could be accomplished in Ada 95
using a level of indirection in a storage pool, plus a couple of access
type conversions.

> ...
> It would make perfect sense to warn on the declaration of such an 
> instance of Unchecked_Deallocation for a general access type that has 
> a user-defined pool. Indeed, I'm not sure why we don't have such a 
> warning: "Use of this deallocator is potentially unsafe as 
> deallocation of objects that were not originally allocated cannot be 
> detected." Probably just never thought about this in detail.
> ...
> 
> However, with your proposal, this danger now spreads to all (named) 
> general access types, because it would take full program analysis to 
> prove or disprove whether there is a danger.

I'm not sure I see why you need more sophisticated analysis for this
proposal.  The type conversions can be buried anywhere in Ada 95.

> ...
> With the semantics you are suggesting, you couldn't even tell 
> syntactically whether it is or is not a coextension (it would depend 
> on what pool is given in the allocator). Yuck.

Sorry, that wasn't what I was trying to say.  If you specify a storage
pool explicitly for the allocator, it would *not* be a coextension, even
if it happened to specify the same pool.

> ...
> I'm not against solving the problem. I just don't like this particular 
> form of the proposed solution to the problem....

For me the "problem" is that I can't associate a pool with a particular
allocator, without using a thread-unsafe approach.

How would you express the "problem" that you perceive in this area?

>> But in my experience, eliminating overhead in storage management is 
>> absolutely critical to the performance of our kind of tool. ...
> ...
> 
> Still, obviously one size does not fit all when it comes to memory 
> allocation. I'm just not interested in sacrificing some safety to 
> solve this (or any other, for that matter) problem.

I guess I don't see this as particularly affecting pointer safety
either way, since we have already said that Unchecked_Deallocation
is the users' responsibility to get right, both in terms of dangling
references and in terms of pools.

I am trying to solve a problem with thread-safety, without
significantly changing anything else.

If you could explain the problem as you perceive it, and at least
the general characteristics of a solution you would prefer, that
would help me better understand your view.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  10:51 PM

> Good point.  One approach to avoiding the finalization issue is to 
> require that storage pools used with this new syntax be required to 
> have some additional operations related to finalization.  In our Ada 
> 95 compiler, for each access type that might designate objects 
> requiring finalization, we maintain a (circularly-
> linked) list of allocated objects that need finalization.  An 
> allocator adds the object to the list if it needs finalization, and 
> Unchecked_Deallocation removes the object from the list if
> it needs finalization.   When the access type itself is finalized,
> we walk the list finalizing all of the objects that remain on it.

Yes, our compiler works similarly.

> If any early memory reclamation is done on pools, in Ada 95 or with 
> this new proposal, you have to hope you don't have objects that need 
> finalization, because that circularly-linked list is going to have 
> dangling references.

"Hope" doesn't cut it for language design! :-)

> If instead we expected storage pools
> to provide operations to add and remove objects from such a list, and 
> we require the storage pool object to never outlive the access types 
> whose objects it stores, so that when the storage pool itself is 
> finalized, it can free any remaining objects, then we can avoid this 
> dangling reference problem.

Don't you mean the inverse? Currently, a pool has to outlive the access
type. Surely we can't invert that or we'd have nasty compatibility problems.

I'd presume that these "enhanced" pools would derived from
Root_Storage_Pool (as this is a nice abstract limited controlled type),
and it would be nasty to have different lifetime requirements for
Root_Storage_Pool and Enhanced_Storage_Pool. Especially consider the
problems making a legality check for Root_Storage_Pool'Class!

    package System.Enhanced_Storage_Pools is
         type Enhanced_Storage_Pool is new System.Storage_Pools.Root_Storage_Pool with private;

         procedure Save_Object_for_Finalization (Pool : in out Enhanced_Storage_Pool;
                  Storage_Address : in Address);

         procedure Finalize_Saved_Object (Pool : in out Enhanced_Storage_Pool;
                  Storage_Address : in Address);
    end System.Enhanced_Storage_Pools;

I would not be happy if these were unrelated types, since this is exactly
the sort of place where dispatching and inheritance works to maximum
advantage. That is, there would be no problem using "enhanced" storage pools
in contexts that also expect regular storage pools - so they should be related
types.

Not sure what the best solution is. If the type retained its own list, then there
is no problem having it call the pool operations as needed. But that seems like
overkill (both the user-defined pool and the language-defined type would have
to keep lists of objects). If the type tries to use the pool's list, then
objects no longer go away when the type does: that seems like a significant
change in semantics.

Maybe the answer is that if the allocator uses a "normal" pool, it goes onto
the type's list, and if the allocator uses an "enhanced" pool, it goes on the
pool's list. In the latter case, there is no defined relationship between
the access type and the object lifetime: that's the pools job.

Humm, this would seem to fix the garbage collection problem, too. Just specify
an "enhanced" pool for your access type (or explicitly in the allocators),
and you can finalize when you want (but no later than when the pool is
finalized). That couldn't be incompatible, as there are no "enhanced" pools
now.

> Furthermore, if the storage pool doesn't want to handle objects 
> needing finalization, it can simply raise an exception if the 
> "add-to-finalization-list" operation is called.

Yup. Probably would come in handy for some algorithms.

> Mark/release can then be handled safely with respect to finalization, 
> by simply doing a "mark" on this finalization list, and at release 
> finalizing all of the objects added to the list since the mark.

Yup. The only remaining issue is how to tie this to processes defined by
the standard that work like finalization (i.e. task waiting and protected
objects). Specifically, the pool has an address of an object to finalize:
how does it in fact finalize that object?

Implementation-wise, it seems easy. (For Janus/Ada, at least.) Just pass
it to the low-level library routine that finalizes an object. (That grabs
the tag out of location 0 [there must be one if it is a finalizable
object], and either dispatches appropriately, or if the tag is a
"magic marker" tag, do the operation required by that marker - such as
waiting for tasks or freeing a memory block.)

But how best to represent this operation so that the user code can call it?
Surely they must do so in order to finalize any remaining objects. All
I can think of is a magic routine in System.Enhanced_Storage_Pools:

      procedure Finalize_Object_at_Address (Storage_Address : in Address);
          -- Finalizes the object at Storage_Address, calling Finalize if
          -- needed, waits for tasks, does protected object finalization, etc.

This is not very satisfying.

P.S. This is more fun than fixing bugs in the Standard (like trying to
figure out how accessibility should work)! :-)

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  11:53 PM

> I'm not sure I follow.  In Ada 95 I can convert to any general access 
> type with the same designated subtype and compatible accessibility 
> level, whether or not it has a user-defined storage pool.  I guess I 
> don't see the added danger with this proposal.

If I was worried about such conversions, I could trivially add a
warning to them as well. But they're not often a problem; many
user-defined pools in fact allocate memory from the stack and thus
the check would catch them, too. Only user-defined pools that use
memory allocated from the main heap can sneak through.

> Are you associating the added danger with the allocator or with 
> Unchecked_Deallocation_From_Pool?
> Both of these would be explicit in the source code.
> There is no chance a user could use them by mistake.
> And it seems that both of these new capabilities are equivalent 
> (ignoring thread-safety) to something that could be accomplished in 
> Ada 95 using a level of indirection in a storage pool, plus a couple 
> of access type conversions.

The added danger is that an Unchecked_Deallocation (any kind) could
never know what pool the object might have come from.

Currently, the object (in the absence of type conversions) can *only*
have come from the defined pool or from some other place (usually the
stack). Thus, a simple check on the deallocation that the address belongs
to the defined pool will detect 98% of the misuses.

Note that I'm am not saying that our current implementation is 100% safe,
because it's not. (Among other things, it will miss pointers created into
the middle of heap objects.) But it doesn't need to be in order to add
substantial value (this is the same as the dangling pointer checks that
we do on containers: they're 98% effective, but that in practice catches
virtually all real problems eventually, and they're a lot cheaper than
checks that are 100% effective).

Obviously, you still can write code that is unsafe (we can't detect
double deallocations, for instance), but by catching a significant
percentage of problems, the buggy code gets detected earlier in testing
-- usually before any significant damage occurs.

...
> > However, with your proposal, this danger now spreads to all (named)
> > general access types, because it would take full program analysis to
> > prove or disprove whether there is a danger.
>
> I'm not sure I see why you need more sophisticated analysis for this 
> proposal.  The type conversions can be buried anywhere in Ada 95.

Type conversions don't cause a real problem, because hardly anyone
would go to the lengths that you did to use storage pools. If I have
to use a lot of type conversions, I generally give up and make
everything the same type.

In any case, you could warn on any type conversion that could cause a
problem (I think they are rare enough that wouldn't usually be a major
annoyance). It's only a problem for a conversion into a general access
type that has the default pool (and thus would not warn on the
instantiation of Unchecked_Deallocation). Other conversions would have
no reason to be warned. Warning on every use of an allocator that has
a storage pool would rightly be considered noise by a user (most would
say, of course I used the feature, if you can't suggest how to avoid the
warning, don't bother giving one!).

Of course, if there was some way to figure out that a particular
user-defined pool was "safe", the warning could be avoided, but I can't
imagine a good way to do that.

...
> > ...
> > I'm not against solving the problem. I just don't like this 
> > particular form
> > of the proposed solution to the problem....
>
> For me the "problem" is that I can't associate a pool with a 
> particular allocator, without using a thread-unsafe approach.

That's not a problem, that's a solution to a problem.

> How would you express the "problem" that you perceive in this area?

The user need that you described: the ability to conveniently manage
the memory lifetime of a data structure, possibly made up of multiple
types and multiple access types, separately from the lifetime of those
types.

I can grant that the solution might involve storage pools, but they
can only be part of a solution, never part of a problem.

...
> > Still, obviously one size does not fit all when it comes to memory 
> > allocation. I'm just not interested in sacrificing some safety to 
> > solve this (or any other, for that matter) problem.
>
> I guess I don't see this as particularly affecting pointer safety 
> either way, since we have already said that Unchecked_Deallocation is 
> the users' responsibility to get right, both in terms of dangling 
> references and in terms of pools.

That's of course a cop out. This user rarely gets these "user's
responsibility" things exactly right, and this implementer tries to
help out this user find out where he screwed up. And neither of me
much like losing a tool in this particular battle.

To take a current example: my spam filter crashes with a bad memory
reference roughly once every 30 days. Because one of the versions
of the crash is the dereference of a null pointer in code that can't
have a null pointer, I suspect a memory management problem. (Whether
it is with Windows, with my code, or with the Janus/Ada runtime code,
I don't know). I'd like to find the problem, but how can you do that
with something that happens once every 30 days (which is typically
longer than the time before Windows needs to reboot!). The only
thing I can do is make it less likely that I made a mistake, thus
extra checking on Unchecked_Deallocation at all levels.

> I am trying to solve a problem with thread-safety, without 
> significantly changing anything else.

Given the finalization problems that we've noted elsewhere, I don't
think that is possible. And if a solution is not at least as safe (and
preferably, much more safe) than existing code, than I'm not very
interested.

> If you could explain the problem as you perceive it, and at least the 
> general characteristics of a solution you would prefer, that would 
> help me better understand your view.

Well, the problem you described is being able to get a better handle
on memory management of large abstractions. It's not really about
objects per-se, but rather a collection of objects that make up
an instance of a single abstraction.

The specific deficiency of Ada in this area is that the only facilities
provided require tying of pools to the lifetime of access types. And
those are impossible to control with the needed granularity.

As far as solutions goes, I really don't know. (This isn't my problem,
after all.) I guess I would like to see the ability to tie pools to
an entire abstraction. One of the problems with your solution is that
you have to repeat it on every node of your abstraction -- passing
the pools everywhere, and putting them on every allocator. Forget one,
(or heaven forbid, use the wrong one), and the whole thing collapses
into a leaky mess (or worse).

One thought would be to have a way to specify a pool for the entire
abstraction. That would mean that that pool would be the default for
any allocations made as part of the abstraction. But I admit I don't
see how that would work in user-defined code.

---

On a detail level, I think you may have already addressed my concerns
when you suggested a special kind of storage pool to handle finalization
properly. It would be easy enough to add some implementation-defined
stuff into those fancier pools in order to preserve most of the
current level of safety. (In particular a check that you don't
deallocate an address that you didn't allocate.) Moreover, there
would be even less reason to interconvert general access types, so a
warning on conversions would be more effective.

And there would be less need to use Unchecked_Deallocation, so less
chance for problems. Still, there would be some loss of safety because
a user-defined pool allocating out of the general heap (and deallocating
an object of it by mistake) is a lot more likely than trying to
deallocate an object that is part of a larger heap object.

Which is all to say that I don't know. I still find that going from
98% accuracy to (say) 70% isn't a good thing. (If I could find a
cheap way without a memory footprint to detect dangling pointers, I'd
implement it immediately for similar reasons.)

****************************************************************

From: Robert A. Duff
Sent: Thursday, March 13, 2008  8:02 AM

> ...

(Somewhere you mentioned lifetimes, but I can't find the quote to reply
to.) Anyway, in Ada 95/2005, the pool object has a longer lifetime
than the access type(s).  But that does NOT prevent dangling pointers,
because you can have a pool operation that frees the memory prematurely.
For example, Tucker likes mark/release pools (type MR_Pool, which I
pronounce "Mister Pool" ;-)) -- Release frees memory.

> Type conversions don't cause a real problem, because hardly anyone 
> would go to the lengths that you did to use storage pools. If I have 
> to use a lot of type conversions, I generally give up and make
> everything the same type.

You can't make them the same type if they have different designated types.
That's a common case when using tagged/class-wides.

> To take a current example: my spam filter crashes with a bad memory 
> reference roughly once every 30 days. Because one of the versions of 
> the crash is the dereference of a null pointer in code that can't have 
> a null pointer, I suspect a memory management problem. (Whether it is 
> with Windows, with my code, or with the Janus/Ada runtime code, I 
> don't know). I'd like to find the problem, but how can you do that 
> with something that happens once every 30 days (which is typically 
> longer than the time before Windows needs to reboot!). The only thing 
> I can do is make it less likely that I made a mistake, thus extra
> checking on Unchecked_Deallocation at all levels.

You could try a language (or implementation?) with garbage collection.  ;-)

> Given the finalization problems that we've noted elsewhere, I don't 
> think that is possible. And if a solution is not at least as safe (and 
> preferably, much more safe) than existing code, than I'm not very interested.

I was surprised, at first, to find that dangling pointers and leaks were
LESS of a problem using the storage pools Tucker is talking about.  I
think this is because you don't have to worry about freeing individual
objects (tracking object "ownership" or whatever on a per-object basis).
You just decide which pool to put it in, which is a fairly coarse-grained
decision (all the objects that come from input file X belong in the
pool for X).  And then they get freed wholesale at some point (e.g. when
we notice that the user has deleted file X).

Those (Ada 95) pools require a global variable ("the current pool"),
which is bad for the usual reasons -- passing the pool to "new" would
solve those problems.

> As far as solutions goes, I really don't know. (This isn't my problem, 
> after all.) I guess I would like to see the ability to tie pools to an 
> entire abstraction. One of the problems with your solution is that you 
> have to repeat it on every node of your abstraction -- passing the 
> pools everywhere, and putting them on every allocator. Forget one, (or 
> heaven forbid, use the wrong one), and the whole thing collapses into a
> leaky mess (or worse).

I briefly mentioned a solution to this earlier.  I want to say "for all
access types declared in this package, and all its children, the default
pool is X". I would make X be a pool that always raises an exception on
Allocate. (Added bonus -- make sure the compiler _knows_ that, so it can
warn. It's basically the same as the "for T'Storage_Size use 0" pool.)
Then every "new" had better have a specified pool.

> One thought would be to have a way to specify a pool for the entire 
> abstraction.

Not sure what you mean by "abstraction", here.  Whatever it is, it needs
to be dynamic.

> On a detail level, I think you may have already addressed my concerns 
> when you suggested a special kind of storage pool to handle 
> finalization properly.

That's one approach.  Another acceptable approach would be to say that
allocating any object with nontrivial finalization (or with tasks) in
these prematurely-finalized pools is erroneous.  I don't think it's
hard to define "nontrivial finalization", but I think it needs to be
determined at run time for class-wide objects to work properly.

> Which is all to say that I don't know. I still find that going from 
> 98% accuracy to (say) 70% isn't a good thing. (If I could find a cheap 
> way without a memory footprint to detect dangling pointers, I'd 
> implement it immediately for similar reasons.)

It's not hard to make a pool type such that you can find out which pool
a given address points into.  Use BiBOP ("Big Bag Of Pages").  Memory
footprint is one word per memory "page" (where "page" = 1K or something).
I.e. approximately zero.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 13, 2008  2:19 PM

> (Somewhere you mentioned lifetimes, but I can't find the quote to 
> reply to.)
> Anyway, in Ada 95/2005, the pool object has a longer lifetime than the 
> access type(s).  But that does NOT prevent dangling pointers, because 
> you can have a pool operation that frees the memory prematurely.
> For example, Tucker likes mark/release pools (type MR_Pool, which I 
> pronounce "Mister Pool" ;-)) -- Release frees memory.

Correct, but such use of a storage pool is explicitly erroneous by
13.11(21). As an implementer, I'd be justified in deleting Boston if
Tucker does that. (This, I presume, is a reason that Tucker likes to
use his own compiler. ;-)

> > Type conversions don't cause a real problem, because hardly anyone 
> > would go to the lengths that you did to use storage pools. If I have
> > to use a lot of type conversions, I generally give up and make
> > everything the same type.
>
> You can't make them the same type if they have different designated types.
> That's a common case when using tagged/class-wides.

If they have different designated types, they're not convertable. As
far as tagged types go, I use a global "access to root class" for
everything, and explicitly convert into a particular specific/classwide
type if I need to access components or operations that are only available
on a derived type. That doesn't happen very often if your design is good
(dispatching does most of those conversions for you).

The net effect is, I rarely use access conversions; I only use view
conversions on the dereferenced access type. And I generally use only a
single global access type for the entire hierarchy. Other approaches seem
to cause a lot more explicit conversions.

> > To take a current example: my spam filter crashes with a bad memory 
> > reference roughly once every 30 days. Because one of the versions of 
> > the crash is the dereference of a null pointer in code that can't 
> > have a null pointer, I suspect a memory management problem. (Whether
> > it is with Windows, with my code, or with the Janus/Ada runtime code,
> > I don't know). I'd like to find the problem, but how can you do that
> > with something that happens once
> > every 30 days (which is typically longer than the time before 
> > Windows needs to reboot!). The only thing I can do is make it less
> > likely that I made a
> > mistake, thus extra checking on Unchecked_Deallocation at all levels.
>
> You could try a language (or implementation?) with garbage collection.  
> ;-)

Rewriting 2 Megabytes of working Ada into another language is not practical
(or smart!). And Ada does not allow garbage collection on types with
non-trivial finalization (you'd have to finalize too early to collect
the garbage - that's a problem that ought to be fixed), so that's out, too.

...
> > As far as solutions goes, I really don't know. (This isn't my 
> > problem, after
> > all.) I guess I would like to see the ability to tie pools to an 
> > entire abstraction. One of the problems with your solution is that 
> > you have to repeat it on every node of your abstraction -- passing 
> > the pools everywhere,
> > and putting them on every allocator. Forget one, (or heaven forbid, 
> > use the
> > wrong one), and the whole thing collapses into a leaky mess (or worse).
>
> I briefly mentioned a solution to this earlier.  I want to say "for all access
> types declared in this package, and all its children, the default pool is X".
> I would make X be a pool that always raises an exception on Allocate.
> (Added bonus -- make sure the compiler _knows_ that, so it can warn.
> It's basically the same as the "for T'Storage_Size use 0" pool.) Then 
> every "new" had better have a specified pool.

That's the sort of idea I had in mind. Thanks for mentioning it again.

> > One thought would be to have a way to specify a pool for the entire 
> > abstraction.
>
> Not sure what you mean by "abstraction", here.  Whatever it is, it 
> needs to be dynamic.

I'm not sure, either! That's why I was so vague. I don't see an obvious
way to map "abstraction" into language-defined terms like object and type.

> > On a detail level, I think you may have already addressed my 
> > concerns when
> > you suggested a special kind of storage pool to handle finalization 
> > properly.
>
> That's one approach.  Another acceptable approach would be to say that 
> allocating any object with nontrivial finalization (or with tasks) in 
> these prematurely-finalized pools is erroneous.  I don't think it's 
> hard to define "nontrivial finalization", but I think it needs to be 
> determined at run time for class-wide objects to work properly.

No, that is NOT an acceptable approach. For two reasons:

(1) I don't think a significant segment of the Ada community would stand
for a feature adding more erroneousness to Ada, especially in a common case.
(We went through this with the containers, after all, which is why we added
various checks to them over Matt's objections.) Tucker's argument that
the basic feature doesn't add any more erroneousness for
Unchecked_Deallocation to the language is probably sound (I don't like it
because it adds more erroneousness to our implementation, but that's
a different issue). But making finalization erroneous where it previously
was safe seems like an extension of erroneousness, and that is bad.

(2) Many common types have non-trivial finalization. I like to say that
all new types should be controlled, which while going a bit too far, is
still true for many projects. Even if you remove memory management from
the equation (and I'm not convinced that is a good idea), there still
are many other reasons to do things when objects go away. One example:
Claw puts active top-level windows on a chain. When a top-level window
is destroyed, it better be removed from that chain!

Probably more importantly to you, the language-defined containers are
controlled types. A solution that doesn't let you use the containers seems
very bad.

> > Which is all to say that I don't know. I still find that going from 
> > 98% accuracy to (say) 70% isn't a good thing. (If I could find a 
> > cheap way without a memory footprint to detect dangling pointers, 
> > I'd implement it immediately for similar reasons.)
>
> It's not hard to make a pool type such that you can find out which 
> pool a given
> address points into.  Use BiBOP ("Big Bag Of Pages").  Memory 
> footprint is one
> word per memory "page" (where "page" = 1K or something).  I.e.
> approximately zero.

I don't see how that would detect many dangling pointers (most of them
point into the main pool, after all, and almost certainly into pages
that are valid [since freeing pages is rare; we only do it for huge
single allocations that are deallocated]). Maybe I misunderstand the
algorithm.

It would help a bit on the deallocation from the wrong pool problem,
but what we do for the main heap right now would work nearly as well
(keeping track of the largest and smallest addresses previously
allocated). And that doesn't take much space, either. But since
dereferences don't know what pool they came from, you can't use that
on them (other than on pool-specific types).

Indeed, detecting dangling pointers in general access types is
very expensive, because you can only do it if you know what pool
the pointer supposedly was allocated from (user-defined pools would
need a checking routine as part of the pool, and aliased objects
couldn't be checked at all). That means carrying an indication
of the pool along with the access value.

****************************************************************

From: Tucker Taft
Sent: Thursday, March 13, 2008  3:28 PM

One thing that might alleviate some of the concerns you exoressed
would be to require some kind of rep clause on an access type to
allow it to be used with allocator-specific pools.  For example,
you might be required to specify the type of the pool objects being
passed, e.g.:

    for Acc_Type'Allocator_Pool_Type use My_Storage_Pool_Type;

Then you could only pass pools of type "My_Storage_Pool_Type"
to an allocator of type Acc_Type.  Of course you could say
"Root_Storage_Pool'Class" and you would have full flexibility (and
no checking).

If we want anonymous access types to fit into this scheme, we might
want to turn it around somewhat for that and specify whether a
particular storage pool type could be used with an anonymous allocator,
e.g.:

    pragma Allow_As_Anonymous_Allocator_Pool(My_Storage_Pool_Type);

or some such thing.

****************************************************************

From: Robert A. Duff
Sent: Thursday, March 13, 2008  5:08 PM

> I don't see how that would detect many dangling pointers...

Sorry, I wasn't clear.  The BiBOP technique can allow you to tell
(from the address) which pool it points into.  That may or may not
be part of dangling-pointer detection, or schemes for doing various
other things.

> Indeed, detecting dangling pointers in general access types is very 
> expensive, because you can only do it if you know what pool the 
> pointer supposedly was allocated from (user-defined pools would need a 
> checking routine as part of the pool, and aliased objects couldn't be 
> checked at all). That means carrying an indication of the pool along 
> with the access value.

The BiBOP technique means you don't have to carry the pool with
each access value, in order to tell which pool it came from.  Of
course, it only works for suitably-programmed pools -- whether
user-defined or part of the implementation.

****************************************************************

From: Tucker Taft
Sent: Thursday, March 13, 2008  9:30 PM

>> (Somewhere you mentioned lifetimes, but I can't find the quote to 
>> reply to.)
>> Anyway, in Ada 95/2005, the pool object has a longer lifetime than 
>> the access type(s).  But that does NOT prevent dangling pointers, 
>> because you can have a pool operation that frees the memory prematurely.
>> For example, Tucker likes mark/release pools (type MR_Pool, which I 
>> pronounce "Mister Pool" ;-)) -- Release frees memory.
> 
> Correct, but such use of a storage pool is explicitly erroneous by 
> 13.11(21). As an implementer, I'd be justified in deleting Boston if 
> Tucker does that. (This, I presume, is a reason that Tucker like to 
> use his own compiler. ;-)

I don't quite follow this (and please don't delete Boston on my behalf ;-).
13.11(21) says you can't "... use the storage for any other purpose
while the pool element remains in existence."  Clearly after a "release"
on a pool, the pool elements whose storage is freed no longer exist,
just like with Unchecked_Deallocation.  So what makes it erroneous?
Are you saying it is erroneous for an object that "needs finalization"
to go out of existence without having finalization performed? That I
would agree with you, though I'm not sure I know what paragraph of
the manual would back us up.  I guess 7.6.1(10) implies that the
only way for an object to cease to exist early is via Unchecked_Deallocation.
That is actually in redundancy brackets, but it seems like a bit of a
stretch, given the admitted possibility of Mark/Release storage pools
as exemplified in 13.11.

This whole area seems a bit confused.

In any case, the manual clearly implies that Mark/Release pools are
not erroneous, since they are used in the Example in 13.11.

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  12:13 AM

> > Correct, but such use of a storage pool is explicitly erroneous by 
> > 13.11(21). As an implementer, I'd be justified in deleting Boston if
Tucker
> > does that. (This, I presume, is a reason that Tucker like to use his 
> > own compiler. ;-)
>
> I don't quite follow this (and please don't delete Boston on my behalf 
> ;-).  13.11(21) says you can't "... use the storage for any other 
> purpose while the pool element remains in existence."  Clearly after a 
> "release" on a pool, the pool elements whose storage is freed no 
> longer exist, just like with Unchecked_Deallocation.

Not according to Ada. You and I argued about the meaning of "exists"
in Ada some years ago, and you convinced me that I was wrong and that it
is in fact a technical term. You can't flip-flop now... (well, it is an
election year, so I suppose you can. ;-)

> So what makes it erroneous?
> Are you saying it is erroneous for an object that "needs finalization" 
> to go out of existence without having finalization performed? That I 
> would agree with you, though I'm not sure I know what paragraph of the 
> manual would back us up.  I guess 7.6.1(10) implies that the only way 
> for an object to cease to exist early is via Unchecked_Deallocation.
> That is actually in redundancy brackets, but it seems like a bit of a 
> stretch, given the admitted possibility of Mark/Release storage pools 
> as exemplified in 13.11.

7.6.1(11/2) says that objects only cease to exist *after* they are
finalized. No redundancy brackets there. Since all objects are (logically)
finalized, and the only permission to clobber an object early is via
Unchecked_Deallocation, clearly the objects continue exist until the
master goes away. Thus the storage pool violates the third sentence
of 13.11(21), and execution is erroneous.

Another way to say this is that the actions of a random user-defined
storage pool (which could be anything) should not change the underlying
semantics of the language. Surely it should not be changing the lifetime
of objects. Otherwise, you'll have a nightmare for program analysis and
understanding. (See P.S.)

> This whole area seems a bit confused.

There is nothing confused here, the semantics are clear and straightforward.
They may not be what you think they should be, but that's different.

> In any case, the manual clearly implies that Mark/Release pools are 
> not erroneous, since they are used in the Example in 13.11.

I always thought that example meant that Release was called after the
type Acc went away. Anything else would clearly be erroneous. I see by
looking at it carefully that this is not clear. But in any case, this is
not normative.

This is another symptom of the "no garbage collection" problem. We have
no permission for early finalization of *any* object in Ada as it stands
 - anything that violates that is technically erroneous (even if not
erroneous in practice).

I've on a number of occasions suggested adding Implementation Permission
to allow early finalization in limited cases at limited points.
("Limited cases" means no early task waiting, and only for objects
that are otherwise inaccessible; "limited points" means at any point
where finalization would normally occur. Those limits should make it
possible to support garbage collection, mark/release pools, and anything
else in between without making program analysis intractable.) But there
never seems to be a lot of interest in tackling this problem (one of
the worst in Ada today, I think). [Keep in mind that recovering memory
from a failed allocator for an object with a controlled part is currently
wrong for this reason; I've written an ACATS-style test to test that --
I could issue it...]


P.S. Allowing existence to be controlled by storage pools would pretty much
mean that 7.6.1 can be thrown away and compilers can do anything they want.
Moreover, it would prevent a lot of static analysis, because it would be
hard to rule out a change in existence caused by a pool.

Consider the following:

    type Acc is access ...
    for Acc'Storage_Pool use MR_Pool;

    O : Acc;

    A := O.all; -- (1)
    null; -- (2)
    P (O.all); -- (3)


Can a tool or compiler assume that O.all is the same between uses (1)
and (3)? The answer is (amazingly) no. Another task could cause O.all to go
away at (2), and the program becomes erroneous.

Now, a compiler can use that erroneousness to allow it to make the assumption
anyway, because the "anything goes" rule of erroneous execution surely
includes assuming (3) is the same as (1). But I don't see that the same
applies to a static analysis tool. If P's parameter is in out, making
this assumption could overlook a major program disaster (overwriting
random memory). That would greatly damage the purpose of static analysis
in the first place.

Note that (2), it is perfectly OK for another task to access MR_Pool
and call Release. This task is not using it in any way at this point. Only
13.11(21) makes it erroneous to use the object later. (13.11.2(16) does
not apply - the object still exists.)

Currently, a static analysis tool could prove that there is no problem
by ensuring that no other task ever calls Unchecked_Deallocation on
a type that uses MR_Pool. While that wouldn't necessarily be easy,
it seems tractable. (Much of the time there would be no instance of UD
accessible in other tasks, which would prove the premise.)

However, allowing any random operation on a pool to change the existence
of objects means that a tool would need a much wider net: essentially
any possible touching of a pool (even a call to Allocate) could cause
some object elsewhere in the program to cease to exist. That would
seem to be very hard for a static analysis tool to disprove: it would
have to figure out the intent of any calls into the pool.

Consider:

    A := O.all; -- (1)
    B := Allocated_Size(MR_Pool); -- (2)
    P (O.all); -- (3)

A tool would have to assume that O.all could be killed by the user-defined
pool operation Allocated_Size. That's even though it is fairly obvious to
a human reader that the intent is just to get information from the pool.
The only way to avoid such an assumption would be analyze how
Allocated_Size worked, and since there are as many pool designs as
programmers, I'm not sure that would be very practical. (After all,
Release(MR_Pool) probably does nothing except set the value of a
ariable of type Address - it wouldn't do any memory operations itself.
How to tell that from a benign statistics gathering routine is not
obvious to me.)

Besides, such a rule is not addressing the real problem, which is that
early finalization is not allowed in Ada. Since all objects are
(logically) finalized, and surely the object has to exist in order that
it can be finalized, any attempt to release the memory early is going
to have be erroneous. I do agree that this (in the case of trivial
finalization) is an example of what Robert called "things that will
work fine in practice, but are still technically erroneous and cause
language lawyers to go tsk-tsk". The trouble is that trying to
separate trivial finalization from other finalization just makes
the problem more annoying: now you can do it legally with carefully
crafted types, but if maintenance adds a controlled part later on
(say by adding a container), the system deletes Boston (because it
now falls into Robert's second category "things that are genuinely
dangerous"). I think that's worse than the original situation (where
it never is legitimate).

****************************************************************

From: Tucker Taft
Sent: Friday, March 14, 2008  1:14 AM

>> So what makes it erroneous?
>>...
> 
> 7.6.1(11/2) says that objects only cease to exist *after* they are 
> finalized. No redundancy brackets there.

Well if we are getting technical, what 7.6.1(11/2) says is that it
finalizes objects *that still exist*.  If the object ceases to exist
earlier, then it is not finalized at the end of the scope of the access
type.

Now I admit that 7.6.1(10) implies (in redundancy brackets) that
Unchecked_Deallocation is the only way for an object to cease to exist
"early," but I have never taken that at face value, since the whole
point of a Mark/Release heap is to release storage *before* the whole
access type goes away, in the same way that Unchecked_Deallocation does.
Think of Release as a "grouped" Unchecked_Deallocation of a bunch of
objects.  In fact, it could be implemented that way, by keeping a list
of objects as a side-effect of Allocate, and then iterating over them
at the time of Release. In fact, if the objects need finalization,
then that is the only safe thing to do.  The actual "Deallocate"
routine could be a no-op, if the presumption is that the storage is
actually going to be reclaimed "en masse" by the Release operation.
Of course we would like a better way to accomplish this, so that
we don't have to have heap objects needing finalization on two lists,
one maintained by the compiler and one maintained by the pool.

> ... Since all objects are (logically)
> finalized, and the only permission to clobber an object early is via 
> Unchecked_Deallocation, clearly the objects continue exist until the 
> master goes away. Thus the storage pool violates the third sentence of 
> 13.11(21), and execution is erroneous.

I think the statement in 7.6.1(10) will want to be generalized to allow
for Mark/Release pools, if we can figure out a safe way for them to get
any needed finalizations performed on the objects whose storage is being
reclaimed upon Release.
 
> Another way to say this is that the actions of a random user-defined 
> storage pool (which could be anything) should not change the 
> underlying semantics of the language. Surely it should not be changing
> the lifetime of objects. Otherwise, you'll have a nightmare for program
> analysis and understanding. (See P.S.)

I think you would agree that a call on Release could be implemented via
a bunch of calls on Unchecked_Deallocation.  Of course, part of the design
problem we are working on is how to make such pools be able to accomplish
the equivalent of Unchecked_Deallocation on a whole group of objects so
that needed finalizations are performed, but do it more efficiently.
 
>> This whole area seems a bit confused.
> 
> There is nothing confused here, the semantics are clear and straightforward.
> They may not be what you think they should be, but that's different.

I think it is confused to say that Unchecked_Deallocation is the only way
for an object to cease to exist "early", given user-defined storage pools
and operations like "Release." But I agree that if we are going to be
storing objects needing finalization in user-defined mark/release storage
pools, we need to add more mechanism to enable them to do the needed
finalizations.

>> In any case, the manual clearly implies that Mark/Release pools are 
>> not erroneous, since they are used in the Example in 13.11.
> 
> I always thought that example meant that Release was called after the 
> type Acc went away. Anything else would clearly be erroneous. I see by 
> looking at it carefully that this is not clear. But in any case, this
> is not normative.
> 
> This is another symptom of the "no garbage collection" problem. We 
> have no permission for early finalization of *any* object in Ada as it 
> stands - anything that violates that is technically erroneous (even if 
> not erroneous in practice).

I think we are in violent agreement here, though I am more interested
in mark/release pools than in garbage collection. But it is basically
the same problem.  I suspect you might get more interest in the problem
if it was recognized that the current rules imply that mark/release pools
are erroneous. They are certainly used!

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  1:18 PM

> > Another way to say this is that the actions of a random user-defined storage
> > pool (which could be anything) should not change the underlying semantics of
> > the language. Surely it should not be changing the lifetime of objects.
> > Otherwise, you'll have a nightmare for program analysis and understanding.
> > (See P.S.)
>
> I think you would agree that a call on Release could be implemented 
> via a bunch of calls on Unchecked_Deallocation.  Of course, part of 
> the design problem we are working on is how to make such pools be able 
> to accomplish the equivalent of Unchecked_Deallocation on a whole 
> group of objects so that needed finalizations are performed, but do it 
> more efficiently.

While I agree with your principle, I don't think that is actually
possible in practice. You could implement it as calls to Deallocate,
but it doesn't change the existence of objects. That is the magic part
of Unchecked_Deallocation.

All I'm saying is that random user code shouldn't be allowed to change
the lifetime of objects other than through strictly defined ways. Otherwise,
it would be near impossible to analyze anything. How could a tool tell
between Release (which supposely changes the lifetime of things) and
Is_Valid (which is just a query)?

Currently, the only defined way to change the lifetime of an object is
Unchecked_Deallocation (and that does *not* mean the implementation
of UD via a pool, but rather the actual call to a UD instance). It's
obvious that we need more than that, but whatever we have ought to be
very well defined so it is possible to write tools and tests that depend
on the lifetime of objects.

> >> This whole area seems a bit confused.
> >
> > There is nothing confused here, the semantics are clear and straightforward.
> > They may not be what you think they should be, but that's different.
>
> I think it is confused to say that Unchecked_Deallocation is the only 
> way for an object to cease to exist "early", given user-defined 
> storage pools and operations like "Release."
> But I agree that if we are going to be storing objects needing 
> finalization in user-defined mark/release storage pools, we need to 
> add more mechanism to enable them to do the needed finalizations.

I've always thought that a Mark/Release pool is erroneous unless the type
has gone away. If the designated type has trivial finalization, it is
what Robert called "language-lawyer erroneousness", in that it would work
on virtually all implementations, but it still is erroneous. (We don't
have a concept of "trivial finalization" in Ada, after all.)

...
> I think we are in violent agreement here, though I am more interested 
> in mark/release pools than in garbage collection.
> But it is basically the same problem.  I suspect you might get more 
> interest in the problem if it was recognized that the current rules 
> imply that mark/release pools are erroneous.
> They are certainly used!

Some would say that Mark/Release is just a poor man's garbage collection.
(And they've never worked for me, although with multiple pools in Ada,
they might in fact work better.)

In any case, the issue is recognized now: at least by anyone who's read
this e-mail thread!

****************************************************************

From: Robert A. Duff
Sent: Friday, March 14, 2008  1:26 PM

Tuck, Randy is right -- the storage pool stuff in SofCheck Inspector is
utterly erroneous, from a formal point of view.  It relies on
implementation behavior that is not guaranteed by the RM.  Nonetheless,
it works just fine!

Anyway, I think we need a specific detailed proposal at this point.
It should address Randy's concerns, which I think are valid, but not
show-stopping.

I'm willing to volunteer to write it up, but I don't want to step on
Tucker's toes if he's planning to.

So far, we have Tucker, Robert, and me in favor of doing something in this
area. And some fairly negative comments from Randy.  I don't want to waste
my time, so I'd like to hear some positive comments from others, first.

So, who wants me to write up a proposal on the general topic of "more
flexible memory management"?

P.S. Re: Randy's issues with garbage collection.  I put some stuff about
it in the RM, but Tucker wanted it moved to the AARM.  It's in 13.11.3.
It's pretty stupid stuff -- I've learned a lot about GC and memory
management in general since those days.

****************************************************************

From: Tucker Taft
Sent: Friday, March 14, 2008  1:55 PM

> ...(We don't have a
> concept of "trivial finalization" in Ada, after all.

I thought that any object that doesn't
"need finalization" per RM 7.6(9.1) is presumed to have "trivial"
finalization in the sense we care about.

Class-wide objects are a bit
of a special case, since you could decide whether they
"need finalization" based on treating them as an object of a
class-wide type (and then the answer is "yes") or as an
object of the specific type determined by the tag (and then the
answer depends on the properties of the specific type).

It pretty much depends on whether you
are interested in a "static" (compile
or link-time) answer ("yes") or a run-time answer (it depends
on the tag).

****************************************************************

From: Tucker Taft
Sent: Friday, March 14, 2008  2:03 PM

> Anyway, I think we need a specific detailed proposal at this point.
> It should address Randy's concerns, which I think are valid, but not 
> show-stopping.
> 
> I'm willing to volunteer to write it up, but I don't want to step on 
> Tucker's toes if he's planning to.
> 
> So far, we have Tucker, Robert, and me in favor of doing something in this area.
> And some fairly negative comments from Randy.  

Actually I think Randy and I are converging, albeit a bit slowly... ;-)
I think Randy is afraid we are creating a bunch of new holes while trying to
reduce the use of Unchecked_Deallocation.  My ultimate goal is to make this
all automatic, so you have both efficient and safe, region/pool-based storage
allocation, where you can eliminate the possibility of dangling references
statically.  But first I was looking for a primitive that would allow me
to make progress toward that longer-term goal without all kinds of nasty
thread-unsafe kludgery using indirect storage pools.

But I think Randy doesn't want to add a
new primitive that makes things less safe, even if there is some grand
and glorious vision for eventually making things safer.

I accept Randy's concerns, and am beginning to see a path to achieving my goals
while not setting off his erroneousness alarm.

I guess I am still interested in writing something up myself, but I would be
happy to collaborate with you on it, as it seems to be an area we both
care about a lot.  If you want to take the first shot, great.  I'm a
little overwhelmed with other things right now, though some of them involve
storage pool hackery, so that is why I decided to float this syntactic
trial balloon.

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  2:15 PM

...
> Actually I think Randy and I are converging, albeit a bit slowly... 
> ;-)  I think Randy is afraid we are creating a bunch of new holes 
> while trying to reduce the use of Unchecked_Deallocation.  My ultimate 
> goal is to make this all automatic, so you have both efficient and 
> safe, region/pool-based storage allocation, where you can eliminate 
> the possibility of dangling references statically.  But first I was 
> looking for a primitive that would allow me to make progress toward 
> that longer-term goal without all kinds of nasty thread-unsafe 
> kludgery using indirect storage pools.

I'd probably have been less negative if you were not stopping halfway
on this journey. If the path to Nirvana leads through an aligator-infested
swamp, I'm less likely to take it. ;-)

> But I think Randy doesn't want to add a new primitive that makes 
> things less safe, even if there is some grand and glorious vision for 
> eventually making things safer.

Especially since no one has laid out the "grand and glorious" vision.

> I accept Randy's concerns, and am beginning to see a path to achieving 
> my goals while not setting off his erroneousness alarm.

I think I do too. It'll be interesting to see if they're the same path...

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  2:26 PM

> Tuck, Randy is right -- the storage pool stuff in SofCheck Inspector 
> is utterly erroneous, from a formal point of view.  It relies on
> implementation behavior that is not guaranteed by the RM.  Nonetheless,
> it works just fine!

Glad to see I'm not crazy. (I worry sometimes.)

...
> So, who wants me to write up a proposal on the general topic of "more 
> flexible memory management"?

I think you should go for it. I'm always more negative about pie-in-the-sky
ideas rather than a formal proposal.

I'm going to write up a proposed Implementation Permission to allow
garbage collection and Mark/Release pools for Ada. I think this is a
somewhat separate issue, in that it is an existing bug rather than
an extension of some sort. (I'd like to see it fixed now, rather than in
201x.)

I'd like to see a proposal with all of the elements we've previously discussed
(it's easier to delete them than add them later):

* A way to set the default pool for all types in a package;
* A way to set the class of pool types allowed for an access type (and that
  then allows specification of specific pools on allocators); [this also allows
  the allocators to depend on implementation-defined "enhanced pool" types,
  such as one that has a dereference check routine (which would allow pools
  that positively identify dangling references upon use)]
* A solution to the finalization problem when using specific pools in allocators;
* Probably a matching Unchecked_Deallocation for specific pools in allocators.

I hope I didn't miss any.

> P.S. Re: Randy's issues with garbage collection.  I put some stuff 
> about it in the RM, but Tucker wanted it moved to the AARM.  It's in
> 13.11.3. It's pretty stupid stuff -- I've learned a lot about GC
> and memory management in general since those days.

It surely is, especially as it never really addresses the problem. Indeed,
it actually says that the implementation needs to defer garbage collection
of objects that need finalization -- which seems exactly wrong.

I suppose we ought to update those notes along with a new
Implementation Permission, but I don't have much interest in that. Maybe
I can talk you into that (I suspect you know a lot more about GC than I do).

****************************************************************

From: Robert A. Duff
Sent: Friday, March 14, 2008  4:22 PM

> I guess I am still interested in writing something up myself, but I 
> would be happy to collaborate with you on it, as it seems to be an 
> area we both care about a lot.

I'm happy to let you go first.

I just think the whole discussion would be more productive if we had
something specific to argue about.  Randy has spent a lot of energy
pointing out what the RM _currently_ says, but obviously you want to
_change_ what it says.

P.S. I find memory management (GC and otherwise) to be fascinating,
for some strange reason!  ;-)

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  4:25 PM

> * A way to set the class of pool types allowed for an access type (and
that
> then allows specification of specific pools on allocators); [this also 
> allows the allocators to depend on implementation-defined "enhanced pool"
> types, such as one that has a dereference check routine (which would 
> allow pools that positively identify dangling references upon use)]

Really, the dereference check probably ought to be language defined. The
idea is to add:

    procedure Check_Dereference (Pool : in out Some_Storage_Pool;
                                 Addr : in Address) is null;

to a derived storage pool, and then define that a dereference of a
pool-specific access value (can't use this on general access types,
unfortunately, because you don't know the pool) calls this routine
(if it exists) before actually doing the dereference.

The routine then can check if the address is one that was actually allocated
by the pool, and raise an exception (preferably with a message) if it
was not. This could be used to detect dangling pointers as well as
bad pointers that were streamed in or Unchecked_Converted, and probably
to do other things as well if you want to assume the the type of object
that is designated.

It would allow the development of portable debugging pools that would
really be useful in that they could detect virtually all dynamic memory
management errors. (And combined with the ability to change the default
pool suggested above, could be used to debug an entire system rather
simply).

Perhaps the idea could even be extended enough to allow portable
user-defined garbage collection. (But that's not as obvious.)

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008  4:46 PM

> I'm happy to let you go first.

As editor of the ARG, I think that is a bad idea. Tucker has several
critical projects for the ARG: he's the critical path for the entire
ASIS Standard at this point. I can't recommend that he do anything
that would take away from those focuses.

You, on the other hand seem to have both the necessary interest and
only a small amount of ARG homework (and not critical to the schedule
of a standard that has a hard deadline coming up).

Moreover, Tucker said he'd be happy to collaborate. What I don't want
to see is Tucker working on these "fun" projects and not getting his
critical ARG work done (because I know he has a limited amount of spare
time).

So I think very strongly that you should do it, and preferably right
away while you remember the gist of this e-mail discussion. Especially
because...

> P.S. I find memory management (GC and otherwise) to be fascinating, 
> for some strange reason!  ;-)

...which makes you the ideal person to work on this.

****************************************************************

From: Stephen W Baird
Sent: Friday, March 14, 2008  7:28 PM

> So, who wants me to write up a proposal on the general topic of "more 
> flexible memory management"?

I do.

****************************************************************

From: Tucker Taft
Sent: Friday, March 14, 2008  8:57 PM

By all means, after you... ;-)

Randy is right, I am too busy to be getting all fired up about this issue.
But I can't resist tossing out a couple of more thoughts:

I think Randy is right, we need to capture the "special sauce" that
Unchecked_Deallocation now has and figure out a way to "sprinkle" it
(carefully) on user-defined code.

I had suggested giving the management of a list of to-be-finalized
objects over to the user-defined code, but that seems like too much
work, and too "dangerous" work, for user-defined code.  So instead
I would suggest that a user-defined pool that wants to "destroy" objects
 early should be an extension of a "Root_Extended_Storage_Pool"
(Root ESP?) type (or some such name) that the implementation must
provide. This Root ESP type would have the necessary
mechanism, presumably provided via private operations that aren't
overridable, to maintain a list of to-be- finalized objects
associated with the pool.  So an allocator that uses such a pool
(either because the access type is associated with it, or the
individual allocator specifies it) would link the object, presuming
it needs finalization, into the pool-resident list.  Finalizing such
a pool object would automatically walk this pool-resident list.
In addition there would be an explicit operation provided by this
Root ESP type to finalize the whole list, thereby destroying all of
the objects on the list, and allowing the pool to reclaim all of their
storage.

I suppose there is no particular reason we couldn't just specify that
an explicit call on the Finalize operation of the Root ESP would have
this effect, but it might be better to give the operation some other
name, such as "Reset_Pool" (or even, Release_Pool ;-).

An unchecked_deallocation would still be permitted on a pool type
derived from this Root ESP, and it would remove the object from this
pool-resident list before finalizing it and calling the Deallocate
operation.

For example:

     type Root_Extended_Storage_Pool is abstract
       new Root_Storage_Pool with private;
         -- This Root ESP type has a (private) component that
         -- the RTS uses to maintain a list of heap
         -- objects allocated from the pool that need
         -- finalization.  The finalization of this component
         -- automatically finalizes all of the objects on
         -- the list and resets the list to empty.
         -- An explicit operation is provided to do
         -- the same thing:

     procedure Reset(Pool : in out Root_Extended_Storage_Pool);
         -- This resets the finalization list to empty after
         -- finalizing all of the objects on the list.
         -- All of the objects that have been allocated
         -- from the pool cease to exist after this call.
         -- It is presumed that extenders of the Root ESP
         -- type will override Reset to first call this
         -- original Reset routine to get any necessary
         -- finalizations performed, and then reclaim
         -- the actual storage associated with the pool.
         -- This operation is implicitly called as part
         -- of finalizing any type derived from Root ESP.

     -- Note that Allocate and Deallocate are inherited from
     -- Root_Storage_Pool, and are called as part of allocators
     -- and Unchecked_Deallocation[_From_Pool].  An allocator
     -- for an object that needs finalization will implicitly call
     -- a private operation of Root ESP to add the object to
     -- the pool's finalization list, and a call on an instance
     -- of Unchecked_Deallocation that applies to such a pool will
     -- implicitly call a private operation of Root ESP to remove
     -- the object from the list and finalize it, before calling
     -- the Deallocate routine.

   Reclaiming storage from a pool that is in use by an object
   would be erroneous unless either Unchecked_Deallocation is
   performed on the object, or Reset is performed on the pool
   as a whole.  As usual, referencing an object that no longer
   exists would be erroneous.

   It is important that a Root ESP pool object not outlive the
   type of the objects that are stored in it if they need
   finalization, since the finalization of a Root ESP object
   implicitly calls the finalize operation of these objects.
   That is why I said that the accessibility level of such
   a pool object must be at least deep as that of the types
   stored in it.  I realize that is somewhat the opposite of
   pools that are not managed in a mark/release manner.

So that's one approach to solve the finalization problem...

****************************************************************

[Editor's note: There's quite a bit more mail and proposals on
this topic. Check back after the next update to see everything.]

****************************************************************


Questions? Ask the ACAA Technical Agent