!standard 4.8(2)                                     08-08-13  AI05-0111-1/02
!standard 4.8(3/2)
!standard 4.8(5.3/2)
!standard 4.8(7/2)
!standard 13.11(15)
!class Amendment 08-08-08
!status work item 08-08-08
!status received 08-03-12
!priority Medium
!difficulty Hard
!subject Specifying a pool on an allocator

!summary

(See proposal.)

!problem

One often wants to manage dynamically allocated objects in multiple heaps
with different lifetimes. Ada provides this automatically if things can be
arranged so that all of the objects created with a given access type have
the same lifetime. The finalization of the storage pool associated with the
access type provides for the reclamation of the objects allocated using
these access types. However, it is common for the access types to be
declared at the library level, while there need to be multiple heaps that
are reclaimed at a more nested level.

One possible way to support multiple heaps is to allow the storage pool for
a dynamically allocated object to be specified explicitly at the point of
the allocator. However, it is desirable to have protections against
dangling references, and allowing storage pools with short lifetimes to be
specified in an allocator for a long-lived access type clearly does not
provide such protection.

!proposal

Add new syntax for specifying the storage pool at the point of an
allocator:

   X := new (Local_Storage_Pool) T'(ABC);

To protect against a local storage pool being reclaimed while there are
still references designating objects allocated from it, accessibility
checks would be added on assignment for an access type whose storage pool
is broken up into multiple storage "subpools." These accessibility checks
will make sure that the object containing the left-hand side of the
assignment will not outlive the object designated by the right-hand side of
the assignment (or more precisely, that the master of the LHS will not
outlive the master of the RHS).
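As a purely illustrative model (not proposed wording), the intended
assignment check can be reduced to comparing master lifetimes, with deeper
static nesting standing in for a shorter lifetime; all names in this Python
sketch are invented for the illustration:

```python
# Illustrative model only: "masters" are reduced to static nesting depths,
# where a deeper depth means a shorter lifetime.

class Master:
    def __init__(self, depth):
        self.depth = depth  # static nesting depth of the master

class ProgramError(Exception):
    """Stands in for Ada's Program_Error."""

def check_access_assignment(lhs_master, rhs_master):
    # The object containing the LHS must not outlive the object the RHS
    # designates, i.e. the LHS master must be at least as deeply nested.
    if lhs_master.depth < rhs_master.depth:
        raise ProgramError("LHS could outlive the designated object")

library_level = Master(depth=0)   # e.g. a library-level storage pool
local_subpool = Master(depth=2)   # e.g. a subpool created in a nested scope

# Storing a library-level reference into a short-lived object is fine:
check_access_assignment(local_subpool, library_level)

# Storing a short-lived reference into a long-lived object must fail:
try:
    check_access_assignment(library_level, local_subpool)
except ProgramError:
    print("check failed as expected")
```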
Given an access value of an access type that allows subpools, it will be
possible to determine at run time the accessibility level, or more
precisely, the master, associated with the subpool from which the object it
designates was allocated. If an object allocated from a subpool contains a
task, the master of the task is determined by the master of the subpool.

For subpools that are themselves dynamically allocated, they will become
"dynamic" masters for all objects allocated from them, and reference counts
or an equivalent mechanism will be used to determine whether they can be
safely finalized (as part of a "checked" deallocation operation, called
"Release"). These reference counts (or equivalent) will track the number of
other unfinalized masters that might have objects with references
designating objects of the dynamic master, so that finalization of a
Release'd object actually occurs when the external reference count drops to
zero. Cyclic references will be properly handled, so that Release will not
cause finalization immediately if there are references, unless they are
merely cyclic references from other subpools that have been Release'd, in
which case all will be finalized.

If an access type is to support subpools, then its 'Storage_Pool must
denote an object of a type descended from the type
System.Storage_Pools.Subpools.Root_Subpool. Root_Subpool is itself
descended from Root_Storage_Pool, but with some extra operations, including
one for determining, given a non-null access value of the type, the subpool
from which the object it designates is allocated. From the subpool, the
implementation can presumably determine the associated (possibly dynamic)
master so it can perform accessibility checks. Another operation on a
subpool is one used to specify that it may henceforth designate objects in
another particular subpool. This operation might be named "May_Reference."
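As a rough illustration (not part of the proposal), the Release and
reference-count behavior just described can be modeled in a few lines of
Python: a Release'd subpool is finalized only once every unfinalized master
that might reference it has itself been released, and a cycle of Release'd
subpools is finalized as a group. All names here are invented for the
sketch:

```python
class Subpool:
    def __init__(self, name):
        self.name = name
        self.referencers = set()   # masters that may designate our objects
        self.released = False
        self.finalized = False

def may_reference(referencing, referenced):
    # Record that 'referencing' may hold designations into 'referenced'.
    referenced.referencers.add(referencing)

def _finalizable(pools):
    # Greatest fixpoint: start from all released, unfinalized pools and
    # drop any pool still referenced by a live pool outside the set.
    # What remains is safe to finalize, including cycles of released pools.
    s = {p for p in pools if p.released and not p.finalized}
    changed = True
    while changed:
        changed = False
        for p in list(s):
            if any(not r.finalized and r not in s for r in p.referencers):
                s.discard(p)
                changed = True
    return s

def release(pool, all_pools):
    pool.released = True
    for p in _finalizable(all_pools):
        p.finalized = True

a, b, c = Subpool("A"), Subpool("B"), Subpool("C")
may_reference(a, b)        # A may point into B
may_reference(b, a)        # ...and vice versa: a cycle
may_reference(c, a)        # C (still live) may point into A

release(a, [a, b, c])      # nothing finalized: C still references A
release(b, [a, b, c])      # still nothing: the cycle is pinned by C
release(c, [a, b, c])      # now all three can be finalized together
```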
May_Reference(A, B) would mean that subpool A may contain objects that
designate objects allocated from subpool B. May_Reference is only relevant
for subpools that are themselves dynamically allocated and hence have
dynamic masters. The relative lifetime of a dynamic master would be
determined by calls on this operation, and in turn would be checked on
access value assignment. May_Reference establishes a partial order, and is
presumed transitive.

Each subpool "knows" the ultimate ancestor storage pool with which it is
associated. When a subpool is specified in an allocator, a check is made
that its ancestor storage pool is the one associated with the access type
of the allocator.

Replace 4.8(2) with:

   allocator ::=
       NEW [subpool_specifier] subtype_indication
     | NEW [subpool_specifier] qualified_expression

   subpool_specifier ::= ( *subpool_*name )

Add after 4.8(3/2):

   The expected type for the *subpool_*name of the subpool_specifier, if
   any, is System.Storage_Pools.Subpools.Root_Subpool'Class (see 13.11).

Add after 4.8(5.3/2):

   If a subpool_specifier is present, the *subpool_*name of the
   subpool_specifier shall denote a variable.

Add before 4.8(7/2):

   If a subpool_specifier is present, a check is made that the storage pool
   associated with the type of the allocator is an ancestor pool of the
   object denoted by the *subpool_*name (see 13.11). If this check fails,
   Program_Error is raised.
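As an aside, the ancestor-pool check just described behaves like the
following toy Python model, where the relation is the reflexive, transitive
closure of the Create_Subpool parent links (the relation itself is defined
formally in the 13.11 wording; the class and function names here are ours,
chosen only for illustration):

```python
class Pool:
    """Toy stand-in for a subpool; 'parent' records who created it."""
    def __init__(self, parent=None):
        self.parent = parent

    def create_subpool(self):
        # Mirrors Create_Subpool: the result's ancestor chain starts here.
        return Pool(parent=self)

def is_ancestor_pool(ancestor, pool):
    # P is an ancestor of S if they are the same pool, or if P is an
    # ancestor of the pool that created S (transitive case).
    while pool is not None:
        if pool is ancestor:
            return True
        pool = pool.parent
    return False

root = Pool()
child = root.create_subpool()
grandchild = child.create_subpool()

assert is_ancestor_pool(root, root)         # reflexive
assert is_ancestor_pool(root, grandchild)   # transitive
assert not is_ancestor_pool(child, root)    # not symmetric
```

Under the proposed 4.8 rules, an allocator with a subpool_specifier would
perform this kind of test against the access type's storage pool and raise
Program_Error on failure.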
Add after 13.11(15):

The following language-defined library package exists:

   package System.Storage_Pools.Subpools is

      type Root_Subpool is abstract new Root_Storage_Pool with private;
      pragma Preelaborable_Initialization (Root_Subpool);

      -- Allocate, Deallocate, and Storage_Size are inherited

      -- Additional dispatching operations on subpools

      function Create_Subpool (Pool : access Root_Subpool)
        return Root_Subpool'Class is abstract;

      function Is_Ancestor_Pool
        (Ancestor_Pool : Root_Subpool;
         Subpool       : Root_Subpool'Class) return Boolean is abstract;

      function Originating_Subpool
        (Pool            : Root_Subpool;
         Storage_Address : System.Address)
        return access Root_Subpool'Class is abstract;

      -- Classwide operations on subpools

      procedure Release_Subpool (Pool : access Root_Subpool'Class);

      procedure May_Reference
        (Referencing_Pool : access Root_Subpool'Class;
         Referenced_Pool  : access Root_Subpool'Class);

   private
      ... -- not specified by the language
   end System.Storage_Pools.Subpools;

A *subpool type* is a descendant of Root_Subpool. A *subpool* is an object
whose type is a subpool type.

A subpool P is an *ancestor pool* of a second subpool S if P and S denote
the same subpool, if S was the result of calling Create_Subpool(P), if
Is_Ancestor_Pool(P, S) returns True, or if P is an ancestor pool of a third
pool R which is in turn an ancestor pool of S.

An element of a pool P *references* a pool R if a part of the element is of
an access-to-object type, and it designates an element of pool R. A pool P
*may reference* a pool R if May_Reference(P, R) has been invoked.

!discussion

It is valuable to be able to specify the storage pool (aka "region") at the
point of allocation, rather than at the point of access type declaration.
However, an access type that designates objects from pools with lifetimes
shorter than that of the access type is clearly error prone. How do we deal
with this issue?
- At the point of declaration, we need to know that the access type is such
an access type with per-allocator storage pools. We might specify a
'Storage_Pool_Type rather than an explicit 'Storage_Pool for such an access
type.

- Objects designated by such an access type need to keep track of what
pool/region they come from, so unchecked deallocation can operate properly,
and to prevent a pointer to such an object from being stored in a place
that will outlive the object. A 'Storage_Pool attribute of access *values*
might be a way to accomplish this. This implies that storage-pool types
that can be used for access types with per-allocator storage pools must
provide an operation that can determine the storage pool from an address.
This needs to work even for addresses allocated from storage pools of a
type descended from the given type, if we permit the specification of a
class-wide type. Perhaps we shouldn't allow that, specifically because of
this problem.

- Access parameters would need a 'Storage_Pool attribute as well. This
could be used instead of the accessibility level attribute, presuming that
storage pools have some kind of accessibility level residing in them.

- An operation to compare lifetimes of storage pools should be available,
so we can verify at run time that the object where a pointer is being
stored lives no longer than the designated object. This is a check
performed by the compiler on storage pools (and perhaps any two controlled
objects?).

- We need an "open" operation or equivalent on a storage pool to make sure
it is safe to allocate objects in the storage pool and hold the access
values in local variables. This needs to be an operation that is somehow
"scoped," in that we don't want the pool to be "closed" in the middle of
the scope of local variables.
One way to accomplish this is to make the "open" operation something that
is accomplished as a side-effect of declaring a controlled object, similar
to the "within" or "pin" operations we use now with subpools in the
Inspector. Does this deserve new syntax? I wouldn't want to introduce it
just for this purpose. In the long run you could imagine that a procedure
might have a matching "finalize" operation that is called automatically on
scope exit. But the explicit object approach does it awfully cleanly, since
it captures the parameters (which must all be by-copy since they are
discriminants) and can record the old state, etc., if desired.

But it feels odd to require this notion of "open" and not provide any
simple way to do it. Perhaps we can presume that most pools are open
throughout their lifetime, and the only time anything is reclaimed is when
everything is reclaimed, when the pool is finalized. Mark/release is best
done by creating what is effectively a new pool that just happens to share
part of a page with an enclosing pool, which is frozen during that period.
Freeze and unfreeze are safer than open and close, perhaps. And freeze
simply means no allocations. Deallocations would only be via finalization.

The "Finalize" routine for the Root type allowed for storage pools
supporting per-allocator pools would finalize all of the objects in the
storage pool. It need not be called explicitly, since it will be performed
implicitly because it will use a controlled component. Hmmm, that doesn't
really work, since we want finalization to happen *before* the pages of the
pool are freed. Hence, I think we need an explicit operation that finalizes
the objects, and which will produce an error if it is called implicitly.
Which implies that:

- comparing pool lifetimes should be based on some kind of partial ordering
established by an explicit directed graph with reference counts.
Edges can be set up when the pool is created, such as for the mark/release
(sub)pool imagined above, or for a subpool (subregion) that keeps a list of
its components to be freed when the subpool goes away. (There are many ways
a subpool could know how to release its own objects, but no others, on its
finalization.)

- between invoking an allocator and "safely" storing the pointer into some
other heap object, the compiler could provide a reference count, which
would prevent premature deallocation. It would only need to keep track of
local variables and temps, and only one per master, since anything in the
heap must have a storage-pool to storage-pool link to ensure lifetime.
Conceivably this could be a list of access values if we wanted to prevent
storage leaks. Each access value corresponds to the value of an allocator
that has *not* yet been stored into another heap object. But to just
prevent dangling references, all we need is that when evaluating an
allocator, we add a cleanup object to the stack which contains the address
of the pool to "pin"/"freeze." On the other hand, the pool must have been
passed into the routine to do the allocation, so it would be essentially a
multi-tasking no-no to delete it while it was being used in that way.

- The implementation can take care of creating a list of items needing
finalization using the "root" part of the Storage_Pool. It can also provide
an operation for checking the respective lifetimes. In general, types
needing finalization will need extra overhead in heaps, which should be
factored into Max_Size_In_Storage_Elements.

- If a particular container type often has its own storage pool for its
elements, then it should identify it via an access discriminant, so that
the pool gets finalized automatically if it is newly allocated as part of
creating the container, but doesn't if it is a preexisting pool.

DYNAMIC MASTER MODEL

It seems helpful to think of a dynamic master as somewhat like a task.
Releasing the associated subpool is equivalent to the task completing. It
then proceeds to wait for its dependents to terminate or become terminable.
A dependent subpool's dynamic master is "terminable" if it has been
released and all of its dependents are similarly terminable. Note that we
can have cyclic dependencies between dynamic masters. But that still works,
presuming they have all been released and all of the dependent tasks are
terminable.

We don't really want to introduce another task, both for efficiency and
because we don't want any finalization to happen completely asynchronously.
Essentially, when the last subtask, if any, becomes terminable, or the last
Release is performed, that causes the finalization to occur. This does mean
that the finalization can be performed on the thread of a subtask. Is that
acceptable? Is there any alternative? If there are subtasks, you have the
possibility of finalizations occurring inside the subtasks, so you have
multi-threaded finalization already.

Presumably after a Release, no further allocations may be performed on the
subpool.

MAY REFERENCE RULES

Calls on May_Reference(A, B) effectively create a dependence of A on B. A
need not be a subpool -- it can be a "regular" pool. B must be a subpool.
May_Reference will fail if the accessibility level of B's ultimate ancestor
pool is deeper than that of A's ultimate ancestor pool. In other words,
May_Reference will raise Program_Error if an attempt to convert from B's
associated access type to A's would fail the "normal" Ada 95 accessibility
check. We make this restriction because May_Reference(A, B) is designed
only to extend the lifetime of B, not shorten the life of A, and should
never cause dangling references if A lives out its "full" life. Note that
if A is a "normal" pool, this effectively means that B will live at least
until the (non-dynamic) master associated with A completes. This is also a
way to allow regular local variables to refer to B.
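A hypothetical Python model of this level check, where "level" stands for
the static accessibility level of a pool's ultimate ancestor (deeper means
shorter-lived); all names here are ours, not from the proposal:

```python
class Pool:
    def __init__(self, level, ancestor=None):
        self.level = level
        # Each pool knows its ultimate ancestor storage pool.
        self.ultimate = ancestor.ultimate if ancestor else self
        self.dependents = set()

class ProgramError(Exception):
    """Stands in for Ada's Program_Error."""

def may_reference(a, b):
    # May_Reference(A, B) extends B's lifetime so that A may point into B.
    # It must never shorten A's life, so reject B if its ultimate ancestor
    # is more deeply nested (shorter-lived) than A's.
    if b.ultimate.level > a.ultimate.level:
        raise ProgramError("would fail the normal accessibility check")
    a.dependents.add(b)

global_pool = Pool(level=0)
local_pool  = Pool(level=3)

may_reference(local_pool, global_pool)       # OK: B outlives A anyway
try:
    may_reference(global_pool, local_pool)   # rejected
except ProgramError:
    print("rejected as expected")
```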
By allowing A to be a normal pool, we require that all pools can be on the
dependent list of a subpool. Only subpools need to maintain dependent
lists/counts. Perhaps we need a unary operation such as
Locals_May_Reference(B) or equivalent, which will make sure B doesn't
disappear before the current stack frame does.

Of course, if we pass a reference to B as a parameter, we want to be sure B
doesn't disappear. How does that work? It would seem that any frame in
which an allocator occurs should be dependent on B. Also, any frame that
gets back a return result that was allocated inside the called function
should be dependent on B. Is it legal to create a local subpool and then
return an allocated object? Hopefully not! Hence, there needs to be a check
on return of a value of an access type with subpools, since it is
effectively being stored in the caller's frame. Similarly, storing into a
heap object or a global requires a check.

COUNTED REFERENCES *NOT* FROM OTHER SUBPOOLS

How do we deal with simple stack-based references? What is the equivalent
of May_Reference for stack-based references? Probably a local "Reference"
object whose finalization releases the dependence. This is similar to the
notion of "opening" or "pinning" the subpool.

But how do between-subpool references work? In particular, is there a way
to have a reference that can be deleted so that the corresponding
dependence goes away? This sounds like a counted reference which is not
from one subpool to the next, but is rather a single object that combines
the subpool dependence and the subpool pointer, so it can be stored in a
hash table, say.

So we really have two kinds of things: counted references from an object to
a subpool, and counted references from a subpool to a subpool. The latter
only go away when the subpool goes away, since there are an unknown number
of "subordinate" uncounted references relying on them.
On the other hand, counted references from an object to a subpool can be
deleted, or changed to point to some other subpool. Note that these can't
be cyclic, since the origin of the counted reference is not a subpool.
However, I suppose if the reference is inside a subpool, and the subpool is
finalized, the counted reference will go away automatically. But we
probably don't need to recognize such cyclic dependences, because they can
be explicitly broken when desired. Subpool-to-subpool dependences cannot be
explicitly broken, because of the subordinate uncounted references.

Counted references could be either "weak" or "strong." A "strong" counted
reference keeps the designated subpool from going away until the strong
reference is destroyed. A "weak" counted reference does not impose any such
requirement. Instead, a weak reference can generate a strong reference upon
request, presuming the designated subpool is still in existence. If not,
the weak reference returns "null," indicating the subpool is gone. Hence, a
weak reference could be stored in a cache, and if the designated subpool is
chosen for destruction because it hasn't been recently used, then the weak
reference would reflect that. Strong references need only be counted. Weak
references probably need to be on a list so they can be nulled out.
Alternatively, weak references could refer to objects that are only reused
as weak-ref targets, in the same way we handle tasks in the AdaMagic RTS,
using a generation-number technique to establish whether the target still
exists.

COMPOSITE OBJECTS WITH ACCESS VALUE SUBCOMPONENTS

Things get more challenging when we start talking about assigning composite
objects that have one or more subcomponents that are access values of
subpool'ed access types. Basically, we need to be sure that the LHS of the
assignment does not outlive any of the subpools designated by the RHS
subcomponents. We want to avoid any operation that needs to iterate through
the subcomponents.
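Returning to the generation-number technique mentioned above for weak
references, here is a minimal illustrative Python sketch (the AdaMagic
details are not reproduced; all names are invented): a weak-ref target slot
is reused only for other weak-ref targets, and each reuse bumps a counter,
so a stale weak reference detects on its own that the target is gone, with
no list of references to walk and null out.

```python
class TargetSlot:
    """A slot that is only ever reused for weak-reference targets."""
    def __init__(self):
        self.generation = 0
        self.target = None

    def install(self, target):
        # Reusing the slot invalidates every weak reference minted for
        # the previous occupant, without touching those references.
        self.generation += 1
        self.target = target
        return WeakRef(self, self.generation)

    def clear(self):
        # The target (e.g. a subpool) is destroyed; bump the generation.
        self.generation += 1
        self.target = None

class WeakRef:
    def __init__(self, slot, generation):
        self.slot = slot
        self.generation = generation

    def strengthen(self):
        """Produce a 'strong' reference, or None if the target is gone."""
        if self.slot.generation == self.generation:
            return self.slot.target
        return None

slot = TargetSlot()
w1 = slot.install("subpool #1")
assert w1.strengthen() == "subpool #1"

slot.clear()                      # the subpool was destroyed
assert w1.strengthen() is None    # w1 notices, with no list walking

w2 = slot.install("subpool #2")   # slot reused for a new target
assert w2.strengthen() == "subpool #2"
assert w1.strengthen() is None    # the old weak ref stays stale
```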
In general, we associate a master with the entire object. This master is
guaranteed not to outlive any of the subpools designated by its
subcomponents. This works so long as the master is associated with a heap
object. However, if the master is simply a local stack frame, then you
couldn't assign into a formal parameter or return it from a function. We
need to have a longer-lived master associated with a stack object.

We could associate a subpool with a local object. That is sort of what we
do in an extended return statement. If we are doing an aggregate assignment
into an OUT parameter, the aggregate could be treated as though it were
part of the subpool holding the OUT parameter. We may need a more general
way to associate a local variable with its ultimate target. Perhaps we can
specify the pool as an attribute, or with a rep clause, on the local
variable.

Perhaps a better approach is as follows: If a composite type is
non-limited, then the accessibility level of the declaration of (the
ultimate ancestor of) the composite type will determine the presumed
lifetime of any object of the type. This allows objects of the composite
type to be freely interassigned without having to recheck accessibility.
Any accessibility checks are performed when assigning to an access value
component of such a composite object, presuming the object will ultimately
be assigned to an object that lives at the same level as the composite
type. On the other hand, if a composite type is limited, then no composite
assignments are possible, and each object of the type is created in its
final resting place, so the accessibility level is known and can be checked
when an assignment is performed to an access value component.

!example

!ACATS test

!appendix

From: Tucker Taft
Sent: Wednesday, March 12, 2008  8:03 AM

One thing I keep running into when trying to do more interesting things
with storage pools is the inability to specify the storage pool at the
point of the allocator.
Obviously this only makes sense for general access types, but given that
they can point to objects in different pools, why not be able to specify
the pool on allocation?

To work around this problem, we currently use some seriously groddy,
thread-unsafe mechanisms that essentially specify a "virtual" storage pool
on the access type, and then store into that storage pool a pointer to the
"real" storage pool to use the next time an allocator is evaluated.

Wouldn't it be a whole lot more reasonable and thread safe to allow
something like:

   X : Gen_Acc_Type := new (pool-to-use) T'(initial_value);

That is, allow the specification of the pool as part of the syntax for an
allocator. I'll create an AI on this if there is any sympathy for the
approach.

****************************************************************

From: Robert Dewar
Sent: Wednesday, March 12, 2008  8:07 AM

I think this would definitely be useful, almost more useful than
specifying on types! In practice all access types are general (I regard
pool-specific allocators as a bizarre corner feature of the language that
is almost never used :-)

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008  9:12 AM

Glad to hear I am not the only one who might find this useful. I suppose
there ought to be a corresponding variant of Unchecked_Deallocation that
takes a storage pool as an explicit parameter, e.g.:

   generic
      type Object (<>) is limited private;
      type Name is access all Object;
   procedure Ada.Unchecked_Deallocation_From_Pool
     (X    : in out Name;
      Pool : in out System.Storage_Pools.Root_Storage_Pool'Class);

This is probably less critical, since it is more common for heap objects
to keep track of what pool they come from. Furthermore, one of the main
reasons to take over allocation is so you can avoid ever calling unchecked
deallocation. But it still seems like something like the above should be
available if one can specify an allocator-specific pool.
(Note that I have limited the actual access type to be a general access
type with the use of "all".)

****************************************************************

From: Robert A. Duff
Sent: Wednesday, March 12, 2008 11:41 AM

> I think this would definitely be useful,...

I agree with Tucker and Robert. I also agree with Tucker's other note
about Unchecked_Deallocation.

Maybe I'll be accused of making "best" the enemy of "good enough", ;-) but
along these lines, I'd also like a way to define a default pool for all
access types in a scope, and/or a way to force all access types in a scope
to have a Storage_Pool clause.

> ...almost more useful than specifying on types! In practice all access
> types are general (I regard pool-specific allocators as a bizarre
> corner feature of the language that is almost never used :-)

Historical note: Tucker wanted to make all access types "general", so
there would be no need for the extra "all" syntax. I convinced him to keep
the Ada-83 kind. I've since changed my mind: pool-specific access types
are useful, but not useful enough to warrant the added language complexity
of having two kinds of access types. Tucker was right all along.

I am also responsible for the bogus term "pool-specific". The issue is
only vaguely related to pools. But I confused myself into thinking that
pools are more-or-less the same thing as what Ada 83 calls a "collection".
They're not. My fault.

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008 12:08 PM

Your memory of history is better than mine. One thing I do remember,
though, was that from an optimizer point of view, pool-specific access
types are preferable, since they don't need to worry about updates to
(non-heap) aliased objects or updates through unrelated access types.
And if you are going to be using unchecked deallocation with your own
storage pools, if you only associate a given pool with pool-specific
access types, it gives you an extra level of comfort that the pointer
passed to a pool's deallocate routine was produced by a call on the
allocate routine for that pool.

I also know that the Rational compiler was able to use smaller integers to
represent pool-specific access values if they had an associated
Storage_Size.

So I suppose I have changed *my* mind too, that pool-specific access types
aren't all bad... ;-)

But the bigger point in my mind is that associating a pool with an access
type isn't what I usually want to do. I want to associate a pool with some
set of objects that all have the same lifetime (or more precisely, the
same moment of "death"). And that rarely matches the lifetime of the
access types involved.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  1:47 PM

...
> I am also responsible for the bogus term "pool-specific".
> The issue is only vaguely related to pools. But I confused myself
> into thinking that pools are more-or-less the same thing as what Ada
> 83 calls a "collection". They're not.

Would you care to elaborate on "they're not"? I've always viewed pools as
a more powerful version of collections. (Indeed, collections are
implemented as a special predefined kind of storage pool in Janus/Ada.) If
I'm wrong in some fundamental way, this would be as good of a time as any
to find out.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  2:36 PM

> One thing I keep running into when trying to do more interesting
> things with storage pools is the inability to specify the storage pool
> at the point of the allocator.
> Obviously this only makes sense for general access types, but given
> that they can point to objects in different pools, why not be able to
> specify the pool on allocation? ...
>
> Wouldn't it be a whole lot more reasonable and thread safe to allow
> something like:
>
>    X : Gen_Acc_Type := new (pool-to-use) T'(initial_value);
>
> That is, allow the specification of the pool as part of the syntax for
> an allocator.

My "language purity" alarm went off with this one. Something about the
solution seems wrong; not sure I can explain it. It seems unsafe, in that
there would be no way to tell what pool something was allocated from, and
thus deallocation would be extremely dangerous.

I'm definitely dubious that the problem is the inability to specify a
storage pool on an allocator. That seems more like a possible solution to
a problem, rather than the problem itself. I'm sure there are problems in
this area, but I guess I'd like to lay out the problems before solving
them (if for no other reason than to ensure that we've really solved them
all).

One technical question on this proposal: is this intended to work with
anonymous access types? That would seem to make allocators of them useful,
but would also introduce a number of interesting new issues (such as, can
an allocator with a specified pool be a coextension? If so, you can have
coextensions with parts in different pools; and if not, then it becomes
even less obvious whether a particular object is a coextension).

> ... I suppose there ought to be a corresponding variant of
> Unchecked_Deallocation that takes a storage pool as an explicit
> parameter...
>
> This is probably less critical, since it is more common for heap
> objects to keep track of what pool they come from. ...

I've never done that, and this is the first time I can recall anyone
saying that they did that. So I'm dubious about the "more common" part.
I've considered having Janus/Ada do that internally for mutable types (it
would solve a number of obscure bugs and leaks), but that wouldn't help
the user any.

> ... Furthermore, one of the main reasons to take over allocation is so
> you can avoid ever calling unchecked deallocation.

Humm, now that seems to be a problem, since it is impractical to do that
in Ada as it stands. (I've given up trying.) Either you end up with a huge
number of type conversions to some global general access type, or you end
up with a global pool that isn't able to do any early deallocations. I've
generally used pools to catch double deallocations and the like, and/or to
make allocations more efficient by optimizing a particular size into an
array (other sizes being passed on to the main pool).

My first thought was that anonymous access types are the right solution
for this (as they are the sort of short-lived type needed), but that
doesn't actually work, because you'd still need all of the type
conversions. (Which, BTW, is one of the reasons I hate them so; they don't
even solve the problem that they were advertised to solve [eliminating
excess type conversions]. And you can't specify a pool. So you can't
actually use them in practice (or they don't add anything), so why have
them?) So I'm not sure what the right answer is.

> And if you are going to be using unchecked-deallocation with your own
> storage pools, if you only associate a given pool with pool-specific
> access types, it gives you an extra level of comfort that the pointer
> passed to a pool's deallocate routine was produced by a call on the
> allocate routine for that pool.

It's a lot more than comfort -- in the absence of Unchecked_Conversions
into a pool-specific type, it is a guarantee. (And there shouldn't be any
such conversions; we'll raise Constraint_Error on the use of any such
value if the type uses the default pool -- so it won't work anyway. I'd
ban such U_Cs outright if the language allowed me to do that.)
For the main pool, we can check for deallocations from the stack and raise
Program_Error if they're attempted; that provides (almost) the same level
of safety for a general access type using the default pool. But once you
mix user-defined pools into the type, you're playing Russian roulette if
you are using Unchecked_Deallocation.

Anyway, I don't like the idea, but perhaps there is no better solution to
the problem(s).

****************************************************************

From: Robert A. Duff
Sent: Wednesday, March 12, 2008  2:37 PM

> Would you care to elaborate on "they're not"? I've always viewed pools
> as a more powerful version of collections. (Indeed, collections are
> implemented as a special predefined kind of storage pool in Janus/Ada.)
> If I'm wrong in some fundamental way, this would be as good of a time
> as any to find out.

As I recall, in Ada 83, the "collection" of an access type just means the
set of all objects allocated by allocators returning that type. (I'm
ignoring derived access types.) "Collection" is a fairly high-level
concept. If we had kept the term in Ada 95, I suppose we would use it to
talk about finalization.

"Storage pool", OTOH, is a low-level memory-management concept. Many
unrelated access types can share the same storage pool; hence it's not the
same thing as "collection". If we take Tucker's suggestion, then the
reverse is also true -- many storage pools per collection. (In fact,
through torturous programming, one can already achieve that in Ada 95, as
Tucker hinted.)

I suspect you're using "collection" to refer to 'Storage_Size clauses. As
such, I don't think you're doing anything wrong (except perhaps misusing
the term ;-) ).

...all of which reminds me: if we take Tucker's suggestion, it's worth
thinking about finalization, and perhaps awaiting task termination. One
useful thing to do with pools is to avoid Unchecked_Dealloc on individual
objects.
The pool collects objects together into "memory pages" or whatever, and the entire pool can be wiped out with one call (either explicit, or triggered by finalization of the pool). Makes deallocation very efficient, and very simple (if all objects in the pool have roughly the same lifetime). But this doesn't quite work if the objects have finalization. (And of course formally, ALL objects have finalization according to the RM -- one must hope that the implementation doesn't actually do anything if the finalization is a no-op.)

Not a big deal, but worth thinking about.

****************************************************************

From: Stephen W Baird
Sent: Wednesday, March 12, 2008  2:08 PM

> I also know that the Rational compiler was able to use smaller
> integers to represent pool-specific access values if they had an
> associated Storage_Size.

You're going way, way back with this one.

Back in the days when Rational's Ada development environment was hosted on proprietary hardware, the default representation for a value of an access type was as a (bit) offset relative to the beginning of a chunk of memory associated with the access type. A Storage_Size representation clause could restrict the range of these offsets and would therefore allow a smaller representation for access values. This, coupled with the R1000's aggressive bit packing policies, meant that a Storage_Size specification for an access type with no other accompanying black magic could result in an array of access values having a component size of, say, 23 bits.

Getting back to the original question about specifying a pool for an allocator, it might be worth noting that Rational felt the need to invent an Ada 83 solution to this problem. Like many pragma-based solutions to problems which really require language support, it was somewhat ugly.
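[Editor's note: the compact-representation effect Baird describes might look like the following sketch. The bound and names are invented; whether any given implementation actually shrinks the representation is implementation-defined.]

```ada
--  Hypothetical sketch: a Storage_Size clause bounds the collection,
--  so an implementation like the R1000's could represent access
--  values as bit offsets into the bounded chunk rather than as full
--  machine addresses.
package Compact_Access_Sketch is
   type Cell is record
      X : Integer;
   end record;

   type Small_Ptr is access Cell;
   for Small_Ptr'Storage_Size use 64 * 1024;   --  invented bound

   --  With aggressive bit packing, components of this array might
   --  then occupy fewer than a full address's worth of bits:
   type Ptr_Vector is array (1 .. 100) of Small_Ptr;
   pragma Pack (Ptr_Vector);
end Compact_Access_Sketch;
```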
The solution included a pragma Heap, which occurred immediately after a statement or declaration which included an allocator, and which specified the "storage pool" (although that was not the terminology that was used at the time) for the allocator(s). There is no need to point out the various problems with this approach.

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008  3:41 PM

> My "language purity" alarm went off with this one. Something about
> the solution seems wrong; not sure I can explain it. It seems unsafe,
> in that there would be no way to tell what pool something was
> allocated from, and thus deallocation would be extremely dangerous.

This is already an issue in Ada 95, because any general access value could have originated from any allocator associated with any access type that has a compatible designated type, or for that matter from 'Access of an appropriate level aliased object. Presuming we restrict this to (named!) general access types, it is not introducing any more danger than already exists. Any work-around for this is going to be just as dangerous, presuming you use some kind of level of indirection in your storage pool, where you change the "real" storage pool being used during execution.

> I'm definitely dubious that the problem is the inability to specify a
> storage pool on an allocator.

I'm not. ;-)

I am a big believer in using something approximating a mark/release approach to allocation, or equivalently, a "region-based" approach. That is, you allocate objects from a region or heap one at a time, and then you throw them all away in "one swell foop." This approach may not be too useful for some kinds of applications, but applications that are vaguely like a compiler (which most of the ones I develop are) generally build up a tree one node at a time, and then at some point have no need for the entire tree.
Walking the tree just to free it one node at a time is a pain, and probably means that you need to have a decent defragmenting storage management system, etc. When it comes to freeing the tree, I'm not freeing *all* trees, just the entire tree associated, say, with a particular compilation unit. There is no natural way in Ada to associate that one particular tree with a storage pool, unless you play thread-unsafe games with levels of indirection, etc. If this capability existed, then I could have a factory routine which took the pool as a parameter, or perhaps retrieved it from the "header" of the overall tree, or wherever. It would then use that pool to do the allocation.

> ...
> That seems more like a possible solution to a problem, rather than
> the problem itself. I'm sure there are problems in this area, but I
> guess I'd like to lay out the problems before solving them (if for no
> other reason than to ensure that we've really solved them all).

The problem is that you want the pool to be associated with a set of objects, not with a set of access types.

Another place this comes up is with containers. I really want to pass in the pool to use when I create a hash table. Different hash tables originating from the same generic instantiation might very well want to use different storage pools. Essentially I want the storage pool to be identified via an access discriminant of the container. When the hash table goes away, I can easily reclaim all the storage used to represent it.

> One technical question on this proposal: is this intended to work
> with anonymous access types?

It depends on the kind of anonymous access types. A coextension is allocated from the same storage pool as that used for the enclosing object, so if that object were allocated using this new capability, then all the coextensions would be allocated from the pool specified in the allocator.
I suppose specifying a storage pool for an allocator used to initialize an access discriminant would be a way of effectively breaking the coextension relationship, just as though you had wrapped the allocator in a qualified expression for some named access type.

If we are talking about an anonymous allocator passed to an access parameter, then that uses a local storage pool that goes away when the subprogram returns. I can't see much value in overriding that, though I suppose there might be reasons to do so.

If we are talking about the new kinds of anonymous access types introduced in Ada 2005 for components, then I suppose it could be useful.

If we are talking about access return types, then the current AI proposes that a return of an anonymous allocator uses the storage pool determined by the context of use. I suppose overriding the pool in that case again would be roughly analogous to putting a qualified expression around the allocator for some particular named access type, again breaking any coextension relationship that might have been produced in the absence of the explicit pool specification.

> ... That would seem to make allocators of them useful, but also would
> introduce a number of interesting new issues (such as, can an
> allocator with a specified pool be a coextension? If so, you can have
> coextensions with parts in different pools, and if not, then it
> becomes even less obvious if a particular object is a coextension).

Part and parcel of the definition of a coextension is that it has the same lifetime as the object that designates it. Hence, as suggested above, if you specify an explicit pool you would be preventing a coextension relationship, just as if you had wrapped the allocator in a qualified expression for a named access type.

>> ... I suppose there ought to be a corresponding variant of
>> Unchecked_Deallocation that takes a storage pool as an explicit
>> parameter...
>>
>> This is probably less critical, since it is more common for heap
>> objects to keep track of what pool they come from. ...
>
> I've never done that, and this is the first time I can recall anyone
> saying that they did that.

Well, it is quite common in the C++ world that "delete" is designed to work no matter what "arena" the object is allocated from. In fact, there isn't much choice, since "delete" can't be passed any extra parameters in C++. Somewhat analogously, "free" in C doesn't take a length. The length is usually squirreled away in front of the object.

> ...
> So I'm dubious about the "more common" part. I've considered having
> Janus/Ada do that internally for mutable types (it would solve a
> number of obscure bugs and leaks), but that wouldn't help the user
> any.

I'm not proposing to *require* it, but it is always an option for a user-defined storage pool. In fact, the whole point of proposing the additional version of unchecked deallocation is to put the burden on the user rather than the implementor of the storage pool when that is appropriate. In Ada 95, the user has to decide what type to use for Unchecked_Deallocation, even if there is a lot of access-type-to-access-type conversion being performed in the application. I know our implementation of finalization of heap-allocated objects as part of Unchecked_Deallocation goes out of its way to survive in the presence of access-type-to-access-type conversion.

>> ...
>> Furthermore, one of the main reasons to take over allocation is so
>> you can avoid ever calling unchecked deallocation.
>
> Humm, now that seems to be a problem, since it is impractical to do
> that in Ada as it stands. (I've given up trying.)

Maybe you are misinterpreting what I am saying. We pretty much never call Unchecked_Deallocation in our static analysis tool. We rely on en-masse deallocation of an entire storage pool to reclaim storage before the access type goes out of scope.
Of course this doesn't work if the objects need any sort of finalization, so we have to be careful about that. Sometimes we even go to the trouble of calling Unchecked_Deallocation on each object that might need finalization, even though we know the underlying pool actually does nothing when you call "Deallocate."

> ...
> Anyway, I don't like the idea, but perhaps there is no better
> solution to the problem(s).

It may be you have never encountered the issue. But in my experience, eliminating overhead in storage management is absolutely critical to the performance of our kind of tool. Garbage collection is much too slow (even the high end, real-time, incremental, blah, blah, blah versions), and even malloc/free are significantly slower, due to the fragmentation problems, first-fit vs. best-fit list walking, etc. Only something approximating mark/release has produced the kind of performance we require. And then we end up wasting some of that on silly levels of indirection through virtual storage pools.

****************************************************************

From: Robert A. Duff
Sent: Wednesday, March 12, 2008  6:48 PM

> That is, you allocate objects from a region or heap one at a time,
> and then you throw them all away in "one swell foop." This approach
> may not be too useful for some kinds of applications, but
> applications that are vaguely like a compiler (which most of the ones
> I develop are), generally build up a tree one node at a time, and
> then at some point have no need for the entire tree.

Right. I'd like to point out that this "tree" (or whatever data structure it is) consists of many different record types, linked together via many different access types. And the same access types are used in different trees -- that's why per-type pools don't work in this case.
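[Editor's note: Tucker's factory idea combined with the syntax proposed in this AI might look roughly like the following sketch. All names are invented, and the "new (Pool) ..." form is the proposal, not legal Ada as it stands, so it appears only in a comment.]

```ada
--  Sketch of a per-tree region: every node of one tree is allocated
--  from the pool recorded in that tree's header, so the whole tree
--  can be reclaimed en masse when the pool is.
with System.Storage_Pools;

package Tree_Factory_Sketch is
   use System.Storage_Pools;

   type Node;
   type Node_Ptr is access Node;

   type Node is record
      Left, Right : Node_Ptr;
      Value       : Integer;
   end record;

   type Tree_Header is record
      Pool : access Root_Storage_Pool'Class;  --  this tree's region
      Root : Node_Ptr;
   end record;

   --  With the proposed syntax, a factory routine could name the pool
   --  at the point of the allocator:
   --
   --    function New_Node (T : Tree_Header; V : Integer) return Node_Ptr is
   --    begin
   --       return new (T.Pool.all) Node'(null, null, V);
   --    end New_Node;
   --
   --  In Ada 95/2005 the same effect requires a level of indirection
   --  inside the pool (swapping the "real" pool), which is the
   --  thread-unsafe workaround discussed above.
end Tree_Factory_Sketch;
```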
****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  7:55 PM

> This is already an issue in Ada 95, because any general access value
> could have originated from any allocator associated with any access
> type that has a compatible designated type, or for that matter from
> 'Access of an appropriate level aliased object. Presuming we restrict
> this to (named!) general access types, it is not introducing any more
> danger than already exists. Any work-around for this is going to be
> just as dangerous, presuming you use some kind of level of
> indirection in your storage pool, where you change the "real" storage
> pool being used during execution.

I agree that a workaround is just as dangerous; that's why I want to look for a better solution that is not dangerous.

And while I agree that Ada 95 has this problem, it is one that is very bounded: it is limited to named general access types with a user-defined storage pool (or with type conversions from a pool-specific access type that has such a pool whose memory is allocated in the main heap) *and* that have an instance of Unchecked_Deallocation.

As I discussed previously, in Janus/Ada, using Unchecked_Deallocation on the default pool raises Program_Error if given an access value outside of the main heap. Thus, using Unchecked_Deallocation on a general access type (that uses the default pool) is almost as safe as using it on a pool-specific type. (The only new issue is that an access-to-an-aliased-component of a heap object would sneak through.) [The reason for this check is to avoid heap corruption long enough to provide an error message; otherwise, we've seen programs where the exception handling mechanism fell over because of the corruption.
Besides, we're already tracking the heap bounds for the dereference check for pool-specific types (an Ada 83 feature that we've preserved because we didn't want moving to Ada 95 to lose functionality), and thus the check is essentially free.]

It would make perfect sense to warn on the declaration of such an instance of Unchecked_Deallocation for a general access type that has a user-defined pool. Indeed, I'm not sure why we don't have such a warning: "Use of this deallocator is potentially unsafe as deallocation of objects that were not originally allocated cannot be detected." Probably just never thought about this in detail. (Indeed, it would be possible to make many user-defined storage pools safe in this way as well, but it would not be clear whether that sort of thing is best handled by the compiler or by the user-defined pool.)

However, with your proposal, this danger now spreads to all (named) general access types, because it would take full program analysis to prove or disprove whether there is a danger. That means that this is no longer something that the compiler can help with (and giving the above warning on all general access types is not helpful).

...

> > One technical question on this proposal: is this intended to work
> > with anonymous access types?
>
> It depends on the kind of anonymous access types. ...
> [Lots more text].

You could have just said "Yes". ;-)

I've pretty much convinced myself that if it is a good idea at all, it ought to be allowed on anonymous access types - it would make them a lot more flexible.

> > ... That would seem to make allocators of them useful, but also
> > would introduce a number of interesting new issues (such as, can an
> > allocator with a specified pool be a coextension? If so, you can
> > have coextensions with parts in different pools, and if not, then
> > it becomes even less obvious if a particular object is a
> > coextension).
>
> Part and parcel of the definition of a coextension is that it has the
> same lifetime as the object that designates it. Hence, as suggested
> above, if you specify an explicit pool you would be preventing a
> coextension relationship, just as if you had wrapped the allocator in
> a qualified expression for a named access type.

This is what I hate about coextensions. You mentioned at the last meeting that the way Janus/Ada allocates objects means that all objects with dynamic-sized array components act like coextensions. And that is true, except for one major difference: *every* such object acts the same way. There is no wishy-washy "maybe it is and maybe it isn't" where changing a couple of characters here or there changes the status.

With the semantics you are suggesting, you couldn't even tell syntactically whether it is or is not a coextension (it would depend on what pool is given in the allocator). Yuck. (Yuck is not strong enough of a word for this, but I don't want to spend 10 minutes finding a thesaurus in order to find a better word...)

> > ... So I'm dubious about the "more common" part. I've considered
> > having Janus/Ada do that internally for mutable types (it would
> > solve a number of obscure bugs and leaks), but that wouldn't help
> > the user any.
>
> I'm not proposing to *require* it, but it is always an option for a
> user-defined storage pool.

I certainly would consider doing it as part of the implementation; I can't stomach the lack of safety and at least that would reduce it. (It's annoying that we'd have to check and raise an exception rather than doing the right thing, but I suppose that would be for the best - it would be more portable.)

> In fact, the whole point of proposing the additional version of
> unchecked deallocation is to put the burden on the user rather than
> the implementor of the storage pool when that is appropriate.

I think it makes just as much sense to put the burden on the compiler.
Why are we messing with all of this horrible accessibility stuff if we're then going to immediately turn around and say "it's all erroneous anyway!"? If Ada is the language for safe programming, we need to act that way. It would make sense to allow the user to suppress such checks (as with all language-defined checks), but to have the default be unsafe seems bad.

> In Ada 95, the user has to decide what type to use for
> Unchecked_Deallocation, even if there is a lot of
> access-type-to-access-type conversion being performed in the
> application. I know our implementation of finalization of
> heap-allocated objects as part of Unchecked_Deallocation goes out of
> its way to survive in the presence of access-type-to-access-type
> conversion.

You have to do that! Not sure how that "is going out of its way", unless you want to appeal to erroneousness and allow unrelated stuff to fail. (You'd be right vis-a-vis the letter of the standard, but users wouldn't be too happy.)

> >> ...
> >> Furthermore, one of the main reasons to take over allocation is
> >> so you can avoid ever calling unchecked deallocation.
> >
> > Humm, now that seems to be a problem, since it is impractical to do
> > that in Ada as it stands. (I've given up trying.)
>
> Maybe you are misinterpreting what I am saying.

No, you stopped reading before I explained what I meant. I think we actually agree. I said that it takes heroic efforts to do that (and I'm not willing to go to those efforts), and you said (elsewhere) that it takes you heroic efforts to do so (and that's why you're proposing this language change). The only difference is that you are willing to go to those efforts and I'm not.

...

> > Anyway, I don't like the idea, but perhaps there is no better
> > solution to the problem(s).
>
> It may be you have never encountered the issue.

Say what? What does that have to do with not liking the idea?
Just because there is a problem that is hard to solve in Ada doesn't mean that we should throw out all of the safety built into the language and particular implementations of the language in order to solve the problem. I'm not against solving the problem. I just don't like this particular form of the proposed solution to the problem. Sheesh.

> But in my experience, eliminating overhead in storage management is
> absolutely critical to the performance of our kind of tool. Garbage
> collection is much too slow (even the high end, real-time,
> incremental, blah, blah, blah versions), and even malloc/free are
> significantly slower, due to the fragmentation problems, first-fit
> vs. best-fit list walking, etc. Only something approximating
> mark/release has produced the kind of performance we require. And
> then we end up wasting some of that on silly levels of indirection
> through virtual storage pools.

OK. But claiming that "you've never encountered the problem" has nothing to do with this.

Besides, I *have* encountered the problem, way back in the early days of Ada 83 and Janus/Ada. I see that I solved it (in our particular case) on October 18, 1985. The solution is the obvious one, in fact: if Unchecked_Deallocation is too slow, then don't call it! We put the nodes (more than 5 different kinds) on a bounded free chain when we're done with them. (It's bounded so that an unusual program doesn't soak up all of the memory in one kind of node, but that rarely happens.) It only takes 5 machine instructions to do so, so it can hardly be a noticeable overhead until there are billions of nodes. Moreover, allocation off of the free list is actually faster than normal allocation, so there is a win on that end as well. The bounds were set by experimentation with lots of test programs, so that exceeding the bounds is rare.
When the compiler pass terminates, the entire program exits (each pass is a separate program), so there is no need to ever actually release the nodes on the free chain. We do have an "emergency dump-all" routine that dumps all of the free chains if we run out of memory allocating nodes (that occurred frequently back on the MS-DOS host with 640K of RAM, but I doubt it has ever happened on any of the larger hosts).

Indeed, we tried a mark-release scheme early on (it was built into the Pascal compiler we started with). It was just a huge tangle of bugs; we were forever walking into stuff that had been deallocated. We eventually gave it up as not worth the hassle.

Obviously, this scheme wouldn't work as well if we needed to do major mode shifts during the compiler passes, but even then it would probably work pretty well (it cut the number of deallocations in our compiler by a factor of 100). And the reuse of the nodes reduces the number of crashes due to dangling pointers, but probably increases the number of latent bugs caused by using a reused node instead.

Still, obviously one size does not fit all when it comes to memory allocation. I'm just not interested in sacrificing some safety to solve this (or any other, for that matter) problem.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  8:06 PM

> Right. I'd like to point out that this "tree" (or whatever data
> structure it is) consists of many different record types, linked
> together via many different access types. And the same access types
> are used in different trees -- that's why per-type pools don't work
> in this case.

Same here. I detailed how I solved that (in 1985!) for Janus/Ada. Perhaps if I would have had storage pools back then, I'd have tried a different solution. But in all honesty, the simpler solution is good enough, and I've reused it in a number of other contexts since with success.
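[Editor's note: the bounded free-chain scheme Randy describes might be sketched as follows. The node layout, names, and bound are invented; the real Janus/Ada code is certainly different.]

```ada
--  Bounded free chain for one kind of node: Free pushes onto the
--  chain until the bound is reached (then simply abandons the node,
--  never calling Unchecked_Deallocation); Alloc pops from the chain
--  before falling back to "new".
package Free_Chain_Sketch is
   type Node;
   type Node_Ptr is access Node;

   type Node is record
      Next : Node_Ptr;   --  chain link, reused while the node is free
      Data : Integer;
   end record;

   procedure Free  (P : in out Node_Ptr);
   function  Alloc return Node_Ptr;
end Free_Chain_Sketch;

package body Free_Chain_Sketch is
   Max_Chained : constant := 1_000;  --  invented; tuned by experiment
   Chain       : Node_Ptr := null;
   Length      : Natural  := 0;

   procedure Free (P : in out Node_Ptr) is
   begin
      if Length < Max_Chained then
         P.Next := Chain;            --  push: a handful of instructions
         Chain  := P;
         Length := Length + 1;
      end if;                        --  else abandon; process exit reclaims it
      P := null;
   end Free;

   function Alloc return Node_Ptr is
      P : Node_Ptr := Chain;
   begin
      if P /= null then              --  pop: faster than a fresh "new"
         Chain  := P.Next;
         Length := Length - 1;
         return P;
      end if;
      return new Node;
   end Alloc;
end Free_Chain_Sketch;
```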
That is to say, I'm not sure that "storage pools" is the right abstraction to solve this problem. I realize that once you have them, the temptation is great to map everything into them.

Anyway, some food for thought.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, March 12, 2008  8:40 PM

...

> As I recall, in Ada 83, the "collection" of an access type just means
> the set of all objects allocated by allocators returning that type.
> (I'm ignoring derived access types.) "Collection" is a fairly
> high-level concept.

Humm, I didn't recall that. (The term is not in the Ada 83 index, which doesn't help any.) The Ada 83 Rationale uses the term the way you describe, but I can't find anything in the Ada 83 standard that does. (Indeed, the only uses of the term I can find are in 4.8(9) and 13.2(8), which is defining 'Storage_Size.)

> If we had kept the term in Ada 95, I suppose we would use it to talk
> about finalization.

We put it back (in a sense), to talk about finalization. See 7.6.1(11/2).

> "Storage pool", OTOH, is a low-level memory-management concept. Many
> unrelated access types can share the same storage pool; hence it's
> not the same thing as "collection". If we take Tucker's suggestion,
> then the reverse is also true -- many storage pools per collection.
> (In fact, through torturous programming, one can already achieve that
> in Ada 95, as Tucker hinted.)

I guess you're right if you take a narrow view of the meaning of "collection". That is, that it is the objects of only a single type. Period. Surely that is true in Ada 83 (no overlapping collections is mentioned in the Rationale). But I see this as simply something that was expanded in Ada 95 (multiple types can share their collections - surely this is true for derived access types). And that the term was changed to be more accurate.

> I suspect you're using "collection" to refer to 'Storage_Size
> clauses.
> As such, I don't think you're doing anything wrong (except perhaps
> misusing the term ;-) ).

That was the only visible meaning of the term in Ada 83.

> ...all of which reminds me: if we take Tucker's suggestion, it's
> worth thinking about finalization, and perhaps awaiting task
> termination. One useful thing to do with pools is to avoid
> Unchecked_Dealloc on individual objects. The pool collects objects
> together into "memory pages" or whatever, and the entire pool can be
> wiped out with one call (either explicit, or triggered by
> finalization of the pool). Makes deallocation very efficient, and
> very simple (if all objects in the pool have roughly the same
> lifetime). But this doesn't quite work if the objects have
> finalization. (And of course formally, ALL objects have finalization
> according to the RM -- one must hope that the implementation doesn't
> actually do anything if the finalization is a no-op.)
>
> Not a big deal, but worth thinking about.

Actually, it is a very big deal (assuming non-trivial finalization). The current rules ensure that the pool always outlives the type, so finalization is a non-problem. More generally, the current Ada rules require that the *memory* always outlives the Ada objects, so there never is a problem with dangling pointers during finalization. This change completely destroys that property (at least it does in the intended use that Tucker is suggesting).

Now, obviously, we can simply declare that if the pool goes away early, the program is erroneous. But that means that any attempt to solve Tucker's problem is going to necessarily be erroneous. We could try to formally define what "trivial finalization" is, and declare the program erroneous only when a pool holding objects with non-trivial finalization goes away early. But that's a maintenance hazard and means that you can't use this mechanism generally (if you are using types from some outside [independently developed] library, you can't know or depend on the sort of finalization that it uses).
A better solution would be to define some mechanism to force early finalization. (After all, I'm on record as saying that the vast majority of new types should have non-trivial finalization.) But if we have to do that, perhaps we could combine that with the storage pool so that we don't have to lose the safety that we currently have.

Not sure what the solution needs to be. But what I don't want to see is lots of new ways to write erroneous programs. (Or old ways that are ten times more likely to occur, either.)

One last point: we still need to fix Ada finalization to allow garbage collection. Perhaps there is some way to do both (they surely are closely related). At least if compilers were not *required* to do something impossible (finalize an object after the memory is gone), a "safe" compiler could take steps to eliminate the problem. (Rather than crashing or raising an exception that could not be avoided.)

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008  9:38 PM

> One last point: we still need to fix Ada finalization to allow
> garbage collection. Perhaps there is some way to do both (they surely
> are closely related). At least if compilers were not *required* to do
> something impossible (finalize an object after the memory is gone), a
> "safe" compiler could take steps to eliminate the problem. (Rather
> than crashing or raising an exception that could not be avoided.)

Good point. One approach to avoiding the finalization issue is to require that storage pools used with this new syntax have some additional operations related to finalization.

In our Ada 95 compiler, for each access type that might designate objects requiring finalization, we maintain a (circularly-linked) list of allocated objects that need finalization. An allocator adds the object to the list if it needs finalization, and Unchecked_Deallocation removes the object from the list if it needs finalization.
When the access type itself is finalized, we walk the list, finalizing all of the objects that remain on it.

If any early memory reclamation is done on pools, in Ada 95 or with this new proposal, you have to hope you don't have objects that need finalization, because that circularly-linked list is going to have dangling references.

If instead we expected storage pools to provide operations to add and remove objects from such a list, and we require the storage pool object to never outlive the access types whose objects it stores, so that when the storage pool itself is finalized, it can free any remaining objects, then we can avoid this dangling reference problem. Furthermore, if the storage pool doesn't want to handle objects needing finalization, it can simply raise an exception if the "add-to-finalization-list" operation is called.

Mark/release can then be handled safely with respect to finalization, by simply doing a "mark" on this finalization list, and at release finalizing all of the objects added to the list since the mark.

****************************************************************

From: Tucker Taft
Sent: Wednesday, March 12, 2008  9:13 PM

> I agree that a workaround is just as dangerous; that's why I want to
> look for a better solution that is not dangerous.
>
> And while I agree that Ada 95 has this problem, it is one that is
> very bounded: it is limited to named general access types with a
> user-defined storage pool (or with type conversions from a
> pool-specific access type that has such a pool whose memory is
> allocated in the main heap) *and* that have an instance of
> Unchecked_Deallocation.

I'm not sure I follow. In Ada 95 I can convert to any general access type with the same designated subtype and compatible accessibility level, whether or not it has a user-defined storage pool. I guess I don't see the added danger with this proposal. Are you associating the added danger with the allocator or with Unchecked_Deallocation_From_Pool?
Both of these would be explicit in the source code. There is no chance a user could use them by mistake. And it seems that both of these new capabilities are equivalent (ignoring thread-safety) to something that could be accomplished in Ada 95 using a level of indirection in a storage pool, plus a couple of access type conversions.

> ...
> It would make perfect sense to warn on the declaration of such an
> instance of Unchecked_Deallocation for a general access type that has
> a user-defined pool. Indeed, I'm not sure why we don't have such a
> warning: "Use of this deallocator is potentially unsafe as
> deallocation of objects that were not originally allocated cannot be
> detected." Probably just never thought about this in detail.
> ...
> However, with your proposal, this danger now spreads to all (named)
> general access types, because it would take full program analysis to
> prove or disprove whether there is a danger.

I'm not sure I see why you need more sophisticated analysis for this proposal. The type conversions can be buried anywhere in Ada 95.

> ...
> With the semantics you are suggesting, you couldn't even tell
> syntactically whether it is or is not a coextension (it would depend
> on what pool is given in the allocator). Yuck.

Sorry, that wasn't what I was trying to say. If you specify a storage pool explicitly for the allocator, it would *not* be a coextension, even if it happened to specify the same pool.

> ...
> I'm not against solving the problem. I just don't like this
> particular form of the proposed solution to the problem....

For me the "problem" is that I can't associate a pool with a particular allocator without using a thread-unsafe approach. How would you express the "problem" that you perceive in this area?

>> But in my experience, eliminating overhead in storage management is
>> absolutely critical to the performance of our kind of tool. ...
> ...
> Still, obviously one size does not fit all when it comes to memory
> allocation.
I'm just not interested in sacrificing some safety to > solve this (or any other, for that matter) problem. I guess I don't see this as particularly affecting pointer safety either way, since we have already said that Unchecked_Deallocation is the users' responsibility to get right, both in terms of dangling references and in terms of pools. I am trying to solve a problem with thread-safety, without significantly changing anything else. If you could explain the problem as you perceive it, and at least the general characteristics of a solution you would prefer, that would help me better understand your view. **************************************************************** From: Randy Brukardt Sent: Wednesday, March 12, 2008 10:51 PM > Good point. One approach to avoiding the finalization issue is to > require that storage pools used with this new syntax be required to > have some additional operations related to finalization. In our Ada > 95 compiler, for each access type that might designate objects > requiring finalization, we maintain a (circularly- > linked) list of allocated objects that need finalization. An > allocator adds the object to the list if it needs finalization, and > Unchecked_Deallocation removes the object from the list if > it needs finalization. When the access type itself is finalized, > we walk the list finalizing all of the objects that remain on it. Yes, our compiler works similarly. > If any early memory reclamation is done on pools, in Ada 95 or with > this new proposal, you have to hope you don't have objects that need > finalization, because that circularly-linked list is going to have > dangling references. "Hope" doesn't cut it for language design! 
:-) > If instead we expected storage pools > to provide operations to add and remove objects from such a list, and > we require the storage pool object to never outlive the access types > whose objects it stores, so that when the storage pool itself is > finalized, it can free any remaining objects, then we can avoid this > dangling reference problem. Don't you mean the inverse? Currently, a pool has to outlive the access type. Surely we can't invert that or we'd have nasty compatibility problems. I'd presume that these "enhanced" pools would be derived from Root_Storage_Pool (as this is a nice abstract limited controlled type), and it would be nasty to have different lifetime requirements for Root_Storage_Pool and Enhanced_Storage_Pool. Especially consider the problems making a legality check for Root_Storage_Pool'Class!

   package System.Enhanced_Storage_Pools is
      type Enhanced_Storage_Pool is
        new System.Storage_Pools.Root_Storage_Pool with private;
      procedure Save_Object_for_Finalization
        (Pool : in out Enhanced_Storage_Pool;
         Storage_Address : in Address);
      procedure Finalize_Saved_Object
        (Pool : in out Enhanced_Storage_Pool;
         Storage_Address : in Address);
   end System.Enhanced_Storage_Pools;

I would not be happy if these were unrelated types, since this is exactly the sort of place where dispatching and inheritance work to maximum advantage. That is, there would be no problem using "enhanced" storage pools in contexts that also expect regular storage pools - so they should be related types. Not sure what the best solution is. If the type retained its own list, then there is no problem having it call the pool operations as needed. But that seems like overkill (both the user-defined pool and the language-defined type would have to keep lists of objects). If the type tries to use the pool's list, then objects no longer go away when the type does: that seems like a significant change in semantics.
Maybe the answer is that if the allocator uses a "normal" pool, it goes onto the type's list, and if the allocator uses an "enhanced" pool, it goes on the pool's list. In the latter case, there is no defined relationship between the access type and the object lifetime: that's the pool's job. Humm, this would seem to fix the garbage collection problem, too. Just specify an "enhanced" pool for your access type (or explicitly in the allocators), and you can finalize when you want (but no later than when the pool is finalized). That couldn't be incompatible, as there are no "enhanced" pools now. > Furthermore, if the storage pool doesn't want to handle objects > needing finalization, it can simply raise an exception if the > "add-to-finalization-list" operation is called. Yup. Probably would come in handy for some algorithms. > Mark/release can then be handled safely with respect to finalization, > by simply doing a "mark" on this finalization list, and at release > finalizing all of the objects added to the list since the mark. Yup. The only remaining issue is how to tie this to processes defined by the standard that work like finalization (i.e. task waiting and protected objects). Specifically, the pool has an address of an object to finalize: how does it in fact finalize that object? Implementation-wise, it seems easy. (For Janus/Ada, at least.) Just pass it to the low-level library routine that finalizes an object. (That grabs the tag out of location 0 [there must be one if it is a finalizable object], and either dispatches appropriately, or if the tag is a "magic marker" tag, does the operation required by that marker - such as waiting for tasks or freeing a memory block.) But how best to represent this operation so that the user code can call it? Surely they must do so in order to finalize any remaining objects.
All I can think of is a magic routine in System.Enhanced_Storage_Pools:

   procedure Finalize_Object_at_Address (Storage_Address : in Address);
   -- Finalizes the object at Storage_Address, calling Finalize if
   -- needed, waits for tasks, does protected object finalization, etc.

This is not very satisfying. P.S. This is more fun than fixing bugs in the Standard (like trying to figure out how accessibility should work)! :-) **************************************************************** From: Randy Brukardt Sent: Wednesday, March 12, 2008 11:53 PM > I'm not sure I follow. In Ada 95 I can convert to any general access > type with the same designated subtype and compatible accessibility > level, whether or not it has a user-defined storage pool. I guess I > don't see the added danger with this proposal. If I was worried about such conversions, I could trivially add a warning to them as well. But they're not often a problem; many user-defined pools in fact allocate memory from the stack and thus the check would catch them, too. Only user-defined pools that use memory allocated from the main heap can sneak through. > Are you associating the added danger with the allocator or with > Unchecked_Deallocation_From_Pool? > Both of these would be explicit in the source code. > There is no chance a user could use them by mistake. > And it seems that both of these new capabilities are equivalent > (ignoring thread-safety) to something that could be accomplished in > Ada 95 using a level of indirection in a storage pool, plus a couple > of access type conversions. The added danger is that an Unchecked_Deallocation (any kind) could never know what pool the object might have come from. Currently, the object (in the absence of type conversions) can *only* have come from the defined pool or from some other place (usually the stack). Thus, a simple check on the deallocation that the address belongs to the defined pool will detect 98% of the misuses.
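For concreteness, the kind of "address belongs to the defined pool" check described above might be sketched like this. Checked_Pool, its bookkeeping fields, and Not_From_This_Pool are all invented for illustration; this is not Janus/Ada's actual implementation:

```ada
with System;                  use System;
with System.Storage_Pools;    use System.Storage_Pools;
with System.Storage_Elements; use System.Storage_Elements;

package Checked_Pools is

   --  Hypothetical pool that remembers the range of addresses it has
   --  handed out, so that Deallocate can reject an address that
   --  cannot possibly have come from this pool.
   type Checked_Pool (Size : Storage_Count) is
     new Root_Storage_Pool with record
      Data    : Storage_Array (1 .. Size);
      Next    : Storage_Count := 1;
      Lowest  : Address := Null_Address;
      Highest : Address := Null_Address;
   end record;

   Not_From_This_Pool : exception;

   overriding procedure Allocate
     (Pool                     : in out Checked_Pool;
      Storage_Address          : out Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count);

   overriding procedure Deallocate
     (Pool                     : in out Checked_Pool;
      Storage_Address          : Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count);
   --  Raises Not_From_This_Pool if Storage_Address lies outside the
   --  range Lowest .. Highest.  Catches most (not all) wrong-pool
   --  frees, in the "98% effective" spirit described above.

   overriding function Storage_Size (Pool : Checked_Pool)
     return Storage_Count;

end Checked_Pools;
```

A range check like this cannot detect a pointer into the middle of an allocated object, but it is cheap and catches the common case of deallocating through the wrong access type.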
Note that I am not saying that our current implementation is 100% safe, because it's not. (Among other things, it will miss pointers created into the middle of heap objects.) But it doesn't need to be in order to add substantial value (this is the same as the dangling pointer checks that we do on containers: they're 98% effective, but that in practice catches virtually all real problems eventually, and they're a lot cheaper than checks that are 100% effective). Obviously, you still can write code that is unsafe (we can't detect double deallocations, for instance), but by catching a significant percentage of problems, the buggy code gets detected earlier in testing -- usually before any significant damage occurs. ... > > However, with your proposal, this danger now spreads to all (named) > > general access types, because it would take full program analysis to > > prove or disprove whether there is a danger. > > I'm not sure I see why you need more sophisticated analysis for this > proposal. The type conversions can be buried anywhere in Ada 95. Type conversions don't cause a real problem, because hardly anyone would go to the lengths that you did to use storage pools. If I have to use a lot of type conversions, I generally give up and make everything the same type. In any case, you could warn on any type conversion that could cause a problem (I think they are rare enough that it wouldn't usually be a major annoyance). It's only a problem for a conversion into a general access type that has the default pool (and thus would not warn on the instantiation of Unchecked_Deallocation). Other conversions would have no reason to be warned. Warning on every use of an allocator that has a storage pool would rightly be considered noise by a user (most would say, of course I used the feature, if you can't suggest how to avoid the warning, don't bother giving one!).
Of course, if there was some way to figure out that a particular user-defined pool was "safe", the warning could be avoided, but I can't imagine a good way to do that. ... > > ... > > I'm not against solving the problem. I just don't like this > > particular form > > of the proposed solution to the problem.... > > For me the "problem" is that I can't associate a pool with a > particular allocator, without using a thread-unsafe approach. That's not a problem, that's a solution to a problem. > How would you express the "problem" that you perceive in this area? The user need that you described: the ability to conveniently manage the memory lifetime of a data structure, possibly made up of multiple types and multiple access types, separately from the lifetime of those types. I can grant that the solution might involve storage pools, but they can only be part of a solution, never part of a problem. ... > > Still, obviously one size does not fit all when it comes to memory > > allocation. I'm just not interested in sacrificing some safety to > > solve this (or any other, for that matter) problem. > > I guess I don't see this as particularly affecting pointer safety > either way, since we have already said that Unchecked_Deallocation is > the users' responsibility to get right, both in terms of dangling > references and in terms of pools. That's of course a cop out. This user rarely gets these "user's responsibility" things exactly right, and this implementer tries to help out this user find out where he screwed up. And neither of me much like losing a tool in this particular battle. To take a current example: my spam filter crashes with a bad memory reference roughly once every 30 days. Because one of the versions of the crash is the dereference of a null pointer in code that can't have a null pointer, I suspect a memory management problem. (Whether it is with Windows, with my code, or with the Janus/Ada runtime code, I don't know). 
I'd like to find the problem, but how can you do that with something that happens once every 30 days (which is typically longer than the time before Windows needs to reboot!). The only thing I can do is make it less likely that I made a mistake, thus extra checking on Unchecked_Deallocation at all levels. > I am trying to solve a problem with thread-safety, without > significantly changing anything else. Given the finalization problems that we've noted elsewhere, I don't think that is possible. And if a solution is not at least as safe as (and preferably, much safer than) existing code, then I'm not very interested. > If you could explain the problem as you perceive it, and at least the > general characteristics of a solution you would prefer, that would > help me better understand your view. Well, the problem you described is being able to get a better handle on memory management of large abstractions. It's not really about objects per se, but rather a collection of objects that make up an instance of a single abstraction. The specific deficiency of Ada in this area is that the only facilities provided require tying of pools to the lifetime of access types. And those are impossible to control with the needed granularity. As far as solutions go, I really don't know. (This isn't my problem, after all.) I guess I would like to see the ability to tie pools to an entire abstraction. One of the problems with your solution is that you have to repeat it on every node of your abstraction -- passing the pools everywhere, and putting them on every allocator. Forget one (or, heaven forbid, use the wrong one), and the whole thing collapses into a leaky mess (or worse). One thought would be to have a way to specify a pool for the entire abstraction. That would mean that that pool would be the default for any allocations made as part of the abstraction. But I admit I don't see how that would work in user-defined code.
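To make the "repeat it on every node" concern concrete, here is roughly what the proposal's allocator-level pools would look like in a node-building operation. This uses the proposed (not-yet-standard) `new (Pool) ...` syntax from this AI; Tree, Node, Key_Type, and the Pool parameter are invented for illustration:

```ada
--  Proposed syntax: the pool is named at each allocator, so every
--  operation that creates nodes must have the pool threaded through
--  to it as a parameter.
procedure Insert
  (T    : in out Tree;
   K    : in Key_Type;
   Pool : in out Root_Storage_Pool'Class)
is
begin
   --  Forget the "(Pool)" on any one allocator in the abstraction
   --  (or name the wrong pool) and that node quietly comes from a
   --  different heap -- the "leaky mess" failure mode above.
   T.Root := new (Pool) Node'(Key => K, Left => null, Right => null);
end Insert;
```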
--- On a detail level, I think you may have already addressed my concerns when you suggested a special kind of storage pool to handle finalization properly. It would be easy enough to add some implementation-defined stuff into those fancier pools in order to preserve most of the current level of safety. (In particular a check that you don't deallocate an address that you didn't allocate.) Moreover, there would be even less reason to interconvert general access types, so a warning on conversions would be more effective. And there would be less need to use Unchecked_Deallocation, so less chance for problems. Still, there would be some loss of safety because a user-defined pool allocating out of the general heap (and deallocating an object of it by mistake) is a lot more likely than trying to deallocate an object that is part of a larger heap object. Which is all to say that I don't know. I still find that going from 98% accuracy to (say) 70% isn't a good thing. (If I could find a cheap way without a memory footprint to detect dangling pointers, I'd implement it immediately for similar reasons.) **************************************************************** From: Robert A. Duff Sent: Thursday, March 13, 2008 8:02 AM > ... (Somewhere you mentioned lifetimes, but I can't find the quote to reply to.) Anyway, in Ada 95/2005, the pool object has a longer lifetime than the access type(s). But that does NOT prevent dangling pointers, because you can have a pool operation that frees the memory prematurely. For example, Tucker likes mark/release pools (type MR_Pool, which I pronounce "Mister Pool" ;-)) -- Release frees memory. > Type conversions don't cause a real problem, because hardly anyone > would go to the lengths that you did to use storage pools. If I have > to use a lot of type conversions, I generally give up and make > everything the same type. You can't make them the same type if they have different designated types. 
That's a common case when using tagged/class-wides. > To take a current example: my spam filter crashes with a bad memory > reference roughly once every 30 days. Because one of the versions of > the crash is the dereference of a null pointer in code that can't have > a null pointer, I suspect a memory management problem. (Whether it is > with Windows, with my code, or with the Janus/Ada runtime code, I > don't know). I'd like to find the problem, but how can you do that > with something that happens once every 30 days (which is typically > longer than the time before Windows needs to reboot!). The only thing > I can do is make it less likely that I made a mistake, thus extra > checking on Unchecked_Deallocation at all levels. You could try a language (or implementation?) with garbage collection. ;-) > Given the finalization problems that we've noted elsewhere, I don't > think that is possible. And if a solution is not at least as safe (and > preferably, much more safe) than existing code, than I'm not very interested. I was surprised, at first, to find that dangling pointers and leaks were LESS of a problem using the storage pools Tucker is talking about. I think this is because you don't have to worry about freeing individual objects (tracking object "ownership" or whatever on a per-object basis). You just decide which pool to put it in, which is a fairly coarse-grained decision (all the objects that come from input file X belong in the pool for X). And then they get freed wholesale at some point (e.g. when we notice that the user has deleted file X). Those (Ada 95) pools require a global variable ("the current pool"), which is bad for the usual reasons -- passing the pool to "new" would solve those problems. > As far as solutions goes, I really don't know. (This isn't my problem, > after all.) I guess I would like to see the ability to tie pools to an > entire abstraction. 
One of the problems with your solution is that you > have to repeat it on every node of your abstraction -- passing the > pools everywhere, and putting them on every allocator. Forget one, (or > heaven forbid, use the wrong one), and the whole thing collapses into a > leaky mess (or worse). I briefly mentioned a solution to this earlier. I want to say "for all access types declared in this package, and all its children, the default pool is X". I would make X be a pool that always raises an exception on Allocate. (Added bonus -- make sure the compiler _knows_ that, so it can warn. It's basically the same as the "for T'Storage_Size use 0" pool.) Then every "new" had better have a specified pool. > One thought would be to have a way to specify a pool for the entire > abstraction. Not sure what you mean by "abstraction", here. Whatever it is, it needs to be dynamic. > On a detail level, I think you may have already addressed my concerns > when you suggested a special kind of storage pool to handle > finalization properly. That's one approach. Another acceptable approach would be to say that allocating any object with nontrivial finalization (or with tasks) in these prematurely-finalized pools is erroneous. I don't think it's hard to define "nontrivial finalization", but I think it needs to be determined at run time for class-wide objects to work properly. > Which is all to say that I don't know. I still find that going from > 98% accuracy to (say) 70% isn't a good thing. (If I could find a cheap > way without a memory footprint to detect dangling pointers, I'd > implement it immediately for similar reasons.) It's not hard to make a pool type such that you can find out which pool a given address points into. Use BiBOP ("Big Bag Of Pages"). Memory footprint is one word per memory "page" (where "page" = 1K or something). I.e. approximately zero. 
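A rough sketch of the BiBOP bookkeeping Bob describes, assuming 1 KB "pages"; the package, type, and operation names here are all invented for illustration:

```ada
with System;                  use System;
with System.Storage_Elements; use System.Storage_Elements;

package Page_Map is

   Page_Size : constant := 1024;  -- one table entry per 1 KB page,
                                  -- i.e. roughly one word per page
                                  -- of address space: ~zero overhead

   type Pool_Id is new Natural;
   No_Pool : constant Pool_Id := 0;  -- address not in any known pool

   --  A suitably-programmed pool registers each chunk of storage it
   --  acquires; the table maps page number (Address / Page_Size) to
   --  the owning pool.
   procedure Register
     (Start  : Address;
      Length : Storage_Count;
      Owner  : Pool_Id);

   --  Given any address (e.g. the value of a general access type),
   --  report which pool's storage it lies in -- without carrying a
   --  pool indication along with each access value.
   function Owner_Of (A : Address) return Pool_Id;

end Page_Map;
```

As Bob notes, this tells you which pool an address points into, which may or may not be a useful building block for dangling-pointer detection.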
**************************************************************** From: Randy Brukardt Sent: Thursday, March 13, 2008 2:19 PM > (Somewhere you mentioned lifetimes, but I can't find the quote to > reply to.) > Anyway, in Ada 95/2005, the pool object has a longer lifetime than the > access type(s). But that does NOT prevent dangling pointers, because > you can have a pool operation that frees the memory prematurely. > For example, Tucker likes mark/release pools (type MR_Pool, which I > pronounce "Mister Pool" ;-)) -- Release frees memory. Correct, but such use of a storage pool is explicitly erroneous by 13.11(21). As an implementer, I'd be justified in deleting Boston if Tucker does that. (This, I presume, is a reason that Tucker likes to use his own compiler. ;-) > > Type conversions don't cause a real problem, because hardly anyone > > would go to the lengths that you did to use storage pools. If I have > > to use a lot of type conversions, I generally give up and make > > everything the same type. > > You can't make them the same type if they have different designated types. > That's a common case when using tagged/class-wides. If they have different designated types, they're not convertible. As far as tagged types go, I use a global "access to root class" for everything, and explicitly convert into a particular specific/classwide type if I need to access components or operations that are only available on a derived type. That doesn't happen very often if your design is good (dispatching does most of those conversions for you). The net effect is, I rarely use access conversions; I only use view conversions on the dereferenced access type. And I generally use only a single global access type for the entire hierarchy. Other approaches seem to cause a lot more explicit conversions. > > To take a current example: my spam filter crashes with a bad memory > > reference roughly once every 30 days.
Because one of the versions of > > the crash is the dereference of a null pointer in code that can't > > have a null pointer, I suspect a memory management problem. (Whether > > it is with Windows, with my code, or with the Janus/Ada runtime code, > > I don't know). I'd like to find the problem, but how can you do that > > with something that happens once > > every 30 days (which is typically longer than the time before > > Windows needs to reboot!). The only thing I can do is make it less > > likely that I made a > > mistake, thus extra checking on Unchecked_Deallocation at all levels. > > You could try a language (or implementation?) with garbage collection. > ;-) Rewriting 2 Megabytes of working Ada into another language is not practical (or smart!). And Ada does not allow garbage collection on types with non-trivial finalization (you'd have to finalize too early to collect the garbage - that's a problem that ought to be fixed), so that's out, too. ... > > As far as solutions goes, I really don't know. (This isn't my > > problem, after > > all.) I guess I would like to see the ability to tie pools to an > > entire abstraction. One of the problems with your solution is that > > you have to repeat it on every node of your abstraction -- passing > > the pools everywhere, > > and putting them on every allocator. Forget one, (or heaven forbid, > > use the > > wrong one), and the whole thing collapses into a leaky mess (or worse). > > I briefly mentioned a solution to this earlier. I want to say "for all access > types declared in this package, and all its children, the default pool is X". > I would make X be a pool that always raises an exception on Allocate. > (Added bonus -- make sure the compiler _knows_ that, so it can warn. > It's basically the same as the "for T'Storage_Size use 0" pool.) Then > every "new" had better have a specified pool. That's the sort of idea I had in mind. Thanks for mentioning it again. 
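The "pool that always raises on Allocate" idea endorsed above could be approximated with something like the following sketch (the package and type names are invented; raising Storage_Error mirrors what `for T'Storage_Size use 0` effectively gives you):

```ada
with System;                  use System;
with System.Storage_Pools;    use System.Storage_Pools;
with System.Storage_Elements; use System.Storage_Elements;

package Never_Pools is

   --  A pool whose Allocate always fails.  If such a pool were the
   --  default for all access types in a package and its children,
   --  every "new" would have to name a real pool explicitly (under
   --  the proposed allocator syntax).
   type Never_Pool is new Root_Storage_Pool with null record;

   overriding procedure Allocate
     (Pool                     : in out Never_Pool;
      Storage_Address          : out Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count);

   overriding procedure Deallocate
     (Pool                     : in out Never_Pool;
      Storage_Address          : Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count) is null;
      --  Nothing is ever allocated, so nothing to deallocate.

   overriding function Storage_Size (Pool : Never_Pool)
     return Storage_Count;

end Never_Pools;

package body Never_Pools is

   procedure Allocate
     (Pool                     : in out Never_Pool;
      Storage_Address          : out Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count) is
   begin
      raise Storage_Error;  -- no allocation permitted through this pool
   end Allocate;

   function Storage_Size (Pool : Never_Pool) return Storage_Count is
   begin
      return 0;
   end Storage_Size;

end Never_Pools;
```

The added bonus Bob mentions would be for the compiler to know about such a pool and warn at compile time, rather than waiting for the run-time exception.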
> > One thought would be to have a way to specify a pool for the entire > > abstraction. > > Not sure what you mean by "abstraction", here. Whatever it is, it > needs to be dynamic. I'm not sure, either! That's why I was so vague. I don't see an obvious way to map "abstraction" into language-defined terms like object and type. > > On a detail level, I think you may have already addressed my > > concerns when > > you suggested a special kind of storage pool to handle finalization > > properly. > > That's one approach. Another acceptable approach would be to say that > allocating any object with nontrivial finalization (or with tasks) in > these prematurely-finalized pools is erroneous. I don't think it's > hard to define "nontrivial finalization", but I think it needs to be > determined at run time for class-wide objects to work properly. No, that is NOT an acceptable approach. For two reasons: (1) I don't think a significant segment of the Ada community would stand for a feature adding more erroneousness to Ada, especially in a common case. (We went through this with the containers, after all, which is why we added various checks to them over Matt's objections.) Tucker's argument that the basic feature doesn't add any more erroneousness for Unchecked_Deallocation to the language is probably sound (I don't like it because it adds more erroneousness to our implementation, but that's a different issue). But making finalization erroneous where it previously was safe seems like an extension of erroneousness, and that is bad. (2) Many common types have non-trivial finalization. I like to say that all new types should be controlled, which while going a bit too far, is still true for many projects. Even if you remove memory management from the equation (and I'm not convinced that is a good idea), there still are many other reasons to do things when objects go away. One example: Claw puts active top-level windows on a chain. 
When a top-level window is destroyed, it better be removed from that chain! Probably more importantly to you, the language-defined containers are controlled types. A solution that doesn't let you use the containers seems very bad. > > Which is all to say that I don't know. I still find that going from > > 98% accuracy to (say) 70% isn't a good thing. (If I could find a > > cheap way without a memory footprint to detect dangling pointers, > > I'd implement it immediately for similar reasons.) > > It's not hard to make a pool type such that you can find out which > pool a given > address points into. Use BiBOP ("Big Bag Of Pages"). Memory > footprint is one > word per memory "page" (where "page" = 1K or something). I.e. > approximately zero. I don't see how that would detect many dangling pointers (most of them point into the main pool, after all, and almost certainly into pages that are valid [since freeing pages is rare; we only do it for huge single allocations that are deallocated]). Maybe I misunderstand the algorithm. It would help a bit on the deallocation from the wrong pool problem, but what we do for the main heap right now would work nearly as well (keeping track of the largest and smallest addresses previously allocated). And that doesn't take much space, either. But since dereferences don't know what pool they came from, you can't use that on them (other than on pool-specific types). Indeed, detecting dangling pointers in general access types is very expensive, because you can only do it if you know what pool the pointer supposedly was allocated from (user-defined pools would need a checking routine as part of the pool, and aliased objects couldn't be checked at all). That means carrying an indication of the pool along with the access value. 
**************************************************************** From: Tucker Taft Sent: Thursday, March 13, 2008 3:28 PM One thing that might alleviate some of the concerns you expressed would be to require some kind of rep clause on an access type to allow it to be used with allocator-specific pools. For example, you might be required to specify the type of the pool objects being passed, e.g.:

   for Acc_Type'Allocator_Pool_Type use My_Storage_Pool_Type;

Then you could only pass pools of type "My_Storage_Pool_Type" to an allocator of type Acc_Type. Of course you could say "Root_Storage_Pool'Class" and you would have full flexibility (and no checking). If we want anonymous access types to fit into this scheme, we might want to turn it around somewhat for that and specify whether a particular storage pool type could be used with an anonymous allocator, e.g.:

   pragma Allow_As_Anonymous_Allocator_Pool(My_Storage_Pool_Type);

or some such thing. **************************************************************** From: Robert A. Duff Sent: Thursday, March 13, 2008 5:08 PM > I don't see how that would detect many dangling pointers... Sorry, I wasn't clear. The BiBOP technique can allow you to tell (from the address) which pool it points into. That may or may not be part of dangling-pointer detection, or schemes for doing various other things. > Indeed, detecting dangling pointers in general access types is very > expensive, because you can only do it if you know what pool the > pointer supposedly was allocated from (user-defined pools would need a > checking routine as part of the pool, and aliased objects couldn't be > checked at all). That means carrying an indication of the pool along > with the access value. The BiBOP technique means you don't have to carry the pool with each access value, in order to tell which pool it came from. Of course, it only works for suitably-programmed pools -- whether user-defined or part of the implementation.
**************************************************************** From: Tucker Taft Sent: Thursday, March 13, 2008 9:30 PM >> (Somewhere you mentioned lifetimes, but I can't find the quote to >> reply to.) >> Anyway, in Ada 95/2005, the pool object has a longer lifetime than >> the access type(s). But that does NOT prevent dangling pointers, >> because you can have a pool operation that frees the memory prematurely. >> For example, Tucker likes mark/release pools (type MR_Pool, which I >> pronounce "Mister Pool" ;-)) -- Release frees memory. > > Correct, but such use of a storage pool is explicitly erroneous by > 13.11(21). As an implementer, I'd be justified in deleting Boston if > Tucker does that. (This, I presume, is a reason that Tucker like to > use his own compiler. ;-) I don't quite follow this (and please don't delete Boston on my behalf ;-). 13.11(21) says you can't "... use the storage for any other purpose while the pool element remains in existence." Clearly after a "release" on a pool, the pool elements whose storage is freed no longer exist, just like with Unchecked_Deallocation. So what makes it erroneous? Are you saying it is erroneous for an object that "needs finalization" to go out of existence without having finalization performed? That I would agree with you, though I'm not sure I know what paragraph of the manual would back us up. I guess 7.6.1(10) implies that the only way for an object to cease to exist early is via Unchecked_Deallocation. That is actually in redundancy brackets, but it seems like a bit of a stretch, given the admitted possibility of Mark/Release storage pools as exemplified in 13.11. This whole area seems a bit confused. In any case, the manual clearly implies that Mark/Release pools are not erroneous, since they are used in the Example in 13.11. 
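The Mark/Release pool being debated here (essentially the shape of the 13.11 example) might look like the following sketch. Alignment handling is omitted, and note the key property under discussion: Release reclaims storage only, finalizing nothing:

```ada
with System;                  use System;
with System.Storage_Pools;    use System.Storage_Pools;
with System.Storage_Elements; use System.Storage_Elements;

package MR_Pools is

   type Mark_Release_Pool (Size : Storage_Count) is
     new Root_Storage_Pool with private;

   type Marker is private;

   function Mark (Pool : Mark_Release_Pool) return Marker;
   procedure Release (Pool : in out Mark_Release_Pool; M : Marker);
   --  Release frees, at a stroke, all storage allocated since Mark.
   --  Nothing here finalizes the objects living in that storage --
   --  exactly the hole this thread is arguing about.

   overriding procedure Allocate
     (Pool                     : in out Mark_Release_Pool;
      Storage_Address          : out Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count);

   overriding procedure Deallocate
     (Pool                     : in out Mark_Release_Pool;
      Storage_Address          : Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count) is null;
      --  Individual deallocation is a no-op; Release does the work.

   overriding function Storage_Size (Pool : Mark_Release_Pool)
     return Storage_Count;

private

   type Marker is new Storage_Count;

   type Mark_Release_Pool (Size : Storage_Count) is
     new Root_Storage_Pool with record
      Data : Storage_Array (1 .. Size);
      Top  : Storage_Count := 0;  -- bump pointer: Allocate advances
                                  -- it, Release rewinds to the marker
   end record;

end MR_Pools;
```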
**************************************************************** From: Randy Brukardt Sent: Friday, March 14, 2008 12:13 AM > > Correct, but such use of a storage pool is explicitly erroneous by > > 13.11(21). As an implementer, I'd be justified in deleting Boston if Tucker > > does that. (This, I presume, is a reason that Tucker likes to use his > > own compiler. ;-) > > I don't quite follow this (and please don't delete Boston on my behalf > ;-). 13.11(21) says you can't "... use the storage for any other > purpose while the pool element remains in existence." Clearly after a > "release" on a pool, the pool elements whose storage is freed no > longer exist, just like with Unchecked_Deallocation. Not according to Ada. You and I argued about the meaning of "exists" in Ada some years ago, and you convinced me that I was wrong and that it is in fact a technical term. You can't flip-flop now... (well, it is an election year, so I suppose you can. ;-) > So what makes it erroneous? > Are you saying it is erroneous for an object that "needs finalization" > to go out of existence without having finalization performed? That I > would agree with you, though I'm not sure I know what paragraph of the > manual would back us up. I guess 7.6.1(10) implies that the only way > for an object to cease to exist early is via Unchecked_Deallocation. > That is actually in redundancy brackets, but it seems like a bit of a > stretch, given the admitted possibility of Mark/Release storage pools > as exemplified in 13.11. 7.6.1(11/2) says that objects only cease to exist *after* they are finalized. No redundancy brackets there. Since all objects are (logically) finalized, and the only permission to clobber an object early is via Unchecked_Deallocation, clearly the objects continue to exist until the master goes away. Thus the storage pool violates the third sentence of 13.11(21), and execution is erroneous.
Another way to say this is that the actions of a random user-defined storage pool (which could be anything) should not change the underlying semantics of the language. Surely it should not be changing the lifetime of objects. Otherwise, you'll have a nightmare for program analysis and understanding. (See P.S.) > This whole area seems a bit confused. There is nothing confused here, the semantics are clear and straightforward. They may not be what you think they should be, but that's different. > In any case, the manual clearly implies that Mark/Release pools are > not erroneous, since they are used in the Example in 13.11. I always thought that example meant that Release was called after the type Acc went away. Anything else would clearly be erroneous. I see by looking at it carefully that this is not clear. But in any case, this is not normative. This is another symptom of the "no garbage collection" problem. We have no permission for early finalization of *any* object in Ada as it stands - anything that violates that is technically erroneous (even if not erroneous in practice). I've on a number of occasions suggested adding Implementation Permission to allow early finalization in limited cases at limited points. ("Limited cases" means no early task waiting, and only for objects that are otherwise inaccessible; "limited points" means at any point where finalization would normally occur. Those limits should make it possible to support garbage collection, mark/release pools, and anything else in between without making program analysis intractable.) But there never seems to be a lot of interest in tackling this problem (one of the worst in Ada today, I think). [Keep in mind that recovering memory from a failed allocator for an object with a controlled part is currently wrong for this reason; I've written an ACATS-style test to test that -- I could issue it...] P.S. 
Allowing existence to be controlled by storage pools would pretty much mean that 7.6.1 can be thrown away and compilers can do anything they want. Moreover, it would prevent a lot of static analysis, because it would be hard to rule out a change in existence caused by a pool. Consider the following:

   type Acc is access ...
   for Acc'Storage_Pool use MR_Pool;

   O : Acc;

   A := O.all;  -- (1)
   null;        -- (2)
   P (O.all);   -- (3)

Can a tool or compiler assume that O.all is the same between uses (1) and (3)? The answer is (amazingly) no. Another task could cause O.all to go away at (2), and the program becomes erroneous. Now, a compiler can use that erroneousness to allow it to make the assumption anyway, because the "anything goes" rule of erroneous execution surely includes assuming (3) is the same as (1). But I don't see that the same applies to a static analysis tool. If P's parameter is in out, making this assumption could overlook a major program disaster (overwriting random memory). That would greatly damage the purpose of static analysis in the first place.

Note that at (2), it is perfectly OK for another task to access MR_Pool and call Release. This task is not using the pool in any way at this point. Only 13.11(21) makes it erroneous to use the object later. (13.11.2(16) does not apply - the object still exists.)

Currently, a static analysis tool could prove that there is no problem by ensuring that no other task ever calls Unchecked_Deallocation on a type that uses MR_Pool. While that wouldn't necessarily be easy, it seems tractable. (Much of the time there would be no instance of UD accessible in other tasks, which would prove the premise.) However, allowing any random operation on a pool to change the existence of objects means that a tool would need a much wider net: essentially any possible touching of a pool (even a call to Allocate) could cause some object elsewhere in the program to cease to exist.
That would seem to be very hard for a static analysis tool to disprove: it would have to figure out the intent of any calls into the pool. Consider:

   A := O.all;                     -- (1)
   B := Allocated_Size (MR_Pool);  -- (2)
   P (O.all);                      -- (3)

A tool would have to assume that O.all could be killed by the user-defined pool operation Allocated_Size. That's even though it is fairly obvious to a human reader that the intent is just to get information from the pool. The only way to avoid such an assumption would be to analyze how Allocated_Size worked, and since there are as many pool designs as programmers, I'm not sure that would be very practical. (After all, Release (MR_Pool) probably does nothing except set the value of a variable of type Address - it wouldn't do any memory operations itself. How to tell that from a benign statistics-gathering routine is not obvious to me.)

Besides, such a rule is not addressing the real problem, which is that early finalization is not allowed in Ada. Since all objects are (logically) finalized, and surely the object has to exist in order that it can be finalized, any attempt to release the memory early is going to have to be erroneous.

I do agree that this (in the case of trivial finalization) is an example of what Robert called "things that will work fine in practice, but are still technically erroneous and cause language lawyers to go tsk-tsk". The trouble is that trying to separate trivial finalization from other finalization just makes the problem more annoying: now you can do it legally with carefully crafted types, but if maintenance adds a controlled part later on (say by adding a container), the system deletes Boston (because it now falls into Robert's second category "things that are genuinely dangerous"). I think that's worse than the original situation (where it never is legitimate).

****************************************************************

From: Tucker Taft
Sent: Friday, March 14, 2008 1:14 AM

>> So what makes it erroneous?
>>...
> 7.6.1(11/2) says that objects only cease to exist *after* they are
> finalized. No redundancy brackets there.

Well if we are getting technical, what 7.6.1(11/2) says is that it finalizes objects *that still exist*. If the object ceases to exist earlier, then it is not finalized at the end of the scope of the access type. Now I admit that 7.6.1(10) implies (in redundancy brackets) that Unchecked_Deallocation is the only way for an object to cease to exist "early," but I have never taken that at face value, since the whole point of a Mark/Release heap is to release storage *before* the whole access type goes away, in the same way that Unchecked_Deallocation does.

Think of Release as a "grouped" Unchecked_Deallocation of a bunch of objects. In fact, it could be implemented that way, by keeping a list of objects as a side-effect of Allocate, and then iterating over them at the time of Release. In fact, if the objects need finalization, then that is the only safe thing to do. The actual "Deallocate" routine could be a no-op, if the presumption is that the storage is actually going to be reclaimed "en masse" by the Release operation. Of course we would like a better way to accomplish this, so that we don't have to have heap objects needing finalization on two lists, one maintained by the compiler and one maintained by the pool.

> ... Since all objects are (logically)
> finalized, and the only permission to clobber an object early is via
> Unchecked_Deallocation, clearly the objects continue to exist until the
> master goes away. Thus the storage pool violates the third sentence of
> 13.11(21), and execution is erroneous.

I think the statement in 7.6.1(10) will want to be generalized to allow for Mark/Release pools, if we can figure out a safe way for them to get any needed finalizations performed on the objects whose storage is being reclaimed upon Release.
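[Editor's illustration: the kind of Mark/Release pool under discussion, with a no-op Deallocate and en-masse reclamation in Release, might be sketched as the following package spec. This is only a sketch along the lines of the 13.11 example; MR_Pools and all names in it are invented, and the bodies (and the finalization-list bookkeeping the thread is about) are omitted.]

```ada
with System;                  use System;
with System.Storage_Pools;
with System.Storage_Elements; use System.Storage_Elements;

package MR_Pools is

   type MR_Pool (Size : Storage_Count) is
     new System.Storage_Pools.Root_Storage_Pool with limited private;

   procedure Mark (Pool : in out MR_Pool);
   --  Remember the current high-water mark.

   procedure Release (Pool : in out MR_Pool);
   --  Pop back to the saved mark, reclaiming "en masse" the storage
   --  of everything allocated since the matching Mark. This is the
   --  operation that is erroneous under the current rules if any of
   --  those objects still exist.

   overriding procedure Allocate
     (Pool                     : in out MR_Pool;
      Storage_Address          : out Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count);

   overriding procedure Deallocate
     (Pool                     : in out MR_Pool;
      Storage_Address          : Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count) is null;
   --  Deliberately a no-op: individual deallocation is not
   --  supported; Release reclaims the storage instead.

   overriding function Storage_Size
     (Pool : MR_Pool) return Storage_Count;

private

   type MR_Pool (Size : Storage_Count) is
     new System.Storage_Pools.Root_Storage_Pool with record
        Data       : Storage_Array (1 .. Size);
        Top        : Storage_Count := 1;  --  next unused storage element
        Saved_Mark : Storage_Count := 1;  --  set by Mark, used by Release
     end record;

end MR_Pools;
```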
> Another way to say this is that the actions of a random user-defined > storage pool (which could be anything) should not change the > underlying semantics of the language. Surely it should not be changing > the lifetime of objects. Otherwise, you'll have a nightmare for program > analysis and understanding. (See P.S.) I think you would agree that a call on Release could be implemented via a bunch of calls on Unchecked_Deallocation. Of course, part of the design problem we are working on is how to make such pools be able to accomplish the equivalent of Unchecked_Deallocation on a whole group of objects so that needed finalizations are performed, but do it more efficiently. >> This whole area seems a bit confused. > > There is nothing confused here, the semantics are clear and straightforward. > They may not be what you think they should be, but that's different. I think it is confused to say that Unchecked_Deallocation is the only way for an object to cease to exist "early", given user-defined storage pools and operations like "Release." But I agree that if we are going to be storing objects needing finalization in user-defined mark/release storage pools, we need to add more mechanism to enable them to do the needed finalizations. >> In any case, the manual clearly implies that Mark/Release pools are >> not erroneous, since they are used in the Example in 13.11. > > I always thought that example meant that Release was called after the > type Acc went away. Anything else would clearly be erroneous. I see by > looking at it carefully that this is not clear. But in any case, this > is not normative. > > This is another symptom of the "no garbage collection" problem. We > have no permission for early finalization of *any* object in Ada as it > stands - anything that violates that is technically erroneous (even if > not erroneous in practice). I think we are in violent agreement here, though I am more interested in mark/release pools than in garbage collection. 
But it is basically the same problem. I suspect you might get more interest in the problem if it was recognized that the current rules imply that mark/release pools are erroneous. They are certainly used!

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008 1:18 PM

> > Another way to say this is that the actions of a random user-defined storage
> > pool (which could be anything) should not change the underlying semantics of
> > the language. Surely it should not be changing the lifetime of objects.
> > Otherwise, you'll have a nightmare for program analysis and understanding.
> > (See P.S.)
>
> I think you would agree that a call on Release could be implemented
> via a bunch of calls on Unchecked_Deallocation. Of course, part of
> the design problem we are working on is how to make such pools be able
> to accomplish the equivalent of Unchecked_Deallocation on a whole
> group of objects so that needed finalizations are performed, but do it
> more efficiently.

While I agree with your principle, I don't think that is actually possible in practice. You could implement it as calls to Deallocate, but it doesn't change the existence of objects. That is the magic part of Unchecked_Deallocation.

All I'm saying is that random user code shouldn't be allowed to change the lifetime of objects other than through strictly defined ways. Otherwise, it would be nearly impossible to analyze anything. How could a tool tell the difference between Release (which supposedly changes the lifetime of things) and Is_Valid (which is just a query)?

Currently, the only defined way to change the lifetime of an object is Unchecked_Deallocation (and that does *not* mean the implementation of UD via a pool, but rather the actual call to a UD instance). It's obvious that we need more than that, but whatever we have ought to be very well defined so it is possible to write tools and tests that depend on the lifetime of objects.
> >> This whole area seems a bit confused. > > > > There is nothing confused here, the semantics are clear and straightforward. > > They may not be what you think they should be, but that's different. > > I think it is confused to say that Unchecked_Deallocation is the only > way for an object to cease to exist "early", given user-defined > storage pools and operations like "Release." > But I agree that if we are going to be storing objects needing > finalization in user-defined mark/release storage pools, we need to > add more mechanism to enable them to do the needed finalizations. I've always thought that a Mark/Release pool is erroneous unless the type has gone away. If the designated type has trivial finalization, it is what Robert called "language-lawyer erroneousness", in that it would work on virtually all implementations, but it still is erroneous. (We don't have a concept of "trivial finalization" in Ada, after all.) ... > I think we are in violent agreement here, though I am more interested > in mark/release pools than in garbage collection. > But it is basically the same problem. I suspect you might get more > interest in the problem if it was recognized that the current rules > imply that mark/release pools are erroneous. > They are certainly used! Some would say that Mark/Release is just a poor man's garbage collection. (And they've never worked for me, although with multiple pools in Ada, they might in fact work better.) In any case, the issue is recognized now: at least by anyone who's read this e-mail thread! **************************************************************** From: Robert A. Duff Sent: Friday, March 14, 2008 1:26 PM Tuck, Randy is right -- the storage pool stuff in SofCheck Inspector is utterly erroneous, from a formal point of view. It relies on implementation behavior that is not guaranteed by the RM. Nonetheless, it works just fine! Anyway, I think we need a specific detailed proposal at this point. 
It should address Randy's concerns, which I think are valid, but not show-stopping. I'm willing to volunteer to write it up, but I don't want to step on Tucker's toes if he's planning to. So far, we have Tucker, Robert, and me in favor of doing something in this area. And some fairly negative comments from Randy. I don't want to waste my time, so I'd like to hear some positive comments from others, first. So, who wants me to write up a proposal on the general topic of "more flexible memory management"? P.S. Re: Randy's issues with garbage collection. I put some stuff about it in the RM, but Tucker wanted it moved to the AARM. It's in 13.11.3. It's pretty stupid stuff -- I've learned a lot about GC and memory management in general since those days. **************************************************************** From: Tucker Taft Sent: Friday, March 14, 2008 1:55 PM > ...(We don't have a > concept of "trivial finalization" in Ada, after all. I thought that any object that doesn't "need finalization" per RM 7.6(9.1) is presumed to have "trivial" finalization in the sense we care about. Class-wide objects are a bit of a special case, since you could decide whether they "need finalization" based on treating them as an object of a class-wide type (and then the answer is "yes") or as an object of the specific type determined by the tag (and then the answer depends on the properties of the specific type). It pretty much depends on whether you are interested in a "static" (compile or link-time) answer ("yes") or a run-time answer (it depends on the tag). **************************************************************** From: Tucker Taft Sent: Friday, March 14, 2008 2:03 PM > Anyway, I think we need a specific detailed proposal at this point. > It should address Randy's concerns, which I think are valid, but not > show-stopping. > > I'm willing to volunteer to write it up, but I don't want to step on > Tucker's toes if he's planning to. 
> > So far, we have Tucker, Robert, and me in favor of doing something in this area. > And some fairly negative comments from Randy. Actually I think Randy and I are converging, albeit a bit slowly... ;-) I think Randy is afraid we are creating a bunch of new holes while trying to reduce the use of Unchecked_Deallocation. My ultimate goal is to make this all automatic, so you have both efficient and safe, region/pool-based storage allocation, where you can eliminate the possibility of dangling references statically. But first I was looking for a primitive that would allow me to make progress toward that longer-term goal without all kinds of nasty thread-unsafe kludgery using indirect storage pools. But I think Randy doesn't want to add a new primitive that makes things less safe, even if there is some grand and glorious vision for eventually making things safer. I accept Randy's concerns, and am beginning to see a path to achieving my goals while not setting off his erroneousness alarm. I guess I am still interested in writing something up myself, but I would be happy to collaborate with you on it, as it seems to be an area we both care about a lot. If you want to take the first shot, great. I'm a little overwhelmed with other things right now, though some of them involve storage pool hackery, so that is why I decided to float this syntactic trial balloon. **************************************************************** From: Randy Brukardt Sent: Friday, March 14, 2008 2:15 PM ... > Actually I think Randy and I are converging, albeit a bit slowly... > ;-) I think Randy is afraid we are creating a bunch of new holes > while trying to reduce the use of Unchecked_Deallocation. My ultimate > goal is to make this all automatic, so you have both efficient and > safe, region/pool-based storage allocation, where you can eliminate > the possibility of dangling references statically. 
> But first I was
> looking for a primitive that would allow me to make progress toward
> that longer-term goal without all kinds of nasty thread-unsafe
> kludgery using indirect storage pools.

I'd probably have been less negative if you were not stopping halfway on this journey. If the path to Nirvana leads through an alligator-infested swamp, I'm less likely to take it. ;-)

> But I think Randy doesn't want to add a new primitive that makes
> things less safe, even if there is some grand and glorious vision for
> eventually making things safer.

Especially since no one has laid out the "grand and glorious" vision.

> I accept Randy's concerns, and am beginning to see a path to achieving
> my goals while not setting off his erroneousness alarm.

I think I do too. It'll be interesting to see if they're the same path...

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008 2:26 PM

> Tuck, Randy is right -- the storage pool stuff in SofCheck Inspector
> is utterly erroneous, from a formal point of view. It relies on
> implementation behavior that is not guaranteed by the RM. Nonetheless,
> it works just fine!

Glad to see I'm not crazy. (I worry sometimes.)

...

> So, who wants me to write up a proposal on the general topic of "more
> flexible memory management"?

I think you should go for it. I'm always more negative about pie-in-the-sky ideas than about a formal proposal.

I'm going to write up a proposed Implementation Permission to allow garbage collection and Mark/Release pools for Ada. I think this is a somewhat separate issue, in that it is an existing bug rather than an extension of some sort. (I'd like to see it fixed now, rather than in 201x.)
I'd like to see a proposal with all of the elements we've previously discussed (it's easier to delete them than add them later): * A way to set the default pool for all types in a package; * A way to set the class of pool types allowed for an access type (and that then allows specification of specific pools on allocators); [this also allows the allocators to depend on implementation-defined "enhanced pool" types, such as one that has a dereference check routine (which would allow pools that positively identify dangling references upon use)] * A solution to the finalization problem when using specific pools in allocators; * Probably a matching Unchecked_Deallocation for specific pools in allocators. I hope I didn't miss any. > P.S. Re: Randy's issues with garbage collection. I put some stuff > about it in the RM, but Tucker wanted it moved to the AARM. It's in > 13.11.3. It's pretty stupid stuff -- I've learned a lot about GC > and memory management in general since those days. It surely is, especially as it never really addresses the problem. Indeed, it actually says that the implementation needs to defer garbage collection of objects that need finalization -- which seems exactly wrong. I suppose we ought to update those notes along with a new Implementation Permission, but I don't have much interest in that. Maybe I can talk you into that (I suspect you know a lot more about GC than I do). **************************************************************** From: Robert A. Duff Sent: Friday, March 14, 2008 4:22 PM > I guess I am still interested in writing something up myself, but I > would be happy to collaborate with you on it, as it seems to be an > area we both care about a lot. I'm happy to let you go first. I just think the whole discussion would be more productive if we had something specific to argue about. Randy has spent a lot of energy pointing out what the RM _currently_ says, but obviously you want to _change_ what it says. P.S. 
I find memory management (GC and otherwise) to be fascinating, for some strange reason! ;-)

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008 4:25 PM

> * A way to set the class of pool types allowed for an access type (and that
>   then allows specification of specific pools on allocators); [this also
>   allows the allocators to depend on implementation-defined "enhanced pool"
>   types, such as one that has a dereference check routine (which would
>   allow pools that positively identify dangling references upon use)]

Really, the dereference check probably ought to be language defined. The idea is to add:

   procedure Check_Dereference
     (Pool : in out Some_Storage_Pool;
      Addr : in Address) is null;

to a derived storage pool, and then define that a dereference of a pool-specific access value (can't use this on general access types, unfortunately, because you don't know the pool) calls this routine (if it exists) before actually doing the dereference. The routine then can check if the address is one that was actually allocated by the pool, and raise an exception (preferably with a message) if it was not.

This could be used to detect dangling pointers as well as bad pointers that were streamed in or Unchecked_Converted, and probably to do other things as well if you want to assume the type of object that is designated. It would allow the development of portable debugging pools that would really be useful in that they could detect virtually all dynamic memory management errors. (And combined with the ability to change the default pool suggested above, could be used to debug an entire system rather simply.) Perhaps the idea could even be extended enough to allow portable user-defined garbage collection. (But that's not as obvious.)

****************************************************************

From: Randy Brukardt
Sent: Friday, March 14, 2008 4:46 PM

> I'm happy to let you go first.
As editor of the ARG, I think that is a bad idea. Tucker has several critical projects for the ARG: he's the critical path for the entire ASIS Standard at this point. I can't recommend that he do anything that would take away from those focuses. You, on the other hand seem to have both the necessary interest and only a small amount of ARG homework (and not critical to the schedule of a standard that has a hard deadline coming up). Moreover, Tucker said he'd be happy to collaborate. What I don't want to see is Tucker working on these "fun" projects and not getting his critical ARG work done (because I know he has a limited amount of spare time). So I think very strongly that you should do it, and preferably right away while you remember the gist of this e-mail discussion. Especially because... > P.S. I find memory management (GC and otherwise) to be fascinating, > for some strange reason! ;-) ...which makes you the ideal person to work on this. **************************************************************** From: Stephen W Baird Sent: Friday, March 14, 2008 7:28 PM > So, who wants me to write up a proposal on the general topic of "more > flexible memory management"? I do. **************************************************************** From: Tucker Taft Sent: Friday, March 14, 2008 8:57 PM By all means, after you... ;-) Randy is right, I am too busy to be getting all fired up about this issue. But I can't resist tossing out a couple of more thoughts: I think Randy is right, we need to capture the "special sauce" that Unchecked_Deallocation now has and figure out a way to "sprinkle" it (carefully) on user-defined code. I had suggested giving the management of a list of to-be-finalized objects over to the user-defined code, but that seems like too much work, and too "dangerous" work, for user-defined code. So instead I would suggest that a user-defined pool that wants to "destroy" objects early should be an extension of a "Root_Extended_Storage_Pool" (Root ESP?) 
type (or some such name) that the implementation must provide. This Root ESP type would have the necessary mechanism, presumably provided via private operations that aren't overridable, to maintain a list of to-be-finalized objects associated with the pool. So an allocator that uses such a pool (either because the access type is associated with it, or the individual allocator specifies it) would link the object, presuming it needs finalization, into the pool-resident list. Finalizing such a pool object would automatically walk this pool-resident list.

In addition there would be an explicit operation provided by this Root ESP type to finalize the whole list, thereby destroying all of the objects on the list, and allowing the pool to reclaim all of their storage. I suppose there is no particular reason we couldn't just specify that an explicit call on the Finalize operation of the Root ESP would have this effect, but it might be better to give the operation some other name, such as "Reset_Pool" (or even, Release_Pool ;-).

An unchecked_deallocation would still be permitted on a pool type derived from this Root ESP, and it would remove the object from this pool-resident list before finalizing it and calling the Deallocate operation. For example:

   type Root_Extended_Storage_Pool is
     abstract new Root_Storage_Pool with private;
   --  This Root ESP type has a (private) component that
   --  the RTS uses to maintain a list of heap
   --  objects allocated from the pool that need
   --  finalization. The finalization of this component
   --  automatically finalizes all of the objects on
   --  the list and resets the list to empty.
   --  An explicit operation is provided to do
   --  the same thing:

   procedure Reset (Pool : in out Root_Extended_Storage_Pool);
   --  This resets the finalization list to empty after
   --  finalizing all of the objects on the list.
   --  All of the objects that have been allocated
   --  from the pool cease to exist after this call.
   --  It is presumed that extenders of the Root ESP
   --  type will override Reset to first call this
   --  original Reset routine to get any necessary
   --  finalizations performed, and then reclaim
   --  the actual storage associated with the pool.
   --  This operation is implicitly called as part
   --  of finalizing any type derived from Root ESP.

   --  Note that Allocate and Deallocate are inherited from
   --  Root_Storage_Pool, and are called as part of allocators
   --  and Unchecked_Deallocation[_From_Pool]. An allocator
   --  for an object that needs finalization will implicitly call
   --  a private operation of Root ESP to add the object to
   --  the pool's finalization list, and a call on an instance
   --  of Unchecked_Deallocation that applies to such a pool will
   --  implicitly call a private operation of Root ESP to remove
   --  the object from the list and finalize it, before calling
   --  the Deallocate routine.

Reclaiming storage from a pool that is in use by an object would be erroneous unless either Unchecked_Deallocation is performed on the object, or Reset is performed on the pool as a whole. As usual, referencing an object that no longer exists would be erroneous.

It is important that a Root ESP pool object not outlive the type of the objects that are stored in it if they need finalization, since the finalization of a Root ESP object implicitly calls the finalize operation of these objects. That is why I said that the accessibility level of such a pool object must be at least as deep as that of the types stored in it. I realize that is somewhat the opposite of pools that are not managed in a mark/release manner.

So that's one approach to solve the finalization problem...

****************************************************************

From: Tucker Taft
Sent: Saturday, March 15, 2008 8:42 AM

Well of course once I got started on this it will be hard to stop thinking about it. A couple of additional thoughts came to me relating to it.
First, suppose we want to allow objects with task parts to be allocated from such a pool. We might then reasonably want to be able to wait for all such tasks to complete before performing a Reset. Similarly, we might just want to be able to check whether all such tasks are terminated (e.g. an All_Task_Parts_Are_Terminated function defined on Root_ESP), and we might want to be able to abort them all en masse and then wait for them to terminate.

A second set of thoughts related to being able to check the accessibility level of an ESP object. The intent is that you can't allocate an object from an ESP if the type of the object doesn't outlive the ESP object. But we don't really have a way of doing that, since objects don't generally have any kind of indicator of their own accessibility. Even access parameters can lie, since they might be the result of a 'Access of something whose accessibility level isn't precisely known (such as a tagged in-out parameter). Even a value of a named access type can lie, since it might be the result of a conversion from a longer-lived access type. So it would seem that, to do this check precisely, the implementation must include in an ESP object a component that faithfully records the accessibility level of the object. This would be similar to what a "tag" must record so we can correctly check the accessibility of a tagged type on function return and on allocators.

The above two thoughts led me to a third thought, namely that an ESP object is very close to what we have called a "master." If we allow task parts, then it could act as the master for those tasks. Similarly, we presume a master has a well known accessibility level, since we need to compare against it, particularly in the check on function return for functions nested in accept statements as specified by AI05-24:

   A check is made that the master of the type identified by the tag
   of the result includes the elaboration of the master that
   elaborated the function body.
So if we begin to think of ESP objects as a kind of "dynamic" master, that the user can declare and apply operations to, such as wait for all tasks of the dynamic master, finalize all objects of the dynamic master, abort all tasks of the master, then it begins to make a bit more sense in terms of how it relates to the language as a whole. Similarly, when this dynamic master is finalized, so are all of the objects associated with the master. It would also make sense to wait for all the tasks of the dynamic master prior to finalizing it. To limit the places where one would have to worry about dynamic masters, we might require some specification on an access type to indicate that some of its values might be pointing to objects managed by a dynamic master. I had suggested a rep-clause for something like that, to indicate the *type* of the ESP object (aka dynamic master) that an access type might specify in one of its allocators. That might carry over further to limiting the conversion from one access type to another, namely requiring that when converting acc type A to acc type B, if acc type A allows ESP pools of type E in its allocators, then acc type B must specify a type that covers E for its allocators (with Root_ESP'Class covering them all). (This begins to sound more like an "operational item" than a "rep-clause.") I guess that's enough craziness for one message... **************************************************************** From: Robert A. Duff Sent: Saturday, March 15, 2008 9:12 AM > The above two thoughts led me to a third thought, namely that an ESP > object is very close to what we have called a "master." So we should call it Master_Pool instead of Mr_Pool. **************************************************************** From: Randy Brukardt Sent: Friday, March 14, 2008 4:07 PM As previously discussed, Ada does not currently allow garbage collection and mark/release pools of objects, because Ada does not allow early finalization. 
Thus, technically any such uses are erroneous. Moreover, if the designated type in question has non-trivial finalization, any such use is really likely to be erroneous, when the finalization hits non-existent memory. These advanced memory management techniques should surely be explicitly allowed in Ada. So, I'm proposing an Implementation Permission to allow them. I'm providing it in the form of half of an AI. !wording Add after 7.6.1(20): Implementation Permissions An *inaccessible object* is one that is never, other than for finalization, accessed again by the partition. An object declared in a passive partition (see E.1) is never inaccessible. AARM Discussion: This definition seems to require predicting the future, which is usually a mistake. But that is intended in this case, because it is attempting to provide a definition of objects that can be garbage collected (or released from a Mark/Release pool). Typically, an implementation will apply this definition to local objects which have no remaining references in the code, and to objects accessed via access values in those objects. AARM Reason: The second sentence is just to exclude passive partitions (which can be accessed by many active partitions) from this definition. Otherwise, the objects of a passive partition would always be inaccessible (as it has no execution thread), and that's clearly not what we want. If a directly allocated object Q of a type that is not immutably limited is inaccessible, then if the object Q would normally be finalized by task T by the finalization of a master M, then during the finalization of any master of task T that occurs prior to the finalization of the master M the implementation may finalize the object Q. After such a finalization, the object no longer exists. [Editor's note: "directly allocated" means one that comes from "new", not a subcomponent of such an object. Better wording is requested.] 
AARM Ramification: After the object ceases to exist, the implementation may recover its storage. Or the user may recover its storage via an operation of its storage pool (as in a Mark/Release pool). Note that such storage reclaiming by a storage pool is erroneous until the object ceases to exist; this permission allows that to happen sooner than in the canonical semantics. If the finalization of the object has no effect, this early finalization does not even require any code. [Editor's note: An additional, separate proposal is expected to provide a mechanism to ensure that finalization occurs early for a Mark/Release pool, so that the use of such pools can be made portable.] AARM Discussion: We only allow early finalizations to occur at points where finalizations are expected to occur anyway, and only by the same task that would normally finalize the object. Truly asynchronous finalizations would cause many new issues for programmers and implementers; imagine a finalization being invoked during the assignment of a global composite variable that is also used by the finalization routine. We don't want programmers to have to take extra precautions to protect such objects in single-tasking programs. There are a lot of AARM notes on Garbage Collection in 13.11.3 which should be updated to reflect what the language actually allows and does not allow. I'll try to get Bob to do that. !discussion We exclude "immutably limited" types from this permission for two reasons: (1) In AI95-00147, we decided that we would not allow optimization of Finalization for limited controlled types, as these are often used for locks and the like. It would seem weird to water down that decision and allow that Finalization to be moved to some other point in the program. (2) Finalization of objects containing task parts includes waiting for those tasks to terminate. This would mean that any finalization point could potentially be a task waiting point and essentially a blocking point. 
Moreover, it could introduce deadlock where none was previously present if the object containing the task is finalized before some resource in the parent task that it depends on is available. This seems like a likely scenario, as the whole idea of child tasks is that they run independently. The wording as written doesn't take into account the active nature of tasks. There doesn't seem to be any technical problem preventing extending this permission to limited controlled types and to protected objects if we wish to do that. There would need to be some wording preventing problematic task scenarios that might cause deadlocks. There wouldn't be any implementation problem for Janus/Ada on these, either, as task waiting and protected objects are handled much like controlled objects. But that may not be true in other implementations, particularly for the task waiting. --- Note that this permission does not directly allow finalizing of objects during an allocator. This means that invoking a garbage collector as part of an allocator is not directly supported by the language. However, it appears to me that this would be allowed in virtually all real cases, as it is highly likely that a master finalization would directly precede the allocator. In that case, an "as-if" optimization would allow doing the garbage collection as part of the allocator. I've been unable to think of any non-pathological cases where this would not be allowed (and it would matter), but I have not tried very hard to do so. On the other hand, an asynchronous garbage collector (that is, one that runs on a separate thread) would need special care, even with this permission. It could find garbage and collect garbage with trivial finalization, but it would have to synchronize with the task that owns the object in order to finalize it. While this is unpleasant, the alternative would be to require most existing Ada code that uses finalization to be rewritten (to add additional locking). 
That seems like too much of a burden. --- I considered allowing this for stand-alone objects as well. That makes sense as an implementation could implement local composite objects as if they are all allocated from the heap (similar to Java), and the implementation could also garbage collect them. However, I eventually gave up on the idea as it is possible for the finalization of an object to depend on the finalization of some other object in the same scope (there are examples of this in AI95-00280). It doesn't make sense to revisit that can of worms. The definition of "inaccessible" is designed to cover all objects, in case we want in the future to add additional permissions. --- In case there was a problem with defining "inaccessible object", I checked for all uses of the term "inaccessible" in the AARM: AARM 3.10.2(22.t) says that the term "inaccessible" is sometimes used in relationship to accessibility. I can find no such uses in the AARM; I think this text should be deleted. 13.11(25.2/2) uses "inaccessible object". Previously there was no definition of that; now there is. I don't think that there is a problem with this, but if someone finds one, we could reword that paragraph. (Or use a different term.) AARM 3.9.1(4.a) uses "inaccessible subprogram" informally. No issue. E.1(7) defines "inaccessible partition", which is then used in a number of other places in E.1 and E.3. That's not a problem. The automatically generated annex M uses "inaccessible" based on some of the previously discussed rules. And that's it. ----------------------- That seems like enough for now. Send in the brickbats!! **************************************************************** From: Robert Dewar Sent: Friday, March 14, 2008 4:37 PM > These advanced memory management techniques should surely be > explicitly allowed in Ada. So, I'm proposing an Implementation > Permission to allow them. I'm providing it in the form of half of an AI. 
I would say this is premature unless someone is actually interested in trying to implement a garbage collector. Wait till that point, any earlier work on the issue is wasted effort in my opinion. **************************************************************** From: Randy Brukardt Sent: Friday, March 14, 2008 4:52 PM As noted in the previous thread, this also affects Tucker's Mark/Release storage pools (which are erroneous as the standard is written). It also affects the behavior of allocators that fail to initialize (they can't be finalized until the type goes away, which means that the memory has to stay around a long time). That also has come up previously. The point is that this is a general problem that comes up in Ada if you want to do any sort of memory management at all. The form of that memory management does not have to specifically be garbage collection. And surely there are people that do want to do memory management! **************************************************************** From: Robert A. Duff Sent: Friday, March 14, 2008 7:32 PM Randy, I didn't read the whole thing carefully yet. I will, but I wonder if you could reformat it with short lines? Somebody's mailer (possibly mine) wrapped them in an ugly fashion. > As previously discussed, Ada does not currently allow garbage > collection and mark/release pools of objects, because Ada does not > allow early finalization. Thus, technically any such uses are > erroneous. Moreover, if the designated type in question has > non-trivial finalization, any such use is really likely to be > erroneous, when the finalization hits non-existent memory. > > These advanced memory management techniques should surely be > explicitly allowed in Ada. So, I'm proposing an Implementation > Permission to allow them. I'm providing it in the form of half of an AI. Not sure if the above is intended to become the !summary or !question or whatever of an AI, but it seems way too negative. 
Normally, I wouldn't care about that, but garbage collection is a contentious issue, and an easy anti-Ada point. The truth is, nobody ever intended to outlaw GC in Ada! Nor Mister Pools. How about something like this: The design of Ada has always been intended to allow implementations that support garbage collection (GC). However, certain paragraphs of the RM seem to accidentally imply that GC is impossible. Is GC allowed in Ada? (Yes.) Similarly, these paragraphs seem to accidentally render certain kinds of user-defined storage pools, such as mark/release, technically erroneous. Do mark/release-like pools work in Ada? (Yes.) More later... **************************************************************** From: Randy Brukardt Sent: Friday, March 14, 2008 7:57 PM > Randy, I didn't read the whole thing carefully yet. I will, but I > wonder if > you could reformat it with short lines? Somebody's mailer (possibly > mine) wrapped them in an ugly fashion. I just looked, and it's a mess in mine, too. I think I forgot to not manually wrap the lines like I have to in an AI. I'll try to repost it. ... > Not sure if the above is intended to become the !summary or !question > or whatever of an AI, but it seems way too negative. No, it is just an introduction for the ARG members who haven't been following the other thread. It was supposed to be one sentence, but as you probably know, I tend to write two paragraphs when a sentence will do. :-) > Normally, I wouldn't care about that, but garbage collection is a > contentious issue, and an easy anti-Ada point. The truth is, nobody > ever intended to outlaw GC in Ada! Nor Mister Pools. True enough. > How about something like this: > > The design of Ada has always been intended to allow implementations that > support garbage collection (GC). However, certain paragraphs of the RM > seem to accidentally imply that GC is impossible. Is GC allowed in Ada? > (Yes.) 
> > Similarly, these paragraphs seem to accidentally render certain kinds of > user-defined storage pools, such as mark/release, technically erroneous. > Do mark/release-like pools work in Ada? (Yes.) Sounds good. Are you writing this AI, too? :-) I'm sure I can find something else to work on. :-) **************************************************************** From: Robert Dewar Sent: Friday, March 14, 2008 8:12 PM > I would say this is premature unless someone is actually interested in > trying to implement a garbage collector. Wait till that point, any > earlier work on the issue is wasted effort in my opinion. Thanks Randy for clarifying that the intent goes beyond just garbage collection. Of course if you implement your own storage pool, it is allowed to do arbitrary magic, since there is no requirement that it be programmed in Ada, so I am not sure that any IA is necessary, but it is perhaps useful as a guideline of how much magic is considered OK, though ultimately such good-taste guidelines are of limited use (sort of like trying to legislate what are OK pragmas). **************************************************************** From: Robert Dewar Sent: Friday, March 14, 2008 8:16 PM > As noted in the previous thread, this also affects Tucker's > Mark/Release storage pools (which are erroneous as the standard is > written). Not really. They may look erroneous, but if you just declare them to be written in Ada++ whose semantics are not well documented, they are fine, since the standard has nothing to say about Ada++. The status of such a package is that "it works using compiler bla" and perhaps "it does not work using compiler bla". The proposed implementation advice/permission won't affect that status either way. I don't see that the ARG mulling over some implementation advice or permission has any practical or useful effect on this pool. 
> The point is that this is a general problem that comes up in Ada if > you want to do any sort of memory management at all. The form of that > memory management does not have to specifically be garbage collection. > And surely there are people that do want to do memory management! Right, but if all they have to depend on is implementation dependent behavior, then the standard is somewhat irrelevant, what is important is what the implementation does, and that's unlikely to be affected in practice by what the standard says in this area. **************************************************************** From: Randy Brukardt Sent: Friday, March 14, 2008 8:13 PM Here's the repost Bob wanted; I put his introduction at the beginning rather than mine. (I did drop the word "accidentally" from it; no reason to talk about intent directly ("seem to imply" says it more subtly anyway.) I added a few sentences here and there to (hopefully) clarify things a bit; it's impossible to re-read something and not see ways to improve it. - RLB -------------- !question The design of Ada has always been intended to allow implementations that support garbage collection (GC). However, certain paragraphs of the RM seem to imply that GC is impossible. Is GC allowed in Ada? (Yes.) Similarly, these paragraphs seem to render certain kinds of user-defined storage pools, such as mark/release, technically erroneous. Do mark/release-like pools work in Ada? (Yes.) !wording Add after 7.6.1(20): Implementation Permissions An *inaccessible object* is one that is never, other than for finalization, accessed again by the partition. An object declared in a passive partition (see E.1) is never inaccessible. AARM Discussion: This definition seems to require predicting the future, which is usually a mistake. But that is intended in this case, because it is attempting to provide a definition of objects that can be garbage collected (or released from a Mark/Release pool). 
Typically, an implementation will apply this definition to local objects which have no remaining references in the code, and to objects accessed via access values in those objects. AARM Reason: The second sentence is just to exclude passive partitions (which can be accessed by many active partitions) from this definition. Otherwise, the objects of a passive partition would always be inaccessible (as it has no execution thread), and that's clearly not what we want. If a directly allocated object Q of a type that is not immutably limited is inaccessible, then if the object Q would normally be finalized by task T by the finalization of a master M, then during the finalization of any master of task T that occurs prior to the finalization of the master M the implementation may finalize the object Q. After such a finalization, the object no longer exists. [Editor's note: "directly allocated" means one that comes from "new", not a subcomponent of such an object. Better wording is requested.] AARM Ramification: After the object ceases to exist, the implementation may recover its storage. Or the user may recover its storage via an operation of its storage pool (as in a Mark/Release pool). Note that such storage reclaiming by a storage pool is erroneous until the object ceases to exist; this permission allows that to happen sooner than in the canonical semantics. If the finalization of the object has no effect, this early finalization does not even require any code. [Editor's note: An additional, separate proposal is expected to provide a mechanism to ensure that finalization occurs early for a Mark/Release pool, so that the use of such pools can be made portable.] AARM Discussion: We only allow early finalizations to occur at points where finalizations are expected to occur anyway, and only by the same task that would normally finalize the object. 
Truly asynchronous finalizations would cause many new issues for programmers and implementers; imagine a finalization being invoked during the assignment of a global composite variable that is also used by the finalization routine. We don't want programmers to have to take extra precautions to protect such objects in single-tasking programs. There are a lot of AARM notes on Garbage Collection in 13.11.3 which should be updated to reflect what the language actually allows and does not allow. I'll try to get Bob to do that. !discussion We exclude "immutably limited" types from this permission for three reasons: (1) In AI95-00147, we decided that we would not allow optimization of Finalization for limited controlled types, as these are often used for locks and the like. It would seem weird to water down that decision and allow that Finalization to be moved to some other point in the program. (2) Finalization of objects containing task parts includes waiting for those tasks to terminate. This would mean that any finalization point could potentially be a task waiting point and essentially a blocking point. Moreover, it could introduce deadlock where none was previously present if the object containing the task is finalized before some resource in the parent task that it depends on is available. This seems like a likely scenario, as the whole idea of child tasks is that they run independently. The wording as written doesn't take into account the active nature of tasks. (3) The implementation of task waiting is often different than that for finalization. An implementation burden may result from broadening the permission. There doesn't seem to be any technical problem preventing extending this permission to limited controlled types and to protected objects if we wish to do that. There would need to be some wording preventing problematic task scenarios that might cause deadlocks should we want to go all the way. 
There wouldn't be any implementation problem for Janus/Ada on these, either, as task waiting and protected objects are handled much like controlled objects. But that may not be true in other implementations, particularly for the task waiting. --- Note that this permission does not directly allow finalizing of objects during an allocator. This means that invoking a garbage collector as part of an allocator is not directly supported by the language. However, it appears to me that this would be allowed in virtually all real cases, as it is highly likely that a master finalization would directly precede the allocator. In that case, an "as-if" optimization would allow doing the garbage collection as part of the allocator. I've been unable to think of any non-pathological cases where this would not be allowed (and it could be detected), but I have not tried very hard to do so. On the other hand, an asynchronous garbage collector (that is, one that runs on a separate thread) would need special care, even with this permission. It could find garbage and collect garbage with trivial finalization at any point, but it would have to synchronize with the task that owns the object in order to finalize it. While this is unpleasant, the alternative would be to require most existing Ada code that uses finalization to be rewritten (to add additional locking). That seems like too much of a burden. --- I considered allowing this for stand-alone objects as well. That makes sense as an implementation could implement local composite objects as if they are all allocated from the heap (similar to Java), and the implementation could also garbage collect them. Also, it is uncomfortable that in: A : Some_Type; B : access Some_Type := new Some_Type; the finalization of B can be moved and A's cannot. 
However, I eventually gave up on the idea as it is possible for the finalization of an object to depend on the finalization of some other object in the same scope (there are examples of this in AI95-00280). It doesn't make sense to revisit that can of worms. Moreover, memory management is needed for objects allocated from storage pools; it is not needed for stand-alone objects. At least at this point, the only problems that have been identified have to do with memory management. The definition of "inaccessible" is designed to cover all objects, in case we want in the future to add additional permissions. --- In case there was a problem with defining "inaccessible object", I checked for all uses of the term "inaccessible" in the AARM: AARM 3.10.2(22.t) says that the term "inaccessible" is sometimes used in relationship to accessibility. I can find no such uses in the AARM (nor can I recall anyone ever having done that); I think this text should be deleted from the AARM. 13.11(25.2/2) uses "inaccessible object". Previously there was no definition of that; now there is. I don't think that there is a problem with this, but if someone finds one, we could reword that paragraph. (Or use a different term.) AARM 3.9.1(4.a) uses "inaccessible subprogram" informally. No issue. E.1(7) defines "inaccessible partition", which is then used in a number of other places in E.1 and E.3. That's not a problem. The automatically generated annex M uses "inaccessible" based on some of the previously discussed rules. And that's it. **************************************************************** From: Robert Dewar Sent: Saturday, March 15, 2008 8:18 AM Unless I misunderstand, this has the general form of implementation permission allowing a compiler to do certain things for an erroneous program. But any compiler is always allowed to do anything it likes for an erroneous program, including documenting that it does a certain thing and that you can rely on it. 
So I am not sure what the point is in practice. **************************************************************** From: Randy Brukardt Sent: Sunday, March 16, 2008 9:23 PM Yes, of course, but that is not exactly the case here. The permission allows a compiler to do something early for a program that *might* later become erroneous. That's not quite the same thing, because the early finalization could occur even if the program does not in fact ever execute anything that would trigger erroneous execution. (Remember that the executions are erroneous, not programs per se.) The fact that the finalization was done before the program became erroneous could be detected, and would be wrong (period). There is also value to having the permission so that people that are writing portable Ada code know what they have to support. In particular, if you are writing a library like Claw that is supposed to work on a variety of Ada systems (potentially including some not yet constructed), it's useful to know the bounds of "optimizations" that compilers are allowed to do on finalization. Of course, there may be a better solution out there; we've got to start somewhere and get the discussion rolling. **************************************************************** From: Tucker Taft Sent: Saturday, March 15, 2008 12:35 PM It was actually persistence that reignited my interest in storage pools recently. About the closest Ada 2005 has to persistence is shared passive library units. Shared passive units don't allow tasks or entry-ful protected types, and generally are used to hold objects or heaps that may outlast the execution of a single active partition. Similarly, shared passive units don't allow controlled types, since Ada.Finalization is neither Pure nor Shared_Passive. If one were to try to use the kinds of "allocator-specific" storage pools we have been talking about for persistence, one might like to somehow incorporate the rules for shared-passive library units. 
One annoying restriction for shared-passive units is that they don't permit the declaration of an access-to-class-wide type (E.2.1(7/1)), which is now *more* restrictive than what pure packages permit. This rule should probably be changed to be a restriction on allocators, namely that the type determined for the object created by an allocator for an access type declared in a shared-passive package must have preelaborable_initialization, and further, should itself be declared in a shared-passive or pure package. There is already a restriction on conversions implicit in E.2.1(8) that would prevent a conversion from an access type declared outside a shared-passive package being converted to one declared inside a shared-passive package. Another bug in E.2.1(7/1) is that it says the access type can't designate a task type or a protected type with entry_declarations, but there is nothing prohibiting designating a type that has a task component. This should probably piggy-back on the definition of preelaborable_initialization, and disallow an access type that designates a type that does not have preelaborable_initialization. Actually, if the class-wide type restriction of E.2.1(7/1) is shifted to being enforced at the point of an allocator, the task/protected-type restriction could as well, but perhaps doing it at both points is friendliest to the programmer. No point in declaring a type that no one can use. Now getting back to the topic of allocator-specific storage pools and persistence... Currently you can't specify a storage pool for an access type declared in a shared-passive package, since System.Storage_Pools is neither pure nor shared-passive. We should consider declaring it shared-passive. We might then have a separate sub-hierarchy of Root_Storage_Pool, say Root_Passive_Storage_Pool (or Root_Persistent_Storage_Pool, but in either case, Root PSP), that is specifically designed for use with access types and allocators associated with shared-passive packages. 
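Purely as a sketch of the sub-hierarchy floated above: the package, its name, and its contents are all invented here (no such unit exists, and the Shared_Passive legality of each piece would need checking); the Allocate profile just mirrors Root_Storage_Pool's from 13.11.

```ada
--  Hypothetical sketch only: a shared-passive root type for
--  "passive" storage pools.  Nothing here is real Ada.
with System.Storage_Elements;
package System.Passive_Storage_Pools is
   pragma Shared_Passive;

   type Root_Passive_Storage_Pool is abstract tagged limited private;

   --  Same profile as Root_Storage_Pool's Allocate (13.11), but no
   --  finalization-related machinery: objects allocated from a
   --  passive pool must not need finalization, and the pool itself
   --  may persist past the execution of any one active partition.
   procedure Allocate
     (Pool                     : in out Root_Passive_Storage_Pool;
      Storage_Address          : out Address;
      Size_In_Storage_Elements : in Storage_Elements.Storage_Count;
      Alignment                : in Storage_Elements.Storage_Count)
      is abstract;

private
   type Root_Passive_Storage_Pool is
     abstract tagged limited null record;
end System.Passive_Storage_Pools;
```

Under the rule sketched in the text, an allocator for a shared-passive access type would then only be allowed to name a pool of a type descended from this root.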
Unlike ESPs, PSPs would *not* need to worry about tasks and finalization, but they *would* need to worry about being referenced from multiple active partitions, and persisting past the end of their executions. I suppose rather than making System.Storage_Pools shared-passive, we could define a Root_Passive_Storage_Pool which is shared-passive, and then define Root_Storage_Pool to be a private extension of that. Presumably only descendants of the Root PSP that are themselves declared in a shared-passive package would be considered "passive" storage pools. In any case, an allocator for a shared-passive access type would only be allowed to specify a passive storage pool as the allocator-specific storage pool. So the overall rule would be that an allocator creating an object that needs finalization must specify a pool of a type that is derived from Root_Extended_Storage_Pool, and an allocator for a shared-passive access type must *not* be creating an object that needs finalization, and may only specify a pool that is a passive storage pool. Or something like that... **************************************************************** From: Tucker Taft Sent: Saturday, March 15, 2008 1:06 PM By the way, we might consider changing the definition of "needs finalization" to be consistent with the definition for "preelaborable_initialization" by exempting entry-less protected types. Since entry-less protected objects can be placed in shared-passive packages, and since they are presumed to require no run-time elaboration, it seems that they should not be considered to "need" finalization. **************************************************************** From: Randy Brukardt Sent: Sunday, March 16, 2008 9:29 PM > It was actually persistence that reignited my interest in storage > pools recently. About the closest Ada 2005 has to persistence is > shared passive library units. 
Shared passive units don't allow tasks > or entry-ful protected types, and generally are used to hold objects > or heaps that may outlast the execution of a single active partition. > Similarly, shared passive units don't allow controlled types, since > Ada.Finalization is neither Pure nor Shared_Passive. I'm not convinced that shared passive units are the best model to use for persistence. For one thing, omitting (non-limited) controlled types seems like a non-starter to me: it would not allow any containers, for instance. I've always thought that the place to start was with streaming and availability, because the existence of available attributes mean that it is possible to "flatten" the object to some sort of backing device (and also to allow appropriate restoring). Non-limited controlled objects do have available stream attributes. (And limited types with available attributes also could be made persistent - in that case, the user is deciding how to handle the messy cases, which seems appropriate.) Anyway, more to think about. (This does seem to be getting more complex as we go...) **************************************************************** From: Randy Brukardt Sent: Sunday, March 16, 2008 9:43 PM > By the way, we might consider changing the definition of "needs > finalization" to be consistent with the definition for > "preelaborable_initialization" by exempting entry-less protected > types. Since entry-less protected objects can be placed in > shared-passive packages, and since they are presumed to require no > run-time elaboration, it seems that they should not be considered to > "need" finalization. I've never understood how that was supposed to work. Our PT's surely need runtime elaboration: they're all controlled objects (even if the finalization doesn't do anything explicit, it still has to clean up the locks). So I'd be against spreading a mistake further (the current rules don't actually *require* no run-time actions, only "advise" it). 
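The streaming-based model Randy describes, flattening an object to a backing device via its available stream attributes and restoring it later, can be shown with ordinary Ada; the record type and file name here are invented for illustration.

```ada
--  Illustration of persistence via stream attributes (the type and
--  file name are made up; any type with available attributes works).
with Ada.Streams.Stream_IO;
procedure Persist_Demo is
   package SIO renames Ada.Streams.Stream_IO;
   type Account is record
      Id      : Integer;
      Balance : Float;
   end record;
   File : SIO.File_Type;
   Obj  : Account := (Id => 42, Balance => 100.0);
begin
   SIO.Create (File, SIO.Out_File, "account.dat");
   Account'Write (SIO.Stream (File), Obj);  --  flatten to backing store
   SIO.Close (File);

   SIO.Open (File, SIO.In_File, "account.dat");
   Account'Read (SIO.Stream (File), Obj);   --  restore
   SIO.Close (File);
end Persist_Demo;
```

Note this is the second ("marshalling") model: the on-disk form need not resemble the in-memory form, which is exactly why Tucker's page-oriented model below takes a different route.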
**************************************************************** From: Tucker Taft Sent: Sunday, March 16, 2008 10:57 PM There are really at least two different approaches to persistence, one which is based on a representation that can be paged in and out of memory, and has few if any differences between the in-memory and the on-disk representation. At a minimum, the in-memory and the on-disk representations take up the same amount of space, so you can do whatever "fixups" are needed on read/write of a page "in place." As an example of such a persistence approach, take a look at the "Thor" system described by Barbara Liskov et al. in "Distributed Object Management in Thor": http://citeseer.ist.psu.edu/liskov93distributed.html An alternative approach is the one you suggest, which is more based on streaming/marshalling/serializing, where the in-memory and on-disk representations have little or nothing to do with one another. These generally involve swapping objects rather than, say, pages. I am currently more interested in the former, which maps quite nicely to the distribution passive partition model, where something very close to the in-memory representation persists beyond the execution of a single active partition. For this approach, finalization could be accommodated, but it would only be to support unchecked-deallocation, since it is not generally practical to have to invoke the finalization operations of persistent objects when the passive partition finally dies (if it ever does). The second model clearly relates closely to the streaming stuff, and I agree for that model, having controlled types works fine. 
I don't see much of a connection between this second model and storage
pools (though I may be missing something obvious), whereas for the first
model, persistent/passive storage pools make sense, and would almost be
essential to make shared-passive packages useful (I suspect you find
shared-passive packages useless, but I think at least for the purposes of
supporting persistence, they could be quite useful).

****************************************************************

From: Robert A. Duff
Sent: Monday, March 17, 2008 8:18 AM

> useless, but I think at least for the purposes of supporting
> persistence, they could be quite useful).

You might be interested in this:

   http://www.adacore.com/2007/12/03/ada-gem-20/

which talks about how GNAT shared passive packages support persistence.

****************************************************************

From: Robert Dewar
Sent: Monday, March 17, 2008 8:25 AM

The interesting history here is that I needed to do something to support
shared passive, and I hit on this idea of using persistence in this way,
and then decided it was a nice useful feature :-)

****************************************************************

From: Robert I. Eachus
Sent: Monday, March 17, 2008 12:49 PM

> This does seem to be getting more complex as we go...

Yes, it does. And as such, I think that modifying all of Ada to allow for
mark-release and garbage-collected storage pools is taking way too large a
bite. I'm going to make a strawman proposal, then see if it can be fit
back into the RM without major harm. I'm going to start with a (somewhat
too) simple idea, then see what it takes to make it work.
   with Ada.Finalization;
   package Ada.Unchecked_Finalization is
      pragma Preelaborate(Unchecked_Finalization);

      type Controlled is new Ada.Finalization.Controlled;
      pragma Preelaborable_Initialization(Controlled);
      procedure Initialize (Object : in out Controlled) is null;
      procedure Adjust     (Object : in out Controlled) is null;
      procedure Finalize   (Object : in out Controlled) is null;
      procedure Unchecked_Finalize (Object : in out Controlled) is null;

      type Limited_Controlled is new Ada.Finalization.Limited_Controlled;
      pragma Preelaborable_Initialization(Limited_Controlled);
      procedure Initialize (Object : in out Limited_Controlled) is null;
      procedure Finalize   (Object : in out Limited_Controlled) is null;
      procedure Unchecked_Finalize
        (Object : in out Limited_Controlled) is null;

   private
      ... -- not specified by the language
   end Ada.Unchecked_Finalization;

What does this do for us? The first thing it does is define the semantics
of Unchecked_Finalize:

  - Unchecked_Finalize can be called while an object is still accessible*
  - Unchecked_Finalize does not call Finalize or Unchecked_Finalize on
    subcomponents; in particular, the compiler magic that finalizes
    components before finalizing the object doesn't happen.

Calling Unchecked_Finalize on an object means that:

  1) Any future references to the object are erroneous.
  2) Implementations are expected not to call Finalize for objects that
     have been Unchecked_Finalized.

Now you can have mark-release and garbage-collected storage pools of
objects derived from Unchecked_Finalization.Controlled.

Does this do everything desired? Not really; there are two issues. The
first is that the user is responsible for finalization of components of
types derived from Controlled or Limited_Controlled. Fortunately, defining
Unchecked_Finalize to make the necessary calls to Finalize, and in the
right order, is possible--and probably a good idea during debugging.
But...
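To make the component-finalization obligation concrete, here is a
hypothetical sketch against the proposed package. Nothing here is existing
Ada: Ada.Unchecked_Finalization is the proposal above, and the names
Example, Handle, and Node are invented for illustration. The point is
simply that Unchecked_Finalize must call Finalize on each controlled
component itself, since no compiler-generated component finalization
occurs.

```ada
with Ada.Finalization;
with Ada.Unchecked_Finalization;  --  the proposed package; does not exist

package Example is

   --  An ordinary controlled component managing some resource.
   type Handle is new Ada.Finalization.Controlled with null record;
   overriding procedure Finalize (Object : in out Handle);

   --  A pool-allocated type using the proposed unchecked scheme.
   type Node is new Ada.Unchecked_Finalization.Controlled with record
      Resource : Handle;
   end record;
   overriding procedure Unchecked_Finalize (Object : in out Node);

end Example;

package body Example is

   overriding procedure Finalize (Object : in out Handle) is
   begin
      null;  --  release whatever resource Handle manages
   end Finalize;

   overriding procedure Unchecked_Finalize (Object : in out Node) is
   begin
      --  The compiler will not finalize components here, so the user
      --  must call Finalize on each controlled component explicitly,
      --  in the right order.
      Finalize (Object.Resource);
   end Unchecked_Finalize;

end Example;
```

Forgetting the call to Finalize (Object.Resource) is exactly the
"localized extension of erroneousness" discussed below.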
The hope is that storage pools for Unchecked_Finalization types can wave a
magic wand and do Unchecked_Finalize for all the objects that are about to
be released. This is especially important for mark-release storage pools.
Garbage-collecting pools are probably going to do compaction; if so, you
would want a different package which would only support non-limited types.
In practice an implementation would probably provide a package with a
different name indicating the intent.

Let's go back to mark-release for a bit. If the pool is intended to use
mark-release for a set of types all derived from the same root, I would
probably, for efficiency reasons, specify a size for the type and the same
size for all its children. This gives you a pretty easy implementation for
the pool. (If for no other reason than to ensure that the objects do not
cross cache-line boundaries.) You create an array of objects of that size,
and mark and release just move an index. If Unchecked_Finalize is non-null
for any child type, the compiler can give a warning at compile time, and
set a flag that requires the release procedure to step through the array
calling Unchecked_Finalize. Again, if a subcomponent requires
finalization, it is the programmer's job to make that call from
Unchecked_Finalize. But note that a type now has two finalization
procedures: one that assumes that linked lists and so on do not need to be
maintained, and regular finalization, which does all that work.

Is this a complete solution to the issue? No. But the amount of compiler
involvement is small enough that I can see making the changes and creating
mark-release and garbage-collected packages. Then we would have a real
example to work with, and something for users that does not change
erroneousness for any existing code.
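The fixed-slot pool described above (an array of equal-sized slots, with
mark and release just moving an index) might be sketched as follows. This
is illustrative only: the names MR_Pool, Mark_Point, etc. are invented,
alignment is assumed adequate and otherwise ignored, and the sweep that
calls Unchecked_Finalize on released slots is left as a comment.

```ada
with System;
with System.Storage_Pools;
with System.Storage_Elements;  use System.Storage_Elements;

package Mark_Release_Pools is

   --  Pool_Size is the total capacity in storage elements; every
   --  allocation consumes exactly one Slot_Size-element slot.
   type MR_Pool (Pool_Size, Slot_Size : Storage_Count) is
     new System.Storage_Pools.Root_Storage_Pool with private;

   type Mark_Point is private;

   function Mark (Pool : MR_Pool) return Mark_Point;

   --  Release everything allocated since To.  A real implementation
   --  would first sweep the released slots, calling Unchecked_Finalize
   --  where the allocated type requires it.
   procedure Release (Pool : in out MR_Pool; To : Mark_Point);

   overriding procedure Allocate
     (Pool                     : in out MR_Pool;
      Storage_Address          : out System.Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count);

   --  Individual deallocation is a no-op; storage is reclaimed
   --  only by Release.
   overriding procedure Deallocate
     (Pool                     : in out MR_Pool;
      Storage_Address          : System.Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count) is null;

   overriding function Storage_Size (Pool : MR_Pool)
     return Storage_Count;

private

   type Mark_Point is new Storage_Count;

   type MR_Pool (Pool_Size, Slot_Size : Storage_Count) is
     new System.Storage_Pools.Root_Storage_Pool with record
      Data : Storage_Array (1 .. Pool_Size);
      Top  : Storage_Count := 0;  -- storage elements currently in use
   end record;

end Mark_Release_Pools;

package body Mark_Release_Pools is

   overriding procedure Allocate
     (Pool                     : in out MR_Pool;
      Storage_Address          : out System.Address;
      Size_In_Storage_Elements : Storage_Count;
      Alignment                : Storage_Count)  -- ignored in this sketch
   is
   begin
      if Size_In_Storage_Elements > Pool.Slot_Size
        or else Pool.Top + Pool.Slot_Size > Pool.Pool_Size
      then
         raise Storage_Error;
      end if;
      Storage_Address := Pool.Data (Pool.Top + 1)'Address;
      Pool.Top := Pool.Top + Pool.Slot_Size;  -- just bump the index
   end Allocate;

   function Mark (Pool : MR_Pool) return Mark_Point is
   begin
      return Mark_Point (Pool.Top);
   end Mark;

   procedure Release (Pool : in out MR_Pool; To : Mark_Point) is
   begin
      Pool.Top := Storage_Count (To);  -- everything above To is gone
   end Release;

   overriding function Storage_Size (Pool : MR_Pool)
     return Storage_Count is
   begin
      return Pool.Pool_Size;
   end Storage_Size;

end Mark_Release_Pools;
```

Because Release only moves an index, its cost is constant unless the
Unchecked_Finalize sweep is needed, which matches the compile-time-flag
scheme described above.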
Yes, it would result in erroneousness if a programmer extended a type with
unchecked finalization using a type that needs finalization, and didn't
put a call to Finalize in the Unchecked_Finalize routine. But that is a
very localized extension of erroneousness.

* Big deal, you can do that with Finalize, and in some cases the
implementation will even call Finalize several times on the same object.

****************************************************************

From: Randy Brukardt
Sent: Monday, March 17, 2008 2:07 PM

...

> There are really at least two different approaches to persistence, one
> which is based on a representation that can be paged in and out of
> memory, and has few if any differences between the in-memory and the
> on-disk representation. At a minimum, the in-memory and the on-disk
> representations take up the same amount of space, so you can do
> whatever "fixups" are needed on read/write of a page "in place."
> As an example of such a persistence approach, take a look at the
> "Thor" system described by Barbara Liskov et al. in "Distributed
> Object Management in Thor":
>
> http://citeseer.ist.psu.edu/liskov93distributed.html

OK, but that also maps to a sort of streaming. In fact, Storage_IO
provides that streaming. (Indeed, I think that the streaming should have
been done that way in the first place.) You do need access values to be
implemented as relative references in order for that to work practically
(the alternative of trying to record every access part in the partition
quickly becomes a nightmare, plus it is rather expensive to do fixups).
Which seems like a fairly big change.

...

> I am currently more interested in the former, which maps quite nicely
> to the distribution passive partition model, where something very
> close to the in-memory representation persists beyond the execution of
> a single active partition.
> For this approach, finalization could be accommodated, but it would
> only be to support unchecked-deallocation, since it is not generally
> practical to have to invoke the finalization operations of persistent
> objects when the passive partition finally dies (if it ever does).

There is more to controlled types than just finalization, of course.
Initialize and especially Adjust are just as important. And there is value
to types that work everywhere (Finalize is called when the object ceases
to exist, which may be never for a persistent object).

There doesn't seem to be any need to call Finalize on a persistent
partition, assuming that it is defined to live forever. (If an
extra-implementation means, such as a file delete, is used to get rid of
it, we're not going to have any reason to worry about preserving the
semantics.)

We've talked in the past about trying to figure out the difference between
Finalize routines that are purely memory management (and as such don't
really need to be called at the library level when the program is going
away anyway) and those that do something critical (like release a
resource). The latter kind shouldn't be used in a persistent anything
(you don't want to lock a resource forever), but the former is just fine.

The real problem with this approach seems to be that you have to have
types that are crafted to work for persistence (or are scalar). That
limits the utility compared to the streaming approach, where the only
requirement is that streaming work (which is a good idea anyway).

...

> I don't see much of a connection between this second model and storage
> pools (though I may be missing something obvious), whereas for the
> first model, persistent/passive storage pools make sense, and would
> almost be essential to make shared-passive packages useful (I suspect
> you find shared-passive packages useless, but I think at least for the
> purposes of supporting persistence, they could be quite useful).
They *are* useless, as far as a language-defined persistence mechanism
goes, because they support neither controlled types nor OOP (no
access-to-classwide). We have way too many "either you get this or you get
that, but not both" things in Ada as it is. Saying that you can have
persistence OR OOP, but not both, is surely not going to help advance Ada.
There would be value to a language-defined way to do persistence, but it
would have to be fairly general (at least all non-limited types).

And I definitely agree that storage pools have nothing much to do with
persistence, because the only things you could put into a persistent
storage pool are those that are self-flattened. That's way too limiting
(or very burdensome on the implementation). If someone wants to create
such pools for their own use, that's fine. But I don't see much value in
twisting the language definition around solely to make it easier to create
persistent pools that could never be part of the language definition, as
they are so limiting.

****************************************************************

From: Tucker Taft
Sent: Friday, June 13, 2008 2:48 PM

Here is the start of an AI on per-allocator pools. Still pretty sketchy in
parts. [This is version /01 of the AI - Editor.]

****************************************************************

From: Robert A. Duff
Sent: Friday, June 13, 2008 4:20 PM

Interesting ideas. I see we've discussed similar stuff in these threads:

   Subject: [ARG] Specifying pool on allocator
   Subject: [ARG] Persistent storage pools; E.2.1(7/1)

[Editor's note: These threads are filed in this AI.]

****************************************************************

From: Tucker Taft
Sent: Friday, June 13, 2008 2:48 PM

I had some better ideas on how to deal with "subpool" access values that
are components. See below. [This is version /02 of the AI - Editor.]

****************************************************************