Version 1.1 of acs/ac-00306.txt


!standard A.18.32(0)          18-03-28 AC95-00306/00
!class Amendment 18-03-28
!status received no action 18-03-28
!status received 18-03-09
!subject Defining when default initialization happens in a container
!summary
!appendix

From: Tucker Taft
Sent: Friday, March  9, 2018  8:16 AM

Here is a comment from Martin Dowie, the customer who was complaining about
"unnecessary initialization" of containers:

> Spotted the AI
> (http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0258-1.txt?rev=1.1)
> ...hope the example replacement implementation I sent made it more obvious
> that it wasn't the Controlled nature of the vector element per se that I had
> concerns about (although I think they have the potential to make things
> worse...I could always grab a scarce system resource via the Controlled
> type's Initialize subprogram)...it was the initialisation of objects that
> logically don't exist yet *at all*...even simple records with an Integer
> value with default initial value. See example with very large array of
> default initialized elements.

His point is that if you have a large bounded vector, there might be a large
start-up cost to initialize the entire vector's space for no good reason, if
ultimately only a small portion of the vector is used.  He seems less concerned
about controlling when finalization happens, as many real-time embedded systems
are never expected to terminate (or at least not gracefully ;-).
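
To make the complaint concrete, here is a minimal sketch (the type and bound
are invented) of the pattern he is describing; in a typical array-based
implementation of the bounded form, the declaration alone pays for
initializing every slot:

   with Ada.Containers.Bounded_Vectors;

   procedure Demo is
      type Sample is record
         Value : Integer := 0; -- default initialization forces per-slot work
      end record;

      package Sample_Vectors is new Ada.Containers.Bounded_Vectors
        (Index_Type => Positive, Element_Type => Sample);

      -- In a typical array-based implementation, all 1_000_000 slots are
      -- default-initialized right here, at the declaration:
      V : Sample_Vectors.Vector (Capacity => 1_000_000);
   begin
      V.Append ((Value => 42)); -- even if only one element ever exists
   end Demo;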

***************************************************************

From: Randy Brukardt
Sent: Friday, March  9, 2018  3:34 PM

Sure, but this is a problem for arrays in general (it has little to do with the
containers). I'd run into this occasionally in my programs; the only sane thing
to do is to remove the initialization for the component type so that readers and
the compiler both know that it isn't initialized.

You could suppress the initialization by using Import, but that essentially is
just lying about the initialization of the array. And if anyone uses the
uninitialized junk, your program is erroneous (because Import means that you
expect the foreign code to do the needed initialization, if it doesn't do that -
in this case because there is no foreign code - then the code is erroneous by
the blanket rule in Annex B).
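
For reference, the Import trick looks something like this (a sketch, names
invented); the aspect claims the object is initialized externally, so the
compiler omits the default initialization, with exactly the erroneousness
caveat above:

   type Slot is record
      Value : Integer := 0;
   end record;
   type Slot_Table is array (1 .. 100_000) of Slot;

   Buffer : Slot_Table with Import;
   -- No default initialization is performed; reading any component of
   -- Buffer before assigning it is erroneous (the Annex B blanket rule).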

If we want the containers to be implementable in Ada, then a bounded container
(which is built out of an array) is of course going to suffer from the same
problem -- and need the same solution (don't do that if it hurts).

What he really wants is a bounded indefinite container. (Note that the
Bounded_Indefinite_Holders by themselves don't help for this, because they'll
have internal initialization, causing the problem all over again. You need a
real Bounded_Indefinite_Hashed_Map or the like).

***************************************************************

From: Tucker Taft
Sent: Friday, March  9, 2018  3:58 PM

I already gave him that advice, but agreed to pass along his concern.  He claims
that the C++ bounded vectors don't have this problem, and so are many times
faster.  Sigh...

And yes, I think we should probably offer Bounded_Indefinite_Vectors, at least.

***************************************************************

From: Randy Brukardt
Sent: Friday, March  9, 2018  11:42 PM

I can write an AI to that effect (including the wording needed), but don't want
to bother if there is a lot of opposition.

As I previously noted, because of Janus/Ada code shared generics, a
Bounded_Indefinite_Vectors would probably be a lot more efficient than the other
forms (assuming that all of the forms are written in Ada).

***************************************************************

From: Tucker Taft
Sent: Saturday, March 10, 2018  9:02 AM

> I can write an AI to that effect (including the wording needed), but
> don't want to bother if there is a lot of opposition.

No opposition on my end, of course.

***************************************************************

From: Bob Duff
Sent: Saturday, March 10, 2018  9:41 AM

I would rather either do nothing, or fix the problem properly (as C++ does,
using "placement new"). And I don't have the energy to determine whether the
"proper" solution is feasible, so I reluctantly advocate doing nothing.  We have
more important things to do.

***************************************************************

From: Tucker Taft
Sent: Saturday, March 10, 2018  9:56 AM

I believe our customer implemented a version that does the "right" thing for
bounded containers, at least in his view.   I presume he used tricks with byte
arrays and address clauses and/or perhaps the storage-subpool feature. He
doesn't seem to object to the current implementation of unbounded containers, as
these generally start small and grow incrementally, so there is no massive
one-time default initialization expense.
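
One plausible shape for such a byte-array/address-clause trick (a guess, not
his actual code; all names invented):

   with System.Storage_Elements; use System.Storage_Elements;

   procedure Overlay_Demo is
      type Element is record
         Value : Integer := 0;
      end record;

      Slot_Size : constant Storage_Count :=
        Element'Max_Size_In_Storage_Elements;

      -- Raw storage for 10 elements; Storage_Elements have no default
      -- initialization, so this declaration costs nothing:
      Raw : aliased Storage_Array (0 .. 10 * Slot_Size - 1);

      -- Overlay an Element view onto slot 3 only when the element is
      -- logically created; Import suppresses the view's initialization:
      E : Element
        with Address => Raw (3 * Slot_Size)'Address, Import;
   begin
      E := (Value => 42); -- the slot gets initialized only now
   end Overlay_Demo;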

I think bounded indefinite containers make perfect sense for other reasons, and
would be a welcome addition to the language without much expense from a
specification or implementation point of view, and as a side-effect they would
provide "precise" initialization/finalization semantics.

***************************************************************

From: Randy Brukardt
Sent: Saturday, March 10, 2018  7:52 PM

> I would rather either do nothing, or fix the problem properly (as C++
> does, using "placement new").
> And I don't have the energy to determine whether the "proper"
> solution is feasible, so I reluctantly advocate doing nothing.  We
> have more important things to do.

"Placement New" itself is no real problem -- the Janus/Ada implementation uses
something like it to initialize all objects. For record types, this is
materialized into a procedure "thunk" which takes the address of the memory to
initialize. (Allocation is done separately, since it might come from the heap, a
user-defined storage pool, or the stack.)

The real problem with it from an Ada perspective is "what happens to whatever
was there previously?". If we were to somehow declare that the memory is truly
uninitialized, then you have a new way to introduce erroneousness (I surely hope
that a discontiguous object whose other parts are missing is erroneous!!). And
you've abandoned the guarantee that every object of a type with default
initialization is in fact initialized. That seems to be the wrong direction to
move for a language primarily intended for safety-critical work.

If the previous contents are initialized, then (A) we have to figure out what to
do with those contents (especially in cases like controlled and protected
objects), and (B) we haven't really fixed anything.

So the 30-second think on the topic suggests that "placement new" is the wrong
thing for Ada. (I suppose one could come to a different conclusion if one spent
more time thinking about the topic -- but, like Bob, I don't have the energy --
or budget -- for that.)

***************************************************************

From: Tucker Taft
Sent: Sunday, March 11, 2018  10:43 AM

My sense is that the storage subpool stuff (13.11.{4,5,6}) would be almost
exactly what you want for bounded indefinite containers.  You create a subpool
to represent the bounded container, and then use an allocator identifying that
subpool each time a new element is created in the container, and you do an
unchecked deallocation each time you delete an element, and you reclaim the
whole subpool when the container goes away.  Presuming this approach is viable,
this makes bounded indefinite containers pretty easy for an (Ada 2012)
implementation to support.
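
Assuming a user-written pool-with-subpools implementation exists
(My_Subpool_Pools below is hypothetical, as are the other names), the
Ada 2012 pattern would look roughly like:

   with System.Storage_Pools.Subpools; use System.Storage_Pools.Subpools;
   with Ada.Unchecked_Deallocation;
   with Ada.Unchecked_Deallocate_Subpool;
   with My_Subpool_Pools; -- hypothetical; Pool_Type is derived from
                          -- Root_Storage_Pool_With_Subpools (RM 13.11.4)

   procedure Subpool_Sketch is
      Shared_Pool : My_Subpool_Pools.Pool_Type; -- one pool for the access type

      -- One subpool per container object:
      Container_SP : Subpool_Handle :=
        My_Subpool_Pools.Create_Subpool (Shared_Pool);

      type Element is record
         Value : Integer := 0;
      end record;
      type Element_Access is access Element
        with Storage_Pool => Shared_Pool;

      procedure Free is
        new Ada.Unchecked_Deallocation (Element, Element_Access);

      E : Element_Access;
   begin
      -- Insert: the element is default-initialized here, when it is created:
      E := new (Container_SP) Element;
      -- Delete: per-element unchecked deallocation:
      Free (E);
      -- Container destruction: reclaim (and finalize) everything at once:
      Ada.Unchecked_Deallocate_Subpool (Container_SP);
   end Subpool_Sketch;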

***************************************************************

From: Randy Brukardt
Sent: Monday, March 12, 2018  8:11 PM

Subpools might lead to fragmentation, since the Inserts for various containers
of the type could be interleaved.

I had an even easier idea, although it turns out to have a problem.
Essentially, each container includes a pool object (with trivial initialization
and finalization so it doesn't act controlled). That object is declared
something like (the types aren't right and none of the checking code - for size,
alignment, and deallocation - is shown):

   type Bounded_Pool (Size, Capacity : Natural) is record
      Data : Storage_Elements (0 .. Size);
      Is_Allocated : Bool_Array (1 .. Capacity) := (others => False);
   end record;

   -- Element_Size below is the fixed per-slot size (say, a generic formal);
   -- Size = Capacity * Element_Size.

   procedure Allocate(
      Pool : in out Bounded_Pool;
      Storage_Address : out Address;
      Size_In_Storage_Elements : in Storage_Elements.Storage_Count;
      Alignment : in Storage_Elements.Storage_Count) is
      Slot : Natural := 0;
   begin
      -- Find an unallocated slot:
      for I in Pool.Is_Allocated'Range loop
         if not Pool.Is_Allocated(I) then
            Slot := I;
            exit;
         end if;
      end loop;
      -- Aside: Tucker can probably turn the above into a reduction
      -- expression, even though it has nothing to do with reduction per-se. :-)
      if Slot = 0 then
         raise Storage_Error; -- Full.
      end if;
      Pool.Is_Allocated(Slot) := True;
      Storage_Address := Pool.Data((Slot-1) * Element_Size)'Address;
   end Allocate;

   procedure Deallocate(
      Pool : in out Bounded_Pool;
      Storage_Address : in Address;
      Size_In_Storage_Elements : in Storage_Elements.Storage_Count;
      Alignment : in Storage_Elements.Storage_Count) is
      Offset : Storage_Offset;
   begin
      Offset := Storage_Address - Pool.Data(0)'Address;
      Pool.Is_Allocated ((Offset / Element_Size)+1) := False;
   end Deallocate;

The implementation for Janus/Ada or any implementation with discontiguous
objects would be more complicated, but the idea remains the same. (Since the use
of this pool is in lock-step with its definition, one can use some extra
operations to delineate the boundaries of each object, in which case one just
needs a high-water mark in the current slot for allocation.)

The problem with this is that the Size discriminant is clearly a function of the
Capacity discriminant (probably actually a function in order to do rounding up
for alignment), and that is illegal by 3.8(12).

That is, the component needs to be something like:
     type Bounded_List (Capacity : Natural) is record
         My_Pool : Bounded_Pool (Capacity*Element_Size, Capacity);
         ...
     end record;

which is not allowed by Ada. The only way that I know to get around that is
allocate this from some global storage pool, which is not a problem for
Janus/Ada (since everything pretty much uses the heap anyway, worrying about one
more use is silly) -- but it defeats the purpose in safety-critical systems that
don't allow using the heap.

I recall that Steve had some of his usual unlikely examples to demonstrate why
lifting 3.8(12) wasn't a good idea, even though this particular case wouldn't be
a problem. I think.

It seems nasty that the idea of Bounded_Indefinite containers would fall apart
on such a minor implementation issue.

***************************************************************

From: Tucker Taft
Sent: Monday, March 12, 2018  8:42 PM

> Subpools might lead to fragmentation, since the Inserts for various
> containers of the type could be interleaved.

I don't follow.  I am presuming each bounded container object would get its own subpool.

...
>      -- Aside: Tucker can probably turn the above into a reduction
>      -- expression, even though it has nothing to do with reduction per-se. :-)

I suppose the following would work (but be delightfully inefficient):

   (for I in reverse Is_Allocated'Range => (if not Is_Allocated(I) then I else <0>))

***************************************************************

From: Randy Brukardt
Sent: Monday, March 12, 2018  9:12 PM

> > Subpools might lead to fragmentation, since the Inserts for various
> > containers of the type could be interleaved.
>
> I don't follow.  I am presuming each bounded container object would
> get its own subpool.

Generally, subpools allocate out of a shared memory space. If, on the other
hand, each subpool has its own memory space that's part of the container, then
that's not an issue. But then it seems that there isn't any use to the shared
pool object. And it seems that you'd have the same problem declaring the
subpool's data area (can't use a formula of a discriminant in a type
declaration) as I had with a separate pool in each container.

...

> >      -- Aside: Tucker can probably turn the above into a reduction
> >      -- expression, even though it has nothing to do with reduction per-se. :-)
>
> I suppose the following would work (but be delightfully inefficient):
>
>    (for I in reverse Is_Allocated'Range => (if not
> Is_Allocated(I) then I else <0>))

This is missing the selection part.

I was thinking something like:

    (for I in Pool.Is_Allocated'Range when Pool.Is_Allocated(I) => I)'Reduce(Natural'Min, Natural'Last);

Assuming you get 100% of your wants in this area. It would be inefficient if the
compiler didn't recognize this case and stop the iteration early. (Since I is
steadily increasing, and 'Min is just looking for the smallest value, once it
"hits" there's no use in continuing.) And, for me at least, it is rather tricky.

If you wanted to get trickier still and care even less about efficiency, you
could use an array aggregate:

    (declare
        Find_Slot : Natural_Array := (for I in Pool.Is_Allocated'Range when Pool.Is_Allocated(I) => I);
     begin
        (if Find_Slot'Length = 0 then raise Storage_Error else Find_Slot(Find_Slot'First)))

Here, the compiler would know the lifetime of Find_Slot is really short and
could try to optimize some or all of it away (only the first component is ever
referenced, after all). Probably would need a pretty fancy compiler, though.

Functional programming is way cool, huh??? :-)

Aside: I didn't think these aggregate iterators included "reverse" since the
order of evaluation is unspecified??

***************************************************************

From: Tucker Taft
Sent: Monday, March 12, 2018  9:27 PM

>>> Subpools might lead to fragmentation, since the Inserts for various
>>> containers of the type could be interleaved.
>>
>> I don't follow.  I am presuming each bounded container object would
>> get its own subpool.
>
> Generally, subpools allocate out of a shared memory space.

Not in my world!  The whole point of subpools in my world view is to support
rapid deallocation, meaning they aren't interspersed in some shared memory
space.

> If, on the other
> hand, each subpool has it's own memory space that's part of the
> container, then that's not an issue.

Yes, that was always my model of subpools.

> But then it seems that there isn't any use to the shared pool object.

It keeps track of which is the default subpool.  It could also keep track of
empty blocks that are released when a subpool is released, rather than returning
them back to the O/S.

> And it seems that you'd have the same problem declaring the subpool's
> data area (can't use a formula of a discriminant in a type
> declaration) as I had with a separate pool in each container.

I didn't understand the problem you were trying to solve.  I presumed the user
would specify the size of the bounded indefinite container in storage units, not
in number of elements, since the size of the elements would typically vary, and
what they care about is bounding the total space occupied by the container.
E.g.:

  type Vector (Space : Storage_Count) is tagged private;

> ...
>>>     -- Find an unallocated slot:
>>>     for I in Pool.Is_Allocated'Range loop
>>>        if not Pool.Is_Allocated(I) then
>>>           Slot := I;
>>>           exit;
>>>        end if;
>>>     end loop;
>>>     -- Aside: Tucker can probably turn the above into a reduction
>>>     -- expression, even though it has nothing to do with reduction per-se. :-)
>>
>> I suppose the following would work (but be delightfully inefficient):
>>
>>   (for I in reverse Is_Allocated'Range => (if not
>> Is_Allocated(I) then I else <0>))
>
> This is missing the selection part.

I did it in reverse so the first element would be last ;-)  That is why I said
it was delightfully inefficient.  But you are right, a filter would make more
sense...
...

***************************************************************

From: Randy Brukardt
Sent: Tuesday, March 13, 2018  12:01 AM

> > But then it seems that there isn't any use to the shared pool object.
>
> It keeps track of which is the default subpool.  It could also keep
> track of empty blocks that are released when a subpool is released,
> rather than returning them back to the O/S.

I don't think you would want to allocate anything from a default subpool.
Remember one of the purposes of these guys is to be used in environments where
dynamic allocation isn't allowed. Ideally, these allocators would map directly
into an array of bytes.

> > And it seems that you'd have the same problem declaring the
> > subpool's data area (can't use a formula of a discriminant in a type
> > declaration) as I had with a separate pool in each container.
>
> I didn't understand the problem you were trying to solve.  I presumed
> the user would specify the size of the bounded indefinite container in
> storage units, not in number of elements, since the size of the
> elements would typically vary, and what they care about is bounding
> the total space occupied by the container.  E.g.:
>
>   type Vector (Space : Storage_Count) is tagged private;

I was following the model of the Bounded_Indefinite_Holder: there is a generic
parameter which specifies the maximum number of storage units for each element.
And to minimize the differences with the regular bounded containers, a Capacity
discriminant. The above would seem to maximize the differences without any added
benefit.

For each container, the model was that the underlying data was an array of fixed
size buckets, each the size specified by the generic parameter. This model
eliminates any possibility of fragmentation; the reason for using a pool at all
is to get the effect of placement new. Thus, the pool is just divvying up an
array. If you used "normal" allocation, then you have all of the possibilities
for fragmentation that scare these safety-critical folks.

So the allocation size is the max size of one element times the capacity
(adjusted as needed for alignment).
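
That is, the arithmetic would be something like this (a sketch;
Max_Element_Size and Element_Alignment are assumed to come from the generic,
and Capacity from the discriminant):

   -- Round the element size up to a multiple of its alignment, then
   -- multiply by the capacity:
   Slot_Size : constant Storage_Count :=
     ((Max_Element_Size + Element_Alignment - 1) / Element_Alignment)
       * Element_Alignment;
   -- e.g., Max_Element_Size = 37 and Element_Alignment = 8 give Slot_Size = 40
   Pool_Size : constant Storage_Count := Slot_Size * Storage_Count (Capacity);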

Just letting the user set the size this way doesn't fix anything, since one then
would have to calculate the capacity. (One needs a sane bound on the number of
elements in the pool.) And that requires an expression (again, taking into
account the alignment of the element type). Again, you could do that with a heap
allocation, but again, the safety critical people would not be happy.

***************************************************************

From: Tucker Taft
Sent: Tuesday, March 13, 2018  7:44 AM

> I don't think you would want to allocate anything from a default subpool.

In this case, I agree.  I was talking about my more general model of subpools,
where the pool object could be used to keep track of large blocks from which
subpool blocks are allocated, as well as a pointer to the current default
subpool.  In this case of bounded indefinite containers, I am not sure the
shared pool object would be used for much, but that seems fine.

> Remember one of the purposes of these guys is to be used in environments
> where dynamic allocation isn't allowed. Ideally, these allocators would map
> directly into an array of bytes.

In many of these environments, they allow dynamic allocation within an overall
bounded pool during library unit elaboration, and then things are locked down.
Or they allow allocation within a bounded pool, but no deallocation.

...
>>I didn't understand the problem you were trying to solve.  I
>>presumed the user would specify the size of the bounded
>>indefinite container in storage units, not in number of
>>elements, since the size of the elements would typically
>>vary, and what they care about is bounding the total space
>>occupied by the container.  E.g.:
>>
>>  type Vector (Space : Storage_Count) is tagged private;
>
>I was following the model of the Bounded_Indefinite_Holder: there is a
>generic parameter which specifies the maximum number of storage units for
>each element.

We had different ideas of how to generalize Bounded_Indefinite_Holder.  I saw
the original limit as a maximum for the holder as a whole, which had a single
element.  The generalization to multiple elements I imagined was to specify a
maximum for the container as a whole, which now might have a variable number of
elements of differing sizes.  That is, I was bounding the total, not the
individual elements.   If the user wanted to bound the individual elements, they
could use the existing bounded (definite) containers, and use  a variant record
with defaults, or wrap each element in a holder.
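
The variant-record workaround would look something like this (types invented);
the defaulted discriminant makes the type definite, at the cost of every slot
reserving space for the largest variant:

   with Ada.Containers.Bounded_Vectors;

   package Shapes is
      type Shape_Kind is (Point, Circle, Box);

      -- The defaulted discriminant makes Shape definite (mutable), so it
      -- can instantiate an ordinary bounded (definite) container:
      type Shape (Kind : Shape_Kind := Point) is record
         case Kind is
            when Point  => null;
            when Circle => Radius : Float;
            when Box    => Width, Height : Float;
         end case;
      end record;

      package Shape_Vectors is new Ada.Containers.Bounded_Vectors
        (Index_Type => Positive, Element_Type => Shape);
   end Shapes;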

>And to minimize the differences with the regular bounded
>containers, a Capacity discriminant. The above would seem to maximize the
>differences without any added benefit.

I don't agree.  By bounding each individual element, you are introducing an
enormous potential for "internal" fragmentation, just to avoid external
fragmentation (where "internal" fragmentation means waste due to rounding-up the
size of individual elements, while external fragmentation means waste due to
deallocations that leave holes that can't be re-used -- see
https://en.wikipedia.org/wiki/Fragmentation_(computing)).

For example, if I created a "string pool" which stored strings in a
bounded-indefinite-vector pool, and then used the indices into the vector as a
short index for each string, I certainly wouldn't want to bound each individual
string.

>For each container, the model was that the underlying data was an array of
>fixed size buckets, each the size specified by the generic parameter. This
>model eliminates any possibility of fragmentation

(eliminates *external* fragmentation in favor of potentially huge *internal*
fragmentation -- see ref above);

>; the reason for using a
>pool at all is to get the effect of placement new. Thus, the pool is just
>divvying up an array. If you used "normal" allocation, then you have all of
>the possibilities for fragmentation that scare these safety-critical folks.

Fragmentation is never pleasant, but it is not clear that huge internal
fragmentation is better than a potential for external fragmentation, especially
in cases where deallocation is relatively rare compared to allocation.
Certainly the most important thing is to have an upper bound on the total space
devoted to a given data structure.  Given the ability to create a "definite"
type either by using defaults for the discriminants or wrapping each element in
a holder, I don't see the need to force internal fragmentation on the user of
bounded *indefinite* containers.

What I see the user wanting is the equivalent of a storage-sized access
collection, but where you can have a separate "collection" for each container
object.  Ada's access-type model requires a separate access *type* for each such
collection, but with containers, you want a separate "collection" for each
container *object*.  And this is where subpools help, but not if you force the
individual elements to each occupy the same amount of space.

>So the allocation size is the max size of one element times the capacity
>(adjusted as needed for alignment).

As indicated above, we were working from different assumptions.  I suggest there
is no point in proceeding further until we agree on the fundamental model --  a
single bound for the total size vs. two bounds, one for the total size and one
for each element.  To me, *indefinite* containers implies that individual
elements will occupy different amounts of space.  The only "bound" for such a
container refers to the total amount of space the container occupies.

***************************************************************

From: Randy Brukardt
Sent: Tuesday, March 13, 2018  4:29 PM

...
>>Remember one of the purposes of these guys is to be used in
>>environments where dynamic allocation isn't allowed. Ideally, these
>>allocators would map
>>directly into an array of bytes.

>In many of these environments, they allow dynamic allocation within an overall
>bounded pool during library unit elaboration, and then things are locked down.
>Or they allow allocation within a bounded pool, but no deallocation.

Sure, but then one doesn't need a special container. If you have a way to verify
that no deletes are being used, any container will work for that. Even better
would be a container kind that takes a storage pool (then that storage pool just
doesn't implement any deallocation).

>> I was following the model of the Bounded_Indefinite_Holder: there is
>> a generic parameter which specifies the maximum number of storage
>> units for each element.

>We had different ideas of how to generalize Bounded_Indefinite_Holder.
>I saw the original limit as a maximum for the holder as a whole, which
>had a single element.  The generalization to multiple elements I
>imagined was to specify a maximum for the container as a whole, which
>now might have a variable number of elements of differing sizes.  That
>is, I was bounding the total, not the individual elements.

That requires a totally general storage pool (supporting arbitrary allocates and
deletes in any order). And then there is literally nothing bounded about the
container other than the use of this pool -- all of the nodes and other internal
data structures also have to be allocated from this object.

For users like the original proposer, all of this complication and uncertainty
would be unacceptable. They could simplify their applications by avoiding
deletes and replaces (but the original proposer wanted to specifically do such
things), or more likely they'd just ban the use of these containers.

>If the user wanted to bound the individual elements, they could use the
>existing bounded (definite) containers, and use  a variant record with
>defaults, or wrap each element in a holder.

All of those would force initialization of all of the elements when they are
created (since a holder object requires at a minimum nulling its internal
pointer). Leaving the secondary problem unsolved.

>>And to minimize the differences with the regular bounded containers, a
>>Capacity discriminant. The above would seem to maximize the
>>differences without any added benefit.

>I don't agree.  By bounding each individual element, you are
>introducing an enormous potential for "internal" fragmentation, just to
>avoid external fragmentation (where "internal" fragmentation means
>waste due to rounding-up
>the size of individual elements, while external fragmentation means
>waste due to deallocations that leave holes that can't be re-used --
>see https://en.wikipedia.org/wiki/Fragmentation_(computing)).

That's a massive abuse of the term "fragmentation". Fragmentation ("external" in
this article "with multiple issues") is bad because it is unpredictable. Waste
is bad in its own way, but one thing it is not is unpredictable. The only way to
get rid of fragmentation is to increase waste, so it is a trade-off. If one uses
the term fragmentation for both things, then it is impossible to get rid of
fragmentation. Obviously, whoever came up with this terminology never dealt
with safety-critical systems and their programmers!

Note that the references in this article are "the PC guide" and some Symantec
document -- those are hardly reliable references to a CS term -- I'd believe it
more if it were "Knuth" or some Stanford documents. I suspect hardly anyone has
heard of "internal fragmentation" unless they've read this article.

>For example, if I created a "string pool" which stored strings in a
>bounded-indefinite-vector pool, and then used the indices into the
>vector as a short index for each string, I certainly wouldn't want to
>bound each individual string.

You also wouldn't want to use this sort of container, because you'd end up
wasting substantial space storing bounds that are trivial to reconstruct. You
might remember that Matt had proposed special string-keyed maps for this very
reason, but those got dumped early on. (The only "container" that I use
regularly is a string-to-index map - there is no counterpart in Ada.Containers.)

>>For each container, the model was that the underlying data was an
>>array of fixed size buckets, each the size specified by the generic
>>parameter. This model eliminates any possibility of fragmentation

>(eliminates *external* fragmentation in favor of potentially huge *internal*
>fragmentation -- see ref above);

Eliminates the unpredictability of fragmentation in favor of the predictability
of memory usage! Waste is the only way to do that.

>>; the reason for using a
>>pool at all is to get the effect of placement new. Thus, the pool is
>>just divvying up an array. If you used "normal" allocation, then you
>>have all of the possibilities for fragmentation that scare these
>>safety-critical folks.

>Fragmentation is never pleasant, but it is not clear that huge internal
>fragmentation

...please call this "waste". It has none of the characteristics of
fragmentation!

>...is better than a potential for external fragmentation, especially in cases
>where deallocation is relatively rare compared to allocation. Certainly the
>most important thing is to have an upper bound on the total space devoted to a
>given data structure.  Given the ability to create a "definite" type either by
>using defaults for the discriminants or wrapping each element in a holder, I
>don't see the need to force internal fragmentation on the user of bounded
>*indefinite* containers.
>
>What I see the user wants is the equivalent of a storage-sized access
>collection, but where you can have a separate "collection" for each container
> object.

That's definitely not what the original proposer was looking for; recall that he
wanted "bounded" class-wide components. Those necessarily have some waste
per-element.

>Ada's access-type model requires a separate access *type* for each such
>collection, but with containers, you want a separate "collection" for
>each container *object*.  And this is where subpools help, but not if
>you force the individual elements to each occupy the same amount of space.

Don't see the "but" here; the same is true regardless of what type of pool one
uses. (Wanting separate "collections" seems orthogonal to how allocations are
done in each collection.)

>>So the allocation size is the max size of one element times the
>>capacity (adjusted as needed for alignment).

>As indicated above, we were working from different assumptions.  I
>suggest there is no point in proceeding further until we agree on the
>fundamental model --  a single bound for the total size vs. two bounds,
>one for the total size and one for each element.  To me, *indefinite*
>containers implies that individual elements will occupy different amounts of
>space. >The only "bound" for such a container refers to the total amount of
>space the container occupies.

I was going to suggest that we instead focus on a bounded container (bounded in
number of elements) that passes in a storage pool for allocating to indefinite
components. Then the user can do whatever they like in that pool.

But that doesn't completely work, because the pool needs to be able to find out
the capacity of the actual container object. Probably would have to pass in the
pool *type*, as a type derived from some special abstract pool with appropriate
discriminants.
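
Purely to make the shape of that concrete (every name below is invented; this
is not a worked-out proposal):

   with System.Storage_Elements; use System.Storage_Elements;
   with System.Storage_Pools;    use System.Storage_Pools;
   with Ada.Containers;          use Ada.Containers;

   package Container_Pools is
      -- The "special abstract pool with appropriate discriminants":
      type Container_Pool (Capacity : Count_Type;
                           Max_Element_Size : Storage_Count) is
        abstract new Root_Storage_Pool with null record;
   end Container_Pools;

   with Container_Pools;          use Container_Pools;
   with Ada.Containers;           use Ada.Containers;
   with System.Storage_Elements;  use System.Storage_Elements;

   generic
      type Element_Type (<>) is private;
      Max_Element_Size : Storage_Count;
      type Pool_Type is new Container_Pool with private;
      -- the user picks an allocate-only, "blocked", or fully general pool
   package Bounded_Indefinite_Vectors_Sketch is
      type Vector (Capacity : Count_Type) is tagged limited private;
      -- Insert, Element, etc. omitted from the sketch.
   private
      type Vector (Capacity : Count_Type) is tagged limited record
         -- Legal per 3.8(12): each discriminant appears alone:
         My_Pool : Pool_Type (Capacity, Max_Element_Size);
      end record;
   end Bounded_Indefinite_Vectors_Sketch;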

I can propose something on this line tomorrow if you want to see it fleshed out
a bit more. Otherwise, I think we have to drop this idea (I don't think we could
- or necessarily should try to - agree on how memory should be allocated).

And there is a much bigger problem. You finally got it through my fat head why
you have to use subpools here - the issue being that you can only have one pool
per access type. For some reason, I was thinking of a pool object per container
and thus essentially an access type per container. But that doesn't seem to be a
possibility in Ada (or is it? see below).

But the use of subpools is problematical, as it prevents the
Bounded_Indefinite_xxx containers from being Pure. There has to be a global pool
object for the (global) access type to use, one that manages the subpools. That
violates the Pure restrictions. Worse, I don't see any way to implement that
object for an unbounded number of holders (or any other container) without
either using non-trivial finalization or dynamic allocation. Which both are
out-of-bounds for such containers.

So I don't think a subpool implementation would fly. It has to be one pool
object per container, no strings attached. That would require a headstanding
implementation using local access types (declared in Insert and Delete),
'Unchecked_Access (to strip the accessibility info), and type conversions to the
more global type. Yuck (but less yucky than a subpool implementation that
violates the basic requirements).
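
A minimal sketch of that headstanding version (names invented; the
per-container pool is reached through a reference here just to keep the
sketch short):

   with System.Storage_Pools; use System.Storage_Pools;

   package Headstand is
      type Element is record
         Value : Integer := 0;
      end record;
      type Element_Ptr is access all Element; -- the "more global" type

      type Pool_Ref is access all Root_Storage_Pool'Class;

      type Container is limited record
         My_Pool : Pool_Ref; -- stands in for the embedded per-container pool
         Slot    : Element_Ptr;
      end record;

      procedure Insert (C : in out Container);
   end Headstand;

   package body Headstand is
      procedure Insert (C : in out Container) is
         -- A local access type whose pool is *this* container's pool:
         type Local_Ptr is access Element
           with Storage_Pool => C.My_Pool.all;
         P : constant Local_Ptr := new Element;
         -- The element is default-initialized just above, at insertion.
      begin
         -- 'Unchecked_Access strips the local accessibility level so the
         -- value can be stored in the library-level Element_Ptr:
         C.Slot := P.all'Unchecked_Access;
      end Insert;
   end Headstand;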

***************************************************************

From: Tucker Taft
Sent: Tuesday, March 13, 2018  5:02 PM

Not sure this discussion makes sense since we have different models in mind, but
I'll try to clarify my views one last time...

>> We had different ideas of how to generalize Bounded_Indefinite_Holder.  I
>> saw the original limit as a maximum for the holder as a whole, which had a
>> single element.  The generalization to multiple elements I imagined was to
>> specify a maximum for the container as a whole, which now might have a
>> variable number of elements of differing sizes.  That is, I was bounding
>> the total, not the individual elements.
>
> That requires a totally general storage pool (supporting arbitrary allocates
> and deletes in any order). And then there is literally nothing bounded about
> the container other than the use of this pool -- all of the nodes and other
> internal data structures also have to be allocated from this object.

I was always presuming the key thing to bound was the total amount of storage
used, and the implementation *I* envisaged for a storage subpool encapsulated
all of the storage needed to maintain free lists, etc.  If your storage subpool
doesn't work that way, then it wouldn't be the right thing to use for
implementing the bounded indefinite containers.  We don't need to argue about
how to implement storage subpools.  I was simply indicating that *if* you
implemented storage subpools the way *I* imagined, then bounded indefinite
containers could piggy back on that implementation.  If you implement them some
different way, then you can't.

>For users like the original proposer, all of this complication and
>uncertainty would be unacceptable. They could simplify their applications by
>avoiding deletes and replaces (but the original proposer wanted to
>specifically do such things), or more likely they'd just ban the use of
>these containers.

We certainly don't need to agree whether storage subpools are relevant here.  We
do need to agree on what exactly bounded indefinite containers bound.  My
proposal is that they bound the total amount of storage used, but don't force
each element to use the same amount of space.  Let's try to get some consensus
on that issue first, if we can.  If we can't, then I don't see any point in
further discussion.

>>If the user wanted to bound the individual elements, they could use the
>>existing bounded (definite) containers, and use  a variant record with
>>defaults, or wrap each element in a holder.
>
>All of those would force initialization of all of the elements when they are
>created (since a holder object requires at a mimimum nulling its internal
>pointer). Leaving the secondary problem unsolved.

The model I had in mind is still not clear, I guess.  I was presuming that you
just have "raw" space until an element is added to the container.  When you add
the element, you do the equivalent of a "new," and if you were to use my
envisaged model of storage subpools, then you could actually use the Ada 2012
subpool allocator.  If not, then you will need some other magic, which might
still be based on the subpool allocator feature (to get the finalization scope
right), but might be a very special implementation of subpools not useful for
anything else.  In any case, the key point is that you are converting "raw"
space into an object, and at that point the initialization would take place.
Similarly, when you remove an element from the container, you would use
something very close to Unchecked_Deallocation, to do the finalization.  And
when the container as a whole went away, you would do something analogous to
finalizing the whole storage subpool, which would finalize the elements that
still exist in the container.  Alternatively, you could iterate through the
container and individually finalize the remaining elements.


>>By bounding each individual element, you are introducing
>>an enormous potential for "internal" fragmentation, just to avoid external
>>fragmentation (where "internal" fragmentation means waste due to rounding-up
>>the size of individual elements, while external fragmentation means waste
>>due to deallocations that leave holes that can't be re-used -- see
>>https://en.wikipedia.org/wiki/Fragmentation_(computing)).
>
>That's a massive abuse of the term "fragmentation". ...

Well that matches the definition of fragmentation that at least I am familiar
with, and appears in many books and various online courses.  So I would suggest
you use the term "external fragmentation" if that is the only kind of
fragmentation you are interested in.

...
>Note that the references in this article are "the PC guide" and some
>Symantec document -- those are hardly reliable references to a CS term -- I'd
>believe it more if it were "Knuth" or some Stanford documents. I suspect
>hardly anyone has heard of "internal fragmentation" unless they've read this
>article.

For what it is worth, see also:

  http://www.cis.upenn.edu/~lee/03cse380/exam3-v8.pdf  (first question in the exam ;-)

  https://courses.cs.washington.edu/courses/cse451/00sp/misc/quiz2sol.txt (also first exam question)

  https://www.quora.com/What-are-the-differences-between-internal-fragmentation-and-external-fragmentation

  https://techdifferences.com/difference-between-internal-and-external-fragmentation.html

etc, etc.

A google search for "internal fragmentation vs. external fragmentation" will
provide you with approximately 131,000 results... ;-)

***************************************************************

From: Randy Brukardt
Sent: Tuesday, March 13, 2018  5:44 PM

>Not sure this discussion makes sense since we have different models in
>mind, but I'll try to clarify my views one last time...

I don't think it really matters. If we need to support your view and my view
(and I can see arguments for that), then we have to allow passing in a storage
pool so the user can deal with it. Otherwise, there is no point in continuing,
because what you are proposing wouldn't be useable by the "original proposer",
which makes the utility dubious to me. They are looking for absolute
predictability, and definitely don't want to give up Replace and Delete
operations to get it (reading between the lines).

>>>We had different ideas of how to generalize
>>>Bounded_Indefinite_Holder.  I saw the original limit as a maximum for
>>>the holder as a whole, which had a
>>>single element.  The generalization to multiple elements I imagined was to
>>>specify a maximum for the container as a whole, which now might have
>>>a variable number of elements of differing sizes.  That is, I was
>>>bounding the total, not the individual elements.

>>That requires a totally general storage pool (supporting arbitrary allocates
>>and deletes in any order). And then there is literally nothing bounded about
>>the container other than the use of this pool -- all of the nodes and other
>>internal data structures also have to be allocated from this object.

>I was always presuming the key thing to bound was the total amount of
>storage used, ...

Not for the "original proposer". In his environment, he was using machine-code
level verification, so the complexity of the implementation also comes into
play. He was looking for arbitrary operations on class-wide types, thus the
holder definition that we came up with.

The regular bounded containers are designed to meet similar requirements (the
bounded forms by their definition waste vast amounts of memory; you use them
because you value predictability). I've been assuming that those requirements
carry over to any other "bounded" forms.

>and the implementation *I* envisaged for a storage subpool encapsulated
>all of the storage needed to maintain free lists, etc.  If your storage
>subpool doesn't work that way, then it wouldn't be the right thing to
>use for implementing the bounded indefinite containers.  We don't need
>to argue about
>how to implement storage subpools.  I was simply indicating that *if*
>you implemented storage subpools the way *I* imagined, then bounded
>indefinite containers could piggy back on that implementation.  If you
>implement them some different way, then you can't.

Umm, there is no standard definition of a subpool; the language doesn't define
one. *I* personally don't have any such implementations as I've never had a good
use for them. In any case, any subpool implementation used here requires a
global pool object -- which is incompatible with a Pure package. Ergo, you can't
use a subpool to implement these containers, even if it seems reasonable to do
so.

Anyway, this implementation would not meet any of the requirements of the other
kinds of "bounded" containers: it couldn't be Pure; it would need either
finalization or heap allocation to track the existing subpools; and it wouldn't
eliminate "external" fragmentation if Replace/Delete is used. Sounds like it
wouldn't meet the usual safety-critical requirements, and it would barely differ
from the usual unbounded containers.

>We certainly don't need to agree whether storage subpools are relevant here.
>We do need to agree on what exactly bounded indefinite containers bound.
>My proposal is that they bound the total amount of storage used, but
>don't force each element to use the same amount of space.  Let's try to
>get some consensus on that issue first, if we can.  If we can't, then I
>don't see any point in further discussion.

I think that is quite the wrong question at this point. We should leave that
question up to the user, because I can see at least two use-cases that still
meet the safety-critical restrictions:
  (1) An allocate-only pool used with container objects that don't
      Replace/Delete elements -- bounding just the memory is fine.
  (2) A "blocked" pool that supports full functionality -- again, no
      non-determinism, so we're fine. One has to bound two of: the total
      memory, the container capacity, and the maximum element size.
There's a third use case for the rest of us:
  (3) A full pool implementation, where we just want deterministic memory
      usage and deterministic (and lazy) initialization.

That's three separate storage pool implementations, and it's pretty clear to me
that one size does not fit all here. We're not going to have three kinds of
containers in the Standard, so the only question is whether we can design one
that actually allows all three use cases by letting the user choose their
storage pool.

If not, we should drop the subject. Do you want me to try, or forget it??
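
For what it's worth, use case (1) can be as simple as a "bump" pool over a
fixed array (a sketch; all names invented):

   with System;                  use System;
   with System.Storage_Elements; use System.Storage_Elements;
   with System.Storage_Pools;    use System.Storage_Pools;

   package Bump_Pools is
      type Bump_Pool (Size : Storage_Count) is
        new Root_Storage_Pool with record
         Data : aliased Storage_Array (1 .. Size);
         Next : Storage_Offset := 1; -- high-water mark
      end record;

      overriding procedure Allocate
        (Pool : in out Bump_Pool;
         Storage_Address : out Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment : Storage_Count);

      -- Allocate-only: deallocation is simply not supported.
      overriding procedure Deallocate
        (Pool : in out Bump_Pool;
         Storage_Address : Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment : Storage_Count) is null;

      overriding function Storage_Size (Pool : Bump_Pool)
        return Storage_Count is (Pool.Size);
   end Bump_Pools;

   package body Bump_Pools is
      overriding procedure Allocate
        (Pool : in out Bump_Pool;
         Storage_Address : out Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment : Storage_Count)
      is
         Base    : constant Integer_Address :=
           To_Integer (Pool.Data (1)'Address) + Integer_Address (Pool.Next - 1);
         -- Round up to the requested alignment:
         Aligned : constant Integer_Address :=
           ((Base + Integer_Address (Alignment) - 1)
              / Integer_Address (Alignment)) * Integer_Address (Alignment);
         Pad     : constant Storage_Offset := Storage_Offset (Aligned - Base);
      begin
         if Pool.Next + Pad + Size_In_Storage_Elements > Pool.Size + 1 then
            raise Storage_Error; -- pool exhausted: the bound is enforced
         end if;
         Storage_Address := To_Address (Aligned);
         Pool.Next := Pool.Next + Pad + Size_In_Storage_Elements;
      end Allocate;
   end Bump_Pools;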

>>>If the user wanted to bound the individual elements, they could use
>>>the existing bounded (definite) containers, and use  a variant record
>>>with defaults, or wrap each element in a holder.

>>All of those would force initialization of all of the elements when they are
>>created (since a holder object requires at a minimum nulling its
>>internal pointer). Leaving the secondary problem unsolved.

>The model I had in mind is still not clear, I guess.  I was presuming
>that you just have "raw" space until an element is added to the container. ...

Huh? You said: "the user would use the existing bounded (definite)
containers..." which have the initialization behavior I cited. I was responding
specifically to the suggestion that a user that wanted to bound individual
elements should just use holders or variants in an existing definite container.

That, after all, is indeed the current state of the Standard, and we would have
never started this discussion if that was an adequate solution! You essentially
are saying that you can have one of memory certainty or sane initialization, but
not both. Sounds weird to me.

>>That's a massive abuse of the term "fragmentation". ...

>Well that matches the definition of fragmentation that at least I am
>familiar with, and appears in many books and various online courses.
>So I would suggest you use the term "external fragmentation" if that is
>the only kind of fragmentation you are interested in.

The reason I hate this is that it implies that fragmentation is a problem that
cannot be solved -- since the only way to get rid of "external fragmentation" is
to increase waste (padding bits, fixed size allocates, etc.). But the issue of
concern is predictability, and these supposedly similar views have very
different characteristics for that issue.

In any case, you felt it necessary to explain "internal fragmentation" in your
original post, so you must not have intuitively thought that the term was very
obvious. QED. :-)

'nuff on that.

***************************************************************

