Version 1.1 of acs/ac-00259.txt

Unformatted version of acs/ac-00259.txt version 1.1
Other versions for file acs/ac-00259.txt

!standard A.4.4(1)          14-05-09 AC95-00259/00
!class Amendment 13-05-09
!status received no action 14-05-09
!status received 14-02-28
!subject bounded containers as bounded lists for low memory usage
!summary
!appendix

!topic        bounded containers as lists
!reference Ada 2012 RM A.4.4(402) implementation advice
!from        Guerra 14-02-28
!keywords  bounded containers lists
!discussion

Working with geometries, for an air traffic control application. These
geometries must be superimposed over terrain maps as collections of points with
two coordinates, latitude and longitude.

   These coordinates must be _bounded strings_, for the client chooses whether
seconds will be used or not and longitudes are one character longer than
latitudes. This is not the problem, as it will be seen. The problem being the
possible high difference in the number of points between geometries.

   Most geometries are simple enough and will do with a few points (usually
under 7 and more often than not, just 4) or data (like rectangles and circles);
but, sometimes, in order to recreate a border-line or coast-line, the number of
points can rocket to more than sixty; thus, the limit to the number of points
lies in the nineties. It might be that just a couple of these `intensive point'
geometries arise per load of information, but they have to be kept in mind.

   Typical figures lie around 100 geometries per load of information and
reserving, as arrays -like bounded strings are adviced to be implemented-, over
90 points for each geometry makes the number of arrays to rocket to over 9
thousand per load, with the expected sequel of Storage_Error.

   The real number of arrays needed being in the hundreds, more often than not,
below 1 thousand.

   The available solution, up to now, being to use _unbounded strings_ for the
coordinates to prevent memory wasting in the form of reserving arrays that will
never be used; thus, using unbounded lists of characters for coordinates, which
are naturally bounded to 8 characters (DDDMMSSH in the case of a longitude: DDD
degrees, MM minutes, SS seconds and H hemisphere).

   Thus, up to now and in situations like this one, not only the nicety of
working with arrays is lost, but also the possibility of bounding the number of
characters for the coordinates.

   Summarizing: bounded strings as bounded lists would allow loosing just one of
the niceties of bounded strings in situations like this: just the vector
implementation, but keeping the bounding.

Backward compatibility with legal code would promote something like
    MAX_COORD_LENGTH : constant := 8;
    package P_Coordinate is new
      Ada.Strings.Bounded.Generic_Bounded_Length
      (Bounding_Type => Slim,
                       Max => MAX_COORD_LENGTH);

with Bounding_Type defaulting to Fast after being defined, near Max_Length as
    type Bounding_Type is (Fast, Slim);  -- Fast -> bounded array
                                         -- Slim -> bounded list

Or, may be, with an aspect to prevent the ilegal default discriminant value (for
a private type) for backward compatibility with legal code and because, in fact,
this is a compiling directive:
    type Bounded_String is private
      with Bounding => Default_Bounding;  -- Default_Bounding being Fast

Thus, it would be possible to declare:
    package P_Coordinate is new
      Ada.Strings.Bounded.Generic_Bounded_Length(Max => MAX_COORD_LENGTH)
      with Bounding => Slim;

The vector implementation nicety would be lost (it is anyway with
Unbounded_String) but the boundary could be enforced by both: the compiler and
the run-time environment.

   Thanks a lot for your time and best regards, M. Guerra

***************************************************************

From: Randy Brukardt
Sent: Wednesday, March 5, 2014  1:39 PM

...
>    Summarizing: bounded strings as bounded lists would allow loosing
> just one of the niceties of bounded strings in situations like this:
> just the vector implementation, but keeping the bounding.

Could you explain why using the bounded list container (A.18.20) or the bounded
vector container (A.18.19) does not give you what you want? These are new to Ada
2012 and are intended to provide memory certainty for container operations.

If you don't need memory certainty (read as no heap operations), then using the
unbounded containers is best. Bounds are always unnatural (there always will be
a case where they're insufficient), so if you can tolerate heap allocation, then
not having bounds is better.

Of course, there always will be specialized needs that can't be handled with the
predefined containers. It doesn't make sense for the Standard to include
containers that only a handful of users will ever need. Applications on the edge
of performance (in speed or memory usage or any other quality) will be better
served with custom containers tailored for that purpose.

***************************************************************

From: Guerra Garcˇa, Mario Daniel
Sent: Thursday, March 6, 2014  4:30 AM

>Could you explain why using the bounded list container (A.18.20) or the
>bounded vector container (A.18.19) does not give you what you want?
>These are new to Ada 2012 and are intended to provide memory certainty
>for container operations.

I'm sure it can be done, same as it would be possible using Character arrays and
lists instead of having the predefined String, Bounded_String and
Unbounded_String types.

I just thought it would fit nicely in the family of String types and would
prevent the weird sensation, for the programmer, of getting Storage_Error when
using a Bounded_String type and fixing it by using Unbounded_String  !!
...

>Bounds are always unnatural (there always will be a case where they're
>insufficient), so if you can tolerate heap allocation, then not having bounds
>is better.

I don't agree: we use finite machines with a finite amount of memory and a
finite clock rate. These limits will always underlay whatever we do. There are
always limits to what can be done and limits inherent to our problem domain,
from which the whole theory of subtypes is derived.

>Of course, there always will be specialized needs that can't be handled
>with the predefined containers...

I think air traffic control, as any other critical environment, is a home domain
for Ada; nonetheless, in my experience, the `almost always' non-linearity is
common enough throughout different domains. Being, in this case, that most
geometries have just a few points, but a couple of them might have over 50.

Anyway, if it is not considered interesting enough having a slim implementation
for the Bounded_Srting type, just drop the idea; but I really think it's weird
and ugly getting Storage_Error with the Bounded type and having to fix it by
using the Unbounded type. Counterintuitive, at least.

   Thanks a lot anyway and best regards, M. Guerra

P.S. I'm not sure where to my reply should be addressed; I'm doing the same as
you: send it to ada-comment@ada-auth.org with a copy for you. I'm terribly sorry
if you get it twice.

***************************************************************

From: Randy Brukardt
Sent: Thursday, March 6, 2014  10:17 PM

Thanks for the quick reply.

> P.S. I'm not sure where to my reply should be addressed; I'm doing the
> same as you: send it to ada-comment@ada-auth.org with a copy for you.
> I'm terribly sorry if you get it twice.

I did get it twice, but I'm used to that. It's important that replies get
addressed to the list so that they get included in the record of discussion on a
topic. Obviously, that doesn't happen for a private mail. I cced you explicitly
as well as replying to the list in case you haven't joined the list (anyone can
join) so that you'd see it. Generally, it's preferable to just send to the list,
but not at the expense of preventing the poster from seeing a reply.

...
> I don't agree: we use finite machines with a finite amount of memory
> and a finite clock rate. These limits will always underlay whatever we
> do.
> There are always limits to what can be done and limits inherent to our
> problem domain, from which the whole theory of subtypes is derived.

Sure, but I don't understand why you then want to exceed the limits -- they're
hardly limits at all in that case (just like the speed "limit" on the highway
:-).

After all, typical implementations of unbounded string use the minimum memory
possible to store the contents. OTOH, a bounded string uses a fixed, maximum
amount of memory for each string. If you're concerned about using the least
memory, you're better off using unbounded strings because you don't have the
waste of the fixed allocation in that case.

OTOH, if you are writing an application where memory certainty is required (like
flight software for an airplane), where a memory allocation failure could be
fatal (literally!), then a bounded string brings that sort of certainty.

It sounds like you're looking for the worst of both worlds (using more memory
than required, but without the guarantee that an allocation won't fail). That
has to be a rather specialized need.

> >Of course, there always will be specialized needs that can't be
> >handled with the predefined containers...
>
> I think air traffic control, as any other critical environment, is a
> home domain for Ada; nonetheless, in my experience, the `almost
> always' non-linearity is common enough throughout different domains.
> Being, in this case, that most geometries have just a few points, but
> a couple of them might have over 50.

Right, but that seems like its well suited to the typical unbounded string
implementations. I don't understand what you expect to gain from using a hybrid
approach. Note that the implementation of such a hybrid will have all of the
overhead of an unbounded string (because it *might* need to clean up some
allocated memory when the objects are destroyed) AND the space usage of a
bounded string. It might make sense to have an unsafe package with this behavior
(as it wouldn't need the clean-up overhead), but it's unlikely that we'd want
such a package in the Ada Standard. In any case, it's easy enough to write such
a package (since there is nothing special about bounded strings, it's normal Ada
specification). Perhaps it would help your proposal if you showed a possible
implementation for what you are proposing and especially why it would be better
(for some definition of better) than a normal unbounded string.

> Anyway, if it is not considered interesting enough having a slim
> implementation for the Bounded_Srting type, just drop the idea; but I
> really think it's weird and ugly getting Storage_Error with the
> Bounded type and having to fix it by using the Unbounded type.
> Counterintuitive, at least.

I'm not sure what you find weird about getting an exception when you exceed a
limit that your program asked for. If you don't want a limit, don't ask for one!

Anyway, pretty much all Amendment proposals that come here will be considered by
the ARG; I need to understand what is being asked for well-enough that I can
write a simple problem statement for that consideration. It's not just up to me
(we do do some triage of questions that have obvious answers or make no sense,
but we try to use a light hand on that). Your proposal will be considered
(unless you withdraw it) and I have no idea what will happen then (perhaps it
makes more sense to others than it does to me).

***************************************************************

From: Guerra Garcˇa, Mario Daniel
Sent: Friday, March 7, 2014  4:31 AM

>Sure, but I don't understand why you then want to exceed the limits --
>...they're hardly limits at all in that case (just like the speed
>"limit" on

I actually don't want to: it just happens when reserving big vectors for all
geometries, when most will use just a few positions in those vectors.

>After all, typical implementations of unbounded string use the minimum
>memory possible to store the contents. OTOH, a bounded string uses a
>fixed, maximum amount of memory for each string. If you're concerned
>about using the least memory, you're better off using unbounded strings
>because you don't have the waste of the fixed allocation in that case.

I'm really concerned with not getting Storage_Error; but in the part of the real
world I'm modelling, coordinates are bounded strings and I'd like to faithfully
keep that bounding in my software model of the real world.

>OTOH, if you are writing an application where memory certainty is
>required (like flight software for an airplane), where a memory
>allocation failure could be fatal (literally!), then a bounded string
>brings that sort of certainty.

A bounded string as a bounded list will also prevent going beyond the boundary,
but without the waste of, by default, allocating the maximum possible memory;
(1) even if it's never going to be used in most cases, as the vector
implementation does. (1) Is this true? This could be my mistake, as pointed out
below: if fixing the boundary gets the compromise by the compiler that it can
always be reached and memory is allocated to permit so whether vector or list,
then you can skip the discussion and I withdraw the attempted Ada issue.

>It sounds like you're looking for the worst of both worlds (using more
>memory than required, but without the guarantee that an allocation
>won't fail). That has to be a rather specialized need.

I really didn't make myself clear. I'm seeking the best of both worlds: having
the bounding enforced, but without memory waste.

>Right, but that seems like its well suited to the typical unbounded
>string implementations. I don't understand what you expect to gain from
>using a hybrid approach. Note that the implementation of such a hybrid
>will have all of the overhead of an unbounded string (because it
>*might* need to clean up some allocated memory when the objects are
>destroyed) AND the space usage of a bounded string...

(1) Why the space usage of a bounded string? Maybe this is my mistake:
I'm assuming that bounded lists don't reserve memory for all possible nodes at
the _new_ operation, but add them when needed as far as the number of nodes
stays within the boundary, raising Capacity_Error when trying to add an extra
node beyond that boundary; or Storage_Error, if appropriate.

If the assumption in (1) is correct, then it would be possible to have two or
three geometries defined by, say, 50 points while most of them (in over 100) are
defined by 5 points without wasting the allocation for the unneeded 45 points in
all those other cases; if it's _not correct_, then you can skip the rest of the
discussion.

>Perhaps it would help your proposal if you showed a possible implementation...

I'm not sure I can improve the implementation of a bounded list in Ada 2012: I
presume it is a record with a positive constant, the boundary; a natural
counter, the actual length; and a list of nodes with elements of the Item_Type
whose length is forced to stay within the boundary.

An addition or load of characters is possible as long as the number of
characters to be added/loaded <= Boundary

>I'm not sure what you find weird about getting an exception when you
>exceed a limit that your program asked for. If you don't want a limit,
>don't ask for one!

I didn't explain myself clearly. This is not the case: all my strings are within
boundary. I don't get Length_Error or Constraint_Error, but Storage_Error.
Geometries promote this Storage_Error when implemented like vectors of vectors
(Bounded_Strings), but not when implemented as vectors of lists
(Unbounded_Strings), at the cost of losing the boundary.

I got rid of the Storage_Error problem relinquishing the boundary and using
Unbounded_String, losing the real world bounding of the strings that represent
coordinates in my software model.

Maybe a diagram will help express my idea: (the X stands for memory position
reserved for Item_Type -coordinates of points for the particular geometry)
 - Bounded_String as vector
  (boundary enforced, all positions reserved even if not used -> Storage_Error)
  (vector of vectors)
G1: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
G2: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
G3: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
G4: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
G5: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX and so on

- Bounded_String as list
  (boundary enforced, only positions used allocated -> no Storage_Error)
  (vector of lists)
G1: XXXX
G2: XXXX
G2: XXXX
G4: XXXXXXXXXXXXXXXXXXXXX
G5: XXXX and so on

The difference with the Unbounded_String being that, in a Bounded_String as
list, the boundary is still enforced. This is what I seek: boundary enforcement
without memory waste; _if_, and this is a big if, the assumption made in (1) is
correct.

Thanks a lot and best regards, M. Guerra

P.S. I'm sending this just to the list and I've joined it, so you don't have to
cc me.

***************************************************************                                                 - Counter. This Counter being updated when proper adding/removal -that is, within boundaries- is performed.

From: Randy Brukardt
Sent: Friday, March 7, 2014  4:43 PM

...
> >After all, typical implementations of unbounded string use the
> >minimum memory possible to store the contents. OTOH, a bounded string
> >uses a fixed, maximum amount of memory for each string. If you're
> >concerned about using the least memory, you're better off using
> >unbounded strings because you don't have the waste of the fixed
> >allocation in
> that case.
>
> I'm really concerned with not getting Storage_Error; but in the part
> of the real world I'm modelling, coordinates are bounded strings and
> I'd like to faithfully keep that bounding in my software model of the
> real world.

I think I finally see what you're asking about. You want a bounded container (or
string) that's implemented like an unbounded container. I suppose that makes
sense, but it's not what the existing bounded containers are about in Ada.

> >OTOH, if you are writing an application where memory certainty is
> >required (like flight software for an airplane), where a memory
> >allocation failure could be fatal (literally!), then a bounded string
> >brings that sort of certainty.
>
> A bounded string as a bounded list will also prevent going beyond the
> boundary, but without the waste of, by default, allocating the maximum
> possible memory; (1) even if it's never going to be used in most
> cases, as the vector implementation does.
> (1) Is this true? This could be my mistake, as pointed out
> below: if fixing the boundary gets the compromise by the compiler that
> it can always be reached and memory is allocated to permit so whether
> vector or list, then you can skip the discussion and I withdraw the
> attempted Ada issue.

All of the bounded containers and bounded strings in Ada are supposed to be
implemented with fixed allocations. You can see that in the Implementation
Advice for each bounded container:

A.18.19(16/3): "Bounded vector objects should be implemented without implicit
pointers or dynamic allocation."

The demand for these forms of containers came from users that can't allow
dynamic allocation in their applications (typically because of reliability
requirements). Thus the implementations are not allowed to allocate memory
dynamically.

Of course, that's what you want -- you just want the bound to reflect some
real-life constraints of your problem but you don't want to pay a penalty in
space for this bound. Right?

...
> Maybe a diagram will help express my idea: (the X stands for memory
> position reserved for Item_Type -coordinates of points for the
> particular geometry)
>  - Bounded_String as vector
>   (boundary enforced, all positions reserved even if not used
> -> Storage_Error)
>   (vector of vectors)
> G1: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> G2: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> G3: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> G4: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> G5: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX and so on
>
> - Bounded_String as list
>   (boundary enforced, only positions used allocated -> no
> Storage_Error)
>   (vector of lists)
> G1: XXXX
> G2: XXXX
> G2: XXXX
> G4: XXXXXXXXXXXXXXXXXXXXX
> G5: XXXX and so on

Yup, my description above is right.

Given the primary requirement for the bounded forms was no dynamic allocation,
it should be clear that we can't really support your requirement in the standard
bounded forms. (Even if we had an alternative implementation, bounded container
users would be at risk of accidentially [either by their error or an
implementation error] getting the dynamic allocation that they cannot allow.
That would prevent the containers from being used by their target audience,
which isn't going to fly.)

Obviously, it would be easy to add a bound by wrapping an unbounded container or
unbounded string - it would just be tedious as many operations would need
additional checks.

You can also instantiate the unbounded vector to give it a bound, as the index
subtype can be anything you want. So if I declare something like:

    subtype Data is String(1..4); -- A single datum.
    subtype Index is Natural range 1 .. 50;

    package Data_Vector is new Ada.Containers.Vectors (Index, Data);

It is not possible to put more than 50 Data items into a Data_Vector.Vector
object, because that is the maximum number supported by the index subtype. An
attempt to insert a 51st item will raise Constraint_Error. (This is required by
A.18.2(151/3)).

You can even use Ada.Containers.Vectors like a bounded string by instantiating
it with an element type of Character.

The only problem with this is that the implementation of Ada.Containers.Vectors
is probably tailored for larger items than a single character or even a short
fragment as Data in the example above. So it's not clear that the overhead and
pre-allocation would actually make this implementation more space-efficient than
the Bounded_String implementation. (Of course, you could discuss that with your
compiler vendor; the implementation of Ada.Containers is up to them, not up to
the Standard.)

Hope this gives you some new ideas to investigate.

P.S. I copied you on this message because your list join got stuck in the spam
filter and probably hasn't been processed yet. You can reply only to the list. -
RLB

***************************************************************

From: Tucker Taft
Sent: Friday, March 7, 2014  6:42 PM

Conceivably you could use a subtype predicate applied to an unbounded string to
establish an upper bound on the length, while still using the minimum amount of
space for any given string.

E.g.:

    use Ada.Strings.Unbounded;
    subtype Somewhat_Bounded_String is Unbounded_String
      with Dynamic_Predicate =>
        Length (Somewhat_Bounded_String) <= Max_Len;

***************************************************************

From: Jeff Cousins
Sent: Monday, March 10, 2014  11:44 AM

A bit like the example in "Alice in Adaland" in the last Ada User Journal, of
using a string with a dynamic predicate as a cheaper alternative to a bounded
string..

***************************************************************

From: Guerra Garcˇa, Mario Daniel
Sent: Monday, March 10, 2014  4:15 AM

>Conceivably you could use a subtype predicate applied to an unbounded
>string ...
>E.g.:
>    use Ada.Strings.Unbounded;
>    subtype Somewhat_Bounded_String is Unbounded_String
>      with Dynamic_Predicate =>
>        Length (Somewhat_Bounded_String) <= Max_Len;

Really neat. Thanks a lot, Mario

***************************************************************

From: Guerra Garcˇa, Mario Daniel
Sent: Monday, March 10, 2014  4:04 AM

>All of the bounded containers and bounded strings in Ada are supposed
>to be implemented with fixed allocations.
>...
>A.18.19(16/3): "Bounded vector objects should be implemented without
>implicit pointers or dynamic allocation."

>The demand for these forms of containers came from users that can't
>allow dynamic allocation in their applications ...

That is the reason I was applying for an addition to, and not a substitution of
the implementation way.

>I think I finally see what you're asking about. You want a bounded
>container (or string) that's implemented like an unbounded container.
>...
>Of course, that's what you want --you just want the bound to reflect
>some real-life constraints of your problem but you don't want to pay a
>penalty in space for this bound. Right?

I couldn't have worded it any better: I was willing to pay for the boundary
space, but wanted to escape the memory reservation for data that was not going
to be there.

>Given the primary requirement for the bounded forms was no dynamic
>allocation, it should be clear that we can't really support your
>requirement in the standard bounded forms. (Even if we had an
>alternative implementation, bounded container users would be at risk of
>accidentially [either by their error or an implementation error]
>getting the dynamic allocation that they cannot allow. That would
>prevent the containers from being used by their target audience, which
>isn't going to fly.)

I see the point: humans have a special ability to push mistakes through any
warnings that might be in place.

I recognize that the compiler warning `Storage_Error might be raised at
run-time' could prove not to be enough to prevent trouble.

>Obviously, it would be easy to add a bound by wrapping an unbounded
>container or unbounded string -it would just be tedious as many
>operations would need additional checks.

This is why I thought having it neatly integrated with the String `family type'
would be far more convenient for most users: less coding = less opportunities to
make mistakes; I believe the compiler reflects the expertise of hundreds of
people built up along decades.

Thanks a lot for your ideas on the subject. Best regards, Mario

***************************************************************

From: Tucker Taft
Sent: Monday, March 10, 2014  6:06 PM

>> Conceivably you could use a subtype predicate applied to an unbounded
>> string ...
>> E.g.:
>>     use Ada.Strings.Unbounded;
>>     subtype Somewhat_Bounded_String is Unbounded_String
>>       with Dynamic_Predicate =>
>>         Length (Somewhat_Bounded_String) <= Max_Len;
>
> Really neat. Thanks a lot, Mario

I hope it turns out to be useful.

***************************************************************

From: Randy Brukardt
Sent: Friday, May 9, 2014  3:44 PM

In filing this, it seemed I didn't make one point strong enough, so I'm going to
do that now for the record:

...
> >Given the primary requirement for the bounded forms was no dynamic
> >allocation, it should be clear that we can't really support your
> >requirement in the standard bounded forms. (Even if we had an
> >alternative implementation, bounded container users would be at risk
> >of accidentially [either by their error or an implementation error]
> >getting the dynamic allocation that they cannot allow. That would
> >prevent the containers from being used by their target audience,
> >which isn't going to fly.)
>
> I see the point: humans have a special ability to push mistakes
> through any warnings that might be in place.
>
> I recognize that the compiler warning `Storage_Error might be raised
> at run-time' could prove not to be enough to prevent trouble.

It's more than just avoiding mistakes. These sorts of systems often are under
heavy audit requirements. They may simply not be allowed to use any allocators.
In such environments, there is usually an auditor, be it a program (perhaps just
a simple use of pragma Restrictions(No_Allocators) -- or maybe something fancier
like AdaControl) or a human. The auditor is not going to allow any allocators to
be in the source code of the program, simply because trying to prove that such
an allocator is never executed is harder and more likely to be wrong than just
not allowing it in the first place. (In your example, the auditor would have
prove that there are no mistakes in the implementation of the bounded container
that would accidentially invoke the "slim" allocator version. That's hard and
expensive; it's unlikely anyone would do that when just using something that
doesn't contain any allocators in the first place would suffice.)

For instance, pragma Restrictions(No_Allocators) applies *everywhere*, including
in the runtime and language-defined packages. In such an environment, your
proposed change would mean that bounded strings and containers could not be used
(as there would be an allocator for the optional "slim" implementation, even
though it wouldn't be executed). But as this is precisely the kind of
environment for which these containers are intended. We can't make them unusable
for their intended purpose, thus requiring the introduction of an allocator into
the implementation of a bounded string or container could not be tolerated.

***************************************************************

Questions? Ask the ACAA Technical Agent