!standard A.18.32(0)          18-03-06 AI12-0254-1/03
!class Amendment 18-01-26
!status Amendment 1-2012 18-03-05
!status WG9 Approved 18-06-22
!status ARG Approved 7-1-4 18-03-05
!status work item 18-01-26
!status received 18-01-15
!priority Low
!difficulty Easy
!subject Bounded_Indefinite_Holders
!summary
The Bounded_Indefinite_Holder container is added to allow the use of class-wide objects in safety-critical environments where dynamic allocation is not allowed.
!problem
In high-integrity, embedded applications, it is usually necessary to prove that the program will not run into stack overflow or heap exhaustion. While static-analysis tools exist for computing safe bounds on "primary" stack usage, that is not the case for heap usage nor for the usage of the "secondary" stack that some Ada compilers use, in place of heap, for local data of dynamic size.
On the other hand, it is desirable, and becoming accepted, to use tagged types and dynamic dispatching even in high-integrity applications. Static analysis tools for stack usage can usually deal with the dynamic dispatching (even if it introduces some overestimation), but a harder problem is that dynamic dispatching requires class-wide subtypes, which are currently considered indefinite subtypes by RM 3.3(23/3) and for which the compiler cannot know the size, at compile time, from local analysis of the program. The amount of memory consumed by an object of a class-wide subtype is therefore determined only at run time, in practice by using the heap or a secondary stack (or some way of extending the local stack frame of the primary stack, which is equally difficult for static stack-usage analysis).
Moreover, the indefiniteness of class-wide types means that data structures (arrays, records, etc.) with components of class-wide type can be created only by using heap allocation and access-to-class-wide components. But in high-integrity applications, heap use is often completely forbidden.
The alternative, currently forced on programmers, is to replace the tagged type by a discriminated record type, with variants corresponding to sub-classes, and to implement the dispatch manually by case statements on the discriminant, which stands in for the tag. The record type is made definite by giving a default value to the discriminant. The practical drawbacks of this design are well known, and were one of the reasons for the introduction of tagged types and class-oriented programming.
A solution is needed that provides the ability to store class-wide objects without introducing dynamic memory allocation.
!proposal
(See Summary.)
!wording
Add subclause A.18.32:
The Generic Package Containers.Bounded_Indefinite_Holders
The language-defined generic package Containers.Bounded_Indefinite_Holders provides a private type Holder and a set of operations for that type. It provides the same operations as the package Containers.Indefinite_Holders (see A.18.18), with the difference that the maximum storage is bounded.
Static Semantics
The declaration of the generic library package Containers.Bounded_Indefinite_Holders has the same contents and semantics as Containers.Indefinite_Holders except:
* The following is added to the context clause:
with System.Storage_Elements; use System.Storage_Elements;
* An additional generic parameter follows Element_Type:
Max_Element_Size_in_Storage_Elements : Storage_Count;
AARM Reason: This value is in Storage_Elements so that it can be used as a discriminant on a storage pool object in the implementation of the Holder object; Ada doesn't allow discriminant-dependent components to use formulas.
This is a generic parameter as it is a property of the Element_Type; the largest possible object of Element_Type is unlikely to be different for different containers, so making it a discriminant (as Capacity is) provides no useful capability. End AARM Reason.
* Add to the precondition of To_Holder and Replace_Element:
and then (if New_Item'Size >
Max_Element_Size_in_Storage_Elements * System.Storage_Unit
then raise Program_Error)
AARM Reason: This ensures that an object that won't fit is not inserted into the container.
Bounded (Run-Time) Errors
It is a bounded error to assign from a bounded holder object while tampering with elements of that object is prohibited. Either Program_Error is raised by the assignment, execution proceeds with the target object prohibiting tampering with elements, or execution proceeds normally.
Implementation Requirements
For each instance of Containers.Indefinite_Holders and each instance of Containers.Bounded_Indefinite_Holders, if the two instances meet the following conditions, then the output generated by the Holder'Output or Holder'Write subprograms of either instance shall be readable by the Holder'Input or Holder'Read of the other instance, respectively:
* the Element_Type parameters of the two instances are statically matching subtypes of the same type; and
* the output generated by Element_Type'Output or Element_Type'Write is readable by Element_Type'Input or Element_Type'Read, respectively (where Element_Type denotes the type of the two actual Element_Type parameters).
Implementation Advice
Bounded holder objects should be implemented without dynamic allocation, and any finalization should be trivial unless Element_Type needs finalization.
AARM To Be Honest: Implementation of this container in Ada will probably require the use of a special storage pool. When we say "without dynamic allocation", we mean that this pool does not use heap memory and has a trivial finalization routine (that is, procedures Adjust and Finalize are null procedures). All storage pools are controlled, so we can't reasonably say that a bounded holder will not need finalization.
AARM Implementation Note: If the implementation supports discontiguous objects that require multiple calls to Allocate in a storage pool, the storage pool will need to support such allocations. The storage pool implementation can assume that all Allocate calls occur together, and similarly for Deallocate calls, thus simplifying the pool implementation so that allocation only occurs at a high-water mark location.
!discussion
This container can be used to solve the Problem by putting the class-wide objects into holders. Assuming that they fit, the holder can be used like any other fixed-size object.
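As a purely illustrative sketch (the types Shape and Circle, the primitive operation Describe, and the bound of 64 storage elements are invented for this example and are not part of the proposal), a client might write:

   with Ada.Text_IO;
   with Ada.Containers.Bounded_Indefinite_Holders;  -- the package proposed above

   procedure Holder_Demo is

      type Shape is tagged record
         X, Y : Float := 0.0;
      end record;
      function Describe (S : Shape) return String is ("a shape");

      type Circle is new Shape with record
         Radius : Float := 1.0;
      end record;
      overriding function Describe (C : Circle) return String is ("a circle");

      -- The bound is chosen generously for this example; To_Holder and
      -- Replace_Element raise Program_Error if an object does not fit.
      package Shape_Holders is new Ada.Containers.Bounded_Indefinite_Holders
        (Element_Type                          => Shape'Class,
         Max_Element_Size_in_Storage_Elements  => 64);

      -- A fixed-size data structure of class-wide values, with no heap use:
      Objects : array (1 .. 10) of Shape_Holders.Holder;

   begin
      Objects (1) := Shape_Holders.To_Holder
                       (Circle'(X => 1.0, Y => 2.0, Radius => 3.0));
      Objects (2) := Shape_Holders.To_Holder (Shape'(X => 0.0, Y => 0.0));

      -- Dispatching on a stored class-wide value:
      Ada.Text_IO.Put_Line (Describe (Shape_Holders.Element (Objects (1))));
   end Holder_Demo;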
Thoughts:
[1] The Bounded_Indefinite_Holder could be implemented in Ada using a storage pool that has a discriminant giving the size of the data area required. Then, allocation/deallocation of a single object would be supported. If objects of all data types are allocated contiguously (common in compilers targeting applications that cannot allow heap use), this code would be very simple (Allocate simply returns an address directly and sets a Boolean to indicate that allocation has happened) and could easily be analyzed if needed. (A sketch of such a pool appears at the end of this item.)
Such a storage pool would require finalization, but the actual finalize routine could be a null procedure. An Ada compiler targeting these sorts of applications could (and should) treat an object requiring trivial finalization as not controlled, and thus machine code analysis would not see any finalization overhead.
That is, it's not finalization that such applications should ban, but rather non-trivial finalization.
Also note that containers do not need to be written in portable Ada per se, so it certainly would be possible to use some implementation-defined mechanism to avoid the need for a storage pool.
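By way of illustration only (nothing below is required by this AI, and all of the names are invented), such a single-object pool might be sketched in portable Ada roughly as follows; a real pool would also have to honor the Alignment parameter and ensure that the data area is suitably aligned:

   with System;                  use System;
   with System.Storage_Elements; use System.Storage_Elements;
   with System.Storage_Pools;    use System.Storage_Pools;

   package Single_Object_Pools is

      -- A pool whose data area is a component of the pool object itself,
      -- sized by the discriminant, and able to hold one allocated object.
      type Single_Object_Pool (Data_Size : Storage_Count) is
        new Root_Storage_Pool with record
         Data      : Storage_Array (1 .. Data_Size);
         Allocated : Boolean := False;
      end record;

      overriding procedure Allocate
        (Pool                     : in out Single_Object_Pool;
         Storage_Address          : out Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      overriding procedure Deallocate
        (Pool                     : in out Single_Object_Pool;
         Storage_Address          : Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      overriding function Storage_Size
        (Pool : Single_Object_Pool) return Storage_Count is (Pool.Data_Size);

   end Single_Object_Pools;

   package body Single_Object_Pools is

      procedure Allocate
        (Pool                     : in out Single_Object_Pool;
         Storage_Address          : out Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
      begin
         if Pool.Allocated
           or else Size_In_Storage_Elements > Pool.Data_Size
         then
            raise Storage_Error;
         end if;
         Pool.Allocated  := True;
         Storage_Address := Pool.Data (1)'Address;  -- assumed suitably aligned
      end Allocate;

      procedure Deallocate
        (Pool                     : in out Single_Object_Pool;
         Storage_Address          : Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
      begin
         Pool.Allocated := False;
      end Deallocate;

   end Single_Object_Pools;

Finalize is not overridden here, so the pool inherits the null-effect Finalize of Limited_Controlled; as noted above, a compiler targeting this domain can treat such an object as needing only trivial finalization.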
[2] One could support other kinds of Bounded_Indefinite_xxx containers the same way. The storage pool would have a size of the data capacity times the element capacity. The data area would be treated as an array of data-capacity-sized elements. There also would need to be an array of element-capacity Booleans to flag whether each item is allocated. Allocation then would simply find an unallocated element and return the address of the start of that element. (This is essentially the same technique as used in the regular bounded containers, sans the storage pool.) No fragmentation could occur, and the memory use is completely predictable. (A sketch of such a pool appears at the end of this item.)
We've decided not to do this at this time, although such containers would help give predictable finalization in environments where memory usage has to be bounded.
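Purely for illustration (this is not proposed here, and all names are invented), the multi-slot pool described in [2] might be sketched as follows. The total size has to be passed as a separate discriminant, computed by the instantiator as the slot size times the number of slots, because (as the AARM Reason above notes) a discriminant-dependent component cannot use a formula:

   with System;                  use System;
   with System.Storage_Elements; use System.Storage_Elements;
   with System.Storage_Pools;    use System.Storage_Pools;

   package Slot_Pools is

      type Slot_Flags is array (Positive range <>) of Boolean;

      -- Total_Size must be given as Slot_Size * Slot_Count.
      type Slot_Pool (Slot_Size  : Storage_Count;
                      Slot_Count : Positive;
                      Total_Size : Storage_Count) is
        new Root_Storage_Pool with record
         Data : Storage_Array (1 .. Total_Size);
         Used : Slot_Flags (1 .. Slot_Count) := (others => False);
      end record;

      overriding procedure Allocate
        (Pool                     : in out Slot_Pool;
         Storage_Address          : out Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      overriding procedure Deallocate
        (Pool                     : in out Slot_Pool;
         Storage_Address          : Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      overriding function Storage_Size
        (Pool : Slot_Pool) return Storage_Count is (Pool.Total_Size);

   end Slot_Pools;

   package body Slot_Pools is

      procedure Allocate
        (Pool                     : in out Slot_Pool;
         Storage_Address          : out Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
      begin
         if Size_In_Storage_Elements > Pool.Slot_Size then
            raise Storage_Error;        -- the object does not fit in one slot
         end if;
         for I in Pool.Used'Range loop
            if not Pool.Used (I) then   -- first free slot
               Pool.Used (I) := True;
               Storage_Address :=
                 Pool.Data (Storage_Count (I - 1) * Pool.Slot_Size + 1)'Address;
               return;
            end if;
         end loop;
         raise Storage_Error;           -- all slots are in use
      end Allocate;

      procedure Deallocate
        (Pool                     : in out Slot_Pool;
         Storage_Address          : Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
         Offset : constant Storage_Offset :=
           Storage_Address - Pool.Data (1)'Address;
      begin
         Pool.Used (Positive (Offset / Pool.Slot_Size + 1)) := False;
      end Deallocate;

   end Slot_Pools;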
[3] Many users of Bounded_Indefinite_Holders would not want any runtime overhead to check that an object will fit into the container. A special language-defined mechanism is not needed to ensure this, because:
(1) Most objects put into a container will have a known specific type. That means that they also have a known, compile-time size. Therefore, the precondition check involves only compile-time-known values, and thus can be evaluated at compile time. No sane compiler will generate code for such a check unless it fails, and if it fails, a warning most likely would be given.
(2) In this environment, class-wide objects would be stored in Bounded_Indefinite_xxx containers (to give predictable memory usage). Such objects have a known upper bound on their size. A compiler could use that upper bound to evaluate the precondition. In typical cases (where all of the containers with the same element type have the same bound for maximum data size), this would result in the preconditions being known to pass (and thus not generating any code). Such an optimization would probably take some extra implementation work (it wouldn't fall out naturally like (1)), but it could be implemented for a compiler targeting this sort of application.
(3) This issue also could be handled with a technique recommended by AdaCore for such systems: use a tool like SPARK to prove that the checks cannot fail, then suppress them. Precondition checks on language-defined container units can be suppressed (using a suppress mechanism defined in AI12-0112-1), and thus not appear in the generated code. If the correctness has been verified some other way, this will not cause any safety issues.
!corrigendum A.18.32(0)
Insert new clause:
The language-defined generic package Containers.Bounded_Indefinite_Holders provides a private type Holder and a set of operations for that type. It provides the same operations as the package Containers.Indefinite_Holders (see A.18.18), with the difference that the maximum storage is bounded.
Static Semantics
The declaration of the generic library package Containers.Bounded_Indefinite_Holders has the same contents and semantics as Containers.Indefinite_Holders except:
* The following is added to the context clause:
with System.Storage_Elements; use System.Storage_Elements;
* An additional generic parameter follows Element_Type:
Max_Element_Size_in_Storage_Elements : Storage_Count;
* Add to the precondition of To_Holder and Replace_Element:
and then (if New_Item'Size > Max_Element_Size_in_Storage_Elements * System.Storage_Unit then raise Program_Error)
Bounded (Run-Time) Errors
It is a bounded error to assign from a bounded holder object while tampering with elements of that object is prohibited. Either Program_Error is raised by the assignment, execution proceeds with the target object prohibiting tampering with elements, or execution proceeds normally.
Implementation Requirements
For each instance of Containers.Indefinite_Holders and each instance of Containers.Bounded_Indefinite_Holders, if the two instances meet the following conditions, then the output generated by the Holder'Output or Holder'Write subprograms of either instance shall be readable by the Holder'Input or Holder'Read of the other instance, respectively:
* the Element_Type parameters of the two instances are statically matching subtypes of the same type; and
* the output generated by Element_Type'Output or Element_Type'Write is readable by Element_Type'Input or Element_Type'Read, respectively (where Element_Type denotes the type of the two actual Element_Type parameters).
Implementation Advice
Bounded holder objects should be implemented without dynamic allocation and any finalization should be trivial unless Element_Type needs finalization.
!ASIS
No ASIS changes are needed for package Bounded_Indefinite_Holders.
!ACATS test
ACATS B- and C-Tests are needed to check that the new capabilities are supported.
!appendix

!topic Definite class-wide types with Maximum_Size
!reference Ada 2012 RM3.3(23/3)
!from Niklas Holsti 18-01-15
!keywords class definite indefinite Maximum_Size
!discussion

This suggested Ada extension aims to make it possible to use class-wide
types and dispatching in applications that must have statically
predictable memory consumption. This suggestion arises from a discussion
on comp.lang.ada, starting 2017-06-23, within a thread with the Subject
"Ada Annoyances".

In high-integrity, embedded applications, it is usually necessary to
prove that the program will not run into stack overflow or heap
exhaustion. While static-analysis tools exist for computing safe bounds
on "primary" stack usage, that is not the case for heap usage nor for
the usage of the "secondary" stack that some Ada compilers use, in place
of heap, for local data of dynamic size.

On the other hand, it is desirable, and becoming accepted, to use tagged
types and dynamic dispatching even in high-integrity applications.
Static analysis tools for stack usage can usually deal with the dynamic
dispatching (even if it introduces some over-estimation), but a harder
problem is that dynamic dispatching requires class-wide subtypes, which
are currently considered indefinite subtypes by RM3.3(23/3) and for
which the compiler cannot know the size, at compile time, from local
analysis of the program. The amount of memory consumed by an object of a
class-wide subtype is therefore defined at run-time, in practice by
using heap or secondary stack (or some way of extending the local stack
frame of the primary stack, equally difficult for static stack-usage
analysis).

Moreover, the indefiniteness of class-wide types means that data
structures (array, records, etc.) with components of class-wide type can
be created only by using heap allocation and access-to-class-wide
components. But in high-integrity applications, heap use is often
completely forbidden.

The alternative, currently forced on programmers, is to replace the
tagged type by a discriminated record type, with variants corresponding
to sub-classes, and to implement the dispatch manually by case
statements on the discriminant, which stands in for the tag. The record
type is made definite by giving a default value to the discriminant. The
practical drawbacks of this design are well known, and were one of the
reasons for the introduction of tagged types and class-oriented programming.

As an example from real life, the company for which I mostly work (Space
Systems Finland Ltd) has a major SW component, used on several ESA
satellites over some 15 years, that implements telecommand and telemetry
packet handling by means of just such a discriminated record type
representing a "packet handle". The variants correspond to the different
memory areas that can contain such packets, with different allocation
and management mechanisms. Different applications (satellites) have
different collections of such areas, which means that the record type
and the case statements must be manually modified for each application.
Using a tagged type would eliminate most of that manual work.

The suggested Ada extension, below, proposes a form of class-wide types
that is modelled on the behaviour of a discriminated but still definite
record type.

Syntactically, the proposed extension takes the form of an aspect that
can be specified for a tagged type and that defines the maximum size, in
bits, of any object in the class-wide type rooted at this tagged type.
If we call this aspect Maximum_Size (or perhaps Maximum_Size'Class), the
programmer could use it like this:

    type Root is tagged record ... end record
    with Maximum_Size => 128;

    type Child is new Root with record ... end record;
    -- The compiler checks that Child'Size is at most 128 bits, and
    -- rejects the program otherwise.

It would now be legal to create statically-sized data structures using
Root'Class, without dynamic memory allocation; the compiler would
allocate 128 bits for each value of type Root'Class:

    type Object_List is array (List_Index) of Root'Class;

    type Object_Pair is record
       A, B : Root'Class;
    end record;

and so on.

Properties of such definite class-wide types would include:

- Each object of this type has a default initial tag value, taken from
   the root of the class (which must therefore be non-abstract to
   permit default initialisation of these objects).

- An object of this type can be assigned a new value, with a new
   tag from the same class. (In the comp.lang.ada discussion, Randy
   Brukardt called this point "way too drastic a change".)

Other comments from the comp.lang.ada discussion include:

- From Robert A. Duff: "Something like that was considered and
   rejected for Ada 9X. Part of the problem is that it seems so low
   level to be talking about sizes.  It's not even portable."

   My reply is that this feature would be used only or mainly in
   embedded programs, which usually already contain unportable,
   low-level stuff: task stack sizes, record representation clauses, etc.

- From Robert A. Duff: "It would be better to have the compiler
   compute the maximum size needed.  That would require the compiler
   to do a fairly global analysis, which is something Ada compilers are
   not set up to do."

   My reply is that I agree that, ideally, the compiler should compute
   the maximum size, but that a friendly compiler would (at least
   optionally) tell me that I am wasting space (when Maximum_Size is
   needlessly large), or that the given Maximum_Size is too small, and
   what would be the minimum valid value. By setting the Maximum_Size
   deliberately too small or too large, a single compilation of the
   program would produce a set of error or information messages from
   which a suitable Maximum_Size could be picked.

   Similarly, an ideal compiler would compute the stack-size needed for
   each task (and some compilers do that). In practice, however, the
   programmer guesses a stack size, and uses either static analysis, or
   high-water-mark measurements, to determine the real stack usage.

****************************************************************

From: Tucker Taft
Sent: Monday, January 15, 2018  10:03 PM

This seems worth considering.  As far as portability, we already allow users to
specify a Storage_Size for a task, so this seems no worse than that.  I would
probably recommend allow specification on a class-wide type of the
"Max_Size_In_Storage_Elements" rather than "Max_Size" so that the size is given
in Storage_Elements rather than bits, and to make the connection to other
Storage_Size attributes.

Whether we want a user-specified Max_Size_In_Storage_Elements for a class-wide
type to imply that the type is definite, or rather allow the "Holder" container
(A.18.18) to query this attribute is another matter.  I think if the Holder
container could be made efficient on class-wide types, you could use them in
real-time and/or embedded systems.  Perhaps a "Bounded_Holder" container would
be the right answer, or simply specify that the Indefinite_Holder take advantage
of a Max_Size_In_Storage_Elements if it is sufficiently small (perhaps an
additional parameter to Indefinite_Holder could specify that it avoid using the
heap by using this attribute).

****************************************************************

From: Niklas Holsti
Sent: Wednesday, January 17, 2018  4:47 PM

> This seems worth considering.

Good!

> As far as portability, we already
> allow users to specify a Storage_Size for a task, so this seems no
> worse than that.  I would probably recommend allow specification on a
> class-wide type of the "Max_Size_In_Storage_Elements" rather than
> "Max_Size" so that the size is given in Storage_Elements rather than
> bits, and to make the connection to other Storage_Size attributes.

I thought in terms of bits, because the 'Size of types is in bits, and the names
of some predefined types (Interfaces.Unsigned_32, etc) embed the size in bits.
But Storage_Elements are ok too.

> Whether we want a user-specified Max_Size_In_Storage_Elements for a
> class-wide type to imply that the type is definite, or rather allow
> the "Holder" container (A.18.18) to query this attribute is another
> matter.  I think if the Holder container could be made efficient on
> class-wide types, you could use them in real-time and/or embedded
> systems.  Perhaps a "Bounded_Holder" container would be the right
> answer, or simply specify that the Indefinite_Holder take advantage of
> a Max_Size_In_Storage_Elements if it is sufficiently small (perhaps an
> additional parameter to Indefinite_Holder could specify that it avoid
> using the heap by using this attribute).

Such a Bounded_Holder generic would be difficult to implement in pure Ada, would
it not? The only approach I have imagined for a pure-Ada implementation of
Bounded_Holder would make each such object contain its own storage pool, able to
hold a single object of the "held" type, up to the Max_Size_In_Storage_Elements.
That would make Bounded_Holder require finalization, which I would like to
avoid, unless the "held" type requires it.

I would not like to have a fixed or implementation-defined limit on the
"sufficiently small" value of Max_Size_In_Storage_Elements; whatever limit value
is chosen, I'm sure that it would turn out to be just a little too small for my
next application :-)

While I'm not totally opposed to the Bounded_Holder idea, it seems to me that
most of the definition of Indefinite_Holder exists to support an implementation
that uses heap allocation (and naturally so). If the Bounded_Holder were defined
in a similar way, it would introduce complications, such as controlled types,
and the possibility of the Holder being empty.

****************************************************************

From: Randy Brukardt
Sent: Friday, January 19, 2018  1:50 AM

...
> Such a Bounded_Holder generic would be difficult to implement in pure
> Ada, would it not?

That's a nice-to-have, but it's definitely not a requirement for the Ada
Language-Defined packages. There are plenty of such packages
(Ada.Unchecked_Conversion, Ada.Unchecked_Deallocation, etc.) that can't be
implemented in Ada in any circumstance, and most of the rest tend to be
implemented in Ada + implementation-defined stuff.

> ...  The only approach I have imagined for a pure-Ada implementation
> of Bounded_Holder would make each such object contain its own storage
> pool, able to hold a single object of the "held" type, up to the
> Max_Size_In_Storage_Elements. That would make Bounded_Holder require
> finalization, which I would like to avoid, unless the "held" type
> requires it.

Trying to avoid finalization is like cutting off your leg and complaining that
you can't run. The finalization in this case would be trivial (probably an empty
Finalize routine), so it shouldn't matter at all (and possibly some compilers
could eliminate it altogether). Moreover, finalization does not cause any
analysis issues in a static allocation environment -- it just means that some
routines run automatically at scope exit and during assignment. Any
semi-competent tool should be able to handle that.

I understand that there are good reasons to avoid heap allocation, but allowing
dispatching and not allowing finalization seems like the priorities have gotten
mixed. :-)

> I would not like to have a fixed or implementation-defined limit on
> the "sufficiently small" value of Max_Size_In_Storage_Elements;
> whatever limit value is chosen, I'm sure that it would turn out to be
> just a little too small for my next application :-)

I agree with this, but I must have missed something because I didn't see any
such proposal (and it doesn't make much sense). The Bounded_Container would have
to somehow get the Max_Size_In_Storage_Elements. I'd probably suggest that it be
specified as a parameter to the generic, so that such a container could be used
for any indefinite type (assuming the Max is large enough).

> While I'm not totally opposed to the Bounded_Holder idea, it seems to
> me that most of the definition of Indefinite_Holder exists to support
> an implementation that uses heap allocation (and naturally so). If the
> Bounded_Holder were defined in a similar way, it would introduce
> complications, such as controlled types, and the possibility of the
> Holder being empty.

That's true of ALL of the containers. Vectors is designed to be unbounded and
use heap allocation. Yet we have Bounded_xxx versions that are supposed to avoid
dynamic allocation. Bounded_Vectors do no heap allocation and use an explicitly
specified capacity. I don't see any particular issue with having
Bounded_Indefinite_xxx versions that do the same thing for indefinite types. The
question is whether the implementation could be made "clean" enough to keep the
safety critical people happy.

I much prefer keeping the complications (if there are any) to the containers,
since implementing mutable record types is one of the hardest parts of Ada to
get right (especially because of the possibility of discriminant-dependent
controlled subcomponents). Allowing essentially the same thing for class-wide
types would be a whole 'nother level of implementation pain. Moreover, there are
a lot of bizarre language rules to avoid problems with mutable record types; the
tagged type rules were designed knowing that those cases couldn't happen because
the tag of an object cannot be changed. Allowing that would require some
additional restrictions on access types which would not be compatible (or safe).
[Otherwise, an access type to T'Class could end up pointing to an object that
isn't in T'Class.]

****************************************************************

From: Tucker Taft
Sent: Friday, January 19, 2018  8:01 AM

I agree with Randy's responses, in general.  Adding a new kind of container with
some implementation advice and/or requirements is hugely less disruptive than
adding a new language feature.  Furthermore, you really have no guarantee that
the new language feature wouldn't end up using heap allocation "underneath," and
so we would end up having new restrictions and/or implementation requirements to
prevent that.

Note that we might also address some of the same needs by having a "safe"
pointer capability (AI-0240), that manages storage automatically, and could be
coupled with other things such as storage-sized access collections  to achieve
the effectively "static" allocation you would like.

****************************************************************

From: Niklas Holsti
Sent: Sunday, January 21, 2018  9:48 AM

> ...
>> ...  The only approach I have imagined for a pure-Ada implementation
>> of Bounded_Holder would make each such object contain its own storage
>> pool, able to hold a single object of the "held" type, up to the
>> Max_Size_In_Storage_Elements. That would make Bounded_Holder require
>> finalization, which I would like to avoid, unless the "held" type
>> requires it.
>
> Trying to avoid finalization is like cutting off your leg and
> complaining that you can't run. The finalization in this case would be
> trivial (probably an empty Finalize routine), so it shouldn't matter
> at all (and possibly some compilers could eliminate it altogether).
> Moreover, finalization does not cause any analysis issues in a static
> allocation environment -- it just means that some routines run
> automatically at scope exit and during assignment. Any semi-competent tool
> should be able to handle that.

I'm afraid that is not true for static analysis of resource usage, such as stack
usage, because such analysis generally has to start from the machine code form
of the program, not the source code form.

Some implementations of finalization could be quite difficult to analyse from
the machine code. For example, assume that the objects to be finalized are put
in a linked list, and the finalization routine traverses this list and invokes
the dispatching Finalize for each list element. In the machine code, the list
traversal involves (1) a loop that is not terminated by a counter, but by
reaching a null link; (2) reading dynamically defined "next" pointers from list
cells; (3) at each list cell, calling a routine that is dynamically defined
(dispatching) by the contents (tag) of the cell. Each of these dynamic steps is
hard for a static analysis that aims to find an upper bound on the stack usage
of the Finalize routines that may be called.

(I know that GNAT now implements finalization differently, and not with lists as
described above, but I have not yet studied the present method, so I don't know
if it leads to simpler machine code.)

Explicit dispatching calls also lead to machine-level calls with dynamically
defined target addresses, but here the analyst can aid and guide the analysis
tool as to which subprograms may be called by a particular dispatching call.
That is much harder for implicit dispatching calls, such as finalization.

>> I would not like to have a fixed or implementation-defined limit on
>> the "sufficiently small" value of Max_Size_In_Storage_Elements;
>> whatever limit value is chosen, I'm sure that it would turn out to be
>> just a little too small for my next application :-)
>
> I agree with this, but I must have missed something because I didn't
> see any such proposal (and it doesn't make much sense).

Tuck said, in part, "... or simply specify that the Indefinite_Holder take
advantage of a Max_Size_In_Storage_Elements if it is sufficiently small ...".

> The Bounded_Container would
> have to somehow get the Max_Size_In_Storage_Elements. I'd probably
> suggest that it be specified as a parameter to the generic, so that
> such a container could be used for any indefinite type (assuming the Max is
> large enough).

That sounds like a workable solution. But it still seems necessary to specify
the Max_Size_In_Storage_Elements aspect for the class-wide root type, so that
the compiler can check that the max size given in the instantiation of
Bounded_Container is large enough for all members of the class.

> I much prefer keeping the complications (if there are any) to the
> containers, since implementing mutable record types is one of the
> hardest parts of Ada to get right (especially because of the
> possibility of discriminant-dependent controlled subcomponents).

Understood.

> Allowing essentially the same thing for class-wide types would be a
> whole 'nother level of implementation pain.

I don't really see why. All class-wide types are already defined as needing
finalization which, as I understand it, is because any class-wide variable may
or may not contain controlled subcomponents, depending on the actual tag. It
seems to me that there must already be mechanisms for managing the possible
(tag-dependent) existence of controlled subcomponents.

But I now realize that when writing this suggestion, I did not remember this
issue (needing finalization) of class-wide types (perhaps because I have seldom
used controlled types). Therefore, even if the suggestion leads to some way to
bound the size of T'Class, the analysis could still be confused by the
finalization, as I explained above. To solve this, I want to extend the
suggestion to include a Boolean aspect No_Controlled_Subcomponents which can be
specified for an array type, a record type, a tagged type, or a class-wide type,
and which would mean what it says, that this type cannot and will not have any
controlled subcomponents. A class-wide type with this aspect would not need
finalization, unless the root tagged type is itself controlled.

> Moreover, there are a lot of bizarre language rules to avoid problems
> with mutable record types; the tagged type rules were designed knowing
> that those cases couldn't happen because the tag of an object cannot
> be changed. Allowing that would require some additional restrictions
> on access types which would not be compatible (or safe).
> [Otherwise, an access type to T'Class could end up pointing to an
> object that isn't in T'Class.]

The rule that comes to mind, for solving this problem, is to prohibit the
creation of access types to sub-classes of any T'Class that has the
Max_Size_In_Storage_Units aspect. Access-to-constant types could still be
allowed. This does not seem incompatible with Ada 2012.

****************************************************************

From: Niklas Holsti
Sent: Sunday, January 21, 2018  10:52 AM

>> Moreover, finalization does not
>> cause any analysis issues in a static allocation environment -- it
>> just means that some routines run automatically at scope exit and
>> during assignment. Any semi-competent tool should be able to handle that.
>
> I'm afraid that is not true for static analysis of resource usage,
> such as stack usage, because such analysis generally has to start from
> the machine code form of the program, not the source code form.

    [snip description of finalization analysis problems]

> Explicit dispatching calls also lead to machine-level calls with
> dynamically defined target addresses, but here the analyst can aid and
> guide the analysis tool as to which subprograms may be called by a
> particular dispatching call. That is much harder for implicit
> dispatching calls, such as finalization.

As an aside, and perhaps partly arguing against myself, it could be said that
when a compiler is intended for high-integrity systems, it is the compiler's
responsibility to support the analysis of the code the compiler generates.

For example, in the case of finalization and other dispatching calls, the
compiler might be expected to emit debugging information or other forms of
information giving the machine-code locations of the dispatching calls and
naming the type-classes involved. If the compiler also identifies the primitive
operations of each tagged type, a tool that analyses the compiled code can
compute an upper bound on the set of operations that can be called from any
dispatching call, and proceed to analyse the machine code of those operations.

Some C++ compilers for embedded systems emit such information (for example, IAR
Systems compilers), but unfortunately, AFAIK, the common executable and
debugging data-format standards (ELF, Dwarf) do not specify how to represent
information of this kind. The compilers from IAR Systems use the IAR proprietary
format called UBROF.

Perhaps some requirements or suggestions on such compiler output could be added
to appendix H, section H.3.1 (pragma Reviewable). I could write a specific
proposal for such text, if the ARG might consider it. Paragraph 18.1 in that
section already requires "control- and data-flow information", but that is quite
vague.

This suggestion is a general one, to support analysis or review of
high-integrity applications that use dispatching. It is not limited to the
suggestion for definite class-wide types.

****************************************************************

From: Randy Brukardt
Sent: Monday, January 22, 2018  3:16 PM

> I'm afraid that is not true for static analysis of resource usage,
> such as stack usage, because such analysis generally has to start from
> the machine code form of the program, not the source code form.
>
> Some implementations of finalization could be quite difficult to
> analyse from the machine code. For example, assume that the objects to
> be finalized are put in a linked list, and the finalization routine
> traverses this list and invokes the dispatching Finalize for each list
> element. In the machine code, the list traversal involves (1) a loop
> that is not terminated by a counter, but by reaching a null link; (2)
> reading dynamically defined "next" pointers from list cells;
> (3) at each list cell, calling a routine that is dynamically defined
> (dispatching) by the contents (tag) of the cell. Each of these dynamic
> steps is hard for a static analysis that aims to find an upper bound
> on the stack usage of the Finalize routines that may be called.

Understood. But that assumes that the finalization is non-trivial, and that a
linked list implementation is used. If it is trivial, it is possible that the
compiler generate no finalization at all for an object with a null finalization
routine (and I would hope that is the case for any compiler trying to support
the kind of analysis you mention here). As such, the object would never be put
on the list in the first place. Your "bounded indefinite" storage pool would
likely have a null procedure finalization, as no special clean-up is needed for
a statically allocated chunk of memory.

> (I know that GNAT now implements finalization differently, and not
> with lists as described above, but I have not yet studied the present
> method, so I don't know if it leads to simpler machine code.)

My understanding is that GNAT generates a direct (not dispatching) call to the
Finalize routine for stand-alone objects and subcomponents thereof. That
shouldn't be a problem for analysis, even of the machine code. You might get a
list for allocated objects, but you are already trying to avoid those.

> Explicit dispatching calls also lead to machine-level calls with
> dynamically defined target addresses, but here the analyst can aid and
> guide the analysis tool as to which subprograms may be called by a
> particular dispatching call.
> That is much harder for implicit dispatching calls, such as
> finalization.

Not sure why; the type is known exactly in most cases (including the one in
question). And GNAT probably isn't generating a dispatching call anyway (and
probably isn't generating any call at all in this particular case).

...
> > The Bounded_Container would
> > have to somehow get the Max_Size_In_Storage_Elements. I'd probably
> > suggest that it be specified as a parameter to the generic, so that
> > such a container could be used for any indefinite type (assuming the
> > Max is large enough).
>
> That sounds like a workable solution. But it still seems necessary to
> specify the Max_Size_In_Storage_Elements aspect for the class-wide
> root type, so that the compiler can check that the max size given in
> the instantiation of Bounded_Container is large enough for all members
> of the class.

I would have expected it to be enough for the container to check for overflow.
Any full-program analysis tool can easily prove that it can't fail.

> > Allowing essentially the same thing for class-wide types would be a
> > whole 'nother level of implementation pain.
>
> I don't really see why. All class-wide types are already defined as
> needing finalization which, as I understand it, is because any
> class-wide variable may or may not contain controlled subcomponents,
> depending on the actual tag. It seems to me that there must already be
> mechanisms for managing the possible (tag-dependent) existence of
> controlled subcomponents.

In the normal Ada case (where the tag/bounds/discriminants can't change after
allocation), one handles all of the finalization and storage allocation at the
declaration of the object. For an implementation like Janus/Ada, that means
putting controlled components on the current finalization chain as well as
allocating memory from the current frame's storage pool. None of that can change
later, so it's relatively easy. Janus/Ada treats each controlled part
individually, so only directly controlled components need anything done. (We did
that because the alternative of dealing with the entire composite object would
mean having to deal with controlled -- thus tagged -- array types, which
appeared to lead to madness [mine].)

When you have a mutable record, however, one has to be able to add/remove
controlled components.

With the Janus/Ada implementation, that has to be to a remote location in the
finalization chain. Deleting is easy (the chain is doubly linked for that
reason), but adding is difficult because it has to be done at a particular place
on the chain. This never actually worked in Janus/Ada until last spring when I
finally gave up on any efficient solution and placed a local finalization chain
into each such object. (And that compiler has only been used by a handful of
people, so there's no certainty that it is actually fixed.) (There also was a
storage leak problem that has been fixed similarly by including an access to the
storage pool to use into the object.)

If one allowed changing the tag of class-wide objects, you would have the same
issues with them, and a similar, heavy-weight solution would have to be applied
to them.

> But I now realize that when writing this suggestion, I did not
> remember this issue (needing finalization) of class-wide types
> (perhaps because I have seldom used controlled types).
> Therefore, even if the suggestion leads to some way to bound the size
> of T'Class, the analysis could still be confused by the finalization,
> as I explained above. To solve this, I want to extend the suggestion
> to include a Boolean aspect No_Controlled_Subcomponents which can be
> specified for an array type, a record type, a tagged type, or a
> class-wide type, and which would mean what it says, that this type
> cannot and will not have any controlled subcomponents. A class-wide
> type with this aspect would not need finalization, unless the root
> tagged type is itself controlled.

Seems reasonable.

> > Moreover, there are a lot of bizarre language rules to avoid
> > problems with mutable record types; the tagged type rules were
> > designed knowing that those cases couldn't happen because the tag of
> > an object cannot be changed. Allowing that would require some
> > additional restrictions on access types which would not be compatible (or
> > safe). [Otherwise, an access type to T'Class could end up pointing to an
> > object that isn't in T'Class.]
>
> The rule that comes to mind, for solving this problem, is to prohibit
> the creation of access types to sub-classes of any T'Class that has
> the Max_Size_In_Storage_Units aspect.
> Access-to-constant types could still be allowed. This does not seem
> incompatible with Ada 2012.

That would work for the specific problem; not sure if there are any others (that
would require some analysis of potential problems).

It would seem to be a pretty draconian restriction, though, as anything other
than access T'Class would have to be banned. That would seem to conflict with
anonymous access dispatching, for instance.

****************************************************************

From: Niklas Holsti
Sent: Tuesday, January 23, 2018  4:33 AM

> I agree with Randy's responses, in general.  Adding a new kind of
> container with some implementation advice and/or requirements is
> hugely less disruptive than adding a new language feature.

That route is ok for me, unless the container introduces some additional
analysis problems, such as finalization (as was discussed in other messages).

> Furthermore, you really have no guarantee that the new language
> feature wouldn't end up using heap allocation "underneath," and so we
> would end up having new restrictions and/or implementation
> requirements to prevent that.

The restriction No_Implicit_Heap_Allocations should work, at least to warn the
programmer if the compiler needs heap to implement the definite class-wide
types. Some compilers could still use a secondary stack, of course.

But this seems to be an issue of implementation quality, or implementation
orientation, and does not apply only to the proposed definite class-wide types,
but to almost anything (say, local array variables of dynamic length, which some
compilers might allocate on the heap, or on a secondary stack).

> Note that we might also address some of the same needs by having a
> "safe" pointer capability (AI-0240), that manages storage
> automatically, and could be coupled with other things such as
> storage-sized access collections  to achieve the effectively "static"
> allocation you would like.

If "safe" pointers still use allocations ("new"), they would not solve the
problem, because a static analysis could find it quite difficult to bound the
number of allocations executed at run-time, and to match up the allocations with
the (implicit) deallocations, which it would have to do in order to discover the
maximum number of allocated objects that could exist, at any time, during
execution.

There are analysis tools that attempt to find bounds on heap usage, but they
require much deeper and more complex analysis (for example, "shape analysis" for
linked data structures) than is needed for stack-usage analysis.

****************************************************************

From: Niklas Holsti
Sent: Tuesday, January 23, 2018  8:25 AM

> ...
>> Some implementations of finalization could be quite difficult to
>> analyse from the machine code.
   ...
> Understood. But that assumes that the finalization is non-trivial, and
> that a linked list implementation is used. If it is trivial, it is
> possible that the compiler generate no finalization at all for an
> object with a null finalization routine (and I would hope that is the
> case for any compiler trying to support the kind of analysis you mention here).

That would solve the finalization-analysis problem in this case, yes.

> My understanding is that GNAT generates a direct (not dispatching)
> call to the Finalize routine for stand-alone objects and subcomponents
> thereof.

Ok, that should not be a problem then, if the call to Finalize the
Bounded_Container is direct.

>> Explicit dispatching calls also lead to machine-level calls with
>> dynamically defined target addresses, but here the analyst can aid
>> and guide the analysis tool as to which subprograms may be called by
>> a particular dispatching call.
>> That is much harder for implicit dispatching calls, such as
>> finalization.
>
> Not sure why; the type is known exactly in most cases (including the
> one in question).

The class-wide type is exactly known to the compiler, but not to the
machine-code analyzer.

For explicit dispatching calls, the guidance from the analyst, to which I
referred above, usually takes the form of textual "annotations", written by the
analyst for use by the analysis tool, and saying that the dynamic call at a
particular source code location (given by file name and line number, or similar)
may actually call (dispatch to) a given set of subprograms specifically named in
the annotation. The analyst identifies these subprograms from the source code,
perhaps aided by a source-level analysis that identifies the primitive
subprograms for each type included in the class. Something like

    dynamic call at "huhu.adb:626" calls "Foo" or "Foo2" or "Foo3";

For implicit dispatching calls, there is no similar source-code reference
location. The analysis tool can tell the analyst that there is a dispatching
call at machine code address A, but the analyst will find it difficult to map
this call to the source code, and thus to understand what the call is doing, and
how to guide the analysis of this call.

Machine-code analyzers usually strive to be source-language-agnostic and
compiler-agnostic, so as not to restrict their user base by conditions on the
user's programming tools. So one tries to separate the machine-code analysis
from any source-code analysis that might support the machine-code one. And there
is always the problem of mapping information from the source-code analysis to
the machine level, such as identifying which implicit finalization accounts for
the dispatching call at address A.

>>> The Bounded_Container would
>>> have to somehow get the Max_Size_In_Storage_Elements. I'd probably
>>> suggest that it be specified as a parameter to the generic, so that
>>> such a container could be used for any indefinite type (assuming the
>>> Max is large enough).
>>
>> That sounds like a workable solution. But it still seems necessary to
>> specify the Max_Size_In_Storage_Elements aspect for the class-wide
>> root type, so that the compiler can check that the max size given in
>> the instantiation of Bounded_Container is large enough for all
>> members of the class.
>
> I would have expected it to be enough for the container to check for
> overflow. Any full-program analysis tool can easily prove that it
> can't fail.

The other fellow's job always seems easier than one's own :-)

In this case, the machine-code analyzer would have to perform this proof at
every point in the code where an object is stored into some instantiation of a
Bounded_Container. To do so, the machine-code analyzer would have to be able to
statically compare the actual Max_Size value of the present instance of
Bounded_Container, to the (actual or maximum) size of the object that is being
stored in the Bounded_Container. In the machine-code, both values may be the
result of dynamic computations. It may be possible; I find it hard to say, as it
will depend on the precise form of the code generated for instantiations and for
storing an object in the container.

(Aside from that, I would much prefer to have a possible overflow detected at
compile time, at the instantiation.)

>>> Allowing essentially the same thing for class-wide types would be a
>>> whole 'nother level of implementation pain.

    [snip description of Janus/Ada methods; thanks, it was illuminating]

> If one allowed changing the tag of class-wide objects, you would have
> the same issues with them, and a similar, heavy-weight solution would
> have to be applied to them.
>
>> But I now realize that when writing this suggestion, I did not
>> remember this issue (needing finalization) of class-wide types
>> (perhaps because I have seldom used controlled types).
>> Therefore, even if the suggestion leads to some way to bound the size
>> of T'Class, the analysis could still be confused by the finalization,
>> as I explained above. To solve this, I want to extend the suggestion
>> to include a Boolean aspect No_Controlled_Subcomponents which can be
>> specified for an array type, a record type, a tagged type, or a
>> class-wide type, and which would mean what it says, that this type
>> cannot and will not have any controlled subcomponents. A class-wide
>> type with this aspect would not need finalization, unless the root
>> tagged type is itself controlled.
>
> Seems reasonable.

If it would simplify implementation, one could allow compilers to require (or
assume) that No_Controlled_Subcomponents is specified together with
Max_Size_In_Storage_Units.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, January 24, 2018  3:25 PM

...
> >> Explicit dispatching calls also lead to machine-level calls with
> >> dynamically defined target addresses, but here the analyst can aid
> >> and guide the analysis tool as to which subprograms may be called
> >> by a particular dispatching call.
> >> That is much harder for implicit dispatching calls, such as
> >> finalization.
> >
> > Not sure why; the type is known exactly in most cases
> (including the
> > one in question).
>
> The class-wide type is exactly known to the compiler, but not to the
> machine-code analyzer.

I understand that, but it isn't relevant when the specific type is known. (Which
is what I meant by "type is known exactly", I could have been clearer.) As you
know, Ada dispatching uses static binding for dispatching calls where the
controlling type is statically determined. A statically bound dispatching call
doesn't dispatch at all, it just directly calls the appropriate routine.

So, for machine-code analysis, this just looks like a normal direct call.
I'd be amazed if the analyzer cares about how that call got determined.

Obviously, a compiler could revert to dynamic dispatching code even in that
case, but one would hope that if the compiler vendor is trying to support
machine code analysis, they would fix the code generation to avoid that. (Yes,
Janus/Ada would use a dynamic call in such a case, and that call is hidden in
the runtime, but anyone trying to do machine-code analysis on Janus/Ada output is
nuts -- it wasn't designed for that at all.)

...
> >>> The Bounded_Container would
> >>> have to somehow get the Max_Size_In_Storage_Elements. I'd probably
> >>> suggest that it be specified as a parameter to the generic, so
> >>> that such a container could be used for any indefinite type
> >>> (assuming the Max is large enough).
> >>
> >> That sounds like a workable solution. But it still seems necessary
> >> to specify the Max_Size_In_Storage_Elements aspect for the
> >> class-wide root type, so that the compiler can check that the max
> >> size given in the instantiation of Bounded_Container is large
> >> enough for all members of the class.
> >
> > I would have expected it to be enough for the container to check for
> > overflow. Any full-program analysis tool can easily prove that it
> > can't fail.
>
> The other fellow's job always seems easier than one's own :-)
>
> In this case, the machine-code analyzer would have to perform this
> proof at every point in the code where an object is stored into some
> instantiation of a Bounded_Container. To do so, the machine-code
> analyzer would have to be able to statically compare the actual
> Max_Size value of the present instance of Bounded_Container, to the
> (actual or
> maximum) size of the object that is being stored in the
> Bounded_Container. In the machine-code, both values may be the result
> of dynamic computations. It may be possible; I find it hard to say, as
> it will depend on the precise form of the code generated for
> instantiations and for storing an object in the container.

I wasn't expecting the machine-code analyzer to ever see this check. I was
expecting an Ada-level tool to make the proof and the check to then be
suppressed. (The only legitimate reason for Ada check suppression is to reduce
the cost of machine-code analysis).

Additionally, this check (like most new checks in Ada 2020) would be written as
a precondition, so it is likely the compiler could eliminate it in most cases.
Again, it wouldn't survive for the machine code analyzer to see it.

> (Aside from that, I would much prefer to have a possible overflow
> detected at compile time, at the instantiation.)

If an object did cause an overflow, you'd most likely get a warning about a
precondition that always will fail. If one uses "warnings as errors" mode, you
get the effect without any extra language machinery.

Specifically:
    procedure Replace_Element (Container : in out Holder;
                               New_Item  : in     Element_Type)
        with Pre => New_Item'Size <= Max_Size;

If New_Item is an object of a specific type (the usual case), the compiler would
be able to evaluate the precondition at compile-time (since New_Item'Size is a
compile-time constant). It would be a pretty lousy compiler that actually
generated a check in that case. :-)

You'd probably have to suppress the check if the object is class-wide, but that
would generally be safe if it had previously been stored in a Holder. (And
Ada-level analysis tools should be able to prove that regardless of the source
of the object.) There will shortly be a proposal for suppressing precondition
checks for language-defined units as part of AI12-0112-1. (I have to find time
to write it, rather than commenting on everyone else's proposals. :-)

****************************************************************

From: Niklas Holsti
Sent: Friday, January 26, 2018  2:55 PM

> ...
>>>> That is much harder for implicit dispatching calls, such as
>>>> finalization.
>>>
>>> Not sure why; the type is known exactly in most cases
>> (including the one in question).
     ...
> So, for machine-code analysis, this just looks like a normal direct call.

Indeed, such calls are not a problem, not (usually) even for implicit calls.

>>> [Niklas Holsti writes:]
>>>> But it still seems necessary to
>>>> specify the Max_Size_In_Storage_Elements aspect for the class-wide
>>>> root type, so that the compiler can check that the max size given
>>>> in the instantiation of Bounded_Container is large enough for all
>>>> members of the class.
>>>
>>> I would have expected it to be enough for the container to check for
>>> overflow.

   [snip]

> If an object did cause an overflow, you'd most likely get a warning
> about a precondition that always will fail. If one uses "warnings as
> errors" mode, you get the effect without any extra language machinery.
>
> Specifically:
>     procedure Replace_Element (Container : in out Holder;
>                                New_Item  : in     Element_Type)
>         with Pre => New_Item'Size <= Max_Size;
>
> If New_Item is an object of a specific type (the usual case), the
> compiler would be able to evaluate the precondition at compile-time
> (since New_Item'Size is a compile-time constant).

I get it now. I wanted to have the compiler check Max_Size when a descendant
type is derived, but I agree with you that as an object of every such descendant
type will be entered into an instance of the Bounded_Holder object, somewhere in
the program, and most probably with its specific type, the Max_Size can be a
(generic) property of Bounded_Holder, and checked against the size of the
specific descendant type in a precondition of Replace_Element. Nice.

So then I would be content with a Bounded_Holder like that (with some strong
advice that its memory usage should be a static function of Max_Size), plus the
new aspect No_Controlled_Subcomponents, to eliminate dispatching finalization of
the Element_Type upon replacement.

Thanks for the discussion of this suggestion.

To Randy: should I write new ada-comment messages to suggest Bounded_Holder and
No_Controlled_Subcomponents explicitly, or is this discussion enough for you to
consider them suggested?

****************************************************************

From: Randy Brukardt
Sent: Friday, January 26, 2018  6:19 PM

...
> To Randy: should I write new ada-comment messages to suggest
> Bounded_Holder and No_Controlled_Subcomponents explicitly, or is this
> discussion enough for you to consider them suggested?

No need. The ARG mainly wants problem descriptions to which we can figure out
the answers. And you've provided an understandable problem description, to which
we'll work out the least disruptive and most functional solution.

In any case, the deadlines (both published and functional) have passed, so new
proposals probably will get much lower priority than the existing ones. (The
*real* deadline was the deadline for submissions for Monday's ARG meeting, which
was 6 hours ago.)

I was just writing up the AI for this thread, it will be AI12-0254-1 (should be
posted sometime today, Wisconsin time -- thus the next 6 hours).

****************************************************************

From: Niklas Holsti
Sent: Sunday, January 28, 2018  5:50 AM [privately]

Thanks for creating this AI from our ada-comment discussions.

I have a very minor comment on the formulation of the precondition regarding the
size of the elements. The AI gives this example:

procedure Replace_Element (Container : in out Holder;
                            New_Item  : in     Element_Type)
     with Pre => New_Item'Size/System.Storage_Unit <= Max_Size_in_Storage_Elements;

Is it certain that New_Item'Size is always a multiple of System.Storage_Unit at
this point?

If not, it seems to me that the precondition should be written as

    New_Item'Size <= System.Storage_Unit * Max_Size_in_Storage_Elements;

I also noticed that the "!ASIS" section of the AI ends with an interrupted
conditional sentence, perhaps from an editing glitch.

****************************************************************

From: Randy Brukardt
Sent: Sunday, January 28, 2018  9:28 PM [privately]

It's highly unlikely that the Size would not be an even number of units, but 
since inverting the condition doesn't change much (other than introducing an
overflow possibility), it probably is better written as you have it.

Can I put your message into the public record? (That is, filed in the AI?)

****************************************************************

From: Niklas Holsti
Sent: Monday, January 29, 2018 12:46 AM [privately]

> It's highly unlikely that the Size would not be an even number of 
> units, but since inverting the condition doesn't change much (other 
> than introducing an overflow possibility), it probably is better written as
> you have it.

Good, thanks.

> Can I put your message into the public record? (That is, filed in the 
> AI?)

Yes, of course. Apologies if this direct e-mail caused extra editing trouble
for you.

****************************************************************

From: Randy Brukardt
Sent: Thursday, February 1, 2018  12:21 AM

During the discussion of AI12-0254-1, Jean-Pierre complained that "Bounded"
has a different meaning for a Bounded_Indefinite_Container. The minutes say:

Jean-Pierre notes that "Bounded" is somewhat a misnomer. For most Bounded
containers the bound is on the number of elements; for the
Bounded_Indefinite_Container, the bound is on the size of an individual
element.

However, the introduction of the Bounded_Vector container says:

The language-defined generic package Containers.Bounded_Vectors provides a
private type Vector and a set of operations. It provides the same operations
as the package Containers.Vectors (see A.18.2), with the difference that the
maximum storage is bounded.

Ergo, "bounded" in the Standard sense means that the storage is bounded, and
clearly it means the same thing for the Bounded_Indefinite_Container. The
mechanism to accomplish that bounding is different, but the goal is the same.
(And if there was a Bounded_Indefinite_Vector, it would use both techniques
for bounding. Aside: A Bounded_Indefinite_Vector probably would be the fastest
implementation of a vector for Janus/Ada's code shared generics, presuming
"canonical" implementations of all of the containers -- it would slightly beat
Indefinite_Vector by having faster allocations than the regular heap -- and it
would clobber the other implementations, which have to use implicit heap
allocation for each component. End aside.)

****************************************************************

