!standard A.18.32(0)                                   18-03-06  AI12-0254-1/03
!class Amendment 18-01-26
!status Amendment 1-2012 18-03-05
!status WG9 Approved 18-06-22
!status ARG Approved 7-1-4  18-03-05
!status work item 18-01-26
!status received 18-01-15
!priority Low
!difficulty Easy
!subject Bounded_Indefinite_Holders

!summary

The Bounded_Indefinite_Holder container is added to allow the use of class-wide objects in safety-critical environments where dynamic allocation is not allowed.

!problem

In high-integrity, embedded applications, it is usually necessary to prove that the program will not run into stack overflow or heap exhaustion. While static-analysis tools exist for computing safe bounds on "primary" stack usage, that is not the case for heap usage, nor for the usage of the "secondary" stack that some Ada compilers use, in place of the heap, for local data of dynamic size.

On the other hand, it is desirable, and becoming accepted, to use tagged types and dynamic dispatching even in high-integrity applications. Static analysis tools for stack usage can usually deal with the dynamic dispatching (even if it introduces some overestimation), but a harder problem is that dynamic dispatching requires class-wide subtypes, which are currently considered indefinite subtypes by RM 3.3(23/3) and for which the compiler cannot know the size, at compile time, from local analysis of the program. The amount of memory consumed by an object of a class-wide subtype is therefore determined at run time, in practice by using the heap or the secondary stack (or some way of extending the local stack frame of the primary stack, which is equally difficult for static stack-usage analysis).

Moreover, the indefiniteness of class-wide types means that data structures (arrays, records, etc.) with components of class-wide type can be created only by using heap allocation and access-to-class-wide components. But in high-integrity applications, heap use is often completely forbidden.

The alternative, currently forced on programmers, is to replace the tagged type by a discriminated record type, with variants corresponding to sub-classes, and to implement the dispatch manually by case statements on the discriminant, which stands in for the tag. The record type is made definite by giving a default value to the discriminant. The practical drawbacks of this design are well known, and were one of the reasons for the introduction of tagged types and class-oriented programming.

A solution is needed that provides the ability to store class-wide objects without introducing dynamic memory allocation.

!proposal

(See Summary.)

!wording

Add subclause A.18.32:

The Generic Package Containers.Bounded_Indefinite_Holders

The language-defined generic package Containers.Bounded_Indefinite_Holders provides a private type Holder and a set of operations for that type. It provides the same operations as the package Containers.Indefinite_Holders (see A.18.18), with the difference that the maximum storage is bounded.
                        Static Semantics

The declaration of the generic library package Containers.Bounded_Indefinite_Holders has the same contents and semantics as Containers.Indefinite_Holders except:

* The following is added to the context clause:

    with System.Storage_Elements; use System.Storage_Elements;

* An additional generic parameter follows Element_Type:

    Max_Element_Size_in_Storage_Elements : Storage_Count;

  AARM Reason: This value is in Storage_Elements so that it can be used as a discriminant on a storage pool object in the implementation of the object; Ada doesn't allow discriminant dependent components to use formulas. This is a generic parameter as it is a property of the Element_Type; the largest possible object of Element_Type is unlikely to be different for different containers, so making it a discriminant (as Capacity is) provides no useful capability. End AARM Reason.

* Add to the precondition of To_Holder and Replace_Element:

    and then (if New_Item'Size >
       Max_Element_Size_in_Storage_Elements * System.Storage_Unit
          then raise Program_Error)

  AARM Reason: This ensures that an object that won't fit is not inserted into the container.

                        Bounded (Run-time) Errors

It is a bounded error to assign from a bounded holder object while tampering with elements of that object is prohibited. Either Program_Error is raised by the assignment, execution proceeds with the target object prohibiting tampering with elements, or execution proceeds normally.

                        Implementation Requirements

For each instance of Containers.Indefinite_Holders and each instance of Containers.Bounded_Indefinite_Holders, if the two instances meet the following conditions, then the output generated by the Holder'Output or Holder'Write subprograms of either instance shall be readable by the Holder'Input or Holder'Read of the other instance, respectively:

* the Element_Type parameters of the two instances are statically matching subtypes of the same type; and

* the output generated by Element_Type'Output or Element_Type'Write is readable by Element_Type'Input or Element_Type'Read, respectively (where Element_Type denotes the type of the two actual Element_Type parameters).

                        Implementation Advice

Bounded holder objects should be implemented without dynamic allocation and any finalization should be trivial unless Element_Type needs finalization.

AARM To Be Honest: Implementation of this container in Ada will probably require the use of a special storage pool. When we say "without dynamic allocation", we mean that this pool does not use heap memory and has a trivial finalization routine (that is, procedures Adjust and Finalize are null procedures). All storage pools are controlled, so we can't reasonably say that a bounded holder will not need finalization.

AARM Implementation Note: If the implementation supports discontiguous objects that require multiple calls to Allocate in a storage pool, the storage pool will need to support such allocations. The storage pool implementation can assume that all Allocate calls occur together, and similarly for Deallocate calls, thus simplifying the pool implementation so that allocation only occurs at a high-water mark location.

!discussion

This container can be used to solve the Problem by putting the class-wide objects into holders. Assuming that they fit, the holder can be used like any other fixed-size object.

Thoughts:

[1] The Bounded_Indefinite_Holder could be implemented in Ada using a storage pool that has a discriminant of the size of the data area required.
Then, allocation/deallocation of a single object would be supported. If all data types are allocated contiguously (common in compilers targeting applications that cannot allow heap use), this code would be very simple (simply directly returning an address from Allocate and setting a Boolean to indicate that allocation has happened) and could easily be analyzed if needed. (A sketch of such a pool appears at the end of this discussion.) Such a storage pool would require finalization, but the actual Finalize routine could be a null procedure. An Ada compiler targeting these sorts of applications could (and should) treat an object requiring trivial finalization as not controlled, and thus machine-code analysis would not see any finalization overhead. That is, it's not finalization that such applications should ban, but rather non-trivial finalization.

Also note that containers do not need to be written in portable Ada per se, so it certainly would be possible to use some implementation-defined mechanism to avoid the need for a storage pool.

[2] One could support other kinds of Bounded_Indefinite_xxx containers the same way. The storage pool would have the size of the data capacity times the element capacity. The data area would be treated as an array of data-capacity-sized elements. There also would need to be an array of element-capacity Booleans to flag whether each item is allocated. Allocation then would be simply finding an unallocated element and then returning the address of the start of that element. (This is essentially the same technique as used in the regular bounded containers, sans the storage pool.) No fragmentation could occur and the memory use is completely predictable.

We've decided not to do this at this time, although such containers would help give predictable finalization in environments where memory usage has to be bounded.

[3] Many users of Bounded_Indefinite_Holders would not want any runtime overhead to check that an object will fit into the container. A special language-defined mechanism is not needed to ensure this, because:

(1) Most objects put into a container will have a known specific type. That means that they also have a known, compile-time size. Therefore, the precondition check involves only compile-time-known values, and thus should be evaluated at compile time. No sane compiler will generate code for such a check unless it fails, and if it fails, a warning most likely would be given.

(2) In this environment, class-wide objects would be stored in Bounded_Indefinite_xxx containers (to give predictable memory usage). Such objects have a known upper bound on their size. A compiler could use that upper bound to evaluate the precondition. In typical cases (where all of the containers with the same element type have the same bound for maximum data size), this would result in the preconditions being known to pass (and thus not generating any code). Such an optimization would probably take some extra implementation work (it wouldn't fall out naturally like (1)), but it could be implemented for a compiler targeting this sort of application.

(3) This issue also could be handled with a technique recommended by AdaCore for such systems: use a tool like SPARK to prove that the checks cannot fail, then suppress them. Precondition checks on language-defined container units can be suppressed (using a suppress mechanism defined in AI12-0112-1), and thus not appear in the generated code. If the correctness has been verified some other way, this will not cause any safety issues.
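As an illustration only (not part of the proposed wording), an instantiation for a class-wide element type might look like the following sketch. The types Shape and Circle, the instance name Shape_Holders, and the bound of 128 storage elements are invented for this example; the bound is assumed to be large enough for every extension of Shape used in the program.

   with Ada.Containers.Bounded_Indefinite_Holders;

   procedure Usage_Sketch is

      type Shape is tagged record
         X, Y : Float := 0.0;
      end record;

      type Circle is new Shape with record
         Radius : Float := 1.0;
      end record;

      --  The size bound is a property of Shape'Class in this program,
      --  so it is given once, at the instantiation.
      package Shape_Holders is new Ada.Containers.Bounded_Indefinite_Holders
        (Element_Type                         => Shape'Class,
         Max_Element_Size_in_Storage_Elements => 128);

      --  Each Holder is a definite, fixed-size object, so arrays and
      --  record components of holders need no heap allocation.
      Slots : array (1 .. 10) of Shape_Holders.Holder;

   begin
      --  New_Item has a specific type here, so the size precondition
      --  involves only compile-time-known values.
      Slots (1) := Shape_Holders.To_Holder
                     (Circle'(X => 0.0, Y => 0.0, Radius => 2.0));
   end Usage_Sketch;

Similarly, the following is a minimal sketch of the single-object storage pool described in [1], assuming that objects are allocated contiguously and that the buffer's default alignment suffices for any element; a real implementation would also have to honor the Alignment parameter and the discontiguous-object case mentioned in the AARM Implementation Note above. The package and type names are invented for this sketch.

   with System;                  use System;
   with System.Storage_Pools;    use System.Storage_Pools;
   with System.Storage_Elements; use System.Storage_Elements;

   package Single_Object_Pools is

      --  A pool holding at most one allocated object of up to Size
      --  storage elements, kept in a statically sized buffer.
      type Single_Object_Pool (Size : Storage_Count) is
        new Root_Storage_Pool with record
         Data      : Storage_Array (1 .. Size);
         Allocated : Boolean := False;
      end record;

      overriding procedure Allocate
        (Pool                     : in out Single_Object_Pool;
         Storage_Address          : out Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      overriding procedure Deallocate
        (Pool                     : in out Single_Object_Pool;
         Storage_Address          : Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count);

      overriding function Storage_Size
        (Pool : Single_Object_Pool) return Storage_Count is (Pool.Size);

   end Single_Object_Pools;

   package body Single_Object_Pools is

      overriding procedure Allocate
        (Pool                     : in out Single_Object_Pool;
         Storage_Address          : out Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
      begin
         --  Refuse a second allocation or an oversized object; otherwise
         --  hand out the start of the fixed buffer and note that it is in
         --  use. Alignment is assumed to be satisfied by the buffer.
         if Pool.Allocated
           or else Size_In_Storage_Elements > Pool.Size
         then
            raise Storage_Error;
         end if;
         Pool.Allocated  := True;
         Storage_Address := Pool.Data'Address;
      end Allocate;

      overriding procedure Deallocate
        (Pool                     : in out Single_Object_Pool;
         Storage_Address          : Address;
         Size_In_Storage_Elements : Storage_Count;
         Alignment                : Storage_Count) is
      begin
         --  Deallocation just marks the single slot as free again.
         Pool.Allocated := False;
      end Deallocate;

   end Single_Object_Pools;

Such a pool needs finalization only because Root_Storage_Pool is controlled; its Finalize does nothing, which is what the Implementation Advice above means by "trivial" finalization.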
!corrigendum A.18.32(0)

@dinsc

The language-defined generic package Containers.Bounded_Indefinite_Holders provides a private type Holder and a set of operations for that type. It provides the same operations as the package Containers.Indefinite_Holders (see A.18.18), with the difference that the maximum storage is bounded.

@s8<@i<Static Semantics>>

The declaration of the generic library package Containers.Bounded_Indefinite_Holders has the same contents and semantics as Containers.Indefinite_Holders except:

@xbullet<The following is added to the context clause:>

@xcode<  @b<with> System.Storage_Elements; @b<use> System.Storage_Elements;>

@xbullet<An additional generic parameter follows Element_Type:>

@xcode<  Max_Element_Size_in_Storage_Elements : Storage_Count;>

@xbullet<Add to the precondition of To_Holder and Replace_Element:>

@xcode<   @b<and then> (@b<if> New_Item'Size @>
      Max_Element_Size_in_Storage_Elements * System.Storage_Unit
         @b<then raise> Program_Error)>

@s8<@i<Bounded (Run-time) Errors>>

It is a bounded error to assign from a bounded holder object while tampering with elements of that object is prohibited. Either Program_Error is raised by the assignment, execution proceeds with the target object prohibiting tampering with elements, or execution proceeds normally.

@s8<@i<Implementation Requirements>>

For each instance of Containers.Indefinite_Holders and each instance of Containers.Bounded_Indefinite_Holders, if the two instances meet the following conditions, then the output generated by the Holder'Output or Holder'Write subprograms of either instance shall be readable by the Holder'Input or Holder'Read of the other instance, respectively:

@xbullet<the Element_Type parameters of the two instances are statically matching subtypes of the same type; and>

@xbullet<the output generated by Element_Type'Output or Element_Type'Write is readable by Element_Type'Input or Element_Type'Read, respectively (where Element_Type denotes the type of the two actual Element_Type parameters).>

@s8<@i<Implementation Advice>>

Bounded holder objects should be implemented without dynamic allocation and any finalization should be trivial unless Element_Type needs finalization.

!ASIS

No changes needed for package Bounded_Indefinite_Holders.

!ACATS test

ACATS B- and C-Tests are needed to check that the new capabilities are supported.

!appendix

!topic Definite class-wide types with Maximum_Size
!reference Ada 2012 RM3.3(23/3)
!from Niklas Holsti 18-01-15
!keywords class definite indefinite Maximum_Size
!discussion

This suggested Ada extension aims to make it possible to use class-wide types and dispatching in applications that must have statically predictable memory consumption.

This suggestion arises from a discussion on comp.lang.ada, starting 2017-06-23, within a thread with the Subject "Ada Annoyances".

In high-integrity, embedded applications, it is usually necessary to prove that the program will not run into stack overflow or heap exhaustion. While static-analysis tools exist for computing safe bounds on "primary" stack usage, that is not the case for heap usage nor for the usage of the "secondary" stack that some Ada compilers use, in place of heap, for local data of dynamic size.

On the other hand, it is desirable, and becoming accepted, to use tagged types and dynamic dispatching even in high-integrity applications. Static analysis tools for stack usage can usually deal with the dynamic dispatching (even if it introduces some over-estimation), but a harder problem is that dynamic dispatching requires class-wide subtypes, which are currently considered indefinite subtypes by RM3.3(23/3) and for which the compiler cannot know the size, at compile time, from local analysis of the program. The amount of memory consumed by an object of a class-wide subtype is therefore defined at run-time, in practice by using heap or secondary stack (or some way of extending the local stack frame of the primary stack, equally difficult for static stack-usage analysis).

Moreover, the indefiniteness of class-wide types means that data structures (array, records, etc.)
with components of class-wide type can be created only by using heap allocation and access-to-class-wide components. But in high-integrity applications, heap use is often completely forbidden.

The alternative, currently forced on programmers, is to replace the tagged type by a discriminated record type, with variants corresponding to sub-classes, and to implement the dispatch manually by case statements on the discriminant, which stands in for the tag. The record type is made definite by giving a default value to the discriminant. The practical drawbacks of this design are well known, and were one of the reasons for the introduction of tagged types and class-oriented programming.

As an example from real life, the company for which I mostly work (Space Systems Finland Ltd) has a major SW component, used on several ESA satellites over some 15 years, that implements telecommand and telemetry packet handling by means of just such a discriminated record type representing a "packet handle". The variants correspond to the different memory areas that can contain such packets, with different allocation and management mechanisms. Different applications (satellites) have different collections of such areas, which means that the record type and the case statements must be manually modified for each application. Using a tagged type would eliminate most of that manual work.

The suggested Ada extension, below, proposes a form of class-wide types that is modelled on the behaviour of a discriminated but still definite record type.

Syntactically, the proposed extension takes the form of an aspect that can be specified for a tagged type and that defines the maximum size, in bits, of any object in the class-wide type rooted at this tagged type. If we call this aspect Maximum_Size (or perhaps Maximum_Size'Class), the programmer could use it like this:

   type Root is tagged record ... end record
      with Maximum_Size => 128;

   type Child is new Root with record ... end record;
   -- The compiler checks that Child'Size is at most 128 bits, and
   -- rejects the program otherwise.

It would now be legal to create statically-sized data structures using Root'Class, without dynamic memory allocation; the compiler would allocate 128 bits for each value of type Root'Class:

   type Object_List is array (List_Index) of Root'Class;

   type Object_Pair is record
      A, B : Root'Class;
   end record;

and so on.

Properties of such definite class-wide types would include:

- Each object of this type has a default initial tag value, taken from the root of the class (which must therefore be non-abstract to permit default initialisation of these objects).

- An object of this type can be assigned a new value, with a new tag from the same class. (In the comp.lang.ada discussion, Randy Brukardt called this point "way too drastic a change".)

Other comments from the comp.lang.ada discussion include:

- From Robert A. Duff: "Something like that was considered and rejected for Ada 9X. Part of the problem is that it seems so low level to be talking about sizes. It's not even portable."

  My reply is that this feature would be used only or mainly in embedded programs, which usually already contain unportable, low-level stuff: task stack sizes, record representation clauses, etc.

- From Robert A. Duff: "It would be better to have the compiler compute the maximum size needed. That would require the compiler to do a fairly global analysis, which is something Ada compilers are not set up to do."
  My reply is that I agree that, ideally, the compiler should compute the maximum size, but that a friendly compiler would (at least optionally) tell me that I am wasting space (when Maximum_Size is needlessly large), or that the given Maximum_Size is too small, and what would be the minimum valid value. By setting the Maximum_Size deliberately too small or too large, a single compilation of the program would produce a set of error or information messages from which a suitable Maximum_Size could be picked.

  Similarly, an ideal compiler would compute the stack-size needed for each task (and some compilers do that). In practice, however, the programmer guesses a stack size, and uses either static analysis, or high-water-mark measurements, to determine the real stack usage.

****************************************************************

From: Tucker Taft
Sent: Monday, January 15, 2018  10:03 PM

This seems worth considering. As far as portability, we already allow users to specify a Storage_Size for a task, so this seems no worse than that. I would probably recommend allow specification on a class-wide type of the "Max_Size_In_Storage_Elements" rather than "Max_Size" so that the size is given in Storage_Elements rather than bits, and to make the connection to other Storage_Size attributes.

Whether we want a user-specified Max_Size_In_Storage_Elements for a class-wide type to imply that the type is definite, or rather allow the "Holder" container (A.18.18) to query this attribute is another matter. I think if the Holder container could be made efficient on class-wide types, you could use them in real-time and/or embedded systems. Perhaps a "Bounded_Holder" container would be the right answer, or simply specify that the Indefinite_Holder take advantage of a Max_Size_In_Storage_Elements if it is sufficiently small (perhaps an additional parameter to Indefinite_Holder could specify that it avoid using the heap by using this attribute).

****************************************************************

From: Niklas Holsti
Sent: Wednesday, January 17, 2018  4:47 PM

> This seems worth considering.

Good!

> As far as portability, we already allow users to specify a Storage_Size for a task, so this seems no worse than that. I would probably recommend allow specification on a class-wide type of the "Max_Size_In_Storage_Elements" rather than "Max_Size" so that the size is given in Storage_Elements rather than bits, and to make the connection to other Storage_Size attributes.

I thought in terms of bits, because the 'Size of types is in bits, and the names of some predefined types (Interfaces.Unsigned_32, etc) embed the size in bits. But Storage_Elements are ok too.

> Whether we want a user-specified Max_Size_In_Storage_Elements for a class-wide type to imply that the type is definite, or rather allow the "Holder" container (A.18.18) to query this attribute is another matter. I think if the Holder container could be made efficient on class-wide types, you could use them in real-time and/or embedded systems. Perhaps a "Bounded_Holder" container would be the right answer, or simply specify that the Indefinite_Holder take advantage of a Max_Size_In_Storage_Elements if it is sufficiently small (perhaps an additional parameter to Indefinite_Holder could specify that it avoid using the heap by using this attribute).

Such a Bounded_Holder generic would be difficult to implement in pure Ada, would it not?
The only approach I have imagined for a pure-Ada implementation of Bounded_Holder would make each such object contain its own storage pool, able to hold a single object of the "held" type, up to the Max_Size_In_Storage_Elements. That would make Bounded_Holder require finalization, which I would like to avoid, unless the "held" type requires it.

I would not like to have a fixed or implementation-defined limit on the "sufficiently small" value of Max_Size_In_Storage_Elements; whatever limit value is chosen, I'm sure that it would turn out to be just a little too small for my next application :-)

While I'm not totally opposed to the Bounded_Holder idea, it seems to me that most of the definition of Indefinite_Holder exists to support an implementation that uses heap allocation (and naturally so). If the Bounded_Holder were defined in a similar way, it would introduce complications, such as controlled types, and the possibility of the Holder being empty.

****************************************************************

From: Randy Brukardt
Sent: Friday, January 19, 2018  1:50 AM

...

> Such a Bounded_Holder generic would be difficult to implement in pure Ada, would it not?

That's a nice-to-have, but it's definitely not a requirement for the Ada Language-Defined packages. There are plenty of such packages (Ada.Unchecked_Conversion, Ada.Unchecked_Deallocation, etc.) that can't be implemented in Ada in any circumstance, and most of the rest tend to be implemented in Ada + implementation-defined stuff.

> ... The only approach I have imagined for a pure-Ada implementation of Bounded_Holder would make each such object contain its own storage pool, able to hold a single object of the "held" type, up to the Max_Size_In_Storage_Elements. That would make Bounded_Holder require finalization, which I would like to avoid, unless the "held" type requires it.

Trying to avoid finalization is like cutting off your leg and complaining that you can't run. The finalization in this case would be trivial (probably an empty Finalize routine), so it shouldn't matter at all (and possibly some compilers could eliminate it altogether). Moreover, finalization does not cause any analysis issues in a static allocation environment -- it just means that some routines run automatically at scope exit and during assignment. Any semi-competent tool should be able to handle that.

I understand that there are good reasons to avoid heap allocation, but allowing dispatching and not allowing finalization seems like the priorities have gotten mixed. :-)

> I would not like to have a fixed or implementation-defined limit on the "sufficiently small" value of Max_Size_In_Storage_Elements; whatever limit value is chosen, I'm sure that it would turn out to be just a little too small for my next application :-)

I agree with this, but I must have missed something because I didn't see any such proposal (and it doesn't make much sense). The Bounded_Container would have to somehow get the Max_Size_In_Storage_Elements. I'd probably suggest that it be specified as a parameter to the generic, so that such a container could be used for any indefinite type (assuming the Max is large enough).

> While I'm not totally opposed to the Bounded_Holder idea, it seems to me that most of the definition of Indefinite_Holder exists to support an implementation that uses heap allocation (and naturally so).
> If the Bounded_Holder were defined in a similar way, it would introduce complications, such as controlled types, and the possibility of the Holder being empty.

That's true of ALL of the containers. Vectors is designed to be unbounded and use heap allocation. Yet we have Bounded_xxx versions that are supposed to avoid dynamic allocation. Bounded_Vectors do no heap allocation and use an explicitly specified capacity. I don't see any particular issue with having Bounded_Indefinite_xxx versions that do the same thing for indefinite types. The question is whether the implementation could be made "clean" enough to keep the safety critical people happy.

I much prefer keeping the complications (if there are any) to the containers, since implementing mutable record types is one of the hardest parts of Ada to get right (especially because of the possibility of discriminant-dependent controlled subcomponents). Allowing essentially the same thing for class-wide types would be a whole 'nother level of implementation pain.

Moreover, there are a lot of bizarre language rules to avoid problems with mutable record types; the tagged type rules were designed knowing that those cases couldn't happen because the tag of an object cannot be changed. Allowing that would require some additional restrictions on access types which would not be compatible (or safe). [Otherwise, an access type to T'Class could end up pointing to an object that isn't in T'Class.]

****************************************************************

From: Tucker Taft
Sent: Friday, January 19, 2018  8:01 AM

I agree with Randy's responses, in general. Adding a new kind of container with some implementation advice and/or requirements is hugely less disruptive than adding a new language feature. Furthermore, you really have no guarantee that the new language feature wouldn't end up using heap allocation "underneath," and so we would end up having new restrictions and/or implementation requirements to prevent that.

Note that we might also address some of the same needs by having a "safe" pointer capability (AI-0240), that manages storage automatically, and could be coupled with other things such as storage-sized access collections to achieve the effectively "static" allocation you would like.

****************************************************************

From: Niklas Holsti
Sent: Sunday, January 21, 2018  9:48 AM

> ...
>> ... The only approach I have imagined for a pure-Ada implementation of Bounded_Holder would make each such object contain its own storage pool, able to hold a single object of the "held" type, up to the Max_Size_In_Storage_Elements. That would make Bounded_Holder require finalization, which I would like to avoid, unless the "held" type requires it.
>
> Trying to avoid finalization is like cutting off your leg and complaining that you can't run. The finalization in this case would be trivial (probably an empty Finalize routine), so it shouldn't matter at all (and possibly some compilers could eliminate it altogether). Moreover, finalization does not cause any analysis issues in a static allocation environment -- it just means that some routines run automatically at scope exit and during assignment. Any semi-competent tool should be able to handle that.

I'm afraid that is not true for static analysis of resource usage, such as stack usage, because such analysis generally has to start from the machine code form of the program, not the source code form.
Some implementations of finalization could be quite difficult to analyse from the machine code. For example, assume that the objects to be finalized are put in a linked list, and the finalization routine traverses this list and invokes the dispatching Finalize for each list element. In the machine code, the list traversal involves (1) a loop that is not terminated by a counter, but by reaching a null link; (2) reading dynamically defined "next" pointers from list cells; (3) at each list cell, calling a routine that is dynamically defined (dispatching) by the contents (tag) of the cell. Each of these dynamic steps is hard for a static analysis that aims to find an upper bound on the stack usage of the Finalize routines that may be called.

(I know that GNAT now implements finalization differently, and not with lists as described above, but I have not yet studied the present method, so I don't know if it leads to simpler machine code.)

Explicit dispatching calls also lead to machine-level calls with dynamically defined target addresses, but here the analyst can aid and guide the analysis tool as to which subprograms may be called by a particular dispatching call. That is much harder for implicit dispatching calls, such as finalization.

>> I would not like to have a fixed or implementation-defined limit on the "sufficiently small" value of Max_Size_In_Storage_Elements; whatever limit value is chosen, I'm sure that it would turn out to be just a little too small for my next application :-)
>
> I agree with this, but I must have missed something because I didn't see any such proposal (and it doesn't make much sense).

Tuck said, in part, "... or simply specify that the Indefinite_Holder take advantage of a Max_Size_In_Storage_Elements if it is sufficiently small ...".

> The Bounded_Container would have to somehow get the Max_Size_In_Storage_Elements. I'd probably suggest that it be specified as a parameter to the generic, so that such a container could be used for any indefinite type (assuming the Max is large enough).

That sounds like a workable solution. But it still seems necessary to specify the Max_Size_In_Storage_Elements aspect for the class-wide root type, so that the compiler can check that the max size given in the instantiation of Bounded_Container is large enough for all members of the class.

> I much prefer keeping the complications (if there are any) to the containers, since implementing mutable record types is one of the hardest parts of Ada to get right (especially because of the possibility of discriminant-dependent controlled subcomponents).

Understood.

> Allowing essentially the same thing for class-wide types would be a whole 'nother level of implementation pain.

I don't really see why. All class-wide types are already defined as needing finalization which, as I understand it, is because any class-wide variable may or may not contain controlled subcomponents, depending on the actual tag. It seems to me that there must already be mechanisms for managing the possible (tag-dependent) existence of controlled subcomponents.

But I now realize that when writing this suggestion, I did not remember this issue (needing finalization) of class-wide types (perhaps because I have seldom used controlled types). Therefore, even if the suggestion leads to some way to bound the size of T'Class, the analysis could still be confused by the finalization, as I explained above.
To solve this, I want to extend the suggestion to include a Boolean aspect No_Controlled_Subcomponents which can be specified for an array type, a record type, a tagged type, or a class-wide type, and which would mean what it says, that this type cannot and will not have any controlled subcomponents. A class-wide type with this aspect would not need finalization, unless the root tagged type is itself controlled.

> Moreover, there are a lot of bizarre language rules to avoid problems with mutable record types; the tagged type rules were designed knowing that those cases couldn't happen because the tag of an object cannot be changed. Allowing that would require some additional restrictions on access types which would not be compatible (or safe). [Otherwise, an access type to T'Class could end up pointing to an object that isn't in T'Class.]

The rule that comes to mind, for solving this problem, is to prohibit the creation of access types to sub-classes of any T'Class that has the Max_Size_In_Storage_Units aspect. Access-to-constant types could still be allowed. This does not seem incompatible with Ada 2012.

****************************************************************

From: Niklas Holsti
Sent: Sunday, January 21, 2018  10:52 AM

>> Moreover, finalization does not cause any analysis issues in a static allocation environment -- it just means that some routines run automatically at scope exit and during assignment. Any semi-competent tool should be able to handle that.
>
> I'm afraid that is not true for static analysis of resource usage, such as stack usage, because such analysis generally has to start from the machine code form of the program, not the source code form.

[snip description of finalization analysis problems]

> Explicit dispatching calls also lead to machine-level calls with dynamically defined target addresses, but here the analyst can aid and guide the analysis tool as to which subprograms may be called by a particular dispatching call. That is much harder for implicit dispatching calls, such as finalization.

As an aside, and perhaps partly arguing against myself, it could be said that when a compiler is intended for high-integrity systems, it is the compiler's responsibility to support the analysis of the code the compiler generates. For example, in the case of finalization and other dispatching calls, the compiler might be expected to emit debugging information or other forms of information giving the machine-code locations of the dispatching calls and naming the type-classes involved. If the compiler also identifies the primitive operations of each tagged type, a tool that analyses the compiled code can compute an upper bound on the set of operations that can be called from any dispatching call, and proceed to analyse the machine code of those operations.

Some C++ compilers for embedded systems emit such information (for example, IAR Systems compilers), but unfortunately, AFAIK, the common executable and debugging data-format standards (ELF, Dwarf) do not specify how to represent information of this kind. The compilers from IAR Systems use the IAR proprietary format called UBROF.

Perhaps some requirements or suggestions on such compiler output could be added to appendix H, section H.3.1 (pragma Reviewable). I could write a specific proposal for such text, if the ARG might consider it. Paragraph 18.1 in that section already requires "control- and data-flow information", but that is quite vague.
This suggestion is a general one, to support analysis or review of high-integrity applications that use dispatching. It is not limited to the suggestion for definite class-wide types.

****************************************************************

From: Randy Brukardt
Sent: Monday, January 22, 2018  3:16 PM

> I'm afraid that is not true for static analysis of resource usage, such as stack usage, because such analysis generally has to start from the machine code form of the program, not the source code form.
>
> Some implementations of finalization could be quite difficult to analyse from the machine code. For example, assume that the objects to be finalized are put in a linked list, and the finalization routine traverses this list and invokes the dispatching Finalize for each list element. In the machine code, the list traversal involves (1) a loop that is not terminated by a counter, but by reaching a null link; (2) reading dynamically defined "next" pointers from list cells; (3) at each list cell, calling a routine that is dynamically defined (dispatching) by the contents (tag) of the cell. Each of these dynamic steps is hard for a static analysis that aims to find an upper bound on the stack usage of the Finalize routines that may be called.

Understood. But that assumes that the finalization is non-trivial, and that a linked list implementation is used. If it is trivial, it is possible that the compiler generate no finalization at all for an object with a null finalization routine (and I would hope that is the case for any compiler trying to support the kind of analysis you mention here). As such, the object would never be put on the list in the first place. Your "bounded indefinite" storage pool would likely have a null procedure finalization, as no special clean-up is needed for a statically allocated chunk of memory.

> (I know that GNAT now implements finalization differently, and not with lists as described above, but I have not yet studied the present method, so I don't know if it leads to simpler machine code.)

My understanding is that GNAT generates a direct (not dispatching) call to the Finalize routine for stand-alone objects and subcomponents thereof. That shouldn't be a problem for analysis, even of the machine code. You might get a list for allocated objects, but you are already trying to avoid those.

> Explicit dispatching calls also lead to machine-level calls with dynamically defined target addresses, but here the analyst can aid and guide the analysis tool as to which subprograms may be called by a particular dispatching call. That is much harder for implicit dispatching calls, such as finalization.

Not sure why; the type is known exactly in most cases (including the one in question). And GNAT probably isn't generating a dispatching call anyway (and probably isn't generating any call at all in this particular case).

...

> > The Bounded_Container would have to somehow get the Max_Size_In_Storage_Elements. I'd probably suggest that it be specified as a parameter to the generic, so that such a container could be used for any indefinite type (assuming the Max is large enough).
>
> That sounds like a workable solution. But it still seems necessary to specify the Max_Size_In_Storage_Elements aspect for the class-wide root type, so that the compiler can check that the max size given in the instantiation of Bounded_Container is large enough for all members of the class.
I would have expected it to be enough for the container to check for overflow. Any full-program analysis tool can easily prove that it can't fail.

> > Allowing essentially the same thing for class-wide types would be a whole 'nother level of implementation pain.
>
> I don't really see why. All class-wide types are already defined as needing finalization which, as I understand it, is because any class-wide variable may or may not contain controlled subcomponents, depending on the actual tag. It seems to me that there must already be mechanisms for managing the possible (tag-dependent) existence of controlled subcomponents.

In the normal Ada case (where the tag/bounds/discriminants can't change after allocation), one handles all of the finalization and storage allocation at the declaration of the object. For an implementation like Janus/Ada, that means putting controlled components on the current finalization chain as well as allocating memory from the current frame's storage pool. None of that can change later, so it's relatively easy. Janus/Ada treats each controlled part individually, so only directly controlled components need anything done. (We did that because the alternative of dealing with the entire composite object would mean having to deal with controlled -- thus tagged -- array types, which appeared to lead to madness [mine].)

When you have a mutable record, however, one has to be able to add/remove controlled components. With the Janus/Ada implementation, that has to be to a remote location in the finalization chain. Deleting is easy (the chain is doubly linked for that reason), but adding is difficult because it has to be done at a particular place on the chain. This never actually worked in Janus/Ada until last spring when I finally gave up on any efficient solution and placed a local finalization chain into each such object. (And that compiler has only been used by a handful of people, so there's no certainty that it is actually fixed.) (There also was a storage leak problem that has been fixed similarly by including an access to the storage pool to use into the object.)

If one allowed changing the tag of class-wide objects, you would have the same issues with them, and a similar, heavy-weight solution would have to be applied to them.

> But I now realize that when writing this suggestion, I did not remember this issue (needing finalization) of class-wide types (perhaps because I have seldom used controlled types). Therefore, even if the suggestion leads to some way to bound the size of T'Class, the analysis could still be confused by the finalization, as I explained above. To solve this, I want to extend the suggestion to include a Boolean aspect No_Controlled_Subcomponents which can be specified for an array type, a record type, a tagged type, or a class-wide type, and which would mean what it says, that this type cannot and will not have any controlled subcomponents. A class-wide type with this aspect would not need finalization, unless the root tagged type is itself controlled.

Seems reasonable.

> > Moreover, there are a lot of bizarre language rules to avoid problems with mutable record types; the tagged type rules were designed knowing that those cases couldn't happen because the tag of an object cannot be changed. Allowing that would require some additional restrictions on access types which would not be compatible (or safe).
> > [Otherwise, an access type to T'Class could end up pointing to an object that isn't in T'Class.]
>
> The rule that comes to mind, for solving this problem, is to prohibit the creation of access types to sub-classes of any T'Class that has the Max_Size_In_Storage_Units aspect. Access-to-constant types could still be allowed. This does not seem incompatible with Ada 2012.

That would work for the specific problem; not sure if there are any others (that would require some analysis of potential problems). It would seem to be a pretty draconian restriction, though, as anything other than access T'Class would have to be banned. That would seem to conflict with anonymous access dispatching, for instance.

****************************************************************

From: Niklas Holsti
Sent: Tuesday, January 23, 2018  4:33 AM

> I agree with Randy's responses, in general. Adding a new kind of container with some implementation advice and/or requirements is hugely less disruptive than adding a new language feature.

That route is ok for me, unless the container introduces some additional analysis problems, such as finalization (as was discussed in other messages).

> Furthermore, you really have no guarantee that the new language feature wouldn't end up using heap allocation "underneath," and so we would end up having new restrictions and/or implementation requirements to prevent that.

The restriction No_Implicit_Heap_Allocations should work, at least to warn the programmer if the compiler needs heap to implement the definite class-wide types. Some compilers could still use a secondary stack, of course. But this seems to be an issue of implementation quality, or implementation orientation, and does not apply only to the proposed definite class-wide types, but to almost anything (say, local array variables of dynamic length, which some compilers might allocate on the heap, or on a secondary stack).

> Note that we might also address some of the same needs by having a "safe" pointer capability (AI-0240), that manages storage automatically, and could be coupled with other things such as storage-sized access collections to achieve the effectively "static" allocation you would like.

If "safe" pointers still use allocations ("new"), they would not solve the problem, because a static analysis could find it quite difficult to bound the number of allocations executed at run-time, and to match up the allocations with the (implicit) deallocations, which it would have to do in order to discover the maximum number of allocated objects that could exist, at any time, during execution. There are analysis tools that attempt to find bounds on heap usage, but they require much deeper and more complex analysis (for example, "shape analysis" for linked data structures) than is needed for stack-usage analysis.

****************************************************************

From: Niklas Holsti
Sent: Tuesday, January 23, 2018  8:25 AM

> ...
>> Some implementations of finalization could be quite difficult to analyse from the machine code. ...
>
> Understood. But that assumes that the finalization is non-trivial, and that a linked list implementation is used. If it is trivial, it is possible that the compiler generate no finalization at all for an object with a null finalization routine (and I would hope that is the case for any compiler trying to support the kind of analysis you mention here).

That would solve the finalization-analysis problem in this case, yes.
> My understanding is that GNAT generates a direct (not dispatching) call to the Finalize routine for stand-alone objects and subcomponents thereof.

Ok, that should not be a problem then, if the call to Finalize the Bounded_Container is direct.

>> Explicit dispatching calls also lead to machine-level calls with dynamically defined target addresses, but here the analyst can aid and guide the analysis tool as to which subprograms may be called by a particular dispatching call. That is much harder for implicit dispatching calls, such as finalization.
>
> Not sure why; the type is known exactly in most cases (including the one in question).

The class-wide type is exactly known to the compiler, but not to the machine-code analyzer.

For explicit dispatching calls, the guidance from the analyst, to which I referred above, usually takes the form of textual "annotations", written by the analyst for use by the analysis tool, and saying that the dynamic call at a particular source code location (given by file name and line number, or similar) may actually call (dispatch to) a given set of subprograms specifically named in the annotation. The analyst identifies these subprograms from the source code, perhaps aided by a source-level analysis that identifies the primitive subprograms for each type included in the class. Something like

   dynamic call at "huhu.adb:626" calls "Foo" or "Foo2" or "Foo3";

For implicit dispatching calls, there is no similar source-code reference location. The analysis tool can tell the analyst that there is a dispatching call at machine code address A, but the analyst will find it difficult to map this call to the source code, and thus to understand what the call is doing, and how to guide the analysis of this call.

Machine-code analyzers usually strive to be source-language-agnostic and compiler-agnostic, so as not to restrict their user base by conditions on the user's programming tools. So one tries to separate the machine-code analysis from any source-code analysis that might support the machine-code one. And there is always the problem of mapping information from the source-code analysis to the machine level, such as identifying which implicit finalization accounts for the dispatching call at address A.

>>> The Bounded_Container would have to somehow get the Max_Size_In_Storage_Elements. I'd probably suggest that it be specified as a parameter to the generic, so that such a container could be used for any indefinite type (assuming the Max is large enough).
>>
>> That sounds like a workable solution. But it still seems necessary to specify the Max_Size_In_Storage_Elements aspect for the class-wide root type, so that the compiler can check that the max size given in the instantiation of Bounded_Container is large enough for all members of the class.
>
> I would have expected it to be enough for the container to check for overflow. Any full-program analysis tool can easily prove that it can't fail.

The other fellow's job always seems easier than one's own :-)

In this case, the machine-code analyzer would have to perform this proof at every point in the code where an object is stored into some instantiation of a Bounded_Container. To do so, the machine-code analyzer would have to be able to statically compare the actual Max_Size value of the present instance of Bounded_Container, to the (actual or maximum) size of the object that is being stored in the Bounded_Container.
In the machine-code, both values may be the result of dynamic computations. It may be possible; I find it hard to say, as it will depend on the precise form of the code generated for instantiations and for storing an object in the container.

(Aside from that, I would much prefer to have a possible overflow detected at compile time, at the instantiation.)

>>> Allowing essentially the same thing for class-wide types would be a whole 'nother level of implementation pain.

[snip description of Janus/Ada methods; thanks, it was illuminating]

> If one allowed changing the tag of class-wide objects, you would have the same issues with them, and a similar, heavy-weight solution would have to be applied to them.
>
>> But I now realize that when writing this suggestion, I did not remember this issue (needing finalization) of class-wide types (perhaps because I have seldom used controlled types). Therefore, even if the suggestion leads to some way to bound the size of T'Class, the analysis could still be confused by the finalization, as I explained above. To solve this, I want to extend the suggestion to include a Boolean aspect No_Controlled_Subcomponents which can be specified for an array type, a record type, a tagged type, or a class-wide type, and which would mean what it says, that this type cannot and will not have any controlled subcomponents. A class-wide type with this aspect would not need finalization, unless the root tagged type is itself controlled.
>
> Seems reasonable.

If it would simplify implementation, one could allow compilers to require (or assume) that No_Controlled_Subcomponents is specified together with Max_Size_In_Storage_Units.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, January 24, 2018  3:25 PM

...

> >> Explicit dispatching calls also lead to machine-level calls with dynamically defined target addresses, but here the analyst can aid and guide the analysis tool as to which subprograms may be called by a particular dispatching call. That is much harder for implicit dispatching calls, such as finalization.
> >
> > Not sure why; the type is known exactly in most cases (including the one in question).
>
> The class-wide type is exactly known to the compiler, but not to the machine-code analyzer.

I understand that, but it isn't relevant when the specific type is known. (Which is what I meant by "type is known exactly", I could have been clearer.)

As you know, Ada dispatching uses static binding for dispatching calls where the controlling type is statically determined. A statically bound dispatching call doesn't dispatch at all, it just directly calls the appropriate routine. So, for machine-code analysis, this just looks like a normal direct call. I'd be amazed if the analyzer cares about how that call got determined.

Obviously, a compiler could revert to dynamic dispatching code even in that case, but one would hope that if the compiler vendor is trying to support machine code analysis, they would fix the code generation to avoid that. (Yes, Janus/Ada would use a dynamic call in such a case, and that call is hidden in the runtime, but anyone trying to machine-code analysis on Janus/Ada output is nuts -- it wasn't designed for that at all.)

...

> >>> The Bounded_Container would have to somehow get the Max_Size_In_Storage_Elements.
> >>> I'd probably suggest that it be specified as a parameter to the generic, so that such a container could be used for any indefinite type (assuming the Max is large enough).
> >>
> >> That sounds like a workable solution. But it still seems necessary to specify the Max_Size_In_Storage_Elements aspect for the class-wide root type, so that the compiler can check that the max size given in the instantiation of Bounded_Container is large enough for all members of the class.
> >
> > I would have expected it to be enough for the container to check for overflow. Any full-program analysis tool can easily prove that it can't fail.
>
> The other fellow's job always seems easier than one's own :-)
>
> In this case, the machine-code analyzer would have to perform this proof at every point in the code where an object is stored into some instantiation of a Bounded_Container. To do so, the machine-code analyzer would have to be able to statically compare the actual Max_Size value of the present instance of Bounded_Container, to the (actual or maximum) size of the object that is being stored in the Bounded_Container. In the machine-code, both values may be the result of dynamic computations. It may be possible; I find it hard to say, as it will depend on the precise form of the code generated for instantiations and for storing an object in the container.

I wasn't expecting the machine-code analyzer to ever see this check. I was expecting an Ada-level tool to make the proof and the check to then be suppressed. (The only legitimate reason for Ada check suppression is to reduce the cost of machine-code analysis). Additionally, this check (like most new checks in Ada 2020) would be written as a precondition, so it is likely the compiler could eliminate it in most cases. Again, it wouldn't survive for the machine code analyzer to see it.

> (Aside from that, I would much prefer to have a possible overflow detected at compile time, at the instantiation.)

If an object did cause an overflow, you'd most likely get a warning about a precondition that always will fail. If one uses "warnings as errors" mode, you get the effect without any extra language machinery.

Specifically:

   procedure Replace_Element (Container : in out Holder;
                              New_Item : in Element_Type)
      with Pre => New_Item'Size <= Max_Size;

If New_Item is an object of a specific type (the usual case), the compiler would be able to evaluate the precondition at compile-time (since New_Item'Size is a compile-time constant). It would be a pretty lousy compiler that actually generated a check in that case. :-)

You'd probably have to suppress the check if the object is class-wide, but that would generally be safe if it had previously been stored in a Holder. (And Ada-level analysis tools should be able to prove that regardless of the source of the object.)

There will shortly be a proposal for suppressing precondition checks for language-defined units as part of AI12-0112-1. (I have to find time to write it, rather than commenting on everyone else's proposals. :-)

****************************************************************

From: Niklas Holsti
Sent: Friday, January 26, 2018  2:55 PM

> ...
>>>> That is much harder for implicit dispatching calls, such as finalization.
>>>
>>> Not sure why; the type is known exactly in most cases (including the one in question).

...

> So, for machine-code analysis, this just looks like a normal direct call.
Indeed, such calls are not a problem, not (usually) even for implicit calls.

>>> [Niklas Holsti writes:]
>>>> But it still seems necessary to specify the Max_Size_In_Storage_Elements aspect for the class-wide root type, so that the compiler can check that the max size given in the instantiation of Bounded_Container is large enough for all members of the class.
>>>
>>> I would have expected it to be enough for the container to check for overflow.

[snip]

> If an object did cause an overflow, you'd most likely get a warning about a precondition that always will fail. If one uses "warnings as errors" mode, you get the effect without any extra language machinery.
>
> Specifically:
>
>    procedure Replace_Element (Container : in out Holder;
>                               New_Item : in Element_Type)
>       with Pre => New_Item'Size <= Max_Size;
>
> If New_Item is an object of a specific type (the usual case), the compiler would be able to evaluate the precondition at compile-time (since New_Item'Size is a compile-time constant).

I get it now. I wanted to have the compiler check Max_Size when a descendant type is derived, but I agree with you that as an object of every such descendant type will be entered into an instance of the Bounded_Holder object, somewhere in the program, and most probably with its specific type, the Max_Size can be a (generic) property of Bounded_Holder, and checked against the size of the specific descendant type in a precondition of Replace_Element. Nice.

So then I would be content with a Bounded_Holder like that (with some strong advice that its memory usage should be a static function of Max_Size), plus the new aspect No_Controlled_Subcomponents, to eliminate dispatching finalization of the Element_Type upon replacement.

Thanks for the discussion of this suggestion.

To Randy: should I write new ada-comment messages to suggest Bounded_Holder and No_Controlled_Subcomponents explicitly, or is this discussion enough for you to consider them suggested?

****************************************************************

From: Randy Brukardt
Sent: Friday, January 26, 2018  6:19 PM

...

> To Randy: should I write new ada-comment messages to suggest Bounded_Holder and No_Controlled_Subcomponents explicitly, or is this discussion enough for you to consider them suggested?

No need. The ARG mainly wants problem descriptions to which we can figure out the answers. And you've provided an understandable problem description, to which we'll work out the least disruptive and most functional solution.

In any case, the deadlines (both published and functional) have passed, so new proposals probably will get much lower priority than the existing ones. (The *real* deadline was the deadline for submissions for Monday's ARG meeting, which was 6 hours ago.)

I was just writing up the AI for this thread, it will be AI12-0254-1 (should be posted sometime today, Wisconsin time -- thus the next 6 hours).

****************************************************************

From: Niklas Holsti
Sent: Sunday, January 28, 2018  5:50 AM

[privately]

Thanks for creating this AI from our ada-comment discussions.

I have a very minor comment on the formulation of the precondition regarding the size of the elements. The AI gives this example:

   procedure Replace_Element (Container : in out Holder;
                              New_Item : in Element_Type)
      with Pre => New_Item'Size/System.Storage_Unit <= Max_Size_in_Storage_Elements;

Is it certain that New_Item'Size is always a multiple of System.Storage_Unit at this point?
If not, it seems to me that the precondition should be written as

   New_Item'Size <= System.Storage_Unit * Max_Size_in_Storage_Elements;

I also noticed that the "!ASIS" section of the AI ends with an interrupted conditional sentence, perhaps from an editing glitch.

****************************************************************

From: Randy Brukardt
Sent: Sunday, January 28, 2018  9:28 PM

[privately]

It's highly unlikely that the Size would not be an even number of units, but since inverting the condition doesn't change much (other than introducing an overflow possibility), it probably is better written as you have it.

Can I put your message into the public record? (That is, filed in the AI?)

****************************************************************

From: Niklas Holsti
Sent: Monday, January 29, 2018  12:46 AM

[privately]

> It's highly unlikely that the Size would not be an even number of units, but since inverting the condition doesn't change much (other than introducing an overflow possibility), it probably is better written as you have it.

Good, thanks.

> Can I put your message into the public record? (That is, filed in the AI?)

Yes, of course. Apologies if this direct e-mail caused extra editing trouble for you.

****************************************************************

From: Randy Brukardt
Sent: Thursday, February 1, 2018  12:21 AM

During the discussion of AI12-0254-1, Jean-Pierre complained that "Bounded" has a different meaning for a Bounded_Indefinite_Container. The minutes say:

   Jean-Pierre notes that "Bounded" is somewhat a misnomer. For most Bounded containers the bound is on the number of elements; for the Bounded_Indefinite_Container, the bound is on the size of an individual element.

However, the introduction of the Bounded_Vector container says:

   The language-defined generic package Containers.Bounded_Vectors provides a private type Vector and a set of operations. It provides the same operations as the package Containers.Vectors (see A.18.2), with the difference that the maximum storage is bounded.

Ergo, "bounded" in the Standard sense means that the storage is bounded, and clearly it means the same thing for the Bounded_Indefinite_Container. The mechanism to accomplish that bounding is different, but the goal is the same.

(And if there was a Bounded_Indefinite_Vector, it would use both techniques for bounding. Aside: A Bounded_Indefinite_Vector probably would be the fastest implementation of a vector for Janus/Ada's code shared generics, presuming "canonical" implementations of all of the containers -- it would slightly beat Indefinite_Vector by having faster allocations than the regular heap -- and it would clobber the other implementations, which have to use implicit heap allocation for each component. End aside.)

****************************************************************