!standard A.18.32(0)                                18-01-26  AI12-0254-1/01
!class Amendment 18-01-26
!status work item 18-01-26
!status received 18-01-15
!priority Very_Low
!difficulty Easy
!subject Bounded_Indefinite_Holders

!summary

Add Bounded_Indefinite_Holder containers to allow the use of class-wide
objects in safety-critical environments where dynamic allocation and
finalization are not allowed.

!problem

In high-integrity, embedded applications, it is usually necessary to prove
that the program will not run into stack overflow or heap exhaustion. While
static-analysis tools exist for computing safe bounds on "primary" stack
usage, that is not the case for heap usage, nor for the usage of the
"secondary" stack that some Ada compilers use, in place of the heap, for
local data of dynamic size.

On the other hand, it is desirable, and becoming accepted, to use tagged
types and dynamic dispatching even in high-integrity applications. Static
analysis tools for stack usage can usually deal with dynamic dispatching
(even if it introduces some over-estimation), but a harder problem is that
dynamic dispatching requires class-wide subtypes, which are currently
considered indefinite subtypes by RM 3.3(23/3) and whose size the compiler
cannot know, at compile time, from local analysis of the program. The
amount of memory consumed by an object of a class-wide subtype is therefore
defined at run time, in practice by using the heap or the secondary stack
(or some way of extending the local stack frame of the primary stack,
equally difficult for static stack-usage analysis).

Moreover, the indefiniteness of class-wide types means that data structures
(arrays, records, etc.) with components of class-wide type can be created
only by using heap allocation and access-to-class-wide components. But in
high-integrity applications, heap use is often completely forbidden.
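To make the restriction concrete, here is a minimal sketch (the Shape types
are hypothetical, for illustration only): a component of a class-wide
subtype is illegal because the subtype is indefinite, so the programmer is
pushed toward access-to-class-wide components and therefore toward some
form of allocation:

```ada
package Problem_Sketch is

   type Shape is tagged record
      X, Y : Integer;
   end record;

   --  Illegal: a component cannot have the indefinite subtype Shape'Class.
   --  type Shape_List is array (1 .. 10) of Shape'Class;

   --  The direct alternative forces indirection, and hence allocation
   --  somewhere (heap, secondary stack, or a user-defined pool):
   type Shape_Ref  is access Shape'Class;
   type Shape_List is array (1 .. 10) of Shape_Ref;

end Problem_Sketch;
```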
The alternative, currently forced on programmers, is to replace the tagged
type by a discriminated record type, with variants corresponding to
sub-classes, and to implement the dispatch manually by case statements on
the discriminant, which stands in for the tag. The record type is made
definite by giving a default value to the discriminant. The practical
drawbacks of this design are well known, and were one of the reasons for
the introduction of tagged types and class-oriented programming.

A solution is needed that provides the ability to store class-wide objects
without introducing dynamic memory allocation.

!proposal

Introduce a package Bounded_Indefinite_Holders. This would have a similar
specification to that of Indefinite_Holders, except for:

* the addition of a generic formal parameter Max_Size_in_Storage_Elements
  [note: this could be a discriminant instead, but it seems that this value
  would usually be global to the entire program for a particular element
  type, so one would want it at the highest level possible. note 2: This
  value has to be in storage units, as Ada does not allow formulas on
  discriminants in discriminant-dependent components, and this value may be
  used in a discriminant of a storage pool to allocate an array of storage
  elements to store the object.]

* the addition of preconditions to all of the ways to insert objects:

     procedure Replace_Element (Container : in out Holder;
                                New_Item  : in     Element_Type)
        with Pre => New_Item'Size / System.Storage_Unit <=
                       Max_Size_in_Storage_Elements;

* the addition of Implementation Advice that all memory allocation be
  static and that any finalization needed be trivial. [note: We want to
  allow a storage pool (which needs finalization) to be used to implement
  this container, but we want any such finalization to be trivial so the
  compiler can implement it as if the type were not controlled.]

* the addition of a permission for the implementation to raise
  Program_Error if objects of the type are not stored contiguously.
(Such objects probably can only be allocated using a storage pool that
supports arbitrarily sized allocations, which would defeat the purpose of
this container for analyzability.) It would be implementation-defined which
objects are stored this way (so that in theory it would have to be
documented).

This package can be used to store class-wide objects so long as the object
is not larger than Max_Size_in_Storage_Elements (and it is contiguous).

---

Options:

(1) We could define Bounded_Indefinite_xxx versions of all of the
containers in a similar manner. This would allow objects of pretty much
any contiguously allocated type to be stored in a container. (A possible
implementation is found in the discussion section.)

(2) A No_Controlled_Subcomponents aspect that could be applied to any type.
The type and any extension of it are illegal if any component is
controlled. [What about inside of a generic body? This property is not
always known at compile time. - RLB] The purpose is to allow the compiler
to avoid generating finalization overhead for class-wide objects of a
tagged type with this aspect. (Otherwise, some future extension could add a
controlled component, so class-wide objects always have to be assumed to
need finalization.)

(3) A Maximum_Size aspect could be defined to apply to any tagged type.
Such a type would not allow any extension whose size would exceed the
Maximum_Size. This would allow checking all instances of
Bounded_Indefinite_Holders at compile time. (See the discussion for more on
this idea.)

!wording

** TBD.

!discussion

Thoughts:

[1] The Bounded_Indefinite_Holder could be implemented in Ada using a
storage pool that has a discriminant giving the size of the data area
required. Then allocation/deallocation of a single object would be
supported. This code would be very simple (simply returning an address from
Allocate and setting a Boolean to indicate that allocation has happened)
and could easily be analyzed if needed.
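A pool along these lines might be sketched as follows (a simplified
illustration, not a required implementation; a real version would also
honor the requested Alignment rather than relying on the buffer's default
alignment):

```ada
with System;
with System.Storage_Elements;  use System.Storage_Elements;
with System.Storage_Pools;

package Single_Object_Pools is

   --  A pool that holds at most one object in a statically sized buffer.
   type Single_Object_Pool (Size : Storage_Count) is
      new System.Storage_Pools.Root_Storage_Pool with record
         Allocated : Boolean := False;
         Data      : Storage_Array (1 .. Size);
      end record;

   overriding procedure Allocate
     (Pool                     : in out Single_Object_Pool;
      Storage_Address          :    out System.Address;
      Size_In_Storage_Elements : in     Storage_Count;
      Alignment                : in     Storage_Count);

   overriding procedure Deallocate
     (Pool                     : in out Single_Object_Pool;
      Storage_Address          : in     System.Address;
      Size_In_Storage_Elements : in     Storage_Count;
      Alignment                : in     Storage_Count);

   overriding function Storage_Size
     (Pool : Single_Object_Pool) return Storage_Count is (Pool.Size);

end Single_Object_Pools;

package body Single_Object_Pools is

   procedure Allocate
     (Pool                     : in out Single_Object_Pool;
      Storage_Address          :    out System.Address;
      Size_In_Storage_Elements : in     Storage_Count;
      Alignment                : in     Storage_Count) is
   begin
      --  At most one object, and it must fit in the static buffer.
      if Pool.Allocated
        or else Size_In_Storage_Elements > Pool.Size
      then
         raise Storage_Error;
      end if;
      Pool.Allocated  := True;
      Storage_Address := Pool.Data'Address;
   end Allocate;

   procedure Deallocate
     (Pool                     : in out Single_Object_Pool;
      Storage_Address          : in     System.Address;
      Size_In_Storage_Elements : in     Storage_Count;
      Alignment                : in     Storage_Count) is
   begin
      Pool.Allocated := False;
   end Deallocate;

end Single_Object_Pools;
```

The holder's internal access-to-Element_Type type could then name such a
pool object in a Storage_Pool clause, giving allocation behavior that is
trivially analyzable: one flag test and one address computation.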
Such a storage pool would require finalization, but the actual Finalize
routine could be a null procedure. An Ada compiler targeting these sorts of
applications could (and should) treat an object requiring trivial
finalization as not controlled, and thus machine-code analysis would not
see any finalization overhead. That is, it is not finalization that such
applications should ban, but rather non-trivial finalization.

Also note that the containers do not need to be written in portable Ada per
se, so it certainly would be possible to use some implementation-defined
mechanism to avoid the need for a storage pool.

[2] One could support other kinds of Bounded_Indefinite_xxx containers the
same way. The storage pool would have a size of the data capacity times the
element capacity. The data area would be treated as an array of
data-capacity-sized elements. There also would need to be an array of
element-capacity Booleans to flag whether each item is allocated.
Allocation then would simply find an unallocated element and return the
address of the start of that element. (This is essentially the same
technique as used in the regular bounded containers, sans the storage
pool.) No fragmentation could occur, and the memory use is completely
predictable.

[3] Having a Maximum_Size aspect would allow checking the instances for
problems at compile time (although that would require a somewhat different
size check than the one proposed above). However, in most circumstances,
that isn't really necessary.

(1) Most objects put into a container will have a known specific type. That
means that they also have a known, compile-time size. Therefore, the
precondition check involves only compile-time-known values, and thus should
be evaluated at compile time. No sane compiler will generate code for such
a check unless it fails, and if it fails, a warning most likely would be
given.
(2) In this environment, class-wide objects would be stored in
Bounded_Indefinite_xxx containers (to give predictable memory usage). Such
objects have a known upper bound on their size. A compiler could use that
upper bound to evaluate the precondition. In typical cases (where all of
the containers with the same element type have the same bound for maximum
data size), this would result in the preconditions being known to pass (and
thus not generating any code). Such an optimization would probably take
some extra implementation work (it wouldn't fall out naturally like (1)),
but it could be implemented for a compiler targeting this sort of
application.

(3) This issue also could be handled with a technique recommended by
AdaCore for such systems: use a tool like SPARK to prove that the checks
cannot fail, then suppress them. Precondition checks on language-defined
units can be suppressed (using a suppress mechanism defined in
AI12-0112-1), and thus not appear in the generated code. If the correctness
has been verified some other way, this will not cause any safety issues.

!ASIS

No changes needed for package Bounded_Indefinite_Holders. If any of the
optional new aspects are adopted, ASIS changes might be needed for them.

!ACATS test

ACATS B- and C-Tests are needed to check that the new capabilities are
supported.

!appendix

!topic Definite class-wide types with Maximum_Size
!reference Ada 2012 RM3.3(23/3)
!from Niklas Holsti 18-01-15
!keywords class definite indefinite Maximum_Size
!discussion

This suggested Ada extension aims to make it possible to use class-wide
types and dispatching in applications that must have statically predictable
memory consumption. This suggestion arises from a discussion on
comp.lang.ada, starting 2017-06-23, within a thread with the Subject "Ada
Annoyances".

In high-integrity, embedded applications, it is usually necessary to prove
that the program will not run into stack overflow or heap exhaustion.
While static-analysis tools exist for computing safe bounds on "primary" stack usage, that is not the case for heap usage nor for the usage of the "secondary" stack that some Ada compilers use, in place of heap, for local data of dynamic size. On the other hand, it is desirable, and becoming accepted, to use tagged types and dynamic dispatching even in high-integrity applications. Static analysis tools for stack usage can usually deal with the dynamic dispatching (even if it introduces some over-estimation), but a harder problem is that dynamic dispatching requires class-wide subtypes, which are currently considered indefinite subtypes by RM3.3(23/3) and for which the compiler cannot know the size, at compile time, from local analysis of the program. The amount of memory consumed by an object of a class-wide subtype is therefore defined at run-time, in practice by using heap or secondary stack (or some way of extending the local stack frame of the primary stack, equally difficult for static stack-usage analysis). Moreover, the indefiniteness of class-wide types means that data structures (array, records, etc.) with components of class-wide type can be created only by using heap allocation and access-to-class-wide components. But in high-integrity applications, heap use is often completely forbidden. The alternative, currently forced on programmers, is to replace the tagged type by a discriminated record type, with variants corresponding to sub-classes, and to implement the dispatch manually by case statements on the discriminant, which stands in for the tag. The record type is made definite by giving a default value to the discriminant. The practical drawbacks of this design are well known, and were one of the reasons for the introduction of tagged types and class-oriented programming. 
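The discriminated-record workaround described above can be sketched like
this (the Shape type and the Draw_* operations are hypothetical, for
illustration only):

```ada
type Shape_Kind is (Point_Kind, Circle_Kind);

--  Definite because the discriminant has a default value; the
--  discriminant stands in for the tag.
type Shape (Kind : Shape_Kind := Point_Kind) is record
   X, Y : Integer;
   case Kind is
      when Point_Kind =>
         null;
      when Circle_Kind =>
         Radius : Natural;
   end case;
end record;

procedure Draw_Point  (S : in Shape);
procedure Draw_Circle (S : in Shape);

--  Manual "dispatch": a case statement on the stand-in tag, which must
--  be edited by hand whenever a new kind is added.
procedure Draw (S : in Shape) is
begin
   case S.Kind is
      when Point_Kind  => Draw_Point (S);
      when Circle_Kind => Draw_Circle (S);
   end case;
end Draw;
```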
As an example from real life, the company for which I mostly work (Space
Systems Finland Ltd) has a major SW component, used on several ESA
satellites over some 15 years, that implements telecommand and telemetry
packet handling by means of just such a discriminated record type
representing a "packet handle". The variants correspond to the different
memory areas that can contain such packets, with different allocation and
management mechanisms. Different applications (satellites) have different
collections of such areas, which means that the record type and the case
statements must be manually modified for each application. Using a tagged
type would eliminate most of that manual work.

The suggested Ada extension, below, proposes a form of class-wide types
that is modelled on the behaviour of a discriminated but still definite
record type. Syntactically, the proposed extension takes the form of an
aspect that can be specified for a tagged type and that defines the maximum
size, in bits, of any object in the class-wide type rooted at this tagged
type. If we call this aspect Maximum_Size (or perhaps Maximum_Size'Class),
the programmer could use it like this:

   type Root is tagged record ... end record
      with Maximum_Size => 128;

   type Child is new Root with record ... end record;
   -- The compiler checks that Child'Size is at most 128 bits, and
   -- rejects the program otherwise.

It would now be legal to create statically-sized data structures using
Root'Class, without dynamic memory allocation; the compiler would allocate
128 bits for each value of type Root'Class:

   type Object_List is array (List_Index) of Root'Class;

   type Object_Pair is record
      A, B : Root'Class;
   end record;

and so on. Properties of such definite class-wide types would include:

- Each object of this type has a default initial tag value, taken from the
  root of the class (which must therefore be non-abstract to permit default
  initialisation of these objects).
- An object of this type can be assigned a new value, with a new tag from
  the same class. (In the comp.lang.ada discussion, Randy Brukardt called
  this point "way too drastic a change".)

Other comments from the comp.lang.ada discussion include:

- From Robert A. Duff: "Something like that was considered and rejected for
  Ada 9X. Part of the problem is that it seems so low level to be talking
  about sizes. It's not even portable."

  My reply is that this feature would be used only or mainly in embedded
  programs, which usually already contain unportable, low-level stuff: task
  stack sizes, record representation clauses, etc.

- From Robert A. Duff: "It would be better to have the compiler compute the
  maximum size needed. That would require the compiler to do a fairly
  global analysis, which is something Ada compilers are not set up to do."

  My reply is that I agree that, ideally, the compiler should compute the
  maximum size, but that a friendly compiler would (at least optionally)
  tell me that I am wasting space (when Maximum_Size is needlessly large),
  or that the given Maximum_Size is too small, and what would be the
  minimum valid value. By setting the Maximum_Size deliberately too small
  or too large, a single compilation of the program would produce a set of
  error or information messages from which a suitable Maximum_Size could be
  picked. Similarly, an ideal compiler would compute the stack size needed
  for each task (and some compilers do that). In practice, however, the
  programmer guesses a stack size, and uses either static analysis or
  high-water-mark measurements to determine the real stack usage.

****************************************************************

From: Tucker Taft
Sent: Monday, January 15, 2018  10:03 PM

This seems worth considering. As far as portability, we already allow users
to specify a Storage_Size for a task, so this seems no worse than that.
I would probably recommend allow specification on a class-wide type of the "Max_Size_In_Storage_Elements" rather than "Max_Size" so that the size is given in Storage_Elements rather than bits, and to make the connection to other Storage_Size attributes. Whether we want a user-specified Max_Size_In_Storage_Elements for a class-wide type to imply that the type is definite, or rather allow the "Holder" container (A.18.18) to query this attribute is another matter. I think if the Holder container could be made efficient on class-wide types, you could use them in real-time and/or embedded systems. Perhaps a "Bounded_Holder" container would be the right answer, or simply specify that the Indefinite_Holder take advantage of a Max_Size_In_Storage_Elements if it is sufficiently small (perhaps an additional parameter to Indefinite_Holder could specify that it avoid using the heap by using this attribute). **************************************************************** From: Niklas Holsti Sent: Wednesday, January 17, 2018 4:47 PM > This seems worth considering. Good! > As far as portability, we already > allow users to specify a Storage_Size for a task, so this seems no > worse than that. I would probably recommend allow specification on a > class-wide type of the "Max_Size_In_Storage_Elements" rather than > "Max_Size" so that the size is given in Storage_Elements rather than > bits, and to make the connection to other Storage_Size attributes. I thought in terms of bits, because the 'Size of types is in bits, and the names of some predefined types (Interfaces.Unsigned_32, etc) embed the size in bits. But Storage_Elements are ok too. > Whether we want a user-specified Max_Size_In_Storage_Elements for a > class-wide type to imply that the type is definite, or rather allow > the "Holder" container (A.18.18) to query this attribute is another > matter. 
I think if the Holder container could be made efficient on > class-wide types, you could use them in real-time and/or embedded > systems. Perhaps a "Bounded_Holder" container would be the right > answer, or simply specify that the Indefinite_Holder take advantage of > a Max_Size_In_Storage_Elements if it is sufficiently small (perhaps an > additional parameter to Indefinite_Holder could specify that it avoid > using the heap by using this attribute). Such a Bounded_Holder generic would be difficult to implement in pure Ada, would it not? The only approach I have imagined for a pure-Ada implementation of Bounded_Holder would make each such object contain its own storage pool, able to hold a single object of the "held" type, up to the Max_Size_In_Storage_Elements. That would make Bounded_Holder require finalization, which I would like to avoid, unless the "held" type requires it. I would not like to have a fixed or implementation-defined limit on the "sufficiently small" value of Max_Size_In_Storage_Elements; whatever limit value is chosen, I'm sure that it would turn out to be just a little too small for my next application :-) While I'm not totally opposed to the Bounded_Holder idea, it seems to me that most of the definition of Indefinite_Holder exists to support an implementation that uses heap allocation (and naturally so). If the Bounded_Holder were defined in a similar way, it would introduce complications, such as controlled types, and the possibility of the Holder being empty. **************************************************************** From: Randy Brukardt Sent: Friday, January 19, 2018 1:50 AM ... > Such a Bounded_Holder generic would be difficult to implement in pure > Ada, would it not? That's a nice-to-have, but it's definitely not a requirement for the Ada Language-Defined packages. There are plenty of such packages (Ada.Unchecked_Conversion, Ada.Unchecked_Deallocation, etc.) 
that can't be implemented in Ada in any circumstance, and most of the rest tend to be implemented in Ada + implementation-defined stuff. > ... The only approach I have imagined for a pure-Ada implementation > of Bounded_Holder would make each such object contain its own storage > pool, able to hold a single object of the "held" type, up to the > Max_Size_In_Storage_Elements. That would make Bounded_Holder require > finalization, which I would like to avoid, unless the "held" type > requires it. Trying to avoid finalization is like cutting off your leg and complaining that you can't run. The finalization in this case would be trivial (probably an empty Finalize routine), so it shouldn't matter at all (and possibly some compilers could eliminate it altogether). Moreover, finalization does not cause any analysis issues in a static allocation environment -- it just means that some routines run automatically at scope exit and during assignment. Any semi-competent tool should be able to handle that. I understand that there are good reasons to avoid heap allocation, but allowing dispatching and not allowing finalization seems like the priorities have gotten mixed. :-) > I would not like to have a fixed or implementation-defined limit on > the "sufficiently small" value of Max_Size_In_Storage_Elements; > whatever limit value is chosen, I'm sure that it would turn out to be > just a little too small for my next application :-) I agree with this, but I must have missed something because I didn't see any such proposal (and it doesn't make much sense). The Bounded_Container would have to somehow get the Max_Size_In_Storage_Elements. I'd probably suggest that it be specified as a parameter to the generic, so that such a container could be used for any indefinite type (assuming the Max is large enough). 
> While I'm not totally opposed to the Bounded_Holder idea, it seems to > me that most of the definition of Indefinite_Holder exists to support > an implementation that uses heap allocation (and naturally so). If the > Bounded_Holder were defined in a similar way, it would introduce > complications, such as controlled types, and the possibility of the > Holder being empty. That's true of ALL of the containers. Vectors is designed to be unbounded and use heap allocation. Yet we have Bounded_xxx versions that are supposed to avoid dynamic allocation. Bounded_Vectors do no heap allocation and use an explicitly specified capacity. I don't see any particular issue with having Bounded_Indefinite_xxx versions that do the same thing for indefinite types. The question is whether the implementation could be made "clean" enough to keep the safety critical people happy. I much prefer keeping the complications (if there are any) to the containers, since implementing mutable record types is one of the hardest parts of Ada to get right (especially because of the possibility of discriminant-dependent controlled subcomponents). Allowing essentially the same thing for class-wide types would be a whole 'nother level of implementation pain. Moreover, there are a lot of bizarre language rules to avoid problems with mutable record types; the tagged type rules were designed knowing that those cases couldn't happen because the tag of an object cannot be changed. Allowing that would require some additional restrictions on access types which would not be compatible (or safe). [Otherwise, an access type to T'Class could end up pointing to an object that isn't in T'Class.] **************************************************************** From: Tucker Taft Sent: Friday, January 19, 2018 8:01 AM I agree with Randy's responses, in general. Adding a new kind of container with some implementation advice and/or requirements is hugely less disruptive than adding a new language feature. 
Furthermore, you really have no guarantee that the new language feature wouldn't end up using heap allocation "underneath," and so we would end up having new restrictions and/or implementation requirements to prevent that. Note that we might also address some of the same needs by having a "safe" pointer capability (AI-0240), that manages storage automatically, and could be coupled with other things such as storage-sized access collections to achieve the effectively "static" allocation you would like. **************************************************************** From: Niklas Holsti Sent: Sunday, January 21, 2018 9:48 AM > ... >> ... The only approach I have imagined for a pure-Ada implementation >> of Bounded_Holder would make each such object contain its own storage >> pool, able to hold a single object of the "held" type, up to the >> Max_Size_In_Storage_Elements. That would make Bounded_Holder require >> finalization, which I would like to avoid, unless the "held" type >> requires it. > > Trying to avoid finalization is like cutting off your leg and > complaining that you can't run. The finalization in this case would be > trivial (probably an empty Finalize routine), so it shouldn't matter > at all (and possibly some compilers could eliminate it altogether). > Moreover, finalization does not cause any analysis issues in a static > allocation environment -- it just means that some routines run > automatically at scope exit and during assignment. Any semi-competent tool > should be able to handle that. I'm afraid that is not true for static analysis of resource usage, such as stack usage, because such analysis generally has to start from the machine code form of the program, not the source code form. Some implementations of finalization could be quite difficult to analyse from the machine code. 
For example, assume that the objects to be finalized are put in a linked
list, and the finalization routine traverses this list and invokes the
dispatching Finalize for each list element. In the machine code, the list
traversal involves (1) a loop that is not terminated by a counter, but by
reaching a null link; (2) reading dynamically defined "next" pointers from
list cells; (3) at each list cell, calling a routine that is dynamically
defined (dispatching) by the contents (tag) of the cell. Each of these
dynamic steps is hard for a static analysis that aims to find an upper
bound on the stack usage of the Finalize routines that may be called.

(I know that GNAT now implements finalization differently, and not with
lists as described above, but I have not yet studied the present method, so
I don't know if it leads to simpler machine code.)

Explicit dispatching calls also lead to machine-level calls with
dynamically defined target addresses, but here the analyst can aid and
guide the analysis tool as to which subprograms may be called by a
particular dispatching call. That is much harder for implicit dispatching
calls, such as finalization.

>> I would not like to have a fixed or implementation-defined limit on
>> the "sufficiently small" value of Max_Size_In_Storage_Elements;
>> whatever limit value is chosen, I'm sure that it would turn out to be
>> just a little too small for my next application :-)
>
> I agree with this, but I must have missed something because I didn't
> see any such proposal (and it doesn't make much sense).

Tuck said, in part, "... or simply specify that the Indefinite_Holder take
advantage of a Max_Size_In_Storage_Elements if it is sufficiently small
...".

> The Bounded_Container would
> have to somehow get the Max_Size_In_Storage_Elements. I'd probably
> suggest that it be specified as a parameter to the generic, so that
> such a container could be used for any indefinite type (assuming the Max is
> large enough).
That sounds like a workable solution. But it still seems necessary to specify the Max_Size_In_Storage_Elements aspect for the class-wide root type, so that the compiler can check that the max size given in the instantiation of Bounded_Container is large enough for all members of the class. > I much prefer keeping the complications (if there are any) to the > containers, since implementing mutable record types is one of the > hardest parts of Ada to get right (especially because of the > possibility of discriminant-dependent controlled subcomponents). Understood. > Allowing essentially the same thing for class-wide types would be a > whole 'nother level of implementation pain. I don't really see why. All class-wide types are already defined as needing finalization which, as I understand it, is because any class-wide variable may or may not contain controlled subcomponents, depending on the actual tag. It seems to me that there must already be mechanisms for managing the possible (tag-dependent) existence of controlled subcomponents. But I now realize that when writing this suggestion, I did not remember this issue (needing finalization) of class-wide types (perhaps because I have seldom used controlled types). Therefore, even if the suggestion leads to some way to bound the size of T'Class, the analysis could still be confused by the finalization, as I explained above. To solve this, I want to extend the suggestion to include a Boolean aspect No_Controlled_Subcomponents which can be specified for an array type, a record type, a tagged type, or a class-wide type, and which would mean what it says, that this type cannot and will not have any controlled subcomponents. A class-wide type with this aspect would not need finalization, unless the root tagged type is itself controlled. 
> Moreover, there are a lot of bizarre language rules to avoid problems
> with mutable record types; the tagged type rules were designed knowing
> that those cases couldn't happen because the tag of an object cannot
> be changed. Allowing that would require some additional restrictions
> on access types which would not be compatible (or safe).
> [Otherwise, an access type to T'Class could end up pointing to an
> object that isn't in T'Class.]

The rule that comes to mind, for solving this problem, is to prohibit the
creation of access types to sub-classes of any T'Class that has the
Max_Size_In_Storage_Elements aspect. Access-to-constant types could still
be allowed. This does not seem incompatible with Ada 2012.

****************************************************************

From: Niklas Holsti
Sent: Sunday, January 21, 2018  10:52 AM

>> Moreover, finalization does not
>> cause any analysis issues in a static allocation environment -- it
>> just means that some routines run automatically at scope exit and
>> during assignment. Any semi-competent tool should be able to handle
>> that.
>
> I'm afraid that is not true for static analysis of resource usage,
> such as stack usage, because such analysis generally has to start from
> the machine code form of the program, not the source code form.

[snip description of finalization analysis problems]

> Explicit dispatching calls also lead to machine-level calls with
> dynamically defined target addresses, but here the analyst can aid and
> guide the analysis tool as to which subprograms may be called by a
> particular dispatching call. That is much harder for implicit
> dispatching calls, such as finalization.

As an aside, and perhaps partly arguing against myself, it could be said
that when a compiler is intended for high-integrity systems, it is the
compiler's responsibility to support the analysis of the code the compiler
generates.
For example, in the case of finalization and other dispatching calls, the compiler might be expected to emit debugging information or other forms of information giving the machine-code locations of the dispatching calls and naming the type-classes involved. If the compiler also identifies the primitive operations of each tagged type, a tool that analyses the compiled code can compute an upper bound on the set of operations that can be called from any dispatching call, and proceed to analyse the machine code of those operations. Some C++ compilers for embedded systems emit such information (for example, IAR Systems compilers), but unfortunately, AFAIK, the common executable and debugging data-format standards (ELF, Dwarf) do not specify how to represent information of this kind. The compilers from IAR Systems use the IAR proprietary format called UBROF. Perhaps some requirements or suggestions on such compiler output could be added to appendix H, section H.3.1 (pragma Reviewable). I could write a specific proposal for such text, if the ARG might consider it. Paragraph 18.1 in that section already requires "control- and data-flow information", but that is quite vague. This suggestion is a general one, to support analysis or review of high-integrity applications that use dispatching. It is not limited to the suggestion for definite class-wide types. **************************************************************** From: Niklas Holsti Sent: Monday, January 22, 2018 3:16 PM > I'm afraid that is not true for static analysis of resource usage, > such as stack usage, because such analysis generally has to start from > the machine code form of the program, not the source code form. > > Some implementations of finalization could be quite difficult to > analyse from the machine code. For example, assume that the objects to > be finalized are put in a linked list, and the finalization routine > traverses this list and invokes the dispatching Finalize for each list > element. 
> In the machine code, the list traversal involves (1) a loop
> that is not terminated by a counter, but by reaching a null link; (2)
> reading dynamically defined "next" pointers from list cells;
> (3) at each list cell, calling a routine that is dynamically defined
> (dispatching) by the contents (tag) of the cell. Each of these dynamic
> steps is hard for a static analysis that aims to find an upper bound
> on the stack usage of the Finalize routines that may be called.

Understood. But that assumes that the finalization is non-trivial, and that
a linked list implementation is used. If it is trivial, it is possible that
the compiler generates no finalization at all for an object with a null
finalization routine (and I would hope that is the case for any compiler
trying to support the kind of analysis you mention here). As such, the
object would never be put on the list in the first place. Your "bounded
indefinite" storage pool would likely have a null procedure finalization,
as no special clean-up is needed for a statically allocated chunk of
memory.

> (I know that GNAT now implements finalization differently, and not
> with lists as described above, but I have not yet studied the present
> method, so I don't know if it leads to simpler machine code.)

My understanding is that GNAT generates a direct (not dispatching) call to
the Finalize routine for stand-alone objects and subcomponents thereof.
That shouldn't be a problem for analysis, even of the machine code. You
might get a list for allocated objects, but you are already trying to avoid
those.

> Explicit dispatching calls also lead to machine-level calls with
> dynamically defined target addresses, but here the analyst can aid and
> guide the analysis tool as to which subprograms may be called by a
> particular dispatching call.
> That is much harder for implicit dispatching calls, such as
> finalization.

Not sure why; the type is known exactly in most cases (including the one in
question).
And GNAT probably isn't generating a dispatching call anyway (and probably
isn't generating any call at all in this particular case).

...

> > The Bounded_Container would
> > have to somehow get the Max_Size_In_Storage_Elements. I'd probably
> > suggest that it be specified as a parameter to the generic, so that
> > such a container could be used for any indefinite type (assuming the
> > Max is large enough).
>
> That sounds like a workable solution. But it still seems necessary to
> specify the Max_Size_In_Storage_Elements aspect for the class-wide
> root type, so that the compiler can check that the max size given in
> the instantiation of Bounded_Container is large enough for all members
> of the class.

I would have expected it to be enough for the container to check for
overflow. Any full-program analysis tool can easily prove that it can't
fail.

> > Allowing essentially the same thing for class-wide types would be a
> > whole 'nother level of implementation pain.
>
> I don't really see why. All class-wide types are already defined as
> needing finalization which, as I understand it, is because any
> class-wide variable may or may not contain controlled subcomponents,
> depending on the actual tag. It seems to me that there must already be
> mechanisms for managing the possible (tag-dependent) existence of
> controlled subcomponents.

In the normal Ada case (where the tag/bounds/discriminants can't change
after allocation), one handles all of the finalization and storage
allocation at the declaration of the object. For an implementation like
Janus/Ada, that means putting controlled components on the current
finalization chain as well as allocating memory from the current frame's
storage pool. None of that can change later, so it's relatively easy.
Janus/Ada treats each controlled part individually, so only directly
controlled components need anything done.
(We did that because the alternative of dealing with the entire composite
object would mean having to deal with controlled -- thus tagged -- array
types, which appeared to lead to madness [mine].)

When you have a mutable record, however, one has to be able to add/remove
controlled components. With the Janus/Ada implementation, that has to be
done at a remote location in the finalization chain. Deleting is easy (the
chain is doubly linked for that reason), but adding is difficult because it
has to be done at a particular place on the chain. This never actually
worked in Janus/Ada until last spring, when I finally gave up on any
efficient solution and placed a local finalization chain into each such
object. (And that compiler has only been used by a handful of people, so
there's no certainty that it is actually fixed.) (There also was a storage
leak problem that has been fixed similarly, by including an access to the
storage pool to use into the object.)

If one allowed changing the tag of class-wide objects, you would have the
same issues with them, and a similar, heavy-weight solution would have to
be applied to them.

> But I now realize that when writing this suggestion, I did not
> remember this issue (needing finalization) of class-wide types
> (perhaps because I have seldom used controlled types).
> Therefore, even if the suggestion leads to some way to bound the size
> of T'Class, the analysis could still be confused by the finalization,
> as I explained above. To solve this, I want to extend the suggestion
> to include a Boolean aspect No_Controlled_Subcomponents which can be
> specified for an array type, a record type, a tagged type, or a
> class-wide type, and which would mean what it says, that this type
> cannot and will not have any controlled subcomponents. A class-wide
> type with this aspect would not need finalization, unless the root
> tagged type is itself controlled.

Seems reasonable.
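For illustration, a type using the proposed aspect might be declared as in
the sketch below. Note that No_Controlled_Subcomponents is purely
hypothetical here -- it is the suggestion under discussion, not existing
Ada -- and the type and component names are invented for the example:

```ada
--  Hypothetical sketch only: No_Controlled_Subcomponents is a
--  proposed aspect, not part of Ada 2012.
type Message is tagged record
   Kind : Natural;
   Text : String (1 .. 80);
end record
   with No_Controlled_Subcomponents;  --  proposed aspect

--  The aspect would promise that no descendant of Message declares
--  a component of a controlled type, so Message'Class would not
--  need finalization (the root type itself not being controlled),
--  and no implicit dispatching Finalize calls would appear in the
--  generated machine code.
```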
> > Moreover, there are a lot of bizarre language rules to avoid
> > problems with mutable record types; the tagged type rules were
> > designed knowing that those cases couldn't happen because the tag of
> > an object cannot be changed. Allowing that would require some
> > additional restrictions on access types which would not be
> > compatible (or safe). [Otherwise, an access type to T'Class could
> > end up pointing to an object that isn't in T'Class.]
>
> The rule that comes to mind, for solving this problem, is to prohibit
> the creation of access types to sub-classes of any T'Class that has
> the Max_Size_In_Storage_Units aspect.
> Access-to-constant types could still be allowed. This does not seem
> incompatible with Ada 2012.

That would work for the specific problem; not sure if there are any others
(that would require some analysis of potential problems). It would seem to
be a pretty draconian restriction, though, as anything other than access
T'Class would have to be banned. That would seem to conflict with anonymous
access dispatching, for instance.

****************************************************************

From: Niklas Holsti
Sent: Tuesday, January 23, 2018  4:33 AM

> I agree with Randy's responses, in general. Adding a new kind of
> container with some implementation advice and/or requirements is
> hugely less disruptive than adding a new language feature.

That route is ok for me, unless the container introduces some additional
analysis problems, such as finalization (as was discussed in other
messages).

> Furthermore, you really have no guarantee that the new language
> feature wouldn't end up using heap allocation "underneath," and so we
> would end up having new restrictions and/or implementation
> requirements to prevent that.

The restriction No_Implicit_Heap_Allocations should work, at least to warn
the programmer if the compiler needs heap to implement the definite
class-wide types.
Some compilers could still use a secondary stack, of course. But this seems
to be an issue of implementation quality, or implementation orientation,
and does not apply only to the proposed definite class-wide types, but to
almost anything (say, local array variables of dynamic length, which some
compilers might allocate on the heap, or on a secondary stack).

> Note that we might also address some of the same needs by having a
> "safe" pointer capability (AI-0240), that manages storage
> automatically, and could be coupled with other things such as
> storage-sized access collections to achieve the effectively "static"
> allocation you would like.

If "safe" pointers still use allocations ("new"), they would not solve the
problem, because a static analysis could find it quite difficult to bound
the number of allocations executed at run-time, and to match up the
allocations with the (implicit) deallocations, which it would have to do in
order to discover the maximum number of allocated objects that could exist,
at any time, during execution. There are analysis tools that attempt to
find bounds on heap usage, but they require much deeper and more complex
analysis (for example, "shape analysis" for linked data structures) than is
needed for stack-usage analysis.

****************************************************************

From: Niklas Holsti
Sent: Tuesday, January 23, 2018  8:25 AM

> ...
>> Some implementations of finalization could be quite difficult to
>> analyse from the machine code. ...
>
> Understood. But that assumes that the finalization is non-trivial, and
> that a linked list implementation is used. If it is trivial, it is
> possible that the compiler generates no finalization at all for an
> object with a null finalization routine (and I would hope that is the
> case for any compiler trying to support the kind of analysis you
> mention here).

That would solve the finalization-analysis problem in this case, yes.
> My understanding is that GNAT generates a direct (not dispatching)
> call to the Finalize routine for stand-alone objects and subcomponents
> thereof.

Ok, that should not be a problem then, if the call to Finalize the
Bounded_Container is direct.

>> Explicit dispatching calls also lead to machine-level calls with
>> dynamically defined target addresses, but here the analyst can aid
>> and guide the analysis tool as to which subprograms may be called by
>> a particular dispatching call.
>> That is much harder for implicit dispatching calls, such as
>> finalization.
>
> Not sure why; the type is known exactly in most cases (including the
> one in question).

The class-wide type is exactly known to the compiler, but not to the
machine-code analyzer.

For explicit dispatching calls, the guidance from the analyst, to which I
referred above, usually takes the form of textual "annotations", written by
the analyst for use by the analysis tool, and saying that the dynamic call
at a particular source code location (given by file name and line number,
or similar) may actually call (dispatch to) a given set of subprograms
specifically named in the annotation. The analyst identifies these
subprograms from the source code, perhaps aided by a source-level analysis
that identifies the primitive subprograms for each type included in the
class. Something like

   dynamic call at "huhu.adb:626" calls "Foo" or "Foo2" or "Foo3";

For implicit dispatching calls, there is no similar source-code reference
location. The analysis tool can tell the analyst that there is a
dispatching call at machine code address A, but the analyst will find it
difficult to map this call to the source code, and thus to understand what
the call is doing, and how to guide the analysis of this call.

Machine-code analyzers usually strive to be source-language-agnostic and
compiler-agnostic, so as not to restrict their user base by conditions on
the user's programming tools.
So one tries to separate the machine-code analysis from any source-code
analysis that might support the machine-code one. And there is always the
problem of mapping information from the source-code analysis to the machine
level, such as identifying which implicit finalization accounts for the
dispatching call at address A.

>>> The Bounded_Container would
>>> have to somehow get the Max_Size_In_Storage_Elements. I'd probably
>>> suggest that it be specified as a parameter to the generic, so that
>>> such a container could be used for any indefinite type (assuming the
>>> Max is large enough).
>>
>> That sounds like a workable solution. But it still seems necessary to
>> specify the Max_Size_In_Storage_Elements aspect for the class-wide
>> root type, so that the compiler can check that the max size given in
>> the instantiation of Bounded_Container is large enough for all
>> members of the class.
>
> I would have expected it to be enough for the container to check for
> overflow. Any full-program analysis tool can easily prove that it
> can't fail.

The other fellow's job always seems easier than one's own :-)

In this case, the machine-code analyzer would have to perform this proof at
every point in the code where an object is stored into some instantiation
of a Bounded_Container. To do so, the machine-code analyzer would have to
be able to statically compare the actual Max_Size value of the present
instance of Bounded_Container, to the (actual or maximum) size of the
object that is being stored in the Bounded_Container. In the machine-code,
both values may be the result of dynamic computations. It may be possible;
I find it hard to say, as it will depend on the precise form of the code
generated for instantiations and for storing an object in the container.

(Aside from that, I would much prefer to have a possible overflow detected
at compile time, at the instantiation.)
>>> Allowing essentially the same thing for class-wide types would be a
>>> whole 'nother level of implementation pain.

[snip description of Janus/Ada methods; thanks, it was illuminating]

> If one allowed changing the tag of class-wide objects, you would have
> the same issues with them, and a similar, heavy-weight solution would
> have to be applied to them.
>
>> But I now realize that when writing this suggestion, I did not
>> remember this issue (needing finalization) of class-wide types
>> (perhaps because I have seldom used controlled types).
>> Therefore, even if the suggestion leads to some way to bound the size
>> of T'Class, the analysis could still be confused by the finalization,
>> as I explained above. To solve this, I want to extend the suggestion
>> to include a Boolean aspect No_Controlled_Subcomponents which can be
>> specified for an array type, a record type, a tagged type, or a
>> class-wide type, and which would mean what it says, that this type
>> cannot and will not have any controlled subcomponents. A class-wide
>> type with this aspect would not need finalization, unless the root
>> tagged type is itself controlled.
>
> Seems reasonable.

If it would simplify implementation, one could allow compilers to require
(or assume) that No_Controlled_Subcomponents is specified together with
Max_Size_In_Storage_Units.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, January 24, 2018  3:25 PM

...

> >> Explicit dispatching calls also lead to machine-level calls with
> >> dynamically defined target addresses, but here the analyst can aid
> >> and guide the analysis tool as to which subprograms may be called
> >> by a particular dispatching call.
> >> That is much harder for implicit dispatching calls, such as
> >> finalization.
> >
> > Not sure why; the type is known exactly in most cases (including
> > the one in question).
>
> The class-wide type is exactly known to the compiler, but not to the
> machine-code analyzer.

I understand that, but it isn't relevant when the specific type is known.
(Which is what I meant by "type is known exactly"; I could have been
clearer.) As you know, Ada dispatching uses static binding for dispatching
calls where the controlling type is statically determined. A statically
bound dispatching call doesn't dispatch at all, it just directly calls the
appropriate routine.

So, for machine-code analysis, this just looks like a normal direct call.
I'd be amazed if the analyzer cares about how that call got determined.
Obviously, a compiler could revert to dynamic dispatching code even in that
case, but one would hope that if the compiler vendor is trying to support
machine code analysis, they would fix the code generation to avoid that.
(Yes, Janus/Ada would use a dynamic call in such a case, and that call is
hidden in the runtime, but anyone trying to do machine-code analysis on
Janus/Ada output is nuts -- it wasn't designed for that at all.)

...

> >>> The Bounded_Container would
> >>> have to somehow get the Max_Size_In_Storage_Elements. I'd probably
> >>> suggest that it be specified as a parameter to the generic, so
> >>> that such a container could be used for any indefinite type
> >>> (assuming the Max is large enough).
> >>
> >> That sounds like a workable solution. But it still seems necessary
> >> to specify the Max_Size_In_Storage_Elements aspect for the
> >> class-wide root type, so that the compiler can check that the max
> >> size given in the instantiation of Bounded_Container is large
> >> enough for all members of the class.
> >
> > I would have expected it to be enough for the container to check for
> > overflow. Any full-program analysis tool can easily prove that it
> > can't fail.
>
> The other fellow's job always seems easier than one's own :-)
>
> In this case, the machine-code analyzer would have to perform this
> proof at every point in the code where an object is stored into some
> instantiation of a Bounded_Container. To do so, the machine-code
> analyzer would have to be able to statically compare the actual
> Max_Size value of the present instance of Bounded_Container, to the
> (actual or maximum) size of the object that is being stored in the
> Bounded_Container. In the machine-code, both values may be the result
> of dynamic computations. It may be possible; I find it hard to say, as
> it will depend on the precise form of the code generated for
> instantiations and for storing an object in the container.

I wasn't expecting the machine-code analyzer to ever see this check. I was
expecting an Ada-level tool to make the proof and the check to then be
suppressed. (The only legitimate reason for Ada check suppression is to
reduce the cost of machine-code analysis). Additionally, this check (like
most new checks in Ada 2020) would be written as a precondition, so it is
likely the compiler could eliminate it in most cases. Again, it wouldn't
survive for the machine code analyzer to see it.

> (Aside from that, I would much prefer to have a possible overflow
> detected at compile time, at the instantiation.)

If an object did cause an overflow, you'd most likely get a warning about a
precondition that always will fail. If one uses "warnings as errors" mode,
you get the effect without any extra language machinery. Specifically:

   procedure Replace_Element (Container : in out Holder;
                              New_Item  : in Element_Type)
      with Pre => New_Item'Size <= Max_Size;

If New_Item is an object of a specific type (the usual case), the compiler
would be able to evaluate the precondition at compile-time (since
New_Item'Size is a compile-time constant). It would be a pretty lousy
compiler that actually generated a check in that case.
:-) You'd probably have to suppress the check if the object is class-wide,
but that would generally be safe if it had previously been stored in a
Holder. (And Ada-level analysis tools should be able to prove that
regardless of the source of the object.)

There will shortly be a proposal for suppressing precondition checks for
language-defined units as part of AI12-0112-1. (I have to find time to
write it, rather than commenting on everyone else's proposals. :-)

****************************************************************

From: Niklas Holsti
Sent: Friday, January 26, 2018  2:55 PM

> ...
>>>> That is much harder for implicit dispatching calls, such as
>>>> finalization.
>>>
>>> Not sure why; the type is known exactly in most cases
>>> (including the one in question).

...

> So, for machine-code analysis, this just looks like a normal direct
> call.

Indeed, such calls are not a problem, not (usually) even for implicit
calls.

>>> [Niklas Holsti writes:]
>>>> But it still seems necessary to
>>>> specify the Max_Size_In_Storage_Elements aspect for the class-wide
>>>> root type, so that the compiler can check that the max size given
>>>> in the instantiation of Bounded_Container is large enough for all
>>>> members of the class.
>>>
>>> I would have expected it to be enough for the container to check for
>>> overflow.

[snip]

> If an object did cause an overflow, you'd most likely get a warning
> about a precondition that always will fail. If one uses "warnings as
> errors" mode, you get the effect without any extra language machinery.
>
> Specifically:
>    procedure Replace_Element (Container : in out Holder;
>                               New_Item  : in Element_Type)
>       with Pre => New_Item'Size <= Max_Size;
>
> If New_Item is an object of a specific type (the usual case), the
> compiler would be able to evaluate the precondition at compile-time
> (since New_Item'Size is a compile-time constant).

I get it now.
I wanted to have the compiler check Max_Size when a descendant type is
derived, but I agree with you that, since an object of every such
descendant type will be entered into an instance of Bounded_Holder
somewhere in the program, and most probably with its specific type, the
Max_Size can be a (generic) property of Bounded_Holder, and checked against
the size of the specific descendant type in a precondition of
Replace_Element. Nice.

So then I would be content with a Bounded_Holder like that (with some
strong advice that its memory usage should be a static function of
Max_Size), plus the new aspect No_Controlled_Subcomponents, to eliminate
dispatching finalization of the Element_Type upon replacement.

Thanks for the discussion of this suggestion.

To Randy: should I write new ada-comment messages to suggest Bounded_Holder
and No_Controlled_Subcomponents explicitly, or is this discussion enough
for you to consider them suggested?

****************************************************************

From: Randy Brukardt
Sent: Friday, January 26, 2018  6:19 PM

...

> To Randy: should I write new ada-comment messages to suggest
> Bounded_Holder and No_Controlled_Subcomponents explicitly, or is this
> discussion enough for you to consider them suggested?

No need. The ARG mainly wants problem descriptions to which we can figure
out the answers. And you've provided an understandable problem description,
to which we'll work out the least disruptive and most functional solution.

In any case, the deadlines (both published and functional) have passed, so
new proposals probably will get much lower priority than the existing ones.
(The *real* deadline was the deadline for submissions for Monday's ARG
meeting, which was 6 hours ago.)

I was just writing up the AI for this thread; it will be AI12-0254-1
(should be posted sometime today, Wisconsin time -- thus the next 6 hours).

****************************************************************
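For reference, the container discussed in this thread might have a
specification along the lines of the sketch below, following the !proposal
section above. This is only an illustration of the proposal, not an
existing language-defined unit; the exact set of operations, the type of
the Max_Size_in_Storage_Elements formal, and the private part are
assumptions made for the sketch:

```ada
--  Sketch only: this package is the AI12-0254-1 proposal, not an
--  existing language-defined unit. Details such as the type of
--  Max_Size_in_Storage_Elements and the operation set are assumed.
with System;

generic
   type Element_Type (<>) is private;
   Max_Size_in_Storage_Elements : Natural;
   --  Maximum size, in storage elements, of any object to be held.
   with function "=" (Left, Right : Element_Type) return Boolean is <>;
package Ada.Containers.Bounded_Indefinite_Holders is

   type Holder is tagged private;

   function Is_Empty (Container : Holder) return Boolean;

   function Element (Container : Holder) return Element_Type;

   procedure Replace_Element
     (Container : in out Holder;
      New_Item  : in     Element_Type)
     with Pre => New_Item'Size / System.Storage_Unit <=
                   Max_Size_in_Storage_Elements;
   --  When New_Item has a specific (definite) type, New_Item'Size
   --  is known at compile time, so a compiler can evaluate this
   --  precondition statically and warn when it can never succeed.

private
   --  Per the Implementation Advice in the !proposal section, the
   --  element would be stored in statically allocated memory, e.g.
   --  a Storage_Array of Max_Size_in_Storage_Elements elements
   --  inside the Holder, with only trivial finalization.
   type Holder is tagged null record;  --  placeholder for the sketch

end Ada.Containers.Bounded_Indefinite_Holders;
```

An instantiation for a class-wide type (with an invented root type Shape
and an arbitrary size bound) might then read:

   package Shape_Holders is new Ada.Containers.Bounded_Indefinite_Holders
     (Element_Type => Shape'Class, Max_Size_in_Storage_Elements => 128);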