!standard 3.8(12/3)                                   15-10-05  AC95-00273/00
!class confirmation 15-10-05
!status received no action 15-10-05
!status received 15-08-31
!subject Discriminants would be more useful if they could be used in static expressions
!summary
!appendix

!topic Discriminants would be more useful if they could be used in static expressions
!reference Ada 2012 RM3.8(12/3)
!from Hadrien Grasland 15-08-30
!keywords discriminant static
!discussion

In section 3.8, paragraph 12/3, the Ada 2012 RM states "If the discriminant is used to define the constraint of a component, the bounds of an entry family, or the constraint of the parent subtype in a derived_type_definition, then its name shall appear alone as a direct_name (not as part of a larger expression or expanded name)."

Unfortunately, when building nontrivial discriminated types, this rule turns out to be unpleasantly limiting. For example, it makes it impossible to parametrize an array using its initial bound and length, as in the following (illegal) code:

   type Tridiagonal_System (First_Index : Index_Type; Length : Count_Type) is record
      Lower_Diagonal : F_Containers.Vector (First_Index + 1 .. First_Index + Length - 1);
      Diagonal       : F_Containers.Vector (First_Index .. First_Index + Length - 1);
      Upper_Diagonal : F_Containers.Vector (First_Index .. First_Index + Length - 2);
   end record;

Of course, I can emulate this functionality using more complex language constructs, such as a triplet of heap-allocated arrays that are all simultaneously allocated by a constructor function. But one thing I greatly enjoy about Ada, with respect to other languages like C++, is the way its glorious type system dramatically reduces the need for such boilerplate code (and unnecessary heap allocations). So this feels like a step in the wrong direction.

Consequently, unless it would really be an immense amount of trouble for implementations, I really wish it were possible to use type discriminants in static expressions.
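[Editor's note: For concreteness, the constructor-based emulation mentioned above might look something like the following sketch. It assumes that F_Containers.Vector is an unconstrained array type and that Index_Type and Count_Type are compatible integer types; all names are illustrative, not from the original mail.

   type Vector_Access is access F_Containers.Vector;

   type Tridiagonal_System is record
      Lower_Diagonal, Diagonal, Upper_Diagonal : Vector_Access;
   end record;

   -- Constructor function: performs the three allocations together,
   -- using the bound computations that 3.8(12/3) forbids in the record.
   function Make (First_Index : Index_Type; Length : Count_Type) return Tridiagonal_System is
   begin
      return (Lower_Diagonal => new F_Containers.Vector (First_Index + 1 .. First_Index + Length - 1),
              Diagonal       => new F_Containers.Vector (First_Index .. First_Index + Length - 1),
              Upper_Diagonal => new F_Containers.Vector (First_Index .. First_Index + Length - 2));
   end Make;

Three heap allocations and an access indirection stand in for what the (illegal) discriminated record above would express directly.]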
***************************************************************

From: Randy Brukardt
Sent: Tuesday, September 1, 2015  6:19 PM

> In section 3.8, paragraph 12/3, the Ada 2012 RM states "If the discriminant is used to define the constraint of a component, the bounds of an entry family, or the constraint of the parent subtype in a derived_type_definition, then its name shall appear alone as a direct_name (not as part of a larger expression or expanded name)."

Terminology point: nothing in the above wording nor in the rest of your message has anything at all to do with "static expressions" as defined in 4.9. I can't imagine any scenario where the discriminant of a current instance would ever be considered static, as the associated discriminant constraint is unknown there, and surely can be non-static:

   Obj : Rec_D (Some_Function);

...

> Of course, I can emulate this functionality using more complex language constructs, such as a triplet of heap-allocated arrays that are all simultaneously allocated by a constructor function. But one thing I greatly enjoy about Ada, with respect to other languages like C++, is the way its glorious type system dramatically reduces the need for such boilerplate code (and unnecessary heap allocations). So this feels like a step in the wrong direction.
>
> Consequently, unless it would really be an immense amount of trouble for implementations, I really wish it were possible to use type discriminants in static expressions.

As I noted, there is nothing static about these expressions, and there never could be.

By mentioning "static expressions" here, you're confusing casual readers as to what you actually are asking about. If you're not sure about the proper terminology, it's best to stay vague (in this case, just talk about "expressions") because that will not confuse the issue.

It's a minor point; on to your actual point.

This issue has been well-known for decades, but the problem is that while this looks natural, it has the potential to cause extreme pain for implementations and correspondingly for some kinds of users (those that cannot have any sort of heap operations in their code). It's clear that some sort of restrictions have to be placed on the expression.

Coincidentally, I recently received some private mail from Steve Baird on this very topic (some form of this will be in a future draft of AI12-0075-1, which proposes expansion of what is considered a static function, so I think it is OK to use it here):

> We don't want to mess with the 3.8(12/3) rule.
>
> It might seem tempting to allow (in a component constraint) the case of a call to a static function such that all parameters (including, of course, defaulted parameters) are either static expressions or names denoting discriminants.
>
> One property we need for discriminant-dependent component constraint expressions is reproducibility - given the same discriminant values we must always get the same result. This relaxation of the rules would meet that requirement, but that's not enough.
>
> We also need, given a set of discriminant subtypes, to be able to easily calculate the minimum and maximum values that an expression might yield (e.g., we need to know how much storage is needed when we declare an unconstrained object of a type which has an array component with a discriminant-dependent index constraint).
>
> That's where this proposal would lead to implementation problems, as an arbitrary static function would not necessarily be monotonic.

I'm no fan of the max-size implementation approach, and am particularly appalled that we allow that to be the only approach used by an implementation. (We don't allow only table-based case statements, for instance; any arbitrary range has to work in a case statement. Why do we treat discriminated records differently??) Nevertheless, that train left the station decades ago, and it wouldn't be sensible to change the rules now, given the massive pain that would be caused. On top of which, some users cannot have any implicit non-stack-based storage management (as is required when avoiding a max-size implementation), so an additional aspect would be required to force a max-size implementation when required. That's clearly very complex. (It's also my preference for the language, but I expect that I would be a group of one supporting that position; that's certainly been the way it was in the past.)

We once considered (I forget when) allowing a very limited expansion of the 3.8(12/3) rule. Essentially, only D + C and D - C (C being a static expression) were suggested. It didn't seem that those would be sufficiently useful to support the expansion of the Standard wording and implementation complexity. It's probably possible to go slightly beyond that, but even multiply and divide can cause problems (especially when the values can be negative, which they can be in general).

Ergo, I wouldn't expect a change in this area, as "thar be dragons", at least for compilers implemented as most are (to my knowledge, all but Janus/Ada).
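[Editor's note: To make the scope of that once-considered relaxation concrete, here is a sketch of what D + C and D - C would and would not permit; the type is illustrative:

   type Slice (Length : Natural) is record
      Data : String (1 .. Length);      -- legal today: bare discriminant
      Rest : String (2 .. Length);      -- also legal today: the discriminant still appears alone
      Head : String (1 .. Length - 1);  -- illegal today; would need the D - C form
   end record;

Note that even this relaxation would not cover the Tridiagonal_System example above, whose bounds combine two discriminants in a single expression.]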
***************************************************************

From: Hadrien Grasland
Sent: Wednesday, September 2, 2015  1:59 AM

> Terminology point: nothing in the above wording nor in the rest of your message has anything at all to do with "static expressions" as defined in 4.9. I can't imagine any scenario where the discriminant of a current instance would ever be considered static, as the associated discriminant constraint is unknown there, and surely can be non-static:
>
>    Obj : Rec_D (Some_Function);

Indeed, sorry about that. I thought that static expressions would be enough and easier on implementations (if the size of objects is known at compile-time, it is easier to allocate them), but you are right that this would be *way* too limiting, and actually runs counter to the point of discriminated types in most cases.

...

>> Consequently, unless it would really be an immense amount of trouble for implementations, I really wish it were possible to use type discriminants in static expressions.
>
> As I noted, there is nothing static about these expressions, and there never could be.
>
> By mentioning "static expressions" here, you're confusing casual readers as to what you actually are asking about. If you're not sure about the proper terminology, it's best to stay vague (in this case, just talk about "expressions") because that will not confuse the issue.

I agree, see above.

> It's a minor point; on to your actual point.
>
> This issue has been well-known for decades, but the problem is that while this looks natural, it has the potential to cause extreme pain for implementations and correspondingly for some kinds of users (those that cannot have any sort of heap operations in their code). It's clear that some sort of restrictions have to be placed on the expression.

I am somewhat surprised about the necessity of heap allocation that you mention. I do not see what would prevent a stack-based Ada implementation from implementing the above record as something that is allocated on a stack, maybe along the lines of...

   ... | Record discriminants | First array offset | First array length | First array data | Second array offset | <other components> | ...

After all, by the time control reaches the declaration of a constrained object of a discriminated record type, all the information that is required for allocating the discriminated record on the stack should be available. It is only a matter of computing the various array offsets and lengths at runtime.

Of course, as you say and as the email you quoted below elaborates, there must be some kind of rule enforcing the reproducibility of array offset and length computations. For example, it should probably be an expression without any side effect.

As for unconstrained objects of a discriminated type, an implementation could probably use the same sort of techniques as for functions returning unconstrained objects, as we discussed recently on comp.lang.ada. That is, either use a secondary stack or raw heap allocation, whichever works best for the application domain, followed by storing a pointer to the allocated object on the primary stack.

> Coincidentally, I recently received some private mail from Steve Baird on this very topic (some form of this will be in a future draft of AI12-0075-1, which proposes expansion of what is considered a static function, so I think it is OK to use it here):
>
>> We don't want to mess with the 3.8(12/3) rule.
>>
>> It might seem tempting to allow (in a component constraint) the case of a call to a static function such that all parameters (including, of course, defaulted parameters) are either static expressions or names denoting discriminants.
>>
>> One property we need for discriminant-dependent component constraint expressions is reproducibility - given the same discriminant values we must always get the same result. This relaxation of the rules would meet that requirement, but that's not enough.
>>
>> We also need, given a set of discriminant subtypes, to be able to easily calculate the minimum and maximum values that an expression might yield (e.g., we need to know how much storage is needed when we declare an unconstrained object of a type which has an array component with a discriminant-dependent index constraint).
>>
>> That's where this proposal would lead to implementation problems, as an arbitrary static function would not necessarily be monotonic.

Please excuse my lack of implementation experience, but why is it necessary to know the minimum and maximum of an expression in this situation? I understand neither this comment nor the ones you provide below about max-size implementations.

> I'm no fan of the max-size implementation approach, and am particularly appalled that we allow that to be the only approach used by an implementation. (We don't allow only table-based case statements, for instance; any arbitrary range has to work in a case statement. Why do we treat discriminated records differently??) Nevertheless, that train left the station decades ago, and it wouldn't be sensible to change the rules now, given the massive pain that would be caused. On top of which, some users cannot have any implicit non-stack-based storage management (as is required when avoiding a max-size implementation), so an additional aspect would be required to force a max-size implementation when required. That's clearly very complex. (It's also my preference for the language, but I expect that I would be a group of one supporting that position; that's certainly been the way it was in the past.)
>
> We once considered (I forget when) allowing a very limited expansion of the 3.8(12/3) rule. Essentially, only D + C and D - C (C being a static expression) were suggested. It didn't seem that those would be sufficiently useful to support the expansion of the Standard wording and implementation complexity. It's probably possible to go slightly beyond that, but even multiply and divide can cause problems (especially when the values can be negative, which they can be in general).
>
> Ergo, I wouldn't expect a change in this area, as "thar be dragons", at least for compilers implemented as most are (to my knowledge, all but Janus/Ada).

In any case, thanks for your thoughtful reply :) It seems that whether I am meaninglessly trying to revive an old fire for nothing or not, I am going to learn quite a lot of interesting things by raising this issue again.

By the way, if you can remember identifiers or keywords for older Ada Issues that I can read in order to get more familiar with this matter, I would be happy to peruse them.

***************************************************************

From: Randy Brukardt
Sent: Wednesday, September 2, 2015  3:23 PM

...
> > This issue has been well-known for decades, but the problem is that while this looks natural, it has the potential to cause extreme pain for implementations and correspondingly for some kinds of users (those that cannot have any sort of heap operations in their code). It's clear that some sort of restrictions have to be placed on the expression.
>
> I am somewhat surprised about the necessity of heap allocation that you mention. I do not see what would prevent a stack-based Ada implementation from implementing the above record as something that is allocated on a stack, maybe along the lines of...
>
> ... | Record discriminants | First array offset | First array length | First array data | Second array offset | <other components> | ...

Nothing does. But that would be a rarely-occurring special case for a compiler, something most compiler-writers are going to avoid. (Especially as we have to support whole record assignment from/to unconstrained objects of the same type. It's a pain if there are multiple representations.) Unconstrained objects of discriminated types are much more likely.

Moreover, the ARG tends to worry about the worst case, because that's where the problems lie. We don't worry much about the best case, 'cause implementers will make the best performing compiler that they can -- they rarely need our help in doing that.

Anyway, unconstrained objects cause problems in two cases:

(1) If the object is constrained by its initial value, the discriminant value may be unknown until runtime.

   Obj : Disc := Some_Function;

(2) If the type is mutable ("mutable" is the term that the RM doesn't define but the AARM does to describe a type with defaulted discriminants), the discriminants can be changed by a later assignment.

   type Rec (Len : Natural := 0) is record
      Val : String (1 .. Len);
   end record;

   Obj : Rec; -- This could be a component, too.

   Obj := (Len => 2, Val => "RR");
   ...
   Obj := (Len => 3, Val => "ZZZ");

The commonly used solution to this last problem is the "max size" implementation; the object will be large enough to hold a value for any possible value of the discriminant. (As such, this declaration won't compile or will raise Storage_Error on most compilers.)

There is no way that we would have different rules for the use of discriminants for mutable and non-mutable types; that would be amazingly confusing to programmers and would also pose a maintenance hazard (it would be much harder to switch between a non-mutable and a mutable type).

> After all, by the time control reaches the declaration of a constrained object of a discriminated record type, all the information that is required for allocating the discriminated record on the stack should be available. It is only a matter of computing the various array offsets and lengths at runtime.
>
> Of course, as you say and as the email you quoted below elaborates, there must be some kind of rule enforcing the reproducibility of array offset and length computations. For example, it should probably be an expression without any side effect.

Right, but there is no way in current Ada to know that. Only predefined operators are safe that way, which is rather limiting.

> As for unconstrained objects of a discriminated type, an implementation could probably use the same sort of techniques as for functions returning unconstrained objects, as we discussed recently on comp.lang.ada.
> That is, either use a secondary stack or raw heap allocation, whichever works best for the application domain, followed by storing a pointer to the allocated object on the primary stack.

No, that does not work for a mutable type. That's because it can change shape (grow and use more memory is the problem case, of course) in a later assignment, and that assignment might be inside of some other scope than the one that created the object. The memory allocation here does not follow a stack discipline. One can just drop the memory on the floor, but for some reason users don't like that. ;-)

Or one can have some sort of memory allocator associated with the scope where the object is declared. That isn't cheap (although it does have the effect of providing a correct implementation of dynamic accessibility checking for almost free - so it's possible that you're paying the cost anyway).

Moreover, for users that cannot have any sort of heap allocation, this technique isn't going to work. They'd have to avoid the use of discriminant-dependent components, which is way too fierce a restriction.

> > Coincidentally, I recently received some private mail from Steve Baird on this very topic (some form of this will be in a future draft of AI12-0075-1, which proposes expansion of what is considered a static function, so I think it is OK to use it here):
> >
> >> We don't want to mess with the 3.8(12/3) rule.
> >>
> >> It might seem tempting to allow (in a component constraint) the case of a call to a static function such that all parameters (including, of course, defaulted parameters) are either static expressions or names denoting discriminants.
> >>
> >> One property we need for discriminant-dependent component constraint expressions is reproducibility - given the same discriminant values we must always get the same result. This relaxation of the rules would meet that requirement, but that's not enough.
> >>
> >> We also need, given a set of discriminant subtypes, to be able to easily calculate the minimum and maximum values that an expression might yield (e.g., we need to know how much storage is needed when we declare an unconstrained object of a type which has an array component with a discriminant-dependent index constraint).
> >>
> >> That's where this proposal would lead to implementation problems, as an arbitrary static function would not necessarily be monotonic.

> Please excuse my lack of implementation experience, but why is it necessary to know the minimum and maximum of an expression in this situation? I understand neither this comment nor the ones you provide below about max-size implementations.

I think I explained it above. For an unconstrained mutable discriminated record type, most implementations use a max-size approach to allocating memory. The alternative of having a dedicated storage pool available to be used if a later assignment grows the object is considered too complex.

For a max-size approach to work, one has to be able to determine the max size. But that's not easy if arbitrary static functions are allowed. Consider:

   subtype S_A is Integer range -20..20;
   subtype S_B is Integer range -30..10;

   type A_Rec (A : S_A := 0; B : S_B := 0) is record
      Val : String (1 .. (if A*B > 0 then A*B else 0));
   end record;

What is the maximum size of Val? It's 600 characters, but to figure that out, one has to multiply all 4 combinations of end-points. The "obvious" technique of using the upper bounds gets the wrong answer of 200. (Here, the conditional expression just serves to keep the value in range for String; I could have used some other array type without the conditional expression and without changing the issue.)
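[Editor's note: Spelling out the four corner products: (-20) * (-30) = 600, (-20) * 10 = -200, 20 * (-30) = -600, and 20 * 10 = 200. The maximum, 600, comes from multiplying the two lower bounds, which is why looking only at the upper bounds badly underestimates the size.]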
It gets even worse for arbitrary conditional expressions, as that potentially makes discontinuities in the result values; you'd have to try every possible value of the discriminants (which can be a whole lot if there are multiple discriminants). It's not impossible, but complicated and slow.

(For instance, consider:

   type B_Rec (C : S_A := 0) is record
      Val : String (1 .. (if C = 0 then 50 else C+20));
   end record;

The largest size does not come from either endpoint.)

...

> In any case, thanks for your thoughtful reply :) It seems that whether I am meaninglessly trying to revive an old fire for nothing or not, I am going to learn quite a lot of interesting things by raising this issue again.
>
> By the way, if you can remember identifiers or keywords for older Ada Issues that I can read in order to get more familiar with this matter, I would be happy to peruse them.

I'd look for the paragraph number in the AIs and minutes; that probably won't find everything, but at least it would provide a start.

***************************************************************

From: Tucker Taft
Sent: Wednesday, September 2, 2015  3:43 PM

> ... For a max-size approach to work, one has to be able to determine the max size. But that's not easy if arbitrary static functions are allowed. Consider:
>
>    subtype S_A is Integer range -20..20;
>    subtype S_B is Integer range -30..10;
>
>    type A_Rec (A : S_A := 0; B : S_B := 0) is record
>       Val : String (1 .. (if A*B > 0 then A*B else 0));
>    end record;
>
> What is the maximum size of Val? It's 600 characters, but to figure that out, one has to multiply all 4 combinations of end-points. The "obvious" technique of using the upper bounds gets the wrong answer of 200. (Here, the conditional expression just serves to keep the value in range for String; I could have used some other array type without the conditional expression and without changing the issue.) ...

To be fair, we could define rules similar to those for "static" predicates (RM2012 3.2.4(15/3-22/3)), which have similar concerns. We could call it "discriminant-static" in analogy with "predicate-static", with the key requirement being the ability to determine statically the possible range of values of the expression, knowing the range of values of each discriminant, without any sort of exhaustive search.

***************************************************************

From: Randy Brukardt
Sent: Wednesday, September 2, 2015  4:00 PM

Surely. But one wonders if that would lead to a result that is still pretty limiting to users. Pretty much the only operators that work for such a scenario are "+" and "-" (I showed why "*" is a problem, and "/" is obviously worse); no conditionals, no static functions, and it's unclear whether multiple discriminants could be allowed.

Anyway, to be discussed if we want to get serious about this.

***************************************************************

From: Tucker Taft
Sent: Wednesday, September 2, 2015  10:07 PM

I don't see "*" as a serious problem. It is true you have to split into positive and negative subranges, and compute their resulting ranges separately. But that is much better than an "exhaustive search."

For what it is worth, that is exactly what the CodePeer static analyzer does, as it is performing this sort of "value-set arithmetic" everywhere, on almost any operator.
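[Editor's note: A minimal sketch of such value-set arithmetic for "*" over integer ranges; the names are illustrative and do not come from CodePeer. For multiplication, the extremes always occur at one of the four corner products, which is equivalent to the split into positive and negative subranges:

   type Int_Range is record
      Lo, Hi : Integer;
   end record;

   -- Range of X * Y for X in A.Lo .. A.Hi and Y in B.Lo .. B.Hi,
   -- computed from the four corner products, without any search
   -- through the operand ranges.
   function Product_Range (A, B : Int_Range) return Int_Range is
      P1 : constant Integer := A.Lo * B.Lo;
      P2 : constant Integer := A.Lo * B.Hi;
      P3 : constant Integer := A.Hi * B.Lo;
      P4 : constant Integer := A.Hi * B.Hi;
   begin
      return (Lo => Integer'Min (Integer'Min (P1, P2), Integer'Min (P3, P4)),
              Hi => Integer'Max (Integer'Max (P1, P2), Integer'Max (P3, P4)));
   end Product_Range;

Applied to the A_Rec example, Product_Range ((-20, 20), (-30, 10)) yields (-600, 600), giving the 600-character maximum directly.]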
***************************************************************

From: Bob Duff
Sent: Wednesday, September 2, 2015  4:48 PM

> Terminology point: nothing in the above wording nor in the rest of your message has anything at all to do with "static expressions" as defined in 4.9.

I think Hadrien Grasland was saying that it should have the form of a static expression, assuming you plugged in a static value for the discriminant. Or something like that. This concept is similar to "predicate-static".

> I'm no fan of the max-size implementation approach, and am particularly appalled that we allow that to be the only approach used by an implementation. (We don't allow only table-based case statements, for instance; any arbitrary range has to work in a case statement. Why do we treat discriminated records differently??)

Probably because the language designers had the notion that heap allocation should never be implicit. Sparse case statements don't require implicit heap allocation.

Never mind that if you don't want implicit heap allocation, a better solution is "pragma Restrictions(No_Implicit_Heap);".

And never mind that in Ada 83, tasks require heap allocation, and in later language versions most implementations use heap allocation.

And never mind that unbounded strings and Vectors and Holders and whatnot all do heap allocation, arguably "implicit".

> [quoting Steve:]
> > We also need, given a set of discriminant subtypes, to be able to easily calculate the minimum and maximum values that an expression might yield (e.g., we need to know how much storage is needed when we declare an unconstrained object of a type which has an array component with a discriminant-dependent index constraint).

I'm not sure why we "need" that. Maybe we'd "want" that, but:

- Why "minimum"?

- This is only needed/wanted for defaulted discrims. We already have different rules (e.g. tagged types forbid defaulted discrims).

- A conservative maximum can always be calculated -- e.g. 2**56-1 bytes on x86-64 (which will always raise S_E on alloc-the-max implementations).

> We once considered (I forget when) allowing a very limited expansion of the 3.8(12/3) rule. Essentially, only D + C and D - C (C being a static expression) were suggested.

For what it's worth, those are the cases I've run into in practice. In particular "Some_Array(0..Length_Discrim-1);". But it's a rather kludgy restriction, reminiscent of the restrictions on array indexing expressions in early FORTRAN.

***************************************************************

From: Steve Baird
Sent: Wednesday, September 2, 2015  6:08 PM

> - Why "minimum"?

For a discriminant-dependent low bound in an index constraint.

> This is only needed/wanted for defaulted discrims. We already have different rules (e.g. tagged types forbid defaulted discrims).

If we change the rules to allow

   subtype Index is Integer range 0 .. Some_Dynamic_Expression;

   type T1 (D1 : Index) is record
      T1_Stuff : String (1 .. Some_Complicated_But_Static_Function (D1));
      -- legal because D1 has no default value
   end record;

then what is the component size of the following array type?

   type T2 (D2 : Index := 0) is record
      T2_Stuff : T1 (D1 => D2);
   end record;

   type T2_Vector is array (Character) of T2;

I'm taking it as a given that we want to preserve "allocate max size" as a viable implementation option for mutable records.
Some set of rules could be concocted to allow the cases we know how to implement while disallowing things like the above example; I'm just saying that the rule to capture this distinction isn't completely obvious.

A similar example that would also need to be dealt with is

   type T3 (D3 : Index := 0) is new T1 (D1 => D3);

although new discriminants for a derived untagged type are more of a corner case.

***************************************************************

From: Hadrien Grasland
Sent: Thursday, September 3, 2015  2:31 PM

> [...]
>
> Anyway, unconstrained objects cause problems in two cases:
>
> (1) If the object is constrained by its initial value, the discriminant value may be unknown until runtime.
>
>    Obj : Disc := Some_Function;
>
> (2) If the type is mutable ("mutable" is the term that the RM doesn't define but the AARM does to describe a type with defaulted discriminants), the discriminants can be changed by a later assignment.
>
>    type Rec (Len : Natural := 0) is record
>       Val : String (1 .. Len);
>    end record;
>
>    Obj : Rec; -- This could be a component, too.
>
>    Obj := (Len => 2, Val => "RR");
>    ...
>    Obj := (Len => 3, Val => "ZZZ");
>
> The commonly used solution to this last problem is the "max size" implementation; the object will be large enough to hold a value for any possible value of the discriminant. (As such, this declaration won't compile or will raise Storage_Error on most compilers.)
>
> There is no way that we would have different rules for the use of discriminants for mutable and non-mutable types; that would be amazingly confusing to programmers and would also pose a maintenance hazard (it would be much harder to switch between a non-mutable and a mutable type).

I would argue that of those, only (2) is really serious, since (1) can almost always (except where you highlight below) be solved using a run-time allocation mechanism based either on heap allocation or a sufficiently large program stack, whichever is available. The only situation where this would fail would be an architecture where processes may have large static memory requirements, but may not ask for much more at runtime (think non-growing 4 KB stacks :-( )

Indeed though, the mutable type problem is pretty serious. I didn't know about that one, and I agree with you that silent heap allocation with reference-counted pointers would probably be a better fit in this case, if the target platform could provide that. Though this is probably never going to fly for most implementations, because real-time programming and heap allocation don't mix well.

> [...]
>
>> As for unconstrained objects of a discriminated type, an implementation could probably use the same sort of techniques as for functions returning unconstrained objects, as we discussed recently on comp.lang.ada. That is, either use a secondary stack or raw heap allocation, whichever works best for the application domain, followed by storing a pointer to the allocated object on the primary stack.
>
> No, that does not work for a mutable type. That's because it can change shape (grow and use more memory is the problem case, of course) in a later assignment, and that assignment might be inside of some other scope than the one that created the object. The memory allocation here does not follow a stack discipline. One can just drop the memory on the floor, but for some reason users don't like that. ;-)
>
> Or one can have some sort of memory allocator associated with the scope where the object is declared.
> That isn't cheap (although it does have the effect of providing a correct implementation of dynamic accessibility checking for almost free - so it's possible that you're paying the cost anyway).
>
> Moreover, for users that cannot have any sort of heap allocation, this technique isn't going to work. They'd have to avoid the use of discriminant-dependent components, which is way too fierce a restriction.

So if I get your point, the main problem with allowing broader use of discriminants is that it could break max-size-based implementation techniques, which themselves are required to implement mutable types without heap allocation.

But as you have also shown above, these techniques are already trivially broken by using a "wrong" discriminant type to define array bounds, and consequently can only be applied to a limited subset of the mutable types allowed by the Ada standard anyway. So what would change in the end?

Couldn't an implementation that doesn't support heap allocation just equate an array index of nontrivial value to the mythical universal_integer, throw a warning about objects of an unbounded mutable type causing problems, and then die at runtime the way GNAT did when I exposed it to your code snippet above?

> [...]
>
> [...] For an unconstrained mutable discriminated record type, most implementations use a max-size approach to allocating memory. The alternative of having a dedicated storage pool available to be used if a later assignment grows the object is considered too complex.
>
> For a max-size approach to work, one has to be able to determine the max size. But that's not easy if arbitrary static functions are allowed. Consider:
>
>    subtype S_A is Integer range -20..20;
>    subtype S_B is Integer range -30..10;
>
>    type A_Rec (A : S_A := 0; B : S_B := 0) is record
>       Val : String (1 .. (if A*B > 0 then A*B else 0));
>    end record;
>
> What is the maximum size of Val? It's 600 characters, but to figure that out, one has to multiply all 4 combinations of end-points. The "obvious" technique of using the upper bounds gets the wrong answer of 200. (Here, the conditional expression just serves to keep the value in range for String; I could have used some other array type without the conditional expression and without changing the issue.)
>
> It gets even worse for arbitrary conditional expressions, as that potentially makes discontinuities in the result values; you'd have to try every possible value of the discriminants (which can be a whole lot if there are multiple discriminants). It's not impossible, but complicated and slow.
>
> (For instance, consider:
>
>    type B_Rec (C : S_A := 0) is record
>       Val : String (1 .. (if C = 0 then 50 else C+20));
>    end record;
>
> The largest size does not come from either endpoint.)

I guess a max-size implementation could do it like this then:

* Try to put a reasonable upper bound on mutable object size using trivial heuristics (e.g. for raw discriminant usage "+" or "-" something known statically, where the computation is easy).

* In case of failure, assume the size of the object can be anything from the conceptual universal_integer'First to universal_integer'Last (and beyond!), and throw a compiler warning about mutable object usage possibly causing errors at runtime, the way GNAT does.

* In the special case where the array bound values are provided by an expression function, use the type of the result to put less pessimistic bounds on the mutable object's size. Possibly throw a warning anyway if the range remains gigantic.
> [...]
>
>> By the way, if you can remember identifiers or keywords for older Ada Issues that I can read in order to get more familiar with this matter, I would be happy to peruse them.
>
> I'd look for the paragraph number in the AIs and minutes; that probably won't find everything, but at least it would provide a start.

Unfortunately, trying it with "3.8(12", which is the least specific I can do before being swamped with irrelevant junk, didn't yield many results. Mostly AIs about the RM sometimes violating this rule, typo corrections, and things like that.

Well, guess this will have to do.

***************************************************************

From: Hadrien Grasland
Sent: Thursday, September 3, 2015  2:47 PM

> [...]
>
>> Terminology point: nothing in the above wording nor in the rest of your message has anything at all to do with "static expressions" as defined in 4.9.
>
> I think Hadrien Grasland was saying that it should have the form of a static expression, assuming you plugged in a static value for the discriminant. Or something like that. This concept is similar to "predicate-static".

Perhaps that's what I had in mind initially (I don't remember), but Randy is right that what I wrote in the end was terribly wrong ;-)

>> I'm no fan of the max-size implementation approach, and am particularly appalled that we allow that to be the only approach used by an implementation. (We don't allow only table-based case statements, for instance; any arbitrary range has to work in a case statement. Why do we treat discriminated records differently??)
>
> Probably because the language designers had the notion that heap allocation should never be implicit. Sparse case statements don't require implicit heap allocation.
>
> Never mind that if you don't want implicit heap allocation, a better solution is "pragma Restrictions(No_Implicit_Heap);".
>
> And never mind that in Ada 83, tasks require heap allocation, and in later language versions most implementations use heap allocation.
>
> And never mind that unbounded strings and Vectors and Holders and whatnot all do heap allocation, arguably "implicit".

I agree, this is all very hypocritical. I'm not a huge fan of unnecessary heap allocation, but in some scenarios, using the heap simply cannot be avoided (or an implementation simply doesn't want to do without, because it's much more difficult), and it would be WAY better to have some kind of restrictions available to allow a programmer to prevent it from happening.

>> [quoting Steve:]
>>> We also need, given a set of discriminant subtypes, to be able to easily calculate the minimum and maximum values that an expression might yield (e.g., we need to know how much storage is needed when we declare an unconstrained object of a type which has an array component with a discriminant-dependent index constraint).
>
> I'm not sure why we "need" that. Maybe we'd "want" that, but:
>
> - Why "minimum"?
>
> - This is only needed/wanted for defaulted discrims. We already have different rules (e.g. tagged types forbid defaulted discrims).
>
> - A conservative maximum can always be calculated -- e.g. 2**56-1 bytes on x86-64 (which will always raise S_E on alloc-the-max implementations).

>> We once considered (I forget when) allowing a very limited expansion of the 3.8(12/3) rule. Essentially, only D + C and D - C (C being a static expression) were suggested.
> For what it's worth, those are the cases I've run into in practice. In particular "Some_Array(0..Length_Discrim-1);". But it's a rather kludgy restriction, reminiscent of the restrictions on array indexing expressions in early FORTRAN.

Yes. A better solution would seem to be the one that you proposed above, and that I also echoed in my reply to Randy's first mail: let the implementation compute the max size correctly when it is able to (e.g. discriminant "+" or "-" static data), and when it is not able to, assume the worst, throw a warning, and crash at run-time.

***************************************************************

From: Randy Brukardt
Sent: Thursday, September 3, 2015  3:11 PM

> So if I get your point, the main problem with allowing broader use of discriminants is that it could break max-size-based implementation techniques, which themselves are required to implement mutable types without heap allocation.
>
> But as you have also shown above, these techniques are already trivially broken by using a "wrong" discriminant type to define array bounds, and consequently can only be applied to a limited subset of the mutable types allowed by the Ada standard anyway. So what would change in the end?
>
> Couldn't an implementation that doesn't support heap allocation just equate an array index of nontrivial value to the mythical universal_integer, throw a warning about objects of an unbounded mutable type causing problems, and then die at runtime the way GNAT did when I exposed it to your code snippet above?

The problem with this is that almost all compilers solely support a max-size implementation. (Janus/Ada is the only one that I know of that does not do that.) What would the point be of providing an expansive new feature that actually works in extremely limited circumstances? That would be lying to users. Moreover, it would make discriminated types even less portable than they currently are (with many reasonable definitions getting rejected at runtime).

Now, I personally think that implementations should never have been allowed to use solely a max-size implementation. If someone used restriction No_Implicit_Heap or used a similar type-related aspect (which should be defined), then the type should be rejected (which is far more useful than getting Storage_Error at some later point). Those restrictions specifically invoke implementation-dependent behavior, so that seems fine. But otherwise failing to implement a reasonable type declaration simply should not be an option.

But I've been a lone voice in the wilderness on this one, and I don't expect that to change at this late date.

...

> I guess a max-size implementation could do it like this then:
>
> * Try to put a reasonable upper bound on mutable object size using trivial heuristics (e.g. for raw discriminant usage "+" or "-" something known statically, where the computation is easy).
>
> * In case of failure, assume the size of the object can be anything from the conceptual universal_integer'First to universal_integer'Last (and beyond!), and throw a compiler warning about mutable object usage possibly causing errors at runtime, the way GNAT does.
>
> * In the special case where the array bound values are provided by an expression function, use the type of the result to put less pessimistic bounds on the mutable object's size. Possibly throw a warning anyway if the range remains gigantic.
I don't believe that an implementation should have the option to mishandle the implementation of reasonable types. That's a serious portability problem, as every implementation will do that differently.

> > [...]
> >> By the way, if you can remember identifiers or keywords for older Ada Issues that I can read in order to get more familiar with this matter, I would be happy to peruse them.
> > I'd look for the paragraph number in the AIs and minutes; that probably won't find everything, but at least it would provide a start.
>
> Unfortunately, trying it with "3.8(12", which is the least specific I can do before being swamped with irrelevant junk, didn't yield many results. Mostly AIs about the RM sometimes violating this rule, typo corrections, and things like that.
>
> Well, guess this will have to do.

I suspect that this thread (including Steve and Bob's messages) covered most of the problems introduced by an expansion of the rules. So you probably wouldn't find much different elsewhere.

***************************************************************

From: Bob Duff
Sent: Thursday, September 3, 2015  3:33 PM

> > - Why "minimum"?
>
> For a discriminant-dependent low bound in an index constraint.

Ah, yes, of course. I think I have used that feature approximately zero times in my life (not counting compiler-test programs).

> > This is only needed/wanted for defaulted discrims. We already have different rules (e.g. tagged types forbid defaulted discrims).
>
> If we change the rules to allow
>
>    subtype Index is Integer range 0 .. Some_Dynamic_Expression;
>
>    type T1 (D1 : Index) is record
>       T1_Stuff : String (1 .. Some_Complicated_But_Static_Function (D1));
>       -- legal because D1 has no default value
>    end record;
>
> then what is the component size of the following array type?
>
>    type T2 (D2 : Index := 0) is record
>       T2_Stuff : T1 (D1 => D2);
>    end record;
>
>    type T2_Vector is array (Character) of T2;

I guess part of your point is that if we allow the above, we might as well allow a default on T1 as well. And I guess the answer to the "component size" question is approximately Integer'Last*Storage_Unit bits, likely leading to Storage_Error.

> I'm taking it as a given that we want to preserve "allocate max size" as a viable implementation option for mutable records.
>
> Some set of rules could be concocted to allow the cases we know how to implement while disallowing things like the above example; I'm just saying that the rule to capture this distinction isn't completely obvious.

Agreed, but the language doesn't necessarily need to make things illegal -- we can count on GNAT to raise S_E and warn about it. (By the way, I did some experiments, and discovered that the GNAT warnings are conservative -- it warns about S_E in some cases where it ends up allocating 100 or so bytes. Probably because the warning comes from the front end, and the record layout is done by the back end.)

***************************************************************

From: Bob Duff
Sent: Thursday, September 3, 2015  3:44 PM

> The problem with this is that almost all compilers solely support a max-size implementation. (Janus/Ada is the only one that I know of that does not do that.)

I think the Rational compiler also uses heap allocation. And the old Alsys compiler did too, except it was buggy (a renaming could become a dangling pointer).

> But I've been a lone voice in the wilderness on this one, and I don't expect that to change at this late date.
Not quite "lone"; I think Steve agrees with you. And I'm sort of ambivalent. *************************************************************** From: Hadrien Grasland Sent: Thursday, September 3, 2015 4:30 PM >> Couldn't an implementation that doesn't support heap allocation just >> equate an array index of nontrivial value to the mythical >> universal_integer, throw a warning about objects of an unbounded >> mutable type causing problems, and then die at runtime the way GNAT >> did when I exposed it to your code snippet above ? > The problem with this is that almost all compilers solely support a > max-size implementation. (Janus/Ada is the only one that I know of > that does not do > that.) What would be point be of providing an expansive new feature > that actually works in extremely limited circumstances? That would be > lying to users. Moreover, it would make discriminated types even less > portable than they currently are (with many reasonable definitions > getting rejected at runtime). My point was that for mutable discriminated types, the harm is already done. You have shown yourself that something as trivial as an array whose bounds are parametrized by a Natural will violently implode at runtime under most implementations using the max-size technique, yet work just fine on a couple of others. As of today, mutable discriminated types are unportable because most implementations do not do a very good job at implementing them. My understanding is that the only way to fix this (long-term, Ada 202x-ish) would be to... 1/ Somehow amend the Ada standard so that it rejects pure max-size implementations that cannot use the heap when it is the only sensible option 2/ Add a restriction so that users who want no heap allocation can still get that behavior, and make the use of mutable types illegal when that restriction is in effect It actually seems that we agree on this one, when you write... > Now, I personally think that implementations should never have been > allowed to use solely a max-size implementation. If someone used > restriction No_Implicit_Heap or used a similar type-related aspect > (which should be defined), then the type should be rejected (which is > far more useful than getting Storage_Error at some later point). Those > restrictions specifically invoke implementation-dependent behavior, so > that seems fine. But otherwise failing to implement a reasonable type > declaration simply should not be an option. > > But I've been a lone voice in the wilderness on this one, and I don't > expect that to change at this late date. ...so I guess I can join you in the wilderness if you like ;-) > I don't believe that an implementation should have the option to > mishandle the implementation of reasonable types. That's a serious > portability problem, as every implementation will do that differently. But again, isn't the harm already done on this one ? And, more interestingly, should it be considered a separate defect of the Ada standard that so many implementations, which are considered valid, can implement discriminated types in a fashion that makes some of them, even very simple ones, largely unportable ? To say it otherwise, is the standard or the implementation at fault in this case ? *************************************************************** From: Randy Brukardt Sent: Thursday, September 3, 2015 4:55 PM ... 
> And, more interestingly, should it be considered a separate defect of the Ada standard that so many implementations, which are considered valid, can implement discriminated types in a fashion that makes some of them, even very simple ones, largely unportable? To put it another way, is the standard or the implementation at fault in this case?

It's not a defect of the Standard, since capacity limits are left to the implementers. An implementation that only allowed one component in each record would meet the requirements of the Standard, but obviously wouldn't be very useful. The Standard can't say anything about "goodness" of an implementation, since trying to specify limits is a never-ending game of whack-a-mole (and it would be problematical in an existing Standard, as we wouldn't want to force existing implementations to change most things).

As Steve Baird points out in a private mail, the real fault here is that the ACATS permits this. My recollection is that that was an explicit decision - tests were removed from an early version of the ACVC (the predecessor of the ACATS) that used Integer discriminants in this way. (But I could be misremembering; that was a very long time ago -- mid-1980s.)

While the ACATS cannot test "goodness" of an implementation either, it does effectively set a minimum standard, as it is a pain in the neck to dispute tests and use modified grading to avoid implementation limits. An implementation that couldn't handle multiple record components couldn't compile large parts of the ACATS, so it is necessary to make the implementation limits allow compilation of the ACATS. That at least provides a minimum level of "goodness".

The problem here is that trying to introduce such a test would lead to a large fight. On top of which, I think that the aspect Contiguous_Representation (or something like that) would need to be introduced so that users could force max-size behavior if they need it on a per-type basis (and that would be good for portability in any case). [Restriction No_Implicit_Heap_Allocations would do the trick for program-wide requirements.] So I wouldn't try to use the ACATS to introduce such a requirement without at least some ARG support.

***************************************************************

From: Hadrien Grasland
Sent: Monday, September 7, 2015  4:56 AM

So, if I try to summarize, we seem to be discussing two separate, but related, issues here.

1/ Currently, the applicability of discriminants is limited in unnatural ways, due to the restrictions brought forth by paragraph 3.8(12/3) of the RM. The rules do not even allow the specification of an array by its index offset and size, for example, which seems unnecessarily draconian. Thus, it would be very nice if these restrictions could be lifted to allow for more general expressions based on discriminants in record types. Two problems prevent this from happening:

- Not every expression is suitable for use inside of a discriminated record definition. We would like to have an expression with no side effects, to ensure reproducibility of computation results. The global annotations being proposed as part of AI12-0079-1 could probably help with this when it comes to function calls (see the sketch after this list).

- Complex expressions, especially those including calls to non-intrinsic functions, make it hard to impossible for implementations to deduce reasonable bounds on the size of discriminated types. Such bounds are, in turn, necessary to the implementation of mutable discriminated types in heap-free environments.
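[Editor's note: A sketch of how a global annotation in the style of AI12-0079-1 (since adopted in Ada 2022 as the Global aspect) could mark a bound computation as side-effect-free; the function is illustrative:

   function Padded_Length (D : Natural) return Natural
     with Global => null;   -- reads and writes no global state

Such a function yields the same result for the same discriminant value on every call, which is the reproducibility property needed by the first point above. It does not, by itself, address the bounds-deduction problem of the second point.]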
A way to address both of these problems is to restrict the kinds of expressions allowed inside of discriminated types to certain frequently needed scenarios, such as the application of integer offsets ("+" and "-") to integral discriminants. That works, provided that the "+" and "-" operators are implemented without side effects (...we have to think about the worst case, right?). But as Bob Duff points out, it is kludgy, so a more general solution would be preferable if possible.

2/ In general, current heap-free implementations of mutable discriminated types are kludgy. It is possible to break them with trivial code examples, such as the one given by Randy Brukardt of a string type with Natural bounds. Arguably, a better option would be to make the functionality depend on the availability of heap allocation for such types, and introduce a restriction No_Implicit_Heap (or similar), along with possibly type-specific aspects, to prevent the use of heap-allocated types. But that would be, in itself, opening a new can of worms.

Would you agree with this summary, or am I missing something important there?

***************************************************************

From: Tucker Taft
Sent: Monday, September 7, 2015  8:51 AM

> A way to address both of these problems is to restrict the kinds of expressions allowed inside of discriminated types to certain frequently needed scenarios, such as the application of integer offsets ("+" and "-") to integral discriminants. That works, provided that the "+" and "-" operators are implemented without side effects (...we have to think about the worst case, right?). But as Bob Duff points out, it is kludgy, so a more general solution would be preferable if possible.

Perhaps, but I believe this is the way to go, if we go in this direction at all. But note that we do not need to restrict it this much. We could include "*", "/", "**" and perhaps a few others, so long as the range of possible values can be determined statically without an exhaustive search through all possible values of the discriminant.

> 2/ In general, current heap-free implementations of mutable discriminated types are kludgy. It is possible to break them with trivial code examples, such as the one given by Randy Brukardt of a string type with Natural bounds. Arguably, a better option would be to make the functionality depend on the availability of heap allocation for such types, and introduce a restriction No_Implicit_Heap (or similar), along with possibly type-specific aspects, to prevent the use of heap-allocated types. But that would be, in itself, opening a new can of worms.

This is asking for trouble, in my view. I don't think we should ever require the use of a heap for mutable discriminated types.

***************************************************************

From: Bob Duff
Sent: Monday, September 7, 2015  8:54 AM

> Would you agree with this summary, or am I missing something important there?

Good summary.

I am inclined to oppose any changes in this area. It's not sufficiently broken, and I'm sure there are other more important changes to make.

I could be convinced to allow the "Discrim+1" case and similar. But I am strongly opposed to any change that would require heap allocation. Too much implementation difficulty (for some implementations) for too little gain. If you want growable things, use Containers.Vectors or similar.

By the way, the restriction No_Implicit_Heap_Allocations already exists. I don't know what "implicit" means; the RM doesn't define it.
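[Editor's note: For reference, No_Implicit_Heap_Allocations is defined in RM D.7 and, like other restrictions, applies partition-wide:

   pragma Restrictions (No_Implicit_Heap_Allocations);

Under it, operations that would implicitly require heap storage allocation by the implementation are disallowed, which is roughly the program-wide half of what the summary above asks for.]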
***************************************************************

From: Jean-Pierre Rosen
Sent: Monday, September 7, 2015  9:37 AM

It doesn't define "heap" either...

***************************************************************

From: Bob Duff
Sent: Monday, September 7, 2015  12:15 PM

True. But I know what "heap allocation" means; at least I think I do. I'm not so sure about what makes such an allocation "implicit".

***************************************************************

From: Simon Wright
Sent: Monday, September 7, 2015  12:38 PM

"Not explicit", I suppose; i.e. without the use of the keyword "new". This usage of "implicit" isn't given by dictionary.com, for example, but (as "unstated") it does appear in the thesaurus.

***************************************************************

From: Bob Duff
Sent: Monday, September 7, 2015  2:32 PM

For sure, "new" is an explicit heap allocation. But what about "Append" (in Ada.Containers.Vectors)? Everybody knows it does heap allocation. And I can point to the line of code with "Append", and say, "See that? That right there is doing a heap allocation." Sounds kind of explicit. But there's no "new" in the code (no fair peeking at the body of Append; it might not even be written in Ada).

***************************************************************

From: Simon Wright
Sent: Tuesday, September 8, 2015  4:14 PM

Doesn't sound explicit to me.

It seems that GNAT implements the restriction by forbidding "new" while compiling the RTS. So if I were to copy Ada.Containers.Vectors outside the RTS (i.e. outside the Ada hierarchy), I could use it in a restricted environment without problems. The Ravenscar Guide (from which the "implicit" wording derives) doesn't seem to consider Ada.Containers as part of "the implementation".

***************************************************************

From: Hadrien Grasland
Sent: Wednesday, September 9, 2015  1:28 PM

>> Would you agree with this summary, or am I missing something important there?
>
> Good summary.
>
> I am inclined to oppose any changes in this area. It's not sufficiently broken, and I'm sure there are other more important changes to make.
>
> I could be convinced to allow the "Discrim+1" case and similar. But I am strongly opposed to any change that would require heap allocation. Too much implementation difficulty (for some implementations) for too little gain. If you want growable things, use Containers.Vectors or similar.
>
> By the way, the restriction No_Implicit_Heap_Allocations already exists. I don't know what "implicit" means; the RM doesn't define it.

In the end, I guess I will have to agree with you and Tucker Taft. It sounds like the only changes that could be added without either...

1/ An unstated requirement for heap allocation

2/ More annoying distinctions between constrained and mutable discriminated types

...would feel like too little, too late, in comparison to other pending issues. And the lack of forced heap allocation is also something I like about Ada, considering I plan to use this language in my next attempt at OSdeving.

Thanks for this very thoughtful discussion anyway, everyone!

***************************************************************