!standard 7.5(8.1/2)                                07-10-24  AC95-00149/01
!class confirmation 07-10-24
!status received no action 07-10-24
!status received 07-09-26
!subject Out of range array aggregates and build-in-place
!summary

!appendix

!topic Out-of-range array aggregates and build-in-place
!reference 7.5(8.1)
!from Adam Beneschan 07-09-26
!discussion

In the following program:

    package Pak1 is
        Count : Integer := 0;
        type Rec is record
            N : Integer;
        end record;
        type Rec_Arr is array (natural range <>) of Rec;
        function Func return Rec;
    end Pak1;

    package body Pak1 is
        function Func return Rec is
        begin
            Count := Count + 1;
            return (N => Count);
        end Func;
    end Pak1;

    with Pak1;
    with Text_IO;
    procedure Test570 is
    begin
        begin
            declare
                A : Pak1.Rec_Arr (1 .. 20) := (1 .. 1000 => Pak1.Func);
            begin
                null;
            end;
        exception
            when Constraint_Error => null;
        end;
        Text_IO.Put_Line (Integer'Image (Pak1.Count));
    end Test570;

The semantics of the declaration of A is that an anonymous array
object with bounds 1..1000 is created, Func is called 1000 times to
set up the array, and then the code attempts to convert this object to
the subtype Rec_Arr(1..20) and then gets a constraint error.  The
program will output 1000.

Now suppose we make Rec a limited type, so that no anonymous object
can be created.  What value of Count is output?  Is an implementation
that outputs either 0 or 20 (or maybe 21) for Count a legal
implementation?

If the initialization expression had been a call to a function that
returns an unconstrained array, and the function attempted to return
an array with bounds 1..1000, the Implementation Permission in
6.5(24) would allow the implementation to raise Constraint_Error at
the point of the call as soon as it was determined that the constraint
would be violated.  But I don't see any such permission for
initialization expressions that are aggregates.

Intuitively, I would think that for this aggregate:

    A : Pak1.Rec_Arr (1 ..
        20 => Pak1.Func);

the implementation would call Pak1.Func effectively using A(I) as an
"out" parameter for each I in 1..20; obviously this won't work when
the aggregate is too big, if Func is expected to be called 1000 times.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, September 26, 2007  5:56 PM

....
> The semantics of the declaration of A is that an anonymous array
> object with bounds 1..1000 is created, Func is called 1000 times to
> set up the array, and then the code attempts to convert this object to
> the subtype Rec_Arr(1..20) and then gets a constraint error.  The
> program will output 1000.
>
> Now suppose we make Rec a limited type, so that no anonymous object
> can be created.  What value of Count is output?  Is an implementation
> that outputs either 0 or 20 (or maybe 21) for Count a legal
> implementation?

I think so, and I think that is true for the non-limited case as well.
That's because of the optimization permissions of 11.6(6). The
Constraint_Error here is surely a language-defined check, and the
permission allows the expression to be moved forward (or backwards) so
long as the same handler will be used. (Which is surely the case for a
single statement or declaration.) In particular, the "external
interactions" don't need to match the canonical semantics, meaning that
you can't reason at all about the number of calls to the function.
(OTOH, if no exception is raised, the number of calls is well-defined.)

We need the special permission for functions because the exception
would then be raised within the function (and that is different
exception handler). In hindsight, I wonder if we really did need that
permission at all. Well, let's not answer questions not asked... :-)

Of course, this all depends on my understanding of 11.6, and we all
know that *no one* understands 11.6 :-), so I might be totally off
base...

****************************************************************

From: Randy Brukardt
Sent: Wednesday, September 26, 2007  8:27 PM

Stephen Baird asserts privately:

> Build-in-place should be implementable without relying on 11.6.

That doesn't make much sense. 11.6 is a bunch of exceptions to the
canonical semantics; build-in-place is a set of exceptions to the
canonical semantics. Why is one set of exceptions better than any
other?

What you need to prevent trouble with Adam's example is a permission to
raise the exception early. (That's what the permission for functions is
about.) The language already has it. Having two permissions with the
same effect is bad language design!

To start saying that all language permissions are not created equal is
pretty silly. You could argue that 11.6 itself is bad language design,
but I would counter that build-in-place itself is also bad language
design. Both warp the canonical semantics for a particular end. It's
not worth putting lipstick on a pig.

The reason I was in favor of including the permission for functions is
to make it crystal clear that raising the exception is allowed, given
that it is logically in a different scope (possibly separately
compiled) than the original exception (it's only inside of the function
that you can tell that there is a problem). That's not necessary in the
aggregate case, as everything is in the same place.

I realize that someone has suggested that build-in-place be given first
class language status. I don't believe that is possible without a
massive rewrite of the standard, because you are going to be
continually running into examples like Adam's. That's why it wasn't
done in the first place. I think the time spent on such a rewrite is
better spent on other Ada topics, rather than making a bad situation
worse.

****************************************************************

From: Tucker Taft
Sent: Wednesday, September 26, 2007  8:58 PM

I agree with Randy.
Our compiler generates "raise Constraint_Error;" for this, followed by
some (dead) code that would do bad things if it were executed.

In general in our compiler, if there is an applicable index constraint
(what we more generally call a "target subtype") for an array
aggregate, even if it doesn't have an "others" association, we check
the actual bounds against the applicable index constraint before we
start evaluating component expressions.

It might be nice if that were explicitly permitted by RM 4.3.*, but as
Randy points out, 11.6 allows this kind of reordering generally.

****************************************************************

From: Pascal Leroy
Sent: Thursday, September 27, 2007  4:42 AM

> Stephen Baird asserts privately:
>
> > Build-in-place should be implementable without relying on 11.6.

I am with Steve here.  It should be possible for a compiler writer to
skip the sections titled Implementation Permissions and still know how
to implement the language correctly.

> That doesn't make much sense. 11.6 is a bunch of exceptions
> to the canonical semantics; build-in-place is a set of
> exceptions to the canonical semantics. Why is one set of
> exceptions better than any other?

Oh boy, that's confused!  In case you didn't notice, 11.6 is a bunch of
*Implementation Permissions*.  7.5(8.1/2) is an *Implementation
Requirement*.  Surely I would assume that a requirement has precedence
over a permission.  In other words, a requirement is absolutely not an
exception to the canonical semantics: if you don't comply with it you
don't have a correct compiler.

> To start saying that all language permissions are not created
> equal is pretty silly.

This statement is true, but then saying that a requirement is on the
same footing as a permission is also quite silly.

> The reason I was in favor of including the permission for
> functions is to make it crystal clear that raising the
> exception is allowed, given that it is logically in a
> different scope (possibly separately compiled) than the
> original exception (it's only inside of the function that you
> can tell that there is a problem). That's not necessary in
> the aggregate case, as everything is in the same place.

I think that Adam's example shows that, when it comes to
build-in-place, whenever we have a rule regarding functions we should
have a similar rule regarding aggregates.  So there should be somewhere
(probably in 4.3) an equivalent of 6.5(24/2).

> I realize that someone has suggested that build-in-place be
> given first class language status.

Yes, and that someone believes that this is the only sane way to deal
with the stream of oddities and pathologies that Adam and Steve have
been producing over the last few months.

> I don't believe that is
> possible without a massive rewrite of the standard

FUD, FUD, FUD!

> That's why it wasn't done in the first place.

That's not my recollection.  At least that's not how I perceived it at
the time.  My view was that we had to go with an Implementation
Requirement because it was not really testable and had no effect on the
high-level semantics of the language.  Both assertions turned out to be
untrue: witness the fact that someone wrote ACATS tests to check that
you actually build in place.

****************************************************************

From: Edmond Schonberg
Sent: Thursday, September 27, 2007  7:56 AM

> Oh boy, that's confused!  In case you didn't notice, 11.6 is a bunch of
> *Implementation Permissions*.  7.5(8.1/2) is an *Implementation
> Requirement*.  Surely I would assume that a requirement has precedence
> over a permission.  In other words, a requirement is absolutely not an
> exception to the canonical semantics: if you don't comply with it you
> don't have a correct compiler.
>

Didn't Randy mention 11.6 just to answer Adam's question about the
indeterminacy of the output?  The program may print 0, 20, or 21,
depending on when the constraint violation is detected and how the
code for the aggregate is generated. If the compiler recognizes
statically that the bounds of the aggregate violate the constraint,
it is certainly legal to replace the whole aggregate with a raise, in
which case the program outputs zero (that's what happens with GNAT,
and apparently with Tuck's compiler as well). If the aggregate is
expanded into a loop then 20 and 21 are possible outputs. As long as
this expansion does not generate a temporary this obeys 7.5(8.1/2),
so where is the problem?

****************************************************************

From: Tucker Taft
Sent: Thursday, September 27, 2007  8:09 AM

I sympathize with Steve and Pascal, but I agree with Randy that 11.6
takes precedence over *all* rules in the language.  They are written
explicitly in a way that says *the rest of the standard* defines the
"canonical semantics" and then 11.6 is applied over and above those
canonical semantics.

I don't buy the argument that implementation permissions are somehow
"lower" in the pecking order than implementation requirements.  I don't
think you can create a pecking order among paragraphs based on the kind
of paragraph.  All normative paragraphs are of equal weight in general.
It is only the words *within* the paragraph that can establish a
"pecking order" and words like "notwithstanding what it says elsewhere"
or the kind of explicit discussion present in 11.6 are examples of
that.

However, I agree with Pascal that we ought to at least *try* to define
build-in-place formally.

However, for this particular check, I think the issue existed before
Ada 2005, because of the frequency with which you have a "target
subtype" for an aggregate (or more specifically an "applicable index
constraint" for an array aggregate), and that it should be legal to
perform the constraint checks on the bounds before evaluating the
individual component expressions.  Other parts of the manual go out of
their way to permit "any order" when there are multiple constraint
checks and/or multiple expression evaluations.  It seems we ought to do
something similar here.  And having a symmetry between function calls
and aggregates makes sense.

****************************************************************

From: Robert A. Duff
Sent: Thursday, September 27, 2007  8:10 AM

> > Stephen Baird asserts privately:
> >
> > > Build-in-place should be implementable without relying on 11.6.
>
> I am with Steve here.

I share that sentiment.  But in practical terms, what matters is what
the compiler is allowed/required to do.

IMHO, the natural implementation would be to check the bounds before
evaluating the components of the aggregate, in which case Count will be
0 when done.  So we should allow that (whether limited or nonlimited).

I suppose it's in the "spirit of Ada" to allow pretty much anything
else, too -- that is, any number of components between 0 and 1000 could
be evaluated.

****************************************************************

From: Robert A. Duff
Sent: Thursday, September 27, 2007  9:12 AM

>... If the compiler recognizes statically that the bounds of the
> aggregate violate the constraint, it is certainly legal to replace the whole
> aggregate with a raise, in which case the program outputs zero (that's what
> happens with GNAT, and apparently with Tuck's compiler as well).

Right.  But the compiler doesn't need to recognize anything statically
in order to get zero.
I just checked, and GNAT gets zero even if the bounds of the left-hand
side and the bounds of the aggregate are dynamic, presumably because it
checks the bounds first (which I think is the natural implementation).

My modified version is shown below, and it prints this:

    Constraint_Error raised.
     0

-----

package Pak1 is
    pragma Elaborate_Body;
    Count : Integer := 0;
    type Rec is limited record
        N : Integer;
    end record;
    type Rec_Arr is array (natural range <>) of Rec;
    function Func return Rec;

    Ident_Int_20 : Integer;
    Ident_Int_1000 : Integer;
end Pak1;

package body Pak1 is
    function Func return Rec is
    begin
        Count := Count + 1;
        return (N => Count);
    end Func;
begin
    Ident_Int_20 := 20;
    Ident_Int_1000 := 1000;
end Pak1;

with Pak1;
with Text_IO;
procedure Test570 is
begin
    begin
        declare
            A : Pak1.Rec_Arr (1 .. Pak1.Ident_Int_20) :=
                  (1 .. Pak1.Ident_Int_1000 => Pak1.Func);
        begin
            null;
        end;
    exception
        when Constraint_Error =>
            Text_IO.Put_Line ("Constraint_Error raised.");
    end;
    Text_IO.Put_Line (Integer'Image (Pak1.Count));
end Test570;

****************************************************************

From: Robert A. Duff
Sent: Thursday, September 27, 2007  9:16 AM

> However, I agree with Pascal that we ought to
> at least *try* to define build-in-place formally.

I think something like this might be a good approach: the return object
is created, and at a certain point it "becomes" the target object.
Atomically with respect to abort, of course.  ;-)

Furthermore, if the target object has constrained nominal subtype, the
C_E can happen "early".

And as Pascal pointed out, aggregates and functions are similar, and
deserve similar wording.

****************************************************************

From: Adam Beneschan
Sent: Thursday, September 27, 2007  10:24 AM

> Didn't Randy mention 11.6 just to answer Adam's question about the
> indeterminacy of the output?
> The program may print 0, 20, or 21,
> depending on when the constraint violation is detected and how the
> code for the aggregate is generated. If the compiler recognizes
> statically that the bounds of the aggregate violate the constraint,
> it is certainly legal to replace the whole aggregate with a raise, in
> which case the program outputs zero (that's what happens with GNAT,
> and apparently with Tuck's compiler as well). If the aggregate is
> expanded into a loop then 20 and 21 are possible outputs. As long as
> this expansion does not generate a temporary this obeys 7.5(8.1/2),
> so where is the problem?

I framed my question in terms of the indeterminacy of the value of
Count, but (as I think some suspected) the value of Count is only part
of the issue; it's more of a concrete symptom of the underlying issue,
which is: in this construct, where Rec_Arr is an array of a limited
type:

    A : Pak1.Rec_Arr (1 .. 20) := (1 .. 1000 => Pak1.Func);

and the RM requires that the aggregate be "built in place", just what
the heck does that mean?

I'm siding with those who think that relying on 11.6 isn't a good
solution, because (1) 11.6(6) is an implementation permission, not a
requirement, so there needs to be a definite answer to the question
"What are the semantics of this declaration for a compiler that does
not invoke the permission of 11.6(6)", even if every existing compiler
would actually take advantage of that permission---and I don't see a
good answer to that question; and (2) 11.6 is entitled "Exceptions and
Optimization", and to me the term "optimization" is about improving
performance---not about turning incorrect programs into correct ones,
which it would be if we said that a compiler would not be correct
unless it took advantage of 11.6(6) in this declaration.

In a previous thread, when I asked a different question about the
effect of this build-in-place requirement, Tucker mentioned that there
was already some perception in the ARG that it's necessary to be more
explicit about what "build-in-place" means; this seems to me to be
another example demonstrating why it's necessary.  I also think that
"build-in-place" should be given first-class language status, and I've
done a little bit of thinking about how the RM could be reworded for
this (mainly because of a nagging feeling that maybe I should be doing
some of the work instead of just harping on the problems :)), but I
haven't gotten too far yet...

****************************************************************

From: Edmond Schonberg
Sent: Thursday, September 27, 2007  10:51 AM

> I framed my question in terms of the indeterminacy of the value of
> Count, but (as I think some suspected) the value of Count is only part
> of the issue; it's more of a concrete symptom of the underlying issue,
> which is: in this construct, where Rec_Arr is an array of a limited
> type:
>
>     A : Pak1.Rec_Arr (1 .. 20) := (1 .. 1000 => Pak1.Func);
>
> and the RM requires that the aggregate be "built in place", just what
> the heck does that mean?

In your original question you said:

   Intuitively, I would think that for this aggregate:

       A : Pak1.Rec_Arr (1 .. 20) := (1 .. 20 => Pak1.Func);

   the implementation would call Pak1.Func effectively using A(I) as an
   "out" parameter for each I in 1..20; obviously this won't work when
   the aggregate is too big, if Func is expected to be called 1000 times.

I think we all share this intuition. If the aggregate is
(1..x => Pak1.Func) and x is non-static, then either we check that
x = 20 before the loop (and raise an exception before any call to
Func), or else the call to A(I) needs to do a check on the value of I
at each step, and we call Func 20 times before raising the exception.
7.5(8.1/2) just says that these individual assignments are NOT to an
anonymous temporary, so they must be "in place", i.e. in object A.
This does not rely on 11.6 to define the semantics (apart from the
indeterminacy of Count). What am I missing?

****************************************************************

From: Jean-Pierre Rosen
Sent: Thursday, September 27, 2007  9:38 AM

> I suppose it's in the "spirit of Ada" to allow pretty much anything else, too
> -- that is, any number of components between 0 and 1000 could be evaluated.
>

Yes, and talking about user's benefit (as opposed to compiler writer
ease), I can't imagine that a non-insane program would benefit from
guaranteed portability (i.e. same number of evaluations whatever the
compiler) in this case...

****************************************************************

From: Pascal Leroy
Sent: Thursday, September 27, 2007  11:51 AM

> I think we all share this intuition... What am I missing?

I believe that you are missing the fact that this intuition is not
supported by any normative wording in the RM.  In the non-limited case,
you *have*to* call the function 1000 times to build the aggregate and
then you do the check.  In the limited case, there is no wording that
tells you that you don't really have to call the function 1000 times.
In other words, we require build-in-place, but what if there is no
place to build (part of) the object?

****************************************************************

From: Randy Brukardt
Sent: Thursday, September 27, 2007  12:59 PM

...
> > That doesn't make much sense. 11.6 is a bunch of exceptions
> > to the canonical semantics; build-in-place is a set of
> > exceptions to the canonical semantics. Why is one set of
> > exceptions better than any other?
>
> Oh boy, that's confused!  In case you didn't notice, 11.6 is a bunch of
> *Implementation Permissions*.  7.5(8.1/2) is an *Implementation
> Requirement*.  Surely I would assume that a requirement has precedence
> over a permission.
> In other words, a requirement is absolutely not an
> exception to the canonical semantics: if you don't comply with it you
> don't have a correct compiler.

It is a requirement to implement a permission; it does not change the
canonical semantics at all. Both 11.6 and build-in-place are mods to
the canonical semantics, and as such have a lesser status - but beyond
that, it doesn't make sense to give "goodness" rankings to RM rules.

...
> > I realize that someone has suggested that build-in-place be
> > given first class language status.
>
> Yes, and that someone believes that this is the only sane way to deal with
> the stream of oddities and pathologies that Adam and Steve have been
> producing over the last few months.
>
> > I don't believe that is
> > possible without a massive rewrite of the standard
>
> FUD, FUD, FUD!

Not FUD, fact.

The underlying problem here is that build-in-place is not assignment,
and describing it as such is going to be forever error prone.
Assignment and initialization are two different things. But to change
that properly means rewriting every paragraph where "assignment" is
used to describe initialization.

Moreover, the majority of these anomalies can be resolved simply by
looking at the canonical semantics (that everything is copied); they
don't require additional wording to resolve. But every last one will
require wording if the canonical semantics is changed to include
build-in-place, and I'd be very surprised if we've uncovered them all
already (my guess would be that we've only seen a few of them - call
that FUD if you like, but it is based on years of experience handling
these sorts of issues).

A simple half-baked change here surely won't help any; we'd still have
a never-ending stream of issues and AIs to fix them. If we really are
going to try to fix this this way, we have to commit to a full-fledged
study of the effects of the change on the entire RM.
Moreover, only a few ARG members are really capable of handling a
change of this sort, and I don't think any of them really can spend the
time now (you are moving and changing jobs; I'm supposed to be building
an ACATS; Tucker has major ASIS projects; there are a few others that
are capable of writing such a massive language change, but none of them
have done that sort of project in years). And in any case I'm going to
be stuck with the job of rewriting all of the AARM paragraphs. Yikes!

...
> > That's why it wasn't done in the first place.
>
> That's not my recollection.  At least that's not how I perceived it at the
> time.  My view was that we had to go with an Implementation Requirement
> because it was not really testable and had no effect on the high-level
> semantics of the language.  Both assertions turned out to be untrue:
> witness the fact that someone wrote ACATS tests to check that you actually
> build in place.

It originally was a Corrigendum change, and it was done in the easiest
way possible. But the only alternative is a massive change: at least 15
paragraphs would need to be rewritten in AARM 7.6 alone. We're again in
Corrigendum mode, and I find it hard to justify such a change.

Perhaps we're painted into a corner here vis-a-vis the standard, but in
any case, this is not important - I find it hard to believe that any
implementer will not understand what is expected; and no one cares
whether an exception is raised a bit early. There surely aren't going
to be any ACATS tests on such minutiae. The canonical semantics is
clear, but 11.6 makes that untestable in any case. So this is a
low-priority issue at best.

****************************************************************

From: Randy Brukardt
Sent: Thursday, September 27, 2007  1:07 PM

> > I think we all share this intuition... What am I missing?
>
> I believe that you are missing the fact that this intuition is not supported
> by any normative wording in the RM.
> In the non-limited case, you *have*to*
> call the function 1000 times to build the aggregate and then you do the
> check.  In the limited case, there is no wording that tells you that you
> don't really have to call the function 1000 times.  In other words, we
> require build-in-place, but what if there is no place to build (part of)
> the object?

No, you can't have it both ways. Either you only consider the canonical
semantics (which does not include build-in-place), or you consider the
entire semantics (which includes 11.6). There is never any point at
which either set of semantics is inconsistent.

Besides, the entire question has nothing to do with a build-in-place
requirement. The question here is whether you ever have to build an
aggregate that isn't going to fit, and the answer is no, because of
11.6. We don't have a specific permission to reorder here, because
there is a blanket one. Why do we need to complicate the standard with
extra permissions?

****************************************************************
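
The two expansions discussed in this thread (checking the aggregate's
bounds before evaluating any component, versus checking each index while
the components are evaluated directly into the target) can be modelled
with a small hand-written sketch. It is only an illustration under stated
assumptions: Integer components stand in for the limited type Rec so that
ordinary assignment can play the role of building in place, the names
Aggregate_Order_Sketch, Check_Bounds_First and Check_Each_Index are
hypothetical, the ranges are assumed non-null, and nothing here is meant
to reflect the code any particular compiler generates.

```ada
with Ada.Text_IO;

procedure Aggregate_Order_Sketch is

   Count : Integer := 0;

   type Int_Arr is array (Natural range <>) of Integer;

   --  Stand-in for the target object A : Rec_Arr (1 .. 20).
   A : Int_Arr (1 .. 20) := (others => 0);

   --  Stand-in for Pak1.Func: every call bumps the counter.
   function Func return Integer is
   begin
      Count := Count + 1;
      return Count;
   end Func;

   --  Expansion 1 (assumed): compare the length of Lo .. Hi with the
   --  target before any component expression is evaluated.
   procedure Check_Bounds_First
     (Target : in out Int_Arr; Lo, Hi : Integer) is
   begin
      if Hi - Lo + 1 /= Target'Length then
         raise Constraint_Error;
      end if;
      for I in Lo .. Hi loop
         Target (Target'First + (I - Lo)) := Func;
      end loop;
   end Check_Bounds_First;

   --  Expansion 2 (assumed): loop over Lo .. Hi, checking each target
   --  index just before the component is evaluated into that slot.
   procedure Check_Each_Index
     (Target : in out Int_Arr; Lo, Hi : Integer) is
      Slot : Integer;
   begin
      for I in Lo .. Hi loop
         Slot := Target'First + (I - Lo);
         if Slot not in Target'Range then
            raise Constraint_Error;
         end if;
         Target (Slot) := Func;
      end loop;
   end Check_Each_Index;

begin
   Count := 0;
   begin
      Check_Bounds_First (A, 1, 1000);
   exception
      when Constraint_Error => null;
   end;
   Ada.Text_IO.Put_Line
     ("bounds checked first:  " & Integer'Image (Count));

   Count := 0;
   begin
      Check_Each_Index (A, 1, 1000);
   exception
      when Constraint_Error => null;
   end;
   Ada.Text_IO.Put_Line
     ("checked per component: " & Integer'Image (Count));
end Aggregate_Order_Sketch;
```

With a 20-component target and a 1 .. 1000 range, the first procedure
raises Constraint_Error before Func is ever called, so Count stays 0; the
second calls Func 20 times before the index check fails. These correspond
to the outputs 0 and 20 mentioned in the messages above.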