CVS difference for ai05s/ai05-0144-2.txt

Differences between 1.3 and version 1.4

--- ai05s/ai05-0144-2.txt	2009/11/04 06:26:38	1.3
+++ ai05s/ai05-0144-2.txt	2010/02/25 01:22:32	1.4
@@ -1,4 +1,4 @@
-!standard 6.02 (11)                                 09-10-30  AI05-0144-2/02
+!standard 6.02 (11)                                 10-02-24  AI05-0144-2/03
 !class Amendment 09-06-07
 !status work item 09-06-07
 !status received 09-06-07
@@ -89,18 +89,17 @@
 on the full definition of partial views.
 
 If a call C has two or more parameters of mode in out or out that
-are of a type that is not known to be passed by reference, then
-the call is legal only if:
+are of an elementary type, then the call is legal only if:
 
  *  For each name N that is passed as a parameter of
     mode in out or out to the call C, there is no other name among the
-    other parameters of mode in out or out to C that is known to refer to
+    other parameters of mode in out or out to C that is known to denote the
     same object.
 
-[Editor's note: see the discussion item about compatibility. Also note
-that I changed "denote the same object" to "refer to the same object",
-because this now includes composite types and thus we need the more
-complex matching included in "refer".]
+AARM To Be Honest: This means *visibly* an elementary type; it does not include
+partial views of elementary types (partial views are always composite). That's
+necessary to avoid having Legality Rules depend on the contents of the private
+part.
 
 If a construct C has two or more direct constituents that are names or
 expressions whose evaluation may occur in an arbitrary order, at least
@@ -397,20 +396,14 @@
 on code like this is just a ticking time bomb. So this check will mostly detect
 bugs.
 
-[Editor's note: The expansion of this rule to everything that is not required to
-be passed by-reference will also expand the incompatibility to some cases where
-there is no actual problem - such as large untagged record types, which probably
-are passed by reference by all compilers but are not required to be passed that
-way. Thus the rule we have adopted seems to violate the principle of not rejecting
-safe things. Admittedly, Erhard does not share my feeling that by-reference
-parameters is safe (even though the language semantics is well-defined and all such
-uses are portable). I fear that Erhard's insistence on expanding this applicability
-will eventually cause the entire rule to be dropped -- which would be a massive
-pity.]
+This rule only applies to parameters that are certain to be passed by copy.
+Parameters that are passed by reference have a defined semantics and are not
+(necessarily) a problem. Since we intend to only reject known-to-be dubious
+constructs, including types that *might* be passed by copy would be going too far.
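
The by-copy versus by-reference distinction drawn in this paragraph can be
sketched in Ada as follows. This is only an illustration under stated
assumptions: A_Rec, Bump_Both, and Bump_Ints are hypothetical names invented
here and are not the declarations used elsewhere in the AI.

```ada
with Ada.Text_IO;

procedure Overlap_Demo is

   --  Hypothetical type for illustration. Tagged types are by-reference
   --  types, so both formals of Bump_Both can denote the same object.
   type A_Rec is tagged record
      C : Natural := 0;
   end record;

   procedure Bump_Both (Left, Right : in out A_Rec) is
   begin
      Left.C  := Left.C + 1;
      Right.C := Right.C + 2;
   end Bump_Both;

   --  Same shape, but Integer is elementary and therefore passed by copy.
   procedure Bump_Ints (Left, Right : in out Integer) is
   begin
      Left  := Left + 1;
      Right := Right + 2;
   end Bump_Ints;

   O1   : A_Rec;
   X, Y : Integer := 0;

begin
   --  Composite (by-reference) actuals: outside the scope of the rule.
   Bump_Both (O1, O1);
   Ada.Text_IO.Put_Line ("O1.C =" & Natural'Image (O1.C));

   --  Elementary actuals that denote distinct objects: legal.
   Bump_Ints (X, Y);
   Ada.Text_IO.Put_Line ("X =" & Integer'Image (X));
   Ada.Text_IO.Put_Line ("Y =" & Integer'Image (Y));

   --  Bump_Ints (X, X);  --  the kind of call the rule is meant to reject
end Overlap_Demo;
```

In the by-reference call both updates land on the same object, so O1.C ends
up increased by 3 whatever the evaluation order. If the commented-out call
were allowed, the final value of X would depend on which copy is written
back last, which is the dubious case the rule targets.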
 
 ---
 
-The decision to exclude anonymous access parameters from this cheecking means that
+The decision to exclude anonymous access parameters from this checking means that
 most of the initial examples in fact are still legal (even if insidious). For instance,
 the Ada 95 example from above:
     if F4(A1) = F3(F1(A1)) then
@@ -421,15 +414,16 @@
 dangerous). The main argument used is that functions with access parameters are
 common (as that was the workaround to not having "in out" parameters).
 
+This would also violate our known-to-be dubious rule. Such access parameters are
+problems only if they are dereferenced, and we cannot know whether the body of
+the subprogram actually does that. Without such a dereference, there is no problem.
+(Of course, it is most likely that they will be dereferenced.)
+
 It is annoying that the existence of that workaround is being used to make it harder
 to convert to "in out" parameters in those functions (as the result might be
 illegal while the original code was not -- even though both are equally dubious),
 but that cannot be helped.
 
-[Editor's note: I'm still dubious about this decision, especially as we seem
-willing to take the incompatibility for the multiple parameter case.]
-
-
 !example
 
 (See discussion.)
@@ -502,7 +496,7 @@
 > !wording
 ...
 
->     AARM Discussion: This is determined statically. If the name 
+>     AARM Discussion: This is determined statically. If the name
 > contains
                        ^^^^
 >     some dynamic portion other than a dereference, indexed_component, or
@@ -516,14 +510,14 @@
 keep it here, and spell it out: "Whether or not names or prefixes are known
 to denote the same object is determined statically. ..."
 
-> Two names N1 and N2 are *known to refer to the same object* if N1 and 
-> N2 are known to denote the same object, or if N1 is known to denote a 
+> Two names N1 and N2 are *known to refer to the same object* if N1 and
+> N2 are known to denote the same object, or if N1 is known to denote a
 > subcomponent of the object denoted by N2, or vice-versa.
 ...
 
-> If a construct C has two or more direct constituents that are names or 
-> expressions whose evaluation may occur in an arbitrary order, at least 
-> one of which contains a function call with an in out, out, or 
+> If a construct C has two or more direct constituents that are names or
+> expressions whose evaluation may occur in an arbitrary order, at least
+> one of which contains a function call with an in out, out, or
 > access-to-variable parameter, then the construct is legal only if:
 
 "access-to-variable parameter" seems confusing; I think you mean it to include
@@ -535,33 +529,33 @@
 >     itself), there is no other name anywhere within a direct constituent
 >     of the construct C other than the one containing C2, that is known
 >     to refer to the same object; and
->     
->  *  For each name N'Access or N'Unchecked_Access that is passed as an 
+>
+>  *  For each name N'Access or N'Unchecked_Access that is passed as an
 >     access-to-variable parameter to some inner function call C2 (not
 
 >     including the construct C itself), there is no other name anywhere
 >     within a direct constitutent of the construct C other than the one
 >     containing C2, that is known to refer to the same object as N.
->     
+>
 > For the purposes of checking this rule on an array aggreagate, an
                                                      aggregate
 
 
-> expression associated with a discrete_choice_list that has two or more 
-> discrete choices, or that has a nonstatic range, is considered as two 
-> or more separate occurrences of the expression.  Similarly for a 
-> record aggregate, the expression of a record_component_association is 
+> expression associated with a discrete_choice_list that has two or more
+> discrete choices, or that has a nonstatic range, is considered as two
+> or more separate occurrences of the expression.  Similarly for a
+> record aggregate, the expression of a record_component_association is
 > considered to occur once for each associated component.
-> 
+>
 > AARM Reason: This prevents obvious cases of dependence on the order of
                ^^^^
 Another dangling "this".
 
-> evaluation of names or expressions. Such dependence is usually a bug, 
-> and in any case, is not portable to another implementation (or even 
+> evaluation of names or expressions. Such dependence is usually a bug,
+> and in any case, is not portable to another implementation (or even
 > another optimization setting).
-> 
-> The third bullet does not check for uses of the prefix, since the 
+>
+> The third bullet does not check for uses of the prefix, since the
 > access type
       ^^^^^^^^^^^^
 
@@ -570,11 +564,11 @@
 [Editor's note: The one Tucker deleted. ;-) He apparently didn't update
 these notes.]
 
-> and the designated object are not the same, and "known to denote the 
+> and the designated object are not the same, and "known to denote the
 > same prefix" does not include dereferences anyway.
 ...
 
-> The usual concern about the use of *in out* parameters in functions 
+> The usual concern about the use of *in out* parameters in functions
 > begins something
 > like:
 
@@ -584,53 +578,53 @@
 Which is OK with me.
 
 > Imagine writing an expression like:
-> 
+>
 >     if F3(O1) = F3(F2(O1)) then
-> 
-> This expression has an evaluation order dependency: if the expression 
-> is evaluated left-to-right, the result is True (both values have (C => 
-> 1) and O1.C is set to 2 afterwards), and if the expression is 
-> evaluated right-to-left, the result is False (the right operand is 
+>
+> This expression has an evaluation order dependency: if the expression
+> is evaluated left-to-right, the result is True (both values have (C =>
+> 1) and O1.C is set to 2 afterwards), and if the expression is
+> evaluated right-to-left, the result is False (the right operand is
 > still (C => 1), but now the left operand is (C => 2), and O1.C is still 2 afterwards).
-> 
+>
 > This is usually used as a reason to not allow *in out* parameters on functions.
 
 "to not allow" --> "not to allow" or "to disallow"
 
 > If you have to use access parameters, then the expression is:
-> 
+>
 >     if F3(O1) = F3(F1(O1'access)) then
-> 
-> and the use of 'access and aliased on the declaration of O1 should 
+>
+> and the use of 'access and aliased on the declaration of O1 should
 > provide a red flag about the possible order dependence.
-> 
-> However, this red flag only occurs some of the time. First of all, 
-> access objects are implicitly converted to anonymous access types, so 
+>
+> However, this red flag only occurs some of the time. First of all,
+> access objects are implicitly converted to anonymous access types, so
 > no red flag is raised when using them:
-> 
+>
 >     if F3(A1.all) = F3(F1(A1)) then
-> 
-> Perhaps the .all on the left-hand argument could be considered a red 
+>
+> Perhaps the .all on the left-hand argument could be considered a red
 > flag. But of course that doesn't apply if that function also takes an access parameter:
-> 
+>
 >     if F4(A1) = F3(F1(A1)) then
-> 
+>
 > We have the same order dependency, but there is no sign of a red flag here.
-> 
+>
 > This is all Ada 95 code, ...
 
 No, it's not -- I see calls to functions with 'in out' params above.
 
 [I don't see any such calls other than in a single 'straw man' case.]
 
->...but Ada 2005 makes this situation worse by adding prefix  views and 
+>...but Ada 2005 makes this situation worse by adding prefix  views and
 >implicit 'access. If A_Rec is tagged, we can write:
 ...
 > We can also get an order dependence from a single call (even in Ada 83):
-> 
+>
 >     P2 (O1, O1);
-> 
-> but only if there are multiple parameters that can modify an object, 
+>
+> but only if there are multiple parameters that can modify an object,
 > and
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -640,8 +634,8 @@
 as needed, apparently by some magical force. :-) Writing two extra sentences
 to be pedantic about topics such as this doesn't help understanding.]
 
-> interestingly, only if the parameters are passed by-copy. (If A_Rec is 
-> tagged, for example, the value of O1.C will be increased by 3, no 
+> interestingly, only if the parameters are passed by-copy. (If A_Rec is
+> tagged, for example, the value of O1.C will be increased by 3, no
 > matter what order the parameters are evaluated in.)
 ...
 
@@ -649,99 +643,99 @@
 
 It is very useful to have this "Survey" attached to this AI, for posterity!
 
-> It fairly obvious that that order dependencies are a problem in Ada, 
+> It fairly obvious that that order dependencies are a problem in Ada,
 > and
 
 "that that" --> "that"
 
-> have been getting worse with each version of the language (even 
-> without *in out* parameters in functions). Moreover, it has been used 
-> as the primary reason for leaving something natural and useful (*in 
+> have been getting worse with each version of the language (even
+> without *in out* parameters in functions). Moreover, it has been used
+> as the primary reason for leaving something natural and useful (*in
 > out* parameters for functions) out of the language.
 ...
 
-> One obvious solution would be to define the order of evaluation, 
-> eliminating the problem at the source. Java, for instance, requires 
-> left-to-right evaluation. However, that would encourage tricky code 
+> One obvious solution would be to define the order of evaluation,
+> eliminating the problem at the source. Java, for instance, requires
+> left-to-right evaluation. However, that would encourage tricky code
 > like the various examples by making it portable.
 
 I doubt if I'll convince anyone, but I think that's a bogus argument.
 People do, in fact, depend on eval order all the time, either by accident, or because they don't know the language rules.  And there's nothing any language definition can say to stop it.  A language definition can say, "It is considered bad style to depend on evaluation order (of...).
 Don't do that."  That would have just as much effect as leaving the order undefined -- i.e. it puts people on notice, but doesn't entirely stop the problem.
 
->... Moreover, Ada compilers have been using this  flexibility for 
->decades; trying to remove it from compilers (particularly  from 
->optimizers) could be very difficult. Note that this is the only 
+>... Moreover, Ada compilers have been using this  flexibility for
+>decades; trying to remove it from compilers (particularly  from
+>optimizers) could be very difficult. Note that this is the only
 >solution  that actually could eliminate dependencies on side-effects inside of functions.
-> But defining the order of evaluation was considered for both Ada 83 
+> But defining the order of evaluation was considered for both Ada 83
 >and Ada 95  and was deemed not worth it -- it's hard to see what has changed.
-> 
-> Another option would be to increase the visibility of parameters with 
-> side-effects. This sounds somewhat appealing (after all, it seems to 
-> be the basis on which access parameters are deemed OK and *in out* 
+>
+> Another option would be to increase the visibility of parameters with
+> side-effects. This sounds somewhat appealing (after all, it seems to
+> be the basis on which access parameters are deemed OK and *in out*
 > parameters are not). One possibility would be to add ordering symbols to named notation:
-> Param <- <expr> for an *in* parameter (this includes access); Param -> 
+> Param <- <expr> for an *in* parameter (this includes access); Param ->
 > <expr> for an *out* parameter; and Param <-> <expr> for an *in out* parameter.
 
 Ada 80 (or thereabouts) used the symbols :=, =:, and :=: for this.
 I suggest you use this notation (before shooting it down below).
 And eliminate the "That's especially bad..." sentence below.
 
-> However, for compatibility, old code that don't use the symbols would 
-> have to be allowed. That's especially bad because the symbol for 
-> ordinary named parameters (=>) looks like the symbol for an *out* 
-> parameter; while it usually will be an *in* parameter. Moreover, this 
-> solution does nothing for positional parameters in calls nor for the 
-> prefixes of prefix notation. And it is misleading for access 
+> However, for compatibility, old code that don't use the symbols would
+> have to be allowed. That's especially bad because the symbol for
+> ordinary named parameters (=>) looks like the symbol for an *out*
+> parameter; while it usually will be an *in* parameter. Moreover, this
+> solution does nothing for positional parameters in calls nor for the
+> prefixes of prefix notation. And it is misleading for access
 > parameters, whose mode is officially "in", but still might cause side-effects.
-> 
-> One could argue that positional parameters are already unsafe and 
-> requiring named notation to be safe is not much of an imposition. But 
-> the prefix and access issues are not so easily explained away. 
-> Additionally, putting the mode into calls in some way makes 
-> maintenance of programs harder: changing the mode of a call is going 
+>
+> One could argue that positional parameters are already unsafe and
+> requiring named notation to be safe is not much of an imposition. But
+> the prefix and access issues are not so easily explained away.
+> Additionally, putting the mode into calls in some way makes
+> maintenance of programs harder: changing the mode of a call is going
 > to make many calls illegal, while today most calls will remain legal (all will if the mode is changed from "in out" to "in").
 
 I think that last part is bogus -- it's like saying the full coverage rules for aggregates make maintenance harder.
 
-> Another syntax suggestion that was made recently was to (optionally) 
+> Another syntax suggestion that was made recently was to (optionally)
 > include the parameter mode as part of the call. That would look something like:
-> 
+>
 >     if F3(in O1) = F3(in F2(in out O1)) then
 
 Shirley, you'd leave out the 'in's!
 
-> This could be applied to positional calls as well, but still provides 
+> This could be applied to positional calls as well, but still provides
 > no help for prefix calls nor for access parameters.
-> 
-> One could imagine requiring the syntax for calls of functions with *in 
-> out* parameters and making it optional elsewhere. That might placate 
+>
+> One could imagine requiring the syntax for calls of functions with *in
+> out* parameters and making it optional elsewhere. That might placate
 > *in out* parameter opponents, but otherwise doesn't seem to do much for the language.
 
 If we were to do any of the above "marking [in]out params" syntax, we should also define
 a Restriction that forces it on all calls (not 'in' params, of course).
 Worth mentioning, even though we're not going this route.
 
-> Finally, we come to some sort of legality rules and/or runtime checks 
-> for preventing such order dependencies. It is important to note that 
-> making such rules too simple (and strong) only would mean that 
-> temporaries have to be introduced in some expressions. That would be 
+> Finally, we come to some sort of legality rules and/or runtime checks
+> for preventing such order dependencies. It is important to note that
+> making such rules too simple (and strong) only would mean that
+> temporaries have to be introduced in some expressions. That would be
 > an annoyance, but surely not as bad as the current ticking time-bomb.
-> 
-> The easiest option is to blame all of the problems on functions with 
+>
+> The easiest option is to blame all of the problems on functions with
 > *in out* parameters and make them stand alone. The rule would be something like:
-> 
+>
 >    A call of a function with an *in out* parameter must be the only call in
 >    an expression.
-> 
+>
 > That would mean that
-> 
+>
 >      if F2(O1) then
-> 
+>
 > would be legal (assuming F2 returned type Boolean), but
-> 
+>
 >      if F2(O1) = (C => 1) then
-> 
+>
 > would not be. Obviously, this is too strict.
 
 I agree it's too strict, but I don't think it's "obviously" too strict.
@@ -753,9 +747,9 @@
 
 >...Amazingly, it also not strict
 > enough:
-> 
+>
 >      Some_Array(F3(O1)) := F2(O1);
-> 
+>
 > would be allowed. (An assignment statement is *not* an expression!)
 
 Well, then obviously the wording of the rule should not be "expression".
@@ -773,8 +767,8 @@
 ...
 
 > So we could modify the definition of "potentially the same object" to:
-> 
-> Two objects are considered to be "potentially the same object" one has 
+>
+> Two objects are considered to be "potentially the same object" one has
 > the type
                                                                 if
                                                                 **
@@ -782,17 +776,17 @@
 > of a part of the other, or one has a part whose type is a partial view, unless:
 >    * one object is part of a stand-alone object, and the other object is part
 ...
-> Such rules would only reject calls where it is clear that parts of the 
-> same object are involved. That eliminates the complications of private 
+> Such rules would only reject calls where it is clear that parts of the
+> same object are involved. That eliminates the complications of private
 > types (we won't look in them), arrays (we won't try to determine if they are the same), and so on.
-         
+
 "arrays" --> "array components", I think you mean.
 
 > These are the rules proposed in the !wording section above.
 ...
 
-> None of the proposed rules do anything about side-effects totally 
-> inside of functions. One way to deal with that would be to require an 
+> None of the proposed rules do anything about side-effects totally
+> inside of functions. One way to deal with that would be to require an
 > expression to contain only a single function call unless all of the functions are *strict*.
 
 I don't think "strict" is the right term, here.  In computer science, "strict"
@@ -806,24 +800,24 @@
 matter, because we're not doing any of this.]
 
 >... A strict
-> function exposes all of its side effects in its specification, meaning 
->it does  not read or write global variables, call non-strict functions, 
->write hidden parts  of parameters, etc. Most of the language-defined 
+> function exposes all of its side effects in its specification, meaning
+>it does  not read or write global variables, call non-strict functions,
+>write hidden parts  of parameters, etc. Most of the language-defined
 >functions are strict (that would  have to be declared somehow).
-> 
-> Strict functions have several other nice properties: they don't need 
-> elaboration checking (freezing is sufficient to prevent 
-> access-before-elaboration problems); they can be used in symbolic 
+>
+> Strict functions have several other nice properties: they don't need
+> elaboration checking (freezing is sufficient to prevent
+> access-before-elaboration problems); they can be used in symbolic
 > reductions; and they can use the optimizations allowed for functions in pure packages.
-> 
-> However, such a requirement would be quite incompatible. Moreover, 
+>
+> However, such a requirement would be quite incompatible. Moreover,
 > strict functions would be rather limiting by themselves.
-> 
-> An alternative that has been suggested is to allow Global_In and 
-> Global_Out annotations on subprograms, which would declare the global 
-> side-effects of a subprogram. Such annotations could not be a lie 
-> (they'd have to be checked in some way), and thus would fill the role 
-> of strict functions more flexibly. But it would still be too 
+>
+> An alternative that has been suggested is to allow Global_In and
+> Global_Out annotations on subprograms, which would declare the global
+> side-effects of a subprogram. Such annotations could not be a lie
+> (they'd have to be checked in some way), and thus would fill the role
+> of strict functions more flexibly. But it would still be too
 > incompatible to ban dangerous side-effects in functions (although separate tools
 > or non-Ada operating modes could make such checks).
 
@@ -843,5 +837,87 @@
 
 We didn't really review the discussion or examples, but it sounds like they will need
 some work.
+
+****************************************************************
+
+From: Edmond Schonberg
+Sent: Tuesday, February 23, 2010  8:33 PM
+
+I can report on an unfinished experiment concerning the detection of dangerous
+order dependences.
+
+a) I implemented the check on multiple in-out parameters in a procedure, where
+the actuals of an elementary type overlap.  In the  15,000 tests in our test
+suite I found two occurrences of  P (X, X) or near equivalent.  One of them
+(in Matt Heaney's code!) appears harmless. The other one is in a program full of
+other errors, so unimportant. So application of this rule should not break
+anything.
+
+b)  Using the rules proposed in AI-0144-1 I checked for overlap
+between actuals in inner calls, when the formal is an access type
+(didn't implement in-out parameters for functions, there would be no code to
+test :-)).  Here there is a problem, I can't even compile the GNAT library. For
+example, the following innocent code (from
+ada.containers.red_black_trees.generic_operations)  violates the rule:
+
+        if Parent (X) = Left (Parent (Parent (X))) then
+
+Here X is of an access type. Of course Parent is a pure function, but the
+compiler doesn't know that at this point, and it looks like an order dependence.
+So getting rid of the rule about access types (as done in 144-2) seems
+indispensable.
+
+To be continued.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Wednesday, February 24, 2010  4:58 AM
+
+...
+> a) I implemented the check on multiple in-out parameters in a
+> procedure, where the actuals of an elementary type overlap.  In the
+> 15,000 tests in our test suite I found two occurrences of  P (X, X)
+> or near equivalent.  One of them (in Matt Heaney's code!) appears
+> harmless. The other one is in a program full of other errors, so
+> unimportant. So application of this rule should not break anything.
+
+I guess I can tolerate this rule, of course Ed's experiment also shows that it
+is almost certainly useless, so just another case of forcing compiler writers to
+waste time on nonsense.
+
+> b)  Using the rules proposed in AI-0144-1 I checked for overlap
+> between actuals in inner calls, when the formal is an access type
+> (didn't implement in-out parameters for functions, there would be no
+> code to test :-)).  Here there is a problem, I can't even compile the
+> GNAT library. For example, the following innocent code (from
+> ada.containers.red_black_trees.generic_operations)  violates the rule:
+>
+>         if Parent (X) = Left (Parent (Parent (X))) then
+>
+> Here X is of an access type. Of course Parent is a pure function, but
+> the compiler doesn't know that at this point, and it looks like an
+> order dependence.  So getting rid of the rule about access types (as
+> done in 144-2) seems indispensable.
+
+I strongly dislike all this order dependence stuff, but for sure flagging the
+above line would be nonsense.
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Wednesday, February 24, 2010  7:21 PM
+
+For the record, we've already abandoned this particular rule for access types.
+And this example makes it clear that's a good thing.
+
+The currently proposed rule (b) only applies to in out and out parameters of
+functions, so it can't possibly have a compatibility problem. (And Ed's
+experiment shows that rule (a) isn't significantly incompatible.) So I think
+we're good to go on this.
+
+As previously noted, the primary purpose of these rules is to cut the Gordian
+knot of opposition to allowing "in out" parameters on functions. I'll take some
+work for that benefit (that we should have had in Ada 95, IMHO).
 
 ****************************************************************
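
The red-black tree expression reported in the experiment above can be made
concrete with a small sketch. The node type and the bodies of Parent and Left
below are assumptions made for illustration; they are not the GNAT library's
actual declarations. Both functions only read through their access parameter,
so the condition yields the same result in either evaluation order, even
though X appears in both operands.

```ada
with Ada.Text_IO;

procedure Tree_Demo is

   --  Hypothetical node type; the real container code is generic.
   type Node;
   type Node_Access is access Node;
   type Node is record
      Up        : Node_Access;
      Down_Left : Node_Access;
   end record;

   --  Neither function modifies anything reachable from N.
   function Parent (N : Node_Access) return Node_Access is
   begin
      return N.Up;
   end Parent;

   function Left (N : Node_Access) return Node_Access is
   begin
      return N.Down_Left;
   end Left;

   Root  : constant Node_Access := new Node;
   Child : constant Node_Access := new Node;
   X     : constant Node_Access := new Node;

begin
   --  Build Root -> Child -> X, with Child as Root's left child.
   Child.Up       := Root;
   Root.Down_Left := Child;
   X.Up           := Child;

   --  Same shape as the expression quoted from the experiment.
   if Parent (X) = Left (Parent (Parent (X))) then
      Ada.Text_IO.Put_Line ("Parent of X is a left child");
   end if;
end Tree_Demo;
```

A rule that looks only at the parameter profiles cannot tell that Parent and
Left are free of side effects, which is why such code looked like an order
dependence under the access-parameter rule.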
