!standard 6.04 (09)                                   09-04-30  AI05-0144-1/03
!class Amendment 09-02-15
!status work item 09-02-15
!status received 09-02-15
!priority High
!difficulty Hard
!subject Detecting dangerous order dependencies

!summary

Define rules to reject expressions with obvious cases of order dependence.

!problem

Ada does not define the order of evaluation of parameters in calls. This
opens up the possibility that function calls with side-effects can be
non-portable. The problem gets worse with "in out" and access parameters,
as correct behavior may depend on a particular order of evaluation.

Arguably, Ada has selected the worst possible solution to evaluation order
dependencies: it allows such dependencies (by not specifying an order of
evaluation), does not detect them in any way, and then says that if you
depend on one (even if by accident), your code will fail at some point in
the future when your compiler changes.

Something should be done about this.

!proposal

(See wording.)

!wording

[I don't know where best to put this, perhaps in 6.4? - RLB]

A name N1 is *known to denote the same object* as another name N2 if:

  * N1 statically denotes a part of a stand-alone object or parameter, and
    N2 statically denotes the same part of the same stand-alone object or
    parameter; or

    [We're assuming that the first bullet covers selected_components, as
    those are always known at compile-time - ED]

  * N1 is a dereference P1.all, N2 is a dereference P2.all, and the prefix
    P1 is known to denote the same object as the prefix P2; or

  * N1 is an indexed_component P1(I1,...), N2 is an indexed_component
    P2(I2,...), the prefix P1 is known to denote the same object as the
    prefix P2, and for each index of the indexed_component, I1 and I2 are
    static expressions with the same value, or I1 and I2 are names that
    are known to denote the same object; or

  * N1 is a slice P1(R1), N2 is a slice P2(R2), the prefix P1 is known to
    denote the same object as the prefix P2, and the subtypes denoted by
    the ranges R1 and R2 statically match.

  AARM Discussion: This is determined statically. If the name contains
  some dynamic portion other than a dereference, indexed_component, or
  slice, it is not "known to denote the same object".

[We could also use the same rules for indexes for the bounds of slices
that have explicit bounds, although it doesn't seem very likely to occur
and the wording is messy.]

A name N1 is *known to denote a prefix of the same object* as another name
N2 if N2 is known to denote the same object as a subcomponent of the
object denoted by N1.

  AARM Reason: This ensures that names Prefix.Comp and Prefix are known to
  denote the same object for the purposes of the rules below. This
  intentionally does not include dereferences; we only want to worry about
  accesses to the same object, and a dereference changes the object in
  question. (There is nothing shared between an access value and the
  object it designates.)
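[For illustration only (not proposed wording): the following sketch uses
hypothetical declarations (Rec, Int_Ptr, R, Arr, P, Q, I, J) to show how
the definitions above would classify some pairs of names; the
classifications in the comments are one reading of the rules as drafted.]

   procedure Same_Object_Examples is
      type Int_Ptr is access all Integer;
      type Rec is record
         A : aliased Integer;
         B : Integer;
      end record;
      R    : Rec;
      Arr  : array (1 .. 10) of Integer := (others => 0);
      P    : Int_Ptr := R.A'Access;
      Q    : Int_Ptr := R.A'Access;
      I, J : Integer := 3;
   begin
      --  Known to denote the same object:
      --     R.A     and R.A      -- same part of the same stand-alone object
      --     P.all   and P.all    -- dereferences whose prefixes are known to
      --                          -- denote the same object
      --     Arr (2) and Arr (2)  -- same prefix, static indexes, equal values
      --     Arr (I) and Arr (I)  -- same prefix, index names known to denote
      --                          -- the same object
      --  R is known to denote a prefix of the same object as R.A.
      --
      --  Not known to denote the same object (the rules are purely static,
      --  even when the names overlap at run time):
      --     R.A     and R.B      -- different parts of the same object
      --     P.all   and Q.all    -- P and Q are distinct stand-alone objects,
      --                          -- even though both designate R.A here
      --     Arr (I) and Arr (J)  -- different index objects
      --     Arr (1) and Arr (2)  -- different static index values
      null;
   end Same_Object_Examples;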
A call C is legal only if:

  * For each name N that is passed to some inner call (not including the
    call C itself) as the actual parameter to a formal in out or out
    parameter, there is no other name anywhere in the expressions of the
    actual parameters of the call, other than the one containing N, that
    is known to denote the same object or is known to denote a prefix of
    the same object; and

  * For each name N that is passed to some inner call (not including the
    call C itself) as the actual parameter of a formal parameter of an
    access type, there is no other name anywhere in the expressions of the
    actual parameters of the call, other than the one containing N, that
    is known to denote the same object as N.all; and

  * For each name N that is passed as the actual parameter to a formal in
    out or out parameter that is of an elementary type, there is no other
    name in the actual parameters corresponding to formal in out or out
    parameters of the call, other than the one containing N, that is known
    to denote the same object or is known to denote a prefix of the same
    object.

For the purpose of checking these rules, an assignment_statement is
considered a call with two parameters, the parameter corresponding to the
source expression having mode in and the parameter corresponding to the
target name having mode out.

  AARM Reason: This prevents obvious cases of dependence on the order of
  evaluation of parameters in expressions. Such dependence is usually a
  bug, and in any case is not portable to another implementation (or even
  another optimization setting).

  The second bullet does not check for uses of the prefix, since the
  access type and the designated object are not the same, and "known to
  denote a prefix of the same object" does not include dereferences
  anyway.

  Note that these rules as a group are symmetrical, in that either name
  can designate an object that is a prefix of the other. If the name N is
  a prefix of some other name in the call, these rules will trigger
  because that prefix would necessarily be known to denote the same
  object. (Nothing in these rules requires the full other name to match;
  any part can match.) On the other hand, we need explicit wording if some
  prefix of N matches some other name in the call.

  These rules do not require checks for most in out parameters in the
  top-level call C, as the rules about evaluation of calls prevent
  problems. Similarly, we do not need checks for short-circuit operations.
  The rules about arbitrary order (see 1.1.4) allow evaluating parameters
  and writing parameters back in an arbitrary order, but not interleaving
  the evaluation of the parameters of one call with the writing back of
  the parameters of another - that would not correspond to any allowed
  sequential order. End AARM Reason.

  AARM Ramification: Note that the first two bullets cannot fail for a
  procedure or entry call alone; there must be at least one function with
  an access, in out, or out parameter called as part of a parameter
  expression of the call in order for them to fail.
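[Again for illustration only (not proposed wording): the sketch below
assumes the *in out* parameters for functions under consideration, and
uses hypothetical subprograms (Bump, Add_One, Swap) and objects (V, W);
the comments give one reading of how the rules above would apply.]

   procedure Legality_Examples is

      V, W : Integer := 0;

      function Bump (X : in out Integer) return Integer is
      begin
         X := X + 1;
         return X;
      end Bump;

      procedure Add_One (Result : out Integer; Addend : in Integer) is
      begin
         Result := Addend + 1;
      end Add_One;

      procedure Swap (A, B : in out Integer) is
         T : constant Integer := A;
      begin
         A := B;
         B := T;
      end Swap;

   begin
      Add_One (Result => V, Addend => W);  -- Legal: distinct objects.
      Add_One (Result => V, Addend => V);  -- Legal: only one writable
                                           -- actual, and no inner call.
      Swap (V, W);                         -- Legal.
      W := Bump (V);                       -- Legal: target and the inner
                                           -- in out actual differ.

      --  The following would be rejected:
      --
      --    Add_One (Result => V, Addend => Bump (V));
      --       First bullet: V is passed to the inner call Bump as an
      --       in out actual, and another name among the call's actuals
      --       (Result => V) is known to denote the same object.
      --
      --    Swap (V, V);
      --       Third bullet: two in out actuals of an elementary type are
      --       known to denote the same object.
      --
      --    V := Bump (V);
      --       An assignment_statement is treated as a call, so the target
      --       V conflicts with the V passed to the inner call Bump.
      null;
   end Legality_Examples;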
!discussion

In order to discuss this topic, we need to look at some examples.

   type A_Rec is {tagged} record
      C : Natural := 0;
   end record;
   -- Note: We're going to talk about this record both as if it is
   -- tagged, and then as if it is not tagged and passed by copy.
   -- Thus the {tagged} here.

   procedure P1 (Input : in A_Rec; Output : in out A_Rec) is
   begin
      Output.C := Output.C + Input.C;
   end P1;

   procedure P2 (Bump1, Bump2 : in out A_Rec) is
   begin
      Bump1.C := Bump1.C + 1;
      Bump2.C := Bump2.C + 2;
   end P2;

   function F1 (Bumpee : access A_Rec) return A_Rec is
   begin
      Bumpee.C := Bumpee.C + 1;
      return (C => Bumpee.C - 1);
   end F1;

   function F2 (Bumpee : in out A_Rec) return A_Rec is
   begin
      Bumpee.C := Bumpee.C + 1;
      return (C => Bumpee.C - 1);
   end F2;

   function F3 (Obj : in A_Rec) return Natural is
   begin
      return Obj.C;
   end F3;

   function F4 (Obj : access A_Rec) return Natural is
   begin
      return Obj.C;
   end F4;

   function F5 (Bump1, Bump2 : access A_Rec) return A_Rec is
   begin
      Bump1.C := Bump1.C + 1;
      Bump2.C := Bump2.C + 2;
      return (C => Bump1.C - 1);
   end F5;

   function F6 (Bump1, Bump2 : in out A_Rec) return A_Rec is
   begin
      Bump1.C := Bump1.C + 1;
      Bump2.C := Bump2.C + 2;
      return (C => Bump1.C - 1);
   end F6;

   function F7 (Bumpee : in out A_Rec; Bumper : in A_Rec) return A_Rec is
   begin
      Bumpee.C := Bumpee.C + Bumper.C;
      return (C => Bumpee.C);
   end F7;

   O1 : {aliased} A_Rec := (C => 1); -- "aliased" only if needed by
                                     -- the particular example below.
   O2, O3 : A_Rec := (C => 1);

   type Acc_A_Rec is access all A_Rec;
   A1 : Acc_A_Rec := O1'access;

The usual concern about the use of *in out* parameters in functions begins
something like: Imagine writing an expression like:

   if F3(O1) = F3(F2(O1)) then

This expression has an evaluation order dependency: if the expression is
evaluated left-to-right, the result is True (both values have (C => 1) and
O1.C is set to 2 afterwards), and if the expression is evaluated
right-to-left, the result is False (the right operand is still (C => 1),
but now the left operand is (C => 2), and O1.C is still 2 afterwards).

This is usually used as a reason to disallow *in out* parameters on
functions. If you have to use access parameters, then the expression is:

   if F3(O1) = F3(F1(O1'access)) then

and the use of 'access and aliased on the declaration of O1 should provide
a red flag about the possible order dependence.

However, this red flag only occurs some of the time. First of all, access
objects are implicitly converted to anonymous access types, so no red flag
is raised when using them:

   if F3(A1.all) = F3(F1(A1)) then

Perhaps the .all on the left-hand argument could be considered a red flag.
But of course that doesn't apply if that function also takes an access
parameter:

   if F4(A1) = F3(F1(A1)) then

We have the same order dependency, but there is no sign of a red flag here.

This is all Ada 95 code, but Ada 2005 makes this situation worse by adding
prefix views and implicit 'access. If A_Rec is tagged, we can write:

   if O1.F3 = O1.F1.F3 then

And since tagged parameters are implicitly aliased, if O1 is a tagged
parameter, there isn't the slightest sign of a red flag here.

This shows that we are already in a very deep pit. One can argue whether
moving the rest of the way to the bottom is that significant. [Thanks to
Pascal Leroy for showing just how bad the current situation is.]

We can show similar problems with procedure calls. Consider:

   P1 (Input => F2(O1), Output => O1);

If O1 is tagged (and thus passed by reference), this will set O1.C to 3 in
either order of evaluation. (In both cases, Input will have (C => 1) and
Output will have (C => 2).)
But if O1 is not tagged (and thus passed by copy), this will set O1.C to 3
if evaluated left-to-right [F2 sets O1.C to 2, then passes (C => 1) to
Input; Output is passed (C => 2); and then that is summed to (C => 3)],
while O1.C will be 2 if evaluated right-to-left [Output is passed (C => 1);
F2 sets O1.C to 2, then passes (C => 1) to Input; and then that is summed
to (C => 2)].

We can write similar code in Ada 95:

   P1 (Input => F1(A1), Output => A1.all);

getting the same order dependence.

We can also get an order dependence from a single call (even in Ada 83):

   P2 (O1, O1);

but only if there are multiple parameters that can modify an object, and
interestingly, only if the parameters are passed by copy. (If A_Rec is
tagged, for example, the value of O1.C will be increased by 3, no matter
what order the parameters are evaluated in.)

That means that the Ada 95 call:

   P1 (Input => F5 (A1, A1), Output => O1);

cannot have an order dependence, but *in out* parameters:

   P1 (Input => F6 (O1, O1), Output => O2);

could, but only if A_Rec is passed by copy.

Note that a single call with only one modifiable parameter cannot have an
order dependence:

   P1 (Input => O1, Output => O1);

will always end up with the same result, no matter what order the
parameters are evaluated in. (That result could depend on the parameter
passing mode, but that is controllable by using parameters of types that
are by-copy or by-reference, and in any case is not the problem we are
discussing here.) The parameters will be evaluated before the call; for
by-copy parameters they'll both have the value (C => 1), and the result
value of (C => 2) will be written after the call. No part of the
expression will use the modified value after it is modified, so there
cannot be a dependence.

Similarly,

   P1 (Input => F7 (O1, O1), Output => O2);

does not have an order dependence, as again, no part of the expression
could depend on the modified value of O1.

Questions to answer:

Should the rules (whatever they are) apply to all calls or just to
functions with *in out* parameters? The latter position clearly is
completely compatible, but it obviously leaves many cases of order
dependence undetected.

Another issue is whether the rules should apply to all writable parameters
(that is, all *in out* and *out* parameters, and all access-to-variable
parameters), or just to writable by-copy parameters (which have to be *in
out* or *out*). Clearly, problems can happen in any of these cases. But
Ada has lived with the by-reference cases for decades, and adding *in out*
parameters to functions doesn't change anything here (for a by-reference
parameter, an *in out* parameter is equivalent to an access parameter of
the same type). As noted in the discussion above, the real problems occur
when new objects are created by intermediate expressions. "By-copy" in
this context still means any parameter that *might* be passed by copy: any
type that is not known to be a by-reference type (clearly including
untagged private types). It also has to include return objects of all
types.

Note that these two issues are not completely independent of each other:
limiting checks to by-copy parameters also limits the added
incompatibility (and the likelihood that the rules are really preventing
errors).

Survey of solutions

It is fairly obvious that order dependencies are a problem in Ada, and
they have been getting worse with each version of the language (even
without *in out* parameters in functions).
Moreover, it has been used as the primary reason for leaving something
natural and useful (*in out* parameters for functions) out of the
language.

We could do nothing, saying that since we're already nearly at the bottom
of the pit, moving down two inches to the bottom will not have any
practical effect. It surely would be easier than trying to cling to the
side of the pit! But perhaps we can do better.

It should be noted immediately that (almost) anything we could do could
not detect order dependencies that are caused by side-effects *inside* of
functions. Rules enforced on a call (or syntax, or whatever) can only
prevent problems caused by side-effects visible to the call.

We could limit *in out* parameters for functions to by-reference types
(effectively to only tagged types). That clearly will not introduce any
*new* problems, as shown by the examples that start out this section. But
arguably it would make them more likely, and in any event, it would leave
all of the existing nasty cases unchecked.

One obvious solution would be to define the order of evaluation,
eliminating the problem at the source. Java, for instance, requires
left-to-right evaluation. However, that would encourage tricky code like
the various examples by making it portable. Moreover, Ada compilers have
been using this flexibility for decades; trying to remove it from
compilers (particularly from optimizers) could be very difficult. Note
that this is the only solution that actually could eliminate dependencies
on side-effects inside of functions. But defining the order of evaluation
was considered for both Ada 83 and Ada 95 and was deemed not worth it --
it's hard to see what has changed.

Another option would be to increase the visibility of parameters with
side-effects. This sounds somewhat appealing (after all, it seems to be
the basis on which access parameters are deemed OK and *in out* parameters
are not). One possibility would be to add ordering symbols to named
notation: Param <- for an *in* parameter (this includes access); Param ->
for an *out* parameter; and Param <-> for an *in out* parameter. [It has
been pointed out that a very early version of Ada used ":=", "=:", and
":=:" for this purpose. The author finds those too weird to contemplate,
and in any case they don't change the argument.]

However, for compatibility, old code that doesn't use the symbols would
have to be allowed. That's especially bad because the symbol for ordinary
named parameters (=>) looks like the symbol for an *out* parameter, while
the parameter usually will be an *in* parameter. Moreover, this solution
does nothing for positional parameters in calls nor for the prefixes of
prefix notation. And it is misleading for access parameters, whose mode is
officially "in", but which still might cause side-effects. One could argue
that positional parameters are already unsafe and that requiring named
notation to be safe is not much of an imposition. But the prefix and
access issues are not so easily explained away. Additionally, putting the
mode into calls in some way makes maintenance of programs harder: changing
the mode of a parameter is going to make many calls illegal, while today
most calls will remain legal (all will if the mode is changed from
"in out" to "in").

Another syntax suggestion that was made recently was to (optionally)
include the parameter mode as part of the call.
That would look something like:

   if F3(in O1) = F3(in F2(in out O1)) then

This could be applied to positional calls as well, but it still provides
no help for prefix calls nor for access parameters. One could imagine
requiring the syntax for calls of functions with *in out* parameters and
making it optional elsewhere. That might placate *in out* parameter
opponents, but otherwise it doesn't seem to do much for the language.
Defining a Restriction here might help with new programs by enforcing
style (although Restrictions really are terrible for enforcing style
alone, since they also apply to the runtime, which probably was built with
a completely different style guide), but even that would not do much to
help existing code.

Finally, we come to some sort of legality rules and/or runtime checks for
preventing such order dependencies. It is important to note that making
such rules too simple (and strong) would only mean that temporaries have
to be introduced in some expressions. That would be an annoyance, but
surely not as bad as the current ticking time-bomb.

The easiest option is to blame all of the problems on functions with *in
out* parameters and make them stand alone. The rule would be something
like:

   A call of a function with an *in out* parameter must be the only call
   in an expression.

That would mean that

   if F2(O1) then

would be legal (assuming F2 returned type Boolean), but

   if F2(O1) = (C => 1) then

would not be. Obviously, this is too strict. Amazingly, it is also not
strict enough:

   Some_Array(F3(O1)) := F2(O1);

would be allowed. (An assignment statement is *not* an expression!) So,
correcting some of the deficiencies:

   A call of a function with an *in out* parameter must be the only call
   in an expression or statement;

   If a call of a function with an *in out* parameter is the source
   expression of an assignment_statement, the target variable_name shall
   not include a call, an indexed component, a slice, or a dereference.

This would allow assigning to temporaries and record components (which
can't be computed) and not much else. Of course, this is completely
compatible with Ada 95 and later; but it doesn't do anything to detect the
existing cases of problems. Maybe that's not important.

In order to have an order dependence, there have to be two (or more) uses
of a single object within an expression (or statement). But of course that
object could be aliased via parameter passing or access values, a part of
a larger object, or computed (as in an array component). In addition, one
of the uses of the object has to be modified (that is, it is passed as an
*in out* or *out* parameter, or it is passed as the designated object of
an access type parameter). Finally, one of the following has to be true:

  * in the smallest call (subexpression) that contains both uses of the
    object (that is, the call where the order dependence can occur), the
    use that modifies the object is not directly a parameter of that call;
    or

  * both uses are parameters to the same call, both uses could modify the
    object, and neither parameter is required to be passed by reference.

The first bullet is shown by a procedure call like (we're using the
declarations at the head of this section again):

   P1 (Input => (C => F3(O1)), Output => O1);

Since the use that modifies O1 is in the top-level procedure call, it
won't be modified until that call is underway. The other parameter(s) will
have been evaluated by then.
The second bullet provides the only exception to the first bullet; it
covers cases with two parameters, both of which can modify the object, and
at least one of which could be passed by copy.

Turning this into rules is not hard, except for defining "a single
object". The easiest way to do that is to simply use type information:

   Two objects are considered to be "potentially the same object" if one
   has the type of a part of the other.

[Remember that 'part' includes the type itself. The only parts considered
for this rule are those that are visible at the point of the call, as is
typical for legality rules.]

   A call is legal only if:

   For each name N that is passed to some inner call (not including the
   call itself) as the actual parameter to a formal in out or out
   parameter, or is passed as the designated object of a formal parameter
   of an access type, there is no other name in the expressions of the
   actual parameters of the call other than the one containing N that
   denotes potentially the same object; and

   for each name N that is passed as the actual parameter to a formal in
   out or out parameter that is not of a by-reference type, there is no
   other name in the actual parameters corresponding to formal in out or
   out parameters of the call other than the one containing N that denotes
   potentially the same object.

   For the purpose of checking this rule, an assignment_statement is
   considered a call with two parameters, the source parameter having
   mode in.

This is clearly too strict, as calls with obviously different objects
would be illegal. For instance,

   P1 (Input => F2(O2), Output => O1);

fails this proposed rule. That would annoy programmers intensely, as it is
crystal-clear that there is no conflict in this call. Moreover, this could
interfere with using temporaries to break up expressions that otherwise
would violate the rules. So this rule needs improvement.

There is also a problem with private types. If a private type has an
aliased component, it is possible for an access type (presumably returned
from some operation on the private type) to designate that component. But
the fact that the private object has such a part of the appropriate type
would not be known at the point of the call. That could happen, for
instance, in a container accessor:

   A1 := Accessor(A_Rec_Container);
   PX (A_Rec_Container, F2 (A1.all));

A1.all designates part of A_Rec_Container, and (if we are trying to catch
all such cases) the call to PX should be illegal.

So we could modify the definition of "potentially the same object" to:

   Two objects are considered to be "potentially the same object" if one
   has the type of a part of the other, or one has a part whose type is a
   partial view, unless:

   * one object is part of a stand-alone object, and the other object is
     part of a different stand-alone object;

     [Since there are no dereferences here (they're never stand-alone
     objects), we don't have to worry about private types, because the
     problem cases all involve dereferences designating aliased
     components. And different objects are otherwise disjoint.]

   * one object is part of a stand-alone object SO with no parts that are
     aliased or have types that are partial views, and the other object is
     not part of SO;

     [Here we can see that the first object has no aliased parts or
     private types that could hide aliased parts; in that case, we only
     care that the second object is not part of the same object.]
   * one object is a parameter of an elementary type, and the other object
     is not that parameter or a rename thereof;

     [We don't say "by-copy type" here, as that ignores privacy, which
     would be wrong for legality rules. A by-copy parameter can't be
     aliased and cannot represent any other object, so it is never the
     same as some other object.]

   [In the following cases, we can ignore private types; a type can never
   be directly a subcomponent of itself, and O2 cannot be a private type
   or we wouldn't be able to see the subcomponent. We also don't have to
   worry about different stand-alone objects, as those are covered by the
   first bullet.]

   * the type of object O1 is that of a (visible) subcomponent of object
     O2, object O1 is a non-aliased part of a stand-alone object, and
     object O2 is not part of the same stand-alone object;

     [We can only get in trouble here if O2 is part of the same
     stand-alone object. O2 can be in a storage pool, or any other
     stand-alone object, as we don't have to worry about aliasing.]

   * the type of object O1 is that of a (visible) subcomponent of object
     O2, object O2 is an aliased part of a stand-alone object, and object
     O1 is a part of a dereference of a pool-specific access type;

     [Here we have to worry about accesses to the subcomponent of O2; we
     don't want O1 to be a dereference that represents that subcomponent.
     Different stand-alone objects are covered in the first bullet.]

   * the type of object O1 is that of a (visible) subcomponent of object
     O2, object O2 is a non-aliased stand-alone object, and object O1 is
     not part of the same stand-alone object;

     [Note that we don't talk about object O2 as being a part. That gets
     messy when private types are taken into account, because we can't
     necessarily see whether the objects have an aliased component of the
     right type. (And even when we can, the wording is messy.) Otherwise,
     this is similar to the second bullet.]

   * the type of object O1 is that of a (visible) subcomponent of object
     O2, object O2 is an aliased part of a stand-alone object, and object
     O1 is a part of a different stand-alone object or a part of a
     dereference of a pool-specific access type;

     [Here we have to worry about accesses to the subcomponent of O2; we
     don't want O1 to be a dereference that represents that subcomponent.]

Essentially the idea here is that if the two objects are different
stand-alone objects, we don't care about their types; if one of the
objects is a stand-alone object and nothing is hidden or aliased, then the
other object can be anything other than the same object. If we can see the
relationship between the objects, we can then allow more (such as
pool-specific dereferences, which can't designate stand-alone objects).
Private types are considered to conflict with all other types.

This gets rid of all of the critical problems; private types can't hide
aliasing, and temporaries will always work to break up expressions.
However, these rules still have some annoying effects:

  * Parameters are not stand-alone objects, so parameters are always
    treated as if they could be the same object, if there is any
    possibility. We do except by-copy parameters, as they can't be aliased
    and they are effectively new objects.

  * Changing a type from a visible type to a private type potentially
    could make some calls illegal. The above rules assume the worst about
    private types, while as long as the type is visible, the compiler can
    actually verify that there is no problem. That seems inherent in this
    model.
  * There is no attempt to differentiate individual components that have
    the same type. This can appear for records:

       P (Complex.Real, F(Complex.Img));

    would appear to conflict and thus be illegal. But this is much more
    likely to cause problems for arrays:

       P (Arr(1), F(Arr(2)));

    The reader can easily tell that these array components aren't ever
    going to be the same, but the compiler can't with the given rules. Of
    course, if 1 and 2 were replaced by functions, then these should be
    treated as overlapping.

The component cases could be handled with more complex rules which could
determine if the components are different parts of the same object. In
particular, array components with static index expressions can be allowed.
However, the increasing complexity of these rules has led the author to
quit at this point. Moreover, they still seem to be too incompatible; this
is most acute when looking at a routine like Swap_Integers (with two in
out parameters of type Integer):

   Swap_Integers (Arr(I), Arr(J));

This would be illegal by any conceivable set of "complete" rules, since we
cannot know the values of I and J, and the parameters are not
by-reference. As such, this call would not be allowed, but that is a
likely incompatibility.

We could break the "completeness" by only checking function calls, but
that doesn't make much sense. There is little semantic difference between
functions and procedures, and programmers switch between them all the
time. Preventing a problem for function calls but not preventing the same
problem for procedures would be bizarre.

Finally, A Proposed Solution

A better solution is to create rules that only try to detect the
"low-hanging fruit" - that is, cases where there clearly is a problem.
This would be preferable anyway; it would reduce the frustration caused by
the rules (compared to rules like the accessibility rules, for which the
checks generally have no effect except to get in the way of what you need
to do -- to the point that Ada provides an attribute to ignore them!). It
would still be possible to cause problems, but at least obvious problems
would be prevented, and most illegal calls would have obvious problems.

Such rules would only reject calls where it is clear that parts of the
same object are involved. That eliminates the complications of private
types (we won't look in them), array components (we won't try to determine
if they are the same), and so on. These are the rules proposed in the
!wording section above.

The proposed wording detects arrays indexed by the same value or object
and dereferences of the same access value, as well as uses of the same
object. It does not try to detect other cases of problems. For instance,

   Swap_Integers (Arr(I), Arr(I));

is illegal (and is almost certainly a bug -- one I've written
periodically!), while

   Swap_Integers (Arr(I), Arr(J));

is legal.

The proposed wording only applies to calls. It specifically does not apply
to short-circuit operations, as the order of evaluation of those
operations is language-defined. Thus it is not possible to cause an order
dependence across the parts of a short-circuit form. For instance:

   if F2(O1) = F6(O2, O3) and then F3(O1) = F3(O3) then

does not have an order dependence; the calls to F2 and F6 have to be
evaluated before the calls to F3. Thus the result will be False
(O1.C = 2 /= O3.C = 3). Since "and then" is not a call, the proposed rules
do not apply to it, and the calls to "=" making up the sub-expressions are
checked separately.
Whereas:

   if F2(O1) = F6(O2, O3) and F3(O1) = F3(O3) then

does have an order dependence, and the result of the expression could be
either True or False. Since the evaluation of "and" is a call, the
proposed rules apply to this expression and thus it is illegal.

The proposed rules depend on the fact that while Ada allows parameters to
be evaluated in an arbitrary order, it does not allow interleaving of
parts of the evaluations of those parameters. (See 1.1.4(18); 1.1.4(18.d)
says that it is intended to allow programmers to depend on some
side-effects.) Thus, each call evaluates its parameters and checks their
subtypes (in some arbitrary order), executes the call, and does any
copying back of parameters (in some arbitrary order) without part of any
other call being evaluated in the middle. Of course, an implementation can
reorder operations in any way it likes so long as the result is one of
those allowed by the evaluation rules above. But a compiler cannot start
evaluating other parameters before writing back the results of a call if
the other parameters could depend on those results.

Additional alternatives

One idea that was considered but discarded was to use run-time checks to
deal with the complex cases, such as when private types are (usually)
hiding the fact that checks are not needed, or to verify that array
indexes are different when that is required. The problem is that this adds
a lot of complexity, and while many of the checks could be eliminated,
some probably would remain, adding overhead without much value. (After
all, the program probably would work -- although not portably -- without
the checks.)

None of the proposed rules do anything about side-effects totally inside
of functions. One way to deal with that would be to require an expression
to contain only a single function call unless all of the functions are
*strict*. A strict function exposes all of its side-effects in its
specification, meaning it does not read or write global variables, call
non-strict functions, write hidden parts of parameters, etc. Most of the
language-defined functions are strict (that would have to be declared
somehow). Strict functions have several other nice properties: they don't
need elaboration checking (freezing is sufficient to prevent
access-before-elaboration problems); they can be used in symbolic
reductions; and they can use the optimizations allowed for functions in
pure packages. However, such a requirement would be quite incompatible.
Moreover, strict functions would be rather limiting by themselves.

An alternative that has been suggested is to allow Global_In and
Global_Out annotations on subprograms, which would declare the global
side-effects of a subprogram. Such annotations could not be a lie (they'd
have to be checked in some way), and thus would fill the role of strict
functions more flexibly. But it would still be too incompatible to ban
dangerous side-effects in functions (although separate tools or non-Ada
operating modes could make such checks).

!example

(See discussion.)

!ACATS test

!appendix

!topic Allow parameter modes for actual parameters
!reference RM 6.4
!from Adam Beneschan 09-03-05
!discussion

(Based on a comment in comp.lang.ada by Yannick Duchêne.)

It would be useful for readers to be able to see, at the point of a call,
which parameters the call outputs or modifies, and which ones are just
passed as inputs to it.
I think this is ESPECIALLY the case for function calls; now that we are
considering [IN] OUT parameters for functions, Ada programmers who are
used to function calls having IN parameters, and who see a function call
in the code, may not realize that the function has the side effect of
modifying one of the parameters. This is the sort of thing that is very
easy to miss, especially since function calls can be "buried" inside
larger expressions. Personally, I have this problem when reading code in
Pascal.

The proposal is to alter the definition of parameter_association in
6.4(5):

   parameter_association =>
      [mode] [formal_parameter_selector_name =>] explicit_actual_parameter

with the legality rule (in 6.4.1) that if a "mode" is present, it must
match the mode of the formal parameter (note that the mode of a parameter
defined by an access_definition is IN, by 6.1(18)).

We can argue about whether the [mode] would look better to the left or
right of the formal_parameter_selector_name =>, if both are present; I
think this is just an issue of taste.

Also, I think it's arguable that this mode should be *required* on actual
parameters of function calls for formal parameters of mode OUT or IN OUT,
assuming we allow such parameters. As I mentioned above, the fact that a
function call is going to modify one of its parameters is the sort of
thing that could be easily missed, and that's a possible argument for
making the mode a requirement.

It's a bit unfortunate that this proposal wouldn't apply to the prefix of
a subprogram call given in Object.Operation notation, but I can't think of
a syntax that wouldn't look hokey:

   Page_Number := Get_Page_Number_From_User;
   in out Book_Object.Go_To_Page (Page_Number);  ---???

Yuk. On the other hand, when Object.Operation notation is used it's
probably more obvious from the operation name what's being done with or to
the object, so perhaps an explicit mode isn't quite as useful.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 5, 2009 1:38 PM

> It would be useful for readers to be able to see, at the
> point of a call, which parameters the call outputs or
> modifies, and which ones are just passed as inputs to it.

This topic is covered in the portion of AI05-0144-1 that I've already
written. The net of the discussion there is that it isn't worth doing in
Ada as it stands, for a number of reasons. One of the most important is
the lack of viable syntax -- the syntax of Ada as it stands is exactly
wrong for this. And all of the proposals only help named notation, but the
problem is even more severe for positional notation.

> I think this is ESPECIALLY the case for function calls; now
> that we are considering [IN] OUT parameters for functions,
> Ada programmers who are used to function calls having IN
> parameters, and who see a function call in the code, may not
> realize that the function has the side effect of modifying
> one of the parameters.

It shouldn't be relevant, because the intent is that any call where it
could matter would be illegal. (Unlike side effects in functions that
aren't visible in the contract, we can viably check for side effect
conflicts that *are* visible in the contract.) At least that's the theory.

In any case, parameters to functions often have side-effects, you just
can't see them. (Think of the language-defined random number generator.)
And of course anonymous access parameters can be modified (and they don't
have to be obvious in a call).
A programmer who doesn't at least consider the possibility of parameters
changing is already sunk.

...

> The proposal is to alter the definition of parameter_association in
> 6.4(5):
>
> parameter_association =>
>    [mode] [formal_parameter_selector_name =>]
>       explicit_actual_parameter
>
> with the legality rule (in 6.4.1) that if a "mode" is
> present, it must match the mode of the formal parameter (note
> that the mode of a parameter defined by an access_definition
> is IN, by 6.1(18)).

I didn't consider this particular syntax idea (I concentrated on more
subtle ways, such as the direction of the name arrow), but I don't think
it works very well with typical parameter names. Consider Insert in the
predefined containers:

   Insert (in out Container => My_Vector,
           in Before => 10,
           in New_Item => Some_Value);

"in Before"? Gag! Parameters named "On" are common in some packages.
"in On"??? Yikes!

I had suggested in the AI:

   Insert (Container <-> My_Vector,
           Before <- 10,
           New_Item <- Some_Value);

It also should be noted that either of these schemes would totally hide
anonymous access parameters (as their mode is technically "in"), which are
at least as dangerous as "in out" ones.

In any case, I'm going to file this on the existing AI and I'm sure the
ARG will discuss it.

****************************************************************

From: Adam Beneschan
Sent: Thursday, March 5, 2009 1:59 PM

> It shouldn't be relevant, because the intent is that any call where it
> could matter would be illegal. (Unlike side effects in functions that
> aren't visible in the contract, we can viably check for side effect
> conflicts that *are* visible in the contract.) At least that's the
> theory.
>
> In any case, parameters to functions often have side-effects, you just
> can't see them. (Think the language-defined random number generator.)
> And of course anonymous access parameters can be modified (and they
> don't have to be obvious in a call).

My thinking was that most of the time if you're passing a variable as a
function access parameter, you'd have to pass it as Variable'Access, which
would at least serve as a clue.

> I didn't consider this particular syntax idea (I concentrated on more
> subtle ways, such as the direction of the name arrow), but I don't
> think it works very well with typical parameter names. Consider Insert
> in the predefined containers:
>
> Insert (in out Container => My_Vector,
>         in Before => 10,
>         in New_Item => Some_Value);
>
> "in Before"? Gag! Parameters named "On" are common in some packages.
> "in On"??? Yikes!

Yeah, it would work better if the source is displayed with some sort of
editor that boldfaces reserved words. Anyway, if I did decide to use the
modes on all actual parameters, I'd probably write the above as

   Insert (in out Container => My_Vector,
           in     Before    => 10,
           in     New_Item  => Some_Value);

which probably has a lower gag factor.

> I had suggested in the AI:
>
> Insert (Container <-> My_Vector,
>         Before <- 10,
>         New_Item <- Some_Value);

I don't think this idea will work because it will invalidate any code that
looks something like

   if Error>0.5 or else Error<-0.5 then ...

OK, it would force programmers who have code like this to insert some
spaces and make it more readable, which I guess is good, but I don't think
this incompatibility would make them happy.
****************************************************************

From: Bob Duff
Sent: Thursday, March 5, 2009 1:30 PM

> It would be useful for readers to be able to see, at the point of a
> call, which parameters the call outputs or modifies, and which ones
> are just passed as inputs to it.

This feature existed in an early version of Ada -- circa 1979 or 1980?
The syntax was:

   Mumble (In_Param := 123, Out_Parm =: Blah, In_Out_Param :=: Thing);

if I remember correctly. It only worked for named notation, though.

> I think this is ESPECIALLY the case for function calls; now that we
> are considering [IN] OUT parameters for functions, Ada programmers who
> are used to function calls having IN parameters, and who see a
> function call in the code, may not realize that the function has the
> side effect of modifying one of the parameters. This is the sort of
> thing that is very easy to miss, especially since function calls can
> be "buried" inside larger expressions. Personally, I have this
> problem when reading code in Pascal.
>
> The proposal is to alter the definition of parameter_association in
> 6.4(5):
>
> parameter_association =>
>    [mode] [formal_parameter_selector_name =>] explicit_actual_parameter
>
> with the legality rule (in 6.4.1) that if a "mode" is present, it must
> match the mode of the formal parameter (note that the mode of a
> parameter defined by an access_definition is IN, by 6.1(18)).
>
> We can argue about whether the [mode] would look better to the left or
> right of the formal_parameter_selector_name =>, if both are present; I
> think this is just an issue of taste.
>
> Also, I think it's arguable that this mode should be *required* on
> actual parameters of function calls for formal parameters of mode OUT
> or IN OUT, assuming we allow such parameters.

That's an interesting idea. It might even make the idea of '[in] out'
parameters on functions more palatable to some folks.

I'd also like a configuration pragma or compiler switch that would cause a
warning or error if it's missing on '[in] out' parameters of
non-functions.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 5, 2009 2:20 PM

> My thinking was that most of the time if you're passing a variable as
> a function access parameter, you'd have to pass it as Variable'Access,
> which would at least serve as a clue.

That thinking probably is wrong (any time you pass an access value it is
wrong, it is wrong in prefix notation, and so on). Read the AI, it covers
that in detail.

...

> > I had suggested in the AI:
> >
> > Insert (Container <-> My_Vector,
> >         Before <- 10,
> >         New_Item <- Some_Value);
>
> I don't think this idea will work because it will invalidate any code
> that looks something like
>
> if Error>0.5 or else Error<-0.5 then ...

It surely wouldn't invalidate *this* code, because there is no call that
can have named notation. But I do see your point.

I originally tried with "<=>" and "<=" but the problem there (besides the
conflict with "<=") is that the current arrow "=>" ought to represent
"out". That doesn't work well.

Anyway, if people want to pursue this, they're welcome to come up with
ideas for syntax. This is an idea where the syntax will make it or break
it, so all ideas should be considered.

****************************************************************

From: Jeffrey R. Carter
Sent: Thursday, March 5, 2009 2:13 PM

> I don't think this idea will work because it will invalidate any code
> that looks something like
>
> if Error>0.5 or else Error<-0.5 then ...

Why? That's not a parameter association, so it shouldn't be subject to the
parameter-association rules.

However, such an idea would only work for named parameter association;
repeating the mode would work for positional association as well. So if
we're going to have such a thing, I'd vote for repeating the mode.

****************************************************************

From: Randy Brukardt
Sent: Thursday, March 5, 2009 2:38 PM

...

> However, such an idea would only work for named parameter association;
> repeating the mode would work for positional association as well. So
> if we're going to have such a thing, I'd vote for repeating the mode.

Repeating the mode in positional associations would be incredibly
confusing, since "in" is allowed in expressions.

   Proc (in X, in T); -- A two parameter call.
   Proc (in X in T); -- A one parameter call.

You'd have to pay amazing attention to commas when reading. (The idea
would be a complete non-starter if there was a case where both of these
would be legal, but I don't think it could exist.)

We'd also have to build an entire new parser for our compiler (there is no
chance that syntax like this could be made to go through our current
parser generator).

****************************************************************

From: Bob Duff
Sent: Thursday, March 5, 2009 2:34 PM

> I didn't consider this particular syntax idea (I concentrated on more
> subtle ways, such as the direction of the name arrow), but I don't
> think it works very well with typical parameter names. Consider Insert
> in the predefined containers:
>
> Insert (in out Container => My_Vector,
>         in Before => 10,
>         in New_Item => Some_Value);
>
> "in Before"? Gag! Parameters named "On" are common in some packages.
> "in On"??? Yikes!

Well, I presumed "in" was optional. And I would certainly never use it.
So:

   Insert (in out Container => My_Vector,
           Before => 10,
           New_Item => Some_Value);

or:

   Insert (in out My_Vector, 10, Some_Value);

which aren't so horrible.

****************************************************************

From: Bob Duff
Sent: Thursday, March 5, 2009 2:41 PM

> > I don't think this idea will work because it will invalidate any
> > code that looks something like
> >
> > if Error>0.5 or else Error<-0.5 then ...
>
> Why? That's not a parameter association, so it shouldn't be subject to
> the parameter-association rules.

Because if <- is a lexical element, it has to be a lexical element in all
contexts. We don't have feedback from the parser into the lexer in Ada,
and we don't want to change that fact.

> However, such an idea would only work for named parameter association;
> repeating the mode would work for positional association as well. So
> if we're going to have such a thing, I'd vote for repeating the mode.

Right, whatever the syntax, if it doesn't support positional notation,
then it's pointless.

****************************************************************

From: Adam Beneschan
Sent: Thursday, March 5, 2009 2:59 PM

> Repeating the mode in positional associations would be incredibly
> confusing, since "in" is allowed in expressions.
>
> Proc (in X, in T); -- A two parameter call.
> Proc (in X in T); -- A one parameter call.

FYI, I completely forgot about membership tests when I wrote my earlier
e-mail.
Also, I was definitely thinking more about OUT and IN OUT
parameters---repeating the mode for those is useful to alert the reader to
a side-effect, but less useful for IN parameters.

****************************************************************

From: Adam Beneschan
Sent: Thursday, March 5, 2009 3:00 PM

> > I don't think this idea will work because it will invalidate any
> > code that looks something like
> >
> > if Error>0.5 or else Error<-0.5 then ...
>
> Why? That's not a parameter association, so it shouldn't be subject to
> the parameter-association rules.

Ummm, you want to try writing a lexical analyzer that interprets <- as a
single token inside a subprogram call but two tokens elsewhere? And gets
this case right?

   Arr : array (Boolean) of Integer;
   ...
   N := Arr (Error<-0.5);

Sorry, I think that's asking too much. I don't think there are any
possible ambiguities with any of the current compound delimiters defined
in 2.2---there is no syntax in which any of them could be interpreted as
two single delimiters---but adding this syntax would add one.

I think the only way this could reasonably work is if a language rule were
added to 2.2 saying that <- is interpreted as a compound delimiter
whenever those two characters appear together; otherwise it would be just
too hard to get right, in my opinion.

****************************************************************

From: Jeffrey R. Carter
Sent: Thursday, March 5, 2009 4:02 PM

> Ummm, you want to try writing a lexical analyzer that interprets <- as
> a single token inside a subprogram call but two tokens elsewhere? And
> gets this case right?

No. But I wouldn't let it stop me from designing such a language if I
didn't have to write the compiler :)

As I've said, I don't really like the proposal, so I was merely curious.
There are lots of places where the way things are interpreted depends on
context. But the "in X in T" makes the repeated-mode proposal seem to have
problems, too.

****************************************************************

From: Micronian
Sent: Thursday, March 5, 2009 7:03 PM

Would anyone object to the idea of using brackets? I don't recall them
being used anywhere in the Ada language. It's not the prettiest syntax,
but it looks clear and it is easy to separate from the actual parameters.

   Proc ([in] X, [in] T); -- A two parameter call.
   Proc ([in] X in T); -- A one parameter call.

   Insert ([in out] Container => My_Vector,
           [in] Before => 10,
           [in] New_Item => Some_Value);

****************************************************************

From: Adam Beneschan
Sent: Thursday, March 5, 2009 7:31 PM

The same idea actually occurred to me. Back when Ada was first designed,
there was some criterion that prevented certain characters from being used
(except in comments and string literals), and square brackets were in that
list. However, I suspect that all the old keypunch machines that didn't
have those characters, that the original designers were worried about,
have long ago been turned into scrap metal.

Personally, I'm not opposed to starting to use the forbidden characters
(as long as the resulting code doesn't start looking like C programs); it
would seem odd that we can now use letters from every alphabet in the
world in our identifiers, including Greek and Tamil and ancient Irish
alphabets that haven't been used in perhaps a thousand years, but can't
use square brackets or curly braces in the syntax. But for some reason,
this doesn't seem like the right place to introduce the concept.
****************************************************************

From: Micronian
Sent: Thursday, March 5, 2009 7:47 PM

Well, seeing as how Ada now has the Pi Unicode character, I don't see why
brackets should not be allowed.

****************************************************************

From: Niklas Holsti
Sent: Friday, March 5, 2009 1:58 AM

> Would anyone object to the idea of using brackets?

Yes. While I think the optional indication of actual parameter mode is a
useful addition to Ada, I dislike the bracket proposal.

> It's not the prettiest syntax, but it looks clear and it is easy to
> separate from the actual parameters.
>
> Proc ([in] X, [in] T); -- A two parameter call.
> Proc ([in] X in T); -- A one parameter call.

That looks cluttered to me, and not uniform with the absence of brackets
in the subprogram declarations (OK, optional brackets could be allowed in
declarations, too, but they are still ugly).

To me the problem of mistakenly mixing up Proc (in X, in T) with
Proc (in X in T) is not severe enough to merit the ugliness of brackets.
After all, we already have the similar problem of confusing Proc (X, -3)
and Proc (X -3), where both forms can even be legal.

****************************************************************

From: Dmitry A. Kazakov
Sent: Friday, March 6, 2009 2:45 AM

> It would be useful for readers to be able to see, at the point of a
> call, which parameters the call outputs or modifies, and which ones
> are just passed as inputs to it.

Why would it be useful for readers? The mode is statically checked; there
is nothing the reader should worry about.

> I think this is ESPECIALLY the case for function calls; now that we
> are considering [IN] OUT parameters for functions, Ada programmers who
> are used to function calls having IN parameters, and who see a
> function call in the code, may not realize that the function has the
> side effect of modifying one of the parameters. This is the sort of
> thing that is very easy to miss, especially since function calls can
> be "buried" inside larger expressions.

I am afraid this is a totally wrong idea. If something needs to be done
here, it is to make things statically checkable. For example, functions
with side effects should not be allowed as arguments except in unary
operations. Presence / absence of side-effects could be explicitly stated
in the function declaration. Order of actual parameter evaluation could be
explicitly stated for a subprogram, which in turn would allow calls to
functions with side effects as arguments, etc.

****************************************************************

From: Randy Brukardt
Sent: Friday, March 6, 2009 2:03 PM

For what it's worth, that's the direction that I am pursuing for future
Ada versions. While some sort of mode specification would have been nice,
it's 30 years too late to add it (to be useful, it has to be required).

And, as noted, Ada originally had such a specification and it was dropped
for some reason before Ada 80 came out, which suggests that the idea was
rejected way back then. Not sure what has changed about calls that should
cause a reconsideration.

****************************************************************

From: Micronian
Sent: Friday, March 6, 2009 3:06 AM

Yes, in most cases using Proc(in X, out Y) works. I mainly provided the
bracket idea to guarantee there is no confusion with something like

   Proc(in X in Y)

Personally, I still prefer _not_ having brackets.
For the above case, I probably would make it a habit to put an expression
like "X in Y" in another variable (e.g. Is_Descendent := X in Y) or, if
anything,

   Proc( in (X in Y) )

In the end, it would be nice to have the ability to specify param modes
because I often find myself looking through someone else's code and need
to look back at the spec file to have a better understanding of how data
is being manipulated. But, I understand there are far more
significant/complicated issues that need to be addressed first.

****************************************************************

From: Niklas Holsti
Sent: Friday, March 6, 2009 2:44 PM

> Proc( in (X in Y) )

I, too, would do that, unless the call is written with one parameter per
line, which is my usual style when there are more than one or two
parameters.

> In the end, it would be nice to have the ability to specify param
> modes because I often find myself looking through someone else's code
> and need to look back at the spec file to have a better understanding
> of how data is being manipulated.

Exactly. It is a help to readers, and so in line with the Ada philosophy.

Still, it could also help when writing code; I have made a couple of
errors where I misremembered the inputs and outputs for some call, and
those errors usually were not caught by exceptions at run-time.

> But, I understand there are far more
> significant/complicated issues that need to be addressed first.

Of course. But this is a simple extension that is easy to talk about and
to like or dislike.

****************************************************************

From: Adam Beneschan
Sent: Friday, March 6, 2009 3:01 PM

> Why would it be useful for readers? The mode is statically checked,
> there is nothing the reader should worry about.

We must be talking about two totally different things, since I have no
idea what your last sentence has to do with anything. My point, and I
think Yannick's, was just to add something to help make a program more
self-documenting. If I see something like

   R := ... ;   -- (A)
   if Some_Function (Some_Expression, R) > 0 then ...
   V := R;
   ...

it might not occur to me that V is going to be assigned some value other
than the value R got assigned in (A), because I'm used to thinking of
functions as taking just inputs and producing a value, and I might not
realize that calling Some_Function may change R. This would make it
clearer that R may be modified:

   R := ... ;   -- (A)
   if Some_Function (Some_Expression, IN OUT R) > 0 then ...
   V := R;
   ...

just an aid to help someone reading the code realize what's going on with
R. There are other ways to accomplish this: sometimes selecting an
appropriate parameter name might be enough, and if all else fails you can
add a comment. But this is something that I figured might help and
couldn't hurt because it would be easy to implement.

****************************************************************

From: Niklas Holsti
Sent: Friday, March 6, 2009 3:08 PM

> ... While some sort of mode specification would have been nice, it's
> 30 years too late to add it (to be useful, it has to be required).

I do not agree that it would have to be required to be useful, and thus I
don't think it is too late to add it.

I think the use or non-use of actual-parameter modes would be a question
of programming style, similar to the use or non-use of the predefined
numeric types, or the repetition of subprogram names after the "end".
The usage of actual-parameter modes could be controlled by coding
rules, perhaps enforced by compiler switches (like GNAT formatting
rules) or other source-code analysis tools.

> And, as noted, Ada originally had such a specification and it was
> dropped for some reason before Ada 80 came out, which suggests that
> the idea was rejected way back then. Not sure what has changed about
> calls that should cause a reconsideration.

While I have much respect for the originators of Ada, I don't think
their decisions should be taken as dogma. I agree with Micronian that
this is not a very important issue, but I am in favour of it anyway.
As I said in another message, it could have caught a couple of errors
I have made.

I think actual-parameter modes would be especially valuable during
software maintenance: if the modes of the formal parameters are
changed, this feature could flag all calls that conflict with the new
modes. Existing rules might flag only some of the conflicting calls.
For example, if the mode is changed from "in" to "in out", the
existing rules flag only calls where the actual parameter is a
constant. The existing rules do not flag calls where the actual
parameter is a variable, since both "in" and "in out" are then legal,
but the caller may not expect the call to change this variable and may
therefore malfunction at some later point in its execution.

****************************************************************

From: Randy Brukardt
Sent: Friday, March 6, 2009 5:53 PM

> I do not agree that it would have to be required to be useful, and
> thus I don't think it is too late to add it.

I probably shouldn't have used the word "useful". I had presumed that
this particular syntax would not be implementable using our LALR(1)
parser generator. And I was thinking that if it was *required* there
would not be a problem, but that as optional syntax it couldn't be
implemented.

This turns out to be wrong on two counts. Making the syntax required
wouldn't help: the syntax for calls is shared with type conversions
and array indexing, so there isn't any circumstance where adding
something to positional calls doesn't have to be optional
(effectively). And more importantly, I actually tried this on our
grammar generator and it seems to be happy. (Enforcing the legality
rules would be a massive pain, requiring massive changes to the
structure of calls, but that's still very different than what I was
envisioning.) I really do not believe this result, but given that I
can't find an obvious bug, I have to assume that it is correct.

I still don't see any reason to support such syntax on positional
calls. They're mostly used by the lazy (who would never spend the
extra typing on the modes anyway) and by those who want to use the
symmetry between array indexing and a function call (and you couldn't
use the modes then). Otherwise, you'd use named notation. Is it really
such a hardship to use that if you want to know the mode??

...

> I think actual-parameter modes would be especially valuable during
> software maintenance: if the modes of the formal parameters are
> changed, this feature could flag all calls that conflict with the new
> modes. Existing rules might flag only some of the conflicting calls.
> For example, if the mode is changed from "in" to "in out", the
> existing rules flag only calls where the actual parameter is a
> constant.
> The existing rules do not flag calls where the actual
> parameter is a variable, since both "in" and "in out"
> are then legal, but the caller may not expect the call to change this
> variable and may therefore malfunction at some later point in its
> execution.

One could make this argument in reverse as well: changing from "in
out" to "in" would require changing a lot of calls without any
corresponding benefit. Indeed, we've changed parameter modes in Claw,
knowing that such changes are compatible 98% of the time (constant
Window objects are rare!). Such changes also have been made in the Ada
standard libraries (usually to fix definitional bugs). If this feature
existed and was used, such changes would become totally incompatible
and probably could not be made.

The point here is that adding these modes would impede maintenance as
much as they would help it. That would be OK if they were helping to
detect a lot of bugs, but that's not likely to be the case.

---

One further point: concentrating on the parameter modes is the wrong
thing. If you are concerned about side effects on a parameter because
of a call, you need to worry about:

[1] Parameters of mode "out" and "in out";
[2] Anonymous access parameters (to variables);
[3] In parameters of named access-to-variable types;
[4] In parameters of composite types that have a component of an
    access-to-object type (including the infamous Rosen trick).

(A short illustrative sketch of channels [3] and [4] appears at the
end of this appendix.)

And that's not counting modifications by other access paths (Claw does
this a lot). So knowing the mode is meaningless unless you also know
the type (for scalar types, you only need to worry about [1]). But for
that you still have to look at the spec (or the declaration of the
actual). I suppose the next proposal will be to repeat the formal type
in the call as well...

---

It is interesting that I was neutral on this idea (other than the lack
of good syntax) before this discussion. But now I see why the
"founding father(s)" left it out, and I believe that I am now strongly
opposed to it. So you guys have done a lot of good (just not the good
you had in mind)!

****************************************************************

From: Niklas Holsti
Sent: Saturday, March 7, 2009 1:40 AM

> I probably shouldn't have used the word "useful". I had presumed that
> this particular syntax would not be implementable using our
> LALR(1) parser generator. [...] I actually tried this on our grammar
> generator and it seems to be happy. (Enforcing the legality rules
> would be a massive pain, requiring massive changes to the structure of
> calls, [...]

I understand that the compiler-implementation cost of proposed
extensions to Ada is an important factor in the decision to accept or
reject the extension. As I have little knowledge of Ada compilers, I
have nothing to contribute on this.

> I still don't see any reason to support such syntax on positional
> calls. They're mostly used by the lazy ...

I agree with this deprecation of positional calls in general, and
would not much mind if the proposed actual-parameter mode indications
were allowed only for named-association calls.

>> I think actual-parameter modes would be especially valuable during
>> software maintenance: if the modes of the formal parameters are
>> changed, this feature could flag all calls that conflict with the new
>> modes. Existing rules might flag only some of the conflicting calls.
>> For example, if the mode is changed from "in" to "in out", the
>> existing rules flag only calls where the actual parameter is a
>> constant.
>> The existing rules do not flag calls where the actual
>> parameter is a variable, since both "in" and "in out" are then legal,
>> but the caller may not expect the call to change this variable and
>> may therefore malfunction at some later point in its execution.
>
> One could make this argument in reverse as well: changing from "in
> out" to "in" would require changing a lot of calls without any
> corresponding benefit.

Why would there be no benefit? If a formal parameter mode is changed
from "in out" to "in", surely it is necessary to review all existing
calls: the caller expects the actual parameter to change, and it no
longer changes, so later uses of the (unchanged) actual parameter in
the caller may no longer work. Of course there are other ways to find
all calls of the changed subprogram (simple search, or a
cross-reference listing). Actual-parameter modes are not crucial but
may be helpful.

I agree that there are cases where a mode change clearly has no impact
on calls, and the actual-mode indications would cause extra
maintenance work. For example, if the parameter type is changed from
having value semantics to having reference semantics, an "in out"
parameter mode usually changes to "in" without any change in the
meaning of the calls.

> One further point: concentrating on the parameter modes is the wrong
> thing. If you are concerned about side effects on a parameter because
> of a call, you need to worry about.... [ list of side-effect channels ]

Of course I agree that there are many ways in which a call may have
side effects and that parameter modes are only the surface level. In
critical software, however, the more complex side-effect mechanisms
tend to be frowned on or entirely forbidden. I wonder what the SPARK
people think of this proposal; it seems related to the
information-flow analysis in SPARK.

> It is interesting that I was neutral on this idea (other than the lack
> of good syntax) before this discussion. But now I see why the
> "founding father(s)" left it out, and I believe that I am now strongly
> opposed to it. So you guys have done a lot of good (just not the good
> you had in mind)!

OK. I can't say that you have convinced me, but I'm not rabidly in
favour of the proposal either, so I'm content to let it rest.

****************************************************************

From: Dmitry A. Kazakov
Sent: Saturday, March 7, 2009 2:50 AM

> it might not occur to me that V is going to be assigned some value
> other than the value R got assigned in (A), because I'm used to
> thinking of functions as taking just inputs and producing a value, and
> I might not realize that calling Some_Function may change R. This
> would make it clearer that R may be modified:
>
>    R := ... ;   -- (A)
>
>    if Some_Function (Some_Expression, IN OUT R) > 0 then
>       ...
>    V := R;
>    ...

I don't see where this pattern could be useful. To me it is rather a
poor design, and IN OUT serves as a comment to explain, or excuse, it.
IMO, in out arguments of functions are really required in cases like
I/O, where the state of an argument is implicitly changed. For
example:

   S : Stream;
begin
   Connect (S);
   if Read (S) = "hello" then ...

> just an aid to help someone reading the code realize what's going on
> with R.

I.e. it is a comment.

> There are other ways to accomplish this: sometimes selecting an
> appropriate parameter name might be enough, and if all else fails you
> can add a comment. But this is something that I figured might help
> and couldn't hurt because it would be easy to implement.
As I said, I see no function behind this feature. What does it *add*
to the program other than mere noise/comments?

1. It does not check anything that is not already checked.

2. It imposes a distributed maintenance overhead when some modes are
   changed to compatible ones.

3. It is semantically suspicious because the mode is a part of an
   anonymous subtype. In fact, "in T", "out T", "in out T" are
   constrained subtypes of T, with some operations possibly
   disallowed. If the mode is to be specified on the caller's side
   then, logically, the whole subtype should be. Plus, if you add in,
   out, in out modes, you should also add other "modes" (constraints)
   like 'Class, discriminants, bounds:

      procedure Foo (X : T'Class);
      Foo (X'Class => Object)

      type T (I : Integer) is ...;
      procedure Foo (X : T);
      Foo (X (24) => Object)

      Put_Line (Item (1..20) => Object)

At its core, it is an idea to tell what you, the caller, expect from,
or allow, the callee to do with the argument. The type of the argument
is a description of what is allowed. The reason the idea is bad is
that the function you call is itself an operation of the argument's
type. Thus it is just a tautology in a strongly typed language. I hope
Ada is still one. When it is intended as something more than a
tautology, then it would become an extremely dangerous thing known as
a "cast".

****************************************************************

From: Georg Bauhaus
Sent: Monday, March 9, 2009 5:34 AM

> This would make it clearer that R may be modified:
>
>    R := ... ;   -- (A)
>
>    if Some_Function (Some_Expression, ! R) > 0 then
>       ...
>    V := R;
>    ...

Isn't there just one case warranting the reader's attention? Namely,
when a parameter may be modified. So make this the only case, and
write

   if Some_Function (Some_Expression, ! R) > 0 then

or

   if ! R.Some_Function(Some_Expression) then

(Not sure whether |R would have to be permitted, too, then.)

****************************************************************
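The following is a minimal sketch illustrating side-effect channels
[3] and [4] from the list in the discussion above. All names here
(Counter, Counter_Access, Bump, Handle, and so on) are hypothetical,
invented purely for illustration; the point is only that an "in"
parameter, by itself, does not tell the reader whether a call can
modify caller-visible state, because the parameter (or one of its
components) can be an access value designating a variable.

   procedure Side_Effect_Channels is

      type Counter is record
         Count : Natural := 0;
      end record;

      type Counter_Access is access all Counter;

      --  Channel [3]: an "in" parameter of a named access-to-variable
      --  type; the callee can modify the designated object.
      procedure Bump (C : in Counter_Access) is
      begin
         C.Count := C.Count + 1;
      end Bump;

      --  Channel [4]: an "in" parameter of a composite type with an
      --  access-to-object component (the Rosen trick works in the
      --  same spirit, via a self-referencing component).
      type Handle is record
         Ref : Counter_Access;
      end record;

      procedure Bump_Via_Handle (H : in Handle) is
      begin
         H.Ref.Count := H.Ref.Count + 1;
      end Bump_Via_Handle;

      Shared : aliased Counter;
      H      : constant Handle := (Ref => Shared'Access);

   begin
      Bump (Shared'Access);   --  Looks like a plain "in" call, yet Shared changes.
      Bump_Via_Handle (H);    --  Likewise; Shared.Count is now 2.
   end Side_Effect_Channels;

In both calls the mode shown at the call site would be "in", so a
reader relying on actual-parameter mode indications alone would still
have to consult the parameter types to know that Shared can change.

****************************************************************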