Version 1.5 of ai05s/ai05-0144-1.txt
!standard 6.04 (09) 09-04-30 AI05-0144-1/03
!class Amendment 09-02-15
!status work item 09-02-15
!status received 09-02-15
!priority High
!difficulty Hard
!subject Detecting dangerous order dependencies
!summary
Define rules to reject expressions with obvious cases of order-dependence.
!problem
Ada does not define the order of evaluation of parameters in calls. This opens up
the possibility that function calls with side-effects can be non-portable. The
problem gets worse with "in out" and access parameters, as proper evaluation
may depend on a particular order of evaluation.
Arguably, Ada has selected the worst possible solution to evaluation order
dependencies: it allows such dependencies (by not specifying an order of evaluation),
does not detect them in any way, and then says that if you depend on one (even if
by accident), your code will fail at some point in the future when your compiler
changes.
Something should be done about this.
!proposal
(See wording.)
!wording
[I don't know where best to put this, perhaps in 6.4? - RLB]
A name N1 is known to denote the same object as another name N2 if:
* N1 statically denotes a part of a stand-alone object or parameter, and
N2 statically denotes the same part of the same stand-alone object or
parameter; or
[We're assuming that the first bullet covers selected_components, as
those are always known at compile-time - ED]
* N1 is a dereference P1.all, N2 is a dereference P2.all, and
the prefix P1 is known to denote the same object as the prefix P2; or
* N1 is an indexed_component P1(I1,...), N2 is an indexed_component
P2(I2,...), the prefix P1 is known to denote the same object as the
prefix P2, and for each index of the indexed_component, I1 and I2 are
static expressions with the same value, or I1 and I2 are names that
are known to denote the same object; or
* N1 is a slice P1(R1), N2 is a slice P2(R2), the prefix P1 is known to
denote the same object as the prefix P2, and the subtypes denoted by the
ranges R1 and R2 statically match.
AARM Discussion: This is determined statically. If the name contains some dynamic
portion other than a dereference, indexed_component, or slice, it is not "known
to denote the same object". [We could also use the same rules for indexes for
the bounds of slices that have explicit bounds, although it doesn't seem very
likely to occur and the wording is messy.]
A name N1 is known to denote a prefix of the same object as another name N2 if
N2 is known to denote the same object as a subcomponent of the object denoted by N1.
AARM Reason: This ensures that names Prefix.Comp and Prefix are known to
denote the same object for the purposes of the rules below. This intentionally does
not include dereferences; we only want to worry about accesses to the same object,
and a dereference changes the object in question. (There is nothing shared between
an access value and the object it designates.)
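As an illustration only (not part of the proposed wording), the definitions above can be read against a few concrete names. The declarations below (Rec, Int_Array, Rec_Access, Obj, Arr, Ptr, I, J) are hypothetical and exist only to give the comments something to refer to; the classification in the comments is a reading of the bullets above, not an authoritative interpretation.

```ada
procedure Names_Demo is
   type Rec is record
      Comp : Integer := 0;
   end record;
   type Int_Array is array (1 .. 10) of Integer;
   type Rec_Access is access all Rec;

   Obj : aliased Rec;
   Arr : Int_Array := (others => 0);
   Ptr : Rec_Access := Obj'Access;
   I   : Integer := 1;
   J   : Integer := 1;
begin
   --  Known to denote the same object:
   --    Obj.Comp     and Obj.Comp      (same part of the same stand-alone object)
   --    Ptr.all      and Ptr.all       (dereferences of the same prefix)
   --    Arr (3)      and Arr (3)       (static indexes with the same value)
   --    Arr (I)      and Arr (I)       (index names denote the same object)
   --    Arr (2 .. 4) and Arr (2 .. 4)  (statically matching ranges)
   --
   --  Obj is known to denote a prefix of the same object as Obj.Comp.
   --
   --  Not known to denote the same object, although they may at run time:
   --    Arr (I)      and Arr (J)       (different index objects)
   --    Ptr.all      and Obj           (dereference versus stand-alone object)
   null;
end Names_Demo;
```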
A call C is legal only if:
* For each name N that is passed to some inner call (not including the call C
itself) as the actual parameter to a formal in out or out parameter, there
is no other name anywhere in the expressions of the actual parameters of the call
other than the one containing N that is known to denote the same object or is
known to denote a prefix of the same object; and
* For each name N that is passed to some inner call (not including the call C
itself) as the actual parameter of a formal parameter of an access type, there
is no other name anywhere in the expressions of the actual parameters of the call
other than the one containing N that is known to denote the same object as N.all;
and
* for each name N that is passed as the actual parameter to a formal in out or out
parameter that is of an elementary type, there is no other name in the actual
parameters corresponding to formal in out or out parameters of the call
other than the one containing N that is known to denote the same object or is
known to denote a prefix of the same object.
For the purpose of checking this rule, an assignment_statement is considered a call
with two parameters, the parameter corresponding to the source expression having mode
in and the parameter corresponding to the target name having mode out.
AARM Reason: This prevents obvious cases of dependence on the order of
evaluation of parameters in expressions. Such dependence is usually a bug, and
in any case, is not portable to another implementation (or even another
optimization setting).
The second bullet does not check for uses of the prefix, since the access type
and the designated object are not the same, and "known to denote the same
prefix" does not include dereferences anyway.
Note that these rules as a group make a symmetrical set of rules, in that either
name can designate an object that is the prefix of the other. If the name N is
a prefix of some other name in the call, these rules will trigger because that
prefix would necessarily be known to designate the same object. (Nothing in these
rules requires the full other name to match; any part can match.) OTOH, we need
explicit wording if some prefix of N matches some other name in the call.
These rules do not require checks for most in out parameters in the top-level
call C, as the rules about evaluation of calls prevent problems. Similarly,
we do not need checks for short circuit operations. The rules about arbitrary
order (see 1.1.4) allow evaluating parameters and writing parameters back in
an arbitrary order, but not interleaving of evaluating parameters of one call
with writing parameters back from another - that would not correspond to any
allowed sequential order.
End AARM Reason.
AARM Ramification: Note that the first two bullets cannot fail for a procedure or entry
call alone; there must be at least one function with an access, in out, or out
parameter called as part of a parameter expression of the call in order for it
to fail.
!discussion
In order to discuss this topic, we need to look at some examples.
type A_Rec is {tagged} record
C : Natural := 0;
end record;
--
--
--
procedure P1 (Input : in A_Rec; Output : in out A_Rec) is
begin
Output.C := Output.C + Input.C;
end P1;
procedure P2 (Bump1, Bump2 : in out A_Rec) is
begin
Bump1.C := Bump1.C + 1;
Bump2.C := Bump2.C + 2;
end P2;
function F1 (Bumpee : access A_Rec) return A_Rec is
begin
Bumpee.C := Bumpee.C + 1;
return (C => Bumpee.C - 1);
end F1;
function F2 (Bumpee : in out A_Rec) return A_Rec is
begin
Bumpee.C := Bumpee.C + 1;
return (C => Bumpee.C - 1);
end F2;
function F3 (Obj : in A_Rec) return Natural is
begin
return Obj.C;
end F3;
function F4 (Obj : access A_Rec) return Natural is
begin
return Obj.C;
end F4;
function F5 (Bump1, Bump2 : access A_Rec) return A_Rec is
begin
Bump1.C := Bump1.C + 1;
Bump2.C := Bump2.C + 2;
return (C => Bump1.C - 1);
end F5;
function F6 (Bump1, Bump2 : in out A_Rec) return A_Rec is
begin
Bump1.C := Bump1.C + 1;
Bump2.C := Bump2.C + 2;
return (C => Bump1.C - 1);
end F6;
function F7 (Bumpee : in out A_Rec; Bumper : in A_Rec) return A_Rec is
begin
Bumpee.C := Bumpee.C + Bumper.C;
return (C => Bumpee.C);
end F7;
O1 : {aliased} A_Rec := (C => 1); --
--
O2, O3 : A_Rec := (C => 1);
type Acc_A_Rec is access all A_Rec;
A1 : Acc_A_Rec := O1'access;
The usual concern about the use of in out parameters in functions begins something
like:
Imagine writing an expression like:
if F3(O1) = F3(F2(O1)) then
This expression has an evaluation order dependency: if the expression is evaluated
left-to-right, the result is True (both values have (C => 1) and O1.C is set to 2
afterwards), and if the expression is evaluated right-to-left, the result is False
(the right operand is still (C => 1), but now the left operand is (C => 2), and O1.C
is still 2 afterwards).
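The two outcomes can be reproduced deterministically by forcing each order with temporaries. The following is a minimal sketch, assuming an Ada 2012 compiler (needed for the in out parameter of F2) and the untagged form of A_Rec; the names Order_Demo, Left and Right are illustrative and do not appear elsewhere in this AI.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Order_Demo is
   type A_Rec is record
      C : Natural := 0;
   end record;

   --  Same bodies as F2 and F3 in the declarations above.
   function F2 (Bumpee : in out A_Rec) return A_Rec is
   begin
      Bumpee.C := Bumpee.C + 1;
      return (C => Bumpee.C - 1);
   end F2;

   function F3 (Obj : in A_Rec) return Natural is
   begin
      return Obj.C;
   end F3;

   O1          : A_Rec;
   Left, Right : Natural;
begin
   --  Force the left operand of "=" to be evaluated first.
   O1    := (C => 1);
   Left  := F3 (O1);
   Right := F3 (F2 (O1));
   Put_Line ("left-to-right: " & Boolean'Image (Left = Right));

   --  Force the right operand of "=" to be evaluated first.
   O1    := (C => 1);
   Right := F3 (F2 (O1));
   Left  := F3 (O1);
   Put_Line ("right-to-left: " & Boolean'Image (Left = Right));
end Order_Demo;
```

The first line printed should report TRUE and the second FALSE, matching the analysis above. A compiler is free to pick either order for the original single expression.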
This is usually used as a reason to not allow in out parameters on functions.
If you have to use access parameters, then the expression is:
if F3(O1) = F3(F1(O1'access)) then
and the use of 'access and aliased on the declaration of O1 should provide a red flag
about the possible order dependence.
However, this red flag only occurs some of the time. First of all, access objects
are implicitly converted to anonymous access types, so no red flag is raised when
using them:
if F3(A1.all) = F3(F1(A1)) then
Perhaps the .all on the left-hand argument could be considered a red flag. But of
course that doesn't apply if that function also takes an access parameter:
if F4(A1) = F3(F1(A1)) then
We have the same order dependency, but there is no sign of a red flag here.
This is all Ada 95 code, but Ada 2005 makes this situation worse by adding prefix
views and implicit 'access. If A_Rec is tagged, we can write:
if O1.F3 = O1.F1.F3 then
And since tagged parameters are implicitly aliased, if O1 is a tagged parameter,
there isn't the slightest sign of a red flag here.
This shows that we are already in a very deep pit. One can argue whether moving
the rest of the way to the bottom is that significant. [Thanks to Pascal Leroy for
showing just how bad the current situation is.]
We can show similar problems with procedure calls. Consider:
P1 (Input => F2(O1), Output => O1);
If O1 is tagged (and thus passed by reference), this will set O1.C to 3 in
either order of evaluation. (In both cases, Input will have (C => 1) and
Output will have (C => 2)).
But if O1 is not tagged (and thus passed by copy), this will set O1.C to 3
if evaluated left-to-right [F2 sets O1.C to 2, then passes (C => 1) to Input;
Output is passed (C => 2); and then that is summed to (C => 3)] and O1.C will
be 2 if evaluated right-to-left [Output is passed (C => 1); F2 sets O1.C to 2,
then passes (C => 1) to Input; and then that is summed to (C => 2)].
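The by-copy case can be traced by writing out the copy-in, the body of P1, and the copy-back by hand. This is only a sketch of the reasoning above, assuming an Ada 2012 compiler; Input_Copy and Output_Copy are hypothetical temporaries standing in for the copies an implementation would make, and the addition in the middle is the body of P1 written inline.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Copy_Demo is
   type A_Rec is record
      C : Natural := 0;
   end record;

   --  Same body as F2 in the declarations above.
   function F2 (Bumpee : in out A_Rec) return A_Rec is
   begin
      Bumpee.C := Bumpee.C + 1;
      return (C => Bumpee.C - 1);
   end F2;

   O1, Input_Copy, Output_Copy : A_Rec;
begin
   --  Input evaluated before Output (left-to-right).
   O1            := (C => 1);
   Input_Copy    := F2 (O1);                        --  O1.C is now 2
   Output_Copy   := O1;                             --  copy-in of Output
   Output_Copy.C := Output_Copy.C + Input_Copy.C;   --  body of P1
   O1            := Output_Copy;                    --  copy-back
   Put_Line ("Input first:  O1.C =" & Natural'Image (O1.C));

   --  Output evaluated before Input (right-to-left).
   O1            := (C => 1);
   Output_Copy   := O1;                             --  copy-in of Output
   Input_Copy    := F2 (O1);                        --  O1.C is now 2
   Output_Copy.C := Output_Copy.C + Input_Copy.C;   --  body of P1
   O1            := Output_Copy;                    --  copy-back overwrites the 2
   Put_Line ("Output first: O1.C =" & Natural'Image (O1.C));
end Copy_Demo;
```

The first trace should end with O1.C = 3 and the second with O1.C = 2, as described above.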
We can write similar code in Ada 95:
P1 (Input => F1(A1), Output => A1.all);
getting the same order dependence.
We can also get an order dependence from a single call (even in Ada 83):
P2 (O1, O1);
but only if there are multiple parameters that can modify an object, and
interestingly, only if the parameters are passed by-copy. (If A_Rec is
tagged, for example, the value of O1.C will be increased by 3, no matter
what order the parameters are evaluated in.)
That means that the Ada 95 call:
P1 (Input => F5 (A1, A1), Output => O1);
cannot have an order dependence, but with in out parameters:
P1 (Input => F6 (O1, O1), Output => O2);
could, but only if A_Rec is passed by copy.
Note that a single call with only one modifiable parameter cannot have an
order dependence:
P1 (Input => O1, Output => O1);
will always end up with the same result, no matter what order the parameters
are evaluated in. (That result could depend on the parameter passing mode, but
that is controllable by using parameters of types that are by-copy or by-reference
and in any case is not the problem we are discussing here). The parameters will
be evaluated before the call; for by-copy they'll both have the value (C => 1)
and the result value of (C => 2) will be written after the call. No part of the
expression will use the modified value after it is modified, so there cannot be
a dependence.
Similarly,
P1 (Input => F7 (O1, O1), Output => O2);
does not have an order dependence, as again, no part of the expression could
depend on the modified value of O1.
Questions to answer:
Should the rules (whatever they are) apply to all calls or just functions with
in out parameters? The latter position clearly is completely compatible,
but it obviously leaves many cases of order dependence undetected.
Another issue is whether the rules should apply to all writable parameters
(that is, all in out and out parameters, and all access-to-variable
parameters), or just to writable by-copy parameters (which have to be in out
or out). Clearly, problems can happen in any of these cases. But Ada has
lived with the by-reference cases for decades, and adding in out parameters
to functions doesn't change anything here (for a by-reference parameter,
an in out parameter is equivalent to an access parameter of the same type).
As noted in the discussion above, the real problems occur when new objects
are created by intermediate expressions. "By-copy" in this context still means
any parameter that might be passed by-copy: any type that is not known to be
a by-reference type (clearly including untagged private types). It also has
to include return objects of all types.
Note that these two issues are not completely independent of each other:
limiting checks to by-copy parameters also limits the added incompatibility,
(and the likelihood that the rules are really preventing errors).
Survey of solutions
It is fairly obvious that order dependencies are a problem in Ada, and
have been getting worse with each version of the language (even without
in out parameters in functions). Moreover, it has been used as the
primary reason for leaving something natural and useful (in out parameters
for functions) out of the language.
We could do nothing, saying that since we're already nearly at the bottom
of the pit, moving down two inches to the bottom will not have any practical
effect. It surely would be easier than trying to cling to the side of the
pit! But perhaps we can do better.
It should be noted immediately that (almost) anything we could do could not
detect order dependencies that are caused by side-effects inside of
functions. Rules enforced on a call (or syntax, or whatever) can only
prevent problems caused by side-effects visible to the call.
We could limit in out parameters for functions to by-reference types
(effectively to only tagged types). That clearly will not introduce any new
problems, as shown by the examples that start out this section. But arguably
it would make them more likely, and in any event, it would leave all of the
existing nasty cases left unchecked.
One obvious solution would be to define the order of evaluation, eliminating
the problem at the source. Java, for instance, requires left-to-right
evaluation. However, that would encourage tricky code like the various
examples by making it portable. Moreover, Ada compilers have been using this
flexibility for decades; trying to remove it from compilers (particularly
from optimizers) could be very difficult. Note that this is the only solution
that actually could eliminate dependencies on side-effects inside of functions.
But defining the order of evaluation was considered for both Ada 83 and Ada 95
and was deemed not worth it -- it's hard to see what has changed.
Another option would be to increase the visibility of parameters with
side-effects. This sounds somewhat appealing (after all, it seems to be
the basis on which access parameters are deemed OK and in out parameters
are not). One possibility would be to add ordering symbols to named notation:
Param <- <expr> for an in parameter (this includes access); Param -> <expr>
for an out parameter; and Param <-> <expr> for an in out parameter.
However, for compatibility, old code that doesn't use the symbols would
have to be allowed. That's especially bad because the symbol for ordinary
named parameters (=>) looks like the symbol for an out parameter; while it
usually will be an in parameter. Moreover, this solution does nothing for
positional parameters in calls nor for the prefixes of prefix notation. And
it is misleading for access parameters, whose mode is officially "in",
but still might cause side-effects.
One could argue that positional parameters are already unsafe and requiring
named notation to be safe is not much of an imposition. But the prefix and
access issues are not so easily explained away. Additionally, putting the
mode into calls in some way makes maintenance of programs harder: changing
the mode of a parameter is going to make many calls illegal, while today most calls
will remain legal (all will if the mode is changed from "in out" to "in").
Another syntax suggestion that was made recently was to (optionally) include
the parameter mode as part of the call. That would look something like:
if F3(in O1) = F3(in F2(in out O1)) then
This could be applied to positional calls as well, but still provides no
help for prefix calls nor for access parameters.
One could imagine requiring the syntax for calls of functions with in out
parameters and making it optional elsewhere. That might placate in out
parameter opponents, but otherwise doesn't seem to do much for the language.
Finally, we come to some sort of legality rules and/or runtime checks for
preventing such order dependencies. It is important to note that making such
rules too simple (and strong) only would mean that temporaries have to be
introduced in some expressions. That would be an annoyance, but surely not
as bad as the current ticking time-bomb.
The easiest option is to blame all of the problems on functions with in out
parameters and make them stand alone. The rule would be something like:
A call of a function with an in out parameter must be the only call in
an expression.
That would mean that
if F2(O1) then
would be legal (assuming F2 returned type Boolean), but
if F2(O1) = (C => 1) then
would not be. Obviously, this is too strict. Amazingly, it is also not strict
enough:
Some_Array(F3(O1)) := F2(O1);
would be allowed. (An assignment statement is not an expression!)
A call of a function with an in out parameter must be the only call in
an expression or statement;
If a call of a function with an in out parameter is the source expression
of an assignment_statement, the target variable_name shall not include a
call, an indexed component, a slice, or a dereference.
This would allow assigning to temporaries and record components (which can't be
computed) and not much else.
Of course, this is completely compatible with Ada 95 and later; but it doesn't
do anything to detect the existing cases of problems. Maybe that's not important.
In order to have an order dependence, there have to be two (or more) uses of a single
object within an expression (or statement). But of course that object could be aliased
via parameter passing or access values, a part of a larger object, or computed (as in
an array component). In addition, one of the uses of the object has to be modified
(that is, it is passed as an in out or out parameter, or it is passed as the
designated object of an access type parameter). Finally, one of the following has
to be true:
* in the smallest call (subexpression) that contains both uses of the object (that
is the call where the order dependence can occur), the use that modifies the object
cannot be directly a parameter of that call; or
* both uses are parameters to the same call; both uses could modify the object,
and neither parameter is required to be passed by reference.
The first bullet is shown by a procedure call like (we're using the declarations at
the head of this section again):
P1 (Input => (C => F3(O1)), Output => O1);
Since the use that modifies O1 is in the top-level procedure call, it won't be
modified until that call is underway. The other parameter(s) will have been evaluated
by then.
The second bullet provides the only exception to the first bullet; it covers cases
with two parameters, both of which can modify the object, and at least one of which
could be passed by copy.
Turning this into rules is not hard, except for defining "a single object".
The easiest way to do that is to simply use type information:
Two objects are considered to be "potentially the same object" if one has the type
of a part of the other. [Remember that 'part' includes the type itself. The only
parts considered for this rule are those that are visible at the point of the call,
as is typical for legality rules.]
A call is legal only if:
For each name N that is passed to some inner call (not including the call itself)
as the actual parameter to a formal in out or out parameter, or is passed as
the designated object of a formal parameter of an access type, there is no
other name in the expressions of the actual parameters of the call other than
the one containing N that denotes potentially the same object; and
for each name N that is passed as the actual parameter to a formal in out or out
parameter that is not of a by-reference type, there is no other name in the
actual parameters corresponding to formal in out or out parameters of the call
other than the one containing N that denotes potentially the same object.
For the purpose of checking this rule, an assignment_statement is considered a call
with two parameters, the source parameter having mode in.
This is clearly too strict, as calls with obviously different objects would be illegal.
For instance,
P1 (Input => F2(O2), Output => O1);
fails this proposed rule. That would annoy programmers intensely, as it is crystal-clear
that there is no conflict in this call. Moreover, this could interfere with using
temporaries to break up expressions that otherwise would violate the rules. So this
rule needs improvement.
There is also a problem with private types. If a private type has an aliased
component, it is possible for an access type (presumably returned from some operation
on the private type) to designate that component. But the fact that the private
object had such a part with the appropriate type would not be known at the point of
the call. That could happen, for instance, in a container accessor.
A1 := Accessor(A_Rec_Container);
PX (A_Rec_Container, F2 (A1.all));
A1.all designates part of A_Rec_Container, and (if we are trying to catch all such cases),
the call to PX should be illegal.
So we could modify the definition of "potentially the same object" to:
Two objects are considered to be "potentially the same object" if one has the type
of a part of the other, or one has a part whose type is a partial view, unless:
* one object is part of a stand-alone object, and the other object is part
of a different stand-alone object;
[Since there are no dereferences here (they're never stand-alone objects),
we don't have to worry about private types because the problem cases all
involve dereferences designating aliased components. And different
objects are otherwise disjoint.]
* one object is part of a stand-alone object SO with no parts that are aliased or
have types that are partial views, and the other object is not part of SO;
[Here we can see that the first object has no aliased parts or private types
that could hide aliased parts; in that case, we only care that the second
object is not part of the same object.]
* one object is a parameter of an elementary type, and the other object
is not that parameter or rename thereof;
[We don't say "by-copy type" here, as that ignores privacy, which would be
wrong for legality rules. A by-copy parameter can't be aliased and
cannot represent any other object, so they never are the same as some
other object.]
[In the following cases, we can ignore private types; a type can never
be directly a subcomponent of itself, and O2 cannot be a private type
or we wouldn't be able to see the subcomponent. We also don't have
to worry about different stand-alone objects, as t]
* the type of object O1 is that of a (visible) subcomponent of object O2,
object O1 is a non-aliased part of a stand-alone object,
and object O2 is not part of the same stand-alone object;
[We can only get in trouble here if O2 is part of the same stand-alone
object. O2 can be in a storage pool, or any other stand-alone object, as
we don't have to worry about aliasing.]
* the type of object O1 is that of a (visible) subcomponent of object O2,
object O2 is an aliased part of a stand-alone object,
and object O1 is a part of a dereference of a pool-specific access type;
[Here we have to worry about accesses to the subcomponent of O2; we don't
want O1 to be a dereference that represents that subcomponent. Different
stand-alone objects are covered in the first bullet.]
* the type of object O1 is that of a (visible) subcomponent of object O2,
object O2 is a non-aliased stand-alone object, and object O1 is not part
of the same stand-alone object;
[Note that we don't talk about object O2 as being a part. That gets messy
when private types are taken into account, because we can't necessarily
see if the objects have aliased components of the right type. (And even
when we can, the wording is messy.) Otherwise, this is similar to the
second bullet.]
* the type of object O1 is that of a (visible) subcomponent of object O2,
object O2 is an aliased part of a stand-alone object,
and object O1 is a part of a different stand-alone object or a part of a
dereference of a pool-specific access type;
[Here we have to worry about accesses to the subcomponent of O2; we don't
want O1 to be a dereference that represents that subcomponent.]
Essentially the idea here is that if the two objects are different stand-alone
objects, we don't care about their types; if one of the objects is a stand-alone
object and nothing is hidden or aliased, then the other object can be anything
other than the same object.
If we can see the relationship between the objects, we can then allow more
(such as pool-specific dereferences, which can't designate stand-alone objects).
Private types are considered to conflict with all other types.
This gets rid of all of the critical problems; private types can't hide aliasing,
and temporaries will always work to break up expressions.
However, these rules still have some annoying effects:
* Parameters are not stand-alone objects, so parameters are always treated as if
they could be the same, if there is any
possibility. We do except by-copy parameters, as they can't be aliased and they
are effectively new objects.
* Changing a type from a visible type to a private type potentially could make
some calls illegal. The above rules assume the worst about private types, while
as long as the type is visible, the compiler can actually verify that there is
no problem. That seems inherent in this model.
* There is no attempt to differentiate individual components that have the same
type. This can appear for records:
P (Complex.Real, F(Complex.Img));
would appear to conflict and thus be illegal.
But this is much more likely to cause problems for arrays:
P (Arr(1), F(Arr(2)));
The reader can easily tell that these array components aren't ever going to be the
same, but the compiler can't with the given rules. Of course, if 1 and 2 were
replaced by functions, then these should be treated as overlapping.
The component cases could be handled with more complex rules which could determine
if the components are different parts of the same object. In particular, array
components with static index expressions can be allowed.
However, the increasing complexity of these rules has led the author to quit at this
point. Moreover, they still seem to be too incompatible; this is most acute when
looking at a routine like Swap_Integers (with two in out parameters of type Integer):
Swap_Integers (Arr(I), Arr(J));
This would be illegal by any conceivable set of "complete" rules, since we cannot know
the values of I and J, and the parameters are not by-reference. As such, this call
should not be allowed, but that is a likely incompatibility.
We could break the "completeness" by only checking function calls, but that doesn't
make much sense. There is little semantic difference between functions and procedures,
and programmers switch between them all the time. Preventing a problem for function
calls but not preventing the same problem for procedures would be bizarre.
Finally, A Proposed Solution
A better solution is to create rules that only try to detect the "low-hanging
fruit" - that is cases where there clearly is a problem. This would be preferable
anyway; it would reduce the frustration caused by the rules (compared to rules like
the accessibility rules, for which the checks generally have no effect except to
get in the way of what you need to do -- to the point that Ada provides an attribute
to ignore them!). It would still be possible to cause problems, but at least
obvious problems would be prevented, and most illegal calls would have obvious
problems.
Such rules would only reject calls where it is clear that parts of the same object
are involved. That eliminates the complications of private types (we won't look in
them), arrays (we won't try to determine if they are the same), and so on.
These are the rules proposed in the !wording section above.
The proposed wording detects arrays indexed by the same value or object,
dereferences of the same access value, as well as uses of the same object. It
does not try to detect other cases of problems.
For instance,
Swap_Integer (Arr(I), Arr(I));
is illegal (and is almost certainly a bug -- one I've written periodically!), while
Swap_Integer (Arr(I), Arr(J));
is legal.
The proposed wording only applies to calls. It specifically does not apply to
short circuit operations, as the order of evaluation of those operations is
language-defined. Thus it is not possible to cause an order dependence
across the parts of a short-circuit form. For instance:
if F2(O1) = F6(O2, O3) and then F3(O1) = F3(O3) then
does not have an order dependence; the calls to F2 and F6 have to be evaluated
before the calls to F3. Thus the result will be False (O1.C = 2 /= O3.C = 3).
Since "and then" is not a call, the proposed rules do not apply to it, and the
calls to "=" making up the sub-expressions are checked separately. Whereas:
if F2(O1) = F6(O2, O3) and F3(O1) = F3(O3) then
does have an order dependence, and the result of the expression could be either
True or False. Since the evaluation of "and" is a call, the proposed rules
apply to this expression and thus it is illegal.
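The short-circuit form can be run as written, since its outcome does not depend on the compiler's choice of order. The following is a minimal sketch, assuming an Ada 2012 compiler and the untagged form of A_Rec; Short_Circuit_Demo and Result are illustrative names only.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Short_Circuit_Demo is
   type A_Rec is record
      C : Natural := 0;
   end record;

   --  Same bodies as F2, F3 and F6 in the declarations above.
   function F2 (Bumpee : in out A_Rec) return A_Rec is
   begin
      Bumpee.C := Bumpee.C + 1;
      return (C => Bumpee.C - 1);
   end F2;

   function F3 (Obj : in A_Rec) return Natural is
   begin
      return Obj.C;
   end F3;

   function F6 (Bump1, Bump2 : in out A_Rec) return A_Rec is
   begin
      Bump1.C := Bump1.C + 1;
      Bump2.C := Bump2.C + 2;
      return (C => Bump1.C - 1);
   end F6;

   O1, O2, O3 : A_Rec := (C => 1);
   Result     : Boolean;
begin
   --  "and then" forces F2 and F6 to complete before either call of F3.
   Result := F2 (O1) = F6 (O2, O3) and then F3 (O1) = F3 (O3);
   Put_Line ("Result = " & Boolean'Image (Result));
   Put_Line ("O1.C =" & Natural'Image (O1.C));
   Put_Line ("O3.C =" & Natural'Image (O3.C));
end Short_Circuit_Demo;
```

The program should report FALSE, with O1.C = 2 and O3.C = 3. Replacing "and then" with "and" gives the form that the proposed rules would reject.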
The proposed rules depend on the fact that while Ada allows parameters to be
evaluated in an arbitrary order, it does not allow interleaving of part of the
evaluations of those parameters. (See 1.1.4(18); 1.1.4(18.d) says that it is
intended to allow programmers to depend on some side-effects.) Thus, each call
evaluates its parameters and checks their subtypes (in some arbitrary order),
executes the call, and does any copying back of parameters (in some arbitrary
order) without part of any other call being evaluated in the middle.
Of course, an implementation can reorder operations in any way it likes so
long as the result is one of those allowed by the evaluation rules above. But a
compiler cannot start evaluating other parameters before writing back the
results of a call if the other parameters could depend on those results.
Additional alternatives
One idea that was considered but discarded was to use run-time checks to deal with
the complex cases, such as when private types are (usually) hiding the fact that
checks are not needed, or to verify that array indexes are different when that
is required.
The problem is that this adds a lot of complexity, and while many of the checks could
be eliminated, some probably would remain, adding overhead without much value.
(After all, the program probably would work -- although not portably -- without
the checks.)
None of the proposed rules do anything about side-effects totally inside of
functions. One way to deal with that would be to require an expression to contain
only a single function call unless all of the functions are strict. A strict
function exposes all of its side effects in its specification, meaning it does
not read or write global variables, call non-strict functions, write hidden parts
of parameters, etc. Most of the language-defined functions are strict (that would
have to be declared somehow).
Strict functions have several other nice properties: they don't need elaboration
checking (freezing is sufficient to prevent access-before-elaboration problems);
they can be used in symbolic reductions; and they can use the optimizations allowed
for functions in pure packages.
However, such a requirement would be quite incompatible. Moreover, strict functions
would be rather limiting by themselves.
An alternative that has been suggested is to allow Global_In and Global_Out
annotations on subprograms, which would declare the global side-effects of a
subprogram. Such annotations could not be a lie (they'd have to be checked in some
way), and thus would fill the role of strict functions more flexibly. But it would
still be too incompatible to ban dangerous side-effects in functions (although
separate tools or non-Ada operating modes could make such checks).
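A sketch of how such annotations might read is given below. No syntax is
defined for them here, so they appear only as comments, and the names
Global_Demo, Scaled, Logged, Scale, and Log_Count are hypothetical.

```ada
procedure Global_Demo is
   Scale     : Integer := 10;
   Log_Count : Natural := 0;

   --  Hypothetical annotations (comments only; no syntax is defined):
   --     Global_In  => Scale
   --     Global_Out => none
   function Scaled (X : Integer) return Integer is
   begin
      return X * Scale;
   end Scaled;

   --  Hypothetical annotations (comments only; no syntax is defined):
   --     Global_In  => Log_Count
   --     Global_Out => Log_Count
   function Logged (X : Integer) return Integer is
   begin
      Log_Count := Log_Count + 1;
      return X;
   end Logged;

   R : Integer;
begin
   --  The declared global effects of the two functions touch different
   --  variables, so the order of evaluation cannot change the result.
   R := Scaled (2) + Logged (5);
   pragma Assert (R = 25);
   pragma Assert (Log_Count = 1);
end Global_Demo;
```

With checked annotations of this kind, a compiler or a separate tool could
compare the declared effects of the calls in an expression instead of
requiring every function to be strict.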
!example
(See discussion.)
!ACATS test
!appendix
!topic Allow parameter modes for actual parameters
!reference RM 6.4
!from Adam Beneschan 09-03-05
!discussion
(Based on a comment in comp.lang.ada by Yannick Duchêne.)
It would be useful for readers to be able to see, at the point of a call, which
parameters the call outputs or modifies, and which ones are just passed as
inputs to it.
I think this is ESPECIALLY the case for function calls; now that we are
considering [IN] OUT parameters for functions, Ada programmers who are used to
function calls having IN parameters, and who see a function call in the code,
may not realize that the function has the side effect of modifying one of the
parameters. This is the sort of thing that is very easy to miss, especially
since function calls can be "buried" inside larger expressions. Personally, I
have this problem when reading code in Pascal.
The proposal is to alter the definition of parameter_association in 6.4(5):
parameter_association =>
[mode] [formal_parameter_selector_name =>] explicit_actual_parameter
with the legality rule (in 6.4.1) that if a "mode" is present, it must match the
mode of the formal parameter (note that the mode of a parameter defined by an
access_definition is IN, by 6.1(18)).
We can argue about whether the [mode] would look better to the left or right of
the formal_parameter_selector_name =>, if both are present; I think this is just
an issue of taste.
Also, I think it's arguable that this mode should be *required* on actual
parameters of function calls for formal parameters of mode OUT or IN OUT,
assuming we allow such parameters. As I mentioned above, the fact that a
function call is going to modify one of its parameters is the sort of thing that
could be easily missed, and that's a possible argument for making the mode a
requirement.
It's a bit unfortunate that this proposal wouldn't apply to the prefix of a
subprogram call given in Object.Operation notation, but I can't think of a
syntax that wouldn't look hokey:
Page_Number := Get_Page_Number_From_User;
in out Book_Object.Go_To_Page (Page_Number); ---???
Yuk. On the other hand, when Object.Operation notation is used it's probably
more obvious from the operation name what's being done with or to the object, so
perhaps an explicit mode isn't quite as useful.
****************************************************************
From: Randy Brukardt
Sent: Thursday, March 5, 2009 1:38 PM
> It would be useful for readers to be able to see, at the
> point of a call, which parameters the call outputs or
> modifies, and which ones are just passed as inputs to it.
This topic is covered in the portion of AI05-0144-1 that I've already
written. The net of the discussion there is that it isn't worth doing in Ada
as it stands for a number of reasons. One of the most important is the lack
of viable syntax -- the syntax of Ada as it stands is exactly wrong for
this. And all of the proposals only help named notation, but the problem is
even more severe for positional notation.
> I think this is ESPECIALLY the case for function calls; now
> that we are considering [IN] OUT parameters for functions,
> Ada programmers who are used to function calls having IN
> parameters, and who see a function call in the code, may not
> realize that the function has the side effect of modifying
> one of the parameters.
It shouldn't be relevant, because the intent is that any call where it could
matter would be illegal. (Unlike side effects in functions that aren't
visible in the contract, we can viably check for side effect conflicts that
*are* visible in the contract.) At least that's the theory.
In any case, parameters to functions often have side-effects, you just can't
see them. (Think the language-defined random number generator.) And of
course anonymous access parameters can be modified (and they don't have to
be obvious in a call). A programmer who doesn't at least consider the
possibility of parameters changing is already sunk.
...
> The proposal is to alter the definition of parameter_association in
> 6.4(5):
>
> parameter_association =>
> [mode] [formal_parameter_selector_name =>]
> explicit_actual_parameter
>
> with the legality rule (in 6.4.1) that if a "mode" is
> present, it must match the mode of the formal parameter (note
> that the mode of a parameter defined by an access_definition
> is IN, by 6.1(18)).
I didn't consider this particular syntax idea (I concentrated on more subtle
ways, such as the direction of the name arrow), but I don't think it works
very well with typical parameter names. Consider Insert in the predefined
containers:
Insert (in out Container => My_Vector,
in Before => 10,
in New_Item => Some_Value);
"in Before"? Gag! Parameters named "On" are common in some packages. "in
On"??? Yikes!
I had suggested in the AI:
Insert (Container <-> My_Vector,
Before <- 10,
New_Item <- Some_Value);
It also should be noted that either of these schemes would totally hide
anonymous access parameters (as their mode is technically "in"), which are
at least as dangerous as "in out" ones.
In any case, I'm going to file this on the existing AI and I'm sure the ARG
will discuss it.
****************************************************************
From: Adam Beneschan
Sent: Thursday, March 5, 2009 1:59 PM
> It shouldn't be relevant, because the intent is that any call where it
> could matter would be illegal. (Unlike side effects in functions that
> aren't visible in the contract, we can viably check for side effect
> conflicts that
> *are* visible in the contract.) At least that's the theory.
>
> In any case, parameters to functions often have side-effects, you just
> can't see them. (Think the language-defined random number generator.)
> And of course anonymous access parameters can be modified (and they
> don't have to be obvious in a call).
My thinking was that most of the time if you're passing a variable as a function
access parameter, you'd have to pass it as Variable'Access, which would at least
serve as a clue.
> I didn't consider this particular syntax idea (I concentrated on more
> subtle ways, such as the direction of the name arrow), but I don't
> think it works very well with typical parameter names. Consider Insert
> in the predefined
> containers:
>
> Insert (in out Container => My_Vector,
> in Before => 10,
> in New_Item => Some_Value);
>
> "in Before"? Gag! Parameters named "On" are common in some packages.
> "in On"??? Yikes!
Yeah, it would work better if the source is displayed with some sort of editor
that boldfaces reserved words. Anyway, if I did decide to use the modes on all
actual parameters, I'd probably write the above as
Insert (in out Container => My_Vector,
in Before => 10,
in New_Item => Some_Value);
which probably has a lower gag factor.
> I had suggested in the AI:
>
> Insert (Container <-> My_Vector,
> Before <- 10,
> New_Item <- Some_Value);
I don't think this idea will work because it will invalidate any code that looks
something like
if Error>0.5 or else Error<-0.5 then ...
OK, it would force programmers who have code like this to insert some spaces and
make it more readable, which I guess is good, but I don't think this
incompatibility would make them happy.
****************************************************************
From: Bob Duff
Sent: Thursday, March 5, 2009 1:30 PM
> It would be useful for readers to be able to see, at the point of a
> call, which parameters the call outputs or modifies, and which ones
> are just passed as inputs to it.
This feature existed in an early version of Ada -- circa 1979 or 1980?
The syntax was:
Mumble (In_Param := 123, Out_Parm =: Blah, In_Out_Param :=: Thing);
if I remember correctly. It only worked for named notation, though.
> I think this is ESPECIALLY the case for function calls; now that we
> are considering [IN] OUT parameters for functions, Ada programmers who
> are used to function calls having IN parameters, and who see a
> function call in the code, may not realize that the function has the
> side effect of modifying one of the parameters. This is the sort of
> thing that is very easy to miss, especially since function calls can
> be "buried" inside larger expressions. Personally, I have this
> problem when reading code in Pascal.
>
> The proposal is to alter the definition of parameter_association in
> 6.4(5):
>
> parameter_association =>
> [mode] [formal_parameter_selector_name =>] explicit_actual_parameter
>
> with the legality rule (in 6.4.1) that if a "mode" is present, it must
> match the mode of the formal parameter (note that the mode of a
> parameter defined by an access_definition is IN, by 6.1(18)).
>
> We can argue about whether the [mode] would look better to the left or
> right of the formal_parameter_selector_name =>, if both are present; I
> think this is just an issue of taste.
>
> Also, I think it's arguable that this mode should be *required* on
> actual parameters of function calls for formal parameters of mode OUT
> or IN OUT, assuming we allow such parameters.
That's an interesting idea. It might even make the idea of '[in] out'
parameters on functions more palatable to some folks.
I'd also like a configuration pragma or compiler switch that would cause a
warning or error if it's missing on '[in] out' parameters of non-functions.
****************************************************************
From: Randy Brukardt
Sent: Thursday, March 5, 2009 2:20 PM
> My thinking was that most of the time if you're passing a variable as
> a function access parameter, you'd have to pass it as Variable'Access,
> which would at least serve as a clue.
That thinking probably is wrong (any time you pass an access value it is wrong,
it is wrong in prefix notation, and so on.) Read the AI, it covers that in
detail.
...
> > I had suggested in the AI:
> >
> > Insert (Container <-> My_Vector,
> > Before <- 10,
> > New_Item <- Some_Value);
>
> I don't think this idea will work because it will invalidate any code
> that looks something like
>
> if Error>0.5 or else Error<-0.5 then ...
It surely wouldn't invalidate *this* code, because there is no call that can
have named notation.
But I do see your point. I originally tried with "<=>" and "<=" but the problem
there (besides the conflict with "<=") is that the current arrow "=>" ought to
represent "out". That doesn't work well.
Anyway, if people want to pursue this, they're welcome to come up with ideas for
syntax. This is an idea where the syntax will make it or break it, so all ideas
should be considered.
****************************************************************
From: Jeffrey R. Carter
Sent: Thursday, March 5, 2009 2:13 PM
> I don't think this idea will work because it will invalidate any code
> that looks something like
>
> if Error>0.5 or else Error<-0.5 then ...
Why? That's not a parameter association, so it shouldn't be subject to the
parameter-association rules.
However, such an idea would only work for named parameter association; repeating
the mode would work for positional association as well. So if we're going to
have such a thing, I'd vote for repeating the mode.
****************************************************************
From: Randy Brukardt
Sent: Thursday, March 5, 2009 2:38 PM
...
> However, such an idea would only work for named parameter association;
> repeating the mode would work for positional association as well. So
> if we're going to have such a thing, I'd vote for repeating the mode.
Repeating the mode in positional associations would be incredibly confusing,
since "in" is allowed in expressions.
Proc (in X, in T); -- A two parameter call.
Proc (in X in T); -- A one parameter call.
You'd have to pay amazing attention to commas when reading. (The idea would be a
complete non-starter if there was a case where both of these would be legal, but
I don't think it could exist.)
We'd also have to build an entire new parser for our compiler (there is no
chance that syntax like this could be made to go through our current parser
generator).
****************************************************************
From: Bob Duff
Sent: Thursday, March 5, 2009 2:34 PM
> I didn't consider this particular syntax idea (I concentrated on more
> subtle ways, such as the direction of the name arrow), but I don't
> think it works very well with typical parameter names. Consider Insert
> in the predefined
> containers:
>
> Insert (in out Container => My_Vector,
> in Before => 10,
> in New_Item => Some_Value);
>
> "in Before"? Gag! Parameters named "On" are common in some packages.
> "in On"??? Yikes!
Well, I presumed "in" was optional. And I would certainly never use it.
So:
Insert (in out Container => My_Vector,
Before => 10,
New_Item => Some_Value);
or:
Insert (in out My_Vector, 10, Some_Value);
which aren't so horrible.
****************************************************************
From: Bob Duff
Sent: Thursday, March 5, 2009 2:41 PM
> > I don't think this idea will work because it will invalidate any
> > code that looks something like
> >
> > if Error>0.5 or else Error<-0.5 then ...
>
> Why? That's not a parameter association, so it shouldn't be subject to
> the parameter-association rules.
Because if <- is a lexical element, it has to be a lexical element in all
contexts. We don't have feedback from the parser into the lexer in Ada, and we
don't want to change that fact.
> However, such an idea would only work for named parameter association;
> repeating the mode would work for positional association as well. So
> if we're going to have such a thing, I'd vote for repeating the mode.
Right, whatever the syntax, if it doesn't support positional notation, then it's
pointless.
****************************************************************
From: Adam Beneschan
Sent: Thursday, March 5, 2009 2:59 PM
> Repeating the mode in positional associations would be incredibly
> confusing, since "in" is allowed in expressions.
>
> Proc (in X, in T); -- A two parameter call.
> Proc (in X in T); -- A one parameter call.
FYI, I completely forgot about membership tests when I wrote my earlier e-mail.
Also, I was definitely thinking more about OUT and IN OUT parameters---repeating
the mode for those is useful to alert the reader to a side-effect, but less
useful for IN parameters.
****************************************************************
From: Adam Beneschan
Sent: Thursday, March 5, 2009 3:00 PM
> > I don't think this idea will work because it will invalidate any
> > code that looks something like
> >
> > if Error>0.5 or else Error<-0.5 then ...
>
> Why? That's not a parameter association, so it shouldn't be subject to
> the parameter-association rules.
Ummm, you want to try writing a lexical analyzer that interprets <- as a single
token inside a subprogram call but two tokens elsewhere? And gets this case
right?
Arr : array (Boolean) of Integer;
...
N := Arr (Error<-0.5);
Sorry, I think that's asking too much. I don't think there are any possible
ambiguities with any of the current compound delimiters defined in 2.2---there
is no syntax in which any of them could be interpreted as two single
delimiters---but adding this syntax would add one.
I think the only way this could reasonably work is if a language rule were added
to 2.2 saying that <- is interpreted as a compound delimiter whenever those two
characters appear together; otherwise it would be just too hard to get right, in
my opinion.
****************************************************************
From: Jeffrey R. Carter
Sent: Thursday, March 5, 2009 4:02 PM
> Ummm, you want to try writing a lexical analyzer that interprets <- as
> a single token inside a subprogram call but two tokens elsewhere? And
> gets this case right?
No. But I wouldn't let it stop me from designing such a language if I didn't
have to write the compiler :)
As I've said, I don't really like the proposal, so I was merely curious. There
are lots of places where the way things are interpreted depends on context.
But the "in X in T" makes the repeated-mode proposal seem to have problems, too.
****************************************************************
From: Micronian
Sent: Thursday, March 5, 2009 7:03 PM
Would anyone object to the idea of using brackets? I don't recall them being used
anywhere in the Ada language. It's not the prettiest syntax, but it looks clear and
it is easy to separate from the actual parameters.
Proc ([in] X, [in] T); -- A two parameter call.
Proc ([in] X in T); -- A one parameter call.
Insert ([in out] Container => My_Vector,
[in] Before => 10,
[in] New_Item => Some_Value);
****************************************************************
From: Adam Beneschan
Sent: Thursday, March 5, 2009 7:31 PM
The same idea actually occurred to me. Back when Ada was first designed, there
was some criterion that prevented certain characters from being used (except in
comments and string literals), and square brackets were in that list. However,
I suspect that all the old keypunch machines that didn't have those characters,
that the original designers were worried about, have long ago been turned into
scrap metal.
Personally, I'm not opposed to starting to use the forbidden characters (as long
as the resulting code doesn't start looking like C programs); it would seem odd
that we can now use letters from every alphabet in the world in our identifiers
including Greek and Tamil and ancient Irish alphabets that haven't been used in
perhaps a thousand years, but can't use square brackets or curly braces in the
syntax. But for some reason, this doesn't seem like the right place to introduce
the concept.
****************************************************************
From: Micronian
Sent: Thursday, March 5, 2009 7:47 PM
Well, seeing as how Ada now has the Pi Unicode character, I don't see why
brackets should not be allowed.
****************************************************************
From: Niklas Holsti
Sent: Friday, March 6, 2009 1:58 AM
> Would anyone object to the idea of using brackets?
Yes. While I think the optional indication of actual parameter mode is a useful
addition to Ada, I dislike the bracket proposal.
> It's not the prettiest syntax, but it looks clear and it is easy to
> separate from the actual parameters.
>
> Proc ([in] X, [in] T); -- A two parameter call.
> Proc ([in] X in T); -- A one parameter call.
That looks cluttered to me, and not uniform with the absence of brackets in the
subprogram declarations (OK, optional brackets could be allowed in declarations,
too, but are still ugly).
To me the problem of mistakenly mixing up Proc (in X, in T) with Proc (in X in
T) is not severe enough to merit the ugliness of brackets. After all, we already
have the similar problem of confusing Proc (X, -3) and Proc (X -3), where both
forms can be legal, even.
****************************************************************
From: Dmitry A. Kazakov
Sent: Friday, March 6, 2009 2:45 AM
> It would be useful for readers to be able to see, at the point of a
> call, which parameters the call outputs or modifies, and which ones
> are just passed as inputs to it.
Why would it be useful for readers? The mode is statically checked, there is
nothing the reader should worry about.
> I think this is ESPECIALLY the case for function calls; now that we
> are considering [IN] OUT parameters for functions, Ada programmers who
> are used to function calls having IN parameters, and who see a
> function call in the code, may not realize that the function has the
> side effect of modifying one of the parameters. This is the sort of
> thing that is very easy to miss, especially since function calls can
> be "buried" inside larger expressions.
I am afraid this is a totally wrong idea. If something is needed to do here then
it is to make things statically checkable. For example, functions with side
effects should not be allowed as arguments except in unary operations.
Presence / absence of side-effects could be explicitly stated in the function
declaration. Order of actual parameter evaluation could be explicitly stated for
a subprogram, which in turn would allow calls to functions with side effects as
arguments, etc.
****************************************************************
From: Randy Brukardt
Sent: Friday, March 6, 2009 2:03 PM
For what it's worth, that's the direction that I am pursuing for future Ada
versions. While some sort of mode specification would have been nice, it's 30
years too late to add it (to be useful, it has to be required). And, as noted,
Ada originally had such a specification and it was dropped for some reason
before Ada 80 came out, which suggests that the idea was rejected way back then.
Not sure what has changed about calls that should cause a reconsideration.
****************************************************************
From: Micronian
Sent: Friday, March 6, 2009 3:06 AM
Yes, in most cases using
Proc(in X, out Y)
works. I mainly provided the bracket idea to guarantee there is no confusion
with something like
Proc(in X in Y)
Personally, I still prefer _not_ having brackets. For the above case, I probably
would make it a habit to put an expression like "X in Y" in another variable
(e.g. Is_Descendent := X in Y) or if anything
Proc( in (X in Y) )
In the end, it would be nice to have the ability to specify param modes because
I often find myself looking through someone else's code and need to look back at
the spec file to have a better understanding of how data is being manipulated.
But, I understand there are far more significant/complicated issues that need to
be addressed first.
****************************************************************
From: Niklas Holsti
Sent: Friday, March 6, 2009 2:44 PM
> Proc( in (X in Y) )
I, too, would do that, unless the call is written with one parameter per line,
which is my usual style when there are more than one or two parameters.
> In the end, it would be nice to have the ability to specify param
> modes because I often find myself looking through someone else's code
> and need to look back at the spec file to have a better understanding
> of how data is being manipulated.
Exactly. It is a help to readers, and so in line with the Ada philosophy. Still,
it could help also when writing code; I have made a couple of errors where I
misremembered the inputs and outputs for some call, and those errors usually
were not caught by exceptions at run-time.
> But, I understand there are far more
> significant/complicated issues that need to be addressed first.
Of course. But this is a simple extension that is easy to talk about and to like
or dislike.
****************************************************************
From: Adam Beneschan
Sent: Friday, March 6, 2009 3:01 PM
> Why would it be useful for readers? The mode is statically checked,
> there is nothing the reader should worry about.
We must be talking about two totally different things, since I have no idea what
your last sentence has to do with anything. My point, and I think Yannick's,
was just to add something to help make a program more self-documenting. If I
see something like
R := <something>; -- (A)
<some other code>
if Some_Function (Some_Expression, R) > 0 then
...
V := R;
...
it might not occur to me that V is going to be assigned some value other than
the value R got assigned in (A), because I'm used to thinking of functions as
taking just inputs and producing a value, and I might not realize that calling
Some_Function may change R. This would make it clearer that R may be modified:
R := <something>; -- (A)
<some other code>
if Some_Function (Some_Expression, IN OUT R) > 0 then
...
V := R;
...
just an aid to help someone reading the code realize what's going on with R.
There are other ways to accomplish this: sometimes selecting an appropriate
parameter name might be enough, and if all else fails you can add a comment.
But this is something that I figured might help and couldn't hurt because it
would be easy to implement.
****************************************************************
From: Niklas Holsti
Sent: Friday, March 6, 2009 3:08 PM
> ... While some sort of mode specification would have been nice, it's
> 30 years too late to add it (to be useful, it has to be required).
I do not agree that it would have to be required to be useful, and thus I don't
think it is too late to add it.
I think the use or non-use of actual-parameter modes would be a question of
programming style, similar to the use or non-use of the predefined numeric
types, or the repetition of subprogram names after the "end". The usage of
actual-parameter modes could be controlled by coding rules, perhaps enforced by
compiler switches (like GNAT formatting rules) or other source-code analysis
tools.
> And, as noted, Ada originally had such a specification and it was
> dropped for some reason before Ada 80 came out, which suggests that
> the idea was rejected way back then. Not sure what has changed about
> calls that should cause a reconsideration.
While I have much respect for the originators of Ada, I don't think their
decisions should be taken as dogma. I agree with Micronian that this is not a
very important issue, but I am in favour of it anyway. As I said in another
message, it could have caught a couple of errors I have made.
I think actual-parameter modes would be especially valuable during software
maintenance: if the modes of the formal parameters are changed, this feature
could flag all calls that conflict with the new modes. Existing rules might flag
only some of the conflicting calls. For example, if the mode is changed from
"in" to "in out" the existing rules flag only calls where the actual parameter
is a constant. The existing rules do not flag calls where the actual parameter
is a variable, since both "in" and "in out" are then legal, but the caller may
not expect the call to change this variable and may therefore malfunction at
some later point in its execution.
****************************************************************
From: Randy Brukardt
Sent: Friday, March 6, 2009 5:53 PM
> I do not agree that it would have to be required to be useful, and
> thus I don't think it is too late to add it.
I probably shouldn't have used the word "useful". I had presumed that this
particular syntax would not be implementable using our LALR(1) parser generator.
And I was thinking that if it was *required* there would not be a problem, but
as optional syntax it can't be implemented.
This turns out to be wrong on two counts: Making the syntax required wouldn't
help. The syntax for calls is shared with type conversions and array indexing,
so there isn't any circumstance where adding something to positional calls
doesn't have to be optional (effectively).
And more importantly, I actually tried this on our grammar generator and it
seems to be happy. (Enforcing the legality rules would be a massive pain,
requiring massive changes to the structure of calls, but that's still very
different than what I was envisioning.) I really do not believe this result, but
given that I can't find an obvious bug I have to assume that it is correct.
I still don't see any reason to support such syntax on positional calls. They're
mostly used by the lazy (who would never spend the extra typing on the modes
anyway) and those who want to use the symmetry between array indexing and a
function call (and you couldn't use the modes then). Otherwise, you'd use named
notation. Is it really such a hardship to use that if you want to know the
mode??
...
> I think actual-parameter modes would be especially valuable during
> software maintenance: if the modes of the formal parameters are
> changed, this feature could flag all calls that conflict with the new
> modes. Existing rules might flag only some of the conflicting calls.
> For example, if the mode is changed from "in" to "in out"
> the existing rules flag only calls where the actual parameter is a
> constant. The existing rules do not flag calls where the actual
> parameter is a variable, since both "in" and "in out"
> are then legal, but the caller may not expect the call to change this
> variable and may therefore malfunction at some later point in its
> execution.
One could make this argument in reverse as well: changing from "in out" to "in"
would require changing a lot of calls without any corresponding benefit.
Indeed, we've changed modes of calls in Claw, knowing that such changes are
compatible 98% of the time (constant Window objects are rare!). Such changes
also have been made in the Ada standard libraries (usually to fix definitional
bugs). If this feature existed and was used, such changes would become totally
incompatible and probably would not be able to be made.
The point here is that adding these modes would impede maintenance as much as
they would help it. That would be OK if they were helping to detect a lot of
bugs, but that's not likely to be the case.
---
One further point: concentrating on the parameter modes is the wrong thing.
If you are concerned about side-effects on a parameter because of a call, you
need to worry about:
[1] Parameters of mode "out" and "in out";
[2] Anonymous access parameters (to variables);
[3] In parameters of named access-to-variable types;
[4] In parameters of composite types that have a component of an
access-to-object type (including the infamous Rosen trick).
And that's not counting modifications by other access paths (Claw does this a
lot). So knowing the mode is meaningless unless you also know the type (for
scalar types, you do only need to worry about [1]). But for that you still have
to look at the spec (or the declaration of the actual). I suppose the next
proposal will be to repeat the formal type in the call as well...
---
It is interesting that I was neutral on this idea (other than the lack of good
syntax) before this discussion. But now I see why the "founding father(s)" left
it out, and I believe that I am now strongly opposed to it. So you guys have
done a lot of good (just not the good you had in mind)!
****************************************************************
From: Niklas Holsti
Sent: Saturday, March 7, 2009 1:40 AM
> I probably shouldn't have used the word "useful". I had presumed that
> this particular syntax would not be implementable using our
> LALR(1) parser generator. [...] I actually tried this on our grammar
> generator and it seems to be happy. (Enforcing the legality rules
> would be a massive pain, requiring massive changes to the structure of
> calls, [...]
I understand that the compiler-implementation cost of proposed extensions to Ada
is an important factor in the decision to accept or reject the extension. As I
have little knowledge of Ada compilers I have nothing to contribute on this.
> I still don't see any reason to support such syntax on positional
> calls. They're mostly used by the lazy ...
I agree with this deprecation of positional calls in general, and would not much
mind if the proposed actual-parameter mode indications were allowed only for
named-association calls.
>> I think actual-parameter modes would be especially valuable during
>> software maintenance: if the modes of the formal parameters are
>> changed, this feature could flag all calls that conflict with the new
>> modes. Existing rules might flag only some of the conflicting calls.
>> For example, if the mode is changed from "in" to "in out" the
>> existing rules flag only calls where the actual parameter is a
>> constant. The existing rules do not flag calls where the actual
>> parameter is a variable, since both "in" and "in out" are then legal,
>> but the caller may not expect the call to change this variable and
>> may therefore malfunction at some later point in its execution.
>
>
> One could make this argument in reverse as well: changing from "in
> out" to "in" would require changing a lot of calls without any
> corresponding benefit.
Why would there be no benefit? If a formal parameter mode is changed from "in
out" to "in", surely it is necessary to review all existing calls: the caller
expects the actual parameter to change, and it no longer changes, so later uses
of the (unchanged) actual parameter in the caller may no longer work.
Of course there are other ways to find all calls of the changed subprogram
(simple search, or a cross-reference listing). Actual-parameter modes are not
crucial but may be helpful.
I agree that there are cases where a mode-change clearly has no impact on calls,
and the actual-mode indications would cause extra maintenance work. For example,
if the parameter type is changed from having value semantics to having reference
semantics, an "in out" parameter mode usually changes to "in" without any change
in the meaning of calls.
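[Editor's note: a minimal sketch of the maintenance scenario discussed above,
not taken from the original message. The names Clamp, Limit and Reading are
hypothetical. It assumes a subprogram whose parameter mode was changed from
"in" to "in out", and shows that the existing rules flag only the call whose
actual parameter is a constant.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Mode_Change_Demo is

   --  Hypothetical subprogram, not from the thread.  Suppose it was
   --  first declared as
   --     procedure Clamp (Value : in Integer);
   --  and maintenance later changed the mode to "in out".
   procedure Clamp (Value : in out Integer) is
   begin
      if Value < 0 then
         Value := 0;
      end if;
   end Clamp;

   Limit   : constant Integer := -5;
   Reading : Integer := Limit;

begin
   --  Clamp (Limit);
   --  The call above is rejected after the change, because a constant
   --  cannot be the actual for an "in out" parameter.

   Clamp (Reading);
   --  This call compiles before and after the change, so nothing draws
   --  the maintainer's attention to the fact that Reading is now updated.

   Put_Line ("Reading =" & Integer'Image (Reading));
end Mode_Change_Demo;
```

[With actual-parameter mode indications as proposed in this thread, the second
call would presumably also need to be revisited after the change, which is the
maintenance benefit being argued for.]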
> One further point: concentrating on the parameter modes is the wrong
> thing. If you are concerned about side-effects on a parameter because
> of a call, you need to worry about....
[ list of side-effect channels ]
Of course I agree that there are many ways in which a call may have side effects
and that parameter modes are only the surface level. In critical software,
however, the more complex side-effect mechanisms tend to be frowned on or
entirely forbidden.
I wonder what the SPARK people think of this proposal; it seems related to the
information-flow analysis in SPARK.
> It is interesting that I was neutral on this idea (other than the lack
> of good syntax) before this discussion. But now I see why the
> "founding father(s)" left it out, and I believe that I am now strongly
> opposed to it. So you guys have done a lot of good (just not the good
> you had in mind)!
OK. I can't say that you have convinced me, but I'm not rabidly in favour of the
proposal either, so I'm content to let it rest.
****************************************************************
From: Dmitry A. Kazakov
Sent: Saturday, March 7, 2009 2:50 AM
> it might not occur to me that V is going to be assigned some value
> other than the value R got assigned in (A), because I'm used to
> thinking of functions as taking just inputs and producing a value, and
> I might not realize that calling Some_Function may change R. This
> would make things clearer that R may be modified:
>
> R := <something>; -- (A)
> <some other code>
> if Some_Function (Some_Expression, IN OUT R) > 0 then
> ...
> V := R;
> ...
I don't see where this pattern could be useful. To me it is rather poor design,
and IN OUT serves as a comment to explain or excuse it.
IMO in out arguments of functions are really required in cases like I/O, where
the state of an argument is implicitly changed. For example:
   S : Stream;
begin
   Connect (S);
   if Read (S) = "hello" then
      ...
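
[Editor's note: a fuller sketch of the I/O case, not part of the original
message. It assumes an Ada 2012 compiler, where functions may have "in out"
parameters. The Stream record below is a hypothetical stand-in with a fixed
buffer and a read position, and the Connect step is omitted.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Stream_Read_Demo is

   --  Hypothetical stand-in for a stream: a fixed buffer plus the
   --  position of the next character to read.
   type Stream is record
      Data : String (1 .. 10) := "hellohello";
      Pos  : Positive := 1;
   end record;

   --  The "in out" mode makes the state change visible in the profile:
   --  each call consumes five characters and advances Pos.
   function Read (S : in out Stream) return String is
      First : constant Positive := S.Pos;
   begin
      S.Pos := S.Pos + 5;
      return S.Data (First .. First + 4);
   end Read;

   Input : Stream;

begin
   if Read (Input) = "hello" then
      --  Input.Pos has been advanced as a side effect of the call.
      Put_Line ("Position is now" & Positive'Image (Input.Pos));
   end if;
end Stream_Read_Demo;
```
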
> just an aid to help someone reading the code realize what's going on
> with R.
I.e. it is a comment.
> There are other ways to accomplish this: sometimes selecting an
> appropriate parameter name might be enough, and if all else fails you
> can add a comment. But this is something that I figured might help
> and couldn't hurt because it would be easy to implement.
As I said, I see no function behind this feature. What does it *add* to the
program other than mere noise/comments?
1. It does not check anything that is not already checked.
2. It imposes a distributed maintenance overhead when some modes are changed to
compatible ones.
3. It is semantically suspicious because the mode is a part of an anonymous
subtype. In fact "in T", "out T", "in out T" are constrained subtypes of T,
with some operations possibly disallowed. If mode is to be specified on the
caller's side then, logically, the whole subtype should be.
Plus, if you add in, out, in out modes you should also add other "modes"
(constraints) like 'Class, discriminants, bounds:
   procedure Foo (X : T'Class);
   Foo (X'Class => Object)
   type T (I : Integer) is ...;
   procedure Foo (X : T);
   Foo (X (24) => Object)
   Put_Line (Item (1..20) => Object)
At its core, the idea is to state what you, the caller, expect from the callee,
or allow it to do, with the argument. The type of the argument already
describes what is allowed. The idea is bad because the function you call is
itself an operation of the argument's type. Thus it is just a tautology in a
strongly typed language. I hope Ada is still one.
If it is intended as something more than a tautology, then it becomes an
extremely dangerous thing known as a "cast."
****************************************************************
From: Georg Bauhaus
Sent: Monday, March 9, 2009 5:34 AM
> This would make things clearer that R may be modified:
>
> R := <something>; -- (A)
> <some other code>
> if Some_Function (Some_Expression, IN OUT R) > 0 then
> ...
> V := R;
> ...
Isn't there just one case warranting the reader's attention? Namely when a
parameter may be modified. So make this the only case, and write
   if Some_Function (Some_Expression, ! R) > 0 then
or
   if ! R.Some_Function(Some_Expression) then
(Not sure whether |R would have to be permitted, too, then.)
****************************************************************