Version 1.3 of ai05s/ai05-0144-2.txt
!standard 6.02 (11) 09-10-30 AI05-0144-2/02
!class Amendment 09-06-07
!status work item 09-06-07
!status received 09-06-07
!priority High
!difficulty Hard
!subject Detecting dangerous order dependencies
!summary
Define rules to reject expressions with obvious cases of order-dependence.
!problem
Ada does not define the order of evaluation of parameters in calls. This opens up
the possibility that function calls with side-effects can be non-portable. The
problem gets worse with "in out" and access parameters, as proper evaluation
may depend on a particular order of evaluation.
Arguably, Ada has selected the worst possible solution to evaluation order
dependencies: it allows such dependencies (by not specifying an order of evaluation),
does not detect them in any way, and then says that if you depend on one (even if
by accident), your code will fail at some point in the future when your compiler
changes.
Something should be done about this.
!proposal
The following rules eliminate the most obvious side effects that can cause evaluation
order problems. These rules are checked statically. Unlike most static rules, which
are conservative, these rules are liberal in that they do not attempt to prevent all
evaluation order problems, just ones that are certain to be problems.
!wording
[I don't know where best to put this, perhaps in 6.4? - RLB]
[We talk about access paths in 6.2, parameter modes, so that would seem to
be a good place to discuss this related issue. -- STT]
Add after 6.2(11):
Two names or prefixes, N1 and N2, are known to denote the same object if:
* N1 statically denotes a part of a stand-alone object or parameter, and
N2 statically denotes the same part of the same stand-alone object or
parameter; or
[We're assuming that this bullet covers selected_components, as
those are always known at compile-time - ED]
* N1 is a dereference (implicit or explicit) of P1, N2 is a dereference
(implicit or explicit) of P2, and prefixes P1 and P2 are known to denote
the same object; or
* N1 is an indexed_component P1(I1,...), N2 is an indexed_component
P2(I2,...), the prefix P1 is known to denote the same object as the
prefix P2, and for each index of the indexed_component, I1 and I2 are
static expressions with the same value, or I1 and I2 are names that
are known to denote the same object; or
* N1 is a slice P1(S1), N2 is a slice P2(S2), the prefixes P1 and P2 are
known to denote the same object, and the subtypes denoted by S1 and S2
statically match.
AARM Discussion: Whether or not names or prefixes are known to denote the
same object is determined statically. If the name contains some dynamic
portion other than a dereference, indexed_component, or
slice, it is not "known to denote the same object". [We could also
use the same rules for indexes for the bounds of slices that have
explicit bounds, although it doesn't seem very likely to occur and
the wording is messy.]
Two names N1 and N2 are known to refer to the same object if N1 and N2
are known to denote the same object, or if N1 is known to denote a
subcomponent of the object denoted by N2, or vice-versa.
AARM Reason: This ensures that names Prefix.Comp and Prefix are
known to refer to the same object for the purposes of the
rules below. This intentionally does not include dereferences; we
only want to worry about accesses to the same object, and a
dereference changes the object in question. (There is nothing shared
between an access value and the object it designates.)
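As an informal illustration of the two definitions above (not part of the
proposed wording), the following sketch classifies a few pairs of names. The
declarations R, V, P, I, and J are hypothetical, and the classifications in the
comments are only one reading of the bullets.

```ada
procedure Known_Names is
   type Rec is record
      A, B : Integer := 0;
   end record;
   type Vec is array (1 .. 10) of Integer;
   type Vec_Access is access all Vec;

   R    : Rec;
   V    : aliased Vec := (others => 0);
   P    : Vec_Access := V'Access;
   I, J : Integer := 1;
begin
   --  R.A   and R.A    : known to denote the same object (same part of R).
   --  R.A   and R.B    : not known to denote the same object.
   --  R     and R.A    : known to refer to the same object
   --                     (R.A is a subcomponent of R).
   --  V (3) and V (3)  : known (static indexes with the same value).
   --  V (I) and V (I)  : known (the index names denote the same object).
   --  V (I) and V (J)  : not known, even though I = J at run time.
   --  P.all and P.all  : known (dereferences of the same prefix).
   --  P.all and V      : not known, even though P designates V here.
   null;
end Known_Names;
```

The last pair shows the intended liberality: the rules only look at what can be
matched name-by-name, not at what the names happen to designate at run time.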
A type is known to be passed by reference if it is tagged or
immutably limited (see 7.5).
AARM Reason: The by-reference property breaks privacy by requiring information
about the full definition of partial views; these properties do not depend
on the full definition of partial views.
If a call C has two or more parameters of mode in out or out that
are of a type that is not known to be passed by reference, then
the call is legal only if:
* For each name N that is passed as a parameter of
mode in out or out to the call C, there is no other name among the
other parameters of mode in out or out to C that is known to refer to
the same object.
[Editor's note: see the discussion item about compatibility. Also note
that I changed "denote the same object" to "refer to the same object",
because this now includes composite types and thus we need the more
complex matching included in "refer".]
If a construct C has two or more direct constituents that are names or
expressions whose evaluation may occur in an arbitrary order, at least
one of which contains a function call with an in out or out parameter,
then the construct is legal only if:
* For each name N that is passed as a parameter of mode in out or out
to some inner function call C2 (not including the construct C
itself), there is no other name anywhere within a direct constituent
of the construct C other than the one containing C2, that is known
to refer to the same object.
For the purposes of checking this rule:
* For an array aggregate, an expression associated with a discrete_choice_list that
has two or more discrete choices, or that has a nonstatic range, is considered
as two or more separate occurrences of the expression;
* For a record aggregate:
- The expression of a record_component_association is considered to occur
once for each associated component; and
- The default_expression for each record_component_association with <> for which
the associated component has a default_expression is considered part of the
aggregate;
* For a call, any default_expression evaluated as part of the call is considered
part of the call.
AARM Ramification: We do not check expressions that are evaluated only because
of a component initialized by default in an aggregate (via <>).
[Editor's note: I'm a bit dubious about these default_expression rules. These
expressions are not visible to the programmer and may not actually be modifiable
by them. OTOH, it isn't very likely that they would cause a problem. These
additional rules were suggested by the minutes of the Brest meeting, so
they were added here.]
AARM Reason: These rules prevent obvious cases of dependence on the order of
evaluation of names or expressions. Such dependence is usually a bug, and
in any case, is not portable to another implementation (or even another
optimization setting).
In the case that the top-level construct C is a call, these rules do not require
checks for most in out parameters, as the rules about evaluation of calls prevent
problems. Similarly, we do not need checks for short circuit operations or other
operations with a defined order of evaluation. The rules about arbitrary
order (see 1.1.4) allow evaluating parameters and writing parameters back in
an arbitrary order, but not interleaving of evaluating parameters of one call
with writing parameters back from another - that would not correspond to any
allowed sequential order.
End AARM Reason.
AARM Ramification: Note that the latter requirement cannot fail for a procedure or
entry call alone; there must be at least one function with an in out or out
parameter called as part of a parameter expression of the call in order for it
to fail.
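A rough sketch of how the second rule might play out in practice follows. The
function Next and the objects N, M, Sum, and Arr are hypothetical, and the
OK/ERROR markings are one reading of the wording above, not an authoritative
ruling. The cases marked ERROR are left as comments so that the unit still
compiles; it assumes a compiler that accepts in out parameters on functions.

```ada
procedure Rule_Sketch is
   --  Hypothetical function with an in out parameter.
   function Next (Counter : in out Natural) return Natural is
   begin
      Counter := Counter + 1;
      return Counter;
   end Next;

   type Pair is array (1 .. 2) of Natural;

   N, M : Natural := 0;
   Sum  : Natural;
   Arr  : Pair;
begin
   --  Sum := Next (N) + N;          --  ERROR: the other operand names N
   --  Arr := (Next (N), N);         --  ERROR: the other component names N
   --  Arr := (1 | 2 => Next (N));   --  ERROR: treated as two occurrences
   --                                --         of the expression Next (N)

   Sum := Next (N) + M;              --  OK: no other name refers to N
   Arr := (Next (N), M);             --  OK: same reasoning
   if Next (N) > 0 and then N > 1 then
      --  OK: a short circuit operation has a defined order of evaluation
      Sum := Sum + Arr (1);
   end if;
end Rule_Sketch;
```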
!discussion
In order to discuss this topic, we need to look at some examples.
type A_Rec is {tagged} record
C : Natural := 0;
end record;
--
--
--
procedure P1 (Input : in A_Rec; Output : in out A_Rec) is
begin
Output.C := Output.C + Input.C;
end P1;
procedure P2 (Bump1, Bump2 : in out A_Rec) is
begin
Bump1.C := Bump1.C + 1;
Bump2.C := Bump2.C + 2;
end P2;
function F1 (Bumpee : access A_Rec) return A_Rec is
begin
Bumpee.C := Bumpee.C + 1;
return (C => Bumpee.C - 1);
end F1;
function F2 (Bumpee : in out A_Rec) return A_Rec is
begin
Bumpee.C := Bumpee.C + 1;
return (C => Bumpee.C - 1);
end F2;
function F3 (Obj : in A_Rec) return Natural is
begin
return Obj.C;
end F3;
function F4 (Obj : access A_Rec) return Natural is
begin
return Obj.C;
end F4;
function F5 (Bump1, Bump2 : access A_Rec) return A_Rec is
begin
Bump1.C := Bump1.C + 1;
Bump2.C := Bump2.C + 2;
return (C => Bump1.C - 1);
end F5;
function F6 (Bump1, Bump2 : in out A_Rec) return A_Rec is
begin
Bump1.C := Bump1.C + 1;
Bump2.C := Bump2.C + 2;
return (C => Bump1.C - 1);
end F6;
function F7 (Bumpee : in out A_Rec; Bumper : in A_Rec) return A_Rec is
begin
Bumpee.C := Bumpee.C + Bumper.C;
return (C => Bumpee.C);
end F7;
O1 : {aliased} A_Rec := (C => 1); --
--
O2, O3 : A_Rec := (C => 1);
type Acc_A_Rec is access all A_Rec;
A1 : Acc_A_Rec := O1'access;
The usual concern about the use of in out parameters in functions begins something
like:
Imagine writing (in a future version of Ada that allows in out parameters) an
expression like:
if F3(O1) = F3(F2(O1)) then
This expression has an evaluation order dependency: if the expression is evaluated
left-to-right, the result is True (both values have (C => 1) and O1.C is set to 2
afterwards), and if the expression is evaluated right-to-left, the result is False
(the right operand is still (C => 1), but now the left operand is (C => 2), and O1.C
is still 2 afterwards).
This is usually used as a reason to disallow in out parameters on functions.
If you have to use access parameters, then the expression is:
if F3(O1) = F3(F1(O1'access)) then
and the use of 'access and aliased on the declaration of O1 should provide a red flag
about the possible order dependence.
However, this red flag only occurs some of the time. First of all, access objects
are implicitly converted to anonymous access types, so no red flag is raised when
using them:
if F3(A1.all) = F3(F1(A1)) then
Perhaps the .all on the left-hand argument could be considered a red flag. But of
course that doesn't apply if that function also takes an access parameter:
if F4(A1) = F3(F1(A1)) then
We have the same order dependency, but there is no sign of a red flag here.
All of these calls can be written in Ada 95, but Ada 2005 makes this situation worse
by adding prefix views and implicit 'access. If A_Rec is tagged, we can write:
if O1.F3 = O1.F1.F3 then
And since tagged parameters are implicitly aliased, if O1 is a tagged parameter,
there isn't the slightest sign of a red flag here.
This shows that we are already in a very deep pit. One can argue whether moving
the rest of the way to the bottom is that significant. [Thanks to Pascal Leroy for
showing just how bad the current situation is.]
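The order dependence described above can be made visible by forcing each order
with temporaries. The following is only a sketch: it uses the untagged form of
A_Rec and the access-parameter function F1 (rather than F2) so that it does not
need in out parameters on functions; Order_Demo, Left, and Right are
hypothetical names.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Order_Demo is
   type A_Rec is record
      C : Natural := 0;
   end record;

   function F1 (Bumpee : access A_Rec) return A_Rec is
   begin
      Bumpee.C := Bumpee.C + 1;
      return (C => Bumpee.C - 1);
   end F1;

   function F3 (Obj : in A_Rec) return Natural is
   begin
      return Obj.C;
   end F3;

   O1          : aliased A_Rec;
   Left, Right : Natural;
begin
   --  Simulate left-to-right evaluation of F3(O1) = F3(F1(O1'Access)).
   O1    := (C => 1);
   Left  := F3 (O1);
   Right := F3 (F1 (O1'Access));
   Put_Line ("left-to-right: " & Boolean'Image (Left = Right));

   --  Simulate right-to-left evaluation of the same expression.
   O1    := (C => 1);
   Right := F3 (F1 (O1'Access));
   Left  := F3 (O1);
   Put_Line ("right-to-left: " & Boolean'Image (Left = Right));
end Order_Demo;
```

With the values used in the discussion, the first line reports TRUE and the
second FALSE, matching the analysis of the original expression.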
We can show similar problems with procedure calls. Consider:
P1 (Input => F2(O1), Output => O1);
If O1 is tagged (and thus passed by reference), this will set O1.C to 3 in
either order of evaluation. (In both cases, Input will have (C => 1) and
Output will have (C => 2).)
But if O1 is not tagged (and thus passed by copy), this will set O1.C to 3
if evaluated left-to-right [F2 sets O1.C to 2, then passes (C => 1) to Input;
Output is passed (C => 2); and then that summed to (C => 3)] and O1.C will
be 2 if evaluated right-to-left [Output is passed (C => 1); F2 sets O1.C to 2,
then passes (C => 1) to Input; and then that is summed to (C => 2)].
We can write similar code in Ada 95:
P1 (Input => F1(A1), Output => A1.all);
getting the same order dependence.
We can also get an order dependence from a single call (even in Ada 83):
P2 (O1, O1);
but only if there are multiple parameters that can modify an object, and
interestingly, only if the parameters are passed by-copy. (If A_Rec is
tagged, for example, the value of O1.C will be increased by 3, no matter
what order the parameters are evaluated in.)
That means that the Ada 95 call:
P1 (Input => F5 (A1, A1), Output => O2);
cannot have an order dependence, but in out parameters:
P1 (Input => F6 (O1, O1), Output => O2);
could, but only if A_Rec is passed by copy.
Note that a single call with only one modifiable parameter cannot have an
order dependence:
P1 (Input => O1, Output => O1);
will always end up with the same result, no matter what order the parameters
are evaluated in. (That result could depend on the parameter passing mode, but
that is controllable by using parameters of types that are by-copy or by-reference
and in any case is not the problem we are discussing here). The parameters will
be evaluated before the call; for by-copy they'll both have the value (C => 1)
and the result value of (C => 2) will be written after the call. No part of the
expression will use the modified value after it is modified, so there cannot be
a dependence.
Similarly,
P1 (Input => F7 (O1, O1), Output => O2);
does not have an order dependence, as again, no part of the expression could
depend on the modified value of O1.
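For what it is worth, the two calls just discussed can be tried directly. This
is a minimal sketch using the untagged form of A_Rec; Single_Writer_Demo is a
hypothetical name, and the unit assumes a compiler that accepts in out
parameters on functions (needed for F7).

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Single_Writer_Demo is
   type A_Rec is record
      C : Natural := 0;
   end record;

   procedure P1 (Input : in A_Rec; Output : in out A_Rec) is
   begin
      Output.C := Output.C + Input.C;
   end P1;

   function F7 (Bumpee : in out A_Rec; Bumper : in A_Rec) return A_Rec is
   begin
      Bumpee.C := Bumpee.C + Bumper.C;
      return (C => Bumpee.C);
   end F7;

   O1, O2 : A_Rec := (C => 1);
begin
   --  Only one modifiable parameter: nothing reads O1 after it is written.
   P1 (Input => O1, Output => O1);
   Put_Line ("O1.C =" & Natural'Image (O1.C));

   --  The inner call modifies O1, but no other part of the call names O1.
   O1 := (C => 1);
   P1 (Input => F7 (O1, O1), Output => O2);
   Put_Line ("O1.C =" & Natural'Image (O1.C)
             & ", O2.C =" & Natural'Image (O2.C));
end Single_Writer_Demo;
```

Both calls should give the same values (O1.C = 2, then O1.C = 2 and O2.C = 3)
whichever order the parameters are evaluated in, and whether A_Rec is passed by
copy or by reference.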
----
After much analysis (mostly outlined in AI05-0144-1), we settled on the following
principles:
(1) This problem is more insidious than "ordinary" side effects in functions. A
function that uses a side-effect internally ought to have been
written so that the side-effect doesn't damage the correctness of the
function. (Otherwise, the function could only be called once, which would be
unusual.) On the other hand, a side effect occurring in a call could not
be known to the author of the function and may not be known to the author of
the call either. This is code that is clearly non-portable, and is likely to
break on a different compiler (or different compiler version, or even
different optimization settings). That makes it more dangerous than the
usual cases we've been living with for years.
(2) We must avoid having these checks annoy programmers by rejecting perfectly
safe things. Therefore, we will generate errors only when there is a
certainty that the result depends on the order of evaluation (when it is
arbitrary) and/or the parameter passing mechanism selected (pass by
copy/pass by reference). In particular, that means we will not reject
programs just because array indexes might be calculated to be the same
value.
(3) The issue occurs anywhere the language defines an arbitrary order of
evaluation (which is most places). The problem could occur just as easily in
an aggregate or assignment statement as in a nest of calls.
(4) The language does make it very clear that all of the parameters are evaluated
(in an arbitrary order), the call is made, and then the parameters are
copied back (in an arbitrary order). Mixing parameter evaluations and
copies back is not allowed, and that reduces the scope of the problem
somewhat.
(5) It probably doesn't pay to try to check access values, as they are rarely
analyzable, they're effectively reference parameters (which usually are
well-defined) and users expect them to be aliased. In addition, checking
access parameters would be incompatible.
---
Note that there are no generic contract issues with these rules. This is a case
where more is allowed (strictly) in an instance when more information (such as
about the kinds of types) is available. In the generic, all of these cases would
be illegal for generic formal types; the only time things would be legal is if
the instance has the "right" actuals (but that's irrelevant since the generic is
already illegal).
---
The rule about multiple in out parameters in a single call is incompatible,
but virtually all programs that would be made illegal would be very dubious. For
instance:
procedure Do_It (Double, Triple : in out Natural) is
begin
Double := Double * 2;
Triple := Triple * 3;
end Do_It;
Var : Natural := 2;
Do_It (Var, Var); --
Since whether Var contains 4 or 6 after the call to Do_It depends on the compiler
version, optimization settings, and potentially the phase of the moon, depending
on code like this is just a ticking time bomb. So this check will mostly detect
bugs.
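One possible way to repair a call rejected by this rule is to pass distinct
objects, so that the result no longer depends on the order of copy-back. The
following is a sketch only; Do_It_Demo and the temporary Copy are hypothetical,
and the right fix in real code depends on which result was intended.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Do_It_Demo is
   procedure Do_It (Double, Triple : in out Natural) is
   begin
      Double := Double * 2;
      Triple := Triple * 3;
   end Do_It;

   Var  : Natural := 2;
   Copy : Natural := Var;   --  hypothetical temporary holding the same value
begin
   --  Both actuals now denote different objects, so each copy-back has
   --  its own target and the outcome is the same on every compiler.
   Do_It (Var, Copy);
   Put_Line ("Double =" & Natural'Image (Var)
             & ", Triple =" & Natural'Image (Copy));
end Do_It_Demo;
```

Here Var ends up as 4 and Copy as 6; the programmer then states explicitly
which of the two values (if either) should be kept.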
[Editor's note: The expansion of this rule to everything that is not required to
be passed by-reference will also expand the incompatibility to some cases where
there is no actual problem - such as large untagged record types, which probably
are passed by reference by all compilers but are not required to be passed that
way. Thus the rule we have adopted seems to violate the principle of not rejecting
safe things. Admittedly, Erhard does not share my feeling that by-reference
parameters are safe (even though the language semantics is well-defined and all such
uses are portable). I fear that Erhard's insistence on expanding this applicability
will eventually cause the entire rule to be dropped -- which would be a massive
pity.]
---
The decision to exclude anonymous access parameters from this checking means that
most of the initial examples in fact are still legal (even if insidious). For instance,
the Ada 95 example from above:
if F4(A1) = F3(F1(A1)) then
is not detected by the proposed rules.
This is mainly for compatibility reasons: since Ada 95 code could contain these
sorts of problems, we don't want to make a lot of it illegal (even if it is
dangerous). The main argument used is that functions with access parameters are
common (as that was the workaround to not having "in out" parameters).
It is annoying that the existence of that workaround is being used to make it harder
to convert to "in out" parameters in those functions (as the result might be
illegal while the original code was not -- even though both are equally dubious),
but that cannot be helped.
[Editor's note: I'm still dubious about this decision, especially as we seem
willing to take the incompatibility for the multiple parameter case.]
!example
(See discussion.)
!ACATS test
!appendix
From: Tucker Taft
Sent: Sunday, June 7, 2009 7:14 AM
I thought about this some more, and tried again.
You had generalized your checks to cover assignment statements, but there are a
lot of constructs that allow arbitrary order of evaluation (aggregates,
constraints, record type elaboration, etc.). I concluded we should focus on the
"arbitrary order of evaluation" rather than on calls.
I also separated out the handling of elementary in-out and out parameters, as
that seems like a pretty different problem (order of copy-back), and it doesn't
help to mix it in with the arbitrary order of evaluation stuff. I also limited
it to cases of two or more in-out or out parameters to the same call. I don't
think inner calls make any difference here. (Your original wording was
confusing to me, so you might have not meant to worry about inner calls either.)
If we have aliased parameters, then we only care about non-explicitly-aliased
elementary in-out/out parameters, but I left out that subtlety for now.
I didn't bother talking about the effective "out"
parameter mode of the left-hand side of an assignment, as that seems irrelevant
to the rules. The problems come with the arbitrary order of evaluation of the
LHS vs. the RHS, not with the actual assignment operation itself, which happens
after all of the LHS and RHS evaluation is done.
Finally, I didn't try to worry about general access-type parameters, but only
access parameters with 'Access or 'Unchecked_Access. In any case once you get
into access types, you run into all kinds of ways things can go wrong, so I
would rather not venture into that can of worms too far. I think we might want
to avoid any discussion of parameters of an access type, and just have the one
bullet which deals with evaluation order.
[This was version /01 of the AI. This was the version discussed at the
Brest ARG meeting. This was made as a separate alternative so the
original lengthy discussion could be preserved without rewriting it.]
****************************************************************
From: Bob Duff
Sent: Saturday, June 13, 2009 11:30 AM
> Here is the version I am proposing.
I agree with your proposal. Mostly editorial comments below (maybe that's premature?).
...
> (See wording.)
I think a summary of the proposal should go here. Such as:
The following rules eliminate the most obvious side effects that can cause
evaluation order problems. These rules are checked statically. Unlike most static
rules, which are conservative, these rules are liberal in that they do not attempt
to prevent all evaluation order problems.
Otherwise, one gets lost in the detailed definitions, wondering what the point is.
Or else put some similar text *before* the detailed wording, either as an AARM
annotation, or a [bracketed introductory paragraph].
> !wording
...
> AARM Discussion: This is determined statically. If the name
> contains
^^^^
> some dynamic portion other than a dereference, indexed_component, or
> slice, it is not "known to denote the same object". [We could also
> use the same rules for indexes for the bounds of slices that have
> explicit bounds, although it doesn't seem very likely to occur and
> the wording is messy.]
It's hard to know what "this" refers to above. I suggest moving the AARM
annotation above the bullets, and change "This" to "This property". Or else
keep it here, and spell it out: "Whether or not names or prefixes are known
to denote the same object is determined statically. ..."
> Two names N1 and N2 are *known to refer to the same object* if N1 and
> N2 are known to denote the same object, or if N1 is known to denote a
> subcomponent of the object denoted by N2, or vice-versa.
...
> If a construct C has two or more direct constituents that are names or
> expressions whose evaluation may occur in an arbitrary order, at least
> one of which contains a function call with an in out, out, or
> access-to-variable parameter, then the construct is legal only if:
"access-to-variable parameter" seems confusing; I think you mean it to include
named access types, as well as access parameters. How about "parameter of an
access-to-variable type"?
> * For each name N that is passed as a parameter of mode in out or out
> to some inner function call C2 (not including the construct C
> itself), there is no other name anywhere within a direct constituent
> of the construct C other than the one containing C2, that is known
> to refer to the same object; and
>
> * For each name N'Access or N'Unchecked_Access that is passed as an
> access-to-variable parameter to some inner function call C2 (not
> including the construct C itself), there is no other name anywhere
> within a direct constitutent of the construct C other than the one
> containing C2, that is known to refer to the same object as N.
>
> For the purposes of checking this rule on an array aggreagate, an
aggregate
> expression associated with a discrete_choice_list that has two or more
> discrete choices, or that has a nonstatic range, is considered as two
> or more separate occurrences of the expression. Similarly for a
> record aggregate, the expression of a record_component_association is
> considered to occur once for each associated component.
>
> AARM Reason: This prevents obvious cases of dependence on the order of
^^^^
Another dangling "this".
> evaluation of names or expressions. Such dependence is usually a bug,
> and in any case, is not portable to another implementation (or even
> another optimization setting).
>
> The third bullet does not check for uses of the prefix, since the
> access type
^^^^^^^^^^^^
Which third bullet?
[Editor's note: The one Tucker deleted. ;-) He apparently didn't update
these notes.]
> and the designated object are not the same, and "known to denote the
> same prefix" does not include dereferences anyway.
...
> The usual concern about the use of *in out* parameters in functions
> begins something
> like:
It would be useful to mark each of the following with a comment "-- OK" or
"-- ERROR:" showing whether the proposed rules outlaw it. The proposed rules
do not outlaw all the cases below, I think -- e.g. the cases that pass A1.
Which is OK with me.
> Imagine writing an expression like:
>
> if F3(O1) = F3(F2(O1)) then
>
> This expression has an evaluation order dependency: if the expression
> is evaluated left-to-right, the result is True (both values have (C =>
> 1) and O1.C is set to 2 afterwards), and if the expression is
> evaluated right-to-left, the result is False (the right operand is
> still (C => 1), but now the left operand is (C => 2), and O1.C is still 2 afterwards).
>
> This is usually used as a reason to not allow *in out* parameters on functions.
"to not allow" --> "not to allow" or "to disallow"
> If you have to use access parameters, then the expression is:
>
> if F3(O1) = F3(F1(O1'access)) then
>
> and the use of 'access and aliased on the declaration of O1 should
> provide a red flag about the possible order dependence.
>
> However, this red flag only occurs some of the time. First of all,
> access objects are implicitly converted to anonymous access types, so
> no red flag is raised when using them:
>
> if F3(A1.all) = F3(F1(A1)) then
>
> Perhaps the .all on the left-hand argument could be considered a red
> flag. But of course that doesn't apply if that function also takes an access parameter:
>
> if F4(A1) = F3(F1(A1)) then
>
> We have the same order dependency, but there is no sign of a red flag here.
>
> This is all Ada 95 code, ...
No, it's not -- I see calls to functions with 'in out' params above.
[I don't see any such calls other than in a single 'straw man' case.]
>...but Ada 2005 makes this situation worse by adding prefix views and
>implicit 'access. If A_Rec is tagged, we can write:
...
> We can also get an order dependence from a single call (even in Ada 83):
>
> P2 (O1, O1);
>
> but only if there are multiple parameters that can modify an object,
> and
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I think I know what you mean, but it's confusing -- parameters don't modify things.
[Editor's note: neither do calls or anything else. They just assigned back
as needed, apparently by some magical force. :-) Writing two extra sentences
to be pedantic about topics such as this doesn't help understanding.]
> interestingly, only if the parameters are passed by-copy. (If A_Rec is
> tagged, for example, the value of O1.C will be increased by 3, no
> matter what order the parameters are evaluated in.)
...
> Survey of solutions
It is very useful to have this "Survey" attached to this AI, for posterity!
> It fairly obvious that that order dependencies are a problem in Ada,
> and
"that that" --> "that"
> have been getting worse with each version of the language (even
> without *in out* parameters in functions). Moreover, it has been used
> as the primary reason for leaving something natural and useful (*in
> out* parameters for functions) out of the language.
...
> One obvious solution would be to define the order of evaluation,
> eliminating the problem at the source. Java, for instance, requires
> left-to-right evaluation. However, that would encourage tricky code
> like the various examples by making it portable.
I doubt if I'll convince anyone, but I think that's a bogus argument.
People do, in fact, depend on eval order all the time, either by accident, or
because they don't know the language rules. And there's nothing any language
definition can say to stop it. A language definition can say, "It is considered
bad style to depend on evaluation order (of...). Don't do that." That would have
just as much effect as leaving the order undefined -- i.e. it puts people on
notice, but doesn't entirely stop the problem.
>... Moreover, Ada compilers have been using this flexibility for
>decades; trying to remove it from compilers (particularly from
>optimizers) could be very difficult. Note that this is the only
>solution that actually could eliminate dependencies on side-effects inside of functions.
> But defining the order of evaluation was considered for both Ada 83
>and Ada 95 and was deemed not worth it -- it's hard to see what has changed.
>
> Another option would be to increase the visibility of parameters with
> side-effects. This sounds somewhat appealing (after all, it seems to
> be the basis on which access parameters are deemed OK and *in out*
> parameters are not). One possibility would be to add ordering symbols to named notation:
> Param <- <expr> for an *in* parameter (this includes access); Param ->
> <expr> for an *out* parameter; and Param <-> <expr> for an *in out* parameter.
Ada 80 (or thereabouts) used the symbols :=, =:, and :=: for this.
I suggest you use this notation (before shooting it down below).
And eliminate the "That's especially bad..." sentence below.
> However, for compatibility, old code that don't use the symbols would
> have to be allowed. That's especially bad because the symbol for
> ordinary named parameters (=>) looks like the symbol for an *out*
> parameter; while it usually will be an *in* parameter. Moreover, this
> solution does nothing for positional parameters in calls nor for the
> prefixes of prefix notation. And it is misleading for access
> parameters, whose mode is officially "in", but still might cause side-effects.
>
> One could argue that positional parameters are already unsafe and
> requiring named notation to be safe is not much of an imposition. But
> the prefix and access issues are not so easily explained away.
> Additionally, putting the mode into calls in some way makes
> maintenance of programs harder: changing the mode of a call is going
> to make many calls illegal, while today most calls will remain legal (all will if the mode is changed from "in out" to "in").
I think that last part is bogus -- it's like saying the full coverage rules for aggregates make maintenance harder.
> Another syntax suggestion that was made recently was to (optionally)
> include the parameter mode as part of the call. That would look something like:
>
> if F3(in O1) = F3(in F2(in out O1)) then
Shirley, you'd leave out the 'in's!
> This could be applied to positional calls as well, but still provides
> no help for prefix calls nor for access parameters.
>
> One could imagine requiring the syntax for calls of functions with *in
> out* parameters and making it optional elsewhere. That might placate
> *in out* parameter opponents, but otherwise doesn't seem to do much for the language.
If we were to do any of the above "marking [in]out params" syntax, we should also define
a Restriction that forces it on all calls (not 'in' params, of course).
Worth mentioning, even though we're not going this route.
> Finally, we come to some sort of legality rules and/or runtime checks
> for preventing such order dependencies. It is important to note that
> making such rules too simple (and strong) only would mean that
> temporaries have to be introduced in some expressions. That would be
> an annoyance, but surely not as bad as the current ticking time-bomb.
>
> The easiest option is to blame all of the problems on functions with
> *in out* parameters and make them stand alone. The rule would be something like:
>
> A call of a function with an *in out* parameter must be the only call in
> an expression.
>
> That would mean that
>
> if F2(O1) then
>
> would be legal (assuming F2 returned type Boolean), but
>
> if F2(O1) = (C => 1) then
>
> would not be. Obviously, this is too strict.
I agree it's too strict, but I don't think it's "obviously" too strict.
It allows:
Blah : T := Func(...);
which is one of the more useful cases. Especially when T is something like String.
>...Amazingly, it also not strict
> enough:
>
> Some_Array(F3(O1)) := F2(O1);
>
> would be allowed. (An assignment statement is *not* an expression!)
Well, then obviously the wording of the rule should not be "expression".
It should be something like (in this rejected alternative):
If a construct has two or more constituents whose evaluation may occur in an
arbitrary order, and contains a call to a function with an [in]out param,
then it shall contain no other calls.
> A call of a function with an *in out* parameter must be the only call in
> an expression or statement;
> If a call of a function with an *in out* parameter is the source expression
> of an assignment_statement, the target variable_name shall not include a
> call, an indexed component, a slice, or a dereference.
...
> So we could modify the definition of "potentially the same object" to:
>
> Two objects are considered to be "potentially the same object" one has
> the type
if
**
> of a part of the other, or one has a part whose type is a partial view, unless:
> * one object is part of a stand-alone object, and the other object is part
...
> Such rules would only reject calls where it is clear that parts of the
> same object are involved. That eliminates the complications of private
> types (we won't look in them), arrays (we won't try to determine if they are the same), and so on.
"arrays" --> "array components", I think you mean.
> These are the rules proposed in the !wording section above.
...
> None of the proposed rules do anything about side-effects totally
> inside of functions. One way to deal with that would be to require an
> expression to contain only a single function call unless all of the functions are *strict*.
I don't think "strict" is the right term, here. In computer science, "strict"
means "arguments are evaluated at the call", as opposed to "lazy" and "non-strict" (which
are almost, but not quite the same thing -- Haskell is non-strict), and "call by name".
I'd consider extending "pure" to this concept. Or else don't define a term, just talk about
"such functions" in this part.
[Editor's note: I tried that and people hated it. A new term seemed better. Doesn't
matter, because we're not doing any of this.]
>... A strict
> function exposes all of its side effects in its specification, meaning
>it does not read or write global variables, call non-strict functions,
>write hidden parts of parameters, etc. Most of the language-defined
>functions are strict (that would have to be declared somehow).
>
> Strict functions have several other nice properties: they don't need
> elaboration checking (freezing is sufficient to prevent
> access-before-elaboration problems); they can be used in symbolic
> reductions; and they can use the optimizations allowed for functions in pure packages.
>
> However, such a requirement would be quite incompatible. Moreover,
> strict functions would be rather limiting by themselves.
>
> An alternative that has been suggested is to allow Global_In and
> Global_Out annotations on subprograms, which would declare the global
> side-effects of a subprogram. Such annotations could not be a lie
> (they'd have to be checked in some way), and thus would fill the role
> of strict functions more flexibly. But it would still be too
> incompatible to ban dangerous side-effects in functions (although separate tools
> or non-Ada operating modes could make such checks).
Or pragma Restrictions. Same applies to the "strict" discussion above.
****************************************************************
From: Tucker Taft
Sent: Saturday, June 13, 2009 4:36 PM
Thanks for reviewing this. We approved
the intent, after deleting the bullet relating to access parameters, and generalizing
the copy-back rules to apply to not only elementary types, but in fact any type that
is not tagged or immutably limited. Basically, unless it is bloody obvious/guaranteed
the type will be passed by reference, then we disallow having two out parameters that denote
the same object in a single call.
We didn't really review the discussion or examples, but it sounds like they will need
some work.
****************************************************************