Version 1.1 of acs/ac-00311.txt

Unformatted version of acs/ac-00311.txt version 1.1
Other versions for file acs/ac-00311.txt

!standard 4.5.9(0)          18-12-14 AC95-00311/00
!class Amendment 18-12-03
!status received no action 18-12-03
!status received 18-11-06
!subject allow exit in reduction expressions?
!summary
!appendix

From: Brad Moore
Sent: Tuesday, November 6, 2018  10:13 AM

I was looking for some examples of reductions in other languages, and found this
one written in Perl, with the following challenge.

   print 'The first ten primes are: ',
     show(take(10, filter { prime(shift) } integers)), "\n";

   Think as to how you would express the first ten prime numbers in a simple way
   in your favourite programming language?

I was trying to think how one might do this, as a one liner, using the proposed
new Ada features for Ada 2020.

I first was thinking of an array aggregate, but there doesn't seem to be an easy
way to limit the length of the result to 10, when you need a some unknown larger
number of iterations to find the result.

I then thought of using 'Reduce. This almost works. With a function expression,
and 'Image for all one could in theory write:

Put_Line
   ("The first 10 primes are: " &
         Number_Array'Image
           ([for I in 1 .. 1_000 => I]'Reduce
               (((function(L,R) return
                     (if L'Length = 10 or else not Is_Prime(R) then L else L & R),
                [])
           )
   );

But the problem here is that you end up iterating through all values from 1 to
1000, when you could have computed the result with much less iteration.

We have the new aspect Allows_Exit which allows one to write an exit statement
for a loop as a procedure body.

AI12-0189-1 Anonymous loop body procedures
AI12-0286-1 Allows_Exit aspect

It seems like it would be similarly useful to also write exit when iterating
where the loop body appears like a function expression.

Then one could write:

Put_Line
   ("The first 10 primes are: " &
         Number_Array'Image
           ([for I in 1 .. 1_000 => I]'Reduce
               (((function(L,R) return
                     (if L'Length = 10 then exit
                      else (if Is_Prime(R) then L & R else L)),
                [])
           )
   );

Where the iteration would terminate after 10 values are found.

***************************************************************

From: Tucker Taft
Sent: Tuesday, November 6, 2018  11:51 AM

Looks pretty grizzly to me, and it might be an implementation burden to set up
an exception-handling environment in the middle of an expression.  Seems like
the kind of thing for which it would be fine to have to go out of line, rather
than trying to push the whole thing into a single expression.  We are already
having push-back on expression complexity for cases where we have real user
demand -- this one might push us over the brink.

***************************************************************

From: Brad Moore
Sent: Tuesday, November 6, 2018  2:08 PM

It could be that this is the straw that broke the camels back, but if we were
going to consider this, I would suggest a couple tweaks/improvements to what I
wrote below.

Ideally, you'd want to have syntax such as "exit with value", that is, return a
value, and if the enclosing construct supports exit, then let the enclosing
construct do an exit.

This would mean that the exit could occur on the iteration that found the
answer, rather than having to do one extra iteration, and dealing with a
function call that doesn't return anything. i.e. With this improvement all calls
to the function return valid results.

This would allow one to use a filter in this example, whereas before you
wouldn't want to use a filter, because you'd end up having to find one more
prime number than you wanted before breaking out of the iteration.

I think this also would eliminate the need for exception handling in the
expression. An function that Allows_Exit could just set an implicit exit flag
before returning. If the calling construct supported exit, such as a Reduction
expression, then the implementation would just break out of the iteration.

So, to rewrite the example with these thoughts in mind, (exit with value + using
filter expressions), we could write:

Put_Line
 ("The first 10 primes are: " & Number_Array'Image (
      [for I in 1 .. 1_000 when Is_Prime (I) => I]'Reduce
          (((function(L,R) return (if L'Length = 9 then exit with L & R else L & R), [])));

Which I think also looks a lot less grisly.

Does that help?

***************************************************************

From: Jean-Pierre Rosen
Sent: Tuesday, November 6, 2018  2:22 PM

> So, to rewrite the example with these thoughts in mind, (exit with value +
> using filter expressions), we could write:
>
> Put_Line
>  ("The first 10 primes are: " & Number_Array'Image (
>       [for I in 1 .. 1_000 when Is_Prime (I) => I]'Reduce
>           (((function(L,R) return (if L'Length = 9 then exit with L & R else L & R), [])));
>
> Which I think also looks a lot less grisly.

Sorry, but I can't see the benfit of all that mouthfull when it is so simple to
write it procedurally, without constructing an intermediate array...

***************************************************************

From: Brad Moore
Sent: Tuesday, November 6, 2018  11:27 PM

It sounds simple, but when you go to write this, you find you need a declare
block, and a variable to keep track of how many values have been collected,
handling for line ending, interleaving IO, etc, it starts to look like a
mouthfull doing it the old way.

eg.

   declare
      Prime_Count : Natural := 0;
   begin
      Put ("The first 10 primes are:");

      Prime_Loop :
      for I in 1 .. 1000 loop
         if Is_Prime(I) then
            Put (Integer'Image(I));
            Prime_Count := Prime_Count + 1;

            exit Prime_Loop when Prime_Count = 10;
         end if;
      end loop Prime_Loop;

      New_Line;
   end;

But I think I do have a better and an adequate solution to the Perl challenge,
without adding more to existing proposals.

I think using the new array aggregate syntax of AI12-0212-1, with array slicing,
iterator filters, and 'Image for all, we can write:

  Put_Line ("The first 10 primes are: " & Num_Array'Image ([for I in 1 .. 1_000 when Is_Prime (I) => I](1 .. 10)));

This now seems like a mouthful that is quite a bit easier to swallow, and not as
likely to have a bitter after taste, or give you indigestion.

Yes, the compiler might execute all 1000 iterations, but a smarter compiler
might recognize that that the result is only returning the first 10 values from
the slice, and optimize out the unnecessary iterations. The Perl program
probably isn't any more efficient, as all most people would care about in this
example is that the answer is correct. I think the Ada solution is more readable
than the Perl one, from the perspective of being more intuitive to someone
seeing this syntax for the first time.

***************************************************************

From: Steve Baird
Sent: Tuesday, November 6, 2018  8:02 PM

> Seems like the kind of thing for which it would be fine to have to go out of
> line, rather than trying to push the whole thing into a single expression.

I'm not advocating this feature (I agree with what Tuck said), but there is an
argument for this sort of feature that I think we should at least look at.

One argument for "shorthand" expressions of various sorts (e.g., quantified
expressions, declare expressions) is that they make code easier to both read and
write.

Unlike quantified expressions, I don't see a justification for the proposed
feature using that metric.

But another point to consider when looking at any of these "shorthand"
expression proposals is whether they allow you to express something in a
contract that would otherwise be difficult to express at all.

Suppose you have something like

   package Pkg is
     function Find (C : Character; S : String) return Natural
       with Post => (if (for some K in S'Range => S (K) = C)
                     then S (Find'Result) = C
                     else Find'Result = 0);

      -- if C occurs in S, then return the index of an occurrence of
      -- C in S; otherwise return 0.
    end;

but someone tells you "The project coding standard doesn't allow quantified
expressions; you need to rewrite this using a loop statement".

So you say "Fine, I'll do it out of line" and you rewrite the spec as something
like

   package Pkg is
     function Occurs_In (C : Character; S : String) return Boolean;

     function Find (C : Character; S : String) return Natural
       with Post => (if Occurs_In (C, S) then
                     S (Find'Result) = C
                     else Find'Result = 0);

      -- if C occurs in S, then return the index of an occurrence of
      -- C in S; otherwise return 0.
    end;

But what should the postcondition of Occurs_In be? If the best answer is
"describe what it does with a comment" (or even "declare Occurs_In as a
recursive expression function"), then we have lost real expressive power, not
just convenience, by excluding quantified expressions. Modular static analysis
tools such as SPARK will not be able to do as good job with the second version
(because they won't look at the body of Pkg when analyzing a client of Pkg; they
also might have a hard time figuring out what is going on even if they did peek
into the body of Pkg).

Nobody is proposing changing any rules about quantified expressions.

My question is whether a similar argument applies in the case of the feature
Brad is proposing.
   - Does this feature provide expressive power that would otherwise
     be lacking?
   - If so, then does a need for this come up in practice often?
   - If so, then would allowing this feature allow tools such as
     SPARK to provide better analysis results? Would this construct
     convey information that the tools would be able to make use of?

In this particular case, the complexity-to-utility ratio still seems too high to
me. But I think the above questions ought to be asked whenever we are
considering a new construct that might occur in a contract expression.

***************************************************************

From: Brad Moore
Sent: Tuesday, November 6, 2018  11:42 PM

After giving more thought, and realizing the solution in the previous email, so
far, allows_exit in function_expressions feels like it serves as an
optimization, which probably doesn't help much in contracts and proofs, but I
will try to think of an example where such a feature would be more needed.

***************************************************************

From: Tucker Taft
Sent: Wednesday, November 7, 2018  8:00 AM

...
>Ideally, you'd want to have syntax such as "exit with value", that is,
>return a value, and if the enclosing construct supports exit, then let
>the enclosing construct do an exit.

I would certainly agree that if you start having an "exit" in an expression
involving a quantifier or other looping expression, you would almost certainly
want an "exit with value" rather than a simple exit.  But then you start to get
into the multi-level exit question, which would then imply having labeled loops,
and you ultimately get to the point that you are duplicating the entire
statement-level syntax within expressions.

Of course, there are languages where everything is an expression, and we are
admittedly beginning to head in that direction in Ada to serve the needs of
postconditions, so we may need to develop some criteria for saying "that is
going too far" for some proposals, and "that is easy to understand and makes
perfect sense" for others.

Pure control flow does not make sense in my view at the expression level.  Early
exit with a value, or early return, might make sense, but even at the statement
level, early exits and early returns can be a source of confusion, and are
disallowed or restricted in some coding conventions.  So if we need to get a
stake in the ground, I would say that this might be a good place to do it -- no
early exits or returns from expressions, unless they are implicit in the
semantics (such as the short-circuit semantics of a quantified expression).

A "lazy" evaluation semantics, which is what languages like Haskell adopt, might
be another way to support such things.  That is basically what Brad is
suggesting a compiler might do when it is clear that there is no reason to
generate more prime numbers than are preserved at the end.

For reductions, if there is the semantic equivalent of a "zero" for the
reduction operation (i.e. a value where X op Z = Z), then that would be a
natural value on which to stop the reduction.  This is analogous to stopping
evaluation of a universal quantified expression when you get a False (the "zero"
of the "and" operator), or stopping on a True in an existential quantified
expression (True being the "zero" of the "or" operator).   One could imagine
defining some notion like this (it relates to being "idempotent" as well, I
suppose), and then incorporate it into the semantics of reduction, without
having to otherwise extend the syntax, and without having to introduce
explicitly control-flow-like notions into expressions.

**************************************************************

From: Brad Moore
Sent: Thursday, November 8, 2018  9:41 AM

...
> Pure control flow does not make sense in my view at the expression
> level.  Early exit with a value, or early return, might make sense,
> but even at the statement level, early exits and early returns can be
> a source of confusion, and are disallowed or restricted in some coding
> conventions.  So if we need to get a stake in the ground, I would say
> that this might be a good place to do it -- no early exits or returns
> from expressions, unless they are implicit in the semantics (such as the short-circuit semantics of a quantified expression).

That sounds to me like a sensible place to put a stake in the ground. Sounds
good to me!

> A "lazy" evaluation semantics, which is what languages like Haskell
> adopt, might be another way to support such things.  That is basically
> what Brad is suggesting a compiler might do when it is clear that
> there is no reason to generate more prime numbers than are preserved at the
> end.
>
> For reductions, if there is the semantic equivalent of a "zero" for
> the reduction operation (i.e. a value where X op Z = Z), then that
> would be a natural value on which to stop the reduction.  This is
> analogous to stopping evaluation of a universal quantified expression
> when you get a False (the "zero" of the "and" operator), or stopping on a
> True in an existential quantified expression (True being the "zero" of
> the "or" operator). One could
> imagine defining some notion like this (it relates to being
> "idempotent" as well, I suppose), and then incorporate it into the
> semantics of reduction, without having to otherwise extend the syntax,
> and without having to introduce explicitly control-flow-like notions into
> expressions.

A couple more comments on the Perl challenge solution.

I suggested the following, using proposed array aggregate syntax, is the best
answer, which I still believe to be the case:

  Put_Line ("The first 10 primes are: " &
            Num_Array'Image ([for I in 1 .. 1_000 when Is_Prime (I) => I](1 .. 10)));

But it is interesting to note that this can be made into a reduction simply by
inserting "'Reduce("&",[])" in the middle, while still having the same potential
for optimization with lazy-evaluation, as in;

  Put_Line ("The first 10 primes are: " &
            Num_Array'Image ([for I in 1 .. 1_000 when Is_Prime (I) => I]'Reduce("&",[])(1 .. 10)));

Which suggests that such array aggregates can be viewed as being a shorthand for
reduction.

Also a couple of other points to add to the benefits of these functional
expressions over the imperative approach of writing a loop, over and above the
points already raised by Steve Baird and Tucker, in response to Jean-Pierre's
comment.

The loop version requires a global variable, which in itself is undesirable, but
because of that, and because of the interleaved I/O, the loop cannot be
parallelized, while the expression can. (And can be explicitly parallelized by
adding the parallel keyword before the "for", if the expression has 'Reduce)

Except note that currently the array aggregate proposal doesn't allow the
parallel keyword, while the reduction syntax proposal does. Given that I've just
shown that the array aggregate syntax can be viewed as a shorthand for a
reduction, it seems like we should be allowing the parallel keyword there.

Another point is that the loop exit logic is more complex in the imperative
version, and more likely to have a bug if a programmer is asked to write this
from scratch. (e.g. Uninitialized variable, or off by one error, etc) The array
slice of the functional approach seems less likely to have a logic error being
introduced.

Lastly, though we are creating a temporary array object in the expression, it
should also be noted that the loop version has multiple calls to do I/O, while
the expression has only one call, and I/O calls can be expensive. So in terms of
efficiency, there is a trade-off either way. But again, given the specific
problem that is being asked to be solved, it is unlikely that efficiency or lack
thereof would even be noticeable.

***************************************************************

Questions? Ask the ACAA Technical Agent