Version 1.9 of ai12s/ai12-0242-1.txt

Unformatted version of ai12s/ai12-0242-1.txt version 1.9
Other versions for file ai12s/ai12-0242-1.txt

!standard 4.5.9 (0)          18-03-01 AI12-0242-1/03
!class Amendment 14-06-20
!status work item 14-06-20
!status received 14-06-17
!priority Medium
!difficulty Hard
!subject Reduction Expressions
!summary
Two New Attributes to facilitate reduction via a Reduce Attribute for sequential reduction and Parallel_Reduce Attribute for parallel reduction.
!problem
A common problem in computing is the need to summarize a set of values.
Currently in Ada, one generally has to write a subprogram to do this, which involves writing a loop to iterate through the set of values, and a global accumulator variable or set of variables that get updates through each iteration of the loop.
One of the most important addition to Ada in Ada 2012 is contract specifications for subprograms. However to write pre and post conditions that attempt to summarize inputs or outputs, one has to write a subprogram to express the summary result, which can add clutter to an API for an abstraction, and also injects a level of indirection, where if a client wants to understand the requirements of the pre condition, or the effects of the post condition, one has to find and examine the body of the subprogram named in the contract specification.
It would be useful if such summarizations could be expressed in a simpler, more concise manner, such as an expression, without always having to write another subprogram. The proposed solution is to allow an Attribute to be applied to an array object to provide the summarization.
Ada 83 was already a pretty good language for "hybrid" functional/imperative programming, because you can have a sequence of declarations that progressively compute a desired result. Often it was only because you needed to enter a loop that you had to do any significant amount of computing after the "begin." In today's Ada, with conditional expressions, quantified expressions, the recently approved iterated component association, and our newly proposed container aggregates, the need for explicit ifs and loops after the "begin" is continuing to decrease. There appears to be a shift in the a balance between functional and imperative features, more towards the functional side. There is a resurgence of interest in functional programming these days, but going all the way to "pure functional" programming, with "monads" instead of assignment statements, and recursion only with no iteration, would be too going too far towards the functional side, but it is important to "complete" the functional primitives, if only to give the desired power to Ada's contract-based programming.
Some of the existing syntax in Ada already can provide forms of summarization. For example, a quantified expression can be viewed as being a special purpose Reduction Expression that applies an operation, the Predicate, to a set of values to reduce the predicate results to a single Boolean result. Similarly, an array aggregate in the form of an iterated_component_association can also be viewed as being a special purpose Reduction Expression that applies an operation, in place concatenation, to a set of values to reduce into an array result. A Reduction expression is a more general syntactic form where there is less constraint on the type of operation to be applied to summarize the set of values, and the operation can produce result values of types other than Boolean values, or array creation.
Such a reduction capability is the last remaining major component needed to complete the "iterated" operations. There are many applications of reduction capabilities, including both sequential and parallelism usage. It will often be the right way to define, in a postcondition, what is the final value of one output of the subprogram, even though the subprogram might have multiple outputs and might do many other things.
Such a capability can allow programmers to express algorithms more concisely, which tends to also make the algorithms easier to read, debug, understand, and maintain.
Another need is to be able to perform reductions in parallel. Modern computing platforms typically provide support for multicore computing, yet writing algorithms that take advantage of these multicore platforms is considerably difficult to write, very error prone, and difficult to debug. Many of the divide and conquer parallelism algorithms involve reduction, commonly known as MapReduce, in the parallelism world. The challenge with such reductions is that writing a subprogram with a global accumulator value normally would imply a data race if multiple threads of execution end up updating the global variable in parallel.
parallel reduction is in many ways a separate issue, when compared to sequential reduction, and where (and whether) it is best inserted depends heavily on the amount of computing done within a single iteration, the independence of those computations, and the number of iterations. The advantage of the notations like quantified expressions, container aggregates, and reduction expressions, is that they are self-contained, and can often be analyzed holistically by the compiler to determine whether and where to insert parallelism. It is important in the context of Ada to have sufficient annotations to allow the compiler to both check the safety of explicit parallelism, and also safely insert implicit parallelism. Hence the importance of the Global and Nonblocking annotations.
One of the strengths of Ada is in the area of safety critical and high integrity systems. Ada has a unique opportunity to extend its offering in these areas by providing parallelism capabilities that can be statically checked by the compiler to help the programmer eliminate data races.
!proposal
This proposal depends on the facilities for aspect Global (AI12-0079-1) and for aspect Nonblocking (AI12-0064-2). Those proposals allow the compiler to statically determine where parallelism may be introduced without introducing data races.
The goals of this proposal are to provide a mechanism for expressing a summarization of a set of values. Such a summarization is known as a reduction. It is desired that such a mechanism;
- is easy to understand - is easy to write/express - allows parallelization - provides safety by eliminating data races and logic errors. - fits in well with existing Ada syntax
Reduction Attributes ---------------------
This model proposes two new attributes, 'Reduce, and 'Parallel_Reduce which can be used to combine the values of the prefix which denotes an array object into a single result. This mechanism is called reduction. The 'Reduce attribute is intended to provide sequential reductions by default, unless the compiler can determine it is safe and beneficial to performance to implicitly introduce parallelism. The 'Parallel_Reduce attribute is intended to allow the programmer to explicitly request that a reduction be performed in parallel, which is useful because the compiler is not always able to make a correct assessment statically whether injecting parallelism with improve instead of detract from performance. When a compiler cannot make that determination, it is more apt to choose sequential processing because it is simpler, and more deterministic. By allowing the programmer to request parallelism explicitly, the compiler does not need to make the assessment, and the programmer can easily revert to sequential processing with by simply replacing the parallel version of the attribute with the sequential one.
The Parallel_Reduce attribute activates the necessary static checks by the compiler to ensure that the parallelism can safely be introduced, without necessarily requiring analysis to determine if there is a performance benefit to the parallelism. The two attributes are similarly named to make it easier for the programmer to switch between parallelism and sequential implementations, such that performance measurements and other similar techniques can be employed to help the programmer decide which attribute is the better choice for a particular problem.
The prefix of the 'Reduce and 'Parallel_Reduce attribute is either an array object or a container object. This includes objects created by array aggregates, or container aggregates.
The 'Reduce attribute and 'Parallel_Reduce attribute both accept two arguments. The first argument denotes a binary subprogram or operator. If the subprogram is a function, then the function is of the form;
function Combiner (L, R : S) return S;
The type of the parameters and result are of the same type as the component or element objects associated with the attribute prefix.
If the subprogram is a procedure, then the procedure is of the form;
procedure Combiner (L : in out S; R : S);
In either case, such a subprogram is called a combiner_subprogram. The combiner subprogram is called once for every value associated with the prefix of the attribute, where each value of the prefix is passed in as the second argument of the combiner subprogram, and results from previous calls are passed in as the first argument of the combiner subprogram.
For the very first call to the combiner subprogram, the second argument of the attribute is passed as the first argument of the combiner subprogram, since there are no previous results yet determined. This second argument of the attribute is called the initial value.
The initial value also serves to be the result of the attribute for the case when there are no iterations performed (E.g. For the case when the attribute prefix denotes a null array).
In order for a 'Parallel_Reduce attribute reference to be legal, the combiner subprogram needs to have the Associative aspect. The Associative aspect is a Boolean aspect indicating that the subprogram can be applied to combine a set of values and provide equivalent results regardless of the grouping of the values, given that the order of the values is preserved. For instance, integer addition is considered to be an associative operation, since (A + B) + C = A + (B + C). The Associative aspect is more of a documentation artefact, since a compiler is unlikely to be able to determine if the result of an operation is in fact associative. We can define the aspect on language defined operations and libraries which can go a long way to catch errors. For user defined subprograms, we would require the programmer to apply the Associative aspect, which makes the programmer give consideration as to whether the operation is in fact associative or not.
There is a thread of control associated with each value of the prefix. Each thread of control (i.e. tasklet, see AI12-0119-1), needs to accumulate its partial result into an implicit accumulator variable. This implicit variable is of the type of the combiner subprogram parameters, and is initialized to have the first value to be processed by the tasklet, unless the tasklet is assigned to process the first object or element of the prefix, in which case, the implicit variable is initialized to the value of the Initial_Value of the attribute reference.
Once each tasklet has generated its local partial result, it can combine its result with other tasklets using the combiner subprogram. The multiple tasklet results are combined two at a time.
For parallel execution, the 'Parallel_Reduce attribute is expected to generate an associative result based on the values of the prefix, but otherwise is not restricted. The combiner for 'Reduce does not need to have the Associative aspect. The combiner subprogram can be non-commutative (e.g. vector concatenation), as it is expected that the implementation will ensure that the tasklet results are combined while preserving the original ordering of the set of input values of the attribute prefix as would be consistent with sequential execution.
!wording
Add after 3.6.2(10)
For a subtype S of a component of an array type A or element of a container type C, and a corresponding object O of the array or container type, which is denoted by the prefix of a Reduce or Parallel_Reduce attribute reference:
kA Combiner_Subprogram is a subprogram that accepts exactly two parameters and can be iteratively called to summarize the prefix object to produce a single result.
A Combiner_subprobram either denotes a function with the following specification:
function Combiner(L, R : S) return S;
or denotes a procedure with the following specification:
procedure Combiner(L : in out S; R : S);
[RLB: Examples have shown that an asymetric combiner is very useful. The second operand should be allowed to be any type, and that matches the element type of the iterator of the prefix type; the other operand specifies the type of the result of the attribute.]
O'Reduce(Combiner, Initial_Value)
For a prefix denoting an object of an array or container, O'Reduce is an expression that summarises the components or elements of O by applying the named Combiner_Subprogram and initial value to all the values associated with the attribute prefix.
[RLB: A "container" is not a technical term, and even if it was, it's the wrong thing. What we need here is a full object iterator, essentially an "of" form. The resolution requirement should probably be an object of any composite type, and then have a Legality Rule of the form that the attribute is illegal if the type of the prefix does not have an "of" iterator (need technical terms, of course). The element type involved comes from that iterator, so this has an effect on the resolution of the combiner.]
O'Parallel_Reduce(Combiner, Initial_Value)
For a prefix denoting an object of an array or container, O'Reduce is an expression that summarises the components or elements of O by applying the named Combiner_Subprogram and initial value to all the values associated with the attribute prefix, using parallel execution. The Combiner_Subprogram shall have the Associative aspect with the value True. See (9.12). The Combiner_Subprogram shall have the Non_Blocking aspect with the value True. See (9.5). The Combiner_Subprogram shall have the Global aspect and the Global aspect shall not identify any global variables that are not synchronized.
Name Resolution Rules
The expected type of a Reduce or Parallel_Reduce attribute reference is any nonlimited type.
AARM Note The prefix can be a (qualified) array or container aggregate. The prefix can denote an object of a multi-dimensional array.
Dynamic Semantics
For a Reduce attribute reference, the combiner subprogram denoted by the attribute reference is first called passing the Initial_Value of the attribute reference as the first parameter, and the first value associated with the attribute prefix as the second parameter. The Combiner_Subprogram is then called repeatedly for each subsequent value associated with the attribute prefix where the result output by the previous call is passed as the first parameter to the Combiner_Subprogram, and the subsequent value of the prefix is passed as the second parameter. This continues until all values of the prefix have been processed. The result of the last call to the Combiner_Subprogram is the result of the expression. If the prefix of the attribute reference denotes a null range of values, then the Initial_Value is the result of the expression.
[RLB: "repeatedly" doesn't explain what really happens. This needs to be worded in terms of the "of" iterator for the type of the prefix (especially important for the container case).]
For a Parallel_Reduce attribute reference, each component or element of the prefix is assigned to an executor. (See 9.12). The executor associated with the first value of the prefix calls the Combiner_Subprogram using the Initial_Value of the attribute reference as the first parameter of the call, and using the first value as the second parameter. Any other consecutive values assigned to the same executor involve repeated calls to the Combiner_Subprogram, where the result output by the previous call is passed as the first parameter to the Combiner_Subprogram, and the subsequent value of the prefix is passed as the second parameter. For executors assigned to other components or elements of the prefix, the result for that executor is initially assigned the value from the lowest indexed value assigned to that executor. Any other consecutive values assigned to the same executor, involve repeated calls to the Combiner_Subprogram, where the result output by the previous call is passed as the first parameter to the Combiner_Subprogram, and the subsequent value of the prefix is passed as the second parameter. The results from the executors are combined similarly, by calling the Combiner_Subprogram to combine the results from two Executors whose results corresond to consecutive ranges of value of the attribute prefix. This process repeats until all Executors have combined all intermediate results into a final value. This result is the result of the expression. If the prefix of the attribute reference denotes a null range of values, then the Initial_Value is the result of the expression.
AARM Note:
It is important to specify which argument of the reducer function is used as for feeding back prior results, because certain reductions, such as vector concatentation, can be non-commutative operations. In order to return a deterministic result for parallel execution, it is necessary to specify which parameter is used for feedback of prior results. When combining results from two tasklets, the results associated with the array components with the lower index value are passed as argument L of the reducer function. This is also necessary for non-commutative reductions to work properly.
Examples
Calculate the sum of elements of an array of integers
A'Reduce("+",0) -- See 4.3.3(43)
Determine if all elements in a two dimensional array of booleans are set to true
Grid'Reduce("and", true); -- See 3.6(30)
Calculate the minimum value of an array of integers in parallel
A'Parallel_Reduce(Integer'Min, Integer'Last)
Determine how many integers in an array are prime numbers
Prime_Count : contant Natural := (for P of A when Is_Prime(P) => 1)'Reduce("+",0);
An expression function that returns its result as a Reduction Expression
function Factorial(N : Natural) return Natural is
((for J in 1..N => J)'Reduce("*", 1));
An expression function that computes the Sin of X using Taylor expansion
function Sin(X : Float; Num_Terms : Positive := 5) return Float is
((for I in 1..Num_Terms => (-1.0)**I * X**(2*I-1)/Float(Fact(2*I-1)))
'Reduce("+", 0.0));
A reduction expression that outputs the sum of squares
Put_Line ("Sum of Squares is" &
Integer'Image((for I in 1 .. 10 => I**2)'Reduce("+", 0));
An expression function to compute the value of Pi
-- See 3.5.7(21) function Pi (Number_Of_Steps : Natural := 10_000) return Real is
(1.0 / Number_Of_Steps * (for I in 1 .. Number_Of_Steps => (4.0 / (1.0 + ((Real (I) - 0.5) * (1.0 / Number_Of_Steps))**2))) 'Reduce("+", 0.0));
AARM Implementation Advice
For a prefix of a Reduce or Parallel_Reduce attribute reference that is an aggregate, the creation of a temporary object is not necessary if each value of the aggregate object that would have been created is iteratively applied to the Combiner_Suprogram of the attribute reference.
[RLB: We need more than this; we also need a permission to interleave the evaluation of the aggregate choices and the reduction. That has to be normative as it overrides the canonical semantics. That probably ought to be a "Notwithstanding" rule.
Wording Changes from Ada 2012 Reduce attribute is new. Parallel_Reduce attribute is new.
Modify 4.5.1(3) with
function "and"(Left, Right : T) return T {with Associative} function "or" (Left, Right : T) return T {with Associative} function "xor"(Left, Right : T) return T {with Associative}"
Modify 4.5.3(2)
"function "+"(Left, Right : T) return T{ with Associative}
function "-"(Left, Right : T) return T
"
Modify 4.5.3(4)
"function "&"(Left : T; Right : T) return T {with Associative} function "&"(Left : T; Right : C) return T {with Associative} function "&"(Left : C; Right : T) return T {with Associative} function "&"(Left : C; Right : C) return T {with Associative}"
Modify 4.5.5(2) function "*" (Left, Right : T) return T {with Associative} function "/" (Left, Right : T) return T function "mod"(Left, Right : T) return T function "rem"(Left, Right : T) return T
Modify 4.5.5(12)
function "*"(Left, Right : T) return T {with Associative} function "/"(Left, Right : T) return T
Modify 4.5.5(14)
function "*"(Left : T; Right : Integer) return T {with Associative} function "*"(Left : Integer; Right : T) return T {with Associative} function "/"(Left : T; Right : Integer) return T
Modify 4.5.5(16-17)
function "*"(Left, Right : root_real) return root_real {with Associative} function "/"(Left, Right : root_real) return root_real
function "*"(Left : root_real; Right : root_integer) return root_real {with Associative} function "*"(Left : root_integer; Right : root_real) return root_real {with Associative} function "/"(Left : root_real; Right : root_integer) return root_real
Modify 4.5.5(19) function "*"(Left, Right : universal_fixed) return universal_fixed {with Associative} function "/"(Left, Right : universal_fixed) return universal_fixed
Add AARM To Be Honest after 4.5.5(19)
Addition and Multiplication operators for discrete and floating point types are not truly associative, even though the differences may not be important for typical use. Discrete additions might overflow when different groupings are applied, and floating point rounding errors can produce different results when different groupings are applied. We say these are associative though, because for typical use, the non-associativity is subtle enough to not be a concern.
Add to new section 9.12 (See AI12-0119-1)
[Reader Note: I somewhat modelled the Associative aspect after the nonblocking
aspect. There may be content that is not needed, but I figure it makes sense to start with something that already has pretty good wording, rather than try to invent new wording]
[RLB: There's much more wording here than needed, since Associative doesn't have any Legality Rules on the contents of the subprogram (it's more of an assertion). As such, most the generic rules are unneeded.]
For a noninstance subprogram, predefined operator, generic subprogram, formal subprogram, formal object of a named or anonymous access-to-subprogram type (including a formal type) the following language-defined operational aspect is defined.
AARM Ramification: "Noninstance subprogram" excludes a subprogram that is an instance of a generic subprogram. In that case, the aspect should be specified on the generic subprogram. If the Associative aspect needs to be added to an instance of a generic subprogram, it can be accomplished by creating a separate subprogram specification and then completing that specification with a renames-as-body of the instance.
Associative The type of aspect Associative is Boolean. This aspect specifies the associative restriction for the entity;
Aspect Description for Associative: Specifies that a subprogram or operator produces an associative result, that is, the operation produces the same semantic results when applied to a sequence of values, regardless how the values in the sequence are grouped, so long as the order of the sequence is maintained.
Note: For example, addition is associative since (A + B) + C = A + (B + C)
The Associative aspect may be specified for all entities for which it is defined. In particular, Associative may be specified for generic formal parameters.
Ramification: The Associative aspect cannot be specified for enumeration literals but we don't need to mention that above. One would have to declare a subprogram in order specify the aspect in that case, but that defines a user-defined subprogram which is itself not a an enumeration literal.
The subprogram or operator associated with the aspect shall have two or more parameters if the Associative aspect is True.
For a dereference of an access-to-subprogram type, the Associative aspect of the designated subprogram is that of the access-to-subprogram type.
[For a full type declaration that has a partial view, the aspect is the same as that of the partial view.]
AARM Proof: Type aspects are never view-specific; they always have the same value for all views. This is formally stated in 13.1.
For any inherited subprogram, it is asociative if and only if the corresponding subprogram of the parent is associative.
Overridings of dispatching operations do not inherit this aspect.
Unless directly specified, for a formal type, or formal subprogram, the aspect is that of the actual type, or subprogram.
AAARM Reason: This means that Associative legality checking for the actual parameters of the instance is only necessary when the aspect is explicitly specified for the formal type.
Unless directly specified, for a derived type the Associative aspect is False.
AARM Discussion: The expressions that can be specified for a derived type are limited by a Legality Rule, see below.
Legality Rules
If aspect Associative is specified for an entity that is not a generic unit or declared inside of a generic unit, the aspect_definition shall be a static expression.
If the prefix of an Associative attribute_reference denotes a generic unit G, the reference shall occur within the declarative region of G.
AARM Reason: We want the value of Associative attributes to be static so long as they occur outside of any generic unit. The Associative aspect of a generic unit will often depend on the actual parameters of the unit, so it cannot be static (or have any well-defined value). We need this latter rule in case the attribute of a generic is used outside of the generic. Note that the previous rule makes it illegal to use such an attribute to specify aspect Associative outside of a generic, but we don't want to allow any other uses since it does not have a known value until instantiated.
AARM Ramification: This rule does not apply to instances of generic units and entities declared within them.
In a generic instantiation (after replacement in the associative expressions by values of the actuals as described previously):
the actual subprogram corresponding to an associative formal subprogram shall be associative [(an actual that is an entry is not permitted in this case)];
the actual type corresponding to an associative formal access-to-subprogram type shall be associative;
the actual object corresponding to a formal object of an associative access-to-subprogram type shall be of an associative access-to-subprogram type;
In addition to the places where Legality Rules normally apply (see 12.3), the above rules also apply in the private part of an instance of a generic unit.
AARM Ramification: For a generic formal parameter to be associative (thus, for these rules to apply), it has to explicitly specify aspect Associative to be True. In particular, these rules do not apply when it specifies aspect Associative to be an expression involving attribute Associative of a generic formal parameter. However, in such a case, these rules do apply in the instance of the specification of the generic unit (the normal re-checking is needed). For instance, the body of an expression function might make a prohibited call.
A program unit P declared inside of a generic unit but not in a generic body or that is a generic specification not declared in a generic unit is considered associative for the purposes of checking the restrictions on an associative only if the value of its Associative aspect is statically True. For the purposes of checks in P, a call to a subprogram is considered associative unless the value of its Associative aspect is statically False.
AARM Reason: This is a typical “assume-the-best” rule. We only make checks if we know the associative status of both P and the called subprogram. All other checks will be performed when the generic unit is instantiated. We used the awkward “inside of a generic unit but not in a generic body” so that a generic specification declared inside of a generic body uses the following “assume-the-worst” rule.
A program unit P declared inside of a generic body or that is a generic body is considered associative for the purposes of checking the restrictions on an associative unit unless the value of its Associative aspect is statically False. For the purposes of checks in P, a call to a subprogram is considered to be non-associative unless:
the value of its Associative aspect is statically True, or
its associative expression (that is, Associative aspect) conforms exactly to that of P, or conforms to some part of the associative expression of P that is combined with the remainder of the associative expression of P by one or more and or and then operations.
AARM Ramification: That is, if the aspect of the program unit is specified (directly or via inheritance) with any non-static Associative aspects, it is considered to be an associative program unit for the purposes of making checks. This is a typical “assume-the-worst” rule.
Extensions to Ada 2012
Aspect Associative is new; it allows compile-time checks to assist in determining if 'Reduce attribute evaluations can be performed with parallel execution.
Modify A.1 (9.1)
-- function "and" (Left, Right : Boolean'Base) return Boolean'Base {with Associative}; -- function "or" (Left, Right : Boolean'Base) return Boolean'Base {with Associative}; -- function "xor" (Left, Right : Boolean'Base) return Boolean'Base {with Associative};
Modify A.1 (17)
-- function "+" (Left, Right : Integer'Base) return Integer'Base {with Associative}; -- function "-" (Left, Right : Integer'Base) return Integer'Base; -- function "*" (Left, Right : Integer'Base) return Integer'Base {with Associtive}; -- function "/" (Left, Right : Integer'Base) return Integer'Base;
Modify A.1 (25)
-- function "+" (Left, Right : Float) return Float {with Associative}; -- function "-" (Left, Right : Float) return Float; -- function "*" (Left, Right : Float) return Float {with Associative}; -- function "/" (Left, Right : Float) return Float;
Modify A.1 (29-34)
function "*" (Left : root_integer; Right : root_real)
return root_real {with Associative};
function "*" (Left : root_real; Right : root_integer)
return root_real {with Associative};
function "/" (Left : root_real; Right : root_integer)
return root_real;
-- The type universal_fixed is predefined. -- The only multiplying operators defined between -- fixed point types are
function "*" (Left : universal_fixed; Right : universal_fixed)
return universal_fixed {with Associative};
function "/" (Left : universal_fixed; Right : universal_fixed)
return universal_fixed;
Modify A.4.4 (13-17)
function Append (Left, Right : in Bounded_String; Drop : in Truncation := Error) return Bounded_String {with Associative};
function Append (Left : in Bounded_String; Right : in String; Drop : in Truncation := Error) return Bounded_String {with Associative};
function Append (Left : in String; Right : in Bounded_String; Drop : in Truncation := Error) return Bounded_String {with Associative};
function Append (Left : in Bounded_String; Right : in Character; Drop : in Truncation := Error) return Bounded_String {with Associative};
function Append (Left : in Character; Right : in Bounded_String; Drop : in Truncation := Error) return Bounded_String {with Associative};
Modify A.4.4 (21-26)
function "&" (Left, Right : in Bounded_String)
return Bounded_String {with Associative};
function "&" (Left : in Bounded_String; Right : in String)
return Bounded_String {with Associative};
function "&" (Left : in String; Right : in Bounded_String)
return Bounded_String {with Associative};
function "&" (Left : in Bounded_String; Right : in Character)
return Bounded_String {with Associative};
function "&" (Left : in Character; Right : in Bounded_String)
return Bounded_String {with Associative};
Modify A.4.5 (15-19)
function "&" (Left, Right : in Unbounded_String)
return Unbounded_String {with Associative};
function "&" (Left : in Unbounded_String; Right : in String)
return Unbounded_String {with Associative};
function "&" (Left : in String; Right : in Unbounded_String)
return Unbounded_String {with Associative};
function "&" (Left : in Unbounded_String; Right : in Character)
return Unbounded_String {with Associative};
function "&" (Left : in Character; Right : in Unbounded_String)
return Unbounded_String {with Associative};
Modify A.18.2 (15/2-18/.2)
function "&" (Left, Right : Vector) return Vector {with Associative};
function "&" (Left : Vector;
Right : Element_Type) return Vector {with Associative};
function "&" (Left : Element_Type;
Right : Vector) return Vector {with Associative};
function "&" (Left, Right : Element_Type) return Vector {with Associative};
Add after A.18.3 (32/2)
function "&" (Left, Right : List) return List {with Associative};
function "&" (Left : List;
Right : Element_Type) return List {with Associative};
function "&" (Left : Element_Type;
Right : List) return List {with Associative};
function "&" (Left, Right : Element_Type) return List {with Associative};
Modify A.18.7(53/2)
procedure Union (Target : in out Set; Source : in Set) {with Associative};
Modify A.18.7(55/2)
function Union (Left, Right : Set) return Set {with Associative};
Modify A.18.7(57/2)
procedure Intersection (Target : in out Set; Source : in Set) {with Associative};
Modify A.18.7(59/2)
function Intersection (Left, Right : Set) return Set {with Associative};
Modify A.18.7(67/2)
function Symmetric_Difference (Left, Right : Set) return Set {with Associative};
Modify A.18.8(26/2)
procedure Union (Target : in out Set; Source : in Set) {with Associative};
Modify A.18.8(27/2)
function Union (Left, Right : Set) return Set { with Associative};
Modify A.18.8(29/2-31/2)
procedure Intersection (Target : in out Set; Source : in Set) {with Associative};
function Intersection (Left, Right : Set) return Set {with Associative};
Modify A.18.8(35/2)
function Symmetric_Difference (Left, Right : Set) return Set {with Associative};
Modify A.18.9(27/2)
procedure Union (Target : in out Set; Source : in Set) {with Associative};
function Union (Left, Right : Set) return Set { with Associative};
procedure Intersection (Target : in out Set; Source : in Set) {with Associative};
function Intersection (Left, Right : Set) return Set {with Associative};
Modify A.18.9(37/2)
function Symmetric_Difference (Left, Right : Set) return Set {with Associative};
!discussion
A 'Reduce or 'Parallel_Reduce attribute reference provides a concise way to use iteration to combine a set of values into a single result.
We already have some special purpose reduction expressions in Ada in the form of quantified expressions and array aggregates of the form using an iterated_component_association.
Consider:
All_Graduated : constant Boolean := (for all Student of Class => Passed_Exam (Student));
Someone_Graduated : constant Boolean := (for some Student of Class => Passed_Exam (Student));
Here we are in both cases effectively reducing a set of Boolean values into a single result using quantified expressions.
A similar effect can be written using a more generalized 'Reduce attribute expression, in combination with an array aggregate.
All_Graduated : constant Boolean := (for Student of Class => Passed_Exam (Student))'Reduce("And", True);
Someone_Graduated : constant Boolean := (for Student of Class => Passed_Exam (Student))'Reduce("Or", False);
Here we use "and" and "or" as the combiner_subprogram of the Reduction expression. The initial value plays an important role, since the Identity for "and" must be true, and false for "or".
We considered whether to allow reductions for combiner_functions where the two parameters are not of the same type. For example, one might want to have a combiner that appends characters to an array of characters. It was decided that this would be left out for now. We could decide to add support for it later, but doing so would add a fair bit more complexity. For instance, it would then be necessary to define an Identity_Value aspect, because the compiler would not know how to initialize local accumulators for different tasklets. We currently only allow combiners where both parameters are of the same type. This allows us to use the first value associated each tasklet as the initial value, which works because the result and inputs are of the same type. If the types were different, we'd also need to define a Reducer aspect that names another subprogram that can be used to combine results from different tasklets. Such a Reducer subprogram would necessarily have the same types for both input parameters. Since we are currently restricting combiner_functions to have that form, there is no need to define a Reducer aspect.
Such reducers tend to be concatentation reductions, which Ada already has concise syntax for, in the form of array aggregate notation, and possibly container aggregate as they are currently being proposed. There otherwise seems to be little need for such capability.
We also considered whether the initial_value was needed or not. We could alternatively overload the attribute names to provide versions that do not accept a second argument.
This however, means that there can be a question about what the expression should evaluate to, if the prefix of the attribute specifies a null range of values. Likely, an exception would need to be raised, but that can add complexity for uses such as contracts, since handling the exception case would need to be considered. We decided that for now at least, it is better to keep the interface simple. If an exception is needed for a null range, there are other ways to do that already in the language.
Another concern might be whether parallel loops can generally be written as expressions, and still have a similar performance benefit.
Consider the calculation of Pi:
function Pi (Number_Of_Steps : Natural := 100_000) return Long_Float is (1.0 / Number_Of_Steps * (parallel for I in 1 .. Number_Of_Steps => (4.0 / (1.0 + ((Long_Float (I) - 0.5) * 1.0 / Number_Of_Steps)**2))) 'Reduce("+", 0.0));
One might feel that the readability of this example might be improved if temporary variable declarations were to be added for intermediate results. For instanced, it might be nice to replace the sub-expression;
(1 + ((Long_Float (I) - 0.5) * Step)**2)
with
(1.0 + X**2)
and
1.0 / Number_Of_Steps
with
Step
One ordinarily cannot declare variables in an expression, but one can always write a function which can be called from an expression.
We might instead decide to write:
function Pi (Number_Of_Steps : Natural := 100_000) return Long_Float is Step : constant Long_Float := 1.0 / Number_Of_Steps;
function Pi_Slicer (Index : Natural) return Float is X : constant Long_Float := (Long_Float (Index) - 0.5) * Step; begin return 4.0 / (1.0 + X ** 2); end Pi_Slicer;
Temp : constant Long_Float := (for I in 1 .. Number_Of_Steps => Pi_Slicer (I))'Parallel_Reduce("+", 0.0); begin return Step * Temp; end Pi;
Which leaves a simpler looking Reduction Expression to do computation.
For performance testing, two versions of this code were created using the Paraffin libraries, and tested using GNAT on a quadcore Linux machine.
One version of the code has the parallelism code written inline as in the first version above, and the second calls a local subprogram idential to the Pi_Slicer routine above.
For 10_000_000 steps of iteration, the following results were collected with optimization turned off.
Sequential Pi Elapsed = 00:00:00.65 Pi = 3.141592653590E+00
Work Seeking Pi Elapsed = 00:00:00.16 Pi = 3.141592653590E+00
'Parallel_Reduce Pi Elapsed = 00:00:00.19 Pi = 3.141592653590E+00
Both versons of the code produced comparable results compared to the sequential version of the code.
After turning on full optimization, and rerunning the same code, the following results were collected.
Sequential Pi Elapsed = 00:00:00.57 Pi = 3.141592653590E+00
Work Seeking Pi Elapsed = 00:00:00.15 Pi = 3.141592653590E+00
'Parallel_Reduce Pi Elapsed = 00:00:00.14 Pi = 3.141592653590E+00
In this case, the results were even more comparable, and in fact the better time was found with the version of code that had the extra function call.
Another concern might be about the need to perform multiple reductions at the same time.
Consider:
Sum : Integer := A'Reduce("+", 0): Min : Integer := A'Reduce(Integer'Min, Integer'Last); Max : Integer := A'Reduce(Integer'Max, Integer'First);
Here we have three calculations that might ideally occur in parallel, but here would normally be expected to execute sequentially with respect to each other.
One might want to calculate these three results iterating only once through the array.
This can be accomplished by creating a composite result type, and writing a user defined reducer function.
type Summary is record Sum : Integer; Min : Integer; Max : Integer; end record;
-- Identity value for the Reduce function Identity : constant Summary := Summary'(Sum => 0, Min => Integer'Last, Max => Integer'First);
-- Reducer function for the Reduction expression function Reduce (L, R : Summary) return Summary with Associative is (Summary'(Sum => L + R, Min => Integer'Min (L, R), Max => Integer'Max (L, R)));
-- Reduction expression to compute all 3 results at once Result : constant Summary := A'Reduce(Reduce, Identity);
It was also noted here during performance testing that better times were collected by actually writting 3 separate reductions, however in other more complicated examples, it was found that writing a composite object as above provided better performance.
!ASIS
** TBD.
!ACATS test
ACATS B-Test and C-Test are needed for these features.
!appendix

From: Brad Moore
Sent: Thursday, December 28, 2017 3:05 PM

Here is another chunk of my homework.

This is the new AI on Reduction Expressions which was split off from
AI12-0119-1. [This is version /01 of the AI. For previous discussion, see the
!appendix of AI12-0119-1, and versions /01 through /04 of that AI. - ED.]

I don't yet have a number for this AI, I presume Randy will work his magic and
assign a number.

The main differences in this version are that the reduction parameter (the
expression enclosed by angle brackets) is now mandatory. The parallel keyword
has been removed, as it is assumed the implementation will make the decision
whether to execute in parallel or sequentially. The Identity and Reducer aspects
are not mandatory for a reduction expression. If they are missing, the
implementation will choose sequential execution.

****************************************************************

From: Randy Brukardt
Sent: Friday, December 29, 2017  1:07 AM

...
> The main differences in this version are that the reduction parameter
> (the expression enclosed by angle brackets) is now mandatory.

Haven't we yet come up with acceptable syntax for these things? Angle brackets
are not an acceptable syntax for a language like Ada. Even if they can be
unambiguously parsed (which I seriously doubt), they would be very confusing for
readers. (And especially for syntax error correctors.)

I'm interested in the proposed feature, but it has to have an acceptable syntax.
And to this point, it doesn't. If the syntax does not become acceptable soon,
then I will start having to oppose the entire feature (we're better off without
a feature than permanently damaging the Ada syntax).

****************************************************************

From: Brad Moore
Sent: Friday, December 29, 2017 12:17 PM

The last time we discussed this was in Vienna.

At that time, we had narrowed it down to the set of choices;

(1)  <<  >> <<>>
(2)  <  >   <>
(3)  [  ]   []
(4)  {  }   {}

The results of the straw poll were:

Straw poll: Love-Hate-Neutral on the options given above
(1) 7-2-2
(2) 6-2-3
(3) 0-8-3
(4) 4-6-1

The straw poll favored 2 slightly over 1. (6 votes vs 5) Options 3 and 4 were
ruled out.

Though the meeting minutes say a choice was not made, it seemed clear that there
was a lot of love for angle brackets, since the two choices that survived both
had them.

I left it as single angle brackets, as it had the slight edge over double angle
brackets. I presumed people would otherwise be happy enough with either of those
two choices.

Presumably you were one of the 2 haters.

One thing I did in this version of the AI, that I now realize I need to undo is
that I changed the reduction_parameter syntax from a simple_expression to an
expression, as I was worried that it would disallow boolean expressions. I see
now from the Vienna meeting minutes that it was a simple expression, because
then it would force boolean expressions to be parenthesized, which eliminates
some ambiguity.

   eg < a > b >

would need to be:

  < (a > b) >

Does that help? Maybe we should say that it is a primary instead of a simple
expression, then in most cases it would need to be parenthesized.

Otherwise, does anyone have any better suggestions?

Any favourites out of these?

1. Sum := (for I of A => <0> + I);
2. Sum := (for I of A => use(0) + I);
3. Sum := (for I of A => with(0) + I);
4. Sum := (for I of A => $0 + I);
5. Sum := (for I of A => %0 + I);
6. Sum := (for I of A => :0: + I);
7. Sum := (for I of A => 0? + I);

Of these, I think I like 4 (followed by 5) as good if not better than 1.
$name is commonly used in scripting languages to indicate a parameter
substitution, so it might strike familiarity with a lot of users.

I tried searching for $ in the RM, I didn't find anything. Maybe adding $ to the
RM will increase Ada's fortunes. ;-)

****************************************************************

From: Edmond Schonberg
Sent: Friday, December 29, 2017 12:35 PM

> Haven't we yet come up with acceptable syntax for these things? Angle
> brackets are not an acceptable syntax for a language like Ada. Even if
> they can be unambiguously parsed (which I seriously doubt), they would
> be very confusing for readers. (And especially for syntax error
> correctors.)
>
> I'm interested in the proposed feature, but it has to have an
> acceptable syntax. And to this point, it doesn't. If the syntax does
> not become acceptable soon, then I will start having to oppose the
> entire feature (we're better off without a feature than permanently
> damaging the Ada syntax).

This seems like an aesthetic argument: why is <init> unacceptable for Ada? There
is certainly no parsing problems with this syntax ( at least none in the
prototype developed in GNAT). The proposed syntax is compact and readable, so I
don’t understand your objection to the current proposal.

(on the other hand, its usage to build aggregates instead of
iterated_component_associations it harder to implement efficiently, and it seems
like a stretch to say that a Reduction produces a “bigger" value than its
operands).

Concerning the text of the AI, I think it would be simpler if the motivation for
it were not related to potential parallelism. A reduction operation is
interesting in its own right, and implementation in terms of (and interaction
with) tasklets should be presented elsewhere, in connection with other
parallelizable constructs.

****************************************************************

From: Jeff Cousins
Sent: Friday, December 29, 2017 12:38 PM

Tuck previously said that Ed had implemented a form of reduction expressions -
what syntax does that use?

****************************************************************

From: Brad Moore
Sent: Friday, December 29, 2017 1:38 PM

> Concerning the text of the AI, I think it would be simpler if the
> motivation for it were not related to potential parallelism. A
> reduction operation is interesting in its own right, and
> implementation in terms of (and interaction with)  tasklets should be
> presented elsewhere, in connection with other parallelizable
> constructs.

I agree that parallelism is only a subset of the potential usage. I had that in
mind when writing the motivation text, but could have gone further to
deemphasize the parallelism angle, even though parallel reduction was the
problem we were trying to solve when this solution presented itself. I think it
is worth at least noting though that the syntax can be parallelized, as that is
a common, if not one of the main problems that other parallization languages,
e.g. OpenMP and Cilk, deal with. And that is intended to be the Ada solution to
that problem, even if implementations choose to not support it.

But since the parallelism is implicit, and implementation advice, and that there
can be a lot of uses where parallelism isn't even needed, the motivation should
focus more on the non-parallelism benefits.

****************************************************************

From: Tucker Taft
Sent: Friday, December 29, 2017 6:04 PM

> Haven't we yet come up with acceptable syntax for these things? Angle
> brackets are not an acceptable syntax for a language like Ada. Even if
> they can be unambiguously parsed (which I seriously doubt), they would
> be very confusing for readers. (And especially for syntax error
> correctors.)
>
> I'm interested in the proposed feature, but it has to have an
> acceptable syntax. And to this point, it doesn't. If the syntax does
> not become acceptable soon, then I will start having to oppose the
> entire feature (we're better off without a feature than permanently
> damaging the Ada syntax).

So long as the angle brackets are never empty, they are distinct from any other
use of angle brackets, and since they are effectively in the "unary" context
rather than the "binary" context, they are easy to distinguish from
less-than/greater-than.

Ed Schonberg implemented these with no problem, and so we can review a bunch of
examples that work to see what we think.

****************************************************************

From: Randy Brukardt
Sent: Friday, December 29, 2017  6:36 PM

> So long as the angle brackets are never empty, they are distinct from
> any other use of angle brackets, and since they are effectively in the
> "unary" context rather than the "binary" context, they are easy to
> distinguish from less-than/greater-than.

Perhaps in your world of "easy", but I'm dubious that it is going to work well
in the context of a table-driven parser. And particularly in terms of error
correction: the handling of binary operators and of bracketing tokens
(particularly closing brackets) is wildly different, and the context of an
expression is the same for both.

> Ed Schonberg implemented these with no problem, and so we can review a
> bunch of examples that work to see what we think.

Having looked at Brad's suggestions, I am beginning to realize that my objection
is really to the actual concept and not so much toward are particular syntax.
Whatever syntax is adopted, it will only be usable in this one particular case
(initialization of reduction expressions), since it appears in the middle of an
unrestricted expression. Much like we couldn't reuse <> for assignment because
it was already used, the same thing will occur for whatever syntax we adopt
here. (While one probably won't see reduction expressions as the default
parameter of a subprogram or the like, the possibility will exist.) As such,
this elevates this syntax all out of proportion to its actual use (fairly rare),
as it affects every expression that could possibly be written.

Unless you think 20% of new Ada programs will be reduction expressions (I don't
:-), it would be better to find a syntax that doesn't require inserting some new
thingy into every kind of expression that could ever occur. And if there is no
sane way to avoid that, then we at least need to avoid using a syntax that looks
to have a general use for a very specialized one-time usage.

For instance, I'd rather use <> or @@ to mark the location, and put the actual
expression somewhere else. (But I'd rather avoid having to mark the location at
all.)

I don't want to see angle brackets in Ada at all, but if we were going to have
them, I'd rather see them used in a general feature (say, some sort of pattern
matching) than a one-time use in a rare kind of expression. Let's find some
alternative to messing around with expressions. Please.

****************************************************************

From: Brad Moore
Sent: Sunday, December 31, 2017 3:14 PM

> For instance, I'd rather use <> or @@ to mark the location, and put
> the actual expression somewhere else. (But I'd rather avoid having to
> mark the location at all.)

This is actually what I was originally thinking of a while back, where the <>
would always be used, and initialized by the Identity_Value aspect associated
with the reduction expression.

I never quite saw why we were so insistent on having the initial value inside
the angle brackets, because if one wants to use a value other than the
Identify_Value, that can easily enough be added to the expression outside the
reduction expression.

eg.

   Sum :=  5 + (for I in 1 .. 10 => <> + I);

Seems as easy to write as;

   Sum := (for I in 1 .. 10 => <5> + I);

And maybe more easier to understand for the reader as well.

****************************************************************

From: Bob Duff
Sent: Sunday, December 31, 2017 4:44 PM

> And maybe more easier to understand for the reader as well.

I tend to agree.

****************************************************************

From: Tucker Taft
Sent: Sunday, December 31, 2017 5:48 PM

> Tuck previously said that Ed had implemented a form of reduction expressions - what syntax does that use?

Single angle brackets.

****************************************************************

From: Brad Moore
Sent: Monday, January 1, 2018 12:10 PM

> Seems as easy to write as;
>
>   Sum := (for I in 1 .. 10 => <5> + I);
>
> And maybe more easier to understand for the reader as well.

I recall now the reason we went the way we did was because we were looking to
disambiguate the syntax more, as we were worried about confusion with the new
array aggregate syntax. (See 4.3.3 (5.1/5))

However, in my view, I see aggregate syntax as just another specialised form of
reduction expression, just as quantified expressions can be viewed as a
specialised form of reduction expression.

For array aggregate syntax, the combiner_function can be considered to be
concatenation.

for example the array aggregate;

A : constant Integer_Array := (for I in 1 .. 10 => I);

can be viewed as a shorthand for the reduction expression;

A : constant Integer_Array := (for I in 1 .. 10 => <> & I);

The concatenation can be optimized by the implementation to be an 'in-place'
form of concatenation, where each element as it is composed is stored in the
result, rather than building a temporary array and assigning that temporary to
the result.

The confusion might come when one tries to mix aggregates with box notation into
this, but I think aggregate syntax is the disambiguator. In the AI there is a
rule that says that only one reduction parameter (i.e. box) can appear in a
reduction expression, but we could clarify that a box that appears inside
something like a record_component_association for a record aggregate, is not a
reduction parameter.

eg.

B : constant Record_Array := (for I in 1 .. 10 => <> & My_Record'(X => I, Y => <>));

The first Box is a reduction parameter, the second one isn't, as it is a
record_component_association.

One would ordinarily write this using the Array aggregate shorthand anyways,

B : constant Record_Array := (for I in 1 .. 10 => My_Record'(X => I, Y => <>));


So I still think that using <> syntax for the reduction parameter would work.

****************************************************************

From: Edmond Schonberg
Sent: Tuesday, January 2, 2018  8:05 AM

...
> One would ordinarily write this using the Array aggregate shorthand
> anyways,
>
> B : constant Record_Array := (for I in 1 .. 10 => My_Record'(X => I, Y
> => <>));
>
>
> So I still think that using <> syntax for the reduction parameter would work.

I agree, this works, and reuses the meaning of the box in a non-confusing way. I
also agree that being able to specify a different default in the expression is
very little gain for new syntax.  Using the box does require a new aspect for
functions and operators however.

Tucker (private communication :-)) indicated that at the previous ARG meeting
the <init> form was also allowed to appear WITHIN any of the arguments of the
combiner function, which cannot be done with the box alone.  The rule you
propose would prevent the nested usage, which is fine by me!

****************************************************************

From: Jeff Cousins
Sent: Tuesday, January 2, 2018  10:26 AM

Just some typos (and a query):

!proposal 2nd para
; -> :
Easy -> easy

Add AARM note:
eg. -> e.g.

Static Semantics 2nd para
nees -> needs

Dynamic Semantics 2nd para
substitued -> substituted

Examples – what’s the name of this formula for Pi?

Add after A.18.3 (32/2)
As this is an add not a modify, {} not needed.

!discussion 12th (?) para
instanced -> instance
; -> :
1 -> 1.0

****************************************************************

From: Tucker Taft
Sent: Tuesday, January 2, 2018  12:27 PM

>> So I still think that using <> syntax for the reduction parameter would work.
>
> I agree, this works, and reuses the meaning of the box in a
> non-confusing way. I also agree that being able to specify a different
> default in the expression is very little gain for new syntax.  Using the box does require a new aspect for functions and operators however.

And this presumes that you always start with the same initial value.  For string
concatenation, for example, it is quite reasonable to have something other than
"" as the initial string.  I have also had examples for counting, max/min, etc.,
where the initial value varies according to context.  Finally you are separating
the specification of the initial value from the point of usage, which could make
it that much harder to understand the code.

> Tucker (private communication :-)) indicated that at the previous ARG
> meeting the <init> form was also allowed to appear WITHIN any of the
> arguments of the combiner function, which cannot be done with the box alone.  The rule you propose would prevent the nested usage, which is fine by me!

I don't think we should start by using a syntax that implies unnecessary
restrictions.  We could impose arbitrary restrictions that we might later relax,
but if the syntax makes that impossible, that is very unwise in my view.

****************************************************************

From: Brad Moore
Sent: Wednesday, January 3, 2018 12:02 AM

> And this presumes that you always start with the same initial value.
> For string concatenation, for example, it is quite reasonable to have
> something other than "" as the initial string.  I have also had
> examples for counting, max/min, etc., where the initial value varies
> according to context.  Finally you are separating the specification of
> the initial value from the point of usage, which could make it that much harder to understand the code.

I'm afraid I am not understanding this comment.

For string concatenation, why not use whichever initial string you want outside
of the expression?

eg.

   Message : constant String := "Alphabet:" & (for Letter in 'A' .. 'Z' => <> & Letter);

Which I think would be easier to understand than

   Message : constant String := (for Letter in 'A' .. 'Z' => <"Alphabet:"> & Letter);

for those not familiar with the syntax. Its clearer in the first case that
"Alphabet:" will be the first part of the result.

Similarly,

   Sum : constant Integer := 5 + (for I in 1 .. 10 => <> + A(I));

Seems easier to read and understand to me than;

   Sum : constant Integer := (for I in 1 .. 10 => <5> + A(I));


   Sales_Price : constant Integer := Integer'Max(100, (for Bid of Offers => Integer'Max(<>, Bid)));

   Sales_Price : constant Integer := (for Bid of Offers => Integer'Max(<100>, Bid));

  Here I think one could argue either way, but maybe the second is preferable
  since it is shorter.

Maybe we need to see more examples, but of the examples we have currently, most
(in fact I think all) dont even have an initial value, other than the
Identity_Value.

For something that appears as though it would be uncommonly used in a majority
of cases, I wonder if is it worth adding special syntax when existing familiar
syntax can be used to express the same thing?

>> Tucker (private communication :-)) indicated that at the previous ARG
>> meeting the <init> form was also allowed to appear WITHIN any of the
>> arguments of the combiner function, which cannot be done with the box
>> alone.  The rule you propose would prevent the nested usage, which is
>> fine by me!

I am not sure I am understanding this limitation. Could someone (Tucker, Ed)
show an example of what the previous syntax would allow, but the current one
wouldn't?

It would also help if this example were a practical one.

****************************************************************

From: Tucker Taft
Sent: Wednesday, January 3, 2018  9:17 AM

I really thinks it creates an unnecessary burden to require that every operation
you want to use as the "primary" operation of the reduction expression has to
have an appropriate aspect specified.  Suppose you find this nice package that
has the operation you want to use?  Do you have to create a private copy to
specify the aspect.  Furthermore, there are operations where there may not be
any obvious "identity," and so the reader will have to  guess what it might be.
Putting it explicitly in the reduction expression reduces the cognitive overhead
of using the feature.

...
>> And this presumes that you always start with the same initial value.
>> For string concatenation, for example, it is quite reasonable to have
>> something other than "" as the initial string.  I have also had
>> examples for counting, max/min, etc., where the initial value varies
>> according to context.  Finally you are separating the specification
>> of the initial value from the point of usage, which could make it that much harder to understand the code.
>
> I'm afraid I am not understanding this comment.
>
> For string concatenation, why not use whichever initial string you want
> outside of the expression?

Seems like an unnecessary extra concatenation.  Not such a big burden for string
concatenation, but could be expensive in other cases, such as, say, merging a
series of maps.

> eg.
>
>   Message : constant String := "Alphabet:" & (for Letter in 'A' .. 'Z'
> => <> & Letter);
>
> Which I think would be easier to understand than
>
>   Message : constant String := (for Letter in 'A' .. 'Z' =>
> <"Alphabet:"> & Letter);
>
> for those not familiar with the syntax ...

If someone is not familiar with the syntax, I think it would be at least as hard
to guess what "<> & blah" means.

To me, it is more reasonable to expect that the reader is familiar with the
general syntax than with all of the details of the particular operation being
used to do the reduction.  That is a reason why having to go to the declaration
of the operation to learn what is the initial value seems unwise.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, January 3, 2018  7:43 PM

>  Putting it explicitly in the reduction expression reduces the
> cognitive overhead of using the feature.

It's precisely that cognitive overhead that worries me. This expression and its
special syntax can appear absolutely anywhere, associated with any kind of
expression. But it is likely to be very rare in Ada code -- and there is nothing
about the expression or its syntax that intuitively explains what it does. (I
still barely understand that.) Outside of parallel execution, I don't see much
to recommend anyone writing in this style, yet we're for some reason encouraging
everyone to do so even for things that can't reasonably be parallelized
(built-in-place function calls come to mind).

Perhaps I'm reaching the too-fossilized-to-be-open-to-new-things stage, but I
don't want to see these things in any code that I have to read or maintain.
Which makes me wonder why we want them in any Ada code (I doubt very much that
I'm alone in this feeling).

****************************************************************

From: Tucker Taft
Sent: Wednesday, January 3, 2018  8:48 PM

My experience in a language that supports this is that it is used quite
frequently.  Of course it will take some time to get used to any new notation.
The ultimate goal would be that once you are used to it, you should be able to
see what you need to see without having to jump back and forth from the code to
various aspect specifications.

Below are various examples from some "real" programs written in ParaSail (e.g. a
compiler to LLVM and a static analysis tool).  I prefaced each example with a
block of comments to try to explain what it is doing.

--------------

--  Count the number of formal-params that are output parameters
-- The "{...}" is a filter that controls whether any action is taken
-- for a particular item in the container
--  (equivalent to the "when X =>")
const Num_Outputs :=
                      (for each P of Formal_Params
                        {P.Is_Operation_Output} => <0> + 1)


--  Create a bracketed list of the elements of DF.
--  Note that ParaSail allows two iterators that iterate together,
--  stopping when either reaches its end.
--  So in the case, we are iterating over the elements of "DF" while
--  also doing a simple "X := Y then Z"
--  iterator which sets X to Y initially and then sets it to Z on each
--  subsequent iteration.
const DF_Image :=
  (for (F in DF; Sep := "" then ", ") => <"["> | Sep | F) | "]"


--  This produces a string representation of a sequence of ranges
--  separated by spaces
return (for R in Rngs forward =>
   <"Countable::["> | R.First | ".." | R.Last | " ") | "]"


--  This prints out a bracketed, comma-separated string representation
--  of the inputs
--  It again uses two iterators being iterated together
Println((for (each Input of Inputs forward;
                    Sep := "" then ", ") =>
                 <"["> | Sep | "VN" | Input) | "]")


--  This counts the number of parameters that are outputs or variables
--  with the filter in "{...}" again
(for each P of Params
 {P.Is_Operation_Output or else P.Is_Var} => <0> + 1)


--  This determine the minimum index of a set of parameters that pass
--  the filter (i.e. are either outputs or variables).
--  Rather than using the most positive and most negative values as the
--  "identity" for Min and Max, in Parasail, null is used, meaning both
--  Min(null, X) = X and Max(null, X) = X
const First_Updated :=
  (for each [I => P] of Params
    {P.Is_Operation_Output or else P.Is_Var} => Min(<null>, I))


-- Count the number of non-inputs
const Num_Inputs :=
  (for each P of Params {not P.Is_Operation_Output} => <0> + 1)


--  Print the list of VNs in VNs_To_Prop
Println(
  (for VN in VNs_To_Prop forward => <"Recursive_Top_Down"> | " VN" | VN))


--  Create a set of the parent value numbers (VNs) of the VNs in
--  Changed_Operands
const Parents : VN_Set :=
   (for Opnd in Changed_Operands =>
     <[]> | Op_Ctx.VN_To_Parent_Set_Map[Opnd])


--  Print a list of [VNx => VNy] pairs representing the live-out set
Println(
  (for each [Loc => VN] of Live_Out_Set =>
    <" Live_Out_Set ="> | " [VN" | Loc | "=>VN" | VN | "]"))


--  Count the number of nodes that are found by following the
--  Parent chain.
--  This uses the general "X := Y then Z while W" iterator which
--  initializes X to Y, and then sets it to Z on each subsequent
--  iteration, as long as "W" remains true.
return (for N := Node_Id then Node_Tree.Parent[N]
           while N in Node_Tree.Parent => <0> + 1)


--  "{...}" by itself is an assertion
--  This asserts that the sum of the sizes in Bit_Field_Sizes should be <= 63
{(for each Size of Bit_Field_Sizes => <0> + |Size|) <= 63}


--  Build up an array mapping from Bit-Field Key to the
--  sum of the sizes of all bit-fields preceding this bitfield
const Bit_Field_Offsets : Array<Univ_Integer, Indexed_By => Key_Type> :=
  [for each [Key => Size] of Bit_Field_Sizes, Key =>
     (for K in Key_Type::First() .. Key-1 => <0> + Bit_Field_Sizes[K])]


--  Dump the sizes of the bit fields
Println(
  (for (each [Key => Size] of Bit_Field_Sizes; Sep := "" then ", ")
    forward => <"Bit_Field_Sizes:["> | Sep | Key | " => " | Size) | "]")


--  Hash the characters of a string to produce an integer
func Hash_Vec(Vec : Vector<Univ_Character>) -> Univ_Integer is
 return (for I in 1 .. |Vec| reverse =>
   (<0> * 127 + (Vec[I] - Char_First)) mod Hash_Modulus)
end func Hash_Vec

****************************************************************

From: Randy Brukardt
Sent: Wednesday, January 3, 2018  10:54 PM

> Below are various examples from some "real" programs written in
> ParaSail (e.g. a compiler to LLVM and a static analysis tool).  I
> prefaced each example with a block of comments to try to explain what
> it is doing.

Having looked over the examples, I'm now even more certain that I don't want
this notation in the language. Ada is not a functional language, but this
feature can only be used functionally. Ada expressions are too large as it is,
and this feature seems mainly about writing large, clever expressions that you
need to analyze for hours to understand. Bah-humbug. (Staying in the season. ;-)

We've probably gone too far already in allowing stuff in expressions, and if we
really need to do this to support parallelism in Ada, then it's time to admit
that Ada is the wrong foundation and start over with a functional design.
Grafting on some functional features because some people find them clever is the
wrong direction.

Anyway, let's look at one of these examples in detail:

> --  Count the number of formal-params that are output parameters
> -- The "{...}" is a filter that controls whether any action is taken
> -- for a particular item in the container
> --  (equivalent to the "when X =>")
> const Num_Outputs :=
>                       (for each P of Formal_Params
>                         {P.Is_Operation_Output} => <0> + 1)

So, how would one do this in Ada 2012? Probably I would write (assuming that
Formal_Params has an appropriate container iterator):

    declare
       Num_Outputs : Natural := 0;
    begin
       for P of Formal_Params loop
           if P.Is_Operation_Output then
               Num_Outputs := Num_Outputs + 1;
           end if;
       end loop;
       ... -- Use Num_Outputs.
    end;

The only real loss here is the inability to use "constant". That's hardly worth
a new construct. I write this sort of code all of the time, and probably so does
every other Ada programmer. This takes more lines, but that's mainly because of
the conventional Ada formatting. One could get rid of two lines with a trick
(which I don't recommend, BTW):

      Num_Outputs := @ + Boolean'Pos (P.Is_Operation_Output);

or a junky conditional expression:

      Num_Outputs := @ + (if P.Is_Operation_Output then 1 else 0);

It strikes me that the right way to do this in Ada if one is insisting on an
expression format is to directly write it (using the proposed block expression
and an expression assignment):

     Num_Outputs : constant Natural :=
        (declare
            Accum : Natural := 0;
         begin
            (for P of Formal_Params => (if P.Is_Operation_Output then Accum := @ + 1; Accum else Accum)));

The main problem with this is that the quantified expression tends to muddle the
readability of the loop portion (also a problem with your proposal) -- as they
are virtually identical but have rather different semantics.

In any case, the more I see of this the less I like it.

****************************************************************

From: Brad Moore
Sent: Thursday, January 4, 2018  1:41 AM

I have an idea that gets rid of the <> entirely, and also gets rid of the need
to specify aspects. I think this also reads better and quite a bit more
understandable. Basically, the idea is to move both the combiner function and
the initial value out of the expression, so the expression just provides a list
of values.

e.g.
Instead of:

   Sum := (for I in 1 .. 10 => <5> + I);


You'd write:

   Sum := (5 + (for I in 1 .. 10 => I));


The expression function generates a list of values, and the combiner function
"+" applies all these values to the other argument which is the initial value.
You can basically read this as we are adding a bunch of values to 5.

You would need two sets of parens normally, one set around the for loop
generator, and another around the whole expression including the combiner and
initial value.

I have applied this syntax idea to all the example in the AI, plus most of the
examples that Tuck provided, after converting them to Ada from Parasail, and I
find every single one of them easier to read and understand using this syntax.

I feel the level of cognitive processing is significantly lower. When I compare
to back and forth between the two syntaxes for any of these examples, I feel the
"head explode" needle jump when I look at the ones with <> syntax. Not enough to
cause an explosion, thank goodness, but I do feel the needle jump, when I turn
the sensitivity knob up....

Here are the examples.

Sum : constant Integer := (for all I in 1 .. 10 => <0> + I);

Sum : constant Integer := (0 + (for I in 1 .. 10 => I));

Sales_Price : constant Integer := (for I in I .. 10 => Integer'Min(<5>, I)); Sales_Price : constant Integer := Integer'Min(5, (for I in I .. 10 => I));

Message : constant String :=  (for Letter in 'A' .. 'Z' => <"Alphabet: "> & Letter) Message : constant String := ("Alphabet: " & (for Letter in 'A' .. 'Z' => Letter));

Four_Or_More : constant Natural := (for P of A => <0> + (if P > 3 then 1 else 0)); Four_Or_More : constant Natural := (0 + (for P of A => (if P > 3 then 1 else 0)));

function Factorial(N : Natural) return Natural is (for J in 1..N => <1> * J); function Factorial(N : Natural) return Natural is (1 * (for J in 1..N => J));

function Sin(X : Float; Num_Terms : Positive := 5) return Float is
  (for I in 1..Num_Terms => <0.0> + (-1.0)**I * X**(2*I-1)/Float(Fact(2*I-1)));

function Sin(X : Float; Num_Terms : Positive := 5) return Float is
  (0.0 + (for I in 1..Num_Terms => (-1.0)**I * X**(2*I-1)/Float(Fact(2*I-1))));

Put_Line ("Sum of Squares is" & Integer'Image(for I in 1 .. 10 => <0> + I**2)); Put_Line ("Sum of Squares is" & Integer'Image(0 + (for I in 1 .. 10 => I**2)));

function Pi (Number_Of_Steps : Natural := 10_000) return Real is
   (1.0 / Number_Of_Steps *
     (for I in 1 .. Number_Of_Steps =>
        <0.0> + (4.0 / (1.0 + ((Real (I) - 0.5) * (1.0 / Number_Of_Steps))**2))));

function Pi (Number_Of_Steps : Natural := 10_000) return Real is
   (1.0 / Number_Of_Steps *
     (0.0 + (for I in 1 .. Number_Of_Steps =>
             (4.0 / (1.0 + ((Real (I) - 0.5) * (1.0 / Number_Of_Steps))**2)))));


--  Count the number of formal-params that are output parameters

Num_Outputs : constant Integer:=
    (for P of Formal_Params => <0> + (if P.Is_Operation_Output then 1 else 0))

Num_Outputs : constant Integer :=
    (0 + (for P of Formal_Params => (if P.Is_Operation_Output then 1 else 0)));

--  Create a bracketed list of the elements of DF.

X : constant DF_Image :=
  (for F in DF => <"["> & (if F = DF.First then "" else ", ") & F) & "]";

X : constant DF_Image :=
  ("[" & (for F in DF => (if F = DF.First then "" else ", ") & F)) & "]";

--  This produces a string representation of a sequence of ranges
--  separated by spaces

return (for R in Rngs => <"Countable::["> & R.First & ".." & R.Last & " ") & "]";

return ("Countable::[" & (for R in Rngs => R.First & ".." & R.Last & " ")) & "]";


--  This prints out a bracketed, comma-separated string representation of the inputs

Put_Line((for Inp of Inputs => <"["> & (if Inp = Inputs'First then "" else ", ") & "VN" & Inp) & "]");

Put_Line(("[" & (for Inp of Inputs => (if Inp = Inputs'First then "" else ", ") & "VN" & Inp)) & "]");


--  This counts the number of parameters that are outputs or variables

(for P of Params => <0> + (if P.Is_Operation_Output or else P.Is_Var then 1 else 0))

(0 + (for P of Params => (if P.Is_Operation_Output or else P.Is_Var then 1 else 0)))

--  This determine the minimum index of a set of parameters that pass
--  the filter (i.e. are either outputs or variables).

First_Updated : constant Integer :=
  (for P of Params =>
    Integer'Min(<Integer'Last>, (if P.Is_Operation_Output or else P.Is_Var then P.I else Integer'Last)));

First_Updated : constant Integer :=
   Integer'Min(Integer'Last,
               (for P of Params =>
                  (if P.Is_Operation_Output or else P.Is_Var then P.I else Integer'Last)));

-- Count the number of non-inputs

Num_Inputs : constant Integer :=
  (for P of Params => <0> + (if not P.Is_Operation_Output then 1 else 0));

Num_Inputs : constant Integer :=
  (0 + (for P of Params => (if not P.Is_Operation_Output then 1 else 0)));

--  Print the list of VNs in VNs_To_Prop

Put_Line((for VN in VNs_To_Prop => <"Recursive_Top_Down"> & " VN" & VN));

Put_Line("Recursive_Top_Down" & (for VN in VNs_To_Prop => " VN" & VN));


--  Create a set of the parent value numbers (VNs) of the VNs in
--  Changed_Operands

Parents : constant VN_Set :=
   (for Opnd in Changed_Operands => <Op_Ctx.Empty_Set> & Op_Ctx.VN_To_Parent_Set_Map(Opnd));

Parents : constant VN_Set :=
   (Op_Ctx.Empty_Set & (for Opnd in Changed_Operands => Op_Ctx.VN_To_Parent_Set_Map(Opnd)));

--  Print a list of [VNx => VNy] pairs representing the live-out set

Put_Line(
  (for Loc of Live_Out_Set =>
      <" Live_Out_Set ="> & " [VN" & Loc & "=>VN" & Loc.VN & "]"));

Put_Line(
  (" Live_Out_Set =" &
     (for Loc of Live_Out_Set => " [VN" & Loc & "=>VN" & Loc.VN & "]")));


--  Count the number of nodes that are found by following the
--  Parent chain.

return (for N in Node_Tree.Parent_Iterate(Node_Id) => <0> + 1)

return (0 + (for N in Node_Tree.Parent_Iterate(Node_Id) => 1));


--  This asserts that the sum of the sizes in Bit_Field_Sizes should be <= 63

Assert((for Size of Bit_Field_Sizes => <0> + Size) <= 63);

Assert((0 + (for Size of Bit_Field_Sizes => Size)) <= 63);


Note, I see now what Ed and Tuck meant by nesting the <> inside the expression.
Tuck's last example has this.

--  Hash the characters of a string to produce an integer func Hash_Vec(Vec : Vector<Univ_Character>) -> Univ_Integer is  return (for I in 1 .. |Vec| reverse =>
   (<0> * 127 + (Vec[I] - Char_First)) mod Hash_Modulus) end func Hash_Vec


I don't believe this can be expressed using the syntax I am proposing either.
On the other hand, this was quite difficult to understand, and I agree with Ed,
I'd be completely fine with not allowing such functions to be written with
expression functions. This causes the head explode needle to move dangerously
close to the big bang!

****************************************************************

From: Brad Moore
Sent: Thursday, January 4, 2018  1:48 AM

...
> We've probably gone too far already in allowing stuff in expressions,
> and if we really need to do this to support parallelism in Ada, then
> it's time to admit that Ada is the wrong foundation and start over
> with a functional design. Grafting on some functional features because
> some people find them clever is the wrong direction.

I'd be interested in hearing your take on the suggestion in my last email. After
eliminating the <> altogether, as well as the need for aspects, and pulling the
combiner and initial value out of the syntax, I think it helps quite a bit
towards looking more Ada like.

Quantified expressions and Array aggregates are pretty much reduction
expressions that we already have in the language, and quantified expressions are
particularly useful for contracts. I think reduction expressions will also be
useful in this area, as well as supporting parallelism.

****************************************************************

From: Tucker Taft
Sent: Thursday, January 4, 2018  7:53 AM

Your suggestion is interesting, but it seems pretty ambiguous right now.  What
distinguishes these nested iterators from an array (or container) aggregate with
an iterated component association?

Also, didn't we agree to add the "when" clause to all iterators at the last ARG
meeting?

See below for a couple of other comments.

...
>> We've probably gone too far already in allowing stuff in expressions,
>> and if we really need to do this to support parallelism in Ada, then
>> it's time to admit that Ada is the wrong foundation and start over
>> with a functional design. Grafting on some functional features
>> because some people find them clever is the wrong direction.

Randy, this is about expressivity, not cleverness.  Once you start doing
contract-based programming, you really want the ability to express important
properties in postconditions.  It seems you are reacting this way mostly because
it is unfamiliar.  Yes, Ada is evolving to have more functional constructs, such
as delta aggregates, array aggregates with iterated component associations, etc.
But that is what happens when you want to write things that specify what is the
net effect of an operation.  You may not be used to it, but that doesn't mean it
isn't useful and important.

> I'd be interested in hearing your take on the suggestion in my last email.
> After eliminating the <> altogether, as well as the need for aspects,
> and pulling the combiner and initial value out of the syntax, I think
> it helps quite a bit towards looking more Ada like.

As mentioned above, I think this is an interesting direction, but first, I would
prefer to see the use of the "when" clause for filtering rather than an "if"
expression, and the semantics seems quite ambiguous as written, given that the
syntax is not distinguishable from array aggregates with iterated component
associations.

> Quantified expressions and Array aggregates are pretty much reduction
> expressions that we already have in the language, and quantified
> expressions are particularly useful for contracts. I think reduction
> expressions will also be useful in this area, as well as supporting parallelism.

These are both very important points.  The reduction expression was designed to
be a generalization of quantified expressions (where the combining operation is
"and then" for "for all" and "or else" for "for some") and for array/container
aggregates with iterated component associations (where the combining operation
is "&" for an array aggregate, and something like "insert" for a container).
Secondly, these are the kinds of constructs you will want in postconditions to
describe what is being done by a function, and in expression functions when that
is *all* that is being done.  Admittedly, postcondition specifications can get
large, but that is the nature of contract-based programming.  One of the points
of reduction expressions is that it allows you to concisely specify what is
happening.  And if that is *all* that is happening, you can use the reduction
expression itself to perform the computation, rather than just specify the
contract for the computation.

****************************************************************

From: Tucker Taft
Sent: Thursday, January 4, 2018  8:17 AM

One possibility would be to use unique syntax, such as "<...>", around the
iterator to indicate that you are doing a reduction across the result of the
iterator.  E.g.:

    --  Find smallest square of any odd element of Con
    Min_Odd_Square : constant Integer :=
      Integer'Min(Integer'Last,
                         <for E of Con when E mod 2 = 1 => E**2>);

****************************************************************

From: Brad Moore
Sent: Thursday, January 4, 2018  10:35 AM

> Your suggestion is interesting, but it seems pretty ambiguous right
> now.  What distinguishes these nested iterators from an array (or
> container) aggregate with an iterated component association?
>
> Also, didn't we agree to add the "when" clause to all iterators at the
> last ARG meeting?

An array aggregate such as;

A : constant Array_Of_Integers := (for I in 1 .. 10 => I);

can be considered a shorthand for

A : constant Array_Of_Integers := (Empty_Array & (for I in 1 .. 10 => I))

Where Empty_Array is some expression for an array that has 0 elements.

The long form is disambiguated by the outer set of parens. There is only one
combiner function allowed in a reduction expression, in this case "&". The array
aggregate doesn't have that, nor does it have the enclosing set of parens.

It's subtle, but I think it works. It is also further disambiguated by context
where the expression must be an array since that is the target of the
expression. The only way to get an array of integers from a set of integer
values would be to concatenate them together. I think while this further
disambiguation helps clarify to the reader, it is not necessary for the compiler
to parse. It already has enough just with the set of parents. At least that is
my hope.


I don't recall discussing the "when" clause at the ARG meeting, and when I
search for "when" in Randy's minutes, I dont find anything. It sounds
interesting though, as it could help readability for cases where if expressions
would otherwise be used.

Is this something we would have discussed at a different ARG meeting? I searched
the Vienna minutes also, but didn't find anything.

****************************************************************

From: Randy Brukardt
Sent: Thursday, January 4, 2018  6:16 PM

> One possibility would be to use unique syntax, such as "<...>", around
> the iterator to indicate that you are doing a reduction across the
> result of the iterator.  E.g.:
>
>     --  Find smallest square of any odd element of Con
>     Min_Odd_Square : constant Integer :=
>       Integer'Min(Integer'Last,
>                          <for E of Con when E mod 2 = 1 => E**2>);

This thread started because I said that I didn't want to see angle brackets used
in Ada syntax. That didn't suddenly disappear because you put them somewhere
else... :-)

****************************************************************

From: Randy Brukardt
Sent: Thursday, January 4, 2018  6:31 PM

...
> I don't recall discussing the "when" clause at the ARG meeting, and
> when I search for "when" in Randy's minutes, I dont find anything. It
> sounds interesting though, as it could help readability for cases
> where if expressions would otherwise be used.

It was in Vienna, and it was buried in some other proposal. Tucker suggested
that it get put out on its own, where either it was added to all iterators or
none. The person it was assigned to has never pursued it, and I haven't bothered
to remind them of it because I think it is a horrible and completely unnecessary
idea. (One can always put a conditional expression/statement inside of any
iterator.) As such, I was rather hoping that the proposer never actually gets it
on the agenda in time (June 15th).

Perhaps it is time to break out the gee-gaws speech; I just posted a form of it
on Ada-Comment. We all know that there is a cost to every feature added to Ada,
and the last thing I want to see is a language overloaded by clever gee-gaws of
every stripe. Ada started out its life with a design where pretty much
everything had a name, everything is written out explicitly, and shorthands are
rare ("elsif" being pretty much it). Now people want a shorthand for every
situation, make everything anonymous, and hope that the many possibilities occur
often enough that readers don't have to go to a book every time to figure out
what is going on. That doesn't seem to be the right direction for Ada -- we need
to limit the nice to haves to a minimum.

I'm willing to keep talking about reduction expressions ONLY because of the
parallism opportunities and would hope that no one use them otherwise. I don't
have any interest in anything that is only valuable for sequential cases,
because that's solely about making code harder to understand (especially the <>
in the middle cases -- no thanks).

P.S. I can't imagine how a "when clause" would work with case and aggregate
completeness checks; it would seem necessary to limit it to the form of a static
predicate and treat it effectively like such a predicate -- but then it would
have been better to define  such a predicate and iterator over that rather than
defining a "when clause". Sounds like a major complication with no actual new
expressive power.

****************************************************************

From: Randy Brukardt
Sent: Thursday, January 4, 2018  7:04 PM

...
> >> We've probably gone too far already in allowing stuff in
> >> expressions, and if we really need to do this to support
> >> parallelism in Ada, then it's time to admit that Ada is the wrong
> >> foundation and start over with a functional design. Grafting on
> >> some functional features because some people find them clever is the wrong direction.
>
> Randy, this is about expressivity, not cleverness.  Once you start
> doing contract-based programming, you really want the ability to
> express important properties in postconditions.
> It seems you are reacting this way mostly because it is unfamiliar.
> Yes, Ada is evolving to have more functional constructs, such as delta
> aggregates, array aggregates with iterated component associations,
> etc.  But that is what happens when you want to write things that
> specify what is the net effect of an operation.  You may not be used
> to it, but that doesn't mean it isn't useful and important.

It's about cleverness, because there is ZERO problem writing a function to do
this. And it is precisely the sort of thing that is best put into a function
anyway, because it's complex and most likely occurs in multiple contracts. There
is far too little abstraction is many of the contracts I've seen, making them
ginormous.

(And I don't see delta aggregate or iterator aggregates as being functional at
all; they're just additional ways to specify an aggregate, ones that have been
missing forever.)

...
> > Quantified expressions and Array aggregates are pretty much reduction
> > expressions that we already have in the language, and quantified
> > expressions are particularly useful for contracts. I think reduction
> > expressions will also be useful in this area, as well as supporting
> > parallelism.
>
> These are both very important points.  The reduction expression was
> designed to be a generalization of quantified expressions (where the
> combining operation is "and then" for "for all" and "or else" for "for
> some") and for array/container aggregates with iterated component
> associations (where the combining operation is "&" for an array
> aggregate, and something like "insert" for a container).

AARRRGGGHHH!!!!

I was strongly opposed to quantified expressions in the beginning (I wanted if
expressions and nothing further), and I was eventually convinced that (A) some
others considered them important, and (B) I don't have to use them or worry
about reading them in normal code.

So now you want to turn essentially all Ada code into these nasty things.

AAARRRGGHHH!

Especially, trying to unify two completely unrelated things, a composite object
constructor (aggregates) and chains of expressions (values). Indeed, perhaps it
is my compiler-writer's viewpoint, but I can't even imagine treating these as
the same. An expression is (an elementary) value that lives in a machine
register while objects are memory that lives elsewhere. (Composite expressions
are pointers at objects.) A literal unification as described above would lead to
god-awful code (given that "&" can allocate as many as 4 temporary objects for
each one) -- the absoletely worst way to create an object (*any* object). No one
wants that.

This whole train is heading in the wrong direction, which is highly unfortunate.
It might make sense for some new language, but not for Ada.

****************************************************************

From: Tucker Taft
Sent: Thursday, January 4, 2018  7:47 PM

...
> This thread started because I said that I didn't want to see angle
> brackets used in Ada syntax. That didn't suddenly disappear because
> you put them somewhere else... :-)

I am still unclear why you are so strongly against angle brackets.  Parsing is
straightforward so long as the rightmost construct is a "simple_expression."
They clearly look different from existing bracketing constructs, but in this
case, the whole point is that you want them to look different.   We could start
using "[]" or "{}" if you think those would be somehow better.  But trying to
overload "(...)" even more at this stage seems unwise.

As far as the "when" clause, it is quite important for cases when you are
building up something with the iterator, as you don't want to create a component
at all if the when clause returns False.  Container aggregates seem to be one
important use, but they are also useful for parallel iterators, where you don't
want the overhead of creating a "tasklet" if you are not going to be doing
anything with the element.

****************************************************************

From: Edmond Schonberg
Sent: Thursday, January 4, 2018  7:48 PM

> Especially, trying to unify two completely unrelated things, a
> composite object constructor (aggregates) and chains of expressions
> (values). Indeed, perhaps it is my compiler-writer's viewpoint, but I
> can't even imagine treating these as the same. An expression is (an
> elementary) value that lives in a machine register while objects are memory that lives elsewhere.
> (Composite expressions are pointers at objects.) A literal unification
> as described above would lead to god-awful code (given that "&" can
> allocate as many as 4 temporary objects for each one) -- the
> absoletely worst way to create an object (*any* object). No one wants that.
>
> This whole train is heading in the wrong direction, which is highly
> unfortunate. It might make sense for some new language, but not for Ada.

At the risk of eliciting more gag reflexes, let me suggest another way of
looking at quantified expressions and reduction operations.  The original Ada
dealt mostly with scalars, arrays, and records, and had a very rich set of
constructs to build composite values, but few global operations on them.

The introduction of containers, with a large set of dedicated operations, made
these composite objects (almost) as easy to manipulate as scalars, and hid the
storage management of these complex objects from the programmer, with a great
increase in expressivity.

Quantified expressions are another construct intended to manipulate composite
objects, and they are widely used in logical formulae, and therefore in
predicates and invariants. Reduction operations are another useful way of
manipulating composite values, and are widely used in other languages. Now the
current syntax is still not ideal, and there is no way we will get the
compactness of the APL notation, where  +/V means the sum of the elements of V
:-)!  But the examples that Tuck proposed indicate that it is really a natural
notation for common computations. In fact, thinking about the APL notation, it
would be nice if the iterator was implicit and the argument was a container.
But even without that, there are good arguments to have reduction expressions in
the language, outside of possible parallelism.

****************************************************************

From: Tucker Taft
Sent: Thursday, January 4, 2018  8:13 PM

I think we are approaching the APL view, but using an iterator rather than
directly using the container, to avoid having to "re-ify" the set of items to be
reduced, and dealing with the fact that many reducer operations in Ada are *not*
operators, unlike in APL.  Building on Brad's suggestion, but using "{...}"
instead of "(...)" to reduce further overloading the meaning of "(...)":

    So what in APL would be "op / vector" becomes in Ada:

        op (initial, {iterator over vector})
                    --  if "op" is *not* an Ada operator

or

        initial op {iterator over vector}
                    --  if "op" *is* an Ada operator

I am using "{...}" rather than "<...>" merely to avoid further inflaming
things... ;-)

****************************************************************

From: Randy Brukardt
Sent: Thursday, January 4, 2018  8:14 PM

...
> > This thread started because I said that I didn't want to see angle
> > brackets used in Ada syntax. That didn't suddenly disappear because
> > you put them somewhere else... :-)
>
> I am still unclear why you are so strongly against angle brackets.
> Parsing is straightforward so long as the rightmost construct is a
> "simple_expression."  They clearly look different from existing
> bracketing constructs, but in this case, the whole point is that you
> want them to look
> different.   We could start using "[]" or "{}" if you think
> those would be somehow better.  But trying to overload "(...)" even
> more at this stage seems unwise.

My experience is that so-called angle brackets make code much harder to read.
The old Scribe syntax allows angle brackets as one of 5 sets of bracket
characters, and the angle bracket uses tend to be the hardest to read
(especially when other nearby text is using  < or > literally). There's also an
error correction concern (our table driven parser uses weights for each token to
determine error correction, but the weights for a bracketing token and for an
operator are nearly maximally different). I'd probably be less concerned if the
contents where limited to just an identifier or something else very simple, but
arbitrary expression text will lead to madness. (While a machine might be able
to parse it, a human would have trouble.)

I'd definitely be happier with {} or [], but I'd really rather not have to use
anything new at all. Maybe impossible, dunno.

> As far as the "when" clause, it is quite important for cases when you
> are building up something with the iterator, as you don't want to
> create a component at all if the when clause returns False.  Container
> aggregates seem to be one important use, but they are also useful for
> parallel iterators, where you don't want the overhead of creating a
> "tasklet" if you are not going to be doing anything with the element.

That seems abusive to me. Not creating components is precisely the problem case,
since other rules generally require completeness of components (certainly in the
array aggregate case, and this is a very valuable and important feature of Ada).
I'd rather that such containers are created using a loop and if structure, not
by a phony aggregate. Aggregates (and other expressions for that matter) are for
*simple* cases, not something that can (or should) handle *every* case.

As far as the parallel iterators go, that seems to be assuming that there is a
tasklet for every iteration, but that is unlikely. Much more likely there are
going to be chunks of iteration for each tasklet, and only the most simple
conditionals would allow dropping any tasklets (normally, you at least have to
evaluate the conditionals before knowing whether further processing is needed;
doing that outside of a tasklet is reducing parallelism for no reason).

As with every clever concept that has been proposed, I'm sure there are uses.
The problem is that we are starting to drown in clever concepts -- to the point
where I haven't even had the energy to come up with any of my own. ;-) We need
to draw the line lest we have a ton of gee-gaws and no one other than AdaCore
left in the Ada business.

****************************************************************

From: Brad Moore
Sent: Friday, January 5, 2018  1:58 AM

> As mentioned above, I think this is an interesting direction, but
> first, I would prefer to see the use of the "when" clause for filtering rather than an "if"
> expression, and the semantics seems quite ambiguous as written, given
> that the syntax is not distinguishable from array aggregates with
> iterated component associations.

For reference...

Here are the examples again, using only the currently discussed form, but using
{...} instead of (...) for the iterator part, and using "when" clauses where it
makes sense.

For the cases where "When" clauses are used, I also show the version that uses
if expressions for comparison.


Sum : constant Integer := (0 + {for I in 1 .. 10 => I});

Sales_Price : constant Integer := Integer'Max(100, {for I in I .. 10 => I});

-- Appending to string
Message : constant String := ("Alphabet: " & {for Letter in 'A' .. 'Z' => Letter});

-- Prepending to string (Note: We might want to allow the combiner function to appear
-- on either side of the iterator
Message : constant String := ({for Letter in 'A' .. 'Z' => Letter} & " <-Alphabet");

-- Calculate ln(1 + X)
function Mercator (X : Float; Num_Steps : Integer) return Float is
  (0.0 + {for I in 1 .. Num_Steps => X**I / Float(I) * (if I mod 2 = 0 then -1.0 else 1.0)});

-- Number of Adults (18 years of age or older)

Adults : constant Natural := (0 + {for Age of People => (if Age >= 18 then 1 else 0)}); Adults : constant Natural := (0 + {for Age of People when Age >= 18 => 1});

function Factorial(N : Natural) return Natural is (1 * {for J in 1..N => J});

function Sin(X : Float; Num_Terms : Positive := 5) return Float is
  (0.0 + {for I in 1 .. Num_Terms => (-1.0)**I * X**(2*I-1)/Float(Fact(2*I-1))});

Put_Line ("Sum of Squares is" & Integer'Image(0 + {for I in 1 .. 10 => I**2}));

function Pi (Number_Of_Steps : Natural := 10_000) return Real is
   (1.0 / Number_Of_Steps *
     (0.0 + {for I in 1 .. Number_Of_Steps =>
             (4.0 / (1.0 + ((Real (I) - 0.5) * (1.0 / Number_Of_Steps))**2))}));

--  Count the number of formal-params that are output parameters

Num_Outputs : constant Integer :=
    (0 + {for P of Formal_Params => (if P.Is_Operation_Output then 1 else 0)});

Num_Outputs : constant Integer :=
    (0 + {for P of Formal_Params when P.Is_Operation_Output => 1});

--  Create a bracketed list of the elements of DF.

X : constant DF_Image :=
  ("[" & {for F in DF => (if F = DF.First then "" else ", ") & F}) & "]";

--  This produces a string representation of a sequence of ranges
--  separated by spaces

return ("Countable::[" & {for R in Rngs => R.First & ".." & R.Last & " "}) & "]";

--  This prints out a bracketed, comma-separated string representation of the inputs

Put_Line(("[" & {for Inp of Inputs => (if Inp = Inputs'First then "" else ", ") & "VN" & Inp}) & "]");


--  This counts the number of parameters that are outputs or variables

(0 + {for P of Params => (if P.Is_Operation_Output or else P.Is_Var then 1 else 0)}) (0 + {for P of Params when P.Is_Operation_Output or else P.Is_Var => 1})

--  This determine the minimum index of a set of parameters that pass
--  the filter (i.e. are either outputs or variables).

First_Updated : constant Integer :=
   Integer'Min(Integer'Last,
               {for P of Params =>
                  (if P.Is_Operation_Output or else P.Is_Var then P.I else Integer'Last)});

First_Updated : constant Integer :=
   Integer'Min(Integer'Last,
               {for P of Params when P.Is_Operation_Output or else P.Is_Var => P.I});

-- Count the number of non-inputs

Num_Inputs : constant Integer :=
  (0 + {for P of Params => (if not P.Is_Operation_Output then 1 else 0)});

Num_Inputs : constant Integer :=
  (0 + {for P of Params when not P.Is_Operation_Output => 1});

--  Print the list of VNs in VNs_To_Prop

Put_Line("Recursive_Top_Down" & {for VN in VNs_To_Prop => " VN" & VN});


--  Create a set of the parent value numbers (VNs) of the VNs in
--  Changed_Operands

Parents : constant VN_Set :=
   (Op_Ctx.Empty_Set & {for Opnd in Changed_Operands => Op_Ctx.VN_To_Parent_Set_Map(Opnd)});

--  Print a list of [VNx => VNy] pairs representing the live-out set

Put_Line(" Live_Out_Set =" &
         {for Loc of Live_Out_Set => " [VN" & Loc & "=>VN" & Loc.VN & "]"});

--  Count the number of nodes that are found by following the
--  Parent chain.

return (0 + {for N in Node_Tree.Parent_Iterate(Node_Id) => 1});


--  This asserts that the sum of the sizes in Bit_Field_Sizes should be <= 63

Assert((0 + {for Size of Bit_Field_Sizes => Size}) <= 63);


****************************************************************

From: Tucker Taft
Sent: Friday, January 5, 2018  7:45 AM

...
> My experience is that so-called angle brackets make code much harder
> to read. The old Scribe syntax allows angle brackets as one of 5 sets
> of bracket characters, and the angle bracket uses tend to be the
> hardest to read (especially when other nearby text is using  < or >
> literally). There's also an error correction concern (our table driven
> parser uses weights for each token to determine error correction, but
> the weights for a bracketing token and for an operator are nearly
> maximally different). I'd probably be less concerned if the contents
> where limited to just an identifier or something else very simple, but
> arbitrary expression text will lead to madness. (While a machine might
> be able to parse it, a human would have
> trouble.)

This seems pretty subjective.  Scribe is notoriously dense with brackets, so I
can imagine making such distinctions is painful.  But many languages these days
use angle brackets as part of generic instantiation (e.g. Map<Key_Type,
Value_Type>) so it seems that much of the programming world has adopted angle
brackets as a useful alternative bracketing syntax.  But I could certainly live
with "{...}" instead.

> I'd definitely be happier with {} or [], but I'd really rather not
> have to use anything new at all. Maybe impossible, dunno.

I think having a distinct syntax for this is quite important.  Parentheses are
already everywhere in Ada, and trying to imbue levels of parentheses with
special semantics seems quite dangerous.

>> As far as the "when" clause, it is quite important for cases when you
>> are building up something with the iterator, as you don't want to
>> create a component at all if the when clause returns False.
>> Container aggregates seem to be one important use, but they are also
>> useful for parallel iterators, where you don't want the overhead of
>> creating a "tasklet" if you are not going to be doing anything with
>> the element.
>
> That seems abusive to me.

That's a new term for the ARG!

...
> Not creating components is precisely the problem case, since other
> rules generally require completeness of components (certainly in the
> array aggregate case, and this is a very valuable and important
> feature of Ada). I'd rather that such containers are created using a
> loop and if structure, not by a phony aggregate. Aggregates (and other
> expressions for that matter) are for *simple* cases, not something
> that can (or should) handle *every* case.

When it comes to creating a container aggregate for a map, there typically is no
notion of "completeness."  There might be some notion of no overlap, but
completeness doesn't really make sense for most maps.

> As far as the parallel iterators go, that seems to be assuming that
> there is a tasklet for every iteration, but that is unlikely. Much
> more likely there are going to be chunks of iteration for each
> tasklet, and only the most simple conditionals would allow dropping
> any tasklets (normally, you at least have to evaluate the conditionals
> before knowing whether further processing is needed; doing that
> outside of a tasklet is reducing parallelism for no reason).
>
> As with every clever concept that has been proposed, I'm sure there
> are uses. The problem is that we are starting to drown in clever
> concepts -- to the point where I haven't even had the energy to come up with any of my own.
> ;-) We need to draw the line lest we have a ton of gee-gaws and no one
> other than AdaCore left in the Ada business.

The notion of a "filter" is not just something "clever."  It is fundamental to
many notions of what an iterator does.  Having to use an explicit "if" statement
or "if" expression muddies the waters in my view, and forces an extra level of
nesting which makes it that much *harder* to read.  And neither an "if"
statement nor an "if" expression really works for something that is generating
values, which is what we are doing in the particular cases of a container
aggregate or a reduction expression.

Also, remember that we hope to use these notations as part of contracts.  It
doesn't work very well to bury the semantics of a contract inside a function,
because then you just have to write a contract for that function (unless that
function is an expression function -- which gets us back to using things like
reduction expressions).

****************************************************************

From: Jean-Pierre Rosen
Sent: Friday, January 5, 2018  8:30 AM

>> I'd definitely be happier with {} or [], but I'd really rather not
>> have to use anything new at all. Maybe impossible, dunno.

> I think having a distinct syntax for this is quite important.  Parentheses are
> already everywhere in Ada, and trying to imbue levels of parentheses with
> special semantics seems quite dangerous.

Note that there is a precedent using parenthesis: in the body of an entry
family, where the body is "repeated" by the "for"

****************************************************************

From: Brad Moore
Sent: Friday, January 5, 2018  11:07 AM

I'd like to try to simplify this further.
It feels like pulling the combiner_function and the initial value out of the
iteration part of the reduction expression was a step in the right direction.

It still is bothering me, that the construct feels complex enough that it seems
to need two types of bracketing. (parens and braces). It also bothers me that
the outer part of the construct involving the combiner looks too much like
existing syntax, but has quite different semantics of feeding multiple values
back into the expression.

It feels like we can decompose this construct further to make use of other syntax that either already exists, or we are considering adding.

Specifically, it feels like the set of values produced by the iteration part of
the reduction expression can be considered to be an array, and we already have
array aggregate syntax that can produce such arrays.

If we take the view that we can use array aggregate syntax to produce an array,
perhaps even an anonymous array, then that reduces the problem for reduction
expressions to just be;
   How to summarize an array.

We could use an attribute, say 'Reduce for that purpose, where the 'Reduce
attribute could enclose the combiner function and initial value.

I think we can then go back to using <> as the indicator where the feedback
occurs.

For example, to add all the elements of an array of integers (or a container of
integers);

Sum : constant Integer := A'Reduce(0 + <>);

Note here that unlike before, there is no need for Identity aspects or Reducer
aspects. Those are only needed if we want this to run in parallel. Here we are
just feeding every value of the array into the 'Reduce attribute to produce the
summary. Note that the initial value in this example is represented by 0.

If we can say that an array aggregate can produce an anonymous array, then we
can summarize a range of values inline.

Sum : constant Integer := (for I in 1 .. 10 => I)'Reduce(0 + <>);

Here we are just combining two separate syntax features together. (Array
aggregates + Reduction expression) Another nice feature of this is that you dont
need nested parens, which makes things more difficult to read. Nor do you need
to use different kinds of bracketing.

If the expression gets too complicated, you can then use declare expressions to
break it up for improved readability.

Sum : constant Integer := (Values : constant := (for I in 1 .. 10 => I)
                           begin Values'Reduce(0 + <>));


Note it could be optional (or mandatory) to explicitly mention the type of the
array in the Values declaration above. But we could have as above, where the
type of array is implicit or considered to be an anonymous array type.

With this in mind, here are all the current examples rewritten to use this
syntax idea.

-- Sum of an array of integers
A_Sum : constant Integer := (A'Reduce(<> + 0));

-- Sum of a range of values
Sum : constant Integer := (X : constant := (for I in 1 .. 10 => I)
                           begin X'Reduce(<> + 0));
-- or shorthand
Sum : constant Integer := ((for I in 1 .. 10 => I)'Reduce(<> + 0));

Sales_Price : constant Integer := Offers'Reduce(Integer'Max(100, <>));

-- Appending to string
Message : constant String := (for Letter in 'A' .. 'Z' => Letter)'Reduce("Alphabet: " & <>);

-- Prepending to string
Message : constant String := (for Letter in 'A' .. 'Z' => Letter)'Reduce(<> & " <- Alphabet");

-- Calculate ln(1 + X)
function Mercator (X : Float; Num_Steps : Integer) return Float is
  (Terms : constant := (for I in 1 .. Num_Steps =>
                          X**I / Float(I) * (if I mod 2 = 0 then -1.0 else 1.0))
   begin Terms'Reduce(0.0 + <>));

-- Number of Adults (18 years of age or older)

Adult_Count : constant Natural :=
  (Adults : constant := (for Age of People when Age >= 18 => 1)
   begin Adults'Reduce(0.0 + <>);

function Factorial(N : Natural) return Natural is ((for J in 1..N => J)'Reduce(1 * <>));

function Sin(X : Float; Num_Terms : Positive := 5) return Float is
  (Terms : constant := (for I in 1 .. Num_Terms =>
                          (-1.0)**I * X**(2*I-1)/Float(Fact(2*I-1)))
   begin Terms'Reduce(0.0 + <>));

Put_Line ("Sum of Squares is" & Integer'Image((for I in 1 .. 10 => I**2)'Reduce(0 + <>)));

function Pi (Number_Of_Steps : Natural := 10_000) return Real is
   (Terms : constant :=
      (for I in 1 .. Number_Of_Steps =>
         (4.0 / (1.0 + ((Real (I) - 0.5) * (1.0 / Number_Of_Steps))**2)))
    begin 1.0 / Number_Of_Steps * Terms'Reduce(0.0 + <>));

--  Count the number of formal-params that are output parameters

Num_Outputs : constant Integer :=
    ((for P of Formal_Params when P.Is_Operation_Output => 1)'Reduce(0 + <>));

--  Create a bracketed list of the elements of DF.

X : constant DF_Image :=
  DF'Reduce("[" & (if <> = DF.First_Element then "" else ", ") & <>) & "]";

--  This produces a string representation of a sequence of ranges
--  separated by spaces

return (for R in Rngs => R.First & ".." & R.Last & " ")'Reduce("Countable::[" & <>) & "]";

--  This prints out a bracketed, comma-separated string representation of the inputs

Put_Line(Terms : constant :=
           (for Inp of Inputs => (if Inp = Inputs'First then "" else ", ") & "VN" & Inp)
         begin Terms'Reduce("[" & <>) & "]");

--  This counts the number of parameters that are outputs or variables

(for P of Params when P.Is_Operation_Output or else P.Is_Var => 1)'Reduce(0 + <>)

--  This determines the minimum index of a set of parameters that pass
--  the filter (i.e. are either outputs or variables).

First_Updated : constant Integer :=
   (for P of Params when P.Is_Operation_Output or else P.Is_Var => P.I)'Reduce(Integer'Min(Integer'Last,<>));

-- Count the number of non-inputs

Num_Inputs : constant Integer :=
  (for P of Params when not P.Is_Operation_Output => 1)'Reduce(0 + <>);

--  Print the list of VNs in VNs_To_Prop

Put_Line("Recursive_Top_Down" & VNs_To_Prop'Reduce(" VN" & <>));


--  Create a set of the parent value numbers (VNs) of the VNs in
--  Changed_Operands

Parents : constant VN_Set :=
   Changed_Operands'Reduce(Op_Ctx.Empty_Set & <>.VN_To_Parent_Set_Map);


--  Print a list of [VNx => VNy] pairs representing the live-out set

Put_Line((for Loc of Live_Out_Set => " [VN" & Loc & "=>VN" & Loc.VN & "]")'Reduce(" Live_Out_Set =" & <>));

--  Count the number of nodes that are found by following the
--  Parent chain.

return (for N in Node_Tree.Parent_Iterate(Node_Id) => 1)'Reduce(0 + <>);


--  This asserts that the sum of the sizes in Bit_Field_Sizes should be <= 63

Assert(Bit_Field_Sizes'Reduce(0 + <>) <= 63);

****************************************************************

From: Erhard Ploedereder
Sent: Friday, January 5, 2018  6:12 PM

> But I could certainly live with "{...}" instead.

Of all the offered options, I would prefer this one, because it does have the
Kleene connotations of something iterative and no previous association with Ada
semantics.

Having said that, I find all the syntax examples so far rather unintuitive.
Foremost, when I see one of these {..}, am I necessarily looking at a reduction?
Or am I looking at a generator? Why this attempt to mix the two notions?

is "(3 + {for i in 1..10 ..})" intuitively a reduction with 3 as an initial
value? Not to me, someone had to decipher the riddle for me explicitly.

"Integer*Min(5, {for I in 1..10 => I})" I am not sure I even understood.
Is this a strange way of spelling "1"? I presume that the I in the iteration
range of the example is a typo, no? If not, what does it mean?

> Sales_Price : constant Integer := Integer'Min(5, (for I in I .. 10 =>
> I));

Is "(3, {for i in 1..10 ..})" something meaningful? I would read this as a
vector of 10 pairs. Or maybe it is a vector of 11 entries, starting with a 3?

Note the short Hamming distance for 2 very different things.

Indeed, an intuitive Ada equivalent for +/{..} is needed for reductions and
similarly one for generators, if you want to get the functional folks to love
you.

****************************************************************

From: Edmond Schonberg
Sent: Friday, January 5, 2018  7:12 PM

Maybe the introduction of an attribute ‘Reduce,  as suggested by Brad, is the
best way to make the construct immediately visible and intrusive. The prefix is
the iterator (really a generator) and the arguments are the operator or function
name and the initial value. Functional folks will love the presence of a real
functional!

****************************************************************

From: Randy Brukardt
Sent: Friday, January 5, 2018  7:24 PM

...
> Indeed, an intuitive Ada equivalent for +/{..} is needed for
> reductions and similarly one for generators, if you want to get the
> functional folks to love you.

I, at least, explicitly don't give a rats a** about the "functional folks".
I would hope this is what's best for Ada going forward, and I'd also hope this
isn't trying to chase other consistencies with some vague hope that Ada might
appeal to them. That *never* works, and just ends up bastardizing the language.

The Introduction to the Ada Standard states that "Ada was originally designed
with three overriding concerns: program reliability and maintenance, programming
as a human activity, and efficiency." Typical "functional folks" don't care
about any of these. Most functional programs are conglomerations of clever
tricks without concern for efficiency or much else. They'll write a one-liner to
accurately predict tommorrow's stock market price, but no one including the
author will be able to figure out how it works and it will take 100 years to
run.

We need to stick to these original goals for Ada, and that means avoiding
loading up Ada with the gee-gaw de jour. Shortly afterward in the Introduction,
it says "Above all, an attempt was made to keep to a relatively small number of
underlying concepts integrated in a consistent and systematic way..." We have to
be careful that we are not completely forgetting the human element here -- no
one is going to be able to remember bunches of rarely used syntaxes.

I came to Ada in the first place because it was the only language at the time
that really got syntax right, with the appropriate balance between explicitness
and expressivity. Most language of the time (and that has not changed much in
the interim) go way too far in the direction of clever shorthands (like "++")
that programmers have to remember and understand.

Up to this point, we've done a good job (with a couple of exceptions) in keeping
the original goals (and the reasons I use Ada in the first place) intact. But
we're now starting toward "design-by-committee" where everybody's favorite
gadget is included, regardless of need or expressivity.


If there really is no way to write safe parallel-executing programs in a
conventional style, then so be it. But let's not use that as an excuse to
destroy all of Ada's positive attributes; it means that Ada has reached the end
of its useful life and a new language (without the baggage) needs to take over.
(And since I probably won't use that language, I don't care what it looks like.)

****************************************************************

From: Randy Brukardt
Sent: Friday, January 5, 2018  7:50 PM

> ...
> > Not creating components is precisely the problem case, since other
> > rules generally require completeness of components
> (certainly in the
> > array aggregate case, and this is a very valuable and important
> > feature of Ada). I'd rather that such containers are
> created using a
> > loop and if structure, not by a phony aggregate. Aggregates
> (and other
> > expressions for that matter) are for *simple* cases, not something
> > that can (or should) handle *every* case.
>
> When it comes to creating a container aggregate for a map, there
> typically is no notion of "completeness."  There might be some notion
> of no overlap, but completeness doesn't really make sense for most
> maps.

The reason for having container aggregates at all (IMHO) is to allow the
array-like containers to be used even more like an array. One gets aggregates
for maps as a side-effect, but they don't fit into the model of aggregates at
all. (The whole reason for using aggregates is for the completeness checks,
otherwise a series of component assignments is easier and usually more
readable.)

One is not (I hope) building giant data structures with container aggregates;
those are almost always created by a program, and there's no reason for that
program to build a giant unoptimizable entity when a series of statements works
just as well (and can be optimized).

...
> > As with every clever concept that has been proposed, I'm sure there
> > are uses. The problem is that we are starting to drown in clever
> > concepts -- to the point where I haven't even had the energy to come
> > up with any of my own. ;-) We need to draw the line lest we have a
> > ton of gee-gaws and no one other than AdaCore left in the Ada business.
>
> The notion of a "filter" is not just something "clever."  It is
> fundamental to many notions of what an iterator does.

Not mine. :-)

> Having to use an explicit "if" statement or "if" expression muddies
> the waters in my view, and forces an extra level of nesting which
> makes it that much *harder* to read. And neither an "if" statement nor
> an "if" expression really works for something that is generating
> values, which is what we are doing in the particular cases of a
> container aggregate or a reduction expression.

That could be fixed, but it would require introducing another gee-gaw (we allow
omitting "else True", we could also allow omitting "else 0" and "else 0.0").

In any case, I jump back to the line in the RM Introduction I mentioned in the
reply to Erhard: "Above all, an attempt was made to keep to a relatively small
number of underlying concepts...". There are already at least two, and often
three, ways to write an iterator filter. Do we really need a fourth?

> Also, remember that we hope to use these notations as part of
> contracts.  It doesn't work very well to bury the semantics of a
> contract inside a function, because then you just have to write a
> contract for that function (unless that function is an expression
> function -- which gets us back to using things like reduction
> expressions).

Generally, these sorts of functions have to be extra-pure ("global in out null")
and they produce a simple answer ("Is_Open", "Mode", "Capacity", "Length"), so
there is no separate contract for them. And they should be able to reduced
algebraically at the call-site (assuming that appropriate stable properties have
been defined), meaning that looking in them for preconditions is unnecessary. So
we're mainly talking about being able to look in them for the purposes of
proving postconditions -- but that's almost never a problem since the function
and the postcondition are going to be in the same package (since they both need
to be primitives of the type). I suppose there might be some issues with
inherited functions and with postconditions for class-wide subprograms, but I
don't think this matters much: any static prover is going to have to look into
bodies -- at least for generic instantiations -- and if you have to do that in
some cases, why not all?

The more general answer is given in my reply to Erhard's message, but it
probably is even more critical. If we *really* need all of this stuff, then Ada
is done, because it will collapse of excessive weight. Ada does not need
everyone's pet feature (or pet five features ;-) and SOMEONE has to start being
the voice of reason on this topic.

****************************************************************

From: Randy Brukardt
Sent: Friday, January 5, 2018  9:21 PM

...
> > My experience is that so-called angle brackets make code much harder
> > to read. The old Scribe syntax allows angle brackets as one of 5
> > sets of bracket characters, and the angle bracket uses tend to be
> > the hardest to read (especially when other nearby text is using  <
> > or > literally). There's also an error correction concern (our table
> > driven parser uses weights for each token to determine error
> > correction, but the weights for a bracketing token and for an
> > operator are nearly maximally different). I'd probably be less
> > concerned if the contents where limited to just an identifier or
> > something else very simple, but arbitrary expression text will lead
> > to madness. (While a machine might be able to parse it, a human
> > would have
> > trouble.)
>
> This seems pretty subjective.

Syntax always is!

> ... Scribe is notoriously dense
> with brackets, so I can imagine making such distinctions is painful.
> But many languages these days use angle brackets as part of generic
> instantiation (e.g. Map<Key_Type,
> Value_Type>) so it seems that much of the programming world has
> adopted angle brackets as a useful alternative bracketing syntax.

In that case, it seems to be a list of identifiers, which is quite restricted
and something I'd probably have grudgedly agreed to. Allowing arbitrary
expressions inside however, brings up reading ambiguity even if the compiler can
figure it out:

    <(lots + identifiers * operators > more + identifiers and operators)>

Hard to tell if the > is closing or an operator, especially if the expression is
long enough that it has to be written on multiple lines. And compare that to:

    <(lots + identifiers * operators >) + more + identifiers and operators > 1

which is the same problem with the other result.

> But I could certainly live with "{...}" instead.

Me too, so let's stay with that for now.

****************************************************************

From: Brad Moore
Sent: Saturday, January 6, 2018  2:13 AM

> I find all the syntax examples so far rather unintuitive. Foremost,
> when I see one of these {..}, am I necessarily looking at a reduction?
> Or am I looking at a generator? Why this attempt to mix the two
> notions?

I strongly dislike introducing curly brace for this one purpose in Ada. It feels
too foreign to Ada, and too complex requiring two sets of braces of different
kinds in the same construct. That is what drove me to find a different solution
with the 'Reduce attribute.

> is "(3 + {for i in 1..10 ..})" intuitively a reduction with 3 as an
> initial value? Not to me, someone had to decipher the riddle for me
> explicitly.

How about?

(1..10)'Reduce(3 + <>);

To me, this appears more intuitive that we are reducing the range of values from
1 to 10. We have a descriptively named attribute that you could look up in the
RM, if you wonder how that attribute works, but the attribute syntax suggests to
me we are applying the reduction operation to the values on the left using the
formula on the right.

As Ed mentioned, the part on the left is the generator, but if you've already
got an array, there's no need to generate anything, and you can replace the left
side with the name of an array.

The idea of the 'Reduce attribute is that it is always just reducing an array of
values.

If you need to modify the values before you reduce them, then you could expand
the generator syntax to full array aggregate syntax to name the values and
modify them.

e.g.

-- Sum of squares
(for I in 1 .. 10 => I**2)'Reduce(0 + <>)

I'm not sure if we'd want to support the range shorthand, but it seems silly and
confusing to name loop variables if they are not used, which helps readability I
think when the shorthand can be used.

> "Integer*Min(5, {for I in 1..10 => I})" I am not sure I even understood.
> Is this a strange way of spelling "1"? I presume that the I in the
> iteration range of the example is a typo, no? If not, what does it
> mean?

The * between Integer and Min is a typo. It should be ' , since 'Min is the
attribute. The intent was to use array aggregate syntax to generate an array, as
you can currently do in Ada.

e.g.

-- Array Aggregate syntax example that is already in the RM
X : constant Integer_Array := (for I in 1 .. 10 => I);

I much prefer the following though than the curly brace syntax. See my previous
email for more examples and explanation.

(1 .. 10)'Reduce(Integer'Min(5, <>))

>> Sales_Price : constant Integer := Integer'Min(5, (for I in I .. 10 =>
>> I));
>
> Is "(3, {for i in 1..10 ..})" something meaningful? I would read this
> as a vector of 10 pairs. Or maybe it is a vector of 11 entries,
> starting with a 3?

The intent was that that wouldn't have been meaningful, as there is no combiner
function associated with the 3.

I see the confusion though, which hopefully is better remedied by the latest
syntax idea.

If you still find it confusing even with the 'Reduce attribute, then you at
least have the option of splitting this up to declare an array separately, and
then apply the reduction with just the 'Reduce attribute.

You don't have that option with the curly brace syntax.

Bids : constant Price_Array := (for C of Customers => C.Bid);

-- Get the best bid, but it has to be better than the minimum acceptable offer
Sales_Price : constant Integer :=
   Bids'Reduce(Integer'Max(<>,
                           Minimum_Acceptable_Offer));


Another advantage of the 'Reduce attribute syntax, is that it provides more
capability with simpler syntax.

With the previous curly brace syntax, to append the alphabet to a string you
could in theory write:

("Alphabet: " & {'A'..'Z'})

But if we want to instead prepend the alphabet to a string, we would need the
syntax to support having the combiner function and initial value on either side
of the generator, which is more complex syntax, and more complex to parse and
understand.

({'A'..'Z'} & "<- Alphabet")

With the Reduce attribute, you get the same capability, in a more uniform way.

('A'..'Z')'Reduce("Alphabet: " & <>)
vs.
('A'..'Z')'Reduce(<> & "Alphabet: ")

Another point to note is that though parens are not nested with 'Reduce
attribute syntax. The curly brace syntax you have braces nested inside of
parens. I was noticing that the nested parens can become difficult to read in
more complicated reduction expressions.

Lastly with the curly brace syntax, you always need to write a generator. With
the Reduce attribute, you dont need to write a generator if you already have an
array.

****************************************************************

From: Erhard Ploedereder
Sent: Saturday, January 6, 2018  8:44 AM

>> "Integer*Min(5, {for I in 1..10 => I})" I am not sure I even understood.
>> Is this a strange way of spelling "1"? I presume that the I in the
>> iteration range of the example is a typo, no? If not, what does it
>> mean?
>> Sales_Price : constant Integer := Integer'Min(5, (for I in I .. 10 =>
>> I));

> The * between Integer and Min is a typo. It should be ' , since 'Min is the
> attribute.

Sorry, my typo. I understood that part. But you did not answer my question(s).
What is the outcome? and was the I in the range intentional and I simply did not
understand the idea?

> The intent was to use array aggregate syntax to generate an array, as
> you can currently do in Ada.

Where does an array play into Integer'Min and if so, what role does the 5 play?
To me, Min is typically a binary function that chooses among two values. By your
analogy, 'Min would have a signature of (Integer, Array of Integer); that looks
very strange.

****************************************************************

From: Edmond Schonberg
Sent: Saturday, January 6, 2018  10:40 AM

I think that the introduction of an attribute 'Reduce is so far the clearest
choice: it makes the syntax clear without the need for a new keyword, and it
avoid the ambiguities /  obscuritiies that Randy has found in previous
proposals. Using an attribute also allows for a compact notation when applied to
a composite that supports iteration:   V'Reduce ("+", 0)  is clear and compact
(almost like +/V !).  But the prefix can be any iterated_component association :

   (for J in V'range => V (J) ** 2)'Reduce (F, V0)

       which can also be written as:

   (for E of V => E ** 2)'Reduce (F. 0).

The first argument of the attribute must be s combiner function or operator,
i.e. a binary operation with operands and result of the same type.  Being able
to have an expression on the prefix that yields an itterable construct is also a
natural generalization.  Randy, is this any more palatable?

****************************************************************

From: Brad Moore
Sent: Saturday, January 6, 2018  11:45 AM

>>> "Integer*Min(5, {for I in 1..10 => I})" I am not sure I even understood.
>>> Is this a strange way of spelling "1"? I presume that the I in the
>>> iteration range of the example is a typo, no? If not, what does it
>>> mean?
>>> Sales_Price : constant Integer := Integer'Min(5, (for I in I .. 10
>>> => I));
>
>> The * between Integer and Min is a typo. It should be ' , since 'Min
>> is the attribute.
>
> Sorry, my typo. I understood that part. But you did not answer my
> question(s). What is the outcome? and was the I in the range
> intentional and I simply did not understand the idea?

The idea was that Integer'Min was to be treated as the combiner function, and
the curly braces indicated the values to be fed into the combiner function as
the second parameter.

As a combiner function, the other parameter shows the initial value (5) to be
passed into the first parameter, but for subsequent iterations, the result of
the previous calls to Integer'Min are passed back into Integer'Min for the first
parameter instead. The more I look at that example, the more I see your
confusion, and the more I dislike that whole proposal. Its way too subtle and
non-intuitive.

Integer'Min(5, {for I in 1 .. 10 => I});

Compare to;

(for I in 1 .. 10 => I)'Reduce(Integer'Min, 5)

As per Ed's latest suggested tweaks, which to me make it even clearer and
simpler. I agree with Ed and I think the 'Reduce attribute idea is a clear
winner over anything we've discussed before.

I will take a stab at updating the AI to reflect this latest idea.

****************************************************************

From: Tucker Taft
Sent: Saturday, January 6, 2018  12:57 PM

I like Ed's refinement of Brad's 'Reduce proposal as well.

****************************************************************

From: Brad Moore
Sent: Saturday, January 6, 2018  2:37 PM

> The first argument of the attribute must be s combiner function or
> operator, i.e. a binary operation with operands and result of the same
> type.

Can we relax this restriction a bit and say that that its a binary operation,
and the type of the 2nd argument of the attribute must match at least one of the
arguments of the binary operation as well as the result type of the binary
operation. The type of the values being reduced must match the type of the other
parameter of the binary operation.

This would then allow reductions such as concatenation where the result type can
be an array type, but the elements being concatentated are the elements of the
array.

e.g.

(for Letter in 'A' .. 'Z' => Letter)'Reduce("&", "")

****************************************************************

From: Tucker Taft
Sent: Saturday, January 6, 2018  4:17 PM

Agreed, we should support the case where we have Vector op Component => Vector.

****************************************************************

From: Erhard Ploederder
Sent: Saturday, January 6, 2018  4:08 PM

>    (for J in V’range => V (J) ** 2)’Reduce (F, V0)
>
>        which can also be written as:
>
>    (for E of V => E ** 2)’Reduce (F. 0).
>
> The first argument of the attribute must be s combiner function or
> operator, i.e. a binary operation with operands and result
> of the same type.  Being able to have an expression on the prefix that
> yields an itterable construct is also
> a natural generalization.  Randy, is this any more palatable?


I find this model and syntax a lot better.

Semantic question: In the case of
 (for E of V => g(E))'Reduce(f, 0)
where g has a side effect, what will the semantics say about the guaranteed
sequence of g and f calls?

1) all g calls before all f calls ?  Then you lost all ability to stream the
   values to and through the Reduce.

2) g, f, g, f, ... ?

3) (almost) anything goes

?
------------------------

1) is the semantics that you undoubtedly would be guaranteed to get if the
   prefix is to be treated as a normal array aggregate. (=> need for a different
   syntax, if you do not want that - and I sure do not want this canonical
   semantics)

3) is necessary if parallelization of g calls is supposed to play here.

2) maybe as a selectable option? It clearly makes implemention easier, since no
   buffer for results of g calls is needed, as is in case 3) and it still avoids
   storing the intermediate array.

--------------------

****************************************************************

From: Edmond Schonberg
Sent: Saturday, January 6, 2018  4:40 PM

Given the interest in parallizability, I think the choice must be 3),  i.e.
unspecified order. If this is to be parallelizable, g better be side-effect free
anyway.

****************************************************************

From: Edmond Schonberg
Sent: Saturday, January 6, 2018  4:47 PM

> Agreed, we should support the case where we have Vector op Component =>
> Vector.

I still think that this is an abuse of notation. It is amusing that it may fall
out of the proposed semantics of reduction, but a construct that builds an array
from scalar components does not look like its reducing anything:-)! We have
plenty of constructs for aggregates already, hiding this one in this new
notation is redundant (not to mention the awkward implementation it requires).
So I’d rather retain the consistent type requirement. (not a deal breaker
obviously, but also simpler to describe).

****************************************************************

From: Brad Moore
Sent: Saturday, January 6, 2018  5:05 PM

A good point.

Another advantage of this restriction would be that we would not need the
'Reducer aspect for parallelism, since the combiner function would then always a
reducer function. (i.e. The function used to combine results from independent
tasklets).

We would still need the Identity aspect, but it would simplify the parallelism
part of this AI, and reduce the number of aspects needed to be placed on various
library calls in the RM.

****************************************************************

From: Brad Moore
Sent: Saturday, January 6, 2018  8:44 PM

I have a few questions.

1) I presume we want 'Reduce to work on multidimensional arrays?

   Example from RM:

  G : constant Matrix :=
    (for I in 1 .. 4 =>
       (for J in 1 .. 4 =>
          (if I=J then 1.0 else 0.0))); -- Identity matrix

  Sum : constant Float := G'Reduce("+", 0.0);

  Reason, seems like a useful thing to be able to do, and easy enough to understand and implement.


2) I presume we are probably *not* interested in defining a version of 'Reduce
   that operates on a single dimension of a multidimentional array?

  e.g.
  Sum_1 : constant Float := G'Reduce(1, "+", 0.0);

  Where 1 represents the first dimension, similar to 'Length(1)

  Reason, while this could be implemented probably about as easy at the other
  version of 'Reduce, I suspect there might not be enough need for this to
  justify. We could always add this later, if needed.


3) Its not clear to me whether we'd want to be able to apply 'Reduce to
   aggregate expressions. I don't see a reason not to, and I suspect it would be
   allowed based on Ed's outline of the rules.

That is, I would think one would be able to write something like?

  Sum : constant Float :=
     Matrix'(for I in 1 .. 4 =>
               (for J in 1 .. 4 =>
                  (if I=J then 1.0 else 0.0)))'Reduce("+", 0.0);

4) And if the answer to 3 is yes, then I presume it would also work for other
   forms of array aggregates, not just ones involving
   iterated_component_associations?

eg

Sum : constant Float := Matrix'(1 => (1.1, 1.2, 1.3), 2 => (2.1, 2.2, 2.3))'Reduce("+", 0.0);

****************************************************************

From: Jean-Pierre Rosen
Sent: Sunday, January 7, 2018  12:42 AM

I have been followin this thread with mixed feelings...

Although I see the benefit of the attribute, I don't like having:
(very_complicated_expression_spanning_several_lines)'Reduce()

I would suggest doing it the other way round:
For every function F with appropriate profile (TBD), there is a function
F'Reduce that operates on arrays of Elems, with the following semantics:
1) if the array is empty, Constraint_Error is raised
2) if the array has one element, the element is returned
3) otherwise, the result is the successive application of F to elements in an
   arbitrary order (needs better definition, it's just to give the idea).

This has the following benefits:
- Easy to read and understand
- No need for Identity
- For parallelism, easy to specify the reduction function as
  an aspect Reduction => F'Reduce

Some examples:

Old:
(for E of V => E ** 2)’Reduce (F. 0)

Proposed:
F'Reduce ((for E of V => E ** 2))

Old:
(for I in 1 .. 10 => I)'Reduce(Integer'Min, 5)

Proposed:
Integer'Min'Reduce ((5, for I in 1..10 => I))


Old:
(for Letter in 'A' .. 'Z' => Letter)'Reduce("&", "")

Proposed:
"&"'Reduce ((for Letter in 'A' .. 'Z' => Letter))

****************************************************************

From: Peter Chapin
Sent: Sunday, January 7, 2018  9:17 AM

Hello!

I'm just an observer here, for now, so I feel some reluctance to jump in with
too many comments. I feel like I'm still gathering background information. ??
That said, I did want to say a few things about the debate regarding reduction
expressions...

I have extensive background with the functional language Scala and I've worked
with OCaml and Haskell as well. I can appreciate that Ada is not, at its core, a
functional language and that making it into one is fraught with potential
difficulties. On the other hand, there are two forces at work that are pushing
Ada to grow more functional features: contracts and now parallelism. Contracts
benefit from powerful expressions, and parallelism benefits from pure
expressions, both things that functional languages are good at. So it makes
sense to me to borrow ideas from the functional world to support these things in
Ada. The trick is doing so in a reasonable (for Ada) way!

In Scala, at least, there is a distinction between "reduce" and "fold." Folds
take an initial value, can be applied to an empty sequence, and use a combiner
function that might have parameters of different types (for example when folding
a sequence of characters into a string). In contrast, reductions can't be
applied (without exception) to an empty sequence and use a combiner function
taking parameters of the same type as the sequence elements. The AI talks about
"reducer functions" and "reduction expressions" which I found confusing at first
since the "reduction expression" is really a fold in the functional sense. I'm
not sure if this matters to anything, but speaking as someone coming from the
functional community I had to read the AI twice before I de-confused myself on
this point.

Regarding syntax... I realize we aren't talking about adding general lambdas to
Ada, but does it make sense to choose a syntax that wouldn't prohibit such an
addition in the future?

I see a pretty similarity between:

(1 .. 10)'Reduce(3 + <>)

and the Scala equivalent:

(1 to 10).fold(3)(_ + _)

Scala allows methods with multiple parameter lists, and for obscure technical
reasons (not one of Scala's better features) related to type inference, the
initial value needs to be put into its own parameter list. So the syntax
logically is trying to be:

(1 to 10).fold(3, _ + _ )

where the weird looking "_ + _" is Scala's highly abbreviated way of writing a
lambda. The underscores are placeholders for the unnamed function arguments,
similar to the <> notation in the proposed 'Reduce syntax. Spelling out the
anonymous combiner function longhand gives:

(1 to 10).fold(3, (x: Int, y: Int) => x + y) )

I guess that's all I have to say for now. I don't have a specific suggestion or
feeling at the moment about what is best for Ada here. I'm still absorbing the
very interesting discussion. I did want to throw in my support for the idea of,
in some fashion, adding folds, er... reduction expressions, to Ada. Certainly
they are widely used in languages that currently have them and they would enrich
Ada's expression sublanguage considerably.

****************************************************************

From: Tucker Taft
Sent: Sunday, January 7, 2018  10:30 AM

...
> I would suggest doing it the other way round:
> For every function F with appropriate profile (TBD), there is a
> function F'Reduce that operates on arrays of Elems, with the following
> semantics:

I believe it is a serious mistake to have to create an array to do a reduction.
There are many cases when you want to do a reduction over a series of values
that are produced in various ways.  Limiting it to arrays means you end up
having to figure out how to create the array, and furthermore, you need to know
in advance how big an array to allocate, or undergo the dynamic allocation
implicit in a growable vector.

Using a generalized iterator (including a filter of some sort) is significantly
more flexible and potentially hugely more time and space efficient.

****************************************************************

From: Erhard Ploedereder
Sent: Sunday, January 7, 2018  12:02 PM

> I believe it is a serious mistake to have to create an array to do a
> reduction.

Indeed. Argument to the Reduction should be a Generator (which, as a very
special case, can be an array iterator).

****************************************************************

From: Brad Moore
Sent: Sunday, January 7, 2018  12:05 PM

> I have been followin this thread with mixed feelings...
>
> Although I see the benefit of the attribute, I don't like having:
> (very_complicated_expression_spanning_several_lines)'Reduce()
>
> I would suggest doing it the other way round:
> For every function F with appropriate profile (TBD), there is a
> function F'Reduce that operates on arrays of Elems, with the following semantics:
> 1) if the array is empty, Constraint_Error is raised

I don't think we want the user to have to worry about handling such constraint
errors. That's why we have the initial_value. You get that result if you are
trying to apply 'Reduce to a null array, or an empty
iterated_component_association. Having to deal with exceptions makes it more
difficult to use with contracts for example.

> 2) if the array has one element, the element is returned
> 3) otherwise, the result is the successive application of F to
> elements in an arbitrary order (needs better definition, it's just to give
> the idea).
>
> This has the following benefits:
> - Easy to read and understand
> - No need for Identity
> - For parallelism, easy to specify the reduction function as an
>   aspect Reduction => F'Reduce

I think we can (and probably should) get rid of the 'Identity aspect with either
proposal. I don't think it is needed. For tasklets where Initial_Value is not
being applied, the initial value can be the value of the first item. If there
are more than one item per tasklet, the Reducer function can be applied to
combine the first and subsequent results.

Another criticism of this idea is that it would significantly expand our concept
of what an attribute is.

With the previous 'Reduce, the arguments of the attribute translate to arguments
of a function call, which we have already with other attributes in the language.

With your suggestion, the contents of this attribute are an expression, which
needs to be parsed, which is something we don't have yet in the language.

> Some examples:
>
> Old:
> (for E of V => E ** 2)’Reduce (F. 0)
>
> Proposed:
> F'Reduce ((for E of V => E ** 2))
>
> Old:
> (for I in 1 .. 10 => I)'Reduce(Integer'Min, 5)
>
> Proposed:
> Integer'Min'Reduce ((5, for I in 1..10 => I))

I also find the expression (5, for I in 1..10 => I) strange. It doesn't seem to
mean anything by itself.

Only in the context of outside its enclosing parens as part of one particular
attribute.

With the 'old' syntax, there are no meaningless expressions.

Otherwise your suggestion appears to be mainly a cosmetic one, of just moving
things around. I can see your reasoning for doing so, but I don't think can be
justified given these criticisms.

****************************************************************

From: Jean-Pierre Rosen
Sent: Sunday, January 7, 2018  12:20 PM

> I believe it is a serious mistake to have to create an array to do a
> reduction.  There are many cases when you want to do a reduction over
> a series of values that are produced in various ways.  Limiting it to
> arrays means you end up having to figure out how to create the array,
> and furthermore, you need to know in advance how big an array to
> allocate, or undergo the dynamic allocation implicit in a growable
> vector.
>
> Using a generalized iterator (including a filter of some sort) is
> significantly more flexible and potentially hugely more time and space
> efficient.

Indeed. This proposal does not intend to really create an array. It's 100% "as
if".

I was not explicit about this, but since F'Reduce would be 100% in the
compiler's realm, it could be defined as having convention => intrinsic and
aspect Inline. Since it deals only with elements, it could be considered an
optimization that F is simply called in line on all the elements without needing
to construct an actual array.

OTOH, a compiler would be /allowed/ to construct an array if it is more
convenient or efficient in some cases.

I think it would be easier to understand and explain with this model.

****************************************************************

From: Tucker Taft
Sent: Sunday, January 7, 2018  1:00 PM

Your proposed syntax does not feel like an improvement from a readability point
of view, and seems to muddy the waters significantly semantically, in my view.

Personally, I prefer the order where the iterator comes first, so you know what
you are talking about, and then the reduction operation.  There are various
syntaxes that have been proposed that have that ordering.  I happen to still
prefer the one that ParaSail uses, but the one with:

   (iterator)'Reduce(func, params)

is second best, in my view.

I think we should not eliminate the 'Identity aspect, and in fact might want
additional aspects such as 'Commutative and 'Associative, but the basic
semantics of the reduction expression should not depend on having any of these
aspects specified.  Specifying them can provide more ways to optimize things,
but without them, I would argue the semantics should be very simple, and not
require the operation to be commutative, and perhaps not even associative
(though that is a bit more stringent).

The semantics of "(iterator)'Reduce(func, initial_val) would be simply:

    tmp := initial_val;
    while more in (iterator) loop
        tmp := func (tmp, <next-val-of-(iterator)>);
    end loop;
    return tmp;

Parallelization would be optional, and could use properties of "func" as defined
by various aspects (e.g. Identity, Reducer, Commutative, Associative, etc.), or
built-in knowledge in case of language-defined operators, to achieve maximum
speedup.

Overload resolution for "func" would expect a function with a profile of

    function func (Left : <initial_val'Type>; Right : <iterator-element'Type>) return <initial_val'Type>;

For the first version of this, we could require that <initial_val'Type> and
<iterator-element'Type> be the same, though that doesn't seem necessary, and
interferes with operations that are, for example, taking individual elements and
building up a set.  I agree it makes parallelization easier, if you presume func
has two operands of same type, and is associative, though having an aspect that
identifies a "Reducer" function that has those properties can provide the same
advantage.

You might also want to allow a second version of this, that could perhaps use
the identical syntax, and be chosen when the "Reduce" operation parameter is
actually a procedure rather than a function:

    (iterator)'Reduce(proc, initial_val)

would expand into

    tmp := initial_val
    while more in (iterator) loop
         proc(tmp, <next-val-of-(iterator)>);
    end loop;
    return tmp;

This could be significantly more efficient when the reduction is producing a
more complex data structure, such as a hashed set, or a large matrix.

****************************************************************

From: Brad Moore
Sent: Sunday, January 7, 2018  1:57 PM

> I think we should not eliminate the 'Identity aspect, and in fact
> might want additional aspects such as 'Commutative and 'Associative,
> but the basic semantics of the reduction expression should not depend
> on having any of these aspects specified.  Specifying them can provide
> more ways to optimize things, but without them, I would argue the
> semantics should be very simple, and not require the operation to be
> commutative, and perhaps not even associative (though that is a bit more
> stringent).

I was actually thinking of writing the AI to replace the Identity aspect with an
Associative Boolean aspect. That seems more useful, since parallelism is not
possible if a reducer function is not associative.

If the reducer function does not have the associative aspect, then'Reduce
attribute would normally apply sequential processing. On the other hand, if the
reduction function does have it, then parallel execution can potentially be
applied by the implementation.

A problem with the Identity aspect is that it is difficult to apply everywhere.

For example, the language defined bitwise "and" of;

type Fingers is mod 5;

is hard to define. You could say it is Fingers'Last, but then "and"ing that with
3, wouldn't work. You could say it is 7, but then the value is outside the range
of values for the type, and its use could trigger constraint errors.

Using the logic that Jean-Pierre proposed, you don't have to define this, and
the 'Reduce attribute works.

I think the Associative aspect is a better replacement for 'Identity, because it
can easily defined on all subprograms that need it. (And its simpler, being a
boolean aspect)

I think we should not include the Identity aspect in this proposal unless
someone can show a valid need for it. I would want to see some sort of
explanation on how it can improve parallelism or understanding.

As it stands, it just appears to add complexity to the model in an inconsistent
manner without any real benefit. We can always add it later, if we can find a
valid use for it.

The associative aspect already seems to have validity. It can be used to prevent
errors, and help in the decision of whether to use parallelism or not.

A communitative aspect might be useful for future consideration, but my
experience with Paraffin is that having the implementation assume all reductions
are non-commutative does not add noticeable overhead to parallelism. So for now,
I would recommend we leave the Commutative aspect out of the AI.

> Overload resolution for "func" would expect a function with a profile
> of
>
>    function func (Left : <initial_val'Type>; Right : <iterator-element'Type>)
>    return <initial_val'Type>;
>
> For the first version of this, we could require that
> <initial_val'Type> and <iterator-element'Type> be the same, though
> that doesn't seem necessary, and interferes with operations that are,
> for example, taking individual elements and building up a set.  I
> agree it makes parallelization easier, if you presume func has two
> operands of same type, and is associative, though having an aspect
> that identifies a "Reducer" function that has those properties can provide the same advantage.

By first version, I assume you mean first version of this AI/feature.

I agree we can always consider later whether to add a 'Reducer aspect and
support reducer functions with differing types of parameters, based on feedback
from users. I think for now assuming all parameters are of the same type
probably covers a majority of the uses.

> You might also want to allow a second version of this, that could
> perhaps use the identical syntax, and be chosen when the "Reduce"
> operation parameter is actually a procedure rather than a function:
>
>    (iterator)'Reduce(proc, initial_val)
>
> would expand into
>
>    tmp := initial_val
>    while more in (iterator) loop
>         proc(tmp, <next-val-of-(iterator)>);
>    end loop;
>    return tmp;

I agree allowing a procedure for a reducer is useful, and probably worth
considering for this version of the AI. We could say that it can be a procedure
that accepts two parameters of the same type, where the first parameter is an in
out parameter.

****************************************************************

From: Tucker Taft
Sent: Sunday, January 7, 2018  3:24 PM

...
> I was actually thinking of writing the AI to replace the Identity
> aspect with an Associative Boolean aspect. That seems more useful,
> since parallelism is not possible if a reducer function is not associative.

Both are relevant.  It is not an either/or.

> If the reducer function does not have the associative aspect,
> then'Reduce attribute would normally apply sequential processing. On
> the other hand, if the reduction function does have it, then parallel
> execution can potentially be applied by the implementation.
>
> A problem with the Identity aspect is that it is difficult to apply
> everywhere.

That's OK.  If there is no well-defined identity, then so be it.

> For example, the language defined bitwise "and" of;
>
> type Fingers is mod 5;
>
> is hard to define. You could say it is Fingers'Last, but then "and"ing
> that with 3, wouldn't work. You could say it is 7, but then the value
> is outside the range of values for the type, and its use could trigger
> constraint errors.

Right.  These non-power-of-2 modular types are weird in various ways, and
certainly don't have well-defined bit-wise boolean operators.  So these
operators wouldn't have a well-defined Identity aspect, which is fine, since
no-one in their right mind would use bit-wise boolean operators to reduce a
bunch of non-power-of-2 modular values.  But clearly addition and multiplication
have well-defined Identity aspects, even for such oddball modular types.  No
need to throw out the baby with the non-power-of-2 modular type bathwater!

> Using the logic that Jean-Pierre proposed, you don't have to define
> this, and the 'Reduce attribute works.
>
> I think the Associative aspect is a better replacement for 'Identity,
> because it can easily defined on all subprograms that need it. (And
> its simpler, being a boolean aspect)

There might still be reasons to define an Identity aspect, as it might be useful
for some purposes.  I don't see this as one "replaces" the other.  They should
all be optional, and only used for certain optimizations.

> I think we should not include the Identity aspect in this proposal
> unless someone can show a valid need for it. I would want to see some
> sort of explanation on how it can improve parallelism or understanding.

Well, "normal" reductions use identity quite a bit, but it seems fine to omit
it.  In general, I would try to separate this proposal into what is needed for
basic functionality, and what is useful for optimizations and parallelization.
The aspects should only be needed for optimizations, and never be required for
basic functionality.  And if we cannot agree on the aspects, or want to leave
them for a future AI, that seems fine.

>> ...
>> You might also want to allow a second version of this, that could
>> perhaps use the identical syntax, and be chosen when the "Reduce"
>> operation parameter is actually a procedure rather than a function:
>>
>>   (iterator)'Reduce(proc, initial_val)
>>
>> would expand into
>>
>>   tmp := initial_val
>>   while more in (iterator) loop
>>        proc(tmp, <next-val-of-(iterator)>);
>>   end loop;
>>   return tmp;
>
> I agree allowing a procedure for a reducer is useful, and probably
> worth considering for this version of the AI. We could say that it can
> be a procedure that accepts two parameters of the same type, where the first
> parameter is an in out parameter.

Right.  Though again, we could allow the second parameter to be of a different
type.

****************************************************************

From: Tucker Taft
Sent: Monday, January 8, 2018  3:04 PM

...
> Regarding syntax... I realize we aren't talking about adding general
> lambdas to Ada, but does it make sense to choose a syntax that wouldn't
> prohibit such an addition in the future?

There is a separate AI on lambdas.  The latest syntax proposal for reduction
expressions (at least last time I checked!) is compatible with using a lambda
expression for the function name.  E.g.

    (for E of V => E**2)'Reduce("+", 0)  -- sum of squares

could have a lambda expression in place of the "+":

    (for E of V => E**2)'Reduce(<lambda-expression>, 0)

The proposed syntax for lambda expressions is currently defined in:

   http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0190-1.txt?rev=1.5

and uses a syntax like:

   (function (X, Y) return X + Y)

So the above example could be written as:

   (for E of V => E**2)'Reduce( (function (X, Y) return X+Y) , 0)

Obviously not a great example, but you could imagine more interesting functions.

As far as terminology, "fold" and "reduce" are clearly very similar in
semantics. The annoying thing about "fold" is that there are so many different
versions of it (foldl, foldl1, foldr, ...).  So we have chosen to use the term
"reduction" even though in some cases it might be closer to a "fold."

****************************************************************

From: Peter Chapin
Sent: Monday, January 8, 2018  3:24 PM

Thanks! I'll take a look at that lambdas AI. As for the terminology, I don't
think it's a big deal. We have to call it something and "fold" is really
somewhat of an odd term in many ways. "Reduce" has the advantage of sounding
more like what is going on.

****************************************************************

From: Brad Moore
Sent: Monday, January 8, 2018  7:17 PM

In private email, Jean-Pierre was questioning the need for an initial value as
an argument to the 'Reduce attribute, as it could be considered an error for
attempting to reduce 0 items.

This is something I've struggled with as well.

This got me to thinking that there probably are times when an exception is
wanted, and times when it is not.

For example,

-- Determine the maximum number of widgets that can be found at a particular
-- store brand near me. If the answer is 0, because there are no such stores,
-- or 0 because the stores that do exist are out of stock, I dont care, the
-- answer is still 0.
Max_Widgets : constant Integer :=
   (for Location in Stores when Location.City = Calgary)'Reduce(Integer'Max, 0);

-- On the other hand, there may not always be a clear value to use. Is it clear
-- that a result of Temperature'Min means that no cities were found, or that
-- is an actual value for the coldest settlement in Antartica?
Max_Temp : constant Temperature :=
   (for City of World
       when City.Continent = Antartica => City.Max_Temperature)'Reduce(Temperature'Max, Temperature'Min);

A nice feature of the 'Reduce attribute concept, is that we can have more than
one version of the attribute.

We could also have a version that only accepts one argument, the reducer
subprogram, and it raises constraint error if there are no items to be reduced.

****************************************************************

From: Tucker Taft
Sent: Monday, January 8, 2018  7:41 PM

> In private email, Jean-Pierre was questioning the need for an initial
> value as an argument to the 'Reduce attribute, as it could be considered an
> error for attempting to reduce 0 items.
> ...
> A nice feature of the 'Reduce attribute concept, is that we can have more than
> one version of the attribute.
>
> We could also have a version that only accepts one argument, the
> reducer subprogram, and it raises constraint error if there are no items to be
> reduced.

I suppose, but I think you are over-thinking this.  It is very rare that you
really *want* an exception.  Let's keep this simple.  Yes there are a million
options imaginable, but to me that is where the foldl, foldl1, foldr, foldr1
madness arose.

A simple pragma Assert can always be inserted if you really want an exception
under certain conditions.

****************************************************************

From: Randy Brukardt
Sent: Monday, January 8, 2018  9:28 PM

> I believe it is a serious mistake to have to create an array to do a
> reduction.  There are many cases when you want to do a reduction over
> a series of values that are produced in various ways.  Limiting it to
> arrays means you end up having to figure out how to create the array,
> and furthermore, you need to know in advance how big an array to
> allocate, or undergo the dynamic allocation implicit in a growable
> vector.

I don't think anyone was suggesting that the array would have to actually
be materialized. But the prefix of this attribute seems like it has to be
an array aggregate, as it has the syntax of an array aggregate. Saying that
the semantics of some construct is determined by 7 characters at the very
end -- many lines after the start -- seems like the wrong way to go.

So, my view was that the prefix was an object of some appropriate type, and
there would be a rule that the order of evaluation of the parts of the prefix
and the 'Reduce itself are unspecified (probably would take a notwithstanding
rule). I suppose that means that the attribute would have to be able to handle
an existing array object, but that doesn't seem like a problem.

> Using a generalized iterator (including a filter of some
> sort) is significantly more flexible and potentially hugely more time
> and space efficient.

Since container aggregates need to be able to contain generalized iterators, I
don't see a difference here. Certainly, as-if optimizations are always allowed,
and we'll have to make the order of evaluation unspecified if any parallelism is
going to be applied.

---

One note here is that an aggregate standing by itself isn't currently a legal
attribute prefix. But a qualified aggregate is legal, so I don't think there
would be any significant issue with allowing an aggregate as a prefix. Probably
would want to allow any expression so long as it was in parens. (Note that we
just rejected this idea in terms of Obj'Image, but I guess we're allowed to
change our mind.)

****************************************************************

From: Randy Brukardt
Sent: Monday, January 8, 2018  10:34 PM

...
> The semantics of "(iterator)'Reduce(func, initial_val) would be
> simply:

There's a stray quote above that confused me for a while.

>     tmp := initial_val;
>     while more in (iterator) loop
>         tmp := func (tmp, <next-val-of-(iterator)>);
>     end loop;
>     return tmp;

Or more conventionally:

    declare
       tmp : <reducer type> := initial_val;
    begin
       for <id> <rest of iterator> loop
           tmp := func (tmp, <id>);
       end loop;
       Result := tmp; -- Or "return tmp" if this is a function.
    end;

(Here, Tucker's "(iterator)" is expanded a bit to be clearer.)

And this illustrates the problem I have with this whole idea. My goal for the
Ada 2020 is to have safe-by-construction parallelism features that can be added
to improve the performance of *existing* Ada code.

Thus, my hope would be that one could add parallelism to a loop simply by adding
"parallel" to the iterator specification (with possibly a small amount of change
to that specification). That assumes, of course, that the operations in the body
can be operated in parallel.

The problem is, of course, that that isn't quite possible for loops that involve
some sort of result. Just adding "parallel" to the  ought to get a error because
the use of "tmp" is not task-safe.

The solution to that problem that we're proposing is to completely abandon
conventional loops and switch to a functional notation. That's going to be hard
to do for existing code; discovering the reduction function and variable is
going to be hard in real loops (which are a lot more complex than the above, and
often have a number of variables to construct a result). It requires turning the
entire loop inside-out (and often creating helper types and functions, if there
is more than a single result object).

New code won't be much better; users will be required to think functionally
about reductions, but Ada is nowhere near a complete functional language (and it
would be impractical to allow things like general closures, so that isn't likely
to change), so one will then have to switch mental gears halfway through. It
sounds like the worst of both worlds to me.

At one point (Pittsburgh), we had a reasonable but unfinished solution that
looked a whole lot more like standard Ada. My recollection was that the main
problem was that the accumulator objects could be used as arrays, but I don't
know why we would make that property visible. (No one needs to access these
explicitly as arrays).

That would give something like:

    declare
       tmp : parallel <reducer type> := initial_val;
    begin
       parallel for <id> <rest of iterator> loop
           tmp := func (tmp, <id>);
       end loop;
       Result := tmp; -- Or "return tmp" if this is a function.
    end;

The idea being that "tmp" acts like a set of objects, one per tasklet, inside of
a parallel construct, but as a single object outside of such constructs. To get
from the set to the single object, the reducer/identity for the reducer type
would be used by default, or one could specify them with aspects as needed.

This looks a lot more like conventional Ada code, and certainly would be easier
to apply to existing loops than a reduction expression. And much less likely to
require frequent gear-switching in new code.

****************************************************************

From: Jean-Pierre Rosen
Sent: Monday, January 8, 2018  11:42 PM

> In private email, Jean-Pierre was questioning the need for an initial
> value as an argument to the 'Reduce attribute, as it could be considered
> an error for attempting to reduce 0 items.

Oops, I didn't want to make it private. Thunderbird changed its UI recently, and
I keep hitting "Reply" instead of "Reply to all"...

For the record, here it is:
-------------------------------------------
Le 07/01/2018 à 19:04, Brad Moore a écrit :
> I don't think we want the user to have to worry about handling such
> constraint errors. That's why we have the initial_value. You get that
> result if you are trying to apply 'Reduce to a null array, or an empty
> iterated_component_association. Having to deal with exceptions makes
> it more difficult to use with contracts for example.

But isn't reducing 0 values an error? The maximum of 0 elements is certainly not
Integer'First!

[...]
> I think we can (and probably should) get rid of the 'Identity aspect
> with either proposal. I don't think it is needed. For tasklets where
> Initial_Value is not being applied, the initial value can be the value
> of the first item. If there are more than one item per tasklet, the
> Reducer function can be applied to combine the first and subsequent
> results.

Agreed, that's what I propose.

****************************************************************

From: Brad Moore
Sent: Thursday, January 11, 2018  1:56 AM

> ...
>> The semantics of "(iterator)'Reduce(func, initial_val) would be
>> simply:
>
> There's a stray quote above that confused me for a while.
>
>>     tmp := initial_val;
>>     while more in (iterator) loop
>>         tmp := func (tmp, <next-val-of-(iterator)>);
>>     end loop;
>>     return tmp;
>
> Or more conventionally:
>
>    declare
>       tmp : <reducer type> := initial_val;
>    begin
>       for <id> <rest of iterator> loop
>           tmp := func (tmp, <id>);
>       end loop;
>       Result := tmp; -- Or "return tmp" if this is a function.
>    end;
>
> (Here, Tucker's "(iterator)" is expanded a bit to be clearer.)
>
> And this illustrates the problem I have with this whole idea. My goal
> for the Ada 2020 is to have safe-by-construction parallelism features
> that can be added to improve the performance of *existing* Ada code.
>
> Thus, my hope would be that one could add parallelism to a loop simply
> by adding "parallel" to the iterator specification (with possibly a
> small amount of change to that specification). That assumes, of
> course, that the operations in the body can be operated in parallel.
>
> The problem is, of course, that that isn't quite possible for loops
> that involve some sort of result. Just adding "parallel" to the  ought
> to get a error because the use of "tmp" is not task-safe.
>
> The solution to that problem that we're proposing is to completely
> abandon conventional loops and switch to a functional notation. That's
> going to be hard to do for existing code; discovering the reduction
> function and variable is going to be hard in real loops (which are a
> lot more complex than the above, and often have a number of variables to
> construct a result). It requires turning the entire loop inside-out (and
> often creating helper types and functions, if there is more than a single
> result object).

I really think in some cases users wont mind deleting an existing loop such as;

declare
   Sum : Integer := 0;
begin

   for I in 1 .. 1000 loop
      Sum := Sum + A(I);
   end loop

   Put_Line ("Sum of A=" & Integer'Image(Sum); end

and replacing with;

Put_Line ("Sum of A=" & Integer'Image(A'Reduce("+", 0)));

> New code won't be much better; users will be required to think
> functionally about reductions, but Ada is nowhere near a complete
> functional language (and it would be impractical to allow things like
> general closures, so that isn't likely to change), so one will then
> have to switch mental gears halfway through. It sounds like the worst of
> both worlds to me.
>
> At one point (Pittsburgh), we had a reasonable but unfinished solution
> that looked a whole lot more like standard Ada. My recollection was
> that the main problem was that the accumulator objects could be used
> as arrays, but I don't know why we would make that property visible.
> (No one needs to access these explicitly as arrays).
>
> That would give something like:
>
>    declare
>       tmp : parallel <reducer type> := initial_val;
>    begin
>       parallel for <id> <rest of iterator> loop
>           tmp := func (tmp, <id>);
>       end loop;
>       Result := tmp; -- Or "return tmp" if this is a function.
>    end;
>
> The idea being that "tmp" acts like a set of objects, one per tasklet,
> inside of a parallel construct, but as a single object outside of such
> constructs. To get from the set to the single object, the
> reducer/identity for the reducer type would be used by default, or one
> could specify them with aspects as needed.
>
> This looks a lot more like conventional Ada code, and certainly would
> be easier to apply to existing loops than a reduction expression. And
> much less likely to require frequent gear-switching in new code.

I can see that having both 'Reduce and parallel loop with reduction capabilities
would be appreciated by users.

I think there will be times when one or the other is a better choice for a
particular usage, and a 'Reduce attribute I believe will also have utility
beyond just parallelism.

What I am finding is that both forms need common support for parallelism.

I think both need subprograms to be annotated with an Associated aspect, as well
as the Reducer aspect that we had in earlier versions of the AI.

What I'd like to do, is carve out another AI from AI12-00119 that just has
wording for describing tasklets, executors, the Associated aspect, and Reducer
aspect. This would also have the changes to the containers and libraries to have
these attributes added.


It looks to me like this will be the most work to nail down in the RM. I think
the Associated aspect can borrow much of the wording that we currently have for
the NonBlocking aspect, except change occurrences of NonBlocking for Associated.

Then the Parallel Operations AI can be a lot better focused. The 'Reduce AI I
think will also be quite a lot simpler to write up, as both can reference the
tasklet AI.

Sound OK?

By having these in separate AI's we can discuss the pros and cons of each
feature based on their own merits, without having to look at the common support
as much, since that is needed for either solution anyways.

And yes, Randy, I will be sure to take your latest version of AI12-00119 as a
starting point. Thanks for your input.


On an aside, I did an experiment today where I wrote a parallel loop that
calculates 3 results, a sum of a set of values, the min of a set of values, and
the max of a set of values.

I was curious to see if writing this as three separate loops would be better or
worse than creating a composite object and writing a reducer function that can
combine results of this record type.

I found that writing as 3 separate loops was significantly faster than 1 loop.

For a loop that involves 4_000_000_000 iterations, the 3 loop solution took
11.17 seconds. The single loop took 18.29 seconds. Why bring this up?

The 'Reduce approach encourages users to write 3 separate 'Reduce expressions,
whereas the parallel loop approach encourages users to write a single loop with
3 results, which I found is easiest to implement in Paraffin if the 3 results
are combined into a single result type.

So it may be that using 'Reduce can result in significantly better performance
results than a parallel loop approach, for certain problems at least.

****************************************************************

From: Randy Brukardt
Sent: Thursday, January 11, 2018  3:23 AM

...
> I really think in some cases users wont mind deleting an existing loop
> such as;
>
> declare
>    Sum : Integer := 0;
> begin
>
>    for I in 1 .. 1000 loop
>       Sum := Sum + A(I);
>    end loop
>
>    Put_Line ("Sum of A=" & Integer'Image(Sum); end
>
> and replacing with;
>
> Put_Line ("Sum of A=" & Integer'Image(A'Reduce("+", 0)));

Actually, they probably will, because the latter might very well run 10 times
slower. (It certainly will if the compiler tries to parallelize it, since
setup/teardown of a tasklet probably will take a lot more than 250 integer adds;
it might even if left sequential, as compilers have been optimizing the former
for 40 years and probably will generate only 4 or 5 x86 machine instructions for
the former -- but the patterns are likely to be very specific to the existing
format and won't work on the slightly different organization of a reduction.)

Indeed, some hardware has instructions specifically for loops like this. It
might end up running in microcode and there's no way that any other feature
could beat that.

Besides, real loops that need to run faster don't look like this. They look
like:

	-- Now, calculate the various running totals:
	for I in 1 .. Actual_Len loop
	    if Sample_Data(I).Check_Item then
		-- Note: We used to calculate Sample_Count,
		-- Actual_Sum, and Actual_Sqr_Sum here, but we've
		-- removed these as this is likely the innermost loop
		-- for the program.
		if Sample_Data(I).Score <= 0.0 then
		    Negative_Count := Negative_Count + 1;
		end if;
		Sample_Sum := Sample_Sum + Sample_Data(I).Score;
		Sample_Sqr_Sum := Sample_Sqr_Sum + (Sample_Data(I).Score*Sample_Data(I).Score);
		Product_Sum := Product_Sum +
	            (Actual_Data(I).Weighted_Worse_Ratio*Sample_Data(I).Score);
		Diff_Sum := Diff_Sum +
		    (Actual_Data(I).Weighted_Worse_Ratio - Sample_Data(I).Score);
		Diff_Sqr_Sum := Diff_Sqr_Sum +
		    (Actual_Data(I).Weighted_Worse_Ratio - Sample_Data(I).Score)**2;
	        if Result_Debug or else (Limited_Calc_Debug and then (I <= Debug_Calc_Limit or else I >= Actual_Len - Debug_Calc_Limit)) then
		    Ada.Text_IO.Put_Line ("Item " & Actual_Data(I).Item_Name & "; cnt="
			    & Natural'Image(Actual_Count));
		    Ada.Text_IO.Put("Sample Score="); LFlt_IO.Put (Sample_Data(I).Score, 10,6,0);
		    Ada.Text_IO.Put("; Actual Score="); LFlt_IO.Put (Actual_Data(I).Weighted_Worse_Ratio, 10,6,0);
	            Ada.Text_IO.New_Line;
		    --Ada.Text_IO.Put("Sample Sum="); LFlt_IO.Put (Sample_Sum, 10,6,0);
		    --Ada.Text_IO.Put("; Sample Sum**2="); LFlt_IO.Put (Sample_Sqr_Sum, 10,6,0);
	            --Ada.Text_IO.New_Line;
		    --Ada.Text_IO.Put("Actual Sum="); LFlt_IO.Put (Actual_Sum, 10,6,0);
		    --Ada.Text_IO.Put("; Actual Sum**2="); LFlt_IO.Put (Actual_Sqr_Sum, 10,6,0);
	            --Ada.Text_IO.New_Line;
		    --Ada.Text_IO.Put("Product Sum="); LFlt_IO.Put (Product_Sum, 10,6,0);
		    --Ada.Text_IO.Put("; Diff Sum="); LFlt_IO.Put (Diff_Sum, 10,6,0);
		    --Ada.Text_IO.Put("; Diff Sum**2="); LFlt_IO.Put (Diff_Sqr_Sum, 10,6,0);
	            --Ada.Text_IO.New_Line;
		-- else no trace
		end if;
	    -- else do not score this pair.
	    end if;
	end loop;

This is the operative loop of a correlation calculator, comparing an array of
actual values to an array of (proposed) estimated values. The array length is
tends to be as much as 4000 items. (Note that the two values that only depend on
Actual_Data are precalculated, since Actual_Data never changes for a particular
run, while the Sample_Data is regenerated for each trial.)

Among other things, one wonders where the debugging would go in a reduction. If
the reduction gives dubious results, you'll have to end up writing a
conventional loop to figure out why (no I/O is allowed in parallel operations,
since it allows blocking, and even when it is, you can't trace the intermediate
results).

Another thing about this code is that common subexpression elimination means
that many of the memory reads are eliminated (or should be, anyway). For
instance, Sample_Data(I).Score should be loaded into a register and only read
once. The reduced memory traffic provides a substantial speed-up to the code.

...
> It looks to me like this will be the most work to nail down in the RM.
> I think the Associated aspect can borrow much of the wording that we
> currently have for the NonBlocking aspect, except change occurrences
> of NonBlocking for Associated.

Why? I though "associated" just specified that an operation is associative; that
doesn't change the semantics of an operation that calls it. All of the
(substantial) complication of nonblocking comes because one can't allow a call
(anywhere) of an operation that allows blocking, so the property has to
propagate. Don't see why "associative" would have to propagate.

Or is "associated" supposed to be something completely different? I don't
remember seeing anything about it previously...

> Then the Parallel Operations AI can be a lot better focused.
> The 'Reduce AI I think
> will also be quite a lot simpler to write up, as both can
> reference the tasklet AI.
>
> Sound OK?

I don't really see the point, since 'Reducer is trivial (it's a Boolean and
isn't transitive), the same should be true for 'Associative, and the basic
parallel loop proposal (which doesn't support any reductions) can provide all of
the tasklet stuff (there isn't much).

I'd expect that any reduction proposal would be in a separate AI (or perhaps
alternative AIs). If we decided to adopt multiple proposals, we could worry
about eliminating duplication, but I do that sort of thing all of the time.

...
> On an aside, I did an experiment today where I wrote a parallel loop
> that calculates 3 results, a sum of a set of values, the min of a set
> of values, and the max of a set of values.
>
> I was curious to see if writing this as three separate loops would be
> better or worse than creating a composite object and writing a reducer
> function that can combine results of this record type.
>
> I found that writing as 3 separate loops was significantly faster than
> 1 loop.
>
> For a loop that involves 4_000_000_000 iterations, the 3 loop solution
> took 11.17 seconds. The single loop took
> 18.29 seconds. Why bring this up?

Because you're testing the wrong thing?

All you've proved here is that Ada has lousy support for tuple-based
programming. I'd be very surprised of any formulation that uses functions
returning composite types would have decent performance doing anything.

If you *really* wanted to test this, you would have used a Reducer *procedure*
to combine the records. The overhead of a single in out parameter is likely to
be many times less than a function return (which is creating an object for each
iteration).

But you'd also have to test whether the approach of writing a single
conventional loop is faster or slower than any parallelism. The overhead of a
loop, especially a parallel loop, can be substantial.

> The 'Reduce approach encourages users to write 3 separate 'Reduce
> expressions, whereas the parallel loop approach encourages users to
> write a single loop with 3 results, which I found is easiest to
> implement in Paraffin if the
> 3 results are combined into a single result type.
>
> So it may be that using 'Reduce can result in significantly better
> performance results than a parallel loop approach, for certain
> problems at least.

Comparing one insane solution (writing a function returning a record, requiring
extra type and function declarations) to a second insane solution (writing the
iterator part repeatedly, with all of the possibilities of error with that)
hardly seems valuable. Both solutions require writing a lot of extra text
compared to the sequential loop, and both seem to require turning the loop
inside out.

The big thing this demostrates is that Ada is a long way from really supporting
functional programming, and it probably never will be able to do that AND have
performance. (Ada doesn't handle tuples easily, and IMHO it shouldn't.)

In any case, a parallel loop is not going to fix every performance problem. Any
problem made up of reducers that are simple machine instructions (like "+" or
"Min") are probably going to be fastest without any parallelism at all. (And
will take the least text if there is only one loop!) You'd need 100s of
thousands of iterations for those to matter, and there aren't that many real
problems with that many iterations.

But there are a lot of problems with expensive iterations that might be helped.
And with expensive setups that would be costly to repeat. (And note the above
comments about common subexpression elimination. If you converted my example
into 5 loops, you'd have to redo all of the memory reads of .Score 5 times,
while the original loop only would do each read once. That would add a lot of
slowdown.

Perhaps you'd like to retry your experiment with my example above? I suspect the
results would be somewhat (or maybe a lot) different.

****************************************************************

From: Brad Moore
Sent: Thursday, January 11, 2018  3:59 PM

> ...
>> I really think in some cases users wont mind deleting an existing
>> loop such as;
>>
>> declare
>>    Sum : Integer := 0;
>> begin
>>
>>    for I in 1 .. 1000 loop
>>       Sum := Sum + A(I);
>>    end loop
>>
>>    Put_Line ("Sum of A=" & Integer'Image(Sum); end
>>
>> and replacing with;
>>
>> Put_Line ("Sum of A=" & Integer'Image(A'Reduce("+", 0)));
>
> Actually, they probably will, because the latter might very well run
> 10 times slower. (It certainly will if the compiler tries to
> parallelize it, since setup/teardown of a tasklet probably will take a
> lot more than 250 integer adds; it might even if left sequential, as
> compilers have been optimizing the former for 40 years and probably
> will generate only 4 or 5
> x86 machine instructions for the former -- but the patterns are likely
> to be very specific to the existing format and won't work on the
> slightly different organization of a reduction.)

I disagree with this assessment. I would expect that a 'Reduce implementation
would generate the same code as a corresponding parallel loop, under the covers,
assuming only a single reduction result is needed.

For instance if a Paraffin like library was being called under the covers, the
same calls could be applied.

...
> Among other things, one wonders where the debugging would go in a reduction.
> If the reduction gives dubious results, you'll have to end up writing
> a conventional loop to figure out why (no I/O is allowed in parallel
> operations, since it allows blocking, and even when it is, you can't
> trace the intermediate results).

'Reduce is simple enough that there's nothing much to debug typically.

Reducers like Integer'Min, or "+", etc doesn't need debugging, and the rest of
the parallelism is provided by the implementation. There's a lot less to go
wrong.

I think if a loop is complex, then you are better off using a parallel loop,
then you can also insert debugging code more easily.

> Another thing about this code is that common subexpression elimination
> means that many of the memory reads are eliminated (or should be,
> anyway). For instance, Sample_Data(I).Score should be loaded into a
> register and only read once. The reduced memory traffic provides a
> substantial speed-up to the code.
>
> ...
>> It looks to me like this will be the most work to nail down in the RM.
>> I think the Associated aspect can borrow much of the wording that we
>> currently have for the NonBlocking aspect, except change occurrences
>> of NonBlocking for Associated.
>
> Why? I though "associated" just specified that an operation is
> associative; that doesn't change the semantics of an operation that
> calls it. All of the
> (substantial) complication of nonblocking comes because one can't
> allow a call (anywhere) of an operation that allows blocking, so the
> property has to propagate. Don't see why "associative" would have to
> propagate.

I think we'd also want to something similar where the compiler might warn the
user that a 'Reduce cannot be done in parallel. Similarly, you might want to
allow 'Associative on generic formals, then you would need to disallow an actual
that isn't associative. You might also want to query an attribute, like you can
for NonBlocking.

> Or is "associated" supposed to be something completely different? I
> don't remember seeing anything about it previously...

The idea is that it is a boolean aspect that can be applied to a binary function
where both parameters and result are of the same type. , or to a binary
procedure where both parameters are the same type, and one of the parameters is
an in out.

All it says is that semantically, you get the same result, regardless which
actual gets passed in which parameter. (e.g  A + B = B + A) for Integer "+"

>> Then the Parallel Operations AI can be a lot better focused.
>> The 'Reduce AI I think
>> will also be quite a lot simpler to write up, as both can reference
>> the tasklet AI.
>>
>> Sound OK?
>
> I don't really see the point, since 'Reducer is trivial (it's a
> Boolean and isn't transitive), the same should be true for
> 'Associative, and the basic parallel loop proposal (which doesn't
> support any reductions) can provide all of the tasklet stuff (there isn't
> much).

The Reducer aspect I am talking about is an aspect that can be applied to a
binary function or procedure where the two parameters are not of the same type.
It identifies another subprogram that has the Associative aspect, to be used
when combining results from a subprogram that has the Reducer aspect.

eg.

In Ada.Containers.Vectors...

   function "&" (Left, Right : Vector) return Vector with Associative;

   function "&" (Left : Vector; Right : Element_Type) return Vector with Reducer => "&";

The Reducer aspect names another function that has the Associative aspect.

You can then, for instance, append Characters to a Vector in parallel.
Otherwise, you could only concatenate vectors together in parallel.

> I'd expect that any reduction proposal would be in a separate AI (or
> perhaps alternative AIs). If we decided to adopt multiple proposals,
> we could worry about eliminating duplication, but I do that sort of thing all of the time.
>
> ...
>> On an aside, I did an experiment today where I wrote a parallel loop
>> that calculates 3 results, a sum of a set of values, the min of a set
>> of values, and the max of a set of values.
>>
>> I was curious to see if writing this as three separate loops would be
>> better or worse than creating a composite object and writing a
>> reducer function that can combine results of this record type.
>>
>> I found that writing as 3 separate loops was significantly faster
>> than 1 loop.
>>
>> For a loop that involves 4_000_000_000 iterations, the 3 loop
>> solution took 11.17 seconds. The single loop took
>> 18.29 seconds. Why bring this up?
>
> Because you're testing the wrong thing?
>
> All you've proved here is that Ada has lousy support for tuple-based
> programming. I'd be very surprised of any formulation that uses
> functions returning composite types would have decent performance doing anything.

Actually this works surprisingly well even for composites. I have updated my
program to produce a more information, and I think the results are quite
interesting.

I have also implemented Randy's test program from his previous email, and yes
that does also provide some interesting results that are quite different. See
even farther below.


With Optimization turned off in GNAT with 4 cores I ran the following tests on
my program which produced the following results.

1. Calculation of Sum of adding 1, 8_000_000_000 times using a sequential loop
2. Doing the same, but with a parallel loop
3. Calculate the sum as before, but also calculate min and max of the loop
   index, in 3 separate sequential loops
4. Do 3 again, but using 3 separate parallel loops.
5. Do all 3 calculations on a composite result in a single parallel loop using
   functional reduction. The calculation of results in each tasklet is done
   using a procedure, but the reduction between tasklets is done using a
   function, which is also used to provide the final result.
6. Do 5, except use a procedural reduction library. All reduction calls and
   final result are via a procedure in out parameter
7. Do 2, except instead of using my own Paraffin library, use Ada code passed
   to OpenMP library
8. Do 4, except use my OpenMP bindings from Ada.
9. Do 6, except use my OpenMP bindings from Ada.

We see here that using 3 separate parallelism loops ends up being twice as fast
as sequential. However, using composite record types, the performance is not
even worthwhile, as it is worse or at least roughly equivalent as sequential.

My suspicion is that having to use record types in comparison with elementary
types can add a significant amount of overhead. Probably the elementary types
can be processed more using hardware registers, etc.

Interesting things happen when I turn on optimization though. Skip down to see
those results.


************* Parallel Framework Test *************
  Physical Processors= 4

Single operation in a sequential Loop

{Elapsed = 00:00:19.25
Sum= 8000000000


Single Operation Parallelism Loop

{Elapsed = 00:00:05.30
Sum= 8000000000


3 operations in a sequential Loop

{Elapsed = 00:00:28.14
Sum= 8000000000, Min= 1, Max= 8000000000


3 Separate Parallelism Loops

{Elapsed = 00:00:15.64
Sum= 8000000000, Min= 1, Max= 8000000000


Single Functional Parallelism Loops

{Elapsed = 00:00:29.70
Sum= 8000000000, Min= 1, Max= 8000000000


Single Procedural Parallelism Loops

{Elapsed = 00:00:30.28
Sum= 8000000000, Min= 1, Max= 8000000000


Single Operation OpenMP Parallelism Loop

{Elapsed = 00:00:04.66
Sum= 8000000000


3 Separate OpenMP Parallelism Loops

{Elapsed = 00:00:15.55
Sum= 8000000000, Min= 1, Max= 8000000000


Single OpenMP Parallelism Loops

{Elapsed = 00:02:39.06
Sum= 8000000000, Min= 1, Max= 8000000000


----------------

Now after turning on full optimization in GNAT, I get the following results for
the same tests.


************* Parallel Framework Test *************
  Physical Processors= 4

Single operation in a sequential Loop

{Elapsed = 00:00:00.00
Sum= 8000000000


Single Operation Parallelism Loop

{Elapsed = 00:00:02.07
Sum= 8000000000


3 operations in a sequential Loop

{Elapsed = 00:00:08.24
Sum= 8000000000, Min= 1, Max= 8000000000


3 Separate Parallelism Loops

{Elapsed = 00:00:04.14
Sum= 8000000000, Min= 1, Max= 8000000000


Single Functional Parallelism Loops

{Elapsed = 00:00:05.30
Sum= 8000000000, Min= 1, Max= 8000000000


Single Procedural Parallelism Loops

{Elapsed = 00:00:05.31
Sum= 8000000000, Min= 1, Max= 8000000000


Single Operation OpenMP Parallelism Loop

{Elapsed = 00:00:02.09
Sum= 8000000000


3 Separate OpenMP Parallelism Loops

{Elapsed = 00:00:04.19
Sum= 8000000000, Min= 1, Max= 8000000000


Single OpenMP Parallelism Loops

{Elapsed = 00:00:17.29
Sum= 8000000000, Min= 1, Max= 8000000000

Interesting that a single operation, (sum) doesn't even register a time.
But when you try to do more than one, then there can be parallelism benefits.

But even with optimization turned on, you get better performance using 3
separate loops, although now at least there is at least some benefit to using a
single loop.

> If you *really* wanted to test this, you would have used a Reducer
> *procedure* to combine the records. The overhead of a single in out
> parameter is likely to be many times less than a function return
> (which is creating an object for each iteration).

In all my testing with Paraffin, I have found surprisingly that there never is
much of a difference between using a function vs a procedure. You can see that
in the results above.

> But you'd also have to test whether the approach of writing a single
> conventional loop is faster or slower than any parallelism. The
> overhead of a loop, especially a parallel loop, can be substantial.

I show that above in the results also. You can see the speedup vs sequential
code.

>> The 'Reduce approach encourages users to write 3 separate 'Reduce
>> expressions, whereas the parallel loop approach encourages users to
>> write a single loop with 3 results, which I found is easiest to
>> implement in Paraffin if the
>> 3 results are combined into a single result type.
>>
>> So it may be that using 'Reduce can result in significantly better
>> performance results than a parallel loop approach, for certain
>> problems at least.
>
> Comparing one insane solution (writing a function returning a record,
> requiring extra type and function declarations) to a second insane
> solution (writing the iterator part repeatedly, with all of the
> possibilities of error with that) hardly seems valuable. Both
> solutions require writing a lot of extra text compared to the
> sequential loop, and both seem to require turning the loop inside out.
>
> The big thing this demostrates is that Ada is a long way from really
> supporting functional programming, and it probably never will be able
> to do that AND have performance. (Ada doesn't handle tuples easily,
> and IMHO it
> shouldn't.)

I think the results above show that it actually does handle tuples pretty well.

> In any case, a parallel loop is not going to fix every performance problem.
> Any problem made up of reducers that are simple machine instructions
> (like "+" or "Min") are probably going to be fastest without any
> parallelism at all. (And will take the least text if there is only one
> loop!) You'd need 100s of thousands of iterations for those to matter,
> and there aren't that many real problems with that many iterations.

I see benefits to parallelism with both optimization turned off or on, even for
these simple operations.

> But there are a lot of problems with expensive iterations that might
> be helped. And with expensive setups that would be costly to repeat.
> (And note the above comments about common subexpression elimination.
> If you converted my example into 5 loops, you'd have to redo all of
> the memory reads of .Score 5 times, while the original loop only would
> do each read once. That would add a lot of slowdown.
>
> Perhaps you'd like to retry your experiment with my example above? I
> suspect the results would be somewhat (or maybe a lot) different.

I did just that, and created another test program.

This program runs the code sequentially, then as separate parallel loops one for
each result, then a single loop using a composite result type.

First are the results with optimization turned off.

************* Parallel Framework Test *************
  Physical Processors= 4

Multiple operations in a sequential Loop

{Elapsed = 00:00:01.45
Negative_Count= 0
Sample_Sum= 1.99965350799589E+07
Sample_Sqr_Sum= 1.33299985832170E+07
Product_Sum= 1.99965350799589E+07
Diff_Sum= 2.00046739200409E+07
Diff_Sqr_Sum= 1.33381374236478E+07


Multiple parallel loops

{Elapsed = 00:00:01.38
Negative_Count= 0
Sample_Sum= 1.99965350797953E+07
Sample_Sqr_Sum= 1.33299985832053E+07
Product_Sum= 1.99965350797953E+07
Diff_Sum= 2.00046739202047E+07
Diff_Sqr_Sum= 1.33381374236362E+07


Single parallel loop

{Elapsed = 00:00:00.42
Negative_Count= 0
Sample_Sum= 1.99965350797952E+07
Sample_Sqr_Sum= 1.33299985832053E+07
Product_Sum= 1.99965350797952E+07
Diff_Sum= 2.00046739202047E+07
Diff_Sqr_Sum= 1.33381374236363E+07

---------------------------------

Then I run it again with optimization turned on....

************* Parallel Framework Test *************
  Physical Processors= 4

Multiple operations in a sequential Loop

{Elapsed = 00:00:00.62
Negative_Count= 0
Sample_Sum= 1.99965350799589E+07
Sample_Sqr_Sum= 1.33299985832170E+07
Product_Sum= 1.99965350799589E+07
Diff_Sum= 2.00046739200409E+07
Diff_Sqr_Sum= 1.33381374236478E+07


Multiple parallel loops

{Elapsed = 00:00:00.94
Negative_Count= 0
Sample_Sum= 1.99965350797953E+07
Sample_Sqr_Sum= 1.33299985832053E+07
Product_Sum= 1.99965350797952E+07
Diff_Sum= 2.00046739202048E+07
Diff_Sqr_Sum= 1.33381374236362E+07


Single parallel loops

{Elapsed = 00:00:00.20
Negative_Count= 0
Sample_Sum= 1.99965350797950E+07
Sample_Sqr_Sum= 1.33299985832052E+07
Product_Sum= 1.99965350797950E+07
Diff_Sum= 2.00046739202050E+07
Diff_Sqr_Sum= 1.33381374236362E+07


With optimization turned off, there is only a slight benefit to using separate
loops, but there is a big advantage to using a single loop.

With optimization turned on, using parallelism with separate loops is quite a
bit worse than sequential.

However, using a single loop is about 3 times faster. So the results from this
program are pretty much the reverse of what you get with my program.

So to summarize all this,

I'd say that as the complexity of a loop increases, it is better and easier to
use a parallel loop.

As the complexity of a loop decreases, it is likely better and easier to use a
'Reduce call.

****************************************************************

From: Randy Brukardt
Sent: Thursday, January 11, 2018  10:40 PM

...
> > Actually, they probably will, because the latter might very well run
> > 10 times slower. (It certainly will if the compiler tries to
> > parallelize it, since setup/teardown of a tasklet probably will take
> > a lot more than 250 integer adds; it might even if left sequential,
> > as compilers have been optimizing the former for 40 years and
> > probably will generate only 4 or 5
> > x86 machine instructions for the former -- but the patterns are
> > likely to be very specific to the existing format and won't work on
> > the slightly different organization of a reduction.)
>
> I disagree with this assessment. I would expect that a 'Reduce
> implementation would generate the same code as a corresponding
> parallel loop, under the covers, assuming only a single reduction
> result is needed.

I think you missed my point. For the 1000 iterations in the original example,
highly optimized sequential loop code probably would clobber any attempt at a
parallel loop, as creation/ending synch on the tasklets would far outweigh the
cost of doing 1000 iterations (most likely with all of the critical stuff in
registers).

You saw an extreme case of this in your simple test program with optimization
on, as GNAT recognized that the loop didn't do anything very interesting other
than waste time and replaced the entire thing with the result of 8_000_000. Not
doing a loop is going to beat doing a loop any time. ;-)

The other part of my point is that a *sequential* reduce expression might not
get the benefit of those optimizations, because the setup and organization would
be somewhat different. Some of these optimizations tend to be pattern matching,
where a known pattern is recognized and replaced with something much simpler; if
the code is slightly different, the pattern doesn't match and the optimization
doesn't get done. Obviously, that would be very implementation-specific, so it's
hard to say much more about that.

...
> I think if a loop is complex, then you are better off using a parallel
> loop, then you can also insert debugging code more easily.

Other than that we don't have a way to do reductions currently in a parallel
loop. :-)

Note that I'd be happy to finish the current version of the parallel loop AI
without worrying about reducers, and deal with the reducer issues separately.
(That would at least let us make some progress.)

> > ...
> >> It looks to me like this will be the most work to nail down in the RM.
> >> I think the Associated aspect can borrow much of the wording that
> >> we currently have for the NonBlocking aspect, except change
> >> occurrences of NonBlocking for Associated.
> >
> > Why? I though "associated" just specified that an operation is
> > associative; that doesn't change the semantics of an operation that
> > calls it. All of the (substantial) complication of nonblocking comes
> > because one can't allow a call (anywhere) of an operation that
> > allows blocking, so the property has to propagate. Don't see why "associative"
> > would have to propagate.
>
> I think we'd also want to something similar where the compiler might
> warn the user that a 'Reduce cannot be done in parallel.

Warnings are not the Standard's concern. OTOH, I would like to have a way to
tell the compiler whether to parallelize (or not) a reduction expression, since
it doesn't in general have the information needed to make that choice. It can't
find out the number of iterations nor how expensive the reduction function is
(in special cases it can, like statically bounded iterations with a predefined
operator as a reduction function). Having to make a runtime choice would mean
essentially generating all of the code twice (once for a parallel reduction, and
once for a sequential reduction), and always choosing parallel would make most
reductions pretty slow (you need a significant amount of work - either
iterations or in the function body - before parallelizing will help anything).

> Similarly, you might want to allow 'Associative on generic formals,
> then you would need to disallow an actual that isn't associative. You
> might also want to query an attribute, like you can for NonBlocking.

That's still many times simpler than Nonblocking. Because Nonblocking controls
legality rules in bodies for which it is true, but we have cases where you don't
know the value until the generic is instantiated. You don't need any of that.

> > Or is "associated" supposed to be something completely different? I
> > don't remember seeing anything about it previously...
>
> The idea is that it is a boolean aspect that can be applied to a
> binary function where both parameters and result are of the same type.
> , or to a binary procedure where both parameters are the same type,
> and one of the parameters is an in out.
>
> All it says is that semantically, you get the same result, regardless
> which actual gets passed in which parameter. (e.g A + B = B + A) for
> Integer "+"

Um, that's "commutative". :-)

An associative operator is one for which (A + B) + C = A + (B + C). My
understanding is that was the only requirement needed to allow parallel
reduction (because we don't know which one we'd want to do first, but it is easy
enough to keep Left things on the left and Right things on the right).

> >> Then the Parallel Operations AI can be a lot better focused.
> >> The 'Reduce AI I think
> >> will also be quite a lot simpler to write up, as both can reference
> >> the tasklet AI.
> >>
> >> Sound OK?
> >
> > I don't really see the point, since 'Reducer is trivial (it's a
> > Boolean and isn't transitive), the same should be true for
> > 'Associative, and the basic parallel loop proposal (which doesn't
> > support any reductions) can provide all of the tasklet stuff (there
> > isn't much).
>
> The Reducer aspect I am talking about is an aspect that can be applied
> to a binary function or procedure where the two parameters are not of
> the same type. It identifies another subprogram that has the
> Associative aspect, to be used when combining results from a
> subprogram that has the Reducer aspect.
>
> eg.
>
> In Ada.Containers.Vectors...
>
>    function "&" (Left, Right : Vector) return Vector with Associative;
>
>    function "&" (Left : Vector; Right : Element_Type) return Vector
> with Reducer => "&";
>
> The Reducer aspect names another function that has the Associative
> aspect.
>
> You can then, for instance, append Characters to a Vector in parallel.
> Otherwise, you could only concatenate vectors together in parallel.

Yes, but this is still trivial (perhaps fairly large, but that is different than
hard). In any case, I don't want these aspects stuff mixed in with the basic
parallel loop proposal (which actually is nearing completion, I think -- it just
needs wording polishing).

Reduction stuff is its own thing, regardless of what context it appears in.

...
> So to summarize all this,
>
> I'd say that as the complexity of a loop increases, it is better and
> easier to use a parallel loop.

...assuming that we have a way to write a reduction in such a loop. :-)

> As the complexity of a loop decreases, it is likely better and easier
> to use a 'Reduce call.

Makes sense, but I at least rarely write such loops (outside of string
processing, which wouldn't have enough iterations to be worth parallelizing,
and which generally aren't reductions anyway). YMMV.

****************************************************************

From: Jean-Pierre Rosen
Sent: Thursday, January 11, 2018  11:35 PM

>> All it says is that semantically, you get the same result, regardless
>> which actual gets passed in which parameter. (e.g A + B = B + A) for
>> Integer "+"
> Um, that's "commutative". :-)
>
> An associative operator is one for which (A + B) + C = A + (B + C). My
> understanding is that was the only requirement needed to allow
> parallel reduction (because we don't know which one we'd want to do
> first, but it is easy enough to keep Left things on the left and Right things
> on the right).

And especially, note that "&" is NOT commutative...

****************************************************************

From: Brad Moore
Sent: Friday, January 12, 2018  1:13 PM

...
> I think you missed my point. ...

OK

> ...
>> I think if a loop is complex, then you are better off using a
>> parallel loop, then you can also insert debugging code more easily.
>
> Other than that we don't have a way to do reductions currently in a
> parallel loop. :-)
>
> Note that I'd be happy to finish the current version of the parallel
> loop AI without worrying about reducers, and deal with the reducer
> issues separately. (That would at least let us make some progress.)

I think thats a good idea for now, for that AI.
I'd probably still try to create a separate AI on Associative, Reducer aspects,
as that would be needed for the 'Reduce AI, but would be better to separate
that part out.

> I would like to have a way to
> tell the compiler whether to parallelize (or not) a reduction
> expression, since it doesn't in general have the information needed to make that choice.
> It can't find out the number of iterations nor how expensive the
> reduction function is (in special cases it can, like statically
> bounded iterations with a predefined operator as a reduction
> function). Having to make a runtime choice would mean essentially
> generating all of the code twice (once for a parallel reduction, and
> once for a sequential reduction), and always choosing parallel would
> make most reductions pretty slow (you need a significant amount of
> work - either iterations or in the function body - before parallelizing will
> help anything).

Maybe we could allow a parallel do block for this purpose. It indicates that
parallelism is desired, and we could also say that the "and" is optional.

parallel
do
    Foo := A'Reduce(F,0);
end do;

If the "and" is present, then parallelism is first applied to the separate
blocks, and then optionally applied internally to each block, if more cores are
available.

e.g.

parallel
do
    Foo := A'Reduce(F,0);
and
    Bar := A'Reduce(G,0);
end do;

Though if the compiler was unsure about reducing via the F reducer, it would
probably fall back to the default as would be applied if the enclosing parallel
do wasn't there.

If one really wanted the F reducer to be done in parallel for this example, one
could write:

parallel
do
    parallel do
       Foo := A'Reduce(F,0);
    end do;
and
    Bar := A'Reduce(G,0);
end do;

...
>> > Or is "associated" supposed to be something completely different? I
>> > don't remember seeing anything about it previously...
>>
>> The idea is that it is a boolean aspect that can be applied to a
>> binary function where both parameters and result are of the same
>> type.
>> , or to a binary procedure where both parameters are the same type,
>> and one of the parameters is an in out.
>>
>> All it says is that semantically, you get the same result, regardless
>> which actual gets passed in which parameter. (e.g A + B = B + A) for
>> Integer "+"
>
> Um, that's "commutative". :-)

Yes, sorry my mistake for trying to type out this email too fast without
thinking. The intent was to write the formula
    (A + B) + C = A + (B + C)

And so my description of associative is also incorrect. It is not about being
able to switch the order of parameters to the function, but rather being
flexible in applying the function to a series of values without concern for
order in how the function is applied.

> In any case, I don't want these aspects stuff mixed in with the basic
> parallel loop proposal (which actually is nearing completion, I think
> -- it just needs wording polishing).
>
> Reduction stuff is its own thing, regardless of what context it appears in.

Agreed.

****************************************************************

From: Brad Moore
Sent: Friday, January 12, 2018  1:17 PM

...
> And especially, note that "&" is NOT commutative...

Yes, sorry for the confusion of typing the wrong formula for associativity,

but note that while this is not commutative, it is associative, and thus is a
candidate for parallelisation. The intent of the proposal is to support both
commutative and non-commutative reducer functions.

****************************************************************

From: Brad Moore
Sent: Tuesday, January 16, 2018  9:44 AM

> ...
>> As far as explicit parallelism, I now believe the right way to handle
>> reductions in that context is to provide a chunk index to a parallel
>> loop, vaguely analogous to an entry family index. ...
[See AI12-0251-1 for this thread.]

I'm also thinking we might want to provide explicit parallelism via an attribute.

We could have;

   'Reduce

  which defaults to sequential semantics, but could implicitly
  inject parallelism if the global checks and nonblocking checks
  pass, and appropriate reducers are identified, (and if the implementation
  can tell that this would be a benefit to performance). An implementation
  could opt to skip all these checks and provide a simpler sequential
  implementation, particularly as it is not easy in a lot of cases to determine
  if introducing the parallelism overhead is worthwhile.

and

   'Parallel_Reduce


   which is the explicit form, that applies the global and nonblocking
   checks, and ensures that reducers are defined. The compiler rejects
   the compilation if these checks do not all pass.

   This way, the code documents the intent for parallelism, and also
   makes it clearer to the programmer that parallelism is being applied.
   And if the programmer wants to compare with sequential performance,
   or cannot make the checks pass, all he needs to do is take off the
   "Parallel_" from the attribute reference.

****************************************************************

From: Tucker Taft
Sent: Tuesday, January 16, 2018  10:51 AM

We talked about this in Vienna (or was it Lexington), and decided to leave out
an explicit "parallel" from the reduction expression.  If the compiler has to
be smart enough to check whether parallelism would be legal, it is smart enough
to insert it.  The same thing would apply to quantified expressions as well,
and eventually container aggregates, etc.  For these kinds of "self-contained"
expressions, I don't think we should create explicitly parallel versions, as
it just adds complexity -- they should be seen as inherently parallel, with
the compiler deciding when it is worth to make them actually parallel.

****************************************************************

From: Randy Brukardt
Sent: Tuesday, January 16, 2018  4:19 PM

> We talked about this in Vienna (or was it Lexington), and decided to
> leave out an explicit "parallel" from the reduction expression.

My recollection of this discussion (in Lexington) was a bit different. :-)

You pretty much announced that we don't need "parallel" in expression contexts.
Brad objected, but you and Raphael argued that tuning is not portable.

Somewhat later, Raphael complained that there isn't any way for the user to
tell if a reduction expression is executed in parallel. No one responded to
that.

For myself, this was a complete surprise and I didn't object at least in part
because I hadn't even thought about the possibility. Now that I have, I
strongly agree with Brad and Raphael's objections, and have more on top.

> If the compiler has to be smart enough to check whether parallelism
> would be legal, it is smart enough to insert it. The same thing would
> apply to quantified expressions as well, and eventually container
> aggregates, etc.  For these kinds of "self-contained"
> expressions, I don't think we should create explicitly parallel
> versions, as it just adds complexity -- they should be seen as
> inherently parallel, with the compiler deciding when it is worth to
> make them actually parallel.

Ada always allows as-if optimizations, and surely if the compiler can prove
that executing in parallel has no impact on the canonical results, then it can
execute anything in parallel. (That is enabled because the order of evaluation
of many things is unspecified. There is nothing special about a reduction in
this sense -- it applies to any aggregate, evaluation of parameters, and most
other expression parts.)

The problem, though, with as-if optimizations is that one has to be
conservative with them. One never wants an as-if optimization to run slower
than the original code.

That's a major issue for parallel execution, as the overhead of parallel
execution is going to be high on common desktop operating systems. That's
especially true since the Meltdown flaw, as the mitigation is for all system
calls to remap the address space. At least one system call will be needed to
start and stop a tasklet, as the bare machine approach of letting it busy wait
would be very unfriendly (burning CPU at all times).

Combining these two things, parallelization can only be automatically applied
when it is known that there are enough iterations and cost to each iteration
to make the savings be more than the overhead. But that means that only
reducers with static bounds and known bodies (either fully visible expressions
or known expression functions) and that can beat the overhead should be
parallelized. If you don't know the number of iterations, or the amount of
code, then one can't parallelize as there is a significant risk that the
performance would be way worse. Obviously, interpreters and even just-in-time
compilers can do better, but Ada is usually implemented with code generation
long before execution (especially necessary for embedded systems).

As such, automatic parallelization will be available only very rarely; probably
rarely enough to not make it worth the effort to even implement.
(Interestingly, it is much easier to tell when parallelization will do no good,
as simple short loops are quite common.) However, the programmer knows (or at
least suspects) when parallelization would improve the performance. They need
to be able to specify that they want parallelization for every construct for
which it makes sense (and that surely includes reductions).

****************************************************************

From: Jean-Pierre Rosen
Sent: Tuesday, January 16, 2018  11:41 PM

> Combining these two things, parallelization can only be automatically
> applied when it is known that there are enough iterations and cost to
> each iteration to make the savings be more than the overhead.
I would add:

and that the gain is significantly higher than what can be achieved with
regular tasks.

****************************************************************

From: Brad Moore
Sent: Wednesday, January 17, 2018  12:07 AM

> For myself, this was a complete surprise and I didn't object at least
> in part because I hadn't even thought about the possibility. Now that
> I have, I strongly agree with Brad and Raphael's objections, and have more
> on top.

I'd like to also add to what Randy said, which happened to very much echos my
own views.

Back in Vienna and Lexington, though, the situation was quite different from
what we have now.

Back then, we were talking reduction expressions, which looked a lot like
quantified expressions. We were talking about adding the parallel keyword to
that syntax, which did not fit in very well in my view and looked quite messy.
So, I was more able to understand why we didn't want to have the explicit
parallel keyword with reduction expressions, even if I was reluctant to go
that way.

Since then, we came up with the 'Reduce attribute idea. To have another
attribute such as 'Parallel_Reduce fits quite a lot more nicely into this
scheme. There is no additional keyword needed. It's just a substitution of a
different attribute name.

I note also that the proposal for explicit parallelism that you proposed the
other day relies on using 'Reduce calls, which is implicit parallelism.
If the goal of that proposal is to provide a way to indicate explicit
parallelism, shouldn't there also be a way to explicitly indicate the the
'Reduce calls associated with that syntax are also to be executed with
parallelism?In a lot of cases, parallelism isn't wanted, but other times it
would be.

****************************************************************

From: Brad Moore
Sent: Saturday, January 20, 2018  4:16 PM

This is my first attempt at rewriting the new AI on Reduction capabilities.
[This is version /02 of the AI - ED.]

This version is a complete replacement for what I submitted last time, and
introduces two attributes,

'Reduce

and

'Parallel_Reduce


The 'Reduce attribute is considered to have sequential semantics, whereas the
'Parallel attribute has parallel semantics.

I also got rid of the Identity_Value aspect, and Reducer aspect,

and instead added an Associative aspect which is a boolean aspect that indicates
if the subprogram results are considered to be associative or not.

For instance, Addition is considered to be associative because;

  (A + B) + C = A + (B + C)


The combiner_subprogram of the attribute can be a function or procedure, that
accepts exactly two parameters, both of the same type. The result of the
function is also of this same type.

It is because of this restriction that we can eliminate the 'Identity_Value and
'Reducer aspects. The first value assigned to a tasklet can be used as the
initial value, except for the case of the tasklet processing the first item of
the prefix, in which case, the initial_Value argument of the attribute is used.

Comments welcome, of course...

****************************************************************

From: Randy Brukardt
Sent: Thursday, January 25, 2018  1:57 AM

Mostly looks OK to me, although you don't need some of the generic Nonblocking
rules for Associative (since Associative doesn't control Legality Rules in the
subprogram body like Nonblocking does).

I fixed a number of typos and formatting issues when I posted this, the worst
being "two parameters are not of the same time" rather than type. This version
will get put on the website Friday with all of the other last minute updates.

****************************************************************

From: Jean-Pierre Rosen
Sent: Wednesday, February  7, 2018  10:33 AM

Since my proposal seems to not have been very clear, here is my suggestion
again from scratch.

Currently, the proposal uses some kind of expression (aggregate?
container? array?) as the prefix:
   <expression>'Reduce (F, <value>)

However, this would make 'Reduce very different from all other
attributes:
- Current syntax says that the prefix of an attribute is a name (or an
  implicit dereference). This proposal would be a complete change of
  syntax.

- All other attributes denote values or subprograms. What does 'Reduce
  denote?  And what is F? A name? Then 'Reduce can't be a function! It
  is actually a whole new kind of expression.

- I find having a prefix that extends over several lines before you
  discover it IS the prefix of an attribute extremely unreadable and
  confusing. Admitedly this is personal taste (I've never been a fan
  of prefixed notations)

My proposal is to change the syntax (NOT the semantics) by making F the
prefix: F'Reduce (<expression>) (The issue of having or not a default value
is independent and deserves another discussion).

Here:
- The prefix is a name - no change in syntax

- 'Reduce denotes a function that performs the reduction of its
  argument. The argument is syntactically an array whose components
  are appropriate for F. Its behaviour can be described in terms of
  a regular function

- For parallelism, it would be easy to specify the reduction function
  as an aspect Reduction => F'Reduce

This does NOT imply really creating an array to be passed to a regular
function.  We just need to specify that the components are evaluated in no
particular order.  Not creating the array is simply like putting F'Reduce
inline. It's 100% "as if".

OTOH, a compiler would be /allowed/ to construct an array if it is more
convenient or efficient in some cases, and/or to generate an actual function.

We could rule that F'Reduce has convention => intrinsic if we don't want it
to be passed to generics - or not. After all, all inlined subprograms need a
regular version for generics.

I think this model would be easier to understand and explain, and would fit
better in the current syntax.

****************************************************************

From: Edmond Schonberg
Sent: Wednesday, February  7, 2018  11:24 AM

That’s an attractive alternative, as long as the first argument can also be a
container or other itterable object,  not just an explicit iterator construct:
"+"'Reduce (S, 0) (adjacent double and single quote are unfortunate however).

****************************************************************

From: Randy Brukardt
Sent: Wednesday, February  7, 2018  5:55 PM

> Since my proposal seems to not have been very clear, here is my
> suggestion again from scratch.
>
> Currently, the proposal uses some kind of expression (aggregate?
> container? array?) as the prefix:
>    <expression>'Reduce (F, <value>)

It seems that you are responding to various wild ideas proposed in e-mail, and
not the concrete proposal - AI12-0242-1 - that Brad (presumably) spent hours
writing. Brad's proposal clearly explains what this is: the prefix is an array
or container object. Brad does not propose any syntax change.

<Rant>Please, can we at least talk about what is proposed and not old fever
dreams?? We need to make progress on this standard, and we can't do that if we
keep rehashing things already rejected.</Rant>

There is no functionality need for the prefix to be anything else; it might be
easier to write with some other rule but surely not easier to understand.
The prefix can be an aggregate of course, and that provides the iterators given
in the e-mail examples. But that is NOT fundamental to the operation -- at
least not to me (it makes much more sense when explained as operating on an
array object than when tangling the iterators into it). [Just because it is
written as a separate object doesn't require it to be implemented that way, of
course, and the idea was to allow the object evaluation and the reduction to be
interleaved -- probably a "Notwithstanding" rule.]

Aside: I don't see any real need for the prefix to be a container (one can
always write an array aggregate out of the appropriate container elements/parts
of elements), and Brad's proposal never explains what he means by "container",
anyway. (The Ada.Containers types are just normal tagged private types.) I
suspect that at a minimum an iterator needs to be present. I can see that it
might be more usable to allow containers, but I think it would be best to just
define the semantics in terms of arrays and figure out an extension to
containers later (they'll be a lot more complicated because of the need to talk
about iterators and to just define what is meant by a "container" in this
context). End Aside.

> However, this would make 'Reduce very different from all other
> attributes:
> - Current syntax says that the prefix of an attribute is a name (or an
>   implicit dereference). This proposal would be a complete change of
>   syntax.

Brad proposes no syntax change at all. I strongly concur with Brad on this; if
we later think that we need to improve the usability of the attribute, we can
consider other extensions but I don't think any of the ones I've considered fit
into Ada very well. (Nor does your proposal, so far as I can tell; more below.)

To use an aggregate in this notation, it would have to be a qualified
expression (something has to specify the type of the aggregate). That might
make it clunky to use in some contexts (needing to define a type just for it),
but the idea of an expression with no defined type is wildly out of character
for Ada and is a nonstarter with me. (Your proposal does nothing for this.)

> - All other attributes denote values or subprograms. What does 'Reduce
>   denote?  And what is F? A name? Then 'Reduce can't be a function! It
>   is actually a whole new kind of expression.

It denotes a value, of course. Why would you expect it to denote something
else? The whole idea of attributes denoting functions is a nasty hack that
causes more trouble than it is worth (functions with parameters of types that
you can't otherwise describe are hardly functions at all).

And surely there is nothing about a "value" that insists that the value has
to be constant; it could be the result of some calculation (indeed, even
A'First is the result of a calculation if the prefix is a parameter).

> - I find having a prefix that extends over several lines before you
>   discover it IS the prefix of an attribute extremely unreadable and
>   confusing. Admitedly this is personal taste (I've never been a fan
>   of prefixed notations)

A legitimate gripe, but one that applies equally to prefixed views and
(recently) existing attributes like Obj'Image. Argubly, that bird has flown.
Moreover, I expect that many of these expressions will be directly in array
objects and the prefixes will be no longer than other typical attributes (more
below).

> My proposal is to change the syntax (NOT the semantics) by making F
> the prefix: F'Reduce (<expression>) (The issue of having or not a
> default value is independent and deserves another discussion).
>
> Here:
> - The prefix is a name - no change in syntax

Straw man argument, the current proposal includes no change in syntax.

I've previously suggested allowing an aggregate as a prefix, but I now realize
that would not work, as an attribute prefix has to be resolved without
context, while an aggregate can only be resolved *with* context (else a
qualified expression, already allowed, would have to be used). I'm
uninterested in any Ada construct having no type.

Also, your proposal doesn't work very well for operators, as a string literal
is not syntactically a name (nor could it be, I think). One has to use an
expanded name for the prefix.

> - 'Reduce denotes a function that performs the reduction of its
>   argument. The argument is syntactically an array whose components
>   are appropriate for F. Its behaviour can be described in terms of
>   a regular function

Sorry, but Ada does not have a type which is "syntactically an array whose
components are appropriate for F". One cannot write such a function. This
would be a function in name only, which is as useful as describing 'Pos as a
function (not at all, leaning toward harmful).

Moreover, it hard to imagine how the resolution of such a thing would work.
I'm very uninterested in describing a new kind of type resolution; at least
the resolution of an attribute prefix is already context-free. And if you
make this context-free, then (just as in the existing proposal) an aggregate
would have to be qualified (because again the context would not give a type).
So this is not helping anything for usability.

> - For parallelism, it would be easy to specify the reduction function
>   as an aspect Reduction => F'Reduce

I don't see any existing situation where this would be of use. Either one has
existing imperative code, where the whole idea of reduction is foreign (and
requires lots of restructuring), or you are creating new code in a functional
style (where the attribute would be applied directly). One can always wrap a
reduce in an expression function if one really needs to treat it that way
(another reason why few attributes need to be treated as subprograms).

In any case, this is not a real function as the argument cannot be described
in Ada terms. So one would never want to treat it as a real function (in terms
of renaming and the like).

> This does NOT imply really creating an array to be passed to a regular
> function.  We just need to specify that the components are evaluated
> in no particular order.  Not creating the array is simply like putting
> F'Reduce inline.
> It's 100% "as if".
>
> OTOH, a compiler would be /allowed/ to construct an array if it is
> more convenient or efficient in some cases, and/or to generate an
> actual function.

This is surely the intent of Brad's proposal, and every proposal (even the
crazy ones) I've seen.

> We could rule that F'Reduce has convention => intrinsic if we don't
> want it to be passed to generics - or not. After all, all inlined
> subprograms need a regular version for generics.

All of the attribute functions are already Intrinsic, but that doesn't prevent
renaming or passing as a generic. Janus/Ada has a whole special block of code
devoted solely to resolving this possibility (renaming attributes or passing
them to generics) -- it's not practical to share that with anything else. I'd
much rather not have Reduce falling into this case. (This sort of use is
barely tested, as it never happens in practice and rarely in the ACATS.)

Luckily, this doesn't matter for your proposal, as with Pos and Val and many
others, you cannot describe the profile in Ada, so renaming/generic actuals
aren't possible anyway.

> I think this model would be easier to understand and explain, and
> would fit better in the current syntax.

I'm not going to argue with an opinion, but there is no difference in the
syntax. If anything, Brad's proposal is slightly better syntactically (as
there is no need to use expanded names for operators).

Still, when you originally proposed this, the examples made more sense to me
when written as Brad proposes. (Although no one ever wrote any of the examples
in a form that is actually sensible Ada -- with proper type names
-- so the jury is still out on that.) The most compelling example to me -- the
example with Tucker's "manual chunking" proposal -- uses an array object as
the prefix. (I'd guess using an aggregate would be rare for most users --
Tucker might be an exception. :-) In Brad's syntax:
      Sum := Partial_Sum'Reduce("+", 0);
In your proposal, the obvious expression is illegal:
      Sum := "+"'Reduce (Partial_Sum, 0); -- Illegal syntax as noted previously
so one has to write:
      Sum := Standard."+"'Reduce (Partial_Sum, 0);

The original makes much more sense to me, as one is reducing the array object
Partial_Sum with the operator "+". Reduction is not a function of "+", yet
your syntax makes it look that way.

Ed wrote:

>... "+"'Reduce (S, 0) (adjacent double and single quote are unfortunate
>however).

Well, the above is not legal Ada syntax, and J-P is not proposing that it be
changed (nor is Brad, for that matter). So one has to write this as an
expanded name:

    Standard."+"'Reduce (S, 0)

Note that there is nothing new about the adjacent " and '. For user-defined
operators, something like:
    Some_Pkg."+"'Access
is perfectly legal Ada 95 (not just Ada 2012).

****************************************************************

From: Tucker Taft
Sent: Thursday, February  8, 2018  9:37 AM

From an admittedly personal perspective, it feels like the ongoing discussion
of the "reduction expression" seems to have entered into the realm of "design
by committee," which I think we have usually avoided in the past. The notion
of a "reduction expression" was suggested a year or so ago, in part because I
have found it to be extraordinarily useful in ParaSail, and because it
included the notion of "reduction" which keeps coming up in discussions of
parallelism.

I have also seen the use of containers grow in Ada 2012 relative to Ada 2005
largely because of the iterator "syntactic sugar" added in Ada 2012.  The
reduction expression was new "syntactic sugar" that combined the power of
iterators with the notion of Map/Reduce.

The notion of "map/reduce" refers to performing an operation (the "map" part)
on some or all elements of some large data set or large search space, and then
combining the individual results (the "reduce" part) to produce a single
"reduced" result.  After many back and forths on e-mail, we seem to have ended
up with little more than something which applies an operator over the elements
of an array, and makes no use of the power of "syntactic sugar" to raise the
level of abstraction.  To me this is quite a disappointment.

I believe iterators are probably the most important new "programming" feature
of Ada 2012, while pre/postconditions are the most important new "declarative"
feature.  And for that matter, iterator-based constructs like quantified
expressions are critical to writing concise pre/postconditions.

For Ada 2020, building on the iterator "syntactic sugar" to me seems a key to
growing the real power of programming in Ada.  Things like iterator "filters"
are part of that -- they are not just "gewgaws."   Various combiners such as
concurrent iterators also add to the power.

Below is a copy of the set of examples of use of map/reduce expressions from
an earlier e-mail, with ParaSail's "|" replaced with Ada's "&" where
appropriate and "when" used for filters.  One thing to notice is that many of
the "reductions" involve concatenations of strings, which clearly would be a
real pain if you first had to create an array of strings, something which is
very awkward in Ada.

I realize this proposed syntax was unclear to some, but it also provides a lot
of power, and keeps operators like "+" and "&" in their natural "infix"
position, so the reduction computation is written in its natural form.  It
also allows an arbitrary amount of processing of the elements generated by the
iterator (the "map" part).  If we lose both of these advantages of this
notation, by reverting to something like "+"'Reduce(...), as well as losing the
ability to use any sort of iterator and filter, the proposal loses much of its
power.  It is not much more than what a generic "Reduce" package could
provide, so it really doesn't add any real power to programming in Ada.

So I would recommend we return to a powerful syntax that is based on
iterators, with operators in their natural "infix" position, etc.  Otherwise,
I believe we are wasting the opportunity and not really adding anything
significant.  At a minimum, we should look at how examples like those below
would be supported in whatever proposal we come up with.

------- Examples of map/reduce expressions, originally in ParaSail, but now with some Ada-ization ------

--  Count the number of formal-params that are output parameters
-- The "when bool" is a filter that controls whether any action is taken
-- for a particular item in the container

const Num_Outputs :=
                     (for each P of Formal_Params
                       when P.Is_Operation_Output => <0> + 1)


--  Create a bracketed list of the elements of DF.
--  Note that ParaSail allows two iterators that iterate together,
--  stopping when either reaches its end.
--  So in the case, we are iterating over the elements of "DF" while
--  also doing a simple "X := Y then Z"
--  iterator which sets X to Y initially and then sets it to Z on each
--  subsequent iteration.
const DF_Image :=
 (for (F in DF; Sep := "" then ", ") => <"["> & Sep & F) & "]"


--  This produces a string representation of a sequence of ranges
--  separated by spaces
return (for R in Rngs forward =>
  <"Countable::["> & R.First & ".." & R.Last & " ") & "]"


--  This prints out a bracketed, comma-separated string representation
--  of the inputs
--  It again uses two iterators being iterated together
Println((for (each Input of Inputs forward;
                   Sep := "" then ", ") =>
                <"["> & Sep & "VN" & Input) & "]")


--  This counts the number of parameters that are outputs or variables
--  with the filter in "{...}" again
(for each P of Params
   when P.Is_Operation_Output or else P.Is_Var => <0> + 1)


--  This determine the minimum index of a set of parameters that pass
--  the filter (i.e. are either outputs or variables).
--  Rather than using the most positive and most negative values as the
--  "identity" for Min and Max, in Parasail, null is used, meaning both
--  Min(null, X) = X and Max(null, X) = X
const First_Updated :=
 (for each [I => P] of Params
    when P.Is_Operation_Output or else P.Is_Var => Min(<null>, I))


-- Count the number of non-inputs
const Num_Inputs :=
 (for each P of Params when not P.Is_Operation_Output => <0> + 1)


--  Print the list of VNs in VNs_To_Prop
Println(
 (for VN in VNs_To_Prop forward => <"Recursive_Top_Down"> & " VN" & VN))


--  Create a set of the parent value numbers (VNs) of the VNs in
--  Changed_Operands
const Parents : VN_Set :=
  (for Opnd in Changed_Operands =>
    <[]> & Op_Ctx.VN_To_Parent_Set_Map[Opnd])


--  Print a list of [VNx => VNy] pairs representing the live-out set
Println(
 (for each [Loc => VN] of Live_Out_Set =>
   <" Live_Out_Set ="> & " [VN" & Loc & "=>VN" & VN & "]"))


--  Count the number of nodes that are found by following the
--  Parent chain.
--  This uses the general "X := Y then Z while W" iterator which
--  initializes X to Y, and then sets it to Z on each subsequent
--  iteration, as long as "W" remains true.
return (for N := Node_Id then Node_Tree.Parent[N]
          while N in Node_Tree.Parent => <0> + 1)


--  This asserts that the sum of the sizes in Bit_Field_Sizes should be <= 63
assert ((for each Size of Bit_Field_Sizes => <0> + abs(Size)) <= 63);


--  Build up an array mapping from Bit-Field Key to the
--  sum of the sizes of all bit-fields preceding this bitfield
const Bit_Field_Offsets : Array<Univ_Integer, Indexed_By => Key_Type> :=
 [for each [Key => Size] of Bit_Field_Sizes, Key =>
    (for K in Key_Type'First .. Key-1 => <0> + Bit_Field_Sizes[K])]


--  Dump the sizes of the bit fields
Println(
 (for (each [Key => Size] of Bit_Field_Sizes; Sep := "" then ", ")
   forward => <"Bit_Field_Sizes:["> & Sep & Key & " => " & Size) & "]")


--  Hash the characters of a string to produce an integer
func Hash_Vec(Vec : Vector<Univ_Character>) return Univ_Integer is
  return (for I in 1 .. Vec'Length reverse =>
    (<0> * 127 + (Vec[I] - Char_First)) mod Hash_Modulus)
end func Hash_Vec

****************************************************************

From: Randy Brukardt
Sent: Thursday, February  8, 2018  5:52 PM

I don't want to rehash all of the previous arguments, so I'm just going to
say a few words (but that doesn't mean that the old arguments aren't still
true).

> The reduction
> expression was new "syntactic sugar" that combined the power of
> iterators with the notion of Map/Reduce.

I recall the lead designer of Ada 95 telling people far and wide that Ada was
about providing building blocks, not finished structures. This was a classic
example of a "finished" feature rather than a set of building blocks. That
becomes clear when one just needs to reduce an array (likely to be a very
common operation associated with normal parallel loops in Ada 2020). With your
old proposed syntax, you have to write an iterator to do that, even though the
iterator is irrelevant noise.

Ada has a great support for creating maps (in the form of arrays), and Ada
2020 promises to extend that even further. So what's missing is a Reduce
primitive, and that's what's being proposed in AI12-0242-1.

Building some sort of Frankenstein combination of iterators and reduce without
having an actual reduce primitive just throws away all of the Ada history of
building blocks.

> The notion of "map/reduce" refers to performing an operation (the
> "map" part) on some or all elements of some large data set or large
> search space, and then combining the individual results (the "reduce"
> part) to produce a single "reduced"
> result.  After many back and forths on e-mail, we seem to have ended
> up with little more than something which applies an operator over the
> elements of an array, and makes no use of the power of "syntactic
> sugar" to raise the level of abstraction.  To me this is quite a
> disappointment.

Raising the "level of abstraction" beyond the level that most programmers can
understand is not helpful. I'd rather the handful of super-advanced programmers
be disappointed. And there is nothing natural or understandable about the
combination of Map/Reduce. It only began to make sense when we started talking
about Reduce on its own. (I think it exists solely because of its potential to
describe parallelism -- Ada is about describing the problem in programming
language terms, not making the problem unrecognizable to fit some preconceived
notion of abstraction.)

Adding a bunch of "syntactic broccoli" to describe a construct that few will
understand or use is not an improvement.

Let me add that I've been searching for ways to make the Reduce attribute
easier to use; I'm not unwilling to consider improvements. But discarding the
entire thing (and/or discarding strong typing) is not the way to go.

In any case, you haven't explained what it is that you can't do a combination
of an aggregate and the reduce operation. There should be a permission to
interleave the evaluation of the prefix (and its parts) with the actual Reduce
operation (I think it is missing in Brad's proposal).

> I believe iterators are probably the most important new "programming"
> feature of Ada 2012, while pre/postconditions are the most important
> new "declarative" feature.  And for that matter, iterator-based
> constructs like quantified expressions are critical to writing concise
> pre/postconditions.

I definitely do not agree with this. For loop iterators for containers are a
natural extension that I'm not surprised are widely used. But I don't see much
value to quantified expressions; their parts of pre/post should be encapsulated
in a function and as such the body of that function can use a regular for loop
if necessary. (We do need to be able to declare such functions "extra pure" ==
global null, so that they can be optimized algebraically - Ada 2020 ought to
fix that.)

> For Ada 2020, building on the iterator "syntactic sugar" to me seems a
> key to growing the real power of programming in Ada.  Things like
> iterator "filters" are part of that -- they
> are not just "gewgaws."   Various combiners such as
> concurrent iterators also add to the power.

We surely need concurrent iterators, but more than that is just too much.
Turning everything into an iterator is not "raising the abstraction level"
-- that's done by providing good abstract operations for an ADT. It's exposing
more implementation than necessary.

...
> So I would recommend we return to a powerful syntax that is based on
> iterators, with operators in their natural "infix"
> position, etc.  Otherwise, I believe we are wasting the opportunity
> and not really adding anything significant.

I'm afraid that reraising this at this point is likely to kill Ada 2020, as
there is little likelihood of consensus on this point.

> At a minimum, we should look at how examples like those below would be
> supported in whatever proposal we come up with.

I agree with this. I did the first few examples here. The annoyance (IMHO) is
having to find/declare an appropriate array type. I was hoping for suggestion
to deal with that, not to start over.

====

Before we look at Tuck's example's (many of which are weird uses for this
construct), let's look at a pair of common cases written both ways:

-- Reducing the result of a manually-chunked parallel loop:

-- Parasail-like:
    Sum := (for E of Partial_Sum => <0> + E);
-- AI12-0242-1:
    Sum := Partial_Sum'Reduce("+", 0);

-- Calculating the sum of squares for a data array (part of statistics gathering):
-- Parasail-like:
    Sum_Sqr := (for I in Data'range => <0.0> + Data(I)*Data(I));
-- AI12-0242-1:
    Sum_Sqr := Data_Array'(for I in Data'range => Data(I)*Data(I))'Reduce("+", 0.0);
-- Note: "Data_Array" is the type of Data here. No additional declaration is needed.


...
> const Num_Outputs :=
>                      (for each P of Formal_Params
>                        when P.Is_Operation_Output => <0> + 1)

    type Mapper is array (Positive range <>) of Natural;

    Num_Outputs : constant Natural :=
        Mapper'(for each P of Formal_Params => Boolean'Pos(P.Is_Operation_Output)'Reduce("+",0);

-- I'm assuming that we add all iterators to array aggregates as previously
-- proposed (although no one has written that up).

> --  This counts the number of parameters that are outputs or variables
> --  with the filter in "{...}" again
> (for each P of Params
>    when P.Is_Operation_Output or else P.Is_Var => <0> + 1)

-- This is essentially the same as the last one:

    Num_Outputs_or_Vars : constant Natural :=
        Mapper'(for each P of Formal_Params =>
           Boolean'Pos(P.Is_Operation_Output or else P.Is_Var)'Reduce("+",0);

> -- Count the number of non-inputs
> const Num_Inputs :=
>  (for each P of Params when not P.Is_Operation_Output => <0> + 1)

-- This is identical to the first, not going to waste my time copying it.

> --  This asserts that the sum of the sizes in Bit_Field_Sizes should
> be <= 63 assert ((for each Size of Bit_Field_Sizes => <0> + abs(Size))
> <= 63);

pragma assert (Mapper'(for each Size of Bit_Field_Sizes =>
                        abs(Size))'Reduce("+",0) <= 63);

> --  Hash the characters of a string to produce an integer func
> Hash_Vec(Vec : Vector<Univ_Character>) return Univ_Integer is
>   return (for I in 1 .. Vec'Length reverse =>
>     (<0> * 127 + (Vec[I] - Char_First)) mod Hash_Modulus) end func
> Hash_Vec

-- I think this one needs a helper function (I used Ada types here):
   function Hash_One (New_Char, Pending : Natural) return Natural is
      ((Pending * 127 + (New_Char - Char_First)) mod Hash_Modulus);

   function Hash_Vec(Vec : String) return Natural is
     (Mapper'(for I in 1 .. Vec'Length => Character'Pos(Vec(Vec'Length-I-1))'Reduce(Hash_One, 0));

-- If 'Reduce allowed the first parameter of the combiner to be of another type,
-- and you didn't insist on hashing in reverse (which seems unnecessary to me),
-- this could be a lot simpler:

   function Hash_One (New_Char : Character; Pending : Natural) return Natural is
      ((Pending * 127 + (Character'Pos(New_Char) - Char_First)) mod Hash_Modulus);

   function Hash_Vec(Vec : String) return Natural is
     (Vec'Reduce(Hash_One, 0));

> --  Create a bracketed list of the elements of DF.
> --  Note that ParaSail allows two iterators that iterate together,
> --  stopping when either reaches its end.
> --  So in the case, we are iterating over the elements of "DF" while
> --  also doing a simple "X := Y then Z"
> --  iterator which sets X to Y initially and then sets it to Z on each
> --  subsequent iteration.
> const DF_Image :=
>  (for (F in DF; Sep := "" then ", ") => <"["> & Sep & F) & "]"

-- I cannot for the life of me figure how this is supposed to work. Which
-- demonstrates my original point, I think...

-- In any case, for a language that supposedly is mostly used for uses
-- where no heap use is allowed, this kind of example (which will use
-- loads of heap doing all of these concats) seems silly. I'd rather use
-- 'Image function overloading for this sort of thing anyway.

I give up here for now. I think we may need to allow the first parameter of
the combiner to be of a different type in order to avoid having clunky
conversions (and unnecessary array types) in the uses of Reduce, but I'd like
to see more examples that are actually reducing something...

One thought: "'Reduce("+",0)" is so common that one could easily imagine
having a "'Sum" operation with just that meaning; it would simplify the
writing of a lot of these operations.

****************************************************************

From: Brad Moore
Sent: Thursday, February  8, 2018  10:45 PM

> Before we look at Tuck's example's (many of which are weird uses for
> this construct), let's look at a pair of common cases written both ways:
>
> -- Reducing the result of a manually-chunked parallel loop:
>
> -- Parasail-like:
>    Sum := (for E of Partial_Sum => <0> + E);
> -- AI12-0242-1:
>    Sum := Partial_Sum'Reduce("+", 0);
>
> -- Calculating the sum of squares for a data array (part of statistics gathering):
> -- Parasail-like:
>    Sum_Sqr := (for I in Data'range => <0.0> + Data(I)*Data(I));
> -- AI12-0242-1:
>    Sum_Sqr := Data_Array'(for I in Data'range => Data(I)*Data(I))'Reduce("+", 0.0);
> -- Note: "Data_Array" is the type of Data here. No additional declaration is needed.

With the 'Reduce attribute, I'm wondering if people would naturally gravitate
towards using an expression function to express the array aggregate. This I
think addresses Jean-Pierre's complaint about potentially long prefixes, by
splitting into two parts. It also eliminates the use of a qualified
expression, which Randy finds annoying. I think this also fits more in the
idea of using building blocks to arrive at a solution.

eg. The previous example could be written as;

function Squares return Data_Array is
  (for I in Data'Range => Data(I)**2);

Sum_Sqr := Squares'Reduce("+", 0.0);

> ...
>> const Num_Outputs :=
>>                      (for each P of Formal_Params
>>                        when P.Is_Operation_Output => <0> + 1)
>
>    type Mapper is array (Positive range <>) of Natural;
>
>    Num_Outputs : constant Natural :=
>        Mapper'(for each P of Formal_Params =>
> Boolean'Pos(P.Is_Operation_Output)'Reduce("+",0);

I think a paren is missing, and the "each" keyword which doesn't exist in
Ada should be dropped.

 Num_Outputs : constant Natural :=
        Mapper'(for P of Formal_Params =>
             Boolean'Pos(P.Is_Operation_Output))'Reduce("+",0);

or alternatively;

function Operations return Mapper is
   (for P of Formal_Params => Boolean'Pos(P.Is_Operation_Output));

Num_Outputs : constant Natural := Operations'Reduce("+",0);

If element filtering is commonly needed, such as the Boolean'Pos mechanism
of this example, I think it would be worthwhile to consider adding "when" as
syntactic sugar to replace the Boolean'Pos construct in this example. I see
this being useful for array aggregates in general, not necessarily anything
to do with 'Reduce attribute usage. This would also provide early elimination
of junk values from the aggregate that aren't of interest.

This example could then be written with 'Length instead of 'Reduce, for
instance:

 Num_Outputs : constant Natural :=
     Mapper'(for P of Formal_Params when P.Is_Operation_Output => 0)'Length;

I think whether we decide to add "When" syntax to aggregates should be
separate from this AI. As Randy's examples show, the reductions can be written
without them, but they seem to be a "nice to have" feature.
We can decide separately whether it'd be "nice enough to have".

For the remaining examples, I will assume that we don't have the "When" clause.

> -- I'm assuming that we add all iterators to array aggregates as
> previously proposed (although no one has written that up).
>
>> --  This counts the number of parameters that are outputs or
>> variables
>> --  with the filter in "{...}" again
>> (for each P of Params
>>    when P.Is_Operation_Output or else P.Is_Var => <0> + 1)
>
> -- This is essentially the same as the last one:
>
>    Num_Outputs_or_Vars : constant Natural :=
>        Mapper'(for each P of Formal_Params =>
> Boolean'Pos(P.Is_Operation_Output or else P.Is_Var)'Reduce("+",0);

Same typos apply here (missing paren and extra each) ....

  Num_Outputs_or_Vars : constant Natural :=
         Mapper'(for P of Formal_Params =>
            Boolean'Pos(P.Is_Operation_Output or else P.Is_Var))'Reduce("+",0);

or
   function Output_Or_Vars return Mapper is
      (for P of Formal_Params => Boolean'Pos(P.Is_Operation_Output or else P.Is_Var => 0));

   Num_Outputs_Or_Vars : constant Natural := Output_Or_Vars'Reduce("+",0);


...
> -- If 'Reduce allowed the first parameter of the combiner to be of
> another type,
> -- and you didn't insist on hashing in reverse (which seems
> unnecessary to me),
> -- this could be a lot simpler:
>
>   function Hash_One (New_Char : Character; Pending : Natural) return
> Natural is
>      ((Pending * 127 + (Character'Pos(New_Char) - Char_First)) mod
> Hash_Modulus);
>
>   function Hash_Vec(Vec : String) return Natural is
>     (Vec'Reduce(Hash_One, 0));

As the other examples show below, I think we will probably want to extend
the AI to allow the parameters and result to be of different types.

>> --  Create a bracketed list of the elements of DF.
>> --  Note that ParaSail allows two iterators that iterate together,
>> --  stopping when either reaches its end.
>> --  So in the case, we are iterating over the elements of "DF" while
>> --  also doing a simple "X := Y then Z"
>> --  iterator which sets X to Y initially and then sets it to Z on
>> each
>> --  subsequent iteration.
>> const DF_Image :=
>>  (for (F in DF; Sep := "" then ", ") => <"["> & Sep & F) & "]"
>
> -- I cannot for the life of me figure how this is supposed to work.
> Which
> -- demonstrates my original point, I think...
>
> -- In any case, for a language that supposedly is mostly used for uses
> -- where no heap use is allowed, this kind of example (which will use
> -- loads of heap doing all of these concats) seems silly. I'd rather
> use
> -- 'Image function overloading for this sort of thing anyway.
>
> I give up here for now. I think we may need to allow the first
> parameter of the combiner to be of a different type in order to avoid
> having clunky conversions (and unnecessary array types) in the uses of
> Reduce, but I'd like to see more examples that are actually reducing something...
>
> One thought: "'Reduce("+",0)" is so common that one could easily
> imagine having a "'Sum" operation with just that meaning; it would
> simplify the writing of a lot of these operations.
>

Carrying on from where Randy left off....

--  This produces a string representation of a sequence of ranges
--  separated by spaces

Note: This example is a case where we want the combiner result type to be
different than the combiner parameter types. We should consider allowing
that in the AI. For sequential reductions, this should work fine. If we want
this to work for parallel reductions, then we probably need to bring back the
aspects of Identity, and Reducer, where Identity indicates the value to use to
initialize the accumulator of any parallel chunk, and Reducer is an aspect
applied to the combiner subprogram that identifies another subprogram whose
parameters and results are all of the same time, so that the results from
different parallel chunks can be combined.

OTOH, these examples seem to be all about fiddling with character strings, and
those examples which seem to involve reduction of string arrays into a string
result maybe are not as common a use case as it might seem, looking at the
broader usage. We might find in 90 % or more the reductions involve parameters
and results of the same type.

-- Parasail
return (for R in Rngs forward =>
  <"Countable::["> & R.First & ".." & R.Last & " ") & "]"

-- Ada
function Ranges return String_Array is
  (for R in Rngs => R.First & ".." & R.Last & " ");

return ("Countable::[" & Ranges'Reduce("&", "") & "]");


--  This prints out a bracketed, comma-separated string representation
--  of the inputs
--  It again uses two iterators being iterated together

-- Parasail
Println((for (each Input of Inputs forward;
                   Sep := "" then ", ") =>
                <"["> & Sep & "VN" & Input) & "]")

NOTE, this example also needs support for the combiner result type to be
different than the combiner parameters.

-- Ada
function Inputs_CSV return String_Array is
  ((for I in Inputs'Range => (if I = Inputs'First then "" else ", ") & "VN" & Inputs(I));

Put_Line("[" & Inputs_CSV'Reduce("&", "") & "]");


--  This determine the minimum index of a set of parameters that pass
--  the filter (i.e. are either outputs or variables).
--  Rather than using the most positive and most negative values as the
--  "identity" for Min and Max, in Parasail, null is used, meaning both
--  Min(null, X) = X and Max(null, X) = X

-- Parasail
const First_Updated :=
 (for each [I => P] of Params
    when P.Is_Operation_Output or else P.Is_Var => Min(<null>, I))

-- Ada
function Updates return Mapper is
  (for I in Params => (if Params(I).Is_Operation_Output
                         or else Params(I).Is_Var
                       then Params(I)
                       else Integer'Max));

constant First_Updated := Updates'Reduce(Integer'Min, Integer'Max);


--  Print the list of VNs in VNs_To_Prop

-- Parasail

Println(
 (for VN in VNs_To_Prop forward => <"Recursive_Top_Down"> & " VN" & VN))

-- Ada

-- NOTE another case where the combiner result type differs from the combiner
-- parameters type

function VNs return String_Array is
  (for VN in VNs_To_Prop => " VN" & VN);

Put_Line("Recursive_Top_Down" & VNs'Reduce("&", ""));


Scanning down the other examples of Tuck's, I dont see anything that couldn't
be handled similarly as the examples already shown.

****************************************************************

From: Tucker Taft
Sent: Friday, February  9, 2018  10:00 AM

> I recall the lead designer of Ada 95 telling people far and wide that
> Ada was about providing building blocks, not finished structures. This
> was a classic example of a "finished" feature rather than a set of
> building blocks. That becomes clear when one just needs to reduce an
> array (likely to be a very common operation associated with normal
> parallel loops in Ada 2020). With your old proposed syntax, you have
> to write an iterator to do that, even though the iterator is irrelevant noise.  ...

It seems our respective notion of "building blocks" don't agree.  I believe
iterators (plus filters) are the best "building block" for generating values,
and what we need is some "building block" for absorbing the values generated
by an iterator, optionally performing a computation on them (using the
existing Ada computational  "building blocks"), and then combining the
elements into a single value (this is the new building block using
"<initial val>" or some other syntax).

Building blocks always produce somewhat more verbose representations for any
given special case (e.g. reducing an array), but ultimately provide much more
power to the programmer when the programmer wants to go beyond a specially
supported special case.  I can see that the 'Reduce attribute works nicely
for the array special case, but fails when the elements you want to reduce
represent some other sort of sequence of items, such as that which can be
produced by a combination of an iterator, an optional filter, and an Ada
expression.

I think it is a big loss to only support reduction over an existing concrete
object with a single binary function.  By using more flexible syntax, we can
support reduction of a sequence of values produced in many different ways,
without ever having to produce in memory a single concrete object representing
the sequence.

The original discussion of 'Reduce had it combined with a container aggregate,
and at that point I presumed we were talking about a very special
interpretation of the container aggregate, where it didn't have a type, it
just represented a sequence of values that are to be reduced.  But that
presumes we agree on the syntax of a container aggregate and the special
significance of 'Reduce applied to such an aggregate.  But if we start
insisting that this is a "normal" attribute with normal rules such as being
able to resolve the prefix to a single, pre-existing type, then the idea falls
apart in my view.  So if we have heartburn with giving the aggregate'Reduce
attribute special overloading rules, etc., then we should go back to some sort
of new syntax rather than lose all of the power of using an iterator, filter,
etc., which the container aggregate would provide.

****************************************************************

From: Tucker Taft
Sent: Friday, February  9, 2018  11:31 AM

> I definitely do not agree with this. For loop iterators for containers are a
> natural extension that I'm not surprised are widely used. But I don't see
> much value to quantified expressions; their parts of pre/post should be
> encapsulated in a function and as such the body of that function can use a
> regular for loop if necessary. ...

This claim of little value for quantified expressions may be based on your own
coding style, or anticipation thereof, but we have a number of folks actually
using Ada 2012, and SPARK 2014 which is based on it, and quantified expressions
are used heavily.  As one example, below is the spec for an implementation of
red-black trees.  The SPARK tools proved that the implementation matches the
specified contracts, which define full functional correctness.  It uses the
SPARK 2014 "Contract_Cases" aspect which is essentially a post-condition that
is similar to a case statement in that it requires non-overlapping predicates
as preconditions to each associated post-condition.  You could substitute it
with a large Post => (if xxx then ... elsif ...) and have essentially the same
semantics, but only lose the requirement that exactly one of the conditions
must evaluate to true.

Even a quick perusal will show the heavy use of quantified expressions.
Proving the functional correctness of this implementation of red-black trees
was to some extent an academic exercise, but we have, as an example, a customer
writing a secure kernel in Ada 2012/SPARK 2014, and theirs is definitely not an
academic exercise.  You can say that you would not write Ada like this, but
that is not the point.  Those who use formal methods in industrial settings
do make heavy use of quantified expressions.

In general we do have to rely on our own coding styles and experience to try to
evaluate the potential value of some new language feature, but if we have access
to code written by others, it is important to factor in their usage patterns as
well.  We should try not to fall into the trap that only our own personal
coding style is the "correct style" and any other usage pattern is inherently
bad style.  Our customers for these future Ada extensions are the existing and
potential professional Ada programmers making a living writing Ada.  The more
we know about how such folks use existing Ada features (or features of other
languages), the more we can serve their future needs, and/or make the Ada
language a friendly place for them to move to achieve high productivity with
high security.

------  red-black trees (author is Claire Dross) -------


with Tree_Model; use Tree_Model;
use Tree_Model.D_Seq;
use Tree_Model.V_Set;
with Ada.Containers; use Ada.Containers;
with Binary_Trees; use Binary_Trees;


--  This package provides an implementation of binary search trees on top of the
--  Forest type defined in Binary_Trees. A search tree is a forest with a root
--  and a natural value per node. It has an invariant stating that values of the
--  nodes of the tree are ordered. It provides primitives for inserting a value
--  in the tree and for searching the tree for a given value. It also provides
--  Rotate primitives that can be used to balance the search tree.


package Search_Trees with SPARK_Mode is
   pragma Unevaluated_Use_Of_Old (Allow);


   type Search_Tree is private with Default_Initial_Condition =>
     Size (Search_Tree) = 0;


   function Size (T : Search_Tree) return Extended_Index_Type;


   function Root (T : Search_Tree) return Index_Type with
     Pre => Size (T) /= 0;


   function Parent (T : Search_Tree; I : Index_Type) return Extended_Index_Type with
     Post => (if Size (T) = 0 then Parent'Result = Empty);


   function Position (T : Search_Tree; I : Index_Type) return Direction with
     Pre => Parent (T, I) /= Empty;


   function Model (T : Search_Tree) return Model_Type with
     Ghost,
     Pre => Size (T) /= 0;


   function Peek (T : Search_Tree; I : Index_Type; D : Direction) return Extended_Index_Type with
     Pre => Size (T) /= 0 and then Model (T) (I).K;


   function Values (T : Search_Tree) return Value_Set with
     Ghost,
     Post => (if Size (T) = 0 then Is_Empty (Values'Result));


   function Contains (T : Search_Tree; V : Natural) return Boolean with
     Post => Contains'Result = Contains (Values (T), V);


   procedure Insert
     (T : in out Search_Tree; V : Natural; I : out Extended_Index_Type)
   with
     Pre  =>
       --  There are free nodes to use
       Size (T) < Max,
     Contract_Cases =>


       --  Case 1: V is already in the tree. Return the empty node in I. All
       --  values in the tree, as well as the values of positions and parent
       --  link, are preserved. We cannot state simply that T = T'Old as this
       --  would be the component-wise equality of SPARK, not the logical
       --  equality of T and T'Old. Instead, state explicitly which properties
       --  are preserved.
       (Contains (Values (T), V) =>
          I = Empty
          and then Values (T) = Values (T'Old)
          and then (for all J in Index_Type => Parent (T, J) = Parent (T'Old, J))
          and then (for all J in Index_Type =>
                     (if Parent (T, J) /= Empty
                      then Position (T, J) = Position (T'Old, J)))
          and then (for all J in Index_Type =>
                     (if Model (T) (J).K then Model (T'Old) (J).K))
          and then (for all J in Index_Type =>
                     (if Model (T'Old) (J).K then Model (T) (J).K))
          and then (for all J in Index_Type =>
                        (for all D in Direction =>
                           (if Model (T) (J).K then
                                Peek (T'Old, J, D) = Peek (T, J, D)))),


        --  Case 2: the tree is empty. Return a singleton tree in T, whose root
        --  is I. Value V is added to the previous (empty) set of values. The
        --  value of the parent link is preserved for all nodes.
        Size (T) = 0 =>
          I /= Empty
          and then Size (T) = 1
          and then Root (T) = I
          and then Is_Add (Values (T'Old), V, Values (T))
          and then (for all I in Index_Type =>
                     (if I /= Root (T) then not Model (T) (I).K))
          and then (for all I in Index_Type => Parent (T, I) = Parent (T'Old, I)),


        --  Case 3: the tree is not empty and V is not in it. Return in T a tree
        --  with the same root, the same nodes expect a new node I, whose size
        --  has been increased by 1. Value V is added to the previous set of
        --  values. The value of position and parent link is preserved for all
        --  nodes except I.
        others =>
          I /= Empty
          and then Model (T) (I).K
          and then Root (T) = Root (T'Old)
          and then Size (T) = Size (T)'Old + 1
          and then Is_Add (Values (T'Old), V, Values (T))
          and then (for all J in Index_Type =>
                     (if I /= J then Parent (T, J) = Parent (T'Old, J)))
          and then (for all J in Index_Type =>
                     (if I /= J and Parent (T, J) /= Empty
                      then Position (T, J) = Position (T'Old, J)))
          and then (for all J in Index_Type =>
                     (if I /= J and Model (T) (J).K
                      then Model (T'Old) (J).K))
          and then (for all J in Index_Type =>
                     (if Model (T'Old) (J).K
                      then Model (T) (J).K and I /= J))
          and then (for all D in Direction => Peek (T, I, D) = 0)
          and then Peek (T'Old, Parent (T, I), Position (T, I)) = 0
          and then (for all J in Index_Type =>
                        (for all D in Direction =>
                           (if Model (T) (J).K
                            and I /= J
                            and (J /= Parent (T, I) or D /= Position (T, I))
                            then
                                Peek (T'Old, J, D) = Peek (T, J, D)))));


   procedure Left_Rotate (T : in out Search_Tree; I : Index_Type) with
     Pre  =>
       --  The tree is not empty
       Size (T) > 0


       --  I is in the tree
       and then Model (T) (I).K


       --  I has a right child
       and then Peek (T, I, Right) /= Empty,
     Post =>
       --  The size of the tree is preserved
       Size (T) = Size (T)'Old


       --  When rotating on the root, the right child of the root becomes the
       --  new root, otherwise the root is preserved.
       and then (if Root (T)'Old /= I then Root (T) = Root (T)'Old
                 else Root (T) = Peek (T'Old, I, Right))


       --  The right child of I becomes it parent, of which I becomes the left
       --  child.
       and then Parent (T, I) = Peek (T'Old, I, Right)
       and then Position (T, I) = Left


       --  The parent of I becomes the parent of its previous right child, and
       --  unless I was root, as the same child (left or right).
       and then Parent (T, Peek (T'Old, I, Right)) = Parent (T'Old, I)
       and then (if Root (T)'Old /= I
                 then Position (T, Peek (T'Old, I, Right)) = Position (T'Old, I))


       --  During the rotation, the subtree located at the left child of the
       --  right child of I becomes the right child of I.
       and then (if Peek (T'Old, Peek (T'Old, I, Right), Left) /= Empty
                 then Parent (T, Peek (T'Old, Peek (T'Old, I, Right), Left)) = I
                   and then Position (T, Peek (T'Old, Peek (T'Old, I, Right), Left)) = Right)


       --  Except for I, its right child, and the left child of its right child,
       --  the parent link is preserved.
       and then (for all J in Index_Type =>
                  (if J /= I
                     and then (Parent (T'Old, J) /= I
                               or else Position (T'Old, J) = Left)
                     and then (Parent (T'Old, J) = Empty
                               or else Parent (T'Old, Parent (T'Old, J)) /= I
                               or else Position (T'Old, Parent (T'Old, J)) = Left
                               or else Position (T'Old, J) = Right)
                   then Parent (T, J) = Parent (T'Old, J)))


       --  Except for I, its right child, and the left child of its right child,
       --  the position is preserved.
       and then (for all J in Index_Type =>
                  (if J /= I
                     and then Parent (T'Old, J) /= Empty
                     and then (Parent (T'Old, J) /= I
                               or else Position (T'Old, J) = Left)
                     and then (Parent (T'Old, Parent (T'Old, J)) /= I
                               or else Position (T'Old, Parent (T'Old, J)) = Left
                               or else Position (T'Old, J) = Right)
                   then Position (T, J) = Position (T'Old, J)))


       --  Nodes in the tree are preserved. We are using two implications
       --  instead of a Boolean equality, as automatic provers work better with
       --  two implications, which they can instantiate just for the directions
       --  of interest in a given proof.
       and then (for all J in Index_Type =>
                  (if Model (T) (J).K then Model (T'Old) (J).K))
       and then (for all J in Index_Type =>
                  (if Model (T'Old) (J).K then Model (T) (J).K))


       --  Values are preserved
       and then Values (T) = Values (T'Old)


       --  The right child of I now is the former left child of I's former right
       --  child.
       and then Peek (T, I, Right) = Peek (T'Old, Peek (T'Old, I, Right), Left)


       --  I's former right child has taken its place in the tree
       and then (if Parent (T'Old, I) /= 0 then
                      Peek (T, Parent (T'Old, I), Position (T'Old, I)) =
                      Peek (T'Old, I, Right))


       --  I is now the left child of its former right child
       and then Peek (T, Peek (T'Old, I, Right), Left) = I


       --  Other children are preserved
       and then
       (for all J in Index_Type =>
           (for all D in Direction =>
               (if (J /= I or D = Left)
                 and (J /= Parent (T'Old, I) or else D /= Position (T'Old, I))
                 and (J /= Peek (T'Old, I, Right) or else D = Right)
                 and Model (T) (J).K
                then Peek (T, J, D) = Peek (T'Old, J, D))));


   procedure Right_Rotate (T : in out Search_Tree; I : Index_Type) with
     Pre  =>
       --  The tree is not empty
       Size (T) > 0


       --  I is in the tree
       and then Model (T) (I).K


       --  I has a left child
       and then Peek (T, I, Left) /= Empty,
     Post =>
       --  The size of the tree is preserved
       Size (T) = Size (T)'Old


       --  When rotating on the root, the left child of the root becomes the
       --  new root, otherwise the root is preserved.
       and then (if Root (T)'Old /= I then Root (T) = Root (T)'Old
                 else Root (T) = Peek (T'Old, I, Left))


       --  The left child of I becomes it parent, of which I becomes the right
       --  child.
       and then Parent (T, I) = Peek (T'Old, I, Left)
       and then Position (T, I) = Right


       --  The parent of I becomes the parent of its previous left child, and
       --  unless I was root, as the same child (left or right).
       and then Parent (T, Peek (T'Old, I, Left)) = Parent (T'Old, I)
       and then (if Root (T)'Old /= I
                 then Position (T, Peek (T'Old, I, Left)) = Position (T'Old, I))


       --  During the rotation, the subtree located at the right child of the
       --  left child of I becomes the left child of I.
       and then (if Peek (T'Old, Peek (T'Old, I, Left), Right) /= Empty
                 then Parent (T, Peek (T'Old, Peek (T'Old, I, Left), Right)) = I
                   and then Position (T, Peek (T'Old, Peek (T'Old, I, Left), Right)) = Left)


       --  Except for I, its left child, and the right child of its left child,
       --  the parent link is preserved.
       and then (for all J in Index_Type =>
                  (if J /= I
                     and then (Parent (T'Old, J) /= I
                               or else Position (T'Old, J) = Right)
                     and then (Parent (T'Old, J) = Empty
                               or else Parent (T'Old, Parent (T'Old, J)) /= I
                               or else Position (T'Old, Parent (T'Old, J)) = Right
                               or else Position (T'Old, J) = Left)
                   then Parent (T, J) = Parent (T'Old, J)))


       --  Except for I, its left child, and the right child of its left child,
       --  the position is preserved.
       and then (for all J in Index_Type =>
                  (if J /= I
                     and then Parent (T'Old, J) /= Empty
                     and then (Parent (T'Old, J) /= I
                               or else Position (T'Old, J) = Right)
                     and then (Parent (T'Old, Parent (T'Old, J)) /= I
                               or else Position (T'Old, Parent (T'Old, J)) = Right
                               or else Position (T'Old, J) = Left)
                   then Position (T, J) = Position (T'Old, J)))


       --  Nodes in the tree are preserved. We are using two implications
       --  instead of a Boolean equality, as automatic provers work better with
       --  two implications, which they can instantiate just for the directions
       --  of interest in a given proof.
       and then (for all J in Index_Type =>
                  (if Model (T) (J).K then Model (T'Old) (J).K))
       and then (for all J in Index_Type =>
                  (if Model (T'Old) (J).K then Model (T) (J).K))


       --  Values are preserved
       and then Values (T) = Values (T'Old)


       --  The left child of I now is the former right child of I's former left
       --  child.
       and then Peek (T, I, Left) = Peek (T'Old, Peek (T'Old, I, Left), Right)


       --  I's former left child has taken its place in the tree
       and then (if Parent (T'Old, I) /= 0 then
                      Peek (T, Parent (T'Old, I), Position (T'Old, I)) =
                      Peek (T'Old, I, Left))


       --  I is now the right child of its former left child
       and then Peek (T, Peek (T'Old, I, Left), Right) = I


       --  Other children are preserved
       and then
       (for all J in Index_Type =>
           (for all D in Direction =>
               (if (J /= I or D = Right)
                 and (J /= Parent (T'Old, I) or else D /= Position (T'Old, I))
                 and (J /= Peek (T'Old, I, Left) or else D = Left)
                 and Model (T) (J).K
                then Peek (T, J, D) = Peek (T'Old, J, D))));


private


   type Value_Array is array (Index_Type) of Natural;


   type Search_Tree is record
      Root   : Extended_Index_Type := Empty;
      Struct : Forest;
      Values : Value_Array := (others => 0);
   end record
     with Type_Invariant =>
       (if Size (Struct) = 0 then Root = Empty
        else Root /= Empty
          and then Valid_Root (Struct, Root)
          and then Ordered_Leafs (Struct, Root, Values));


   function Ordered_Leafs (F : Forest; Root : Index_Type; Values : Value_Array)
     return Boolean
   --  Returns True iff the values in the tree rooted at Root are ordered in
   --  strictly ascending chain in a left-to-right depth-first traversal of
   --  the tree.


   with
     Ghost,
     Pre => Valid_Root (F, Root);


   function Model (T : Search_Tree) return Model_Type is
     (Model (T.Struct, T.Root));


   function Size (T : Search_Tree) return Extended_Index_Type is
     (Size (T.Struct));


   function Root (T : Search_Tree) return Index_Type is
     (T.Root);


   function Parent (T: Search_Tree; I : Index_Type) return Extended_Index_Type is
     (Parent (T.Struct, I));


   function Position (T: Search_Tree; I : Index_Type) return Direction is
     (Position (T.Struct, I));


   function Peek (T : Search_Tree; I : Index_Type; D : Direction) return Extended_Index_Type is
     (Peek (T.Struct, I, D));


end Search_Trees;

****************************************************************

From: Edmond Schonberg
Sent: Friday, February  9, 2018  12:58 PM

I hesitate to wade into this heated discussion, but I’d like to try to
summarize my view on the possible approaches to Reduce operations:

a) I agree with Tuck that iterators are an extremely useful addition to Ada,
and that iterators over containers add  a lot of expressive power to the
language. Iterators by themselves describe a stream of values that can be used
to construct an array or a container object, or can be an argument to a
reduction operation. As such they are a natural prefix for an attribute, if
this is the choice syntax. That prefix may look like an aggregate but of course
there is no need to build one, and therefore no type needs to be specified and
the resolution rules are the same as for any other iterator construct.

b) The specification of the default of the reduction by means of a bracketed
value looks unappealing to all but to Tuck. Making the nature of the construct
depend on an <init> buried somewhere in an expression is error-prone, requires
some arbitrary look-up (for the reader, parsing is trivial anyway) and does
not advertise properly the purpose of the construct.  Attribute ‘Reduce seems
like a simpler and clearer alternative.

c) A stream generated by an iterator can be fed into a reduction operation, but
there will be reductions performed over already constructed composite objects,
and thus the prefix of ‘Reduce can also be an expression that yields a type
with defined iterator aspects.  This makes the iterator itself implicit without
obscuring the purpose of the construct.

d) Filters over iterators are familiar from other languages, are indispensable
in mathematical notation, and should be part of the language: they are
intuitive, expressive, and trivial to implement.

e) Container aggregates fall naturally out of these iterators. We do need to
generalize named associations to handle map containers of course, and Tuck has
already proposed such. Cross-product iterators, i.e. nested iterators to
describe multi-dimensional constructions, are useful but I’m not sure the
syntax is clear enough.

My two pesetas ...

****************************************************************

From: Tucker Taft
Sent: Friday, February  9, 2018  3:28 PM

Thanks, Ed, for wading in.  I agree with you.  In particular, I understand the
opposition (by some ;-) to the "<initial_val>" syntax, and would be happy with
some potentially more visible alternative such as "{initial_val}" or
"<<initial_val>>".  Alternatively, the aggregate'Reduce syntax also works, so
long as the overload resolution rules for aggregate'Reduce are appropriate for
the purpose, and don't require the definition of a named type for the
pseudo-aggregate just so you can use qualified-expression syntax.  It also
presumes we define container aggregates.

Both syntactic approaches rely on having iterators with filters.  I have no
particular problem with allowing "when" clauses on more kinds of statements,
but that seems to me (but not to everyone, I guess) rather orthogonal to their
role as filters.  Their role as filters is closer to their role as entry-body
guards, than to the existing use of "when" clauses on an "exit" statement.
But if horse-trading implies we get raise...when as a side-effect of adding
filters, so be it.

JP's proposal, which puts the aggregate inside the parameter list (after
'Reduce) potentially makes special overload resolution rules harder, as it
only applies to the first parameter of potentially multiple parameters,
whereas there is exactly one prefix to an attribute, and we already know that
some attributes (such as 'Access) have special overloading rules for their
prefix. I also happen to think it is more natural to have the iterator appear
first, though that is clearly a personal preference.  That is, I like to see
the generator of the values first, and then the operation to be performed on
the generated values.  Finally, putting the iterator inside the parameter list
would result in two levels of parentheses in most cases, which adds to the
unpleasantness.

****************************************************************

From: Randy Brukardt
Sent: Friday, February  9, 2018  4:44 PM

> I hesitate to wade into this heated discussion, but I'd like to try to
> summarize my view on the possible approaches to Reduce operations:

Thanks for doing so. Someone has to be a voice of moderation here. :-)

I think it is interesting that I view your message as generally agreeing with
me (particularly on the most critical issues -- the silly <0> syntax and the
use of an attribute on actual array/container objects), while Tucker seems to
view the same (probably with a vastly different view of what is most
critical :-). Hopefully, that means that our differences are really not that
significant.

...
> e) Container aggregates fall naturally out of these iterators. We do
> need to generalize named associations to handle map containers of
> course, and Tuck has already proposed such. Cross-product iterators,
> i.e. nested iterators to describe multi-dimensional constructions, are
> useful but I'm not sure the syntax is clear enough.

This I (mildly) disagree with. Perhaps this is mainly terminology. I see
container aggregates and "these iterators" as completely orthogonal issues.
Probably they would end up having similar syntax, but clearly the resolution
rules are completely different (the type coming from context for an aggregate,
and the type coming from ??? for "these iterators".

(I'm quoting "these iterators" as they have no obvious name, not to suggest
disapproval.)

I also note that anything done for "these iterators" and container aggregates
also has to be done for array aggregates. (Arrays and containers -- especially
vector containers -- being as equivalent as possible.) Assuming that is the
case, "these iterators" are closer to an array aggregate than a container
aggregate (a container needing a complex iterator, while for an array,
iterating is simple indexing) -- but probably they'd be better being defined
completely separately.

In any case, I've said repeatedly that I'm open to looking at ease-of-use
improvements (which is all this is IMO). My roadmap has always been:
(1) Define 'Reduce using as few new concepts as possible and with no new syntax.
(2) Create examples using that 'Reduce.
(3) Determine what is problematical about those examples (specifically in
    ease-of-use).
(4) Propose an extension (or extensions) to address those problems.
(5) Create revised examples.
(6) Repeat as necessary.

The advantage of such a roadmap is that one always has something usable (if
not ideal), and with the deadline approaching, we need to take similar
approaches to everything we do. (Incremental development is the only sane form
of development. :-)

Tucker seems to have been leaping to the end, or leaping to the conclusion
that the first step is the only one, or something. He hasn't even commented
on the examples that Brad and I built to demonstrate that his examples can
easily be written using the proposed Reduce attribute (and that after
complaining that we hadn't written any such examples).

He doesn't seem to realize that I've spent a lot of time thinking about the
bare aggregate case; the problem is that I can't make it work, because the
type of the prefix is needed -- it's what determines the type/profiles of the
various other parts of the attribute. I don't want to invent new kinds of
type-less things (that's not Ada, and besides resolution is complex enough).
So I was hoping someone more invested (like, say, Mr. Taft) would make an
actual proposal for such an extension, rather than just going back to square
one and complaining.

****************************************************************

From: Randy Brukardt
Sent: Friday, February  9, 2018  4:53 PM

>This claim of little value for quantified expressions may be based on
>your own coding style, or anticipation thereof, but we have a number of
>folks actually using Ada 2012, and SPARK 2014 which is based on it, and
>quantified expressions are used heavily.

Sigh. This example demonstrates all that is wrong with those existing
mechanisms: little or no abstraction, giant expressions that are totally
useless to a (human) client, loss of the #1 goal of Ada (emphasis on
programming as a HUMAN activity), and more.

Going further in a bad direction is not going to be helpful to Ada's
underlying goals - the human element in programming.

This is too off-topic to discuss further (and moreover, it is irrelevant, as
we're stuck with these things and certainly they are going to appear in
aggregates as well).

****************************************************************

From: Randy Brukardt
Sent: Friday, February  9, 2018  5:57 PM

...
> Building blocks always produce somewhat more verbose representations
> for any given special case (e.g. reducing an array), but ultimately
> provide much more power to the programmer when the programmer wants to
> go beyond a specially supported special case.  I can see that the
> 'Reduce attribute works nicely for the array special case, but fails
> when the elements you want to reduce represent some other sort of
> sequence of items, such as that which can be produced by a combination
> of an iterator, an optional filter, and an Ada expression.

We're going to have to agree to disagree on what is and is not a building
block.

But the real problem here is the claim that the 'Reduce attribute "fails"
when the elements are "some other sequence of items". That's the definition
of an array aggregate! Brad and I had no real difficulty writing your examples
using the proposed simple features (and even without the filter).
With the filter, many of them didn't even need the 'Reduce.

You have not made any comment on the actual examples, which makes it
impossible to answer any criticisms, propose any additional features or
anything else. I've never expected that the current proposal would be the
only thing we'd consider, but just saying that it "fails" without any attempt
to explain why is not progressing the discussion at all.

What *I* find from the examples is two-fold: one, the restriction that all of
the operands of the reducer expression are the same type is too restrictive,
and two, array aggregates need the addition of the other kinds of container
iterators in order that the better represent sequences. I also find the need
to declare an appropriate array type annoying, but I'm unable to find any sane
semantics that avoids it.

> I think it is a big loss to only support reduction over an existing
> concrete object with a single binary function.  By using more flexible
> syntax, we can support reduction of a sequence of values produced in
> many different ways, without ever having to produce in memory a single
> concrete object representing the sequence.

Again, you're totally ignoring what is actually proposed. The actual proposal
includes (or is intended to, I think Brad may have left it out) a
notwithstanding allowing interleaving of the evaluation of the prefix and
'Reduce, including if the prefix is an aggregate, interleaving of the
evaluation of the individual associations. (Aside: gotta be careful that
doesn't allow evaluation out-of-order for array delta aggregates.) And Brad
already included a rule saying that the temporary object of such a prefix need
not actually exist.

There never, ever has been any intent that the prefix has to be materialized.
On top of which, any compiler could apply an as-if optimization to not
materialize an object that has no other lifetime. So this is simply not a real
issue and it is intensely frustrating to have to you raise this bogeyman
repeatedly.

> The original discussion of 'Reduce had it combined with a container
> aggregate, and at that point I presumed we were talking about a very
> special interpretation of the container aggregate, where it didn't
> have a type, it just represented a sequence of values that are to be
> reduced.  But that presumes we agree on the syntax of a container
> aggregate and the special significance of 'Reduce applied to such an
> aggregate.
>  But if we start insisting that this is a "normal" attribute with
> normal rules such as being able to resolve the prefix to a single,
> pre-existing type, then the idea falls apart in my view.  So if we
> have heartburn with giving the aggregate'Reduce attribute special
> overloading rules, etc., then we should go back to some sort of new
> syntax rather than lose all of the power of using an iterator, filter,
> etc., which the container aggregate would provide.

The insight that I had is that this operation really doesn't have anything to
do with a container at all, as you say, it just is a sequence of items.
And one way to represent a sequence of items in Ada is as an array. Once one
just looks at this as an array (only) operation, it becomes much simpler.

Moreover, this has nothing whatsoever to do with container aggregates (whether
we have them at all, etc.) Rather, it is really about *array* aggregates, and
potential expansions of what they can represent.

If we have to do special resolution, I view that as operating on something
with the syntax of an array aggregate, but that is not really an array
aggregate. But I think I need to see more realistic examples to be convinced
of the need, especially as it requires a potentially problematical syntax
change.

Container aggregates are completely orthogonal, as their primary goal is to
look as closely as possible to an array aggregate. (For vectors, that match
should be almost exact). They have nothing that I can see to do with the
prefix of this attribute, regardless of what rules we ultimately chose for it.

****************************************************************

From: John Barnes
Sent: Saturday, February 10, 2018  8:53 AM

I fear I have been distracted by giving lectures at Oxford to concentrate on
ARG.

I must say however, that I am concerned about the overuse of functional
notation. In small lumps it is OK. But readers need to "take a breath" as it
were and that is difficult if one's brain is heavily nested in a giant
functional lump.

I am also being triggered in that a student just submitted a whole lump of
Mathematica for me to look at. It's readable in small doses but indigestible
when the brackets mount up.

One of the design goals for Ada was to ensure that it was readable. And
remember that a program is read many more times than it is written. Jean
was right much of the time.

(I am also worried that (deo volente) when I rewrite my book for Ada 2020 that
it might be nearly 2000 pages.)

****************************************************************

From: Tucker Taft
Sent: Saturday, February 10, 2018  11:21 AM

Luckily most people don't try to prove full functional correctness.  I agree
you wouldn't want most of your specs to look like this.  Much more common is
to just specify some simple properties that should be true before or after an
operation, without trying to specify every detail of the effects.  But even
simple properties are often best captured with a quantified expression.  The
classic is the property of being ordered.  Most specs for sorting don't bother
requiring that the sorted output be a permutation of the unsorted input, but
they do often want to express succinctly that the result is ordered.

****************************************************************

From: Eddmond Schonberg
Sent: Sunday, February 11, 2018  2:27 PM

> He [Tuck] doesn't seem to realize that I've spent a lot of time
> thinking about the bare aggregate case; the problem is that I can't
> make it work, because the type of the prefix is needed -- it's what
> determines the type/profiles of the various other parts of the
> attribute. I don't want to invent new kinds of type-less things
> (that's not Ada, and besides resolution is complex enough). So I was
> hoping someone more invested (like, say, Mr. Taft) would make an
> actual proposal for such an extension, rather than just going back to square
> one and complaining.

I agree to the need for simple resolution rules for this new construct. I may
be inevitable to use a qualification for the prefix of ‘Reduce, but let me
propose another simplification (hope springs eternal...).  So far we have
described the prefix either as an object of an itterable type (array or
container implementing iterator interface) or else as an aggregate.

i just don’t see an arbitrary aggregate as a useful construct here: if you
need to apply a reduction to some random collection of values you will have
built that collection previously as an object. Furthermore, an aggregate with
a miscellaneous collection of values, specified positionally or by
associations (named or iterated) will not lend itself to any kind of parallel
processing. So I would propose to restrict the prefix to a parenthesized
iterated component association, which is the best model we have for a
generated sequence:

  (for X of C when P (X) => F (X))’Reduce (Op, init)

The needed resolution rule is now for the expression in the association, not
for the aggregate, which suggests two simple resolution rules:

a) the expected type of the expression is that of the default value (the second
   argument of the attribute)

b) The the expression must have a single type independent of context. This can
   be obtained by means of qualified expression when needed.

This way there is no misleading suggestion that the prefix is in any way a
constructed object. I suggest that this covers all useful cases of explicit
sequences. The other form:

     C’Reduce (Op, Init)

can have a similar rule :  the object C must have a unique type (and a
qualification in the case is a small burden.

****************************************************************

From: Tucker Taft
Sent: Sunday, February 11, 2018  6:06 PM

I have what I believe are pretty simple and implementable overload resolutions
rules for pseudo-aggregate'Reduce, without needing a qualified expression.  I
am reviewing them with Randy before sending them to the ARG as a whole.

****************************************************************

From: Edward Fish
Sent: Monday, February 12, 2018  12:41 PM

Some of this discussion has reminded me of Guy Steele's talk Four Solutions to
a Trivial Problem - especially around 40 min when he's talking about algebraic
properties and then goes into parallel properties, eventually drawing the
parallel between garbage-collection and parallelization (the former being
assigning data to memory, the latter assigning work [code] to processors).

It also occurs to me that the application/promotion of ranges into a
first-class Ada citizen might help; perhaps it might not be required, but as
a 1st class construct we could avoid the temptation to use "for" as a map. (He
makes the point that C-style "for( ; ; )" is possibly the worst way to
represent a map, but that also applies to what we're trying to do here;
granted that there is a big distinction between syntax and semantics, but
there's a reason that Java 8's streams have several tricks/libraries to work
with indexed elements.)

****************************************************************

From: Tucker Taft
Sent: Monday, February 12, 2018  2:35 PM

Hearing no objection from Randy, here is the overload resolution I would
recommend for pseudo-aggregate'Reduce(...):

We have talked about allowing a syntax of:

  aggregate'Reduce(...)

for our reduction expression (in addition to obj'Reduce()), but have not
specified the new syntax rules that will allow this, nor specified the
overload resolution that will make it usable without introducing a bogus
type qualifier.  The intent I believe is that we are just adopting the
aggregate *syntax*, presuming a more generalized use of iterators and perhaps
filters in the aggregate, but there is no expectation that an actual object
will be constructed. The new syntax might better be described as:

 (for iterator [filter] => elem_expression)'Reduce(combiner, initial-val)

since we are concerned about doing overload resolution on the parts of the
aggregate, not on the aggregate as a whole.

The easiest thing to require is that a 'Reduce attribute reference have a
single expected type.  Once we require that, then that provides the expected
type for the initial-val.  We don't really need to require that combiner be
symmetrical so long as we agree that the initial-val is the first operand.
In any case, the overload resolution problem is now equivalent to:

  X : expected_type := combiner(expected-type'(initial-val), elem_expression);

This seems pretty straightforward to resolve.  The iterator/filter are somewhat
irrelevant to this, and need to be resolved on their own without external
context.

Comments welcome...

****************************************************************

From: Edmond Schonberg
Sent: Monday, February 12, 2018  2:50 PM

We seem to be in full agreement!

****************************************************************

From: Tucker Taft
Sent: Monday, February 12, 2018  3:00 PM

Very glad to hear.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, February 14, 2018  8:56 PM

Tucker and I have been having a lengthy private back-and-forth on whether it
is necessary to explicitly have a separate "parallel_reduce" attribute.
(We've also hashed out some other issues, mostly covered elsewhere.) I think
we understand each others position (not that either of us have convinced the
other of much). Tucker clearly is much more optimistic about what parallelism
can be automatically inserted for "normal" Ada code.


Here is a quick summary of my position for the record. I believe that it is
useful/necessary to be able to include "parallel" for the following reasons:

(1) As-if optimization have to have a strong probability of making the code
faster. It is very difficult to prove that for parallelism, especially if the
overhead is relatively high (as it is on most normal OSes).

(2) (1) means that compiler writers have little incentive to provide any
parallelism for a construct like this. Including a keyword/separate attribute
prods them to support it.

(3) Marketing: we are trying to expand Ada's marketplace with easy-to-use and
safe parallel constructs. It helps to show even the causal reader that this is
happening by having the word "parallel" somewhere. (I realize that not everyone
subscribes to this one much, recalling Ichbiah's shoe. :-)

(4) Insurance: if the user wants to force parallel behavior, they will get an
error if that behavior cannot be achieved.

(5) Level of parallelism part 1: At least on conventional targets,
coarse-grained parallelism is better than fine-grained parallelism, simply
because there is much less overhead. Brad had noted this effect, saying that
you want to put the parallelism at the highest-level possible (that is, the
highest level where parallelism is possible). If the programmer has explicitly
introduced parallelism at a high-level, the compiler introducing it
automatically at a lower level just adds overhead.

(6) Level of parallelism part 2: If the user has determined that the proper
place to introduce parallelism is a Reduce attribute, they want to add
parallelism there (and only there). That requires some sort of marking.

----

Some explanation of some of these:

(1) was explained in great detail in a previous ARG message. Basically, it is
necessary to ensure that the overhead of adding parallelism is not bigger than
the possible gain from parallel execution. This requires a relatively
expensive combiner function (or expensive iterator target expression in the
iterator/aggregate form), or a large number of iterations. However, most
complex functions are separately compiled, so one can't look into the body to
find out the complexity. And most iterations are dynamic in some way, so the
compiler doesn't really know how many of those are there, either. It is much
more likely that the compiler could be sure that parallelization does NOT pay
(the combiner could be a predefined operator or a simple expression function).

Tucker suggests that a compiler that would do such automatic parallelization
would only do so under control of a compiler switch. In such a case, you could
apply parallelization always unless you know it doesn't work -- secure in
knowing that the user can turn off that silly switch when it turns their
program from slow into a slug. :-)

(2), (3), and (4) are self-explanatory.

(5) comes from the notion that many problems are naturally parallel.

Let me start with a real-world example. I have a program that tries to find
"optimal" coefficients that matches observed results to estimated results.
Essentially, there are three low-level estimated values that are combined into
a score, and I would like the estimated score to be as close as possible to
the observed scores for some tested possibilities. (The ultimate idea being
to use these estimates to find new possibilities to test.)

It does this by testing various coefficients using a "rolling ball"
algorithm. A single set of coefficients can be considered a trial; the program
uses those to make a full set of estimates corresponding to the observed data
and then calculates statistical information on it (mean/variance/correlation)
-- the idea is to produce a score such that the differences of all three of
these parameters from that of the observed data is minimized.

The program starts with a survey pass (since there is no good idea of what
"good" coefficients might look like). The survey pass runs about 1.5 million
trials on the 7 coefficients. I keep the top 100 scores (and the associated
coefficients) in an array, the rest are just discarded. This takes about 2
hours per pass using one core on my current computer.

The statistics gathering code could be written as a series of reduction
expressions (this is the code example I previously gave Brad). For the sake
of discussion, let's assume they were written that way (even though that would
make the code run quite a bit slower because it would have a lot of additional
memory usage).

Since each trial is logically and programmatically separate from any other,
parallelism would be best introduced at the point of the trials loop. If that
was replaced by a parallel loop (and presumably, the results managed by a
protected object), one could easily use all of the processing elements
available to the program. (I am not expecting to have a machine with 2 million
cores on my desktop anytime soon. ;-)

Note that the processing needed for each trial is almost exactly the same;
there is almost no conditional code in trials (and most of what is there is
part of the process of saving the result, providing progress indications, and
for debugging). So each chunk would take about the same amount of time to
execute; no load-balancing is needed for this problem.

Therefore, any parallelism introduced for the reduction expressions would
simply add overhead -- all of the processing cores are already busy nearly
100% of the time. As an as-if optimization, parallel execution is a complete
failure in this case -- there is no chance of it providing any benefit at all.
Since the statistics could easily be in a separate unit from the parallel
loop, in general the compiler cannot know whether or not explicit
arallelism is in use.

Essentially, if the parallelism is introduced manually as part of modeling the
problem, any further parallelization has little value at absolute best, and in
most cases just will make the program run slower.

Both (1) and (5) can be mitigated somewhat by various techniques:
controlling this behavior with compilation switches, or perhaps better with
Ada-specific bind-time program generation. (At bind-time, the bodies of all
Ada functions will be available, and the presence or absence of explicit
tasking/parallelism will be easily determinable.)

Parallelism is relatively natural in many problems. Some examples from my
experience: the web server and e-mail filter both use pools of worker tasks
that are assigned jobs as they come in (from the public internet). Each job is
independent of any other, just depending on some global settings. One could
have used a parallel loop in Ada 2020 to get a similar effect (although that
complicate logging and debugging, as well as setting changes). In the
Janus/Ada compiler, the main optimizer works on chunks of intermediate code
never larger than a subprogram body. It's easy to imagine optimizing several
chunks at once, as each chunk is independent of any other.

Auto-generation of parallelism seems better used on algorithms where the
parallelism is not obvious. Tucker claims (probably correctly) that a
compiler/runtime pair knows a lot more about the target, load balancing, and
the like than any programmer could. So if the problem itself isn't naturally
parallel, trying to manually do parts in parallel probably won't work out very
well. OTOH, a lot of Ada code (and code in general) is naturally sequential,
adding dependencies even where they could very well not have existed. It's
hard for me to imagine a compiler smart enough to be able to untangle needed
sequential behavior from that which is accidentally introduced by the natural
organization of the programming language. That necessarily will reduce the
possible implicit parallelism.

Finally, point (6). If a problem is naturally parallel, one will want to write
parallel code for that high-level parallelism -- and only that code.
If that code turns out to be best expressed as a Reduce attribute, they
certainly will not be happy to have to turn it into a parallel loop instead
in order to require the generation of parallel behavior.

Tucker does point out that the introduction of "parallel" anywhere really
doesn't require any parallel execution; a compiler could run everything on a
single processor (and might be forced to do that by OS-level allocations).
But it does require the safety checks that everything is allowed to be run in
parallel (4); the user will not be writing code that can only be executed
sequentially when they intended parallel execution.

---

While I find none of these points compelling by themselves, the combination
makes it pretty clear to me that we do want a separate parallel version of
the Reduce attribute. But we don't want the regular Reduce to have any rules
that would prevent automatic parallelization in the future (presuming that it
could have been a Parallel_Reduce).

****************************************************************

From: Jeff Cousins
Sent: Saturday, February 17, 2018  2:46 PM

A subjective opinion, but I find it hard to believe that a compiler vendor
would provide a difficult optimisation if it were optional, I think we need
both ‘Reduce and ‘Parallel_Reduce.

I visited the IET (Institute of Engineering & Technology) yesterday and looked
in its library for books on parallelisation.  I’m not sure if I reached any
conclusion other than C++ is a horrible language, but I found TBB quite
interesting.

****************************************************************

From: Bob Duff
Sent: Saturday, February 17, 2018  5:17 PM

> A subjective opinion, but I find it hard to believe that a compiler
> vendor would provide a difficult optimisation if it were optional, I
> think we need both ‘Reduce and ‘Parallel_Reduce.

But optimizations are always optional.  What's to stop an implementer from
making ‘Reduce and ‘Parallel_Reduce generate exactly the same code?  Money,
that's what.  ARG has no say in the matter.

****************************************************************

From: Randy Brukardt
Sent: Saturday, February 17, 2018  7:52 PM

The problem I see is that parallel execution is hardly an optimization in many
circumstances. The performance of it is just different (might be better, might
be worse). I'm not in the business of making my customers code worse, so the
only way (money or not) that I could even consider such an code generation is
if the customer somehow asks for it.

But whether or not it is a good idea is going to vary a lot for each usage, so
one needs to be able to ask for each usage. (A compiler switch - that is,
something that is per unit -- is likely to be too coarse.) Perhaps one could
say something like, maybe, 'Reduce and 'Parallel_Reduce. ;-)

****************************************************************

From: Tucker Taft
Sent: Saturday, February 17, 2018  8:22 PM

Any "optimization" is always subject to specific conditions. Putting a given
variable in a register is an example, in that if the variable is too long
lived, it can cause too much register pressure, ultimately making surrounding
code less efficient.  I have given up on this battle, but I predict that in
the long run this distinction between 'Reduce and 'Parallel_Reduce will be
seen in the same light as C's old "register" annotation on local variables.
It represents a claim that there are no bad usages (e.g. 'Address on a
"register," data dependencies for 'Parallel_Reduce), but other than that, the
compiler will ignore the distinction since it can see the bad usages itself
(presuming Global annotations on the "combiner" in the latter case).

****************************************************************

From: Randy Brukardt
Sent: Saturday, February 17, 2018  8:38 PM

Of course, one can say the same thing about any construct. Therefore, we don't
need parallel blocks or loops either. Indeed, we don't need to do anything at
all to enable parallelism, since any sufficiently intelligent compiler can
always introduce it just where needed. So Ada 2012 is good enough and we can
all do something else. :-)

Certainly if "parallel_reduce" ends up like "register", you certainly would
have to say the same about parallel loops, and probably parallel blocks and
even tasks. I'm not too worried; maybe Ada 2051 will put these guys into Annex
J but I think it will be a long time before compilers are that good.
(I think it is much more likely we're not using compilers at all by then. :-)

****************************************************************

From: Tucker Taft
Sent: Saturday, February 17, 2018  11:57 PM

I was drawing the distinction between individual expressions, where the
compiler does not generally expect direction for how to do register
allocation, instruction scheduling, etc. vs. multi-statement constructs like
loops where some programmer control can be useful.  For this particular case,
the more appropriate place to put guidance would be on the "combiner"
operation rather than on each use of it, vaguely analogous to putting a pragma
Inline on a subprogram rather than on each call.

Many combiners are going to be builtins like Integer "+" about which the
compiler presumably knows what it needs to know.  For user-defined combiners
(e.g. some kind of set "union" operation), some measure of typical execution
cost might be helpful, as might some measure of the expense of the Next
operation of an iterator type.

Perhaps some kind of subprogram "Cost" aspect could be defined, ranging from,
say, 1 = cheap to 10 = very expensive (probably not a linear scale!  Perhaps
something like the Richter scale ;-).  Then a more global specification could
be used to indicate at what cost parallelization becomes interesting, with
lower numbers meaning more aggressive parallelization, and bigger numbers
meaning minimal parallelization.  This global number could be interpreted as
the relative cost of spawning a tasklet.  The Cost numbers wouldn't need to
have any absolute meaning -- they are merely a way to rank subprograms, and a
way to indicate at what level the expense of executing the subprogram dwarfs
the overhead of spawning a tasklet.  Each increment in Cost might represent a
factor of, say, 10 in relative cost, so 1 = 1 instruction, 2 = 10 instructions,
10 = 1Giga instructions.  If spawning a tasklet takes, say 100 instructions
(Cost=3), then you might want the compiler to consider parallelizing when the
set of subprograms involved have a Cost of >= 4.

This sort of "Cost" aspect could also be informed by profiling of the
execution of the program, and conceivably a tool could provide the cost aspects
to the compiler via a separate file of some format, which the programmer could
tweak as appropriate.

The main point of all of this is that in my view we don't want programmers
deciding at the individual reduction expression whether to parallelize or not.
Based on my experience, there will be a lot of these scattered about, and you
really don't want to get bogged down in deciding about 'Reduce vs.
'Parallel_Reduce each time you use one of these, nor going back and editing
individual expressions to do your tuning.

****************************************************************

From: Brad Moore
Sent: Saturday, February 17, 2018  1:50 PM

> I was drawing the distinction between individual expressions, where
> the compiler does not generally expect direction for how to do
> register allocation, instruction scheduling, etc. vs. multi-statement
> constructs like loops where some programmer control can be useful.
> For this particular case, the more appropriate place to put guidance
> would be on the "combiner" operation rather than on each use of it,
> vaguely analogous to putting a pragma Inline on a subprogram rather than
> on each call.

An interesting idea. The Cost aspect does appeal to me, but one problem I see
with that is if the cost value is hardcoded as an annotation to a subprogram,
then I see that it should be updated anytime a programmer makes a change to
application in that area of code, or to anything that area of code depends on.
Having to modify the sources every time one does a compile does not seem
reasonable to me, so I would expect that the cost would have to be determined
by the compiler, and the cost value would not appear as an aspect or
annotation in the source code. Even with such a compiler generated cost value,
I think there are a lot of concerns that Randy raises where a compiler would
have a difficult task, such as analysing the overall cost on all dependent
units which might be separately compiled. Also if cores are already busy due
to other tasking or parallelism being applied higher up in the code, it might
be difficult for the implementation to know if the parallelism is worthwhile
at a particular place in the  code.

One of my concerns is that at some time in the future, there may exist
exceptionally smart compilers that can apply sophisticated analysis to
determine whether parallelism should be applied or not, but I'd be concerned
that we'd be expecting to raise the bar so that all compiler vendors would
need to have this sophisticated analysis capability. If there is a way for the
programmer to explicitly state the desire for parallelism in a simple
straightforward manner, then perhaps compiler implementations that are not as
smart can still play the parallelism game.

I think typically programmers would start use the 'Reduce attribute, just as
they would write loops without the parallel keyword. If performance problems
are discovered, the optimization phase of the development cycle would look for
places to see where things can be sped up. If a programmer identifies a
'Reduce is a likely candidate for parallelization, he can try changing it to a
'Parallel_Reduce and see if that makes a beneficial difference or not.

If, down the road it all Ada compilers are smart enough to determine whether
'Reduce usages should be parallel or not, then it doesn't seem to be that big
a deal at that time to move 'Parallel_Reduce into annex J, it if truly isn't
needed anymore.

Having Parallel_Reduce now however, means that existing compilers can claim
better support for parallelism now, rather than having to wait for predictions
of future compiler technology to come true. From a marketing perspective, it
may be important to be able to make these claims now. If we have to wait for
some unknown period of time into the future, the rest of the world interested
in parallelism may have abandoned Ada for other options by then.

If one did decide they wanted to eliminate all uses of 'Parallel_Reduce from
the sources, a simple copy and replace operation could be applied, just as
deleting occurrences of the "register" keyword in C could be applied by such
an operation.

I'd be much more worried about doing this in C, however since the C
preprocessor might cause strange interactions with such an automated change to
sources, but Ada doesn't have that problem.

I think a main point to consider is that parallel reduction is really quite a
different algorithm than sequential reduction. Yes it is an optimization, but
it is also a change of algorithm, which may be worth indicating the specific
places in the source code where this alternate algorithm is desired. The
default usage I think would be 'Reduce, where the programmer wants the
compiler to decide as best it can which algorithm should be applied. But in
specific places where the programmer wants the parallelism algorithm, he
should be able to see that by looking at the call site in the code, in my
opinion.

****************************************************************

From: Erhard Ploedereder
Sent: Sunday, February 18, 2018  4:22 PM

> The main point of all of this is that in my view we don't want
> programmers deciding at the individual reduction expression whether to
> parallelize or not.  Based on my experience, there will be a lot of
> these scattered about, and you really don't want to get bogged down in
> deciding about 'Reduce vs. 'Parallel_Reduce each time you use one of
> these, nor going back and editing individual expressions to do your
> tuning.

For >50 years, we as a community have tried to create compilers that
parallelize sequential code based on "as if"-rules. We have not succeeded.
"As if" is simply too limiting. What makes anybody believe that we as Ada
folks can succeed where others have failed?

The reasonably successful models all based on "damn it, parallelize and
ignore the consequences vis-a-vis sequential code"-semantics. Of course,
if for some reason you know that the parallel code is slower than the
sequential version, then you can optimize by generating sequential code.
(Seriously.)

So, in short, I disagree. There is a need for a 'Parallel_Reduce, less to
indicate that you wish parallelization, but rather to absolve from a
requirement of "as if"-semantics of parallelized code.

****************************************************************

From: Edmond Schonberg
Sent: Sunday, February 18, 2018  5:35 PM

Indeed, I don’t think Ada can bring anything that will suddenly make
parallelization easy and effective. For a user concerned with performance,
tuning machinery is indispensable:  that means profiling tools and
annotations, which will invariably have to be target-dependent. The two
best-known langage-independent (kind of) models of distribution and parallel
computation in use today, OpenMP and OpenACC, both choose to use a pragma-like
syntax to annotate a program that uses the standard syntax of a sequential
language (Fortran, C, C++). This makes the inescapably iterative process of
tuning a program easier, because only the annotations need to be modified.
Those annotations typically carry target-specific information (number of
threads, chunk size, etc) . Concerning reduction, this suggests that a single
attribute is sufficient, and that a suitably placed pragma can be used to
indicate whether and how it should be parallelized. The compiler can warn on
the applicability of such a pragma the way it can warn on older optimization
pragmas.

From an implementation point of view there will be a big advantage in being
close to OpenMP and/or OpenACC given that several compiler frameworks (GCC,
LLVM) support these annotations.

As an aside, something that the discussion on parallelism has omitted
completely so far is memory placement, and the literature on parallel
programming makes it clear that without a way of specifying where things go
(which processor, which GPU, and when to transfer data from one to the other)
there is no hope of getting the full performance that the hardware could
supply.  Here also the pragmas proposed by Open… might be a good model to
emulate.

****************************************************************

From: Tucker Taft
Sent: Sunday, February 18, 2018  6:41 PM

> For >50 years, we as a community have tried to create compilers that
> parallelize sequential code based on "as if"-rules. We have not
> succeeded. "As if" is simply too limiting. What makes anybody believe
> that we as Ada folks can succeed where others have failed?

I think a reduction expression might be a special case, since the entire
computation is fundamentally side-effect free.

> The reasonably successful models all based on "damn it, parallelize
> and ignore the consequences vis-a-vis sequential code"-semantics. Of
> course, if for some reason you know that the parallel code is slower
> than the sequential version, then you can optimize by generating sequential code.
> (Seriously.)
>
> So, in short, I disagree. There is a need for a 'Parallel_Reduce, less
> to indicate that you wish parallelization, but rather to absolve from
> a requirement of "as if"-semantics of parallelized code.

But note (I believe) that our proposal is that 'Parallel_Reduce is illegal
(not erroneous) if it is not data-race free, and depends on the Global
annotations.  So I am not sure we are doing what you suggest.

****************************************************************

From: Randy Brukardt
Sent: Monday, February 19, 2018  9:07 PM

...
> > For >50 years, we as a community have tried to create compilers that
> > parallelize sequential code based on "as if"-rules. We have not
> > succeeded. "As if" is simply too limiting. What makes anybody
> > believe that we as Ada folks can succeed where others have failed?
>
> I think a reduction expression might be a special case, since the
> entire computation is fundamentally side-effect free.

Two problems with that:
(1) Ada 2020 is going to provide the tools to the compiler so that it can
 analyze any code for parallelizability. I don't see any reason that a
compiler that is doing such as-if code would want to limit that to just
'Reduce. It could apply parallelization to any code that met the conditions,
including loops and probably even sequential statements. So IF this is
practical (I'm skeptical), it has nothing really to do with 'Reduce, but
rather all Ada constructs (meaning that parallel loops, parallel blocks, and
even tasks are all redundant in the same way.

(2) 'Reduce is likely to be rare in most programmer's code, as the conditions
for making it useful aren't going to be easily visible to most programmers.
With the exception of a few problems that are "obviously" reduces, most of the
examples that Tucker provided are verging on "tricky" (using a reduce to
create a debugging string, for example).

> > The reasonably successful models all based on "damn it, parallelize
> > and ignore the consequences vis-a-vis sequential code"-semantics. Of
> > course, if for some reason you know that the parallel code is slower
> > than the sequential version, then you can optimize by generating
> > sequential code. (Seriously.)
> >
> > So, in short, I disagree. There is a need for a 'Parallel_Reduce,
> > less to indicate that you wish parallelization, but rather to
> > absolve from a requirement of "as if"-semantics of parallelized code.
>
> But note (I believe) that our proposal is that 'Parallel_Reduce is
> illegal (not erroneous) if it is not data-race free, and depends on
> the Global annotations.  So I am not sure we are doing what you
> suggest.

Any Ada parallelism will require that. Otherwise, the code has to be executed
purely sequentially (Ada has always said that). One could imagine a switch to
allow everything to be executed in parallel, but that by definition couldn't
follow the canonical Ada semantics.

In my view, the only parallelism that is likely to be successful for Ada is
that applied at a fairly high-level (much like one would do with Ada tasks
today). For that to work, the programmer has to identify where parallelism
can best be applied, with the compiler helping to show when there are problems
with that application. Such places should be just a handful in any given
program (ideally, just one).

Programmers sticking "parallel" all over hoping for better performance are
going to be rather disappointed, regardless of what rules we adopt. (Robert
always used to say that "most optimizations are disappointing", and that
surely is going to be the case here.) Real performance improvement is all
about finding the "hottest" code and then replacing that with a better
algorithm. Parallel execution has a place in that replacement, but it is not
any sort of panacea.

****************************************************************

From: Erhard Ploedereder
Sent: Tuesday, February 20, 2018  12:42 PM

>> For >50 years, we as a community have tried to create compilers that
>> parallelize sequential code based on "as if"-rules. We have not
>> succeeded. "As if" is simply too limiting. What makes anybody believe
>> that we as Ada folks can succeed where others have failed?

> I think a reduction expression might be a special case, since the
> entire computation is fundamentally side-effect free.

Then I misunderstood the purpose of 'Parallel_Reduce.
I thought that the parallelization also applied to the production of the
values to be reduced, e.g., applying the filters and evaluations of the
container's content values.

>> The reasonably successful models all based on "damn it, parallelize
>> and ignore the consequences vis-a-vis sequential code"-semantics.
>> Of course, if for some reason you know that the parallel code is
>> slower than the sequential version, then you can optimize by
>> generating sequential code. (Seriously.)
>>
>> So, in short, I disagree. There is a need for a 'Parallel_Reduce,
>> less to indicate that you wish parallelization, but rather to absolve
>> from a requirement of "as if"-semantics of parallelized code.

> But note (I believe) that our proposal is that 'Parallel_Reduce is
> illegal (not erroneous) if it is not data-race free, and depends on
> the Global annotations.  So I am not sure we are doing what you
> suggest.

Yes, I am afraid of that. So, any program that dares to use 'Parallel_Reduce
will be illegal until
 - compilers are up to snuff to show data-race-freeness
 - sufficient and minutely exact Global specs are provided and
   subsequently maintained to allow said compilers to prove
   absence of races (at which point compilers will KNOW that
   parallelization is possible, so that after this check there
   is no difference between 'Reduce and 'Parallel_Reduce....
   -- No, there is: the 'Parallel-version is the "bloody nuisance
   checks"-version).

Prepare for another heart of darkness in defining this well, with a net
benefit of zilch.

****************************************************************

From: Randy Brukardt
Sent: Tuesday, February 20, 2018  3:23 PM

"zilch"? I can agree, but only because that's the usual benefit of low-level
optimizations. For the vast majority of programs, they're irrelevant, and the
same is certainly true of fine-grained parallelism. As an as-if optimization,
it is impossible even in the rare cases where it might help (because there is
a substantial possibility that the code would be slower - the compiler knows
little about the function bodies and little about the number of iterations).

But we *have* to do this for marketing reasons. And it can help when it is
fairly coarse-grained. But *no one* can find data races by hand. (I know I
can't -- the web server goes catatonic periodically and I have been unable to
find any reason why.) Making it easy to have parallel execution without any
sort of checking is a guarantee of unreliable code.

I thought that the intent was that the parallelism checking would be a
"suppressible error", such that one could disable it with a pragma. (That of
course assumes that we define the concept.) So if you really think you can get
it right on your own, feel free to do so. I just hope no one does that in any
auto or aircraft software...

****************************************************************

From: Jean-Pierre Rosen
Sent: Thursday, March  1, 2018  3:54 AM

My proposal of having 'Reduce applied to the function didn't get much support.
Thinking about it, what worries me is that 'Reduce is a verb, while the name
of an entity designating a value should be a name (or adjective, or is_a...)
appropriate to the value.

I think that <expr>'Reduction(F,...) would be more palatable. What do others
think?

****************************************************************

From: Tucker Taft
Sent: Thursday, March  1, 2018  6:35 AM

Either is fine by me.

****************************************************************

From: Brad Moore
Sent: Saturday, March  3, 2018  12:07 AM

I think I still prefer 'Reduce, partly because it is a simpler word, but more
because this is associated with iteration which is an action, and can be used
in contexts where a verb reads better, particularly when the prefix involves
the word "for"

eg.

 (for I in 1 .. 10 => I**2)'Reduce("+", 0)

In the same way that I view "loop" is a verb.

  for I in 1 .. 10 loop
      Sum := @ + I**2;
  end loop


both fit the pattern;

   for x do Y;

Where "do" is substituted with "loop", or "Reduce"

While the overall expression is a value, the ' is applying an operation to the
prefix, and the effect of that attribute is about iteration.

I see an attribute like 'First as just semantically a substitution for a
value. But 'Reduce is more closer to an action semantically.

I could go either way, but my preference is still for 'Reduce.

****************************************************************

From: Jeff Cousins
Sent: Saturday, March  3, 2018  3:53 AM

>> I think that <expr>'Reduction(F,...) would be more palatable. What do
>> others think?

Reduction sounds more logical, but either is ok with me.  Maybe we should
have a straw poll some time.

****************************************************************

From: Ed Schonberg
Sent: Saturday, March  3, 2018  8:43 AM

> I could go either way, but my preference is still for ‘Reduce.

I also prefer the verb form.

****************************************************************

From: Brad Moore
Sent: Friday, May  4, 2018  5:42 PM

I have been looking at the AI regarding compare and swap and lock free
operations, and realised such operations can be another option for performing
reductions in parallel, (which is also similar to using protected objects for
such reductions).

I have implemented a set of generic Ada packages that match the capabilities of
GCC's atomic primitives.

Ordinarily, updating atomic variables have higher expense for overhead than
updating local variables, but if most of the calculations in a chunk of
iterations can be applied to variables of non-atomically updated types, and then
reductions of chunk results performed via updates using atomic or protected
operations, then the overhead of the updates can be marginalised. Also, if the
number of iterations is small enough, and the overhead of applying the reduction
on every iteration is negligible compared to other processing in each iteration
of the loop, it can be worthwhile to use atomic/lock free or protected
operations for reduction.

With this in mind, I thought it might be helpful to compare the various
possibilities for loop reduction with the proposals we have currently in the
context of producing multiple reduction results, which is an interesting case
that might draw one to consider parallel loop syntax.

Consider the case of iterating over an array to produce the Sum, Min, and Max of
all values of the array.

Using 'Reduce / 'Parallel_Reduce Syntax alone  (AI12-0242-1 Proposal), we could
write:


Sum := A'Parallel_Reduce("+", 0);
Min := A'Parallel_Reduce(Integer'Min, Integer'Last);
Max := A'Parallel_Reduce(Integer'Max, Integer'First);

or alternatively, to do all three in one reduction operation....;

type Summary is
   record
      Sum : Integer;
      Min : Integer;
      Max : Integer;
   end record;

-- Reducer function for the Reduction expression
function Reduce (L, R : Summary) return Summary is
  (Summary'(Sum => L + R,
            Min => Integer'Min (L, R),
            Max => Integer'Max (L, R)));

-- Reduction expression to compute all 3 results at once
Result : constant Summary := A'Parallel_Reduce(Reduce,
                                               Summary'(Sum => 0,
                                                        Min => Integer'Last,
                                                        Max => Integer'First));


If we include the capabilities of AI12-0190-1, Function expressions, this can be
further simplified to;

type Summary is
   record
      Sum : Integer;
      Min : Integer;
      Max : Integer;
   end record;

-- Reduction expression to compute all 3 results at once
Result : constant Summary :=
   A'Parallel_Reduce
       ((function (L, R) return (L+R, Integer'Min(L,R), Integer'Max(L,R))),
        Summary'(0, Integer'Last, Integer'First));

Note: In the first case, we are iterating through the array 3 times, in the
second (and third) cases, we only iterate once. Note also that in this
particular example, performance testing showed the first case to complete
quicker than the second, but your mileage can vary for different problems.
Sometimes the second approach can provide better results.

Moving on to parallel loops, we could consider just the capabilities of
AI12-0119-1. With ordinary parallel loops, reduction is not supported per se,
but can be accomplished, if the reductions occur for every iteration, and are
stored via lock free/atomic operations (or via a protected object).

-- Using my test implementation of atomic/lock free operations
package Atomic_Integers is new Atomic_Operations.Signed_Arithmetic (Atomic_Type => Integer);

parallel
for I in Arr'Range loop
   declare
      Dont_Care : Integer;
   begin
      --  Lengthy complex processing here

      Dont_Care := Atomic_Integers.Atomic_Add_And_Fetch (Item => Sum,
                                                         Value => A(I));
      Dont_Care := Atomic_Integers.Apply_Lock_Free_Operation (Item    => Min,
                                                              Process => Integer'Min'Access,
                                                              Value   => A(I));
      Dont_Care := Atomic_Integers.Apply_Lock_Free_Operation (Item    => Max,
                                                              Process => Integer'Max'Access,
                                                              Value   => A(I));
   end;
end loop

or alternatively, one might create a more customised PO or ADT for this, which
also reads better, but wont be lock-free unless one takes advantage of
non-standard Lock_Free aspect compiler extensions.

parallel
for I in Arr'Range loop
   declare
      Dont_Care : Integer;
   begin
      --  Lengthy complex processing here

      PO.Apply(A(I));  -- Updates the Max, Min, and Sum values via a PO.
   end;
end loop

This should work but likely only provides good performance if the number of
iterations is small and the overhead for updating the atomic reduction variables
is small compared to other processing occurring during each iteration of the
loop.

Otherwise, if the number of iterations is high, and the iteration itself is a
significant part of the processing, then per iteration reduction can be too
costly to consider, but some of the other additional proposals that provide loop
chunking capabilities, can be used.

First consider writing such a parallel Loop via AI12-0251-1 Proposal, where
special loop chunking syntax is provided:

declare
     Partial_Sum : array (1 .. 2*Num_CPUs) of Integer := (others => 0);
     Partial_Min : array (1 .. 2*Num_CPUs) of Integer := (others => 0);
     Partial_Max : array (1 .. 2*Num_CPUs) of Integer := (others => 0);
begin
     parallel (Chunk in Partial_Sum'Range) for I in Arr'Range loop
        Partial_Sum(Chunk) := @ + Arr(I);
        Partial_Min(Chunk) := Integer'Min(@, Arr(I));
        Partial_Max(Chunk) := Integer'Max(@, Arr(I));
     end loop;

     Sum := Partial_Sum'Reduce("+", 0);
     Min := Partial_Min'Reduce(Integer'Min, Integer'Last);
     Max := Partial_Max'Reduce(Integer'Max, Integer'First);
end;

This proposal pretty much requires the use of arrays for storing partial results
(and encourages the use of 'Reduce to provide the reductions). Reason: The
syntax does not allow for per-chunk variable declarations. Any declarations
inside the loop are per-iteration declarations, so the use of per-chunk local
variables for reduction do not apply to this case. The use of array syntax will
be familiar to existing Ada practitioners, but there may be issues in terms of
stack storage needed for the arrays particularly if the size of the components
is large, and some possible confusion or difficulty for the programmer in
deciding how big to make the arrays. Note, the bounds of the three arrays would
need to be identical for this to make sense and work. There is an arguable
advantage that the syntax presents this problem as a single non-nested loop.
That's actually an advantage and a disadvantage. It's an advantage because it
looks simpler, but a disadvantage because one cannot make per-chunk
declarations, or perform per-chunk initialisation or finalisation.

Another point to note is that syntactically, the reductions occur sequentially
after the parallel loop. Ideally, one might like to see the reductions occur
during the parallel loop processing, but in this case, it wouldn't make much of
a difference. The reduction processing is likely insignificant compared to the
processing of the parallel loop, which is probably more typical. Note that
'Reduce was used here rather than 'Parallel_Reduce. This is because there is
likely not be be any benefit to doing the 'Reduce in parallel, and may likely be
worse than sequential, as the overhead of parallelism is not overcome by the
other processing of the 'Reduce attribute.

Moving on to the other alternate chunking proposal. This proposal does not
define any new syntax, but instead provides a chunking library that can break up
a set of iterations into an array of chunks, which is a proposal likely also
needed for container iteration.

One could write this Parallel Loop via the AI12-0251-2 Proposal
as;

declare
   subtype Loop_Index is Integer range 1 .. A'Length;
   package Manual_Chunking is new Ada.Discrete_Chunking (Loop_Index);

   Chunks : constant
      Manual_Chunking.Chunk_Array_Iterator_Interfaces.Chunk_Array'Class
         := Manual_Chunking.Split;
begin
   parallel
   for Chunk in 1 .. Chunks.Length loop
      declare
         Partial_Sum : Integer := 0;
         Partial_Min : Integer := Integer'Last;
         Partial_Max : Integer := Integer'First;
      begin

         for I in Chunks.Start(Chunk) .. Chunks.Finish(Chunk) loop
            Partial_Sum := @ + A(I);
            Partial_Min := Integer'Min(@, A(I));
            Partial_Max := Integer'Max(@, A(I));
         end loop;

         -- Updates the Max, Min, and Sum values via a PO.
         PO.Apply(Partial_Sum, Partial_Min, Partial_Max);
      end;
   end loop;
end;

In this case, we have the option of declaring partial arrays and writing the
code similarly as the chunking syntax proposal above, but we also have the
option of just declaring local per-chunk variables, and then performing the
reductions in parallel, as part of the processing of each chunk, using updates
to atomic variables. Note the reduction occurs in parallel, rather than
sequentially after the parallel loop. A disadvantage might be that the loop
appears more complex, as it involves a nested loop. An advantage might be that
this is more closely resembles the actual processing that is occurring, and thus
involves less magic. This also allows for the possibility of having pre-chunk
initialisation code, and post-chunk finalisation code, which can be useful for
certain problems. It also allows one to query query the chunk boundaries, which
the syntax based proposal does not provide currently. Such capability could be
added to that proposal, if it is needed, however.

Finally, consider the possibilities of using a 3rd party parallelism library,
such as Paraffin. When combined with the anonymous loop body proposal AI, it is
interesting to note that the code ends up looking very much like the previous
example.


Using a parallelism library such as Paraffin, with anonymous loop body proposal
(AI12-0189-1), one could write:

declare
  subtype Loop_Index is Integer range 1 .. A'Length;

  package Parallel_Loops is new Paraffin.Dynamic (Loop_Index);
  use Parallel_Loops;
begin
   for (Start, Finish) of Parallel_Iterate loop
      declare
         Partial_Sum : Integer := 0;
         Partial_Min : Integer := Integer'Last;
         Partial_Max : Integer := Integer'First;
      begin

         for I in Start .. Finish loop
            Partial_Sum := @ + A(I);
            Partial_Min := Integer'Min(@, A(I));
            Partial_Max := Integer'Max(@, A(I));
         end loop;

         -- Updates the Max, Min, and Sum values via a PO.
         PO.Apply(Partial_Sum, Partial_Min, Partial_Max);
      end;
   end loop;
end;

This example has pretty much the same advantages and disadvantages of the
previous proposal, except that this relies on non-standard libraries, whereas
the previous example does not. It might be less obvious that the Start and
Finish variables are the Chunk start and finish iteration values, as those are
specific parameters to the anonymous subprogram that is passed to the
Parallel_Iterate call. Another disadvantage is that the compiler likely knows
less about the parallelism of this loop, which may mean there are certain usage
problems that may be more difficult for the compiler to detect.

It is also interesting that the special loop syntax in the for ... loop line
looks quite similar to the corresponding syntax in the syntax based chunking
proposal.

What conclusions can be drawn from all this? I'm not sure. We probably still
need to choose between AI12-0251-1, and AI12-0251-2 (or neither if the anonymous
loop body approach holds enough appeal).

We seem to have several ways to do parallel reduction, each of which might
appeal to different people for different problems. I think that's a good thing,
one can write very concise expressions when needed, or apply more explicit
approaches when desired.

I still like the idea of being able to specify that a PO be lock-free, possibly
in addition to providing a set of primitives to map to compare and swap
operations et al, though it appears that there is not much support for that
idea. One should be able to create similar abstractions to PO's on top of the
set of atomic/lock free primitives. One worry would be that for parallelism at
least, there might be the tendency to move away from the use of protected
objects, in favour of user-provided lock free packages or abstractions, if it is
perceived they have better performance. Maybe that's a good thing?

One parallelism capability that we do not have, that was recently added to
OpenMP, and was discussed at the recent IRTAW, was unstructured parallelism.
This the case where the parallelism structures are not as closely tied to
fork-join semantics. One might create tasklets at various points during
processing, but then specify independent synchronisation points in the code
where sets of related tasklets need to complete their processing.

I think this is a capability we'd likely want to explore at some point, but
likely that is something that would be best to leave to Ada 2025. ;-)  For Ada
202x, its probably best to focus on what we already have on the table, and
provide some basic parallelism capabilities that cover a broad set of the
parallelism spectrum, and then look at more esoteric (but potentially very
useful) features or capabilities further down the road.

*****************************************************************

Questions? Ask the ACAA Technical Agent