Version 1.4 of ai12s/ai12-0250-1.txt

Unformatted version of ai12s/ai12-0250-1.txt version 1.4
Other versions for file ai12s/ai12-0250-1.txt

!standard 5.5(4)          19-03-01 AI12-0250-1/03
!class Amendment 18-01-25
!status work item 18-01-25
!status received 18-01-24
!priority Very_Low
!difficulty Easy
!subject Iterator Filters
!summary
The syntax "when <condition>" may be used in an iterator to filter out elements which do not satisfy the specified condition.
!problem
When iterators are used for things like container aggregates and reduction expressions, it is common to want to limit the iteration to a subset of the "raw" container contents, or a subset of the elements of an array, or even a subset of the elements of some numeric sequence. In cases like this, if we were writing a loop, we could just enclose the body of the loop in an "if" statement. However, when the iterator is integrated within an expression like a container aggregate or a reduction expression, an "if" expression doesn't have the right semantics. For example, if you want a set made of all of the odd elements of another container, you would want to write something like the following:
S : constant Set := [for E of C when E mod 2 = 1 => E];
That is, you want to filter out the elements for which the predicate "E mod 2 = 1" is False. This is useful in other contexts as well, for example in a parallel loop, it clarifies that the elements that don't pass the filter probably won't take any significant amount of computing, so could be ignored for the purposes of determining the timings for the various threads executing the loop.
Here are further examples:
Odd_Elements_Are_Purple : constant Boolean := (for E of C when E mod 2 = 1 => Is_Purple(E));
type Pair is record Key : Integer; Val : Integer; end record;
A : array (Positive range <>) of Rec := [(Key => 55, Val => 77), (Key => 1, Val => 88)];
M : constant Map := [for E of A when E.Val mod 2 = 1 use E.Key => E.Val]; -- M = [55 => 77];
Note that we already check against the predicates of the discrete_subtype in a loop_parameter_specification or iterated_component_association, so this effectively "and"s the filter condition with the predicates, if any, so we already have a precedent for a discontiguous sequence of elements coming from an iterator.
!proposal
Allow an iterator to include a filter of the form: "when <condition>" which would cause the iterator to bypass the "body" of the loop or the "expression" of the iterated construct when the filter evaluates to False for the given value of the loop parameter.
Note that this AI interacts with AI12-0212-1 and other AIs involving iteration. This AI is now written relative to the 202X AARM version published on 20-Feb-2019, known as "Draft 18".
Note that when incorporating filters into reduction expressions, it became clear we should revert to the suggested alternative approach of using iterated_element_association rather than iterated_component_association. This had the nice effect of significantly simplifying the wording. We have disallowed the use of "reverse" in one case here, but at this point that is somewhat arbitrary, and could be allowed, since we seem to allow it in container aggregates, and reduction expressions can actually be seen as more like a container aggregate for a sequence of some sort rather than an array aggregate.
Originally this AI was written to be relative to rev 1.19 of AI12-0212-1, which in the header is described as "18-11-19 AI12-0212-1/08," so keep that in mind when comparing the wording in this version relative to the original AI.
!wording
Modify RM 4.3.3(20.3/5-20.b/5):
1. Each iterator_specification is elaborated (in an arbitrary order)
and an iteration is performed solely to determine a {maximum} count {for} [of] the number of values produced by the iteration; all of these counts are combined to determine the {maximum} overall length of the array, and ultimately {the limits on} the bounds of the array (defined below);
2. A second iteration is performed for each of the iterator_specifications,
in the order given in the aggregate, and for each value produced, the associated expression is evaluated, its value is converted to the component subtype of the array type, and used to define the value of the next component of the array starting at the low bound and proceeding sequentially toward the high bound. A check is made that the second iteration results in [the same} {an} array length {no greater than the maximum determined by the first iteration}; Constraint_Error is raised if this check fails.
AARM To Be Honest: Constraint_Error should be raised no later than when the iterations exceed the [expected] {maximum} array length; memory that doesn't belong to the aggregate temporary should not be overwritten.
Modify RM 4.3.3(26.2/5):
* For a named_array_aggregate containing only iterated_component_associations with an iterator_specification, the lower bound is determined as for a positional_array_aggregate without an others choice, and the upper bound is determined from the lower bound and the total number of values produced by the {second set of iterations} [iteration(s)];
Modify RM 4.3.3(32/5):
Implementation Permissions
When evaluating iterated_component_associations for an array_aggregate that contains only iterated_component_associations with iterator_specifications, the first step of evaluating a iterated_component_association can be omitted if the implementation can determine the {maximum} number of values by some other means.
Modify RM 4.3.5(28/5):
For a named_container_aggregate that is an indexed aggregate, all container_element_associations shall contain either a key_choice_list, or a loop_parameter_specification without a /key_/expression {or iterator_filter}. Furthermore, for such an aggregate, either:
Replace RM 4.5.10(3/5) with:
value_sequence ::= '[' [parallel[(chunk_specification)]] iterated_element_association ']'
Replace RM 4.5.10(6/5-6.b/5) with:
The iterated_element_association shall not have a /key_/expression, nor shall it have a loop_parameter_specification that has the reserved word reverse.
AARM Reason: The intent is that the syntax matches as closely as possible array or container aggregate notation. Syntax that matches a loop_parameter_specification with the reverse reserved word would not be permitted in an array aggregate, so we disallow that here.
[Author's Note: This is somewhat arbitrary, and reverse could be permitted, without any other change to the wording outside this paragraph.]
Delete RM 4.5.10(20/5).
Delete RM 4.5.10(23/5).
Modify RM 4.5.10(27/5-28/5):
For the evaluation of a value_sequence[:
* if the iterated_component_association has an iterator_specification, an iteration is performed for that iterator_specification (as described in 5.5.2)], {the iterated_element_association is elaborated, then an iteration is performed as described in 5.5 or 5.5.2,} and for each value produced by the iterator, the associated expression is evaluated with the loop parameter having this value, to produce a result that is converted to Value_Type, and used to define the next value in the sequence[;]{.}
Delete RM 4.5.10(29/5).
Replace RM 5.5(4) with:
loop_parameter_specification ::= defining_identifier in [reverse] discrete_subtype_definition [iterator_filter]
Add after RM 5.5(4):
iterator_filter ::= when condition
Add before RM 5.5(7/5):
An iterator_filter is defined to be /satisfied/ when the condition evaluates to True for a given iteration of a loop_parameter_specification, iterator_specification, or procedural_iterator. The term /conditionally executed/ when referring to the sequence_of_statements of a loop_statement means that the statements are executed only when there is no iterator_filter, or the iterator_filter is satisfied. In contexts where a loop_parameter_specification or iterator_specification is used to produce a sequence of values (see 4.3.3 and 4.3.5), if an iterator_filter is present, the sequence of values will contain only the values for which the iterator_filter is satisfied.
Modify RM 5.5(9/5):
For the execution of a loop_statement that has an iteration_scheme including a loop_parameter_specification, after elaborating the chunk_specification, if any, the loop_parameter_specification is elaborated. This [elaboration] elaborates the discrete_subtype_definition, which defines the subtype of the loop parameter. If the discrete_subtype_definition defines a subtype with a null range, the execution of the loop_statement is complete. Otherwise, the sequence_of_statements is {conditionally} executed once for each value of the discrete subtype defined by the discrete_subtype_definition that satisfies the predicates of the subtype ...
Modify RM 5.5(9.3/5):
[Redundant: For details about the execution of a loop_statement with the iteration_scheme including an iterator_specification, see 5.5.2. {For details relating to a procedural_iterator, see 5.5.3.}]
Modify RM 5.5(10):
5 A loop parameter {declared by a loop_parameter_specification} is a
constant; it cannot be updated within the sequence_of_statements of the loop (see 3.3).
Replace RM 5.5.2(2/5) with:
iterator_specification ::= defining_identifier [: loop_parameter_subtype_indication] in [reverse] iterator_name [iterator_filter] | defining_identifier [: loop_parameter_subtype_indication] of [reverse] iterable_name [iterator_filter]
Modify RM 5.5.2(10/3):
... Otherwise, the sequence_of_statements is {conditionally} executed and then the Next operation of the iterator type is called with the loop iterator and the current value of the loop parameter to produce the next value to be assigned to the loop parameter. ...
Modify RM5.5.2(10.2/5):
...Otherwise, the sequence_of_statements is {conditionally} executed and then the Next operation of the iterator type having a Chunk parameter is called, with the loop iterator, the current value of the loop parameter, and the corresponding chunk index, to produce the next value to be assigned to the loop parameter....
Modify RM 5.5.2(11/3):
... Otherwise, the sequence_of_statements is {conditionally} executed with the loop parameter denoting each component of the array for the loop, using a canonical order of components, which is last dimension varying fastest (unless the array has convention Fortran, in which case it is first dimension varying fastest). ... The loop iteration proceeds until the sequence_of_statements has been {conditionally} executed for each component of the array for the loop, or until the loop is left as a consequence of a transfer of control.
Replace RM 5.5.3(2/5) with:
procedural_iterator ::= iterator_parameter_specification of iterator_procedure_call [iterator_filter]
Add to the end of RM 5.5.3(15/5):
{The body of the locally declared procedure P follows this structure:
procedure P ... is begin if (iterator_filter is absent or satisfied) then sequence_of_statements end if; end P;
AARM Implementation Note: Any transfer of control out of the sequence_of_statements, presuming the callable entity "C" has Allows_Exit True (see below), could be transformed to a raise of some sort of special exception that only the implementation can handle and can use to implement the desired transfer of control.
}
!discussion
Some uses of filters are merely "nice to have", but when constructing an aggregate or performing a reduction, it is more important, in that trying to construct a new aggregate (or perform a reduction) from a subset of the elements of an existing container cannot be easily accomplished without something like a filter. For the reduction, it would be possible to use an "if" expression and substitute in the identity of the reduction, but the resulting construct is almost certainly harder to read and understand. But for an aggregate, an "if" expression does not have the right semantics (as mentioned in the !problem above).
Note that we do not propose to allow a filter on an array aggregate with a "simple" discrete range such as:
[for I in 1..10 => I * 2]
This is because the values of the index parameter ("I" in this case) are taken directly as the array indices, and a filter would not make sense -- no holey arrays please! By contrast, when a container iterator is used within an array aggregate, the iterator just produces a sequence of values and the index within the array is determined by position within this sequence of values. For a container aggregate, we have similar restrictions on "indexed" aggregates -- these are effectively user-defined array aggregates and have nearly identical semantics (and restrictions). For non-indexed container aggregates, we always allow filters, since these do not require full coverage over a contiguous key range.
Note that we do not define the notion of a "static" filter. There seems less reason to introduce the distinction between "static" and "dynamic" for a filter, analogous to what we have done for predicates, because the filter is explicit syntax in the iterator, and there is no promise that the overall iteration will somehow magically and statically "skip over" the filtered-out elements. The presumption with a filter is that you generate all of the elements of the iteration, and then ignore those filtered out. With a loop over a predicated subtype, it was felt that a loop over a dynamically-predicated subtype might have misleading performance characteristics if there were an assumption we might magically "visit" only the valid elements of the subtype.
!ASIS
[Not sure. Probably new routines for new syntax are needed. - Editor.]
!ACATS test
ACATS B- and C-Tests are needed to check that the new capabilities are supported.
!appendix

From: Tucker Taft
Sent: Wednesday, January 24, 2018  3:45 PM

Here is a short writeup of the notion of an iterator "filter" using "when
<condition>" (don't worry, the '<' '>' are not in the syntax -- they are just
the usual meta-notation for grammar non-terminals!). [This is version /01 of
this AI - ED.]

Florian is apparently swamped, so he encouraged me to take over his AIs in the
near term.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, January 24, 2018  4:39 PM

> Allow an iterator to include a filter of the form:  "when <condition>"
> which would cause the iterator to bypass the "body" of the loop or the
> "expression" of the iterated construct when the filter evaluates to
> False for the given value of the loop parameter.

I don't see why loop statement iterators (and exits!) should be special in this
regard. This seems more sensible as a general feature that applied to all
simple_statements (and loops).

If
   loop I in 1 .. 10 when <something> loop and
   exit when <something>;
are good, why isn't
   goto when <something>;
or
   return Foobar when <something>;
good??

Otherwise, this seems like an ugly hack only needed in one specific case, and
generalized somewhat just to make it look less like the hack it is rather than
for any real need. (We've written Ada loops for nearly 40 years and I've never
missed a special conditional feature!)

> We would define a "static filter" as one that obeys essentially the
> same rules as a static predicate, roughly that it would be static if
> the loop parameter were replaced with a static expression.  For
> filters that accompany loop_parameter_specifications or
> iterated_component_associations, non-static filters would be permitted
> only in places where discrete subtypes with dynamic predicates are
> permitted.

The wording of 3.2.4 essentially puts these rules into terms of nonstatic
subtypes (Dynamic_Predicates are not usually mentioned explicitly), so this
formulation won't work as written. I can't think of an easy rule that would work
as a replacement, but I suppose Tucker can.

****************************************************************

From: Tucker Taft
Sent: Thursday, January 25, 2018  3:57 AM

> Otherwise, this seems like an ugly hack only needed in one specific
> case, and generalized somewhat just to make it look less like the hack
> it is rather than for any real need. (We've written Ada loops for
> nearly 40 years and I've never missed a special conditional feature!)

The generalization to loop statements is not the main point here.  The critical
need for a filter is in a container aggregate, and a reduction expression,
though as the reduction expression now seems to be heading toward being just the
application of an attribute reference to a container aggregate (where a call on
Add_Positional is effectively replaced by a call on the reduction operation),
the filter in a container aggregate would work for both.  I don't want to kill
this idea by over-generalizing it.  If we want "when" clauses to apply to other
things, fine, lets have a separate AI for that, but here we are talking about
applying when clauses to *iterators*, not to loop statements per-se.

...
> The wording of 3.2.4 essentially puts these rules into terms of
> nonstatic subtypes (Dynamic_Predicates are not usually mentioned
> explicitly), so this formulation won't work as written. I can't think
> of an easy rule that would work as a replacement, but I suppose Tucker can.

Yes, I believe we can harmonize the wording here.

****************************************************************

From: Randy Brukardt
Sent: Thursday, January 25, 2018  5:19 PM

...
> The generalization to loop statements is not the main point here.  The
> critical need for a filter is in a container aggregate, and a
> reduction expression, though as the reduction expression now seems to
> be heading toward being just the application of an attribute reference
> to a container aggregate (where a call on Add_Positional is
> effectively replaced by a call on the reduction operation), the filter
> in a container aggregate would work for both.  I don't want to kill
> this idea by over-generalizing it.  If we want "when"
> clauses to apply to other things, fine, lets have a separate AI for
> that, but here we are talking about applying when clauses to
> *iterators*, not to loop statements per-se.

I've already made my feelings on that clear. I understand the thinking for
container aggregates, but it is evil in all other contexts and as such I would
rather do without it completely. (One can usually write a predicate with the
same effect, and if you can't, you probably shouldn't be using the aggregate
form in the first place.)

I was thinking that I could be swayed to support a general across-the-board
enhancement, as then the language would be at least consistent. As it is, it is
just a gee-gaw with a very specific use and something that should be avoided in
all other cases (indeed, the rules you proposed would make it illegal in many
other cases, including in for loop statements, which probably is going too far
in the other direction; you definitely need to make it illegal in array
aggregates if nonstatic but I don't see the need for that in a for loop
statement).

As it stands, I can't support it as having too limited utility for the extra
overhead. Essentially, it should be allowed everywhere or nowhere.

****************************************************************

From: Tucker Taft
Sent: Wednesday, October 17, 2018  7:42 PM

I don't see AI12-0250 (iterator filters) on anybody's list.  What is its status?

****************************************************************

From: Randy Brukardt
Sent: Thursday, October 18, 2018  2:40 PM

We didn't discuss it last time, so there was no guidance or obvious need for
an update. It had a fairly low priority, if I recall correctly (recall our
priority votes of last spring). (Checking on this.) Yup, it's behind "Image
for all types", "Marshalling streams (split from the previous), "Declare
expressions", "Anonymous functions", "User-defined literals", "Access
ownership", and "Bignums" (as well as all of the AIs that are directly
associated with the instructions, and practically the "simple" AIs as well).
I know that we didn't talk about access ownership or bignums in Lisbon, so
we couldn't have discussed 250, either.

Of these, I would hope to complete all of them (with the possible exception
of "Access ownership"), so I'd expect it to get the top eventually, but
whether that will happen in time for it to make it into Ada 2020 is TBD.

****************************************************************

From: Tucker Taft
Sent: Thursday, December 6, 2018  4:10 PM

Here is a version of this AI on iterator filters that includes !wording.
[This is version /02 of the AI. - Editor.]

****************************************************************

From: Randy Brukardt
Sent: Thursday, December 6, 2018  10:32 PM

> !standard 5.5(4)                               18-12-06  AI12-0250-1/03

I'm not sure what happened to version /02; I don't have one, so that's the
number this draft was given. I know that will confuse everyone, sorry.

I didn't see any issues, other than removing a few double blanks, before I
filed this. But I'm too far behind to read this carefully (and I will
continue my voting against - I do not want to have to add 24 new kinds of
iteration code generation to my compiler -- nor do I want to make any other
implementers do that either).

****************************************************************

From: Tucker Taft
Sent: Friday, March 01, 2019 9:03 PM

I reviewed the "filter" AI, and made quite a number of changes to the wording,
taking advantage of the fact that Randy has integrated all of the other relevant
AIs into the RM wording.  So this AI is now written relative to Draft 18 of the
Ada 202X RM, which makes it a lot easier to understand (at least from my
perspective).  Probably the biggest change that resulted from incorporating
filters into all relevant syntax was to tip the scale back in favor of
"iterated_element_association" for reduction expressions.  Brad had suggested
this early on, and in retrospect it was the right approach.  Trying to use
"iterated_component_association" which was invented for array aggregates turns
out to be not as good a fit as "iterated_element_association" which was invented
for container aggregates.  In many ways, a reduction expression is more like a
container aggregate for a sequence of some sort (like a linked list), rather
than an array or other indexed structure.   Switching to
iterator_element_association actually allowed a bunch of the wording for
reduction expressions to be deleted or simplified.

So without further ado, here is the updated Filter AI.
[This is version /03 of the AI - Editor.]

****************************************************************

From: Randy Brukardt
Sent: Tuesday, March 05, 2019  9:59 PM

Editorial fixes:

A] Found two places with punctuation inside of quotes. John will flog me over
those, so please don't. (They probably were in the previous version, too.)

B] A bunch of double spaces after periods.

C] You changed some of the syntax to put reserved words in CAPS (mostly at the
start), but the bulk of it has the reserved words in lower case. I changed them
all into lower case, to be consistent.

D] In the AARM Rationale (which isn't a type of AARM note, BTW, you meant
"Reason"), you have

    NOTE: This is somewhat arbitrary, and reverse could be permitted,
    without any other change to the wording outside this paragraph.

This looks like a note to the ARG, not something for the AARM. So it needs to be
set off, preferably with [Author's Note: ...]

E] You have an AARM note following 5.5.3(15/5):

  AARM Implementation Note: Any transfer of control out of the
  sequence_of_statements, presuming the callable entity "C" has
  Allows_Exit True (see below), would be transformed to a raise of some
  sort of special exception that only the implementation can handle and
  can use to implement the desired transfer of control.

We decided not to describe an implementation technique in the AARM when we
discussed AI12-0189-1. I'm not sure why this has changed. AI12-0189-1 has quite
a bit on possible implementations of transfer-of-control, and they are quite a
bit more general than this note. Note using an exception here requires passing a
code along with the exception to describe the type of exit, which may not work
well in some implementations (particularly if using some underlying exception
mechanism rather than creating one).

In any case, "would be transformed" seems way more prescriptive than usually
done in such notes. I changed this to "could be transformed" to allow an
implementation to do this some other way should it want to.

We'll talk during the meeting about whether this note should even exist (I'd say
no, but I'm not going to just rip it out on my own).

I'm not a fan of the way 5.5.3(15/5) is presented, it seems rather different
than the rest of the paragraph. Another thing to discuss.

****************************************************************

Questions? Ask the ACAA Technical Agent