# Version 1.6 of ai05s/ai05-0158-1.txt

Unformatted version of ai05s/ai05-0158-1.txt version 1.6
Other versions for file ai05s/ai05-0158-1.txt

!standard 3.8.1(5)          10-01-20 AI05-0158-1/04
!standard 4.4(3)
!standard 4.5.2(3)
!standard 4.5.2(5)
!standard 4.5.2(27)
!class Amendment 09-06-07
!status work item 09-06-07
!priority Low
!difficulty Medium
!subject Generalizing membership tests
!summary
Extend the syntax of membership tests to simplify complex conditions that can be expressed as membership in a subset of values of any type. Introduce syntax for a list of expressions that may be used in loop contexts as well.
!problem
Conditions of the form (X = A) or else (X = B) or else (X = C) where A, B, C are of some arbitrary type are common, and will be more frequent when pre- and post-conditions become widely used. If the values of A, B,and C are contiguous values of some discrete type, the expression can be written as a membership operation: (X in A..C) . Otherwise this syntactic shortcut is not available. We propose a simple extension of the syntax of membership operations, so that the right operand can specify a subset of values of some type. The type does not need to be discrete, and the choices do not need to be static, so that one can write:
if C not in 'A' | 'B' | 'O' then Put_Line ("invalid blood type"); else ..
while Name (1 .. N) in T.Element (X) | T.Element (Y) | T.Element (Z) loop ...
!proposal
We propose a simple extension of the syntax of membership operations, so that the right operand can specify a subset of values of some type. We propose a simple extension of the syntax of membership operations, so that the right operand can specify a subset of values of some type. The type does not need to be discrete, and the choices do not need to be static, so that one can write:
if C not in 'A' | 'B' | 'O' then Put_Line ("invalid blood type"); else ..
while Name (1 .. N) in T.Element (X) | T.Element (Y) | T.Element (Z) loop ...
!wording
Modify 3.8.1 (5) as follows:
discrete_Choice ::= choice_expression | discrete_Range | others
Modify 4.4(3) as follows:
expression := relation {logical_operator relation}
choice_expression := choice_relation {logical_operator choice_relation}
choice_relation ::= simple_expression [relational_operator simple_expression]
relation ::= simple_expression [not] in membership_choice_list
membership_choice_list ::= membership_choice {'|' membership_choice}
membership_choice ::= choice_expression | range | subtype_mark
logical_operator ::= and | and then | or | or else | xor
Add after 4.5.2(3): (Name Resolution Rules)
If the membership_choice_list in a membership operation has more that one choice, then all choices must have a single common type, which is the tested type of the operation.
Add after 4.5.2 (5) (Static semantics)
The expected type of a choice_expression in a membership_choice is any type.
Modify 4.5.2(27) as follows:
If the choice_list has a single choice, the simple expression and the choice are elaborated in arbitrary order.
If the choice_list has more that one choice, the left operand is elaborated first and the operation is equivalent to a short-circuit sequence of individual tests on each choice.
Modify 4.5.2 (28) as follows:
A membership test using in yields the result True if any of the individual tests on the elements of the choice list yields True.
An individual membership test yields the result True if:
* The choice is an expression, and the simple_expression is equal to the value
of the choice. If the type of the expression is a record type or a limited type, the test uses the primitive equality for the type; otherwise the test uses predefined equality.
* The choice is a range or subtype mark and:
(followed by para. 29-30)
!discussion
The new syntax creates an upward incompatibility: a choice cannot be an expression, as in Ada2005, but is limited to an expression without a membership operator. This is likely to be harmless, and seems to be preferable to making the grammar ambiguous, as shown in the following example, in which a choice list can contain an arbitrary expression:
X : Boolean := .. case X is when Y in 1..10 | 20 => -- ambiguous: -- could be (Y in 1..10) | 20 -- which is otherwise type-incorrect
Even though this is a contrived example, unlikely to show up in realistic code, it seems preferable to modify the grammar as shown above.
There was a suggestion that the type of an expression in a choice should be non-limited. However, given that currently a subtype_mark in a choice can be of any type, it would be awkward to introduce this restriction. Clearly if the type has no primitive equality a membership test over a list of expressions is illegal.
[Editor's note: This decision brings visibility into the legality of membership operations. That seems like a lousy choice (pun intended). Personally, I would prefer the inconsistency. (Well, I'd prefer AI-153-2 instead of this one, which would require these to be discrete only, but make them work everywhere. But that's a separate issue.)]
!example
(See proposal.)
!ACATS test
Add an ACATS test for this new feature.
!appendix
```
From: Robert Dewar
Date: Monday, March 30, 2009  8:02 PM

How about allowing Pascal style sets while we are at it?

pragma Precondition
(xyz(3) = 'w' or else xyz(3) = 'x' or else xyz(3) = 'z');

is pretty horrible, now if only we had y instead of z we could write

(xyz(3) in 'w' .. 'y')

but suddenly for one missing element in the sequence we are driven to the top
form

(xyz(3) in ('w','x','z'))

easy to implement, easy to read, and very convenient

I would also allow e.g.

(xyz(3) in ('d'..'f', 'm', 'r'..'s'))

In the case of the GNAT code, we used to have

Nkind (X) = N_Node1 or else Nkind (X) = N_Node2

and we defined

Nkind_In (X, N_Node1, Node_2)

we have nine variants for 2 to 9 arguments, but really it would be much nicer to
say

Nkind (X) in (N_Node1, N_Node2)

I thought of {} instead of (), but I think () is probably more consistent with

I would allow these forms ONLY in membership tests, and the meaning of

Expr in (A, B, C)

Expr in (A) or else Expr in (B) or else Expr in (C)

if A is an expression

Expr in (A) means Expr = A

if A is a range or subtype name

Epxr in (A) means Expr in A

notin by obvious analogy.

It is a little tempting to allow them in loops as well

for J in (A, B, C) loop

but I would settle for just the membership tests

****************************************************************

From: Tucker Taft
Date: Monday, March 30, 2009  9:09 PM

I would prefer to use the "|" symbol, since that is what is already used in
choice lists.  E.g.:

Xyz(3) in ('w' .. 'x' | 'z')

Not sure whether the parens should be required.

****************************************************************

From: Robert Dewar
Date: Monday, March 30, 2009  9:21 PM

I am cool with | instead of ,

And yes, if you go with | instead of , you could indeed leave out the parens. I
sort of like the parens (I think of them as {} :-))

But I can go either way. Can I presume that the fact that you are
micro-analyzing the syntax means you agree with the basic idea? :-)

****************************************************************

From: Tucker Taft
Date: Monday, March 30, 2009  9:43 PM

I have always been annoyed that I can use multiple choice ranges in a case
alternative, but not in other similar contexts, such as membership tests (and
possibly "for" loops).  I frequently find myself turning what should be a simple
membership test into a case statement merely to get the multiple-choice-range
feature.

So yes, I support using "|" in more contexts. And I agree that preconditions and
postconditions will be heavy users of membership tests, and it will be annoying
if you can only conveniently talk about a single contiguous range of values.

****************************************************************

From: Randy Brukardt
Date: Monday, March 30, 2009  10:01 PM

>
>      (xyz(3) in ('w','x','z'))
>
> easy to implement, easy to read, and very convenient

Humm, the devil is in the details here. For what types are these memberships
supposed to work? Can they be static? Etc.

If they're only for static discrete types (as in case statements, based on
Tucker's responses), then I surely agree that they're easy to implement. I'm not
so sure in other cases.

****************************************************************

From: Robert Dewar
Date: Monday, March 30, 2009  10:11 PM

> If they're only for static discrete types (as in case statements,
> based on Tucker's responses), then I surely agree that they're easy to
> implement. I'm not so sure in other cases.

I gave a semantic equivalence, and no, it is not acceptable to limit these to
static values, but of course they are limited to discrete types, that goes
without saying.

****************************************************************

From: Tucker Taft
Date: Monday, March 30, 2009  10:04 PM

If we don't want to require parentheses in a membership test, then we would have
to change discrete choice to use a simple_expression rather than an expression:

discrete_choice ::=
simple_expression | discrete_range | others

and then we could change membership test to be:

simple_expression [not] in choice_list

where (non-discrete) choice and choice_list are:

choice ::= simple_expression | range | subtype_mark

choice_list ::= choice { '|' choice }

****************************************************************

From: Robert Dewar
Date: Monday, March 30, 2009  10:17 PM

If there are no parens, don't we have confusion in

case x is
when a in b | c

could be

when (a in b) | c

or

when a in (b | c)

****************************************************************

From: Tucker Taft
Date: Monday, March 30, 2009  10:34 PM

Yes, unless we change discrete_choice to use "simple_expression" rather than
"expression" which doesn't sound like such a bad idea in any case.

****************************************************************

From: Robert Dewar
Date: Tuesday, March 31, 2009  8:43 AM

Isn't that a clear upwards incompatibility?

****************************************************************

From: Tucker Taft
Date: Tuesday, March 31, 2009  11:29 AM

Yes, it is an upward incompatibility, though probably a reasonable one.
discrete_choice is used for case alternatives, variant alternatives, and array
aggregates. What you would lose is the ability to use relationals and membership
tests in such contexts, presuming the case expression, discriminant, or array
index is of a boolean type.

Of course these contexts allow no overlap in choices, and the case and variant
alternatives require the choice to be static, so trying to figure how to use a
membership test or a relational in a discrete_choice would be interesting.

E.g.:

case X is
when A < B =>
...
when others =>
...
end case;

or

Y := (A in 1..10 => 42, others => 43);

It is a bit hard to imagine how non-simple_expressions would appear in a
legitimate and legal program. Adding parentheses would probably help such
programs, should they exist.

Note that the suggested change would be upward "consistent," in that the
compiler would always catch the problem.

****************************************************************

From: Bob Duff
Date: Tuesday, March 31, 2009  11:59 AM

> If we don't want to require parentheses in a membership test, then we
> would have to change discrete choice to use a simple_expression rather
> than an expression:
>
>    discrete_choice ::=
>      simple_expression | discrete_range | others

Interestingly, this restriction was present in the Ada 83 syntax, and was
deliberately removed.

See AARM-4.4:

15.b  In various contexts throughout the language where Ada 83 syntax
has expression instead. This reflects the inclusion of modular
integer types, which makes the logical operators "and", "or", and
"xor" more useful in expressions of an integer type. Requiring
parentheses to use these operators in such contexts seemed
unnecessary and potentially confusing. Note that the bounds of a
range still have to be specified by simple_expressions, since
otherwise expressions involving membership tests might be ambiguous.
Essentially, the operation ".." is of higher precedence than the
logical operators, and hence uses of logical operators still have to
be parenthesized when used in a bound of a range.

And AARM-3.8.1:

29.b  The syntactic category choice is removed. The syntax rules for
variant, array_aggregate, and case_statement now use
discrete_choice_list or discrete_choice instead. The syntax rule for
record_aggregate now defines its own syntax for named associations.

choice ::= simple_expression
| discrete_range | others | component_simple_name

****************************************************************

From: Tucker Taft
Date: Tuesday, March 31, 2009  12:40 PM

Yes, I remember us making that change, simply for uniformity, as there was no
particular reason for the restriction.  Now that we have a reason, well that
changes everything!

****************************************************************

From: Robert Dewar
Date: Tuesday, March 31, 2009  3:25 PM

All these interesting comments on the syntax, but what do you think of the basic
idea? :-)

****************************************************************

From: Bob Duff
Date: Tuesday, March 31, 2009  3:39 PM

I haven't decided yet.  More later...

****************************************************************

From: Ed Schonberg
Date: Friday, April 24, 2009  8:28 AM

Going back to the syntax of set comprehensions:  the last proposal was to leave
out parentheses and revert to the Ada83 rule that only simple expressions can
occur is discrete choices.  The fact that this was an incompatibility was
dismissed as "probably harmless".  We just came up with an instance of a
relation in a case statement :  an OR of assorted constants in a modular
context.  An old lesson:  any upwards  incompatibility is suspicious.   So I
would prefer to see set  comprehensions as parenthesized,  |-separated lists,
with their own  production. For now they can only appear in membership tests,
but   having an explicit name for them will make it easier at a later date to
introduce them in loops, as was suggested. From there to treating them as
expressions and having proper set algebra is but an easy step!

****************************************************************

From: Robert Dewar
Date: Friday, April 24, 2009  9:41 AM

1. I much prefer the alternatives syntax, since we already have it in
the language, and it reads very nicely.

I do not see these as set comprehensions, but rather just an
abbreviation in conditionals, similar to COBOL
'
IF A EQUALS 2 OR 3 OR 17 THEN

if A in 2 | 3 | 17 then

seems nice.

2. I am dubious about allowing sets in loops, but even if we
go that far, the | syntax is still natural (and indeed since
the keyword is IN, is a smooth extension).

3. I do not like Ed's "easy step", it sounds like a mess to me from
all points of view (informal definition, formal definition including
type resolution, and implementation), and is simply too large a step,
too much outside what Ada is as a language for my taste.

4. Regarding the issue of upwards incompatibility, we have three choices

a) Just go back to the Ada 83 syntax. As Ed points out, this did
cause an incompatibility in our run time library, where we had

when A or B or C =>

and we have not tried our big test suite with this change (I will
try that run at some point).

If we decide on doing this, I would suggest that we immediately
issue a non-binding interpretation saying that we made a mistake
in Ada 95 and Ada 20xx and that we allow only simple expressions
in choices, allowing compilers to immediately implement the
change now, and avoiding the accumulation of more incompatibility
in existing code.

b) Just agree to resolve the potentially ambiguous forms in the Ada
95 manner. Messy from a formal point of view, but fine in practice
because only strange test programs will contain things like:

case Some_Boolean is
when A in B | C | D => bla bla

c) Muck with the syntax to simply exclude the use of IN or NOT IN
in this context, i.e. we introduce a new class of expression,
e.g, RRESTRICTED_EXPRESSION

which allows unparenthesized use of logical operators but does
not allow unparenthesized use of IN or NOT IN.

Fairly easy to implement, a bit awkward to define, but not

rexpression ::=
rrelation {and rrelation}  | rrelation {and then rrelation}
| rrelation {or rrelation}  | rrelation {or else rrelation}
| rrelation {xor rrelation}

rrelation ::=
simple_expression [relational_operator simple_expression]

Here rexpression = restricted_expression and
rrelation = restricted_relation (just did not feel like that
much typing).

From a users point of view, b) or c) are far preferable to a).

****************************************************************

From: Ed Schonberg
Date: Friday, April 24, 2009  10:41 AM

or d)  require a parenthesized list. Upwards compatible. Clean and orthogonal.

****************************************************************

From: Robert Dewar
Date: Friday, April 24, 2009  10:52 AM

But my list is predicated on keeping the | syntax. Yes, the parenthesis notation
avoids compatibility issues, but I think that is its only advantage. Remember I
originally suggested the parentheses, but as soon as Tuck suggested use of the
existing choices syntax, that made so much more sense to me, because the usage

if A in 2 | 3 .. 6 | 9 then

is so nicely parallel to

case 2 | 3 .. 6 | 9 =>

in both cases, we have an implied OR between equality conditions.

Anyway, let's see if this discussion can inspire some other ARG members to give
their opinions.

****************************************************************

From: Robert Dewar
Date: Friday, April 24, 2009  12:08 PM

I ran a test of our test suite requiring Simple_Expression for choices, number
of problems detected = ZERO, so it is only the one case in our own run-time that
ran into this problem.

****************************************************************

From: Randy Brukardt
Date: Friday, April 24, 2009  9:19 PM

...
> But my list is predicated on keeping the | syntax. Yes, the
> parenthesis notation avoids compatibility issues, but I think that is
> its only advantage. Remember I originally suggested the parentheses,
> but as soon as Tuck suggested use of the existing choices syntax, that
> made so much more sense to me, because the usage
>
>      if A in 2 | 3 .. 6 | 9 then
>
> is so nicely parallel to
>
>      case 2 | 3 .. 6 | 9 =>
>
> in both cases, we have an implied OR between equality conditions.

I agree. If we bother doing this at all, it should look as much as possible like
other things already in the language.

I don't like introducing an incompatibility for it, however -- it doesn't seem
important enough for that. I think Robert's "restricted_expression" idea is
probably best; using a membership (*any* membership) in a case statement is
pretty weird. If we're going to take an incompatibility, it ought to be there
and not in modular operations (which seem much more likely to come up in
practice, and indeed you have an example). But sticking exactly with the Ada 95
rules would be OK, too.

> Anyway, let's see if this discussion can inspire some other ARG
> members to give their opinions.

I've done it...

****************************************************************

From: Robert Dewar
Date: Saturday, April 25, 2009  8:15 AM

I implemented the restricted expression approach in our experimental
implementation of set membership. It took only 10 lines of code, and with this
change we get zero regressions on our own sources and our regression suite (when
we force this restriction on unconditionally).

****************************************************************

From: Ed Schonberg
Date: Sunday, November 1, 2009  10:25 AM

Here is an updated version of AI05-0158 (membership tests)
incorporating the changes suggested by you and Robert.
The new syntax is unambiguous, and presents a minuscule incompatibility with Ada2005.

[Editor's note: This is version /02 of the AI.]

****************************************************************

From: Tucker Taft
Date: Sunday, November 1, 2009  11:13 AM

It would help to have some examples.

Also, rather than using "restricted_*" where "restricted"

choice_expression ::=
choice_relation ...

also, rather than using simply "choice_list" and "choice" how about "membership_choice_list" and "membership_choice" to distinguish it from the other kinds of choices?  Hence:

choice_expression ::=
choice_relation {logical_operator choice_relation}

choice_relation ::=
simple_expression [relational_operator] simple_expression

relation ::= choice_relation |
simple_expression [not] in membership_choice_list

membership_choice_list ::=
membership_choice { '|' membership_choice }

membership_choice ::= choice_expression | range | subtype_mark

...

discrete_choice ::= choice_expression | discrete_range | OTHERS

...

expression ::= relation {logical_operator relation}

logical_operator ::= AND | AND THEN | OR | OR ELSE | XOR

****************************************************************

From: Ed Schonberg
Date: Monday, November 2, 2009  1:25 PM

Good points all. Here is updated version:

[Editor's note: This is version /03 of the AI.]

****************************************************************

From: Tucker Taft
Date: Monday, November 2, 2009  1:43 PM

The wording still talks about "choice_list" but the syntax now uses
"membership_choice_list".

****************************************************************

From: Randy Brukardt
Date: Monday, November 2, 2009  1:48 PM

I'll fix it when I post it. [And I did - ED]

****************************************************************
```

Questions? Ask the ACAA Technical Agent