Version 1.5 of ai05s/ai05-0158-1.txt
!standard 3.8.1(5) 09-11-02 AI05-0158-1/03
!standard 4.4(3)
!standard 4.5.2(3)
!standard 4.5.2(27)
!class Amendment 09-06-07
!status work item 09-06-07
!status received 09-03-30
!priority Low
!difficulty Medium
!subject Generalizing membership tests
!summary
Extend the syntax of membership tests to simplify complex conditions that can
be expressed as membership in a subset of values of any type.
!problem
Conditions of the form (X = A) or else (X = B) or else (X = C) where A, B, C
are of some arbitrary type are common, and will be more frequent when
pre- and post-conditions become widely used. If the values of A, B,and C are
contiguous values of some discrete type, the expression can be
written as a membership operation: (X in A..C) . Otherwise this syntactic
shortcut is not available. Memberships should be more flexible.
!proposal
We propose a simple extension of the syntax of
membership operations, so that the right operand can specify a subset of values
of some type. We propose a simple extension of the syntax of
membership operations, so that the right operand can specify a subset of values
of some type. The type does not need to be discrete, and the choices do not
need to be static, so that one can write:
if C not in 'A' | 'B' | 'O' then Put_Line ("invalid blood type"); else ..
while Name (1 .. N) in T.Element (X) | T.Element (Y) | T.Element (Z) loop ...
!wording
Modify 3.8.1 (5) as follows:
discrete_Choice ::= choice_expression | discrete_Range | others
Modify 4.4(3) as follows:
expression := relation {logical_operator relation}
choice_expression :=
choice_relation {logical_operator choice_relation}
choice_relation ::=
simple_expression [relational_operator simple_expression]
relation ::=
simple_expression [not] in choice_list
membership_choice_list ::= membership_choice {'|' membership_choice}
membership_choice ::= choice_expression | range | subtype_mark
logical_operator ::= and | and then | or | or else | xor
Add after 4.5.2(3): (Name Resolution Rules)
If the membership_choice_list in a membership operation has more that one
choice, then all choices must have a single common type, which is the tested
type of the operation.
Modify 4.5.2(27) as follows:
If the choice_list has a single choice, the simple expression and the choice
are elaborated in arbitrary order.
If the choice_list has more that one choice, the left operand is elaborated
first and the operation is equivalent to a short-circuit sequence of tests :
if a choice is a restricted expression the
corresponding test is an equality operation, otherwise it is a membership test
whose right operand is the corresponding range or subtype_mark.
!discussion
The new syntax creates an upward incompatibility: a choice cannot be an
expression, as in Ada2005, but is limited to an expression without a
membership operator. This is likely to be harmless, and seems to be preferable
to making the grammar ambiguous, as shown in the following example, in which
a choice list can contain an arbitrary expression:
X : Boolean := ..
case X is
when Y in 1..10 | 20 => --
--
--
Even though this is a contrived example, unlikely to show up in realistic
code, it seems preferable to modify the grammar as shown above.
!example
(See proposal.)
!ACATS test
Add an ACATS test for this new feature.
!appendix
From: Robert Dewar
Date: Monday, March 30, 2009 8:02 PM
How about allowing Pascal style sets while we are at it?
pragma Precondition
(xyz(3) = 'w' or else xyz(3) = 'x' or else xyz(3) = 'z');
is pretty horrible, now if only we had y instead of z we could write
(xyz(3) in 'w' .. 'y')
but suddenly for one missing element in the sequence we are driven to the top
form
How about allowing
(xyz(3) in ('w','x','z'))
easy to implement, easy to read, and very convenient
I would also allow e.g.
(xyz(3) in ('d'..'f', 'm', 'r'..'s'))
In the case of the GNAT code, we used to have
Nkind (X) = N_Node1 or else Nkind (X) = N_Node2
and we defined
Nkind_In (X, N_Node1, Node_2)
we have nine variants for 2 to 9 arguments, but really it would be much nicer to
say
Nkind (X) in (N_Node1, N_Node2)
I thought of {} instead of (), but I think () is probably more consistent with
Ada syntax style.
I would allow these forms ONLY in membership tests, and the meaning of
Expr in (A, B, C)
Expr in (A) or else Expr in (B) or else Expr in (C)
if A is an expression
Expr in (A) means Expr = A
if A is a range or subtype name
Epxr in (A) means Expr in A
notin by obvious analogy.
It is a little tempting to allow them in loops as well
for J in (A, B, C) loop
but I would settle for just the membership tests
****************************************************************
From: Tucker Taft
Date: Monday, March 30, 2009 9:09 PM
I would prefer to use the "|" symbol, since that is what is already used in
choice lists. E.g.:
Xyz(3) in ('w' .. 'x' | 'z')
Not sure whether the parens should be required.
****************************************************************
From: Robert Dewar
Date: Monday, March 30, 2009 9:21 PM
I am cool with | instead of ,
And yes, if you go with | instead of , you could indeed leave out the parens. I
sort of like the parens (I think of them as {} :-))
But I can go either way. Can I presume that the fact that you are
micro-analyzing the syntax means you agree with the basic idea? :-)
****************************************************************
From: Tucker Taft
Date: Monday, March 30, 2009 9:43 PM
I have always been annoyed that I can use multiple choice ranges in a case
alternative, but not in other similar contexts, such as membership tests (and
possibly "for" loops). I frequently find myself turning what should be a simple
membership test into a case statement merely to get the multiple-choice-range
feature.
So yes, I support using "|" in more contexts. And I agree that preconditions and
postconditions will be heavy users of membership tests, and it will be annoying
if you can only conveniently talk about a single contiguous range of values.
****************************************************************
From: Randy Brukardt
Date: Monday, March 30, 2009 10:01 PM
> How about allowing
>
> (xyz(3) in ('w','x','z'))
>
> easy to implement, easy to read, and very convenient
Humm, the devil is in the details here. For what types are these memberships
supposed to work? Can they be static? Etc.
If they're only for static discrete types (as in case statements, based on
Tucker's responses), then I surely agree that they're easy to implement. I'm not
so sure in other cases.
****************************************************************
From: Tucker Taft
Date: Monday, March 30, 2009 10:04 PM
If we don't want to require parentheses in a membership test, then we would have
to change discrete choice to use a simple_expression rather than an expression:
discrete_choice ::=
simple_expression | discrete_range | others
and then we could change membership test to be:
simple_expression [not] in choice_list
where (non-discrete) choice and choice_list are:
choice ::= simple_expression | range | subtype_mark
choice_list ::= choice { '|' choice }
****************************************************************
From: Robert Dewar
Date: Monday, March 30, 2009 10:17 PM
If there are no parens, don't we have confusion in
case x is
when a in b | c
could be
when (a in b) | c
or
when a in (b | c)
****************************************************************
From: Tucker Taft
Date: Monday, March 30, 2009 10:34 PM
Yes, unless we change discrete_choice to use "simple_expression" rather than
"expression" which doesn't sound like such a bad idea in any case.
****************************************************************
From: Robert Dewar
Date: Tuesday, March 31, 2009 8:43 AM
Isn't that a clear upwards incompatibility?
****************************************************************
From: Tucker Taft
Date: Tuesday, March 31, 2009 11:29 AM
Yes, it is an upward incompatibility, though probably a reasonable one.
discrete_choice is used for case alternatives, variant alternatives, and array
aggregates. What you would lose is the ability to use relationals and membership
tests in such contexts, presuming the case expression, discriminant, or array
index is of a boolean type.
Of course these contexts allow no overlap in choices, and the case and variant
alternatives require the choice to be static, so trying to figure how to use a
membership test or a relational in a discrete_choice would be interesting.
E.g.:
case X is
when A < B =>
...
when others =>
...
end case;
or
Y := (A in 1..10 => 42, others => 43);
It is a bit hard to imagine how non-simple_expressions would appear in a
legitimate and legal program. Adding parentheses would probably help such
programs, should they exist.
Note that the suggested change would be upward "consistent," in that the
compiler would always catch the problem.
****************************************************************
From: Bob Duff
Date: Tuesday, March 31, 2009 11:59 AM
> If we don't want to require parentheses in a membership test, then we
> would have to change discrete choice to use a simple_expression rather
> than an expression:
>
> discrete_choice ::=
> simple_expression | discrete_range | others
Interestingly, this restriction was present in the Ada 83 syntax, and was
deliberately removed.
See AARM-4.4:
Extensions to Ada 83 ...
15.b In various contexts throughout the language where Ada 83 syntax
rules had simple_expression, the corresponding Ada 95 syntax rule
has expression instead. This reflects the inclusion of modular
integer types, which makes the logical operators "and", "or", and
"xor" more useful in expressions of an integer type. Requiring
parentheses to use these operators in such contexts seemed
unnecessary and potentially confusing. Note that the bounds of a
range still have to be specified by simple_expressions, since
otherwise expressions involving membership tests might be ambiguous.
Essentially, the operation ".." is of higher precedence than the
logical operators, and hence uses of logical operators still have to
be parenthesized when used in a bound of a range.
And AARM-3.8.1:
Wording Changes from Ada 83
29.b The syntactic category choice is removed. The syntax rules for
variant, array_aggregate, and case_statement now use
discrete_choice_list or discrete_choice instead. The syntax rule for
record_aggregate now defines its own syntax for named associations.
The Ada 83 syntax is:
choice ::= simple_expression
| discrete_range | others | component_simple_name
****************************************************************
From: Tucker Taft
Date: Tuesday, March 31, 2009 12:40 PM
Yes, I remember us making that change, simply for uniformity, as there was no
particular reason for the restriction. Now that we have a reason, well that
changes everything!
****************************************************************
From: Robert Dewar
Date: Tuesday, March 31, 2009 3:25 PM
All these interesting comments on the syntax, but what do you think of the basic
idea? :-)
****************************************************************
From: Bob Duff
Date: Tuesday, March 31, 2009 3:39 PM
I haven't decided yet. More later...
****************************************************************
From: Ed Schonberg
Date: Friday, April 24, 2009 8:28 AM
Going back to the syntax of set comprehensions: the last proposal was to leave
out parentheses and revert to the Ada83 rule that only simple expressions can
occur is discrete choices. The fact that this was an incompatibility was
dismissed as "probably harmless". We just came up with an instance of a
relation in a case statement : an OR of assorted constants in a modular
context. An old lesson: any upwards incompatibility is suspicious. So I
would prefer to see set comprehensions as parenthesized, |-separated lists,
with their own production. For now they can only appear in membership tests,
but having an explicit name for them will make it easier at a later date to
introduce them in loops, as was suggested. From there to treating them as
expressions and having proper set algebra is but an easy step!
****************************************************************
From: Robert Dewar
Date: Friday, April 24, 2009 9:41 AM
Comments
1. I much prefer the alternatives syntax, since we already have it in
the language, and it reads very nicely.
I do not see these as set comprehensions, but rather just an
abbreviation in conditionals, similar to COBOL
'
IF A EQUALS 2 OR 3 OR 17 THEN
and the equivalent Ada
if A in 2 | 3 | 17 then
seems nice.
2. I am dubious about allowing sets in loops, but even if we
go that far, the | syntax is still natural (and indeed since
the keyword is IN, is a smooth extension).
3. I do not like Ed's "easy step", it sounds like a mess to me from
all points of view (informal definition, formal definition including
type resolution, and implementation), and is simply too large a step,
too much outside what Ada is as a language for my taste.
4. Regarding the issue of upwards incompatibility, we have three choices
a) Just go back to the Ada 83 syntax. As Ed points out, this did
cause an incompatibility in our run time library, where we had
when A or B or C =>
and we have not tried our big test suite with this change (I will
try that run at some point).
If we decide on doing this, I would suggest that we immediately
issue a non-binding interpretation saying that we made a mistake
in Ada 95 and Ada 20xx and that we allow only simple expressions
in choices, allowing compilers to immediately implement the
change now, and avoiding the accumulation of more incompatibility
in existing code.
b) Just agree to resolve the potentially ambiguous forms in the Ada
95 manner. Messy from a formal point of view, but fine in practice
because only strange test programs will contain things like:
case Some_Boolean is
when A in B | C | D => bla bla
c) Muck with the syntax to simply exclude the use of IN or NOT IN
in this context, i.e. we introduce a new class of expression,
e.g, RRESTRICTED_EXPRESSION
which allows unparenthesized use of logical operators but does
not allow unparenthesized use of IN or NOT IN.
Fairly easy to implement, a bit awkward to define, but not
bad, e.g.
rexpression ::=
rrelation {and rrelation} | rrelation {and then rrelation}
| rrelation {or rrelation} | rrelation {or else rrelation}
| rrelation {xor rrelation}
rrelation ::=
simple_expression [relational_operator simple_expression]
Here rexpression = restricted_expression and
rrelation = restricted_relation (just did not feel like that
much typing).
From a users point of view, b) or c) are far preferable to a).
****************************************************************
From: Ed Schonberg
Date: Friday, April 24, 2009 10:41 AM
or d) require a parenthesized list. Upwards compatible. Clean and orthogonal.
****************************************************************
From: Robert Dewar
Date: Friday, April 24, 2009 10:52 AM
But my list is predicated on keeping the | syntax. Yes, the parenthesis notation
avoids compatibility issues, but I think that is its only advantage. Remember I
originally suggested the parentheses, but as soon as Tuck suggested use of the
existing choices syntax, that made so much more sense to me, because the usage
if A in 2 | 3 .. 6 | 9 then
is so nicely parallel to
case 2 | 3 .. 6 | 9 =>
in both cases, we have an implied OR between equality conditions.
Anyway, let's see if this discussion can inspire some other ARG members to give
their opinions.
****************************************************************
From: Robert Dewar
Date: Friday, April 24, 2009 12:08 PM
I ran a test of our test suite requiring Simple_Expression for choices, number
of problems detected = ZERO, so it is only the one case in our own run-time that
ran into this problem.
****************************************************************
From: Randy Brukardt
Date: Friday, April 24, 2009 9:19 PM
...
> But my list is predicated on keeping the | syntax. Yes, the
> parenthesis notation avoids compatibility issues, but I think that is
> its only advantage. Remember I originally suggested the parentheses,
> but as soon as Tuck suggested use of the existing choices syntax, that
> made so much more sense to me, because the usage
>
> if A in 2 | 3 .. 6 | 9 then
>
> is so nicely parallel to
>
> case 2 | 3 .. 6 | 9 =>
>
> in both cases, we have an implied OR between equality conditions.
I agree. If we bother doing this at all, it should look as much as possible like
other things already in the language.
I don't like introducing an incompatibility for it, however -- it doesn't seem
important enough for that. I think Robert's "restricted_expression" idea is
probably best; using a membership (*any* membership) in a case statement is
pretty weird. If we're going to take an incompatibility, it ought to be there
and not in modular operations (which seem much more likely to come up in
practice, and indeed you have an example). But sticking exactly with the Ada 95
rules would be OK, too.
> Anyway, let's see if this discussion can inspire some other ARG
> members to give their opinions.
I've done it...
****************************************************************
From: Robert Dewar
Date: Saturday, April 25, 2009 8:15 AM
I implemented the restricted expression approach in our experimental
implementation of set membership. It took only 10 lines of code, and with this
change we get zero regressions on our own sources and our regression suite (when
we force this restriction on unconditionally).
****************************************************************
From: Ed Schonberg
Date: Sunday, November 1, 2009 10:25 AM
Here is an updated version of AI05-0158 (membership tests)
incorporating the changes suggested by you and Robert.
The new syntax is unambiguous, and presents a minuscule incompatibility with Ada2005.
[Editor's note: This is version /02 of the AI.]
****************************************************************
From: Tucker Taft
Date: Sunday, November 1, 2009 11:13 AM
It would help to have some examples.
Also, rather than using "restricted_*" where "restricted"
is somewhat mysterious, how about:
choice_expression ::=
choice_relation ...
also, rather than using simply "choice_list" and "choice" how about "membership_choice_list" and "membership_choice" to distinguish it from the other kinds of choices? Hence:
choice_expression ::=
choice_relation {logical_operator choice_relation}
choice_relation ::=
simple_expression [relational_operator] simple_expression
relation ::= choice_relation |
simple_expression [not] in membership_choice_list
membership_choice_list ::=
membership_choice { '|' membership_choice }
membership_choice ::= choice_expression | range | subtype_mark
...
discrete_choice ::= choice_expression | discrete_range | OTHERS
...
expression ::= relation {logical_operator relation}
logical_operator ::= AND | AND THEN | OR | OR ELSE | XOR
****************************************************************
From: Ed Schonberg
Date: Monday, November 2, 2009 1:25 PM
Good points all. Here is updated version:
[Editor's note: This is version /03 of the AI.]
****************************************************************
From: Tucker Taft
Date: Monday, November 2, 2009 1:43 PM
The wording still talks about "choice_list" but the syntax now uses
"membership_choice_list".
****************************************************************
From: Randy Brukardt
Date: Monday, November 2, 2009 1:48 PM
I'll fix it when I post it. [And I did - ED]
****************************************************************
Questions? Ask the ACAA Technical Agent