Version 1.5 of ai05s/ai05-0158-1.txt


!standard 3.8.1(5)          09-11-02 AI05-0158-1/03
!standard 4.4(3)
!standard 4.5.2(3)
!standard 4.5.2(27)
!class Amendment 09-06-07
!status work item 09-06-07
!status received 09-03-30
!priority Low
!difficulty Medium
!subject Generalizing membership tests
!summary
Extend the syntax of membership tests to simplify complex conditions that can be expressed as membership in a subset of values of any type.
!problem
Conditions of the form (X = A) or else (X = B) or else (X = C), where A, B, and C are of some arbitrary type, are common, and will become more frequent when pre- and postconditions are widely used. If A, B, and C are contiguous values of some discrete type, the expression can be written as a membership operation: (X in A..C). Otherwise this syntactic shortcut is not available. Membership tests should be more flexible.
!proposal
We propose a simple extension of the syntax of membership operations, so that the right operand can specify a subset of values of some type. The type does not need to be discrete, and the choices do not need to be static, so that one can write:
   if C not in 'A' | 'B' | 'O' then
      Put_Line ("invalid blood type");
   else
      ..

   while Name (1 .. N) in T.Element (X) | T.Element (Y) | T.Element (Z) loop
      ...
!wording
Modify 3.8.1 (5) as follows:
discrete_choice ::= choice_expression | discrete_range | others
Modify 4.4(3) as follows:
expression ::= relation {logical_operator relation}
choice_expression ::= choice_relation {logical_operator choice_relation}
choice_relation ::= simple_expression [relational_operator simple_expression]
relation ::= simple_expression [not] in membership_choice_list
membership_choice_list ::= membership_choice {'|' membership_choice}
membership_choice ::= choice_expression | range | subtype_mark
logical_operator ::= and | and then | or | or else | xor
Add after 4.5.2(3): (Name Resolution Rules)
If the membership_choice_list in a membership operation has more than one choice, then all choices must have a single common type, which is the tested type of the operation.
Modify 4.5.2(27) as follows:
If the membership_choice_list has a single choice, the simple_expression and the choice are elaborated in arbitrary order. If the membership_choice_list has more than one choice, the left operand is elaborated first and the operation is equivalent to a short-circuit sequence of tests: if a choice is a choice_expression, the corresponding test is an equality operation; otherwise it is a membership test whose right operand is the corresponding range or subtype_mark.
!discussion
The new syntax creates an upward incompatibility: a choice can no longer be an arbitrary expression, as in Ada 2005, but is limited to an expression without a membership operator. This is likely to be harmless, and seems preferable to making the grammar ambiguous, as shown in the following example, in which a choice list can contain an arbitrary expression:
   X : Boolean := ..
   case X is
      when Y in 1..10 | 20 =>   -- ambiguous:
                                -- could be (Y in 1..10) | 20
                                -- which is otherwise type-incorrect
Even though this is a contrived example, unlikely to show up in realistic code, it seems preferable to modify the grammar as shown above.
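What the restricted grammar still permits can be sketched as follows. This is an illustration only: the type Mode_Bits and the constants Read, Write, and Exec are hypothetical names, not taken from the proposal. Because a choice_expression still admits logical operators, an OR of modular constants remains a legal unparenthesized choice; only a membership test in a choice would need parentheses.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Choice_Expression_Demo is

   --  Hypothetical modular type and constants, for illustration only.
   type Mode_Bits is mod 2**8;

   Read  : constant Mode_Bits := 2#001#;
   Write : constant Mode_Bits := 2#010#;
   Exec  : constant Mode_Bits := 2#100#;

   procedure Describe (M : Mode_Bits) is
   begin
      case M is
         --  Logical operators stay legal in a choice without
         --  parentheses, since choice_expression allows them.
         when Read or Write or Exec =>
            Put_Line ("read, write and execute");
         when Read or Write =>
            Put_Line ("read and write");
         --  Under the proposed grammar a membership test is not a
         --  choice_expression, so using one as a choice would
         --  require enclosing it in parentheses.
         when others =>
            Put_Line ("some other combination");
      end case;
   end Describe;

begin
   Describe (Read or Write or Exec);
   Describe (Read or Write);
   Describe (Exec);
end Choice_Expression_Demo;
```

The two explicit choices are static and distinct (7 and 3), so the usual no-overlap rule for case choices is satisfied.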
!example
(See proposal.)
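As a further sketch, assuming a compiler that implements the proposed syntax, the following small program compares a multi-choice membership test with the short-circuit form that the wording describes as equivalent. All names (Digit, Is_Token_Char, Is_Token_Char_Expanded, Is_Blood_Type) are hypothetical and chosen only for this illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Membership_Demo is

   --  All names here are hypothetical and exist only for illustration.
   subtype Digit is Character range '0' .. '9';

   --  Proposed form: a range, a single value, and a subtype_mark
   --  combined in one membership_choice_list.
   function Is_Token_Char (C : Character) return Boolean is
   begin
      return C in 'a' .. 'f' | '_' | Digit;
   end Is_Token_Char;

   --  Hand-written short-circuit form: a membership test for the range
   --  and the subtype_mark, an equality test for the single value.
   function Is_Token_Char_Expanded (C : Character) return Boolean is
   begin
      return C in 'a' .. 'f' or else C = '_' or else C in Digit;
   end Is_Token_Char_Expanded;

   --  The tested type need not be discrete: here it is String.
   function Is_Blood_Type (S : String) return Boolean is
   begin
      return S in "A" | "B" | "AB" | "O";
   end Is_Blood_Type;

begin
   --  Check that both forms give the same answer for every Character.
   for C in Character loop
      if Is_Token_Char (C) /= Is_Token_Char_Expanded (C) then
         raise Program_Error with "the two forms disagree";
      end if;
   end loop;

   Put_Line ("Both forms agree for every Character value");
   Put_Line ("Is_Blood_Type (""AB"") = "
             & Boolean'Image (Is_Blood_Type ("AB")));
end Membership_Demo;
```

The choices in Is_Token_Char are static, but nothing in the proposal requires that; variables or function calls could appear in their place.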
!ACATS test
Add an ACATS test for this new feature.
!appendix

From: Robert Dewar
Date: Monday, March 30, 2009  8:02 PM

How about allowing Pascal style sets while we are at it?

   pragma Precondition
      (xyz(3) = 'w' or else xyz(3) = 'x' or else xyz(3) = 'z');

is pretty horrible, now if only we had y instead of z we could write

     (xyz(3) in 'w' .. 'y')

but suddenly for one missing element in the sequence we are driven to the top
form

How about allowing

     (xyz(3) in ('w','x','z'))

easy to implement, easy to read, and very convenient

I would also allow e.g.

    (xyz(3) in ('d'..'f', 'm', 'r'..'s'))

In the case of the GNAT code, we used to have

    Nkind (X) = N_Node1 or else Nkind (X) = N_Node2

and we defined

    Nkind_In (X, N_Node1, N_Node2)

we have nine variants for 2 to 9 arguments, but really it would be much nicer to
say

    Nkind (X) in (N_Node1, N_Node2)

I thought of {} instead of (), but I think () is probably more consistent with
Ada syntax style.

I would allow these forms ONLY in membership tests, and the meaning of

    Expr in (A, B, C)

    Expr in (A) or else Expr in (B) or else Expr in (C)

if A is an expression

    Expr in (A) means Expr = A

if A is a range or subtype name

    Expr in (A) means Expr in A

"not in" by obvious analogy.

It is a little tempting to allow them in loops as well

    for J in (A, B, C) loop

but I would settle for just the membership tests

****************************************************************

From: Tucker Taft
Date: Monday, March 30, 2009  9:09 PM

I would prefer to use the "|" symbol, since that is what is already used in
choice lists.  E.g.:

    Xyz(3) in ('w' .. 'x' | 'z')

Not sure whether the parens should be required.

****************************************************************

From: Robert Dewar
Date: Monday, March 30, 2009  9:21 PM

I am cool with | instead of ,

And yes, if you go with | instead of , you could indeed leave out the parens. I
sort of like the parens (I think of them as {} :-))

But I can go either way. Can I presume that the fact that you are
micro-analyzing the syntax means you agree with the basic idea? :-)

****************************************************************

From: Tucker Taft
Date: Monday, March 30, 2009  9:43 PM

I have always been annoyed that I can use multiple choice ranges in a case
alternative, but not in other similar contexts, such as membership tests (and
possibly "for" loops).  I frequently find myself turning what should be a simple
membership test into a case statement merely to get the multiple-choice-range
feature.

So yes, I support using "|" in more contexts. And I agree that preconditions and
postconditions will be heavy users of membership tests, and it will be annoying
if you can only conveniently talk about a single contiguous range of values.

****************************************************************

From: Randy Brukardt
Date: Monday, March 30, 2009  10:01 PM

> How about allowing
>
>      (xyz(3) in ('w','x','z'))
>
> easy to implement, easy to read, and very convenient

Humm, the devil is in the details here. For what types are these memberships
supposed to work? Can they be static? Etc.

If they're only for static discrete types (as in case statements, based on
Tucker's responses), then I surely agree that they're easy to implement. I'm not
so sure in other cases.

****************************************************************

From: Tucker Taft
Date: Monday, March 30, 2009  10:04 PM

If we don't want to require parentheses in a membership test, then we would have
to change discrete choice to use a simple_expression rather than an expression:

   discrete_choice ::=
     simple_expression | discrete_range | others

and then we could change membership test to be:

     simple_expression [not] in choice_list

where (non-discrete) choice and choice_list are:

   choice ::= simple_expression | range | subtype_mark

   choice_list ::= choice { '|' choice }

****************************************************************

From: Robert Dewar
Date: Monday, March 30, 2009  10:17 PM

If there are no parens, don't we have confusion in

   case x is
     when a in b | c

could be

     when (a in b) | c

or

     when a in (b | c)

****************************************************************

From: Tucker Taft
Date: Monday, March 30, 2009  10:34 PM

Yes, unless we change discrete_choice to use "simple_expression" rather than
"expression" which doesn't sound like such a bad idea in any case.

****************************************************************

From: Robert Dewar
Date: Tuesday, March 31, 2009  8:43 AM

Isn't that a clear upwards incompatibility?

****************************************************************

From: Tucker Taft
Date: Tuesday, March 31, 2009  11:29 AM

Yes, it is an upward incompatibility, though probably a reasonable one.
discrete_choice is used for case alternatives, variant alternatives, and array
aggregates. What you would lose is the ability to use relationals and membership
tests in such contexts, presuming the case expression, discriminant, or array
index is of a boolean type.

Of course these contexts allow no overlap in choices, and the case and variant
alternatives require the choice to be static, so trying to figure how to use a
membership test or a relational in a discrete_choice would be interesting.

E.g.:

     case X is
        when A < B =>
           ...
        when others =>
           ...
     end case;

or

     Y := (A in 1..10 => 42, others => 43);

It is a bit hard to imagine how non-simple_expressions would appear in a
legitimate and legal program. Adding parentheses would probably help such
programs, should they exist.

Note that the suggested change would be upward "consistent," in that the
compiler would always catch the problem.

****************************************************************

From: Bob Duff
Date: Tuesday, March 31, 2009  11:59 AM

> If we don't want to require parentheses in a membership test, then we
> would have to change discrete choice to use a simple_expression rather
> than an expression:
>
>    discrete_choice ::=
>      simple_expression | discrete_range | others

Interestingly, this restriction was present in the Ada 83 syntax, and was
deliberately removed.

See AARM-4.4:


                            Extensions to Ada 83 ...
    15.b  In various contexts throughout the language where Ada 83 syntax
          rules had simple_expression, the corresponding Ada 95 syntax rule
          has expression instead. This reflects the inclusion of modular
          integer types, which makes the logical operators "and", "or", and
          "xor" more useful in expressions of an integer type. Requiring
          parentheses to use these operators in such contexts seemed
          unnecessary and potentially confusing. Note that the bounds of a
          range still have to be specified by simple_expressions, since
          otherwise expressions involving membership tests might be ambiguous.
          Essentially, the operation ".." is of higher precedence than the
          logical operators, and hence uses of logical operators still have to
          be parenthesized when used in a bound of a range.

And AARM-3.8.1:

                         Wording Changes from Ada 83

    29.b  The syntactic category choice is removed. The syntax rules for
          variant, array_aggregate, and case_statement now use
          discrete_choice_list or discrete_choice instead. The syntax rule for
          record_aggregate now defines its own syntax for named associations.

The Ada 83 syntax is:

    choice ::= simple_expression
       | discrete_range | others | component_simple_name

****************************************************************

From: Tucker Taft
Date: Tuesday, March 31, 2009  12:40 PM

Yes, I remember us making that change, simply for uniformity, as there was no
particular reason for the restriction.  Now that we have a reason, well that
changes everything!

****************************************************************

From: Robert Dewar
Date: Tuesday, March 31, 2009  3:25 PM

All these interesting comments on the syntax, but what do you think of the basic
idea? :-)

****************************************************************

From: Bob Duff
Date: Tuesday, March 31, 2009  3:39 PM

I haven't decided yet.  More later...

****************************************************************

From: Ed Schonberg
Date: Friday, April 24, 2009  8:28 AM

Going back to the syntax of set comprehensions: the last proposal was to leave
out parentheses and revert to the Ada 83 rule that only simple expressions can
occur in discrete choices. The fact that this was an incompatibility was
dismissed as "probably harmless". We just came up with an instance of a
relation in a case statement: an OR of assorted constants in a modular
context. An old lesson: any upwards incompatibility is suspicious. So I
would prefer to see set comprehensions as parenthesized, |-separated lists,
with their own production. For now they can only appear in membership tests,
but having an explicit name for them will make it easier at a later date to
introduce them in loops, as was suggested. From there to treating them as
expressions and having proper set algebra is but an easy step!

****************************************************************

From: Robert Dewar
Date: Friday, April 24, 2009  9:41 AM

Comments

1. I much prefer the alternatives syntax, since we already have it in
    the language, and it reads very nicely.

    I do not see these as set comprehensions, but rather just an
    abbreviation in conditionals, similar to COBOL

      IF A EQUALS 2 OR 3 OR 17 THEN

    and the equivalent Ada

      if A in 2 | 3 | 17 then

    seems nice.

2. I am dubious about allowing sets in loops, but even if we
    go that far, the | syntax is still natural (and indeed since
    the keyword is IN, is a smooth extension).

3. I do not like Ed's "easy step", it sounds like a mess to me from
    all points of view (informal definition, formal definition including
    type resolution, and implementation), and is simply too large a step,
    too much outside what Ada is as a language for my taste.

4. Regarding the issue of upwards incompatibility, we have three choices

    a) Just go back to the Ada 83 syntax. As Ed points out, this did
       cause an incompatibility in our run time library, where we had

          when A or B or C =>

       and we have not tried our big test suite with this change (I will
       try that run at some point).

       If we decide on doing this, I would suggest that we immediately
       issue a non-binding interpretation saying that we made a mistake
       in Ada 95 and Ada 20xx and that we allow only simple expressions
       in choices, allowing compilers to immediately implement the
       change now, and avoiding the accumulation of more incompatibility
       in existing code.

    b) Just agree to resolve the potentially ambiguous forms in the Ada
       95 manner. Messy from a formal point of view, but fine in practice
       because only strange test programs will contain things like:

           case Some_Boolean is
              when A in B | C | D => bla bla

    c) Muck with the syntax to simply exclude the use of IN or NOT IN
       in this context, i.e. we introduce a new class of expression,
       e.g., RESTRICTED_EXPRESSION

       which allows unparenthesized use of logical operators but does
       not allow unparenthesized use of IN or NOT IN.

       Fairly easy to implement, a bit awkward to define, but not
       bad, e.g.

         rexpression ::=
            rrelation {and rrelation}  | rrelation {and then rrelation}
          | rrelation {or rrelation}  | rrelation {or else rrelation}
          | rrelation {xor rrelation}

         rrelation ::=
            simple_expression [relational_operator simple_expression]

         Here rexpression = restricted_expression and
         rrelation = restricted_relation (just did not feel like that
         much typing).

 From a user's point of view, b) or c) are far preferable to a).

****************************************************************

From: Ed Schonberg
Date: Friday, April 24, 2009  10:41 AM

or d)  require a parenthesized list. Upwards compatible. Clean and orthogonal.

****************************************************************

From: Robert Dewar
Date: Friday, April 24, 2009  10:52 AM

But my list is predicated on keeping the | syntax. Yes, the parenthesis notation
avoids compatibility issues, but I think that is its only advantage. Remember I
originally suggested the parentheses, but as soon as Tuck suggested use of the
existing choices syntax, that made so much more sense to me, because the usage

     if A in 2 | 3 .. 6 | 9 then

is so nicely parallel to

     case 2 | 3 .. 6 | 9 =>

in both cases, we have an implied OR between equality conditions.

Anyway, let's see if this discussion can inspire some other ARG members to give
their opinions.

****************************************************************

From: Robert Dewar
Date: Friday, April 24, 2009  12:08 PM

I ran a test of our test suite requiring Simple_Expression for choices, number
of problems detected = ZERO, so it is only the one case in our own run-time that
ran into this problem.

****************************************************************

From: Randy Brukardt
Date: Friday, April 24, 2009  9:19 PM

...
> But my list is predicated on keeping the | syntax. Yes, the
> parenthesis notation avoids compatibility issues, but I think that is
> its only advantage. Remember I originally suggested the parentheses,
> but as soon as Tuck suggested use of the existing choices syntax, that
> made so much more sense to me, because the usage
>
>      if A in 2 | 3 .. 6 | 9 then
>
> is so nicely parallel to
>
>      case 2 | 3 .. 6 | 9 =>
>
> in both cases, we have an implied OR between equality conditions.

I agree. If we bother doing this at all, it should look as much as possible like
other things already in the language.

I don't like introducing an incompatibility for it, however -- it doesn't seem
important enough for that. I think Robert's "restricted_expression" idea is
probably best; using a membership (*any* membership) in a case statement is
pretty weird. If we're going to take an incompatibility, it ought to be there
and not in modular operations (which seem much more likely to come up in
practice, and indeed you have an example). But sticking exactly with the Ada 95
rules would be OK, too.

> Anyway, let's see if this discussion can inspire some other ARG
> members to give their opinions.

I've done it...

****************************************************************

From: Robert Dewar
Date: Saturday, April 25, 2009  8:15 AM

I implemented the restricted expression approach in our experimental
implementation of set membership. It took only 10 lines of code, and with this
change we get zero regressions on our own sources and our regression suite (when
we force this restriction on unconditionally).

****************************************************************

From: Ed Schonberg
Date: Sunday, November 1, 2009  10:25 AM

Here is an updated version of AI05-0158 (membership tests)
incorporating the changes suggested by you and Robert.
The new syntax is unambiguous, and presents a minuscule incompatibility
with Ada 2005.

[Editor's note: This is version /02 of the AI.]

****************************************************************

From: Tucker Taft
Date: Sunday, November 1, 2009  11:13 AM

It would help to have some examples.

Also, rather than using "restricted_*" where "restricted"
is somewhat mysterious, how about:

    choice_expression ::=
      choice_relation ...

Also, rather than using simply "choice_list" and "choice", how about
"membership_choice_list" and "membership_choice" to distinguish them from the
other kinds of choices?  Hence:

     choice_expression ::=
       choice_relation {logical_operator choice_relation}

     choice_relation ::=
       simple_expression [relational_operator simple_expression]

     relation ::= choice_relation |
       simple_expression [not] in membership_choice_list

     membership_choice_list ::=
       membership_choice { '|' membership_choice }

     membership_choice ::= choice_expression | range | subtype_mark

...

     discrete_choice ::= choice_expression | discrete_range | OTHERS

...

     expression ::= relation {logical_operator relation}

     logical_operator ::= AND | AND THEN | OR | OR ELSE | XOR

****************************************************************

From: Ed Schonberg
Date: Monday, November 2, 2009  1:25 PM

Good points all. Here is updated version:

[Editor's note: This is version /03 of the AI.]

****************************************************************

From: Tucker Taft
Date: Monday, November 2, 2009  1:43 PM

The wording still talks about "choice_list" but the syntax now uses
"membership_choice_list".

****************************************************************

From: Randy Brukardt
Date: Monday, November 2, 2009  1:48 PM

I'll fix it when I post it. [And I did - ED]

****************************************************************
