!standard 3.8.1(5) 09-11-02 AI05-0158-1/03 !standard 4.4(3) !standard 4.5.2(3) !standard 4.5.2(27) !class Amendment 09-06-07 !status work item 09-06-07 !status received 09-03-30 !priority Low !difficulty Medium !subject Generalizing membership tests !summary Extend the syntax of membership tests to simplify complex conditions that can be expressed as membership in a subset of values of any type. !problem Conditions of the form (X = A) or else (X = B) or else (X = C) where A, B, C are of some arbitrary type are common, and will be more frequent when pre- and post-conditions become widely used. If the values of A, B,and C are contiguous values of some discrete type, the expression can be written as a membership operation: (X in A..C) . Otherwise this syntactic shortcut is not available. Memberships should be more flexible. !proposal We propose a simple extension of the syntax of membership operations, so that the right operand can specify a subset of values of some type. We propose a simple extension of the syntax of membership operations, so that the right operand can specify a subset of values of some type. The type does not need to be discrete, and the choices do not need to be static, so that one can write: if C not in 'A' | 'B' | 'O' then Put_Line ("invalid blood type"); else .. while Name (1 .. N) in T.Element (X) | T.Element (Y) | T.Element (Z) loop ... !wording Modify 3.8.1 (5) as follows: discrete_Choice ::= choice_expression | discrete_Range | others Modify 4.4(3) as follows: expression := relation {logical_operator relation} choice_expression := choice_relation {logical_operator choice_relation} choice_relation ::= simple_expression [relational_operator simple_expression] relation ::= simple_expression [not] in choice_list membership_choice_list ::= membership_choice {'|' membership_choice} membership_choice ::= choice_expression | range | subtype_mark logical_operator ::= and | and then | or | or else | xor Add after 4.5.2(3): (Name Resolution Rules) If the membership_choice_list in a membership operation has more that one choice, then all choices must have a single common type, which is the tested type of the operation. Modify 4.5.2(27) as follows: If the choice_list has a single choice, the simple expression and the choice are elaborated in arbitrary order. If the choice_list has more that one choice, the left operand is elaborated first and the operation is equivalent to a short-circuit sequence of tests : if a choice is a restricted expression the corresponding test is an equality operation, otherwise it is a membership test whose right operand is the corresponding range or subtype_mark. !discussion The new syntax creates an upward incompatibility: a choice cannot be an expression, as in Ada2005, but is limited to an expression without a membership operator. This is likely to be harmless, and seems to be preferable to making the grammar ambiguous, as shown in the following example, in which a choice list can contain an arbitrary expression: X : Boolean := .. case X is when Y in 1..10 | 20 => -- ambiguous: -- could be (Y in 1..10) | 20 -- which is otherwise type-incorrect Even though this is a contrived example, unlikely to show up in realistic code, it seems preferable to modify the grammar as shown above. !example (See proposal.) !ACATS test Add an ACATS test for this new feature. !appendix From: Robert Dewar Date: Monday, March 30, 2009 8:02 PM How about allowing Pascal style sets while we are at it? pragma Precondition (xyz(3) = 'w' or else xyz(3) = 'x' or else xyz(3) = 'z'); is pretty horrible, now if only we had y instead of z we could write (xyz(3) in 'w' .. 'y') but suddenly for one missing element in the sequence we are driven to the top form How about allowing (xyz(3) in ('w','x','z')) easy to implement, easy to read, and very convenient I would also allow e.g. (xyz(3) in ('d'..'f', 'm', 'r'..'s')) In the case of the GNAT code, we used to have Nkind (X) = N_Node1 or else Nkind (X) = N_Node2 and we defined Nkind_In (X, N_Node1, Node_2) we have nine variants for 2 to 9 arguments, but really it would be much nicer to say Nkind (X) in (N_Node1, N_Node2) I thought of {} instead of (), but I think () is probably more consistent with Ada syntax style. I would allow these forms ONLY in membership tests, and the meaning of Expr in (A, B, C) Expr in (A) or else Expr in (B) or else Expr in (C) if A is an expression Expr in (A) means Expr = A if A is a range or subtype name Epxr in (A) means Expr in A notin by obvious analogy. It is a little tempting to allow them in loops as well for J in (A, B, C) loop but I would settle for just the membership tests **************************************************************** From: Tucker Taft Date: Monday, March 30, 2009 9:09 PM I would prefer to use the "|" symbol, since that is what is already used in choice lists. E.g.: Xyz(3) in ('w' .. 'x' | 'z') Not sure whether the parens should be required. **************************************************************** From: Robert Dewar Date: Monday, March 30, 2009 9:21 PM I am cool with | instead of , And yes, if you go with | instead of , you could indeed leave out the parens. I sort of like the parens (I think of them as {} :-)) But I can go either way. Can I presume that the fact that you are micro-analyzing the syntax means you agree with the basic idea? :-) **************************************************************** From: Tucker Taft Date: Monday, March 30, 2009 9:43 PM I have always been annoyed that I can use multiple choice ranges in a case alternative, but not in other similar contexts, such as membership tests (and possibly "for" loops). I frequently find myself turning what should be a simple membership test into a case statement merely to get the multiple-choice-range feature. So yes, I support using "|" in more contexts. And I agree that preconditions and postconditions will be heavy users of membership tests, and it will be annoying if you can only conveniently talk about a single contiguous range of values. **************************************************************** From: Randy Brukardt Date: Monday, March 30, 2009 10:01 PM > How about allowing > > (xyz(3) in ('w','x','z')) > > easy to implement, easy to read, and very convenient Humm, the devil is in the details here. For what types are these memberships supposed to work? Can they be static? Etc. If they're only for static discrete types (as in case statements, based on Tucker's responses), then I surely agree that they're easy to implement. I'm not so sure in other cases. **************************************************************** From: Tucker Taft Date: Monday, March 30, 2009 10:04 PM If we don't want to require parentheses in a membership test, then we would have to change discrete choice to use a simple_expression rather than an expression: discrete_choice ::= simple_expression | discrete_range | others and then we could change membership test to be: simple_expression [not] in choice_list where (non-discrete) choice and choice_list are: choice ::= simple_expression | range | subtype_mark choice_list ::= choice { '|' choice } **************************************************************** From: Robert Dewar Date: Monday, March 30, 2009 10:17 PM If there are no parens, don't we have confusion in case x is when a in b | c could be when (a in b) | c or when a in (b | c) **************************************************************** From: Tucker Taft Date: Monday, March 30, 2009 10:34 PM Yes, unless we change discrete_choice to use "simple_expression" rather than "expression" which doesn't sound like such a bad idea in any case. **************************************************************** From: Robert Dewar Date: Tuesday, March 31, 2009 8:43 AM Isn't that a clear upwards incompatibility? **************************************************************** From: Tucker Taft Date: Tuesday, March 31, 2009 11:29 AM Yes, it is an upward incompatibility, though probably a reasonable one. discrete_choice is used for case alternatives, variant alternatives, and array aggregates. What you would lose is the ability to use relationals and membership tests in such contexts, presuming the case expression, discriminant, or array index is of a boolean type. Of course these contexts allow no overlap in choices, and the case and variant alternatives require the choice to be static, so trying to figure how to use a membership test or a relational in a discrete_choice would be interesting. E.g.: case X is when A < B => ... when others => ... end case; or Y := (A in 1..10 => 42, others => 43); It is a bit hard to imagine how non-simple_expressions would appear in a legitimate and legal program. Adding parentheses would probably help such programs, should they exist. Note that the suggested change would be upward "consistent," in that the compiler would always catch the problem. **************************************************************** From: Bob Duff Date: Tuesday, March 31, 2009 11:59 AM > If we don't want to require parentheses in a membership test, then we > would have to change discrete choice to use a simple_expression rather > than an expression: > > discrete_choice ::= > simple_expression | discrete_range | others Interestingly, this restriction was present in the Ada 83 syntax, and was deliberately removed. See AARM-4.4: Extensions to Ada 83 ... 15.b In various contexts throughout the language where Ada 83 syntax rules had simple_expression, the corresponding Ada 95 syntax rule has expression instead. This reflects the inclusion of modular integer types, which makes the logical operators "and", "or", and "xor" more useful in expressions of an integer type. Requiring parentheses to use these operators in such contexts seemed unnecessary and potentially confusing. Note that the bounds of a range still have to be specified by simple_expressions, since otherwise expressions involving membership tests might be ambiguous. Essentially, the operation ".." is of higher precedence than the logical operators, and hence uses of logical operators still have to be parenthesized when used in a bound of a range. And AARM-3.8.1: Wording Changes from Ada 83 29.b The syntactic category choice is removed. The syntax rules for variant, array_aggregate, and case_statement now use discrete_choice_list or discrete_choice instead. The syntax rule for record_aggregate now defines its own syntax for named associations. The Ada 83 syntax is: choice ::= simple_expression | discrete_range | others | component_simple_name **************************************************************** From: Tucker Taft Date: Tuesday, March 31, 2009 12:40 PM Yes, I remember us making that change, simply for uniformity, as there was no particular reason for the restriction. Now that we have a reason, well that changes everything! **************************************************************** From: Robert Dewar Date: Tuesday, March 31, 2009 3:25 PM All these interesting comments on the syntax, but what do you think of the basic idea? :-) **************************************************************** From: Bob Duff Date: Tuesday, March 31, 2009 3:39 PM I haven't decided yet. More later... **************************************************************** From: Ed Schonberg Date: Friday, April 24, 2009 8:28 AM Going back to the syntax of set comprehensions: the last proposal was to leave out parentheses and revert to the Ada83 rule that only simple expressions can occur is discrete choices. The fact that this was an incompatibility was dismissed as "probably harmless". We just came up with an instance of a relation in a case statement : an OR of assorted constants in a modular context. An old lesson: any upwards incompatibility is suspicious. So I would prefer to see set comprehensions as parenthesized, |-separated lists, with their own production. For now they can only appear in membership tests, but having an explicit name for them will make it easier at a later date to introduce them in loops, as was suggested. From there to treating them as expressions and having proper set algebra is but an easy step! **************************************************************** From: Robert Dewar Date: Friday, April 24, 2009 9:41 AM Comments 1. I much prefer the alternatives syntax, since we already have it in the language, and it reads very nicely. I do not see these as set comprehensions, but rather just an abbreviation in conditionals, similar to COBOL ' IF A EQUALS 2 OR 3 OR 17 THEN and the equivalent Ada if A in 2 | 3 | 17 then seems nice. 2. I am dubious about allowing sets in loops, but even if we go that far, the | syntax is still natural (and indeed since the keyword is IN, is a smooth extension). 3. I do not like Ed's "easy step", it sounds like a mess to me from all points of view (informal definition, formal definition including type resolution, and implementation), and is simply too large a step, too much outside what Ada is as a language for my taste. 4. Regarding the issue of upwards incompatibility, we have three choices a) Just go back to the Ada 83 syntax. As Ed points out, this did cause an incompatibility in our run time library, where we had when A or B or C => and we have not tried our big test suite with this change (I will try that run at some point). If we decide on doing this, I would suggest that we immediately issue a non-binding interpretation saying that we made a mistake in Ada 95 and Ada 20xx and that we allow only simple expressions in choices, allowing compilers to immediately implement the change now, and avoiding the accumulation of more incompatibility in existing code. b) Just agree to resolve the potentially ambiguous forms in the Ada 95 manner. Messy from a formal point of view, but fine in practice because only strange test programs will contain things like: case Some_Boolean is when A in B | C | D => bla bla c) Muck with the syntax to simply exclude the use of IN or NOT IN in this context, i.e. we introduce a new class of expression, e.g, RRESTRICTED_EXPRESSION which allows unparenthesized use of logical operators but does not allow unparenthesized use of IN or NOT IN. Fairly easy to implement, a bit awkward to define, but not bad, e.g. rexpression ::= rrelation {and rrelation} | rrelation {and then rrelation} | rrelation {or rrelation} | rrelation {or else rrelation} | rrelation {xor rrelation} rrelation ::= simple_expression [relational_operator simple_expression] Here rexpression = restricted_expression and rrelation = restricted_relation (just did not feel like that much typing). From a users point of view, b) or c) are far preferable to a). **************************************************************** From: Ed Schonberg Date: Friday, April 24, 2009 10:41 AM or d) require a parenthesized list. Upwards compatible. Clean and orthogonal. **************************************************************** From: Robert Dewar Date: Friday, April 24, 2009 10:52 AM But my list is predicated on keeping the | syntax. Yes, the parenthesis notation avoids compatibility issues, but I think that is its only advantage. Remember I originally suggested the parentheses, but as soon as Tuck suggested use of the existing choices syntax, that made so much more sense to me, because the usage if A in 2 | 3 .. 6 | 9 then is so nicely parallel to case 2 | 3 .. 6 | 9 => in both cases, we have an implied OR between equality conditions. Anyway, let's see if this discussion can inspire some other ARG members to give their opinions. **************************************************************** From: Robert Dewar Date: Friday, April 24, 2009 12:08 PM I ran a test of our test suite requiring Simple_Expression for choices, number of problems detected = ZERO, so it is only the one case in our own run-time that ran into this problem. **************************************************************** From: Randy Brukardt Date: Friday, April 24, 2009 9:19 PM ... > But my list is predicated on keeping the | syntax. Yes, the > parenthesis notation avoids compatibility issues, but I think that is > its only advantage. Remember I originally suggested the parentheses, > but as soon as Tuck suggested use of the existing choices syntax, that > made so much more sense to me, because the usage > > if A in 2 | 3 .. 6 | 9 then > > is so nicely parallel to > > case 2 | 3 .. 6 | 9 => > > in both cases, we have an implied OR between equality conditions. I agree. If we bother doing this at all, it should look as much as possible like other things already in the language. I don't like introducing an incompatibility for it, however -- it doesn't seem important enough for that. I think Robert's "restricted_expression" idea is probably best; using a membership (*any* membership) in a case statement is pretty weird. If we're going to take an incompatibility, it ought to be there and not in modular operations (which seem much more likely to come up in practice, and indeed you have an example). But sticking exactly with the Ada 95 rules would be OK, too. > Anyway, let's see if this discussion can inspire some other ARG > members to give their opinions. I've done it... **************************************************************** From: Robert Dewar Date: Saturday, April 25, 2009 8:15 AM I implemented the restricted expression approach in our experimental implementation of set membership. It took only 10 lines of code, and with this change we get zero regressions on our own sources and our regression suite (when we force this restriction on unconditionally). **************************************************************** From: Ed Schonberg Date: Sunday, November 1, 2009 10:25 AM Here is an updated version of AI05-0158 (membership tests) incorporating the changes suggested by you and Robert. The new syntax is unambiguous, and presents a minuscule incompatibility with Ada2005. [Editor's note: This is version /02 of the AI.] **************************************************************** From: Tucker Taft Date: Sunday, November 1, 2009 11:13 AM It would help to have some examples. Also, rather than using "restricted_*" where "restricted" is somewhat mysterious, how about: choice_expression ::= choice_relation ... also, rather than using simply "choice_list" and "choice" how about "membership_choice_list" and "membership_choice" to distinguish it from the other kinds of choices? Hence: choice_expression ::= choice_relation {logical_operator choice_relation} choice_relation ::= simple_expression [relational_operator] simple_expression relation ::= choice_relation | simple_expression [not] in membership_choice_list membership_choice_list ::= membership_choice { '|' membership_choice } membership_choice ::= choice_expression | range | subtype_mark ... discrete_choice ::= choice_expression | discrete_range | OTHERS ... expression ::= relation {logical_operator relation} logical_operator ::= AND | AND THEN | OR | OR ELSE | XOR **************************************************************** From: Ed Schonberg Date: Monday, November 2, 2009 1:25 PM Good points all. Here is updated version: [Editor's note: This is version /03 of the AI.] **************************************************************** From: Tucker Taft Date: Monday, November 2, 2009 1:43 PM The wording still talks about "choice_list" but the syntax now uses "membership_choice_list". **************************************************************** From: Randy Brukardt Date: Monday, November 2, 2009 1:48 PM I'll fix it when I post it. [And I did - ED] ****************************************************************