Version 1.5 of ai12s/ai12-0214-1.txt


!standard 5.4(4/3)          16-01-09 AI12-0214-1/01
!class Amendment 17-01-09
!status work item 17-01-09
!status received 16-10-08
!priority Very_Low
!difficulty Hard
!subject Case pattern matching
!summary
** TBD.
!problem
Ada has case statements and expressions, which allow testing the value of a variable and which enforce full coverage checking. On the other hand, they only work on discrete types, which limits their usefulness.
Ada also has discriminated types with variant parts, which allow a record type to have a variable "shape" depending on the value of a discriminant. This is a very useful facility, but there is no way to ensure statically that a user is accessing the correct components, so we have to resort to dynamic checks.
    type Maybe (Option : Boolean) is record
       case Option is
          when True  => Val : Integer;
          when False => null;
       end case;
    end record;

    procedure Print_The_Maybe (M : Maybe) is
    begin
       if M.Option = False then
          Put_Line (M.Val'Img); -- Error here, not statically checked
       else
          null;
       end if;
    end Print_The_Maybe;
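To make the cost of the dynamic check concrete, here is a minimal, self-contained sketch in current Ada. The names Maybe_Demo and Nothing are illustrative only and not part of the proposal. The inverted test compiles, and the mistake is expected to surface only at run time, as a Constraint_Error raised by the discriminant check.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Maybe_Demo is
   type Maybe (Option : Boolean) is record
      case Option is
         when True  => Val : Integer;
         when False => null;
      end case;
   end record;

   --  Buggy on purpose: the test is inverted, as in the example above.
   procedure Print_The_Maybe (M : Maybe) is
   begin
      if M.Option = False then
         --  Val does not exist when Option is False; only a run-time
         --  discriminant check catches this.
         Put_Line (Integer'Image (M.Val));
      end if;
   end Print_The_Maybe;

   --  Hypothetical test value with no Val component.
   Nothing : constant Maybe := (Option => False);
begin
   Print_The_Maybe (Nothing);
exception
   when Constraint_Error =>
      Put_Line ("Constraint_Error: discriminant check failed at run time");
end Maybe_Demo;
```

A compiler may warn about this particular case, but the language does not require it to reject the program.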
!proposal
We propose introducing a new construct, mainly in the alternatives of case statements and expressions: pattern matching of the value being discriminated upon.
We would use aggregate syntax to specify the structure and expected values of the value matched upon, like in the following example:
    type R is record
       A, B : Boolean;
    end record;

    R_Inst : R;

    case R_Inst is
       when (True, True) => ..
       when (True, False) => ..
       when (False, False) => ..
       when (False, True) => ..
       -- Every possibility has been covered.
    end case;
Along with this proposal, we propose relaxing the legality rule on the type of the top-level value being matched upon by a case expression or statement, from discrete types only to the following rule:
The type of the top-level value needs to have a finite and known number of shapes, with shape being defined as follows:
- For numeric types, the number of shapes is the number of possible values of the type.
- For record types, the shapes are the set of possible component lists
  (1 for regular records, N for variant records).
- For array types with known bounds, there is one shape.
- For other array types, there are as many shapes as there are possible
  (Start_Index, End_Index) combinations.
- For private types, there is no known shape.
The box can be used when the user just wants to ignore a value, and "others" can be used in an aggregate to denote every other component, as in regular aggregates and with the same limitations.
    type Arr is array (Natural range <>) of Integer;

    A : Arr := ...;

    case A is
       -- Match when first element is one
       when (1, others => <>) => ...
       -- Match when every element except first is one
       when (<>, others => 1) => ...
    end case;
But the proposal is not yet complete. In the Maybe example from the problem statement, we want not only to match on the value of the Option discriminant, but also to easily get the value of the Val component when Option is True. For this, we want to introduce a new variable binding that will be bound to Val. In terms of coverage, it is equivalent to a wildcard (any value for Val will trigger the match):
    procedure Print_The_Maybe (M : Maybe) is
    begin
       case M is
          declare V : Integer when (True, V) => Put_Line (V'Img);
          when (Option => False) => null;
       end case;
    end Print_The_Maybe;
Since there can be no ambiguity on the type of V in the above example, we propose making the type annotation optional, and allow the following:
    procedure Print_The_Maybe (M : Maybe) is
    begin
       case M is
          declare V when (True, V) => Put_Line (V'Img);
          when (Option => False) => null;
       end case;
    end Print_The_Maybe;
The great benefit of the above is that with this construct, it is impossible to access the value of Val when the Option discriminant is false: An incorrect aggregate structure will result in an error.
This in turn allows more expressive ways of designing APIs when you need to return several values. So far you had the following options:
    -- Example 1
    procedure Read (Success : out Boolean; Number : out Integer);

    -- Usage:
    declare
       Success : Boolean;
       Number  : Integer;
    begin
       Read (Success, Number);
       Put_Line (Number'Image);
    end;
    -- A lot of boiler plate just to read and print a number. Also user can
    -- ignore error case !
    -- Example 2
    function Read (Number : out Integer) return Boolean;

    -- Usage:
    declare
       Number  : Integer;
       Success : Boolean := Read (Number);
    begin
       Put_Line (if Success then Number'Image else "No number");
    end;
    -- A lot of boiler plate still
    -- Example 3
    type Read_Result (Success : Boolean) is record
       case Success is
          when True  => Result : Integer;
          when False => null;
       end case;
    end record;

    function Read return Read_Result;

    -- Usage:
    declare
       Read_Res : Read_Result := Read;
    begin
       Put_Line (if Read_Res.Success then Read_Res.Result'Image else "No number");
    end;
With pattern matching, taking the third example's API, you can express the use case as:
    Put_Line (case Read is
                 declare Result when (True, Result) => Result'Image,
                 when (Success => False) => "No_Number");
You don't need to introduce a procedural context to process the result of the read, and the user has to handle the failure case, as opposed to the previous examples.
This pattern of encouraging the user to handle error cases, and reminding them to handle every case via coverage checking, is very useful in contexts where exceptions are not an option.
The rule on type shapes means that the user cannot match on a value of a Float type, but can match on a record containing a Float. For subcomponents of a type with no known number of shapes, the only allowed matchers are either a reference to a pattern object declaration or a box:
    type Rec (Option : Boolean) is record
       Value : Float;
    end record;

    R : Rec;

    case R is
       when (True, 1.0) => ... -- Illegal
       when (True, <>) => ... -- Legal
       declare A when (True, A) => ... -- Legal
    end case;
This feature allows matching on string literals, which are considered like array aggregates:
    case S is
       when "begin" => ...;
       when "end" => ...;
       when others => ...;
    end case;
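For comparison, a minimal sketch of what the string match above has to look like in current Ada, where a case statement cannot select on a String. Token_Kind, Classify and Keyword_Demo are hypothetical names used only for this illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Keyword_Demo is
   --  Hypothetical classification used only for this illustration.
   type Token_Kind is (Begin_Kw, End_Kw, Other);

   --  Current Ada needs an if/elsif chain of string comparisons where
   --  the proposal would allow "case S is when "begin" => ...".
   function Classify (S : String) return Token_Kind is
   begin
      if S = "begin" then
         return Begin_Kw;
      elsif S = "end" then
         return End_Kw;
      else
         return Other;  --  plays the role of "when others"
      end if;
   end Classify;
begin
   Put_Line (Token_Kind'Image (Classify ("begin")));
   Put_Line (Token_Kind'Image (Classify ("end")));
   Put_Line (Token_Kind'Image (Classify ("loop")));
end Keyword_Demo;
```

Since String is an unconstrained array type with a conceptually unbounded number of shapes, the final else branch corresponds to the mandatory "others" alternative in the proposed form.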
Syntax
This proposal requires introducing a new branch in the case_statement_alternative and case_expression_alternative rules.
    pattern_object_declaration ::=
        defining_identifier [":" (subtype_indication | access_definition | array_type_definition)]

    pattern_object_declarations ::=
        pattern_object_declaration {"," pattern_object_declaration}

    case_statement_alternative ::=
        "declare" pattern_object_declarations "when" discrete_choice_list "=>" sequence_of_statements
      | "when" discrete_choice_list "=>" sequence_of_statements

    case_expression_alternative ::=
        "declare" pattern_object_declarations "when" discrete_choice_list "=>" dependent_expression
      | "when" discrete_choice_list "=>" dependent_expression
Name Resolution Rules
A pattern matching literal should have the valid structure of a literal for the value matched. Its type is the type of the value matched, without ambiguities.
In the case of aggregate literals, the type of each subcomponent's value is the type of the expected subcomponent, as in a regular aggregate.
In the case of aggregate literals with values bound through pattern object declarations, their type is the type of the subcomponent in the type definition.
Even though the type of a component cannot be ambiguous, the user is allowed to annotate the type of the pattern object declaration, for clarity purposes and for consistency with other object declarations. In that case, the type annotation needs to refer exactly to the type of the component.
In order to allow matching on records containing access values, if the matched value or sub-component is of an access type, then a literal of the type designated by the access type is allowed. This would allow use of the form:
    type Linked_List_Node;
    type Linked_List is access all Linked_List_Node;

    type Linked_List_Node is record
       Val  : Integer;
       Next : Linked_List;
    end record;

    L : Linked_List;

    case L.all is
       declare First, Second, Third, Next
       when (First, (Second, (Third, Next))) => ...
       when others => ...
    end case;
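As a point of comparison, here is a sketch of the explicit navigation that the nested pattern above would replace in current Ada. It assumes that a null access value simply fails to match, which the text above does not specify. Sum_Of_First_Three and List_Demo are hypothetical names.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure List_Demo is
   type Linked_List_Node;
   type Linked_List is access all Linked_List_Node;

   type Linked_List_Node is record
      Val  : Integer;
      Next : Linked_List;
   end record;

   --  Hand-written counterpart of matching (First, (Second, (Third, Next))).
   --  Assumption: a null link means "no match" and falls through to the
   --  equivalent of "when others".
   function Sum_Of_First_Three (L : Linked_List) return Integer is
   begin
      if L /= null
        and then L.Next /= null
        and then L.Next.Next /= null
      then
         return L.Val + L.Next.Val + L.Next.Next.Val;
      else
         return 0;
      end if;
   end Sum_Of_First_Three;

   L : constant Linked_List :=
     new Linked_List_Node'
       (1, new Linked_List_Node'(2, new Linked_List_Node'(3, null)));
begin
   Put_Line (Integer'Image (Sum_Of_First_Three (L)));
end List_Demo;
```

Each component path (L.Next.Next.Val and so on) is spelled out by hand here, which is the repetition the pattern object declarations are meant to avoid.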
Legality Rules
The only entities allowed in a pattern matching literal are other literals and statically known constants. Referencing a non constant declaration defined in an outer scope is illegal.
Declared pattern objects are constant views of the components they refer to. Modifying their value is illegal. Taking a non-constant access on them is illegal.
The coverage check rule is relaxed in the following way:
1. A pattern object is considered to cover every possible value of the matched
   value, regardless of whether the type of the value is discrete or not.
2. The set represented by the union of all possible case alternatives needs to
   cover every possible value of the type being matched upon.
3. The only option for the user to cover every possible value of a type with a
   conceptually infinite number of values is to use a pattern object, a box,
   or a top level "others =>" alternative.
4. The box and pattern objects won't trigger overlap errors for already covered
   values, so that the following is possible:
    case A is
       when (1, True) => ...
       declare A when (1, A) => ...
    end case;
5. Every match via a literal for a specific value must precede a match for this
   specific value made via a box or via a pattern object.
    type Rec is record
       A, V : Integer range 1 .. 3;
    end record;

    R : Rec;

    case R is
       when (1, <>) => ...
       -- Illegal! Should be before the matching of V subcomponent via box
       when (1, 1) => ...
    end case;
The type of the value being matched upon needs to have a known number of shapes, following the definition of shape outlined in the proposal.
Dynamic semantics
The pattern matching literal will be structurally and recursively compared to the value matched against. In case of a pattern object, any value is accepted. The match will succeed if every literal or constant matched is equal to its counterpart in the dynamic value matched upon.
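One way to read these dynamic semantics is as a hand-written expansion in current Ada. The sketch below does this for the two alternatives of the Print_The_Maybe example; it is only an illustration of the intended behavior, not a proposed implementation model, and Describe and Match_Sketch are hypothetical names.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Match_Sketch is
   type Maybe (Option : Boolean) is record
      case Option is
         when True  => Val : Integer;
         when False => null;
      end case;
   end record;

   function Describe (M : Maybe) return String is
   begin
      --  Hand-written counterpart of:
      --     declare V when (True, V) => ...
      --     when (Option => False)   => ...
      case M.Option is
         when True =>
            declare
               --  Constant view of the component, since M is an "in"
               --  parameter; any value of Val is accepted.
               V : Integer renames M.Val;
            begin
               return "Some" & Integer'Image (V);
            end;
         when False =>
            return "None";
      end case;
   end Describe;
begin
   Put_Line (Describe ((Option => True, Val => 42)));
   Put_Line (Describe ((Option => False)));
end Match_Sketch;
```

In this expansion the literal True is compared against the discriminant, while the pattern object V accepts any value and merely names the component.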
!wording
** TBD (some of the above could be used)
!discussion
Coverage check
From preliminary discussions, I expect one of the biggest objections to this proposal to be the lifting of the restriction of the case statement and expression to work only on values of discrete types.
Lifting it is fundamental to this proposal. There were, however, a number of different possibilities:
1. Restrict to types that directly or indirectly have a discrete number of
   possible values. A record containing integer sub-components would be
   allowed, but not a record containing a floating point sub-component.
2. Restrict to types that have a limited number of shapes. Don't allow direct
   matching on literals for types that are not discrete. That's the choice
   being made currently.
3. Allow everything, just enforce the "others" when there is no discrete number
   of possible values.
It is felt that 2 is the most pragmatic choice. It is necessary in order to make pattern matching useful enough, because the first alternative would rule out useful use cases such as this one:
    type Read_Result (Success : Boolean) is record
       case Success is
          when True  => Result : Integer;
          when False => null;
       end case;
    end record;

    function Read return Read_Result;

    case Read is
       declare Result when (True, Result) => ...
       when others => ...
    end case;
Ranges, subtypes:
When matching discrete types, one can use ranges and subtypes to match a set of values of the discrete type. The question is, should we allow matching sub-components in the same way, as in:
    type Rec is record
       A, V : Integer range 1 .. 3;
    end record;

    R : Rec;

    case R is
       when (1 .. 2, 1 .. 2) =>
       when (1 .. 3, 3) =>
       when (3, 1 .. 3) =>
    end case;
Syntax
The syntax for declaring pattern objects is new so we must consider different options.
First proposal:
    case A is
       when (1, True) => ...
       declare A when (1, A) => ... -- Declaration via declare .. when ...
    end case;
This one is pretty verbose but feels very Ada idiomatic in our opinion.
Rejected alternative:
    case A is
       when (1, True) => ...
       when (1, A) => ... -- Implicit declaration via a bare identifier
    end case;
This proposal corresponds to the way it is done in most languages with pattern-matching, but we feel it is not adapted to Ada.
While sufficient, this syntax would be confusing: we don't know whether A is a statically known constant or a newly declared pattern object without looking at the context. Also, if A also refers to an outer scope declaration, the code might do something other than what the user expects, if they are not familiar with the rule that variables cannot be used in matches.
Third proposal:
    case A is
       when (1, True) => ...
       when (1, A is <>) => ... -- Declaration via identifier "is" <>
    end case;
Less verbose than the first proposal, and also has the advantage of composing nicely with the box matching syntax. Could also accommodate optional type annotations. Possible cons: the aggregate is cluttered, declarations are scattered, and the alterations to the grammar would be less localized.
Alternatives order rule
Rule 5 of the Legality Rules might be too restrictive:
    5. Every match via a literal for a specific value must precede a match for
       this specific value made via a box or via a pattern object.
This disallows the following, which intuitively corresponds to a valid execution:
    case A is
       when (1, 1) => ...
       when (1, <>) => ...
       when (<>, 1) => ...
    end case;
First option:
- Alternatives are considered in sequential order.
- If an element of an alternative is less general than the same one in a
  preceding alternative, then the subset of cases handled by this alternative
  and by none of the preceding alternatives must be non-empty.
This would disallow the following
    case A is
       when (1, <>) => ...
       when (1, 1) => ... -- This pattern can never be matched
    end case;
But allow those:
    case A is
       when (1, <>) => ...
       when (<>, 1) => ...
    end case;

    case A is
       when (<>, 1) => ...
       when (1, <>) => ...
    end case;
In the example above, both case statements are valid, and for the value (1, 1), a different code path will be executed, making the branches order sensitive, which goes against the current design of the case statement.
This is the way pattern matching works in OCaml, Haskell, etc.
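The order sensitivity of this first option can be illustrated in current Ada by emulating each sequence of alternatives with an if/elsif chain. This is a sketch only: Pair, First_Order, Second_Order and Order_Demo are hypothetical names, and a final "others" branch is added to keep the functions total.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Order_Demo is
   type Pair is record
      A, V : Integer range 1 .. 3;
   end record;

   --  Emulates: when (1, <>) => ... when (<>, 1) => ...
   function First_Order (P : Pair) return String is
   begin
      if P.A = 1 then
         return "(1, <>)";
      elsif P.V = 1 then
         return "(<>, 1)";
      else
         return "others";
      end if;
   end First_Order;

   --  Emulates: when (<>, 1) => ... when (1, <>) => ...
   function Second_Order (P : Pair) return String is
   begin
      if P.V = 1 then
         return "(<>, 1)";
      elsif P.A = 1 then
         return "(1, <>)";
      else
         return "others";
      end if;
   end Second_Order;
begin
   --  The same value (1, 1) selects a different branch in each ordering.
   Put_Line (First_Order ((1, 1)));   --  prints "(1, <>)"
   Put_Line (Second_Order ((1, 1)));  --  prints "(<>, 1)"
end Order_Demo;
```

Under the current rules for case statements, reordering alternatives never changes which one is selected, which is the property that the second option tries to preserve.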
Second option:
- An alternative cannot appear twice.
- If an element of an alternative is less general than the same one in a
  preceding alternative, then the subset of cases allowed by both alternatives
  must be covered by a preceding alternative.
In this option, we're constraining the order of the alternatives for readability, i.e. we force the user to go from less general to more general matches, but the order of alternatives has no direct influence on the code that will be executed in the end, making the proposal more in line with the current case statement and expression.
    case A is -- ILLEGAL
       when (1, <>) => ...
       when (<>, 1) => ...
    end case;

    case A is -- ILLEGAL
       when (<>, 1) => ...
       when (1, <>) => ...
    end case;

    case A is -- LEGAL
       when (1, 1) => ...
       when (<>, 1) => ...
       when (1, <>) => ...
    end case;
!ASIS
** TBD.
!ACATS test
Many new ACATS tests would be needed to check that the new capabilities are supported.
!appendix

From: Raphael Amiard
Sent: Sunday, October 9, 2016  7:41 AM

Here is an AI for a feature proposal I've been drafting with some help. Of
course too late to discuss at this meeting, but it'll leave a lot of time for
people to look at it until the next one though !
[This is version /01 of the AI - Editor.]

****************************************************************

From: Tucker Taft
Sent: Thursday, October 13, 2016  10:14 AM

Did you consider the syntax:

    when (True, <A>) =>

where <id> is declaring id to represent what "<>" would have represented on
its own?

I think we also talked about:

   when R : (True, <>) =>

where you now use R.blah to refer to parts matched by <>

I find the "declare ... when ..." syntax too verbose, and think the
"when R : ( ... ) =>"  syntax the most consistent with how exception
occurrences are declared now.

****************************************************************

From: Raphael Amiard
Sent: Thursday, October 13, 2016  10:29 AM

> Did you consider the syntax:
>
>    when (True, <A>) =>

No, we didn't think about that. On the one hand, I like it because it's very
concise, coherent with the unnamed case, and quite clear about what this does !

On the other hand I'm worried that it will make the lexer's work a bit harder,
since here "<A>" can be parsed either as "Op(LT), Id(A), Op(GT)" or as
"Pattern_Match_Id(A)".

I'll try and implement this in libadalang's parser, to see what the
repercussions are.

>
> where <id> is declaring id to represent what "<>" would have 
> represented on its own?
>
> I think we also talked about:
>
>   when R : (True, <>) =>
>
> where you now use R.blah to refer to parts matched by <>
>
> I find the "declare ... when ..." syntax too verbose, and think the "when R : 
> ( ... ) =>" syntax the most consistent with how exception occurrences 
> are declared now.

Yes we did. I think that being able to name the top level object is a great
capability, so I'll add it to the AI. I don't however, as explained in the
previous exchanges, think that it is a substitute for sub-component matching.
If you want I can submit my rationale on the ARG thread.

****************************************************************

From: Tucker Taft
Sent: Thursday, October 13, 2016  10:40 AM

...
> On the other hand I'm worried that it will make the lexer's work a bit 
> harder, since here "<A>" can be parsed either as "Op(LT), Id(A), Op(GT)"
> or as "Pattern_Match_Id(A)".

This one is pretty easy, because you can actually lex it as LT, Id, GT.  But
you just have to distinguish in the parser between unary "<" and binary ">"
which is pretty easy.  We make that sort of distinction all the time.

> I'll try and implement this in libadalang's parser, to see what the repercussions are.

I would be surprised if it is difficult to handle in the parser.  I don't see
any need to alter the lexer for this.

...
>> I find the "declare ... when ..." syntax too verbose, and think the 
>> "when R : ( ... ) =>" syntax the most consistent with how exception
>> occurrences are declared now.
>
> Yes we did. I think that being able to name the top level object is a 
> great capability, so I'll add it to the AI. I don't however, as 
> explained in the previous exchanges, think that it is a substitute for 
> sub-component matching. If you want I can submit my rationale on the ARG thread.

Yes, please do, as I don't remember why you think it is not an adequate substitute.

****************************************************************

From: Raphael Amiard
Sent: Thursday, October 13, 2016  10:41 AM

...
>This one is pretty easy, because you can actually lex it as LT, Id, GT.  But
>you just have to distinguish in the parser between unary "<" and binary ">"
>which is pretty easy.  We make that sort of distinction all the time. 

Yes, you're probably right !

>Yes, please do, as I don't remember why you think it is not an adequate
>substitute. 

Here it is, slightly edited to use the new syntax you proposed - I already
love it :)


Naming the object that is being matched upon, while it can be useful, is not
sufficient. First it's not as expressive. You'll have to repeat the path to the
sub (sub-sub) component you wanted to match, which is verbose and possibly
error prone. And then, if you want to make a rule out of the fact that you can
statically check that the path is correct, the implementation will be more
complex, because you'll have to remember the paths and check that what the user
is doing is going along those paths. Taking the realistically complex example
of the connection I showed earlier:

type Connection_State is (Init, Connecting, Connected, Disconnected);

type Ping (Has_Ping_Info : Boolean := False) is record
   case Has_Ping_Info is
   when True =>
      Last_Ping_Time    : Time_T;
      Last_Ping_Id      : Ping_Id;
   when False =>
      null;
   end case;
end record;

type Connection_Info (State : Connection_State) is record
   Server : Internet_Address;
   case State is
   when Connected =>
      Session_Id        : Unbounded_String;
      Ping_Info         : Ping;
   when Connecting =>
      When_Initiated    : Time_T;
   when Disconnected =>
      When_Disconnected : Time_T;
   when Init =>
      null;
   end case;
end record;

C : Connection_Info;

case C is
   when (Connected, <S_Id>, (True, <>, <Ping_Time>)) =>
      Put_Line ("Connected ! Session Id is " & S_Id & " Ping time is " & Ping_Time'Image);
   when others => null;
end case;

Contrast with only top-level object naming:

case C is
   when CC : (Connected, <>, (True, <>, <>)) =>
      Put_Line ("Connected ! Session Id is "
                & CC.Session_Id & " Ping time is "
                & C.Ping_Info.Last_Ping_Time'Image);
                --  Woops, I used the original name rather than the matched
                --  name ! The compiler will silently ignore my error.
   when others => null;
end case;

Having to repeat the path is less readable and more error prone. You go through
the trouble of expressing the pattern, just to have to repeat the logic
underneath, effectively writing the paths twice, once in aggregate syntax, the
other in prefix syntax.

Statically ensuring that the accessed information is correct will be much more
work for the compiler.

Not to mention, that would be (yet another) feature that we don't implement
like other languages.

****************************************************************

From: Randy Brukardt
Sent: Monday, January 9, 2017  6:45 PM

(Replying to an old thread that I must have missed back in October:)

...	
>Contrast with only top-level object naming:
>		
>case C is
>   when CC : (Connected, <>, (True, <>, <>)) =>
>      Put_Line ("Connected ! Session Id is "
>                & CC.Session_Id & " Ping time is "
>                & C.Ping_Info.Last_Ping_Time'Image);
>                --  Woops, I used the original name rather than the matched
>                --  name ! The compiler will silently ignore my error.

What error? C and CC are views of the same object, and clearly have the same
value. If there is an error here, it is declaring CC in the first place (see
below).

>		   when others => null;
>		end case;

One would want these shorthands in cases where the name of the original object
is too complex. If, for instance, the original object was a function call with
parameters, then the shorthand makes sense:

case Get_Connection (From => Server) is
 ... -- Rest as above.

But in this case, if you mistyped the identifier, the compiler will give you
an error. So I don't see any real problem with mistakes here.

Keep in mind that every identifier (and every entity for that matter) that one
declares adds to the cognitive load of the reader. You really shouldn't do it
unless it actually helps reading the code. (Exactly where that line is
obviously is a personal choice, but it is far away from renaming a single
character identifier.)

>Having to repeat the path is less readable and more error prone. You go 
>through the trouble of expressing the pattern, just to have to repeat 
>the logic underneath, effectively writing the paths twice, once in 
>aggregate syntax, the other in prefix syntax.

Arguably, that's a good thing.

BTW, Ada doesn't currently allow positional <> aggregate components, and I'd
suggest that we retain that rule here. (Assuming you really want to model these
patterns as aggregates.) [Especially as many style guides ban positional record
aggregates altogether.] Therefore, your example would have to be written
something like:

case Get_Connection (From => Server) is
   when CC : (State => Connected, Server => <>,
              Session_Id => <>,
              Ping_Info => (Has_Ping_Info => True, Last_Ping_Time => <>, Last_Ping_Id => <>)) =>
       Put_Line ("Connected ! Session Id is "
                 & CC.Session_Id & " Ping time is "
                 & CC.Ping_Info.Last_Ping_Time'Image);

So the names you need are already in the source. Declaring more names would
just be more noise.

[Aside: in writing the above, I see that your original example doesn't have
enough components (the Server component seems to have been left out). Which is
why many style guides require component names ... ;-)]

>Statically ensuring that the accessed information is correct will be 
>much more work for the compiler.

There seems to be no need to do that. Again, this is just a different view of
an existing object, we really should not care which of those views is accessed.
		
>Not to mention, that would be (yet another) feature that we don't 
>implement like other languages.

You already know what I think about that: if you want to use some other
language, do that. Don't try to mess up Ada with the exact features of other
languages; whatever we do should fit into the Ada model and not look like it
was stolen from someone else. (That's why some form of case coverage is
mandatory for this feature.)

Tucker's idea seems to fit with the existing syntax of the language, and seems
to be sufficient for the job.

****************************************************************

From: Raphael Amiard
Sent: Sunday, February 12, 2017  6:45 AM

Thank you for your answer Randy, and sorry I took so long to answer ! I have
been swamped with other work at AdaCore, currently depiling my ARG work pile :)

>> case C is
>>    when CC : (Connected, <>, (True, <>, <>)) =>
>>       Put_Line ("Connected ! Session Id is "
>>                 & CC.Session_Id & " Ping time is "
>>                 & C.Ping_Info.Last_Ping_Time'Image);
>>                 --  Woops, I used the original name rather than the matched
>>                 --  name ! The compiler will silently ignore my error.
> What error? C and CC are views of the same object, and clearly have 
> the same value. If there is an error here, it is declaring CC in the 
> first place (see below).

We at least agree on that, in that case :) CC is useless. My example was
probably not such a good one. See below.

> One would want these shorthands in cases where the name of the 
> original object is too complex. If, for instance, the original object 
> was a function call with parameters, then the shorthand makes sense:
>
> case Get_Connection (From => Server) is
>   ... -- Rest as above.
>
> But in this case, if you mistyped the identifier, the compiler will 
> give you an error. So I don't see any real problem with mistakes here.

This is not about mistyping, it is about accessing a component that is
statically valid, but dynamically invalid due to discriminants. Let me
amend a previous example:

C, C2 : Connection_Info

case C is
   when (Connected, <>, <>, (True, <>, <>)) =>
      Print_Ping_Time (C2.Ping_Info.Ping_Time)

Here you're accessing the wrong object altogether (C2). This is valid Ada, so
it is pretty impossible to emit a warning, even though the code is clearly
wrong. Arguably the programmer should have used more descriptive names. He
should also have not made errors. The job of the compiler is to help him, and
that's a great opportunity to do so.

With introducing a binding, both writing the code and checking it is easier:

C, C2 : Connection_Info

case C is
   when (Connected, <>, <>, (True, <>, <Ping_Time>)) =>
      Print_Ping_Time (Ping_Time)


> Keep in mind that every identifier (and every entity for that matter) 
> that one declares adds to the cognitive load of the reader. You really 
> shouldn't do it unless it actually helps reading the code. (Exactly 
> where that line is obviously is a personal choice, but it is far away 
> from renaming a single character identifier.)

Yes, I agree with that line of reasoning. The C/CC example was a straw-man, of
the alternative proposal I don't like. In the example above, I feel like the
"Ping_Time" binding that is introduced, both by its strong locality and
because of the static guarantees associated to it, helps the user understand
the code and make sure it's correct.

>> Having to repeat the path is less readable and more error prone. You go
>> through the trouble of expressing the pattern, just to have to repeat the 
>> logic underneath, effectively writing the paths twice, once in 
>> aggregate syntax, the other in prefix syntax.
> Arguably, that's a good thing.

Let's argue then :) I see no benefit in repeating the path, only potential for
errors, both for the writer and for the reader of the code.

> BTW, Ada doesn't currently allow positional <> aggregate components, 
> and I'd suggest that we retain that rule here. (Assuming you really 
> want to model these patterns as aggregates.) [Especially as many style 
> guides ban positional record aggregates altogether.] Therefore, your 
> example would have to be written something like:
>
> case Get_Connection (From => Server) is
>     when CC : (State => Connected, Server => <>,
>                Session_Id => <>,
>                Ping_Info => (Has_Ping_Info => True, Last_Ping_Time => <>,
>                              Last_Ping_Id => <>)) =>
>         Put_Line ("Connected ! Session Id is "
>                   & CC.Session_Id & " Ping time is "
>                   & CC.Ping_Info.Last_Ping_Time'Image);
>
> So the names you need are already in the source. Declaring more names 
> would just be more noise.

This does not make sense. The introduction of new names is used to introduce
new bindings. If you have a record "Line" with two "Points" components, who
themselves have X and Y components, if you want to match on the 4 subvalues
the components names are not going to be enough.

> [Aside: in writing the above, I see that your original example doesn't have
> enough components (the Server component seems to have been left out). Which
> is why many style guides require component names ... ;-)]

>> Statically ensuring that the accessed information is correct will be much
>> more work for the compiler.
> There seems to be no need to do that. Again, this is just a different view
> of an existing object, we really should not care which of those views is
> accessed.

The point of this feature, in my mind, is that you match the 
discriminants and the components at the same time. So every new binding 
you introduce is statically guaranteed to correspond to something in the 
matched value.

If you combine that with:

1. A style rule (that can easily be statically checked) that it is 
forbidden to access variable components of a record via the regular dot 
notation, eg. you have to use matching.
2. A legality rule that it is forbidden to mutate the object that you're 
matching upon (similar to the rule about renamings of discriminated 
records component if I remember correctly)

Then you get a style of programming where it is possible to statically 
guarantee that the user cannot illegally access a component of a 
discriminated record. This is the situation in languages such as OCaml, 
and it is a highly desirable one IMHO.

I think it is possible to reach this goal without introducing new 
bindings, eg. you need to check the components paths used  inside case 
branches, but the specification and implementation of such a feature 
will be harder as far as I can tell.

>> Not to mention, that would be (yet another) feature that we don't implement
>> like other languages.
> You already know what I think about that: if you want to use some other
> language, do that. Don't try to mess up Ada with the exact features of other
> languages; whatever we do should fit into the Ada model and not look like it
> was stolen from someone else. (That's why some form of case coverage is
> mandatory for this feature.)

Yes, I agree that similarity to other languages is not a strong 
argument. However, there often was a good reason why a feature was 
expressed in a certain way in another language, especially when this 
language is a language where the type safety was given a lot of thought, 
as ML and Haskell are. We should take some time to consider it. Here the 
reason for introducing new bindings is not (solely) expressivity, it's 
safety, a characteristic we care deeply about.

> Tucker's idea seems to fit with the existing syntax of the language, and
> seems to be sufficient for the job.

I strongly disagree with that. Tucker's idea is insufficient to 
guarantee safety, which is one of the key points of this feature, not 
expressivity. I'm waiting for a counter argument :)

****************************************************************

From: Randy Brukardt
Sent: Monday, February 13, 2017  5:10 PM

> Thank you for your answer Randy, and sorry I took so long to answer ! 
> I have been swamped with other work at AdaCore, currently depiling my 
> ARG work pile :)

Real work being more important than ARG fun -- what a concept! :-)
 
...
> This is not about mistyping, it is about accessing a component that is 
> statically valid, but dynamically invalid due to discriminants. Let me 
> amend a previous example:
> 
> C, C2 : Connection_Info
> 
> case C is
>    when (Connected, <>, <>, (True, <>, <>)) =>
>       Print_Ping_Time (C2.Ping_Info.Ping_Time)
> 
> Here you're accessing the wrong object altogether (C2). This is valid 
> Ada, so it is pretty impossible to emit a warning, even though the 
> code is clearly wrong.

Well, it's only clearly wrong if you know the intent; I have parallel objects
like this all the time (often in writing list processing). Which I suppose is
your point.

BTW, you've again ignored the fact that <> can only appear in named notation,
and I think that really does make a difference in these examples. (Not to
mention that your example has seven components when written this way, not
six. :-)

> Arguably the
> programmer should have used more descriptive names. He should also 
> have not made errors. The job of the compiler is to help him, and 
> that's a great opportunity to do so.
> 
> With introducing a binding, both writing the code and checking it is 
> easier:
> 
> C, C2 : Connection_Info
> 
> case C is
>    when (Connected, <>, <>, (True, <>, <Ping_Time>)) =>
>       Print_Ping_Time (Ping_Time)
> 
> 
> > Keep in mind that every identifier (and every entity for that 
> > matter) that one declares adds to the cognitive load of the reader. 
> > You really shouldn't do it unless it actually helps reading the 
> > code. (Exactly where that line is obviously is a personal choice, 
> > but it is far away from renaming a single character identifier.)
> 
> Yes, I agree with that line of reasoning. The C/CC example was a 
> straw-man, of the alternative proposal I don't like. In the example 
> above, I feel like the  "Ping_Time" binding that is introduced, both 
> by it's strong  locality and because of the static guarantees 
> associated to it, helps the user understand the code and make sure 
> it's correct.

What guarantees? It seems to me that you need those guarantees anytime you
have any sort of binding (the form doesn't matter). That is, the Tucker-style
binding needs the same guarantees.

> >> Having to repeat the path is less readable and more error prone. 
> >> You go through the trouble of expressing the pattern, just to have 
> >> to repeat the logic underneath, effectively writing the paths 
> >> twice, once in aggregate syntax, the other in prefix syntax.
> > Arguably, that's a good thing.
> 
> Let's argue then :) I see no benefit in repeating the path, only 
> potential for errors, both for the writer and for the reader of the 
> code.
> 
> > BTW, Ada doesn't currently allow positional <> aggregate components, 
> > and I'd suggest that we retain that rule here. (Assuming you really 
> > want to model these patterns as aggregates.) [Especially as many 
> > style guides ban positional record aggregates altogether.] 
> > Therefore, your example would have to be written something like:
> >
> > case Get_Connection (From => Server) is
> >     when CC : (State => Connected, Server => <>,
> >                Session_Id => <>,
> >                Ping_Info => (Has_Ping_Info => True, Last_Ping_Time => 
> >                              <>, Last_Ping_Id => <>)) =>
> >         Put_Line ("Connected ! Session Id is "
> >                   & CC.Session_Id & " Ping time is "
> >                   & CC.Ping_Info.Last_Ping_Time'Image);
> >
> > So the names you need are already in the source. Declaring more 
> > names would just be more noise.
> 
> This does not make sense. The introduction of new names is used to 
> introduce new bindings. If you have a record "Line"
> with two "Points" 
> components, who themselves have X and Y components, if you want to 
> match on the 4 subvalues the components names are not going to be 
> enough.

What doesn't make sense? You already have to have the component names in the
pattern, adding binding names as well is likely to be confusing rather than
helpful.

Side-comment here: The way you have the binding defined, it doesn't seem
possible to pass a larger part of the matched record to a subprogram.
Consider a modification of the above:

 case Get_Connection (From => Server) is
     when CC : (State => Connected, Server => <>,
                Session_Id => <>,
                Ping_Info => (Has_Ping_Info => True, Last_Ping_Time => 
                              <>, Last_Ping_Id => <>)) =>
         Put_Line ("Connected ! Session Id is "
                   & CC.Session_Id & Display_Ping_Info (CC.Ping_Info));

In this case, we're using an existing routine to generate the details about
the Ping_Information. That's pretty common (after all, one of the likely
reasons for having a subrecord is that it gets independently processed). I
don't see any way of doing this with your binding short of going back and
copying the original selecting information.

...
> The point of this feature, in my mind, is that you match the 
> discriminants and the components at the same time. So every new 
> binding you introduce is statically guaranteed to correspond to 
> something in the matched value.
> 
> If you combine that with:
> 
> 1. A style rule (that can easily be statically checked) that it is 
> forbidden to access variable components of a record via the regular 
> dot notation, eg. you have to use matching.
> 2. A legality rule that it is forbidden to mutate the object that 
> you're matching upon (similar to the rule about renamings of 
> discriminated records component if I remember correctly)

It has to be the latter, especially in your scheme -- it is essentially a
renaming of a discriminant-dependent component, so the same rules have to
apply. (We adopted that rule for iterators, for instance, for similar
reasons.) That means that the selecting expression would have to be "known
to be constrained".

And this is what I was talking about above: this is a property of *any*
binding in such a matching (so long as some discriminant-dependent components
are involved); it doesn't really matter about the syntax involved. If you have
any non-box matching on a discriminant or discriminant-dependent component,
you can't allow the item to be mutable lest the promise implicit in the
declaration be violated.

So the safety issue is the same either way; it doesn't depend on how the
binding(s) are defined.

...
> I think it is possible to reach this goal without introducing new 
> bindings, eg. you need to check the components paths used inside case 
> branches, but the specification and implementation of such a feature 
> will be harder as far as I can tell.

It's easy, I described it above. It's all in terms of existing Ada terminology
("known to be constrained", "discriminant-dependent component", etc.). It
might have to apply to multiple records (which I believe is already the case
for renames), but nothing hard or weird about that.

...
> > Tucker's idea seems to fit with the existing syntax of the language, 
> > and seems to be sufficient for the job.
> 
> I strongly disagree with that. Tucker's idea is insufficient to 
> guarantee safety, which is one of the key points of this feature, not 
> expressivity. I'm waiting for a counter argument :)

Tucker's idea combined with a "known-to-be-constrained" rule works fine to
guarantee safety (as it is the same as an object rename), and indeed that
seems necessary for any sort of binding. So that ends up identical either way.

OTOH, your proposal doesn't seem to allow both partial matching AND direct
access to the enclosing (sub)record that contains that matching. That seems
to be *less* functionality and *more* complexity. Which makes it a no-brainer
to me, YMMV. ;-)

****************************************************************

From: Raphael Amiard
Sent: Tuesday, February 14, 2017  5:10 PM

>> Thank you for your answer Randy, and sorry I took so long to answer ! 
>> I have been swamped with other work at AdaCore, currently depiling my 
>> ARG work pile :)
> Real work being more important than ARG fun -- what a concept! :-)

It's all fun, with varying degrees of "urgent" attached :)

>> case C is
>>     when (Connected, <>, <>, (True, <>, <>)) =>
>>        Print_Ping_Time (C2.Ping_Info.Ping_Time)
>>
>> Here you're accessing the wrong object altogether (C2). This is valid 
>> Ada, so it is pretty impossible to emit a warning, even though the 
>> code is clearly wrong.
> Well, it's only clearly wrong if you know the intent; I have parallel 
> objects like this all the time (often in writing list process). Which 
> I suppose is your point.

It's not only clearly wrong if you know the intent. If your goal is to
disallow access to a discriminant-dependent component whenever you don't
statically know that the access is correct (which is the case above), then
you should not access it, regardless of the intent.

> BTW, you've again ignored the fact that <> can only appear in named 
> notation, and I think that really does make a difference in these examples.
> (Not to mention that your example has seven components when written 
> this way, not six. :-)

Yes, sorry about that, I did not completely ignore it, I think I altered some
of them, and not all, very sloppy of me...

>> Yes, I agree with that line of reasoning. The C/CC example
>> was a straw-man, of the alternative proposal I don't like. In
>> the example above, I feel like the  "Ping_Time" binding that
>> is introduced, both by it's strong  locality and because of
>> the static guarantees associated to it, helps the user
>> understand the code and make sure it's correct.
> What guarantees?

The ones outlined above: You can statically check that a component exists 
before accessing it.

> What doesn't make sense? You already have to have the component names in the
> pattern, adding binding names as well is likely be confusing rather than
> helpful.

In that case, we're talking about a point record with no discriminant, so we 
don't care about safety, so it's completely a style issue, which is by essence 
subjective. I can't find a concrete example to discuss so let's agree this is 
not a case that is interesting for this discussion.

> Side-comment here: The way you have the binding defined, it doesn't seem
> possible to pass a larger part of the matched record to a subprogram.
> Consider a modification of the above:
>
>   case Get_Connection (From => Server) is
>       when CC : (State => Connected, Server => <>,
>                  Session_Id => <>,
>                  Ping_Info => (Has_Ping_Info => True, Last_Ping_Time =>
>                                <>, Last_Ping_Id => <>)) =>
>           Put_Line ("Connected ! Session Id is "
>                     & CC.Session_Id & Display_Ping_Info (CC.Ping_Info));
>
> In this case, we're using an existing routine to generate the details about
> the Ping_Information. That's pretty common (after all, one of the likely
> reasons for having a subrecord is that it gets independently processed). I
> don't see any way of doing this with your binding short of going back and
> copying the original selecting information.

To be clear, I'm not arguing that top level binding is useless, in fact many 
languages with pattern matching do propose it. I'm arguing that it is not a 
substitute for sub components bindings, for the reasons outlined before.

This example of yours, while arguably expressive, also shows why it would be 
hard to guarantee the property I have outlined above - no access to fields if 
you can't guarantee their legality statically. You would have to keep a shape of 
the whole data structure, with known and unknown discriminants, possibly across 
indirectly nested case statements. This is flow analysis at this stage, and 
probably something you don't want to make mandatory at the language level.

>> If you combine that with:
>>
>> 1. A style rule (that can easily be statically checked) that it is
>> forbidden to access variable components of a record via the
>> regular dot
>> notation, eg. you have to use matching.
>> 2. A legality rule that it is forbidden to mutate the object
>> that you're
>> matching upon (similar to the rule about renamings of discriminated
>> records component if I remember correctly)
> It has to be the latter, especially in your scheme -- it is essentially a
> renaming of a discriminant-dependent component, so the same rules have to
> apply. (We adopted that rule for iterators, for instance, for similar
> reasons.) That means that the selecting expression would have to be "known
> to be constrained".

My list is inclusive, not exclusive. Of course 2. has to be guaranteed with 
Tuck's proposal and with mine. However it is not the main point. The main point 
is 1., because this is what will allow us to enforce the invariant that no 
component of a discriminated record is accessed if we don't know statically that 
it is correct.

With the rule enforced, the code at the beginning:

C, C2 : Connection_Info

case C is
    when (Connected, <>, <>, (True, <>, <>)) =>
       Print_Ping_Time (C2.Ping_Info.Ping_Time)


Would be illegal because C2.Ping_Info.Ping_Time would fall under this rule.

If this was really the intent of your code, you'd have to write:

C, C2 : Connection_Info

case C is
    when (Connected, <>, <>, (True, <>, <>)) =>
       Print_Ping_Time
         (case C2 is
            when (Connected, <>, <>, (True, <>, <PT>)) => PT,
            when others => No_Ping_Time)


It is more verbose, which is in this case a good thing ! You're ensuring that 
the programmer handles the error case explicitly.

> So the safety issue is the same either way; it doesn't depend on how the
> binding(s) are defined.

Only because we're not talking about the same safety issue.

>> I strongly disagree with that. Tucker's idea is insufficient to
>> guarantee safety, which is one of the key points of this feature, not
>> expressivity. I'm waiting for a counter argument :)
> Tucker's idea combined with a "known-to-be-constrained" rule works fine to
> guarantee safety (as it is the same as an object rename), and indeed that
> seems necessary for any sort of binding. So that ends up identical either
> way.

It does not, as explained above, guarantee safety of an accessed component of a 
discriminated record if that component depends on the discriminant.

> OTOH, your proposal doesn't seem to allow both partial matching AND direct
> access to the enclosing (sub)record that contains that matching.

It does. You just have to put your result in a declare block. You're not usually 
one to argue that this added verbosity is actually a big deal I think !

declare
   CC : Connection_Info := Get_Connection (From => Server);
begin
  case CC is
      when (State => Connected, Server => <>,
            Session_Id => <>,
            Ping_Info => (Has_Ping_Info => True, Last_Ping_Time =>
                          <>, Last_Ping_Id => <>)) =>
          Put_Line ("Connected ! Session Id is "
                    & CC.Session_Id & Display_Ping_Info (CC.Ping_Info));

****************************************************************

From: Randy Brukardt
Sent: Tuesday, February 14, 2017  4:16 PM

> To be clear, I'm not arguing that top level binding is useless, in fact many 
> languages with pattern matching do propose it. I'm arguing that it is not a 
> substitute for sub components bindings, for the reasons outlined before.

Well, you have to be careful about making a proposal too complex. I've learned
through much bitter experience that if you come up with a fully worked out
proposal with all of the bells and whistles, you're most likely to end up with
nothing. It probably would have been better to spring this component matching
proposal when the rest of this idea is nearly finished... :-)

> This example of yours, while arguably expressive, also shows why it would be 
> hard to guarantee the property I have outlined above - no access to fields if 
> you can't guarantee their legality statically. You would have to keep a shape of 
> the whole data structure, with known and unknown discriminants, possibly across 
> indirectly nested case statements. This is flow analysis at this stage, and 
> probably something you don't want to make mandatory at the language level.

Within one of your case statements (or, for that matter, in the scope of a
renames of the component), the Legality Rule already guarantees the property
you want. Indeed, because of the renames solution, there is a way to already
guarantee the property in Ada today (with an appropriate checking tool, of
course; sounds like something AdaControl could do).

That is, one could insist that all discriminant dependent components are bound
with renames before use:
     Ping_Time : ... renames Get_Connection (From => Server).Ping_Info.Last_Ping_Id;
Combined with appropriate "if"s/assertions, you can be guaranteed that the
component exists and is safe. (Indeed, with a proper tool, you really shouldn't
need to do anything, as a tool can relatively easily prove this property if it is
provable at all.)
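
A minimal sketch of that renames-based style in current Ada follows. The
Ping_Information type and the Get_Ping_Info function are hypothetical
stand-ins (the thread never shows the real declarations), and the result is
held in a constant so that the object is known to be constrained and the
renaming of its discriminant-dependent component is legal:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Renames_Demo is

   --  Hypothetical declarations, assumed for illustration only.
   type Ping_Information (Has_Ping_Info : Boolean := False) is record
      case Has_Ping_Info is
         when True =>
            Last_Ping_Time : Natural;
            Last_Ping_Id   : Natural;
         when False =>
            null;
      end case;
   end record;

   function Get_Ping_Info return Ping_Information is
     ((Has_Ping_Info => True, Last_Ping_Time => 42, Last_Ping_Id => 7));

   --  A stand-alone constant cannot change its discriminant, so a
   --  renaming of one of its variant components stays valid.
   Info : constant Ping_Information := Get_Ping_Info;

begin
   if Info.Has_Ping_Info then
      declare
         --  The binding is introduced only after the discriminant test.
         Ping_Time : Natural renames Info.Last_Ping_Time;
      begin
         Put_Line ("Ping time is" & Natural'Image (Ping_Time));
      end;
   else
      Put_Line ("No ping information");
   end if;
end Renames_Demo;
```

Nothing in the language forces the "if" around the renaming; that part is
what a checking tool or style rule would have to enforce.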

...
> >> 1. A style rule (that can easily be statically checked) that it is 
> >> forbidden to access variable components of a record via the regular 
> >> dot notation, eg. you have to use matching.
...
> The main point is 1., because this is what will allow to enforce the 
> invariant that no component of a discriminated record is accessed if 
> we don't know statically that it is correct.

At least in my code, it is common to have a subprogram that works on a single
variant. For instance, the routine I was working on yesterday (slightly
modernized):

    procedure Lookup_Allocator (Expr : in Node_Ptr)
       with Pre => Expr.Kind = Allocator;

    procedure Lookup_Allocator (Expr : in Node_Ptr) is
    begin
        Lookup_Expr (Expr.Allocator_Type); -- The component is discriminant-dependent.
        ...
    end Lookup_Allocator;

With your rule, you'd have to wrap this entire body in one of your case
statements, and presumably have an "others" clause with an Internal_Error
call. But that completely defeats the purpose of the precondition (violating
our style rule: "never repeat the precondition in the body"), and would add a
lot of extra verbiage to the code.

So that seems like a very silly rule to have in general. I could see having it
in code that is inside of a case statement, but that seems too limited to be
of much use. And clearly, any barely competent tool could prove that the
component use is safe (at least in the absence of some other task causing
mischief).
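
For reference, a self-contained sketch of this precondition style is given
below. The Node type, its components and the values used are assumptions
made up for the example; only the shape of Lookup_Allocator follows the
fragment above:

```ada
pragma Assertion_Policy (Check);

with Ada.Text_IO; use Ada.Text_IO;

procedure Precondition_Demo is

   --  Hypothetical node type with a variant part.
   type Node_Kind is (Allocator, Literal);

   type Node (Kind : Node_Kind) is record
      case Kind is
         when Allocator =>
            Allocator_Type : Natural;
         when Literal =>
            Value : Integer;
      end case;
   end record;

   type Node_Ptr is access all Node;

   procedure Lookup_Allocator (Expr : in Node_Ptr)
     with Pre => Expr.Kind = Allocator;

   procedure Lookup_Allocator (Expr : in Node_Ptr) is
   begin
      --  The component is discriminant-dependent; the precondition is
      --  what makes this access safe, with no case statement needed.
      Put_Line ("Allocator type" & Natural'Image (Expr.Allocator_Type));
   end Lookup_Allocator;

   A : constant Node_Ptr :=
     new Node'(Kind => Allocator, Allocator_Type => 3);

begin
   Lookup_Allocator (A);
end Precondition_Demo;
```

Calling Lookup_Allocator with a Literal node would fail the precondition
check at the call rather than the discriminant check inside the body.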

> With the rule enforced, the code at the beginning:
> 
> C, C2 : Connection_Info
> 
> case C is
>     when (Connected, <>, <>, (True, <>, <>)) =>
>        Print_Ping_Time (C2.Ping_Info.Ping_Time)
> 
> 
> Would be illegal because C2.Ping_Info.Ping_Time would fall under this 
> rule.
> 
> If this was really the intent of your code, you'd have to write:
> 
> C, C2 : Connection_Info
> 
> case C is
>     when (Connected, <>, <>, (True, <>, <>)) =>
>        Print_Ping_Time
>          (case C2 is
>             when (Connected, <>, <>, (True, <>, <PT>)) => PT,
>             when others => No_Ping_Time)
> 
> 
> It is more verbose, which is in this case a good thing ! 
> You're ensuring that the programmer handles the error case explicitly.

Seriously, this looks like madness to me. No sane programmer is ever going
to write the second just to meet some style rule. (Especially if they have to
put all of the component names into the patterns!)

...
> It does not, as explained above, guarantee safety of an accessed 
> component of a discriminated record if that component depends on the 
> discriminant.

That's not a worthwhile goal if it requires writing gallons of unnecessary
code, especially in the precondition/predicate/assertion cases.

> > OTOH, your proposal doesn't seem to allow both partial matching AND direct
> > access to the enclosing (sub)record that contains that matching.
> 
> It does. You just have to put your result in a declare block. You're not
> usually one to argue that this added verbosity is actually a big deal I
> think !
> 
> declare
>    CC : Connection_Info := Get_Connection (From => Server);
> begin
>   case CC is
>       when (State => Connected, Server => <>,
>             Session_Id => <>,
>             Ping_Info => (Has_Ping_Info => True, Last_Ping_Time =>
>                           <>, Last_Ping_Id => <>)) =>
>           Put_Line ("Connected ! Session Id is "
>                     & CC.Session_Id & Display_Ping_Info (CC.Ping_Info));

If that's acceptable, then you don't need any binding mechanism and the
complications that it brings.

Besides, if you're really willing to write a lot of code, you don't need this
feature at all, so that simplifies it down to nothing -- the ultimate simple
solution. ;-)

****************************************************************

From: Raphael Amiard
Sent: Wednesday, June 14, 2017  8:38 AM

> Well, you have to be careful about making a proposal too complex. I've 
> learned through much bitter experience that if you come up with a 
> fully worked out proposal with all of the bells and whistles, you're 
> most likely to end up with nothing. It probably would have been better 
> to spring this component matching proposal when the rest of this idea 
> is nearly finished... :-)

I think the component matching is integral to the feature actually. Something
that only allows you to match literals would be crippled, both in terms of
expressivity and in terms of potential safety. It would still be a big
improvement on the status quo though, so I guess we can discuss this live in
Vienna !

> > This example of yours, while arguably expressive, also shows why it 
> > would be hard to guarantee the property I have outlined above - no 
> > access to fields if you can't guarantee their legality statically. 
> > You would have to keep a shape of
> > the whole data structure, with known and unknown discriminants, 
> > possibly across indirectly nested case statements. This is flow 
> > analysis at this stage, and probably something you don't want to make
> > mandatory at the language level.
>
> Within one of your case statements (or, for that matter, in the 
> scope of a renames of the component), the Legality Rule already 
> guarantees the property you want. Indeed, because of the renames 
> solution, there is a way to already guarantee the property in Ada 
> today (with an appropriate checking tool, of course; sounds like something
> AdaControl could do).
>
> That is, one could insist that all discriminant dependent components 
> are bound with renames before use:
>      Ping_Time : ... renames
>        Get_Connection (From => Server).Ping_Info.Last_Ping_Id;
> Combined with appropriate "if"s/assertions, you can be guaranteed that 
> the component exists and is safe. (Indeed, with a proper tool, you 
> really shouldn't need to do anything, as a tool can relatively easily 
> prove this property if it is provable at all.)

I'm not sure I understand this point ! We'll have to discuss this live.

> At least in my code, it is common to have a subprogram that works on a 
> single variant. For instance, the routine I was working on yesterday 
> (slightly modernized):
>
>     procedure Lookup_Allocator (Expr : in Node_Ptr)
>        with Pre => Expr.Kind = Allocator;
>
>     procedure Lookup_Allocator (Expr : in Node_Ptr) is
>     begin
>         Lookup_Expr (Expr.Allocator_Type); -- The component is discriminant-dependent.
>         ...
>     end Lookup_Allocator;
>
> With your rule, you'd have to wrap this entire body in one of your 
> case statements, and presumably have an "others" clause with an 
> Internal_Error call. But that completely defeats the purpose of the 
> precondition (violating our style rule: "never repeat the precondition 
> in the body"), and would add a lot of extra verbiage to the code.
>
> So that seems like a very silly rule to have in general. I could see 
> having it in code that is inside of a case statement, but that seems 
> too limited to be of much use. And clearly, any barely competent tool 
> could prove that the component use is safe (at least in the absence of 
> some other task causing mischief).

I understand what you mean. I think it is also an issue of code style, and as
with error handling in general, there is no unique good solution. However:

1. I do not think everybody would need to use that rule, or for that matter,
   such a simple rule. Its use is a trade-off between simplicity and safety,
   one that I would personally choose.

2. If you have a tool that does basic intra-procedural analysis, such as what
   you seem to be advocating, you could make the rule more powerful, by saying
   it only forces you to check the discriminant if it is not known at this
   point in the control flow, making the code above OK!

In that case, the fact of being able to bind sub-components in the matchers is
just a very DRY convenient way to get at sub-components.

> Seriously, this looks like madness to me. No sane programmer is ever 
> going to write the second just to meet some style rule. (Especially if 
> they have to put all of the component names into the patterns!)

First, I don't agree that this second part should be enforced. Second, your
perspective on verbosity seems double-standard'ish to me:

- You're fine with repeating sub-components name completely, even though it
  brings no benefits to the user and makes checking safety harder.

- You think this is unacceptable verbosity even though it brings substantial
  benefits. This isn't just "some style rule", it is a (certainly restrictive)
  rule that allows the programmer to be sure that it will eliminate a certain
  class of errors completely.

> ...
> > It does not, as explained above, guarantee safety of an accessed 
> > component of a discriminated record if that component depends on the 
> > discriminant.
>
> That's not a worthwhile goal

I strongly disagree with this unsubstantiated claim. I think it is a very
worthwhile goal.

> if it requires writing gallons of unnecessary
> code, especially in the precondition/predicate/assertion cases.

As explained above, we can imagine smarter rules/tools if you use a style of 
code where you know you pass around objects with an already known discriminant. 
Or you can still choose not to use the rule at all.

> Besides, if you're really willing to write a lot of code, you don't need this
> feature at all, so that simplifies it down to nothing -- the ultimate simple
> solution. ;-)

I don't see how that is true; at a minimum you still need some flow-sensitive 
checking tool to guarantee access to fields that depend on a discriminant.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, June 14, 2017  2:42 PM

>> Besides, if you're really willing to write a lot of code, you don't need this
>> feature at all, so that simplifies it down to nothing -- the ultimate simple
>> solution. ;-)
>
>I don't see how that is true; at a minimum you still need some flow-sensitive
>checking tool to guarantee access to fields that depend on a discriminant.

You need that in any case, and any ASIS-based tool has enough information to
make the check. So, for that matter, does any competent Ada optimizer. (This
would make a possible Code Quality Warning in Janus/Ada, see the most recent
blog entry on RRSoftware.Com - http://www.rrsoftware.com/html/blog/quality.html
- for the basic idea.) No extra syntax needed.

****************************************************************

!topic Renaming in class membership test
!reference Ada 2012 RM4.4(3/4)
!from Niklas Holsti 18-01-15
!keywords membership renaming
!discussion

(This suggestion for some Ada extensions is taken from a discussion on 
comp.lang.ada, started on 2018-01-04 by Dmitry A. Kazakov within a 
thread with the irrelevant Subject "Re: stopping a loop iteration 
without exiting it".)

It is sometimes necessary to supplement dynamic dispatching by manually
coded case analysis, using membership tests in which the
tested_simple_expression has a class-wide type and the membership_choice 
is a descendant class-wide subtype, for example as follows, where X is a 
class-wide expression:

   if X in T'Class then ...

If the test returns True, the following actions usually need to access 
the tested_simple_expression (X) as an object of the membership_choice 
descendant type (T'Class). This leads to the following clumsy 
construction, which requires writing the descendant type identifier 
three times, and introducing a new indentation level:

   if X in T'Class then
      declare
         Same_X : T'Class renames T'Class (X);
      begin
         ... use Same_X as an object of T'Class
      end;
   end if;

Both Dmitry and I have been bothered by this feature of class-based case 
analysis.
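
For concreteness, a complete sketch of the current idiom is shown below. The
types Root and T, the Size component and the procedure names are
hypothetical, chosen only to make the fragment above compilable:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Membership_Demo is

   --  Hypothetical type hierarchy, assumed for illustration only.
   package Shapes is
      type Root is tagged null record;
      type T is new Root with record
         Size : Natural := 0;
      end record;
   end Shapes;
   use Shapes;

   procedure Describe (X : Root'Class) is
   begin
      if X in T'Class then
         declare
            --  View conversion to the descendant class-wide type; this
            --  is the third mention of T'Class in the construct.
            Same_X : T'Class renames T'Class (X);
         begin
            Put_Line ("T'Class object, size" & Natural'Image (Same_X.Size));
         end;
      else
         Put_Line ("Some other Root'Class object");
      end if;
   end Describe;

   Obj_T : constant T    := (Size => 5);
   Obj_R : constant Root := (null record);

begin
   Describe (Obj_T);
   Describe (Obj_R);
end Membership_Demo;
```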

It is suggested to allow a combination of the membership test and the 
declaration of the renaming (Same_X), as in:

    if X is Same_X : T'Class then
       ... Here Same_X is a renaming of T'Class (X).
    end if;

This form uses a different keyword ("is", not "in") to separate it from 
the normal membership test.

Clearly, this form of membership test cannot have more than one 
membership_choice (that is, it cannot be "if X is T'Class | S'Class then 
...") and it cannot be a negative test (it cannot be "is not").

Earlier in the same discussion thread, a similarly extended "case" 
statement for a class-wide selecting_expression was suggested, as in:

    case X is  -- or "case X'Tag"
       when Some_T : T'Class =>
          ... Here Some_T is a renaming of T'Class(X).
       when Some_S : S =>
          ... Here Some_S is a renaming of S(X).
       when others => ...
    end case;

where the legality conditions would require that no two "when" clauses 
have overlapping classes (that is, both "whens" cannot be True for the 
same X) and that an "others" clause always be present. However, this 
form could be problematic in a generic context, where the 
non-overlapping requirement of formal generic types (T, S) might not be 
easy to check at compile-time.

A further extension to the above would let the selecting_expression (X) 
be an access-to-class-wide, instead of a class-wide, with implicit 
dereferencing for the renamings (Some_T would be a renaming of 
T'Class(X.all)), and would then permit a "when null => ..." to handle 
the case X = null.

Continuing with further variations, access types in general can 
sometimes lead to similar clumsy renamings, as in this example from 
Dmitry, where P is access X_Type:

    if P /= null then
       declare
          X : X_Type renames P.all;
       begin
          ...
       end;
    end if;

Here, again, an extension might allow a "case" statement with the access 
value P as the selecting_expression, although it can access only a 
single type, and a renaming combined with the "when":

    case P is
       when X : X_Type =>
          ... Here X is a renaming of P.all.
       when null =>
          ...
    end case;

Finally, a similar extension was suggested for the normal "case" 
statement, with a discrete selecting_expression. Here the extension is 
not needed to avoid a renaming declaration, but could help readability. 
For example, from Dmitry, the following code:

    declare
       Symbol : constant Character := Get_Character;
    begin
       case Symbol is
          when '0'..'9' =>
             -- Process digit
          when 'A'..'Z' | 'a'..'z' =>
             -- Process letter

could be replaced by this, somewhat simpler code:

    case Get_Character is
       when Digit : '0'..'9' =>
          -- Process the digit Digit.
       when Letter : 'A'..'Z' | 'a'..'z' =>
          -- Process the letter Letter.

As observed in the comp.lang.ada thread, these suggestions have the 
common flavour of introducing a kind of "pattern matching" syntax into 
Ada control flow, but a very simple one (the pattern defines a single 
new name).

****************************************************************

From: Randy Brukardt
Sent: Saturday, January 20, 2018  8:15 PM

> As observed in the comp.lang.ada thread, these suggestions have the 
> common flavour of introducing a kind of "pattern matching" syntax into 
> Ada control flow, but a very simple one (the pattern defines a single 
> new name).

There already is such a "pattern matching" proposal in the hopper; the current
plan is to split it from AI12-0214-1 where it currently lives.

I first became concerned about the number of gee-gaws proposed for Ada 2020
because of this pattern matching proposal, so that should make it fairly
obvious where I stand on this one. ;-)

...
> If the test returns True, the following actions usually need to access 
> the tested_simple_expression (X) as an object of the membership_choice 
> descendant type (T'Class). This leads to the following clumsy 
> construction, which requires writing the descendant type identifier 
> three times, and introducing a new indentation level:
> 
>    if X in T'Class then
>       declare
>          Same_X : T'Class renames T'Class (X);
>       begin
>          ... use Same_X as an object of T'Class
>       end;
>    end if;

I've written a lot of such code (especially in the Claw Builder), and I never
once used this construction. In the Claw Builder, X typically is a dereference
of an access to a Root_Window, and the test is needed to call some operation
only defined for a child hierarchy (for instance, for controls).

In such cases, the dependent code is almost always a single (dispatching?)
call, and since there is only one use of the name, it is better to just use
the type conversion directly on the call parameter, rather than to introduce
an extra name. Even if there are several uses, it is often the case that the
conversions are hardly any longer than the renaming, so the simpler code is
preferred.

In general, I think it is a bad idea to rename objects, as that requires
introducing an additional identifier to the program, increasing the number
that a reader must understand. Renaming is mainly a construction that helps
the writer, not the reader. I think it is best reserved for the rare cases
where the entity has to be evaluated once rather than multiple times. (It also
can make the code slower, by creating an intermediate storage location with
associated memory write(s), rather than just evaluating into registers.)

Aliasing (having multiple names for the same thing) makes code harder to
understand both for the compiler and for human readers. There's a reason that
it is better to pass in all of the objects needed for a given subprogram even
if they are visible elsewhere (so the reader can consider the subprogram as a
single unit without considering any possible aliasing).

...
> Continuing with further variations, access types in general can 
> sometimes lead to similar clumsy renamings, as in this example from 
> Dmitry, where P is access X_Type:
> 
>     if P /= null then
>        declare
>           X : X_Type renames P.all;
>        begin
>           ...
>        end;

This is even worse. You've introduced an entire block and an extra name to
save 4 characters on each use! Any reasonable compiler will eliminate any 
redundant checks, so this construction buys nothing except a bunch of extra
lines in the code (and an obvious reduction in readability).

Now, I realize I am a charter member in the "write it all out explicitly"
club (I doubt that many other Ada programmers write out
"Ada.Strings.Unbounded.To_Unbounded_String" as often as I do), but Ada code
is (or should be) primarily about making the result easy to read and
understand. Reasonable people can disagree about how long is too long,
but the only time constructions like the above make sense is when the new name
is substantially shorter (remaining understandable) than the original name.
And in such cases, the length of the construction isn't particularly relevant
(since it is a lot less than the names in question). Shorthands here just make
it easier for the writer rather than helping the reader.

***************************************************************
