CVS difference for ais/ai-00285.txt

Differences between 1.31 and version 1.32

--- ais/ai-00285.txt	2005/10/31 05:18:22	1.31
+++ ais/ai-00285.txt	2005/11/16 06:51:05	1.32
@@ -6280,3 +6280,669 @@
 
 ****************************************************************
 
+From: Robert Dewar
+Sent: Sunday, January 23, 2005 12:16 PM
+
+I know this is not strictly part of the standard, but I think it is useful
+if the ARG deals with the issue of how to extend the semi-standard (used in
+ACATS) brackets notation for wide characters (this is also a way of
+canonicalizing source programs into ASCII 7-bit graphic characters and standard
+line terminator characters).
+
+I would recommend we extend brackets notation to allow six digits as in
+
+  ["010248"] representing the character DESERET SMALL LETTER LONG I.
+
+Second, I would forbid the use of brackets notation for wide characters in
+comments. Why? because it is a real pain to have to scan for brackets
+notation just in case PARAGRAPH SEPARATOR is used to end a comment and
+brackets notation is used for this:
+
+    -- comment ended by paragraph separator ["2029"] if True then
+       null;
+    end if;
+
+Should the IF be recognized here? I recommend no. If you say yes, then you
+create incompatibilities, since the following should be perfectly legal
+in an Ada 95 program:
+
+    -- Note: do not use the sequence ["2029"] in comments, since it might
+    -- be interpreted as an end of line sequence if the ARG makes the
+    -- wrong decision about how to handle this case.
+
+but it now becomes illegal. I noticed this because I actually implemented
+allowing the brackets notation in comments, and then the bootstrap failed.
+Why? because in code that has compiled for years, we had:
+
+    --  All other codes, are output as described for Write_Char_Code. For
+    --  example, the string created by folding "A" & ASCII.LF & "Hello" will
+    --  print as "A["0A"]Hello". A No_String value prints simply as "no string"
+    --  without surrounding quote marks.
+
+and now the ["0A"] was recognized as a valid line terminator written in
+brackets notation.
+
+Adopting this rule (brackets notation not allowed in comments) means that there
+is a slight glitch in canonicalizing programs using PARAGRAPH SEPARATOR or
+LINE SEPARATOR to end a comment into 7-bit ASCII, but this seems very minor.
+You can always just use a normal LF character, and presumably in real life,
+UTF-8 will be supported anyway.
+
+By the way, I think a secondary standard describing a canonical implementation
+of UTF-8 and brackets notation would be a good thing!
+
+I realize this is strictly outside the scope of the standard, but the standard
+is about portability, and the ARG might as well help to make sure that all
+Ada compilers and the ACATS tests have a common view here.
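+
+A minimal sketch of decoding the notation discussed above, assuming that
+the two-, four- and six-digit forms are all accepted and that hex digits
+may be in either case (both are assumptions). The names Brackets_Demo and
+Decode are hypothetical and are not taken from any compiler.
+
+```ada
+with Ada.Text_IO; use Ada.Text_IO;
+
+procedure Brackets_Demo is
+
+   Value : Natural;
+   Stop  : Natural;
+   Found : Boolean;
+
+   --  Try to decode a brackets sequence starting at S (Pos). On success,
+   --  OK is True, Code is the character code and Last is the index of the
+   --  closing ']'. On failure, OK is False and Code and Last are 0.
+   procedure Decode
+     (S    : String;
+      Pos  : Positive;
+      Code : out Natural;
+      Last : out Natural;
+      OK   : out Boolean)
+   is
+      P : Natural;
+      D : Natural;
+      N : Natural;
+   begin
+      Code := 0;
+      Last := 0;
+      OK   := False;
+
+      --  The sequence must open with [ followed by a double quote
+      if Pos + 1 > S'Last or else S (Pos .. Pos + 1) /= "[""" then
+         return;
+      end if;
+
+      --  Accumulate at most six hex digits
+      P := Pos + 2;
+      while P <= S'Last and then P - Pos - 2 < 6 loop
+         case S (P) is
+            when '0' .. '9' =>
+               D := Character'Pos (S (P)) - Character'Pos ('0');
+            when 'A' .. 'F' =>
+               D := Character'Pos (S (P)) - Character'Pos ('A') + 10;
+            when 'a' .. 'f' =>
+               D := Character'Pos (S (P)) - Character'Pos ('a') + 10;
+            when others =>
+               exit;
+         end case;
+         Code := Code * 16 + D;
+         P := P + 1;
+      end loop;
+
+      --  Require 2, 4 or 6 digits followed by a double quote and ]
+      N := P - (Pos + 2);
+      if (N = 2 or else N = 4 or else N = 6)
+        and then P + 1 <= S'Last
+        and then S (P .. P + 1) = """]"
+      then
+         Last := P + 1;
+         OK   := True;
+      else
+         Code := 0;
+      end if;
+   end Decode;
+
+begin
+   Decode ("[""010248""]", 1, Value, Stop, Found);
+   if Found then
+      Put_Line ("code =" & Natural'Image (Value)
+                & ", last index =" & Natural'Image (Stop));
+   end if;
+end Brackets_Demo;
+```
+
+Whether a scanner calls something like Decode while it is inside a comment
+is exactly the question raised in this message; the sketch takes no
+position on it.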
+
+****************************************************************
+
+From: Pascal Leroy
+Sent: Monday, January 24, 2005  5:08 AM
+
+> I know this is not strictly part of the standard, but I think
+> it is useful if the ARG deals with the issue of how to extend
+> the semi-standard (used in
+> ACATS) brackets notation for wide characters (this is also a
+> way of canonicalizing source programs into ASCII 7-bit
+> graphic characters and standard line terminator characters).
+
+The bracket notation is not very important for us, since we only use it in
+the context of the ACATS, and our compiler doesn't support it.  Instead we
+run a simple preprocessor that detects it and produces (at the moment) the
+equivalent Latin-1 character.  Our plan is to support input files written
+in UTF-8, so I suppose that we'll need to upgrade the preprocessor to
+transform the bracket notation into a bona fide UTF-8 file.
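+
+The encoding half of such a conversion might look like the sketch below.
+It assumes the code has already been extracted from the brackets sequence;
+UTF8_Demo, To_UTF8 and Put_Hex are hypothetical names, and surrogate codes
+are not rejected here.
+
+```ada
+with Ada.Text_IO; use Ada.Text_IO;
+
+procedure UTF8_Demo is
+
+   subtype Code_Point is Natural range 0 .. 16#10_FFFF#;
+
+   Hex : constant String := "0123456789ABCDEF";
+
+   --  Encode Code as UTF-8; each Character of the result holds one byte
+   function To_UTF8 (Code : Code_Point) return String is
+   begin
+      if Code < 16#80# then
+         return (1 => Character'Val (Code));
+      elsif Code < 16#800# then
+         return (Character'Val (16#C0# + Code / 16#40#),
+                 Character'Val (16#80# + Code mod 16#40#));
+      elsif Code < 16#1_0000# then
+         return (Character'Val (16#E0# + Code / 16#1000#),
+                 Character'Val (16#80# + (Code / 16#40#) mod 16#40#),
+                 Character'Val (16#80# + Code mod 16#40#));
+      else
+         return (Character'Val (16#F0# + Code / 16#4_0000#),
+                 Character'Val (16#80# + (Code / 16#1000#) mod 16#40#),
+                 Character'Val (16#80# + (Code / 16#40#) mod 16#40#),
+                 Character'Val (16#80# + Code mod 16#40#));
+      end if;
+   end To_UTF8;
+
+   --  Print each byte of S as two hex digits
+   procedure Put_Hex (S : String) is
+   begin
+      for I in S'Range loop
+         Put (Hex (Character'Pos (S (I)) / 16 + 1));
+         Put (Hex (Character'Pos (S (I)) mod 16 + 1));
+         Put (' ');
+      end loop;
+      New_Line;
+   end Put_Hex;
+
+begin
+   Put_Hex (To_UTF8 (16#2029#));  --  PARAGRAPH SEPARATOR: E2 80 A9
+end UTF8_Demo;
+```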
+
+> I would recommend we extend brackets notation to allow six
+> digits as in
+>
+>   ["010248"] representing the character DESERET SMALL LETTER LONG I.
+
+Fine with me.  Note however that the restriction to 6 digits makes it
+impossible to check that characters beyond group 00 of 10646 are not
+allowed in identifiers but allowed in comments.  So it might make sense to
+go to 8 digits.
+
+> Second, I would forbid the use of brackets notation for wide
+> characters in comments.
+
+Since we use a very stupid preprocessor, and we'd rather not invest in
+teaching it about comments (and string literals), I would prefer to allow
+the use of the bracket notation in comments.  But if worse comes to worst
+we'll just edit by hand the 2 or 3 ACATS tests where this is causing
+trouble.  No big deal.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Monday, January 24, 2005  7:41 AM
+
+> Fine with me.  Note however that the restriction to 6 digits makes it
+> impossible to check that characters beyond group 00 of 10646 are not
+> allowed in identifiers but allowed in comments.  So it might make sense to
+> go to 8 digits.
+
+Note that if you follow my recommendations for comments, this is not needed,
+since comments would not be scanned for brackets.
+
+>>Second, I would forbid the use of brackets notation for wide
+>>characters in comments.
+
+> Since we use a very stupid preprocessor, and we'd rather not invest in
+> teaching it about comments (and string literals),
+
+Are you saying that out of range characters are permitted in string literals?
+I do not think this is the case. If I am wrong, please point me to where in
+the AI it says this.
+
+Fixing your preprocessor to notice comments is, I would estimate after doing
+the implementation work, 0.1% of the effort to support wide wide character.
+
+> I would prefer to allow
+> the use of the bracket notation in comments.  But if worse comes to worst
+> we'll just edit by hand the 2 or 3 ACATS tests where this is causing
+> trouble.  No big deal.
+
+Well, you missed my legality question. With your stupid preprocessor, the
+following valid Ada 95 program becomes illegal (in fact it seems likely
+that your Ada 95 compiler with the preprocessor will reject this).
+
+procedure Nasty_ACATS_Test is
+    X : Wide_Character := ["2029"];
+begin
+    null;
+    --  The ARG messed up, this program is legal in Ada 95
+    --  but illegal in Ada 2005 with a stupid brackets
+    --  processor in action, because when we use the
+    --  sequence ["2029"] in a comment, it ends the line
+    --  and "in a comment, it ends the line" is an invalid
+    --  statement.
+end;
+
+You need to answer the question of whether this program is or is
+not legal in Ada 2005. I really think you want to say it is legal.
+
+In which case I think we should just say that brackets notation
+cannot be used in comments. You then have the choice of either
+fixing your brackets processor or editing the above new test
+in the ACATS suite by hand.
+
+But you say: "I would prefer to allow the use of the bracket
+notation in comments". This means you want the above program
+to be illegal.
+
+Either way, the above program should go into the ACATS suite,
+with a decision as to whether it is legal or not.
+
+Note that allowing brackets in comments also slows down the scanner
+by a noticeable amount, since scanning comments becomes about twice
+as hard.
+
+****************************************************************
+
+From: Pascal Leroy
+Sent: Monday, January 24, 2005  8:26 AM
+
+> > Since we use a very stupid preprocessor, and we'd rather not invest in
+> > teaching it about comments (and string literals),
+>
+> Are you saying that out of range characters are permitted in
+> string literals. I do not think this is the case. If I am
+> wrong, please point me to where in the AI it says this.
+
+No, I was just pointing out that in order to recognize comments in a
+brain-dead lexer, you also need to recognize string literals:
+
+	X : String := """This -- is not the beginning of a comment"; -- ... but this is.
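+
+A minimal sketch of the point made above, that locating a comment means
+tracking string literals as well. Comment_Demo and Comment_Start are
+hypothetical names. Character literals are handled only by a simple
+three-character check, which can misfire after an attribute tick (for
+example in Character'('"')), so this is an illustration rather than a
+complete lexer.
+
+```ada
+with Ada.Text_IO; use Ada.Text_IO;
+
+procedure Comment_Demo is
+
+   --  The declaration from the example above, written as an Ada literal
+   Sample : constant String :=
+     "X : String := """"""This -- is not the beginning of a comment"""
+     & "; -- ... but this is.";
+
+   --  Return the index of the "--" that starts a comment in Line, or 0
+   --  if there is none. A "--" inside a string literal is ignored.
+   function Comment_Start (Line : String) return Natural is
+      I         : Natural := Line'First;
+      In_String : Boolean := False;
+   begin
+      while I <= Line'Last loop
+         if In_String then
+            if Line (I) = '"' then
+               --  A doubled quote inside a literal stands for one quote
+               if I < Line'Last and then Line (I + 1) = '"' then
+                  I := I + 1;
+               else
+                  In_String := False;
+               end if;
+            end if;
+         elsif Line (I) = '"' then
+            In_String := True;
+         elsif Line (I) = '''
+           and then I + 2 <= Line'Last
+           and then Line (I + 2) = '''
+         then
+            I := I + 2;  --  skip a character literal such as '"'
+         elsif Line (I) = '-'
+           and then I < Line'Last
+           and then Line (I + 1) = '-'
+         then
+            return I;
+         end if;
+         I := I + 1;
+      end loop;
+      return 0;
+   end Comment_Start;
+
+   N : constant Natural := Comment_Start (Sample);
+
+begin
+   if N > 0 then
+      Put_Line (Sample (N .. Sample'Last));  --  prints the real comment
+   end if;
+end Comment_Demo;
+```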
+
+> procedure Nasty_ACATS_Test is
+>     X : Wide_Character := ["2029"];
+> begin
+>     null;
+>     --  The ARG messed up, this program is legal in Ada 95
+>     --  but illegal in Ada 2005 with a stupid brackets
+>     --  processor in action, because when we use the
+>     --  sequence ["2029"] in a comment, it ends the line
+>     --  and "in a comment, it ends the line" is an invalid
+>     --  statement.
+> end;
+
+(I suppose that you mean to have single quotes around the ["2029"] in the
+declaration of X.)
+
+Obviously this program is legal Ada 95 (including for a stupid
+preprocessor + compiler) because the 2029 doesn't terminate a line.
+
+> You need to answer the question of whether this program is or
+> is not legal in Ada 2005. I really think you want to say it is legal.
+
+My view is that the notation ["2029"] is just a substitute for the
+corresponding character (much like the notation \u in Java) so yes, it
+breaks the comment and makes the program illegal.  It also breaks the
+character literal, by the way, and that's another reason why the program
+is illegal.
+
+My justification is that the bracket notation being used primarily for the
+ACATS, it should make it possible to put any character anywhere so as to
+make it possible to test all combinations.
+
+But again, I am flexible, and if others think otherwise, I'll fix my
+stupid preprocessor.  Users won't care about this anyway, they'll want
+UTF-8 support.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Monday, January 24, 2005  9:56 AM
+
+> Obviously this program is legal Ada 95 (including for a stupid
+> preprocessor + compiler) because the 2029 doesn't terminate a line.
+
+No, no, your stupid preprocessor will convert this to the UTF-8 sequence
+for 16#2029# which absolutely MUST terminate the line. This is a
+paragraph terminator, and the AI clearly says this must terminate a
+line. More strictly, you must have SOME representation that will allow
+a comment to be terminated by 16#2029#. GNAT will certainly recognize
+a UTF-8 coded version of this in a comment, and I think this is essential.
+
+> My view is that the notation ["2029"] is just a substitute for the
+> corresponding character (much like the notation \u in Java) so yes, it
+> breaks the comment and makes the program illegal.  It also breaks the
+> character literal, by the way, and that's another reason why the program
+> is illegal.
+
+The character literal is legal, why do you think otherwise? Assuming
+you have converted this to the equivalent UTF-8 sequence, you should
+accept this just fine.
+
+> My justification is that the bracket notation being used primarily for the
+> ACATS, it should make it possible to put any character anywhere so as to
+> make it possible to test all combinations.
+
+We can't have it both ways:
+
+   case 1. Brackets notation to be allowed in comments. In that case, there
+   is no way to have an ACATS test corresponding to the above program. This
+   is unfortunate, since this now becomes an important case to test.
+
+   case 2. Brackets notation not allowed in comments. Then indeed you cannot
+   test that the UTF-8 sequence for 2029 terminates a line without using
+   UTF-8. The GNAT test suite uses a test that is in UTF-8 just for this
+   purpose.
+
+I see the brackets notation as also useful for canonicalizing programs into
+7-bit form, very useful for email etc.
+
+> But again, I am flexible, and if others think otherwise, I'll fix my
+> stupid preprocessor.  Users won't care about this anyway, they'll want
+> UTF-8 support.
+
+Well I told you this is not theoretical. Allowing brackets in comments
+broke code that has compiled for years.
+
+****************************************************************
+
+From: Pascal Leroy
+Sent: Monday, January 24, 2005  10:29 AM
+
+> No, no, your stupid preprocessor will convert this to the
+> UTF-8 sequence for 16#2029# which absolutely MUST terminate
+> the line.
+
+I think you misread me, I said _Ada_95_ above, not 2005.  In Ada 95 the
+character ["2029"] is a graphic character and thus the program is legal no
+matter how you interpret the bracket notation in comments.  It's only in
+Ada 2005 that there is an issue, right?
+
+> The character literal is legal, why do you think otherwise?
+> Assuming you have converted this to the equivalent UTF-8
+> sequence, you should accept this just fine.
+
+A character literal can only contain a graphic_character, and in Ada 2005
+the character ["2029"] is a format effector, not a graphic character.
+That much seems crystal clear from the AI.  This is similar to the fact
+that in Ada 83 or 95 you cannot have a linefeed in a character literal.
+
+> We can't have it both ways:
+>
+>    case 1. Brackets notation to be allowed in comments. In
+> that case, there
+>    is no way to have an ACATS test corresponding to the above
+> program. This
+>    is unfortunate, since this now becomes an important case to test.
+>
+>    case 2. Brackets notation not allowed in comments. Then
+> indeed you cannot
+>    test that the UTF-8 sequence for 2029 terminates a line
+> without using
+>    UTF-8. The GNAT test suite uses a test that is in UTF-8
+> just for this
+>    purpose.
+
+I'm not sure why you say that in case 1 there is no way to write a test
+like the above.  It seems easy enough to just change the above text to
+have some real statement immediately after ["2029"] and to verify that it
+is indeed executed.  I agree that this is an important case to test,
+precisely because the meaning of 2029 has changed between Ada 95 and Ada
+2005, but I fail to see the difficulty.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Monday, January 24, 2005 10:55 AM
+
+> I think you misread me, I said _Ada_95_ above, not 2005.  In Ada 95 the
+> character ["2029"] is a graphic character and thus the program is legal no
+> matter how you interpret the bracket notation in comments.  It's only in
+> Ada 2005 that there is an issue, right?
+
+Actually no. Reread my original note! The issue arises also with ["0A"]
+in Ada 95 programs. Right now, GNAT allows this sequence in a comment
+without considering it to be a line terminator. Again, this was not a
+theoretical concern, we had code that had nothing whatever to do with
+wide characters that had this sequence of characters in a comment.
+
+I guess your stupid preprocessor would in fact interpret this as ending
+the comment. That means that GNAT and APEX have never been consistent
+in their treatment of semantics of brackets sequences.
+
+I guess it doesn't matter too much. I think it would be nice if everyone
+could agree on what this means, but it is not vital. One big difference
+here is that you don't really implement brackets as you noted. We do,
+and we want the implementation to be efficient and not cause problems
+with non-upwards compatibility.
+
+The latter is not an issue for you since you don't expect your users
+to come near to your brackets processing stuff. But we do, so we can't
+really afford to introduce an incompatibility.
+
+If we can't agree here, I would recommend that we say that the use of
+brackets notation within comments is implementation defined. Then we
+ban it from the ACATS tests, and all is well.
+
+Do all compilers implement UTF-8?
+
+Could we consider mandating this. It would solve the canonicalization
+problem completely.
+
+
+> I'm not sure why you say that in case 1 there is no way to write a test
+> like the above.
+
+> It seems easy enough to just change the above text to
+> have some real statement immediately after ["2029"] and to verify that it
+> is indeed executed.  I agree that this is an important case to test,
+> precisely because the meaning of 2029 has changed between Ada 95 and Ada
+> 2005, but I fail to see the difficulty.
+
+No, that is a completely different program. The program I have here is
+of course a test that you can have a graphic sequence that looks like
+brackets notation within a comment, and treat it as a comment. Since
+this is required to work in Ada 95, we should make sure that every
+compiler can accept this legal program.
+
+If you really want to introduce a standard mandated incompatibility
+that says this program is illegal, I would be strongly opposed. Yes,
+we agree that Ada 2005 requires that a comment be able to be terminated
+by a 2029, and a compiler must have some representation for this (in
+GNAT that representation is UTF-8).
+
+But a compiler must also be able to process the comment in the
+given program without treating it as a terminator, and must have
+a representation for that as well.
+
+Both GNAT and APEX implement UTF-8 and if the two tests are written
+in UTF-8, there is no issue and everything is fine. In UTF-8 mode,
+GNAT does not accept brackets notation.
+
+The issue I am raising here is whether one or both programs can be
+represented in brackets notation. My point is that one or other, but
+not both can be thus represented. I was suggesting that
+
+a) we agree on which of the two programs can be represented this way,
+that is, we agree on whether ["2029"] in brackets notation is interpreted
+in a comment as eight graphic characters, or as a terminating PS. We cannot
+have it both ways.
+
+b) that it is preferable to choose to treat this as graphic characters,
+since to do otherwise would risk incompatibilities with existing code.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Monday, January 24, 2005 12:00 PM
+
+Pascal, you said that 2029 is not allowed in string literals because it is a format
+effector. I can't read this into the AI. In fact we have:
+
+> The predefined type Wide_Character is a character type whose values correspond
+> to the 65536 code positions of the ISO 10646 Basic Multilingual Plane (BMP).
+> Each of the graphic characters of the BMP has a corresponding @fa<character_literal>
+> in Wide_Character. The first 256 values of Wide_Character have the same
+> @fa<character_literal> or language-defined name as defined for Character. The last 2
+> values of Wide_Character correspond to the nongraphic positions FFFE and FFFF
+> of the BMP, and are assigned the language-defined names @i<FFFE> and @i<FFFF>. As with
+> the other language-defined names for nongraphic characters, the names @i<FFFE> and
+> @i<FFFF> are usable only with the attributes (Wide_)Image and (Wide_)Value; they
+> are not usable as enumeration literals. All other values of Wide_Character are
+> considered graphic characters, and have a corresponding @fa<character_literal>.
+
+Why do you think the last sentence excludes anything other than FFFE and FFFF?
+
+I must admit I find no corresponding statement for Wide_Wide_Character, so I do
+not know the status of code in the range 16#01_0000# to 16#10_FFFF# in the
+wide wide character case.
+
+In fact I am not sure why even the control characters in the first 32 positions
+are excluded as graphic characters here, since all other seems to refer only
+to FFFE and FFFF.
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Monday, January 24, 2005  2:15 PM
+
+The text you have quoted here is the old Ada 95 text. The replacement text
+reads:
+
+The predefined type Wide_Character is a character type whose values
+correspond to the 65536 code positions of the ISO/IEC 10646:2003 Basic
+Multilingual Plane (BMP). Each of the graphic characters of the BMP has a
+corresponding @fa<character_literal> in Wide_Character. The first 256 values
+of Wide_Character have the same @fa<character_literal> or language-defined
+name as defined for Character. Each of the @fa<graphic_character>s has a
+corresponding @fa<character_literal>.
+
+Which doesn't have the lines that you are wondering about.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Monday, January 24, 2005  2:53 PM
+
+> The text you have quoted here is the old Ada 95 text. The replacement text
+> reads:
+
+So this is annoying, we have introduced a non upwards compatibility by
+forcing compilers to go to a lot of effort to forbid a curious set of
+wide characters in string literals, just to cause people trouble who
+run into this silly rule in existing programs. UGH!
+
+In fact the relevant wording in the new AI is:
+
+graphic_character
+Any character which is not in the categories other_control, other_private_use,
+other_surrogate, other_format, format_effector, and whose code position is
+neither 16#FFFE# nor 16#FFFF#.
+
+Actually I find the use of format_effector junky here, since it is not a
+defined character category. I would replace it by separator_line,
+separator_paragraph. All other cases of format effectors are included
+in other cases of non-graphics.
+
+So I guess we need one more junk character table (I can now churn them out
+semi-automatically) whose sole purpose is to forbid characters in string
+literals and character literals for no good reason, oh well!
+
+The range table for illegal string characters if anyone is interested is:
+
+      (16#00000#, 16#0001F#),  -- <control> .. <control>
+      (16#0007F#, 16#0009F#),  -- <control> .. <control>
+      (16#000AD#, 16#000AD#),  -- SOFT HYPHEN .. SOFT HYPHEN
+      (16#00600#, 16#00603#),  -- ARABIC NUMBER SIGN .. ARABIC SIGN SAFHA
+      (16#006DD#, 16#006DD#),  -- ARABIC END OF AYAH .. ARABIC END OF AYAH
+      (16#0070F#, 16#0070F#),  -- SYRIAC ABBREVIATION MARK .. SYRIAC ABBREVIATION MARK
+      (16#017B4#, 16#017B5#),  -- KHMER VOWEL INHERENT AQ .. KHMER VOWEL INHERENT AA
+      (16#0200C#, 16#0200F#),  -- ZERO WIDTH NON-JOINER .. RIGHT-TO-LEFT MARK
+      (16#02028#, 16#0202E#),  -- LINE SEPARATOR .. RIGHT-TO-LEFT OVERRIDE
+      (16#02060#, 16#02063#),  -- WORD JOINER .. INVISIBLE SEPARATOR
+      (16#0206A#, 16#0206F#),  -- INHIBIT SYMMETRIC SWAPPING .. NOMINAL DIGIT SHAPES
+      (16#0D800#, 16#0D800#),  -- <Non Private Use High Surrogate, First> .. <Non Private Use High Surrogate, First>
+      (16#0DB7F#, 16#0DB80#),  -- <Non Private Use High Surrogate, Last> .. <Private Use High Surrogate, First>
+      (16#0DBFF#, 16#0DC00#),  -- <Private Use High Surrogate, Last> .. <Low Surrogate, First>
+      (16#0DFFF#, 16#0E000#),  -- <Low Surrogate, Last> .. <Private Use, First>
+      (16#0F8FF#, 16#0F8FF#),  -- <Private Use, Last> .. <Private Use, Last>
+      (16#0FEFF#, 16#0FEFF#),  -- ZERO WIDTH NO-BREAK SPACE .. ZERO WIDTH NO-BREAK SPACE
+      (16#0FFF9#, 16#0FFFB#),  -- INTERLINEAR ANNOTATION ANCHOR .. INTERLINEAR ANNOTATION TERMINATOR
+      (16#0FFFE#, 16#0FFFF#),  -- excluded code positions
+      (16#1D173#, 16#1D17A#),  -- MUSICAL SYMBOL BEGIN BEAM .. MUSICAL SYMBOL END PHRASE
+      (16#E0001#, 16#E0001#),  -- LANGUAGE TAG .. LANGUAGE TAG
+      (16#E0020#, 16#E007F#),  -- TAG SPACE .. CANCEL TAG
+      (16#F0000#, 16#FFFFD#),  -- <Plane 15 Private Use, First> .. <Plane 15 Private Use, Last>
+      (16#100000#, 16#10FFFD#),  -- <Plane 16 Private Use, First> .. <Plane 16 Private Use, Last>
+
+That's just drawn from the database, but I am a little bit unsure of this table.
+What is the category of codes which simply have no definition at all in the table?
+I assume they are not excluded, since otherwise why are FFFE and FFFF specially
+treated. On the other hand this seems a bit odd.
+
+This is fun NOT.
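+
+A table of this shape can be searched with a binary search, as in the
+minimal sketch below. It assumes the ranges are sorted and do not overlap,
+and it copies only a handful of entries from the table above for brevity.
+Range_Demo, Code_Range, Range_Table, Non_Graphic and In_Table are
+hypothetical names.
+
+```ada
+with Ada.Text_IO; use Ada.Text_IO;
+
+procedure Range_Demo is
+
+   type Code_Range is record
+      Lo, Hi : Natural;
+   end record;
+
+   type Range_Table is array (Positive range <>) of Code_Range;
+
+   --  A few entries from the table above, in ascending order
+   Non_Graphic : constant Range_Table :=
+     ((16#00000#, 16#0001F#),
+      (16#0007F#, 16#0009F#),
+      (16#000AD#, 16#000AD#),
+      (16#02028#, 16#0202E#),
+      (16#0FFFE#, 16#0FFFF#));
+
+   --  Binary search: True if Code lies within some range of T
+   function In_Table (Code : Natural; T : Range_Table) return Boolean is
+      Low  : Natural := T'First;
+      High : Natural := T'Last;
+      Mid  : Natural;
+   begin
+      while Low <= High loop
+         Mid := (Low + High) / 2;
+         if Code < T (Mid).Lo then
+            High := Mid - 1;
+         elsif Code > T (Mid).Hi then
+            Low := Mid + 1;
+         else
+            return True;
+         end if;
+      end loop;
+      return False;
+   end In_Table;
+
+begin
+   --  PARAGRAPH SEPARATOR falls in 16#2028# .. 16#202E#
+   Put_Line (Boolean'Image (In_Table (16#2029#, Non_Graphic)));
+   --  LATIN CAPITAL LETTER A is in none of the ranges
+   Put_Line (Boolean'Image (In_Table (16#41#, Non_Graphic)));
+end Range_Demo;
+```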
+
+****************************************************************
+
+From: Dan Eilers
+Sent: Monday, January 24, 2005  2:17 PM
+
+> I would recommend we extend brackets notation to allow six digits as in
+>
+>   ["010248"] representing the character DESERET SMALL LETTER LONG I.
+
+
+I would prefer not using brackets for this purpose, possibly
+using 16#2029# notation instead.  I anticipate that brackets
+have other actual or potential uses for displaying annotated
+Ada text that might conflict.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Monday, January 24, 2005  3:07 PM
+
+Please explain the alternative notation you propose, remember that
+it must be able to occur in characters and strings without causing
+any ambiguity.
+
+****************************************************************
+
+From: Dan Eilers
+Sent: Monday, January 24, 2005  5:49 PM
+
+I think for general (i.e., non-ACATS) use, the existing ability to
+break up string literals using "&" is sufficient.  There isn't
+really any way to add escape characters to string literals that
+is clean enough to standardize.  For ACATS use, it doesn't really
+matter what notation is used.
+
+p.s.
+  What's the story with AI-388, which claims that Ada code would
+print better if numerics.pi used the UTF-8 symbol.   But there
+are a lot of Ada symbols that would print better if Ada used the
+UTF-8 symbol, such as *, /, <=, >=, /=, etc.  Using UTF-8 only for
+PI makes it stick out like a sore thumb.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Monday, January 24, 2005  9:56 PM
+
+I really do not understand what you are talking about here. This has
+nothing to do with escape characters in strings. Please
+give examples of what notation you propose to replace:
+
+    R99 : aliased["00202F"]Integer;  -- OK (narrow no break space)
+
+    R7["03D9"] : constant := 3;      -- OK, test case folding
+    R8 : constant := R6["03D8"] + 1;
+    pragma Assert (R8 = 4);
+
+> What's the story with AI-388, which claims that Ada code would
+> print better if numerics.pi used the UTF-8 symbol.   But there
+> are a lot of Ada symbols that would print better if Ada used the
+> UTF-8 symbol, such as *, /, <=, >=, /=, etc.  Using UTF-8 only for
+> PI makes it stick out like a sore thumb.
+
+Yes, indeed, this is an indulgence we could do without. But
+I guess someone felt strongly about it, and apparently the general
+reaction was that it was not horrible enough to make a fuss about.
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Monday, January 24, 2005  2:37 PM
+
+> I know this is not strictly part of the standard, but I think it is useful
+> if the ARG deals with the issue of how to extend the semi-standard (used in
+> ACATS) brackets notation for wide characters (this is also a way of
+> canonicalizing source programs into ASCII 7-bit graphic characters and
+> standard line terminator characters).
+
+Well, as you noted, source representation is not an issue for the standard.
+
+And it's probably premature to discuss ACATS issues (which this clearly
+is).
+
+But it seems to me that one of the values of the ACATS is to encourage
+implementers to support things that are useful to users. So the decision of
+how to represent ACATS tests with Unicode characters in them should be one
+of what is most useful to users.
+
+My gut feeling is that users would be best served with UTF-8 support in
+their compilers, because that would allow identifiers that "look" correct in
+their editors. So I would think that it would be best to provide tests in
+UTF-8 format without any strange encoding formats. Of course,
+implementations could convert these into whatever other format, but I guess
+most would just accept the UTF-8 directly.
+
+The brackets notation seems to me to be a historical oddity created
+primarily because there wasn't a semi-universal format like UTF-8 in 1995.
+It has much less benefit to users than using UTF-8 does.
+
+I find it odd that a compiler would even support the brackets notation
+directly (that's certainly not how the ACATS UG expects these tests to be
+processed), but implementers can do what they want.
+
+Anyway, this issue will need to be discussed, but in the context of the
+ACATS, and I didn't see much point in bringing it up until we were actually
+creating ACATS tests for this part of the Standard. (I've got too much to do
+on the standard right now to spend a lengthy discussion on the ACATS...)
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Monday, January 24, 2005  4:47 PM
+
+> The brackets notation seems to me to be a historical oddity created
+> primarily because there wasn't a semi-universal format like UTF-8 in 1995.
+> It has much less benefit to users than using UTF-8 does.
+
+Well I can speak to this as the inventor of the notation. Yes, that was one of
+the motivations, but there was another, which was that if we start using upper
+half characters, they get hard to communicate, e.g. in email etc.
+
+For example, the sample programs I have been giving would be impenetrable
+if written in UTF-8 with all kinds of bizarre different character sets
+in use.
+
+Still I have no great objection to ACVC tests being written in UTF-8. I
+think it would always be useful to have in plain ASCII text an account of
+what the mysterious sequences mean (they are also very hard to interpret
+by humans).
+
+I don't think brackets notation is useless to users at all, it is the
+notation of choice if you have programs that you expect basically to
+be read and handled in Latin-1 contexts, but which have to deal with
+a few Unicode characters.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Thursday, February 3, 2005  11:11 PM
+
+Well I finally did the last checkin for AI-285, including all the
+packages (about 55 run time files added to the library), and all
+the documentation. Quite a task!
+
+One concern I have is that thorough testing of this will require
+a huge amount of effort. I wonder if we can enlist the help here
+of those delegations that have an interest in this support.
+
+****************************************************************
+
+From: Pascal Leroy
+Sent: Friday, February 4, 2005  3:29 AM
+
+France volunteers to write a test using LATIN CAPITAL LIGATURE OE and
+LATIN SMALL LIGATURE OE, so that we can finally write Ada software to
+control an egg cooker;-)
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Friday, February 4, 2005  8:35 AM
+
+Be sure to write it in bad style, depending on the case equivalence :-)
+
+****************************************************************
+
