CVS difference for ais/ai-00285.txt

Differences between version 1.16 and version 1.17

--- ais/ai-00285.txt	2004/06/10 05:39:56	1.16
+++ ais/ai-00285.txt	2004/08/28 01:41:53	1.17
@@ -1,4 +1,4 @@
-!standard A.3.2(49)                                    04-06-03  AI95-00285/07
+!standard A.3.2(49)                                    04-07-05  AI95-00285/08
 !class amendment 02-01-23
 !status work item 02-09-24
 !status received 02-01-15
@@ -25,14 +25,6 @@
 
 !proposal
 
-[Author's note: This AI is based on the working draft of ISO/IEC 10646:2003
-dated 2003-02-13.  This standard was published on 2003-12-15, but at the time of
-this writing I don't have access to a copy of the final standard.  While the
-!proposal of this AI contains numerous references to Unicode, the !wording
-section is carefully phrased to avoid such mentions.  It would be possible to
-remove references to Unicode from the !proposal, but it seems that having
-pointers to all the Unicode information would be useful for implementers.]
-
 The essence of this proposal is to allow the source of the program to be
 written using 16-bit characters (from the BMP) or 32-bit characters. Also,
 it makes it possible to operate on 32-bit characters at run-time
@@ -133,22 +125,21 @@
     http://www.unicode.org/Public/4.0-Update/CaseFolding-4.0.0.txt, is used to
     find the uppercase version of each character.
 
-We decided not to apply Normalization Form KC, as there seems to be insuffient
+We decided not to apply Normalization Form KC, as there seems to be insufficient
 experience on using normalization forms. This seems to be a lose-lose situation
 anyway: without normalization, texts that look alike don't have the same
 meaning; with normalization the widely available text tools like grep, awk, etc.
-don't work. We allow an implementation to provide a mode in which it performs
-normalization so that it can "do the right thing" if it turns out that usage of
-normalization becomes prevalent.
+don't work. We follow the lead of C# (ECMA-334) in specifying that a program
+which is not in Normalization Form KC has an implementation-defined effect. This
+ensures that a program text which is normalized is portable. It also allows an
+implementation to provide useful support for non-normalized texts if appropriate
+in a particular computing environment (in that case, the implementation must
+document how it handles such texts).
 
 Unicode doesn't provide guidance for the composition of numeric literals, so we
-don't change them. They are probably not very important from the
-internationalization standpoint anyway.
-
-Again, characters in category other_format (and punctuation_connector) are
-ignored when computing the value of a decimal literal. The numerical value of
-each character that is a number_decimal_digit is defined by the field "Decimal
-digit value" of the Unicode character database.
+don't change them. The use of the digits at positions 16#30# to 16#39# is
+universal in computer science, and allowing digits from other cultures could
+cause confusion while bringing little benefits.
 
 The definition and role of format_effectors is modified to include the
 characters at positions 16#85#, 16#2028# and 16#2029#. These characters may be
@@ -276,20 +267,15 @@
 characters is implementation defined (it need not be a representation defined
 within ISO/IEC 10646:2003).
 
+The semantics of an Ada program whose text is not in Normalization Form KC (as
+defined by section 24 of ISO/IEC 10646:2003) are implementation-defined.
+
 The description of the language definition in this International Standard uses
 the character properties General Category and Decimal Digit Value of the
 documents referenced by the note in section 1 of ISO/IEC 10646:2003. The actual
 set of graphic symbols used by an implementation for the visual representation
 of the text of an Ada program is not specified.
 
-[Author's note: the above jargon is a polite way of saying Unicode without using
-the characters U, n, i, c, o, d and e.  ISO/IEC 10646:2003 references Unicode
-all over the place, including in normative text.  As a matter of fact, a number
-of Unicode technical reports are listed in the "Normative references" section of
-ISO/IEC 10646:2003.  So rather than directly referencing Unicode, which might be
-hard to swallow for WG9 or SC22, I am using an indirect reference through
-ISO/IEC 10646:2003, which hopefully will be considered kosher.]
-
 The categories of characters are defined as follows:
 
 letter_uppercase
@@ -423,23 +409,11 @@
     section 1 of ISO/IEC 10646:2003, is applied to obtain the uppercase version
     of each character.
 
-	Implementation Advice
 
-If appropriate for the computing environment under consideration, an
-implementation should provide a mode where Normalization Form KC (as defined by
-section 24 of ISO/IEC 10646:2003) is applied to the identifier immediately
-before performing full case folding.
-
-
 Add after 2.6(6):
 
 No modification is performed on the sequence of characters in a string_literal.
 
-	Implementation Permission
-
-An implementation may provide a mode where Normalization Form KC (as defined by
-section 24 of ISO/IEC 10646:2003) is applied to the string literal.
-
 
 Replace 3.5(28-29) by:
 
@@ -795,7 +769,7 @@
             return Wide_Wide_Character_Set;
       function "xor" (Left, Right : in Wide_Wide_Character_Set)
             return Wide_Wide_Character_Set;
-      function "" (Left, Right : in Wide_Wide_Character_Set)
+      function "-" (Left, Right : in Wide_Wide_Character_Set)
             return Wide_Wide_Character_Set;
       function Is_In (Element : in Wide_Wide_Character;
                       Set : in Wide_Wide_Character_Set)
