CVS difference for ai05s/ai05-0079-1.txt

Differences between 1.2 and version 1.3

--- ai05s/ai05-0079-1.txt	2007/12/13 04:39:38	1.2
+++ ai05s/ai05-0079-1.txt	2008/03/07 06:15:19	1.3
@@ -1,20 +1,24 @@
-!standard 2.2(7)                                             07-12-07    AI05-0079-1/01
+!standard 2.1(4/2)                                            08-02-26    AI05-0079-1/02
+!standard 2.2(7)
 !class binding interpretation 07-12-07
+!status ARG Approved  7-0-1  08-02-09
 !status work item 07-12-07
 !status received 07-10-25
 !priority Low
 !difficulty Easy
 !qualifier Omission
-!subject Other_Format characters should be allowed wherever separators are allowed
+!subject An other_format character should be allowed wherever a separator is allowed
 
 !summary
 
-Other_format characters should be allowed wherever the language allows separators.
-These characters have no meaning to an Ada program.
+An other_format character should be allowed wherever the language allows a
+separator. These characters have no meaning to an Ada program. Characters
+that are not in categories other_format, format_effector and graphic_character
+are not allowed outside of comments in an Ada program. 
 
 !question
 
-(1) A standard convention is to start a file with a zero width
+A standard convention is to start a file with a zero width
 non-breaking space character, 16#0000_FEFF#. By looking at the
 first bytes you can tell if the file is UTF-8, UTF-16 (LE/BE)
 or UTF-32 (LE/BE) encoded. That's the convention that Windows XP
@@ -31,21 +35,6 @@
 be any problem with allowing these more generally. Should these
 characters be allowed? (Yes.)
 
-(2) Recent versions of the Unicode technical report on identifiers
-(TR31: http://www.unicode.org/reports/tr31/) say that characters
-in class other_format should not be allowed in programming language
-identifiers for security reasons. (This changed since the Amendment
-started standardization.)
-
-The problem is basically that these characters can be used
-to write an identifier that looks like Foo_Bar but is actually
-Fo<Right-To-Left>aB_o<Left-To-Right>r. Stripping other_format results
-in FoaB_or, which is what the compiler sees, and lo and behold, I have
-introduced a vulnerability in a source file that looks perfectly
-kosher on your screen. 
-
-Should the definition of Ada be changed? (No.)
-
 !recommendation
 
 (See Summary.)
@@ -59,7 +48,7 @@
 
 Add a new paragraph after 2.2(7):
 
-One of more other_format characters are allowed anywhere that a separator
+One or more other_format characters are allowed anywhere that a separator
 is[; any such characters have no effect on the meaning of an Ada program].
 
 [Editor's note: These brackets indicate text that would be marked as redundant in
@@ -84,9 +73,11 @@
 !discussion
 
 We need wording to say that characters not explicitly allowed are prohibited in
-programs. This doesn't have that much force, since such characters can be defined
-as part of the source representation. But the goal is that a file encoded in UTF-8
-can be mapped directly to the language rules, without any source representation games.
+programs. This wording was unintentionally dropped from the Ada 95 version (which
+had it in 2.1(1)). This wording doesn't have that much force, since such characters
+can be defined as part of the source representation. But the goal is that a file
+encoded in UTF-8 can be mapped directly to the language rules, without any source
+representation games.
 
 There is one semi-weird side-effect of this wording. If [ZWS] represents zero width
 space, then if it appears in a compound delimiter:
@@ -96,7 +87,7 @@
 the program ought to be illegal, as other_format characters are not allowed between
 the parts of a compound delimiter (this is not before or after a lexical element).
 
-However, the language rules as amendment seem to requirethis to be treated as
+However, the language rules as amended seem to require this to be treated as
 two single delimiters (as no separator is required between delimiters). That seems
 bad, as an editor may show this such that it looks like a single compound delimiter
 (a zero-width space is unlikely to be very obvious).
@@ -108,20 +99,21 @@
 add a special rule to specifically make this case illegal, but that seems klunky at
 best. The net effect is that an other_format character can act as a separator.
 
-On question 2, we choose to do nothing now. Ada 2005 is based on ISO/IEC 10646:2003,
-which corresponds to Unicode 4.0, and we followed the recommendations of that version
-of Unicode. Moreover, changing identifier syntax now would introduce an
-incompatibility (probably fairly slight). One presumes a future version of Ada will
-update the Unicode reference, and that will have the effect of changing the characters
-allowed in identifiers slightly. (TR31 also has changed the character classifications
-that are allowed in identifiers.) That will also be mildly incompatible, and we
-think that would be a better time to make the change for other_format characters.
-
-It should noted that the TR31 pronouncement is not absolute; they suggest that perhaps
-some characters should be allowed. Essentially, they don't appear to be very sure what
-the right answer is. Perhaps they'll change their mind again in the future, so it seems
-silly to be chasing their whims on this issue.
+!corrigendum 2.1(4/2)
 
+@drepl
+The coded representation for characters is implementation defined
+[(it need not be a representation defined within ISO/IEC 10646:2003 ISO-10646-1)].
+A character whose relative code position in its plane is 16#FFFE# or 16#FFFF#
+is not allowed anywhere in the text of a program. 
+@dby
+The coded representation for characters is implementation defined
+[(it need not be a representation defined within ISO/IEC 10646:2003 ISO-10646-1)].
+A character whose relative code position in its plane is 16#FFFE# or 16#FFFF#
+is not allowed anywhere in the text of a program. 
+The only characters allowed outside of comments are those in categories
+@fa<other_format>, @fa<format_effector> and @fa<graphic_character>.
+
 !corrigendum 2.2(7)
 
 @dinsa
@@ -130,8 +122,8 @@
 required between an @fa<identifier>, a reserved word, or a @fa<numeric_literal>
 and an adjacent @fa<identifier>, reserved word, or @fa<numeric_literal>.
 @dinst
-One of more other_format characters are allowed anywhere that a separator
-is[; any such characters have no effect on the meaning of an Ada program].
+One of more @fa<other_format> characters are allowed anywhere that a separator
+is; any such characters have no effect on the meaning of an Ada program.
 
 !ACATS Test
 
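Illustration (not part of the AI text): the !question above notes that the first bytes of a file reveal whether it is UTF-8, UTF-16 (LE/BE) or UTF-32 (LE/BE) encoded, and the new wording treats other_format characters such as 16#FEFF# as having no effect on the program. The following is a minimal sketch of both ideas in Python, not a description of how any Ada compiler works. The names sniff_encoding, is_other_format and other_format_positions are hypothetical, and equating other_format with Unicode general category Cf is an assumption.

```python
import codecs
import unicodedata

# Longer marks are listed first: the UTF-32 LE mark (FF FE 00 00)
# begins with the UTF-16 LE mark (FF FE), so order matters.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_encoding(head: bytes):
    """Return (encoding name, mark length) for a leading 16#FEFF#, else (None, 0)."""
    for mark, name in _BOMS:
        if head.startswith(mark):
            return name, len(mark)
    return None, 0


def is_other_format(ch: str) -> bool:
    """Assumption: other_format corresponds to general category Cf."""
    return unicodedata.category(ch) == "Cf"


def other_format_positions(text: str):
    """Indexes of other_format characters, e.g. to flag them in an editor."""
    return [i for i, ch in enumerate(text) if is_other_format(ch)]


if __name__ == "__main__":
    source = "\ufeffX :\u200b= 1;"
    print(sniff_encoding(source.encode("utf-8")))
    print(other_format_positions(source))
```

A tool like this can make visible the case raised in the !discussion, where a zero width space sits between the two characters of a compound delimiter and an editor displays them as one. Note that a UTF-16 LE file whose first character after the mark is 16#0000# would be reported as UTF-32 LE by this sketch.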
