CVS difference for ai05s/ai05-0185-1.txt

Differences between 1.3 and version 1.4

--- ai05s/ai05-0185-1.txt	2010/10/19 03:51:18	1.3
+++ ai05s/ai05-0185-1.txt	2010/10/22 04:52:26	1.4
@@ -423,7 +423,192 @@
 
 !appendix
 
+From: Robert Dewar
+Sent: Saturday, July 3, 2010  3:29 PM
+
+We forgot to say what the bounds of the result are for To_Lower and To_Upper. I
+suggest the same as the bounds of the input parameter (the alternative is always
+1 as the low bound).
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Saturday, July 3, 2010  3:45 PM
+
+The Inline pragma for Is_Graphic says Is_Non_Graphic.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Saturday, July 3, 2010  4:05 PM
+
+I object to the pragma Inlines; it's up to the implementation to decide what
+makes sense to mark as inlined.
+
+****************************************************************
+
 From: Randy Brukardt
+Sent: Saturday, July 3, 2010  5:28 PM
+
+Robert, in the future, please indicate the AI and version (and Bob would like
+the title as well) that you are looking at, because it can be hard to find
+whatever is being referred to.
+
+Anyway, once I figured out that you are talking about AI05-0185-1, the first
+note in the yet-to-be-published minutes says: "Drop all of the pragma Inline."
+
+It's not that helpful to review AIs between the end of a meeting and the
+publishing of the minutes, because it is likely that you'll just comment on
+stuff that has already been decided -- and that just adds to my workload without
+any corresponding benefit. There will be an editorial review of all of the newly
+completed AIs that will start shortly, and that is the appropriate time for
+reviewing these.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Saturday, July 3, 2010  6:40 PM
+
+No problem, I am just making comments as I implement things and notice them, but
+this stuff is hardly critical!
+
+One thing that does concern me is To_Lower, I hope everyone realizes that
+To_Upper and To_Lower are not easily reversible.
+
+For instance in identifiers, you definitely want lower case i with a dot to be
+equivalent to upper case I without a dot. Anything else would be a big surprise
+to anyone who is not Turkish.
+
+But there are four characters: lower case i with and without a dot, and upper
+case I with and without a dot. The natural folding would be to keep the dot, but
+that's obviously not what you want.
+
+So my current implementation of To_Upper folds lower case i with a dot to upper
+case I without a dot. But I am not sure what the To_Upper and To_Lower functions
+in these packages in the AI are supposed to do.
+
+Who has studied the To_Upper/To_Lower issue carefully for the purpose of this
+AI? Someone I hope! Or were these routines just stuck in casually without
+thinking about the difficult problems behind them (I suspect this is the case,
+please tell me it isn't and that someone can tell me EXACTLY what they had in
+mind).
+
+I follow the locale independent case folding discussed in note 1 of ISO/IEC
+10646:2003 for To_Upper_Case currently.
+
+And now I can't even find this standard to look at it again :-(
+
+UGH! Case folding was one of the hardest things to deal with, and here it is in
+even greater glory in this package. Oh well I can always implement something or
+other. The RM certainly does not say what it means (though what *is* the
+reference to Simple_Lower_Case???)
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Saturday, July 3, 2010  6:44 PM
+
+> For instance in identifiers, you definitely want lower case i with a
+> dot to be equivalent to upper case I without a dot. Anything else
+> would be a big surprise to anyone who is not Turkish.
+
+To expand on this a bit, my current To_Upper function maps both lower case i
+with dot and lower case i with no dot to upper case I with no dot. I am sure
+this is what is wanted for identifier case equivalence (anything else would be
+an incompatible disaster). But that means that To_Upper is a many-to-one
+mapping, and thus is not reversible.
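The many-to-one behavior described above can be sketched as follows. This is an illustration only: it assumes Python's built-in `str.upper` and `str.lower` as a stand-in for a locale-independent case mapping, not the AI's To_Upper, and the helper name `round_trips` is hypothetical.

```python
# Illustration only: Python's str.upper/str.lower stand in for a
# locale-independent case mapping; they are not the Ada routines.
DOTTED_LOWER_I = "i"          # U+0069 LATIN SMALL LETTER I
DOTLESS_LOWER_I = "\u0131"    # U+0131 LATIN SMALL LETTER DOTLESS I


def round_trips(ch: str) -> bool:
    """Return True if lowering the uppercased character gives it back."""
    return ch.upper().lower() == ch


# Both lower case forms map to the same upper case letter (I without a dot),
# so the mapping is many-to-one.
both_fold_to_plain_I = DOTTED_LOWER_I.upper() == DOTLESS_LOWER_I.upper() == "I"
```

Because two inputs share one result, no single inverse mapping can recover both of them; here `round_trips` holds for the dotted i but not for the dotless one.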
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Sunday, July 4, 2010  6:07 PM
+
+...
+> One thing that does concern me is To_Lower, I hope everyone realizes
+> that To_Upper and To_Lower are not easily reversible.
+
+Those of us in the ARG who (sort of) understand the character stuff surely know
+that. But it probably would be a good idea to make it clear to regular
+end-users, so it would make sense to add a user note.
+
+...
+> So my current implementation of To_Upper folds lower case i with a dot
+> to upper case I without a dot. But I am not sure what the To_Upper and
+> To_Lower functions in these packages in the AI are supposed to do.
+
+My understanding is that they are supposed to use the "Simple Uppercase Mapping"
+(and "Simple Lowercase Mapping") as defined by 10646. If there is no such thing,
+we have a problem! Probably the wording should make this clearer rather than
+just using Titlecase for the terms. That is, say something like "Simple
+Uppercase Mapping of ISO/IEC 10646:2003."
+
+> Who has studied the To_Upper/To_Lower issue carefully for the purpose
+> of this AI? Someone I hope! Or were these routines just stuck in
+> casually without thinking about the difficult problems behind them (I
+> suspect this is the case, please tell me it isn't and that someone can
+> tell me EXACTLY what they had in mind).
+>
+> I follow the locale independent case folding discussed in note 1 of
+> ISO/IEC 10646:2003 for To_Upper_Case currently.
+>
+> And now I can't even find this standard to look at it again :-(
+
+I vaguely recall someone saying that this standard has free availability;
+presuming that is true there should be no problem getting a copy. (That said, I
+don't have a copy and should get one.)
+
+> UGH! Case folding was one of the hardest things to deal with, and here
+> it is in even greater glory in this package. Oh well I can always
+> implement something or other. The RM certainly does not say what it
+> means (though what *is* the reference to
+> Simple_Lower_Case???)
+
+It's "Simple Uppercase Mapping", and I presume there is something with that name
+in 10646. If not, we don't have a defined functionality, and that *surely* would
+be a problem.
+
+I personally had thought that this was talking about the same mapping used for
+Ada Identifiers, but having read the definition again, I'm not so sure anymore.
+That's because To_Upper for strings is defined in terms of To_Upper for
+characters, and that surely doesn't work for the full character set (how can
+To_Upper for a character return the *three* characters needed in some extreme
+cases??). So I suspect that you are right that there is a definitional problem
+here.
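The "three characters" concern can be illustrated with a rough sketch. It assumes that the remark refers to Unicode full case mapping, and it uses Python's `str.upper` (which applies full case mapping to strings) purely as a stand-in; the helper `upper_length` is hypothetical and is not part of any package discussed here.

```python
# Illustration only: str.upper applies Unicode full case mapping, so one
# code point may expand to several. A character-to-character To_Upper
# cannot express such results.
SHARP_S = "\u00df"               # LATIN SMALL LETTER SHARP S
IOTA_DIALYTIKA_TONOS = "\u0390"  # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS


def upper_length(ch: str) -> int:
    """Number of code points produced by uppercasing a single code point."""
    if len(ch) != 1:
        raise ValueError("expected exactly one code point")
    return len(ch.upper())
```

Under this stand-in, an ordinary letter yields one code point, the sharp s yields two, and the Greek iota with dialytika and tonos yields three, which is why a string-level definition built on a one-character function cannot produce the full mapping.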
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Sunday, July 4, 2010  6:30 PM
+
+> My understanding is that they are supposed to use the "Simple
+> Uppercase Mapping" (and "Simple Lowercase Mapping") as defined by
+> 10646. If there is no such thing, we have a problem! Probably the
+> wording should make this clearer rather than just using Titlecase for
+> the terms. That is, say something like "Simple Uppercase Mapping of ISO/IEC
+> 10646:2003."
+
+I don't know what this refers to, can someone find a reference?
+
+> I personally had thought that this was talking about the same mapping
+> used for Ada Identifiers, but having read the definition again, I'm
+> not so sure anymore. That's because To_Upper for strings is defined in
+> terms of To_Upper for characters, and that surely doesn't work for the
+> full character set (how can To_Upper for a character return the
+> *three* characters needed in some extreme cases??). So I suspect that
+> you are right that there is a definitional problem here.
+
+To_Upper cannot return three characters for one, what are you talking about?
+10646 has one code per point, we are not talking about UTF-8 strings here.
+
+For source it's up to you how the characters are represented, but conceptually
+identifiers are a sequence of wide_wide_characters.
+
+[This thread is rapidly turning to talk about identifiers; as such
+it continues in AI05-0227-1.]
+
+****************************************************************
+
+From: Randy Brukardt
 Sent: Wednesday, August 11, 2010  9:44 PM
 
 The text in this AI says:
@@ -440,9 +625,63 @@
 and the text description removed.
 
 ****************************************************************
+
+From: Randy Brukardt
+Sent: Wednesday, August 11, 2010  9:55 PM
+
+Should we change the Implementation Advice in A.3.1 since we are now providing
+some form of case mapping and classification? It says:
+
+If an implementation chooses to provide implementation-defined operations on
+Wide_Character or Wide_String (such as case mapping, classification, collating
+and sorting, etc.) it should do so by providing child units of Wide_Characters.
+Similarly if it chooses to provide implementation-defined operations on
+Wide_Wide_Character or Wide_Wide_String it should do so by providing child units
+of Wide_Wide_Characters.
+
+Arguably it is still correct, since one could easily imagine further
+classification functions and "full case folding". But it seems a bit misleading,
+especially as it originally was added because we were *not* adding
+Wide_Characters.Handling in Ada 2005; now that we decided to do that, it is not
+clear that it is as useful. (And it is a bit weird that it doesn't mention
+String; why not make the same statement for it?)
+
+Thoughts??
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Wednesday, August 11, 2010  10:38 PM
+
+Having looked at this a bit more I wonder if the names of the Is_Other and
+Is_Punctuation routines are misleading.
+
+Is_Other returns True for Other_Format characters, but not for other characters
+classified as "other, something". I think this routine would be better called
+Is_Other_Format.
+
+I was going to ignore Is_Other, but then I saw that Is_Punctuation is very
+misleading. This returns true for characters in category punctuation_connector
+(that is, for underscore), but will return False for common punctuation like '.'
+and ','. Punctuation_connector is the only category used in the Ada grammar (in
+identifiers), so it is the only one our standard defines. As such, it probably
+is the only one we really want to support here, but clearly we need a name that
+isn't misleading. Is_Punctuation_Connector would be a much better name.
+
+Thoughts??
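The naming problem can be checked against the Unicode general categories. The sketch below is a hedged illustration: it assumes Python's `unicodedata` module (which reflects the Unicode database bundled with the interpreter, not ISO/IEC 10646:2003 itself), and the function names simply mirror the proposed Ada names.

```python
import unicodedata


def is_punctuation_connector(ch: str) -> bool:
    """True for general category Pc (Punctuation, Connector), e.g. underscore."""
    return unicodedata.category(ch) == "Pc"


def is_other_format(ch: str) -> bool:
    """True for general category Cf (Other, Format), e.g. the soft hyphen."""
    return unicodedata.category(ch) == "Cf"
```

With these definitions the underscore is accepted, while '.' and ',' (category Po) are not, and a control character such as a line feed (category Cc) is "other" without being Other_Format.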
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Wednesday, August 11, 2010  11:14 PM
+
+No objections to these name changes, they seem minor and are easy enough to
+adjust in existing code.
+
+****************************************************************
 
-Editor's note: This AI was reopened to address the items mentioned above and others
-raised during Editorial Review. Specifically:
+Editor's note: This AI was reopened to address the items mentioned above and
+others raised during Editorial Review. Specifically:
 
 I had previously asked that the names Is_Other and Is_Punctuation be changed to
 Is_Other_Format and Is_Punctuation_Connector; the latter in particular is very
@@ -461,10 +700,10 @@
 Uppercase Mapping. The first is a Unicode term; but we can't refer to Unicode
 normatively in the Standard. The second doesn't exist anywhere.
 
-Moreover, these are different than what identifiers use. Robert and I had an e-mail
-meltdown on this back in July. And the identifier definition is completely daft,
-as the "convert to uppercase" definition says use Unicode full case folding -- but
-*that* is a conversion to *lower* case!
+Moreover, these are different than what identifiers use. Robert and I had an
+e-mail meltdown on this back in July. And the identifier definition is
+completely daft, as the "convert to uppercase" definition says use Unicode full
+case folding -- but *that* is a conversion to *lower* case! See AI05-0227-1.
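The direction of case folding can be seen in a small sketch. It assumes Python's `str.casefold`, which applies Unicode full case folding, as a stand-in; the wrapper name `fold` is hypothetical.

```python
# Illustration only: str.casefold applies Unicode full case folding.
def fold(text: str) -> str:
    """Return the case-folded form of text, for caseless comparison."""
    return text.casefold()
```

Folded results come out in lower case rather than upper case, and a fold can also change the length of a string (the sharp s folds to "ss").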
 
 So we need to decide what we really want here.
 
