Version 1.1 of ais/ai-00285.txt

!standard A.3.2(49)          02-01-23 AI95-00285/00
!class amendment 02-01-23
!status received 02-01-15
!priority Medium
!difficulty Hard
!subject Latin-9 and Ada.Characters.Handling
!summary
!problem
Latin-9 has been introduced.
!proposal
!discussion
!example
!ACATS test
!appendix

From: Gary Dismukes
Sent: Tuesday, January 15, 2002  4:14 PM

Ben Brosgol recently pointed out to us (ACT) the introduction of a
variant of the Latin 1 character set that is designated Latin 9.

A web page describing Latin 9 can be viewed at:

  http://www.cs.tut.fi/~jkorpela/latin9.html

Here's the summary blurb on that page describing the relatively minor
differences between Latin 1 and Latin 9:

  ISO Latin 9 as compared with ISO Latin 1

  The ISO Latin 9 (ISO 8859-15) character set differs from the well-known
  ISO Latin 1 (ISO 8859-1) character set in a few positions only. The euro
  sign and some national letters used e.g. in French and Finnish have been
  introduced and some rarely used special characters omitted.

We've added a new package to the GNAT library named Ada.Characters.Latin_9,
analogous to Ada.Characters.Latin_1, to define character constants for this
new character set.
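
The positions at which the two sets differ can be listed mechanically. The
following is a minimal sketch, assuming Python's standard "iso8859-1" and
"iso8859-15" codecs as stand-ins for the two character sets; the function
name differing_positions is hypothetical and is not part of GNAT or the RM.

```python
import unicodedata


def differing_positions():
    """Return (code, latin1_char, latin9_char) for each code that differs."""
    diffs = []
    for code in range(256):
        raw = bytes([code])
        c1 = raw.decode("iso8859-1")    # ISO Latin 1
        c9 = raw.decode("iso8859-15")   # ISO Latin 9
        if c1 != c9:
            diffs.append((code, c1, c9))
    return diffs


# Print the Unicode names so the output stays plain ASCII.
for code, c1, c9 in differing_positions():
    print(f"16#{code:02X}#  {unicodedata.name(c1)}  ->  {unicodedata.name(c9)}")
```

Under these assumptions the two sets differ in eight positions, including
the euro sign and the OE and oe ligatures.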

Robert Dewar asked me to post the following remarks from him
re Latin-9 and Ada.Characters.Handling:

----------

Note that the Ada package Latin-1 did not exactly follow the official
names of all characters, and I have copied its abbreviated naming style
for the new characters in Latin-9.

I have a gripe with the RM here. The setup for Ada.Characters.Latin_1 is
to have separate packages for separate character sets, which makes perfectly
good sense:

27   An implementation may provide additional packages as children of
Ada.Characters, to declare names for the symbols of the local character set
or other character sets.

But for Characters.Handling, we have the odd statement:

49   If an implementation provides a localized definition of Character or
Wide_Character, then the effects of the subprograms in Characters.Handling
should reflect the localizations.  See also 3.5.2.

which implies that some mysterious transformation happens on this package
(under what circumstances?). I think this is a bad idea for two reasons:

a) it requires specialized mechanisms in the compiler, and it seems odd
for the meaning of this package to depend on some compiler switch etc.

b) it precludes handling multiple character sets in the same program,
whereas the design for Ada.Characters.Latin_1 etc. seems to accommodate this.

My recommendation is that an implementation generate separate packages,
called e.g. Ada.Characters.Handling_Latin_9 (with Ada.Characters.Handling
being a renaming of Ada.Characters.Handling_Latin_1 perhaps?)
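
The idea of one handling unit per character set, usable side by side in the
same program, can be sketched as follows. This is only an illustration in
Python, not Ada and not a proposed API: the class CharacterHandling and the
objects handling_latin_1 and handling_latin_9 are hypothetical names, and
the classification relies on Python's Unicode properties, which need not
match the RM's definitions of Is_Letter and To_Upper in every position.

```python
class CharacterHandling:
    """Classification and case conversion for one 8-bit character set.

    Hypothetical helper: `codec` is the name of a Python codec that stands
    in for the character set.
    """

    def __init__(self, codec):
        self.codec = codec

    def is_letter(self, code):
        """True if the character at position `code` is a letter."""
        return bytes([code]).decode(self.codec).isalpha()

    def to_upper(self, code):
        """Return the position of the upper-case form, or `code` itself
        if this character set has no single-character upper-case form."""
        upper = bytes([code]).decode(self.codec).upper()
        try:
            encoded = upper.encode(self.codec)
        except UnicodeEncodeError:
            return code  # upper-case form is not in this character set
        return encoded[0] if len(encoded) == 1 else code


# Two independent instances; neither depends on a global switch.
handling_latin_1 = CharacterHandling("iso8859-1")
handling_latin_9 = CharacterHandling("iso8859-15")

print(handling_latin_1.is_letter(0xBD), handling_latin_9.is_letter(0xBD))
```

Position 16#BD# is the fraction one half in Latin 1 and the oe ligature in
Latin 9, so the two instances classify it differently while coexisting in
one program.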

Robert Dewar

*************************************************************

From: Pascal Leroy
Sent: Tuesday, January 15, 2002  5:05 PM

>   The ISO Latin 9 (ISO 8859-15) character set differs from the well-known
>   ISO Latin 1 (ISO 8859-1) character set in a few positions only. The euro
>   sign and some national letters used e.g. in French and Finnish have been
>   introduced and some rarely used special characters omitted.

Oh boy, good to see that the OE and oe ligatures are now available, and that
we now can write French without having to use Unicode!

*************************************************************

From: John Barnes
Sent: Wednesday, January 16, 2002  1:44 AM

Better put that on the agenda for the next ARG. Ada 2005
should use Latin 9 rather than Latin 1.  A minor change.
Might be a few incompatibilities.

*************************************************************

From: Pascal Leroy
Sent: Wednesday, January 16, 2002  12:53 PM

As I mentioned in a mail yesterday, the fact that you can use Latin 9 to
write French makes it look very interesting to me.

On the other hand, it is not too useful for Ada to support Latin 9 if the
OSes don't: if I emit the character OE and it prints out as 1/4 on my screen,
I didn't gain much.
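
The mismatch can be reproduced in a few lines. This is a minimal sketch,
assuming Python's "iso8859-15" and "iso8859-1" codecs model the program's
output encoding and the display's encoding respectively; the variable names
are illustrative only.

```python
import unicodedata

# The program emits the OE ligature encoded as Latin 9 ...
emitted = "\u0152".encode("iso8859-15")   # LATIN CAPITAL LIGATURE OE

# ... but the display interprets the same byte as Latin 1.
shown = emitted.decode("iso8859-1")

print(emitted.hex(), unicodedata.name(shown))
```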

So while I agree that we should consider supporting Latin 9 _in_addition_ to
Latin 1 in Ada 05, I don't think Latin 9 should _replace_ Latin 1, because I
am ready to bet that we will still have Latin 1 OSes ten years from now.

*************************************************************

From: John Barnes
Sent: Thursday, January 17, 2002  1:33 AM

It was somewhat of a jokey suggestion as I am sure you are aware.

Indeed I had a big problem when writing my book and
displaying the type Character. I wrote it in QuarkXPress on
a PC and it was fine. The publishers moved it to a Mac
before printing and some characters came out wrong.  One of
them came out as a picture of an apple. Moreover, someone
had bitten a lump out of it. So much for standards I
thought.

But supporting Latin-9 would be nice. All those adverts on
the Paris Metro for eating an oeuf can then be printed
properly.

*************************************************************

From: Bob Duff
Sent: Thursday, January 17, 2002  1:14 PM

> Indeed I had a big problem when writing my book and
> displaying the type Character.

I had a great deal of trouble writing the part of the Reference Manual
where type Character lives.  I think Randy had some trouble with the
updated RM, too.  At least we didn't try to show type Wide_Character in
its full glory.  ;-)

7-bit ASCII will live forever, I suppose.

*************************************************************

From: Bob Duff
Sent: Wednesday, January 16, 2002  2:15 PM

> Ben Brosgol recently pointed out to us (ACT) the introduction of a
> variant of the Latin 1 character set that is designated Latin 9.

The nice thing about standards is that there are so many to choose
from.  ;-)

> My recommendation is that an implementation generate separate packages,
> called e.g. Ada.Characters.Handling_Latin_9 (with Ada.Characters.Handling
> being a renaming of Ada.Characters.Handling_Latin_1 perhaps?)

That makes sense.

But I think the RM statement you complain about is envisioning a
nonstandard version of Standard.[Wide_]Character, which is a separate
issue.  I don't see that as a big deal -- if you don't think it's a good
idea, don't implement any such thing.  I tend to agree that compiler
switches and the like shouldn't normally be meddling with the semantics
of packages Standard and Characters.Handling without a very good reason.

*************************************************************

From: Florian Weimer
Sent: Friday, January 18, 2002  6:58 AM

> But I think the RM statement you complain about is envisioning a
> nonstandard version of Standard.[Wide_]Character, which is a separate
> issue.

If you use Latin 9 for Standard.Character, this is certainly a
non-standard version, and Ada.Characters.Handling has to be modified
to remain useful.

*************************************************************

From: Florian Weimer
Sent: Friday, January 18, 2002  6:58 AM

> Better put that on the agenda for the next ARG. Ada 2005
> should use Latin 9 rather than Latin 1.  A minor change.
> Might be a few incompatibilities.

I disagree.  With Latin 9, the mapping from Character to
Wide_Character is less straightforward, and this could have unexpected
results.
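
To make the point concrete: with Latin 1 the conversion to the wider type
preserves the position number, while with Latin 9 a few positions need a
lookup. The following is a minimal sketch, assuming Python's codecs and
Unicode code points as stand-ins for Character and Wide_Character; the
function to_wide is a hypothetical name.

```python
def to_wide(code, codec):
    """Return the ISO 10646 code point for an 8-bit character position."""
    return ord(bytes([code]).decode(codec))


# Latin 1: every position maps to the same position number.
latin1_is_identity = all(to_wide(c, "iso8859-1") == c for c in range(256))

# Latin 9: collect the positions where the position number changes.
latin9_exceptions = {
    c: to_wide(c, "iso8859-15")
    for c in range(256)
    if to_wide(c, "iso8859-15") != c
}

print(latin1_is_identity)
for code, wide in sorted(latin9_exceptions.items()):
    print(f"16#{code:02X}# -> 16#{wide:04X}#")
```

A conversion that simply copies the position number would therefore be
wrong for these positions if Character were Latin 9.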

OTOH, it seems that Wide_Character is not widely used (unless you are
forced to use it by ASIS), so this might not matter much.

In addition, we really should add Wide_Wide_Character (which covers
the sixteen additional planes), or make Wide_Character itself wider.
Otherwise, using Unicode with standard Ada will be rather painful.
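
The limitation of a 16-bit character type can be illustrated as follows.
This is a minimal sketch in Python, assuming UTF-16 as the encoding a
16-bit type would have to carry; the helpers plane and utf16_units are
hypothetical names, and the two sample characters are arbitrary choices.

```python
def plane(ch):
    """Return the ISO 10646 plane number (0 is the Basic Multilingual Plane)."""
    return ord(ch) >> 16


def utf16_units(ch):
    """Return how many 16-bit code units UTF-16 needs for this character."""
    return len(ch.encode("utf-16-be")) // 2


euro = "\u20ac"          # EURO SIGN, in plane 0
g_clef = "\U0001D11E"    # MUSICAL SYMBOL G CLEF, in plane 1

for ch in (euro, g_clef):
    print(f"16#{ord(ch):06X}#  plane {plane(ch)}  units {utf16_units(ch)}")
```

A character outside plane 0 does not fit in a single 16-bit value, so a
16-bit type can only represent it as a pair of code units.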

*************************************************************
