Version 1.1 of acs/ac-00302.txt

!standard 2.5(2)          18-01-12 AC95-00302/00
!standard 2.6(3)
!class Amendment 18-01-12
!status received no action 18-01-12
!status received 18-01-05
!subject Wide_Wide_Character and Wide_Wide_String encoding
!summary
!appendix

!topic Wide_Wide_Character literal
!reference Ada 2012 RM2.5(14)
!from Author John-Eric Söderman 18-01-05
!keywords Wide_Wide_Character backslash hexadecimal
!discussion
Specification of literal for Wide_Wide_Character

after 2
{

Wide_Wide_Character_literal::='wide_wide_character_element'W
wide_wide_character_element::=[^\\]|\"|\'|\\|\b|\f|\n|\r|\t|\x[0-9A-Fa-f]{2}|\u[0-9A-Fa-f]{4}|\v[0-9A-Fa-f]{6}|\y[0-9A-Fa-f]{8}

}

after 3
{
 
The wide_wide_character_element is given in regular expression format, where
 [^\\] – any other graphic_character except backslash  
 \" – escaped quotation mark  
 \' – escaped apostrophe  
 \b – generate BS  
 \f – generate FF  
 \n – generate LF  
 \r – generate CR  
 \t – generate HT  
 \x[0-9A-Fa-f]{2} – generate Wide_Wide_Character corresponding to this hexadecimal value (two digits)  
 \u[0-9A-Fa-f]{4} – generate Wide_Wide_Character corresponding to this hexadecimal value (four digits)  
 \v[0-9A-Fa-f]{6} – generate Wide_Wide_Character corresponding to this hexadecimal value (six digits)  
 \y[0-9A-Fa-f]{8} – generate Wide_Wide_Character corresponding to this hexadecimal value (eight digits)

}

change of 5/2
[

'A' '*' ''' ' '
'L' 'Л'W 'Λ'W -- Various els.
'∞'W 'א'W -- Big numbers - infinity and aleph.
'€'W '\v010600'W -- euro and Linear-A character

]

Reason for change: pi, lambda, infinity and aleph are not in 7-bit ASCII,
and in UTF-8 each needs more than one byte.

***************************************************************

!topic Wide_Wide_String literal
!reference Ada 2012 RM2.6(15)
!from Author John-Eric Söderman 18-01-05
!keywords Wide_Wide_String backslash hexadecimal
!discussion
Specification of literal for Wide_Wide_String

after 4
{

Wide_Wide_String_literal::="{wide_wide_string_element}"W
wide_wide_string_element::=[^"\\]|\"|\'|\\|\b|\f|\n|\r|\t|\x[0-9A-Fa-f]{2}|\u[0-9A-Fa-f]{4}|\v[0-9A-Fa-f]{6}|\y[0-9A-Fa-f]{8}|""

}

after 6
{

A null Wide_Wide_String literal is a Wide_Wide_String_literal with no
wide_wide_string_element between the quotation marks.

}

after 7
{

The wide_wide_string_element is given in regular expression format, where
 [^"\\] – any other graphic_character except backslash and quotation mark  
 \" – escaped quotation mark  
 \' – escaped apostrophe  
 \b – generate BS  
 \f – generate FF  
 \n – generate LF  
 \r – generate CR  
 \t – generate HT  
 \x[0-9A-Fa-f]{2} – generate Wide_Wide_Character corresponding to this hexadecimal value (two digits)  
 \u[0-9A-Fa-f]{4} – generate Wide_Wide_Character corresponding to this hexadecimal value (four digits)  
 \v[0-9A-Fa-f]{6} – generate Wide_Wide_Character corresponding to this hexadecimal value (six digits)  
 \y[0-9A-Fa-f]{8} – generate Wide_Wide_Character corresponding to this hexadecimal value (eight digits)  
 
}
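
The escape list above can be read as a small decoding procedure. The following
is a minimal sketch of that reading, not part of the proposal: Decode and
Decode_Demo are hypothetical names, the input is assumed to be a Latin-1 String
that already holds the text between the quotation marks, and any malformed
escape is assumed to raise Constraint_Error.

```ada
with Ada.Text_IO;

procedure Decode_Demo is

   --  Hypothetical helper (not part of the proposal or of any Ada library).
   --  Assumption: Source is Latin-1 text containing the proposed escapes.
   --  Any malformed escape ends in Constraint_Error.
   function Decode (Source : String) return Wide_Wide_String is
      Result : Wide_Wide_String (1 .. Source'Length);  --  decoding never grows
      Last   : Natural := 0;
      Pos    : Integer := Source'First;

      procedure Append (Item : Wide_Wide_Character) is
      begin
         Last := Last + 1;
         Result (Last) := Item;
      end Append;

      --  Convert Count hexadecimal digits starting at Pos, then skip them.
      function Hex (Count : Positive) return Wide_Wide_Character is
         Code : Natural := 0;
      begin
         for I in Pos .. Pos + Count - 1 loop
            Code := Code * 16 + Natural'Value ("16#" & Source (I) & "#");
         end loop;
         Pos := Pos + Count;
         return Wide_Wide_Character'Val (Code);
      end Hex;

   begin
      while Pos <= Source'Last loop
         if Source (Pos) /= '\' then
            --  Ordinary character: same code point in Latin-1 and ISO 10646.
            Append (Wide_Wide_Character'Val (Character'Pos (Source (Pos))));
            Pos := Pos + 1;
         else
            Pos := Pos + 2;  --  step over the backslash and its selector
            case Source (Pos - 1) is
               when '"'    => Append ('"');
               when '''    => Append (''');
               when '\'    => Append ('\');
               when 'b'    => Append (Wide_Wide_Character'Val (8));   --  BS
               when 'f'    => Append (Wide_Wide_Character'Val (12));  --  FF
               when 'n'    => Append (Wide_Wide_Character'Val (10));  --  LF
               when 'r'    => Append (Wide_Wide_Character'Val (13));  --  CR
               when 't'    => Append (Wide_Wide_Character'Val (9));   --  HT
               when 'x'    => Append (Hex (2));
               when 'u'    => Append (Hex (4));
               when 'v'    => Append (Hex (6));
               when 'y'    => Append (Hex (8));
               when others => raise Constraint_Error;
            end case;
         end if;
      end loop;
      return Result (1 .. Last);
   end Decode;

   --  A backslash has no special meaning in a standard Ada string literal,
   --  so this String really contains the characters \u20AC and \n.
   Text : constant Wide_Wide_String := Decode ("10\u20AC\n");

begin
   --  Number of decoded elements: '1', '0', the euro sign, LF.
   Ada.Text_IO.Put_Line (Natural'Image (Text'Length));
   --  Code point of the third element, printed in decimal.
   Ada.Text_IO.Put_Line (Integer'Image (Wide_Wide_Character'Pos (Text (3))));
end Decode_Demo;
```

Decoding at run time is only a way to show the intended mapping; under the
proposal the compiler would do the same work while reading the literal.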
 
after 9/2 
{

A:Wide_Wide_String:="A good reason\nHappy New Year\nRegards and 10€ for you"W;

}

***************************************************************

From: Randy Brukardt
Sent: Friday, January 5, 2018  8:19 PM

> !topic Wide_Wide_Character literal
> !reference Ada 2012 RM2.5(14)
> !from Author John-Eric Söderman 18-01-05
> !keywords Wide_Wide_Character backslash hexadecimal

...
> Reason for change: pi, lambda, infinity and aleph are not in 7-bit ASCII,
> and in UTF-8 each needs more than one byte.

It would have been helpful to have put the problem statements first in your
messages, since the ARG is likely to find its own solution to your problem.

In this case, however, there isn't even a problem, since Ada 2012 compilers
are required to accept source code in UTF-8 format (see 2.1(16/3) --
http://www.ada-auth.org/standards/rm12_w_tc1/html/RM-2-1.html#p16). So
characters like Pi and Lambda can be written as themselves, and use of
multiple bytes in that format is irrelevant.

As a practical matter, even Ada 2005 compilers have to accept UTF-8 source
code, as the ACATS includes a number of UTF-8 formatted test files. (And all
known Ada vendors run the ACATS regularly.)

How implementations meet this requirement may vary (perhaps a compiler switch
or mode is needed); you'll need to contact your compiler vendor (or ask on
social media, like comp.lang.ada or Stack Overflow) to find out what is required.
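
As an illustration only (this sketch is not taken from the RM or the ACATS),
the program below writes the characters as themselves and compares them with
values built from their code points using the standard 'Val attribute. It
assumes the file is saved as UTF-8 and that the compiler has been told so; with
GNAT that is typically done with the -gnatW8 switch, and other compilers may
need something different. Utf8_Literals is a hypothetical name.

```ada
with Ada.Text_IO;

procedure Utf8_Literals is
   --  Written as themselves; the source file must be saved as UTF-8.
   Pi     : constant Wide_Wide_Character := 'π';
   Lambda : constant Wide_Wide_Character := 'Λ';
   Price  : constant Wide_Wide_String    := "10€";
begin
   --  Each literal denotes one character, whatever its UTF-8 byte count.
   Ada.Text_IO.Put_Line
     (Boolean'Image (Pi = Wide_Wide_Character'Val (16#03C0#)));
   Ada.Text_IO.Put_Line
     (Boolean'Image (Lambda = Wide_Wide_Character'Val (16#039B#)));
   --  The euro sign takes three bytes in the file but is one element here.
   Ada.Text_IO.Put_Line (Natural'Image (Price'Length));
end Utf8_Literals;
```

When a character cannot be typed, Wide_Wide_Character'Val with a based literal
such as 16#20AC# is an existing way to name it by code point.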

The ARG had considered various encoding schemes, but they were considered to
greatly harm readability, and since UTF-8 was widely available, there didn't
seem to be any reason for doing that.

To see the readability harm, simply compare "Mr. Söderman" to
"Mr. S\00F6derman". In which one is it easier to see that the name is correct?

***************************************************************

From: Randy Brukardt
Sent: Friday, January 5, 2018  8:22 PM

As noted in my reply to "Wide_Wide_Character literal", encoding is not necessary
in Ada 2012 (or practically in Ada 2005) as UTF-8 Ada source has to be accepted
by Ada compilers.

***************************************************************
