Version 1.1 of ai05s/ai05-0182-1.txt
!standard 3.5(39.4/2) 09-10-30 AI05-0182-1/01
!class binding interpretation 09-10-30
!status work item 09-10-30
!status received 09-10-22
!priority Low
!difficulty Easy
!qualifier Omission
!subject Preciseness of S'Value
!summary
S'Wide_Wide_Value, S'Wide_Value, and S'Value may allow additional
representations for character values.
!question
What happens in the following cases:
(1) What should Character'Value ("'" & Character'Val(16) & "'") do?
Should it return Character'Val(16), or raise Constraint_Error?
(2) What should Character'Value ("HEX_00000041") do? Return 'A', or
raise Constraint_Error?
!response
(See summary.)
!wording
** TBD **
!discussion
The questioner goes on to report that both examples work (do not raise
Constraint_Error) in GNAT.
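
The behavior of a particular implementation can be checked with a small probe
program. The following is only a sketch and was not part of the original
submission; the names Value_Probe and Probe are hypothetical. It passes each
string from the question to Character'Value through a parameter, so the call
is evaluated at run time, and reports either the position of the result or
that Constraint_Error was raised. No particular output is claimed, since the
result depends on the implementation.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Value_Probe is

   --  Hypothetical helper: report what Character'Value does with S.
   procedure Probe (Label : String; S : String) is
      C : Character;
   begin
      C := Character'Value (S);
      Put_Line (Label & ": returned Character'Val ("
                & Integer'Image (Character'Pos (C)) & ")");
   exception
      when Constraint_Error =>
         Put_Line (Label & ": raised Constraint_Error");
   end Probe;

begin
   --  Case (1): quote, nongraphic character (DLE), quote.
   Probe ("(1) quoted DLE", "'" & Character'Val (16) & "'");

   --  Case (2): hex form naming a graphic character, in two letter cases.
   Probe ("(2) HEX_00000041", "HEX_00000041");
   Probe ("(2) Hex_00000041", "Hex_00000041");
end Value_Probe;
```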
It's clear from the Standard that both of these should raise Constraint_Error;
the first is not an enumeration literal if the center character is a nongraphic
character, and the second is not the 'Image of a nongraphic character.
But that implies extra code in 'Value to reject these cases. Since this is a
runtime function, that is extra code that is added to every program that uses
'Value (and depending on the runtime model, possibly to every program whether or
not 'Value is used). That extra code would need to include a table of graphic
characters in Unicode, so it is not trivial in size. But this extra code is not
helping the user any: both she and the runtime know what the intended answer is
-- the runtime is just not allowed to provide it.
That seems stupid. Therefore we relax the requirement so that 'Value may accept
any string with the proper syntax: either ''' & <any character> & ''', or
Hex_hhhhhhhh where each h is a hexadecimal digit.
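
As an illustration of how little code the relaxed rule needs, here is a minimal
sketch for type Character only. It is not proposed wording and not any
implementation's actual runtime; Relaxed_Value and Relaxed_Value_Demo are
hypothetical names. It assumes that the "Hex_" prefix is matched without regard
to letter case and that exactly eight hexadecimal digits are required; neither
point is settled by this AI. Note that no table of graphic characters is
consulted.

```ada
with Ada.Text_IO;             use Ada.Text_IO;
with Ada.Strings.Fixed;
with Ada.Characters.Handling; use Ada.Characters.Handling;

procedure Relaxed_Value_Demo is

   --  Hypothetical stand-in for a relaxed Character'Value.
   function Relaxed_Value (S : String) return Character is
      --  'Value ignores leading and trailing spaces; Trim yields bounds 1 .. N.
      T : constant String := Ada.Strings.Fixed.Trim (S, Ada.Strings.Both);
   begin
      --  Form 1: quote, any single character, quote.
      --  No check is made that the middle character is graphic.
      if T'Length = 3
        and then T (T'First) = '''
        and then T (T'Last) = '''
      then
         return T (T'First + 1);
      end if;

      --  Form 2: Hex_hhhhhhhh (prefix case ignored: an assumption).
      if T'Length = 12
        and then To_Upper (T (T'First .. T'First + 3)) = "HEX_"
      then
         for I in T'First + 4 .. T'Last loop
            if not Is_Hexadecimal_Digit (T (I)) then
               raise Constraint_Error;
            end if;
         end loop;

         --  Natural'Value and Character'Val raise Constraint_Error
         --  if the code is out of range.
         return Character'Val
           (Natural'Value ("16#" & T (T'First + 4 .. T'Last) & "#"));
      end if;

      --  Anything else (for example "DLE") follows the existing rules.
      return Character'Value (T);
   end Relaxed_Value;

begin
   Put_Line (Integer'Image
     (Character'Pos (Relaxed_Value ("'" & Character'Val (16) & "'"))));
   Put_Line (Integer'Image (Character'Pos (Relaxed_Value ("HEX_00000041"))));
end Relaxed_Value_Demo;
```

Under these assumptions, both strings from the question are accepted, yielding
the characters at positions 16 and 65 respectively.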
[Editor's note: Should we require or just allow this additional flexibility? I'd
be in favor of requiring it, but perhaps that is too much.
If this extra flexibility is considered a bad idea, I recommend that we classify
this question as a pathology so that it is never tested, effectively allowing
the GNAT implementation.]
!ACATS Test
Consider an ACATS C-Test to check that whatever we decide is true. ** TBD **
!appendix
!topic 'Value on character types
!reference 3.5(39.4/2), 2.1
!from Adam Beneschan 09-10-22
!discussion
I think I know the answers to these, but I wanted to clarify (and bring the
issue up in case anyone thinks the behavior should be different):
(1) What should Character'Value ("'" & Character'Val(16) & "'") do?
Should it return Character'Val(16), or raise Constraint_Error?
(2) What should Character'Value ("HEX_00000041") do? Return 'A', or
raise Constraint_Error?
The way I read the RM, both should raise Constraint_Error. In the first case,
'<c>' is not an enumeration literal if <c> is a nongraphic character; in the
second case, "HEX_00000041" does not correspond to the 'Image of a nongraphic
character (neither would "HEX_00000010", for that matter, since the 'Image of
Character'Val(16) is "DLE" and not "HEX_00000010").
I'm just bringing this up because someone could argue that, since there's no
requirement that the argument of 'Value be exactly the same as the result of
'Image, for numeric types e.g., Character'Value should allow any string of the
form '<c>' and any string of the form HEX_dddddddd with a valid 8-digit hex
number, for consistency. I don't have a particular preference (except that
doing it the latter way means less work for me :)).
(GNAT does seem to accept any 3-character string with quote marks. It raises
Constraint_Error on a HEX_dddddddd string unless the first three characters have
the letter case "Hex", which I think is a bug; but it accepts any "Hex_dddddddd"
string with valid hex digits, regardless of whether the result is a graphic or
control character.)
****************************************************************
From: Randy Brukardt
Sent: Thursday, October 22, 2009 11:54 PM
> The way I read the RM, both should raise Constraint_Error.
> In the first case, '<c>' is not an enumeration literal if <c> is a
> nongraphic character; in the second case, "HEX_00000041"
> does not correspond to the 'Image of a nongraphic character (neither
> would "HEX_00000010", for that matter, since the 'Image of
> Character'Val(16) is "DLE" and not "HEX_00000010").
I agree with your reading. Not sure whether it is a good idea, though,
especially in the latter case.
> I'm just bringing this up because someone could argue that, since
> there's no requirement that the argument of 'Value be exactly the same
> as the result of 'Image, for numeric types e.g., Character'Value
> should allow any string of the form '<c>' and any string of the form
> HEX_dddddddd with a valid 8-digit hex number, for consistency. I
> don't have a particular preference (except that doing it the latter
> way means less work for me :)).
'Value for enumeration types surely allows many strings that 'Image can't
produce. Besides the obvious case of the leading and trailing blanks, there is
also the fact that lower case versions of (identifier) literals are accepted.
('Image always returns literals in upper case.)
What worries me here is the runtime overhead of checking the character class of
the middle character and of the result of the conversion of HEX_dddddddd. It
seems completely pointless to make such a check on the latter - it would be easy
to do for the Latin-1 part (it's not allowed there), but for the rest of Unicode
you'd need a character class chart. That's pretty big, and not something you'd
want to drag into programs. And it would be a lot of work to *avoid* dragging it
in if it is part of Character'Value.
> (GNAT does seem to accept any 3-character string with quote marks. It
> raises Constraint_Error on a HEX_dddddddd string unless the first
> three characters have the letter case "Hex", which I think is a bug;
> but it accepts any "Hex_dddddddd"
> string with valid hex digits, regardless of whether the result is a
> graphic or control character.)
I suspect that we want to rethink the rules for 'Value (well, technically for
'Wide_Wide_Value) in order that we aren't dragging along a big runtime overhead
that really doesn't help the user any. What possible advantage is there to
forcing the user to decide whether or not a character is a graphic character
before calling 'Value??
****************************************************************