!standard B.3 (20)                                  99-08-31  AI95-00037/07
!standard B.3 (30-31)
!standard B.3 (63)
!class binding interpretation 96-02-06
!status Corrigendum 2000 99-05-24
!status WG9 approved 96-12-07
!status ARG approved 12-0-0  96-10-07
!status ARG approved 8-0-0 (subject to editorial review)  96-06-17
!status work item 95-11-01
!status received 95-06-25
!priority High
!difficulty Medium
!qualifier Error
!subject In Interfaces.C, nul and wide_nul represent zero

!summary

In package Interfaces.C, the type wchar_t is a discrete type. The
constants nul and wide_nul have implementation-defined values, which
should have a representation of zero. Types char and wchar_t may use a
signed or unsigned representation.

!question

The following declarations appear in Interfaces.C (B.3):

    <>

    (19) type char is ;

    (20) nul : constant char := char'First;
    ...
    (30) type wchar_t is ;

    (31) wide_nul : constant wchar_t := wchar_t'First;

The declaration of wide_nul seems to imply that wchar_t supports the
attribute First. What is the intent?

If char and/or wchar_t are signed integer types in the interfaced C
implementation, may the Ada implementation reflect that fact by using a
signed representation for char and/or wchar_t? (Yes.)

Note that if char and wchar_t have a signed representation, then
char'First and wchar_t'First will not have a zero representation. Are
the constants nul and wide_nul intended to be represented as zero? (Yes.)

!recommendation

(See summary.)

!wording

Modify paragraphs (19,20,30,31) as follows:

    19  type char is ;
    20  nul : constant char := ;
    ...
    30  type wchar_t is ;
    31  wide_nul : constant wchar_t := ;

Add the following Implementation Advice:

    The constants nul and wide_nul should have a representation of zero.

!discussion

The intent is that wchar_t be discrete.

The type char may have a signed representation.
For example, the implementation might have:

    <>
    for char use (-128, -127, ..., 127);

In that case, char'First is the wrong value to use for nul; the intent
is that nul be represented as zero.

Similarly, wchar_t could be an enumeration type with a signed
representation, as for char. Wchar_t could also be a signed integer
type. Either way, wchar_t'First is the wrong value to use for wide_nul.

It is important to allow signed representations of char and wchar_t, in
order to properly match what the C implementation does.

!corrigendum B.03(20)

@drepl
@xcode char := char'First;>
@dby
@xcode char := @i<@ft>;>

!corrigendum B.03(30)

@drepl
@xcode<@b wchar_t @b @i<@ft>;>
@dby
@xcode<@b wchar_t @b @i<@ft<>>;>

!corrigendum B.03(31)

@drepl
@xcode wchar_t := wchar_t'First;>
@dby
@xcode wchar_t := @i<@ft>;>

!corrigendum B.03(63)

@dinsb
An implementation should support the following interface correspondences
between Ada and C.
@dinst
The constants nul and wide_nul should have a representation of zero.

!ACATS test

Implementation advice is not testable, since an implementation is free
to ignore it. While a C-Test could be written to verify that char and
wchar_t are discrete types, this would have little value.

!appendix

!section B.3(31)
!subject Can wchar_t be signed?
!reference RM95-B.3(31)
!from Norman Cohen
!reference as: 95-5119.b Norman H. Cohen 95-4-7>>
!discussion

Paragraph 30 indicates that the definition of Interfaces.C.wchar_t is
implementation defined. However, since wide_nul is meant to have a
representation of zero, the initialization of wide_nul to wchar_t'First
in paragraph 31 suggests that wchar_t can be an enumeration type or a
modular type, but not a signed integer type. Is that the intent?

****************************************************************

!section B.3(31)
!subject Can wchar_t be signed?
!reference RM95-B.3(31)
!reference 95-5119.b Norman Cohen 95-04-07
!from Tucker Taft 95-04-12
!reference as: 95-5127.b Tucker Taft 95-4-12>>
!discussion

> Paragraph 30 indicates that the definition of Interfaces.C.wchar_t is
> implementation defined. However, since wide_nul is meant to have a
> representation of zero, the initialization of wide_nul to wchar_t'First
> in paragraph 31 suggests that wchar_t can be an enumeration type or a
> modular type, but not a signed integer type. Is that the intent?

I suppose. Although ANSI C only says "integral" I would imagine that
almost all C implementations use unsigned for wchar_t. There is probably
no harm in wchar_t being modeled as an unsigned type, even if it is
signed in the C implementation.

****************************************************************

!section B.3(31)
!subject Can wchar_t be signed
!reference RM95-B.3(31)
!reference AI95-00037/00
!from Bjorn Kallberg
!reference as: 95-5215.a Bjorn Kallberg 95-7-10>>
!discussion

In the proposed AI95, it is suggested that Interfaces.C.wchar_t is not a
signed integer type, even if the C implementation has a signed
representation.

A better solution is to change (31) to say that

    wide_nul : constant wchar_t := implementation-defined.

instead of the current

    wide_nul : constant wchar_t := wchar_t'first;

The intent of the Interfaces.C is to give an Ada representation of the C
types. With the definition of nul as the wchar_t'first, additional
semantics which do not exist in C are added.

The following is one example where the added restriction is harmful.
There may be others.

We have a C implementation where wide_char is signed, and we also have a
C header file, defining some character values to positive and negative
values. Automatic translation of this header file to Ada will not work.
The above problem is of course not a major one, but as there is no
argument presented except "not harmful" for the other view, it seems
much better to stay with the original intent, to mimic the C definition
as closely as possible. Also, most other constants in this package are
implementation defined.

/björn

****************************************************************

!section B.3(31)
!subject Can wchar_t be signed?
!reference RM95-B.3 (31)
!reference AI95-00037
!from Pascal Leroy 95-10-20
!reference 95-5359.b Pascal Leroy 95-10-20>>
!discussion

This AI states that "there seems to be no harm in requiring that wchar_t
not be signed."

Making wchar_t unsigned when the corresponding C type is signed has at
least one unpleasant consequence: it makes it difficult to translate
algorithms written in C (some people do that...). If some algorithm
explicitly uses the fact that the C type wchar_t is signed, you need
extensive surgery to rewrite it in Ada.

Also, the appendix says "I would imagine that almost all C
implementations use unsigned for wchar_t." That's a nice story, but it
is not true. At least one widely available OS (Solaris) has wchar_t
defined as long.

I would prefer to change the declaration of wide_nul to read:

    wide_nul : constant wchar_t := wchar_t'val (0);

which would have the desired effect if wchar_t is enumerated, signed or
unsigned (and would allow a signed wchar_t).

****************************************************************

!section B.3(31)
!subject Can wchar_t be signed
!reference AI95-00037
!reference B.3(31)
!from Tucker Taft 95-10-30
!reference 95-5373.d Tucker Taft 95-10-30>>
!discussion

I don't agree with the recommendation of this AI. The whole point of
Interfaces.C is to match the characteristics of the corresponding C
types in so far as practical. It seems straightforward to allow wchar_t
to be signed or unsigned. The suggestion that wide_nul be defined as
wchar_t'Val(0) seems to resolve any problems with it being signed.
****************************************************************

!section B.3(31)
!subject Can wchar_t be signed?
!reference RM95-B.3(31)
!reference AI95-00037/03
!from Keith Thompson 96-06-21
!reference 96-5605.a Keith Thompson 96-6-21>>
!discussion

Both Pascal Leroy and Tucker Taft suggest that wide_nul should be
defined as follows:

    wide_nul : constant wchar_t := wchar_t'val (0);

This can be incorrect if wchar_t is an enumeration type with an
enumeration representation clause. Specifically:

    type wchar_t is (...); -- 65536 values
    for wchar_t use (-32768, -32767, ..., 0, ..., 32766, 32767);

Then wchar_t'Val(0) is the same as wchar_t'First, which is not the value
whose internal representation is 0.

Suggestion: leave the initialization for wide_nul implementation-defined,
with a comment indicating that it should (shall?) have an internal
representation of 0.

An alternative would be to disallow the above declaration of wchar_t,
but I see little point in doing so.

****************************************************************

!section B.3(19)
!subject Interfaces.C.Char
!reference RM95-B.3(19)
!reference RM95-B.3(30)
!from Keith Thompson 96-12-05
!reference 96-5778.a Keith Thompson 96-12-5>>
!discussion

May the type Interfaces.C.Char be declared as

    type char is new Character;

? May Interfaces.C.wchar_t be declared as

    type wchar_t is new Wide_Character;

?

If so, type conversions between Character and char, and between
Wide_Character and wchar_t, are legal under some implementations and not
under others.

Given the wording in the RM, I believe the answer is yes in both cases.
This is an interesting source of quietly non-portable programs.

****************************************************************
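
Editorial illustration (not part of the archived mail). Keith Thompson's
96-06-21 point above, that wchar_t'Val(0) selects the first position
rather than the value whose internal code is zero, can be sketched with
a much smaller type. Everything named below (Tiny_Char, Code,
Internal_Code, the four literals) is hypothetical and chosen only for
the demonstration; it is not how any particular implementation declares
Interfaces.C.wchar_t.

```ada
with Ada.Text_IO;
with Ada.Unchecked_Conversion;

procedure Nul_Representation_Demo is

   --  Hypothetical four-value stand-in for a character type with a
   --  signed representation; names and codes are illustrative only.
   type Tiny_Char is (Minus_Two, Minus_One, Zero, Plus_One);
   for Tiny_Char use
     (Minus_Two => -2, Minus_One => -1, Zero => 0, Plus_One => 1);
   for Tiny_Char'Size use 8;

   --  Signed integer of the same size, used only to look at the
   --  internal codes chosen by the representation clause.
   type Code is range -128 .. 127;
   for Code'Size use 8;

   function Internal_Code is
     new Ada.Unchecked_Conversion (Source => Tiny_Char, Target => Code);

   --  Selected by position number: position 0 is always 'First.
   By_Position : constant Tiny_Char := Tiny_Char'Val (0);

   --  Selected by naming the literal whose internal code is zero.
   By_Code : constant Tiny_Char := Zero;

begin
   Ada.Text_IO.Put_Line
     ("Internal code of 'Val (0):"
      & Code'Image (Internal_Code (By_Position)));
   Ada.Text_IO.Put_Line
     ("Internal code of Zero:"
      & Code'Image (Internal_Code (By_Code)));
   Ada.Text_IO.Put_Line
     ("'Val (0) = 'First: "
      & Boolean'Image (By_Position = Tiny_Char'First));
end Nul_Representation_Demo;
```

On a compiler that honours the representation clause, the first line is
expected to report an internal code of -2 and the second a code of 0,
which is why the AI leaves the initial values of nul and wide_nul
implementation-defined instead of fixing them as 'First or 'Val (0).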