!standard 4.2(3)                                    19-02-22  AI12-0296-1/02
!standard 4.2(5)
!standard 4.2(9/5)
!standard 3.5.2(1)
!standard 4.2.1(0)
!class Amendment 18-01-22
!status Hold 7-0-0  18-10-23
!status work item 18-01-22
!status received 15-03-20
!priority Low
!difficulty Medium
!subject User-defined character and null literals

!summary

A mechanism for giving a user-defined meaning to character and null literals
is added to Ada.

!problem

AI12-0249-1 and AI12-0295-1 provide a mechanism to use user-defined numeric
and string literals respectively. These bring the advantages of literals to
abstract data types such as unbounded strings and bignums.

However, there is no such mechanism for character or null literals. These
would be useful for abstract data types for the same reasons as the other,
more common literals. In particular, literals are always directly visible,
which simplifies the use of ADTs for which they would be appropriate.

!proposal

Aspects will be defined to allow (lexical) character literals to be used
with a (non-character) type, and the (lexical) null literal to be used with
a (non-access) type. The aspects identify a subprogram to be used to
interpret the literal as a value of the type.

Character literals already rely on there being a single expected type (see
RM 4.2). Interestingly, you don't have to look "inside" a character literal
until *after* you know its type. Ada 83 had a different rule for character
literals that relied on visibility of the declaration of the character
literal, and current Ada has a different rule for X.'a', which also relies on
visibility of the declaration.

The null literal uses a different overload resolution approach.
It has a "universal" type, which, according to the overload resolution rules
in RM 8.6, either requires the expected type to be a single type, as for
string and character literals, or allows the expected type to be any type in
a class, so long as the universal type of the literal "covers" the specified
class.

Since no types currently have Literal aspects, there is no upward
compatibility issue.

We need to decide how to represent the literals, and how to convert them. For
character literals, we have Wide_Wide_Character, which provides a "universal"
representation that can accommodate any character set known in the universe.
(The null literal does not need a representation, as there is only a single
value.)

So long as all conversions of literals are propagated to a library-unit
elaboration procedure for library-level types, and to the point of
declaration of the type for nested types, the overhead of converting literals
can be kept manageable. One might often hope the conversion routines could be
evaluated statically, but clearly we are getting into compiler-specific
optimizations.

Because the conversion might fail, we need to specify that it is a bounded
error if a literal conversion function propagates an exception, with
consequences being a compile-time or link-time error, or a Program_Error or
the exception propagated by the conversion function, raised at the point
where the value of the literal is used.

This proposal builds on the one made in AI12-0249-1.

!wording

Modify paragraph 3.5.2(1):

  An enumeration type is said to be a /character/ type if at least one of
  its enumeration literals is a character_literal{, as is a type with a
  specified Character_Literal aspect (see 4.2.1)}.

[Editor's Note: This definition might require some rewording of the
definition (or use) of string types, as it makes any type with character
literals a character type, and any one-dim array type with a character
component is a string type.
In particular, the rules for string literals and static string subtypes need
to be checked to ensure that they work in the case of user-defined character
literals.]

Modify paragraph 4.2(3):

  For a name that consists of a character_literal, either{:

    *}its expected type shall be a single character type {T}, in which case
      it is interpreted as {follows:

        * if T is an enumeration type, as} a parameterless function_call
          that yields the corresponding value of the character type[,]{;

        * if T has a Character_Literal aspect specified (see 4.2.1), as a
          function_call on the named function that yields the corresponding
          value of the character type;

    *}or its expected profile shall correspond to a parameterless function
      with a character result type, in which case it is interpreted as the
      name of the corresponding parameterless function declared as part of
      the character type's definition (see 3.5.1).

  In either case, the character_literal denotes the
  enumeration_literal_specification.

Modify paragraph 4.2(5):

  {For a character_literal that is a name, if the expected type for the
  literal is an enumeration type or if there is an expected profile, the}
  [A]character_literal [that is a name] shall correspond to a
  defining_character_literal of the expected type, or of the result type of
  the expected profile.

Modify paragraph 4.2(9/5): [As modified by AI12-0249-1]

  If its expected type is a numeric type, the evaluation of a numeric
  literal yields the represented value. {If its expected type is an access
  type, the}[The] evaluation of the literal null yields the null value of
  the expected type. In other cases, the effect of evaluating a numeric{ or
  null} literal is determined by the Integer_Literal{,}[ or] Real_Literal{,
  or Null_Literal} aspect that applies (see 4.2.1).
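The two interpretations that 4.2(3) already provides for enumeration types, and that the modified wording keeps, can be seen in current Ada. The following is a minimal sketch for illustration only; the procedure name, the type Roman_Digit, and the renaming Ten are hypothetical and are not part of the proposed wording.

```ada
with Ada.Text_IO;

procedure Char_Literal_Resolution is

   --  An enumeration type with character literals, hence a character type.
   type Roman_Digit is ('I', 'V', 'X', 'L', 'C', 'D', 'M');

   --  Expected type is the single character type Roman_Digit:
   --  'X' is interpreted as a parameterless function_call.
   D : constant Roman_Digit := 'X';

   --  Expected profile is a parameterless function returning Roman_Digit:
   --  'X' is interpreted as the name of that function.
   function Ten return Roman_Digit renames 'X';

begin
   pragma Assert (Ten = D);
   Ada.Text_IO.Put_Line (Roman_Digit'Image (D));
   Ada.Text_IO.Put_Line (Boolean'Image (Ten = D));
end Char_Literal_Resolution;
```

Under the modified wording, a type with a specified Character_Literal aspect would only take part in the first interpretation; the expected-profile case stays tied to the functions declared by an enumeration type's definition.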
Add after 4.2.1(6/5): [As defined by AI12-0249-1 and AI12-0295-1]

  Null_Literal
    This aspect is specified by a name that denotes a constant object of
    type T, or that denotes a primitive function of T with no parameters
    and a result type of T.

  Character_Literal
    This aspect is specified by a /function_/name that denotes a primitive
    function of T with one parameter of type Wide_Wide_Character and a
    result type of T. [Redundant: A type with a specified Character_Literal
    aspect is considered a /character/ type.]

Add to 4.2.1(7/5) before the last sentence:

  The Null_Literal aspect shall not be specified for a type T if the full
  view of T is an access type. The Character_Literal aspect shall not be
  specified for a type T if the full view of T is an enumeration type.

  AARM Reason: We do not allow Character_Literal to be specified on an
  enumeration type to avoid confusion as to whether such a type is a
  character type that requires compile-time checks. For instance, if we
  allowed this, then this type would be legal:

     type Ugh is (One, Two, Three)
        with Character_Literal => To_Ugh; -- Ugh is a character type.
     function To_Ugh (C : in Wide_Wide_Character) return Ugh is
        (case C is
            when '1' => One,
            when '2' => Two,
            when '3' => Three,
            when others => (raise Program_Error));

  But any string literal with a component of this type would be illegal (by
  a Legality Rule in 4.2). Reconciling this isn't worth the trouble.
  End AARM Reason.

Add after 4.2.1(9/5):

  For the evaluation of a null literal with expected type having a
  Null_Literal aspect specified, the value is that of the constant object
  denoted by the aspect, or the result of a call on the parameterless
  function denoted by the aspect.

  For the evaluation of a character_literal with expected type having a
  Character_Literal aspect specified, the value is the result of a call on
  the function specified by the aspect, with the parameter being the
  Wide_Wide_Character that corresponds to the literal.
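As a rough illustration of the two evaluation rules above, the following sketch writes out by hand, in Ada 2012, what a null literal and a character literal would evaluate to. The proposed aspects are deliberately not used, since no compiler can be assumed to support them. All names here (Sketch, Reference, Is_Null, Byte_Char, Char_Val) are hypothetical and chosen only for the illustration.

```ada
with Ada.Text_IO;

procedure Literal_Evaluation_Sketch is

   package Sketch is
      --  Under the proposal this type would carry
      --  "with Null_Literal => No_Reference".
      type Reference is private;
      No_Reference : constant Reference;
      function Is_Null (R : Reference) return Boolean;

      --  Under the proposal this type would carry
      --  "with Character_Literal => Char_Val".
      type Byte_Char is mod 2**8;

      --  The conversion may fail: the type conversion raises
      --  Constraint_Error for code points above 255.
      function Char_Val (Ch : Wide_Wide_Character) return Byte_Char is
        (Byte_Char (Wide_Wide_Character'Pos (Ch)));
   private
      type Reference is record
         Id : Natural := 0;
      end record;
      No_Reference : constant Reference := (Id => 0);
      function Is_Null (R : Reference) return Boolean is (R.Id = 0);
   end Sketch;

   use Sketch;

   --  Hand expansion of:  Obj : Reference := null;
   Obj : constant Reference := No_Reference;

   --  Hand expansion of:  X : Byte_Char := 'X';
   X : constant Byte_Char := Char_Val ('X');

begin
   Ada.Text_IO.Put_Line ("Obj is null: " & Boolean'Image (Is_Null (Obj)));
   Ada.Text_IO.Put_Line ("X =" & Byte_Char'Image (X));
end Literal_Evaluation_Sketch;
```

With a conforming Ada 2012 compiler this should print "Obj is null: TRUE" and "X = 88". A literal such as a Greek letter passed to Char_Val would raise Constraint_Error, which is the kind of failing conversion the bounded-error discussion in the !proposal is concerned with.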
!discussion

The !proposal section includes a relatively complete discussion of the
issues. But here are a few other interesting questions or issues:

  * We do not want to have a situation where a literal may be usable on a
    partial view but not on the full view, for example, because the full
    view is a type that already has meaning for the same sort of literal.
    This avoids problems with full conformance checking when the meaning of
    a literal might be different in the visible and private parts of a
    package.

  * More generally, we chose to allow user-defined literals on almost any
    sort of type, so long as that sort of type didn't already allow that
    sort of literal. The thought was that you could imagine an integer type
    that would allow character literals, which would be interpreted as
    their 16-bit Unicode value, say -- the "signed_char" and "unsigned_char"
    types in Interfaces.C come to mind.

!examples

   type Reference is private
      with Null_Literal => No_Reference;

   No_Reference : constant Reference;

   ...

   Obj : Reference := null;
      -- Equivalent to:
      -- Obj : Reference := No_Reference;

   type Signed_Char is mod 2**8
      with Character_Literal => Char_Val;

   function Char_Val (Ch : in Wide_Wide_Character) return Signed_Char;

   ...

   X : Signed_Char := 'X';
      -- Equivalent to:
      -- X : Signed_Char := Char_Val ('X');

!ASIS

[Not sure. Might need new aspect names, but I didn't check - Editor.]

!ACATS test

ACATS B and C-Tests are needed to check that the new capabilities are
supported, and that error cases are detected.

!appendix

[Editor's note: Split from AI12-0249-1 by vote at ARG meeting #60.]

****************************************************************