Version 1.2 of ai12s/ai12-0384-1.txt

!standard 4.10(0)          20-06-10 AI12-0384-1/02
!class Amendment 20-06-10
!status work item 20-06-10
!status received 20-06-10
!priority Low
!difficulty Medium
!subject Fixups for Put_Image and Text_Buffers
!summary
Add operations to associate with a Text_Buffer an adjustable level of indentation, and a character position within the line. Eliminate the End_Of_Line function. Add functions for putting and getting the content of the Text_Buffer as a UTF_8 String or as a UTF_16 Wide_String. Provide a New_Line string rather than a New_Line_Count. Add parameters to Get and Wide_Get to control what to substitute for characters that cannot be directly represented in Character and Wide_Character, respectively. Provide functions Get and Wide_Get for retrieving the whole buffer with possibly multiple-character substitutions for unrepresentable characters using a user-defined substitution function.
Allow Put_Image on an unchecked union. Give implementors a bit more flexibility about whether to support Put_Image on types on which Text_Buffers themselves depend. Clarify that T'Put_Image on a private type breaks privacy, and displays the same thing that the full type would display.
!problem
As part of implementing the Put_Image attribute (AI12-0020-1) and the Text_Buffers package (AI12-0340-1), a certain number of issues were identified with the proposed features. In particular:
(1) Defining the Put_Image for a composite type is likely to result in multiple lines of output. It seems highly desirable that such output is reasonably indented, to reflect the nesting of components and subcomponents. As currently defined the Put_Image attribute takes only the Text_Buffer and the value whose image is to be produced. If we want to provide some level of indenting, we need to incorporate into the Text_Buffer some notion of a current level of indentation.
(2) Any attempt to break lines of output at some reasonable length relies on knowing the position within the current line. As characters are added to a Text_Buffer it is straightforward for the buffer to keep track of how many characters have been added to it since the most recent newline. It is hard to do it in other ways, since, as mentioned above, Put_Image routines are called with only a Text_Buffer and a value as a parameter. For example, it might be reasonable to display a short array all on the current line, unless we are already far to the right, in which case one might want to put the array on its own line, or break the array across multiple lines. This is hard to decide without knowing where on the current line of output the Put_Image is invoked.
(3) The Get and Wide_Get routines currently produce implementation-defined results when the characters are not representable directly as Character or Wide_Character, respectively. The actual substitution performed would be better if it were user defined, and could support the reversibility of 'Image/'Value for enumeration types that have enumeration literals whose characters are not all representable with Characters or Wide_Characters. It would also be useful if UTF_8 and UTF_16 could be used for filling and emptying Text_Buffers, since we anticipate that many implementations will use UTF_8 or UTF_16 internally for representing identifiers in their symbol tables, and perhaps also for the characters in a Text_Buffer. This can also make the process "loss-less" while permitting efficient internal representation.
(4) Providing a "peek-ahead" capability via End_Of_Line for a newline character sequence is not particularly useful unless characters are being removed from the buffer one character at a time. And even then, other "peek-ahead" would also be needed (e.g. for white space, commas, brackets, etc.) to do any sort of complex parsing as might be required by a general 'Value implementation. We therefore remove the End_Of_Line function, provide a New_Line_String instead of the New_Line_Count, and leave for future standardization sufficient peek-ahead functions to support 'Value or similar lexing and parsing uses of Text_Buffers.
(5) An AARM note says that Put_Image on an Unchecked_Union will raise Program_Error. It might be nice if some other behavior were permitted.
!proposal
(1) We propose to add indentation to the root interface, because it is not easy to add indentation in an extension, given that the various Put_Image routines are only passed Root_Buffer_Type'Class. We propose a Current_Indent function, and Increase/Decrease_Indent procedures, as primitives, which determine how many spaces are inserted at the beginning of the buffer, and at the beginning of each nonempty line.
(2) We propose to add a Position_In_Line function that indicates which character position of the line is the next to be written, so a Put_Image routine can decide whether and when to split its output into multiple lines.
(3) We add a Substitute parameter of a single Character or Wide_Character to the Get and Wide_Get procedures, respectively, analogous to the Substitute parameter in Ada.Characters.Conversions.To_String. We add Get and Wide_Get functions that take a Substitute parameter that is of an access-to-function type, which replaces an unrepresentable character with a string of representable characters. We add Get/Put_UTF_8 and Wide_Get/Put_UTF_16 operations. We provide implementation-defined Substitute and Wide_Substitute functions that perform the substitution expected by the 'Value and 'Wide_Value functions for enumeration types with literals that are not directly representable using Character or Wide_Character.
(5) We give an implementation permission for the default Put_Image for an unchecked union to produce a recognizable string rather than raise Program_Error.
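The interplay of items (1) and (2) can be sketched with a small self-contained model. The sketch does not use Ada.Strings.Text_Buffers; Model_Buffer, Int_Array, Put_Array, and Max_Width are hypothetical names introduced only for illustration, and the choice to count inserted indentation in the column is an assumption of the model rather than something this AI states.

```ada
with Ada.Characters.Latin_1;
with Ada.Strings.Fixed;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Indent_Sketch is

   --  Hypothetical stand-in for a text buffer (not the proposed API).
   type Model_Buffer is record
      Text   : Unbounded_String;
      Indent : Natural  := 0;  --  plays the role of Current_Indent
      Column : Positive := 1;  --  plays the role of Position_In_Line
   end record;

   type Int_Array is array (Positive range <>) of Integer;

   Max_Width : constant := 40;  --  assumed line-length limit

   procedure New_Line (B : in out Model_Buffer) is
   begin
      Append (B.Text, Ada.Characters.Latin_1.LF);
      B.Column := 1;
   end New_Line;

   procedure Put (B : in out Model_Buffer; Item : String) is
   begin
      --  Indentation is inserted only in front of text that starts a line.
      if B.Column = 1 then
         Append (B.Text, String'(1 .. B.Indent => ' '));
         B.Column := B.Column + B.Indent;  --  assumption: indent counts
      end if;
      Append (B.Text, Item);
      B.Column := B.Column + Item'Length;
   end Put;

   --  Builds "(1, 2, 3)" for an array of integers.
   function Image (A : Int_Array) return String is
      Result : Unbounded_String := To_Unbounded_String ("(");
   begin
      for I in A'Range loop
         if I /= A'First then
            Append (Result, ", ");
         end if;
         Append
           (Result,
            Ada.Strings.Fixed.Trim (Integer'Image (A (I)), Ada.Strings.Left));
      end loop;
      Append (Result, ")");
      return To_String (Result);
   end Image;

   --  Keeps the array on the current line when it fits; otherwise
   --  starts a fresh (indented) line first.
   procedure Put_Array (B : in out Model_Buffer; A : Int_Array) is
      Img : constant String := Image (A);
   begin
      if B.Column > 1 and then B.Column + Img'Length - 1 > Max_Width then
         New_Line (B);
      end if;
      Put (B, Img);
   end Put_Array;

   B : Model_Buffer;

begin
   Put (B, "Short => ");
   Put_Array (B, (1, 2, 3));             --  fits on the current line
   New_Line (B);

   B.Indent := B.Indent + 3;             --  models Increase_Indent
   Put (B, "A_Rather_Long_Component_Name => ");
   Put_Array (B, (10, 20, 30, 40, 50));  --  too far right: own line
   New_Line (B);

   Ada.Text_IO.Put (To_String (B.Text));
end Indent_Sketch;
```

In this model the first array stays on the line it was started on, while the second is moved to a new line that begins with three spaces, because the routine can see that the current column is already close to the assumed limit.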
!wording
Modify 4.10 (7/5):
where the Wide_Wide_String value written out to the [stream]{text buffer} is defined as follows:
Modify AARM 4.10(13.a/5):
In general, the default implementation of T'Put_Image for a composite type will involve some sequence of calls to [Wide_Wide_String'Write]{Put and its Wide and Wide_Wide variants} and calls to the Put_Image procedures of component types and, in the case of an array type, index types. The [Wide_Wide_String'Write]{Put} calls may pass in either literal values (e.g., "(", ")", "'(", " => ", or ", "), or other things (such as component names for record values, task_id images for tasks, or the Wide_Wide_Expanded_Name of the tag in the class-wide case).
Modify 4.10(16/5):
For a class-wide type, the default implementation of T'Put_Image generates an image based on qualified expression syntax. [Wide_Wide_String'Write]{Wide_Wide_Put} is called with Wide_Wide_Expanded_Name of Arg'Tag. Then S'Put_Image is called, where S is the specific type identified by Arg'Tag.
Add after 4.10 (23/5):
Redundant[T'Put_Image is the same for all views of T, including any partial views.]
AARM Proof: A type-related operational aspect is the same for views of a type, see 13.1.
Modify 4.10 (28/5):
S'Wide_Image calls S'Put_Image passing Arg (which will typically store a sequence of character values in a text buffer) and then returns the result of retrieving the contents of that buffer with {function} Wide_Get {with Substitute being Wide_Image_Substitution}. The lower bound of the result is one.
Modify 4.10 (31/5):
S'Image calls S'Put_Image passing Arg (which will typically store a sequence of character values in a text buffer) and then returns the result of retrieving the contents of that buffer with {function} Get {with Substitute being Image_Substitution}. The lower bound of the result is one.
Add after 4.10 (40/5):
* For a string type, implementations may produce an image corresponding to a string literal.
* For an unchecked union type, implementations may raise Program_Error or produce some recognizable image (such as "(UNCHECKED UNION)").
Modify A.4.12 (2/5):
{with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;}
package Ada.Strings.Text_Buffers
   with Pure, Nonblocking, Global => null is
Add after A.4.12(4/5):
subtype Positive_Text_Buffer_Count is Text_Buffer_Count range 1 .. Text_Buffer_Count'Last;
New_Line_String : constant String := /implementation-defined/;
Modify A.4.12(5/5):
New_Line_Count : constant Positive_Text_Buffer_Count := [/implementation-defined/]{New_Line_String'Length};
Delete A.4.12(6/5):
type Root_Buffer_Type is abstract tagged private;
and replace with:
type Root_Buffer_Type is abstract tagged limited private
   {with Default_Initial_Condition =>
      Character_Count (Root_Buffer_Type) = 0
      and then Current_Indent (Root_Buffer_Type) = 0
      and then Position_In_Line (Root_Buffer_Type) = 1};
Modify A.4.12 (9/5):
procedure Get (
   Buffer : in out Root_Buffer_Type;
   Item : out String;
   Last : out Natural{;
   Substitute : in Character := ' '}) is abstract
   with Post'Class =>
      (declare
         Num_Read : constant Text_Buffer_Count :=
            Text_Buffer_Count'Min
               (Character_Count(Buffer)'Old, Item'Length);
       begin
         Last = Num_Read + Item'First - 1 and then
         Character_Count (Buffer) =
            Character_Count (Buffer)'Old - Num_Read);
Modify A.4.12 (10/5):
procedure Wide_Get (
   Buffer : in out Root_Buffer_Type;
   Item : out Wide_String;
   Last : out Natural{;
   Substitute : in Wide_Character := ' '}) is abstract
   with Post'Class =>
      (declare
         Num_Read : constant Text_Buffer_Count :=
            Text_Buffer_Count'Min
               (Character_Count(Buffer)'Old, Item'Length);
       begin
         Last = Num_Read + Item'First - 1 and then
         Character_Count (Buffer) =
            Character_Count (Buffer)'Old - Num_Read);
Delete the End_Of_Line function at A.4.12 (12/5).
Modify A.4.12 (13/5, 14/5, 15/5) to replace Post'Class in each with:
with Post'Class =>
Character_Count (Buffer) =
Character_Count (Buffer)'Old + Item'Length + (if Position_In_Line (Buffer)'Old = 1 then Current_Indent (Buffer) else 0);
Add after A.4.12 (16/5):
procedure Put_UTF_8 (
   Buffer : in out Root_Buffer_Type;
   Item : in UTF_Encoding.UTF_8_String) is abstract
   with Post'Class =>
      Character_Count (Buffer) =
         Character_Count (Buffer)'Old +
         UTF_Encoding.Wide_Wide_Strings.Decode (Item)'Length +
         (if Position_In_Line (Buffer)'Old = 1
          then Current_Indent (Buffer) else 0);

procedure Wide_Put_UTF_16 (
   Buffer : in out Root_Buffer_Type;
   Item : in UTF_Encoding.UTF_16_Wide_String) is abstract
   with Post'Class =>
      Character_Count (Buffer) =
         Character_Count (Buffer)'Old +
         UTF_Encoding.Wide_Wide_Strings.Decode (Item)'Length +
         (if Position_In_Line (Buffer)'Old = 1
          then Current_Indent (Buffer) else 0);

function Get_UTF_8 (
   Buffer : in out Root_Buffer_Type)
   return UTF_Encoding.UTF_8_String
   with Post'Class => Character_Count (Buffer) = 0;

function Wide_Get_UTF_16 (
   Buffer : in out Root_Buffer_Type)
   return UTF_Encoding.UTF_16_Wide_String
   with Post'Class => Character_Count (Buffer) = 0;

type Substitution_Function is
   access function (Item : Wide_Wide_Character) return String;

type Wide_Substitution_Function is
   access function (Item : Wide_Wide_Character) return Wide_String;

function Image_Substitution
   (Item : Wide_Wide_Character) return String;

function Wide_Image_Substitution
   (Item : Wide_Wide_Character) return Wide_String;

function Get (
   Buffer : in out Root_Buffer_Type;
   Substitute : in Substitution_Function := Image_Substitution)
   return String
   with Post'Class =>
      Get'Result'First = 1 and then Character_Count (Buffer) = 0;

function Wide_Get (
   Buffer : in out Root_Buffer_Type;
   Substitute : in Wide_Substitution_Function := Wide_Image_Substitution)
   return Wide_String
   with Post'Class =>
      Wide_Get'Result'First = 1 and then Character_Count (Buffer) = 0;

function Position_In_Line (Buffer : Root_Buffer_Type)
   return Positive_Text_Buffer_Count;

Standard_Indent : constant Positive_Text_Buffer_Count := 3;

function Current_Indent (Buffer : Root_Buffer_Type)
   return Text_Buffer_Count;

procedure Increase_Indent (
   Buffer : in out Root_Buffer_Type;
   Amount : in Text_Buffer_Count := Standard_Indent)
   with Post'Class =>
      Current_Indent (Buffer) = Current_Indent (Buffer)'Old + Amount;

procedure Decrease_Indent (
   Buffer : in out Root_Buffer_Type;
   Amount : in Text_Buffer_Count := Standard_Indent)
   with Pre'Class =>
           Current_Indent (Buffer) >= Amount
              or else raise Constraint_Error,
        Post'Class =>
           Current_Indent (Buffer) = Current_Indent (Buffer)'Old - Amount;
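The postconditions of Put_UTF_8 and Wide_Put_UTF_16 count decoded characters rather than encoded code units. A minimal sketch of that difference, using only the existing Ada.Strings.UTF_Encoding.Wide_Wide_Strings package (the procedure name and the sample text are illustrative assumptions):

```ada
with Ada.Strings.UTF_Encoding;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure UTF_8_Count_Sketch is

   package WW renames Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

   --  Greek small letter pi (U+03C0) followed by " = 3".
   --  Pi needs two code units (bytes) in UTF-8.
   Text : constant Wide_Wide_String :=
     Wide_Wide_Character'Val (16#03C0#) & " = 3";

   Encoded : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
     WW.Encode (Text);

   Decoded : constant Wide_Wide_String := WW.Decode (Encoded);

begin
   --  Length of the UTF-8 string: one entry per byte.
   Ada.Text_IO.Put_Line
     ("code units:" & Natural'Image (Encoded'Length));   --  code units: 6

   --  Length after decoding: one entry per character, which is the
   --  quantity the postcondition adds to Character_Count.
   Ada.Text_IO.Put_Line
     ("characters:" & Natural'Image (Decoded'Length));   --  characters: 5
end UTF_8_Count_Sketch;
```

Under this reading, putting the six-byte string above into a buffer would increase Character_Count by five (plus any indentation inserted at the start of a line).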
Modify A.4.12 (19/5):
type Buffer_Type is new Root_Buffer_Type with private [with Default_Initial_Condition => Character_Count (Buffer_Type) = 0];
Delete A.4.12 (20/5) because the type is a nonabstract private extension so there is no need to visibly override the inherited operations; they can be overridden in the private part.
Modify A.4.12 (23/5):
type Buffer_Type (Max_Characters : Text_Buffer_Count) is new Root_Buffer_Type with private [with Default_Initial_Condition => Character_Count (Buffer_Type) = 0];
Replace A.4.12 (24/5) with:
-- Overridings of Put, Wide_Put, and Wide_Wide_Put are declared here
-- with the aspect specification
--    with Pre =>
--       Character_Count (Buffer) + Item'Length +
--       (if Position_In_Line (Buffer) = 1
--        then Current_Indent (Buffer) else 0) <= Buffer.Max_Characters
--       or else raise Constraint_Error

-- Overridings of Put_UTF_8 and Wide_Put_UTF_16 are declared here
-- with the aspect specification
--    with Pre =>
--       Character_Count (Buffer) +
--       UTF_Encoding.Wide_Wide_Strings.Decode (Item)'Length +
--       (if Position_In_Line (Buffer) = 1
--        then Current_Indent (Buffer) else 0) <= Buffer.Max_Characters
--       or else raise Constraint_Error
Modify A.4.12 (27/5):
{New_Line_String is the representation for a new line in a Text_Buffer.} New_Line stores [New_Line_Count characters that represent a new line] {New_Line_String} into a text buffer. [End_of_Line returns True if the next characters to be retrieved from the text buffer represent a new line.] {Position_In_Line returns the character position in the current line of output, starting with one. Current_Indent returns the current indentation associated with the buffer, with zero meaning there is no indentation in effect; Increase_Indent and Decrease_Indent increase or decrease the indentation associated with the buffer.}
Modify A.4.12 (28/5):
A call to Put{, Put_UTF_8}, Wide_Put{, Wide_Put_UTF_16}, or Wide_Wide_Put stores a sequence of characters into the text buffer{, preceded by Current_Indent(Buffer) spaces (Wide_Wide_Characters with position 32) if they would have been the first characters on the current line}.
Modify A.4.12 (29/5):
A call to Get{, Get_UTF_8}, Wide_Get{, Wide_Get_UTF_16}, or Wide_Wide_Get returns the same sequence of characters as was present in the calls that stored the characters into the buffer{, if representable}. For a call to Get, if any character {C} in the sequence is not defined in Character, [the result is implementation defined] {C is replaced with the Substitute character when calling procedure Get, and with Substitute (C) when calling the function Get}. Similarly, for a call to Wide_Get, if any character {C} in the sequence is not defined in Wide_Character, [the result is implementation defined] {C is replaced with the Substitute character when calling procedure Wide_Get, and with Substitute (C) when calling the function Wide_Get.
The functions Image_Substitution and Wide_Image_Substitution perform the implementation-defined substitutions necessary to support the Value and Wide_Value attributes for enumeration types that have literals using a character that is not representable directly as Character or Wide_Character, respectively}.
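The two substitution styles described above can be sketched without a text buffer. In the sketch, Get is a hypothetical model that takes the buffer contents as a Wide_Wide_String, and Hex_Escape is an illustrative user-defined substitution; neither is the implementation-defined Image_Substitution, whose output format this AI does not specify. The single-character style uses the existing Ada.Characters.Conversions.To_String.

```ada
with Ada.Characters.Conversions;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Substitution_Sketch is

   package Conv renames Ada.Characters.Conversions;

   type Substitution_Function is
     access function (Item : Wide_Wide_Character) return String;

   --  Hypothetical substitution: spell out the code point in
   --  hexadecimal, for example "[U+3C0]".
   function Hex_Escape (Item : Wide_Wide_Character) return String is
      Digits_16 : constant String := "0123456789ABCDEF";
      Code      : Natural := Wide_Wide_Character'Pos (Item);
      Hex       : String (1 .. 8);
      First     : Natural := Hex'Last + 1;
   begin
      loop
         First       := First - 1;
         Hex (First) := Digits_16 (Code mod 16 + 1);
         Code        := Code / 16;
         exit when Code = 0;
      end loop;
      return "[U+" & Hex (First .. Hex'Last) & "]";
   end Hex_Escape;

   --  Model of the function form of Get: each character that is not
   --  representable in Character is replaced by Substitute (C).
   function Get
     (Contents   : Wide_Wide_String;
      Substitute : Substitution_Function) return String
   is
      Result : Unbounded_String;
   begin
      for C of Contents loop
         if Conv.Is_Character (C) then
            Append (Result, Conv.To_Character (C));
         else
            Append (Result, Substitute (C));
         end if;
      end loop;
      return To_String (Result);  --  lower bound is 1
   end Get;

   Pi_Text : constant Wide_Wide_String :=
     Wide_Wide_Character'Val (16#03C0#) & " = 3";

begin
   --  Function style: a multi-character replacement.
   Ada.Text_IO.Put_Line (Get (Pi_Text, Hex_Escape'Access));  --  [U+3C0] = 3

   --  Procedure style: a single substitute character.
   Ada.Text_IO.Put_Line
     (Conv.To_String (Pi_Text, Substitute => '?'));          --  ? = 3
end Substitution_Sketch;
```

A multi-character replacement keeps enough information to reconstruct the original character, which is what a reversible 'Image/'Value pair needs; a single substitute character does not.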
!discussion
See !problem and !proposal.
We have added a reminder that private types have the same implementation of Put_Image as their full type. Another choice would be to say that Put_Image is not defined on a private type unless explicitly provided, though that would clearly make composability more of a pain, analogous to what happens with 'Write on limited types. Clearly any implementor of a private type should specify Put_Image if they are concerned about the loss of information hiding. For compatibility reasons we can't really change what Put_Image does on the full type, and making the run-time semantics of Put_Image depend on where you invoke it seems like a generally bad idea (as well as requiring defining a new kind of aspect with its own set of rules).
One could imagine other rules, such as hiding 'Image, but not hiding 'Put_Image, on a private type. But given that the motivating purpose of the expansion of 'Image is to ease debugging/logging (see the !problem of AI12-0020-1), restrictions on the use of 'Image would just get in the way of the intended use. Additionally, there is nothing in the default 'Image that can't be learned from reading the specification of the private type, so any hiding is dubious anyway.
Users with critical information-leakage concerns can always define their own Put_Image to block such leakage.
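A user-defined Put_Image of this kind might look like the following sketch. It assumes a compiler that supports the Put_Image aspect and Ada.Strings.Text_Buffers as they appear in Ada 2022 (for GNAT, a version that accepts pragma Ada_2022); the package Accounts, the type Account, and its components are hypothetical names chosen for illustration.

```ada
pragma Ada_2022;

with Ada.Strings.Text_Buffers;
with Ada.Text_IO;

procedure Hidden_Image_Sketch is

   package Accounts is
      --  The aspect is given on the partial view, so every view of
      --  Account uses the same user-defined image.
      type Account is private
        with Put_Image => Put_Account;

      function Open
        (Owner_Id : Natural; Secret_Pin : Natural) return Account;

      procedure Put_Account
        (Buffer : in out Ada.Strings.Text_Buffers.Root_Buffer_Type'Class;
         Arg    : Account);
   private
      type Account is record
         Owner_Id   : Natural;
         Secret_Pin : Natural;
      end record;
   end Accounts;

   package body Accounts is

      function Open
        (Owner_Id : Natural; Secret_Pin : Natural) return Account is
      begin
         return (Owner_Id => Owner_Id, Secret_Pin => Secret_Pin);
      end Open;

      procedure Put_Account
        (Buffer : in out Ada.Strings.Text_Buffers.Root_Buffer_Type'Class;
         Arg    : Account) is
      begin
         --  Only the non-sensitive component is written to the buffer.
         Buffer.Put
           ("(Owner_Id =>" & Natural'Image (Arg.Owner_Id)
            & ", Secret_Pin => <hidden>)");
      end Put_Account;

   end Accounts;

   A : constant Accounts.Account := Accounts.Open (7, 1234);

begin
   --  'Image goes through Put_Account, so the PIN is not displayed.
   Ada.Text_IO.Put_Line (A'Image);
end Hidden_Image_Sketch;
```

Because the aspect is type-related, any composite type with an Account component would also pick up this image for that component by default, so the hidden field stays hidden under composition.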
!ASIS
No ASIS effect.
!ACATS test
An ACATS C-Test is needed to check that the various subprograms and the image attributes are implemented as expected, and that the new rules are enforced rather than the previous rules.
!appendix

From: Tucker Taft
Sent: Sunday, April 26, 2020  9:07 AM

I am in the process of putting together an AI that includes a set of proposed 
"fix-ups" for AI12-0020-1 (Put_Image) and AI12-0340-1 (Put_Image should use 
Text_Buffers) resulting from Bob Duff's implementation efforts.  Along the 
way, I noticed at least one AARM note in the new 4.10 section that is talking
about Wide_Wide_String'Write, along with one normative paragraph:

13.a/5: 
Discussion: In general, the default implementation of T'Put_Image for a 
composite type will involve some sequence of calls to Wide_Wide_String'Write 
and calls to the Put_Image procedures of component types and, in the case of 
an array type, index types. The Wide_Wide_String'Write calls may pass in 
either literal values (e.g., "(", ")", "'(", " => ", or ", "), or other 
things (such as component names for record values, task_id images for tasks, 
or the Wide_Wide_Expanded_Name of the tag in the class-wide case). 

16/5:
For a class-wide type, the default implementation of T'Put_Image generates an 
image based on qualified expression syntax. Wide_Wide_String'Write is called 
with Wide_Wide_Expanded_Name of Arg'Tag. Then S'Put_Image is called, where S 
is the specific type identified by Arg'Tag.

I'll try to remember to include these in my AI, but they are clearly unrelated 
to the lessons learned from Bob's implementation work, so might be better in 
some sort of "editorial fix-up" AI.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, June 10, 2020  6:02 PM

A thought on this unposted AI (it will be posted soon, but I wanted this on the 
record): 

... 
> !discussion
> 
> See !problem and !proposal
> 
> We have clarified that private types have the same default 
> implementation of Put_Image as their full type. Another choice would 
> be to say that Put_Image is not defined on a private type unless 
> explicitly provided, though that would clearly make composability more 
> of a pain, analogous to what happens with 'Write on limited types.  
> Clearly any implementor of a private type should specify Put_Image if 
> they are concerned about the loss of information hiding.  For 
> compatibility reasons we can't really change what Put_Image does on 
> the full type, and making the run-time semantics of Put_Image depend 
> on where you invoke it seems like a generally bad idea.  So I think we 
> are stuck with either hiding it completely, or having it do the same 
> thing as on the full type.  Actually, one option is to hide 'Image, 
> but not hide 'Put_Image, on a private type.  That would mean that 
> composability is not interrupted, but a "casual" use of 'Image would 
> not be possible without an explicit declaration of Put_Image in the 
> visible part of the package.

The latter part of this discussion seems silly to me. The primary goal was to 
provide 'Image for all types to ease debugging/logging, and secondarily 
user-defined 'Image to allow the use of more appropriate images for types like 
Big_Integer and Vector. The goal was *not* to provide 'Image for all types 
except private types! Moreover, for debugging one generally wants most or all 
of the private data - information hiding is not a goal when debugging. Even 
for the containers, we want the result to show the complete contents of the 
container (just suppressing some of the control information). Finally, any 
sort of hiding rule would greatly complicate the definition of Image. (We had
to use such a thing for streaming as no useful default streaming is available
for some types, such as tasks, no such problem exists for Image.)

The default of the Image of the full type seems to be the natural and correct 
implementation to meet these primary goals. Remember that a private type in 
Ada is private in name only -- it's sitting right there in the source where 
anyone can inspect it. Hiding details of a type cannot be done with private 
types alone, so there's little reason to get too concerned about it.

User-defined Put_Image can be used for the rare case where exposing the 
internals of a type is a problem. 

Note that the same analysis applies to incomplete types; I think we get to 
ignore that as an incomplete type cannot be the prefix of an attribute nor can 
an object with an incomplete type be the prefix of anything (including an 
attribute).

****************************************************************

From: Tucker Taft
Sent: Wednesday, June 10, 2020  7:22 PM

> The latter part of this discussion seems silly to me. ...

Be that as it may, it seems worth exposing the issue, and attempting to 
enumerate the alternatives...

****************************************************************

From: Randy Brukardt
Sent: Wednesday, June 10, 2020  9:02 PM

> ... 
> > !discussion
> > 
> > See !problem and !proposal
> > 
> > We have clarified that private types have the same default 
> > implementation of Put_Image as their full type.

Furthermore, the AI has the following wording:

Add after 4.10 (23/5):
  For a private type or a private extension T, the default
  implementation of T'Put_Image is the same as that for its full type.

This literally goes without saying. Put_Image is defined to be a type-related 
operational attribute/aspect by 4.10(2/5), and 13.1(11/3) says "Operational 
and representation aspects are the same for all views of a type".
(Tucker has pointed out privately that 13.1(11/3) and several other rules 
really should only apply to type-related aspects, but even fixing that makes
no change here.)

Ergo, it is impossible for a private type to have a different Put_Image than 
its full type. That has nothing to do with the default implementation or any 
other qualification. If we wanted a difference (and I don't think we do), we 
would need to change Put_Image to some other kind of attribute/aspect.

If we're going to say anything at all here, it should be marked redundant and 
much more general:

   Redundant[For a private type or a private extension T, T'Put_Image 
   is the same as that for its full type.]

   AARM Proof: A type-related operation aspect is the same for views of a 
   type, see 13.1.

or maybe more simply:

   Redundant[T'Put_Image is the same for all views of T, including any 
   partial views.]

with the same proof.

The !discussion might mention that changing the basic model of type-related 
aspects is unwise, giving some of the reasons discussed.

In any case, the model here was copying the model of stream attributes, which 
never bother saying anything about partial views. Not really sure why 
Put_Image should be different.

****************************************************************

From: Tucker Taft
Sent: Wednesday, June 10, 2020  9:27 PM

> Ergo, it is impossible for a private type to have a different 
> Put_Image than its full type. That has nothing to do with the default 
> implementation or any other qualification. If we wanted a difference 
> (and I don't think we do), we would need to change Put_Image to some other
> kind of attribute/aspect.

I don't entirely agree.  It is perfectly possible for a full view to have an 
attribute, and the partial view to not have the attribute defined.  For example
'Val is defined for a full view if it is an integer or an enumeration type, but
not for the partial view.  So I agree they couldn't be different without 
breaking various other Ada principles, but we could say that Put_Image is not 
defined on the partial view, while it is defined on the full view.

> ...
> In any case, the model here was copying the model of stream 
> attributes, which never bother saying anything about partial views. 
> Not really sure why Put_Image should be different.

It feels somewhat different, because we know that the binary representation of 
data is the same independent of the view, but properties that are defined at a 
higher level can certainly be different for a partial view and a full view.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, June 10, 2020  6:02 PM

...
> > Ergo, it is impossible for a private type to have a different 
> > Put_Image than its full type. That has nothing to do with the default 
> > implementation or any other qualification. If we wanted a difference 
> > (and I don't think we do), we would need to change 
> Put_Image to some other kind of attribute/aspect.
> 
> I don't entirely agree.  It is perfectly possible for a full 
> view to have an attribute, and the partial view to not have 
> the attribute defined.  For example 'Val is defined for a 
> full view if it is an integer or an enumeration type, but not 
> for the partial view.  So I agree they couldn't be different 
> without breaking various other Ada principles, but we could 
> say that Put_Image is not defined on the partial view, while 
> it is defined on the full view.

I don't think that exact model works (there was *some* reason that we invented 
the entire concept of "availability"), but you are right that some sort of 
similar model could be made to work. But that model would have to hide Image 
and/or Put_Image of any type with a private component in the current view, 
since the composition model would let any such type access the full 
representation of the private type.

The bigger problem is that the entire justification for defining universal 
'Image is to support debugging (see the !problem for AI12-0020-1), and the 
notion that no one wants to debug private types is ludicrous. Those who write 
everything as ADTs would be helped hardly at all, as virtually all of their 
types contain some private components. That way seems to lead to nowhere.

> > ...
> > In any case, the model here was copying the model of stream 
> > attributes, which never bother saying anything about partial views. 
> > Not really sure why Put_Image should be different.
> 
> It feels somewhat different, because we know that the binary 
> representation of data is the same independent of the view, 
> but properties that are defined at a higher level can 
> certainly be different for a partial view and a full view.

True, but that means abandoning the stream attribute model for some sort of 
view-specific model. And then how composition is supposed to work seems 
problematic. Sounds like a can of worms heading nowhere.

****************************************************************

