CVS difference for ais/ai-00301.txt

Differences between 1.5 and version 1.6

--- ais/ai-00301.txt	2002/08/31 00:32:37	1.5
+++ ais/ai-00301.txt	2002/10/01 03:08:54	1.6
@@ -1,4 +1,4 @@
-!standard A.4.5                                     02-08-26  AI95-00301/03
+!standard A.4.5                                     02-09-30  AI95-00301/04
 !class amendment 02-06-12
 !status work item 02-06-12
 !status received 02-06-12
@@ -97,19 +97,21 @@
     -- Index_Error is propagated if From is not in Source'range.
 
 
-Similar functions are added to Ada.Strings.Bounded and Ada.Strings.Unbounded.
+Similar functions are added to Ada.Strings.Bounded and Ada.Strings.Unbounded;
+in these, the Source parameter is a Bounded_String and Unbounded_String,
+respectively.
 
 
 The following operation is added to package Ada.Strings.Bounded:
 
-   procedure To_Bounded_String
+   procedure Set_Bounded_String
      (Target :    out Bounded_String;
       Str    : in     String);
    --  Identical in effect to Target := To_Bounded_String (Str);
 
 The following operation is added to package Ada.Strings.Unbounded:
 
-   procedure To_Unbounded_String
+   procedure Set_Unbounded_String
      (Target :    out Unbounded_String;
       Str    : in     String);
    --  Identical in effect to Target := To_Unbounded_String (Str);
@@ -117,14 +119,14 @@
 
 The following operations are added to package Ada.Strings.Unbounded:
 
-   function Slice
+   function Unbounded_Slice
      (Source : in Unbounded_String;
       Low    : in Positive;
       High   : in Natural)
      return Unbounded_String;
    --  Identical to To_Unbounded_String (Slice (Source, Low, High));
 
-   procedure Slice
+   procedure Unbounded_Slice
      (Source : in     Unbounded_String;
       Target :    out Unbounded_String;
       Low    : in     Positive;
@@ -176,7 +178,8 @@
    --  Identical in effect to
    --    Overwrite (Source, Position, To_String (New_Item));
 
-Similar operations are added to Ada.Strings.Bounded.
+Similar operations are added to Ada.Strings.Bounded, with the difference that
+Unbounded_Slice is named Bounded_Slice.
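
As an illustration of the equivalences stated in the comments above, here is a minimal sketch. It assumes a compiler whose Ada.Strings.Unbounded already provides Set_Unbounded_String and Unbounded_Slice with the profiles shown in this revision; the procedure name Set_And_Slice_Demo and the sample string are invented for the example. Set_Unbounded_String is called positionally because the name of its second parameter (Str here) may differ between implementations.

```ada
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Set_And_Slice_Demo is
   A, B, C, D : Unbounded_String;
begin
   --  Procedure form: intended to have the same effect as the
   --  assignment form on the next line.
   Set_Unbounded_String (A, "Hello, world");
   B := To_Unbounded_String ("Hello, world");
   pragma Assert (A = B);

   --  Function form of Unbounded_Slice versus the conversion it replaces.
   C := Unbounded_Slice (A, 1, 5);
   pragma Assert (C = To_Unbounded_String (Slice (A, 1, 5)));

   --  Procedure form: the slice is stored in Target.
   Unbounded_Slice (Source => A, Target => D, Low => 8, High => 12);
   pragma Assert (To_String (D) = "world");
end Set_And_Slice_Demo;
```

The assertions only check that the results are equal; whether the procedure forms actually save an allocation depends on the implementation.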
 
 
 The following child package is defined:
@@ -253,12 +256,17 @@
 and no programs change meaning (any incompatibilities cause compile-time
 errors).
 
+If the compatibility issues are considered to be significant enough, then
+these new routines could be added as child packages. Such a child package
+would not cause compatibility problems; however, it would look like an ugly
+add-on. And, choosing a name is difficult. Some of the names proposed
+include Ada.Strings.Unbounded.Additional_Operations,
+Ada.Strings.Unbounded.Enhancements, Ada.Strings.Unbounded.More,
+Ada.Strings.Unbounded.Unbounded_Operations, or
+Ada.Strings.Unbounded.Stuff_that_should_have_been_here_all_along. (OK, the
+last is a joke.) Thus, we did not follow this alternative.
 
-The procedure version of To_Bounded_String and To_Unbounded_String probably
-would be better named 'Set'. However, that would increase the possibility of
-a name collision incompatibility, so we did not change the names.
 
-
 An earlier proposal for the Index functions was to add a defaulted From
 parameter to the end of the existing routine. This would look like:
 
@@ -741,7 +749,7 @@
 --                                                                          --
 --                                 S p e c                                  --
 --                                                                          --
---    $Revision: 1.5 $                              --
+--                            $Revision: 1.6 $                              --
 --                                                                          --
 --          Copyright (C) 1992-1998, Free Software Foundation, Inc.         --
 --                                                                          --
@@ -1674,6 +1682,17 @@
 
 ****************************************************************
 
+From: Robert Dewar
+Sent: Saturday, August 31, 2002  7:31 AM
+
+>>When are the compilers going to implement faster Unbounded Strings?.
+
+When paying customers find it to be a priority, so far we have not had
+a single supported user who was concerned about the performance of
+unbounded string.
+
+****************************************************************
+
 From: Randy Brukardt
 Sent: Friday, August 30, 2002  6:57 PM
 
@@ -1703,3 +1722,1207 @@
 
 ****************************************************************
 
+From: Robert Dewar
+Sent: Friday, August 30, 2002  8:05 PM
+
+<<But, still I find it bizarre that Slice returns a String, and Tail returns
+an Unbounded_String (Tail essentially being a specialization of Slice).
+Sigh.
+>>
+
+Just a little irregular, save the colorful word "bizarre" for more significant
+things :-)
+
+<<Admittedly, a large part of my problem is the verboseness of "To_String" and
+"To_Unbounded_String" when you have to use them in virtually every unbounded
+string expression. A shorter name would be welcome, but I doubt that could
+be done compatibly enough to avoid screwing up existing code.
+>>
+
+provide renamings of "+"
+
+Too bad JDI and RBKD could not convince people to add a conversion operator
+:-)
+
+****************************************************************
+
+From: Bob Duff
+Sent: Saturday, August 31, 2002  9:11 AM
+
+What would the conversion operator have looked like?
+
+****************************************************************
+
+From: Florian Weimer
+Sent: Tuesday, September 3, 2002 3:53 PM
+
+AFAIK, the proposal involved a unary operator which was not used by
+the core language, and which could be overridden by programmers.  Like
+"+", but lacking the predefined meaning.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Tuesday, September 3, 2002 3:35 PM
+
+>>What would the conversion operator have looked like?
+
+Our proposal was simply to allow the currency conversion symbol (also called
+pillow, I forget its ISO name); it's a square with curved sides (curved in)
+
+as an undefined unary operator, available to the user to redefine for whatever
+purpose, but with the idea of stylistically reserving it for conversions
+(the way some people use "+" now)
+
+****************************************************************
+
+From: Jean-Pierre Rosen
+Sent: Wednesday, September  4, 2002  2:05 AM
+
+And this would be a mess today, since this symbol has been reassigned to the
+Euro symbol... :-)
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Wednesday, September 4, 2002  2:18 PM
+
+NO big mess. The euro symbol as a conversion operator is perfectly acceptable
+I think. The point is to choose some special character that is NOT otherwise
+used in the syntax.
+
+****************************************************************
+
+From: Pascal Obry
+Sent: Wednesday, September 4, 2002  3:06 PM
+
+So we have at least '@' or '~', both of them are certainly better looking
+(at least for Europeans) as a conversion operator than the euro symbol.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Wednesday, September 4, 2002  3:12 PM
+
+'@' or '~' would be just fine, and arguably there are a lot of people for
+whom upper half characters are a pain after all :-)
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Wednesday, September 4, 2002  7:46 PM
+
+I'd have to object to using '@'. It has been the conditional compilation
+character in our compiler's preprocessor since the beginning of time (1981).
+Virtually all of the code that we have (with the exception of Claw itself) uses
+this feature extensively.
+
+The compiler interprets '@' as an error (in canonical mode), as a space (in
+condcomp on mode) or as a comment symbol ["--"] (in condcomp off mode). This
+gives us the ability to build debugging and production versions of code without
+changing anything.
+
+The traditional conditional compilation solutions aren't useful for pragmas or
+for context clauses, and these are the main uses for our preprocessor. For
+instance, a typical unit in our compiler will start something like:
+
+    with J2Type_Decs, J2Resolutions;
+    @with J2Trace, J2Dump_Symbols, J2Dump_Types, Text_IO;
+    package body J2Checks is
+        pragma Debug(Off); pragma Suppress(All_Checks);
+        @pragma Debug(On); pragma Unsuppress(All_Checks);
+
+        ...
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Wednesday, September 4, 2002  8:32 PM
+
+That seems like a weak argument. If we do decide to use @, then you can just use
+@@ to mean a single @.
+
+****************************************************************
+
+From: Randy Brukardt
+Sent: Wednesday, September 4, 2002  9:26 PM
+
+> That seems like a weak argument.
+
+Of course it's weak. I don't think that feature is used a lot by our customers,
+so it is mainly us that is affected.
+
+> If we do decide to use @, then you can just use
+> @@ to mean a single @.
+
+I suppose, although that complicates the lexical analysis somewhat. And
+existing programs would not be compatible (although they could be converted
+with a program which would need to be aware of Ada lexical rules). Not a
+trivial or painless solution.
+
+I presume other compilers' preprocessors also use some of the "unused"
+characters. Besides '@' and '~', '\' and '^' are unused (along with '[', ']',
+'{', '}', and '`', which I think would be bad choices for this operation). I
+wonder if any of these would cause trouble for preprocessors or other
+frequently used tools?
+
+****************************************************************
+
+From: Craig Carey
+Sent: Sunday, September  1, 2002  4:22 AM
+
+<<But, still I find it bizarre that Slice returns a String, and Tail returns
+an Unbounded_String (Tail essentially being a specialization of Slice).
+Sigh.
+>>
+
+The package has other problems too.
+
+There is a problem for programmers who remove a better strings package and
+drop back to using Ada 95's Unbounded Strings.
+
+A typical line in the source code might be this:
+
+    X    : V_Str := ...;
+    ...
+    Text_IO.Put_Line ("Text = " & X & ".");  --  "&" returns a plain string
+
+That has X be a variable length string in the package that is being
+removed.
+
+When the Ada 95 Unbounded Strings package is switched in (and the user
+backs out of use of a better package) then there are two ways to rewrite
+that source code line.
+
+  (1)   Text_IO.Put_Line (+("Text = " & X & "."));
+  (2)   Text_IO.Put_Line ("Text = " & (+X) & ".");
+
+In that code, "+" is the "To_String()" function.
+
+Option (1) would run slower so it would be avoided and option (2) would
+be preferred.
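
The two rewrites can be tried out with a small sketch like the following. It assumes "+" is declared by the user as a renaming of To_String (and, for convenience, of To_Unbounded_String); the procedure name Plus_Demo and the value of X are invented for the example. It shows only that both forms compile and print the same text and makes no claim about their relative speed.

```ada
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Plus_Demo is
   --  User-declared conversion operators (not predefined anywhere).
   function "+" (S : Unbounded_String) return String
     renames To_String;
   function "+" (S : String) return Unbounded_String
     renames To_Unbounded_String;

   X : constant Unbounded_String := +"abc";
begin
   --  Option (1): concatenate as Unbounded_String, convert once at the end.
   Ada.Text_IO.Put_Line (+("Text = " & X & "."));

   --  Option (2): convert X first, then use the predefined String "&".
   Ada.Text_IO.Put_Line ("Text = " & (+X) & ".");
end Plus_Demo;
```

Both calls resolve without ambiguity because Put_Line expects a String, which fixes the result type of "+" in each case.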
+
+A problem with the 2nd is that it is inconvenient to add the extra
+parentheses: "(+" and ")".
+
+An advanced text editor that processes syntax errors can locate the 1st
+perhaps, but inserting the 2nd ")" is not so easy. A text editor's
+regular expressions search and replace feature may fail to allow the
+2nd substring (the ")") to be correctly inserted as required.
+
+Also the same problem with excess rewriting of source code can occur
+when Unbounded Strings are being removed, if the new superior package
+does not define any "&" functions.
+
+Another problem is that of replacing most instances of "X := Y" with
+"Assign(X,Y)". That does seem to be desirable and I don't have ideas
+on how to get vendors to provide both sides of the assignment, and how
+to get it to run faster.
+
+[It is excellent that controlled tagged types were not slow.]
+
+--------
+
+Also Ada's rules do allow this:
+
+    Z := "(1<p)" & not "p>=2"  --  define symbolically a 1-D polygon (1<p<2)
+
+; but do not allow these:
+
+    Z := "(1<p)" & -"p>=2"    --  alter to Z := "(1<p)" & (-"p>=2");
+    Z := "(1<p)" and -"p>=2"  --  alter to Z := "(1<p)" and (-"p>=2");
+    Z := "(1<p)" not "p>=2"
+
+I can't see why "X and -Y" is not allowed. (ref. RM 4.5).
+
+
+...
+><<Admittedly, a large part of my problem is the verboseness of "To_String" and
+>"To_Unbounded_String" when you have to use them in virtually every unbounded
+>string expression. A shorter name would be welcome, but I doubt that could
+>be done compatibly enough to avoid screwing up existing code.
+>>>
+>
+>Provide renamings of "+"
+>
+
+While that may be free of syntax errors and malfunctions, there is a doubtful
+area in how to pair up "+", "-", with the to-string and from-string conversions.
+
+Once Unbounded Strings are replaced with a superior package, then the conversion
+away from a plain string to the strings of the package, could occur much less
+in source code.
+
+So if an aim was to have more "+"s in the users' code than "-"s, then it
+could occur that persons thinking of Unbounded Strings prefer to have "+"
+rename To_Unbounded_String(), but to maximise the "+"/"-" ratio, the superior
+package would differ and have "+" convert to a plain Ada 95 string. I presume it
+should be right for the ideal strings instead of seeming right for the
+existing Ada.Strings.Unbounded package.
+
+Leaving Ada.Strings.Unbounded in Ada is not such a good idea since it could
+be used in major projects and it does not seem to run fast enough.
+
+
+
+>>>When are the compilers going to implement faster Unbounded Strings?.
+>
+>When paying customers find it to be a priority, so far we have not had
+>a single supported user who was concerned about the performance of
+>unbounded string.
+
+
+Some of the GNAT Strings.Unbounded code is simple. E.g. here is the
+Tail routine that shows that Tail() will run slower if its result is
+converted to a plain string immediately after the function is called:
+
+------------------------------------------------------------------------------
+    function Tail
+      (Source : Unbounded_String;
+       Count  : Natural;
+       Pad    : Character := Space)
+       return   Unbounded_String is
+
+    begin
+       return
+         To_Unbounded_String (Fixed.Tail (Source.Reference.all, Count, Pad));
+    end Tail;
+------------------------------------------------------------------------------
+
+PS. The GNAT 3.14p file "a-strunb.adb" (which contains the body of the
+package Ada.Strings.Unbounded) has 2.89 lines of conformant active code, per
+function and procedure. [188 lines divided by 65 procedure and function
+declarations].
+
+   The lines counted excluded comments, blank lines, begin's, end's, and
+   declarations (including those with an assignment).
+
+Possibly GNAT's paying customers would not comment on a package that was
+so lightweight. A complaint might not seem heavyweight.
+
+The Tail function is implemented inefficiently since it allocates bytes
+without allocating excess spare space. The specs in the Reference Manual
+don't require that. Its efficiency is limited by that of GNAT's
+controlled types.
+
+My timing tests showed that unnecessary copying is not the complete
+problem: the slowness of ":=" is comparable (and worse if the strings are under
+1000 bytes long). So ACT's paying customers might never get to complain
+about Unbounded Strings since such complaints may transform into
+complaints about GNAT's implementation of RM 7.6 (User-Defined Assignment
+and Finalization).
+
+Mr Dewar has used Unbounded Strings in the GNAT g-regpat.adb regular
+expressions package, and if the ARG were to declare Unbounded Strings
+to be deprecated, then possibly other vendors could instead prefer to
+use 'access Strings' or whatever is available, instead.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Sunday, September 1, 2002  5:23 AM
+
+<<Leaving Ada.Strings.Unbounded in Ada is not such a good idea since it could
+be used in major projects and it does not seem to run fast enough.
+>>
+
+If the ARG were to remove Ada.Strings.Unbounded from Ada they would lose
+all credibility in the Ada community, and no vendors would pay any attention.
+When Ada 95 was designed, upwards compatibility was a very important
+requirement. That requirement is even more important in any future upgrade.
+
+In fact many people are using Ada.Strings.Unbounded for many purposes
+now quite successfully. Yes, more efficient packages and implementations
+are possible, but the demand for these is non-existent as far as we are
+concerned, and on the other hand, ASU is widely used in situations where
+maximum efficiency is not a primary concern.
+
+As to supposed shortcomings of the package design, that's a matter of taste.
+So far the proposed modifications have all seemed to me to be plainly
+undesirable.
+
+At most it would be feasible to propose an alternative package, but I doubt
+that this would gather the necessary consensus for approval.
+
+****************************************************************
+
+From: Robert Duff
+Sent: Sunday, September  1, 2002  8:29 AM
+
+Robert says:
+
+> If the ARG were to remove Ada.Strings.Unbounded from Ada they would lose
+> all credibility in the Ada community, and no vendors would pay any attention.
+
+Right.  The ARG would never seriously consider such a huge
+incompatibility.
+
+****************************************************************
+
+From: Craig Carey
+Sent: Sunday, September  1, 2002  5:48 AM
+
+>Mr Dewar has used Unbounded Strings in the GNAT g-regpat.adb regular
+>expressions package, and if the ARG were to declare Unbounded Strings
+>to be deprecated, then possibly other vendors could instead prefer to
+>use 'access Strings' or whatever is available, instead.
+
+Correction: the  Regular Expressions packages do not use
+Ada.Strings.Unbounded (this year). My comment was giving a wrong suggestion
+since most GNAT specs code avoids using Unbounded Strings
+(with exceptions being quite few, and including Ada.Strings.Unbounded.Text_IO
+and Ada.Strings.Unbounded.Aux).
+
+****************************************************************
+
+From: Craig Carey
+Sent: Sunday, September  1, 2002  9:18 PM
+
+Corrections
+
+At 02\09\01 22:48 +1200 Sunday, Craig Carey wrote:
+ >At 2002\09\01 21:21 +1200 Sunday, Craig Carey wrote:
+...
+ >At 02\08\23 23:01 -0500 Friday, Randy Brukardt wrote:
+...
+ > >   http://www.ada-auth.org/cgi-bin/cvsweb.cgi/AIs/AI-00301.TXT
+...
+ >The ":=" is a poor performer even when not allocating and deallocating:
+
+I presume that is wrong: I just timed with GNAT in Windows 2000 and a ":="
+(over a controlled type that did not have user defined pointers in it),
+where the Adjust() of the ":=" did nothing but was called, and it is
+about the same speed as an empty Assign(X,Y) procedure.
+
+...
+ > >values with the function To_Unbounded_String. However, when this function
+ > >is used in an assignment statement, memory may be allocated twice (once by
+ > >the function, and once by Adjust), which is substantial extra overhead. A
+ > >procedure version of To_Unbounded_String would avoid this problem.
+
+Also, in Windows NT and Windows 2000, deallocating memory can be of a slowness
+similar to the slowness of allocating memory. That could be rechecked too.
+I presume it is 4 similar slowdowns and not 2, that result from the
+(able to be circumvented with difficulty) RM7.6 restrictions on what Adjust()
+can know.
+
+****************************************************************
+
+From: Alexander Kopilovitch
+Sent: Sunday, September  1, 2002  2:14 PM
+
+Generalizing the Strings/Unbounded_Strings issue, I would propose a new notion
+of "enveloped" private type. That is, a private type Y may be declared as an
+envelope (new keyword) of some base type X:
+
+  type Y is private envelope of X;
+
+Enveloping type (Y above) is required to have 2 private primitive operations:
+
+  function Strip (Source : Y) return X;
+
+and
+
+  function Upgrade (Source : X) return Y;
+
+which must be exact inverse of each other:
+
+  Strip( Upgrade(V) ) = V  and Upgrade( Strip(W) ) = W
+
+and their implementation is severely restricted so that compiler can verify
+and guarantee these identities.
+
+Then, a variable of enveloped type may be immediately initialized with a value
+of enveloping type and vice versa, in all cases of initialization, which include:
+
+1) declaration with initialization
+
+   V : X := R;  -- where R is either a variable or constant of type Y
+                -- or a function returning result of type Y
+
+   W : Y := S;  -- where S is either a variable or constant of type X
+                -- or a function returning result of type X
+
+2) argument for "in" parameter of a subroutine call
+
+   function F(A : in X; B : in Y)
+   ...
+   procedure P(A: in X; B : in Y)
+   ...
+   V : X;
+   W : Y;
+   ...
+   ... := F(W, V);
+   P(W, V);
+
+3) argument for "out" parameter of a procedure call
+
+   procedure P1(U : out X)
+   ...
+   procedure P2(U : in out X)
+   ...
+   W : Y;
+   ...
+   P1(W);
+   P2(W);
+
+In all these cases a compiler provides implicit conversions between types X
+and Y using private operations Strip and Upgrade of Y.
+
+Further, there may be several different envelopes for the same base type:
+
+  type Y is private envelope of X;
+  type Z is private envelope of X;
+
+and one of those envelopes may be immediately used for an initialization of a
+variable or parameter of another envelope type (as in the previous case above).
+For example:
+
+  procedure P(W : out Y)
+  ...
+  T : Z;
+  ...
+  P(T);
+
+The compiler provides implicit conversions between types Y and Z using
+compositions Z.Upgrade(Y.Strip(...)) and Y.Upgrade(Z.Strip(...)) .
+
+I believe that the notion of enveloped type may be considered (to some extent)
+as opposite to the notion of subtype.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Sunday, September  1, 2002  2:19 PM
+
+I think a proposal like Alexander's envelope proposal, especially one, which
+like this one, adds a very large amount of complexity, should always be
+accompanied by a motivating example worked out in detail, showing how
+some problem is solved with the new feature, and what is required to solve
+the same problem without the new feature.
+
+For me, it would take a lot of convincing to accept a big feature like this,
+and I see it as only a minor convenience feature. But perhaps an example
+would show why I am wrong :-)
+
+****************************************************************
+
+From: Robert Duff
+Sent: Sunday, September  1, 2002  4:06 PM
+
+>   type Y is private envelope of X;
+
+First, I agree with Robert that motivating examples are needed
+for this sort of thing.
+
+Second, why invent a new kind of type?  Wouldn't it be simpler to invent
+a feature called "user-defined implicit conversions"?  I think that idea
+has been discussed here before.
+
+> Enveloping type (Y above) is required to have 2 private primitive operations:
+>
+>   function Strip (Source : Y) return X;
+>
+> and
+>
+>   function Upgrade (Source : X) return Y;
+>
+> which must be exact inverse of each other:
+>
+>   Strip( Upgrade(V) ) = V  and Upgrade( Strip(W) ) = W
+>
+> and their implementation is severely restricted so that compiler can verify
+> and guarantee these identities.
+
+Those severe restrictions seem difficult to define.
+
+> Then, a variable of enveloped type may be immediately initialized with
+> a value of enveloping type and vice versa, ...
+
+Why initialization, and not assignment statements?
+
+> 3) argument for "out" parameter of a procedure call
+
+I presume this can only work if the thing is passed by copy?
+How does this interact with "by reference" (and "return by reference")
+types?
+
+> The compiler provides implicit conversions between types Y and Z using
+> compositions Z.Upgrade(Y.Strip(...)) and Y.Upgrade(Z.Strip(...)) .
+
+Are there interactions with explicit type conversions?
+Need to think about whether ambiguities can be introduced.
+
+> I believe that the notion of enveloped type may be considered (to some
+> extent) as opposite to the notion of subtype.
+
+I don't understand the analogy -- please explain.
+
+I'm sure generics need some corresponding changes.
+Every time you change the semantics of private types,
+you need to think about corresponding changes to generics.
+See AARM-7.3(19.a-19.f).
+
+Would you allow:
+
+    type Y is new T with private envelope of X;
+
+or some other permutation of those keywords?
+That is, surely one would sometimes want to make the envelope
+visibly derived from some other type.
+
+****************************************************************
+
+From: Alexander Kopilovitch
+Sent: Wednesday, September  4, 2002  3:49 PM
+
+>> I believe that the notion of enveloped type may be considered (to some
+>> extent) as opposite to the notion of subtype.
+>
+>I don't understand the analogy -- please explain.
+
+Subtyping imposes restrictions on the type, while enveloping lifts
+restrictions, imposed on the type (at the cost of speed or memory, but not of
+safety).
+
+  Their essential common feature is virtually seamless interoperability with
+the base type and other subtypes or envelopes of the latter.
+
+  But yes, the analogy is not direct, because of different nature of
+restrictions involved: subtyping deals with restrictions on the type's domain,
+which is inherent to the type itself, while enveloping deals with the
+restrictions imposed by the type's surrounding environment and provides, say,
+additional lifestyles for objects of the type.
+
+****************************************************************
+
+From: Pascal Leroy
+Sent: Monday, September  2, 2002  7:02 AM
+
+First, let me remind everybody that this is a mailing list for discussing
+possible improvements/extensions to the Ada language.  Stylistic considerations
+about the optimal ratio of "+" vs "-" or about the wizardry needed to rewrite
+code using a text editor have nothing to do here.  There are forums more
+appropriate for this kind of chit-chat (CLA for example).
+
+Second, let me make one thing very clear: the ARG is not, repeat not, going to
+drop or obsolesce Ada.Strings.Unbounded.  The ARG could introduce minor
+incompatibilities in this unit if they were to bring significant benefits,
+although this would take some convincing.  It could introduce another string
+package providing similar capabilities, but that would take even more
+convincing.  But there is no point in discussing the removal of ASU, as this is
+not going to happen in our lifetime.
+
+This being said, I would like to make a few comments on the performance issue.
+
+1 - Contrary to popular belief, the main reason why the predefined units are
+defined in the RM is not because vendors can provide a super-efficient
+implementation for them.  Predefined units are intended to provide services
+that are well-defined and generally useful.  Most of the time they are
+implemented in pure Ada with little or no compiler magic.  Spending a lot of
+engineering effort in developing special-purpose magic for ASU is not the right
+trade-off for vendors: they are better off working on improving the general
+quality of their compiler and of the generated code, as it benefits all users,
+not only the vanishingly small minority who critically depends on the
+performance of ASU.
+
+2 - The comparison between pointer-to-string assignments and Unbounded_Strings
+assignment is entirely bogus (for the record, the ratio for Rational Apex
+4.0.0b turns out to be 25, ie in the same ballpark as for GNAT and ObjectAda).
+It should come as no surprise that assigning pointers is more efficient than
+assigning controlled objects with the attendant storage management.  But as
+soon as the pointers used for implementing the strings are exposed, there is
+the risk of storage leaks, multiple deallocation or other plagues.  By
+completely encapsulating the storage management issues, ASU prevents this sort
+of bugs, and that's very important for critical and/or long running
+applications.
+
+3 - Although we had some customers complaining about the performance of ASU in
+the '95-'96 timeframe (and we did improve it) I haven't seen a report on that
+topic for years.  Actually the last set of changes that we made to ASU was
+prompted by a customer who had Unbounded_Strings shared among tasks and wanted
+a tasking-safe version.  My first reaction was "if we do this you are not going
+to like the result because it's going to be much slower".  Their response was
+interesting.  They said: "look, we are not doing any ASU operation, or any
+string operation, or any operation involving heap, in the time-critical parts
+of our application; but in the non-critical parts we use ASU everywhere to
+guarantee that we don't have storage bugs; and we also depend for correctness
+on ASU operations to be atomic wrt tasking".  We ended up making this customer
+happy by providing two variants of this package.  My point is that here is a
+real life application that includes both hard real-time and command and control
+components, and these folks didn't give a damn about the speed of ASU.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Monday, September  2, 2002  9:38 AM
+
+Pascal, can you be a little more detailed about the task safety issue in
+Unbounded_String. That does seem a legitimate discussion. What does the
+RM require? What do implementations provide? What extra features are
+desirable?
+
+****************************************************************
+
+From: Pascal Leroy
+Sent: Monday, September  2, 2002 10:08 AM
+
+It is actually quite unclear what the RM requires, and when the issue cropped
+up we had heated internal discussions on that topic.  The paragraph in question
+is A(3) which says that "the implementation shall ensure that each language
+defined subprogram is reentrant in the sense that concurrent calls on the same
+subprogram perform as specified, so long as all parameters that could be passed
+by reference denote nonoverlapping objects."
+
+The problematic phrase of course is "nonoverlapping objects".  If you interpret
+nonoverlapping with the low-level meaning "don't share bytes" then it's hard to
+imagine how the user of a private type could decide if two objects overlap
+(because the private type may or may not be implemented with levels of
+indirection).  A definition that would not violate the contract model of
+private types would have to be able to answer the question "could these two
+objects possibly overlap?" in terms of high-level Ada semantics, and that
+doesn't seem straightforward.  Certainly the RM does a lot of hand waving here.
+
+In practice our "normal" implementation uses reference counting and so is not
+safe in the face of tasking (any assignment actually creates overlapping
+objects) so if you execute an assignment X := Y and pass X and Y to two tasks
+which modify these variables, pretty quickly you end up with a corrupted
+reference count.  This is an annoyance, and we document it, but it provides the
+best performance in sequential programs.  Note that implementations (like GNAT,
+I believe) which do deep copy of the string can still run into trouble if for
+instance two tasks execute in parallel the assignments X := Y and Y := Z.  You
+could end up assigning to X the first half of Y and the second half of Z,
+depending on interleaving.
+
+In order to deal with the customer request to make ASU tasking-safe we added a
+child unit of ASU (called, imaginatively, Ada.Strings.Unbounded.Rational) with
+a single procedure, Make_Tasking_Safe.  This procedure sets a global Boolean
+which is tested when entering each subprogram in ASU to choose the appropriate
+implementation.  In tasking-safe mode, we do deep copy through a (single)
+protected object (it is not possible to create a tasking-safe implementation of
+reference counting in Ada--at least I couldn't find a way--sigh).
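
For comparison, a user-level way to get a similar guarantee without any change to ASU is to keep the shared string inside a protected object, so that every read and write of it is serialized. The sketch below is not the Rational implementation described above (which switches modes inside ASU itself); the names Guarded_Demo, Shared_String, Set, Get and Writer, and the sample values, are invented for the example.

```ada
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Guarded_Demo is
   --  The shared string lives inside a protected object, so each
   --  copy in or out is made under the object's lock.
   protected type Shared_String is
      procedure Set (Value : in Unbounded_String);
      function Get return Unbounded_String;
   private
      Data : Unbounded_String := To_Unbounded_String ("first value");
   end Shared_String;

   protected body Shared_String is
      procedure Set (Value : in Unbounded_String) is
      begin
         Data := Value;
      end Set;

      function Get return Unbounded_String is
      begin
         return Data;
      end Get;
   end Shared_String;

   Y : Shared_String;
   X : Unbounded_String;

   --  Plays the role of the task executing "Y := Z".
   task Writer;
   task body Writer is
   begin
      for I in 1 .. 1_000 loop
         Y.Set (To_Unbounded_String ("second value"));
         Y.Set (To_Unbounded_String ("first value"));
      end loop;
   end Writer;
begin
   --  Plays the role of the task executing "X := Y".
   for I in 1 .. 1_000 loop
      X := Y.Get;
      --  X should never be a mixture of the two values.
      pragma Assert (X = "first value" or else X = "second value");
   end loop;
end Guarded_Demo;
```

This protects only accesses that go through Set and Get; two tasks assigning to the same ordinary Unbounded_String variable are as unprotected as before.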
+
+****************************************************************
+
+From: Florian Weimer
+Sent: Monday, September  2, 2002  4:34 PM
+
+> Note that implementation (like GNAT, I believe) which do deep copy
+> of the string can still run into trouble if for instance two tasks
+> execute in parallel the assignments X := Y and Y := Z.
+
+I think this case is already defined to be erroneous in 9.11(11).
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Monday, September  2, 2002  4:50 PM
+
+Yes, of course that is obviously erroneous, this would be obviously true
+with any non-atomic type, since you have one task reading a variable at
+the same time that some other task is writing it.
+
+In other words, common sense tells us this is erroneous, we don't need
+to consult the reference manual :-)
+
+****************************************************************
+
+From: Pascal Leroy
+Sent: Tuesday, September  3, 2002  1:55 AM
+
+The thing that bothers me is that, because Unbounded_Strings is a private
+type, and because the RM doesn't specify much regarding the concurrency
+behavior of ASU, the above situation may or may not be erroneous, and the
+user has no way to tell.  (Unless of course she has access to the source,
+which is the case with GNAT, but that's beside the point, we are talking
+language definition here.)
+
+> In other words, common sense tells us this is erroneous, we don't need
+> to consult the reference manual :-)
+
+If Unbounded_Strings are implemented a la GNAT, the assignments above are
+erroneous.  If they are somehow synchronized with a protected object, there
+is no erroneousness.  It would seem to be a useful thing to specify in the
+RM (heck, it could merely be a documentation requirement, no need to force
+implementations to change).
+
+****************************************************************
+
+From: Jean-Pierre Rosen
+Sent: Tuesday, September  3, 2002  2:54 AM
+
+> If Unbounded_Strings are implemented a la GNAT, the assignments above are
+> erroneous.  If they are somehow synchronized with a protected object, there
+> is no erroneousness.  It would seem to be a useful thing to specify in the
+> RM (heck, it could merely be a documentation requirement, no need to force
+> implementations to change).
+
+I don't think this is needed. Unsynchronized access has always been
+erroneous, nothing new here. Now, if an implementation behaves correctly in
+erroneous conditions, it is just one of the allowed behaviours under
+erroneous execution :-)
+
+****************************************************************
+
+From: Bob Duff
+Sent: Tuesday, September  3, 2002  8:56 AM
+
+Robert says:
+
+> > In other words, common sense tells us this is erroneous, we don't need
+> > to consult the reference manual :-)
+
+My common sense happens to agree with Robert's in this case.
+If you do X := Y and Y := Z in parallel on normal Strings,
+it's erroneous.  Unbounded strings are supposed to be growable,
+but in other ways, they ought to be just like Strings.
+
+Pascal says:
+
+> If Unbounded_Strings are implemented a la GNAT, the assignments above are
+> erroneous.  If they are somehow synchronized with a protected object, there
+> is no erroneousness.  It would seem to be a useful thing to specify in the
+> RM...
+
+Yes, the RM should have said so more clearly (assuming the A(3)
+paragraph is insufficient).  I suppose there are other private types
+where this issue arises?
+
+>... (heck, it could merely be a documentation requirement, no need to force
+> implementations to change).
+
+I don't see any need for a documentation req't.  Calling it erroneous
+doesn't force any implementation to change.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Tuesday, September  3, 2002  3:09 PM
+
+<<The thing that bothers me is that, because Unbounded_Strings is a private
+type, and because the RM doesn't specify much regarding the concurrency
+behavior of ASU, the above situation may or may not be erroneous, and the
+user has no way to tell.  (Unless of course she has access to the source,
+which is the case with GNAT, but that's beside the point, we are talking
+language definition here.)>>
+
+I disagree.
+
+If you have one task writing a bounded string and one task reading the
+same bounded string, that's obviously an improperly shared variable in
+the sense of RM 9.10, and clearly should be considered erroneous.
+
+Now if there are no visible shared variables of this case, then of course
+simultaneous task access should work fine. If you allow weasel arguments
+about the implementation sharing implicit stuff then the statement in the
+RM about task safety is completely meaningless.
+
+I don't think it is meaningless, I think it is very useful. In my opinion
+any implementation of unbounded strings that does not allow different tasks
+to do different things to different unbounded string objects is incorrect.
+
+****************************************************************
+
+From: Ted Baker
+Sent: Wednesday, September  4, 2002  2:25 PM
+
+> ... "the implementation shall ensure that each language defined
+> subprogram is reentrant in the sense that concurrent calls on the same
+> subprogram perform as specified, so long as all parameters that
+> could be passed by reference denote nonoverlapping objects."
+...
+> In practice our "normal" implementation uses reference counting and so is not
+> safe in the face of tasking (any assignment actually creates overlapping
+> objects) so if you execute an assignment X := Y and pass X and Y to two tasks
+> which modify these variables, pretty quickly you end up with a corrupted
+> reference count ... Make_Tasking_Safe.  This procedure turns a global boolean
+> which is tested when entering each subprogram in ASU to choose the appropriate
+> implementation.  In tasking-safe mode, we do deep copy through a (single)
+> protected object (it is not possible to create a tasking-safe implementation of
+> reference counting in Ada--at least I couldn't find a way--sigh).
+
+Exactly.
+
+A fully tasking-correct implementation of
+unbounded strings could not use reference semantics, and so would be
+too inefficient to be of interest to anyone.  A naive programmer could
+easily use these strings in ways that would indeed lead to some nasty
+hard-to-isolate bugs in concurrent code.
+
+In that sense, I agree with those who would like to see this package
+"deprecated".
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Wednesday, September  4, 2002  2:39 PM
+
+This is nonsense.
+
+We have lots of customers using this package, for whom the performance of
+the GNAT implementation is just fine. In fact we don't have any user who
+has expressed any concerns about the performance of this package.
+
+The idea of deprecating a package just because there *might* be someone
+who found it inefficient is about as silly as urging that a television
+channel should be removed because someone might find it boring.
+
+If there is a need for a package with a different spec then go ahead and
+propose one, but please don't waste time trying to eliminate or deprecate
+the spec that is there now and which is widely used and definitely useful.
+
+As for naive programmers, note that as far as I am concerned a correct
+implementation cannot naively use reference counts because it would be
+wrong.
+
+To say that reference semantics cannot be used at all is wrong. There is
+no problem in implementing reference counts that are task safe, it just
+requires locks. Whether these locks represent a good time/space tradeoff
+depends on how light locks are (back in the good old 8080 days, my tasking
+operating system took two instructions to take or release a lock :-)
+
+****************************************************************
+
+From: Ted Baker
+Sent: Wednesday, September  4, 2002  2:47 PM
+
+The problem is how should a programmer (not knowing the implementation)
+predict when the same objects are being accessed concurrently?
+
+It may be "obvious" for entire strings, but it is not so obvious
+to a user whether a value (say passed out of a call to a function)
+is actually a substring of some other string (because the function
+obtained the string by taking a substring -- e.g., Tail of some
+other string).
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Wednesday, September  4, 2002  2:55 PM
+
+If this kind of reference semantics is used, it must be transparent to the
+programmer. These two objects are independent in tasking terms, and it would
+be improper to reflect underlying sharing at the semantically noticeable
+level.
+
+You are thinking too much in implementation terms.
+
+Separate objects of type Unbounded_String are separate objects in the semantic
+sense of tasking independence, and the implementation, whatever techniques
+it uses, must preserve this semantic view.
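+
+A small sketch of the usage that this reading requires to work: two tasks,
+each updating its own Unbounded_String object and never touching the other's.
+The unit and task names are hypothetical.
+
+```ada
+with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
+with Ada.Text_IO;
+
+procedure Independent_Objects is
+   A, B : Unbounded_String;  --  two separate objects, one per task
+begin
+   declare
+      task Writer_A;
+      task Writer_B;
+
+      task body Writer_A is
+      begin
+         for I in 1 .. 1_000 loop
+            Append (A, 'a');  --  touches only A
+         end loop;
+      end Writer_A;
+
+      task body Writer_B is
+      begin
+         for I in 1 .. 1_000 loop
+            Append (B, 'b');  --  touches only B
+         end loop;
+      end Writer_B;
+   begin
+      null;  --  leaving the block waits for both tasks to terminate
+   end;
+
+   --  Under the reading argued above, each length is exactly 1000,
+   --  whatever sharing the implementation uses internally.
+   Ada.Text_IO.Put_Line
+     (Natural'Image (Length (A)) & Natural'Image (Length (B)));
+end Independent_Objects;
+```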
+
+****************************************************************
+
+From: Tucker Taft
+Sent: Wednesday, September  4, 2002  3:31 PM
+
+I agree.  The one place we "cheated" with respect to concurrency
+is for the Text_IO routines that have "implicit" File_Type
+parameters.  I think we have authorized implementations to
+*not* use a lock just so that concurrent Put_Line's to
+the default Current_Output will work.
+
+I would think that if there are other "implicit" globals
+like these that are part of the *semantic* description
+of a subprogram, then concurrency need not be supported.
+
+But I agree with Robert that globals or other sharing that
+are introduced by the implementation, and are not part of
+the language-defined semantics of the subprogram, will need some
+kind of locking or careful updating by the implementation.
+The user should *not* need to be aware of these kinds
+of implementation details when writing concurrent
+programs.
+
+****************************************************************
+
+From: Craig Carey
+Sent: Thursday, September  5, 2002  1:59 AM
+
+
+This is not clarifying.
+
+Robert Dewar wrote:
+
+ >Separate objects of type Unbounded_String are separate objects in the semantic
+ >sense of tasking independence, and the implementation, whatever techniques
+ >it uses, must preserve this semantic view.
+
+And also it seems to me that Ada's own strings do not follow that requirement.
+
+
+It is not all that obvious where to draw the line between task unsafe
+being acceptable or otherwise.
+
+For example, what about when an Unbounded String procedure only handles a
+single Unbounded String argument?  For example:
+
+U1, U2  : Unbounded_String;    --  Globals
+Last    : Natural := 300_000_000;
+A       : String (1 .. Last);
+
+
+{Task 1}:      U1 := To_String (A & To_Unbounded_String (A));
+                A := A (3000 .. Last) & A (1 .. 2999);
+
+{Task 2}:      A (K) := '+';
+
+
+Plain Ada Strings are not task safe.
+Compilers can refuse to make an Ada String be 'atomic'.
+
+Should there be task locking inside of the To_String() and
+To_Unbounded_String() routines?
+
+Instead of saying that "reference semantics" should be "transparent" (i.e.
+hidden), it might be safer to say that data structures
+(perhaps overlooking all of the Character data) should not be corrupted
+in a way that leads to problems including these:
+    (1) an exception can be raised later;
+    (2) a string is lost track of or a plain string is wrongly pointed to
+      more than once;
+    (3) an internal length field (in some record) becomes longer than the
+     real length of the allocated string.
+
+This could be tasking unsafe in an analogy with how the array
+concatenating "&" may be:
+
+    function Ada.Strings.Unbounded."&" (Left, Right : in Unbounded_String)
+          return Unbounded_String;
+
+I am unclear on whether a 'To_String()' routine ought do any task
+locking.
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Thursday, September  5, 2002  7:25 PM
+
+Craig says
+
+> Plain Ada Strings are not task safe.
+
+This is completely confused. Craig, you entirely miss the point. Of course two
+tasks accessing the same string is erroneous. If the tasks try to access the
+same element of the string (one storing and one reading), then the normal
+shared variable rules of 9.10 make the program erroneous, and if they access
+separate elements, the independence requirement is not satisfied because the
+strings are packed.
+
+This is really not the place for elementary discussion of basic Ada concepts!
+
+> Compilers can refuse to make an Ada String be 'atomic'.
+
+Yes, of course, this rejection is expected if you understand atomic.
+
+> I am unclear on whether a 'To_String()' routine ought do any task
+> locking.
+
+Probably you should take this kind of discussion to comp lang ada.
+
+Once again (I am assuming no one else is confused but who knows?) The issue in
+a package like unbounded strings is that if two tasks do operations on
+different objects of type unbounded string, then there must be no visible
+interference between the two operations (internal synchronization of some kind
+may or may not be required at the implementation level, but that is of no
+concern here).
+
+If two tasks do operations on the same unbounded string object (one reading
+and one writing), then the program is erroneous in the normal 9.10 sense.
+The only way this would not be the case is if the implementation provided
+a pragma Atomic for the type unbounded string, but of course this is only
+a theoretical observation, in practice no architecture would provide for
+atomic access to strings of unlimited length :-)
+
+****************************************************************
+
+From: Pascal Leroy
+Sent: Thursday, September  5, 2002  7:55 AM
+
+> To say that reference semantics cannot be used at all is wrong. There is
+> no problem in implementing reference counts that are task safe, it just
+> requires locks.
+
+Well, it's not that easy.
+
+The problem is that controlled types won't let you lock a piece of data (say,
+the reference count) for the entire duration of an assignment operation. You
+can of course implement Adjust and Finalize using protected operations, and
+that will take care of some of the concurrency problems. However, when you
+consider the sequence of events that take place during an assignment operation,
+there is a critical period after the rhs has been copied to the lhs but before
+Adjust has been called, where the reference count does not actually reflect the
+number of references that exist. This inconsistency can cause trouble if
+another task comes in at that very moment.
+
+It may be that such situations only arise in the presence of erroneous sharing
+in the sense of 9.10, but I couldn't convince myself that this was true.  And
+at any rate it doesn't help placate customers who demand a fully tasking-safe
+implementation of ASU.
+
+Note that in languages where you redefine the entire assignment operation (e.g.
+C++) then evidently you lock the object for the whole duration in the call to
+operator= and there is no difficulty.
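+
+A hedged sketch of the approach described above, with Adjust and Finalize
+updating the count through protected operations.  All names (Counted_Strings,
+Counted_String, Counts, and so on) are hypothetical, and this is not any
+vendor's implementation.  The comment in Adjust marks where the window sits:
+the copy of the reference already exists before the count is updated.
+
+```ada
+with Ada.Finalization;
+
+package Counted_Strings is
+   type Counted_String is private;
+   function To_Counted (S : String) return Counted_String;
+   function To_String (C : Counted_String) return String;
+private
+   type Shared_Data (Length : Natural) is record
+      Count : Natural := 1;
+      Text  : String (1 .. Length);
+   end record;
+   type Shared_Access is access Shared_Data;
+
+   type Counted_String is new Ada.Finalization.Controlled with record
+      Ref : Shared_Access;
+   end record;
+   procedure Adjust   (Object : in out Counted_String);
+   procedure Finalize (Object : in out Counted_String);
+end Counted_Strings;
+
+with Ada.Unchecked_Deallocation;
+
+package body Counted_Strings is
+
+   procedure Free is
+     new Ada.Unchecked_Deallocation (Shared_Data, Shared_Access);
+
+   --  One protected object guards every reference count.
+   protected Counts is
+      procedure Increment (Ref : in Shared_Access);
+      procedure Decrement (Ref : in Shared_Access; Now_Zero : out Boolean);
+   end Counts;
+
+   protected body Counts is
+      procedure Increment (Ref : in Shared_Access) is
+      begin
+         Ref.Count := Ref.Count + 1;
+      end Increment;
+
+      procedure Decrement (Ref : in Shared_Access; Now_Zero : out Boolean) is
+      begin
+         Ref.Count := Ref.Count - 1;
+         Now_Zero  := Ref.Count = 0;
+      end Decrement;
+   end Counts;
+
+   function To_Counted (S : String) return Counted_String is
+   begin
+      return (Ada.Finalization.Controlled with
+                Ref => new Shared_Data'(Length => S'Length,
+                                        Count  => 1,
+                                        Text   => S));
+   end To_Counted;
+
+   procedure Adjust (Object : in out Counted_String) is
+   begin
+      --  Window: the assignment has already copied Ref into Object, but
+      --  the count does not yet include this new reference.
+      if Object.Ref /= null then
+         Counts.Increment (Object.Ref);
+      end if;
+   end Adjust;
+
+   procedure Finalize (Object : in out Counted_String) is
+      Ref      : Shared_Access := Object.Ref;
+      Now_Zero : Boolean;
+   begin
+      Object.Ref := null;  --  a second Finalize call becomes a no-op
+      if Ref /= null then
+         Counts.Decrement (Ref, Now_Zero);
+         if Now_Zero then
+            Free (Ref);
+         end if;
+      end if;
+   end Finalize;
+
+   function To_String (C : Counted_String) return String is
+   begin
+      if C.Ref = null then
+         return "";
+      end if;
+      return C.Ref.Text;
+   end To_String;
+
+end Counted_Strings;
+```
+
+The protected operations make each count update atomic, but they cannot span
+the copy and the Adjust call together, which is the difficulty described
+above.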
+
+****************************************************************
+
+From: Robert Dewar
+Sent: Thursday, September  5, 2002  8:35 AM
+
+> It may be that such situations only arise in the presence of erroneous sharing
+> in the sense of 9.10, but I couldn't convince myself that this was true.  And
+> at any rate it doesn't help placate customers who demand a fully tasking-safe
+> implementation of ASU.
+
+As far as I can see this is exactly right, it is only a problem if you have
+erroneous shared variables in the sense of 9.10.
+
+As for a fully tasking-safe implementation of ASU, I find this an odd concept.
+It is unreasonable to demand that the type Unbounded_String be Atomic, and if
+it is not, then two tasks doing improper simultaneous access to the same
+unbounded string object is erroneous by 9.10.
+
+I see the situation as completely clear here from the user level semantic
+point of view. Yes, there may be implementation difficulties in providing
+the correct semantics, especially if you attempt inappropriate optimizations
+(inappropriate in that they violate the clear requirement that separate
+unbounded string objects be independent in the tasking access sense).
+
+There are many opportunities for incorrect optimizations of various Ada
+constructs, but this points to problems of implementation, not design!
+
+****************************************************************
+
+From: Nick Roberts
+Sent: Friday, September  6, 2002  9:10 AM
+
+I'm doing a bit of summarising of this thread so far. Randy
+summarised the discussion at Vienna with regard to the
+Ada.Strings packages by writing a proposal for an updated AI 301,
+which can be seen here:
+
+http://www.ada-auth.org/cgi-bin/cvsweb.cgi/AIs/AI-00301.TXT
+
+It would seem, after discussions here, the following changes
+suggested by this AI would be generally acceptable:
+
+[K1] The addition of procedure To_Bounded_String to package
+Ada.Strings.Bounded and procedure To_Unbounded_String to package
+Ada.Strings.Unbounded, as proposed.
+
+[K2] The addition of packages Ada.Text_IO.Bounded_IO,
+Ada.Wide_Text_IO.Wide_Bounded_IO, Ada.Text_IO.Unbounded_IO, and
+Ada.Wide_Text_IO.Wide_Unbounded_IO, as proposed. (Note, I have
+assumed these facilities would be as useful for bounded strings
+as for unbounded strings.)
+
+[K3] The addition of functions Index and Index_Non_Blank, as
+proposed, to Ada.Strings.Bounded and Ada.Strings.Unbounded (only)
+(since they are upwards compatible, and may be useful).
+
+[K4] The additional functions and procedures Replace_Slice,
+Insert, Overwrite, could be retained, since (in the absence of
+the proposed Slice) they are unlikely to cause problems in
+practice.
+
+It would seem that the following parts of the proposal need to be
+removed:
+
+[D1] The additional functions Slice (since they could introduce
+ambiguities in existing code that would cause it to fail to
+compile).
+
+I would like to propose the following changes:
+
+[NJR1] The proposed Slice functions could simply be renamed to
+Bounded_Slice and Unbounded_Slice, so as to obviate the ambiguity
+problem.
+
+[NJR2] The procedures To_Bounded_String and To_Unbounded_String
+are renamed Set_Bounded_String and Set_Unbounded_String.
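+
+A minimal sketch of the renamed operations [NJR1] and [NJR2], written as local
+subprograms over the existing Ada.Strings.Unbounded so that it does not assume
+the amended package; the bodies follow the "identical in effect" comments in
+the proposal.  The comment in the main body illustrates the ambiguity in [D1].
+
+```ada
+with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
+with Ada.Text_IO;
+
+procedure Renamed_Operations is
+
+   U, V : Unbounded_String;
+
+   --  Local stand-in for the proposed procedure.
+   procedure Set_Unbounded_String
+     (Target :    out Unbounded_String;
+      Str    : in     String) is
+   begin
+      Target := To_Unbounded_String (Str);
+   end Set_Unbounded_String;
+
+   --  Local stand-in for the proposed function.
+   function Unbounded_Slice
+     (Source : in Unbounded_String;
+      Low    : in Positive;
+      High   : in Natural) return Unbounded_String is
+   begin
+      return To_Unbounded_String (Slice (Source, Low, High));
+   end Unbounded_Slice;
+
+begin
+   Set_Unbounded_String (U, "Hello, world");
+
+   --  Today Slice returns only String, so "&" resolves to the
+   --  (Unbounded_String, String) version.  With an additional Slice
+   --  returning Unbounded_String, "&" (Unbounded_String, Unbounded_String)
+   --  would also match and this existing line would become ambiguous.
+   V := U & Slice (U, 1, 5);
+
+   --  A distinct name leaves the existing call untouched.
+   V := U & Unbounded_Slice (U, 1, 5);
+
+   Ada.Text_IO.Put_Line (To_String (V));
+end Renamed_Operations;
+```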
+
+****************************************************************
+
+From: Robert Eachus
+Sent: Monday, September  9, 2002  3:45 PM
+
+First I would like to note that Nick Roberts sent a summary of the
+position so far, but with the wrong topic.  I hope Randy can get it to
+the right spot.
+
+Next, on the topic of "efficient" implementations of
+Ada.Strings.Unbounded, I am completely at a loss.  Ada.Strings.Bounded
+should be much faster at the cost of "extra" storage space.  If you are
+reading lines from a file and want efficiency, you either have a
+specified (in some cases by the operating system) maximum length on
+lines in a file, or you put in the "extra" overhead to deal with lines
+that are too long for the preset buffer size.
+
+Having Ada.Strings.Unbounded available makes this special case code a
+lot easier to write.   Ada.Strings.Unbounded is also fine for cases
+where you are only reading a few strings and don't want to have to worry
+about string lengths.  A perfect example is for holding the command
+line. (Yes you can put this in a String constant, but that may take an
+extra level of nesting.)  Another example is when you have user input
+from the console, or a text buffer in an HTML script.  In any of these
+cases the overhead of using Ada.Strings.Unbounded will normally be in
+the noise.
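+
+A minimal sketch of the "preset buffer plus overflow handling" case, using
+Ada.Strings.Unbounded to collect a line of any length.  The names
+Echo_Long_Line and Get_Whole_Line and the buffer size of 80 are arbitrary
+choices for illustration.
+
+```ada
+with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
+with Ada.Text_IO;
+
+procedure Echo_Long_Line is
+
+   function Get_Whole_Line return Unbounded_String is
+      Buffer : String (1 .. 80);
+      Last   : Natural;
+      Result : Unbounded_String;
+   begin
+      loop
+         Ada.Text_IO.Get_Line (Buffer, Last);
+         Append (Result, Buffer (1 .. Last));
+         --  A partly filled buffer means the end of the line was reached.
+         exit when Last < Buffer'Last;
+      end loop;
+      return Result;
+   end Get_Whole_Line;
+
+begin
+   Ada.Text_IO.Put_Line (To_String (Get_Whole_Line));
+end Echo_Long_Line;
+```
+
+End_Error propagates if there is no input at all; a real program would decide
+how to handle that.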
+
+Now to respond to Nick's summary:
+
+[K1] The addition of procedure To_Bounded_String to package
+Ada.Strings.Bounded and procedure To_Unbounded_String to package
+Ada.Strings.Unbounded, as proposed.
+
+I don't know why this is needed, but at least it is harmless.
+
+[K2] The addition of packages Ada.Text_IO.Bounded_IO,
+Ada.Wide_Text_IO.Wide_Bounded_IO, Ada.Text_IO.Unbounded_IO, and
+Ada.Wide_Text_IO.Wide_Unbounded_IO, as proposed. (Note, I have
+assumed these facilities would be as useful for bounded strings
+as for unbounded strings.)
+
+Same as above.  But I think that the Bounded versions would be even less
+useful, since they would require another generic instantiation.  (There would
+have to be a generic formal package parameter.)
+
+[K3] The addition of functions Index and Index_Non_Blank, as
+proposed, to Ada.Strings.Bounded and Ada.Strings.Unbounded (only)
+(since they are upwards compatible, and may be useful).
+
+Again, may be useful seems damning with faint praise...
+
+[K4] The additional functions and procedures Replace_Slice,
+Insert, Overwrite, could be retained, since (in the absence of
+the proposed Slice) they are unlikely to cause problems in
+practice.
+
+Even more damning. ;-)
+
+[D1] The additional functions Slice (since they could introduce
+ambiguities in existing code that would cause it to fail to
+compile).
+
+Yes, this should definitely go away.
+
+[NJR1] The proposed Slice functions could simply be renamed to
+Bounded_Slice and Unbounded_Slice, so as to obviate the ambiguity
+problem.
+
+This would work, but I don't see the need.
+
+[NJR2] The procedures To_Bounded_String and To_Unbounded_String
+are renamed Set_Bounded_String and Set_Unbounded_String.
+
+Now I am really confused!  To me the only reason to add these procedures
+that makes any sense is to make it clear that you are creating a new
+Bounded (Unbounded) value.  The Set_ nomenclature hides this, while Foo
+:= To_Unbounded_String(Bar); is a very reasonable way of  showing the
+assignment.
+
+So overall I see NO need to change the Ada.Strings.Bounded and
+Ada.Strings.Unbounded packages.  The child IO packages for
+Ada.Strings.Unbounded are certainly useful.  I don't see any reason to
+add the Ada.Strings.Bounded packages other than symmetry.
+
+****************************************************************
