!standard A.4.3 (68)                                99-08-30  AI95-00128/05
!standard A.4.4(101)
!standard A.4.4(102-105)
!standard A.4.5(86-87)
!standard A.4.3(2,74)
!class binding interpretation 96-10-07
!status Corrigendum 2000 99-08-13
!status WG9 approved (8-0-0) 97-07-04
!status ARG approved 12-0-0 96-10-07
!status work item (letter ballot was 6-4-2) 96-10-03
!status ARG approved 7-0-1 (subject to letter ballot) 96-06-17
!status work item 96-04-17
!status received 96-04-04
!priority High
!difficulty Easy
!qualifier Omission
!subject String Packages

!summary

This AI clarifies minor details of the semantics of some of the
string-manipulation subprograms:

1. Fixed.Find_Token raises Constraint_Error if the value returned for
First is not in Positive.

2. A call to Bounded.Slice with High > Length(Source) raises Index_Error.

3. The functions in Bounded, such as Replace_Slice, are defined in terms
of the corresponding functions in Fixed, and the procedures in Bounded
are defined in terms of the functions in Bounded.

4. A.4.3(2) takes precedence over any indication to the contrary in the
following RM paragraphs.

!question

1. The string packages (e.g., Ada.Strings.Fixed) have a procedure named
Find_Token whose profile is:

    procedure Find_Token (Source : in String;
                          Set    : in Maps.Character_Set;
                          Test   : in Membership;
                          First  : out Positive;
                          Last   : out Natural);

The semantics of this operation states that (RM95-A.4.3(68)) "if no such
slice exists, then the value returned for Last is zero, and the value
returned for First is Source'First."

What happens when Source'First is not in Positive (which can happen only
if Source is a null string)? [It raises Constraint_Error.]

2. The semantics of Bounded.Slice is stated as follows (A.4.4(101)):

"Returns the slice at positions Low through High in the string
represented by Source; propagates Index_Error if Low > Length(Source)+1."

What happens when Low <= Length(Source)+1 and High > Length(Source)?
Should it raise an exception? If so which one?
Or should it return all characters from Low to Length(Source)?
[It raises Index_Error.]

3. The semantics of many subprograms of package Bounded is defined in
terms of the semantics of the corresponding subprograms of package Fixed
(A.4.4(102-105)). The meaning is clear in most cases, except for Head
and Tail. A.4.4(105) says:

"Each of the ... selector subprograms (Trim, Head, Tail) ... has an
effect based on its corresponding subprogram in Strings.Fixed ..."

The procedure Fixed.Head has the following profile:

    procedure Head (Source  : in out String;
                    Count   : in Natural;
                    Justify : in Alignment := Left;
                    Pad     : in Character := Space);

and the procedure Bounded.Head has a rather different profile:

    procedure Head (Source : in out Bounded_String;
                    Count  : in Natural;
                    Pad    : in Character := Space;
                    Drop   : in Truncation := Error);

Because the profiles are different, the "effect based on the
corresponding subprogram" is not very clear.

It is interesting to note that the semantics of the operations of
package Unbounded makes a distinction between functions and procedures
(RM95 A.4.5(86-87)), which clarifies very much the meaning. Is the
intent similar for Bounded?

The issue seems to be broader than Head and Tail: take for instance
procedure Bounded.Replace_Slice. Is it based on the function
Fixed.Replace_Slice, or on the procedure Fixed.Replace_Slice? The effect
is rather different, since the procedure doesn't change the length of
its argument, while the function may return a string of a different
length than its argument.

4. Paragraph A.4.3(2) says:

    2   For each function that returns a String, the lower bound of the
        returned value is 1.

However, A.4.3(73) says:

    73  function Replace_Slice (Source : in String;
                                Low    : in Positive;
                                High   : in Natural;
                                By     : in String)
           return String;

    74  If Low > Source'Last+1, or High < Source'First-1, then
        Index_Error is propagated.
        Otherwise, if High >= Low then the returned string comprises
        Source(Source'First..Low-1) & By & Source(High+1..Source'Last),
        and if High < Low then the returned string is
        Insert(Source, Before=>Low, New_Item=>By).

The lower bounds of the above concatenations give Source'First as the
lower bound, which might not be 1.

Is the lower bound really 1? [Yes.]

!recommendation

(See summary.)

!wording

(See corrigendum.)

!discussion

1. Fixed.Find_Token raises Constraint_Error if the value returned for
First is not in Positive.

Bounded.Find_Token and Unbounded.Find_Token's string argument always has
a lower bound of 1 (by definition), so the question does not apply to
them.

2. A call to Bounded.Slice with High > Length(Source) raises Index_Error.
This is analogous to the normal string slicing operation.

3. The *function* Bounded.Head is defined in terms of the function
Fixed.Head; a call of the function Bounded.Head is equivalent to:

    To_Bounded_String(Fixed.Head(To_String(Source), Count, Pad),
                      Drop => Drop)

The *procedure* Bounded.Head is defined in terms of the *function*
Bounded.Head; a call to the procedure Bounded.Head is equivalent to:

    Source := Head(Source, Count, Pad, Drop);

Corresponding rules apply to Tail.

In general, the functions in Bounded, such as Replace_Slice, are defined
in terms of the corresponding functions in Fixed, and the procedures in
Bounded are defined in terms of the functions in Bounded.

4. Clearly, the intent is that the lower bound should always be 1, as
stated in A.4.3(2). I think a "friendly" reading is that A.4.3(74) is
just telling us the characters of the string (it says "comprises", and
not "is equivalent to"), and is not intended to define the bounds.

A.4.3(2) is therefore interpreted as taking precedence over any
indication to the contrary in the following paragraphs. This applies in
general; A.4.3(2) also takes precedence over A.4.3(78,86) and perhaps
other paragraphs.
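
The rulings in items 2 through 4 can be exercised with a small program.
The following is only a sketch, not part of the AI: the procedure name
AI128_Demo, the instantiation B10 with Max => 10, and the sample strings
are all assumptions chosen for illustration. It prints what the
implementation at hand does, so the results can be compared with the
rulings above.

```ada
with Ada.Text_IO;
with Ada.Strings;         use Ada.Strings;
with Ada.Strings.Fixed;
with Ada.Strings.Bounded;

procedure AI128_Demo is
   --  Hypothetical instantiation; Max => 10 is an arbitrary choice.
   package B10 is new Ada.Strings.Bounded.Generic_Bounded_Length (Max => 10);
   use B10;

   Source : Bounded_String := To_Bounded_String ("abcdef");
begin
   --  Item 2: High > Length (Source) is expected to propagate Index_Error.
   begin
      Ada.Text_IO.Put_Line ("Slice returned: " & Slice (Source, 3, 9));
   exception
      when Index_Error =>
         Ada.Text_IO.Put_Line ("Slice: Index_Error");
   end;

   --  Item 3: the function Bounded.Head expressed through Fixed.Head, and
   --  the procedure Bounded.Head expressed through the function.
   declare
      Via_Fixed : constant Bounded_String :=
        To_Bounded_String
          (Ada.Strings.Fixed.Head (To_String (Source), 8, '*'),
           Drop => Error);
      Via_Function : constant Bounded_String :=
        Head (Source, 8, '*', Error);
   begin
      Head (Source, 8, '*', Error);  --  procedure form, updates Source
      Ada.Text_IO.Put_Line
        ("function matches Fixed form: "
         & Boolean'Image (Via_Function = Via_Fixed));
      Ada.Text_IO.Put_Line
        ("procedure matches function:  "
         & Boolean'Image (Source = Via_Function));
   end;

   --  Item 4: the argument's lower bound is 5, yet A.4.3(2) requires the
   --  result of Replace_Slice to have lower bound 1.
   declare
      Text   : constant String (5 .. 9) := "hello";
      Result : constant String :=
        Ada.Strings.Fixed.Replace_Slice (Text, 6, 7, "EL");
   begin
      Ada.Text_IO.Put_Line
        ("Replace_Slice'First =" & Integer'Image (Result'First));
   end;
end AI128_Demo;
```

Under the interpretation given in this AI, one would expect the Slice
call to report Index_Error, both Head comparisons to be TRUE, and the
reported lower bound to be 1.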
!corrigendum A.04.03(68)

@drepl
@xindent
@dby
@xindent

!corrigendum A.04.03(74)

@drepl
@xindent<If Low @> Source'Last+1, or High < Source'First-1, then
Index_Error is propagated. Otherwise, if High @>= Low then the returned
string comprises Source(Source'First..Low-1) & By &
Source(High+1..Source'Last), and if High < Low then the returned string
is Insert(Source, Before=@>Low, New_Item=@>By).>
@dby
@xindent<If Low @> Source'Last+1, or High < Source'First-1, then
Index_Error is propagated. Otherwise, if High @>= Low then the returned
string comprises Source(Source'First..Low-1) & By &
Source(High+1..Source'Last), but with lower bound 1. Otherwise, if
High < Low then the returned string is
Insert(Source, Before=@>Low, New_Item=@>By).>

!corrigendum A.04.03(86)

@drepl
@xindent
@dby
@xindent

!corrigendum A.04.03(106)

@drepl
@xindent
@dby
@xindent

!corrigendum A.04.04(101)

@drepl
@xindent<Returns the slice at positions Low through High in the string
represented by Source; propagates Index_Error if
Low @> Length(Source)+1.>
@dby
@xindent<Returns the slice at positions Low through High in the string
represented by Source; propagates Index_Error if
Low @> Length(Source)+1 or High @> Length(Source).>

!corrigendum A.04.04(105)

@drepl
Each of the transformation subprograms (Replace_Slice, Insert,
Overwrite, Delete), selector subprograms (Trim, Head, Tail), and
constructor functions ("*") has an effect based on its corresponding
subprogram in Strings.Fixed, and Replicate is based on Fixed."*". For
each of these subprograms, the corresponding fixed-length string
subprogram is applied to the string represented by the Bounded_String
parameter. To_Bounded_String is applied the result string, with Drop (or
Error in the case of Generic_Bounded_Length."*") determining the effect
when the string length exceeds Max_Length.
@dby
Each of the transformation subprograms (Replace_Slice, Insert,
Overwrite, Delete), selector subprograms (Trim, Head, Tail), and
constructor functions ("*") has an effect based on its corresponding
subprogram in Strings.Fixed, and Replicate is based on Fixed."*". For
each of these functions, the corresponding fixed-length string function
is applied to the string represented by the Bounded_String parameter.
To_Bounded_String is applied to the result string, with Drop (or Error
in the case of Generic_Bounded_Length."*") determining the effect when
the string length exceeds Max_Length. For each of these procedures, the
corresponding function in Strings.Bounded.Generic_Bounded_Length is
applied, with the result assigned into the Source parameter.

!ACATS test

1. It would be possible to create a C-Test to test this case, but it
would be of very little value (as the problem can only occur for an
unusual null string).

2. Test cases for this should be added to CXA4019. A test that raises
Index_Error should be added for Bounded.

3. Functions and procedures in Bounded are adequately tested in CXA4007.

4. It would be possible to create a C-Test to check that the lower bound
of all of these functions is 1, but this does not appear to be an
important test.

!appendix

!section A.4.3(68)
!subject Behavior of Find_Token when passed a null string with negative 'First
!reference RM95-A.4.3(68)
!from Pascal Leroy 96-03-22
!reference 96-5446.a Pascal Leroy 96-3-22>>
!discussion

The string packages (e.g., Ada.Strings.Fixed) have a procedure named
Find_Token whose profile is:

    procedure Find_Token (Source : in String;
                          Set    : in Maps.Character_Set;
                          Test   : in Membership;
                          First  : out Positive;
                          Last   : out Natural);

The semantics of this operation states that (RM95-A.4.3(68)) "if no such
slice exists, then the value returned for Last is zero, and the value
returned for First is Source'First."

This definition, together with the profile of Find_Token, have the
unfortunate consequence that, if Find_Token is passed a null string with
a negative lower bound, Find_Token raises Constraint_Error, because
Source'First is not in Positive:

    Null_String : String (-10 .. -20);
    ...
    Find_Token (Source => Null_String, ...);

This is quite inconsistent, because a call to Find_Token with a null
string whose 'First is positive doesn't raise an exception.
_____________________________________________________________________
Pascal Leroy                                    +33.1.30.12.09.68
pleroy@rational.com                             +33.1.30.12.09.66 FAX

****************************************************************

!section A.4.3(68)
!subject Behavior of Find_Token when passed a null string with negative 'First
!reference RM95-A.4.3(68)
!reference 96-5446.a Pascal Leroy 96-3-22
!from Robert Dewar 96-03-23
!reference 96-5448.a Robert Dewar 96-3-23>>
!discussion

I think the constraint error for a negative bound is perfectly fine. Yes,
the language allows such strange null strings, but the normal
representation for a null string should have non-negative bounds, there
is no reason to go outside the index range by more than 1 for a null
string.

****************************************************************

!section A.4.3(68)
!subject Behavior of Find_Token when passed a null string with negative 'First
!reference RM95-A.4.3(68)
!reference: 96-5446.a Pascal Leroy 96-3-22
!from Robert Eachus 96-03-25
!reference 96-5449.a Robert I. Eachus 96-3-25>>
!discussion

Pascal said:

> This definition, together with the profile of Find_Token, have the
> unfortunate consequence that, if Find_Token is passed a null
> string with a negative lower bound, Find_Token raises
> Constraint_Error, because Source'First is not in Positive...

> This is quite inconsistent, because a call to Find_Token with a null string
> whose 'First is positive doesn't raise an exception.

I think this is pretty pathological. Since the index subtype of
String is Positive, you can only have the call succeed--without
raising Constraint_Error--with Source'First negative if you explicitly
create a null string value with weird bounds, and call with that null
string as a parameter.

Normal calls, even normal calls with null slices of ordinary strings,
won't cause any problems.

                                        Robert I. Eachus

with Standard_Disclaimer;
use  Standard_Disclaimer;
function Message (Text: in Clever_Ideas) return Better_Ideas is...
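
The contrast drawn in the messages above, between an explicitly declared
null string with a negative lower bound and an ordinary null slice, can
be tried with a sketch along the following lines. It is not part of the
original correspondence: the names Find_Token_Null_Demo, Try, Blanks,
Odd_Null, and Ordinary are assumptions made for illustration, and the
program simply reports what the implementation at hand does.

```ada
with Ada.Text_IO;
with Ada.Strings;       use Ada.Strings;
with Ada.Strings.Fixed;
with Ada.Strings.Maps;

procedure Find_Token_Null_Demo is
   --  Hypothetical character set; any set would do for a null Source.
   Blanks : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set (' ');

   --  Calls Find_Token and reports either the results or the exception.
   procedure Try (Label : String; Source : String) is
      First : Positive;
      Last  : Natural;
   begin
      Ada.Strings.Fixed.Find_Token (Source, Blanks, Outside, First, Last);
      Ada.Text_IO.Put_Line
        (Label & ": First =" & Positive'Image (First)
         & ", Last =" & Natural'Image (Last));
   exception
      when Constraint_Error =>
         Ada.Text_IO.Put_Line (Label & ": Constraint_Error");
   end Try;

   --  A null string with bounds outside Positive, as in the message above.
   Odd_Null : String (-10 .. -20);

   Ordinary : constant String := "some text";
begin
   Try ("explicit null string, 'First = -10", Odd_Null);
   Try ("null slice of an ordinary string", Ordinary (3 .. 2));
end Find_Token_Null_Demo;
```

Going by A.4.3(68) and the parameter subtype of First, the first call
would be expected to report Constraint_Error, while the second would be
expected to return First = 3 and Last = 0.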
****************************************************************

!section A.4.4(101)
!subject What does Bounded.Slice do when High exceeds the upper bound of Source?
!reference RM95-A.4.4(101)
!from Pascal Leroy 96-03-27
!reference 96-5451.a Pascal Leroy 96-3-27>>
!discussion

The semantics of Bounded.Slice is stated as follows:

"Returns the slice at positions Low through High in the string
represented by Source; propagates Index_Error if Low > Length(Source)+1."

From this definition, I have trouble understanding what Bounded.Slice is
expected to do when Low <= Length(Source)+1 and High > Length(Source).
Should it raise an exception? If so which one? Or should it return all
characters from Low to Length(Source)?

****************************************************************

!section A.4.4(105)
!subject Semantics of Bounded.Head and Bounded.Tail
!reference RM95-A.4.4(105)
!from Pascal Leroy 96-03-27
!reference 96-5451.b Pascal Leroy 96-3-27>>
!discussion

The semantics of many subprograms of package Bounded is defined in terms
of the semantics of the corresponding subprograms of package Fixed (RM95
A.4.4(102-105)). The meaning is clear in most cases, except for Head and
Tail. RM95 A.4.4(105) says:

"Each of the ... selector subprograms (Trim, Head, Tail) ... has an
effect based on its corresponding subprogram in Strings.Fixed ..."

The procedure Fixed.Head has the following profile:

    procedure Head (Source  : in out String;
                    Count   : in Natural;
                    Justify : in Alignment := Left;
                    Pad     : in Character := Space);

and the procedure Bounded.Head has a rather different profile:

    procedure Head (Source : in out Bounded_String;
                    Count  : in Natural;
                    Pad    : in Character := Space;
                    Drop   : in Truncation := Error);

Because the profiles are different, the "effect based on the
corresponding subprogram" is not very clear.
It is interesting to note that the semantics of the operations of
package Unbounded makes a distinction between functions and procedures
(RM95 A.4.5(86-87)), which clarifies very much the meaning. Is the
intent similar for Bounded?

The issue seems to be broader than Head and Tail: take for instance
procedure Bounded.Replace_Slice. Is it based on the function
Fixed.Replace_Slice, or on the procedure Fixed.Replace_Slice? The effect
is rather different, since the procedure doesn't change the length of
its argument, while the function may return a string of a different
length than its argument.

****************************************************************

!section A.4.3(68)
!subject Behavior of Find_Token when passed a null string with negative 'First
!reference RM95-A.4.3(68)
!reference: 96-5446.a Pascal Leroy 96-3-22
!reference: 96-5448.a Robert Dewar 96-3-23
!reference: 96-5449.a Robert I. Eachus 96-3-25
!from Pascal Leroy 96-03-27
!reference 96-5452.a Pascal Leroy 96-3-27>>
!discussion

> I think this is pretty pathological. Since the index subtype of
> String is Positive, you can only have the call succeed--without
> raising Constraint_Error--with Source'First negative if you explicitly
> create a null string value with weird bounds, and call with that null
> string as a parameter.

> I think the constraint error for a negative bound is perfectly fine. Yes,
> the language allows such strange null strings, but the normal representation
> for a null string should have non-negative bounds, there is no reason to go
> outside the index range by more than 1 for a null string.

I must say that I am not convinced by Robert & Robert's answer who both
seem to say: "this is pathological, you deserved it!"

First, I find String (0..-1) rather less pathological than String
(1000..-1000). The former will raise an exception, the latter won't. I
see no reason why.

Second, I find it rather improper that an operation of one of the
predefined packages raises Constraint_Error.
As we all know, you already get Constraint_Error too often when
writing/debugging Ada code. If I must be punished for using pathological
null strings, then I'd rather get Index_Error than Constraint_Error.

Third, what bothers me is that this Constraint_Error seems quite
unintentional in the RM: it is not explicitly stated, but arises because
the parameter subtype is Positive. (I know, speculations about what is
intentional in the RM and what is not are sterile...)

_____________________________________________________________________
Pascal Leroy                                    +33.1.30.12.09.68
pleroy@rational.com                             +33.1.30.12.09.66 FAX

****************************************************************

!section A.4.3(68)
!subject Behavior of Find_Token when passed a null string with negative 'First
!reference RM95-A.4.3(68)
!reference: 96-5446.a Pascal Leroy 96-3-22
!reference: 96-5448.a Robert Dewar 96-3-23
!reference: 96-5449.a Robert I. Eachus 96-3-25
!reference 96-5452.a Pascal Leroy 96-3-27
!from Bob Duff
!reference 96-5457.a Robert A Duff 96-4-8>>
!discussion

> > I think this is pretty pathological. Since the index subtype of
> > String is Positive, you can only have the call succeed--without
> > raising Constraint_Error--with Source'First negative if you explicitly
> > create a null string value with weird bounds, and call with that null
> > string as a parameter.
>
> > I think the constraint error for a negative bound is perfectly fine. Yes,
> > the language allows such strange null strings, but the normal representation
> > for a null string should have non-negative bounds, there is no reason to go
> > outside the index range by more than 1 for a null string.
>
> I must say that I am not convinced by Robert & Robert's answer who both seem
> to say: "this is pathological, you deserved it!"
>
> First, I find String (0..-1) rather less pathological than String
> (1000..-1000). The former will raise an exception, the latter won't. I see
> no reason why.

I agree -- they're *both* pathological.
So I don't care if they raise an exception. If one raises an exception,
and not the other, that seems fine. Furthermore, I don't particularly
care *which* exception is raised.

> Second, I find it rather improper that an operation of one of the predefined
> packages raises Constraint_Error. As we all know, you already get
> Constraint_Error too often when writing/debugging Ada code. If I must be
> punished for using pathological null strings, then I'd rather get Index_Error
> than Constraint_Error.

Why should we care which exception is raised?

> Third, what bothers me is that this Constraint_Error seems quite
> unintentional in the RM: it is not explicitly stated, but arises because
> the parameter subtype is Positive. (I know, speculations about what
> is intentional in the RM and what is not are sterile...)

Yes, I agree, this was an unintentional oversight in the RM. But I don't
see any better idea than to raise C_E in this case. Raising some other
exception seems irrelevant -- any exception in this case indicates a
bug. The only other possibility is to return a well-defined result --
but the RM clearly requires an out-of-range result, so C_E seems
appropriate.

- Bob

****************************************************************

!section A.4.3(02)
!subject Lower bounds should be 1
!reference RM95-A.4.3(02)
!reference RM95-A.4.3(73)
!from Bob Duff
!reference 96-5475.a Robert A Duff 96-4-12>>
!discussion

Robert Dewar pointed out this problem to me.

A.4.3(2) says:

    2   For each function that returns a String, the lower bound of the
        returned value is 1.

However, A.4.3(73) says:

    73  function Replace_Slice (Source : in String;
                                Low    : in Positive;
                                High   : in Natural;
                                By     : in String)
           return String;

    74  If Low > Source'Last+1, or High < Source'First-1, then
        Index_Error is propagated.
        Otherwise, if High >= Low then the returned string comprises
        Source(Source'First..Low-1) & By & Source(High+1..Source'Last),
        and if High < Low then the returned string is
        Insert(Source, Before=>Low, New_Item=>By).

The lower bounds of the above concatenations give Source'First as the
lower bound, which might not be 1.

Clearly, the intent is that the lower bound should always be 1, as
stated in A.4.3(2). I think a "friendly" reading is that A.4.3(74) is
just telling us the characters of the string (it says "comprises", and
not "is equivalent to"), and is not intended to define the bounds.

- Bob

****************************************************************

!section A.4.3(02)
!subject Lower bounds should be 1
!reference RM95-A.4.3(02)
!reference RM95-A.4.3(73)
!reference 96-5475.a Robert A Duff 96-4-12
!from Pascal Leroy 96-04-16
!reference 96-5490.a Pascal Leroy 96-4-16>>
!discussion

> The lower bounds of the above concatenations give Source'First as the
> lower bound, which might not be 1.
>
> Clearly, the intent is that the lower bound should always be 1, as
> stated in A.4.3(2). I think a "friendly" reading is that A.4.3(74) is
> just telling us the characters of the string (it says "comprises", and
> not "is equivalent to"), and is not intended to define the bounds.

Note that there are (at least) two other functions to which this comment
applies:

Insert: RM95 A.4.3(78) says "otherwise returns Source (Source'First ..
Before - 1) & New_Item & Source (Before .. Source'Last)."

Delete: RM95 A.4.3(86) says "otherwise it is Source."

There may be other functions for which the wording may also be
misleading. So I guess that the AI should be worded in a way that makes
it clear that it applies to all functions of package Fixed. (Maybe by
stating that the wording of all paragraphs in RM95 A.4.3 is just
specifying the characters of the string but not its bounds.)
_____________________________________________________________________
Pascal Leroy                                    +33.1.30.12.09.68
pleroy@rational.com                             +33.1.30.12.09.66 FAX

****************************************************************

!section A.4.4(101)
!subject What does Bounded.Slice do when High exceeds the upper bound of Source?
!reference RM95-A.4.4(101)
!reference 96-5451.a Pascal Leroy 96-3-27
!reference AI95-00128/00
!from Keith Thompson 96-07-16
!reference 96-5622.a Keith Thompson 96-7-16>>
!discussion

Pascal Leroy asked what Bounded.Slice does when Low <= Length(Source)+1
and High > Length(Source) (i.e., when the requested slice overlaps the
end of the string).

The current version of AI-00128 says a call with High > Length(Source)
is equivalent to a call with High = Length(Source).

What is the rationale for this? The name Slice implies that the function
is intended to correspond to the predefined slice operation on type
String, which raises an exception for such an overlap.

For what it's worth, at least two existing implementations currently
raise Index_Error in this case.

****************************************************************

!section A.4.3(68)
!subject String Packages
!reference AI95-00128/01
!from Norman Cohen
!reference 96-5689.c Norman H. Cohen 96-9-6>>
!discussion

I am startled by the decision (contradicting at least two existing
implementations) that Slice does not raise an exception when called with
an upper bound that is too high. Perhaps there is a well thought out
justification for this, but if so, it should appear in a !discussion
section.

****************************************************************

!section A.4.5(82)
!subject Lower bound of Unbounded.Slice
!reference RM95-A.4.5(82)
!reference RM95-A.4.4(101)
!reference RM95-A.4.4(1)
!reference RM95-A.4.3(2)
!reference AI95-00128/04
!from Keith Thompson 96-12-03
!reference 96-5777.a Keith Thompson 96-12-3>>
!discussion

Note that this discussion applies equally to Unbounded.Slice and
Bounded.Slice.
(Well, Bounded.Generic_Bounded_Length.Slice if you want to be picky.)

In a discussion on comp.lang.ada, Pascal Obry asked whether
Ada.Strings.Unbounded.Slice (U, 6, 8) should have bounds 6..8 or 1..3.

Bob Duff replied that the correct bounds are 1..3:

> See A.4.5(82), A.4.4(101), A.4.4(1) "whose low bound is 1", A.4.3(2),
> and AI-128. The correct result is 1..3.

I don't find any of those references entirely convincing. Taking the
references one at a time:

A.4.5(82) says that Unbounded.Slice has the same effect as Bounded.Slice.

A.4.4(101) says that Bounded.Slice "returns the slice at positions Low
through High in the string represented by Source". This can be
interpreted to mean that Slice(U, 6, 8) is equivalent to
To_String(U)(6..8), giving bounds of 6..8.

A.4.4(1) says that a Bounded_String represents a String whose low bound
is 1. This is not directly relevant, since Slice returns a String, not a
Bounded_String.

A.4.3(2) says that "For each function that returns a String, the lower
bound of the returned value is 1", but this refers to the package
Ada.Strings.Fixed, which has no Slice function. The equivalent of the
Slice function for fixed strings is an array slice operation, which in
the example above yields bounds of 6..8.

Finally, I found nothing in AI95-00128/04 that states that Slice returns
a result with a lower bound of 1.

Robert Dewar made a similar point on comp.lang.ada.

To conclude, I believe the RM implies (perhaps weakly) that the Slice
function of Bounded and Unbounded returns a result whose lower bound is
not necessarily 1. On the other hand, A.4.3(2) could be taken to imply
that the *intent* was for all functions returning String in all three
packages to return a result with a lower bound of 1. This needs to be
clarified in the next revision of AI-00128.

****************************************************************

!from Randy Brukardt 99-08-17

1.
While the question refers to all versions of Find_Token, the wording
change only applies to Fixed.Find_Token. That's because only it can have
a bound outside of Positive; the others always have a lower bound 1. I
added a sentence to that effect to the discussion.

4. I checked through all of the paragraphs for Ada.Strings.Fixed
operations, and I believe that I gave wording changes for the only ones
for which someone might construe A.4.3(2) to not apply. The discussion
implies that A.4.3(78) needs to be changed, but this paragraph already
includes "but lower bound 1.", so I don't see how anyone could come to
any other conclusion about the lower bound.

****************************************************************
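
The lower-bound question raised for Insert (A.4.3(78)) and Delete
(A.4.3(86)) can be checked in the same spirit. The following is a
sketch only, not part of the AI: the names Lower_Bound_Demo, Show, and
Text are assumptions for illustration. It passes a string whose lower
bound is 5 and prints the lower bound of each result.

```ada
with Ada.Text_IO;
with Ada.Strings.Fixed;

procedure Lower_Bound_Demo is
   --  A source string whose lower bound is deliberately not 1.
   Text : constant String (5 .. 9) := "hello";

   --  Prints a result together with its lower bound; the formal takes
   --  its bounds from the actual, so Result'First is the returned bound.
   procedure Show (Label : String; Result : String) is
   begin
      Ada.Text_IO.Put_Line
        (Label & " => """ & Result & """, 'First ="
         & Integer'Image (Result'First));
   end Show;
begin
   Show ("Insert",
         Ada.Strings.Fixed.Insert (Text, Before => 7, New_Item => "XY"));

   --  From > Through: A.4.3(86) says the result "is Source".
   Show ("Delete (From > Through)",
         Ada.Strings.Fixed.Delete (Text, From => 7, Through => 6));
end Lower_Bound_Demo;
```

If A.4.3(2) takes precedence as this AI concludes, both lines would be
expected to report a lower bound of 1, even though Text'First is 5.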