!standard 4.6(15) 02-12-05 AC95-00048/01 !standard 4.6(24) !standard 4.6(42) !class confirmation 02-12-05 !status received no action 02-08-22 !subject Holey enumeration types and array indexing !summary !appendix From: Michael F. Yoder Sent: Monday, October 21, 2002 10:30 PM I ask the keepers of compilers to try this program and send me the results together with an indication of which compiler produced which result(s). If you can deduce the output without actually running it, that's fine also. :-) More generally, I'd like to know if there are any circumstances in which an existing compiler would use sparse indexing for a gapped enumeration type rather than indexing by the position values. with Ada.Text_IO; use Ada.Text_IO; procedure Enums is type Enum is (Red, Blue, Green); for Enum use (1, 3, 10); type Enum_Array is array (Enum, Enum, Enum) of Integer; begin -- Answer is expected to be 27 or 1000: Put_Line ("Enum_Array'Size/Integer'Size =" & Integer'Image (Enum_Array'Size/Integer'Size)); end Enums; **************************************************************** From: Robert Dewar Sent: Monday, October 21, 2002 10:34 PM Why is this question being asked, what AI is involved here? Michael can you explain what is going on here? As far as I know this is not an ARG issue, if it is, I have missed it. **************************************************************** From: Michael F. Yoder Sent: Monday, October 21, 2002 10:45 PM This is an old issue, it's mentioned in the minutes of several meetings as "holey enumeration types" or something like that, assigned to Mike Kamrad and myself. **************************************************************** From: Randy Brukardt Sent: Monday, October 21, 2002 11:17 PM You'll need to correspond with OCSystems and ICC (Irvine) directly if you want results from them; they don't typically follow this list. I tried the program on Janus/Ada 3.1.2, GNAT 3.15a, and ObjectAda 7.2 on Windows, and got 27 for each of them. Speaking only for Janus/Ada, the position number is used for all operations other than storage of (most) objects. [Temporary objects and for loop indexes are stored as position values, as no representations can be taken of them.] Values are immediately converted to position numbers upon reading (raising Program_Error if no position number is available), and converted to representation values immediately before storage. The optimizer can remove pairs of these operations (in order to avoid these conversions for simple assignments). Thus operations like array indexing, 'Succ, 'Pred, and for loops, do not need to be aware of enumeration representations. And holey arrays cannot be generated. **************************************************************** From: Robert Dewar Sent: Monday, October 21, 2002 11:41 PM OK, I just did not remember seeing this The answer for GNAT is that it uses the compact (pos) form, and the history is interesting. We originally used the sparse form (on the grounds that if youw ant the pos form, you can easily program that, and the opposite is not true). But we found that a lot of legacy VADS code counted on the compact representation, so we changed. **************************************************************** From: Pascal Leroy Sent: Tuesday, October 22, 2002 3:13 AM Rational Apex returns 27 (I believe that's independent of the version/host/target). **************************************************************** From: Robert A. Duff Sent: Tuesday, October 22, 2002 8:04 AM SofCheck's AdaMagic says 27. I tried two targets: the Patriot ewcc and the one that uses C as IL. I imagine other compilers (like Aonix and Green Hills) that use the AdaMagic front end do the same thing. **************************************************************** From: Robert Dewar Sent: Tuesday, October 22, 2002 8:31 AM I would be very surprised if any compiler at this stage did anything other than 27. I think it's the wrong choice myself, but it was universal enough in Ada 83 compilers that any serious compiler will conform to this expectation. **************************************************************** From: Tucker Taft Sent: Tuesday, October 22, 2002 8:57 AM We had "holey" arrays for a while, but trying to do assignment of holey arrays of controlled objects correctly was a holey nightmare. ;-) We also had at least one customer complaint as well; they were expecting compact, rather than holey arrays. **************************************************************** From: Robert Dewar Sent: Tuesday, October 22, 2002 9:13 AM > We had "holey" arrays for a while, but trying to > do assignment of holey arrays of controlled objects > correctly was a holey nightmare. ;-) Well we had everything working right in GNAT, but ... > We also had at least one customer complaint as well; > they were expecting compact, rather than holey arrays. we got more than one such complaint, so we changed :-) **************************************************************** From: Robert A. Duff Sent: Tuesday, October 22, 2002 9:18 AM > We had "holey" arrays for a while, but trying to > do assignment of holey arrays of controlled objects > correctly was a holey nightmare. ;-) Why is that? Is it because it's a pain to avoid initializing and finalizing the "holes"? **************************************************************** From: Tucker Taft Sent: Tuesday, October 22, 2002 11:01 AM I forget all the reasons. One big reason was that we have "data driven" RTS routines for initializing and finalizing arrays, which keep track of which components have been initialized so that only those get finalized. These routines would be significantly more complicated if they had to worry about holey arrays. And of course you could have discriminant-dependent holey arrays, and all kinds of other nightmarish things. At some point I decided that we really could never get them to work properly if they were holey, but I have forgotten what was the real "killer" problem. **************************************************************** From: Michael F. Yoder Sent: Tuesday, October 22, 2002 10:31 PM Oh, it's not fun enough if there isn't a nontrivial Adjust too. My favorite case is a slice assignment within the same array: the source and target needn't be the same size, and even if they are the same size a block transfer of storage can be wrong. So (in the case where you don't know whether there's overlap, or at which end) you must code two loops, one in each direction. If a nontrivial Adjust is present, I think you must use an array temp in the most general case. **************************************************************** From: Robert A. Duff Sent: Tuesday, October 22, 2002 11:04 AM Yeah, you have to be prepared to skip the holes there, too. Not impossible, but it sounds like a lot of work for a marginal feature. (Note that block transfers are also wrong in the case of signed floating-point zero.) >... If a nontrivial Adjust is present, I think > you must use an array temp in the most general case. I don't think you ever *need* a temp. You can always check "if address of left-hand side = address of right-hand side", and skip the whole assignment in that case. By the way, I can't find my copy of the Ada 83 Rationale right now, but I think it says the intent is to use the sparse representation by default, and the dense representation if you have pragma Pack. That's pretty weird, since pragma Pack is also intended to affect the 'Component_Size. And there *was* no 'Component_Size clause in Ada 83, so pragma Pack was the *only* way to affect 'Component_Size. **************************************************************** From: Robert Dewar Sent: Tuesday, October 22, 2002 10:47 AM OK, so that's no big surprise, the general model requires an array temp in any case (remember you can't clobber the target if there are any exceptions, so you generally need a temp if there is anything non-trivial going on). **************************************************************** From: Robert A. Duff Sent: Tuesday, October 22, 2002 11:24 AM In Ada 95, I believe you *can* clobber the target. That was a deliberate change from Ada 83 (for efficiency reasons). But you do need to deal with: X(I) := X(J); where I and J might be equal sometimes. **************************************************************** From: Michael F. Yoder Sent: Tuesday, October 22, 2002 11:35 PM Robert A Duff wrote: >>(Note that block transfers are also wrong in the case of signed >>floating-point zero.) I thought block transfers were OK for signed zeros, and it was block comparison that's wrong. >... If a nontrivial Adjust is present, I think >you must use an array temp in the most general case. > > >I don't think you ever *need* a temp. You can always check "if >address of left-hand side = address of right-hand side", and skip >the whole assignment in that case. I was thinking of the case of partial overlap: say A(E1 .. E5) := A(E2 .. E6). >By the way, I can't find my copy of the Ada 83 Rationale right now, >but I think it says the intent is to use the sparse representation >by default, and the dense representation if you have pragma Pack. >That's pretty weird, since pragma Pack is also intended to affect >the 'Component_Size. And there *was* no 'Component_Size clause in Ada >83, so pragma Pack was the *only* way to affect 'Component_Size. I remember thinking that was odd. But I didn't consider it a bug, because you can always derive Squashed_Enum and give it a compressed representation. An array type using Squashed_Enum in place of Enum is compatible in type conversions. So I agree with Robert Dewar that in the absence of pragmas controlling this aspect of representation, it's technically superior to use sparse indexing. However customers seem to have forced everyone's hand in this regard. :-) **************************************************************** From: Robert A. Duff Sent: Tuesday, October 22, 2002 12:14 PM > I thought block transfers were OK for signed zeros, and it was block > comparison that's wrong. Yes, of course. I wasn't thinking clearly. > I was thinking of the case of partial overlap: say A(E1 .. E5) := A(E2 > .. E6). In that case, if you copy in the appropriate direction (which must be chosen at run time), you still don't need a temp. **************************************************************** From: Michael F. Yoder Sent: Tuesday, October 22, 2002 1:23 PM Are we talking about the same case? I thought that the permissions in 7.6(18-21) didn't extend that far. I thought you had to finalize the target, that is A(E1) through A(E5), before doing the assignment, which meant that A(E2) through A(E5) at least had to be copied out of the way first; and I don't see how to excuse not copying A(E6) too. You're saying that a loop that finalized A(E1), then copied A(E2), then adjusted, then finalized A(E2), then copied A(E3), etc. is legal? **************************************************************** From: Robert A. Duff Sent: Tuesday, October 22, 2002 2:32 PM > Are we talking about the same case? I think so. I'm talking about: A(E1 .. E5) := A(E2 .. E6); where the two slices might overlap. Or two formal parameters: X := Y; where the actuals might be overlapping slices. >... I thought that the permissions in > 7.6(18-21) didn't extend that far. I thought you had to finalize the > target, that is A(E1) through A(E5), before doing the assignment, which > meant that A(E2) through A(E5) at least had to be copied out of the way > first; and I don't see how to excuse not copying A(E6) too. > > You're saying that a loop that finalized A(E1), then copied A(E2), then > adjusted, then finalized A(E2), then copied A(E3), etc. is legal? Yes, it is OK to do the array components one-by-one. The permission is 7.6(20). The AARM annotation seems to be talking about exactly this case. 20 For an assignment_statement for a noncontrolled type, the implementation may finalize and assign each component of the variable separately (rather than finalizing the entire variable and assigning the entire new value) unless a discriminant of the variable is changed by the assignment. 20.a Reason: For example, in a slice assignment, an anonymous object is not necessary if the slice is copied component-by-component in the right direction, since array types are not controlled (although their components may be). Note that the direction, and even the fact that it's a slice assignment, can in general be determined only at run time. It fails to point out that you also have to detect the case where the slices are identical (E1 = E2 in the above example). I.e., you have to check for 7.6(19) at run time, too. Some time after having written these parts of the RM, I worked on implementing this stuff. I remember thinking that it was silly to have defined the semantics in terms of temps, and then allowing to optimize away the temps, because it seemed that avoiding the temps was both simpler and more efficient. Therefore, it might have been better to *require* the avoidance of the temps. Caveat: There is at least one AI related to these paragraphs. I don't remember what it says, but it may have changed some of this. **************************************************************** From: Tucker Taft Sent: Tuesday, October 22, 2002 12:58 PM > By the way, I can't find my copy of the Ada 83 Rationale right now, > but I think it says the intent is to use the sparse representation > by default, and the dense representation if you have pragma Pack. Actually, all it says is that the "simplest way to treat an array indexed by [a holey] enumeration type ... is ... leaving holes ... in the storage used." It doesn't mention pragma Pack, and it doesn't particularly justify the approach except on simplicity grounds. They do admit that the "user should be aware of the hidden storage costs involved" and go on to say that these costs are "preferable to prohibiting the use of [such] types." But they don't compare it against using a contiguous representation, so it is not a very interesting discussion. Since the only argument seems to be "simplicity" then it loses quickly when other things end up making it more complicated or more storage-inefficient. And they imply the ultimate evil would be prohibiting use of such types as indexes, so with a "very holey" enumeration type, using a compact representation would really be the only practical approach. The funny thing about the whole (hole ?) discussion relating to holey enumeration types is that it uses the EBCDIC character type as the canonical example, but in fact, noone in their right mind would use holey enumeration types for an 8-bit character type, in my view. The holeyness justs inserts hidden overhead all over the place, which is the last thing you want when dealing with a character type. It is about at this point that I get up on my high horse and start complaining about holey enumeration types as being the single silliest idea in Ada 83... But I won't ;-). **************************************************************** From: Robert Dewar Sent: Tuesday, October 22, 2002 1:19 PM > The funny thing about the whole (hole ?) discussion relating > to holey enumeration types is that it uses the EBCDIC character > type as the canonical example, but in fact, noone in their > right mind would use holey enumeration types for an 8-bit > character type, in my view. The holeyness justs inserts > hidden overhead all over the place, which is the last > thing you want when dealing with a character type. I disagree, a "holey" representation is natural for EBCDIC, where you have many code positions that do nto represent valid characters. **************************************************************** From: Dan Eilers Sent: Wednesday, October 23, 2002 7:13 PM ICC Ada prints 27. **************************************************************** From: Joyce Tokar Sent: Thursday, October 24, 2002 11:49 AM DDC-I SCORE Ada Compilation System Output is Enum_Array'Size/Integer'Size = 27 **************************************************************** From: Robert Dewar Sent: Thursday, October 24, 2002 4:50 PM So the conclusion on this is that in fact all compilers use the compact form, so it is merely a waste of time to standardize it at this stage, but it would be harmless. ****************************************************************