!standard 7.1(01)                                   03-07-31  AC95-00064/01
!class amendment 03-07-31
!status received no action 03-07-31
!status received 03-01-10
!subject Converting application types to Storage_Array

!summary

!appendix

From: Jeffrey Carter
Sent: Friday, January 10, 2003  8:01 PM

This seems to be an area where I would like to see an AI. However, it is possible that it is already addressed by the language and I've missed it. I did not find anything about this in the AIs.

There is a need for a language-defined mechanism to efficiently (unchecked) convert an arbitrary value of any type to (an appropriate subtype of) Storage_Array without copying the value and in such a way that attributes such as 'First, 'Last, 'Length, and 'range are valid for the result.

Especially in applications that deal with custom hardware, it is desirable to be able to write operations such as

    procedure P (Data : in Storage_Array);

and use attributes of Data to perform the operation. The actual values that will be passed to P may be quite large. Hard real-time systems often have both time and memory constraints that disallow copying of the value in order to attach bounds information to it. Depending on an implementation's scheme for connecting bounds information to an array value, Unchecked_Conversion or an overlay using an address clause may not provide valid bounds information, preventing the use of normal array attributes.

This leads to systems that use implementation-dependent characteristics, which are not portable to other compilers, or that define such operations using an address and size:

    procedure P (Addr : in Address; Size : in Positive);

with all the opportunities for error that this approach is infamous for.

We need a mechanism which is portable (therefore language defined), and that guarantees both that the value is not copied and that normal array attributes may be safely used.
If the mechanism requires that a subtype of Storage_Array with appropriate bounds be declared, an attribute that gives the actual size of an object or parameter in storage elements would be very useful, as the calculation of this value from 'Size is repetitive and error prone, and 'Max_Size_In_Storage_Elements may not be the correct value.

Does the language define such a mechanism, or should I prepare an AI? If it already exists, the ARM should explicitly state it in 13.7.1. This is something essential for Ada's intended application area and that a lot of people have missed.

****************************************************************

From: Randy Brukardt
Sent: Friday, January 10, 2003  9:01 PM

> There is a need for a language-defined mechanism to efficiently
> (unchecked) convert an arbitrary value of any type to (an appropriate
> subtype of) Storage_Array without copying the value and in such a way
> that attributes such as 'First, 'Last, 'Length, and 'range
> are valid for the result.

I'd say: "Use the slice, Jeff, use the slice!"

I presume your point is that using Storage_IO to do this implies a copy, which can be a problem in some circumstances.

When we've had to do stuff like this in Claw (although there we're usually trying to get at a C string of some sort), we've generally used a combination of unchecked conversions on accesses to the objects in question, along with slices of the result.

This would look something like:

    use System.Storage_Elements;
    type Blob ... -- The source type of interest. Better have a
                  -- contiguous representation.

    -- Convert to storage array:
    type Blob_Access is access all Blob;
    subtype Max_Storage_Array is Storage_Array (1 .. Storage_Offset'Last);
    type SA_Access is access all Max_Storage_Array;
       -- We use a maximum sized statically constrained Storage_Array here,
       -- because not all implementations use 'bare' pointers for dynamic arrays.
    function Convert_Access is new Unchecked_Conversion (Source => Blob_Access, Target => SA_Access);
       -- A "safe" Unchecked_Conversion; it has to work without making your
       -- program erroneous.

    -- Here's the conversion:
    ... Convert_Access(Blob_Object'Unchecked_Access).all(1 ..
           (Blob'Size+Storage_Element'Size-1)/Storage_Element'Size) ...

This conversion will copy nothing but one pointer and (possibly) a slice descriptor; the generated code is very little, even though there is a lot of text.

Unfortunately, you can't write a function to do this, because its parameters must be of mode in, and thus you can only have access-to-constants of them. (And, I suppose, the function would be at risk of copying its result anyway).

This is unsafe only if you calculate the size of the type incorrectly (and thus access beyond the end of the object). That's not likely (outside of compiler bugs).

The real danger is if Blob is a type that can change size (that is, it is a mutable type), or if the compiler has represented it in a non-contiguous fashion. For a compiler like Janus/Ada, getting a contiguous object requires being pretty careful. If the object is non-contiguous, there isn't any way to get a simple Storage_Array from it without copying, simply because pieces of the object are in different places. That's why Storage_IO implies copying.

The point is that what you want isn't possible in general. And it's easy to get when it is possible (although I don't see why you'd want to go to Storage_Array -- generally it makes more sense to go to some application-specific type or to Stream_Array). There might be some value in having an attribute which would tell you if a particular object is contiguously represented, because then at least you could bullet-proof the program:

    if not Blob'Contiguous then
        raise Program_Error; -- Hey, we assume that!!
    end if;

but that's about all that I can see that is possible. (Janus/Ada has such an attribute, called 'Is_Simple, but we've never documented it.
That's odd, because we do document the distinction between simple and non-simple types. We use it a lot in the runtime, especially in generic units like Sequential_IO.)

****************************************************************

From: Pascal Leroy
Sent: Saturday, January 11, 2003  2:59 AM

I fully agree with Randy's analysis.

****************************************************************

From: Jeffrey Carter
Sent: Saturday, January 11, 2003  6:03 PM

...
> I presume your point is that using Storage_IO to do this implies a
> copy, which can be a problem in some circumstances.

Yes.

> -- Here's the conversion: ...
> Convert_Access(Blob_Object'Unchecked_Access).all(1 ..
> (Blob'Size+Storage_Element'Size-1)/Storage_Element'Size) ...
>
> This conversion will copy nothing but one pointer and (possibly) a
> slice descriptor; the generated code is very little, even though
> there is a lot of text.

OK. It seems like a lot of work for something that is so often needed in Ada's intended application area. The object must be aliased, which seems like an unnecessary restriction on the rest of the system.

The number of times I've seen people (including me) make errors calculating the number of storage elements makes me want an attribute that provides this value. In fact, I'd say your example should use Blob_Object'Size to calculate this value, rather than Blob'Size, since the two can differ. I'd also use Storage_Unit rather than Storage_Element'Size, since the two are defined to be the same and the former is shorter. Being able to say something like

    (1 .. Blob_Object'Size_In_Storage_Elements)

would eliminate this source of errors.

> This is unsafe only if you calculate the size of the type incorrectly
> (and thus access beyond the end of the object). That's not likely
> (outside of compiler bugs).

In practice, I've found it fairly common.
>
> The real danger is if Blob is a type that can change size (that is,
> it is a mutable type), or if the compiler has represented it in a
> non-contiguous fashion. For a compiler like Janus/Ada, getting a
> contiguous object requires being pretty careful. If the object is
> non-contiguous, there isn't any way to get a simple Storage_Array
> from it without copying, simply because pieces of the object are in
> different places. That's why Storage_IO implies copying.
>
> The point is that what you want isn't possible in general. And it's
> easy to get when it is possible (although I don't see why you'd want
> to go to Storage_Array -- generally it makes more sense to go to some
> application-specific type or to Stream_Array). There might be some
> value in having an attribute which would tell you if a particular
> object is contiguously represented, because then at least you could
> bullet-proof the program:

I've seen lots of projects where this kind of thing was needed, and copying was not acceptable. One area is sticking bits into (or getting them from) some hardware device, and letting the rest of the application deal with the data as one of a large number of large types. There are also applications that need to add error-correction bits and interleave bits from multiple bytes to data that internally may be one of a number of different types. Copying is usually more acceptable in such cases, but the need to deal with lots of different types as a sequence of bytes remains. These internal types are usually carefully laid out at the bit level to be contiguous, so the problem of contiguousitude doesn't arise.

I've always wondered about having both Stream_Element and Storage_Element; I'd think that systems where stream and storage elements were different would be fairly rare. If an application-specific type were used, it would generally be equivalent to Storage_Array, so using Storage_Array seems like the right thing to do.
At least the language allows a way to achieve this, even if it's not very obvious. Is there any way to put an example in the ARM to point out this approach to all the dummies like me and the people I've worked with?

Does anyone else prefer a different approach?

****************************************************************

From: Randy Brukardt
Sent: Saturday, January 11, 2003  8:57 PM

>...
> In fact, I'd say your example should use Blob_Object'Size to calculate
> this value, rather than Blob'Size, since the two can differ.

Yes, that's right, and what I meant.

> I'd also use Storage_Unit rather than Storage_Element'Size, since the two
> are defined to be the same and the former is shorter.

It's defined in a different place, and using it makes the reader figure out that they're the same. Not a big deal; I much prefer either to simply using '8' which a lot of people do (and they leave out the rounding, too); that will usually work, until someone wants to compile the program on the U2200 or something like that.

> Being able to say something like
>
> (1 .. Blob_Object'Size_In_Storage_Elements)
>
> would eliminate this source of errors.

Does any compiler provide this now? It's simple enough to do that if there was any demand, it must have been implemented by now. (Geez, I'm starting to sound like Robert! :-)

>...
> I've always wondered about having both Stream_Element and
> Storage_Element; I'd think that systems where stream and storage
> elements were different would be fairly rare.

Janus/Ada 95 for the U2200 has these different. The machine has direct accessibility to 36-bit words only, while I/O is done in 9-bit bytes. So Storage_Unit = 36 and Stream_Element_Size = 9.

> If an application-specific type were used, it would generally be equivalent
> to Storage_Array, so using Storage_Array seems like the right thing to do.

Not really. Every time I've had to do this, the target type was Interfaces.C.Char_Array.
In practice, this is the same as Storage_Array, but of course Ada is strongly typed, and we needed to use To_Ada to get a real Ada string out of it in the end.

> At least the language allows a way to achieve this, even if it's not
> very obvious. Is there any way to put an example in the ARM to point out
> this approach to all the dummies like me and the people I've worked with?

This really doesn't have much to do with the ARM (which is a language definition, not a language usage guide). There are a lot of Ada idioms that aren't in the ARM. Wasn't somebody making a web page to collect Ada idioms??

****************************************************************

From: Jeffrey Carter
Sent: Sunday, January 12, 2003  8:27 PM

...
>>(1 .. Blob_Object'Size_In_Storage_Elements)
>>
>>would eliminate this source of errors.
>
> Does any compiler provide this now? It's simple enough to do that if there
> was any demand, it must have been implemented by now. (Geez, I'm starting to
> sound like Robert! :-)

Not that I know of, though I've wanted it on many occasions. Sounding like Robert, GNAT has 'Object_Size, but it's in bits, and applies to a type.

>>If an application-specific type were used, it would generally be equivalent
>>to Storage_Array, so using Storage_Array seems like the right thing to do.
>
> Not really. Every time I've had to do this, the target type was
> Interfaces.C.Char_Array. In practice, this is the same as Storage_Array, but
> of course Ada is strongly typed, and we needed to use To_Ada to get a real
> Ada string out of it in the end.

I was referring to the embedded applications I've been involved with. Apparently your experience has been different.

****************************************************************

From: Robert Dewar
Sent: Sunday, January 12, 2003  8:49 PM

    (1 .. Blob_Object'Size_In_Storage_Elements)

Seems silly when Blob_Object'Size / Storage_Unit will give same result.
****************************************************************

From: Jeffrey Carter
Sent: Monday, January 13, 2003  12:46 PM

Randy and I both seem to feel that

    (Blob_Object'Size + Storage_Unit - 1) / Storage_Unit

is needed to round up to the next multiple of Storage_Unit if Blob_Object'Size is not such a multiple. It would be nice if your version were correct for all platforms and compilers. Is that the case?

****************************************************************

From: Robert Dewar
Sent: Monday, January 13, 2003  3:35 PM

Not for an object size. I can't imagine any sensible compiler allocating a stand alone object in memory that does not occupy an integral number of storage units. That's certainly the case with GNAT (nothing else would make sense).

****************************************************************

From: Robert A. Duff
Sent: Monday, January 13, 2003  2:34 PM

In my experience, if I want to know the number of storage units, then I'm doing something that requires the size to be an integer number of storage units. So I usually write something like this:

    pragma Assert(Blah'Size mod System.Storage_Unit = 0);
    ... Blah'Size / System.Storage_Unit ...

In many cases, there's no point in rounding up -- that answer is just as wrong as any other.

****************************************************************

From: Pascal Leroy
Sent: Monday, January 13, 2003  4:03 AM

Agreed. In fact the rounding approach is weird because it assumes that the object starts on a storage unit boundary but might end in the middle of a storage unit. But then the object might as well start and end in the middle of a storage unit, in which case the rounding would be entirely wrong -- think of a 2-bit object straddling two storage units.

This leads me to the conclusion that Bob's assertion should really be:

    pragma Assert(Blah'Alignment /= 0 and Blah'Size mod System.Storage_Unit = 0);

****************************************************************
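
The idiom discussed in this thread can be put together as one small program. What follows is only a minimal sketch, not a definitive implementation. It assumes that Blob is represented contiguously and that the compiler accepts the maximum-sized Max_Storage_Array subtype from the thread (an implementation that rejects it would need a smaller upper bound). The names Storage_View_Demo and Word, the components of Blob, and the body of P are hypothetical and appear only for illustration. The element count uses the assertion form suggested above (including the 'Alignment check) rather than rounding up.

```ada
with Ada.Text_IO;
with Ada.Unchecked_Conversion;
with System;
with System.Storage_Elements; use System.Storage_Elements;

procedure Storage_View_Demo is

   --  Hypothetical source type. Four 32-bit components are assumed to be
   --  laid out contiguously; nothing in the language guarantees this.
   type Word is mod 2 ** 32;
   type Blob is record
      A, B, C, D : Word;
   end record;

   type Blob_Access is access all Blob;

   --  Maximum-sized, statically constrained array, as in the thread, so
   --  that the access value does not need separate bounds information.
   subtype Max_Storage_Array is Storage_Array (1 .. Storage_Offset'Last);
   type SA_Access is access all Max_Storage_Array;

   function Convert_Access is new Ada.Unchecked_Conversion
     (Source => Blob_Access, Target => SA_Access);

   --  The object must be aliased so that 'Unchecked_Access is allowed.
   Blob_Object : aliased Blob := (A => 1, B => 2, C => 3, D => 4);

   --  Check, rather than assume, that the object is aligned on a storage
   --  unit and occupies a whole number of storage units.
   pragma Assert (Blob_Object'Alignment /= 0
                  and Blob_Object'Size mod System.Storage_Unit = 0);

   Count : constant Storage_Count :=
     Blob_Object'Size / System.Storage_Unit;

   --  Hypothetical consumer that relies only on the array attributes.
   procedure P (Data : in Storage_Array) is
   begin
      Ada.Text_IO.Put_Line
        ("First  =" & Storage_Offset'Image (Data'First));
      Ada.Text_IO.Put_Line
        ("Last   =" & Storage_Offset'Image (Data'Last));
      Ada.Text_IO.Put_Line
        ("Length =" & Storage_Offset'Image (Data'Length));
   end P;

begin
   --  Only the access value and (possibly) a slice descriptor are built
   --  here; the contents of Blob_Object are not copied by the conversion.
   P (Convert_Access (Blob_Object'Unchecked_Access).all (1 .. Count));
end Storage_View_Demo;
```

Inside P, Data'First is 1 and Data'Last equals Count because the actual parameter is a slice with those bounds. Whether the assertion is actually checked depends on the assertion policy or compiler switches in effect.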