!standard B.3.1(16)                                   17-09-05  AC95-00291/00
!class Amendment 17-09-05
!status received no action 17-09-05
!status received 17-07-10
!subject Proposal for new function in Interfaces.C.Strings
!summary

!appendix

!topic Proposal for new function in Interfaces.C.Strings
!reference Ada 2012 RMB.3.1
!from Victor Porton 17-07-11
!keywords C interface
!discussion

I propose to add the following new function to Interfaces.C.Strings:

   function Value_With_Possible_NULs(
      Item : in chars_ptr; Length : in size_t) return String;

This function should return the Ada string corresponding to the sequence of C
chars starting at Item and of length Length. Note that (as it is clear from the
function name) this string may contain NUL characters.

Also: If Item = Null_Ptr, then Value_With_Possible_NULs propagates
Dereference_Error.

It is possible to implement this function using Interfaces.C.Pointers package,
but it is not feasible to implement it anew in every program or library which
needs it. So I propose to add this function to Interfaces.C.Strings for
everybody to have easy access to this function.

Function name is debatable.

***************************************************************

From: Randy Brukardt
Sent: Monday, July 10, 2017  8:15 PM

> I propose to add the following new function to Interfaces.C.Strings:
>
> function Value_With_Possible_NULs(
>    Item : in chars_ptr; Length : in size_t) return String;

As with all enhancement requests for Ada, you need a problem statement. As
described in the forthcoming announcement:

"For enhancement requests, it is very important to describe the programming
problem and why the Ada 2012 solution is complex, expensive, or impossible. A
detailed description of a specific enhancement is welcome but not necessarily
required. The goal of the ARG is to solve as many programming problems as
possible with new/enhanced Ada features that fit into the existing Ada framework.
Thus the ARG will be looking at the language as a whole, which may suggest
alternative solutions to the problem."

It's also important to note that the priority that is given to enhancements
comes from a judgment of the importance of the problem; if we have to guess what
the problem is, we might very well get that wrong.

***************************************************************

From: Victor Porton
Sent: Tuesday, July 11, 2017  7:49 AM

This function is necessary for converting C strings which may contain NUL (and
whose length is determined by a size_t parameter).

Such strings may appear, for example, when reading a disk file or receiving
information from the network. It is commonplace for such strings with NULs to
exist.

Implementing this function (using Interfaces.C.Pointers) requires repeated work
in every program and library which needs it. Also, it is not obvious that it
can be implemented with Interfaces.C.Pointers.

Also, an implementation that knows the compiler internals would probably be
faster than one using only the current Ada standard.

***************************************************************

From: Tucker Taft
Sent: Tuesday, July 11, 2017  1:40 PM

Have you checked whether Interfaces.C.To_Ada(..., Trim_Nul => False) can
accomplish what you need?

***************************************************************

From: Randy Brukardt
Sent: Tuesday, July 11, 2017  4:34 PM

To expand on Tuck's response a bit:

The routines in Interfaces.C.Strings are necessary for strings that are NUL
terminated, as such strings require a scan in order to determine the length.
But for a string with an explicit length, no such routine is needed -- you
already know the length and can use a slice to get the correct length. Thus,
the only requirement is to get from a char_array to a String, and that's what
Interfaces.C.To_Ada is for.

This is how we handled this in Claw; we never used Interfaces.C.Strings at all.
Admittedly, we didn't care much about the form of the low-level bindings because
clients were expected only to use the higher-level interface.

This does mean that one shouldn't use Chars_Ptr with a C string of explicit
length, and perhaps the Standard should be clearer about that (when it talks
about "C-Style" strings, it is talking only about nul-terminated strings). One
would need a very different package if it was intending to support strings that
also included nuls (all of the operations would need counterparts that don't
take nuls into account).

I'd have to see your exact problem in order to demonstrate how I'd handle this
(the Win32 APIs all require a buffer passed in, which is then sliced to get the
appropriate result; your C routines may operate differently).

***************************************************************

From: Joey Fish
Sent: Tuesday, July 11, 2017  4:55 PM

The problem with "Interfaces.C.To_Ada(..., Trim_Nul => False)" is that it
terminates at the first null; the Trim_Nul indicates trimming off the terminal
null. (The parameter does not impact the reading of "internal null" values.)
-- The problem, as stated, is the passing of data which may itself contain NUL.

The Value function [in GNAT] is as follows:

   function Value (Item : chars_ptr; Length : size_t) return char_array is
   begin
      if Item = Null_Ptr then
         raise Dereference_Error;
      end if;

      --  ACATS cxb3010 checks that Constraint_Error gets raised when Length
      --  is 0. Seems better to check that Length is not null before declaring
      --  an array with size_t bounds of 0 .. Length - 1 anyway.

      if Length = 0 then
         raise Constraint_Error;
      end if;

      declare
         Result : char_array (0 .. Length - 1);

      begin
         for J in Result'Range loop
            Result (J) := Peek (Item + J);

-->         if Result (J) = nul then
-->            return Result (0 .. J);
-->         end if;
         end loop;

         return Result;
      end;
   end Value;

The marked portion here shows that a NUL inside the data terminates the
processing, leaving some data unprocessed.
Thus, to address the problem the original poster posited, B.3.1 (The Package
Interfaces.C.Strings) should have Value extended by a parameter allowing the
indicated portion to be bypassed.

***************************************************************

From: Victor Porton
Sent: Tuesday, July 11, 2017  4:28 PM

Interfaces.C.To_Ada requests a char_array as an argument.

So we need to construct a char_array from chars_ptr first.

But to get a char_array we first need to use Interfaces.C.Pointers.

It is valid to use To_Ada, but this function requires first calling
Interfaces.C.Pointers.Value.

My actual code does contain C_Ada (but not only it):

with Interfaces.C; use Interfaces.C;
with Interfaces.C.Pointers;

package RDF.Auxiliary.C_Pointers is
   new Interfaces.C.Pointers(Index              => size_t,
                             Element            => char,
                             Element_Array      => char_array,
                             Default_Terminator => nul);

   function Value_With_Possible_NULs
     (Item: RDF.Auxiliary.C_Pointers.Pointer; Length: size_t) return String is
   begin
      return To_Ada(RDF.Auxiliary.C_Pointers.Value
                      (Item, Interfaces.C.ptrdiff_t(Length)),
                    Trim_Nul=>False);
   end;

(Note that the code above uses RDF.Auxiliary.C_Pointers.Pointer instead of
chars_ptr as proposed, but this can be easily solved using unchecked
conversion),

***************************************************************

From: Tucker Taft
Sent: Tuesday, July 11, 2017  9:19 PM

> The problem with "Interfaces.C.To_Ada(..., Trim_Nul => False)" is that it
> terminates at the first null; the Trim_Nul indicates trimming off the
> terminal null. (The parameter does not impact the reading of "internal null"
> values.) -- The problem, as stated, is the passing of data which may itself
> contain NUL.

Actually, the semantics of To_Ada with Trim_Nul => False is that NULs are
treated just like regular characters, which I believe is what you want. Here
is the RM description:

"...
For procedure To_Ada, each element of Item (if Trim_Nul is False) ... is
converted (via the To_Ada function) to a Character, which is assigned to the
corresponding element of Target. Count is set to the number of Target elements
assigned. If Target is not long enough, Constraint_Error is propagated. ..."

As it says, *each* element of Item. Perhaps these should have been two
different routines, because I admit the wording is a bit confusing. But if you
ignore the part that is associated with the “Trim_Nul => True” case, then To_Ada
is a simple routine that treats NUL just like a regular character. In fact, in
99 44/100% of Ada compilers for machines that are byte addressable, a char_array
and an array of Characters are the very same thing. So you could just use
Unchecked_Conversion. But that admittedly isn’t going to look as portable.

***************************************************************

From: Randy Brukardt
Sent: Tuesday, July 11, 2017  10:49 PM

> The problem with "Interfaces.C.To_Ada(..., Trim_Nul => False)" is that
> it terminates at the first null; the Trim_Nul indicates trimming off
> the terminal null. (The parameter does not impact the reading of
> "internal null" values.) -- The problem, as stated, is the passing of
> data which may itself contain NUL.

That's not true at all, the definition in B.3(31) makes that clear as Tucker
reported.

> The Value function [in GNAT] is as follows:

Which has what to do with Interfaces.C.To_Ada???

> The underlined/bold portion here shows that a NUL inside the data terminates
> the processing, leaving some data unprocessed.

For Interfaces.C.Strings, which is only for processing nul-terminated strings.
If you use a hammer to turn a screw, do you complain to Craftsman that the
hammer doesn't work???

***************************************************************

From: Randy Brukardt
Sent: Tuesday, July 11, 2017  11:22 PM

> Interfaces.C.To_Ada requests a char_array as an argument.
>
> So we need to construct a char_array from chars_ptr first.

Why? There is no reason to use a chars_ptr at all (it is a pointer to a
nul-terminated string, and you don't have that).

> But to get a char_array we need first use Interfaces.C.Pointers.

IMHO, there is no reason to ever use Interfaces.C.Pointers. It's a complete
waste of time. (Janus/Ada failed to implement it, mostly by mistake, and so far
as I recall no one ever complained.) Just declare the proper types for your
interface in the first place and you're fine. In this case, you want a C
convention access to a constrained char_array. And then you can slice that in
your thicker binding. Easy peasy. ;-)

> It is valid to use To_Ada, but this function requires first calling
> Interfaces.C.Pointers.Value.
>
> My actual code does contain C_Ada (but not only it):
>
> with Interfaces.C; use Interfaces.C;
> with Interfaces.C.Pointers;
>
> package RDF.Auxiliary.C_Pointers is
>    new Interfaces.C.Pointers(Index              => size_t,
>                              Element            => char,
>                              Element_Array      => char_array,
>                              Default_Terminator => nul);
>    function Value_With_Possible_NULs (Item: RDF.Auxiliary.C_Pointers.Pointer;
> Length: size_t) return String is
>    begin
>       return To_Ada(RDF.Auxiliary.C_Pointers.Value(Item,
> Interfaces.C.ptrdiff_t(Length)), Trim_Nul=>False);
>    end;
>
> (Note that the code above uses
> RDF.Auxiliary.C_Pointers.Pointer instead of chars_ptr as proposed, but
> this can be easily solved using unchecked conversion),

I suppose this works, but it seems like using a bazooka to kill a mosquito.

Remember that C is weakly typed, so it doesn't differentiate between a
nul-terminated string and a string with an explicit length. But they should be
different types in Ada; the semantics are substantially different.
Here's how I would deal with such a situation in Win32:

   MAX_Name : constant := 100;
   type Acc_Name is access all Interfaces.C.Char_Array (0 .. MAX_NAME)
      with Convention => C;

   function Get_Name (H : in Handle; Buffer : in Acc_Name;
                      Length : in out Size_T) return D_Word
      with Import => True, Convention => C, External_Name => "GetNameA";

   function Get_Name_of_Handle (H : in Handle) return String is
      Result : D_Word;
      Name_Buffer : aliased Interfaces.C.Char_Array (0 .. MAX_NAME);
      Blen : Size_T := MAX_NAME;
   begin
      Result := Get_Name (H, Name_Buffer'Access, Blen);
      if Result /= 0 then
         Raise_Windows_Error;
      else
         return Interfaces.C.To_Ada (Name_Buffer (0 .. Blen),
                                     Trim_Nul => False);
      end if;
   end Get_Name_of_Handle;

In Win32, you always pass in a buffer and a length, so you always know that the
result will be shorter than that. And you then just slice the result. If you
instead get a pointer allocated by the called routine, it's a bit more complex
but the general idea is the same.

In any of these schemes (including using Interfaces.C.Strings), we're assuming
some sort of all-Ada wrapper subprogram. So it is fine for there to be some
complications so long as those are restricted to the wrapper (clients should be
able to use normal Ada types without complication).

***************************************************************

From: Victor Porton
Sent: Tuesday, July 11, 2017  5:10 PM

The problem with To_Ada is that it requires char_array, not chars_ptr.

For this reason it requires a wrapper. My proposal is exactly to add this
wrapper.

You may create a new package if you want.

We DO need C strings with explicit length when interfacing with some C
libraries. For example a file downloaded from the net may have NUL bytes in it.

It is not wise to say "don't use C strings with explicit length", because it IS
needed when interfacing with some particular libraries. And Interfaces.C is
exactly for this, that is for interfacing with C libraries.
My exact problem:

When interfacing with both the I/O streams

http://librdf.org/raptor/api/raptor2-section-iostream.html

and downloading from the net

http://librdf.org/raptor/api/raptor2-section-www.html

of the Raptor C library, I must be sure that my code works correctly if the
downloaded file contains NUL bytes. Doing otherwise would be contrary to
reliability (it would mean that my code does the wrong thing when it encounters
a NUL byte)!

For this reason I implemented my own version of the Value_With_Possible_NULs
function using Interfaces.C.Pointers.

It is not a good idea to repeat this in each and every library which needs to
interface with C.

***************************************************************
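
A minimal sketch of the approach suggested in the replies above: slice a
char_array to its explicit length and convert it with Interfaces.C.To_Ada and
Trim_Nul => False, so that embedded nul characters are kept. The procedure name
Embedded_Nul_Demo, the buffer contents, and the Length value are assumptions
made for illustration; they do not come from the thread or from any real C
library.

```ada
with Ada.Text_IO;
with Interfaces.C; use Interfaces.C;

procedure Embedded_Nul_Demo is
   --  Hypothetical stand-in for a buffer filled by a C routine that also
   --  reports an explicit length. The data contains an embedded nul.
   Buffer : constant char_array (0 .. 5) :=
     (To_C ('a'), To_C ('b'), nul, To_C ('c'), To_C ('d'), nul);
   Length : constant size_t := 5;  --  assumed count of meaningful elements

   --  Slice to the explicit length, then convert every element. With
   --  Trim_Nul => False, nul is converted like any other character.
   Result : constant String :=
     To_Ada (Buffer (0 .. Length - 1), Trim_Nul => False);
begin
   Ada.Text_IO.Put_Line ("Length:" & Integer'Image (Result'Length));
   Ada.Text_IO.Put_Line
     ("Code of third character:" & Integer'Image (Character'Pos (Result (3))));
end Embedded_Nul_Demo;
```

Here the result has five characters and the third one is the nul character, so
nothing after the embedded nul is lost. The sketch starts from a char_array; a
binding that only has a pointer still needs some way to view the pointed-to
memory as a char_array of the known length, which is the part the thread
debates.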