Version 1.2 of acs/ac-00291.txt


!standard B.3.1(16)          17-09-05 AC95-00291/00
!class Amendment 17-09-05
!status received no action 17-09-05
!status received 17-07-10
!subject Proposal for new function in Interfaces.C.Strings
!summary
!appendix

!topic Proposal for new function in Interfaces.C.Strings
!reference Ada 2012 RM B.3.1
!from Victor Porton 17-07-11
!keywords C interface
!discussion

I propose to add the following new function to Interfaces.C.Strings:

function Value_With_Possible_NULs(
  Item : in chars_ptr; Length : in size_t) return String;

This function should return the Ada string corresponding to the
sequence of C chars starting at Item and of length Length.

Note that (as is clear from the function name) this string may
contain NUL characters.

Also: If Item = Null_Ptr, then Value_With_Possible_NULs propagates
Dereference_Error.

It is possible to implement this function using the Interfaces.C.Pointers
package, but it is not practical to reimplement it in every program
or library that needs it. So I propose to add this function to
Interfaces.C.Strings so that everybody has easy access to it.

The function name is debatable.

***************************************************************

From: Randy Brukardt
Sent: Monday, July 10, 2017  8:15 PM

> I propose to add the following new function to Interfaces.C.Strings:
> 
> function Value_With_Possible_NULs(
>   Item : in chars_ptr; Length : in size_t) return String;

As with all enhancement requests for Ada, you need a problem statement. As
described in the forthcoming announcement:

"For enhancement requests, it is very important to describe the programming
problem and why the Ada 2012 solution is complex, expensive, or impossible.
A detailed description of a specific enhancement is welcome but not
necessarily required. The goal of the ARG is to solve as many programming
problems as possible with new/enhanced Ada features that fit into the existing
Ada framework. Thus the ARG will be looking at the language as a whole, which
may suggest alternative solutions to the problem."

It's also important to note that the priority that is given to enhancements
comes from a judgment of the importance of the problem; if we have to guess
what the problem is, we might very well get that wrong.

***************************************************************

From: Victor Porton
Sent: Tuesday, July 11, 2017  7:49 AM

This function is necessary for converting C strings which may contain NUL (and
whose length is determined by a size_t parameter). Such strings may appear, for
example, when reading a disk file or receiving information from a network. It
is commonplace for such strings to contain NULs.

Implementing this function (using Interfaces.C.Pointers) requires repeated work
in every program and library which needs it. Also, it is not obvious that it
can be implemented with Interfaces.C.Pointers.

Also, an implementation that knows the compiler internals could probably be
faster than one written using only the current Ada standard.

***************************************************************

From: Tucker Taft
Sent: Tuesday, July 11, 2017  1:40 PM

Have you checked whether Interfaces.C.To_Ada(..., Trim_Nul => False) can
accomplish what you need?

***************************************************************

From: Randy Brukardt
Sent: Tuesday, July 11, 2017  4:34 PM

To expand on Tuck's response a bit:

The routines in Interfaces.C.Strings are necessary for strings that are NUL
terminated, as such strings require a scan in order to determine the length.
But for a string with an explicit length, no such routine is needed -- you
already know the length and can use a slice to get the correct length.

Thus, the only requirement is to get from a char_array to a String, and
that's what Interfaces.C.To_Ada is for.

This is how we handled this in Claw; we never used Interfaces.C.Strings at
all. Admittedly, we didn't care much about the form of the low-level bindings
because clients were expected only to use the higher-level interface.

This does mean that one shouldn't use Chars_Ptr with a C string of explicit
length, and perhaps the Standard should be clearer about that (when it talks
about "C-Style" strings, it is talking only about nul-terminated strings).
One would need a very different package if it was intending to support strings
that also included nuls (all of the operations would need counterparts that
don't take nuls into account).

I'd have to see your exact problem in order to demonstrate how I'd handle
this (the Win32 APIs all require a buffer passed in, which is then sliced to
get the appropriate result; your C routines may operate differently).

***************************************************************

From: Joey Fish
Sent: Tuesday, July 11, 2017  4:55 PM

The problem with "Interfaces.C.To_Ada(..., Trim_Nul => False)" is that it
terminates at the first null; the Trim_Nul parameter indicates trimming off
the terminal null. (The parameter does not impact the reading of "internal
null" values.) -- The problem, as stated, is the passing of data which may
itself contain NUL.

The Value function [in GNAT] is as follows:

   function Value
     (Item   : chars_ptr;
      Length : size_t) return char_array
   is
   begin
      if Item = Null_Ptr then
         raise Dereference_Error;
      end if;

      --  ACATS cxb3010 checks that Constraint_Error gets raised when Length
      --  is 0. Seems better to check that Length is not null before declaring
      --  an array with size_t bounds of 0 .. Length - 1 anyway.

      if Length = 0 then
         raise Constraint_Error;
      end if;

      declare
         Result : char_array (0 .. Length - 1);

      begin
         for J in Result'Range loop
            Result (J) := Peek (Item + J);

-->         if Result (J) = nul then
-->             return Result (0 .. J);
-->         end if;
         end loop;

         return Result;
      end;
   end Value;


The marked portion here shows that a NUL inside the data terminates the
processing, leaving some data unprocessed. 

Thus, to address the problem the original poster posited, the Value function
in B.3.1 (The Package Interfaces.C.Strings) should be extended with a
parameter allowing the indicated portion to be bypassed.

***************************************************************

From: Victor Porton
Sent: Tuesday, July 11, 2017  4:28 PM

Interfaces.C.To_Ada requests a char_array as an argument.

So we need to construct a char_array from chars_ptr first. But to get a 
char_array we need first use Interfaces.C.Pointers.

It is valid to use To_Ada, but this function requires first calling
Interfaces.C.Pointers.Value.

My actual code does contain To_Ada (but not only that):

with Interfaces.C; use Interfaces.C;
with Interfaces.C.Pointers;

package RDF.Auxiliary.C_Pointers is
   new Interfaces.C.Pointers(Index              => size_t,
                             Element            => char,
                             Element_Array      => char_array,
                             Default_Terminator => nul);
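   --  The function below lives in a separate unit that can see the
   --  instantiation above; it cannot be declared inside the instantiation.
   function Value_With_Possible_NULs
     (Item : RDF.Auxiliary.C_Pointers.Pointer; Length : size_t) return String is
   begin
      return To_Ada (RDF.Auxiliary.C_Pointers.Value
                       (Item, Interfaces.C.ptrdiff_t (Length)),
                     Trim_Nul => False);
   end Value_With_Possible_NULs;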

(Note that the code above uses RDF.Auxiliary.C_Pointers.Pointer instead of
chars_ptr as proposed, but this can be easily solved using unchecked
conversion.)

***************************************************************

From: Tucker Taft
Sent: Tuesday, July 11, 2017  9:19 PM

> The problem with "Interfaces.C.To_Ada(..., Trim_Nul => False)" is that it
> terminates at the first null, the Trim_Nul indicates trimming off the
> terminal null. (The parameter does not impact the reading of "internal null"
> values.) -- The problem, as stated, is the passing of data which may itself
> contain NUL.

Actually, the semantics of To_Ada with Trim_Nul => False is that NULs are
treated just like regular characters, which I believe is what you want.
Here is the RM description:

"... For procedure To_Ada, each element of Item (if Trim_Nul is False) ... is
converted (via the To_Ada function) to a Character, which is assigned to the
corresponding element of Target. Count is set to the number of Target elements
assigned. If Target is not long enough, Constraint_Error is propagated. ..."

As it says, *each* element of Item.  Perhaps these should have been two
different routines, because I admit the wording is a bit confusing.  But if
you ignore the part that is associated with the “Trim_Nul => True” case, then
To_Ada is a simple routine that treats NUL just like a regular character.

In fact, in 99 44/100% of Ada compilers for machines that are byte
addressable, a char_array and an array of Characters are the very same thing.
So you could just use Unchecked_Conversion. But that admittedly isn’t going to
look as portable.
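
A minimal sketch of the Trim_Nul semantics described above (an illustration,
not code from the thread; the function name Embedded_Nul_Demo is hypothetical).
It converts a five-element char_array with a nul in the middle and checks that
Trim_Nul => False keeps every element, while Trim_Nul => True stops before the
first nul:

```ada
with Interfaces.C; use Interfaces.C;

--  Hypothetical demo function: returns True if To_Ada behaves as the
--  RM wording quoted above describes.
function Embedded_Nul_Demo return Boolean is
   --  Five C chars with a nul in the middle: 'a', 'b', nul, 'c', 'd'.
   Item : constant char_array (0 .. 4) :=
     (0 => 'a', 1 => 'b', 2 => nul, 3 => 'c', 4 => 'd');

   --  Every element is converted, including the embedded nul.
   Untrimmed : constant String := To_Ada (Item, Trim_Nul => False);

   --  Conversion stops before the first nul.
   Trimmed   : constant String := To_Ada (Item, Trim_Nul => True);
begin
   return Untrimmed'Length = 5
     and then Untrimmed (Untrimmed'First + 2) = To_Ada (nul)
     and then Trimmed = "ab";
end Embedded_Nul_Demo;
```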

***************************************************************

From: Randy Brukardt
Sent: Tuesday, July 11, 2017  10:49 PM

> The problem with "Interfaces.C.To_Ada(..., Trim_Nul => False)" is that 
> it terminates at the first null, the Trim_Nul indicates trimming off 
> the terminal null. (The parameter does not impact the reading of 
> "internal null" values.) -- The problem, as stated, is the passing of
> data which may itself contain NUL.

That's not true at all, the definition in B.3(31) makes that clear as Tucker
reported.

> The Value function [in GNAT] is as follows:

Which has what to do with Interfaces.C.To_Ada???

> The underlined/bold portion here shows that a NUL inside the data terminates
> the processing, leaving some data unprocessed. 

For Interfaces.C.Strings, which is only for processing nul-terminated strings.
If you use a hammer to turn a screw, do you complain to Craftsman that the
hammer doesn't work???

***************************************************************

From: Randy Brukardt
Sent: Tuesday, July 11, 2017  11:22 PM

> Interfaces.C.To_Ada requests a char_array as an argument.
> 
> So we need to construct a char_array from chars_ptr first. 

Why? There is no reason to use a chars_ptr at all (it is a pointer to a 
nul-terminated string, and you don't have that).

> But to get a char_array we need first use Interfaces.C.Pointers.

IMHO, there is no reason to ever use Interfaces.C.Pointers. It's a complete
waste of time. (Janus/Ada failed to implement it, mostly by mistake, and so
far as I recall no one ever complained.) Just declare the proper types for
your interface in the first place and you're fine.

In this case, you want a C convention access to a constrained char_array.
And then you can slice that in your thicker binding. Easy peasy. ;-)

> It is valid to use To_Ada, but this function requires first calling 
> Interfaces.C.Pointers.Value.
> 
> My actual code does contain To_Ada (but not only that):
> 
> with Interfaces.C; use Interfaces.C;
> with Interfaces.C.Pointers;
> 
> package RDF.Auxiliary.C_Pointers is
>    new Interfaces.C.Pointers(Index              => size_t,
>                              Element            => char,
>                              Element_Array      => char_array,
>                              Default_Terminator => nul);
>    function Value_With_Possible_NULs (Item: RDF.Auxiliary.C_Pointers.Pointer;
>                                       Length: size_t) return String is
>    begin
>       return To_Ada(RDF.Auxiliary.C_Pointers.Value(Item,
>          Interfaces.C.ptrdiff_t(Length)), Trim_Nul=>False);
>    end;
> 
> (Note that the code above uses
> RDF.Auxiliary.C_Pointers.Pointer instead of chars_ptr as proposed, but 
> this can be easily solved using unchecked conversion),

I suppose this works, but it seems like using a bazooka to kill a mosquito.

Remember that C is weakly typed, so it doesn't differentiate between a
nul-terminated string and a string with an explicit length. But they should
be different types in Ada; the semantics are substantially different.

Here's how I would deal with such a situation in Win32:

   MAX_Name : constant := 100;
   type Acc_Name is access all Interfaces.C.Char_Array (0 .. MAX_NAME)
      with Convention => C;
   
   function Get_Name (H : in Handle; Buffer : in Acc_Name;
                      Length : in out Size_T) return D_Word
      with Import => True, Convention => C, External_Name => "GetNameA";

   function Get_Name_of_Handle (H : in Handle) return String is
       Result : D_Word;
       Name_Buffer : aliased Interfaces.C.Char_Array (0 .. MAX_NAME);
       Blen : Size_T := MAX_NAME;
   begin
       Result := Get_Name (H, Name_Buffer'Access, Blen);
       if Result /= 0 then
           Raise_Windows_Error;
       else
           return Interfaces.C.To_Ada (Name_Buffer (0 .. Blen), Trim_Nul => False);
       end if;
   end Get_Name_of_Handle;

In Win32, you always pass in a buffer and a length, so you always know that
the result will be shorter than that. And you then just slice the result.

If you instead get a pointer allocated by the called routine, it's a bit more
complex but the general idea is the same. 

In any of these schemes (including using Interfaces.C.Strings), we're assuming
some sort of all-Ada wrapper subprogram. So it is fine for there to be some
complications so long as those are restricted to the wrapper (clients should
be able to use normal Ada types without complication).
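
For the callee-allocated case mentioned above, one possible wrapper is
sketched below. It is an illustration only, not code from the thread. The name
Counted_To_String is hypothetical, and the sketch assumes that the C pointer
is received as a System.Address, which is common practice with some compilers
but is implementation-dependent. It overlays a constrained char_array on the C
data and converts it with Trim_Nul => False, so embedded nuls are kept:

```ada
with Interfaces.C; use Interfaces.C;
with Interfaces.C.Strings;
with System;

--  Hypothetical helper: copies Length C chars starting at Data into a
--  String, keeping any embedded nul characters.
--  Assumption: the C pointer is passed as a System.Address.
function Counted_To_String
  (Data : System.Address; Length : size_t) return String
is
   use type System.Address;
begin
   if Data = System.Null_Address then
      --  Mirrors the behavior proposed for Value_With_Possible_NULs.
      raise Interfaces.C.Strings.Dereference_Error;
   elsif Length = 0 then
      return "";
   end if;

   declare
      --  Overlay a constrained array on the C buffer. Import tells the
      --  compiler not to initialize the object, so the C data is untouched.
      Buffer : char_array (0 .. Length - 1)
        with Import, Convention => C, Address => Data;
   begin
      return To_Ada (Buffer, Trim_Nul => False);
   end;
end Counted_To_String;
```

As with the other schemes, the overlay stays inside the wrapper; clients see
only an ordinary String.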

***************************************************************

From: Victor Porton
Sent: Tuesday, July 11, 2017  5:10 PM

The problem with To_Ada is that it requires char_array, not chars_ptr.
For this reason it requires a wrapper. My proposal is exactly to add this
wrapper.

You may create a new package if you want.

We DO need C strings with explicit length when interfacing with some C
libraries. For example a file downloaded from the net may have NUL bytes in
it. It is not wise to say "don't use C strings with explicit length", because
it IS needed when interfacing with some particular libraries. And Interfaces.C
is exactly for this, that is for interfacing with C libraries.

My exact problem:

When interfacing with both the I/O streams
http://librdf.org/raptor/api/raptor2-section-iostream.html
and the net-download facilities
http://librdf.org/raptor/api/raptor2-section-www.html
of the Raptor C library, I must be sure my code works correctly if the
downloaded file contains NUL bytes.

Doing otherwise would be contrary to reliability (it would mean that my code
does a wrong thing if encountering a NUL byte)!

For this reason I implemented my own version of the Value_With_Possible_NULs
function using Interfaces.C.Pointers. It is not a good idea to repeat this
in each and every library that needs to interface with C.

**************************************************************

