Version 1.1 of ai12s/ai12-0284-1.txt

!standard A.16(12/2)          18-06-11 AI12-0284-1/01
!standard A.16(66/2)
!standard A.16(67/3)
!class Amendment 18-06-11
!status Hold 18-06-11
!status work item 18-06-11
!status received 18-05-18
!priority Very_Low
!difficulty Easy
!subject Renaming a file to an existing file should be allowed
!summary
[Editor's note: This AI was "born" on hold as it came too late to be included in Ada 2020.]
Add a default parameter "Overwrite" to Ada.Directories.Rename to allow overwriting the target file.
!problem
The Ada.Directories.Rename procedure renames an existing file from Old_Name to New_Name. It will raise the Use_Error exception if the target file New_Name already exists.
This behavior prevents implementing a safe update of a file, in which the old content is replaced by new content. The Unix rename(2) system call was designed with this atomic property. The common pattern for such a safe file update is:
- write a new file with the new content near the target file (at least
on the same file system),
- be sure that the whole file is written successfully and closed,
- rename the file to the target one, hence replacing it atomically (at
least from a system call point of view).
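[Editor's sketch, not part of the proposal: the three steps above might look as follows in C on a POSIX system. The function name safe_update and the ".tmp" suffix are hypothetical choices made for illustration. ISO C leaves the behavior of rename implementation-defined when the target exists; the replacement shown here relies on the POSIX definition of rename.]

```c
#include <stdio.h>

/* Replace the file `target` with `len` bytes taken from `data`.
   The new content is first written to "<target>.tmp" (a hypothetical
   naming choice) so that it sits next to the target, on the same
   file system. Returns 0 on success, -1 on failure; on failure the
   old target is left untouched. */
int safe_update(const char *target, const void *data, size_t len)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", target);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;                      /* name too long for the buffer */

    /* Step 1: write the new content to the temporary file. */
    FILE *f = fopen(tmp, "wb");
    if (f == NULL)
        return -1;
    int ok = fwrite(data, 1, len, f) == len && fflush(f) == 0;

    /* Step 2: make sure the file is completely written and closed.
       (Durability across a power failure would additionally need a
       POSIX fsync call, which is omitted from this sketch.) */
    if (fclose(f) != 0)
        ok = 0;
    if (!ok) {
        remove(tmp);
        return -1;
    }

    /* Step 3: rename onto the target. Under POSIX this replaces an
       existing target in a single call. */
    if (rename(tmp, target) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

A caller would write, for example, safe_update ("config.txt", text, strlen (text)); the target name either keeps its old content or receives the complete new content.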
With the Ada Rename procedure, we are forced to remove the target file before the Rename. This creates a window where the file no longer exists on the file system. If other programs need the file within that window, or if the program is stopped, we lose our file.
The rename(2) system call prevents that.
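[Editor's sketch, not part of the proposal: the following C fragment emulates the situation described above. rename_no_overwrite mimics the current rule (it fails where Ada would propagate Use_Error), and replace_via_delete is the delete-then-rename workaround. All function names are hypothetical, and the existence check is itself only an approximation of what an Ada runtime might do.]

```c
#include <stdio.h>

/* Returns 1 if `name` can be opened for reading, 0 otherwise. */
static int file_exists(const char *name)
{
    FILE *f = fopen(name, "rb");
    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}

/* Emulates the current rule: refuse to rename onto an existing file.
   Returns 0 on success, -1 on failure. */
int rename_no_overwrite(const char *old_name, const char *new_name)
{
    if (file_exists(new_name))
        return -1;          /* Ada would propagate Use_Error here */
    return rename(old_name, new_name);
}

/* The workaround: delete the target first, then rename. */
int replace_via_delete(const char *old_name, const char *new_name)
{
    if (file_exists(new_name) && remove(new_name) != 0)
        return -1;
    /* Window: at this point nothing exists under new_name. A reader
       running now, or a crash before the next line, finds no file. */
    return rename_no_overwrite(old_name, new_name);
}
```

The comment inside replace_via_delete marks the interval that the proposer wants to eliminate.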
!proposal
(See Summary.)
!wording
Replace A.16(12/2) and A.16(66/2) with:
procedure Rename (Old_Name, New_Name : in String; Overwrite : Boolean := False);
Modify A.16(67/3):
Renames an existing external file (including directories) with name Old_Name to New_Name. The exception Name_Error is propagated if the string given as Old_Name does not identify an existing external file or if the string given as New_Name does not allow the identification of an external file. The exception Use_Error is propagated if the external environment does not support the renaming of the file with the given name (in the absence of Name_Error). In particular, Use_Error is propagated if {Overwrite is False and} a file or directory already exists with name New_Name.
** TBD. The obvious wording given above does not guarantee behavior that
would solve the given problem. See !discussion.
!discussion
Note that we still have to allow raising Name_Error and Use_Error for New_Name, as the format of the name may be wrong, and permissions may not allow overwriting the file.
---
On a machine that supports such operations, it would appear that:
Copy_File (Old_Name, New_Name); Delete_File (Old_Name);
would have the correct semantics. If the file is large, that could be relatively expensive - but so would creating such a temporary copy in the first place. One probably would use a different method than either of these for updating large files.
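[Editor's sketch, not part of the AI: a copy-then-delete sequence of this kind can be written in plain C as below. The function name copy_then_delete is hypothetical. In this sketch the target name never disappears, but its content is truncated and rewritten while the copy runs, so the sequence is not indivisible either; whether an Ada implementation's copy behaves the same way is not stated in the source.]

```c
#include <stdio.h>

/* Copy the content of old_name over new_name, then delete old_name.
   Returns 0 on success, -1 on failure. The source file is removed
   only after the copy is known to be complete. */
int copy_then_delete(const char *old_name, const char *new_name)
{
    FILE *in = fopen(old_name, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(new_name, "wb");  /* truncates an existing target */
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    char buf[4096];
    size_t n;
    int ok = 1;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;
    fclose(in);
    if (fclose(out) != 0)               /* flushes buffered output */
        ok = 0;

    if (!ok)
        return -1;
    return remove(old_name) == 0 ? 0 : -1;
}
```

The cost is proportional to the file size, which is the expense the paragraph above refers to.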
---
None of the existing wording for the Ada I/O packages implies that file operations have to be executed indivisibly. Many operations (for instance, opening an Out_File mode Text_IO file) involve multiple operations on many operating systems. Even when the operation is available on the target system, an implementer may not use it -- for instance to avoid permissions issues or for backwards compatibility.
Thus depending on such an indivisible property is not going to be portable; even if it works on a particular implementation, it may not work on a different implementation for the same target, much less on some other target. Moreover, the lack of such a property is unlikely to show up in testing, leaving a significant problem.
Requiring that such a property exists would effectively limit Ada to targets with fancy journalling file systems - common on desktops these days (Windows and Linux have these), not so common on other devices (including optical disks). Otherwise, a poorly timed crash or power failure still can lose the file.
Therefore, it seems unlikely that the suggested change would really provide the facility that the proposer is looking for. Arguably, it is better to write clearly non-portable code in this case, in order that the expectation for atomic behavior be clearly documented.
!ASIS
No changes needed.
!ACATS test
An ACATS C-Test is needed to check that the new parameter is supported.
!appendix

!topic Ada.Directories renaming a file to an existing file should be allowed
!reference Ada 2012 RM A.16
!from Stephane Carrez 18-05-18
!keywords directories rename file
!discussion

The Ada.Directories.Rename procedure renames an existing file from
Old_Name to New_Name.
It will raise the Use_Error exception if the target file New_Name
already exists.

This behavior prevents implementing a safe update of a file by
replacing the old content with new content.
The Unix rename(2) system call has been designed with this atomic property.
The common pattern for such safe file update is:
- write a new file with the new content near the target file (at least
on the same file system),
- be sure that the whole file is written successfully and closed,
- rename the file to the target one, hence replacing it atomically (at
least from a system call point of view).

With the Ada Rename procedure, we are forced to remove the target file
before the Rename.
This creates a window where the file does not exist anymore on the file system.
If other programs need the file within that window, or if the
program is stopped, we lose our file.

The rename(2) system call prevents that.

To keep the backward compatibility and allow this standard rename behavior,
I would suggest adding an optional boolean parameter to allow
overwriting the destination file.
Something like:

procedure Rename (Old_Name, New_Name : in String; Overwrite : in
Boolean := False);

When Overwrite is True, there is no check on New_Name and the
Use_Error exception is not raised.

****************************************************************

From: Randy Brukardt
Sent: Friday, May 18, 2018  7:28 PM

...
> With the Ada Rename procedure, we are forced to remove the target file 
> before the Rename.
> This creates a window where the file does not exist anymore on the 
> file system.
> If other programs need the file within that window, or if the 
> program is stopped, we lose our file.
> 
> The rename(2) system call prevents that.
> 
> To keep the backward compatibility and allow this standard rename 
> behavior, I would suggest adding an optional boolean parameter to 
> allow overwriting the destination file.
> Something like:
> 
> procedure Rename (Old_Name, New_Name : in String; Overwrite : 
> in Boolean := False);
> 
> When Overwrite is True, there is no check on New_Name and the 
> Use_Error exception is not raised.

I don't think this has the effect you want. If I was implementing this 
specification (without having seen this message), I would call
Delete(New_Name) before calling the rename primitive. (I never trust 
APIs outside of their core functionality, and I'd prefer to not even do 
that. :-) That's not the result you are looking for. Moreover, we can't 
require this to act atomically since that may not even be possible on some
targets -- not to mention that Ada has no concept of atomic file operations.

Similarly, the implementation of Open Out_File for Text_IO and Sequential_IO 
typically erases the file before creating a new one, because there can't be 
anything old in the file when it is closed (and almost all OSes preserve the 
contents when opening for writing). That is not atomic in this way, either.

My point here is that expecting portable operations to be atomic in any sense 
is probably a mistake. Without testable requirements to that effect, it simply
isn't going to be the case.

In cases like that of your example, I have code that needs to access the file 
fall back to the temporary name if the main file is missing. Since we don't 
delete the main file until we've finished writing the new version, if the main
file is missing, the temporary is a newly written version. We can then rename 
that and proceed normally. That will work without depending on any properties
of the file system, so it is the most portable option (again, recall that I'm
the paranoid programmer -- I depend as little as possible on anything I don't
control; one always has to assume that a file is missing and have a fall back 
plan).
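
[Editor's sketch, not part of the original message: a reader following the fall-back scheme described above could look roughly like this in C. The function name open_with_fallback is hypothetical, and the errno test assumes a POSIX-style C library that sets errno to ENOENT when the file does not exist. The sketch also assumes the writer's protocol from the paragraph above: the main file is deleted only after the temporary has been completely written.]

```c
#include <errno.h>
#include <stdio.h>

/* Open main_name for reading. If it is missing, promote the temporary
   file tmp_name to main_name and open that instead. Returns NULL if
   neither file can be used. */
FILE *open_with_fallback(const char *main_name, const char *tmp_name)
{
    FILE *f = fopen(main_name, "rb");
    if (f != NULL)
        return f;
    if (errno != ENOENT)
        return NULL;        /* some other failure: do not touch anything */

    /* The main file is missing, so by the writer's protocol the
       temporary holds a complete new version. Rename it into place. */
    if (rename(tmp_name, main_name) != 0)
        return NULL;
    return fopen(main_name, "rb");
}
```

Because the target is known to be absent when rename is called, the sketch does not depend on rename being able to overwrite an existing file.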

Alternatively, of course, you can make your own binding to rename(2). (I prefer
the above, since it works equally well on Windows and Linux and whatever the 
next big thing is.)

Practically, this suggestion is too late for Ada 2020 (public suggestions 
needed to be received by January 15th, and we've since frozen the possibilities
to additions by anyone). So, unless someone can successfully argue that this 
should be treated as an error in the Standard, it probably won't get added to
Ada for at least 7 more years.

****************************************************************

From: Stephane Carrez
Sent: Saturday, May 19, 2018  5:08 AM

> I don't think this has the effect you want. If I was implementing this 
> specification (without having seen this message), I would call
> Delete(New_Name) before calling the rename primitive. (I never trust 
> APIs outside of their core functionality, and I'd prefer to not even 
> do that. :-) That's not the result you are looking for. Moreover, we 
> can't require this to act atomically since that may not even be 
> possible on some targets -- not to mention that Ada has no concept of atomic
> file operations.
>
> Similarly, the implementation of Open Out_File for Text_IO and 
> Sequential_IO typically erases the file before creating a new one, 
> because there can't be anything old in the file when it is closed (and 
> almost all OSes preserve the contents when opening for writing). That is not
> atomic in this way, either.

I don't ask for atomic IO and I don't need that to implement a safe update of
a file.

> My point here is that expecting portable operations to be atomic in 
> any sense is probably a mistake. Without testable requirements to that 
> effect, it simply isn't going to be the case.

I just expect to be able to use what the operating system provides.

> In cases like that of your example, I have code that needs to access 
> the file fall back to the temporary name if the main file is missing. 
> Since we don't delete the main file until we've finished writing the 
> new version, if the main file is missing, the temporary is a newly 
> written version. We can then rename that and proceed normally. That 
> will work without depending on any properties of the file system, so 
> it is the most portable option (again, recall that I'm the paranoid 
> programmer -- I depend as little as possible on anything I don't 
> control; one always has to assume that a file is missing and have a fall
> back plan).

Except that if you want to update a file that is managed by another program,
you have to modify that program to tell it about the temporary file and 
modify it so that it takes it into account.  You also increase the complexity
of these programs by forcing them to check for such temporary file.

> Alternatively, of course, you can make your own binding to rename(2). 
> (I prefer the above, since it works equally well on Windows and Linux 
> and whatever the next big thing is.)

Yes, this is what I'm doing and I was hoping to be able to use the Ada Rename
procedure instead and hence be more portable.

> Practically, this suggestion is too late for Ada 2020 (public 
> suggestions needed to be received by January 15th, and we've since 
> frozen the possibilities to additions by anyone). So, unless someone 
> can successfully argue that this should be treated as an error in the 
> Standard, it probably won't get added to Ada for at least 7 more years.

ok, sorry to have missed that deadline...

****************************************************************

From: Randy Brukardt
Sent: Friday, June 1, 2018  7:13 PM

> > I don't think this has the effect you want. If I was implementing 
> > this specification (without having seen this message), I would call
> > Delete(New_Name) before calling the rename primitive. (I  never 
> > trust APIs outside of their core functionality, and I'd prefer to 
> > not even do that. :-) That's not the result you are looking for. 
> > Moreover, we can't require this to act atomically since that may not 
> > even be possible on some targets -- not to mention that Ada has no
> > concept of atomic file operations.
> >
> > Similarly, the implementation of Open Out_File for Text_IO and 
> > Sequential_IO typically erases the file before creating a new one, 
> > because there can't be anything old in the file when it is closed (and
> > almost all OSes preserve the contents when opening for writing). That 
> > is not atomic in this way, either.
> >
> 
> I don't ask for atomic IO and I don't need that to implement a safe 
> update of a file.

You do need atomic file open/create/replace though, and that was what I was
focusing on.

> > My point here is that expecting portable operations to be atomic in 
> > any sense is probably a mistake. Without testable requirements to 
> > that effect, it simply isn't going to be the case.
> 
> I just expect to be able to use what the operating system provides.

That's inherently non-portable. Either the requirements are spelled out in the
RM, or depending upon them is foolhardy. If the code gets moved to a different
operating system (say an RTOS) or different compiler, that "feature" you're
depending upon might disappear. And you'd have no way to determine that
happened until a subtle malfunction in your program.

I try to implement Ada I/O based on a very small set of operations that are 
available almost everywhere (essentially the set of operations from MS-DOS
2.0); unless the specification clearly states otherwise, implementing your
rename as two operations is perfectly correct (and would have been necessary
in MS-DOS 2.0; it doesn't allow overwriting with a rename). That's not what 
you want. You have to propose wording that actually would have the effect you
want, or there is no reason to assume that it is going to be implemented that
way.

...
> > Alternatively, of course, you can make your own binding to rename(2). 
> > (I prefer the above, since it works equally well on Windows and 
> > Linux and whatever the next big thing is.)
> 
> Yes, this is what I'm doing and I was hoping to be able to use the Ada 
> Rename procedure instead and hence be more portable.

But you are depending on unique behavior of your target system. That is never
going to be "more portable". The definition you suggested would actually make
your code *less* portable as it hides the requirement for a particular 
implementation of renames.

Remember, Ada.Directories is about providing a least-common denominator set of
facilities that will work essentially everywhere (even on systems without paths
or with unusual file name formats - think VMS). It was never about giving access
to all or even most of the target facilities, but just a portable subset that
could be used without concern. Specific oddities like this renames version or
the append bit have no place in the Ada I/O system - even if implemented, you
can't depend upon them because they might not be on some other target. (If you
want to write code that is locked in to a particular target and compiler
combination, go ahead, but don't call it portable!)

****************************************************************

From: Gautier de Montmollin
Sent: Friday, June 1, 2018  9:20 PM

> Remember, Ada.Directories is about providing a least-common 
> denominator set of facilities that will work essentially everywhere 
> (even on systems without paths or with unusual file name formats - think
> VMS).

So a constructive answer would be rather to tell why an addition to
Ada.Directories is an "oddity" (or not).

Since the proposal is useful and can be implemented on Unix, Linux, Windows 
(via MoveFileEx), it covers already most currently used OSes.

An interesting discussion would be what OSes belong to the least-common 
denominator set for future Ada versions.
It is not serious to have MS-DOS in that set. MS-DOS is the oddity!

****************************************************************

From: Randy Brukardt
Sent: Sunday, June 3, 2018  2:15 AM

I wouldn't call MS-DOS an "oddity" (Windows is a direct descendant of MS-DOS,
and it still supports the FAT filesystems), but it certainly is more history 
than an issue these days. Ada does still have to support various RTOSes, and 
I don't know whether or not they would support the intended semantics.

But the real problem is that there is nothing in the Standard which would 
suggest that an atomic (vis-a-vis file open/create) operation is intended
-- for *any* file operation. With the current wording, it would be completely 
correct to implement this rename by separate delete and rename calls. And I'd
most likely do that as that would be most likely to avoid permissions 
problems. The wording of the Standard would have to somehow make such an
implementation wrong -- and it also would have to do the same for all of the
file opens and creates in the language. (Many of those aren't implemented 
atomically, either.)

And any such wording would have to deal with cases where the language 
semantics (such as truncating a file when it is opened for writing) can't
be obviously supported with the OS operations. (I doubt that we're going to
suddenly start limiting Ada ONLY to Windows and Unix-like systems.)

For me personally, that's a lot of language work that would also require a 
whole lot of runtime work as well. And I seriously doubt that it would really
help file operations significantly - one still would have to be prepared for
file issues caused by crashes at the wrong instant. (On Windows at least,
unless one can guarantee that you're never going to have a FAT file system -
as on the USB stick - you could be interrupted between the time the old file
was removed and the new file was renamed -- even if that is a single 
operation from the perspective of the program.)

****************************************************************
