!standard A.16 (00) 05-08-15 AI95-00248/14
!standard A.8.2(22)
!class amendment 00-11-28
!status Amendment 200Y 02-07-12
!status WG9 Approved 02-12-13
!status ARG Approved 8-0-1 02-06-22
!status work item 00-11-28
!status received 00-11-28
!priority Medium
!difficulty Medium
!subject Directory Operations

!summary

Package Ada.Directories provides portable access to tree-structured file systems and the ability to search for files.

!problem

Most modern operating systems contain some sort of tree-structured file system. Many applications need to manage these file systems (by creating and removing directories, searching for files, and the like).

Most Ada 95 compilers provide some sort of access to the operations needed to manage these systems. But these packages differ in many ways, making portable Ada 95 programs impossible.

The POSIX libraries provide operations for doing this, but these usually are available only on POSIX systems, leaving out many popular operating systems including MS-DOS, most flavors of Microsoft Windows, and even Linux.

Ada 95 has already opened the door to standard packages that are not necessarily applicable to all implementations with Ada.Command_Line.

!proposal

(See wording.)
!wording

Static Semantics

    with Ada.IO_Exceptions;
    with Ada.Calendar;
    package Ada.Directories is

       -- Directory and file operations:

       function Current_Directory return String;

       procedure Set_Directory (Directory : in String);

       procedure Create_Directory (New_Directory : in String;
                                   Form          : in String := "");

       procedure Delete_Directory (Directory : in String);

       procedure Create_Path (New_Directory : in String;
                              Form          : in String := "");

       procedure Delete_Tree (Directory : in String);

       procedure Delete_File (Name : in String);

       procedure Rename (Old_Name, New_Name : in String);

       procedure Copy_File (Source_Name,
                            Target_Name : in String;
                            Form        : in String := "");

       -- File and directory name operations:

       function Full_Name (Name : in String) return String;

       function Simple_Name (Name : in String) return String;

       function Containing_Directory (Name : in String) return String;

       function Extension (Name : in String) return String;

       function Base_Name (Name : in String) return String;

       function Compose (Containing_Directory : in String := "";
                         Name                 : in String;
                         Extension            : in String := "") return String;

       -- File and directory queries:

       type File_Kind is (Directory, Ordinary_File, Special_File);

       type File_Size is range 0 .. <implementation-defined>;

       function Exists (Name : in String) return Boolean;

       function Kind (Name : in String) return File_Kind;

       function Size (Name : in String) return File_Size;

       function Modification_Time (Name : in String) return Ada.Calendar.Time;

       -- Directory searching:

       type Directory_Entry_Type is limited private;

       type Filter_Type is array (File_Kind) of Boolean;

       type Search_Type is limited private;

       procedure Start_Search (Search    : in out Search_Type;
                               Directory : in String;
                               Pattern   : in String;
                               Filter    : in Filter_Type := (others => True));

       procedure End_Search (Search : in out Search_Type);

       function More_Entries (Search : in Search_Type) return Boolean;

       procedure Get_Next_Entry (Search          : in out Search_Type;
                                 Directory_Entry : out Directory_Entry_Type);

       procedure Search (
          Directory : in String;
          Pattern   : in String;
          Filter    : in Filter_Type := (others => True);
          Process   : not null access procedure
                         (Directory_Entry : in Directory_Entry_Type));

       -- Operations on Directory Entries:

       function Simple_Name (Directory_Entry : in Directory_Entry_Type)
           return String;

       function Full_Name (Directory_Entry : in Directory_Entry_Type)
           return String;

       function Kind (Directory_Entry : in Directory_Entry_Type)
           return File_Kind;

       function Size (Directory_Entry : in Directory_Entry_Type)
           return File_Size;

       function Modification_Time (Directory_Entry : in Directory_Entry_Type)
           return Ada.Calendar.Time;

       Status_Error : exception renames Ada.IO_Exceptions.Status_Error;
       Name_Error   : exception renames Ada.IO_Exceptions.Name_Error;
       Use_Error    : exception renames Ada.IO_Exceptions.Use_Error;
       Device_Error : exception renames Ada.IO_Exceptions.Device_Error;

    private
       -- Not specified by the language.
    end Ada.Directories;

External files may be classified as directories, special files, or ordinary files. A @i<directory> is an external file that is a container for files on the target system. A @i<special file> is an external file that cannot be created or read by a predefined Ada input-output package.
External files that are not special files or directories are called @i<ordinary files>.

AARM Ramification: A directory is an external file, although it may not have a name on some targets. A directory is not a special file, as it can be created and read by Directories.

AARM Discussion: Devices and soft links are examples of special files on Windows and Unix. Even if an implementation provides a package to create and read on soft links, such links are still special files.

A @i<file name> is a string identifying an external file. Similarly, a @i<directory name> is a string identifying a directory. The interpretation of file names and directory names is implementation-defined.

The @i<full name> of an external file is a full specification of the name of the file. If the external environment allows alternative specifications of the name (for example, abbreviations), the full name should not use such alternatives. A full name typically will include the names of all of the directories that contain the item. The @i<simple name> of an external file is the name of the item, not including any containing directory names. Unless otherwise specified, a file name or directory name parameter in a call to a predefined Ada input-output subprogram can be a full name, a simple name, or any other form of name supported by the implementation.

AARM Discussion: The full name on Unix is a complete path to the root. For Windows, the full name includes a complete path, as well as a disk name ("C:") or network share name. For both systems, the simple name is the part of the name following the last '/' (or '\' for Windows). For example, in the name "/usr/randy/ada-directories.ads", "ada-directories.ads" is the simple name.

AARM Ramification: It is possible for a file or directory name to be neither a full name nor a simple name. For instance, the Unix name "../parent/myfile" is neither a full name nor a simple name.
The @i<default directory> is the directory that is used if a directory or file name is not a full name (that is, when the name does not fully identify all of the containing directories).

AARM Discussion: The default directory is the one maintained by the familiar "cd" command on Unix and Windows. Note that Windows maintains separate default directories for each disk drive; implementations should use the natural implementation.

A @i<directory entry> is a single item in a directory, identifying a single external file (including directories and special files).

For each function that returns a string, the lower bound of the returned value is 1.

    function Current_Directory return String;

Returns the full directory name for the current default directory. The name returned shall be suitable for a future call to Set_Directory. The exception Use_Error is propagated if a default directory is not supported by the external environment.

    procedure Set_Directory (Directory : in String);

Sets the current default directory. The exception Name_Error is propagated if the string given as Directory does not identify an existing directory. The exception Use_Error is propagated if the external environment does not support making Directory (in the absence of Name_Error) a default directory.

    procedure Create_Directory (New_Directory : in String;
                                Form          : in String := "");

Creates a directory with name New_Directory. The Form parameter can be used to give system-dependent characteristics of the directory; the interpretation of the Form parameter is implementation-defined. A null string for Form specifies the use of the default options of the implementation of the new directory. The exception Name_Error is propagated if the string given as New_Directory does not allow the identification of a directory. The exception Use_Error is propagated if the external environment does not support the creation of a directory with the given name (in the absence of Name_Error) and form.
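As an informal illustration of Current_Directory, Set_Directory, and Create_Directory, the following sketch saves the default directory, creates a directory if it does not yet exist, makes it the default, and then restores the saved default. The procedure name and the directory name "scratch_dir" are hypothetical, and the sketch assumes a target that supports directories and a default directory.

```ada
with Ada.Text_IO;
with Ada.Directories;

procedure Default_Directory_Demo is
   use Ada.Directories;

   --  Remember where we started; the result of Current_Directory is
   --  suitable for a later call to Set_Directory.
   Saved   : constant String := Current_Directory;

   --  Hypothetical directory name, interpreted relative to the
   --  default directory.
   Scratch : constant String := "scratch_dir";
begin
   if not Exists (Scratch) then
      Create_Directory (Scratch);   --  Form defaults to ""
   end if;

   Set_Directory (Scratch);
   Ada.Text_IO.Put_Line ("Default directory is now: " & Current_Directory);

   Set_Directory (Saved);           --  restore the original default
exception
   when Name_Error =>
      Ada.Text_IO.Put_Line ("The name does not identify a directory.");
   when Use_Error =>
      Ada.Text_IO.Put_Line ("The environment does not support this operation.");
end Default_Directory_Demo;
```

The handlers cover only the statements of the procedure body; an exception raised while elaborating Saved would propagate to the caller.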
    procedure Delete_Directory (Directory : in String);

Deletes an existing empty directory with name Directory. The exception Name_Error is propagated if the string given as Directory does not identify an existing directory. The exception Use_Error is propagated if the external environment does not support the deletion of the directory (or some portion of its contents) with the given name (in the absence of Name_Error).

    procedure Create_Path (New_Directory : in String;
                           Form          : in String := "");

Creates zero or more directories with name New_Directory. Each non-existent directory named by New_Directory is created. For example, on a typical Unix system, Create_Path ("/usr/me/my"); would create directory "me" in directory "usr", then create directory "my" in directory "me". The Form parameter can be used to give system-dependent characteristics of the directory; the interpretation of the Form parameter is implementation-defined. A null string for Form specifies the use of the default options of the implementation of the new directory. The exception Name_Error is propagated if the string given as New_Directory does not allow the identification of any directory. The exception Use_Error is propagated if the external environment does not support the creation of any directories with the given name (in the absence of Name_Error) and form.

    procedure Delete_Tree (Directory : in String);

Deletes an existing directory with name Directory. The directory and all of its contents (possibly including other directories) are deleted. The exception Name_Error is propagated if the string given as Directory does not identify an existing directory. The exception Use_Error is propagated if the external environment does not support the deletion of the directory or some portion of its contents with the given name (in the absence of Name_Error). If Use_Error is propagated, it is unspecified whether a portion of the contents of the directory is deleted.
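A minimal sketch of how Create_Path and Delete_Tree might be paired follows. The names "demo_root", "a", and "b" and the procedure name are hypothetical; Compose is used so that no directory separator is hard-coded, and the sketch assumes the default directory is writable.

```ada
with Ada.Text_IO;
with Ada.Directories;

procedure Path_Demo is
   use Ada.Directories;

   --  Hypothetical names, built with Compose rather than a
   --  target-specific separator character.
   Root : constant String := "demo_root";
   Leaf : constant String :=
     Compose (Containing_Directory => Compose (Root, "a"), Name => "b");
begin
   --  Creates each directory named by Leaf that does not yet exist.
   Create_Path (Leaf);
   Ada.Text_IO.Put_Line
     ("Leaf exists after Create_Path: " & Boolean'Image (Exists (Leaf)));

   --  Removes Root together with everything below it.
   Delete_Tree (Root);
   Ada.Text_IO.Put_Line
     ("Root exists after Delete_Tree: " & Boolean'Image (Exists (Root)));
exception
   when Use_Error =>
      --  If Delete_Tree raised this, part of the tree may remain.
      Ada.Text_IO.Put_Line ("The environment refused the operation.");
end Path_Demo;
```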
    procedure Delete_File (Name : in String);

Deletes an existing ordinary or special file with name Name. The exception Name_Error is propagated if the string given as Name does not identify an existing ordinary or special external file. The exception Use_Error is propagated if the external environment does not support the deletion of the file with the given name (in the absence of Name_Error).

    procedure Rename (Old_Name, New_Name : in String);

Renames an existing external file (including directories) with name Old_Name to New_Name. The exception Name_Error is propagated if the string given as Old_Name does not identify an existing external file. The exception Use_Error is propagated if the external environment does not support the renaming of the file with the given name (in the absence of Name_Error). In particular, Use_Error is propagated if a file or directory already exists with name New_Name.

    procedure Copy_File (Source_Name,
                         Target_Name : in String;
                         Form        : in String := "");

Copies the contents of the existing external file with name Source_Name to an external file with name Target_Name. The resulting external file is a duplicate of the source external file. The Form parameter can be used to give system-dependent characteristics of the resulting external file; the interpretation of the Form parameter is implementation-defined. The exception Name_Error is propagated if the string given as Source_Name does not identify an existing external ordinary or special file, or if the string given as Target_Name does not allow the identification of an external file. The exception Use_Error is propagated if the external environment does not support the creation of the file with the name given by Target_Name and the form given by Form, or the copying of the file with the name given by Source_Name (in the absence of Name_Error).

AARM Ramification: Name_Error is always raised if Source_Name identifies a directory.
It is up to the implementation whether special files can be copied, or if Use_Error will be raised.

    function Full_Name (Name : in String) return String;

Returns the full name corresponding to the file name specified by Name. The exception Name_Error is propagated if the string given as Name does not allow the identification of an external file (including directories and special files).

AARM Discussion: Full name means that no abbreviations are used in the returned name, and that it is a full specification of the name. Thus, for Unix and Windows, the result should be a full path that does not contain any "." or ".." directories. Typically, the default directory is used to fill in any missing information.

    function Simple_Name (Name : in String) return String;

Returns the simple name portion of the file name specified by Name. The exception Name_Error is propagated if the string given as Name does not allow the identification of an external file (including directories and special files).

    function Containing_Directory (Name : in String) return String;

Returns the name of the containing directory of the external file (including directories) identified by Name. (If more than one directory can contain Name, the directory name returned is implementation-defined.) The exception Name_Error is propagated if the string given as Name does not allow the identification of an external file. The exception Use_Error is propagated if the external file does not have a containing directory.

AARM Discussion: This is purely a string manipulation function. If Name is not given as a full name, the containing directory probably won't be one, either. For example, if Containing_Directory ("..\AARM\RM-A-8") is called on Windows, the result should be "..\AARM". If there is no path at all on the name, the result should be "." (which represents the current directory). Use Full_Name on the result of Containing_Directory if the full name is needed.
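The name operations above can be sketched as follows, reusing the name "/usr/randy/ada-directories.ads" from the earlier discussion of full and simple names. The procedure name is hypothetical and a Unix-like target is assumed; the file need not exist, since these functions operate on names.

```ada
with Ada.Text_IO;
with Ada.Directories;

procedure Name_Parts_Demo is
   use Ada.Directories;

   --  Name taken from the discussion of full and simple names;
   --  assumes a Unix-like target.
   Name : constant String := "/usr/randy/ada-directories.ads";
begin
   --  Per the discussion, the simple name is the part following
   --  the last '/'.
   Ada.Text_IO.Put_Line ("Simple name          : " & Simple_Name (Name));

   --  String manipulation only: the directory part of Name.
   Ada.Text_IO.Put_Line ("Containing directory : "
                         & Containing_Directory (Name));

   --  Full_Name is the only one of these whose result can depend on
   --  the current default directory.
   Ada.Text_IO.Put_Line ("Full name            : " & Full_Name (Name));

   --  Apply Full_Name to the containing directory when a full
   --  directory name is needed.
   Ada.Text_IO.Put_Line ("Full containing dir  : "
                         & Full_Name (Containing_Directory (Name)));
exception
   when Name_Error =>
      Ada.Text_IO.Put_Line ("The name cannot identify a file on this target.");
end Name_Parts_Demo;
```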
    function Extension (Name : in String) return String;

Returns the extension name corresponding to Name. The extension name is a portion of a simple name (not including any separator characters), typically used to identify the file class. If the external environment does not have extension names, then the null string is returned. The exception Name_Error is propagated if the string given as Name does not allow the identification of an external file.

AARM Discussion: For Unix and Windows, the extension is the portion of the simple name following the rightmost period. For example, in the simple name "RM-A-8.html", the extension is "html".

    function Base_Name (Name : in String) return String;

Returns the base name corresponding to Name. The base name is the remainder of a simple name after removing any extension and extension separators. The exception Name_Error is propagated if the string given as Name does not allow the identification of an external file (including directories and special files).

AARM Discussion: For Unix and Windows, the base name is the portion of the simple name preceding the rightmost period. For example, in the simple name "RM-A-8.html", the base name is "RM-A-8".

    function Compose (Containing_Directory : in String := "";
                      Name                 : in String;
                      Extension            : in String := "") return String;

Returns the name of the external file with the specified Containing_Directory, Name, and Extension. If Extension is the null string, then Name is interpreted as a simple name; otherwise Name is interpreted as a base name. The exception Name_Error is propagated if the string given as Containing_Directory is not null and does not allow the identification of a directory, or if the string given as Extension is not null and is not a possible extension, or if the string given as Name is not a possible simple name (if Extension is null) or base name (if Extension is non-null).
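As a rough sketch of how Extension, Base_Name, and Compose fit together, the following takes the simple name "RM-A-8.html" used in the discussions above, splits it, and composes a sibling name with a different extension. The procedure name and the replacement extension "txt" are hypothetical, and a Unix or Windows target is assumed.

```ada
with Ada.Text_IO;
with Ada.Directories;

procedure Extension_Demo is
   use Ada.Directories;

   --  Simple name taken from the discussions above.
   Name : constant String := "RM-A-8.html";

   --  Extension is non-null here, so Name is interpreted by Compose
   --  as a base name.
   Sibling : constant String :=
     Compose (Name => Base_Name (Name), Extension => "txt");
begin
   Ada.Text_IO.Put_Line ("Extension : " & Extension (Name));
   Ada.Text_IO.Put_Line ("Base name : " & Base_Name (Name));
   Ada.Text_IO.Put_Line ("Sibling   : " & Sibling);

   --  Recombine the two parts of the original name.
   Ada.Text_IO.Put_Line
     ("Recombined: "
      & Compose (Name => Base_Name (Name), Extension => Extension (Name)));
exception
   when Name_Error =>
      Ada.Text_IO.Put_Line ("A name part is not acceptable on this target.");
end Extension_Demo;
```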
AARM Discussion: If Name is null, Name_Error should be raised, as a null string is not a possible simple name or base name.

    type File_Kind is (Directory, Ordinary_File, Special_File);

The type File_Kind represents the kind of file represented by an external file or directory.

    type File_Size is range 0 .. <implementation-defined>;

The type File_Size represents the size of an external file.

    function Exists (Name : in String) return Boolean;

Returns True if an external file represented by Name exists, and False otherwise. The exception Name_Error is propagated if the string given as Name does not allow the identification of an external file (including directories and special files).

    function Kind (Name : in String) return File_Kind;

Returns the kind of external file represented by Name. The exception Name_Error is propagated if the string given as Name does not allow the identification of an existing external file.

    function Size (Name : in String) return File_Size;

Returns the size of the external file represented by Name. The size of an external file is the number of stream elements contained in the file. If the external file is not an ordinary file, the result is implementation-defined. The exception Name_Error is propagated if the string given as Name does not allow the identification of an existing external file. The exception Constraint_Error is propagated if the file size is not a value of type File_Size.

AARM Discussion: We allow raising Constraint_Error, so that an implementation for a system with 64-bit file sizes does not need to support full numerics on 64-bit integers just to implement this package. Of course, if 64-bit integers are available on such a system, they should be used when defining type File_Size.

    function Modification_Time (Name : in String) return Ada.Calendar.Time;

Returns the time that the external file represented by Name was most recently modified. If the external file is not an ordinary file, the result is implementation-defined.
The exception Name_Error is propagated if the string given as Name does not allow the identification of an existing external file. The exception Use_Error is propagated if the external environment does not support reading the modification time of the file with the name given by Name (in the absence of Name_Error).

    type Directory_Entry_Type is limited private;

The type Directory_Entry_Type represents a single item in a directory. These items can only be created by the Get_Next_Entry procedure in this package. Information about the item can be obtained from the functions declared in this package. A default-initialized object of this type is invalid; objects returned from Get_Next_Entry are valid.

    type Filter_Type is array (File_Kind) of Boolean;

The type Filter_Type specifies which directory entries are provided from a search operation. If the Directory component is True, directory entries representing directories are provided. If the Ordinary_File component is True, directory entries representing ordinary files are provided. If the Special_File component is True, directory entries representing special files are provided.

    type Search_Type is limited private;

The type Search_Type contains the state of a directory search. A default-initialized Search_Type object has no entries available (function More_Entries returns False). Type Search_Type needs finalization (see 7.6).

[Editorial note: The last sentence depends on AI-360, drop it if AI-360 is not included in the Amendment.]

    procedure Start_Search (Search    : in out Search_Type;
                            Directory : in String;
                            Pattern   : in String;
                            Filter    : in Filter_Type := (others => True));

Starts a search in the directory named by Directory for entries matching Pattern. Pattern represents a pattern for matching file names. If Pattern is null, all items in the directory are matched; otherwise, the interpretation of Pattern is implementation-defined. Only items that match Filter will be returned.
After a successful call on Start_Search, the object Search may have entries available, but it may have no entries available if no files or directories match Pattern and Filter. The exception Name_Error is propagated if the string given by Directory does not identify an existing directory, or if Pattern does not allow the identification of any possible external file or directory. The exception Use_Error is propagated if the external environment does not support the searching of the directory with the given name (in the absence of Name_Error). When Start_Search propagates Name_Error or Use_Error, the object Search will have no entries available.

AARM Discussion: Pattern should use the pattern matching characters commonly used on the target. For instance, on Unix and Windows, both '*' and '?' should be supported, with their conventional meaning.

    procedure End_Search (Search : in out Search_Type);

Ends the search represented by Search. After a successful call on End_Search, the object Search will have no entries available.

    function More_Entries (Search : in Search_Type) return Boolean;

Returns True if more entries are available to be returned by a call to Get_Next_Entry for the specified search object, and False otherwise.

    procedure Get_Next_Entry (Search          : in out Search_Type;
                              Directory_Entry : out Directory_Entry_Type);

Returns the next Directory_Entry for the search described by Search that matches the pattern and filter. If no further matches are available, Status_Error is raised. It is implementation-defined as to whether the results returned by this routine are altered if the contents of the directory are altered while the Search object is valid (for example, by another program). The exception Use_Error is propagated if the external environment does not support continued searching of the directory represented by Search.
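The object-based search operations above allow a search to be abandoned part way through. The following sketch, under the assumption that the default directory can be searched, looks for the first subdirectory of the default directory and stops as soon as one is found. The procedure name is hypothetical, and skipping "." and ".." is an assumption about what Unix-like and Windows targets may report.

```ada
with Ada.Text_IO;
with Ada.Directories;

procedure First_Subdirectory is
   use Ada.Directories;

   Search : Search_Type;
   Item   : Directory_Entry_Type;
   Found  : Boolean := False;
begin
   Start_Search (Search    => Search,
                 Directory => Current_Directory,
                 Pattern   => "",    --  null pattern: match every item
                 Filter    => (Directory => True, others => False));

   while More_Entries (Search) loop
      Get_Next_Entry (Search, Item);

      --  Some targets report "." and ".." as entries (assumption).
      if Simple_Name (Item) /= "." and then Simple_Name (Item) /= ".." then
         Ada.Text_IO.Put_Line ("First subdirectory: " & Full_Name (Item));
         Found := True;
         exit;                       --  stop in the middle of the search
      end if;
   end loop;

   End_Search (Search);              --  Search has no entries available now

   if not Found then
      Ada.Text_IO.Put_Line ("No subdirectory found.");
   end if;
end First_Subdirectory;
```

Calling More_Entries before each Get_Next_Entry avoids the Status_Error that is raised when no further matches are available.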
    procedure Search (
       Directory : in String;
       Pattern   : in String;
       Filter    : in Filter_Type := (others => True);
       Process   : not null access procedure
                      (Directory_Entry : in Directory_Entry_Type));

Searches, in the directory named by Directory, passing each entry matching Pattern to a call of the subprogram designated by Process. Pattern represents a pattern for matching file names. If Pattern is null, all items in the directory are matched; otherwise, the interpretation of Pattern is implementation-defined. Only items that match Filter will be returned. The exception Name_Error is propagated if the string given by Directory does not identify an existing directory, or if Pattern does not allow the identification of any possible external file or directory. The exception Use_Error is propagated if the external environment does not support the searching of the directory with the given name (in the absence of Name_Error).

AARM Discussion: Pattern should be interpreted in the same way as for Start_Search.

    function Simple_Name (Directory_Entry : in Directory_Entry_Type)
        return String;

Returns the simple external name of the external file (including directories) represented by Directory_Entry. The format of the name returned is implementation-defined. The exception Status_Error is propagated if Directory_Entry is invalid.

    function Full_Name (Directory_Entry : in Directory_Entry_Type)
        return String;

Returns the full external name of the external file (including directories) represented by Directory_Entry. The format of the name returned is implementation-defined. The exception Status_Error is propagated if Directory_Entry is invalid.

    function Kind (Directory_Entry : in Directory_Entry_Type)
        return File_Kind;

Returns the kind of external file represented by Directory_Entry. The exception Status_Error is propagated if Directory_Entry is invalid.
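A minimal sketch of the Search procedure with a Process subprogram follows; it prints the kind and simple name of every entry in the default directory. The names List_Default_Directory and Show are hypothetical, and the sketch assumes the default directory can be searched.

```ada
with Ada.Text_IO;
with Ada.Directories;

procedure List_Default_Directory is
   use Ada.Directories;

   --  Called by Search once for each matching directory entry.
   procedure Show (Directory_Entry : in Directory_Entry_Type) is
   begin
      Ada.Text_IO.Put_Line
        (File_Kind'Image (Kind (Directory_Entry))
         & "  " & Simple_Name (Directory_Entry));
   end Show;

begin
   Search (Directory => Current_Directory,
           Pattern   => "",               --  null pattern: match every item
           Filter    => (others => True),
           Process   => Show'Access);
end List_Default_Directory;
```

Because Process is an anonymous access-to-subprogram parameter, a nested procedure such as Show can be passed with 'Access.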
    function Size (Directory_Entry : in Directory_Entry_Type)
        return File_Size;

Returns the size of the external file represented by Directory_Entry. The size of an external file is the number of stream elements contained in the file. If the external file represented by Directory_Entry is not an ordinary file, the result is implementation-defined. The exception Status_Error is propagated if Directory_Entry is invalid. The exception Constraint_Error is propagated if the file size is not a value of type File_Size.

    function Modification_Time (Directory_Entry : in Directory_Entry_Type)
        return Ada.Calendar.Time;

Returns the time that the external file represented by Directory_Entry was most recently modified. If the external file represented by Directory_Entry is not an ordinary file, the result is implementation-defined. The exception Status_Error is propagated if Directory_Entry is invalid. The exception Use_Error is propagated if the external environment does not support reading the modification time of the file represented by Directory_Entry.

Implementation Requirements

For Copy_File, if Source_Name identifies an existing external ordinary file created by a predefined Ada input-output package, and Target_Name and Form can be used in the Create operation of that input-output package with mode Out_File without raising an exception, then Copy_File shall not propagate Use_Error.

AARM Discussion: This means that Copy_File will copy any file that the Ada programmer could copy (by writing some possibly complicated Ada code).

Implementation Advice

If other information about a file (such as the owner or creation date) is available in a directory entry, the implementation should provide functions in a child package Directories.Information to retrieve it.

AARM Note: For Microsoft Windows, Directories.Information should contain at least the following routines:

    package Ada.Directories.Information is
       -- System-specific directory information.
       -- Version of the Microsoft Windows operating system.

       function Creation_Time (Name : in String) return Ada.Calendar.Time;

       function Last_Access_Time (Name : in String) return Ada.Calendar.Time;

       function Is_Read_Only (Name : in String) return Boolean;

       function Needs_Archiving (Name : in String) return Boolean;
          -- This generally means that the file needs to be backed up.
          -- The flag is only cleared by backup programs.

       function Is_Compressed (Name : in String) return Boolean;

       function Is_Encrypted (Name : in String) return Boolean;

       function Is_Hidden (Name : in String) return Boolean;

       function Is_System (Name : in String) return Boolean;

       function Is_Offline (Name : in String) return Boolean;

       function Is_Temporary (Name : in String) return Boolean;

       function Is_Sparse (Name : in String) return Boolean;

       function Is_Not_Indexed (Name : in String) return Boolean;

       function Creation_Time (Directory_Entry : in Directory_Entry_Type)
           return Ada.Calendar.Time;

       function Last_Access_Time (Directory_Entry : in Directory_Entry_Type)
           return Ada.Calendar.Time;

       function Is_Read_Only (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       function Needs_Archiving (Directory_Entry : in Directory_Entry_Type)
           return Boolean;
          -- This generally means that the file needs to be backed up.
          -- The flag is only cleared by backup programs.
       function Is_Compressed (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       function Is_Encrypted (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       function Is_Hidden (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       function Is_System (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       function Is_Offline (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       function Is_Temporary (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       function Is_Sparse (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       function Is_Not_Indexed (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       -- Additional implementation-defined subprograms allowed here.
    end Ada.Directories.Information;

For Unix-like systems (POSIX, Linux, etc.), Directories.Information should contain at least the following routines:

    package Ada.Directories.Information is
       -- System-specific directory information.
       -- Unix and similar systems version.

       function Last_Access_Time (Name : in String) return Ada.Calendar.Time;

       function Last_Status_Change_Time (Name : in String)
           return Ada.Calendar.Time;

       type Permission is
          (Others_Execute, Others_Write, Others_Read,
           Group_Execute,  Group_Write,  Group_Read,
           Owner_Execute,  Owner_Write,  Owner_Read,
           Set_Group_ID,   Set_User_ID);

       type Permission_Set_Type is array (Permission) of Boolean;

       function Permission_Set (Name : in String) return Permission_Set_Type;

       function Owner (Name : in String) return String;
          -- Returns the image of the User_Id. If a definition of User_Id
          -- is available, an implementation-defined version of Owner
          -- returning User_Id should also be defined.

       function Group (Name : in String) return String;
          -- Returns the image of the Group_Id. If a definition of Group_Id
          -- is available, an implementation-defined version of Group
          -- returning Group_Id should also be defined.
       function Is_Block_Special_File (Name : in String) return Boolean;

       function Is_Character_Special_File (Name : in String) return Boolean;

       function Is_FIFO (Name : in String) return Boolean;

       function Is_Symbolic_Link (Name : in String) return Boolean;

       function Is_Socket (Name : in String) return Boolean;

       function Last_Access_Time (Directory_Entry : in Directory_Entry_Type)
           return Ada.Calendar.Time;

       function Last_Status_Change_Time (Directory_Entry : in Directory_Entry_Type)
           return Ada.Calendar.Time;

       function Permission_Set (Directory_Entry : in Directory_Entry_Type)
           return Permission_Set_Type;

       function Owner (Directory_Entry : in Directory_Entry_Type)
           return String;
          -- See Owner above.

       function Group (Directory_Entry : in Directory_Entry_Type)
           return String;
          -- See Group above.

       function Is_Block_Special_File (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       function Is_Character_Special_File (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       function Is_FIFO (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       function Is_Symbolic_Link (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       function Is_Socket (Directory_Entry : in Directory_Entry_Type)
           return Boolean;

       -- Additional implementation-defined subprograms allowed here.
    end Ada.Directories.Information;

We give these definitions to give guidance so that every implementation for a given target is not unnecessarily different. Implementers are encouraged to make packages for other targets as similar to these as possible.

Start_Search and Search should raise Use_Error if Pattern is malformed, but not if it could represent a file in the directory but does not actually do so.

Rename should be supported at least when both New_Name and Old_Name are simple names and New_Name does not identify an existing external file.
AARM Discussion: "Supported" includes raising an exception if either name is malformed, the file to rename doesn't exist, insufficient permission for the operation exists, or similar problems. But this advice requires implementations to document what they do, and tells implementers that simply raising Use_Error isn't acceptable.

Notes

The operations Containing_Directory, Full_Name, Simple_Name, Base_Name, Extension, and Compose operate on file names, not external files. The files identified by these operations do not need to exist. Name_Error is raised only if the file name is malformed and cannot possibly identify a file. Of these operations, only the result of Full_Name depends on the current default directory; the result of the others depends only on their parameters.

Using access types, values of Search_Type and Directory_Entry_Type can be saved and queried later. However, another task or application can modify or delete the file represented by a Directory_Entry_Type value or the directory represented by a Search_Type value; such a value can only give the information valid at the time it is created. Therefore, long-term storage of these values is not recommended.

If the target system does not support directories inside of directories, Kind will never return Directory, and Containing_Directory will always raise Use_Error.

If the target system does not support creation or deletion of directories, Create_Directory, Create_Path, Delete_Directory, and Delete_Tree will always propagate Use_Error.

The second sentence of A.8.2(22) is deleted. (The Full_Name function provides the needed functionality without any overhead on Open and Create.)

!example

Let's say that we wanted to display the sizes of all of the files with a particular extension in a directory.
We could do it as follows:

    with Ada.Text_IO, Ada.Directories;
    procedure Find_Executables is
       Search : Ada.Directories.Search_Type;
       Item   : Ada.Directories.Directory_Entry_Type;
       use type Ada.Directories.Directory_Entry_Type;
       Executable_Extension : constant String := "EXE";
    begin
       -- Search C:\Bin for ordinary files matching "*.EXE".
       Ada.Directories.Start_Search
         (Search    => Search,
          Directory => "C:\Bin",
          Pattern   => Ada.Directories.Compose
                         (Name => "*", Extension => Executable_Extension),
          Filter    => (Ada.Directories.Ordinary_File => True,
                        others => False));
       -- Report the simple name and size of each matching entry.
       while Ada.Directories.More_Entries (Search => Search) loop
          Ada.Directories.Get_Next_Entry
            (Search => Search, Directory_Entry => Item);
          Ada.Text_IO.Put_Line
            ("Found file: " & Ada.Directories.Simple_Name(Item) &
             " Size:" &
             Ada.Directories.File_Size'Image(Ada.Directories.Size(Item)));
       end loop;
       Ada.Directories.End_Search (Search => Search);
    end Find_Executables;

!discussion The proposed package is based on the existing Claw package (Claw.Directories) [which was designed for Microsoft Windows only], the Ada POSIX bindings, and the GNAT package GNAT.Directory_Operations. The Ada POSIX bindings cannot be used directly, because they are rather UNIX-centric. For instance, Create_Directory takes a POSIX_Permissions parameter, which is very difficult to map to another operating system. They also define an entirely new set of exceptions, which would complicate the definition of the package. The names of the routines are based on those used in POSIX and GNAT. Claw uses shorter names based on the idea that the parameter and/or prefix will make it clear what type of object is being created. For example, Create_Directory is Create, and Current_Directory is Current. This seems too radical. We considered unifying the Delete_File and Delete_Directory operations as Delete, which would slightly simplify the description of the operation. 
However, since both Unix and Windows use different system calls to implement Delete_File and Delete_Directory, the implementation would be substantially more complex (requiring determining the file kind before proceeding). That seemed too complex, especially as the programmer typically knows whether they are deleting a file or directory. POSIX and Claw provide an iterator version of the directory searching mechanism; POSIX provides only an iterator version. We provide both an iterator version and an object-based version, which eliminates the need to provide a way to stop an iteration in the middle. A "quit" parameter has proven to be very confusing, even to experienced programmers. We don't provide such a parameter; if it is necessary to stop an iteration in the middle, use the object-based version of searching. POSIX and GNAT provide only the item name as a result from a directory search operation. This package provides a private type, in order that other information can be provided, such as the file size and modification time. Moreover, we encourage providing other information if it is available. (For instance, Windows provides creation and last access time stamps in a directory entry.) Of course, using such information is not portable. The implementation advice above shows the intent. The package is defined as a child of Ada. This is necessary as the package is designed to work with all kinds of files, so it would be inappropriate for it to be a child of any specific IO package, and making it a child of IO_Exceptions also seems inappropriate. Moreover, we could cause conflicts by defining grandchildren of Ada, but not for children of Ada (since it is illegal to compile children of Ada, and implementors should not define their own, while there are no such restrictions for grandchildren of Ada). A.8.2(22) has long been tested by the ACATS as a requirement. A proper implementation of this rule is expensive, as every open must save the full name in order to be tolerant of changes in the default directory. 
(An Ada 95 implementation could be unfriendly and ignore changes in the default directory, but the introduction of this package would make that implementation incorrect). It is better to provide those operations to those who need them; so Full_Name is included with this package (with a requirement to support the old A.8.2(22)), and the rule has been deleted from A.8.2(22). Thus, the effect of the Ada 95 Name can be provided by: Ada.Directories.Full_Name(Name(File)) directly after opening a file without the overhead for users that don't need it. !corrigendum A.8.2(22) @drepl @xindent @dby @xindent !corrigendum A.16(1) @dinsc The package Directories provides operations for manipulating files and directories, and their names. @i<@s8> The library package Directories has the following declaration: @xcode<@b Ada.IO_Exceptions; @b Ada.Calendar; @b Ada.Directories @b -- @ft<@i> @b Current_Directory @b String; @b Set_Directory (Directory : in String); @b Create_Directory (New_Directory : @b String; Form : @b String := ""); @b Delete_Directory (Directory : @b String); @b Create_Path (New_Directory : @b String; Form : @b String := ""); @b Delete_Tree (Directory : @b String); @b Delete_File (Name : @b String); @b Rename (Old_Name, New_Name : @b String); @b Copy_File (Source_Name, Target_Name : @b String; Form : @b String := ""); -- @ft<@i> @b Full_Name (Name : @b String) @b String; @b Simple_Name (Name : @b String) @b String; @b Containing_Directory (Name : @b String) @b String; @b Extension (Name : @b String) @b String; @b Base_Name (Name : @b String) @b String; @b Compose (Containing_Directory : @b String := ""; Name : @b String; Extension : @b String := "") @b String; -- @ft<@i> @b File_Kind @b (Directory, Ordinary_File, Special_File); @b File_Size @b 0 .. 
@ft<@i>; @b Exists (Name : @b String) @b Boolean; @b Kind (Name : @b String) @b File_Kind; @b Size (Name : @b String) @b File_Size; @b Modification_Time (Name : @b String) @b Ada.Calendar.Time; -- @ft<@i> @b Directory_Entry_Type @b; @b Filter_Type @b (File_Kind) @b Boolean; @b Search_Type @b; @b Start_Search (Search : @b Search_Type; Directory : @b String; Pattern : @b String; Filter : @b Filter_Type := (@b =@> True)); @b End_Search (Search : @b Search_Type); @b More_Entries (Search : @b Search_Type) @b Boolean; @b Get_Next_Entry (Search : @b Search_Type; Directory_Entry : @b Directory_Entry_Type); @b Search ( Directory : @b String; Pattern : @b String; Filter : @b Filter_Type := (@b =@> True); Process : @b (Directory_Entry : @b Directory_Entry_Type)); -- @ft<@i> @b Simple_Name (Directory_Entry : @b Directory_Entry_Type) @b String; @b Full_Name (Directory_Entry : @b Directory_Entry_Type) @b String; @b Kind (Directory_Entry : @b Directory_Entry_Type) @b File_Kind; @b Size (Directory_Entry : @b Directory_Entry_Type) @b File_Size; @b Modification_Time (Directory_Entry : @b Directory_Entry_Type) @b Ada.Calendar.Time; Status_Error : @b Ada.IO_Exceptions.Status_Error; Name_Error : @b Ada.IO_Exceptions.Name_Error; Use_Error : @b Ada.IO_Exceptions.Use_Error; Device_Error : @b Ada.IO_Exceptions.Device_Error; @b -- @ft<@i> @b Ada.Directories;> External files may be classified as directories, special files, or ordinary files. A @i<directory> is an external file that is a container for files on the target system. A @i<special file> is an external file that cannot be created or read by a predefined Ada input-output package. External files that are not special files or directories are called @i<ordinary files>. A @i<file name> is a string identifying an external file. Similarly, a @i<directory name> is a string identifying a directory. The interpretation of file names and directory names is implementation-defined. The @i<full name> of an external file is a full specification of the name of the file. 
If the external environment allows alternative specifications of the name (for example, abbreviations), the full name should not use such alternatives. A full name typically will include the names of all of the directories that contain the item. The @i<simple name> of an external file is the name of the item, not including any containing directory names. Unless otherwise specified, a file name or directory name parameter in a call to a predefined Ada input-output subprogram can be a full name, a simple name, or any other form of name supported by the implementation. The @i<default directory> is the directory that is used if a directory or file name is not a full name (that is, when the name does not fully identify all of the containing directories). A @i<directory entry> is a single item in a directory, identifying a single external file (including directories and special files). For each function that returns a string, the lower bound of the returned value is 1. The following file and directory operations are provided: @xcode<@b Current_Directory @b String;> @xindent @xcode<@b Set_Directory (Directory : @b String);> @xindent @xcode<@b Create_Directory (New_Directory : @b String; Form : @b String := "");> @xindent @xcode<@b Delete_Directory (Directory : @b String);> @xindent @xcode<@b Create_Path (New_Directory : @b String; Form : @b String := "");> @xindent @xcode<@b Delete_Tree (Directory : @b String);> @xindent @xcode<@b Delete_File (Name : @b String);> @xindent @xcode<@b Rename (Old_Name, New_Name : @b String);> @xindent @xcode<@b Copy_File (Source_Name, Target_Name : @b String; Form : @b String := "");> @xindent The following file and directory name operations are provided: @xcode<@b Full_Name (Name : @b String) @b String;> @xindent @xcode<@b Simple_Name (Name : @b String) @b String;> @xindent @xcode<@b Containing_Directory (Name : @b String) @b String;> @xindent @xcode<@b Extension (Name : @b String) @b String;> @xindent @xcode<@b Base_Name (Name : @b String) @b String;> @xindent @xcode<@b Compose 
(Containing_Directory : @b String := ""; Name : @b String; Extension : @b String := "") @b String;> @xindent The following file and directory queries and types are provided: @xcode<@b File_Kind @b (Directory, Ordinary_File, Special_File);> @xindent @xcode<@b File_Size @b 0 .. @ft<@i>;> @xindent @xcode<@b Exists (Name : @b String) @b Boolean;> @xindent @xcode<@b Kind (Name : @b String) @b File_Kind;> @xindent @xcode<@b Size (Name : @b String) @b File_Size;> @xindent @xcode<@b Modification_Time (Name : @b String) @b Ada.Calendar.Time;> @xindent The following directory searching operations and types are provided: @xcode<@b Directory_Entry_Type @b;> @xindent @xcode<@b Filter_Type @b (File_Kind) @b Boolean;> @xindent @xcode<@b Search_Type @b;> @xindent @xcode<@b Start_Search (Search : @b Search_Type; Directory : @b String; Pattern : @b String; Filter : @b Filter_Type := (@b =@> True));> @xindent @xcode<@b End_Search (Search : @b Search_Type);> @xindent @xcode<@b More_Entries (Search : @b Search_Type) @b Boolean;> @xindent @xcode<@b Get_Next_Entry (Search : @b Search_Type; Directory_Entry : @b Directory_Entry_Type);> @xindent @xcode<@b Search ( Directory : @b String; Pattern : @b String; Filter : @b Filter_Type := (others =@> True); Process : @b (Directory_Entry : @b Directory_Entry_Type));> @xindent @xcode<@b Simple_Name (Directory_Entry : @b Directory_Entry_Type) @b String;> @xindent @xcode<@b Full_Name (Directory_Entry : @b Directory_Entry_Type) @b String;> @xindent @xcode<@b Kind (Directory_Entry : @b Directory_Entry_Type) @b File_Kind;> @xindent @xcode<@b Size (Directory_Entry : @b Directory_Entry_Type) @b File_Size;> @xindent @xcode<@b Modification_Time (Directory_Entry : @b Directory_Entry_Type) @b Ada.Calendar.Time;> @xindent @i<@s8> For Copy_File, if Source_Name identifies an existing external ordinary file created by a predefined Ada input-output package, and Target_Name and Form can be used in the Create operation of that input-output package with mode 
Out_File without raising an exception, then Copy_File shall not propagate Use_Error. @i<@s8> If other information about a file (such as the owner or creation date) is available in a directory entry, the implementation should provide functions in a child package Directories.Information to retrieve it. Start_Search and Search should raise Use_Error if Pattern is malformed, but not if it could represent a file in the directory but does not actually do so. For Rename, if both New_Name and Old_Name are simple names, and New_Name does not identify an existing external file, then Rename should not propagate Use_Error. @xindent<@s9> !ACATS Test ACATS test(s) need to be created. !appendix From: Randy Brukardt Sent: Wednesday, October 04, 2000 9:35 PM Most modern operating systems contain some sort of tree-structured file system. Many applications need to manage these file systems (by creating and removing directories, searching for files, and the like). Most Ada 95 compilers provide some sort of access to the operations needed to manage these systems. But, these packages differ in many ways, making portable Ada 95 programs impossible. The POSIX libraries provide operations for doing this, but these usually are available only on POSIX systems, leaving out many popular operating systems including MS-DOS, most flavors of Windows, and even Linux. Ada 95 has already opened the door to standard packages that are not necessarily applicable to all implementations with Ada.Command_Line. Therefore, I am proposing a package Ada.Directories. This package is based on the existing Claw package (Claw.Directories) [which was designed for Microsoft Windows only], the Ada POSIX bindings, and the GNAT package GNAT.Directory_Operations. 
Static Semantics with Ada.IO_Exceptions; package Ada.Directories is function Get_Current_Directory return String; procedure Set_Current_Directory (Directory : in String); procedure Create_Directory (New_Directory : in String; Form : in String := ""); procedure Remove_Directory (Directory : in String); type Directory_Entry_Type is private; No_Directory_Entry : constant Directory_Entry_Type; type Filter_Type is (Everything, Only_Directories, Only_Files); generic with procedure Action (Directory_Entry: in Directory_Entry_Type; Quit : in out Boolean); procedure For_Every_Directory_Entry (Directory : in String); generic with procedure Action (Directory_Entry: in Directory_Entry_Type; Quit : in out Boolean); procedure For_Matching_Directory_Entries (Directory : in String; Pattern : in String; Returns : in Filter_Type := Claw.Directories.Everything); -- Operations on Directory Entries: function Is_Valid (Directory_Entry: in Directory_Entry_Type) return Boolean; function Name_of (Directory_Entry : in Directory_Entry_Type) return String; function Is_Directory (Directory_Entry : in Directory_Entry_Type) return Boolean; function Is_Ordinary_File (Directory_Entry : in Directory_Entry_Type) return Boolean; Name_Error : exception renames Ada.IO_Exceptions.Name_Error; Use_Error : exception renames Ada.IO_Exceptions.Use_Error; Device_Error : exception renames Ada.IO_Exceptions.Device_Error; private -- Not specified by the language. end Ada.Directories; A @i<directory> is a container for files on the target system. A @i<directory name> is a string identifying a directory. The interpretation of directory name strings is implementation-defined. A @i<directory entry> is a single item in a directory, identifying a single external file or directory. function Get_Current_Directory return String; Returns the directory name string for the current default directory. The name returned shall be suitable for a future call to Set_Current_Directory. 
The exception Use_Error is propagated if the external environment does not support a default directory. procedure Set_Current_Directory (Directory : in String); Sets the current default directory. The exception Name_Error is propagated if string Directory does not identify an existing directory. The exception Use_Error is propagated if the external environment does not support making Directory (in the absence of Name_Error) a default directory. procedure Create_Directory (New_Directory : in String; Form : in String := ""); Create a directory with name New_Directory. The Form can be used to give system-dependent characteristics of the directory; the interpretation of the Form parameter is implementation-defined. A null string for Form specifies the use of the default options of the implementation of the new directory. The exception Name_Error is propagated if string New_Directory does not identify a possible directory. The exception Use_Error is propagated if the external environment does not support the creation of a directory with the given name (in the absence of Name_Error) and Form. procedure Remove_Directory (Directory : in String); Remove an existing directory with name Directory. The exception Name_Error is propagated if string Directory does not identify an existing directory. The exception Use_Error is propagated if the external environment does not support the deletion of the directory with the given name (in the absence of Name_Error). type Directory_Entry_Type is private; The type Directory_Entry_Type represents a single item in a directory. These items can only be created by the generic directory searching procedures in this package. Information about the item can be obtained from the functions declared in this package. No_Directory_Entry represents an invalid directory entry. A default initialized object of this type shall be set to No_Directory_Entry. 
type Filter_Type is (Everything, Only_Directories, Only_Files); The type Filter_Type specifies which directory entries are provided from a search operation. Everything specifies that all matching directory entries are provided. Only_Directories specifies that only directory entries representing directories are provided. Only_Files specifies that only directory entries representing ordinary files (not directories) are provided. generic with procedure Action (Directory_Entry: in Directory_Entry_Type; Quit : in out Boolean); procedure For_Every_Directory_Entry (Directory : in String); Calls Action once for each directory entry in the directory named by Directory. When Action is called, Directory_Entry contains a valid directory entry, and Quit is False. If Action sets Quit to True, For_Every_Directory_Entry finishes with no further calls to action. The exception Name_Error is propagated if string Directory does not identify an existing directory. The exception Use_Error is propagated if the external environment does not support the searching of the directory with the given name (in the absence of Name_Error). generic with procedure Action (Directory_Entry: in Directory_Entry_Type; Quit : in out Boolean); procedure For_Matching_Directory_Entries (Directory : in String; Pattern : in String; Returns : in Filter_Type := Claw.Directories.Everything); Calls Action once for each directory entry in the directory named by Directory which matches Returns and Pattern. Pattern represents a file name matching pattern; its interpretation is implementation-defined. When Action is called, Directory_Entry contains a valid directory entry, and Quit is False. If Action sets Quit to True, For_Every_Directory_Entry finishes with no further calls to action. The exception Name_Error is propagated if string Directory does not identify an existing directory, or if Pattern does not identify any possible file. 
The exception Use_Error is propagated if the external environment does not support the searching of the directory with the given name (in the absence of Name_Error). function Is_Valid (Directory_Entry: in Directory_Entry_Type) return Boolean; Returns True if Directory_Entry is valid, and False otherwise. function Name (Directory_Entry : in Directory_Entry_Type) return String; Returns the external name of the file, directory, or other item represented by Directory_Entry. The format of the name returned is implementation-defined. The exception Use_Error is propagated if Directory_Entry is invalid. function Is_Directory (Directory_Entry : in Directory_Entry_Type) return Boolean; Returns True if the item represented by Directory_Entry is a directory, and False otherwise. The exception Use_Error is propagated if Directory_Entry is invalid. function Is_Ordinary_File (Directory_Entry : in Directory_Entry_Type) return Boolean; Returns True if the item represented by Directory_Entry is an ordinary file (not a directory), and False otherwise. The exception Use_Error is propagated if Directory_Entry is invalid. Implementation Advice If other information about a file is available (such as the size or creation date) in a directory entry, the implementation should provide functions in a child package Ada.Directories.Information to retrieve it. The name returned by Name should be a file name without any directory information. Notes Values of Directory_Entry_Type can be copied and queried later. However, another task or application can modify or delete the file represented by a Directory_Entry_Type value; such a value can only give the information valid at the time it is created. Therefore, long-term storage of these values is not recommended. If the target system does not support directories inside of directories, Is_Directory will always return False. 
If the target system does not support creation or deletion of directories, Create_Directory and Remove_Directory will always propagate Use_Error. Action may not be called at all for For_Every_Directory_Entry and For_Matching_Directory_Entries if there are no matching directory entries. Design Notes: 1) The names of the routines are based on those used in POSIX and GNAT. Claw uses shorter names based on the idea that the parameter and/or prefix will make it clear what type of object is being created. For example, Create_Directory is Create, and Get_Current_Directory is Get_Current. This seems too radical. 2) GNAT and Claw provide an "open" (non-iterator) version of the directory searching mechanism. I've not included that, because it may not be easy to provide an operation which can be saved/continued at a (much) later time (even in the same application.) The iterator eliminates the need to store the state of searching outside of the package. 3) POSIX and GNAT provide only the item name as a result from a directory search operation. This package provides a private type, in order that other information can be provided as it is available. (For instance, Windows provides various time stamps in a directory entry.) Of course, using such information is not portable. The implementation advice above shows the intent. 4) The intent is that Pattern in For_Matching_Directory_Entries matches the standard on the target system. For instance, on Windows, * and ? are the wildcard characters. I don't know of a good way to say this. Comments, brickbats welcome. **************************************************************** From: Robert Dewar Sent: Wednesday, October 04, 2000 9:55 PM I am dubious about trying to standardize functionality of this kind at this stage. If we do decide that this is a worthwhile approach then we should take a comprehensive view of what is needed, rather than carve out individual pieces haphazardly. 
Incidentally, in GNAT, we only make things part of the Ada hierarchy if we think they are possible candidates for such treatment. The current set of such files in Ada is: Ada.Command_Line.Remove Ada.Direct_IO.C_Streams Ada.Exceptions.Is_Null_Occurrence Ada.Sequential_IO.C_Streams Ada.Streams.Stream_IO.C_Streams Ada.Strings.Unbounded.Text_IO Ada.Strings.Wide_Unbounded.Wide_Text_IO Ada.Task_Identification.Image Ada.Text_IO.C_Streams Ada.Wide_Text_IO.C_Streams The packages we add to System are: System.Address_Image System.Assertions System.Partition_Interface System.Task_Info System.Wch_Cnv System.Wch_Con It is indeed true that our package GNAT.Directory_Operations is NOT in the Ada or System hierarchies, which means that we do not consider this a suitable area for standardization. However, if this *is* an area for standardization, I think that the minimal richness is what is found in the GNAT unit, and I find the proposed Ada child too bare. Here is the GNAT spec: -- Directory operations -- This package provides routines for manipulating directories. A directory -- can be treated as a file, using open and close routines, and a scanning -- routine is provided for iterating through the entries in a directory. package GNAT.Directory_Operations is subtype Dir_Name_Str is String; -- A subtype used in this package to represent string values that are -- directory names. A directory name is a prefix for files that appear -- within the directory. This means that for Unix systems, the string -- includes a final '/', and for DOS-like systems, it includes a final -- '\' character. It can also include drive letters if the operating -- system provides for this. The final '/' or '\' in a Dir_Name_Str is -- optional when passed as a procedure or function in parameter. type Dir_Type is limited private; -- A value used to reference a directory. Conceptually this value includes -- the identity of the directory, and a sequential position within it. 
Null_Dir : constant Dir_Type; -- Represent the value for an uninitialized or closed directory. Directory_Error : exception; -- Exception raised if the directory cannot be opened, read, closed, -- created or if it is not possible to change the current execution -- environment directory. procedure Change_Dir (Dir_Name : Dir_Name_Str); -- Changes the working directory of the current execution environment -- to the directory named by Dir_Name. -- -- Raises Directory_Error if Dir_Name does not exist. procedure Make_Dir (Dir_Name : Dir_Name_Str); -- Create a new directory named Dir_Name. -- -- Raises Directory_Error if Dir_Name cannot be created. function Get_Current_Dir return Dir_Name_Str; -- Returns the current working directory for the execution environment. procedure Get_Current_Dir (Dir : out Dir_Name_Str; Last : out Natural); -- Returns the current working directory for the execution -- environment. The name is returned in Dir_Name; Last is the index in -- Dir_Name such that Dir_Name (Last) is the last character written. If -- Dir_Name is too small for the directory name, the name will be -- truncated before being copied to Dir_Name. procedure Open (Dir : out Dir_Type; Dir_Name : in Dir_Name_Str); -- Opens the directory named by Dir_Name and returns a Dir_Type value -- that refers to this directory, and is positioned at the first entry. -- -- Raises Directory_Error if Dir_Name cannot be accessed. In that case -- Dir will be set to Null_Dir. procedure Close (Dir : in out Dir_Type); -- Closes the directory stream referred to by Dir. After calling Close -- Is_Open will return False. Dir will be set to Null_Dir. -- -- Raises Directory_Error if Dir has not been opened (Dir = Null_Dir). function Is_Open (Dir : Dir_Type) return Boolean; -- Returns True if Dir is open, or False otherwise. procedure Read (Dir : in out Dir_Type; Str : out String; Last : out Natural); -- Reads the next entry from the directory and sets Str to the name -- of that entry. 
Last is the index in Str such that Str (Last) is the -- last character written. Last is 0 when there are no more files in the -- directory. If Str is too small for the file name, the file name will -- be truncated before being copied to Str. The list of files returned -- includes directories in systems providing a hierarchical directory -- structure, including . (the current directory) and .. (the parent -- directory) in systems providing these entries. The directory is -- returned in target-OS form. -- -- Raises Directory_Error if Dir has not been opened (Dir = Null_Dir). function Read_Is_Thread_Safe return Boolean; -- Indicates if procedure Read is thread safe. On systems where the -- target system supports this functionality, Read is thread safe, -- and this function returns True (e.g. this will be the case on any -- Unix or Unix-like system providing a correct implementation of the -- function readdir_r). If the system cannot provide a thread safe -- implementation of Read, then this function returns False. private type Dir_Type_Value; type Dir_Type is access Dir_Type_Value; Null_Dir : constant Dir_Type := null; end GNAT.Directory_Operations; **************************************************************** From: Randy Brukardt Sent: Wednesday, October 04, 2000 10:23 PM I understand (but disagree) with the rest of your comment. But I am puzzled by a couple of things: > Incidentally, in GNAT, we only make things part of the Ada hierarchy if > we think they are possible candidates for such treatment. Well, an implementation isn't allowed to add children of Ada, so GNAT couldn't do that even if it made sense. > However, if this *is* an area for standardization, > I think that the minimal richness is what is found in the > GNAT unit, and I find the proposed Ada child too bare. The proposal covers virtually all of the functionality of the GNAT package (with some differences), and adds quite a bit of functionality that the GNAT package does not have. 
So I don't know where you get "bare" from. I looked at the GNAT spec before making the proposal, and used all of the ideas from it that I could. > Here is the GNAT spec: ... > subtype Dir_Name_Str is String; I left out the subtype because it didn't seem to buy anything. The description of the meaning of the string had to be left out because it wouldn't be appropriate in the standard. > type Dir_Type is limited private; This is part of the "open" version of directory searching. Claw has both, but I think an iterator version would be better suited for standardization. (I gave the reasons in the notes at the end of the proposal.) > Null_Dir : constant Dir_Type; Don't need this if you don't need the above type. > Directory_Error : exception; I suggested using IO_Exceptions for this, as these are IO operations. Moreover, we get some of the documentation requirements and definitions for free in that case, otherwise we have to write them up specifically. > procedure Change_Dir (Dir_Name : Dir_Name_Str); Called Set_Current_Directory in the proposal. "Dir" is an abbreviation, and the Ada standard doesn't like abbreviations. If people think "Directory" is too long, we could change that to "Folder". > procedure Make_Dir (Dir_Name : Dir_Name_Str); Called Create_Directory in the proposal. > function Get_Current_Dir return Dir_Name_Str; Called Get_Current_Directory in the proposal. > procedure Get_Current_Dir (Dir : out Dir_Name_Str; > Last : out Natural); I didn't provide a procedure version of the functions, but if that is thought to be important, it's easy to add. > procedure Open (Dir : out Dir_Type; Dir_Name : in Dir_Name_Str); Part of the "Open" file searching. Use the iterator For_Every_Directory_Entry instead. > procedure Close (Dir : in out Dir_Type); Also part of the "Open" file searching. Not necessary, the iterator does it automatically. > function Is_Open (Dir : Dir_Type) return Boolean; Also part of the "Open" file searching. 
Since there is no object, there is no need for this. > procedure Read > (Dir : in out Dir_Type; > Str : out String; > Last : out Natural); Also part of the "Open" file searching. The "Action" procedure and the "Name" function take the place of this. > function Read_Is_Thread_Safe return Boolean; This seems too system-specific to be part of the standard. The proposal, OTOH, offers filtered searching and the capability of retrieving properties beyond just the name, which are quite useful in my experience. **************************************************************** From: Pascal Leroy Sent: Thursday, October 05, 2000 3:13 AM > I am dubious about trying to standardize functionality of this kind at > this stage. I agree with Robert. First, I am not sure that there are many applications out there that need to run on both Unix and Windows (I don't know of any in our installed base; I know of people migrating from Unix to NT, but that's a different story altogether). Moreover, anyone designing such an application will be better off encapsulating the directory services that they need (and many other OS services, btw) than relying on an interface that will necessarily be a least common denominator. More importantly, we must make the best use of the scarce ARG resources. While a package of directory services would be nice-to-have, it's easy for users to write their own, so it's not like anyone is stuck waiting for the ARG to come up with a solution. On the other hand, there are amendment proposals on the table that deal with real language holes which cannot be circumvented in a reasonably simple fashion. We have had this discussion already when we did the TC: let's focus on the 20% of the issues that will benefit 80% of the users. **************************************************************** From: Pascal Leroy Sent: Thursday, October 05, 2000 3:02 AM > Well, an implementation isn't allowed to add children of Ada, so GNAT > couldn't do that even if it made sense. WHAT?! 
Hopefully this statement is incorrect, or else the RM needs to be changed. As far as I can tell, even the _user_ can add children of Ada (well, they better know what they are doing, but still they can). **************************************************************** From: Robert Dewar Sent: Thursday, October 05, 2000 10:23 AM <<> Incidentally, in GNAT, we only make things part of the Ada hierarchy if > we think they are possible candidates for such treatment. Well, an implementation isn't allowed to add children of Ada, so GNAT couldn't do that even if it made sense. >> But it is allowed to add grandchildren! **************************************************************** From: Michael Yoder Sent: Thursday, October 05, 2000 9:27 AM I assume "Claw.Directories.Everything" should be replaced by "Everything." **************************************************************** From: David C. Hoos, Sr. Sent: Thursday, October 05, 2000 5:03 AM I have six comments about the proposal, viz.: 1. I like the proposal, in general. 2. There is an obvious error where "Claw.Directories.Everything" should simply be "Everything" 3. The use of verb phrases for function names is not recommended by most (if not all) standards. Therefore I would recommend Get_Current_Directory be named Current_Directory, instead. 4. The function "Name_Of" should be named simply "Name", for consistency (e.g.) with Ada.Text_IO. 5. The function "Is_Ordinary_File" adds yet another term for something that has had a different name for many years. I suggest that that function be named "Is_Regular_File." 6. The word "Directory" in the names of the generic iterators is redundant -- after all, the package is named "Directories," and "Directory" is the formal name of a parameter of these iterators. Long names (i.e. no cryptic abbreviations) are useful, up to the point where additional words in the name add no new information. I was puzzled by your statement that "The POSIX libraries .... leav[e] out ... 
even Linux." The florist POSIX/Ada bindings have been available for Linux for several years. Even so, the point about those OSs left out is well-taken, and having this functionality as a child of Ada is a good idea, in my view. **************************************************************** From: jj@ddci.dk Sent: Thursday, October 05, 2000 7:38 AM It is illegal for the user (in standard mode), see [AARM A.2(4)] but I don't see how it should be illegal for the implementation. **************************************************************** From: Randy Brukardt Sent: Thursday, October 05, 2000 1:27 PM A couple of responses to comments: Pascal said: > First, I am not sure that there are many applications out > there that need to run on both Unix and Windows (I don't know of any > in our installed base; I know of people migrating from Unix to NT, > but that's a different story altogether). Well, most of the ones I've worked on do in fact run on both. The Janus/Ada compiler uses these sorts of operations for library management. The mini-web server uses them to provide download listings. Of course, in both cases, this isn't the *only* non-portable operation. Pascal continues: > While a package of directory services would be nice-to-have, > it's easy for users to write their own, so it's not like anyone is stuck > waiting for the ARG to come up with a solution. I think "easy" is the wrong word here, "possible" is more like it. Doing so requires a fairly detailed understanding of the API of the OS. However, that really isn't the point. The point is that (almost?) every Ada compiler provides this functionality (so it must be important), but they all do it differently. It is this sort of needless difference that makes real-world Ada programs less portable than they otherwise would have to be. > On the other hand, there are amendment proposals on the table that > deal with real language holes which cannot be circumvented in a > reasonably simple fashion. 
Certainly, and I would be the first to say that those should be dealt with first. But there aren't that many critical issues (thanks to Tucker's excellent job the last time around), almost everything proposed falls into the "nice to have" category. You can't convince me that "Is_Null_Occurrence" or even 'Object_Size is any more important than this. Robert Dewar said (responding to me): ><<Well, an implementation isn't allowed to add children of Ada, so GNAT >couldn't do that even if it made sense.>> >But it is allowed to add grandchildren! Yes, of course. But for the sort of significant package (such as this one), making it a grandchild of some vaguely related package doesn't make a lot of sense. Certainly, Ada.Text_IO.Directories or Ada.IO_Exceptions.Directories does not seem like a good idea; so I'm not surprised that the GNAT designers did not include this functionality in Ada. David Hoos noted a couple of errors in the proposal. I had changed "Name_of" to just "Name", but obviously missed the most important place. It's obvious that I didn't compile this package (as it is intended as a trial balloon, so I didn't want to spend too long on it). **************************************************************** From: Tucker Taft Sent: Thursday, October 05, 2000 1:02 PM Randy Brukardt wrote: > ... > Therefore, I am proposing a package Ada.Directories. This package is based > on the existing Claw package (Claw.Directories) [which was designed for > Microsoft Windows only], the Ada POSIX bindings, and the GNAT package > GNAT.Directory_Operations. > ... > Comments, brickbats welcome. I think this is an important area for standardization. I am surprised that others have not repeatedly faced the issue of moving code between Unix and NT (and Mac for some of us ;-). Certainly all the server-side code we develop these days has to run on both Unix and Win2K/NT. Even if it runs on only Unix, having to go to the Posix interface to get these basic capabilities seems like overkill, especially given that basic File I/O is part of the Ada library. 
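As a rough illustration of the kind of basic, OS-neutral capability being argued for here, the following is a minimal sketch written against the Ada.Directories specification given in the !wording section. The directory name "scratch" is hypothetical, and Exists is assumed to be among the package's query functions (it is in the version that was later standardized). It is meant only as an illustration, not as the proposal's reference usage.

```ada
with Ada.Directories;
with Ada.Text_IO;

procedure Directory_Sketch is
   use Ada.Directories;

   --  "scratch" is a hypothetical name chosen for illustration only.
   --  Compose builds the path using the host system's naming rules,
   --  so no separator character is hard-coded here.
   Scratch : constant String := Compose (Current_Directory, "scratch");

   --  Remember whether this run created the directory, so that an
   --  already existing directory is never deleted by the sketch.
   Created : Boolean := False;
begin
   if not Exists (Scratch) then
      Create_Directory (Scratch);
      Created := True;
   end if;

   Ada.Text_IO.Put_Line ("Working in: " & Full_Name (Scratch));

   if Created then
      Delete_Directory (Scratch);
   end if;
end Directory_Sketch;
```

Under these assumptions the same source text would compile unchanged on Unix and on Windows, with only the printed path format differing between them.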
**************************************************************** From: Robert Dewar Sent: Thursday, October 05, 2000 11:05 AM For GNAT, you cannot add children to Ada unless you use a special implementors-only switch (this switch of course results in non-standard behavior). And you cannot add grandchildren either without this special switch (we take full advantage of the RM permission to restrict all user additions to standard hierarchies). By the way, the RM specifically expects there to be modes to control this, and the relevant RM restriction is: 4 In the standard mode, it is illegal to compile a child of package Ada. which says nothing about grandchildren, and this omission is quite deliberate (or at least I have always assumed it is, since it is really quite important for an implementation to be able to add grandchildren). <> Well you may want to change it, but you have had five years to make that comment, and have not done so, so it is a bit late now to make a late and very big change to the language! What possible benefit would there be in changing the language at this stage. Sounds like you just did not read the RM carefully here :-) :-) Note that the deliberateness of the decision here can be assessed by the following paras in the AARM: 4.a Reason: The intention is that mentioning, say, Ada.Text_IO in a with_clause is guaranteed (at least in the standard mode) to refer to the standard version of Ada.Text_IO. The user can compile a root library unit Text_IO that has no relation to the standard version of Text_IO. 4.b Ramification: Note that Ada can have non-language-defined grandchildren, assuming the implementation allows it. Also, packages System and Interfaces can have children, assuming the implementation allows it. 4.c Implementation Note: An implementation will typically support a nonstandard mode in which compiling the language defined library units is allowed. Whether or not this mode is made available to users is up to the implementer. 
4.d An implementation could theoretically have private children of Ada, since that would be semantically neutral. However, a programmer cannot compile such a library unit. So I don't think there is an issue here, merely some language design points which Pascal had overlooked :-) **************************************************************** From: Robert Dewar Sent: Thursday, October 05, 2000 11:00 AM <> Well I have to comment that we have many users who are working on applications that have to work on both Unix and NT, but then this is something that GNAT works hard to accommodate, so the difference is not surprising. Actually note that many libraries, as opposed to applications have to solve this problem (in contexts other than GNAT, including I am sure most or all other Ada 95 compilers). For example, CLAW is a library that must work on NT and Unix, even if a given app does not. Still I think it will be relatively difficult to proceed in standardizing this, although reading Randy's latest message makes me more sympathetic to the effort (after all this is something any C programmer can deal with in a portable manner, why should it be non-portable in Ada?) And regarding the last sentence, the whole point in portable libraries is to depend on the least common denominator if you want to be portable. Also, migration is NOT such a "different story altogether". By making this portable, you eliminate one more task involved in such migration. Of course in practice all vendors have decent solutions to this problem (at least I would assume that is the case). The question is whether it is worth trying to abstract a common approach, and I must say I am somewhere in the middle on this issue at this stage, so let's see what other people think. 
**************************************************************** From: Pascal Leroy Sent: Thursday, October 05, 2000 3:35 PM > So I don't think there is an issue here, merely some language design > points which Pascal had overlooked :-) Surely I was wrong when I said that users can compile a child of Ada: that's clearly forbidden by RM95 A.2(4). However, I don't see that this paragraph forbids implementation-defined children of Ada. I can't find a requirement that says "implementation-defined units have to be compiled in the standard mode". Surely an implementation-defined child of Ada will not be compiled in the standard mode, but that's part of the black magic that takes place when the implementers prepare their predefined libraries. (To be honest, I may have a biased view here because we use a library-based compilation model; not sure how that works with a source-based model.) **************************************************************** From: Pascal Leroy Sent: Thursday, October 05, 2000 3:42 PM > > On the other hand, there are amendment proposals on the table that > > deal with real language holes which cannot be circumvented in a > > reasonably simple fashion. > > Certainly, and I would be the first to say that those should be dealt with > first. But there aren't that many critical issues (thanks to Tucker's > excellent job that last time around), almost everything proposed falls into > the "nice to have" category. You can't convince me that "Is_Null_Occurrence" > or even 'Object_Size is any more important than this. Mutually-dependent types, unchecked-union, access-to-constant parameters, access type conversions, meaning of Bit_Order, revised rules for dispatching, T'Class as generic actual: these are much more important issues IMHO than taking a random package and trying to standardize it. 
**************************************************************** From: Michael Yoder Sent: Thursday, October 05, 2000 2:21 PM Pascal wrote: >More importantly, we must make the best use of the scarce ARG resources. >While a package of directory services would be nice-to-have, it's easy for >users to write their own, so it's not like anyone is stuck waiting for the >ARG to come up with a solution. On the other hand, there are amendment >proposals on the table that deal with real language holes which cannot be >circumvented in a reasonably simple fashion. Pascal wrote: >We have had this discussion already when we did the TC: let's focus on the >20% of the issues that will benefit 80% of the users. > >Pascal I agree with Pascal's sentiment but I may disagree with where this issue lands. I would say lack of simple directory operations has been the most common annoying impediment to my programming for quite some time. (The systems for which I write common code are VMS and Unix.) That is, my current belief is that this *is* worth the use of scarce resources; it isn't a strongly held opinion. It seems to me that even if this were low payoff, it's also low effort, which may make it worthwhile regardless. **************************************************************** From: Jean-Pierre Rosen Sent: Thursday, October 05, 2000 12:29 PM From: "Randy Brukardt" >[...] > Therefore, I am proposing a package Ada.Directories. This package is based > on the existing Claw package (Claw.Directories) [which was designed for > Microsoft Windows only], the Ada POSIX bindings, and the GNAT package > GNAT.Directory_Operations. > This is certainly a good idea - I had it also ;-) . May I suggest that you have a look at package OS_Services which is available from Adalog's components page (http://pro.wanadoo.fr/adalog/compo2.htm). I take a different approach to scanning and obtaining information about directory entries. 
One of the main ideas of my approach was to set up a structure that can provide OS specific services without affecting portability for programs that don't use them, i.e. don't restrict to a least common denominator. Maybe we cannot come up with a solution that takes the best of both worlds. **************************************************************** From: Robert Dewar Sent: Thursday, October 05, 2000 7:18 PM <> Well I must say that our impression from users is that they are quite happy to use the package we provide, and we do not have the impression that this is a significant problem. **************************************************************** From: Robert Dewar Sent: Thursday, October 05, 2000 7:15 PM <> That's a real stretch. Look at the paragraphs in the AARM 4.a Reason: The intention is that mentioning, say, Ada.Text_IO in a with_clause is guaranteed (at least in the standard mode) to refer to the standard version of Ada.Text_IO. The user can compile a root library unit Text_IO that has no relation to the standard version of Text_IO. 4.b Ramification: Note that Ada can have non-language-defined grandchildren, assuming the implementation allows it. Also, packages System and Interfaces can have children, assuming the implementation allows it. This pretty clearly implies that Ada can NOT have non-language-defined children, and this is surely the intent. <> There is absolutely no difference here that results from the compilation model. In either case, there needs to be a special mechanism for adding children to Ada. But I think it would be quite wrong for an implementor to provide implementation defined children of Ada. Incidentally, I completely misunderstood Pascal's position here, I thought he was arguing AGAINST implementation defined grandchildren, and in fact he appears to be arguing FOR implementation defined children. 
**************************************************************** From: Robert A Duff Sent: Friday, October 06, 2000 9:02 AM Robert wrote: > This pretty clearly implies that Ada can NOT have non-language-defined > children, and this is surely the intent. Yes, that was the intent, as far as I recall. **************************************************************** From: Robert A Duff Sent: Friday, October 06, 2000 9:06 AM Well, I can't resist making one somewhat-technical comment: Randy wrote: > Called Set_Current_Directory in the proposal. "Dir" is an abbreviation, > and the Ada standard doesn't like abbreviations. If people think "Directory" > is too long, we could change that to "Folder". I don't think we have to avoid the "Dir" abbreviation here. It's no worse than the "IO" in "Text_IO". I could live with "Directory", too. I really, really hate the term "Folder". **************************************************************** From: Robert A Duff Sent: Friday, October 06, 2000 9:01 AM I have lots of technical comments on Randy's proposal. However, I think we should first decide whether (at least tentatively) whether to standardize this stuff at all. I'm (mildly) in favor of it. Can we argue about that first, and perhaps have a straw poll? I don't want to waste time arguing about the technical points if only two people are interested. Pascal, knowing that priorities are not absolute, but are relative to each other, wrote: > Mutually-dependent types, unchecked-union, access-to-constant parameters, > access type conversions, meaning of Bit_Order, revised rules for > dispatching, T'Class as generic actual: these are much more important issues > IMHO than taking a random package and trying to standardize it. I agree that the following are higher priority than directory ops: mutually-dependent types, access-to-constant parameters, access type conversions, T'Class as generic actual. I don't agree about unchecked-union or meaning of Bit_Order. 
I don't remember what "revised rules for dispatching" means. :-( On my list, "mutually-dependent types" is highest priority. Mike Yoder wrote: > That is, my current belief is that this *is* worth the use of scarce > resources; it isn't a strongly held opinion. It seems to me that even if > this were low payoff, it's also low effort, which may make it worthwhile > regardless. Perhaps. But I'll bet we could spend a LOT of time arguing about stylistic details. Also, whether to add one more "nice-to-have" operation to the package. **************************************************************** From: Pascal Leroy Sent: Friday, October 06, 2000 2:57 AM > It seems to me that even if > this were low payoff, it's also low effort, which may make it worthwhile > regardless. There are really three "efforts" involved here: 1 - The effort needed to standardize this package and its semantics. 2 - The effort needed to write ACAATS tests to ensure that implementers comply with the specification. 3 - The effort needed to implement the package. I'm sure that #3 is a very small effort; it really looks like an afternoon project to me. Regarding #1: the ARG has demonstrated a surprising ability to argue for hours about totally trivial details; in fact, it seems that the more trivial the issue, the more arguments it generates because everybody understands the issue and has a strong opinion. Regarding #2, the ARG has not been very good at producing tests for new or modified capabilities. The only significant effort in that direction were the tests that Bob Duff wrote in the ACAATS 2.2 timeframe. My gut feeling is that 60% of the changes that went into the corrigendum have no corresponding test. My conclusion is that I'd rather devote precious ARG-hours to other topics, or to writing ACAATS tests for old AIs. **************************************************************** From: Pascal Leroy Sent: Friday, October 06, 2000 2:48 AM > < children of Ada. 
I can't find a requirement that says > "implementation-defined units have to be compiled in the standard mode".>> > > That's a real stretch. Look at the paragraphs in the AARM > > 4.a Reason: The intention is that mentioning, say, Ada.Text_IO in > a with_clause is guaranteed (at least in the standard mode) to refer > to the standard version of Ada.Text_IO. The user can compile a root > library unit Text_IO that has no relation to the standard version of > Text_IO. I understand the language design principle of preventing a random user from unwittingly compiling a package named Ada.Text_IO, thereby resulting in plague and pestilence. (Note that you would hope that this same user would be prevented from compiling a package named Text_IO, too, but she isn't.) I don't see why it's beneficial to prevent an implementation from providing a package named Ada.Some_Useful_Services. Especially when it's OK for them to provide a root library unit named Some_Useful_Services (which causes name-space pollution) or a grandchild of Ada named Ada.Tags.Some_Useful_Services (as is illustrated by the case of directory services, there are situations where there is no good place to put a grandchild of Ada). > This pretty clearly implies that Ada can NOT have non-language-defined > children, and this is surely the intent. Even if I followed Robert's logic (and I cannot be too much convinced at this point) I don't see why it's good. > Incidentally, I completely misunderstood Pascal's position here, I thought > he was arguing AGAINST implementation defined grandchildren, and in fact > he appears to be arguing FOR implementation defined children. Correct. **************************************************************** From: Christoph Grein Sent: Friday, October 06, 2000 1:39 AM Robert Dewar wrote: ... > Of course in practice all vendors have decent solutions to this problem > (at least I would assume that is the case). 
The question is whether it > is worth trying to abstract a common approach, and I must say I am > somewhere in the middle on this issue at this stage, so let's see what > other people think. Is this an invitation to a poll? If so, I say it's worth the effort, even if it provides only the least common denominator. What I'm missing and which I think is fundamental is a requirement like function Name (Directory_Entry : in Directory_Entry_Type) return String; Returns the external name of the file, directory, or other item represented by Directory_Entry. The format of the name returned is implementation-defined. The exception Use_Error is propagated if Directory_Entry is invalid. => The returned string shall be usable for Ada.Text_IO file operations if Is_Ordinary_File is true for the given Directory_Entry (perhaps after an appropriate call to Set_Current_Directory). As yet IMHO the interaction with Ada.Text_IO is completely unspecified. **************************************************************** From: Robert Dewar Sent: Friday, October 06, 2000 10:35 AM <> I agree, the term folder would be merely an obfuscation in this environment (perhaps Randy prefers "transput" to "input-output" since it is shorter :-) **************************************************************** From: Robert Dewar Sent: Friday, October 06, 2000 10:39 AM <> Interesting, shows how the view of a language designer can be different from the view from the applications end. I would never have included access-to-constant parameters, or access type conversions, or T'Class as generic actual on my list of important things. These are simply nice to have, but are not causing anyone any significant troubles in the field as far as I can see. Yes, mutually-dependent types are significant, but in our experience, only in connection with Java environments so far. The meaning of Bit_Order on the other hand is quite important, and certainly we implemented this in GNAT because there was significant demand. 
Whether or not other compilers standardize this as required is of course not so significant to us. Similarly, we could not begin to survive without Unchecked_Union. That's an absolutely vital capability for some of the existing C bindings. From a usage point of view, having access to directory operations is also absolutely vital. An Ada compiler not providing this would be crippled from the point of view of many users. Whether this means it should be standardized is another matter. Robert Dewar (speaking now as a compiler vendor, and not as a language designer :-) **************************************************************** From: Pascal Leroy Sent: Friday, October 06, 2000 10:42 AM > Pascal, knowing that priorities are not absolute, but are relative to > each other, wrote: > > > Mutually-dependent types, unchecked-union, access-to-constant parameters, > > access type conversions, meaning of Bit_Order, revised rules for > > dispatching, T'Class as generic actual: these are much more important issues > > IMHO than taking a random package and trying to standardize it. > > I agree that the following are higher priority than directory ops: > mutually-dependent types, access-to-constant parameters, access type > conversions, T'Class as generic actual. > I don't agree about unchecked-union or meaning of Bit_Order. > I don't remember what "revised rules for dispatching" means. :-( By "revised rules for dispatching" I was alluding to AI-00232. > On my list, "mutually-dependent types" is highest priority. On my list, mutually-dependent types and unchecked-unions are the highest priority items. I could live without T'Class as generic actual. Anyway, maybe a first step would be to look at the amendment AIs that are on the table, and take a straw vote regarding their priority. The !priority of the AIs has been assigned by the editors (Bob and Randy) and presumably reflects their own feeling. 
However it would be good to know what topics are considered high-priority by the ARG as a whole. This is especially important for amendment AIs, as they are likely to require considerably more time than the AIs that merely plug holes in the RM. **************************************************************** From: Robert Dewar Sent: Friday, October 06, 2000 10:58 AM <> Well Pascal is free to second guess a clear design intention, but as I say, to change the language here would require a lot of convincing. Personally, I feel that adding implementation dependent first level children to Ada would be a horrible idea. Users need to know that packages in Ada are part of the RM and portable. Yes, I realize that this argument would apply to grand-children as well, but there was a conscious decision to compromise between two conflicting requirements here. I think that implementations should be very judicious in adding new packages to the Ada hierarchy, even at the grandchild level. I certainly do not want to encourage profligacy here by allowing children to be added. For instance I would think it just horrible if we had added Ada.SPITBOL, rather than GNAT.SPITBOL to our distribution. Note that the permission in RM (A.3(27)) is interesting: (ooops, I mean A.3.3(27)) 27 An implementation may provide additional packages as children of Ada.Characters, to declare names for the symbols of the local character set or other character sets. Now of course, under all our interpretations, this implementation permission is junk, since everyone agrees you can add grandchildren anyway, but the fact that the authors state this explicitly as implementation permission (rather than implementation advice that such packages might be appropriate) is helpful in understanding the general viewpoint here. **************************************************************** From: Robert Dewar Sent: Friday, October 06, 2000 11:07 AM <> OK, I agree with this assessment. 
Indeed mutually dependent types and unchecked unions are language extensions (yes yes, I know that technically unchecked union is not an extension :-) that we provide in GNAT, and on which both our own technology and many of our users' programs depend. In the case of WITH TYPE, we just had to go ahead with our own design (which is simpler than the full bells-and-whistles proposal from Tucker) in the absence of an agreed solution, because we absolutely required this for our Java related work. As I said earlier, unchecked union also is quite critical (again we find Tucker's version to have unnecessary bells and whistles, and prefer a simpler approach :-) **************************************************************** From: Randy Brukardt Sent: Friday, October 06, 2000 11:52 AM > Pascal, knowing that priorities are not absolute, but are relative to > each other, wrote: > > > Mutually-dependent types, unchecked-union, access-to-constant parameters, > > access type conversions, meaning of Bit_Order, revised rules for > > dispatching, T'Class as generic actual: these are much more important issues > > IMHO than taking a random package and trying to standardize it. To follow up on these messages, I would rank the issues in this order: AI-0217 (Mutually dependent types) AI-0230 (implicit conversions between access types) AI-0222 (Feature control) AI-0224 (pragma Unsuppress) AI-0218 (Accidental overriding) [Unassigned] {Directory operations} [Unassigned] {Other packages all compilers provide.:-)} [Unassigned] {'Object_Size} AI-0216 (Unchecked unions) AI-0231 (access-to-constant parameters) AI-0241 (Testing for Null_Occurrence) AI-0232 (Dispatching operation visibility and ambiguity) AI-0234 (Unsigned integer types) I don't think "meaning of Bit_Order" is an amendment. (AI-0133 is classed as a BI). Similarly with "T'Class as a generic actual type" (AI-0158 is also classed as a BI). I can believe that the latter is misclassed. 
I think the last two on the list are bad ideas that will never go anywhere, thus I gave them the lowest priority. There probably are other things that ought to be added to this list, but those are the ones we have on the table at the moment. We'd probably benefit from a general discussion of how we'll approach this at the upcoming meeting. **************************************************************** From: Randy Brukardt Sent: Friday, October 06, 2000 4:42 PM > Robert wrote: > > > This pretty clearly implies that Ada can NOT have non-language-defined > > children, and this is surely the intent. And Bob replied: > Yes, that was the intent, as far as I recall. Randy (who remembers trivia like this, but forgets to check URLs before sending important announcements...) recalls: Ada 9X originally had a requirement that a child of Ada could be compiled only if its name matched a language-defined package. That requirement was hard to implement (I remember that we did implement it as part of the UI contracts; we had to look up the name of the unit against a list - yuck), and still allowed problems, so it was changed to the requirement that now exists. Certainly, the history of requirement was that no non-language-defined children of Ada be allowed; the requirement was written more weakly just so that it was easier to implement. **************************************************************** From: Jean-Pierre Rosen Sent: Saturday, October 07, 2000 12:20 AM > I have lots of technical comments on Randy's proposal. However, I think > we should first decide whether (at least tentatively) whether to > standardize this stuff at all. > It always puzzles me to see that the Ada community refuses to see that anything can exist before it is rubber-stamped by ISO. After all what made the success of Java ? That Sun very rapidly pushed lots and lots of components - even at the cost of quality, see the number of deprecated features. 
If we (an informal group) can agree on a sufficiently useful package, and can get compiler vendors to provide the package, that's good enough from a user's point of view. If the package gets widely used, it can make its way in an annex for the next revision. No need to waste ARG time on this. After all, that's what the PAS way at ISO is. **************************************************************** From: Robert A Duff Sent: Tuesday, October 10, 2000 11:55 AM Robert wrote: > The meaning of Bit_Order on the other hand is quite important, and > certainly we implemented this in GNAT because there was significant > demand. Whether or not other compilers standardize this as required > is of course not so significant to us. > > Similarly, we could not begin to survive without Unchecked_Union. > that's an absolutely vital capability for some of the existing C > bindings. OK, I guess I find the above convincing. > I think that implementations should be very judicious in adding new > packages to the Ada hierarchy, even at the grandchild level. I agree. The reason for allowing grandchildren is that the implementation might want to extend an existing (language-defined) abstraction, so it ought to be a child of that thing, especially if it needs to know about the private part. Pascal wrote: > I don't see why it's beneficial to prevent an implementation from providing > a package named Ada.Some_Useful_Services. Especially when it's OK for them > to provide a root library unit named Some_Useful_Services (which causes > name-space pollution) ... Well, the top-level Some_Useful_Services doesn't really pollute the namespace, because users can just compile their own thing called Some_Useful_Services. Anyway, it's logically impossible to forbid compiler vendors from providing Some_Useful_Services. > (Note that you would hope that this same user would > be prevented from compiling a package named Text_IO, too, but she > isn't.) 
As I recall, some people believed (during the 9X process) that this was a feature -- that you can have your own version of Text_IO. I wouldn't do that myself. But it was intentional that this be allowed. Robert wrote: > This by the way is Parkinson's Second Law: The time spent discussing an > issue is inversely proportional to its importance. I remember one time when the ARG spent about 20 minutes deciding whether to take a 15-minute coffee break *now*, versus after doing one more AI. I believe John Barnes got up and left the room muttering under his breath. ;-) **************************************************************** From: David Emery Sent: Tuesday, October 10, 2000 1:00 PM Granted, I'm biased in this respect, but I'd really suggest that we could reuse the existing POSIX directories stuff. The key thing is adding the "spin" on how this works in Windows. I think it would be A Very Bad Thing for Ada to have something that does not directly align with the existing (and widely used) POSIX API. Incidentally, a problem I see with Randy's proposal is that it's very ambiguous with respect to things in the file namespace that are not "files" or "directories". POSIX recognizes lots of stuff in the file namespace, including character-devices, block-devices, semaphores, shared memory segments, etc. **************************************************************** From: bjkae@infomatics.saab.se Sent: Tuesday, October 10, 2000 1:07 PM Straw poll vote: I am in favor of standardizing directory operations. Standardizing trivial but commonly used operations is very useful. Ada83 probably suffered a lot as a number of these packages were lacking: Elementary functions, string operations, command line, etc. Let us continue the Ada95 way of making things easier also for trivial programs. /Björn Källberg **************************************************************** From: Ted Baker Sent: Wednesday, October 11, 2000 5:46 AM As you might expect, I support Dave's suggestion. 
--Ted **************************************************************** From: David Emery Sent: Wednesday, October 11, 2000 12:22 PM I think it would be A Good Thing if the directory operations could be defined in terms of renames on existing POSIX operations. This will preserve signatures and semantics. For non-POSIX implementations, the implementor can provide the proper bodies, etc. **************************************************************** From: Jean-Pierre Rosen Sent: Thursday, October 12, 2000 2:44 AM > Granted, I'm biased in this respect, but I'd really > suggest that we could reuse the existing POSIX > directories stuff. The key thing is adding the "spin" > on how this works in Windows. I think it would be > A Very Bad Thing for Ada to have something that does > not directly align with the existing (and widely > used) POSIX API. Then, doing anything would be a waste of time. If you like the POSIX interface, by all means, use it! Nothing prevents you from having the same interface on non-POSIX systems. I think the need that has been expressed here is for something that is NOT linked to any special OS. There is a challenge here: being system-independent while not falling into the least-common-denominator syndrome. This was my main concern when I wrote OS_Services: define a structure that would be extensible to allow access to OS peculiarities, AND allow a set of common portable features. If you don't see what I mean, please go to http://pro.wanadoo.fr/adalog/compo2.htm and download OS_Services. There is full HTML documentation that discusses the issues. **************************************************************** From: David Emery Sent: Thursday, October 12, 2000 12:49 PM We do a major dis-service to the community if we produce a OS-independent package that breaks lots of existing code for no clear benefit. It's clear that there are a lot of similarities between Unix/POSIX, Windows, MacOS, etc in their file systems. 
Early in the POSIX work I did a study of how far you could implement the POSIX API on MS-DOS (I called it "DOSIX":-) Most of the POSIX file system packages could be implemented with minor "semantic spin". Given that lots of programs are using the POSIX file packages, it makes sense to me to adopt a compatible interface for other systems. This is particularly true since the major porting efforts that I've seen that are not Unix->Unix are Unix->Windows. (And most certainly not Windows->Unix, unfortunately.) Thus I think the community is best served by preserving the POSIX API as much as possible. That's why I suggest that the Directory services be defined so that they could be implemented using "renames" of the existing POSIX services. **************************************************************** From: Tucker Taft Sent: Thursday, October 12, 2000 5:22 PM David Emery wrote: > ... > Thus I think the community is best served by preserving > the POSIX API as much as possible. > > That's why I suggest that the Directory services be > defined so that they could be implemented using > "renames" of the existing POSIX services. This sounds nice, but we should be sure that we aren't forcing some kind of artificial simulation of the Posix path syntax (e.g. blah/blah/blah) onto the Windows, Mac, VMS, etc. implementations. How about posting a sample renaming, and perhaps indicate where there might be a need for some additional functions, constants, etc.? Some of us haven't memorized the Posix interface, and don't have ready access to it. **************************************************************** From: Jean-Pierre Rosen Sent: Friday, October 13, 2000 3:39 AM > We do a major dis-service to the community if we > produce a OS-independent package that breaks > lots of existing code for no clear benefit. I am puzzled by this statement... 
There IS a Posix binding to Ada, and if a different package appears, it certainly does not mean that all users of POSIX *must* give up on Posix and switch to that package! > [snip] > Thus I think the community is best served by preserving > the POSIX API as much as possible. > > That's why I suggest that the Directory services be > defined so that they could be implemented using > "renames" of the existing POSIX services. What I'm saying is that if there is not enough added value between the proposed package and the Posix interface, then it's not worth the effort. I'm viewing (IMHO) this package as high-level, thick if you wish. I'd rather use a Posix emulation package on Windows (btw it exists thanks to P. Obry) than an "almost Posix" package. **************************************************************** From: Ted Baker Sent: Tuesday, October 17, 2000 4:05 PM | I think the need that has been expressed here is for something | that is NOT linked to any special OS.... The POSIX standard is not linked to any specific OS, any more than Ada is linked to a specific OS. POSIX.1 has been implement not just on a variety of Unix-like systems, but also on Windows NT, VAX VMS, PrimeOS, and other operating systems. **************************************************************** From: Robert A Duff Sent: Tuesday, October 17, 2000 5:26 PM > The POSIX standard is not linked to any specific OS, any more than > Ada is linked to a specific OS. POSIX.1 has been implement not just > on a variety of Unix-like systems, but also on Windows NT, VAX VMS, > PrimeOS, and other operating systems. I would like to hear someone who knows what they're talking about list the reasons, if any, why POSIX is unsuitable as the "directory operations" package we're talking about. 
Then we can decide "POSIX is good enough; we don't need anything else" or "we need to design something based on POSIX that more fully solves the problem" or "POSIX is irrelevant, so we should design something from scratch (or based on Randy's package or GNAT's package or whatever)". **************************************************************** From: Randy Brukardt Sent: Wednesday, October 18, 2000 12:46 PM > > The POSIX standard is not linked to any specific OS, any more than > > Ada is linked to a specific OS. POSIX.1 has been implement not just > > on a variety of Unix-like systems, but also on Windows NT, VAX VMS, > > PrimeOS, and other operating systems. > > I would like to hear someone who knows what they're talking about list > the reasons, if any, why POSIX is unsuitable as the "directory > operations" package we're talking about. Then we can decide "POSIX is > good enough; we don't need anything else" or "we need to design > something based on POSIX that more fully solves the problem" or "POSIX > is irrelevant, so we should design something from scratch (or based on > Randy's package or GNAT's package or whatever)". I'm not completely sure that I know what I'm talking about here (because I can only find a late draft of the POSIX standard, not a current version), but let me try this anyway. First of all, "Randy's package" is strongly based on the POSIX package. Most of the subprogram names and functions are the same. My proposal is essentially that of Bob's second choice. I don't think starting from scratch makes any sense at all, but that POSIX is not directly a very good choice. Let me enumerate some of the reasons: -- All of the package names start with "POSIX". I think this would be very off-putting to anyone running on a non-POSIX system. This package ought to be a child of Ada, like Text_IO. -- The package we standardize has to be meaningful on any target system. (At least, we have to define what happens when the operations aren't supported.) 
The POSIX packages, by definition, do not need to deal with this problem. I spent quite a bit of effort to define the package such that it would be meaningful on the largest variety of systems. For instance, it is important that directory searching work even if the underlying system doesn't support user-created subdirectories. -- POSIX exceptions seem to be mapped to the various error codes that a POSIX system can return. For a standard package, a mapping to IO_Exceptions is much better. By doing that, we avoid having to explain again all of the stuff about the meaning of Device_Error and so on. -- Create_Directory takes a POSIX_Permissions parameter. Ada has traditionally used Form parameters for this use, which is why my proposal uses that rather than a "permissions" parameter. I don't think there is any value to standardizing permissions, especially at this late date. -- I used the names "Get_Current_Directory" and "Set_Current_Directory" rather than the POSIX "Change_Working_Directory" and "Get_Current_Working_Directory" because these are the most meaningful to someone who usually works on Windows (i.e. me!). I have no objection to using "Change_Working_Directory" instead of "Set_Current_Directory", but "Get_Current_Working_Directory" is unnecessarily verbose. If we do change to the POSIX terminology, we should at least use "Get_Working_Directory" or just "Working_Directory"; the "current" is redundant. -- The POSIX operations Change_Working_Directory and Get_Current_Working_Directory are in the POSIX_Process_Environment package. While we might want to consider a standard environment package, that is the wrong place for these routines. That reflects the implementation of the current (or working) directory, not its usage. In particular, some systems do not store this information in the environment at all. -- The POSIX directory searching generic function is not designed for extensibility. It doesn't have to be: the target is known. 
A standard package should, however, be designed so that an implementation can provide child packages to provide whatever other information is available in the directory entry. -- Finally, POSIX does not include any wildcard searching operations, thus forcing each programmer to implement their own. That is in keeping with the POSIX tradition, but it is inappropriate for directory searching on most systems where at least some wildcard processing is part of the operation. Even without it, the POSIX searching generic can be wildly inefficient: it makes one call to the Action routine for every item found. If you are looking only for directories, the vast majority of calls will be false matches. If the underlying system provides any filters at all, returning thousands of false matches is silly. -- POSIX's case sensitivity doesn't bear on this proposal, thank goodness, but it is a primary reason why using POSIX on Windows is a bad idea. Trying to impose case sensitivity on a case insensitive file system leads to many problems, the result being that while POSIX can be used on Windows NT, files created with it are often incompatible with standard Windows applications. (The Windows documentation tells you to avoid the POSIX switches unless you're doing POSIX.) Now, we could design something radically different than POSIX, but I don't think that would be justified. But a direct use of POSIX, which clearly had different goals, especially for portability, than the Ada standard, does not meet the requirements. (If it did, we should simply be using POSIX for this purpose.) **************************************************************** From: Tucker Taft Sent: Friday, December 01, 2000 4:55 PM Randy, You should probably look at JP Rosen's OS_Services package as part of working on the "file/directory services" amendment AI. 
Reference: http://pro.wanadoo.fr/adalog/compo2.htm **************************************************************** From the minutes of the Leuven meeting (May 18-20, 2001): It was noted that the package seems to be too high in the hierarchy of predefined packages. Randy notes that this package works with all sorts of files, so it would be inappropriate for it to be a child of any specific IO package, and making it a child of IO_Exceptions also seems inappropriate. If Ada 95 had defined a package IO that was the parent of all of the IO packages, then this package could go there, but of course Ada 95 does no such thing. Moreover, we could cause conflicts by defining grandchildren of Ada, but not for children of Ada (since it is illegal to compile children of Ada, and implementors should not define their own, while there are no such restrictions for grandchildren of Ada). Change the name of Get_Current_Directory to Current_Directory. Steve M. suggests that Remove_Directory ought to say that non-empty directories are not deleted. After some discussion, it is suggested to add Create_Tree (which makes all directories needed in the name) and Remove_Tree (delete everything in the named tree, including directories) to the package. Steve's suggestion is adopted. It is also suggested to add Delete (by name [string]) to the package. Rename (Old_Name, New_Name) should also be added. This should be described as changing the name of the file from Old_Name to New_Name. There should be an implementation requirement for this to work at the same level in the same directory (others may raise Use_Error). The Name routine for a directory entry should be split into a Simple_Name and Full_Name routine. Pascal suggests that there should be some way to get the parent directory. First Erhard, and then Steve M. express confusion about how the directory searching iterator works. Randy notes that if Ada experts can't figure out the POSIX iterator mechanism, what chance do users have? 
It is suggested to add a limited (type) handle searching mechanism, rather than the current iterator scheme. The suggestion is made that the directory searching stuff be put into a separate child unit to avoid clutter. Pascal comments that he doesn't like all these operations taking file names, and that he would prefer a full-fledged Directory_Entry abstraction, much like Text_IO has a File_Type abstraction. There would be an operation to resolve a name to obtain a Directory_Entry, and various operations on Directory_Entries. The group appears unconvinced. A request for file name composition/decomposition functions was made by Tucker. Randy notes that these are very tough to do right. The group would like to see these. Steve M. would like to see a way to create links. Others note that the links on Windows or Mac aren't the same as the links on Unix, so a common definition would be hard. On Sunday, Randy brings up a recent conformity assessment problem, which is somewhat related to this AI. A.8.2(22) says "If an external environment allows alternative specifications of the name (for example, abbreviations), the string returned by the function should correspond to a full specification of the name." The ACATS has always tested this as if it were a requirement. However, some implementations have not been following this requirement (and this fact was overlooked by the testers). Should this "should" be a "shall" (a requirement) or shall it remain a "should" (an Implementation Advice)? There is definitely a serious overhead (it is necessary to call getcwd() on every Open and Create), and there are implementations which have strayed. The best solution is to give users an appropriate directories package, and eliminate this requirement altogether. Randy says that the immediate problem is three ACATS tests (from ACVC 1.11) which require this behavior. With the above solution, the best course of action is to withdraw these tests (they have no other value). 
The general agreement is that a good directories package makes this requirement unnecessary. We should plan to remove this requirement when a directories package is standardized. No one supported making this a "shall". Randy will integrate this (and associated operations, such as a Simple_Name => Full_Name routine) into the directories package. **************************************************************** From the minutes of the Bloomington meeting (October 5-7, 2001): Once again the motivation for this AI is questioned. Users want it in the standard library and with the Ada POSIX interface in jeopardy, it seems to be necessary. While the discussion of this AI was scattered among the operations and types of the package specification, the minutes have organized the discussion and decisions by how the operations and types were listed: Create_Path (formerly Create_Tree): The name Create_Path better describes the intent of this operation than Create_Tree. A recommendation to add a Boolean parameter for recursion instead of having a separate routine was defeated by a 3-2-5 straw vote. Create the directory with name New_Directory with any necessary enclosing directories. Use the same wording that is found in A.8.2 for files for the raising of exceptions. The last sentence needs to be changed to refer to "any directories" instead of a single directory. The description also should say "zero or more", as this routine does not have to create something. Current_Directory: Is the default directory defined? Not really. Default directory is useful for the use of simple names for directories and files. What it should be is what would be expected from the underlying environment. But Windows has a default directory for each device, which is very different from Unix. Then it should return a full name. Also use the same wording that is found in A.8.2 for files for the raising of exceptions. 
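
For illustration only, the Create_Path and Current_Directory behavior discussed above can be sketched against Ada.Directories as it was eventually standardized. The directory names demo_root, level_1 and level_2 are hypothetical, chosen just for this sketch, and the text printed depends on the host system.

```ada
with Ada.Directories;
with Ada.Text_IO;

procedure Create_Path_Demo is
   use Ada.Directories;

   --  Hypothetical directory names, used only for this sketch.
   Root   : constant String := "demo_root";
   Nested : constant String :=
     Compose (Compose (Root, "level_1"), "level_2");
begin
   --  Creates every missing directory along the path, not just the last one.
   Create_Path (Nested);

   --  Current_Directory returns a full name; relative names such as Nested
   --  are interpreted against it.
   Ada.Text_IO.Put_Line ("Default directory: " & Current_Directory);
   Ada.Text_IO.Put_Line ("Created:           " & Full_Name (Nested));

   --  Remove the whole tree again so the sketch leaves nothing behind.
   Delete_Tree (Root);
end Create_Path_Demo;
```

Building the name with Compose rather than a literal such as "demo_root/level_1/level_2" keeps the sketch free of any one system's path separator, in line with the portability point made about simple names.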
Set_Directory: Accept simple names (using the default directory as the enclosing directory) or full name. It was during the discussion of this operation that it was determined the terms simple names, full names and default directory need better definitions before the description of operations. (It was noted that portability is best supported through the use of simple names.) Delete_Tree (formerly Remove_Tree): Correct the cut-and-paste error in this description. Use_Error exception is now raised if it is not able to delete the entire tree (or if that tree does not exist). It is noted that when Use_Error is raised, it is possible that some files and/or directories are deleted. A recommendation to add a Boolean parameter for recursion instead of having a separate routine was defeated by a 2-5-3 straw vote. At this point the question was raised as to why use the name Remove_Tree instead of Delete_Tree? The reason given is that most operating systems use the term remove. But remove is inconsistent with the term delete for files in the current language. It was decided to use the term delete, so names Delete_File, Delete_Directory and Delete_Tree will replace Delete (file), Remove_Directory and Remove_Tree. Containing_Directory (formerly Parent_Directory): It was decided to change the name to the new name as the term containing directory is more descriptive of what is being requested. This function will return a full name of the directory of the containing directory. Rename: It was recommended that the Implementation Advice say the Rename function should work when both names are simple names. The discussion brought up the lack of copy operation for files or directories. It was decided to add a Copy_File operation, but no copy directory. Pascal cautions about the difficulties with special files on Unix (and similar beasts on other operating systems). Tuck argues that this shouldn't be a problem for files created by Ada. 
Copy_File should be defined to copy files that can be created by Ada packages, and to have an implementation-defined behavior for other files. The discussion turned its attention to the search operations and supporting types, leaving the discussion of full and simple names for later. Is_Valid: The need for this operation is questioned. Randy says that it is a useful test for determining if a search produced something meaningful and Bob supported his point. Tuck pointed out that we have the constant No_Directory_Entry for testing of the result of a search, and therefore that operation Is_Valid appears to be redundant. If Is_Valid is retained it should mean the same thing as comparing with No_Directory_Entry. The straw vote on this meaning for Is_Valid was approved, 6-0-4, but the group decided that the redundancy was unpleasant and approved removing Is_Valid by a 7-1-2 straw vote. Note: later in the discussion it was restored! Start_Search and End_Search: The Start_Search operation is an iterator operation on the search object. Consequently Search_Type should be a controlled type to properly support the finalization of search objects. There was a discussion on whether a call on End_Search should be optional. If this is the case, then it means that the finalization of the search object (by leaving the scope of the object, or by starting a new search with the same object) has the effect of calling End_Search. The End_Search subprogram should be retained as a way to release operating system facilities early (i.e., before leaving the enclosing block). The meeting voted to make calls on End_Search optional by a 5-2-3 straw vote. Then there was a recommendation that Start_Search act as a restart, thereby eliminating the need to explicitly call the End_Search operation to restart a search. The group approved making Start_Search act as a restart by a 6-2-2 straw vote. 
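
As a rough sketch of the handle-based search described above, the following lists the subdirectories of the default directory using Ada.Directories as it was eventually standardized. Note the assumption about names: these minutes call the fetch operation Get_Next_Match, whereas the standardized package provides Get_Next_Entry together with a More_Entries test, so the loop below follows the standardized spelling rather than the one under discussion at this meeting. The procedure name List_Subdirectories is hypothetical.

```ada
with Ada.Directories; use Ada.Directories;
with Ada.Text_IO;

procedure List_Subdirectories is
   Search : Search_Type;
   Item   : Directory_Entry_Type;

   --  Ask the implementation to filter, so that ordinary and special files
   --  are never returned as false matches.
   Only_Directories : constant Filter_Type :=
     (Directory => True, others => False);
begin
   --  A null Pattern matches every entry that passes the filter.
   Start_Search (Search,
                 Directory => Current_Directory,
                 Pattern   => "",
                 Filter    => Only_Directories);

   while More_Entries (Search) loop
      Get_Next_Entry (Search, Item);
      Ada.Text_IO.Put_Line (Simple_Name (Item));
   end loop;

   --  Explicit call releases any operating system resources early, as
   --  discussed above.
   End_Search (Search);
end List_Subdirectories;
```

Passing the filter to Start_Search, instead of testing each entry in the loop, reflects the earlier objection to iterators that report every item and leave the caller to discard most of them.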
Get_Next_Match: The only significant issue on the Get_Next_Match operation was how it handled an unstarted search object: either return No_Directory_Entry in the Directory_Entry parameter or raise a Status_Error exception. The exception approach is similar to how the existing file operations handle unopened files and was approved by 6-3-1. At the end of the iterator, Get_Next_Match returns No_Directory_Entry. Is_Directory and Is_Ordinary_File: The motivation for distinguishing ordinary files is due to the existence of special files, such as symbolic links to existing files. It was suggested that a more direct way of distinguishing among directories, ordinary files and special files was by defining an enumeration type with these values, such as:
   type Directory_Entry_Kind is (Directory, Ordinary_File, Special_File);
and then change the filter type to be:
   type Filter_Type is array (Directory_Entry_Kind) of Boolean;
Consequently the two functions Is_Directory and Is_Ordinary_File are replaced with a new function Directory_Kind that operates on directory entry objects and returns the new enumeration type. Also the filter parameter in the Start_Search operation is changed. Special files appear to be files that Ada programs can't create or read. Bob believes there are lots of programs that would like to read soft link files in order to find the target. If that is the case, maybe the best way to handle special files is to provide implementation advice, but what is that advice? Does POSIX provide any directions? Not really, because it is allowed to deal with all the file types that this package doesn't want to handle. It was decided the specifics of special files, such as soft links, should be left to the AARM where operating system details can be discussed. Composition of simple and full file names: Randy explained that he borrowed wording for full names from the RM for the Name operation. 
He further noted that he did consider Jean-Pierre Rosen's documentation on composition, where he divided the composition of file names into these elements:
- Device
- Path
- File
- Extension
- Version
This decomposition of file names is meant to be the union of all operating systems. There is a risk that another operating system does not fit this composition. Consequently, he decided against adding any operations to decompose full names. It was noted this list was missing a network element (the host name). Should all of this be avoided by putting the details into a child package? Tuck and Erhard argue that composition of full names for files and directories should be available in this package. Pascal would like to have some access to the composition elements without doing string manipulations to get them from the name strings. The focus is narrowed to Simple_Name function and the composition functions. Tuck argues for just one composition function for either directories or files and let an implementation do the right thing. Portability between systems and implementation is the goal of this interface and that makes the system-specific elements, beyond file and directory names and extensions, difficult to expose. After much discussion, it was decided to take a series of straw votes on how to proceed:
- On whether this package should provide all three kinds of operations, namely compose full file and directory names, extract relative names (the containing directory name or the simple name) and file extension operations (extract simple name or extension and compose file name with simple name and extension), it was defeated by 3-3-4. John commented that he didn't like the notion of making "extension" a first-class citizen.
- The group took a step back to see if there would be agreement on just the compose and relative name operations and it did by 8-0-2 straw vote.
- After taking another straw vote to support more operations (6-3-1), the group supported the addition of just the extension operations (i.e., Base, Extension and Compose_With_Extension) by 5-2-3.
Erhard would like to have a (modification) time stamp and size operations. Randy says it is difficult to know what size should be returned, especially in the presence of 64 bit values and to know how to handle time stamps for operating systems that do not produce time stamps. For those operating systems with no time stamping, a time stamp operation would raise Use_Error. Consequently the meeting approved time stamp (meaning the time when the file or directory was last modified) by consensus. As for the size of files, it was suggested to let the existing Count type in Stream_IO package (A.12.1) handle this problem. Unfortunately its bounds may not be appropriate to deal with very large files. It was decided not to change the type of Count in A.12.1 to be an implementation-defined integer type. Instead a new signed type should be declared in Directory_Operations; the upper bound is implementation-defined and the Size function returns values of this type or raises Constraint_Error if the size is an illegal value for the implementation. The reason why the group eventually decided on a signed type is because operations on Size values are common, and we don't want modular semantics for them. It is implementation defined what the Size function produces for directories or special files. All query functions, such as Kind, Modification_Time or Size, should be applicable to both names and Directory_Entry_Type values. The term "same containing directory" should replace "same directory location" in the Implementation Advice on Rename paragraph and the AARM ramification paragraph should be dropped. Finally the discussion ended with debating whether Directory_Entry_Type should be a limited type, since there appears to be no need for its assignment. 
This would simplify the implementation and it would be treated similarly to how files are treated. If assignment is needed, then the program should store a pointer to it. Consequently this means the constant No_Directory_Entry is no longer needed and the Is_Valid function is restored! Vote on Intent for all these changes: 7-1-1. **************************************************************** From: Randy Brukardt Sent: Monday, January 7, 2002 7:55 PM Package Ada.Directories (AI-248) contains a basic set of file composition/decomposition functions. The only area that I had trouble with in writing up the current design was with the composition function(s). There were a variety of alternatives, which I narrowed down to two: A single Compose function that returns a file or directory name; Separate Compose_Directory and Compose_File_Name functions that return the item specified. I chose the first design, because it is consistent with the behavior of the decomposition functions. For instance, Simple_Name can extract a simple name from either a file name or directory name. However, there is a possible problem with this choice: if there exists an operating system for which file names and directory names are different syntactically, then the single Compose will not work. That's because this is intended to be a purely syntactical function, and it does not require the files or directories to work -- thus it couldn't tell which to create. Is this a real concern? (I can't think of any systems with that problem, but I'm not familiar with them all!) If so, I'll change the proposal to use the second alternative. **************************************************************** From: Tucker Taft Sent: Tuesday, January 8, 2002 12:40 AM On VMS, directories and files have significantly different syntax. However, if the "simple_name" that is returned/expected by decompose/compose retains the special directory syntax when it refers to a directory, this would not be a problem. 
That is, a "simple_name" in "[]" on VMS would be a directory, whereas a "simple_name" not enclosed in "[]" would be a file. So I guess the answer is that the simple_name's syntax must identify it as a file or a directory, if there is a syntactic difference. If that rule is unpleasant, and there is a desire that "simple_name"s be simple unadorned strings, then you would seem to need two decomposition and two composition functions. **************************************************************** From: Nick Roberts Sent: Saturday, January 19, 2002 1:48 PM First, my thanks to Randy for his sharpness in spotting that the original version of this comment (which has been deleted) was based on an outdated version of the AI. Well done Randy. I have since corrected this, and self-administered 20 lashes of the birch in penitence. My comments in reply to Randy's question (quoted below, hope this doesn't offend) and pertaining to AI-00248-04 are as follows. (1) It would seem particularly apt to accompany the introduction of this package with another, perhaps Ada.Environment, which would provide for the interrogation of the 'environment strings' that many operating systems use to communicate information to programs. E.g.:
   package Ada.Environment is
      function Get (Name : in [Wide_]String) return [Wide_]String;
      function Environment_Name return [Wide_]String;
      function Environment_Version return [Wide_]String;
   end Ada.Environment;
For the Get function, if an environment variable with the requested Name exists, its value (also a string) is returned, otherwise a null string is returned. The Environment_Name function is intended to return a broad identification of the program's execution environment, e.g. the basic name of an operating system or RTS. The Environment_Version function is intended to give a more precise identification of the same. 
Possible examples of the results of Environment_Name and Environment_Version might be: "Linux" and "2.4 (Red Hat 6.2)"; "Windows" and "NT 4.0"; "BSD" and "4.4"; "OS/2" and "4.0". Admittedly there are a lot of extra possibilities for this package, which I'll defer to another comment. (2) I think Wide_String should be used throughout Ada.Command_Line, Ada.Environment, and Ada.Directories (instead of String). Doing so would head off a lot of internationalisation [should we use 'i18n' for this word in future?] problems for the future. I believe making this change for (the existing package) Ada.Command_Line would cause fewer problems than it would avoid, in the long term. (3) It was presumably a little mistake to use the name Set_Current_Directory (instead of Set_Directory) in the description of the Current_Directory function. In any case, I would actually prefer the name Set_Current_Directory for the Set_Directory procedure. (4) With regard to Create_Path, I would prefer the standard mandated the behaviour that if any Create (or Create_Directory) - in Ada.Directories or any of the *_IO packages - names a directory within its path which does not exist, that directory should be automatically created. There would then be no need for Create_Path. This would presumably cause extra work for the implementor of the standard Ada *_IO packages, but I think it would be worthwhile, especially since it is something that many application programs would have to do anyway. It seems to me that, in a situation where a directory somewhere along the path of a file (or directory) name does not exist, simply creating the necessary directory would allow the program to continue correctly with its intended operation, and be very unlikely to do any significant harm. On the other hand, raising an exception will stymie the program, at least to some extent. 
Catching this exception, finding out which directory does not exist, creating the directory, and then trying again is a lot of bother to foist onto the application programmer (and done much more easily and efficiently by an implementation which can use implementation-specific information and functionality). (5) Perhaps it would be worth adding the rule that it is an error to attempt to delete the current default directory; any attempt to do so should fail and cause Use_Error to be propagated. (6) I think the Form parameter should be added to the Rename and Copy_File procedures: procedure Rename (Old_Name, New_Name: in String; Form: in String := ""); procedure Copy_File (Source_Name, Target_Name: in String; Form: in String := ""); to allow implementation-specific information to be added (e.g. permissions). The advantage of this approach is that values for the Form parameter could be read from, say, a configuration file, thus making it possible to make a portable Ada program have useful implementation-specific behaviour. Because operations such as file renaming and copying are high-level operations anyway (likely to require the execution of many tens or hundreds of thousands of machine instructions), the overhead of interpreting Form strings is likely to be perfectly acceptable. (7) If passed a null string, the Containing_Directory function should return the parent of the current default directory (or the root directory if the current default directory is the root, or if the OS does not support hierarchical directories). A null string would then be a useful default value: function Containing_Directory (Name: in String := "") return String; (8) Maybe the functions Randy was referring to should be as follows. Remove the Compose function. 
Add: function [Compose_]Directory_Name ( Simple_Name: in String; Containing_Directory: in String := "" ) return String; which returns the full name of a directory which represents the path which starts at Containing_Directory (assumed to already be a valid path), and adds the directory named Simple_Name. Also add: function [Compose_]File_Name ( Simple_Name: in String; Containing_Directory: in String := "" ) return String; which returns the full name of a file which represents the path comprising Containing_Directory and Simple_Name. Finally, add: function [Compose_]Simple_Name ( Base_Name: in String; Extension: in String := "" ) return String; which returns the simple name comprising the given Base_Name and Extension. E.g. [Compose_]Simple_Name("myfile","txt") = "myfile.txt". I think the 'Compose_' prefixes are a bit long. Perhaps replacing them with 'To_' would be better. (9) Please change the name of the type Directory_Entry_Type to just Directory_Entry, and parameters named Directory_Entry to just Item. Admittedly this is merely cosmetic, but I do feel these names are wordier than they need to be. (10) File_Kind might be defined as follows: type File_Kind is ( Directory, -- a subdirectory Ordinary_File, -- any file (text or binary) Special_File, -- any non-file nevertheless openable Reserved ); -- anything (non-dir) not openable By 'openable' I mean able to be opened (in general) by the Open procedure of (at least) one of the *_IO packages (whichever is/are appropriate to the file or device). The idea behind the new category (Special_File) is to make provision for those entities which are openable, but which may not be renamable, copyable, or deletable (e.g. because they are not normal files). I think the name Special_File is more appropriate here, and a name such as Reserved is better for those entities which are 'something else'. 
(11) We could have:

   Everything: constant Filter_Type := (others => True);
   Openables: constant Filter_Type :=
      (Ordinary_File | Special_File => True, others => False);

(12) I would like to see the addition of some useful basic pattern helpers,
e.g.:

   Match_All_Names: constant String;
   function Match_Filetype (Filetype: in String) return String;

which would help make code more portable. Match_All_Names might have the
value "*" or "*.*", for example.

(13) The Start_Search procedure could be declared:

   procedure Start_Search (Search: in out Search_Type;
                           Filter: in Filter_Type := Everything;
                           Pattern: in String := Match_All_Names;
                           Directory: in String := "");

The default value of "" for the Directory parameter would specify the current
default directory.

(14) I think the standard should explicitly permit a directory to be locked
by a call to Start_Search; if so it must be unlocked by a call to End_Search.
Obviously most implementations won't do this, but some may wish to.

(15) Add:

   function Temporary_Directory return String;

which returns the full name of a temporary directory, which should be
readable, writable, and, preferably, exclusive to the calling program
(process) and initially empty. Returns "" if not available; propagates
Use_Error if not applicable.

(16) Add:

   function Root_Directory (Directory: in String := "") return String;

which returns the full name of a/the root directory (of the physical storage
device on which the directory or directory entry Directory resides). The
default "" for the Directory parameter indicates the current default
directory. Propagates Use_Error if N/A. Examples of the result: "C:\", "/".

(17) Introduce to Ada.Sequential_IO, Ada.Direct_IO, Ada.Text_IO,
Ada.Wide_Text_IO, and Ada.Streams.Stream_IO:

   procedure Rename (File: in out File_Type;
                     New_Name: in String;
                     Mode: in File_Mode := In_File);

which has the effect of resetting the File object (to the given Mode), and at
the same time renaming it.
This procedure would be very useful in facilitating a typical read-an-original-file, rewrite-it-to-a-temporary-file, rename-the-original-file, rename-the- temporary-file cycle. In an environment where there are multiple processes executing in parallel, and files are locked while open, this procedure could provide exactly the right locking semantics for the described cycle (providing the application program also behaves appropriately). It should also be mandated by the standard that this procedure is 'destructive': if an ordinary file corresponding to New_Name exists, it is automatically deleted and replaced. **************************************************************** From: Steve Deller Sent: Monday, January 21, 2002 1:48 PM > For the Get function, if an environment variable with the > requested Name exists, its value (also a string) is returned, > otherwise a null string is returned. POSIX environments distinguish "null" from "does not exist". This function needs a similar distinction, or perhaps have two functions, "get" and function Exists (Name : in [Wide_]String) return Boolean > The Environment_Name function is intended to return a broad > identification of the program's execution environment, e.g. > the basic name of an operating system or RTS. The > Environment_Version function is intended to give a more > precise identification of the same. Possible examples of the > results of Environment_Name and Environment_Version might be: > "Linux" and "2.4 (Red Hat 6.2)"; "Windows" and "NT 4.0"; > "BSD" and "4.4"; "OS/2" and "4.0". How does this differ from the "intent" of System.Name and System.System_Name? The rest of the comment seems bent on redefining "POSIX Ada" and adding it to the LRM. Is that true? Where is the proposed specification of "Ada.Directories"? 
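****************************************************************

Illustration (not part of the original correspondence): the distinction drawn
above between a variable whose value is null and a variable that does not
exist can be sketched with the package Ada.Environment_Variables, which was
standardized later in Ada 2005 and is not part of this AI. The procedure names
Env_Demo and Show, and the variable names queried, are made up for this
sketch.

```ada
with Ada.Environment_Variables;
with Ada.Text_IO;

procedure Env_Demo is
   package Env renames Ada.Environment_Variables;

   --  Report three cases separately: absent, present but null,
   --  present with a value. Value is only called after Exists
   --  has returned True, so it cannot raise Constraint_Error here.
   procedure Show (Name : String) is
   begin
      if not Env.Exists (Name) then
         Ada.Text_IO.Put_Line (Name & ": does not exist");
      elsif Env.Value (Name) = "" then
         Ada.Text_IO.Put_Line (Name & ": exists, value is null");
      else
         Ada.Text_IO.Put_Line (Name & " = " & Env.Value (Name));
      end if;
   end Show;
begin
   Show ("HOME");                    --  example name, may or may not be set
   Show ("AI248_NO_SUCH_VARIABLE");  --  assumed to be unset
end Env_Demo;
```

A single Get that returns "" for both of the first two cases could not tell
them apart, which is the objection raised in the message.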
****************************************************************

From: Jean-Pierre Rosen
Sent: Tuesday, June 11, 2002 11:22 AM

!class amendment
!subject Directory Operations alternative

!summary

AI248 provides some functionalities dealing with extracting various parts
from a file name, as well as accessing some properties of files (file size,
modification time). This AI proposes an alternative approach that would
provide a more flexible and extendable way of dealing with such information.

!problem

The approach taken by AI248 is based on the assessment that the properties
associated with a file are highly operating-system dependent, and therefore
that only those properties that can be expected to be provided by (almost)
every operating system should be defined in the standard. However, this
approach suffers from the "least common denominator" syndrome; any program
that really needs to access or manipulate file properties is likely to miss
some important features if restricted to this least common denominator.

The proposed alternative is based on the notion of abstract properties,
defined as enumerated types. Some values of the enumerated type are required,
but extra values would be implementation-defined, therefore allowing any
property that makes sense for the operating system to be represented. An
implementation advice defines the form of common values, so that usual
properties would be portable across operating systems.

This proposal has the following properties:
- All properties of files on a given operating system can be accessed.
- Programs would be fully portable across different compilers for a given
  operating system.
- A program that uses only properties that are common to several operating
  systems would be portable across these operating systems.
- A program that uses properties that are not available on the target
  operating system would not compile.

!proposal

(See wording.)
!wording

Static Semantics

Replace the sections named "File and directory name operations" and "File and
directory queries" from AI248 with the following declarations:

   function Join (Path : String; Name : String) return String;

Assembles a path (list of directories) with a file name to make a full name.
The lower bound of the result string is 1.

   type String_Property is private;
   function "+" (Left, Right : String_Property) return String_Property;

   Expanded    : constant String_Property;
   As_File     : constant String_Property;
   Device      : constant String_Property;
   Path        : constant String_Property;
   Name        : constant String_Property;
   Extension   : constant String_Property;
   Version     : constant String_Property;
   Native_Form : constant String_Property;

   function Property (Query       : String_Property;
                      Name        : String;
                      Default     : String := "";
                      Relative_To : String := "") return String;

A value of type String_Property describes the parts of a file name that are
to be returned by the Property function.

The constants Device, Path, Name, Extension and Version define various parts
of a file name. Exactly which part of a file name corresponds to each of
these constants is implementation defined. If a part has no meaning on the
target operating system, the implementation shall return the empty string
for it.

The "+" operator returns a value expressing that both Left and Right are
part of the property (logical union).

Expanded is a String_Property expressing that the returned string should not
depend on any default value (i.e. be an absolute path).

As_File is a String_Property that expresses that the returned string should
have the syntax of a file name, even if the given string is in the form of a
directory name.

Native_Form is a String_Property that expresses the fullest description of a
file name.

The Property function extracts various parts from a given file name; the
lower bound of the result string is 1. A value of type String_Property
defines which parts are to be included in the result.
Name need not be the name of an existing file. If a (requested) part is not
present in Name, then the same part (if any) extracted from Default is
returned. If the part is not present in Default either, the null string shall
be returned for it, unless the Expanded property is also given, in which case
an implementation defined value is returned. This value shall be such that it
matches an absolute name for the file.

The Relative_To parameter indicates how to interpret relative file names; any
relative name shall be assumed to be starting from this directory. If
Relative_To is itself relative, it shall be interpreted according to the
current directory. The default value ("") means that relative values will be
interpreted according to the current directory.

AARM Discussion: This feature is especially useful for multi-tasking programs
where tasks operate under different directories, since the notion of current
directory is generally global to the program, not to individual tasks.

   type Boolean_Property is (Exists, Is_Directory, Is_Regular_File);
   function Property (Query : Boolean_Property;
                      Name  : String) return Boolean;

The Property function returns the value corresponding to the given parameter.

If the parameter is Exists, returns True if a file (of any kind, including
directories and special files) exists with the given Name.

If the parameter is Is_Directory, returns True if a file with the given Name
exists and is a directory.

If the parameter is Is_Regular_File, returns True if a file exists with the
given Name, and is a file that can be opened using one of the Ada IO
packages.

   type Time_Property is (Modification_Time);
   function Property (Query : Time_Property;
                      Name  : String) return Ada.Calendar.Time;

The Property function returns the value corresponding to the given parameter.

If the parameter is Modification_Time, returns the time the file with the
given Name was last changed. What is considered a change for a file is
implementation defined.
The exception Name_Error is propagated if Property (Exists, Name) is false.

   type Integer_Value is implementation_defined;
   type Integer_Property is (Size);
   function Property (Query : Integer_Property;
                      Name  : String) return Integer_Value;

The Property function returns the value corresponding to the given parameter.

If the parameter is Size, returns the size in bytes of the file with the
given Name.

The exception Name_Error is propagated if Property (Exists, Name) is false.

   type Enum_Property is (File_Kind, Path_Kind);
   type Enum_Values is (File, Directory, CharacterSpecial, BlockSpecial,
                        Fifo, Link, Socket,
                        Absolute, Relative, VolumeRelative);
   subtype File_Kind_Enums is Enum_Values range File..Socket;
   subtype Path_Kind_Enums is Enum_Values range Absolute..VolumeRelative;

   function Property (Query : Enum_Property;
                      Name  : String) return Enum_Values;

The Property function returns the value corresponding to the given parameter.

If the parameter is File_Kind, returns one of the values of the subtype
File_Kind_Enums, describing the kind of the file. If Property
(Is_Regular_File, Name) is True, the value shall be File. If Property
(Is_Directory, Name) is True, the value shall be Directory. The conditions
where other values are returned are implementation defined; in particular, if
some values of the subtype File_Kind_Enums do not make sense for the target
operating system, they are never returned.

If the parameter is Path_Kind, returns one of the values of the subtype
Path_Kind_Enums, describing whether the interpretation of the given Name
depends on the current execution context, like the current directory, or not.
If the given Name does not depend on the context, Absolute shall be returned.
If the interpretation of Name depends only on the current directory, Relative
shall be returned. If the interpretation depends on some element of the
execution context other than the current directory, like the current drive,
VolumeRelative shall be returned.
If some values of the subtype Path_Kind_Enums do not make sense for the
target operating system, they are never returned.

The exception Name_Error is propagated if Property (Exists, Name) is false.

Implementation Requirements

Strings returned by queries on parts of a file name shall include any
separator used by the operating system to separate these parts. It is
therefore guaranteed that the various parts can be concatenated with "&" to
form a legal name.

Example:
   Device includes the final ":" (MS-DOS)
   Path includes a final directory separator (MS-DOS, Unix) or the brackets
      (VMS)
   Extension includes the leading "." (MS-DOS, Unix, VMS)
   Version includes the leading ";" (VMS)

Implementation Permission

An implementation may add values to the types String_Property,
Boolean_Property, Time_Property, Integer_Property, Enum_Property and
Enum_Values to express implementation dependent properties of files.

Implementation Advice

The following values of type Boolean_Property should be provided if they make
sense for the target operating system:
   Is_Owned, Is_Readable, Is_Writable, Is_Executable

The following values of type Time_Property should be provided if they make
sense for the target operating system:
   Access_Time, Creation_Time.

!discussion

The proposed change is derived from the functionalities provided by the
package OS_Services, available from http://www.adalog.fr/compo2.htm. The
proposed functionalities have been implemented on top of three different
services, both under Windows and Linux:
- Gnat.Directory_Operations
- TCL/TK (with Tash)
- Posix (with Florist or Win32/Posix).

String properties do not include a "Host" or "Protocol" value. This function
is intended to parse file names, not URLs. On some operating systems, the
syntax of a file name may be different from the syntax of a URL. If on the
other hand a file name includes the name of a host (like host:/path), it is
returned as the value for the Device query.
The default value for Relative_To in the string Property function is ""
rather than a call to Current_Directory because it allows more flexibility
for the default behavior, and also for efficiency reasons: the default
evaluation of a file name may not require evaluating Current_Directory, as
would happen if it were the default value. Changing the default value from
Current_Directory to "" resulted in a major speed-up in one of our
applications.

****************************************************************

From: Martin Dowie
Sent: Friday, May 23, 2003 3:39 AM

The example quotes something called 'Ada.Directories.Only_Files' but this is
not defined in the package. Is it supposed to be one of a number of
pre-defined filter constants? e.g.

   Only_Ordinary_Files : constant Filter_Type :=
      (Ordinary_File => True, others => False);
   Only_Directories : constant Filter_Type :=
      (Directory => True, others => False);
   Only_Special_Files : constant Filter_Type :=
      (Special_File => True, others => False);
   Only_Files : constant Filter_Type :=
      (Ordinary_File | Special_File => True, others => False);

****************************************************************

From: Randy Brukardt
Sent: Friday, May 23, 2003 1:24 PM

No, I think it was defined in an earlier version of the AI (when the filter
was directly an enumeration). Probably the example needs to be corrected.

****************************************************************

From: Martin Dowie
Sent: Wednesday, June 25, 2003 3:23 PM

Why is it 'Create_Path' but 'Delete_Tree'? Though the comment describing
'Create_Path' actually calls it 'Create_Tree'. Should it be consistent or as
it is?

****************************************************************

From: Tucker Taft
Sent: Wednesday, June 25, 2003 4:02 PM

Well Create_Path makes more sense than Create_Tree, since it is only creating
a single path down to the specified file; it is not creating a tree of nodes.
Delete_Tree on the other hand is deleting a whole tree of nodes.
> Though the comment describing 'Create_Path' actually calls it 'Create_Tree'. That sounds like a bug! **************************************************************** From: Martin Dowie Sent: Thursday, June 26, 2003 2:32 AM Good point, well made! **************************************************************** From: Pascal Obry Sent: Thursday, June 26, 2003 2:05 PM Ok, but a Create_Tree makes sense too. Create_Tree will create the complete directory tree. This is equivalent to mkdir -p option. **************************************************************** From: David C. Hoos Sent: Thursday, June 26, 2003 2:34 PM Except that the -p option at most creates a _chain_ of directories -- i.e., there are no _branches_ in what's created. **************************************************************** From: Pascal Obry Sent: Thursday, June 26, 2003 3:02 PM Well that's just a specific kind of tree to me :) Anyway, Create_Path could do that, right ? A path can be a set of directories. **************************************************************** From: Randy Brukardt Sent: Thursday, June 26, 2003 5:36 PM Create_Path *does* do that: it creates a path (a list [or chain if you prefer] of directories). As the AI says: Creates zero or more directories with name New_Directory. Each non-existent directory named by New_Directory is created. ... If you want to create only one directory, use Create_Directory instead. Delete_Tree deletes the complete tree of directories (including all branches). If you want to delete only a single directory, use Delete_Directory instead. **************************************************************** From: Pascal Obry Sent: Friday, June 27, 2003 12:41 AM Ok I missed that. Thanks. **************************************************************** From: Martin Dowie Sent: Friday, June 27, 2003 3:13 AM It is the intention that a search should recursively descend a given directory? I am assuming not. 
****************************************************************

From: Randy Brukardt
Sent: Friday, June 27, 2003 8:19 PM

No, it searches within a single directory. Since directory names are
returned, it is easy enough to program a recursive search if desired.

****************************************************************

From: Bernard Maudry
Sent: Saturday, June 28, 2003 3:17 AM

A naive recursion is easy. But directory trees may contain circularities, in
which the naive recursion loops forever :-((

****************************************************************

From: Christoph Grein
Sent: Sunday, June 29, 2003 11:12 PM

Really? A tree (I thought) is an acyclic directed graph. You can have
circularities with links, but those are special files, not directory files,
so I do not see this problem.

****************************************************************

From: Simon J. Wright
Sent: Monday, June 30, 2003 3:14 AM

For symbolic links, true, but hard links are just normal entries in
directories (on Unix filesystems and, I'm pretty sure, VMS)

****************************************************************

From: Jean-Pierre Rosen
Sent: Monday, June 30, 2003 4:15 PM

If you know a good algorithm (i.e. better than brutally limiting the depth of
recursion) to detect such circularities, I am interested.

BTW, it's no longer a tree, it's a graph...

****************************************************************

From: Tucker Taft
Sent: Monday, June 30, 2003 4:48 PM

One has to presume there is some way to uniquely identify a location within
the graph, and given that, one simply keeps a stack of these unique
identifiers for the nodes currently being "visited," and recognizes recursion
when a node encountered is already in the stack.

For directories, the only unique identifier provided by the package is the
Full_Name, though typically there is an underlying "inode number" or
equivalent for O/S-dependent code (e.g. in the implementation of
Ada.Directories).
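****************************************************************

Illustration (not part of the original correspondence): the stack of
identifiers described in the preceding message might look roughly like the
sketch below. It assumes that Full_Name is a good enough unique identifier,
which may not hold on a system where one directory can be reached under
several names. The names Walk_Demo, Walk and Visiting are invented for the
sketch, the stack uses Ada.Containers.Indefinite_Vectors (a later addition to
the language), and there is no handling of directories that cannot be read.

```ada
with Ada.Containers.Indefinite_Vectors;
with Ada.Directories;
with Ada.Text_IO;

procedure Walk_Demo is
   use Ada.Directories;

   package Name_Stacks is new Ada.Containers.Indefinite_Vectors
     (Index_Type => Positive, Element_Type => String);

   --  Full names of the directories on the current recursion path only,
   --  not of every directory visited so far.
   Visiting : Name_Stacks.Vector;

   procedure Walk (Dir : String) is
      Here   : constant String := Full_Name (Dir);
      Search : Search_Type;
      Item   : Directory_Entry_Type;
   begin
      if Visiting.Contains (Here) then
         Ada.Text_IO.Put_Line ("cycle detected at " & Here);
         return;
      end if;
      Visiting.Append (Here);

      --  A null Pattern matches every entry; the filter keeps directories.
      Start_Search (Search,
                    Directory => Here,
                    Pattern   => "",
                    Filter    => (Directory => True, others => False));
      while More_Entries (Search) loop
         Get_Next_Entry (Search, Item);
         --  Some implementations return "." and ".."; skip them.
         if Simple_Name (Item) /= "."
           and then Simple_Name (Item) /= ".."
         then
            Ada.Text_IO.Put_Line (Full_Name (Item));
            Walk (Full_Name (Item));
         end if;
      end loop;
      End_Search (Search);

      Visiting.Delete_Last;  --  leaving this directory
   end Walk;
begin
   Walk (Current_Directory);
end Walk_Demo;
```

Because an entry is removed again when the recursion leaves a directory, the
storage used grows with the depth of the directory structure rather than with
its total size.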
**************************************************************** From: John Halleck Sent: Monday, June 30, 2003 4:54 PM ASSUMING that you have some unique identifier of directories (such as unix inodes) so that you can answer the question of whether two are "the same"... If you are really looking for loops in a tree, there is an easy method. Clearly (!), if there is a loop then there will be a point where the currently visited node has already been visited somewhere above us in the CURRENT call tree. (Not to be confused with "has already been visited since we started".) Create a list. Each time you go down a level of recursion, add a pointer to the current directory node to the end of the list. As you exit remove that item from the end of the list. Since this is really a copy of the call tree, the operations only need to be done on the end... and it only needs to be a singly linked list, since we can remember where we added it. If the node you are about to process is in the current list, you have a cycle. If you have a cycle, at some point you will be in this state. The worst case storage requirement is proportional to the depth of the directory tree. If you are looking to tell trees from graphs (instead of loops vs no loops) the problem is, I think, harder. At least the only way I can see in that case is to remember all nodes that you've visited. Of course, if you have a chain instead of a tree, looking for loops can be done without the list overhead, with just two variables. Start the two variables out at the head. Each step move one down the chain with one variable, and two down with the other. Either the faster one will hit the end, meaning no loop, or the slower and faster one will eventually be equal which would mean that there was a loop. Running time is proportional to the distance before the loop, and twice the loop length. 
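****************************************************************

Illustration (not part of the original correspondence): the two-variable
check for a chain described at the end of the preceding message can be
sketched as follows. The node type and the names Chain_Loop_Demo and Has_Loop
are invented for the sketch; a real directory walk would compare whatever
unique identifier the system provides instead of access values.

```ada
with Ada.Text_IO;

procedure Chain_Loop_Demo is
   type Node;
   type Node_Access is access Node;
   type Node is record
      Next : Node_Access;  --  null marks the end of the chain
   end record;

   --  Slow advances one link per step and Fast two. Either Fast reaches
   --  the end (no loop) or the two eventually meet (loop).
   function Has_Loop (Head : Node_Access) return Boolean is
      Slow : Node_Access := Head;
      Fast : Node_Access := Head;
   begin
      while Fast /= null and then Fast.Next /= null loop
         Slow := Slow.Next;
         Fast := Fast.Next.Next;
         if Slow = Fast then
            return True;
         end if;
      end loop;
      return False;
   end Has_Loop;

   A : constant Node_Access := new Node;
   B : constant Node_Access := new Node;
   C : constant Node_Access := new Node;
begin
   A.Next := B;
   B.Next := C;
   --  A -> B -> C -> null
   Ada.Text_IO.Put_Line ("straight chain: " & Boolean'Image (Has_Loop (A)));

   C.Next := A;
   --  A -> B -> C -> A
   Ada.Text_IO.Put_Line ("looped chain:   " & Boolean'Image (Has_Loop (A)));
end Chain_Loop_Demo;
```

The first call reports FALSE and the second TRUE. No list of visited nodes is
kept, which is the saving the message points out for the chain case.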
****************************************************************

From: John Halleck
Sent: Monday, June 30, 2003 5:10 PM

It has been pointed out by one of the folk BCC'd on this that a much cleaner
design can result if you add on to the head of a list, and pass a pointer to
the new element. But I think that's just an implementation detail...

****************************************************************

From: Alexandre E. Kopilovitch
Sent: Monday, June 30, 2003 6:15 PM

What is "good"? Certainly, there are algorithms that deal with circularities,
and it isn't too difficult to create your own flavor, just by choosing a
checker for "used" nodes. For example, you may use a splay tree.

>BTW, it's no more a tree, it's a graph...

But not an arbitrary graph - you still have the root (from which all other
nodes are accessible) at hand.

****************************************************************

From: John Halleck
Sent: Monday, June 30, 2003 6:26 PM

Checking for "used" nodes tells you it is a graph and not a tree. But graphs
aren't intrinsically bad... only those with loops cause circularities. And it
is a lot less overhead to check for loops than to check for graphs.

Keeping track of all "used" nodes has a space requirement proportional to the
size of the directory structure. Just testing for loops in a directory tree
has space proportional to the depth of the tree. (log n for a "well balanced"
tree)

(And, as my previous post pointed out, you can do the non-circularity test
with a simple one directional list, instead of the overhead of a splay tree.)

****************************************************************

From: Robert A. Duff
Sent: Monday, June 30, 2003 7:27 PM

All this talk about graphs is missing the point. In the file systems I'm
aware of, all the non-treeish edges in the graph are distinguished. They're
called "symbolic links" in Unix, for example. You walk the "graph" of
directories by walking the actual directory entries, and skipping the
symbolic links.
So it really is just a plain tree walk, with no need to detect cycles, nor keep track of where you've been. Anything else would be beyond the scope of this AI! P.S. Yes, I know Unix hard links form a DAG, but they can't point to directories, so most programs don't care. **************************************************************** From: Pascal Obry Sent: Tuesday, July 1, 2003 1:32 AM Tucker Taft writes: > As I believe others have indicated, on Unix "hard" links are not > allowed to directories, only to "leaves" in the file system, making > a particularly simple form of DAG. I'm not sure Windows > has anything like "hard" links. Symbolic (soft) links on Unix, and > "shortcuts" on Windows, can create arbitrary graphs, AFAIK Windows "shortcuts" are only usable under Windows Explorer. The Win32 API does not know about "shortcuts", these are just seen as files with .lnk extension. So there is no way to create graph on Windows. **************************************************************** From: Jean-Pierre Rosen Sent: Tuesday, July 1, 2003 2:31 AM > One has to presume there is some way to uniquely identify a location > within the graph, and given that, one simply keeps a stack of these > unique identifiers for the nodes currently being "visited," Of course, if you have a unique identification, it is trivial. But that's the point: I don't know how to do that. Certainly, the full name does not count: on Unix you may have several different full names. Note also that the presence of "mount --bind" complicate things, as the full name is not unique, *even if it has no symbolic links in it*. (Doesn't seem that "mount --bind" can create circularities, though). **************************************************************** From: Jean-Pierre Rosen Sent: Tuesday, July 1, 2003 2:34 AM > P.S. Yes, I know Unix hard links form a DAG, but they can't point to > directories, so most programs don't care. 
Note that "mount --bind" allows you to pass several times through the same directory, although I don't think it allows you to loop indefinitely. **************************************************************** From: John Halleck Sent: Tuesday, July 1, 2003 11:08 AM Yes, but on unix you have inodes to uniquely identify them. What is being used as the unique identifier doesn't need to be exposed to users, it only has to be known to the routines that are doing the walk. And it is only an issue on those systems where loops are possible. I fail to see the problem here. For all operating systems that I can think of at the moment it seems one has either the property that you can't have loops, or you have the property that there is SOME unique identifier. (Including on operating systems that have thankfully passed away since I last programmed for them.) In the first case it is not an issue. In the second case there are reasonable well known algorithms to deal with it. **************************************************************** From: Bernard Maudry Sent: Tuesday, July 1, 2003 2:33 AM >If you know a good algorithm (i.e. better than brutly limiting the depth >of recursion) to detect such circularities, I am interested. Store the inodes of the ancestors and compare with the current node one. In fact, the naive recursion will only loop until Storage_Error is raised (or segmentation violation is signaled) when you reach the end of the stack, in which case you know you've got a circularity :-) **************************************************************** From: Jean-Pierre Rosen Sent: Tuesday, July 1, 2003 12:14 PM Not really portable to windows ;-) **************************************************************** From: Randy Brukardt Sent: Tuesday, July 1, 2003 3:17 PM Doesn't need to be. The only way to get a directory loop in Windows is to use a sector editor to edit a disk directory entry. 
And if you do that, Chkdsk and Scandisk throw up, so that hardly can be considered a correct disk structure. (I've had that happen on damaged floppies in the past, which is why I know it's possible...) So there isn't any good reason for loop-detecting in Windows. (The discussion here suggests that there isn't much reason to do it on Unix, either, but YMMV.) **************************************************************** From: Juanma Barranquero Sent: Monday, July 7, 2003 3:50 PM > Doesn't need to be. The only way to get a directory loop in Windows is to > use a sector editor to edit a disk directory entry. Not on W2K/XP, where you can have both hard and soft links ("junctions"). It's true that 2K/XP systems do not include a user-level tool to create junctions, but there are some out there(see JUNCTION.EXE, at http://www.sysinternals.com). Microsoft's Developer Networks has (sparse) documentation for hard links and junctions: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base/hard_links_and_junctions.asp **************************************************************** Editor's note: Windows hard links, like Unixes, cannot point at a directory. Windows soft links (really 'reparse points') probably should be identified as special files. But hardly anything uses them. - RLB - 7/22/03. **************************************************************** From: Martin Dowie Sent: Tuesday, July 1, 2003 11:44 AM This is getting a bit confusing now. The AI states: "A special file is an external file that cannot be created or read by a predefined Ada Input-Output package." Well, I've just read in a '.lnk' file using Ada.Text_IO and dumped them to screen - it's a mess but it clearly can be done! But here Tucker indicates that '.lnk' files should be special files. So - what's it to be??? **************************************************************** From: Robert A. 
Duff Sent: Tuesday, July 1, 2003 12:32 PM I think it is too file-system-specific to try to specify anything about the meaning of special files. In particular, we should not forbid Text_IO from dealing with them (perhaps using Form parameters). So a special file is whatever the implementation wants it to be (perhaps with some hints as to how we expect it to work on Unix and Windows). **************************************************************** From: Randy Brukardt Sent: Tuesday, July 1, 2003 2:16 PM I don't think that Windows .lnk files are special enough to treat as "special" files. As someone else pointed out, they only have a meaning to Explorer -- if you open one with a file operation (like Text_IO), it is just an ordinary file. Of course, an implementation could try to interpret them if it wanted, and such an implementation probably ought to treat them as special. So I'd say that the most important issue is that the implementation is consistent. **************************************************************** From: Martin Dowie Sent: Tuesday, July 1, 2003 3:40 PM Absolutely, but I think it's a good idea to thrash these things out so that when compiler vendors do start to produce their own (and I can throw my implementation away! :-), that they all, hopefully, implement consistently on the same platform. **************************************************************** From: Martin Dowie Sent: Tuesday, July 1, 2003 7:16 AM For Windows... ...is a file named '.foobar' a simple name or an extension? While a file cannot be named thus in Windows Explorer, it can be created via Ada.Text_IO or via a 'DOS' prompt and then listed via Explorer. In this case, both the 'Name' and 'Type' columns in Windows Explorer contain the value '.foobar'. 
I was inclined to treat this as a simple name with no extension until I created a file called '.txt' which was treated by Windows as a text file and opened up quite happily in 'notepad', which in turn was fine about saving it as that name after some edits. Now I'm not sure at all... **************************************************************** From: Randy Brukardt Sent: Tuesday, July 1, 2003 2:28 PM The AARM note suggests that it is the extension, and the Simple_Name would be "". But I think the important thing here is to do what Windows does, and it seems that ".txt" has an extension of "txt". And, since it seems easiest to just use the rightmost '.', I'd be inclined not to make a special case. **************************************************************** From: Martin Dowie Sent: Tuesday, July 1, 2003 3:37 PM I think my original question was wrong! ".txt" is a simple name with an extension "txt" and a _base_name_ of "". **************************************************************** From: Martin Dowie Sent: Tuesday, July 1, 2003 7:44 AM For function Containing_Directory, what should it return if only a simple name is passed? e.g. "foobar.txt". Should it return Use_Error as "the external file does not have a containing directory" or the current directory? If the former then Containing_Directory must always be passed a parameter that has some form of directory reference within it (be it '.'/'..'/'C:\'), yes? **************************************************************** From: Randy Brukardt Sent: Tuesday, July 1, 2003 2:24 PM The AI says that Containing_Directory "is purely a string manipulation function." That was not true originally, but ultimately it was changed to just manipulate strings. So it clearly does NOT return the Current_Directory (explicitly). OTOH, I don't see any reason to fail this call on most systems (and it would be unfriendly to do so). The Current_Directory can always be written ".\" on Windows and "./" on Unix. 
So that is what I'd return if there is no explicit path on the name. Probably the AARM note should make this explicit. **************************************************************** From: Nick Roberts Sent: Tuesday, July 1, 2003 3:58 PM In my previous comment (numbered 7) in the AI, I made an assumption that the Containing_Directory would interrogate the operating system (execution environment). I misunderstood the intent of this function, and I agree that it should not interrogate the OS (but be a decompositional function, which merely examines the string parameter passed into it). I suggest that if the Containing_Directory function is passed a simple name, as Martin illustrates, the function should simply return a string which signifies 'the current directory' (e.g. "" or "."). In addition, I suggest that if the function is passed "", it should return a string which signifies 'the parent directory' (e.g. ".."). In either case, Use_Error could be raised if an appropriate string value is not possible. **************************************************************** From: Laurent Pautet Sent: Tuesday, July 1, 2003 3:03 PM I would suggest to have a look at CORBA COS Naming which provides data structures and several routines quite similar to a file system. **************************************************************** From: Nick Roberts Sent: Tuesday, July 1, 2003 3:48 PM > A naive recursion is easy. But directory trees may contain circularities, > in which the naive recursion loops for ever :-(( I look upon CORBA (specifically, CosNaming), and some other similar (object-oriented) systems, as a model to bear in mind for the future of the object naming systems that will take over from traditional file naming systems. Under CORBA (CosNaming), there is no attempt to forbid or prevent referential loops (and I believe any such attempt would be futile). I think we have to assume that, in general, name systems could have loops. 
However, I also believe that, under a well-organised name structure, it will always be nonsensical to 'recursively' descend a tree, because the (tacit) assumption is that membership within a container (directory) imparts some extra meaning (type) upon the members, which recursive descent is oblivious to. Therefore, my suggestion is that programs (Ada or otherwise) never do recursive descent. The only programs which must be able to do this are tools which must themselves interpret legacy software (typically scripts). All other programs which perform recursive descent can be reasonably easily reprogrammed to avoid doing so (and should be). Legacy software, especially scripts, which performs (or demands) recursive descent will have to be rewritten before being suitable for environments which may have loops. **************************************************************** From: Nick Roberts Sent: Tuesday, July 1, 2003 3:16 PM "Tucker Taft" wrote: > Well Create_Path makes more sense than Create_Tree, > since it is only creating a single path down to the > specified file; it is not creating a tree of nodes. > > Delete_Tree on the other hand is deleting a whole > tree of nodes. I'd like to draw attention to my previous comment (numbered 4) on Create_Path. The semantics I suggest would render this procedure unnecessary. **************************************************************** From: Randy Brukardt Sent: Tuesday, July 1, 2003 3:41 PM The ARG considered the entire list of 17 items from your previous comment at the February 2002 meeting. The minutes don't record the exact disposition for each item, but any of these ideas that were thought to be valuable were used in the final AI. My personal opinion is that you really do need both routines. Always automatically creating missing directories would be bad, especially in a high-security environment. That's because a missing directory will be created to be owned by the creator, with appropriate permissions for that creator. 
If the directory was supposed to be owned by some other user (possibly with stronger access controls), this behavior could be used to change the ownership of critical resources (presuming that some technique for deleting the original directory had been found by an attacker). Since Ada generally has defaults that are "safe", and automatic creation of directories is not necessarily "safe", we want to do that only when the user explicitly asks for it. (And there has to be a way for the user not to ask for it.) **************************************************************** From: Nick Roberts Sent: Tuesday, July 1, 2003 4:49 PM Excellent! > My personal opinion is that you really do need both routines. ... I see that Create_Path is defined so that it can be used to harmlessly ensure a path exists. The semantics I suggested (automatic creation of missing elements of a path) can therefore be easily achieved by preceding a statement such as: Create(Accounts_File,Out_File,Accounts_Name); with: Create_Path( Containing_Directory(Accounts_Name) ); I suspect that this is what many Ada programmers would do, in practice. This approach has the benefit of making it somewhat more obvious to the reader (code reviewer) that some or all elements of a file's path may be created, in addition to the file. It could also have the benefit of allowing the programmer to trap an exception raised by the (attempted) creation of the path prior to the (attempted) creation of the file itself. It has the disadvantage, for existing Ada software, of either continuing to fail to create a file whose path is not all in existence or requiring the appropriate Create_Path calls to be inserted (and the software rebuilt). I'm not convinced about Randy's security argument. Assuming that many Ada programmers do what I illustrate above, the security risk Randy describes will remain. In the end, wouldn't it just be simpler to have Ada create the path elements automatically? 
Perhaps it is not a very important point. **************************************************************** From: Randy Brukardt Sent: Tuesday, July 1, 2003 6:24 PM > It has the disadvantage, for existing Ada software, of either continuing to > fail to create a file whose path is not all in existence or requiring the > appropriate Create_Path calls to be inserted (and the software rebuilt). Existing software either already handles directory creation with implementation-specific calls, or assumes that an outside force has created needed directories. Either way, there is no obvious value to a change. > I'm not convinced about Randy's security argument. Assuming that many Ada > programmers do what I illustrate above, the security risk Randy describes > will remain. But you made the argument above that the explicit call to Create_Path makes code review easier. And, clearly, if the application cares about security, the code review will see the Create_Path and determine whether that is appropriate. In any case, I don't see the need for wide-spread use of this sort of function. Claw.Directories does not have anything like Create_Path, meaning I've never used it in an application. Generally, I assume that an "outside force" has created the needed "root" directories (if not, even the barest testing will detect the missing directory), and thus only create the local directories needed. There have been a couple of cases in Q&D apps where Create_Path probably would have been used, but I think it ought to be avoided in major applications. But that depends on many factors; I don't want to be deciding that for everyone. > In the end, wouldn't it just be simpler to have Ada create the path elements > automatically? Perhaps it is not a very important point. Why do you think it is a good idea to cover up an error? If the application really expects the directory to already exist (for security or other reasons), why would you want to force the error to not be detected? 
(And essentially, there would be no way to detect the error?). In any case, this was previously decided, and I certainly hope that we are not re-opening every decision because someone doesn't like it. If that is the case, this will be Ada 2015, not Ada 2005! **************************************************************** From: Nick Roberts Sent: Wednesday, July 2, 2003 7:41 AM Ah, but I think there is value to the change I suggested. The majority of software (which creates files) obtains the filenames it uses from an outside source, such as a configuration file or command-line argument, and simply uses the filename given to it without interpreting or checking it. Users of these programs will have got used to the idea that they must e.g. create directory "x" before passing a filename "x/y" to the program. It would be genuinely useful to many users to be able to just pass "x/y" to the program, knowing the program will automatically create "x" if necessary. My suggestion would enable existing Ada programs to obtain this new and useful functionality simply by being rebuilt (no source text changes would be necessary). Surely this would be of value? > > I'm not convinced about Randy's security argument. Assuming that > > many Ada programmers do what I illustrate above, the security risk > > Randy describes will remain. > > But you made the argument above that the explicit call to Create_Path > make code review easier. And, clearly, if the application cares about > security, the code review will see the Create_Path and determine > whether that is appropriate. Yes. In fact, what I am trying to do is to put both sides of the coin. I am saying that I think there is value in either approach (Create_Path or implicit path creation), and that it is not a foregone conclusion that one is better than the other. > ... > > In the end, wouldn't it just be simpler to have Ada create the > > path elements automatically? Perhaps it is not a very important > > point. 
> > Why do you think it is a good idea to cover up an error? If the > application really expects the directory to already exist (for > security or other reasons), why would you want to force the error > to not be detected? (And essentially, there would be no way to > detect the error?). As I previously mentioned, it is not usually the case that the application (program) expects anything of a filename it uses to create a file, but rather of the user passing the filename to the program. It is not simply a case of 'it is an error if the path does not exist'! Sometimes it will indeed be an error. For example, if the user typed "makeskel ~/prog/test" when "makeskel ~/test/prog" was intended. The typical consequences of such an error would be annoying, but not serious (compared to getting an error such as "'prog' does not exist"). Often, however, it will be truly useful for the user to be able to type, for example, "makeskel ~/test/fred/t1" knowing that the directory "fred" will be created automatically. More importantly, it will be extremely useful for an installation program that, for example, reads a long list of file names from a configuration file, not to be stopped if it happens that one of those names contains a path with a missing element. The user could be left perplexed. (Sometimes a half-completed installation can be pernicious in making it impossible to go forward or back.) In such cases, it is most likely that the best solution is simply to create the missing element; this action may sometimes be wrong, but it will very rarely be harmful. > In any case, this was previously decided, and I certainly > hope that we are not re-opening every decision because > someone doesn't like it. If that is the case, this will be > Ada 2015, not Ada 2005! Well possibly the previous decision needs to be reviewed, and I hope that I've convincingly argued for doing so. I agree it's not good enough for me just to say "I don't like that!" 
I would respectfully suggest that it's also not good enough for the ARG to reject a suggestion simply on the basis of "We don't like that!" **************************************************************** From: Nick Roberts Sent: Tuesday, July 1, 2003 5:30 PM Are Search_Type and Directory_Entry_Type really intended to be limited? Line 542 of AI-248 (v1.17) says: Values of Search_Type and Directory_Entry_Type can be saved and queried later. How can a value of a limited type be saved and queried later? If Directory_Entry_Type were to be of a non-limited type, I think the implied "=" function could be rather useful. Some implementations could use internal data (e.g. inode number) to perform this test, which might be better (more reliable, more efficient) than the alternative of comparing full names. Possibly it is not really necessary to allow values of Search_Type to be saved and queried later. Maybe Search_Type should remain limited. **************************************************************** From: Randy Brukardt Sent: Tuesday, July 1, 2003 6:08 PM > How can a value of a limited type be saved and queried later? Create the object with an allocator, and save the access somewhere. I think that statement was a holdover from the original version of the package, where these types were not limited. But this is similar to Exception_Occurrence (also limited), which also can be saved and queried later. I still think Directory_Entry_Type should be non-limited, but I was overruled. There doesn't seem to be much that you could do with a Search_Type item that you saved, so there doesn't seem to be any point of that. > If Directory_Entry_Type were to be of a non-limited type, I think the > implied "=" function could be rather useful. Some implementations could use > internal data (e.g. inode number) to perform this test, which might be > better (more reliable, more efficient) than the alternative > of comparing full names. Perhaps. 
But defining the semantics precisely enough to have that effect would be difficult. I had just left it implementation-defined. In any case, it was decided at the November 2001 meeting to make it limited. Perhaps this is a good time for a general comment. This package is a perfect example of a case where "Perfect" is the enemy of "Good Enough". Everyone has an idea of how to do this a little bit better. If we actually did that, we'd be bogged down here, and never get an actual implementation to use! This is functionality that probably ought to have been in Ada 83 - we really don't want to delay it to "improve" the package. Of course, if there are real bugs or gaps, those we need to know about -- but, otherwise, the ARG has other, more important things to do than endlessly revisit this package. **************************************************************** From: Vincent Celier Sent: Thursday, November 20, 2003 5:00 PM I'd like to have some clarification about function Extension and Base_Name. This is what the AI says: function Extension (Name : in String) return String; Returns the extension name corresponding to Name. The extension name is a portion of a simple name (not including any separator characters), typically used to identify the file class. If the external environment does not have extension names, then the null string is returned. The exception Name_Error is propagated if the string given as Name does not allow the identification of an external file. AARM Discussion: For Unix and Windows, the extension is the portion of the simple name following the rightmost period. For example, in the simple name "RM-A-8.html", the extension is "html". function Base_Name (Name : in String) return String; Returns the base name corresponding to Name. The base name is the remainder of a simple name after removing any extension and extension separators. 
The exception Name_Error is propagated if the string given as Name does not allow the identification of an external file (including directories and special files). AARM Discussion: For Unix and Windows, the base name is the portion of the simple name preceding the rightmost period. For example, in the simple name "RM-A-8.html", the base name is "RM-A-8". For function Extension, the AI speaks of "separator characters", for function Base_Name the AI speaks of "extension characters". I am assuming these are the same. I am also assuming that the two paragraphs starting with AARM discussion are not normative, Let's say that the only extension character is the dot '.'. Extension (".") = "" Base_Name (".") = "" Extension (".hidden") = "hidden" Base_Name (".hidden") = "" If we follow the AARM discussion, Extension ("..") = "" Base_Name ("..") = "." Extension ("....toto") = "toto" Base_Name ("....toto") = "..." Extension ("file.1.ada") = "ada" Base_Name ("file.1.ada") = "file.1" But from the normative part, we could also decide for these: Base_Name ("..") = "" -- we have removed "any extension and extension separators" Base_Name ("....toto") = "" -- ditto Extension ("file.1.ada") = "1.ada" Base_Name ("file.1.ada") = "file" Are any of this interpretations correct? (A perverse reading could also decide that Extension ("file.1.ada") = "ada" Base_Name ("file.1.ada") = "file1" -- we have removed "any extension and extension separators" **************************************************************** From: Randy Brukardt Sent: Thursday, November 20, 2003 9:38 PM > I'd like to have some clarification about function Extension and > Base_Name. ... The wording of these functions (and much of the rest of the AI) is intentionally vague. It is intended that implementations will select an appropriate mapping for their OS. The AARM notes are intended to explain what we mean for Windows and Unix, so that there is at least some uniformity of implementations. 
> For function Extension, the AI speaks of "separator characters", for function > Base_Name the AI speaks of "extension characters". I am assuming these are the same. Base_Name talks about "extension separators", not "extension characters". The use of "separator characters" in Extension looks like a buglet. The AI usually talks of "separators" or "extension separators". "Character" hardly appears at all. The buglet seems harmless, though. (Hopefully, we'll fix it when we do integration of the changes.) > I am also assuming that the two paragraphs starting with AARM > discussion are not normative, Of course. > Let's say that the only extension character is the dot '.'. "extension separator". :-) > If we follow the AARM discussion, > > Extension ("..") = "" > Base_Name ("..") = "." > > Extension ("....toto") = "toto" > Base_Name ("....toto") = "..." > > Extension ("file.1.ada") = "ada" > Base_Name ("file.1.ada") = "file.1" That seems correct. For Windows (with which I am most familiar), the extension follows the right-most dot. Martin Dowie pointed out in a question he sent in July (see the AI appendix) that a file named ".txt" loads in Notepad when clicked on - Windows thinks it's a "Txt" file. Similarly, Rational's "File.1.Ada" is an "Ada" file to Windows. I don't want to speak for other OSes. > But from the normative part, we could also decide for these: > > Base_Name ("..") = "" > -- we have removed "any extension and extension separators" > > Base_Name ("....toto") = "" > -- ditto > > Extension ("file.1.ada") = "1.ada" > Base_Name ("file.1.ada") = "file" > > Are any of this interpretations correct? Certainly those are correct interpretations of the standard, and you wouldn't fail validation if you followed any of them. But the intent here is that the extension is whatever the extension is on the target system, and the AARM guidance is intended to ensure that implementors on common systems make the same choices. Does that answer your question?? 
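****************************************************************

Illustrative sketch (not part of the original correspondence): the "rightmost period" reading discussed above can be written down in a few lines of Ada. Sketch_Extension, Sketch_Base_Name, and Last_Dot are hypothetical helpers introduced only for this illustration; they are not the Ada.Directories functions. The sketch assumes that the argument is already a simple name and that '.' is the only extension separator, and an implementation remains free to choose a different mapping for its target system.

```ada
with Ada.Strings.Fixed;
with Ada.Text_IO;

procedure Rightmost_Period_Sketch is

   --  Hypothetical helpers for illustration only; these are NOT the
   --  Ada.Directories functions. Assumptions: the argument is already a
   --  simple name, and '.' is the only extension separator.

   function Last_Dot (Simple : String) return Natural is
   begin
      --  Position of the rightmost period, or 0 if there is none.
      return Ada.Strings.Fixed.Index
               (Simple, ".", Going => Ada.Strings.Backward);
   end Last_Dot;

   function Sketch_Extension (Simple : String) return String is
      Dot : constant Natural := Last_Dot (Simple);
   begin
      if Dot = 0 then
         return "";                              --  no period at all
      else
         return Simple (Dot + 1 .. Simple'Last); --  text after the period
      end if;
   end Sketch_Extension;

   function Sketch_Base_Name (Simple : String) return String is
      Dot : constant Natural := Last_Dot (Simple);
   begin
      if Dot = 0 then
         return Simple;                           --  whole name is the base
      else
         return Simple (Simple'First .. Dot - 1); --  text before the period
      end if;
   end Sketch_Base_Name;

   procedure Show (Simple : String) is
   begin
      Ada.Text_IO.Put_Line
        ("[" & Simple & "] base [" & Sketch_Base_Name (Simple)
         & "] extension [" & Sketch_Extension (Simple) & "]");
   end Show;

begin
   Show (".");
   Show ("..");
   Show (".hidden");
   Show ("....toto");
   Show ("file.1.ada");
   Show ("RM-A-8.html");

   --  Values quoted in the discussion for the rightmost-period reading.
   pragma Assert (Sketch_Extension ("..") = "");
   pragma Assert (Sketch_Base_Name ("..") = ".");
   pragma Assert (Sketch_Base_Name ("....toto") = "...");
   pragma Assert (Sketch_Extension ("file.1.ada") = "ada");
   pragma Assert (Sketch_Base_Name ("file.1.ada") = "file.1");
end Rightmost_Period_Sketch;
```

The single backward search applies to every name uniformly, so "." and ".." need no special handling. The alternative readings quoted above (for example, an extension of "1.ada") would need a forward search or extra rules instead.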
**************************************************************** From: Robert A. Duff Sent: Friday, November 21, 2003 8:26 AM > The wording of these functions (and much of the rest of the AI) is > intentionally vague. It is entirely appropriate that the requirements of these functions be vague enough that an implementation has plenty of freedom to do whatever seems right for their platform. However, we shouldn't be so vague that implementers don't know what to do. The AARM notes are an attempt to achieve uniformity, at least on Windows and Unix. They should be precise enough that two independent implementers will not accidentally choose two different behaviors. Therefore, I think it would be helpful to add some of Vincent's "weird" examples to the AARM. It would be a Good Thing if all implementations choose the same extension for ".", "..", ".mumble", "mumble.", and "mumble.1.ada". It would also be good if Unix implementations and Windows implementations gave the same answers. It would also be good if Unix implementations did the same thing as other tools, like :h and :t in Unix csh. I've no idea what those do. **************************************************************** From: Robert I. Eachus Sent: Friday, November 21, 2003 9:42 AM Definitely, And add some examples like ".cshrc", "../tmp" and "C:\Program Files". **************************************************************** From: David Emery Sent: Tuesday, November 25, 2003 3:18 AM A couple more questions based on and extending (pun intended :-) Vincent's questions: On POSIX, directories often have two special entries, "." and "..". What do Extension(".") Extension("..") Base(".") and Base("..") return ? Note that, if a directory is a 'file' (and in POSIX, a directory is certainly a special case of file), so the 'current directory' and 'parent directory' ("." and "..", respectively) designate an external file, so the condition for "Name_Error" does not match. 
On Traditional Unix/POSIX, the use of file extensions is a convention only. This is not true on Windoze or on MacOS X (and that's a mis-feature of OS X, IMHO.) **************************************************************** From: Vincent Celier Sent: Tuesday, November 25, 2003 7:03 AM From Randy's reply, I understand that the recommended values are: > Extension(".") = "" > Extension("..") = "" > Base(".") = "" > and Base("..") = "." **************************************************************** From: Robert Dewar Sent: Tuesday, November 25, 2003 8:12 AM The important thing is to have simple rules. If the rules produce odd results in these marginal cases, that's too bad, but it should not be fixed by introducing weird exceptional cases. **************************************************************** From: David Emery Sent: Tuesday, November 25, 2003 8:28 AM For what it's worth, I don't think that these are "marginal cases". A marginal case, IMHO, would occur if some user could create a file named only ".". If you do "ls -aF" on any POSIX-conforming system, you'll see the entries "." and "..", on -every directory-. Special cases, absolutely. Marginal, no. (But I'd consider "..." a marginal case, since that is not a filename that appears in every directory.) These special cases must be clearly, unambiguously handled by the standard. (I'm particularly thinking about someone writing a program that iterates over the contents of directories, a quite reasonable thing to do, and that program will have to handle "." and "..".) But we can handle these cases by simple declaration of the rules. My preference is to just state these cases. But I could also live with a general rule that says that the interpretation of a filename consisting only of separators produces the empty string for both Base() and Extension(), which handles the Windoze and POSIX flavors correctly. 
**************************************************************** From: Robert Dewar Sent: Tuesday, November 25, 2003 8:35 AM > For what it's worth, I don' think that these are "marginal cases". > A marginal case, IMHO, would occur if some user could create a file > named only ".". If you do "ls -aF" on any POSIX-conforming system, > you'll see the entries "." and "..", on -every directory-. Special > cases, absolutely. Marginal, no. (But I'd consider "..." a marginal > case, since that is not a filename that appears in every directory.) What is marginal is to ask for the extension of a file called . or .. My point is that I do NOT want to see special rules for such cases. I want to see simple rules for what extension means, and I really don't care what peculiar results come out for . and .. since the concept of extension is bogus in these cases anyway. > These special cases must be clearly, unambiguously handled by the > standard. Well I would prefer that they are NOT special cases in the standard. I think all we need is a set of rules that apply algorithmically to all file names, and whatever comes out of these rules is what you get for . and .. In particular: > >> > >> > Extension(".") > >> = "" > >> > Extension("..") > >> = "" > >> > Base(".") > >> = "" > >> > and Base("..") > >> = "." These results seem quite fine to me **************************************************************** From: Randy Brukardt Sent: Tuesday, November 25, 2003 5:59 PM Robert Dewar said: > > These special cases must be clearly, unambiguously handled by the > > standard. > > Well I would prefer that they are NOT special cases in the standard. I think > all we need is a set of rules that apply algorithmically to all file names, > and whatever comes out of these rules is what you get for . and .. 
The current AI gives the following recommendation for extensions on Unix and Windows systems: AARM Discussion: For Unix and Windows, the extension is the portion of the simple name following the rightmost period. For example, in the simple name "RM-A-8.html", the extension is "html". Robert continues: > In particular: > > > >> > > >> > Extension(".") > > >> = "" > > >> > Extension("..") > > >> = "" > > >> > Base(".") > > >> = "" > > >> > and Base("..") > > >> = "." > > These results seem quite fine to me It is pretty weird to take the extension of any directory name (the type of a directory is "directory", you don't need to figure out the type from the name as with regular files), so I tend to agree with Robert. But I don't feel strongly. Similarly, I don't see much point in further examples for the definition of "Extension". If you don't know what the "rightmost period" means, you don't belong within a country mile of a computer, much less a compiler runtime. There might be some value in other cases (I haven't looked at the whole of this for a while). **************************************************************** From: David Emery Sent: Tuesday, November 25, 2003 6:22 PM What's wrong with my suggestion that a filename consisting solely of separators has neither a Base nor an Extension? This handles the Windoze and POSIX cases correctly, and avoids the special cases that Robert Dewar does not like. **************************************************************** From: Randy Brukardt Sent: Tuesday, November 25, 2003 6:35 PM > What's wrong with my suggestion that a filename consisting solely > of separators has neither a Base nor an Extension? This handles the > Windoze and POSIX cases correctly, and avoids the special cases > that Robert Dewar does not like. That means that the identity Compose (Name => Base_Name(File_Name), Extension => Extension(File_Name)) = File_Name would not hold. 
I realize that the normative language can't guarantee this (because it has to allow for the possibility of multiple separators), but on all common systems, there is only one separator, and thus it does hold. And it should logically hold on all systems (if there are multiple separators, they should be interchangeable). With your proposal, Compose would return "." when File_Name is "..".

It also would make the implementation of Base_Name and Extension a lot more complex (admittedly compared to nearly trivial). There is no easy way to determine if a string is "solely of separators"; you simply have to check with an extra loop.

****************************************************************

From: Vincent Celier
Sent: Tuesday, November 25, 2003 8:26 PM

> That means that the identity
>
> Compose (Name => Base_Name(File_Name), Extension => Extension(File_Name))
> = File_Name
>
> would not hold. I realize that the normative language can't guarantee this
> (because it has to allow for the possibility of multiple separators), but on
> all common systems, there is only one separator, and thus it does hold.

But that will not hold for Posix.

File_Name = "toto"     Compose (...) = "toto"
File_Name = "toto."    Compose (...) = "toto"

For these two different file names "toto" and "toto.", Compose (Name => Base_Name(File_Name), Extension => Extension(File_Name)) gives the same value, because the two Base_Name and Extension are the same.

****************************************************************

From: Tucker Taft
Sent: Wednesday, November 26, 2003 7:33 AM

It seems like we ought to address this. Perhaps we need something that just returns the separator as a string, and can return "" for "toto" and "." for "toto.", or something like that. It would be nice to have an invertible set of split/compose operations.
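
[Editor's illustration, not part of the original messages.] The round trip discussed above can be tried with a few lines of Ada. This is only a sketch: the procedure name Compose_Round_Trip and the helper Show are hypothetical, and what Base_Name and Extension return for a name such as "toto." depends on how the implementation defines extensions. On an implementation that follows the "rightmost period" rule quoted earlier, "toto" and "toto." would be expected to rebuild to the same name.

```ada
with Ada.Text_IO;
with Ada.Directories; use Ada.Directories;

procedure Compose_Round_Trip is

   --  Hypothetical helper: split File_Name with Base_Name and Extension,
   --  put it back together with Compose, and report whether the result
   --  is the name we started from.
   procedure Show (File_Name : in String) is
      Rebuilt : constant String :=
        Compose (Name      => Base_Name (File_Name),
                 Extension => Extension (File_Name));
   begin
      if Rebuilt = File_Name then
         Ada.Text_IO.Put_Line
           ("""" & File_Name & """ round-trips unchanged");
      else
         Ada.Text_IO.Put_Line
           ("""" & File_Name & """ comes back as """ & Rebuilt & """");
      end if;
   end Show;

begin
   Show ("RM-A-8.html");  --  the example name used in the AARM discussion
   Show ("toto");
   Show ("toto.");        --  the case raised by Vincent Celier
end Compose_Round_Trip;
```

No expected output is given here on purpose, since the AI leaves the exact split implementation-defined.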
****************************************************************

From: Vincent Celier
Sent: Wednesday, November 26, 2003 8:20 AM

I am not sure we should standardize new stuff that "would be nice".

Base_Name and Extension are, as far as I know, new concepts that are not defined in Unix for example.

****************************************************************

From: David Emery
Sent: Wednesday, November 26, 2003 8:27 AM

They're not officially part of POSIX, and are only conventions in Unix. As far as I remember, the only 'official' use of a "." in a filename is the convention that is honored by shell utilities (e.g. "ls") that filenames that begin with a "." are, -by default- hidden from "ls" without an extra option. Examples include ".emacs" and ".cshrc". Such files are relatively common in 'normal' Unix, and are very common in MacOS X (they contain the resource fork for a file.) Windoze does not seem to honor this convention; whenever I hand over a USB drive or similar with files from MacOS X, I always have to remind the recipient "Look for the file named 'foo.bar', not '.foo.bar' ."

It seems to me that we're making a better argument for deleting these operations altogether.

****************************************************************

From: Tucker Taft
Sent: Wednesday, November 26, 2003 2:58 PM

I don't agree at all. Essentially all development tools on Unix and Windows use extensions to mean something about the content of the file. We are trying to provide useful capabilities to Ada users operating in Unix-like or Windows-like environments. Splitting file names apart and putting them back together is extremely common. In the c-shell, uses of ":t", ":h", ":r", and friends appear in almost every script of any size. Forcing every Ada user who wants to fiddle with file names to reinvent these wheels is what we are trying to avoid.
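
[Editor's illustration, not part of the original messages.] A typical piece of file-name fiddling of the kind described above is checking a source file's extension and deriving a related name from it. The sketch below uses only the name operations declared in the package; the procedure name Object_Name_Demo, the function Object_Name, and the ".adb" to ".o" convention are assumptions made for the example.

```ada
with Ada.Text_IO;
with Ada.Directories; use Ada.Directories;

procedure Object_Name_Demo is

   --  Hypothetical helper: keep the base name and attach a known
   --  extension, which is the main intended use of Compose.
   function Object_Name (Source : in String) return String is
   begin
      return Compose (Name      => Base_Name (Source),
                      Extension => "o");
   end Object_Name;

   Source : constant String := "main.adb";

begin
   --  Checking for a particular extension is the main intended use
   --  of Extension.
   if Extension (Source) = "adb" then
      Ada.Text_IO.Put_Line (Source & " -> " & Object_Name (Source));
   else
      Ada.Text_IO.Put_Line (Source & " is not an Ada body");
   end if;
end Object_Name_Demo;
```

Only string manipulation is involved, so the named file does not need to exist.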
**************************************************************** From: Randy Brukardt Sent: Wednesday, November 26, 2003 3:26 PM Dave Emery said: > It seems to me that we're making a better argument for deleting these > operations altogether. Suggestions to change the AI, which is already approved at both the ARG and WG9 levels, are out of order at this point. If we reopen discussion on every AI when someone decides they don't like something about it, we'll never make any progress -- especially on AIs like this one that everyone has an opinion about. Obviously, we don't want to ignore technical problems, but there aren't any here. This entire discussion is about whether the AARM language suggested in the AI is correct. My personal opinion is that the file name manipulation functions don't belong in this package. But, if we're going to have them, we ought to have a reasonably complete set. And there is no problem with the primary intended use of Extension - to check a particular file name to see if it has a particular extension. That makes perfect sense on POSIX as it does on Windows -- checking for .o in POSIX makes as much sense as checking for .obj in Windows. Similarly, the primary purpose of Compose is to add a known extension to a base name that comes from the program. So it seems that Compose is not an identity with Base_Name and Extension. That's too bad, but it should be clear that nothing in the AI says that it is. (Of course, in the other direction, there is no problem.) **************************************************************** From: Gary Dismukes Sent: Friday, March 25, 2003 5:18 PM I have a few comments on the Search_Type operations defined in package Ada.Directories that have come up since I submitted my review comments on Annex A. These are really just suggestions for clarification of the semantics of the Start_Search and End_Search operations. 
It's not clear what the state of the Search_Type actual parameter is after an unsuccessful call to Start_Search (when an exception such as Name_Error or Use_Error occurs). It seems that it should be explicitly defined to have no entries (i.e., it's in the same state as a default-initialized Search_Type object). This would also help make it clear that you don't have to call End_Search after such an unsuccessful call to Start_Search. Another choice would be to say it's in the same state as before the call, rather than resetting it to "no entries". Come to think of it, it seems that it should be an error to call Start_Search if the Search parameter has any entries, since that may indicate an error in usage (End_Search should have been called). So I would propose that an exception be raised in that case and the state be left as is (and otherwise the Search parameter has no entries after the call if some other exception condition occurs). I also think that the first two sentences of the semantic description of Start_Search need some revision. The first sentence reads: Starts a search in the directory entry in the directory named by Directory for entries matching Pattern. I believe this should really say (marking the deletion with []): Starts a search [in the directory entry] in the directory named by Directory for entries matching Pattern. The second sentence says: Pattern represents a file name matching pattern. and can be misread as saying "represents a file name that matches pattern", even though that obviously doesn't make sense. For improved scanning, either a hyphen should be added ("file-name matching pattern"), or, perhaps better, rephrase this as: Pattern represents a pattern that matches file names. As for End_Search, it would be nice to clarify that this operation can be used to end a search early (which is presumably the main reason for having this op). Also, it might be good to state explicitly whether it's necessary to call this operation to end a search. 
Actually, in the case where all entries are consumed, it's presumably just a formality to call End_Search, unless we want it to be an error to call Start_Search on a search object that hasn't been formally ended by a call to End_Search (but I'm not sure that's a worthwhile requirement to impose). The description of End_Search says: After a successful call on End_Search, the object Search will have no entries available. This raises the question as to what constitutes an unsuccessful call to End_Search. I would have thought that it's always okay to call End_Search. I guess the one case where it might make sense to have an error condition would be if Start_Search hasn't been called on the Search object, but that hardly seems worth protecting against. In any case, if there really is some condition that should cause an exception, then that should be stated explicitly. As a final comment, there was some mention in the minutes of one of the ARG meetings that Search_Type should be a controlled type, so that search objects can be finalized properly. It seems like it would be helpful to mention this in an AARM note. **************************************************************** From: Randy Brukardt Sent: Wednesday, August 17, 2005 11:18 PM Here is a disposition reply to a message Gary sent back in March. ... > Come to think of it, it seems > that it should be an error to call Start_Search if the Search > parameter has any entries, since that may indicate an error in > usage (End_Search should have been called). So I would propose > that an exception be raised in that case and the state be left > as is (and otherwise the Search parameter has no entries after > the call if some other exception condition occurs). When we discussed this way back when (in Cupertino, I think), we decided that the semantics of Start_Search was that it always (re)started a search: there is an implicit call to End_Search at the start of it. 
(This is *not* the semantics that I recommended, but I was overruled.) The only reason to call End_Search is to free up resources that won't be needed any more in an object that will continue to exist (a library-level object, for instance). So I think your original idea is what was intended. I added "When Start_Search propagates Name_Error or Use_Error, the object Search will have no entries available." as the last sentence of the paragraph. (I mentioned those exceptions specifically, because we shouldn't put any requirements on what happens when Device_Error (or some other exception) is raised; that represents an internal error or a malfunction and it isn't worth putting requirements on it. And we need to match End_Search.) ... > I believe this should really say (marking the deletion with []): > > Starts a search [in the directory entry] in the directory named by > Directory for entries matching Pattern. Yes, of course. > The second sentence says: > > Pattern represents a file name matching pattern. > > and can be misread as saying "represents a file name that matches pattern", > even though that obviously doesn't make sense. For improved scanning, either > a hyphen should be added ("file-name matching pattern"), or, perhaps better, > rephrase this as: > > Pattern represents a pattern that matches file names. If you didn't ask for a hyphen to be added somewhere, I would have thought some imposter was using your name. :-) I think rephrasing is best, but I don't think yours works. I used (after much experimenting): Pattern represents a pattern for matching file names. It still sounds a bit like a tautology, but that's because of the name of the parameter. > As for End_Search, it would be nice to clarify that this operation > can be used to end a search early (which is presumably the main reason > for having this op). Also, it might be good to state explicitly whether > it's necessary to call this operation to end a search. 
Actually, in the > case where all entries are consumed, it's presumably just a formality > to call End_Search, unless we want it to be an error to call Start_Search > on a search object that hasn't been formally ended by a call to End_Search > (but I'm not sure that's a worthwhile requirement to impose). As I mentioned above, we decided specifically not to do that. It's never necessary to call End_Search. > The description of End_Search says: > > After a successful call on End_Search, the object Search will have > no entries available. > > This raises the question as to what constitutes an unsuccessful call > to End_Search. I would have thought that it's always okay to call > End_Search. I guess the one case where it might make sense to have > an error condition would be if Start_Search hasn't been called on > the Search object, but that hardly seems worth protecting against. > In any case, if there really is some condition that should cause > an exception, then that should be stated explicitly. The only case that would raise an exception would be if some underlying operation failed. That would raise Device_Error (these are input-output operations, so A.13(11) applies). The Windows operation can return a failure code (although they never explain why). So an implementation might raise an exception for that reason. (Admittedly, my implementation just ignores errors from Find_Close.) That's sort of a blanket permission; it's never mentioned explicitly in the text. I've added an AARM note to explain that. I suppose we could drop the word "successful", but it seems nasty to impose requirements in the case of an underlying failure. (Imagine on a Windows implementation that the find handle didn't get closed on a failure. A later operation might return an entry, which would be wrong if we insisted on clearing even when a failure occurred.) 
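
[Editor's illustration, not part of the original messages.] The semantics described in this reply (Start_Search always restarts the search, and End_Search is never required) can be sketched as follows. The procedure name Search_Restart_Demo and the variable names are hypothetical, and the pattern "*" is an assumption: the form of Pattern is implementation-defined, although "*" is a wildcard on common implementations.

```ada
with Ada.Text_IO;
with Ada.Directories; use Ada.Directories;

procedure Search_Restart_Demo is
   Listing : Search_Type;
   Item    : Directory_Entry_Type;
begin
   --  First search: look at no more than one entry, then abandon it.
   Start_Search (Listing,
                 Directory => Current_Directory,
                 Pattern   => "*");   --  assumed wildcard
   if More_Entries (Listing) then
      Get_Next_Entry (Listing, Item);
      Ada.Text_IO.Put_Line ("first entry: " & Simple_Name (Item));
   end if;

   --  Restart on the same object without calling End_Search; per the
   --  discussion, Start_Search implicitly ends the previous search.
   Start_Search (Listing,
                 Directory => Current_Directory,
                 Pattern   => "*");
   while More_Entries (Listing) loop
      Get_Next_Entry (Listing, Item);
      Ada.Text_IO.Put_Line (Simple_Name (Item));
   end loop;

   --  Optional: only useful to release resources early when the
   --  object itself stays alive for a long time.
   End_Search (Listing);
exception
   when Name_Error | Use_Error =>
      --  After these exceptions the search object has no entries
      --  available, so no End_Search call is needed here.
      Ada.Text_IO.Put_Line ("search could not be started");
end Search_Restart_Demo;
```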
> As a final comment, there was some mention in the minutes of one of
> the ARG meetings that Search_Type should be a controlled type, so that
> search objects can be finalized properly. It seems like it would be
> helpful to mention this in an AARM note.

That's normative: A.16(101): "Type Search_Type needs finalization." Which is code for "might be implemented as a controlled type". That's all over the Standard (now), and saying more here, but not for (say) Text_IO.File_Type would be weird. Certainly, we're not going to *require* it to be a controlled type. So I don't think anything more is needed.

****************************************************************

From: Santiago Uruena
Sent: Thursday, August 11, 2005 1:51 PM

!topic Security vulnerability in Ada.Directories
!reference RM06-A.16
!from Santiago Urueña 2005-08-11
!keywords security vulnerability, file descriptor, Ada.Directories
!discussion

The package Ada.Directories, proposed for the 1st Amendment to the Ada language, provides a set of very useful operations for managing directories and files (ordinary and special files). However, the current interface is susceptible to filesystem-based attacks, a family of security vulnerabilities which affects every operating system with a Unix-like filesystem, and probably some recent versions of Windows. In those systems the content of a file is independent of its name, and a file may have more than one path through hard links and symbolic links.

This family of attacks affects those applications making multiple operations over a file (or directory) through the path name instead of using a file descriptor. A race condition can happen between two operations, and therefore the path name can refer to a different file. This can make the application fail because the program cannot find the file again, or worse, the attack can lead to a security vulnerability if the "impersonator" file is specially crafted to leak some information or to smash valuable data.
Both situations can be avoided by always using the file descriptor. For example, if the working directory of an application is moved during execution, the program does not need to fail if it always uses the file descriptor of the open directory while traversing through subdirectories. Sadly, this cannot be achieved with the current proposal of Ada.Directories because the operation Set_Directory only receives the path name.

This type of attack is explained by Ulrich Drepper in a paper about defensive programming [1]; please read the whole section 2.2 for full details ("Defeating Filesystem-Based Attacks", pp.5-9). The paper covers not only security-sensitive applications, but also techniques that every robust program should follow. He talks about creating files safely, the problems an application can face when replacing, renaming or removing a file (or directory), and techniques for achieving additional security by identifying the correct file before opening it.

However, some of those techniques cannot be implemented with the current proposal of the package Ada.Directories because a lot of its operations do not receive a file descriptor but only path names:

with Ada.IO_Exceptions;
with Ada.Calendar;
package Ada.Directories is

   -- Directory and file operations:

   function Current_Directory return String;
   procedure Set_Directory (Directory : in String);
   procedure Create_Directory (New_Directory : in String; Form : in String := "");
   procedure Delete_Directory (Directory : in String);
   procedure Create_Path (New_Directory : in String; Form : in String := "");
   procedure Delete_Tree (Directory : in String);
   procedure Delete_File (Name : in String);
   procedure Rename (Old_Name, New_Name : in String);
   procedure Copy_File (Source_Name, Target_Name : in String; Form : in String := "");

   -- File and directory name operations:

   function Full_Name (Name : in String) return String;
   function Simple_Name (Name : in String) return String;
   function Containing_Directory (Name : in String) return
String; function Extension (Name : in String) return String; function Base_Name (Name : in String) return String; function Compose (Containing_Directory : in String := ""; Name : in String; Extension : in String := "") return String; -- File and directory queries: type File_Kind is (Directory, Ordinary_File, Special_File); type File_Size is range 0 .. implementation-defined; function Exists (Name : String) return Boolean; function Kind (Name : String) return File_Kind; function Size (Name : String) return File_Size; function Modification_Time (Name : String) return Ada.Calendar.Time; -- Directory searching: type Directory_Entry_Type is limited private; type Filter_Type is array (File_Kind) of Boolean; type Search_Type is limited private; procedure Start_Search (Search : in out Search_Type; Directory : in String; Pattern : in String; Filter : in Filter_Type := (others => True)); procedure End_Search (Search : in out Search_Type); function More_Entries (Search : in Search_Type) return Boolean; procedure Get_Next_Entry (Search : in out Search_Type; Directory_Entry : out Directory_Entry_Type); procedure Search ( Directory : in String; Pattern : in String; Filter : in Filter_Type := (others => True); Process : not null access procedure (Directory_Entry : in Directory_Entry_Type)); -- Operations on Directory Entries: function Simple_Name (Directory_Entry : Directory_Entry_Type) return String; function Full_Name (Directory_Entry : Directory_Entry_Type) return String; function Kind (Directory_Entry : Directory_Entry_Type) return File_Kind; function Size (Directory_Entry : Directory_Entry_Type) return File_Size; function Modification_Time (Directory_Entry : Directory_Entry_Type) return Ada.Calendar.Time; Status_Error : exception renames Ada.IO_Exceptions.Status_Error; Name_Error : exception renames Ada.IO_Exceptions.Name_Error; Use_Error : exception renames Ada.IO_Exceptions.Use_Error; Device_Error : exception renames Ada.IO_Exceptions.Device_Error; private -- Not specified 
by the language. end Ada.Directories; To allow the implementation of the safe/secure techniques described in the paper, the following operations must receive/return a file descriptor (probably an object of the type Directory_Entry_Type is adequate): function Current_Directory return Directory_Entry_Type; function Containing_Directory (Name : in Directory_Entry_Type) return Directory_Entry_Type; procedure Set_Directory (Directory : in Directory_Entry_Type); These operations are needed for safely traversing through directories, and can be easily implemented over the operating system (e.g. fchdir is standard POSIX [2]). Other operations are highly recommended to receive or return a file descriptor, mainly the operations for removing, although the implementation is not direct (the paper gives some patterns to achieve it): procedure Delete_File (Name : in Directory_Entry_Type); procedure Rename (Entry : in Directory_Entry_Type; New_Name : in String); procedure Copy_File (Source : in Directory_Entry_Type; Target_Name : in String; Form : in String := ""); The following operations are also recommended, but the implementation can be tricky: procedure Create_Directory (Directory : in out Directory_Entry_Type New_Directory : in String; Form : in String := ""); procedure Delete_Directory (Directory : in Directory_Entry_Type); procedure Create_Path (Directory : in out Directory_Entry_Type New_Directory : in String; Form : in String := ""); procedure Delete_Tree (Directory : in Directory_Entry_Type); Of course, not every operation should be changed to receive a file descriptor, because it does not make any sense, or it is purely a string manipulation function: function Compose (Containing_Directory : in String := ""; Name : in String; Extension : in String := "") return String; function Extension (Name : in String) return String; function Base_Name (Name : in String) return String; Moreover, the last two functions should not work with the file descriptor, otherwise the extension 
and base name obtained may not be "atomic", i.e. the whole file name can have changed between the two calls.

An operation for which it does not make any sense to use a file descriptor is the Exists function:

   function Exists (Name : String) return Boolean;

This operation can be very useful, but a naive usage can bring security problems: if a program uses it to check whether a file does not exist before creating it, an attacker could create a symbolic link to a valuable file, which later will be crushed by the application (this attack has been employed many times to remove the contents e.g. of the file /etc/passwd through suid root applications that create files in the /tmp directory). However, I do not propose removing this operation from the interface because it can be very useful to a lot of users, and any security-aware application could avoid using it.

As the paper about defensive programming states, using the file-descriptor interface can degrade performance sometimes, mainly when it is needed to get the meta data of several files --three system calls (open, fstat, close) for every file versus only one (stat)--. In those cases Mr. Drepper advocates using the path name interface first until the desired file is found, and later the file-descriptor interface for avoiding race conditions.

However, it is not recommended to provide both interfaces because, in my opinion, the safe behavior should be the default one, and those applications that require improved performance probably will need to access the interface of the OS directly. If Ada.Directories provides both interfaces for every operation, users will end up using the insecure one because it is easier to use, without noticing that the other variants provide more advantages. This is the case of the C Library, with hundreds of functions that must not be used because they introduce several security problems, but programmers are still using them because they are available.
And this is not a small problem, but a big source of the current security problems. Although it is strongly recommended that the interface always provide subprograms that operate with file descriptors, maybe it is not possible to force the implementations to avoid all race conditions. For example, the method used in [1] to achieve a secure rename of the desired file relies on the /proc filesystem, which is Linux specific. Nowadays operating systems do not provide a system call that receives a file descriptor for this operation, only the path, so this could be an implementation burden, although it can be done in some systems (maybe this can be a documentation requirement). However, the file descriptor could be a private type which stores the path name (among other fields, as done in the current AdaCore implementation of Directory_Entry), therefore this interface can easily be implemented over any OS (although not every operation will be 100% secure), whereas compiler vendors could implement a secure interface over those operating systems that allows it (probably in the future the operating systems will add the required secure interface). On the other hand, the interface that operates only with file names, although easier to implement, cannot be implemented securely in those systems that allow it. The same argumentation applies to the subpackage Ada.Directories.Information, which in my opinion should provide only the interface that receives file descriptors: package Ada.Directories.Information is -- System specific directory information. -- Unix version. 
function Last_Access_Time (Name : in String) return Ada.Calendar.Time; function Last_Status_Change_Time (Name : in String) return Ada.Calendar.Time; type Permission is (Others_Execute, Others_Write, Others_Read, Group_Execute, Group_Write, Group_Read, Owner_Execute, Owner_Write, Owner_Read, Set_Group_ID, Set_User_ID); type Permission_Set_Type is array (Permission) of Boolean; function Permission_Set (Name : in String) return Permision_Set_Type; function Owner (Name : in String) return String; -- Returns the image of the User_Id. If a definition of User_Id -- is available, an implementation-defined version of Owner -- returning User_Id should also be defined. function Group (Name : in String) return String; -- Returns the image of the User_Id. If a definition of Group_Id -- is available, an implementation-defined version of Group -- returning Group_Id should also be defined. function Is_Block_Special_File (Name : in String) return Boolean; function Is_Character_Special_File (Name : in String) return Boolean; function Is_FIFO (Name : in String) return Boolean; function Is_Symbolic_Link (Name : in String) return Boolean; function Is_Socket (Name : in String) return Boolean; function Last_Access_Time (Directory_Entry : in Directory_Entry_Type) return Ada.Calendar.Time; function Last_Status_Change_Time (Directory_Entry : in Directory_Entry_Type) return Ada.Calendar.Time; function Permission_Set (Directory_Entry : in Directory_Entry_Type) return Permission_Set_Type; function Owner (Directory_Entry : in Directory_Entry_Type) return String; -- See Owner above. function Group (Directory_Entry : in Directory_Entry_Type) return String; -- See Group above. 
function Is_Block_Special_File (Directory_Entry : in Directory_Entry_Type) return Boolean; function Is_Character_Special_File (Directory_Entry : in Directory_Entry_Type) return Boolean; function Is_FIFO (Directory_Entry : in Directory_Entry_Type) return Boolean; function Is_Symbolic_Link (Directory_Entry : in Directory_Entry_Type) return Boolean; function Is_Socket (Directory_Entry : in Directory_Entry_Type) return Boolean; -- Additional implementation-defined subprograms allowed here. end Ada.Directories.Information; Furthermore, maybe the ARM should encourage implementors to add subprograms to this package (or other grandchilder of Ada) using the preferred interface, unless there is a good reason to avoid file descriptors. At first sight the existing Ada IO packages are not vulnerable to filesystem-based attacks. For example, in Ada.Text_IO the file descriptors returned by Open and Create are always used. However, it is necessary to add operations for converting a Directory_Entry_Type in the File_Type of the different existing IO packages and vice versa. Also, a operation for duplicating file descriptors would be very valuable (e.g. for making two different read operations over the same file without opening twice the file using the path). Of course, not every vulnerability described in the paper [1] can be simply avoided changing the interface. Few of them are only problematic in some applications, and therefore the programmer must use adequate techniques to defeat those possible attacks. But as stated above, some problems (not only security problems) are completely avoided using always file descriptors, and for that reason this should be the default interface. In conclusion, the addition of the package Ada.Directories to the Standard Libraries are very good news, being necessary for several applications. However, the current proposed interface is not safe in some aspects, being susceptible to a number of problems and security attacks. 
A change in the interface could avoid a high number of those problems, and would open the door for the implementation of some techniques needed in robust applications. References: [1] Ulrich Drepper, "Defensive Programming for Red Hat Enterprise Linux (and What To Do If Something Goes Wrong)", May 2005. http://people.redhat.com/drepper/defprogramming.pdf [2] IEEE, The Open Group, "Standard for Information Technology--Portable Operating System Interface (POSIX)--System Interfaces", IEEE Std 1003.1, 2004 Edition **************************************************************** From: Randy Brukardt Sent: Thursday, August 11, 2005 7:10 PM Certainly, Ada.Directories is subject to race conditions. You certainly don't need a symlink attack to cause one -- it can happen any time a file is deleted or modified. Users always need to use these sorts of facilities with caution: any assumptions based on the results of one operation can change before another one. That's why any I/O operations in Ada have to be protected by exception handlers -- it's impossible to eliminate exception causing conditions from a program. In any case, file descriptor operations only cover the tip of the iceberg. There are plenty of "security vulnerabilities" that you can program using them -- simply because a symlink attack breaks the connection between names and files. And there aren't many programs that can completely avoid the use of filenames -- virtually all programs need to have presisent data. Moreover, there are many ways to use Ada.Directories in ways that don't cause issues. For instance, you talk about a program that assumes that the current directory doesn't change. I'd have one simple name for such a program - "WRONG!". Indeed, some Ada 95 compilers make exactly this assumption, and it is a very dangerous one. In any case, Ada.Directories was designed only to manage names. That's because that is the only commonaility between commonly used operating systems. 
Windows, for example, does not have any concept similar to the Unix file descriptor in its API. That means that an implementation of your interface on Windows would have to use names internally. And that would give users a false sense of security -- instead of protecting their applications against the effects of bad data, they might decide they're safe based on these file descriptor interfaces. For POSIX-specific things like file descriptors, programmers should use the standard POSIX interfaces. Such a program won't run on Windows, but that's good, because it wouldn't be safe there anyway. But I think it would be better if programmers really programmed defensively -- which is to never assume *anything* about the file system, and always be prepared for corrupted/missing files. That requires good planning and error handling, which I realize a lot of programmers out there aren't willing to do. In any case, Directory_Entry_Type is supposed to be a snapshot of the state of a file at the point it is created. The basic idea comes from Windows, but it is possible to create something similar in Unix. Of course, a file descriptor could be included, but that was not really the intent (the file might be modified or deleted between the time that the DET was create and the time it was queries, causing another race condition). In any case, it is much too late in the process for a total redesign of the package. We'd need to determine implementability of all of these new routines. My guess is that most of them would have to be implemented by using names under the covers -- even the POSIX APIs you mentioned are quite new and aren't likely to be available on many Unix systems. And all of the new wording would have to be reviewed and corrected by the entire group of people. Unless we were willing to delay the entire standard another six months, the only practical option would be completely delete the offending parts of this interface (or the entire interface). 
But that would seem pretty silly compared to the facilities available in other languages. The only time in Ada that you can assume that the identity of a file does not change is the time that you have it open. As soon as you close it, you have to reverify it before using it again. And that's really true on any system, because you can't get file descriptors or anything else other than names to persist between program executions -- and there is little reason to close a file during the running of a single program. And even in that case, you still have to be prepared for corrupted files. So, to me, "defensive programming" means handling exceptions anytime you do I/O, and being prepared for I/O failures. There is never going to be a technique or interface that will eliminate the need to do that - because the entire purpose of file systems is to change - and that change doesn't even have to be malicious. **************************************************************** From: Bob Duff Sent: Friday, August 12, 2005 11:06 AM I agree with Santiago Urueña's points. In many cases, it is easier to write correct code using something like file descriptors rather than file names. On the other hand, Randy is probably correct when he says: > In any case, it is much too late in the process for a total redesign > of the package. **************************************************************** From: Florian Weimer Sent: Sunday, August 14, 2005 2:19 AM * Santiago Urueña: > procedure Set_Directory (Directory : in Directory_Entry_Type); > > These operations are needed for safely traversing through directories, and > can be easily implemented over the operating system (e.g. fchdir is > standard POSIX [2]). The claim that this can be easily implemented is dubious. In POSIX, the current directory is a per-process resource. For typical Ada implementations, this means that calling Set_Directory in any task will change the current directory for all tasks. 
In many cases (especially for code in libraries), the approach based on path names is therefore preferable, even though it suffers from external race conditions. **************************************************************** From: Bob Duff Sent: Sunday, August 14, 2005 9:44 AM I don't understand. Either way, the programmer needs to be aware of the fact that it's a per-process thing (e.g. by not using tasking, or by having just one task that does I/O, so it's the only one that cares what the current directory is). **************************************************************** From: Florian Weimer Sent: Tuesday, August 16, 2005 5:20 AM I'm concerned by packages which use Ada.Directories internally, but do not expose it at the interface (for example, a file-based queue). Such libraries cannot change the current working directory of the process without breaking other code. **************************************************************** From: Bob Duff Sent: Tuesday, August 16, 2005 4:20 PM OK, but that concern is the same, whether the interface to the change-dir operation takes a name or not. Are you saying that this functionality should not be provided at all? **************************************************************** From: Florian Weimer Sent: Wednesday, August 17, 2005 2:08 PM Oh, sorry for being unclear. My concern is not the change-dir operation per se. As far as I can see, practical portable use of directory handles on POSIX systems to prevent this kind of race condition *requires* calling fchdir, because there are no open/stat/chdir operations which take a directory descriptor argument *and* a name which is interpreted relative to that directory. You have to fchdir first, and then use the regular operation with a relative path. And this is a no-no in a multi-threaded program. I hope this explanation helps you to understand my concerns. 
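[Editor's note: the path-name approach that Florian prefers for library code can be sketched as follows. This is only an illustration under stated assumptions, not part of the proposal: the directory name "queue", the entry name "item-0001", and the helper Entry_Name are hypothetical. The point is that the library builds full names with Full_Name and Compose and never calls Set_Directory, so the process-wide current directory is left alone. The external race conditions discussed in this thread still apply, which is why the I/O is wrapped in an exception handler.]

```ada
with Ada.Directories;
with Ada.Text_IO;

procedure Queue_Path_Demo is
   use Ada.Directories;

   --  Hypothetical base directory for a file-based queue. It is captured
   --  once as a full name, so later changes to the current directory made
   --  by other code do not affect where the queue's files go.
   Queue_Dir : constant String := Full_Name ("queue");

   --  Build the name of an entry inside Queue_Dir without touching the
   --  process-wide current directory.
   function Entry_Name (Simple : String) return String is
   begin
      return Compose (Containing_Directory => Queue_Dir, Name => Simple);
   end Entry_Name;

   Before : constant String := Current_Directory;
   F      : Ada.Text_IO.File_Type;
begin
   --  Create_Path creates only the directories that do not yet exist,
   --  which avoids a separate check-then-create step.
   Create_Path (Queue_Dir);

   Ada.Text_IO.Create (F, Ada.Text_IO.Out_File, Entry_Name ("item-0001"));
   Ada.Text_IO.Put_Line (F, "payload");
   Ada.Text_IO.Close (F);

   --  Checked only when assertions are enabled: the current directory
   --  was never changed by this code.
   pragma Assert (Current_Directory = Before);

   Ada.Text_IO.Put_Line ("Wrote " & Entry_Name ("item-0001"));
exception
   --  The directory may be removed or replaced by another process at any
   --  time, so failures here are expected and must be handled.
   when Ada.Text_IO.Name_Error | Ada.Text_IO.Use_Error =>
      Ada.Text_IO.Put_Line ("Queue directory vanished or is not writable");
end Queue_Path_Demo;
```

[A library written this way can be called from any task without disturbing code elsewhere in the program that relies on relative names.]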
**************************************************************** From: Santiago Uruena Sent: Monday, August 15, 2005 12:08 PM Thank you for your kind attention. I agree that a new interface for the package Ada.Directories is not worth delaying the approval of the new version of the language, nor removing the package, although the current proposal could be changed to build even more robust programs. **************************************************************** From: Pascal Leroy Sent: Tuesday, August 16, 2005 5:10 AM > I agree with Santiago Urueña's points. In many cases, it is > easier to write correct code using something like file > descriptors rather than file names. I am not sure what is the justification for Bob's statement. It seems to me that there are situations where a filename-based API would be easier to program. At any rate, Santiago Urueña's concern was not code correctness in general: he was specifically discussing security issues. It is important to realize that the Ada predefined libraries were generally not designed with security in mind. I suppose that the reason is in part historical: in the 80s and even in the beginning of the 90s nasty security attacks were not of practical significance. It is also probably relevant that most Ada applications run in some kind of secure environment. Finally, for portability reasons the definition of the libraries can assume very little in terms of the primitives provided by the underlying OS. True, there are potential race conditions in Ada.Directories. But it's important to realize that there are potential race conditions in the existing I/O packages, too. For instance, there is nothing in the RM that would prevent Reset from being implemented as a close/open pair. In fact this kind of implementation is pretty much required for Stream_IO.Set_Mode on some OSes. So filesystem-based attacks are already possible in Ada 83/95, and programmers should be aware of this fact. 
> On the other hand, Randy is probably correct when he says: > > > In any case, it is much too late in the process for a total redesign > > of the package. A redesign is out of the question. The only option at this point would be to withdraw the package from the RM if people think that it's actively harmful. Of course, that would require a WG9 vote. **************************************************************** From: Robert C. Leif Sent: Tuesday, August 16, 2005 12:30 PM I totally agree with Randy that the new standard cannot be changed at this time. One possibility is to specify the name for a placeholder that can be subsequently implemented and include a general comment as to what would be implemented. Possibly, a child library would be appropriate. **************************************************************** From: Bob Duff Sent: Tuesday, August 16, 2005 4:43 PM Pascal wrote: > > I agree with Santiago Urueña's points. In many cases, it is > > easier to write correct code using something like file > > descriptors rather than file names. > > I am not sure what is the justification for Bob's statement. It seems to > me that there are situations where a filename-based API would be easier to > program. At any rate, Santiago Urueña's concern was not code correctness > in general: he was specifically discussing security issues. The kinds of bugs that Santiago Urueña (and the paper he referred to) talk about are very real, in my experience. I agree there are situations where a filename-based API makes sense; I think the suggestion was to *add* operations, but also keep the filename-based ones. I also agree that these kinds of bugs are primarily a security concern. A bug that happens by chance once in a zillion runs is probably more serious when a bad guy is *making* it happen. > It is important to realize that the Ada predefined libraries were > generally not designed with security in mind. 
I suppose that the reason > is in part historical: in the 80s and even in the beginning of the 90s > nasty security attacks were not of practical significance. It is also > probably relevant that most Ada applications run in some kind of secure > environment. Finally, for portability reasons the definition of the > libraries can assume very little in terms of the primitives provided by > the underlying OS. > > True, there are potential race conditions in Ada.Directories. But it's > important to realize that there are potential race conditions in the > existing I/O packages, too. For instance, there is nothing in the RM that > would prevent Reset from being implemented as a close/open pair. In fact > this kind of implementation is pretty much required for Stream_IO.Set_Mode > on some OSes. So filesystem-based attacks are already possible in Ada > 83/95, and programmers should be aware of this fact. I agree with the above. ... > A redesign is out of the question. The only option at this point would be > to withdraw the package from the RM if people think that it's actively > harmful. Of course, that would require a WG9 vote. My suggestion is to leave it as is for now, and perhaps fix it in the *next* revision of the language. And we don't need to talk about the technical details until then. ;-) In *my* school of language design, we do not eliminate useful operations because they *might* be misused. I would provide both kinds, and trust programmers to know which is appropriate. But as I said, I think it's too late to do anything about it to the Standard now. **************************************************************** From: Randy Brukardt Sent: Tuesday, August 16, 2005 5:39 PM > I also agree that these kinds of bugs are primarily a security concern. > A bug that happens by chance once in a zillion runs is probably more > serious when a bad guy is *making* it happen. 
I agree the problems are real, but I don't think using file descriptors adds any significant security. All it does is put a band-aid on one particular problem; the bad guys will certainly find other attacks and it is much better to program defensively against the entire spectrum of possible attacks (not that it is possible to eliminate all possible ways of attack). The original problem with symlinks was that they were being used to overwrite system files. That particular problem is easily prevented with three simple rules (none having anything to do with file descriptors): 1) Avoid unnecessary temporary files. Machines have lots of memory; the file system is never going to be very secure; so don't go there if you can avoid it. 2) If you're going to write an entire file, always Create it (in Ada terms); never Open it, even if you expect that it already exists. Since Create will delete any existing files, any symlinks will be discarded before the file is written. (Ada implementations have to ensure that they work this way, but that is out of the hands of the ordinary user.) 3) If you're going to write an existing file, always verify that the file is in fact the one you think it is before writing anything. (That means that every file needs some sort of distinctive signature, but that actually is a good thing, as it will help detect bugs as well as malicious attacks.) On top of that, you also have to be prepared to deal with corrupted files of all sorts, since it is pretty much impossible (on either Posix or Windows) to guarantee that some other process doesn't tamper with your files. On Windows, you have a bit of protection while the file is open, but that doesn't help much (if you can keep the file open, you're probably violating the above rule #1). It's probably impossible to completely prevent temporary file injection attacks, but they can be made harder by verification or (in critical cases) encryption. 
We're going to hear more about such attacks as the easier avenues are closed off. So I would be quite dubious about encoding the "rule of the week" (including mine, for that matter) into the design of the Ada libraries. They're going to be around for a long, long time, while next year's security advice is likely to be very different than this year's. It's quite possible that a whole different design would be better for next year's advice than would be needed this year. P.S. Secure I/O for Ada would be a sensible topic for an International Workshop Agreement (IWA), a relatively fast and informal way to create secondary standards. ****************************************************************
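[Editor's note: rules 2 and 3 from Randy's message, together with his advice to handle exceptions around all I/O, might look like the sketch below. It is a hedged illustration only: the file name "demo.dat", the signature line "DEMO-DATA-V1", and the subprogram names are hypothetical. Whether Create actually replaces a symlink rather than writing through it depends on the Ada implementation, as the message itself notes, and a window remains between the signature check and the following Open.]

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Defensive_IO_Demo is
   --  Hypothetical file name and signature line, for illustration only.
   Data_Name : constant String := "demo.dat";
   Signature : constant String := "DEMO-DATA-V1";

   --  Rule 2: when writing the whole file, Create it; never Open it,
   --  even if it is expected to exist already.
   procedure Write_Whole_File is
      F : File_Type;
   begin
      Create (F, Out_File, Data_Name);
      Put_Line (F, Signature);
      Put_Line (F, "record 1");
      Close (F);
   end Write_Whole_File;

   --  Rule 3: before writing to an existing file, verify that it starts
   --  with the expected signature. Any I/O failure counts as "no".
   function Has_Signature return Boolean is
      F  : File_Type;
      OK : Boolean;
   begin
      Open (F, In_File, Data_Name);
      OK := not End_Of_File (F) and then Get_Line (F) = Signature;
      Close (F);
      return OK;
   exception
      when Name_Error | Use_Error | End_Error | Data_Error =>
         if Is_Open (F) then
            Close (F);
         end if;
         return False;
   end Has_Signature;

   --  Append to the file only after the signature check succeeds. The
   --  file can still change between the check and the Open, so callers
   --  must still handle exceptions.
   procedure Append_Record (Text : String) is
      F : File_Type;
   begin
      if not Has_Signature then
         raise Data_Error with "unexpected contents in " & Data_Name;
      end if;
      Open (F, Append_File, Data_Name);
      Put_Line (F, Text);
      Close (F);
   end Append_Record;

begin
   Write_Whole_File;
   Append_Record ("record 2");
   Put_Line ("Updated " & Data_Name);
exception
   when Name_Error | Use_Error | Data_Error =>
      Put_Line ("I/O failed; " & Data_Name & " may be missing or corrupted");
end Defensive_IO_Demo;
```

[The signature check does not make the file tamper-proof; it only makes an unexpected or substituted file more likely to be detected before it is written to.]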