Version 1.1 of ai12s/ai12-0240-3.txt

Unformatted version of ai12s/ai12-0240-3.txt version 1.1
Other versions for file ai12s/ai12-0240-3.txt

!standard H.7(0)          18-10-18 AI12-0240-3/01
!standard H.8(0)
!class Amendment 18-10-18
!status work item 18-10-18
!status received 18-10-18
!priority Low
!difficulty Hard
!subject Access value ownership and parameter aliasing
!summary
** TBD.
!problem
Pointers (access types) are essential to many complex Ada data structures, but they also have significant downsides, and can create many kinds of safety and security problems. The question is can we create a subset of access-type functionality which supports the creation of interesting data structures without bringing all of the downsides. The notion of pointer "ownership" has emerged as one way to "tame" pointer problems, while preserving flexibility. The goal is to allow a pattern of use of pointers that avoids dangling references as well as storage leaks, by providing safe, immediate, automatic reclamation of storage rather than relying on unchecked deallocation, while also not having to fall back on the time and space vagaries of garbage collection. As a side benefit, we can also get safer use of pointers in the context of parallelism. This AI proposes the use of pointer ownership, as well as some additional rules preventing "aliasing" between parameters, and between pa meters and global variables, to provide safe, automatic, parallelism-friendly heap storage management while allowing the flexible construction of pointer-based data structures, such as trees, linked lists, hash tables, etc.
When attempting to prove properties about a program, particularly programs with multiple threads of control, the enemy is often the unknown "aliasing" of names introduced by access types and certain uses of (potentially) by-reference parameters. By unknown aliasing of names, we mean the situation where two distinct names might refer to the same object, without the compiler being aware of that. A rename introduces an alias, but not an "unknown alias," because the compiler is fully aware of such an alias, and hence is not something we are worrying about here. On the other hand, if a global variable is passed by reference as a parameter to a subprogram that also has visibility on the same global, the by-reference parameter and the global are now aliases within the subprogram, and the compiler generating code for the subprogram has no way of knowing that, hence they are "unknown" aliases. One approach is to always assume the worst, but that makes analysis much harder, and in so cases infeasible.
Access types also introduce unknown aliasing, and in most cases, an analysis tool will not be sure whether the aliases exist, and will again have to make worst-case assumptions, which again may make any interesting proof infeasible.
!proposal
This AI introduces an aspect called "Ownership" for access types and composite types which allows more precise description of heap usage. This AI is kept simple, with the expectation that a more complete proposal will be adopted in future versions of the standard.
Pointer ownership enables automatic storage reclamation since there is never more than one "owning" pointer to a given object, and the pointers are generally limited to pointing only at heap objects (never at a stack-allocated object or aliased component). This restriction matches that provided by "pool-specific" access types, so in this AI we limit the notion of pointer ownership for named access types to such types.
Access type Ownership:
If a pool-specific access-to-variable type has the value True for the aspect Ownership, then certain operations on the access type are restricted, and certain operations result in an automatic deallocation of the designated object. The Ownership aspect is inherited by derived types, and is non-overridable. On a root type, the Ownership aspect defaults to False. We also define the aspect on composite types and private types.
The basic rule is that at most one object of such an access-to-variable type with Ownership True may be used to update a designated object at any time. There might be multiple access-to-variable (or access-to-constant) objects that designate the same object (or some part of an object reachable from it) at any given time, but at most one of them provides read/write access to the designated object. Furthermore, read-only access is permitted to the designated object only when there is no access value that has read/write access. In other words, we enforce a "concurrent read, exclusive write" (CREW) discipline.
We support the notion of a general access type with aspect Unchecked_Reference True, which supports explicit conversion from an Ownership True access type, but is permitted to violate these rules, with the user taking responsibility that the use does not violate thread or memory safety.
We use the terms "borrower" and "observer" to describe access values that are not the primary owner of the designated object, but which have read/write access (borrower) or read-only access (observer) to the designated object. Within the scope of a borrower or an observer, the original access object is, respectively, "borrowed" or "observed." If neither borrowed or observed, the object is considered "unrestricted" and can be used to read/write the designated object.
Subtypes of an access type with Ownership True can have the aspect Borrower or Observer. When an object is of such a subtype, initializing it results in the object being borrowed or observed, rather than being moved.
To summarize the possible states of an access-to-variable object of a type with pointer ownership:
1. "unrestricted" -- may be dereferenced and used to update the designated
object.
2. "observed" -- value has been "observed," meaning that other
access-to-constant objects or IN formal parameters may exist that give read-only access to all or part of the designated object, or part of some object directly or indirectly owned by the designated object.
3. "borrowed" -- value has been "borrowed," meaning that the value has been
copied into a short-lived access-to-variable object, such as an IN parameter, possibly of an anonymous access type. "Borrowed" also covers the case where some part of the designated object is passed as a potentially by-ref [IN] OUT parameter. Finally, it also covers the case where a subcomponent that is itself an access-to-variable with ownership, is borrowed. The state reverts to unrestricted when the short-lived reference no longer exists.
An access-to-variable object of a named type that is unrestricted or observed is considered to be the "owner" of its designated object. There should be exactly one owner of every existing object created by an allocator for an access type with a True Ownership aspect. While observed, the access object is itself read-only. When an object is borrowed, ownership has been temporarily transferred to another object, and while in such a state the original access object is not directly usable, but it can be used to initialize an observer, in which case any borrowers are also implicitly observed at that point, restricting all of them to read-only access throughout the scope of the new observer.
Ownership is transitive, in that if a pointer owns a designated object, it indirectly owns every object owned by one of its access subcomponents.
To simplify the rules, we introduce the notion of managed objects, which come in two varieties:
1. Ownership Objects -- Objects of a type with Ownership, Borrower, or
Observer aspect True:
1. Owning access objects -- access-to-variable objects with Ownership
or Borrower aspect True.
2. Observing access objects -- access-to-constant objects with Ownership
aspect True, or access-to-variable objects with Observer aspect True.
3. Composite Ownership Objects
2. Other Managed Objects -- Any part of a non-Ownership object designated by an
owning (1.a above) or observing access object (1b. above)
We require pass-by-reference for all types with Ownership aspect True, unless they are visibly an access type. This is important to properly deal with the possibility of exceptions, because pass-by-copy for [in] out composite parameters could produce dangling references if a subprogram propagated an exception that was then caught. (Ownership aspect is not defined at all for scalar types, and we don't allow a private type with Ownership aspect True to be of a by-copy type.)
Note that the rules are actually expressed in terms of "names" rather than "objects" since objects don't really exist at compile-time. In the rules, we distinguish "static" names from "dynamic" names, where static names are those made of names denoting stand-alone objects, dereferences of owning/observing access objects, and selected components, while "dynamic" names can involve indexing or slicing.
!wording
Add the following sections to Annex H:
H.7 The Ownership Aspect
This section describes the Ownership aspect, which can be specified for an access type so that there can be either one writable access path to an object designated by an object of such an access type, or one or more read-only access paths via such access objects, but never both. Furthermore, objects designated by such access objects are automatically finalized and deallocated when the last access path is removed.
[rest TBD...]
!discussion
** TBD.
!example
** TBD.
!ASIS
[Not sure. It seems like some new capabilities might be needed, but I didn't check - Editor.]
!ACATS test
ACATS B- and C-Tests are needed to check that the new capabilities are supported.
!appendix

From: Tucker Taft
Sent: Thursday, October 18, 2018  11:59 AM

Here is a slimmed-down version of the Ownership AI, in an attempt to get 
something related to this into Ada 2020, so the Global annotation can be 
useful for containers and similar data structures.  Definitely less than even
half baked.  Main idea is to use the Ownership aspect, but also allow subtypes
with Borrower or Observer aspects (to eliminate need to use anonymous access 
types) that act as borrowers or observers when initialized, rather than owners.
Also proposed having general access types identified as Unchecked_Reference 
types, which can be produced by explicitly converting from an Ownership True
access value (which are always pool-specific), and which are not checked by 
any rules, so the user needs to ensure memory and thread safety on their own
if they use these (e.g. for a cursor).
[This is version /01 of this AI - Editor.]

Some of this was suggested by Randy's version 2, but tries to hew a bit closer
to the original version.

Wording has been elided, since it is not fully baked, and there might be no 
interest in this at all.

****************************************************************

From: Randy Brukardt
Sent: Thursday, October 18, 2018  1:58 PM

I'll post this now and try to find time before leaving to properly review it. 

****************************************************************

From: Tucker Taft
Sent: Thursday, October 18, 2018  2:49 PM

OK, thanks.  It is really more of a starting point for discussion, rather than
a thought-through AI.  I would suggest we start with your version of the AI if
we get around to 240 at all, and then use #1 and #3 as alternatives worth 
thinking about.

****************************************************************

Questions? Ask the ACAA Technical Agent