Version 1.1 of ai12s/ai12-0240-3.txt
!standard H.7(0) 18-10-18 AI12-0240-3/01
!standard H.8(0)
!class Amendment 18-10-18
!status work item 18-10-18
!status received 18-10-18
!priority Low
!difficulty Hard
!subject Access value ownership and parameter aliasing
!summary
** TBD.
!problem
Pointers (access types) are essential to many complex Ada data structures, but
they also have significant downsides, and can create many kinds of safety and
security problems. The question is can we create a subset of access-type
functionality which supports the creation of interesting data structures without
bringing all of the downsides. The notion of pointer "ownership" has emerged as
one way to "tame" pointer problems, while preserving flexibility. The goal is
to allow a pattern of use of pointers that avoids dangling references as well as
storage leaks, by providing safe, immediate, automatic reclamation of storage
rather than relying on unchecked deallocation, while also not having to fall
back on the time and space vagaries of garbage collection. As a side benefit,
we can also get safer use of pointers in the context of parallelism. This AI
proposes the use of pointer ownership, as well as some additional rules
preventing "aliasing" between parameters, and between pa meters and global
variables, to provide safe, automatic, parallelism-friendly heap storage
management while allowing the flexible construction of pointer-based data
structures, such as trees, linked lists, hash tables, etc.
When attempting to prove properties about a program, particularly programs with
multiple threads of control, the enemy is often the unknown "aliasing" of names
introduced by access types and certain uses of (potentially) by-reference
parameters. By unknown aliasing of names, we mean the situation where two
distinct names might refer to the same object, without the compiler being aware
of that. A rename introduces an alias, but not an "unknown alias," because the
compiler is fully aware of such an alias, and hence is not something we are
worrying about here. On the other hand, if a global variable is passed by
reference as a parameter to a subprogram that also has visibility on the same
global, the by-reference parameter and the global are now aliases within the
subprogram, and the compiler generating code for the subprogram has no way of
knowing that, hence they are "unknown" aliases. One approach is to always
assume the worst, but that makes analysis much harder, and in so cases
infeasible.
Access types also introduce unknown aliasing, and in most cases, an analysis
tool will not be sure whether the aliases exist, and will again have to make
worst-case assumptions, which again may make any interesting proof infeasible.
!proposal
This AI introduces an aspect called "Ownership" for access types and
composite types which allows more precise description of heap usage.
This AI is kept simple, with the expectation that a more complete proposal
will be adopted in future versions of the standard.
Pointer ownership enables automatic storage reclamation since there is never
more than one "owning" pointer to a given object, and the pointers are generally
limited to pointing only at heap objects (never at a stack-allocated object or
aliased component). This restriction matches that provided by "pool-specific"
access types, so in this AI we limit the notion of pointer ownership for named
access types to such types.
Access type Ownership:
If a pool-specific access-to-variable type has the value True for the
aspect Ownership, then certain operations on the access type are
restricted, and certain operations result in an automatic deallocation
of the designated object. The Ownership aspect is inherited by derived
types, and is non-overridable. On a root type, the Ownership aspect
defaults to False. We also define the aspect on composite types and
private types.
The basic rule is that at most one object of such an access-to-variable
type with Ownership True may be used to update a designated object at
any time. There might be multiple access-to-variable (or
access-to-constant) objects that designate the same object (or some part
of an object reachable from it) at any given time, but at most one of
them provides read/write access to the designated object. Furthermore,
read-only access is permitted to the designated object only when there
is no access value that has read/write access. In other words, we
enforce a "concurrent read, exclusive write" (CREW) discipline.
We support the notion of a general access type with aspect
Unchecked_Reference True, which supports explicit conversion from an
Ownership True access type, but is permitted to violate these rules,
with the user taking responsibility that the use does not violate thread
or memory safety.
We use the terms "borrower" and "observer" to describe access values
that are not the primary owner of the designated object, but which have
read/write access (borrower) or read-only access (observer) to the
designated object. Within the scope of a borrower or an observer, the
original access object is, respectively, "borrowed" or "observed." If
neither borrowed or observed, the object is considered "unrestricted" and
can be used to read/write the designated object.
Subtypes of an access type with Ownership True can have the aspect Borrower
or Observer. When an object is of such a subtype, initializing it
results in the object being borrowed or observed, rather than being
moved.
To summarize the possible states of an access-to-variable object of a type with
pointer ownership:
1. "unrestricted" -- may be dereferenced and used to update the designated
object.
2. "observed" -- value has been "observed," meaning that other
access-to-constant objects or IN formal parameters may exist that give
read-only access to all or part of the designated object, or part of some
object directly or indirectly owned by the designated object.
3. "borrowed" -- value has been "borrowed," meaning that the value has been
copied into a short-lived access-to-variable object, such as an IN parameter,
possibly of an anonymous access type. "Borrowed" also covers the case where
some part of the designated object is passed as a potentially by-ref [IN] OUT
parameter. Finally, it also covers the case where a subcomponent that is
itself an access-to-variable with ownership, is borrowed. The state reverts
to unrestricted when the short-lived reference no longer exists.
An access-to-variable object of a named type that is unrestricted or
observed is considered to be the "owner" of its designated object.
There should be exactly one owner of every existing object created by an
allocator for an access type with a True Ownership aspect. While
observed, the access object is itself read-only. When an object is
borrowed, ownership has been temporarily transferred to another object,
and while in such a state the original access object is not directly
usable, but it can be used to initialize an observer, in which case any
borrowers are also implicitly observed at that point, restricting all of
them to read-only access throughout the scope of the new observer.
Ownership is transitive, in that if a pointer owns a designated object, it
indirectly owns every object owned by one of its access subcomponents.
To simplify the rules, we introduce the notion of managed objects, which come in
two varieties:
1. Ownership Objects -- Objects of a type with Ownership, Borrower, or
Observer aspect True:
1. Owning access objects -- access-to-variable objects with Ownership
or Borrower aspect True.
2. Observing access objects -- access-to-constant objects with Ownership
aspect True, or access-to-variable objects with Observer aspect True.
3. Composite Ownership Objects
2. Other Managed Objects -- Any part of a non-Ownership object designated by an
owning (1.a above) or observing access object (1b. above)
We require pass-by-reference for all types with Ownership aspect True, unless
they are visibly an access type. This is important to properly deal with the
possibility of exceptions, because pass-by-copy for [in] out composite
parameters could produce dangling references if a subprogram propagated an
exception that was then caught. (Ownership aspect is not defined at all for
scalar types, and we don't allow a private type with Ownership aspect True to be
of a by-copy type.)
Note that the rules are actually expressed in terms of "names" rather than
"objects" since objects don't really exist at compile-time. In the rules, we
distinguish "static" names from "dynamic" names, where static names are those
made of names denoting stand-alone objects, dereferences of owning/observing
access objects, and selected components, while "dynamic" names can involve
indexing or slicing.
!wording
Add the following sections to Annex H:
H.7 The Ownership Aspect
This section describes the Ownership aspect, which can be specified for an
access type so that there can be either one writable access path to an object
designated by an object of such an access type, or one or more read-only access
paths via such access objects, but never both. Furthermore, objects designated
by such access objects are automatically finalized and deallocated when the last
access path is removed.
[rest TBD...]
!discussion
** TBD.
!example
** TBD.
!ASIS
[Not sure. It seems like some new capabilities might be needed,
but I didn't check - Editor.]
!ACATS test
ACATS B- and C-Tests are needed to check that the new capabilities are
supported.
!appendix
From: Tucker Taft
Sent: Thursday, October 18, 2018 11:59 AM
Here is a slimmed-down version of the Ownership AI, in an attempt to get
something related to this into Ada 2020, so the Global annotation can be
useful for containers and similar data structures. Definitely less than even
half baked. Main idea is to use the Ownership aspect, but also allow subtypes
with Borrower or Observer aspects (to eliminate need to use anonymous access
types) that act as borrowers or observers when initialized, rather than owners.
Also proposed having general access types identified as Unchecked_Reference
types, which can be produced by explicitly converting from an Ownership True
access value (which are always pool-specific), and which are not checked by
any rules, so the user needs to ensure memory and thread safety on their own
if they use these (e.g. for a cursor).
[This is version /01 of this AI - Editor.]
Some of this was suggested by Randy's version 2, but tries to hew a bit closer
to the original version.
Wording has been elided, since it is not fully baked, and there might be no
interest in this at all.
****************************************************************
From: Randy Brukardt
Sent: Thursday, October 18, 2018 1:58 PM
I'll post this now and try to find time before leaving to properly review it.
****************************************************************
From: Tucker Taft
Sent: Thursday, October 18, 2018 2:49 PM
OK, thanks. It is really more of a starting point for discussion, rather than
a thought-through AI. I would suggest we start with your version of the AI if
we get around to 240 at all, and then use #1 and #3 as alternatives worth
thinking about.
****************************************************************
Questions? Ask the ACAA Technical Agent