Version 1.10 of ai12s/ai12-0240-1.txt
!standard H.7(0) 18-01-24 AI12-0240-1/04
!standard H.8(0)
!class Amendment 17-10-09
!status work item 17-10-09
!status received 17-10-05
!priority Low
!difficulty Hard
!subject Access value ownership and parameter aliasing
!summary
** TBD.
!problem
Pointers (access types) are essential to many complex Ada data structures, but
they also have significant downsides, and can create many kinds of safety and
security problems. The question is can we create a subset of access-type
functionality which supports the creation of interesting data structures without
bringing all of the downsides. The notion of pointer "ownership" has emerged as
one way to "tame" pointer problems, while preserving flexibility. The goal is
to allow a pattern of use of pointers that avoids dangling references as well as
storage leaks, by providing safe, immediate, automatic reclamation of storage
rather than relying on unchecked deallocation, while also not having to fall
back on the time and space vagaries of garbage collection. As a side benefit,
we can also get safer use of pointers in the context of parallelism. This AI
proposes the use of pointer ownership, as well as some additional rules
preventing "aliasing" between parameters, and between pa meters and global
variables, to provide safe, automatic, parallelism-friendly heap storage
management while allowing the flexible construction of pointer-based data
structures, such as trees, linked lists, hash tables, etc.
When attempting to prove properties about a program, particularly programs with
multiple threads of control, the enemy is often the unknown "aliasing" of names
introduced by access types and certain uses of (potentially) by-reference
parameters. By unknown aliasing of names, we mean the situation where two
distinct names might refer to the same object, without the compiler being aware
of that. A rename introduces an alias, but not an "unknown alias," because the
compiler is fully aware of such an alias, and hence is not something we are
worrying about here. On the other hand, if a global variable is passed by
reference as a parameter to a subprogram that also has visibility on the same
global, the by-reference parameter and the global are now aliases within the
subprogram, and the compiler generating code for the subprogram has no way of
knowing that, hence they are "unknown" aliases. One approach is to always
assume the worst, but that makes analysis much harder, and in so cases
infeasible.
Access types also introduce unknown aliasing, and in most cases, an analysis
tool will not be sure whether the aliases exist, and will again have to make
worst-case assumptions, which again may make any interesting proof infeasible.
!proposal
This AI introduces a restriction against parameter aliasing
(No_Parameter_Aliasing), and an aspect called "Ownership" for access types and
composite types, which together provide safe, automatic storage management.
Furthermore this set of features can reduce potential aliasing to the point that
the only remaining potential aliases involve names denoting elements or slices
of the same array using dynamic indices. This sort of aliasing is felt to be
manageable by most analysis tools, as it is restricted to determining the
possibility of numeric equality between such indices. Run-time checks are
defined to deal with this sort of aliasing.
Pointer ownership enables automatic storage reclamation since there is never
more than one "owning" pointer to a given object, and the pointers are generally
limited to pointing only at heap objects (never at a stack-allocated object or
aliased component). This restriction matches that provided by "pool-specific"
access types, so in this AI we limit the notion of pointer ownership for named
access types to such types. We could incorporate a restricted set of "general"
access types as well, but that would require more rules, and in this version we
choose to go with the simpler, pool-specific-only, approach. We do permit use
of certain objects of an anonymous access type, but these have restricted use,
and in any case cannot be used with Unchecked_Deallocation.
Access type Ownership:
If a pool-specific access-to-variable type has the value True for the aspect
Ownership, then certain operations on the access type are restricted, and
certain operations result in an automatic deallocation of the designated object.
The Ownership aspect is inherited by derived types, and is non-overridable. On
a root type, the Ownership aspect defaults to False. We also define the aspect
on composite types and private types.
The basic rule is that at most one object of such an access-to-variable type
with Ownership True may be used to refer to a designated object at any time.
There might be multiple access-to-variable (or access-to-constant) objects that
designate the same object (or some part of it) at any given time, but at most
one of them may be dereferenced. All of the others are considered "invalid,"
because their value has been "borrowed" (see further below).
Conversely, there might be multiple "valid" access-to-constant objects that
designate a given object, so long as all access-to-variable objects designating
the object are at that point invalid (or "observed" -- see below). This is the
usual "single writer, multiple reader" rule for safe access to a shared
variable.
We say an access-to-variable object is "observed" when its value has been
"observed", and such an object can only be used for read-only access to the
designated object.
To summarize the possible states of an access-to-variable object of a type with
pointer ownership (changing "valid" to "unrestricted" below):
1. "unrestricted" -- may be dereferenced and used to update the designated
object.
2. "observed" -- value has been "observed," meaning that other
access-to-constant objects or IN formal parameters may exist that give
read-only access to all or part of the designated object, or part of some
object directly or indirectly owned by the designated object.
3. "borrowed" -- value has been "borrowed," meaning that the value has been
copied into a short-lived access-to-variable object, such as an IN parameter,
possibly of an anonymous access type. "Borrowed" also covers the case where
some part of the designated object is passed as a potentially by-ref [IN] OUT
parameter. Finally, it also covers the case where a subcomponent that is
itself an access-to-variable with ownership, is borrowed. The state reverts
to unrestricted when the short-lived reference no longer exists.
An access-to-variable object of a named type that is unrestricted or observed is
considered to be the "owner" of its designated object. There should be exactly
one owner of every existing object created by an allocator for an access type
with a True Ownership aspect. While observed, the access object is itself
read-only. When an object is borrowed, ownership has been temporarily
transferred to another object, and while in such a state the original access
object is not usable at all.
Ownership is transitive, in that if a pointer owns a designated object, it
indirectly owns every object owned by one of its access subcomponents.
To simplify the rules, we introduce the notion of managed objects, which come in
two varieties:
1. Ownership Objects -- Objects of a type with Ownership aspect True:
1. Owning access objects -- access-to-variable objects with Ownership aspect
True
2. Observing access objects -- access-to-constant objects with Ownership
aspect True
3. Composite Ownership Objects
2. Other Managed Objects -- Any part of a non-Ownership object designated by an
owning (1.a above) or observing access object (1b. above)
We require pass-by-reference for all types with Ownership aspect True, unless
they are visibly an access type. This is important to properly deal with the
possibility of exceptions, because pass-by-copy for [in] out composite
parameters could produce dangling references if a subprogram propagated an
exception that was then caught. (Ownership aspect is not defined at all for
scalar types, and we don't allow a private type with Ownership aspect True to be
of a by-copy type.)
Note that the rules are actually expressed in terms of "names" rather than
"objects" since objects don't really exist at compile-time. In the rules, we
distinguish "static" names from "dynamic" names, where static names are those
made of names denoting stand-alone objects, dereferences of owning/observing
access objects, and selected components, while "dynamic" names can involve
indexing or slicing.
!wording
Add the following sections to Annex H:
H.7 The Ownership Aspect
This section describes the Ownership aspect, which can be specified for an
access type so that there can be either one writable access path to an object
designated by an object of such an access type, or one or more read-only access
paths via such access objects, but never both. Furthermore, objects designated
by such access objects are automatically finalized and deallocated when the last
access path is removed.[a][b][c]
Static Semantics
The following aspect may be specified for a named access-to-object type, a
stand-alone object (including a generic formal in-mode object) of an anonymous
access-to-object type, or a composite type:
Ownership
The Ownership aspect is of type Boolean, and shall be specified by a static
expression. The Ownership aspect for a named general access type is
False[d][e], and only confirming specifications are permitted. The Ownership
aspect is nonoverridable for a pool-specific access type, and if not specified
for a root pool-specific access type, the value is False. [f][g] If not
specified for a stand-alone object (or generic formal in-mode object) of an
anonymous access-to-object type, the value is False if there is no
initialization (or default) expression, and otherwise it is the same as that of
the type of the initialization (or default) expression. For an untagged
composite type, the Ownership aspect is nonoverridable in derived types. For a
nonlimited controlled type, the Ownership aspect is False, and only confirming
specifications are permitted. For other composite types, the Ownership aspect
is True if there are any visible subcomponents for which the Ownership aspect is
True if the type is tagged and its parent (if any) has Ownership aspect True,
and in these cases only confirming specifications are permitted.
AARM Ramification: We do not require that all descendants of a tagged type
with Ownership aspect False also have Ownership False, but we disallow
(implicit or explicit) conversion between types with differing Ownership,
preventing inheritance of dispatching operations across such an Ownership
aspect change.
SPARK Note: We could probably allow inheriting dispatching operations in
SPARK, even if Ownership differs between a parent type and its descendant, so
long as Extensions_Visible is False. Not sure if we want to add such an
aspect to Ada.
AARM Reason: We do not allow a nonlimited type to be controlled and also have
Ownership True, as the Adjust procedure of the controlled type takes over the
semantics of assignment, and would defeat the move/borrow/observe semantics
defined below. See H.7.1 for a bounded error relating to the Adjust
procedure for a controlled types with any subcomponents with Ownership True.
The Ownership aspect of an anonymous access-to-object type is determined as
follows:
* If it is the type of a stand-alone object (including a generic formal in-mode
object), it is the same as that of the object;
* If it is the type of a formal parameter or result of a callable entity to
which the Restriction No_Parameter_Aliasing applies, it is True;
* Otherwise, it is False.
The Ownership aspect of a class-wide type is the same as that of the root type
of the class.
An access-to-variable type with Ownership aspect True is called an owning access
type. Similarly, an object of an owning access type is called an owning access
object. An access-to-constant type with Ownership aspect True is called an
observing access type. Similarly, an object of an observing access type is
called an observing access object. Finally, an object that is a part of an
object with Ownership aspect True, or a part of the dereference of an owning or
observing access object, is called a managed object.
[Redundant: Any composite type with Ownership aspect True is a by-reference type
(see 6.2).]
[Ed. note: This additional rule for 6.2, where by-reference types are defined,
is given below.]
AARM Reason: Forcing by-reference parameter passing for such composite types
simplifies the rules significantly, and avoids the possibility of dangling
references upon exception propagation. Later we disallow private types with
Ownership aspect True to have a full type that is a by-copy type, so this
effectively means that all non-access types with Ownership aspect True are
necessarily by-reference types.
AARM Ramification: An observing access object is never an exclusive owner of
its designated object; while it exists it may be dereferenced to produce a
constant view, which can be passed around, read in parallel, etc.
A name denoting an object can either be a static name or a dynamic name. The
following are static names:
* a name that statically denotes an object (see 4.9);
* a selected component with prefix being a static name;
* a dereference (implicit or explicit), with prefix being a static name denoting
an object of an access type with Ownership aspect True;
* a name that denotes an object renaming declaration, if the object_name
denoting the renamed entity is a static name.
Any other name that denotes an object, other than an aggregate or the result of
a function call (or part thereof), is a dynamic name. Names that denote an
aggregate or the result of a function call, or part thereof, have their own set
of rules.
AARM Reason: We considered specifying that indexed components having static
index expressions were also static names, but chose to leave them out, since
they can be aliased with names that have dynamic index expressions. So
currently a static name and a dynamic name are never "peer" aliases, and it
is trivial for the compiler to determine whether two static names denote the
same object. On the other hand, determining whether two dynamic names denote
the same or overlapping objects might require a run-time check on the values
of array indices, or on the values of pointers of a non owning/observing
type.
A static or dynamic name has a root object[h][i] defined as follows:
* if the name has a prefix that statically denotes an object, it is the object
statically denoted by that prefix;
* otherwise, the name has a prefix that denotes an object renaming declaration,
in which case the root object is the renamed object.
AARM Reason: A renaming can be an implicit borrowing if the renamed object is
part of a dereference of an owning access object, so we want to go "through"
the renaming, rather than re-dereference the owning access object (which has
been implicitly "borrowed" in any case).
Two names with the same root object statically overlap when one is a static name
and the other denotes the same object, or has a static prefix that denotes the
same object.
AARM Reason: We will need this term later, and this seems like the best place
to introduce it. Note that static names/prefixes never involve indexing or
slicing, so "overlap" is easy to define. They denote the same "whole"
object, or one denotes a component or an object "owned" by the other.
A static name that denotes a managed object can be in one of the following
ownership states:
* unrestricted -- the name may be used as defined elsewhere in this standard,
and if it denotes an owning access object, a dereference of the name provides
a variable view;
* observed -- the name provides a constant view, and if it denotes an owning
access object, it may be used as the prefix of a dereference, but the
dereference is similarly a constant view;
* borrowed -- the name is not usable when in this state; the name has
temporarily relinquished ownership to a borrower;
If a static name is in the observed or borrowed state, then any other static
name with the same root object that denotes the same object is in the same
ownership state.
AARM Reason: This is a desirable property. We don't want the exact form of a
static name to affect its state.
A dynamic name that denotes a managed object can also be in the observed or
borrowed state, but in addition it can be in a dynamic ownership state where
run-time checks may be necessary to determine its (run-time) ownership state. If
a name (static or dynamic) that denotes a managed object has a prefix that is in
the observed or borrowed state, then the name is similarly in the observed or
borrowed state (respectively). If a dynamic name that denotes a managed object
is not in the observed or borrowed state because of this rule, then it is in the
dynamic ownership state.
AARM Ramification: In this case we know that any static prefix must be in the
unrestricted state rather than observed or borrowed. Run-time checks may be
required when using dynamic names that have a dynamic ownership state.
Certain operations are considered to observe, borrow, or move the value of a
managed object, which might affect whether a name that denotes the object is
considered to be in the unrestricted, observed, borrowed, or dynamic ownership
state after the operation.
For names that denote managed objects:
* The default ownership state of such a name is determined as follows:
* If it denotes a constant[j][k][l][m][n], other than an in parameter of an
owning access type, its default state is observed;
AARM Reason: Constant-ness is transitive for components that are owning
access objects, so access-to-constant to the root means that nothing "below"
that pointer can be written.
* If it is a dynamic name that denotes a variable, its default ownership
state is dynamic;
* Otherwise, its default ownership state is unrestricted.
* The following operations observe[o] a name that denotes a managed object:
* An assignment operation where the target is an access-to-constant object of
an anonymous type, and the name denotes an owning access object as the
source of the assignment, if the assignment is part of the initialization
of a stand-alone object[p] or the parameter association for a formal
parameter or generic formal object of mode in;
AARM Reason: This supports traversing a data structure in read-only fashion,
while preserving the handle on the root of the structure.
* An assignment operation that is used to initialize a constant object
[q][r](including a generic formal object of mode in) of a named type with
Ownership aspect True but that is not an owning access type,[s] where the
name denotes the source of the assignment.
AARM Reason: We handle the case of assignment to named owning access types as
a special case below under borrowing. Managed objects that do not have
Ownership aspect True can be freely copied, since they necessarily have no
nested owning access objects.
* A call where the name denotes an actual parameter, and the formal parameter
is of mode in and composite or aliased;[t][u]
AARM Reason: Unless the parameter is an owning access object, we treat this
as observing. If it is an owning access object, and the formal is
access-to-variable, then it is considered borrowing (see below). If the
formal is neither composite nor aliased, the parameter is guaranteed to be
passed by copy and no long-lived "observer" of the managed object is being
created.
At the point where a name denoting a managed object is observed, the state of
the name becomes observed, and remains so until the end of the scope of the
observer. While a name that denotes a managed object is in the observed state
it provides a constant view.
AARM Reason: We want the managed object to still designate the same object(s)
until all the observers go away, so we disallow updating the object.
At the point where a static name that denotes a managed object is observed,
every static name that denotes the same managed object is observed, and every
name with that static name as a prefix, is similarly observed.
AARM Ramification: This applies recursively down the tree of managed objects,
meaning that observing a managed object effectively observes all of the
objects "owned" by that managed object. Dynamic objects rely on a different
rule that will check (at run-time if necessary) that there are no objects
that "own" this object that are in the borrowed state.
* The following operations borrow a name that denotes a managed object:
* An assignment operation that is used to initialize an owning access object,
where the (borrowed) name denotes the source of the assignment and the
target is a stand-alone variable of an anonymous access-to-variable type,
or is a constant [v][w][x][y][z][aa](including an in parameter or a generic
formal object) of a (named or anonymous) access-to-variable type;
AARM Reason: This supports traversing a data structure with the ability to
modify something, without losing a handle on the root of the structure.
Unlike subcomponents of a constant, IN parameters and stand-alone constants
of an owning access type still provide read/write access. This is to
accommodate existing practice in the use of constants of an access type to
still be used to update the designated object. Note that as soon as the
object becomes a component of a larger constant, we treat everything "owned"
by a composite constant as itself being constant. Note that we don't
consider this a complete transfer of ownership (i.e. a "move") because we
don't want the sole owner to be a constant, since we can't set it to null if
we move the value out of the object.
* A call (or instantiation) where the name denotes an actual parameter that
is a managed object other than an owning access object, and the formal
parameter is of mode out or in out (or the generic formal object is of mode
in out).
AARM Reason: We consider this to be "borrowing" of the actual parameter name
denoting the managed object, while the formal parameter name comes into use
as an unrestricted name for the object. We don't want the actual parameter
name used for any other purpose during the execution of the called subprogram
or generic instance, since the formal parameter name provides full read/write
access to the object.
* Object renaming [ab][ac]where the name is the object_name denoting the
renamed object, when the renamed object is not in the observed or borrowed
state.
AARM Reason: We are effectively borrowing the object, because we want to be
sure the designated object is not deallocated while the renaming exists. If
it has already been borrowed, such a rename would not be permitted. If it
has already been observed, then the renaming provides a constant view, but
that does not change the ownership state of the object -- it remains
"observed."
At the point where a name denoting a managed object is borrowed, the state of
the name becomes borrowed, and remains so until the end of the scope of the
borrower. While a name that denotes a managed object is in the borrowed state
it provides a view that allows neither reading nor updating.
AARM Reason: We want the name to still designate the same object(s) until the
borrower goes away, so we disallow updating via a borrowed name. We also
disallow moving, borrowing, observing, and dereference (see below in legality
rules). So effectively a name that is borrowed is completely "dead" until
the borrowing ends.
At the point where a static name that denotes a managed object is borrowed,
every static name that denotes the same managed object is borrowed, and every
name with that static name as a prefix, is similarly borrowed.
AARM Ramification: This applies recursively down the tree of managed
objects, meaning that borrowing a managed object effectively borrows all of
the objects "owned" by that managed object. Dynamic objects rely on a
different rule that will check (at run-time if necessary) that there are no
objects that "own" this object that are in the observed or borrowed state.
* The following operations are considered move operations:
* An assignment operation, where the target is a variable or return object
[ad][ae][af](see 6.5) of a named type[ag] with Ownership aspect True,
including an OUT or IN OUT formal parameter [ah]of an owning access type.
AARM Ramification: This includes both the copy in and the copy back of such a
formal parameter from its actual parameter, and an assignment to/from such a
formal inside the subprogram.
* An assignment operation where the target is part of an
aggregate[ai][aj][ak], having a named type with Ownership aspect True.
AARM Reason: These operations are considered "moves" because they result in a
transfer of ownership, and possibly a deallocation of the object designated
by the target beforehand, or a setting to null of the source afterward (see
dynamic semantics below). Note that this does not include by-reference
parameter passing, which is what is used for all composite/private types with
Ownership aspect True.
Legality Rules
If a type has a partial view, the Ownership aspect may be specified explicitly
only on the partial view, and if specified True, the full type shall not be an
elementary type nor an untagged private or derived type with Ownership aspect
False; furthermore, if the Ownership aspect for the full type would be True if
not explicitly specified, the Ownership aspect of the partial view shall be
True, either by inheritance from an ancestor type or by an explicit
specification.
AARM Reason: If the full type might contain an owning component, then it is
important that the partial view indicates that. On the other hand, if the
partial view has Ownership True, it is OK if the full view is tagged but
would not be considered to have Ownership aspect True merely due to its
components or parent. We disallow having a full type that is untagged
private or derived with Ownership aspect False, because such types might be
by-copy types, and we want private types with Ownership aspect True to always
be by-reference types, to avoid problems with exceptions propagation leaving
dangling references.
The source of an assignment operation to an object of an anonymous
access-to-object type with Ownership True (including a parameter) shall neither
be[al][am] an allocator nor a function call[an].
AARM Reason: This could lead to a storage leak, since we do not finalize and
reclaim the object designated by such an access object when its scope ends --
we assume there is still some other access object that designates it. An
allocator or function call must first be assigned to an (owning) access
object of a named type, and then it can be "walked" using an object of an
anonymous type. Note that we want any allocators to be of a named
pool-specific type so their storage is allocated from a pool that is being
used for "managed" objects, so you would end up having, in any case, to put
an explicit type qualification around the allocator to specify the named
pool-specific type. So requiring the target to be an access object of a
named (pool-specific) type doesn't seem like a big burden.
The prefix of a name shall not denote denote a function call, or a qualified
expression whose operand is a function call, an aggregate, or an allocator, if
the prefix is of a type with Ownership aspect True. [Redundant: A function
call, aggregate, or allocator of such a type can be assigned to an object of a
named type (including a parameter), which can then be used as a prefix for
selecting components or dereferencing.]
AARM Reason: Again, this is to avoid storage leaks, from constructs like
F(A).X, where it is not clear when we could deallocate the storage of the
object designated by F(A). We could relax this by requiring the compiler to
deallocate these objects at the "right" time, but that would require worrying
about observes and borrows, etc. Also, allowing names whose prefix is a
function call or qualified expression with allocator or aggregate as an
operand would complicate other rules, so we make them illegal.
An Access or Unchecked_Access attribute reference shall not be of a type with
the Ownership aspect True, unless the prefix denotes a managed object, and the
value is directly used to initialize a stand-alone object of an anonymous access
type, or an in parameter of an anonymous access type; this is considered
observing the prefix if the anonymous access type is access-to-constant, and
borrowing otherwise.
AARM Reason: We allow taking ‘Access of aliased "managed" objects to
initialize an object of an anonymous access type to support use of pointers
in programs where dynamic allocation is not permitted. The same rules for
borrowing and observing apply, but no "move" operations are permitted, nor
any deallocations.[ao][ap]
A declaration that observes or borrows a managed object shall not occur within
the private part or private descendant of a package, nor within a package body,
if the root of the name of the managed object denotes an object whose scope
includes the visible part of the package, unless the accessibility level of the
declaration is statically deeper than that of the package.[aq]
AARM Reason: We disallow borrowing in "private" an object that is visible,
since the compiler needs to "know" whether an object has been observed or
borrowed, everywhere within the scope of the object.
In a conversion [Redundant: (explicit or implicit)], the Ownership aspect of the
operand type and that of the target type shall be the same. [Redundant: Note
that anonymous access types are never convertible to (named) pool-specific
access types (see 4.6).]
AARM Ramification: A specific type cannot be converted to a class-wide type
if the Ownership aspect of the specific type differs from that of the root
type of the class. Similarly, a type cannot inherit a dispatching operation
from its parent type if the parent has a different value for its Ownership
aspect, since the implicit conversion associated with calling an inherited
subprogram would violate this rule.
SPARK Note: We could perhaps relax this in the context of the
Extensions_Visible attribute being True.
For an assignment statement where the target is a stand-alone object of an
anonymous access-to-object type with Ownership aspect True:
* If the type of the target is an anonymous access-to-variable type (an owning
access type), the source shall be an owning access object denoted by a name
that is not in the observed state[ar][as], and whose root object is the target
object itself;[at]
AARM Reason: At its declaration, such an owning access object can be
initialized from any owning access object in the unrestricted state. On
subsequent assignment statements, such an object can only be set to point to
somewhere in the "tree" headed by its prior value. Hence, you can use it to
walk exactly one tree -- you can't jump between different trees using the
same SAOAAT.
* If the type of the target is an anonymous access-to-constant type (an
observing access type), the source shall be an owning access object denoted by a
name that is in the observed state, and whose root object is also in the
observed state and not declared at a statically deeper accessibility level than
that of the target object.
AARM Reason: At its declaration, such an observing access object can be
initialized from any owning access object in the unrestricted or observed
state, but on subsequent assignment statements, it can only be assigned from
objects rooted at an observer that lives at least as long as this observer.
For managed objects:
* While the state of a name that denotes a managed object is observed, the name
shall not be moved nor borrowed and shall not be used as the target of an
assignment;
AARM Reason: Observed objects and everything "owned" by them are read only.
* While the state of a name that denotes a managed object is borrowed, the name
shall not be moved, borrowed, nor observed (directly or indirectly), and shall
not be used as a primary, as a prefix, as an actual parameter, nor as the
target of an assignment; [au][av]
AARM Reason: The object has been borrowed, and should not even be observed
other than via the borrower.
* If the source of a "move" is a name that denotes an object with Ownership
aspect True, other than a function call, the name shall be a variable
[Redundant: that is not in the observed or borrowed state]; furthermore, there
shall be no name that statically overlaps this name that is in the observed or
borrowed state;
AARM Reason: After a move, we set the source to be null if it is an object
other than an anonymous return object, aggregate, or allocator, and
effectively "destroy" everything bearing a name that overlaps the source.
Note that the only case where a (non-anonymous) owning access object can be a
constant and not be in the observed state, is when it is an in parameter. So
the first part really only disallows moving the value of an in parameter.
You can borrow the value of an in parameter, by passing it further along as
an in parameter, or assigning it to an object of an anonymous owning
access-to-variable type. The second part is more important -- you can't move
something if anything "beneath" it is in the borrowed or observed state.
* At the point of a call, any name that denotes a managed object that might be
written (other than via a formal parameter) as part of the invocation of the
target callable entity (as restricted by the value of its Global aspect and
the restrictions imposed by the No_Parameter_Aliasing restriction -- see
below) shall not be in the observed or borrowed state. Similarly, any name
that denotes a managed object that might be read (but not written) as part of
the call, shall not be in the borrowed state.
AARM Reason: We want to be able to assume that all relevant writable globals
are unrestricted when a subprogram starts executing, and all read-only
globals are not borrowed.
Dynamic Semantics
When the value of a variable of a named owning access type is moved to another
object, the value of the owning access variable is set to null. This applies to
each subcomponent [aw]that is of a named owning access type in a move of a
composite object. [Redundant: This includes the case of moving an actual
parameter to a formal parameter of a named owning access type that is of mode
out or in out, as well as the copy back.]
AARM Reason: If we didn't set it to null, an exception might allow the old
value of the variable to be used after handling the exception, which would be
bad if the designated object had been deallocated prior to raising the
exception.
When a new value is assigned to a variable of a named owning access type, if the
prior value is non-null and does not match the new value, the object designated
by the prior value is finalized, and its storage deallocated. If the value being
assigned matches the prior value of the target, there is no effect. This
applies to each subcomponent having a named owning access type, in an assignment
to a composite object.
AARM Reason: When an owning access object of a named type can legally be
assigned, that means there are no borrowers or observers in existence, so it
is safe to reclaim the storage. Objects of an anonymous type never truly
"own" an object, though they can be the unique borrower. Note that we don't
want to rely on the general finalization rule (see below) for handling the
left-hand side of an assignment, because we have the extra test that the new
and old values differ.
Similarly, if a variable of a named owning access type (including a component)
is non-null when it is finalized (other than when it is the target of an
assignment, which is covered by the above rule), the object designated by the
variable is finalized, and its storage is deallocated. This applies as well to
formal parameters of mode out and in out that are of a named owning access type,
when the subprogram does not complete normally. [Redundant: This rule applies
recursively to subcomponents of the designated object that are of an owning
access type, ensuring that all objects designated indirectly by an owning access
object are finalized and deallocated when the owning access object is
finalized.]
AARM Reason: We treat both the copy-in and the copy-back as a "move" for OUT
and IN OUT parameters of a named owning access type, so if a subprogram does
not complete normally, no copy-back occurs, and we must perform the
finalization action to avoid a storage leak.
AARM Ramification: The finalization that normally precedes an assignment,
according to 7.6, is handled differently, as described for the assignment
case (the finalization only happens if the value is changing). On the other
hand, the finalization that happens to the subcomponents of the "old"
designated object prior to it being reclaimed, is intended to be covered by
this rule, so we get reclamation of the entire tree of objects.
Finally, if a constant of a named type with Ownership aspect True is initialized
by a [ax]function call, aggregate, or allocator, then when it is finalized, if
the constant is not itself part of a return object or aggregate, each part of
the constant that is of a named owning access type is finalized as above.
[Redundant: This includes in parameters initialized by such an expression.]
AARM Reason: We are attempting to prevent storage leakage via constants that
effectively "own" an object created at the time of their initialization, if
initialized by a function call, aggregate, or allocator.
For managed objects denoted by a dynamic name, checks are used to ensure that no
other name that is a potential alias is in a conflicting state. Two names are
potential aliases when:
* both names statically denote the same entity;
* both names are selected components with the same selector and with prefixes
that are potential aliases;
* both names are indexed components, with prefixes that are potential aliases,
and corresponding indexing expressions that are both static, if any, have the
same value;
* both names are slices, with prefixes that are potential aliases, and the
corresponding discrete_ranges are overlapping in the case when both are static
ranges;
* one name is a slice whose prefix is a potential alias of the other name;
* one is a slice and the other is an indexed component, with prefixes that are
potential aliases, and for the case when both the discrete_range and the
indexing expression are static, the value of the indexing expression is within
the range;
* both names are dereferences, with prefixes that are potential aliases;
* both are aliased views having the same type, and at least one of them is a
dereference of a non-owning access type;
* one denotes an object renaming declaration, and the other denotes the same
renaming declaration, or is a potential alias with the object_name denoting
the renamed entity.
Two names N1 and N2 potentially overlap if some prefix of N1 is a potential
alias of N2, or some prefix of N2 is a potential alias of N1. The prefix and
the name that are potential aliases are called the potentially aliased parts of
the potentially overlapping names.
For a dynamic name D1 that is in a dynamic ownership state, if an operation
requires that[ay], were it static, it not be in an observed[az][ba] (or
borrowed) state[bb], then for every other dynamic name D2 that is in the
observed (or borrowed) state and potentially overlaps with D1, a check is made
that the potentially aliased parts of the names do not in fact denote
overlapping parts of the same object.[bc][bd] If this check fails,
Program_Error is raised.
AARM Ramification: This requires keeping track of what potentially aliased
(see above) dynamic names are observed (or borrowed), and doing checks that
there are none that overlap with this name or any part of the tree rooted at
this name. Note that this can be as simple as comparing the addresses of the
shared prefixes.
AARM Reason: We don't just talk about "overlap" of the denoted objects, but
instead talk about overlap between the potentially aliased parts of the two
names. That is necessary because we are interested in a more general notion
of "overlap" between names that treats designated objects as overlapping with
the object containing the pointer to them.
[Ed. question: Note that we are relying here (and later) on the meaning of
(physical) "overlap" that is already used in various parts of the standard (e.g.
A(3/4) and 13.3(73.6/3)). Should we actually define it "officially" somewhere
-- perhaps as part of defining attribute function Overlaps_Storage?]
H.7.1 Operations of Owning Access Types
The following attribute is defined for an object X of a named non-limited type
T[be][bf][bg][bh][bi]:
X'Copy[bj]
The evaluation of X'Copy when T is a type with Ownership False yields an
anonymous object initialized by assignment from X; when T is a composite type
with Ownership True, X'Copy yields an anonymous object initialized from X
(including adjustment of controlled parts) but with all subcomponents X.S that
are of a named owning access type (and are not subcomponents of a controlled
part) being replaced by X.S'Copy; when T is an owning access type, X'Copy raises
Program_Error if the designated type of T is limited, and otherwise it yields
null if X is null and if not null, the result of evaluating an allocator of type
T with the designated object initialized from X.all'Copy. For the purposes of
other language rules, X'Copy is equivalent to a call on a function that
"observes" X.
If the evaluation of X'Copy propagates an exception, any object that has been
newly allocated and adjusted is finalized, including any owning access
subcomponent that has been successfully replaced by its copy.
AARM Ramification: The intent is that no newly allocated storage is lost if
an exception is propagated by X'Copy, but that no part of X or X.all itself
is finalized or deallocated as a side-effect of such an exception.
The stream-oriented attributes (see 13.13.2) for every subtype S of a named
access-to-variable type T with designated type D and Ownership aspect True, have
the following semantics:
AARM Ramification: Essentially the access objects "disappear" in the
streaming representation, and all you have are the non-access type elements
of the "tree," plus some Boolean "existence" flags to indicate whether an
access object was null.
S'Output
If the designated type D does not have an available Output attribute,
Program_Error is raised. Otherwise, S'Output (Stream, Item), where Item is of
type T, invokes Boolean'Write (Stream, False) if Item is null, and otherwise
invokes Boolean'Write (Stream, True) followed by D'Output (Stream,
Item.all).[bk]
S'Write
S'Write (Stream, Item) is equivalent to S'Output (Stream, Item).
S'Input
If the designated type D does not have an available Input attribute,
Program_Error is raised. Otherwise, S'Input (Stream) invokes
Boolean'Input(Stream). If this returns False, S'Input returns null. If
Boolean'Input returns True, an allocator of the form:
new D'(D'Input (Stream))
is evaluated and returned by S'Input. If the evaluation of D'Input propagates
an exception, S'Input propagates the exception, and no storage is allocated.
S'Read
S'Read (Stream, Item), where Item is of type T, invokes S'Input (Stream) and
assigns (moves) the result to Item. If S'Input propagates an exception, S'Read
propagates the exception.
The owning access type T in the above is considered to support external
streaming (see 13.13.2), provided its designated type D supports external
streaming. For the purposes of this determination, a set of mutually recursive
owning access types do not by themselves prevent an enclosing type from
supporting external streaming.
Bounded Errors
If a nonlimited controlled type T has a subcomponent of an owning access type,
it is a bounded error if the Adjust procedure for the type does not replace all
such subcomponents with null or an access value designating a newly allocated
object. The possible consequences are that Program_Error is raised immediately
after invoking Adjust, or the guarantee of an exclusive single updater might be
violated, and a future dereference of the unreplaced access value could lead to
erroneous execution.
H.8 The No_Parameter_Aliasing Restriction
[TBD: Talk about aliasing between parallel tasklets, or have a separate
No_Data_Races restriction?]
This section describes the No_Parameter_Aliasing restriction, which disallows
passing a part of a variable to a subprogram or entry[bl][bm][bn], if it is
potentially referenced as a global by the callable entity, unless both access
paths are read only. Furthermore, it disallows operating on overlapping parts
of a single variable twice within the same expression or simple statement,
unless both operations are read only, or one operation is read only, and it
occurs strictly before the second operation where it might be updated.
AARM Reason: We could allow two updates, if one occurs strictly before the
second update, but for now we disallow two updates to the same object within
a single expression or simple statement, given the likelihood for error.
An operation that occurs within an expression or simple statement occurs
strictly before[bo] a second such operation only if:
* The first occurs within an expression, and the result of the expression is an
operand to the second;
* The first occurs within the left operand of a short-circuit control form, and
the second occurs within the right operand of the same short-circuit control
form;
* The first occurs within the condition or selecting_expression of a
conditional_expression, and the second occurs within a dependent_expression of
the same conditional_expression; or
* The first operation occurs strictly before some other operation, that in turn
occurs strictly before the second.
Static Semantics
The following restriction_identifier is language defined:
No_Parameter Aliasing
If the No_Parameter_Aliasing restriction applies to the declaration of a
callable entity, then the following rules apply on calls to such an entity;
these rules also apply if the restriction applies at the point of a call or
other operation:
* On any call to which the restriction applies, a part of a variable[bp] shall
not be passed as a parameter if a statically overlapping part (see H.7) of the
variable can be referenced as a global during the invocation of the called
entity (as limited by its (explicit or default) Global aspect -- see 6.1.2),
unless both access paths are read only;
* Within a single expression or simple statement, statically overlapping parts
of a single variable shall not be passed twice as operands to
operations[bq][br], unless both are read only operands, one is a read-only
scalar operand that is not an aliased parameter, or one is a read-only operand
to an operation that occurs strictly before the operation where it an
updatable operand;
AARM Reason: We allow non-aliased scalar operands because they are certain to
be passed by copy; we disallow access-type operands if not used strictly
before the second operation, because of the possible side-effect of nulling
out an access object as part of passing it as an in-out parameter.
Dynamic Semantics
In a context where the No_Parameter_Aliasing restriction disallows static
overlap between two names, and the names potentially overlap (see H.7), a check
is made, after evaluating the potentially conflicting operand(s) but before
invoking the operation(s), that there is in fact no overlap between the
potentially aliased parts of the two names. Program_Error is raised if this
check fails.
[end of H.8]
Add after 6.2(7/3):
* a composite type with Ownership aspect True (see H.7);
!discussion
The objective of this pair of features, the Ownership aspect and the
No_Parameter_Aliasing restriction, is to allow pointers to heap objects to be
used safely and efficiently without fear of dangling references or storage
leaks, and to reduce aliasing to the point where programs using pointers and
pass-by-reference can nevertheless be fully analyzed, in particular for
compile-time detection of data races.
The Ownership aspect is a relatively simple idea, where an object is owned by
exactly one pointer at any time. When an object is owned, it can be deallocated
when its owner is assigned to be null or point at a different object. We want
to allow read-only copies to be used to walk a data structure, so we allow
access-to-constant copies so long as they are of a type that goes away before
the original object is finalized. We also allow access-to-variable copies which
are considered "borrowers." Most commonly these will be stand-alone objects of
an anonymous access-to-object type, though an IN parameter of a named
access-to-variable type also acts as a borrower.
Similarly, a long-lived dereference of the root of a data structure (e.g. as
part of parameter passing) is considered borrowing and may be used to walk a
data structure destructively, while restoring the ownership state of the access
object designating the root after the dereference goes away. To change what is
the root object itself requires a direct assignment to the owner of the root,
which cannot occur while the "borrower" exists.
As noted we have four states for an access object: unrestricted, observed,
borrowed, and dynamic. An unrestricted access object is the owner, and provides
read-write access to the designated object. Assigning a new value to an
unrestricted access object causes the prior designated object to be deallocated.
An observed access object is a shared owner, and provides read-only access to
the designated object. An observed access object cannot itself be set to point
elsewhere. A borrowed access object is not useful until the borrowers go away.
Tracking the state of stand-alone access objects is pretty straightforward.
Tracking the state of components is more challenging. We have given the rules
in terms of static and dynamic names, and their prefixes. We believe these
rules are now sound and sufficiently flexible to permit "normal" kinds of data
structure manipulations.
We have provided "deep" copying and streaming attributes, so that arbitrarily
complex data structures built using owning access types can be easily copied or
streamed, without additional work by the programer.
For the No_Parameter_Aliasing rules, we fall back on run-time checks to prevent
aliasing when necessary for "dynamic" names. Hopefully such checks can be
eliminated at compile-time in almost all interesting cases. SPARK would
complain if such a check could not be proved at analysis time.
We presume that the Adjust procedure of nonlimited controlled types turn every
assignment into a "deep" copy, potentially making use of the ‘Copy attribute on
any subcomponents that have Ownership True. Therefore, we do not allow a
controlled type to itself have Ownership True, even if it has subcomponents with
Ownership True, so there is no notion of move/observe/borrow with respect to
controlled types per-se, though objects designating them could still have
Ownership True. Effectively they are treated like "elementary" types from the
point of view of most of these rules. However, they still have to abide by the
No_Parameter_Aliasing rules, which are largely independent of Ownership, except
in terms of the definition of "overlap."
So in some sense, these Owning access types provide a very low overhead
"controlled" type. If you still want to write your own Adjust and Finalize
procedures, then you can explicitly declare a controlled type. If all you want
is automatic storage management for levels of indirection, then just use an
access type with Ownership True and the ‘Copy attribute where you want it, and
let the compiler take care of all of the necessary grunt work of preventing
dangling references and storage leaks (in a provably safe manner).
The work on this topic was started in earnest by an intern at AdaCore,
Georges-Axel Jaloyan, mentored by Yannick Moy. Many of these concepts are
inspired by the "borrow checker" of the Rust language, and/or the anti-aliasing
rules of the ParaSail language and the SPARK language.
Eventually we will extend this to cover checking for data races between
potentially concurrent actions.
Safety Properties
The rules that make up the above proposal are intended to ensure the following
desired safety properties:
1. For any heap object that was created using an allocator of an owning access
type, and not yet deallocated, one of the following is true:
* There is exactly one name in the unrestricted state denoting an owning
access object that designates the heap object; or
* There are one or more names in the observed state denoting access objects
that designate the heap object.
* When an operation observes or borrows using a dynamic name, that name
enters the observed or borrowed state throughout the scope of the observer
or borrower; during that period run-time checks are used to check any other
dynamic name in the dynamic ownership state that is potentially overlapping
with the first dynamic name, to see if it does in fact overlap, and if so,
to preclude it from all operations, if it overlaps a borrowed name, or from
updating operations, if it overlaps an observed name.
There may be any number of names in the borrowed state denoting access objects
designating the heap object, but these give no access while in this state.
1. When such a heap object is deallocated, there are no access objects
designating it.
2. Names in the borrowed state provide no access to the designated object.
3. Names in the observed state provide read-only access to the designated
object, and any name that denotes a subcomponent of the designated object
that is of an owning access type, is also in the observed state.
4. Only static names in the unrestricted state provide read-write access to the
designated object, or dynamic names in the dynamic ownership state where they
have been checked to ensure they do not overlap with a dynamic name in the
observed or borrowed state.
!examples
Here is an example of swapping two elements of a singly linked list.
type List;
type List_Ptr is access List with Ownership;
type List is record
Next : List_Ptr;
Data : Data_Type;
end record;
procedure Swap_Last_Two (X : in out List_Ptr) is
--
--
begin
if X = null or else X.Next = null then
--
return;
else if X.Next.Next = null then
--
declare
Second : List_Ptr := X.Next;
--
--
begin
--
X.Next := null; --
Second.Next := X;
X := Second;
end;
else
--
declare
Walker : access List := X[bs];
begin
while Walker /= null loop
declare
Next_Ptr : List_Ptr renames Walker.Next;
begin
if Next_Ptr /= null
and then Next_Ptr.Next /= null
and then Next_Ptr.Next.Next = null
then
--
declare
Last : List_Ptr := Next_Ptr.Next
--
--
begin
--
Next_Ptr.Next := null; --
Last.Next := Next_Ptr;
Next_Ptr := Last;
return; --
end;
end if;
end;
--
--
--
Walker := Walker.Next;
end loop;
end;
end if;
end Swap_Last_Two;
!ASIS
[Not sure. It might be necessary to have a query for the new aspects. - Editor.]
!ACATS test
ACATS B-Tests and C-Tests are needed to check that the new capabilities are
supported.
!appendix
From: Tucker Taft
Sent: Thursday, October 5, 2017 12:05 AM
Here is a new AI [Version /01 of the AI - Editor] based on generalizing to
Ada some work we have been doing on adding "safe" pointers to SPARK. I have
coupled this with a no-parameter-aliasing restriction because you really need
both to get much benefit from either. We have several customers, both SPARK
users and "regular" Ada users, who have quite an interest in this topic as
well. We will be prototyping this in any case, but in my view (and others)
it could be a really great feature for Ada 202X more generally.
****************************************************************
From: Randy Brukardt
Sent: Thursday, October 5, 2017 8:08 PM
This seems insufficiently baked for the ARG to work on it. (If we had
unlimited time, it would be different, but this looks like something that
could take all of the time available.)
In particular, the comment in the !discussion:
Tracking validity of components is more challenging, and perhaps additional
rules over and above those given above will be needed. In particular, we
have proposed rules for dynamically-indexed array components. They will
need to be studied and used in examples to know whether they are adequate
and usable.
I don't think the ARG is the place to develop these sorts of rules. In addition
to the proposed rules, there need to be rules for dealing with this in
generics. (The word "generic" does not appear in this proposed AI, so it's
pretty clear that there is an enormous amount missing).
For instance, how do the parameter aliasing restrictions get enforced on
generic formal types? Assume-the-worst in generic bodies would be wildly
incompatible, and not enforcing them would seem to destroy the entire idea.
Not allowing generics at all doesn't seem viable, either, nor is abandoning
the contract model (I suspect SPARK does the last, but that won't fly in Ada).
Similarly, how are the restrictions enforced when a type with a component of a
tracked access type is passed as a generic actual type? Assume-the-worse would
ban most existing assignments -- no way.
Finally, I'm terrified at the notion of automatic deallocation of objects.
In the case of a general access type (or generic formal type), the compiler
cannot know at compile-time the proper storage pool to use to deallocate the
original object. (The object may not have been allocated at all [it could have
been an aliased global object], or could have been implicitly converted from
some other type.) Besides, most anonymous access types have no storage pool
associated, so how could it be deallocated automatically? I think automatic
deallocation has to be limited to pool-specific types (where the correct pool
is well-defined); other places where it would be needed ought to be illegal.
[Yes, any use of Unchecked_Deallocation on a general access type is dangerous,
but such uses are directly visible in the code, involve the name "Unchecked",
and as such should be carefully checked in code reviews. Stuff that happens
automatically is much, much harder to review.]
These sorts of (big) issues tell me that this idea needs more work before the
ARG starts on it. There clearly need to be more examples created, and possibly
a prototype implementation. The basic idea seems sound, but it is hard to say
if it is usable in Ada at this point, and it seems to be the sort of
experimental idea that isn't ready for Standardization.
****************************************************************
From: Jeff Cousins
Sent: Friday, October 6, 2017 5:06 AM
Interesting, and clearly some work has gone into this. But...
It says "The Ownership aspect is a relatively simple idea" but it looks quite
complicated to me, most of it is more like what I would expect a static
analysis tool to do than a compiler. E.g. it has a section on control flow -
if we were to require flow analysis of a compiler, then there are more useful
things that it could look for, such as checking that all paths through a
subprogram assign values to all the out parameters, and that all paths through
a function return a value.
It seems to have an Ada 83 view of access types, i.e. that they only point to
objects allocated on the heap. Randy's reply elaborates on this.
Also, access types are currently classified as being elementary types, I think
that would have to be revised if they have to carry around hidden state
information with them.
****************************************************************
From: Tucker Taft
Sent: Friday, October 6, 2017 10:31 AM
> Interesting, and clearly some work has gone into this. But...
>
> It says "The Ownership aspect is a relatively simple idea" but it looks
> quite complicated to me, most of it is more like what I would expect a
> static analysis tool to do than a compiler.
What I meant was that the *idea* is simple, but the compiler support is indeed
not simple.
> E.g. it has a section on control flow - if we were to require flow analysis
> of a compiler, then there are more useful things that it could look for,
> such as checking that all paths through a subprogram assign values to all
> the out parameters, and that all paths through a function return a value.
You are correct that this would introduce some control-flow based checks, and
there are many checks that could take advantage of that. In fact, almost
every Ada compiler already does some amount of control flow, merely to provide
helpful warnings. This would be the first time we would make legality depend
on that, which is worth considering!
> It seems to have an Ada 83 view of access types, i.e. that they only point
> to objects allocated on the heap. Randy's reply elaborates on this.
This is a good point. The TBD "conversion" rules were meant to cover that, in
particular a value of such an "owning" access type would only be allowed to
point to an actual heap object, of the correct storage pool. 'Access or
'Unchecked_Access would be illegal for such types, as would be conversions
from types that are not "owning" access types or that use a different
Storage_Pool.
> Also, access types are currently classified as being elementary types, I
> think that would have to be revised if they have to carry around hidden
> state information with them.
The rules are intended to mean that the hidden state is only needed at compile
time. But even now, there are access types that are multiple words, e.g.
access-to-protected-subprogram, and in some implementations,
access-to-unconstrained-array. So being multiple words doesn't necessarily
imply being "composite" from a language point of view.
****************************************************************
From: Randy Brukardt
Sent: Monday, October 9, 2017 8:21 PM
> Here is a new AI ...
I noted that the !problem seems to concentrate totally on the problems of
static analysis tools. It starts:
"When attempting to prove properties about a program, ..."
I'd argue that a large proportion of Ada users aren't planning to do that
anytime soon. And arguably, static analysis tools can solve their own problems
(as you say is being done with SPARK). So what's the value of this proposal
to the average Ada user?
Static analysis features like Preconditions and Predicates are evaluated at
runtime in Ada, so they are valuable to Ada programmers even if they never
even have imagined a static analysis tool. I think that is an important part
of any major feature added to Ada.
---
> You are correct that this would introduce some control-flow based
> checks, and there are many checks that could take advantage of that.
> In fact, almost every Ada compiler already does some amount of control
> flow, merely to provide helpful warnings. This would be the first
> time we would make legality depend on that, which is worth considering!
So far as I know, only GNAT uses front-end flow analysis for "helpful
warnings". I've never seen such warnings in other Ada compilers (most
compiler warnings aren't very helpful!).
Janus/Ada is probably extreme in this regard, but it could not enforce any
rules on anything larger than an expression. Flow analysis is done only in the
optimizer, only if the optimizer is on, and can only produce warnings - it has
to produce an executable program by that point; the data structures needed to
produce detailed errors are long gone by that point. I would have to discard
40% of the compiler and rewrite it to have any way to do any serious flow
analysis in the front-end. (Hint: I'm not going to do that.)
Additionally, if we require flow analysis for Legality Rules, we would have to
define precisely what flow analysis is required and not required for that
purpose. Besides being a lot of work, it certainly would be different than what
any actual compiler did for such things, meaning that every vendor would have
to bifurcate their flow work (much as every vendor has to split compile-time
expression evaluation into static and non-static parts). That's going to be a
lot of work even for compilers that already do the needed analysis.
It also seems that this sort of requirement would also require abandoning all
pretense of a contract model for generics. (You all know where I stand on
that.) It wouldn't make much sense to require assume-the-worst in a generic
body, as that would essentially make all code illegal; nor does it make much
sense to use a bunch of runtime checks for this (or any such model).
Finally, if we were to start requiring flow analysis, there are lots of things
that would be better uses for it than this proposal: uninitialized stand-alone
object prevention seems to lead the list (as Jeff noted).
It might be that this idea is best left as a SPARK feature, since it doesn't
seem to fit into the existing model of what an Ada compiler is required to do.
****************************************************************
From: Tucker Taft
Sent: Monday, October 9, 2017 8:36 PM
You may not buy the rationale, but in my view, safe storage management is a
huge issue, and many languages have adopted full garbage collection, not
because it is a "good thing," but merely because having to deal with storage
leaks and dangling references is so painful.
I think if Ada had a "mode" in which all storage management was safe and
automatic, and there were no dangling references or storage leaks, and yet
also no asynchronous garbage collector meaning that real-time guarantees are
straightforward, all in the presence of multitasking and/or parallelism, this
would be truly extraordinary. In my experience using a language that has
these features (ParaSail) improves productivity by another order of magnitude
over current Ada.
I realize controlled types were supposed to solve all of these problems, but
they are painful to get right, can make debugging significantly harder, can
add measurable overhead, and don't really address issues of multi-tasking.
I agree this AI is a lot of work, but I think the payoff could be enormous.
****************************************************************
From: Randy Brukardt
Sent: Monday, October 9, 2017 9:07 PM
> You may not buy the rationale, but in my view, safe storage management
> is a huge issue, and many languages have adopted full garbage
> collection, not because it is a "good thing,"
> but merely because having to deal with storage leaks and dangling
> references is so painful.
You didn't give that as a rationale! The only thing the AI talks about is
proving programs and eliminating aliasing.
I'm surely not against "safe storage management" (that would be like being
against apple pie!), but I don't see how the features in the AI provide
anything safe. (Quite the reverse in fact, unless one assumes things about the
TBDs known only to the author.)
> I think if Ada had a "mode" in which all storage management was safe
> and automatic, and there were no dangling references or storage leaks,
> and yet also no asynchronous garbage collector meaning that real-time
> guarantees are straightforward, all in the presence of multitasking
> and/or parallelism, this would be truly extraordinary.
Yes, it would be truly extraordinary. I'd actually say impossible. :-)
If you can do that, without (a) wrecking the model of Ada (i.e. generic
contract model), (b) wrecking Ada implementations (i.e. requiring flow
analysis were none is previously required), and (c) wrecking portability
(i.e. requiring checks without sufficient definition), please do so. But the
proposal you sent does none of that, and doesn't even give a rationale for
anything that would help a non-SPARK Ada user. As I initially said,
insufficiently baked.
To get this proposal into anything resembling a usable form by mid-2019 would
require putting virtually all of our effort into that, and forgetting about
parallel operations, Global, and most of the other open proposals.
(That is, the stuff we've already promised.) A proposal would have to be
absolutely stupendous for that, and you'd have to make a much better case (in
particular, that you could really deliver the benefits that you claim in this
e-mail *and* keep usability).
P.S. I'd much rather you did your homework rather than coming up with new
pie-in-the-sky ideas. I realize the latter is more fun, but fun doesn't get
anything finished, and we're entering the home stretch of this Standard.
****************************************************************
From: Tucker Taft
Sent: Monday, October 9, 2017 10:18 PM
I realize this isn't part of my homework, but it seems important to at least
add the conversion rules into this, so it can be evaluated for safety.
Without the conversion rules, safety is clearly hopeless to evaluate.
So here is an update which adds conversion rules (and disallows 'Access and
'Unchecked_Access). [This is version /02 of the AI - Editor.]
I promise to do no more work on this until all my other homework is done... ;-)
****************************************************************
From: Randy Brukardt
Sent: Wednesday, October 11, 2017 10:42 PM
(A) This has been assigned AI12-0240-1.
(B) The AI had horrible formatting, which I fixed. But that prevents comparing
for changes, so I just copied the "Converting values" part and left the
rest alone. Since you left the TBD about conversions later in the
description, I'm pretty sure you didn't do a careful update. Hope I
didn't miss any other change.
(C) Ada has a name for an access type doesn't allow 'Access and
'Unchecked_Access -- a pool-specific type. An anonymous type can't specify
an aspect anyway, so those can't be involved. So why not just say that the
types in question have to be pool-specific? It seems simpler and fewer
rules would be needed (the needed conversion rules already exist for those
types).
****************************************************************
From: Raphael Amiard
Sent: Thursday, October 12, 2017 9:11 AM
> I noted that the !problem seems to concentrate totally on the problems
> of static analysis tools. It starts:
>
> "When attempting to prove properties about a program, ..."
>
> I'd argue that a large proportion of Ada users aren't planning to do
> that anytime soon. And arguably, static analysis tools can solve their
> own problems (as you say is being done with SPARK). So what's the
> value of this proposal to the average Ada user?
In my opinion you're drawing a line in the sand that is not so clearly
delimited. A static type system like the one Ada has exists to prove
properties about a program, and it is part of the compiler, not of an external
static analysis tool.
In this instance, the value of this proposal is to solve a class of memory
problems in the language directly, instead of relying on static analysis
tools to do it, IMHO.
****************************************************************
From: Randy Brukardt
Sent: Thursday, October 12, 2017 10:35 PM
My point, which I may not have made so well, is that the Problem statement/
Proposal explanation ought to spend at least some time on the benefits to
people who aren't using formal static analysis tools. To only talk about proof
issues leaves a lot of us out.
My understanding was that such techniques could be used to make programs that
don't use any tools safe (which Tucker also said in his reply to me). The AI
needs to emphasize that to get more broad-based support.
****************************************************************
From: Tucker Taft
Sent: Monday, January 15, 2018 10:37 AM
Here is an update to (well, really a complete re-write of) the AI on safe
pointers with automatic storage management. It is available in various formats.
We have been working on it in Google Docs, which allows commenting, suggestions,
etc. So here is the link to that:
https://docs.google.com/document/d/1pCIzY7ch1Zwa56MUNglzH4M_Ci3L90FvtZh9TBP5okg/edit?usp=sharing
But I realize google-anything can be of concern to some folks. So I have
attached a "plain text" version, and will send a "rich text" version separately
(my first attempt to send a "rich text" version doesn't seem to have gotten past
the various spam filters). The plain text retains some indications of italic
and bold fonts, I believe, which isn't particularly helpful, and has a list of
the various comments at the end of the file, which is even less useful. At some
point I will clean up the plain text version so it conforms with our normal
lexical conventions, but in the mean time, hopefully this is adequate.
[This is version /02 of the AI.]
****************************************************************
From: Randy Brukardt
Sent: Monday, January 15, 2018 10:02 PM
The .RTF version was ginormous, so the list wouldn't send it. I've put a version
into the Grab Bag, but of course Tucker updated the AI since his previous
sending, so most likely you want to use one of these versions:
Text version:
http://www.ada-auth.org/ai-files/grab_bag/AI12-0240-1v7-stt.TXT
Rich Text version:
http://www.ada-auth.org/ai-files/grab_bag/AI12-0240-1v7-stt.rtf
Or you can use the original Google Docs version (although that can change, so
it's not useful for a permanent reference):
https://docs.google.com/document/d/1pCIzY7ch1Zwa56MUNglzH4M_Ci3L90FvtZh9TBP5
okg/edit?usp=sharing
P.S. I didn't really read it, but I did verify that the .rtf could be loaded
successfully.
****************************************************************
From: Tucker Taft
Sent: Wednesday, January 24, 2018 10:37 AM
I have new versions of this. I was planning to provide a "pdf" instead of an
"rtf" plus a ".txt" file.
I have attached a new ".txt" version, and I'll send the PDF directly to Randy.
[This is version /04 of the AI - ED.]
The major change was a more generalized 'Copy attribute, so it could be used
more widely to prevent limitations that, for example, come with by-reference
parameter passing. In addition, I tried to incorporate issues relating to the
use of controlled types with subcomponents that are owning access objects.
****************************************************************
From: Randy Brukardt
Sent: Wednesday, January 24, 2018 4:58 PM
> I have attached a new ".txt" version, and I'll send the PDF directly
> to Randy.
You can find both of these in the Grab-Bag, see
http://www.ada-auth.org/ai-files/grab_bag/AI12-0240-1v8-stt.TXT for the text
version and http://www.ada-auth.org/ai-files/grab_bag/AI12-0240-1v8-stt.pdf
for the PDF version.
****************************************************************
From: Randy Brukardt
Sent: Thursday, January 25, 2018 12:38 AM
I've read little of this AI, but I note that you made the same mistake here
that you did in container aggregates:
If a type has a partial view, the Ownership aspect may be specified explicitly
only on the partial view, and if specified True, the full type shall not be an
elementary type nor an untagged private or derived type with Ownership aspect
False; furthermore, if the Ownership aspect for the full type would be True if
not explicitly specified, the Ownership aspect of the partial view shall be
True, either by inheritance from an ancestor type or by an explicit
specification.
Since an aspect has the same value for all views, the full type will always
have the same value. So rules like: "derived type with Ownership aspect False"
can't ever happen and thus are vacuous.
Note that this is the same reason that Preelaborable_Initialization (as
currently defined) cannot be an aspect: it has different values for the
partial and full views.
****************************************************************
From: Tucker Taft
Sent: Saturday, March 24, 2018 9:31 AM
I updated a bit this presentation on safe pointers (AI-0240). It could still
use more pictures, etc., but it might be useful for those still struggling to
understand this proposal.
https://docs.google.com/presentation/d/1NdZl-Pee6ZyniVtAoqx6i8Fv0KD6nrqLixqX3LXzvkA/edit
****************************************************************
From: Tucker Taft
Sent: Tuesday, March 27, 2018 11:48 AM
Randy and I were discussing how to handle cursors, if the rest of the container
data structure were implemented using these Ownership pointers. We began to
discuss some sort of "secondary" pointers or "unchecked" references, but I just
realized that good old 'Unchecked_Access provides exactly what we need. The
pointer inside a cursor would not have Ownership True, would be of a general
access type, and would be initialized by using 'Unchecked_Access.
Unchecked_Access already seems to have exactly the right semantics for such
pointers, namely it is up to the programmer to worry about dangling references.
So I don't see any need to add any special notion of "Unchecked_References" to
AI12-0240. So long as Unchecked_Access works, I don't think we need a new
feature to handle the "cursor problem."
****************************************************************
From: Randy Brukardt
Sent: Tuesday, March 27, 2018 5:37 PM
I don't think this works. Recall (for the rest of you, this was from our
private discussion) that the primary concern is being able to statically claim
that a dereference is not a global object because it is "owned" by an object
that is explicitly passed as a parameter. This is necessary so that the Global
reference can statically be "null" (or some other very restricted setting)
rather than having to be "all" -- "all" necessarily prevents most parallelism
when checking for safety.
Note that we have to be able to dereference the unchecked reference, as it is
not legal to convert the unchecked reference back to the owned pointer (that
would be a conversion into a pool-specific type, which is generally not
allowed).
For the owned pointers themselves, there is a fairly simple reachability rule
which provides the needed safety. The same rule can be used for the unchecked
pointers used in cursors, so long as the compiler can know that the unchecked
pointers cannot contain anything other than an owned pointer.
However, proving that the unchecked pointers cannot contain anything other
than an owned pointer requires analysis of all of the code that could possibly
create such a pointer. One can probably prove it in very limited circumstances
(such as when declared in a body), but the possibility of child units having
visibility into a private part means that separate compilation makes such a
proof unlikely unless we were to require analysis beyond what is usually
expected of Ada compilers.
As such, it seems necessary to declare that the type is an unchecked reference
of an owned pointer, so that we can apply the rules for owned pointers without
having to engage in complex proofs (perhaps OK for SPARK, not OK for Ada).
Once that's done, we might as well make that a pool-specific type in order to
be telling the truth about the capabilities of the type. (Personally, I'd
rather that all of these things are unchecked references by default, and that
the owned pointers are marked on the component or object declaration, as that
would require the least "noise" in the code -- there are only a few owned
pointer components in a typical program. Unmarked objects and components are
either "observers" if they have a short lifetime and "unchecked" if they have
a long lifetime. I don't see the value of having separate types. But that can
be discussed later.)
****************************************************************
From: Tucker Taft
Sent: Thursday, May 9, 2018 7:00 PM
https://docs.google.com/presentation/d/19dxDPWCb-Mg6A97SwAiu8iBRX2Q8s-3ynmwo_wXp48w/edit?usp=sharing_eip&ts=5af38b77
Here is an updated presentation about the "safe pointers" (Ownership aspect)
which I gave on Monday at the High Confidence Software and Systems conference.
It might be of interest to those trying to "grok" the proposed AI. Randy and I
are talking about further additions to support secondary pointers such as
those used for cursors. This doesn't mention that. -Tuck
****************************************************************
From: Randy Brukardt
Sent: Wednesday, October 10, 2018 3:05 PM
[Replying to a message from Tucker Taft in another thread. - Editor.]
>Intriguingly, the anonymous access types are emerging as useful for
>implementing the notion of temporarily borrowing an object in the
>context of "owning" access types. See AI12-0240-1:
>
>http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0240-1.txt?rev=1.9
This is not "intriguing" or "useful", but rather a downright evil (mis)usage
of anonymous access types. The primary purpose of using these [in AI12-0240-1]
is to provide a hack to avoid declaring explicitly what "kind" of owning
access type is involved. That is a terrible idea, because:
(A) owned access types are always pool-specific, while anonymous access
types are always general. Moreover, the point of anonymous access types
is to allow easy inter-conversion, and that has to be banned for owned
access types. Their use in this context is a complete and utter lie to
any uninitiated reader.
(B) one wants to use the various kinds of owned access types in many
contexts, and assuming the form of the declaration is unnecessarily
constraining to the programmer.
(C) note that there are at least four "kinds" of owned access types, so
purely depending on the form of declaration is insufficient [to
differentiate them].
We (you and I) discussed privately better ways to declare these things, which
I wrote up in the alternative AI (AI12-0240-2), in large part because you
suddenly became unwilling to discuss needed changes to the proposal (as well
as refusing to answer reasonable questions about the proposal, rather claiming
"the SPARK folks have considered all of the issues" -- which I don't believe,
simply because I can't find any rules about owned access objects used in
constants in AI12-0240-1).
In a private thread, we've been discussing ways to let the SPARK people have
better visibility into things that we're doing that might affect them -- but
this cuts both ways! The ARG ought to have better visibility into things that
SPARK is doing that might interfere with future Ada features.
>This AI probably won't make it into Ada 2020, but the SPARK folks are in
>the process of adding a stripped-down version of it to SPARK, ...
This is unfortunate, mainly because the proposal needs work (and
simplification!), and sticking it into SPARK in a half-baked form will make
a substantial constituency that will be against properly baking it.
It's also unfortunate if a stripped down version of this AI doesn't make it
into Ada 2020, because it means that unbounded containers that can be used in
parallel with checking on are either (a) impossible or (b) have to be written
in non-Ada.
***************************************************************
From: Tucker Taft
Sent: Wednesday, October 10, 2018 3:38 PM
It sounds like somewhere along the way I dropped the ball on discussing
alternative approaches, and sorry for that. We have been working away on
something based on AI12-0240-1 for inclusion in SPARK. Be that as it may, I
am still interested in alternatives. I was sort of waiting for your
alternative (AI12-0240-2) to show up in my inbox, but I guess you uploaded it
and expected me to go find it online (which is a reasonable expectation, but
alas, I never realized it was there). I just went and looked at it, and it
is interesting but also quite different. I'll share it with the SPARK team
and see what sort of comments they have on it.
***************************************************************
From: Randy Brukardt
Sent: Wednesday, October 10, 2018 5:00 PM
[Taking this private, as it's not interesting to everyone else.]
> It sounds like somewhere along the way I dropped the ball on
> discussing alternative approaches, and sorry for that. We have been
> working away on something based on AI12-0240-1 for inclusion in SPARK.
> Be that as it may, I am still interested in alternatives. I was sort
> of waiting for your alternative (AI12-0240-2) to show up in my inbox,
> but I guess you uploaded it and expected me to go find it online
> (which is a reasonable expectation, but alas, I never realized it was
> there).
It was on the agenda of the last meeting, so it wasn't exactly hidden. You
had pretty much stopped responding to my messages on the topic at that point,
so I wasn't sure you remained interested.
>I just went and looked at it, and it is interesting but also quite
>different. I'll share it with the SPARK team and see what sort of
>comments they have on it.
It's also not complete, in that I didn't make any attempt to define kinds
beyond "owned" and "unchecked". There needs to be at least a definition of a
"viewer" kind, and possibly others.
The idea of using subtypes was yours, I think. I didn't initially like it,
but I realized that in some cases, especially with "viewers", that there might
be a lot of these and having to tag every object would be a pain. Using types
would force explicit conversions for these types all over the place, which
seems like overkill, especially for safe things like "viewers".
OTOH, in normal use, there won't be many "owned" pointers, so it makes sense
to have those on the declaration (especially as they are so important).
Thus, I eventually concluded that both forms are needed. (Note that I've
sometimes missed the ability to declare an aliased subtype so I didn't have
to remember to stick that on every object declaration; this feels the same
to me. That is one area where I think early Ada 9x was closer to correct --
but we've agreed to disagree on that years ago and rehashing is not needed.)
BTW, so far as I understand, it's not possible to build a doubly-linked list
solely with owned pointers; one of them has to be some other kind (and the
only proposed one that works is "unchecked". That suggests that for
ADT/container use in particular one has to have more than just owned
pointers.
The requirement to effectively declare the ownership mappings might not be
necessary, in that one probably could derive it from the various uses of the
owned pointers (at least in ADT cases, where those are all in the private part
and body). However, that would mean that Ada's one-pass like
reading/compilation model would have to be abandoned -- that might be a mess
in some compilers. (Ahem, Janus/Ada in particular.) Perhaps there is a better
way to do it that doesn't require quite so much detail.
Note of course that I was trying to solve a different problem than
AI12-0240-1: specifically, ownership in ADTs so that the implementation can
use pointers without having to resort to Global All or something equally
annoying (which of course would prevent parallel use when checking is
enabled).
It would be best if we could solve both of these problems with one solution,
even if we only standardize the minimum corner for now (as roughly described
in AI12-0240-2).
Anyway, comments welcome, if only because I don't have many complete examples
to consider, so I really don't know how this works in practice (I've mainly
considered the doubly linked list container when designing AI12-0240-2.)
***************************************************************
From: Randy Brukardt
Sent: Wednesday, October 10, 2018 5:08 PM
> [Taking this private, as it's not interesting to everyone else.]
Apparently I did a bad job of that. :-) I intended that message just for
Tucker, since it mostly discussed our previous private conversations on this
topic.
***************************************************************
From: Tucker Taft
Sent: Wednesday, October 10, 2018 5:11 PM
Thanks for the background, and sorry for the misunderstanding. It is
probably safe to assume that when I stop responding, it is because my
brain just overflowed, and I have forgotten to reply and now the e-mail is
buried in a sea of other e-mails. I have also been traveling an insane
amount this year, having traveled across the "pond" eight times since
January, including two 2-week trips to Japan. So my brain is "mush" much
of the time due to jet lag.
***************************************************************
From: Tucker Taft
Sent: Thursday, October 11, 2018 7:00 PM
***************************************************************
Questions? Ask the ACAA Technical Agent