Version 1.1 of acs/ac-00124.txt

Unformatted version of acs/ac-00124.txt version 1.1
Other versions for file acs/ac-00124.txt

!standard 4.5.2(30/2)          06-01-04 AC95-00124/01
!class Amendment 06-01-04
!status received no action 06-01-04
!status received 05-11-03
!subject Real-time membership tests with multiple inheritance
!summary
!appendix

From: Tucker Taft
Date: Thursday, November  3, 2005  9:12 AM

With Ada 2005, we now are adding multiple inheritance of
interfaces, and as such, creating more of a challenge
for implementations that want to ensure constant-time
overhead for membership tests and dispatching.  Norm
Cohen published a paper in 1991 about this problem in
the context of single inheritance (ACM TOPLAS 13(4):1991).
For multiple inheritance, there is an interesting discussion
of at least the membership test aspect of this in the
latest TOPLAS (Volume 27, number 5, Sep. 05) by
Gil and Zibin, "Efficient Subtyping Tests with
PQ-Encoding."  That led me to another interesting
paper, available at the following URL:

    http://www.cs.purdue.edu/homes/jv/pubs/ecoop03.pdf

called "Java Subtype Tests in Real-Time" by Palacz and
Vitek.

Neither paper seems to discuss the overhead of dispatching.
However both are based on the idea of partitioning the types
(classes or interfaces) into "buckets" or "slices" such that
no two types in a bucket have a common descendant.   Once you
have done that, then to check if a given object with tag
identifying type TD is a descendant of a type TA:

   a) assume each type has an array with one slot per bucket,
      identifying the ancestor of the type, if any, that is
      from that bucket
   b) assume each type identifies what bucket it is in.
   c) check whether in TD's array, the slot associated with
      TA's bucket identifies TA.  If so, then TA is an ancestor
      of TD.  If not, then TA is not an ancestor of TD.

In a single-inheritance hierarchy, creating buckets is done easily
by associating a level number with each type, representing its
number of ancestors, and putting all types at the same
level in the same bucket.  This is essentially what
Norm Cohen's algorithm did, and I suspect it is what most
Ada 95 compilers do to implement membership tests.

Given this approach to membership tests, its seems that
the problem of dispatching can be handled similarly.
Rather than simply identifying the ancestor, if any,
that a given type has from a given bucket, one could
include a pointer to a dispatching table in each slot
in the array.  Since each type has at most one ancestor
from each bucket, and since the bucket associated with
each type is fixed, this means you can find the dispatch
table for a given ancestor by using its bucket number
to index into this array of pointers to dispatch
tables.

The dispatch table for a given ancestor is one that contains
pointers to the various dispatching operations, in the
order they were assigned "dispatching slot numbers" for that
ancestor.  For a given type, many of its ancestors can point to
the same dispatch table.  This is because that most types will
start out with the same dispatch-table layout as their parent
type, and then merge in additional operations from the other
"progenitor" types.  This might also imply that
for the dispatching problem alone, one could have fewer
total buckets and still come up with a bucket number for
each type that can be used to find an appropriate dispatch
table.  In particular, for the purpose of dispatching,
a type could generally be in the same "bucket" as its parent
type, since for any descendants they have in common,
the dispatch table used for the child can also be used
for the parent.  However, if one is already assigning
buckets for the membership test, it might be simplest
to reuse the bucket assignment for the dispatching
problem as well.

Here is an example:

     type T1 is interface;
     type T2 is interface;
     type T3 is new T1 and T2 ...
     type T4 is new T1 ...
     type T5 is new T2 ...
     type T6 is new T4 and T2 ...

For membership tests, we could assign these to buckets
as follows:

    Bucket 1: T1
    Bucket 2: T2
    Bucket 3: T3, T4, T5
    Bucket 4: T6

To decide whether T5 is an ancestor of T6, one
would look in T6's ancestor array:
    [1 => T1, 2 => T2, 3 => T4, 4 => T6]
with bucket index 3, and discover that T5 is *not*
an ancestor, because some other type (T4) is
occupying that location in T6's ancestor array.

As far as dispatching, presuming we are viewing some
object via a T2 pointer, to find the dispatching table
for calling one of T2's dispatching operations, we
would get use bucket index 2 to get the pointer to
the appropriate dispatch table for the object's underlying
run-time type (as indicated by its tag).

If we assume each type in the above example adds two
new operations, say Op11, Op12 for T1, Op21, Op22 for T2,
etc., then the table of pointers to dispatch tables
for T6 would look like this:

     1 => [T6_Op11, T6_Op12]
     2 => [T6_Op21, T6_Op22]
     3 => [T6_Op11, T6_Op12, T6_Op41, T6_Op42]
     4 => [T6_Op11, T6_Op12, T6_Op41, T6_Op42,
                 T6_Op21, T6_Op22, T6_Op61, T6_Op62]

Clearly 1, 3, and 4 could all point to the same place,
and 2 could point into the "middle" of that same table
(at the T6_Op21 slot).

We could compress these types into a smaller number
of "buckets" just for the purpose of dispatching:

    Dispatching Bucket 1: T1, T4, T6
    Dispatching Bucket 2: T2, T5
    Dispatching Bucket 3: T3

The savings in this case wouldn't be worth it.  In a larger
type hierarchy, it might be worth it, especially if there are
relatively few interfaces compared to the number of non-interface
types.

The one critical thing to point out is that this assignment
to buckets cannot generally be done until link time, or
later.  If the linker is not up to the job, then one possibility
would be to build these tables as part of the "tag registration"
that happens as tagged types are elaborated (to support
"Internal_Tag" and friends).

The nice thing about this is that it provides constant-time
overhead for multiple inheritance, without having to add
multiple "sub tags" into tagged objects -- a single tag
is sufficient.

****************************************************************

From: Tucker Taft
Date: Thursday, November  3, 2005 10:41 AM

On further reflection, since you always know at compile-time whether
you are testing membership against an interface class-wide type,
or calling a dispatching operation of an interface type,
there seems no reason to alter the way membership tests and
dispatching is performed for non-interface types.

This means that a separate "interface ancestor" array could
be maintained, and only interfaces need to be assigned
"bucket numbers" in the sense described below.  The
interface ancestor array would be indexed by
bucket number, and would provide two things, one
the identity of the ancestor from this given bucket
(if any), and a pointer to a dispatch table for the
primitives of that ancestor, in the order expected
for that interface.  This would mean that using types
that involve no interfaces would work the same way as
it does in Ada 95, and hence there would be essentially
no "distributed overhead" for having interfaces in the
language.  However, any type that has at least one
interface ancestor would require the interface ancestor
array, and each interface would need to be assigned
a "bucket number."

Note that the assignment of interfaces to buckets depends on
what descendants exist within a given program, so you still
couldn't assign them before link-time.  If deferred to run-time,
this would mean updating the assignments as any tagged type with
interface ancestors is elaborated.

****************************************************************


Questions? Ask the ACAA Technical Agent