Version 1.1 of acs/ac-00149.txt


!standard 7.5(8.1/2)          07-10-24 AC95-00149/01
!class confirmation 07-10-24
!status received no action 07-10-24
!status received 07-09-26
!subject Out of range array aggregates and build-in-place
!summary
!appendix

!topic Out-of-range array aggregates and build-in-place
!reference 7.5(8.1)
!from Adam Beneschan 07-09-26
!discussion

In the following program:

    package Pak1 is
        Count : Integer := 0;
        type Rec is record
            N : Integer;
        end record;
        type Rec_Arr is array (natural range <>) of Rec;
        function Func return Rec;
    end Pak1;

    package body Pak1 is
        function Func return Rec is
        begin
            Count := Count + 1;
            return (N => Count);
        end Func;
    end Pak1;

    with Pak1;
    with Text_IO;
    procedure Test570 is
    begin
        begin
            declare
                A : Pak1.Rec_Arr (1 .. 20) := (1 .. 1000 => Pak1.Func);
            begin
                null;
            end;
        exception
            when Constraint_Error => null;
        end;
        Text_IO.Put_Line (Integer'Image (Pak1.Count));
    end Test570;

The semantics of the declaration of A is that an anonymous array
object with bounds 1..1000 is created, Func is called 1000 times to
set up the array, and then the code attempts to convert this object to
the subtype Rec_Arr(1..20) and then gets a constraint error.  The
program will output 1000.

Now suppose we make Rec a limited type, so that no anonymous object
can be created.  What value of Count is output?  Is an implementation
that outputs either 0 or 20 (or maybe 21) for Count a legal
implementation?

If the initialization expression had been a call to a function that
returns an unconstrained array, and the function attempted to return
an array with bounds 1..1000, the Implementation Permission in 6.5(24)
would allow the implementation to raise Constraint_Error at the point
of the call as soon as it was determined that the constraint would be
violated.  But I don't see any such permission for initialization
expressions that are aggregates.

Intuitively, I would think that for this aggregate:

    A : Pak1.Rec_Arr (1 .. 20) := (1 .. 20 => Pak1.Func);

the implementation would call Pak1.Func effectively using A(I) as an
"out" parameter for each I in 1..20; obviously this won't work when
the aggregate is too big, if Func is expected to be called 1000
times.

****************************************************************

From: Randy Brukardt
Sent: Wednesday, September 26, 2007  5:56 PM

....
> The semantics of the declaration of A is that an anonymous array
> object with bounds 1..1000 is created, Func is called 1000 times to
> set up the array, and then the code attempts to convert this object to
> the subtype Rec_Arr(1..20) and then gets a constraint error.  The
> program will output 1000.
>
> Now suppose we make Rec a limited type, so that no anonymous object
> can be created.  What value of Count is output?  Is an implementation
> that outputs either 0 or 20 (or maybe 21) for Count a legal
> implementation?

I think so, and I think that is true for the non-limited case as well.
That's because of the optimization permissions of 11.6(6). The
Constraint_Error here is surely a language-defined check, and the permission
allows the expression to be moved forward (or backwards) so long as the same
handler will be used. (Which is surely the case for a single statement or
declaration.) In particular, the "external interactions" don't need to match
the canonical semantics, meaning that you can't reason at all about the
number of calls to the function. (OTOH, if no exception is raised, the
number of calls is well-defined.)

We need the special permission for functions because the exception would
then be raised within the function (and that is a different exception
handler). In hindsight, I wonder if we really did need that permission at
all. Well, let's not answer questions not asked... :-)

Of course, this all depends on my understanding of 11.6, and we all know
that *no one* understands 11.6 :-), so I might be totally off base...

****************************************************************

From: Randy Brukardt
Sent: Wednesday, September 26, 2007  8:27 PM

Stephen Baird asserts privately:

> Build-in-place should be implementable without relying on 11.6.

That doesn't make much sense. 11.6 is a bunch of exceptions to the canonical
semantics; build-in-place is a set of exceptions to the canonical semantics.
Why is one set of exceptions better than any other?

What you need to prevent trouble with Adam's example is a permission to
raise the exception early. (That's what the permission for functions is
about.) The language already has it. Having two permissions with the same
effect is bad language design!

To start saying that all language permissions are not created equal is
pretty silly.

You could argue that 11.6 itself is bad language design, but I would counter
that build-in-place itself is also bad language design. Both warp the
canonical semantics for a particular end. It's not worth putting lipstick on
a pig.

The reason I was in favor of including the permission for functions is to
make it crystal clear that raising the exception is allowed, given that it
is logically in a different scope (possibly separately compiled) than the
original exception (it's only inside of the function that you can tell that
there is a problem). That's not necessary in the aggregate case, as
everything is in the same place.

I realize that someone has suggested that build-in-place be given first
class language status. I don't believe that is possible without a massive
rewrite of the standard, because you are going to be continually running
into examples like Adam's. That's why it wasn't done in the first place. I
think the time spent on such a rewrite is better spent on other Ada topics,
rather than making a bad situation worse.

****************************************************************

From: Tucker Taft
Sent: Wednesday, September 26, 2007  8:58 PM

I agree with Randy.  Our compiler generates
"raise Constraint_Error;" for this, followed
by some (dead) code that would do bad things if it
were executed.  In general in our compiler, if
there is an applicable index constraint (what
we more generally call a "target subtype") for
an array aggregate, even if it doesn't have an "others"
association, we check the actual bounds against
the applicable index constraint before we start
evaluating component expressions.  It might be
nice if that were explicitly permitted by RM 4.3.*,
but as Randy points out, 11.6 allows this kind
of reordering generally.

****************************************************************

From: Pascal Leroy
Sent: Thursday, September 27, 2007  4:42 AM

> Stephen Baird asserts privately:
>
> > Build-in-place should be implementable without relying on 11.6.

I am with Steve here.  It should be possible for a compiler writer to skip
the sections titled Implementation Permissions and still know how to
implement the language correctly.

> That doesn't make much sense. 11.6 is a bunch of exceptions
> to the canonical semantics; build-in-place is a set of
> exceptions to the canonical semantics. Why is one set of
> exceptions better than any other?

Oh boy, that's confused!  In case you didn't notice, 11.6 is a bunch of
*Implementation Permissions*.  7.5(8.1/2) is an *Implementation
Requirement*.  Surely I would assume that a requirement has precedence
over a permission.  In other words, a requirement is absolutely not an
exception to the canonical semantics: if you don't comply with it you
don't have a correct compiler.

> To start saying that all language permissions are not created
> equal is pretty silly.

This statement is true, but then saying that a requirement is on the same
footing as a permission is also quite silly.

> The reason I was in favor of including the permission for
> functions is to make it crystal clear that raising the
> exception is allowed, given that it is logically in a
> different scope (possibly separately compiled) than the
> original exception (it's only inside of the function that you
> can tell that there is a problem). That's not necessary in
> the aggregate case, as everything is in the same place.

I think that Adam's example shows that, when it comes to build-in-place,
whenever we have a rule regarding functions we should have a similar rule
regarding aggregates.  So there should be somewhere (probably in 4.3) an
equivalent of 6.5(24/2).

> I realize that someone has suggested that build-in-place be
> given first class language status.

Yes, and that someone believes that this is the only sane way to deal with
the stream of oddities and pathologies that Adam and Steve have been
producing over the last few months.

> I don't believe that is
> possible without a massive rewrite of the standard

FUD, FUD, FUD!

> That's why it wasn't done in the first place.

That's not my recollection.  At least that's not how I perceived it at the
time.  My view was that we had to go with an Implementation Requirement
because it was not really testable and had no effect on the high-level
semantics of the language.  Both assertions turned out to be untrue:
witness the fact that someone wrote ACATS tests to check that you actually
build in place.

****************************************************************

From: Edmond Schonberg
Sent: Thursday, September 27, 2007  7:56 AM

> Oh boy, that's confused!  In case you didn't notice, 11.6 is a bunch of
> *Implementation Permissions*.  7.5(8.1/2) is an *Implementation
> Requirement*.  Surely I would assume that a requirement has precedence
> over a permission.  In other words, a requirement is absolutely not an
> exception to the canonical semantics: if you don't comply with it you
> don't have a correct compiler.
>

Didn't Randy mention 11.6 just to answer Adam's question about the
indeterminacy of the output? The program may print 0, 20, or 21,
depending on when the constraint violation is detected and how the
code for the aggregate is generated.  If the compiler recognizes
statically that the bounds of the aggregate violate the constraint,
it is certainly legal to replace the whole aggregate with a raise, in
which case the program outputs zero (that's what happens with GNAT,
and apparently with Tuck's compiler as well).  If the aggregate is
expanded into a loop then 20 and 21 are possible outputs. As long as
this expansion does not generate a temporary this obeys 7.5 (8.1/2),
so where is the problem?

****************************************************************

From: Tucker Taft
Sent: Thursday, September 27, 2007  8:09 AM

I sympathize with Steve and Pascal, but I agree with
Randy that 11.6 takes precedence over *all* rules
in the language.  They are written explicitly in
a way that says *the rest of the standard* defines
the "canonical semantics" and then 11.6 is applied
over and above those canonical semantics.

I don't buy the argument that implementation
permissions are somehow "lower" in the pecking
order than implementation requirements.  I don't
think you can create a pecking order among
paragraphs based on the kind of paragraph.
All normative paragraphs are of equal weight in general.
It is only the words *within* the paragraph that
can establish a "pecking order" and words like
"notwithstanding what it says elsewhere" or
the kind of explicit discussion present in
11.6 are examples of that.

However, I agree with Pascal that we ought to
at least *try* to define build-in-place formally.
However, for this particular check, I think the
issue existed before Ada 2005, because of the
frequency with which you have a "target subtype"
for an aggregate (or more specifically an "applicable
index constraint" for an array aggregate), and that
it should be legal to perform the constraint checks
on the bounds before evaluating the individual
component expressions.  Other parts of the manual
go out of their way to permit "any order" when
there are multiple constraint checks and/or multiple
expression evaluations.  It seems we ought to do
something similar here.  And having a symmetry
between function calls and aggregates makes sense.

****************************************************************

From: Robert A. Duff
Sent: Thursday, September 27, 2007  8:10 AM

> > Stephen Baird asserts privately:
> >
> > > Build-in-place should be implementable without relying on 11.6.
>
> I am with Steve here.

I share that sentiment.

But in practical terms, what matters is what the compiler is allowed/required
to do.  IMHO, the natural implementation would be to check the bounds before
evaluating the components of the aggregate, in which case Count will be 0 when
done.  So we should allow that (whether limited or nonlimited).

I suppose it's in the "spirit of Ada" to allow pretty much anything else, too
-- that is, any number of components between 0 and 1000 could be evaluated.

****************************************************************

From: Robert A. Duff
Sent: Thursday, September 27, 2007  9:12 AM

>...  If the compiler recognizes  statically that the bounds of the
> aggregate violate the constraint,  it is certainly legal to replace the whole
> aggregate with a raise, in  which case the program outputs zero (that's what
> happens with GNAT,  and apparently with Tuck's compiler as well).

Right.  But the compiler doesn't need to recognize anything statically in order
to get zero.  I just checked, and GNAT gets zero even if the bounds of the
left-hand side and the bounds of the aggregate are dynamic, presumably because
it checks the bounds first (which I think is the natural implementation).

My modified version is shown below, and it prints this:

Constraint_Error raised.
 0

-----

package Pak1 is

   pragma Elaborate_Body;

    Count : Integer := 0;
    type Rec is limited record
        N : Integer;
    end record;
    type Rec_Arr is array (natural range <>) of Rec;
    function Func return Rec;

    Ident_Int_20 : Integer;
    Ident_Int_1000 : Integer;

end Pak1;

package body Pak1 is
    function Func return Rec is
    begin
        Count := Count + 1;
        return (N => Count);
    end Func;
begin
   Ident_Int_20 := 20;
   Ident_Int_1000 := 1000;
end Pak1;

with Pak1;
with Text_IO;
procedure Test570 is
begin
    begin
        declare
            A : Pak1.Rec_Arr (1 .. Pak1.Ident_Int_20) :=
                 (1 .. Pak1.Ident_Int_1000 => Pak1.Func);
        begin
            null;
        end;
    exception
       when Constraint_Error =>
          Text_IO.Put_Line ("Constraint_Error raised.");
    end;
    Text_IO.Put_Line (Integer'Image (Pak1.Count));
end Test570;

****************************************************************

From: Robert A. Duff
Sent: Thursday, September 27, 2007  9:16 AM

> However, I agree with Pascal that we ought to
> at least *try* to define build-in-place formally.

I think something like this might be a good approach:
the return object is created, and at a certain point
it "becomes" the target object.  Atomically with respect
to abort, of course.  ;-)

Furthermore, if the target object has constrained nominal subtype,
the C_E can happen "early".

And as Pascal pointed out, aggregates and functions are similar,
and deserve similar wording.

****************************************************************

From: Adam Beneschan
Sent: Thursday, September 27, 2007 10:24 AM

> Didn't Randy mention 11.6 just to answer Adam's question about the
> indeterminacy of the output? The program may print 0, 20, or 21,
> depending on when the constraint violation is detected and how the
> code for the aggregate is generated.  If the compiler recognizes
> statically that the bounds of the aggregate violate the constraint,
> it is certainly legal to replace the whole aggregate with a raise, in
> which case the program outputs zero (that's what happens with GNAT,
> and apparently with Tuck's compiler as well).  If the aggregate is
> expanded into a loop then 20 and 21 are possible outputs. As long as
> this expansion does not generate a temporary this obeys 7.5 (8.1/2),
> so where is the problem?

I framed my question in terms of the indeterminacy of the value of
Count, but (as I think some suspected) the value of Count is only part
of the issue; it's more of a concrete symptom of the underlying issue,
which is: in this construct, where Rec_Arr is an array of a limited
type:

    A : Pak1.Rec_Arr (1 .. 20) := (1 .. 1000 => Pak1.Func);

and the RM requires that the aggregate be "built in place", just what
the heck does that mean?  I'm siding with those who think that relying
on 11.6 isn't a good solution, because

(1) 11.6(6) is an implementation permission, not a requirement, so
there needs to be a definite answer to the question "What are the
semantics of this declaration for a compiler that does not invoke the
permission of 11.6(6)", even if every existing compiler would actually
take advantage of that permission---and I don't see a good answer to
that question; and

(2) 11.6 is entitled "Exceptions and Optimization", and to me the term
"optimization" is about improving performance---not about turning
incorrect programs into correct ones, which it would be if we said
that a compiler would not be correct unless it took advantage of
11.6(6) in this declaration.

In a previous thread, when I asked a different question about the
effect of this build-in-place requirement, Tucker mentioned that there
was already some perception in the ARG that it's necessary to be more
explicit about what "build-in-place" means; this seems to me to be
another example demonstrating why it's necessary.  I also think that
"build-in-place" should be given first-class language status, and I've
done a little bit of thinking about how the RM could be reworded for
this (mainly because of a nagging feeling that maybe I should be doing
some of the work instead of just harping on the problems :)), but I
haven't gotten too far yet...

****************************************************************

From: Edmond Schonberg
Sent: Thursday, September 27, 2007 10:51 AM

> I framed my question in terms of the indeterminacy of the value of
> Count, but (as I think some suspected) the value of Count is only part
> of the issue; it's more of a concrete symptom of the underlying issue,
> which is: in this construct, where Rec_Arr is an array of a limited
> type:
>
>     A : Pak1.Rec_Arr (1 .. 20) := (1 .. 1000 => Pak1.Func);
>
> and the RM requires that the aggregate be "built in place", just what
> the heck does that mean?

In your original question you said:

Intuitively, I would think that for this aggregate:

     A : Pak1.Rec_Arr (1 .. 20) := (1 .. 20 => Pak1.Func);

the implementation would call Pak1.Func effectively using A(I) as an
"out" parameter for each I in 1..20; obviously this won't work when
the aggregate is too big, if Func is expected to be called 1000
times.

I think we all share this intuition. If the aggregate is (1..x =>
Pak1.Func) and x is non-static, then either we check that x = 20
before the loop (and raise an exception before any call to Func), or
else the call to A(I) needs to do a check on the value of I at each
step, and we call Func 20 times before raising the exception.
7.5(8.1/2) just says that these individual assignments are NOT to an
anonymous temporary, so they must be "in place", i.e. in object A.
This does not rely on 11.6 to define the semantics (apart from the
indeterminacy of Count). What am I missing?

****************************************************************

From: Jean-Pierre Rosen
Sent: Thursday, September 27, 2007  9:38 AM

> I suppose it's in the "spirit of Ada" to allow pretty much anything else, too
> -- that is, any number of components between 0 and 1000 could be evaluated.
>
Yes, and talking about user's benefit (as opposed to compiler writer
ease), I can't imagine that a non-insane program would benefit from
guaranteed portability (i.e. same number of evaluations whatever the
compiler) in this case...

****************************************************************

From: Pascal Leroy
Sent: Thursday, September 27, 2007 11:51 AM

> I think we all share this intuition... What am I missing?

I believe that you are missing the fact that this intuition is not supported
by any normative wording in the RM.  In the non-limited case, you *have*to*
call the function 1000 times to build the aggregate and then you do the
check.  In the limited case, there is no wording that tells you that you
don't really have to call the function 1000 times.  In other words, we require
build-in-place, but what if there is no place to build (part of) the object?

****************************************************************

From: Randy Brukardt
Sent: Thursday, September 27, 2007 12:59 PM

...
> > That doesn't make much sense. 11.6 is a bunch of exceptions
> > to the canonical semantics; build-in-place is a set of
> > exceptions to the canonical semantics. Why is one set of
> > exceptions better than any other?
>
> Oh boy, that's confused!  In case you didn't notice, 11.6 is a bunch of
> *Implementation Permissions*.  7.5(8.1/2) is an *Implementation
> Requirement*.  Surely I would assume that a requirement has precedence
> over a permission.  In other words, a requirement is absolutely not an
> exception to the canonical semantics: if you don't comply with it you
> don't have a correct compiler.

It is a requirement to implement a permission; it does not change the
canonical semantics at all. Both 11.6 and build-in-place are mods to the
canonical semantics, and as such have a lesser status - but beyond that, it
doesn't make sense to give "goodness" rankings to RM rules.

...
> > I realize that someone has suggested that build-in-place be
> > given first class language status.
>
> Yes, and that someone believes that this is the only sane way to deal with
> the stream of oddities and pathologies that Adam and Steve have been
> producing over the last few months.
>
> > I don't believe that is
> > possible without a massive rewrite of the standard
>
> FUD, FUD, FUD!

Not FUD, fact. The underlying problem here is that build-in-place is not
assignment, and describing it as such is going to be forever error prone.
Assignment and initialization are two different things. But to change that
properly means rewriting every paragraph where "assignment" is used to
describe initialization.

Moreover, the majority of these anomalies can be resolved simply by looking
at the canonical semantics (that everything is copied); they don't require
additional wording to resolve. But every last one will require wording if
the canonical semantics is changed to include build-in-place, and I'd be
very surprised if we've uncovered them all already (my guess would be that
we've only seen a few of them - call that FUD if you like, but it is based
on years of experience handling these sorts of issues).

A simple half-baked change here surely won't help any; we'd still have a
never-ending stream of issues and AIs to fix them. If we really are going to
try to fix this this way, we have to commit to a full-fledged study of the
effects of the change on the entire RM.

Moreover, only a few ARG members are really capable of handling a change of
this sort, and I don't think any of them really can spend the time now (you
are moving and changing jobs; I'm supposed to be building an ACATS; Tucker
has major ASIS projects; there are a few others that are capable of writing
such a massive language change, but none of them have done that sort of
project in years). And in any case I'm going to be stuck with the job of
rewriting all of the AARM paragraphs. Yikes!

...
> > That's why it wasn't done in the first place.
>
> That's not my recollection.  At least that's not how I perceived it at the
> time.  My view was that we had to go with an Implementation Requirement
> because it was not really testable and had no effect on the high-level
> semantics of the language.  Both assertions turned out to be untrue:
> witness the fact that someone wrote ACATS tests to check that you actually
> build in place.

It originally was a Corrigendum change, and it was done in the easiest way
possible. But the only alternative is a massive change: at least 15
paragraphs would need to be rewritten in AARM 7.6 alone. We're again in
Corrigendum mode, and I find it hard to justify such a change.

Perhaps we're painted into a corner here vis-a-vis the standard, but in any
case, this is not important - I find it hard to believe that any implementer
will not understand what is expected; and no one cares whether an exception
is raised a bit early. There surely aren't going to be any ACATS tests on
such minutiae. The canonical semantics is clear, but 11.6 makes that
untestable in any case. So this is a low-priority issue at best.

****************************************************************

From: Randy Brukardt
Sent: Thursday, September 27, 2007  1:07 PM

> > I think we all share this intuition... What am I missing?
>
> I believe that you are missing the fact that this intuition is not supported
> by any normative wording in the RM.  In the non-limited case, you *have*to*
> call the function 1000 times to build the aggregate and then you do the
> check.  In the limited case, there is no wording that tells you that you
> don't really have to call the function 1000 times.  In other words, we require
> build-in-place, but what if there is no place to build (part of) the object?

No, you can't have it both ways. Either you only consider the canonical
semantics (which does not include build-in-place), or you consider the
entire semantics (which includes 11.6). There is never any point at which
either set of semantics is inconsistent.

Besides, the entire question has nothing to do with a build-in-place
requirement. The question here is whether you ever have to build an
aggregate that isn't going to fit, and the answer is no, because of 11.6. We
don't have a specific permission to reorder here, because there is a blanket
one. Why do we need to complicate the standard with extra permissions?

****************************************************************
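
The two expansions discussed in this thread (checking the aggregate's
bounds before evaluating any component, versus checking each index as
the components are built in place) can be sketched by hand as follows.
This is only an illustration under stated assumptions, not the code any
particular compiler generates: Expansion_Sketch, Fill, and High are
hypothetical names, and Fill stands in for Pak1.Func by writing its
result directly into the target component, in the spirit of the "out
parameter" intuition from the original question.

```ada
with Ada.Text_IO;

procedure Expansion_Sketch is

   type Rec is limited record
      N : Integer;
   end record;
   type Rec_Arr is array (Natural range <>) of Rec;

   Count : Integer := 0;

   --  Hypothetical stand-in for Pak1.Func: builds its result directly
   --  in the target component instead of returning a value.
   procedure Fill (R : out Rec) is
   begin
      Count := Count + 1;
      R.N   := Count;
   end Fill;

   --  Non-static upper bound of the would-be aggregate (1 .. High => ...).
   High : constant Integer := Integer'Value ("1000");

   A : Rec_Arr (1 .. 20);

begin
   --  Strategy 1: compare the aggregate's length with the target's
   --  before evaluating any component.
   begin
      if High /= A'Length then
         raise Constraint_Error;
      end if;
      for I in 1 .. High loop
         Fill (A (I));
      end loop;
   exception
      when Constraint_Error =>
         Ada.Text_IO.Put_Line ("Bounds first:" & Integer'Image (Count));
   end;

   Count := 0;

   --  Strategy 2: no up-front check; the index check on A (I) fails
   --  once I goes past A'Last.
   begin
      for I in 1 .. High loop
         Fill (A (I));
      end loop;
   exception
      when Constraint_Error =>
         Ada.Text_IO.Put_Line ("Per component:" & Integer'Image (Count));
   end;
end Expansion_Sketch;
```

Under the canonical ordering, the first block makes no calls to Fill
and the second makes 20, since the index check on A (21) fails before
the 21st call. Neither strategy creates a temporary array, so both
build in place; they differ only in when Constraint_Error is raised.

****************************************************************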

