
!standard 13.3.2(00)          10-02-19 AC95-00192/01
!class Amendment 10-02-19
!status received no action 10-02-19
!status received 09-11-18
!subject Preconditions are actively harmful
!summary
!appendix

From: Pascal Leroy
Date: Tuesday, December 29, 2009  5:18 AM

[From a thread filed in AI05-0145-2.]

> One hopes that there will exist tools that detect and warn about "bad"
> preconditions (and all of the other sorts of contracts that we're planning
> to add). Without that, it will be very hard to reason about anything.

I couldn't agree more: these contracts are either pure-ish and not terribly
useful, or full of dependencies on global/external state, and devilishly hard to
analyze.  This is exactly why I think that all this contract business is useless
at best and actively harmful at worst.

****************************************************************

From: Tucker Taft
Date: Thursday, December 31, 2009  8:38 AM

I have a lot of experience using assertions, and it is all good.  I have no
experience using preconditions, postconditions, or invariants per-se, though of
course assertions can mimic all these roles, and again, when used as such, my
experiences have been positive.  It is certainly true that during maintenance
some assertions fail, but that is usually a very useful thing, as it makes one
go back and review assumptions that might or might not still hold.

On a nightly basis, we run regression tests both with a version with assertions
turned on and with a version with assertion checks suppressed.  We ship the
version with assertion checks suppressed (but normal constraint checks remain
on).  The version with assertions turned on is significantly slower, but still
usable.  We have an even slower version, which activates additional assertions,
some analogous to your suggested expensive monotonic log assertion.  We run a
few tests nightly with this "slowest" version.
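
In Ada 2005 terms, each of those build flavors amounts to little more than a
configuration pragma; a minimal sketch (not necessarily how our build is
actually set up):

    --  Nightly-test build: enable assertion checks.
    pragma Assertion_Policy (Check);

    --  Shipping build: ignore assertions; constraint checks stay on
    --  unless explicitly suppressed.
    pragma Assertion_Policy (Ignore);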

The proportion of bugs caught by assertions vs. other means is very high
(probably > 50%).  Of course they will probably be caught eventually by some
other means, but the assertion failures tend to be easier to debug because they
identify the problem at a higher level than might a constraint check failure, or
heaven forbid a core dump.  (Hey, we can use the word "core" again without
feeling over 50 years old!).

Very occasionally a complex assertion creates its own problems, but that is
almost vanishingly rare.  Much more common is when turning on debugging output
causes the results to change, because the elaborate tree dumps, etc., change
caching behavior, etc.

One difference might be that most of these assertions were inserted very
early on.  I can imagine going in after the fact to add assertions,
preconditions, postconditions, etc., might be difficult.  But it seems very
natural to add a precondition to almost every subprogram.  I'll admit I would
probably make significantly less use of postconditions and invariants.

I guess we already know that your mileage may vary...

****************************************************************

From: Robert Dewar
Date: Thursday, December 31, 2009  9:03 AM

I echo all Tuck's observations, and his comments apply almost unchanged to the
GNAT compiler sources.

One interesting extension in GNAT is pragma Check

pragma Check (check-identifier, condition [, message]);

which allows categorization of assertions, individually controllable by

pragma Check_Policy (check-identifier, ON | OFF | CHECK | IGNORE);

In fact preconditions and postconditions are just special cases of categorized
assertions, controllable using those identifiers.
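
For example (a sketch: Wellformedness is an arbitrary check identifier and
Tree_Valid a hypothetical function):

    --  A categorized assertion:
    pragma Check (Wellformedness, Tree_Valid (N), "malformed tree");

    --  Control just this category, independently of other assertions:
    pragma Check_Policy (Wellformedness, Off);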

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  9:12 AM

> I have a lot of experience using assertions, and it is all good.  I
> have no experience using preconditions, postconditions, or invariants
> per-se, ...

I agree with your ramblings.  Two general comments:

Please include "user-defined constraints" in your list with "preconditions,
postconditions, or invariants".  I still wonder whether the user-defined
constraints AI and the invariants AI should be combined, but I haven't had time
to work on that. Either way, I continue to believe that user-defined constraints
are more useful than invariants, as currently defined.

Another advantage of all these things is documentation.
And it's better documentation than you get with comments, because it is more
likely to be true (and, more importantly, more likely to stay true during
maintenance).  Also, it's written in a precise language, rather than English, so
it's more likely to be understood.

> We have
> an even slower version, which activates additional assertions, some
> analogous to your suggested expensive monotonic log assertion.

I missed the monotonic log assertion -- what's that?

The slower version you're talking about contains two kinds of assertions in
addition to the somewhat-slow version: (1) assertions that are very slow because
they walk a giant data structure to assert global properties of it, and (2)
assertions that occur in very low-level procedures that hardly ever change, so
they probably no longer have many bugs, and are called often, so the assertions
are expensive.

****************************************************************

From: Robert Dewar
Date: Thursday, December 31, 2009  9:35 AM

> Another advantage of all these things is documentation.
> And it's better documentation than you get with comments, because it
> is more likely to be true (and, more importantly, more likely to stay
> true during maintenance).  Also, it's written in a precise language,
> rather than English, so it's more likely to be understood.

I find the last sentence remarkable: the idea that using a precise language
makes things more likely to be understood is a fallacy widely shared by formal
folks (no doubt there are people who think the current Ada RM is more likely to
be understood than the original, but for me it is much more incomprehensible).
After all, there were those who thought the Algol-68 report comprehensible :-)

But in simple cases, sure it is good to have things precise, though in any case
where there is an issue of clear understanding, I mostly reject the notion of
self-documenting code (*ESPECIALLY* complex conditions full of AND THEN, OR
ELSE, and now case expressions and conditional expressions). Such complex
expressions always need comments in addition to be clear, in my view.

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  10:22 AM

> I find the last sentence remarkable: the idea that using a precise
> language makes things more likely to be understood is a fallacy widely
> shared by formal folks...

You are right.  I guess I'd claim that formality makes things easier to
understand, and also harder to understand.  ;-) An example of the former is:

    X : Integer := 1; -- X is always between 1 and 10.

versus:

    X : Integer range 1..10 := 1;

where the comment is confusing (did the author mean "between 1 and 10,
inclusive"?).  It really means that the original programmer thought X should be
in 2..9 (or maybe 1..10), but they might have been wrong, and even if they were
right, it might no longer be true.

>... (no doubt there
> are people who think the current Ada RM is more likely to be
> understood than the original, ...

We received one comment to that effect during Ada 9X.
But most people (including me) think the Ada 95 RM is harder to understand than
the Ada 83 RM, for programmers.  (Compiler writers, on the other hand, need to
know the exact rules, even when they are unfortunately filled with "unless...,
except on Tuesdays, or blah blah".)

>...but for me it is
> much more incomprehensible).

I agree.  It's also bigger.

>...After all, there were those
> who thought the Algol-68 report comprehensible :-)

Not me.  Even after finally grokking Van Wijngaarden grammars, I was still
puzzled.  It didn't help that they used the "," symbol (I think?) for
concatenation, whereas I'm used to just juxtaposition, as in:

    assignment_statement ::= name := expression

versus:

    assignment_statement ::= name , := , expression

I'm sure you'll correct me if I'm misremembering this detail.

> But in simple cases, sure it is good to have things precise, though in
> any case where there is an issue of clear understanding, I mostly
> reject the notion of self-documenting code (*ESPECIALLY* complex
> conditions full of AND THEN, OR ELSE, and now case expressions and
> conditional expressions). Such complex expressions always need
> comments in addition to be clear, in my view.

Robert always says "always" when I usually say "usually".  ;-) But basically, I
agree -- it's good to have assertions AND comments.

****************************************************************

From: Robert Dewar
Date: Thursday, December 31, 2009  10:41 AM

...
>    X : Integer range 1..10 := 1;

Yes, good example. Though of course if I see

      X : Integer range 1 .. 10;

without a comment, I object strenuously. Where on earth is this arbitrary
constant "10" coming from?  It cries out for a comment.

In my MINIMAL language, the only constants allowed in code were 0 and 1. Every
other constant had to be named and declared and commented separately.

> versus:
>
>     assignment_statement ::= name , := , expression
>
> I'm sure you'll correct me if I'm misremembering this detail.

Yes, essentially, though := could never appear in the syntax like this.

As one of the elite club who could read the A68 report and was fluent in W
grammars, I have a biased view, though to be honest, the inaccessibility of the
revised report was a fatal blow. In particular, it delayed me getting involved
in A68 implementation for several years, which was quite unfortunate.

> Robert always says "always" when I usually say "usually".  ;-) But
> basically, I agree -- it's good to have assertions AND comments.

Right, always almost always means usually. And usually usually means nearly
always.

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  11:17 AM

> Yes, good example. Though of course if I see
>
>       X : Integer range 1 .. 10;
>
> without a comment, I object strenuously. Where on earth is this
> arbitrary constant "10" coming from?  It cries out for a comment.

Sure, but the comment isn't going to tell you that it's 10 -- it's going to tell
you WHY it's 10.

> In my MINIMAL language, the only constants allowed in code were 0 and
> 1. Every other constant had to be named and declared and commented
> separately.

You didn't allow "... range 1 .. 2**31-1"?
I could see requiring "Bits_Per_Word : constant := 32", but I really don't
want "Two : constant := 2" or "Binary_Radix : constant := 2", and
"range 1..Binary_Radix**(Bits_Per_Word - 1) - 1".  ;-)

To me, so long as a "magic number" is not duplicated, and is properly commented,
I don't think it necessarily needs to be a named constant.  I mean, referring to
"My_Buffer'Last" is just as good as "Buffer_Length", maybe better in some cases
(because fewer names to learn).

How about:

    Print_Error ("Error: file not found.");

?
****************************************************************

From: Robert Dewar
Date: Thursday, December 31, 2009  11:31 AM

>> In my MINIMAL language, the only constants allowed in code were 0 and
>> 1. Every other constant had to be named and declared and commented
>> separately.
>
> You didn't allow "... range 1 .. 2**31-1"?

Certainly not, that sounds like target-dependent crud which would never be
permitted in a MINIMAL program without all sorts of stuff.

> I could see requiring "Bits_Per_Word : constant := 32", but I really
> don't want "Two : constant := 2"
> or "Binary_Radix : constant := 2",
> and "range 1..Binary_Radix**(Bits_Per_Word - 1) - 1".  ;-)

Right, it proved overkill, but I still find that even good programmers litter
code with junk uncommented constants.

> To me, so long as a "magic number" is not duplicated, and is properly
> commented, I don't think it necessarily needs to be a named constant.
> I mean, referring to "My_Buffer'Last" is just as good as
> "Buffer_Length", maybe better in some cases (because fewer names to
> learn).
>
> How about:
>
>     Print_Error ("Error: file not found.");

Not sure what your point is here.

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  4:30 PM

> >     Print_Error ("Error: file not found.");
>
> Not sure what your point is here

You said no literals except in constant decls.
I wasn't sure whether you meant just numeric literals, or whether string
literals are included.  I don't want to be required to say:

    File_Not_Found_Message : constant String := "Error: file not found.";

    Print_Error (File_Not_Found_Message);

This is getting rather off topic.  Sorry.  (But you started it.  ;-) )

****************************************************************

From: Tucker Taft
Date: Thursday, December 31, 2009  11:05 AM

I recently got a comment from the CTO of Raytheon's Intelligence Division (the
same fellow who spoke at SIGAda 2009), who said he *loved* the Ada 95 reference
manual, because it was so precise and it read like a logic text book.  So there
are some out there who appreciated our efforts!  ;-)

****************************************************************

From: Robert Dewar
Date: Thursday, December 31, 2009  11:09 AM

Yes, well I know people who reacted that way to the
Algol-68 report! There will always be some who like to read logic text books!
There are those who like the (much more formal) PL/1 standard, but no
programmers read it.

It really comes down to who you want to read the standard: programmers, or
only people who can read logic text books?

I liked the Ada 83 RM because it achieved what had not been achieved since the
days of Algol-60, namely that typical programmers used it as a reference.
That's unfortunately less true today.

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  11:28 AM

> I liked the Ada 83 RM because it achieved what had not been achieved
> since the days of Algol-60, namely typical programmers used it as a
> reference. That's unfortunately less true today.

It's a worthy goal.  But I think the most important goal of a language standard
is to promote uniformity across implementations, and I don't want to sacrifice
that for readability.

As a programmer, I don't really care about the details of the overload
resolution rules, so long as they're not error prone (see C++) and so long as I
can trust that all compilers will agree on what they are.

I do think it's possible to get the best of both worlds, primarily by making
things simple and uniform, so you don't end up with rules that have 17
occurrences of "unless ...", "except ...".

****************************************************************

From: Robert Dewar
Date: Thursday, December 31, 2009  11:35 AM

> It's a worthy goal.  But I think the most important goal of a language
> standard is to promote uniformity across implementations, and I don't
> want to sacrifice that for readability.

I disagree. What's the point if, as in C, the standard is very clear on what is
portable and what is not, but no C programmers know the rules, so the rules end
up being close to useless?

> As a programmer, I don't really care about the details of the overload
> resolution rules, so long as they're not error prone (see C++) and so
> long as I can trust that all compilers will agree on what they are.
>
> I do think it's possible to get the best of both worlds, primarily by
> making things simple and uniform, so you don't end up with rules that
> have 17 occurrences of "unless ...", "except ...".

I am dubious, and I can't see any evidence that Ada 95 is implemented more
uniformly than Ada 83. Furthermore you have to rely on compiler writers to do
the right thing and not follow obvious errors in the RM (such as treating

    subtype X is integer range 1 .. 10;

as a non-static subtype in the original Ada 83).

Can you really document that the more formal approach of the RM has contributed
to an improvement in uniformity between compilers?  I don't believe it for a
moment.

In my experience by far the biggest source of incompatibilities in porting code
is differences in implementation-dependent stuff (packages, machine code
inserts, rep clauses, pragmas, attributes, representation, etc.).  It is very
seldom that we have run into compiler differences as a portability problem.

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  4:26 PM

> I disagree, what's the point if, as in C, the standard is very clear
> on what is portable and what is not, but no C programmers know the
> rules, so they end up being close to useless.

That seems like an orthogonal issue.  The problem in C is too much undefined
behavior ("erroneous" in Ada terms). That is, the C standard does not
sufficiently promote uniformity; it allows all manner of different behavior
across implementations.

A programmer doesn't need to understand the exact rules if misunderstanding
results in a compile-time error, as is the case for overload resolution in Ada
(mostly).

...
> I am dubious, and I can't see any evidence that Ada 95 is implemented
> more uniformly than Ada 83. Furthermore you have to rely on compiler
> writers to do the right thing and not follow obvious errors in the RM
> (such as treating
>
>     subtype X is integer range 1 .. 10;
>
> as a non-static subtype in the original Ada 83).

Agreed.  Back then, I thought such RM bugs needed urgent fixing.
But I've mellowed -- nowadays I often lecture the ARG that we don't need to fix
so-and-so wrong wording in the RM unless our fix will change the behavior of
some implementer or programmer.  If "everybody knows" what is meant, then don't
bother.

> Can you really document that the more formal approach of the RM has
> contributed to an improvement in uniformity between compilers.

No, I can't.  All I have is anecdotes.

But you have to admit that for "type T is range 1..3", T'Size = 2 on all Ada 95
implementations, whereas that was not true in Ada 83.  I know you don't like the
Ada 95 rule but at least it's uniform.

****************************************************************

From: Robert Dewar
Date: Friday, January 1, 2010  5:17 AM

> But you have to admit that for "type T is range 1..3", T'Size = 2 on
> all Ada 95 implementations, whereas that was not true in Ada 83.  I
> know you don't like the Ada 95 rule but at least it's uniform.

But not helpful, because 'Size has almost no impact. The real issue is in what
GNAT calls Object_Size, and that is left completely vague in the RM
unfortunately. Where it does have an impact, it has been a major source of
*non-portability* from Ada 83 to Ada 95 (e.g. requiring 31 bits for Natural in
a packed composite).
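
A sketch of the distinction (Object_Size is a GNAT-specific attribute):

    type T is range 1 .. 3;
    --  Ada 95 requires T'Size = 2 (the minimum number of bits), but a
    --  stand-alone object of type T still typically occupies a whole
    --  storage unit; in GNAT that is T'Object_Size, normally 8.

    --  Where 'Size does bite: packed composites.  Ada 95 makes
    --  Natural'Size = 31, so each component below needs 31 bits:
    type Vec is array (1 .. 8) of Natural;
    pragma Pack (Vec);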

It's hard to argue you have struck a blow for portability when you introduce a
major change that is significantly incompatible with nearly all existing
implementations and in practice causes major porting issues where none existed
before.

****************************************************************

From: Pascal Leroy
Date: Thursday, December 31, 2009  5:18 AM

> I'll admit I am curious why you have developed such
> negative feelings toward "all this contract business."
> Is it based on bad experience, or on the potential
> for trouble if preconditions are impure?
>
> We certainly use "pragma Assert" heavily, to good
> effect, for the equivalent of preconditions in our
> static analysis tool.  Usually there is also a
> comment such as "--#pre XXX" or "-- Requires: XXX".
> In most cases these would translate rather
> directly into "with Pre => XXX" which would seem
> to be a feature.  Where does the "active harm" arise?

(Gosh, I ended up writing such a long message, most of which is probably boring.
Sorry about that, but Tuck asked for it.  And happy new year to all.)

I have no problem with pragma Assert (although it's not as useful as one might
hope -- more later).  My problem with preconditions/postconditions/invariants is
that you are inventing a lot of language baggage for these features.  Remember
that any addition to the language has a cost just because it makes it more
complicated for users to learn and for implementors to support.  The question is
whether this cost is offset by the benefits to the user community at large.  I
believe that invariants are only marginally useful and will only help a small
category of Ada users.

Overall, I feel that Ada 83 struck the right balance with subtypes/constraints:
the concepts are easy to grasp, putting a constraint in a declaration is simple,
and it brings many benefits thanks to the runtime checks.  Note however that
constraints add up to a lot of complexity in terms of language description and
implementation.  This being said, they are undoubtedly one of the strong points
of Ada.  I'm all in favor of beefing up constraints, e.g. by supporting
non-contiguous scalar and discriminant constraints, but I think that
user-defined constraints/invariants are misguided.

In general I am wary of adding implicit calls to user code all over the place:
Ada is a language that is relatively WYSIWYG in the sense that you can get a
pretty good idea of what (user) code gets executed when, and invariants break
that property.  I fear that there is a lot of potential for users being
confused, e.g. by assuming that invariants are guaranteed at places they are not
(see Randy's message).  I also fear that nasty problems could be introduced
during maintenance: say that a subprogram P has a precondition that uses
function F, and that during maintenance F is changed to execute on a separate
task or to cache some of its computations.  This could have unpleasant
consequences either by detecting an invariant violation where there is none or
by failing to detect a violation.

Of course none of this is new and it can happen in the absence of invariants,
but I actually think that (non-trivial) invariants make it harder, not easier,
to reason about the program, because the code can be scattered in many different
places.

I guess I could change my mind if someone came up with realistic examples of
practical programs where invariants would bring benefits that would be hard to
achieve with the language as it stands.  But I am sick and tired of "not
Is_Empty(S)" being a precondition for Stack.Pop.  This is such a trivial
invariant that it's sure to be caught by the smallest amount of testing.  The
really interesting property of a Stack is the LIFO order, and that one is
actually much harder to express as an invariant.  (Exercise left to the reader:
how do you do that?)

[Irrelevant aside: on the topic of making constraints more useful, I think they
should be allowed in more places where only a type_mark is currently allowed.
For instance, I'd love to write:

  function Cos(X : Float_Type'Base) return Float_Type'Base range -1.0..1.0;

and having to declare a specific named subtype for that purpose is a pain.  I
realize that there are difficulties with doing this in the general case, but it
should be allowed for static constraints.  End of irrelevant aside.]
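
For reference, the named-subtype workaround mentioned in the aside would look
something like this (a sketch):

    subtype Cosine_Range is Float_Type'Base range -1.0 .. 1.0;
    function Cos (X : Float_Type'Base) return Cosine_Range;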

Enough with language design philosophizing.  You asked if I had had bad
experience with assertions.  I can think of three areas on which I worked in
recent years where assertions either (1) were not too useful or (2) would have
been useful but didn't work in the end:

1 - Numerical libraries

Surprisingly enough, scalar constraints are not all that useful in this area.
Expressing that Cos returns a number in the range -1.0..1.0 is good for clients
and for the compiler, but it's not the right way to ensure that the
implementation of Cos doesn't return values outside this range.  The right way
to do this is of course to *prove* that the formulae you use have the right
properties.  And if you have a bug, chances are that it's only going to show up
for a minuscule subset of the input values, so the constraint check is
unlikely to help catch it.

A non-trivial invariant that comes to mind in this area is that the
implementation of, say, Log, should be monotonically increasing (strangely
enough the RM has nothing to say about this).  This is a significant issue
because Log is implemented using polynomials over a bunch of intervals, and
there is potential for trouble at the junction between intervals.  When do you
check this invariant?  Surely not each time Log is called.  You could do that at
elaboration time (ignore the purity issue) but does it make sense to have an
expensive check that is always going to pass?
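
A minimal sketch of such an elaboration-time check (it only samples the
domain, so it cannot catch a non-monotonicity between the sampled points):

    with Ada.Numerics.Elementary_Functions;
    procedure Check_Log_Monotonic is
       use Ada.Numerics.Elementary_Functions;
       Prev : Float := Float'First;
       X    : Float;
    begin
       --  Check that Log never decreases between adjacent samples.
       for I in 1 .. 1_000_000 loop
          X := Float (I) * 1.0E-3;          --  samples 0.001 .. 1000.0
          pragma Assert (Log (X) >= Prev);
          Prev := Log (X);
       end loop;
    end Check_Log_Monotonic;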

Overall, the best approach is a combination of mathematical proof and exhaustive
testing in the areas that are known to be problematic (like the monotony
requirement).  Although I ended up having a few assertions in this code, they
were typically of the most trivial nature and wouldn't have benefited from
extensive language support.

2 - Ada Semanticist

There were a number of areas (e.g., generic instantiations -- surprised?) that
were particularly difficult and were generating an endless stream of bug
reports.  Moreover, these bugs were hard to analyze, because a corruption in a
part of the Diana tree would result in incorrect generated code miles away.  So
I tried to add assertions in some critical parts of the semanticist to verify
that the tree was well-formed.  I must confess that I never managed to express a
set of assertions that would work satisfactorily.  It all started simple: the
Sm_Foo attribute of this node must be a Dn_Bar that is frozen.  And then a
customer would complain that some legitimate code wouldn't compile: oh wait, it
can really be a Dn_Baz if it comes from a formal generic package; ah, but that's
not true if the actual is really limited; unless of course it has an access
discriminant, etc., etc.  Over time the assertions became a collection of weird
cases that had been known in the past to cause trouble.
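
In pragma Assert form, the starting point was something like this (Kind is a
hypothetical accessor; the other names are the illustrative ones above):

    pragma Assert
      (Kind (Sm_Foo (N)) = Dn_Bar and then Is_Frozen (Sm_Foo (N)));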

My point here is that if you have a really complicated problem at hand, which
you don't know how to express formally, it may be extremely hard to express
"interesting" invariants.

3 - Distributed Systems

The stuff that I am working on at Google has lots of "quasi-invariants", things
that are true most of the time, but may be violated under some circumstances.
Here is an example: I was recently working on a component that gets work items
from some storage system, orders them according to rather complicated criteria
(including urgency and the nature of the work that needs to be done) and ships
them to various servers for processing; it also monitors the progress of the
processing.

A very important invariant of this component is that it tries hard to guarantee
some deadlines.  In normal operations, each work item must be processed within X
seconds.  However, problems can and do happen: hardware dies, machines get
oversubscribed, the network gets congested, etc.  In this case, the system is
expected to degrade gracefully and to continue making progress although some
deadlines might be violated.

I was interested in writing invariants to ensure that the deadline management
was correct (if only because an occasional failure of that part would be hard to
detect: a small needle in a very big haystack).  I spent two days trying to
write assertions before giving up: I just couldn't find any assertion that would
express something interesting about the scheduling algorithm *and* would be true
in the presence of various kinds of failures.

What I ended up doing was to write tests that checked that the deadlines were
guaranteed on a healthy system, and other tests that checked that the system was
still making some progress if I injected failures.  Effectively, the invariants
are checked in the tests, not in the production code.

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  3:59 PM

Ah, now I see the "monotonic log" thing.  I wonder why Tuck saw this hours
ago, and I just got it.

Pascal Leroy wrote:

> (Gosh, I ended up writing such a long message, most of which is
> probably boring.  Sorry about that, but Tuck asked for it.  And happy
> new year to
> all.)

Somebody famous wrote something like "sorry this letter is so long; I didn't
have time to write a short one".  ;-)

Thanks for your opinions.

And Happy New Year to you, too!

> I have no problem with pragma Assert (although it's not as useful as
> one might hope -- more later).  My problem with
> preconditions/postconditions/invariants is that you are inventing a
> lot of language baggage for these features.  Remember that any
> addition to the language has a cost just because it makes it more
> complicated for users to learn and for implementors to support.  The
> question is whether this cost is offset by the benefits to the user
> community at large.  I believe that invariants are only marginally
> useful and will only help a small category of Ada users.
>
> Overall, I feel that Ada 83 struck the right balance with
> subtypes/constraints:  the concepts are easy to grasp, putting a
> constraint in a declaration is simple, and it brings many benefits
> thanks to the runtime checks.  Note however that constraints add up to
> a lot of complexity in terms of language description and
> implementation.  This being said, they are undoubtedly one of the
> strong points of Ada.  I'm all in favor of beefing up constraints,
> e.g. by supporting non-contiguous scalar and discriminant constraints,
> but I think that user-defined constraints/invariants are misguided.
>
> In general I am wary of adding implicit calls to user code all over the
> place: Ada is a language that is relatively WYSIWYG in the sense that
> you can get a pretty good idea of what (user) code gets executed when,
> and invariants break that property.

I see your point, but there are other features that do likewise:
finalization, record component defaults, parameter defaults, Adjust -- all of
these cause code to be executed far away from where it is written.  Dispatching
calls are similar (they call something you don't see at the call site). All these
things cause mysterious behavior when stepping through a program in a debugger.

>...I fear that there is a lot of potential for users being confused,
> e.g. by assuming that invariants are guaranteed at places they are not
> (see Randy's message).  I also fear that nasty problems could be
> introduced during maintenance: say that a subprogram P has a
> precondition that uses function F, and that during maintenance F is
> changed to execute on a separate task or to cache some of its
> computations.  This could have unpleasant consequences either by
> detecting an invariant violation where there is none or by failing to
> detect a violation.

I don't understand the above.  If P's invariant calls F, then F will be called
by whatever task calls P, so I don't understand what you mean by "F is changed
to execute on a separate task".  Maybe you meant, "the program is changed so it
now calls P from multiple tasks"?  Well, yeah, the same issues arise if F is
called from P's body.  I also don't understand the cache issue. If F uses a
cache, you'd still want it to return the same results for the same actuals.

> Of course none of this is new and it can happen in the absence of
> invariants, but I actually think that (non-trivial) invariants make it
> harder, not easier, to reason about the program, because the code can
> be scattered in many different places.
>
> I guess I could change my mind if someone came up with realistic
> examples of practical programs where invariants would bring benefits
> that would be hard to achieve with the language as it stands. ...

OK.  In GNAT, we have a type Node_Id.  Conceptually, it's a pointer to a variant
record, or pointer to class-wide.  But it's actually implemented as an integer
index into some table(s), and there's lots of getter/setter procedures.

There are thousands of procedures that take parameters of type Node_Id, but that
expect the Kind field to be in some set of enumeration values.  If I had
preconditions, I would state this expectation on such parameters.  Better, if I
had invariants/user-def-constraints, I would declare subtypes such as
Callable_Entity_Node_Id, and use those as parameter subtypes.

This seems hugely useful to me, for documenting what node kinds are expected,
and for debugging (catch the bug early, before it wipes its fingerprints).

>...  But I am sick and tired of "not Is_Empty(S)" being a precondition
> for Stack.Pop.  This is such a trivial invariant that it's sure to be
> caught by the smallest amount of testing.  The really interesting
> property of a Stack is the LIFO order, and that one is actually much
> harder to express as an invariant.  (Exercise left to the reader:
> how do you do that?)

I don't know.  Good point.

> Overall, the best approach is a combination of mathematical proof and
> exhaustive testing in the areas that are known to be problematic (like
> the monotony requirement).

I think a monotony requirement would require Log to be monotonous (boring).
Gary will correct me if I'm wrong, but I think "monotonicity requirement"
is what you mean.  ;-)

> My point here is that if you have a really complicated problem at
> hand, which you don't know how to express formally, it may be
> extremely hard to express "interesting" invariants.

But it seems like you need to understand those invariants to write the code.
If you don't have invariants, you better use comments.  If you don't use
comments, then you're in trouble.  I really don't see how to write code in an
Ada front end if I don't know what node kinds I'm dealing with every step of the
way.

> The stuff that I am working on at Google has lots of
> "quasi-invariants", things that are true most of the time, but may be
> violated under some circumstances.

Yeah, invariants/user-def-constraints won't help here.  Maybe you should write
code that logs potential failures to a log file, with info to help diagnose
problems "by hand".

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  4:08 PM

> [Irrelevant aside: on the topic of making constraints more useful, I
> think they should be allowed in more places where only a type_mark is
> currently allowed.  For instance, I'd love to write:
>
>   function Cos(X : Float_Type'Base) return Float_Type'Base range
> -1.0..1.0;
>
> and having to declare a specific named subtype for that purpose is a pain.
>  I realize that there are difficulties with doing this in the general
> case, but it should be allowed for static constraints.  End of
> irrelevant aside.]

Early versions of Ada (1980?) allowed this.  I think the rationale for removing
it was that it would be unclear when the constraint gets evaluated:  On the
spec, on the body, both (check that it gets the same answer), on the call? I
don't buy it -- the obvious answer is "on the spec".

Your answer (require it to be static) is also acceptable.

Or you could come up with rules about how it must statically denote a
(possibly-non-static) constant, or be an expression involving such.

As you say, "irrelevant aside" -- we're not likely to bother with this.

****************************************************************

From: Pascal Leroy
Date: Friday, January 1, 2010  4:06 AM

>> I guess I could change my mind if someone came up with realistic
>> examples of practical programs where invariants would bring benefits
>> that would be hard to achieve with the language as it stands. ...
>
> OK. In GNAT, we have a type Node_Id. Conceptually, it's a pointer to
> a variant record, or pointer to class-wide. But it's actually
> implemented as an integer index into some table(s), and there's lots
> of getter/setter procedures.
>
> There are thousands of procedures that take parameters of type
> Node_Id, but that expect the Kind field to be in some set of
> enumeration values. If I had preconditions, I would state this
> expectation on such parameters. Better, if I had
> invariants/user-def-constraints, I would declare subtypes such as
> Callable_Entity_Node_Id, and use those as parameter subtypes.
>
> This seems hugely useful to me, for documenting what node kinds are
> expected, and for debugging (catch the bug early, before it wipes its
> fingerprints).

That's an interesting example, but if I were trying to tackle this problem from
scratch with Ada 2005, I would strive for *static* checking using strong typing.
For instance:

   -- Abstract root type with all the operations common to all nodes.
   type Node_Id is abstract tagged ...;
   function Kind(N : Node_Id) return Node_Kind is abstract;

   -- Now define interfaces for all the useful categories, with
   -- appropriate primitive ops.
   type Callable_Entity is interface;
   function Convention(E : Callable_Entity) return Language is abstract;

   type Program_Unit is interface;
   function Address(P : Program_Unit) return Expression is abstract;

   -- Here is a concrete type that uses interfaces to specify all the
   -- categories it belongs to.
   type Subprogram is new Node_Id and Callable_Entity and Program_Unit with ...;
   -- Must define Convention and Address.

   -- Now you can use the interfaces to specify exactly what parameters
   -- you expect:
   function Are_Mode_Conformant(E1, E2 : Callable_Entity'Class) return Boolean;

Of course, some work is required to put the proper type structure in place and
to test it, but the great news is that with interfaces you are not restricted to
hierarchical classifications, you can have as many classifications as you like,
and you can add more during maintenance without having to do major restructuring
of your code.

In my book, static checking is superior to any kind of invariant/assertion/check
that you can dream of.  I suspect that you have been using Ada 83 for too long,
and that you are trying to solve problems that wouldn't exist if you started
from scratch with a more modern dialect of the language.

>> The really interesting property of a Stack is the LIFO order, and
>> that one is actually much harder to express as an invariant.
>> (Exercise left to the
>> reader: how do you do that?)
>
> I don't know. Good point.

I don't know either, which is bothering me.  If we are not able to express a
useful invariant for a stack, the simplest of data structures, how can we hope
to express invariants for real life problems?

****************************************************************

From: Edmond Schonberg
Date: Saturday, January 2, 2010  1:42 PM

> I guess I could change my mind if someone came up with realistic
> examples of practical programs where invariants would bring benefits
> that would be hard to achieve with the language as it stands.  But I
> am sick and tired of "not Is_Empty(S)" being a precondition for
> Stack.Pop.  This is such a trivial invariant that it's sure to be
> caught by the smallest amount of testing.  The really interesting
> property of a Stack is the LIFO order, and that one is actually much
> harder to express as an invariant.  (Exercise left to the reader:
> how do you do that?)

There is one simple advantage that has dropped out of the discussion:
anything that forces the programmer to examine his thought processes and make
them explicit is a good thing.  Having to restate the intent of the code in a
different form (be it comments or invariants) helps in recognizing errors
earlier on.  It's like an introverted code review, where you have to verbalize
and justify the structure and details of the code to yourself.  We claim the
same advantages for strong typing: it forces you to make explicit some
assumptions that in other languages go unstated (and are often wrong).
Invariants are just the next step in forcing the semi-conscious to become fully
conscious!

The LIFO predicate is usually written:  Pop (Push (X, Stack)) = X
but I doubt that this would be very useful, and it is of course incorrect in the
presence of multiple tasks.  The monotonicity of Log is probably not the kind of
thing that you would want to make into an executable assertion:
   forall (X, Y):  X > Y =>  Log (X) > Log (Y)
and iterating over a real range is not a very good idea.  So certainly there are
predicates that we had better prove as properties of the algorithms.  Still, I
would not generalize this to say that invariants in discrete domains are
useless or harmful.
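
An executable (if partial) rendering in the aspect notation under discussion
would pin down a single Push/Pop step rather than the full LIFO history (Top,
Length and Is_Empty are assumed operations of the abstraction):

    procedure Push (S : in out Stack; X : in Item)
      with Post => Top (S) = X
                     and then Length (S) = Length (S)'Old + 1;

    procedure Pop (S : in out Stack; X : out Item)
      with Pre  => not Is_Empty (S),
           Post => X = Top (S)'Old
                     and then Length (S) = Length (S)'Old - 1;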

****************************************************************

From: Bob Duff
Date: Saturday, January 2, 2010  2:30 PM

> There is one simple advantage that has dropped out of the discussion:
> anything that forces the programmer to examine his thought processes
> and make them explicit is a good thing.  Having to restate the intent
> of the code in a different form (be it comments or invariants) helps
> in recognizing errors earlier on.  It's like an introverted code
> review, where you have to verbalize and justify the structure and
> details of the code to yourself.  We claim the same advantages for
> strong typing: it forces you to make explicit some assumptions that in
> other languages go unstated (and are often wrong).  Invariants are
> just the next step in forcing the semi-conscious to become fully
> conscious!

Well said.

****************************************************************

