!standard 13.3.2(00)                               10-02-19  AC95-00192/01
!class Amendment 10-02-19
!status received no action 10-02-19
!status received 09-11-18
!subject Preconditions are actively harmful
!summary
!appendix

From: Pascal Leroy
Date: Tuesday, December 29, 2009  5:18 AM

[From a thread filed in AI05-0145-2.]

> One hopes that there will exist tools that detect and warn about "bad"
> preconditions (and all of the other sorts of contracts that we're planning
> to add). Without that, it will be very hard to reason about anything.

I couldn't agree more: these contracts are either pure-ish and not terribly
useful, or full of dependencies on global/external state, and devilishly
hard to analyze. This is exactly why I think that all this contract business
is useless at best and actively harmful at worst.

****************************************************************

From: Tucker Taft
Date: Thursday, December 31, 2009  8:38 AM

I have a lot of experience using assertions, and it is all good. I have no
experience using preconditions, postconditions, or invariants per se, though
of course assertions can mimic all these roles, and again, when used as
such, my experiences have been positive.

It is certainly true that during maintenance some assertions fail, but that
is usually a very useful thing, as it makes one go back and review
assumptions that might or might not still hold.

On a nightly basis, we run regression tests both with a version with
assertions turned on and with a version with assertion checks suppressed.
We ship the version with assertion checks suppressed (but normal constraint
checks remain on). The version with assertions turned on is significantly
slower, but still usable. We have an even slower version, which activates
additional assertions, some analogous to your suggested expensive monotonic
log assertion. We run a few tests nightly with this "slowest" version.

The proportion of bugs caught by assertions vs. other means is very high
(probably > 50%). Of course they will probably be caught eventually by some
other means, but the assertion failures tend to be easier to debug because
they identify the problem at a higher level than might a constraint check
failure, or heaven forbid a core dump. (Hey, we can use the word "core"
again without feeling over 50 years old!)

Very occasionally a complex assertion creates its own problems, but that is
almost vanishingly rare. Much more common is when turning on debugging
output causes the results to change, because the elaborate tree dumps,
etc., change caching behavior.

One difference may be that most of these assertions were inserted very
early on. I can imagine that going in after the fact to add assertions,
preconditions, postconditions, etc., might be difficult. But it seems very
natural to add a precondition to almost every subprogram. I'll admit I
would probably make significantly less use of postconditions and
invariants.

I guess we already know that your mileage may vary...
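
To make this concrete, here is a minimal sketch of the "precondition on
almost every subprogram" style. The names are invented, and the Pre aspect
uses the syntax under discussion in AI05-0145; the pragma Assert form is
what one would write today:

   -- Aspect form (proposed syntax; hypothetical subprogram):
   procedure Transfer (Source, Target : in out Account; Amount : Money)
      with Pre => Amount > 0.0 and then Balance (Source) >= Amount;

   -- The same check with a plain assertion at the top of the body:
   procedure Transfer (Source, Target : in out Account; Amount : Money) is
   begin
      pragma Assert (Amount > 0.0 and then Balance (Source) >= Amount);
      ...  -- the actual work
   end Transfer;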

****************************************************************

From: Robert Dewar
Date: Thursday, December 31, 2009  9:03 AM

I echo all Tuck's observations, and his comments apply almost unchanged to
the GNAT compiler sources.

One interesting extension in GNAT is pragma Check:

   pragma Check (check_identifier, condition [, message]);

which allows categorization of assertions, individually controllable by

   pragma Check_Policy (check_identifier, ON | OFF | CHECK | IGNORE);

In fact, preconditions and postconditions are just special cases of
categorized assertions, controllable using those identifiers.
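
A quick sketch of how the two pragmas combine in practice (the check name
and condition here are hypothetical):

   -- A categorized assertion: this check belongs to the Tree_Invariant
   -- category.
   pragma Check (Tree_Invariant, Is_Well_Formed (Root), "malformed tree");

   -- Disable the whole category for a production build; the conditions in
   -- that category are then not evaluated at all.
   pragma Check_Policy (Tree_Invariant, OFF);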

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  9:12 AM

> I have a lot of experience using assertions, and it is all good. I
> have no experience using preconditions, postconditions, or invariants
> per se, ...

I agree with your ramblings. Two general comments:

Please include "user-defined constraints" in your list with
"preconditions, postconditions, or invariants". I still wonder whether the
user-defined constraints AI and the invariants AI should be combined, but I
haven't had time to work on that. Either way, I continue to believe that
user-defined constraints are more useful than invariants, as currently
defined.

Another advantage of all these things is documentation. And it's better
documentation than you get with comments, because it is more likely to be
true (and, more importantly, more likely to stay true during maintenance).
Also, it's written in a precise language, rather than English, so it's more
likely to be understood.

> We have
> an even slower version, which activates additional assertions, some
> analogous to your suggested expensive monotonic log assertion.

I missed the monotonic log assertion -- what's that?

The slower version you're talking about contains two kinds of assertions in
addition to the somewhat-slow version: (1) assertions that are very slow
because they walk a giant data structure to assert global properties of it,
and (2) assertions that occur in very low-level procedures that hardly ever
change, so they probably no longer have many bugs, and are called often, so
the assertions are expensive.

****************************************************************

From: Robert Dewar
Date: Thursday, December 31, 2009  9:35 AM

> Another advantage of all these things is documentation.
> And it's better documentation than you get with comments, because it
> is more likely to be true (and, more importantly, more likely to stay
> true during maintenance). Also, it's written in a precise language,
> rather than English, so it's more likely to be understood.

I find the last sentence remarkable: the idea that using a precise language
makes things more likely to be understood is a fallacy much shared by
formal folks (no doubt there are people who think the current Ada RM is
more likely to be understood than the original, but for me it is much more
incomprehensible). After all, there were those who thought the Algol-68
report comprehensible :-)

But in simple cases, sure it is good to have things precise, though in any
case where there is an issue of clear understanding, I mostly reject the
notion of self-documenting code (*ESPECIALLY* complex conditions full of
AND THEN, OR ELSE, and now case expressions and conditional expressions).
Such complex expressions always need comments in addition if they are to be
clear, in my view.

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009 10:22 AM

> I find the last sentence remarkable: the idea that using a precise
> language makes things more likely to be understood is a fallacy much
> shared by formal folks...

You are right. I guess I'd claim that formality makes things easier to
understand, and also harder to understand. ;-)

An example of the former is:

   X : Integer := 1;  -- X is always between 1 and 10.

versus:

   X : Integer range 1..10 := 1;

where the comment is confusing (did the author mean "between 1 and 10,
inclusive"?). It really means: the original programmer thought X should be
in 2..9 (or maybe 1..10), but they might have been wrong, and even if they
were right, it might no longer be true.

> ...(no doubt there are people who think the current Ada RM is more
> likely to be understood than the original, ...

We received one comment to that effect during Ada 9X. But most people
(including me) think the Ada 95 RM is harder to understand than the Ada 83
RM, for programmers. (Compiler writers, on the other hand, need to know the
exact rules, even when they are unfortunately filled with "unless...,
except on Tuesdays, or blah blah".)

> ...but for me it is much more incomprehensible).

I agree. It's also bigger.

> ...After all, there were those who thought the Algol-68 report
> comprehensible :-)

Not me. Even after finally grokking Van Wijngaarden grammars, I was still
puzzled. It didn't help that they used the "," symbol (I think?) for
concatenation, whereas I'm used to just juxtaposition, as in:

   assignment_statement ::= name := expression

versus:

   assignment_statement ::= name , := , expression

I'm sure you'll correct me if I'm misremembering this detail.

> But in simple cases, sure it is good to have things precise, though in
> any case where there is an issue of clear understanding, I mostly
> reject the notion of self-documenting code (*ESPECIALLY* complex
> conditions full of AND THEN, OR ELSE, and now case expressions and
> conditional expressions). Such complex expressions always need
> comments in addition if they are to be clear, in my view.

Robert always says "always" when I usually say "usually". ;-) But
basically, I agree -- it's good to have assertions AND comments.

****************************************************************

From: Robert Dewar
Date: Thursday, December 31, 2009 10:41 AM

...
> X : Integer range 1..10 := 1;

Yes, good example. Though of course if I see

   X : Integer range 1 .. 10;

without a comment, I object strenuously. Where on earth is this arbitrary
constant "10" coming from? Cries out for a comment.

In my MINIMAL language, the only constants allowed in code were 0 and 1.
Every other constant had to be named and declared and commented separately.

> versus:
>
>    assignment_statement ::= name , := , expression
>
> I'm sure you'll correct me if I'm misremembering this detail.

Yes, essentially, though := could never appear in the syntax like this. As
one of the elite club who could read the A68 report and was fluent in W
grammars, I have a biased view, though to be honest, the inaccessibility of
the revised report was a fatal blow. In particular, it delayed me getting
involved in A68 implementation for several years, which was quite
unfortunate.

> Robert always says "always" when I usually say "usually". ;-) But
> basically, I agree -- it's good to have assertions AND comments.

Right, always almost always means usually. And usually usually means nearly
always.

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009 11:17 AM

> Yes, good example. Though of course if I see
>
>    X : Integer range 1 .. 10;
>
> without a comment, I object strenuously. Where on earth is this
> arbitrary constant "10" coming from? Cries out for a comment.
Sure, but the comment isn't going to tell you that it's 10 -- it's going to
tell you WHY it's 10.

> In my MINIMAL language, the only constants allowed in code were 0 and
> 1. Every other constant had to be named and declared and commented
> separately.

You didn't allow "... range 1 .. 2**31-1"?

I could see requiring "Bits_Per_Word : constant := 32", but I really don't
want "Two : constant := 2" or "Binary_Radix : constant := 2", and
"range 1..Binary_Radix**(Bits_Per_Word - 1) - 1". ;-)

To me, so long as a "magic number" is not duplicated, and is properly
commented, I don't think it necessarily needs to be a named constant. I
mean, referring to "My_Buffer'Last" is just as good as "Buffer_Length",
maybe better in some cases (because fewer names to learn).

How about:

   Print_Error ("Error: file not found.");

?

****************************************************************

From: Robert Dewar
Date: Thursday, December 31, 2009 11:31 AM

>> In my MINIMAL language, the only constants allowed in code were 0 and
>> 1. Every other constant had to be named and declared and commented
>> separately.
>
> You didn't allow "... range 1 .. 2**31-1"?

Certainly not; that sounds like target-dependent crud which would never be
permitted in a MINIMAL program without all sorts of stuff.

> I could see requiring "Bits_Per_Word : constant := 32", but I really
> don't want "Two : constant := 2"
> or "Binary_Radix : constant := 2",
> and "range 1..Binary_Radix**(Bits_Per_Word - 1) - 1". ;-)

Right, it proved overkill, but I still find that even good programmers
scatter code with junk uncommented constants.

> To me, so long as a "magic number" is not duplicated, and is properly
> commented, I don't think it necessarily needs to be a named constant.
> I mean, referring to "My_Buffer'Last" is just as good as
> "Buffer_Length", maybe better in some cases (because fewer names to
> learn).
>
> How about:
>
>    Print_Error ("Error: file not found.");

Not sure what your point is here.

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  4:30 PM

> > Print_Error ("Error: file not found.");
>
> Not sure what your point is here.

You said no literals except in constant decls. I wasn't sure whether you
meant just numeric literals, or whether string literals are included. I
don't want to be required to say:

   File_Not_Found_Message : constant String := "Error: file not found.";

   Print_Error (File_Not_Found_Message);

This is getting rather off topic. Sorry. (But you started it. ;-) )

****************************************************************

From: Tucker Taft
Date: Thursday, December 31, 2009 11:05 AM

I recently got a comment from the CTO of Raytheon's Intelligence Division
(the same fellow who spoke at SIGAda 2009), who said he *loved* the Ada 95
reference manual, because it was so precise and it read like a logic
textbook. So there are some out there who appreciated our efforts! ;-)

****************************************************************

From: Robert Dewar
Date: Thursday, December 31, 2009 11:09 AM

Yes, well I know people who reacted that way to the Algol-68 report! There
will always be some who like to read logic textbooks! There are those who
like the (much more formal) PL/1 standard, but no programmers read it.

It really comes down to whom you want to read the standard: programmers, or
only people who can read logic textbooks?
I liked the Ada 83 RM because it achieved what had not been achieved since
the days of Algol-60, namely that typical programmers used it as a
reference. That's unfortunately less true today.

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009 11:28 AM

> I liked the Ada 83 RM because it achieved what had not been achieved
> since the days of Algol-60, namely that typical programmers used it as
> a reference. That's unfortunately less true today.

It's a worthy goal. But I think the most important goal of a language
standard is to promote uniformity across implementations, and I don't want
to sacrifice that for readability.

As a programmer, I don't really care about the details of the overload
resolution rules, so long as they're not error prone (see C++) and so long
as I can trust that all compilers will agree on what they are.

I do think it's possible to get the best of both worlds, primarily by
making things simple and uniform, so you don't end up with rules that have
17 occurrences of "unless ...", "except ...".

****************************************************************

From: Robert Dewar
Date: Thursday, December 31, 2009 11:35 AM

> It's a worthy goal. But I think the most important goal of a language
> standard is to promote uniformity across implementations, and I don't
> want to sacrifice that for readability.

I disagree. What's the point if, as in C, the standard is very clear on
what is portable and what is not, but no C programmers know the rules, so
the rules end up being close to useless?

> As a programmer, I don't really care about the details of the overload
> resolution rules, so long as they're not error prone (see C++) and so
> long as I can trust that all compilers will agree on what they are.
>
> I do think it's possible to get the best of both worlds, primarily by
> making things simple and uniform, so you don't end up with rules that
> have 17 occurrences of "unless ...", "except ...".

I am dubious, and I can't see any evidence that Ada 95 is implemented more
uniformly than Ada 83. Furthermore, you have to rely on compiler writers to
do the right thing and not follow obvious errors in the RM (such as
treating

   subtype X is integer range 1 .. 10;

as a non-static subtype, in the original Ada 83).

Can you really document that the more formal approach of the RM has
contributed to an improvement in uniformity between compilers? I don't
believe it for a moment. In my experience, by far the biggest source of
incompatibilities in porting code is differences in
implementation-dependent stuff (packages, machine code inserts, rep
clauses, pragmas, attributes, representation, etc.). It is very seldom that
we have run into compiler differences as a portability problem.

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  4:26 PM

> I disagree. What's the point if, as in C, the standard is very clear
> on what is portable and what is not, but no C programmers know the
> rules, so the rules end up being close to useless?

That seems like an orthogonal issue. The problem in C is too much undefined
behavior ("erroneous" in Ada terms). That is, the C standard does not
sufficiently promote uniformity; it allows all manner of different behavior
across implementations.

A programmer doesn't need to understand the exact rules if misunderstanding
results in a compile-time error, as is the case for overload resolution in
Ada (mostly).

...
> I am dubious, and I can't see any evidence that Ada 95 is implemented
> more uniformly than Ada 83. Furthermore, you have to rely on compiler
> writers to do the right thing and not follow obvious errors in the RM
> (such as treating
>
>    subtype X is integer range 1 .. 10;
>
> as a non-static subtype, in the original Ada 83).

Agreed. Back then, I thought such RM bugs needed urgent fixing. But I've
mellowed -- nowadays I often lecture the ARG that we don't need to fix
so-and-so wrong wording in the RM unless our fix will change the behavior
of some implementer or programmer. If "everybody knows" what is meant, then
don't bother.

> Can you really document that the more formal approach of the RM has
> contributed to an improvement in uniformity between compilers?

No, I can't. All I have is anecdotes. But you have to admit that for "type
T is range 1..3", T'Size = 2 on all Ada 95 implementations, whereas that
was not true in Ada 83. I know you don't like the Ada 95 rule, but at least
it's uniform.

****************************************************************

From: Robert Dewar
Date: Friday, January 1, 2010  5:17 AM

> But you have to admit that for "type T is range 1..3", T'Size = 2 on
> all Ada 95 implementations, whereas that was not true in Ada 83. I
> know you don't like the Ada 95 rule, but at least it's uniform.

But not helpful, because 'Size has almost no impact. The real issue is in
what GNAT calls Object_Size, and that is left completely vague in the RM,
unfortunately. Where it does have an impact, it has been a major source of
*non-portability* from Ada 83 to Ada 95 (e.g., requiring 31 bits for
Natural in a packed composite). It's hard to argue that you have struck a
blow for portability when you introduce a major change that is
significantly incompatible with nearly all existing implementations and in
practice causes major porting issues where none existed before.

****************************************************************

From: Pascal Leroy
Date: Thursday, December 31, 2009  5:18 AM

> I'll admit I am curious why you have developed such
> negative feelings toward "all this contract business."
> Is it based on bad experience, or on the potential
> for trouble if preconditions are impure?
>
> We certainly use "pragma Assert" heavily, to good
> effect, for the equivalent of preconditions in our
> static analysis tool. Usually there is also a
> comment such as "--#pre XXX" or "-- Requires: XXX".
> In most cases these would translate rather
> directly into "with Pre => XXX" which would seem
> to be a feature. Where does the "active harm" arise?

(Gosh, I ended up writing such a long message, most of which is probably
boring. Sorry about that, but Tuck asked for it. And happy new year to
all.)

I have no problem with pragma Assert (although it's not as useful as one
might hope -- more later). My problem with
preconditions/postconditions/invariants is that you are inventing a lot of
language baggage for these features. Remember that any addition to the
language has a cost just because it makes it more complicated for users to
learn and for implementors to support. The question is whether this cost is
offset by the benefits to the user community at large. I believe that
invariants are only marginally useful and will only help a small category
of Ada users.

Overall, I feel that Ada 83 struck the right balance with
subtypes/constraints: the concepts are easy to grasp, putting a constraint
in a declaration is simple, and it brings many benefits thanks to the
runtime checks.
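
As a minimal illustration of that balance (the names here are invented):
the constraint is written once, in the declaration, and the runtime check
then comes for free at every assignment or parameter passing:

   subtype Percent is Integer range 0 .. 100;

   Load : Percent := 0;
   ...
   Load := Compute_Load;  -- raises Constraint_Error if outside 0 .. 100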
Note however that constraints add up to a lot of complexity in terms of
language description and implementation. This being said, they are
undoubtedly one of the strong points of Ada. I'm all in favor of beefing up
constraints, e.g. by supporting non-contiguous scalar and discriminant
constraints, but I think that user-defined constraints/invariants are
misguided.

In general I am wary of adding implicit calls to user code all over the
place: Ada is a language that is relatively WYSIWYG in the sense that you
can get a pretty good idea of what (user) code gets executed when, and
invariants break that property. I fear that there is a lot of potential for
users being confused, e.g. by assuming that invariants are guaranteed at
places they are not (see Randy's message). I also fear that nasty problems
could be introduced during maintenance: say that a subprogram P has a
precondition that uses function F, and that during maintenance F is changed
to execute on a separate task or to cache some of its computations. This
could have unpleasant consequences, either by detecting an invariant
violation where there is none or by failing to detect a violation. Of
course none of this is new and it can happen in the absence of invariants,
but I actually think that (non-trivial) invariants make it harder, not
easier, to reason about the program, because the code can be scattered in
many different places.

I guess I could change my mind if someone came up with realistic examples
of practical programs where invariants would bring benefits that would be
hard to achieve with the language as it stands. But I am sick and tired of
not Is_Empty(S) being a precondition for Stack.Pop. This is such a trivial
invariant that it's sure to be caught by the smallest amount of testing.
The really interesting property of a Stack is the LIFO order, and that one
is actually much harder to express as an invariant. (Exercise left to the
reader: how do you do that?)

[Irrelevant aside: on the topic of making constraints more useful, I think
they should be allowed in more places where only a type_mark is currently
allowed. For instance, I'd love to write:

   function Cos (X : Float_Type'Base) return Float_Type'Base
      range -1.0 .. 1.0;

and having to declare a specific named subtype for that purpose is a pain.
I realize that there are difficulties with doing this in the general case,
but it should be allowed for static constraints. End of irrelevant aside.]

Enough with language design philosophizing. You asked if I had had bad
experience with assertions. I can think of three areas on which I worked in
recent years where assertions were either (1) not too useful or (2) would
have been useful but didn't work in the end:

1 - Numerical libraries

Surprisingly enough, scalar constraints are not all that useful in this
area. Expressing that Cos returns a number in the range -1.0..1.0 is good
for clients and for the compiler, but it's not the right way to ensure that
the implementation of Cos doesn't return values outside this range. The
right way to do this is of course to *prove* that the formulae you use have
the right properties. And if you have a bug, chances are that it's only
going to show up for a minuscule subset of the input values, so the
constraint check is unlikely to help catch it.

A non-trivial invariant that comes to mind in this area is that the
implementation of, say, Log should be monotonically increasing (strangely
enough, the RM has nothing to say about this).
This is a significant issue because Log is implemented using polynomials
over a bunch of intervals, and there is potential for trouble at the
junction between intervals. When do you check this invariant? Surely not
each time Log is called. You could do that at elaboration time (ignore the
purity issue), but does it make sense to have an expensive check that is
always going to pass? Overall, the best approach is a combination of
mathematical proof and exhaustive testing in the areas that are known to be
problematic (like the monotony requirement). Although I ended up having a
few assertions in this code, they were typically of the most trivial nature
and wouldn't have benefited from extensive language support.

2 - Ada Semanticist

There were a number of areas (e.g., generic instantiations -- surprised?)
that were particularly difficult and were generating an endless stream of
bug reports. Moreover, these bugs were hard to analyze, because a
corruption in a part of the Diana tree would result in incorrect generated
code miles away. So I tried to add assertions in some critical parts of the
semanticist to verify that the tree was well-formed. I must confess that I
never managed to express a set of assertions that would work
satisfactorily. It all started simple: the Sm_Foo attribute of this node
must be a Dn_Bar that is frozen. And then a customer would complain that
some legitimate code wouldn't compile: oh wait, it can really be a Dn_Baz
if it comes from a formal generic package; ah, but that's not true if the
actual is really limited; unless of course it has an access discriminant,
etc., etc. Over time the assertions became a collection of weird cases that
had been known in the past to cause trouble.

My point here is that if you have a really complicated problem at hand,
which you don't know how to express formally, it may be extremely hard to
express "interesting" invariants.

3 - Distributed Systems

The stuff that I am working on at Google has lots of "quasi-invariants",
things that are true most of the time, but may be violated under some
circumstances. Here is an example: I was recently working on a component
that gets work items from some storage system, orders them according to
rather complicated criteria (including urgency and the nature of the work
that needs to be done) and ships them to various servers for processing; it
also monitors the progress of the processing. A very important invariant of
this component is that it tries hard to guarantee some deadlines. In normal
operations, each work item must be processed within X seconds. However,
problems can and do happen: hardware dies, machines get oversubscribed, the
network gets congested, etc. In this case, the system is expected to
degrade gracefully and to continue making progress although some deadlines
might be violated.

I was interested in writing invariants to ensure that the deadline
management was correct (if only because an occasional failure of that part
would be hard to detect: a small needle in a very big haystack). I spent
two days trying to write assertions before giving up: I just couldn't find
any assertion that would express something interesting about the scheduling
algorithm *and* would be true in the presence of various kinds of failures.
What I ended up doing was to write tests that checked that the deadlines
were guaranteed on a healthy system, and other tests that checked that the
system was still making some progress if I injected failures. Effectively,
the invariants are checked in the tests, not in the production code.
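
For the monotonicity example above, here is a sketch of what such a
test-side check might look like: a sampled check, not a proof, with
invented names, and assuming Log is the implementation under test. In
practice one would concentrate the samples near the junctions between the
polynomial intervals:

   --  Walk [Lo, Hi] in small steps and require that Log never decreases.
   --  This belongs in a test harness, not in production code.
   procedure Check_Log_Monotonic (Lo, Hi : Float; Steps : Positive) is
      Width : constant Float := (Hi - Lo) / Float (Steps);
      Prev  : Float := Log (Lo);
      Curr  : Float;
   begin
      for I in 1 .. Steps loop
         Curr := Log (Lo + Float (I) * Width);
         pragma Assert (Curr >= Prev);
         Prev := Curr;
      end loop;
   end Check_Log_Monotonic;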

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  3:59 PM

Ah, now I see the "monotonic log" thing. I wonder why Tuck saw this hours
ago, and I just got it.

Pascal Leroy wrote:

> (Gosh, I ended up writing such a long message, most of which is
> probably boring. Sorry about that, but Tuck asked for it. And happy
> new year to all.)

Somebody famous wrote something like "sorry this letter is so long; I
didn't have time to write a short one". ;-)

Thanks for your opinions. And Happy New Year to you, too!

> I have no problem with pragma Assert (although it's not as useful as
> one might hope -- more later). My problem with
> preconditions/postconditions/invariants is that you are inventing a
> lot of language baggage for these features. Remember that any
> addition to the language has a cost just because it makes it more
> complicated for users to learn and for implementors to support. The
> question is whether this cost is offset by the benefits to the user
> community at large. I believe that invariants are only marginally
> useful and will only help a small category of Ada users.
>
> Overall, I feel that Ada 83 struck the right balance with
> subtypes/constraints: the concepts are easy to grasp, putting a
> constraint in a declaration is simple, and it brings many benefits
> thanks to the runtime checks. Note however that constraints add up to
> a lot of complexity in terms of language description and
> implementation. This being said, they are undoubtedly one of the
> strong points of Ada. I'm all in favor of beefing up constraints,
> e.g. by supporting non-contiguous scalar and discriminant constraints,
> but I think that user-defined constraints/invariants are misguided.
>
> In general I am wary of adding implicit calls to user code all over
> the place: Ada is a language that is relatively WYSIWYG in the sense
> that you can get a pretty good idea of what (user) code gets executed
> when, and invariants break that property.

I see your point, but there are other features that do likewise:
finalization, record component defaults, parameter defaults, Adjust -- all
of these cause code to be executed far away from where it is written.
Dispatching calls are similar (they call something you don't see at the
call site). All these things cause mysterious behavior when stepping
through a program in a debugger.

> ...I fear that there is a lot of potential for users being confused,
> e.g. by assuming that invariants are guaranteed at places they are not
> (see Randy's message). I also fear that nasty problems could be
> introduced during maintenance: say that a subprogram P has a
> precondition that uses function F, and that during maintenance F is
> changed to execute on a separate task or to cache some of its
> computations. This could have unpleasant consequences, either by
> detecting an invariant violation where there is none or by failing to
> detect a violation.

I don't understand the above. If P's invariant calls F, then F will be
called by whatever task calls P, so I don't understand what you mean by "F
is changed to execute on a separate task". Maybe you meant, "the program is
changed so it now calls P from multiple tasks"? Well, yeah, the same issues
arise if F is called from P's body.

I also don't understand the cache issue. If F uses a cache, you'd still
want it to return the same results for the same actuals.
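
For what it's worth, a minimal sketch of that last point (all names
invented): a memoizing F that stays observationally pure, so a precondition
that calls it behaves identically with or without the cache:

   package Memo is
      function F (X : Natural) return Natural;
   end Memo;

   package body Memo is
      Unset : constant := -1;
      Cache : array (Natural range 0 .. 1023) of Integer :=
         (others => Unset);

      function Compute (X : Natural) return Natural is
      begin
         return X * X;  -- stand-in for the real computation
      end Compute;

      function F (X : Natural) return Natural is
      begin
         if X in Cache'Range then
            if Cache (X) = Unset then
               Cache (X) := Compute (X);  -- first use: fill the cache
            end if;
            return Cache (X);
         else
            return Compute (X);  -- outside the cached range: just compute
         end if;
      end F;
   end Memo;

Note that F is not pure in the language sense (it updates package state),
but for the same actual it always returns the same result, so the cache is
invisible to any precondition that calls it. The trouble described above
starts only if a change to F breaks that property.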
> Of course none of this is new and it can happen in the absence of
> invariants, but I actually think that (non-trivial) invariants make it
> harder, not easier, to reason about the program, because the code can
> be scattered in many different places.
>
> I guess I could change my mind if someone came up with realistic
> examples of practical programs where invariants would bring benefits
> that would be hard to achieve with the language as it stands. ...

OK. In GNAT, we have a type Node_Id. Conceptually, it's a pointer to a
variant record, or pointer to class-wide. But it's actually implemented as
an integer index into some table(s), and there are lots of getter/setter
procedures.

There are thousands of procedures that take parameters of type Node_Id, but
that expect the Kind field to be in some set of enumeration values. If I
had preconditions, I would state this expectation on such parameters.
Better, if I had invariants/user-def-constraints, I would declare subtypes
such as Callable_Entity_Node_Id, and use those as parameter subtypes.

This seems hugely useful to me, for documenting what node kinds are
expected, and for debugging (catch the bug early, before it wipes its
fingerprints).
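
A sketch of what such a subtype might look like, in the predicate style
eventually adopted for Ada 2012 (the kind names here are invented
stand-ins, not GNAT's real ones):

   type Node_Kind is (N_Procedure, N_Function, N_Entry, N_Object,
                      N_Package);
   subtype Callable_Kind is Node_Kind range N_Procedure .. N_Entry;

   type Node_Id is new Natural;   -- an index into the node tables
   function Kind (N : Node_Id) return Node_Kind;

   -- Using this as a parameter subtype both documents the expectation
   -- and checks it on every call.
   subtype Callable_Entity_Node_Id is Node_Id with
      Dynamic_Predicate =>
         Kind (Callable_Entity_Node_Id) in Callable_Kind;

   procedure Check_Conformance (E1, E2 : Callable_Entity_Node_Id);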

> ... But I am sick and tired of not Is_Empty(S) being a precondition
> for Stack.Pop. This is such a trivial invariant that it's sure to be
> caught by the smallest amount of testing. The really interesting
> property of a Stack is the LIFO order, and that one is actually much
> harder to express as an invariant. (Exercise left to the reader: how
> do you do that?)

I don't know. Good point.

> Overall, the best approach is a combination of mathematical proof and
> exhaustive testing in the areas that are known to be problematic (like
> the monotony requirement).

I think a monotony requirement would require Log to be monotonous (boring).
Gary will correct me if I'm wrong, but I think "monotonicity requirement"
is what you mean. ;-)

> My point here is that if you have a really complicated problem at
> hand, which you don't know how to express formally, it may be
> extremely hard to express "interesting" invariants.

But it seems like you need to understand those invariants to write the
code. If you don't have invariants, you better use comments. If you don't
use comments, then you're in trouble. I really don't see how to write code
in an Ada front end if I don't know what node kinds I'm dealing with every
step of the way.

> The stuff that I am working on at Google has lots of
> "quasi-invariants", things that are true most of the time, but may be
> violated under some circumstances.

Yeah, invariants/user-def-constraints won't help here. Maybe you should
write code that logs potential failures to a log file, with info to help
diagnose problems "by hand".

****************************************************************

From: Bob Duff
Date: Thursday, December 31, 2009  4:08 PM

> [Irrelevant aside: on the topic of making constraints more useful, I
> think they should be allowed in more places where only a type_mark is
> currently allowed. For instance, I'd love to write:
>
>    function Cos (X : Float_Type'Base) return Float_Type'Base
>       range -1.0 .. 1.0;
>
> and having to declare a specific named subtype for that purpose is a
> pain. I realize that there are difficulties with doing this in the
> general case, but it should be allowed for static constraints. End of
> irrelevant aside.]

Early versions of Ada (1980?) allowed this. I think the rationale for
removing it was that it would be unclear when the constraint gets
evaluated: on the spec, on the body, both (check that it gets the same
answer), or on the call? I don't buy it -- the obvious answer is "on the
spec". Your answer (require it to be static) is also acceptable. Or you
could come up with rules about how it must statically denote a
(possibly-non-static) constant, or be an expression involving such.

As you say, "irrelevant aside" -- we're not likely to bother with this.

****************************************************************

From: Pascal Leroy
Date: Friday, January 1, 2010  4:06 AM

>> I guess I could change my mind if someone came up with realistic
>> examples of practical programs where invariants would bring benefits
>> that would be hard to achieve with the language as it stands. ...
>
> OK. In GNAT, we have a type Node_Id. Conceptually, it's a pointer to
> a variant record, or pointer to class-wide. But it's actually
> implemented as an integer index into some table(s), and there are lots
> of getter/setter procedures.
>
> There are thousands of procedures that take parameters of type
> Node_Id, but that expect the Kind field to be in some set of
> enumeration values. If I had preconditions, I would state this
> expectation on such parameters. Better, if I had
> invariants/user-def-constraints, I would declare subtypes such as
> Callable_Entity_Node_Id, and use those as parameter subtypes.
>
> This seems hugely useful to me, for documenting what node kinds are
> expected, and for debugging (catch the bug early, before it wipes its
> fingerprints).

That's an interesting example, but if I were trying to tackle this problem
from scratch with Ada 2005, I would strive for *static* checking using
strong typing. For instance:

   -- Abstract root type with all the operations common to all nodes.
   type Node_Id is abstract tagged ...;
   function Kind (N : Node_Id) return Node_Kind is abstract;

   -- Now define interfaces for all the useful categories, with
   -- appropriate primitive ops.
   type Callable_Entity is interface;
   function Convention (E : Callable_Entity) return Language is abstract;

   type Program_Unit is interface;
   function Address (P : Program_Unit) return Expression is abstract;

   -- Here is a concrete type that uses interfaces to specify all the
   -- categories it belongs to.
   type Subprogram is new Node_Id and Callable_Entity and Program_Unit
      with ...;  -- Must define Convention and Address.

   -- Now you can use the interfaces to specify exactly what parameters
   -- you expect:
   function Are_Mode_Conformant (E1, E2 : Callable_Entity'Class)
      return Boolean;

Of course, some work is required to put the proper type structure in place
and to test it, but the great news is that with interfaces you are not
restricted to hierarchical classifications: you can have as many
classifications as you like, and you can add more during maintenance
without having to do major restructuring of your code. In my book, static
checking is superior to any kind of invariant/assertion/check that you can
dream of.

I suspect that you have been using Ada 83 for too long, and that you are
trying to solve problems that wouldn't exist if you started from scratch
with a more modern dialect of the language.

>> The really interesting property of a Stack is the LIFO order, and
>> that one is actually much harder to express as an invariant.
>> (Exercise left to the reader: how do you do that?)
>
> I don't know. Good point.

I don't know either, which is bothering me.
If we are not able to express a useful invariant for a stack, the simplest
of data structures, how can we hope to express invariants for real-life
problems?

****************************************************************

From: Edmond Schonberg
Date: Saturday, January 2, 2010  1:42 PM

> I guess I could change my mind if someone came up with realistic
> examples of practical programs where invariants would bring benefits
> that would be hard to achieve with the language as it stands. But I
> am sick and tired of not Is_Empty(S) being a precondition for
> Stack.Pop. This is such a trivial invariant that it's sure to be
> caught by the smallest amount of testing. The really interesting
> property of a Stack is the LIFO order, and that one is actually much
> harder to express as an invariant. (Exercise left to the reader:
> how do you do that?)

There is one simple advantage that has dropped out of the discussion:
anything that forces the programmer to examine his thought processes and
make them explicit is a good thing. Having to restate the intent of the
code in a different form (be it comments or invariants) helps in
recognizing errors earlier on. It's like an introverted code review, where
you have to verbalize and justify the structure and details of the code to
yourself. We claim the same advantages for strong typing: it forces you to
make explicit some assumptions that in other languages go unstated (and are
often wrong). Invariants are just the next step in forcing the
semi-conscious to become fully conscious!

The LIFO predicate is usually written:

   Pop (Push (X, Stack)) = X

but I doubt that this would be very useful, and it is of course incorrect
in the presence of multiple tasks. The monotonicity of Log is probably not
the kind of thing that you would want to make into an executable assertion:

   forall (X, Y): X > Y => Log (X) > Log (Y)

and iterating over a real range is not a very good idea. So certainly there
are predicates that we better prove as properties of the algorithms. Still,
I would not generalize this to say that invariants in discrete domains are
useless or harmful.

****************************************************************

From: Bob Duff
Date: Saturday, January 2, 2010  2:30 PM

> There is one simple advantage that has dropped out of the discussion:
> anything that forces the programmer to examine his thought processes
> and make them explicit is a good thing. Having to restate the intent
> of the code in a different form (be it comments or invariants) helps
> in recognizing errors earlier on. It's like an introverted code
> review, where you have to verbalize and justify the structure and
> details of the code to yourself. We claim the same advantages for
> strong typing: it forces you to make explicit some assumptions that in
> other languages go unstated (and are often wrong). Invariants are
> just the next step in forcing the semi-conscious to become fully
> conscious!

Well said.

****************************************************************