!standard 13.15                  08-01-18  AC95-00157/01
!class Amendment 08-01-18
!status received no action 08-01-18
!status received 08-01-10
!subject User-defined constraints
!summary
!appendix

From: Randy Brukardt
Date: Friday, January 11, 2008  8:52 PM

Now for something completely different...

It had struck me that Ada already has support for many of the most common
preconditions. For instance, the precondition "Ptr /= null" is better written
in the spec as "Ptr : in not null whatever". The magic is in the subtype.

I began wondering if it was possible to expand that to cover a broader set of
preconditions. (Indeed, I thought about it so much that I had trouble
sleeping last night. Silly thing to lose sleep over!)

It's certainly clear that subtypes cannot support all possible preconditions:
any precondition that requires the use of more than one parameter cannot be
written as a subtype. Still, a lot of useful preconditions could be written
that way, as well as other things like object invariants. And the entire
mechanism for subtypes is familiar to Ada programmers.

So I started wondering how a generalized constraint could be written. Here's
an idea.

Define an operational attribute S'Constraint. This defines a function:

   function S'Constraint (Obj : in S) return Boolean;

The attribute is specifiable for all specific subtypes. [Legality rules below
limit this.]

The default implementation of S'Constraint returns (Obj in S). [This is
always True for unconstrained subtypes.]

Static Semantics

If S'Constraint is specified for a subtype, S is constrained with a
*user-defined constraint*. [Special syntax for this specification would be
nice, so that it would be combined with the subtype declaration. One
possibility is

   subtype <identifier> is <subtype_mark> with <function_name>;

But we can't do this now, as it is outside of what can be
implementation-defined.]

A value of the type of S satisfies a user-defined constraint if S'Constraint
returns True. This is a dispatching call if S is class-wide.
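[To make the proposed mechanics concrete, here is a minimal sketch of the
attribute in use. This is hypothetical: pragma Superpure_Function and the
specifiable S'Constraint attribute are proposals from this note and are not
legal Ada today, and the names Even_Number and Is_Even are invented purely
for illustration. - ED]

```ada
--  Hypothetical sketch of the proposed feature; not compilable Ada today.

function Is_Even (Obj : in Integer) return Boolean;
pragma Superpure_Function (Is_Even);      --  proposed pragma

subtype Even_Number is Integer;
for Even_Number'Constraint use Is_Even;   --  proposed attribute

function Is_Even (Obj : in Integer) return Boolean is
begin
   return Obj mod 2 = 0;   --  no globals, no side-effects
end Is_Even;

--  The constraint would then be checked wherever existing constraints are:
--     X : Even_Number := 3;   --  would raise Constraint_Error
```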
Legality Rules

If S'Constraint is specified, S shall be declared by a subtype_declaration,
and the subtype denoted by the subtype_mark of its subtype_indication shall
be unconstrained. [This convolution is needed because otherwise we'd have a
circularity - the specification of the attribute would make the subtype
constrained, and then the specification would be illegal.]

If S'Constraint is specified, the function name shall statically denote a
function declared "superpure" [for the lack of a better name; more on this
below]. If S is tagged, the function name shall denote a primitive function.
[So it can dispatch.]

Other rules:

Static matching - A subtype with S'Constraint specified only statically
matches itself.

Type conversions - 4.6(51/2) would ensure that a check is made (it depends on
the definition of "satisfies" given above).

Superpure functions:

A function is *declared superpure* if it has pragma Superpure_Function
applied to it. [The details of this pragma's application would be similar to
those for pragma No_Return, so I won't detail them.] Statically bound calls
to predefined operators are superpure. [Dispatching calls are not, as they
might reference user-defined routines, and it would be a nasty
incompatibility to require "=" overrides to be superpure.]

A function that is declared superpure shall not reference any entity other
than:
   * superpure function calls (including infix operators);
   * objects local to the superpure function, including the formal
     parameters;
   * subtypes; and
   * exceptions.

A function that is declared superpure shall not include an
assignment_statement. [This prevents assigning components of the formal
parameters.]

[I'm not particularly happy with these strong restrictions, but some strong
restrictions seem necessary.
Evaluating a constraint has to be task-safe (because more than one task could
try to evaluate the constraint at the same time), has to always return the
same result for a specific value, and, to be consistent with the language
model, should not have any side-effects (we want to be able to eliminate
these checks without regard to the definition of the function - existing
constraints are only evaluated once, at elaboration time). One could imagine
declaring any side-effects as unspecified as to whether and how many times
they occur, but the task-safety requirement cannot be worked around other
than by declaring violations erroneous. For something specifically intended
to increase the safety and correctness of programs, adding a new kind of
erroneousness seems completely wrong.

The easiest way to make this task-safe is to ban access to global objects. It
seems OK to allow the function to propagate an exception and to have local
constants and variables, but otherwise it pretty much has to limit itself to
operations on components of the formal parameters. Reading a global variable
would probably be OK, but that could put the requirement of the same result
for a specific value at risk - and that capability doesn't seem that
valuable.

Note that no procedure calls are allowed, so no creations or assignments of
controlled types are allowed (although using one or passing it as a parameter
is allowed). That could be relaxed, but it seems messy to do so without
violating task safety or allowing side-effects.]

---

An alternative to making S'Constraint into a function is to allow it to be
specified with a Boolean expression.
This would have the advantage of putting more expressions into the
specification (thus making static analysis easier). But it would require some
special semantics for the expression (since the expression would have to be
able to reference the object that is being checked against the constraint);
we'd still need the ability to make user-defined function calls (because we
can't compromise privacy, and we surely wouldn't want to make it impossible
to use these on private types); and it isn't clear how to describe the
meaning for class-wide types. Thus I chose the function form.

Note that superpure functions are simple enough that almost all of them can
be inlined; I'd expect most to be a simple expression (or something that can
be turned into a simple expression, like a chain of if statements). That
would facilitate analysis, although the possible separate compilation of the
body is a negative. At least the superpure restrictions would ensure that
symbolic analysis can be used; extra checks can be removed assuming
domination; and so on.

---

Here's one of the motivating examples:

In the Janus/Ada compiler, there are many subprograms that take specific
kinds of symbol pointers. These routines tend to assume that their parameters
are not null, and of a specific kind of symbol. It would be valuable to pull
those invariant checks to the call sites, because it is usually the case that
the actuals are already known to meet the particular requirements of the
routine. It also would be valuable for the compiler to know that the
invariants are checked somewhere, because then it could eliminate null
pointer checks and variant component checks in the called subprogram. (This
can't be done now.)

Janus/Ada was originally an Ada 83 program, so it has a fairly primitive use
of subtyping. Consider

   function Type_Size (A_Type : in Symbol_Ptr) return Target.Size_Type;

This routine assumes that A_Type points at a symbol that represents a type.
We could define a subtype to represent this:

   subtype Type_Symbol_Ptr is Symbol_Ptr;
   function Type_Size (A_Type : in Type_Symbol_Ptr) return Target.Size_Type;

Now, we want to ensure that the symbol pointer is not null:

   subtype Type_Symbol_Ptr is not null Symbol_Ptr;

[We could have put this on the parameter, instead.]

We also want to ensure it points at a type symbol entry. We could in fact use
a constrained pointer type to do this:

   subtype Type_Symbol_Ptr is not null Symbol_Ptr(TYP);

and now we've put our precondition into the specification of the operation as
part of the subtype.

Now, let's look at another routine:

   procedure Generate_Call_Parameters (Callable : in Symbol_Ptr; ...);

This takes a callable entity. We can make a subtype again:

   subtype Callable_Symbol_Ptr is Symbol_Ptr;
   procedure Generate_Call_Parameters
      (Callable : in Callable_Symbol_Ptr; ...);

But a callable symbol can have one of five discriminant values (PROC, FUNC,
INST_PROC, INST_FUNC, AN_ENTRY), so a normal constraint won't do the trick.
But a user-defined constraint would:

   function Is_Callable_Symbol (Obj : in Symbol_Ptr) return Boolean;
   pragma Superpure_Function (Is_Callable_Symbol);
   subtype Callable_Symbol is Symbol_Ptr;
   for Callable_Symbol'Constraint use Is_Callable_Symbol;

   function Is_Callable_Symbol (Obj : in Symbol_Ptr) return Boolean is
   begin
      return Obj /= null and then
         (Obj.Selector = PROC or Obj.Selector = FUNC or
          Obj.Selector = INST_PROC or Obj.Selector = INST_FUNC or
          Obj.Selector = AN_ENTRY);
   end Is_Callable_Symbol;

Of course, such a constraint is not limited to looking at a single component
of the object. One could easily imagine Elementary_Type_Symbol (which checks
that the type is elementary), and so on.

---

OK, I'm done with this. I'll go put on my flame-proof suit...

****************************************************************

From: Tucker Taft
Date: Friday, January 11, 2008  9:38 PM

This is analogous to the per-subtype constraints proposed in Ada 9X.
Doing things on a subtype basis when possible is very nice. But as you point
out, there are also cases where a precondition is based on a relationship
between two inputs, and those clearly need something that is not purely
subtype based.

As far as "purity" or "superpurity" or whatever, I don't see the point. So
long as it is clear in the definition of the feature that a given constraint
might be evaluated an unspecified number of times, and might be invoked
concurrently in a multi-tasking program, the programmer should be able to do
whatever they want. A friendly compiler could try to provide warnings when it
sees an egregious violation of good taste in user-written constraints, but
trying to make it a required part of the feature seems like overkill.

****************************************************************

From: Alexander E. Kopilovich
Date: Friday, January 11, 2008  10:50 PM

>It's certainly clear that subtypes cannot support all possible
>preconditions: any precondition that requires the use of more than one
>parameter cannot be written as a subtype.

But RECORD is a type, so one can imagine a subtype for a record type - for
which natural constraints will be relations between the components of the
record. Then, one may think of a subprogram as an EXECUTABLE RECORD (as the
list of a subprogram's parameters resembles a record closely enough) - and
thus extend the notion of "subtype for a record type" to subprograms.

****************************************************************

From: Robert A. Duff
Date: Saturday, January 12, 2008  9:07 AM

Some languages take a viewpoint similar to that. E.g. languages in the ML
family, where every function has a single parameter, which can be a tuple.

But a precondition on a subprogram needs to be able to refer to globals as
well as parameters.

****************************************************************

From: Robert A. Duff
Date: Saturday, January 12, 2008  9:32 AM

> OK, I'm done with this.
> I'll go put on my flame-proof suit...

Not at all -- I think you're on the right track.

This has more to do with invariants than preconditions, though. AI95-375 is
about invariants, but I don't remember what it says.

I don't much like the details of the superpure business, although I do
believe that in order to do any sort of automatic proofs, the compiler /
proof tool will have to know all about side effects. It's all very well to
know that the Sqrt function returns the square root of its argument, but you
also want to know that it doesn't do anything else. But I'd deal with the
side-effects issue separately -- Tuck's answer (let the programmer beware) is
probably the right one, here.

I do think the invariant/constraint should be specified as an expression, for
the reasons you gave. The visibility should be as if that expression is
written inside the type declaration, so it can refer to the "current
instance" of the type. This is similar to how visibility in pragma
Precondition is as if it were inside the spec, just after the last formal
parameter.

Example:

   type Sequence is private;

   subtype Empty_Sequence is Sequence;
   pragma Constraint (Empty_Sequence, Is_Empty(Empty_Sequence));

   subtype Non_Empty_Sequence is Sequence;
   pragma Constraint (Non_Empty_Sequence, not Is_Empty(Non_Empty_Sequence));

I'd prefer not to have to write a Not_Is_Empty function, here.

I think your proposal forbids this (right?):

   type T is range -10..10;
   for T'Constraint use Nonzero;

I don't see why. Seems like the user-defined constraint/invariant should be
on top of whatever predefined constraint (and null exclusion!) exists.

I guess I'd call it pragma Invariant, to fit in with the rest of the world.

****************************************************************

From: Alexander E. Kopilovich
Date: Monday, January 14, 2008  2:02 PM

>But a precondition on a subprogram needs to be able to refer to globals
>as well as parameters.
Well, but you may agree that globals are a kind of parameters - "constant"
ones, that is, constant references. They need not be mentioned in calls -
exactly because they are constants, already known to the subprogram; but they
may be mentioned in the subprogram declaration, thus becoming a part of the
"executable record" type.

It seems that those people who think about pre- and postconditions should not
protest against a requirement of exposing the globals (at least those that
participate in pre- and postconditions) explicitly in the subprogram
declaration.

****************************************************************

From: Randy Brukardt
Date: Monday, January 14, 2008  3:13 PM

...
> > OK, I'm done with this. I'll go put on my flame-proof suit...
>
> Not at all -- I think you're on the right track.
>
> This has more to do with invariants than preconditions, though.
> AI95-375 is about invariants, but I don't remember what it says.

For many purposes, you can use either. Subtypes of formal parameters are more
like preconditions than invariants (since the constraint doesn't need to
apply to the actual until it is passed). I happen to think that subtyping is
better than preconditions (for one thing, none of the confusing class-wide
stuff is necessary); but it isn't able to represent all preconditions.

> I don't much like the details of the superpure business, although
> I do believe that in order to do any sort of automatic proofs,
> the compiler / proof tool will have to know all about side effects.
> It's all very well to know that the Sqrt function returns the square root of
> its argument, but you also want to know that it doesn't do anything else.
> But I'd deal with the side-effects issue separately -- Tuck's answer
> (let the programmer beware) is probably the right one, here.

I can understand the "programmer beware" attitude, but what bothers me is
that it leaves no chance for the compiler to help.
Unless the functions used here are marked somehow, the compiler has no chance
of detecting the potential problems (and producing either warnings or
errors). That's how languages like C work; it doesn't seem Ada-like.

I've heard it said that a separate static analysis tool should be doing this
checking, not the compiler. But I don't buy that: separate tools don't get
run by a lot of projects. After all, that's what C programmers say about C
("just run Lint to check your programs, the compiler doesn't have to"). Ada
disagrees with that notion and builds that checking into the language. Why is
this different? The whole point of this functionality is to detect errors
earlier. If it is introducing errors, it's not really helping.

...
> type Sequence is private;
>
> subtype Empty_Sequence is Sequence;
> pragma Constraint (Empty_Sequence, Is_Empty(Empty_Sequence));
>
> subtype Non_Empty_Sequence is Sequence;
> pragma Constraint (Non_Empty_Sequence, not Is_Empty(Non_Empty_Sequence));
>
> I'd prefer not to have to write a Not_Is_Empty function, here.

This sort of "special visibility" is one of the things that I detest
intensely about these various proposals. It would be very difficult and
time-consuming to implement (there surely is no provision for special
visibility in pragmas - or anywhere else, for that matter - now), it would
potentially be very confusing to users (which kind of scope would you be in?
How would you remember?), and it would seem to introduce new possibilities
for anomalies.

> I think your proposal forbids this (right?):
>
>    type T is range -10..10;
>    for T'Constraint use Nonzero;
>
> I don't see why. Seems like the user-defined constraint/invariant should
> be on top of whatever predefined constraint (and null exclusion!) exists.

Ada does not allow double constraints in general (it does for elementary
types, but not for composite types). So I don't think double constraints can
be allowed here.
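[As background for the "double constraints" point: in existing Ada an
elementary subtype can be further constrained in a chain of subtype
declarations, while a constrained composite subtype cannot be given another
constraint. A small legal-Ada illustration, with invented type names - ED]

```ada
type T is range -10 .. 10;
subtype S1 is T range -5 .. 5;   --  legal: narrows an elementary constraint
subtype S2 is S1 range 0 .. 5;   --  legal again: elementary types allow this

type Vec is array (Positive range <>) of Integer;
subtype V5 is Vec (1 .. 5);      --  constrained composite subtype
--  subtype V3 is V5 (1 .. 3);   --  illegal: V5 is already constrained
```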
My view is that "range -10..10" is just a shorthand for a constraint function

   function A_Range (Obj : in T) return Boolean is
   begin
      return Obj in -10..10;
   end A_Range;

..and you can replace that function, but not add to it. You could compose
them if you needed to:

   function A_Nonzero_Range (Obj : in T) return Boolean is
   begin
      if T'Constraint(Obj) then
         return Obj /= 0;
      else
         return False;
      end if;
   end A_Nonzero_Range;

Moreover, my ideal goal would be for this to be first-class syntax (not an
attribute):

   subtype S is T with A_Range;

..and having double constraints would be weird.

> I guess I'd call it pragma Invariant, to fit in with the rest of
> the world.

I don't think pragmas are appropriate for any of these things; they're never
appropriate for overloaded entities because there is no way to tell which one
is meant. (Yes, that means that I think that pragma Convention is also a bad
idea; it should be part of the syntax of the declaration.)

I realize that for the purposes of experimentation, there isn't much choice
but to use pragmas and attributes. But since these things alter the visible
semantics of subtypes and subprograms, they should be visibly part of the
spec, and not just because someone happened to put a pragma nearby. That is
the same reason that we abandoned the "Overriding" pragmas in favor of
syntax; I don't see much difference here.

****************************************************************

From: John Barnes
Date: Tuesday, January 15, 2008  1:47 AM

> >But a precondition on a subprogram needs to be able to refer to globals
> >as well as parameters.
>
> Well, but you may agree that globals are a kind of parameters - "constant"
> ones, that is, constant references. They need not be mentioned in calls -
> exactly because they are constants, already known to the subprogram; but
> they may be mentioned in subprogram declaration thus becoming a part of
> "executable record" type.

Spark treats all globals as parameters where the actual parameter is always
the same.
The modes can be "in", "in out" or "out" as usual. And they have to be
mentioned in an annotation because they are part of the "specification" of
the subprogram.

> It seems that those people who think about pre- and postconditions should
> not protest against requirement of exposing the globals (at least those that
> participate in pre- and postconditions) explicitly in subprogram declaration.

Absolutely essential in my view. Come on chaps, expose your globals.

****************************************************************

From: Christoph Grein
Date: Tuesday, January 15, 2008  2:03 AM

As far as I know, Spark does not expose the globals (e.g. hidden state
variables) verbatim, but only gives an annotation saying that there is some
global (and gives it some name that need not correspond to the actual
global). I think this is enough for static analysis and does not compromise
privacy. Am I correct?

****************************************************************

From: Robert A. Duff
Date: Tuesday, January 15, 2008  7:37 AM

> Absolutely essential in my view. Come on chaps, expose your globals.

I think it's necessary to expose the existence of globals, but not their type
(e.g. you don't need to say "it's a record with an integer field of type
integer, and..."). I think this view matches SPARK.

****************************************************************

From: John Barnes
Date: Tuesday, January 15, 2008  8:46 AM

Correct. It could be just some abstract state which is refined somewhere into
concrete variables which are hidden.

****************************************************************

From: Randy Brukardt
Date: Tuesday, January 15, 2008  2:10 PM

> > I can understand the "programmer beware" attitude, ...
>
> First, I'm only half-heartedly advocating that attitude.
> Second, the reason I'm advocating it is that I don't know
> how to define the rules without (1) major additional features added,
> and (2) a lot of language design work.

Right.
But we also need these features to better support multi-threading (it would
be valuable for the compiler to be able to detect potential tasking problems,
and to do that without a lot of false positives we need to know what cannot
be a problem), and they also would potentially help optimization (in that
fewer assumptions would be needed). So I'm willing to do that work (indeed, I
already tried to do that).

> >...but what bothers me is
> > that it leaves no chance for the compiler to help. Unless the functions
> > used here are marked somehow, the compiler has no chance of detecting the
> > potential problems (and producing either warnings or errors). That's how
> > languages like C work; it doesn't seem Ada-like.
>
> The argument "C does it that way, so it must be wrong" is a fallacy.
> I must admit, I've used that fallacy myself. ;-)
> I suggest we try to avoid it!

Fair enough. But my point is that I want the compiler to be able to detect
problems, and in the one-pass way that Ada generally uses. That requires a
marker that the function is intended to be "superpure"; we don't necessarily
have to assign any semantics to that marker. (It could be an "assume" that it
is superpure, which would still allow the compiler to warn about potential
problem areas.)

> > I've heard it said that a separate static analysis tool should be doing
> > this checking, not the compiler. But I don't buy that: separate tools
> > don't get run by a lot of projects.
>
> I don't buy it, either, but I think your reason is wrong. If there's a
> magical separate tool that can solve the problem, I have little sympathy
> for people who can't be bothered to use it.
>
> My problem with the "separate tool" idea is that it's an excuse to avoid
> solving hard problems. If we don't know how to solve the problem using Ada
> rules, then separate tools won't solve it either. Furthermore, there will be
> no portability -- i.e. no agreement on what exactly such a tool should
> require.
> And no easy way to understand what class of bugs is guaranteed to be
> detected by such tools.

I like your reason better. Feel free to virtually replace my reason with
yours in my note. ;-)

...
> > This sort of "special visibility" is one of the things that I detest
> > intensely about these various proposals. It would be very difficult and
> > time-consuming to implement (there surely is no provision for special
> > visibility in pragmas or anywhere else for that matter now), and it would
> > potentially be very confusing to users (which kind of scope would you be
> > in? How would you remember?), and it would seem to introduce new
> > possibilities for anomalies.
>
> I think you overestimate the implementation difficulty. Robert has already
> implemented this (or at least partly) in GNAT, and it doesn't seem like a
> big deal.

Maybe. I don't think anyone really understands how visibility works in
Janus/Ada, so adding new forms is likely to be a huge job. (And I suspect
that is common among compiler writers, simply because they've had little
reason to fiddle with that area of their compilers for the last few years.)
What I know doesn't look good.

> There are already similar things in Ada. For example, a record rep clause
> has visibility as if it were inside the record.

That's only for the names, not the expressions (and only in a particular
syntactic location). Simple name lookups are easy to redirect somewhere else,
but full expression changes would require changing the actual symbol table to
reflect the new visibility (and changing it back afterwards).

> For that matter, spec/body and
> child/private features already require the compiler to restore
> the visibility context to a previously saved state.

Right, and that is extremely fragile code in Janus/Ada (and it has had a lot
of trouble in corner cases). Ed once remarked to me that the situation was
similar in GNAT. Do we *really* need more like that?
> > Ada does not allow double constraints in general (it does for elementary
> > types, but not for composite types). So I don't think double constraints
> > can be allowed here.
> >
> > My view is that "range -10..10" is just a shorthand for a constraint
> > function
> >    function A_Range (Obj : in T) return Boolean is
> >    begin
> >       return Obj in -10..10;
> >    end A_Range;
>
> I don't think that analogy quite works, because if I say "subtype S is T
> range 1..100;" or "type S is new T range 1..100;", I get a constraint
> error, because 100 > 10. The new constraint does not replace the old
> (unless of course you say T'Base).

Yes, it does (it surely does internally in Janus/Ada). If the elaboration
check doesn't fail, then a replacement is fine. It's true that the new
constraint has to be contained in the old one; I'd only allow replacing
non-user-defined constraints. Thus, if S'Constraint is specified, then a
later subtype cannot have an additional constraint (because there is no sane
way to verify whether that later constraint is contained in the original
one). And I wouldn't allow specifying user-defined constraints for
constrained subtypes. But still, I'd use the model that I gave above, so that
there aren't contract problems with the use of S'Constraint.

> Not sure what the right answer is, here.
>
> > Moreover, my ideal goal would be for this to be first-class syntax (not
> > an attribute):
>
> Fine, but we're talking about something AdaCore is thinking about
> implementing in the near-to-medium term, whereas Ada won't standardize it
> until the long term. And AdaCore doesn't want to invent a bunch of new
> syntax.

Right. I said "ideal goal", but I'm recommending a user-specifiable attribute
for this capability if you're doing it now, because that can be done outside
of the standard. Since this has to be done before freezing, a pragma doesn't
seem appropriate.
> To put it another way, saying "it has to be syntax" is making the best the
> enemy of good-enough (even though I agree that new syntax would be more
> aesthetically pleasing).

I was only referring to what the standard should/would eventually do.
Implementers are going to do whatever they want irrespective of what is said
here. That's not particularly relevant to this discussion.

****************************************************************

From: Pascal Leroy
Date: Wednesday, January 16, 2008  3:17 AM

> OK, I'm done with this. I'll go put on my flame-proof suit...

FWIW, this is the best idea that I have seen in this area for a long time.
And I do like the superpurity (although obviously the details would have to
be fleshed out).

****************************************************************

From: Robert A. Duff
Date: Saturday, January 19, 2008  8:02 AM

[From a thread in AC-00156 - ED]

> Indeed, I proposed user-defined constraints in large part to get some of
> this information into the Ada subprogram specification where it belongs.

I think user-defined constraints are a good idea, but I'd prefer to call them
"invariants". Ideally, I'd like to be able to define them on the fly, but for
that matter, I'd like to be able to say "procedure P(X : Integer range
1..10);" or even "procedure P(X: Integer range <>; A: String(X'Range));" or
something like that! I'm not being entirely serious, here...

****************************************************************

From: Randy Brukardt
Date: Saturday, January 19, 2008  9:09 PM

Bob Duff writes:
> "Randy Brukardt" writes:
> > Cyrille Comar wrote:
> > Indeed, I proposed user-defined constraints in large part to get some of
> > this information into the Ada subprogram specification where it belongs.
>
> I think user-defined constraints are a good idea, but I'd prefer
> to call them "invariants".
When you say you want to call them "invariants", I have to assume you mean
something different than what I proposed, which is just a more generalized
Ada constraint (and which follows all of the rules of Ada constraints,
including where they are checked). In that case, there is nothing "invariant"
about them (they might only apply to the formal, the actual object is likely
to have a different, weaker constraint, and because of aliasing, they might
fail to hold after the initial check anyway). Besides, Ada has terminology
for this concept, and using different terminology to mean the same thing is
very confusing.

> Ideally, I'd like to be able to define them on the fly,

You can, of course, as you can declare a subtype anywhere that you can
declare a subprogram spec. Maybe you mean you want to be able to declare them
anonymously - an idea I don't support (I'm against the anonymous stuff that
is already in Ada, and I don't want any more of it).

> ... but for that matter,
> I'd like to be able to say "procedure P(X : Integer range 1..10);" or even
> "procedure P(X: Integer range <>; A: String(X'Range));" or something like
> that! I'm not being entirely serious, here...

Thank goodness...

****************************************************************