Version 1.2 of acs/ac-00272.txt


!standard 3.8(12/3)          15-10-05 AC95-00273/00
!class confirmation 15-10-05
!status received no action 15-10-05
!status received 15-08-31
!subject Discriminants would be more useful if they could be used in static expressions
!summary
!appendix

!topic Discriminants would be more useful if they could be used in static expressions
!reference Ada 2012 RM3.8(12/3)
!from Hadrien Grasland 15-08-30
!keywords discriminant static
!discussion

In section 3.8, paragraph 12/3, the Ada 2012 RM states "If the
discriminant is used to define the constraint of a component, the bounds
of an entry family, or the constraint of the parent subtype in a
derived_type_definition, then its name shall appear alone as a
direct_name (not as part of a larger expression or expanded name)."

Unfortunately, when building nontrivial discriminated types, this rule
turns out to be unpleasantly limiting. For example, it makes it
impossible to parametrize an array using its initial bound and length,
as in the following (illegal) code:

type Tridiagonal_System(First_Index : Index_Type; Length : Count_Type) is
    record
        Lower_Diagonal : F_Containers.Vector (First_Index + 1 ..
               First_Index + Length - 1);
        Diagonal : F_Containers.Vector (First_Index .. First_Index +
               Length - 1);
        Upper_Diagonal : F_Containers.Vector (First_Index .. First_Index
               + Length - 2);
    end record;

Of course, I can emulate this functionality using more complex language
constructs, such as a triplet of heap-allocated arrays that are all
simultaneously allocated by a constructor function. But one thing I
greatly enjoy about Ada, with respect to other languages like C++, is
the way its glorious type system dramatically reduces the need for such
boilerplate code (and unnecessary heap allocations). So this feels like
a step in the wrong direction.

Consequently, unless it would really be an immense amount of trouble for
implementations, I really wish it were possible to use type
discriminants in static expressions.

***************************************************************

From: Randy Brukardt
Sent: Tuesday, September 1, 2015  6:19 PM

> In section 3.8, paragraph 12/3, the Ada 2012 RM states "If the
> discriminant is used to define the constraint of a component, the
> bounds of an entry family, or the constraint of the parent subtype in
> a derived_type_definition, then its name shall appear alone as a
> direct_name (not as part of a larger expression or expanded name)."

Terminology point: neither the above wording nor anything in the rest of your
message has anything at all to do with "static expressions" as defined in 4.9.
I can't imagine any scenario where the discriminant of a current instance would
ever be considered static, as the associated discriminant constraint is unknown
there, and surely can be non-static:

    Obj : Rec_D (Some_Function);

...
> Of course, I can emulate this functionality using more complex language
> constructs, such as a triplet of heap-allocated arrays that are all
> simultaneously allocated by a constructor function. But one thing I
> greatly enjoy about Ada, with respect to other languages like C++, is
> the way its glorious type system dramatically reduces the need for
> such boilerplate code (and unnecessary heap allocations).
> So this feels like a step in the wrong direction.
>
> Consequently, unless it would really be an immense amount of trouble
> for implementations, I really wish it were possible to use type
> discriminants in static expressions.

As I noted, there is nothing static about these expressions, and there never
could be.

By mentioning "static expressions" here, you're confusing casual readers as to
what you actually are asking about. If you're not sure about the proper
terminology, it's best to stay vague (in this case, just talk about
"expressions") because that will not confuse the issue.

It's a minor point; on to your actual point.

This issue has been well-known for decades, but the problem is that, while this
looks natural, it has the potential to cause extreme pain for implementations
and correspondingly for some kinds of users (those that cannot have any sort of
heap operations in their code). It's clear that some sort of restrictions have
to be placed on the expression.

Coincidentally, I recently received some private mail from Steve Baird on this
very topic (some form of this will be in a future draft of AI12-0075-1, which
proposes expansion of what is considered a static function, so I think it is OK
to use it here):

> We don't want to mess with the 3.8(12/3) rule.
>
> It might seem tempting to allow (in a component constraint) the case of
> a call to a static function such that all parameters (including, of
> course, defaulted parameters) are either static expressions or names
> denoting discriminants.
>
> One property we need for discriminant-dependent component constraint
> expressions is reproducibility - given the same discriminant values we
> must always get the same result. This relaxation of the rules would
> meet that requirement, but that's not enough.
>
> We also need, given a set of discriminant subtypes, to be able to
> easily calculate the minimum and maximum values that an expression
> might yield (e.g., we need to know how much storage is needed when we
> declare an unconstrained object of a type which has an array component
> with a discriminant-dependent index constraint).
>
> That's where this proposal would lead to implementation problems, as
> an arbitrary static function would not necessarily be monotonic.

I'm no fan of the max-size implementation approach, and am particularly
appalled that we allow that to be the only approach used by an implementation.
(We don't allow only table-based case statements, for instance, any arbitrary
range has to work in a case statement. Why do we treat discriminated records
differently??) Nevertheless, that train left the station decades ago, and it
wouldn't be sensible to change the rules now, given the massive pain that would
be caused. On top of which, some users cannot have any implicit non-stack based
storage management (as is required when avoiding a max-size implementation), so
an additional aspect would be required to force a max-size implementation when
required. That's clearly very complex. (It's also my preference for the
language, but I expect that I would be a group of one supporting that position;
that's certainly been the way it was in the past.)

We once considered (I forget when) allowing a very limited expansion to the
3.8(12/3) rule. Essentially, only D + C and D - C (C being a static expression)
were suggested. It didn't seem that those would be sufficiently useful to
support the expansion of the Standard wording and implementation complexity.
It's probably possible to go slightly beyond that, but even multiply and divide
can cause problems (especially when the values can be negative, which they can
be in general).

Ergo, I wouldn't expect a change in this area, as "thar be dragons", at least
for compilers implemented as most are (to my knowledge, all but Janus/Ada).

***************************************************************

From: Hadrien Grasland
Sent: Wednesday, September 2, 2015  1:59 AM

> Terminology point: nothing in the above wording nor in the rest of
> your message has anything at all to do with "static expressions" as
> defined in 4.9. I can't imagine any scenario where the discriminant of
> a current instance would ever be considered static, as the associated
> discriminant constraint is unknown there, and surely can be non-static:
>
>      Obj : Rec_D (Some_Function);

Indeed, sorry about that. I thought that static expressions would be enough and
easier on implementations (if the size of objects is known at compile time, it
is easier to allocate them), but you are right that this would be *way* too
limiting, and actually runs counter to the point of discriminated types in
most cases.

...
>> Consequently, unless it would really be an immense amount of trouble
>> for implementations, I really wish it were possible to use type
>> discriminants in static expressions.
> As I noted, there is nothing static about these expressions, and there
> never could be.
>
> By mentioning "static expressions" here, you're confusing casual
> readers as to what you actually are asking about. If you're not sure
> about the proper terminology, it's best to stay vague (in this case,
> just talk about
> "expressions") because that will not confuse the issue.

I agree, see above.

> It's a minor point, on to your actual point.
>
> This issue has been well-known for decades, but the problem is that,
> while this looks natural, it has the potential to cause extreme pain
> for implementations and correspondingly for some kinds of users (those
> that cannot have any sort of heap operations in their code). It's
> clear that some sort of restrictions have to be placed on the expression.

I am somewhat surprised about the necessity of heap allocation that you mention.
I do not see what would prevent a stack-based Ada implementation from
implementing the above record as something that is allocated on a stack, maybe
along the lines of...

... | Record discriminants | First array offset | First array length | First array data | Second array offset | <other components> | ...

After all, by the time control reaches the declaration of a constrained object
of a discriminated record type, all the information that is required for
allocating the discriminated record on the stack should be available. It is only
a matter of computing the various array offsets and lengths at runtime.
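
For illustration, the offset computation could look something like the
following (a Python sketch; the 8-byte element size and the function name are
inventions for this example, not a claim about any real compiler's layout):

```python
ELEM = 8  # assumed element size in bytes (illustrative only)

def tridiagonal_layout(first_index, length):
    """Compute byte offsets for the Tridiagonal_System record sketched
    above: a fixed header holding the two discriminants, followed by the
    three vectors laid out contiguously.  Only Length affects the sizes;
    First_Index shifts the index ranges, not the storage."""
    header = 2 * ELEM                 # the two discriminants
    lower_len = max(length - 1, 0)    # First_Index+1 .. First_Index+Length-1
    diag_len = max(length, 0)         # First_Index   .. First_Index+Length-1
    upper_len = max(length - 1, 0)    # First_Index   .. First_Index+Length-2
    lower_off = header
    diag_off = lower_off + lower_len * ELEM
    upper_off = diag_off + diag_len * ELEM
    total = upper_off + upper_len * ELEM
    return {"Lower_Diagonal": lower_off, "Diagonal": diag_off,
            "Upper_Diagonal": upper_off, "Size": total}

print(tridiagonal_layout(1, 4))
```

All the inputs are the discriminant values, so nothing here needs to be known
before the object's declaration is elaborated.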

Of course, as you say and as the email you quoted below elaborates, there must
be some kind of rule enforcing the reproducibility of array offset and length
computations. For example, it should probably be an expression without any side
effects.

As for unconstrained objects of a discriminated type, an implementation could
probably use the same sort of techniques as for functions returning
unconstrained objects, as we discussed recently on comp.lang.ada. That is,
either use a secondary stack or raw heap allocation, whichever works best for
the application domain, followed by storing a pointer to the allocated object on
the primary stack.

> Coincidentally, I recently received some private mail from Steve
> Baird on this very topic (some form of this will be in a future draft
> of AI12-0075-1, which proposes expansion of what is considered a
> static function, so I think it is OK to use it here):
>
>> We don't want to mess with the 3.8(12/3) rule.
>>
>> It might seem tempting to allow (in a component constraint) the case
>> of a call to a static function such that all parameters (including,
>> of course, defaulted parameters) are either static expressions or
>> names denoting discriminants.
>>
>> One property we need for discriminant-dependent component constraint
>> expressions is reproducibility - given the same discriminant values
>> we must always get the same result. This relaxation of the rules
>> would meet that requirement, but that's not enough.
>>
>> We also need, given a set of discriminant subtypes, to be able to
>> easily calculate the minimum and maximum values that an expression
>> might yield (e.g., we need to know how much storage is needed when we
>> declare an unconstrained object of a type which has an array
>> component with a discriminant-dependent index constraint).
>>
>> That's where this proposal would lead to implementation problems, as
>> an arbitrary static function would not necessarily be monotonic.

Please excuse my lack of implementation experience, but why is it necessary to
know the minimum and maximum of an expression in this situation? I understand
neither this comment nor the ones you provide below about max-size
implementations.

> I'm no fan of the max-size implementation approach, and am
> particularly appalled that we allow that to be the only approach used
> by an implementation. (We don't allow only table-based case
> statements, for instance, any arbitrary range has to work in a case
> statement. Why do we treat discriminated records differently??)
> Nevertheless, that train left the station decades ago, and it wouldn't
> be sensible to change the rules now, given the massive pain that would
> be caused. On top of which, some users cannot have any implicit
> non-stack based storage management (as is required when avoiding a
> max-size implementation), so an additional aspect would be required to
> force a max-size implementation when required. That's clearly very
> complex. (It's also my preference for the language, but I expect that
> I would be a group of one supporting that position, that's certainly
> been the way it was in the past.)
>
> We once considered (I forget when) allowing a very limited expansion
> to the
> 3.8(12/3) rule. Essentially, only D + C and D - C (C being a static
> expression) were suggested. It didn't seem that those would be
> sufficiently useful to support the expansion of the Standard wording
> and implementation complexity. It's probably possible to go slightly
> beyond that, but even multiply and divide can cause problems
> (especially when the values can be negative, which they can be in general).
>
> Ergo, I wouldn't expect a change in this area, as "thar be dragons",
> at least for compilers implemented as most are (to my knowledge, all
> but Janus/Ada).

In any case, thanks for your thoughtful reply :) It seems that whether I am
meaninglessly trying to revive an old fire or not, I am going to learn quite a
lot of interesting things by raising this issue again.

By the way, if you can remember identifiers or keywords for older Ada Issues
that I can read in order to get more familiar with this matter, I would be happy
to peruse them.

***************************************************************

From: Randy Brukardt
Sent: Wednesday, September 2, 2015  3:23 PM

...
> > This issue has been well-known for decades, but the problem is that,
> > while this looks natural, it has the potential to cause extreme pain
> > for implementations and correspondingly for some kinds of users
> > (those that cannot have any sort of heap operations in their code).
> > It's clear that some sort of restrictions have to be placed on
> the expression.
>
> I am somewhat surprised about the necessity of heap allocation that
> you mention. I do not see what would prevent a stack-based Ada
> implementation from implementing the above record as something that is
> allocated on a stack, maybe along the lines of...
>
> ... | Record discriminants | First array offset | First array length |
> First array data | Second array offset | <other
> components> | ...

Nothing does. But that would be a rarely-occurring special case for a
compiler, something most compiler-writers are going to avoid. (Especially as
we have to support whole record assignment from/to unconstrained objects of
the same type. It's a pain if there are multiple representations.)
Unconstrained objects of discriminated types are much more likely.

Moreover, the ARG tends to worry about the worst case, because that's where
the problems lie. We don't worry much about the best case, 'cause
implementers will make the best performing compiler that they can -- they
rarely need our help in doing that.

Anyway, unconstrained objects cause problems in two cases:
(1) If the object is constrained by its initial value, the discriminant
value may be unknown until runtime.
    Obj : Disc := Some_Function;

(2) If the type is mutable ("mutable" is the term that the RM doesn't define
but the AARM does to describe a type with defaulted discriminants), the
discriminants can be changed by a later assignment.

    type Rec (Len : Natural := 0) is record
        Val : String (1 .. Len);
    end record;

    Obj : Rec; -- This could be a component, too.

    Obj := (Len => 2, Val => "RR");
    ...
    Obj := (Len => 3, Val => "ZZZ");

The commonly used solution to this last problem is the "max size"
implementation; the object will be large enough to hold a value for any
possible value of the discriminant. (As such, this declaration won't compile
or will raise Storage_Error on most compilers.)
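
A back-of-the-envelope computation shows why (a Python sketch; the 32-bit
Natural'Last and 4-byte discriminant are assumptions about a typical
implementation, not facts about any particular compiler):

```python
NATURAL_LAST = 2**31 - 1  # assumed Natural'Last on a typical implementation

def rec_max_size_bytes():
    """Max-size layout for:
         type Rec (Len : Natural := 0) is record
            Val : String (1 .. Len);
         end record;
    i.e. a 4-byte discriminant plus room for Natural'Last characters."""
    return 4 + NATURAL_LAST

print(rec_max_size_bytes())  # about 2 GiB for one object
```

At roughly 2 GiB per object, an alloc-the-max declaration of `Obj : Rec;` is
doomed on most targets.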

There is no way that we would have different rules for the use of
discriminants for mutable and non-mutable types; that would be amazingly
confusing to programmers and would also pose a maintenance hazard (it would
be much harder to switch between a non-mutable and a mutable type).

> After all, by the time control reaches the declaration of a
> constrained object of a discriminated record type, all the
> information that is required for allocating the discriminated
> record on the stack should be available. It is only a matter
> of computing the various array offsets and length at runtime.
>
> Of course, as you say and as the email you quoted below
> elaborates, there must be some kind of rule enforcing the
> reproducibility of array offset and length computations. For
> example, it should probably be a expression without any side effect.

Right, but there is no way in current Ada to know that. Only predefined
operators are safe that way, which is rather limiting.

> As for unconstrained objects of a discriminated type, an
> implementation could probably use the same sort of techniques
> as for functions returning unconstrained objects, as we
> discussed recently on comp.lang.ada. That is, either use a
> secondary stack or raw heap allocation, whichever works best
> for the application domain, followed by storing a pointer to
> the allocated object on the primary stack.

No, that does not work for a mutable type. That's because it can change
shape (grow and use more memory is the problem case, of course) in a later
assignment, and that assignment might be inside of some other scope than the
one that created the object. The memory allocation here is not a stack. One
can just drop the memory on the floor, but for some reason users don't like
that. ;-)

Or one can have some sort of memory allocator associated with the scope
where the object is declared. That isn't cheap (although it does have the
effect of providing a correct implementation of dynamic accessibility
checking for almost free - so it's possible that you're paying the cost
anyway).

Moreover, for users that cannot have any sort of heap allocation, this
technique isn't going to work. They'd have to avoid the use of
discriminant-dependent components, which is way too fierce of a restriction.

> > Coincidentally, I recently received some private mail from Steve
> > Baird on this very topic (some form of this will be in a future draft
> > of AI12-0075-1, which proposes expansion of what is considered a
> > static function, so I think it is OK to use it here):
> >
> >> We don't want to mess with the 3.8(12/3) rule.
> >>
> >> It might seem tempting to allow (in a component constraint) the case
> >> of a call to a static function such that all parameters (including,
> >> of course, defaulted parameters) are either static expressions or
> >> names denoting discriminants.
> >>
> >> One property we need for discriminant-dependent component constraint
> >> expressions is reproducibility - given the same discriminant values
> >> we must always get the same result. This relaxation of the rules
> >> would meet that requirement, but that's not enough.
> >>
> >> We also need, given a set of discriminant subtypes, to be able to
> >> easily calculate the minimum and maximum values that an expression
> >> might yield (e.g., we need to know how much storage is needed when we
> >> declare an unconstrained object of a type which has an array
> >> component with a discriminant-dependent index constraint).
> >>
> >> That's where this proposal would lead to implementation problems, as
> >> an arbitrary static function would not necessarily be monotonic.
>
> Please excuse my lack of implementation experience, but why
> is it needed to know the minimum and maximum of an
> expression in this situation? I do not understand either
> this comment, nor the ones you provide below about max-size
> implementations.

I think I explained it above. For an unconstrained mutable discriminated
record type, most implementations use a max-size approach to allocating
memory. The alternative of having a dedicated storage pool available to be
used if a later assignment grows the object is considered too complex.

For a max-size approach to work, one has to be able to determine the max
size. But that's not easy if arbitrary static functions are allowed.
Consider:

      subtype S_A is Integer range -20..20;
      subtype S_B is Integer range -30..10;

      type A_Rec (A : S_A := 0; B : S_B := 0) is record
         Val : String (1 .. (if A*B > 0 then A*B else 0));
      end record;

What is the maximum size of Val? It's 600 characters, but to figure that out,
one has to multiply all 4 combinations of end-points. The "obvious"
technique of using the upper bounds gets the wrong answer of 200. (Here, the
conditional expression just serves to keep the value in range for String; I
could have used some other array type without the conditional expression and
without changing the issue.)
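
The corner-point arithmetic can be checked mechanically. A Python sketch
(purely illustrative), relying on the fact that A*B is linear in each
discriminant separately, so its extremes occur at corners of the box:

```python
from itertools import product

def max_val_length(a_bounds, b_bounds):
    """Maximum of (A*B if A*B > 0 else 0) over the two discriminant
    subtypes, given each subtype's (low, high) bounds.  Because A*B is
    bilinear, checking the four end-point products is sufficient."""
    corners = [a * b for a, b in product(a_bounds, b_bounds)]
    return max(max(corners), 0)

print(max_val_length((-20, 20), (-30, 10)))  # 600, at A = -20, B = -30
```

Using only the two upper bounds gives 20 * 10 = 200, which misses the true
maximum at the pair of negative bounds.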

It gets even worse for arbitrary conditional expressions, as they potentially
introduce discontinuities in the result values; you'd have to try every possible
value of the discriminants (which can be a whole lot if there are multiple
discriminants). It's not impossible, but complicated and slow.

(For instance, consider:

      type B_Rec (C : S_A := 0) is record
         Val : String (1 .. (if C = 0 then 50 else C+20));
      end record;

The largest size does not come from either endpoint.)
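
A quick scan (again a Python sketch, for illustration only) confirms that
nothing short of an exhaustive search over the discriminant subtype finds this
maximum:

```python
def max_b_rec_length(c_lo, c_hi):
    """Exhaustively evaluate the B_Rec length expression
    (if C = 0 then 50 else C + 20) over the discriminant subtype.
    The peak is at the interior point C = 0, so checking only the
    end points of the range is not enough."""
    return max(50 if c == 0 else c + 20 for c in range(c_lo, c_hi + 1))

print(max_b_rec_length(-20, 20))  # 50, at C = 0; the end points give 0 and 40
```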

...
> In any case, thanks for your thoughtful reply :) It seems
> that whether I am meaninglessly trying to revive an old fire
> for nothing or not, I am going to learn quite a lot of
> interesting things by raising this issue again.
>
> By the way, if you can remember identifiers or keywords for
> older Ada Issues that I can read in order to get more
> familiar with this matter, I would be happy to peruse them.

I'd look for the paragraph number in the AIs and minutes, that probably
won't find everything but at least it would provide a start.

***************************************************************

From: Tucker Taft
Sent: Wednesday, September 2, 2015  3:43 PM

> ... For a max-size approach to work, one has to be able to determine
> the max size. But that's not easy if arbitrary static functions are allowed.
> Consider:
>
>        subtype S_A is Integer range -20..20;
>        subtype S_B is Integer range -30..10;
>
>        type A_Rec (A : S_A := 0; B : S_B := 0) is record
>           Val : String (1 .. (if A*B > 0 then A*B else 0));
>        end record;
>
> What is the maximum size of Val? It's 600 characters, but to figure
> that out, one has to multiply all 4 combinations of end-points. The "obvious"
> technique of using the upper bounds gets the wrong answer of 200.
> (Here, the conditional expression just serves to keep the value in
> range for String; I could have used some other array type without the
> conditional expression and without changing the issue.) ...

To be fair, we could define rules similar to those for "static" predicates
(RM2012 3.2.4(15/3-22/3), which have similar concerns.  We could call it
"discriminant-static" in analogy with "predicate-static," with the key
requirement being the ability to determine statically the possible range of
values of the expression knowing the range of values of each discriminant,
without any sort of exhaustive search.

***************************************************************

From: Randy Brukardt
Sent: Wednesday, September 2, 2015  4:00 PM

Surely. But one wonders if that would lead to a result that is still pretty
limiting to users. Pretty much the only operators that work for such a scenario
are "+" and "-" (I showed why "*" is a problem, and "/" is obviously worse); no
conditionals, no static functions, unclear whether multiple discriminants could
be allowed.

Anyway, to be discussed if we want to get serious about this.

***************************************************************

From: Tucker Taft
Sent: Wednesday, September 2, 2015  10:07 PM

I don't see "*" as a serious problem.  It is true you have to split into
positive and negative subranges, and compute their resulting ranges separately.
But that is much better than an "exhaustive search."  For what it is worth, that
is exactly what the CodePeer static analyzer does, as it is performing this sort
of "value-set arithmetic" everywhere, on almost any operator.
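
A sketch of that kind of value-set arithmetic (in Python, purely illustrative;
this is not CodePeer's actual code): split each operand range at zero, then
combine the end points of the sign-homogeneous pieces, with no exhaustive
search over individual values:

```python
def mult_interval(a, b):
    """Exact (min, max) of x*y for x in interval a and y in interval b,
    by splitting each interval into negative / zero / positive pieces
    and multiplying only the end points of the pieces."""
    def split(lo, hi):
        parts = []
        if lo < 0:
            parts.append((lo, min(hi, -1)))   # negative piece
        if lo <= 0 <= hi:
            parts.append((0, 0))              # zero
        if hi > 0:
            parts.append((max(lo, 1), hi))    # positive piece
        return parts
    candidates = [x * y
                  for alo, ahi in split(*a) for blo, bhi in split(*b)
                  for x in (alo, ahi) for y in (blo, bhi)]
    return min(candidates), max(candidates)

print(mult_interval((-20, 20), (-30, 10)))  # (-600, 600)
```

Constant work per operator, regardless of how wide the discriminant subtypes
are.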

***************************************************************

From: Bob Duff
Sent: Wednesday, September 2, 2015  4:48 PM

> Terminology point: nothing in the above wording nor in the rest of
> your message has anything at all to do with "static expressions" as
> defined in 4.9.

I think Hadrien Grasland was saying that it should have the form of a static
expression, assuming you plugged in a static value for the discriminant.  Or
something like that.  This concept is similar to "predicate-static".

> I'm no fan of the max-size implementation approach, and am
> particularly appalled that we allow that to be the only approach used
> by an implementation. (We don't allow only table-based case
> statements, for instance, any arbitrary range has to work in a case
> statement. Why do we treat discriminated records differently??)

Probably because language designers had the notion that heap allocation should
never be implicit.  Sparse case statements don't require implicit heap
allocation.

Never mind that if you don't want implicit heap allocation, a better solution is
"pragma Restrictions (No_Implicit_Heap_Allocations);".

And never mind that in Ada 83, tasks require heap allocation, and in later
language versions most implementations use heap allocation.

And never mind that unbounded strings and Vectors and Holders and whatnot all do
heap allocation, arguably "implicit".

> [quoting Steve:]
> > We also need, given a set of discriminant subtypes, to be able to
> > easily calculate the minimum and maximum values that an expression
> > might yield (e.g., we need to know how much storage is needed when
> > we declare an unconstrained object of a type which has an array
> > component with a discriminant-dependent index constraint).

I'm not sure why we "need" that.  Maybe we'd "want" that, but:

    - Why "minimum"?

    - This is only needed/wanted for defaulted discrims.  We already
      have different rules (e.g. tagged types forbid defaulted
      discrims).

    - A conservative maximum can always be calculated -- e.g. 2**56-1
      bytes on x86-64 (which will always raise S_E on alloc-the-max
      implementations).

> We once considered (I forget when) allowing a very limited expansion
> to the
> 3.8(12/3) rule. Essentially, only D + C and D - C (C being a static
> expression) were suggested.

For what it's worth, those are the cases I've run into in practice.
In particular "Some_Array(0..Length_Discrim-1);".
But it's a rather kludgy restriction, reminiscent of the restrictions on array
indexing expressions in early FORTRAN.

***************************************************************

From: Steve Baird
Sent: Wednesday, September 2, 2015  6:08 PM

>   - Why "minimum"?

For a discriminant-dependent low bound in an index constraint.

>       This is only needed/wanted for defaulted discrims.  We already
>       have different rules (e.g. tagged types forbid defaulted
>       discrims).


If we change the rules to allow

    subtype Index is Integer range 0 .. Some_Dynamic_Expression;

    type T1 (D1 : Index) is
      record
        T1_Stuff : String
                     (1 .. Some_Complicated_But_Static_Function (D1));
                     -- legal because D1 has no default value
      end record;

then what is the component size of the following array type?

    type T2 (D2 : Index := 0) is
      record
        T2_Stuff: T1 (D1 => D2);
      end record;

    type T2_Vector is array (Character) of T2;

I'm taking it as a given that we want to preserve "allocate max size"
as a viable implementation option for mutable records.

Some set of rules could be concocted to allow the cases we know how to implement
while disallowing things like the above example; I'm just saying that the rule
to capture this distinction isn't completely obvious.

A similar example that would also need to be dealt with is
     type T3 (D3 : Index := 0) is new T1 (D1 => D3); , although new discrims
for a derived untagged type is more of a corner case.

***************************************************************

From: Hadrien Grasland
Sent: Thursday, September 3, 2015  2:31 PM

> [...]
>
> Anyway, unconstrained objects cause problems in two cases:
> (1) If the object is constrained by its initial value, the
> discriminant value may be unknown until runtime.
>     Obj : Disc := Some_Function;
>
> (2) If the type is mutable ("mutable" is the term that the RM doesn't
> define but the AARM does to describe a type with defaulted
> discriminants), the discriminants can be changed by a later assignment.
>
>     type Rec (Len : Natural := 0) is record
>         Val : String (1 .. Len);
>     end record;
>
>     Obj : Rec; -- This could be a component, too.
>
>     Obj := (Len => 2, Val => "RR");
>     ...
>     Obj := (Len => 3, Val => "ZZZ");
>
> The commonly used solution to this last problem is the "max size"
> implementation; the object will be large enough to hold a value for
> any possible value of the discriminant. (As such, this declaration
> won't compile or will raise Storage_Error on most compilers.)
>
> There is no way that we would have different rules for the use of
> discriminants for mutable and non-mutable types; that would be
> amazingly confusing to programmers and would also pose a maintenance
> hazard (it would be much harder to switch between a non-mutable and a mutable type).

I would argue that of those, only (2) is really serious, since (1) can almost
always (except in the case you highlight below) be solved using a run-time
allocation mechanism based either on heap allocation or a sufficiently large
program stack, whichever is available. The only situation where this would fail
would be an architecture where processes may have large static memory
requirements, but may not ask for much more at runtime (think non-growing
4 KB stacks :-( )

Indeed though, the mutable type problem is pretty serious. I didn't know about
that one, and I agree with you that silent heap allocation with
reference-counted pointers would probably be a better fit in this case, if the
target platform could provide that. Though this is probably never going to fly
for most implementations, because real-time programming and heap allocation
don't mix well.

> [...]
>
>> As for unconstrained objects of a discriminated type, an
>> implementation could probably use the same sort of techniques as for
>> functions returning unconstrained objects, as we discussed recently
>> on comp.lang.ada. That is, either use a secondary stack or raw heap
>> allocation, whichever works best for the application domain, followed
>> by storing a pointer to the allocated object on the primary stack.
> No, that does not work for a mutable type. That's because it can
> change shape (grow and use more memory is the problem case, of course)
> in a later assignment, and that assignment might be inside of some
> other scope than the one that created the object. The memory
> allocation here is not a stack. One can just drop the memory on the
> floor, but for some reason users don't like that. ;-)
>
> Or one can have some sort of memory allocator associated with the
> scope where the object is declared. That isn't cheap (although it does
> have the effect of providing a correct implementation of dynamic
> accessibility checking for almost free - so it's possible that you're
> paying the cost anyway).
>
> Moreover, for users that cannot have any sort of heap allocation, this
> technique isn't going to work. They'd have to avoid the use of
> discriminant-dependent components, which is way too fierce of a restriction.

So if I get your point, the main problem with allowing broader use of
discriminants is that it could break max-size-based implementation techniques,
which themselves are required to implement mutable types without heap
allocation.

But as you have also shown above, these techniques are already trivially broken
by using a "wrong" discriminant type to define array bounds, and consequently
can only be applied to a limited subset of the mutable types allowed by the Ada
standard anyway. So what would change in the end?

Couldn't an implementation that doesn't support heap allocation just equate an
array index of nontrivial value to the mythical universal_integer, throw a
warning about objects of an unbounded mutable type causing problems, and then
die at runtime the way GNAT did when I exposed it to your code snippet above?
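For concreteness, here is a minimal sketch (hypothetical names) of the failure mode under discussion: a mutable type whose array component is sized by a Natural discriminant, which a max-size implementation must size for Natural'Last.

```ada
procedure Demo is
   --  Minimal sketch (hypothetical names).  A max-size implementation
   --  has to reserve room for Natural'Last characters in any
   --  unconstrained object, so elaborating X typically raises
   --  Storage_Error at run time (GNAT warns at compile time).
   type Buffer (Length : Natural := 0) is record
      Data : String (1 .. Length);
   end record;

   X : Buffer;                  --  unconstrained; may grow on assignment
   Y : Buffer (Length => 80);   --  constrained; only 80 characters needed
begin
   null;
end Demo;
```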

> [...]
>
> [...] For an unconstrained mutable discriminated record type, most
> implementations use a max-size approach to allocating memory. The
> alternative of having a dedicated storage pool available to be used if
> a later assignment grows the object is considered too complex.
>
> For a max-size approach to work, one has to be able to determine the
> max size. But that's not easy if arbitrary static functions are allowed.
> Consider:
>
>       subtype S_A is Integer range -20..20;
>       subtype S_B is Integer range -30..10;
>
>       type A_Rec (A : S_A := 0; B : S_B := 0) is record
>          Val : String (1 .. (if A*B > 0 then A*B else 0));
>       end record;
>
> What is the maximum size of Val? It's 60 characters, but to figure
> that out, one has to multiply all 4 combinations of end-points. The "obvious"
> technique of using the upper bounds gets the wrong answer of 20.
> (Here, the conditional expression just serves to keep the value in
> range for String; I could have used some other array type without the
> conditional expression and without changing the issue.)
>
> It gets even worse for arbitrary conditional expressions as that
> potentially makes discontinuities in the result values; you'd have to
> try every possible value of the discriminants (which can be a whole
> lot if there are multiple discriminants). It's not impossible, but complicated and slow.
>
> (For instance, consider:
>
>       type B_Rec (C : S_A := 0) is record
>          Val : String (1 .. (if C = 0 then 50 else C+20));
>       end record;
>
> The largest size does not come from either endpoint.)

I guess a max-size implementation could do it like this, then:

* Try to put a reasonable upper bound on mutable object size using trivial
  heuristics (e.g. a raw discriminant "+" or "-" something known statically,
  where the computation is easy).

* In case of failure, assume the size of the object can be anything from the
  conceptual universal_integer'First to universal_integer'Last (and beyond!),
  and throw a compiler warning about mutable object usage possibly causing
  errors at runtime, the way GNAT does.

* In the special case where the array bound values are provided by an expression
  function, use the type of the result to put less pessimistic bounds on the
  mutable object's size. Possibly throw a warning anyway if the range remains
  gigantic.
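The third bullet could look like the following sketch (hypothetical names; the declaration is illegal under the current 3.8(12/3) rule, the point being what an implementation could infer if it were allowed):

```ada
procedure Sketch is
   subtype Small is Natural range 0 .. 100;

   --  The result subtype of the expression function bounds the
   --  component: Data can never exceed Small'Last = 100 characters,
   --  so a max-size implementation need not assume Natural'Last.
   function Padded (D : Natural) return Small is
     (if D > 90 then 100 else D + 10);

   type Rec (D : Natural := 0) is record
      Data : String (1 .. Padded (D));  --  illegal today: D not alone
   end record;
begin
   null;
end Sketch;
```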

> [...]
>> By the way, if you can remember identifiers or keywords for older Ada
>> Issues that I can read in order to get more familiar with this
>> matter, I would be happy to peruse them.
> I'd look for the paragraph number in the AIs and minutes, that
> probably won't find everything but at least it would provide a start.

Unfortunately, trying it with "3.8(12", which is the least specific search I can
do before being swamped with irrelevant junk, didn't yield many results. Mostly
AIs about the RM sometimes violating this rule, typo corrections, and things
like that.

Well, guess this will have to do.

***************************************************************

From: Hadrien Grasland
Sent: Thursday, September 3, 2015  2:47 PM

> [...]
>
>> Terminology point: nothing in the above wording nor in the rest of
>> your message has anything at all to do with "static expressions" as
>> defined in 4.9.
> I think Hadrien Grasland was saying that it should have the form of a
> static expression, assuming you plugged in a static value for the
> discriminant.  Or something like that.  This concept is similar to
> "predicate-static".

Perhaps that's what I had in mind initially (I don't remember), but Randy is
right that what I wrote in the end was terribly wrong ;-)

>> I'm no fan of the max-size implementation approach, and am
>> particularly appalled that we allow that to be the only approach used
>> by an implementation. (We don't allow only table-based case
>> statements, for instance, any arbitrary range has to work in a case
>> statement. Why do we treat discriminated records differently??)
> Probably because language designers had the notion that heap
> allocation should never be implicit.  Sparse case statements don't
> require implicit heap allocation.
>
> Never mind that if you don't want implicit heap allocation, a better
> solution is "pragma Restrictions(No_Implicit_Heap);".
>
> And never mind that in Ada 83, tasks require heap allocation, and in
> later language versions most implementations use heap allocation.
>
> And never mind that unbounded strings and Vectors and Holders and
> whatnot all do heap allocation, arguably "implicit".

I agree, this is all very hypocritical.

I'm not a huge fan of unnecessary heap allocation, but in some scenarios, using
the heap simply cannot be avoided (or an implementation simply doesn't want to
do without, because it's much more difficult), and it would be WAY better to
have some kind of restriction available to allow a programmer to prevent it
from happening.

>> [quoting Steve:]
>>> We also need, given a set of discriminant subtypes, to be able to
>>> easily calculate the minimum and maximum values that an expression
>>> might yield (e.g., we need to know how much storage is needed when
>>> we declare an unconstrained object of a type which has an array
>>> component with a discriminant-dependent index constraint).
> I'm not sure why we "need" that.  Maybe we'd "want" that, but:
>
>     - Why "minimum"?
>
>     - This is only needed/wanted for defaulted discrims.  We already
>       have different rules (e.g. tagged types forbid defaulted
>       discrims).
>
>     - A conservative maximum can always be calculated -- e.g 2**56-1
>       bytes on x86-64 (which will always raise S_E on alloc-the-max
>       implementations).
>
>> We once considered (I forget when) allowing a very limited expansion
>> to the
>> 3.8(12/3) rule. Essentially, only D + C and D - C (C being a static
>> expression) were suggested.
> For what it's worth, those are the cases I've run into in practice.
> In particular "Some_Array(0..Length_Discrim-1);".
> But it's a rather kludgy restriction, reminiscent of the restrictions
> on array indexing expressions in early FORTRAN.

Yes. A better solution would seem to be the one that you proposed above, and
that I also echoed in my reply to Randy's first mail: let the implementation
compute max-size correctly when it is able to (e.g. discriminant "+" or "-"
static data), and when it is not able to, assume the worst, throw a warning, and
crash at run-time.

***************************************************************

From: Randy Brukardt
Sent: Thursday, September 3, 2015  3:11 PM

> So if I get your point, the main problem with allowing broader use of
> discriminants is that it could break max-size-based implementation
> techniques, which themselves are required to implement mutable types
> without heap allocation.
>
> But as you have also shown above, these techniques are already
> trivially broken by using a "wrong" discriminant type to define array
> bounds, and consequently can only be applied to a limited subset of
> the mutable types allowed by the Ada standard anyway. So what would
> change in the end?
>
> Couldn't an implementation that doesn't support heap allocation just
> equate an array index of nontrivial value to the mythical
> universal_integer,  throw a warning about objects of an unbounded
> mutable type causing problems, and then die at runtime the way GNAT
> did when I exposed it to your code snippet above?

The problem with this is that almost all compilers solely support a max-size
implementation. (Janus/Ada is the only one that I know of that does not do
that.) What would the point be of providing an expansive new feature that
actually works in extremely limited circumstances? That would be lying to users.
Moreover, it would make discriminated types even less portable than they
currently are (with many reasonable definitions getting rejected at runtime).

Now, I personally think that implementations should never have been allowed to
use solely a max-size implementation. If someone used restriction
No_Implicit_Heap or used a similar type-related aspect (which should be
defined), then the type should be rejected (which is far more useful than
getting Storage_Error at some later point). Those restrictions specifically
invoke implementation-dependent behavior, so that seems fine. But otherwise
failing to implement a reasonable type declaration simply should not be an
option.

But I've been a lone voice in the wilderness on this one, and I don't expect
that to change at this late date.

...
> I guess a max-size implementation could do it like this, then:
>
> * Try to put a reasonable upper bound on mutable object size using
> trivial heuristics (e.g. for raw discriminant usage "+"
> or "-" something known statically, where the computation is easy).
> * In case of failure, assume the size of the object can be anything
> from the conceptual universal_integer'First to universal_integer'Last
> (and beyond!), and throw a compiler warning about mutable object
> usage possibly causing errors at runtime, the way GNAT does.
> * In the special case where the array bound values are provided by an
> expression function, use the type of the result to put less
> pessimistic bounds on the mutable object's size. Possibly throw a
> warning anyway if the range remains gigantic.

I don't believe that an implementation should have the option to mishandle the
implementation of reasonable types. That's a serious portability problem, as
every implementation will do that differently.

> > [...]
> >> By the way, if you can remember identifiers or keywords for older
> >> Ada Issues that I can read in order to get more familiar with this
> >> matter, I would be happy to peruse them.
> > I'd look for the paragraph number in the AIs and minutes, that
> > probably won't find everything but at least it would provide a start.
>
> Unfortunately, trying it with "3.8(12", which is the least specific I
> can do before being swamped with irrelevant junk, didn't yield many
> results. Mostly AIs about the RM sometimes violating this rule, typo
> corrections, and things like that.
>
> Well, guess this will have to do.

I suspect that this thread (including Steve and Bob's messages) covered most of
the problems introduced by an expansion of the rules. So you probably wouldn't
find much different elsewhere.

***************************************************************

From: Bob Duff
Sent: Thursday, September 3, 2015  3:33 PM

> >   - Why "minimum"?
>
> For a discriminant-dependent low bound in an index constraint.

Ah, yes, of course.  I think I have used that feature approximately zero times
in my life (not counting compiler-test programs).

> >       This is only needed/wanted for defaulted discrims.  We already
> >       have different rules (e.g. tagged types forbid defaulted
> >       discrims).
>
>
> If we change the rules to allow
>
>     subtype Index is Integer range 0 .. Some_Dynamic_Expression;
>
>     type T1 (D1 : Index) is
>       record
>         T1_Stuff : String
>                      (1 .. Some_Complicated_But_Static_Function (D1));
>                      -- legal because D1 has no default value
>       end record;
>
> then what is the component size of the following array type?
>
>     type T2 (D2 : Index := 0) is
>       record
>         T2_Stuff: T1 (D1 => D2);
>       end record;
>
>     type T2_Vector is array (Character) of T2;

I guess part of your point is that if we allow the above, we might as well allow
a default on T1 as well.

And I guess the answer to the "component size" question is approximately
Integer'Last*Storage_Unit bits, likely leading to Storage_Error.

> I'm taking it as a given that we want to preserve "allocate max size"
> as a viable implementation option for mutable records.
>
> Some set of rules could be concocted to allow the cases we know how to
> implement while disallowing things like the above example; I'm just
> saying that the rule to capture this distinction isn't completely
> obvious.

Agreed, but the language doesn't necessarily need to make things illegal -- we
can count on GNAT to raise S_E and warn about it.

(By the way, I did some experiments, and discovered that the GNAT warnings are
conservative -- it warns about S_E in some cases where it ends up allocating 100
or so bytes.  Probably because the warning comes from the front end, and the
record layout is done by the back end.)

***************************************************************

From: Bob Duff
Sent: Thursday, September 3, 2015  3:44 PM

> The problem with this is that almost all compilers solely support a
> max-size implementation. (Janus/Ada is the only one that I know of
> that does not do
> that.)

I think the Rational compiler also uses heap allocation.

And the old Alsys compiler also, except it was buggy (a renaming could become a
dangling pointer).

> But I've been a lone voice in the wilderness on this one, and I don't
> expect that to change at this late date.

Not quite "lone"; I think Steve agrees with you.
And I'm sort of ambivalent.

***************************************************************

From: Hadrien Grasland
Sent: Thursday, September 3, 2015  4:30 PM

>> Couldn't an implementation that doesn't support heap allocation just
>> equate an array index of nontrivial value to the mythical
> universal_integer, throw a warning about objects of an unbounded
>> mutable type causing problems, and then die at runtime the way GNAT
>> did when I exposed it to your code snippet above ?
> The problem with this is that almost all compilers solely support a
> max-size implementation. (Janus/Ada is the only one that I know of
> that does not do
> that.) What would the point be of providing an expansive new feature
> that actually works in extremely limited circumstances? That would be
> lying to users. Moreover, it would make discriminated types even less
> portable than they currently are (with many reasonable definitions
> getting rejected at runtime).

My point was that for mutable discriminated types, the harm is already done. You
have shown yourself that something as trivial as an array whose bounds are
parametrized by a Natural will violently implode at runtime under most
implementations using the max-size technique, yet work just fine on a couple of
others. As of today, mutable discriminated types are unportable because most
implementations do not do a very good job at implementing them.

My understanding is that the only way to fix this (long-term, Ada
202x-ish) would be to...

1/ Somehow amend the Ada standard so that it rejects pure max-size
implementations that cannot use the heap when it is the only sensible option

2/ Add a restriction so that users who want no heap allocation can still get
that behavior, and make the use of mutable types illegal when that restriction
is in effect

It actually seems that we agree on this one, when you write...

> Now, I personally think that implementations should never have been
> allowed to use solely a max-size implementation. If someone used
> restriction No_Implicit_Heap or used a similar type-related aspect
> (which should be defined), then the type should be rejected (which is
> far more useful than getting Storage_Error at some later point). Those
> restrictions specifically invoke implementation-dependent behavior, so
> that seems fine. But otherwise failing to implement a reasonable type
> declaration simply should not be an option.
>
> But I've been a lone voice in the wilderness on this one, and I don't
> expect that to change at this late date.

...so I guess I can join you in the wilderness if you like ;-)

> I don't believe that an implementation should have the option to
> mishandle the implementation of reasonable types. That's a serious
> portability problem, as every implementation will do that differently.

But again, isn't the harm already done on this one?

And, more interestingly, should it be considered a separate defect of the Ada
standard that so many implementations, which are considered valid, can implement
discriminated types in a fashion that makes some of them, even very simple ones,
largely unportable? To put it another way, is the standard or the implementation
at fault in this case?

***************************************************************

From: Randy Brukardt
Sent: Thursday, September 3, 2015  4:55 PM

...
> And, more interestingly, should it be considered a separate defect of
> the Ada standard that so many implementations, which are considered
> valid, can implement discriminated types in a fashion that makes some
> of them, even very simple ones, largely unportable? To put it another
> way, is the standard or the implementation at fault in this case?

It's not a defect of the Standard, since capacity limits are left to the
implementers. An implementation that only allowed one component in each record
would meet the requirements of the Standard, but obviously wouldn't be very
useful. The Standard can't say anything about "goodness" of an implementation,
since trying to specify limits is a never-ending game of whack-a-mole (and it
would be problematical in an existing Standard as we wouldn't want to force
existing implementations to change most things).

As Steve Baird points out in a private mail, the real fault here is that the
ACATS permits this. My recollection is that that was an explicit decision -
tests were removed from an early version of the ACVC (the predecessor of the
ACATS) that used Integer discriminants in this way. (But I could be
misremembering, that was a very long time ago -- mid-1980s.)

While the ACATS cannot test "goodness" of an implementation either, it does
effectively set a minimum standard, as it is a pain in the neck to dispute tests
and use modified grading to avoid implementation limits. An implementation that
couldn't handle multiple record components couldn't compile large parts of the
ACATS, so it is necessary to make the implementation limits allow compilation of
the ACATS. That at least provides a minimum level of "goodness".

The problem here is that trying to introduce such a test would lead to a large
fight. On top of which, I think that the aspect Contiguous_Representation (or
something like that) would need to be introduced so that users could force
max-size behavior if they need it on a per-type basis (and that would be good
for portability in any case). [Restriction No_Implicit_Heap_Allocation would do
the trick for program-wide requirements.] So I wouldn't try to use the ACATS to
introduce such a requirement without at least some ARG support.

***************************************************************

From: Hadrien Grasland
Sent: Monday, September 7, 2015  4:56 AM

So, if I try to summarize, we seem to be discussing two separate, but related
issues here.

1/ Currently, the applicability of discriminants is limited in unnatural ways,
due to the restrictions brought forth by paragraph 3.8(12/3) of the RM. The
rules do not even allow the specification of an array by its index offset and
size, for example, which seems unnecessarily draconian. Thus, it would be very
nice if these restrictions could be lifted to allow for more general expressions
based on discriminants in record types.

Two problems prevent this from happening:

- Not every expression is suitable for use inside of a discriminated record
  definition. We would like to have an expression with no side effect, to ensure
  reproducibility of computation results. The global annotations being proposed
  as part of AI12-0079-1 could probably help with this when it comes to function
  calls.

- Complex expressions, especially those including calls to non-intrinsic
  functions, make it hard to impossible for implementations to deduce reasonable
  bounds on the size of discriminated types. Such bounds are, in turn, necessary
  to the implementation of mutable discriminated types in heap-free
  environments.

A way to address both of these problems is to restrict the kind of expressions
allowed inside of discriminated types to certain frequently needed scenarios,
such as the application of integer offsets ("+" and "-") to integral
discriminants. That works, provided that the "+" and "-" operators are
implemented without side effects (...we have to think about the worst case,
right?). But as Bob Duff points out, it is kludgy, so a more general solution
would be preferable if possible.
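The frequently needed case looks like the following sketch (hypothetical names); the zero-based component is exactly the "D - C" form that the limited extension would legalize:

```ada
procedure Sketch is
   type Byte_Array is array (Natural range <>) of Character;

   type Packet (Length : Natural) is record
      --  "D - C" form: illegal under Ada 2012's 3.8(12/3), which
      --  requires the discriminant to appear alone as a direct_name.
      Data : Byte_Array (0 .. Length - 1);
   end record;
begin
   null;
end Sketch;
```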

2/ In general, current heap-free implementations of mutable discriminated types
are kludgy. It is possible to break them with trivial code examples, such as the
one given by Randy Brukardt of a string type with Natural bounds. Arguably, a
better option would be to make the functionality depend on the availability of
heap allocation for such types, and introduce a restriction No_Implicit_Heap (or
similar), along with possibly type-specific aspects, to prevent the use of
heap-allocated types. But that would be, in itself, opening a new can of worms.

Would you agree with this summary, or am I missing something important there ?

***************************************************************

From: Tucker Taft
Sent: Monday, September 7, 2015  8:51 AM

> A way to address both of these problems is to restrict the kind of
> expressions allowed inside of discriminated types to certain
> frequently needed scenarios, such as the application of integer offsets
> ("+" and "-") to integral discriminants. That works, provided that the
> "+" and "-" operators are implemented without side effects (...we have
> to think about the worst case, right?). But as Bob Duff points out, it is kludgy, so a more general solution would be preferable if possible.

Perhaps, but I believe this is the way to go, if we go in this direction at all.
But note that we do not need to restrict it this much.  We could include "*",
"/", "**" and perhaps a few others, so long as the range of possible values can
be determined statically without an exhaustive search through all possible
values of the discriminant.
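A sketch of the kind of case this wider set would admit (hypothetical names): for a monotonic expression such as D * 2 + 1, the extreme values occur at the discriminant subtype's endpoints, so the maximum size falls out statically without enumerating every discriminant value.

```ada
procedure Sketch is
   subtype S is Integer range 0 .. 20;

   type Wide (D : S := 0) is record
      --  D * 2 + 1 is monotonic over S, so the maximum length (41)
      --  is reached at D = S'Last; no exhaustive search needed.
      Data : String (1 .. D * 2 + 1);
   end record;
begin
   null;
end Sketch;
```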

> 2/In general, current heap-free implementations of mutable discriminated types are kludgy.
> It is possible to break them with trivial code examples, such as the
> one given by Randy Brukardt of a string type with Natural bounds.
> Arguably, a better option would be to make the functionality depend on
> the availability of heap allocation for such types, and introduce a
> restriction No_Implicit_Heap (or similar), along with possibly
> type-specific aspects, to prevent the use of heap-allocated types. But that would be, in itself, opening a new can of worms.

This is asking for trouble, in my view.  I don't think we should ever require
the use of a heap for mutable discriminated types.

***************************************************************

From: Bob Duff
Sent: Monday, September 7, 2015  8:54 AM

> Would you agree with this summary, or am I missing something important
> there ?

Good summary.

I am inclined to oppose any changes in this area.  It's not sufficiently broken,
and I'm sure there are other more important changes to make.

I could be convinced to allow the "Discrim+1" case and similar.
But I am strongly opposed to any change that would require heap allocation.  Too
much implementation difficulty (for some implementations) for too little gain.
If you want growable things, use Containers.Vectors or similar.

By the way, the restriction No_Implicit_Heap_Allocations already exists.
I don't know what "implicit" means; the RM doesn't define it.

***************************************************************

From: Jean-Pierre Rosen
Sent: Monday, September 7, 2015  9:37 AM

It doesn't define "heap" either...

***************************************************************

From: Bob Duff
Sent: Monday, September 7, 2015  12:15 PM

True.  But I know what "heap allocation" means; at least I think I do.
I'm not so sure about what makes such an allocation "implicit".

***************************************************************

From: Simon Wright
Sent: Monday, September 7, 2015  12:38 PM

"not explicit", I suppose; i.e. without the use of the keyword "new".

This usage of "implicit" isn't given by dictionary.com, for example, but (as
"unstated") does appear in the thesaurus.

***************************************************************

From: Bob Duff
Sent: Monday, September 7, 2015  2:32 PM

For sure, "new" is an explicit heap allocation.  But what about "Append"
(in Ada.Containers.Vectors)?  Everybody knows it does heap allocation.
And I can point to the line of code with "Append", and say, "See that?
That right there is doing a heap allocation."  Sounds kind of explicit.
But there's no "new" in the code (no fair peeking at the body of Append; it
might not even be written in Ada).

***************************************************************

From: Simon Wright
Sent: Tuesday, September 8, 2015  4:14 PM

Doesn't sound explicit to me.

It seems that GNAT implements the restriction by forbidding "new" while
compiling the RTS. So if I were to copy Ada.Containers.Vectors outside the RTS
(i.e. outside the Ada hierarchy) I could use it in a restricted environment
without problems.

The Ravenscar Guide (from which the "implicit" wording derives) doesn't seem to
consider Ada.Containers as part of "the implementation".

***************************************************************

From: Hadrien Grasland
Sent: Wednesday, September 9, 2015  1:28 PM

>> Would you agree with this summary, or am I missing something
>> important there ?
> Good summary.
>
> I am inclined to oppose any changes in this area.  It's not
> sufficiently broken, and I'm sure there are other more important changes to make.
>
> I could be convinced to allow the "Discrim+1" case and similar.
> But I am strongly opposed to any change that would require heap
> allocation.  Too much implementation difficulty (for some
> implementations) for too little gain.  If you want growable things,
> use Containers.Vectors or similar.
>
> By the way, the restriction No_Implicit_Heap_Allocations already exists.
> I don't know what "implicit" means; the RM doesn't define it.

In the end, I guess I will have to agree with you and Tucker Taft. It sounds
like the only changes that can be added without either...

1/ An unstated requirement for heap allocation

2/ More annoying distinctions between constrained and mutable discriminated
types

would feel like too little, too late, in comparison to other pending issues. And
lack of forced heap allocation is also something I like about Ada, considering I
plan to use this language in my next attempt at OSdeving.

Thanks for this very thoughtful discussion anyway, everyone!

***************************************************************

Questions? Ask the ACAA Technical Agent