Version 1.5 of ais/ai-00267.txt


!standard 4.6 (33)          02-06-07 AI95-00267/05
!standard A.5.3 (41)
!class amendment 01-05-14
!status Amendment 200Y 02-05-10
!status ARG Approved 9-0-0 01-10-07
!status work item 01-05-14
!status received 01-05-14
!priority Medium
!difficulty Easy
!subject Fast float-to-integer conversions
!summary
An attribute is added to enable high-performance conversions from floating point types to integer types when the exact rounding does not matter.
!problem
4.6(33) specifies the rounding for conversions from floating point types to integer types. However, the specified rounding is different from the default rounding provided for conversions on many common machines. Thus, such conversions are quite expensive.
For example, one vendor reports that the effect of the specified rounding on the performance of their implementation of the Generic_Elementary_Functions was of the order of 10-20%. For this application, the conversion is used to provide an index into a lookup table. The rounding used doesn't have any effect on the quality of the result (because the two table entries to which you can round in the midpoint case are equally good), but it does affect the performance.
!proposal
The attribute Machine_Rounding is added to A.5.3.
!wording
S'Machine_Rounding
S'Machine_Rounding denotes a function with the following specification:
function S'Machine_Rounding (X : T) return T
The function yields the integral value nearest to X. If X lies exactly halfway between two integers, one of those integers is returned, but which of them is returned is unspecified. A zero result has the sign of X when S'Signed_Zeros is True. This function provides access to the rounding behavior which is most efficient on the target processor.
AARM note:
The intended use of this attribute is in a type conversion to some integer type:
Some_Integer(Some_Float'Machine_Rounding(X))
Implementations should detect this case to generate fast code for the conversion of X to Some_Integer. In particular, the usual float-to-integer type conversion rounding code is not necessary, as the value has already been rounded. (This applies to all of the rounding and truncation attributes defined in A.5.3, but is critical for the use of this attribute.)
!example
The following example illustrates a possible usage of this attribute to implement the Arcsinh function (this is an excerpt of real code). This function is implemented using polynomial approximations over small intervals. The coefficients of the polynomial are defined in a constant table:
   type Element is
      record
         X              : Float_Type;
         C0, C1, C2, C3 : Float_Type;
      end record;

   Data : constant array (Long_Integer range <>) of Element :=
     (7 => (16#0.1172906#E0, 16#0.116F1D00EE#E0, 16#0.FF6852006#E0,
            -16#0.8A9C5782#E-1, -16#0.29C8C88A8#E0),
      8 => (16#0.13FB6A2#E0, 16#0.13F63C00DC#E0, 16#0.FF394406C#E0,
            -16#0.9E676A72#E-1, -16#0.29833F8E#E0),
      9 => (16#0.1685FA#E0, 16#0.167E904#E0, 16#0.FF03D007#E0,
            -16#0.B21C84C4#E-1, -16#0.2934729F8#E0),
      ...);
The body of the Arcsinh function selects the proper interval by doing some simple computation based on its argument, and loads the coefficients of the polynomial from the above table. It then computes the value of the polynomial at X, and returns the result.
   function Arcsinh (X : Float_Type'Base) return Float_Type'Base is
      I : Integer;
      Ci0, Ci1, Ci2, Ci3, Xi, Y, Z : Float_Type'Base;
   begin
      ...
      Y := 16#1.D6978# * X * (16#38.B433# - X * (12.0 - X));
      I := Integer (Y);
      Xi := Float_Type'Base (Data (I).X);
      Ci0 := Float_Type'Base (Data (I).C0);
      Ci1 := Float_Type'Base (Data (I).C1);
      Ci2 := Float_Type'Base (Data (I).C2);
      Ci3 := Float_Type'Base (Data (I).C3);
      Z := X - Xi;
      return Ci0 + (Z * (Ci1 + Z * (Ci2 + Z * Ci3)));
   end;
Note that the evaluation of I involves a float-to-integer conversion. If Y happens to lie exactly between two integers, 4.6(33) requires that the conversion round away from zero.
However, in this instance we really don't care if the value returned by the conversion of, say, 41.5, is 41 or 42. The polynomials at indices 41 and 42 are, by construction, both very good approximations of the function being computed when Y = 41.5, so we can use either of them. (It is not even necessary for two invocations of Arcsinh to round in the same fashion: the rounding mode may have been changed by altering some control register, but Arcsinh will still return results within the required accuracy.)
With the proposed amendment, the computation of I will be written:
I := Integer (Float_Type'Machine_Rounding (Y));
and it will use the fastest rounding mechanism available on the underlying hardware.
!discussion
We considered making the return type of the attribute Integer or universal_integer, but these would be inconsistent with the other rounding attributes defined in A.5.3.
The intended use of this attribute is in a type conversion to some integer type:
Some_Integer(Some_Float'Machine_Rounding(X))
Implementations should detect this case to generate fast code for the conversion of X to Some_Integer. In particular, the usual float-to-integer type conversion rounding code is not necessary, as the value has already been rounded. (This probably applies to all of the rounding and truncation attributes defined in A.5.3.)
The rounding of Machine_Rounding when the value is halfway between two integers is purposely left unspecified. It is not intended that users depend on the actual rounding used. (If they need a specific mode of rounding, one of the other rounding attributes should be used.) If we had said that the rounding was implementation-defined, we would be requiring documentation of the rounding used, which potentially would encourage users to depend on a particular rounding.
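The following is a minimal sketch of the intended usage, assuming a compiler that already implements the proposed attribute. The procedure name, the function Index_Of and the contents of the table are hypothetical and serve only as illustration. Because the midpoint result is unspecified, the sketch checks only that one of the two neighbouring integers was returned.

```ada
with Ada.Text_IO;

procedure Machine_Rounding_Demo is

   --  Hypothetical lookup table; the values are placeholders.
   type Table is array (Integer range 0 .. 3) of Float;
   Data : constant Table := (0.0, 1.0, 4.0, 9.0);

   --  Hypothetical helper: convert Y to a table index using whatever
   --  rounding is cheapest on the target.
   function Index_Of (Y : Float) return Integer is
   begin
      --  The attribute yields a Float with an integral value, so the
      --  enclosing type conversion has no further rounding to perform.
      return Integer (Float'Machine_Rounding (Y));
   end Index_Of;

   I : constant Integer := Index_Of (1.5);  --  1.5 is a midpoint case

begin
   --  Which neighbour is chosen is unspecified; only the range is known.
   pragma Assert (I = 1 or I = 2);

   Ada.Text_IO.Put_Line ("Index:" & Integer'Image (I));
   Ada.Text_IO.Put_Line ("Entry:" & Float'Image (Data (I)));
end Machine_Rounding_Demo;
```

A program that needs the same index on every target should use Rounding, Unbiased_Rounding or Truncation instead.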
!corrigendum A.5.3(41)
Insert after the paragraph:
The function yields the integral value nearest to X, rounding toward the even integer if X lies exactly halfway between two integers. A zero result has the sign of X when S'Signed_Zeros is True.
the new paragraphs:
S'Machine_Rounding
S'Machine_Rounding denotes a function with the following specification:
function S'Machine_Rounding (X : T) return T
The function yields the integral value nearest to X. If X lies exactly halfway between two integers, one of those integers is returned, but which of them is returned is unspecified. A zero result has the sign of X when S'Signed_Zeros is True. This function provides access to the rounding behavior which is most efficient on the target processor.
!ACATS test
Create a C-Test to check for the existence of this attribute. (It can't be a test case for CXA5015, as this is an amendment, and will probably not be required for years.)
!appendix

From: Randy Brukardt
Sent: Thursday, March 29, 2001 6:02 PM

> I still believe the "biased" rounding
> approach for real => integer was the right decision, but in retrospect
> it seems like it was an unwise generalization of that decision to
> the floating point => floating point rounding case.

Well, I disagree with the "biased" rounding for integers, at least as the
default option. The problem is that it is very expensive to implement on
Pentiums (the best I can do is about 100 times slower and 20 times larger
than an "unbiased" conversion). I agree that it is sometimes convenient to
have it defined that way, but when you don't care specifically about the
rounding, you are paying a huge price from which there is no way to escape
(because there is no way to avoid the type conversion which carries along
the expense). It would have been much better to leave the "default" rounding
(in type conversions to integers) implementation-defined, letting the users
who care use the various attributes defined in A.5.3 to get the specific
rounding they need. Moreover, type conversions look "cheap", and it is a
surprise when they are not cheap.

****************************************************************

From: Pascal Leroy
Sent: Friday, March 30, 2001 2:32 AM

> Well, I disagree with the "biased" rounding for integers, at least as the
> default option. The problem is that it is very expensive to implement on
> Pentiums (the best I can do is about 100 times slower and 20 times larger
> than an "unbiased" conversion). I agree that it is sometimes convenient to
> have it defined that way, but when you don't care specifically about the
> rounding, you are paying a huge price from which there is no way to escape
> (because there is no way to avoid the type conversion which carries along
> the expense).

I agree with this assessment.  We have run into the same problem on a variety
of processors.  At some point we considered having a configuration pragma to
mean I-don't-give-a-damn-about-mid-point-rounding, but we never had time to
implement it.

One practical situation where this shows up is the implementation of the
elementary functions.  You perform some FP computation, based on which you
extract pre-computed values from a table.  The index in the table is obtained by
float-to-integer conversion.  You really don't care what happens in the
mid-point case, because the two table entries to which you can round are equally
good (or equally bad), but you do care if the silly language requires 12 extra
instructions to do the conversion.

****************************************************************

From: Robert Dewar
Sent: Friday, March 30, 2001 5:13 AM

No, I think the biased rounding of integers is too peculiar. This was
frequently reported as a bug by users of Alsys Ada. The arguments for
unbiased rounding just do not apply in this case in my view.

And Randy's figure of a factor of 100 hit is way way way off on the ia32.

****************************************************************

From: Randy Brukardt
Sent: Friday, March 30, 2001 12:20 PM

> No, I think the biased rounding of integers is too peculiar. This was
> frequently reported as a bug by users of Alsys Ada. The arguments for
> unbiased rounding just do not apply in this case in my view.

The problem is simply that sometimes you want fast, and sometimes you want
predictable. And since there is a substantial cost difference between the
two, you really want to be able to choose. But Ada 95 has made the choice for
you: slow and predictable. That's not good for some applications.

> And Randy's figure of a factor of 100 hit is way way way off
> on the ia32.

Well, it's the best *I* can do. None of the predefined rounding modes come
close to what you want, so you have to:
    Grab and save the current status mode;
    fiddle the bits to a known mode (I use chop, but the code is about the
    same for the other modes).
    Grab the sign of the value to round and save it somewhere.
    Take the absolute value of the value.
    Add 0.5 to the value.
    Round the value.
    Put the sign back on the value.
    Convert it to integer by storing it to memory.
    Restore the previous status mode.

This is about 20 instructions; and the mode changes and the rounding in
place are relatively expensive. This compares to:

    Convert it to integer by storing it to memory.

which is faster than rounding in place. (Gosh knows why.)
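As a rough illustration, the arithmetic part of the steps listed above can be sketched in portable Ada. This is not the machine code sequence being described: the name Round_Half_Away is hypothetical, and the saving and restoring of the status mode is replaced here by Float'Truncation, which plays the role of "chop" mode. The sketch also ignores two known weaknesses of the add-0.5 approach: an input just below 0.5 can be rounded up by the addition itself, and inputs outside the range of Integer raise Constraint_Error.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Biased_Rounding_Sketch is

   --  Hypothetical helper that mirrors the listed steps.
   function Round_Half_Away (X : Float) return Integer is
      --  Grab the sign of the value and save it.
      Negative  : constant Boolean := X < 0.0;
      --  Take the absolute value and add 0.5.
      Magnitude : constant Float := abs X + 0.5;
      --  Round in "chop" mode and convert to integer.
      Chopped   : constant Integer :=
        Integer (Float'Truncation (Magnitude));
   begin
      --  Put the sign back on the value.
      if Negative then
         return -Chopped;
      else
         return Chopped;
      end if;
   end Round_Half_Away;

begin
   Put_Line (Integer'Image (Round_Half_Away (2.5)));    --  prints " 3"
   Put_Line (Integer'Image (Round_Half_Away (-2.5)));   --  prints "-3"
   Put_Line (Integer'Image (Round_Half_Away (2.4)));    --  prints " 2"
end Biased_Rounding_Sketch;
```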

On the original Pentium, the code sequence takes about 150 clocks, while the
straightforward store takes 5 clocks, so the ratio is more like 30 times
slower.

On newer Pentiums, the timings are quite dependent on the surrounding code,
so it is hard to draw any conclusions, but it appears that the change would
be about proportional.

This code sequence is such a mess that we don't even bother with it in-line,
and simply use the old software routine. Because it uses simple
instructions, it generally is faster on most processors (haven't tested the
timings on Pentium IIIs and IVs), but it is so large I always think of it
as 100 times slower, but it really isn't.

Sorry about the mild exaggeration.

****************************************************************

From: Robert Dewar
Sent: Tuesday, April 03, 2001 7:06 AM

<<The problem is simply that sometimes you want fast, and sometimes you want
predictable. And since there is a substantial cost difference between the
two, you really want to be able to choose. But Ada 95 has made the choice for
you: slow and predictable. That's not good for some applications.>>

fast and unpredictable (and unexpected) is not an acceptable design goal
in Ada. If you have a need for this particular operation, you can always
get it with a machine code insertion or an interfaced unit.

****************************************************************

From: Randy Brukardt
Sent: Tuesday, April 03, 2001 2:59 PM

> <<The problem is simply that sometimes you want fast, and sometimes you want
> predicable. And since there is a substantial cost difference between the
> two, you really want to be able to chose. But Ada 95 has made the choice for
> you: slow and predicable. That's not good for some applications.>>
>
> fast and unpredictable (and unexpected) is not an acceptable design goal
> in Ada. If you have a need for this particular operation, you can always
> get it with a machine code insertion or an interfaced unit.

Sigh. So we have another case where you have to use another programming language
or otherwise make your code not portable in order to get performance matching
that of most other programming languages.

If you do care about how the rounding is done, you can always write:

       Integer(Float'Rounding(X))

(or one of the other attributes) to get exactly the rounding you need. But if
you don't care (which is often the case, even in numerical software as Pascal
pointed out), you have to pay the penalty of biased rounding.

Even if you write "Integer(Float'Unbiased_Rounding(X))" (which a clever compiler
can make fast on the Intel processors), you have overspecification. And that
could be dreadfully slow if you ever ported it to a machine that doesn't use
Unbiased_Rounding as a native operation.
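To make the difference between these attributes concrete, the small program below prints the result of a plain conversion and of each Ada 95 rounding attribute for a few midpoint values. The procedure names are hypothetical; the expected values in the comments follow from 4.6(33) and the definitions in A.5.3.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Rounding_Attributes is

   --  Print the integer obtained from X by each conversion style.
   procedure Show (X : Float) is
   begin
      Put_Line
        ("X =" & Float'Image (X)
         & "  conversion:"
         & Integer'Image (Integer (X))
         & "  Rounding:"
         & Integer'Image (Integer (Float'Rounding (X)))
         & "  Unbiased_Rounding:"
         & Integer'Image (Integer (Float'Unbiased_Rounding (X)))
         & "  Truncation:"
         & Integer'Image (Integer (Float'Truncation (X))));
   end Show;

begin
   --  Expected values, in the order printed:
   Show (2.5);    --   3,  3,  2,  2
   Show (3.5);    --   4,  4,  4,  3
   Show (-2.5);   --  -3, -3, -2, -2
end Rounding_Attributes;
```

The plain conversion and Rounding always agree, which is exactly the behaviour that cannot currently be opted out of.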

Admittedly, the Ada 83 definition had problems, but that was because there was
no portable work-around. Now that we have one, we didn't need to make the worst
possible choice for the default case. Unfortunately, it probably is too late to
undo the damage here. Even making this implementation-defined when not in strict
mode probably would break some programs.

For the record, Janus/Ada ignores 4.6(33) unless the compiler is running in
"validation mode". I'll reconsider this if a user ever complains. (Hasn't
happened yet.)

				Randy.

****************************************************************

From: Tucker Taft
Sent: Tuesday, April 03, 2001 6:01 PM

Every other language I know of insists on truncation when converting
float to int.

****************************************************************

From: Randy Brukardt
Sent: Tuesday, April 03, 2001 6:24 PM

Probably because there is no argument about how to truncate, like there is
for rounding.

****************************************************************

From: Robert Dewar
Sent: Wednesday, April 04, 2001 5:05 AM

<<Sigh. So we have another case where you have to use another programming
language or otherwise make your code not portable in order to get
performance matching that of most other programming languages.           >>

If you want sloppy undefined semantics, it is not surprising that you have
trouble doing this in Ada!

In practice I think there are quite quick ways of doing what you want. Do
you *really* have a program where the overall performance of the program is
affected by this, or are you doing the typical compiler writer thing of
focussing on a particular inefficiency without this data? :-)

****************************************************************

From: Randy Brukardt
Sent: Wednesday, April 04, 2001 5:51 PM

> If you want sloppy undefined semantics, it is not surprising
> that you have trouble doing this in Ada!

<Grin>. Of course, I only care that it is fast, and that the semantics are some
sort of rounding; I don't need precisely defined semantics.

> In practice I think there are quite quick ways of doing what you want. Do
> you *really* have a program where the overall performance of the program is
> affected by this, or are you doing the typical compiler writer thing of
> focussing on a particular inefficiency without this data? :-)

Well, the software version of Generic_Elementary_Functions (GEF) in Janus/Ada
uses Float => Integer conversion in Sin, Cos, Tan, and "**". Anyone who has an
inner loop containing any of those functions would notice the slowdown. The
default version of GEF in Janus/Ada uses the software version. (I recall that
there were some inaccuracies in the hardware versions of some of these routines.)

I don't have a particular program using these functions in mind, but I'd be
pretty surprised if no user of Janus/Ada has ever written such a program. OTOH,
anyone really needing high performance probably would use the hardware version
of the functions, so I can't say for certain any real user would be impacted.
I never wanted to take the chance.

Rewriting the GEF software might help, but that is not something that I'd
particularly want to undertake. I remember how painful it was to analyze every
conversion in the original Fortran source to see whether unbiased rounding was
going to be a problem, or whether it did not matter to the results. I don't
particularly want to revisit that. When you have carefully crafted numeric
software, it is generally best to leave it alone.

****************************************************************

From: Robert Dewar
Sent: Thursday, April 05, 2001 2:49 PM

<<Well, the software version of Generic_Elementary_Functions (GEF) in
Janus/Ada uses Float => Integer conversion in Sin, Cos, Tan, and "**".
Anyone who has an inner loop containing any of those functions would notice
the slowdown. The default version of GEF in Janus/Ada uses the software
version. (I recall that there were some inaccuracies in the hardware versions
of some of these routines.)>>

a) this is still hypothetical, I would be surprised if it is significant
in practice.

b) there is no possible reason on the x86 to have such an operation for
trig functions, you should be using the reduction instruction to reduce
the argument, and then the hardware operations from then on. Have a look
at the GNAT sources which have specialized code for the x86 and you can
see how it should be done.

Remember that the IEEE standard requires a scaling instruction, so it is
very unlikely that you EVER have to do this rounding operation.

****************************************************************

From: Randy Brukardt
Sent: Thursday, April 05, 2001 5:52 PM

We have a hardware implementation for the X86 (got it out of the Intel manuals,
then fixed as needed). It's just not the default implementation for GEF. As
with everything, that was decided back in the days when floating point on Intel
processors was an option, not something that was included on every processor.
We still have and use a few machines here that don't support any floating
point. (They should be curbed, soon, though.) Probably those defaults ought
to be changed, but doing so probably would break a few customers' programs (and
more importantly, compilation scripts). There's a lot of defaults in Janus/Ada
that come from 16-bit MS-DOS and aren't really correct anymore. But I've never
wanted to confuse everyone by having different defaults on different targets.

> Remember that the IEEE standard requires a scaling instruction, so it is
> very unlikely that you EVER have to do this rounding operation.

Well, you're assuming floating point hardware, and the "generic" version of
Janus/Ada does not -- it provides its own floating point emulation code. And
that code is very simplified to rather basic requirements, and not strictly
IEEE compliant. In any case, I don't honestly know *what* those conversions are
there for, and I'm not interested in messing with working numeric software
without a very good reason.

****************************************************************

From: Robert Dewar
Sent: Thursday, April 05, 2001 5:55 PM

Either performance is important or it is not. Inadequate performance is
not "working" in my book, but then you still don't have real data to say
that this is a practical problem.

But in any case in Ada 95, you definitely do NOT need to be doing
rounding of this kind, please look at S'Remainder.

So really the example we have here is code that was appropriate to Ada 83,
but that is inappropriate for Ada 95, and a complaint that says "this does
not work well in Ada 95 but I do not want to rewrite it". Not very
convincing!

****************************************************************

From: Robert Dewar
Sent: Thursday, April 05, 2001 5:56 PM

<<Well, you're assuming floating point hardware, and the "generic" version of
Janus/Ada does not -- it provides its own floating point emulation code. And
that code is very simplified to rather basic requirements, and not strictly
IEEE compliant. In any case, I don't honestly know *what* those conversions
are there for, and I'm not interested in messing with working numeric
software without a very good reason.
>>

But you must have implemented S'Remainder, right?

****************************************************************

From: Randy Brukardt
Sent: Thursday, April 05, 2001 6:09 PM

Yes, of course. You're claiming that the conversions could be replaced by
some judicious use of S'Remainder. Perhaps; I'm not quite sure *what*
they're for.

I'm not completely sure that S'Remainder is implemented correctly. The lack
of any real test of the float attributes makes it hard to tell. In any case,
it is implemented with about 85 lines of Ada code, so I don't think that
using it to improve performance is very likely to work. :-) [This operation
is not represented in our intermediate code, so improving the implementation
would be a lot of work.]

****************************************************************

From: Robert Dewar
Sent: Thursday, April 05, 2001 6:12 PM

It is absolutely expected that 'Remainder map into the appropriate hardware
instruction. 85 lines of Ada code is pretty horrible here, so it sounds a
bit like blaming Ada for poor implementation :-)

Now I don't even know if GNAT does this right or not, but certainly we
do not have the same problem of running into slow rounding.

****************************************************************

From: Randy Brukardt
Sent: Thursday, April 05, 2001 6:28 PM


I plead guilty here. I had very limited time to implement A.5.3, and I spent
90% of it implementing "right" the operations that I knew people would use:
Rounding, Truncation, Exponent, Compose, Fraction. I spent the other half of
the time :-) with Q&D implementations of the rest of them; figuring to
improve them if someone needs a higher quality implementation. As I recall,
'Remainder doesn't map cleanly into the Pentium instruction set, so it may
be the case that it would never be very efficient on that processor. And I
know it is very hard to do in a machine language floating point subprogram;
indeed, I couldn't think of any way to do it at all there. It made a lot
more sense to write it in a high level language where it actually could be
debugged! (At least, I didn't need an Integer => Float conversion in it.)

****************************************************************

From: Pascal Leroy
Sent: Friday, April 06, 2001 4:14 AM

> a) this is still hypothetical, I would be surprised if it is significant
> in practice.

On RISC machines, the effect of rounding on the performance of our
implementation of the GEF was of the order of 10-20% (your mileage may vary, in
particular depending on the architecture).  Not huge, but significant.  The
reason is that the drudgery you have to perform to get the rounding right
requires tests which tend to flush the pipeline.  All the rest of the GEF code
is pretty much devoid of tests, and leads to very efficient
scheduling/pipelining.

Anyway, we are not going to change the language at this stage...

> b) there is no possible reason on the x86 to have such an operation for
> trig functions, you should be using the reduction instruction to reduce
> the argument, and then the hardware operations from then on. Have a look
> at the GNAT sources which have specialized code for the x86 and you can
> see how it should be done.

There is no way that such an implementation will meet the requirements of annex
G, so a very good reason for avoiding it is if you want to adhere to strict
mode.  We carefully considered the hardware support provided by the x86 and
decided not to use it, because the accuracy of most of these operations is
appalling.  Granted, they are a bit faster than a software implementation, but
not that much, because you can do an awful lot of mults/adds in the time it
takes to run a single fcos instruction.

****************************************************************

From: Robert Dewar
Sent: Friday, April 06, 2001 4:38 AM

Actually it is quite unfair to say that the results of the hardware operations
are appalling. What's your reference for that, I assume you have the relevant
Intel documentation. It is true that in 80-bit mode there are some problems,
but we know of no accuracy issues for the hardware operations when used in
32- and 64-bit modes.

In any case I am still surprised that any implementation of the GEF would
depend on this kind of rounding, rather than the use of 'Remainder (I assume
we are talking about argument reduction here?)

Of course Ada 83 code casually ported may indeed show this problem.

****************************************************************

From: Pascal Leroy
Sent: Friday, April 06, 2001 5:00 AM

Here is a test program which only uses Long_Float, which I believe is 64-bit
for GNAT, right?

with Ada.Numerics.Long_Elementary_Functions;
with Ada.Text_Io;
with System.Machine_Code;
procedure Reduction is
    -- A value chosen to be really hard for argument reduction.
    Angle : constant Long_Float := 16#1.921FB54442D18#;
    -- The machine number nearest to the exact mathematical value.
    Exact_Cos : constant Long_Float := 16#4.69898CC51701C#E-14;
    Actual_Cos : constant Long_Float :=
       Ada.Numerics.Long_Elementary_Functions.Cos (Angle);
begin
    Ada.Text_Io.Put (Long_Float'Image ((Actual_Cos / Exact_Cos - 1.0) /
                                       Long_Float'Model_Epsilon));
    Ada.Text_Io.New_Line;
end Reduction;

When I execute this with GNAT 3.12p (probably an oldish version, btw), it
prints:

-1.48736395356916E+11

which is, well, larger than the upper bound 2.0 required by RM95 G.2.4(6).  The
root of the problem is that the x86 gives you a Pi with a 64-bit mantissa
(corresponding to the 80-bit format) but to get the proper reduction over the
range specified by G.2.4(10) you need to have about 110-120 bits for Pi.

> In any case I am still surprised that any implementation of the GEF would
> depend on this kind of rounding, rather than the use of 'Remainder (I assume
> we are talking about argument reduction here?)

After the reduction algorithm, we need to have an integer value to look up some
stuff in precomputed tables, so 'Remainder would not help much, we would still
pay the price of a float-to-integer-conversion-with-rounding at some point.
Moreover, we implement 'Remainder in software, so we don't use it in the GEF
for obvious performance reasons (it's about 180 lines of code, including
comments, so it must be slower than Randy's version :-)

> Of course Ada 83 code casually ported may indeed show this problem.

This was all implemented from scratch for Ada 95.

****************************************************************

From: Robert Dewar
Sent: Friday, April 06, 2001 5:22 AM

It seems a real pity if no one is really implementing things like
'Remainder properly. Perhaps they should just be removed from the language?
If they are implemented in these horrible software routines, they are much
worse than useless. And note that I think GNAT has the same nasty problem
at the moment, so I am not making some competitive statement here.

I have no idea if GNAT still shows the problems you mention, as you say,
3.12p is way out of date, and in any case that version did not come from us.
(we take no responsibility for any public versions of GNAT, we have no way
of knowing if they are the same as what we distributed or not).

****************************************************************

From: Pascal Leroy
Sent: Friday, April 06, 2001 5:29 AM

Yes, we didn't pay for support ;-)

This was not intended as a criticism of GNAT btw, just as a demonstration that
it is really hard to use hardware support on x86 to implement strict mode.  I
had a quick glance at the code of GNAT, and it looks perfectly reasonable for a
relaxed mode implementation.  But I believe that in order to meet the strict
mode requirements you would have to add a lot of software "glue" around the x86
instructions, and you would quickly reach a point where the glue is so costly
that you are better off doing it all in software.  At least, that's the
conclusion that I came to...

****************************************************************

From: Robert Dewar
Sent: Friday, April 06, 2001 5:32 AM

That may well be a correct conclusion. Of course one has to wonder in
practice whether there is any real code that can benefit from the extra
accuracy requirements provided in Ada here. There was never any real
cost-benefits analysis performed :-)

****************************************************************

From: Robert Dewar
Sent: Friday, April 06, 2001 5:24 AM

Anyway, I filed that program as an internal bug report :-) So we will
look at it and see if this is still problematical.

****************************************************************

From: Randy Brukardt
Sent: Friday, April 06, 2001 4:50 PM

> When I execute this with GNAT 3.12p (probably an oldish
> version, btw), it prints:
>
> -1.48736395356916E+11

For grins, I ran this program on several of the Ada compilers I have around:

GNAT 3.14a (using the options from the most recent ACT ACATR)
  -1.48736395356916E+11

Janus/Ada 3.12 (note that Janus/Ada doesn't support strict mode, so this is
rather irrelevant)
   9.53995173944739E+09

ObjectAda 7.2
  -1.48736395356916E+11

(I didn't try Rational Apex.)
Seems that either the program is wrong, or all of the compilers are.

****************************************************************

From: Robert Dewar
Sent: Friday, April 06, 2001 6:50 PM

The program may well be right, but it is a diabolical case. Note that
Rational took a huge hit from VERY slow GEF at first, because they were
very fanatic in getting exact results.

****************************************************************

From: Robert Dewar
Sent: Friday, April 06, 2001 6:54 PM

incidentally, I realize my comment on Rational could be read as negative, but
please don't take it that way. Indeed the problem was that Rational was really
VERY careful to guarantee what the RM says -- that's why I am inclined to
believe Pascal has the program right (plus it looks right to me, and indeed
runs fine on the Solaris port of GNAT).

At some point we should perhaps revisit the whole issue of GEF accuracy.
Super-accuracy is dubious if it has too great a penalty, and if everyone
uses relaxed mode, then we lose all control.

But, Pascal can enlighten, I believe that Rational was able to greatly
speed up their GEF implementation without sacrificing accuracy.

Actually for us, an important request from several of our customers is a
super-relaxed mode that would be as fast as possible, reasonably accurate,
but not worry about error conditions, weird cycles etc :-)

****************************************************************

From: Tucker Taft
Sent: Saturday, April 07, 2001 6:25 AM

I ran this on our SPARC compiler (that uses ANSI C as its intermediate)
using an all-software implementation of Cosine (based on the work done at
Argonne National Laboratory) and got an error ratio of 0.000000.
The program seems to be correct.  The all-software implementation
uses a 90 digit representation of Pi/2 for argument reduction.

****************************************************************

From: Stephen Michell
Sent: Saturday, April 07, 2001 11:13 AM

I ran it on Rational yesterday. The result was exact to 15 digits.

****************************************************************

From: Robert Dewar
Sent: Friday, April 06, 2001 5:29 AM

By the way, if you find the behavior of the round to integer to be such
a problem, why not just introduce a new attribute called Unbiased_Round
or whatever, and use it instead?

Certainly we cannot base the reasonable semantics of Float to Integer
rounding on low level requirements for implementation of the GEF!

As I noted, the round-to-even behavior is just too strange and unfamiliar
to be a reasonable part of the language, and to leave it implementation
defined is not at all in harmony with the design and goals of Ada here.
Users always regarded it as a bug that 2.5 rounded to 2 and 3.5 rounded
to 4, and I find this a reasonable reaction to Ada 83 implementations
that did this.
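
[Illustration added in editing, not part of the original message.] The
difference between the two rounding rules discussed here can be seen with a
minimal sketch. It assumes an Ada 95 or later compiler; the names
Rounding_Demo, Real, and Values are made up for this example.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Rounding_Demo is
   --  Hypothetical floating point type, declared only for this sketch.
   type Real is digits 6;

   --  Midpoint values: each lies exactly halfway between two integers.
   Values : constant array (1 .. 5) of Real := (0.5, 1.5, 2.5, 3.5, -2.5);
begin
   for I in Values'Range loop
      Put_Line
        (Real'Image (Values (I))
         --  4.6(33): a plain conversion rounds midpoints away from zero,
         --  so 2.5 becomes 3 and -2.5 becomes -3.
         & "  conversion:" & Integer'Image (Integer (Values (I)))
         --  'Unbiased_Rounding picks the even neighbour, so 2.5 becomes 2
         --  and 3.5 becomes 4 (the Ada 83 behavior users complained about).
         & "  unbiased:"
         & Integer'Image (Integer (Real'Unbiased_Rounding (Values (I)))));
   end loop;
end Rounding_Demo;
```

The second conversion is exact because the value has already been rounded to
an integral value by the attribute.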

I would not mind a proposal for a new semi-standard attribute in this
area, that would seem the most constructive outcome of this discussion.

>Anyway, we are not going to change the language at this stage...

Well we can change the language by adding the attribute. That would be
well-defined, and clearly useful, since we have two implementations
saying that the availability of this new attribute would significantly
improve performance for some code, and the addition of an attribute
like this is a relatively simple task, both from the point of view
of implementation and definition.

Pascal, does that seem reasonable to you?

****************************************************************

From: Pascal Leroy
Sent: Friday, April 06, 2001 5:36 AM

> By the way, if you find the behavior of the round to integer to be such
> a problem, why not just introduce a new attribute called Unbiased_Round
> or whatever, and use it instead?

Funny that you mention this because I just came to the same conclusion this
morning.  When I wrote the GEF I considered adding a pragma to control rounding
(and didn't do it) but it didn't cross my mind that an attribute would be the
right solution.

> Certainly we cannot base the reasonable semantics of Float to Integer
> rounding on low level requirements for implementation of the GEF!

Agreed.

Note that the implementation of the GEF would really prefer an attribute called
Random_Round :-) which would use whatever rounding is faster on a given
architecture.  It's often round-to-even, but not always.  I have a vague
recollection that HP was an oddball in that respect (as in many others).

> I would not mind a proposal for a new semi-standard attribute in this
> area, that would seem the most constructive outcome of this discussion.

Makes sense.

> Well we can change the language by adding the attribute. That would be
> well-defined, and clearly useful, since we have two implementations
> saying that the availability of this new attribute would significantly
> improve performance for some code, and the addition of an attribute
> like this is a relatively simple task, both from the point of view
> of implementation and definition.
>
> Pascal, does that seem reasonable to you?

Absolutely.

****************************************************************

From: Robert Dewar
Sent: Friday, April 06, 2001 5:58 AM

OK, so why not call it 'Fast_Round since that is really the intention here,
and the documentation for it is something like:

   function S'Fast_Round (X : T) return Integer;

where the result is obtained by rounding X to the result, the case of
exactly half way between two integers is handled in an implementation
defined manner, consistent with providing the fastest possible implementation.

By the way, why can't you get the result you want on most machines by
using S'Unbiased_Rounding, and then convert the result to an integer
(well I guess the answer is this would require more smarts in the code
generator than any of us are likely to have around :-) :-)

****************************************************************

From: Randy Brukardt
Sent: Friday, April 06, 2001 6:41 PM

Sounds like an idea, wonder why I never thought of it.

Humm, do we really want this to return type "Integer"? That seems mildly
restrictive, and we always recommend to avoid using the predefined types --
I hate to add another reason that you have to.

> By the way, why can't you get the result you want on most machines by
> using S'Unbiased_Rounding, and then convert the result to an integer
> (well I guess the answer is this would require more smarts in the code
> generator than any of us are likely to have around :-) :-)

Well, I actually suggested this (and I think that Janus/Ada actually does
this optimization - yes, see below).

But such code is not very portable; if it ever gets moved to a machine that
doesn't use Unbiased_Rounding, the performance will be terrible. For GEF, that
doesn't much matter (it's probably tweaked for every target anyway), but it
might be a problem for some apps.

Appendix:
For Janus/Ada, the intermediate code generated for:

     A := Integer(Float'Unbiased_Rounding(F));

is:

     PSHF [F]
     FUNRND   <-- Unbiased rounding ('Unbiased_Rounding)
     FRND     <-- Biased rounding ('Rounding, inserted by type conversion)
     CTYPI    <-- Checked type conversion to integer (rounding not specified
                  a-la Ada 83)
     POPI [A]

And the optimizer notes that the FRND can have no effect, so it is
eliminated.

But you still get two conversion operations, because the code generator
doesn't look for FUNRND | CTYPxx and eliminate it. And I suspect that the
folks that don't use their own code generator would have even more trouble
eliminating it.

****************************************************************

From: Robert Dewar
Sent: Friday, April 06, 2001 7:11 PM

<<Humm, do we really want this to return type "Integer"? That seems mildly
restrictive, and we always recommend to avoid using the predefined types --
I hate to add another reason that you have to.
>>

I suggested Integer because I think it has a better chance of being
implemented efficiently, and if you are worried about rounding you are
almost certainly in Integer range. Certainly the examples you chose fit
this rule!

****************************************************************

From: Randy Brukardt
Sent: Friday, April 06, 2001 7:25 PM

True. I would have thought that having it return Universal_Integer would be
better, although a bit more complex to implement. You almost always know a
real type for the expression -- I assume that there is a fastest conversion
to any possible integer type supported by a compiler, even though that might
not be all that fast!
The only problem with that would be the weird cases where the actual type of
the expression is Universal_Integer, but I believe that the standard says
such things are evaluated in the largest possible integer type.

I'd prefer the more flexible definition, but either is fine by me.

****************************************************************

From: Robert Dewar
Sent: Friday, April 06, 2001 8:25 PM

Yes, and that is nasty because it means unless you are very clever you
will end up using 64-bit integers when you really want 32-bit integers,
and generate lousy code after all.

And you really do NOT want to create an Ada-83 style disincentive in
implementing decent length integers here.

****************************************************************

From: Randy Brukardt
Sent: Friday, April 06, 2001 9:34 PM

Of course not. But I think you will have to be clever anyway to do this
right. (At least the IA will make it visible for those who don't). For
instance, Janus/Ada supports 8, 16, and 32 bit integers, and you really want
to do all of them the same (that is, without extra conversions). Either way,
you'll have to do some optimizations:

If the attribute returns Integer, and you're converting to a 16-bit integer:

     S : My_Short_Int;

     S := My_Short_Int(Float'Fast_Round (F));

In Janus/Ada, this would generate intermediate code of (assuming no range
check is needed):

    PSHF [F]
    CTYPI
    CTYPSI
    POPSI [S]

You'd need some optimizations to generate just the Pentium instructions:

    FLD Dword Ptr [F]
    FIST Word Ptr [S]
    <Some overflow check>

You'd need to do these optimizations for any use of the attribute on any
type other than Integer, since there would have to be an explicit type
conversion. (And most uses would need a conversion, since hopefully most
users aren't using type Integer in portable code.)

If the attribute returns Universal_Integer, and you're converting to a
16-bit integer:

     S := Float'Fast_Round (F);

In Janus/Ada, this would generate intermediate code of (assuming no range
check is needed):

    PSHF [F]
    CTYPLI
    CTYPSI
    POPSI [S]

As you can see, the intermediate code is nearly identical, and so are the
optimizations needed. Of course, you could fail to do the optimizations and
generate a 64-bit intermediate, but that seems unlikely in a
performance-centered application.

Of course, what needs to be done would vary from target to target, and code
generator to code generator, but it seems to me that you would need to be
clever about just about every use of this attribute no matter what type it
returned. So, I don't think that this is much of an argument either way for
the return type of the attribute.
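
[Illustration added in editing, not part of the original message.] The
wording eventually adopted in this AI avoids the return-type question: the
attribute is S'Machine_Rounding, it returns the floating point type T, and
the target integer type is named in an ordinary type conversion. A minimal
sketch of that idiom follows. It assumes a compiler that implements
'Machine_Rounding (Ada 2005 or later); Fast_Round_Demo, Real, Short_Int, and
To_Short are hypothetical names.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Fast_Round_Demo is
   --  Hypothetical types for this sketch; any float and integer types work.
   type Real is digits 6;
   type Short_Int is range -2**15 .. 2**15 - 1;

   --  Hypothetical helper: round with whatever mode is cheapest on the
   --  target, then convert. The value is already integral, so the rounding
   --  normally implied by the conversion has nothing left to do.
   function To_Short (X : Real) return Short_Int is
   begin
      return Short_Int (Real'Machine_Rounding (X));
   end To_Short;
begin
   --  Not a midpoint: every rounding rule agrees on 2.
   Put_Line (Short_Int'Image (To_Short (2.4)));
   --  Midpoint: 2 or 3, and which one is unspecified.
   Put_Line (Short_Int'Image (To_Short (2.5)));
end Fast_Round_Demo;
```

Whether the attribute and the conversion are fused into a single machine
instruction depends on the compiler; the AI only recommends that
implementations detect this pattern.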

****************************************************************

