Version 1.1 of ais/ai-00346.txt


!standard G.03.04          03-09-02 AI95-00346/01
!class amendment 03-09-02
!status work item 03-09-02
!status received 03-09-02
!priority Low
!difficulty Medium
!subject Roots of polynomials
!summary
Subprograms are added to the Numerics Annex for finding the roots of polynomials.
!problem
Ada has been criticized for not having adequate standard libraries. AI-296 proposes the addition of packages previously in the secondary standard 13813 concerning vectors and matrices. This proposal adds related material concerning the roots of polynomials.
!proposal
Child subprograms are added to Ada.Numerics.Generic_Real_Arrays and Ada.Numerics.Generic_Complex_Arrays for the determination of the roots of a polynomial equation. They are added in a new clause G.3.4. Recall from AI-296 that G.3 concerns vectors and matrices generally, and G.3.3 concerns eigenvalues and eigenvectors.
!wording
Add a new clause G.3.4 as follows
G.3.4 Roots of Polynomial Equations
Roots of real and complex polynomial equations are computed by child subprograms Ada.Numerics.Generic_Real_Arrays.Roots and Ada.Numerics.Generic_Complex_Arrays.Roots.
The declaration of Ada.Numerics.Generic_Real_Arrays.Roots is
generic
procedure Ada.Numerics.Generic_Real_Arrays.Roots(P : in Real_Vector; R : out Real_Vector; Number_Real : out Natural);
The library subprogram Numerics.Real_Arrays.Roots has the same declaration. Non-generic equivalents for each of the other predefined floating point types are defined similarly, with the names Numerics.Short_Real_Arrays.Roots, Numerics.Long_Real_Arrays.Roots, etc.
The array P defines a polynomial P = P[0] + P[1]*x + P[2]*x**2 +...+ P[n]*x**n where n is P'Last. Constraint_Error is raised if P'First is not zero or if P'Last is not greater than zero. Constraint_Error is also raised if the bounds of the out parameter R are not one and P'Last.
The out parameter Number_Real is the number of roots of the equation P=0 which are real. The components R[1] .. R[Number_Real], if any, are the real roots of the equation P=0 in some order. The remaining components of R, if any, taken in pairs with adjacent indices are the real and imaginary parts respectively of the conjugate pairs of complex roots of P=0 in some order.
The declaration of Ada.Numerics.Generic_Complex_Arrays.Roots is
generic
procedure Ada.Numerics.Generic_Complex_Arrays.Roots(P : in Complex_Vector; R : out Complex_Vector);
The library subprogram Numerics.Complex_Arrays.Roots has the same declaration. Non-generic equivalents for each of the other predefined floating point types are defined similarly, with the names Numerics.Short_Complex_Arrays.Roots, Numerics.Long_Complex_Arrays.Roots, etc.
The array P defines a polynomial P = P[0] + P[1]*z + P[2]*z**2 +...+ P[n]*z**n where n is P'Last. Constraint_Error is raised if P'First is not zero or if P'Last is not greater than zero. Constraint_Error is also raised if the bounds of the out parameter R are not one and P'Last.
The components R[1] .. R[n] are the roots of the equation P=0 in some order.
Implementation Requirements
The accuracy of these subprograms is implementation defined.
Implementation Permissions
The nongeneric equivalent subprograms may, but need not, be actual instantiations of the generic subprograms for the appropriate predefined type.
Implementation Advice
Implementations should implement these subprograms using established techniques such as Laguerre's method.
!discussion
At the ARG meeting in Toulouse, the question of whether any other numerical material might be suitable for a standard was discussed. The conclusion was that possibly the determination of the roots of a polynomial equation might be considered. All other numerical problem areas seemed to be in a state of considerable flux.
One possibility would be to add a package for the manipulation of polynomials in general. This would include basic operations of addition, subtraction, multiplication and division. Presumably such polynomials would be normalized so that the coefficient of the highest power is not zero.
However, there are lots of choices. We could have polynomials with integer coefficients or with real coefficients and in any event they would need to be generic over the coefficient type.
Polynomials with integer coefficients are straightforward to normalize and form good programming exercises. However, they are not of interest for numerical analysis but more for number theory which does not seem appropriate for this annex.
Polynomials with real coefficients cause problems on normalization because it is not clear just when the highest coefficient should be deemed to be zero after operations such as subtraction. Maybe we need some small number Eps and then a rule that says: if the highest coefficient is less in absolute value than Eps times the largest of the other coefficients, then the highest coefficient can be replaced by zero and the order of the polynomial reduced.
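Such a rule might be sketched as follows. This is an illustration only, not proposed wording; the name Effective_Order and the parameters Coeffs and Eps are assumptions for the sketch, with Real and Real_Vector as in Generic_Real_Arrays.

```ada
-- Sketch only, not proposed wording: reduce the order while the top
-- coefficient is negligible relative to the others.
function Effective_Order (Coeffs : Real_Vector; Eps : Real) return Natural is
   Last    : Natural := Coeffs'Last;
   Largest : Real;
begin
   while Last > 0 loop
      Largest := 0.0;
      for I in Coeffs'First .. Last - 1 loop
         if abs Coeffs (I) > Largest then
            Largest := abs Coeffs (I);
         end if;
      end loop;
      -- Keep the current order if the top coefficient is significant.
      exit when abs Coeffs (Last) > Eps * Largest;
      Last := Last - 1;
   end loop;
   return Last;
end Effective_Order;
```

Note that the test is applied repeatedly, since zeroing the top coefficient can expose another negligible one underneath it.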
Moreover, as for strings (a recognized pain in the neck), we would have to choose between fixed, bounded, and unbounded polynomials. It would seem very unlikely that we would want to have all these possibilities in the language since this is after all a niche area. Bounded polynomials seem the most likely and then we need a generic parameter giving the upper bound. Presumably we make the type private and then we need functions to create polynomials from an array of coefficients and vice versa; it's also handy to have a function that gets a specific coefficient. It seems natural to use the array type from Generic_Real_Arrays. So we might have
generic
   type Real is digits <>;
   Max: Positive;
   Eps: Real;
   with package Real_Arrays is new Ada.Numerics.Generic_Real_Arrays(Real);
package Ada.Numerics.Polynomials is
   Max_Order: constant Positive := Max;
   subtype Order_Range is Natural range 0 .. Max_Order;
   type Poly is private;
   function Order(P: Poly) return Order_Range;
   function To_Poly(A: Real_Arrays.Real_Vector) return Poly;
   function To_Vector(P: Poly) return Real_Arrays.Real_Vector;
   function Get_Coeff(N: Order_Range; P: Poly) return Real;
   function "+" (Left, Right: Poly) return Poly;
   ... -- etc
private
   type Poly(N: Order_Range := 0) is record
      The_Coeffs: Real_Arrays.Real_Vector(0 .. N);
   end record;
end Ada.Numerics.Polynomials;
If we did include such a package then the question is how to present a subprogram for determining the roots. From this viewpoint we would naturally consider this as the decomposition of the polynomial into its factors. Keeping to real polynomials (rather than to complex ones) this means returning a mixture of degree two polynomials (for pairs of complex roots) and degree one polynomials for the real roots. But we don't know how many of each.
We would presumably return the group of polynomials as an array so the specification might be
type Poly_Array is array(Integer range <>) of Poly;
function Factors(P: Poly) return Poly_Array;
Now consider what the user has to do to get hold of the actual roots.
Thus to solve the equation x**3 - 1 = 0 the coefficients are 1.0, 0.0, 0.0, -1.0. The roots are of course +1 and the conjugate pair which are the roots of x**2 + x + 1 = 0 namely -0.5 +/- 0.866i.
We had better declare an array to hold the coefficients
My_Coeffs: Real_Arrays.Real_Vector(0 .. 3) := (-1.0, 0.0, 0.0, 1.0);
That's a nuisance: I had to write the coefficients backwards! And then we can call Factors
... := Factors(To_Poly(My_Coeffs));
The first trouble is that we don't know whether the returned array is going to have three components (which it will if the three roots are real) or two components (which it will if, as in this case, there is a complex pair). So we need a wretched access type
type Ref_Poly_Array is access Poly_Array;
Answer: Ref_Poly_Array;
Now we can write
Answer := new Poly_Array'(Factors(To_Poly(My_Coeffs)));
Now we can dig out the factors. Better declare a poly P for the current one, N for its order, X and Y to hold a root, and a, b, c for ease in getting at the real and imaginary parts.
P: Poly;
N: Order_Range;
X, Y: Real;
a, b, c: Real;
for I in Answer'Range loop
   P := Answer(I);
   N := Order(P);
   if N = 1 then   -- phew, we got a real root
      X := - Get_Coeff(0, P) / Get_Coeff(1, P);
   else            -- hopefully this means N = 2, a complex pair
      a := Get_Coeff(2, P);
      b := Get_Coeff(1, P);
      c := Get_Coeff(0, P);
      X := -b / (2.0*a);                      -- the real part
      Y := Sqrt(4.0*a*c - b**2) / (2.0*a);    -- the imaginary part
      -- we now have the real and imaginary parts of the complex pair
      -- in X and Y
   end if;
end loop;
That strikes me as very hard work.
If however, we don't want to play polynomials as algebraic data structures in general but just find the roots of a polynomial as a particular kind of non-linear equation then the simplest way (shh - like in Fortran) is to represent the polynomial simply as an array of coefficients and for the procedure to return an array of roots. In order to keep to real numbers only, it might be a good idea to represent the result as an array which comprises pairs of real and imaginary parts for the complex roots and single values for the real roots. A further parameter Number_Real is then needed to indicate how many roots are real. We can assume then that the first Number_Real components of the array are the real roots and the rest are the complex pairs. We might have a procedure with specification
procedure Roots(P: Real_Vector; R: out Real_Vector; Number_Real: out Natural);
We would raise Constraint_Error if the range of R were not 1 .. P'Length-1.
Now all we need is
Answer: Real_Arrays.Real_Vector(1 .. My_Coeffs'Length - 1);
N: Integer;
Roots(My_Coeffs, Answer, N);
We now find that N = 1 and so we know that the real root is Answer(1) and
the complex roots are Answer(2) +/- Answer(3)i.
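To see the convention at work, unpacking and printing all the roots from Answer is then just a few lines. This is a sketch only, not part of the proposal; it assumes the nongeneric Float version and an enclosing Ada.Text_IO context.

```ada
-- Sketch: Answer(1 .. N) are the real roots; the remaining components,
-- taken in adjacent pairs, are the (real, imaginary) parts of the
-- conjugate complex pairs.
for I in 1 .. N loop
   Put_Line ("real root: " & Float'Image (Answer (I)));
end loop;
declare
   I : Integer := N + 1;
begin
   while I < Answer'Last loop
      Put_Line ("complex pair: " & Float'Image (Answer (I)) &
                " +/- " & Float'Image (Answer (I + 1)) & "i");
      I := I + 2;
   end loop;
end;
```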
That seems a lot easier to me. And clearly it is now closely related to the Generic_Real_Arrays package and so might as well be a child subprogram thus
generic
procedure Ada.Numerics.Generic_Real_Arrays.Roots(P: Real_Vector; R: out Real_Vector; Number_Real: out Natural);
and that's all we need.
Ladies and Gentlemen, we don't need a great roly poly package but just a baby child subprogram.
Of course, even with the polynomial package, we could still do the root solution in the array style. But really that would be an orthogonal feature. It seems that the polynomials of which one wants to compute the roots occur as single entities and not as part of the manipulation of lots of polynomials. If we see the features as orthogonal then it looks clear to me that the polynomial package is not worth having in its own right.
We can also deal with the roots of complex polynomials in the same way. In fact they are easier to handle since the roots can all be treated as complex and pairs of complex roots do not generally arise. Thus the third parameter is not required in the complex case.
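For reference, the kind of established technique named in the Implementation Advice can be sketched as follows. This is an illustration only, not part of the proposal: a single Laguerre step over a locally declared Complex_Vector type, using the predefined Ada.Numerics.Complex_Types. A real root finder would also need convergence tests, deflation of roots already found, and protection against zero denominators.

```ada
-- Sketch only: one step of Laguerre's method for the polynomial
-- P(0) + P(1)*z + ... + P(n)*z**n, evaluated at the current guess Z.
with Ada.Numerics.Complex_Types;                use Ada.Numerics.Complex_Types;
with Ada.Numerics.Complex_Elementary_Functions; use Ada.Numerics.Complex_Elementary_Functions;
procedure Laguerre_Sketch is
   type Complex_Vector is array (Natural range <>) of Complex;
   function Laguerre_Step (P : Complex_Vector; Z : Complex) return Complex is
      N       : constant Float := Float (P'Last);
      B, D, F : Complex := (0.0, 0.0);   -- P(Z), P'(Z), P''(Z)/2 by Horner
      G, H, Sq, Dp, Dm : Complex;
   begin
      for I in reverse P'Range loop
         F := Z * F + D;
         D := Z * D + B;
         B := Z * B + P (I);
      end loop;
      G  := D / B;                       -- P'/P
      H  := G * G - 2.0 * F / B;         -- G**2 - P''/P
      Sq := Sqrt ((N - 1.0) * (N * H - G * G));
      Dp := G + Sq;                      -- choose the larger denominator
      Dm := G - Sq;                      -- for numerical stability
      if abs Dp >= abs Dm then
         return Z - N / Dp;
      else
         return Z - N / Dm;
      end if;
   end Laguerre_Step;
begin
   null;   -- a full finder would iterate Laguerre_Step to convergence,
           -- then deflate and repeat for the remaining roots
end Laguerre_Sketch;
```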
!example
--!corrigendum G.03.04(01)
!ACATS test
!appendix

From: John Barnes
Sent: Tuesday, September 2, 2003  9:31 AM

In Toulouse I was asked to think about polynomial equations
as perhaps something else that might go in the numerics
annex.

I contemplated packages for polynomials in general but
concluded that they were not appropriate for this standard,
being great fun as programming exercises but not vital for
real-time embedded systems and similar engineering stuff. I
concluded that if we did want to solve polynomials (and they
crop up in quite a few engineering situations such as
control theory) then they might as well be children of the
vectors and matrices stuff.

I attach an AI covering this. [This is version /01 - ED]
The discussion is more a chatter than a formal discussion at the moment.

Randy: could you put it in the database please?

I suppose I will have to get back to the wording for
downward closures now.

****************************************************************

From: Robert A Duff
Sent: Tuesday, September 2, 2003  9:44 AM

> Implementation Requirements
>
> The accuracy of these subprograms is implementation defined.

I suggest "unspecified".

****************************************************************

From: Pascal Leroy
Sent: Tuesday, September 2, 2003  9:52 AM

No, I disagree.  "Unspecified" means that the user doesn't have a clue
as to what accuracy she's going to get, so she has to do accuracy
testing herself, or just forget about the damn package.  "Implementation
defined" is not too hard for the implementers: it just means that you
need to do a back-of-an-envelope error analysis and publish the error
bound in your documentation.

PS: I thought you didn't do floating point? ;-)

****************************************************************

From: Robert A Duff
Sent: Tuesday, September 2, 2003  10:00 AM

OK, I withdraw my suggestion.

> PS: I thought you didn't do floating point? ;-)

I should learn to keep my mouth shut when the subject comes up.  ;-)

****************************************************************

From: Robert Dewar
Sent: Tuesday, September 2, 2003  10:18 AM

I disagree with Pascal's assessment on the issue of accuracy. A back of
the envelope calculation does NOT tell you the actual accuracy bounds,
and that is what implementation defined accuracy means; it does not mean
just giving some bound that is not exceeded (otherwise you could just
give a ridiculous bound). Documentation requirements are almost always
bogus. Surely Pascal does not intend to precisely define just how bogus
an error bound is allowed to be to meet the requirements of the RM?

I would prefer "unspecified", with an implementation advice section saying
that it is advisable to provide as accurate error bound information as
possible.

****************************************************************

From: Robert Dewar
Sent: Tuesday, September 2, 2003  10:10 AM

I must say that I am dubious about these proposals. Are people really
going to use these facilities? My impression is that people seriously
interested in linear algebra, finding roots of polynomials etc really
just want a simple way of interfacing to standard existing packages
in the area, rather than a new ab initio invention.

I fear that this leads in the same direction as the information systems
annex -- lots of nice facilities, taking a long time to design and
implement and test, that no one actually uses much.

****************************************************************

From: Robert A Duff
Sent: Tuesday, September 2, 2003  1:30 PM

So, should we have a (standard!) binding to the six most commonly used
Fortran numerics packages?  Does anybody know what they are, and how
best to bind to them?

****************************************************************

From: Robert Dewar
Sent: Tuesday, September 2, 2003  2:01 PM

I would reawaken the NRG for this purpose :-)

****************************************************************

From: John Barnes
Sent: Wednesday, September 3, 2003  1:47 AM

I think they fell asleep from progressing so slowly.

****************************************************************

From: Ed Schonberg
Sent: Tuesday, September 2, 2003  3:19 PM

People at the Courant Institute who do numerical analysis for fame and fortune
(and goodly amounts of both) tell me that the BLAS routines have become the
standard for this community, and all they want, in any language, is a binding
to those. There are implementations that are optimized for various platforms,
including multiprocessors, and issues of accuracy are well-analyzed and
documented. For what it's worth...

****************************************************************

From: John Barnes
Sent: Wednesday, September 3, 2003  1:37 AM

Yes but they are enormous and a full binding to all the BLAS
would represent at least 100 pages of RM, I suspect.  Such
a full binding would be better as a stand-alone product.

****************************************************************

From: Robert Dewar
Sent: Wednesday, September 3, 2003  5:14 AM

I agree that this would better be separated from the RM.

****************************************************************

From: John Barnes
Sent: Wednesday, September 3, 2003  1:45 AM

>So, should we have a (standard!) binding to the six most commonly used
>Fortran numerics packages?  Does anybody know what they are, and how
>best to bind to them?


Well that is more or less what I asked NAG. I had lunch with
Brian Ford (boss of NAG) and we went through the library
looking for popular and basic things that could be well
defined. Essentially we came up with simple matrix stuff and
polynomials more or less as I have written them.

There are other popular things such as curve fitting but the
method shows through and thus it is not easy to make them
the abstract interface which would be appropriate for a
standard. Thus the choice of method for curve fitting
influences the parameters. For cubic splines the user needs
the option of giving the knot points and so on.

Note also that I am not really suggesting an ab initio
invention. I see no reason why the routines I have suggested
should not be *implemented* as a binding to BLAS or
whatever.

****************************************************************

From: Robert I. Eachus
Sent: Tuesday, September 2, 2003  7:12 PM

Robert A Duff wrote:

>So, should we have a (standard!) binding to the six most commonly used
>Fortran numerics packages?  Does anybody know what they are, and how
>best to bind to them?

NO!  Definitely not.  The right thing to do is to have a binding to the
BLAS (Basic Linear Algebra Subprograms), which is what these packages are
trying to provide.  I personally think that providing a few additional
types and operations might be a good idea, but the real goal and need is
to provide a binding to BLAS so that you can access the machine specific
operations (usually generated by the ATLAS tool).  By providing locality
of reference, the optimized low-level operations can be hundreds of
times faster than naïve implementations of matrix multiplication and
linear equation solving.

John decided not to provide options for LU (or LUP) decomposition, and I
mildly disagree but I don't think it is all that important.
Also he hasn't discriminated between solving systems of linear equations
and linear regression (solving overdetermined systems for least squares
fits).  Should there be a child package for Linear Regression that
provides residuals and the variance-covariance matrix? That might be the
best approach.  But then if we do that, why not also provide a linear
programming subpackage?  Again, the best is the enemy of the good.
 Linear programming finds the maximum (or minimum) of some linear
function while satisfying a set of inequalities.  In vector/matrix
notation, maximize cx subject to Ax >= b. The other close call is
providing special representations for tri-diagonal, triangular,
permutation, and symmetric matrices.  It might be nice to have an
implementation where the type Matrix was tagged, and have all the
specialized matrices as child types. Or perhaps make Matrix a
discriminated record type, so that the discriminant could be used to
choose a more efficient algorithm based on the matrix structure.  But in
no case should we get drawn into the LAPACK and LINPACK mess of dozens
of names for the same algorithm applied to a different shaped matrix.

But notice that if a user wants to do that, he can.  What we should be
doing here--and as far as I can see are doing here--is to provide an Ada
friendly interface to the BLAS operations. (BLAS level 1 is vector with
vector, level 2 is vector with matrix, and level 3 is matrix with matrix
operations.)  What BLAS does, especially if you run ATLAS to get a version
of BLAS tuned to your particular hardware, is to reorder the
linear-algebra operations to make efficient use of the cache hierarchy.
 What you want obviously if you are multiplying two matrices is to
choose the order in which the operations are done so that elements of
the two matrices being multiplied are loaded  into cache as few times as
possible, and the product matrix elements are only stored, never
reloaded.  What is best can be quite tricky, because most
high-performance chips, including all Pentium family chips (and those
made by AMD and others) have two levels of cache.  If you look at what
ATLAS does for the Pentium 4, the result elements stay in L1 cache while
access to elements from the product rows and columns is optimized for L2 cache.
 On the other hand, for the AMD Athlon processors (with 64K instead of
8K of L1 data cache) the sizes are chosen to keep all data in L1.

So why not just provide a binding to BLAS, or to LAPACK or LINPACK?
Because the factoring of operations in these packages is wrong for Ada.
We could have ONE generic package specification for LAPACK, but I think
that separate real and complex generics are the way to go, with the
expectation that implementations will provide instantiations for Float
and Long_Float, Complex, and Long_Complex.  The other three
subdirectories for LAPACK contain facilities that are not appropriate
for an Ada binding. See http://www.netlib.org/lapack/ if you want more
details. For details on BLAS see: http://www.netlib.org/blas/faq.html,
and for LINPACK: http://www.netlib.org/linpack/

Incidentally, as far as all this goes, I may seem to be more of an expert
than I am.  Well, that is the wrong way to put it.  I may be an expert
on how to implement BLAS, LAPACK, and/or LINPACK.  But I don't have much
experience using them. ;-)   My user experience is more in the area of
linear programming, regression, and in general operations research
and/or statistics.  But linear programming packages and linear
regression packages tend to be implemented on top of BLAS along with the
linear algebra stuff.

****************************************************************

From: Robert I. Eachus
Sent: Tuesday, September 2, 2003  10:48 AM

>In Toulouse I was asked to think about polynomial equations
>as perhaps something else that might go in the numerics
>annex.

Good work.  First a typo:

generic
subprogram Ada.Numerics.Generic_Real_Arrays.Roots(P : in Real_Vector;
                       R : out Real_Vector; Number_Real : out Natural);

should be:

generic
PROCEDURE Ada.Numerics.Generic_Real_Arrays.Roots(P : in Real_Vector;
                       R : out Real_Vector; Number_Real : out Natural);


Now for a tough question which may be beyond the interests of this
group. Polynomials have exact solutions through quintic equations, beyond
that you have to use numerical methods.  If I were to implement this
package I would certainly do it by using analytic methods through fifth
order, then use numeric methods beyond that.  (In practice for high
order polynomials, Laguerre's method with or without N-R iteration is
fine for finding the real roots, if any, and at least some of the
complex roots.)

The problem I see is in determining all of the roots.  In theory, once a
single root is found, you can use polynomial division to find a lower
order equation with all the missing roots.  In practice I would hesitate
to do this iteratively for high-order polynomials.  Arithmetic errors
can split multiple real roots into complex pairs that are close
together, then division by one of those roots will go out of range.
Since John's formulation of the polynomial root finder makes the number
of roots found somewhat indefinite, should we add wording to that
effect?  In other words, I would rather say that the implementation will
find all roots of a quintic or lower equation, and some roots of a
higher order equation.  (The user can apply polynomial division to get
more roots if he needs them, at least if some of the roots found are
real, or complex pairs with a real polynomial product.)

Is there any strong feeling one way or the other on this question?  If
so, it might make sense to add:

generic
function Ada.Numerics.Generic_Complex_Arrays.Roots(P : in Complex_Vector;

              Near : in Complex := (0.0,0.0)) return Complex;

As a function which finds A root of the (complex) polynomial P using
Near to start iteration.

Or we could just decide that behavior beyond quintic (fifth order)
polynomials is uninteresting, and say that the package John defined
finds all roots through fifth order.  Beyond that the number of roots
found is at least one, but may depend on the polynomial.  (I would hate
to define the number of roots found, since it will often depend on when
the polynomial division blows up.)

****************************************************************

From: Michael F. Yoder
Sent: Tuesday, September 2, 2003  11:14 AM

>Polynomials have exact solutions through quintic equations, beyond that
you have to use numerical methods.

If you're referring to exact solutions via taking roots, the exact
solutions go up to quartics.  The symmetric group S(4) is solvable, but
S(5) isn't (A(5) is simple of order 60 and a subgroup of index 2 in S(5)).

I'm pretty sure I've read claims that the solutions via root-taking have
poor numerical qualities.  Be that as it may, I'm inclined to leave
questions of this sort to the implementers.

Section 9.5 of "Numerical Recipes in Pascal" has a good discussion of
root-finding, including "Bairstow's method" for finding quadratic
factors while avoiding complex arithmetic.

****************************************************************

From: John Barnes
Sent: Tuesday, September 2, 2003  2:46 PM

>If you're referring to exact solutions via taking roots, the exact
>solutions go up to quartics.  The symmetric group S(4) is solvable, but
>S(5) isn't (A(5) is simple of order 60 and a subgroup of index 2 in S(5)).


Indeed.

>Section 9.5 of "Numerical Recipes in Pascal" has a good discussion of
>root-finding, including "Bairstow's method" for finding quadratic
>factors while avoiding complex arithmetic.

Yes, I have been reading that book and it gave me confidence
that we could expect a general algorithm to find all the
roots in most cases. The algorithms in the NAG library find
all the roots using a variation of Laguerre's method.

****************************************************************

From: Robert I. Eachus
Sent: Tuesday, September 2, 2003  4:46 PM

Michael F. Yoder wrote:

> >Polynomials have exact solutions through quintic equations, beyond
> that you have to use numerical methods.
>
> If you're referring to exact solutions via taking roots, the exact
> solutions go up to quartics.  The symmetric group S(4) is solvable,
> but S(5) isn't (A(5) is simple of order 60 and a subgroup of index 2
> in S(5)).

In theory you are right:

Finally, Ruffini
<http://library.wolfram.com/examples/quintic/people/Ruffini.html> (1799)
and Abel <http://library.wolfram.com/examples/quintic/people/Abel.html>
(1826) showed that the solution of the general quintic cannot be written
as a finite formula involving only the four arithmetic operations and
the extraction of roots.

However...

In 1877 Klein
<http://library.wolfram.com/examples/quintic/people/Klein.html>
published Lectures on the Icosahedron and the Solution of Equations of
the Fifth Degree. In this remarkable book and later article Klein gave a
complete solution of the quintic.

He used a Tschirnhaus transformation to reduce the general quintic to
the form:

He found the solution of this reduced quintic by first solving the
related icosahedral equation:

where Z can be expressed in radicals in terms of a, b, and c.

Quotes are from: http://library.wolfram.com/examples/quintic/main.html

(In case anyone is wondering how Ruffini and Abel are wrong, they aren't.  The
solution is analytic, but not completely in terms of arithmetic operations and
roots.  But it is analytic.)

In practice, quintic equations with real coefficients are solved by using Newton
iteration to find the first root, which is always real, and then solving the
quartic residue.

****************************************************************

From: John Barnes
Sent: Thursday, September 4, 2003  1:42 AM

Robert (Bob)

Thank you for that interesting information regarding
quintics.  But there seems to be a formula missing in the
middle of your message.

****************************************************************

From: Robert I. Eachus
Sent: Friday, September 5, 2003  2:56 PM

Look at the page http://library.wolfram.com/examples/quintic/main.html
that I was quoting from.  I gather that the general quintic solution
equations did not fit in the margins of the e-mail, or some such. ;-)

Incidentally, the Klein that first found a general analytic form of the
solutions to quintic equations was the same Klein who gave his name to
Klein bottles.

But don't miss my final comment.  The real (ouch!) reason for stopping
at quintics, at least for finding all roots of higher equations, is that for
quintics with real coefficients, even nasty ones, you can find a real
root using Newton iteration, then use polynomial division to get a
(reduced) quartic, and solve that for the remaining roots.  You can't do
that with 6th-order polynomials (or higher) since a 6th-order equation
may not have any real roots.  (And iteration near a multiple imaginary
root can be very slow.)

****************************************************************

