!standard G.1.2 (15) 01-09-07 AI95-00185/04
!class binding interpretation 97-03-19
!status ARG approved 5-0-3 01-05-20
!status work item 99-09-18
!status received 97-03-19
!priority Medium
!difficulty Hard
!subject Branch cuts of inverse trigonometric and hyperbolic functions
!summary
Replace G.1.2(15-17) by:
The imaginary component of the result of the Arcsin, Arccos and Arctanh
functions is discontinuous as the parameter X crosses the real axis to the left
of -1.0 or the right of 1.0.
The real component of the result of the Arctan and Arcsinh functions is
discontinuous as the parameter X crosses the imaginary axis below -i or above i.
The real component of the result of the Arccot function is discontinuous as
the parameter X crosses the imaginary axis below -i or above i.
Replace G.1.2(20) by:
The computed results of the mathematically multivalued functions are rendered
single-valued by the following conventions, which are meant to imply that the
principal branch is an analytic continuation of the corresponding real-valued
function in Ada.Numerics.Generic_Elementary_Functions. (For Arctan and Arccot,
the single-argument function in question is that obtained from the two-argument
version by fixing the second argument to be its default value.)
!question
The definition of the branch cuts in RM95 G.1.2(15-17) seems contradictory with
other rules regarding these functions, and inconsistent with common mathematical
practice.
!recommendation
(See summary.)
!wording
(See summary.)
!discussion
The RM description of these functions contains contradictions. Fortunately,
these are easily resolved if we assume the (ideal) functions over the complex
plane are meant to be analytic continuations of the (ideal) same-named
functions in Ada.Numerics.Generic_Elementary_Functions. (For Arctan and
Arccot, it is necessary to use the one-argument function derived by using the
default argument of the two-argument function.) This is consistent with normal
mathematical usage, and desirable in its own right.
G.1.2(17) defines the branch cut of Arccot as follows:
"The real component of the result of the Arccot function is discontinuous as
the parameter X crosses the imaginary axis between -i and i."
G.1.2(24) defines the principal value of Arccot as follows:
"The real component of the result of the Arccot function ranges from 0.0 to
approximately Pi."
These two paragraphs contradict each other. Consider what happens when X is
real and close to 0.0. The multi-valued Arccot of 0.0 is any odd multiple of
Pi/2.0. Because G.1.2(17) requires a discontinuity at 0.0, Arccot (-0.0) and
Arccot (+0.0) must be two different odd multiples of Pi/2.0. But G.1.2(24)
constrains the range of Arccot so that the only acceptable multiple of Pi/2.0
is Pi/2.0.
We follow G.1.2(24), because an analytic continuation of Arccot for real
arguments cannot have a branch cut crossing the real axis. What must be
changed here is the location of the cut.
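The relocation of the Arccot cut can be checked numerically. The sketch below
is not Ada; it uses Python's cmath module as a stand-in, on the assumption that
its Arctan follows the same cut convention (imaginary axis outside -i .. i).
The helper arccot is hypothetical: it defines Arccot as Pi/2 - Arctan so that
real arguments land in 0.0 .. Pi, as G.1.2(24) requires.

```python
import cmath
import math

def arccot(z: complex) -> complex:
    # Hypothetical helper (not an Ada or cmath routine): principal Arccot
    # taken as Pi/2 - Arctan, which maps real arguments into (0, Pi).
    return math.pi / 2.0 - cmath.atan(z)

EPS = 1.0e-9  # small step used to sit on either side of an axis point

# Either side of 0.0 on the real axis: the real part does not jump.
near_zero_left = arccot(complex(-EPS, 0.0)).real
near_zero_right = arccot(complex(EPS, 0.0)).real

# Either side of the imaginary axis between -i and i (at 0.5i): no jump.
inside_left = arccot(complex(-EPS, 0.5)).real
inside_right = arccot(complex(EPS, 0.5)).real

# Either side of the imaginary axis above i (at 2i): jump of about Pi.
outside_left = arccot(complex(-EPS, 2.0)).real
outside_right = arccot(complex(EPS, 2.0)).real

print("near 0.0   :", near_zero_left, near_zero_right)
print("at 0.5i    :", inside_left, inside_right)
print("at 2.0i    :", outside_left, outside_right)
```

Under these assumptions the only discontinuity of the real part appears on the
imaginary axis outside -i .. i, which is where the revised G.1.2(17) places it.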
Now consider the rules related to Arcsin:
"The real component of the result of the Arcsin function is discontinuous as
the parameter X crosses the real axis to the left of -1.0 or the right of 1.0."
(RM95 G.1.2(15))
and:
"The range of the real component of the result of the Arcsin function is
approximately -Pi/2.0 to Pi/2.0." (RM95 G.1.2(23))
Remember that if Y is a result of the multi-valued Arcsin, Pi - Y and Y + 2.0 *
Pi are also results.
Consider what happens when X crosses the real axis to the right of 1.0. Let X
= A + I * B be a complex number where A > 1.0 and B is small compared to A (so
that we can use first order approximation). A first order approximation of
one value of the multi-valued Arcsin (X) is:
Y = Pi / 2.0 + B / Sqrt (A**2 - 1.0) - I * Log (A + Sqrt (A**2 - 1.0))
When B > 0.0, the real part of Y is slightly above Pi / 2.0. In order to keep
the real part of Arcsin (X) in the range -Pi / 2.0 .. Pi / 2.0, we have to use
Y when B < 0.0 and Pi - Y when B > 0.0. This causes the imaginary part to
become discontinuous. This illustrates that RM95 G.1.2(23) requires
that the imaginary part, not the real part, be discontinuous when X crosses the
real axis to the right of 1.0.
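The first order argument above can be reproduced numerically. The following is
a sketch in Python rather than Ada, assuming that cmath.asin keeps its real
part within -Pi/2 .. Pi/2 as G.1.2(23) requires. The function first_order and
the sample values A = 2.0 and B = 1.0e-6 are illustrative choices, not part of
the RM.

```python
import cmath
import math

A = 2.0       # real part, to the right of 1.0 (illustrative value)
B = 1.0e-6    # small imaginary part (illustrative value)

def first_order(a: float, b: float) -> complex:
    # Y = Pi/2 + B/Sqrt(A**2 - 1) - I*Log(A + Sqrt(A**2 - 1)), as in the text.
    root = math.sqrt(a * a - 1.0)
    return complex(math.pi / 2.0 + b / root, -math.log(a + root))

y_above = first_order(A, B)     # candidate value for X just above the axis
y_below = first_order(A, -B)    # candidate value for X just below the axis

above = cmath.asin(complex(A, B))
below = cmath.asin(complex(A, -B))

# Below the axis the candidate Y is already in range; above the axis the
# in-range value is Pi - Y, which flips the sign of the imaginary part.
print("below:", below, "expected about", y_below)
print("above:", above, "expected about", math.pi - y_above)
```

The real parts on the two sides agree closely, while the imaginary parts have
opposite signs, which is the discontinuity described in the revised G.1.2(15).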
A similar analysis could be performed for X to the left of -1.0 and for Arccos
and Arcsinh. For these cases, the description of the cuts is accurate, and
what must change is the description of the properties of the functions.
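The remaining cases can be spot-checked the same way. This is a sketch in
Python, assuming that the cmath functions use the cuts discussed here; the
helper jump and the sample points are illustrative only.

```python
import cmath

EPS = 1.0e-9  # small step used to sit on either side of a cut

def jump(f, z_one_side: complex, z_other_side: complex):
    # Size of the change in the real and imaginary parts of f across a cut.
    a, b = f(z_one_side), f(z_other_side)
    return abs(a.real - b.real), abs(a.imag - b.imag)

# Arcsin across the real axis to the left of -1.0.
asin_left = jump(cmath.asin, complex(-2.0, EPS), complex(-2.0, -EPS))

# Arccos across the real axis to the right of 1.0.
acos_right = jump(cmath.acos, complex(2.0, EPS), complex(2.0, -EPS))

# Arcsinh across the imaginary axis above i.
asinh_up = jump(cmath.asinh, complex(EPS, 2.0), complex(-EPS, 2.0))

for name, (d_re, d_im) in [("Arcsin", asin_left),
                           ("Arccos", acos_right),
                           ("Arcsinh", asinh_up)]:
    print(name, "real jump:", d_re, "imaginary jump:", d_im)
```

With these sample points, Arcsin and Arccos show a jump only in the imaginary
part, and Arcsinh shows a jump only in the real part.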
The rules given in the !summary follow from the assumption that the functions
on the complex plane are meant to be analytic continuations of the
corresponding functions on the real line.
!appendix
!section G.1.2(15)
!subject Branch cuts of inverse trigonometric and hyperbolic functions
!reference RM95 G.1.2(15)
!reference RM95 G.1.2(16)
!reference RM95 G.1.2(17)
!reference RM95 G.1.2(24)
!from Pascal Leroy 97-03-10
!reference 97-15727.f Pascal Leroy 97-3-10>>
!discussion
G.1.2(17) defines the branch cut of Arccot as follows:
"The real component of the result of the Arccot function is discontinuous as
the parameter X crosses the imaginary axis between -i and i."
G.1.2(24) defines the principal value of Arccot as follows:
"The real component of the result of the Arccot function ranges from 0.0 to
approximately Pi."
These two paragraphs seem to contradict each other. Consider what happens
when X is real and close to 0.0. Mathematically, the Arccot of 0.0 is any odd
multiple of Pi/2.0. Because G.1.2(17) requires a discontinuity at 0.0, Arccot
(-0.0) and Arccot (+0.0) must be two different odd multiples of Pi/2.0. But
G.1.2(24) constrains the range of Arccot so that the only acceptable multiple
of Pi/2.0 is Pi/2.0. So Arccot cannot be discontinuous at 0.0 after all...
Also, the paragraphs G.1.2(15) and G.1.2(16) define branch cuts as follows:
"The real (resp. imaginary) component of the result of the Arcsin and Arccos
(resp. Arctanh) functions is discontinuous as the parameter X crosses the real
axis to the left of -1.0 or the right of 1.0.
The real (resp. imaginary) component of the result of the Arctan (resp.
Arcsinh) functions is discontinuous as the parameter X crosses the imaginary
axis below -i or above i."
These rules are puzzling, because the natural mathematical definition of
Arcsin and Arccos is such that the real part is continuous; it is the
imaginary part which has branch cuts. Similarly, the natural mathematical
definition of Arcsinh is such that the imaginary part is continuous; it is the
real part which has branch cuts.
****************************************************************
From: Mike Yoder
Sent: Monday, December 11, 2000 5:33 PM
Subject: AI95-00185, "Branch cuts of inverse trigonometric and hyperbolic functions"
I agree with the stated conclusions, and with the suggested fix. I have a
mathematical quibble with one paragraph; this is given at the end, since it
affects no conclusions. In the following, mathematical terminology is as in
Ahlfors, _Complex Analysis_, 2nd edition.
I had this significant difficulty: though I agree with the fixes, the RM
language as it stands doesn't imply that the fixes are the right ones. Nor
does the Ada 95 Rationale provide any help. There is, however, a simple
principle that would make all cases unambiguous for all practical purposes:
namely, that the (ideal) functions are always analytic continuations of the
(ideal) functions of the same name over the reals. (That is, the ones in
Ada.Numerics.Generic_Elementary_Functions.) It would be good to make this
principle explicit somewhere, ideally somewhere preceding the statements in
G.1.2(12) and G.1.2(20).
By "for all practical purposes" I mean: the remaining ambiguity is removable by
adding the subsidiary principle that branch cuts are always subsets of the real
or the imaginary axis. This is "intuitively obvious" in some sense but is
perhaps worth stating explicitly for the sake of complete clarity.
The quibble I mentioned is with this paragraph:
>These rules are puzzling, because the natural mathematical definition of
>Arcsin and Arccos is such that the real part is continuous; it is the
>imaginary part which has branch cuts. Similarly, the natural mathematical
>definition of Arcsinh is such that the imaginary part is continuous; it is the
>real part which has branch cuts.
I'm pretty sure I know what is meant here by extending the notion of "branch
cut" to real-valued functions. I'm dubious about whether this is the right way
to do so. For example, the real part of arccos is continuous but not
differentiable across its cut: its behavior is like that of abs(x) at x=0. For
any chosen function among those under discussion (call it 'f'), there is never
an analytic function in any neighborhood straddling a branch cut of f whose
real part matches f's real part. So, I'd prefer to let "branch cut" apply only
to analytic functions.
****************************************************************
From: Pascal Leroy
Sent: Wednesday, December 13, 2000 4:15 AM
Subject: Re: [Ada-Comment] AI-00185
> I had this significant difficulty: though I agree with the fixes, the RM
> language as it stands doesn't imply that the fixes are the right ones. Nor
> does the Ada 95 Rationale provide any help.
Agreed. I looked for guidance in these documents, and couldn't find any, so
I had to go back to the math textbooks.
> There is, however, a simple
> principle that would make all cases unambiguous for all practical purposes:
> namely, that the (ideal) functions are always analytic continuations of the
> (ideal) functions of the same name over the reals. (That is, the ones in
> Ada.Numerics.Generic_Elementary_Functions.) It would be good to make this
> principle explicit somewhere, ideally somewhere preceding the statements in
> G.1.2(12) and G.1.2(20).
Agreed.
> By "for all practical purposes" I mean: the remaining ambiguity is removable
> by adding the subsidiary principle that branch cuts are always subsets of the
> real or the imaginary axis. This is "intuitively obvious" in some sense but
> is perhaps worth stating explicitly for the sake of complete clarity.
If a complex function is (1) analytic and (2) an extension of the real
function with the same name, you can _prove_ the following:
1 - The branch cuts are invariant by complex conjugation.
2 - For odd functions the branch cuts are invariant by reflection in the
origin.
3 - The branch cuts begin and end at the points where the function has no
Taylor/Laurent series expansion.
For the functions at hand, this pretty much constrains the cuts to lie on
the axes. So it seems to me that your "subsidiary principle" is
unnecessary.
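Properties 1 and 2 are easy to spot-check away from the cuts. The sketch below
uses Python's cmath functions as stand-ins for the ideal functions, which is an
assumption; the sample points are arbitrary and deliberately avoid both axes.

```python
import cmath

# Odd functions among those under discussion (Arccos is not odd).
FUNCTIONS = {
    "Arcsin": cmath.asin,
    "Arctan": cmath.atan,
    "Arcsinh": cmath.asinh,
    "Arctanh": cmath.atanh,
}

# Arbitrary sample points, none of them on the real or imaginary axis.
SAMPLES = [complex(0.3, 0.4), complex(2.0, 0.5),
           complex(-1.5, 2.5), complex(0.2, -3.0)]

def max_symmetry_error(f):
    # Largest violation of f(z*) = f(z)* and of f(-z) = -f(z) on SAMPLES.
    conj_err = max(abs(f(z.conjugate()) - f(z).conjugate()) for z in SAMPLES)
    odd_err = max(abs(f(-z) + f(z)) for z in SAMPLES)
    return conj_err, odd_err

errors = {name: max_symmetry_error(f) for name, f in FUNCTIONS.items()}

for name, (conj_err, odd_err) in errors.items():
    print(name, "conjugation error:", conj_err, "oddness error:", odd_err)
```

Both identities map a point off the cuts to another point off the cuts, so a
cut that is not symmetric under conjugation (or, for odd functions, under
reflection in the origin) would show up here as a large error.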
> >These rules are puzzling, because the natural mathematical definition of
> >Arcsin and Arccos is such that the real part is continuous; it is the
> >imaginary part which has branch cuts. Similarly, the natural mathematical
> >definition of Arcsinh is such that the imaginary part is continuous; it is
> > the real part which has branch cuts.
>
> I'm pretty sure I know what is meant here by extending the notion of "branch
> cut" to real-valued functions. I'm dubious about whether this is the right
> way to do so. For example, the real part of arccos is continuous but not
> differentiable across its cut: its behavior is like that of abs(x) at x=0.
> For any chosen function among those under discussion (call it 'f'), there is
> never an analytic function in any neighborhood straddling a branch cut of f
> whose real part matches f's real part. So, I'd prefer to let "branch cut"
> apply only to analytic functions.
I understand the argument that we should not apply the words "branch cut" to
a function which is not analytic, and surely the real/imaginary parts of
these functions aren't. How about replacing "has branch cuts" by "is
discontinuous" in the above paragraph? I.e.:
"These rules are puzzling, because the natural mathematical definition of
Arcsin and Arccos is such that the real part is continuous; it is the
imaginary part which is discontinuous. Similarly, the natural mathematical
definition of Arcsinh is such that the imaginary part is continuous; it is
the real part which is discontinuous."
Note that at some point I tried to spell out what I meant by "natural
mathematical definition" (essentially the definition based on the complex
logarithm) but I gave up because of the difficulty of writing complicated
mathematical formulas in a plain text file.
****************************************************************
From: Pascal Leroy
Sent: Thursday, December 14, 2000 4:35 AM
> Hi, Pascal. I thought I'd spare the others the gruesome mathematical
> details, and just converse with you.
Good idea. I am copying John, who is as far as I can tell the only other
ARG member interested in "gruesome mathematical details". I am also copying
Randy to make sure that the discussion is recorded in the AI.
> >If a complex function is (1) analytic and (2) an extension of the real
> >function with the same name, you can _prove_ the following:
> >
> >1 - The branch cuts are invariant by complex conjugation.
> >2 - For odd functions the branch cuts are invariant by reflection in the
> >origin.
> >3 - The branch cuts begin and end at the points where the function has no
> >Taylor/Laurent series expansion.
>
> For #3 I agree. For #1 and #2 I don't see this, unless there are
> *different* implicit assumptions being made, in which case it might be
> worthwhile to explicitly state *those* assumptions.
>
> Could you supply the proofs in question to me?
I am copying the following from a 1987 paper by W. Kahan on the
implementation of complex elementary functions.
"Each of our nine elementary complex functions f(z) has a slit or slits that
bound a region, called the principal domain, inside which f(z) has a
principal value that is single valued and analytic (representable locally by
power series), though it must be discontinuous across the slit(s). That
principal value is an extension, with maximal principal domain, of a real
elementary function f(x) analytic at every interior point of its domain,
which is a segment of the real x-axis. To conserve the power series'
validity, points strictly inside that segment must also lie strictly inside
the principal domain; therefore the slit(s) cannot intersect the segment's
interior. Let z* = x - iy denote the complex conjugate of z = x + iy; the
power series for f(x) satisfy the identity f(z*) = f(z)* within some complex
neighbourhood of the segment's interior, so the identity should persevere
throughout the principal domain's interior too. Consequently complex
conjugation must map the slit(s) to itself/themselves. The slit(s) of an
odd function f(z) = -f(-z) must be invariant under reflection in the origin
z = 0. Finally, the slit(s) must begin and end at branch-points: these are
singularities around which some branch of the function cannot be represented
by a Taylor nor Laurent series expansion. A slit can end at a branch point at
infinity.
Consequently the slit for Sqrt, Log, and z ** w turns out to be the negative
real axis. Then the slits for Arcsin, Arccos and Arctanh turn out to be
those parts of the real axis not between -1 and +1; similarly those parts of
the imaginary axis not between -i and +i serve as slits for Arctan and
Arcsinh. The slit for Arccosh, the only slit with a finite branch-point
(-1) inside it, must be drawn along the real axis where z <= +1. None of
this is controversial, although a few other writers have at times drawn the
slits elsewhere either for a special purpose or by mistake; other tastes can
be accommodated by substitutions sometimes so simple as writing, say, Log
(-1) - Log (-1/z) in place of Log (z) to draw its slit along (and just
under) the positive real axis instead of the negative real axis."
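The substitution at the end of the quotation can be tried directly. This is a
sketch in Python, assuming that cmath.log has its slit on the negative real
axis; log_alt is a hypothetical name for the substituted function.

```python
import cmath

EPS = 1.0e-9  # small step used to sit just above or just below the real axis

def log_alt(z: complex) -> complex:
    # The substitution from the quotation: Log (-1) - Log (-1/z).
    return cmath.log(-1) - cmath.log(-1 / z)

# Imaginary parts just above and just below the negative real axis.
std_neg = (cmath.log(complex(-1.0, EPS)).imag,
           cmath.log(complex(-1.0, -EPS)).imag)
alt_neg = (log_alt(complex(-1.0, EPS)).imag,
           log_alt(complex(-1.0, -EPS)).imag)

# Imaginary parts just above and just below the positive real axis.
std_pos = (cmath.log(complex(1.0, EPS)).imag,
           cmath.log(complex(1.0, -EPS)).imag)
alt_pos = (log_alt(complex(1.0, EPS)).imag,
           log_alt(complex(1.0, -EPS)).imag)

print("negative axis, Log    :", std_neg)
print("negative axis, log_alt:", alt_neg)
print("positive axis, Log    :", std_pos)
print("positive axis, log_alt:", alt_pos)
```

The ordinary Log jumps by about 2 Pi across the negative real axis and is
continuous across the positive one; the substituted function does the reverse.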
> As extrinsic evidence
> for this, I'll quote Ahlfors p. 98: "The cut along the positive axis could
> be replaced by a cut along any simple arc from 0 to [infinity]... the cuts
> are in no way distinguished lines on the surface, but the introduction of
> specific cuts is necessary for descriptive purposes." In his context he
> was constructing a cut which would *not* be a usual choice, for the
> function z**(1/n) where n > 1 is an integer.
First, note that the cut for z ** (1/n) falls "naturally" on the negative
real axis. This is quite important, because we don't want the cut to lie on
a part of the real axis where the equivalent real function is well-defined
(x ** (1/n) is perfectly well-defined for x >= 0).
Now it is true that you can put the cuts where you want. For instance if
you rewrite z ** (1/n) as ((a * z) ** (1/n)) * ((1/a) ** (1/n)), that causes
the cut to rotate around the origin (its direction is given by -a*). But
then the resulting function is _not_ the sum of a power series, because we
don't have f(z*) = f(z)* anymore.
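A small numerical sketch of this rewriting, in Python, with the assumed values
a = i and n = 2 (so that -a* = i and the cut should move to the positive
imaginary axis). The function rotated_root is a hypothetical name for the
rewritten expression.

```python
def rotated_root(z: complex, a: complex, n: int) -> complex:
    # ((a * z) ** (1/n)) * ((1/a) ** (1/n)), using Python's principal powers.
    return ((a * z) ** (1.0 / n)) * ((1 / a) ** (1.0 / n))

A_ROT = 1j    # assumed rotation factor; the cut direction is -conj(A_ROT) = i
N = 2         # assumed root index
EPS = 1.0e-9  # small step used to sit on either side of the imaginary axis

# On the positive real axis the rewritten function still matches the real root.
on_real_axis = rotated_root(complex(4.0, 0.0), A_ROT, N)

# Either side of the positive imaginary axis, at 2i: the value changes sign.
right_of_cut = rotated_root(complex(EPS, 2.0), A_ROT, N)
left_of_cut = rotated_root(complex(-EPS, 2.0), A_ROT, N)

# The conjugation identity f(z*) = f(z)* fails for the rewritten function.
z = complex(-2.0, 1.0)
conj_gap_rotated = abs(rotated_root(z.conjugate(), A_ROT, N)
                       - rotated_root(z, A_ROT, N).conjugate())
conj_gap_principal = abs(z.conjugate() ** 0.5 - (z ** 0.5).conjugate())

print("at 4.0          :", on_real_axis)
print("right of 2i     :", right_of_cut)
print("left of 2i      :", left_of_cut)
print("conjugation gaps:", conj_gap_rotated, conj_gap_principal)
```

With these values the rewritten function agrees with the real square root at
4.0, jumps across the positive imaginary axis, and breaks the conjugation
identity that the principal root satisfies.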
> Regardless, I intend to drop the suggestion that any secondary principle is
> needed.
I don't think a secondary principle is needed, but a note could be in order,
to summarize the discussion we are having (if we are having this discussion,
probably someone else could get confused too).
****************************************************************
From: Michael Yoder
Sent: Thursday, December 14, 2000 12:20 PM
Here (from Kahan's words) are the additional assumptions I knew had to be
there:
> Let z* = x - iy denote the complex conjugate of z = x + iy; the
>power series for f(x) satisfy the identity f(z*) = f(z)* within some complex
>neighbourhood of the segment's interior, so the identity should persevere
>throughout the principal domain's interior too. Consequently complex
>conjugation must map the slit(s) to itself/themselves. The slit(s) of an
>odd function f(z) = -f(-z) must be invariant under reflection in the origin
>z = 0.
Here he assumes *without* explicit statement (presumably by analogy) that
if the multivalued function is odd, our single-valued restriction of it
should be odd as well.
Many years ago I would have taken assumptions like this for granted with
impunity. I am more careful nowadays because I've found cases where two
seemingly self-evident assumptions contradict, and also cases where
obviously desirable axioms lead to potentially large implementation
difficulties.
Indeed, the following problem occurs with his oddness axiom for Arcsin and
Arctanh on machines without signed zeros. (Here the branch cuts are that
part of the real axis with absolute value greater than 1.) Is it
preferable for the implementation function to be odd on the branch cuts
themselves, or for the result to match that which would obtain on an
implementation with signed zeros for an argument with imaginary part
+0.0? Of course he may have only meant to demand that symmetry in most of
the complex plane and not on the cuts; this might be resolvable by reading
the rest of his paper.
To sum up: I prefer to state explicitly even the "obvious" axioms because,
as in the example just cited, it is possible for two "obvious" axioms to
collide.
****************************************************************
From: John Barnes
Sent: Thursday, December 14, 2000 10:20 AM
Thanks for keeping old John informed on this indeed gruesome
mathematical topic. Analysis never was my first love - I was more for
geometry (and curvaceous girls).
I met Kahan once. In fact I gave him a lift from a conference we were
both at somewhere in Oregon to the airport. It was a strange
conference arranged by Los Alamos and I was invited to tell them about
Ada. Must have been around 1982.
A bright man. He told me how he designed the algorithms built in to
the HP pocket calculator for integration and equation solution. He
was given so many bytes to fit it in.
Anyway, without reading this discussion in too much detail, I think
the main thing is to ensure that the truth is told but I wouldn't give
the reader too much detail or alternative views in case it just
confuses.
****************************************************************
From: Randy Brukardt
Sent: Thursday, December 14, 2000 5:14 PM
> I don't think a secondary principle is needed, but a note could be in order,
> to summarize the discussion we are having (if we are having this discussion,
> probably someone else could get confused too).
While I barely have a clue what this discussion is about (no, no, NO don't try
to tell me!!), this sounds like an AARM-only note. I doubt we would want to
clutter up the RM with this stuff.
****************************************************************
From: Pascal Leroy
Sent: Friday, December 15, 2000 3:45 AM
> While I barely have a clue what this discussion is about (no, no, NO don't
> try to tell me!!),
That's OK, as I said I am only copying you to make sure that the discussion
is recorded in extenso.
> this sounds like an AARM-only note. I doubt we would want
> to clutter up the RM with this stuff.
Undoubtedly. Users of the RM don't want to know the details. However a
numerics expert would probably be interested in finding the assumptions
documented in the AARM. I wish the original authors of annex G had done
that, because it would have been much easier to correct their mistake (and
maybe they wouldn't even have made the mistake in the first place).
****************************************************************
From: Pascal Leroy
Sent: Friday, December 15, 2000 3:57 AM
> Here he assumes *without* explicit statement (presumably by analogy) that
> if the multivalued function is odd, our single-valued restriction of it
> should be odd as well.
True. Making this assumption explicit in the AI (and the AARM) is fine with
me.
> Indeed, the following problem occurs with his oddness axiom for Arcsin and
> Arctanh on machines without signed zeros. (Here the branch cuts are that
> part of the real axis with absolute value greater than 1.) Is it
> preferable for the implementation function to be odd on the branch cuts
> themselves, or for the result to match that which would obtain on an
> implementation with signed zeros for an argument with imaginary part
> +0.0? Of course he may have only meant to demand that symmetry in most of
> the complex plane and not on the cuts; this might be resolvable by reading
> the rest of his paper.
Kahan's paper was, among other things, insisting on the importance of signed
zeros (the subtitle of the paper is: Much Ado About Nothing's Sign Bit).
Other than explaining how complex elementary functions should be computed
(this is surprisingly difficult because some of these functions behave
wildly in the vicinity of their cuts), he explains that if you don't have
signed zeros then you run into all sorts of contradictions when
implementing/using these functions, because many natural mathematical
identities become wrong on the cuts.
(I must say btw that I cannot get too excited by non-IEEE machines these
days. And I would definitely say that, if the underlying hardware is IEEE,
a decent Ada compiler must have 'Signed_Zeros = True.)
****************************************************************