NUMERICAL SOLUTION OF BIVARIATE AND POLYANALYTIC POLYNOMIAL SYSTEMS

LAURENT SORBER†§, MARC VAN BAREL†, AND LIEVEN DE LATHAUWER‡§

Abstract. Finding the real solutions of a bivariate polynomial system is a central problem in robotics, computer modeling and graphics, computational geometry and numerical optimization. We propose an efficient and numerically robust algorithm for solving bivariate and polyanalytic polynomial systems using a single generalized eigenvalue decomposition. In contrast to existing eigen-based solvers, the proposed algorithm does not depend on Gröbner bases or normal sets, nor does it require computing eigenvectors or solving additional eigenproblems to recover the solution. The method transforms bivariate systems into polyanalytic systems and then uses resultants in a novel way to project the variables onto the real plane associated with the two variables. Solutions are returned counting multiplicity and their accuracy is maximized by means of numerical balancing and Newton–Raphson refinement. Numerical experiments show that the proposed algorithm consistently recovers a higher percentage of solutions and is at the same time significantly faster and more accurate than competing double precision solvers.

1. Introduction. Computing the roots of a set of polynomial equations is a well-studied problem for which a wide range of techniques has been developed. These can roughly be categorized into numeric, symbolic and hybrid methods. Numeric techniques based on Newton's method [38] are useful for finding solutions locally. However, there is no guarantee of finding all solutions, and multiple roots may cause convergence issues. Methods based on interval arithmetic [35] are robust, but can suffer from slow convergence and require bounds on the solutions of interest. In the context of optimization, there also exist various convex relaxation methods for the global minimization of polynomials, such as semidefinite programming relaxations [26, 39, 45], but these only return a lower bound on the objective function in general. Gröbner bases [9, 17] and resultants [16] are symbolic methods that originate from algebraic geometry. They are both used to eliminate variables, essentially reducing the problem to finding the roots of univariate polynomials. However, Gröbner bases are expensive to compute and are often numerically ill-conditioned. On the other hand, resultants can introduce spurious solutions and have historically not led to numerically robust algorithms either. Homotopy continuation methods [29, 54] are hybrid techniques that combine homotopy methods with numerical continuation. The idea is to start from a system with a known solution set, and then to numerically track the solutions as the system is gradually transformed into the target system. Although homotopy continuation has been vastly improved in recent years, issues such as path crossing can decrease its robustness [36]. Adaptive multiprecision has proven to be an effective means of dealing with these problems [6]. For a recent overview of homotopy methods and algebraic geometry, see the survey [55] and the book [46].

∗Received by the editors DATE; accepted for publication (in revised form) DATE; published electronically DATE. The scientific responsibility rests with the authors. http://www.siam.org/journals/siopt/x-x/xxxxx.html

†Department of Computer Science, KU Leuven, Celestijnenlaan 200A, BE-3001 Leuven, Belgium (Laurent.Sorber@cs.kuleuven.be, Marc.VanBarel@cs.kuleuven.be). Laurent Sorber is supported by a doctoral fellowship of the Flanders agency for Innovation by Science and Technology (IWT). Marc Van Barel's research is supported by (1) the Research Council KU Leuven: (a) OT/10/038, (b) PF/10/002 Optimization in Engineering (OPTEC); (2) the Research Foundation Flanders (FWO): G.0828.14N; and (3) the Belgian Network DYSCO (Dynamical Systems, Control, and Optimization), funded by the Interuniversity Attraction Poles Programme, initiated by the Belgian State, Science Policy Office.

‡Group Science, Engineering and Technology, KU Leuven Kulak, E. Sabbelaan 53, BE-8500 Kortrijk, Belgium (Lieven.DeLathauwer@kuleuven-kulak.be).

§STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Kasteelpark Arenberg 10, BE-3001 Leuven, Belgium, and iMinds Future Health Department (Lieven.DeLathauwer@esat.kuleuven.be). Lieven De Lathauwer's research is supported by (1) the Research Council KU Leuven: (a) GOA-MaNet, (b) CoE EF/05/006 Optimization in Engineering (OPTEC), (c) CIF1, (d) STRT1/08/023; (2) the Research Foundation Flanders (FWO): (a) G.0427.10N, (b) G.0830.14N, (c) G.0881.14N; and (3) the Belgian Network DYSCO: IUAP P7/19.

In this article, we propose an efficient and numerically robust algorithm to compute the isolated real solutions of bivariate polynomial systems. Instead of using resultants to project the solutions onto the complex plane associated with one of the two variables, we apply them in a novel way to project solutions onto the real plane associated with both variables. The key to this projection is to first transform the bivariate polynomial system into a so-called polyanalytic polynomial system. The projected system is then reduced to a polynomial eigenvalue problem, which can in turn be linearized into a generalized eigenvalue problem. As such, our method can be seen as a hybrid method, although symbolic manipulations are absent. In contrast to existing eigen-based solvers [3, 10, 13, 22, 27, 30, 31, 34, 50], the proposed algorithm does not depend on Gröbner bases or normal sets, nor does it require computing eigenvectors or solving additional eigenproblems to recover the solution. Moreover, the method returns solutions counting multiplicity and is also applicable to polyanalytic polynomial systems. We aim to maximize accuracy with numerical balancing and Newton–Raphson refinement, and present a robust method to filter out spurious solutions introduced by the resultant. In our numerical experiments, we compare performance with Bertini [7], Chebfun [51], PHCpack [53], the RealSolving C library [41, 42], the RegularChains package [2] and Maple's BivariatePolynomial method on a range of bivariate systems of varying difficulty. The resulting performance profiles show that the proposed algorithm consistently recovers a higher percentage of solutions and is at the same time significantly faster and more accurate.

The paper is organized as follows. In the next subsection we review our notation and introduce some basic definitions. In Section 2 we examine a classical approach to solving bivariate polynomial systems and propose an algorithm for solving both bivariate and polyanalytic polynomial systems with a single generalized eigenvalue decomposition. Furthermore, we also present several balancing steps and a robust strategy for removing spurious roots. In Section 3 we evaluate the performance of the proposed algorithm on a sizable set of low- to moderate-degree bivariate polynomial systems using performance profiles [15]. We conclude the paper in Section 4.

1.1. Notation and preliminaries. Vectors are denoted by boldface lowercase letters, e.g., $\mathbf{a}$. Matrices are usually denoted by capital letters, e.g., $A$. Higher-order tensors are denoted by Euler script letters, e.g., $\mathcal{A}$. An entry of a vector $\mathbf{a}$, matrix $A$ or tensor $\mathcal{A}$ is denoted by $a_i$, $a_{ij}$ or $a_{ijk\cdots}$, depending on the number of modes. A colon is used to select all entries of a mode. For instance, $a_{:j}$ corresponds to the $j$th column of a matrix $A$. When there is no confusion, we also use $\mathbf{a}_j$ to denote the $j$th column of the matrix $A$. Sequences are denoted by a superscript in parentheses, e.g., $\{A^{(n)}\}_{n=1}^{N}$. The superscripts $\cdot^{\mathrm{T}}$, $\cdot^{\mathrm{H}}$, $\cdot^{-1}$ and $\cdot^{\dagger}$ are used for the transpose, Hermitian conjugate, matrix inverse and Moore–Penrose pseudoinverse, respectively. The complex conjugate is denoted by an overbar, e.g., $\overline{a}$ is the complex conjugate of the scalar $a$. We use parentheses to denote the concatenation of two or more vectors, e.g., $(\mathbf{a}, \mathbf{b})$ is equivalent to $[\mathbf{a}^{\mathrm{T}}\ \mathbf{b}^{\mathrm{T}}]^{\mathrm{T}}$. The $n \times n$ identity matrix is denoted by $I_n$, and the all-zero and all-one $m \times n$ matrices by $0_{m \times n}$ and $1_{m \times n}$, respectively.


Let $\mathcal{B}$ be a basis for the ring of polynomials $\mathbb{C}[x]$ with basis polynomials $B_n(x)$ of degree $n$. For example, if $\mathcal{B}$ is the monomial basis or Chebyshev basis, the corresponding basis polynomials $B_n(x)$ are $x^n$ and $T_n(x) := \cos(n \arccos(x))$, respectively. Given a choice of basis, we associate a coefficient vector $\mathbf{p} \in \mathbb{C}^{d_p+1}$, $p_{d_p+1} \neq 0$, with polynomials

$$p(x) := \begin{bmatrix} B_0(x) & B_1(x) & \cdots & B_{d_p}(x) \end{bmatrix} \cdot \mathbf{p}$$

of degree $d_p$. Analogously, we associate a coefficient matrix $P \in \mathbb{C}^{(d_p^{(y)}+1) \times (d_p^{(x)}+1)}$, $\max_i |p_{i,d_p^{(x)}+1}| \neq 0 \wedge \max_j |p_{d_p^{(y)}+1,j}| \neq 0$, with bivariate polynomials

$$p(x, y) := \begin{bmatrix} B_0(y) & B_1(y) & \cdots & B_{d_p^{(y)}}(y) \end{bmatrix} \cdot P \cdot \begin{bmatrix} B_0(x) & B_1(x) & \cdots & B_{d_p^{(x)}}(x) \end{bmatrix}^{\mathrm{T}}$$

of coordinate degree $(d_p^{(x)}, d_p^{(y)})$. The total degree $d_p$ of the polynomial $p(x, y)$ is defined as the largest value $i + j$ for which $p_{i+1,j+1} \neq 0$. We represent bivariate polynomials in product bases, i.e., as a linear combination of products $B_i(y)B_j(x)$. The choice of basis can depend on the application and on the region in which the roots of interest lie. For example, the product monomial basis is orthogonal with respect to the inner product over the two unit circles in the complex planes associated with $x$ and $y$, while the product Chebyshev basis is orthogonal on the unit square associated with $x$ and $y$.

Substituting $x$ and $y$ by $z$ and its complex conjugate $\overline{z}$ produces so-called polyanalytic polynomials $p(z, \overline{z})$. The term polyanalytic refers to the fact that the function is not analytic in $z$, but is analytic in $z$ and $\overline{z}$ as a whole. Furthermore, it is not hard to show that if $p(z, \overline{z})$ is real-valued for all $z \in \mathbb{C}$, then $P$ is Hermitian.
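As a concrete illustration of this representation (a sketch of ours, not code from the paper's toolbox), the following evaluates a coefficient matrix in the product monomial basis and checks numerically that a Hermitian $P$ indeed yields a real-valued polyanalytic polynomial:

```python
import numpy as np

def eval_poly(P, x, y):
    """Evaluate p(x, y) = [B_0(y) ... B_dy(y)] . P . [B_0(x) ... B_dx(x)]^T
    in the product monomial basis, where P[i, j] multiplies y^i x^j."""
    return (y ** np.arange(P.shape[0])) @ P @ (x ** np.arange(P.shape[1]))

# A Hermitian coefficient matrix yields a real-valued polyanalytic
# polynomial: here p(z, zbar) = 1 + |z|^2 + i(z - zbar) = 1 + |z|^2 - 2 Im z.
P = np.array([[1.0, 1.0j], [-1.0j, 1.0]])  # P is Hermitian
z = 0.3 + 0.7j
print(eval_poly(P, z, np.conj(z)))         # the imaginary part is zero
```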

2. Solving systems of bivariate and polyanalytic polynomials.

2.1. The polynomial eigenvalue problem. In robotics [59], computer modeling and graphics [19, 23], computational geometry [4] and numerical optimization [11, 47], a common task is to compute the isolated real roots of bivariate polynomial systems

$$p(x, y) = q(x, y) = 0 \tag{2.1}$$

or the isolated complex roots of polyanalytic polynomial systems

$$r(z, \overline{z}) = s(z, \overline{z}) = 0, \tag{2.2}$$

where the associated coefficient matrices $P$, $Q$, $R$ and $S$ may be real- or complex-valued. For example, such systems appear as line and plane search subproblems in tensor optimization [47], and more generally also in polynomial optimization. Unconstrained tensor optimization problems are of the form

$$\underset{\mathbf{z}\, \in\, \mathbb{C}^n}{\text{minimize}} \quad \frac{1}{2}\left\|\mathcal{M}(\mathbf{z}) - \mathcal{T}\right\|_F^2, \tag{2.3}$$

where $\mathcal{T} \in \mathbb{C}^{I_1 \times \cdots \times I_N}$ is a given tensor and $\mathcal{M}$ is usually a multilinear function of the variables $\mathbf{z}$. A function is said to be multilinear in its argument $\mathbf{z}$ if for all $i$ it is linear in $z_i$ when the remaining variables $z_j$ ($j \neq i$) are fixed.


Among other applications, such problems arise in the computation of the canonical polyadic decomposition [12, 21] and the low multilinear rank approximation [25, 52] of tensors. Given a current iterate $\mathbf{z}_k$ and two descent directions $\Delta\mathbf{z}_1$ and $\Delta\mathbf{z}_2$,

$$\underset{\alpha\, \in\, \mathbb{C}}{\text{minimize}} \quad \frac{1}{2}\left\|\mathcal{M}(\mathbf{z}_k + \alpha\Delta\mathbf{z}_1) - \mathcal{T}\right\|_F^2 \tag{2.4a}$$

$$\underset{\alpha, \beta\, \in\, \mathbb{R}}{\text{minimize}} \quad \frac{1}{2}\left\|\mathcal{M}(\mathbf{z}_k + \alpha\Delta\mathbf{z}_1 + \beta\Delta\mathbf{z}_2) - \mathcal{T}\right\|_F^2 \tag{2.4b}$$

are referred to as complex line search and real plane search subproblems, respectively. Let $f_{\mathrm{LS}}(\alpha, \overline{\alpha})$ and $f_{\mathrm{PS}}(\alpha, \beta)$ be the objective functions of (2.4a) and (2.4b), respectively; then the global line and plane search minimizers are solutions of the polyanalytic and bivariate polynomial systems

$$\frac{\partial f_{\mathrm{LS}}}{\partial \alpha} = \frac{\partial f_{\mathrm{LS}}}{\partial \overline{\alpha}} = 0 \quad\text{and}\quad \frac{\partial f_{\mathrm{PS}}}{\partial \alpha} = \frac{\partial f_{\mathrm{PS}}}{\partial \beta} = 0,$$

where $\frac{\partial}{\partial \alpha}$ ($\frac{\partial}{\partial \overline{\alpha}}$) is a Wirtinger derivative and acts as a partial derivative with respect to $\alpha$ ($\overline{\alpha}$), while treating $\overline{\alpha}$ ($\alpha$) as constant [48].

Two-dimensional subspace minimization [11] is similar to the plane search (2.4b), where $\mathcal{M}$ is a linear approximation of a nonlinear residual function $F$ and a norm constraint on the step size such as $\|\alpha\Delta\mathbf{z}_1 + \beta\Delta\mathbf{z}_2\| \leq \delta$ is added. The stationary points of the resulting optimization problem are the solutions of a bivariate polynomial system in which the polynomials have total degree 2. In [47], it is shown that two-dimensional subspace minimization can be generalized to higher-order Taylor series expansions of the residual function $F$, and that the resulting optimization problem is a bivariate polynomial system with polynomials of total degree 2 and $2d$, where $d$ is the order of the Taylor series approximation.

A widely used technique to solve bivariate polynomial systems (2.1) is to eliminate one of the two variables by reducing the problem to finding the roots of a so-called resultant polynomial. Let $V_{\mathbb{C}}$ denote the set $\{(x, y) \in \mathbb{C}^2 \mid p(x, y) = q(x, y) = 0\}$. If $p$ and $q$ have no common factor, then by Bézout's theorem $V_{\mathbb{C}}$ is zero-dimensional and contains at most $d_p d_q$ elements. The resultant of $p$ and $q$ is an algebraic description of the projections of the set $V_{\mathbb{C}}$ onto the complex plane associated with $x$ or $y$, depending on which variable is eliminated. In other words, the resultant is a polynomial whose roots include the set

$$V_{\mathbb{C}}^{(x)} := \{x \in \mathbb{C} \mid \exists\, y \in \mathbb{C} : p(x, y) = q(x, y) = 0\} \quad\text{or} \tag{2.5a}$$

$$V_{\mathbb{C}}^{(y)} := \{y \in \mathbb{C} \mid \exists\, x \in \mathbb{C} : p(x, y) = q(x, y) = 0\}. \tag{2.5b}$$

From here on, we assume the variable $y$ is eliminated so that the resultant is a polynomial in $x$. Of course, the roles of $x$ and $y$ may be reversed in the subsequent discussion. Resultants have many formulations, but are typically represented by (ratios of) determinants. The two most common resultants are those of Sylvester and Bézout, the former of which is a corollary of the following lemma.

Lemma 2.1. Let $p(x, y)$ and $q(x, y)$ be two nonconstant bivariate polynomials. If $p(x, y)$ and $q(x, y)$ have a common zero $(x^*, y^*) \in \mathbb{C}^2$, then there are nonzero polynomials $u(y)$ and $v(y)$ satisfying

$$p(x^*, y)u(y) + q(x^*, y)v(y) \equiv 0, \tag{2.6}$$

where $d_u < d_q^{(y)}$ and $d_v < d_p^{(y)}$.


Proof. Write $p(x, y)$ and $q(x, y)$ as

$$p(x, y) = \sum_{i=0}^{d_p^{(y)}} p_i(x)B_i(y) \quad\text{and}\quad q(x, y) = \sum_{i=0}^{d_q^{(y)}} q_i(x)B_i(y),$$

where $p_i(x)$ ($q_i(x)$) is the polynomial in $x$ associated with the coefficient vector $\mathbf{p}_{i+1,:}$ ($\mathbf{q}_{i+1,:}$). Since $p(x^*, y^*) = q(x^*, y^*) = 0$, $p(x^*, y)$ and $q(x^*, y)$ have a common factor $k(y) := (y - y^*)^n$ for some positive integer $n$. We have that

$$\frac{p(x^*, y)}{k(y)}\, q(x^*, y) = \frac{q(x^*, y)}{k(y)}\, p(x^*, y).$$

Set $u(y) := q(x^*, y)/k(y)$ and $v(y) := -\,p(x^*, y)/k(y)$.

Equation (2.6) expresses the fact that if $(x^*, y^*)$ satisfies $p(x^*, y^*) = q(x^*, y^*) = 0$, then there exist polynomials $u(y)$ and $v(y)$ such that all coefficients of the polynomial $p(x^*, y)u(y) + q(x^*, y)v(y)$ are zero. Writing $p$, $q$, $u$ and $v$ in the basis $\{B_i(y)\}_{i=0}^{n-1}$, where $n := d_p^{(y)} + d_q^{(y)}$, these conditions are equivalent to the system

$$S_{p,q}(x^*) \cdot \mathbf{r} = 0, \tag{2.7}$$

where $\mathbf{r} := (\mathbf{u}, \mathbf{v})$ and $S_{p,q}(x)$ is a polynomial matrix called the Sylvester matrix. In other words, Lemma 2.1 states that $S_{p,q}(x)$ must be singular when evaluated at the $x$-coordinate of a common root. The Sylvester resultant is now defined as $\mathrm{res}_{p,q}(x) := \det(S_{p,q}(x))$. Clearly, the resultant is a polynomial in $x$ and vanishes if and only if the Sylvester matrix is singular. The structure of the Sylvester matrix depends on the choice of basis corresponding to the variable $y$. Let $f(x, y)$ be a polynomial in the basis $\{B_i(y)\}$ with polynomial coefficients $f_i(x)$, and set $f_k := 0$ for $k < 0$ or $k > d_f^{(y)}$. Define the infinite Toeplitz, Hankel and shifted Toeplitz matrices

$$L_f(x) := \begin{bmatrix} f_0(x) & & \\ f_1(x) & f_0(x) & \\ \vdots & f_1(x) & \ddots \\ f_{d_f^{(y)}}(x) & \vdots & \ddots \\ & f_{d_f^{(y)}}(x) & \\ & & \ddots \end{bmatrix}, \quad M_f(x) := \begin{bmatrix} f_0(x) & f_1(x) & \cdots & f_{d_f^{(y)}}(x) & 0 & \cdots \\ f_1(x) & f_2(x) & \cdots & 0 & & \\ \vdots & \vdots & & & & \\ f_{d_f^{(y)}}(x) & 0 & & & & \\ 0 & & & & & \\ \vdots & & & & & \end{bmatrix} \quad\text{and}\quad N_f(x) := \begin{bmatrix} 0 & 0 & 0 & \cdots & \\ 0 & f_0(x) & f_1(x) & \cdots & \\ & 0 & f_0(x) & \ddots & \\ & & & \ddots & \ddots \end{bmatrix},$$

respectively, with entries $(L_f)_{ij} = f_{i-j}(x)$, $(M_f)_{ij} = f_{i+j}(x)$ and $(N_f)_{0j} = 0$, $(N_f)_{ij} = f_{j-i}(x)$ for $i \geq 1$, where indices start from zero. For the monomial basis we have $y^i y^j = y^{i+j}$, so that the Sylvester matrix is given by

$$S_{p,q}(x) = \begin{bmatrix} (L_p)_{1:n,\,1:d_q^{(y)}}(x) & (L_q)_{1:n,\,1:d_p^{(y)}}(x) \end{bmatrix}.$$


For the Chebyshev basis we have $2T_i(y)T_j(y) = T_{i+j}(y) + T_{|i-j|}(y)$, so that the Sylvester matrix is, up to a scale factor, equal to

$$S_{p,q}(x) = \begin{bmatrix} (L_p + M_p + N_p)_{1:n,\,1:d_q^{(y)}}(x) & (L_q + M_q + N_q)_{1:n,\,1:d_p^{(y)}}(x) \end{bmatrix}.$$

Importantly, the polynomial $\mathrm{res}_{p,q}(x)$ is in a sense a minimal characterization of the $x$-coordinates of the roots of the system $p(x, y) = q(x, y) = 0$. In fact, its roots are exactly the projected roots $V_{\mathbb{C}}^{(x)}$ together with roots at infinity of the form $(x^*, \infty)$.

Theorem 2.2 (see, e.g., [5]). If $p$ and $q$ are coprime, the distinct roots of $\mathrm{res}_{p,q}(x)$ are the roots of the bivariate system $p(x, y) = q(x, y) = 0$ projected onto the complex plane associated with $x$ and the roots of the leading coefficient system $p_{d_p^{(y)}}(x) = q_{d_q^{(y)}}(x) = 0$. More precisely,

$$\{x \in \mathbb{C} \mid \mathrm{res}_{p,q}(x) = 0\} = V_{\mathbb{C}}^{(x)} \cup \{x \in \mathbb{C} \mid p_{d_p^{(y)}}(x) = q_{d_q^{(y)}}(x) = 0\}.$$

The multiplicity of a root $x^*$ of $\mathrm{res}_{p,q}(x)$ is the sum of the multiplicities of all solutions of $p(x, y) = q(x, y) = 0$ with $x$-coordinate $x^*$.

To compute the roots of the resultant, one possible approach is to explicitly compute the resultant $\mathrm{res}_{p,q}(x)$ as the determinant of a polynomial matrix [43], after which the roots can be obtained as the eigenvalues of its companion matrix. However, it is well known that a polynomial's roots can be very sensitive to small changes in the polynomial's coefficients [57]. For this reason, the accuracy required of the resultant's coefficients makes this approach prohibitively expensive. Instead, the resultant's roots can be obtained directly without constructing $\mathrm{res}_{p,q}(x)$ by noticing that (2.7) is a polynomial eigenvalue problem (PEP) [30] with matrix pencil

$$S_{p,q}(x^*) = \sum_{i=0}^{d} S^{(i)} B_i(x^*) \tag{2.8}$$

in the basis $\{B_i(x^*)\}$, where $S^{(i)} \in \mathbb{C}^{n \times n}$ for $i = 0, \ldots, d$ and $d := \max(d_p^{(x)}, d_q^{(x)})$.

It is well known that $p(x, y)$ and $q(x, y)$ share a nontrivial common factor if and only if $\mathrm{res}_{p,q}(x) \equiv 0$. In this case, the pencil (2.8) is said to be singular. If $p(x, y)$ and $q(x, y)$ do not share a common factor, the pencil is called regular. Unfortunately, the eigenvalues of a singular (or close to singular) pencil can be changed arbitrarily by small perturbations of its matrices [58]. However, the situation is not always as bad as this. Wilkinson gives the example of a singular pencil $A - \lambda B$, where

$$A := \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \quad\text{and}\quad B := \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.$$

The perturbed pencil $\hat{A} - \lambda\hat{B}$ has characteristic equation $\det(\hat{A} - \lambda\hat{B}) = [(2 + \Delta a_{11}) - \lambda(1 + \Delta b_{11})](\Delta a_{22} - \lambda\Delta b_{22}) - (\Delta a_{12} - \lambda\Delta b_{12})(\Delta a_{21} - \lambda\Delta b_{21})$, where $\hat{A} := A + \Delta A$ and $\hat{B} := B + \Delta B$. For almost all perturbations $\Delta A$ and $\Delta B$, the characteristic equation has a root which is very close to 2 [58]. Hence, it may in some cases be possible to recover the eigenvalues of the regular part of a singular pencil. If this is not the case, the greatest common divisor of $p(x, y)$ and $q(x, y)$ or the singular part of the associated pencil should be extracted. Special care should be taken with both approaches, as they will inevitably be based on decisions concerning the ranks of the matrices involved.

A standard way of solving the PEP (2.7) is to linearize the polynomial pencil $S_{p,q}(x)$ into a linear pencil $A - xB$ and solve its associated generalized eigenvalue problem (GEP). The matrices $A, B \in \mathbb{C}^{dn \times dn}$ are chosen so that $A - xB$ has the same spectrum as $S_{p,q}(x^*)$. In practice, the monomial basis is used most often, together with a companion linearization. For the Chebyshev basis, a colleague linearization can be used. In fact, both of these are confederate linearizations [32, 33], which are defined by the polynomial basis' recurrence relations. To construct such a linearization, consider as an example the recurrence relations

$$x \begin{bmatrix} 1 & x & \cdots \end{bmatrix} = \begin{bmatrix} 1 & x & \cdots \end{bmatrix} \cdot \begin{bmatrix} 0 & & \\ 1 & 0 & \\ & 1 & \ddots \\ & & \ddots \end{bmatrix} \quad\text{and} \tag{2.9a}$$

$$x \begin{bmatrix} T_0(x) & T_1(x) & \cdots \end{bmatrix} = \begin{bmatrix} T_0(x) & T_1(x) & \cdots \end{bmatrix} \cdot \begin{bmatrix} 0 & \tfrac{1}{2} & & \\ 1 & 0 & \tfrac{1}{2} & \\ & \tfrac{1}{2} & 0 & \ddots \\ & & \tfrac{1}{2} & \ddots \end{bmatrix} \tag{2.9b}$$

for the monomial and Chebyshev basis, respectively. Let $\mathbf{v}^{(d)}(x)$ be the generalized Vandermonde vector $\begin{bmatrix} B_0(x) & \cdots & B_{d-1}(x) \end{bmatrix}^{\mathrm{T}}$ of length $d$ and let $\mathbf{l}$ be a left eigenvector of the PEP (2.7) so that

$$\mathbf{l}^{\mathrm{T}} \cdot S_{p,q}(x^*) = 0. \tag{2.10}$$

Now truncate equations (2.9) up to the $d$th term and convert to block form by treating scalars as identity matrices, then use (2.10) to eliminate the last block row. Defining $\mathbf{w} := \mathbf{v}^{(d)}(x^*) \otimes \mathbf{l}$, we obtain the companion and colleague linearizations

$$x^* \mathbf{w}^{\mathrm{T}} \cdot \begin{bmatrix} I & & & \\ & \ddots & & \\ & & I & \\ & & & S^{(d)} \end{bmatrix} = \mathbf{w}^{\mathrm{T}} \cdot \begin{bmatrix} 0 & & & -S^{(0)} \\ I & & & -S^{(1)} \\ & \ddots & & \vdots \\ & & I & -S^{(d-1)} \end{bmatrix} \quad\text{and} \tag{2.11a}$$

$$2x^* \mathbf{w}^{\mathrm{T}} \cdot \begin{bmatrix} I & & & \\ & \ddots & & \\ & & I & \\ & & & S^{(d)} \end{bmatrix} = \mathbf{w}^{\mathrm{T}} \cdot \begin{bmatrix} 0 & I & & & -S^{(0)} \\ 2I & 0 & \ddots & & \vdots \\ & I & \ddots & I & -S^{(d-3)} \\ & & \ddots & 0 & S^{(d)} - S^{(d-2)} \\ & & & I & -S^{(d-1)} \end{bmatrix} \tag{2.11b}$$

for the monomial and Chebyshev basis, respectively. Here, $\mathbf{w}$ is a left eigenvector of the GEP (2.11). If $(x^*, y^*) \in \mathbb{C}^2$ is a root of $p(x, y) = q(x, y) = 0$, then $\mathbf{v}^{(n)}(y^*)$ is a left eigenvector $\mathbf{l}$ of (2.10) corresponding to the eigenvalue $x^*$, since

$$\mathbf{v}^{(n)}(y^*)^{\mathrm{T}} \cdot S_{p,q}(x^*) = \begin{bmatrix} p(x^*, y^*)\,\mathbf{v}^{(d_q^{(y)})}(y^*)^{\mathrm{T}} & q(x^*, y^*)\,\mathbf{v}^{(d_p^{(y)})}(y^*)^{\mathrm{T}} \end{bmatrix} = 0^{\mathrm{T}}.$$
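To see the companion form (2.11a) at work, consider the scalar case $n = 1$, where the blocks $S^{(i)}$ reduce to the coefficients of a univariate polynomial and the generalized eigenvalues are its roots. A minimal sketch of ours:

```python
import numpy as np
from scipy.linalg import eigvals

def companion_pencil(c):
    """Scalar instance of the companion linearization (2.11a): for
    c[0] + c[1]*x + ... + c[d]*x^d (so n = 1 and S^(i) = c[i]), build
    the pencil (A, B) whose generalized eigenvalues are the roots."""
    d = len(c) - 1
    B = np.eye(d)
    B[-1, -1] = c[d]                 # last diagonal block is S^(d)
    A = np.zeros((d, d))
    A[1:, :-1] = np.eye(d - 1)       # subdiagonal identity blocks
    A[:, -1] = -np.asarray(c[:-1])   # last column: -S^(0), ..., -S^(d-1)
    return A, B

A, B = companion_pencil([-6.0, 11.0, -6.0, 1.0])  # (x-1)(x-2)(x-3)
print(np.sort(eigvals(A, B).real))                # [1. 2. 3.]
```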


Consequently, if $x^*$ is a simple eigenvalue of the GEP (2.11), we can recover $y^*$ from its corresponding left eigenvector $\mathbf{w}$, which will be a multiple of $\mathbf{v}^{(d)}(x^*) \otimes \mathbf{v}^{(n)}(y^*)$. However, the situation becomes more difficult when $(x^*, y^*)$ is a multiple root, or when there is a $\hat{y}^* \neq y^*$ for which $(x^*, \hat{y}^*)$ is also a root. Both of these cases increase the dimension of the kernel of $S_{p,q}(x^*)$ so that it is larger than one. Several solutions for recovering all values $y^*$ under these circumstances have been proposed.

Some algorithms compute $y^*$ by solving a smaller eigenproblem for each distinct $x^*$ [10, 13, 31]. However, this approach requires estimating the multiplicities of the eigenvalues $x^*$, along with the dimensions of their corresponding kernels. The former task is complicated by the fact that a multiple real eigenvalue $x^*$ often splits into several nearby complex eigenvalues due to round-off errors, or because the problem cannot be represented exactly in finite precision. Then, for each distinct $x^*$ an eigenproblem of size equal to the dimension of its kernel should be solved. Another approach is to eliminate both $x$ and $y$ separately and then check for each pair of projected coordinates whether it constitutes a solution of the given system or not [8, 14, 44]. For many applications, a nontrivial final task is to determine which of the solutions are real.

To solve these issues, we propose to first project the solutions onto the real plane associated with $x$ and $y$, which will allow us to compute both the $x$- and $y$-coordinates of the roots simultaneously and ensure they are real. Moreover, this approach does not require computing any eigenvectors or solving any additional eigenproblems, which saves both computational effort and memory. Consider the change of variables

$$(x, y) = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -i & i \end{bmatrix} \cdot (z, \omega), \tag{2.12}$$

which transforms the polynomials $p(x, y)$ and $q(x, y)$ into $\hat{p}(z, \omega)$ and $\hat{q}(z, \omega)$, respectively. This operation can destroy the coordinate degree structure of the polynomials, possibly changing the coefficient matrices from dense rectangular matrices to square upper anti-triangular matrices. However, the total degrees of the polynomials remain unchanged. We proceed as before, this time eliminating $\omega$ from the modified system $\hat{p}(z, \omega) = \hat{q}(z, \omega) = 0$. The roots of the resultant $\mathrm{res}_{\hat{p},\hat{q}}(z)$ now comprise the set

$$V_{\mathbb{C}}^{(z)} := \{z \in \mathbb{C} \mid \exists\, \omega \in \mathbb{C} : \hat{p}(z, \omega) = \hat{q}(z, \omega) = 0\}.$$

In fact, $\{z \in \mathbb{C} \mid \mathrm{res}_{\hat{p},\hat{q}}(z) = 0\}$ is often exactly equal to $V_{\mathbb{C}}^{(z)}$, since the transformation (2.12) generically gives rise to a constant leading coefficient system $\hat{p}_{d_p}(z) = \hat{q}_{d_q}(z) = 0$ with no solutions. Since the transformation (2.12) is invertible, $V_{\mathbb{C}}^{(z)}$ can be characterized in terms of the roots of the original system $p(x, y) = q(x, y) = 0$ as

$$V_{\mathbb{C}}^{(z)} = \{z \in \mathbb{C} \mid \exists\, (x, y) \in V_{\mathbb{C}} : z = x + iy\}.$$
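Numerically, the substitution (2.12) with $\omega = \overline{z}$ is just $x = \mathrm{Re}\, z$, $y = \mathrm{Im}\, z$; the sketch below (ours, for illustration) confirms that the transformed polyanalytic polynomial agrees with the original bivariate one, so that real roots $(x^*, y^*)$ indeed appear as $z^* = x^* + iy^*$:

```python
import numpy as np
rng = np.random.default_rng(0)

# A random real bivariate polynomial p(x, y); P[i, j] multiplies y^i x^j.
P = rng.standard_normal((4, 3))

def p(x, y):
    return (y ** np.arange(4)) @ P @ (x ** np.arange(3))

# Under (2.12) with omega = conj(z): x = Re z and y = Im z, so the
# transformed polyanalytic polynomial phat(z, zbar) equals p(x, y).
z = 0.8 - 0.4j
phat = p((z + np.conj(z)) / 2, (np.conj(z) - z) * 1j / 2)
print(np.isclose(phat, p(z.real, z.imag)))  # True
```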

Clearly, the set of projected real roots

$$V_{\mathbb{R}^2}^{(z)} := \{z \in \mathbb{C} \mid \exists\, (x, y) \in V_{\mathbb{C}} \cap \mathbb{R}^2 : z = x + iy\}$$

is a subset of $V_{\mathbb{C}}^{(z)}$. As we are only interested in real roots $(x^*, y^*) \in \mathbb{R}^2$, we may also interpret the transformed system as a polyanalytic polynomial system $\hat{p}(z, \overline{z}) = \hat{q}(z, \overline{z}) = 0$. Hence, the real roots $(x^*, y^*) \in \mathbb{R}^2$ of a bivariate polynomial system may be obtained as the real and imaginary parts of (a subset of) the eigenvalues of the GEP (2.11) associated with the elimination of $\overline{z}$, after converting the system into a polyanalytic system with (2.12). Moreover, the method allows multiple roots to be recovered without additional effort. Indeed, each eigenvalue's algebraic multiplicity is equal to the multiplicity of the corresponding root $(x^*, y^*)$, so that multiple roots need not be treated separately from simple roots. In contrast, methods based on recovering $y^*$ from the eigenspace belonging to $x^*$ need to estimate the algebraic multiplicity of each eigenvalue and apply relatively complicated techniques to extract $y^*$.

In much the same way as in the bivariate case, the complex roots $z^* \in \mathbb{C}$ of a polyanalytic polynomial system can be obtained by eliminating $\overline{z}$. For both types of systems, it remains to identify which of the eigenvalues do not correspond to roots of the original system. In the final subsection, we present a robust method to remove these spurious solutions based on a Newton–Raphson refinement on the original polynomial system. The following subsection proposes several balancing steps that aim to maximize the accuracy of the computed roots.

2.2. Balancing the system and its associated pencil.

2.2.1. Balancing the bivariate system. From here on, we restrict the discussion to polynomial systems given in the product monomial basis for simplicity. For both bivariate and polyanalytic polynomial systems, we center the exponents of the elements of the associated coefficient matrices $P$ and $Q$ around zero with the scaling

$$P \leftarrow 2^{-\mathrm{round}(\mathrm{median}(\log_2(|P|)))}\, P \tag{2.13a}$$

$$Q \leftarrow 2^{-\mathrm{round}(\mathrm{median}(\log_2(|Q|)))}\, Q. \tag{2.13b}$$

Here, $|\cdot|$ is the element-wise absolute value, $\log_2(\cdot)$ is the element-wise logarithm, $\mathrm{median}(\cdot)$ is the median over all finite elements of its argument and $\mathrm{round}(\cdot)$ rounds to the nearest integer to avoid introducing round-off error. The scaling (2.13) not only normalizes the coefficients of the two polynomials relative to each other, but also maximizes the available exponent range so that overflow is less likely to occur.

Bivariate systems $p(x, y) = q(x, y) = 0$ are then converted into polyanalytic polynomial systems $\hat{p}(z, \overline{z}) = \hat{q}(z, \overline{z}) = 0$ using (2.12). The entries of the coefficient matrices $\hat{P}$ and $\hat{Q}$ are linear combinations of the entries of the coefficient matrices $P$ and $Q$. In order to minimize loss of precision due to rounding errors, we propose to first minimize the variance of the exponents of their elements by scaling the variables as $x \leftarrow 2^{r_x}x$ and $y \leftarrow 2^{r_y}y$ with

$$\hat{P} \leftarrow \mathrm{diag}(\mathbf{v}^{(d_p^{(y)}+1)}(2^{r_y})) \cdot P \cdot \mathrm{diag}(\mathbf{v}^{(d_p^{(x)}+1)}(2^{r_x})) \tag{2.14a}$$

$$\hat{Q} \leftarrow \mathrm{diag}(\mathbf{v}^{(d_q^{(y)}+1)}(2^{r_y})) \cdot Q \cdot \mathrm{diag}(\mathbf{v}^{(d_q^{(x)}+1)}(2^{r_x})), \tag{2.14b}$$

where $r_x$ and $r_y$ are real scalars that solve the least squares problem

$$\underset{r_x,\, r_y\, \in\, \mathbb{R}}{\text{minimize}} \quad \mathrm{var}(\log(|\hat{P}|)) + \mathrm{var}(\log(|\hat{Q}|)), \tag{2.15}$$

in which $\mathrm{var}(\cdot)$ is the sample variance over all finite elements of its argument. Again, we round $r_x$ and $r_y$ to their nearest integer values to avoid introducing round-off error. After applying (2.14), we transform $\hat{P}$ and $\hat{Q}$ in place with (2.12).
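Because scaling $x \leftarrow 2^{r_x}x$ and $y \leftarrow 2^{r_y}y$ adds $r_x j + r_y i$ to the exponent of the $(i, j)$ coefficient, one way to set up (2.15) is as a small linear least squares problem in $(r_x, r_y)$. A sketch of ours, under that observation:

```python
import numpy as np

def balance_variables(P, Q):
    """Sketch of (2.14)-(2.15): choose integer exponents r_x, r_y that
    minimize the variance of the scaled coefficient exponents. The entry
    exponents become e_ij + r_y*i + r_x*j, so minimizing their per-matrix
    variance is a linear least squares problem."""
    rows, rhs = [], []
    for M in (P, Q):
        I, J = np.indices(M.shape)
        e = np.log2(np.abs(M))
        keep = np.isfinite(e)                 # ignore zero coefficients
        i, j, e = I[keep], J[keep], e[keep]
        # variance is shift-invariant, so center each matrix's equations
        rows.append(np.column_stack([j - j.mean(), i - i.mean()]))
        rhs.append(-(e - e.mean()))
    A, b = np.vstack(rows), np.concatenate(rhs)
    rx, ry = np.linalg.lstsq(A, b, rcond=None)[0]
    return round(rx), round(ry)    # rounded so the scaling is exact

# Applying (2.14) then multiplies entry (i, j) of P by 2 ** (ry*i + rx*j).
```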

2.2.2. Balancing the polyanalytic system. Assuming we now have a scaled polyanalytic system $\hat{p}(z, \overline{z}) = \hat{q}(z, \overline{z}) = 0$, our goal is to balance the problem such that its associated pencil $A - z^*B$ is closer to a well-conditioned normal pencil [28]. Ward suggests scaling all elements of $A$ and $B$ to the same order of magnitude [56]. To this end, a simple heuristic is to minimize the variance of the exponents of their elements by combining two balancing strategies. The first is to balance the coefficient matrices by scaling the variables $z$ and $\overline{z}$ with a real scalar $s_z$ as $z \leftarrow 2^{s_z}z$ and $\overline{z} \leftarrow 2^{s_z}\overline{z}$. The second is to balance the Sylvester blocks $S^{(i)}$ by scaling the variable $z^*$ in the PEP (2.7) with a real scalar $s_{z^*}$ as $z^* \leftarrow 2^{s_{z^*}}z^*$. We set

$$\hat{P} \leftarrow \mathrm{diag}(\mathbf{v}^{(d_{\hat{p}}^{(\overline{z})}+1)}(2^{s_z})) \cdot \hat{P} \cdot \mathrm{diag}(\mathbf{v}^{(d_{\hat{p}}^{(z)}+1)}(2^{s_z+s_{z^*}})) \tag{2.16a}$$

$$\hat{Q} \leftarrow \mathrm{diag}(\mathbf{v}^{(d_{\hat{q}}^{(\overline{z})}+1)}(2^{s_z})) \cdot \hat{Q} \cdot \mathrm{diag}(\mathbf{v}^{(d_{\hat{q}}^{(z)}+1)}(2^{s_z+s_{z^*}})). \tag{2.16b}$$

Analogously to (2.15), $s_z$ and $s_{z^*}$ are the solution of the least squares problem

$$\underset{s_z,\, s_{z^*}\, \in\, \mathbb{R}}{\text{minimize}} \quad \mathrm{var}(\log(|\hat{P}|)) + \mathrm{var}(\log(|\hat{Q}|)). \tag{2.17}$$

Again, we round the solution of (2.17) to the nearest integers before applying the transformations (2.16) to avoid introducing round-off error. Finally, we center the exponents of the coefficient matrices by applying (2.13) to $\hat{P}$ and $\hat{Q}$.

2.2.3. Balancing the pencil. Solving the GEP (2.11) consists of three steps. Let A and B be the matrices in the right and left hand side of (2.11), respectively. The first step is to reduce B to an upper triangular matrix with a QR-factorization. Then, the pair (A, B) is reduced to generalized upper Hessenberg form using unitary left and right transformations so that A becomes an upper Hessenberg matrix and B remains upper triangular. Finally, the QZ algorithm is applied on the resulting pencil. Equation (2.11) is not the only way to linearize the PEP (2.7). However, compared to other linearizations it has the advantage that the pencil is already in a block generalized Hessenberg form. Moreover, many of the first Givens rotations reducing the pencil to a generalized Hessenberg form will be permutations. An alternative linearization is the second companion form [18], which is perhaps less efficient but in our experience has a slight edge in accuracy over (2.11).

To improve the conditioning of the eigenvalues of the pencil $A - z^*B$, most balancing strategies are based on a two-sided diagonal transformation. The LAPACK [1] routine ZGGBAL implements Ward's algorithm [56], which aims to scale the elements of $A$ and $B$ such that their magnitudes are as close to unity as possible. Unfortunately, LAPACK's driver routine ZGGEV for computing the generalized eigenvalues of a pair of complex nonsymmetric matrices (and, consequently, MATLAB) disables Ward's algorithm by default. More recently, Lemonnier and Van Dooren [28] suggested to (approximately) solve the convex optimization problem

$$\underset{\det(D_l \cdot D_r) = 1}{\text{minimize}} \quad \|D_l \cdot A \cdot D_r\|_F^2 + \|D_l \cdot B \cdot D_r\|_F^2 \tag{2.18}$$

for the real positive diagonal matrices $D_l$ and $D_r$. In fact, this objective corresponds to searching for $D_l$ and $D_r$ so that $D_l^2 \cdot (|A|^2 + |B|^2) \cdot D_r^2$ is a doubly stochastic matrix, i.e., a matrix having row and column sums equal to one. The Sinkhorn–Knopp (SK) algorithm is perhaps the simplest method for finding a doubly stochastic scaling of a nonnegative matrix. Its simplicity has led to the repeated discovery of the method [24], of which the algorithm proposed in [28] is another variation. Given a nonnegative matrix $M := |A|^2 + |B|^2$, SK alternately solves for $D_l$ and $D_r$ by fixing $D_r$ ($D_l$) and then setting the row (column) sums equal to one with the iterates

$$D_l^2 \leftarrow \mathrm{diag}(M \cdot D_r^2 \cdot 1_{dn \times 1})^{-1} \quad\text{and} \tag{2.19a}$$

$$D_r^2 \leftarrow \mathrm{diag}(1_{1 \times dn} \cdot D_l^2 \cdot M)^{-1}, \tag{2.19b}$$


respectively. The main cost of the SK iterates (2.19) are the left and right matrix-vector products with $M$, which require $O(d^2n^2)$ floating point operations (flop) each. Depending on the number of iterations, SK can be close to as expensive as solving the GEP (2.11) itself, which costs $O(d^3n^3)$ flop. To reduce the impact of SK on the total computational time, the block structure in $M$ can be exploited in the matrix-vector products. Taking only the block structure and identity matrices into account reduces the computational complexity to $O(dn^2)$ flop per matrix-vector product, which is small enough in practice. Before applying $D_l$ and $D_r$, their entries are rounded to the nearest power of two so that the transformation does not introduce round-off error.
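A dense sketch of the SK iteration (2.19) with the power-of-two rounding (ours; the block-exploiting $O(dn^2)$ variant described above is omitted):

```python
import numpy as np
from scipy.linalg import eigvals

def sinkhorn_balance(A, B, iters=20):
    """Sketch of (2.19): find positive diagonal scalings such that
    Dl^2 (|A|^2 + |B|^2) Dr^2 is (close to) doubly stochastic, then snap
    the scalings to powers of two so that applying them is exact."""
    M = np.abs(A) ** 2 + np.abs(B) ** 2
    m = M.shape[0]
    dl2, dr2 = np.ones(m), np.ones(m)
    for _ in range(iters):
        dl2 = 1.0 / (M @ dr2)        # (2.19a): set row sums to one
        dr2 = 1.0 / (dl2 @ M)        # (2.19b): set column sums to one
    Dl = np.diag(2.0 ** np.round(0.5 * np.log2(dl2)))
    Dr = np.diag(2.0 ** np.round(0.5 * np.log2(dr2)))
    return Dl, Dr

# The diagonal scaling leaves the generalized eigenvalues unchanged:
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5)) * 10.0 ** rng.integers(-8, 8, (5, 5))
B = rng.standard_normal((5, 5))
Dl, Dr = sinkhorn_balance(A, B)
print(np.sort(eigvals(Dl @ A @ Dr, Dl @ B @ Dr)))  # same spectrum as (A, B)
```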

2.3. Filtering spurious solutions with Newton–Raphson. Removing spurious solutions introduced by the resultant or by round-off error is not as easy as testing whether the polynomials evaluated at the candidate roots are (close to) zero, even after normalization. This is because evaluating a polynomial may be, and often is, ill-conditioned, especially as the modulus of the candidate root increases. Even taking the condition number into account does not improve such a filter's robustness much. We propose to refine the roots with a (complex) Newton–Raphson algorithm [48], and to simultaneously remove spurious solutions based on the step length. For candidate roots close to a root of the polynomial system, the step length will be small relative to the modulus of the root, while the step length for spurious roots tends to be large relative to its modulus. We underline that this approach is only a heuristic and hence not guaranteed to remove all spurious solutions. Moreover, Newton–Raphson is known to fail for certain bivariate polynomial systems such as the Griewank–Osborne example [7, 20], where it iterates away from the isolated root at the origin for any starting point in a punctured neighbourhood of that root. Even though such systems are not common in practice, we limit the number of Newton–Raphson iterations and only invert the (numerically) nonsingular components of the Jacobian to prevent good candidate solutions from straying too far from the system's roots.

Writing the bivariate polynomial system $p(x, y) = q(x, y) = 0$ as a vector-valued residual function $F_{p,q}(x, y) := (p(x, y), q(x, y))$, the Newton–Raphson step $(\Delta x, \Delta y)$ at a candidate root $(x^*, y^*)$ is computed as

$$\begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = -\left(J_{p,q}(x^*, y^*)\right)^{\dagger} \cdot F_{p,q}(x^*, y^*), \tag{2.20a}$$

where $J_{p,q}$ is the Jacobian $\partial F_{p,q}/\partial(x, y)^{\mathrm{T}}$. The next iterate is then $(x^*, y^*) \leftarrow (x^*, y^*) + (\Delta x, \Delta y)$.

In the polyanalytic case $p(z, \overline{z}) = q(z, \overline{z}) = 0$, the standard way of computing the Newton–Raphson step at a candidate root $z^*$ is to write $p(z, \overline{z}) = u_p(z, \overline{z}) + iv_p(z, \overline{z})$ and $q(z, \overline{z}) = u_q(z, \overline{z}) + iv_q(z, \overline{z})$, wherein $u_p$, $v_p$, $u_q$ and $v_q$ are real-valued polyanalytic polynomials. Define the vector-valued residual function $F_{\mathbb{R}}^{u_p,u_q,v_p,v_q}(z, \overline{z}) := (u_p(z, \overline{z}), u_q(z, \overline{z}), v_p(z, \overline{z}), v_q(z, \overline{z}))$ and let $x$ and $y$ be the real and imaginary part of $z$, respectively. The Newton–Raphson step $(\Delta x, \Delta y)$ at a point $z^*$ can then be computed as

$$\begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = -\left(J_{\mathbb{R}}^{u_p,u_q,v_p,v_q}(z^*, \overline{z}^*)\right)^{\dagger} \cdot F_{\mathbb{R}}^{u_p,u_q,v_p,v_q}(z^*, \overline{z}^*), \tag{2.20b}$$

where $J_{\mathbb{R}}^{u_p,u_q,v_p,v_q}$ is the real Jacobian $\partial F_{\mathbb{R}}^{u_p,u_q,v_p,v_q}/\partial(x, y)^{\mathrm{T}}$. With a step of this form, the next iterate is $z^* \leftarrow z^* + (\Delta x + i\Delta y)$. However, constructing $u_p$, $v_p$, $u_q$ and $v_q$ explicitly is not necessary: the real Jacobian may be obtained much more elegantly from a complex Taylor series of the complex vector-valued residual function $F_{p,q}(z, \overline{z}) := (p(z, \overline{z}), q(z, \overline{z}))$. First, we compute its complex Jacobian

$$J_{\mathbb{C}}^{p,q} := \begin{bmatrix} \dfrac{\partial p}{\partial z} & \dfrac{\partial p}{\partial \overline{z}} \\[1ex] \dfrac{\partial q}{\partial z} & \dfrac{\partial q}{\partial \overline{z}} \end{bmatrix},$$

where the partial derivatives are so-called Wirtinger derivatives, taken with respect to $z$ ($\overline{z}$) while treating $\overline{z}$ ($z$) as constant. By defining

$$J_{\mathbb{R}}^{p,q} := J_{\mathbb{C}}^{p,q} \cdot \begin{bmatrix} 1 & i \\ 1 & -i \end{bmatrix},$$

the real Jacobian can then be recovered from the identity [48]

$$J_{\mathbb{R}}^{u_p,u_q,v_p,v_q} \equiv \begin{bmatrix} \mathrm{Re}\{J_{\mathbb{R}}^{p,q}\} \\[0.5ex] \mathrm{Im}\{J_{\mathbb{R}}^{p,q}\} \end{bmatrix}. \tag{2.21}$$

The advantage of (2.21) is that evaluating $J_{\mathbb{C}}^{p,q}$ requires only four complex function evaluations, compared to eight real function evaluations for $J_{\mathbb{R}}^{u_p,u_q,v_p,v_q}$ together with the cost of constructing the four polynomials $u_p$, $v_p$, $u_q$ and $v_q$. It is interesting to note that $-(J_{\mathbb{R}}^{p,q}(z^*, \overline{z}^*))^{\dagger} \cdot F_{p,q}(z^*, \overline{z}^*)$ computes a complex Newton–Raphson step, while the form (2.20b) ensures that the step will be real.
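The following sketch (ours) assembles the real Newton–Raphson step (2.20b) from the complex Wirtinger Jacobian via (2.21), for a toy polyanalytic system:

```python
import numpy as np

def polyanalytic_newton_step(z, F, Jc):
    """Sketch of (2.20b) via (2.21): a real Newton-Raphson step computed
    from the residual F = (p, q) and the 2x2 complex Wirtinger Jacobian
    Jc = [[dp/dz, dp/dzbar], [dq/dz, dq/dzbar]] at (z, zbar)."""
    Jr = Jc @ np.array([[1.0, 1.0j], [1.0, -1.0j]])  # columns: d/dx, d/dy
    J4 = np.vstack([Jr.real, Jr.imag])               # real 4x2 Jacobian (2.21)
    F4 = np.concatenate([F.real, F.imag])            # (u_p, u_q, v_p, v_q)
    dx, dy = -np.linalg.pinv(J4) @ F4                # least squares step
    return z + (dx + 1j * dy)

# Example: p = z zbar - 1, q = z - zbar has the real roots z = +/- 1.
z = 1.2 + 0.1j
for _ in range(6):
    F = np.array([z * np.conj(z) - 1.0, z - np.conj(z)])
    Jc = np.array([[np.conj(z), z], [1.0, -1.0]])
    z = polyanalytic_newton_step(z, F, Jc)
print(z)   # converges to 1, i.e., (x*, y*) = (1, 0)
```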

It is recommended to solve the least squares systems (2.20) either by regularizing the system or by approximating the Moore–Penrose pseudoinverse using the singular value decomposition (SVD). This is especially important near multiple roots, where the Jacobian tends to a singular matrix. Since the Jacobian is a small matrix of size $2 \times 2$ or $4 \times 2$, we choose the SVD and invert the first component if $\sigma_1 > \tau_\sigma$, and additionally invert the second component if $\sigma_2 > \sigma_1\tau_\sigma$, where $\sigma_1$ and $\sigma_2$ are the first two singular values of the Jacobian and $\tau_\sigma$ is a user-defined tolerance.
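A sketch of this truncated pseudoinverse (ours), following the $\sigma_1/\sigma_2$ rule above:

```python
import numpy as np

def truncated_pinv(J, tau_sigma=1e-6):
    """Truncated SVD pseudoinverse for the small Jacobians in (2.20):
    invert the first singular component only if sigma_1 > tau_sigma, and
    the second only if sigma_2 > sigma_1 * tau_sigma, so that numerically
    singular directions are not inverted near multiple roots."""
    U, s, Vh = np.linalg.svd(J, full_matrices=False)
    sinv = np.zeros_like(s)
    if s[0] > tau_sigma:
        sinv[0] = 1.0 / s[0]
        if len(s) > 1 and s[1] > s[0] * tau_sigma:
            sinv[1] = 1.0 / s[1]
    return (Vh.conj().T * sinv) @ U.conj().T
```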

Next, we describe a method to remove spurious roots based on the Newton–Raphson step length which we have found to work well in practice. Let $(\Delta x, \Delta y)$ be a solution of (2.20a) or (2.20b) at a candidate root $(x^*, y^*)$ or $z^*$, respectively. If the candidate root is close to a root of the polynomial system, it may be expected that its Newton–Raphson step will be small in comparison to a measure of the size of the root. Based on this observation, we identify a candidate root of a bivariate system as spurious if

$$\|(\Delta x, \Delta y)\| \geq \tau_{\mathrm{NR}} \cdot \max(\|(x^*, y^*)\|, 1) \tag{2.22a}$$

and of a polyanalytic system if

$$\|(\Delta x, \Delta y)\| \geq \tau_{\mathrm{NR}} \cdot \max(|z^*|, 1) \tag{2.22b}$$

holds for some tolerance $\tau_{\mathrm{NR}}$. This condition corresponds to a relative tolerance on the step length outside the unit circle, and an absolute tolerance inside the unit circle. The smaller $\tau_{\mathrm{NR}}$, the less likely spurious roots are returned as solutions, but also the more likely it becomes that genuine roots are discarded.


In summary, the complete process for solving bivariate and polyanalytic univariate polynomial systems is described in Algorithm 2.1. In the following section, we investigate the performance of this and other algorithms on a set of bivariate systems, a subset of which is considered difficult due to the wide variation in the exponents of the polynomials' coefficients. We will see that Algorithm 2.1 is not only able to recover the largest fraction of roots, but also does so accurately and efficiently.

Input: Two bivariate or polyanalytic univariate polynomials p and q
Output: Isolated real roots (x∗, y∗) or complex roots z∗ of the system p = q = 0

{P, Q} ← Apply (2.13) to center the exponents of the coefficients
if p and q are bivariate then
    {rx, ry} ← Solve (2.15)
    {rx, ry} ← {round(rx), round(ry)}
    {P̂, Q̂} ← Apply the transformation (2.14) to balance the system
    {P̂, Q̂} ← Apply the transformation (2.12) to obtain a polyanalytic system
end
{sz, sz∗} ← Solve (2.17)
{sz, sz∗} ← {round(sz), round(sz∗)}
{P̂, Q̂} ← Apply the transformation (2.16) to balance the polyanalytic system
{P̂, Q̂} ← Apply (2.13) to {P̂, Q̂} to center the exponents of the coefficients
A − z∗B ← Build (2.11) associated with eliminating z̄ from p̂ = q̂ = 0
{Dl, Dr} ← Approximately solve (2.18) with SK
{Dl, Dr} ← {diag(2^round(log2(diag(Dl)))), diag(2^round(log2(diag(Dr))))}
z∗ ← eig(Dl · A · Dr, Dl · B · Dr)
foreach eigenvalue z∗ in z∗ do
    if p and q are bivariate then (x∗, y∗) ← Decode z∗ = x∗ + iy∗ end
    for i = 1 to maxiter (e.g., maxiter = 4) do
        (∆x, ∆y) ← Solve (2.20) at (x∗, y∗) or z∗ (for, e.g., τσ = 10^−6)
        if (2.22) holds (for, e.g., τNR = 10^−2) then
            Discard (x∗, y∗) or z∗ and break
        end
        if p and q are bivariate then (x∗, y∗) ← (x∗, y∗) + (∆x, ∆y)
        else z∗ ← z∗ + (∆x + i∆y) end
    end
end

Algorithm 2.1: Compute the isolated roots of a bivariate or polyanalytic univariate polynomial system.

3. Numerical experiments. We compare the relative performance of Algorithm 2.1, Bertini 1.4 [7], Chebfun 4.3.2987 [51], PHCpack 2.3.84 [53], the RealSolving C library [41, 42], the RegularChains package [2] and Maple's BivariatePolynomial


method on a wide range of bivariate polynomial systems using the performance profiles proposed by Dolan and Moré [15]. Algorithm 2.1 was implemented as part of the MATLAB toolbox Tensorlab [49], which is free for academic and non-commercial use. Bertini has a variable precision (VP) and a double precision (DP) mode, both of which are evaluated in the numerical experiments. Chebfun currently has two algorithms for computing the common roots of two bivariate functions. In our experiments, we use the recently added method based on the Bézout resultant [37]. Chebfun is the only software package in this comparison which chooses to represent the functions, in this case polynomials, in the product Chebyshev basis. Bertini and PHCpack are well-known homotopy continuation solvers. The RealSolving library is based on reducing the polynomial system to a rational univariate representation, from which the solutions can easily be recovered. The RegularChains package computes a triangular decomposition of the system, i.e., a set of simpler systems called regular chains which represent the solutions of the original system. Similar to but less advanced than Chebfun, Maple's BivariatePolynomial algorithm applies a random orthogonal change of variables, computes the y-coordinates of the isolated roots as the eigenvalues of the polynomial matrix associated with the Bézout resultant and recovers the x-coordinates from the corresponding eigenvectors. The first two algorithms were applied in MATLAB 8.1 (R2013a), Bertini and PHCpack as standalone executables, and the remaining three methods using Maple 17. In each experiment, all algorithms were applied without supplying any parameters so that each method uses its default parameter values. In the case of Bertini, we supplied the parameter MPTYPE: 0 to select its DP mode, and omitted this parameter to apply the default VP mode. All experiments were performed on a server with two hexacore Intel Xeon E5645 CPUs and 48 GB RAM.

Each algorithm's performance is evaluated on two problem sets. The first set is generated using the low-degree bivariate polynomial systems listed in Table 3.1. The second set is more difficult and is generated using the bivariate polynomial systems listed in Table 3.2. The zero-level curves and their intersections for two example systems of each type are displayed in Figures 3.1 and 3.2. Both tables show the total degrees of the systems and the exponent ranges of the coefficients. For a polynomial system $p(x, y) = q(x, y) = 0$ the total degree is displayed as $(d_p, d_q)$, while the exponent range is defined as $(\mathrm{range}(p(x, y)), \mathrm{range}(q(x, y)))$, where

$$\mathrm{range}(p(x, y)) := \max_{i,j}\, \log_{10}(|p_{ij}|) - \min_{\substack{i,j \\ p_{ij} \neq 0}}\, \log_{10}(|p_{ij}|).$$

The total degrees define the size of the eigenproblem, which is generically equal to $\max(d_p, d_q)(d_p + d_q)$. Large exponent ranges are usually associated with difficult problems, since round-off errors tend to increase in magnitude as the exponent range increases. Table 3.1 also includes the fraction of systems with at least one solution where the Jacobian is singular. Furthermore, Table 3.2 shows the exponent range of the (finite) condition numbers of the Jacobian evaluated at all solutions of the corresponding system. If a system has at least one solution where the Jacobian is singular, we indicate this by taking the union with infinity.

All systems are given in the product monomial basis, often with integer coefficients. In some cases these coefficients contain more digits than can be represented by double precision floating point numbers.


Table 3.1
Properties of the initial low-degree problem set, containing a total of 50 systems.

Problem                            Total degree         Coefficient range               Singular
ex001–ex025 [10]                   (3, 2) to (11, 10)   (0.30, 0.00) to (8.79, 1.71)    68%
C1–C5, D1–D2, M1–M4, R1–R3 [14]    (2, 2) to (16, 5)    (0.00, 0.00) to (8.05, 8.65)    71%
Q21–Q210, X [40]                   (3, 3)               (0.60, 0.42) to (2.40, 2.47)    0%

Table 3.2
Properties of the initial moderate-degree problem set, containing a total of 16 systems. All of these systems with the exception of lebesgue were taken from [8].

Problem             Total degree   Coefficient range   Condition range
13 sings 9          (9, 8)         (6.91, 6.26)        (0.54, 9.31)
compact surf        (18, 17)       (3.76, 3.76)        (0.02, 16.28) ∪ ∞
curve24             (29, 28)       (4.95, 4.68)        (0.13, 1.66)
curve issac         (16, 15)       (3.43, 4.24)        (0.28, 16.23) ∪ ∞
cusps and flexes    (9, 8)         (4.74, 4.29)        (0.22, 8.36)
deg16 7 curves      (32, 11)       (3.16, 1.08)        (0.15, 2.87) ∪ ∞
dfold 10 6          (32, 31)       (4.97, 4.46)        (0.07, 1.41) ∪ ∞
grid deg 10         (10, 9)        (9.08, 8.16)        (0.11, 3.27)
hard one            (27, 10)       (28.04, 7.82)       (6.65, 20.85)
huge cusp           (8, 7)         (4.79, 4.16)        (0.37, 7.96)
L4 circles          (16, 15)       (8.05, 8.65)        (15.52, 19.43) ∪ ∞
lebesgue            (20, 20)       (6.22, 6.53)        (0.10, 1.64)
spiral29 24         (29, 29)       (10.94, 4.95)       (0.13, 6.66)
ten circles         (20, 19)       (6.10, 5.57)        (0.92, 31.39) ∪ ∞
tryme               (34, 25)       (7.68, 6.96)        (3.50, 13.14)
vert lines          (16, 14)       (2.15, 2.14)        (0.49, 1.75)

[Figure: two panels, (a) ex007 and (b) C3.]

Fig. 3.1. Two examples of the low-degree bivariate polynomial systems and their solutions.

For a fair comparison, each algorithm receives the systems in a normalized format. More specifically, each polynomial's coefficients are converted to hardware double precision floating point numbers, after which they are multiplied by an integer power of two such that their exponents are approximately centered around zero.


[Figure: two panels, (a) curve24 and (b) lebesgue.]

Fig. 3.2. Two examples of the moderate-degree bivariate polynomial systems and their solutions.

Some of the systems are of the form $p(x, y) = \frac{\partial p}{\partial y}(x, y) = 0$ and correspond to finding roots of $p(x, y)$ that are tangent to a vertical line. Taking inspiration from this, we artificially expand the low-degree and moderate-degree problem sets by taking each system $p(x, y) = q(x, y) = 0$ and adding the systems $p(x, y) = \frac{\partial p}{\partial x}(x, y) = 0$ and $\frac{\partial p}{\partial x}(x, y) = \frac{\partial p}{\partial y}(x, y) = 0$, tripling the total number of systems. The only exception to this rule are the systems Q21–Q210 and X, which are already of the form $\frac{\partial p}{\partial x}(x, y) = \frac{\partial p}{\partial y}(x, y) = 0$ and correspond to finding the stationary points of normal quartics [40].

For each system in the expanded low-degree and moderate-degree problem sets, we compute a reference solution set as follows. First, all solutions of all systems as computed by all eight algorithms are collected. There is no discrimination between real or complex solutions, nor between valid or spurious solutions. For each such candidate root $(x^*, y^*)$, we apply a high-precision Newton–Raphson method with 128 significant decimal digits to $\mathrm{Re}\{(x^*, y^*)\}$. We terminate the refinement after 500 iterations or when the relative step size is less than $\epsilon_{\mathrm{mach}}10^{-3}$, where $\epsilon_{\mathrm{mach}}$ is the machine epsilon for double precision. The refined root $(x^*, y^*)$ is accepted as a valid root if both conditions

$$|p(x^*, y^*)| \leq |p|(|x^*|, |y^*|)\,\epsilon_{\mathrm{mach}}10^{3} \quad\text{and}\quad |q(x^*, y^*)| \leq |q|(|x^*|, |y^*|)\,\epsilon_{\mathrm{mach}}10^{3}$$

hold when evaluated in high precision. Here, the polynomials $|p|$ and $|q|$ correspond to the coefficient matrices $|P|$ and $|Q|$, respectively. Evaluated at the absolute value of $(x^*, y^*)$, they are equal to the condition number of evaluating the polynomials $p$ and $q$ at $(x^*, y^*)$. If accepted, the candidate root is truncated to double precision and added to the system's reference solution set. After all candidate roots have been processed, the reference solution sets are reduced to their distinct elements.
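In double precision, this acceptance test can be sketched as follows (our illustration; the paper evaluates it with 128 significant digits):

```python
import numpy as np

def accept_root(P, Q, x, y):
    """Acceptance test for reference roots: |p(x*, y*)| must not exceed
    the evaluation condition number |p|(|x*|, |y*|) times eps_mach * 1e3,
    and likewise for q. ev evaluates a coefficient matrix in the product
    monomial basis (P[i, j] multiplies y^i x^j)."""
    def ev(M, a, b):
        return (b ** np.arange(M.shape[0])) @ M @ (a ** np.arange(M.shape[1]))
    tol = np.finfo(float).eps * 1e3
    return (abs(ev(P, x, y)) <= ev(np.abs(P), abs(x), abs(y)) * tol and
            abs(ev(Q, x, y)) <= ev(np.abs(Q), abs(x), abs(y)) * tol)
```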

For each problem set $\mathcal{P}$, the performance profiles are computed as follows. Let $\mathcal{S}$ be the set of solvers consisting of the previously mentioned eight algorithms and let $t_{p,s}$ denote the time required by solver $s$ to solve problem $p$. If solver $s$ did not successfully solve problem $p$, we set $t_{p,s} = \infty$. The performance profile for solver $s$,

$$\rho_s(\tau) := \frac{\left|\{p \in \mathcal{P} \mid t_{p,s} \leq 2^{\tau} \cdot \min_{s \in \mathcal{S}} t_{p,s}\}\right|}{|\mathcal{P}|},$$


depicts the fraction of problems that solver $s$ was able to solve within a factor $2^{\tau}$ of the fastest solution time for that problem. We say that a polynomial system $p$ has been solved successfully by solver $s$ if for at least 99% of all roots $(x^*, y^*)$ in the reference solution set, there is a candidate root $(\hat{x}^*, \hat{y}^*)$ computed by $s$ which satisfies

$$\|(x^*, y^*) - (\hat{x}^*, \hat{y}^*)\| \leq \max(\|(x^*, y^*)\|, 1)\,\epsilon_{\max}$$

for a certain maximal relative error $\epsilon_{\max}$. Chebfun computes roots on a rectangular domain specified by the user, which is $[-1, 1] \times [-1, 1]$ by default. For each system, we set this domain equal to the smallest axis-aligned bounding box of the corresponding reference solution set, plus a small margin to prevent edge cases.

[Figure: two panels of performance profiles, (a) $\epsilon_{\max} = 5\%$ and (b) $\epsilon_{\max} = 0.5\%$.]

Fig. 3.3. Performance profiles for low-degree bivariate polynomial systems. The problem set consists of all these polynomial systems, and we say a system has been successfully solved if at least 99% of its roots have been found with a relative error that does not exceed $\epsilon_{\max}$. Legend: our method, PHCpack, Bertini VP, Bertini DP, Chebfun, RealSolving, RegularChains and BivariatePolynomial.

[Figure: two panels of performance profiles, (a) $\epsilon_{\max} = 5\%$ and (b) $\epsilon_{\max} = 0.5\%$.]

Fig. 3.4. Performance profiles for moderate-degree bivariate polynomial systems. The problem set consists of all these polynomial systems, and we say a system has been successfully solved if at least 99% of its roots have been found with a relative error that does not exceed $\epsilon_{\max}$. Legend: our method, PHCpack, Bertini VP, Bertini DP, Chebfun, RealSolving, RegularChains and BivariatePolynomial.


Figures 3.3 and 3.4 show the performance profiles of the expanded low-degree and moderate-degree problem sets, respectively, for two choices of $\epsilon_{\max}$. The left-hand side of each plot, where $\tau = 0$, shows the fraction of successfully solved problems for which a solver was the fastest solver. The right-hand side, where $\tau$ is large, shows the fraction of problems a solver was able to successfully solve. Among the fixed precision solvers, Algorithm 2.1 was able to successfully solve the largest fraction of problems and was also considerably faster than most other algorithms for both problem sets and both choices of $\epsilon_{\max}$, with the RealSolving library as a close second for the low-degree problem set. Bertini was able to solve an impressive 90% of the moderate-degree problems in its variable precision mode, which is likely due to the increased flexibility that variable precision offers. Concerning the accuracy of the computed solutions, PHCpack and Chebfun seem to be sensitive to the tolerance $\epsilon_{\max}$, especially on the moderate-degree problem set. The other algorithms' performance profiles remain relatively static when $\epsilon_{\max}$ is decreased, indicating that their computed solutions are comparatively more accurate.

Compared to the low-degree problem set, the fraction of moderate-degree systems that were successfully solved is much lower. These systems often have many more solutions than their low-degree counterparts. Since a solver by definition fails to solve a system as soon as over 1% of these roots is not found, we compute a second set of performance profiles to examine the fraction of successfully recovered roots as follows. We define two new problem sets containing all reference solutions of the expanded low-degree systems and moderate-degree systems, respectively. The time for solver $s$ to solve problem $p$ is then defined as the total time for solver $s$ to compute all roots of the bivariate polynomial system associated with $p$, divided by the number of roots in the system's reference solution set. If solver $s$ was not able to successfully recover the root $p$, we set $t_{p,s} = \infty$. The condition for successfully recovering a root is identical to that of the previous performance profiles.

[Figure: two panels of performance profiles, (a) $\epsilon_{\max} = 5\%$ and (b) $\epsilon_{\max} = 0.5\%$.]

Fig. 3.5. Performance profiles for low-degree bivariate polynomial systems. The problem set consists of all distinct roots of these systems, and we say a root has been successfully found if its relative error does not exceed $\epsilon_{\max}$. Legend: our method, PHCpack, Bertini VP, Bertini DP, Chebfun, RealSolving, RegularChains and BivariatePolynomial.

The performance profiles based on the redefined problem sets corresponding to the expanded low-degree and moderate-degree systems are shown in Figures 3.5 and 3.6, respectively.


[Figure: two panels of performance profiles, (a) $\epsilon_{\max} = 5\%$ and (b) $\epsilon_{\max} = 0.5\%$.]

Fig. 3.6. Performance profiles for moderate-degree bivariate polynomial systems. The problem set consists of all distinct roots of these systems, and we say a root has been successfully found if its relative error does not exceed $\epsilon_{\max}$. Legend: our method, PHCpack, Bertini VP, Bertini DP, Chebfun, RealSolving, RegularChains and BivariatePolynomial.

Algorithm 2.1 was able to recover nearly all roots of both problem sets and for both values of $\epsilon_{\max}$. The roots that it did not recover were spread out among different systems, causing the fraction of successfully solved moderate-degree systems to drop to around 80% (cf. Figure 3.4). Interestingly, the RealSolving library successfully solved the (close to) second largest fraction of systems, yet other solvers recover a higher fraction of roots in Figure 3.6. This indicates that the RealSolving library is somewhat of a hit-or-miss algorithm, either recovering all solutions or very few. In contrast, Bertini's double precision mode, PHCpack and Chebfun are able to recover a larger fraction of roots, but fail to recover at least one root for many moderate-degree systems. Furthermore, we observe that for the moderate-degree systems around 20% of the roots computed by PHCpack and Chebfun are only accurate up to $\epsilon_{\max} = 5\%$.

4. Conclusion. We presented an algorithm for computing the isolated real solutions of bivariate polynomial systems, and the isolated complex solutions of polyanalytic polynomial systems. For bivariate systems in two variables, say $x$ and $y$, the algorithm projects the solutions onto the real plane associated with $x$ and $y$ by combining a well-chosen change of variables with a resultant-based elimination of a hidden variable. In this way, the problem is reduced to computing the eigenvalues of a polynomial eigenvalue problem. In comparison to other eigen-based approaches, there is no need to compute the accompanying eigenvectors or to solve any additional eigenproblems. To increase the algorithm's robustness, we presented three methods of balancing the eigenvalue problem and an elegant method to remove spurious roots. Though the algorithm was chiefly developed for polynomials expressed in the product monomial basis, the generalization to other bases is straightforward. The numerical experiments show that our method often outperforms other algorithms in the fraction of roots successfully recovered, the accuracy of the estimated roots and computational time. The algorithms discussed in this article were implemented as part of the MATLAB toolbox Tensorlab [49] and are available online.


REFERENCES

[1] E. Anderson, Z. Bai, Chr. H. Bischof, S. Blackford, J. W. Demmel, J. J. Dongarra, J. J. Du Croz, A. Greenbaum, S. J. Hammarling, A. McKenney, and D. C. Sorensen, LAPACK Users’ Guide, Society for Industrial and Applied Mathematics, Philadelphia, PA, third ed., 1999.

[2] P. Aubry, D. Lazard, and M. Moreno Maza, On the theories of triangular sets, J. Symb. Comput., 28 (1999), pp. 105–124.

[3] W. Auzinger and H. J. Stetter, An elimination algorithm for the computation of all zeros of a system of multivariate polynomial equations, in Proceedings of the International Con-ference on Numerical Mathematics, vol. 86 of Intern. Series in Numer. Math., Birkh¨auser, Basel, 1988, pp. 12–30.

[4] C. Bajaj, T. Garrity, and J. Warren, On the applications of multi-equational resultants, Technical Report 88-826, Department of Computer Science, Purdue University, Nov. 1988. [5] S. Basu, R. Pollack, and M.-F. Roy, Algorithms in Real Algebraic Geometry, vol. 10 of

Algorithms and Computation in Mathematics, Springer-Verlag, second ed., 2006. [6] D. J. Bates, J. D. Hauenstein, A. J. Sommese, and C. W. Wampler, Adaptive multiprecision

path tracking, SIAM J. Numer. Anal., 46 (2008), pp. 722–746.

[7] , Numerically solving polynomial systems with Bertini, vol. 25 of Software, Environments and Tools, SIAM, 2013.

[8] E. Berberich, P. Emeliyanenko, and M. Sagraloff, An elimination method for solving bivariate polynomial systems: Eliminating the usual drawbacks, in Proceedings of the Thirteenth Workshop on Algorithm Engineering and Experiments (ALENEX), M. M¨ uller-Hannemann and R. Werneck, eds., San Francisco, CA, USA, Jan. 2011, SIAM, pp. 35–47. [9] B. Buchberger, Gr¨obner bases and systems theory, Multidim. Systems and Signal Proc., 12

(2001), pp. 223–251.

[10] L. Bus´e, H. Khalil, and B. Mourrain, Resultant-based methods for plane curves intersection problems, in Proceedings of the 8th international conference on Computer Algebra in Sci-entific Computing, vol. 3718 of Lecture Notes in Computer Science, Springer-Verlag, 2005, pp. 75–92.

[11] R. H. Byrd, R. B. Schnabel, and G. A. Schultz, Approximate solution of the trust region problem by minimization over two-dimensional subspaces, Math. Program., 40 (1988), pp. 247–263.

[12] J. D. Carroll and J.-J. Chang, Analysis of individual differences in multidimensional scaling via an n-way generalization of “Eckart–Young” decomposition, Psychometrika, 35 (1970), pp. 283–319.

[13] R. M. Corless, P. M. Gianni, and B. M. Trager, A reordered Schur factorization method for zero-dimensional polynomial systems with multiple roots, in Proceedings of the Inter-national Symposium on Symbolic and Algebraic Computation, ACM Press, 1997, pp. 133– 140.

[14] D. I. Diochnos, I. Z. Emiris, and E. P. Tsigaridas, On the asymptotic and practical complexity of solving bivariate systems over the reals, J. Symb. Comput., 44 (2009), pp. 818–835.

[15] E. D. Dolan and J. J. Moré, Benchmarking optimization software with performance profiles, Math. Program., 91 (2002), pp. 201–213.

[16] I. Z. Emiris and B. Mourrain, Matrices in elimination theory, J. Symb. Comput., 28 (1999), pp. 3–43.

[17] J.-C. Faugère, A new efficient algorithm for computing Gröbner bases without reduction to zero (F5), in Proceedings of the 2002 International Symposium on Symbolic and Algebraic Computation, ISSAC '02, ACM, 2002, pp. 75–83.

[18] I. C. Gohberg, P. Lancaster, and L. Rodman, Matrix Polynomials, Academic Press, New York, 1982.

[19] L. González-Vega and I. Necula, Efficient topology determination of implicitly defined algebraic plane curves, Comput. Aided Geom. Design, 19 (2002), pp. 719–743.

[20] A. Griewank and M. R. Osborne, Analysis of Newton's method at irregular singularities, SIAM J. Numer. Anal., 20 (1983), pp. 747–773.

[21] R. A. Harshman, Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multi-modal factor analysis, UCLA Working Papers in Phonetics, 16 (1970), pp. 1–84.

[22] G. F. Jónsson and S. A. Vavasis, Accurate solution of polynomial equations using Macaulay resultant matrices, Math. Comp., 74 (2004), pp. 221–262.

[23] J. T. Kajiya, Ray tracing parametric patches, ACM SIGGRAPH Comp. Graph., 16 (1982), pp. 245–254.

[24] Ph. A. Knight, The Sinkhorn–Knopp algorithm: Convergence and applications, SIAM J. Matrix Anal. Appl., 30 (2008), pp. 261–275.

[25] T. G. Kolda and B. W. Bader, Tensor decompositions and applications, SIAM Rev., 51 (2009), pp. 455–500.

[26] J. B. Lasserre, Global optimization with polynomials and the problems of moments, SIAM J. Optim., 11 (2001), pp. 796–817.

[27] D. Lazard, Résolution des systèmes d'équations algébriques, Theoret. Comput. Sci., 15 (1981), pp. 77–110.

[28] D. Lemonnier and P. Van Dooren, Balancing regular matrix pencils, SIAM J. Matrix Anal. Appl., 28 (2006), pp. 253–263.

[29] T.-Y. Li, Numerical solution of multivariate polynomial systems by homotopy continuation methods, Act. Num., 6 (1997), pp. 399–436.

[30] D. Manocha, Solving systems of polynomial equations, IEEE Comput. Graph. Appl., 14 (1994), pp. 46–55.

[31] D. Manocha and J. W. Demmel, Algorithms for intersecting parametric and algebraic curves II: Multiple intersections, Graph. Models and Image Proc., 57 (1995), pp. 81–100.

[32] J. Maroulas and S. Barnett, Polynomials with respect to a general basis. I. Theory, J. Math. Anal. Appl., 72 (1979), pp. 177–194.

[33] J. Maroulas and S. Barnett, Polynomials with respect to a general basis. II. Applications, J. Math. Anal. Appl., 72 (1979), pp. 599–614.

[34] H. M. Möller and H. J. Stetter, Multivariate polynomial equations with multiple zeros solved by matrix eigenproblems, Numer. Math., 70 (1995), pp. 311–329.

[35] R. E. Moore, Methods and Applications of Interval Analysis, SIAM, 1987.

[36] A. P. Morgan, Polynomial continuation and its relationship to the symbolic reduction of polynomial systems, in Symbolic and Numerical Computation for Artificial Intelligence, Academic Press, New York, 1992, pp. 23–45.

[37] Y. Nakatsukasa, V. Noferini, and A. Townsend, Computing the common zeros of two bivariate functions via Bézout resultants. Submitted, 2013.

[38] J. Nocedal and S. J. Wright, Numerical Optimization, Springer Series in Operations Research, Springer, second ed., 2006.

[39] P. A. Parrilo, Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization, PhD thesis, California Institute of Technology, 2000.

[40] L. Qi, Z. Wan, and Y.-F. Yang, Global minimization of normal quartic polynomials based on global descent directions, SIAM J. Optim., 15 (2006), pp. 275–302.

[41] F. Rouillier, Solving zero-dimensional systems through the rational univariate representation, Appl. Algebra Engrg. Comm. Comput., 9 (1999), pp. 433–461.

[42] F. Rouillier and P. Zimmermann, Efficient isolation of polynomial real roots, J. Comput. Appl. Math., 162 (2003), pp. 33–50.

[43] M. Šebek, S. Pejchová, D. Henrion, and H. Kwakernaak, Numerical methods for zeros and determinant of polynomial matrix, in Proceedings of the IEEE Mediterranean Symposium on New Directions in Control and Automation, Chania, Crete, Greece, June 1996, pp. 488–491.

[44] R. Seidel and N. Wolpert, On the exact computation of the topology of real algebraic curves, in Proceedings of the twenty-first annual symposium on Computational geometry, SCG 2005, New York, NY, USA, 2005, ACM, pp. 107–115.

[45] N. Z. Shor, Class of global minimum bounds of polynomial functions, Cybern. and Sys. Anal., 23 (1987), pp. 731–734.

[46] A. J. Sommese and C. W. Wampler, The Numerical Solution of Systems of Polynomials Arising in Engineering and Science, World Scientific Press, Singapore, 2005.

[47] L. Sorber, I. Domanov, M. Van Barel, and L. De Lathauwer, Exact line and plane search for tensor optimization, ESAT-STADIUS Internal Report 13-02, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium, 2013.

[48] L. Sorber, M. Van Barel, and L. De Lathauwer, Unconstrained optimization of real functions in complex variables, SIAM J. Optim., 22 (2012), pp. 879–898.

[49] L. Sorber, M. Van Barel, and L. De Lathauwer, Tensorlab v2.0. Available online at http://www.tensorlab.net, Jan. 2014.

[50] H. J. Stetter, Matrix eigenproblems are at the heart of polynomial system solving, SIGSAM Bull., 30 (1996), pp. 22–25.

[51] A. Townsend and L. N. Trefethen, An extension of Chebfun to two dimensions. Submitted, 2013.

[52] L. R. Tucker, Some mathematical notes on three-mode factor analysis, Psychometrika, 31 (1966), pp. 279–311.

[53] J. Verschelde, Algorithm 795: PHCpack: A general-purpose solver for polynomial systems by homotopy continuation, ACM Trans. Math. Software, 25 (1999), pp. 251–276.

[54] J. Verschelde, J. Verlinden, and R. Cools, Homotopies exploiting Newton polytopes for solving sparse polynomial systems, SIAM J. Numer. Anal., 31 (1994), pp. 915–930.

[55] C. W. Wampler and A. J. Sommese, Numerical algebraic geometry and algebraic kinematics, Act. Num., 20 (2011), pp. 469–567.

[56] R. C. Ward, Balancing the generalized eigenvalue problem, SIAM J. Sci. Statist. Comput., 2 (1981), pp. 141–152.

[57] J. H. Wilkinson, The evaluation of the zeros of ill-conditioned polynomials. Part I, Numer. Math., 1 (1959), pp. 150–166.

[58] J. H. Wilkinson, Kronecker's canonical form and the QZ algorithm, Linear Algebra Appl., 28 (1979), pp. 285–303.

[59] K. Zhou and S. I. Roumeliotis, Multirobot active target tracking with combinations of relative observations, IEEE Trans. Robot., 27 (2011), pp. 678–695.
