Two Block Macaulay Matrix Algorithms to Solve Multiparameter Eigenvalue Problems⋆

Christof Vermeersch^{a,∗}, Bart De Moor^{a}

^{a}Center for Dynamical Systems, Signal Processing, and Data Analytics (STADIUS), Department of Electrical Engineering (ESAT), KU Leuven, Kasteelpark Arenberg 10, 3001 Leuven, Belgium
Abstract
We consider two algorithms that use the so-called block Macaulay matrix to solve multiparameter eigenvalue problems (MEPs). In earlier work, we have shown that the null space of the block Macaulay matrix that is constructed from the coefficient matrices of an MEP gives rise to a standard eigenvalue problem (SEP), the eigenvalues and eigenvectors of which yield the solutions of the MEP. We revisit in this paper the null space based approach and propose a new, complementary algorithm to solve MEPs, which considers the data in the column space of the sparse and structured block Macaulay matrix instead. This paper generalizes, in a certain sense, earlier traditional Macaulay matrix techniques from multivariate polynomial system solving to the block Macaulay matrix in the MEP setting. We also provide several numerical examples to illustrate both the null space and column space based approaches and to underline the intrinsic complementarity between both fundamental subspaces.
Keywords: multiparameter eigenvalue problem, block Macaulay matrix, realization theory
2020 MSC: 15A18, 15B05, 93B15
1. Introduction
Many natural and scientific phenomena exhibit intrinsic system dynamics that can be captured in a standard eigenvalue problem (SEP). The eigenvalues and eigenvectors that correspond to those phenomena describe the proper (the prefix eigen- is adopted from the German word eigen, which means proper and was presumably first coined by Hilbert (1904)) evolution of the system dynamics in a certain direction. But for some phenomena, a single spectral parameter does not capture the system dynamics entirely and multiple spectral parameters, or eigentuples of eigenvalues, come into the picture. Historically, multiparameter spectral theory has its roots in the classical problem of solving boundary-value problems for partial differential equations by the classical method of separation of variables (Atkinson and Mingarelli, 2010; Plestenjak et al., 2015; Sleeman, 2007). For example, the vibration problem of an elliptic membrane in two elliptic coordinates, i.e., the two-dimensional Helmholtz equation, leads to the study of a pair of ordinary differential equations, both of which contain the same two spectral parameters, hence a two-parameter
⋆This work was supported in part by the KU Leuven: Research Fund (projects C16/15/059, C3/19/053, C24/18/022, and C3/20/117), Industrial Research Fund (fellowship 13-0260, IOF/16/004), and several Leuven Research and Development bilateral industrial projects, in part by Flemish Government agencies: FWO (EOS project G0F6718N (SeLMA), SBO project S005319, infrastructure project I013218N, TBM project T001919N, and PhD grants SB/1SA1319N, SB/1S93918, and SB/1S1319N), EWI (Flanders AI Research Program), and VLAIO (City of Things COT.2018.018, Baekeland PhD mandate HBC.2019.2204, and innovation mandate HBC.2019.2209), and in part by the European Commission (EU Research Council under the European Union's Horizon 2020 research and innovation programme (ERC Adv. Grant) under grant 885682). Other funding: Foundation "Kom op tegen Kanker", Christelijke Mutualiteit. The work of Christof Vermeersch was supported by the FWO Strategic Basic Research fellowship under grant SB/1SA1319N.
∗Corresponding author
Email addresses: christof.vermeersch@esat.kuleuven.be (Christof Vermeersch), bart.demoor@esat.kuleuven.be (Bart De Moor)
eigenvalue problem (Plestenjak et al., 2015; Sleeman, 2007). The presence of multiple spectral parameters (eigentuples instead of eigenvalues) clearly links the different physical phenomena in an elementary fashion.
Recently, we have shown that the identification of linear time-invariant (LTI) systems is, in essence, also a multiparameter eigenvalue problem (De Moor, 2019, 2020; Vermeersch and De Moor, 2019).
However, despite their applicability and natural relation to SEPs, multiparameter eigenvalue problems (MEPs) have not yet penetrated the general scientific community. Besides the early work by Carmichael (1921a,b, 1922), Atkinson (1972), Volkmer (1988), and others, the available literature about MEPs remains rather limited. During the last two decades, a renewed interest in these problems (for example by Hochstenbach, Plestenjak, and co-authors) has sparked several contributions to the field of multiparameter spectral theory.
The literature about solving one-parameter eigenvalue problems, i.e., SEPs, GEPs (generalized eigenvalue problems), and PEPs (polynomial eigenvalue problems), is vast and mature. PEPs are usually linearized into a larger GEP (Higham et al., 2006; Tisseur and Meerbergen, 2001) and the resulting matrix pencil is solved via one of the many available, efficient SEP or GEP solvers. Approaches to solve MEPs, on the other hand, have clearly been explored much less intensively. In practice, although some direct algorithms based on simultaneous diagonalizations exist for small linear MEPs (Ji, 1991; Slivnik and Tomšič, 1986), one typically solves nonlinear MEPs with iterative nonlinear optimization algorithms (Nocedal and Wright, 2006), e.g., gradient descent techniques (Blum and Geltner, 1978; Blum and Curtis, 1978; Browne and Sleeman, 1982), minimal residual quotient iterations (Blum and Chang, 1978), or Newton-based methods (Bohte, 1982). However, these optimization approaches are heuristic (they depend on an initial guess) and only result in approximations of the eigentuples and eigenvectors. In the last two decades, a regained interest in the topic has led to several iterative (homotopy) continuation algorithms (Plestenjak, 2000) and subspace approaches (Hochstenbach et al., 2004, 2019; Hochstenbach and Plestenjak, 2002, 2008; Ruymbeek et al., 2021) to overcome the well-known issues with convergence and scalability. Nevertheless, these algorithms remain approximate and focus, except for linearizations of the quadratic two-parameter eigenvalue problem (Hochstenbach et al., 2012; Muhič and Plestenjak, 2010), solely on linear MEPs, with implementations available for linear two-parameter and three-parameter eigenvalue problems (generalizations to n > 3 parameters are not always straightforward (Hochstenbach et al., 2019; Plestenjak et al., 2015)).
In our earlier work (De Moor, 2019, 2020; Vermeersch and De Moor, 2019), we have introduced the so-called block Macaulay matrix, which allows us to solve nonlinear MEPs exactly via a multidimensional realization problem in the null space of the block Macaulay matrix. This block Macaulay matrix framework enables a natural transition from SEPs and GEPs, over PEPs and linear MEPs, to nonlinear MEPs. In this paper, we first revisit the existing null space based algorithm and then develop a new, complementary column space based approach to determine the solutions of nonlinear MEPs exactly. The complementarity between both fundamental subspaces of the block Macaulay matrix results in an equivalent multidimensional realization problem in its column space. This observation stems from a similar complementarity in the field of multivariate polynomial system solving, where the null space and column space of the traditional Macaulay matrix both give rise to an exact linear algebra root-finding algorithm (Vermeersch and De Moor, 2020).
The main contribution of this paper is a new approach to solve (nonlinear) MEPs, via the column space of the block Macaulay matrix, using the complementarity between its null space and column space. We also introduce the backward QR-decomposition, which is an essential tool in the column space based approach and serves a dual pre-processing purpose: reducing the number of rows and revealing the rank (or solution) structure of the block Macaulay matrix.
Outline. The remainder of this paper proceeds as follows: Section 2 gives a rigorous definition of MEPs
and Section 3 introduces the block Macaulay matrix. We revise the null space based approach to solve MEPs
in Section 4, while, in Section 5, we propose a new, complementary column space based algorithm. Several
numerical examples illustrate both algorithms in Section 6. Finally, we summarize our conclusions and
point at ideas for future research in Section 7. Appendix A covers the shift structure of finite dimensional
subspaces, like the null space of the block Macaulay matrix, in more depth.
Notation. We denote scalars by italic lowercase letters, e.g., a, and tuples/vectors by boldface lowercase letters, e.g., a. Matrices are characterized by boldface uppercase letters, e.g., A. When a matrix contains one or more parameters, for example the combined coefficient matrix of an MEP, we use a calligraphic font, e.g., A(λ). We use a subscript to indicate the elements or submatrices of a tuple, vector, or matrix, e.g., a_1 is the first element of the vector a.
2. Multiparameter eigenvalue problems
Multiparameter eigenvalue problems (MEPs) naturally extend the typical structure of standard eigenvalue problems (SEPs) and involve eigentuples λ = (λ_1, …, λ_n) of eigenvalues instead of single eigenvalues λ. For example,

$$\left( A_{00} + A_{10}\lambda_1 + A_{12}\lambda_1\lambda_2^2 \right) z = 0$$

contains n = 2 spectral parameters, combined in an eigentuple (λ_1, λ_2). Several manifestations of MEPs appear in the literature. Therefore, we first rigorously define the MEP within our block Macaulay matrix framework.
Definition 1. Given coefficient matrices A_ω ∈ R^{k×l} (with k ≥ l + n − 1), the multiparameter eigenvalue problem M(λ_1, …, λ_n) z = 0 consists in finding n scalars λ_1, …, λ_n ∈ C and a corresponding vector z ∈ C^{l×1} \ {0}, such that

$$\mathcal{M}\left(\lambda_1, \dots, \lambda_n\right) z = \sum_{\{\omega\}} A_{\omega} \lambda^{\omega} z = 0, \qquad (1)$$

with monomials λ^ω = ∏_{i=1}^{n} λ_i^{ω_i} = λ_1^{ω_1} ⋯ λ_n^{ω_n}. The n-tuples λ = (λ_1, …, λ_n) ∈ C^n and (nonzero) vectors z ∈ C^{l×1} \ {0} denote the eigentuples (with eigenvalues λ_1, …, λ_n) and eigenvectors of the MEP, respectively.
The n-tuple ω = (ω_1, …, ω_n) ∈ N^n gathers the powers of the eigenvalues in the monomial λ^ω = ∏_{i=1}^{n} λ_i^{ω_i} = λ_1^{ω_1} ⋯ λ_n^{ω_n} and identifies the corresponding coefficient matrices A_ω = A_{(ω_1⋯ω_n)}. The total degree of a monomial is equal to the sum of its powers, or |ω| = ∑_{i=1}^{n} ω_i. The degree d_S of an MEP corresponds to the highest total degree of all its monomials¹. The matrix M(λ_1, …, λ_n) is the combined coefficient matrix of the MEP and is a function of the different eigenvalues.
Example 1. To demonstrate Definition 1, we consider an MEP of degree d_S = 2 with two parameters and four monomials,

$$\left( A_{00} + A_{10}\lambda_1 + A_{11}\lambda_1\lambda_2 + A_{02}\lambda_2^2 \right) z = 0, \qquad (2)$$

which has four coefficient matrices A_ω ∈ R^{3×2},

$$A_{00} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 3 & 4 \end{bmatrix}, \quad A_{10} = \begin{bmatrix} 2 & 1 \\ 0 & 1 \\ 1 & 3 \end{bmatrix}, \quad A_{11} = \begin{bmatrix} 3 & 4 \\ 2 & 1 \\ 0 & 1 \end{bmatrix}, \quad \text{and} \quad A_{02} = \begin{bmatrix} 1 & 2 \\ 4 & 2 \\ 2 & 1 \end{bmatrix}.$$
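To make the combined coefficient matrix tangible, the MEP of Example 1 can be evaluated numerically; the following is a minimal Python/NumPy sketch (the function name M is ours, not part of the original exposition):

```python
import numpy as np

# Coefficient matrices of the two-parameter MEP of degree d_S = 2 in (2).
A00 = np.array([[1., 2.], [3., 4.], [3., 4.]])
A10 = np.array([[2., 1.], [0., 1.], [1., 3.]])
A11 = np.array([[3., 4.], [2., 1.], [0., 1.]])
A02 = np.array([[1., 2.], [4., 2.], [2., 1.]])

def M(l1, l2):
    """Combined coefficient matrix M(lambda_1, lambda_2) of the MEP in (2)."""
    return A00 + A10 * l1 + A11 * l1 * l2 + A02 * l2 ** 2

# M(l1, l2) is a tall 3 x 2 matrix: an eigentuple must make it rank deficient,
# so that a non-trivial eigenvector z exists in its null space.
assert M(0.0, 0.0).shape == (3, 2)
```

Note that, since k = 3 > l = 2, a generic point (λ_1, λ_2) gives a full column rank matrix; only eigentuples yield a non-trivial null space.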
In this additive structure, one easily notices the resemblance with the SEP Az = λz, or (A − Iλ) z = 0, which has a non-trivial solution if and only if the characteristic polynomial χ(A) = det(A − Iλ) = 0. Similarly, we can write generalized eigenvalue problems (GEPs) as (A − Bλ) z = 0. Obviously, the degree d_S of SEPs and GEPs is equal to 1. Polynomial eigenvalue problems (PEPs) also fit perfectly in Definition 1, with n = 1, as

$$\sum_{\{\omega\}} A_{\omega} \lambda^{\omega} z = \left( \sum_{i=0}^{d_S} A_{i} \lambda^{i} \right) z = 0.$$
¹We indicate the degree d_S of an MEP with a subscript ·_S, which stems from its function as the seed equation in the block Macaulay matrix (see Section 3), to make a distinction with the degree d_{FSR} of the forward shift recursions and the degree d of the block Macaulay matrix.
Table 1: Within our block Macaulay matrix framework, we observe four different types of MEPs, organized on a two-dimensional grid according to the structure of their monomials (i.e., single eigenvalues versus eigentuples and linear versus nonlinear).

Eigenvalues    Linear                        Nonlinear
n = 1          Type I: {1, λ} (SEP/GEP)      Type II: λ^ω (PEP)
n > 1          Type III: λ_i (linear MEP)    Type IV: ∏_{i=1}^{n} λ_i^{ω_i} (nonlinear MEP)
Example 2. We can easily write a GEP in the structure of Definition 1:

$$\left( \begin{bmatrix} 1 & 2 & 1 \\ 3 & 4 & 2 \\ 2 & 4 & 3 \end{bmatrix} - \begin{bmatrix} 1 & 2 & 1 \\ 4 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix} \lambda \right) z = 0. \qquad (3)$$
Example 3. A PEP of degree d_S = 4 has five coefficient matrices A_ω ∈ R^{k×l} and is given by

$$\left( A_0 + A_1\lambda + A_2\lambda^2 + A_3\lambda^3 + A_4\lambda^4 \right) z = 0.$$
Example 4. Often, the eigenvalues only appear linearly in the monomials of the MEP. For example, a linear two-parameter eigenvalue problem (linear 2-EP)

$$\left( A_{00} + A_{10}\lambda_1 + A_{01}\lambda_2 \right) z = 0, \qquad (4)$$

with three coefficient matrices A_ω ∈ R^{3×2},

$$A_{00} = \begin{bmatrix} 2 & 6 \\ 4 & 5 \\ 0 & 1 \end{bmatrix}, \quad A_{10} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}, \quad \text{and} \quad A_{01} = \begin{bmatrix} 4 & 2 \\ 0 & 8 \\ 1 & 1 \end{bmatrix}.$$
The previous examples show that the universe of MEPs is very diverse. Table 1 summarizes the different types of MEPs that we cover within our block Macaulay matrix framework. They are organized on a two-dimensional grid according to the structure of their monomials (i.e., single eigenvalues versus eigentuples and linear versus nonlinear). Examples 1, 2, 3, and 4 each highlight one of the four types of MEPs on that two-dimensional grid (Type IV, I, II, and III, respectively). The available literature mostly considers MEPs with single eigenvalues (Type I and II) or problems in which the eigenvalues only appear linearly (Type III).
However, within our block Macaulay matrix framework, we tackle all four types of problems and provide numerical linear algebra algorithms to solve general (nonlinear) MEPs exactly.
Remark 1. In the literature, other manifestations of MEPs can be found; one of them in particular is the system-wise formulation, in which multiple MEPs with square coefficient matrices are combined into one system of matrix equations (see for example (Atkinson, 1972; Volkmer, 1988; Plestenjak, 2000)). A linear 2-EP in this system-wise formulation is written as

$$\begin{cases} \left( A_{1} + B_{1}\lambda_1 + C_{1}\lambda_2 \right) x = 0, \\ \left( A_{2} + B_{2}\lambda_1 + C_{2}\lambda_2 \right) y = 0, \end{cases}$$

where (λ_1, λ_2) is the eigentuple of eigenvalues and the tensor product x ⊗ y = vec(yx^T) is the corresponding eigenvector².
²The vectorization vec(·) is a linear transformation that converts a matrix into a column vector, by stacking the columns of the matrix on top of one another.
3. Block Macaulay matrix
In earlier work (De Moor, 2019, 2020; Vermeersch and De Moor, 2019), we have introduced the so-called block Macaulay matrix in the context of system identification, in order to solve the MEPs that arise from identification problems. The block Macaulay matrix is a block matrix extension of the traditional Macaulay matrix (Macaulay, 1902, 1916), a sparse and structured matrix primarily used to solve systems of multivariate polynomial equations (Batselier et al., 2013, 2014a,b; Dreesen, 2013; Dreesen et al., 2012, 2018; Dreesen and De Moor, 2009). We consider in this paper both the null space and the column space of the block Macaulay matrix to find the eigentuples and eigenvectors, i.e., the solutions, of MEPs.
The MEP M(λ_1, …, λ_n) z = 0 in (1) constitutes the so-called seed equation of its corresponding block Macaulay matrix. Indeed, we can generate new, equivalent equations

$$\left\{ \prod_{i=1}^{n} \lambda_i^{d_i} \right\} \mathcal{M}\left(\lambda_1, \dots, \lambda_n\right) z = 0$$

by multiplying the MEP by different monomials ∏_{i=1}^{n} λ_i^{d_i} of increasing total degree ∑_{i=1}^{n} d_i. We call these newly obtained equations forward shift recursions (FSRs), because of the recursive shift of the coefficient matrices in the subsequent FSRs.
Example 5. For example, if we revisit the linear 2-EP from Example 4, (A_{00} + A_{10}λ_1 + A_{01}λ_2) z = 0, and we consider the FSRs of total degree d_{FSR} = 1, then we have two new, equivalent equations:

$$\lambda_1 \left( A_{00} + A_{10}\lambda_1 + A_{01}\lambda_2 \right) z = 0,$$
$$\lambda_2 \left( A_{00} + A_{10}\lambda_1 + A_{01}\lambda_2 \right) z = 0.$$

The coefficient matrices are clearly shifted to a monomial of higher total degree.
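Since a forward shift recursion only moves the support of the coefficient matrices, it can be sketched compactly; in the following illustrative Python snippet (our own representation, not from the original text), an MEP is stored as a map from exponent tuples ω to coefficient matrices:

```python
# An MEP can be stored as a map from exponent tuples omega = (w1, w2) to its
# coefficient matrices; a forward shift recursion only shifts that support
# (the coefficient matrices themselves are untouched).
def forward_shift(mep, d):
    """FSR of `mep` (dict omega -> A_omega), multiplied by the monomial with exponent tuple d."""
    return {tuple(w + s for w, s in zip(omega, d)): A for omega, A in mep.items()}

# Linear 2-EP from Example 4, with the matrices abbreviated by their names.
seed = {(0, 0): "A00", (1, 0): "A10", (0, 1): "A01"}
fsr_l1 = forward_shift(seed, (1, 0))   # multiply by lambda_1
fsr_l2 = forward_shift(seed, (0, 1))   # multiply by lambda_2
```

After the shift by λ_1, for instance, the support moves from {1, λ_1, λ_2} to {λ_1, λ_1², λ_1λ_2}, exactly as in the equations of Example 5.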
Before we continue, we first rigorously define the block Macaulay matrix.
Definition 2. Given the MEP M(λ_1, …, λ_n) z = 0 of degree d_S, which serves as the seed equation, the block Macaulay matrix M(d) ∈ R^{p(d)×q(d)} of degree d stacks the coefficient matrices of the seed equation and of its FSRs up to total degree d_{FSR}, such that d = d_{FSR} + d_S, i.e.,

$$M(d) = \left[ \left\{ \prod_{i=1}^{n} \lambda_i^{d_i} \right\} \mathcal{M}\left(\lambda_1, \dots, \lambda_n\right) \right].$$

The blocks of the block Macaulay matrix contain the shifted coefficient matrices and are indexed (both in row and column direction) by the different monomials of total degree at most d.
Consequently, we can rewrite the MEP and its FSRs as a matrix-vector product of the generated block Macaulay matrix M(d) ∈ R^{p(d)×q(d)} and a block multivariate Vandermonde vector v(d) ∈ C^{q(d)×1}, i.e.,

$$\underbrace{\left[ \left\{ \prod_{i=1}^{n} \lambda_i^{d_i} \right\} \mathcal{M}\left(\lambda_1, \dots, \lambda_n\right) \right]}_{M(d)} \underbrace{\begin{pmatrix} z \\ z\lambda_1 \\ \vdots \\ z\lambda_n \\ z\lambda_1^2 \\ \vdots \end{pmatrix}}_{v(d)} = 0.$$

We increase the degree d of the block Macaulay matrix M(d) until it is large enough and reaches the desired degree d ≥ d^∗, a concept on which we elaborate in Section 4.
The structure of the block Macaulay matrix depends on its multivariate monomial ordering. We use in
this paper the degree negative lexicographic ordering (Batselier et al., 2013), which unambiguously orders
two different monomials based on their total degree and compares both tuples element-wise when both
degrees tie (Definition 3). However, the remainder of this paper remains valid for any (graded) multivariate
monomial ordering.
Figure 1: The block Macaulay matrix M of degree d = 6 of a quadratic two-parameter eigenvalue problem (d_S = 2). The elements of the seed equation, i.e., the generating MEP, are indicated in red, while the elements of the FSRs are indicated in blue (the elements not shown are zero). Vertical lines indicate the different degree blocks, while horizontal dashed lines separate FSRs of different total degree. Clearly, the block Macaulay matrix exhibits a quasi-Toeplitz structure.
Definition 3. Consider two n-tuples α, β ∈ N^n. If |α| < |β|, or if |α| = |β| and the left-most non-zero element of the element-wise difference β − α ∈ Z^n is negative, then the two monomials λ^α and λ^β are ordered λ^α < λ^β by the degree negative lexicographic ordering.
Example 6. The degree negative lexicographic ordering orders the monomials of maximal total degree d = 2 in n = 3 variables as

$$1 < \lambda_1 < \lambda_2 < \lambda_3 < \lambda_1^2 < \lambda_1\lambda_2 < \lambda_1\lambda_3 < \lambda_2^2 < \lambda_2\lambda_3 < \lambda_3^2.$$

Example 7. Consider a quadratic two-parameter eigenvalue problem, i.e.,

$$\left( A_{00} + A_{10}\lambda_1 + A_{01}\lambda_2 + A_{20}\lambda_1^2 + A_{11}\lambda_1\lambda_2 + A_{02}\lambda_2^2 \right) z = 0,$$

and multiply it by monomials of increasing total degree, i.e.,

$$\underbrace{\lambda_1, \lambda_2}_{d_{FSR}=1}, \quad \underbrace{\lambda_1^2, \lambda_1\lambda_2, \lambda_2^2}_{d_{FSR}=2}, \quad \underbrace{\lambda_1^3, \dots}_{d_{FSR}\geq 3}$$
The MEP and its FSRs result in the corresponding block Macaulay matrix M (d) and vector v(d):
$$\begin{bmatrix}
A_{00} & A_{10} & A_{01} & A_{20} & A_{11} & A_{02} & 0 & \cdots \\
0 & A_{00} & 0 & A_{10} & A_{01} & 0 & A_{20} & \cdots \\
0 & 0 & A_{00} & 0 & A_{10} & A_{01} & 0 & \cdots \\
0 & 0 & 0 & A_{00} & 0 & 0 & A_{10} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}
\begin{pmatrix}
z \\ \lambda_1 z \\ \lambda_2 z \\ \lambda_1^2 z \\ \lambda_1\lambda_2 z \\ \lambda_2^2 z \\ \lambda_1^3 z \\ \vdots
\end{pmatrix} = 0.$$
When we further increase the block Macaulay matrix via FSRs of higher total degree, we obtain a sparse and structured matrix, as visualized in Figure 1.
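The construction of Example 7 generalizes to any MEP stored as a map from exponent tuples to coefficient matrices; the following Python sketch (our own helper functions, under the degree negative lexicographic ordering) builds M(d):

```python
import numpy as np
from functools import cmp_to_key
from itertools import product

def monomials(n, d):
    """Exponent tuples of total degree <= d in degree negative lex order."""
    def cmp(a, b):
        if sum(a) != sum(b):
            return -1 if sum(a) < sum(b) else 1
        for diff in (y - x for x, y in zip(a, b)):
            if diff:
                return -1 if diff < 0 else 1
        return 0
    return sorted((m for m in product(range(d + 1), repeat=n) if sum(m) <= d),
                  key=cmp_to_key(cmp))

def block_macaulay(mep, n, d_s, d, k, l):
    """Block Macaulay matrix M(d) of an MEP given as a dict omega -> A_omega."""
    shifts = monomials(n, d - d_s)     # row blocks: FSR shift monomials
    cols = monomials(n, d)             # column blocks: all monomials up to d
    index = {m: i for i, m in enumerate(cols)}
    M = np.zeros((k * len(shifts), l * len(cols)))
    for r, s in enumerate(shifts):
        for omega, A in mep.items():
            # shifting by s moves A_omega to the column block of s + omega
            c = index[tuple(si + wi for si, wi in zip(s, omega))]
            M[k * r:k * (r + 1), l * c:l * (c + 1)] = A
    return M

# Quadratic 2-EP (d_S = 2) with random 3 x 2 coefficient matrices, as in
# Example 7; degree d = 3 adds one layer of FSRs (shifts 1, l1, l2).
rng = np.random.default_rng(0)
mep = {w: rng.standard_normal((3, 2))
       for w in [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]}
M = block_macaulay(mep, n=2, d_s=2, d=3, k=3, l=2)
```

For d = 3 this yields 3 shift blocks of 3 rows against 10 column blocks of 2 columns, so M is 9 × 20, and the first block row reproduces the seed equation.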
Remark 2. Note that we make a distinction between blocks and degree blocks in this paper. A block gathers all the rows or columns that correspond to one monomial (e.g., all the rows that belong to λ_1^2), while a degree block contains all the blocks that correspond to monomials of the same total degree (e.g., all the rows that belong to λ_1^2, λ_1λ_2, and λ_2^2). We separate different degree blocks in matrices and vectors with horizontal and vertical lines.
The vector v(d) is a vector in the (right) null space of the block Macaulay matrix M (d) and has a special block multivariate Vandermonde structure, because of the monomial ordering of the columns of the block Macaulay matrix. In the structure of both the null space (Section 4) and the column space (Section 5) of the block Macaulay matrix lies the key to solve its generating MEP. To alleviate the notational complexity in this paper, we no longer specify the degree d explicitly, but we assume it to be large enough.
4. Null space based solution approach
We now exploit the special block multivariate Vandermonde structure of the null space of the block Macaulay matrix in order to find the solutions of its seed equation, i.e., the MEP that we want to solve.
We first assume that all solutions are simple and affine, which allows us to show that a multidimensional realization problem in the structured null space yields the exact solutions of the MEP (Section 4.1). This multidimensional realization problem has a profound system theoretic interpretation. Next, we show how to deal with multiplicities and how to deflate the solutions at infinity (Section 4.2). Finally, we summarize the different steps of the null space based algorithm (Section 4.3).
4.1. Multidimensional realization theory
We consider an MEP M(λ_1, …, λ_n) z = 0 that only has, for didactic purposes, m_a simple and affine solutions, i.e., an MEP with an affine zero-dimensional solution set B_M (see below for the cases with multiple solutions and solutions at infinity). When we iteratively increase the degree d of the block Macaulay matrix M, we notice that the nullity (i.e., the dimension of the null space) grows. However, at a certain degree d^∗ the nullity reaches the number of affine solutions m_a and remains the same for larger degrees d (the degree d ≥ d^∗ is now large enough and the nullity has stabilized at m_a). Every solution of the MEP corresponds to one block multivariate Vandermonde vector in the null space and, together, these basis vectors span the entire null space of the block Macaulay matrix. They naturally form the block multivariate Vandermonde basis V ∈ C^{q×m_a} of degree d ≥ d^∗:

$$V = \begin{bmatrix}
z|_{(1)} & \cdots & z|_{(m_a)} \\
\lambda_1 z|_{(1)} & \cdots & \lambda_1 z|_{(m_a)} \\
\vdots & & \vdots \\
\lambda_n z|_{(1)} & \cdots & \lambda_n z|_{(m_a)} \\
\lambda_1^2 z|_{(1)} & \cdots & \lambda_1^2 z|_{(m_a)} \\
\vdots & & \vdots
\end{bmatrix},$$

which has one column v|_{(j)}, j = 1, …, m_a, for every affine solution of the MEP. From this block multivariate Vandermonde basis, and as thoroughly explained in Appendix A, it follows that the null space of the block Macaulay matrix has a special structure:
Proposition 1. The null space of the block Macaulay matrix is block multi-shift-invariant.
Example 8. We consider again the linear 2-EP from Example 4, which has three affine solutions (m_a = 3)³:

Solutions      (λ_1, λ_2)     z
Solution 1     (α_1, α_2)     r
Solution 2     (β_1, β_2)     s
Solution 3     (γ_1, γ_2)     t
³Note that we compute the exact numerical values of these solutions in Section 6. The variables in this example can be seen as placeholders for the numerical values in Section 6.
The block multivariate Vandermonde basis V of the null space of the corresponding block Macaulay matrix M of this MEP, with degree d = 2, clearly exhibits a block multi-shift-invariant structure:

$$V = \begin{bmatrix}
r & s & t \\
\alpha_1 r & \beta_1 s & \gamma_1 t \\
\alpha_2 r & \beta_2 s & \gamma_2 t \\
\alpha_1^2 r & \beta_1^2 s & \gamma_1^2 t \\
\alpha_1\alpha_2 r & \beta_1\beta_2 s & \gamma_1\gamma_2 t \\
\alpha_2^2 r & \beta_2^2 s & \gamma_2^2 t
\end{bmatrix}.$$
A shift with λ_1 (i.e., with α_1, β_1, and γ_1) of the first three blocks of this basis results again in three blocks from that basis:

$$\begin{bmatrix}
r & s & t \\
\alpha_1 r & \beta_1 s & \gamma_1 t \\
\alpha_2 r & \beta_2 s & \gamma_2 t
\end{bmatrix}
\xrightarrow{\lambda_1}
\begin{bmatrix}
\alpha_1 r & \beta_1 s & \gamma_1 t \\
\alpha_1^2 r & \beta_1^2 s & \gamma_1^2 t \\
\alpha_1\alpha_2 r & \beta_1\beta_2 s & \gamma_1\gamma_2 t
\end{bmatrix}.$$
We can formulate the block multi-shift-invariance property mathematically by means of row selection matrices, i.e.,

$$S_1 = \begin{bmatrix}
I & 0 & 0 & 0 & 0 & 0 \\
0 & I & 0 & 0 & 0 & 0 \\
0 & 0 & I & 0 & 0 & 0
\end{bmatrix}
\quad \text{and} \quad
S_{\lambda_1} = \begin{bmatrix}
0 & I & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & I & 0 & 0 \\
0 & 0 & 0 & 0 & I & 0
\end{bmatrix},$$

which select the specific rows before and after the multiplication (or shift operation) with λ_1, respectively. Note that the selection matrices consist of submatrices (I ∈ N^{2×2} is the identity matrix of appropriate size), because we select all the rows of a block during this shift operation. The multiplication is then written as a diagonal matrix, i.e.,

$$D_{\lambda_1} = \begin{bmatrix}
\alpha_1 & 0 & 0 \\
0 & \beta_1 & 0 \\
0 & 0 & \gamma_1
\end{bmatrix},$$

where the diagonal elements are the evaluations of the shift λ_1 in the different solutions. The shift-invariance property of the null space now corresponds to

$$S_1 V D_{\lambda_1} = S_{\lambda_1} V.$$
This shift-invariance property does not restrict itself to the first eigenvalue λ_1, but holds for any shift polynomial g(λ_1, …, λ_n) in the eigenvalues of the MEP. Hence, we obtain a generalized eigenvalue problem,

$$S_1 V D_g = S_g V, \qquad (5)$$

where the diagonal matrix D_g ∈ R^{m_a×m_a} contains the evaluations of the shift polynomial g(λ_1, …, λ_n) in the different solutions of the MEP and the matrix of eigenvectors is equal to the identity matrix. In order for this generalized eigenvalue problem not to have degeneracies (Moler and Stewart, 1973), the matrix S_1 V has to be non-singular. This means that the row selection matrix S_1 ∈ R^{m_a×q} has to select m_a linearly independent rows⁴ from the block multivariate Vandermonde basis V (then S_1 V is square and regular). The matrix S_g ∈ R^{m_a×q}, on the other hand, simply selects the rows obtained after the multiplication (or shift) with the shift polynomial g(λ_1, …, λ_n). Actually, from algebraic geometry and from earlier work on the traditional Macaulay matrix, it follows that these linearly independent rows of the null space correspond to the different solutions of the MEP (Cox et al., 2004; Dreesen, 2013).
⁴Since we, in general, only select linearly independent rows and not blocks from the basis, we do not fully exploit the block multi-shift-invariance of the null space. As we mention in Section 7, algorithms that efficiently work with entire blocks instead of rows remain a topic for future work.
Example 9. We revisit our previous example and consider a more advanced shift polynomial, namely g(λ_1, λ_2) = 2λ_1^2 + 3λ_2. A shift of a row (or block) of the basis V now results in a combination of rows (or blocks) of that basis. As mentioned above, we can write this mathematically via row selection matrices:

$$S_1 V D_g = S_g V,$$

where S_1 and S_g select the rows (or blocks) before and after the shift, respectively. The diagonal matrix

$$D_g = \begin{bmatrix}
2\alpha_1^2 + 3\alpha_2 & 0 & 0 \\
0 & 2\beta_1^2 + 3\beta_2 & 0 \\
0 & 0 & 2\gamma_1^2 + 3\gamma_2
\end{bmatrix}$$

contains the evaluations of the shift polynomial g(λ_1, λ_2) in the different solutions (or columns) of the null space.
In practice, we do not know the block multivariate Vandermonde basis V in advance, since it is constructed from the unknown solutions of the MEP. Therefore, as the block multi-shift-invariance is a property of the null space and not of its specific basis (see Appendix A), we work with a numerical basis Z ∈ R^{q×m_a} of the null space of the block Macaulay matrix M instead, for example a numerical basis obtained via the singular value decomposition. There exists, of course, a relation between these two bases, namely V = ZT, with T ∈ C^{m_a×m_a} a non-singular transformation matrix, which transforms (5) into a solvable GEP,

$$\left( S_1 Z \right) T D_g = \left( S_g Z \right) T, \qquad (6)$$

where T contains the eigenvectors and D_g the eigenvalues of the matrix pencil (S_1 Z, S_g Z). Alternatively, we can also consider the SEP:

$$T D_g T^{-1} = \left( S_1 Z \right)^{-1} \left( S_g Z \right). \qquad (7)$$

We can then use the matrix of eigenvectors T to retrieve the block multivariate Vandermonde matrix V, via V = ZT, and hence, find the solutions of the MEP.
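The passage from a numerical basis Z to the shift evaluations via the SEP (7) can also be checked numerically; the sketch below again uses placeholder solutions and mixes the Vandermonde basis with a generic (in practice unknown) transformation T:

```python
import numpy as np

rng = np.random.default_rng(1)

# Placeholder Vandermonde basis (stand-in solutions, degree d = 2, l = 2).
lam1 = np.array([0.3, -1.2, 2.0])
lam2 = np.array([1.1, 0.4, -0.7])
Zsol = rng.standard_normal((2, 3))
V = np.vstack([Zsol * w for w in
               (np.ones(3), lam1, lam2, lam1 ** 2, lam1 * lam2, lam2 ** 2)])

# A numerical basis spans the same null space: Z = V T^{-1} for some unknown
# non-singular T (in practice Z would come from an SVD of the block
# Macaulay matrix).
T_true = rng.standard_normal((3, 3))
Z = V @ np.linalg.inv(T_true)

S1 = [0, 1, 2]    # three (generically) linearly independent rows
Sg = [2, 3, 6]    # their images under the shift g = lambda_1

# SEP (7): the eigenvalues of (S1 Z)^{-1} (Sg Z) are the evaluations of g
# in the solutions, here simply alpha_1, beta_1, gamma_1.
eigvals, T = np.linalg.eig(np.linalg.solve(Z[S1], Z[Sg]))
assert np.allclose(np.sort(eigvals.real), np.sort(lam1))
```

The recovered eigenvalues are independent of the mixing T, which illustrates that the shift structure is a property of the null space itself, not of the chosen basis.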
4.2. Multiple solutions and solutions at infinity
The above-mentioned null space based approach only considers MEPs with simple and affine solutions in a zero-dimensional solution set. We now cover the difficulties of multiple solutions and solutions at infinity. As the next paragraphs explain, they usually pose no insurmountable problems for the null space based algorithm.
4.2.1. Multiple solutions
When all solutions are simple, we find one column in the block multivariate Vandermonde basis V of the null space for every solution of the MEP and every column contributes to the nullity of the block Macaulay matrix. However, if multiple solutions prevail, the null space of the block Macaulay matrix no longer contains only the block multivariate Vandermonde solution vectors, but also linear combinations of the partial derivatives of these solution vectors, i.e., we have a confluent block multivariate Vandermonde matrix (Möller and Stetter (1995) and Dayton et al. (2011) give an elaborate exposition in the case of systems of multivariate polynomial equations). Except for the well-known loss of accuracy in computing multiple eigenvalues (Chu, 1990; Golub and Van Loan, 2013), multiplicity poses no problems and the above-mentioned null space based approach can be used without any modifications.
4.2.2. Solutions at infinity
Beside affine solutions, MEPs can also have solutions at infinity, due to the singularity of some higher degree coefficient matrices. As for the case with only isolated affine solutions in a zero-dimensional solution set, the nullity of the block Macaulay matrix also grows when we increase its degree d and it stabilizes at a certain nullity m_b. We denote this degree of the block Macaulay matrix the desired degree d = d^∗. The nullity of the block Macaulay matrix after stabilization (d ≥ d^∗) corresponds to the total number of solutions
Figure 2: The nullity of the null space of the block Macaulay matrix M grows as its degree d increases. However, at a certain degree d ≥ d^∗ (in this example d^∗ = 4), the nullity stabilizes at the total number of solutions m_b. From that degree on, some linearly independent rows (corresponding to the affine solutions, indicated by dashed lines) stabilize, while the other linearly independent rows (corresponding to the solutions at infinity, also indicated by dashed lines) move to higher degree blocks, and a gap emerges that separates these two types of linearly independent rows. The influence of the solutions at infinity can then be deflated via a column compression. Afterwards, the affine algorithm can be applied straightforwardly on the compressed basis of the null space.
of the MEP, which now equals the sum of the affine solutions and the solutions at infinity (m_b = m_a + m_∞). Every solution spans one basis vector in this null space, hence all the columns of the numerical basis are linear combinations of both affine solutions and solutions at infinity. In the null space of this block Macaulay matrix, we now find not only linearly independent rows that correspond to affine solutions, but also linearly independent rows that correspond to solutions at infinity. When the degree d of the block Macaulay matrix M further increases (d > d^∗), some of the linearly independent rows no longer move to higher total degree blocks, but remain fixed at their respective position (they correspond to the affine solutions), while other linearly independent rows keep moving to a higher total degree (they correspond to the solutions at infinity). A gap without linearly independent rows eventually emerges and helps to separate the affine solutions from those at infinity via a column compression (Dreesen, 2013):
Definition 4. A numerical basis Z = [Z_1^T  Z_2^T]^T of the null space of the block Macaulay matrix M is a q × m_b matrix, which can be partitioned into a k × m_b matrix Z_1 (that contains the part with the affine solutions and the gap) and a (q − k) × m_b matrix Z_2 (that contains the part with the solutions at infinity), with rank(Z_1) = m_a < m_b. Furthermore, let the singular value decomposition of Z_1 = UΣQ^T. Then, W = ZQ is called the column compression of Z and can be partitioned as

$$W = \begin{bmatrix} W_{11} & 0 \\ W_{21} & W_{22} \end{bmatrix},$$

where W_{11} is the k × m_a matrix that corresponds to the compressed numerical basis of the null space.

This column compression deflates the solutions at infinity and allows us to straightforwardly use the above-mentioned affine null space based approach as if no solutions at infinity were present (we simply replace Z in (6) by W_{11}), provided that the gap can accommodate the shift polynomial (a shift polynomial of degree d_g requires a gap of at least d_g degree blocks). Figure 2 visualizes the emergence of this gap and the result of the column compression.
The null space of the block Macaulay matrix can also be interpreted as the observability matrix of a multidimensional linear time-invariant system (De Moor, 2019; Vermeersch and De Moor, 2019).
4.3. Null space based algorithm
Algorithm 1 summarizes the different steps to solve an MEP via the null space of the block Macaulay matrix.
Remark 3. In the exposition above, we have always assumed that the solution set B_M of the MEP is zero-dimensional, i.e., all solutions are isolated points in the solution space. However, sometimes MEPs violate this assumption and have a positive-dimensional solution set. In those situations, the nullity of the block Macaulay matrix keeps increasing when the degree d grows and never stabilizes at a fixed value. Algorithm 1 fails to retrieve the solutions of the MEP in that situation, except for the special case in which the affine solution set is zero-dimensional and the positive-dimensional part of the solution set entirely exists at infinity (the number of affine solutions m_a is fixed). Then, a gap still appears and Algorithm 1 works after a column compression of the null space. We no longer wait for the stabilization of the nullity as trigger for d^∗, but we look at the stabilization of the affine part of the null space and the emergence of a gap.
Algorithm 1 Null space based approach
1: Construct the block Macaulay matrix $M$ of large enough degree $d > d^*$ (Section 4.2).
2: Compute a numerical basis $Z$ of the null space of $M$.
3: Determine the gap and the number of affine solutions $m_a$ via rank tests (Section 4.2).
4: Use Definition 4 to obtain the compressed numerical basis $W_{11}$ of the null space (note that if $m_b = m_a$, then $W_{11} = Z$).
5: For a user-defined shift polynomial $g(\lambda_1, \ldots, \lambda_n)$, solve the GEP $(S_1 W_{11})\, T D_g = (S_g W_{11})\, T$, where the matrices $S_1$, $S_g$, $T$, and $D_g$ are defined as in (6).
6: Retrieve the different components of the solutions from the block multivariate Vandermonde basis $V = W_{11} T$.
5. Column space based approach
In this section, we consider the column space of the block Macaulay matrix instead of its null space. The natural complementarity between both fundamental subspaces (Section 5.1) enables a new, complementary algorithm to solve MEPs, which works directly on the sparse and structured data (Section 5.2). We introduce the backward QR-decomposition, which is an essential tool in this complementary algorithm and serves a dual pre-processing purpose (Section 5.3). Finally, we summarize the different steps of the column space based algorithm (Section 5.4).
5.1. Complementarity between the null space and the column space of an arbitrary matrix
The null space and column space of an arbitrary matrix share an intrinsic complementarity (Dreesen, 2013):
Lemma 1. We consider an arbitrary matrix $M \in \mathbb{R}^{p \times q}$, with $\operatorname{rank}(M) = r$, and a basis $Z \in \mathbb{C}^{q \times m_b}$ of its null space, with $\operatorname{rank}(Z) = q - r$ because of the rank-nullity theorem (Roman, 2008). We reorder the rows of $Z = \begin{bmatrix} Z_A^T & Z_B^T \end{bmatrix}^T$, such that the submatrix $Z_A$ contains exactly $q - r$ linearly independent rows, and partition the columns of the matrix $M = \begin{bmatrix} M_A & M_B \end{bmatrix}$ accordingly (i.e., $MZ = M_A Z_A + M_B Z_B = 0$). This reordering and partitioning are generally not unique, but always exist. One can easily prove that $\operatorname{rank}(M_B) = r \Leftrightarrow \operatorname{rank}(Z_A) = q - r$.
We can now apply Lemma 1 to the block Macaulay matrix and any basis of its null space. The solutions of the MEP give rise to linearly independent rows in that basis. When we check the rank of the basis of the null space row-wise from top to bottom, every linearly independent row corresponds to exactly one solution.
If we gather these linearly independent rows in a matrix $Z_A$, which has full rank $q - r$, then we know, because of Lemma 1, that there exists a partitioning $M_B$ of the columns of the block Macaulay matrix, which has full rank $r$. Consequently, the remaining columns $M_A$ of the block Macaulay matrix depend linearly on the columns of $M_B$. These linearly dependent columns of $M_A$ correspond to the linearly independent rows of the null space (gathered in $Z_A$), and hence to the solutions of the MEP.

Figure 3: The solutions of the MEP give rise to linearly independent rows in any basis $Z$ of the null space of the block Macaulay matrix $M$ (indicated with dashed lines). If we check the rank of any basis $Z$ of the null space row-wise from top to bottom, every linearly independent row corresponds to one solution. Because of the complementarity between the null space and column space of the block Macaulay matrix $M$, its linearly dependent columns (also indicated with dashed lines), checked column-wise from right to left, correspond to the same solutions. The complementarity between both fundamental subspaces of the block Macaulay matrix $M$ creates many similarities between properties in the null space and column space, of which the algorithms in this paper are a key example.
Corollary 1. The solutions of an MEP give rise to both linearly independent rows in the null space (checked row-wise from top to bottom) and to linearly dependent columns in the column space (checked column-wise from right to left). Figure 3 visualizes the complementarity between both fundamental subspaces of the block Macaulay matrix and the relation between the linearly dependent columns and linearly independent rows.
5.2. Equivalent column space realization theory
If we consider a block Macaulay matrix $M \in \mathbb{R}^{p \times q}$, with large enough degree $d > d^*$, such that the nullity has stabilized at the total number of solutions $m_b$ and a large enough gap has emerged, then we know that
\[
M W = M \begin{bmatrix} W_{11} & 0 \\ W_{21} & W_{22} \end{bmatrix} = 0,
\]
where $W \in \mathbb{C}^{q \times m_b}$ is a special compressed block multivariate Vandermonde basis of the null space, in which the matrix $W_{11}$ contains the part with the affine solutions and the gap, while the matrices $W_{21}$ and $W_{22}$ correspond to the part with the solutions at infinity.
Next, we define two new matrices $A$ and $B$. The matrix $A \in \mathbb{C}^{m_a \times m_a}$ contains all the linearly independent rows of the matrix $W_{11}$, which correspond to the monomials that lead to the affine solutions. If we multiply (or shift) these rows by the user-defined shift polynomial $g(\lambda_1, \ldots, \lambda_n)$, we obtain $m_a - m_h$ rows that are present in the matrix $A$ and $m_h$ rows that are not. We gather these $m_h$ non-present rows in the matrix $B \in \mathbb{C}^{m_h \times m_a}$ and rewrite this shift relationship as
\[
A D_g = S_g \begin{bmatrix} A \\ B \end{bmatrix},
\]
with $S_g$ an $m_a \times (m_a + m_h)$ matrix that selects the $m_a$ combinations of rows obtained after the multiplication. The matrix $D_g$ is a diagonal matrix that contains again the evaluations of the shift polynomial $g(\lambda_1, \ldots, \lambda_n)$ in the different eigenvalue solutions. If we extract the matrix $A$ from the right-hand side, an SEP appears
\[
A D_g = S_g \begin{bmatrix} I \\ B A^{-1} \end{bmatrix} A, \quad \text{or} \quad A D_g A^{-1} = S_g \begin{bmatrix} I \\ B A^{-1} \end{bmatrix}. \tag{8}
\]
The matrix $A$ is invertible because it contains $m_a$ linearly independent rows by construction. Of course, since we do not know the matrices $A$ and $B$ in advance, we cannot solve this SEP right away. In the remainder of this section, we circumvent this problem via the complementarity between the null space and column space of the block Macaulay matrix.
The matrices $A$ and $B$ contain rows of the basis of the null space, in particular of the matrix $W_{11}$. If we define the matrix $C \in \mathbb{C}^{(k - m_a - m_h) \times m_a}$ as the matrix that contains the remaining rows of $W_{11}$, then we can reorder the matrix $W$ as
\[
P W = \begin{bmatrix} A & 0 \\ B & 0 \\ C & 0 \\ W_{21} & W_{22} \end{bmatrix}.
\]
The matrix $P$ is a $q \times q$ row-permutation matrix. We can rearrange the columns of the block Macaulay matrix in accordance with the reordered basis of the null space and obtain
\[
\underbrace{\begin{bmatrix} M_1 & M_2 & M_3 & M_4 \end{bmatrix}}_{N} \begin{bmatrix} A & 0 \\ B & 0 \\ C & 0 \\ W_{21} & W_{22} \end{bmatrix} = 0, \tag{9}
\]
where every matrix $M_i$ corresponds to a subset of the columns of the block Macaulay matrix (this partitioning of $M$ follows the reordering of $W$). We call the matrix $N = M P^T$ the reordered block Macaulay matrix.
Now, we apply the so-called backward QR-decomposition⁵ on this reordered matrix $N$, which yields
\[
Q \begin{bmatrix} R_{14} & R_{13} & R_{12} & R_{11} \\ R_{24} & R_{23} & R_{22} & 0 \\ R_{34} & R_{33} & 0 & 0 \\ R_{44} & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} A & 0 \\ B & 0 \\ C & 0 \\ W_{21} & W_{22} \end{bmatrix} = 0,
\]
or, if we pre-multiply both sides by $Q^{-1} = Q^T$,
\[
\begin{bmatrix} R_{14} & R_{13} & R_{12} & R_{11} \\ R_{24} & R_{23} & R_{22} & 0 \\ R_{34} & R_{33} & 0 & 0 \\ R_{44} & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} A & 0 \\ B & 0 \\ C & 0 \\ W_{21} & W_{22} \end{bmatrix} = 0.
\]
We notice that $R_{33} B = -R_{34} A$, which means that
\[
B A^{-1} = -R_{33}^{-1} R_{34},
\]
because $R_{33}$ is always of full rank (since the rows of $B$ depend linearly on the rows of $A$ and the complementarity of Lemma 1). This relation helps to remove the dependency on the null space in (8) and yields a solvable SEP
\[
A D_g A^{-1} = S_g \begin{bmatrix} I \\ -R_{33}^{-1} R_{34} \end{bmatrix}. \tag{10}
\]
⁵We cover the implications of the backward QR-decomposition in depth in Section 5.3. Essentially, the backward QR-decomposition orthogonalizes the reordered matrix $N$ like the traditional forward QR-decomposition, but starts with the last column of $N$ (instead of its first column) and iteratively works towards the first column of $N$.
Consequently, the eigenvalues in $D_g$ and eigenvectors in $A$ yield the solutions of the MEP. Note that this complementary column space approach does not require a column compression to remove the influence of the solutions at infinity, because the backward QR-decomposition already separates them implicitly.
Remark 4. Contrary to the null space based approach, the user-defined shift polynomial $g(\lambda_1, \ldots, \lambda_n)$ has an influence on the reconstruction of the solutions. If not all components of the solutions are present in the matrix $A$, then we must select a special shift polynomial such that we can reconstruct the entire solution from the eigenvalues, as illustrated by the next two examples.
Example 10. Once more, we revisit the linear 2-EP from Example 4. As mentioned above, this MEP has $m_a = 3$ affine solutions. Moreover, the first three columns (which correspond to the variables $z_1$, $z_2$, and $\lambda_1 z_1$) of the corresponding block Macaulay matrix $M$ are linearly dependent on the other columns (checked column-wise from right to left). When we solve the SEP in (10), the matrix $A \in \mathbb{R}^{3 \times 3}$ corresponds to the first three rows of the basis $W$ of the null space. Hence, the matrix $A$ contains the variables $z_1$, $z_2$, and $\lambda_1 z_1$ evaluated in each of the affine solutions. In order to retrieve information about $\lambda_2$, we need to shift with a polynomial that contains $\lambda_2$, e.g., $g(\lambda_1, \lambda_2) = \lambda_2$.
Example 11. Although we can most of the time find a shift polynomial such that we can retrieve all components of the solutions, this is not always the case. Let us, therefore, consider a linear three-parameter eigenvalue problem (linear 3-EP),
\[
\left( A_{000} + A_{100} \lambda_1 + A_{010} \lambda_2 + A_{001} \lambda_3 \right) z = 0, \tag{11}
\]
with four coefficient matrices $A_{\omega} \in \mathbb{R}^{4 \times 2}$,
\[
A_{000} = \begin{bmatrix} 2 & 3 \\ 2 & 5 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}, \quad
A_{100} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \\ 2 & 1 \end{bmatrix}, \quad
A_{010} = \begin{bmatrix} 4 & 2 \\ 2 & 3 \\ 3 & 1 \\ 3 & 1 \end{bmatrix}, \quad \text{and} \quad
A_{001} = \begin{bmatrix} 1 & 2 \\ 1 & 4 \\ 2 & 1 \\ 4 & 2 \end{bmatrix}.
\]
This problem has $m_a = 4$ affine solutions and the first four columns (which correspond to the variables $z_1$, $z_2$, $\lambda_1 z_1$, and $\lambda_1 z_2$) of the corresponding block Macaulay matrix $M$ are linearly dependent on the other columns (checked column-wise from right to left). Thus, the matrix $A$ contains references to the variables $z_1$, $z_2$, $\lambda_1 z_1$, and $\lambda_1 z_2$ evaluated in each of the affine solutions, but no references to $\lambda_2$ or $\lambda_3$. Hence, not one, but two shift polynomials (with references to $\lambda_2$ and $\lambda_3$) are required to find all the components of the solutions via the eigenvalues of two different SEPs.
5.3. Backward QR-decomposition
The backward QR-decomposition triangularizes a matrix $N$ from the back, i.e., it starts with the last column and iteratively works towards the first column of that matrix. Its result is similar to the result of the traditional forward QR-decomposition of the matrix with all its columns flipped. The backward QR-decomposition also yields an upper triangular matrix $R_B$, just like the upper triangular matrix $R_F$ of the forward QR-decomposition after flipping all the columns. Figure 4 compares both approaches visually.
One particular technique to compute the backward QR-decomposition, next to Givens rotations, is via Householder reflections (Golub and Van Loan, 2013; Trefethen and Bau, 1997). A Householder reflection is an orthogonal transformation
\[
V = I - \beta v v^T, \quad \beta = \frac{2}{v^T v},
\]
in which the vector $v$ is called the Householder vector. If a vector $x$ is multiplied by the symmetric and orthogonal Householder matrix $V$, then it is reflected in the hyperplane $\operatorname{span}\{v\}^{\perp}$ (Figure 5a). In particular, suppose that we consider a vector $x \in \mathbb{R}^m \setminus \{0\}$ and we want to reflect it in a hyperplane $\operatorname{span}\{v\}^{\perp}$ onto the unit vector $e_1$ ($e_1$ is the first column of the identity matrix $I$ of order $m$), then we choose a specific $v = x \mp \|x\|_2 e_1$, such that
\[
V x = \left( I - 2 \frac{v v^T}{v^T v} \right) x = \pm \|x\|_2 e_1.
\]
Figure 4: (a) Forward QR-decomposition; (b) Backward QR-decomposition. The traditional forward QR-decomposition triangularizes the matrix $N$ from the first column (left) towards the last column (right). The $k$-th column (counted from the left; dashed arrow) can be written as a linear combination of the first $k$ columns of the orthonormal matrix $Q_F$, the weights of the linear combination given by the $k$-th column of the upper triangular matrix $R_F$. The first $k$ columns of $Q_F$ form an orthonormal basis for the subspace spanned by the first $k$ columns of $N$, hence the triangular structure of $R_F$. The backward QR-decomposition, on the other hand, starts with the last column (right) of the matrix $N$ and triangularizes $N$ towards the first column (left). The $(q-k)$-th column (or the $k$-th column counted from the right; dashed arrow) can then be written as a linear combination of the first $k$ columns of the orthonormal matrix $Q_B$, the weights now given by the $(q-k)$-th column of the upper triangular matrix $R_B$. Therefore, the matrix $R_B$ has a flipped upper triangular structure compared to $R_F$.
Figure 5: (a) Projection; (b) Matrix structure. Householder reflections can be applied recursively on a smaller submatrix $N_*$ in order to triangularize a matrix $N$ via orthogonal reflections. In every step, the columns $n_{*k}$ of $N_*$ are reflected in the hyperplane orthogonal to the Householder vector $v$, such that the right-most column of $N_*$ is reflected onto the unit vector of appropriate dimensions. This gradual procedure results in the backward upper triangular matrix $R_B$ (see Figure 4b). A column $n_{*l}$ that is a scalar multiple of $n_{*k}$ is clearly reflected onto the same unit vector.
Householder reflections can be applied to gradually transform a matrix $N$ to an upper triangular form $R_B$ via subsequent multiplications with orthogonal reflection matrices $V$. In every step, we multiply the non-triangularized submatrix $N_*$ of $N$ by the new Householder matrix to introduce zeros under the anti-diagonal (Figure 5b). The right-most column $n_{*k}$ of the non-triangularized submatrix $N_*$ is then transformed into a scaled unit vector $\pm \|n_{*k}\|_2 e_1$ of appropriate size.
Within our column space based approach, the backward QR-decomposition has two tasks: it serves as a pre-processing step (to reduce the number of rows and to reveal the rank structure of the block Macaulay matrix) and creates the matrices $R_{33}$ and $R_{34}$ of the equivalent column space realization problem. We already considered the latter in the previous section and now elaborate on the former. Since the orthonormal matrix $Q_B$ is not used during either task, we often use the backward Q-less QR-decomposition in practical implementations.
Data-reduction technique. The block Macaulay matrix of large enough degree $d$ is usually tall, which means that it has more rows $p$ than columns $q$. Therefore, we can replace the block Macaulay matrix $M$ by the upper triangular matrix $R$ of its (forward or backward) Q-less QR-decomposition and reorder this upper triangular matrix $R$ instead as
\[
\underbrace{\begin{bmatrix} R_1 & R_2 & R_3 & R_4 \end{bmatrix}}_{N} \begin{bmatrix} A & 0 \\ B & 0 \\ C & 0 \\ W_{21} & W_{22} \end{bmatrix} = 0, \tag{12}
\]
where every matrix $R_i$ corresponds to a subset of the columns of the upper triangular matrix $R$ (this partitioning is similar to the partitioning of $M$ in (9)). If the number of rows $p$ is larger than the number of columns $q$, this first QR-decomposition serves as a data-reduction step and reduces the number of rows of the resulting matrix $N$ (because we remove the zero rows at the bottom).
Rank-revealing technique. In order to reorder the columns to obtain the reordered matrix $N$, whether it comes from the block Macaulay matrix $M$ or its upper triangular matrix $R$, we need to know which columns correspond to the affine solutions, i.e., we have to discover the linearly dependent columns or the rank structure of the block Macaulay matrix. This rank-revealing task is usually done by checking the rank of the block Macaulay matrix column-wise from right to left, via $q$ singular value decompositions or $q$ rank-revealing QR-decompositions of a growing matrix $\begin{bmatrix} m_{q-i} & \cdots & m_q \end{bmatrix}$, for $i = 0, \ldots, q-1$ (where $m_k$ is the $k$-th column of the matrix $M$). However, while computing a single backward Q-less QR-decomposition as a data-reduction technique (see previous paragraph), one can easily determine the linearly dependent columns of the block Macaulay matrix as a side result, without any dedicated rank checks⁶. Indeed, during every backward Householder iteration, the transformations reflect the columns $n_{*k}$ of the submatrix $N_*$ in the hyperplane $\operatorname{span}\{v\}^{\perp}$. If a column is linearly dependent on the previous columns, it lies at a certain point in the same space as the earlier reflected columns and a zero column appears. Figure 5a clearly visualizes what happens if a column $n_{*l}$ is linearly dependent on the previous column $n_{*k}$: the projection results in a vector that is also a scalar multiple of the unit vector. Hence, in the next step, the new column of $N_*$ only contains zeros. We can thus use this local approach to identify the linearly dependent columns, with caution for the numerical difficulties that arise when using these local rank operations on large, realistic matrices.
So, with only one backward Q-less QR-decomposition, we reduce the number of rows and reveal the rank-structure of the block Macaulay matrix. A second backward Q-less QR-decomposition on the reordered matrix N then builds the equivalent column space realization problem (10). Algorithm 2 summarizes the different steps of this backward Q-less QR-decomposition and points out how to reveal the structure of the linearly dependent columns as an additional side product.
Algorithm 2 Structure-revealing backward Q-less QR-decomposition
$A$ (with $q$ columns) corresponds to $M$ (pre-processing) or $N$ (create solvable SEP)
1: for $k = 0, \ldots, q - 1$ do
2:   $a_{*k} \leftarrow$ the right-most column of the non-triangularized submatrix $A_*$ in iteration $k$.
3:   if $\|a_{*k}\|_2^2 <$ tolerance then
4:     Indicate column $q - k$ as linearly dependent.
5:   end if
6:   Determine Householder vector $v_k$ of $a_{*k}$.
7:   Update matrix $A_*$.
8: end for
9: Take upper triangular part of $A$ as the matrix $R$.
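A possible numpy realization of Algorithm 2 is sketched below. This is our own illustration, not the authors' implementation; the tolerance, the skipping of (near-)zero pivot columns, and the small test matrix (whose third column is an exact multiple of its fourth) are all choices made for demonstration:

```python
import numpy as np

def backward_qless_qr(A, tol=1e-10):
    """Backward Q-less QR via Householder reflections; returns the
    triangular factor R and the 0-based indices of columns flagged as
    linearly dependent (near-zero pivot columns)."""
    R = A.astype(float).copy()
    p, q = R.shape
    dependent = []
    i0 = 0                            # first row of non-triangularized block
    for j in range(q - 1, -1, -1):    # right-most column first
        if i0 >= p:
            break
        a = R[i0:, j]
        if a @ a < tol:               # zero pivot column => dependent column
            dependent.append(j)
            continue
        v = a.copy()                  # Householder vector of the pivot column
        v[0] += np.copysign(np.linalg.norm(a), a[0])
        beta = 2.0 / (v @ v)
        R[i0:, :j + 1] -= np.outer(beta * v, v @ R[i0:, :j + 1])
        R[i0 + 1:, j] = 0.0           # enforce exact zeros below the pivot
        i0 += 1
    return R, dependent

# 4x4 example whose 3rd column (index 2) is twice its 4th column:
M = np.array([[1., 2., 6., 3.],
              [0., 1., 2., 1.],
              [2., 0., 8., 4.],
              [1., 1., 4., 2.]])
R, dep = backward_qless_qr(M)
assert dep == [2]                     # dependent column found as a side result
```

Note that the dependent column is detected without any dedicated rank check: after the reflections for the columns to its right, its entries in the non-triangularized block are already (numerically) zero.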
5.4. Column space based algorithm
Algorithm 3 recapitulates the different steps to solve MEPs directly from the data in the column space of its block Macaulay matrix.
6. Numerical experiments
In this section, we show several numerical examples to illustrate the column space based algorithm and to compare it with its null space based counterpart. We start with four synthetic problems, each of which highlights one particular aspect of the algorithms and is already used throughout this paper (Section 6.1).
Afterwards, we consider a more realistic problem, in which we identify the parameters of an ARMA model via an MEP (Section 6.2).
6.1. Four synthetic problems
Linear two-parameter eigenvalue problem with affine solutions only. In our first numerical example, we consider the linear 2-EP from Example 4, which only has affine solutions. In both the null space based and column space based approach, we build a large enough block Macaulay matrix, for example $d = 5 > d^*$. The block Macaulay matrix $M$ of degree $d = 5$ then has $p = 45$ rows and $q = 42$ columns. Via the singular
⁶The most numerically robust approach to determine the rank structure of a matrix, of course, remains the singular value decomposition, preferably on a series of matrices that grow with multiple columns at a time, although we did not encounter any numerical issues with this local column-wise operation during the different numerical examples of this paper. The coefficients of the block Macaulay matrix in our examples are mostly nice integer values, so it remains to be seen how this local approach behaves on large matrices with floating-point values obtained from practical problems.