Algebraic and Optimization-based Algorithms for Decomposing Tensors into Block Terms
Nico Vervliet, Ignat Domanov, Lieven De Lathauwer Department of Electrical Engineering (ESAT)—Stadius, KU Leuven
https://homes.esat.kuleuven.be/~nvervlie Abstract
Tensors, or multiway arrays of numerical values, and their decompositions have been applied suc- cessfully in a myriad of applications in, a.o., signal processing, data analysis and machine learning [1]. Key is the ability to use decompositions to extract simple components under mild uniqueness conditions. The simplest decomposition, the canonical polyadic decomposition (CPD), writes a tensor as a sum of rank-1 tensors. However, in some applications more complex terms are required and a block term decomposition (BTD) may be more suitable.
In this work, we consider the decomposition of a third-order I × J × K tensor T into a sum of multilinear rank (M
r, N
r, ·) terms:
T =
R
X
r=1
D
r·
1A
r·
2B
r(1)
in which D
r, A
rand B
rhave dimensions M
r× N
r× K, I × M
rand J × N
r, resp., ·
ndenotes the mode-n product, i.e., T = D ·
nA ⇔ T
(n)= AD
(n), and the subscript
(n)denotes the mode-n unfolding; see [1]. Alternatively, one can see this as the joint block diagonalization of the frontal slices T
k, k = 1, . . . , K:
T
k= A · blockdiag(D
1,k, . . . , D
R,k) · B
T, (2) in which A = [A
1, . . . , A
R] and B = [B
1, . . . , B
R]. This problem has been studied for some special cases; see, e.g., [2, 3]. We tackle the general BTD, which is often solved via a nonconvex optimization problem which has many local optima. For the CPD and the decomposition into multilinear rank (L
r, L
r, 1) terms (LL1), algebraic methods often provide suitable initializations.
Here, we aim to derive an algebraic and an optimization-based method for problem (1).
Algebraic algorithm We show that the decomposition (1) can be cast as a coupled simultaneous eigenvalue decomposition (CS-EVD) problem under mild assumptions. Let T ∈ K
I×J ×K(K can be R or C) admit a decomposition into a sum of R multilinear rank (M
r, N
r, ·) terms (Eq. (1)) and assume P
Rr=1
M
r≤ I and P
Rr=1
N
r≤ J . (Using compression by multilinear singular value decomposition [4], we can assume w.l.o.g. that these are equalities.) Let A and B have full column rank. Then, under some mild conditions, the decomposition in Eq. (1) is unique (see [5]), and can be reduced to a CS-EVD as follows. Let M ∈ K
IJ K×(I2+J2)be defined as
M =
T
T1⊗ I
I−I
J⊗ T
1.. . .. . T
TK⊗ I
I−I
J⊗ T
K
. (3)
We can show that null(M) =: N has dimension R and can be partitioned as X Y
in which X ∈ K
I2×R
and Y ∈ K
J2×R. Let X
r∈ K
I×I(Y
r∈ K
J ×J) be the reshaped rth column x
r(y
r),
1
r = 1, . . . , R, of X (Y). We can show that A and B can be recovered from the CS-EVD problem X
r= A · blockdiag(g
1rI
M1, . . . , g
RrI
MR) · A
−1, r = 1, . . . , R, (4) Y
r= B
−T· blockdiag(g
1rI
N1, . . . , g
RrI
NR) · B
T, r = 1, . . . , R. (5) Hence, we have two sets of simultaneous EVD problems that are coupled through the eigenvalues g
rs, r = 1, . . . , R, s = 1, . . . , R, which result in the factors A and B in Eq. (2). The core tensors D
rcan be recovered from Eq. (2) as A and B are invertible by assumption. Note that R, M
rand N
rcan be determined automatically from the dimension of the null space and the eigenvalues g
rs. To reduce the computational cost of computing the null space N of the IJ K ×(I
2+J
2) matrix M, an iterative EVD algorithm is used to compute N as the subspace corresponding to the zero eigenvalues of M
HM. The bottleneck computation is solving M
HMx = y for x every iteration. We show that the QR factorization of M = QR can be computed efficiently by exploiting the Kronecker and block structures using ideas from the block-QR and quasi-QR factorizations [6]. This way, M
HMx = y can be written as R
HRx = y which can be solved using forward and backward substitution as R is upper triangular. The cost of computing N then is O(J
4+ I
2J
2+ I
3) flop per iteration for the EVD step and O(J
6+ IJ
4K + I
2J
3K) flop for the QR step.
Optimization-based algorithm The CS-EVD problem (Eqs. (4)–(5)) is a coupled LL1 decom- position: by stacking X
rand Y
ras frontal slices in X and Y, resp., we have (using CPD notation):
X = JA, A
−T
, GP
1K, Y = JB
−T
, B, GP
2K, (6)
in which G collects the eigenvalues in Eqs. (4)–(5) and P
1= blockdiag(1
TM1
, . . . , 1
TMR
) and P
2= blockdiag(1
TN1, . . . , 1
TNR
). We propose a single-step optimization-based approach that exploits this and computes the null space and the decomposition simultaneously by solving
A,B,G
min 1 2
K
X
k=1
JA, T
TkA
−T, GP
1K − JT
kB
−T, B, GP
2K
2
F