A Link between Decomposition of a Tensor in
Multilinear Rank-(L, L, 1) Terms and
Simultaneous Block Diagonalization
Lieven De Lathauwer
Tech. Report, ESAT-SISTA, KU Leuven
March 31, 2012
Decomposition

T = Σ_{r=1}^{R} (A_r · B_r^T) ∘ c_r.  (1)

T_{KI×J} = (C ⊗ A) · B^T.  (2)

Conditions
We use the same bilinear mapping Φ as in [1]. The decomposition is essentially unique if the following three conditions are satisfied:

2. LR ≤ IJ and C ⊗ A has full column rank.

3. the tensors Φ(c_α A_α(:, l)^T, c_β A_β(:, m)^T), 1 ≤ α < β ≤ R, 1 ≤ l, m ≤ L, are linearly independent.
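The mapping Φ itself is not restated here. The sketch below assumes the rank-1-detecting definition recalled from [1], φ_{ijkl} = x_{ij}y_{kl} + y_{ij}x_{kl} − x_{il}y_{kj} − y_{il}x_{kj} (the exact index convention should be checked in [1]), and illustrates numerically the property used later in the derivation: Φ vanishes when both arguments are rank-1 matrices sharing the same column vector.

```python
import numpy as np

rng = np.random.default_rng(6)

def phi(X, Y):
    # phi[i, j, k, l] = x_ij y_kl + y_ij x_kl - x_il y_kj - y_il x_kj
    # (bilinear rank-1-detecting mapping, as recalled from [1])
    return (np.einsum('ij,kl->ijkl', X, Y) + np.einsum('ij,kl->ijkl', Y, X)
            - np.einsum('il,kj->ijkl', X, Y) - np.einsum('il,kj->ijkl', Y, X))

K, I = 5, 4
c = rng.standard_normal(K)
a = rng.standard_normal(I)
b = rng.standard_normal(I)

# alpha = beta: both rank-1 arguments share the vector c -> Phi vanishes
assert np.allclose(phi(np.outer(c, a), np.outer(c, b)), 0)
# two generic matrices -> Phi does not vanish
assert not np.allclose(phi(rng.standard_normal((K, I)),
                           rng.standard_normal((K, I))), 0)
```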
If these conditions are satisfied, then it is clear from (2) that the rank of T_{KI×J} is equal to LR.
Formulation as simultaneous block diagonalization
Economy-size SVD:

T_{KI×J} = U · Σ · V^H = E · V^H,  (3)
(Note that we split up in (U · Σ) · V^H and not U · (Σ · V^H) or (U · Σ^{1/2}) · (Σ^{1/2} · V^H). The reason is that if the components of B are uncorrelated (which is often the case in the long mode), the matrix W below is in principle orthogonal. Hence, the SD problem below is well-conditioned.)
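In NumPy terms, the factor E = U · Σ in (3) can be obtained as follows. This is a sketch: the rank LR is assumed known and exact here, whereas in practice it would be detected from the singular value spectrum.

```python
import numpy as np

rng = np.random.default_rng(5)
KI, J, LR = 24, 8, 6
# a rank-LR unfolding, standing in for T_{KI x J}
Tun = rng.standard_normal((KI, LR)) @ rng.standard_normal((LR, J))

U, s, Vh = np.linalg.svd(Tun, full_matrices=False)
U, s, Vh = U[:, :LR], s[:LR], Vh[:LR]    # truncate to the (assumed known) rank LR
E = U * s                                # E = U @ diag(s), i.e. the split E = U . Sigma
V = Vh.conj().T
assert np.allclose(Tun, E @ V.conj().T)  # this is (3)
```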
Combination of (2) and (3) yields:
E = (C ⊗ A) · W^{-1},  (4)
B = V^* · W^{-T},  (5)

in which W ∈ C^{LR×LR} is nonsingular. It is clear that W can only be determined up to right multiplication with a block diagonal matrix involving nonsingular (L × L) blocks. This is consistent with the ambiguities in A and B. Consider the partitioning W = [W_1 ... W_R], in which W_r ∈ C^{LR×L}.
We compute, for all r, s ∈ [1, LR]:

P_{rs} = Φ(E_r, E_s) = Σ_{u,v=1}^{LR} (W^{-1})_{ur} (W^{-1})_{vs} Φ((C ⊗ A)_u, (C ⊗ A)_v),

in which (C ⊗ A)_u denotes the uth column of C ⊗ A.
We introduce new indices α, β ∈ [1, R] and l, m ∈ [1, L] such that u = (α − 1)L + l and v = (β − 1)L + m. Then we have:
P_{rs} = Σ_{l,m=1}^{L} Σ_{α,β=1}^{R} (W^{-1})_{(α-1)L+l, r} (W^{-1})_{(β-1)L+m, s} Φ(c_α A_α(:, l)^T, c_β A_β(:, m)^T).  (6)

Now assume that there exists a symmetric matrix M ∈ C^{LR×LR} such that

Σ_{r,s=1}^{LR} m_{rs} P_{rs} = O.  (7)

Substitution of (6) in (7) yields:

Σ_{r,s=1}^{LR} Σ_{l,m=1}^{L} Σ_{α,β=1}^{R} (W^{-1})_{(α-1)L+l, r} (W^{-1})_{(β-1)L+m, s} m_{rs} Φ(c_α A_α(:, l)^T, c_β A_β(:, m)^T) = O.  (8)

We have that Φ(c_α A_α(:, l)^T, c_β A_β(:, m)^T) = O if α = β, regardless of the values of l and m. Our third working condition now implies that

W^{-1} · M · W^{-T} = D,  (9)

in which D is a block diagonal matrix involving (L × L) blocks. Equivalently,

M = W · D · W^T.  (10)

It can be verified that any matrix M that has the structure (10) satisfies (7). Hence, there are L^2R linearly independent symmetric matrices in the kernel of (7), which can all be decomposed as in (10):

M_1 = W · D_1 · W^T
⋮
M_{L^2R} = W · D_{L^2R} · W^T.  (11)
Let D_n = blockdiag(D_n^{(1)}, ..., D_n^{(R)}), where D_n^{(r)} ∈ C^{L×L}, 1 ≤ n ≤ L^2R, 1 ≤ r ≤ R. The formulation in terms of a block diagonalization problem is consistent with the fact that W can only be determined up to right multiplication with a block diagonal matrix. The advantage of (11) over (1) is that (11) is overdetermined, while (1) is possibly underdetermined. It can be verified that the decomposition in rank-(L_r, L_r, 1) terms can be computed by means of simultaneous block diagonalization of Σ_{r=1}^{R} L_r^2 matrices, where the blocks have dimension (L_r × L_r).
Stack all the matrices M_n in a tensor M ∈ C^{LR×LR×L^2R}. The simultaneous block diagonalization problem is equivalent to the decomposition of M in rank-(L, L, ·) terms, as defined in [2]. The decomposition of T in rank-(L_r, L_r, 1) terms can be reformulated as the decomposition of a tensor M ∈ C^{(Σ_{r=1}^{R} L_r)×(Σ_{r=1}^{R} L_r)×(Σ_{r=1}^{R} L_r^2)} in rank-(L_r, L_r, ·) terms.
Solution by means of simultaneous generalized block Schur decomposition
Substitute in (11) the QR-factorization W = Q^H · R′ and the RQ-factorization W^T = R″ · Z^H. Denote the diagonal (L × L) blocks of R′ and R″ by R^{(r)′} and R^{(r)″}, respectively, 1 ≤ r ≤ R. Since W is nonsingular, R′ and R″ are nonsingular. This, in turn, implies that all R^{(r)′} and R^{(r)″} are nonsingular. We obtain:

Q · M_1 · Z = R′ · D_1 · R″ = R_1
⋮
Q · M_{L^2R} · Z = R′ · D_{L^2R} · R″ = R_{L^2R},  (12)

in which R_n is block upper triangular. This simultaneous generalized block Schur decomposition can be computed by means of a block version of the extended QZ-iteration in [3]. The matrices Q and Z should be updated block by block.
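The algebra behind (12) can be verified for random real data: build M_n = W · D_n · W^T, compute the QR- and RQ-factorizations of W and W^T (the RQ-factorization is implemented here via NumPy's QR, since NumPy has no rq routine), and check that Q · M_n · Z is block upper triangular. This sketch only checks the factored form; it is not the iterative computation of [3].

```python
import numpy as np

rng = np.random.default_rng(0)
L, R = 2, 3
n = L * R

def rq(a):
    # RQ factorization a = r @ q, built from NumPy's QR (real case):
    # (J a)^T = q1 r1  =>  a = (J r1^T J)(J q1^T), with J the row-reversal
    q1, r1 = np.linalg.qr(np.flipud(a).T)
    return np.flipud(r1.T)[:, ::-1], np.flipud(q1.T)

W = rng.standard_normal((n, n))
Dblocks = [rng.standard_normal((L, L)) for _ in range(R)]
D = np.zeros((n, n))
for k in range(R):
    D[k*L:(k+1)*L, k*L:(k+1)*L] = Dblocks[k]
M = W @ D @ W.T                 # one matrix M_n = W . D_n . W^T

Qh, Rp = np.linalg.qr(W)        # W = Q^H . R'   (so Q = Qh^T for real data)
Q = Qh.T
Rpp, Qz = rq(W.T)               # W^T = R'' . Z^H (so Z = Qz^T for real data)
Z = Qz.T

Rn = Q @ M @ Z
assert np.allclose(Rn, Rp @ D @ Rpp)     # this is (12)
assert np.allclose(Rn[L:, :L], 0)        # strictly block lower part vanishes
assert np.allclose(Rn[2*L:, L:2*L], 0)
```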
The question is now how W can be found, once (12) has been computed. Stack M in an (L^2R^2 × L^2R) matrix M̃. From (11) we have:

M̃ = [W_1 ⊗ W_1 ... W_R ⊗ W_R] · D̃,  (13)

where D̃ ∈ C^{L^2R×L^2R} is defined by

D̃ = [ vec(D_1^{(1)})  ⋯  vec(D_{L^2R}^{(1)})
            ⋮                  ⋮
       vec(D_1^{(R)})  ⋯  vec(D_{L^2R}^{(R)}) ].  (14)
The matrix M̃ has full column rank since M_1, ..., M_{L^2R} are linearly independent. The matrix [W_1 ⊗ W_1 ... W_R ⊗ W_R] consists of columns of W ⊗ W, which is full rank since W is full rank. Hence, D̃ is full rank.
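The identity (13)–(14) rests on vec(W_r · D^{(r)} · W_r^T) = (W_r ⊗ W_r) · vec(D^{(r)}) for column-major vectorization. A quick numerical check of one column of M̃:

```python
import numpy as np

rng = np.random.default_rng(1)
L, R = 2, 3
n = L * R

def vec(a):
    return a.flatten(order='F')     # column-major vectorization

W = rng.standard_normal((n, n))
Wb = [W[:, k*L:(k+1)*L] for k in range(R)]      # partition W = [W_1 ... W_R]
Db = [rng.standard_normal((L, L)) for _ in range(R)]
D = np.zeros((n, n))
for k in range(R):
    D[k*L:(k+1)*L, k*L:(k+1)*L] = Db[k]
M = W @ D @ W.T                                 # one matrix M_n

# vec(M_n) versus [W_1 (x) W_1 ... W_R (x) W_R] times the stacked vec(D_n^{(r)})
Kr = np.hstack([np.kron(Wb[k], Wb[k]) for k in range(R)])
rhs = Kr @ np.concatenate([vec(Db[k]) for k in range(R)])
assert np.allclose(vec(M), rhs)                 # one column of (13)
```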
Denote the diagonal (L × L) blocks of R_n by R_n^{(r)}, 1 ≤ n ≤ L^2R, 1 ≤ r ≤ R. We have:

R_n^{(r)} = R^{(r)′} · D_n^{(r)} · R^{(r)″}.  (15)

In vectorized form we have:

vec(R_n^{(r)}) = ((R^{(r)″})^T ⊗ R^{(r)′}) · vec(D_n^{(r)}).  (16)

Stack the block diagonals of R_n in an (L^2R × L^2R) matrix R̃, the columns corresponding to different values of n. From (16) we have:

R̃ = blockdiag((R^{(1)″})^T ⊗ R^{(1)′}, ..., (R^{(R)″})^T ⊗ R^{(R)′}) · D̃.  (17)

The block diagonal factor is nonsingular because all R^{(r)′} and R^{(r)″} are nonsingular. Combination of (13) and (17) yields:
M̃ · R̃^{-1} = [W_1 ⊗ W_1 ... W_R ⊗ W_R] · [blockdiag((R^{(1)″})^T ⊗ R^{(1)′}, ..., (R^{(R)″})^T ⊗ R^{(R)′})]^{-1}
           = [(W_1 ⊗ W_1) · ((R^{(1)″})^{-T} ⊗ (R^{(1)′})^{-1}) ... (W_R ⊗ W_R) · ((R^{(R)″})^{-T} ⊗ (R^{(R)′})^{-1})].  (18)

Actually, the right-hand side of (18) should be obtained by solving the linear set of equations M̃ = Y · R̃ for Y ∈ C^{L^2R^2×L^2R}. We map the rth (L^2R^2 × L^2) submatrix of this partitioned matrix to a tensor 𝒲_r ∈ C^{LR×LR×L^2}, 1 ≤ r ≤ R. Denote the pth column of (R^{(r)″})^{-T} ⊗ (R^{(r)′})^{-1} by x_p^{(r)}, 1 ≤ r ≤ R, 1 ≤ p ≤ L^2.
Define 𝒳_r as the (L × L × L^2) tensor in which the matrices unvec(x_1^{(r)}), ..., unvec(x_{L^2}^{(r)}) are stacked. Then we have:

𝒲_r = 𝒳_r ×_1 W_r ×_2 W_r.  (19)

The column space of W_r equals the mode-1 (and mode-2) vector space of the tensor 𝒲_r. It can be obtained by truncated Multilinear Singular Value Decomposition or, more accurately, by best multilinear rank-(L, L, L^2) approximation.
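A sketch of this subspace-recovery step: the mode-1 unfolding of a tensor whose frontal slices have the form W_r · X_p · W_r^T has column space equal to that of W_r, so its dominant L left singular vectors span the desired subspace. A plain truncated SVD of the unfolding is used here in place of a best multilinear rank-(L, L, L^2) approximation.

```python
import numpy as np

rng = np.random.default_rng(2)
L, R = 2, 3
n = L * R

Wr = rng.standard_normal((n, L))       # one block W_r of W
# frontal slices W_r . X_p . W_r^T of the mapped tensor, p = 1..L^2
slices = [Wr @ rng.standard_normal((L, L)) @ Wr.T for _ in range(L * L)]
unfold1 = np.hstack(slices)            # mode-1 unfolding, shape n x (n L^2)

U = np.linalg.svd(unfold1, full_matrices=False)[0][:, :L]   # dominant L directions
# the estimated subspace equals the column space of W_r (compare projectors)
Pw = Wr @ np.linalg.pinv(Wr)
assert np.allclose(Pw, U @ U.T)
```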
After computation of W, B follows from (5). Matrices C and A follow from (4). We compute E · W = [F_1 ... F_R], where F_r ∈ C^{KI×L}, 1 ≤ r ≤ R. Stack F_r in F̃_r ∈ C^{LI×K}. Then the best rank-1 approximation of F̃_r yields an estimate of c_r and vec(A_r), 1 ≤ r ≤ R.
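The final step can be sketched as follows: F_r = c_r ⊗ A_r is reshaped so that it becomes the rank-1 matrix vec(A_r) · c_r^T, whose best rank-1 approximation (here an exact SVD, since the data are noiseless) returns c_r and vec(A_r) up to scaling. The k-slowest ordering of the combined KI index is again an assumption.

```python
import numpy as np

rng = np.random.default_rng(3)
K, I, L = 5, 4, 2
c = rng.standard_normal(K)
A = rng.standard_normal((I, L))

F = np.kron(c.reshape(-1, 1), A)          # F_r = c_r (x) A_r, shape (KI, L)
# unvec each column to an (I x K) matrix A(:, l) c^T and stack vertically:
# Ftil = vec(A_r) . c_r^T, shape (LI, K)
Ftil = np.vstack([F[:, l].reshape(K, I).T for l in range(L)])

U, s, Vt = np.linalg.svd(Ftil, full_matrices=False)
assert np.allclose(s[1:], 0, atol=1e-10)  # Ftil has rank 1
# recover c_r and vec(A_r), up to the usual scaling ambiguity
assert np.allclose(np.outer(U[:, 0] * s[0], Vt[0]),
                   np.outer(A.flatten(order='F'), c))
```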
Note that one can detect the value of L (and R) by clustering. The first L^2 columns of Y correspond to (LR × LR) matrices of which the column (and row) space is a subspace of the same L-dimensional vector space, namely, the column space of W_1. The next L^2 columns of Y correspond to (LR × LR) matrices of which the column (and row) space lies in the L-dimensional column space of W_2, and so on.
These results can be generalized for the decomposition of T in rank-(L_r, L_r, 1) terms.
References
[1] L. De Lathauwer, “A Link between the Canonical Decomposition in Multilinear Algebra and Simultaneous Matrix Diagonalization”, SIAM J. Matrix Anal. Appl., Vol. 28, No. 3, 2006, pp. 642–666.
[2] L. De Lathauwer, “Decompositions of a Higher-Order Tensor in Block Terms — Part II: Definitions and Uniqueness”, SIAM J. Matrix Anal. Appl., Special Issue Tensor Decompositions and Applications, Vol. 30, No. 3, 2008, pp. 1033–1066.
[3] A.-J. van der Veen and A. Paulraj, “An Analytical Constant Modulus Algorithm”, IEEE Trans. Signal Process., Vol. 44, 1996, pp. 1136–1155.