Canonical Polyadic Decomposition of Third-Order Tensors of Which the Dimensions are Strictly Smaller than the Rank

(1)

Canonical Polyadic Decomposition of

Third-Order Tensors of Which the Dimensions

are Strictly Smaller than the Rank

∗

Lieven De Lathauwer

December 1, 2011

1 Case I

₁

6 I

₂

6 I

₃

= R − 1

We consider the Canonical Polyadic Decomposition (CPD) of a rank-R tensor T ∈ KI1×I2×I3, with I 1 6I2 6I3 and R = I3+ 1: T = R X r=1 ar ⊗br ⊗cr. (1) Contributions:

• A new, relaxed, algebraic condition on ar, br, cr, 1 6 r 6 R that is

sufficient for essential uniqueness of the CPD.

• Under this condition the CP factors can be obtained by means of stan-dard linear algebra.

∗_{Available in SISTA publication engine as: L. De Lathauwer, CPD Study, Part IV,}

(2)

• Algorithm for the estimation of the CP factors in the case of an ap-proximate CPD.

• Generic uniqueness for a given set (I1, I2, I3, R) is established by

check-ing the algebraic condition for one random sample.

• Similar approach in the partially symmetric case (ar = br, 1 6 r 6 R)

and in the symmetric case (ar = br = cr, 1 6 r 6 R). (Generic bound

will be different.)

• Verification for (4×4×4) tensors of rank 5 (symmetric and unsymmetric case).

In matrix format we have:

T[1,2;3] = (A ⊙ B) · CT. (2)

A necessary condition for uniqueness of the CPD is that A ⊙ B has full column rank. For now we additionally assume that k(C) = I3. Denote

by CT the (I3 × I3)-matrix obtained from CT by deleting the bottom row.

Further define ˜c = C−₁

cR and

H= [a1⊗ b1+ ˜c1,RaR⊗ bR · · · aI3 ⊗ bI3+ ˜cI3,RaR⊗ bR] . (3)

We have:

T[1,2;3] = H · CT. (4)

Let f1, f2, . . . , fI3 denote a basis for the column space of T[1,2;3]. We have:

T[1,2;3] = F · GT, (5)

F = [f1· · · fI3] = H · M

T_, ₍₆₎

GT = M−T _{· C}T_, ₍₇₎

in which M ∈ KI3×I3 is nonsingular.

For the computation of the CPD we will use the fact that the columns of H represent rank-2 matrices. Let Φ denote the rank-2 detecting mapping, introduced in [1]. Further, Fi3 = (fi3)[1;2] ∈ K I1×I2, 1 6 i 3 6I3. We have: Φ(Fk1, Fk2, Fk3) = I3 X l1,l2,l3=1 mk1l1mk2l2mk3l3Φ(al1b T l1 + ˜cl1,RaRb T R, al2b T l2 + ˜cl2,RaRb T R, al3b T l3 + ˜cl3,RaRb T R). (8)

(3)

We have the following properties:

• Φ(·, ·, ·) is symmetric in its arguments. • Φ(al1b T l1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,RaRb T R) = O, 1 6 l1 6I3. • Φ(al1b T l1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,RaRb T R, al2b T l2 + ˜cl2,RaRb T R) = 2˜cl1,RΦ(al1b T l1, al2b T l2, aRb T R) and Φ(al1b T l1 + ˜cl1,RaRb T R, al2b T l2 + ˜cl2,RaRb T R, al2b T l2 + ˜cl2,RaRb T R) = 2˜cl2,RΦ(al1b T l1, al2b T l2, aRb T R), 1 6 l1 < l2 6I3.

We look for symmetric tensors X ∈ KI3×I3×I3 that satisfy

I3 X k1,k2,k3=1 xk1,k2,k3Φ(Fk1, Fk2, Fk3) = O. (9) We stack Φ(Fk1, Fk2, Fk3), 1 6 k1 6 k2 6 k3 6 I3, in a matrix ¯F ∈ KI13I 3 2×C 3

I3+2. We determine the vectors x in the null space of ¯F, which can

then be mapped to tensors X . We make the crucial assumption that the set of tensors {Φ(ar1b

T r1, ar2b T r2, ar3b T r3)|1 6 r1 < r2 < r3 6 R} is linearly

independent. Under this assumption, the null space of ¯Fhas dimension C2 R.

Denote W = M−T_{. Combination of (8) and (9) shows that the vectors in}

the null space represent tensors that are linear combinations of:

wk ⊗wk ⊗wk, 1 6 k 6 I3, (10)

˜

wk1,k2⊗w˜k1,k2⊗w˜k1,k2, 1 6 k1 < k2 6I3, (11)

in which ˜wk1,k2 = −˜ck2,Rwk1 + ˜ck1,Rwk2.

If we stack the tensors obtained from a basis of the null space of ¯F in a super-tensor Y ∈ KI3×I3×I3×CR2, then this super-tensor admits the CPD

Y = I3 X k=1 wk ⊗wk ⊗wk ⊗vk+ I3 X k1=1 I3 X k2=k1+1 ˜ wk1,k2⊗w˜k1,k2⊗w˜k1,k2⊗vk1,k2. (12)

(4)

The vectors vk, 1 6 k 6 I3, and vk1,k2, 1 6 k1 < k2 6 I3, represent the

coefficients of the tensors (10) and (11), respectively, in the linear combi-nations that yield the basis vectors of the null space of ¯F. They can be stacked in a matrix V ∈ KC2

R×C 2

R. This matrix has full rank, since the

ba-sis vectors are linearly independent. We make the assumption that CPD (12) is unique. A sufficient condition for this is that (i) the set of vectors {wk⊗ wk|1 6 k 6 I3} ∪ { ˜wk1,k2 ⊗ ˜wk1,k2|1 6 k1 < k2 6I3} is linearly

inde-pendent and that (ii) the set {wk|1 6 k 6 I3} ∪ { ˜wk1,k2|1 6 k1 < k2 6 I3}

does not contain proportional vectors. Indeed, knowing that V has full rank, the condition implies that the CPD

Y[1;2,3;4] = I3 X k=1 wk ⊗(wk⊗wk)⊗v_k+ I3 X k1=1 I3 X k2=k1+1 ˜ wk1,k2⊗( ˜wk1,k2⊗ ˜wk1,k2)⊗vk1,k2 (13) is unique, see for instance [2, 3]. In the case of an exact decomposition, (13) follows from a matrix EVD. (Exploiting the symmetry of wk· wkT and

˜

wk1,k2· ˜w

T

k1,k2, one may work with an (I3×

I3(I3+1)

2 × CR2)-dimensional version

of Y[1;2,3;4].) In the case of an approximate decomposition, the EVD may be

used to initialize an algorithm for fitting (12).

In the case of an exact decomposition, one can, in the set W = {wk|1 6

k 6 I₃} ∪ { ˜wk1,k2|1 6 k1 < k2 6 I3}, find C

3

R subsets of three vectors

that are linearly dependent. These subsets can be characterized as follows. Choose R − 3 different columns of C, say c1, c2, . . . , cR−3. The vectors of

the corresponding subset are all three orthogonal to c1, c2, . . . , cR−3. The

first vector is additionally orthogonal to cR−2, the second vector is instead

orthogonal to cR−1, and the third vector is orthogonal to cR. The different

subsets correspond to different choices for the columns of C to which the three vectors in the set are orthogonal.

Let {k1, k2, . . . , kR−3} ∪ {lR−2, lR−1, lR} = {1, 2, . . . , R}. Let WlR

−2,lR−1,lR ∈

KI3×3 _{be the rank-2 matrix containing the vectors of W that are orthogonal}

to ckr, 1 6 r 6 R − 3. Define

ElR₋₂,lR₋₁,lR = F · WlR₋₂,lR₋₁,lR. (14)

The entries of ElR

−2,lR−1,lR ∈ K

I1I2×3 can be stacked in a tensor E

lR

−2,lR−1,lR ∈

KI1×I2×3 _{of which the CPD is given by:}

ElR −2,lR−1,lR = alR−2 ⊗b_l R −2 ⊗¯c₁ + a_l R −1 ⊗b_l R −1 ⊗¯c₂+ a_l R⊗blR⊗¯c3, (15)

(5)

where ¯C = [¯c1 ¯c2 ¯c3] has rank 2 since WlR

−2,lR−1,lR has rank 2. The vectors

alR₋₂, blR₋₂, alR₋₁, blR₋₁, alR, blR can be found from this decomposition if

[alR₋₂ alR₋₁ alR] and [blR₋₂ blR₋₁ blR] have full column rank.

In the case of an exact decomposition, the full matrices A and B can be found by starting from ⌈R

3⌉ subsets of W that correspond to matrices WlR

−2,lR−1,lR

of which the column spaces minimally intersect. The matrix C can then be found from (2).

In the case of an approximate decomposition, one can in principle start from the C3

R subsets of W that are maximally linearly dependent, to obtain 3CR3

vectors that are estimates of the columns of A and B. These can then be clustered in R groups of C2

R−1 vectors, and the cluster centers can be used

to initialize an algorithm for fitting (1).

Remark 1. Instead of working with ¯F _{∈ K}I13I 3 2×C

3

I3+2, one may work with a

reduced version ˜F_{∈ K}CI13 C 3 I2×C

3

I3+2, as explained in [1]. Instead of estimating

the null space of ¯F or ˜F, one may also estimate the null space of ¯FH · ¯F ∈ KCI3+23 ×C

3

I3+2. The latter matrix can be computed efficiently by computing

(fi1 ⊗ fi2 ⊗ fi3) H_(f j1 ⊗ fj2 ⊗ fj3) as (f H i1fj1)(f H i2fj2)(f H i3fj3). The disavantage is

that the condition number is squared.

2 Case I

₁

6 I

₂

6 I

₃

= R−K, 1 < K < min(I

1

, I

2

)−

1

We use the rank-(K + 1) detecting mapping ΦK+1, introduced in [1].

Consider K = 2 by way of example. We have: Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R) = O, 1 6 l1 6I3.

(6)

The vectors Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R), Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R) and Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R, al2b T l1 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R) are proportional to Φ3(al1b T l1, al2b T l2, aR−1b T R−1, aRbTR), 1 6 l1 < l2 6I3. The vectors Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R, al3b T l3 + ˜cl3,R−1aR−1b T R−1+ ˜cl3,RaRb T R), Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R, al3b T l3 + ˜cl3,R−1aR−1b T R−1+ ˜cl3,RaRb T R),

(7)

and Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R, al3b T l3 + ˜cl3,R−1aR−1b T R−1+ ˜cl3,RaRb T R, al3b T l3 + ˜cl3,R−1aR−1b T R−1+ ˜cl3,RaRb T R),

are expected to be linearly independent, 1 6 l1 < l2 < l3 6I3.

References

[1] A link between the decomposition of a third-order tensor in rank-(L,L,1) terms and simultaneous matrix diagonalization, Tech. report.

[2] R.A. Harshman, Foundations of the PARAFAC procedure: Model and conditions for an“explanatory” multi-mode factor analysis, UCLA Work-ing Papers in Phonetics, 16 (1970), pp. 1–84.

[3] S.E. Leurgans, R.T. Ross, and R.B. Abel, A decomposition for three-way arrays, SIAM J. Matrix Anal. Appl., 14 (1993), pp. 1064–1083.