Canonical Polyadic Decomposition of
Third-Order Tensors of Which the Dimensions
are Strictly Smaller than the Rank
∗
Lieven De Lathauwer
December 1, 2011
1
Case I
16
I
26
I
3= R − 1
We consider the Canonical Polyadic Decomposition (CPD) of a rank-R tensor T ∈ KI1×I2×I3, with I 1 6I2 6I3 and R = I3+ 1: T = R X r=1 ar ⊗br ⊗cr. (1) Contributions:
• A new, relaxed, algebraic condition on ar, br, cr, 1 6 r 6 R that is
sufficient for essential uniqueness of the CPD.
• Under this condition the CP factors can be obtained by means of stan-dard linear algebra.
∗Available in SISTA publication engine as: L. De Lathauwer, CPD Study, Part IV,
• Algorithm for the estimation of the CP factors in the case of an ap-proximate CPD.
• Generic uniqueness for a given set (I1, I2, I3, R) is established by
check-ing the algebraic condition for one random sample.
• Similar approach in the partially symmetric case (ar = br, 1 6 r 6 R)
and in the symmetric case (ar = br = cr, 1 6 r 6 R). (Generic bound
will be different.)
• Verification for (4×4×4) tensors of rank 5 (symmetric and unsymmetric case).
In matrix format we have:
T[1,2;3] = (A ⊙ B) · CT. (2)
A necessary condition for uniqueness of the CPD is that A ⊙ B has full column rank. For now we additionally assume that k(C) = I3. Denote
by CT the (I3 × I3)-matrix obtained from CT by deleting the bottom row.
Further define ˜c = C−1
cR and
H= [a1⊗ b1+ ˜c1,RaR⊗ bR · · · aI3 ⊗ bI3+ ˜cI3,RaR⊗ bR] . (3)
We have:
T[1,2;3] = H · CT. (4)
Let f1, f2, . . . , fI3 denote a basis for the column space of T[1,2;3]. We have:
T[1,2;3] = F · GT, (5)
F = [f1· · · fI3] = H · M
T, (6)
GT = M−T · CT, (7)
in which M ∈ KI3×I3 is nonsingular.
For the computation of the CPD we will use the fact that the columns of H represent rank-2 matrices. Let Φ denote the rank-2 detecting mapping, introduced in [1]. Further, Fi3 = (fi3)[1;2] ∈ K I1×I2, 1 6 i 3 6I3. We have: Φ(Fk1, Fk2, Fk3) = I3 X l1,l2,l3=1 mk1l1mk2l2mk3l3Φ(al1b T l1 + ˜cl1,RaRb T R, al2b T l2 + ˜cl2,RaRb T R, al3b T l3 + ˜cl3,RaRb T R). (8)
We have the following properties:
• Φ(·, ·, ·) is symmetric in its arguments. • Φ(al1b T l1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,RaRb T R) = O, 1 6 l1 6I3. • Φ(al1b T l1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,RaRb T R, al2b T l2 + ˜cl2,RaRb T R) = 2˜cl1,RΦ(al1b T l1, al2b T l2, aRb T R) and Φ(al1b T l1 + ˜cl1,RaRb T R, al2b T l2 + ˜cl2,RaRb T R, al2b T l2 + ˜cl2,RaRb T R) = 2˜cl2,RΦ(al1b T l1, al2b T l2, aRb T R), 1 6 l1 < l2 6I3.
We look for symmetric tensors X ∈ KI3×I3×I3 that satisfy
I3 X k1,k2,k3=1 xk1,k2,k3Φ(Fk1, Fk2, Fk3) = O. (9) We stack Φ(Fk1, Fk2, Fk3), 1 6 k1 6 k2 6 k3 6 I3, in a matrix ¯F ∈ KI13I 3 2×C 3
I3+2. We determine the vectors x in the null space of ¯F, which can
then be mapped to tensors X . We make the crucial assumption that the set of tensors {Φ(ar1b
T r1, ar2b T r2, ar3b T r3)|1 6 r1 < r2 < r3 6 R} is linearly
independent. Under this assumption, the null space of ¯Fhas dimension C2 R.
Denote W = M−T. Combination of (8) and (9) shows that the vectors in
the null space represent tensors that are linear combinations of:
wk ⊗wk ⊗wk, 1 6 k 6 I3, (10)
˜
wk1,k2⊗w˜k1,k2⊗w˜k1,k2, 1 6 k1 < k2 6I3, (11)
in which ˜wk1,k2 = −˜ck2,Rwk1 + ˜ck1,Rwk2.
If we stack the tensors obtained from a basis of the null space of ¯F in a super-tensor Y ∈ KI3×I3×I3×CR2, then this super-tensor admits the CPD
Y = I3 X k=1 wk ⊗wk ⊗wk ⊗vk+ I3 X k1=1 I3 X k2=k1+1 ˜ wk1,k2⊗w˜k1,k2⊗w˜k1,k2⊗vk1,k2. (12)
The vectors vk, 1 6 k 6 I3, and vk1,k2, 1 6 k1 < k2 6 I3, represent the
coefficients of the tensors (10) and (11), respectively, in the linear combi-nations that yield the basis vectors of the null space of ¯F. They can be stacked in a matrix V ∈ KC2
R×C 2
R. This matrix has full rank, since the
ba-sis vectors are linearly independent. We make the assumption that CPD (12) is unique. A sufficient condition for this is that (i) the set of vectors {wk⊗ wk|1 6 k 6 I3} ∪ { ˜wk1,k2 ⊗ ˜wk1,k2|1 6 k1 < k2 6I3} is linearly
inde-pendent and that (ii) the set {wk|1 6 k 6 I3} ∪ { ˜wk1,k2|1 6 k1 < k2 6 I3}
does not contain proportional vectors. Indeed, knowing that V has full rank, the condition implies that the CPD
Y[1;2,3;4] = I3 X k=1 wk ⊗(wk⊗wk)⊗vk+ I3 X k1=1 I3 X k2=k1+1 ˜ wk1,k2⊗( ˜wk1,k2⊗ ˜wk1,k2)⊗vk1,k2 (13) is unique, see for instance [2, 3]. In the case of an exact decomposition, (13) follows from a matrix EVD. (Exploiting the symmetry of wk· wkT and
˜
wk1,k2· ˜w
T
k1,k2, one may work with an (I3×
I3(I3+1)
2 × CR2)-dimensional version
of Y[1;2,3;4].) In the case of an approximate decomposition, the EVD may be
used to initialize an algorithm for fitting (12).
In the case of an exact decomposition, one can, in the set W = {wk|1 6
k 6 I3} ∪ { ˜wk1,k2|1 6 k1 < k2 6 I3}, find C
3
R subsets of three vectors
that are linearly dependent. These subsets can be characterized as follows. Choose R − 3 different columns of C, say c1, c2, . . . , cR−3. The vectors of
the corresponding subset are all three orthogonal to c1, c2, . . . , cR−3. The
first vector is additionally orthogonal to cR−2, the second vector is instead
orthogonal to cR−1, and the third vector is orthogonal to cR. The different
subsets correspond to different choices for the columns of C to which the three vectors in the set are orthogonal.
Let {k1, k2, . . . , kR−3} ∪ {lR−2, lR−1, lR} = {1, 2, . . . , R}. Let WlR
−2,lR−1,lR ∈
KI3×3 be the rank-2 matrix containing the vectors of W that are orthogonal
to ckr, 1 6 r 6 R − 3. Define
ElR−2,lR−1,lR = F · WlR−2,lR−1,lR. (14)
The entries of ElR
−2,lR−1,lR ∈ K
I1I2×3 can be stacked in a tensor E
lR
−2,lR−1,lR ∈
KI1×I2×3 of which the CPD is given by:
ElR −2,lR−1,lR = alR−2 ⊗bl R −2 ⊗¯c1 + al R −1 ⊗bl R −1 ⊗¯c2+ al R⊗blR⊗¯c3, (15)
where ¯C = [¯c1 ¯c2 ¯c3] has rank 2 since WlR
−2,lR−1,lR has rank 2. The vectors
alR−2, blR−2, alR−1, blR−1, alR, blR can be found from this decomposition if
[alR−2 alR−1 alR] and [blR−2 blR−1 blR] have full column rank.
In the case of an exact decomposition, the full matrices A and B can be found by starting from ⌈R
3⌉ subsets of W that correspond to matrices WlR
−2,lR−1,lR
of which the column spaces minimally intersect. The matrix C can then be found from (2).
In the case of an approximate decomposition, one can in principle start from the C3
R subsets of W that are maximally linearly dependent, to obtain 3CR3
vectors that are estimates of the columns of A and B. These can then be clustered in R groups of C2
R−1 vectors, and the cluster centers can be used
to initialize an algorithm for fitting (1).
Remark 1. Instead of working with ¯F ∈ KI13I 3 2×C
3
I3+2, one may work with a
reduced version ˜F∈ KCI13 C 3 I2×C
3
I3+2, as explained in [1]. Instead of estimating
the null space of ¯F or ˜F, one may also estimate the null space of ¯FH · ¯F ∈ KCI3+23 ×C
3
I3+2. The latter matrix can be computed efficiently by computing
(fi1 ⊗ fi2 ⊗ fi3) H(f j1 ⊗ fj2 ⊗ fj3) as (f H i1fj1)(f H i2fj2)(f H i3fj3). The disavantage is
that the condition number is squared.
2
Case I
16
I
26
I
3= R−K, 1 < K < min(I
1, I
2)−
1
We use the rank-(K + 1) detecting mapping ΦK+1, introduced in [1].
Consider K = 2 by way of example. We have: Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R) = O, 1 6 l1 6I3.
The vectors Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R), Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R) and Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R, al2b T l1 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R) are proportional to Φ3(al1b T l1, al2b T l2, aR−1b T R−1, aRbTR), 1 6 l1 < l2 6I3. The vectors Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al1b T l1 + ˜cl1,R−1aR−1b T R−1+ ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R, al3b T l3 + ˜cl3,R−1aR−1b T R−1+ ˜cl3,RaRb T R), Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R, al3b T l3 + ˜cl3,R−1aR−1b T R−1+ ˜cl3,RaRb T R),
and Φ3(al1b T l1 + ˜cl1,R−1aR−1b T R−1 + ˜cl1,RaRb T R, al2b T l2 + ˜cl2,R−1aR−1b T R−1+ ˜cl2,RaRb T R, al3b T l3 + ˜cl3,R−1aR−1b T R−1+ ˜cl3,RaRb T R, al3b T l3 + ˜cl3,R−1aR−1b T R−1+ ˜cl3,RaRb T R),
are expected to be linearly independent, 1 6 l1 < l2 < l3 6I3.
References
[1] A link between the decomposition of a third-order tensor in rank-(L,L,1) terms and simultaneous matrix diagonalization, Tech. report.
[2] R.A. Harshman, Foundations of the PARAFAC procedure: Model and conditions for an“explanatory” multi-mode factor analysis, UCLA Work-ing Papers in Phonetics, 16 (1970), pp. 1–84.
[3] S.E. Leurgans, R.T. Ross, and R.B. Abel, A decomposition for three-way arrays, SIAM J. Matrix Anal. Appl., 14 (1993), pp. 1064–1083.