
NEW UNIQUENESS CONDITIONS FOR THE CANONICAL POLYADIC DECOMPOSITION OF HIGHER-ORDER TENSORS



MIKAEL SØRENSEN ∗‡ AND LIEVEN DE LATHAUWER ∗ ‡

Abstract. The uniqueness properties of the Canonical Polyadic Decomposition (CPD) of higher-order tensors have made it an attractive tool for signal separation. However, CPD uniqueness is not yet fully understood. In this paper, we first present a new uniqueness condition for a CPD in which one of the factor matrices is assumed to be known. Second, we show that this result leads to a new overall uniqueness condition for the CPD. Third, we provide a new uniqueness condition for a complex CPD with a columnwise orthonormal factor matrix, and we demonstrate that it in turn yields a new uniqueness condition for a complex CPD with a partial Hermitian symmetry. Finally, we provide an algorithm for computing a CPD with a known factor matrix.

Key words. tensor, polyadic decomposition, parallel factor (PARAFAC), canonical decomposition (CANDECOMP), canonical polyadic decomposition.

1. Introduction. Tensor decompositions are finding more and more applications in signal processing, data analysis and machine learning. For instance, the Canonical Polyadic Decomposition (CPD) is becoming a basic tool for signal separation. Essentially, this is because the CPD is unique under mild conditions, in contrast to the decomposition of a matrix in rank-1 terms. Thanks to this uniqueness, rank-1 terms can be directly associated with interpretable data components. Numerous applications have been reported in Independent Component Analysis, exploratory data analysis, multi-user access in telecommunication, radar, chemometrics, psychometrics, sensor array processing, and so on [3, 4, 19, 20, 16, 18, 12, 2, 13]. However, the understanding of uniqueness is lagging behind the use of CPD in practice.

In this paper we first present a new uniqueness condition for a CPD with a known factor matrix. Based on this result we propose a new overall uniqueness condition for the CPD. We provide a new uniqueness condition for a complex CPD with a columnwise orthonormal factor matrix. We also demonstrate that the result can be used to obtain a new uniqueness condition for a complex CPD with a partial Hermitian symmetry. Such constrained CPDs are very common in signal processing. Finally, we present an inexpensive algorithm for computing a CPD with a known factor matrix, which is also relevant for signal processing.

The paper is organized as follows. The rest of the introduction presents our notation followed by a quick review of the CPD. Section 2 reviews existing uniqueness results for the CPD. Section 3 presents a new uniqueness condition and numerical method for a CPD with a known factor matrix. Next, in sections 4 and 5 we present new uniqueness conditions for the overall CPD and for some variants. Numerical experiments are reported in section 6. We end the paper with a conclusion in section 7.

Group Science, Engineering and Technology, KU Leuven - Kulak, E. Sabbelaan 53, 8500 Kortrijk, Belgium, {Mikael.Sorensen, Lieven.DeLathauwer}@kuleuven-kulak.be and KU Leuven - E.E. Dept. (ESAT) - SCD-SISTA, Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium.

Research supported by: (1) Research Council KU Leuven: GOA-Ambiorics, GOA-MaNet, CoE EF/05/006 Optimization in Engineering (OPTEC), CIF1, STRT1/08/23; (2) F.W.O.: (a) project G.0427.10N, (b) Research Communities ICCoS, ANMMM and MLDM; (3) the Belgian Federal Science Policy Office: IUAP P7 (DYSCO II, Dynamical systems, control and optimization, 2012-2017); (4) EU: ERNSI.


1.1. Notation. Vectors, matrices and tensors are denoted by lower case boldface, upper case boldface and upper case calligraphic letters, respectively. The rth column vector of A is denoted by a_r. The symbols ⊗ and ⊙ denote the Kronecker and Khatri-Rao products, defined as

A ⊗ B := [ a_{11}B  a_{12}B  ··· ; a_{21}B  a_{22}B  ··· ; ⋮  ⋮  ⋱ ],    A ⊙ B := [ a_1 ⊗ b_1,  a_2 ⊗ b_2,  ··· ],

in which (A)_{mn} = a_{mn}. The Khatri-Rao product of N matrices {A^{(n)}}_{n=1}^N is sometimes denoted by ⊙_{n=1}^N A^{(n)} = A^{(1)} ⊙ ··· ⊙ A^{(N)}. The Hadamard product is denoted by ∗ and is given by

A ∗ B := [ a_{11}b_{11}  a_{12}b_{12}  ··· ; a_{21}b_{21}  a_{22}b_{22}  ··· ; ⋮  ⋮  ⋱ ].
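These three products are easy to experiment with numerically. The following sketch (plain NumPy, with a hand-rolled Khatri-Rao helper, since none of the matrices here come from the paper) illustrates the shapes involved:

```python
import numpy as np

def khatri_rao(A, B):
    """Columnwise Kronecker (Khatri-Rao) product of two matrices
    with the same number of columns."""
    assert A.shape[1] == B.shape[1]
    return np.vstack([np.kron(A[:, r], B[:, r]) for r in range(A.shape[1])]).T

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0], [9.0, 1.0]])
C = np.array([[1.0, 0.0], [2.0, 1.0]])

AkB = np.kron(A, B)       # Kronecker product, shape (6, 4)
AkrB = khatri_rao(A, B)   # Khatri-Rao product, shape (6, 2)
AhC = A * C               # Hadamard product is plain elementwise '*'
```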

Let ◦ denote the outer product of N vectors a^{(n)} ∈ C^{I_n}, such that a^{(1)} ◦ a^{(2)} ◦ ··· ◦ a^{(N)} ∈ C^{I_1×I_2×···×I_N} satisfies

( a^{(1)} ◦ a^{(2)} ◦ ··· ◦ a^{(N)} )_{i_1,i_2,...,i_N} = a^{(1)}_{i_1} a^{(2)}_{i_2} ··· a^{(N)}_{i_N}.
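The entrywise definition above can be realized by repeated application of NumPy's generalized outer product; a minimal sketch with three hypothetical vectors:

```python
import numpy as np
from functools import reduce

def outer(*vectors):
    """Outer product of N vectors: result[i1,...,iN] = a1[i1] * ... * aN[iN]."""
    return reduce(lambda T, v: np.multiply.outer(T, v), vectors)

a, b, c = np.array([1.0, 2.0]), np.array([3.0, 4.0, 5.0]), np.array([6.0, 7.0])
T = outer(a, b, c)   # third-order tensor of shape (2, 3, 2)
# entry check: T[1, 2, 0] = a[1] * b[2] * c[0] = 2 * 5 * 6 = 60
```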

The identity matrix, all-zero matrix and all-zero vector are denoted by I_M ∈ C^{M×M}, 0_{M,N} ∈ C^{M×N} and 0_M ∈ C^M, respectively.

Let (·)^T, (·)^*, (·)^H, (·)^†, ‖·‖_F, |·|, Col(·) and Null(·) denote the transpose, conjugate, conjugate-transpose, Moore-Penrose pseudo-inverse, Frobenius norm, determinant, column space and null space of a matrix, respectively.

Given A ∈ C^{I×J}, Vec(A) ∈ C^{IJ} denotes the column vector defined by (Vec(A))_{i+(j−1)I} = (A)_{ij}. Given a ∈ C^{IJ}, the reverse operation is Unvec(a) = A ∈ C^{I×J} such that (a)_{i+(j−1)I} = (A)_{ij}. Let D_k(A) ∈ C^{J×J} denote the diagonal matrix holding row k of A ∈ C^{I×J} on its diagonal.

Matlab index notation will be used to extract submatrices from a given matrix. For example, A(1:k, :) denotes the submatrix of A consisting of rows 1 to k of A. Let T ∈ C^{I×R}; then T̲ = T(1:I−1, :) ∈ C^{(I−1)×R}, i.e., T̲ is obtained by deleting the bottom row of T.
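The vectorization convention is column-major (Fortran order), which a short sketch makes concrete (matrix contents are arbitrary illustration values):

```python
import numpy as np

I, J = 3, 4
A = np.arange(1.0, I * J + 1).reshape(I, J)

# Column-major (Fortran-order) vectorization: (Vec A)_{i+(j-1)I} = (A)_{ij}
vecA = A.reshape(-1, order="F")
A_back = vecA.reshape(I, J, order="F")    # Unvec reverses the operation

def Dk(A, k):
    """Diagonal matrix holding row k (1-based, as in the text) of A."""
    return np.diag(A[k - 1, :])
```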

The rank of a matrix A is denoted by r (A). The k-rank of a matrix A is denoted by k (A). It is equal to the largest integer k (A) such that every subset of k (A) columns of A is linearly independent.

Let C^k_n = n! / (k!(n−k)!) denote the binomial coefficient. The k-th compound matrix of A ∈ C^{m×n} is denoted by C_k(A) ∈ C^{C^k_m × C^k_n}; its entries correspond to the k-by-k minors of A, ordered lexicographically. As an example, let A ∈ C^{4×3}; then

C_2(A) = [ |A([1,2],[1,2])|  |A([1,2],[1,3])|  |A([1,2],[2,3])| ;
           |A([1,3],[1,2])|  |A([1,3],[1,3])|  |A([1,3],[2,3])| ;
           |A([1,4],[1,2])|  |A([1,4],[1,3])|  |A([1,4],[2,3])| ;
           |A([2,3],[1,2])|  |A([2,3],[1,3])|  |A([2,3],[2,3])| ;
           |A([2,4],[1,2])|  |A([2,4],[1,3])|  |A([2,4],[2,3])| ;
           |A([3,4],[1,2])|  |A([3,4],[1,3])|  |A([3,4],[2,3])| ].
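A compound matrix can be assembled directly from this definition by enumerating the row and column index sets lexicographically; a brute-force sketch (adequate for small matrices only):

```python
import numpy as np
from itertools import combinations

def compound(A, k):
    """k-th compound matrix C_k(A): all k-by-k minors of A, with row and
    column index sets ordered lexicographically."""
    m, n = A.shape
    rows = list(combinations(range(m), k))
    cols = list(combinations(range(n), k))
    C = np.empty((len(rows), len(cols)))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            C[i, j] = np.linalg.det(A[np.ix_(r, c)])
    return C

A = np.random.randn(4, 3)
C2 = compound(A, 2)   # shape (C(4,2), C(3,2)) = (6, 3), as in the example above
```

As a sanity check, the multiplicativity property C_k(AB) = C_k(A) C_k(B) (Cauchy-Binet) holds for this implementation.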


See [9, 5] for further details on compound matrices.

1.2. Canonical Polyadic Decomposition (CPD). Consider an Nth-order tensor X ∈ C^{I_1×···×I_N}. We say that X is a rank-1 tensor if it is equal to the outer product of non-zero vectors a^{(n)} ∈ C^{I_n}, n ∈ {1,...,N}, such that x_{i_1...i_N} = ∏_{n=1}^N a^{(n)}_{i_n}. More generally, the rank of a tensor X is equal to the minimal number of rank-1 tensors that yield X in a linear combination. Assume that the rank of X is R; then it can be written as

X = Σ_{r=1}^R a^{(1)}_r ◦ ··· ◦ a^{(N)}_r.   (1.1)

The decomposition (1.1) will in this paper be referred to as the Canonical Polyadic Decomposition (CPD) of X. Let us stack the vectors {a^{(n)}_r} into the matrices

A^{(n)} = [ a^{(n)}_1, ..., a^{(n)}_R ] ∈ C^{I_n×R},  n ∈ {1,...,N}.   (1.2)

The matrices A^{(n)} in (1.2) will be referred to as the factor matrices of the CPD of the tensor X in (1.1). The following matrix representation of an Nth-order tensor will also be used throughout the paper. Consider X ∈ C^{I_1×···×I_N} with rank R; then it has the matrix representation X_{[P]} ∈ C^{(∏_{m=1}^P I_m)×(∏_{n=P+1}^N I_n)}, with rows indexed by (i_1,...,i_P) and columns indexed by (i_{P+1},...,i_N), both ordered lexicographically, given by

X_{[P]} = ( A^{(1)} ⊙ ··· ⊙ A^{(P)} ) ( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} )^T.   (1.3)
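With the lexicographic index ordering used here, the matrix representation (1.3) corresponds to a plain reshape of the tensor; the sketch below (random illustration factors) builds a rank-3 fourth-order tensor and checks the identity for P = 2:

```python
import numpy as np
from functools import reduce

def khatri_rao(A, B):
    return np.vstack([np.kron(A[:, r], B[:, r]) for r in range(A.shape[1])]).T

def cpd_tensor(factors):
    """Build X = sum_r a_r^(1) o ... o a_r^(N) from the factor matrices."""
    R = factors[0].shape[1]
    X = np.zeros(tuple(A.shape[0] for A in factors))
    for r in range(R):
        X += reduce(lambda T, A: np.multiply.outer(T, A[:, r]),
                    factors[1:], factors[0][:, r])
    return X

R, dims = 3, (2, 3, 2, 4)
factors = [np.random.randn(I, R) for I in dims]
X = cpd_tensor(factors)

# X_[2]: modes (1,2) indexed along rows, modes (3,4) along columns
XP = X.reshape(dims[0] * dims[1], dims[2] * dims[3])
lhs = khatri_rao(factors[0], factors[1])
rhs = khatri_rao(factors[2], factors[3])
# (1.3): XP should equal lhs @ rhs.T
```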

In some cases it is convenient to map an Nth-order tensor to a third-order tensor. Let the integers M and P satisfy 1 ≤ M < P < N; then we can group the factor matrices of the CPD of X as follows:

A = A^{(1)} ⊙ ··· ⊙ A^{(M)} ∈ C^{I×R},  I = ∏_{n=1}^M I_n,
B = A^{(M+1)} ⊙ ··· ⊙ A^{(P)} ∈ C^{J×R},  J = ∏_{n=M+1}^P I_n,
C = A^{(P+1)} ⊙ ··· ⊙ A^{(N)} ∈ C^{K×R},  K = ∏_{n=P+1}^N I_n.

By ignoring the structural relations among the column vectors of A, B and C we obtain the third-order tensor

C^{I×J×K} ∋ X̃ = Σ_{r=1}^R a_r ◦ b_r ◦ c_r.   (1.4)

The tensor (1.4) has matrix representation

X̃_{[2]} = ( A ⊙ B ) C^T ∈ C^{IJ×K}.   (1.5)

2. Existing Uniqueness Results for CPD. Subsection 2.1 presents existing conditions that guarantee uniqueness of one factor matrix of a CPD while subsection 2.2 presents existing conditions that guarantee overall uniqueness of a CPD. We will in this section treat tensors of order N > 3 as third-order tensors by working with matrix representation (1.5).

2.1. Uniqueness Conditions for One Factor Matrix of a CPD. A factor matrix, say A^{(n)}, of the CPD of X ∈ C^{I_1×···×I_N} is said to be unique if it can be determined from X up to the inherent column scaling and permutation ambiguities. More formally, the factor matrix A^{(n)} is unique if every alternative factor matrix Â^{(n)} satisfies

Â^{(n)} = A^{(n)} P Δ_{A^{(n)}},

where P is a permutation matrix and Δ_{A^{(n)}} is a nonsingular diagonal matrix. The conditions stated in this subsection guarantee the uniqueness of one factor matrix of a CPD. They will be used in the paper to obtain new overall uniqueness results for tensor decompositions.

Theorem 2.1. Let X ∈ C^{I×J×K} be a tensor with rank R and matrix representation

X = ( A ⊙ B ) C^T,

where A ∈ C^{I×R}, B ∈ C^{J×R} and C ∈ C^{K×R}. If

  k(C) ≥ 1,
  min( k(A), k(B) ) ≥ R − r(C) + 2,
  r(C) + k(B) + k(C) + max( r(A) + k(A), r(B) + k(B) ) ≥ 2(R + 1),   (2.1)

then the factor matrix C is unique [8].

More recently the following more relaxed condition has been obtained.

Theorem 2.2. Let X ∈ C^{I×J×K} be a tensor with rank R and matrix representation

X = ( A ⊙ B ) C^T,

where A ∈ C^{I×R}, B ∈ C^{J×R} and C ∈ C^{K×R}. If

  k(C) ≥ 1,
  min(I, J) ≥ R − r(C) + 2,
  C_{R−r(C)+2}(A) ⊙ C_{R−r(C)+2}(B) has full column rank,   (2.2)

then the factor matrix C is unique [5].

2.2. Uniqueness Conditions for CPD. A CPD of X ∈ C^{I_1×···×I_N} is said to be unique if all the N-tuplets ( Â^{(1)}, ..., Â^{(N)} ) satisfying (1.1) are related via

Â^{(n)} = A^{(n)} P Δ_{A^{(n)}},  ∀n ∈ {1,...,N},

where the {Δ_{A^{(n)}}} are diagonal matrices satisfying ∏_{n=1}^N Δ_{A^{(n)}} = I_R and P is a permutation matrix. The following theorem states a well-known condition for overall CPD uniqueness.


Theorem 2.3. Let X ∈ C^{I×J×K} be a tensor with rank R and matrix representation

X = ( A ⊙ B ) C^T,

where A ∈ C^{I×R}, B ∈ C^{J×R} and C ∈ C^{K×R}. If

k(A) + k(B) + k(C) ≥ 2(R + 1),   (2.3)

then the CPD of X is unique.

Proof. The original proof can be found in [14]; a more accessible proof is given in [26]. An alternative, shortened proof has recently been presented in [17].

Recently the following more relaxed overall CPD uniqueness condition has been obtained.

Theorem 2.4. Let X ∈ C^{I×J×K} be a tensor with rank R and matrix representation

X = ( A ⊙ B ) C^T,

where A ∈ C^{I×R}, B ∈ C^{J×R} and C ∈ C^{K×R}. If

  min( k(A), k(B) ) + k(C) ≥ R + 1,
  max( k(A), k(B) ) + k(C) ≥ R + 2,
  C_{R−r(C)+2}(A) ⊙ C_{R−r(C)+2}(B) has full column rank,   (2.4)

then the CPD of X is unique [6].

3. New Uniqueness Result for CPD with a Known Factor Matrix. Consider the tensor X ∈ C^{I_1×···×I_N} with rank R and matrix representation X_{[P]} = ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} )^T. Assume that the factor matrix A^{(1)} is known and let the (N−1)-tuplet ( Â^{(2)}, ..., Â^{(N)} ) yield an alternative decomposition of X with A^{(1)}. The CPD of X with the known factor matrix A^{(1)} is said to be unique if all the tuplets ( A^{(1)}, Â^{(2)}, ..., Â^{(N)} ) satisfying (1.1) are related via

Â^{(n)} = A^{(n)} Δ_{A^{(n)}},  ∀n ∈ {2,...,N},

where the Δ_{A^{(n)}} ∈ C^{R×R} are diagonal matrices with the property ∏_{n=2}^N Δ_{A^{(n)}} = I_R.

In this section we present a new uniqueness result for a CPD with a known factor matrix. If the known factor matrix has full column rank, then the overall decomposition is trivially unique. We state the result here for completeness.

Proposition 3.1. Consider X ∈ C^{I_1×···×I_N} with rank R and matrix representation X_{[N−1]} = ( A^{(1)} ⊙ ··· ⊙ A^{(N−1)} ) A^{(N)T}. If A^{(N)} is known and has full column rank, then the CPD with A^{(N)} is unique.

Proof. The result is trivial; simply compute

Y = [ y_1, ..., y_R ] = X_{[N−1]} ( A^{(N)T} )^† = A^{(1)} ⊙ ··· ⊙ A^{(N−1)};

then the factor matrices follow from the best rank-1 approximation problems

min_{a^{(1)}_r, ..., a^{(N−1)}_r} ‖ y_r − a^{(1)}_r ⊗ ··· ⊗ a^{(N−1)}_r ‖²_F,  r ∈ {1,...,R}.
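For a third-order tensor this peeling-off step amounts to one pseudo-inverse and R best rank-1 matrix approximations; a minimal sketch, assuming generic random factors (not taken from the paper):

```python
import numpy as np

def khatri_rao(A, B):
    return np.vstack([np.kron(A[:, r], B[:, r]) for r in range(A.shape[1])]).T

rng = np.random.default_rng(0)
I1, I2, I3, R = 3, 4, 5, 4
A1 = rng.standard_normal((I1, R))
A2 = rng.standard_normal((I2, R))
A3 = rng.standard_normal((I3, R))        # known factor, full column rank (5 >= 4)
X2 = khatri_rao(A1, A2) @ A3.T           # observed matrix representation X_[2]

Y = X2 @ np.linalg.pinv(A3).T            # equals A1 (Khatri-Rao) A2
for r in range(R):
    M = Y[:, r].reshape(I1, I2)          # unvec: rank-1 matrix a1_r a2_r^T
    U, s, Vt = np.linalg.svd(M)
    a1, a2 = U[:, 0] * s[0], Vt[0]       # best rank-1 factors (scaling split is arbitrary)
```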

Next, we present a new result for the case where the known factor matrix does not necessarily have full column rank. The result is stated in Theorems 3.6 and 3.8. We first provide some auxiliary results.

Lemma 3.2. Given an analytic function f : C^n → C. If there exists a vector x ∈ C^n such that f(x) ≠ 0, then the set { x | f(x) = 0 } is of Lebesgue measure zero.

Proof. For a proof, see for instance [10].

A matrix A^{(1)} is said to be Vandermonde if it takes the form

A^{(1)} = [ 1, ..., 1 ; z_1, ..., z_R ; ⋮ ; z_1^{I_1−1}, ..., z_R^{I_1−1} ] ∈ C^{I_1×R},   (3.1)

where the {z_r} are called the generators of A^{(1)}.

Lemma 3.3. Let A^{(1)} ∈ C^{I_1×R} be a Vandermonde matrix and let A^{(2)} ∈ C^{I_2×R}; then the matrix A^{(1)} ⊙ A^{(2)} generically has rank min(I_1 I_2, R).

Proof. The result follows from Theorem 3 and Corollary 1 in [10].
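The generic rank statement of Lemma 3.3 is easy to observe numerically for one random instance (which, by Lemma 3.2, is what "generically" requires); dimensions below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
I1, I2, R = 3, 2, 5                        # I1 * I2 = 6 >= R = 5
z = rng.standard_normal(R)                 # generic (distinct) generators
A1 = np.vander(z, I1, increasing=True).T   # I1 x R Vandermonde: rows z^0, z^1, z^2
A2 = rng.standard_normal((I2, R))
KR = np.vstack([np.kron(A1[:, r], A2[:, r]) for r in range(R)]).T  # A1 (Khatri-Rao) A2
# expected rank: min(I1 * I2, R) = 5
```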

Proposition 3.4. Given a Vandermonde matrix A^{(1)} = [ a^{(1)}_1, ..., a^{(1)}_R ] ∈ C^{I_1×R} with distinct generators, let the column vectors of U ∈ C^{I_1I_2×R} constitute a basis for Col( A^{(1)} ⊙ A^{(2)} ), where A^{(2)} ∈ C^{I_2×R}. If the matrix A̲^{(1)} ⊙ A^{(2)} has full column rank, then the matrix

G^{(r)} = [ U, −a^{(1)}_r ⊗ I_{I_2} ] ∈ C^{I_1I_2×(R+I_2)}   (3.2)

has rank R + I_2 − 1, ∀r ∈ {1,...,R}. Generically, G^{(r)} has rank R + I_2 − 1 if (I_1 − 1)I_2 ≥ R.

Proof. Consider a Vandermonde matrix A^{(1)} ∈ C^{I_1×R} as in (3.1). Let us first prove that if A̲^{(1)} ⊙ A^{(2)} has full column rank, then G^{(r)} has a one-dimensional kernel. Let U = A^{(1)} ⊙ A^{(2)}; then

G^{(r)} = [ A^{(1)} ⊙ A^{(2)}, −a^{(1)}_r ⊗ I_{I_2} ] = [ A^{(2)}, −I_{I_2} ; A^{(2)}Z, −z_r I_{I_2} ; ⋮ ; A^{(2)}Z^{I_1−1}, −z_r^{I_1−1} I_{I_2} ] ∈ C^{I_1I_2×(R+I_2)},

where Z = D_1([z_1, z_2, ..., z_R]). Consider the nonsingular block-bidiagonal matrix F^{(r)} ∈ C^{I_1I_2×I_1I_2} with identity blocks I_{I_2} on the block diagonal and blocks −z_r I_{I_2} on the first block subdiagonal; then

F^{(r)} G^{(r)} = [ A^{(2)}, −I_{I_2} ; A^{(2)}(Z − z_r I_R), 0_{I_2,I_2} ; ⋮ ; A^{(2)}Z^{I_1−2}(Z − z_r I_R), 0_{I_2,I_2} ].   (3.3)

Hence, the problem of determining the rank of G^{(r)} reduces to finding the rank of

H^{(r)} = [ A^{(2)}(Z − z_r I_R) ; ⋮ ; A^{(2)}Z^{I_1−2}(Z − z_r I_R) ] = ( A̲^{(1)} ⊙ A^{(2)} )( Z − z_r I_R ),   (3.4)

since r(G^{(r)}) = I_2 + r(H^{(r)}). The deterministic condition with A̲^{(1)} ⊙ A^{(2)} having full column rank follows from (3.4). Indeed, if A̲^{(1)} ⊙ A^{(2)} has full column rank, then the matrix (3.4) has rank R − 1 (the generators are distinct, so Z − z_r I_R has rank R − 1), and consequently G^{(r)} in (3.2) has rank R + I_2 − 1, ∀r ∈ {1,...,R}. More generally, any U whose columns form a basis for Col( A^{(1)} ⊙ A^{(2)} ) yields a matrix G^{(r)} of rank R + I_2 − 1, ∀r ∈ {1,...,R}.

Let us now prove that if (I_1 − 1)I_2 ≥ R, then generically G^{(r)} has a one-dimensional kernel. Due to Lemma 3.2 we just need to find one example where the statement made in this proposition holds; note that any minor of G^{(r)} is an analytic function of the elements of G^{(r)}. Due to Lemma 3.3 we know that when (I_1 − 1)I_2 ≥ R, the matrix H^{(r)} given by (3.4) generically has rank R − 1. Hence, we can conclude that when (I_1 − 1)I_2 ≥ R, the matrix G^{(r)} in (3.2) generically has rank R + I_2 − 1, ∀r ∈ {1,...,R}.

A direct consequence of Proposition 3.4 is the following result, which generalizes the generic rank condition to the case where A(1) is not Vandermonde. The result will be instrumental in the proof of Theorem 3.6.

Corollary 3.5. Given A^{(1)} = [ a^{(1)}_1, ..., a^{(1)}_R ] ∈ C^{I_1×R}, let the column vectors of U ∈ C^{I_1I_2×R} constitute a basis for Col( A^{(1)} ⊙ A^{(2)} ), where A^{(2)} ∈ C^{I_2×R}. If (I_1 − 1)I_2 ≥ R, then the matrix

G^{(r)} = [ U, −a^{(1)}_r ⊗ I_{I_2} ] ∈ C^{I_1I_2×(R+I_2)}

generically has rank R + I_2 − 1, ∀r ∈ {1,...,R}.

Proof. The result follows directly from Lemma 3.2 and Proposition 3.4.

Theorem 3.6. Consider X ∈ C^{I_1×···×I_N} with rank R and matrix representation X_{[P]} = ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} )^T. Assume that the factor matrix A^{(1)} is known. If

  r( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} ) = R,
  r( [ A^{(1)} ⊙ ··· ⊙ A^{(P)}, −a^{(1)}_r ⊗ I_{∏_{n=2}^P I_n} ] ) = ∏_{n=2}^P I_n + R − 1,  ∀r ∈ {1,...,R},   (3.5)

then the CPD of X is unique. Generically, condition (3.5) is satisfied when

min( (I_1 − 1) ∏_{n=2}^P I_n, ∏_{m=P+1}^N I_m ) ≥ R.   (3.6)

Proof. Let us first prove that condition (3.5) guarantees the uniqueness of the CPD of X with A^{(1)}. Note that, trivially, a^{(1)}_r ⊗ ··· ⊗ a^{(P)}_r ∈ Col( −a^{(1)}_r ⊗ I_{∏_{n=2}^P I_n} ). Condition (3.5) implies that this is the only dependency between the column vectors of [ A^{(1)} ⊙ ··· ⊙ A^{(P)}, −a^{(1)}_r ⊗ I_{∏_{n=2}^P I_n} ]. Hence, we know that r( A^{(1)} ⊙ ··· ⊙ A^{(P)} ) = R. Given are X_{[P]} = ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} )^T and A^{(1)}. We now know that, under condition (3.5), the matrices A^{(1)} ⊙ ··· ⊙ A^{(P)} and A^{(P+1)} ⊙ ··· ⊙ A^{(N)} have full column rank. Thus, let X_{[P]} = UΣV^H denote the compact SVD of X_{[P]}, where U ∈ C^{(∏_{n=1}^P I_n)×R}; then there exists a nonsingular matrix M ∈ C^{R×R} such that

U M = A^{(1)} ⊙ ··· ⊙ A^{(P)}  ⇔  U m_r = a^{(1)}_r ⊗ ··· ⊗ a^{(P)}_r,  r ∈ {1,...,R}.   (3.7)

Relation (3.7) can also be written as

G^{(r)} [ m_r ; a^{(2)}_r ⊗ ··· ⊗ a^{(P)}_r ] = 0_{∏_{p=1}^P I_p},  r ∈ {1,...,R},   (3.8)

where

G^{(r)} = [ U, −a^{(1)}_r ⊗ I_{∏_{n=2}^P I_n} ] ∈ C^{(∏_{n=1}^P I_n)×(∏_{n=2}^P I_n + R)}.   (3.9)

Since we assume that G^{(r)} has a one-dimensional kernel, the factor matrices A^{(1)} up to A^{(P)} follow from (3.8), as explained next. Let x_r ∈ C^{∏_{n=2}^P I_n + R} denote the right singular vector associated with the zero singular value of the matrix in (3.9). Set

C^{∏_{n=2}^P I_n} ∋ y_r = x_r( R+1 : R + ∏_{n=2}^P I_n );

then due to relation (3.8) the factor matrices {A^{(n)}}_{n=2}^P follow from the decoupled best rank-1 tensor approximation problems

min_{a^{(2)}_r, ..., a^{(P)}_r} ‖ y_r − a^{(2)}_r ⊗ ··· ⊗ a^{(P)}_r ‖²_F,  ∀r ∈ {1,...,R}.   (3.10)

Recall that the matrix A^{(1)} ⊙ ··· ⊙ A^{(P)} has full column rank. Let

F^T = ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )^† X_{[P]} = ( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} )^T;

then the remaining factor matrices follow from the decoupled best rank-1 tensor approximation problems

min_{a^{(P+1)}_r, ..., a^{(N)}_r} ‖ f_r − a^{(P+1)}_r ⊗ ··· ⊗ a^{(N)}_r ‖²_F,  ∀r ∈ {1,...,R}.   (3.11)

Let us now prove that condition (3.6) generically guarantees the uniqueness of the CPD of X with A^{(1)}. Given are X_{[P]} = ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} )^T and A^{(1)}. Due to Lemma 3.3 we know that min( (I_1 − 1) ∏_{n=2}^P I_n, ∏_{m=P+1}^N I_m ) ≥ R implies that generically the matrices A^{(1)} ⊙ ··· ⊙ A^{(P)} and A^{(P+1)} ⊙ ··· ⊙ A^{(N)} have full column rank. Due to Corollary 3.5 we know that if (I_1 − 1) ∏_{n=2}^P I_n ≥ R, then the rank of G^{(r)} given by (3.9) is generically equal to ∏_{n=2}^P I_n + R − 1. Hence, generically we can work as in the proof of the deterministic case.

An important remark concerning the proof of Theorem 3.6 is that it is constructive, i.e., it provides us with an algorithm to compute a CPD with a known factor. In other words, if one of the factor matrices of the CPD is known, say A^{(1)}, and the conditions stated in Theorem 3.6 are satisfied, then even if the known factor matrix does not have full column rank (as opposed to Proposition 3.1), we can find the remaining factor matrices from decoupled best rank-1 tensor approximation problems. The algorithm is useful for applications in wireless communication and array processing, see [23, 22].

Note that if P = 2, then A^{(2)} follows directly from (3.10). Furthermore, if P = 3, then A^{(2)} and A^{(3)} follow from (3.10) via best rank-1 matrix approximation problems, i.e., via standard numerical linear algebra. Similarly, if N = P + 1, then A^{(N)} follows directly from (3.11). Again, if N = P + 2, then A^{(N−1)} and A^{(N)} follow from (3.11) via best rank-1 matrix approximation problems.

An outline of the proposed method for computing a CPD with a known factor matrix is given in Algorithm 1. We exploited the relation [11]

( A^{(1)} ⊙ ··· ⊙ A^{(P)} )^† = ( (A^{(1)H}A^{(1)}) ∗ ··· ∗ (A^{(P)H}A^{(P)}) )^{−1} ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )^H.

From the first part of the proof of Theorem 3.6 we obtain the following result, which may be useful in problems where only some of the columns of the factor matrix A(1) are known, see [23] for concrete examples in signal processing.

Algorithm 1 Outline of the procedure for computing a CPD with a known factor matrix based on Theorem 3.6.

Input: X_{[P]} = ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} )^T and A^{(1)}.
1. Compute the SVD X_{[P]} = UΣV^H.
2. Solve [ U, −a^{(1)}_r ⊗ I_{∏_{n=2}^P I_n} ] x_r = 0_{∏_{p=1}^P I_p}, ∀r ∈ {1,...,R}.
3. Set y_r = x_r( R+1 : R + ∏_{n=2}^P I_n ), ∀r ∈ {1,...,R}.
4. Solve min_{a^{(2)}_r,...,a^{(P)}_r} ‖ y_r − a^{(2)}_r ⊗ ··· ⊗ a^{(P)}_r ‖²_F, ∀r ∈ {1,...,R}.
5. Compute G = ( (A^{(1)H}A^{(1)}) ∗ ··· ∗ (A^{(P)H}A^{(P)}) )^{−1}.
6. Compute F^T = G ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )^H X_{[P]}.
7. Solve min_{a^{(P+1)}_r,...,a^{(N)}_r} ‖ f_r − a^{(P+1)}_r ⊗ ··· ⊗ a^{(N)}_r ‖²_F, ∀r ∈ {1,...,R}.
Output: {A^{(n)}}_{n=2}^N.

Corollary 3.7. Consider X ∈ C^{I_1×···×I_N} with rank R and matrix representation X_{[P]} = ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} )^T. Assume that the column vector a^{(1)}_r of the factor matrix A^{(1)} is known. If

  r( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} ) = R,
  r( [ A^{(1)} ⊙ ··· ⊙ A^{(P)}, −a^{(1)}_r ⊗ I_{∏_{n=2}^P I_n} ] ) = ∏_{n=2}^P I_n + R − 1,   (3.12)

then the column vectors {a^{(p)}_r}_{p=2}^P are unique. Generically, this is true if

min( (I_1 − 1) ∏_{n=2}^P I_n, ∏_{m=P+1}^N I_m ) ≥ R.   (3.13)

Let X_{[P]} = UΣV^H denote the compact SVD of X_{[P]} and assume that the column vector a^{(1)}_r is known. Then numerically we first solve the homogeneous linear system

[ U, −a^{(1)}_r ⊗ I_{∏_{n=2}^P I_n} ] x_r = 0_{∏_{p=1}^P I_p},

and thereafter we obtain the column vectors {a^{(p)}_r}_{p=2}^P via the rank-1 approximation problem

min_{a^{(2)}_r, ..., a^{(P)}_r} ‖ y_r − a^{(2)}_r ⊗ ··· ⊗ a^{(P)}_r ‖²_F,

in which y_r = x_r( R+1 : R + ∏_{n=2}^P I_n ).

We now derive a relaxed version of Theorem 3.6 in which the matrices A^{(1)} ⊙ ··· ⊙ A^{(P)} and A^{(P+1)} ⊙ ··· ⊙ A^{(N)} are not required to have full column rank. We will assume that the known factor A^{(1)} has rank S. In that case we can w.l.o.g. assume that r( A^{(1)}(1:S, 1:S) ) = S. Indeed, if this is not the case, then there exist permutation matrices P_{I_1} ∈ C^{I_1×I_1} and P_R ∈ C^{R×R} such that r( ( P_{I_1} A^{(1)} P_R )(1:S, 1:S) ) = S.

Theorem 3.8. Consider X ∈ C^{I_1×···×I_N} with rank R and matrix representation X_{[P]} = ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} )^T. Denote S = r( A^{(1)} ). Assume that A^{(1)} is known and S ≥ 2. W.l.o.g. we also assume that r( A^{(1)}(1:S, 1:S) ) = S. Consider an integer m ∈ {2,...,S} and define

A^{(1,m,S)} = [ a^{(1)}_1(1:m), ..., a^{(1)}_m(1:m), a^{(1)}_{S+1}(1:m), ..., a^{(1)}_R(1:m) ] ∈ C^{m×(R−S+m)},
A^{(n,m,S)} = [ a^{(n)}_1, ..., a^{(n)}_m, a^{(n)}_{S+1}, ..., a^{(n)}_R ] ∈ C^{I_n×(R−S+m)},  n ∈ {2,...,N}.

If

  r( A^{(P+1,m,S)} ⊙ ··· ⊙ A^{(N,m,S)} ) = R − S + m,
  r( [ A^{(1,m,S)} ⊙ ··· ⊙ A^{(P,m,S)}, −a^{(1,m,S)}_r ⊗ I_{∏_{n=2}^P I_n} ] ) = J + R − S + m − 1,
    ∀r ∈ {1,...,R−S+m}, for some m ∈ {2,...,S},   (3.14)

in which J = ∏_{n=2}^P I_n, then the CPD of X is unique. Generically, condition (3.14) is satisfied if

  R ≤ ( (K+S)(J−1) + S − J ) / J  when K < R,
  R ≤ (S − 1) J  when K ≥ R,   (3.15)

where J = ∏_{n=2}^P I_n, K = ∏_{n=P+1}^N I_n and S = min(I_1, R).

Proof. Let us first prove that condition (3.14) guarantees the uniqueness of the CPD of X with A^{(1)}. W.l.o.g. we assume that A^{(1)}(1:S, 1:S) is nonsingular. Observe that

B^{(1)} = ( A^{(1)}(1:S, 1:S) )^{−1} A^{(1)}(1:S, :) = [ I_S, ( A^{(1)}(1:S, 1:S) )^{−1} A^{(1)}(1:S, S+1:R) ].   (3.16)

Compute

Y_{[1]} = ( A^{(1)}(1:S, 1:S) )^{−1} X_{[1]}(1:S, :).

Due to the structure of the matrix given by (3.16) we can pick m ≤ S rows of Y_{[1]} so that (S − m) columns of B^{(1)} do not contribute. Hence, the tensor Z with matrix representation

Z_{[1]} = Y_{[1]}(1:m, :)

is composed of (R − S + m) rank-1 terms. Its matrix representation Z_{[P]} is given by

Z_{[P]} = ( ⊙_{n=1}^P A^{(n,m,S)} ) ( ⊙_{p=P+1}^N A^{(p,m,S)} )^T.   (3.17)

We assume that r( [ A^{(1,m,S)} ⊙ ··· ⊙ A^{(P,m,S)}, −a^{(1,m,S)}_r ⊗ I_{∏_{n=2}^P I_n} ] ) = ∏_{n=2}^P I_n + R − S + m − 1. As explained in Theorem 3.6 this implies that r( A^{(1,m,S)} ⊙ ··· ⊙ A^{(P,m,S)} ) = R − S + m, i.e., ⊙_{n=1}^P A^{(n,m,S)} has full column rank. We also assume that ⊙_{p=P+1}^N A^{(p,m,S)} has full column rank. Let Z_{[P]} = UΣV^H denote the compact SVD of Z_{[P]}, in which U ∈ C^{(m ∏_{n=2}^P I_n)×(R−S+m)}; then there exists a nonsingular matrix M ∈ C^{(R−S+m)×(R−S+m)} such that

U M = ⊙_{n=1}^P A^{(n,m,S)},

or equivalently

G^{(r,m,S)} [ m_r ; a^{(2,m,S)}_r ⊗ ··· ⊗ a^{(P,m,S)}_r ] = 0_{m ∏_{n=2}^P I_n},   (3.18)

in which

G^{(r,m,S)} = [ U, −a^{(1,m,S)}_r ⊗ I_{∏_{n=2}^P I_n} ] ∈ C^{(m ∏_{n=2}^P I_n)×(∏_{n=2}^P I_n + R − S + m)}.   (3.19)

We assume that G^{(r,m,S)} has rank ∏_{n=2}^P I_n + R − S + m − 1, ∀r ∈ {1,...,R−S+m}, which implies that we can obtain the matrices {A^{(n,m,S)}}_{n=2}^P from (3.18) as follows. Let x_r ∈ C^{∏_{n=2}^P I_n + R − S + m} denote the right singular vector associated with the zero singular value of the matrix given by (3.19). Set

C^{∏_{n=2}^P I_n} ∋ y_r = x_r( R − S + m + 1 : R − S + m + ∏_{n=2}^P I_n );

then due to relation (3.18) the factor matrices {A^{(n,m,S)}}_{n=2}^P follow from the decoupled best rank-1 tensor approximation problems

min_{a^{(2,m,S)}_r, ..., a^{(P,m,S)}_r} ‖ y_r − a^{(2,m,S)}_r ⊗ ··· ⊗ a^{(P,m,S)}_r ‖²_F,  r ∈ {1,...,R−S+m}.

The next step is to determine the matrices {A^{(n,m,S)}}_{n=P+1}^N. Let

F^T = ( A^{(1,m,S)} ⊙ ··· ⊙ A^{(P,m,S)} )^† Z_{[P]} = ( A^{(P+1,m,S)} ⊙ ··· ⊙ A^{(N,m,S)} )^T;

then the factor matrices {A^{(n,m,S)}}_{n=P+1}^N follow from the decoupled best rank-1 tensor approximation problems

min_{a^{(P+1,m,S)}_r, ..., a^{(N,m,S)}_r} ‖ f_r − a^{(P+1,m,S)}_r ⊗ ··· ⊗ a^{(N,m,S)}_r ‖²_F,  r ∈ {1,...,R−S+m}.

Finally, given {A^{(n,m,S)}}_{n=1}^N we determine the matrices

Ã^{(n,m,S)} = [ a^{(n)}_{m+1}, ..., a^{(n)}_S ] ∈ C^{I_n×(S−m)},  n ∈ {2,...,N}.

Let Â^{(1,m,S)} = [ a^{(1)}_1, ..., a^{(1)}_m, a^{(1)}_{S+1}, ..., a^{(1)}_R ] and compute

Q_{[1]} = X_{[1]} − Â^{(1,m,S)} ( ⊙_{p=2}^N A^{(p,m,S)} )^T = Ã^{(1,m,S)} ( ⊙_{p=2}^N Ã^{(p,m,S)} )^T,


where Ã^{(1,m,S)} = [ a^{(1)}_{m+1}, ..., a^{(1)}_S ] ∈ C^{I_1×(S−m)} is known. Recall that w.l.o.g. we assumed that the matrix A^{(1)}(1:S, 1:S) is nonsingular, which in turn means that the matrix Ã^{(1,m,S)} has full column rank. Hence, we obtain

H = Q_{[1]}^T ( Ã^{(1,m,S)T} )^†,

and the remaining unknowns {Ã^{(n,m,S)}}_{n=2}^N now follow from the best rank-1 tensor approximation problems

min_{a^{(2,m,S)}_r, ..., a^{(N,m,S)}_r} ‖ h_r − a^{(2,m,S)}_r ⊗ ··· ⊗ a^{(N,m,S)}_r ‖²_F,  r ∈ {m+1,...,S}.

Let us now prove that condition (3.15) generically guarantees the uniqueness of the CPD of X with A^{(1)}. Let m ∈ {2,...,S}. Due to Lemma 3.3 we know that A^{(P+1,m,S)} ⊙ ··· ⊙ A^{(N,m,S)} generically has full column rank if

K ≥ R − S + m  ⇔  K − R + S ≥ m.

Since m ∈ {2,...,S} we obtain the following upper bound on m:

m ≤ min( K − R + S, S ).   (3.20)

Due to Corollary 3.5 we know that [ A^{(1,m,S)} ⊙ ··· ⊙ A^{(P,m,S)}, −a^{(1,m,S)}_r ⊗ I_{∏_{n=2}^P I_n} ] generically has rank J + R − S + m − 1 if

(m − 1) J ≥ R − S + m  ⇔  m ≥ (R − S + J) / (J − 1),   (3.21)

which necessitates m > 1. Consider first the case K − R + S < S. This in turn means that K < R. From (3.20) and (3.21) we conclude that if the inequality

(R − S + J) / (J − 1) ≤ K − R + S  ⇔  R ≤ ( (K + S)(J − 1) + S − J ) / J

is satisfied, then generically the CPD of X with A^{(1)} is unique. Consider now the case K − R + S ≥ S, or equivalently K ≥ R. In that case we know from (3.20) that m ≤ S. Combining this condition with (3.21) we conclude that if the inequality

(R − S + J) / (J − 1) ≤ S  ⇔  R ≤ (S − 1) J

is satisfied, then generically the CPD of X with A^{(1)} is unique.

We observe that the proof is constructive. An algorithm based on Theorem 3.8 is given in Algorithm 2.
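The key reduction step (3.16)-(3.17) is easy to illustrate numerically. The sketch below uses a hypothetical third-order example with a known rank-deficient factor (R = 5, S = 3) and checks that after the normalization, keeping only m = 2 rows removes the contribution of columns m+1,...,S:

```python
import numpy as np

def khatri_rao(A, B):
    return np.vstack([np.kron(A[:, r], B[:, r]) for r in range(A.shape[1])]).T

rng = np.random.default_rng(4)
R, S, m = 5, 3, 2
# Known factor A1 with rank S = 3 < R = 5 (last row is the sum of the first three)
A1 = np.array([[1.0, 0, 0, 2, 1],
               [0, 1, 0, 1, 3],
               [0, 0, 1, 1, 1],
               [1, 1, 1, 4, 5]])
A2, A3 = rng.standard_normal((4, R)), rng.standard_normal((6, R))
X1 = A1 @ khatri_rao(A2, A3).T              # X_[1] of a third-order tensor

# Normalize so that the known factor starts with an identity block, cf. (3.16)
B1 = np.linalg.solve(A1[:S, :S], A1[:S, :])
Y1 = np.linalg.solve(A1[:S, :S], X1[:S, :])
Z1 = Y1[:m, :]                              # keep m rows: columns m+1..S of B1 drop out

# Z is a CPD with R - S + m = 4 terms, built from columns {1,...,m, S+1,...,R}
idx = [0, 1, 3, 4]
Z_check = B1[:m, idx] @ khatri_rao(A2[:, idx], A3[:, idx]).T
assert np.allclose(Z1, Z_check)
```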

Analogous to the above, the following result may be useful in signal processing applications where only some of the columns of the factor matrix A(1) are known, see [23] for concrete examples.

Algorithm 2 Outline of the procedure for computing a CPD with a known factor matrix A^{(1)} based on Theorem 3.8.

Input: X_{[P]} = ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} )^T and A^{(1)} with S = r( A^{(1)} ) = r( A^{(1)}(1:S, 1:S) ).
1. Choose an integer m ∈ {2,...,S}.
2. Compute Y_{[1]} = ( A^{(1)}(1:S, 1:S) )^{−1} X_{[1]}(1:S, :).
3. Set Z_{[1]} = Y_{[1]}(1:m, :). Build Z_{[P]}.
4. Compute the SVD Z_{[P]} = UΣV^H.
5. Solve [ U, −a^{(1,m,S)}_r ⊗ I_{∏_{n=2}^P I_n} ] x_r = 0_{m ∏_{p=2}^P I_p}, r ∈ {1,...,R−S+m}.
6. Set y_r = x_r( R − S + m + 1 : R − S + m + ∏_{n=2}^P I_n ).
7. Solve min_{{a^{(n,m,S)}_r}_{n=2}^P} ‖ y_r − a^{(2,m,S)}_r ⊗ ··· ⊗ a^{(P,m,S)}_r ‖²_F, r ∈ {1,...,R−S+m}.
8. Compute G = ( (A^{(1,m,S)H}A^{(1,m,S)}) ∗ ··· ∗ (A^{(P,m,S)H}A^{(P,m,S)}) )^{−1}.
9. Compute F^T = G ( A^{(1,m,S)} ⊙ ··· ⊙ A^{(P,m,S)} )^H Z_{[P]}.
10. Solve min_{{a^{(n,m,S)}_r}_{n=P+1}^N} ‖ f_r − a^{(P+1,m,S)}_r ⊗ ··· ⊗ a^{(N,m,S)}_r ‖²_F, r ∈ {1,...,R−S+m}.
11. Build Â^{(1,m,S)} = [ a^{(1)}_1, ..., a^{(1)}_m, a^{(1)}_{S+1}, ..., a^{(1)}_R ].
12. Compute Q_{[1]} = X_{[1]} − Â^{(1,m,S)} ( ⊙_{p=2}^N A^{(p,m,S)} )^T.
13. Build Ã^{(1,m,S)} = [ a^{(1)}_{m+1}, ..., a^{(1)}_S ].
14. Compute H = Q_{[1]}^T ( Ã^{(1,m,S)T} )^†.
15. Solve min_{{a^{(n,m,S)}_r}_{n=2}^N} ‖ h_r − a^{(2,m,S)}_r ⊗ ··· ⊗ a^{(N,m,S)}_r ‖²_F, r ∈ {m+1,...,S}.
Output: {A^{(n)}}_{n=2}^N.

Corollary 3.9. Consider X ∈ C^{I_1×···×I_N} with rank R and matrix representation X_{[P]} = ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )( A^{(P+1)} ⊙ ··· ⊙ A^{(N)} )^T. Assume that the submatrix [ a^{(1)}_1, ..., a^{(1)}_Q ] of A^{(1)} is known and denote r( [ a^{(1)}_1, ..., a^{(1)}_Q ] ) = Q. W.l.o.g. we assume that r( A^{(1)}(1:Q, 1:Q) ) = Q. Consider an integer m ∈ {2,...,Q} and define

A^{(1,m,Q)} = [ a^{(1)}_1(1:m), ..., a^{(1)}_m(1:m), a^{(1)}_{Q+1}(1:m), ..., a^{(1)}_R(1:m) ] ∈ C^{m×(R−Q+m)},
A^{(n,m,Q)} = [ a^{(n)}_1, ..., a^{(n)}_m, a^{(n)}_{Q+1}, ..., a^{(n)}_R ] ∈ C^{I_n×(R−Q+m)},  n ∈ {2,...,N}.

If

  r( A^{(P+1,m,Q)} ⊙ ··· ⊙ A^{(N,m,Q)} ) = R − Q + m,
  r( [ A^{(1,m,Q)} ⊙ ··· ⊙ A^{(P,m,Q)}, −a^{(1,m,Q)}_r ⊗ I_{∏_{n=2}^P I_n} ] ) = J + R − Q + m − 1,
    ∀r ∈ {1,...,R−Q+m}, for some m ∈ {2,...,Q},   (3.22)

in which J = ∏_{n=2}^P I_n, then the column vectors {a^{(p,m,Q)}_q}_{p=2}^P are unique. Generically, condition (3.22) is satisfied if

  R ≤ ( (K+Q)(J−1) + Q − J ) / J  when K < R,
  R ≤ (S − 1) J  when K ≥ R,   (3.23)

where J = ∏_{n=2}^P I_n, K = ∏_{n=P+1}^N I_n and S = min(I_1, R).

We compute the column vectors {a^{(p,m,Q)}_q}_{p=2}^P as follows. Compute Y_{[1]} = ( A^{(1)}(1:Q, 1:Q) )^{−1} X_{[1]} and Z_{[1]} = Y_{[1]}(1:m, :). Let Z_{[P]} = UΣV^H denote the compact SVD of Z_{[P]} and solve the homogeneous linear system

[ U, −a^{(1,m,Q)}_r ⊗ I_{∏_{n=2}^P I_n} ] x_r = 0_{m ∏_{p=2}^P I_p}.

Finally, we obtain the vectors {a^{(p,m,Q)}_q}_{p=2}^P via the rank-1 approximation

min_{{a^{(p,m,Q)}_r}_{p=2}^P} ‖ y_r − a^{(2,m,Q)}_r ⊗ ··· ⊗ a^{(P,m,Q)}_r ‖²_F.

4. New Uniqueness Results for Overall CPD. By combining the existing results presented in subsection 2.1 with the new results presented in section 3, we now derive new uniqueness conditions for the overall CPD and for some variants that are important in signal processing.

4.1. New Uniqueness Condition for CPD. We are now ready to present a deterministic uniqueness condition for CPD that does not involve k-ranks.

Theorem 4.1. Consider X ∈ C^{I_1×···×I_N} with rank R and matrix representation X_{[P]} = ( A^{(1)} ⊙ ··· ⊙ A^{(P)} )( A^{(P+1)} ⊙ ··· ⊙ A^{(Q)} ⊙ A^{(Q+1)} ⊙ ··· ⊙ A^{(N)} )^T, where P ∈ {1,...,N−2}, Q ∈ {2,...,N−1} and P < Q. Denote D = A^{(1)} ⊙ ··· ⊙ A^{(P)} ∈ C^{(∏_{p=1}^P I_p)×R} and S = r(D). Assume that S ≥ 2 and w.l.o.g. assume also that r( D(1:S, 1:S) ) = S. Consider an integer m ∈ {2,...,S} and define

D^{(m,S)} = [ d_1(1:m), ..., d_m(1:m), d_{S+1}(1:m), ..., d_R(1:m) ] ∈ C^{m×(R−S+m)},
A^{(n,m,S)} = [ a^{(n)}_1, ..., a^{(n)}_m, a^{(n)}_{S+1}, ..., a^{(n)}_R ] ∈ C^{I_n×(R−S+m)},  n ∈ {P+1,...,N}.

Define also J = ∏_{n=P+1}^Q I_n. If

  r( C_{R−S+2}( ⊙_{p=P+1}^Q A^{(p)} ) ⊙ C_{R−S+2}( ⊙_{q=Q+1}^N A^{(q)} ) ) = C^{R−S+2}_R,
  r( A^{(Q+1,m,S)} ⊙ ··· ⊙ A^{(N,m,S)} ) = R − S + m,
  r( [ D^{(m,S)} ⊙ A^{(P+1,m,S)} ⊙ ··· ⊙ A^{(Q,m,S)}, −d^{(m,S)}_r ⊗ I_J ] ) = J + R − S + m − 1,
    ∀r ∈ {1,...,R−S+m}, for some m ∈ {2,...,S},
  for some P ∈ {1,...,N−2} and Q ∈ {2,...,N−1} with P < Q,   (4.1)

then the CPD of X is unique.

Proof. Let us treat X as a third-order tensor such that X_{[P]} = C ( A ⊙ B )^T, in which C = A^{(1)} ⊙ ··· ⊙ A^{(P)}, A = A^{(P+1)} ⊙ ··· ⊙ A^{(Q)} and B = A^{(Q+1)} ⊙ ··· ⊙ A^{(N)}.

Due to Theorem 2.2, the condition

r( C_{R−S+2}( ⊙_{p=P+1}^Q A^{(p)} ) ⊙ C_{R−S+2}( ⊙_{q=Q+1}^N A^{(q)} ) ) = C^{R−S+2}_R,  for some P ∈ {1,...,N−2} and Q ∈ {2,...,N−1} with P < Q,

stated in (4.1) guarantees the uniqueness of C = A^{(1)} ⊙ ··· ⊙ A^{(P)}, which due to the rank-1 structure of the column vectors of C also implies the uniqueness of the factor matrices {A^{(n)}}_{n=1}^P. We now treat X_{[P]} = C ( A ⊙ B )^T as a third-order CPD with known factor C. Due to Theorem 3.8, the conditions

  r( A^{(Q+1,m,S)} ⊙ ··· ⊙ A^{(N,m,S)} ) = R − S + m,
  r( [ D^{(m,S)} ⊙ A^{(P+1,m,S)} ⊙ ··· ⊙ A^{(Q,m,S)}, −d^{(m,S)}_r ⊗ I_J ] ) = J + R − S + m − 1,
    ∀r ∈ {1,...,R−S+m}, for some m ∈ {2,...,S},

stated in (4.1) now guarantee the uniqueness of A and B. Finally, due to the rank-1 structure of the column vectors of A and B we obtain the overall uniqueness of the CPD.

Remark that no k-ranks are involved in the deterministic condition (4.1). No efficient method for computing the k-rank has been proposed so far; the computation of the k-rank of a matrix is in fact expected to be NP-hard, see for instance [15] and references therein for a discussion.

For a discussion of Theorem 4.1, let us first consider the generic case, where the factor matrices have maximal rank and k-rank. We notice that, generically, if the one-factor-matrix uniqueness condition stated in Theorem 2.2 is satisfied, then the overall uniqueness condition stated in Theorem 2.4 is automatically satisfied. This means that in the generic case the condition of Theorem 2.2, which corresponds to the first condition in (4.1), prevents Theorem 4.1 from being more relaxed than Theorem 2.4.
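The first condition in (4.1) is stated in terms of compound matrices, which are straightforward, if expensive, to evaluate numerically. The sketch below is a minimal illustrative implementation for a third-order tensor (so $P=1$, $Q=2$, $N=3$); the helper names `compound` and `khatri_rao` are ours, not the paper's, and the brute-force minor enumeration is only practical for small sizes.

```python
import numpy as np
from itertools import combinations

def compound(A, k):
    """k-th compound matrix C_k(A): determinants of all k-by-k submatrices of A."""
    m, n = A.shape
    rows, cols = list(combinations(range(m), k)), list(combinations(range(n), k))
    C = np.empty((len(rows), len(cols)), dtype=complex)
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            C[i, j] = np.linalg.det(A[np.ix_(r, c)])
    return C

def khatri_rao(A, B):
    """Columnwise Kronecker (Khatri-Rao) product A ⊙ B."""
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

# Check whether r(C_{R-S+2}(A2) ⊙ C_{R-S+2}(A3)) equals binom(R, R-S+2)
# for random 4 x 5 factors with S = 4, so that R - S + 2 = 3.
rng = np.random.default_rng(0)
R, S = 5, 4
A2 = rng.standard_normal((4, R)) + 1j * rng.standard_normal((4, R))
A3 = rng.standard_normal((4, R)) + 1j * rng.standard_normal((4, R))
M = khatri_rao(compound(A2, R - S + 2), compound(A3, R - S + 2))
full_column_rank = np.linalg.matrix_rank(M) == M.shape[1]
```

For random factors of these sizes the condition is generically satisfied, so `full_column_rank` evaluates to `True`.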

We will now give an example showing that Theorem 4.1 can lead to relaxed overall uniqueness results in the non-generic case. Consider the third-order tensor $\mathcal{X}\in\mathbb{C}^{4\times 4\times 4}$ with matrix representation $\mathbf{X}_{[1]}=(\mathbf{A}\odot\mathbf{B})\mathbf{C}^T$, in which $\mathbf{A}\in\mathbb{C}^{4\times 5}$, $\mathbf{B}\in\mathbb{C}^{4\times 5}$ and $\mathbf{C}\in\mathbb{C}^{4\times 5}$, where
$$ \mathbf{A}=\begin{bmatrix}1&0&0&0&a_{15}\\0&1&0&0&0\\0&0&1&0&a_{35}\\0&0&0&1&a_{45}\end{bmatrix},\quad \mathbf{B}=\begin{bmatrix}1&0&0&0&0\\0&1&0&0&b_{25}\\0&0&1&0&b_{35}\\0&0&0&1&b_{45}\end{bmatrix},\quad \mathbf{C}=\begin{bmatrix}1&0&0&0&c_{15}\\0&1&0&0&0\\0&0&1&0&c_{35}\\0&0&0&1&c_{45}\end{bmatrix}. $$
For a random choice of the parameter entries we have $k(\mathbf{A})=3$, $k(\mathbf{B})=3$ and $k(\mathbf{C})=3$. On the other hand, we also have $r(\mathbf{A})=4$, $r(\mathbf{B})=4$ and $r(\mathbf{C})=4$. For this problem the conditions stated in Theorem 2.4 are not satisfied, which means that Theorem 2.4 fails. However, the conditions stated in Theorem 4.1 are satisfied with $m=3$, which implies overall uniqueness of the CPD of $\mathcal{X}$.
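The rank and k-rank claims in this example are easy to verify numerically. The sketch below uses a brute-force k-rank routine (our own helper, feasible here because there are only five columns); the parameter entries are drawn at random, as in the example.

```python
import numpy as np
from itertools import combinations

def krank(M, tol=1e-9):
    """k-rank: largest k such that every set of k columns of M is linearly independent."""
    n = M.shape[1]
    k = 0
    for size in range(1, n + 1):
        if all(np.linalg.matrix_rank(M[:, list(c)], tol=tol) == size
               for c in combinations(range(n), size)):
            k = size
        else:
            break
    return k

rng = np.random.default_rng(0)
a15, a35, a45 = rng.standard_normal(3)
b25, b35, b45 = rng.standard_normal(3)
c15, c35, c45 = rng.standard_normal(3)

A = np.eye(4, 5); A[:, 4] = [a15, 0.0, a35, a45]
B = np.eye(4, 5); B[:, 4] = [0.0, b25, b35, b45]
C = np.eye(4, 5); C[:, 4] = [c15, 0.0, c35, c45]

ranks  = [np.linalg.matrix_rank(M) for M in (A, B, C)]   # [4, 4, 4]
kranks = [krank(M) for M in (A, B, C)]                   # [3, 3, 3]
```

The k-rank is 3 because the last column of each matrix has three nonzero entries, so it lies in the span of the three corresponding unit vectors but of no smaller subset of columns.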

4.2. New Uniqueness Condition for CPD with Partial Hermitian Symmetry. We say that the CPD of a tensor $\mathcal{X}\in\mathbb{C}^{I_1\times I_1\times I_3}$ is partially Hermitian symmetric if $\mathbf{X}_{[2]}=\left(\mathbf{A}^{(1)}\odot\mathbf{A}^{(1)*}\right)\mathbf{A}^{(3)T}$. This structure is very common in signal separation applications. For instance, tensors with partial Hermitian symmetry are obtained by stacking sets of covariance matrices of complex data; see [24, 1, 3, 4, 27, 28] for references to concrete applications. It has recently been shown in [24] that a complex CPD with a partial Hermitian symmetry is unique under milder conditions than an unconstrained CPD. The result presented in [24] was limited to the case where the matrix $\mathbf{D}=\left[\mathbf{A}^{(3)T},\ \mathbf{A}^{(3)H}\right]^T$ has full column rank. We will now provide a new uniqueness condition for a complex CPD with a partial Hermitian symmetry that is even more relaxed than the one obtained in [24], in the sense that $\mathbf{D}$ is not required to have full column rank.

Theorem 4.2. Consider a third-order tensor $\mathcal{X}\in\mathbb{C}^{I_1\times I_1\times I_3}$ with rank $R$ and matrix representation $\mathbf{X}_{[2]}=\left(\mathbf{A}^{(1)}\odot\mathbf{A}^{(1)*}\right)\mathbf{A}^{(3)T}$, i.e., $\mathbf{A}^{(2)}=\mathbf{A}^{(1)*}$. Construct
$$ \mathbf{D}=\begin{bmatrix}\mathbf{A}^{(3)}\\ \mathbf{A}^{(3)*}\end{bmatrix}. $$
Denote $S=r(\mathbf{D})\leq\min(2I_3,R)$ and assume that $S\geq 2$. W.l.o.g. we assume that $r(\mathbf{D}(1:S,1:S))=S$. Consider an integer $m\in\{2,\dots,S\}$ and define
$$ \mathbf{A}^{(m,S)}=\left[\mathbf{a}^{(1)}_1,\dots,\mathbf{a}^{(1)}_m,\mathbf{a}^{(1)}_{S+1},\dots,\mathbf{a}^{(1)}_R\right]\in\mathbb{C}^{I_1\times(R-S+m)}, $$
$$ \mathbf{D}^{(m,S)}=\left[\mathbf{d}_1(1:m),\dots,\mathbf{d}_m(1:m),\mathbf{d}_{S+1}(1:m),\dots,\mathbf{d}_R(1:m)\right]\in\mathbb{C}^{m\times(R-S+m)}. $$
If
$$ \begin{cases} r\left(C_{R-S+2}\left(\mathbf{A}^{(1)}\right)\odot C_{R-S+2}\left(\mathbf{A}^{(1)*}\right)\right)=C_R^{R-S+2},\\ r\left(\mathbf{A}^{(m,S)}\right)=R-S+m,\\ r\left(\left[\mathbf{D}^{(m,S)}\odot\mathbf{A}^{(m,S)},\,-\mathbf{d}_r^{(m,S)}\otimes\mathbf{I}_{I_1}\right]\right)=I_1+R-S+m-1,\\ \qquad\forall r\in\{1,\dots,R-S+m\},\ \text{for some } m\in\{2,\dots,S\}, \end{cases} \tag{4.2} $$
then the CPD of $\mathcal{X}$ with the partial Hermitian symmetry $\mathbf{A}^{(2)}=\mathbf{A}^{(1)*}$ is unique.

Proof. Let us define the tensor $\mathcal{Y}\in\mathbb{C}^{I_1\times I_1\times I_3}$ by $y_{i_1 i_2 i_3}=x^*_{i_2 i_1 i_3}$ for all indices. We have
$$ \mathbf{Z}_{[2]}=\left[\mathbf{X}_{[2]},\ \mathbf{Y}_{[2]}\right]=\left(\mathbf{A}^{(1)}\odot\mathbf{A}^{(1)*}\right)\mathbf{D}^T\in\mathbb{C}^{I_1^2\times 2I_3}, \tag{4.3} $$
where $\mathbf{D}^T=\left[\mathbf{A}^{(3)T},\ \mathbf{A}^{(3)H}\right]$. We may now study the uniqueness of the CPD of $\mathcal{Z}$ in order to obtain uniqueness results for $\mathcal{X}$.

Since $r(\mathbf{D})=S\leq\min(2I_3,R)$ and $r\left(C_{R-S+2}\left(\mathbf{A}^{(1)}\right)\odot C_{R-S+2}\left(\mathbf{A}^{(1)*}\right)\right)=C_R^{R-S+2}$, we know from Theorem 2.2 that $\mathbf{D}$ is unique. Due to Theorem 3.8, the conditions
$$ \begin{cases} r(\mathbf{D})=S\leq\min(2I_3,R),\\ r\left(\mathbf{A}^{(m,S)}\right)=R-S+m,\\ r\left(\left[\mathbf{D}^{(m,S)}\odot\mathbf{A}^{(m,S)},\,-\mathbf{d}_r^{(m,S)}\otimes\mathbf{I}_{I_1}\right]\right)=I_1+R-S+m-1,\\ \qquad\forall r\in\{1,\dots,R-S+m\},\ \text{for some } m\in\{2,\dots,S\}, \end{cases} $$
then guarantee overall CPD uniqueness.
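The stacking construction used in the proof is easy to check numerically. The sketch below, with our own `khatri_rao` helper and arbitrary small dimensions, builds $\mathcal{X}$ with the symmetry $\mathbf{A}^{(2)}=\mathbf{A}^{(1)*}$, forms $\mathcal{Y}$ by swapping the first two modes and conjugating, and verifies that (4.3) holds.

```python
import numpy as np

def khatri_rao(A, B):
    """Columnwise Kronecker (Khatri-Rao) product A ⊙ B."""
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

rng = np.random.default_rng(0)
I1, I3, R = 4, 3, 5                      # arbitrary small sizes
A1 = rng.standard_normal((I1, R)) + 1j * rng.standard_normal((I1, R))
A3 = rng.standard_normal((I3, R)) + 1j * rng.standard_normal((I3, R))

# X_[2] = (A1 ⊙ A1*) A3^T, i.e., the partial Hermitian symmetry A2 = A1*
X2 = khatri_rao(A1, A1.conj()) @ A3.T
X = X2.reshape(I1, I1, I3)               # entries x_{i1 i2 i3}

# y_{i1 i2 i3} = x*_{i2 i1 i3}: swap the first two modes and conjugate
Y2 = np.conj(X.transpose(1, 0, 2)).reshape(I1 * I1, I3)

# Z_[2] = [X_[2], Y_[2]] = (A1 ⊙ A1*) D^T with D = [A3; A3*], cf. (4.3)
Z2 = np.hstack([X2, Y2])
D = np.vstack([A3, A3.conj()])
```

The verification is that `Z2` coincides with `khatri_rao(A1, A1.conj()) @ D.T`, confirming that the column-stacked matrix has the claimed structured factorization.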

4.3. New Uniqueness Conditions for CPD with Columnwise Orthonormal Factor Matrix. We say that the CPD of a tensor $\mathcal{X}\in\mathbb{C}^{I_1\times I_2\times I_3}$ has a columnwise orthonormal factor if $\mathbf{X}_{[2]}=\left(\mathbf{A}^{(1)}\odot\mathbf{A}^{(2)}\right)\mathbf{A}^{(3)T}$ in which $\mathbf{A}^{(3)H}\mathbf{A}^{(3)}=\mathbf{I}_R$. It has recently been demonstrated in [25] that a complex CPD with a columnwise orthonormal factor matrix is unique under milder conditions than an unconstrained CPD. Orthogonality-constrained CPD is very common in statistical signal processing; typically, the orthonormal columns of $\mathbf{A}^{(3)}$ model signals that are uncorrelated. See [25] for an overview and references. We will now provide a new uniqueness condition for a complex CPD with a columnwise orthonormal factor matrix.

Theorem 4.3. Consider a tensor $\mathcal{X}\in\mathbb{C}^{I_1\times I_2\times I_3}$ with rank $R$ and matrix representation $\mathbf{X}_{[2]}=\left(\mathbf{A}^{(1)}\odot\mathbf{A}^{(2)}\right)\mathbf{A}^{(3)T}$ in which $\mathbf{A}^{(3)}$ is columnwise orthonormal. Let
$$ N_{\max}=\begin{cases}2 & \text{if } I_2\geq I_1,\\ 1 & \text{if } I_1>I_2,\end{cases}\qquad N_{\min}=\begin{cases}2 & \text{if } I_2<I_1,\\ 1 & \text{if } I_1\leq I_2,\end{cases} \tag{4.4} $$
$$ I_{\max}=\max(I_1,I_2),\qquad I_{\min}=\min(I_1,I_2). \tag{4.5} $$
Construct $\mathbf{D}=\mathbf{A}^{(N_{\min})*}\odot\mathbf{A}^{(N_{\min})}\in\mathbb{C}^{I_{\min}^2\times R}$. Denote $S=r(\mathbf{D})$ and assume that $S\geq 2$. W.l.o.g. we assume that $r(\mathbf{D}(1:S,1:S))=S$. Consider an integer $m\in\{2,\dots,S\}$ and define
$$ \mathbf{D}^{(m,S)}=\left[\mathbf{d}_1(1:m),\dots,\mathbf{d}_m(1:m),\mathbf{d}_{S+1}(1:m),\dots,\mathbf{d}_R(1:m)\right]\in\mathbb{C}^{m\times(R-S+m)}, $$
$$ \mathbf{A}^{(N_{\max},m,S)}=\left[\mathbf{a}^{(N_{\max})}_1,\dots,\mathbf{a}^{(N_{\max})}_m,\mathbf{a}^{(N_{\max})}_{S+1},\dots,\mathbf{a}^{(N_{\max})}_R\right]\in\mathbb{C}^{I_{\max}\times(R-S+m)}. $$
If
$$ \begin{cases} r\left(C_{R-S+2}\left(\mathbf{A}^{(N_{\max})*}\right)\odot C_{R-S+2}\left(\mathbf{A}^{(N_{\max})}\right)\right)=C_R^{R-S+2},\\ r\left(\mathbf{A}^{(N_{\max},m,S)}\right)=R-S+m,\\ r\left(\left[\mathbf{D}^{(m,S)}\odot\mathbf{A}^{(N_{\max},m,S)},\,-\mathbf{d}_r^{(m,S)}\otimes\mathbf{I}_{I_{\max}}\right]\right)=I_{\max}+R-S+m-1,\\ \qquad\forall r\in\{1,\dots,R-S+m\},\ \text{for some } m\in\{2,\dots,S\}, \end{cases} \tag{4.6} $$
then the CPD of $\mathcal{X}$ with a columnwise orthonormal factor matrix $\mathbf{A}^{(3)}$ is unique.

Proof. Denote $\mathbf{X}^{(i_1\cdot\cdot)}=\mathcal{X}(i_1,:,:)$ and construct the fourth-order tensor $\mathcal{Y}\in\mathbb{C}^{I_2\times I_2\times I_1\times I_1}$ with matrix slices $\mathbf{Y}^{(\cdot\cdot i_3,i_4)}=\mathbf{X}^{(i_3\cdot\cdot)}\mathbf{X}^{(i_4\cdot\cdot)H}$. Since $\mathbf{X}^{(i_1\cdot\cdot)}=\mathbf{A}^{(2)}D_{i_1}\left(\mathbf{A}^{(1)}\right)\mathbf{A}^{(3)T}$ and since $\mathbf{A}^{(3)}$ is columnwise orthonormal, the matrix slice $\mathbf{Y}^{(\cdot\cdot i_3,i_4)}$ admits the factorization
$$ \mathbf{Y}^{(\cdot\cdot i_3,i_4)}=\mathbf{A}^{(2)}D_{i_3}\left(\mathbf{A}^{(1)}\right)D_{i_4}\left(\mathbf{A}^{(1)*}\right)\mathbf{A}^{(2)H},\qquad i_3,i_4\in\{1,\dots,I_1\}. $$
Thus, the fourth-order tensor $\mathcal{Y}$ has the matrix representation
$$ \mathbf{Y}=\left[\mathrm{Vec}\left(\mathbf{Y}^{(\cdot\cdot 1,1)}\right),\mathrm{Vec}\left(\mathbf{Y}^{(\cdot\cdot 1,2)}\right),\dots,\mathrm{Vec}\left(\mathbf{Y}^{(\cdot\cdot I_1,I_1)}\right)\right]=\left(\mathbf{A}^{(2)*}\odot\mathbf{A}^{(2)}\right)\left(\mathbf{A}^{(1)*}\odot\mathbf{A}^{(1)}\right)^T. \tag{4.7} $$
If the fourth-order CPD with matrix representation $\mathbf{Y}$ in (4.7) is unique, then the CPD of $\mathcal{X}$ with columnwise orthonormal factor matrix is also unique. Let us interpret $\mathbf{Y}$ in (4.7) as a matrix representation of a CPD with factor matrices $\mathbf{A}^{(N_{\max})*}$, $\mathbf{A}^{(N_{\max})}$ and $\mathbf{A}^{(N_{\min})*}\odot\mathbf{A}^{(N_{\min})}$, in which the Khatri-Rao structure of the latter is at first ignored. Assuming that condition (4.6) is satisfied, we know that Theorem 2.2 guarantees the uniqueness of the factor matrix $\mathbf{A}^{(N_{\min})*}\odot\mathbf{A}^{(N_{\min})}$. Since $\mathbf{A}^{(N_{\min})*}\odot\mathbf{A}^{(N_{\min})}$ is unique, $\mathbf{A}^{(N_{\min})}$ is unique. Furthermore, since $\mathbf{A}^{(N_{\min})}$ is unique and condition (4.6) holds, Theorem 3.8 guarantees the uniqueness of the factor matrix $\mathbf{A}^{(N_{\max})}$. Finally, the columnwise orthonormal matrix $\mathbf{A}^{(3)}$ follows from $\mathbf{X}_{[2]}=\left(\mathbf{A}^{(1)}\odot\mathbf{A}^{(2)}\right)\mathbf{A}^{(3)T}$ with $\mathbf{X}_{[2]}$, $\mathbf{A}^{(1)}$ and $\mathbf{A}^{(2)}$ known, via an orthogonal Procrustes problem [7].
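The slice construction in the proof (taking $\mathbf{Y}^{(\cdot\cdot i_3,i_4)}=\mathbf{X}^{(i_3\cdot\cdot)}\mathbf{X}^{(i_4\cdot\cdot)H}$, which is the combination consistent with the factorization that follows) can be verified numerically. The sketch below uses arbitrary small sizes and draws a columnwise orthonormal $\mathbf{A}^{(3)}$ from a thin QR factorization.

```python
import numpy as np

rng = np.random.default_rng(0)
I1, I2, I3, R = 3, 4, 6, 5               # arbitrary sizes with I3 >= R
A1 = rng.standard_normal((I1, R)) + 1j * rng.standard_normal((I1, R))
A2 = rng.standard_normal((I2, R)) + 1j * rng.standard_normal((I2, R))
# columnwise orthonormal A3 (A3^H A3 = I_R) from a thin QR factorization
A3, _ = np.linalg.qr(rng.standard_normal((I3, R)) + 1j * rng.standard_normal((I3, R)))

# slices X^{(i1..)} = A2 diag(A1[i1, :]) A3^T
X = np.stack([A2 @ np.diag(A1[i1]) @ A3.T for i1 in range(I1)])

# Y^{(..i3,i4)} = X^{(i3..)} X^{(i4..)H}; since A3^T A3* = I_R this equals
# A2 diag(A1[i3, :]) diag(conj(A1[i4, :])) A2^H
Y = np.array([[X[i3] @ X[i4].conj().T for i4 in range(I1)] for i3 in range(I1)])
```

Note that $\mathbf{A}^{(3)T}\mathbf{A}^{(3)*}=\left(\mathbf{A}^{(3)H}\mathbf{A}^{(3)}\right)^*=\mathbf{I}_R$, which is the step where columnwise orthonormality makes $\mathbf{A}^{(3)}$ drop out of the slice products.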

The uniqueness condition stated in Theorem 4.3 is more relaxed than those presented in [25]. For instance, consider a third-order tensor $\mathcal{X}\in\mathbb{C}^{8\times 4\times 17}$ with matrix representation $\mathbf{X}_{[1]}=(\mathbf{A}\odot\mathbf{B})\mathbf{C}^T$, in which $\mathbf{A}\in\mathbb{C}^{8\times 17}$, $\mathbf{B}\in\mathbb{C}^{4\times 17}$ and columnwise orthonormal $\mathbf{C}\in\mathbb{C}^{17\times 17}$. We randomly generate $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$. For this setting the uniqueness results presented in [25] fail. On the other hand, the conditions stated in Theorem 4.3 with $m=7$ are generically satisfied.

Uniqueness may be established beyond Theorem 4.3 and the results presented in [25]. As an example, consider the third-order tensor $\mathcal{X}\in\mathbb{C}^{4\times 2\times 5}$ with matrix representation $\mathbf{X}_{[2]}=(\mathbf{A}\odot\mathbf{B})\mathbf{C}^T$, in which $\mathbf{A}\in\mathbb{C}^{4\times 5}$, $\mathbf{B}\in\mathbb{C}^{2\times 5}$ and columnwise orthonormal $\mathbf{C}\in\mathbb{C}^{5\times 5}$, where
$$ \mathbf{A}=\begin{bmatrix}1&0&0&0&a_{15}\\0&1&0&0&0\\0&0&1&0&a_{35}\\0&0&0&1&a_{45}\end{bmatrix}. $$
For this case Theorem 4.3 fails. However, uniqueness can be established as follows. As in Theorem 4.3 we construct a matrix $\mathbf{Y}$, which can be decomposed as $\mathbf{Y}=\mathbf{D}\left(\mathbf{A}^*\odot\mathbf{A}\right)^T$ in which $\mathbf{D}=\mathbf{B}^*\odot\mathbf{B}$. This decomposition is now interpreted as a matrix representation of a CPD with factor matrices $\mathbf{A}^*$, $\mathbf{D}$ and $\mathbf{A}$. Note that $r(\mathbf{A})=4$ and that, generically, $C_3\left(\mathbf{A}^*\right)\odot C_3\left(\mathbf{D}\right)$ has full column rank. Theorem 2.2 now generically guarantees the uniqueness of $\mathbf{A}$. Further, $\mathbf{A}^*\odot\mathbf{A}$ generically has full column rank and therefore we also obtain $\mathbf{B}$ via $\left(\mathbf{A}^*\odot\mathbf{A}\right)^{\dagger}\mathbf{Y}^T=\left(\mathbf{B}^*\odot\mathbf{B}\right)^T$. Finally, the columnwise orthonormal matrix $\mathbf{C}$ follows from $\mathbf{X}_{[2]}=(\mathbf{A}\odot\mathbf{B})\mathbf{C}^T$ with $\mathbf{X}_{[2]}$, $\mathbf{A}$ and $\mathbf{B}$ known, via an orthogonal Procrustes problem.
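The pseudo-inverse step and the subsequent extraction of $\mathbf{B}$ can be sketched as follows. The rank-1 extraction via an eigendecomposition is our own illustrative choice: each column of $\mathbf{B}^*\odot\mathbf{B}$ reshapes to the rank-1 Hermitian matrix $\mathbf{b}_r^*\mathbf{b}_r^T$, whose dominant eigenvector yields $\mathbf{b}_r$ up to an irrelevant unit-modulus scalar.

```python
import numpy as np

def khatri_rao(A, B):
    """Columnwise Kronecker (Khatri-Rao) product A ⊙ B."""
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

rng = np.random.default_rng(0)
I1, I2, R = 4, 2, 5
A = rng.standard_normal((I1, R)) + 1j * rng.standard_normal((I1, R))
B = rng.standard_normal((I2, R)) + 1j * rng.standard_normal((I2, R))

D = khatri_rao(B.conj(), B)                       # D = B* ⊙ B
Y = D @ khatri_rao(A.conj(), A).T                 # Y = D (A* ⊙ A)^T

# A* ⊙ A (16 x 5 here) generically has full column rank, so
# (A* ⊙ A)^† Y^T = (B* ⊙ B)^T recovers D
D_rec = (np.linalg.pinv(khatri_rao(A.conj(), A)) @ Y.T).T

# each column of D reshapes (row-major) to conj(b_r) b_r^T = u u^H with u = conj(b_r)
B_rec = np.empty_like(B)
for r in range(R):
    w, V = np.linalg.eigh(D_rec[:, r].reshape(I2, I2))
    B_rec[:, r] = np.conj(V[:, -1] * np.sqrt(w[-1]))   # b_r up to a phase factor
```

Here `eigh` returns eigenvalues in ascending order, so the last eigenpair is the dominant one of the rank-1 matrix.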

5. A New Partial Uniqueness Condition for the CPD. In [8] a partial uniqueness condition for the CPD was obtained (given below as Lemma 5.1). Partial uniqueness means that when the overall CPD is not unique, some submatrices of the factor matrices may still be unique. In Theorem 5.2 we provide a new partial uniqueness condition for the CPD. The derivation will make use of the following result.

Lemma 5.1. Consider a tensor $\mathcal{X}\in\mathbb{C}^{I_1\times I_2\times I_3}$ with rank $R$ and matrix representation $\mathbf{X}_{[2]}=\left(\mathbf{A}^{(1)}\odot\mathbf{A}^{(2)}\right)\mathbf{A}^{(3)T}$ in which $\mathbf{A}^{(n)}\in\mathbb{C}^{I_n\times R}$. Assume that $\mathbf{A}^{(1)}$ is given and let $\mathbf{P}\in\mathbb{C}^{R\times R}$ be a permutation matrix such that the partitioning
$$ \mathbf{A}^{(n)}\mathbf{P}=\left[\mathbf{A}^{(n,1)},\dots,\mathbf{A}^{(n,Q)}\right],\quad\mathbf{A}^{(n,q)}\in\mathbb{C}^{I_n\times R_q},\quad R=\sum_{q=1}^{Q}R_q,\quad n\in\{1,2,3\}, \tag{5.1} $$
has the following property:
$$ \mathrm{Col}\left(\mathbf{A}^{(1,q)}\right)\cap\left(\bigcup_{\substack{p=1\\ p\neq q}}^{Q}\mathrm{Col}\left(\mathbf{A}^{(1,p)}\right)\right)=\{\mathbf{0}_{I_1}\}. \tag{5.2} $$
Then the pair $\left(\mathbf{A}^{(2,q)},\mathbf{A}^{(3,q)}\right)$ is unique up to the intrinsic ambiguities of the partial CPD with matrix representation $\mathbf{X}_q=\left(\mathbf{A}^{(1,q)}\odot\mathbf{A}^{(2,q)}\right)\mathbf{A}^{(3,q)T}$.

Proof. A proof can be found in [8], where the result is stated as Theorem 3.1.

Note that in the case $R_q=1$, due to the rank-1 structure of $\mathbf{X}_q=\left(\mathbf{a}^{(1,q)}\otimes\mathbf{a}^{(2,q)}\right)\mathbf{a}^{(3,q)T}$, the pair $\left(\mathbf{a}^{(2,q)},\mathbf{a}^{(3,q)}\right)$ is obviously unique when $\mathbf{a}^{(1,q)}$ is given. Consequently, the following Theorem 5.2 will only consider the case $R_q\geq 2$.

Theorem 5.2. Consider $\mathcal{X}\in\mathbb{C}^{I_1\times I_2\times I_3}$ with rank $R$ and matrix representation $\mathbf{X}_{[2]}=\left(\mathbf{A}^{(1)}\odot\mathbf{A}^{(2)}\right)\mathbf{A}^{(3)T}$. Let $\mathbf{P}\in\mathbb{C}^{R\times R}$ be a permutation matrix such that $\mathbf{A}^{(n)}$, $n\in\{1,2,3\}$, has the partitioning
$$ \mathbf{A}^{(n)}\mathbf{P}=\left[\mathbf{A}^{(n,1)},\dots,\mathbf{A}^{(n,Q)}\right],\quad\mathbf{A}^{(n,q)}\in\mathbb{C}^{I_n\times R_q},\quad R=\sum_{q=1}^{Q}R_q, \tag{5.3} $$
with the following property:
$$ \mathrm{Col}\left(\mathbf{A}^{(1,q)}\right)\cap\left(\bigcup_{\substack{p=1\\ p\neq q}}^{Q}\mathrm{Col}\left(\mathbf{A}^{(1,p)}\right)\right)=\{\mathbf{0}_{I_1}\}. \tag{5.4} $$
Denote $S=r\left(\mathbf{A}^{(1)}\right)$ and $S_q=r\left(\mathbf{A}^{(1,q)}\right)$. Assume that $S\geq 2$ and $S_q\geq 2$ (and therefore also $R_q\geq 2$). W.l.o.g. we also assume that $S_q=r\left(\mathbf{A}^{(1,q)}(1:S_q,1:S_q)\right)$. Consider integers $m_q\in\{2,\dots,S_q\}$ and define $\mathbf{A}^{(1,q,m_q,S_q)}\in\mathbb{C}^{m_q\times(R_q-S_q+m_q)}$ and $\mathbf{A}^{(n,q,m_q,S_q)}\in\mathbb{C}^{I_n\times(R_q-S_q+m_q)}$, $n\in\{2,3\}$, as follows:
$$ \mathbf{A}^{(1,q,m_q,S_q)}=\left[\mathbf{a}^{(1,q)}_1(1:m_q),\dots,\mathbf{a}^{(1,q)}_{m_q}(1:m_q),\mathbf{a}^{(1,q)}_{S_q+1}(1:m_q),\dots,\mathbf{a}^{(1,q)}_{R_q}(1:m_q)\right], $$
$$ \mathbf{A}^{(n,q,m_q,S_q)}=\left[\mathbf{a}^{(n,q)}_1,\dots,\mathbf{a}^{(n,q)}_{m_q},\mathbf{a}^{(n,q)}_{S_q+1},\dots,\mathbf{a}^{(n,q)}_{R_q}\right],\quad n\in\{2,3\}. $$
If
$$ \begin{cases} r\left(C_{R-S+2}\left(\mathbf{A}^{(2)}\right)\odot C_{R-S+2}\left(\mathbf{A}^{(3)}\right)\right)=C_R^{R-S+2},\\ r\left(\mathbf{A}^{(3,q,m_q,S_q)}\right)=R_q-S_q+m_q,\\ r\left(\left[\mathbf{A}^{(1,q,m_q,S_q)}\odot\mathbf{A}^{(2,q,m_q,S_q)},\,-\mathbf{a}^{(1,q,m_q,S_q)}_{r_q}\otimes\mathbf{I}_{I_2}\right]\right)=I_2+R_q-S_q+m_q-1,\\ \qquad\forall r_q\in\{1,\dots,R_q-S_q+m_q\},\ \text{for some } m_q\in\{2,\dots,S_q\}, \end{cases} \tag{5.5} $$
then $\mathbf{A}^{(1)}$ and the pairs $\left(\mathbf{A}^{(2,q)},\mathbf{A}^{(3,q)}\right)$ are unique.

Proof. Due to Theorem 2.2, the conditions
$$ r\left(C_{R-S+2}\left(\mathbf{A}^{(2)}\right)\odot C_{R-S+2}\left(\mathbf{A}^{(3)}\right)\right)=C_R^{R-S+2},\qquad S\geq 2, $$
stated in (5.5) guarantee the uniqueness of $\mathbf{A}^{(1)}$. Lemma 5.1 together with assumption (5.4) tells us that the study of the overall uniqueness of $\mathcal{X}$ decouples into the study of the CPDs
$$ \mathbf{X}_q=\left(\mathbf{A}^{(1,q)}\odot\mathbf{A}^{(2,q)}\right)\mathbf{A}^{(3,q)T},\quad q\in\{1,\dots,Q\}, $$
where $\mathbf{A}^{(n,q)}\in\mathbb{C}^{I_n\times R_q}$ is given by (5.3) and $\mathbf{A}^{(1,q)}$ is known. Theorem 3.8 together with the conditions
$$ \begin{cases} r\left(\mathbf{A}^{(3,q,m_q,S_q)}\right)=R_q-S_q+m_q,\\ r\left(\left[\mathbf{A}^{(1,q,m_q,S_q)}\odot\mathbf{A}^{(2,q,m_q,S_q)},\,-\mathbf{a}^{(1,q,m_q,S_q)}_{r_q}\otimes\mathbf{I}_{I_2}\right]\right)=I_2+R_q-S_q+m_q-1, \end{cases} $$
now guarantees the uniqueness of the pair $\left(\mathbf{A}^{(2,q)},\mathbf{A}^{(3,q)}\right)$.

6. Numerical Experiments. Let $\mathcal{T}\in\mathbb{C}^{I_1\times I_2\times I_3}$ denote a rank-$R$ tensor, the CPD of which we attempt to estimate from the observed tensor $\mathcal{X}=\mathcal{T}+\beta\mathcal{N}$, where $\mathcal{N}$ is an unstructured perturbation tensor and $\beta\in\mathbb{R}$ controls the noise level. The entries of all the involved factor matrices and perturbation tensors are randomly drawn from a Gaussian distribution with zero mean and unit variance. The following Signal-to-Noise Ratio (SNR) measure will be used:
$$ \mathrm{SNR\ [dB]}=10\log_{10}\left(\left\|\mathbf{T}_{[1]}\right\|_F^2\Big/\left\|\beta\mathbf{N}_{[1]}\right\|_F^2\right). $$
The distance between the factor matrix $\mathbf{A}^{(n)}\in\mathbb{C}^{I_n\times R}$ with $I_n\geq R$ and its estimate $\widehat{\mathbf{A}}^{(n)}$ is measured according to the criterion
$$ P\left(\mathbf{A}^{(n)}\right)=\min_{\mathbf{\Lambda}}\left\|\mathbf{A}^{(n)}-\widehat{\mathbf{A}}^{(n)}\mathbf{\Lambda}\right\|_F\Big/\left\|\mathbf{A}^{(n)}\right\|_F, $$
where $\mathbf{\Lambda}$ denotes a diagonal matrix. In the case $I_n<R\leq I_n^2$ we resort to the criterion
$$ Q\left(\mathbf{A}^{(n)}\right)=\min_{\mathbf{\Lambda}}\left\|\mathbf{A}^{(n)}\odot\mathbf{A}^{(n)}-\left(\widehat{\mathbf{A}}^{(n)}\odot\widehat{\mathbf{A}}^{(n)}\right)\mathbf{\Lambda}\right\|_F\Big/\left\|\mathbf{A}^{(n)}\odot\mathbf{A}^{(n)}\right\|_F, $$
where again $\mathbf{\Lambda}$ denotes a diagonal matrix.
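The optimal diagonal $\mathbf{\Lambda}$ in these criteria has a closed-form columnwise solution: its $r$-th diagonal entry is the least-squares scalar $\left(\widehat{\mathbf{a}}_r^H\mathbf{a}_r\right)/\left(\widehat{\mathbf{a}}_r^H\widehat{\mathbf{a}}_r\right)$. A minimal sketch (the function names are ours, and, as in the criteria above, only scaling is optimized, not a column permutation):

```python
import numpy as np

def P(A, A_hat):
    """min over diagonal Λ of ||A - Â Λ||_F / ||A||_F, solved columnwise."""
    lam = np.sum(np.conj(A_hat) * A, axis=0) / np.sum(np.abs(A_hat) ** 2, axis=0)
    return np.linalg.norm(A - A_hat * lam) / np.linalg.norm(A)

def Q(A, A_hat):
    """The same criterion applied to A ⊙ A and Â ⊙ Â (for the case I_n < R <= I_n^2)."""
    kr = lambda M: (M[:, None, :] * M[None, :, :]).reshape(-1, M.shape[1])
    return P(kr(A), kr(A_hat))
```

An estimate that matches $\mathbf{A}^{(n)}$ up to columnwise scaling gives a criterion value that is numerically zero.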

We consider the problem of computing a CPD with a known factor matrix. We compare Algorithms 1 and 2 with the classical Alternating Least Squares (ALS) method, which alternates between the updates of the two unknown factor matrices. For other optimization-based algorithms [21] the conclusions are similar. Let $f\left(\widehat{\mathbf{X}}_{[1]}^{(k)}\right)=\left\|\mathbf{X}_{[1]}-\widehat{\mathbf{X}}_{[1]}^{(k)}\right\|_F$, where $\widehat{\mathbf{X}}_{[1]}^{(k)}$ denotes the estimated tensor at iteration $k$; we decide that the ALS method has converged when $f\left(\widehat{\mathbf{X}}_{[1]}^{(k)}\right)-f\left(\widehat{\mathbf{X}}_{[1]}^{(k+1)}\right)<\epsilon_{\mathrm{ALS}}=10^{-8}$ or when the number of iterations exceeds 5000. The ALS method is randomly initialized. Since the best out of ten random initializations is used, the ALS method will be referred to as ALS-CP-10. Algorithms 1 and 2 will be referred to as Alg1 and Alg2, respectively. When Alg1 and Alg2 are followed by at most 500 ALS refinement iterations they will be referred to as Alg1-ALS and Alg2-ALS, respectively.

Note that the computational cost of the Alg1 and Alg2 methods is not much more than that of one ALS iteration, so they are computationally very cheap.
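Algorithms 1 and 2 themselves are developed earlier in the paper and are not reproduced here. As a hedged illustration of why a known factor makes the computation cheap, the sketch below treats a particularly simple case in which the known factor $\mathbf{C}$ itself has full column rank (an assumption Algorithms 1 and 2 do not need): $\mathbf{A}\odot\mathbf{B}$ then follows from one pseudo-inverse, and each of its columns is a reshaped rank-1 matrix $\mathbf{a}_r\mathbf{b}_r^T$ recovered by a truncated SVD.

```python
import numpy as np

def khatri_rao(A, B):
    """Columnwise Kronecker (Khatri-Rao) product A ⊙ B."""
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

rng = np.random.default_rng(0)
I1, I2, I3, R = 5, 6, 7, 4          # hypothetical sizes; here I3 >= R
A = rng.standard_normal((I1, R))
B = rng.standard_normal((I2, R))
C = rng.standard_normal((I3, R))    # the known factor, of full column rank

X2 = khatri_rao(A, B) @ C.T         # matrix representation X_[2] = (A ⊙ B) C^T

# Since C has full column rank, A ⊙ B follows from a single pseudo-inverse,
# and column r, reshaped to I1 x I2, is the rank-1 matrix a_r b_r^T:
KR = X2 @ np.linalg.pinv(C).T       # = (A ⊙ B)
A_hat = np.empty_like(A)
B_hat = np.empty_like(B)
for r in range(R):
    U, s, Vt = np.linalg.svd(KR[:, r].reshape(I1, I2))
    A_hat[:, r] = U[:, 0] * s[0]    # the split of the scaling is arbitrary
    B_hat[:, r] = Vt[0]
```

The recovered factors reproduce the data exactly; the columnwise scaling/counterscaling between `A_hat` and `B_hat` is the usual CPD indeterminacy.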

Case 1: $I_1,I_3<R$ and $I_2>R$. Let the model parameters be $I_1=3$, $I_2=10$, $I_3=4$ and $R=9$. The factor matrix $\mathbf{A}^{(3)}$ is known. The mean $Q\left(\mathbf{A}^{(1)}\right)$ and $P\left(\mathbf{A}^{(2)}\right)$ values over 100 trials, with the SNR varying from 10 to 40 dB, can be seen in figure 7.1. We notice that the Alg1 and Alg1-ALS methods perform better than the ALS-CP-10 method. Hence, the proposed Alg1 method, possibly followed by a few ALS refinement steps, seems to be a good method for computing a CPD with a known factor matrix in which one unknown factor matrix has full column rank.

Case 2: $I_1,I_2,I_3<R$. Let the model parameters be $I_1=4$, $I_2=6$, $I_3=5$ and $R=8$. The factor matrix $\mathbf{A}^{(3)}$ is known. The mean $Q\left(\mathbf{A}^{(1)}\right)$ and $Q\left(\mathbf{A}^{(2)}\right)$ values over 100 trials, with the SNR varying from 10 to 40 dB, can be seen in figure 7.2. We first notice that in the presence of noise Alg2 performs worse than ALS-CP-10. However, we also observe that Alg2-ALS and ALS-CP-10 perform about the same. Since the Alg2-ALS method is also computationally cheaper than the ALS-CP-10 method, the Alg2 method seems to be a good method for initializing the ALS method. Thus, the Alg2-ALS method seems to be the method of choice for computing a CPD with a known factor in which none of the unknown factor matrices has full column rank.

7. Conclusion. Many problems in signal processing can be formulated as tensor decomposition problems with a known or partly known factor matrix. In the first part of the paper we provided a new uniqueness condition and an algorithm for computing a CPD with a known or partly known factor matrix.

Based on the results obtained in the first part of the paper, we provided in the second part a new deterministic overall uniqueness condition for the CPD that only involves the ranks of the factor matrices concerned. Since no k-ranks are involved, the condition should be easy to check in practice. Next, we presented a new uniqueness condition for a complex CPD with a columnwise orthonormal factor matrix. Furthermore, we provided a new uniqueness condition for a complex CPD with a partial Hermitian symmetry. We also showed that the results presented in this paper lead to a new partial uniqueness condition for the CPD.

Finally, numerical experiments confirmed the practical use of the proposed Algo-rithms 1 and 2 for computing a CPD with a known factor matrix. More precisely, in the case where one of the unknown factor matrices had full column rank, Algo-rithm 1 performed better than the standard ALS method. In the case where none of the unknown factor matrices had full column rank, Algorithm 2 provided at a low computational cost a good initial value for an optimization-based algorithm.

REFERENCES

[1] A. Belouchrani, K. Abed-Meraim, J.-F. Cardoso, and E. Moulines, A Blind Source Separation Technique Using Second-Order Statistics, IEEE Trans. Signal Process., 45 (1997), pp. 434–444.
[2] A. Cichocki and S.-I. Amari, Adaptive Blind Signal and Image Processing, Wiley, 2002.
[3] P. Comon and C. Jutten, eds., Handbook of Blind Source Separation, Independent Component Analysis and Applications, Academic Press, 2010.
[4] L. De Lathauwer, A Short Introduction to Tensor-Based Methods for Factor Analysis and Blind Source Separation, Proc. ISPA 2011, Sep. 4–6, Dubrovnik, Croatia, 2011.
[5] I. Domanov and L. De Lathauwer, On the Uniqueness of the Canonical Polyadic Decomposition – Part I: Basic Results and Uniqueness of One Factor Matrix, Technical Report 12-66, ESAT-SISTA, KU Leuven, Belgium, 2012.
[6] I. Domanov and L. De Lathauwer, On the Uniqueness of the Canonical Polyadic Decomposition – Part II: Overall Uniqueness, Technical Report 12-72, ESAT-SISTA, KU Leuven, Belgium, 2012.
[7] G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins University Press, 1996.
[8] X. Guo, S. Miron, D. Brie, and A. Stegeman, Uni-Mode and Partial Uniqueness Conditions for CANDECOMP/PARAFAC of Three-way Arrays with Linearly Dependent Loadings, SIAM J. Matrix Anal. Appl., 33 (2012), pp. 111–129.
[9] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, 1985.
[10] T. Jiang, N. D. Sidiropoulos, and J. M. F. Ten Berge, Almost-Sure Identifiability of Multidimensional Harmonic Retrieval, IEEE Trans. Signal Process., 49 (2001), pp. 1849–1859.
[11] H. A. L. Kiers, Towards a Standardized Notation and Terminology in Multiway Analysis, J. Chemometrics, 14 (2000), pp. 105–122.
[12] P. M. Kroonenberg, Applied Multiway Data Analysis, Wiley, 2008.
[13] T. Kolda and B. W. Bader, Tensor Decompositions and Applications, SIAM Review, 51 (2009), pp. 455–500.
[14] J. B. Kruskal, Three-way Arrays: Rank and Uniqueness of Trilinear Decompositions, with Applications to Arithmetic Complexity and Statistics, Linear Algebra Appl., 18 (1977), pp. 95–138.
[15] L.-H. Lim and P. Comon, Multiarray Signal Processing: Tensor Decomposition meets Compressed Sensing, C. R. Mec., 338 (2010), pp. 311–320.
[16] D. Nion and N. D. Sidiropoulos, Tensor Algebra and Multidimensional Harmonic Retrieval in Signal Processing for MIMO Radar, IEEE Trans. Signal Process., 58 (2010), pp. 5693–5705.
[17] J. A. Rhodes, A Concise Proof of Kruskal's Theorem in Tensor Decomposition, Linear Algebra Appl., 432 (2010), pp. 1818–1824.
[18] A. Smilde, R. Bro, and P. Geladi, Multi-way Analysis: Applications in the Chemical Sciences, Wiley, 2004.
[19] N. D. Sidiropoulos, R. Bro, and G. B. Giannakis, Parallel Factor Analysis in Sensor Array Processing, IEEE Trans. Signal Process., 48 (2000), pp. 2377–2388.
[20] N. D. Sidiropoulos, G. B. Giannakis, and R. Bro, Blind PARAFAC Receivers for DS-CDMA Systems, IEEE Trans. Signal Process., 48 (2000), pp. 810–823.
[21] L. Sorber, M. Van Barel, and L. De Lathauwer, Optimization-Based Algorithms for Tensor Decompositions: Canonical Polyadic Decomposition, Decomposition in Rank-(Lr,Lr,1) Terms and a New Generalization, Technical Report 12-37, ESAT-SISTA, KU Leuven, Belgium, 2012; SIAM J. Optim., accepted for publication.
[22] M. Sørensen and L. De Lathauwer, Blind Signal Separation via Tensor Decomposition with Vandermonde Factor – Part I: Canonical Polyadic Decomposition, Technical Report 12-11, ESAT-SISTA, KU Leuven, Belgium, 2012.
[23] M. Sørensen and L. De Lathauwer, Signal Separation via Tensor Decomposition with Known or Partly Known Factor, in preparation.
[24] M. Sørensen, L. De Lathauwer, P. Comon, S. Icart, and L. Deneire, Simultaneous Generalized Schur Decomposition Methods for Computing the Canonical Polyadic Decomposition, in preparation.
[25] M. Sørensen, L. De Lathauwer, P. Comon, S. Icart, and L. Deneire, Canonical Polyadic Decomposition with a Columnwise Orthonormal Factor Matrix, SIAM J. Matrix Anal. Appl., 33 (2012), pp. 1190–1213.
[26] A. Stegeman and N. D. Sidiropoulos, On Kruskal's Uniqueness Condition for the Candecomp/Parafac Decomposition, Linear Algebra Appl., 420 (2007), pp. 540–552.
[27] A. Yeredor, Non-Orthogonal Joint Diagonalization in the Least-Squares Sense with Application in Blind Source Separation, IEEE Trans. Signal Process., 50 (2002), pp. 1545–1553.
[28] A.-J. Van Der Veen and A. Paulraj, An Analytical Constant Modulus Algorithm, IEEE Trans. Signal Process., 44 (1996), pp. 1136–1155.

Fig. 7.1. Mean $Q\left(\mathbf{A}^{(1)}\right)$ (a) and $P\left(\mathbf{A}^{(2)}\right)$ (b) values over 100 trials while the SNR varies from 10 to 40 dB, case 1 (curves: Alg1, Alg1-ALS, ALS-CP-10).

Fig. 7.2. Mean $Q\left(\mathbf{A}^{(1)}\right)$ (a) and $Q\left(\mathbf{A}^{(2)}\right)$ (b) values over 100 trials while the SNR varies from 10 to 40 dB, case 2 (curves: Alg2, Alg2-ALS, ALS-CP-10).
