On Avoiding Diverging Components in the Computation of the Best Low Rank Approximation of Higher-Order Tensors

Lieven De Lathauwer

Tech. Report 05-269, ESAT-SISTA, K.U.Leuven (Leuven, Belgium), 2005

This report was written as a contribution to an e-mail discussion between Rasmus Bro, Lieven De Lathauwer, Richard Harshman and Lek-Heng Lim.

1. Orthogonality in one of the modes

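Consider the approximation of a tensor $\mathcal{A} \in \mathbb{C}^{I_1 \times I_2 \times \cdots \times I_N}$ by a rank-$R$ tensor $\hat{\mathcal{A}}$, given by

$$\hat{\mathcal{A}} = \sum_{r=1}^{R} \lambda_r \, u_r^{(1)} \circ u_r^{(2)} \circ \cdots \circ u_r^{(N)},$$

where $R \leq I_1$. Denote $X = (\lambda_1, \ldots, \lambda_R; u_1^{(1)}, \ldots, u_R^{(1)}; \ldots; u_1^{(N)}, \ldots, u_R^{(N)})$.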
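The rank-R model above can be sketched numerically. The following is a minimal illustration, assuming a real-valued third-order case (N = 3) and NumPy; the helper name `cp_tensor` and the chosen dimensions are hypothetical and do not come from the report.

```python
import numpy as np

def cp_tensor(lams, factors):
    """Sum over r of lams[r] times the outer product of the r-th columns (3 modes)."""
    U1, U2, U3 = factors
    return np.einsum("r,ir,jr,kr->ijk", lams, U1, U2, U3)

rng = np.random.default_rng(0)
I1, I2, I3, R = 4, 3, 5, 2  # illustrative sizes with R <= I1

# Random factor matrices, rescaled so that every column has unit norm.
factors = [rng.standard_normal((I, R)) for I in (I1, I2, I3)]
factors = [U / np.linalg.norm(U, axis=0) for U in factors]
lams = np.array([2.0, -1.0])

A_hat = cp_tensor(lams, factors)

# Same tensor, built term by term from explicit outer products.
A_explicit = sum(
    lams[r] * np.multiply.outer(
        np.multiply.outer(factors[0][:, r], factors[1][:, r]), factors[2][:, r]
    )
    for r in range(R)
)

# The mode-1 unfolding of a rank-R tensor has matrix rank at most R.
rank_unfold = np.linalg.matrix_rank(A_hat.reshape(I1, -1))
```

The complex case would additionally require care with conjugation, which this sketch leaves out.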

We have the following theorem.

Theorem 1 Under the condition that the vectors $u_r^{(1)}$, $r = 1, \ldots, R$, are mutually orthonormal, the function

$$f(X) = \Big\| \mathcal{A} - \sum_{r=1}^{R} \lambda_r \, u_r^{(1)} \circ u_r^{(2)} \circ \cdots \circ u_r^{(N)} \Big\|^2 \qquad (1)$$

attains its infimum.

Proof: Let $U^{(n)} = [u_1^{(n)} \ldots u_R^{(n)}]$, $n = 1, \ldots, N$, and let $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_R)$.

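Matricizing (1), we obtain

$$f(X) = \big\| A_{(1)} - U^{(1)} \cdot \Lambda \cdot (U^{(2)} \odot \cdots \odot U^{(N)})^T \big\|^2, \qquad (2)$$

in which $\odot$ denotes the Khatri-Rao (column-wise Kronecker) product.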

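Let $\tilde{U}^{(1)} = (U^{(1)}, U^{(1)}_{\perp})$ be (square) unitary. For any choice of $U^{(1)}$, we have

$$f(X) = \Big\| \tilde{U}^{(1)H} \cdot A_{(1)} - \begin{pmatrix} I_{R \times R} \\ O_{(I_1 - R) \times R} \end{pmatrix} \cdot \Lambda \cdot (U^{(2)} \odot \cdots \odot U^{(N)})^T \Big\|^2. \qquad (3)$$

Define $\mathcal{B} = \mathcal{A} \times_1 \tilde{U}^{(1)H}$. Denote the $(I_2 \times I_3 \times \cdots \times I_N)$-slices of $\mathcal{B}$ by $\mathcal{B}_1, \ldots, \mathcal{B}_{I_1}$. Then

$$f(X) = \sum_{r=1}^{R} \big\| \mathcal{B}_r - \lambda_r \, u_r^{(2)} \circ u_r^{(3)} \circ \cdots \circ u_r^{(N)} \big\|^2 + c(U^{(1)}), \qquad (4)$$

in which $c(U^{(1)}) = \| \mathcal{A} \times_1 (U^{(1)}_{\perp})^H \|^2$. We conclude that the problem reduces to a set of best rank-1 approximation problems, for which the infimum is attained.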
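The slice-wise decomposition used in this proof can be checked numerically. The sketch below is an illustration under simplifying assumptions that are not in the report: real-valued data, N = 3, NumPy, and randomly drawn factors. All variable names are hypothetical. It compares the cost evaluated directly with the sum of slice-wise rank-1 residuals plus the constant term, and, because slices are matrices when N = 3, it also computes the best value attainable for the given first-mode factor through the leading singular value of each slice (Eckart-Young).

```python
import numpy as np

rng = np.random.default_rng(1)
I1, I2, I3, R = 5, 4, 3, 2

A = rng.standard_normal((I1, I2, I3))

# Square orthogonal matrix: its first R columns act as the orthonormal mode-1
# factor, the remaining I1 - R columns span the orthogonal complement.
Q_full, _ = np.linalg.qr(rng.standard_normal((I1, I1)), mode="complete")
U1 = Q_full[:, :R]

def unit_columns(M):
    """Rescale every column to unit Euclidean norm."""
    return M / np.linalg.norm(M, axis=0)

U2 = unit_columns(rng.standard_normal((I2, R)))
U3 = unit_columns(rng.standard_normal((I3, R)))
lams = rng.standard_normal(R)

# Cost function evaluated directly on the tensor.
A_hat = np.einsum("r,ir,jr,kr->ijk", lams, U1, U2, U3)
f_direct = np.sum((A - A_hat) ** 2)

# Mode-1 product with the transposed orthogonal matrix, then slice-wise residuals.
B = np.einsum("ir,ijk->rjk", Q_full, A)
f_slices = sum(
    np.sum((B[r] - lams[r] * np.outer(U2[:, r], U3[:, r])) ** 2) for r in range(R)
)
c = np.sum(B[R:] ** 2)  # energy of A outside the column space of U1
f_decomposed = f_slices + c

# Best attainable value for this U1: each slice gets its best rank-1 fit.
f_best_given_U1 = c + sum(
    np.sum(B[r] ** 2) - np.linalg.svd(B[r], compute_uv=False)[0] ** 2
    for r in range(R)
)
```

Here `f_direct` and `f_decomposed` agree up to rounding, and `f_best_given_U1` is a lower bound on both for this particular choice of the first-mode factor.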

2. Bounded condition number in one mode

In the previous section the condition number of $U^{(1)}$ was taken equal to one. This constraint can be relaxed. We have the following theorem.

Theorem 2 Let the vectors $u_r^{(n)}$, $r = 1, \ldots, R$, have unit norm. Under the condition that the condition number $\kappa(U^{(1)}) \leq k$, the function $f(X)$ attains its infimum.

Proof: All level sets of the cost function are compact. That they are closed is shown in the note by Lek-Heng Lim; that they are bounded is shown below.

We assume that all vectors $u_r^{(n)}$, $r = 1, \ldots, R$, $n = 1, \ldots, N$, have unit length. Hence, we have to show that $\lambda_r \to \infty$ implies that $f(X) \to \infty$. Let $\lambda = (\lambda_1, \ldots, \lambda_R)$. We have

$$f(X) = \big\| \mathrm{vec}(\mathcal{A}) - (U^{(1)} \odot \cdots \odot U^{(N)}) \cdot \lambda \big\|^2. \qquad (5)$$

Reasoning by contradiction, we see that the condition number of $U^{(1)} \odot \cdots \odot U^{(N)}$ is bounded because the condition number of $U^{(1)}$ is bounded. This implies that $\|(U^{(1)} \odot \cdots \odot U^{(N)}) \cdot \lambda\| \to \infty$ whenever $\lambda_r \to \infty$. As a result, $f(X) \to \infty$ whenever $\lambda_r \to \infty$.
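One way to make the boundedness step concrete, which is not the contradiction argument of the report, is the Schur product inequality: the Gram matrix of a Khatri-Rao (column-wise Kronecker) product is the entrywise product of the individual Gram matrices, so with unit-norm columns in the other modes the smallest singular value of the product is at least that of the first-mode factor. The sketch below checks this numerically, assuming real data, N = 3 and NumPy; the helper names and the nearly collinear test factors are hypothetical choices for illustration.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: column r equals kron(U[:, r], V[:, r])."""
    return np.einsum("ir,jr->ijr", U, V).reshape(U.shape[0] * V.shape[0], -1)

def unit_columns(M):
    """Rescale every column to unit Euclidean norm."""
    return M / np.linalg.norm(M, axis=0)

rng = np.random.default_rng(2)
I1, I2, I3, R = 6, 4, 5, 3

U1 = unit_columns(rng.standard_normal((I1, R)))

# Deliberately ill-conditioned factors in modes 2 and 3 (nearly collinear columns).
U2 = unit_columns(rng.standard_normal((I2, 1)) + 1e-3 * rng.standard_normal((I2, R)))
U3 = unit_columns(rng.standard_normal((I3, 1)) + 1e-3 * rng.standard_normal((I3, R)))

# Row ordering may differ from the vec convention of the text; a row permutation
# does not change singular values.
K = khatri_rao(khatri_rao(U1, U2), U3)

smin_U1 = np.linalg.svd(U1, compute_uv=False)[-1]
smin_K = np.linalg.svd(K, compute_uv=False)[-1]

# Consequence: the norm of K @ lam grows at least like smin_U1 * norm(lam).
lam = 1e6 * rng.standard_normal(R)
growth = np.linalg.norm(K @ lam) / np.linalg.norm(lam)
```

Since unit-norm columns force the largest singular value of the first-mode factor to be at least one, a condition number of at most k gives a smallest singular value of at least 1/k, so the model term grows at least like the norm of the weight vector divided by k.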

From a practical point of view, in an ALS algorithm, the constraint could be imposed by replacing the current estimate of $U^{(1)}$ by its best approximation with condition number at most $k$. This approximation is simply obtained by replacing the singular values that are more than $k$ times smaller than the dominant singular value $\sigma_1$ by $\sigma_1 / k$.
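The singular value replacement described above can be sketched as follows. This is a minimal illustration assuming NumPy and a real-valued factor matrix; the function name `clip_condition_number` and the test matrix are hypothetical, and how the step is embedded in an actual ALS iteration is left open.

```python
import numpy as np

def clip_condition_number(U, k):
    """Raise singular values below sigma_1 / k up to sigma_1 / k."""
    W, s, Vh = np.linalg.svd(U, full_matrices=False)
    s_clipped = np.maximum(s, s[0] / k)  # s is sorted in descending order
    return (W * s_clipped) @ Vh  # scales column j of W by s_clipped[j]

rng = np.random.default_rng(3)

# Test matrix with two nearly collinear unit-norm columns.
U = rng.standard_normal((6, 3))
U[:, 1] = U[:, 0] + 1e-4 * rng.standard_normal(6)
U /= np.linalg.norm(U, axis=0)

k = 10.0
U_clipped = clip_condition_number(U, k)

cond_before = np.linalg.cond(U)
cond_after = np.linalg.cond(U_clipped)
```

Note that the columns of the clipped matrix generally no longer have unit norm; whether and how to renormalize them afterwards is a design choice this sketch does not settle.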

3. Zero-correlation in one mode

One can also impose that the factors in one mode are uncorrelated. In that case, the matrix $U^{(1)}$ is of the form $\mathbf{1} \cdot m^T + Q \cdot \Omega$, in which $\mathbf{1}$ is a vector that contains only ones, $m$ contains the means of the different factors, $Q$ is column-wise orthogonal and $\Omega$ is diagonal. Even when some entries of $\Omega$ become big, this cannot lead to degeneracy, cf. Section 1. However, entries of $m$ may also become big and may mutually cancel. This means that degeneracy cannot be completely avoided. If it occurs, two or more columns of $U^{(1)}$ become proportional to $\mathbf{1}$. This is an event that happens with probability zero.
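The structure of an uncorrelated factor matrix, and the way large means push columns towards the all-ones direction, can be illustrated with a small numerical sketch. It assumes NumPy, real data, and that the orthonormal columns of the orthogonal part are themselves zero-mean (so that the vector of means is exactly the column mean of the factor matrix); the specific sizes and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
I, R = 8, 3
ones = np.ones((I, 1))

# Orthonormal columns that are orthogonal to the all-ones vector (zero-mean),
# obtained by orthogonalizing random vectors against the ones vector.
basis, _ = np.linalg.qr(np.hstack([ones, rng.standard_normal((I, R))]))
Q = basis[:, 1:]

m = np.array([1.0, -2.0, 0.5])      # column means (illustrative values)
Omega = np.diag([1.0, 2.0, 3.0])    # diagonal scaling (illustrative values)

def build_U(scale):
    """Factor matrix with the means multiplied by `scale`."""
    return ones @ (scale * m)[None, :] + Q @ Omega

U = build_U(1.0)
col_means = U.mean(axis=0)
corr = np.corrcoef(U, rowvar=False)  # correlation between columns

def abs_cosine(u, v):
    """Absolute cosine of the angle between two vectors."""
    return abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# With large means, the columns become almost proportional to the ones vector.
U_big = build_U(1e3)
cos_big = abs_cosine(U_big[:, 0], U_big[:, 1])
```

The columns stay exactly uncorrelated for every value of `scale`, yet their mutual angle shrinks as the means grow, which is the mechanism behind the remaining possibility of degeneracy.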
