Multiple Invariance ESPRIT for Nonuniform Linear Arrays: A Coupled Canonical Polyadic Decomposition Approach

Mikael Sørensen and Lieven De Lathauwer, Fellow, IEEE

Abstract—The Canonical Polyadic Decomposition (CPD) of higher-order tensors has proven to be an important tool for array processing. CPD approaches have so far assumed regular array geometries such as uniform linear arrays. However, in the case of sparse arrays such as Nonuniform Linear Arrays (NLAs), the CPD approach is not suitable anymore. Using the coupled CPD, we propose in this paper a multiple invariance ESPRIT method for both one- and multi-dimensional NLA processing. We obtain a multiresolution ESPRIT method for sparse arrays with multiple baselines. The coupled CPD framework also yields a new uniqueness condition that is relaxed compared to the CPD approach. It also leads to an eigenvalue decomposition based algorithm that is guaranteed to reduce the multi-source NLA problem into decoupled single-source NLA problems in the noiseless case. Finally, we present a new polynomial rooting procedure for the latter problem, which again is guaranteed to find the solution in the noiseless case. In the presence of noise, the algebraic algorithm provides an inexpensive initialization for optimization-based methods.

Index Terms—array processing, direction-of-arrival estimation, tensor, coupled decomposition, sparse array, multiple baseline, multiresolution, nonuniform sampling.

I. Introduction

The connection between Uniform Linear Arrays (ULAs) and the Canonical Polyadic Decomposition (CPD) is well-known [19]. In order to increase the aperture or to deal with malfunctioning sensors in ULAs, Nonuniform Linear Arrays (NLAs) have been used, see [13], [34], [6], [15], [30], [16] and references therein. Because of the nonuniform sampling, the CPD framework is not suitable for NLA processing. Many of the developed methods for NLAs make the assumption that the impinging signals are uncorrelated. This assumption often makes it possible to transform the original NLA problem into a uniformly sampled harmonic retrieval problem (e.g., [13], [15], [30], [14], [17]). In other words, NLA problems involving uncorrelated signals can often be solved by means of ULA-based methods. In this paper we do not assume that the signals are uncorrelated. For the more general case of linearly independent impinging signals, a direct-MUSIC method has been proposed in [29], assuming a one-dimensional NLA that is composed of a sparse ULA with a large inter-element spacing and one additional sensor.

M. Sørensen and L. De Lathauwer are with KU Leuven - E.E. Dept. (ESAT) - STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium, the Group Science, Engineering and Technology, KU Leuven Kulak, E. Sabbelaan 53, 8500 Kortrijk, Belgium, and iMinds Medical IT, Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium, {Mikael.Sorensen, Lieven.DeLathauwer}@kuleuven-kulak.be.

As our contribution, we present a link between the coupled CPD model [24], [25] and NLA processing. More precisely, we obtain a new uniqueness condition that is more relaxed than the one in [29]. We also derive an algebraic algorithm for one- and multi-dimensional NLAs enjoying multiple shift-invariance structures that, under the relaxed condition, is guaranteed to find the solution in the noiseless case. In particular, we adapt some of the ULA results in [22] to NLAs by considering the given NLA as a set of superimposed ULAs. For this reason, our results can be understood as an extension of classical ESPRIT [18] for ULAs to multiple invariance structured NLAs, including the computation by Generalized EigenValue Decomposition (GEVD).

A related problem concerns Direction-Of-Arrival (DOA) estimation based on sparse arrays embedded with multiple baselines (e.g. [8], [36], [10], [1]). The multiple baseline property can make sparse arrays ambiguity-free when several of the involved shift-invariance structures are exploited together. However, the existing results have so far been limited to dual shift-invariance structures and again only CPD based methods have been considered. We explain that the coupled CPD model provides an algebraic framework for sparse array processing in the case of multiple baselines.

Even though we focus on sensor array processing, the results presented in this paper may also be used in, e.g., system identification and parametric spectral analysis.

We note in passing that “grid-based” DOA estimation using sparsity constraints is another popular approach (e.g., [12]). Compared to the grid-based approach, a nice feature of “grid-less” DOA estimation is that in the case of structured arrays (e.g., [18], [22]) it can often be achieved by means of basic algebraic techniques (which can also provide an inexpensive initialization for an optimization-based method). The algorithm admits a straightforward extension to the multidimensional case, without the need to perform, e.g., an expensive multidimensional grid-search as in the direct-MUSIC method. This will be demonstrated in Section IV. Finally, grid-based DOA estimation methods do not lead to a formal identifiability condition for the original grid-less DOA estimation problem.

The paper is organized as follows. The rest of the introduction will present the notation. Section II reviews the necessary algebraic prerequisites. Section III presents an algebraic coupled CPD framework for 1D NLA processing including a relaxed uniqueness condition. An algebraic algorithm is developed in Section IV. A nice feature of the coupled CPD approach is that it can easily be generalized to the multidimensional case. This is illustrated in Section V for two-dimensional (2D) arrays composed of superimposed NLAs (e.g. L-shaped arrays with missing sensors). Numerical experiments are reported in Section VI. Section VII concludes the paper.

Notation: Vectors, matrices and tensors are denoted by lower case boldface, upper case boldface and upper case calligraphic letters, respectively. The rth column vector and transpose of a matrix A are denoted by a_r and A^T, respectively. The symbols ⊗ and ⊙ denote the Kronecker and Khatri-Rao product, defined as

    A ⊗ B := [a_{11}B  a_{12}B  · · · ; a_{21}B  a_{22}B  · · · ; ⋮ ],   A ⊙ B := [a_1 ⊗ b_1  a_2 ⊗ b_2  · · ·],

in which (A)_{mn} = a_{mn}. The Hadamard product is denoted by ∗ and is given by (A ∗ B)_{ij} = a_{ij} b_{ij}. The outer product of three vectors a, b and c is denoted by a ◦ b ◦ c, such that (a ◦ b ◦ c)_{ijk} = a_i b_j c_k. The range, kernel, Frobenius norm, real part and imaginary part of a matrix A are denoted by range(A), ker(A), ‖A‖_F, Re{A} and Im{A}. From the context it should be clear when i denotes the imaginary unit, i.e., i = √−1.

Matlab index notation will be used for submatrices of a given matrix. For example, A(1:k,:) represents the submatrix of A consisting of rows 1 to k of A. Let A ∈ C^{I×R}; then underline(A) := A(2:I,:) ∈ C^{(I−1)×R} and overline(A) := A(1:I−1,:) ∈ C^{(I−1)×R} denote the submatrices obtained by deleting the top and bottom row of A, respectively. D_k(A) ∈ C^{J×J} denotes the diagonal matrix holding row k of A ∈ C^{I×J} on its diagonal.

Let C_{m,n} ∈ C^{mn×mn} denote the commutation matrix, with the property C_{m,n}(a ⊗ b) = b ⊗ a, where a ∈ C^m and b ∈ C^n. We will also make use of the linear operator

    S_2(X) = C_{2,n−1} [ (overline(I_n) X)^T  (underline(I_n) X)^T ]^T ∈ C^{2(n−1)×p},

where X ∈ C^{n×p} and I_n ∈ C^{n×n} denotes the identity matrix.

The cardinality of a set S is denoted by card(S). The binomial coefficient is denoted by C_m^k = (m choose k) = m!/(k!(m−k)!). The k-th compound matrix of A ∈ C^{I×R} is denoted by C_k(A) ∈ C^{C_I^k×C_R^k}. It is the matrix containing the determinants of all k×k submatrices of A, arranged with the submatrix index sets in lexicographic order, see [4] and references therein for a discussion. The modulus and argument of a ∈ C are denoted by |a| and arg(a), respectively. The greatest common divisor of M, N ∈ N is denoted by GCD(M, N).
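The Khatri-Rao product and the S_2 operator defined above are straightforward to realize numerically. The following NumPy sketch makes the notation concrete; the helper names khatri_rao and smooth_S2 are introduced here for illustration only and are not from the paper.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker (Khatri-Rao) product [a_1 (x) b_1, ..., a_R (x) b_R]."""
    I, R = A.shape
    J, R2 = B.shape
    assert R == R2, "A and B need the same number of columns"
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

def smooth_S2(X):
    """S_2 operator: interleave the rows of overline(X) = X(1:n-1,:) and underline(X) = X(2:n,:).

    For X = A S^T with a column-wise Vandermonde A this yields (overline(A) (.) B) S^T,
    where B = [1 ... 1; z_1 ... z_R] holds the generators.
    """
    n, p = X.shape
    Y = np.empty((2 * (n - 1), p), dtype=X.dtype)
    Y[0::2, :] = X[:-1, :]    # overline(X): delete bottom row
    Y[1::2, :] = X[1:, :]     # underline(X): delete top row
    return Y

# quick self-check of the identity used throughout the paper
rng = np.random.default_rng(0)
z = np.exp(1j * rng.uniform(-np.pi, np.pi, 3))      # unit-circle generators
A = np.vander(z, 5, increasing=True).T              # 5 x 3 Vandermonde matrix
S = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
X = A @ S.T
B = np.vstack([np.ones(3), z])
print(np.allclose(smooth_S2(X), khatri_rao(A[:-1, :], B) @ S.T))   # True
```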

II. Algebraic Prerequisites

A. Canonical Polyadic Decomposition (CPD)

Consider the third-order tensor X ∈ C^{I×J×K}. We say that X is a rank-1 tensor if it is equal to the outer product of non-zero vectors a ∈ C^I, b ∈ C^J and c ∈ C^K such that x_{ijk} = a_i b_j c_k. A Polyadic Decomposition (PD) is a decomposition of X into a sum of rank-1 terms:

    X = Σ_{r=1}^R a_r ◦ b_r ◦ c_r ∈ C^{I×J×K}.   (1)

The rank of a tensor X is equal to the minimal number of rank-1 tensors that yield X in a linear combination. Assume that the rank of X is R, then (1) is called the CPD of X. Let us stack the vectors {a_r}, {b_r} and {c_r} into the matrices A = [a_1, . . . , a_R], B = [b_1, . . . , b_R] and C = [c_1, . . . , c_R]. The matrices A, B and C will be referred to as the factor matrices of the (C)PD of X in (1).

1) Matrix Representation: Let X^(i··) ∈ C^{J×K} denote the matrix such that (X^(i··))_{jk} = x_{ijk}, then X^(i··) = B D_i(A) C^T and we obtain the following matrix representation of (1):

    X := [X^(1··)T . . . X^(I··)T]^T = (A ⊙ B) C^T ∈ C^{IJ×K}.   (2)
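As a small illustration of (1)–(2), the NumPy sketch below builds a rank-R tensor from random factor matrices and checks that the unfolding [X^(1··)T . . . X^(I··)T]^T equals (A ⊙ B)C^T. The khatri_rao helper is the same column-wise Kronecker product as in the notation sketch above.

```python
import numpy as np

def khatri_rao(A, B):
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

I, J, K, R = 4, 5, 6, 3
rng = np.random.default_rng(1)
A = rng.standard_normal((I, R)) + 1j * rng.standard_normal((I, R))
B = rng.standard_normal((J, R)) + 1j * rng.standard_normal((J, R))
C = rng.standard_normal((K, R)) + 1j * rng.standard_normal((K, R))

# rank-R tensor X(i,j,k) = sum_r A(i,r) B(j,r) C(k,r), cf. (1)
X = np.einsum('ir,jr,kr->ijk', A, B, C)

# matrix representation (2): stack the slices X^(i..) on top of each other
X_unfold = X.reshape(I * J, K)                          # row (i-1)*J + j  <->  entry (i,j,:)
print(np.allclose(X_unfold, khatri_rao(A, B) @ C.T))    # True
```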

2) Uniqueness: The uniqueness properties of the CPD have been intensely studied in recent years, see [4], [5], [21] and references therein. In this paper we consider the case where one factor matrix has full column rank. For this case the following sufficient uniqueness condition has been obtained.

Theorem II.1. Consider the PD of X ∈ C^{I×J×K} in (1). If

    { C has full column rank,
    { C_2(A) ⊙ C_2(B) has full column rank,   (3)

then the rank of X is R and the CPD of X is unique [9], [3], [4]. Generically¹, condition (3) is satisfied if 2R(R−1) ≤ I(I−1)J(J−1) and R ≤ K [3], [26].
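Condition (3) can be checked numerically: build the second compound matrices C_2(A) and C_2(B) from all 2×2 minors, form their Khatri-Rao product and test whether it (and C) has full column rank. The sketch below does exactly that; compound_matrix and check_condition_3 are hypothetical helper names introduced here for illustration.

```python
import numpy as np
from itertools import combinations

def khatri_rao(A, B):
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

def compound_matrix(A, k=2):
    """k-th compound matrix: determinants of all k x k submatrices of A,
    with the row/column index sets ordered lexicographically."""
    I, R = A.shape
    rows = list(combinations(range(I), k))
    cols = list(combinations(range(R), k))
    Ck = np.empty((len(rows), len(cols)), dtype=A.dtype)
    for p, ri in enumerate(rows):
        for q, ci in enumerate(cols):
            Ck[p, q] = np.linalg.det(A[np.ix_(ri, ci)])
    return Ck

def check_condition_3(A, B, C, tol=1e-10):
    G = khatri_rao(compound_matrix(A), compound_matrix(B))
    full_G = np.linalg.matrix_rank(G, tol) == G.shape[1]
    full_C = np.linalg.matrix_rank(C, tol) == C.shape[1]
    return full_G and full_C

# random example: I = J = 4, K = R = 5 satisfies 2R(R-1) <= I(I-1)J(J-1) and R <= K
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 5))
B = rng.standard_normal((4, 5))
C = rng.standard_normal((5, 5))
print(check_condition_3(A, B, C))   # generically True
```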

The uniqueness condition (3) is easy to check and yet very powerful. Moreover, under condition (3) the CPD of X can be converted into the CPD of a (partially symmetric) (R×R×R) tensor M of rank R by a few algebraic steps, even when I < R and/or J < R [3]. In the exact case, the latter CPD can be computed by means of a standard GEVD, implying that under condition (3) the original CPD of X can be computed by algebraic means. Essentially, the GEVD approach exploits the structure in only two of the (R×R) slices of M (e.g., [11]). In the inexact case, the CPD of M may also be computed as such, to increase the accuracy. It is common to initialize an optimization-based algorithm for the computation of the CPD of M with the GEVD-based estimate. Finally, the estimates of A, B, C obtained via the CPD of M may themselves be refined by optimization (e.g., [20]). The CPD of M can equivalently be seen as the simultaneous diagonalization of its matrix slices, which is why the approach is denoted as Simultaneous matrix Diagonalization (SD).

¹A tensor decomposition property is called generic if it holds with probability one when the entries of the factor matrices are drawn from absolutely continuous probability density measures.

B. Coupled CPD

We say that a collection of tensors X^(n) ∈ C^{I_n×J_n×K}, n ∈ {1, . . . , N}, admits an R-term coupled PD if each tensor X^(n) can be written as [24]:

    X^(n) = Σ_{r=1}^R a_r^(n) ◦ b_r^(n) ◦ c_r,   n ∈ {1, . . . , N},   (4)

with factor matrices A^(n) = [a_1^(n), . . . , a_R^(n)], B^(n) = [b_1^(n), . . . , b_R^(n)] and C = [c_1, . . . , c_R]. We define the coupled rank of {X^(n)} as the minimal number of coupled rank-1 tensors a_r^(n) ◦ b_r^(n) ◦ c_r that yield {X^(n)} in a linear combination. Assume that the coupled rank of {X^(n)} is R, then (4) will be called the coupled CPD of {X^(n)}. The coupled (C)PD of {X^(n)} given by (4) has the following matrix representation

    X = [X^(1)T, . . . , X^(N)T]^T = F C^T ∈ C^{(Σ_{n=1}^N I_n J_n)×K},   (5)

where F = [(A^(1) ⊙ B^(1))^T, . . . , (A^(N) ⊙ B^(N))^T]^T ∈ C^{(Σ_{n=1}^N I_n J_n)×R}. The coupled rank-1 tensors in (4) can be arbitrarily permuted and the vectors within the same coupled rank-1 tensor can be arbitrarily scaled provided that the overall coupled rank-1 term remains the same. We say that the coupled CPD is unique when it is only subject to these trivial indeterminacies. Uniqueness conditions for the coupled CPD were derived in [24]. Theorem II.2 extends Theorem II.1 to the coupled CPD.

Theorem II.2. Consider the coupled PD of X^(n) ∈ C^{I_n×J_n×K}, n ∈ {1, . . . , N} in (4). Define

    G = [ C_2(A^(1)) ⊙ C_2(B^(1)) ; . . . ; C_2(A^(N)) ⊙ C_2(B^(N)) ] ∈ C^{(Σ_{n=1}^N C_{I_n}^2 C_{J_n}^2)×C_R^2}.

If

    { C has full column rank,
    { G has full column rank,   (6)

then the coupled rank of {X^(n)} is R and the coupled CPD of {X^(n)} is unique [24].

As in ordinary CPD, the uniqueness condition (6) is easy to check and yet very powerful. An extension of the SD method in [3] to the coupled CPD has been developed in [25]. Again, in the noise-free case the exact factors can be obtained from a matrix GEVD.
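The coupled structure (4)–(5) is easy to emulate numerically: the tensors share the factor C while A^(n) and B^(n) differ per tensor. The sketch below builds two coupled tensors and verifies the stacked matrix representation (5); it reuses the khatri_rao helper from the earlier sketches and is only meant to illustrate the bookkeeping.

```python
import numpy as np

def khatri_rao(A, B):
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

R, K = 3, 7
sizes = [(4, 2), (5, 2)]                      # (I_n, J_n) for n = 1, 2
rng = np.random.default_rng(3)
C = rng.standard_normal((K, R))
A = [rng.standard_normal((I, R)) for I, _ in sizes]
B = [rng.standard_normal((J, R)) for _, J in sizes]

# coupled PD (4): every tensor X^(n) shares the factor C
X = [np.einsum('ir,jr,kr->ijk', A[n], B[n], C) for n in range(2)]

# stacked matrix representation (5): [X^(1); X^(2)] = F C^T with F stacking A^(n) (.) B^(n)
X_stacked = np.vstack([Xn.reshape(-1, K) for Xn in X])
F = np.vstack([khatri_rao(A[n], B[n]) for n in range(2)])
print(np.allclose(X_stacked, F @ C.T))   # True
```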

III. Connection between NLA processing and coupled CPD

Assume that R sources are impinging on an I element NLA. Let us additionally assume that the sources are located in the far-field, that the narrowband assumption holds and that K snapshots are available such that s_r ∈ C^K denotes the signal vector associated with the rth source. In this section we also assume that the sources are located in the “xy”-plane such that only the azimuth angles are of interest. Let us choose the first sensor in the array as the reference sensor. Under these assumptions the observed data matrix admits the factorization

    X = A S^T ∈ C^{I×K},   A = [a_1 . . . a_R],   S = [s_1 . . . s_R],   (7)

in which

    a_r = [1  e^{−iω_c τ_{r2}}  . . .  e^{−iω_c τ_{rI}}]^T = [1  x_r^{n_2}  . . .  x_r^{n_I}]^T ∈ C^I,   (8)

where τ_{ri} = n_i d_x cos(θ_r)/c corresponds to the propagation delay associated with the ith sensor and the rth source, ω_c is the carrier frequency, c is the propagation speed, d_x is the nominal unit measure along the sensor axis (“x-axis”), n_i d_x is the nominal distance between the reference sensor (n_1 = 0) and the ith sensor in the NLA in which n_i ∈ N, and x_r = e^{−iω_c d_x cos(θ_r)/c} in which 0 ≤ θ_r < π denotes the azimuth angle associated with the rth source. Clearly a_r is an incomplete Vandermonde vector where not all powers appear. This is in contrast to the ULA case where all powers are present, e.g., a_r = [1  x_r  x_r^2  . . .  x_r^{I−1}]^T.

We define the NLA constrained rank of X in (7) as the minimal number of NLA constrained rank-1 matrices a_r s_r^T, where a_r is of the form (8), that yield X in a linear combination. We say that the NLA decomposition of X is unique if for every pair (Ã, S̃) satisfying the relation (7) we have Ã = AΠ, where Π is a permutation matrix. In DOA applications the goal is to estimate {θ_r} via A, observing X. On the other hand, in source separation the goal is to estimate S, observing X. For simplicity, we will limit the exposition to the case where S has full column rank. However, the results presented in this paper can be extended to cases where S does not have full column rank.

In this section we establish a link between NLA processing and the coupled CPD. The approach can be interpreted as a MI-ESPRIT method [28] for NLA. More precisely, we explain that by simultaneously exploiting several shift-invariance structures inherent in the given NLA it is, under certain conditions, possible to transform the NLA problem into a coupled CPD problem. The idea of exploiting several shift-invariance structures was first considered in [28] and later also in [16].

In order to clarify the idea we first provide an informative example in subsection III-A. Next, in subsection III-B, we formalize the reasoning and establish the link between 1D NLA and coupled CPD. Subsection III-C briefly explains that the proposed coupled CPD approach also provides a multiresolution interpretation of sparse arrays with multiple baselines.


A. Illustration

Consider the NLA observation matrix

    X = A S^T ∈ C^{I×K},   a_r = [1  x_r  x_r^2  x_r^4  x_r^5  x_r^6  x_r^8]^T,   I = 7.   (9)

The shift-invariance structures within A can be visualized as follows:

    a_r :   1   x   x^2   −   x^4   x^5   x^6   −   x^8
    x   :   {1, x, x^2}
    x   :   {x^4, x^5, x^6}
    x^2 :   {1, x^2, x^4, x^6, x^8}
    x^3 :   {x, x^4}
    x^3 :   {x^2, x^5, x^8}
    x^4 :   {1, x^4, x^8}
    x^4 :   {x, x^5}
    x^4 :   {x^2, x^6}
    x^5 :   {1, x^5}
    x^5 :   {x, x^6}
    x^6 :   {1, x^6}
    x^6 :   {x^2, x^8}
    x^7 :   {x, x^8}
    x^8 :   {1, x^8}                                        (10)

where the outer column contains the generators from x up to x^8, the outer row denotes the incomplete Vandermonde vector a_r (in which '−' denotes a missing power of x) and each row inside the display denotes the sub-ULA associated with a particular partial shift-invariance. For instance, the first row corresponds to the ULA subarray formed by the first three sensors, of which the response a_r is a Vandermonde vector generated by x_r. By inspection of (10) it becomes clear that the NLA can be interpreted as a set of superimposed ULAs. For the sake of brevity, we limit ourselves to the sub-ULAs associated with the first three rows of the display in (10) to illustrate our approach. (It can be verified that taking into account the other rows does not improve the bound on R at the end of this section.) From X in (9) we extract the submatrices X^(1,1) ∈ C^{3×K} and X^(2,1) ∈ C^{3×K} associated with x, and the submatrix X^(1,2) ∈ C^{5×K} associated with x^2 as follows:

    X^(1,1) = X([1, 2, 3], :) = A^(1,1) S^T,   (11)
    X^(2,1) = X([4, 5, 6], :) = A^(2,1) S^T,   (12)
    X^(1,2) = X([1, 3, 4, 6, 7], :) = A^(1,2) S^T,   (13)

where the submatrices of A associated with the generators x and x^2, denoted by A^(1,1) ∈ C^{3×R}, A^(2,1) ∈ C^{3×R} and A^(1,2) ∈ C^{5×R}, have columns given by

    a_r^(1,1) = [1  x_r  x_r^2]^T,   a_r^(2,1) = [x_r^4  x_r^5  x_r^6]^T,   (14)
    a_r^(1,2) = [1  x_r^2  x_r^4  x_r^6  x_r^8]^T.   (15)

Note that the vectors (14)–(15) are (column-wise scaled) Vandermonde vectors. Exploiting the shift-invariance, we obtain from (11)–(13) tensors Y^(1) ∈ C^{4×2×K} and Y^(2) ∈ C^{4×2×K} of which the matrix representations admit the coupled decomposition

    Y^(1) = [S_2(X^(1,1))^T, S_2(X^(2,1))^T]^T
          = [(overline(A^(1,1)) ⊙ B^(1))^T, (overline(A^(2,1)) ⊙ B^(1))^T]^T S^T = (A^(1) ⊙ B^(1)) S^T,
    Y^(2) = S_2(X^(1,2)) = (overline(A^(1,2)) ⊙ B^(2)) S^T = (A^(2) ⊙ B^(2)) S^T,

Variable     Description
X^(m,n)      The mth submatrix of X in (7) that is associated with the generators {x_r^n}.
A^(m,n)      The mth submatrix of A in (7) that is associated with the generators {x_r^n}.
Y^(m,n)      Tensorized version of Y^(m,n) = S_2(X^(m,n)).
I_{m,n}      Row-dimension of X^(m,n) and A^(m,n), i.e., X^(m,n) ∈ C^{I_{m,n}×K} and A^(m,n) ∈ C^{I_{m,n}×R}.
σ_{m,n}(i)   Selection of the entries of a_r that are stacked in a_r^(m,n), i.e., a_{i,r}^(m,n) = a_{σ_{m,n}(i),r}.

TABLE I
Overview of the double indexed variables in Section III.

where

    a_r^(1) = [overline(a_r^(1,1))^T | overline(a_r^(2,1))^T]^T = [1  x_r | x_r^4  x_r^5]^T,   b_r^(1) = [1  x_r]^T,
    a_r^(2) = overline(a_r^(1,2)) = [1  x_r^2  x_r^4  x_r^6]^T,   b_r^(2) = [1  x_r^2]^T.

We now obtain the coupled CPD of {Y^(1), Y^(2)} with matrix representation

    Y = [Y^(1); Y^(2)] = [A^(1) ⊙ B^(1); A^(2) ⊙ B^(2)] S^T.   (16)

Thanks to Theorem II.2 the coupled CPD of {Y^(1), Y^(2)} is generically unique if R ≤ 5. To obtain this bound, it suffices to check a random example. Indeed, (6) can be expressed in terms of determinants being nonzero, and determinants are analytic functions. See [7] for a mathematical justification of this argument. Note that if we only use the CPD of Y^(1), then DOA identifiability can only be guaranteed for R ≤ 4 (e.g. Theorem II.1).
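To make the illustration concrete, the sketch below builds the submatrices (11)–(13) from the NLA of (9) and verifies numerically that the stacked matrix Y in (16) has the claimed coupled Khatri-Rao structure. The helpers khatri_rao and smooth_S2 are the same hypothetical routines used in the earlier sketches.

```python
import numpy as np

def khatri_rao(A, B):
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

def smooth_S2(X):
    n, p = X.shape
    Y = np.empty((2 * (n - 1), p), dtype=X.dtype)
    Y[0::2, :] = X[:-1, :]
    Y[1::2, :] = X[1:, :]
    return Y

rng = np.random.default_rng(5)
R, K = 3, 50
x = np.exp(-1j * np.pi * np.cos(np.deg2rad([10, 40, 45])))   # unit-circle generators
positions = np.array([0, 1, 2, 4, 5, 6, 8])                  # NLA of (9)
A = x[None, :] ** positions[:, None]                         # 7 x 3
S = rng.standard_normal((K, R)) + 1j * rng.standard_normal((K, R))
X = A @ S.T

# submatrices (11)-(13): sub-ULAs with generators x and x^2 (indices shifted to 0-based)
X11, X21, X12 = X[[0, 1, 2], :], X[[3, 4, 5], :], X[[0, 2, 3, 5, 6], :]

# coupled CPD representation (16)
Y = np.vstack([smooth_S2(X11), smooth_S2(X21), smooth_S2(X12)])
A1 = np.vstack([A[[0, 1], :], A[[3, 4], :]])                 # [1 x | x^4 x^5]
B1 = np.vstack([np.ones(R), x])
A2 = A[[0, 2, 3, 5], :]                                      # [1 x^2 x^4 x^6]
B2 = np.vstack([np.ones(R), x ** 2])
F = np.vstack([khatri_rao(A1, B1), khatri_rao(A2, B2)])
print(np.allclose(Y, F @ S.T))                               # True
```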

B. 1D NLA via coupled CPD

Let us now formalize the reasoning behind the previous example. The first step is to identify ULA structures within A in (7). In the second step we turn the matrix decomposition into a tensor decomposition via spatial smoothing. In the third and final step we combine the generated tensors, leading to a coupled CPD problem.

1) Multiple generators lead to multiple sub-ULAs: The coupled CPD interpretation of multiple shift-invariant structured NLAs will make use of several double indexed variables which are listed in Table I.

Consider the NLA observation matrix X ∈ C^{I×K} given by (7) where A ∈ C^{I×R} is an NLA matrix (e.g. of the form (9)). We call x_r^n a generator if there exist I_{m,n} ≥ 2 entries of a_r, denoted by a_{σ_{m,n}(1),r}, a_{σ_{m,n}(2),r}, . . . , a_{σ_{m,n}(I_{m,n}),r}, such that a_{σ_{m,n}(i),r} = a_{σ_{m,n}(1),r} x_r^{(i−1)n}, ∀i ∈ {1, . . . , I_{m,n}}. The first index of I_{m,n} denotes the mth subvector extracted from a_r while the second index of I_{m,n} denotes the power of x_r in the generator (i.e., x_r^n). As an example, in the cases where n ∈ {1, 2}, the matrix A in (9) has ULA structured submatrices with properties:


    I_{1,1} = 3:   σ_{1,1}(1) = 1,   σ_{1,1}(2) = 2,   σ_{1,1}(3) = 3
    I_{2,1} = 3:   σ_{2,1}(1) = 4,   σ_{2,1}(2) = 5,   σ_{2,1}(3) = 6
    I_{1,2} = 5:   σ_{1,2}(1) = 1,   σ_{1,2}(2) = 3,   σ_{1,2}(3) = 4,   σ_{1,2}(4) = 6,   σ_{1,2}(5) = 7

Let Ω ⊆ {1, . . . , I} denote the set of powers of x that appear in the generators of the submatrices. Assume that there are M_n submatrices of A associated with the n-th power of the generator, i.e., with x^n. We denote these submatrices by A^(1,n) ∈ C^{I_{1,n}×R}, . . . , A^(M_n,n) ∈ C^{I_{M_n,n}×R}. Since {A^(m,n)} are (column-wise scaled) Vandermonde matrices, we have a_{p+1,r}^(m,n) = a_{p,r}^(m,n) x_r^n. By way of illustration, consider A in (9) and the subset Ω = {1, 2}. Then we have M_1 = 2 with A^(1,1) and A^(2,1) given by (14) and M_2 = 1 with A^(1,2) given by (15).

For each pair (m, n) we extract the corresponding submatrix X^(m,n) ∈ C^{I_{m,n}×K} from X in (7) as follows

    X^(m,n) = X([σ_{m,n}(1), . . . , σ_{m,n}(I_{m,n})], :) = A^(m,n) S^T,   (17)

where A^(m,n) = A([σ_{m,n}(1), . . . , σ_{m,n}(I_{m,n})], :), m ∈ {1, . . . , M_n} and n ∈ Ω. For instance, in the case (m, n) = (2, 1), the columns of A in (9) are related to the columns of A^(2,1) as follows:

    a_r^(2,1) = [a_{1,r}^(2,1); a_{2,r}^(2,1); a_{3,r}^(2,1)] = [a_{σ_{2,1}(1),r}; a_{σ_{2,1}(2),r}; a_{σ_{2,1}(3),r}] = [a_{4,r}; a_{5,r}; a_{6,r}] ∈ C^{I_{2,1}},   I_{2,1} = 3.
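The bookkeeping of Table I (the index sets σ_{m,n} and the generator powers in Ω) can be automated: for a given set of integer sensor positions one can search, for every candidate power n, for maximal runs of positions in arithmetic progression with step n. The sketch below is one possible way to do this; find_sub_ulas is a hypothetical helper, not an algorithm specified in the paper.

```python
from itertools import count

def find_sub_ulas(positions, min_len=2):
    """Return {n: [index tuples]} where each tuple collects the (1-based) sensor
    indices sigma_{m,n}(1..I_{m,n}) of a maximal sub-ULA with spacing n."""
    pos = sorted(positions)
    pos_to_idx = {p: i + 1 for i, p in enumerate(pos)}       # 1-based, as in the paper
    aperture = max(pos) - min(pos)
    out = {}
    for n in range(1, aperture + 1):
        runs, used = [], set()
        for p in pos:
            if p in used:
                continue
            run = [p]
            for k in count(1):                               # extend p, p+n, p+2n, ...
                if p + k * n in pos_to_idx:
                    run.append(p + k * n)
                else:
                    break
            if len(run) >= min_len:
                runs.append(tuple(pos_to_idx[q] for q in run))
                used.update(run)
        if runs:
            out[n] = runs
    return out

# NLA of (9): positions n_i = 0, 1, 2, 4, 5, 6, 8
sub_ulas = find_sub_ulas([0, 1, 2, 4, 5, 6, 8])
print(sub_ulas[1])   # [(1, 2, 3), (4, 5, 6)]
print(sub_ulas[2])   # [(1, 3, 4, 6, 7)]
```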

2) From matrices to tensors: Using the shift-invariance relation a_{p+1,r}^(m,n) = a_{p,r}^(m,n) x_r^n, the mapping S_2(X^(m,n)) produces the tensor Y^(m,n) ∈ C^{(I_{m,n}−1)×2×K} with matrix representation

    Y^(m,n) = S_2(X^(m,n)) = (overline(A^(m,n)) ⊙ B^(n)) S^T,   (18)

where the rth column of overline(A^(m,n)) ∈ C^{(I_{m,n}−1)×R} is given by

    overline(a_r^(m,n)) = a_r^(m,n)(1 : I_{m,n}−1, :),   (19)

and

    B^(n) = [1 · · · 1 ; x_1^n · · · x_R^n] ∈ C^{2×R}.   (20)

Define I_n = Σ_{m=1}^{M_n} I_{m,n} − M_n. Stacking Y^(1,n), . . . , Y^(M_n,n) yields Y^(n) ∈ C^{I_n×2×K} for which the matrix representation Y^(n) ∈ C^{2I_n×K} admits the decomposition

    Y^(n) = [Y^(1,n)T . . . Y^(M_n,n)T]^T = (A^(n) ⊙ B^(n)) S^T,   (21)

where

    A^(n) = [overline(A^(1,n))^T . . . overline(A^(M_n,n))^T]^T ∈ C^{I_n×R}.   (22)

3) From NLA to coupled CPD: Let us index the elements in Ω by ω(1), . . . , ω(card(Ω)). For every element n ∈ Ω it is assumed that I_{ω(n)} ≥ 2. This condition is necessary for the definition of G^(Ω) given in (24) further on. We now obtain the coupled CPD of {Y^(n)} with matrix representation Y_tot ∈ C^{(Σ_{n∈Ω} I_n)2×K} given by

    Y_tot = [Y^(ω(1)); . . . ; Y^(ω(card(Ω)))] = [A^(ω(1)) ⊙ B^(ω(1)); . . . ; A^(ω(card(Ω))) ⊙ B^(ω(card(Ω)))] S^T.   (23)

Proposition III.1 below summarizes the transformation of the NLA problem into a coupled CPD. In short, Theorem II.2 together with conditions (25a)–(25b) ensure that the NLA factorization can be expressed as a coupled CPD with unique factor matrices. Condition (25c) in turn ensures that the involved NLA constrained rank-1 terms can be extracted from the coupled CPD in a unique manner.

Proposition III.1 makes use of the following matrix:

    G^(Ω) = [ C_2(A^(ω(1))) ⊙ C_2(B^(ω(1))) ; . . . ; C_2(A^(ω(card(Ω)))) ⊙ C_2(B^(ω(card(Ω)))) ].   (24)

Proposition III.1. Consider the NLA factorization of X = A S^T ∈ C^{I×K} in (7) where A is an NLA structured matrix with columns a_r = [1  x_r^{n_2}  . . .  x_r^{n_I}]^T and d_x ≤ λ/2 = πc/ω_c. If

    S in (7) has full column rank,   (25a)
    G^(Ω) in (24) has full column rank,   (25b)
    GCD(n_i, n_j) = 1 for some i ≠ j,   (25c)

then the NLA constrained rank of X is R and the NLA decomposition of X is unique.

Proof: Since G^(Ω) and S have full column rank, we know from Theorem II.2 that the coupled rank of {Y^(n)} is R and the coupled CPD of {Y^(n)} is unique. This in turn implies that A = X(S^T)^† is unique up to a column permutation and scaling. In other words, the coupled CPD of {Y^(n)} has split the problem into a set of single source DOA estimation problems. It is now easy to show that we can recover θ_r from z_r = α_r a_r, where α_r ∈ C is an arbitrary scaling factor. We add the details for completeness.

For simplicity, consider first the special case where n_j − n_i = 1 for some i < j. This implies that

    [z_{ir} ; z_{jr}] = α_r [x_r^{n_i} ; x_r^{n_j}]  ⇒  z_{jr}/z_{ir} = x_r^{n_j−n_i} = x_r for some i < j.

Since d_x ≤ πc/ω_c, n_j − n_i = 1 and 0 ≤ θ_r < π, we can uniquely recover θ_r from x_r = e^{−iω_c d_x (n_j−n_i) cos(θ_r)/c} = e^{−iω_c d_x cos(θ_r)/c}.

Consider now the general case in which n_j − n_i ≥ 1, ∀i < j and GCD(n_i, n_j) = 1 for some i ≠ j. We have

    [z_{1r} ; z_{ir}] = α_r [1 ; x_r^{n_i}]  ⇒  b_r = z_{ir}/z_{1r} = x_r^{n_i},   (26)
    [z_{1r} ; z_{jr}] = α_r [1 ; x_r^{n_j}]  ⇒  c_r = z_{jr}/z_{1r} = x_r^{n_j}.   (27)

The solutions to (26) are x_r^{(k)} = |b_r|^{1/n_i} e^{i(2πk+arg(b_r))/n_i}, k ∈ {0, 1, . . . , n_i − 1}, while the solutions to (27) are x_r^{(m)} = |c_r|^{1/n_j} e^{i(2πm+arg(c_r))/n_j}, m ∈ {0, 1, . . . , n_j − 1}. Hence, any solution

that satisfies both (26) and (27) must also satisfy the relation

    (2πk + arg(b_r))/n_i = (2πm + arg(c_r))/n_j.   (28)

Note that b_r^{n_j} = x_r^{n_i n_j} = |b_r|^{n_j} e^{i arg(b_r) n_j} and c_r^{n_i} = x_r^{n_i n_j} = |c_r|^{n_i} e^{i arg(c_r) n_i}, implying that arg(b_r) n_j = arg(c_r) n_i. Together with relation (28) this implies that we have k n_j/n_i = m. Since GCD(n_i, n_j) = 1 and k < n_i we conclude that the only common solution to (26) and (27) is k = m = 0, corresponding to arg(b_r)/n_i = arg(c_r)/n_j = −ω_c d_x cos(θ_r)/c. Since d_x ≤ πc/ω_c and 0 ≤ θ_r < π we can uniquely recover θ_r from arg(b_r) and arg(c_r).

We conclude that if conditions (25a)–(25c) are satisfied, then the NLA constrained rank of X is R and the NLA decomposition of X is unique.
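The proof suggests a simple recipe for recovering a unit-norm generator from two ambiguous power measurements b_r = x_r^{n_i} and c_r = x_r^{n_j} with GCD(n_i, n_j) = 1: enumerate the n_i-th roots of b_r and keep the one consistent with c_r. The sketch below implements this recipe (recover_generator is a hypothetical helper); it is a didactic check of the argument rather than part of Algorithm 1.

```python
import numpy as np

def recover_generator(b, c, ni, nj):
    """Recover x on the unit circle from b = x**ni and c = x**nj, with gcd(ni, nj) = 1."""
    k = np.arange(ni)
    candidates = np.exp(1j * (2 * np.pi * k + np.angle(b)) / ni)   # all ni-th roots of b
    # keep the candidate whose nj-th power matches c best (exact in the noiseless case)
    return candidates[np.argmin(np.abs(candidates ** nj - c))]

theta = np.deg2rad(40.0)
x_true = np.exp(-1j * np.pi * np.cos(theta))      # half-wavelength spacing, cf. (8)
b, c = x_true ** 3, x_true ** 8                   # e.g. the NLA a_r = [1 x^3 x^6 x^8] of (30)
x_hat = recover_generator(b, c, 3, 8)
theta_hat = np.arccos(-np.angle(x_hat) / np.pi)   # invert x = exp(-i*pi*cos(theta))
print(np.isclose(x_hat, x_true), np.rad2deg(theta_hat))   # True 40.0
```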

Note that condition (25c) is necessary while conditions (25a) and (25b) are only sufficient. Even though we have only discussed the case of unit norm generators, Proposition III.1 is also valid for cases where the generators are not necessarily located on the unit circle.

Note also that conditions (25a) and (25b) are valid for arbitrarily positioned sensors provided that the associated steering vectors enjoy shift-invariance structures, i.e., also in the case of arbitrary sensor positions the overall recovery problem can be decoupled into single-source estimation problems. On the other hand, condition (25c) is limited to subsampled regular grids (multidimensional linear arrays, rectangular arrays, hexagonal arrays, etc.). Thus, in the case of arbitrarily positioned sensors, a different condition for the extraction of the generators {x_r} from A is required.

Proposition III.1 can be understood as a generalization of the identifiability condition for sparse arrays in [29]. More precisely, it was explained in [29] that the factorization of X in (7) with I = N + 1 and a_r = [1, x_r^M, . . . , x_r^{M(N−1)}, x_r^P]^T is generically unique if R < N, |x_r| = 1, GCD(M, P) = 1 and K ≥ R. Compared to the

result in [29], Proposition III.1 does not require an N element sub-ULA with property R < N, it is not limited to generators on the unit circle, it admits a constructive interpretation (see Section IV) and it extends to the multidimensional case (see Section V).

Since each tensor Y^(n) ∈ C^{I_n×2×K}, n ∈ Ω, only consists of two matrix slices, the coupled CPD of {Y^(n)} can be interpreted as a coupled Matrix-Pencil (MP) problem. More precisely, let P^(n) ∈ C^{2I_n×2I_n} denote a row-permutation matrix such that Z^(n) := P^(n) Y^(n) = (B^(n) ⊙ A^(n)) S^T. Consider the partitioning Z^(n) = [Z^(1,n); Z^(2,n)] = [A^(n) S^T; A^(n) D_2(B^(n)) S^T] with Z^(1,n), Z^(2,n) ∈ C^{I_n×K}. Define Γ = [γ_1, . . . , γ_R] := (S^T)^† and Υ = D_1([λ_1, . . . , λ_R]) := (D_1([x_1, . . . , x_R]))^{−1}. Then we obtain a coupled MP problem

    Z^(1,n) Γ = Z^(2,n) Γ Υ^n  ⇔  (Z^(1,n) − λ_r^n Z^(2,n)) γ_r = 0_{I_n},   (29)

where r ∈ {1, . . . , R} and n ∈ Ω. In other words, the factorization problem (7) amounts to finding the common generalized eigenvectors of a set of MPs {(Z^(1,n), Z^(2,n))}_{n∈Ω} subject to a power constraint on the associated generalized eigenvalues ({λ_r^n}_{n∈Ω}). The power constraint can make the generalized eigenvalues (λ_r) unique despite min(Ω) > 1. Likewise, the coupling between the MPs can make the common generalized eigenvectors (γ_r) in (29) unique, even in cases where for certain n, Z^(1,n) and Z^(2,n) have overlapping kernels (ker(Z^(1,n)) ∩ ker(Z^(2,n)) ≠ {0}) so that the eigenvector in (29) is not unique. Note also that the ordinary linear MP case (Ω = {1}) corresponds to classical ESPRIT [18].

Let us illustrate that the aforementioned power constraint (λ_r^n) leads to improved DOA identifiability. Consider the NLA for which A has columns of the form:

    a_r = [1  x_r^3  x_r^6  x_r^8]^T.   (30)

(Note that (30) corresponds to the NLA studied in [29] if |x_r| = 1.) Consider the sub-ULAs [1  x_r^3  x_r^6]^T and [x_r^6  x_r^8]^T of (30) with inter-element spacings n = 3 and n = 2, respectively. Since Ω = {2, 3} and GCD(2, 3) = 1, Proposition III.1 guarantees the uniqueness of a generic NLA decomposition in which A has columns of the form (30) if R ≤ 2. On the other hand, if only the sub-ULA [1  x_r^3  x_r^6]^T is used and the element x_r^8 in (30) is ignored, then unique DOA recovery is not possible since now Ω = {3} and min(Ω) > 1. Indeed, the associated MP involves only x_r^3 and not x_r itself. Similarly, if only the sub-ULA [x_r^6  x_r^8]^T is used, then unique DOA recovery is not possible since Ω = {2} and min(Ω) > 1.

To illustrate that coupling can reduce the dimension of the eigenspace, consider the NLA for which A has columns of the form:

    a_r = [1  x_r  x_r^2  x_r^{11}  x_r^{15}  x_r^{18}  x_r^{21}  x_r^{23}]^T.   (31)

(Note that (31) corresponds to a minimum redundancy array with eight sensors [2].) By simultaneously exploiting the multiple shift-invariance structures in (31) that are associated with the generators x_r, x_r^2, x_r^3 and x_r^{10}, Proposition III.1 guarantees the uniqueness of a generic NLA decomposition with columns of the form (31) if R ≤ 3. Note that if we only exploit one of the sub-ULAs in (31), say [1  x_r  x_r^2]^T, then the more restrictive NLA uniqueness condition R ≤ 2 is obtained.

This does not mean that identifiability properties for all possible NLAs can be derived via the coupled CPD framework. For instance, nonredundant arrays [33] such as a_r = [1  x_r  x_r^4  x_r^9  x_r^{11}]^T do not contain any shift-invariance structure and hence the coupled CPD approach does not apply if R > 1.

C. Multiresolution interpretation of coupled CPD for NLA

Consider the NLA response matrix A with columns of the form (31). We note that the partial sub-ULAs within A have generators with small and large powers, i.e., a_r(1) x_r = a_r(2), a_r([1, 7]) x_r^2 = a_r([3, 8]), a_r([5, 6]) x_r^3 = a_r([6, 7]) and a_r([2, 4]) x_r^{10} = a_r([4, 7]). In other words, the NLA involves both small and large baselines. This property has been used to develop arrays with multiresolution properties (e.g. [8], [36], [10], [1]). An example taken from [1] is the case of three separated ULAs where the response matrix A has columns of the form a_r = [1  x_r  x_r^2  x_r^{14}  x_r^{15}  x_r^{21}  x_r^{22}  x_r^{23}]^T. We note that a_r has

generators x_r, x_r^7, x_r^8, x_r^{13}, x_r^{14} and x_r^{20}. Roughly speaking,

a small baseline can lead to coarse but unique DOA estimates while a large baseline can lead to accurate but ambiguous DOA estimates. The idea behind array processing involving multiple baselines is to combine the advantages of small and large baselines.

A related example is nested arrays or co-prime arrays composed of superimposed or concatenated ULAs with different inter-element spacings [15], [30], [14], [29], [17]. An example could be two superimposed ULAs where the response matrix A has columns of the form a_r = [1  x_r^5  x_r^7  x_r^{10}  x_r^{14}  x_r^{15}  x_r^{20}  x_r^{21}  x_r^{25}  x_r^{28}  x_r^{30}]^T. From A we can extract A^(1), A^(5) and A^(7) with columns of the form a_r^(1) = [x_r^{14}  x_r^{15}  x_r^{20}  x_r^{21}]^T, a_r^(5) = [1  x_r^5  x_r^{10}  x_r^{15}  x_r^{20}  x_r^{25}  x_r^{30}]^T and a_r^(7) = [1  x_r^7  x_r^{14}  x_r^{21}  x_r^{28}]^T, respectively. It is now clear that the coupled CPD approach for NLA can also be used for nested arrays or co-prime arrays without making orthogonality assumptions on S or making use of any kind of data imputation/interpolation.

The coupled CPD approach also supports the use of hybrid arrays composed of sparse arrays with periodic subarrays [35].

Actually, in the case of sparse array processing, the coupled CPD approach for NLA can be interpreted as a multiresolution variant of ESPRIT. A distinct advantage of the coupled CPD approach compared to the existing methods (e.g. [8], [36], [10], [1], [29]) is that it simultaneously exploits the structure of the involved subarrays. As such, none of the involved subarrays is required to yield unique DOAs and also a more relaxed bound on

R may be obtained.

As an example, consider the NLA response matrix A = [A^(2)T  A^(11)T  A^(14)T]^T ∈ C^{15×R} with columns a_r^(2) = [1  x_r^2  x_r^4  x_r^6  x_r^8]^T, a_r^(11) = [x_r^{11}  x_r^{21}  x_r^{31}  x_r^{41}  x_r^{51}]^T and a_r^(14) = [x_r^{14}  x_r^{28}  x_r^{46}  x_r^{62}  x_r^{76}]^T. Methods that process the sub-ULAs

individually can identify at most R≤ 4 DOAs. However, Proposition III.1 implies that this NLA factorization is generically unique if R≤ 7.

IV. MI-ESPRIT algorithm for NLA processing

In this section we propose a MI-ESPRIT variant of the SD method for coupled CPD that is suited for shift-invariance structured NLA factorization problems. More precisely, the basic compression step 0 and the algebraic link between coupled CPD and SD in step 2 are the same as in [25] while steps 1 and 3 are specific for MI-ESPRIT. The SD problem can in turn be reduced to a GEVD, explaining the relation with classical ESPRIT [18].

a) Step 0: Dimensionality reduction: Assume that the full column rank conditions (25a) and (25b) in Proposition III.1 are satisfied, then A in (7) has full column rank. Indeed, if A does not have full column rank, then there exists a vector y ∈ ker(A) and a vector x ∈ C^K such that X = A(S + xy^T)^T has rank R − 1. This in turn contradicts Proposition III.1. Let X = UΣV^H denote the compact SVD of X, where U ∈ C^{I×R} and V ∈ C^{K×R} are columnwise orthonormal and Σ ∈ C^{R×R} is diagonal. We know that range(X) = range(U) and that there exists a nonsingular matrix F ∈ C^{R×R} such that

    UΣ = A F^T and V F^{−1} = S.   (32)

In the further derivation we will work with UΣ = A F^T, rather than with Eq. (7), since the former only involves

R column vectors, where R can be substantially less than

the number of samples K.
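Step 0 is a standard low-rank compression. The sketch below shows the corresponding NumPy operations under the (hypothetical) assumption that R is known; in practice R would be estimated from the singular value profile of Σ.

```python
import numpy as np

def compress(X, R):
    """Step 0: compact SVD X = U diag(s) Vh truncated to rank R.
    Returns U*Sigma (the R-column matrix used in the later steps) and V."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :R], s[:R], Vh[:R, :]
    return U * s, Vh.conj().T            # U*Sigma = A F^T (cf. (32)), and V

rng = np.random.default_rng(6)
I, K, R = 8, 50, 3
A = rng.standard_normal((I, R)) + 1j * rng.standard_normal((I, R))
S = rng.standard_normal((K, R)) + 1j * rng.standard_normal((K, R))
X = A @ S.T

US, V = compress(X, R)
# U*Sigma spans the same column space as A: projecting A onto it leaves A unchanged
P = US @ np.linalg.pinv(US)
print(np.allclose(P @ A, A))             # True
```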

b) Step 1: From NLA factorization to coupled CPD: For every m ∈ {1, . . . , M_n} and for every n ∈ Ω we extract from UΣ in (32) the compressed variant of (17):

    U^(m,n) = U([σ_{m,n}(1), . . . , σ_{m,n}(I_{m,n})], :) Σ = A^(m,n) F^T.   (33)

From the matrices {U^(m,n)} in (33) we obtain the following compressed variant of (21):

    Y^(n)_red = [S_2(U^(1,n)); . . . ; S_2(U^(M_n,n))] = (A^(n) ⊙ B^(n)) F^T,   n ∈ Ω,   (34)

where 'red' stands for reduced. Stacking for all n ∈ Ω yields the following compressed variant of (23):

    Y_tot,red = [Y^(ω(1))_red; . . . ; Y^(ω(card(Ω)))_red] = [A^(ω(1)) ⊙ B^(ω(1)); . . . ; A^(ω(card(Ω))) ⊙ B^(ω(card(Ω)))] F^T.   (35)

Clearly the low rank structured matrix decomposition (35) corresponds to a matrix representation of the coupled CPD of the tensors Y^(n)_red ∈ C^{I_n×2×R}, n ∈ Ω.

c) Step 2: From coupled CPD to SD: Assume that the full column rank conditions (25a) and (25b) in Proposition III.1 are satisfied, then there exist R symmetric matrices {M^(r)} built from the matrix (35), admitting the factorizations (see [25], [22] for the precise construction):

    M^(r) = G Λ^(r) G^T,   r ∈ {1, . . . , R},   (36)

where G = F^{−1} and where Λ^(r) ∈ C^{R×R} are diagonal matrices. It is explained in [25] that the complexity of the construction of {M^(r)} is only of order 4 min((Σ_{n∈Ω} I_n^2)R^2, (Σ_{n∈Ω} I_n)R^4).

The simultaneous decomposition (36) corresponds to the CPD of the partially symmetric tensor M that was mentioned in Section II-A. More specifically, M is obtained by stacking the matrices M^(r); its CPD is given by M = Σ_{r=1}^R g_r ◦ g_r ◦ γ_r in which γ_r = [λ_{rr}^(1), . . . , λ_{rr}^(R)]^T.

In the exact case, the SD problem (36) can be solved by means of an eigenvalue decomposition (e.g. [11]), with the columns of G−T as generalized eigenvectors.


d) Step 3: From SD to single-source DOA retrieval: After G = F^{−1} has been found from (36), we obtain A and S via (32). The generator x_r can now be obtained from a_r. More precisely, define

    e_r^(n) = [overline(a_r^(1,n))^T . . . overline(a_r^(M_n,n))^T]^T  and  f_r^(n) = [underline(a_r^(1,n))^T . . . underline(a_r^(M_n,n))^T]^T,   n ∈ Ω.   (37)

As explained in the proof of Proposition III.1, condition (25c) ensures that the generator x_r ∈ C can be estimated as the minimizer of the univariate polynomial f(x_r) = Σ_{n∈Ω} ‖e_r^(n) x_r^n − f_r^(n)‖_F^2. However, in DOA estimation x_r is not arbitrarily complex-valued but located on the unit circle. In that case x_r must satisfy the conditions

    e_r^(n) x_r^n = f_r^(n),   n ∈ Ω,   subject to x_r = e^{i·φ_r},   0 ≤ φ_r < π.   (38)

Proposition IV.1 below states that φ_r can be found via the rooting of a real polynomial.

Proposition IV.1. Consider f(φ_r) = Σ_{n∈Ω} ‖e_r^(n) e^{i·φ_r·n} − f_r^(n)‖_F^2, where e_r^(n) and f_r^(n) are given by (37) and Ω ⊂ {1, . . . , I}. The largest element in Ω is denoted by N_max. Consider also the real polynomial of degree 2N_max:

    g(t) = Σ_{n∈Ω} [ Re{β_r^(n)} Σ_{p=0}^{2n} Σ_{k=0}^{N_max−n} C_{2n}^p C_{N_max−n}^k ( −p t^{2n−p+1+2k} + (2n−p) t^{2n−p−1+2k} ) cos(½(2n−p)π)
            + Im{β_r^(n)} Σ_{q=0}^{2n} Σ_{m=0}^{N_max−n} C_{2n}^q C_{N_max−n}^m ( −q t^{2n−q+1+2m} + (2n−q) t^{2n−q−1+2m} ) sin(½(2n−q)π) ],   (39)

where β_r^(n) = e_r^(n)H f_r^(n). Let t̂ denote the real root of g(t) that minimizes f(2 tan^{−1}(t̂)); then

    2 tan^{−1}(t̂) = arg min_{0≤φ_r<π} f(φ_r).

Proof: The proof is technical and is given in the

supplementary material of [23].
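In practice one can also obtain φ_r by directly minimizing f(φ_r) over [0, π), which avoids spelling out the polynomial (39). The sketch below uses a coarse grid followed by a local refinement with scipy.optimize.minimize_scalar; this is a simple surrogate for the rooting procedure of Proposition IV.1, not the procedure itself, and estimate_phi is a hypothetical helper name.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def estimate_phi(e_list, f_list, powers, grid_size=2048):
    """Minimize f(phi) = sum_n || e^(n) * exp(i*phi*n) - f^(n) ||^2 over [0, pi)."""
    def cost(phi):
        return sum(np.sum(np.abs(e * np.exp(1j * phi * n) - f) ** 2)
                   for e, f, n in zip(e_list, f_list, powers))
    grid = np.linspace(0.0, np.pi, grid_size, endpoint=False)
    phi0 = grid[np.argmin([cost(p) for p in grid])]
    h = np.pi / grid_size
    res = minimize_scalar(cost, bounds=(max(phi0 - h, 0.0), min(phi0 + h, np.pi)),
                          method='bounded')
    return res.x

# toy check: build e^(n), f^(n) for a single source on the NLA [1 x^3 x^6 x^8] of (30)
phi_true = 1.1                         # some angle in [0, pi)
x = np.exp(1j * phi_true)
a = np.array([1, x**3, x**6, x**8])
e3, f3 = a[[0, 1]], a[[1, 2]]          # generator x^3: overline/underline parts of [1 x^3 x^6]
e2, f2 = a[[2]], a[[3]]                # generator x^2: sub-ULA [x^6 x^8]
phi_hat = estimate_phi([e3, e2], [f3, f2], [3, 2])
print(np.isclose(phi_hat, phi_true, atol=1e-4))   # True
```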

e) Summary of algorithm: The overall procedure is

summarized as Algorithm 1. Note that if conditions (25a)–(25c) in Proposition III.1 are satisfied, then Algorithm 1 is guaranteed to find the exact decomposition in the noiseless case. In more detail, conditions (25a) and (25b) in Proposition III.1 guarantee that the columns {a_r} of A are unique up to individual scaling factors. This means that there exists a unit norm generator x_r = e^{i·φ_r} such that e_r^(n) x_r^n = f_r^(n) in (38) is satisfied for every n ∈ Ω. Finally, condition (25c) in Proposition III.1 guarantees the uniqueness of this unit norm generator. In the presence of noise, Algorithm 1 can provide an inexpensive initialization for an optimization-based method.

Algorithm 1 SD method for MI-ESPRIT (Prop. III.1).
Input: X = A S^T of the form (7).
1. Determine subset Ω ⊆ {1, . . . , I}.
2. Compute SVD X = UΣV^H (and find R from Σ).
3. Extract U^(m,n) in (33) from UΣ, ∀m ∈ {1, . . . , M_n}, ∀n ∈ Ω.
4. Build Y^(n)_red in (34) for every n ∈ Ω.
5. Build matrices {M^(r)} from {Y^(n)_red}_{n∈Ω} [25], [22].
6. Solve SD problem M^(r) = G Λ^(r) G^T, r ∈ {1, . . . , R}.
7. Compute A = UΣG^T and S = VG.
8. Obtain {x_r} from A (e.g. via Proposition IV.1).
Output: {x_r}, A and S.

V. Extension to the multidimensional case

A simple extension of the 1D NLA is an array built from superimposed NLAs. Here we adapt some of the results in [22] to the NLA case. Subsection V-A provides an illustrative example. Next, in Subsection V-B we formalize the extension to the multidimensional case.

A. Illustration

Consider an L-shaped array composed of I^(1) sensors along the “x_1-axis” and I^(2) sensors along the “x_2-axis” such that (7) extends to

    X_(1) = A_(1) S^T ∈ C^{I^(1)×K},   X_(2) = A_(2) S^T ∈ C^{I^(2)×K},   (40)

where X_(1) denotes the observation matrix associated with the sensors along the “x_1-axis” in which (A_(1))_r = [1  x_{r,1}^{n_{2,1}}  . . .  x_{r,1}^{n_{I^(1),1}}]^T ∈ C^{I^(1)}, where x_{r,1} = e^{−iω_c d_1 sin(φ_r) cos(θ_r)/c}, d_1 is the unit spacing along the “x_1-axis” and the new variable φ_r denotes the elevation angle associated with the rth source. Similarly, X_(2) denotes the observation matrix associated with the sensors along the “x_2-axis” in which (A_(2))_r = [1  x_{r,2}^{n_{2,2}}  . . .  x_{r,2}^{n_{I^(2),2}}]^T ∈ C^{I^(2)}, where x_{r,2} = e^{−iω_c d_2 sin(φ_r) sin(θ_r)/c} and d_2 is the unit spacing along the “x_2-axis”.

As in the 1D case, we define the NLA constrained and coupled rank of {X_(1), X_(2)} in (40) as the minimal number of NLA constrained and coupled rank-1 matrices {(A_(1))_r s_r^T, (A_(2))_r s_r^T} that yield {X_(1), X_(2)} in a linear combination. We also say that the coupled NLA decomposition of {X_(1), X_(2)} is unique if for every triplet (Ã_(1), Ã_(2), S̃) satisfying the relation (40) we have Ã_(1) = A_(1)Π and Ã_(2) = A_(2)Π, where Π is a permutation matrix.

By way of example, let X_(1) in (40) be of the form (9) while X_(2) in (40) admits the factorization

    X_(2) = A_(2) S^T ∈ C^{I^(2)×K},   (A_(2))_r = [1  x_{r,2}  x_{r,2}^4  x_{r,2}^5  x_{r,2}^8  x_{r,2}^{12}]^T.   (41)

From X_(1) we obtain the coupled CPD of {Y^(1)_(1), Y^(2)_(1)} with matrix representation (16). We will now capitalize on the NLA structure of A_(2). Note that the vector (A_(2))_r in A_(2) contains the generators x_{r,2}, x_{r,2}^3 and x_{r,2}^4:

    (A_(2))_r :   1   x   −   −   x^4   x^5   −   −   x^8   −   −   −   x^{12}
    x   :   {1, x}
    x   :   {x^4, x^5}
    x^3 :   {x, x^4}
    x^3 :   {x^5, x^8}
    x^4 :   {1, x^4, x^8, x^{12}}
    x^4 :   {x, x^5}                                        (42)

We consider the following submatrices of A_(2):

    X^(1,1)_(2) = X_(2)([1, 2], :) = A^(1,1)_(2) S^T ∈ C^{2×K},   I^(2)_{1,1} = 2,   (43)
    X^(2,1)_(2) = X_(2)([3, 4], :) = A^(2,1)_(2) S^T ∈ C^{2×K},   I^(2)_{2,1} = 2,   (44)
    X^(1,3)_(2) = X_(2)([2, 3], :) = A^(1,3)_(2) S^T ∈ C^{2×K},   I^(2)_{1,3} = 2,   (45)
    X^(2,3)_(2) = X_(2)([4, 5], :) = A^(2,3)_(2) S^T ∈ C^{2×K},   I^(2)_{2,3} = 2,   (46)
    X^(1,4)_(2) = X_(2)([1, 3, 5, 6], :) = A^(1,4)_(2) S^T ∈ C^{4×K},   I^(2)_{1,4} = 4,   (47)
    X^(2,4)_(2) = X_(2)([2, 4], :) = A^(2,4)_(2) S^T ∈ C^{2×K},   I^(2)_{2,4} = 2,   (48)

where

    A^(1,1)_(2) = [1 ··· 1; x_{1,2} ··· x_{R,2}],   A^(2,1)_(2) = [x_{1,2}^4 ··· x_{R,2}^4; x_{1,2}^5 ··· x_{R,2}^5],
    A^(1,3)_(2) = [x_{1,2} ··· x_{R,2}; x_{1,2}^4 ··· x_{R,2}^4],   A^(2,3)_(2) = [x_{1,2}^5 ··· x_{R,2}^5; x_{1,2}^8 ··· x_{R,2}^8],
    A^(1,4)_(2) = [1 ··· 1; x_{1,2}^4 ··· x_{R,2}^4; x_{1,2}^8 ··· x_{R,2}^8; x_{1,2}^{12} ··· x_{R,2}^{12}],   A^(2,4)_(2) = [x_{1,2} ··· x_{R,2}; x_{1,2}^5 ··· x_{R,2}^5].

From (43)–(48) we obtain the tensors Y^(1)_(2) ∈ C^{2×2×K}, Y^(3)_(2) ∈ C^{2×2×K} and Y^(4)_(2) ∈ C^{4×2×K} for which the associated matrix representations Y^(1)_(2) ∈ C^{4×K}, Y^(3)_(2) ∈ C^{4×K} and Y^(4)_(2) ∈ C^{8×K} admit the following CPDs:

    Y^(1)_(2) = [S_2(X^(1,1)_(2))^T, S_2(X^(2,1)_(2))^T]^T = (A^(1)_(2) ⊙ B^(1)_(2)) S^T,
    Y^(3)_(2) = [S_2(X^(1,3)_(2))^T, S_2(X^(2,3)_(2))^T]^T = (A^(3)_(2) ⊙ B^(3)_(2)) S^T,
    Y^(4)_(2) = [S_2(X^(1,4)_(2))^T, S_2(X^(2,4)_(2))^T]^T = (A^(4)_(2) ⊙ B^(4)_(2)) S^T,

where

    A^(1)_(2) = [overline(A^(1,1)_(2)); overline(A^(2,1)_(2))] = [1 ··· 1; x_{1,2}^4 ··· x_{R,2}^4],   B^(1)_(2) = [1 ··· 1; x_{1,2} ··· x_{R,2}],
    A^(3)_(2) = [overline(A^(1,3)_(2)); overline(A^(2,3)_(2))] = [x_{1,2} ··· x_{R,2}; x_{1,2}^5 ··· x_{R,2}^5],   B^(3)_(2) = [1 ··· 1; x_{1,2}^3 ··· x_{R,2}^3],
    A^(4)_(2) = [overline(A^(1,4)_(2)); overline(A^(2,4)_(2))] = [1 ··· 1; x_{1,2}^4 ··· x_{R,2}^4; x_{1,2}^8 ··· x_{R,2}^8; x_{1,2} ··· x_{R,2}],   B^(4)_(2) = [1 ··· 1; x_{1,2}^4 ··· x_{R,2}^4].

(3) (2),Y (4) (2)} with matrix representation Y(2) =     Y(1)(2) Y(2)(2) Y(3)(2)    =     A(1)(2)" B(1)(2) A(3)(2)" B(3)(2) A(4)(2)" B(4)(2)    ST. (49) Overall, from (16) and (49) we obtain the coupled CPD of{Y(1)(1),Y(1)(2),Y(1)(2),Y(3)(2),Y(4)(2)}. By exploiting the cou-pling between the two NLAs imposed by S, improved

uniqueness conditions are obtained. Indeed, from sub-section III-A we know that the coupled CPD structure of only {Y(1)(1),Y(2)(1)} yields the upper bound R ≤ 5. Likewise, from Theorem II.2 the coupled CPD structure of {Y(1)(2),Y(3)(2),Y(2)(4)} yields the upper bound R ≤ 3. On the other hand, by exploiting the overall structure of {Y(1)(1),Y(2)(1),Y(2)(1),Y(3)(2),Y(4)(2)}, Theorem II.2 relaxes the bound to R≤ 6.

B. Coupled CPD framework for multidimensional NLA

Let us now formalize the reasoning behind the previous example. Consider the decomposition

    X = A S^T ∈ C^{I×K},   (50)

where the matrix A ∈ C^{I×R} is shift-invariance structured while S ∈ C^{K×R} is unstructured. More precisely, let J_(p) ∈ C^{I^(p)×I} denote a row-selection matrix such that A_(p) = J_(p) A ∈ C^{I^(p)×R} consists of I^(p) rows of A. It is assumed that each of the following P decompositions

    X_(p) = J_(p) X = A_(p) S^T ∈ C^{I^(p)×K},   p ∈ {1, . . . , P},   (51)

corresponds to an NLA factorization of the form (7). This in turn means that A_(p) ∈ C^{I^(p)×R} is an NLA matrix with columns (A_(p))_r = [x_{r,p}^{n_{1,p}}  x_{r,p}^{n_{2,p}}  . . .  x_{r,p}^{n_{I^(p),p}}]^T in which x_{r,p} ∈ C and n_{i,p} ∈ Z. We can now proceed as in Section III. Briefly, let Ω_p ⊆ {1, . . . , I^(p)} denote the set of powers of x_{r,p} in the generators associated with the pth NLA matrix A_(p). Relation (17) now becomes

    X^(m,n)_(p) = X_(p)([σ_{m,n,p}(1), . . . , σ_{m,n,p}(I^(p)_{m,n})], :) = A^(m,n)_(p) S^T ∈ C^{I^(p)_{m,n}×K},

where A^(m,n)_(p) = A_(p)([σ_{m,n,p}(1), . . . , σ_{m,n,p}(I^(p)_{m,n})], :), m ∈ {1, . . . , M_{n,p}}, n ∈ Ω_p and the new index 'p' refers to the associated NLA factorization X_(p). Likewise, relation (18) extends to Y^(m,n)_(p) = S_2(X^(m,n)_(p)) = (overline(A^(m,n)_(p)) ⊙ B^(n)_(p)) S^T, where B^(n)_(p) = [1 ··· 1; x_{1,p}^n ··· x_{R,p}^n]. Let us index the elements in Ω_p by ω_p(1), . . . , ω_p(card(Ω_p)). We now obtain the augmented version of (23):

    Y_tot = [Y_tot,1; . . . ; Y_tot,P],   Y_tot,p = [Y^(ω_p(1))_(p); . . . ; Y^(ω_p(card(Ω_p)))_(p)],   Y^(n)_(p) = [Y^(1,n)_(p); . . . ; Y^(M_{n,p},n)_(p)],

in which Y^(ω_p(n))_(p) = (A^(ω_p(n))_(p) ⊙ B^(ω_p(n))_(p)) S^T and A^(n)_(p) = [overline(A^(1,n)_(p))^T . . . overline(A^(M_{n,p},n)_(p))^T]^T. In the same way, the augmented version of (24) is given by

    G_tot = [G^(Ω_1,1)T . . . G^(Ω_P,P)T]^T,   (52)

where the index 'p' indicates that the matrix G^(Ω_p,p), which is of the form (24), is built from the factor matrices {A^(ω_p(n))_(p), B^(ω_p(n))_(p)}_{n=1}^{card(Ω_p)} associated with the pth NLA.

Proposition V.1 below summarizes the extension of Proposition III.1 to the multidimensional case.


Proposition V.1. Consider the multidimensional NLA factorization of X = A S^T ∈ C^{I×K} in (50). If

    S in (50) has full column rank,
    G_tot in (52) has full column rank,
    ∀p ∈ {1, . . . , P}, ∃ i, j ∈ Ω_p : GCD(n_{i,p}, n_{j,p}) = 1,   (53)

then the multidimensional NLA constrained rank of X is R and the multidimensional NLA decomposition of X is unique.

Furthermore, if the conditions (53) in Proposition V.1 are satisfied, then Algorithm 1 can easily be extended to the multidimensional case. We summarize the extension as Algorithm 2.

Algorithm 2 SD method for multidimensional MI-ESPRIT (Proposition V.1).
Input: X = A S^T of the form (50).
1. Compute SVD X = UΣV^H (and find R from Σ).
2. Choose row-selection matrices J_(1), . . . , J_(P).
3. Extract U_(p) = J_(p) UΣ for every p ∈ {1, . . . , P}.
4. Determine the subsets Ω_p ⊆ {1, . . . , I^(p)}, ∀p ∈ {1, . . . , P}.
5. Extract {U^(m,n)_(p)}_{m∈{1,...,M_{n,p}}, n∈Ω_p} of the form (33), ∀p ∈ {1, . . . , P}.
6. Build Y^(n)_red,p of the form (34), ∀n ∈ Ω_p, ∀p ∈ {1, . . . , P}.
7. Build {M^(r)} in (36) from {Y^(n)_red,p}_{n∈Ω_p, p∈{1,...,P}} [25], [22].
8. Solve SD problem M^(r) = G Λ^(r) G^T, r ∈ {1, . . . , R}.
9. Compute A_(p) = J_(p) UΣG^T, ∀p ∈ {1, . . . , P} and S = VG.
10. Obtain {x_{r,p}} via A_(p) (e.g. via Proposition IV.1).
Output: {x_{r,p}}, {A_(p)} and S.

VI. Numerical experiments

Consider the factorization X = A S^T in (7) or (50). The goal is to estimate the angles {θ_r} or {θ_r, φ_r} from T = X + βN, where N is an unstructured perturbation matrix and β ∈ R controls the signal-to-noise level. We will consider Monte Carlo simulations, where for each trial the real and imaginary entries of S and N are randomly drawn from a Gaussian distribution with zero mean and unit variance. The following Signal-to-Noise Ratio (SNR) measure will be used: SNR [dB] = 10 log(‖X‖_F^2 / β^2‖N‖_F^2).
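The Monte Carlo setup is easy to reproduce: draw S and N with unit-variance real and imaginary parts, then scale the noise to a desired SNR. The sketch below shows one way to do this; add_noise is a hypothetical helper, and note that the paper sweeps β (reporting the resulting SNR) rather than fixing a target SNR.

```python
import numpy as np

def add_noise(X, snr_db, rng):
    """Return T = X + beta*N with beta chosen so that 10*log10(||X||_F^2 / ||beta*N||_F^2) = snr_db."""
    N = rng.standard_normal(X.shape) + 1j * rng.standard_normal(X.shape)
    beta = np.linalg.norm(X) / (np.linalg.norm(N) * 10 ** (snr_db / 20))
    return X + beta * N

rng = np.random.default_rng(7)
A = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))   # placeholder steering matrix
S = rng.standard_normal((50, 3)) + 1j * rng.standard_normal((50, 3))
X = A @ S.T
T = add_noise(X, snr_db=20.0, rng=rng)
achieved = 20 * np.log10(np.linalg.norm(X) / np.linalg.norm(T - X))
print(round(achieved, 2))    # 20.0
```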

A. One-dimensional (1D) NLA

Consider the NLA problem with I = 8, K = 50, R = 3 and NLA response matrix A with columns of the form (31). We compute the NLA factorization of (31) via Algorithm 1 in which the generators x_r, x_r^2, x_r^3 and x_r^{10} are used and the associated subvectors of (31) are [1  x_r  x_r^2]^T, [1  x_r^2 | x_r^{21}  x_r^{23}]^T, [x_r^{15}  x_r^{18}  x_r^{21}]^T and [x_r  x_r^{11}  x_r^{21}]^T, respectively. Note that methods that only exploit one shift-invariance structure within the NLA require that R ≤ 2 and hence cannot be used. In order to demonstrate that the multiresolution property of A can lead to improved performance, we compare the I = 8 element NLA with an I = 8 (number of elements in the NLA) element ULA and an I = 23 (aperture of the NLA) element ULA. The algebraic ESPRIT method [18] will be used for DOA estimation in the ULA cases. The SD

problem in step 8 of Algorithm 1 will numerically be solved by GEVD (e.g. [11]) and by extended QZ iteration [31] using two or all R = 3 matrices in (36), respectively. The methods will be referred to as GEVD and SD-QZ, respectively.

We set θ = [θ_1, θ_2, θ_3]^T = (π/180)·[10, 40, 45]^T. We compute the root mean square error E_RMS = sqrt( (1/(R·T_MC)) Σ_{t=1}^{T_MC} ‖θ − θ̂_t‖_F^2 ), where T_MC and θ̂_t denote the number of Monte Carlo trials and the associated trial estimate of θ, respectively. The root mean square of the deterministic Cramér-Rao Bound (CRB) [27], [37] will also be computed. The E_RMS and CRB values over T_MC = 200 trials as a function of SNR can be seen in figure 1(a). We observe that the NLA performs significantly better than the I = 8 element ULA and almost as well as the I = 23 element ULA. We also observe that SD-QZ performs better than SD-GEVD. This was also expected since the former approach exploits all matrices {M^(r)} in (36). For this reason, only the extended QZ version of Algorithm 1 will be considered in the following experiments.

In order to illustrate that SD techniques can cheaply provide a good initialization for optimization-based methods, Algorithm 1 will be applied in combination with a conditional Maximum Likelihood (ML) method. (The ML method turned out to be sensitive w.r.t. its initialization for this NLA experiment. For this reason a randomly initialized version of ML is not retained.) The overall procedure will be referred to as SD-ML. We also consider the MUSIC method with a discretization

step π/36000 ≈ 8.7e−5 on the interval [0, π]. Note that the parameters θ are located on the grid, favoring the MUSIC method. The dense sampling grid of the MUSIC method can make it expensive (about three times more costly than SD-ML in our experiment). (Note also that the SD method could be used to narrow the search range of a grid-based method, such as MUSIC.) See [32] for details and references about the ML and MUSIC methods. The E_RMS and CRB values over M = 200 trials as a function of SNR can be seen in figure 1(b). Above 5 dB SNR, the SD-ML and MUSIC methods perform about the same. The algebraic SD Algorithm 1 yields less reliable DOA estimates below 5 dB SNR. For this reason the SD-ML method did not perform well below 5 dB SNR either since it requires a proper initialization.

B. L-shaped two-dimensional (2D) NLA

Consider the L-shaped 2D NLA in subsection V-A with I^(1) = 7, I^(2) = 6, K = 50 and R = 3. Consider also an (I^(1), I^(2)) = (7, 6) (number of elements in the L-shaped 2D NLA) element 2D ULA and an (I^(1), I^(2)) = (9, 13) (aperture of the L-shaped 2D NLA) element 2D ULA. We use Algorithm 2 and the SD algorithm in [22] for DOA estimation in the 2D NLA and 2D ULA cases, respectively. We set θ = [θ_1, θ_2, θ_3]^T = (π/180)·[10, 40, 45]^T and φ = [φ_1, φ_2, φ_3]^T = (π/180)·[−70, −60, 60]^T. The E_RMS = sqrt( (1/(2R·T_MC)) Σ_{t=1}^{T_MC}
