On a variational formulation of the QSVD and the RSVD
Delin Chu a,∗, Bart De Moor b,1
a Department of Mathematics, National University of Singapore, Lower Kent Ridge Road, Singapore 119260, Singapore
b Department of Electrical Engineering (ESAT), Research Group SISTA, Katholieke Universiteit Leuven, Kardinaal Mercierlaan 94, B-3001 Leuven, Belgium
Received 16 December 1998; accepted 7 February 2000. Submitted by L. Elsner.
Abstract
Recently, M.T. Chu, R.F. Funderlic and G.H. Golub [SIAM J. Matrix Anal. Appl. 18 (1997) 1082–1092] presented a variational formulation for the quotient singular value decomposition (QSVD) of two matrices A ∈ R^{n×m}, C ∈ R^{p×m}, which is a generalization to two matrices of the ordinary singular value decomposition (SVD) and characterizes the role of the two orthogonal matrices in the QSVD. In this paper, we give an alternative derivation of this variational formulation and extend it to establish an analogous variational formulation for the restricted singular value decomposition (RSVD) of a matrix triplet A ∈ R^{n×m}, B ∈ R^{n×l}, C ∈ R^{p×m}, which provides a new understanding of the orthogonal matrices appearing in this decomposition. © 2000 Elsevier Science Inc. All rights reserved.

This work is supported by several institutions:
1. The Flemish Government:
(a) Concerted Research Action GOA-MIPS (Model-based Information Processing Systems).
(b) The FWO (Fund for Scientific Research — Flanders) project G.0292.95: Matrix algorithms and differential geometry for adaptive signal processing, system identification and control.
(c) The FWO project G.0256.97: Numerical Algorithms for Subspace System Identification, extension to special cases.
(d) The FWO Research Communities: ICCoS (Identification and Control of Complex Systems) and Advanced Numerical Methods for Mathematical Modelling.
2. The Belgian State, Prime Minister's Office — Federal Office for Scientific, Technical and Cultural Affairs: Interuniversity Poles of Attraction Programme (IUAP P4-02 (1997–2001): Modeling, Identification, Simulation and Control of Complex Systems; and IUAP P4-24 (1997–2001): Intelligent Mechatronic Systems (IMechS)).
Delin Chu was a Visiting Research Fellow with the K.U. Leuven during the writing of this paper. Bart De Moor is a Senior Research Associate with the F.W.O. and an Associate Professor with the K.U. Leuven. The scientific responsibility is assumed by the authors.
∗ Corresponding author. Fax: +65-7795452.
E-mail addresses: matchudl@math.nus.edu.sg (D. Chu), Bart.Demoor@esat.kuleuven.ac.be (B. De Moor).
1 Tel.: +32-16-32-1970; fax: +32-16-32-1709.
AMS classification: 65F15; 65H15
Keywords: SVD; QSVD; RSVD; Generalized singular value; Variational formulation; Stationary value; Stationary point
1. Introduction
The ordinary singular value decomposition (OSVD) of a given matrix A ∈ R^{n×m} is

$$
U^{T} A V = \begin{pmatrix} R & 0 \\ 0 & 0 \end{pmatrix}, \tag{1}
$$

where the column blocks have widths $r_a$ and $m - r_a$ and the row blocks have heights $r_a$ and $n - r_a$, $U \in \mathbb{R}^{n\times n}$ and $V \in \mathbb{R}^{m\times m}$ are orthogonal matrices partitioned as

$$
U = \begin{pmatrix} U_1 & U_2 \end{pmatrix}, \qquad V = \begin{pmatrix} V_1 & V_2 \end{pmatrix},
$$

with $U_1 \in \mathbb{R}^{n\times r_a}$, $V_1 \in \mathbb{R}^{m\times r_a}$, and

$$
R = \operatorname{diag}\{\sigma_1, \dots, \sigma_{r_a}\}, \qquad \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{r_a} > 0, \qquad r_a = \operatorname{rank}(A).
$$

The $\sigma_1, \dots, \sigma_{r_a}$ are the non-trivial singular values of A, and the columns of $U_1$ and $V_1$ are, respectively, the non-trivial left and right singular vectors of A. In this paper, $\|\cdot\|$ denotes the two-norm of a vector. The following theorem is well known [4].
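As a quick numerical illustration of the decomposition (1) and the notation $U_1$, $V_1$, $R$ (this sketch is ours, not part of the paper; the rank tolerance `1e-10` is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
# build a 5x4 matrix of rank at most 3
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))

U, s, Vt = np.linalg.svd(A)
r_a = int(np.sum(s > 1e-10 * s[0]))   # numerical rank r_a = rank(A)
U1, V1 = U[:, :r_a], Vt[:r_a].T       # non-trivial left/right singular vectors
R = np.diag(s[:r_a])                  # R = diag{sigma_1, ..., sigma_{r_a}}

# the compact form of (1): A = U1 R V1^T
assert np.allclose(A, U1 @ R @ V1.T)
```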
Theorem 1. Given A ∈ R^{n×m} with OSVD (1).
(a) Consider the optimization problem

$$
\max_{\substack{x \in \mathbb{R}^m,\ y \in \mathbb{R}^n \\ y = Ax,\ y \neq 0}} \frac{\|y\|}{\|x\|}. \tag{2}
$$

Then the non-trivial singular values $\sigma_1, \dots, \sigma_{r_a}$ of A are precisely the stationary values, i.e., the functional evaluations at the stationary points, of (2). Let the stationary points in (2) corresponding to the stationary values $\sigma_1, \dots, \sigma_{r_a}$ be $\binom{x_1}{y_1}, \dots, \binom{x_{r_a}}{y_{r_a}}$. Then

$$
V_1 = \begin{bmatrix} \dfrac{x_1}{\|x_1\|} & \cdots & \dfrac{x_{r_a}}{\|x_{r_a}\|} \end{bmatrix}.
$$

Moreover, if $n = m = r_a$, then

$$
U_1 = \begin{bmatrix} \dfrac{y_1}{\|y_1\|} & \cdots & \dfrac{y_{r_a}}{\|y_{r_a}\|} \end{bmatrix}.
$$

(b) Consider the dual optimization problem

$$
\max_{\substack{x \in \mathbb{R}^n,\ y \in \mathbb{R}^m \\ y^T = x^T A,\ y \neq 0}} \frac{\|y\|}{\|x\|}. \tag{3}
$$

Then the non-trivial singular values $\sigma_1, \dots, \sigma_{r_a}$ of A are precisely the stationary values of (3). Let the stationary points in (3) corresponding to the stationary values $\sigma_1, \dots, \sigma_{r_a}$ be $\binom{x_1}{y_1}, \dots, \binom{x_{r_a}}{y_{r_a}}$. Then

$$
U_1 = \begin{bmatrix} \dfrac{x_1}{\|x_1\|} & \cdots & \dfrac{x_{r_a}}{\|x_{r_a}\|} \end{bmatrix}.
$$

Moreover, if $n = m = r_a$, then

$$
V_1 = \begin{bmatrix} \dfrac{y_1}{\|y_1\|} & \cdots & \dfrac{y_{r_a}}{\|y_{r_a}\|} \end{bmatrix}.
$$
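Part (a) of Theorem 1 can be checked numerically (our own sanity check, assuming NumPy): at the $i$-th stationary point of (2), $x$ is the right singular vector $v_i$, the constraint forces $y = Av_i$, and the quotient equals $\sigma_i$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
U, s, Vt = np.linalg.svd(A)

# at the i-th stationary point of (2): x = v_i, y = A v_i, ||y||/||x|| = sigma_i
for i, sigma in enumerate(s):
    x = Vt[i]        # i-th right singular vector
    y = A @ x        # constraint y = Ax of problem (2)
    assert np.isclose(np.linalg.norm(y) / np.linalg.norm(x), sigma)
```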
Recently, Theorem 1 was generalized in [1] to the quotient singular value decomposition (QSVD) [3,5–11,13,14] of two matrices A ∈ R^{n×m}, C ∈ R^{p×m}. It is natural to ask whether the result in [1] extends to the restricted singular value decomposition (RSVD) [8–11,15,16] of a matrix triplet. It turns out that it is not trivial to obtain an analogous variational formulation for the RSVD via the approach in [1]; this motivates us to re-derive the result in [1] via a different approach.
The purpose of this paper is twofold. Firstly, we present an alternative derivation of the variational formulation in [1] directly based on the QSVD of two matrices A, C. Then we extend this result to the RSVD [8–11,15,16] of a matrix triplet and obtain an analogous variational formulation that provides new understanding of the orthogonal matrices appearing in this decomposition.
Our approach is quite different from that in [1]. We will show in Section 2 that an orthogonal reduction can be applied to the matrices A ∈ R^{n×m} and C ∈ R^{p×m} to obtain lower dimensional non-singular matrices $A_{22}$ and $C_{22}$ such that the non-trivial quotient singular values of the pair (A, C) are exactly the standard singular values of $A_{22}C_{22}^{-1}$, and there is a close relationship between the two orthogonal matrices in the QSVD of (A, C) and the (left and right) singular vectors of $A_{22}C_{22}^{-1}$ (see Lemma 2 and Theorem 3). Thus, the non-trivial quotient singular values of (A, C) can be characterized by applying Theorem 1 to the matrix $A_{22}C_{22}^{-1}$. Hence, Theorem 1 can be used as a bridge to re-derive the variational formulation in [1] for the QSVD of (A, C). Moreover, the same idea also works for the RSVD (see Lemma 5 and Theorem 6), so it offers a springboard to the variational formulation of the RSVD.
In order to prove our main results, we will establish two condensed forms based
on orthogonal matrix transformations. The QSVD of two matrices and the RSVD of
matrix triplets can be obtained and the variational formulation for QSVD and RSVD
can be proved directly based on these two condensed forms.
In this paper, we use the following notation:
• S ∞ (M) denotes a matrix with orthogonal columns spanning the right nullspace of a matrix M.
• T ∞ (M) denotes a matrix with orthogonal columns spanning the right nullspace of a matrix M T .
• $M^\perp$ denotes the orthogonal complement of the space spanned by the columns of a matrix M.
• For a matrix M, $T_\infty^T(M)$ and $T_\infty^\perp(M)$ denote $(T_\infty(M))^T$ and $(T_\infty(M))^\perp$, respectively.
• Unless noted, we do not distinguish between a matrix with orthogonal columns and the space spanned by its columns.
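For concreteness, the nullspace bases $S_\infty(\cdot)$ and $T_\infty(\cdot)$ can be realized numerically via the SVD; this helper is our illustration (the tolerance-based rank test is an assumption):

```python
import numpy as np

def S_inf(M, tol=1e-10):
    """Matrix with orthonormal columns spanning the right nullspace of M."""
    _, s, Vt = np.linalg.svd(M)
    r = int(np.sum(s > tol * max(s[0], 1.0))) if s.size else 0
    return Vt[r:].T

def T_inf(M, tol=1e-10):
    """Matrix with orthonormal columns spanning the right nullspace of M^T."""
    return S_inf(M.T, tol)

M = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])   # rank 1
N = S_inf(M)                      # 3 x 2 orthonormal basis of the nullspace
assert np.allclose(M @ N, 0) and N.shape == (3, 2)
assert np.allclose(M.T @ T_inf(M), 0)
```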
We also use the following notation for any given matrices A, B, C with compatible sizes: denote
$$
r_a = \operatorname{rank}(A), \qquad r_b = \operatorname{rank}(B), \qquad r_c = \operatorname{rank}(C),
$$
$$
r_{ab} = \operatorname{rank}\begin{pmatrix} A & B \end{pmatrix}, \qquad
\bar r_{ac} = \operatorname{rank}\begin{pmatrix} A \\ C \end{pmatrix}, \qquad
r_{abc} = \operatorname{rank}\begin{pmatrix} A & B \\ C & 0 \end{pmatrix},
$$
$$
k_1 = r_{abc} - r_b - r_c, \qquad k_2 = r_{ab} + r_c - r_{abc}, \qquad
k_3 = \bar r_{ac} + r_b - r_{abc}, \qquad k_4 = r_a + r_{abc} - r_{ab} - \bar r_{ac}.
$$
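These four quantities always satisfy $k_1 + k_2 + k_3 + k_4 = r_a$ (substitute the definitions and the ranks cancel). The identity can be checked numerically; the following sketch with random data is ours, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, l, p = 6, 5, 3, 4
A = rng.standard_normal((n, m))
B = rng.standard_normal((n, l))
C = rng.standard_normal((p, m))

rank = np.linalg.matrix_rank
r_a, r_b, r_c = rank(A), rank(B), rank(C)
r_ab = rank(np.hstack([A, B]))                       # rank [A  B]
r_ac = rank(np.vstack([A, C]))                       # rank [A; C]
r_abc = rank(np.block([[A, B], [C, np.zeros((p, l))]]))

k1 = r_abc - r_b - r_c
k2 = r_ab + r_c - r_abc
k3 = r_ac + r_b - r_abc
k4 = r_a + r_abc - r_ab - r_ac

# the four k's partition rank(A): k1 + k2 + k3 + k4 = r_a
assert k1 + k2 + k3 + k4 == r_a
assert min(k1, k2, k3, k4) >= 0
```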
2. A variational formulation for QSVD
Several generalizations of the OSVD have been proposed and analyzed. One that is well known is the generalized SVD introduced by Paige and Saunders in [5], which De Moor and Golub [11] proposed to rename the QSVD. Another is the RSVD, introduced in its explicit form by Zha [16] and further developed and discussed by De Moor and Golub [8].
In this section, we will give an alternative proof of the variational formulation for the QSVD of [1], based directly on the QSVD itself. Firstly, we present a condensed form from which the QSVD of two matrices can be derived.
Lemma 2. Given matrices A ∈ R^{n×m}, C ∈ R^{p×m}, there exist three orthogonal matrices $U_a \in \mathbb{R}^{n\times n}$, $W \in \mathbb{R}^{m\times m}$, $V_c \in \mathbb{R}^{p\times p}$ such that

$$
U_a^T A W = \begin{pmatrix}
A_{11} & A_{12} & 0 & 0 \\
0 & A_{22} & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}, \tag{4}
$$

with column block widths $\bar r_{ac} - r_c$, $r_a + r_c - \bar r_{ac}$, $\bar r_{ac} - r_a$, $m - \bar r_{ac}$ and row block heights $\bar r_{ac} - r_c$, $r_a + r_c - \bar r_{ac}$, $n - r_a$, and

$$
V_c^T C W = \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & C_{22} & 0 & 0 \\
C_{31} & C_{32} & C_{33} & 0
\end{pmatrix},
$$

with the same column partitioning and row block heights $p - r_c$, $r_a + r_c - \bar r_{ac}$, $\bar r_{ac} - r_a$, where $A_{11}$, $A_{22}$, $C_{22}$ and $C_{33}$ are non-singular.
Proof. See Appendix A. □

Let the SVD of $A_{22}C_{22}^{-1}$ be

$$
U_{22}^T A_{22} C_{22}^{-1} V_{22} = \operatorname{diag}\{\sigma_1, \dots, \sigma_s\} =: S_{AC}, \qquad s = r_a + r_c - \bar r_{ac}, \tag{5}
$$

where $U_{22}$, $V_{22}$ are orthogonal matrices and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_s > 0$. Define

$$
U := U_a \operatorname{diag}\{I_{\bar r_{ac}-r_c},\, U_{22},\, I_{n-r_a}\}, \tag{6}
$$
$$
V := V_c \operatorname{diag}\{I_{p-r_c},\, V_{22},\, I_{\bar r_{ac}-r_a}\}, \tag{7}
$$
$$
X = W
\begin{pmatrix}
I & 0 & 0 & 0 \\
0 & I & 0 & 0 \\
-C_{33}^{-1}C_{31} & -C_{33}^{-1}C_{32} & C_{33}^{-1} & 0 \\
0 & 0 & 0 & I
\end{pmatrix}
\begin{pmatrix}
A_{11}^{-1} & -A_{11}^{-1}A_{12}C_{22}^{-1}V_{22} & 0 & 0 \\
0 & C_{22}^{-1}V_{22} & 0 & 0 \\
0 & 0 & I & 0 \\
0 & 0 & 0 & I
\end{pmatrix}. \tag{8}
$$
Then, as a direct consequence of the condensed form (4), we have the following well-known QSVD theorem.
Theorem 3 (QSVD theorem). Let A ∈ R^{n×m}, C ∈ R^{p×m}. There exist orthogonal matrices U ∈ R^{n×n}, V ∈ R^{p×p} and a non-singular matrix X such that

$$
U^T A X = \begin{pmatrix}
I & 0 & 0 & 0 \\
0 & S_{AC} & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}, \qquad
V^T C X = \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & I & 0 & 0 \\
0 & 0 & I & 0
\end{pmatrix}, \tag{9}
$$

where both matrices have column block widths $\bar r_{ac} - r_c$, $r_a + r_c - \bar r_{ac}$, $\bar r_{ac} - r_a$, $m - \bar r_{ac}$; the row block heights are $\bar r_{ac} - r_c$, $r_a + r_c - \bar r_{ac}$, $n - r_a$ for $U^TAX$ and $p - r_c$, $r_a + r_c - \bar r_{ac}$, $\bar r_{ac} - r_a$ for $V^TCX$. Here $S_{AC}$ is of the form (5), and U, V and X can be chosen to be given by (6)–(8), respectively. The $\sigma_i$, $i = 1, \dots, s$, are defined to be the non-trivial quotient singular values of the matrix pair (A, C).
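When C is square and non-singular (so the trivial blocks in Theorem 3 are absent), the quotient singular values are exactly the singular values of $AC^{-1}$; equivalently, they solve the generalized eigenproblem $A^TAx = \sigma^2 C^TCx$. A numerical sketch of this equivalence (our illustration, assuming NumPy and SciPy):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(2)
n, m = 5, 4
A = rng.standard_normal((n, m))
C = rng.standard_normal((m, m))   # square, almost surely non-singular

# quotient singular values of (A, C) = singular values of A C^{-1}
sigma = np.linalg.svd(A @ np.linalg.inv(C), compute_uv=False)

# equivalent generalized eigenproblem: A^T A x = sigma^2 C^T C x
evals = eigh(A.T @ A, C.T @ C, eigvals_only=True)
sigma_ge = np.sqrt(np.sort(evals)[::-1])

assert np.allclose(sigma, sigma_ge)
```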
According to the uniqueness theorem in [16], we only need to characterize the matrices U, V given by (6) and (7) in order to characterize the role of the orthogonal matrices in the QSVD. Let U, V be given by (6) and (7) and partition these two orthogonal matrices as
$$
U = \begin{pmatrix} U_1 & U_2 & U_3 \end{pmatrix}, \tag{10}
$$

with column block widths $\bar r_{ac} - r_c$, $r_a + r_c - \bar r_{ac}$, $n - r_a$, and

$$
V = \begin{pmatrix} V_1 & V_2 & V_3 \end{pmatrix}, \tag{11}
$$

with column block widths $p - r_c$, $r_a + r_c - \bar r_{ac}$, $\bar r_{ac} - r_a$. Then, from Lemma 2 we have

$$
U_3 = T_\infty(A), \qquad U_1 = T_\infty^\perp(A S_\infty(C)), \tag{12}
$$
$$
V_1 = T_\infty(C), \qquad V_3 = T_\infty^\perp(C S_\infty(A)). \tag{13}
$$

Hence, in order to characterize the role of the orthogonal matrices U, V in the QSVD, it suffices to characterize the role of $U_2$ and $V_2$.
The following variational formulation has been established in [1] to characterize U 2 and V 2 .
Theorem 4. Given A ∈ R^{n×m}, C ∈ R^{p×m}. Consider the optimization problem

$$
\max_{\substack{x \in \mathbb{R}^n,\ y \in \mathbb{R}^p,\ x \neq 0 \\[2pt]
\begin{pmatrix} A^T & C^T \\ T_\infty^T(A) & 0 \\ 0 & T_\infty^T(C) \end{pmatrix}
\begin{pmatrix} x \\ -y \end{pmatrix} = 0}}
\frac{\|y\|}{\|x\|}. \tag{14}
$$

Then the non-trivial quotient singular values $\sigma_1, \dots, \sigma_s$ of the matrix pair (A, C) are precisely the stationary values of problem (14). Furthermore, let $\binom{x_1}{-y_1}, \dots, \binom{x_s}{-y_s}$ be the stationary points of problem (14) with corresponding stationary values $\sigma_1, \dots, \sigma_s$. Then

$$
U_2 = \begin{bmatrix} \dfrac{x_1}{\|x_1\|} & \cdots & \dfrac{x_s}{\|x_s\|} \end{bmatrix}, \qquad
V_2 = \begin{bmatrix} \dfrac{y_1}{\|y_1\|} & \cdots & \dfrac{y_s}{\|y_s\|} \end{bmatrix}.
$$
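The variational problem (14) can be illustrated numerically (our own check, under the simplifying assumption that $n = m = p$ and A, C are non-singular, so $T_\infty(A)$ and $T_\infty(C)$ are empty and the nullspace rows of (14) drop out): at a stationary point, $x$ is a left singular vector of $AC^{-1}$ and $y$ is determined by the constraint $x^TA = y^TC$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))

M = A @ np.linalg.inv(C)
U, s, Vt = np.linalg.svd(M)

# at each stationary point: x = u_i, y solves x^T A = y^T C, ||y||/||x|| = sigma_i
for i, sigma in enumerate(s):
    x = U[:, i]
    y = np.linalg.solve(C.T, A.T @ x)   # y = C^{-T} A^T x, i.e. x^T A = y^T C
    assert np.isclose(np.linalg.norm(y) / np.linalg.norm(x), sigma)
```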
Proof. We prove Theorem 4 via the following three arguments.
• Argument 1. Firstly, we characterize the orthogonal matrices $U_{22}$, $V_{22}$ in (5). Consider the optimization problem

$$
\max_{\substack{x_2,\, y_2 \in \mathbb{R}^{r_a+r_c-\bar r_{ac}} \\ x_2^T A_{22} = y_2^T C_{22},\ x_2 \neq 0}}
\frac{\|y_2\|}{\|x_2\|}. \tag{15}
$$

Since $A_{22}$, $C_{22}$ are both non-singular, by Theorem 1 the $\sigma_1, \dots, \sigma_s$, i.e., the singular values of the matrix $A_{22}C_{22}^{-1}$, are precisely the stationary values of problem (15), and, if

$$
\binom{x_2^1}{-y_2^1}, \dots, \binom{x_2^s}{-y_2^s}
$$

are the stationary points of problem (15) with corresponding stationary values $\sigma_1, \dots, \sigma_s$, then

$$
U_{22} = \begin{bmatrix} \dfrac{x_2^1}{\|x_2^1\|} & \cdots & \dfrac{x_2^s}{\|x_2^s\|} \end{bmatrix}, \qquad
V_{22} = \begin{bmatrix} \dfrac{y_2^1}{\|y_2^1\|} & \cdots & \dfrac{y_2^s}{\|y_2^s\|} \end{bmatrix}.
$$
• Argument 2. Secondly, define

$$
\mathcal{F} = \left\{ \binom{x}{-y} \,\middle|\, x \in \mathbb{R}^n,\ y \in \mathbb{R}^p,\
U_a^T x = \begin{pmatrix} 0 \\ x_2 \\ 0 \end{pmatrix},\
V_c^T y = \begin{pmatrix} 0 \\ y_2 \\ 0 \end{pmatrix},\
x^T A = y^T C,\ x \neq 0 \right\},
$$

where the blocks of $U_a^T x$ have heights $\bar r_{ac} - r_c$, $r_a + r_c - \bar r_{ac}$, $n - r_a$ and those of $V_c^T y$ have heights $p - r_c$, $r_a + r_c - \bar r_{ac}$, $\bar r_{ac} - r_a$. Consider the optimization problem

$$
\max_{\binom{x}{-y} \in \mathcal{F}} \frac{\|y\|}{\|x\|}. \tag{16}
$$

Obviously, $\binom{x}{-y}$ is a stationary point of problem (16) with stationary value $\sigma$ if and only if $\binom{x_2}{-y_2}$ is a stationary point of problem (15) with the same stationary value $\sigma$, where $x_2$ and $y_2$ are the middle blocks of $U_a^T x$ and $V_c^T y$ above.
• Argument 3. Finally, for any $x \in \mathbb{R}^n$, $y \in \mathbb{R}^p$, partition

$$
U_a^T x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \qquad
V_c^T y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix},
$$

with block heights $\bar r_{ac} - r_c$, $r_a + r_c - \bar r_{ac}$, $n - r_a$ and $p - r_c$, $r_a + r_c - \bar r_{ac}$, $\bar r_{ac} - r_a$, respectively. Since

$$
U_a^T T_\infty(A) = \begin{pmatrix} 0 \\ 0 \\ I \end{pmatrix}, \qquad
V_c^T T_\infty(C) = \begin{pmatrix} I \\ 0 \\ 0 \end{pmatrix},
$$

it is easy to see that $\binom{x}{-y} \in \mathcal{F}$ if and only if

$$
\begin{pmatrix} A^T & C^T \\ T_\infty^T(A) & 0 \\ 0 & T_\infty^T(C) \end{pmatrix}
\begin{pmatrix} x \\ -y \end{pmatrix} = 0, \qquad x \neq 0.
$$

Note that

$$
U_a^T U_2 = \begin{pmatrix} 0 \\ U_{22} \\ 0 \end{pmatrix}, \qquad
V_c^T V_2 = \begin{pmatrix} 0 \\ V_{22} \\ 0 \end{pmatrix};
$$

thus, Theorem 4 follows directly from Arguments 1–3. □
3. A variational formulation for RSVD
In Section 2, we derived the QSVD of two matrices A, C based on the condensed form (4). Now we will establish the RSVD of a matrix triplet (A, B, C) via an analogous condensed form.
Lemma 5. Given A ∈ R^{n×m}, B ∈ R^{n×l}, C ∈ R^{p×m}, there exist orthogonal matrices P ∈ R^{n×n}, Q ∈ R^{m×m}, $U_b$ ∈ R^{l×l}, $V_c$ ∈ R^{p×p} such that

$$
P A Q = \begin{pmatrix}
A_{11} & A_{12} & 0 & 0 & 0 & 0 \\
0 & A_{22} & 0 & 0 & 0 & 0 \\
A_{31} & A_{32} & A_{33} & A_{34} & 0 & 0 \\
A_{41} & A_{42} & 0 & A_{44} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix},
$$

with column block widths $k_1$, $k_2$, $k_3$, $k_4$, $\bar r_{ac} - r_a$, $m - \bar r_{ac}$ and row block heights $k_1$, $k_2$, $k_3$, $k_4$, $r_{ab} - r_a$, $n - r_{ab}$,

$$
P B U_b = \begin{pmatrix}
0 & 0 & 0 & B_{14} \\
0 & 0 & 0 & B_{24} \\
0 & B_{32} & B_{33} & B_{34} \\
0 & 0 & B_{43} & B_{44} \\
0 & 0 & 0 & B_{54} \\
0 & 0 & 0 & 0
\end{pmatrix}, \tag{17}
$$

with column block widths $l - r_b$, $k_3$, $k_4$, $r_{ab} - r_a$ and the same row partitioning as $PAQ$, and

$$
V_c^T C Q = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & C_{22} & 0 & 0 & 0 & 0 \\
C_{31} & C_{32} & 0 & C_{34} & 0 & 0 \\
C_{41} & C_{42} & C_{43} & C_{44} & C_{45} & 0
\end{pmatrix},
$$

with the same column partitioning as $PAQ$ and row block heights $p - r_c$, $k_2$, $k_4$, $\bar r_{ac} - r_a$, where $A_{11}$, $A_{22}$, $A_{33}$, $A_{44}$, $B_{32}$, $B_{43}$, $B_{54}$, $C_{22}$, $C_{34}$ and $C_{45}$ are non-singular.
Proof. See Appendix B. □

Let the SVD of $B_{43}^{-1} A_{44} C_{34}^{-1}$ be

$$
U_{44}^T B_{43}^{-1} A_{44} C_{34}^{-1} V_{44} = \operatorname{diag}\{\sigma_1, \dots, \sigma_{k_4}\} =: S_{ABC}, \tag{18}
$$

where $U_{44}$, $V_{44}$ are orthogonal matrices and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{k_4} > 0$. Define

$$
U := U_b \operatorname{diag}\{I_{l-r_b},\, I_{k_3},\, U_{44},\, I_{r_{ab}-r_a}\}, \tag{19}
$$
$$
V := V_c \operatorname{diag}\{I_{p-r_c},\, I_{k_2},\, V_{44},\, I_{\bar r_{ac}-r_a}\}. \tag{20}
$$
Similarly to Theorem 3, we have the following theorem directly from Lemma 5.
Theorem 6 (RSVD theorem). Given A ∈ R^{n×m}, B ∈ R^{n×l}, C ∈ R^{p×m}. Then there exist non-singular matrices X ∈ R^{n×n}, Y ∈ R^{m×m} and orthogonal matrices U ∈ R^{l×l}, V ∈ R^{p×p} such that

$$
X^T A Y = \begin{pmatrix}
I & 0 & 0 & 0 & 0 & 0 \\
0 & I & 0 & 0 & 0 & 0 \\
0 & 0 & I & 0 & 0 & 0 \\
0 & 0 & 0 & S_{ABC} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}, \qquad
X^T B U = \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & I & 0 & 0 \\
0 & 0 & I & 0 \\
0 & 0 & 0 & I \\
0 & 0 & 0 & 0
\end{pmatrix}, \tag{21}
$$

$$
V^T C Y = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & I & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & I & 0 & 0 \\
0 & 0 & 0 & 0 & I & 0
\end{pmatrix},
$$

with the same block partitionings as in Lemma 5, where $S_{ABC}$ is of the form (18), and U, V can be chosen to be given by (19) and (20), respectively. The $\sigma_1, \dots, \sigma_{k_4}$ are defined to be the non-trivial restricted singular values of the matrix triplet (A, B, C).
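In the fully non-singular square case ($n = m = l = p = k$, so $k_4 = k$ and all other blocks are empty), the non-trivial restricted singular values of (A, B, C) reduce to the singular values of $B^{-1}AC^{-1}$, matching (18). A sketch of this special case (our illustration, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(4)
k = 4
A = rng.standard_normal((k, k))
B = rng.standard_normal((k, k))
C = rng.standard_normal((k, k))   # B, C almost surely non-singular

# S_ABC from (18): SVD of B^{-1} A C^{-1}
M = np.linalg.inv(B) @ A @ np.linalg.inv(C)
U44, s, V44t = np.linalg.svd(M)
S_ABC = np.diag(s)

# U44^T (B^{-1} A C^{-1}) V44 = diag{sigma_1, ..., sigma_k}
assert np.allclose(U44.T @ M @ V44t.T, S_ABC)
```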
From the uniqueness theorem in [16], we only need to consider the matrices U, V given by (19) and (20) in order to characterize the role of the orthogonal matrices in the RSVD. Let U, V be defined by (19) and (20), respectively, and partition

$$
U = \begin{pmatrix} U_1 & U_2 & U_3 & U_4 \end{pmatrix}, \qquad
V = \begin{pmatrix} V_1 & V_2 & V_3 & V_4 \end{pmatrix},
$$

with column block widths $l - r_b$, $k_3$, $k_4$, $r_{ab} - r_a$ for U and $p - r_c$, $k_2$, $k_4$, $\bar r_{ac} - r_a$ for V. We have

$$
U_1 = S_\infty(B), \qquad U_4 = S_\infty^\perp(T_\infty^T(A)B), \qquad
V_1 = T_\infty(C), \qquad V_4 = T_\infty^\perp(C S_\infty(A)).
$$

Furthermore, if we define

$$
W_1 := C S_\infty(T_\infty^T(B)A), \qquad
W_2 := A S_\infty(T_\infty^T(B)A), \qquad
W_3 := (T_\infty(W_2 S_\infty(W_1)))^T B,
$$

then

$$
W_1 = V_c \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & C_{34} & 0 & 0 \\
C_{43} & C_{44} & C_{45} & 0
\end{pmatrix},
$$

with column block widths $k_3$, $k_4$, $\bar r_{ac} - r_a$, $m - \bar r_{ac}$ and row block heights $p - r_c$, $k_2$, $k_4$, $\bar r_{ac} - r_a$,

$$
W_2 = P^T \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
A_{33} & A_{34} & 0 & 0 \\
0 & A_{44} & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix},
$$

with the same column partitioning and row block heights $k_1$, $k_2$, $k_3$, $k_4$, $r_{ab} - r_a$, $n - r_{ab}$, and

$$
W_3 = \begin{pmatrix}
0 & 0 & 0 & B_{14} \\
0 & 0 & 0 & B_{24} \\
0 & 0 & B_{43} & B_{44} \\
0 & 0 & 0 & B_{54} \\
0 & 0 & 0 & 0
\end{pmatrix} U_b^T, \tag{22}
$$

with column block widths $l - r_b$, $k_3$, $k_4$, $r_{ab} - r_a$ and row block heights $k_1$, $k_2$, $k_4$, $r_{ab} - r_a$, $n - r_{ab}$. Hence, from (22) we have

$$
\begin{pmatrix} U_1 & U_2 \end{pmatrix} = S_\infty(W_3).
$$
Thus, in order to characterize the role of the orthogonal matrices U, V in the RSVD, we only need to characterize the role of the blocks $U_3$ and $V_3$. This can be done by the following variational formulation.
Theorem 7. Given matrices A ∈ R^{n×m}, B ∈ R^{n×l}, C ∈ R^{p×m}. Consider the optimization problem over $x \in \mathbb{R}^m$, $y \in \mathbb{R}^l$, $x \neq 0$, subject to

$$
\begin{pmatrix}
A & B \\
S_\infty^T\begin{pmatrix} A \\ C \end{pmatrix} & 0 \\
S_\infty^T(A)\, C^T C & 0 \\
0 & S_\infty^T(B) \\
0 & T_\infty^T\bigl(B S_\infty^\perp(W_3)\bigr) B
\end{pmatrix}
\begin{pmatrix} x \\ -y \end{pmatrix} = 0
$$