Delin Chu and Bart De Moor
Department of Electrical Engineering (ESAT), Katholieke Universiteit Leuven, Kardinaal Mercierlaan 94, B-3001 Leuven, Belgium
August 13, 1998
Abstract

Recently, Chu, Funderlic and Golub [SIAM J. Matrix Anal. Appl., 18:1082-1092, 1997] presented a variational formulation for the quotient singular value decomposition (QSVD) of two matrices $A \in \mathbb{R}^{n\times m}$, $C \in \mathbb{R}^{p\times m}$, which generalizes the one for the ordinary singular value decomposition (OSVD) and characterizes the role of the two orthogonal matrices in the QSVD. In this paper, we give an alternative derivation of this variational formulation and extend it to establish an analogous variational formulation for the restricted singular value decomposition (RSVD) of matrix triplets
$$A \in \mathbb{R}^{n\times m}, \qquad B \in \mathbb{R}^{n\times l}, \qquad C \in \mathbb{R}^{p\times m},$$
which provides new understanding of the orthogonal matrices appearing in this decomposition.

Keywords: OSVD, QSVD, RSVD, generalized singular value, variational formulation, stationary value, stationary point.

AMS subject classification: 65F15, 65H15.
1 Introduction
The ordinary singular value decomposition (OSVD) of a given matrix $A \in \mathbb{R}^{n\times m}$ is
$$U^T A V = \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix}, \qquad (1)$$
where the row blocks have sizes $r_a$ and $n-r_a$ and the column blocks have sizes $r_a$ and $m-r_a$, with
$$U = \begin{bmatrix} U_1 & U_2 \end{bmatrix}, \qquad V = \begin{bmatrix} V_1 & V_2 \end{bmatrix}, \qquad \Sigma = \mathrm{diag}\{\sigma_1, \dots, \sigma_{r_a}\},$$
$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{r_a} > 0, \qquad r_a = \mathrm{rank}(A),$$
where $U, V$ are orthogonal matrices and $U_1$, $V_1$ have $r_a$ columns each. The $\sigma_1, \dots, \sigma_{r_a}$ are the non-trivial singular values of $A$, and the columns of $U_1$ and $V_1$ are, respectively, the non-trivial left and right singular vectors of $A$. In this paper, $\|\cdot\|$ denotes the 2-norm of a vector. The following theorem is well known [4]:
(Author email addresses: Delin.Chu@esat.kuleuven.ac.be, Bart.Demoor@esat.kuleuven.ac.be.)
Theorem 1. Given $A \in \mathbb{R}^{n\times m}$ with OSVD (1).

(a) Consider the optimization problem
$$\max_{y = Ax,\; y \neq 0} \frac{\|y\|}{\|x\|}. \qquad (2)$$
Then the non-trivial singular values $\sigma_1, \dots, \sigma_{r_a}$ of $A$ are precisely the stationary values, i.e., the functional evaluations at the stationary points, of (2). Moreover, if $\begin{bmatrix} x_1 \\ y_1 \end{bmatrix}, \dots, \begin{bmatrix} x_{r_a} \\ y_{r_a} \end{bmatrix}$ are the stationary points of (2) corresponding to the stationary values $\sigma_1, \dots, \sigma_{r_a}$, then
$$V_1 = \begin{bmatrix} \frac{x_1}{\|x_1\|} & \cdots & \frac{x_{r_a}}{\|x_{r_a}\|} \end{bmatrix},$$
and, if $n = m = r_a$, then
$$U_1 = \begin{bmatrix} \frac{y_1}{\|y_1\|} & \cdots & \frac{y_{r_a}}{\|y_{r_a}\|} \end{bmatrix}.$$

(b) Consider the dual optimization problem
$$\max_{y^T = x^T A,\; y \neq 0} \frac{\|y\|}{\|x\|}. \qquad (3)$$
Then the non-trivial singular values $\sigma_1, \dots, \sigma_{r_a}$ of $A$ are precisely the stationary values of (3). Moreover, if $\begin{bmatrix} x_1 \\ y_1 \end{bmatrix}, \dots, \begin{bmatrix} x_{r_a} \\ y_{r_a} \end{bmatrix}$ are the stationary points of (3) corresponding to the stationary values $\sigma_1, \dots, \sigma_{r_a}$, then
$$U_1 = \begin{bmatrix} \frac{x_1}{\|x_1\|} & \cdots & \frac{x_{r_a}}{\|x_{r_a}\|} \end{bmatrix},$$
and, if $n = m = r_a$, then
$$V_1 = \begin{bmatrix} \frac{y_1}{\|y_1\|} & \cdots & \frac{y_{r_a}}{\|y_{r_a}\|} \end{bmatrix}.$$
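Theorem 1(a) can be illustrated numerically: evaluating the objective of (2) at the right singular vectors of $A$ recovers the singular values and, after normalization, the columns of $U_1$. The following sketch uses hypothetical random data (not taken from the paper) and assumes `numpy` is available:

```python
import numpy as np

# Hypothetical random example illustrating Theorem 1(a).
rng = np.random.default_rng(0)
n, m = 5, 4
A = rng.standard_normal((n, m))

U, sigma, Vt = np.linalg.svd(A)
r_a = np.linalg.matrix_rank(A)

for i in range(r_a):
    x = Vt[i]                          # i-th right singular vector: a stationary point of (2)
    y = A @ x                          # constraint y = A x
    ratio = np.linalg.norm(y) / np.linalg.norm(x)
    assert np.isclose(ratio, sigma[i])  # stationary value equals sigma_i
    # the normalized y recovers the i-th column of U_1 (up to sign)
    assert np.allclose(np.abs(y / np.linalg.norm(y)), np.abs(U[:, i]))
```

Here the stationary points are taken from the OSVD itself; a direct Lagrange-multiplier computation on (2) yields the same points.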
Recently, in Chu, Funderlic and Golub [1], Theorem 1 was generalized to the quotient singular value decomposition (QSVD) [3, 5, 6, 7, 8, 9, 10, 11, 13, 14] of two matrices $A \in \mathbb{R}^{n\times m}$, $C \in \mathbb{R}^{p\times m}$, based on the relationship between the QSVD of the two matrices $A, C$ and the eigendecomposition of the matrix pencil $(A^T A, C^T C)$.
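In the simplest case this pencil relationship is easy to verify numerically. The sketch below uses hypothetical random data with a square nonsingular $C$, so that the non-trivial generalized singular values of $(A, C)$ are the singular values of $AC^{-1}$ and their squares are the eigenvalues of the pencil $(A^T A, C^T C)$:

```python
import numpy as np

# Hypothetical sketch: generalized singular values vs. the pencil (A^T A, C^T C),
# assuming C is square and nonsingular.
rng = np.random.default_rng(1)
n, m = 6, 4
A = rng.standard_normal((n, m))
C = rng.standard_normal((m, m))            # nonsingular almost surely

# non-trivial generalized singular values: singular values of A C^{-1}
gsv = np.linalg.svd(A @ np.linalg.inv(C), compute_uv=False)

# eigenvalues of the pencil (A^T A, C^T C), i.e. of (C^T C)^{-1} A^T A
pencil_eigs = np.linalg.eigvals(np.linalg.solve(C.T @ C, A.T @ A))
pencil_eigs = np.sort(pencil_eigs.real)[::-1]

assert np.allclose(np.sort(gsv)[::-1] ** 2, pencil_eigs)
```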
The purposes of this paper are twofold. First, we present an alternative derivation of the variational formulation in [1], based directly on the QSVD of the two matrices $A, C$. Then we extend this result to the restricted singular value decomposition (RSVD) [8, 9, 10, 11, 15, 16] of matrix triplets and obtain an analogous variational formulation which provides new understanding of the orthogonal matrices appearing in this decomposition.
In order to prove our main results, we establish two condensed forms based on orthogonal matrix transformations. The QSVD of two matrices and the RSVD of matrix triplets can be obtained, and the variational formulations for the QSVD and the RSVD can be proved, directly from these two condensed forms.
In this paper, we use the following notation:
- $S_\infty(M)$ denotes a matrix with orthogonal columns spanning the right nullspace of a matrix $M$;
- $T_\infty(M)$ denotes a matrix with orthogonal columns spanning the right nullspace of the matrix $M^T$;
- $M^\perp$ denotes the orthogonal complement of the space spanned by the columns of $M$.
Unless noted, we do not distinguish between a matrix with orthogonal columns and the space spanned by its columns.
We also use the following notation for any given matrices $A, B, C$ with compatible sizes: denote
$$r_a = \mathrm{rank}(A), \quad r_b = \mathrm{rank}(B), \quad r_c = \mathrm{rank}(C), \quad r_{ab} = \mathrm{rank}\begin{bmatrix} A & B \end{bmatrix}, \quad r_{ac} = \mathrm{rank}\begin{bmatrix} A \\ C \end{bmatrix}, \quad r_{abc} = \mathrm{rank}\begin{bmatrix} A & B \\ C & 0 \end{bmatrix},$$
$$k_1 = r_{abc} - r_b - r_c, \quad k_2 = r_{ab} + r_c - r_{abc}, \quad k_3 = r_{ac} + r_b - r_{abc}, \quad k_4 = r_a + r_{abc} - r_{ab} - r_{ac}.$$
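Substituting the definitions shows that $k_1 + k_2 + k_3 + k_4 = r_a$; together with the nonnegativity of the $k_i$ (they are block dimensions in the RSVD below), this gives a quick sanity check on the rank quantities, sketched here with hypothetical random data:

```python
import numpy as np

# Hypothetical sketch of the rank quantities r_a, ..., r_abc and k_1, ..., k_4.
rng = np.random.default_rng(2)
n, m, l, p = 6, 5, 3, 4
A = rng.standard_normal((n, m))
B = rng.standard_normal((n, l))
C = rng.standard_normal((p, m))

rank = np.linalg.matrix_rank
r_a, r_b, r_c = rank(A), rank(B), rank(C)
r_ab = rank(np.hstack([A, B]))
r_ac = rank(np.vstack([A, C]))
r_abc = rank(np.block([[A, B], [C, np.zeros((p, l))]]))

k1 = r_abc - r_b - r_c
k2 = r_ab + r_c - r_abc
k3 = r_ac + r_b - r_abc
k4 = r_a + r_abc - r_ab - r_ac

assert min(k1, k2, k3, k4) >= 0
assert k1 + k2 + k3 + k4 == r_a     # algebraic identity in the definitions
```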
2 A Variational Formulation for QSVD
Nowadays, several generalizations of the OSVD have been proposed and analysed. One that is well known is the generalized SVD introduced by Paige and Saunders in [5], which De Moor and Golub [11] proposed to rename the QSVD. Another one is the RSVD, introduced in its explicit form by Zha [16] and further developed and discussed by De Moor and Golub [8].
In this section we give an alternative proof of the variational formulation for the QSVD of [1], based directly on the QSVD itself. First we present a condensed form from which the QSVD of two matrices can be derived.
Lemma 2. Given matrices $A \in \mathbb{R}^{n\times m}$, $C \in \mathbb{R}^{p\times m}$. Then there exist three orthogonal matrices $U_a \in \mathbb{R}^{n\times n}$, $W \in \mathbb{R}^{m\times m}$, $V_c \in \mathbb{R}^{p\times p}$ such that
$$U_a^T A W = \begin{bmatrix} A_{11} & A_{12} & 0 & 0 \\ 0 & A_{22} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad (4)$$
$$V_c^T C W = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & C_{22} & 0 & 0 \\ C_{31} & C_{32} & C_{33} & 0 \end{bmatrix},$$
where in both cases the column blocks have sizes $r_{ac}-r_c$, $r_a+r_c-r_{ac}$, $r_{ac}-r_a$, $m-r_{ac}$, the row blocks of $U_a^T A W$ have sizes $r_{ac}-r_c$, $r_a+r_c-r_{ac}$, $n-r_a$, the row blocks of $V_c^T C W$ have sizes $p-r_c$, $r_a+r_c-r_{ac}$, $r_{ac}-r_a$, and $A_{11}$, $A_{22}$, $C_{22}$ and $C_{33}$ are nonsingular.
Proof. See Appendix A.
Let the OSVD of $A_{22} C_{22}^{-1}$ be
$$U_{22}^T A_{22} C_{22}^{-1} V_{22} = \mathrm{diag}\{\sigma_1, \dots, \sigma_s\} =: S_A, \qquad s = r_a + r_c - r_{ac}, \qquad (5)$$
where $U_{22}, V_{22}$ are orthogonal matrices and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_s > 0$. Define
$$U := U_a\, \mathrm{diag}\{I_{r_{ac}-r_c},\, U_{22},\, I_{n-r_a}\}, \qquad (6)$$
$$V := V_c\, \mathrm{diag}\{I_{p-r_c},\, V_{22},\, I_{r_{ac}-r_a}\}, \qquad (7)$$
$$X := W \begin{bmatrix} I & 0 & 0 & 0 \\ 0 & I & 0 & 0 \\ -C_{33}^{-1}C_{31} & -C_{33}^{-1}C_{32} & C_{33}^{-1} & 0 \\ 0 & 0 & 0 & I \end{bmatrix} \begin{bmatrix} A_{11}^{-1} & -A_{11}^{-1}A_{12}C_{22}^{-1}V_{22} & 0 & 0 \\ 0 & C_{22}^{-1}V_{22} & 0 & 0 \\ 0 & 0 & I & 0 \\ 0 & 0 & 0 & I \end{bmatrix}. \qquad (8)$$
Then, as a direct consequence of the condensed form (4), we have the following well-known QSVD theorem.
Theorem 3 (QSVD Theorem). Let $A \in \mathbb{R}^{n\times m}$, $C \in \mathbb{R}^{p\times m}$. Then there exist orthogonal matrices $U \in \mathbb{R}^{n\times n}$, $V \in \mathbb{R}^{p\times p}$ and a nonsingular matrix $X$ such that
$$U^T A X = \begin{bmatrix} I & 0 & 0 & 0 \\ 0 & S_A & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad V^T C X = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & I & 0 & 0 \\ 0 & 0 & I & 0 \end{bmatrix}, \qquad (9)$$
where in both cases the column blocks have sizes $r_{ac}-r_c$, $r_a+r_c-r_{ac}$, $r_{ac}-r_a$, $m-r_{ac}$, the row blocks of $U^T A X$ have sizes $r_{ac}-r_c$, $r_a+r_c-r_{ac}$, $n-r_a$, the row blocks of $V^T C X$ have sizes $p-r_c$, $r_a+r_c-r_{ac}$, $r_{ac}-r_a$, $S_A$ is of the form (5), and $U$, $V$ and $X$ can be chosen to be given by (6), (7) and (8), respectively. The $\sigma_i$, $i = 1, \dots, s$, are defined to be the non-trivial generalized singular values of the two matrices $A, C$.
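A direct consequence of Theorem 3 is that the number of non-trivial generalized singular values equals $s = r_a + r_c - r_{ac}$, which can be read off from three ranks; a sketch with hypothetical random data:

```python
import numpy as np

# Hypothetical sketch: counting the non-trivial generalized singular values.
rng = np.random.default_rng(5)
n, m, p = 5, 4, 3
A = rng.standard_normal((n, m))
C = rng.standard_normal((p, m))

r_a = np.linalg.matrix_rank(A)
r_c = np.linalg.matrix_rank(C)
r_ac = np.linalg.matrix_rank(np.vstack([A, C]))
s = r_a + r_c - r_ac        # number of non-trivial generalized singular values
```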
According to the uniqueness theorem in [16], we only need to characterize the matrices $U, V$ given by (6) and (7) in order to characterize the role of the orthogonal matrices in the QSVD. Let $U, V$ be given by (6) and (7) and partition these two orthogonal matrices as
$$U = \begin{bmatrix} U_1 & U_2 & U_3 \end{bmatrix}, \qquad (10)$$
with column block sizes $r_{ac}-r_c$, $r_a+r_c-r_{ac}$, $n-r_a$, and
$$V = \begin{bmatrix} V_1 & V_2 & V_3 \end{bmatrix}, \qquad (11)$$
with column block sizes $p-r_c$, $r_a+r_c-r_{ac}$, $r_{ac}-r_a$. Then, from Lemma 2 we have
$$U_3 = T_\infty(A), \qquad U_1 = T_\infty^\perp(A\,S_\infty(C)), \qquad (13)$$
$$V_1 = T_\infty(C), \qquad V_3 = T_\infty^\perp(C\,S_\infty(A)). \qquad (14)$$
Hence, in order to characterize the role of the orthogonal matrices $U, V$ in the QSVD, it remains only to characterize the role of $U_2, V_2$.
The following variational formulation has been established in [1] to characterize $U_2$ and $V_2$.

Theorem 4. Given $A \in \mathbb{R}^{n\times m}$, $C \in \mathbb{R}^{p\times m}$. Consider the optimization problem
$$\max \frac{\|y\|}{\|x\|} \quad \text{subject to} \quad \begin{bmatrix} A^T & -C^T \\ T_\infty^T(A) & 0 \\ 0 & T_\infty^T(C) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 0, \quad x \neq 0. \qquad (15)$$
Then the non-trivial generalized singular values $\sigma_1, \dots, \sigma_s$ of the two matrices $A, C$ are precisely the stationary values of the problem (15). Furthermore, let $\begin{bmatrix} x_1 \\ y_1 \end{bmatrix}, \dots, \begin{bmatrix} x_s \\ y_s \end{bmatrix}$ be stationary points of the problem (15) with corresponding stationary values $\sigma_1, \dots, \sigma_s$; then
$$U_2 = \begin{bmatrix} \frac{x_1}{\|x_1\|} & \cdots & \frac{x_s}{\|x_s\|} \end{bmatrix}, \qquad V_2 = \begin{bmatrix} \frac{y_1}{\|y_1\|} & \cdots & \frac{y_s}{\|y_s\|} \end{bmatrix}.$$
Proof. We prove Theorem 4 by the following three arguments.
Argument 1. First, we characterize the orthogonal matrices $U_{22}, V_{22}$ in (5). Consider the optimization problem
$$\max_{x_2^T A_{22} = y_2^T C_{22},\; x_2 \neq 0} \frac{\|y_2\|}{\|x_2\|}. \qquad (16)$$
Since $A_{22}, C_{22}$ are both nonsingular, by Theorem 1 the $\sigma_1, \dots, \sigma_s$, i.e., the singular values of the matrix $A_{22}C_{22}^{-1}$, are precisely the stationary values of the problem (16), and, if $\begin{bmatrix} x_2^{(1)} \\ y_2^{(1)} \end{bmatrix}, \dots, \begin{bmatrix} x_2^{(s)} \\ y_2^{(s)} \end{bmatrix}$ are the stationary points of the problem (16) with corresponding stationary values $\sigma_1, \dots, \sigma_s$, then
$$U_{22} = \begin{bmatrix} \frac{x_2^{(1)}}{\|x_2^{(1)}\|} & \cdots & \frac{x_2^{(s)}}{\|x_2^{(s)}\|} \end{bmatrix}, \qquad V_{22} = \begin{bmatrix} \frac{y_2^{(1)}}{\|y_2^{(1)}\|} & \cdots & \frac{y_2^{(s)}}{\|y_2^{(s)}\|} \end{bmatrix}.$$
Argument 2. Second, let
$$\mathcal{F} = \left\{ \begin{bmatrix} x \\ y \end{bmatrix} \;\middle|\; x \in \mathbb{R}^n,\; y \in \mathbb{R}^p,\; U_a^T x = \begin{bmatrix} 0 \\ x_2 \\ 0 \end{bmatrix},\; V_c^T y = \begin{bmatrix} 0 \\ y_2 \\ 0 \end{bmatrix},\; x^T A = y^T C,\; x \neq 0 \right\},$$
where the blocks of $U_a^T x$ have sizes $r_{ac}-r_c$, $r_a+r_c-r_{ac}$, $n-r_a$ and the blocks of $V_c^T y$ have sizes $p-r_c$, $r_a+r_c-r_{ac}$, $r_{ac}-r_a$. Consider the optimization problem
$$\max_{\begin{bmatrix} x \\ y \end{bmatrix} \in \mathcal{F}} \frac{\|y\|}{\|x\|}. \qquad (17)$$
Obviously, $\begin{bmatrix} x \\ y \end{bmatrix}$ is a stationary point of the problem (17) with stationary value $\sigma$ if and only if $\begin{bmatrix} x_2 \\ y_2 \end{bmatrix}$ is a stationary point of the problem (16) with the same stationary value and, furthermore,
$$U_a^T x = \begin{bmatrix} 0 \\ x_2 \\ 0 \end{bmatrix}, \qquad V_c^T y = \begin{bmatrix} 0 \\ y_2 \\ 0 \end{bmatrix}.$$
Argument 3. Finally, for any $x \in \mathbb{R}^n$, $y \in \mathbb{R}^p$, partition
$$U_a^T x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \qquad V_c^T y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix},$$
where the blocks of $U_a^T x$ have sizes $r_{ac}-r_c$, $r_a+r_c-r_{ac}$, $n-r_a$ and the blocks of $V_c^T y$ have sizes $p-r_c$, $r_a+r_c-r_{ac}$, $r_{ac}-r_a$. Since
$$U_a^T T_\infty(A) = \begin{bmatrix} 0 \\ 0 \\ I \end{bmatrix}, \qquad V_c^T T_\infty(C) = \begin{bmatrix} I \\ 0 \\ 0 \end{bmatrix},$$
with the same block sizes, it is easy to see that $\begin{bmatrix} x \\ y \end{bmatrix} \in \mathcal{F}$ if and only if
$$\begin{bmatrix} A^T & -C^T \\ T_\infty^T(A) & 0 \\ 0 & T_\infty^T(C) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 0, \qquad x \neq 0.$$
Note that
$$U_a^T U_2 = \begin{bmatrix} 0 \\ U_{22} \\ 0 \end{bmatrix}, \qquad V_c^T V_2 = \begin{bmatrix} 0 \\ V_{22} \\ 0 \end{bmatrix};$$
thus, Theorem 4 follows directly from the above Arguments 1, 2 and 3.
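When $A$ and $C$ are square and nonsingular, $T_\infty(A)$ and $T_\infty(C)$ are empty and the constraint in (15) reduces to $A^T x = C^T y$, so the stationary values become the singular values of $AC^{-1}$ and the stationary $x$'s its left singular vectors (the columns of $U_2$). A numerical sketch of this special case, with hypothetical random data:

```python
import numpy as np

# Hypothetical sketch of problem (15) for square nonsingular A and C:
# the constraint A^T x = C^T y gives y = C^{-T} A^T x, and the stationary
# values of ||y|| / ||x|| are the singular values of M = A C^{-1}.
rng = np.random.default_rng(3)
m = 4
A = rng.standard_normal((m, m))
C = rng.standard_normal((m, m))

M = A @ np.linalg.inv(C)
U, sigma, Vt = np.linalg.svd(M)

for i in range(m):
    x = U[:, i]                            # stationary point: left singular vector of M
    y = np.linalg.solve(C.T, A.T @ x)      # enforce A^T x = C^T y
    assert np.isclose(np.linalg.norm(y) / np.linalg.norm(x), sigma[i])
```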
3 A Variational Formulation for RSVD
In Section 2 we have derived the QSVD of two matrices $A, C$ based on the condensed form (4). Now we establish the RSVD of a matrix triplet $(A, B, C)$ via an analogous condensed form.
Lemma 5. Given $A \in \mathbb{R}^{n\times m}$, $B \in \mathbb{R}^{n\times l}$, $C \in \mathbb{R}^{p\times m}$. Then there exist orthogonal matrices $P \in \mathbb{R}^{n\times n}$, $Q \in \mathbb{R}^{m\times m}$, $U_b \in \mathbb{R}^{l\times l}$, $V_c \in \mathbb{R}^{p\times p}$ such that
$$PAQ = \begin{bmatrix} A_{11} & A_{12} & 0 & 0 & 0 & 0 \\ 0 & A_{22} & 0 & 0 & 0 & 0 \\ A_{31} & A_{32} & A_{33} & A_{34} & 0 & 0 \\ A_{41} & A_{42} & 0 & A_{44} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix},$$
with row block sizes $k_1$, $k_2$, $k_3$, $k_4$, $r_{ab}-r_a$, $n-r_{ab}$ and column block sizes $k_1$, $k_2$, $k_3$, $k_4$, $r_{ac}-r_a$, $m-r_{ac}$;
$$PBU_b = \begin{bmatrix} 0 & 0 & 0 & B_{14} \\ 0 & 0 & 0 & B_{24} \\ 0 & B_{32} & B_{33} & B_{34} \\ 0 & 0 & B_{43} & B_{44} \\ 0 & 0 & 0 & B_{54} \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad (18)$$
with row block sizes $k_1$, $k_2$, $k_3$, $k_4$, $r_{ab}-r_a$, $n-r_{ab}$ and column block sizes $l-r_b$, $k_3$, $k_4$, $r_{ab}-r_a$;
$$V_c^T C Q = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & C_{22} & 0 & 0 & 0 & 0 \\ C_{31} & C_{32} & 0 & C_{34} & 0 & 0 \\ C_{41} & C_{42} & C_{43} & C_{44} & C_{45} & 0 \end{bmatrix},$$
with row block sizes $p-r_c$, $k_2$, $k_4$, $r_{ac}-r_a$ and column block sizes $k_1$, $k_2$, $k_3$, $k_4$, $r_{ac}-r_a$, $m-r_{ac}$, where $A_{11}$, $A_{22}$, $A_{33}$, $A_{44}$, $B_{32}$, $B_{43}$, $B_{54}$, $C_{22}$, $C_{34}$ and $C_{45}$ are nonsingular.
Proof. See Appendix B.
Let the OSVD of $B_{43}^{-1} A_{44} C_{34}^{-1}$ be
$$U_{44}^T B_{43}^{-1} A_{44} C_{34}^{-1} V_{44} = \mathrm{diag}\{\sigma_1, \dots, \sigma_{k_4}\} =: S_A, \qquad (19)$$
where $U_{44}, V_{44}$ are orthogonal matrices and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{k_4} > 0$. Define
$$U := U_b\, \mathrm{diag}\{I_{l-r_b},\, I_{k_3},\, U_{44},\, I_{r_{ab}-r_a}\}, \qquad (20)$$
$$V := V_c\, \mathrm{diag}\{I_{p-r_c},\, I_{k_2},\, V_{44},\, I_{r_{ac}-r_a}\}. \qquad (21)$$
Similarly to Theorem 3, directly from Lemma 5 we have

Theorem 6 (RSVD Theorem). Given $A \in \mathbb{R}^{n\times m}$, $B \in \mathbb{R}^{n\times l}$, $C \in \mathbb{R}^{p\times m}$. Then there exist nonsingular matrices $X \in \mathbb{R}^{n\times n}$, $Y \in \mathbb{R}^{m\times m}$ and orthogonal matrices $U \in \mathbb{R}^{l\times l}$, $V \in \mathbb{R}^{p\times p}$ such that
$$X^T A Y = \begin{bmatrix} I & 0 & 0 & 0 & 0 & 0 \\ 0 & I & 0 & 0 & 0 & 0 \\ 0 & 0 & I & 0 & 0 & 0 \\ 0 & 0 & 0 & S_A & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix},$$
with row block sizes $k_1$, $k_2$, $k_3$, $k_4$, $r_{ab}-r_a$, $n-r_{ab}$ and column block sizes $k_1$, $k_2$, $k_3$, $k_4$, $r_{ac}-r_a$, $m-r_{ac}$;
$$X^T B U = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & I & 0 & 0 \\ 0 & 0 & I & 0 \\ 0 & 0 & 0 & I \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad (22)$$
with row block sizes $k_1$, $k_2$, $k_3$, $k_4$, $r_{ab}-r_a$, $n-r_{ab}$ and column block sizes $l-r_b$, $k_3$, $k_4$, $r_{ab}-r_a$;
$$V^T C Y = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & I & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & I & 0 & 0 \\ 0 & 0 & 0 & 0 & I & 0 \end{bmatrix},$$
with row block sizes $p-r_c$, $k_2$, $k_4$, $r_{ac}-r_a$ and column block sizes $k_1$, $k_2$, $k_3$, $k_4$, $r_{ac}-r_a$, $m-r_{ac}$, where $S_A$ is of the form (19), and $U$, $V$ can be chosen to be given by (20) and (21), respectively. The $\sigma_1, \dots, \sigma_{k_4}$ are defined to be the non-trivial restricted singular values of the matrix triplet $(A, B, C)$.
From the uniqueness theorem in [16], we only need to consider the matrices $U, V$ given by (20) and (21) in order to characterize the role of the orthogonal matrices in the RSVD. Let $U, V$ be defined by (20) and (21), respectively, and partition
$$U = \begin{bmatrix} U_1 & U_2 & U_3 & U_4 \end{bmatrix}, \qquad V = \begin{bmatrix} V_1 & V_2 & V_3 & V_4 \end{bmatrix},$$
where the column blocks of $U$ have sizes $l-r_b$, $k_3$, $k_4$, $r_{ab}-r_a$ and those of $V$ have sizes $p-r_c$, $k_2$, $k_4$, $r_{ac}-r_a$. We have
$$U_1 = S_\infty(B), \qquad U_4 = S_\infty^\perp(T_\infty^T(A)\,B), \qquad V_1 = T_\infty(C), \qquad V_4 = T_\infty^\perp(C\,S_\infty(A)).$$
Furthermore, if we define
$$\Omega_1 := C\,S_\infty(T_\infty^T(B)\,A), \qquad \Omega_2 := A\,S_\infty(T_\infty^T(B)\,A), \qquad \Omega_3 := \bigl(T_\infty(\Omega_2\,S_\infty(\Omega_1))\bigr)^T B,$$
then
$$\Omega_1 = V_c \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & C_{34} & 0 & 0 \\ C_{43} & C_{44} & C_{45} & 0 \end{bmatrix},$$
with row block sizes $p-r_c$, $k_2$, $k_4$, $r_{ac}-r_a$ and column block sizes $k_3$, $k_4$, $r_{ac}-r_a$, $m-r_{ac}$;
$$\Omega_2 = P^T \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ A_{33} & A_{34} & 0 & 0 \\ 0 & A_{44} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},$$
with row block sizes $k_1$, $k_2$, $k_3$, $k_4$, $r_{ab}-r_a$, $n-r_{ab}$ and column block sizes $k_3$, $k_4$, $r_{ac}-r_a$, $m-r_{ac}$;
$$\Omega_3 = \begin{bmatrix} 0 & 0 & 0 & B_{14} \\ 0 & 0 & 0 & B_{24} \\ 0 & 0 & B_{43} & B_{44} \\ 0 & 0 & 0 & B_{54} \\ 0 & 0 & 0 & 0 \end{bmatrix} U_b^T, \qquad (23)$$
where the row block sizes are $k_1$, $k_2$, $k_4$, $r_{ab}-r_a$, $n-r_{ab}$ and the column block sizes (before multiplication by $U_b^T$) are $l-r_b$, $k_3$, $k_4$, $r_{ab}-r_a$.
Hence, from (23) we have
$$\begin{bmatrix} U_1 & U_2 \end{bmatrix} = S_\infty(\Omega_3).$$
Thus, in order to characterize the role of the orthogonal matrices $U, V$ in the RSVD, we only need to characterize the role of $U_3, V_3$. This can be done by the following variational formulation.
Theorem 7. Given matrices $A \in \mathbb{R}^{n\times m}$, $B \in \mathbb{R}^{n\times l}$, $C \in \mathbb{R}^{p\times m}$. Consider the optimization problem
$$\max \frac{\|y\|}{\|Cx\|} \quad \text{subject to} \quad \begin{bmatrix} A & -B \\ S_\infty^T\!\left(\begin{bmatrix} A \\ C \end{bmatrix}\right) & 0 \\ S_\infty^T(A)\,C^T C & 0 \\ 0 & S_\infty^T(B) \\ 0 & T_\infty^T(B\,S_\infty^\perp(\Omega_3))\,B \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 0, \quad x \neq 0. \qquad (24)$$
Then the stationary values of the problem (24) are precisely the non-trivial restricted singular values $\sigma_1, \dots, \sigma_{k_4}$ of the matrix triplet $(A, B, C)$. Moreover, if $\begin{bmatrix} x_1 \\ y_1 \end{bmatrix}, \dots, \begin{bmatrix} x_{k_4} \\ y_{k_4} \end{bmatrix}$ are the stationary points of the problem (24) corresponding to the stationary values $\sigma_1, \dots, \sigma_{k_4}$, respectively, then
$$U_3 = \begin{bmatrix} \frac{y_1}{\|y_1\|} & \cdots & \frac{y_{k_4}}{\|y_{k_4}\|} \end{bmatrix}.$$
Proof. As in the proof of Theorem 4, we proceed by the following three arguments.
Argument 1. First, we characterize $U_{44}$ in (19). Consider the optimization problem
$$\max_{A_{44}x_4 = B_{43}y_3,\; y_3 \neq 0} \frac{\|y_3\|}{\|C_{34}x_4\|}. \qquad (25)$$
Since $A_{44}$, $B_{43}$, $C_{34}$ are nonsingular, by Theorem 1 the stationary values of the problem (25) are precisely $\sigma_1, \dots, \sigma_{k_4}$, i.e., the singular values of the matrix $B_{43}^{-1}A_{44}C_{34}^{-1}$, and, if the corresponding stationary points are $\begin{bmatrix} x_4^{(1)} \\ y_3^{(1)} \end{bmatrix}, \dots, \begin{bmatrix} x_4^{(k_4)} \\ y_3^{(k_4)} \end{bmatrix}$, then
$$U_{44} = \begin{bmatrix} \frac{y_3^{(1)}}{\|y_3^{(1)}\|} & \cdots & \frac{y_3^{(k_4)}}{\|y_3^{(k_4)}\|} \end{bmatrix}.$$
Argument 2. Second, define
$$\mathcal{F} := \left\{ \begin{bmatrix} x \\ y \end{bmatrix} \;\middle|\; x \in \mathbb{R}^m,\; y \in \mathbb{R}^l,\; U_b^T y = \begin{bmatrix} 0 \\ 0 \\ y_3 \\ 0 \end{bmatrix},\; Q^T x = \begin{bmatrix} 0 \\ 0 \\ x_3 \\ x_4 \\ x_5 \\ 0 \end{bmatrix},\; C_{43}x_3 + C_{44}x_4 + C_{45}x_5 = 0,\; Ax = By,\; y \neq 0 \right\},$$
where the blocks of $U_b^T y$ have sizes $l-r_b$, $k_3$, $k_4$, $r_{ab}-r_a$ and the blocks of $Q^T x$ have sizes $k_1$, $k_2$, $k_3$, $k_4$, $r_{ac}-r_a$, $m-r_{ac}$. Consider the optimization problem
$$\max_{\begin{bmatrix} x \\ y \end{bmatrix} \in \mathcal{F}} \frac{\|y\|}{\|Cx\|}. \qquad (26)$$
Since $A_{33}$, $A_{44}$, $B_{43}$ and $C_{45}$ are nonsingular, a simple calculation yields that the problem (26) is equivalent to the problem (25), in the sense that the stationary values of the problem (26) are precisely the stationary values of the problem (25), i.e., $\sigma_1, \dots, \sigma_{k_4}$, and $\begin{bmatrix} x \\ y \end{bmatrix}$ is a stationary point of the problem (26) if and only if $\begin{bmatrix} x_4 \\ y_3 \end{bmatrix}$ is a stationary point of the problem (25) with the same stationary value.
Argument 3. Third, for any $x \in \mathbb{R}^m$, $y \in \mathbb{R}^l$, denote
$$U_b^T y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}, \qquad Q^T x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix},$$
where the blocks of $U_b^T y$ have sizes $l-r_b$, $k_3$, $k_4$, $r_{ab}-r_a$ and the blocks of $Q^T x$ have sizes $k_1$, $k_2$, $k_3$, $k_4$, $r_{ac}-r_a$, $m-r_{ac}$. Since
$$S_\infty^T\!\left(\begin{bmatrix} A \\ C \end{bmatrix}\right) x = 0 \iff x_6 = 0; \qquad Ax = By \implies x_1 = 0,\; x_2 = 0,\; y_4 = 0;$$
$$S_\infty^T(A)\,C^T C\,x = 0 \iff C_{41}x_1 + C_{42}x_2 + C_{43}x_3 + C_{44}x_4 + C_{45}x_5 = 0; \qquad S_\infty^T(B)\,y = 0 \iff y_1 = 0,$$
and, from (23), we also know
$$T_\infty^T(B\,S_\infty^\perp(\Omega_3))\,B y = 0 \iff y_2 = 0,$$
we therefore have that $\begin{bmatrix} x \\ y \end{bmatrix} \in \mathcal{F}$ if and only if
$$\begin{bmatrix} A & -B \\ S_\infty^T\!\left(\begin{bmatrix} A \\ C \end{bmatrix}\right) & 0 \\ S_\infty^T(A)\,C^T C & 0 \\ 0 & S_\infty^T(B) \\ 0 & T_\infty^T(B\,S_\infty^\perp(\Omega_3))\,B \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 0, \qquad y \neq 0.$$
Note that
$$U_b^T U_3 = \begin{bmatrix} 0 \\ 0 \\ U_{44} \\ 0 \end{bmatrix},$$
with block sizes $l-r_b$, $k_3$, $k_4$, $r_{ab}-r_a$; so, Theorem 7 follows directly from the above Arguments 1, 2 and 3.
Similarly, we also have the dual result of Theorem 7, which characterizes the non-trivial restricted singular values $\sigma_1, \dots, \sigma_{k_4}$ and the matrix $V_3$ in (21). For the sake of simplicity, we omit it here.
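As in the QSVD case, the results of this section admit a simple numerical check when $n = m = l = p$ and $A$, $B$, $C$ are nonsingular: the constraint $Ax = By$ of (24) gives $y = B^{-1}Ax$, and with $z = Cx$ the objective $\|y\|/\|Cx\|$ becomes $\|B^{-1}AC^{-1}z\|/\|z\|$, so the non-trivial restricted singular values are the singular values of $B^{-1}AC^{-1}$. A sketch with hypothetical random data:

```python
import numpy as np

# Hypothetical sketch: restricted singular values of a nonsingular triplet
# (A, B, C) as singular values of M = B^{-1} A C^{-1}.
rng = np.random.default_rng(4)
m = 4
A = rng.standard_normal((m, m))
B = rng.standard_normal((m, m))
C = rng.standard_normal((m, m))

M = np.linalg.solve(B, A) @ np.linalg.inv(C)     # M = B^{-1} A C^{-1}
U, sigma, Vt = np.linalg.svd(M)

for i in range(m):
    z = Vt[i]                          # right singular vector of M
    x = np.linalg.solve(C, z)          # so that C x = z
    y = np.linalg.solve(B, A @ x)      # enforce A x = B y
    assert np.isclose(np.linalg.norm(y) / np.linalg.norm(C @ x), sigma[i])
```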
4 Conclusion
In this paper, we have studied generalized singular value decompositions. We have given an alternative proof of the variational formulation for the QSVD in [1] and established an analogous variational formulation for the RSVD which provides new understanding of the orthogonal matrices appearing in this decomposition.
5 Acknowledgement
Some of Delin Chu's work was done during his visit to the Department of Mathematics at the University of Bielefeld in Germany in April 1998. He is grateful to Professor L. Elsner for his kind hospitality and financial support. He also thanks Professor L. Elsner for reading and correcting the first version of the present paper.
Appendix A
In this appendix we prove Lemma 2 constructively.
Proof. We prove Lemma 2 in the following four steps:
Step 1: Perform a simultaneous row and column compression:
$$U_1^T A W_1 =: \begin{bmatrix} A_{11} & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},$$
with row block sizes $r_a$, $n-r_a$ and column block sizes $r_a$, $r_{ac}-r_a$, $m-r_{ac}$;
$$V_1^T C W_1 =: \begin{bmatrix} 0 & 0 & 0 \\ C_{21} & 0 & 0 \\ C_{31} & C_{33} & 0 \end{bmatrix},$$
with row block sizes $p-r_c$, $r_a+r_c-r_{ac}$, $r_{ac}-r_a$ and the same column block sizes, where $A_{11}$, $C_{33}$ are nonsingular and $C_{21}$ has full row rank.
Step 2: Perform a column compression:
$$C_{21} W_2 =: \begin{bmatrix} 0 & C_{22} \end{bmatrix},$$
with column block sizes $r_{ac}-r_c$, $r_a+r_c-r_{ac}$ and $C_{22}$ nonsingular. Set
$$A_{11} W_2 =: \begin{bmatrix} A_{11} & A_{12} \end{bmatrix}, \qquad C_{31} W_2 =: \begin{bmatrix} C_{31} & C_{32} \end{bmatrix},$$
with the same column block sizes.
Step 3: Perform a row compression:
$$U_3^T A_{11} =: \begin{bmatrix} A_{11} \\ 0 \end{bmatrix},$$
with row block sizes $r_{ac}-r_c$, $r_a+r_c-r_{ac}$ and $A_{11}$ nonsingular. Set
$$U_3^T A_{12} =: \begin{bmatrix} A_{12} \\ A_{22} \end{bmatrix},$$
with the same row block sizes.
Step 4: Set
$$U_a := U_1 \begin{bmatrix} U_3 & \\ & I \end{bmatrix}, \qquad W := W_1 \begin{bmatrix} W_2 & \\ & I \end{bmatrix}, \qquad V_c := V_1.$$
Then (4) follows.