An eigenvalue algorithm based on norm-reducing transformations


Citation for published version (APA):

Paardekooper, M. H. C. (1969). An eigenvalue algorithm based on norm-reducing transformations. Technische Hogeschool Eindhoven. https://doi.org/10.6100/IR41102

DOI:

10.6100/IR41102

Document status and date: Published: 01/01/1969

Document Version:

Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)


AN EIGENVALUE ALGORITHM

BASED ON NORM-REDUCING

TRANSFORMATIONS

THESIS

SUBMITTED FOR THE DEGREE OF DOCTOR OF TECHNICAL SCIENCES AT THE TECHNISCHE HOGESCHOOL EINDHOVEN, BY AUTHORITY OF THE RECTOR MAGNIFICUS DR.IR. A.A.TH.M. VAN TRIER, PROFESSOR IN THE DEPARTMENT OF ELECTRICAL ENGINEERING, TO BE DEFENDED BEFORE A COMMITTEE OF THE SENATE ON TUESDAY 2 DECEMBER 1969 AT 16.00 HOURS

BY

MICHAEL HUBERTUS CORNELIUS PAARDEKOOPER

THIS THESIS HAS BEEN APPROVED BY THE PROMOTOR PROF. DR. G.W. VELTKAMP

To Wil

CONTENTS

0. Introduction

0.0. Introductory remarks

0.1. Notations, definitions and elementary theorems

0.2. A survey of Jacobi-like algorithms

0.3. Summary

1. Real Norm-Reducing Shears

1.0. Introduction

1.1. Row congruency and Euclidean parameters of a shear

1.2. Transformations by real unimodular shears

1.3. ... for the real unimodular norm-reducing shears

1.4. The particular case D = F = 0

1.5. The commutator in relation to shear transformations

2. Complex Norm-Reducing Shears

2.0. Introduction

2.1. Row congruency and Euclidean parameters of a shear

2.2. The unimodular norm-reducing shear transformation

2.3. ... for the complex unimodular norm-reducing shears

2.4. The case D = F = 0

2.5. The commutator in relation to shear transformations

3. Convergence to Normality

3.0. Introduction

3.1. A lower bound for the optimal norm-reduction by shear transformations

3.2. The convergence theorem

4. Jacobi-like Methods for almost Diagonalization of almost Normal Matrices

4.0. Introduction

4.1. Almost diagonalization of a complex almost normal matrix

4.2. Almost block diagonalization of a real almost normal matrix

4.3. The real diagonalizing representative of $\mathfrak{M}_{\ell m}(A)$

4.4. The complex diagonalizing representative of $\mathfrak{M}_{\ell m}(A)$

5. Numerical stability and the norm-reducing process

5.0. Introduction

5.1. Input and output perturbations related to rounding errors

5.2. Error analysis of similarity transformations

5.3. The general error analysis applied to shear transformation

5.4. A numerically stable transformation by the diagonalizing representative of $\mathfrak{M}_{\ell m}(A)$

5.5. Diagonal dominance and shear transformations

References

Samenvatting (summary in Dutch)

Curriculum Vitae

CHAPTER 0

INTRODUCTION

0.0. Introductory remarks

Since the rise of the stored-program digital computer it has been possible to master effectively the bulk of work necessary to solve numerically the algebraic eigenvalue problem, i.e. the approximate calculation of the eigenvalues and eigenvectors of a linear transformation represented by a given matrix. The advent of this apparatus has stimulated the construction of new algorithms for this problem. As concerns the Hermitean eigenvalue problem we mention the numerically stable methods of Givens and Householder.

For the non-Hermitean problem, too, several new methods have been proposed. Since this problem can be very ill-conditioned, the construction of the latter algorithms presents serious difficulties. Inexact arithmetic, the reverse side of the computer's speed, therefore makes the problem mathematically interesting. Research on the numerical solution of the non-Hermitean eigenvalue problem is very active at present. At the moment it is not yet clear which of the algorithms proposed by several authors is preferable. The QR-algorithm, developed by Francis in 1961-1962, attracts much attention and inspires confidence as to speed of convergence and accuracy of the results.

The method for the non-Hermitean eigenvalue problem which we present in this book is of what is known as the Jacobi-like type, i.e. an extension of the classical Jacobi-method to non-normal matrices.

The Jacobi-algorithm is based on the use of rotations, the original matrix $A = A_0$ being recursively transformed into matrices $A_1, A_2, \ldots$, which tend to a diagonal form. In each step of the process the plane rotation is chosen to minimize the sum of the squares of the moduli of the non-diagonal elements. In principle, each normal matrix $A$ can be transformed into a diagonal form by these unitary Jacobi-transformations; the Euclidean norm of the matrix $A$ is invariant under these transformations and for normal matrices this norm equals $\left(\sum_{j=1}^{n} |\lambda_j|^2\right)^{1/2}$, where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $A$. For non-normal matrices

$$\|A\|_E^2 > \sum_{j=1}^{n} |\lambda_j|^2$$

($\|A\|_E := \left(\sum_{i,j=1}^{n} |a_{ij}|^2\right)^{1/2}$ is called the Euclidean norm of $A$); hence it is not possible to transform these matrices unitarily into diagonal form, and so the Jacobi-method is fruitless to achieve this end.

In 1962, Eberlein [4] suggested the use of non-unitary plane transformations in order to diminish the Euclidean norms of the matrices in the sequence thus obtained. It is not impossible that this may lead to diagonalization of non-normal matrices since [22]

$$\inf_{T \text{ regular}} \|T^{-1} A T\|_E^2 = \sum_{j=1}^{n} |\lambda_j|^2 .$$

In the first part of this thesis (chapters 1, 2, 3 and 5) we construct and investigate an algorithm to normalize non-normal matrices. In chapters one and two an algorithm is described to reduce (in some sense optimally) the Euclidean norm of a real, respectively complex, non-normal matrix by a plane non-unitary transformation. In chapter three we prove that the sequence $\{A_k\}$, generated by the successive application of this algorithm, converges to the class of normal matrices with the same eigenvalues as $A_0$, and, finally, in chapter five we show that computation of these plane non-unitary similarity transformations can be performed in a stable way.

In the second part (chapter four) an algorithm is described by which, using unitary plane transformations, an almost normal matrix (let us say the result of our norm-reducing process) can be transformed into almost diagonal form. From the diagonal elements of this form we may read approximations of the eigenvalues we have aimed at.

0.1. Notations, definitions and elementary theorems

0.1.1. We begin our preliminaries with the definition of a normal transformation and a list of well-known theorems concerning normal matrices.

Let $\mathcal{A}$ be a linear transformation on $R_n$ to $R_n$, where $R_n$ is a unitary or a Euclidean space of dimension $n$, and let $\mathcal{A}^*$ be the adjoint of $\mathcal{A}$.

Definition 0.1. $\mathcal{A}$ is normal $\iff \mathcal{A}^*\mathcal{A} = \mathcal{A}\mathcal{A}^*$.

Let $A$ be the matrix representation of $\mathcal{A}$ on some orthonormal basis of $R_n$; the conjugate transpose $A^*$ of $A$ is the matrix representation of $\mathcal{A}^*$ on the same basis.

In the sequel we deal with square matrices of order $n$ over the field of complex numbers, unless mentioned otherwise. The eigenvalues of the matrix $A$ will be denoted by $\lambda_j$ $(j = 1, 2, \ldots, n)$, where $\lambda_j = \mu_j + i\nu_j$.

Definition 0.2. $A$ is normal $\iff A^*A = AA^*$.

Theorem 0.1. The matrix representation $A$ of $\mathcal{A}$ on an orthonormal basis is normal if and only if $\mathcal{A}$ is normal ([21], p.56).

Theorem 0.2. A matrix $A$ is normal if and only if $A$ is unitarily similar to a diagonal matrix ([21], p.165).

Theorem 0.3. A real matrix $A$ is normal if and only if $A$ is orthogonally similar to a matrix that is the direct sum of matrices of the form $(\lambda)$, where $\lambda$ is a real eigenvalue of $A$, and of $2 \times 2$ matrices

$$\begin{pmatrix} \operatorname{Re}(\lambda) & \operatorname{Im}(\lambda) \\ -\operatorname{Im}(\lambda) & \operatorname{Re}(\lambda) \end{pmatrix},$$

where $\lambda$ is a complex eigenvalue of $A$. This direct sum is called Murnaghan's canonical form of $A$.

Theorem 0.4. Let $A$ be a normal matrix. The real parts of the eigenvalues of $A$ are eigenvalues of the Hermitean part $\tfrac{1}{2}(A + A^*)$ of $A$, and the imaginary parts of the eigenvalues of $A$ are eigenvalues of $\tfrac{1}{2i}(A - A^*)$, the skew-Hermitean part of $A$ divided by $i$.

Proof. Since $A$ is normal, there exists a unitary matrix $U$ so that $U^*AU = \operatorname{diag}(\mu_j + i\nu_j)$, where $\mu_j + i\nu_j$ $(j = 1, \ldots, n)$ are the eigenvalues of $A$. We see that

$$U^* \, \frac{A + A^*}{2} \, U = \operatorname{diag}(\mu_j), \quad j = 1, \ldots, n,$$

and

$$U^* \, \frac{A - A^*}{2i} \, U = \operatorname{diag}(\nu_j), \quad j = 1, 2, \ldots, n. \qquad \Box$$

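0.1.2.

Theorem 0.5. (Schur's lemma). For any matrix $A$ there exists a unitary matrix $U$ for which holds $U^*AU = T$, where $T$ is of upper triangular form; $T$ is diagonal if and only if $A$ is normal ([21], p. 158).

Theorem 0.6. For any real matrix $A$ of order $n$ with $k$ pairs of complex conjugate eigenvalues

$$\mu_\ell \pm i\nu_\ell \quad (\nu_\ell > 0,\ 1 \le \ell \le k,\ 0 \le k \le [n/2])$$

and $n - 2k$ real eigenvalues $\mu_j$ $(j = 2k+1, \ldots, n)$ there exists an orthogonal matrix $Q$ so that $Q^TAQ$ has the following block upper triangular form

$$Q^TAQ = \begin{pmatrix}
\mu_1 & -\nu_1^2/\alpha_1 & u_{13} & \cdots & & u_{1n} \\
\alpha_1 & \mu_1 & u_{23} & \cdots & & u_{2n} \\
 & & \ddots & & & \vdots \\
 & & & \mu_{2k+1} & \cdots & \\
 & 0 & & & \ddots & \vdots \\
 & & & & & \mu_n
\end{pmatrix},$$

with one $2 \times 2$ diagonal block $\begin{pmatrix} \mu_\ell & -\nu_\ell^2/\alpha_\ell \\ \alpha_\ell & \mu_\ell \end{pmatrix}$ for each pair of complex conjugate eigenvalues, and the elements $\alpha_\ell$ of $Q^TAQ$ are positive.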

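Proof. In the proof of Schur's lemma we have to add the part of the inductive proof corresponding to a pair of complex conjugate eigenvalues. Let $x_1 \pm i y_1$ be eigenvectors corresponding to $\mu_1 \mp i\nu_1$. Since $\nu_1 \neq 0$, $x_1$ and $y_1$ are linearly independent and we have

$$A\,[x_1 : y_1] = [x_1 : y_1] \begin{pmatrix} \mu_1 & -\nu_1 \\ \nu_1 & \mu_1 \end{pmatrix}.$$

Let

$$[u_1 : u_2] = [x_1 : y_1] \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix},$$

i.e. $u_1 = x_1\cos\varphi + y_1\sin\varphi$, $u_2 = -x_1\sin\varphi + y_1\cos\varphi$. Then

$$A\,[u_1 : u_2] = [u_1 : u_2] \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix} \begin{pmatrix} \mu_1 & -\nu_1 \\ \nu_1 & \mu_1 \end{pmatrix} \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix} = [u_1 : u_2] \begin{pmatrix} \mu_1 & -\nu_1 \\ \nu_1 & \mu_1 \end{pmatrix}$$

and

$$u_1^T u_2 = \tfrac{1}{2}\left(y_1^T y_1 - x_1^T x_1\right)\sin 2\varphi + x_1^T y_1 \cos 2\varphi .$$

If $x_1^T y_1 = 0$ then we take $\varphi = 0$, otherwise we determine $\varphi$ by

$$\cot 2\varphi = \frac{x_1^T x_1 - y_1^T y_1}{2\, x_1^T y_1}.$$

Taking either value of $\varphi$, $u_1^T u_2 = 0$. Finally, to obtain orthonormal vectors, $u_1$ and $u_2$ have to be standardized to length one. Let $v_1 := \|u_1\|_2^{-1} u_1$, $v_2 := \|u_2\|_2^{-1} u_2$. Then $\|v_1\|_2 = \|v_2\|_2 = 1$, $v_1^T v_2 = 0$ and

$$A\,[v_1 : v_2] = [v_1 : v_2] \begin{pmatrix} \mu_1 & -\nu_1/\alpha \\ \alpha\nu_1 & \mu_1 \end{pmatrix}, \tag{0.1.1}$$

where $\alpha = \|u_2\|_2 / \|u_1\|_2$.

Let $v_1$ and $v_2$ be the first two columns of an orthogonal matrix $P_1$; let $B := P_1^{-1} A P_1$. Then

$$B = P_1^T A P_1 = \begin{pmatrix}
\mu_1 & -\nu_1^2/\alpha_1 & \beta_{13} & \cdots & \beta_{1n} \\
\alpha_1 & \mu_1 & \beta_{23} & \cdots & \beta_{2n} \\
0 & 0 & \beta_{33} & \cdots & \beta_{3n} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & \beta_{n3} & \cdots & \beta_{nn}
\end{pmatrix}$$

with $\alpha_1 = \alpha\nu_1 > 0$.

For the rest, the formal inductive proof of the theorem is analogous to that of Schur's lemma. $\Box$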

Definition 0.3. $\|A\|_E := \left(\sum_{i,j=1}^{n} |a_{ij}|^2\right)^{1/2}$ is called the Euclidean norm of $A$.

Theorem 0.7. The Euclidean norm $\|A\|_E$ of $A$ is invariant relative to unitary similarity transformations, i.e.

$$U^*U = I \implies \|U^*AU\|_E = \|A\|_E .$$

Theorem 0.8. Let $A$ be a complex matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then

$$\|A\|_E^2 \ge \sum_{i=1}^{n} |\lambda_i|^2 . \tag{0.1.2}$$

Equality holds in (0.1.2) if and only if $A$ is normal.

Proof. This is a direct consequence of the theorems 0.5 and 0.7.

Definition 0.4. $\Delta(A) := \left(\|A\|_E^2 - \sum_{i=1}^{n} |\lambda_i|^2\right)^{1/2}$.

$\Delta(A)$ is called the departure of normality of $A$ [14]. The departure of normality, as the Euclidean norm, is invariant relative to unitary transformation of $A$.
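As a small numerical illustration of Definition 0.4, the following sketch evaluates $\Delta(A)$ for $2 \times 2$ matrices only, using the closed-form roots of the characteristic polynomial. The function names and the two sample matrices are illustrative assumptions and do not come from the thesis.

```python
import cmath

def frobenius_sq(a):
    """Squared Euclidean norm: sum of |a_ij|^2 over all entries."""
    return sum(abs(x) ** 2 for row in a for x in row)

def eigenvalues_2x2(a):
    """Both eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def departure_from_normality(a):
    """Delta(A) = sqrt(||A||_E^2 - sum |lambda_j|^2); clipped at 0 against rounding."""
    lam = eigenvalues_2x2(a)
    return max(frobenius_sq(a) - sum(abs(x) ** 2 for x in lam), 0.0) ** 0.5

# Hypothetical sample matrices (not taken from the thesis).
normal = [[1.0, 2.0], [-2.0, 1.0]]     # A A^T = A^T A, eigenvalues 1 +/- 2i
non_normal = [[1.0, 4.0], [0.0, 2.0]]  # triangular, eigenvalues 1 and 2

delta_normal = departure_from_normality(normal)
delta_non_normal = departure_from_normality(non_normal)
```

For the triangular sample the departure is the Euclidean length of the strictly upper triangular part, in agreement with the proof of theorem 0.10.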

Theorem 0.9. For any matrix $A$

$$\inf_{T \text{ regular}} \Delta(T^{-1}AT) = 0 .$$

The infimum is assumed if and only if $A$ is diagonalizable [22].

Theorem 0.10. For each matrix $A$ there exists a normal matrix $N$ which has the same eigenvalues as $A$ and is such that $\|A - N\|_E \le \Delta(A)$.

Proof. According to Schur's lemma there exists a unitary matrix $U$ for which holds

$$U^*AU = \operatorname{diag}(\lambda_j) + T,$$

where $T$ is strictly upper triangular. Then $A = N + UTU^*$, where $N = U \operatorname{diag}(\lambda_j) U^*$. This matrix $N$ is normal and

$$\|A - N\|_E = \|UTU^*\|_E = \|T\|_E = \Delta(A). \qquad \Box$$

Corollary. Let $\Lambda = \operatorname{diag}(\lambda_j)$, $\lambda_j$ being the eigenvalues of $A$. Then there exists a unitary matrix $U$ for which holds $\|U^*AU - \Lambda\|_E \le \Delta(A)$.

Theorem 0.11. For any real matrix $A$ of order $n$ there exists a real normal matrix $N$ of order $n$ for which holds that $A$ and $N$ have the same eigenvalues and $\|A - N\|_E \le \Delta(A)$.

Proof. Let $Q^TAQ$ be the block upper triangular matrix indicated in theorem 0.6. Then $Q^TAQ = M + T$, where

$$M = \begin{pmatrix} \mu_1 & -\nu_1 \\ \nu_1 & \mu_1 \end{pmatrix} \oplus \cdots \oplus \begin{pmatrix} \mu_k & -\nu_k \\ \nu_k & \mu_k \end{pmatrix} \oplus (\mu_{2k+1}) \oplus \cdots \oplus (\mu_n),$$

$$T = \left[ \begin{pmatrix} 0 & \nu_1 - \nu_1^2/\alpha_1 \\ \alpha_1 - \nu_1 & 0 \end{pmatrix} \oplus \cdots \oplus \begin{pmatrix} 0 & \nu_k - \nu_k^2/\alpha_k \\ \alpha_k - \nu_k & 0 \end{pmatrix} \oplus 0_{n-2k} \right] + U,$$

with $U$ an upper triangular matrix of which $u_{2i-1,2i} = 0$, $i = 1, 2, \ldots, k$.

Thus $Q^TAQ$ is the sum of a Murnaghan canonical form $M$ and a perturbation matrix $T$. Then

$$A = QMQ^T + QTQ^T = N + P. \tag{0.1.3}$$

The matrices $A$ and $N$ have the same eigenvalues and $N$ is a normal matrix. Since the Euclidean norm and the departure of normality of a matrix are invariant relative to unitary similarity transformations, we have

$$\|A - N\|_E^2 = \|T\|_E^2 = \|Q^TAQ\|_E^2 - \|M\|_E^2 - 2(M,T)_E = \Delta^2(A) - 2(M,T)_E, \tag{0.1.4}$$

where $(M,T)_E$ is the inner product of the matrices $M$ and $T$ (considered as elements of the $n^2$-space) corresponding to the Euclidean norm. As we see from the matrices $M$ and $T$, written in full above,

$$\|A - N\|_E^2 = \Delta^2(A) - 2\sum_{\ell=1}^{k} (\alpha_\ell - \nu_\ell)^2 \, \nu_\ell / \alpha_\ell \le \Delta^2(A). \qquad \Box$$

Corollary. Let $M$ be Murnaghan's canonical form corresponding to the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of the real matrix $A$. Then there exists an orthogonal matrix $Q$ for which holds $\|Q^TAQ - M\|_E \le \Delta(A)$.

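Definition 0.5. Let $A_k$ $(k = 0, 1, \ldots)$ be similar to $A = A_0$ and $\mathcal{N}(A)$ the class of normal matrices with the same eigenvalues as $A$. The sequence $\{A_k\}$ is said to converge to normality (or to converge to $\mathcal{N}(A)$) if there exists a sequence $\{N_k\}$, where $N_k \in \mathcal{N}(A)$ $(k = 0, 1, \ldots)$, so that

$$\lim_{k \to \infty} \|A_k - N_k\|_E = 0 .$$

Theorem 0.12. Let $\{A_k\}$ be a sequence of similar matrices. It converges to normality if and only if $\lim_{k \to \infty} \Delta(A_k) = 0$.

Proof. The sufficiency follows immediately from theorem 0.10 (for real matrices from theorem 0.11).

The convergence to normality of $\{A_k\}$ implies that there exists a sequence $\{N_k\}$, $N_k \in \mathcal{N}(A)$, so that $\|A_k - N_k\|_E \to 0$. Then there exists a sequence of unitary matrices $\{V_k\}$ for which holds that

$$V_k^* N_k V_k = \Lambda = \operatorname{diag}(\lambda_j), \qquad V_k^* A_k V_k = \Lambda + E_k, \qquad \|E_k\|_E \to 0,$$

where $E_k := V_k^*(A_k - N_k)V_k = (e^{(k)}_{j\ell})$. Then

$$\|A_k\|_E^2 = \|V_k^* A_k V_k\|_E^2 = \|\Lambda + E_k\|_E^2 = \sum_{j=1}^{n} |\lambda_j + e^{(k)}_{jj}|^2 + \sum_{j \ne \ell} |e^{(k)}_{j\ell}|^2 .$$

Hence

$$\lim_{k \to \infty} \|A_k\|_E^2 = \sum_{j=1}^{n} |\lambda_j|^2, \quad \text{i.e.} \quad \lim_{k \to \infty} \Delta(A_k) = 0. \qquad \Box$$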

0.1.3.

In contrast to $\Delta(A)$, the measure of non-normality of the matrix $A$ which we define in this subsection is effectively computable.

Definition 0.6. $C(A) := A^*A - AA^*$. The matrix $C(A)$ is called the commutator of $A$.

Theorem 0.13. Let $\Delta(A)$ be the departure of normality of the matrix $A$ and $C(A)$ the commutator of this matrix. Then [14]

$$\Delta^2(A) \le \left[(n^3 - n)/12\right]^{1/2} \|C(A)\|_E \tag{0.1.5}$$

and if $A \neq 0$ then [5]

(0.1.6)

Corollary. Let $\{A_k\}$ be a sequence of similar matrices. It converges to normality if and only if $C(A_k) \to 0$.
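The commutator of Definition 0.6 is directly computable from the entries of $A$, which the following sketch illustrates together with a numerical check of the upper bound (0.1.5). It is a minimal example under stated assumptions: the helper names and the sample matrix are hypothetical, and $\Delta^2(A)$ is read off from the diagonal only because the sample matrix is triangular, so that its eigenvalues are its diagonal elements.

```python
def conj_transpose(a):
    """Conjugate transpose A* of a square matrix given as a list of rows."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a):
    """C(A) = A*A - AA*."""
    ah = conj_transpose(a)
    left, right = matmul(ah, a), matmul(a, ah)
    n = len(a)
    return [[left[i][j] - right[i][j] for j in range(n)] for i in range(n)]

def euclidean_norm(a):
    return sum(abs(x) ** 2 for row in a for x in row) ** 0.5

# Hypothetical upper triangular sample: eigenvalues are the diagonal elements.
a = [[1.0, 4.0], [0.0, 2.0]]
n = len(a)
c = commutator(a)

# Delta^2(A) = ||A||_E^2 - sum |lambda_j|^2 (valid here because A is triangular).
delta_sq = euclidean_norm(a) ** 2 - sum(abs(a[i][i]) ** 2 for i in range(n))

# Right-hand side of (0.1.5).
bound = ((n ** 3 - n) / 12) ** 0.5 * euclidean_norm(c)

# A normal matrix has a vanishing commutator.
c_normal = commutator([[1.0, 2.0], [-2.0, 1.0]])
```

For this sample $\Delta^2(A) = 16$, while the bound evaluates to $\sqrt{272} \approx 16.49$, so the inequality holds with little slack.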

Finally, we define here some notions which are used in the description of Jacobi-like algorithms.

Definition 0.7. A shear matrix $T_{\ell m}$ is a non-singular matrix which differs from the unit matrix $I$ only in one of its two-dimensional submatrices. In that one submatrix the elements are $t_{\ell\ell}$, $t_{\ell m}$, $t_{m\ell}$ and $t_{mm}$. The indices $\ell$ and $m$, $1 \le \ell < m \le n$, are called the pivot-pair of $T_{\ell m}$ and the elements $t_{\ell\ell}$, $t_{\ell m}$, $t_{m\ell}$ and $t_{mm}$ are called the Jacobi-parameters of $T_{\ell m}$. The class of shear matrices with pivot-pair $(\ell, m)$ will be denoted by $\mathcal{T}_{\ell m}$. $\mathcal{S}_{\ell m}$, $\mathcal{O}_{\ell m}$ and $\mathcal{U}_{\ell m}$ are the classes of shear matrices with pivots $\ell$ and $m$ which are unimodular (i.e. $|\det(T_{\ell m})| = 1$), orthogonal and unitary respectively.

Definition 0.8. The matrix

$$\begin{pmatrix} a_{\ell\ell} & a_{\ell m} \\ a_{m\ell} & a_{mm} \end{pmatrix}$$

will be called the $(\ell, m)$-restriction of $A$.

Definition 0.9. $S(A) := \left(\sum_{i \ne j} |a_{ij}|^2\right)^{1/2}$.

$S(A)$ will be called the departure of diagonal form of $A$.

0.2. A survey of Jacobi-like algorithms

In a Jacobi-like procedure for the computation of the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of a matrix $A$ of order $n$ a sequence $A = A_0, A_1, A_2, \ldots$ is constructed in which the matrices $A_k = (a^{(k)}_{ij})$ are defined by the relation

$$A_{k+1} := T_k^{-1} A_k T_k \quad (k = 0, 1, 2, \ldots).$$

The matrix $T_k$ is a shear matrix with pivot-pair $(\ell_k, m_k)$ and Jacobi-parameters

$$t^{(k)}_{\ell_k,\ell_k} = p_k, \quad t^{(k)}_{\ell_k,m_k} = q_k, \quad t^{(k)}_{m_k,\ell_k} = r_k, \quad t^{(k)}_{m_k,m_k} = s_k .$$

The indices $\ell_k$ and $m_k$, $1 \le \ell_k < m_k \le n$, constitute the pivot-pair of the $k$-th iteration of the Jacobi-like process. The choice of the successive pivot-pairs $(\ell_k, m_k)$ is called the pivot-strategy of the process. In several Jacobi-like processes the pivot-pairs are selected in some cyclic order. We mention especially the serial pivot-strategy indicated by the scheme:

$$(\ell_0, m_0) = (1, 2), \qquad
(\ell_{k+1}, m_{k+1}) = \begin{cases}
(\ell_k, m_k + 1) & \text{if } \ell_k < n-1,\ m_k < n, \\
(\ell_k + 1, \ell_k + 2) & \text{if } \ell_k < n-1,\ m_k = n, \\
(1, 2) & \text{if } \ell_k = n-1,\ m_k = n.
\end{cases} \tag{0.2.1}$$

The method of Jacobi ([16], 1846) is one of the few efficient methods of solving the Hermitean problem which existed before 1950. After its rediscovery in the late forties several modifications and generalizations of this method have been proposed.
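The serial pivot-strategy of rule (0.2.1) is a simple recurrence on the pivot-pair, and a short sketch may make the row-by-row ordering explicit. The function names below are illustrative assumptions; indices are 1-based as in the text.

```python
def next_pivot(l, m, n):
    """Successor of the pivot-pair (l, m) under the serial rule (0.2.1), 1-based."""
    if l < n - 1 and m < n:
        return (l, m + 1)        # move right within row l
    if l < n - 1 and m == n:
        return (l + 1, l + 2)    # start the next row just above the diagonal
    return (1, 2)                # (l, m) = (n-1, n): the sweep starts again

def serial_sweep(n):
    """All n(n-1)/2 pivot-pairs of one sweep, starting from (1, 2)."""
    pairs = [(1, 2)]
    while len(pairs) < n * (n - 1) // 2:
        pairs.append(next_pivot(*pairs[-1], n))
    return pairs

sweep4 = serial_sweep(4)
```

For $n = 4$ one sweep visits $(1,2), (1,3), (1,4), (2,3), (2,4), (3,4)$, after which the rule returns to $(1,2)$.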

1. Hermitean matrices

In the Jacobi-procedure for the computation of the eigenvalues of a Hermitean matrix the shear matrices $T_k$ are unitary and the Jacobi-parameters $p_k$, $q_k$, $r_k$ and $s_k$ are chosen to minimize the departure of diagonal form $S(A_{k+1})$; to this end the element $a^{(k+1)}_{\ell_k,m_k}$ of $A_{k+1}$ is annihilated. Therefore the decrease of the squared departure of diagonal form, $S^2(A_k) - S^2(A_{k+1})$, equals $2|a^{(k)}_{\ell_k,m_k}|^2$.

a) In the classical Jacobi-process ([9], [12]) the pivot-pair $(\ell_k, m_k)$ is chosen so that

$$|a^{(k)}_{\ell_k,m_k}| = \max_{i < j} |a^{(k)}_{ij}| .$$

b) In the serial Jacobi-process the pivots are chosen in conformity with rule (0.2.1) [13].

c) In the serial Jacobi-method with threshold $t$ the pivots run serially through all superdiagonal positions of the matrix, except those for which $|a^{(k)}_{ij}| < t$ ([24], [29]). If all off-diagonal elements of $A_k$ are smaller in modulus than $t$, then the threshold $t$ is lowered.

Each of these pivot-strategies gives rise to a convergent process: if the values of the Jacobi-parameters for which $a^{(k+1)}_{\ell_k,m_k} = 0$ are chosen in a reasonable way, then $\lim_{k \to \infty} A_k = \operatorname{diag}(\lambda_j)$ [s]. Moreover, the asymptotic convergence is quadratic ([17], [18], [33]).
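To make the annihilation step concrete, the following sketch performs one step of the classical Jacobi-process on a real symmetric matrix and lets one check that the squared departure of diagonal form drops by exactly $2|a_{\ell m}|^2$. It is a minimal illustration restricted to the real symmetric case; the function names, the 0-based indexing and the sample matrix are assumptions of this example, and the rotation angle is one of the admissible choices.

```python
import math

def off_diag_sq(a):
    """Squared departure of diagonal form S^2(A)."""
    n = len(a)
    return sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)

def classical_pivot(a):
    """Pivot-pair (l, m), l < m, of the off-diagonal element of largest modulus."""
    n = len(a)
    return max(((i, j) for i in range(n) for j in range(i + 1, n)),
               key=lambda ij: abs(a[ij[0]][ij[1]]))

def jacobi_rotate(a, l, m):
    """Return R^T A R, where the plane rotation R annihilates element (l, m)."""
    n = len(a)
    # tan(2*theta) = 2*a_lm / (a_ll - a_mm); atan2 also covers a_ll == a_mm.
    theta = 0.5 * math.atan2(2.0 * a[l][m], a[l][l] - a[m][m])
    c, s = math.cos(theta), math.sin(theta)
    r = [[float(i == j) for j in range(n)] for i in range(n)]
    r[l][l], r[l][m], r[m][l], r[m][m] = c, -s, s, c
    ar = [[sum(a[i][k] * r[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(r[k][i] * ar[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical symmetric sample matrix.
a = [[4.0, 1.0, 2.0], [1.0, 3.0, 0.0], [2.0, 0.0, 1.0]]
l, m = classical_pivot(a)
b = jacobi_rotate(a, l, m)
decrease = off_diag_sq(a) - off_diag_sq(b)
```

Repeating the two calls `classical_pivot` and `jacobi_rotate` on the result gives the classical process of item a); replacing the pivot choice by a serial ordering gives item b).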

2. Normal matrices

In the extension of the Jacobi-procedure to the case of normal matrices, as proposed by Goldstine and Horwitz [10], the shear matrices $T_k$ are unitary and for each $k$ the Jacobi-parameters of $T_k$ are chosen so as to minimize

$$S^2(A_{k+1}) = \sum_{i \ne j} |a^{(k+1)}_{ij}|^2 .$$

Although at each step the decrease of the departure of diagonal form is optimal, Voyevodin has exhibited a class of matrices for which, independently of the pivot-strategy, this process is stationary before the diagonal form is reached [32]. To prevent stationarity, Goldstine and Horwitz have modified their algorithm, and they have shown that with this modified algorithm $\lim_{k \to \infty} A_k = \operatorname{diag}(\lambda_j)$. Ruhe [25] has proved that if such modifications are superfluous, the convergence is quadratic.

3. Triangularization by unitary shears

Greenstadt [11] (1955) and Lotkin [19] (1956) have generalized the Jacobi-procedure to arbitrary matrices. Their algorithms, according to a suggestion made by J. von Neumann, are based on Schur's lemma (theorem 0.5): for each matrix $A$ there exists a unitary matrix $U$ for which holds that $U^*AU$ is triangular. In both generalizations the shear matrices $T_k$ are unitary. Greenstadt determines the Jacobi-parameters at the $k$-th stage of the process in such a way that $a^{(k+1)}_{\ell_k,m_k} = 0$; Lotkin, on the contrary, determines them so that $\sum_{i<j} |a^{(k+1)}_{ij}|^2$ is minimal. For some matrices, however, the sequences $\{A_k\}$ generated by these methods are not convergent whatever pivot-strategy is used ([1], [2]).

4. Norm-reducing by non-unitary shears

In 1962 in her paper "A Jacobi-like Method for the Automatic Computation of Eigenvalues and Eigenvectors of an Arbitrary Matrix" [4] P.J. Eberlein introduces a norm-reducing process by transformations with non-unitary shears $T_k$. The underlying idea is that (in conformity with theorem 0.9)

$$\inf_{T \text{ regular}} \|T^{-1}AT\|_E^2 = \sum_{j=1}^{n} |\lambda_j|^2 . \tag{0.2.2}$$

Moreover, if and only if $A$ is diagonalizable, there exists a non-singular matrix $T$ such that $T^{-1}AT$ is normal; in that case the infimum in (0.2.2) is assumed. So the aim is to construct recursively a sequence $A_0 = A, A_1, A_2, \ldots$, where $A_{k+1} := T_k^{-1} A_k T_k$, so that

$$\lim_{k \to \infty} \|A_k\|_E^2 = \sum_{j=1}^{n} |\lambda_j|^2 .$$

In terms of definition 0.5 and according to theorem 0.12 this sequence $\{A_k\}$ converges to normality.
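A tiny worked example may help to see why the infimum in (0.2.2) need not be assumed. The sketch below applies the diagonal similarity $T = \operatorname{diag}(1, t)$ to a $2 \times 2$ Jordan block, which is not diagonalizable; the squared norm $2 + t^2$ approaches $\sum |\lambda_j|^2 = 2$ as $t \to 0$ without ever reaching it. The function names and the choice of a diagonal $T$ are assumptions made for illustration only and are not the shears used in the thesis.

```python
def scaled_similarity(a, t):
    """T^{-1} A T for T = diag(1, t), t != 0, and a 2x2 matrix a."""
    return [[a[0][0], a[0][1] * t], [a[1][0] / t, a[1][1]]]

def norm_sq(a):
    """Squared Euclidean norm."""
    return sum(abs(x) ** 2 for row in a for x in row)

jordan = [[1.0, 1.0], [0.0, 1.0]]   # double eigenvalue 1, not diagonalizable
eigen_sum = 2.0                     # |1|^2 + |1|^2

norms = [norm_sq(scaled_similarity(jordan, t)) for t in (1.0, 0.1, 0.01, 0.001)]
```

The list `norms` decreases towards `eigen_sum` while every entry stays strictly above it, which is the behaviour a norm-reducing sequence exhibits on a defective matrix.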

a) In Eberlein's method $A_{k+1}$ is produced from $A_k$ in two steps:

$$\tilde{A}_k := R_k^{-1} A_k R_k, \qquad A_{k+1} := \tilde{R}_k^{-1} S_k^{-1} \tilde{A}_k S_k \tilde{R}_k \tag{0.2.3}$$

with $R_k$ and $\tilde{R}_k$ unitary shears and $S_k$ a norm-reducing non-unitary shear, all having the same pivot-pair $(\ell_k, m_k)$. $R_k$ is chosen

such that

C

0 n c where

,..k,A;k mk,~

c

This pre-treatment facilitates the construction of a suitable norm-reducing $S_k$. The Jacobi-parameters of $S_k$ are chosen in the following way:

$$\begin{pmatrix} p_k & q_k \\ r_k & s_k \end{pmatrix} = \begin{pmatrix} \cosh\varphi_k & i e^{i\psi_k} \sinh\varphi_k \\ -i e^{-i\psi_k} \sinh\varphi_k & \cosh\varphi_k \end{pmatrix}, \qquad \varphi_k \text{ and } \psi_k \text{ real}.$$

In order to minimize $\|A_{k+1}\|_E$, considered as a function of $\varphi_k$ and $\psi_k$, two simultaneous quartic equations have to be solved. Since this is not easy, Eberlein gives an approximation $(\tilde{\varphi}_k, \tilde{\psi}_k)$ of the solution $(\varphi_k, \psi_k)$. The norm-reducing shears $S_k$ corresponding to $(\tilde{\varphi}_k, \tilde{\psi}_k)$, in combination with the plane rotations $R_k$, suffice (independently of the choice of the unitary shear $\tilde{R}_k$) to obtain a sequence $\{A_k\}$ which converges to normality, provided the pivot-pairs $(\ell_k, m_k)$ are chosen appropriately.

b) In Rutishauser's norm-reducing algorithm [28] the transformation with $T_k$ is also performed in two steps as in (0.2.3). The unitary shear $R_k$ annihilates the element $\tilde{c}_{\ell_k,m_k}$ of

$$\tilde{C} = \tilde{A}_k^* \tilde{A}_k - \tilde{A}_k \tilde{A}_k^* .$$

The non-unitary shear $S_k$ is a diagonal matrix, which scales $\tilde{A}_k$ in the following manner. If $|\tilde{c}_{\ell_k,\ell_k}| \ge |\tilde{c}_{m_k,m_k}|$, then the lengths of the $\ell$-th column and the $\ell$-th row are made equal, else this operation is applied to the $m$-th column and the $m$-th row. Rutishauser states that the sequence $\{A_k\}$ obtained in this way converges to normality.

c) Voyevodin [31] proposes to use the following Jacobi-parameters.

The free parameter is chosen to minimize $\|T_k^{-1} A_k T_k\|_E$. The convergence to normality has been proved.

d) Osborne's equilibration [23], too, is based on the principle of norm-reduction. The aim of Osborne's algorithm is to improve the condition of the eigenvalue problem. In each step of the process the Euclidean lengths of a certain row and its corresponding column of the transformed matrix are made equal. Osborne has proved that if $A$ is irreducible, then there exists a non-singular diagonal matrix $D$ such that the diagonal elements of $C(D^{-1}AD)$ are zero. With sequential pivoting the process mentioned above constructs iteratively such a diagonal matrix $D$. The matrix $D^{-1}AD$ is called equilibrated. In the class of matrices similar to $A$ by diagonal transformation, the equilibrated matrix $D^{-1}AD$ has a minimum Euclidean norm.
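The equilibration step of item d) can be sketched in a few lines: for each index in turn, the column and the row (diagonal element excluded) are rescaled so that their Euclidean lengths coincide. The following is a minimal sketch for real matrices under stated assumptions, not Osborne's published procedure: the function name, the fixed number of sweeps and the sample matrix are hypothetical, and no convergence test is included.

```python
import math

def equilibrate(a, sweeps=20):
    """Repeatedly equalize row and column lengths by diagonal similarity D^{-1} A D."""
    n = len(a)
    a = [row[:] for row in a]
    for _ in range(sweeps):
        for i in range(n):
            col = math.sqrt(sum(a[k][i] ** 2 for k in range(n) if k != i))
            row = math.sqrt(sum(a[i][k] ** 2 for k in range(n) if k != i))
            if col == 0.0 or row == 0.0:
                continue          # nothing to balance for this index
            f = math.sqrt(row / col)
            for k in range(n):
                a[k][i] *= f      # column i of D^{-1} A D is multiplied by f
            for k in range(n):
                a[i][k] /= f      # row i of D^{-1} A D is divided by f
    return a

def norm_sq(a):
    return sum(x * x for row in a for x in row)

# Hypothetical badly scaled sample matrix.
a = [[1.0, 100.0], [0.01, 2.0]]
b = equilibrate(a)
```

Each single rescaling replaces the contribution $c^2 + r^2$ of a column of length $c$ and a row of length $r$ by $2cr$, so the Euclidean norm never increases; the diagonal elements are left unchanged up to rounding.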

5. Diagonalization and combination of norm-reduction and diagonalization

If $\Delta(A)$ is small in relation to $\|A\|_E$ then the matrix $A$ is called almost normal. According to the corollary of theorem 0.10 such an almost normal matrix can be unitarily transformed into an almost diagonal matrix. The "almost-diagonalization" of an almost normal matrix is the second stage of a Jacobi-like process for arbitrary matrices and follows the process of norm-reduction. In Eberlein's process the diagonalization is already promoted during the norm-reduction stage. For that purpose the unitary shear $\tilde{R}_k$ in (0.2.3) is chosen such that the departure of diagonal form of $\tilde{R}_k^{-1}(S_k^{-1} \tilde{A}_k S_k)\tilde{R}_k$ is minimal. Voyevodin [32] shows that the global convergence to diagonal form of the sequence $\{A_k\}$, obtained with these $\{\tilde{R}_k\}$, cannot be proved since the Eberlein algorithm is a generalization of the Goldstine-Horwitz procedure. Ruhe [26] has shown that if the sequence $\{A_k\}$, generated by Eberlein's norm-reducing diagonalizing algorithm, converges to diagonal form, then the convergence is quadratic.

0.3. Summary

In chapter 1 we investigate the norm-reducing shear transformations on the pivot-pair $(\ell, m)$, applied to a real matrix $A$. It is shown that, in consequence of the invariance of the Euclidean norm of a matrix under orthogonal transformations, each shear in a class of what will be called row congruent shears brings about the same norm reduction. This class is determined by what will be called its Euclidean parameters $(x, y, z)$, $x > 0$, $y > 0$ (definition 1.2). With the Euclidean parameters of a shear $T_{\ell m}$ in such a class we find a simple expression describing the Euclidean norm of the transformed matrix (theorem 1.2). If the shears are restricted to be unimodular then $\|T_{\ell m}^{-1} A T_{\ell m}\|_E^2$ is a quadratic function of $x$, $y$ and $z$ defined on the hyperboloid $xy - z^2 = 1$. In theorem 1.4 it is shown that, except for a number of particular cases, this quadratic attains its infimum on the hyperboloid for finite values of $x$, $y$ and $z$.

In section 1.3 we describe the algorithm to compute the Euclidean parameters $(x, y, z)$ of the class $\mathfrak{M}_{\ell m}(A)$ of row congruent unimodular optimal norm-reducing shears (theorem 1.5). For the computation of these parameters we have to determine a real root of a quartic equation, which is uniquely localized by an inclusion theorem (lemma 1.5).

The particular case that the infimum of the quadratic on the hyperboloid is not assumed for finite values of the Euclidean parameters is fully described in section 1.4.

In section 1.5 it is shown that after optimal norm reduction by a unimodular shear on the pivot-pair $(\ell, m)$ the commutator $C'$ of the transformed matrix has the properties $c'_{\ell m} = 0$, $c'_{\ell\ell} = c'_{mm}$.

In chapter 2 we investigate the complex norm-reducing shear transformations on the pivot-pair $(\ell, m)$ applied to a complex matrix $A$. As in the real case, each shear of a class of what will be called row congruent shears brings about the same norm reduction. This class is again determined by its Euclidean parameters $(x, y, z)$, where now $x$ and $y$ are real and positive, but $z$ is complex (definition 2.2). With the Euclidean parameters of a shear $T_{\ell m}$ in such a class we find an expression, less simple than in the real case, describing $\|T_{\ell m}^{-1} A T_{\ell m}\|_E^2$ (theorem 2.2). In order to simplify this expression, the matrix $A$ is pre-treated by a unitary shear $U_{\ell m}$, so that $(U_{\ell m}^{-1} A U_{\ell m})_{m\ell} = 0$. In theorem 2.3 it is shown that if $A$ is a pre-treated matrix (i.e. $a_{m\ell} = 0$) and $T_{\ell m}$ is a unimodular shear on the pivot-pair $(\ell, m)$, then $\|T_{\ell m}^{-1} A T_{\ell m}\|_E^2$ is a quadratic function of $x$, $y$, $z$ and $\bar{z}$, defined on $xy - |z|^2 = 1$, $x > 0$, $y > 0$, $z$ complex.

As in the real case, this quadratic attains its infimum on $xy - |z|^2 = 1$ for finite values of $x$, $y$ and $z$ unless the matrix $A$ satisfies particular conditions.

In section 2.3 we describe the algorithm to compute the Euclidean parameters of the class $\mathfrak{M}_{\ell m}(A)$ of complex unimodular optimal norm-reducing shears on the pivot-pair $(\ell, m)$ (theorem 2.5).

Thanks to the pre-treatment of the original matrix, it is possible to transfer the algorithm for real matrices to the complex case. In section 2.5 it is proved that for a real matrix $A$ the Euclidean parameters of the class of complex unimodular optimal norm-reducing shears are real.

0.3.3.

In chapter 3 the convergence of the process of norm-reducing shear transformations is investigated. Let the pivot-pair $(\ell, m)$ be chosen so that

$C$ equal to $A^*A - AA^*$. Let $T_{\ell m} \in \mathfrak{M}_{\ell m}(A)$ and $A' = T_{\ell m}^{-1} A T_{\ell m}$. Then theorem 3.2 gives a lower bound for the decrease of the Euclidean norm effectuated by $T_{\ell m}$:

Our proof of this result is essentially the same as that of Eberlein [4]. Since we make use of the Euclidean parameters $(x, y, z)$ of the shears involved, our calculation of Eberlein's estimate for the optimal decrease of the Euclidean norm is considerably simpler than her own.

We use this estimate in the proof of the convergence theorem. Let $(\ell_1, m_1), (\ell_2, m_2), \ldots$ be a sequence of pivot-pairs. Let $A_0 := A$ and $A_{k+1} := T_{\ell_k,m_k}^{-1} A_k T_{\ell_k,m_k}$ $(k = 0, 1, \ldots)$, where $T_{\ell_k,m_k}$ is a unimodular optimal norm-reducing shear on the pivot-pair $(\ell_k, m_k)$. In the convergence theorem of chapter 3 we prove that if the pivot-strategy is so that for each $k$:

then the sequence $\{A_k\}$ converges to normality in the sense of definition 0.5.

0.3.4.

As a consequence of the convergence theorem of chapter 3 we find that for each $\varepsilon > 0$ and each matrix $A$ there exists an integer $k$ and a normal matrix $N$ with the same eigenvalues as $A$, so that for the matrix $A_k$ obtained after $k$ norm-reducing similarity transformations

$$\|A_k - N\|_E \le \Delta(A_k) < \varepsilon,$$

where $\Delta(A_k)$ is the departure of normality of $A_k$.

Therefore, in chapter 4 we consider a Jacobi-process which almost diagonalizes an almost normal matrix $A$.

In the first part of this process the Hermitean part of $A$ is almost diagonalized. The resulting matrix, $A_1$ (say), is shown to have an almost block diagonal structure, the Hermitean part of a diagonal block being almost a multiple of the unit matrix (lemma 4.1). As a consequence, it is shown that the skew-Hermitean parts of these diagonal blocks can be diagonalized by a second sequence of Jacobi rotations without disturbing the "almost diagonal" character of the Hermitean parts of the diagonal blocks (lemma 4.2). Let $A_2$ be the resulting matrix after this second half of our process. The departure of diagonal form $S(A_2)$ of this ultimate matrix proves to be bounded by a function of

(i) the departure of normality of the original matrix $A$;

(ii) the departure of diagonal form of the Hermitean part of the matrix $A_1$, obtained after almost diagonalizing the Hermitean part of $A$;

(iii) the departures of diagonal form of the skew-Hermitean parts of the diagonal blocks of $A_2$.

This function tends to zero if each of these quantities tends to zero (theorem 4.2).

If for real matrices we want to use only real transformations, a somewhat more complicated result is obtainable. Almost diagonalization of the symmetric part of a real almost normal matrix $A$ results in an almost block diagonal matrix which is an almost canonical form, unless, if $\mu \pm i\nu$ is a pair of complex eigenvalues of $A$, there exist yet other eigenvalue(s) with real part(s) $\mu$ (theorem 4.6).

Already in the norm-reducing stage of the eigenvalue procedure, diagonalization can be promoted by executing the norm-reducing shear transformation with that element $T_{\ell m} \in \mathfrak{M}_{\ell m}(A)$ that, moreover, minimizes the departure of diagonal form of the transformed matrix. This element will be called the diagonalizing representative of $\mathfrak{M}_{\ell m}(A)$.

The problem of the numerical stability of a single optimal norm-reducing shear transformation, executed in floating point arithmetic, is considered in chapter 5. It proves to be possible to perform the transformation with the diagonalizing representative $T_{\ell m} \in \mathfrak{M}_{\ell m}(A)$ in such a way that the actual result of the transformation equals $T_{\ell m}^{-1}(A + F)T_{\ell m} + G$, where $\|F\|_E$ is small relative to $\|A\|_E$ and $\|G\|_E$ is small relative to $\|T_{\ell m}^{-1} A T_{\ell m}\|_E$.

This result may help to explain the accuracy of the solution of the eigenvalue problem which was observed during our numerical experiments with procedures based on the algorithms described in this thesis.


CHAPTER 1

REAL NORM-REDUCING SHEARS

1.0. Introduction

In this chapter we consider one step of the norm-reducing Jacobi-like process, and we shall determine the norm-reducing unimodular shear similarity transformation for the real case. Since for one transformation the pivots $\ell$ and $m$ are fixed, we shall omit the subscripts when no ambiguity arises.

1.1. Row congruency and Euclidean parameters of a shear

Let $T_{\ell m}$ be a shear matrix with pivot-pair $(\ell,m)$ and Jacobi-parameters $p$, $q$, $r$ and $s$. So the $(\ell,m)$-restriction of $T_{\ell m}$ is

$$\begin{pmatrix} p & q \\ r & s \end{pmatrix}. \tag{1.1.1}$$

Let $d := \det(T_{\ell m})$, thus $d = ps - qr$. Since $T_{\ell m}$ is supposed to be non-singular, $T_{\ell m}^{-1}$ exists. $T_{\ell m}^{-1}$ is also a shear matrix with pivot-pair $(\ell,m)$. The $(\ell,m)$-restriction of $T_{\ell m}^{-1}$ is

$$\frac{1}{d}\begin{pmatrix} s & -q \\ -r & p \end{pmatrix}. \tag{1.1.2}$$

In this chapter we assume that the matrix A and the matrix $T_{\ell m}$, by which A is transformed, are real matrices. In the description of the Euclidean norm of $T_{\ell m}^{-1} A T_{\ell m}$, we shall try to take into account the invariance of this norm under orthogonal transformation. In particular, if $Q_{\ell m}$ is an orthogonal shear then we have for each shear $T_{\ell m}$:

$$\|Q_{\ell m}^{T} T_{\ell m}^{-1} A\, T_{\ell m} Q_{\ell m}\|_E = \|T_{\ell m}^{-1} A\, T_{\ell m}\|_E .$$

Hence the optimal norm-reducing shear is determined except for an orthogonal factor $Q_{\ell m}$.

Definition 1.1. The matrices S and T will be called row congruent if $S = TQ$ for some orthogonal matrix Q.

Theorem 1.1. S and T are row congruent if and only if $SS^T = TT^T$.

Proof. If $S = TQ$, with $QQ^T = I$, then $SS^T = TQQ^TT^T = TT^T$.

For the proof of the sufficiency of the condition, we make use of the polar factorization ([20], page 74) of the matrices S and T. Let $S = PU$, $T = RV$, where P and R are positive semi-definite matrices, U and V orthogonal matrices. Since P and R are completely determined by $SS^T$ and $TT^T$ respectively and the latter are equal, $P = R$. So

$$S = PU = RU = TV^{-1}U.$$

Hence S and T are row congruent. □

The theorem shows that the class of matrices row congruent to T is uniquely determined by the elements of $TT^T$; they parametrize the equivalence classes into which the full linear group of non-singular matrices is decomposed by row congruency.
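Theorem 1.1 can be checked numerically on a small case. The following is a minimal sketch, not part of the original text: the helper names `matmul2` and `transpose2`, the matrix `T` and the rotation angle `theta` are arbitrary illustrative choices.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose2(a):
    """Transpose of a 2x2 matrix."""
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]

# An arbitrary non-singular T and an orthogonal Q (a plane rotation).
T = [[2.0, 1.0], [-1.0, 3.0]]
theta = 0.7
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

S = matmul2(T, Q)                  # S is row congruent to T by construction
SSt = matmul2(S, transpose2(S))    # should agree with T T^T up to rounding
TTt = matmul2(T, transpose2(T))
```

Up to rounding, `SSt` and `TTt` coincide, which illustrates the necessity part of the theorem; the sufficiency part rests on the polar factorization and is not exercised here.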

Now we consider row congruency for shear matrices with pivot-pair $(\ell,m)$. If $p$, $q$, $r$ and $s$ are the Jacobi-parameters of $T_{\ell m}$ then the $(\ell,m)$-restriction of $T_{\ell m}T_{\ell m}^{T}$ is

$$\begin{pmatrix} p^2+q^2 & pr+qs \\ pr+qs & r^2+s^2 \end{pmatrix}. \tag{1.1.3}$$

Definition 1.2. The quantities

$$x := x(T_{\ell m}) := p^2+q^2, \qquad y := y(T_{\ell m}) := r^2+s^2, \qquad z := z(T_{\ell m}) := pr+qs \tag{1.1.4}$$

will be called the Euclidean parameters of $T_{\ell m}$.

According to theorem 1.1 the Euclidean parameters $(x,y,z)$ of $T_{\ell m}$ determine the class of shears row congruent to $T_{\ell m}$. This class will be denoted as $\mathcal{R}_{\ell m}(x,y,z)$.

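Lemma 1.1. The Euclidean parameters $(x,y,z)$ of a shear $T_{\ell m}$ satisfy the inequalities

$$x > 0, \qquad y > 0, \qquad xy - z^2 > 0. \tag{1.1.5}$$

Conversely, if $x$, $y$ and $z$ satisfy (1.1.5), then they determine the class $\mathcal{R}_{\ell m}(x,y,z)$ of shears on the pivot-pair $(\ell,m)$. This class has an upper and a lower triangular representative, $B_{\ell m}$ and $L_{\ell m}$, with $(\ell,m)$-restrictions

$$\begin{pmatrix} y^{-1/2}(xy-z^2)^{1/2} & y^{-1/2}z \\ 0 & y^{1/2} \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} x^{1/2} & 0 \\ x^{-1/2}z & x^{-1/2}(xy-z^2)^{1/2} \end{pmatrix}. \tag{1.1.6}$$

Proof. a) From (1.1.3) and definition 1.2 we find for the $(\ell,m)$-restriction of $T_{\ell m}T_{\ell m}^{T}$

$$\begin{pmatrix} p^2+q^2 & pr+qs \\ pr+qs & r^2+s^2 \end{pmatrix} = \begin{pmatrix} x & z \\ z & y \end{pmatrix}.$$

Since $T_{\ell m}$ is non-singular, $T_{\ell m}T_{\ell m}^{T}$ is positive definite, and hence $x > 0$, $y > 0$, $xy - z^2 > 0$.

b) If $(x,y,z)$ satisfy (1.1.5) then the shear $H_{\ell m}$ with $(\ell,m)$-restriction

$$\left(x+y+2\sqrt{xy-z^2}\right)^{-1/2}\begin{pmatrix} x+\sqrt{xy-z^2} & z \\ z & y+\sqrt{xy-z^2} \end{pmatrix}$$

has Euclidean parameters $(x,y,z)$. This shear is the symmetric representative of $\mathcal{R}_{\ell m}(x,y,z)$.

c) From (1.1.6) we see immediately that $B_{\ell m} \in \mathcal{R}_{\ell m}(x,y,z)$ and $L_{\ell m} \in \mathcal{R}_{\ell m}(x,y,z)$. □

Corollary 1. For each shear $T_{\ell m} \in \mathcal{R}_{\ell m}(x,y,z)$ there exist orthogonal shears $Q_{\ell m}$ and $R_{\ell m}$ such that

$$T_{\ell m} = B_{\ell m}Q_{\ell m} = L_{\ell m}R_{\ell m},$$

where $B_{\ell m}$ and $L_{\ell m}$ are the triangular representatives mentioned in (1.1.6).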

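Corollary 2. If $T_{\ell m} \in \mathcal{R}_{\ell m}(x,y,z)$, then $\det^2(T_{\ell m}) = xy - z^2$.

In the sequel we mostly use unimodular shears. Then the $(\ell,m)$-restrictions of the triangular representatives are

$$\begin{pmatrix} y^{-1/2} & y^{-1/2}z \\ 0 & y^{1/2} \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} x^{1/2} & 0 \\ x^{-1/2}z & x^{-1/2} \end{pmatrix}.$$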
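The triangular representatives of (1.1.6) can be illustrated by a short computation. This is a minimal sketch under the assumption that a shear is represented only by its 2x2 $(\ell,m)$-restriction; the function names and the sample Jacobi-parameters are hypothetical and chosen for illustration.

```python
import math

def euclidean_parameters(p, q, r, s):
    """Euclidean parameters (x, y, z) of the restriction [[p, q], [r, s]]."""
    return p * p + q * q, r * r + s * s, p * r + q * s

def upper_representative(x, y, z):
    """Upper triangular 2x2 block B with B B^T = [[x, z], [z, y]]."""
    delta = math.sqrt(x * y - z * z)
    return [[delta / math.sqrt(y), z / math.sqrt(y)],
            [0.0, math.sqrt(y)]]

def lower_representative(x, y, z):
    """Lower triangular 2x2 block L with L L^T = [[x, z], [z, y]]."""
    delta = math.sqrt(x * y - z * z)
    return [[math.sqrt(x), 0.0],
            [z / math.sqrt(x), delta / math.sqrt(x)]]

def gram(t):
    """Return T T^T for a 2x2 block T."""
    (a, b), (c, d) = t
    return [[a * a + b * b, a * c + b * d],
            [a * c + b * d, c * c + d * d]]

p, q, r, s = 2.0, 1.0, -1.0, 3.0
x, y, z = euclidean_parameters(p, q, r, s)   # x = 5, y = 10, z = 1
B = upper_representative(x, y, z)
L = lower_representative(x, y, z)
det_squared = (p * s - q * r) ** 2           # compare with x*y - z*z
```

Both `gram(B)` and `gram(L)` reproduce the block with entries $x$, $z$, $z$, $y$, so by theorem 1.1 they are row congruent to the original restriction, and `det_squared` agrees with $xy - z^2$ as stated in corollary 2.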

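1.2. Similarity transformations by real unimodular shears

In this section we shall consider the similarity transformation by a real shear $T_{\ell m}$ with pivot-pair $(\ell,m)$ and Jacobi-parameters $p$, $q$, $r$ and $s$. Let

$$d := \det(T_{\ell m}) = ps - qr \tag{1.2.1}$$

and

$$A' := T_{\ell m}^{-1} A\, T_{\ell m}. \tag{1.2.2}$$

The elements of $A'$ will be denoted by $a'_{ij}$, $i = 1,2,\ldots,n$, $j = 1,\ldots,n$. Only the elements of A in the $\ell$-th and $m$-th rows and columns are affected by the similarity transformation with $T_{\ell m}$. For the elements of $A'$ we find with (1.1.1) and (1.1.2)

$$\begin{aligned}
a'_{i\ell} &= p\,a_{i\ell} + r\,a_{im}, \qquad a'_{im} = q\,a_{i\ell} + s\,a_{im} && (i \ne \ell,m),\\
a'_{\ell j} &= (s\,a_{\ell j} - q\,a_{mj})/d, \qquad a'_{mj} = (-r\,a_{\ell j} + p\,a_{mj})/d && (j \ne \ell,m),\\
a'_{\ell\ell} &= (ps\,a_{\ell\ell} - qr\,a_{mm} + rs\,a_{\ell m} - pq\,a_{m\ell})/d,\\
a'_{\ell m} &= \{s^2 a_{\ell m} - q^2 a_{m\ell} + qs(a_{\ell\ell} - a_{mm})\}/d,\\
a'_{m\ell} &= \{p^2 a_{m\ell} - r^2 a_{\ell m} - pr(a_{\ell\ell} - a_{mm})\}/d,\\
a'_{mm} &= (ps\,a_{mm} - qr\,a_{\ell\ell} - rs\,a_{\ell m} + pq\,a_{m\ell})/d,\\
a'_{ij} &= a_{ij} \quad\text{otherwise.}
\end{aligned} \tag{1.2.3}$$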
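The update of the two affected rows and columns can be written as two sweeps. The following is a minimal sketch, assuming a dense matrix stored as nested lists and zero-based pivots; the function name `shear_similarity` and the sample data are illustrative and not taken from the thesis.

```python
def shear_similarity(a, l, m, p, q, r, s):
    """Return A' = T^{-1} A T for the shear T with pivot-pair (l, m).

    The (l, m)-restriction of T is [[p, q], [r, s]]; T equals the identity
    elsewhere. Only rows and columns l and m of A are changed.
    """
    n = len(a)
    d = p * s - q * r
    b = [row[:] for row in a]
    # Right multiplication by T: columns l and m are recombined.
    for i in range(n):
        ail, aim = b[i][l], b[i][m]
        b[i][l] = p * ail + r * aim
        b[i][m] = q * ail + s * aim
    # Left multiplication by T^{-1}: rows l and m are recombined.
    for j in range(n):
        alj, amj = b[l][j], b[m][j]
        b[l][j] = (s * alj - q * amj) / d
        b[m][j] = (-r * alj + p * amj) / d
    return b

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
A_prime = shear_similarity(A, 0, 2, 1.0, 2.0, 0.0, 1.0)
```

For this example the four pivot elements of `A_prime` agree with the closed formulas for $a'_{\ell\ell}$, $a'_{\ell m}$, $a'_{m\ell}$ and $a'_{mm}$ in (1.2.3), and the trace is unchanged, as expected for a similarity transformation.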


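In order to simplify the formulae for $\|A\|_E^2$ and $\|A'\|_E^2$ we introduce the following notation.

$$C_{jk} := \sum_{\substack{i=1 \\ i \ne \ell,m}}^{n} a_{ij}a_{ik}, \qquad R_{jk} := \sum_{\substack{i=1 \\ i \ne \ell,m}}^{n} a_{ji}a_{ki}, \tag{1.2.4}$$

$$\sigma := \sum_{\substack{i,j=1 \\ i \ne \ell,m \\ j \ne \ell,m}}^{n} a_{ij}^2, \qquad e := a_{\ell\ell}^2 + a_{mm}^2 + 2a_{\ell m}a_{m\ell}. \tag{1.2.5}$$

The same functions of the transformed matrix $A' = T_{\ell m}^{-1} A\, T_{\ell m}$ will be denoted by $C'_{jk}$, $R'_{jk}$, $e'$ and $\sigma'$ respectively. For convenience and for simplicity of the formulae we will not mention the dependence of these parameters on A (resp. $T_{\ell m}^{-1} A\, T_{\ell m}$), $\ell$ and $m$.

We now find

$$\|A\|_E^2 = C_{\ell\ell} + C_{mm} + R_{\ell\ell} + R_{mm} + (a_{\ell m} - a_{m\ell})^2 + \sigma + e. \tag{1.2.6}$$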

Obviously, $\sigma$ is an invariant of A under similarity transformations by shears with pivot-pair $(\ell,m)$. Since $e = (\lambda_{\ell m}^{(1)})^2 + (\lambda_{\ell m}^{(2)})^2$ (where $\lambda_{\ell m}^{(1)}$ and $\lambda_{\ell m}^{(2)}$ are the eigenvalues of the $(\ell,m)$-restriction of A), $e$ is also such an invariant: $e = e'$. In order to determine $\|A'\|_E^2$ we consider $(a'_{\ell m} - a'_{m\ell})^2$ and $C'_{\ell\ell} + C'_{mm} + R'_{\ell\ell} + R'_{mm}$.

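Theorem 1.2. $\|T_{\ell m}^{-1} A\, T_{\ell m}\|_E^2$ is expressible in terms of the functions of A defined in (1.2.5) and the Euclidean parameters of $T_{\ell m}$, viz.

$$\|T_{\ell m}^{-1} A\, T_{\ell m}\|_E^2 = C_{\ell\ell}x + C_{mm}y + 2C_{\ell m}z + \frac{R_{mm}x + R_{\ell\ell}y - 2R_{\ell m}z + \{-a_{m\ell}x + a_{\ell m}y + (a_{\ell\ell} - a_{mm})z\}^2}{xy - z^2} + \sigma + e. \tag{1.2.7}$$

Proof. From (1.1.4), (1.2.3) and (1.2.4) we see

$$C'_{\ell\ell} + C'_{mm} = \sum_{\substack{i=1 \\ i \ne \ell,m}}^{n} \{(p^2+q^2)a_{i\ell}^2 + (r^2+s^2)a_{im}^2 + 2(pr+qs)a_{i\ell}a_{im}\} = C_{\ell\ell}x + C_{mm}y + 2C_{\ell m}z$$

and

$$R'_{\ell\ell} + R'_{mm} = \sum_{\substack{i=1 \\ i \ne \ell,m}}^{n} \{(r^2+s^2)a_{\ell i}^2 + (p^2+q^2)a_{mi}^2 - 2(pr+qs)a_{\ell i}a_{mi}\}/d^2 = (R_{mm}x + R_{\ell\ell}y - 2R_{\ell m}z)/(xy - z^2).$$

Moreover, (1.2.3) gives $a'_{\ell m} - a'_{m\ell} = \{-a_{m\ell}x + a_{\ell m}y + (a_{\ell\ell} - a_{mm})z\}/d$, and $d^2 = xy - z^2$. Since

$$\|A'\|_E^2 = C'_{\ell\ell} + C'_{mm} + R'_{\ell\ell} + R'_{mm} + (a'_{\ell m} - a'_{m\ell})^2 + \sigma + e,$$

formula (1.2.7) is obtained. □

In order to determine $\inf_{T_{\ell m}} \|T_{\ell m}^{-1} A\, T_{\ell m}\|_E$, the rational function in the right-hand part of (1.2.7) has to be minimized in the half-cone

$$x > 0, \qquad y > 0, \qquad xy - z^2 > 0.$$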

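Since the determination of the values of $x$, $y$ and $z$ minimizing this rational function is rather complicated, we shall henceforth restrict ourselves to unimodular shear matrices $T_{\ell m}$. With this restriction on $T_{\ell m}$ we may restate theorem 1.2 as:

Theorem 1.3. If $T_{\ell m}$ is a unimodular shear with Euclidean parameters $(x,y,z)$, then

$$\|T_{\ell m}^{-1} A\, T_{\ell m}\|_E^2 = f(x,y,z) + \sigma + e, \tag{1.2.8}$$

where

$$f(x,y,z) := \alpha x + \beta y + 2\gamma z + (-\lambda x + \mu y + \nu z)^2 \tag{1.2.9}$$

with

$$\alpha := \sum_{\substack{i=1 \\ i \ne \ell,m}}^{n} (a_{i\ell}^2 + a_{mi}^2), \qquad \beta := \sum_{\substack{i=1 \\ i \ne \ell,m}}^{n} (a_{im}^2 + a_{\ell i}^2), \qquad \gamma := \sum_{\substack{i=1 \\ i \ne \ell,m}}^{n} (a_{i\ell}a_{im} - a_{\ell i}a_{mi}) \tag{1.2.10}$$

and

$$\lambda := a_{m\ell}, \qquad \mu := a_{\ell m}, \qquad \nu := a_{\ell\ell} - a_{mm}. \tag{1.2.11}$$

In order to determine the infimum of $\|T_{\ell m}^{-1} A\, T_{\ell m}\|_E$ over the unimodular shears $T_{\ell m}$, we have to minimize the function $f(x,y,z)$ with side conditions

$$x > 0, \qquad y > 0, \qquad xy - z^2 = 1. \tag{1.2.12}$$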
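The identity (1.2.8) lends itself to a direct numerical check. The sketch below is illustrative only: the helper names, the 3x3 test matrix and the unimodular shear are assumptions made for the example, and the transformed matrix is formed by plain matrix products rather than by the element formulas.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def frobenius_sq(a):
    """Squared Euclidean (Frobenius) norm."""
    return sum(v * v for row in a for v in row)

def f_sigma_e(a, l, m, x, y, z):
    """Return f(x, y, z), sigma and e for the pivot-pair (l, m)."""
    rest = [i for i in range(len(a)) if i != l and i != m]
    alpha = sum(a[i][l] ** 2 + a[m][i] ** 2 for i in rest)
    beta = sum(a[i][m] ** 2 + a[l][i] ** 2 for i in rest)
    gamma = sum(a[i][l] * a[i][m] - a[l][i] * a[m][i] for i in rest)
    lam, mu, nu = a[m][l], a[l][m], a[l][l] - a[m][m]
    sigma = sum(a[i][j] ** 2 for i in rest for j in rest)
    e = a[l][l] ** 2 + a[m][m] ** 2 + 2 * a[l][m] * a[m][l]
    w = -lam * x + mu * y + nu * z
    return alpha * x + beta * y + 2 * gamma * z + w * w, sigma, e

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
l, m = 0, 2
p, q, r, s = 1.0, 2.0, 0.0, 1.0            # unimodular: ps - qr = 1
T = [[p, 0.0, q], [0.0, 1.0, 0.0], [r, 0.0, s]]
T_inv = [[s, 0.0, -q], [0.0, 1.0, 0.0], [-r, 0.0, p]]
x, y, z = p * p + q * q, r * r + s * s, p * r + q * s
f, sigma, e = f_sigma_e(A, l, m, x, y, z)
norm_sq = frobenius_sq(matmul(matmul(T_inv, A), T))
```

Here `norm_sq` and `f + sigma + e` agree, so for a unimodular shear the whole dependence of the norm on the transformation is carried by the three Euclidean parameters.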

Definition 1.3. The subset $\mathcal{H} := \{(x,y,z);\ x > 0,\ y > 0,\ xy - z^2 = 1\}$ of $R^3$ will be called the positive sheet of the hyperboloid $xy - z^2 = 1$.

Previous to presenting, in section 1.3, an algorithm to compute the values of $x$, $y$ and $z$ minimizing $f$ on $\mathcal{H}$, we shall demonstrate some properties of the coefficients of $f$ and then we shall establish a sufficient condition for which the infimum of $f$, on the surface $\mathcal{H}$, is assumed for finite $x$, $y$ and $z$.

In the sequel we make use of the notations introduced in (1.2.10) and (1.2.11) and, moreover, we define

$$D := \alpha\mu - \beta\lambda - \gamma\nu, \qquad E := \nu^2 + 4\lambda\mu, \qquad F := \alpha\beta - \gamma^2. \tag{1.2.13}$$

Lemma 1.2. The quantities $\alpha$, $\beta$, $D$, $E$ and $F$ defined in (1.2.10) and (1.2.13) have the following properties:

(i) $\alpha \ge 0$, $\beta \ge 0$, $F \ge 0$;

(ii) $D^2 + EF \ge 0$;

(iii) if $E < 0$ then $D = 0$ implies $F = 0$, $\alpha = 0$, $\beta = 0$;

(iv) iff $E > 0$ ($E = 0$, $E < 0$), then the $(\ell,m)$-restriction of A has two different real (two equal real, two complex conjugate) eigenvalues $\lambda_{\ell m}^{(1)}$ and $\lambda_{\ell m}^{(2)}$ respectively.

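Proof. (i) $\alpha$ and $\beta$ are non-negative since they are sums of squares. In order to show $F \ge 0$, the elements in the $\ell$-th column and the $m$-th row of A, not belonging to the $(\ell,m)$-restriction of A, are considered as components of a vector in $R^{2n-4}$. In the same way we consider the elements in the $m$-th column and the $\ell$-th row, not belonging to $A_{\ell m}$, as components of a vector in $R^{2n-4}$. From the inequality of Cauchy-Schwarz follows for these vectors

$$F = \alpha\beta - \gamma^2 = \Big[\sum_{i \ne \ell,m} (a_{i\ell}^2 + a_{mi}^2)\Big]\Big[\sum_{i \ne \ell,m} (a_{im}^2 + a_{\ell i}^2)\Big] - \Big[\sum_{i \ne \ell,m} (a_{i\ell}a_{im} - a_{\ell i}a_{mi})\Big]^2 \ge 0.$$

(ii) If $\alpha\beta \ne 0$ then

$$D^2 + EF = (\alpha\mu - \beta\lambda - \gamma\nu)^2 + (\nu^2 + 4\lambda\mu)(\alpha\beta - \gamma^2) = \alpha\beta\Big\{\nu - \frac{\gamma}{\alpha\beta}(\alpha\mu - \beta\lambda)\Big\}^2 + \frac{F}{\alpha\beta}(\alpha\mu + \beta\lambda)^2 \ge 0.$$

If $\alpha\beta = 0$, then $F = 0$, thus $D^2 + EF = D^2 \ge 0$.

(iii) If $E < 0$ and $D = 0$, (i) and (ii) imply $F = 0$, hence $\alpha\beta = \gamma^2$. In order to prove $\alpha = \beta = 0$ we make use of the fact that $\alpha\mu - \beta\lambda = \gamma\nu$. This implies that

$$(\alpha\mu + \beta\lambda)^2 = (\alpha\mu - \beta\lambda)^2 + 4\alpha\beta\lambda\mu = \gamma^2(\nu^2 + 4\lambda\mu) = \gamma^2 E.$$

Hence $E < 0$ implies that $\gamma = 0$ and $\alpha\mu + \beta\lambda = 0$. Since $\alpha\mu - \beta\lambda = \gamma\nu = 0$, we also have $\alpha\mu = \beta\lambda = 0$. Since $E < 0$ implies $\lambda\mu < 0$, it follows that $\alpha = \beta = 0$.

(iv) $E = (a_{\ell\ell} - a_{mm})^2 + 4a_{\ell m}a_{m\ell} = (a_{\ell\ell} + a_{mm})^2 - 4(a_{\ell\ell}a_{mm} - a_{\ell m}a_{m\ell}) = (\lambda_{\ell m}^{(1)} + \lambda_{\ell m}^{(2)})^2 - 4\lambda_{\ell m}^{(1)}\lambda_{\ell m}^{(2)} = (\lambda_{\ell m}^{(1)} - \lambda_{\ell m}^{(2)})^2$.

Hence iff $E > 0$ ($E = 0$, $E < 0$), $A_{\ell m}$ has two different real (two equal real, two complex conjugate) eigenvalues $\lambda_{\ell m}^{(1)}$ and $\lambda_{\ell m}^{(2)}$ respectively. □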
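The quantities of (1.2.13) and the classification in lemma 1.2 (iv) can be computed directly from the matrix. This is a minimal sketch with hypothetical function names and two arbitrary sample matrices; the second matrix is chosen so that all affected off-pivot elements vanish.

```python
def def_quantities(a, l, m):
    """Return (D, E, F) of (1.2.13) for the matrix a and pivot-pair (l, m)."""
    rest = [i for i in range(len(a)) if i != l and i != m]
    alpha = sum(a[i][l] ** 2 + a[m][i] ** 2 for i in rest)
    beta = sum(a[i][m] ** 2 + a[l][i] ** 2 for i in rest)
    gamma = sum(a[i][l] * a[i][m] - a[l][i] * a[m][i] for i in rest)
    lam, mu, nu = a[m][l], a[l][m], a[l][l] - a[m][m]
    D = alpha * mu - beta * lam - gamma * nu
    E = nu * nu + 4 * lam * mu
    F = alpha * beta - gamma * gamma
    return D, E, F

def eigenvalue_type(E):
    """Type of the eigenvalues of the (l, m)-restriction, lemma 1.2 (iv)."""
    if E > 0:
        return "two different real"
    if E == 0:
        return "two equal real"
    return "two complex conjugate"

A1 = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
A2 = [[1.0, 0.0, 2.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 3.0]]
D1, E1, F1 = def_quantities(A1, 0, 2)
D2, E2, F2 = def_quantities(A2, 0, 2)
```

For `A1` the discriminant `E1` is positive and `D1 * D1 + E1 * F1` is non-negative, in line with (ii); for `A2` one finds `E2 < 0` together with `D2 = F2 = 0`, the situation described in (iii).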

We shall now demonstrate that if $D$ and $F$ are not both equal to zero, there exists a compact set $Q \subset \mathcal{H}$ such that the minimum of $f(x,y,z)$ on $\mathcal{H}$ is assumed in the interior of $Q$. To prove this we need

Lemma 1.3. If $D$ and $F$ are not both equal to zero, then for $(x,y,z) \in \mathcal{H}$, $x + y \to \infty$ implies $f(x,y,z) \to \infty$.

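Proof. We start by remarking that $\alpha + \beta > 0$, for otherwise both $D$ and $F$ would be equal to zero.

Let $\mathcal{K}$ be the subset of $R^3$ defined by

$$xy - z^2 \ge 0, \qquad x \ge 0, \qquad y \ge 0.$$

Then $\mathcal{H} \subset \mathcal{K}$. In $\mathcal{K}$ we have

$$f(x,y,z) = \varphi(x,y,z) + w^2(x,y,z),$$

where $\varphi := \alpha x + \beta y + 2\gamma z$, $w := -\lambda x + \mu y + \nu z$. In $\mathcal{K}$ we find for the linear part $\varphi$ of $f$:

$$\begin{aligned}
2\varphi &= (\alpha+\beta)(x+y) + (\alpha-\beta)(x-y) + 4\gamma z \ge (\alpha+\beta)(x+y) - |\alpha-\beta|\,|x-y| - 4|\gamma z| \\
&\ge (\alpha+\beta)(x+y) - \{(\alpha-\beta)^2 + 4\gamma^2\}^{1/2}\{(x-y)^2 + 4z^2\}^{1/2} \\
&= (\alpha+\beta)(x+y) - \{(\alpha+\beta)^2 - 4F\}^{1/2}\{(x+y)^2 - 4(xy-z^2)\}^{1/2} \\
&\ge (x+y)\{\alpha+\beta - \sqrt{(\alpha+\beta)^2 - 4F}\} \ge 0.
\end{aligned}$$

If $F > 0$ then in $\mathcal{K}$, and a fortiori on $\mathcal{H}$, $\varphi \ge \delta(x+y)$, where $\delta > 0$. Thus if $F > 0$, then on $\mathcal{H}$

$$f = \varphi + w^2 \to \infty \quad\text{for}\quad x+y \to \infty.$$

Now we consider the particular case $F = 0$, $D \ne 0$. Then in $\mathcal{K}$:

$$2\varphi \ge (\alpha+\beta)\{x+y - \sqrt{(x+y)^2 - 4(xy-z^2)}\}.$$

Hence on $\mathcal{H}$: $\varphi > 0$ and in $\mathcal{K}$: $\varphi \ge 0$. $(x,y,z) \in \mathcal{K}$ and $\varphi = 0$ implies $\alpha x + \beta y = -2\gamma z$, hence $(\alpha x + \beta y)^2 = 4\gamma^2z^2 = 4\alpha\beta z^2 \le 4\alpha\beta xy$. Hence in $\mathcal{K}$ $\varphi = 0$ if and only if $(x,y,z) = t(\beta,\alpha,-\gamma)$, $t \ge 0$. Along this line, $l$, the plane $\alpha x + \beta y + 2\gamma z = 0$ is tangent to the boundary $\partial\mathcal{K}$ of $\mathcal{K}$. On $l$ we find $w = tD$. Hence, since $D \ne 0$, in $\mathcal{K} - \{(0,0,0)\}$: $\varphi + |w| > 0$.

Let $\mathcal{C}$ be the intersection of the plane $x + y = 1$ and $\mathcal{K}$. Since $\mathcal{C}$ is a compact set and $\varphi + |w| > 0$ in $\mathcal{C}$, the continuous function $\varphi + |w|$ attains on $\mathcal{C}$ a positive minimum, say $\delta$ ($\delta > 0$). From the linear homogeneity of $\varphi$ and $w$ it now follows that in $\mathcal{K}$: $\varphi + |w| \ge \delta(x+y)$. A fortiori $\varphi + |w| \ge \delta(x+y)$ on $\mathcal{H}$. Consequently

$$\varphi + w^2 \ge \varphi + |w| - \tfrac{1}{4} \ge \delta(x+y) - \tfrac{1}{4}.$$

Hence also in the case $F = 0$, $D \ne 0$, on $\mathcal{H}$ $\varphi + w^2 \to \infty$ for $x + y \to \infty$. □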

Theorem 1.4. If $D$ and $F$ are not both equal to zero, then the infimum of $f(x,y,z)$ on $\mathcal{H}$ is assumed for finite $x$, $y$ and $z$.

Proof. Lemma 1.3 asserts the existence of a number $M > 0$ such that on $\mathcal{H}$ $f(x,y,z) > f(1,1,0)$ if $x + y > M$. The theorem now follows from the continuity of $f$ and the fact that the subset of $\mathcal{H}$ for which $x + y \le M$ is a compact set. □

In the next section we also make use of

Lemma 1.4.

(i) $\inf_{(x,y,z) \in \mathcal{H}} (\alpha x + \beta y + 2\gamma z) = 2F^{1/2}$. The infimum is assumed for finite $x$, $y$ and $z$ if and only if $F \ne 0$ or $\alpha + \beta = 0$.

(ii) $\inf_{(x,y,z) \in \mathcal{H}} (-\lambda x + \mu y + \nu z)^2 = \max(0,-E)$. The infimum is assumed for finite $x$, $y$ and $z$ if and only if $E \ne 0$ or $\lambda = \mu$.

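Proof. (i) If $F > 0$ then, as we have seen in the proof of lemma 1.3, on $\mathcal{H}$ $\alpha x + \beta y + 2\gamma z \to \infty$ for $x + y \to \infty$. Hence, in this case the infimum on $\mathcal{H}$ of the non-negative function $\alpha x + \beta y + 2\gamma z$ is assumed for finite $x$, $y$ and $z$. Using a Lagrange multiplier we easily find that

$$\min_{\mathcal{H}} (\alpha x + \beta y + 2\gamma z) = 2F^{1/2}.$$

The coordinates of this unique stationary point on $\mathcal{H}$ are $F^{-1/2}(\beta,\alpha,-\gamma)$. If $\alpha + \beta = 0$, then $\alpha = \beta = \gamma = 0$; then $\alpha x + \beta y + 2\gamma z = 0$ for each $(x,y,z) \in \mathcal{H}$.

If $F = 0$, $\alpha + \beta \ne 0$, then on $\mathcal{H}$

$$\alpha x + \beta y + 2\gamma z \ge \tfrac{1}{2}(\alpha+\beta)\{x + y - \sqrt{(x+y)^2 - 4}\} > 0.$$

We now consider the curve $\Gamma \subset \mathcal{H}$, defined in the following way:

$$x = -(\alpha-\beta)t + \{1 + (\alpha+\beta)^2t^2\}^{1/2}, \qquad y = (\alpha-\beta)t + \{1 + (\alpha+\beta)^2t^2\}^{1/2}, \qquad z = -2\gamma t, \qquad t > 0.$$

On this curve $\Gamma$, using the fact that $F = 0$, we find

$$\alpha x + \beta y + 2\gamma z = (\alpha+\beta)\{\sqrt{1 + (\alpha+\beta)^2t^2} - (\alpha+\beta)t\} \to 0 \quad\text{for}\quad t \to \infty.$$

Hence $\inf_{\mathcal{H}} (\alpha x + \beta y + 2\gamma z) = 0 = 2F^{1/2}$, and this infimum is not assumed.

(ii) If $E > 0$, then the plane $-\lambda x + \mu y + \nu z = 0$ intersects $\mathcal{H}$. The point with coordinates

$$x = \frac{E + 2\mu(\mu-\lambda)}{\{E^2 + E(\lambda-\mu)^2\}^{1/2}}, \qquad y = \frac{E - 2\lambda(\mu-\lambda)}{\{E^2 + E(\lambda-\mu)^2\}^{1/2}}, \qquad z = \frac{-(\mu-\lambda)\nu}{\{E^2 + E(\lambda-\mu)^2\}^{1/2}}$$

is an element of this intersection.

If $E < 0$ then $\lambda\mu < 0$ and so we can apply, with appropriate modifications, the reasoning of (i):

$$\inf_{\mathcal{H}} (-\lambda x + \mu y + \nu z)^2 = -E. \qquad □$$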
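Part (i) of lemma 1.4 can be illustrated for the case $F > 0$. The following is a minimal sketch, not a proof: the coefficient values and the sample points on the hyperboloid are arbitrary assumptions, and the function names are hypothetical.

```python
import math

def minimize_linear_part(alpha, beta, gamma):
    """Stationary point and minimum of alpha*x + beta*y + 2*gamma*z on the
    positive sheet x*y - z*z = 1, for the case F = alpha*beta - gamma^2 > 0."""
    F = alpha * beta - gamma * gamma
    if F <= 0.0:
        raise ValueError("this sketch only covers F > 0")
    root = math.sqrt(F)
    point = (beta / root, alpha / root, -gamma / root)
    return point, 2.0 * root

def phi(alpha, beta, gamma, x, y, z):
    """Linear part of f."""
    return alpha * x + beta * y + 2.0 * gamma * z

alpha, beta, gamma = 80.0, 40.0, 8.0
(x0, y0, z0), minimum = minimize_linear_part(alpha, beta, gamma)

# A few other points of the positive sheet, parametrized by x > 0 and z.
samples = [(x, (1.0 + z * z) / x, z)
           for x in (0.2, 0.5, 1.0, 2.0) for z in (-1.0, 0.0, 1.0)]
values = [phi(alpha, beta, gamma, x, y, z) for (x, y, z) in samples]
```

The point `(x0, y0, z0)` lies on the hyperboloid and yields the value $2F^{1/2}$, while none of the sampled points gives a smaller value.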

Definition 1.4. The class of unimodular row-congruent shears $T_{\ell m}$ for which the Euclidean norm of $T_{\ell m}^{-1} A\, T_{\ell m}$ is minimal, will be called the class of minimizing shears corresponding to A, $\ell$ and $m$, and will be denoted by $\mathfrak{M}_{\ell m}(A)$.

The Euclidean parameters $(x,y,z)$ of the shears in $\mathfrak{M}_{\ell m}(A)$ minimize $f$ on $\mathcal{H}$. If $D$ and $F$ are not both equal to zero, then theorem 1.4 shows that $\mathfrak{M}_{\ell m}(A)$ is non-void. The particular case $D = F = 0$ will be discussed in section 1.4.

1.3. An algorithm for the real unimodular norm-reducing shears

In this section we suppose that $D$ and $F$ are not both equal to zero. According to theorem 1.4 this is a sufficient condition for $f(x,y,z)$ to attain a minimum on $\mathcal{H}$. We apply Lagrange's method of multipliers to determine the point $(x,y,z)$ on $\mathcal{H}$ where $f$ is stationary. We consider

$$g(x,y,z;\rho) := \alpha x + \beta y + 2\gamma z + (-\lambda x + \mu y + \nu z)^2 + \rho(xy - z^2 - 1). \tag{1.3.1}$$

In the stationary point of $f$ on $\mathcal{H}$ the partial derivatives of $g(x,y,z;\rho)$ with respect to $x$, $y$, $z$ and $\rho$ are zero. Hence $x$, $y$, $z$ and $\rho$ satisfy

$$g_x = \alpha - 2\lambda w + \rho y = 0, \tag{1.3.2}$$

$$g_y = \beta + 2\mu w + \rho x = 0, \tag{1.3.3}$$

$$\tfrac{1}{2}g_z = \gamma + \nu w - \rho z = 0 \tag{1.3.4}$$

and

$$g_\rho = xy - z^2 - 1 = 0, \tag{1.3.5}$$

where

$$w := -\lambda x + \mu y + \nu z. \tag{1.3.6}$$

We now eliminate $x$, $y$ and $z$ from the equations (1.3.2) to (1.3.6) incl. Since $\rho x = -\beta - 2\mu w$, $\rho y = -\alpha + 2\lambda w$ and $\rho z = \gamma + \nu w$, (1.3.5) gives

$$\rho^2 = (\beta + 2\mu w)(\alpha - 2\lambda w) - (\gamma + \nu w)^2. \tag{1.3.7}$$

On the other hand, we multiply (1.3.2), (1.3.3) and (1.3.4) by $\mu$, $-\lambda$ and $-\nu$ respectively and add the results. Then we find

$$\alpha\mu - \beta\lambda - \gamma\nu + \rho w - (\nu^2 + 4\lambda\mu)w = 0. \tag{1.3.8}$$

With the notation (1.2.13), the equations (1.3.7) and (1.3.8) become

$$\rho^2 + Ew^2 - 2Dw - F = 0 \tag{1.3.9}$$

and

$$(\rho - E)w + D = 0. \tag{1.3.10}$$

Elimination of $w$ from (1.3.9) and (1.3.10) gives

$$(\rho - E)^2(\rho^2 - F) + D^2(2\rho - E) = 0. \tag{1.3.11}$$

The value of the Lagrangean multiplier $\rho$ corresponding to the minimum of $f$ on $\mathcal{H}$ will be called the feasible multiplier. This multiplier satisfies (1.3.11). The next lemmas make it possible to locate the feasible multiplier among the zeros of the quartic equation.

Lemma 1.5. The Lagrangean multiplier $\rho$, corresponding to a stationary point of $f$ on $\mathcal{H}$, satisfies the inequalities

$$\rho \le -F^{1/2} + \min(0,E), \tag{1.3.12}$$

$$\rho < \min(0,E). \tag{1.3.13}$$

Proof. We multiply (1.3.2), (1.3.3) and (1.3.4) by $x$, $y$ and $2z$ respectively and add the results. Then we find

$$\alpha x + \beta y + 2\gamma z + 2(-\lambda x + \mu y + \nu z)^2 + 2\rho = 0.$$

According to lemma 1.4

$$\inf_{\mathcal{H}} (\alpha x + \beta y + 2\gamma z) = 2F^{1/2}, \qquad \inf_{\mathcal{H}} (-\lambda x + \mu y + \nu z)^2 = \max(0,-E).$$

Therefore

$$\rho = -\tfrac{1}{2}(\alpha x + \beta y + 2\gamma z) - (-\lambda x + \mu y + \nu z)^2 \le -F^{1/2} + \min(0,E),$$

which is (1.3.12).

If $F \ne 0$, then (1.3.13) follows immediately from (1.3.12), which for the present case proves the lemma. For the case $F = 0$, $D \ne 0$ we have to show that $E < 0$ implies $\rho < E$ and that $E \ge 0$ implies $\rho < 0$. Now let $F = 0$. If $E \ge 0$, then (1.3.12) shows $\rho \le 0$. Since $\rho = 0$ would imply $w = D = 0$, as is seen from (1.3.9) and (1.3.10), we have $\rho < 0$. If $E < 0$, then (1.3.12) implies $\rho \le E$. Now $\rho = E$ again implies $D = 0$, as is seen from (1.3.10), hence $\rho < E$. This proves (1.3.13). □

Lemma 1.6. The equations

$$\rho^2 + Ew^2 - 2Dw - F = 0, \qquad (\rho - E)w + D = 0 \tag{1.3.14}$$

have one and only one solution $(\rho,w)$ for which holds $\rho < \min(0,E)$. For this root holds

$$-(F + D^2/E)^{1/2} \le \rho \le -F^{1/2} \quad\text{if } E > 0, \tag{1.3.15}$$

$$\rho \le -F^{1/2} \quad\text{if } E = 0, \tag{1.3.16}$$

$$E - |D|\,|E|^{-1/2} \le \rho \le E - F^{1/2} \quad\text{if } E < 0. \tag{1.3.17}$$

Proof. We investigate the intersection of the quadratics (1.3.9) and (1.3.10). To that end we distinguish three cases.

I $E > 0$. In this case the graph of (1.3.9) in the $(\rho,w)$-plane is an ellipse with centre $\rho = 0$, $w = D/E$. The half-axis parallel to the axis $w = 0$ is of length $(F + D^2/E)^{1/2}$. This ellipse intersects the axis $w = 0$ in the points having as coordinates $\rho = \pm F^{1/2}$, $w = 0$. The graph of (1.3.10) in the $(\rho,w)$-plane is a hyperbola with asymptotes $\rho = E$ and $w = 0$. The hyperbola passes through the centre of the ellipse. In figure 1 the quadratics are sketched for the case $D > 0$. If $D < 0$ then the appropriate sketch is obtained from figure 1 by reflection with respect to $w = 0$. If $D = 0$, then the hyperbola degenerates into the lines $\rho = E$ and $w = 0$.

If $D \ne 0$ then the quadratics have a unique point of intersection S (see figure 1) in the half-plane $\rho < 0$. The $\rho$-coordinate of S satisfies the inequalities

$$-(F + D^2/E)^{1/2} < \rho < -F^{1/2} \le 0.$$

[Figure 1: the quadratics (1.3.9) and (1.3.10) in the $(\rho,w)$-plane for $D, E > 0$.]

If $D = 0$ and $F \ne 0$, the degenerate hyperbola and the ellipse have a unique point of intersection in the half-plane $\rho < 0$. The coordinates of this point are $\rho = -F^{1/2}$, $w = 0$.

II $E = 0$. In this particular case the graph of (1.3.9) in the $(\rho,w)$-plane is a parabola which intersects the axis $w = 0$ in the points having as coordinates $\rho = \pm F^{1/2}$, $w = 0$. If $D \ne 0$, then the parabola and the hyperbola (graph of (1.3.10)) have a unique point of intersection S in the half-plane $\rho < 0$. The $\rho$-coordinate of this point S satisfies the inequality

$$\rho < -F^{1/2} \le 0.$$

[Figure 2: the quadratics (1.3.9) and (1.3.10) in the $(\rho,w)$-plane for $E = 0$, $D > 0$.]

If $D = 0$ and $F \ne 0$, then the parabola and the degenerate hyperbola have a unique point of intersection in the half-plane $\rho < 0$. The coordinates of this point are $\rho = -F^{1/2}$, $w = 0$.

III $E < 0$. In this case the graph of (1.3.9) in the $(\rho,w)$-plane is a hyperbola with centre $\rho = 0$, $w = D/E$. This hyperbola intersects the axis $w = 0$ in the points having as coordinates $\rho = \pm F^{1/2}$, $w = 0$, and its asymptotes are $\rho = \pm|E|^{1/2}(w - D/E)$. These asymptotes and the graph of (1.3.10) have a unique point of intersection in the half-plane $\rho < E$. The coordinates of this point are $\rho = E - |D|\,|E|^{-1/2}$, $w = \operatorname{sign}(D)\,|E|^{1/2}$.

We conclude from lemma 1.2 (iii) that $D \ne 0$, for $D = 0$ would imply $F = 0$. Hence the graph of (1.3.10), being a hyperbola, is not degenerated.

As we see from figure 3 the hyperbolae have a unique point of intersection S in the half-plane $\rho < E$. The $\rho$-coordinate of S satisfies the inequality

$$E - |D|\,|E|^{-1/2} < \rho < E.$$

[Figure 3: the quadratics (1.3.9) and (1.3.10) in the $(\rho,w)$-plane for $D > 0$, $E < 0$.]

If $F \ne 0$, we can sharpen the upper bound of the $\rho$-coordinate of S. For that purpose we consider T, the point of intersection of $\rho = E - F^{1/2}$ and the graph of (1.3.10). The coordinates of T are $\rho = E - F^{1/2}$, $w = DF^{-1/2}$. Simple calculation shows that the left-hand part of (1.3.9) in T equals $(E - 2F^{1/2})(D^2 + EF)/F \le 0$, whereas in the point with the coordinates $\rho = E - F^{1/2}$, $w = 0$ the same function has the value $E^2 - 2EF^{1/2} > 0$. Hence the $\rho$-coordinate of S is smaller than that of T if $F > 0$. That means

$$E - |D|\,|E|^{-1/2} \le \rho \le E - F^{1/2}. \qquad □$$

From the lemmas 1.5 and 1.6 we conclude that there exists one and only one Lagrangean multiplier that corresponds to a stationary point of $f$ on $\mathcal{H}$. With theorem 1.4 we find that in this unique stationary point the minimum of $f$ on $\mathcal{H}$ is reached. Using the feasible multiplier $\rho$, i.e. the root of (1.3.11) which satisfies (1.3.13), we find with (1.3.2), (1.3.3) and (1.3.4) that $f(x,y,z)$ is minimal on $\mathcal{H}$ in the point

$$x = \frac{2\mu D - \beta(\rho - E)}{\rho(\rho - E)}, \tag{1.3.18}$$

$$y = \frac{-2\lambda D - \alpha(\rho - E)}{\rho(\rho - E)}, \tag{1.3.19}$$

$$z = \frac{-\nu D + \gamma(\rho - E)}{\rho(\rho - E)}. \tag{1.3.20}$$

We summarize the results of the lemmas in

Theorem 1.5. If the quantities $D$ and $F$ corresponding to the matrix A and the pivot-pair $(\ell,m)$ are not both equal to zero, then the Euclidean parameters $x$, $y$ and $z$ of $\mathfrak{M}_{\ell m}(A)$ (being the values of $x$, $y$ and $z$ which minimize $f$ on $\mathcal{H}$) may be computed from the formulae (1.3.18), (1.3.19) and (1.3.20), where $\rho$ is the unique root of the quartic equation (1.3.11) for which root holds $\rho < \min(0,E)$.
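Theorem 1.5 suggests a simple computational scheme. The following is a minimal sketch, assuming that bisection is an acceptable way to locate the root of the quartic (1.3.11) below $\min(0,E)$; the thesis does not prescribe a root finder at this point, and the function names, the bracket expansion and the iteration count are illustrative choices.

```python
def feasible_multiplier(D, E, F, iterations=200):
    """Root rho < min(0, E) of (rho-E)^2 (rho^2-F) + D^2 (2 rho-E) = 0."""
    if D == 0 and F == 0:
        raise ValueError("D = F = 0 is the particular case of section 1.4")

    def quartic(rho):
        return (rho - E) ** 2 * (rho * rho - F) + D * D * (2.0 * rho - E)

    hi = min(0.0, E)            # the quartic is negative just left of hi
    lo = hi - 1.0
    while quartic(lo) <= 0.0:   # the quartic tends to +infinity for rho -> -infinity
        lo = hi - 2.0 * (hi - lo)
    for _ in range(iterations):  # bisection keeps quartic(lo) > 0
        mid = 0.5 * (lo + hi)
        if quartic(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def minimizing_parameters(alpha, beta, gamma, lam, mu, nu):
    """Euclidean parameters (x, y, z) from (1.3.18)-(1.3.20), and rho."""
    D = alpha * mu - beta * lam - gamma * nu
    E = nu * nu + 4.0 * lam * mu
    F = alpha * beta - gamma * gamma
    rho = feasible_multiplier(D, E, F)
    denom = rho * (rho - E)
    x = (2.0 * mu * D - beta * (rho - E)) / denom
    y = (-2.0 * lam * D - alpha * (rho - E)) / denom
    z = (-nu * D + gamma * (rho - E)) / denom
    return x, y, z, rho

def f_value(alpha, beta, gamma, lam, mu, nu, x, y, z):
    """The function f of (1.2.9)."""
    w = -lam * x + mu * y + nu * z
    return alpha * x + beta * y + 2.0 * gamma * z + w * w

coeffs = (80.0, 40.0, 8.0, 7.0, 3.0, -9.0)    # alpha, beta, gamma, lambda, mu, nu
x, y, z, rho = minimizing_parameters(*coeffs)
x2, y2, z2, rho2 = minimizing_parameters(2.0, 1.0, 0.0, 0.0, 1.0, 0.0)  # a case with E = 0
```

In both examples the computed point lies on the hyperboloid $xy - z^2 = 1$ up to rounding and gives a value of $f$ not exceeding $f(1,1,0)$; a shear with these Euclidean parameters can then be taken as, for instance, the unimodular upper triangular representative of section 1.1.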

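1.4. The particular case D = F = 0

In this section we investigate the properties of the minimizing shears in the case that the functions $D$ and $F$ of the matrix A and the pivot-pair $(\ell,m)$ (see (1.2.13)) are both equal to zero.

Theorem 1.6. Let $D = F = 0$. Then

$$\inf_{\mathcal{H}} \{\alpha x + \beta y + 2\gamma z + (-\lambda x + \mu y + \nu z)^2\} = \max(0,-E).$$

This infimum is assumed for finite $(x,y,z) \in \mathcal{H}$ if and only if

$$\alpha = 0 \ \wedge\ \beta = 0 \ \wedge\ (E \ne 0 \ \vee\ \lambda = \mu).$$

Proof. We distinguish two cases.

I $\alpha = \beta = 0$. Then $\gamma = 0$. For this case the theorem has already been proved in lemma 1.4 (ii). The infimum is assumed for the following values of $x$, $y$ and $z$:

$$x = \frac{E + 2\mu(\mu-\lambda)}{\{E^2 + E(\lambda-\mu)^2\}^{1/2}}, \quad y = \frac{E - 2\lambda(\mu-\lambda)}{\{E^2 + E(\lambda-\mu)^2\}^{1/2}}, \quad z = \frac{-(\mu-\lambda)\nu}{\{E^2 + E(\lambda-\mu)^2\}^{1/2}}, \quad\text{if } E > 0;$$

$$x = 1, \quad y = 1, \quad z = 0, \quad\text{if } E = 0;$$

$$x = 2|E|^{-1/2}|\mu|, \quad y = 2|E|^{-1/2}|\lambda|, \quad z = |E|^{-1/2}\,\nu\,\operatorname{sign}(\lambda), \quad\text{if } E < 0.$$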

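II $\alpha \ne 0 \ \vee\ \beta \ne 0$. According to lemma 1.2 (iii) this situation does not occur if $E < 0$.

Since $D = F = 0$, the line along which the plane $\alpha x + \beta y + 2\gamma z = 0$ is tangent to the cone $xy - z^2 = 0$, coincides with the intersection of the planes $\alpha x + \beta y + 2\gamma z = 0$ and $-\lambda x + \mu y + \nu z = 0$. This line, which we shall denote by $l$, is given in parameter form by

$$(x,y,z) = 2t(\beta,\alpha,-\gamma), \qquad t \ge 0.$$

Now we describe a curve $\Gamma$ on $\mathcal{H}$ of which $l$ is the asymptote.

$$\Gamma:\quad x = -(\alpha-\beta)t + \{1 + (\alpha+\beta)^2t^2\}^{1/2}, \quad y = (\alpha-\beta)t + \{1 + (\alpha+\beta)^2t^2\}^{1/2}, \quad z = -2\gamma t, \quad t \ge 0.$$

On this curve we find, using the fact that $D = F = 0$,

$$\alpha x + \beta y + 2\gamma z = (\alpha+\beta)\{\sqrt{1 + (\alpha+\beta)^2t^2} - (\alpha+\beta)t\} \to 0 \quad\text{for}\quad t \to \infty$$

and

$$-\lambda x + \mu y + \nu z = (\mu-\lambda)\{\sqrt{1 + (\alpha+\beta)^2t^2} - (\alpha+\beta)t\} \to 0 \quad\text{for}\quad t \to \infty.$$

Since on $\mathcal{H}$ $\alpha x + \beta y + 2\gamma z > 0$,

$$\inf_{\mathcal{H}} \{\alpha x + \beta y + 2\gamma z + (-\lambda x + \mu y + \nu z)^2\} = 0,$$

but this infimum is not assumed for finite $x$, $y$ and $z$. □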

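Remark 1. In the case $D = F = 0$ the quartic equation $(\rho - E)^2(\rho^2 - F) + D^2(2\rho - E) = 0$ has solutions $\rho = 0, 0, E, E$. Thus the formulae (1.3.18), (1.3.19) and (1.3.20) for the Euclidean parameters $x$, $y$ and $z$ of the optimal norm-reducing shears are not usable.

Remark 2. If $\alpha = 0 \wedge \beta = 0$, then each affected element of A not belonging to $A_{\ell m}$ equals zero. Hence the investigation may be confined to the $(\ell,m)$-restriction of A.

If $\alpha = 0 \wedge \beta = 0 \wedge E > 0$, then the minimum of $f$ on $\mathcal{H}$ determines a class of shear similarity transformations. Each transformation of this class symmetrizes the $(\ell,m)$-restriction of A, i.e. $a'_{\ell m} = a'_{m\ell}$.

If $\alpha = 0 \wedge \beta = 0 \wedge E = 0 \wedge \lambda \ne \mu$, then the infimum of $f$ on $\mathcal{H}$ is not assumed. This situation is connected with the defectiveness of $A_{\ell m}$: this matrix of order two has two equal real eigenvalues and is not symmetric.

If $\alpha = 0 \wedge \beta = 0 \wedge E < 0$, then the Euclidean parameters

$$(x,y,z) = |E|^{-1/2}\,(2|\mu|,\ 2|\lambda|,\ \nu\operatorname{sign}(\lambda))$$

correspond to shears $T_{\ell m}$, such that the $(\ell,m)$-restriction of $T_{\ell m}^{-1} A\, T_{\ell m}$ has Murnaghan's canonical form:

$$\begin{pmatrix} \operatorname{Re}\lambda_{\ell m}^{(1)} & -\operatorname{Im}\lambda_{\ell m}^{(1)} \\ \operatorname{Im}\lambda_{\ell m}^{(1)} & \operatorname{Re}\lambda_{\ell m}^{(1)} \end{pmatrix},$$

where $\lambda_{\ell m}^{(1)}$ is a complex eigenvalue of $A_{\ell m}$.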
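The case $E < 0$ of remark 2 can be traced on a 2x2 example. This is a minimal sketch under the assumption that the unimodular upper triangular representative of section 1.1 is used as the shear; the function names and the sample block are hypothetical.

```python
import math

def murnaghan_parameters(lam, mu, nu):
    """Euclidean parameters of remark 2 for alpha = beta = 0 and E < 0."""
    E = nu * nu + 4.0 * lam * mu
    if E >= 0.0:
        raise ValueError("this sketch only covers E < 0")
    scale = 1.0 / math.sqrt(-E)
    sign = 1.0 if lam > 0 else -1.0
    return 2.0 * scale * abs(mu), 2.0 * scale * abs(lam), scale * nu * sign

def transform_block(block, x, y, z):
    """Apply B^{-1} A B with the unimodular upper triangular representative B,
    using the pivot formulas of (1.2.3) with d = 1."""
    (a, b), (c, e) = block
    p, q, r, s = 1.0 / math.sqrt(y), z / math.sqrt(y), 0.0, math.sqrt(y)
    return [[p * s * a - q * r * e + r * s * b - p * q * c,
             s * s * b - q * q * c + q * s * (a - e)],
            [p * p * c - r * r * b - p * r * (a - e),
             p * s * e - q * r * a - r * s * b + p * q * c]]

block = [[1.0, 2.0], [-1.0, 3.0]]              # eigenvalues 2 + i and 2 - i
lam, mu, nu = block[1][0], block[0][1], block[0][0] - block[1][1]
x, y, z = murnaghan_parameters(lam, mu, nu)    # (2, 1, 1), with x*y - z*z = 1
result = transform_block(block, x, y, z)
```

The transformed block has equal diagonal elements (the real part of the eigenvalues) and off-diagonal elements of opposite sign and equal modulus (the imaginary part), which is the canonical form referred to in the remark.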

Remark 3. The particular case $\alpha + \beta \ne 0$, $D = F = 0$ will be