Tilburg University
Bundle methods in combinatorial optimization
Sotirov, R.
Publication date:
2003
Document Version
Publisher's PDF, also known as Version of record
Link to publication in Tilburg University Research Portal
Citation for published version (APA):
Sotirov, R. (2003). Bundle methods in combinatorial optimization. Department of Mathematics, University of
Klagenfurt, Austria.
Bundle Methods in Combinatorial
Optimization
DISSERTATION

submitted in fulfillment of the requirements for the academic degree of Doktorin der Technischen Wissenschaften (Doctor of Technical Sciences)

Universität Klagenfurt
Fakultät für Wirtschaftswissenschaften und Informatik

1. Begutachter (first examiner): Univ. Prof. Dipl.-Ing. Dr. Franz Rendl, Institut für Mathematik
2. Begutachter (second examiner): Univ. Prof. Dr. Johannes Schoißengeier
I declare on my honor that I have written the present thesis myself and have carried out all the work preceding and accompanying it. The literature used in the thesis, as well as the extent of the support granted to me during the entire course of the work, are stated without exception. The thesis has not yet been submitted to any other examination authority.
Dipl.-Ing. Renata Sotirov
Abstract

Semidefinite Programming (SDP) has recently turned out to be a very powerful tool for approximating some NP-hard problems. The nature of the Quadratic Assignment Problem (QAP) suggests SDP as a way to derive tractable relaxations.

We present several Semidefinite Programming relaxations of the QAP with increasing levels of complexity, formulated in matrix spaces of different dimensions. We also use a representation of a permutation matrix in the lifted space which allows us to exploit sparsity in a simple way, and which is smaller than the one for the standard SDP relaxations of the QAP. The trade-off between the strength of the bounds and the time needed to compute them is presented.
All results are computed using the Interior Point Method and/or the Bundle Method. The Bundle Method turns out to be a very favorable method for solving large combinatorial optimization problems. The method allows the selection of important constraints from the given model, which are treated indirectly using Lagrangian duality.
The computational results demonstrate the efficiency of the approach. Our bounds are currently the strongest ones available for the QAP. We investigate their potential for Branch and Bound settings by looking at the bounds in the first levels of the branching tree.
Acknowledgement

I am extremely grateful to my supervisor, Professor Franz Rendl, for his excellent guidance and support throughout my Ph.D. studies. His ideas and advice have been the foundation of the research leading to this thesis. I also thank him for his readiness to discuss whenever I needed it, and for his generously given time. His persistence and patience also contributed to my knowledge of German. Many thanks for that as well.
I am grateful to Professor Henry Wolkowicz for the invitations to visit the University of Waterloo. My two visits were, thanks to many discussions and joint work, very productive and enjoyable.
I would also like to thank our system engineer Gerald Hochegger for help on numerous occasions concerning the university machines, and for making them available for the big computations.
I gratefully acknowledge financial support provided by Prof. Franz Rendl's research grant from the Austrian Science Foundation (Fonds zur Förderung der wissenschaftlichen Forschung, FWF) under Project P12660-MAT, and Prof. Gerhard J. Woeginger's research grant from the START program Y43-MAT of the Austrian Ministry of Science. I also gratefully acknowledge financial support in the form of teaching assistantships (two semesters) from the Department of Mathematics, and the University of Klagenfurt for the Dr. Manfred Gehring research fellowship.
Finally, I would like to thank my parents for their encouragement and support throughout my graduate studies. Although they were very far away, they were nevertheless very close.
Contents

Abstract
Acknowledgement
List of Tables
List of Figures
List of Symbols
1 Preliminaries and Notation
1.1 Matrices
1.2 Operators
2 Semidefinite Programming
2.1 The Semidefinite Programming Problem
2.2 Duality Theory
2.3 Nondegeneracy and Strict Complementarity
2.4 Primal-Dual Interior Point Methods
3 The Quadratic Assignment Problem
3.1 Problem Formulation
3.2 Equivalent Formulations of QAP
3.3 Applications
3.4 Computational Complexity of QAP
3.5 A Convex Quadratic Programming Relaxation
4 SDP Relaxations of QAP in $S_{n^2+1}$
4.1 Deriving the Relaxations
4.2 Gangster and Arrow: Linearly Dependent Constraints
4.3 Solving the $QAP_{R2}$ Relaxation
4.4 Strictly Feasible Points
5 SDP Relaxations of QAP in $S_{(n-1)^2+1}$
5.1 A New Representation of the Permutation Matrix
5.2 About the Structure of the New Parametrization
5.2.1 Elementwise Matrix Description
5.3 How do $\hat{V}$ and $V$ Interrelate?
5.4 Deriving the Relaxations
6 The Bundle Method in Combinatorial Optimization
6.1 Introduction
6.2 Lagrangian Duality
6.3 The Bundle Method
6.3.1 Bundle Method: the Basic Idea
7 Bounds for the QAP by using the Bundle Method
7.1 Bundle Method to Solve QAP SDP Relaxations
7.2 Computational Results for the Relaxations in $S_{n^2+1}$
7.3 Computational Results for the Relaxation in $S_{(n-1)^2+1}$
7.4 The Bounds after Branching
7.5 The Branch and Bound Tree Estimator
A Symmetric and Positive Semidefinite Matrices
B Matrix Calculus
C Minimax Problems
C.1 Convex Functions and Convex Sets
C.2 Directions of Recession and Recession Functions
C.3 Concave Functions
C.4 Bifunctions
C.5 Saddle-Functions
C.6 Minimax Problems
C.7 Conjugate Saddle-Functions and Minimax Theorems
C.8 Saddle-Points of the Lagrange Function
Summary and Outlook
Indices
Index of Authors
Index of Topics
List of Tables

4.1 Computation times to solve the relaxation $QAP_{R1}$
4.2 Optimal solutions of the $QAP_{R2}$ relaxation obtained using NEOS and computation time for one interior point iteration
5.1 Optimal solutions of the basic, ground and $QAP_{Rs2}$ relaxations
5.2 Optimal solutions of the relaxations $QAP_{Rs3}$ and $QAP_{Rs4}$ obtained using NEOS
7.1 Bounds of the $QAP_{R2}$ relaxation computed with the bundle method and computation time per one iteration of the bundle algorithm
7.2 Computation time per one iteration of the bundle algorithm for the $QAP_{R3}$ relaxation
7.3 Comparing bounds for QAPLIB instances I
7.4 Comparing bounds for QAPLIB instances II
7.5 Bounds for QAPLIB instances
7.6 $QAP_{R3}$ bounds in dependence on the number of iterations of the bundle algorithm
7.7 The time table for solving the minimization problems (7.10) and (7.1) with the interior point method
7.8 $QAP_{Rs4}$ bounds for Nugent instances
7.11 Results for the first level in the branching tree for Nug15
7.12 Results for the first level in the branching tree for Nug20
7.13 Results for the first level in the branching tree for Nug30
7.14 Estimation of nodes for Nug15
7.15 Estimation of nodes for Nug20
List of Figures

7.1 Deviation from the integer optimum of the normalized bounds, for different Nugent instances, obtained after increasing numbers of bundle iterations
List of Symbols

$\mathcal{A}(\cdot)$ ... the linear operator
$\mathcal{A}^T(\cdot)$ ... the adjoint of the linear operator $\mathcal{A}(\cdot)$
$A_{i,\cdot}$ ... the $i$th row of $A$
$A_{\cdot,j}$ ... the $j$th column of $A$
$A > B$ ... $a_{ij} > b_{ij}$, $\forall i,j$
$A \ge B$ ... $a_{ij} \ge b_{ij}$, $\forall i,j$
$A \circ B$ ... the Hadamard product of $A$ and $B$
$A \otimes B$ ... the Kronecker product of $A$ and $B$
$\succeq$ ... the Löwner partial order
$A \succ B$ ... $A - B$ is positive definite
$A \succeq B$ ... $A - B$ is positive semidefinite
$\mathrm{arrow}(\cdot)$ ... the arrow operator
$\mathrm{Arrow}(\cdot)$ ... the adjoint operator of the arrow operator
$\mathrm{diag}(X)$ ... the vector of the diagonal elements of the matrix $X$
$\mathrm{Diag}(x)$ ... the diagonal matrix with diagonal $x$
$\dim(S)$ ... dimension of the space $S$
$e$ ... the vector of ones
$E$ ... the matrix of ones
$\mathcal{E}$ ... the set of matrices with row and column sums one
$G_J(\cdot)$ ... the Gangster operator on $S_{n^2+1}$
$G_S(\cdot)$ ... the Gangster operator on $S_{(n-1)^2+1}$
$\mathcal{G}$ ... the set of $R \in S_{(n-1)^2+1}$ s.t. $G_J(\hat{V} R \hat{V}^T) = 0$
$\mathcal{G}_S$ ... the set of $R \in S_{(n-1)^2+1}$ s.t. $G_S(R) = 0$
$\mathrm{int}(S)$ ... the interior of $S$
$I$ ... the identity matrix
$\mathrm{Le}(\cdot)$ ... the operator defined on page 72
$\mathrm{Le}^T(\cdot)$ ... the adjoint operator of the operator Le
$\mathcal{L}$ ... the set of $R \in S_{(n-1)^2+1}$ s.t. $\mathrm{Le}(R) = (n-1)(n-2)$
$\mathrm{Lm}(\cdot)$ ... the operator defined on page 70
$\mathrm{Lm}^T(\cdot)$ ... the adjoint operator of the operator Lm
$M_k$ ... the space of $k \times k$ real matrices
$M_{m,k}$ ... the space of $m \times k$ real matrices
$N(\cdot)$ ... the operator defined as $(N(Y))_{ij} = Y_{ij}$
$\mathcal{N}$ ... the set of $R \in S_{(n-1)^2+1}$ s.t. $N(\hat{V} R \hat{V}^T) \ge 0$
$\mathcal{N}_S$ ... the set of $R \in S_{(n-1)^2+1}$ s.t. $N_S(R) \ge 0$
$\mathrm{Null}(M)$ ... the null space of $M$
$\mathrm{oDiag}(S)$ ... the operator that sets the diagonal of $S$ to zero
$\hat{P}$ ... the set containing $P$
$P_S$ ... the set defined on page 69
$\Pi$ ... the set of permutation matrices
$\mathcal{R}$ ... the set of $R \in S_{(n-1)^2+1}$ s.t. $R \succeq 0$ and $\mathrm{arrow}(\hat{V} R \hat{V}^T) = e_0$
$\mathcal{R}_S$ ... the set of $R \in S_{(n-1)^2+1}$ s.t. $R \succeq 0$ and $\mathrm{arrow}(R) = e_0$
$\mathbb{R}^k$ ... the space of $k$-dimensional vectors
$\mathrm{Range}(A)$ ... the range space of $A$
$\mathrm{rank}(A)$ ... the rank of a matrix $A$
$S_k$ ... the space of real symmetric $k \times k$ matrices
$S_k^+$ ... the set of positive semidefinite matrices
$S_k^{++}$ ... the set of positive definite matrices
$\mathrm{tr}(A)$ ... the trace of a square matrix $A$
$\langle \cdot, \cdot \rangle$ ... the trace inner product on $M_{m,k}$
$\|A\|_F$ ... the Frobenius norm of $A$
$\mathrm{vec}(X)$ ... the vector formed from the columns of the matrix $X$
$\mathcal{Z}$ ... the set of $(0,1)$-matrices
Chapter 1

Preliminaries and Notation

In this chapter we give some basic notation and preliminary concepts that will be used throughout the thesis.
1.1 Matrices

We denote the space of $m \times k$ (resp. $k \times k$) real matrices by $M_{m,k}$ (resp. $M_k$). We use $\mathrm{tr}(A)$ to denote the trace of a square matrix $A \in M_k$, where

$$\mathrm{tr}(A) = \sum_{i=1}^k a_{ii} = \sum_{i=1}^k \lambda_i,$$

where $\lambda_i$ are the eigenvalues of $A = (a_{ij})$. For $A \in M_{m,k}$, $B \in M_{k,m}$ we get $\mathrm{tr}(AB) = \mathrm{tr}(BA)$. The space $M_{m,k}$ is considered with the trace inner product. For $A, B \in M_{m,k}$ the trace inner product is

$$\langle A, B \rangle = \mathrm{tr}(B^T A) = \sum_{i=1}^m \sum_{j=1}^k a_{ij} b_{ij}.$$

The norm associated with the trace inner product is the Frobenius norm

$$\|A\|_F = \sqrt{\langle A, A \rangle}.$$

For two matrices $A, B \in M_k$, $A \ge B$ ($A > B$) means $a_{ij} \ge b_{ij}$ ($a_{ij} > b_{ij}$) for all $i,j$. We say that $A \in M_k$ is symmetric if $A = A^T$. We denote the space of real symmetric $k \times k$ matrices by $S_k$. The dimension of $S_k$ is

$$\dim S_k = \frac{1}{2}k(k+1) =: \binom{k+1}{2}.$$

All eigenvalues of $A \in S_k$ are real numbers.
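The basic identities of this section can be checked numerically; the following sketch (with arbitrary random data) verifies that the trace inner product equals the elementwise sum, that the Frobenius norm is the induced norm, and that the trace is invariant under cyclic permutation.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((4, 3))

# <A,B> = tr(B^T A) equals the elementwise sum of a_ij * b_ij
inner = np.trace(B.T @ A)
assert np.isclose(inner, np.sum(A * B))

# The Frobenius norm is the norm induced by the trace inner product
assert np.isclose(np.sqrt(np.trace(A.T @ A)), np.linalg.norm(A, 'fro'))

# tr(AB) = tr(BA) for A in M_{m,k}, B in M_{k,m}
C = rng.standard_normal((3, 4))
assert np.isclose(np.trace(A @ C), np.trace(C @ A))
```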
Definition 1.1 (Positive definite and semidefinite matrix)
$A \in S_k$ is positive semidefinite ($A \succeq 0$) if $x^T A x \ge 0$, $\forall x \in \mathbb{R}^k$.
$A \in S_k$ is positive definite ($A \succ 0$) if $x^T A x > 0$, $\forall x \in \mathbb{R}^k \setminus \{0\}$.

The matrix $A \in S_k$ is positive semidefinite if and only if all eigenvalues of $A$ are real and greater than or equal to zero. The matrix $A \in S_k$ is positive definite if and only if all eigenvalues of $A$ are real and greater than zero. The space $S_k$ is equipped with the Löwner partial order, i.e. $A \succ B$ (resp. $A \succeq B$) denotes that $A - B$ is positive definite (resp. positive semidefinite). We now define the set of positive semidefinite matrices

$$S_k^+ := \{A \in S_k : A \succeq 0\},$$

and the set of positive definite matrices

$$S_k^{++} := \{A \in S_k : A \succ 0\}.$$

Remark 1.1 If $A \in S_k^+$ and $a_{ii} = 0$ for some $i \in \{1,\dots,k\}$, then $a_{ij} = 0$, $\forall j \in \{1,\dots,k\}$.
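Remark 1.1 can be illustrated numerically: a symmetric matrix with a zero diagonal entry but a nonzero entry in the same row cannot be positive semidefinite. The $3 \times 3$ data below is a hypothetical example chosen for illustration.

```python
import numpy as np

# a_22 = 0 but a_23 = 0.5: by Remark 1.1 this matrix cannot be PSD
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 0.5],
              [0.0, 0.5, 1.0]])
assert np.linalg.eigvalsh(A).min() < 0   # a negative eigenvalue exists

# The quadratic form x^T A x is negative for a suitable direction x
x = np.array([0.0, 2.0, -1.0])
assert x @ A @ x < 0
```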
More about symmetric matrices can be found in Appendix A.

We partition a symmetric matrix $Y \in S_{n^2+1}$ into blocks as follows:

$$Y = \begin{bmatrix} y_{00} & Y_0^T \\ Y_0 & Z \end{bmatrix},$$

where we use the index 0 for the first row and column. Hence $Y_0 \in \mathbb{R}^{n^2}$, $Z \in S_{n^2}$, the subvectors $Y_{p0} \in \mathbb{R}^n$, and the blocks $Y^{pq} \in M_n$. When referring to entry $r,s \in \{1,2,\dots,n^2\}$ of $Z$, we also use the pairs $(i,j),(k,l)$ with $i,j,k,l \in \{1,2,\dots,n\}$. This identifies the element in row $r = (i-1)n + j$ and column $s = (k-1)n + l$ by $Y_{(i,j),(k,l)}$. This notation is going to simplify both the modeling and the presentation of properties of the relaxations. If we consider $Z$ as a matrix consisting of $n \times n$ blocks $Y^{ik}$, then $Y_{(i,j),(k,l)}$ is just element $(j,l)$ of block $(i,k)$.
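The index correspondence $r = (i-1)n + j$ and its inverse can be sketched as small helper functions (hypothetical names, 1-based indexing as in the text):

```python
n = 4  # illustration size

def to_flat(i, j, n):
    """Map the 1-based pair (i, j) to the 1-based row index r of Z."""
    return (i - 1) * n + j

def to_pair(r, n):
    """Inverse map: recover the pair (i, j) from the flat index r."""
    return (r - 1) // n + 1, (r - 1) % n + 1

assert to_flat(2, 3, n) == 7
assert to_pair(7, n) == (2, 3)
# Round trip over all valid index pairs
assert all(to_pair(to_flat(i, j, n), n) == (i, j)
           for i in range(1, n + 1) for j in range(1, n + 1))
```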
We use $e_i$ to denote the $i$th unit vector; $e$ is the vector of ones. When there is no confusion with the unit vector, we use $e_k$ to indicate the size of the vector of all ones. The matrix $E_k$ is a $k \times k$ matrix with all its entries equal to one, and $I_k$ is a $k \times k$ identity matrix. We use $E$ and $I$ when there is no ambiguity.

We use $\mathcal{O}$ to denote the set of orthogonal matrices, i.e.

$$\mathcal{O} := \{X : XX^T = X^T X = I\},$$

$\mathcal{E}$ to denote the set of matrices with row and column sums one, called the set of assignment constraints, i.e.

$$\mathcal{E} := \{X : Xe = X^T e = e\},$$

and $\mathcal{Z}$ the set of $(0,1)$-matrices, i.e.

$$\mathcal{Z} := \{X : X_{ij} \in \{0,1\}\}.$$
Definition 1.2 Let $X = (x_{ij})$ be a $k \times k$ matrix. If the entries $x_{ij}$ fulfill the following conditions

$$\sum_{i=1}^k x_{ij} = 1, \; 1 \le j \le k, \qquad \sum_{j=1}^k x_{ij} = 1, \; 1 \le i \le k, \qquad x_{ij} \in \{0,1\}, \; 1 \le i,j \le k,$$

then $X$ is called a permutation matrix.

Premultiplication (resp. postmultiplication) of a matrix with a permutation matrix results in a matrix with rows (resp. columns) permuted. The following is known for permutation matrices, see [62, 46]:

1. $\Pi = \mathcal{E} \cap \mathcal{Z} = \mathcal{O} \cap \mathcal{Z}$, where $\Pi$ denotes the set of permutation matrices.
2. The determinant of a permutation matrix is $\pm 1$.
3. The product of two permutation matrices is again a permutation matrix.
4. $A$ is a $k \times k$ doubly stochastic matrix if and only if, for some $m \in \mathbb{N}$, there exist $k \times k$ permutation matrices $P_1, \dots, P_m$ and $\lambda_1, \dots, \lambda_m \in \mathbb{R}$, $\lambda_i \ge 0$ with $\lambda_1 + \dots + \lambda_m = 1$, such that $A = \lambda_1 P_1 + \dots + \lambda_m P_m$.
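Item 4 (the Birkhoff-von Neumann theorem) is easy to illustrate in one direction: a convex combination of permutation matrices is doubly stochastic. A hypothetical $3 \times 3$ example:

```python
import numpy as np

P1 = np.eye(3)
P2 = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)
lam = (0.3, 0.7)                     # convex weights
A = lam[0] * P1 + lam[1] * P2

# A is doubly stochastic: nonnegative with row and column sums one
assert (A >= 0).all()
assert np.allclose(A.sum(axis=0), 1) and np.allclose(A.sum(axis=1), 1)
```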
We also need some notation to be able to refer to certain elements or parts of matrices. The notation that we use is similar to the syntax of MATLAB. For $A \in M_{n,k}$ the $i$th row of $A$ is denoted by $A_{i,\cdot}$, and accordingly we denote the $j$th column by $A_{\cdot,j}$.
1.2 Operators

Here we define the most important operators that appear in the thesis. For a linear operator $\mathcal{A} : \mathbb{R}^k \to \mathbb{R}^m$, the adjoint operator of $\mathcal{A}$, denoted $\mathcal{A}^T$, is a linear operator mapping from $\mathbb{R}^m$ to $\mathbb{R}^k$ such that for any $x \in \mathbb{R}^k$ and any $y \in \mathbb{R}^m$,

$$\langle \mathcal{A}(x), y \rangle = \langle x, \mathcal{A}^T(y) \rangle.$$

The range space of $\mathcal{A}^T$ is orthogonal to the null space of $\mathcal{A}$. If $\mathcal{A}$ and $\mathcal{A}^T$ are written as matrices, then $\mathcal{A}^T = A^T$.

For $X \in M_k$, $\mathrm{vec}(X)$ denotes the vector in $\mathbb{R}^{k^2}$ that is formed from the columns of the matrix $X$. More precisely, the operator $\mathrm{vec} : M_k \to \mathbb{R}^{k^2}$ is defined as

$$\mathrm{vec}(X) = \begin{bmatrix} X_{\cdot,1} \\ \vdots \\ X_{\cdot,k} \end{bmatrix}.$$

The connection between the operators vec and tr is given by the following relation; see e.g. [32]:

$$\mathrm{tr}(AB) = (\mathrm{vec}(A^T))^T \mathrm{vec}(B), \qquad A, B \in M_k. \tag{1.2}$$
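Relation (1.2) can be verified directly with random data. Note that vec stacks *columns*, which in NumPy corresponds to `order='F'` (Fortran order):

```python
import numpy as np

def vec(X):
    """Column-stacking vec operator."""
    return X.flatten(order='F')

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# tr(AB) = vec(A^T)^T vec(B)
assert np.isclose(np.trace(A @ B), vec(A.T) @ vec(B))
```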
The operator Diag maps $\mathbb{R}^k$ to $M_k$: for $x \in \mathbb{R}^k$, $\mathrm{Diag}(x)$ is the diagonal matrix with the entries of $x$ on its diagonal. Conversely, $\mathrm{diag}(X)$ is the vector of the diagonal elements of the matrix $X$. $\mathrm{Diag}(x)$ is the adjoint operator of $\mathrm{diag}(X)$.

The Hadamard product is a map $\circ : M_{m,k} \times M_{m,k} \to M_{m,k}$ which is defined as

$$(A \circ B)_{ij} := a_{ij} b_{ij}, \quad \forall i,j.$$

The Kronecker product is a map $\otimes : M_{m,k} \times M_{p,q} \to M_{mp,kq}$ which is defined as

$$A \otimes B := \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1k}B \\ a_{21}B & a_{22}B & \cdots & a_{2k}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mk}B \end{bmatrix}.$$
Let $A, B, C, D$ be matrices of appropriate size. The following identities are known in matrix analysis, see e.g. [32]:

$$(A \otimes B)^T = A^T \otimes B^T \tag{1.3}$$
$$\mathrm{tr}(A \otimes B) = \mathrm{tr}(A)\,\mathrm{tr}(B) \tag{1.4}$$
$$\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X) \tag{1.5}$$
$$\mathrm{tr}(ABCD) = \mathrm{vec}(D^T)^T (C^T \otimes A)\,\mathrm{vec}(B). \tag{1.6}$$

Let $J \subseteq \{(i,j) : 1 \le i,j \le n^2+1\}$. The operator $G_J : S_{n^2+1} \to S_{n^2+1}$ defined as

$$(G_J(Y))_{ij} := \begin{cases} Y_{ij} & \text{if } (i,j) \in J \\ 0 & \text{otherwise} \end{cases} \tag{1.7}$$

is called the Gangster operator, and the set $J$ the set of the gangster indices. The name of the Gangster operator was introduced in [90]. We keep the name, even though we feel that it is not quite appropriate. We denote the subspace of $(n^2+1) \times (n^2+1)$ symmetric matrices with nonzero index set $J$ by $S_J$:

$$S_J := \{X \in S_{n^2+1} : X_{ij} = 0 \text{ if } (i,j) \notin J\}. \tag{1.8}$$

Note that

$$\mathrm{Range}(G_J(\cdot)) = S_J \quad \text{and} \quad \mathrm{Null}(G_J(\cdot)) = S_{\bar{J}},$$

where $\bar{J}$ denotes the complement of $J$. The adjoint equation

$$\mathrm{tr}(G_J(Z)Y) = \mathrm{tr}(Z\,G_J(Y))$$

implies that the Gangster operator is self-adjoint, i.e., $G_J^T(\cdot) = G_J(\cdot)$.
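A minimal sketch of the Gangster operator (1.7), checking both the adjoint equation and idempotence on random symmetric data (the index set J below is a hypothetical, 0-based example):

```python
import numpy as np

def gangster(Y, J):
    """Keep the entries indexed by J and zero out everything else."""
    out = np.zeros_like(Y)
    for (i, j) in J:
        out[i, j] = Y[i, j]
    return out

rng = np.random.default_rng(2)
Y = rng.standard_normal((5, 5)); Y = Y + Y.T     # symmetric test matrices
Z = rng.standard_normal((5, 5)); Z = Z + Z.T
J = {(0, 0), (1, 3), (3, 1), (2, 4), (4, 2)}     # a symmetric index set

# Self-adjointness: tr(G_J(Z) Y) = tr(Z G_J(Y))
assert np.isclose(np.trace(gangster(Z, J) @ Y), np.trace(Z @ gangster(Y, J)))
# Applying the operator twice changes nothing (it is a projection)
assert np.allclose(gangster(gangster(Y, J), J), gangster(Y, J))
```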
Chapter 2

Semidefinite Programming

Linearized models appear in many real-world applications, and such models describe key features of a problem quite accurately. Semidefinite programming (SDP) is an extension of linear programming where the nonnegativity constraints are replaced by positive semidefiniteness of the matrix variables. Practice shows that semidefinite models are sometimes significantly stronger than purely linear ones. The algorithmic ideas can be extended quite naturally from linear to semidefinite optimization.

The theory of semidefinite programming was studied already by Bellman and Fan [11] in the 1960s. An explicit use of semidefinite programming in combinatorial optimization appeared in the seminal work of Lovász [61] on the so-called theta function, in the late 1970s. Lately, there has been much interest in the area of SDP because of applications in combinatorial optimization [59, 58, 97] and in control theory [25, 13], and also because of the development of efficient interior point algorithms.

In Section 2.1 we introduce the standard formulation of a primal semidefinite program (PSDP) and derive its dual (DSDP). In Section 2.2 we explain the duality theory. The condition that implies uniqueness of the primal and dual solution, known as nondegeneracy, is described in Section 2.3. In Section 2.4 we describe some primal-dual interior point methods and search directions.
2.1 The Semidefinite Programming Problem

Semidefinite programming is a special case of convex programming where the feasible region is an affine subspace of the cone of positive semidefinite matrices. In comparison to standard linear programming, the vector $x \in \mathbb{R}^n_+$ of variables is replaced by a matrix variable $R \in S_n^+$. Let $L$ be a given symmetric $n \times n$ matrix, and $a \in \mathbb{R}^m$. We consider the following semidefinite programming problem in the variable $R \in S_n$:

$$(PSDP) \qquad p^* := \min \; \langle L, R \rangle \quad \text{s.t.} \quad \mathcal{A}(R) = a, \; R \succeq 0,$$

where $\mathcal{A} : S_n \to \mathbb{R}^m$ is a linear operator on the space of symmetric matrices. The linear operator $\mathcal{A}$ acting on $R \in S_n$ can be expressed explicitly by the following vector:

$$\mathcal{A}(R) = \begin{pmatrix} \langle A_1, R \rangle \\ \vdots \\ \langle A_m, R \rangle \end{pmatrix},$$

where $A_i \in S_n$ for $i = 1, \dots, m$. The adjoint operator $\mathcal{A}^T$ satisfies the adjoint relation

$$\langle \mathcal{A}(R), w \rangle = \langle R, \mathcal{A}^T(w) \rangle, \quad \text{for all } R \in S_n \text{ and } w \in \mathbb{R}^m.$$

Since

$$\langle \mathcal{A}(R), w \rangle = \sum_{i=1}^m w_i \,\mathrm{tr}(A_i R) = \mathrm{tr}\Big(R \sum_{i=1}^m w_i A_i\Big) = \langle R, \mathcal{A}^T(w) \rangle,$$

we obtain

$$\mathcal{A}^T(w) = \sum_{i=1}^m w_i A_i.$$
We will derive the dual of (PSDP) by introducing the Lagrange multiplier $w \in \mathbb{R}^m$ for the equality constraint $\mathcal{A}(R) = a$, and by using the minimax inequality. The Lagrangian is $L(R,w) = \langle L, R \rangle + \langle a - \mathcal{A}(R), w \rangle$, and thus

$$\max_w L(R,w) = \begin{cases} \langle L, R \rangle & \text{if } \mathcal{A}(R) = a \\ +\infty & \text{otherwise.} \end{cases}$$

Therefore, the primal problem is equivalent to

$$p^* = \min_{R \succeq 0} \max_w L(R,w).$$

By interchanging the max and the min, and using the minimax inequality, we get

$$d^* := \max_w \min_{R \succeq 0} L(R,w) \le p^*.$$

We can rewrite the Lagrangian as $L(R,w) = \langle a, w \rangle + \langle L - \mathcal{A}^T(w), R \rangle$. From Corollary A.1 it follows that

$$\min_{R \succeq 0} L(R,w) = \begin{cases} \langle a, w \rangle & \text{if } L - \mathcal{A}^T(w) \succeq 0 \\ -\infty & \text{otherwise.} \end{cases}$$

By introducing the dual slack variable $Z$, the dual problem of (PSDP) is

$$(DSDP) \qquad d^* := \max \; \langle a, w \rangle \quad \text{s.t.} \quad L - \mathcal{A}^T(w) = Z, \; Z \succeq 0,$$

for $w \in \mathbb{R}^m$.

2.2 Duality Theory
The gap between a primal feasible solution $R$ and a dual feasible solution $(w,Z)$, called the duality gap, is defined as the difference between the primal objective and the dual objective value:

$$\langle L, R \rangle - \langle a, w \rangle = \langle Z + \mathcal{A}^T(w), R \rangle - \langle \mathcal{A}(R), w \rangle = \langle Z, R \rangle \ge 0. \tag{2.1}$$

The inequality in (2.1) follows from Lemma A.2.
Lemma 2.1 (Weak Duality)
Let $R \in S_n^+$ and $w \in \mathbb{R}^m$ be given with $\mathcal{A}(R) = a$ and $L - \mathcal{A}^T(w) = Z \succeq 0$. Then $\langle a, w \rangle \le \langle L, R \rangle$.
Weak duality provides lower bounds on the optimal value of the primal program. If $\langle Z, R \rangle \ge 0$ turns out to be zero, then this primal-dual pair is an optimal solution. The crucial issue in duality theory consists in identifying sufficient conditions that ensure a zero duality gap, also called strong duality. The Slater condition (see [42]) is an example of a sufficient condition for strong duality.
Definition 2.1 (Feasibility)
A point $R \succeq 0$ is feasible for (PSDP) if $\mathcal{A}(R) = a$.
A pair $(w,Z)$, $Z \succeq 0$, is feasible for (DSDP) if $L = \mathcal{A}^T(w) + Z$.

Definition 2.2 (Strict feasibility)
A point $R$ is strictly feasible for (PSDP) if it is feasible for (PSDP) and satisfies $R \succ 0$.
A pair $(w,Z)$ is strictly feasible for (DSDP) if it is feasible for (DSDP) and satisfies $Z \succ 0$.

Definition 2.3 (Slater constraint qualification)
(PSDP) satisfies the Slater condition if there exists a strictly feasible point $R$ for (PSDP).
(DSDP) satisfies the Slater condition if there exists a strictly feasible pair $(w,Z)$ for (DSDP).

If the Slater condition does not hold, then a duality gap $p^* > d^*$ can exist, and/or the dual (or primal) optimal value may not be attained.
The following example exhibits a positive duality gap; we write the cost function and constraints in matrix form:

$$\min \; \left\langle \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & \frac{3}{2} \\ 0 & \frac{3}{2} & 0 \end{bmatrix}, X \right\rangle$$

$$\text{s.t.} \quad \left\langle \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, X \right\rangle = 0, \qquad \left\langle \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}, X \right\rangle = 0,$$

$$\left\langle \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, X \right\rangle = 0, \qquad \left\langle \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -\frac{1}{2} \\ 0 & -\frac{1}{2} & 0 \end{bmatrix}, X \right\rangle = 3.$$

The dual program is

$$\max \; 3y_4 \quad \text{s.t.} \quad Z = \begin{bmatrix} -y_4 & -y_1 & -y_2 \\ -y_1 & -y_3 & \frac{3+y_4}{2} \\ -y_2 & \frac{3+y_4}{2} & 0 \end{bmatrix} \succeq 0.$$

Since $x_{22} = 0$, a necessary condition for the primal matrix to be positive semidefinite is that $x_{23} = 0$ (see Remark 1.1). Following the same idea we obtain from $z_{33} = 0$ that $z_{32} = 0$, and hence $y_4 = -3$ in the dual program. Since $p^* = 0$ and $d^* = -9$, the duality gap is nine.
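The $3 \times 3$ example above can be checked numerically (matrix data as reconstructed from the garbled display): the primal optimum is 0, while the unique dual feasible choice of $y_4$ gives objective $-9$.

```python
import numpy as np

C = np.array([[0, 0, 0], [0, 0, 1.5], [0, 1.5, 0]])
A4 = np.array([[1, 0, 0], [0, 0, -0.5], [0, -0.5, 0]])

# Primal: x22 = 0 forces x23 = 0, so X = Diag(3, 0, 0) is optimal.
X = np.diag([3.0, 0.0, 0.0])
assert np.isclose(np.trace(A4 @ X), 3)           # feasible for the 4th constraint
assert np.isclose(np.trace(C @ X), 0)            # primal value 0

# Dual: z33 = 0 forces z23 = (3 + y4)/2 = 0, hence y4 = -3.
y4 = -3.0
Z = C - y4 * A4                                   # with y1 = y2 = y3 = 0
assert np.linalg.eigvalsh(Z).min() >= -1e-12      # Z is PSD, so dual feasible
assert np.isclose(3 * y4, -9)                     # dual value -9: a gap of 9
```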
Duffin [21] shows the following result.

Theorem 2.1 (i) If (PSDP) satisfies the Slater constraint qualification and $p^*$ is finite, then $p^* = d^*$ and the value is attained for (DSDP).
(ii) If (DSDP) satisfies the Slater constraint qualification and $d^*$ is finite, then $p^* = d^*$ and the value is attained for (PSDP).
(iii) If (PSDP) and (DSDP) are both strictly feasible, then $p^* = d^*$ and both values are attained.
The following theorem is also well known.

Theorem 2.2 [94]
Let (PSDP) and (DSDP) satisfy the Slater constraint qualification. If one of the problems is infeasible, then the other is infeasible or unbounded.

Suppose now that (PSDP) and (DSDP) are both strictly feasible. Since the duality gap is zero, it follows that

$$0 = \mathrm{tr}(RZ) = \mathrm{tr}(R^{\frac{1}{2}} Z R^{\frac{1}{2}}) = \langle Z^{\frac{1}{2}} R^{\frac{1}{2}}, Z^{\frac{1}{2}} R^{\frac{1}{2}} \rangle = \| Z^{\frac{1}{2}} R^{\frac{1}{2}} \|_F^2.$$

Hence $Z^{\frac{1}{2}} R^{\frac{1}{2}} = 0$, which implies that $ZR = 0$.
Definition 2.4 (Complementary slackness)
We say that $(R,Z) \in S_n \times S_n$ are complementary, or satisfy complementary slackness, if $ZR = 0$.

The complementarity condition implies that $R$ and $Z$ commute, and hence share an orthonormal system of eigenvectors, say $Q$. Clearly this results in $\mathrm{rank}(R) + \mathrm{rank}(Z) \le n$.

Definition 2.5 (Maximal and Strict Complementarity)
A primal solution $R$ and a dual solution $(w,Z)$ are said to satisfy maximal complementarity if $R$ and $Z$ have maximal rank among all solutions. A primal solution $R$ and a dual solution $(w,Z)$ are said to satisfy strict complementarity if $\mathrm{rank}(R) + \mathrm{rank}(Z) = n$.
Let us denote the eigenvalues of $R$ and of $Z$ by

$$\lambda = (\lambda_1, \dots, \lambda_n) \ge 0 \quad \text{and} \quad \omega = (\omega_1, \dots, \omega_n) \ge 0,$$

respectively. We assume without loss of generality that the components of $\lambda$ (resp. $\omega$) are arranged in nonincreasing (resp. nondecreasing) order, i.e. $\lambda_1 \ge \dots \ge \lambda_n$ (resp. $\omega_1 \le \dots \le \omega_n$). Writing the primal solution as (see Theorem A.1)

$$R = Q\,\mathrm{Diag}(\lambda)\,Q^T \tag{2.2}$$

and the dual slack solution as

$$Z = Q\,\mathrm{Diag}(\omega)\,Q^T, \tag{2.3}$$

we can restate the complementarity condition $ZR = 0$ as $\lambda \circ \omega = 0$, and strict complementarity as $\lambda + \omega > 0$. In the case that strong duality holds, we get the following primal-dual characterization of optimality conditions for (PSDP) and (DSDP):

$$(\mathrm{OPT}) \qquad \begin{array}{ll} \mathcal{A}(R) = a & \text{(primal feasibility)} \\ Z + \mathcal{A}^T(w) = L & \text{(dual feasibility)} \\ ZR = 0 & \text{(complementary slackness)}, \end{array} \tag{2.4}$$

where $R \in S_n^+$, $Z \in S_n^+$, and $w \in \mathbb{R}^m$.
The optimality conditions (OPT) are necessary and sufficient. These optimality conditions provide the basis for: (i) the primal simplex method (maintain primal feasibility and complementary slackness while striving for dual feasibility); (ii) the dual simplex method (maintain dual feasibility and complementary slackness while striving for primal feasibility); and (iii) interior point methods (maintain primal and dual feasibility while striving for complementary slackness). Since there are currently no efficient algorithms for primal or dual simplex methods for SDP, we explain interior point methods, which have proven to be very successful. Before we describe the interior point algorithm, we derive conditions that imply uniqueness of the primal and dual solution.
2.3 Nondegeneracy and Strict Complementarity

Here we consider the semidefinite program (PSDP) and assume that the matrices $A_k$, $k = 1, \dots, m$ are linearly independent. Throughout this section we also assume that there exists a primal feasible point $R \succ 0$, and a dual feasible point $(w,Z)$ with $Z \succ 0$, such that $\mathrm{rank}(R) + \mathrm{rank}(Z) = n$.

We define the set

$$M_r := \{R \in S_n : \mathrm{rank}(R) = r\},$$

and write $M_r^+ := M_r \cap S_n^+$. Since the eigenvalues of a matrix $R$ are continuous functions of $R$, it is clear how these sets stratify the cone: the boundary of $S_n^+$ is given by

$$\partial S_n^+ = M_0^+ \cup \dots \cup M_{n-1}^+,$$

and the interior of $S_n^+$ is $\mathrm{int}(S_n^+) = M_n^+$.
Let $R$ be primal feasible with $\mathrm{rank}(R) = r$ and

$$R = Q\,\mathrm{Diag}(\lambda_1, \dots, \lambda_r, 0, \dots, 0)\,Q^T, \tag{2.5}$$

where $Q^T Q = I$. The tangent space to $M_r$ at $R$ is (see [8])

$$T_R = \left\{ Q \begin{bmatrix} U & V \\ V^T & 0 \end{bmatrix} Q^T : U \in S_r, \; V \in M_{r,(n-r)} \right\}.$$

Note that

$$\dim T_R = \binom{r+1}{2} + r(n-r) = \binom{n+1}{2} - \binom{n-r+1}{2}.$$

Remark 2.1 For $\Delta R \in T_R$ we have

$$Q^T (R + \Delta R) Q = \begin{bmatrix} \mathrm{Diag}(\lambda_1, \dots, \lambda_r) + U & V \\ V^T & 0 \end{bmatrix}.$$

Thus $R \pm \epsilon \Delta R$ is not contained in $S_n^+$, for $\epsilon > 0$, unless $V = 0$ (see Remark 1.1).
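The dimension count for $T_R$ rests on the binomial identity $\binom{r+1}{2} + r(n-r) = \binom{n+1}{2} - \binom{n-r+1}{2}$, which holds for all $0 \le r \le n$ and can be checked exhaustively over small values:

```python
from math import comb

for n in range(1, 10):
    for r in range(n + 1):
        assert comb(r + 1, 2) + r * (n - r) == comb(n + 1, 2) - comb(n - r + 1, 2)
```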
Definition 2.6 $R$ is primal nondegenerate if it is primal feasible and

$$T_R + N = S_n, \tag{2.6}$$

where $N = \{Y \in S_n : \langle A_k, Y \rangle = 0, \; \forall k\}$.

Theorem 2.3 [3] Let $R$ be primal feasible with $\mathrm{rank}(R) = r$. A necessary condition for $R$ to be primal nondegenerate is that

$$m \le \binom{n+1}{2} - \binom{n-r+1}{2}. \tag{2.7}$$

Furthermore, let $Q_1 \in M_{n,r}$ and $Q_2 \in M_{n,(n-r)}$ respectively denote the first $r$ columns and the last $n-r$ columns of $Q$ given by (2.5). Then $R$ is primal nondegenerate if and only if the matrices

$$B_k = \begin{bmatrix} Q_1^T A_k Q_1 & Q_1^T A_k Q_2 \\ Q_2^T A_k Q_1 & 0 \end{bmatrix}, \quad k = 1, \dots, m \tag{2.8}$$

are linearly independent in $S_n$.

Proof: Since $\dim T_R = \binom{n+1}{2} - \binom{n-r+1}{2}$ and $\dim N = \binom{n+1}{2} - m$, inequality (2.7) follows directly. Equation (2.6) is equivalent to

$$T_R^\perp \cap N^\perp = \{0\}, \tag{2.9}$$

where $T_R^\perp$ and $N^\perp$ are respectively the orthogonal complements of $T_R$ and $N$. Namely,

$$T_R^\perp = \left\{ Q \begin{bmatrix} 0 & 0 \\ 0 & H \end{bmatrix} Q^T : H \in S_{n-r} \right\} \quad \text{and} \quad N^\perp = \mathrm{Span}\{A_k\}.$$

If the $B_k$ are linearly dependent, there exists $\eta \in \mathbb{R}^m$, $\eta \ne 0$, such that $\sum_{k=1}^m \eta_k B_k = 0$. This implies

$$\sum_{k=1}^m \eta_k A_k \in T_R^\perp,$$

which contradicts (2.9). Conversely, if the $B_k$ are linearly independent, then (2.9) holds.
Theorem 2.3 holds for any $Q$ satisfying (2.5).

Theorem 2.4 [3] Let $R$ be primal nondegenerate and optimal. Then there exists a unique optimal dual solution $(w,Z)$.

Proof: From the assumptions of the theorem it follows that a dual optimal solution $(w,Z)$ exists, so that complementarity holds. Let $Q_1$ and $Q_2$ respectively denote the first $r$ columns and the last $n-r$ columns of $Q$ given in (2.5). Any $\tilde{Z}$ satisfying the complementarity condition $\tilde{Z}R = 0$ must be of the form

$$\tilde{Z} = Q_2 H Q_2^T,$$

for some $H \in S_{n-r}$. The dual feasibility condition requires the existence of $\tilde{w} \in \mathbb{R}^m$ and $H \in S_{n-r}$ such that

$$Q_2 H Q_2^T + \mathcal{A}^T(\tilde{w}) = L.$$

Theorem 2.3 guarantees that any solution of this linear system is unique.

If we assume $Q$ satisfies (2.3) and (2.5), we find that

$$H = \mathrm{Diag}(\omega_{r+1}, \dots, \omega_n).$$
Let us now consider dual nondegeneracy. Let $(w,Z)$ be dual feasible with $\mathrm{rank}(Z) = s$ and

$$Z = Q\,\mathrm{Diag}(0, \dots, 0, \omega_{n-s+1}, \dots, \omega_n)\,Q^T \tag{2.10}$$

with $Q^T Q = I$. The tangent space to $M_s$ at $Z$ is

$$T_Z = \left\{ Q \begin{bmatrix} 0 & V \\ V^T & H \end{bmatrix} Q^T : V \in M_{n-s,s}, \; H \in S_s \right\}.$$

Note that

$$\dim T_Z = \binom{s+1}{2} + s(n-s) = \binom{n+1}{2} - \binom{(n-s)+1}{2}.$$

Definition 2.7 The point $(w,Z)$ is dual nondegenerate if it is dual feasible and $Z$ satisfies

$$T_Z + \mathrm{Span}\{A_k\} = S_n. \tag{2.11}$$

Theorem 2.5 [3] Let $(w,Z)$ be dual feasible with $\mathrm{rank}(Z) = s$. A necessary condition for $(w,Z)$ to be dual nondegenerate is that

$$m \ge \binom{(n-s)+1}{2}.$$

Furthermore, let $\tilde{Q}_1 \in M_{n,(n-s)}$ and $\tilde{Q}_2 \in M_{n,s}$ respectively denote the first $n-s$ and the last $s$ columns of $Q$ given by (2.10). Then $(w,Z)$ is dual nondegenerate if and only if the matrices

$$\tilde{B}_k = [\tilde{Q}_1^T A_k \tilde{Q}_1], \quad k = 1, \dots, m$$

span $S_{n-s}$.
Proof: It is an immediate consequence of the definition.

Theorem 2.6 [3] Let $(w,Z)$ be dual nondegenerate and optimal. Then there exists a unique optimal primal solution $R$.

Proof: From the assumptions of the theorem it follows that a primal optimal solution $R$ exists, so that complementarity holds. Let $\tilde{Q}_1$ and $\tilde{Q}_2$ respectively denote the first $n-s$ columns and the last $s$ columns of $Q$ given by (2.10). Any $\tilde{R}$ satisfying the complementarity condition $Z\tilde{R} = 0$ must be of the form

$$\tilde{R} = \tilde{Q}_1 U \tilde{Q}_1^T$$

for some $U \in S_{n-s}$. The primal feasibility condition in (2.4) reduces to

$$\langle \tilde{Q}_1^T A_k \tilde{Q}_1, U \rangle = a_k, \quad k = 1, \dots, m.$$

Theorem 2.5 guarantees that any solution of this linear system is unique.

If we assume $Q$ satisfies (2.5) and (2.10), we find that

$$U = \mathrm{Diag}(\lambda_1, \dots, \lambda_{n-s}).$$
Note also the distinction between the partitionings of $Q$ used in Theorems 2.3 and 2.5. The former uses $Q = [Q_1, Q_2]$ where $Q_1$ has $r$ columns, and the latter uses $Q = [\tilde{Q}_1, \tilde{Q}_2]$ where $\tilde{Q}_1$ has $n-s$ columns. These partitions are the same if and only if $r + s = n$, i.e. strict complementarity holds.
Theorem 2.7 [3] Suppose that $R$ and $(w,Z)$ are respectively primal and dual optimal solutions satisfying strict complementarity. Then if the primal solution $R$ is unique, the dual nondegeneracy condition must hold, and if the dual solution $(w,Z)$ is unique, the primal nondegeneracy condition must hold.

Proof: Let $Q$ satisfy conditions (2.5) and (2.10). Strict complementarity states that $r + s = n$, so the partitionings of $Q$ used in Theorems 2.3 and 2.5 are the same. Thus

$$R = Q_1\,\mathrm{Diag}(\lambda_1, \dots, \lambda_r)\,Q_1^T, \qquad Z = Q_2\,\mathrm{Diag}(\omega_{r+1}, \dots, \omega_n)\,Q_2^T.$$

Suppose first that the dual nondegeneracy assumption (2.11) fails to hold. We show that in this case $R$ cannot be a unique primal solution. Since $Z$ is an optimal dual solution, complementarity states that any optimal primal solution $\tilde{R}$ must satisfy

$$\tilde{R} = Q_1 U Q_1^T,$$

for some $U \in S_r$, and so the primal feasibility condition reduces to

$$\langle Q_1^T A_k Q_1, U \rangle = a_k, \quad k = 1, \dots, m.$$

Because the dual nondegeneracy assumption does not hold, the solution of this equation is not unique, but is determined on an affine subset of $S_r$, say $\mathcal{U}$. The condition $\tilde{R} \succeq 0$ holds if and only if $U \succeq 0$. But the particular choice $U = \mathrm{Diag}(\lambda_1, \dots, \lambda_r)$ lies in $\mathcal{U}$ and is positive definite, so there is an open set in $\mathcal{U}$ for which the same is true. Every such $U$ defines an $\tilde{R}$ which satisfies the optimality conditions.

Now suppose that the primal nondegeneracy assumption (2.6) fails to hold. We show that in this case $(w,Z)$ cannot be a unique dual solution. Complementarity states that any solution $\tilde{Z}$ must satisfy

$$\tilde{Z} = Q_2 H Q_2^T$$

for some $H \in S_s$, and so the dual feasibility condition reduces to the solvability of

$$Q_2 H Q_2^T + \mathcal{A}^T(\tilde{w}) = L$$

for some $\tilde{w} \in \mathbb{R}^m$ and $H \in S_s$. Because the primal nondegeneracy assumption does not hold, the solution of this equation is not unique, but is determined on an affine subset of $\mathbb{R}^m \times S_s$, say $\mathcal{W}$. The condition $\tilde{Z} \succeq 0$ holds if and only if $H \succeq 0$. But the particular choice $(\tilde{w} = w, \; H = \mathrm{Diag}(\omega_{r+1}, \dots, \omega_n))$ lies in $\mathcal{W}$ with $H$ positive definite, so there is an open set in $\mathcal{W}$ for which the same is true. Every such $H$ defines a $\tilde{Z}$ which satisfies the optimality conditions.
2.4 Primal-Dual Interior Point Methods

Semidefinite programs can be solved (more precisely, approximated) in polynomial time within any specified accuracy, either by the ellipsoid algorithm [33] or through interior-point algorithms. More recently, interior point methods [2, 16, 39, 68, 69, 88, 96] have turned out to be the method of choice to solve SDP, since they give faster algorithms than the ellipsoid method. The interior-point algorithms converge very fast, and an approximately optimal solution is obtained within a polynomial number of iterations. The computation of a single step is computationally rather expensive for problems that contain a large number of constraints. With current technology we are able to solve with these methods problems that contain about 8000 constraints.

The interior point methods for SDP are iterative algorithms which use a Newton-like method to generate search directions to find an approximate solution to the nonlinear system. Below we describe the interior-point approach for SDP, or more precisely the primal-dual interior point path-following method. First, we state several assumptions. From Theorem 2.1 it follows that a sufficient condition for the attainment of optimal primal and dual solutions is the existence of strictly feasible primal and dual solutions. The concept of the interior-point approach is based on the following assumptions.

Basic Assumption 1 Both the primal (PSDP) and the dual (DSDP) problem satisfy the Slater constraint qualification, i.e.,

$$\exists (R, w, Z) \; \text{s.t.} \; R \succ 0, \; Z \succ 0, \; \mathcal{A}(R) = a, \; L - \mathcal{A}^T(w) = Z.$$

Here we also assume that a strictly feasible starting point is known. To avoid trivialities, it is usual to assume the following.

Assumption 2.1 The linear equations $\langle A_i, R \rangle = a_i$, $i = 1, \dots, m$ are linearly independent.
The interior-point algorithm starts in the interior of the feasible region, thus in the cone of positive definite matrices. In order to stay in that cone during the iteration process, we change the objective function to $\langle L,R\rangle - \mu\log\det(R)$. The real number $\mu > 0$ is the so-called barrier parameter, and $-\log\det(R)$ is the barrier function. The value of the barrier function is small in the interior of the feasible region but grows to infinity when the boundary is approached. Optima are usually located on the boundary. In order to obtain convergence of the algorithm towards optima, the effect of the barrier function should be decreased after each iteration of the algorithm. This is obtained by weighting the barrier function with $\mu > 0$, whose value decreases as the algorithm proceeds. We now introduce the associated barrier problem for (PSDP), which we call the primal barrier problem:
$$\min\ \langle L,R\rangle - \mu\log\det(R)\qquad \mathcal{A}(R) = a,\quad R\succ 0. \tag{2.12}$$
Since the cost function of the barrier problem is strictly convex (see Lemma B.1), the optimal solution exists and is unique.
Remark 2.2  The barrier function $-\log\det(R)$ belongs to the class of so-called strongly self-concordant functions [68]. If a linear functional is added to a self-concordant function, the resulting function is again self-concordant. Hence, the objective function in the primal barrier problem is a self-concordant function. Nesterov and Nemirovskii [68] showed that for the class of strongly self-concordant functions Newton's method works especially well. More precisely, for convex sets having a strongly self-concordant barrier function which can be computed efficiently, Newton's method yields a polynomial time interior-point algorithm.
For each $\mu > 0$, there is a corresponding Lagrangian:
$$L_\mu(R,w) = \langle L,R\rangle + \langle a - \mathcal{A}(R), w\rangle - \mu\log\det(R),$$
where $w\in\mathbb{R}^m$ is a Lagrange multiplier. The first-order optimality conditions for the saddle point of the Lagrangian $L_\mu$ are called Karush-Kuhn-Tucker (KKT) conditions (Theorem C.13). For any fixed value $\mu > 0$, the KKT conditions are
$$\mathcal{A}(R) = a,\qquad L - \mathcal{A}^{T}(w) - \mu R^{-1} = 0,$$
for $R\succ 0$, $Z\succ 0$. Here we use the fact that $\frac{d}{dR}\log\det(R) = R^{-1}$, see Theorem B.1 and Remark B.1. In a primal-dual formulation we set $Z = \mu R^{-1}$. There are several equivalent formulations of this condition. We use $ZR = \mu I$. We now rewrite the KKT conditions in the following form:
$$(\mathrm{KKT}_\mu)\qquad \mathcal{A}(R) = a,\ \ R\succ 0;\qquad Z + \mathcal{A}^{T}(w) = L,\ \ Z\succ 0;\qquad ZR = \mu I. \tag{2.15}$$
The first equality in (2.15) is called primal feasibility, and the second dual feasibility. The third equation is called the perturbed complementarity condition. For $\mu = 0$ we have the complementarity condition.
Remark 2.3  The dual barrier problem is
$$\max\ \langle a,w\rangle + \mu\log\det(Z)\qquad L - \mathcal{A}^{T}(w) = Z,\quad Z\succ 0.$$
The cost function of this barrier problem is strictly concave, hence the optimal solution exists and is unique. For each $\mu > 0$, there is a corresponding Lagrangian:
$$L_\mu(R,w,Z) = \langle a,w\rangle + \langle R,\ Z + \mathcal{A}^{T}(w) - L\rangle + \mu\log\det(Z).$$
From the Lagrangian $L_\mu$ we obtain the KKT conditions (2.15).

Under Basic Assumption 1 and Assumption 2.1, for every $\mu > 0$ there exists a unique solution $(R_\mu, w_\mu, Z_\mu)$ of $(\mathrm{KKT}_\mu)$, see [68, 91]. The set
$$\{(R_\mu, w_\mu, Z_\mu) : \mu > 0\}$$
defines a smooth curve parametrized by $\mu$, which is usually called the primal-dual central path or central trajectory. For each point $(R,w,Z)$ on the central path, it is easy to determine its associated value $\mu$ using the last equation of the optimality conditions:
$$\mu = \frac{\mathrm{tr}(ZR)}{n}.$$
As $\mu\to 0$, the central path converges to an optimal solution with as much complementarity as possible; actually the point to which the central path converges for $\mu\to 0$ has maximal complementarity (see [51]). This point is called the analytic center.
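The relation $\mu = \mathrm{tr}(ZR)/n$ is easy to verify numerically: on the central path $ZR = \mu I$, so $\mathrm{tr}(ZR) = \mu n$. A minimal sketch (the matrix size and the value of $\mu$ are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu = 5, 0.3

# A random positive definite R.
G = rng.standard_normal((n, n))
R = G @ G.T + n * np.eye(n)

# On the central path the perturbed complementarity ZR = mu*I holds,
# i.e. Z = mu * R^{-1}.
Z = mu * np.linalg.inv(R)

# Recover mu from the last optimality condition: mu = tr(ZR)/n.
mu_recovered = np.trace(Z @ R) / n
print(abs(mu_recovered - mu))  # ~0 up to rounding
```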
Interior-point algorithms work as follows. We start with a point $u = (R,w,Z)$ which satisfies $R\succ 0$, $Z\succ 0$. If that point were to lie on the central path, the associated parameter would be $\mu = \frac{\mathrm{tr}(ZR)}{n}$. We do not assume that $u$ lies on the central path, but would like to move this triple towards the central path. We head for a point on the central path given by the parameter $\frac{\mu}{2}$. Such an approach performs very well in practice. Next, we attempt to find steps $(\Delta R, \Delta w, \Delta Z)$ such that the new point $(R + \alpha\Delta R,\ w + \alpha\Delta w,\ Z + \alpha\Delta Z)$, $\alpha > 0$, becomes close to the point $(R_{\mu/2}, w_{\mu/2}, Z_{\mu/2})$ on the central trajectory. We can find such a step with a variant of Newton's method. The system (2.15) has $(n+1)n + m$ variables, but $\frac{(n+1)n}{2} + n^2 + m$ equations, and therefore Newton's method cannot be directly applied to (2.15). The difference arises from $ZR - \mu I$, which need not be symmetric, even if $R$ and $Z$ are. Therefore some sort of symmetrization of the last equation in (2.15) is necessary to overcome this problem. Many authors have suggested different ways of symmetrizing the third equation in (2.15). Todd [88] analyzes twenty different search directions for SDP. In Section 2.6 we describe three search directions that are used most frequently in practice.
Here we present the variant of the search direction that was independently introduced by Helmberg, Rendl, Vanderbei and Wolkowicz [39]; Kojima, Shindoh and Hara [55]; and Monteiro [64]. This search direction is known as the H..K..M direction. It is simple, and yet computationally quite efficient. Here, the step direction can be determined by the linearized system
$$(\Delta\mathrm{KKT})\qquad\begin{aligned}\mathcal{A}(\Delta R) &= a - \mathcal{A}(R) =: F_p,\\ \Delta Z + \mathcal{A}^{T}(\Delta w) &= L - \mathcal{A}^{T}(w) - Z =: F_d,\\ \Delta Z\,R + Z\,\Delta R &= \mu I - ZR =: F_{ZR}.\end{aligned}$$
We first solve for $\Delta Z$ and eliminate the second equation of $(\Delta\mathrm{KKT})$:
$$\Delta Z = -\mathcal{A}^{T}(\Delta w) + F_d.$$
Now we solve for $\Delta R$ and eliminate the third equation of $(\Delta\mathrm{KKT})$:
$$\Delta R = Z^{-1}(F_{ZR} - \Delta Z\,R).$$
By substituting the previous equation into the first equation of the system $(\Delta\mathrm{KKT})$, the final equation of the system is
$$\mathcal{A}(Z^{-1}\mathcal{A}^{T}(\Delta w)R) = \mathcal{A}(Z^{-1}F_dR - \mu Z^{-1}) + a. \tag{2.16}$$
In practice it is very important to form the matrix representing the final system efficiently, by exploiting possible structure (see Section 4.3).
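To illustrate how the matrix of the final system (2.16) is formed, the sketch below assembles the $m\times m$ matrix with entries $\langle A_i, Z^{-1}A_jR\rangle$ (already symmetrized in the sense of Remark 2.4 below, so that a Cholesky factorization applies) and factors it. The data are random placeholders, not the constraints of any particular SDP:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 4

def rand_pd(n):
    G = rng.standard_normal((n, n))
    return G @ G.T + n * np.eye(n)

def rand_sym(n):
    G = rng.standard_normal((n, n))
    return (G + G.T) / 2

R, Z = rand_pd(n), rand_pd(n)
A = [rand_sym(n) for _ in range(m)]   # placeholder constraint matrices A_1..A_m
Zinv = np.linalg.inv(Z)

# System matrix of (2.16): M[i,j] = <A_i, Z^{-1} A_j R>, with the
# symmetrization of the operator A applied so that M is symmetric.
M = np.empty((m, m))
for i in range(m):
    for j in range(m):
        K = Zinv @ A[j] @ R
        M[i, j] = np.trace(A[i] @ (K + K.T)) / 2

# The system matrix is symmetric positive definite, so Cholesky applies.
Lfac = np.linalg.cholesky(M)
```

The factorization cannot fail here: the quadratic form of $M$ at $y$ equals $\mathrm{tr}(SZ^{-1}SR)$ for $S = \sum_i y_iA_i$, which is positive whenever $S\neq 0$ and $R, Z\succ 0$.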
Since the matrix $Z^{-1}\mathcal{A}^{T}(\Delta w)R$ is not symmetric, we need to extend the definition of the operator $\mathcal{A}$.

Remark 2.4  In order to apply the operator $\mathcal{A}$ to unsymmetric matrices, we extend its definition. For any nonsymmetric square matrix $X$, let
$$\mathcal{A}(X) = \tfrac{1}{2}\mathcal{A}(X + X^{T}).$$
The system (2.16) is positive definite (see [39]) and can therefore be solved quite efficiently by standard methods, yielding $\Delta w$. In our computations we use the Cholesky factorization (see Sections 4.1, 7.2, and 7.3). Back substitution gives $\Delta R$ and $\Delta Z$. In general, $\Delta R$ cannot be assumed to be symmetric, but this search direction always yields a symmetric $\Delta Z$. We symmetrize $\Delta R$ by
$$\Delta R \leftarrow \frac{\Delta R + \Delta R^{T}}{2}.$$
Kojima et al. [55] show that even under this symmetrization the interior point method has polynomial convergence. The new point is
$$R_n = R + \alpha\Delta R,\qquad w_n = w + \alpha\Delta w,\qquad Z_n = Z + \alpha\Delta Z,$$
where $\alpha\in(0,1]$ is the stepsize. Kojima et al. [55] state the condition on $\alpha$ which guarantees the positive semidefiniteness of the updated variables.
Lemma 2.2 [55]  Suppose that $R\in S_n^{++}$, $\Delta R\in S_n$ and $\alpha\geq 0$. Let $\lambda_{\min}$ be the smallest eigenvalue of $R^{-1}\Delta R$. If $\lambda_{\min}\geq 0$, then $R + \alpha\Delta R\succ 0$ for all $\alpha\geq 0$; otherwise $R + \alpha\Delta R\succ 0$ for all $0\leq\alpha < -1/\lambda_{\min}$.

In practice one starts with $\alpha = 1$, which is a full Newton step, and backtracks by multiplying the current $\alpha$ with a factor smaller than 1 (for instance 0.85) until positive definiteness of $R_n$ and $Z_n$ is reached. Once we reach positive definiteness of $R_n$ and $Z_n$, we multiply $\alpha$ with 0.95 to make sure the next point is not too close to the boundary.
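The backtracking rule just described can be sketched as follows, with positive definiteness tested by an attempted Cholesky factorization (the factors 0.85 and 0.95 are the ones quoted above):

```python
import numpy as np

def is_pd(M):
    """Test positive definiteness via an attempted Cholesky factorization."""
    try:
        np.linalg.cholesky(M)
        return True
    except np.linalg.LinAlgError:
        return False

def step_length(R, dR, Z, dZ, shrink=0.85, safety=0.95):
    """Backtrack from a full Newton step until R + a*dR and Z + a*dZ are
    positive definite, then pull back by a safety factor to stay away
    from the boundary of the cone."""
    a = 1.0
    while not (is_pd(R + a * dR) and is_pd(Z + a * dZ)):
        a *= shrink
    return safety * a

# Example: a full step would leave the cone, so backtracking is needed.
R = np.eye(3)
Z = np.eye(3)
dR = -2.0 * np.eye(3)   # R + dR = -I is not positive definite
dZ = np.zeros((3, 3))
a = step_length(R, dR, Z, dZ)
print(a)  # about 0.42: we need 1 - 2a > 0, i.e. a < 0.5, times the safety 0.95
```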
2.5 Predictor-Corrector

The predictor-corrector approach turns out to be very successful for semidefinite programming, see [16, 95]. The step direction is a linear combination of two search directions, the predictor and the corrector.

The predictor step solves the system $(\Delta\mathrm{KKT})$ with $\mu = 0$. Hence, the final equation differs from (2.16) only in the right hand side,
$$\mathcal{A}(Z^{-1}\mathcal{A}^{T}(\delta w_p)R) = \mathcal{A}(Z^{-1}F_dR) + a.$$
The result is the affine step direction $\delta s_p = (\delta R_p, \delta w_p, \delta Z_p)$, which is responsible for the progress towards the desired optimum.

The corrector step pulls the current iterate closer to the central path, and is often called the centering step. The corrector step solves the system $(\Delta\mathrm{KKT})$ at the point $(R+\delta R_p,\ w+\delta w_p,\ Z+\delta Z_p)$. If higher order terms are neglected for the corrector step, the final equation of the system is
$$\mathcal{A}(Z^{-1}\mathcal{A}^{T}(\delta w_c)R) = -\mathcal{A}(Z^{-1}(\mu I - \delta Z_p\,\delta R_p)).$$
Note that the system again changes only on the right hand side. Hence, we can use the old factorization of the system matrix to solve for $\delta w_c$. The result is the centering step direction $\delta s_c = (\delta R_c, \delta w_c, \delta Z_c)$. Finally, the search direction for the line search is $\delta s_p + \delta s_c$.

The predictor-corrector algorithm shows good practical behavior with respect to stability and is proven to converge superlinearly [73]. In all our computations we use the predictor-corrector algorithm.
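Reusing the old factorization is what makes the corrector step cheap: the system matrix is factored once, and only a second back-solve is needed for the corrector right hand side. A sketch with SciPy, where a random positive definite matrix stands in for the Schur complement matrix:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(2)
m = 8
G = rng.standard_normal((m, m))
S = G @ G.T + m * np.eye(m)          # stands in for the Schur complement matrix
rhs_pred = rng.standard_normal(m)    # predictor right hand side
rhs_corr = rng.standard_normal(m)    # corrector right hand side

c, low = cho_factor(S)               # factor once (the expensive part) ...
dw_pred = cho_solve((c, low), rhs_pred)   # ... then two cheap back-solves
dw_corr = cho_solve((c, low), rhs_corr)

print(np.allclose(S @ dw_pred, rhs_pred), np.allclose(S @ dw_corr, rhs_corr))
```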
2.6 Search Directions

For a nonsingular matrix $P\in M_n$ we introduce the mapping $P^{-T}\circledast P : S_n\to S_n$, see [2],
$$(P^{-T}\circledast P)R := \tfrac{1}{2}\left(P^{-T}RP^{T} + PRP^{-1}\right).$$

Remark 2.5  This definition can be extended to $M_n$. For $P,Q,R\in M_n$ one defines
$$(P\circledast Q)R := \tfrac{1}{2}\left(PRQ^{T} + QR^{T}P^{T}\right).$$
If $P,Q\in M_n$ and $(P\circledast Q)$ is considered as an operator from $S_n$ to itself, then
$$P\circledast Q = Q\circledast P,\qquad (P\circledast Q)^{T} = P^{T}\circledast Q^{T}.$$
If $P$ is a nonsingular matrix, then
$$(P\circledast P)^{-1} = P^{-1}\circledast P^{-1}.$$
We now define $H_P := P^{-T}\circledast P$.

The idea is to replace in $(\mathrm{KKT}_\mu)$ the complementarity condition $ZR = \mu I$ by
$$H_P(ZR) = \mu H_P(I) = \mu I.$$
Now we rewrite $(\mathrm{KKT}_\mu)$ in the following way:
$$\mathcal{A}(R) = a,\ \ R\succ 0;\qquad Z + \mathcal{A}^{T}(w) = L,\ \ Z\succ 0;\qquad H_P(ZR) = \mu I. \tag{2.17}$$
Now the last equation in (2.17) is linearized,
$$H_P(\Delta Z\,R + Z\,\Delta R) = \mu I - H_P(ZR), \tag{2.18}$$
and the linearization of the system (2.17) is given by
$$\mathcal{A}(\Delta R) = F_p,\qquad \Delta Z + \mathcal{A}^{T}(\Delta w) = F_d,\qquad F\Delta Z + E\Delta R = F_P. \tag{2.19}$$
We can alternatively multiply the equation (2.18) by $P^{T}$ from the left and by $P$ from the right, and get
$$\tfrac{1}{2}\left(\Delta Z\,RM + MR\,\Delta Z\right) + \tfrac{1}{2}\left(Z\,\Delta R\,M + M\,\Delta R\,Z\right) = \mu M - \tfrac{1}{2}\left(ZRM + MRZ\right).$$
Now
$$F := MR\circledast I,\qquad E := Z\circledast M,\qquad F_P = \mu M - \tfrac{1}{2}(ZRM + MRZ),\qquad\text{with } M := P^{T}P.$$
The choice of $P$ and $M$ often depends on the current iterates $Z$ and $R$, and hence we sometimes write $P(Z,R)$ or $M(Z,R)$ to highlight this dependence.
i) AHO direction (Alizadeh, Haeberly, and Overton [2]). The solution is primal-dual symmetric.
$$P = M = I,\qquad F = R\circledast I,\quad E = Z\circledast I,\quad F_P = \mu I - \tfrac{1}{2}(ZR + RZ).$$

ii) H..K..M direction (Helmberg, Rendl, Vanderbei and Wolkowicz [39]; Kojima, Shindoh and Hara [55]; and Monteiro [64]). The solution of (2.19) is $(\Delta R, \Delta w, \Delta Z)\in M_n\times\mathbb{R}^m\times S_n$.
$$P = Z^{1/2},\quad M = Z,\qquad F = ZR\circledast I,\quad E = Z\circledast Z,\quad F_P = \mu Z - ZRZ.$$
Alternatively, so that $E$ does not need to be inverted:

iii) NT direction (Nesterov and Todd [69, 70, 89]). The solution is primal-dual symmetric.
$$P = W^{-1/2},\quad M = W^{-1}$$
for the unique scaling matrix
$$W = Z^{-1/2}\left(Z^{1/2}RZ^{1/2}\right)^{1/2}Z^{-1/2},$$
and
$$F = W^{-1}R\circledast I,\quad E = Z\circledast W^{-1},\quad F_P = \mu W^{-1} - \tfrac{1}{2}\left(W^{-1}RZ + ZRW^{-1}\right).$$
These search directions are currently the ones most exploited in practical implementations. Besides these three directions, Todd [88] describes seventeen more primal-dual search directions.
Assuming that $E$ is nonsingular, we find that (2.19) has a unique solution if and only if the $m\times m$ Schur complement matrix $\mathcal{A}E^{-1}F\mathcal{A}^{T}$ is nonsingular. In this case the solution is obtained from
$$\begin{aligned}(\mathcal{A}E^{-1}F\mathcal{A}^{T})\Delta w &= F_p - \mathcal{A}E^{-1}(F_P - FF_d),\\ \Delta Z &= F_d - \mathcal{A}^{T}(\Delta w),\\ \Delta R &= E^{-1}(F_P - F\Delta Z).\end{aligned} \tag{2.20}$$
The main computational work is the formation and factorization of the Schur complement matrix. The H..K..M and NT directions give a unique search direction for every symmetric positive definite $R$ and $Z$ and surjective operator $\mathcal{A}$. These two directions possess the property that $E^{-1}F$ is positive definite and self-adjoint. Hence, the Schur complement matrix is symmetric, and the first equation in (2.20) is solved using a Cholesky factorization of the Schur complement matrix. The AHO direction gives a unique search direction for every symmetric positive definite $R$ and $Z$ if $ZR + RZ$ is symmetric positive definite (see [89]), or if $(R,w,Z)$ lies in a suitable neighborhood of the central path (see [65]). The Schur complement matrix for the AHO direction is not symmetric, but can be shown to be nonsingular if $ZR + RZ\succ 0$, see [84]. For the AHO direction the first equation in (2.20) is solved using an LU factorization of the Schur complement.
Interior Point Algorithm

Basic Iteration.
(i) Choose $0 < \sigma < 1$ and define $\mu := \sigma\,\frac{\mathrm{tr}(ZR)}{n}$.
(ii) Determine $(\Delta R, \Delta w, \Delta Z)$ by solving (2.20).
(iii) In the case of the H..K..M direction, update $\Delta R$ by $\Delta R \leftarrow \tfrac{1}{2}(\Delta R + \Delta R^{T})$.
(iv) Choose steplengths $\alpha_P, \alpha_D\in(0,1]$ so that $R + \alpha_P\Delta R\succ 0$ and $Z + \alpha_D\Delta Z\succ 0$.
(v) Compute $\alpha = \min\{\alpha_P, \alpha_D\}$ and update $R\leftarrow R + \alpha\Delta R$, $w\leftarrow w + \alpha\Delta w$, $Z\leftarrow Z + \alpha\Delta Z$.

Stopping Criteria.
If $\|\mathcal{A}(R) - a\|$, $\|\mathcal{A}^{T}(w) + Z - L\|$ and $\mathrm{tr}(ZR)$ are all sufficiently small, stop.
Chapter 3

The Quadratic Assignment Problem

3.1 Problem Formulation

The Quadratic Assignment Problem (QAP) was introduced in 1957 by Koopmans and Beckmann [56] as a model for location problems that takes into account the cost of placing a new facility on a certain site as well as the interaction with other facilities. Nowadays, the QAP is widely considered a classical combinatorial optimization problem. For the applications of the QAP see Section 3.3.

The Quadratic Assignment Problem can be stated in the following way. For given real $n\times n$ matrices $A = (a_{ij})$, $B = (b_{ij})$, and $C = (c_{ij})$, find a permutation $\varphi$ of the set $\{1,\dots,n\}$ which minimizes
$$\min_{\varphi}\ \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}b_{\varphi(i)\varphi(j)} + \sum_{i=1}^{n} c_{i\varphi(i)}. \tag{3.1}$$
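For very small instances, (3.1) can be minimized by brute force over all $n!$ permutations; a minimal sketch:

```python
import itertools
import numpy as np

def qap_brute_force(A, B, C):
    """Minimize sum_ij a_ij * b_{phi(i)phi(j)} + sum_i c_{i,phi(i)}
    by enumerating all permutations (only feasible for very small n)."""
    n = A.shape[0]
    best_val, best_phi = np.inf, None
    for phi in itertools.permutations(range(n)):
        val = sum(A[i, j] * B[phi[i], phi[j]] for i in range(n) for j in range(n))
        val += sum(C[i, phi[i]] for i in range(n))
        if val < best_val:
            best_val, best_phi = val, phi
    return best_val, best_phi

# Tiny example with an obvious optimum: A = B = 0, so only the linear
# term matters and the best permutation picks the zero entries of C.
A = np.zeros((3, 3))
B = np.zeros((3, 3))
C = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 0.]])
val, phi = qap_brute_force(A, B, C)
print(val, phi)  # prints 0.0 (1, 0, 2)
```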
This is a combinatorial formulation of the QAP. A permutation $\varphi_0$ which minimizes (3.1) is called an optimal solution. The first part of the objective function is called the quadratic part, while the other is called the linear term. The size $n$ of the matrix $A$ (resp. $B$, $C$) is the size of the QAP. If the coefficient
As an illustrative example we describe a campus planning model due to Dickey and Hopkins [20]. The university owns a piece of land on which new buildings are to be erected. On the university's land $n$ sites have been identified as possible sites for the buildings. Each of the buildings has a special function, such as library or dormitory. Let $a_{ij}$ be the walking distance between the two sites $i$ and $j$, where the new buildings can be erected. These distances are collected in the matrix $A = (a_{ij})$, which is often called the distance matrix. Let $b_{kl}$ denote the number of people per week who circulate between buildings $k$ and $l$. The quantities $b_{kl}$ are collected in the matrix $B = (b_{kl})$, which is often called the flow matrix. Note that the diagonal elements of $A$ and $B$ are all zero, and both $A$ and $B$ are symmetric matrices. The product $a_{ij}b_{\varphi(i)\varphi(j)}$ describes the weekly walking distance of people who travel between buildings $k = \varphi(i)$ and $l = \varphi(j)$, if building $k$ is erected on site $i$ and building $l$ on site $j$. The problem of assigning buildings to sites so that the walking distance is minimized corresponds to the following minimization problem:
$$\min_{\varphi}\ \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}b_{\varphi(i)\varphi(j)}. \tag{3.2}$$
Suppose now that in addition to the interest of minimizing the walking distance at the campus, the university is also interested in minimizing the total construction cost. Let $c_{ij}$ denote the cost of erecting building $i$ on site $j$. Then the construction cost minimization problem is
$$\min_{\varphi}\ \sum_{i=1}^{n} c_{i\varphi(i)}. \tag{3.3}$$
Note that minimizing the total distance walked by all users of the campus will in general be in conflict with the goal of minimizing construction cost. The minimization problem whose solution fulfills both of the previously described demands has as objective function the linear combination of the objective functions from (3.2) and (3.3), which is exactly the objective given in (3.1).
3.2 Equivalent Formulations of QAP

There is a one-to-one correspondence between the set of all permutations of the set $\{1,\dots,n\}$ and the set of $n\times n$ permutation matrices (see Definition (1.2)). For the entries of the permutation matrix $X = (x_{ij})$ we specify
$$x_{ik} = \begin{cases}1 & \text{if } \varphi(i) = k,\\ 0 & \text{otherwise.}\end{cases}$$
Note that if $i$ is assigned to $k$ and $j$ is assigned to $l$, i.e. if $\varphi(i) = k$, $\varphi(j) = l$, we have
$$\sum_{k=1}^{n}\sum_{l=1}^{n} a_{ij}b_{kl}x_{ik}x_{jl} = a_{ij}b_{\varphi(i)\varphi(j)}.$$
Hence, an equivalent formulation of QAP is
$$\min_{X\in\Pi}\ \sum_{ijkl} a_{ij}b_{kl}x_{ik}x_{jl} + \sum_{ik} c_{ik}x_{ik}. \tag{3.4}$$
The problem formulation (3.4) is called the Koopmans-Beckmann formulation of the QAP.

Another equivalent formulation of the QAP can be obtained using the trace of a matrix. Note that the $ik$-entry of $AXB^{T}$ is
$$(AXB^{T})_{ik} = \sum_{jl} a_{ij}x_{jl}b_{kl},$$
and the $i$th diagonal entry of $CX^{T}$ is
$$(CX^{T})_{ii} = \sum_{k} c_{ik}x_{ik}.$$
Therefore we have
$$\sum_{ijkl} a_{ij}b_{kl}x_{ik}x_{jl} + \sum_{ik} c_{ik}x_{ik} = \sum_{ik}(AXB^{T})_{ik}x_{ik} + \sum_{i}(CX^{T})_{ii} = \mathrm{tr}\,(AXB^{T}+C)X^{T},$$
and the trace formulation of QAP is
$$\min_{X\in\Pi}\ \mathrm{tr}\,(AXB^{T}+C)X^{T}.$$
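The equality of the combinatorial objective (3.1) and the trace formulation is easy to confirm numerically for a random permutation matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))

# Random permutation phi and its permutation matrix X (x_{ik} = 1 iff phi(i) = k).
phi = rng.permutation(n)
X = np.zeros((n, n))
X[np.arange(n), phi] = 1.0

# Combinatorial objective (3.1).
comb = sum(A[i, j] * B[phi[i], phi[j]] for i in range(n) for j in range(n))
comb += sum(C[i, phi[i]] for i in range(n))

# Trace formulation.
trace_val = np.trace((A @ X @ B.T + C) @ X.T)

print(abs(comb - trace_val))  # ~0 up to rounding
```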
This formulation reflects the algebraic structure of the problem and is favorable for easy manipulation and relaxation of the model.

From (1.2) and (1.5) it follows that
$$\mathrm{tr}\,(AXB^{T}+C)X^{T} = x^{T}\mathrm{vec}(AXB^{T}) + c^{T}x = x^{T}(B\otimes A)x + c^{T}x,$$
where $x = \mathrm{vec}(X)$ and $c = \mathrm{vec}(C)$. The Kronecker-product formulation of the QAP is
$$\min_{X\in\Pi}\ x^{T}(B\otimes A)x + c^{T}x.$$
Finally, since $x_{ij} = x_{ij}^{2}$, we present the Kronecker-Diag formulation of QAP:
$$\min_{X\in\Pi}\ x^{T}(B\otimes A + \mathrm{Diag}(c))x.$$

3.3 Applications
There is a large variety of applications of the QAP. Here we supply an overview of published applications of the QAP. In 1957 Koopmans and Beckmann [56] derived the QAP as a mathematical model of assigning a set of economic activities to a set of locations. A very important area of application of QAPs is the "wiring" problem. In a 1961 paper [87], Steinberg described a "backboard wiring" problem. The problem concerns the placement of computer components so as to minimize the total wiring length required to connect them. In the particular instance considered by Steinberg, 34 components with a total of 2625 interconnections are to be placed on a backboard with 36 open positions. To formulate the wiring problem mathematically it is convenient to add 2 dummy components, with no connections to any others, so that the number of components and locations are both $n = 36$. Steinberg considers 1-norm, 2-norm, and squared 2-norm distances between the backboard locations. These three norm versions of the Steinberg wiring problem are now known as the Ste36a, Ste36c, and Ste36b problem instances. They are included in QAPLIB [14], a Quadratic Assignment Problem Library, established in 1991 by Burkard, Karisch, and Rendl.
Burkard and Offermann showed that the QAP also arises in the design of a typewriter keyboard. Suppose the keys of a typewriter are to be arranged on the keyboard such that the time needed to write a text in a certain language is minimal. The set $\{1,\dots,n\}$ is the set of symbols to be arranged on the keyboard. The matrix $A$ contains the mean frequency of a pair of letters in the considered language. In the study by Burkard and Offermann these quantities were determined by evaluating German, English, and French texts with 100 000 letters and punctuation marks. The entry $b_{kl}$ of the matrix $B$ is the time needed to press key $l$ after pressing key $k$. If the $i$th symbol is assigned to key $k$, i.e. $k = \varphi(i)$, and the $j$th symbol is assigned to key $l$, i.e. $l = \varphi(j)$, the product $a_{ij}b_{\varphi(i)\varphi(j)}$ is the time needed to write symbol $j$ after symbol $i$. In order to minimize the average time for writing a text, the QAP should be solved. Note that this QAP is not a symmetric one.
In 1972, as a part of the design of the German university hospital Klinikum Regensburg, there arose the problem of assigning rooms in a hospital. Krarup [53] models that problem as a QAP. In a 1977 paper [24], Elshafei describes a hospital layout as a QAP. The results show that it is more interesting to minimize the largest distance rather than the sum of all distances. The Krarup and Elshafei instances are also included in QAPLIB [14].

Krarup and Pruzan (1978) [54] model the ranking of archaeological data, and Heffley (1976) [41] models the ranking of a team in a relay race as a QAP. Problems such as the balancing of turbine runners (see [83] and [57]), the analysis of chemical reactions for organic compounds (see [92]), and scheduling [30] also lead to a QAP.
From a graph-theoretical point of view, there is a series of graph optimization problems that can be modeled as a QAP with special structure. The Traveling Salesman Problem (TSP) is a QAP where matrix $A$ is the distance matrix of the problem, and $B$ is the adjacency matrix of a cycle. In the case of the graph partition problem, matrix $A$ is the weighted adjacency matrix of a graph, while the matrix $B$ is the adjacency matrix of two disjoint complete graphs. Also other types of graph problems, such as Max-Clique, can be modeled as a QAP.
3.4 Computational Complexity of QAP

The QAP is, from a worst case computational complexity point of view, one of the most difficult combinatorial optimization problems. The QAP is well known to be an NP-hard combinatorial optimization problem, as shown by Sahni and Gonzalez [82].

Let
$$z(\varphi) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}b_{\varphi(i)\varphi(j)} + \sum_{i=1}^{n} c_{i\varphi(i)}.$$
We introduce the notion of an $\varepsilon$-approximation algorithm and an $\varepsilon$-approximate solution.

Definition 3.1 [17, pg. 18]  Given a real number $\varepsilon > 0$, an algorithm $\Phi$ for the QAP is said to be an $\varepsilon$-approximation algorithm if and only if for every instance of the QAP the following holds:
$$\frac{z(\varphi^{\Phi}) - z(\varphi^{opt})}{z(\varphi^{opt})} \leq \varepsilon,$$
where $\varphi^{\Phi}$ is the solution to the QAP computed by algorithm $\Phi$ and $\varphi^{opt}$ is an optimal solution to the QAP. The solution of the QAP produced by an $\varepsilon$-approximation algorithm is called an $\varepsilon$-approximate solution.

Sahni and Gonzalez have also proved that even finding an $\varepsilon$-approximate solution for the QAP is an NP-hard problem, see [82]. Practice shows that the QAP is extremely difficult to solve to optimality. The computational effort to solve the QAP is very likely to grow exponentially with the problem size. Problems of size $n \geq 20$ are currently considered as huge problems. Christofides and Gerrard [18] show that the QAP can be solved in polynomial time using dynamic programming if the matrices $A$ and $B$ are weighted adjacency matrices of a tree. If only one of these matrices is a weighted adjacency matrix of a tree, the problem remains NP-hard.
3.5 A Convex Quadratic Programming Relaxation

Anstreicher et al. [5] have recently solved to optimality several large QAP instances of sizes from $20 \leq n \leq 36$. Their computations are considered to be among the most extensive computations ever performed to solve discrete optimization problems. The key to this breakthrough lies in the use of a bound for QAP that is both fast to compute and gives good approximations to the exact value of QAP. The Anstreicher-Brixius bounding procedure combines orthogonal, semidefinite and convex quadratic relaxations in a nontrivial way, starting from the Hoffman-Wielandt inequality, see Theorem A.2. Their bounds are known as convex quadratic programming bounds (QPB), see [4].

We use here the parameterization
$$X = \frac{1}{n}E + V\hat XV^{T}, \tag{3.5}$$
from Lemma 4.2, and assume in addition that $V^{T}V = I_{n-1}$. By use of (3.5) we get
$$\begin{aligned} AXBX^{T} &= AV\hat XV^{T}BV\hat X^{T}V^{T} + \frac{1}{n}\left(AEBV\hat X^{T}V^{T} + AV\hat XV^{T}BE\right) + \frac{1}{n^{2}}AEBE\\ &= AV\hat XV^{T}BV\hat X^{T}V^{T} + \frac{1}{n}\left(AEBX^{T} + AXBE\right) - \frac{1}{n^{2}}AEBE.\end{aligned}$$
Hence
$$\mathrm{tr}(AXBX^{T}) = \mathrm{tr}(\hat A\hat X\hat B\hat X^{T}) + \frac{2}{n}\mathrm{tr}(Aee^{T}BX^{T}) - \frac{1}{n^{2}}s(A)s(B), \tag{3.6}$$
for $\hat A = V^{T}AV$, $\hat B = V^{T}BV$, and $s(M) := e^{T}Me = \sum_{ij} m_{ij}$. Using (3.6), we have
$$\mathrm{tr}(AXB+C)X^{T} = \mathrm{tr}(\hat A\hat X\hat B\hat X^{T}) + \mathrm{tr}\left(\hat C + \frac{2}{n}V^{T}Aee^{T}BV\right)\hat X^{T} + \frac{1}{n^{2}}s(A)s(B) + \frac{1}{n}s(C), \tag{3.7}$$
where $\hat C = V^{T}CV$. Hadley et al. [34] use this to bound the quadratic term in (3.7) by the Hoffman-Wielandt inequality of Theorem A.2. Anstreicher and Brixius [4] use this observation as a starting point and observe that for any $\hat S, \hat T\in S_{n-1}$ and $\hat X\in O_{n-1}$ one has
$$0 = \mathrm{tr}\,\hat S(I - \hat X\hat X^{T}) = \mathrm{tr}\,\hat S - \mathrm{tr}\,\hat S\hat XI\hat X^{T} = \mathrm{tr}\,\hat S - \mathrm{tr}\,(I\otimes\hat S)(\hat x\hat x^{T}),$$
$$0 = \mathrm{tr}\,\hat T(I - \hat X^{T}\hat X) = \mathrm{tr}\,\hat T - \mathrm{tr}\,I\hat X\hat T\hat X^{T} = \mathrm{tr}\,\hat T - \mathrm{tr}\,(\hat T\otimes I)(\hat x\hat x^{T}),$$
where $\hat x = \mathrm{vec}(\hat X)$. We use these results to obtain the following identity:
$$\mathrm{tr}(\hat A\hat X\hat B\hat X^{T}) = \mathrm{tr}(\hat S + \hat T) + \mathrm{tr}\,(\hat B\otimes\hat A - I\otimes\hat S - \hat T\otimes I)(\hat x\hat x^{T}). \tag{3.8}$$
Let
$$\hat Q := \hat B\otimes\hat A - I\otimes\hat S - \hat T\otimes I,\qquad \hat D := \hat C + \frac{2}{n}V^{T}Aee^{T}BV.$$
We substitute this into (3.7) and get
$$\mathrm{tr}(AXB+C)X^{T} = \mathrm{tr}(\hat S + \hat T) + \hat x^{T}\hat Q\hat x + \hat d^{T}\hat x + \frac{1}{n^{2}}s(A)s(B) + \frac{1}{n}s(C), \tag{3.9}$$
where $\hat d = \mathrm{vec}(\hat D)$. This relation is true for any orthogonal $X$ and $\hat X$ related by (3.5) and symmetric $\hat S$, $\hat T$. In order to express the parts in (3.9) containing $\hat X$ by the original matrix $X$ we use the following identities:
$$0 = \mathrm{tr}\,\hat S(I - V^{T}V) = \mathrm{tr}\,\hat S(I - V^{T}XX^{T}V) = \mathrm{tr}\,\hat S - \mathrm{tr}\,(V\hat SV^{T})XIX^{T} = \mathrm{tr}\,\hat S - \mathrm{tr}\,(I\otimes V\hat SV^{T})(xx^{T}),$$
$$0 = \mathrm{tr}\,\hat T(I - V^{T}V) = \mathrm{tr}\,\hat T(I - V^{T}X^{T}XV) = \mathrm{tr}\,\hat T - \mathrm{tr}\,IX(V\hat TV^{T})X^{T} = \mathrm{tr}\,\hat T - \mathrm{tr}\,((V\hat TV^{T})\otimes I)(xx^{T}).$$
Hence, for any orthogonal $X$ and any symmetric $\hat S$, $\hat T$ we also have
$$\mathrm{tr}(AXB+C)X^{T} = \mathrm{tr}(\hat S + \hat T) + x^{T}Qx + c^{T}x, \tag{3.10}$$
for
$$Q = B\otimes A - I\otimes(V\hat SV^{T}) - (V\hat TV^{T})\otimes I.$$
Comparing (3.9) and (3.10) we note that
$$\hat x^{T}\hat Q\hat x + \hat d^{T}\hat x + \frac{1}{n^{2}}s(A)s(B) + \frac{1}{n}s(C) = x^{T}Qx + c^{T}x.$$
Note that $Q$ and $\hat Q$ depend on the specific choice of $\hat S$, $\hat T$. Anstreicher and Brixius [4] show that dual feasibility yields $\hat Q \succeq 0$. Therefore the above problem is a convex quadratic programming problem. We denote its optimal solution as the quadratic programming bound
$$QPB(A,B,C) := \mathrm{tr}(\hat S + \hat T) + \min\{x^{T}Qx + c^{T}x : x = \mathrm{vec}(X),\ X\in\mathcal{E}\}.$$
Note that, as a consequence of Theorem A.3, the matrix $Q$ is positive semidefinite over the set of assignment constraints, and that optimal $\hat S$, $\hat T$ can easily be obtained from the spectral decomposition of $\hat A$ and $\hat B$. The QP bounds depend on the choice of the dual basis. Anstreicher and Brixius use two different dual bases and obtain the bounds QPB0 and QPB1. In [4] it is proved that in general QPB cannot be worse than the projected eigenvalue bound (PB) introduced in [34].
Anstreicher et al. [5] solve QPB approximately by using the well-known Frank-Wolfe (FW) algorithm [29]. Although the FW method is known to have poor asymptotic performance, Anstreicher et al. use that method in their computations for the following reasons. Each iteration of the FW algorithm requires the solution of a linear assignment problem, which can be performed extremely rapidly. The FW algorithm generates dual information that can be used to estimate the effect of fixing an assignment $x_{ij} = 1$ to create a "child" problem of a node in the B&B tree. The branching rules for the branch and bound algorithm are based on this dual information. In Tables 7.3 and 7.4 we list some of the bounds reported in [4], and in Table 7.9
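A Frank-Wolfe iteration of the kind described above can be sketched as follows: each linearized subproblem is a linear assignment problem, solved here with SciPy's `linear_sum_assignment`, and the step size comes from an exact line search on the quadratic objective. The data Q and c below are random placeholders, not an actual QAP bound computation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(4)
n = 6
G = rng.standard_normal((n * n, n * n))
Q = G.T @ G                          # convex quadratic term (PSD by construction)
c = rng.standard_normal(n * n)

def f(x):
    return x @ Q @ x + c @ x

# Start at the barycenter of the assignment polytope (the doubly
# stochastic matrix E/n, flattened).
x = np.full(n * n, 1.0 / n)
f_start = f(x)

for _ in range(30):
    g = 2 * Q @ x + c                        # gradient of f
    # Linearized subproblem: a linear assignment problem over permutations.
    rows, cols = linear_sum_assignment(g.reshape(n, n))
    s = np.zeros((n, n))
    s[rows, cols] = 1.0
    d = s.ravel() - x                        # direction towards the chosen vertex
    denom = d @ Q @ d
    gamma = 1.0 if denom <= 0 else min(1.0, max(0.0, -(g @ d) / (2 * denom)))
    x += gamma * d

print(f_start, f(x))  # the objective value never increases
```

Each iterate stays inside the Birkhoff polytope, since it is a convex combination of the starting point and permutation matrices.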
Chapter 4

SDP Relaxations of QAP in $S_{n^{2}+1}$

In this chapter we present three SDP relaxations of increasing complexity, placed in $S_{n^{2}+1}$. We also present numerical results obtained by an interior point method for the relaxation $QAP_{R2}$ on Nugent-type instances. We prove the linear dependency of the arrow and gangster constraints.

4.1 Deriving the Relaxations

The natural way (see [47, 97]) of embedding the 0-1 problem into the semidefinite framework is by lifting it into a higher dimensional space of symmetric matrices. For that purpose, we introduce the matrix variable
$$Y = \begin{pmatrix}1\\ x\end{pmatrix}\begin{pmatrix}1\\ x\end{pmatrix}^{T} = \begin{pmatrix}1 & x^{T}\\ x & xx^{T}\end{pmatrix},\qquad x = \mathrm{vec}(X),\quad X\in\Pi. \tag{4.1}$$
This matrix $Y\in S_{n^{2}+1}$ is positive semidefinite, i.e. $Y\succeq 0$, and satisfies
$$Y_{00} = 1, \tag{4.2}$$
$$Y_{ii} = Y_{0i} = Y_{i0},\qquad i = 1,\dots,n^{2}, \tag{4.3}$$
where we use index 0 for the first row and column of the matrix. Constraints (4.3) are equivalent to the 0-1 conditions $x_{i}^{2} = x_{i}$. Equations (4.2) and (4.3) can be formally summarized as $\mathrm{arrow}(Y) = e_{0}$, where the linear operator $\mathrm{arrow} : S_{n^{2}+1}\to\mathbb{R}^{n^{2}+1}$ is defined as
$$\mathrm{arrow}(Y) := \mathrm{diag}(Y) - (0,\ Y_{1:n^{2},0}),\qquad Y\in S_{n^{2}+1}. \tag{4.4}$$
The adjoint operator of arrow is $\mathrm{Arrow} : \mathbb{R}^{n^{2}+1}\to S_{n^{2}+1}$, given by
$$\mathrm{Arrow}(w) = \begin{pmatrix} w_{0} & -\frac{1}{2}w_{1:n^{2}}^{T}\\ -\frac{1}{2}w_{1:n^{2}} & \mathrm{Diag}(w_{1:n^{2}})\end{pmatrix}.$$
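The operators arrow and Arrow from (4.4) can be implemented directly, and the adjoint relation $\langle\mathrm{arrow}(Y), w\rangle = \langle Y, \mathrm{Arrow}(w)\rangle$ checked on random data (the dimension below is an arbitrary stand-in for $n^2+1$):

```python
import numpy as np

rng = np.random.default_rng(5)
N = 10                                    # stands for n^2 + 1

def arrow(Y):
    """arrow(Y) = diag(Y) - (0, Y[1:,0]): first component Y[0,0],
    then Y[i,i] - Y[i,0] for i >= 1."""
    a = np.diag(Y).copy()
    a[1:] -= Y[1:, 0]
    return a

def Arrow(w):
    """Adjoint of arrow: Diag(w) with -w[1:]/2 placed in the first
    row and column."""
    M = np.diag(w).astype(float)
    M[0, 1:] = -w[1:] / 2
    M[1:, 0] = -w[1:] / 2
    return M

Y = rng.standard_normal((N, N))
Y = (Y + Y.T) / 2                         # a random symmetric Y
w = rng.standard_normal(N)
lhs = arrow(Y) @ w                        # <arrow(Y), w>
rhs = np.trace(Y @ Arrow(w))              # <Y, Arrow(w)>
print(abs(lhs - rhs))  # ~0 up to rounding
```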
In order to rewrite the objective function of QAP we use (1.2) and (1.5) and obtain the following form of the objective function:
$$\mathrm{tr}(AXB+C)X^{T} = \langle x, \mathrm{vec}(AXB+C)\rangle = x^{T}(B\otimes A)x + x^{T}c,$$
where $x = \mathrm{vec}(X)$ and $c = \mathrm{vec}(C)$. Using the fact that $x$ is a 0-1 vector, the objective function of the QAP becomes
$$\begin{pmatrix}1 & x^{T}\end{pmatrix}\begin{pmatrix}0 & \frac{1}{2}c^{T}\\ \frac{1}{2}c & B\otimes A\end{pmatrix}\begin{pmatrix}1\\ x\end{pmatrix} = \mathrm{tr}\,LY, \tag{4.5}$$
where
$$L := \begin{pmatrix}0 & \frac{1}{2}c^{T}\\ \frac{1}{2}c & B\otimes A\end{pmatrix}\in M_{n^{2}+1},$$
and $Y$ is a matrix of the form (4.1).

Remark 4.1  Note that equivalently we can set
$$L = \begin{pmatrix}0 & 0\\ 0 & B\otimes A + \mathrm{Diag}(c)\end{pmatrix}.$$
We define the feasible set of QAP
$$\mathcal{P} := \mathrm{conv}\left\{\begin{pmatrix}1\\ x\end{pmatrix}\begin{pmatrix}1\\ x\end{pmatrix}^{T} : x = \mathrm{vec}(X),\ X\in\Pi\right\}, \tag{4.6}$$
and write QAP in the following form:
$$\min\{\mathrm{tr}\,LY : Y\in\mathcal{P}\}.$$
In order to obtain tractable relaxations for QAP we need to approximate the set $\mathcal{P}$ by larger sets containing $\mathcal{P}$. The following lemma collects known results.

Lemma 4.1  $Y\in\mathcal{P}\ \Leftrightarrow\ Y\succeq 0$, $\mathrm{arrow}(Y) = e_{0}$ and $\mathrm{rank}(Y) = 1$.

We next exploit the fact that the row and column sums of permutation matrices are one.

Lemma 4.2 [34]  Let $V$ be an $n\times(n-1)$ matrix with $V^{T}e = 0$ and $\mathrm{rank}(V) = n-1$. Then
$$\left\{X\in M_{n} : Xe = X^{T}e = e\right\} = \left\{\frac{1}{n}ee^{T} + VMV^{T} : M\in M_{n-1}\right\}.$$
The matrix $V$ from the previous lemma can be any basis of $e^{\perp}$. Our choice for $V$ is
$$V = \begin{pmatrix}I_{n-1}\\ -e_{n-1}^{T}\end{pmatrix}. \tag{4.7}$$
Let us define the $n^{2}\times((n-1)^{2}+1)$ matrix
$$W := \left[\tfrac{1}{n}\,e\otimes e,\ V\otimes V\right], \tag{4.8}$$
and the $(n^{2}+1)\times((n-1)^{2}+1)$ matrix
$$\hat V := \begin{pmatrix}e_{0}^{T}\\ W\end{pmatrix}. \tag{4.9}$$
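The matrices $V$, $W$, and $\hat V$ of (4.7)-(4.9) are easy to build explicitly. The sketch below constructs them (using column-stacking vec, as in (1.5)) and checks Lemma 4.2 numerically: vec of a random permutation matrix equals $Wz$ for a unique $z$ with $z_0 = 1$:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
e = np.ones(n)

# V from (4.7): a basis of the orthogonal complement of e.
V = np.vstack([np.eye(n - 1), -np.ones((1, n - 1))])
assert np.allclose(V.T @ e, 0)

# W from (4.8) and Vhat from (4.9).
W = np.hstack([(np.kron(e, e) / n)[:, None], np.kron(V, V)])
e0 = np.zeros((n - 1) ** 2 + 1)
e0[0] = 1.0
Vhat = np.vstack([e0[None, :], W])
assert Vhat.shape == (n * n + 1, (n - 1) ** 2 + 1)

# A random permutation matrix X and x = vec(X) (column-stacking).
phi = rng.permutation(n)
X = np.zeros((n, n))
X[np.arange(n), phi] = 1.0
x = X.flatten(order="F")

# x must equal W z with z = (1, vec(M)): solve by least squares and check.
z, res, *_ = np.linalg.lstsq(W, x, rcond=None)
print(np.allclose(W @ z, x), z[0])  # True, z[0] = 1
```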
Lemma 4.3  For any $Y\in\mathcal{P}$ there exists a symmetric matrix $R$ of order $(n-1)^{2}+1$ with $R\succeq 0$ and $R_{00} = 1$ such that $Y = \hat VR\hat V^{T}$.

Proof. (See also [97].) First we look at the extreme points of $\mathcal{P}$. Let $Y$ be one of them. Thus
$$Y = \begin{pmatrix}1 & x^{T}\\ x & xx^{T}\end{pmatrix},$$
for some permutation matrix $X$. From Lemma 4.2 it follows that for the permutation matrix $X$ there exists some matrix $M\in M_{n-1}$ such that $X = \frac{1}{n}ee^{T} + VMV^{T}$. With the use of (1.5), we get
$$x = \mathrm{vec}(X) = \frac{1}{n}(e\otimes e) + (V\otimes V)m = Wz,$$
where $m = \mathrm{vec}(M)$, $z = \begin{pmatrix}1\\ m\end{pmatrix}$, and $W$ is defined in (4.8). Now
$$Y = \begin{pmatrix}1 & (Wz)^{T}\\ Wz & Wzz^{T}W^{T}\end{pmatrix} = \begin{pmatrix}e_{0}^{T}\\ W\end{pmatrix}zz^{T}\begin{pmatrix}e_{0}^{T}\\ W\end{pmatrix}^{T} = \hat VR\hat V^{T},$$
with $R = zz^{T}$ and $\hat V$ defined in (4.9). Hence, $R$ is a symmetric positive semidefinite matrix and $R_{00} = 1$. The same holds for convex combinations formed from several permutation matrices.

Remark 4.2  The condition $R_{00} = 1$ appears naturally from the definitions of $Y$ and $\hat V$.

Lemma 4.4 [97]  Let $R\in S_{(n-1)^{2}+1}$ be arbitrary and let $Y = \hat VR\hat V^{T}$. Then, using the block notation of (1.1), we have:
1. $y_{00} = r_{00}$, $Y_{0j}e = r_{00}$, and $\sum_{j=1}^{n} Y_{0j} = r_{00}e^{T}$.
2. $Y_{0j} = e^{T}Y_{ij}$, for $i,j = 1,\dots,n$.
3. $\sum_{i=1}^{n} Y_{ij} = eY_{0j}$ and $\sum_{i=1}^{n}\mathrm{diag}(Y_{ij}) = Y_{j0}$, for $j = 1,\dots,n$.
In [97] it is shown that $\hat{\mathcal{P}}$ has interior points. For instance,
$$\hat R = \begin{pmatrix}1 & 0\\ 0 & \frac{1}{n^{2}(n-1)}(nI_{n-1}-E_{n-1})\otimes(nI_{n-1}-E_{n-1})\end{pmatrix}\succ 0$$
is such that $\hat V\hat R\hat V^{T}$ is the barycenter of $\mathcal{P}$, i.e.
$$\hat V\hat R\hat V^{T} = \frac{1}{n!}\sum_{X\in\Pi}\begin{pmatrix}1 & x^{T}\\ x & xx^{T}\end{pmatrix}. \tag{4.10}$$

Theorem 4.1 [97]  Let $\hat Y = \hat V\hat R\hat V^{T}$ be the barycenter of $\mathcal{P}$. Then:
1. $\hat Y$ has a 1 in the $(0,0)$ position and $n$ diagonal $n\times n$ blocks with diagonal elements $1/n$. The first row and column equal the diagonal. The rest of the matrix is made up of $n\times n$ blocks with all elements equal to $1/(n(n-1))$, except for the diagonal elements, which are 0:
$$\hat Y = \begin{pmatrix}1 & \frac{1}{n}e^{T}\\ \frac{1}{n}e & \frac{1}{n^{2}}E\otimes E + \frac{1}{n^{2}(n-1)}(nI-E)\otimes(nI-E)\end{pmatrix}.$$
2. The rank of $\hat Y$ is given by $\mathrm{rank}(\hat Y) = (n-1)^{2}+1$.
3. The $n^{2}+1$ eigenvalues of $\hat Y$ are given in the vector
$$\left(2,\ \tfrac{1}{n-1}e_{(n-1)^{2}}^{T},\ 0\,e_{2n-1}^{T}\right)^{T}.$$
4. Let
$$T := \begin{pmatrix}I\otimes e^{T}\\ e^{T}\otimes I\end{pmatrix}.$$
The null space and range space are
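For small $n$, both the barycenter formula (4.10) and the closed form and rank statement of Theorem 4.1 can be checked by brute force (column-stacking vec assumed):

```python
import itertools
import numpy as np

n = 3
E = np.ones((n, n))
perms = list(itertools.permutations(range(n)))

# Barycenter: average of (1, x)(1, x)^T over all n! permutation matrices.
Ybar = np.zeros((n * n + 1, n * n + 1))
for phi in perms:
    X = np.zeros((n, n))
    X[np.arange(n), phi] = 1.0
    v = np.concatenate([[1.0], X.flatten(order="F")])
    Ybar += np.outer(v, v)
Ybar /= len(perms)

# Closed form from Theorem 4.1.
blk = np.kron(E, E) / n**2 \
    + np.kron(n * np.eye(n) - E, n * np.eye(n) - E) / (n**2 * (n - 1))
Yhat = np.block([[np.ones((1, 1)), np.full((1, n * n), 1.0 / n)],
                 [np.full((n * n, 1), 1.0 / n), blk]])

print(np.allclose(Ybar, Yhat))           # True
print(np.linalg.matrix_rank(Yhat))       # (n-1)^2 + 1 = 5 for n = 3
```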
Table 4.1: Computation times to solve the relaxation $QAP_{R1}$

n              | 10   | 15   | 20   | 25   | 30    | 35
time (seconds) | 0.43 | 4.16 | 19.4 | 69.3 | 198.8 | 548

The basic SDP relaxation of QAP is
$$(QAP_{R1})\qquad \min\{\mathrm{tr}\,LY : Y\in\hat{\mathcal{P}}\}.$$
We can eliminate the $(n^{2}+1)\times(n^{2}+1)$ matrix variable $Y$ through the $((n-1)^{2}+1)\times((n-1)^{2}+1)$ matrix variable $R$ in the basic relaxation. For that purpose, we define the following set:
$$\mathcal{R} := \{R\succeq 0 : R\in S_{(n-1)^{2}+1},\ \mathrm{arrow}(\hat VR\hat V^{T}) = e_{0}\}. \tag{4.11}$$
If we define
$$\mathcal{L} := \hat V^{T}L\hat V\in M_{(n-1)^{2}+1}, \tag{4.12}$$
then $QAP_{R1}$ can be written as
$$(QAP_{R1})\qquad \min\{\mathrm{tr}\,\mathcal{L}R : R\in\mathcal{R}\}.$$
The constraint $R_{00} = 1$ in connection with $\mathrm{arrow}(\hat VR\hat V^{T}) = e_{0}$ is redundant, so it is left out in (4.11). Thus, the basic relaxation contains $n^{2}+1$ equality constraints that come from the arrow operator. In Table 4.1 we give running times for solving the $QAP_{R1}$ relaxation for different problem sizes, using a primal-dual path-following interior point method. The running times in seconds were obtained on an Athlon XP with 1800 MHz.
In [97] it is shown that each matrix from $\mathcal{P}$ has a specific zero pattern. For