The conditioning of linear boundary value problems

(1)

The conditioning of linear boundary value problems

Citation for published version (APA):

Mattheij, R. M. M. (1982). The conditioning of linear boundary value problems. SIAM Journal on Numerical Analysis, 19(5), 963-978. https://doi.org/10.1137/0719070

DOI:

10.1137/0719070

Document status and date: Published: 01/01/1982

Document Version:

Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)

Please check the document version of this publication:

• A submitted manuscript is the version of the article upon submission and before peer-review. There can be important differences between the submitted version and the official published version of record. People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website.

• The final author version and the galley proof are versions of the publication after peer review.

• The final published version features the final layout of the paper including the volume, issue and page numbers.

Link to publication

General rights

Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain

• You may freely distribute the URL identifying the publication in the public portal.

If the publication is distributed under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license above, please follow below link for the End User Agreement:

www.tue.nl/taverne

Take down policy

If you believe that this document breaches copyright please contact us at:

openaccess@tue.nl

providing details and we will investigate your claim.

(2)

THE CONDITIONING

OF

LINEAR

BOUNDARY VALUE

PROBLEMS*

R. M. M. MATTHEIJt

Abstract. Investigationismadeintothesensitivityof solutions of linearboundary value problemsto

perturbationsofthe boundarycondition.Wederive auseful quantitytodecide for well or illconditioning.

Fromthis itisdeducedwhich kindofrequirements theboundaryconditionsshouldmeetinordertohave

a wellconditionedproblem.Itisshown that thisquantity can fruitfullybe usedtoexplain why,e.g., the

multipleshootingtechniqueisstable (incontrast to thesingle shootingone)for certainwellposedproblems

having solutions with differentgrowthbehavior.Theresults are sustainedbyanumber ofexamples.

1. Introduction. The study of boundary value problems

(BVP)

of ODE has

produced a large number of numerical methods. Curiously enough, the conditioning

of the problem as such has received much less attention. In fact, one would like to

have something like condition numbers used in linearalgebra. Obviously inspired by this, e.g.,in

[5]

an analysis is givento decideforthe conditioningofshooting methods

(see

also

[4]). However,

sinceshooting essentiallyis an initialvaluemethod,thisresult

is notreally whatweare aimingat. Indeed,wewouldrather have some criterionthat indicates the nature oftheproblem, notof somemethod.

(As

iswell known,shooting may not be stable at all, despite physical arguments that the problem is well

condi-tioned, cf. e.g.

[7],

[8].)

It

is the intention of this paper to discuss the inherent (in-)stability of a BVP.

This isdoneby introducing condition numbers thatindicatethe possible amplification

of perturbations in an absolute sense (this is induced by the fact that one is usually

interestedin absoluteaccuracy).

From

heuristic arguments,it isnaturaltoask whether the number of increasing

and/or

decreasing solutions (if present) are related to the

number ofboundaryconditionsimposedateachoftheboundary points.Thisquestion

is also of importance, since several algorithms for problems with linear boundary conditions

(cf.

[15])

essentially make use ofknowledgeabout thenumber of indepen-dent relations at the initial point, say. Therefore a substantial part of this paper is

devoted to finding out the relationship between the conditioning and the rank of the boundary conditions. The usefulness of this relationship is demonstrated by Example 6.5.

Although the conditionnumbers inprinciple indicateby how much any possible

errors in the boundary conditions may be amplified, it turns out that they also play

an important role in estimating the global error due to other perturbations. This is

analyzedin

[11

].

An

intriguing question iswhetherthe stability of certain algorithms,

like multiple shooting, is related to the conditioning of the problem. Therefore we

give a stability analysis of this algorithm and show that the condition number is an

important quantity indeed in estimating the global error. Webelieve thatthis aspect

of our results is new and may give a valuable contribution to the understanding of multiple shooting.

The paper is built up as follows: First we give in 2 a more detailed problem

setting and some definitions. Then in 3 we introduce the notion of the condition

number of aproblem. The relation ofthe conditioning tothe boundary conditionsis

consideredin 4.In 5weinvestigate the stabilityofthe multiple shooting algorithm. Finally,weillustrate the analysisof 4,5 byanumber of examplesin 6.

*Receivedbythe editorsMarch26, 1980,and in revised form November4, 1981.

tMathematischInstituut,KatholiekeUniversiteit,Toernooiveld,6525EDNijmegen,theNetherlands. 963

(3)

964 R.M.M. MATTHEIJ

2. The problem setting andsome conventions

2.1.

Norms.

The norm

_I1"

thatwill be used is assumed to be a H61der norm. Only in some special cases do we use the explicit notation

II,

(for some p). The

induced matrix norm will also be denoted by

]].

_I1.

Besides, we shall use the glb of a matrix

A,

definedas

[IAxll

(2

1)

glb

(A)

min

If

A

-1

exists, thenglb

(A)=

IIA-I[

-.

Remark 2.2. If

A

is column equilibrated, say all columns having unit length, thenglb

(A)

will only be small if any of the columnsof

A

makes asmall angle with the subspace spanned by the rest. More precisely, if

A

(all’"" la)

and 0 is the

smallestangle between any column of

A

and the subspace spanned by the rest, then glb2(A)_->sin 0mini=,...,n

[[a/[[a

(cf.

[9,

4]).

2.2. The

ODE

and its solutions. Considerthe ODE

d

(2.3)

_-x(t)

A(t)x(t)+ f(t),

where

A

is an nxn-matrix function and

_f

an n-vector function. Let denote a

fundamental solution, i.e., an nxn-matrix solutionofthehomogeneous part d

(2.4)

_-cb(t)

A

(t)cb(t).

An

interesting class of BVP involves ODE with both increasing solutions and

non-increasing solutions.ForsuchODEit is not restrictive to assumethatthisdichotomy

isreflected in the partitioning of the matrices

(t).

Inspiredby thiswethus have" Assumption 2.5. Let

(t)

bewritten as

(4

(t)[..

14,

(t)).

Letthereexistintegers

k, l,0-<k, -< n, andmoreoverreal numbers

.

>

0,/z

<

0and positiveconstants "/1, "/2

and"/3,such thatforall t,

-

e

[a,/3

with

-

>- there hold

(i)

_il6

(t)ll/ll6

( )11

<--

e-X(’-t),

_J

<=

k,

and,ifk

<

n-l,also

(iii)

II ( )ll/ll (t)ll

k+l<=]<-n-l.

Suchatypeoffundamental solution naturallyarises intheconstantcoefficientcase, where k corresponds to the number of eigenvalueswith apositive real part, and to

the number of eigenvalueswith anegative realpart.

Itis convenientforlater use topartition the matrices

(t)

intoblocks of columns

as is inducedby Assumption 2.5,viz

(2.6)

(t)

where

[2(t)]

may be nonexistentifthereareonly increasing and decreasing solutions, i.e., k

+

n, etc.

There are of course also other classes ofsolutionspaces(e.g.,withsolutionsthat decrease ona certainsubinterval, but increaseoutsidethatinterval). Neverthelesswe

(4)

thinkthat theclass ofODEwithsolutions as indicated in Assumption 2.5 enablesus togive afairly complete description of problems concerning the conditioning.

At

the end of 4wereturn to thisquestion.

2.3. The boundary conditions. The general linear boundary condition

(BC)

for thesolutionx can bewritten as

(2.7)

_Mx(a)+Mx()

b,

where

M,,

_Mo

are n n matricesand b is an n-vector.Fora solutionof

(2.3)

satisfying

(2.7)

wehavethe well known results(el.

[8]):

Property

2.8. There exists a unique solution of

(2.3)

and

(2.7)

if and only if

M

(a

+

M

o

(B)

isnonsingular.

Property

2.9. Asolutionof

(2.3)

and (2.7) isuniqueonlyif rank

_;

n. Quiteoften one has separated

BC,

being of the form

a)

Sx

(a

b,

(2.10)

b)

_Sx

([3)

b,

where

S,

_S

are (not necessarily square)_smatrices and b,

_b

vectors of suitable

dimensions. We willalways assume that

_[--

is an n n matrix.

By

supplementing

S

and

_So

with zeros we can regard

(2.10)

as aspecialcase of

(2.7).

Wehave

_s__

_be

an n n-matrix.Thenthere exists auniquesolution

Property

2.11. Let

so

of(2.3)and

(2.10)

ifand onlyif

is nonsingular.

Forthe sake ofconvenience we willassumethatthematrices

_M

and

_M0

(whether

or notarising fromseparated

BC)

arescaled such that

(2.12)

max

(IIMI[,

IIMo[I)-

a.

If the solution x is unique, then aBVP is called wellposed.

From Property

2.8

we see that for a well posed problem

[M(a)+

Mo(/3)]

-1 must exist.

In

the next section we will see that this existence doesnotimply well conditioningatall.

3. Condition numbers.

In

computing a solution x by some method, rounding errorsand possibly truncationerrors aremade. Their effectsonthecomputedsolution dependontheproblemandonthe methodwhich isused.Itturns outthat the question of how such local errors affect the global erroris strongly connectedwiththe funda-mentalproblem of how perturbations of theBCinfluencethe solution.Itisthepurpose of thissection toanalyze thissensitivity of the solution with respect totheBC. This willeventually leadtothenotionofconditionnumber.Wefirstgiveanexample.

Example 3.1. Considerthe ODE

(3.2)

d2u

dt2 l

OOu,

or in vectornotation,withx (u,

du/dt)

,

(3.3)

(5)

966 R.M.M. MATTHEIJ

Obviouslyafundamentalsolution isgivenby

10t

_e-lt

e

(3.4)

(I)(t)

10e1’ -10

e-l’J

ConsidertheBC du du

(3.5)

u

(0)

+

_,--(0)

1

_""

81,

u’(1)

+

8--(1)

1

+e.

Define in an obvious way x

=

(u

,

du’/dt)

T

(where

the superscript e denotes the

dependenceon thevector (el,

e2)T).

Thenwehave inmatrix notation

(3.6)

x

(0)+

1 x

(1)=

l+e

Supposeweareinterested inthe solution x

,

but for somereasonwehaveaperturbed BC

(3.6)

withe 0. Then theerror

_z:=

x

-

x satisfies the BC

(.7

0

z(0l+

z(l=

2

Since is afundamental solution of

(3.3)

there existsavector ve

N

suchthat

(3.8)

z

=v.

Substitutingthis in

(3.7)

wefind:

(3.9)

v

(/, )

[

e-lO(1

108)

+10(

k-e 1

+

108)

10y

+

e2

where

[x(/,

()]-1

=[e-lO(l_

108)(1+

10,)-el(1

10/)(1

+

108)].

Hence for the error u (t)-

u(t)

we obtainin particular

(3.10)

u(t)-u(t)=(.r,

8)[{el(’-l)(1-1OS)-e

108

+

1)}el

+

{elt(10T

1)

+

e-it(10"y

+

1)}e:].

If

₃₌₆

=0, then we see that (0,

0)e

-1,

i.e., the error

lu(t)-u(t)l

nowhere exceeds

]el[

+

[e.[

by very much.If, however,

),=

or6

-1-t6

then

lu(t)-u(t)]

may be a factor e2 larger (for, e.g., 6 and

,,

resp.

0).

The former case apparently is

"wellconditioned,"whereas the latter turnsout tobe "ill conditioned."

Remark 3.11.

It

is well known that aninitialvalueproblem or a terminal value

problem for

(3.2)

isnotwell conditioned.

Note,

however, thatthe errorsareamplified thenbyafactor =e1 atmost.

Remark 3.12. Since

(3.7)

and

(3.8)

give a linear equation for v,

Ov

[;]

say,

one might have the feeling that the condition number

[IO[1110-111

of this system is

indicative of well conditioning (cf.

[18,

p.

191]). However,

if

,

6 0, this number

is e1and if, e.g.,

,

6 0 this numberis e

3.

Althoughthislast number clearly indicates that the latter case is more ill conditioned than theformer,wemay conclude thatitis notthe decisive criterion.

Thereasonthat the quantity

1[O[1110-11[

in Remark3.12is notthe decisivecriterion

is, ofcourse,thatwe are interested incb(t)v and notin v.This induces the following:

DEFINITION 3.13. The condition number of theBVPisdefinedas

(6)

Remark 3.14. The number

c

is in fact independent of the choice of the fundamental solution. This follows from the observation that for any other fundamental solution

,

say, there existsanonsingularmatrix

P,

say, such that

Vt(t)-

(t)P.

Suppose

_I1’

II-

_{I1" I1}

in Example 3.1. Then from

(3.4)

and

(3.9)

itcan easily beseen

that for 3’ 6 0, the value of

c

11 and for 3’ and 6 0 this value will be

11 e

2.

Hence

c

givesaproper description of the actualerroramplification. Belowweshow more generally that

c

isthe right criteriontoindicate possible

erroramplification of perturbations of the

BC,

indeed.Tothisend,let 6M,

_6Mt

and 6b be perturbations of

Ms,

M,

and b, respectively. Let be a solution of

(2.3)

and theBC

(3.15)

(M+6M,,)Y(a)+(Mt+6M)Y()=b+6b.

Then there certainlyexists a fixed vectorv, such that

(3.16)

x(t)-Y(t)

(t)v

for all

[a,/3].

For

ease of writingweintroducethe matrices

O

and

60

defined by

a)

Q :=

m(a)

+

m(),

(3.17)

b)

60

:=

_6M

(a

+

_6M

( ).

Firstweprovethat

11600-11[

canbe estimated intermsof

cW

and the perturbations’

LEMMA 3.18.

ll6Ol)-ll[

(IIM[I

+

IIM[I).

We

nowhave:

THEOREM 3.19. Lete be such that 0

<

e

<

1/2cgW.

_If

theperturbations are such

that max

(IIM[I,

IIMII,

IIb[I)=

--<

,

then the solution

_of

(3.5)

and

(2.3)

_satisfies

max

IIx(t)- (t)ll

t[a,/3]

g,N’6 ₁

₊

_llx

_(,)II

₊

_llx

_()II)

1 2 cW6

Moreover,

givenany such perturbation bound 6

<-e,

there alwaysexists a solution

_of

(3.15)

and

(2.3)

suchthat

(

_{/llx()ll/llx(t)ll)<}

max

IIx(t)-(t)ll

<

1

+

2 c6

w(

+

_llx

()11 +

_llx

()11)

1 2cW6

Proof.

From (2.7),

(3.15),

(3.16)

and

(3.17)

weobtain

Hence

(a)

(O

+

6O)v

w :=

6b-6Mx(a)-6Mtx().

v=O-1(l+6OO-1)-’w.

Substitution of (a) into

(3.16)

and taking norms gives

ll(t)vll<-(1/(1

IIoo-xll))llwll;

fromthisthe right-handsideinequalities easily follow.

To

show the left-hand side inequality,we first notethat for

any

6b with

_IIbll-

,

there exist 6M,

_6Mo

with

IIMII

_[IMII

,

IIMx()[I

llx()ll

and

_IIMx(t)ll

llx(t)ll,

and moreover

6Mx(a)

and

6Mox(fl)

are equal to -6b, apart from positive constants (this follows from

[16,

6]).

For a vector w as above, composed of such

vectors,there apparently holds

(7)

968 r. M. M. MATTHEIJ

Nowletthevector u and the point

f [a,/3]

be such that

(c)

IIo(t)O- ull--

_llull.

Then we choose

w(=6b-,M,,x(a)-6Max(#))

such that

[I+6OQ-1]-lw

has the direction ofthe (maximizing)vector u in

(c).

On accountof thischoice andof

(b)

we

thusfindthat there exists a solution

Y

such thatfor

O()v(=x()-())

the following bound holds"

1 26

In

ordertohave amoreworkable quantityfor wemayestimate itby

(3.20)

max

te[,]

Such an estimate appears, e.g., in

[5].

However, if we choose

(a)=L

as is done

there, thenmax

(t)]

threatenstobe the dominantquantity in

(3.20).

For Example 3.1 thiswould mean thatwe would havean estimatethat is afactor e too large (it

indicates the conditioning of some algorithms though, cf.

(5.9)).

We will not have those problems ifwescale appropriately.Therefore,inaddition toAssumption2.5,

we moreover have the (not restrictive)"

Assumption 3.21. Let @ be such that max[,0]

[](t)]]

1

Vp.q

maxt

[[P(t)]]

max [[q(s)

(N.B.

eP(t)denotes

the pth column of

(t)).

Onaccountof Assumption 3.21 wecan estimate by

(3.22)

:=

_]][M

(a)

+

_Mo

(fl)]-l.

Below, it isshown howsharply isestimatedby

Property

3.23. Supposeweusethelinear norm. Define

R (t)

by

(3.24)

R

(t):=

(t).

diag

(]

(t)]],

(t)[[).

and

Then min,glbl(R

(t))K

<-cCA# <-K.

Proof.

maxt

IlcP(t)O-11]

_>-maxt{glba

(R(t))

minp

I]P(t)ll}

>_-mintglb

(R(t)).

Remark 3.25. To appreciate

Property

3.23, one should realize that glb

R(t)

is in fact the condition number of

R(t)

(of.

Rem. 2.2). In

the autonomous case,

R(t)

may be thought of as thematrix ofscaled eigenvectors; except that if they are skew the lowerboundin

Property

3.23 is substantially smaller than

We

can also show that the scaling assumption 3.21 is optimal regarding the

estimatein

(3.20),

if we usethelinearnorm, asfollows from"

Property

3.26. Let denote the classofnonsingulardiagonal matrices.For D

let

K(D)=

]][M,,

(a )D

+McP(B)D]-II1.

Then

min max

IlcP(t)II(D)=

max

IlcP(t)[l(I)=

.

Dt[,/3] te[c,/3]

Proof.

Theproof essentiallyusesthe arguments as given in

[16,

Thm.

23].

F1

Sincethe quantity is easier tohandle than A; and moreover has turned out tobe afairlysharp estimate of

V,

wewilluse ratherthan A; furtheron.

(8)

Remark 3.28. For numerical purposes we should in fact consider a somewhat

different conditionnumber.Indeed,inpracticewedeal ratherwith adiscretized

ODE,

where wehave (approximate)solutions definedon some grid only. The introduction

of conditionnumbers ofsuch discrete problems goes similarlytothe continuouscase,

however.

In [12]

itis shown that forasufficientlyfinegrid onecanconstruct adiscrete fundamentalsolutionthat isclose innormtothe continuous fundamental

solution

;

the conditioning of the discrete and continuous case are almost the same, then. For

stiffproblems, however, this is notalways true, as isshownin

[11].

4. The conditioning of some modelproblems.

As

we saw in Example 3.1, the

conditioning may change dramatically if we change the BC. Therefore it is of great importance to investigate the relation of the conditioning to the BC in general.

In

particularweare interested inthe questionofwhether anot-largeconditionnumber might imply that the BC has some special properties. Theanswer ispositive, and as aninteresting application ofthis we indicate inExample 6.5 whyan essential partof

theGodunov-Conte algorithmisstableindeed.

In

order to be able tosay whether ornot aBVP is well conditioned we have to

bemorespecific about the interpretation of the actual value of K (which willbeused

rather than the number

).

In particular, we want to avoid inexact use of the

terminology "order of."

One possibility is to consider ODE defined on

[0, ). By

imposing the BC on

the solution x defined on

[0,

T]

for some T

>

0, we can expect theill conditioning,

if present, to become more and more pronounced as T-->

.

Of course, this setting automaticallycoversBVPoninfiniteintervalsandinthis senseitshowssomesimilarity

withthe approach in

[6].

Another waytocircumvent asloppy use ofthe "order-of" terminology would be to consider afamily of ODE wherewe can let A and -/x go to infinity somehow. Such a setting would also cover certain singular perturbation problems.

As

thereader may verify quite easily, the latter approach would givesimilar

results.Wedonotconsider thishere,however.Wewouldliketostressthat the results of the theorems below can be used for a qualitative interpretation of finite interval

BVP,

i.e., the infinite cases below give agood insight intothe nature ofthe problem forfiniteintervals(cf. Example

6.3).

Assumption 4.1. LettheODEbedefined on

[0, ). For

each

T

define on

[0, T]

afundamental solution7-,satisfying the requirementsof asstatedinAssumptions 2.5 and3.21.

Nowconsider the family ofBC

(4.2)

Mox

(0)

+

Moox

(T)

b, T

>

O.

DEFINITION 4.3. The family ofBVPwithBC

(4.2)

(the

BVP,

for

short)

is called

well conditioned if and ill conditioned if

]l[Mo7-(0)

/

Moo7-(T)]-ll1-1

_{o(1), T}

.

_The_{results below}_{will concentrate on}_how

manyindependent conditions are needed at bothboundarypoints and also how the conditioning depends on the relation between the directions of the fundamental

solutions at the boundary points and

Mo

and

Mo.

Forthe sake ofcompleteness, we

recall thatkindicatesthe dimension ofthe dominant subspace, andl,ofthe dominated subspace.

THEOREM4.4. The BVP isillconditioned

_if

rank

(Mo) <

orrank

(M)<

k.

Proof.

Write

(t)

instead of

T(t). Suppose,

e.g., rank(Mo)=p

< I.

Then also rank

(Mo(0))

<_-p

<

l.Let Sbean n

(n

-p)matrixof rankn-p such that

Mo(O)S

(9)

970 R.M.M. MATTHEIJ

[ISrll

1 and

[dpl(T)Idp2(T)If]Sr

0.Forthevector s

n,

s := Srwethenfind

IlEM0(0)

+ Mop(T)]sll

IlMoo[p3(T)]sll

<=

_IIMII

[IE3(T)]II

O(e").

Hence

IIEMo(O)

+

M(T)]-II

-

o().

Remark 4.5. It is not dicult to see that the requirements rank

(Mo)

and rank

(M)

k (cf. Theorem

4.4)

donotimplyawell conditioning of theBVP. Tosee

thisoneshould consideraBCwhere

Mo

and

M

have zerosinthe last row.

A

more refinedresultiscontained in" THZOZM 4.6. The BVPis ill conditioned

g

rank

(M0[(0)[(0)])<

or rank

(M[(T)[(T)])

<

k.

Proof.

Liketheproofof Theorem4.4,except for takingS such that

Mo[

[2(0)13(0)]S

0.

Remark 4.7. Theorem 4.6 can be interpreted as follows:

Suppose

precisely s

independentrow vectors of

M0

are orthogonal to

g+a(0),...,

(0)

and precisely independent row vectors of

M

are orthogonal to

Oa(T),..., n-l(T).

Then the problem is ill conditioned if rank

(Mo)-S <

or rank

(M)-t <

k.

In

practice, this

means that if we have near orthogonality, we can already expect ill conditioning in the finiteinterval case.

Remark 4.8. Note that Theorems 4.4 and 4.6 imply necessary conditions for well conditioning. This will be useful in practical problems which are known to be well conditioned (e.g.,onphysicalgrounds).

In

order to give apositiveresult for well conditioning, weshould require that a

"near orthogonality," such as was noted in Remark 4.7, does not take place. The

nexttheorem givesanecessaryand sufficientconditionin casewehaveaseparated BC.

THZOREM 4.9. Let the

BVP

haveaseparated

BC,

i.e.,

Assume

either rank

(So)=

or rank

(S)=

k. Then

either

IllSoE.(o)l,(o)]Fl[

o()and

or

{So[(O)])-x

0(1)

and

{S[(T)IV(T)])-x=

0(1).

Proof.

From (3.27)

weobtainthat thecondition numberx equals

So(R) (o)

(a) Let

e.g.

II(So[(0)])-ll

o()

and

II{So[(T)I(T)])-II

0(1).

Let

T

be sucient]ylarge,then rank

(So)=

l.

Now

write

Sor(O)

]

Q=

S-(T)

(10)

Here C is an xt-matrix and

[DID

z]

an (n -l)x(n-l) matrix,

IIB’II

O(e-XT),

]IE[I

O(e"T).

If

B

z 0then there existssomematrix

R

suchthatCR

+

Bz 0. (N.B.

R

R (T)!)

Nowdefine

Z

R

I

k

Then

0

equals Q but

_for2the

matrix B

2,

where has zeros, and the matrix D

2,

where

0

has a matrix

15 =DZ+ER.

(N.B.

[IERII

O(e"r).)

Write with obvious

meaning Then we have o(1) /5-’+o(1) -’+o(I) o(1)

Hence it follows from the assumption that

II0-111

O(1), but since

I[/-111

O(1)it

alsofollows that

IIO-111-

O(1).

(b) Suppose the assertion is not true. Let, e.g., glb

(So[3r(0)])=

o(1). Let y R

be such that

II.s0[,3(0)]yll-

glb

($0[*-(0)])[lyll.

Use

the notation

O

:=

[g

c]

as above

and define

Cy

z’;

Ey

z

z.

Then

Y

<llzlll+llz2[l-

_glb_{(C)+ O(e}

_"T)

_o(1).

glb(O)_-<

[[yll

[lyll

Henceeither

O-I(T)

does not existforalmost allTor

[IO-1[[

-1 o(1),i.e.,theproblem

is notwell conditioned. Theother casesgo essentially similarly. VI

COROLLARY 4.10.

_If

the BVP in Theorem 4.9 is well conditioned then <-rank(So)<- n k and k<-_rank_(S)-<_ n

I.

There remainsanumberoftypesofODEthatare notrepresentedin theforegoing.

We hope that some examples will suffice to show that, e.g., solution spaces with

partially increasing and partially decreasingsolutions canbe dealtwith in a similarway.

Example 4.11. Consider the ODE

u"-4tu’+(4t2-2)u=O;

this has basis

sol-utions u(t)=etz and u(t)= e

’2.

Leta =-T

and/3

T.

After rewritingthis ODE in a companion matrix form we obtain a fundamental solution that, after scaling according toAssumption 3.21, has a (0) and a (T) with elements of orderunity.

As

far as the condition number is concerned, we therefore can expect a situation

(11)

972 R.M.M. MATTHEIJ

u"

+

4tu’

+

(4t2

+

2)u

0, wherethe basis

sol---t --t

utions are

u(t)=

e and

u(t)=

e Again let

=-T

and

T.

After rewriting

-T

and scaling wenow have

(0)

and

(T)

beingof the order ofe This meansthat

T

we canexpectthe conditionnumbertobe of the order of e atleast.

5. The stability ofmultiple shooting.

A

very successful improvement of (single)

shooting is multiple shooting (cf.

[17,

pp. 170 if.I). For good surveys see, e.g.,

[2],

[8].

Although there seems to be a common opinion about the advantages of this

algorithm, the arguments vary quite substantially (cf.

[2], [4],

[5], [8], [14], [15]).

We

thinkthatournotionof conditioningcanbehelpfultoclearupsomequestionsabout

the stabilityof thisalgorithm.

We first give a brief description of the method. Let

[a,/3]

be divided into subintervals [ti,

ti+l],

0,’’’, N-1 (so to a,

tn =/).

Then for each aparticular

solution _ri and a fundamental solution

_F

is computed on

Its,

ti/l]. Hence for each thereexists a vectorvi such that

(5.1)

x(t)

Fi(t)vi

+

ri(t).

Forsimplicity we assume thatF(ti)=/.

By

matching the relations

(5.1)

at the points

tl," ",tn-1weobtain a recurrence relation forthe vg

(5.2)

Aivi-1

vi

+

gi, 1,

,

N-1

where

(5.3)

Ai

=F-l(ti) and gg r,(ti)-rg_l(ti).

The matrix

_Ai

in (5.3) is the increment of the fundamental solution going from ti-1 to ti. Inparticularwehave

(5.4)

Fo(t)

Fi(t)mi

A

1,

>=

ti.

Ifwe use (5.1)in theBC

(2.7)

and addto it therelations

(5.2)

weobtainthe multiple shootingsystem

A1

A2

-I

M

MtAN

:_ I)N-1 gl gN-b Mro(to)

Mt3rN-1

(tN)

(N.B.

An

:= Fn-l(tn), cf.

(5.3)).

The matrix in

(5.5)

will be denoted byP. Although

we actually find incremental matrices Ag of the discretized problem in practice, we

neglectthisfact inourdiscussion.

In

fact,onemay alsothink of

(5.1)-(5.5)

asbelonging

tothediscreteproblem,cf. Remark 3.28.

In order to study the sensitivity with respect to rounding errors in the final

solution,wemust specify how the algorithm is carriedout.All variants have incommon

that theyuse an initial value technique on each subinterval. Therefore, if we denote by

:

themachineconstant, theactually computed

A

andg maycontainerrorsofthe orderof

(5.6)

The variant described, e.g., in

[3],

[17]

uses direct forward recursion in order to

(12)

N-1

(5.7)

vN-1

1-1

Aivo.

i=1

The solution of

(5.5)

is quite simplenow, forwe cansolve for v0.Theremaining

{v}

follow from

(5.2),

and in this way we can determine

{x(t)}. By

using

(5.7)

we may makeanerror invr-1 ofthe order of

(5.8)

max

_A.

:-<

max

Oi<=N

In

solving for Vo we moreover should reckon with an amplification of this error by

II[M +

M0(/)]-ll.

Therefore, roughly speaking,we may expect errors in v0of the order of

(5.9)

max

IIFo(t)ll II[M

+

M0()]-ll.

The factor in

(5.9),

infront of

,

may beconsideredas astabilityconstant ("condition

number")

_for

thisparticular algorithm.

It

is, however, notthe appropriate quantityto

indicate the conditioningofthe multiple shootingprocess in general;notemoreover thatwe rejected this factor to estimate in

(3.20).

Finally, it is noteworthy that single shooting would givea similar stabilityconstant.

The other variants (cf.

[8], [13],

[14])

solve the system

(5.5)

by some stable LU-decomposition ofP.

For

such a method we may therefore expect errors in the

vi x

(t

i) ofthe order of

(5.

0)

IIe-ll

_mox

_Ilmill.

Below we now give a realistic estimate for

IIe-[I.

It will turn out that

I[e-ll

is not

large if u

(so )

is notlarge. On accountof thiswe thusmay conclude that aslong as

max/IIAilI

doesnotexceed the required tolerance for the global error,thealgorithm is numerically stable forawell conditionedproblem

Remark 5.11.

In [14]

abackward analysis has been given for multiple shooting

with separated BC.

An

essential role in this is played by an estimate for unfortunately, aprooffor this estimate is not given in thatpaper. We would like to

remark that our result in Theorem 5.12, below, is different, atleast in a qualitative

sense. The kindof condition number that appears in

[14]

seems tobe based on the

theoreticaldependenceonthe inhomogeneoustermsonly, whereas oursmoreexplicitly shows therelation tothe well conditioningwithrespecttotheBC.

We

have:

THEORE 5.12. Let

_I1"

_II:

_I1"

_I1.

For 0,..., N let

_Ri

be

_defined

by (ti)

Ri

diag

(11

(ti)ll)

(el. (3.24)).

Denote

_x

maxi,

IIR,

IIIIR; II.

Finally,leth

mini

(ti+ ti). Then

-Ah

-,h

+

max ,h,

(N-

1)

+

.

-e 1-e

Pro@

Let beafundamentalsolutionas in Assumptions 2.5 and 3.21. Denote

(t).

Using the

R

we seethat

(a)

G

:=

R

iAR_,

Bi

of orderk isadiagonal matrix.

In

orderto find

P-

we successively solve

(b)

PY(q)=E(q),

q=O, ,N-l,

where Y(q) andE(q) are block vectorsconsisting of n columns. N(q) contains zeros except for

Nq+(q)

I

(byE(q)wedenotethe ithblock). Nowdefinefor suchablock

(13)

974 R.M.M. MATTHEIJ

(c)

ITs(q)

:=

_R7

Y(q), 0,...,N if q

=<

N-1.

From (b)

and

(c)

wesee that

{

?i(q)}0

satisfies aninhomogeneous recursion. Therefore we splitit intosome other suitable particular solution

{

g

(q)}v,

say, andahomogeneous solution

{i(q)}v,

say, inorderto find estimates forthe Y(q).

For

simplicitywe omit

the(q)for themoment.Letthe superscripts 1 and 2 denoteapartitioning corresponding

to (a),i.e.,

_{c,_}

_

k

y:=

.

Lyj Thenasuitableparticular solution

{

0}

isgivenby

-1

(d

)

O=-(q B):,

iNq,

0=0

otherwise;

/=i+1

E,

iq+l,

0:0

otherwise.

q+2

Thehomogeneoussolution

{):: {- 0}

thensatisfiesthe BC

Using the fact that

_Z

Zo

wethereforefindfor

_Y

If q N-1, then

{

(q)}

justsatisfies ahomogeneousrecursion andweimmediately obtain

(g)

Y

[Mo

+M]

-.

The requiredestimatethen follows from (f)and (g). Toseethis, denote for short:

max

(T2

exp

[hi], 7),

n

P

+

--.

p yexp[-&hi], Then for q

<

N-1"

Yo(q)lloo

and for q

N-

1"

Pq.+

<=

X vo

+

Xt N q

2J

(11

Yo(N 1

An

estimate for

IIP-lloo

will be given by considering the maximal row sum in the

matrixwithelements

(11

r/(q)l[oo),

Since

N N-2 N-1 1=I 1=0 1=I N-2

F

’2 "] N-1 e-zh vj

<=max

_[

,h,

(N-

1)73

P1

<=’Y1

-xh, =o 1--e a =1 1--e

the estimateimmediatelyfollows. [-1

COROLLARY

5.13.

_If

(k

+

l)<n

(so

there are solutions that do not increase nor

(14)

COROLLARY5.14.

I]:

(k

+

I)

n andAh,

-Izh

are notsmall, then

[Ie-ll]

_x.

If,

however, Ahor

_-tzh

issmall, then

[Ie-’ll

again.

Remark 5.15. The result in Corollary 5.13 regarding the factor

N

israthersharp,

as can be seen from a P where the matrices

_Ai

and

_M

are equal to

I

and where

Ms

0. Herewehave

IIe-’ll

N.

Remark 5.16. Since the norms of the incremental matrices

[IA[[

are usually significantly larger than1,weseethat

_Ilell

maxi

IIAII.

Thereforewe obtainfrom

(5.10)

that

IIe-’llllell

givesagood ideaof theabsoluteroundingerroramplification.

6. Examples.

In

this section wegiveanumber ofexamplestoillustratethe results of 4 and5.

In

ordertoeliminate discretizationerrors, weconstructed the multiple shootingmatricesfrom analytically known fundamentalsolutions.Thenorm isassumed

tobethe supremum norm

_I1"

_Iloo.

dx 1-2 os2t 0 l+2sin

2 0 x

(a)

dt

[_-l+2sin2t 0 l+2cos2t

on

[0,

r]. A

fundamental solution

Fo

(with

Fo(0)

I)

is givenby

(b)

sit

0 -cost

I

-’).

Fo(t)

1 0 diag(e

3’,

e

’,

e [_cost 0 sint

Assume

_Mo

M,

L

Forthe scaledfundamentalsolution weobtain

(c)

Hence

(0)=

0 e-2

I,(Tr)

0 1 0 --3-H-e 0 -1 0 0 0 0 -l+e

-3

M0(0)

+

M(Tr)

0 1/e-2 0 --l+e-3 0 0

We

thus find K

I[-1

+

e-]-ll

1.05. It can also be checked that cCW1.05. This

sharpnessofthe estimateK isconfirmedby 3.23, fromwhich we deduce that

/x/

<=

cW

<_-x.Therefore, weshould callthisBVP wellconditioned. Using thenotationfor

thedimension of thedominantsolutionspace, k,and the subdominantsolutionspace,

l,asinAssumption2.5,weapparently havek 2 and 1.

From

the wellconditioning,

wemay concludethat rank

(Mo)

->1 and rank

(M)=>

2, asfollowsfromTheorem 4.4.

This istrivially true, of course.We also computed the multiple shootingmatrix. For

this we used N+1 equally spaced points to,"’,

tN.

For checking the estimate in Theorem 5.12 it is important to notethat h’,71 and ’}/2are about

x/. Moreover,

for

Nnot toosmall,we seethatmax

(e-X/N/(1

e

-/),

1/1

e

"/lv)

N/7r

(N.B.

h 2,

/x--1).

Hence as an estimate for

IIP-ll

we would find

IlP-l[4N/zr.

In

Table 6.1

we have given the actually computed values; they seem to agree with this estimate apart fromafactor 4.

(15)

976 R.M.M. MATTHEIJ 1 3 6 9 12 15 18 21 24 TABLE 6.1 1.1 1.5 2.4 3.4 4.3 5.3 6.2 7.2 8.2 12.4.10 37.2 14.2 13.1 13.9 15.2 17.8 18.5 20.2 N 1 3 6 9 12 15 18 21 24 TABLE 6.2 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.1 1.1 1.9 1027 1.2 109 3.5 104 1.1" 10 1.9 102 6.9’101 3.5" 10 2.2’101 1.6 101

AswesaidinRemark5.16,

IIPII

may be consideredtobeindicativeof themaximal incrementalgrowth. Thereforewehave also given the conditionnumber ofPin Table

6.1,which canbeused to estimatetheerror in

(5.10).

Note that the valuesforN- 1 correspondtosingle shooting.

Example 6.2.

Suppose

wehave anODElike

(6.1a)

butnow with

Fo(t)

1 0

LCOS 0 sin

diag

(e

TM,e

19t,

e

-18t).

We nowhave K 1.0 (upto26 decimals).

In

Table 6.2wehave given values for the

corresponding multiple shootingmatrix. Since we have a strong dichotomy now, we

almost do not see adependenceonN (cf. Cor.

5.14).

ForsmallerNthe incremental

values are very largenow. Ifwe have a 16-digit mantissa, we cannothope to obtain

morethan 7 accurate decimals forN 3, forexample; notethat single shootingwill

thencompletelyfail.

Example 6.3. Again consider a similar

ODE,

nowwith a fundamentalsolution

Fo

given by

sint 0 -cost

1 (a)

Fo(t)

0 1 0

cos 0 sin

and aseparated BC,where

diag(e

3t,

1,

e-’)

So=(1

0 0),

S==

0 0

In

ordertoinvestigate the conditioningweconsider

(b)

I

o

0 -1

Q=L----7_)j

0 1

-1 0

Hence K 1. Wealsofind

(c)

1 and 1.

Theresultsin

(b)

and

(c)

are in agreementwith a finiteinterpretation of Remark 4.8.

For this, one should replace astatement like

"lla(r)ll

O(1), T-, o" by "a(zr) not

(16)

This is confirmedby Table 6.3. N 3 6 9 12 15 18 21 24 TABLF. 6.3 TABLE 6.4 3 6 9 12 15 18 21 24 .84 103 .68 104 .17.105 .28.105 .41.105 .55.105 .69.105 .83 105

Example 6.4. LettheODEbe as in(6.1a), butnowwithBC

Mo

0

_M=

0 1

0 0 0

Sincerank

(Mo)

2 (>1 l)and rank

_M

2(=>k),wecannot conclude illconditioning

on account of Theorem 4.4. Similarly, since rank

(Mo[3

(0)])

1 and

rank

(M[l(cr)])

2, we cannot conclude ill conditioning on account of Theorem 4.6. Finally, since moreover rank

(Mo)

+

rank

(M.) >

3 and even rank

(Mo

+

M.)

3

wemight hopeto haveawell conditionedproblem. Neverthelesswe obtain

-1 0

-11

(a)

O

0 1 0 so : 1

+

e3.12 10

5.

--3-rr

e 0 0

The values for

lIp-ill

in Table 6.4 confirm this. The resultin

(a)

also shows that an

approach such as issuggested in

[5,

e.g., Thm.

3.3]

is notvery relevant as far as the

estimates for the "condition number"

IIMo+M IIIIEMo+M, ]- II

are concerned. Also

the other quantity which appears there, in our notation max/o.,l

doesnotreallymatterin arealisticbound for error amplification, as followsfrom our

discussionin 5. Notethat the latterquantity e4 in thisexample; if wewould have hadadecreasingsolutionbehavinglikee

"t,/x

<

0,then this quantity would have been

=e Nevertheless K would remain the same (and therefore

lIP-111

would not

change drastically). It goes without saying that

IIPII

wouldbe almost independent of the magnitude

of/x

(cf. Remark

5.16).

Example6.5. Theorems4.4,4.6 and 4.9 also provide useful argumentstoexplain the stability ofcertain algorithms.

As

an example, we considerthe Godunov-Conte algorithm

[1], [14],

[15]: In

order to have stability, we e.g., have to prove that an essentialpart of themethod, viz. the backward recursion, cf.

[15, Eq. (11)],

is stable.

For this, assume that

+

k n and that the BVP (with separated

BC)

is well condi-tioned. Then it follows from Theorem 4.4 that rank

(So)=

I.

This means that the

orthogonal complement ofthespacespanned by therows of

_So

mustbe k-dimensional.

The ideaof the algorithmnow is tocompute k independentbasissolutions, in matrix notation

F,

say, such that

SoFa(a)=

0, i.e., span

(F(a))2span

(Sot).

On the other

(17)

978 R.M.M. MATTHEIJ

a dominated solution) can be

(almost)

orthogonal to the space spanned by the rows

of

So.

Thereforewe conclude that

F(a)

must consist of dominantsolutions. At the shooting pointsweonly reorthogonalizethecolumns of thesesolutionvectors, whence we see that span

(F/

(t))=span

(F(t)).

On account of the analysis a given in

[9],

[10],

it can then be deduced that the corresponding incremental matrices define a

recursion that is indeed stable in the backward direction. For a different kind of analysis, see

14].

REFERENCES

[1] S. D. CONTE, The numerical solutionoflinearboundary value problems,SIAM Rev.,8 (1966),pp.

309-321.

[2] P. DEUFLHARD,Nonlinearequationsolversinboundary valueproblem codes,in Codes forBoundary ValueProblemsinOrdinaryDifferential Equations, Childs, B. ed.,Lecture NotesinComputer

Sciences, 76, Springer,NewYork, 1979.

[3]

,

Recentadvancesinmultipleshooting techniques,Prepr.41,Inst.fiirAngewandte Mathematik,

Univ. Heidelberg,1979.

[4] J. C. FALKENBERG, A methodforintegrationofunstablesystems_ofordinary differentialequations

subfecttotwopointboundary conditions,BIT,8(1968),pp. 86-103.

[5] J.H. GEORGE ANDR. W. GUNDERSON, Conditioning_oflinearboundary value problems,BIT, 12

(1972),pp.172-181.

[6] F. R. DEHOOGANDR. WEISS, Onthe boundaryvalueproblemforsystemsofordinarydifferential equations, withasingularity_ofthe second kind,SIAM J.Math.Anal., 11 (1980),pp.41-60.

[7] H. B. KELLER, NumericalMethodsfor Two-PointBoundary-ValueProblems, Blaisdell, Waltham,

MA,1968.

[8]

,

Numerical Solution ofTwo-PointBoundary ValueProblems, CBMS Regional Conference

Series inApplied Mathematics,24, Society forIndustrial and AppliedMathematics, Philadelphia, 1976.

[9] R. M. M. MATTHEIJ, Characterizations _ofdominantand dominated solutions oflinearrecursions,

Numer.Math.,35(1980),pp. 421-442.

10]

.,

Stablecomputation_ofsolutions_ofunstable linear initialvaluerecursions,Rep.8108, Nijmegen, 1981.BIT,22:1(1982),toappear.

[11]

,

Estimatesfortheerrors inthe solutionsoflinearboundaryvalueproblems duetoperturbations,

Computing, 27(1981),pp. 299-318.

[12]

,

Accurateestimatesfor thefundamentalsolutions_ofdiscrete boundary value problems, Rep.

8112, Nijmegen, 1981.

[13] M. R. OSBORNE, Onshootingmethodsforboundary valueproblems, J.Math.Anal.Appl., 27(1969),

pp. 417-433.

[14]

,

The stabilized marchisstable,thisJournal, 17(1979),pp.923-933.

[15] M.R. SCOTTANDn.A. WATTS,Computationalsolutionoflineartwopointboundary value problems

viaorthonormalization,thisJournal, 14(1977),pp.40-70.

[16] A.VAN DERSLUIS,Conditionnumbers and equilibrationofmatrices,Numer.Math., 14(1969),pp. 14-23.

[17] J. STOERANDR. BULIRSCH,Einfiihrungin die Numerische Mathematik11, Springer, Berlin, 1973.