The conditioning of linear boundary value problems
Citation for published version (APA):
Mattheij, R. M. M. (1982). The conditioning of linear boundary value problems. SIAM Journal on Numerical Analysis, 19(5), 963-978. https://doi.org/10.1137/0719070
DOI:
10.1137/0719070
Document status and date: Published: 01/01/1982
Document Version:
Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)
Please check the document version of this publication:
• A submitted manuscript is the version of the article upon submission and before peer-review. There can be important differences between the submitted version and the official published version of record. People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website.
• The final author version and the galley proof are versions of the publication after peer review.
• The final published version features the final layout of the paper including the volume, issue and page numbers.
Link to publication
General rights
Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain
• You may freely distribute the URL identifying the publication in the public portal.
If the publication is distributed under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license above, please follow below link for the End User Agreement:
www.tue.nl/taverne
Take down policy
If you believe that this document breaches copyright please contact us at:
openaccess@tue.nl
providing details and we will investigate your claim.
THE CONDITIONING
OFLINEAR
BOUNDARY VALUE
PROBLEMS*
R. M. M. MATTHEIJtAbstract. Investigationismadeintothesensitivityof solutions of linearboundary value problemsto
perturbationsofthe boundarycondition.Wederive auseful quantitytodecide for well or illconditioning.
Fromthis itisdeducedwhich kindofrequirements theboundaryconditionsshouldmeetinordertohave
a wellconditionedproblem.Itisshown that thisquantity can fruitfullybe usedtoexplain why,e.g., the
multipleshootingtechniqueisstable (incontrast to thesingle shootingone)for certainwellposedproblems
having solutions with differentgrowthbehavior.Theresults are sustainedbyanumber ofexamples.
1. Introduction. The study of boundary value problems
(BVP)
of ODE hasproduced a large number of numerical methods. Curiously enough, the conditioning
of the problem as such has received much less attention. In fact, one would like to
have something like condition numbers used in linearalgebra. Obviously inspired by this, e.g.,in
[5]
an analysis is givento decideforthe conditioningofshooting methods(see
also[4]). However,
sinceshooting essentiallyis an initialvaluemethod,thisresultis notreally whatweare aimingat. Indeed,wewouldrather have some criterionthat indicates the nature oftheproblem, notof somemethod.
(As
iswell known,shooting may not be stable at all, despite physical arguments that the problem is wellcondi-tioned, cf. e.g.
[7],
[8].)
It
is the intention of this paper to discuss the inherent (in-)stability of a BVP.This isdoneby introducing condition numbers thatindicatethe possible amplification
of perturbations in an absolute sense (this is induced by the fact that one is usually
interestedin absoluteaccuracy).
From
heuristic arguments,it isnaturaltoask whether the number of increasingand/or
decreasing solutions (if present) are related to thenumber ofboundaryconditionsimposedateachoftheboundary points.Thisquestion
is also of importance, since several algorithms for problems with linear boundary conditions
(cf.
[15])
essentially make use ofknowledgeabout thenumber of indepen-dent relations at the initial point, say. Therefore a substantial part of this paper isdevoted to finding out the relationship between the conditioning and the rank of the boundary conditions. The usefulness of this relationship is demonstrated by Example 6.5.
Although the conditionnumbers inprinciple indicateby how much any possible
errors in the boundary conditions may be amplified, it turns out that they also play
an important role in estimating the global error due to other perturbations. This is
analyzedin
[11
].An
intriguing question iswhetherthe stability of certain algorithms,like multiple shooting, is related to the conditioning of the problem. Therefore we
give a stability analysis of this algorithm and show that the condition number is an
important quantity indeed in estimating the global error. Webelieve thatthis aspect
of our results is new and may give a valuable contribution to the understanding of multiple shooting.
The paper is built up as follows: First we give in 2 a more detailed problem
setting and some definitions. Then in 3 we introduce the notion of the condition
number of aproblem. The relation ofthe conditioning tothe boundary conditionsis
consideredin 4.In 5weinvestigate the stabilityofthe multiple shooting algorithm. Finally,weillustrate the analysisof 4,5 byanumber of examplesin 6.
*Receivedbythe editorsMarch26, 1980,and in revised form November4, 1981.
tMathematischInstituut,KatholiekeUniversiteit,Toernooiveld,6525EDNijmegen,theNetherlands. 963
964 R.M.M. MATTHEIJ
2. The problem setting andsome conventions
2.1.
Norms.
The normI1"
thatwill be used is assumed to be a H61der norm. Only in some special cases do we use the explicit notationII,
(for some p). Theinduced matrix norm will also be denoted by
]].
I1.
Besides, we shall use the glb of a matrixA,
definedas[IAxll
(2
1)
glb(A)
minIf
A
-1exists, thenglb
(A)=
IIA-I[
-.
Remark 2.2. If
A
is column equilibrated, say all columns having unit length, thenglb(A)
will only be small if any of the columnsofA
makes asmall angle with the subspace spanned by the rest. More precisely, ifA
(all’"" la)
and 0 is thesmallestangle between any column of
A
and the subspace spanned by the rest, then glb2(A)_->sin 0mini=,...,n[[a/[[a
(cf.[9,
4]).
2.2. The
ODE
and its solutions. Considerthe ODEd
(2.3)
-x(t)
A(t)x(t)+ f(t),
where
A
is an nxn-matrix function andf
an n-vector function. Let denote afundamental solution, i.e., an nxn-matrix solutionofthehomogeneous part d
(2.4)
-cb(t)
A
(t)cb(t).An
interesting class of BVP involves ODE with both increasing solutions andnon-increasing solutions.ForsuchODEit is not restrictive to assumethatthisdichotomy
isreflected in the partitioning of the matrices
(t).
Inspiredby thiswethus have" Assumption 2.5. Let(t)
bewritten as(4
(t)[..
14,
(t)).
Letthereexistintegersk, l,0-<k, -< n, andmoreoverreal numbers
.
>
0,/z<
0and positiveconstants "/1, "/2and"/3,such thatforall t,
-
e[a,/3
with-
>- there hold(i)
il6
(t)ll/ll6
( )11
<--
e-X(’-t),
J
<=
k,and,ifk
<
n-l,also(iii)
II ( )ll/ll (t)ll
k+l<=]<-n-l.
Suchatypeoffundamental solution naturallyarises intheconstantcoefficientcase, where k corresponds to the number of eigenvalueswith apositive real part, and to
the number of eigenvalueswith anegative realpart.
Itis convenientforlater use topartition the matrices
(t)
intoblocks of columnsas is inducedby Assumption 2.5,viz
(2.6)
(t)where
[2(t)]
may be nonexistentifthereareonly increasing and decreasing solutions, i.e., k+
n, etc.There are of course also other classes ofsolutionspaces(e.g.,withsolutionsthat decrease ona certainsubinterval, but increaseoutsidethatinterval). Neverthelesswe
thinkthat theclass ofODEwithsolutions as indicated in Assumption 2.5 enablesus togive afairly complete description of problems concerning the conditioning.
At
the end of 4wereturn to thisquestion.2.3. The boundary conditions. The general linear boundary condition
(BC)
for thesolutionx can bewritten as(2.7)
Mx(a)+Mx()
b,where
M,,
Mo
are n n matricesand b is an n-vector.Fora solutionof(2.3)
satisfying(2.7)
wehavethe well known results(el.[8]):
Property
2.8. There exists a unique solution of(2.3)
and(2.7)
if and only ifM
(a
+
M
o(B)
isnonsingular.Property
2.9. Asolutionof(2.3)
and (2.7) isuniqueonlyif rank;
n. Quiteoften one has separatedBC,
being of the forma)
Sx
(a
b,(2.10)
b)
Sx
([3)
b,
where
S,
S
are (not necessarily square)smatrices and b,b
vectors of suitabledimensions. We willalways assume that
[--
is an n n matrix.By
supplementingS
andSo
with zeros we can regard(2.10)
as aspecialcase of(2.7).
Wehave_s__
bean n n-matrix.Thenthere exists auniquesolution
Property
2.11. Letso
of(2.3)and
(2.10)
ifand onlyifis nonsingular.
Forthe sake ofconvenience we willassumethatthematrices
M
andM0
(whetheror notarising fromseparated
BC)
arescaled such that(2.12)
max(IIMI[,
IIMo[I)-
a.
If the solution x is unique, then aBVP is called wellposed.
From Property
2.8we see that for a well posed problem
[M(a)+
Mo(/3)]
-1 must exist.In
the next section we will see that this existence doesnotimply well conditioningatall.3. Condition numbers.
In
computing a solution x by some method, rounding errorsand possibly truncationerrors aremade. Their effectsonthecomputedsolution dependontheproblemandonthe methodwhich isused.Itturns outthat the question of how such local errors affect the global erroris strongly connectedwiththe funda-mentalproblem of how perturbations of theBCinfluencethe solution.Itisthepurpose of thissection toanalyze thissensitivity of the solution with respect totheBC. This willeventually leadtothenotionofconditionnumber.Wefirstgiveanexample.Example 3.1. Considerthe ODE
(3.2)
d2u
dt2 l
OOu,
or in vectornotation,withx (u,
du/dt)
,
(3.3)
966 R.M.M. MATTHEIJ
Obviouslyafundamentalsolution isgivenby
10t
e-lt
e(3.4)
(I)(t)
10e1’ -10e-l’J
ConsidertheBC du du(3.5)
u
(0)
+
,--(0)
1""
81,u’(1)
+
8--(1)
1+e.
Define in an obvious way x
=
(u
,
du’/dt)
T(where
the superscript e denotes thedependenceon thevector (el,
e2)T).
Thenwehave inmatrix notation(3.6)
x(0)+
1 x
(1)=
l+e
Supposeweareinterested inthe solution x
,
but for somereasonwehaveaperturbed BC(3.6)
withe 0. Then theerrorz:=
x-
x satisfies the BC(.7
0
z(0l+
z(l=
2Since is afundamental solution of
(3.3)
there existsavector veN
suchthat(3.8)
z=v.
Substitutingthis in
(3.7)
wefind:(3.9)
v(/, )
[
e-lO(1
108)
+10(
k-e 1
+
108)
10y+
e2where
[x(/,
()]-1
=[e-lO(l_
108)(1+
10,)-el(1
10/)(1
+
108)].
Hence for the error u (t)-u(t)
we obtainin particular(3.10)
u(t)-u(t)=(.r,
8)[{el(’-l)(1-1OS)-e
108
+
1)}el
+
{elt(10T
1)
+
e-it(10"y
+
1)}e:].
If
3=6
=0, then we see that (0,0)e
-1,
i.e., the errorlu(t)-u(t)l
nowhere exceeds]el[
+
[e.[
by very much.If, however,),=
or6-1-t6
thenlu(t)-u(t)]
may be a factor e2 larger (for, e.g., 6 and,,
resp.0).
The former case apparently is"wellconditioned,"whereas the latter turnsout tobe "ill conditioned."
Remark 3.11.
It
is well known that aninitialvalueproblem or a terminal valueproblem for
(3.2)
isnotwell conditioned.Note,
however, thatthe errorsareamplified thenbyafactor =e1 atmost.Remark 3.12. Since
(3.7)
and(3.8)
give a linear equation for v,Ov
[;]
say,one might have the feeling that the condition number
[IO[1110-111
of this system isindicative of well conditioning (cf.
[18,
p.191]). However,
if,
6 0, this numberis e1and if, e.g.,
,
,
6 0 this numberis e3.
Althoughthislast number clearly indicates that the latter case is more ill conditioned than theformer,wemay conclude thatitis notthe decisive criterion.Thereasonthat the quantity
1[O[1110-11[
in Remark3.12is notthe decisivecriterionis, ofcourse,thatwe are interested incb(t)v and notin v.This induces the following:
DEFINITION 3.13. The condition number of theBVPisdefinedas
Remark 3.14. The number
c
is in fact independent of the choice of the fundamental solution. This follows from the observation that for any other fundamental solution,
say, there existsanonsingularmatrixP,
say, such thatVt(t)-
(t)P.
Suppose
I1’
II-
I1" I1
in Example 3.1. Then from(3.4)
and(3.9)
itcan easily beseenthat for 3’ 6 0, the value of
c
11 and for 3’ and 6 0 this value will be11 e
2.
Hencec
givesaproper description of the actualerroramplification. Belowweshow more generally thatc
isthe right criteriontoindicate possibleerroramplification of perturbations of the
BC,
indeed.Tothisend,let 6M,6Mt
and 6b be perturbations ofMs,
M,
and b, respectively. Let be a solution of(2.3)
and theBC(3.15)
(M+6M,,)Y(a)+(Mt+6M)Y()=b+6b.
Then there certainlyexists a fixed vectorv, such that
(3.16)
x(t)-Y(t)
(t)v
for all[a,/3].
For
ease of writingweintroducethe matricesO
and60
defined bya)
Q :=m(a)
+
m(),
(3.17)
b)
60
:=6M
(a
+
6M
( ).
Firstweprovethat
11600-11[
canbe estimated intermsofcW
and the perturbations’LEMMA 3.18.
ll6Ol)-ll[
(IIM[I
+
IIM[I).
We
nowhave:THEOREM 3.19. Lete be such that 0
<
e<
1/2cgW.If
theperturbations are suchthat max
(IIM[I,
IIMII,
IIb[I)=
--<
,
then the solutionof
(3.5)
and(2.3)
satisfies
max
IIx(t)- (t)ll
t[a,/3]
g,N’6 1
+
llx
(,)II
+
llx
()II)
1 2 cW6
Moreover,
givenany such perturbation bound 6<-e,
there alwaysexists a solutionof
(3.15)
and(2.3)
suchthat(
/llx()ll/llx(t)ll)<
max
IIx(t)-(t)ll
<1
+
2 c6w(
+
llx
()11 +
llx
()11)
1 2cW6
Proof.
From (2.7),(3.15),
(3.16)
and(3.17)
weobtainHence
(a)
(O
+
6O)v
w :=6b-6Mx(a)-6Mtx().
v=O-1(l+6OO-1)-’w.
Substitution of (a) into
(3.16)
and taking norms givesll(t)vll<-(1/(1
IIoo-xll))llwll;
fromthisthe right-handsideinequalities easily follow.To
show the left-hand side inequality,we first notethat forany
6b withIIbll-
,
there exist 6M,
6Mo
withIIMII
[IMII
,
IIMx()[I
llx()ll
andIIMx(t)ll
llx(t)ll,
and moreover6Mx(a)
and6Mox(fl)
are equal to -6b, apart from positive constants (this follows from[16,
6]).
For a vector w as above, composed of suchvectors,there apparently holds
968 r. M. M. MATTHEIJ
Nowletthevector u and the point
f [a,/3]
be such that(c)
IIo(t)O- ull--
llull.
Then we choose
w(=6b-,M,,x(a)-6Max(#))
such that[I+6OQ-1]-lw
has the direction ofthe (maximizing)vector u in(c).
On accountof thischoice andof(b)
wethusfindthat there exists a solution
Y
such thatforO()v(=x()-())
the following bound holds"1 26
In
ordertohave amoreworkable quantityfor wemayestimate itby(3.20)
maxte[,]
Such an estimate appears, e.g., in
[5].
However, if we choose(a)=L
as is donethere, thenmax
(t)]
threatenstobe the dominantquantity in(3.20).
For Example 3.1 thiswould mean thatwe would havean estimatethat is afactor e too large (itindicates the conditioning of some algorithms though, cf.
(5.9)).
We will not have those problems ifwescale appropriately.Therefore,inaddition toAssumption2.5,we moreover have the (not restrictive)"
Assumption 3.21. Let @ be such that max[,0]
[](t)]]
1Vp.q
maxt[[P(t)]]
max [[q(s)
(N.B.eP(t)denotes
the pth column of(t)).
Onaccountof Assumption 3.21 wecan estimate by
(3.22)
:=]][M
(a)
+
Mo
(fl)]-l.
Below, it isshown howsharply isestimatedby
Property
3.23. Supposeweusethelinear norm. DefineR (t)
by(3.24)
R
(t):=(t).
diag(]
(t)]],
(t)[[).
and
Then min,glbl(R
(t))K
<-cCA# <-K.Proof.
maxtIlcP(t)O-11]
_>-maxt{glba(R(t))
minpI]P(t)ll}
>_-mintglb(R(t)).
Remark 3.25. To appreciate
Property
3.23, one should realize that glbR(t)
is in fact the condition number ofR(t)
(of.Rem. 2.2). In
the autonomous case,R(t)
may be thought of as thematrix ofscaled eigenvectors; except that if they are skew the lowerboundin
Property
3.23 is substantially smaller thanWe
can also show that the scaling assumption 3.21 is optimal regarding theestimatein
(3.20),
if we usethelinearnorm, asfollows from"Property
3.26. Let denote the classofnonsingulardiagonal matrices.For Dlet
K(D)=
]][M,,
(a )D
+McP(B)D]-II1.
Thenmin max
IlcP(t)II(D)=
maxIlcP(t)[l(I)=
.
Dt[,/3] te[c,/3]
Proof.
Theproof essentiallyusesthe arguments as given in[16,
Thm.23].
F1Sincethe quantity is easier tohandle than A; and moreover has turned out tobe afairlysharp estimate of
V,
wewilluse ratherthan A; furtheron.Remark 3.28. For numerical purposes we should in fact consider a somewhat
different conditionnumber.Indeed,inpracticewedeal ratherwith adiscretized
ODE,
where wehave (approximate)solutions definedon some grid only. The introduction
of conditionnumbers ofsuch discrete problems goes similarlytothe continuouscase,
however.
In [12]
itis shown that forasufficientlyfinegrid onecanconstruct adiscrete fundamentalsolutionthat isclose innormtothe continuous fundamentalsolution
;
the conditioning of the discrete and continuous case are almost the same, then. For
stiffproblems, however, this is notalways true, as isshownin
[11].
4. The conditioning of some modelproblems.
As
we saw in Example 3.1, theconditioning may change dramatically if we change the BC. Therefore it is of great importance to investigate the relation of the conditioning to the BC in general.
In
particularweare interested inthe questionofwhether anot-largeconditionnumber might imply that the BC has some special properties. Theanswer ispositive, and as aninteresting application ofthis we indicate inExample 6.5 whyan essential partof
theGodunov-Conte algorithmisstableindeed.
In
order to be able tosay whether ornot aBVP is well conditioned we have tobemorespecific about the interpretation of the actual value of K (which willbeused
rather than the number
).
In particular, we want to avoid inexact use of theterminology "order of."
One possibility is to consider ODE defined on
[0, ). By
imposing the BC onthe solution x defined on
[0,
T]
for some T>
0, we can expect theill conditioning,if present, to become more and more pronounced as T-->
.
Of course, this setting automaticallycoversBVPoninfiniteintervalsandinthis senseitshowssomesimilaritywiththe approach in
[6].
Another waytocircumvent asloppy use ofthe "order-of" terminology would be to consider afamily of ODE wherewe can let A and -/x go to infinity somehow. Such a setting would also cover certain singular perturbation problems.As
thereader may verify quite easily, the latter approach would givesimilarresults.Wedonotconsider thishere,however.Wewouldliketostressthat the results of the theorems below can be used for a qualitative interpretation of finite interval
BVP,
i.e., the infinite cases below give agood insight intothe nature ofthe problem forfiniteintervals(cf. Example6.3).
Assumption 4.1. LettheODEbedefined on
[0, ). For
eachT
define on[0, T]
afundamental solution7-,satisfying the requirementsof asstatedinAssumptions 2.5 and3.21.
Nowconsider the family ofBC
(4.2)
Mox
(0)
+
Moox
(T)
b, T>
O.DEFINITION 4.3. The family ofBVPwithBC
(4.2)
(theBVP,
forshort)
is calledwell conditioned if and ill conditioned if
]l[Mo7-(0)
/Moo7-(T)]-ll1-1
o(1), T
.
Theresults belowwill concentrate onhowmanyindependent conditions are needed at bothboundarypoints and also how the conditioning depends on the relation between the directions of the fundamental
solutions at the boundary points and
Mo
andMo.
Forthe sake ofcompleteness, werecall thatkindicatesthe dimension ofthe dominant subspace, andl,ofthe dominated subspace.
THEOREM4.4. The BVP isillconditioned
if
rank(Mo) <
orrank(M)<
k.Proof.
Write(t)
instead ofT(t). Suppose,
e.g., rank(Mo)=p< I.
Then also rank(Mo(0))
<_-p<
l.Let Sbean n(n
-p)matrixof rankn-p such thatMo(O)S
970 R.M.M. MATTHEIJ
[ISrll
1 and[dpl(T)Idp2(T)If]Sr
0.Forthevector sn,
s := SrwethenfindIlEM0(0)
+ Mop(T)]sll
IlMoo[p3(T)]sll
<=
IIMII
[IE3(T)]II
O(e").
Hence
IIEMo(O)
+
M(T)]-II
-
o().Remark 4.5. It is not dicult to see that the requirements rank
(Mo)
and rank(M)
k (cf. Theorem4.4)
donotimplyawell conditioning of theBVP. Toseethisoneshould consideraBCwhere
Mo
andM
have zerosinthe last row.A
more refinedresultiscontained in" THZOZM 4.6. The BVPis ill conditionedg
rank
(M0[(0)[(0)])<
or rank(M[(T)[(T)])
<
k.Proof.
Liketheproofof Theorem4.4,except for takingS such thatMo[
[2(0)13(0)]S
0.Remark 4.7. Theorem 4.6 can be interpreted as follows:
Suppose
precisely sindependentrow vectors of
M0
are orthogonal tog+a(0),...,
(0)
and precisely independent row vectors ofM
are orthogonal toOa(T),..., n-l(T).
Then the problem is ill conditioned if rank(Mo)-S <
or rank(M)-t <
k.In
practice, thismeans that if we have near orthogonality, we can already expect ill conditioning in the finiteinterval case.
Remark 4.8. Note that Theorems 4.4 and 4.6 imply necessary conditions for well conditioning. This will be useful in practical problems which are known to be well conditioned (e.g.,onphysicalgrounds).
In
order to give apositiveresult for well conditioning, weshould require that a"near orthogonality," such as was noted in Remark 4.7, does not take place. The
nexttheorem givesanecessaryand sufficientconditionin casewehaveaseparated BC.
THZOREM 4.9. Let the
BVP
haveaseparatedBC,
i.e.,Assume
either rank(So)=
or rank(S)=
k. Theneither
IllSoE.(o)l,(o)]Fl[
o()andor
{So[(O)])-x
0(1)
and{S[(T)IV(T)])-x=
0(1).
Proof.
From (3.27)
weobtainthat thecondition numberx equalsSo(R) (o)
(a) Let
e.g.II(So[(0)])-ll
o()
andII{So[(T)I(T)])-II
0(1).
Let
T
be sucient]ylarge,then rank(So)=
l.Now
writeSor(O)
]
Q=
S-(T)
Here C is an xt-matrix and
[DID
z]
an (n -l)x(n-l) matrix,IIB’II
O(e-XT),
]IE[I
O(e"T).
IfB
z 0then there existssomematrixR
suchthatCR+
Bz 0. (N.B.R
R (T)!)
NowdefineZ
R
Ik
Then
0
equals Q butfor2the
matrix B2,
where has zeros, and the matrix D2,
where0
has a matrix15
=DZ+ER.
(N.B.[IERII
O(e"r).)
Write with obviousmeaning Then we have o(1) /5-’+o(1) -’+o(I) o(1)
Hence it follows from the assumption that
II0-111
O(1), but sinceI[/-111
O(1)italsofollows that
IIO-111-
O(1).(b) Suppose the assertion is not true. Let, e.g., glb
(So[3r(0)])=
o(1). Let y Rbe such that
II.s0[,3(0)]yll-
glb($0[*-(0)])[lyll.
Use
the notationO
:=[g
c]
as aboveand define
Cy
z’;
Ey
zz.
ThenY
<llzlll+llz2[l-
glb(C)+ O(e"T)
o(1).glb(O)_-<
[[yll
[lyll
Henceeither
O-I(T)
does not existforalmost allTor[IO-1[[
-1 o(1),i.e.,theproblemis notwell conditioned. Theother casesgo essentially similarly. VI
COROLLARY 4.10.
If
the BVP in Theorem 4.9 is well conditioned then <-rank(So)<- n k and k<-rank(S)-<_ nI.
There remainsanumberoftypesofODEthatare notrepresentedin theforegoing.
We hope that some examples will suffice to show that, e.g., solution spaces with
partially increasing and partially decreasingsolutions canbe dealtwith in a similarway.
Example 4.11. Consider the ODE
u"-4tu’+(4t2-2)u=O;
this has basissol-utions u(t)=etz and u(t)= e
’2.
Leta =-Tand/3
T.
After rewritingthis ODE in a companion matrix form we obtain a fundamental solution that, after scaling according toAssumption 3.21, has a (0) and a (T) with elements of orderunity.As
far as the condition number is concerned, we therefore can expect a situation972 R.M.M. MATTHEIJ
Example 4.12. Considerthe ODE
u"
+
4tu’+
(4t2+
2)u
0, wherethe basissol---t --t
utions are
u(t)=
e andu(t)=
e Again let=-T
andT.
After rewriting-T
and scaling wenow have
(0)
and(T)
beingof the order ofe This meansthatT
we canexpectthe conditionnumbertobe of the order of e atleast.
5. The stability ofmultiple shooting.
A
very successful improvement of (single)shooting is multiple shooting (cf.
[17,
pp. 170 if.I). For good surveys see, e.g.,[2],
[8].
Although there seems to be a common opinion about the advantages of thisalgorithm, the arguments vary quite substantially (cf.
[2], [4],
[5], [8], [14], [15]).
Wethinkthatournotionof conditioningcanbehelpfultoclearupsomequestionsabout
the stabilityof thisalgorithm.
We first give a brief description of the method. Let
[a,/3]
be divided into subintervals [ti,ti+l],
0,’’’, N-1 (so to a,tn =/).
Then for each aparticularsolution ri and a fundamental solution
F
is computed onIts,
ti/l]. Hence for each thereexists a vectorvi such that(5.1)
x(t)
Fi(t)vi+
ri(t).Forsimplicity we assume thatF(ti)=/.
By
matching the relations(5.1)
at the pointstl," ",tn-1weobtain a recurrence relation forthe vg
(5.2)
Aivi-1
vi+
gi, 1,,
N-1where
(5.3)
Ai
=F-l(ti) and gg r,(ti)-rg_l(ti).The matrix
Ai
in (5.3) is the increment of the fundamental solution going from ti-1 to ti. Inparticularwehave(5.4)
Fo(t)
Fi(t)miA
1,>=
ti.Ifwe use (5.1)in theBC
(2.7)
and addto it therelations(5.2)
weobtainthe multiple shootingsystemA1
A2
-IM
MtAN
:_ I)N-1 gl gN-b Mro(to)Mt3rN-1
(tN)(N.B.
An
:= Fn-l(tn), cf.(5.3)).
The matrix in(5.5)
will be denoted byP. Althoughwe actually find incremental matrices Ag of the discretized problem in practice, we
neglectthisfact inourdiscussion.
In
fact,onemay alsothink of(5.1)-(5.5)
asbelongingtothediscreteproblem,cf. Remark 3.28.
In order to study the sensitivity with respect to rounding errors in the final
solution,wemust specify how the algorithm is carriedout.All variants have incommon
that theyuse an initial value technique on each subinterval. Therefore, if we denote by
:
themachineconstant, theactually computedA
andg maycontainerrorsofthe orderof(5.6)
The variant described, e.g., in
[3],
[17]
uses direct forward recursion in order toN-1
(5.7)
vN-11-1
Aivo.
i=1
The solution of
(5.5)
is quite simplenow, forwe cansolve for v0.Theremaining{v}
follow from(5.2),
and in this way we can determine{x(t)}. By
using(5.7)
we may makeanerror invr-1 ofthe order of(5.8)
maxA.
:-<
maxOi<=N
In
solving for Vo we moreover should reckon with an amplification of this error byII[M +
M0(/)]-ll.
Therefore, roughly speaking,we may expect errors in v0of the order of(5.9)
maxIIFo(t)ll II[M
+
M0()]-ll.
The factor in
(5.9),
infront of,
may beconsideredas astabilityconstant ("conditionnumber")
for
thisparticular algorithm.It
is, however, notthe appropriate quantitytoindicate the conditioningofthe multiple shootingprocess in general;notemoreover thatwe rejected this factor to estimate in
(3.20).
Finally, it is noteworthy that single shooting would givea similar stabilityconstant.The other variants (cf.
[8], [13],
[14])
solve the system(5.5)
by some stable LU-decomposition ofP.For
such a method we may therefore expect errors in thevi x
(t
i) ofthe order of(5.
0)
IIe-ll
mox
Ilmill.
Below we now give a realistic estimate for
IIe-[I.
It will turn out thatI[e-ll
is notlarge if u
(so )
is notlarge. On accountof thiswe thusmay conclude that aslong asmax/IIAilI
doesnotexceed the required tolerance for the global error,thealgorithm is numerically stable forawell conditionedproblemRemark 5.11.
In [14]
abackward analysis has been given for multiple shootingwith separated BC.
An
essential role in this is played by an estimate for unfortunately, aprooffor this estimate is not given in thatpaper. We would like toremark that our result in Theorem 5.12, below, is different, atleast in a qualitative
sense. The kindof condition number that appears in
[14]
seems tobe based on thetheoreticaldependenceonthe inhomogeneoustermsonly, whereas oursmoreexplicitly shows therelation tothe well conditioningwithrespecttotheBC.
We
have:THEORE 5.12. Let
I1"
II:
I1"
I1.
For 0,..., N letRi
bedefined
by (ti)Ri
diag(11
(ti)ll)
(el. (3.24)).
Denotex
maxi,IIR,
IIIIR; II.
Finally,lethmini
(ti+ ti). Then-Ah
-,h
+
max ,h,(N-
1)
+
.
-e 1-e
Pro@
Let beafundamentalsolutionas in Assumptions 2.5 and 3.21. Denote(t).
Using theR
we seethat(a)
G
:=R
iAR_,
Bi
of orderk isadiagonal matrix.In
orderto findP-
we successively solve(b)
PY(q)=E(q),
q=O, ,N-l,where Y(q) andE(q) are block vectorsconsisting of n columns. N(q) contains zeros except for
Nq+(q)
I
(byE(q)wedenotethe ithblock). Nowdefinefor suchablock974 R.M.M. MATTHEIJ
(c)
ITs(q)
:=R7
Y(q), 0,...,N if q=<
N-1.From (b)
and(c)
wesee that{
?i(q)}0
satisfies aninhomogeneous recursion. Therefore we splitit intosome other suitable particular solution{
g
(q)}v,
say, andahomogeneous solution{i(q)}v,
say, inorderto find estimates forthe Y(q).For
simplicitywe omitthe(q)for themoment.Letthe superscripts 1 and 2 denoteapartitioning corresponding
to (a),i.e.,
c,_
_
k
y:=
.
Lyj Thenasuitableparticular solution
{
0}
isgivenby-1
(d
)
O=-(q B):,
iNq,0=0
otherwise;/=i+1
E,
iq+l,0:0
otherwise.q+2
Thehomogeneoussolution
{):: {- 0}
thensatisfiesthe BCUsing the fact that
Z
Zo
wethereforefindforY
If q N-1, then
{
(q)}
justsatisfies ahomogeneousrecursion andweimmediately obtain(g)
Y
[Mo
+M]
-.
The requiredestimatethen follows from (f)and (g). Toseethis, denote for short:
max
(T2
exp[hi], 7),
n
P+
--.
p yexp[-&hi], Then for q
<
N-1"Yo(q)lloo
and for qN-
1"Pq.+
<=
X vo+
Xt N q2J
(11
Yo(N 1An
estimate forIIP-lloo
will be given by considering the maximal row sum in thematrixwithelements
(11
r/(q)l[oo),
SinceN N-2 N-1 1=I 1=0 1=I N-2
F
’2 "] N-1 e-zh vj<=max
[
,h,(N-
1)73
P1<=’Y1
-xh, =o 1--e a =1 1--ethe estimateimmediatelyfollows. [-1
COROLLARY
5.13.If
(k
+
l)<n(so
there are solutions that do not increase norCOROLLARY5.14.
I]:
(k+
I)
n andAh,-Izh
are notsmall, then[Ie-ll]
x.
If,
however, Ahor
-tzh
issmall, then[Ie-’ll
again.Remark 5.15. The result in Corollary 5.13 regarding the factor
N
israthersharp,as can be seen from a P where the matrices
Ai
andM
are equal toI
and whereMs
0. HerewehaveIIe-’ll
N.
Remark 5.16. Since the norms of the incremental matrices
[IA[[
are usually significantly larger than1,weseethatIlell
maxiIIAII.
Thereforewe obtainfrom(5.10)
that
IIe-’llllell
givesagood ideaof theabsoluteroundingerroramplification.6. Examples.
In
this section wegiveanumber ofexamplestoillustratethe results of 4 and5.In
ordertoeliminate discretizationerrors, weconstructed the multiple shootingmatricesfrom analytically known fundamentalsolutions.Thenorm isassumedtobethe supremum norm
I1"
Iloo.
Example 6.1. Considerthe ODE
dx 1-2 os2t 0 l+2sin
2 0 x
(a)
dt
[_-l+2sin2t 0 l+2cos2t
on
[0,
r]. A
fundamental solutionFo
(withFo(0)
I)
is givenby(b)
sit
0 -cost
I
-’).
Fo(t)
1 0 diag(e3’,
e’,
e [_cost 0 sintAssume
Mo
M,
L
Forthe scaledfundamentalsolution weobtain(c)
Hence(0)=
0 e-2I,(Tr)
0 1 0 --3-H-e 0 -1 0 0 0 0 -l+e-3
M0(0)
+
M(Tr)
0 1/e-2 0 --l+e-3 0 0We
thus find KI[-1
+
e-]-ll
1.05. It can also be checked that cCW1.05. Thissharpnessofthe estimateK isconfirmedby 3.23, fromwhich we deduce that
/x/
<=
cW
<_-x.Therefore, weshould callthisBVP wellconditioned. Using thenotationforthedimension of thedominantsolutionspace, k,and the subdominantsolutionspace,
l,asinAssumption2.5,weapparently havek 2 and 1.
From
the wellconditioning,wemay concludethat rank
(Mo)
->1 and rank(M)=>
2, asfollowsfromTheorem 4.4.This istrivially true, of course.We also computed the multiple shootingmatrix. For
this we used N+1 equally spaced points to,"’,
tN.
For checking the estimate in Theorem 5.12 it is important to notethat h’,71 and ’}/2are aboutx/. Moreover,
forNnot toosmall,we seethatmax
(e-X/N/(1
e-/),
1/1
e"/lv)
N/7r
(N.B.
h 2,/x--1).
Hence as an estimate forIIP-ll
we would findIlP-l[4N/zr.
In
Table 6.1we have given the actually computed values; they seem to agree with this estimate apart fromafactor 4.
976 R.M.M. MATTHEIJ 1 3 6 9 12 15 18 21 24 TABLE 6.1 1.1 1.5 2.4 3.4 4.3 5.3 6.2 7.2 8.2 12.4.10 37.2 14.2 13.1 13.9 15.2 17.8 18.5 20.2 N 1 3 6 9 12 15 18 21 24 TABLE 6.2 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.1 1.1 1.9 1027 1.2 109 3.5 104 1.1" 10 1.9 102 6.9’101 3.5" 10 2.2’101 1.6 101
AswesaidinRemark5.16,
IIPII
may be consideredtobeindicativeof themaximal incrementalgrowth. Thereforewehave also given the conditionnumber ofPin Table6.1,which canbeused to estimatetheerror in
(5.10).
Note that the valuesforN- 1 correspondtosingle shooting.Example 6.2.
Suppose
wehave anODElike(6.1a)
butnow withFo(t)
1 0LCOS 0 sin
diag
(e
TM,e
19t,
e-18t).
We nowhave K 1.0 (upto26 decimals).
In
Table 6.2wehave given values for thecorresponding multiple shootingmatrix. Since we have a strong dichotomy now, we
almost do not see adependenceonN (cf. Cor.
5.14).
ForsmallerNthe incrementalvalues are very largenow. Ifwe have a 16-digit mantissa, we cannothope to obtain
morethan 7 accurate decimals forN 3, forexample; notethat single shootingwill
thencompletelyfail.
Example 6.3. Again consider a similar
ODE,
nowwith a fundamentalsolutionFo
given bysint 0 -cost
1
(a)
Fo(t)
0 1 0cos 0 sin
and aseparated BC,where
diag(e
3t,
1,e-’)
So=(1
0 0),S==
0 0
In
ordertoinvestigate the conditioningweconsider(b)
I
o
0 -1Q=L----7_)j
0 1-1 0
Hence K 1. Wealsofind
(c)
1 and 1.Theresultsin
(b)
and(c)
are in agreementwith a finiteinterpretation of Remark 4.8.For this, one should replace astatement like
"lla(r)ll
O(1), T-, o" by "a(zr) notThis is confirmedby Table 6.3. N 3 6 9 12 15 18 21 24 TABLF. 6.3 TABLE 6.4 3 6 9 12 15 18 21 24 .84 103 .68 104 .17.105 .28.105 .41.105 .55.105 .69.105 .83 105
Example 6.4. LettheODEbe as in(6.1a), butnowwithBC
Mo
0M=
0 10 0 0
Sincerank
(Mo)
2 (>1 l)and rankM
2(=>k),wecannot conclude illconditioningon account of Theorem 4.4. Similarly, since rank
(Mo[3
(0)])
1 andrank
(M[l(cr)])
2, we cannot conclude ill conditioning on account of Theorem 4.6. Finally, since moreover rank(Mo)
+
rank(M.) >
3 and even rank(Mo
+
M.)
3wemight hopeto haveawell conditionedproblem. Neverthelesswe obtain
-1 0
-11
(a)
O
0 1 0 so : 1+
e3.12 105.
--3-rr
e 0 0
The values for
lIp-ill
in Table 6.4 confirm this. The resultin(a)
also shows that anapproach such as issuggested in
[5,
e.g., Thm.3.3]
is notvery relevant as far as theestimates for the "condition number"
IIMo+M IIIIEMo+M, ]- II
are concerned. Alsothe other quantity which appears there, in our notation max/o.,l
doesnotreallymatterin arealisticbound for error amplification, as followsfrom our
discussionin 5. Notethat the latterquantity e4 in thisexample; if wewould have hadadecreasingsolutionbehavinglikee
"t,/x
<
0,then this quantity would have been=e Nevertheless K would remain the same (and therefore
lIP-111
would notchange drastically). It goes without saying that
IIPII
wouldbe almost independent of the magnitudeof/x
(cf. Remark5.16).
Example6.5. Theorems4.4,4.6 and 4.9 also provide useful argumentstoexplain the stability ofcertain algorithms.
As
an example, we considerthe Godunov-Conte algorithm[1], [14],
[15]: In
order to have stability, we e.g., have to prove that an essentialpart of themethod, viz. the backward recursion, cf.[15, Eq. (11)],
is stable.For this, assume that
+
k n and that the BVP (with separatedBC)
is well condi-tioned. Then it follows from Theorem 4.4 that rank(So)=
I.
This means that theorthogonal complement ofthespacespanned by therows of
So
mustbe k-dimensional.The ideaof the algorithmnow is tocompute k independentbasissolutions, in matrix notation
F,
say, such thatSoFa(a)=
0, i.e., span(F(a))2span
(Sot).
On the other978 R.M.M. MATTHEIJ
a dominated solution) can be
(almost)
orthogonal to the space spanned by the rowsof
So.
Thereforewe conclude thatF(a)
must consist of dominantsolutions. At the shooting pointsweonly reorthogonalizethecolumns of thesesolutionvectors, whence we see that span(F/
(t))=span(F(t)).
On account of the analysis a given in[9],
[10],
it can then be deduced that the corresponding incremental matrices define arecursion that is indeed stable in the backward direction. For a different kind of analysis, see
14].
REFERENCES
[1] S. D. CONTE, The numerical solutionoflinearboundary value problems,SIAM Rev.,8 (1966),pp.
309-321.
[2] P. DEUFLHARD,Nonlinearequationsolversinboundary valueproblem codes,in Codes forBoundary ValueProblemsinOrdinaryDifferential Equations, Childs, B. ed.,Lecture NotesinComputer
Sciences, 76, Springer,NewYork, 1979.
[3]
,
Recentadvancesinmultipleshooting techniques,Prepr.41,Inst.fiirAngewandte Mathematik,Univ. Heidelberg,1979.
[4] J. C. FALKENBERG, A methodforintegrationofunstablesystemsofordinary differentialequations
subfecttotwopointboundary conditions,BIT,8(1968),pp. 86-103.
[5] J.H. GEORGE ANDR. W. GUNDERSON, Conditioningoflinearboundary value problems,BIT, 12
(1972),pp.172-181.
[6] F. R. DEHOOGANDR. WEISS, Onthe boundaryvalueproblemforsystemsofordinarydifferential equations, withasingularityofthe second kind,SIAM J.Math.Anal., 11 (1980),pp.41-60.
[7] H. B. KELLER, NumericalMethodsfor Two-PointBoundary-ValueProblems, Blaisdell, Waltham,
MA,1968.
[8]
,
Numerical Solution ofTwo-PointBoundary ValueProblems, CBMS Regional ConferenceSeries inApplied Mathematics,24, Society forIndustrial and AppliedMathematics, Philadelphia, 1976.
[9] R. M. M. MATTHEIJ, Characterizations ofdominantand dominated solutions oflinearrecursions,
Numer.Math.,35(1980),pp. 421-442.
10]
.,
Stablecomputationofsolutionsofunstable linear initialvaluerecursions,Rep.8108, Nijmegen, 1981.BIT,22:1(1982),toappear.[11]
,
Estimatesfortheerrors inthe solutionsoflinearboundaryvalueproblems duetoperturbations,Computing, 27(1981),pp. 299-318.
[12]
,
Accurateestimatesfor thefundamentalsolutionsofdiscrete boundary value problems, Rep.8112, Nijmegen, 1981.
[13] M. R. OSBORNE, Onshootingmethodsforboundary valueproblems, J.Math.Anal.Appl., 27(1969),
pp. 417-433.
[14]
,
The stabilized marchisstable,thisJournal, 17(1979),pp.923-933.[15] M.R. SCOTTANDn.A. WATTS,Computationalsolutionoflineartwopointboundary value problems
viaorthonormalization,thisJournal, 14(1977),pp.40-70.
[16] A.VAN DERSLUIS,Conditionnumbers and equilibrationofmatrices,Numer.Math., 14(1969),pp. 14-23.
[17] J. STOERANDR. BULIRSCH,Einfiihrungin die Numerische Mathematik11, Springer, Berlin, 1973.