Nonnegative matrices in dynamic programming

Citation (APA): Zijm, W. H. M. (1982). Nonnegative matrices in dynamic programming. Stichting Mathematisch Centrum. https://doi.org/10.6100/IR106492
DOI: 10.6100/IR106492
Published: 01/01/1982
Document version: Publisher's PDF (Version of Record)
NONNEGATIVE MATRICES
IN DYNAMIC PROGRAMMING

Thesis for obtaining the degree of Doctor in the Technical Sciences at the Eindhoven University of Technology, by authority of the Rector Magnificus, prof.ir. J. Erkelens, for a committee appointed by the Board of Deans, to be defended in public on Friday 29 January 1982 at 16.00 hours

by

WILLEM HENDRIK MARIA ZIJM

born at Texel

1982
This thesis has been approved by the promotors Prof. dr. J. Wessels and
Contents

1. Introduction 1
   1.1. A short history; main objectives 2
   1.2. Description of the model 4
   1.3. Examples 5
   1.4. Summary of the subsequent chapters 13
   1.5. Notational conventions 15

Part I. Finite-dimensional systems 17

2. Nonnegative matrices: a structure analysis 19
   2.1. Basic tools and definitions 20
   2.2. Block-triangular decompositions 26
   2.3. Generalized eigenvectors 32
   2.4. Some further results 37
   2.5. State classifications 40
   Appendix 2.A. A fundamental set of equations 42

3. Sets of nonnegative matrices: block-triangular structures 44
   3.1. Sets of irreducible nonnegative matrices 45
   3.2. Sets of reducible nonnegative matrices 47

4. Convergence of dynamic programming recursions: the case v = 1 58
   4.1. Dynamic programming recursions with irreducible nonnegative matrices 59
   4.2. Convergence of dynamic programming recursions: reducible matrices 62
   Appendix 4.A. Geometric convergence in undiscounted Markov decision processes 70

5. Sensitive analysis of growth 78
   5.1. Convergence results for dynamic programming recursions: the general case 79
   5.2. The structure of generalized eigenvectors 88
   5.3. Estimation of growth characteristics 89
   Appendix 5.A. Nested functional equations 95

6. Continuous-time dynamic programming models 100
   6.1. ML-matrices 101
   6.2. Systems with irreducible ML-matrices 105
   6.3. Systems with ML-matrices: the general case 108
   Appendix 6.A. Exponential convergence in continuous-time Markov decision processes 114

Part II. Countably infinite-dimensional systems 125

7. Countable stochastic matrices: strong ergodicity and the Doeblin condition 126
   7.1. Strong ergodicity and the Doeblin condition 126
   7.2. Doeblin condition and mean recurrence time 134

8. R-theory for countable nonnegative matrices 139
   8.1. Countable irreducible nonnegative matrices 140
   8.2. Countable reducible nonnegative matrices 148
   8.3. Discussion of the conditions of the theorems 8.6, 8.8 and 8.9 156

9. R-theory for sets of countable nonnegative matrices 164
   9.1. Communicating systems 164
   9.2. Sets of reducible nonnegative matrices 174

References 182
Subject index 188
Samenvatting 191
CHAPTER 1
INTRODUCTION
In this monograph we study dynamic programming models in which the transition law is specified by a set of nonnegative matrices. These models include e.g. Markov decision processes with additive and multiplicative utility functions, input-output systems with substitution, controlled multitype branching processes, etc. The main objective of this monograph is to show that all these models can be studied within one general matrix-theoretical framework. This framework will be built up by using dynamic programming methods and will be based on the theory of sets of general nonnegative matrices. This explains the title.
Methods which have been developed to determine an optimal control in the above mentioned models, with respect to various types of criterion functions, will follow as special cases from such a general framework. As an example we may think of a policy iteration method for a Markov decision process with respect to some "sensitive optimality" criterion, or of methods to determine equilibrium prices in a Leontief substitution system. This indicates the generality of our model, a model in which the theory of generalized eigenvectors and generalized (sub)invariant vectors for sets of nonnegative matrices plays a central role.
In this introductory chapter we first give a short historical review of the problem field and a summary of our objectives (section 1.1). After that a more formal description is given of the model to be studied in this volume (section 1.2).
Section 1.3 lists a number of examples of models, arising from various fields in mathematics and in mathematical economics, which can be written in, or easily be transformed into, our problem formulation. The contents of the subsequent chapters are summarized in section 1.4 and a list of notations is given in section 1.5.
1.1. A short history; main objectives
Since the publication of Bellman's "Dynamic Programming" in 1957 (BELLMAN [5]), interest in dynamic programming has expanded rapidly. In his book Bellman formalized the technique of backward induction, which appeared to be fundamental for the analysis of sequential decision processes. In the last chapter of that volume some attention is paid to Markov decision processes. A deeper investigation of the use of dynamic programming for the control of Markov decision processes appeared three years later (HOWARD [29]). Also Shapley's paper on stochastic games is now recognized as fundamental to this field (SHAPLEY [53]). But, as Denardo remarked, the modern era started with the work of Blackwell (compare Denardo's contribution to the panel discussion in PUTERMAN [47]; see also BLACKWELL [8], [9]).
Markov decision processes with additive reward function have been studied with respect to several criteria, the classical ones being the expected total reward criterion and the expected average reward criterion. More sensitive optimality criteria have been investigated by VEINOTT [64], SLADKY [54], and DENARDO AND ROTHBLUM [16]. Often the transition probability matrices in these models are allowed to be substochastic, i.e., a positive probability for fading of the system is allowed (cf. VEINOTT [64], ROTHBLUM [50], [51], HORDIJK [27] and WESSELS [71]).
Multiplicative Markov decision processes have been studied by HOWARD AND MATHESON [30] and by ROTHBLUM [49]. Other models which are in fact closely related (as far as structure is concerned) can be found in e.g. MORISHIMA [42] or BURMEISTER AND DOBELL [12] (Leontief substitution systems) and in PLISKA [46] (controlled multitype branching processes).
One of the objectives of this monograph is to analyze these models by using nonnegative matrix theory instead of probabilistic arguments (note that several models which have been mentioned above have no probabilistic interpretation at all, and that the associated nonnegative matrices are not stochastic in general). This takes us to our second subject. Nonnegative matrices and more general nonnegative operators play an important role in various fields of applied mathematics, e.g. probability theory, demography, numerical analysis and mathematical economics. Since the publication of the basic work of PERRON [45] and FROBENIUS [24], [25] an overwhelming number of papers has appeared in the literature. To mention only a few important ones: BIRKHOFF [7], KARLIN [33] and VERE-JONES [65], [66]. Excellent overviews may be found in SENETA [52] and in BERMAN AND PLEMMONS [6]. Finally, some results concerning sets of finite-dimensional nonnegative matrices, closely related to some of our own work in part I of this monograph, are given in SLADKY [56], [58].
We conclude this section with a sketch of the problems we examine and the objectives we pursue in this monograph. The book is divided into two parts, the first one dealing with finite-dimensional systems, the second one with models of countably infinite dimension. Our main objective will be to give a systematic treatment of the theory of sets of nonnegative matrices in dynamic programming problems and to give a fairly complete analysis of the asymptotic behaviour of dynamic programming recursions. In order to keep the exposition lucid and reasonably simple we shall first treat the finite-dimensional case. In this case it is possible to develop explicit policy-iteration methods, which end after a finite number of steps, in order to characterize and to determine matrices which maximize the growth of the system. Brief attention will be paid to the continuous-time analogue of the above sketched models.
The second part of this book is devoted to the development of a theory for sets of countably infinite nonnegative matrices. Questions concerning invariant vectors and optimal contraction factors then arise and we shall try to answer them. The reader familiar with CHUNG [13] will recognize that some of our results are extensive generalizations of results in that volume. Our results are also related to well-known facts in potential theory for Markov chains (cf. KEMENY, SNELL AND KNAPP [35], and HORDIJK [27]). At several places we shall indicate applications of the results, e.g. in the theory of Markov decision processes and strongly excessive functions (cf. VAN HEE AND WESSELS [70]), and in the investigation of sensitive optimality criteria in controlled Markov chains (cf. SLADKY [54]).
1.2. Description of the model
In this section a formal description is given of the dynamic systems to be studied in this monograph. For notations the reader is referred to section 1.5.
Central in the book is the concept of a set of matrices with the product property. Let us first give the formal definition.
DEFINITION 1.1. Let K be a set of k × m matrices (k,m ∈ ℕ) and let P_i denote the i-th row of a matrix P ∈ K. Then K has the product property if for each subset V of {1,2,…,k} and for each pair of matrices P(1), P(2) ∈ K the following holds: the matrix P(3), defined by

P(3)_i := P(1)_i   for i ∈ V,
P(3)_i := P(2)_i   for i ∈ {1,2,…,k}\V,

is also an element of K. □

Roughly speaking this means that for i = 1,2,…,k there exists a collection C_i of row vectors of length m; K is then the set of all k × m matrices with the property that their i-th row is an element of C_i, for i = 1,…,k.
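To make the product property concrete, here is a small illustrative sketch (the data are hypothetical, not from the text): K is generated from per-row collections C_i, and mixing the rows of any two members of K on an index set V again yields a member.

```python
from itertools import product

# Hypothetical row collections C_i: for each row index i, the admissible rows.
C = [
    [(0.5, 0.5), (1.0, 0.0)],   # C_1: two choices for row 1
    [(0.2, 0.8)],               # C_2: only one choice for row 2
]

# K = all matrices whose i-th row lies in C_i; K has the product
# property by construction.
K = [rows for rows in product(*C)]

def mix(P1, P2, V):
    """Matrix P(3): rows of P1 on the index set V, rows of P2 elsewhere."""
    return tuple(P1[i] if i in V else P2[i] for i in range(len(P1)))

# Every row-mixture of two members of K is again a member of K.
assert all(mix(P1, P2, V) in K
           for P1 in K for P2 in K
           for V in [set(), {0}, {1}, {0, 1}])
print(len(K))  # number of matrices in K
```

Here K has 2 × 1 = 2 members; in general |K| is the product of the sizes of the C_i.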
Next we describe the finite-dimensional models to be studied in part I. Let ℝ^N denote the N-dimensional Euclidean space. The set {1,2,…,N} will often be called the state space and is then denoted by S. A nonnegative matrix P is a matrix with all its entries real and nonnegative. Let K now denote a finite set of nonnegative N × N matrices with the product property. One of our objectives is to obtain information about the asymptotic behaviour of the utility vector x(n) (an N-dimensional column vector), obeying the dynamic programming recursion

(1.2.1)  x(n+1) = max_{P∈K} P x(n),   n = 0,1,2,…,

where the maximum is taken componentwise and x(0) is a fixed, strictly positive vector. For interpretations of (1.2.1) we refer to section 1.3. Here we only remark that the fact that K has the product property implies for each n the existence of a matrix P(n) ∈ K such that

x(n+1) = P(n) x(n),   n = 0,1,2,… .
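A minimal numerical illustration of recursion (1.2.1), with made-up row collections: since K has the product property, the componentwise maximum over all of K can be computed row by row, without enumerating K itself.

```python
# Hypothetical row collections; K is the set of all matrices built from them.
C = [
    [(0.9, 0.1), (0.4, 0.7)],   # admissible rows for state 1
    [(0.3, 0.3), (0.0, 1.1)],   # admissible rows for state 2
]

def step(x):
    """One step of x(n+1) = max_{P in K} P x(n), computed componentwise."""
    return [max(sum(p * xj for p, xj in zip(row, x)) for row in C_i)
            for C_i in C]

x = [1.0, 1.0]               # x(0): fixed, strictly positive
for _ in range(3):
    x = step(x)
print(x)
```

In this example each component grows by a factor 1.1 per step, illustrating the "growth of the system" that later chapters study via the spectral radius.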
In chapter 6 we briefly treat the continuous-time analogue of the discrete dynamic programming recursion defined above. A central role is then played by a collection of so-called ML-matrices with the product property. An ML-matrix is a square matrix with all its nondiagonal entries nonnegative. Let M denote a finite set of ML-matrices with the product property. We are now interested in the asymptotic behaviour of the vector function z(t), defined by

(1.2.2)  dz/dt (t) = max_{Q∈M} Q z(t),   t ∈ [0,∞),

with z(0) fixed, strictly positive (again the maximum is taken componentwise). Note that, since M has the product property, there exist matrices Q(t) ∈ M such that

dz/dt (t) = Q(t) z(t),   t ∈ [0,∞).

For an example we refer to section 1.3.
The analysis of these models requires a detailed study of sets of nonnegative matrices (resp. sets of ML-matrices) with the product property. In part I we shall develop a theory for sets of finite-dimensional matrices; in part II infinite-dimensional models are investigated. The results in the second part may be viewed as rather far-reaching extensions of the R-theory for nonnegative matrices, initiated by VERE-JONES [65], [66].
1.3. Examples
In this section, we list as examples a number of special cases of the general models, sketched in the preceding section.
1.3.1. Markov decision processes with additive reward function

a. The discrete-time case

Markov decision processes have been studied initially by BELLMAN [4], [5] and HOWARD [29]. Suppose a system is observed at discrete points of time. At each time point the system may be in one of a finite number of states, labeled by 1,2,…,N. If, at time t, the system is in state i, one may choose an action, a say, from a finite action space A; this action results in a probability p_{ij}^a of finding the system in state j at time t+1. Furthermore a reward r_{ij}^a is earned when in state i action a is taken and the system moves to state j. Suppose

r_{ij}^a ≥ 0;   Σ_{j=1}^N p_{ij}^a ≤ 1,   i,j = 1,…,N; a ∈ A,

i.e., a positive probability that the process terminates is allowed.
Let v(0)_i denote the terminal reward in state i and let v(n)_i be the maximal expected return for the n-period problem (i.e., with n periods to go), when starting in state i. For convenience define

r_i^a := Σ_{j=1}^N p_{ij}^a r_{ij}^a ,   i = 1,…,N; a ∈ A.

Bellman's optimality principle implies that the following recursion holds for v(n)_i (cf. BELLMAN [5]):

(1.3.1)  v(n)_i = max_{a∈A} { r_i^a + Σ_{j=1}^N p_{ij}^a v(n-1)_j },   i = 1,…,N; n ∈ ℕ.

Recursion (1.3.1) can be written in vector notation when policies are introduced. A policy f is a function from {1,…,N} to A. The set of all possible policies is denoted by F. Let P(f) be the (substochastic) matrix with entries p_{ij}^{f(i)} and r(f) the vector with components r_i^{f(i)}, for i,j = 1,2,…,N; f ∈ F. From these definitions it immediately follows that the collection of N × (N+1) matrices {(P(f), r(f)) | f ∈ F} has the product property and that

(1.3.2)  v(n) = max_{f∈F} { r(f) + P(f) v(n-1) },   n ∈ ℕ,

where v(n) denotes the vector with components v(n)_i, i = 1,…,N. By introducing a simple dummy variable we obtain

(1.3.3)   [ v(n) ]         [ P(f)  r(f) ] [ v(n-1) ]
          [  1   ]  = max  [  0     1   ] [   1    ] ,   n ∈ ℕ,
                     f∈F

which is an example of the recursion (1.2.1), to be studied in part I of this monograph.
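The equivalence of (1.3.2) and the homogeneous form (1.3.3) can be checked numerically. The sketch below uses hypothetical two-state, substochastic data; `bellman` implements (1.3.1), while `bellman_aug` carries the dummy component along as in (1.3.3).

```python
# Hypothetical substochastic MDP: per state, a list of (reward r_i^a, row p_i^a).
actions = [
    [(1.0, (0.5, 0.3)), (0.5, (0.8, 0.1))],   # state 1: two actions
    [(2.0, (0.0, 0.6))],                      # state 2: one action
]

def bellman(v):
    """v(n)_i = max_a { r_i^a + sum_j p_ij^a v(n-1)_j }   (recursion (1.3.1))."""
    return [max(r + sum(p * vj for p, vj in zip(row, v)) for r, row in acts)
            for acts in actions]

def bellman_aug(w):
    """Same step in the (N+1)-dimensional form (1.3.3), with w = (v, 1):
    the last row of the augmented matrix is (0, ..., 0, 1)."""
    return bellman(w[:-1]) + [w[-1]]

v = [0.0, 0.0]
w = [0.0, 0.0, 1.0]
for _ in range(5):
    v, w = bellman(v), bellman_aug(w)
assert all(abs(a - b) < 1e-12 for a, b in zip(v, w[:-1]))
print(v)
```

Both forms produce identical iterates, as the reduction promises.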
b. The continuous-time case.
As in the previous example we consider a system with a finite state space, {1,2,…,N} say, and a finite action space A. Suppose now the system is observed continuously. At each time point t ∈ [0,∞) the system is allowed to make a transition from one state to another. It will be clear that the significant parameters are transition rates rather than transition probabilities (cf. CHUNG [13]).
We assume that a controller is allowed to react at each time point t ∈ [0,∞). If at time t the system is in state i, and action a ∈ A is taken, the system is supposed to make a transition to state j in a short time interval Δt with probability q_{ij}^a Δt (i,j = 1,…,N; j ≠ i). The probability of two or more transitions is of order o(Δt) if Δt is sufficiently small (we say that a function h(t) is of order o(t) for t small if lim_{t→0} t^{-1} h(t) = 0). The probability of making no transition in a short time interval Δt is then equal to 1 − Σ_{j≠i} q_{ij}^a Δt.
Suppose furthermore that, if the system is in state i at time t and action a is chosen, a reward of r_{ii}^a per unit time is earned during the time that the system remains in state i. If the system moves from state i to state j a reward r_{ij}^a is received (i,j = 1,…,N). Now, if v(t)_i denotes the maximal expected return in a time interval of length t when starting in state i and v(0)_i denotes the terminal reward in state i, it follows from Bellman's optimality principle that for i = 1,…,N and t ∈ [0,∞):

v(t+Δt)_i = max_{a∈A} { (1 − Σ_{j≠i} q_{ij}^a Δt)(r_{ii}^a Δt + v(t)_i) + Σ_{j≠i} q_{ij}^a Δt (r_{ij}^a + v(t)_j) } + o(Δt).

Define for i,j = 1,…,N and a ∈ A

q_{ii}^a := − Σ_{j≠i} q_{ij}^a ,    r_i^a := r_{ii}^a + Σ_{j≠i} q_{ij}^a r_{ij}^a .

Then, for i = 1,…,N and t ∈ [0,∞), we obtain

(1.3.4)  (v(t+Δt)_i − v(t)_i) / Δt = max_{a∈A} { r_i^a + Σ_{j=1}^N q_{ij}^a v(t)_j } + o(Δt)/Δt.

Again, a policy f is defined as a function from {1,…,N} to A. Let F denote the set of all possible policies, Q(f) the matrix with entries q_{ij}^{f(i)} and r(f) the vector with components r_i^{f(i)}. If we take the limit in (1.3.4) as Δt → 0 we obtain, in vector notation:

(1.3.5)  dv/dt (t) = max_{f∈F} { r(f) + Q(f) v(t) },   t ∈ [0,∞).

Define a scalar function v_{N+1}(t) ≡ 1 for t ∈ [0,∞). Then we may write

(1.3.6)   d  [  v(t)     ]        [ Q(f)  r(f) ] [  v(t)     ]
          __ [           ] = max  [            ] [           ] ,
          dt [ v_{N+1}(t)]   f∈F  [  0     0   ] [ v_{N+1}(t)]

which is an example of the model to be studied in chapter 6. Note that the collection of matrices

{ [ Q(f)  r(f) ; 0  0 ]  |  f ∈ F }
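The derivation of (1.3.4)–(1.3.5) suggests a simple numerical scheme: replace the limit Δt → 0 by a small Euler step. The sketch below uses hypothetical rates (with q_{ii}^a = −Σ_{j≠i} q_{ij}^a, as defined above); it illustrates the model, and is not meant as a recommended integrator.

```python
# Hypothetical two-state continuous-time model: per state, pairs (r_i^a, row q_i^a).
# Each rate row sums to 0, since q_ii^a = -sum_{j != i} q_ij^a.
actions = [
    [(1.0, (-2.0, 2.0)), (0.3, (-0.5, 0.5))],   # state 1: two actions
    [(0.0, (1.0, -1.0))],                       # state 2: one action
]

def euler_step(v, dt):
    """v(t+dt)_i ~ v(t)_i + dt * max_a { r_i^a + sum_j q_ij^a v(t)_j }  (cf. (1.3.4))."""
    return [vi + dt * max(r + sum(q * vj for q, vj in zip(row, v))
                          for r, row in acts)
            for vi, acts in zip(v, actions)]

v = [1.0, 1.0]          # v(0): terminal rewards, strictly positive
dt = 0.01
for _ in range(100):    # integrate up to t = 1
    v = euler_step(v, dt)
print(v)
```

Both components grow over time, with state 1 staying ahead because its direct reward rate is larger.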
1.3.2. Risk-sensitive Markov decision processes

Consider again the model that has been described in part a of example 1.3.1. Suppose now that a decision maker represents his risk preference by a utility function u that assigns a real number to each of a number of possible outcomes. Thus, if r_i^a is the expected reward when in state i action a is chosen, the value for the decision maker is equal to u(r_i^a); if v(n)_i is the maximal expected return for the n-period problem, then the utility for the decision maker equals u(v(n)_i).
In example 1.3.1, part a, we treated the case in which u(x) = x for each possible return x, which implies risk-indifference. HOWARD AND MATHESON [30] treated the case in which the utility function has the following form:

(1.3.7)  u(x) = −(sgn γ) exp(−γx),

where γ ≠ 0 is called the risk aversion coefficient and sgn γ denotes the sign of γ. A positive value of γ indicates risk aversion, a negative value indicates risk preference. Note that the function u(·), defined in (1.3.7), is increasing.
It follows that a stream of rewards r_{i_1}, r_{i_2}, …, r_{i_n} has a utility

−(sgn γ) exp(−γ(r_{i_1} + r_{i_2} + … + r_{i_n})).

Now, let v(n)_i denote the utility of staying in the system for n periods when starting in state i. Using the concept of "certain equivalent", HOWARD AND MATHESON [30] showed:

(1.3.8)  v(n)_i = max_{a∈A} Σ_{j=1}^N p_{ij}^a exp(−γ r_{ij}^a) v(n-1)_j ,   i = 1,…,N; n ∈ ℕ.

Defining

p̃_{ij}^a := p_{ij}^a exp(−γ r_{ij}^a),

we obtain

(1.3.9)  v(n)_i = max_{a∈A} Σ_{j=1}^N p̃_{ij}^a v(n-1)_j ,   i = 1,…,N; n ∈ ℕ,

or, defining f, F, P(f) (now with entries p̃_{ij}^{f(i)}) and v(n) as usual,

(1.3.10)  v(n) = max_{f∈F} P(f) v(n-1),   n ∈ ℕ.
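The passage from (1.3.8) to (1.3.9) amounts to absorbing the factors exp(−γ r_{ij}^a) into the matrix entries, which leaves the transformed matrices nonnegative but in general no longer stochastic. A sketch with hypothetical data:

```python
import math

gamma = 0.5   # risk aversion coefficient (gamma > 0: risk averse)

# Hypothetical 2-state data: actions[i][a] is a row of (p_ij^a, r_ij^a) pairs.
actions = [
    [[(0.6, 1.0), (0.4, 0.0)],
     [(0.3, 0.5), (0.7, 0.2)]],          # state 1: two actions
    [[(0.5, 2.0), (0.5, 1.0)]],          # state 2: one action
]

# Transformed entries ptilde_ij^a = p_ij^a * exp(-gamma * r_ij^a)  (cf. (1.3.9)).
ptilde = [[[p * math.exp(-gamma * r) for p, r in row] for row in acts]
          for acts in actions]

def step(v):
    """One step of v(n) = max over actions of the transformed matrix times v(n-1)."""
    return [max(sum(p * vj for p, vj in zip(row, v)) for row in acts)
            for acts in ptilde]

v = [1.0, 1.0]   # terminal utilities (the sgn-gamma factor is omitted in this sketch)
for _ in range(4):
    v = step(v)
print(v)
```

Because every transformed row here sums to less than one, the iterates shrink towards zero while remaining strictly positive; the growth rate is governed by the spectral radius of the transformed matrices.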
1.3.3. Controlled multitype branching processes
Consider a population consisting of individuals of N types, labeled 1,2,…,N, which is observed at time points 0,1,2,… . Each individual lives from one such time point to the next, at which moment he produces a random number of offspring; all these numbers are supposed to be independent. At each time point an action is chosen (from a finite set A) for each individual. Different actions may be chosen for different individuals (possibly of the same type).
At each time point the state of the system is described by a vector (s_1,…,s_N), where s_i denotes the number of individuals of type i. Let p_i(t_1,…,t_N | a) denote the probability that (as a result of action a ∈ A) one individual of type i produces exactly t_j individuals of type j, j = 1,2,…,N. Suppose furthermore that, if for an individual of type i action a is chosen, a reward r_i^a is earned. It is not hard to verify that this system may be described by a Markov decision process with a countable state space (cf. PLISKA [46]).
Note that, in general, different actions may be selected for different individuals of the same type. A decision rule that selects the same action for all individuals of the same type, and such that this selection is independent of the state (s_1,…,s_N), is called static. PLISKA [46] showed that the multitype branching process described above can be controlled by considering only static decision rules, and a collection of nonnegative N × N matrices with the product property. Let u_{ij}^a denote the expected number of individuals of type j among the offspring of one individual of type i when action a is chosen. Assume

0 < u_{ij}^a < ∞ ,   i,j = 1,…,N; a ∈ A.

Let, furthermore, x(n)_i denote the maximal expected return when we start with exactly one individual of type i and no individuals of other types, when only static decision rules are considered, and with n periods to go. Then obviously

(1.3.11)  x(n)_i = max_{a∈A} { r_i^a + Σ_{j=1}^N u_{ij}^a x(n-1)_j },   i = 1,…,N; n ∈ ℕ,

where x(0)_i is a terminal reward. If we define a static policy f as a function from {1,…,N} to A, and U(f) denotes the matrix with entries u_{ij}^{f(i)}, while r(f), resp. x(n), are the vectors with components r_i^{f(i)}, resp. x(n)_i, then we may write

(1.3.12)  x(n) = max_{f∈F} { r(f) + U(f) x(n-1) },   n ∈ ℕ,

where F denotes the set of all static policies. As before, (1.3.12) can be transformed into a recursion of the form (1.2.1):

          [ x(n) ]         [ U(f)  r(f) ] [ x(n-1) ]
          [  1   ]  = max  [  0     1   ] [   1    ] ,   n ∈ ℕ.
                     f∈F

It is interesting to note that PLISKA [46] showed that, if both static and nonstatic decision rules are considered, the maximal expected return for an n-period controlled multitype branching process, when starting in state (s_1,…,s_N), is equal to

Σ_{i=1}^N s_i x(n)_i ,

i.e., the single-type returns summed over all individuals present at the start. Hence there exists a static decision rule which is optimal. It follows that these problems can be handled either as a Markov decision process with a countable state space, or as a more general dynamic programming problem with a set of finite-dimensional nonnegative matrices with the product property.
1.3.4. An input-output system with substitution

An economic system, consisting of N industries (or resources), is controlled at discrete points of time. We assume the presence of a sufficient amount of labour (of homogeneous type). Each industry i produces a single commodity, also indicated by i (no joint production is allowed). Furthermore, there exists a finite set A of alternative technologies for each industry i. If industry i chooses technology a ∈ A, we denote by p_{ij}^a the number of units of commodity j necessary for the production of one unit of commodity i. Furthermore, ℓ_i^a denotes the amount of labour necessary for the production of one unit of commodity i when technology a is chosen.
Let w be the (constant) wage rate and let c(n)_i denote the cost of the production of one unit of commodity i at time point n. We assume c(0)_i > 0 for i = 1,…,N. Since we may expect that each industry is interested in minimizing its costs, we find

(1.3.13)  c(n)_i = min_{a∈A} { w ℓ_i^a + Σ_{j=1}^N p_{ij}^a c(n-1)_j },   i = 1,…,N; n ∈ ℕ

(here we assumed that the production costs of one unit of a commodity are equal to its price on the market).
A technology vector f is a function from {1,…,N} to A, which specifies for each industry a particular technology. The set of all technology vectors is denoted by F; P(f) denotes the matrix with entries p_{ij}^{f(i)} and ℓ(f) the vector with components ℓ_i^{f(i)}, for all i,j,f. With these definitions, (1.3.13) can be written as

(1.3.14)  c(n) = min_{f∈F} { w ℓ(f) + P(f) c(n-1) },   n ∈ ℕ,

where c(n) denotes the vector with components c(n)_i, i = 1,…,N. As before, we find a recursion of the form (1.2.1):

          [ c(n) ]         [ P(f)  w ℓ(f) ] [ c(n-1) ]
          [  1   ]  = min  [  0      1    ] [   1    ] ,   n ∈ ℕ.
                     f∈F

Here we have an example with "max" replaced by "min". These models can be treated in essentially the same way as the one introduced in section 1.2. The model described above is an example of a Leontief substitution system (cf. MORISHIMA [41], BURMEISTER AND DOBELL [12]).
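A numerical sketch of the cost recursion (1.3.13), with two hypothetical industries and wage rate w = 1; iterating the recursion drives c(n) towards the equilibrium prices mentioned in section 1.1.

```python
w = 1.0   # wage rate

# Hypothetical technologies: per industry, pairs (labour l_i^a, input row p_i^a).
techs = [
    [(0.5, (0.1, 0.2)), (0.2, (0.3, 0.3))],   # industry 1: two technologies
    [(1.0, (0.0, 0.1))],                      # industry 2: one technology
]

def cost_step(c):
    """c(n)_i = min_a { w*l_i^a + sum_j p_ij^a c(n-1)_j }   (recursion (1.3.13))."""
    return [min(w * l + sum(p * cj for p, cj in zip(row, c)) for l, row in ts)
            for ts in techs]

c = [1.0, 1.0]            # c(0): strictly positive initial prices
for _ in range(50):
    c = cost_step(c)      # iterate towards the equilibrium prices
print(c)
```

With these numbers the iterates converge to c* = (16/21, 10/9), the fixed point of the min-operator; industry 1 ends up selecting its second technology.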
1.3.5. A terminating decision process

In BELLMAN [3], a multistage decision process is considered where, at each stage, one has the choice of one of a finite number of actions, 1,2,…,K say. The choice of action a ∈ {1,…,K} results in a probability distribution with the following properties:
a. There is a probability p_i^a that one receives i units and the process continues (i = 1,2,…,N);
b. There is a probability p_0^a that one receives nothing and the process terminates.
Now let n be a fixed integer and suppose a decision maker wants to maximize the probability that he receives at least a total number of n units before the process terminates. Let u_j denote the maximal probability of obtaining at least j units before termination of the process; then

(1.3.15)  u_j = max_a Σ_{i=1}^N p_i^a u_{j-i}   for j > 0;    u_j = 1   for j ≤ 0.

Applying a simple transformation, this problem can again be written in the formulation introduced in section 1.2. For j = 1,2,…,n we have

[ u_j       ]         [ p_1^a  p_2^a  …  p_{N-1}^a  p_N^a ] [ u_{j-1} ]
[ u_{j-1}   ]         [  1      0     …     0         0   ] [ u_{j-2} ]
[   ⋮       ]  = max  [  0      1     …     0         0   ] [   ⋮     ]
[ u_{j-N+1} ]     a   [  ⋮             ⋱               ⋮  ] [ u_{j-N} ]
                      [  0      0     …     1         0   ]

where we start with (u_0, u_{-1}, …, u_{1-N})^T = (1,…,1)^T. It follows that the decision maker has to solve an n-step sequential decision problem of type (1.2.1).
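Recursion (1.3.15) is easy to compute directly; the sketch below uses two hypothetical actions and payoffs of at most N = 2 units per stage.

```python
# Hypothetical action data: p[a][i-1] = probability of receiving i units
# under action a (the remaining mass is the termination probability p_0^a).
p = [
    [0.5, 0.3],   # action 1: P(1 unit) = 0.5, P(2 units) = 0.3
    [0.1, 0.8],   # action 2: P(1 unit) = 0.1, P(2 units) = 0.8
]
N = 2
n = 5

# u_j = 1 for j <= 0;  u_j = max_a sum_i p_i^a u_{j-i} for j > 0.
u = {j: 1.0 for j in range(1 - N, 1)}
for j in range(1, n + 1):
    u[j] = max(sum(pa[i - 1] * u[j - i] for i in range(1, N + 1)) for pa in p)
print(u[n])   # maximal probability of collecting at least n units
```

Note that the maximizing action may change with j (here action 2 dominates once the target is still far away), which is exactly why the problem is a genuine sequential decision problem rather than a fixed Markov chain.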
1.4. Summary of the subsequent chapters
As mentioned already, one of the main objectives of this monograph is to analyze the asymptotic behaviour of dynamic programming recursions (or quasi-linear equations, cf. BELLMAN [3]) of type (1.2.1), based on a set K of nonnegative square matrices with the product property. It will be clear that some insight in the structure of such sets of matrices is fundamental. In chapter 2 we first briefly repeat some well-known results concerning structure and properties of a single nonnegative matrix. A relatively large part of this chapter is devoted to what we will call a generalized eigenvector theory for square nonnegative matrices (cf. ROTHBLUM [48]). Chapters 3, 4 and 5 deal with sets of finite-dimensional nonnegative matrices. In chapter 3 it is shown that a particular block-triangular structure exists for sets of nonnegative matrices which is closely related to the behaviour of dynamic programming recursions of type (1.2.1). In chapter 4, convergence results for these recursions are proved under rather special conditions. Indispensable for the analysis in this chapter is a result, recently proved by SCHWEITZER AND FEDERGRUEN [61], concerning geometric convergence in undiscounted Markov decision processes. The original proof of this result is extremely complicated; in appendix 4.A we present a new, relatively simple proof, together with some extensions. This geometric convergence result plays a key role again in chapter 5, where both convergence results for recursions of type (1.2.1) in the most general case are proved, and a theory concerning generalized eigenvectors for sets of nonnegative matrices with the product property is completed. Key words in the analysis are spectral radius, index and generalized eigenvectors. Brief attention will be paid to estimation methods for these characteristics. Typical for the finite case is that all proofs can be given in a constructive way; in particular it is possible to develop policy iteration methods for the construction of matrices which maximize the "growth" of systems of type (1.2.1).
In chapter 6 we briefly treat the continuous-time analogue of the model studied in chapters 3, 4 and 5. There we deal with a set of ML-matrices with the product property. Special attention is paid to an exponential convergence result for undiscounted continuous-time Markov decision processes (appendix 6.A), which may be viewed as an analogue of the main result of appendix 4.A in the discrete-time case.
Although a theory for sets of nonnegative matrices with the product property has been developed mainly for its usefulness in the analysis of dynamic programming recursions, the results are interesting in themselves; they provide a considerable generalization of the classical Perron-Frobenius theory. In part II (starting with chapter 7) an attempt is made to extend this theory to sets of countably infinite nonnegative matrices. Such an extension is relevant in connection with the study of denumerable Markov decision processes, invariant vectors for sets of nonnegative matrices, etc. Chapter 7 is an introductory one in which Markov chains with a countable state space are discussed. Strong ergodicity and the Doeblin condition are some of the key concepts in the analysis. Although interesting in themselves, the results mainly serve to explain and motivate the conditions of the theorems proved in chapter 8. In that chapter the structure of countably infinite nonnegative matrices is analyzed; it turns out that a beautiful extension of the generalized eigenvector theory, treated in chapter 2, exists. Vere-Jones's R-theory (which deals only with irreducible nonnegative matrices of countably infinite dimension) is used as a starting point (cf. VERE-JONES [65], [66]). The results obtained are related to results in potential theory for Markov chains (cf. KEMENY, SNELL AND KNAPP [35]). In chapter 9, finally, we return to sets of (countably infinite) nonnegative matrices and show how results similar to those in chapter 3 can be obtained. As a by-product of our analysis we obtain a semi-probabilistic interpretation of (generalized) eigenvectors and (generalized) invariant vectors which seems to be new even in the finite case.
1.5. Notational conventions

We shall be concerned with sets of nonnegative matrices with the product property (cf. definition 1.1). Unless stated otherwise all matrices will be square and of a fixed dimension. Throughout part I, N denotes the dimension of these matrices. Motivated by the theory of Markov processes the set {1,2,…,N} is called the state space and denoted by S. Part II deals with matrices of countably infinite dimension; in this case S := {1,2,…}.
Matrices will be denoted by capitals P, Q, …, (column) vectors by lower case letters x, y, u, w, … . The identity matrix (ones on the diagonal, zeros elsewhere) is denoted by I, the vector with all components equal to one by e. The null matrix is denoted by 0, the null vector by 0.
The n-th power of a matrix P is denoted by P^n; p_{ij}^{(n)} denotes the ij-th entry of P^n. Instead of p_{ij}^{(1)} we usually write p_{ij}. P_i denotes the i-th row of P. The i-th component of a vector x is denoted by x_i. We define P^0 := I.
As usual ℕ denotes the set of positive integers, ℕ̄ := ℕ ∪ {∞}, ℕ_0 := ℕ ∪ {0}, ℕ̄_0 := ℕ_0 ∪ {∞}. ℝ is the set of real numbers, ℝ_+ the set of positive real numbers, ℝ̄ := ℝ ∪ {∞}, ℝ_0^+ := ℝ_+ ∪ {0}. ℝ^k denotes the k-fold cartesian product ℝ × ℝ × … × ℝ (k ∈ ℕ).
A nonnegative square matrixPis a function from S x S to R0• If p •. > 0 for all i,j e S the matrix P is called positive. I f P is
nonnega-l.J
write P <:! O, if P ;> 0 and P .; 0. Furthermore we write P > Q (<:! Q, > Q) if
=
= =
~-P-Q;;,
2
(~Q•
>2>·
Similar definitions apply to vectors. Insteadof "pos-itive vector" often the words "strictly pos"pos-itive vector" will be used.The transpose ~f a matrix P is denoted by PT; the transpose of a (column) vector x is written as xT. Subsets of the state space S will be denoted by A,B,C,D, •••• If C c S then by PC the restrietion of the square matrix P to C x C is denoted. Similarly, xc is the
restricti~n
of theI
(column) vector x to C. If {1(1), I(2), ••• ,I(n)} denotes a pdrtition of the state.space S then we often write P(k,t) for the restrietion of P to
(k,k) I(k)
I (k) x I (t), k, t = I, ... , r. No te that P = P , k .= I, ... , r.
If $P$ is a square matrix of finite dimension then the spectral radius of $P$ is defined as the modulus of its largest eigenvalue. Throughout this monograph the spectral radius of $P$ is denoted by $\sigma(P)$.
In chapter 6 ML-matrices of finite dimension are considered. An ML-matrix is a square matrix with all its off-diagonal entries nonnegative. The name is adopted from SENETA [52], who uses the word in connection with the work of Metzler and Leontief in mathematical economics.
Lexicographical order symbols are used in several chapters. Let $(x(1),\ldots,x(n))$ and $(y(1),\ldots,y(n))$ be two sequences of real-valued vectors. We say that $(x(1),\ldots,x(n)) \succ (y(1),\ldots,y(n))$ if $x(1) > y(1)$ or if for some $k \in \{1,\ldots,n-1\}$ holds that $x(t) = y(t)$ for $t = 1,2,\ldots,k$ and $x(k+1) > y(k+1)$. Similar definitions hold for $\succeq$, $\prec$ and $\preceq$.
Let $f(t)$ and $g(t)$ be real-valued (vector) functions such that $g(t) > 0$ for $t \in \mathbb{R}$. Then $f(t) = o(g(t))$ for $t \to a$ ($a \in \overline{\mathbb{R}}$) if $\lim_{t \to a} (g(t)_i)^{-1} f(t)_i = 0$ for all $i$. Furthermore $f(t) = O(g(t))$ for $t \to a$ if there exists a constant $c$ such that $|f(t)_i| \leq c\, g(t)_i$ for $t$ close to $a$.
The symbol $:=$ is used to define concepts. The symbol $\sim$ is used for asymptotic equality; for instance $x(n) \sim y(n)$ for $n \to \infty$ means that for each $\varepsilon > 0$ there exists an integer $n_0$ such that $(1-\varepsilon)\, y(n) \leq x(n) \leq (1+\varepsilon)\, y(n)$ for $n \geq n_0$. The symbol $\square$ denotes the end of a proof, or the end of the formulation of a proposition, lemma or theorem if no proof is given. Also the end of a definition is marked by $\square$.
The Kronecker delta $\delta_{ij}$ is defined by $\delta_{ij} := 1$ if $i = j$, $\delta_{ij} := 0$ if $i \neq j$. By $\|\cdot\|$ the usual sup-norm is denoted.

PART I
NONNEGATIVE MATRICES: A STRUCTURE ANALYSIS
Any investigation of dynamic programming recursions of the type

(1.2.1)   $x(n) = \max_{P \in K} P\, x(n-1)$,   $n = 1,2,\ldots$;   $x(0) > 0$,

with $K$ a set of nonnegative square matrices with the product property, entails the study of products of nonnegative matrices, or, in the case that $K$ contains only one matrix, of powers of that matrix. Clearly, powers of a square nonnegative matrix can be studied by familiar matrix-theoretical methods such as Jordan decomposition. The disadvantage of these methods however is that the nonnegativity of the entries is completely ignored. A graph-theoretical, rather than a matrix-theoretical, approach appears to be the natural answer to this objection (cf. SENETA [52], p. 9-12 and ROTHBLUM [48]). The authors mentioned exploit the idea that a square nonnegative matrix $P$ of dimension $N$ can be represented by a directed graph with $N$ nodes in which a transition from node $i$ to node $j$ is possible if and only if $p_{ij} > 0$ ($i,j = 1,\ldots,N$).
In this chapter a rather detailed analysis of the structure of a single square nonnegative matrix is presented. We follow the (graph-theoretical) terminology of ROTHBLUM [48], which is strongly motivated by the theory of Markov chains. In section 2.1, a brief review of some well-known definitions and results will be given (most of them without proof) which can be found, for instance, in SENETA [52] or BERMAN AND PLEMMONS [6]. We also give some immediate corollaries which will be needed later. In section 2.2, a fundamental decomposition result for one square nonnegative matrix is presented which describes the hierarchical structure of the underlying graph; this decomposition proves to be extremely useful for the analysis of the behaviour of powers of that matrix (cf. SLADKY [58], ZIJM [76]). Section 2.3
is devoted to an analysis of the structure of so-called generalized eigenvectors, associated with the spectral radius of a square nonnegative matrix, whereas section 2.4 relates these results to more familiar concepts in matrix theory.
The results obtained in this chapter imply some immediate corollaries on the behaviour of the vector $x(n)$, defined by

$x(n) = P^n x(0)$,   $n \in \mathbb{N}$;   $x(0) > 0$,

where $P$ denotes a square nonnegative matrix. However, the great advantage of the methods developed here is that they can be extended to sets of nonnegative matrices with the product property, where they yield analogous results for dynamic programming recursions of the type (1.2.1). In order to facilitate the proofs of these extensions, state classifications are introduced in section 2.5, and the results of chapter 2 are reformulated in terms of these state classifications. In fact, state classifications relate in a very precise way the hierarchical structure of the graph, associated with a nonnegative matrix, to the behaviour of its powers; they will prove to play a key role in the forthcoming analysis.
Throughout this chapter $P$ denotes a nonnegative $N \times N$ matrix; the state space $S$ is defined by $S := \{1,2,\ldots,N\}$.
2.1. Basic tools and definitions
In this section we briefly review some (mostly well-known) definitions and results concerning the structure of nonnegative matrices.
We start with a definition.
DEFINITION 2.1. We say that state $i$ has access to state $j$ under $P$ if $p_{ij}^{(n)} > 0$ for some $n \in \mathbb{N}_0$ ($i,j \in S$). $\square$

Note that, since $p_{ii}^{(0)} = 1$, state $i$ always has access to state $i$. Definition 2.1 reflects the idea that the positive-zero configuration of $P$ can be represented by a directed graph. Accordingly, we consider $P$ as a function from $S \times S$ to $\mathbb{R}_0^+$ rather than as a linear operator from $\mathbb{R}^N$ to $\mathbb{R}^N$.
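Since definition 2.1 depends only on the positive-zero configuration of $P$, the access relation can be computed by a Boolean transitive closure of the associated directed graph. The following sketch (ours; $P$ stored as a list of rows) uses the matrix of example 2.1:

```python
def has_access(P):
    """acc[i][j] is True iff state i has access to state j under P,
    i.e. p_ij^(n) > 0 for some n >= 0 (n = 0 gives acc[i][i])."""
    N = len(P)
    acc = [[i == j or P[i][j] > 0 for j in range(N)] for i in range(N)]
    for k in range(N):                     # Floyd-Warshall transitive closure
        for i in range(N):
            for j in range(N):
                acc[i][j] = acc[i][j] or (acc[i][k] and acc[k][j])
    return acc

acc = has_access([[2, 1], [0, 2]])         # the matrix of example 2.1
print(acc[0][1], acc[1][0])                # state 1 reaches state 2, not conversely
```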
Powers of square matrices are usually studied in terms of their eigenvalue structure (Jordan decomposition). For nonnegative square matrices another approach exists, based on accessibility relations between the states (cf. SENETA [52]). It can be shown that an analysis of the behaviour of powers of a square nonnegative matrix becomes much easier if in the underlying graph any two states have access to each other.
DEFINITION 2.2. $P$ is called irreducible if any two states have access to each other. In all other cases we call $P$ reducible. $\square$
This definition implies that a square reducible nonnegative matrix can be written in block-triangular form, possibly after a permutation of the states. In other words: using the accessibility relations, a hierarchical structure of the state space can be exhibited.
Irreducible nonnegative matrices can be either periodic or aperiodic. We need the following definition:
DEFINITION 2.3. Let $P$ be irreducible. The period $d_i$ of a state $i$ with respect to $P$ is defined by

$d_i := \text{g.c.d.}\{\, n \mid p_{ii}^{(n)} > 0,\ n \in \mathbb{N} \,\}$,   $i \in S$. $\square$

A proof of the following result can be found in SENETA [52]:
PROPOSITION 2.1. Let $P$ be irreducible. Then all states have the same period, $d$ say, with respect to $P$. There exists a unique partition $\{C(1),\ldots,C(d)\}$ of $S$ such that $i \in C(k)$ and $p_{ij} > 0$ implies $j \in C(k+1)$ if $k < d$ and $j \in C(1)$ if $k = d$. $\square$

$P$ is said to be aperiodic if $d = 1$, otherwise it is periodic with period $d$. Some authors use the word (a)cyclic instead of (a)periodic.
Powers of square matrices are usually studied by eigenvalue methods. The eigenvalues on the spectral circle, i.e., the eigenvalues with largest absolute value, play a special role; in fact they characterize the first-order asymptotic behaviour of $P^n$ for $n \to \infty$. If $P$ is nonnegative, these eigenvalues and their associated eigenvectors possess very nice properties; these properties are summarized below in the famous Perron-Frobenius theorem.
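Before turning to that theorem, note that the period of definition 2.3 can also be read off mechanically from the zero pattern: scan the return times $n$ with $p_{ii}^{(n)} > 0$ and take their g.c.d. A sketch (ours; the scan is truncated at a fixed horizon, which suffices for small graphs):

```python
from math import gcd

def period(P, i, nmax=50):
    """d_i = gcd{ n >= 1 : p_ii^(n) > 0 }, scanning n <= nmax.
    Only the zero pattern matters, so Boolean matrix powers suffice."""
    N = len(P)
    A = [[P[r][c] > 0 for c in range(N)] for r in range(N)]   # adjacency
    reach, d = A, 0
    for n in range(1, nmax + 1):
        if reach[i][i]:
            d = gcd(d, n)
        # Boolean matrix product: one more step along the graph
        reach = [[any(reach[r][k] and A[k][c] for k in range(N))
                  for c in range(N)] for r in range(N)]
    return d

# a 3-cycle: irreducible with period 3; all states share the same period
C = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
print([period(C, i) for i in range(3)])
```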
PROPOSITION 2.2. Let $P$ be a square nonnegative matrix and let $\sigma(P)$ denote its spectral radius, i.e., $\sigma(P) := \max\{|\lambda| \mid \lambda$ an eigenvalue of $P\}$. Then $\sigma(P)$ is an eigenvalue of $P$, with associated semi-positive left- and right-eigenvectors. If $P$ is irreducible these eigenvectors are unique up to multiplicative constants; furthermore they can be chosen strictly positive in this case. If $P$ is irreducible, $\sigma(P)$ is simple. If $P$ is irreducible with period $d$ then there exist precisely $d$ eigenvalues $\lambda_1, \ldots, \lambda_d$ with $|\lambda_k| = \sigma(P)$, namely $\lambda_k = \sigma(P) \exp(i \cdot 2\pi k/d)$ for $k = 1,\ldots,d$. These eigenvalues are all simple. $\square$

For a proof we refer to GANTMACHER [26] or SENETA [52]. Note that $\sigma(P) > |\lambda|$ for any eigenvalue $\lambda \neq \sigma(P)$, if $P$ is irreducible and aperiodic.
The existence of strictly positive eigenvectors associated with the spectral radius $\sigma(P)$ of a square nonnegative matrix $P$ immediately provides us with bounds for the vector $x(n) = P^n x(0)$, $n = 1,2,\ldots$, where $x(0)$ is any positive vector. Let $u$ be a strictly positive right-eigenvector, associated with $\sigma(P)$, and choose constants $c_1, c_2 > 0$ with $c_1 u \leq x(0) \leq c_2 u$. Then

$c_1\, \sigma(P)^n u \leq x(n) \leq c_2\, \sigma(P)^n u$,   $n \in \mathbb{N}$.
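These sandwich bounds are easy to check numerically. In the sketch below (a toy instance of ours) $P$ is irreducible with $\sigma(P) = 3$ and strictly positive eigenvector $u = e$, and the bounds hold at every step:

```python
def matvec(P, x):
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]

# irreducible toy matrix: sigma = 3 with strictly positive eigenvector u = (1, 1)
P, sigma, u = [[1, 2], [2, 1]], 3, [1, 1]
x = [1.0, 3.0]                  # x(0); c1 = 1, c2 = 3 give c1*u <= x(0) <= c2*u
c1, c2 = 1.0, 3.0
for n in range(1, 11):
    x = matvec(P, x)
    # the sandwich c1*sigma^n*u <= x(n) <= c2*sigma^n*u holds componentwise
    assert all(c1 * sigma**n * ui <= xi <= c2 * sigma**n * ui
               for ui, xi in zip(u, x))
print(x[0] / sigma**10, x[1] / sigma**10)
```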
This result suggests the question: which nonnegative matrices possess strictly positive eigenvectors? Irreducibility is a sufficient (but certainly not necessary) condition. Before answering this question, we need a few definitions.
DEFINITION 2.4. Let $D$ be a proper subset of $S$. The restriction $P_D$ of $P$ to $D \times D$ is called a principal minor of $P$. $\square$
For principal minors the following result holds.
PROPOSITION 2.3. The spectral radius $\sigma(P')$ of any principal minor $P'$ of $P$ does not exceed the spectral radius $\sigma(P)$ of $P$. If $P$ is irreducible, we have $\sigma(P') < \sigma(P)$; if $P$ is reducible, then $\sigma(P') = \sigma(P)$ for at least one irreducible principal minor $P'$. $\square$

For a proof see GANTMACHER [26].
Reducible nonnegative matrices can be written in block-triangular form (possibly after permutation of the states) in such a way that the blocks on the diagonal are irreducible. This defines a partially hierarchical structure in the underlying graph. The irreducible blocks correspond to classes. More formally:

DEFINITION 2.5. A class $C$ of $P$ is a subset $C$ of $S$ such that $P_C$ is irreducible and such that $C$ cannot be enlarged without destroying the irreducibility. $C$ is called basic if $\sigma(P_C) = \sigma(P)$, otherwise $C$ is called nonbasic (in which case $\sigma(P_C) < \sigma(P)$, according to proposition 2.3). $\square$
The reader may note that $P$ partitions the state space $S$ into classes, $C(1), C(2), \ldots, C(n)$ say. If $P^{(i,j)}$ denotes the restriction of $P$ to $C(i) \times C(j)$, $i,j = 1,\ldots,n$, then, possibly after permutation of the states, $P$ can be written in the following form:

(2.1.1)   $P = \begin{pmatrix} P^{(1,1)} & P^{(1,2)} & \cdots & P^{(1,n)} \\ & P^{(2,2)} & \cdots & P^{(2,n)} \\ & & \ddots & \vdots \\ & & & P^{(n,n)} \end{pmatrix}$

with $P^{(i,j)} = 0$ for $i > j$, $i,j = 1,\ldots,n$. Hence classes can be partially ordered by accessibility relations. We may speak of access to (from) a class if there is access to (from) some (or equivalently: any) state in that class.
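The partition into classes and an ordering realizing (2.1.1) can be computed from mutual accessibility alone. A sketch (ours): classes are the equivalence classes of mutual access, sorted so that access only runs from earlier classes to later ones:

```python
def classes(P):
    """Classes of P (definition 2.5, without the basic/nonbasic split):
    maximal sets of mutually accessible states, listed in an order in which
    P is block upper-triangular as in (2.1.1)."""
    N = len(P)
    acc = [[i == j or P[i][j] > 0 for j in range(N)] for i in range(N)]
    for k in range(N):                     # transitive closure
        for i in range(N):
            for j in range(N):
                acc[i][j] = acc[i][j] or (acc[i][k] and acc[k][j])
    seen, order = set(), []
    for i in range(N):
        if i not in seen:
            C = [j for j in range(N) if acc[i][j] and acc[j][i]]
            seen.update(C)
            order.append(C)
    # a class that has access to another has a strictly larger reach set,
    # so sorting by reach-set size (descending) is a valid class order
    order.sort(key=lambda C: sum(acc[C[0]][j] for j in range(N)), reverse=True)
    return order

P = [[2, 1, 0], [0, 1, 1], [0, 0, 2]]      # hypothetical 3x3, already triangular
print(classes(P))
```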
DEFINITION 2.6. A class $C$, associated with $P$, is called final if $C$ has no access to any other class. A class $C$ is called initial if no other class has access to $C$. $\square$
The question which class has access to which class is fundamental for the investigation of powers of nonnegative matrices. The existence of strictly positive eigenvectors also depends completely on the accessibility structure. We have
PROPOSITION 2.4. $P$ possesses a strictly positive right- (left-) eigenvector if and only if its basic classes are precisely its final (initial) classes. $\square$

For a proof we refer to GANTMACHER [26] again.
Matrices with strictly positive eigenvectors (and especially their powers) have very nice properties, as has already been indicated above. The following lemma is fundamental for the analysis in the forthcoming sections. We have:
LEMMA 2.5. Let $P$ have spectral radius $\sigma > 0$, and let there exist a strictly positive right-eigenvector, $u$ say, associated with $\sigma$. Then there exists a nonnegative matrix $P^*$, defined by

$P^* := \lim_{n \to \infty} \frac{1}{n+1} \sum_{k=0}^{n} \sigma^{-k} P^k$.

We have $PP^* = P^*P = \sigma P^*$ and $(P^*)^2 = P^*$. Furthermore, $p_{ij}^* > 0$ if and only if $j$ is contained in a basic class of $P$ and $i$ has access to $j$ under $P$. If the restriction of $P$ to each of its basic classes is aperiodic, then

$P^* = \lim_{n \to \infty} \sigma^{-n} P^n$.

Finally, the matrix $(\sigma I - P + P^*)$ is nonsingular.
PROOF. The matrix $\tilde{P}$, defined by

(*)   $\tilde{p}_{ij} := \sigma^{-1} u_i^{-1} p_{ij} u_j$,   $i,j \in S$,

is stochastic (i.e., $\tilde{P} \geqq 0$, $\tilde{P}e = e$). For stochastic matrices the results are well known (cf. KEMENY AND SNELL [34]). By the inverse transformation of (*) all results for $\tilde{P}$ are translated into the corresponding results for $P$. $\square$
The matrix $P^*$ is the projector on the null-space of $(\sigma I - P)$, along the range of $(\sigma I - P)$. The matrix $(\sigma I - P + P^*)$ is often called the fundamental matrix corresponding to $P$ (KEMENY AND SNELL [34]). Note that the restriction of $P^*$ to each basic class of $P$ is strictly positive.
Even if a nonnegative reducible matrix $P$ does not possess a strictly positive eigenvector associated with its spectral radius $\sigma$, it is easy to understand the fundamental role of $\sigma$ with respect to the behaviour of $P^n$. The following characterization is useful.

LEMMA 2.6. Let $P$ have spectral radius $\sigma$. Then
a. $\sigma = \lim_{n \to \infty} \|P^n\|^{1/n}$;
b. $\sigma = \sup\{\lambda \mid \exists\, w \geq 0 : Pw \geq \lambda w\} = \inf\{\lambda \mid \exists\, w > 0 : Pw \leq \lambda w\}$;
c. for each $\lambda > \sigma$ there exists a vector $w > 0$ such that $Pw \leq \lambda w$.

PROOF. a. follows from DUNFORD AND SCHWARTZ [21], p.567; b. follows from a. To establish c., take $w = (\lambda I - P)^{-1} e$. $\square$
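Part c. of the lemma is constructive: for $\lambda > \sigma$ the vector $w = (\lambda I - P)^{-1} e$ is strictly positive with $Pw \leq \lambda w$. A sketch (ours) that expands $(\lambda I - P)^{-1}$ in the Neumann series $\sum_{n \geq 0} \lambda^{-(n+1)} P^n$, valid for $\lambda > \sigma(P)$:

```python
def subinvariant(P, lam, terms=200):
    """Approximate w = (lam*I - P)^{-1} e via the Neumann series
    sum_{n>=0} lam^-(n+1) P^n e (convergent for lam > sigma(P))."""
    N = len(P)
    t = [1.0 / lam] * N                    # term n = 0: lam^-1 * e
    w = t[:]
    for _ in range(terms):
        t = [sum(P[i][k] * t[k] for k in range(N)) / lam for i in range(N)]
        w = [wi + ti for wi, ti in zip(w, t)]
    return w

P, lam = [[1, 2], [2, 1]], 4               # sigma(P) = 3 < lam
w = subinvariant(P, lam)
Pw = [sum(P[i][k] * w[k] for k in range(2)) for i in range(2)]
print(all(Pwi <= lam * wi for Pwi, wi in zip(Pw, w)))   # w is lam-subinvariant
```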
If $\lambda > \sigma$ and $w > 0$ is such that $Pw \leq \lambda w$, one immediately sees that $\lambda^n w$ dominates $P^n x(0)$ if $x(0) \leq w$. A vector $w$ satisfying $w > 0$ and $Pw \leq \lambda w$ is often called $\lambda$-subinvariant (cf. chapter 8 of this monograph) or strongly excessive (cf. VAN HEE AND WESSELS [70]); these vectors play an important role in stochastic analysis and in potential theory for Markov chains (cf. KEMENY, SNELL AND KNAPP [35] or HORDIJK [27]).
A more detailed analysis of the role of the spectral radius with respect to powers of a nonnegative matrix will be given in the next section. Accessibility relations between basic (and nonbasic) classes will play a fundamental role again. We now conclude this section with two lemmas which are needed in the sequel.
LEMMA 2.7. Let $P$ be irreducible, let $\sigma$ be its spectral radius and let $x > 0$. Then $Px \leq \sigma x$ implies $Px = \sigma x$. Analogously, $Px \geq \sigma x$ implies $Px = \sigma x$.

PROOF. Suppose $Px \leq \sigma x$ and $Px \neq \sigma x$. Multiplying by the strictly positive left-eigenvector of $P$, associated with $\sigma$, yields $\sigma < \sigma$, a contradiction. Hence $Px = \sigma x$. Similarly if $Px \geq \sigma x$. $\square$
LEMMA 2.8. Let $P$ have spectral radius $\sigma$ and suppose $Px \geq \lambda x$ for some real $\lambda$ and some real vector $x$ with at least one positive component. Then $\sigma \geq \lambda$.

PROOF. Let $y := (\lambda I - P)x$. Then $y \leq 0$. If $\lambda > \sigma$ then $\lambda I - P$ is nonsingular and

$x = (\lambda I - P)^{-1} y = \sum_{n=0}^{\infty} \lambda^{-(n+1)} P^n y \leq 0$,

contradicting the assumption that $x$ has at least one positive component. Hence $\sigma \geq \lambda$. $\square$

In the next section, the structure analysis of reducible nonnegative matrices is continued. The concepts introduced there are less familiar; again they have a strongly graph-theoretical interpretation.
2.2. Block-triangular decompositions
In this section a rather specific decomposition result for nonnegative matrices is presented, which is based on the already mentioned hierarchical structure in the underlying graph. Recall that classes are partially ordered by accessibility relations, cf. (2.1.1). We shall show the existence of a particular hierarchical order of basic and nonbasic classes, which is strongly related to the behaviour of powers of the associated nonnegative matrix.
Let $P$ be a square nonnegative matrix (of finite dimension) and let $S$ be the state space. Obviously, there must be a strong connection between the familiar Jordan canonical form of $P$ and the partitioning of $S$ into basic and nonbasic classes. The number of basic classes for instance is precisely equal to the algebraic multiplicity of the eigenvalue $\sigma(P)$. Namely, if $C$ is a basic class of $P$, then $\sigma(P)$ is a simple eigenvalue of $P_C$, and furthermore each eigenvalue of $P_C$ is an eigenvalue of $P$ (use (2.1.1)). We shall show that there also exists a relationship between certain chain-structures of basic and nonbasic classes and the size of the Jordan block associated with the spectral radius $\sigma(P)$, if $\sigma(P)$ is degenerate (cf. PEASE [44]).
Consider the following example:
Example 2.1. Let

$P = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}$,   $x(0) = \begin{pmatrix} x(0)_1 \\ x(0)_2 \end{pmatrix} > 0$,

and define $x(n) = P^n x(0)$, $n = 1,2,\ldots$. Then

$x(n) = \begin{pmatrix} 2^n x(0)_1 + n\, 2^{n-1} x(0)_2 \\ 2^n x(0)_2 \end{pmatrix}$.

Notice the difference in behaviour between $x(n)_1$ and $x(n)_2$, caused by the fact that state 1 has access to state 2 and $p_{11}$ and $p_{22}$ are both equal to the spectral radius. In terms of classes: the matrix $P$ possesses two basic classes, $\{1\}$ and $\{2\}$, the first one having access to the second one, which implies an asymptotic behaviour of the vector $x(n)$ of the form

$x(n)_1 \sim c_1 n\, 2^n$,   $x(n)_2 \sim c_2\, 2^n$   ($n \to \infty$),

where $c_1$ and $c_2$ depend on the starting vector $x(0)$.
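The two growth rates in example 2.1 can be observed directly by iterating the recursion (the starting vector below is our choice):

```python
# iterate x(n) = P x(n-1) for the matrix of example 2.1 with x(0) = (1, 1);
# x(n)_1 grows like n * 2^n while x(n)_2 grows like 2^n
P = [[2, 1], [0, 2]]
x = [1, 1]
for n in range(1, 21):
    x = [P[0][0] * x[0] + P[0][1] * x[1],
         P[1][0] * x[0] + P[1][1] * x[1]]
print(x[0] / (20 * 2**20), x[1] / 2**20)   # normalized components stay bounded
```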
The next example is an extension of the previous one.

Example 2.2. Let $P$ be a $3 \times 3$ upper-triangular nonnegative matrix with spectral radius 2, whose basic classes are $\{1\}$ and $\{3\}$ and whose class $\{2\}$ is nonbasic, with $p_{12} > 0$ and $p_{23} > 0$ (so that $\{1\}$ has access to $\{3\}$ via $\{2\}$). Now it is easy to verify that for $n \to \infty$, $x(n) = P^n x(0)$ obeys

$x(n)_1 \sim d_1 n\, 2^n$,   $x(n)_2 \sim d_2\, 2^n$,   $x(n)_3 \sim d_3\, 2^n$,

where again $d_1$, $d_2$ and $d_3$ are constants, depending on $x(0)$.
Apparently, the presence of a nonbasic class in a "chain between two basic classes" (definition follows below) does not really influence the asymptotic behaviour of $x(n)$ (note that still the first basic class has access to the second one, but now via a nonbasic class). It is this relationship, between the position of basic and nonbasic classes and the behaviour of powers of a nonnegative matrix, that will be studied in this section.
What we need first is a quantitative indication of the position of a class. We start with the definition of a chain.
DEFINITION 2.7. By a chain of classes of $P$ we mean a collection of classes $\{C(1), C(2), \ldots, C(n)\}$ such that $p_{i_k j_k} > 0$ for some pair of states $(i_k, j_k)$ with $i_k \in C(k)$, $j_k \in C(k+1)$, $k = 1,2,\ldots,n-1$. We say that the chain starts with $C(1)$ and ends with $C(n)$. The length of a chain is the number of basic classes it contains. $\square$
DEFINITION 2.8. The height (depth) of a class $C$ of $P$ is the length of the longest chain which ends (starts) with $C$. $\square$

To illustrate these definitions consider the following example.
Example 2.3. Let $P$ be a $5 \times 5$ upper-triangular nonnegative matrix (the lower triangle of $P$ contains only zeros). Each class of $P$ contains exactly one state; the classes $\{2\}$, $\{4\}$ and $\{5\}$ are basic, the classes $\{1\}$ and $\{3\}$ are nonbasic.

(figure: the accessibility graph of the five classes, with the basic classes marked B)
Classifying the classes according to their position, we obtain:

                      Height   Depth
Basic classes    {2}     1       1
                 {4}     2       2
                 {5}     2       1
Nonbasic classes {1}     0       2
                 {3}     0       2

Finally, we define the degree of a nonnegative matrix.

DEFINITION 2.9. The degree $\nu(P)$ of $P$ is the length of its longest chain. $\square$

In all the examples $\nu(P) = 2$.
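Depth (definition 2.8) and degree (definition 2.9) can be computed by dynamic programming over the class graph. The sketch below (ours) is restricted to upper-triangular matrices whose classes are all singletons, as in the examples; the $3 \times 3$ instance is a hypothetical one in the spirit of example 2.2:

```python
def depths(P, sigma):
    """Depth of each (singleton) class of an upper-triangular P: the largest
    number of basic classes in a chain starting at that class (def. 2.8)."""
    N = len(P)
    basic = [P[i][i] == sigma for i in range(N)]
    depth = [0] * N
    for i in reversed(range(N)):           # reverse topological order
        succ = [depth[j] for j in range(N) if j > i and P[i][j] > 0]
        depth[i] = max(succ, default=0) + (1 if basic[i] else 0)
    return depth

# hypothetical instance: basic classes {1} and {3}, nonbasic {2} in between
P = [[2, 1, 0], [0, 1, 1], [0, 0, 2]]
d = depths(P, 2)
print(d, max(d))        # max(d) is the degree nu(P): the longest chain length
```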
LEMMA 2.9. Let $P$ have spectral radius $\sigma$ and degree $\nu$. There exists a partition $\{D(\nu), \ldots, D(1), D(0)\}$ of the state space $S$ such that $D(k)$ is the union of all classes with depth $k$, for $k = 0,1,\ldots,\nu$. If $P^{(k,\ell)}$ denotes the restriction of $P$ to $D(k) \times D(\ell)$, then $P^{(k,\ell)} = 0$ for $k < \ell$ ($k,\ell = 0,1,\ldots,\nu$). After possibly permuting the states we may write

(2.2.1)   $P = \begin{pmatrix} P^{(\nu,\nu)} & P^{(\nu,\nu-1)} & \cdots & P^{(\nu,0)} \\ & P^{(\nu-1,\nu-1)} & \cdots & P^{(\nu-1,0)} \\ & & \ddots & \vdots \\ & & & P^{(0,0)} \end{pmatrix}$

We have $\sigma(P^{(k,k)}) = \sigma$ for $k = 1,\ldots,\nu$ and $\sigma(P^{(0,0)}) < \sigma$ (if $D(0)$ is not empty). Furthermore, there exist vectors $u(k) > 0$ such that

(2.2.2)   $P^{(k,k)} u(k) = \sigma u(k)$,   $k = 1,\ldots,\nu$.

PROOF. Since the degree of $P$ is $\nu$, there exist classes with depth $k$, for $k = 1,\ldots,\nu$, and possibly classes with depth zero (nonbasic classes which do not have access to any basic class). Obviously, a class of depth $k$ has no access to a class of depth $\ell > k$, hence $P^{(k,\ell)} = 0$ for $k < \ell$. Basic classes with depth $k$ do not have access to any other class of depth $k$, whereas nonbasic classes with depth $k$ must have access to at least one basic class of depth $k$. Furthermore, $\sigma(P^{(k,k)}) = \sigma$ for $k = 1,\ldots,\nu$ and $\sigma(P^{(0,0)}) < \sigma$ by proposition 2.3 and definition 2.5. Proposition 2.4 now implies the existence of vectors $u(k) > 0$ such that (2.2.2) holds for $k = 1,\ldots,\nu$. $\square$
Remark. Note that each state in $D(k)$ has access to some state in $D(k-1)$, for $k = \nu, \nu-1, \ldots, 2$.
DEFINITION 2.10. The partition $\{D(\nu), D(\nu-1), \ldots, D(1), D(0)\}$, such that $D(k)$ contains all classes with depth $k$ ($k = \nu, \nu-1, \ldots, 1, 0$), is called the principal partition of $S$ with respect to $P$. $\square$

Consider, once again, the matrix of example 2.3. We find $D(2) = \{1,3,4\}$, $D(1) = \{2,5\}$, $D(0) = \emptyset$. In other words, after permuting the states we have
the following structure (in block form, with the states of $D(2)$ listed first):

$P = \begin{pmatrix} P^{(2,2)} & P^{(2,1)} \\ 0 & P^{(1,1)} \end{pmatrix}$
Both $P^{(2,2)}$ and $P^{(1,1)}$ possess a strictly positive right-eigenvector, associated with eigenvalue 2. If we define an "aggregated" state space $S' = \{1', 2'\}$ with $1' = D(2)$ and $2' = D(1)$, and an "aggregated" matrix $P'$ on $S' \times S'$ by

$P' = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}$

then the behaviour of powers of $P'$ is in essence the same as the behaviour of powers of $P$. Note that $P'$ has been investigated in example 2.1; there we saw that the position of the basic classes determined the asymptotic behaviour of $x(n) = P^n x(0)$ as $n \to \infty$. More generally we have
LEMMA 2.10. Let $P$ have spectral radius $\sigma$ and degree $\nu$ and let $\{D(\nu), D(\nu-1), \ldots, D(1), D(0)\}$ be the principal partition of $S$ with respect to $P$. Choose $x(0) > 0$ and let $x(n) = P^n x(0)$, $n = 1,2,\ldots$. Then there exist constants $c_1, c_2 > 0$ and vectors $u(k) > 0$, satisfying (2.2.2), such that for $n \in \mathbb{N}$

(*)   $c_1 n^{k-1} \sigma^n u(k)_i \leq x(n)_i \leq c_2 n^{k-1} \sigma^n u(k)_i$,   $i \in D(k)$, $k = 1,\ldots,\nu$,

and

$\lim_{n \to \infty} \sigma^{-n} x(n)_i = 0$,   $i \in D(0)$. $\square$

The proof of lemma 2.10 is postponed until section 2.3, where it follows immediately from a general result concerning the structure of generalized eigenvectors, associated with the spectral radius of a square nonnegative matrix. For a direct proof of lemma 2.10, see ZIJM [76].
Lemma 2.10 shows the desired relationship between the behaviour of $x(n) = P^n x(0)$ and the position of the classes of $P$. The concept of "depth of a class $C$" appears to play a key role, which means that the maximal number of basic classes which can be found in a chain starting with $C$ is relevant. A relationship, completely analogous to (*), exists between the height of a class and the behaviour of $x(0)^T P^n$, restricted to that class, as $n \to \infty$ (note that the depth of $C$ with respect to $P$ is equal to the height of $C$ with respect to $P^T$).
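Lemma 2.10 can be checked numerically. The matrix below is a hypothetical one (ours) with principal partition $D(2) = \{1\}$, $D(1) = \{2,3\}$, $D(0) = \emptyset$ (states numbered from 1), so the first component is normalized by $n\,\sigma^n$ and the others by $\sigma^n$:

```python
# numerical check of lemma 2.10 for a hypothetical 3x3 matrix with sigma = 2
P, sigma = [[2, 1, 0], [0, 1, 1], [0, 0, 2]], 2
x = [1.0, 1.0, 1.0]
ratios = []
for n in range(1, 31):
    x = [sum(P[i][k] * x[k] for k in range(3)) for i in range(3)]
    # state 1 has depth 2 (normalize by n * sigma^n), states 2, 3 depth 1
    ratios.append((x[0] / (n * sigma**n), x[1] / sigma**n, x[2] / sigma**n))
print([round(r, 3) for r in ratios[-1]])   # all stay between positive constants
```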
We conclude this section with an extension of lemmas 2.9 and 2.10 which will be needed in the sequel. Note that the concepts "basic class", "depth" and "degree" are defined with respect to $\sigma(P)$. However, if $D(0) \neq \emptyset$ and $\sigma(P^{(0,0)}) \neq 0$, we may repeat the procedure, i.e. decompose $P^{(0,0)}$ in exactly the same way. Continuing in this way we finally obtain

LEMMA 2.11. Let $P$ be a square nonnegative matrix. There exist an integer $r = r(P)$ and a partition $\{I(1), I(2), \ldots, I(r)\}$ of the state space $S$ such that the following properties hold:
a. Let $P^{(k,\ell)}$ denote the restriction of $P$ to $I(k) \times I(\ell)$. Then $P^{(k,\ell)} = 0$ if $k > \ell$; $k,\ell = 1,\ldots,r$.
b. For $k \leq \ell$ we have $\sigma(P^{(k,k)}) \geq \sigma(P^{(\ell,\ell)})$, with equality if and only if each state in $I(k)$ has access to some state in $I(\ell)$, $k,\ell = 1,\ldots,r$.
c. There exist strictly positive vectors $u(k)$ such that

(2.2.3)   $P^{(k,k)} u(k) = \sigma(P^{(k,k)})\, u(k)$,   $k = 1,2,\ldots,r$.

d. Choose $x(0) > 0$ and let $x(n) = P^n x(0)$ for $n = 1,2,\ldots$. For each $k \in \{1,2,\ldots,r\}$ define the integer $\ell_k$ by $\ell_k := \min\{\ell \mid 0 < \ell \leq r-k,\ \sigma(P^{(k+\ell,k+\ell)}) < \sigma(P^{(k,k)})\}$ and $\ell_k := r-k+1$ if the minimum does not exist. Then, if $\sigma(P^{(r,r)}) \neq 0$, there exist positive constants $c_1, c_2$, depending on $x(0)$ only, such that

$c_1 u(k)_i \leq n^{-(\ell_k - 1)}\, \sigma(P^{(k,k)})^{-n}\, x(n)_i \leq c_2 u(k)_i$

for $i \in I(k)$, $k = 1,\ldots,r$ and $n \in \mathbb{N}$. $\square$
DEFINITION 2.11. The partition $\{I(1), I(2), \ldots, I(r)\}$ discussed in lemma 2.11 is called the spectral partition of $S$ with respect to $P$. $\square$
In the next section we discuss generalized eigenvectors, associated with the spectral radius of a square nonnegative matrix. It will appear that a strong relationship exists with the decomposition result of lemma 2.9.
2.3. Generalized eigenvectors
In the preceding sections we have seen that nonnegative matrices with strictly positive eigenvectors have nice properties, in particular with respect to their powers (note that for these matrices the integer $r(P)$, defined in lemma 2.11, is equal to one). One of the most important cases where such a strictly positive eigenvector does not exist is the case where the degree of $P$ is larger than one. In this case the spectral radius $\sigma(P)$ is degenerate as an eigenvalue (i.e., the number of independent eigenvectors associated with $\sigma(P)$ is smaller than its algebraic multiplicity), which implies the existence of generalized eigenvectors (cf. PEASE [44]). In this section, the structure of these generalized eigenvectors is studied and related to accessibility relations between basic classes, and so to the decomposition result of lemma 2.9.
Let us start with a formal definition.
DEFINITION 2.12. Let $P$ have spectral radius $\sigma$ and for $k \in \mathbb{N}$ let $N^k(P)$ be the null space of $(P - \sigma I)^k$. The index $n(P)$ of $P$ is the smallest nonnegative integer $k$ such that $N^k(P) = N^{k+1}(P)$. $\square$

If $P$ is an $N \times N$ matrix with spectral radius $\sigma$ and index $n$, then $n \leq N$ and $N^k(P) = N^n(P)$ for $k \geq n$ (compare e.g. DUNFORD AND SCHWARTZ [21], p.556). The elements of $N^n(P)$ are called generalized eigenvectors. If $x \in N^k(P) \setminus N^{k-1}(P)$, we call $x$ a generalized eigenvector of order $k$.
ROTHBLUM [48] showed that generalized eigenvectors, associated with the spectral radius of a square nonnegative matrix, have nice properties. Before discussing his results we consider once again the matrix of example 2.1.
Example 2.1. (continued). Define $P$, $w(1)$ and $w(2)$ by

$P = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}$,   $w(1) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$,   $w(2) = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$.

Then, clearly,

$P\, w(2) = 2 w(2) + w(1)$,   $P\, w(1) = 2 w(1)$.

Hence $w(2)$ is a generalized eigenvector of order 2. Note that $w(2)$ is strictly positive.
One may wonder whether in general the generalized eigenvector of highest order can be chosen strictly positive. It is intuitively clear that the position of the zeros in any generalized eigenvector must be related to the block-triangular structure presented in lemma 2.9. The following result can be established (ROTHBLUM [48]).
THEOREM 2.12. Let $P$ have spectral radius $\sigma$ and degree $\nu$. Then for $k = 1,\ldots,\nu$ there exist generalized eigenvectors $w(k)$, where $w(k)$ has order $\nu-k+1$, such that

(2.3.1)   $P\, w(\nu) = \sigma\, w(\nu)$,

(2.3.2)   $P\, w(k) = \sigma\, w(k) + w(k+1)$,   $k = 1,\ldots,\nu-1$.

Let $\{D(\nu), D(\nu-1), \ldots, D(1), D(0)\}$ be the principal partition of $S$ with respect to $P$. Then the vectors $w(k)$ can be chosen in such a way that for $k = 1,\ldots,\nu$

$w(k)_i > 0$ for $i \in \bigcup_{\ell=k}^{\nu} D(\ell)$   and   $w(k)_i = 0$ for $i \in \bigcup_{\ell=0}^{k-1} D(\ell)$.