Conditional mean information and conditional mean entropy
Citation for published version (APA): Nijst, A. G. P. M. (1971). Conditional mean information and conditional mean entropy. Technische Hogeschool Eindhoven. https://doi.org/10.6100/IR29048
DOI: 10.6100/IR29048
Document status and date: Published: 01/01/1971
CONDITIONAL MEAN INFORMATION AND
CONDITIONAL MEAN ENTROPY
PROEFSCHRIFT

(Thesis) to obtain the degree of Doctor in the Technical Sciences at the Technische Hogeschool Eindhoven, by authority of the Rector Magnificus, Prof. Dr. Ir. A. A. Th. M. van Trier, to be defended in public before a committee of the Senate on Tuesday 15 June 1971 at 16.00 hours

by

ARTHUR GABRIEL PAUL MARIA NIJST

This thesis has been approved by the promotor.
CONTENTS
Introduction and summary

Chapter I    Preliminaries
  1.1. Some measure theoretic propositions
  1.2. Separable σ-fields
  1.3. L₁-preserving transformations
  1.4. The individual ergodic theorem
  1.5. Conditional information and entropy
  1.6. Mean information and entropy

Chapter II   The case of a measure preserving transformation
  2.1. Introduction
  2.2. Periodic sub-σ-fields
  2.3. Mean entropy zero
  2.4. Invariant sub-σ-fields
  2.5. Properties of conditional mean entropy

Chapter III  The case of an equivalent invariant probability
  3.1. Introduction
  3.2. Existence of conditional mean information and entropy
  3.3. Some properties of conditional mean entropy
  3.4. Decomposition of the Kolmogorov-Sinai invariant

Chapter IV   The case of an L₁-preserving transformation
  4.1. PARRY's conditional mean entropy
  4.2. PARRY's conditional mean information

References

Samenvatting (Summary in Dutch)
INTRODUCTION AND SUMMARY
Let (X, ℛ, P) be a probability space. Let ξ = {A_i}_{i=1}^∞ be a (measurable) partition of X and let ℛ₀ and ℛ₁ be sub-σ-fields of ℛ. Then the concepts of conditional information of ξ with respect to ℛ₀, denoted by I(ξ|ℛ₀), and conditional entropy of ℛ₁ with respect to ℛ₀, denoted by H(ℛ₁|ℛ₀), are well known (cf. JACOBS [10], NEVEU [14], [15]; cf. also the definitions 1.5.1 and 1.5.8). Now let T be a non-singular transformation on (X, ℛ, P). If T is measure preserving, then by a generalization of McMILLAN's theorem given by CHUNG (cf. PARRY [18], p. 20, theorem 2.5; cf. also theorem 1.6.5), for any partition ξ with finite entropy the sequence

    { (1/t) I( ∨_{i=0}^{t-1} T^{-i}ξ ) }_{t=1}^∞

converges a.e. and in L₁-norm to a function called the mean information of ξ. Consequently also the sequence

    { (1/t) H( ∨_{i=0}^{t-1} T^{-i}ξ ) }_{t=1}^∞

converges to a number called the mean entropy of ξ.
If T admits an invariant probability P' equivalent with P, then for any finite partition ξ the sequence

    { (1/t) I( ∨_{i=0}^{t-1} T^{-i}ξ ) }_{t=1}^∞

still converges in L₁-norm, as proved by JACOBS [10], p. 318, theorem 3. Consequently the existence of the mean entropy of ξ is guaranteed in this case as well.
Let T be a measure preserving invertible transformation, let ξ be a partition with finite entropy and let ℛ₀ be a sub-σ-field of ℛ with the property T⁻¹ℛ₀ = ℛ₀. In this setting NEVEU [14] has defined the conditional mean entropy of ξ with respect to ℛ₀ by

    H( ξ | ℛ₀ ∨ ∨_{i=1}^∞ T^{-i}ξ ).

It is easy to see from some simple properties of conditional entropy (cf. 2.1.3) that in this case we have

    H( ξ | ℛ₀ ∨ ∨_{i=1}^∞ T^{-i}ξ ) = lim_{t→∞} (1/t) H( ∨_{i=0}^{t-1} T^{-i}ξ | ℛ₀ ).

In this thesis we shall investigate some conditions on the transformation T and on the sub-σ-field ℛ₀ of ℛ which guarantee the existence of conditional mean information and conditional mean entropy with respect to ℛ₀ for any partition ξ with finite entropy (or for any finite partition ξ, as in chapter III). Furthermore we shall establish some properties of conditional mean information and conditional mean entropy.
For the convenience of the reader some basic concepts and results are recalled in chapter I. Proofs are given if they are rather difficult to find in the literature. Furthermore we have collected in the sections 1.5 and 1.6 some well-known properties of (mean) information and (mean) entropy. Chapter II contains results on conditional mean information and conditional mean entropy in the case that T is measure preserving (not necessarily invertible). The major results of chapter II concern the existence and the properties of conditional mean information and conditional mean entropy of a partition ξ with finite entropy, conditioning by an invariant sub-σ-field ℛ₀ of ℛ (section 2.4), and the generalization of NEVEU's theorem [14], 3.2 given in theorem 2.5.9.
In chapter III we introduce the concept of conditional mean entropy in the case that T admits an invariant probability P' equivalent to P, for a finite partition ξ and conditioning by an invariant sub-σ-field ℛ₀ of ℛ. The method employed in this chapter essentially corresponds to JACOBS' method of introducing the mean entropy of a finite partition ξ as mentioned above. Furthermore in this chapter we discuss another generalization of NEVEU's theorem mentioned above and a decomposition of mean entropy in the case that T is invertible but not necessarily measure preserving. Finally, in chapter IV, PARRY's results on mean information and mean entropy for an L₁-preserving transformation [17] are extended by again admitting a conditioning sub-σ-field. In the special case that T is measure preserving these concepts of conditional mean information and conditional mean entropy reduce to the corresponding concepts discussed in chapter II. In the case that T is L₁-preserving and admits an invariant probability P' equivalent to P this need not be true, as will be shown in an example.
With respect to the notation we remark that the set of positive integers is denoted by ℕ and the set of real numbers by ℝ. Let ℛ₁ and ℛ₂ be σ-fields of subsets of a set X; then the smallest σ-field of subsets of X containing ℛ₁ and ℛ₂ is denoted by ℛ₁ ∨ ℛ₂. Similar notations are used in the case that we have a (possibly infinite) class of σ-fields of subsets of X. If A and B are subsets of X we denote by A\B the difference of A and B, i.e. A\B = {x | x ∈ A, x ∉ B}. Furthermore we shall use the notations Aᶜ = X\A and A △ B = (A\B) ∪ (B\A). Other notations will be explained at the place where they turn up.

PRELIMINARIES
For the convenience of the reader we shall collect in this chapter some preliminaries. Proofs will be given if they are short and not easily accessible in the literature. For other proofs we refer to the literature.
In this chapter, as in the following chapters, (X, ℛ, P) will be a probability space, i.e. X is an abstract set, ℛ is a σ-field of subsets of X and P is a probability on ℛ, i.e. a positive σ-additive set function on ℛ with the property P(X) = 1. Furthermore we shall denote the set of all sub-σ-fields of ℛ by σℛ. For any A ∈ ℛ we denote the characteristic function of A by χ_A.
§1. SOME MEASURE THEORETIC PROPOSITIONS
Definition 1.1.1: Let P' be a probability on the measurable space (X, ℛ). The probability P' is called absolutely continuous with respect to P if for any A ∈ ℛ such that P(A) = 0 we have P'(A) = 0. If also P is absolutely continuous with respect to P', then the probabilities P and P' are called equivalent. □

Proposition 1.1.2: Let P' be a probability on (X, ℛ) equivalent with P. Then for any ε > 0 there exists a δ > 0 such that for any A ∈ ℛ with P'(A) < δ we have P(A) < ε.
Proof: J. NEVEU [13], Corollary 1 of proposition IV.1.3. □

Remark 1.1.3: Let T be a measurable transformation on the probability space (X, ℛ, P) (cf. §3). An important problem in ergodic theory is to find a T-invariant measure P' such that P is absolutely continuous with respect to P'. Many necessary and sufficient conditions for the existence of such a probability P' can be found in Y. ITO: Invariant measures for Markov processes [9]. ITO also proved that if there exists a probability P' with the above mentioned properties, then there even exists a probability P'' which is T-invariant and equivalent with P. □
Let ℛ₀ ∈ σℛ and let f be a quasi-integrable function on (X, ℛ, P), i.e. f is a (ℛ-)measurable function on X with the property that sup(f,0) is integrable or sup(−f,0) is integrable. Then the set function Q on ℛ₀ defined by Q(A) = ∫_A f dP for any A ∈ ℛ₀ is a signed measure on (X, ℛ₀) absolutely continuous with respect to P (HALMOS [7], p. 118 and p. 124). Hence by the theorem of RADON-NIKODYM (HALMOS [7], p. 128, theorem B) there exists a modulo P uniquely determined ℛ₀-measurable quasi-integrable function f₀ on X such that

    ∫_A f₀ dP = ∫_A f dP   for any A ∈ ℛ₀.

The operator E_{ℛ₀}, defined on the set of quasi-integrable functions on (X, ℛ, P) by E_{ℛ₀}f = f₀, is called the conditional expectation with respect to ℛ₀. The function f₀ is called the conditional expectation of f with respect to ℛ₀. The set function P_{ℛ₀} defined on ℛ by P_{ℛ₀}(A) = E_{ℛ₀}(χ_A) for any A ∈ ℛ is called the conditional probability with respect to ℛ₀; P_{ℛ₀}(A) is called the conditional probability of A with respect to ℛ₀. Notice that for any f ∈ L₁(X, ℛ, P), i.e. for any integrable function f, we have E_{ℛ₀}f ∈ L₁(X, ℛ, P).
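When ℛ₀ is generated by a finite partition of a finite probability space, E_{ℛ₀}f is simply the P-weighted average of f over each cell. The following sketch (plain Python, with a small hypothetical example) checks the defining identity ∫_A f₀ dP = ∫_A f dP for each generating cell.

```python
# Conditional expectation on a finite probability space, where the
# sub-sigma-field R0 is generated by a finite partition: E[f | R0] is
# constant on each cell, equal to the P-weighted average of f there.

def cond_exp(f, p, partition):
    """f, p: values/probabilities indexed by point; partition: cells of R0."""
    f0 = [0.0] * len(f)
    for cell in partition:
        mass = sum(p[x] for x in cell)
        avg = sum(f[x] * p[x] for x in cell) / mass
        for x in cell:
            f0[x] = avg
    return f0

p = [0.1, 0.2, 0.3, 0.4]      # probability on X = {0, 1, 2, 3}
f = [1.0, 3.0, 2.0, 5.0]
cells = [[0, 1], [2, 3]]      # partition generating R0
f0 = cond_exp(f, p, cells)

# defining property: integrals over every R0-measurable cell agree
for cell in cells:
    assert abs(sum(f0[x] * p[x] for x in cell)
               - sum(f[x] * p[x] for x in cell)) < 1e-12
```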
Proposition 1.1.4: Let ℛ₀ ∈ σℛ. Then the conditional expectation with respect to ℛ₀ has the following properties:
a) For any pair of quasi-integrable functions f and g satisfying the condition that f is ℛ₀-measurable and f·g is quasi-integrable we have a.e. E_{ℛ₀}(f·g) = f·E_{ℛ₀}g.
b) Let ℛ₀ ∈ σℛ, ℛ₁ ∈ σℛ and suppose ℛ₀ ⊂ ℛ₁. Let f be a quasi-integrable function. Then we have a.e.

    E_{ℛ₀}E_{ℛ₁}f = E_{ℛ₁}E_{ℛ₀}f = E_{ℛ₀}f.

Proof: NEVEU [13], IV.3, formulas (1) and (2). □
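For nested σ-fields generated by finite partitions, part b) can be checked directly: averaging first over the finer cells and then over the coarser ones gives the same result as averaging over the coarser cells at once. A small numerical sketch (hypothetical data):

```python
# Tower property E_{R0} E_{R1} f = E_{R0} f for nested finite partitions
# (R0 coarser than R1) on a finite probability space.

def cond_exp(f, p, partition):
    f0 = [0.0] * len(f)
    for cell in partition:
        mass = sum(p[x] for x in cell)
        avg = sum(f[x] * p[x] for x in cell) / mass
        for x in cell:
            f0[x] = avg
    return f0

p = [0.15, 0.05, 0.30, 0.25, 0.10, 0.15]
f = [2.0, -1.0, 4.0, 0.5, 3.0, 1.0]
fine = [[0, 1], [2], [3, 4], [5]]      # generates R1
coarse = [[0, 1, 2], [3, 4, 5]]        # generates R0; R0 is a sub-field of R1

lhs = cond_exp(cond_exp(f, p, fine), p, coarse)   # E_{R0} E_{R1} f
rhs = cond_exp(f, p, coarse)                      # E_{R0} f
assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
```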
Definition 1.1.5: Let ℛ₀ ∈ σℛ and A ∈ ℛ. Then the measurable hull <A>_{ℛ₀} of A is the modulo P uniquely determined ℛ₀-measurable set with the properties:
a) A ⊂ <A>_{ℛ₀} [P];
b) P(<A>_{ℛ₀}) = inf { P(B) | B ⊃ A, B ∈ ℛ₀ }. □

Proposition 1.1.6: Let ℛ₀ ∈ σℛ and A ∈ ℛ. Then

    <A>_{ℛ₀} = {x ∈ X | P_{ℛ₀}(A)(x) > 0} [P].

Proof: NEVEU [15], lemme 1. □
Proposition 1.1.7: Let ℛ₀ ∈ σℛ and let A ∈ ℛ be such that ℛ ∩ A = ℛ₀ ∩ A. Let f be measurable and non-negative. Then

    E_{ℛ₀}(f·χ_A) = P_{ℛ₀}(A)·f   a.e. on A.

Proof: NEVEU [15], lemme 2. □
Proposition 1.1.8: Let ℛ₀ ∈ σℛ, B ∈ ℛ₀ and C ∈ ℛ. Then <B ∩ C>_{ℛ₀} = B ∩ <C>_{ℛ₀} [P].
Proof: First we note that <B ∩ C>_{ℛ₀} ⊂ B ∩ <C>_{ℛ₀} [P]. Furthermore C ⊂ <B ∩ C>_{ℛ₀} ∪ <Bᶜ ∩ C>_{ℛ₀} and hence

    <C>_{ℛ₀} ⊂ <B ∩ C>_{ℛ₀} ∪ <Bᶜ ∩ C>_{ℛ₀}.

Thus, because of <B ∩ C>_{ℛ₀} ⊂ B and <Bᶜ ∩ C>_{ℛ₀} ⊂ Bᶜ, it follows that B ∩ <C>_{ℛ₀} ⊂ <B ∩ C>_{ℛ₀} and the proposition is proved. □
Definition 1.1.9: Let {f_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P). Then {f_n}_{n=1}^∞ is called uniformly integrable if

    lim_{a→∞} sup_{n∈ℕ} ∫_{{|f_n|>a}} |f_n| dP = 0.

The sequence {f_n}_{n=1}^∞ of a.e. finite measurable functions converges in probability to the a.e. finite measurable function f if for any ε > 0 we have lim_{n→∞} P({|f_n − f| > ε}) = 0. □

Proposition 1.1.10: Let {f_n}_{n=1}^∞ be a sequence of a.e. finite measurable functions which converges pointwise a.e. to an a.e. finite measurable function f. Then the sequence {f_n}_{n=1}^∞ converges in probability to f.
Proof: J. NEVEU [13], proposition II.4.3. □
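The two notions in definition 1.1.9 are genuinely different: on ([0,1], Lebesgue) the sequence f_n = n·χ_{[0,1/n]} converges to 0 in probability (even a.e.), but it is not uniformly integrable, and indeed ∫|f_n| dP = 1 for all n. The computation below (plain Python; the values for this particular sequence are exact) makes the failure of uniform integrability explicit.

```python
# f_n = n * indicator([0, 1/n]) on [0,1] with Lebesgue measure.
# Exact values for this sequence:
#   P(|f_n| > eps) = 1/n for 0 < eps < n    -> 0  (convergence in probability)
#   integral of |f_n| over {|f_n| > a} = 1 whenever n > a
# so sup_n of the tail integral is 1 for every a: not uniformly integrable.

def tail_integral(n, a):
    """integral of |f_n| over the set {|f_n| > a}."""
    return 1.0 if n > a else 0.0

def prob_exceeds(n, eps):
    """P({|f_n - 0| > eps}) for 0 < eps < n."""
    return 1.0 / n

for a in [1, 10, 1000]:
    sup_tail = max(tail_integral(n, a) for n in range(1, 10 * a + 2))
    assert sup_tail == 1.0            # tail integrals do not vanish

assert prob_exceeds(10**6, 0.5) == 1e-06   # yet f_n -> 0 in probability
```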
Proposition 1.1.11: Let P and P' be equivalent probabilities on ℛ. Suppose that {f_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P'), f ∈ L₁(X, ℛ, P') and f_n → f in probability P'. Then f_n → f in probability P.
Proof: Let ε > 0; then by proposition 1.1.2 there exists a δ > 0 such that for any A ∈ ℛ with P'(A) < δ we have P(A) < ε. Hence for η > 0 it follows from lim_{n→∞} P'({|f_n − f| > η}) = 0 that lim_{n→∞} P({|f_n − f| > η}) = 0. □
Proposition 1.1.12: Suppose that {g_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P) is uniformly integrable and {f_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P), |f_n| ≤ |g_n|, n = 1, 2, .... Then the sequence {f_n}_{n=1}^∞ is uniformly integrable.
Proof: For any a > 0 we have {|f_n| > a} ⊂ {|g_n| > a} and |f_n| ≤ |g_n|. Hence

    lim_{a→∞} sup_{n∈ℕ} ∫_{{|f_n|>a}} |f_n| dP ≤ lim_{a→∞} sup_{n∈ℕ} ∫_{{|g_n|>a}} |g_n| dP = 0. □

Proposition 1.1.13: Let {f_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P). Then the following two assertions are equivalent:
1) The sequence {f_n}_{n=1}^∞ converges in L₁-norm to f ∈ L₁(X, ℛ, P).
2) The sequence {f_n}_{n=1}^∞ is uniformly integrable and converges in probability to f.
Proof: J. NEVEU [13], proposition II.5.4. □

§2. SEPARABLE σ-FIELDS
Definition 1.2.1: Let {A_n}_{n=1}^∞ be a sequence of measurable subsets of X and suppose that ℛ' ∈ σℛ satisfies the following two conditions:
1) ℛ' ⊃ {A_n}_{n=1}^∞;
2) for all ℛ* ∈ σℛ: ℛ* ⊃ {A_n}_{n=1}^∞ implies ℛ* ⊃ ℛ';
then we say that ℛ' is generated by {A_n}_{n=1}^∞. Sub-σ-fields of ℛ generated by a sequence of measurable subsets of X are called countably generated sub-σ-fields of ℛ. □

In definition 1.2.1 and throughout what follows, two sub-σ-fields ℛ₁ and ℛ₂ of ℛ are identified if for any A₁ ∈ ℛ₁ there exists a set A₂ ∈ ℛ₂ such that P(A₁ △ A₂) = 0, and conversely.

Definition 1.2.2: A (measurable) partition ξ = {A_i}_{i=1}^∞ of X is a countable collection of pairwise disjoint measurable sets with union X. □

We shall identify two partitions ξ and ξ' which have, modulo P-nullsets, the same elements. In order to simplify the terminology we shall also identify a partition and the σ-field generated by it.

Definition 1.2.3: Let ℛ₀ ∈ σℛ. Then ℛ₀ is called separable if there exists a sequence {A_n}_{n=1}^∞ of elements of ℛ₀ with the following property: for any A ∈ ℛ₀ and for every ε > 0 there is a number n₀ such that P(A △ A_{n₀}) < ε. □
Proposition 1.2.4: Let ℛ₀ ∈ σℛ. The following two assertions are equivalent:
a) ℛ₀ is separable;
b) ℛ₀ is countably generated.
Proof: a) ⟹ b) Suppose that ℛ₀ is separable, let {A_n}_{n=1}^∞ be a sequence in ℛ₀ satisfying the condition of definition 1.2.3 and denote by ℛ₀' the sub-σ-field of ℛ generated by {A_n}_{n=1}^∞. For every fixed A ∈ ℛ₀ and for any positive integer k there exists a positive integer n_k such that P(A △ A_{n_k}) < 2^{-k}. Let Ã_k = A_{n_k}, k = 1, 2, ..., and

    A' = lim sup_{k→∞} Ã_k = ∩_{n=1}^∞ ∪_{k=n}^∞ Ã_k;

then A' ∈ ℛ₀' and the sequence {∪_{k=n}^∞ Ã_k}_{n=1}^∞ is monotone decreasing. Hence for any ε > 0 there exists a positive integer N(ε) such that P(A' △ ∪_{k=n}^∞ Ã_k) ≤ ½ε for all n ≥ N(ε). Thus

    P(A △ A') ≤ P(A △ ∪_{k=n}^∞ Ã_k) + P(∪_{k=n}^∞ Ã_k △ A')
             ≤ Σ_{k=n}^∞ P(A △ Ã_k) + ½ε ≤ 2^{-(n-1)} + ½ε < ε

if n is sufficiently large. Since this is true for any ε > 0 we have P(A △ A') = 0 and therefore A ∈ ℛ₀'.
b) ⟹ a) Let ℛ₀ be countably generated and let {A_n}_{n=1}^∞ be a sequence of elements of ℛ₀ generating ℛ₀. Without restriction of generality we may suppose that A₁ = X and that the sequence {A_n}_{n=1}^∞ together with A_n also contains Ā_n = X\A_n for every n. Let 𝒰 be the field of finite unions of finite intersections of elements of the sequence {A_n}_{n=1}^∞. Then 𝒰 contains countably many elements (HALMOS [7], p. 23, theorem C) and is dense in ℛ₀, i.e. for any A ∈ ℛ₀ and every ε > 0 there is a set A' ∈ 𝒰 such that P(A △ A') < ε (HALMOS [7], p. 56, theorem D). □
Proposition 1.2.5: Let ℛ be separable and ℛ₀ ∈ σℛ. Then ℛ₀ is separable.
Proof: There exists a sequence {F_n}_{n=1}^∞ of elements of ℛ such that for any ε > 0 and A ∈ ℛ there is a positive integer n with the property P(A △ F_n) < ε. Let

    𝒰_{n,k} = { A ∈ ℛ₀ | P(F_k △ A) < 1/n }

for n = 1, 2, ... and k = 1, 2, .... If we choose from every non-empty 𝒰_{n,k} one element, we obtain a countable set of elements of ℛ₀, which we arrange in a sequence {A_i}_{i=1}^∞. For any A ∈ ℛ₀ and any positive integer n there is a positive integer k such that P(F_k △ A) < 1/(2n). Because 𝒰_{2n,k} ≠ ∅ there exists an i such that P(F_k △ A_i) < 1/(2n). Hence P(A △ A_i) < 1/n. Because n runs through all positive integers, we have proved the proposition. □
Corollary 1.2.6: Let ℛ be separable and ℛ₀ ∈ σℛ. Then there exists a sequence of finite partitions {ξ_n}_{n=1}^∞ of X such that ℛ₀ = ∨_{n=1}^∞ ξ_n. □
The following propositions will be important in connection with the definition of conditional entropy (cf. §5).

Proposition 1.2.7: Let ℛ₀ ∈ σℛ, ℛ₁ ∈ σℛ and let ξ be a partition such that ℛ₀ ⊂ ℛ₁ ⊂ ℛ₀ ∨ ξ. Then there exists a partition η such that ℛ₀ ∨ η = ℛ₁.
Proof: Let ξ = {A_i}_{i=1}^∞ and let <A_i>_{ℛ₁}, i = 1, 2, ..., be the measurable hulls of the sets A_i with respect to ℛ₁. Let C_i ∈ ℛ₁ with C_i ⊂ <A_i>_{ℛ₁}. It follows from ℛ₁ ∨ ξ = ℛ₀ ∨ ξ that there exists an element D_i ∈ ℛ₀ such that D_i ∩ A_i = C_i ∩ A_i. We see that C_i △ (D_i ∩ <A_i>_{ℛ₁}) ∈ ℛ₁ and C_i △ (D_i ∩ <A_i>_{ℛ₁}) ⊂ <A_i>_{ℛ₁}\A_i; since an ℛ₁-measurable subset of <A_i>_{ℛ₁}\A_i must be a nullset by the minimality of the hull, this implies P(C_i △ (D_i ∩ <A_i>_{ℛ₁})) = 0. Hence C_i = D_i ∩ <A_i>_{ℛ₁} [P] and

    ℛ₁ ∩ <A_i>_{ℛ₁} = ℛ₀ ∩ <A_i>_{ℛ₁},   i = 1, 2, ....

Define E₁ = <A₁>_{ℛ₁} and E_n = <A_n>_{ℛ₁} \ ∪_{i=1}^{n-1} E_i, n ≥ 2, and let η = {E_i}_{i=1}^∞. Then ℛ₀ ∨ η = ℛ₁ and the assertion is proved. □
The following two propositions can be found as proposition 1) in NEVEU [14].

Proposition 1.2.8: Let ℛ₀ ∈ σℛ and let ξ = {A_i}_{i=1}^∞ be a partition of X. Let f ∈ L₁(X, ℛ, P), f ≥ 0. Then

    E_{ℛ₀ ∨ ξ} f = Σ_{i=1}^∞ [ E_{ℛ₀}(χ_{A_i}·f) / P_{ℛ₀}(A_i) ] χ_{A_i}   a.e.

Proof: Because of proposition 1.1.6 the right side of the equality is defined a.e. Furthermore we remark that both sides of the equality are ℛ₀ ∨ ξ-measurable functions on X. Let B ∈ ℛ₀ ∨ ξ; then for any i there exists a C_i ∈ ℛ₀ such that C_i ∩ A_i = B ∩ A_i. Hence we have

    ∫_B Σ_{i=1}^∞ [E_{ℛ₀}(χ_{A_i}·f)/P_{ℛ₀}(A_i)] χ_{A_i} dP
      = Σ_{i=1}^∞ ∫_{C_i} E_{ℛ₀}( [E_{ℛ₀}(χ_{A_i}·f)/P_{ℛ₀}(A_i)] χ_{A_i} ) dP
      = Σ_{i=1}^∞ ∫_{C_i} E_{ℛ₀}(χ_{A_i}·f) dP
      = Σ_{i=1}^∞ ∫_{A_i ∩ C_i} f dP = ∫_B f dP = ∫_B E_{ℛ₀ ∨ ξ} f dP. □

Proposition 1.2.9: Let ξ = {A_i}_{i=1}^∞ be a countable partition and let ℛ₀ ∈ σℛ. Then for any A ∈ ℛ₀ ∨ ξ there exists a uniquely determined sequence {B_i}_{i=1}^∞ ⊂ ℛ₀ such that B_i ⊂ <A_i>_{ℛ₀}, i = 1, 2, ..., and such that

    A = ∪_{i=1}^∞ (A_i ∩ B_i).

Proof: Let 𝒰 = { A ∈ ℛ | A = ∪_{i=1}^∞ (A_i ∩ B_i), B_i ∈ ℛ₀, B_i ⊂ <A_i>_{ℛ₀}, i = 1, 2, ... }. Then 𝒰 ∈ σℛ and 𝒰 ⊂ ℛ₀ ∨ ξ. Furthermore we have ξ ⊂ 𝒰 and ℛ₀ ⊂ 𝒰 and hence 𝒰 = ℛ₀ ∨ ξ.
Now let ∪_{i=1}^∞ (A_i ∩ B_i') = ∪_{i=1}^∞ (A_i ∩ B_i'') with B_i', B_i'' ∈ ℛ₀, B_i' ⊂ <A_i>_{ℛ₀}, B_i'' ⊂ <A_i>_{ℛ₀}, i = 1, 2, .... Then A_i ∩ B_i' = A_i ∩ B_i'', i = 1, 2, ..., and hence by proposition 1.1.8:

    B_i' = <A_i>_{ℛ₀} ∩ B_i' = <A_i ∩ B_i'>_{ℛ₀} = <A_i ∩ B_i''>_{ℛ₀} = <A_i>_{ℛ₀} ∩ B_i'' = B_i'',   i = 1, 2, ...,

and the proposition is proved. □
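For finite partitions the decomposition formula of proposition 1.2.8 can be verified by direct computation: on each A_i the conditional expectation given ℛ₀ ∨ ξ is the quotient of two ℛ₀-conditional expectations. A small sketch (plain Python, hypothetical data):

```python
# Verify E_{R0 v xi} f = sum_i E_{R0}(chi_{A_i} f) / P_{R0}(A_i) * chi_{A_i}
# on a finite probability space, with R0 and xi generated by finite partitions.

def cond_exp(f, p, partition):
    f0 = [0.0] * len(f)
    for cell in partition:
        mass = sum(p[x] for x in cell)
        avg = sum(f[x] * p[x] for x in cell) / mass
        for x in cell:
            f0[x] = avg
    return f0

def refine(part1, part2):
    """common refinement: the partition generating part1 v part2."""
    return [sorted(set(c) & set(d)) for c in part1 for d in part2
            if set(c) & set(d)]

p = [0.10, 0.15, 0.05, 0.20, 0.25, 0.25]
f = [1.0, 4.0, 2.0, 0.0, 3.0, 5.0]
r0 = [[0, 1, 2], [3, 4, 5]]            # generates R0
xi = [[0, 3], [1, 4], [2, 5]]          # the partition xi

lhs = cond_exp(f, p, refine(r0, xi))   # E_{R0 v xi} f

rhs = [0.0] * len(f)
for cell in xi:                        # one term of the sum per A_i
    chi_f = [f[x] if x in cell else 0.0 for x in range(len(f))]
    chi = [1.0 if x in cell else 0.0 for x in range(len(f))]
    num = cond_exp(chi_f, p, r0)       # E_{R0}(chi_{A_i} f)
    den = cond_exp(chi, p, r0)         # P_{R0}(A_i)
    for x in cell:
        rhs[x] = num[x] / den[x]

assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
```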
§3. L₁-PRESERVING TRANSFORMATIONS

In this section, as in all that follows, T will be a measurable transformation on (X, ℛ, P), i.e. a transformation with the property T⁻¹ℛ ⊂ ℛ. If ℛ₀ ∈ σℛ and if f is any ℛ₀-measurable function, then Tf is the T⁻¹ℛ₀-measurable function defined by (Tf)(x) = f(Tx).
Proposition 1.3.1: Let, for ℛ₁ ∈ σℛ, the conditional expectations E^P_{ℛ₁} and E^{PT⁻¹}_{ℛ₁} be taken with respect to the probabilities P and PT⁻¹ respectively. Then for any ℛ₁ ∈ σℛ and for any f ≥ 0 we have a.e.

    E^P_{T⁻¹ℛ₁}(Tf) = T( E^{PT⁻¹}_{ℛ₁} f ).

In particular, choosing ℛ₁ = {∅, X}, we have

    ∫_X Tf dP = ∫_X f dPT⁻¹.

Proof: Both T(E^{PT⁻¹}_{ℛ₁} f) and E^P_{T⁻¹ℛ₁}(Tf) are T⁻¹ℛ₁-measurable functions on X. Suppose that A ∈ ℛ₁ and f = χ_A; then

    ∫_{T⁻¹A} Tχ_A dP = P(T⁻¹A) = PT⁻¹(A) = ∫_A χ_A dPT⁻¹.

Hence by monotone approximation, for any f ≥ 0 and A ∈ ℛ₁ it follows that

    ∫_{T⁻¹A} Tf dP = ∫_A f dPT⁻¹.

Now let B = T⁻¹A, A ∈ ℛ₁. Then we have

    ∫_B T(E^{PT⁻¹}_{ℛ₁} f) dP = ∫_A E^{PT⁻¹}_{ℛ₁} f dPT⁻¹ = ∫_A f dPT⁻¹ = ∫_B Tf dP = ∫_B E^P_{T⁻¹ℛ₁}(Tf) dP. □
Definition 1.3.2: The transformation T is called P-preserving if for any A ∈ ℛ we have P(T⁻¹A) = P(A). The transformation T is called L₁-preserving if f ∈ L₁(X, ℛ, P) if and only if Tf ∈ L₁(X, T⁻¹ℛ, P). The transformation T is called non-singular if for any A ∈ ℛ we have P(A) = 0 if and only if P(T⁻¹A) = 0. The transformation T is called invertible if T is a one-to-one transformation of X onto X with the property Tℛ = ℛ. □
The following proposition is a slight modification of a lemma due to DUNFORD-MILLER [4], p. 539, lemma 1 (cf. also DUNFORD-SCHWARTZ [5], p. 664, lemma 7).

Proposition 1.3.3: The following statements are equivalent:
a) T is L₁-preserving;
b) the probabilities P and PT⁻¹ on ℛ are equivalent and there exists a positive integer K such that 1/K ≤ dPT⁻¹/dP ≤ K;
c) there exists a positive integer K such that for any f ∈ L₁(X, ℛ, P)

    (1/K) ∫_X |f| dP ≤ ∫_X |Tf| dP ≤ K ∫_X |f| dP.

Proof: a) ⟹ b) Let A ∈ ℛ, P(A) = 0 and P(T⁻¹A) > 0. Define f = ∞ on A and f = 0 on Aᶜ; then ∫_X f dP = 0 and by proposition 1.3.1

    ∫_X Tf dP = ∫_X f dPT⁻¹ = ∞.

This is in contradiction with a). Similarly it follows from P(T⁻¹A) = 0 that P(A) = 0; hence P and PT⁻¹ are equivalent.
Let A_n = {x ∈ X | n < (dPT⁻¹/dP)(x) ≤ n+1}. Suppose that there exists an increasing sequence {n_k}_{k=1}^∞ of positive integers such that P(A_{n_k}) > 0, k = 1, 2, .... Let a_k = 1/(k·n_k·P(A_{n_k})), k = 1, 2, ..., and f = Σ_{k=1}^∞ a_k χ_{A_{n_k}}. Then

    ∫_X f dP = Σ_{k=1}^∞ a_k P(A_{n_k}) = Σ_{k=1}^∞ 1/(k·n_k) < ∞

(note that n_k ≥ k), and by proposition 1.3.1

    ∫_X Tf dP = ∫_X f (dPT⁻¹/dP) dP ≥ Σ_{k=1}^∞ a_k n_k P(A_{n_k}) = Σ_{k=1}^∞ 1/k = ∞.

This is in contradiction with a), and hence there exists a positive integer K₁ such that dPT⁻¹/dP ≤ K₁. Similarly we prove that there exists a positive integer K₂ such that dP/dPT⁻¹ ≤ K₂. Choose K = max(K₁, K₂); then assertion b) is proved.
b) ⟹ c) Let f ∈ L₁(X, ℛ, P); then by proposition 1.3.1

    ∫_X |Tf| dP = ∫_X |f| (dPT⁻¹/dP) dP ≤ K ∫_X |f| dP

and

    ∫_X |Tf| dP = ∫_X |f| (dPT⁻¹/dP) dP ≥ (1/K) ∫_X |f| dP.

c) ⟹ a) This assertion is trivial. □
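On a finite probability space an invertible map is automatically L₁-preserving, and a constant K as in the proposition can be computed as the largest ratio between the densities of PT⁻¹ and P. A small numerical sketch (plain Python, hypothetical data) checking the inequalities of part c):

```python
# X = {0,...,4}, T a permutation of X. PT^{-1}({x}) = P(T^{-1}{x}), so the
# density is d(x) = p[Tinv(x)] / p[x]; K = max(max d, max 1/d) works in c).
import random

p = [0.05, 0.10, 0.20, 0.25, 0.40]
T = [2, 0, 3, 4, 1]                 # T(x) = T[x], a permutation
Tinv = [T.index(x) for x in range(len(p))]

d = [p[Tinv[x]] / p[x] for x in range(len(p))]   # dPT^{-1}/dP
K = max(max(d), max(1.0 / v for v in d))

random.seed(0)
for _ in range(100):
    f = [random.uniform(-5, 5) for _ in range(len(p))]
    int_f = sum(abs(v) * q for v, q in zip(f, p))
    int_Tf = sum(abs(f[T[x]]) * p[x] for x in range(len(p)))  # |Tf| = |f o T|
    assert int_f / K - 1e-12 <= int_Tf <= K * int_f + 1e-12
```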
Proposition 1.3.4: Suppose that TX = X. Then for any A ∈ T⁻¹ℛ we have TA ∈ ℛ.
Proof: Let A = T⁻¹B, B ∈ ℛ. Then TA = B ∩ TX = B ∈ ℛ. □

Let Q be a probability on (X, ℛ) such that P and Q are equivalent. If not stated otherwise we shall use the notation dQ/dP for the ℛ-measurable density of Q with respect to P, i.e. for any A ∈ ℛ we have ∫_A (dQ/dP) dP = Q(A). The existence of dQ/dP follows by the theorem of RADON-NIKODYM (HALMOS [7], p. 128, theorem B). Let T be L₁-preserving with the property TX = X. Then because of proposition 1.3.3 the probabilities P and PTⁿ on T⁻ⁿℛ are equivalent.
In the rest of this section we shall suppose that TX = X and that T is an L₁-preserving transformation on (X, ℛ, P).
From DOWKER [3] we adopt the following definition 1.3.5, which will serve to formulate the HUREWICZ version of the individual ergodic theorem in the following section. We shall also collect for later use some properties of the functions wₙ', n = 0, 1, 2, ..., which will be defined presently.

Definition 1.3.5: For any positive integer n the T⁻ⁿℛ-measurable function wₙ' is defined by wₙ' = dPTⁿ/dP. Furthermore we define w₀' = 1. □
Proposition 1.3.6: The sequence {wₙ'}_{n=0}^∞ has the following properties:
a) wₙ' = E_{T⁻ⁿℛ}(Tⁿ⁻¹w₁' ··· Tw₁'·w₁'), n ≥ 1;
b) ∫_X Tⁿf·wₙ' dP = ∫_X f dP, n ≥ 0, f ∈ L₁(X, ℛ, P);
c) if ρ = dPT⁻¹/dP, then Tρ·w₁' = 1.

Proof: First we show that for any f ∈ L₁(X, ℛ, P) and n ≥ 0 we have

    ∫_X Tⁿf · Tⁿ⁻¹w₁' ··· Tw₁'·w₁' dP = ∫_X f dP.    (1)

Let f = χ_A, A ∈ ℛ and n = 1. Then

    ∫_X Tf·w₁' dP = ∫_X χ_{T⁻¹A} dPT = P(T T⁻¹A) = P(A) = ∫_X f dP.

Hence for n = 1 the assertion is true for every step function and also for any f ∈ L₁(X, ℛ, P). For n > 1 it follows by induction that

    ∫_X Tⁿf · Tⁿ⁻¹w₁' ··· Tw₁'·w₁' dP = ∫_X T(Tⁿ⁻¹f · Tⁿ⁻²w₁' ··· w₁')·w₁' dP = ∫_X Tⁿ⁻¹f · Tⁿ⁻²w₁' ··· w₁' dP = ∫_X f dP.

To prove the assertion of a) we remark that wₙ' and E_{T⁻ⁿℛ}(Tⁿ⁻¹w₁' ··· Tw₁'·w₁') are both T⁻ⁿℛ-measurable functions. Let A = T⁻ⁿB, B ∈ ℛ. Then

    ∫_A wₙ' dP = PTⁿ(A) = P(B)

and by (1)

    ∫_A E_{T⁻ⁿℛ}(Tⁿ⁻¹w₁' ··· Tw₁'·w₁') dP = ∫_X Tⁿχ_B · Tⁿ⁻¹w₁' ··· Tw₁'·w₁' dP = P(B),

and assertion a) is proved.
To prove b), let f ∈ L₁(X, ℛ, P). Then by assertion a), proposition 1.1.4 a) and formula (1) we have

    ∫_X Tⁿf·wₙ' dP = ∫_X Tⁿf · E_{T⁻ⁿℛ}(Tⁿ⁻¹w₁' ··· Tw₁'·w₁') dP = ∫_X Tⁿf · Tⁿ⁻¹w₁' ··· Tw₁'·w₁' dP = ∫_X f dP.

To prove c) we remark that Tρ·w₁' is T⁻¹ℛ-measurable. Now let A ∈ ℛ; then by (1)

    ∫_{T⁻¹A} Tρ·w₁' dP = ∫_X T(χ_A·ρ)·w₁' dP = ∫_X χ_A·ρ dP = ∫_A ρ dP = PT⁻¹(A) = P(T⁻¹A) = ∫_{T⁻¹A} 1 dP.

Hence assertion c) is also proved. □
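On a finite space the identity of part b), taken with the product weights of formula (1), is an exact telescoping: for an invertible T one has w_n(x) = P({Tⁿx})/P({x}), so Tⁿf·wₙ integrates to ∫f dP. A small sketch (plain Python, hypothetical data):

```python
# For an invertible T on a finite space, w1(x) = P({Tx})/P({x}) and the
# product chain w_n(x) = w1(T^{n-1}x)...w1(x) telescopes to P({T^n x})/P({x}),
# so that  sum_x f(T^n x) w_n(x) p(x) = sum_y f(y) p(y)  exactly.

p = [0.1, 0.3, 0.2, 0.4]
T = [1, 2, 3, 0]                       # cyclic permutation
f = [5.0, -2.0, 7.0, 1.0]

def w1(x):
    return p[T[x]] / p[x]

def w(n, x):
    """w_n(x) = product of w1 along the orbit segment x, Tx, ..., T^{n-1}x."""
    out = 1.0
    for _ in range(n):
        out *= w1(x)
        x = T[x]
    return out

def Tn(x, n):
    for _ in range(n):
        x = T[x]
    return x

int_f = sum(f[x] * p[x] for x in range(4))
for n in range(8):
    lhs = sum(f[Tn(x, n)] * w(n, x) * p[x] for x in range(4))
    assert abs(lhs - int_f) < 1e-12
```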
Remark 1.3.7: For n ≥ 1 we denote wₙ := Tⁿ⁻¹w₁' ··· Tw₁'·w₁', and w₀ := 1. □

§4. THE INDIVIDUAL ERGODIC THEOREM
We shall later make use of two versions of the individual ergodic theorem. The first one is the version dealing with a measure preserving transformation, often called BIRKHOFF's ergodic theorem. It admits as conclusion a.s. convergence as well as L₁-norm convergence. The second one deals with a transformation which is L₁-preserving and satisfies TX = X. It admits as conclusion a.s. convergence and is called HUREWICZ' ergodic theorem. Both versions can be deduced from the general ergodic theorem of CHACON-ORNSTEIN [2] (cf. also NEVEU [13], V.6).
The sub-σ-field of all T-invariant measurable subsets of ℛ will be denoted by ℛ_T, i.e. A ∈ ℛ_T iff A ∈ ℛ and T⁻¹A = A [P]. The transformation T is called ergodic if ℛ_T = {∅, X}.
Theorem 1.4.1 (BIRKHOFF): Let T be P-preserving and let f ∈ L₁(X, ℛ, P). Then we have a.e. and in L₁-norm

    lim_{n→∞} (1/n) Σ_{k=0}^{n-1} T^k f = E_{ℛ_T} f.

Consequently, if T is ergodic, we have a.e. and in L₁-norm

    lim_{n→∞} (1/n) Σ_{k=0}^{n-1} T^k f = ∫_X f dP.

Proof: J. NEVEU [13], V.6 (Corollary). □
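For the ergodic case the theorem can be watched at work numerically: an irrational rotation of the circle preserves Lebesgue measure and is ergodic, so the time averages of an integrable f tend to its space average. A small sketch (plain Python; the rotation angle and test function are of course just an example):

```python
# Birkhoff averages for the irrational rotation T(x) = x + alpha (mod 1)
# on [0,1) with Lebesgue measure; T is ergodic, so for f(x) = cos(2 pi x)
# the averages (1/n) sum_{k<n} f(T^k x) tend to the integral of f, which is 0.
import math

alpha = (math.sqrt(5) - 1) / 2          # golden rotation, irrational

def birkhoff_average(f, x, n):
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / n

f = lambda x: math.cos(2 * math.pi * x)
avg = birkhoff_average(f, 0.123, 20000)
assert abs(avg) < 1e-3                   # space average of f is 0
```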
Corollary 1.4.2: Let {f_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P), sup_n |f_n| ∈ L₁(X, ℛ, P) and f_n → f a.e., and therefore also in L₁-norm. Then

    lim_{n→∞} (1/n) Σ_{k=1}^{n} T^{n-k} f_k = E_{ℛ_T} f   a.e. and in L₁-norm.

Proof: PARRY [18], formulas 2.8 and 2.9, p. 21-23. □
Corollary 1.4.3: Let T be invertible and f ∈ L₁(X, ℛ, P). Then a.e. and in L₁-norm

    lim_{n→∞} (1/n) Σ_{i=0}^{n-1} T^{-i} f = E_{ℛ_T} f.

Proof: Because of ℛ_T = ℛ_{T⁻¹} we have a.e. and in L₁-norm

    lim_{n→∞} (1/n) Σ_{i=0}^{n-1} T^{-i} f = E_{ℛ_{T⁻¹}} f = E_{ℛ_T} f. □

The following theorem, which is a slight modification of HUREWICZ' theorem, is given by PARRY [17] and can be proved in a similar way as is done by DOWKER [3]. The conclusion that the limit function is E_{ℛ_T} f follows by the CHACON-ORNSTEIN ergodic theorem (NEVEU [13], V.6). Note that, under the hypotheses of theorem 1.4.4, fT := Tf·w₁ defines a conservative Markov endomorphism of L₁(X, ℛ, P) which admits ℛ_T as class of invariant sets.

Theorem 1.4.4 (HUREWICZ): Let T be L₁-preserving with the property TX = X and let {wₙ}_{n=0}^∞ be defined as in remark 1.3.7. Suppose that f ∈ L₁(X, ℛ, P) and Σ_{n=0}^∞ wₙ(x) = ∞ a.e. Then a.e. we have

    lim_{n→∞} [ Σ_{i=0}^n T^i f·w_i ] / [ Σ_{i=0}^n w_i ] = E_{ℛ_T} f.

Consequently, if T is ergodic, we have a.e.

    lim_{n→∞} [ Σ_{i=0}^n T^i f·w_i ] / [ Σ_{i=0}^n w_i ] = ∫_X f dP.

Notice that in the case that T is P-preserving we have w_i(x) = 1 a.e., i = 0, 1, 2, .... Hence under the condition TX = X, HUREWICZ' theorem generalizes the pointwise part of theorem 1.4.1.
Corollary 1.4.5: Let T be L₁-preserving with the property TX = X. Let {f_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P) and suppose that sup_n |f_n| ∈ L₁(X, ℛ, P) and that the sequence {f_n}_{n=1}^∞ converges a.e., and hence in L₁-norm, to f ∈ L₁(X, ℛ, P). Let {wₙ}_{n=0}^∞ be defined as before and suppose that Σ_{n=0}^∞ wₙ(x) = ∞ a.e. Then

    lim_{n→∞} [ Σ_{i=0}^n T^{n-i} f_i·w_{n-i} ] / [ Σ_{i=0}^n w_i ] = E_{ℛ_T} f   a.e. □
In the proof of this corollary we shall need a lemma due to CHACON-ORNSTEIN.

Lemma 1.4.6: For any g ∈ L₁(X, ℛ, P), g ≥ 0, and any positive integer N we have a.e.

    lim_{n→∞} [Tⁿ⁻ᴺ g · w_{n-N}] / [ Σ_{i=0}^n w_i ] = 0.

Proof: CHACON-ORNSTEIN [2], lemma 2. □

Proof of corollary 1.4.5: Put F₀ = sup_m |f_m − f| and F_N = sup_{m≥N} |f_m − f|, N = 1, 2, ...; then F_N ∈ L₁(X, ℛ, P) and {F_N}_{N=1}^∞ is monotone decreasing with lim_{N→∞} F_N = 0 a.e. First we remark that, using w_{n-i} = Tⁿ⁻ᴺ(w_{N-i})·w_{n-N} for 0 ≤ i ≤ N,

    lim sup_{n→∞} | Σ_{i=0}^n [Tⁿ⁻ⁱ f_i]w_{n-i} − Σ_{i=0}^n [Tⁱ f]w_i | / Σ_{i=0}^n w_i
    = lim sup_{n→∞} | Σ_{i=0}^n [Tⁿ⁻ⁱ(f_i − f)]w_{n-i} | / Σ_{i=0}^n w_i
    ≤ lim sup_{n→∞} Σ_{i=0}^n [Tⁿ⁻ⁱ|f_i − f|]w_{n-i} / Σ_{i=0}^n w_i
    = lim sup_{n→∞} Σ_{i=0}^n [Tⁱ|f_{n-i} − f|]w_i / Σ_{i=0}^n w_i
    ≤ lim sup_{n→∞} Σ_{i=0}^{n-N} [Tⁱ|f_{n-i} − f|]w_i / Σ_{i=0}^n w_i
      + lim sup_{n→∞} Σ_{i=0}^{N-1} [Tⁿ⁻ⁱ|f_i − f|]w_{n-i} / Σ_{i=0}^n w_i
    ≤ lim sup_{n→∞} Σ_{i=0}^{n} [Tⁱ F_N]w_i / Σ_{i=0}^n w_i
      + lim sup_{n→∞} [Tⁿ⁻ᴺ( Σ_{i=1}^N [Tⁱ F₀]w_i )]·w_{n-N} / Σ_{i=0}^n w_i.

Now we have, because of lemma 1.4.6 applied to g = Σ_{i=1}^N [Tⁱ F₀]w_i ∈ L₁(X, ℛ, P), that

    lim_{n→∞} [Tⁿ⁻ᴺ( Σ_{i=1}^N [Tⁱ F₀]w_i )]·w_{n-N} / Σ_{i=0}^n w_i = 0 a.e.

Furthermore, by HUREWICZ' theorem,

    lim sup_{n→∞} Σ_{i=0}^{n} [Tⁱ F_N]w_i / Σ_{i=0}^n w_i = E_{ℛ_T} F_N a.e.,

where

    ∫_X E_{ℛ_T} F_N dP = ∫_X F_N dP → 0 for N → ∞

by dominated convergence. Since {F_N}_{N=1}^∞ is monotone decreasing, so is {E_{ℛ_T} F_N}_{N=1}^∞, and hence

    lim_{N→∞} E_{ℛ_T} F_N = 0 a.e.

The corollary then follows by HUREWICZ' theorem. □
§5. CONDITIONAL INFORMATION AND ENTROPY

Definition 1.5.1: Let ξ = {A_i}_{i=1}^∞ be a partition of X and ℛ₀ ∈ σℛ. Then the conditional information of ξ with respect to ℛ₀ is a.e. defined by

    I(ξ|ℛ₀) := − Σ_{i=1}^∞ χ_{A_i} log P_{ℛ₀}(A_i),

where P_{ℛ₀}(A_i) is the conditional probability of A_i with respect to ℛ₀. □
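For finite partitions on a finite space this is a direct computation: at a point x lying in A_i and in the ℛ₀-cell C, the value of I(ξ|ℛ₀) is −log(P(A_i ∩ C)/P(C)); with the trivial ℛ₀ its integral is the Shannon entropy of ξ. A small sketch (plain Python, hypothetical data, natural logarithm):

```python
# I(xi | R0)(x) = -log P_{R0}(A_i)(x) for x in A_i, where R0 is generated
# by a finite partition: P_{R0}(A_i) = P(A_i n C)/P(C) on the R0-cell C.
import math

def cond_information(p, xi, r0):
    info = [0.0] * len(p)
    for cell_r0 in r0:
        pc = sum(p[x] for x in cell_r0)
        for a in xi:
            inter = [x for x in a if x in cell_r0]
            if inter:
                val = -math.log(sum(p[x] for x in inter) / pc)
                for x in inter:
                    info[x] = val
    return info

p = [0.125, 0.125, 0.25, 0.5]
xi = [[0, 2], [1, 3]]
trivial = [[0, 1, 2, 3]]

info = cond_information(p, xi, trivial)
H_xi = sum(info[x] * p[x] for x in range(4))      # = Shannon entropy of xi
cells = [sum(p[x] for x in a) for a in xi]        # cell masses 0.375, 0.625
assert abs(H_xi + sum(q * math.log(q) for q in cells)) < 1e-12
```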
Proposition 1.5.2: Let ξ and η be partitions of X. Let ℛ₀ ∈ σℛ, ℛ₁ ∈ σℛ. Then:
a) I(ξ ∨ η|ℛ₀) = I(ξ|ℛ₀) + I(η|ℛ₀ ∨ ξ);
b) I(ξ|ℛ₀) ≤ I(η|ℛ₀) if ξ ⊂ η;
c) E_{ℛ₀} I(ξ|ℛ₁) ≤ E_{ℛ₀} I(ξ|ℛ₀) if ℛ₀ ⊂ ℛ₁;
d) I(ξ|ℛ₀) = 0 iff ξ ⊂ ℛ₀. □

Remark 1.5.3: Let

    z(x) = −x log x for 0 < x ≤ 1,   z(0) = 0;

then we have by proposition 1.1.4:

    E_{ℛ₀} I(ξ|ℛ₀) = − Σ_{i=1}^∞ log P_{ℛ₀}(A_i) · E_{ℛ₀}(χ_{A_i}) = Σ_{i=1}^∞ z(P_{ℛ₀}(A_i)). □
Proof of proposition 1.5.2: Let ξ = {A_i}_{i=1}^∞ and η = {B_j}_{j=1}^∞.
a) By proposition 1.2.8 we have

    I(ξ ∨ η|ℛ₀) = − Σ_{i=1}^∞ Σ_{j=1}^∞ χ_{A_i∩B_j} log P_{ℛ₀}(A_i ∩ B_j)
    = − Σ_{i=1}^∞ Σ_{j=1}^∞ χ_{A_i∩B_j} log P_{ℛ₀}(A_i) − Σ_{i=1}^∞ Σ_{j=1}^∞ χ_{A_i∩B_j} log [ P_{ℛ₀}(A_i ∩ B_j) / P_{ℛ₀}(A_i) ]
    = − Σ_{i=1}^∞ χ_{A_i} log P_{ℛ₀}(A_i) − Σ_{j=1}^∞ χ_{B_j} Σ_{i=1}^∞ χ_{A_i} log [ P_{ℛ₀}(A_i ∩ B_j) / P_{ℛ₀}(A_i) ]
    = I(ξ|ℛ₀) − Σ_{j=1}^∞ χ_{B_j} log P_{ℛ₀ ∨ ξ}(B_j)
    = I(ξ|ℛ₀) + I(η|ℛ₀ ∨ ξ),

since by proposition 1.2.8 P_{ℛ₀ ∨ ξ}(B_j) = Σ_{i=1}^∞ [ P_{ℛ₀}(A_i ∩ B_j) / P_{ℛ₀}(A_i) ] χ_{A_i}.
b) This is an immediate consequence of a) and the positivity of conditional information.
c) By JENSEN's inequality (BILLINGSLEY [1], p. 112, formula 10.10) and the concavity of the function z(x) mentioned in remark 1.5.3 we have by proposition 1.1.4

    E_{ℛ₀} I(ξ|ℛ₀) = Σ_{i=1}^∞ z(P_{ℛ₀}(A_i)) = Σ_{i=1}^∞ z(E_{ℛ₀} P_{ℛ₁}(A_i)) ≥ E_{ℛ₀} Σ_{i=1}^∞ z(P_{ℛ₁}(A_i)) = E_{ℛ₀} I(ξ|ℛ₁).

d) Suppose that ξ ⊂ ℛ₀; then for any i we have P_{ℛ₀}(A_i) = χ_{A_i}. Hence

    I(ξ|ℛ₀) = − Σ_{i=1}^∞ χ_{A_i} log P_{ℛ₀}(A_i) = 0 a.e.

Conversely, let I(ξ|ℛ₀) = 0 a.e.; then P_{ℛ₀}(A_i) = 1 a.e. on A_i, and hence on <A_i>_{ℛ₀}. It follows that

    P(<A_i>_{ℛ₀}) = ∫_{<A_i>_{ℛ₀}} P_{ℛ₀}(A_i) dP = ∫_{<A_i>_{ℛ₀}} χ_{A_i} dP = P(A_i).

Hence A_i ∈ ℛ₀ and the proposition is proved. □
Proposition 1.5.4: Let ξ be a partition of X and let {ℛ_n}_{n=1}^∞ ⊂ σℛ be an increasing sequence. Then

    I( ξ | ∨_{n=1}^∞ ℛ_n ) = lim_{n→∞} I(ξ|ℛ_n) a.e.

Proof: PARRY [18], p. 16, theorem 2.2 i). □

Proposition 1.5.5: Let ξ and η be partitions of X and let ℛ₀ ∈ σℛ. Suppose that ξ ∨ ℛ₀ = η ∨ ℛ₀. Then we have a.e. I(ξ|ℛ₀) = I(η|ℛ₀).
Proof: Replacing η by ξ ∨ η if necessary, we may suppose that ξ ⊂ η. Hence by proposition 1.5.2 a) and d) we have

    I(η|ℛ₀) = I(ξ|ℛ₀) + I(η|ℛ₀ ∨ ξ) = I(ξ|ℛ₀),

since η ⊂ η ∨ ℛ₀ = ξ ∨ ℛ₀ ⊂ ℛ₀ ∨ ξ. □

Definition 1.5.6: Let ℛ₀ ∈ σℛ, ℛ₁ ∈ σℛ and let ξ be a partition with the property ξ ∨ ℛ₀ = ℛ₁ ∨ ℛ₀. Then we define a.e.

    I(ℛ₁|ℛ₀) := I(ξ|ℛ₀). □

Remark 1.5.7: It follows immediately by proposition 1.5.5 that definition 1.5.6 is independent of the choice of the partition ξ. Now let ℛ₀ ∈ σℛ, ℛ₁ ∈ σℛ, ℛ₂ ∈ σℛ, ℛ₁ ⊂ ℛ₂, and let ξ be a partition with the property ξ ∨ ℛ₁ = ℛ₀ ∨ ℛ₁. Then it is easy to see that ξ ∨ ℛ₂ = ℛ₀ ∨ ℛ₂. Hence also the assertions of the propositions 1.5.2 and 1.5.4 remain true in this more general case, as long as all the conditional informations mentioned are defined.

Definition 1.5.8: Let ℛ₀ ∈ σℛ and ℛ₁ ∈ σℛ and suppose that there exists a partition ξ = {A_i}_{i=1}^∞ of X such that ξ ∨ ℛ₀ = ℛ₁ ∨ ℛ₀. Then the conditional entropy of ℛ₁ with respect to ℛ₀ is defined by

    H(ℛ₁|ℛ₀) := ∫_X I(ℛ₁|ℛ₀) dP.

If there does not exist such a partition ξ we define H(ℛ₁|ℛ₀) := ∞. □

Remark 1.5.9: This definition has been given by J. NEVEU and A. HANEN (NEVEU [14], [15], HANEN et NEVEU [8]), and they have shown that this definition is equivalent with other definitions as given in JACOBS [10], ROKHLIN [21]. □
Proposition 1.5.10: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$, $\mathcal{R}_1 \in \sigma\mathcal{R}$ and $\mathcal{R}_2 \in \sigma\mathcal{R}$. Then we have:

a) $H(\mathcal{R}_1 \vee \mathcal{R}_2 \mid \mathcal{R}_0) = H(\mathcal{R}_1 \mid \mathcal{R}_0) + H(\mathcal{R}_2 \mid \mathcal{R}_0 \vee \mathcal{R}_1)$;
b) if $\mathcal{R}_1 \subset \mathcal{R}_2$, then $H(\mathcal{R}_1 \mid \mathcal{R}_0) \leq H(\mathcal{R}_2 \mid \mathcal{R}_0)$;
c) if $\mathcal{R}_1 \subset \mathcal{R}_2$, then $H(\mathcal{R}_0 \mid \mathcal{R}_2) \leq H(\mathcal{R}_0 \mid \mathcal{R}_1)$;
d) $H(\mathcal{R}_1 \mid \mathcal{R}_0) = 0$ if and only if $\mathcal{R}_1 \subset \mathcal{R}_0$.

Proof: a) Suppose that there does not exist a partition $\xi$ such that $\mathcal{R}_0 \vee \xi = \mathcal{R}_0 \vee \mathcal{R}_1 \vee \mathcal{R}_2$, hence $H(\mathcal{R}_1 \vee \mathcal{R}_2 \mid \mathcal{R}_0) = \infty$. Then either $H(\mathcal{R}_1 \mid \mathcal{R}_0) = \infty$ or $H(\mathcal{R}_2 \mid \mathcal{R}_0 \vee \mathcal{R}_1) = \infty$, and the assertion is proved. If there does exist a partition $\eta$ with $\mathcal{R}_0 \vee \eta = \mathcal{R}_0 \vee \mathcal{R}_1 \vee \mathcal{R}_2$, then by proposition 1.2.7 there exist partitions $\xi$ and $\zeta$ such that $\mathcal{R}_0 \vee \xi = \mathcal{R}_0 \vee \mathcal{R}_1$ and $\mathcal{R}_0 \vee \mathcal{R}_1 \vee \zeta = \mathcal{R}_0 \vee \mathcal{R}_1 \vee \mathcal{R}_2$, and hence $\mathcal{R}_0 \vee \xi \vee \zeta = \mathcal{R}_0 \vee \mathcal{R}_1 \vee \mathcal{R}_2$. The assertion now follows by proposition 1.5.2 and remark 1.5.7.
b) Substitute in a) $\mathcal{R}_1 \vee \mathcal{R}_2 = \mathcal{R}_2$ and note that the conditional entropy is non-negative.
c) In connection with remark 1.5.7 we may suppose that there exists a partition $\xi$ such that $\xi \vee \mathcal{R}_1 = \mathcal{R}_0 \vee \mathcal{R}_1$, and hence, because of $\mathcal{R}_1 \subset \mathcal{R}_2$, $\xi \vee \mathcal{R}_2 = \mathcal{R}_0 \vee \mathcal{R}_2$. Hence the assertion follows by proposition 1.5.2 and remark 1.5.7.
d) This is an immediate consequence of proposition 1.5.2 and remark 1.5.7. □
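The identities of proposition 1.5.10 can be checked numerically for finite partitions; a sketch under our own naming (eight points with random weights, partitioned by the three bits of the index):

```python
import math
import random

def H(p, xi, eta):
    """Conditional entropy H(xi | eta) on a finite space."""
    h = 0.0
    for B in eta:
        pB = sum(p[x] for x in B)
        if pB == 0:
            continue
        for A in xi:
            pAB = sum(p[x] for x in A & B)
            if pAB > 0:
                h -= pAB * math.log(pAB / pB)
    return h

def join(xi, eta):
    """Common refinement xi ∨ eta."""
    return [A & B for A in xi for B in eta if A & B]

random.seed(1)
pts = range(8)
w = [random.random() for _ in pts]
p = {x: w[x] / sum(w) for x in pts}
bit = lambda k: [{x for x in pts if not x >> k & 1},
                 {x for x in pts if x >> k & 1}]
r0, r1, r2 = bit(0), bit(1), bit(2)

# a) chain rule: H(r1 ∨ r2 | r0) = H(r1 | r0) + H(r2 | r0 ∨ r1)
lhs = H(p, join(r1, r2), r0)
rhs = H(p, r1, r0) + H(p, r2, join(r0, r1))
print(abs(lhs - rhs) < 1e-12)                          # True
# c) a finer conditioning partition can only lower the entropy
print(H(p, r0, join(r1, r2)) <= H(p, r0, r1) + 1e-12)  # True
```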
Definition 1.5.11: We shall denote by $\Phi$ the collection of all partitions of X with finite entropy and by $\Phi_0$ the collection of all finite partitions of X. For $\mathcal{R}_0 \in \sigma\mathcal{R}$ we define

$$\Phi_{\mathcal{R}_0} = \{\mathcal{R}_1 \in \sigma\mathcal{R} \mid H(\mathcal{R}_1 \mid \mathcal{R}_0) < \infty\}. \qquad \square$$

Proposition 1.5.12: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ and $\mathcal{R}_1 \in \Phi_{\mathcal{R}_0}$. Then there exists a $\xi \in \Phi$ with the property $\xi \vee \mathcal{R}_0 = \mathcal{R}_1 \vee \mathcal{R}_0$.

Proof: NEVEU [15], théorème 6. □
Proposition 1.5.13: Let $\xi \in \Phi$ and suppose that $\{\mathcal{R}_n\}_{n=1}^{\infty} \subset \sigma\mathcal{R}$ is a monotone increasing sequence. Then

$$\int_X \sup_n I(\xi \mid \mathcal{R}_n)\,dP \leq H(\xi) + 1.$$

Proof: NEVEU [15], théorème 5 (cf. also PARRY [18], p. 14, lemma 2.1). □

Remark 1.5.14: An immediate consequence of propositions 1.5.4 and 1.5.13 is the $\mathcal{L}^1$-norm convergence of the sequence $\{I(\xi \mid \mathcal{R}_n)\}_{n=1}^{\infty}$ for $\xi \in \Phi$. □
Corollary 1.5.15: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ and let $\{\mathcal{R}_n\}_{n=1}^{\infty} \subset \sigma\mathcal{R}$ be a monotone increasing sequence. Suppose that $H(\mathcal{R}_0 \mid \mathcal{R}_1) < \infty$. Then:

$$\lim_{n\to\infty} H(\mathcal{R}_0 \mid \mathcal{R}_n) = H\Big(\mathcal{R}_0 \,\Big|\, \bigvee_{i=1}^{\infty} \mathcal{R}_i\Big).$$

Proof: Because of the fact that $H(\mathcal{R}_0 \mid \mathcal{R}_1) < \infty$, there exists a partition $\xi \in \Phi$ such that $\xi \vee \mathcal{R}_1 = \mathcal{R}_0 \vee \mathcal{R}_1$. It follows from remark 1.5.7 that $\xi \vee \mathcal{R}_n = \mathcal{R}_0 \vee \mathcal{R}_n$, $n = 1, 2, \ldots$. Hence the corollary is an immediate consequence of remark 1.5.14. □
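Corollary 1.5.15 is a martingale-type convergence statement: as the conditioning σ-fields grow, the conditional entropies decrease to the entropy conditioned on the generated σ-field. A finite sketch (six biased coin flips, conditioning on ever more of them; all names and parameters are ours):

```python
import math

def H(p, xi, eta):
    """Conditional entropy H(xi | eta) on a finite space."""
    h = 0.0
    for B in eta:
        pB = sum(p[x] for x in B)
        if pB == 0:
            continue
        for A in xi:
            pAB = sum(p[x] for x in A & B)
            if pAB > 0:
                h -= pAB * math.log(pAB / pB)
    return h

N, q = 6, 0.3                      # six independent bits, each 1 with prob q
pts = range(1 << N)
p = {x: math.prod(q if x >> i & 1 else 1 - q for i in range(N)) for x in pts}
# xi: parity of the number of 1-bits; Rn: the partition by the lowest n bits
xi = [{x for x in pts if bin(x).count("1") % 2 == b} for b in (0, 1)]
Rn = lambda n: [{x for x in pts if x & ((1 << n) - 1) == v} for v in range(1 << n)]
vals = [H(p, xi, Rn(n)) for n in range(N + 1)]
assert all(vals[i] >= vals[i + 1] - 1e-12 for i in range(N))
print(round(vals[0], 4), round(vals[N], 4))   # 0.6931 0.0
```

Here $\bigvee_n \mathcal{R}_n$ is the full σ-field, so the limit of $H(\xi \mid \mathcal{R}_n)$ is 0.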
The assertion of corollary 1.5.15 can be given in a more general form as will be done in proposition 1.5.16.
Proposition 1.5.16: Let $\mathcal{U} \subset \sigma\mathcal{R}$ be an increasingly filtered system generating $\mathcal{R}_0$, i.e. $\mathcal{R}_0$ is the smallest sub-σ-field of $\mathcal{R}$ containing all the elements of $\mathcal{U}$. Let $\mathcal{R}_1 \in \sigma\mathcal{R}$ and suppose that there exists an $\mathcal{R}' \in \mathcal{U}$ such that $\mathcal{R}_1 \in \Phi_{\mathcal{R}'}$. Then

$$\lim_{\mathcal{R}'' \in \mathcal{U}} H(\mathcal{R}_1 \mid \mathcal{R}'') = H(\mathcal{R}_1 \mid \mathcal{R}_0).$$

Proof: JACOBS [10], p. 257, theorem 3. □

Definition 1.5.17: The metric $\rho_H$ is defined on $\Phi$ by

$$\rho_H(\xi, \eta) = H(\xi \mid \eta) + H(\eta \mid \xi)$$

for any pair $\xi, \eta \in \Phi$. □

Theorem 1.5.18: With respect to the metric $\rho_H$ the collection $\Phi$ is a separable complete metric space. With respect to $\rho_H$ the collection $\Phi_0$ is a dense subset of $\Phi$.

Proof: ROKHLIN [21], p. 19-20, §6.1 and §6.2. □

§6. MEAN INFORMATION AND ENTROPY
Unless explicitly stated otherwise, in this section T will be a measure preserving transformation on $(X,\mathcal{R},P)$ as defined in definition 1.3.2.
Proposition 1.6.1: Let T be a non-singular (not necessarily P-preserving) transformation on $(X,\mathcal{R},P)$. Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ and $\mathcal{R}_1 \in \sigma\mathcal{R}$. Then

$$T^{-1}(\mathcal{R}_0 \vee \mathcal{R}_1) = T^{-1}\mathcal{R}_0 \vee T^{-1}\mathcal{R}_1.$$

Proof: First we note that $T^{-1}(\mathcal{R}_0 \vee \mathcal{R}_1) \supset T^{-1}\mathcal{R}_0 \vee T^{-1}\mathcal{R}_1$. Let

$$\mathcal{A} = \{A \in \mathcal{R}_0 \vee \mathcal{R}_1 \mid T^{-1}A \in T^{-1}\mathcal{R}_0 \vee T^{-1}\mathcal{R}_1\}.$$

Then $\mathcal{A}$ is a σ-field which contains $\mathcal{R}_0$ and $\mathcal{R}_1$, hence $\mathcal{A} = \mathcal{R}_0 \vee \mathcal{R}_1$. □

Remark 1.6.2: The condition that T is non-singular in proposition 1.6.1 is necessary, since we identify two σ-fields $\mathcal{R}_1$ and $\mathcal{R}_2$ with the property that for any $A_1 \in \mathcal{R}_1$ there exists an $A_2 \in \mathcal{R}_2$ such that $P(A_1 \mathbin{\triangle} A_2) = 0$, and conversely. □

Proposition 1.6.3: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ and $\mathcal{R}_1 \in \sigma\mathcal{R}$ and suppose that there exists a partition $\xi$ of X such that $\xi \vee \mathcal{R}_0 = \mathcal{R}_1 \vee \mathcal{R}_0$. Then we have a.e.

$$I(T^{-1}\mathcal{R}_1 \mid T^{-1}\mathcal{R}_0) = T[I(\mathcal{R}_1 \mid \mathcal{R}_0)], \qquad (1)$$

where $T[f]$ denotes $f \circ T$.

Proof: Let $\xi = \{A_i\}_{i=1}^{\infty}$; then by proposition 1.6.1

$$T^{-1}\xi \vee T^{-1}\mathcal{R}_0 = T^{-1}\mathcal{R}_1 \vee T^{-1}\mathcal{R}_0.$$

Hence, because of definition 1.5.1 and proposition 1.3.1,

$$I(T^{-1}\mathcal{R}_1 \mid T^{-1}\mathcal{R}_0) = -\sum_{i=1}^{\infty} \chi_{T^{-1}A_i} \log T[P^{\mathcal{R}_0}(A_i)] = T\Big[-\sum_{i=1}^{\infty} \chi_{A_i} \log P^{\mathcal{R}_0}(A_i)\Big] = T[I(\mathcal{R}_1 \mid \mathcal{R}_0)]. \qquad \square$$
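Propositions 1.6.3 and 1.6.4 say that taking preimages under a measure preserving T changes neither the conditional information (up to composition with T) nor the conditional entropy. A toy check (rotation of six points, partitions chosen asymmetrically so the identity is not vacuous; names are ours):

```python
import math

def H(p, xi, eta):
    """Conditional entropy H(xi | eta) on a finite space."""
    h = 0.0
    for B in eta:
        pB = sum(p[x] for x in B)
        if pB == 0:
            continue
        for A in xi:
            pAB = sum(p[x] for x in A & B)
            if pAB > 0:
                h -= pAB * math.log(pAB / pB)
    return h

def preimage(T, A, pts):
    """T^{-1}A = {x : T(x) in A}."""
    return {x for x in pts if T(x) in A}

pts = range(6)
p = {x: 1 / 6 for x in pts}
T = lambda x: (x + 1) % 6          # rotation: measure preserving for uniform p
xi = [{0}, {1, 2, 3, 4, 5}]
eta = [{0, 1, 2}, {3, 4, 5}]
Txi = [preimage(T, A, pts) for A in xi]
Teta = [preimage(T, B, pts) for B in eta]
print(abs(H(p, Txi, Teta) - H(p, xi, eta)) < 1e-12)   # True
```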
Proposition 1.6.4: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ and $\mathcal{R}_1 \in \sigma\mathcal{R}$. Then

$$H(T^{-1}\mathcal{R}_1 \mid T^{-1}\mathcal{R}_0) = H(\mathcal{R}_1 \mid \mathcal{R}_0).$$

Proof: Suppose that there does not exist a partition $\xi$ of X such that $\mathcal{R}_0 \vee \xi = \mathcal{R}_0 \vee \mathcal{R}_1$, and that $\eta$ is a partition of X such that

$$\eta \vee T^{-1}\mathcal{R}_0 = T^{-1}\mathcal{R}_1 \vee T^{-1}\mathcal{R}_0.$$

Then $\eta = T^{-1}\xi$ for a partition $\xi \subset \mathcal{R}_0 \vee \mathcal{R}_1$, and, because of proposition 1.6.1 again,

$$T^{-1}(\xi \vee \mathcal{R}_0) = T^{-1}(\mathcal{R}_1 \vee \mathcal{R}_0).$$

Now suppose $A \in \mathcal{R}_0 \vee \mathcal{R}_1$; then there exists a $B \in \mathcal{R}_0 \vee \xi$ such that $T^{-1}B = T^{-1}A$. It follows that $P(B \mathbin{\triangle} A) = P(T^{-1}B \mathbin{\triangle} T^{-1}A) = 0$ and hence $\mathcal{R}_0 \vee \xi = \mathcal{R}_0 \vee \mathcal{R}_1$. Contradiction. Hence we may suppose that there exists a partition $\xi$ of X such that $\mathcal{R}_0 \vee \xi = \mathcal{R}_0 \vee \mathcal{R}_1$. The assertion of the proposition follows now by integrating formula (1) of proposition 1.6.3. □
Theorem 1.6.5: Let $\xi \in \Phi$. Then

$$\lim_{n\to\infty} \frac{1}{n}\, I\Big(\bigvee_{i=0}^{n-1} T^{-i}\xi\Big) = E_{\mathcal{I}}\, I\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi\Big) \quad \text{a.e. and in } \mathcal{L}^1\text{-norm},$$

where $E_{\mathcal{I}}$ denotes the conditional expectation with respect to the σ-field $\mathcal{I}$ of T-invariant sets, and consequently

$$\lim_{n\to\infty} \frac{1}{n}\, H\Big(\bigvee_{i=0}^{n-1} T^{-i}\xi\Big) = H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi\Big).$$

If T is ergodic we have a.e. and in $\mathcal{L}^1$-norm

$$\lim_{n\to\infty} \frac{1}{n}\, I\Big(\bigvee_{i=0}^{n-1} T^{-i}\xi\Big) = H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi\Big). \qquad \square$$

Proof: This is a generalization of MAC MILLAN's theorem given by CHUNG. For a proof see PARRY [18], p. 20, theorem 2.5. □
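In the ergodic case theorem 1.6.5 is the statement that $\frac{1}{n} I(\bigvee_{i=0}^{n-1} T^{-i}\xi)(x) = -\frac{1}{n}\log P(\text{n-block of } x)$ converges to a constant. For an i.i.d. shift this can be watched numerically; a sketch (alphabet and probabilities are our illustrative choice):

```python
import math
import random

random.seed(0)
probs = {"a": 0.5, "b": 0.25, "c": 0.25}
H = -sum(q * math.log(q) for q in probs.values())   # entropy of the generator

def info_rate(n):
    """-(1/n) log P(first n symbols) along one sample path of the i.i.d. shift."""
    x = random.choices(list(probs), weights=list(probs.values()), k=n)
    return -sum(math.log(probs[s]) for s in x) / n

print(round(H, 4))                  # 1.0397
print(round(info_rate(100000), 4))  # close to H, as the theorem predicts
```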
Definition 1.6.6: Let $\xi = \{A_i\}_{i=1}^{\infty} \in \Phi$. Then the mean information $i(\xi,T)$ is a.e. defined by

$$i(\xi,T) = \lim_{n\to\infty} \frac{1}{n}\, I\Big(\bigvee_{i=0}^{n-1} T^{-i}\xi\Big).$$

The mean entropy of $\xi$ is defined by $h(\xi,T) = \int_X i(\xi,T)\,dP$. □
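For a stationary Markov shift the mean entropy of the time-zero partition is the entropy rate $-\sum_{i,j} \pi_i P_{ij} \log P_{ij}$, and the block entropies satisfy $H_n = H(\pi) + (n-1)\cdot\text{rate}$ exactly, so $\frac{1}{n} H_n$ decreases to $h(\xi,T)$. A sketch with a chain of our own choosing:

```python
import math
from itertools import product

# Two-state stationary Markov shift; pi is the stationary distribution of P.
P = [[0.9, 0.1], [0.4, 0.6]]
pi = [0.8, 0.2]                       # check: pi P = pi
rate = -sum(pi[i] * P[i][j] * math.log(P[i][j])
            for i in (0, 1) for j in (0, 1))

def block_entropy(n):
    """H(xi ∨ T^{-1}xi ∨ ... ∨ T^{-(n-1)}xi) for the time-zero partition xi."""
    h = 0.0
    for w in product((0, 1), repeat=n):
        q = pi[w[0]]
        for a, b in zip(w, w[1:]):
            q *= P[a][b]
        if q > 0:
            h -= q * math.log(q)
    return h

for n in (1, 5, 10):
    print(n, round(block_entropy(n) / n, 4))
# (1/n) H_n decreases toward the mean entropy h(xi, T) = rate ≈ 0.3947
```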
Definition 1.6.7: Let $\mathcal{R}_1 \in \sigma\mathcal{R}$ (not necessarily satisfying $T^{-1}\mathcal{R}_1 \subset \mathcal{R}_1$). Then we define

$$H(\mathcal{R}_1) = \sup_{\substack{\xi \subset \mathcal{R}_1 \\ \xi \in \Phi_0}} h(\xi,T). \qquad \square$$

Proposition 1.6.8: Let $\mathcal{R}_1 \in \sigma\mathcal{R}$. Then

$$H(\mathcal{R}_1) = \sup_{\substack{\xi \subset \mathcal{R}_1 \\ \xi \in \Phi}} h(\xi,T).$$

Proof: JACOBS [10], p. 278, theorem 2.3. □
The assertion of proposition 1.6.8 can be generalized as follows.

Proposition 1.6.9: Let $\mathcal{U} \subset \sigma\mathcal{R}$ be an increasingly filtered system generating $\mathcal{R}_1 \in \sigma\mathcal{R}$. Then

$$H(\mathcal{R}_1) = \sup_{\mathcal{R}' \in \mathcal{U}} H(\mathcal{R}').$$

Proof: JACOBS [10], p. 278, theorems 2.2 and 2.4. □

Notice that for $\xi \in \Phi$, by definition 1.6.6 and proposition 1.6.8, we have $H(\xi) = h(\xi,T)$. Hence proposition 1.6.9 is indeed a generalization of proposition 1.6.8.

The number $h(T) = H(\mathcal{R})$ is called the KOLMOGOROV-SINAI invariant of the dynamical system $(X,\mathcal{R},P,T)$.
C H A P T E R II
THE CASE OF A MEASURE PRESERVING TRANSFORMATION
§1. INTRODUCTION
The purpose of this chapter is to give a definition, some conditions which guarantee the existence, and some properties of conditional mean information and conditional mean entropy in the case that T is measure preserving. If not stated otherwise, in this chapter T will be a measure preserving transformation on $(X,\mathcal{R},P)$. Since we shall also discuss similar results in a more general case in chapter III, we shall presently define the conditional mean information and conditional mean entropy even under the weaker hypothesis that T is non-singular. The first definition of conditional mean entropy has been given by NEVEU [14] for a partition $\xi \in \Phi$ in the case that T is an invertible measure preserving transformation on $(X,\mathcal{R},P)$ and the conditioning σ-field $\mathcal{R}_0 \in \sigma\mathcal{R}$ has the property $T^{-1}\mathcal{R}_0 = \mathcal{R}_0$. In this special case NEVEU also proved theorem 2.5.9 of this thesis. Definition 2.1.1 given below reduces to NEVEU's definition under the conditions mentioned above.

Definition 2.1.1: Let T be a non-singular transformation on $(X,\mathcal{R},P)$. Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ and $\mathcal{R}_1 \in \Phi_{\mathcal{R}_0}$. If the sequence

$$\Big\{\frac{1}{n+1}\, I\Big(\bigvee_{i=0}^{n} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big)\Big\}_{n=1}^{\infty}$$

converges a.e., its limit is called the conditional mean information of $\mathcal{R}_1$ with respect to $\mathcal{R}_0$. If the sequence

$$\Big\{\frac{1}{n+1}\, H\Big(\bigvee_{i=0}^{n} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big)\Big\}_{n=1}^{\infty}$$

converges, then its limit is called the conditional mean entropy of $\mathcal{R}_1$ with respect to $\mathcal{R}_0$. □
Remark 2.1.2: In order to simplify the notation, the conditional mean information and the conditional mean entropy will be denoted by $i(\mathcal{R}_1 \mid \mathcal{R}_0)$ and $h(\mathcal{R}_1 \mid \mathcal{R}_0)$ respectively. If $\xi \in \Phi$ and $\mathcal{R}_0 = \{\emptyset, X\}$, then we have $h(\xi \mid \mathcal{R}_0) = h(\xi,T)$ as defined in 1.6.6. □
Remark 2.1.3: Let T be invertible and suppose that $T^{-1}\mathcal{R}_0 = \mathcal{R}_0$. Then

$$H\Big(\bigvee_{i=0}^{n} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big) = \sum_{i=0}^{n} H\Big(T^{-i}\mathcal{R}_1 \,\Big|\, \bigvee_{j=i+1}^{n} T^{-j}\mathcal{R}_1 \vee \mathcal{R}_0\Big) \qquad (1.5.10a)$$

$$= \sum_{i=0}^{n} H\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{n-i} T^{-j}\mathcal{R}_1 \vee \mathcal{R}_0\Big). \qquad (1.6.4)$$

Hence, by corollary 1.5.15 and taking the Cesàro limit, it follows that

$$\lim_{n\to\infty} \frac{1}{n+1}\, H\Big(\bigvee_{i=0}^{n} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big) = H\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{\infty} T^{-j}\mathcal{R}_1 \vee \mathcal{R}_0\Big).$$

Thus in this case definition 2.1.1 reduces to NEVEU's definition mentioned above. □
In the following sections we shall give some other conditions under which the conditional mean information and conditional mean entropy exist. As said before, we do not require that T is invertible. First, it is easy to see, as in remark 2.1.3, that the conditional mean entropy exists in the case that $T^{-1}\mathcal{R}_0 = \mathcal{R}_0$ (T not invertible). In section II.2 we shall generalize this to the case that there exists a $k \in \mathbb{N}$ such that $T^{-k}\mathcal{R}_0 = \mathcal{R}_0$. We shall also prove the existence of conditional mean information in this case. In section II.3 we shall prove the existence of conditional mean entropy in the case that $H(\mathcal{R}_0) = 0$, and in section II.4 in the case that $T^{-1}\mathcal{R}_0 \subset \mathcal{R}_0$ and $H(\mathcal{R}_0 \mid T^{-1}\mathcal{R}_0) < \infty$. In the last case we shall also prove the existence of conditional mean information. In this introduction we shall give a first simple result concerning the existence of conditional mean information and conditional mean entropy in proposition 2.1.4.
Proposition 2.1.4: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$, $\mathcal{R}'_0 \in \Phi_{\mathcal{R}_0}$ and $\mathcal{R}_1 \in \Phi_{\mathcal{R}_0}$, and suppose $\mathcal{R}_0 \subset \mathcal{R}'_0$. Then:

a) $i(\mathcal{R}_1 \mid \mathcal{R}'_0)$ exists if $i(\mathcal{R}_1 \mid \mathcal{R}_0)$ exists. In this case we have $i(\mathcal{R}_1 \mid \mathcal{R}'_0) = i(\mathcal{R}_1 \mid \mathcal{R}_0)$.
b) $h(\mathcal{R}_1 \mid \mathcal{R}'_0)$ exists if $h(\mathcal{R}_1 \mid \mathcal{R}_0)$ exists. In this case we have $h(\mathcal{R}_1 \mid \mathcal{R}'_0) = h(\mathcal{R}_1 \mid \mathcal{R}_0)$.

Proof: By proposition 1.5.12 there exists a partition $\xi \in \Phi$ with the property $\xi \vee \mathcal{R}_0 = \mathcal{R}'_0$. Hence we have

$$\frac{1}{t}\, I\Big(\bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}'_0\Big) = \frac{1}{t}\, I\Big(\bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0 \vee \xi\Big)$$

$$= \frac{1}{t}\, I\Big(\bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1 \vee \xi \,\Big|\, \mathcal{R}_0\Big) - \frac{1}{t}\, I(\xi \mid \mathcal{R}_0) \qquad (1.5.2a)$$

$$= \frac{1}{t}\, I\Big(\bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big) + \frac{1}{t}\, I\Big(\xi \,\Big|\, \mathcal{R}_0 \vee \bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1\Big) - \frac{1}{t}\, I(\xi \mid \mathcal{R}_0). \qquad (1.5.2a) \quad (1)$$

Now notice that by propositions 1.5.4 and 1.5.13 the sequence

$$\Big\{I\Big(\xi \,\Big|\, \mathcal{R}_0 \vee \bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1\Big)\Big\}_{t=1}^{\infty}$$

converges, both a.e. and in $\mathcal{L}^1$-norm, to the function $I(\xi \mid \mathcal{R}_0 \vee \bigvee_{i=0}^{\infty} T^{-i}\mathcal{R}_1)$. Hence the sequence

$$\Big\{\frac{1}{t}\, I\Big(\xi \,\Big|\, \mathcal{R}_0 \vee \bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1\Big)\Big\}_{t=1}^{\infty}$$

converges both a.e. and in $\mathcal{L}^1$-norm to zero, as does $\{\frac{1}{t}\, I(\xi \mid \mathcal{R}_0)\}_{t=1}^{\infty}$. Hence assertion a) is an immediate consequence of formula (1), and assertion b) follows by integrating both sides of (1) and then taking the limit for $t \to \infty$. □
An immediate consequence of proposition 2.1.4 is the existence of the conditional mean information and conditional mean entropy of a partition $\xi \in \Phi$ with respect to a partition $\eta \in \Phi$. In this case we have

$$\lim_{t\to\infty} \frac{1}{t}\, I\Big(\bigvee_{i=0}^{t-1} T^{-i}\xi \,\Big|\, \eta\Big) = E_{\mathcal{I}}\, I\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi\Big).$$
§2. PERIODIC SUB-σ-FIELDS

In this section we shall prove the existence theorem of conditional mean information and conditional mean entropy for periodic sub-σ-fields of $\mathcal{R}$ as defined in definition 2.2.1.

Definition 2.2.1: Let k be a positive integer. A sub-σ-field $\mathcal{R}_0 \in \sigma\mathcal{R}$ is called periodic with period k if $T^{-k}\mathcal{R}_0 = \mathcal{R}_0$ and if for any positive integer $k' < k$ we have $T^{-k'}\mathcal{R}_0 \neq \mathcal{R}_0$. □

First we shall give an example of a periodic sub-σ-field $\mathcal{R}_0$ in the case that T is not invertible. Let $X_1 = \{x \in \mathbb{R} \mid 0 \leq x < 1\}$ be the half-open unit interval, let $\mathcal{R}_1$ be the σ-field of Borel sets on $X_1$ and let $\mu_1$ be the Borel measure on $\mathcal{R}_1$. Furthermore, let $k \geq 3$ be a positive integer, let $X_2 = \{1, 2, \ldots, k\}$, let $\mathcal{R}_2$ be the σ-field of all subsets of $X_2$ and let, for any $A \in \mathcal{R}_2$, $\mu_2(A)$ be the number of elements of A divided by k. Now we consider the probability space $(X,\mathcal{R},P)$, where $X = X_1 \times X_2$ is the Cartesian product, $\mathcal{R}$ is the σ-field of subsets of X generated by the measurable rectangles, and P is the product measure on $\mathcal{R}$. The measure preserving (not invertible) transformation T on $(X,\mathcal{R},P)$ is defined by $T(x,n) = (T_1 x, T_2 n)$, where $T_1 x = 2x \pmod 1$ and $T_2 n = (n+1) \pmod k$. Now let $\xi = \{A_i\}_{i=1}^{k}$ be a partition of X with the properties

$$T^{-k}\xi = \xi, \qquad T^{-i}\xi \neq \xi \quad \text{for } 0 < i < k.$$

Hence the σ-field $\mathcal{R}_0$ generated by $\xi$ is periodic with period k.

Theorem 2.2.2: Let k be a positive integer and suppose that $\mathcal{R}_0 \in \sigma\mathcal{R}$ is periodic with period k. Suppose that $\mathcal{R}_1 \in \bigcap_{i=0}^{k-1} \Phi_{T^{-i}\mathcal{R}_0}$. Then the conditional mean information $i(\mathcal{R}_1 \mid \mathcal{R}_0)$ exists, and both a.e. and in $\mathcal{L}^1$-norm

$$i(\mathcal{R}_1 \mid \mathcal{R}_0) = E_{\mathcal{I}_k}\Big[\frac{1}{k} \sum_{j=0}^{k-1} T^j\, I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1 \vee T^{-(k-j)}\mathcal{R}_0\Big)\Big],$$

where $\mathcal{I}_k$ denotes the σ-field of $T^k$-invariant sets. Consequently,

$$h(\mathcal{R}_1 \mid \mathcal{R}_0) = \frac{1}{k} \sum_{j=0}^{k-1} H\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1 \vee T^{-j}\mathcal{R}_0\Big).$$

If T is ergodic we have a.e. and in $\mathcal{L}^1$-norm

$$i(\mathcal{R}_1 \mid \mathcal{R}_0) = \frac{1}{k} \sum_{j=0}^{k-1} H\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1 \vee T^{-j}\mathcal{R}_0\Big). \qquad \square$$

Proof: To simplify the notation we denote
$$f_i = I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{i} T^{-j}\mathcal{R}_1 \vee T^{-i}\mathcal{R}_0\Big), \qquad f'_i = I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{\infty} T^{-j}\mathcal{R}_1 \vee T^{-i}\mathcal{R}_0\Big), \qquad i = 1, 2, \ldots,$$

and we write $f^{(k)} = \frac{1}{k}\sum_{j=0}^{k-1} T^j f'_{k-j}$, so that the right-hand side in the statement of the theorem is $E_{\mathcal{I}_k} f^{(k)}$. Note that $T^{-i}\mathcal{R}_0$, and hence $f'_i$, depends only on i modulo k, since $\mathcal{R}_0$ is periodic with period k. Because of the fact that $T^k$ is a measure preserving transformation on $(X,\mathcal{R},P)$, BIRKHOFF's theorem may be applied with $T^k$ in place of T, a.e. and in $\mathcal{L}^1$-norm. Putting $\bigvee_{j=kp+1}^{kp} T^{-j}\mathcal{R}_1 = \{\emptyset, X\}$ we remark that

$$I\Big(\bigvee_{i=0}^{kp} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big) = \sum_{i=0}^{kp} I\Big(T^{-i}\mathcal{R}_1 \,\Big|\, \bigvee_{j=i+1}^{kp} T^{-j}\mathcal{R}_1 \vee T^{-kp}\mathcal{R}_0\Big) \qquad (1.5.2a)$$

$$= \sum_{i=0}^{kp} T^i\Big[I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{kp-i} T^{-j}\mathcal{R}_1 \vee T^{-(kp-i)}\mathcal{R}_0\Big)\Big] = \sum_{i=0}^{kp} T^i f_{kp-i}. \qquad (1.6.3)$$

Hence we have

$$\Big|\frac{1}{kp+1}\, I\Big(\bigvee_{i=0}^{kp} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big) - E_{\mathcal{I}_k} f^{(k)}\Big| \leq \frac{1}{kp+1} \sum_{i=0}^{kp} T^i\big|f_{kp-i} - f'_{kp-i}\big| + \Big|\frac{1}{kp+1} \sum_{i=0}^{kp} T^i f'_{kp-i} - E_{\mathcal{I}_k} f^{(k)}\Big|. \qquad (1)$$

To prove that the first term on the right-hand side of (1) tends to 0 a.e. and in $\mathcal{L}^1$-norm, because of corollary 1.4.2 it is sufficient to prove that $\sup_n |f_n - f'_n| \in \mathcal{L}^1(X,\mathcal{R},P)$ and $\lim_{n\to\infty} |f_n - f'_n| = 0$ a.e. and in $\mathcal{L}^1$-norm. Indeed,

$$\sup_n |f_n - f'_n| \leq \sup_n f_n + \sup_n f'_n \leq \sum_{i=0}^{k-1} \sup_n I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{n} T^{-j}\mathcal{R}_1 \vee T^{-i}\mathcal{R}_0\Big) + \sum_{i=0}^{k-1} I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{\infty} T^{-j}\mathcal{R}_1 \vee T^{-i}\mathcal{R}_0\Big) \in \mathcal{L}^1(X,\mathcal{R},P)$$

because of proposition 1.5.13. Furthermore, by propositions 1.5.4 and 1.5.13 we have for any p

$$\lim_{n\to\infty} I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{i=1}^{n} T^{-i}\mathcal{R}_1 \vee T^{-p}\mathcal{R}_0\Big) = I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1 \vee T^{-p}\mathcal{R}_0\Big) \quad \text{a.e. and in } \mathcal{L}^1\text{-norm}.$$

Since we may restrict p to the finite set $0 \leq p \leq k-1$, this convergence is uniform in p, and hence we have a.e. and in $\mathcal{L}^1$-norm $\lim_{n\to\infty} |f_n - f'_n| = 0$.

Now we have to prove that the second term on the right-hand side of (1) tends to 0 a.e. and in $\mathcal{L}^1$-norm. First we remark that

$$\frac{1}{kp+1} \sum_{i=0}^{kp} T^i f'_{kp-i} = \frac{kp}{kp+1} \cdot \frac{1}{p} \sum_{j=0}^{p-1} T^{jk} f^{(k)} + \frac{1}{kp+1}\, T^{kp}\Big[I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{\infty} T^{-j}\mathcal{R}_1 \vee \mathcal{R}_0\Big)\Big].$$

By BIRKHOFF's theorem we have a.e. and in $\mathcal{L}^1$-norm

$$\lim_{p\to\infty} \frac{1}{kp+1}\, T^{kp}\Big[I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{\infty} T^{-j}\mathcal{R}_1 \vee \mathcal{R}_0\Big)\Big] = 0.$$

Hence by BIRKHOFF's theorem, with $T^k$ instead of T, we find

$$\lim_{p\to\infty} \Big|\frac{1}{kp+1} \sum_{i=0}^{kp} T^i f'_{kp-i} - E_{\mathcal{I}_k} f^{(k)}\Big| = 0$$

both a.e. and in $\mathcal{L}^1$-norm.
To prove the second assertion of theorem 2.2.2 we remark that, T being measure preserving,

$$\int_X E_{\mathcal{I}_k} f^{(k)}\,dP = \frac{1}{k} \sum_{j=0}^{k-1} \int_X f'_{k-j}\,dP = \frac{1}{k} \sum_{j=0}^{k-1} H\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1 \vee T^{-j}\mathcal{R}_0\Big). \qquad \square$$

Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ be periodic with period 1 and let $\mathcal{R}_1 \in \Phi_{\mathcal{R}_0}$. Then

$$i(\mathcal{R}_1 \mid \mathcal{R}_0) = E_{\mathcal{I}}\Big[I\Big(\mathcal{R}_1 \,\Big|\, \Big(\bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1\Big) \vee \mathcal{R}_0\Big)\Big]$$

a.e. and in $\mathcal{L}^1$-norm. Hence theorem 2.2.2 is a generalization of MAC MILLAN's theorem (1.6.5), where $\mathcal{R}_0 = \{\emptyset, X\}$.
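The example of a periodic sub-σ-field given at the beginning of this section can be checked on its cyclic factor alone, since the periodicity lives in the second coordinate. The partition below is a hypothetical stand-in (the thesis's atoms $A_i$ are not reproduced here): one singleton label against the rest, which is asymmetric and therefore has no period smaller than k — this asymmetry is also where an assumption like $k \geq 3$ matters.

```python
k = 5
pts = range(k)
T = lambda n: (n + 1) % k                 # the cyclic factor T2 of the example
# Hypothetical partition: a singleton class against the rest (labels 0..k-1).
xi = [{0}, set(range(1, k))]

def pullback(part, i):
    """T^{-i} applied atom-by-atom: T^{-i}A = {n : T^i(n) in A}."""
    return [{n for n in pts if (n + i) % k in A} for A in part]

same = lambda a, b: {frozenset(A) for A in a} == {frozenset(B) for B in b}
print(same(pullback(xi, k), xi))                                # True: T^{-k} xi = xi
print(all(not same(pullback(xi, i), xi) for i in range(1, k)))  # True: no smaller period
```

With a symmetric choice (e.g. all singletons) the pullback would merely permute the atoms and the partition would have period 1, which is why an asymmetric choice is needed.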
§3. MEAN ENTROPY ZERO

In this section we shall prove the existence of conditional mean entropy if the conditioning sub-σ-field of $\mathcal{R}$ has mean entropy zero in the following sense.

Definition 2.3.1: The σ-field $\mathcal{R}_0 \in \sigma\mathcal{R}$ is said to have mean entropy zero if $H(\mathcal{R}_0) = 0$. □

Before we formulate the existence theorem we shall first exhibit the connection between mean entropy zero and conditional independence.

Definition 2.3.2: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$, $\mathcal{R}_1 \in \sigma\mathcal{R}$ and $\mathcal{R}_2 \in \sigma\mathcal{R}$. Then $\mathcal{R}_0$ and $\mathcal{R}_1$ are called conditionally independent with respect to $\mathcal{R}_2$ if we have for any $A \in \mathcal{R}_0$, $B \in \mathcal{R}_1$

$$P^{\mathcal{R}_2}(A \cap B) = P^{\mathcal{R}_2}(A) \cdot P^{\mathcal{R}_2}(B) \quad \text{a.e.} \qquad \square$$

(See e.g. LOÈVE [12], p. 351, 25.3.)
Proposition 2.3.3: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$, $\mathcal{R}_2 \in \sigma\mathcal{R}$ and $\mathcal{R}_1 \in \Phi_{\mathcal{R}_2}$. Then the following two assertions are equivalent:

a) $\mathcal{R}_0$ and $\mathcal{R}_1$ are conditionally independent with respect to $\mathcal{R}_2$;
b) $H(\mathcal{R}_1 \mid \mathcal{R}_2 \vee \mathcal{R}_0) = H(\mathcal{R}_1 \mid \mathcal{R}_2)$.

Proof: NIJST [16], p. 313, theorem 2.2. □
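Proposition 2.3.3 can be checked on a finite example: if a and b are drawn conditionally independently given c, then additionally conditioning on the partition by a does not change the entropy of the partition by b given c. The construction below (distributions and names are our choice) plays the roles of $\mathcal{R}_0$, $\mathcal{R}_1$, $\mathcal{R}_2$:

```python
import math

def H(p, xi, eta):
    """Conditional entropy H(xi | eta) on a finite space."""
    h = 0.0
    for B in eta:
        pB = sum(p[x] for x in B)
        if pB == 0:
            continue
        for A in xi:
            pAB = sum(p[x] for x in A & B)
            if pAB > 0:
                h -= pAB * math.log(pAB / pB)
    return h

def join(xi, eta):
    """Common refinement xi ∨ eta."""
    return [A & B for A in xi for B in eta if A & B]

# Points (a, b, c): a and b drawn conditionally independently given c.
pc = {0: 0.5, 1: 0.5}
pa = {0: [0.9, 0.1], 1: [0.2, 0.8]}   # P(a | c)
pb = {0: [0.3, 0.7], 1: [0.6, 0.4]}   # P(b | c)
p = {(a, b, c): pc[c] * pa[c][a] * pb[c][b]
     for a in (0, 1) for b in (0, 1) for c in (0, 1)}
pts = set(p)
part = lambda i: [{x for x in pts if x[i] == v} for v in (0, 1)]
A, B, C = part(0), part(1), part(2)
lhs = H(p, B, join(C, A))    # H(R1 | R2 ∨ R0)
rhs = H(p, B, C)             # H(R1 | R2)
print(abs(lhs - rhs) < 1e-12)   # True, as assertion b) predicts
```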
Proposition 2.3.4: Let $\mathcal{R}' \in \sigma\mathcal{R}$, $\mathcal{R}_1 \in \sigma\mathcal{R}$ and $\mathcal{R}_2 \in \sigma\mathcal{R}$. Suppose that $\mathcal{R}' \subset \mathcal{R}_1$ and $H(\mathcal{R}_1 \vee \mathcal{R}_2 \mid \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1) < \infty$. Then

$$\lim_{t\to\infty} H\Big(\mathcal{R}' \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1 \vee \bigvee_{k=t}^{\infty} T^{-k}\mathcal{R}_2\Big) = H\Big(\mathcal{R}' \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1\Big).$$

Proof: JACOBS [10], p. 273, theorem IV. □
Theorem 2.3.5: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$. Then the following three assertions are equivalent:

a) $H(\mathcal{R}_0) = 0$;
b) $H(\xi \mid \bigvee_{i=1}^{\infty} T^{-i}\xi \vee \mathcal{R}_0) = H(\xi \mid \bigvee_{i=1}^{\infty} T^{-i}\xi)$ for any $\xi \in \Phi$;
c) $\mathcal{R}_0$ and $\xi$ are conditionally independent with respect to $\bigvee_{i=1}^{\infty} T^{-i}\xi$ for any $\xi \in \Phi$.
Proof: The equivalence of the assertions b) and c) is an immediate consequence of proposition 2.3.3.

b) ⇒ a): Let $\xi_2 \in \Phi$, $\xi_2 \subset \mathcal{R}_0$; then we have for any $\xi \in \Phi$

$$H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi \vee \mathcal{R}_0\Big) = H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi\Big) \geq H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi \vee \xi_2\Big) \geq H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi \vee \mathcal{R}_0\Big). \qquad (1.5.10c)$$

Hence

$$H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi \vee \xi_2\Big) = H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi\Big).$$

Substitute in this equality $\xi = \xi_2$. Then $h(\xi_2, T) = H(\xi_2 \mid \bigvee_{i=1}^{\infty} T^{-i}\xi_2) = H(\xi_2 \mid \bigvee_{i=1}^{\infty} T^{-i}\xi_2 \vee \xi_2) = 0$. It follows that

$$H(\mathcal{R}_0) = \sup_{\substack{\xi_2 \subset \mathcal{R}_0 \\ \xi_2 \in \Phi}} h(\xi_2, T) = 0.$$

a) ⇒ b): Let $\xi_2 \subset \mathcal{R}_0$, $\xi_2 \in \Phi$. Then $h(\xi_2, T) = H(\xi_2 \mid \bigvee_{i=1}^{\infty} T^{-i}\xi_2) = 0$. Hence by proposition 1.5.10d) $\xi_2 \subset \bigvee_{i=1}^{\infty} T^{-i}\xi_2$, and by induction $\xi_2 \subset \bigvee_{i=t}^{\infty} T^{-i}\xi_2$ for any positive integer t. Let $\xi_1 \in \Phi$; then we have

$$H\Big(\xi_1 \vee \xi_2 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1\Big) \leq H(\xi_1 \vee \xi_2) \leq H(\xi_1) + H(\xi_2) < \infty \qquad (1.5.10)$$

and hence by proposition 2.3.4

$$\lim_{t\to\infty} H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1 \vee \bigvee_{k=t}^{\infty} T^{-k}\xi_2\Big) = H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1\Big). \qquad (2.3.4)$$

For any positive integer t it follows from $\xi_2 \subset \bigvee_{i=t}^{\infty} T^{-i}\xi_2$ that

$$H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1\Big) \geq H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1 \vee \xi_2\Big) \geq H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1 \vee \bigvee_{k=t}^{\infty} T^{-k}\xi_2\Big) \qquad (1.5.10)$$

and hence, letting $t \to \infty$,

$$H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1\Big) \geq H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1 \vee \xi_2\Big) \geq H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1\Big). \qquad (2.3.4)$$

Thus we have $H(\xi_1 \mid \bigvee_{i=1}^{\infty} T^{-i}\xi_1 \vee \xi_2) = H(\xi_1 \mid \bigvee_{i=1}^{\infty} T^{-i}\xi_1)$ and