Conditional mean information and conditional mean entropy
Citation for published version (APA): Nijst, A. G. P. M. (1971). Conditional mean information and conditional mean entropy. Technische Hogeschool Eindhoven. https://doi.org/10.6100/IR29048
DOI: 10.6100/IR29048
Document status and date: Published: 01/01/1971
CONDITIONAL MEAN INFORMATION AND
CONDITIONAL MEAN ENTROPY
PROEFSCHRIFT

(Thesis) to obtain the degree of Doctor in the Technical Sciences at the Technische Hogeschool Eindhoven, by authority of the Rector Magnificus, Prof. Dr. Ir. A. A. Th. M. van Trier, to be defended in public before a committee of the Senate on Tuesday 15 June 1971 at 16.00 hours

by

ARTHUR GABRIEL PAUL MARIA NIJST

This thesis has been approved by the promotor.
CONTENTS
Introduction and summary

Chapter I    Preliminaries
  1.1. Some measure theoretic propositions
  1.2. Separable σ-fields
  1.3. L₁-preserving transformations
  1.4. The individual ergodic theorem
  1.5. Conditional information and entropy
  1.6. Mean information and entropy

Chapter II   The case of a measure preserving transformation
  2.1. Introduction
  2.2. Periodic sub-σ-fields
  2.3. Mean entropy zero
  2.4. Invariant sub-σ-fields
  2.5. Properties of conditional mean entropy

Chapter III  The case of an equivalent invariant probability
  3.1. Introduction
  3.2. Existence of conditional mean information and entropy
  3.3. Some properties of conditional mean entropy
  3.4. Decomposition of the Kolmogorov-Sinai invariant

Chapter IV   The case of an L₁-preserving transformation
  4.1. PARRY's conditional mean entropy
  4.2. PARRY's conditional mean information

References

Samenvatting (Summary in Dutch)
INTRODUCTION AND SUMMARY
Let (X, ℛ, P) be a probability space. Let ξ = {A_i}_{i=1}^∞ be a (measurable) partition of X and let ℛ₀ and ℛ₁ be sub-σ-fields of ℛ. Then the concepts of conditional information of ξ with respect to ℛ₀, denoted by I(ξ|ℛ₀), and conditional entropy of ℛ₁ with respect to ℛ₀, denoted by H(ℛ₁|ℛ₀), are well known (cf. JACOBS [10], NEVEU [14], [15]; cf. also the definitions 1.5.1 and 1.5.8). Now let T be a non-singular transformation on (X, ℛ, P). If T is measure preserving, then by a generalization of McMILLAN's theorem given by CHUNG (cf. PARRY [18], p. 20, theorem 2.5; cf. also theorem 1.6.5), for any partition ξ with finite entropy the sequence

    { (1/t) I( ∨_{i=0}^{t-1} T^{-i}ξ ) }_{t=1}^∞

converges a.e. and in L₁-norm to a function called the mean information of ξ. Consequently also the sequence

    { (1/t) H( ∨_{i=0}^{t-1} T^{-i}ξ ) }_{t=1}^∞

converges to a number called the mean entropy of ξ.
If T admits an invariant probability P' equivalent with P, then for any finite partition ξ the sequence

    { (1/t) I( ∨_{i=0}^{t-1} T^{-i}ξ ) }_{t=1}^∞

still converges in L₁-norm, as proved by JACOBS [10], p. 318, theorem 3. Consequently the existence of the mean entropy of ξ is guaranteed in this case as well.
Let T be a measure preserving invertible transformation, let ξ be a partition with finite entropy and let ℛ₀ be a sub-σ-field of ℛ with the property T⁻¹ℛ₀ = ℛ₀. In this setting NEVEU [14] has defined the conditional mean entropy of ξ with respect to ℛ₀ by

    H( ξ | ℛ₀ ∨ ∨_{i=1}^∞ T^{-i}ξ ).

It is easy to see from some simple properties of conditional entropy (cf. 2.1.3) that in this case we have

    H( ξ | ℛ₀ ∨ ∨_{i=1}^∞ T^{-i}ξ ) = lim_{t→∞} (1/t) H( ∨_{i=0}^{t-1} T^{-i}ξ | ℛ₀ ).

In this thesis we shall investigate some conditions on the transformation T and on the sub-σ-field ℛ₀ of ℛ which guarantee the existence of conditional mean information and conditional mean entropy with respect to ℛ₀ for any partition ξ with finite entropy (or for any finite partition ξ, as in chapter III). Furthermore we shall establish some properties of conditional mean information and conditional mean entropy.
For the convenience of the reader some basic concepts and results are recalled in chapter I. Proofs are given if they are rather difficult to find in the literature. Furthermore we have collected in the sections 1.5 and 1.6 some well-known properties of (mean) information and (mean) entropy. Chapter II contains results on conditional mean information and conditional mean entropy in the case that T is measure preserving (not necessarily invertible). The major results of chapter II concern the existence and the properties of conditional mean information and conditional mean entropy of a partition ξ with finite entropy, conditioning by an invariant sub-σ-field ℛ₀ of ℛ (section 2.4), and the generalization of NEVEU's theorem [14], 3.2 given in theorem 2.5.9.
In chapter III we introduce the concept of conditional mean entropy in the case that T admits an invariant probability P' equivalent to P, for a finite partition ξ and conditioning by an invariant sub-σ-field ℛ₀ of ℛ. The method employed in this chapter essentially corresponds to JACOBS' method of introducing the mean entropy of a finite partition ξ as mentioned above. Furthermore in this chapter we discuss another generalization of NEVEU's theorem mentioned above and a decomposition of mean entropy in the case that T is invertible but not necessarily measure preserving. Finally, in chapter IV, PARRY's results on mean information and mean entropy for an L₁-preserving transformation [17] are extended by again admitting a conditioning sub-σ-field. In the special case that T is measure preserving these concepts of conditional mean information and conditional mean entropy reduce to the corresponding concepts discussed in chapter II. In the case that T is L₁-preserving and admits an invariant probability P' equivalent to P this need not be true, as will be shown in an example.
With respect to the notation we remark that the set of positive integers is denoted by ℕ and the set of real numbers by ℝ. Let ℛ₁ and ℛ₂ be σ-fields of subsets of a set X; then the smallest σ-field of subsets of X containing ℛ₁ and ℛ₂ is denoted by ℛ₁ ∨ ℛ₂. Similar notations are used in the case that we have a (possibly infinite) class of σ-fields of subsets of X. If A and B are subsets of X we denote by A\B the difference of A and B, i.e. A\B = {x | x ∈ A, x ∉ B}. Furthermore we shall use the notations Aᶜ = X\A and A △ B = (A\B) ∪ (B\A). Other notations will be explained at the place where they turn up.

PRELIMINARIES
For the convenience of the reader we shall collect in this chapter some preliminaries. Proofs will be given if they are short and not easily accessible in the literature. For other proofs we refer to the literature.
In this chapter, as in the following chapters, (X, ℛ, P) will be a probability space, i.e. X is an abstract set, ℛ is a σ-field of subsets of X and P is a probability on ℛ, i.e. a positive σ-additive set function on ℛ with the property P(X) = 1. Furthermore we shall denote the set of all sub-σ-fields of ℛ by σℛ. For any A ∈ ℛ we denote the characteristic function of A by χ_A.
§1. SOME MEASURE THEORETIC PROPOSITIONS
Definition 1.1.1: Let P' be a probability on the measurable space (X, ℛ). The probability P' is called absolutely continuous with respect to P if for any A ∈ ℛ such that P(A) = 0 we have P'(A) = 0. If also P is absolutely continuous with respect to P', then the probabilities P and P' are called equivalent. □

Proposition 1.1.2: Let P' be a probability on (X, ℛ) equivalent with P. Then for any ε > 0 there exists a δ > 0 such that for any A ∈ ℛ with P'(A) < δ we have P(A) < ε.
Proof: J. NEVEU [13], Corollary 1 of proposition IV.1.3. □

Remark 1.1.3: Let T be a measurable transformation on the probability space (X, ℛ, P) (cf. §3). An important problem in ergodic theory is to find a T-invariant measure P' such that P is absolutely continuous with respect to P'. Many necessary and sufficient conditions for the existence of such a probability P' can be found in Y. ITO: Invariant measures for Markov processes [9]. ITO also proved that if there exists a probability P' with the above mentioned properties, then there even exists a probability P'' which is T-invariant and equivalent with P. □
Let ℛ₀ ∈ σℛ and let f be a quasi-integrable function on (X, ℛ, P), i.e. f is a (ℛ-)measurable function on X with the property that sup(f,0) is integrable or sup(−f,0) is integrable. Then the set function Q on ℛ₀ defined by Q(A) = ∫_A f dP for any A ∈ ℛ₀ is a signed measure on (X, ℛ₀) absolutely continuous with respect to P (HALMOS [7], p. 118 and p. 124). Hence by the theorem of RADON-NIKODYM (HALMOS [7], p. 128, theorem B) there exists a modulo P uniquely determined ℛ₀-measurable quasi-integrable function f₀ on X such that

    ∫_A f₀ dP = ∫_A f dP   for any A ∈ ℛ₀.

The operator E_{ℛ₀}, defined on the set of quasi-integrable functions on (X, ℛ, P) by E_{ℛ₀}f = f₀, is called the conditional expectation with respect to ℛ₀. The function f₀ is called the conditional expectation of f with respect to ℛ₀. The set function P_{ℛ₀} defined on ℛ by P_{ℛ₀}(A) = E_{ℛ₀}(χ_A) for any A ∈ ℛ is called the conditional probability with respect to ℛ₀; P_{ℛ₀}(A) is called the conditional probability of A with respect to ℛ₀. Notice that for any f ∈ L₁(X, ℛ, P), i.e. for any integrable function f, we have E_{ℛ₀}f ∈ L₁(X, ℛ, P).
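When ℛ₀ is generated by a finite partition of a finite probability space, E_{ℛ₀}f is simply the P-weighted average of f over each cell. The following sketch (plain Python, with a small hypothetical example) checks the defining identity ∫_A f₀ dP = ∫_A f dP for each generating cell.

```python
# Conditional expectation on a finite probability space, where the
# sub-sigma-field R0 is generated by a finite partition: E[f | R0] is
# constant on each cell, equal to the P-weighted average of f there.

def cond_exp(f, p, partition):
    """f, p: values/probabilities indexed by point; partition: cells of R0."""
    f0 = [0.0] * len(f)
    for cell in partition:
        mass = sum(p[x] for x in cell)
        avg = sum(f[x] * p[x] for x in cell) / mass
        for x in cell:
            f0[x] = avg
    return f0

p = [0.1, 0.2, 0.3, 0.4]      # probability on X = {0, 1, 2, 3}
f = [1.0, 3.0, 2.0, 5.0]
cells = [[0, 1], [2, 3]]      # partition generating R0
f0 = cond_exp(f, p, cells)

# defining property: integrals over every R0-measurable cell agree
for cell in cells:
    assert abs(sum(f0[x] * p[x] for x in cell)
               - sum(f[x] * p[x] for x in cell)) < 1e-12
```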
Proposition 1.1.4: Let ℛ₀ ∈ σℛ. Then the conditional expectation with respect to ℛ₀ has the following properties:
a) For any pair of quasi-integrable functions f and g satisfying the condition that f is ℛ₀-measurable and f·g is quasi-integrable we have a.e. E_{ℛ₀}(f·g) = f·E_{ℛ₀}g.
b) Let ℛ₀ ∈ σℛ, ℛ₁ ∈ σℛ and suppose ℛ₀ ⊂ ℛ₁. Let f be a quasi-integrable function. Then we have a.e.

    E_{ℛ₀}E_{ℛ₁}f = E_{ℛ₁}E_{ℛ₀}f = E_{ℛ₀}f.

Proof: NEVEU [13], IV.3, formulas (1) and (2). □
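For nested σ-fields generated by finite partitions, part b) can be checked directly: averaging first over the finer cells and then over the coarser ones gives the same result as averaging over the coarser cells at once. A small numerical sketch (hypothetical data):

```python
# Tower property E_{R0} E_{R1} f = E_{R0} f for nested finite partitions
# (R0 coarser than R1) on a finite probability space.

def cond_exp(f, p, partition):
    f0 = [0.0] * len(f)
    for cell in partition:
        mass = sum(p[x] for x in cell)
        avg = sum(f[x] * p[x] for x in cell) / mass
        for x in cell:
            f0[x] = avg
    return f0

p = [0.15, 0.05, 0.30, 0.25, 0.10, 0.15]
f = [2.0, -1.0, 4.0, 0.5, 3.0, 1.0]
fine = [[0, 1], [2], [3, 4], [5]]      # generates R1
coarse = [[0, 1, 2], [3, 4, 5]]        # generates R0; R0 is a sub-field of R1

lhs = cond_exp(cond_exp(f, p, fine), p, coarse)   # E_{R0} E_{R1} f
rhs = cond_exp(f, p, coarse)                      # E_{R0} f
assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
```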
Definition 1.1.5: Let ℛ₀ ∈ σℛ and A ∈ ℛ. Then the measurable hull <A>_{ℛ₀} of A is the modulo P uniquely determined ℛ₀-measurable set with the properties:
a) A ⊂ <A>_{ℛ₀} [P];
b) P(<A>_{ℛ₀}) = inf { P(B) | B ⊃ A, B ∈ ℛ₀ }. □

Proposition 1.1.6: Let ℛ₀ ∈ σℛ and A ∈ ℛ. Then

    <A>_{ℛ₀} = {x ∈ X | P_{ℛ₀}(A)(x) > 0} [P].

Proof: NEVEU [15], lemme 1. □
Proposition 1.1.7: Let ℛ₀ ∈ σℛ and let A ∈ ℛ be such that ℛ ∩ A = ℛ₀ ∩ A. Let f be measurable and non-negative. Then

    E_{ℛ₀}(f·χ_A) = P_{ℛ₀}(A)·f   a.e. on A.

Proof: NEVEU [15], lemme 2. □
Proposition 1.1.8: Let ℛ₀ ∈ σℛ, B ∈ ℛ₀ and C ∈ ℛ. Then <B ∩ C>_{ℛ₀} = B ∩ <C>_{ℛ₀} [P].
Proof: First we note that <B ∩ C>_{ℛ₀} ⊂ B ∩ <C>_{ℛ₀} [P]. Furthermore C ⊂ <B ∩ C>_{ℛ₀} ∪ <Bᶜ ∩ C>_{ℛ₀} and hence

    <C>_{ℛ₀} ⊂ <B ∩ C>_{ℛ₀} ∪ <Bᶜ ∩ C>_{ℛ₀}.

Thus, because of <B ∩ C>_{ℛ₀} ⊂ B and <Bᶜ ∩ C>_{ℛ₀} ⊂ Bᶜ, it follows that B ∩ <C>_{ℛ₀} ⊂ <B ∩ C>_{ℛ₀} and the proposition is proved. □
Definition 1.1.9: Let {f_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P). Then {f_n}_{n=1}^∞ is called uniformly integrable if

    lim_{a→∞} sup_{n∈ℕ} ∫_{{|f_n|>a}} |f_n| dP = 0.

The sequence {f_n}_{n=1}^∞ of a.e. finite measurable functions converges in probability to the a.e. finite measurable function f if for any ε > 0 we have lim_{n→∞} P({|f_n − f| > ε}) = 0. □

Proposition 1.1.10: Let {f_n}_{n=1}^∞ be a sequence of a.e. finite measurable functions which converges pointwise a.e. to an a.e. finite measurable function f. Then the sequence {f_n}_{n=1}^∞ converges in probability to f.
Proof: J. NEVEU [13], proposition II.4.3. □
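The two notions in definition 1.1.9 are genuinely different: on ([0,1], Lebesgue) the sequence f_n = n·χ_{[0,1/n]} converges to 0 in probability (even a.e.), but it is not uniformly integrable, and indeed ∫|f_n| dP = 1 for all n. The computation below (plain Python; the values for this particular sequence are exact) makes the failure of uniform integrability explicit.

```python
# f_n = n * indicator([0, 1/n]) on [0,1] with Lebesgue measure.
# Exact values for this sequence:
#   P(|f_n| > eps) = 1/n for 0 < eps < n    -> 0  (convergence in probability)
#   integral of |f_n| over {|f_n| > a} = 1 whenever n > a
# so sup_n of the tail integral is 1 for every a: not uniformly integrable.

def tail_integral(n, a):
    """integral of |f_n| over the set {|f_n| > a}."""
    return 1.0 if n > a else 0.0

def prob_exceeds(n, eps):
    """P({|f_n - 0| > eps}) for 0 < eps < n."""
    return 1.0 / n

for a in [1, 10, 1000]:
    sup_tail = max(tail_integral(n, a) for n in range(1, 10 * a + 2))
    assert sup_tail == 1.0            # tail integrals do not vanish

assert prob_exceeds(10**6, 0.5) == 1e-06   # yet f_n -> 0 in probability
```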
Proposition 1.1.11: Let P and P' be equivalent probabilities on ℛ. Suppose that {f_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P'), f ∈ L₁(X, ℛ, P') and f_n → f in probability P'. Then f_n → f in probability P.
Proof: Let ε > 0; then by proposition 1.1.2 there exists a δ > 0 such that for any A ∈ ℛ with P'(A) < δ we have P(A) < ε. Hence for η > 0 it follows from lim_{n→∞} P'({|f_n − f| > η}) = 0 that lim_{n→∞} P({|f_n − f| > η}) = 0. □
Proposition 1.1.12: Suppose that {g_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P) is uniformly integrable and {f_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P), |f_n| ≤ |g_n|, n = 1, 2, .... Then the sequence {f_n}_{n=1}^∞ is uniformly integrable.
Proof: For any a > 0 we have {|f_n| > a} ⊂ {|g_n| > a} and |f_n| ≤ |g_n|. Hence

    lim_{a→∞} sup_{n∈ℕ} ∫_{{|f_n|>a}} |f_n| dP ≤ lim_{a→∞} sup_{n∈ℕ} ∫_{{|g_n|>a}} |g_n| dP = 0. □

Proposition 1.1.13: Let {f_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P). Then the following two assertions are equivalent:
1) The sequence {f_n}_{n=1}^∞ converges in L₁-norm to f ∈ L₁(X, ℛ, P).
2) The sequence {f_n}_{n=1}^∞ is uniformly integrable and converges in probability to f.
Proof: J. NEVEU [13], proposition II.5.4. □

§2. SEPARABLE σ-FIELDS
Definition 1.2.1: Let {A_n}_{n=1}^∞ be a sequence of measurable subsets of X and suppose that ℛ' ∈ σℛ satisfies the following two conditions:
1) ℛ' ⊃ {A_n}_{n=1}^∞;
2) for all ℛ* ∈ σℛ: ℛ* ⊃ {A_n}_{n=1}^∞ implies ℛ* ⊃ ℛ';
then we say that ℛ' is generated by {A_n}_{n=1}^∞. Sub-σ-fields of ℛ generated by a sequence of measurable subsets of X are called countably generated sub-σ-fields of ℛ. □

In definition 1.2.1 and throughout what follows, two sub-σ-fields ℛ₁ and ℛ₂ of ℛ are identified if for any A₁ ∈ ℛ₁ there exists a set A₂ ∈ ℛ₂ such that P(A₁ △ A₂) = 0, and conversely.

Definition 1.2.2: A (measurable) partition ξ = {A_i}_{i=1}^∞ of X is a countable collection of pairwise disjoint measurable sets with union X. □

We shall identify two partitions ξ and ξ' which have, modulo P-nullsets, the same elements. In order to simplify the terminology we shall also identify a partition and the σ-field generated by it.

Definition 1.2.3: Let ℛ₀ ∈ σℛ. Then ℛ₀ is called separable if there exists a sequence {A_n}_{n=1}^∞ of elements of ℛ₀ with the following property: for any A ∈ ℛ₀ and for every ε > 0 there is a number n₀ such that P(A △ A_{n₀}) < ε. □
Proposition 1.2.4: Let ℛ₀ ∈ σℛ. The following two assertions are equivalent:
a) ℛ₀ is separable;
b) ℛ₀ is countably generated.
Proof: a) ⟹ b) Suppose that ℛ₀ is separable, let {A_n}_{n=1}^∞ be a sequence in ℛ₀ satisfying the condition of definition 1.2.3 and denote by ℛ₀' the sub-σ-field of ℛ generated by {A_n}_{n=1}^∞. For every fixed A ∈ ℛ₀ and for any positive integer k there exists a positive integer n_k such that P(A △ A_{n_k}) < 2^{-k}. Let Ã_k = A_{n_k}, k = 1, 2, ..., and

    A' = lim sup_{k→∞} Ã_k = ∩_{n=1}^∞ ∪_{k=n}^∞ Ã_k;

then A' ∈ ℛ₀' and the sequence {∪_{k=n}^∞ Ã_k}_{n=1}^∞ is monotone decreasing. Hence for any ε > 0 there exists a positive integer N(ε) such that P(A' △ ∪_{k=n}^∞ Ã_k) ≤ ½ε for all n ≥ N(ε). Thus

    P(A △ A') ≤ P(A △ ∪_{k=n}^∞ Ã_k) + P(∪_{k=n}^∞ Ã_k △ A')
             ≤ Σ_{k=n}^∞ P(A △ Ã_k) + ½ε ≤ 2^{-(n-1)} + ½ε < ε

if n is sufficiently large. Since this is true for any ε > 0 we have P(A △ A') = 0 and therefore A ∈ ℛ₀'.
b) ⟹ a) Let ℛ₀ be countably generated and let {A_n}_{n=1}^∞ be a sequence of elements of ℛ₀ generating ℛ₀. Without restriction of generality we may suppose that A₁ = X and that the sequence {A_n}_{n=1}^∞ together with A_n also contains Ā_n = X\A_n for every n. Let 𝒰 be the field of finite unions of finite intersections of elements of the sequence {A_n}_{n=1}^∞. Then 𝒰 contains countably many elements (HALMOS [7], p. 23, theorem C) and is dense in ℛ₀, i.e. for any A ∈ ℛ₀ and every ε > 0 there is a set A' ∈ 𝒰 such that P(A △ A') < ε (HALMOS [7], p. 56, theorem D). □
Proposition 1.2.5: Let ℛ be separable and ℛ₀ ∈ σℛ. Then ℛ₀ is separable.
Proof: There exists a sequence {F_n}_{n=1}^∞ of elements of ℛ such that for any ε > 0 and A ∈ ℛ there is a positive integer n with the property P(A △ F_n) < ε. Let

    𝒰_{n,k} = { A ∈ ℛ₀ | P(F_k △ A) < 1/n }

for n = 1, 2, ... and k = 1, 2, .... If we choose from every non-empty 𝒰_{n,k} one element, we obtain a countable set of elements of ℛ₀, which we arrange in a sequence {A_i}_{i=1}^∞. For any A ∈ ℛ₀ and any positive integer n there is a positive integer k such that P(F_k △ A) < 1/(2n). Because 𝒰_{2n,k} ≠ ∅ there exists an i such that P(F_k △ A_i) < 1/(2n). Hence P(A △ A_i) < 1/n. Because n runs through all positive integers, we have proved the proposition. □
Corollary 1.2.6: Let ℛ be separable and ℛ₀ ∈ σℛ. Then there exists a sequence of finite partitions {ξ_n}_{n=1}^∞ of X such that ℛ₀ = ∨_{n=1}^∞ ξ_n. □
The following propositions will be important in connection with the definition of conditional entropy (cf. §5).

Proposition 1.2.7: Let ℛ₀ ∈ σℛ, ℛ₁ ∈ σℛ and let ξ be a partition such that ℛ₀ ⊂ ℛ₁ ⊂ ℛ₀ ∨ ξ. Then there exists a partition η such that ℛ₀ ∨ η = ℛ₁.
Proof: Let ξ = {A_i}_{i=1}^∞ and let <A_i>_{ℛ₁}, i = 1, 2, ..., be the measurable hulls of the sets A_i with respect to ℛ₁. Let C_i ∈ ℛ₁ with C_i ⊂ <A_i>_{ℛ₁}. It follows from ℛ₁ ∨ ξ = ℛ₀ ∨ ξ that there exists an element D_i ∈ ℛ₀ such that D_i ∩ A_i = C_i ∩ A_i. We see that C_i △ (D_i ∩ <A_i>_{ℛ₁}) ∈ ℛ₁ and C_i △ (D_i ∩ <A_i>_{ℛ₁}) ⊂ <A_i>_{ℛ₁}\A_i; since an ℛ₁-measurable subset of <A_i>_{ℛ₁}\A_i must be a nullset by the minimality of the hull, this implies P(C_i △ (D_i ∩ <A_i>_{ℛ₁})) = 0. Hence C_i = D_i ∩ <A_i>_{ℛ₁} [P] and

    ℛ₁ ∩ <A_i>_{ℛ₁} = ℛ₀ ∩ <A_i>_{ℛ₁},   i = 1, 2, ....

Define E₁ = <A₁>_{ℛ₁} and E_n = <A_n>_{ℛ₁} \ ∪_{i=1}^{n-1} E_i, n ≥ 2, and let η = {E_i}_{i=1}^∞. Then ℛ₀ ∨ η = ℛ₁ and the assertion is proved. □
The following two propositions can be found as proposition 1) in NEVEU [14].

Proposition 1.2.8: Let ℛ₀ ∈ σℛ and let ξ = {A_i}_{i=1}^∞ be a partition of X. Let f ∈ L₁(X, ℛ, P), f ≥ 0. Then

    E_{ℛ₀ ∨ ξ} f = Σ_{i=1}^∞ [ E_{ℛ₀}(χ_{A_i}·f) / P_{ℛ₀}(A_i) ] χ_{A_i}   a.e.

Proof: Because of proposition 1.1.6 the right side of the equality is defined a.e. Furthermore we remark that both sides of the equality are ℛ₀ ∨ ξ-measurable functions on X. Let B ∈ ℛ₀ ∨ ξ; then for any i there exists a C_i ∈ ℛ₀ such that C_i ∩ A_i = B ∩ A_i. Hence we have

    ∫_B Σ_{i=1}^∞ [E_{ℛ₀}(χ_{A_i}·f)/P_{ℛ₀}(A_i)] χ_{A_i} dP
      = Σ_{i=1}^∞ ∫_{C_i} E_{ℛ₀}( [E_{ℛ₀}(χ_{A_i}·f)/P_{ℛ₀}(A_i)] χ_{A_i} ) dP
      = Σ_{i=1}^∞ ∫_{C_i} E_{ℛ₀}(χ_{A_i}·f) dP
      = Σ_{i=1}^∞ ∫_{A_i ∩ C_i} f dP = ∫_B f dP = ∫_B E_{ℛ₀ ∨ ξ} f dP. □

Proposition 1.2.9: Let ξ = {A_i}_{i=1}^∞ be a countable partition and let ℛ₀ ∈ σℛ. Then for any A ∈ ℛ₀ ∨ ξ there exists a uniquely determined sequence {B_i}_{i=1}^∞ ⊂ ℛ₀ such that B_i ⊂ <A_i>_{ℛ₀}, i = 1, 2, ..., and such that

    A = ∪_{i=1}^∞ (A_i ∩ B_i).

Proof: Let 𝒰 = { A ∈ ℛ | A = ∪_{i=1}^∞ (A_i ∩ B_i), B_i ∈ ℛ₀, B_i ⊂ <A_i>_{ℛ₀}, i = 1, 2, ... }. Then 𝒰 ∈ σℛ and 𝒰 ⊂ ℛ₀ ∨ ξ. Furthermore we have ξ ⊂ 𝒰 and ℛ₀ ⊂ 𝒰 and hence 𝒰 = ℛ₀ ∨ ξ.
Now let ∪_{i=1}^∞ (A_i ∩ B_i') = ∪_{i=1}^∞ (A_i ∩ B_i'') with B_i', B_i'' ∈ ℛ₀, B_i' ⊂ <A_i>_{ℛ₀}, B_i'' ⊂ <A_i>_{ℛ₀}, i = 1, 2, .... Then A_i ∩ B_i' = A_i ∩ B_i'', i = 1, 2, ..., and hence by proposition 1.1.8:

    B_i' = <A_i>_{ℛ₀} ∩ B_i' = <A_i ∩ B_i'>_{ℛ₀} = <A_i ∩ B_i''>_{ℛ₀} = <A_i>_{ℛ₀} ∩ B_i'' = B_i'',   i = 1, 2, ...,

and the proposition is proved. □
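For finite partitions the decomposition formula of proposition 1.2.8 can be verified by direct computation: on each A_i the conditional expectation given ℛ₀ ∨ ξ is the quotient of two ℛ₀-conditional expectations. A small sketch (plain Python, hypothetical data):

```python
# Verify E_{R0 v xi} f = sum_i E_{R0}(chi_{A_i} f) / P_{R0}(A_i) * chi_{A_i}
# on a finite probability space, with R0 and xi generated by finite partitions.

def cond_exp(f, p, partition):
    f0 = [0.0] * len(f)
    for cell in partition:
        mass = sum(p[x] for x in cell)
        avg = sum(f[x] * p[x] for x in cell) / mass
        for x in cell:
            f0[x] = avg
    return f0

def refine(part1, part2):
    """common refinement: the partition generating part1 v part2."""
    return [sorted(set(c) & set(d)) for c in part1 for d in part2
            if set(c) & set(d)]

p = [0.10, 0.15, 0.05, 0.20, 0.25, 0.25]
f = [1.0, 4.0, 2.0, 0.0, 3.0, 5.0]
r0 = [[0, 1, 2], [3, 4, 5]]            # generates R0
xi = [[0, 3], [1, 4], [2, 5]]          # the partition xi

lhs = cond_exp(f, p, refine(r0, xi))   # E_{R0 v xi} f

rhs = [0.0] * len(f)
for cell in xi:                        # one term of the sum per A_i
    chi_f = [f[x] if x in cell else 0.0 for x in range(len(f))]
    chi = [1.0 if x in cell else 0.0 for x in range(len(f))]
    num = cond_exp(chi_f, p, r0)       # E_{R0}(chi_{A_i} f)
    den = cond_exp(chi, p, r0)         # P_{R0}(A_i)
    for x in cell:
        rhs[x] = num[x] / den[x]

assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
```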
§3. L₁-PRESERVING TRANSFORMATIONS

In this section, as in all that follows, T will be a measurable transformation on (X, ℛ, P), i.e. a transformation with the property T⁻¹ℛ ⊂ ℛ. If ℛ₀ ∈ σℛ and if f is any ℛ₀-measurable function, then Tf is the T⁻¹ℛ₀-measurable function defined by (Tf)(x) = f(Tx).
Proposition 1.3.1: Let, for ℛ₁ ∈ σℛ, the conditional expectations E^P_{ℛ₁} and E^{PT⁻¹}_{ℛ₁} be taken with respect to the probabilities P and PT⁻¹ respectively. Then for any ℛ₁ ∈ σℛ and for any f ≥ 0 we have a.e.

    E^P_{T⁻¹ℛ₁}(Tf) = T( E^{PT⁻¹}_{ℛ₁} f ).

In particular, choosing ℛ₁ = {∅, X}, we have

    ∫_X Tf dP = ∫_X f dPT⁻¹.

Proof: Both T(E^{PT⁻¹}_{ℛ₁} f) and E^P_{T⁻¹ℛ₁}(Tf) are T⁻¹ℛ₁-measurable functions on X. Suppose that A ∈ ℛ₁ and f = χ_A; then

    ∫_{T⁻¹A} Tχ_A dP = P(T⁻¹A) = PT⁻¹(A) = ∫_A χ_A dPT⁻¹.

Hence by monotone approximation, for any f ≥ 0 and A ∈ ℛ₁ it follows that

    ∫_{T⁻¹A} Tf dP = ∫_A f dPT⁻¹.

Now let B = T⁻¹A, A ∈ ℛ₁. Then we have

    ∫_B T(E^{PT⁻¹}_{ℛ₁} f) dP = ∫_A E^{PT⁻¹}_{ℛ₁} f dPT⁻¹ = ∫_A f dPT⁻¹ = ∫_B Tf dP = ∫_B E^P_{T⁻¹ℛ₁}(Tf) dP. □
Definition 1.3.2: The transformation T is called P-preserving if for any A ∈ ℛ we have P(T⁻¹A) = P(A). The transformation T is called L₁-preserving if f ∈ L₁(X, ℛ, P) if and only if Tf ∈ L₁(X, T⁻¹ℛ, P). The transformation T is called non-singular if for any A ∈ ℛ we have P(A) = 0 if and only if P(T⁻¹A) = 0. The transformation T is called invertible if T is a one-to-one transformation of X onto X with the property Tℛ = ℛ. □
The following proposition is a slight modification of a lemma due to DUNFORD-MILLER [4], p. 539, lemma 1 (cf. also DUNFORD-SCHWARTZ [5], p. 664, lemma 7).

Proposition 1.3.3: The following statements are equivalent:
a) T is L₁-preserving;
b) the probabilities P and PT⁻¹ on ℛ are equivalent and there exists a positive integer K such that 1/K ≤ dPT⁻¹/dP ≤ K;
c) there exists a positive integer K such that for any f ∈ L₁(X, ℛ, P)

    (1/K) ∫_X |f| dP ≤ ∫_X |Tf| dP ≤ K ∫_X |f| dP.

Proof: a) ⟹ b) Let A ∈ ℛ, P(A) = 0 and P(T⁻¹A) > 0. Define f = ∞ on A and f = 0 on Aᶜ; then ∫_X f dP = 0 and by proposition 1.3.1

    ∫_X Tf dP = ∫_X f dPT⁻¹ = ∞.

This is in contradiction with a). Similarly it follows from P(T⁻¹A) = 0 that P(A) = 0; hence P and PT⁻¹ are equivalent.
Let A_n = {x ∈ X | n < (dPT⁻¹/dP)(x) ≤ n+1}. Suppose that there exists an increasing sequence {n_k}_{k=1}^∞ of positive integers such that P(A_{n_k}) > 0, k = 1, 2, .... Let a_k = 1/(k·n_k·P(A_{n_k})), k = 1, 2, ..., and f = Σ_{k=1}^∞ a_k χ_{A_{n_k}}. Then

    ∫_X f dP = Σ_{k=1}^∞ a_k P(A_{n_k}) = Σ_{k=1}^∞ 1/(k·n_k) < ∞

(note that n_k ≥ k), and by proposition 1.3.1

    ∫_X Tf dP = ∫_X f (dPT⁻¹/dP) dP ≥ Σ_{k=1}^∞ a_k n_k P(A_{n_k}) = Σ_{k=1}^∞ 1/k = ∞.

This is in contradiction with a), and hence there exists a positive integer K₁ such that dPT⁻¹/dP ≤ K₁. Similarly we prove that there exists a positive integer K₂ such that dP/dPT⁻¹ ≤ K₂. Choose K = max(K₁, K₂); then assertion b) is proved.
b) ⟹ c) Let f ∈ L₁(X, ℛ, P); then by proposition 1.3.1

    ∫_X |Tf| dP = ∫_X |f| (dPT⁻¹/dP) dP ≤ K ∫_X |f| dP

and

    ∫_X |Tf| dP = ∫_X |f| (dPT⁻¹/dP) dP ≥ (1/K) ∫_X |f| dP.

c) ⟹ a) This assertion is trivial. □
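On a finite probability space an invertible map is automatically L₁-preserving, and a constant K as in the proposition can be computed as the largest ratio between the densities of PT⁻¹ and P. A small numerical sketch (plain Python, hypothetical data) checking the inequalities of part c):

```python
# X = {0,...,4}, T a permutation of X. PT^{-1}({x}) = P(T^{-1}{x}), so the
# density is d(x) = p[Tinv(x)] / p[x]; K = max(max d, max 1/d) works in c).
import random

p = [0.05, 0.10, 0.20, 0.25, 0.40]
T = [2, 0, 3, 4, 1]                 # T(x) = T[x], a permutation
Tinv = [T.index(x) for x in range(len(p))]

d = [p[Tinv[x]] / p[x] for x in range(len(p))]   # dPT^{-1}/dP
K = max(max(d), max(1.0 / v for v in d))

random.seed(0)
for _ in range(100):
    f = [random.uniform(-5, 5) for _ in range(len(p))]
    int_f = sum(abs(v) * q for v, q in zip(f, p))
    int_Tf = sum(abs(f[T[x]]) * p[x] for x in range(len(p)))  # |Tf| = |f o T|
    assert int_f / K - 1e-12 <= int_Tf <= K * int_f + 1e-12
```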
Proposition 1.3.4: Suppose that TX = X. Then for any A ∈ T⁻¹ℛ we have TA ∈ ℛ.
Proof: Let A = T⁻¹B, B ∈ ℛ. Then TA = B ∩ TX = B ∈ ℛ. □

Let Q be a probability on (X, ℛ) such that P and Q are equivalent. If not stated otherwise we shall use the notation dQ/dP for the ℛ-measurable density of Q with respect to P, i.e. for any A ∈ ℛ we have ∫_A (dQ/dP) dP = Q(A). The existence of dQ/dP follows by the theorem of RADON-NIKODYM (HALMOS [7], p. 128, theorem B). Let T be L₁-preserving with the property TX = X. Then because of proposition 1.3.3 the probabilities P and PTⁿ on T⁻ⁿℛ are equivalent.
In the rest of this section we shall suppose that TX = X and that T is an L₁-preserving transformation on (X, ℛ, P).
From DOWKER [3] we adopt the following definition 1.3.5, which will serve to formulate the HUREWICZ version of the individual ergodic theorem in the following section. We shall also collect for later use some properties of the functions wₙ', n = 0, 1, 2, ..., which will be defined presently.

Definition 1.3.5: For any positive integer n the T⁻ⁿℛ-measurable function wₙ' is defined by wₙ' = dPTⁿ/dP. Furthermore we define w₀' = 1. □
Proposition 1.3.6: The sequence {wₙ'}_{n=0}^∞ has the following properties:
a) wₙ' = E_{T⁻ⁿℛ}(Tⁿ⁻¹w₁' ··· Tw₁'·w₁'), n ≥ 1;
b) ∫_X Tⁿf·wₙ' dP = ∫_X f dP, n ≥ 0, f ∈ L₁(X, ℛ, P);
c) if ρ = dPT⁻¹/dP, then Tρ·w₁' = 1.

Proof: First we show that for any f ∈ L₁(X, ℛ, P) and n ≥ 0 we have

    ∫_X Tⁿf · Tⁿ⁻¹w₁' ··· Tw₁'·w₁' dP = ∫_X f dP.    (1)

Let f = χ_A, A ∈ ℛ and n = 1. Then

    ∫_X Tf·w₁' dP = ∫_X χ_{T⁻¹A} dPT = P(T T⁻¹A) = P(A) = ∫_X f dP.

Hence for n = 1 the assertion is true for every step function and also for any f ∈ L₁(X, ℛ, P). For n > 1 it follows by induction that

    ∫_X Tⁿf · Tⁿ⁻¹w₁' ··· Tw₁'·w₁' dP = ∫_X T(Tⁿ⁻¹f · Tⁿ⁻²w₁' ··· w₁')·w₁' dP = ∫_X Tⁿ⁻¹f · Tⁿ⁻²w₁' ··· w₁' dP = ∫_X f dP.

To prove the assertion of a) we remark that wₙ' and E_{T⁻ⁿℛ}(Tⁿ⁻¹w₁' ··· Tw₁'·w₁') are both T⁻ⁿℛ-measurable functions. Let A = T⁻ⁿB, B ∈ ℛ. Then

    ∫_A wₙ' dP = PTⁿ(A) = P(B)

and by (1)

    ∫_A E_{T⁻ⁿℛ}(Tⁿ⁻¹w₁' ··· Tw₁'·w₁') dP = ∫_X Tⁿχ_B · Tⁿ⁻¹w₁' ··· Tw₁'·w₁' dP = P(B),

and assertion a) is proved.
To prove b), let f ∈ L₁(X, ℛ, P). Then by assertion a), proposition 1.1.4 a) and formula (1) we have

    ∫_X Tⁿf·wₙ' dP = ∫_X Tⁿf · E_{T⁻ⁿℛ}(Tⁿ⁻¹w₁' ··· Tw₁'·w₁') dP = ∫_X Tⁿf · Tⁿ⁻¹w₁' ··· Tw₁'·w₁' dP = ∫_X f dP.

To prove c) we remark that Tρ·w₁' is T⁻¹ℛ-measurable. Now let A ∈ ℛ; then by (1)

    ∫_{T⁻¹A} Tρ·w₁' dP = ∫_X T(χ_A·ρ)·w₁' dP = ∫_X χ_A·ρ dP = ∫_A ρ dP = PT⁻¹(A) = P(T⁻¹A) = ∫_{T⁻¹A} 1 dP.

Hence assertion c) is also proved. □
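On a finite space the identity of part b), taken with the product weights of formula (1), is an exact telescoping: for an invertible T one has w_n(x) = P({Tⁿx})/P({x}), so Tⁿf·wₙ integrates to ∫f dP. A small sketch (plain Python, hypothetical data):

```python
# For an invertible T on a finite space, w1(x) = P({Tx})/P({x}) and the
# product chain w_n(x) = w1(T^{n-1}x)...w1(x) telescopes to P({T^n x})/P({x}),
# so that  sum_x f(T^n x) w_n(x) p(x) = sum_y f(y) p(y)  exactly.

p = [0.1, 0.3, 0.2, 0.4]
T = [1, 2, 3, 0]                       # cyclic permutation
f = [5.0, -2.0, 7.0, 1.0]

def w1(x):
    return p[T[x]] / p[x]

def w(n, x):
    """w_n(x) = product of w1 along the orbit segment x, Tx, ..., T^{n-1}x."""
    out = 1.0
    for _ in range(n):
        out *= w1(x)
        x = T[x]
    return out

def Tn(x, n):
    for _ in range(n):
        x = T[x]
    return x

int_f = sum(f[x] * p[x] for x in range(4))
for n in range(8):
    lhs = sum(f[Tn(x, n)] * w(n, x) * p[x] for x in range(4))
    assert abs(lhs - int_f) < 1e-12
```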
Remark 1.3.7: For n ≥ 1 we denote wₙ := Tⁿ⁻¹w₁' ··· Tw₁'·w₁', and w₀ := 1. □

§4. THE INDIVIDUAL ERGODIC THEOREM
We shall later make use of two versions of the individual ergodic theorem. The first one is the version dealing with a measure preserving transformation, often called BIRKHOFF's ergodic theorem. It admits as conclusion a.s. convergence as well as L₁-norm convergence. The second one deals with a transformation which is L₁-preserving and satisfies TX = X. It admits as conclusion a.s. convergence and is called HUREWICZ' ergodic theorem. Both versions can be deduced from the general ergodic theorem of CHACON-ORNSTEIN [2] (cf. also NEVEU [13], V.6).
The sub-σ-field of all T-invariant measurable subsets of ℛ will be denoted by ℛ_T, i.e. A ∈ ℛ_T iff A ∈ ℛ and T⁻¹A = A [P]. The transformation T is called ergodic if ℛ_T = {∅, X}.
Theorem 1.4.1 (BIRKHOFF): Let T be P-preserving and let f ∈ L₁(X, ℛ, P). Then we have a.e. and in L₁-norm

    lim_{n→∞} (1/n) Σ_{k=0}^{n-1} T^k f = E_{ℛ_T} f.

Consequently, if T is ergodic, we have a.e. and in L₁-norm

    lim_{n→∞} (1/n) Σ_{k=0}^{n-1} T^k f = ∫_X f dP.

Proof: J. NEVEU [13], V.6 (Corollary). □
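For the ergodic case the theorem can be watched at work numerically: an irrational rotation of the circle preserves Lebesgue measure and is ergodic, so the time averages of an integrable f tend to its space average. A small sketch (plain Python; the rotation angle and test function are of course just an example):

```python
# Birkhoff averages for the irrational rotation T(x) = x + alpha (mod 1)
# on [0,1) with Lebesgue measure; T is ergodic, so for f(x) = cos(2 pi x)
# the averages (1/n) sum_{k<n} f(T^k x) tend to the integral of f, which is 0.
import math

alpha = (math.sqrt(5) - 1) / 2          # golden rotation, irrational

def birkhoff_average(f, x, n):
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / n

f = lambda x: math.cos(2 * math.pi * x)
avg = birkhoff_average(f, 0.123, 20000)
assert abs(avg) < 1e-3                   # space average of f is 0
```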
Corollary 1.4.2: Let {f_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P), sup_n |f_n| ∈ L₁(X, ℛ, P) and f_n → f a.e., and therefore also in L₁-norm. Then

    lim_{n→∞} (1/n) Σ_{k=1}^{n} T^{n-k} f_k = E_{ℛ_T} f   a.e. and in L₁-norm.

Proof: PARRY [18], formulas 2.8 and 2.9, p. 21-23. □
Corollary 1.4.3: Let T be invertible and f ∈ L₁(X, ℛ, P). Then a.e. and in L₁-norm

    lim_{n→∞} (1/n) Σ_{i=0}^{n-1} T^{-i} f = E_{ℛ_T} f.

Proof: Because of ℛ_T = ℛ_{T⁻¹} we have a.e. and in L₁-norm

    lim_{n→∞} (1/n) Σ_{i=0}^{n-1} T^{-i} f = E_{ℛ_{T⁻¹}} f = E_{ℛ_T} f. □

The following theorem, which is a slight modification of HUREWICZ' theorem, is given by PARRY [17] and can be proved in a similar way as is done by DOWKER [3]. The conclusion that the limit function is E_{ℛ_T} f follows by the CHACON-ORNSTEIN ergodic theorem (NEVEU [13], V.6). Note that, under the hypotheses of theorem 1.4.4, fT := Tf·w₁ defines a conservative Markov endomorphism of L₁(X, ℛ, P) which admits ℛ_T as class of invariant sets.

Theorem 1.4.4 (HUREWICZ): Let T be L₁-preserving with the property TX = X and let {wₙ}_{n=0}^∞ be defined as in remark 1.3.7. Suppose that f ∈ L₁(X, ℛ, P) and Σ_{n=0}^∞ wₙ(x) = ∞ a.e. Then a.e. we have

    lim_{n→∞} [ Σ_{i=0}^n T^i f·w_i ] / [ Σ_{i=0}^n w_i ] = E_{ℛ_T} f.

Consequently, if T is ergodic, we have a.e.

    lim_{n→∞} [ Σ_{i=0}^n T^i f·w_i ] / [ Σ_{i=0}^n w_i ] = ∫_X f dP.

Notice that in the case that T is P-preserving we have w_i(x) = 1 a.e., i = 0, 1, 2, .... Hence under the condition TX = X, HUREWICZ' theorem generalizes the pointwise part of theorem 1.4.1.
Corollary 1.4.5: Let T be L₁-preserving with the property TX = X. Let {f_n}_{n=1}^∞ ⊂ L₁(X, ℛ, P) and suppose that sup_n |f_n| ∈ L₁(X, ℛ, P) and that the sequence {f_n}_{n=1}^∞ converges a.e., and hence in L₁-norm, to f ∈ L₁(X, ℛ, P). Let {wₙ}_{n=0}^∞ be defined as before and suppose that Σ_{n=0}^∞ wₙ(x) = ∞ a.e. Then

    lim_{n→∞} [ Σ_{i=0}^n T^{n-i} f_i·w_{n-i} ] / [ Σ_{i=0}^n w_i ] = E_{ℛ_T} f   a.e. □
In the proof of this corollary we shall need a lemma due to CHACON-ORNSTEIN.

Lemma 1.4.6: For any g ∈ L₁(X, ℛ, P), g ≥ 0, and any positive integer N we have a.e.

    lim_{n→∞} [Tⁿ⁻ᴺ g · w_{n-N}] / [ Σ_{i=0}^n w_i ] = 0.

Proof: CHACON-ORNSTEIN [2], lemma 2. □

Proof of corollary 1.4.5: Put F₀ = sup_m |f_m − f| and F_N = sup_{m≥N} |f_m − f|, N = 1, 2, ...; then F_N ∈ L₁(X, ℛ, P) and {F_N}_{N=1}^∞ is monotone decreasing with lim_{N→∞} F_N = 0 a.e. First we remark that, using w_{n-i} = Tⁿ⁻ᴺ(w_{N-i})·w_{n-N} for 0 ≤ i ≤ N,

    lim sup_{n→∞} | Σ_{i=0}^n [Tⁿ⁻ⁱ f_i]w_{n-i} − Σ_{i=0}^n [Tⁱ f]w_i | / Σ_{i=0}^n w_i
    = lim sup_{n→∞} | Σ_{i=0}^n [Tⁿ⁻ⁱ(f_i − f)]w_{n-i} | / Σ_{i=0}^n w_i
    ≤ lim sup_{n→∞} Σ_{i=0}^n [Tⁿ⁻ⁱ|f_i − f|]w_{n-i} / Σ_{i=0}^n w_i
    = lim sup_{n→∞} Σ_{i=0}^n [Tⁱ|f_{n-i} − f|]w_i / Σ_{i=0}^n w_i
    ≤ lim sup_{n→∞} Σ_{i=0}^{n-N} [Tⁱ|f_{n-i} − f|]w_i / Σ_{i=0}^n w_i
      + lim sup_{n→∞} Σ_{i=0}^{N-1} [Tⁿ⁻ⁱ|f_i − f|]w_{n-i} / Σ_{i=0}^n w_i
    ≤ lim sup_{n→∞} Σ_{i=0}^{n} [Tⁱ F_N]w_i / Σ_{i=0}^n w_i
      + lim sup_{n→∞} [Tⁿ⁻ᴺ( Σ_{i=1}^N [Tⁱ F₀]w_i )]·w_{n-N} / Σ_{i=0}^n w_i.

Now we have, because of lemma 1.4.6 applied to g = Σ_{i=1}^N [Tⁱ F₀]w_i ∈ L₁(X, ℛ, P), that

    lim_{n→∞} [Tⁿ⁻ᴺ( Σ_{i=1}^N [Tⁱ F₀]w_i )]·w_{n-N} / Σ_{i=0}^n w_i = 0 a.e.

Furthermore, by HUREWICZ' theorem,

    lim sup_{n→∞} Σ_{i=0}^{n} [Tⁱ F_N]w_i / Σ_{i=0}^n w_i = E_{ℛ_T} F_N a.e.,

where

    ∫_X E_{ℛ_T} F_N dP = ∫_X F_N dP → 0 for N → ∞

by dominated convergence. Since {F_N}_{N=1}^∞ is monotone decreasing, so is {E_{ℛ_T} F_N}_{N=1}^∞, and hence

    lim_{N→∞} E_{ℛ_T} F_N = 0 a.e.

The corollary then follows by HUREWICZ' theorem. □
§5. CONDITIONAL INFORMATION AND ENTROPY

Definition 1.5.1: Let ξ = {A_i}_{i=1}^∞ be a partition of X and ℛ₀ ∈ σℛ. Then the conditional information of ξ with respect to ℛ₀ is a.e. defined by

    I(ξ|ℛ₀) := − Σ_{i=1}^∞ χ_{A_i} log P_{ℛ₀}(A_i),

where P_{ℛ₀}(A_i) is the conditional probability of A_i with respect to ℛ₀. □
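For finite partitions on a finite space this is a direct computation: at a point x lying in A_i and in the ℛ₀-cell C, the value of I(ξ|ℛ₀) is −log(P(A_i ∩ C)/P(C)); with the trivial ℛ₀ its integral is the Shannon entropy of ξ. A small sketch (plain Python, hypothetical data, natural logarithm):

```python
# I(xi | R0)(x) = -log P_{R0}(A_i)(x) for x in A_i, where R0 is generated
# by a finite partition: P_{R0}(A_i) = P(A_i n C)/P(C) on the R0-cell C.
import math

def cond_information(p, xi, r0):
    info = [0.0] * len(p)
    for cell_r0 in r0:
        pc = sum(p[x] for x in cell_r0)
        for a in xi:
            inter = [x for x in a if x in cell_r0]
            if inter:
                val = -math.log(sum(p[x] for x in inter) / pc)
                for x in inter:
                    info[x] = val
    return info

p = [0.125, 0.125, 0.25, 0.5]
xi = [[0, 2], [1, 3]]
trivial = [[0, 1, 2, 3]]

info = cond_information(p, xi, trivial)
H_xi = sum(info[x] * p[x] for x in range(4))      # = Shannon entropy of xi
cells = [sum(p[x] for x in a) for a in xi]        # cell masses 0.375, 0.625
assert abs(H_xi + sum(q * math.log(q) for q in cells)) < 1e-12
```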
Proposition 1.5.2: Let ξ and η be partitions of X. Let ℛ₀ ∈ σℛ, ℛ₁ ∈ σℛ. Then:
a) I(ξ ∨ η|ℛ₀) = I(ξ|ℛ₀) + I(η|ℛ₀ ∨ ξ);
b) I(ξ|ℛ₀) ≤ I(η|ℛ₀) if ξ ⊂ η;
c) E_{ℛ₀} I(ξ|ℛ₁) ≤ E_{ℛ₀} I(ξ|ℛ₀) if ℛ₀ ⊂ ℛ₁;
d) I(ξ|ℛ₀) = 0 iff ξ ⊂ ℛ₀. □

Remark 1.5.3: Let

    z(x) = −x log x for 0 < x ≤ 1,   z(0) = 0;

then we have by proposition 1.1.4:

    E_{ℛ₀} I(ξ|ℛ₀) = − Σ_{i=1}^∞ log P_{ℛ₀}(A_i) · E_{ℛ₀}(χ_{A_i}) = Σ_{i=1}^∞ z(P_{ℛ₀}(A_i)). □
Proof of proposition 1.5.2: Let ξ = {A_i}_{i=1}^∞ and η = {B_j}_{j=1}^∞.
a) By proposition 1.2.8 we have

    I(ξ ∨ η|ℛ₀) = − Σ_{i=1}^∞ Σ_{j=1}^∞ χ_{A_i∩B_j} log P_{ℛ₀}(A_i ∩ B_j)
    = − Σ_{i=1}^∞ Σ_{j=1}^∞ χ_{A_i∩B_j} log P_{ℛ₀}(A_i) − Σ_{i=1}^∞ Σ_{j=1}^∞ χ_{A_i∩B_j} log [ P_{ℛ₀}(A_i ∩ B_j) / P_{ℛ₀}(A_i) ]
    = − Σ_{i=1}^∞ χ_{A_i} log P_{ℛ₀}(A_i) − Σ_{j=1}^∞ χ_{B_j} Σ_{i=1}^∞ χ_{A_i} log [ P_{ℛ₀}(A_i ∩ B_j) / P_{ℛ₀}(A_i) ]
    = I(ξ|ℛ₀) − Σ_{j=1}^∞ χ_{B_j} log P_{ℛ₀ ∨ ξ}(B_j)
    = I(ξ|ℛ₀) + I(η|ℛ₀ ∨ ξ),

since by proposition 1.2.8 P_{ℛ₀ ∨ ξ}(B_j) = Σ_{i=1}^∞ [ P_{ℛ₀}(A_i ∩ B_j) / P_{ℛ₀}(A_i) ] χ_{A_i}.
b) This is an immediate consequence of a) and the positivity of conditional information.
c) By JENSEN's inequality (BILLINGSLEY [1], p. 112, formula 10.10) and the concavity of the function z(x) mentioned in remark 1.5.3 we have by proposition 1.1.4

    E_{ℛ₀} I(ξ|ℛ₀) = Σ_{i=1}^∞ z(P_{ℛ₀}(A_i)) = Σ_{i=1}^∞ z(E_{ℛ₀} P_{ℛ₁}(A_i)) ≥ E_{ℛ₀} Σ_{i=1}^∞ z(P_{ℛ₁}(A_i)) = E_{ℛ₀} I(ξ|ℛ₁).

d) Suppose that ξ ⊂ ℛ₀; then for any i we have P_{ℛ₀}(A_i) = χ_{A_i}. Hence

    I(ξ|ℛ₀) = − Σ_{i=1}^∞ χ_{A_i} log P_{ℛ₀}(A_i) = 0 a.e.

Conversely, let I(ξ|ℛ₀) = 0 a.e.; then P_{ℛ₀}(A_i) = 1 a.e. on A_i, and hence on <A_i>_{ℛ₀}. It follows that

    P(<A_i>_{ℛ₀}) = ∫_{<A_i>_{ℛ₀}} P_{ℛ₀}(A_i) dP = ∫_{<A_i>_{ℛ₀}} χ_{A_i} dP = P(A_i).

Hence A_i ∈ ℛ₀ and the proposition is proved. □
Proposition 1.5.4: Let ξ be a partition of X and let {ℛ_n}_{n=1}^∞ ⊂ σℛ be an increasing sequence. Then

    I( ξ | ∨_{n=1}^∞ ℛ_n ) = lim_{n→∞} I(ξ|ℛ_n) a.e.

Proof: PARRY [18], p. 16, theorem 2.2 i). □

Proposition 1.5.5: Let ξ and η be partitions of X and let ℛ₀ ∈ σℛ. Suppose that ξ ∨ ℛ₀ = η ∨ ℛ₀. Then we have a.e. I(ξ|ℛ₀) = I(η|ℛ₀).
Proof: Replacing η by ξ ∨ η if necessary, we may suppose that ξ ⊂ η. Hence by proposition 1.5.2 a) and d) we have

    I(η|ℛ₀) = I(ξ|ℛ₀) + I(η|ℛ₀ ∨ ξ) = I(ξ|ℛ₀),

since η ⊂ η ∨ ℛ₀ = ξ ∨ ℛ₀ ⊂ ℛ₀ ∨ ξ. □

Definition 1.5.6: Let ℛ₀ ∈ σℛ, ℛ₁ ∈ σℛ and let ξ be a partition with the property ξ ∨ ℛ₀ = ℛ₁ ∨ ℛ₀. Then we define a.e.

    I(ℛ₁|ℛ₀) := I(ξ|ℛ₀). □

Remark 1.5.7: It follows immediately by proposition 1.5.5 that definition 1.5.6 is independent of the choice of the partition ξ. Now let ℛ₀ ∈ σℛ, ℛ₁ ∈ σℛ, ℛ₂ ∈ σℛ, ℛ₁ ⊂ ℛ₂, and let ξ be a partition with the property ξ ∨ ℛ₁ = ℛ₀ ∨ ℛ₁. Then it is easy to see that ξ ∨ ℛ₂ = ℛ₀ ∨ ℛ₂. Hence also the assertions of the propositions 1.5.2 and 1.5.4 remain true in this more general case, as long as all the conditional informations mentioned are defined.

Definition 1.5.8: Let ℛ₀ ∈ σℛ and ℛ₁ ∈ σℛ and suppose that there exists a partition ξ = {A_i}_{i=1}^∞ of X such that ξ ∨ ℛ₀ = ℛ₁ ∨ ℛ₀. Then the conditional entropy of ℛ₁ with respect to ℛ₀ is defined by

    H(ℛ₁|ℛ₀) := ∫_X I(ℛ₁|ℛ₀) dP.

If there does not exist such a partition ξ we define H(ℛ₁|ℛ₀) := ∞. □

Remark 1.5.9: This definition has been given by J. NEVEU and A. HANEN (NEVEU [14], [15], HANEN et NEVEU [8]), and they have shown that this definition is equivalent with other definitions as given in JACOBS [10], ROKHLIN [21]. □
Proposition 1.5.10: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$, $\mathcal{R}_1 \in \sigma\mathcal{R}$ and $\mathcal{R}_2 \in \sigma\mathcal{R}$. Then we have:

a) $H(\mathcal{R}_1 \vee \mathcal{R}_2 \mid \mathcal{R}_0) = H(\mathcal{R}_1 \mid \mathcal{R}_0) + H(\mathcal{R}_2 \mid \mathcal{R}_0 \vee \mathcal{R}_1)$;
b) if $\mathcal{R}_1 \subset \mathcal{R}_2$, then $H(\mathcal{R}_1 \mid \mathcal{R}_0) \leq H(\mathcal{R}_2 \mid \mathcal{R}_0)$;
c) if $\mathcal{R}_1 \subset \mathcal{R}_2$, then $H(\mathcal{R}_0 \mid \mathcal{R}_2) \leq H(\mathcal{R}_0 \mid \mathcal{R}_1)$;
d) $H(\mathcal{R}_1 \mid \mathcal{R}_0) = 0$ if and only if $\mathcal{R}_1 \subset \mathcal{R}_0$.

Proof: a) Suppose that there does not exist a partition $\xi$ such that $\mathcal{R}_0 \vee \xi = \mathcal{R}_0 \vee \mathcal{R}_1 \vee \mathcal{R}_2$, hence $H(\mathcal{R}_1 \vee \mathcal{R}_2 \mid \mathcal{R}_0) = \infty$. Then either $H(\mathcal{R}_1 \mid \mathcal{R}_0) = \infty$ or $H(\mathcal{R}_2 \mid \mathcal{R}_0 \vee \mathcal{R}_1) = \infty$, and the assertion is proved. If there does exist a partition $\eta$ with $\mathcal{R}_0 \vee \eta = \mathcal{R}_0 \vee \mathcal{R}_1 \vee \mathcal{R}_2$, then by proposition 1.2.7 there exist partitions $\xi$ and $\zeta$ such that $\mathcal{R}_0 \vee \xi = \mathcal{R}_0 \vee \mathcal{R}_1$ and $\mathcal{R}_0 \vee \mathcal{R}_1 \vee \zeta = \mathcal{R}_0 \vee \mathcal{R}_1 \vee \mathcal{R}_2$, and hence $\mathcal{R}_0 \vee \xi \vee \zeta = \mathcal{R}_0 \vee \mathcal{R}_1 \vee \mathcal{R}_2$. The assertion now follows by proposition 1.5.2 and remark 1.5.7.
b) Substitute in a) $\mathcal{R}_1 \vee \mathcal{R}_2 = \mathcal{R}_2$ and note that the conditional entropy is non-negative.
c) In connection with remark 1.5.7 we may suppose that there exists a partition $\xi$ such that $\xi \vee \mathcal{R}_1 = \mathcal{R}_0 \vee \mathcal{R}_1$, and hence, because of $\mathcal{R}_1 \subset \mathcal{R}_2$, $\xi \vee \mathcal{R}_2 = \mathcal{R}_0 \vee \mathcal{R}_2$. Hence the assertion follows by proposition 1.5.2 and remark 1.5.7.
d) This is an immediate consequence of proposition 1.5.2 and remark 1.5.7. □
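The identities of proposition 1.5.10 can be checked numerically for finite partitions; a sketch under our own naming (eight points with random weights, partitioned by the three bits of the index):

```python
import math
import random

def H(p, xi, eta):
    """Conditional entropy H(xi | eta) on a finite space."""
    h = 0.0
    for B in eta:
        pB = sum(p[x] for x in B)
        if pB == 0:
            continue
        for A in xi:
            pAB = sum(p[x] for x in A & B)
            if pAB > 0:
                h -= pAB * math.log(pAB / pB)
    return h

def join(xi, eta):
    """Common refinement xi ∨ eta."""
    return [A & B for A in xi for B in eta if A & B]

random.seed(1)
pts = range(8)
w = [random.random() for _ in pts]
p = {x: w[x] / sum(w) for x in pts}
bit = lambda k: [{x for x in pts if not x >> k & 1},
                 {x for x in pts if x >> k & 1}]
r0, r1, r2 = bit(0), bit(1), bit(2)

# a) chain rule: H(r1 ∨ r2 | r0) = H(r1 | r0) + H(r2 | r0 ∨ r1)
lhs = H(p, join(r1, r2), r0)
rhs = H(p, r1, r0) + H(p, r2, join(r0, r1))
print(abs(lhs - rhs) < 1e-12)                          # True
# c) a finer conditioning partition can only lower the entropy
print(H(p, r0, join(r1, r2)) <= H(p, r0, r1) + 1e-12)  # True
```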
Definition 1.5.11: We shall denote by $\Phi$ the collection of all partitions of X with finite entropy and by $\Phi_0$ the collection of all finite partitions of X. For $\mathcal{R}_0 \in \sigma\mathcal{R}$ we define

$$\Phi_{\mathcal{R}_0} = \{\mathcal{R}_1 \in \sigma\mathcal{R} \mid H(\mathcal{R}_1 \mid \mathcal{R}_0) < \infty\}. \qquad \square$$

Proposition 1.5.12: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ and $\mathcal{R}_1 \in \Phi_{\mathcal{R}_0}$. Then there exists a $\xi \in \Phi$ with the property $\xi \vee \mathcal{R}_0 = \mathcal{R}_1 \vee \mathcal{R}_0$.

Proof: NEVEU [15], théorème 6. □
Proposition 1.5.13: Let $\xi \in \Phi$ and suppose that $\{\mathcal{R}_n\}_{n=1}^{\infty} \subset \sigma\mathcal{R}$ is a monotone increasing sequence. Then

$$\int_X \sup_n I(\xi \mid \mathcal{R}_n)\,dP \leq H(\xi) + 1.$$

Proof: NEVEU [15], théorème 5 (cf. also PARRY [18], p. 14, lemma 2.1). □

Remark 1.5.14: An immediate consequence of propositions 1.5.4 and 1.5.13 is the $\mathcal{L}^1$-norm convergence of the sequence $\{I(\xi \mid \mathcal{R}_n)\}_{n=1}^{\infty}$ for $\xi \in \Phi$. □
Corollary 1.5.15: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ and let $\{\mathcal{R}_n\}_{n=1}^{\infty} \subset \sigma\mathcal{R}$ be a monotone increasing sequence. Suppose that $H(\mathcal{R}_0 \mid \mathcal{R}_1) < \infty$. Then:

$$\lim_{n\to\infty} H(\mathcal{R}_0 \mid \mathcal{R}_n) = H\Big(\mathcal{R}_0 \,\Big|\, \bigvee_{i=1}^{\infty} \mathcal{R}_i\Big).$$

Proof: Because of the fact that $H(\mathcal{R}_0 \mid \mathcal{R}_1) < \infty$, there exists a partition $\xi \in \Phi$ such that $\xi \vee \mathcal{R}_1 = \mathcal{R}_0 \vee \mathcal{R}_1$. It follows from remark 1.5.7 that $\xi \vee \mathcal{R}_n = \mathcal{R}_0 \vee \mathcal{R}_n$, $n = 1, 2, \ldots$. Hence the corollary is an immediate consequence of remark 1.5.14. □
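Corollary 1.5.15 is a martingale-type convergence statement: as the conditioning σ-fields grow, the conditional entropies decrease to the entropy conditioned on the generated σ-field. A finite sketch (six biased coin flips, conditioning on ever more of them; all names and parameters are ours):

```python
import math

def H(p, xi, eta):
    """Conditional entropy H(xi | eta) on a finite space."""
    h = 0.0
    for B in eta:
        pB = sum(p[x] for x in B)
        if pB == 0:
            continue
        for A in xi:
            pAB = sum(p[x] for x in A & B)
            if pAB > 0:
                h -= pAB * math.log(pAB / pB)
    return h

N, q = 6, 0.3                      # six independent bits, each 1 with prob q
pts = range(1 << N)
p = {x: math.prod(q if x >> i & 1 else 1 - q for i in range(N)) for x in pts}
# xi: parity of the number of 1-bits; Rn: the partition by the lowest n bits
xi = [{x for x in pts if bin(x).count("1") % 2 == b} for b in (0, 1)]
Rn = lambda n: [{x for x in pts if x & ((1 << n) - 1) == v} for v in range(1 << n)]
vals = [H(p, xi, Rn(n)) for n in range(N + 1)]
assert all(vals[i] >= vals[i + 1] - 1e-12 for i in range(N))
print(round(vals[0], 4), round(vals[N], 4))   # 0.6931 0.0
```

Here $\bigvee_n \mathcal{R}_n$ is the full σ-field, so the limit of $H(\xi \mid \mathcal{R}_n)$ is 0.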
The assertion of corollary 1.5.15 can be given in a more general form as will be done in proposition 1.5.16.
Proposition 1.5.16: Let $\mathcal{U} \subset \sigma\mathcal{R}$ be an increasingly filtered system generating $\mathcal{R}_0$, i.e. $\mathcal{R}_0$ is the smallest sub-σ-field of $\mathcal{R}$ containing all the elements of $\mathcal{U}$. Let $\mathcal{R}_1 \in \sigma\mathcal{R}$ and suppose that there exists an $\mathcal{R}' \in \mathcal{U}$ such that $\mathcal{R}_1 \in \Phi_{\mathcal{R}'}$. Then

$$\lim_{\mathcal{R}'' \in \mathcal{U}} H(\mathcal{R}_1 \mid \mathcal{R}'') = H(\mathcal{R}_1 \mid \mathcal{R}_0).$$

Proof: JACOBS [10], p. 257, theorem 3. □

Definition 1.5.17: The metric $\rho_H$ is defined on $\Phi$ by

$$\rho_H(\xi, \eta) = H(\xi \mid \eta) + H(\eta \mid \xi)$$

for any pair $\xi, \eta \in \Phi$. □

Theorem 1.5.18: With respect to the metric $\rho_H$ the collection $\Phi$ is a separable complete metric space. With respect to $\rho_H$ the collection $\Phi_0$ is a dense subset of $\Phi$.

Proof: ROKHLIN [21], p. 19-20, §6.1 and §6.2. □

§6. MEAN INFORMATION AND ENTROPY
Unless explicitly stated otherwise, in this section T will be a measure preserving transformation on $(X,\mathcal{R},P)$ as defined in definition 1.3.2.
Proposition 1.6.1: Let T be a non-singular (not necessarily P-preserving) transformation on $(X,\mathcal{R},P)$. Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ and $\mathcal{R}_1 \in \sigma\mathcal{R}$. Then

$$T^{-1}(\mathcal{R}_0 \vee \mathcal{R}_1) = T^{-1}\mathcal{R}_0 \vee T^{-1}\mathcal{R}_1.$$

Proof: First we note that $T^{-1}(\mathcal{R}_0 \vee \mathcal{R}_1) \supset T^{-1}\mathcal{R}_0 \vee T^{-1}\mathcal{R}_1$. Let

$$\mathcal{A} = \{A \in \mathcal{R}_0 \vee \mathcal{R}_1 \mid T^{-1}A \in T^{-1}\mathcal{R}_0 \vee T^{-1}\mathcal{R}_1\}.$$

Then $\mathcal{A}$ is a σ-field which contains $\mathcal{R}_0$ and $\mathcal{R}_1$, hence $\mathcal{A} = \mathcal{R}_0 \vee \mathcal{R}_1$. □

Remark 1.6.2: The condition that T is non-singular in proposition 1.6.1 is necessary, since we identify two σ-fields $\mathcal{R}_1$ and $\mathcal{R}_2$ with the property that for any $A_1 \in \mathcal{R}_1$ there exists an $A_2 \in \mathcal{R}_2$ such that $P(A_1 \mathbin{\triangle} A_2) = 0$, and conversely. □

Proposition 1.6.3: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ and $\mathcal{R}_1 \in \sigma\mathcal{R}$ and suppose that there exists a partition $\xi$ of X such that $\xi \vee \mathcal{R}_0 = \mathcal{R}_1 \vee \mathcal{R}_0$. Then we have a.e.

$$I(T^{-1}\mathcal{R}_1 \mid T^{-1}\mathcal{R}_0) = T[I(\mathcal{R}_1 \mid \mathcal{R}_0)], \qquad (1)$$

where $T[f]$ denotes $f \circ T$.

Proof: Let $\xi = \{A_i\}_{i=1}^{\infty}$; then by proposition 1.6.1

$$T^{-1}\xi \vee T^{-1}\mathcal{R}_0 = T^{-1}\mathcal{R}_1 \vee T^{-1}\mathcal{R}_0.$$

Hence, because of definition 1.5.1 and proposition 1.3.1,

$$I(T^{-1}\mathcal{R}_1 \mid T^{-1}\mathcal{R}_0) = -\sum_{i=1}^{\infty} \chi_{T^{-1}A_i} \log T[P^{\mathcal{R}_0}(A_i)] = T\Big[-\sum_{i=1}^{\infty} \chi_{A_i} \log P^{\mathcal{R}_0}(A_i)\Big] = T[I(\mathcal{R}_1 \mid \mathcal{R}_0)]. \qquad \square$$
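Propositions 1.6.3 and 1.6.4 say that taking preimages under a measure preserving T changes neither the conditional information (up to composition with T) nor the conditional entropy. A toy check (rotation of six points, partitions chosen asymmetrically so the identity is not vacuous; names are ours):

```python
import math

def H(p, xi, eta):
    """Conditional entropy H(xi | eta) on a finite space."""
    h = 0.0
    for B in eta:
        pB = sum(p[x] for x in B)
        if pB == 0:
            continue
        for A in xi:
            pAB = sum(p[x] for x in A & B)
            if pAB > 0:
                h -= pAB * math.log(pAB / pB)
    return h

def preimage(T, A, pts):
    """T^{-1}A = {x : T(x) in A}."""
    return {x for x in pts if T(x) in A}

pts = range(6)
p = {x: 1 / 6 for x in pts}
T = lambda x: (x + 1) % 6          # rotation: measure preserving for uniform p
xi = [{0}, {1, 2, 3, 4, 5}]
eta = [{0, 1, 2}, {3, 4, 5}]
Txi = [preimage(T, A, pts) for A in xi]
Teta = [preimage(T, B, pts) for B in eta]
print(abs(H(p, Txi, Teta) - H(p, xi, eta)) < 1e-12)   # True
```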
Proposition 1.6.4: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ and $\mathcal{R}_1 \in \sigma\mathcal{R}$. Then

$$H(T^{-1}\mathcal{R}_1 \mid T^{-1}\mathcal{R}_0) = H(\mathcal{R}_1 \mid \mathcal{R}_0).$$

Proof: Suppose that there does not exist a partition $\xi$ of X such that $\mathcal{R}_0 \vee \xi = \mathcal{R}_0 \vee \mathcal{R}_1$, and that $\eta$ is a partition of X such that

$$\eta \vee T^{-1}\mathcal{R}_0 = T^{-1}\mathcal{R}_1 \vee T^{-1}\mathcal{R}_0.$$

Then $\eta = T^{-1}\xi$ for a partition $\xi \subset \mathcal{R}_0 \vee \mathcal{R}_1$, and, because of proposition 1.6.1 again,

$$T^{-1}(\xi \vee \mathcal{R}_0) = T^{-1}(\mathcal{R}_1 \vee \mathcal{R}_0).$$

Now suppose $A \in \mathcal{R}_0 \vee \mathcal{R}_1$; then there exists a $B \in \mathcal{R}_0 \vee \xi$ such that $T^{-1}B = T^{-1}A$. It follows that $P(B \mathbin{\triangle} A) = P(T^{-1}B \mathbin{\triangle} T^{-1}A) = 0$ and hence $\mathcal{R}_0 \vee \xi = \mathcal{R}_0 \vee \mathcal{R}_1$. Contradiction. Hence we may suppose that there exists a partition $\xi$ of X such that $\mathcal{R}_0 \vee \xi = \mathcal{R}_0 \vee \mathcal{R}_1$. The assertion of the proposition follows now by integrating formula (1) of proposition 1.6.3. □
Theorem 1.6.5: Let $\xi \in \Phi$. Then

$$\lim_{n\to\infty} \frac{1}{n}\, I\Big(\bigvee_{i=0}^{n-1} T^{-i}\xi\Big) = E_{\mathcal{I}}\, I\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi\Big) \quad \text{a.e. and in } \mathcal{L}^1\text{-norm},$$

where $E_{\mathcal{I}}$ denotes the conditional expectation with respect to the σ-field $\mathcal{I}$ of T-invariant sets, and consequently

$$\lim_{n\to\infty} \frac{1}{n}\, H\Big(\bigvee_{i=0}^{n-1} T^{-i}\xi\Big) = H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi\Big).$$

If T is ergodic we have a.e. and in $\mathcal{L}^1$-norm

$$\lim_{n\to\infty} \frac{1}{n}\, I\Big(\bigvee_{i=0}^{n-1} T^{-i}\xi\Big) = H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi\Big). \qquad \square$$

Proof: This is a generalization of MAC MILLAN's theorem given by CHUNG. For a proof see PARRY [18], p. 20, theorem 2.5. □
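In the ergodic case theorem 1.6.5 is the statement that $\frac{1}{n} I(\bigvee_{i=0}^{n-1} T^{-i}\xi)(x) = -\frac{1}{n}\log P(\text{n-block of } x)$ converges to a constant. For an i.i.d. shift this can be watched numerically; a sketch (alphabet and probabilities are our illustrative choice):

```python
import math
import random

random.seed(0)
probs = {"a": 0.5, "b": 0.25, "c": 0.25}
H = -sum(q * math.log(q) for q in probs.values())   # entropy of the generator

def info_rate(n):
    """-(1/n) log P(first n symbols) along one sample path of the i.i.d. shift."""
    x = random.choices(list(probs), weights=list(probs.values()), k=n)
    return -sum(math.log(probs[s]) for s in x) / n

print(round(H, 4))                  # 1.0397
print(round(info_rate(100000), 4))  # close to H, as the theorem predicts
```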
Definition 1.6.6: Let $\xi = \{A_i\}_{i=1}^{\infty} \in \Phi$. Then the mean information $i(\xi,T)$ is a.e. defined by

$$i(\xi,T) = \lim_{n\to\infty} \frac{1}{n}\, I\Big(\bigvee_{i=0}^{n-1} T^{-i}\xi\Big).$$

The mean entropy of $\xi$ is defined by $h(\xi,T) = \int_X i(\xi,T)\,dP$. □
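For a stationary Markov shift the mean entropy of the time-zero partition is the entropy rate $-\sum_{i,j} \pi_i P_{ij} \log P_{ij}$, and the block entropies satisfy $H_n = H(\pi) + (n-1)\cdot\text{rate}$ exactly, so $\frac{1}{n} H_n$ decreases to $h(\xi,T)$. A sketch with a chain of our own choosing:

```python
import math
from itertools import product

# Two-state stationary Markov shift; pi is the stationary distribution of P.
P = [[0.9, 0.1], [0.4, 0.6]]
pi = [0.8, 0.2]                       # check: pi P = pi
rate = -sum(pi[i] * P[i][j] * math.log(P[i][j])
            for i in (0, 1) for j in (0, 1))

def block_entropy(n):
    """H(xi ∨ T^{-1}xi ∨ ... ∨ T^{-(n-1)}xi) for the time-zero partition xi."""
    h = 0.0
    for w in product((0, 1), repeat=n):
        q = pi[w[0]]
        for a, b in zip(w, w[1:]):
            q *= P[a][b]
        if q > 0:
            h -= q * math.log(q)
    return h

for n in (1, 5, 10):
    print(n, round(block_entropy(n) / n, 4))
# (1/n) H_n decreases toward the mean entropy h(xi, T) = rate ≈ 0.3947
```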
Definition 1.6.7: Let $\mathcal{R}_1 \in \sigma\mathcal{R}$ (not necessarily satisfying $T^{-1}\mathcal{R}_1 \subset \mathcal{R}_1$). Then we define

$$H(\mathcal{R}_1) = \sup_{\substack{\xi \subset \mathcal{R}_1 \\ \xi \in \Phi_0}} h(\xi,T). \qquad \square$$

Proposition 1.6.8: Let $\mathcal{R}_1 \in \sigma\mathcal{R}$. Then

$$H(\mathcal{R}_1) = \sup_{\substack{\xi \subset \mathcal{R}_1 \\ \xi \in \Phi}} h(\xi,T).$$

Proof: JACOBS [10], p. 278, theorem 2.3. □
The assertion of proposition 1.6.8 can be generalized as follows.

Proposition 1.6.9: Let $\mathcal{U} \subset \sigma\mathcal{R}$ be an increasingly filtered system generating $\mathcal{R}_1 \in \sigma\mathcal{R}$. Then

$$H(\mathcal{R}_1) = \sup_{\mathcal{R}' \in \mathcal{U}} H(\mathcal{R}').$$

Proof: JACOBS [10], p. 278, theorems 2.2 and 2.4. □

Notice that for $\xi \in \Phi$, by definition 1.6.6 and proposition 1.6.8, we have $H(\xi) = h(\xi,T)$. Hence proposition 1.6.9 is indeed a generalization of proposition 1.6.8.

The number $h(T) = H(\mathcal{R})$ is called the KOLMOGOROV-SINAI invariant of the dynamical system $(X,\mathcal{R},P,T)$.
C H A P T E R II
THE CASE OF A MEASURE PRESERVING TRANSFORMATION
§1. INTRODUCTION
The purpose of this chapter is to give a definition, some conditions which guarantee the existence, and some properties of conditional mean information and conditional mean entropy in the case that T is measure preserving. If not stated otherwise, in this chapter T will be a measure preserving transformation on $(X,\mathcal{R},P)$. Since we shall also discuss similar results in a more general case in chapter III, we shall presently define the conditional mean information and conditional mean entropy even under the weaker hypothesis that T is non-singular. The first definition of conditional mean entropy has been given by NEVEU [14] for a partition $\xi \in \Phi$ in the case that T is an invertible measure preserving transformation on $(X,\mathcal{R},P)$ and the conditioning σ-field $\mathcal{R}_0 \in \sigma\mathcal{R}$ has the property $T^{-1}\mathcal{R}_0 = \mathcal{R}_0$. In this special case NEVEU also proved theorem 2.5.9 of this thesis. Definition 2.1.1 given below reduces to NEVEU's definition under the conditions mentioned above.

Definition 2.1.1: Let T be a non-singular transformation on $(X,\mathcal{R},P)$. Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ and $\mathcal{R}_1 \in \Phi_{\mathcal{R}_0}$. If the sequence

$$\Big\{\frac{1}{n+1}\, I\Big(\bigvee_{i=0}^{n} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big)\Big\}_{n=1}^{\infty}$$

converges a.e., its limit is called the conditional mean information of $\mathcal{R}_1$ with respect to $\mathcal{R}_0$. If the sequence

$$\Big\{\frac{1}{n+1}\, H\Big(\bigvee_{i=0}^{n} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big)\Big\}_{n=1}^{\infty}$$

converges, then its limit is called the conditional mean entropy of $\mathcal{R}_1$ with respect to $\mathcal{R}_0$. □
Remark 2.1.2: In order to simplify the notation, the conditional mean information and the conditional mean entropy will be denoted by $i(\mathcal{R}_1 \mid \mathcal{R}_0)$ and $h(\mathcal{R}_1 \mid \mathcal{R}_0)$ respectively. If $\xi \in \Phi$ and $\mathcal{R}_0 = \{\emptyset, X\}$, then we have $h(\xi \mid \mathcal{R}_0) = h(\xi,T)$ as defined in 1.6.6. □
Remark 2.1.3: Let T be invertible and suppose that $T^{-1}\mathcal{R}_0 = \mathcal{R}_0$. Then

$$H\Big(\bigvee_{i=0}^{n} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big) = \sum_{i=0}^{n} H\Big(T^{-i}\mathcal{R}_1 \,\Big|\, \bigvee_{j=i+1}^{n} T^{-j}\mathcal{R}_1 \vee \mathcal{R}_0\Big) \qquad (1.5.10a)$$

$$= \sum_{i=0}^{n} H\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{n-i} T^{-j}\mathcal{R}_1 \vee \mathcal{R}_0\Big). \qquad (1.6.4)$$

Hence, by corollary 1.5.15 and taking the Cesàro limit, it follows that

$$\lim_{n\to\infty} \frac{1}{n+1}\, H\Big(\bigvee_{i=0}^{n} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big) = H\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{\infty} T^{-j}\mathcal{R}_1 \vee \mathcal{R}_0\Big).$$

Thus in this case definition 2.1.1 reduces to NEVEU's definition mentioned above. □
In the following sections we shall give some other conditions under which the conditional mean information and conditional mean entropy exist. As said before, we do not require that T is invertible. First, it is easy to see, as in remark 2.1.3, that the conditional mean entropy exists in the case that $T^{-1}\mathcal{R}_0 = \mathcal{R}_0$ (T not invertible). In section II.2 we shall generalize this to the case that there exists a $k \in \mathbb{N}$ such that $T^{-k}\mathcal{R}_0 = \mathcal{R}_0$. We shall also prove the existence of conditional mean information in this case. In section II.3 we shall prove the existence of conditional mean entropy in the case that $H(\mathcal{R}_0) = 0$, and in section II.4 in the case that $T^{-1}\mathcal{R}_0 \subset \mathcal{R}_0$ and $H(\mathcal{R}_0 \mid T^{-1}\mathcal{R}_0) < \infty$. In the last case we shall also prove the existence of conditional mean information. In this introduction we shall give a first simple result concerning the existence of conditional mean information and conditional mean entropy in proposition 2.1.4.
Proposition 2.1.4: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$, $\mathcal{R}'_0 \in \Phi_{\mathcal{R}_0}$ and $\mathcal{R}_1 \in \Phi_{\mathcal{R}_0}$, and suppose $\mathcal{R}_0 \subset \mathcal{R}'_0$. Then:

a) $i(\mathcal{R}_1 \mid \mathcal{R}'_0)$ exists if $i(\mathcal{R}_1 \mid \mathcal{R}_0)$ exists. In this case we have $i(\mathcal{R}_1 \mid \mathcal{R}'_0) = i(\mathcal{R}_1 \mid \mathcal{R}_0)$.
b) $h(\mathcal{R}_1 \mid \mathcal{R}'_0)$ exists if $h(\mathcal{R}_1 \mid \mathcal{R}_0)$ exists. In this case we have $h(\mathcal{R}_1 \mid \mathcal{R}'_0) = h(\mathcal{R}_1 \mid \mathcal{R}_0)$.

Proof: By proposition 1.5.12 there exists a partition $\xi \in \Phi$ with the property $\xi \vee \mathcal{R}_0 = \mathcal{R}'_0$. Hence we have

$$\frac{1}{t}\, I\Big(\bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}'_0\Big) = \frac{1}{t}\, I\Big(\bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0 \vee \xi\Big)$$

$$= \frac{1}{t}\, I\Big(\bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1 \vee \xi \,\Big|\, \mathcal{R}_0\Big) - \frac{1}{t}\, I(\xi \mid \mathcal{R}_0) \qquad (1.5.2a)$$

$$= \frac{1}{t}\, I\Big(\bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big) + \frac{1}{t}\, I\Big(\xi \,\Big|\, \mathcal{R}_0 \vee \bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1\Big) - \frac{1}{t}\, I(\xi \mid \mathcal{R}_0). \qquad (1.5.2a) \quad (1)$$

Now notice that by propositions 1.5.4 and 1.5.13 the sequence

$$\Big\{I\Big(\xi \,\Big|\, \mathcal{R}_0 \vee \bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1\Big)\Big\}_{t=1}^{\infty}$$

converges, both a.e. and in $\mathcal{L}^1$-norm, to the function $I(\xi \mid \mathcal{R}_0 \vee \bigvee_{i=0}^{\infty} T^{-i}\mathcal{R}_1)$. Hence the sequence

$$\Big\{\frac{1}{t}\, I\Big(\xi \,\Big|\, \mathcal{R}_0 \vee \bigvee_{i=0}^{t-1} T^{-i}\mathcal{R}_1\Big)\Big\}_{t=1}^{\infty}$$

converges both a.e. and in $\mathcal{L}^1$-norm to zero, as does $\{\frac{1}{t}\, I(\xi \mid \mathcal{R}_0)\}_{t=1}^{\infty}$. Hence assertion a) is an immediate consequence of formula (1), and assertion b) follows by integrating both sides of (1) and then taking the limit for $t \to \infty$. □
An immediate consequence of proposition 2.1.4 is the existence of the conditional mean information and conditional mean entropy of a partition $\xi \in \Phi$ with respect to a partition $\eta \in \Phi$. In this case we have

$$\lim_{t\to\infty} \frac{1}{t}\, I\Big(\bigvee_{i=0}^{t-1} T^{-i}\xi \,\Big|\, \eta\Big) = E_{\mathcal{I}}\, I\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi\Big).$$
§2. PERIODIC SUB-σ-FIELDS

In this section we shall prove the existence theorem of conditional mean information and conditional mean entropy for periodic sub-σ-fields of $\mathcal{R}$ as defined in definition 2.2.1.

Definition 2.2.1: Let k be a positive integer. A sub-σ-field $\mathcal{R}_0 \in \sigma\mathcal{R}$ is called periodic with period k if $T^{-k}\mathcal{R}_0 = \mathcal{R}_0$ and if for any positive integer $k' < k$ we have $T^{-k'}\mathcal{R}_0 \neq \mathcal{R}_0$. □

First we shall give an example of a periodic sub-σ-field $\mathcal{R}_0$ in the case that T is not invertible. Let $X_1 = \{x \in \mathbb{R} \mid 0 \leq x < 1\}$ be the half-open unit interval, let $\mathcal{R}_1$ be the σ-field of Borel sets on $X_1$ and let $\mu_1$ be the Borel measure on $\mathcal{R}_1$. Furthermore, let $k \geq 3$ be a positive integer, let $X_2 = \{1, 2, \ldots, k\}$, let $\mathcal{R}_2$ be the σ-field of all subsets of $X_2$ and let, for any $A \in \mathcal{R}_2$, $\mu_2(A)$ be the number of elements of A divided by k. Now we consider the probability space $(X,\mathcal{R},P)$, where $X = X_1 \times X_2$ is the Cartesian product, $\mathcal{R}$ is the σ-field of subsets of X generated by the measurable rectangles, and P is the product measure on $\mathcal{R}$. The measure preserving (not invertible) transformation T on $(X,\mathcal{R},P)$ is defined by $T(x,n) = (T_1 x, T_2 n)$, where $T_1 x = 2x \pmod 1$ and $T_2 n = (n+1) \pmod k$. Now let $\xi = \{A_i\}_{i=1}^{k}$ be a partition of X with the properties

$$T^{-k}\xi = \xi, \qquad T^{-i}\xi \neq \xi \quad \text{for } 0 < i < k.$$

Hence the σ-field $\mathcal{R}_0$ generated by $\xi$ is periodic with period k.

Theorem 2.2.2: Let k be a positive integer and suppose that $\mathcal{R}_0 \in \sigma\mathcal{R}$ is periodic with period k. Suppose that $\mathcal{R}_1 \in \bigcap_{i=0}^{k-1} \Phi_{T^{-i}\mathcal{R}_0}$. Then the conditional mean information $i(\mathcal{R}_1 \mid \mathcal{R}_0)$ exists, and both a.e. and in $\mathcal{L}^1$-norm

$$i(\mathcal{R}_1 \mid \mathcal{R}_0) = E_{\mathcal{I}_k}\Big[\frac{1}{k} \sum_{j=0}^{k-1} T^j\, I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1 \vee T^{-(k-j)}\mathcal{R}_0\Big)\Big],$$

where $\mathcal{I}_k$ denotes the σ-field of $T^k$-invariant sets. Consequently,

$$h(\mathcal{R}_1 \mid \mathcal{R}_0) = \frac{1}{k} \sum_{j=0}^{k-1} H\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1 \vee T^{-j}\mathcal{R}_0\Big).$$

If T is ergodic we have a.e. and in $\mathcal{L}^1$-norm

$$i(\mathcal{R}_1 \mid \mathcal{R}_0) = \frac{1}{k} \sum_{j=0}^{k-1} H\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1 \vee T^{-j}\mathcal{R}_0\Big). \qquad \square$$

Proof: To simplify the notation we denote
$$f_i = I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{i} T^{-j}\mathcal{R}_1 \vee T^{-i}\mathcal{R}_0\Big), \qquad f'_i = I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{\infty} T^{-j}\mathcal{R}_1 \vee T^{-i}\mathcal{R}_0\Big), \qquad i = 1, 2, \ldots,$$

and we write $f^{(k)} = \frac{1}{k}\sum_{j=0}^{k-1} T^j f'_{k-j}$, so that the right-hand side in the statement of the theorem is $E_{\mathcal{I}_k} f^{(k)}$. Note that $T^{-i}\mathcal{R}_0$, and hence $f'_i$, depends only on i modulo k, since $\mathcal{R}_0$ is periodic with period k. Because of the fact that $T^k$ is a measure preserving transformation on $(X,\mathcal{R},P)$, BIRKHOFF's theorem may be applied with $T^k$ in place of T, a.e. and in $\mathcal{L}^1$-norm. Putting $\bigvee_{j=kp+1}^{kp} T^{-j}\mathcal{R}_1 = \{\emptyset, X\}$ we remark that

$$I\Big(\bigvee_{i=0}^{kp} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big) = \sum_{i=0}^{kp} I\Big(T^{-i}\mathcal{R}_1 \,\Big|\, \bigvee_{j=i+1}^{kp} T^{-j}\mathcal{R}_1 \vee T^{-kp}\mathcal{R}_0\Big) \qquad (1.5.2a)$$

$$= \sum_{i=0}^{kp} T^i\Big[I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{kp-i} T^{-j}\mathcal{R}_1 \vee T^{-(kp-i)}\mathcal{R}_0\Big)\Big] = \sum_{i=0}^{kp} T^i f_{kp-i}. \qquad (1.6.3)$$

Hence we have

$$\Big|\frac{1}{kp+1}\, I\Big(\bigvee_{i=0}^{kp} T^{-i}\mathcal{R}_1 \,\Big|\, \mathcal{R}_0\Big) - E_{\mathcal{I}_k} f^{(k)}\Big| \leq \frac{1}{kp+1} \sum_{i=0}^{kp} T^i\big|f_{kp-i} - f'_{kp-i}\big| + \Big|\frac{1}{kp+1} \sum_{i=0}^{kp} T^i f'_{kp-i} - E_{\mathcal{I}_k} f^{(k)}\Big|. \qquad (1)$$

To prove that the first term on the right-hand side of (1) tends to 0 a.e. and in $\mathcal{L}^1$-norm, because of corollary 1.4.2 it is sufficient to prove that $\sup_n |f_n - f'_n| \in \mathcal{L}^1(X,\mathcal{R},P)$ and $\lim_{n\to\infty} |f_n - f'_n| = 0$ a.e. and in $\mathcal{L}^1$-norm. Indeed,

$$\sup_n |f_n - f'_n| \leq \sup_n f_n + \sup_n f'_n \leq \sum_{i=0}^{k-1} \sup_n I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{n} T^{-j}\mathcal{R}_1 \vee T^{-i}\mathcal{R}_0\Big) + \sum_{i=0}^{k-1} I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{\infty} T^{-j}\mathcal{R}_1 \vee T^{-i}\mathcal{R}_0\Big) \in \mathcal{L}^1(X,\mathcal{R},P)$$

because of proposition 1.5.13. Furthermore, by propositions 1.5.4 and 1.5.13 we have for any p

$$\lim_{n\to\infty} I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{i=1}^{n} T^{-i}\mathcal{R}_1 \vee T^{-p}\mathcal{R}_0\Big) = I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1 \vee T^{-p}\mathcal{R}_0\Big) \quad \text{a.e. and in } \mathcal{L}^1\text{-norm}.$$

Since we may restrict p to the finite set $0 \leq p \leq k-1$, this convergence is uniform in p, and hence we have a.e. and in $\mathcal{L}^1$-norm $\lim_{n\to\infty} |f_n - f'_n| = 0$.

Now we have to prove that the second term on the right-hand side of (1) tends to 0 a.e. and in $\mathcal{L}^1$-norm. First we remark that

$$\frac{1}{kp+1} \sum_{i=0}^{kp} T^i f'_{kp-i} = \frac{kp}{kp+1} \cdot \frac{1}{p} \sum_{j=0}^{p-1} T^{jk} f^{(k)} + \frac{1}{kp+1}\, T^{kp}\Big[I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{\infty} T^{-j}\mathcal{R}_1 \vee \mathcal{R}_0\Big)\Big].$$

By BIRKHOFF's theorem we have a.e. and in $\mathcal{L}^1$-norm

$$\lim_{p\to\infty} \frac{1}{kp+1}\, T^{kp}\Big[I\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{j=1}^{\infty} T^{-j}\mathcal{R}_1 \vee \mathcal{R}_0\Big)\Big] = 0.$$

Hence by BIRKHOFF's theorem, with $T^k$ instead of T, we find

$$\lim_{p\to\infty} \Big|\frac{1}{kp+1} \sum_{i=0}^{kp} T^i f'_{kp-i} - E_{\mathcal{I}_k} f^{(k)}\Big| = 0$$

both a.e. and in $\mathcal{L}^1$-norm.
To prove the second assertion of theorem 2.2.2 we remark that, T being measure preserving,

$$\int_X E_{\mathcal{I}_k} f^{(k)}\,dP = \frac{1}{k} \sum_{j=0}^{k-1} \int_X f'_{k-j}\,dP = \frac{1}{k} \sum_{j=0}^{k-1} H\Big(\mathcal{R}_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1 \vee T^{-j}\mathcal{R}_0\Big). \qquad \square$$

Let $\mathcal{R}_0 \in \sigma\mathcal{R}$ be periodic with period 1 and let $\mathcal{R}_1 \in \Phi_{\mathcal{R}_0}$. Then

$$i(\mathcal{R}_1 \mid \mathcal{R}_0) = E_{\mathcal{I}}\Big[I\Big(\mathcal{R}_1 \,\Big|\, \Big(\bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1\Big) \vee \mathcal{R}_0\Big)\Big]$$

a.e. and in $\mathcal{L}^1$-norm. Hence theorem 2.2.2 is a generalization of MAC MILLAN's theorem (1.6.5), where $\mathcal{R}_0 = \{\emptyset, X\}$.
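The example of a periodic sub-σ-field given at the beginning of this section can be checked on its cyclic factor alone, since the periodicity lives in the second coordinate. The partition below is a hypothetical stand-in (the thesis's atoms $A_i$ are not reproduced here): one singleton label against the rest, which is asymmetric and therefore has no period smaller than k — this asymmetry is also where an assumption like $k \geq 3$ matters.

```python
k = 5
pts = range(k)
T = lambda n: (n + 1) % k                 # the cyclic factor T2 of the example
# Hypothetical partition: a singleton class against the rest (labels 0..k-1).
xi = [{0}, set(range(1, k))]

def pullback(part, i):
    """T^{-i} applied atom-by-atom: T^{-i}A = {n : T^i(n) in A}."""
    return [{n for n in pts if (n + i) % k in A} for A in part]

same = lambda a, b: {frozenset(A) for A in a} == {frozenset(B) for B in b}
print(same(pullback(xi, k), xi))                                # True: T^{-k} xi = xi
print(all(not same(pullback(xi, i), xi) for i in range(1, k)))  # True: no smaller period
```

With a symmetric choice (e.g. all singletons) the pullback would merely permute the atoms and the partition would have period 1, which is why an asymmetric choice is needed.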
§3. MEAN ENTROPY ZERO

In this section we shall prove the existence of conditional mean entropy if the conditioning sub-σ-field of $\mathcal{R}$ has mean entropy zero in the following sense.

Definition 2.3.1: The σ-field $\mathcal{R}_0 \in \sigma\mathcal{R}$ is said to have mean entropy zero if $H(\mathcal{R}_0) = 0$. □

Before we formulate the existence theorem we shall first exhibit the connection between mean entropy zero and conditional independence.

Definition 2.3.2: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$, $\mathcal{R}_1 \in \sigma\mathcal{R}$ and $\mathcal{R}_2 \in \sigma\mathcal{R}$. Then $\mathcal{R}_0$ and $\mathcal{R}_1$ are called conditionally independent with respect to $\mathcal{R}_2$ if we have for any $A \in \mathcal{R}_0$, $B \in \mathcal{R}_1$

$$P^{\mathcal{R}_2}(A \cap B) = P^{\mathcal{R}_2}(A) \cdot P^{\mathcal{R}_2}(B) \quad \text{a.e.} \qquad \square$$

(See e.g. LOÈVE [12], p. 351, 25.3.)
Proposition 2.3.3: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$, $\mathcal{R}_2 \in \sigma\mathcal{R}$ and $\mathcal{R}_1 \in \Phi_{\mathcal{R}_2}$. Then the following two assertions are equivalent:

a) $\mathcal{R}_0$ and $\mathcal{R}_1$ are conditionally independent with respect to $\mathcal{R}_2$;
b) $H(\mathcal{R}_1 \mid \mathcal{R}_2 \vee \mathcal{R}_0) = H(\mathcal{R}_1 \mid \mathcal{R}_2)$.

Proof: NIJST [16], p. 313, theorem 2.2. □
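Proposition 2.3.3 can be checked on a finite example: if a and b are drawn conditionally independently given c, then additionally conditioning on the partition by a does not change the entropy of the partition by b given c. The construction below (distributions and names are our choice) plays the roles of $\mathcal{R}_0$, $\mathcal{R}_1$, $\mathcal{R}_2$:

```python
import math

def H(p, xi, eta):
    """Conditional entropy H(xi | eta) on a finite space."""
    h = 0.0
    for B in eta:
        pB = sum(p[x] for x in B)
        if pB == 0:
            continue
        for A in xi:
            pAB = sum(p[x] for x in A & B)
            if pAB > 0:
                h -= pAB * math.log(pAB / pB)
    return h

def join(xi, eta):
    """Common refinement xi ∨ eta."""
    return [A & B for A in xi for B in eta if A & B]

# Points (a, b, c): a and b drawn conditionally independently given c.
pc = {0: 0.5, 1: 0.5}
pa = {0: [0.9, 0.1], 1: [0.2, 0.8]}   # P(a | c)
pb = {0: [0.3, 0.7], 1: [0.6, 0.4]}   # P(b | c)
p = {(a, b, c): pc[c] * pa[c][a] * pb[c][b]
     for a in (0, 1) for b in (0, 1) for c in (0, 1)}
pts = set(p)
part = lambda i: [{x for x in pts if x[i] == v} for v in (0, 1)]
A, B, C = part(0), part(1), part(2)
lhs = H(p, B, join(C, A))    # H(R1 | R2 ∨ R0)
rhs = H(p, B, C)             # H(R1 | R2)
print(abs(lhs - rhs) < 1e-12)   # True, as assertion b) predicts
```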
Proposition 2.3.4: Let $\mathcal{R}' \in \sigma\mathcal{R}$, $\mathcal{R}_1 \in \sigma\mathcal{R}$ and $\mathcal{R}_2 \in \sigma\mathcal{R}$. Suppose that $\mathcal{R}' \subset \mathcal{R}_1$ and $H(\mathcal{R}_1 \vee \mathcal{R}_2 \mid \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1) < \infty$. Then

$$\lim_{t\to\infty} H\Big(\mathcal{R}' \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1 \vee \bigvee_{k=t}^{\infty} T^{-k}\mathcal{R}_2\Big) = H\Big(\mathcal{R}' \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\mathcal{R}_1\Big).$$

Proof: JACOBS [10], p. 273, theorem IV. □
Theorem 2.3.5: Let $\mathcal{R}_0 \in \sigma\mathcal{R}$. Then the following three assertions are equivalent:

a) $H(\mathcal{R}_0) = 0$;
b) $H(\xi \mid \bigvee_{i=1}^{\infty} T^{-i}\xi \vee \mathcal{R}_0) = H(\xi \mid \bigvee_{i=1}^{\infty} T^{-i}\xi)$ for any $\xi \in \Phi$;
c) $\mathcal{R}_0$ and $\xi$ are conditionally independent with respect to $\bigvee_{i=1}^{\infty} T^{-i}\xi$ for any $\xi \in \Phi$.
Proof: The equivalence of the assertions b) and c) is an immediate consequence of proposition 2.3.3.

b) ⇒ a): Let $\xi_2 \in \Phi$, $\xi_2 \subset \mathcal{R}_0$; then we have for any $\xi \in \Phi$

$$H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi \vee \mathcal{R}_0\Big) = H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi\Big) \geq H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi \vee \xi_2\Big) \geq H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi \vee \mathcal{R}_0\Big). \qquad (1.5.10c)$$

Hence

$$H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi \vee \xi_2\Big) = H\Big(\xi \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi\Big).$$

Substitute in this equality $\xi = \xi_2$. Then $h(\xi_2, T) = H(\xi_2 \mid \bigvee_{i=1}^{\infty} T^{-i}\xi_2) = H(\xi_2 \mid \bigvee_{i=1}^{\infty} T^{-i}\xi_2 \vee \xi_2) = 0$. It follows that

$$H(\mathcal{R}_0) = \sup_{\substack{\xi_2 \subset \mathcal{R}_0 \\ \xi_2 \in \Phi}} h(\xi_2, T) = 0.$$

a) ⇒ b): Let $\xi_2 \subset \mathcal{R}_0$, $\xi_2 \in \Phi$. Then $h(\xi_2, T) = H(\xi_2 \mid \bigvee_{i=1}^{\infty} T^{-i}\xi_2) = 0$. Hence by proposition 1.5.10d) $\xi_2 \subset \bigvee_{i=1}^{\infty} T^{-i}\xi_2$, and by induction $\xi_2 \subset \bigvee_{i=t}^{\infty} T^{-i}\xi_2$ for any positive integer t. Let $\xi_1 \in \Phi$; then we have

$$H\Big(\xi_1 \vee \xi_2 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1\Big) \leq H(\xi_1 \vee \xi_2) \leq H(\xi_1) + H(\xi_2) < \infty \qquad (1.5.10)$$

and hence by proposition 2.3.4

$$\lim_{t\to\infty} H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1 \vee \bigvee_{k=t}^{\infty} T^{-k}\xi_2\Big) = H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1\Big). \qquad (2.3.4)$$

For any positive integer t it follows from $\xi_2 \subset \bigvee_{i=t}^{\infty} T^{-i}\xi_2$ that

$$H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1\Big) \geq H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1 \vee \xi_2\Big) \geq H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1 \vee \bigvee_{k=t}^{\infty} T^{-k}\xi_2\Big) \qquad (1.5.10)$$

and hence, letting $t \to \infty$,

$$H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1\Big) \geq H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1 \vee \xi_2\Big) \geq H\Big(\xi_1 \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\xi_1\Big). \qquad (2.3.4)$$

Thus we have $H(\xi_1 \mid \bigvee_{i=1}^{\infty} T^{-i}\xi_1 \vee \xi_2) = H(\xi_1 \mid \bigvee_{i=1}^{\infty} T^{-i}\xi_1)$ and