
Master Thesis

Wellposedness of Stochastic Differential Equations in Infinite Dimensions

Author: Colin Groot

Supervisor: dr. S. G. Cox

Examination date: August 16, 2017


We investigate the wellposedness of stochastic differential equations in infinite dimensions, following the variational approach given in Liu and Röckner (Stochastic Partial Differential Equations: An Introduction, Springer, 2015). We look at the existence and uniqueness of (variational) solutions to stochastic differential equations driven by an infinite-dimensional standard cylindrical Wiener process.

The results we prove require preliminary knowledge of Bochner integrals and of probability and martingale theory in infinite-dimensional Banach spaces, among other things. We will also introduce the stochastic integral with respect to a Q-Wiener process and a standard cylindrical Wiener process.

After that, we describe the setting of the main result: we discuss the Gelfand triple and the general framework of the existence and uniqueness result. We impose conditions on the coefficients of the stochastic differential equation, namely hemicontinuity, boundedness, coercivity and weak monotonicity.

The proof of existence relies on the existence of strong solutions for finite-dimensional SDEs and weak convergence results. The uniqueness follows from an integration-by-parts argument.

Title: Wellposedness of Stochastic Differential Equations in Infinite Dimensions

Author: Colin Groot, colin.groot@student.uva.nl, 10283781

Supervisor: dr. S. G. Cox

Second Examiner: prof. dr. P. J. C. Spreij

Examination date: August 16, 2017

Korteweg-de Vries Institute for Mathematics

University of Amsterdam

Science Park 105-107, 1098 XG Amsterdam

http://kdvi.uva.nl

Introduction

1 Preliminaries

1.1 Functional analysis

1.1.1 Hilbert spaces

1.1.1.1 Properties of Hilbert spaces

1.1.1.2 Symmetric and nonnegative operators

1.1.2 Hilbert-Schmidt and finite trace operators, eigenvectors and eigenvalues

1.1.2.1 Finite trace operators

1.1.2.2 Eigenvectors and eigenvalues

1.1.3 Reflexivity, weak convergence and weak* convergence

1.1.3.1 Weak topology and weak convergence

1.1.3.2 Weak* topology and weak* convergence

1.1.3.3 Strong operator topology on L(X)

1.2 Bochner spaces and Bochner integral

1.2.1 Bochner integral

1.2.2 Bochner spaces

1.3 A short overview of results in measure, probability and real-valued (local) martingale theory

1.4 A brief summary of properties of functions

1.4.1 Lower semi-continuity

1.4.2 Functions of bounded variation

2 Stochastic integration in Hilbert spaces

2.1 Probability and martingale theory in Banach spaces

2.1.1 Probability theory: Gaussian measures on Hilbert spaces

2.1.2 Random variables and stochastic processes

2.1.2.1 Random variables

2.1.2.2 Stochastic processes

2.1.2.3 Progressive measurability

2.1.3 Martingale theory in Banach spaces

2.1.3.1 Conditional expectation of Banach space-valued random variables

2.1.3.2 Martingale theory in Banach spaces

2.2 Q-Wiener processes

2.3 Stochastic integral with respect to a Q-Wiener process and properties of the stochastic integral

2.3.1 Stochastic integral with respect to a Q-Wiener process

2.3.2 Properties of the stochastic integral

2.4 Q-cylindrical Wiener processes

2.4.1 Stochastic integral with respect to standard cylindrical Wiener processes

2.5 A very important remark regarding the integrands of the stochastic integral

3 Main result

3.1 General setting of main result

3.2 Formulation of the main result (Theorem 3.2.3)

3.3 Auxiliary results

3.3.1 Approximation of a stochastic process (Lemma 3.3.1)

3.3.2 A special case of Itô's formula (Theorem 3.3.17)

3.4 Existence and uniqueness of strong solutions to SDEs in finite dimensions

3.5 Proof of the main result

3.5.1 Preparations for the proof of Theorem 3.2.3

3.5.2 Weak convergence results

3.5.3 Proof of Theorem 3.2.3

3.6 Examples


Historical context for stochastic calculus (up to around 1970) can be found in [19]. An overview of results in stochastic calculus from 1950 onwards is given in [25].

The foundation of stochastic calculus is Brownian motion. This process is named after the botanist Robert Brown, who first observed the erratic movement of pollen grains suspended in water in 1827 ([32], p. 1). Back in 1900, Louis Jean-Baptiste Alphonse Bachelier, who “is seen by many as the founder of modern Mathematical Finance” ([19], p. 1), created a model of Brownian motion in his thesis Théorie de la Spéculation to describe the dynamic behavior of the Paris stock market.

The first person to provide a mathematical foundation of Brownian motion was Norbert Wiener in 1923. In recognition of his work, his construction of Brownian motion is often referred to as the Wiener process ([19], p. 2). Later on in this thesis, we will also encounter processes that are named after him (see Chapter 2). Similarly, one cannot discuss stochastic integration without mentioning Kiyosi Itô, “the father of stochastic integration” ([19], p. 3), who developed the theory of stochastic differential equations. Or, as it is put in [25], page 7,

Itô’s most important contribution is not to have defined stochastic integrals [...] but to have developed their calculus (this is the famous “Itô’s formula”, which expresses how this integral differs from the ordinary integral) and especially to have used them to develop a very complete theory of stochastic differential equations – in a style so luminous by the way that these old articles have not aged.

We will later see that the famous Itô isometry also holds, albeit in a slightly modified version, for stochastic integrals taking values in Hilbert spaces (Chapter 2, Proposition 2.3.3). Moreover, we will derive a special case of Itô’s formula (Theorem 3.3.17).

Stochastic (partial) differential equations (from now on: S(P)DEs) are used to model the dynamics in man-made systems and in nature. One can think of the famous Black-Scholes formula, which estimates the right price for a European call option, or the stochastic heat equation, describing the distribution of heat in a given region over time, see Example 3.6.4 below.

S(P)DEs in $\mathbb{R}$ and $\mathbb{R}^n$ have been extensively studied and are known to admit solutions when the coefficients satisfy, for example, some Lipschitz continuity conditions. The proof in that case hinges on a fixed-point argument; compare Theorem 10.2 in [33].

In this thesis, we will discuss S(P)DEs in infinite dimensions where the coefficients generally do not satisfy a Lipschitz continuity condition, so we need to expand our toolbox to show that, under suitable conditions, S(P)DEs admit a solution. We follow the approach of Liu and Röckner [23]. However, before we can derive the existence and uniqueness of solutions, we need to develop the framework to do so.

We will now discuss the structure of this thesis. In Chapter 1, we will discuss preliminary knowledge from different branches of mathematics. The approach we take in proving the existence of solutions depends heavily on functional analysis, so a large part of Chapter 1 will be a review of results from functional analysis. Topics that we will encounter are weak convergence, the Banach-Alaoglu Theorem and Hilbert-Schmidt operators.

Moreover, in Section 1.2 we will introduce the Bochner integral, which extends the notion of the Lebesgue integral to functions with values in (separable) Banach spaces. We give a short overview of results in measure and probability theory, recall some necessary tools from stochastic calculus (Section 1.3) and end Chapter 1 with results on lower semi-continuous functions and functions of bounded variation.


and infinite-dimensional Wiener processes. After this, we will introduce infinite-dimensional stochastic calculus.

We continue with the definition of a cylindrical Wiener process and develop the stochastic integral with respect to such a process (Section 2.4), using the construction of the stochastic integral with respect to Q-Wiener processes we developed earlier. We end Chapter 2 with an important remark regarding the integrands of the stochastic integral.

The main topic of this thesis will be discussed in Chapter 3. Let us quickly sketch the setting of the main result, which will be discussed in further detail in Section 3.1. We work on a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t : t \in [0, T]\}, \mathbb{P})$ and assume that the underlying filtration is normal, i.e. satisfies the usual conditions. Consider a separable Banach space $V$ with dual space $V^*$ and a separable Hilbert space $H$. Suppose that the following embeddings

$$V \hookrightarrow H : v \mapsto v, \qquad H \hookrightarrow V^* : h \mapsto h^*|_V$$

are dense and continuous; here, we denote by $h^*$ the map $h^* : H \to \mathbb{R} : x \mapsto \langle h, x \rangle_H$. With these two embeddings, we have “$V \subset H \subset V^*$”, which is known as a Gelfand triple.
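As an illustration that is not taken from the text above, a standard instance of such a triple arises from Sobolev spaces. The set $\Lambda$ and the spaces $H_0^{1}(\Lambda)$ and $H^{-1}(\Lambda)$ below are assumptions introduced only for this sketch; the examples actually treated in this thesis are those of Section 3.6.

```latex
% Hypothetical illustration: a Gelfand triple built from Sobolev spaces.
% Assumption: \Lambda \subset \mathbb{R}^d is a bounded open set.
\[
  V = H_0^{1}(\Lambda), \qquad H = L^2(\Lambda), \qquad V^* = H^{-1}(\Lambda),
\]
\[
  H_0^{1}(\Lambda) \hookrightarrow L^2(\Lambda) \hookrightarrow H^{-1}(\Lambda),
  \qquad
  h^*(v) = \langle h, v \rangle_{L^2(\Lambda)} = \int_\Lambda h(\xi)\, v(\xi)\, d\xi
  \quad \text{for } h \in L^2(\Lambda),\ v \in H_0^{1}(\Lambda).
\]
```

In this sketch an element of $H$ acts on $V$ through the $L^2$ inner product, which is exactly the role of the map $h \mapsto h^*|_V$.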

Let $U$ be another separable Hilbert space and denote by $L_2(U, H)$ the space of Hilbert-Schmidt operators from $U$ to $H$ (this will be discussed at greater length in Section 1.1.2). Consider the following maps (compare (3.6) and (3.7) on page 46 below)

$$A : [0, T] \times V \times \Omega \to V^*, \qquad B : [0, T] \times V \times \Omega \to L_2(U, H).$$

These maps satisfy certain conditions that can be found in Condition 3.1.3.

We consider the following SDE, which we label (3.14) to match its numbering later on:

$$dX(t) = A(t, X(t))\, dt + B(t, X(t))\, dW(t). \tag{3.14}$$

Here, the process $W$ is a cylindrical Wiener process, which will be defined in Section 2.4 as mentioned above. We define a solution to this SDE as follows; compare Definition 3.2.1. The constant $\alpha$ we will encounter in the definition is due to Condition (H3) on page 46.

Definition. A process $X = \{X(t) : t \in [0, T]\}$ is called a solution of (3.14) if $X$ is a continuous $H$-valued, $\mathbb{F}$-adapted process that coincides $\lambda \otimes \mathbb{P}$-almost everywhere with a progressively measurable $V$-valued process $\bar{X} \in L^{\alpha}([0, T] \times \Omega; V) \cap L^{2}([0, T] \times \Omega; H)$ such that for all $t \in [0, T]$

$$X(t) = X(0) + \int_0^t A(s, \bar{X}(s))\, ds + \int_0^t B(s, \bar{X}(s))\, dW(s)$$

$\mathbb{P}$-almost surely.

The main topic of this thesis will concern Theorem 3.2.3: we will show that the SDE (3.14) has a solution in the sense defined above. In short, we prove the following statement.

Theorem. Let the maps $A$ and $B$ defined above satisfy the conditions in Condition 3.1.3. Then, there exists a solution $X$ to the SDE (3.14). This solution is unique up to $\mathbb{P}$-indistinguishability.

The main result, Theorem 3.2.3, also contains an integrability condition. Again, this is discussed later on.

In proving Theorem 3.2.3, we make use of the auxiliary results we derive in Section 3.3, and of Theorem 3.3.17 in particular. The latter theorem is a special case of Itô’s formula.

We will dedicate a lot of space to the proofs of the auxiliary results and the proof of our main result, using [23] as a backbone for our arguments throughout. The arguments therein are compact and some of them deserve a deeper exploration. We provide the reader with details to help understand these compact arguments.

We will also briefly sketch the proof of the main result. The uniqueness of the solution follows from a straightforward integration-by-parts argument. The proof of the existence of a solution to the SDE (3.14) in the sense of Definition 3.2.1 essentially boils down to three points.


Section 1.1). We will then project the coefficients of the SDE onto the first n basis vectors and use an identification argument to obtain a strong solution for the “projected” SDE for every n ∈ N; see Section 3.5.1 and in particular Lemma 3.5.5 for an in-depth discussion.

• The sequence of strong solutions we obtain from the “projected” SDEs is uniformly bounded in a sense made precise in Lemma 3.5.11 below. Then, we can apply (a consequence of) the Banach-Alaoglu Theorem to obtain essential weak convergence results, see Section 3.5.2 and Corollary 3.5.14 in particular.

• The inequalities (3.117) in Lemma 3.5.25 and (3.119) in Lemma 3.5.27.

We end this thesis with a few examples (Section 3.6), mostly due to [23], that fit the framework we will discuss here. One of these examples is the stochastic porous medium equation, see Example 3.6.6, describing the time evolution of the density of a substance in a porous medium.


Before we tackle our main topic, we need to familiarize ourselves with some preliminary knowledge from different fields of mathematics, which we will do in this chapter. We will start with an assumption.

Assumption 1.0.1. All vector spaces we will encounter are assumed to be real.

1.1 Functional analysis

Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be (real) normed vector spaces. We will write $\mathcal{B}(X)$ for the Borel σ-algebra on $X$. In addition, we will denote the collection of bounded linear operators from $X$ to $Y$ by $L(X, Y)$. We will write $L(X)$ for $L(X, X)$ and say that $A$ is a bounded linear operator on $X$ if $A \in L(X)$. If $Y$ is complete, then $L(X, Y)$ is complete as well under the usual operator norm, given by the following equivalent definitions

$$\|A\|_{L(X,Y)} := \sup_{x \in X:\ \|x\|_X \le 1} \|A(x)\|_Y = \sup_{x \in X:\ \|x\|_X = 1} \|A(x)\|_Y = \sup_{x \in X \setminus \{0_X\}} \frac{\|A(x)\|_Y}{\|x\|_X}$$

for each $A \in L(X, Y)$. The dual space of $X$ (or simply the dual), consisting of all bounded linear operators from $X$ to the underlying scalar field $\mathbb{R}$, will be denoted by $X^*$. As $\mathbb{R}$ is a Banach space itself (with respect to the absolute value on $\mathbb{R}$), the dual of any normed vector space is complete. Moreover, we can express the norm of an element $x \in X$ by using the dual space.

Theorem 1.1.1 (Theorem 4.3(b) in [30]). Define $B_{X^*} := \{f \in X^* : \|f\|_{X^*} \le 1\}$. Then, we have that $\|x\|_X = \sup\{|f(x)| : f \in B_{X^*}\}$ for all $x \in X$.
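To make Theorem 1.1.1 concrete, the following sketch checks it in the simplest setting. The choice $X = \mathbb{R}^n$ with the Euclidean norm is an assumption made for this illustration only; it is not the general argument, which rests on the Hahn-Banach Theorem.

```latex
% Illustration (assumption: X = \mathbb{R}^n with the Euclidean norm \|\cdot\|).
% Every f \in X^* has the form f_a(x) = \langle a, x \rangle for some a \in \mathbb{R}^n,
% and \|f_a\|_{X^*} = \|a\| by the Cauchy-Schwarz inequality.
\begin{align*}
  \sup\{ |f(x)| : f \in B_{X^*} \}
    &= \sup\{ |\langle a, x \rangle| : \|a\| \le 1 \}
     \le \|x\| \qquad \text{(Cauchy-Schwarz)}, \\
  \text{and for } x \neq 0,\ a = x / \|x\| : \quad
  |\langle a, x \rangle| &= \frac{\langle x, x \rangle}{\|x\|} = \|x\| .
\end{align*}
```

So the supremum is attained at the normalized vector $a = x / \|x\|$, and the two sides agree.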

In addition, if we denote the dimension of a normed vector space X by dim(X), we have the following result.

Theorem 1.1.2 (Theorem 5.1 in [31]). Let X be a finite-dimensional normed vector space. Then, we have dim(X) = dim(X∗).

Later on, we will employ the notation

$$\langle A, x \rangle_{X^* \times X} := A(x), \tag{1.1}$$

for any A ∈ X∗ and any x ∈ X. This is called the dual pairing or the dual pair. Furthermore, as is customary, we will sometimes write Ax instead of A(x), just like we will sometimes denote the composition of two bounded linear operators S and T as ST instead of S ◦ T .

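In a normed vector space $(X, \|\cdot\|_X)$, we will denote the norm closure of a subset $B \subset X$ by $\overline{B}$. For a subset $B \subset X$, we define $\operatorname{Sp}(B)$, the linear span of $B$, to be

$$\operatorname{Sp}(B) := \left\{ \sum_{i=1}^{n} \lambda_i x_i : n \in \mathbb{N},\ x_i \in B \text{ and } \lambda_i \in \mathbb{R} \text{ for all } i = 1, \ldots, n \right\}.$$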

A collection $\mathcal{X} \subset X$ is called a basis when

$$\overline{\operatorname{Sp}(\mathcal{X})} = X$$

and when this collection is linearly independent.

A normed vector space $(X, \|\cdot\|_X)$ is called separable if $X$ contains a countable, dense subset.

We have a useful characterization of a dense set in a Banach space, which is a consequence of the celebrated Hahn-Banach Theorem.

Proposition 1.1.3. Let X be a Banach space and let A ⊂ X. Then, the set A is dense in X if and only if every functional f ∈ X∗ that vanishes on A is the zero functional.

In addition, the separability of a normed vector space follows from the separability of its dual space.

Theorem 1.1.4 (Theorem 5.24 in [31]). Let X be a normed vector space. If X∗ is separable, then so is X.

We end by listing four useful results.

Lemma 1.1.5. Let $S$ and $T$ be bounded linear operators on a Banach space $X$. Suppose that $S$ and $T$ commute and that $T$ is invertible. Then, the operators $S$ and $T^{-1}$ commute as well.

Lemma 1.1.6. Let $X$ and $Y$ be Banach spaces. Suppose that the inclusion map $X \to Y : x \mapsto x$ is continuous, i.e., there exists a constant $\alpha > 0$ such that $\|x\|_Y \le \alpha \|x\|_X$ for all $x \in X$. Let $\{x_n : n \in \mathbb{N}\} \subset X$ be a sequence that is convergent in $X$ and $Y$. Then, the limits in the respective spaces coincide.
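Both lemmas admit short arguments, sketched here for convenience; this is one possible route and not necessarily the proof intended in the references. In the second display, $x$ denotes the limit of the sequence in $X$ and $y$ its limit in $Y$ (names introduced for this sketch).

```latex
% Lemma 1.1.5: start from T S = S T and multiply by T^{-1} on both sides.
\[
  S T^{-1} = T^{-1} (T S) T^{-1} = T^{-1} (S T) T^{-1} = T^{-1} S .
\]
% Lemma 1.1.6: x is the limit in X, y is the limit in Y.
\[
  \|x - y\|_Y \le \|x - x_n\|_Y + \|x_n - y\|_Y
             \le \alpha \|x - x_n\|_X + \|x_n - y\|_Y \longrightarrow 0
  \quad (n \to \infty),
\]
% hence \|x - y\|_Y = 0, that is, x = y.
```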

Theorem 1.1.7. Let $X_0$ and $X_1$ be Banach spaces. Suppose that $X_0$ and $X_1$ both are linear subspaces of a normed vector space $X$ and assume that the inclusion map $\iota_k : X_k \to X : x \mapsto x$ is continuous for $k = 0, 1$. Then, the intersection space $X_0 \cap X_1$ is a Banach space when endowed with the norm

$$\|x\|_{X_0 \cap X_1} := \max\{\|x\|_{X_0}, \|x\|_{X_1}\}$$

for all $x \in X_0 \cap X_1$.

Proof. See Theorem 1.3 in Chapter 3.1 (page 97) in [3].

Proposition 1.1.8. Let $X$ and $Y$ be Banach spaces. Then, the product space $X \times Y$ is a Banach space, when endowed with the norm

$$\|(x, y)\|_{X \times Y} := \max\{\|x\|_X, \|y\|_Y\}$$

for all $(x, y) \in X \times Y$.

1.1.1 Hilbert spaces

A Hilbert space $H$ is a vector space endowed with an inner product $\langle \cdot, \cdot \rangle_H$ that is complete under its inner product-induced norm, given by

$$\|h\|_H := \sqrt{\langle h, h \rangle_H}.$$

1.1.1.1 Properties of Hilbert spaces

The previous definitions of basis and separability come together nicely when we consider a Hilbert space and want to determine if this space is separable. Before we do that, however, we need two more definitions.

A collection $\mathcal{H} \subset H$ is called orthogonal if $\langle h, h' \rangle_H = 0$ for all $h, h' \in \mathcal{H}$ with $h \ne h'$. Moreover, this collection is called orthonormal if $\mathcal{H}$ is orthogonal and if $\|h\|_H = 1$ for every $h \in \mathcal{H}$.

As a consequence of Zorn’s lemma, every Hilbert space has an orthonormal basis. From the cardinality of this basis, we can determine whether a Hilbert space is separable.

Proposition 1.1.9. A Hilbert space $(H, \langle \cdot, \cdot \rangle_H)$ is separable if and only if it has an at most countable orthonormal basis.

Proof. See Proposition 3.4.7 in [20].

Before we proceed, let us make the following assumption.

Assumption 1.1.10. All Hilbert spaces that we will consider are assumed to be separable, unless specified otherwise.

As a consequence of Proposition 1.1.9, every finite-dimensional Hilbert space is separable. In addition, we have the next result.

Corollary 1.1.11. Let $(H, \langle \cdot, \cdot \rangle_H)$ be a separable Hilbert space. Then, every orthogonal set in $H$ is at most countable.

The following proposition can be shown by the Gram-Schmidt orthonormalization procedure.

Proposition 1.1.12. Let $(H, \langle \cdot, \cdot \rangle_H)$ be a separable Hilbert space and let $X \subset H$ be a dense linear subspace. Then, $H$ has an orthonormal basis consisting of elements in $X$.

Every separable Hilbert space $(H, \langle \cdot, \cdot \rangle_H)$ has an orthonormal basis $\{h_n : n \in \mathbb{N}\} \subset H$; using this basis, we can express every element $h \in H$ in terms of elements from the orthonormal basis through

$$h = \sum_{n \in \mathbb{N}} \langle h, h_n \rangle_H \, h_n. \tag{1.2}$$

Note that it is not entirely clear beforehand that the series on the right-hand side converges and, when it does, that it converges to $h$. Luckily, it does, and a justification can be found in Section 3.4 in [31]. From this expression (1.2) follows a well-known formula, Parseval’s identity.

Proposition 1.1.13 (Parseval’s identity). Let $(H, \langle \cdot, \cdot \rangle_H)$ be a separable Hilbert space and let $\{h_n : n \in \mathbb{N}\} \subset H$ be an orthonormal basis for $H$. Then, we have that

$$\|h\|_H^2 = \sum_{n \in \mathbb{N}} |\langle h, h_n \rangle_H|^2 \tag{1.3}$$

for any $h \in H$.
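A small worked example may help to see the basis expansion and Parseval’s identity in action. The space $H = \mathbb{R}^2$ and the particular vectors below are assumptions chosen for this illustration.

```latex
% Illustration (assumption: H = \mathbb{R}^2 with the Euclidean inner product).
\[
  h_1 = \tfrac{1}{\sqrt{2}} (1, 1), \qquad
  h_2 = \tfrac{1}{\sqrt{2}} (1, -1), \qquad
  h = (3, 1).
\]
% Coefficients of h with respect to the orthonormal basis \{h_1, h_2\}:
\[
  \langle h, h_1 \rangle_H = \tfrac{4}{\sqrt{2}} = 2\sqrt{2}, \qquad
  \langle h, h_2 \rangle_H = \tfrac{2}{\sqrt{2}} = \sqrt{2}.
\]
% The expansion recovers h, and the squared coefficients sum to \|h\|_H^2:
\[
  \langle h, h_1 \rangle_H h_1 + \langle h, h_2 \rangle_H h_2 = (2, 2) + (1, -1) = (3, 1) = h,
\]
\[
  |2\sqrt{2}|^2 + |\sqrt{2}|^2 = 8 + 2 = 10 = 3^2 + 1^2 = \|h\|_H^2 .
\]
```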

This inner product-induced norm has some other special properties.

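Lemma 1.1.14. Let $(H, \langle \cdot, \cdot \rangle_H)$ be a real inner product space. Then, we have the following results.

(i) The parallelogram rule holds:

$$\|x + y\|_H^2 + \|x - y\|_H^2 = 2 \left( \|x\|_H^2 + \|y\|_H^2 \right) \tag{1.4}$$

for all $x, y \in H$.

(ii) The polarization identity holds:

$$\langle x, y \rangle_H = \frac{1}{4} \left( \|x + y\|_H^2 - \|x - y\|_H^2 \right) \tag{1.5}$$

for all $x, y \in H$.

(iii) The following identity holds:

$$\|x - y\|_H^2 = \|x\|_H^2 + \|y\|_H^2 - 2 \langle x, y \rangle_H \tag{1.6}$$

$$\phantom{\|x - y\|_H^2} = \|x\|_H^2 - \|y\|_H^2 - 2 \langle x - y, y \rangle_H \tag{1.7}$$

for all $x, y \in H$.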
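Parts (i) and (ii) of the lemma follow from a direct expansion of the squared norms; the short derivation below is a sketch of that computation, using only bilinearity and symmetry of the real inner product.

```latex
% Expand the squared norms using bilinearity and symmetry of the real inner product.
\begin{align*}
  \|x + y\|_H^2 &= \|x\|_H^2 + 2 \langle x, y \rangle_H + \|y\|_H^2, \\
  \|x - y\|_H^2 &= \|x\|_H^2 - 2 \langle x, y \rangle_H + \|y\|_H^2 .
\end{align*}
% Adding the two lines gives the parallelogram rule,
% subtracting them gives the polarization identity:
\begin{align*}
  \|x + y\|_H^2 + \|x - y\|_H^2 &= 2 \|x\|_H^2 + 2 \|y\|_H^2, \\
  \|x + y\|_H^2 - \|x - y\|_H^2 &= 4 \langle x, y \rangle_H .
\end{align*}
```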

1.1.1.2 Symmetric and nonnegative operators

Using the inner product $\langle \cdot, \cdot \rangle_H$ on a Hilbert space $H$, we can classify bounded linear operators in $L(H)$. We say that a bounded linear operator $A \in L(H)$ is symmetric if

$$\langle Ax, y \rangle_H = \langle x, Ay \rangle_H$$

for all $x, y \in H$. We say that a bounded linear operator $A \in L(H)$ is nonnegative if

$$\langle Ax, x \rangle_H \in [0, \infty)$$

for every $x \in H$. If $A$ satisfies both criteria, then $A$ admits a “square root”; see e.g. Proposition 2.3.4 in [27].

Theorem 1.1.15. Let $(H, \langle \cdot, \cdot \rangle_H)$ be a Hilbert space and let $A \in L(H)$. If $A$ is symmetric and nonnegative, then there exists a unique symmetric and nonnegative bounded linear operator, denoted $A^{1/2}$, such that $A^{1/2} \circ A^{1/2} = A$.
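For diagonal operators the square root can be written down explicitly. In the sketch below, the space $H = \ell^2$, its standard basis $(e_n)$ and the bounded sequence $(a_n)$ are assumptions introduced for illustration.

```latex
% Illustration (assumptions: H = \ell^2 with standard basis (e_n), and a bounded
% sequence a_n \ge 0 defining the diagonal operator A).
\[
  A e_n = a_n e_n
  \quad \Longrightarrow \quad
  A^{1/2} e_n = \sqrt{a_n}\, e_n,
  \qquad
  A^{1/2} A^{1/2} e_n = \sqrt{a_n} \sqrt{a_n}\, e_n = a_n e_n = A e_n .
\]
% Symmetry and nonnegativity of A: for x = \sum_n x_n e_n and y = \sum_n y_n e_n,
\[
  \langle A x, y \rangle_H = \sum_{n} a_n x_n y_n = \langle x, A y \rangle_H,
  \qquad
  \langle A x, x \rangle_H = \sum_{n} a_n x_n^2 \ge 0 .
\]
```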

1.1.2 Hilbert-Schmidt and finite trace operators, eigenvectors and eigenvalues

Let $(H, \langle \cdot, \cdot \rangle_H)$ and $(K, \langle \cdot, \cdot \rangle_K)$ be Hilbert spaces and let $A \in L(H, K)$. Then, there exists a unique bounded linear operator from $K$ to $H$, denoted $A^*$, such that

$$\langle Ax, y \rangle_K = \langle x, A^* y \rangle_H$$

for all $x \in H$ and $y \in K$. This bounded linear operator $A^*$ is called the adjoint of the operator $A \in L(H, K)$ or just the adjoint operator. The existence of such an operator is not clear beforehand (it can be shown using the Riesz-Fréchet Theorem, see Theorem 6.1 in [31]), but the uniqueness follows directly. The following properties hold for adjoint operators, see Section 12.9 in [30].

Lemma 1.1.16. Let $G$, $H$ and $K$ be separable Hilbert spaces. Let $S, S' \in L(G, H)$, $T \in L(H, K)$ and $\alpha, \beta \in \mathbb{R}$. Then, the following relations hold:

(i) $\|S\|_{L(G,H)} = \|S^*\|_{L(H,G)}$,

(ii) $(TS)^* = S^* T^*$,

(iii) $(S^*)^* = S$,

(iv) $(\alpha S + \beta S')^* = \alpha S^* + \beta (S')^*$.

The introduction of the adjoint operator allows us to show a result that we will use in defining Hilbert-Schmidt operators.


Lemma 1.1.17 (Lemma 3.5.27 in [20]). Let $(H, \langle \cdot, \cdot \rangle_H)$ and $(K, \langle \cdot, \cdot \rangle_K)$ be separable Hilbert spaces. Let $\{h_n : n \in \mathbb{N}\}$ be an orthonormal basis of $H$ and let $\{k_n : n \in \mathbb{N}\}$ be an orthonormal basis of $K$. Let $A \in L(H, K)$. Then, we have that

$$\sum_{n \in \mathbb{N}} \|A h_n\|_K^2 = \sum_{n \in \mathbb{N}} \|A^* k_n\|_H^2,$$

where both series could equal infinity.

When we consider the orthonormal basis $\{h_n : n \in \mathbb{N}\}$ of $H$ as in the preceding lemma, we can look at the infinite series $\sum_{n \in \mathbb{N}} \|A h_n\|_K^2$. Does this expression depend on our choice of basis? Suppose that $\{h'_n : n \in \mathbb{N}\}$ is another orthonormal basis of $H$; then Lemma 1.1.17 guarantees that we have

$$\sum_{n \in \mathbb{N}} \|A h_n\|_K^2 = \sum_{n \in \mathbb{N}} \|A^* k_n\|_H^2 = \sum_{n \in \mathbb{N}} \|A h'_n\|_K^2,$$

so the expression does not depend on our choice of orthonormal basis.
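The basis independence can be checked by hand in a small case. The matrix $A$ and the two bases of $\mathbb{R}^2$ below are assumptions chosen for this illustration.

```latex
% Illustration (assumption: H = K = \mathbb{R}^2, A given by a 2x2 matrix).
\[
  A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \qquad
  A^* = A^{\top} = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}.
\]
% Standard basis e_1 = (1, 0), e_2 = (0, 1):
\[
  \|A e_1\|^2 + \|A e_2\|^2 = \|(1, 0)\|^2 + \|(2, 1)\|^2 = 1 + 5 = 6 .
\]
% Rotated basis h_1 = (1, 1)/\sqrt{2}, h_2 = (1, -1)/\sqrt{2}:
\[
  \|A h_1\|^2 + \|A h_2\|^2
  = \left\| \tfrac{1}{\sqrt{2}} (3, 1) \right\|^2 + \left\| \tfrac{1}{\sqrt{2}} (-1, -1) \right\|^2
  = 5 + 1 = 6 .
\]
% Adjoint evaluated on the standard basis:
\[
  \|A^* e_1\|^2 + \|A^* e_2\|^2 = \|(1, 2)\|^2 + \|(0, 1)\|^2 = 5 + 1 = 6 .
\]
```

All three sums equal the sum of the squared matrix entries, $1 + 4 + 0 + 1 = 6$.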

This means that we can now give the definition of a Hilbert-Schmidt operator.

Definition 1.1.18 (Hilbert-Schmidt operator). Let $(H, \langle \cdot, \cdot \rangle_H)$ and $(K, \langle \cdot, \cdot \rangle_K)$ be separable Hilbert spaces and let $\{h_n : n \in \mathbb{N}\}$ be an orthonormal basis of $H$. Then, we say that a bounded linear operator $A \in L(H, K)$ is a Hilbert-Schmidt operator if

$$\left( \sum_{n \in \mathbb{N}} \|A h_n\|_K^2 \right)^{1/2} \tag{1.8}$$

is finite. As remarked before, note that this number does not depend on the choice of basis.

We will denote by $L_2(H, K)$ the set of all Hilbert-Schmidt operators from the separable Hilbert space $H$ to the separable Hilbert space $K$. When we have two Hilbert-Schmidt operators $S$ and $T$, Minkowski’s inequality guarantees that the sum $S + T$ of the two operators also satisfies (1.8), turning this set into a vector space. We can even endow this space with the following inner product: let $\{h_n : n \in \mathbb{N}\}$ be an orthonormal basis of $H$ and take arbitrary $S, T \in L_2(H, K)$. Then, we define the inner product $\langle \cdot, \cdot \rangle_{L_2(H,K)}$ by

$$\langle S, T \rangle_{L_2(H,K)} := \sum_{n \in \mathbb{N}} \langle S h_n, T h_n \rangle_K. \tag{1.9}$$

The Cauchy-Schwarz inequality, combined with Hölder’s inequality and the Hilbert-Schmidt property of the bounded linear operators $S$ and $T$, guarantees that the infinite series on the right-hand side converges. (One can check that this indeed defines an inner product on the space of Hilbert-Schmidt operators from $H$ to $K$.) However, just as before, one might wonder whether the definition of this inner product depends on the choice of basis in $H$. Luckily, this is not the case: by the polarization identity (1.5), we have

$$\langle S, T \rangle_{L_2(H,K)} = \frac{1}{4} \left( \left[ \sum_{n \in \mathbb{N}} \|(S + T) h_n\|_K^2 \right] - \left[ \sum_{n \in \mathbb{N}} \|(S - T) h_n\|_K^2 \right] \right)$$

and we already established that these infinite series on the right-hand side do not depend on our choice of basis. From the definition of the inner product $\langle \cdot, \cdot \rangle_{L_2(H,K)}$ on $L_2(H, K)$, it follows directly that (1.8) is the inner product-induced norm on this space. We will denote this norm by $\|\cdot\|_{L_2(H,K)}$. Looking back at Lemma 1.1.17, the next result will not come as a surprise.

Corollary 1.1.19. Let $H$ and $K$ be separable Hilbert spaces and let $A : H \to K$ be a Hilbert-Schmidt operator. Then, the adjoint operator $A^*$ is a Hilbert-Schmidt operator from $K$ to $H$ and we have $\|A\|_{L_2(H,K)} = \|A^*\|_{L_2(K,H)}$.

Moreover, this inner product turns $L_2(H, K)$ into a separable Hilbert space, under the condition that both $H$ and $K$ are separable.

Proposition 1.1.20 (Proposition B.0.7 in [27]). Let $H$ and $K$ be separable Hilbert spaces. Let $\{h_i : i \in \mathbb{N}\}$ be an orthonormal basis of $H$ and let $\{k_j : j \in \mathbb{N}\}$ be an orthonormal basis of $K$. Define, for $i, j \in \mathbb{N}$, the bounded linear operator $A_{i,j} : H \to K : h \mapsto \langle h, h_i \rangle_H \, k_j$. Then, the collection $\{A_{i,j} : i \in \mathbb{N}, j \in \mathbb{N}\}$ is an orthonormal basis of $L_2(H, K)$.

Finally, when we have a Hilbert-Schmidt operator at hand, we can use it to construct many more Hilbert-Schmidt operators.

Lemma 1.1.21. Let $G_1$, $G_2$, $H$ and $K$ be separable Hilbert spaces and let $T : H \to K$ be a Hilbert-Schmidt operator. In addition, let $S \in L(G_1, H)$ and $R \in L(K, G_2)$. Then, the map $R \circ T \circ S$ is a Hilbert-Schmidt operator from $G_1$ to $G_2$ and the inequality

$$\|R \circ T \circ S\|_{L_2(G_1, G_2)} \le \|R\|_{L(K, G_2)} \cdot \|T\|_{L_2(H, K)} \cdot \|S\|_{L(G_1, H)}$$

holds.

Proof. See Remark B.0.6(iii) in [23].
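For orientation, here is one possible way to obtain the inequality using only Lemma 1.1.16 and Lemma 1.1.17; it is a sketch and need not coincide with the argument in the cited reference. The orthonormal bases $\{g_n\}$ of $G_1$ and $\{k_n\}$ of $K$ are names introduced for this sketch.

```latex
% Step 1: pull R out, using \|R y\|_{G_2} \le \|R\|_{L(K, G_2)} \|y\|_K for y \in K.
\[
  \|R T S\|_{L_2(G_1, G_2)}^2 = \sum_{n} \|R T S g_n\|_{G_2}^2
  \le \|R\|_{L(K, G_2)}^2 \sum_{n} \|T S g_n\|_K^2
  = \|R\|_{L(K, G_2)}^2 \, \|T S\|_{L_2(G_1, K)}^2 .
\]
% Step 2: pass to adjoints (Lemma 1.1.17 with (TS)^* = S^* T^*) to pull S out,
% then use \|S^*\| = \|S\| and Lemma 1.1.17 once more for T.
\[
  \|T S\|_{L_2(G_1, K)}^2 = \sum_{n} \|S^* T^* k_n\|_{G_1}^2
  \le \|S^*\|_{L(H, G_1)}^2 \sum_{n} \|T^* k_n\|_H^2
  = \|S\|_{L(G_1, H)}^2 \, \|T\|_{L_2(H, K)}^2 .
\]
```

Combining the two steps and taking square roots gives the stated bound.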

1.1.2.1 Finite trace operators

In the previous section, we introduced Hilbert-Schmidt operators: the norm on the space of Hilbert-Schmidt operators is given by (1.8). Now, suppose that we have a symmetric and nonnegative bounded linear operator $A$ on $H$. Then, by Theorem 1.1.15, we know that this operator $A$ admits a square root $A^{1/2}$.

As $H$ is separable, there exists an orthonormal basis $\{h_n : n \in \mathbb{N}\}$ of $H$. Let $A$ be a symmetric and nonnegative bounded linear operator on $H$. Then, we define its trace as

$$\operatorname{tr}(A) := \left\| A^{1/2} \right\|_{L_2(H)}^2. \tag{1.10}$$

If we expand the right-hand side, we find

$$\left\| A^{1/2} \right\|_{L_2(H)}^2 = \sum_{n \in \mathbb{N}} \left\| A^{1/2} h_n \right\|_H^2 = \sum_{n \in \mathbb{N}} \left\langle A^{1/2} h_n, A^{1/2} h_n \right\rangle_H.$$

Let us remark again that this expression does not depend on the chosen orthonormal basis. As the bounded linear operator $A^{1/2}$ is symmetric and has the property that $A^{1/2} \circ A^{1/2} = A$ by Theorem 1.1.15, we see that $\langle A^{1/2} h_n, A^{1/2} h_n \rangle_H = \langle A h_n, h_n \rangle_H$ for every $n \in \mathbb{N}$. This yields

$$\operatorname{tr}(A) = \sum_{n \in \mathbb{N}} \langle A h_n, h_n \rangle_H. \tag{1.11}$$

This expression is often used as the definition for the trace of an arbitrary bounded linear operator on a Hilbert space, but only if the expression in (1.11) is finite; see for example Definition B.0.3 in [27]. These definitions coincide when a bounded linear operator is symmetric and nonnegative.

We say that a bounded linear operator A on H has finite trace if the expression in (1.11) is finite. Also note that the trace operation is linear on the space of all finite trace operators.

We end this section by remarking that in the case where an operator A is symmetric and nonnegative, its square root is a Hilbert-Schmidt operator if A has finite trace; this follows directly from (1.10) being finite and Definition 1.1.18.
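A concrete diagonal operator illustrates both a finite trace and the Hilbert-Schmidt property of the square root. The space $H = \ell^2$, its standard basis $(e_n)$ and the operator $A e_n = n^{-2} e_n$ are assumptions made for this illustration.

```latex
% Illustration (assumption: H = \ell^2 with standard basis (e_n), A e_n = n^{-2} e_n,
% so that A^{1/2} e_n = n^{-1} e_n).
\[
  \operatorname{tr}(A) = \sum_{n \ge 1} \langle A e_n, e_n \rangle_H
  = \sum_{n \ge 1} \frac{1}{n^2} = \frac{\pi^2}{6} < \infty,
  \qquad
  \left\| A^{1/2} \right\|_{L_2(H)}^2 = \sum_{n \ge 1} \left\| n^{-1} e_n \right\|_H^2
  = \sum_{n \ge 1} \frac{1}{n^2} .
\]
% By contrast, the identity operator is symmetric and nonnegative but not of finite trace:
\[
  \sum_{n \ge 1} \langle \operatorname{id}_H e_n, e_n \rangle_H = \sum_{n \ge 1} 1 = \infty .
\]
```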


1.1.2.2 Eigenvectors and eigenvalues

If the symmetric and nonnegative bounded linear operator $A$ has finite trace and the underlying Hilbert space $H$ is separable, we can relate the trace of $A$ to the eigenvalues of $A$. We start by defining eigenvalues and eigenvectors for a symmetric and nonnegative bounded linear operator $A$ with finite trace.

A number $\lambda \in \mathbb{R}$ is called an eigenvalue of the operator $A$ if $A - \lambda \operatorname{id}_H$ is not injective, or, equivalently, when the equation $Ah = \lambda h$ has a non-trivial solution $h \in H$. In that case, we call the vector $h$ an eigenvector of $A$ with corresponding eigenvalue $\lambda$. Here, we implicitly used the result that all eigenvalues of a symmetric operator are real-valued (Proposition 5.8 in [6]). Moreover, it is necessary to remark that the set of eigenvalues of $A$, known as the point spectrum of $A$ and denoted by $\sigma_p(A)$, is non-empty, as $\|A\|_{L(H)} \in \sigma_p(A)$, see Lemma 5.9 in Chapter II in [6].

Note that every eigenvalue of a symmetric and nonnegative bounded linear operator is non-negative. In addition, just as is the case with symmetric matrices, one can show that eigenvectors corresponding to two distinct eigenvalues are necessarily orthogonal. As remarked before in Corollary 1.1.11, we know that every orthogonal set in a separable Hilbert space is at most countable. In particular, this means that the operator $A$ only has countably many distinct eigenvalues, i.e. the set $\sigma_p(A)$ is countable. Even more is true, as we will see next in what is known as Lidskii’s (Trace) Theorem.

Theorem 1.1.22 (Lidskii, 1959). Let $A$ be a symmetric, non-negative operator with finite trace on a separable Hilbert space $H$. Then, we have that

$$\sum_{\lambda \in \sigma_p(A)} \lambda = \operatorname{tr}(A) < \infty,$$

where each eigenvalue is counted according to its multiplicity.

Proof. See, for example, Theorem 6.1 in Chapter VI.6 (page 63) of [16].

In particular, this theorem implies that every distinct non-zero eigenvalue λ only has finite multiplicity. This makes it possible to order all eigenvalues in non-increasing order, as the only possible eigenvalue with infinite multiplicity is 0.

One can use the symmetric and nonnegative bounded linear operator A with finite trace to construct an orthonormal basis of H that consists of eigenvectors of A.

Proposition 1.1.23. Let $A$ be a symmetric, non-negative operator with finite trace on a separable Hilbert space $H$. Let $\{\lambda_n : n \ge 1\}$ be the sequence of eigenvalues of $A$, counted with multiplicity and given in non-increasing order. Then there exists an orthonormal basis $\{h_n : n \in \mathbb{N}\}$ of $H$ such that

$$A h_n = \lambda_n h_n$$

for every $n \in \mathbb{N}$.

Proof. See Proposition 2.1.5 in [27].
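In finite dimensions these statements reduce to familiar linear algebra. The following sketch uses $H = \mathbb{R}^2$ and a particular symmetric matrix, both of which are assumptions chosen for illustration.

```latex
% Illustration (assumption: H = \mathbb{R}^2, A given by a symmetric 2x2 matrix).
\[
  A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
  h_1 = \tfrac{1}{\sqrt{2}} (1, 1), \quad \lambda_1 = 3, \qquad
  h_2 = \tfrac{1}{\sqrt{2}} (1, -1), \quad \lambda_2 = 1 .
\]
% Eigenvector equations and orthogonality of the eigenvectors:
\[
  A h_1 = \tfrac{1}{\sqrt{2}} (3, 3) = 3 h_1, \qquad
  A h_2 = \tfrac{1}{\sqrt{2}} (1, -1) = h_2, \qquad
  \langle h_1, h_2 \rangle_H = 0 .
\]
% The eigenvalues, listed in non-increasing order, sum to the trace:
\[
  \lambda_1 + \lambda_2 = 3 + 1 = 4 = 2 + 2
  = \langle A e_1, e_1 \rangle_H + \langle A e_2, e_2 \rangle_H = \operatorname{tr}(A) .
\]
```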

Moreover, the symmetric, non-negative operator $A$ with finite trace admits a “square root” operator $A^{1/2}$ by Theorem 1.1.15 and we can relate the eigenvalues and eigenvectors of these two bounded linear operators, which is the topic of the next two propositions.

Proposition 1.1.24. Let $A \in L(H)$ be a symmetric, non-negative operator with finite trace. Let $\{h_n : n \in \mathbb{N}\}$ be an orthonormal basis of $H$, consisting of eigenvectors of $A$ with corresponding eigenvalues $\{\lambda_n : n \in \mathbb{N}\}$. Then, if $\lambda_n$ is a positive eigenvalue of $A$ for some $n \in \mathbb{N}$ with corresponding eigenvector $h_n$, it holds that $(\sqrt{\lambda_n}, h_n)$ is an eigenpair of $A^{1/2}$.


Proof. See page 34 in [23].

Proposition 1.1.25. Let $A$ be a symmetric, non-negative operator with finite trace on a separable Hilbert space $H$. Then, the null space of $A$ coincides with the null space of $A^{1/2}$.

Proof. It follows almost directly that the null space of $A$ contains the null space of $A^{1/2}$. For the other inclusion, note that if $h \in \ker(A)$, we have $0 = \langle Ah, h \rangle_H = \langle A^{1/2} h, A^{1/2} h \rangle_H$, so it must follow that $A^{1/2} h = 0_H$.

Together, these two propositions yield the next result.

Corollary 1.1.26. Let $A$ be a symmetric, non-negative operator with finite trace on a separable Hilbert space $H$. Let $\{h_n : n \in \mathbb{N}\}$ be an orthonormal basis of $H$, consisting of eigenvectors of $A$, with corresponding eigenvalues $\{\lambda_n : n \in \mathbb{N}\}$. Then, for every $n \in \mathbb{N}$, the vector $h_n$ is an eigenvector of $A^{1/2}$ with corresponding eigenvalue $\sqrt{\lambda_n}$.

This corollary can be used to introduce the last result of this section: an important linear subspace of H, namely the image of A12, which also is a separable Hilbert space when equipped

with an appropiately chosen inner product.

Proposition 1.1.27. Let A be a symmetric, non-negative operator with finite trace on a separable Hilbert space H. Let {hn : n ≥ 1} be an orthonormal basis of H, consisting of eigenvectors of A with corresponding eigenvalues {λn : n ∈ N}, counted for multiplicity and given in non-increasing order. Define Λ := {n ∈ N : λn > 0}, the index set of non-zero eigenvalues. Then, we have

$A^{1/2}(H) = \ker\left(A^{1/2}\right)^{\perp} \cap \left\{ y \in H : \sum_{n \in \Lambda} \frac{1}{\lambda_n} |\langle y, h_n \rangle_H|^2 < +\infty \right\}.$

This space $A^{1/2}(H)$ is a separable Hilbert space when endowed with the $\langle \cdot, \cdot \rangle_0$-inner product, defined by

$\langle g, h \rangle_0 := \sum_{n \in \Lambda} \frac{1}{\lambda_n} \langle g, h_n \rangle_H \langle h, h_n \rangle_H$

for all $g, h \in A^{1/2}(H)$, with orthonormal basis $\{\sqrt{\lambda_n} h_n : n \in \Lambda\} = \{A^{1/2} h_n : n \in \Lambda\}$.

Proof. The first assertion follows by element tracing (also, compare Exercise 3 in Section II.5 in [6], page 49). One can verify that $\langle \cdot, \cdot \rangle_0$ is indeed an inner product on $A^{1/2}(H)$. For the separability and the completeness of the inner product space $\left(A^{1/2}(H), \langle \cdot, \cdot \rangle_0\right)$, we refer to e.g. page 23 in [15] or page 96 in [8].
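As a sanity check of the last claim in Proposition 1.1.27, the orthonormality of the family $\{\sqrt{\lambda_n} h_n : n \in \Lambda\}$ with respect to $\langle \cdot, \cdot \rangle_0$ can be verified directly from the definition. The computation below is only a sketch; it uses nothing but the orthonormality of $\{h_n\}$ in H, and the Kronecker symbol $\delta$ and the indices m, n, k are notation introduced here for illustration.

```latex
\begin{align*}
\langle \sqrt{\lambda_m}\, h_m, \sqrt{\lambda_n}\, h_n \rangle_0
  &= \sum_{k \in \Lambda} \frac{1}{\lambda_k}
     \langle \sqrt{\lambda_m}\, h_m, h_k \rangle_H
     \langle \sqrt{\lambda_n}\, h_n, h_k \rangle_H \\
  &= \sum_{k \in \Lambda} \frac{\sqrt{\lambda_m \lambda_n}}{\lambda_k}
     \, \delta_{mk} \, \delta_{nk}
   = \delta_{mn}, \qquad m, n \in \Lambda.
\end{align*}
```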

1.1.3 Reflexivity, weak convergence and weak* convergence

Let $(X, \|\cdot\|_X)$ be a normed vector space. Then, we already saw that its dual space X∗ is a Banach space. We can even go one step further and consider the dual space of X∗, which we will denote by X∗∗ and call the second dual or double dual of X. This again is a Banach space, when endowed with the usual operator norm.

For any element x ∈ X, we can define the mapping

$F_x : X^* \to \mathbb{R},\ f \mapsto f(x),$ (1.12)

hence $F_x(f) = f(x)$ for all f ∈ X∗. Then, the map $F_x$ is an element of X∗∗ for any x ∈ X.


Having introduced the map $F_x$, we can also define the function $J_X$ from X to its second dual X∗∗ by

$J_X : X \to X^{**},\ x \mapsto F_x.$ (1.13)

This mapping $J_X$ is linear and it is even isometric, since $\|F_x\|_{X^{**}} = \|x\|_X$ for every x ∈ X.
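The isometry claim can be sketched in two lines. The upper bound is immediate from the definition of the operator norm; the lower bound assumes x ≠ 0 and uses a norming functional $f_0$, whose existence is a standard consequence of the Hahn-Banach Theorem. The symbol $f_0$ is introduced here for illustration only.

```latex
\begin{align*}
|F_x(f)| = |f(x)| &\le \|f\|_{X^*} \|x\|_X
  \quad \text{for all } f \in X^*,
  \quad \text{so } \|F_x\|_{X^{**}} \le \|x\|_X, \\
\|F_x\|_{X^{**}} &\ge |F_x(f_0)| = f_0(x) = \|x\|_X
  \quad \text{for some } f_0 \in X^* \text{ with } \|f_0\|_{X^*} = 1.
\end{align*}
```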

Now, we call a normed vector space X reflexive if $J_X(X) = X^{**}$. Examples of reflexive Banach spaces are the Lebesgue $L^p$-spaces for each p ∈ (1, ∞), every finite-dimensional normed vector space and every Hilbert space.

Note that for reflexive spaces the map $J_X$ is an isometric isomorphism onto the Banach space X∗∗; it follows that X must be complete and, therefore, only Banach spaces can be reflexive.

This does not mean, however, that every Banach space is reflexive. Moreover, reflexivity of a Banach space X is defined only through the “natural map” $J_X$: there exists a non-reflexive Banach space that is isometrically isomorphic to its second dual, see [18]. We will end this discussion on reflexivity by listing some results on reflexive spaces.

Lemma 1.1.28 (Theorem 5.23 in [31]). A Banach space X is reflexive if and only if its dual X∗ is reflexive.

Lemma 1.1.29 (Corollary 5.56 in [31]). If normed vector spaces X and Y are isomorphic, then X is reflexive if and only if Y is reflexive.

Lemma 1.1.30 (Theorem 5.44 in [31]). If a normed vector space X is reflexive and Y is a closed linear subspace of X, then Y is reflexive.

Lemma 1.1.31 (Theorem 20.4 in [26]). If normed vector spaces X and Y are reflexive, then the product space X × Y is reflexive.

1.1.3.1 Weak topology and weak convergence

Every element of the dual of a normed vector space $(X, \|\cdot\|_X)$ is continuous with respect to the norm topology that is generated by the norm on X. This means that the norm topology on X might not be the coarsest topology on the vector space that makes all the elements of the dual space continuous. In this section, we will construct this topology, called the weak topology on X.

Suppose that we have a vector space Y, endowed with a seminorm p. The collection of open balls around elements of Y with respect to the seminorm p forms the basis for a topology on Y. In this fashion, every bounded linear functional f ∈ X∗ induces a seminorm $\|\cdot\|_f$ on X, given by $\|x\|_f := |f(x)|$ for each x ∈ X, and therefore constructs a topology, say $T_f$, on X. Then, we define the weak topology on X to be the coarsest topology on X to contain each $T_f$. This weak topology on X also characterizes weak convergence, by virtue of the next proposition, based on Exercise 1.9.3 from [34]. We will only give an outline of the proof.

Proposition 1.1.32. Let V be a vector space and let $\{\mathcal{F}_\alpha : \alpha \in A\}$ be a non-empty (possibly infinite) family of topologies on V. Let $\mathcal{F}$ be the coarsest topology on V that contains $\mathcal{F}_\alpha$ for every α ∈ A. Then, a sequence $(v_n)_{n \in \mathbb{N}} \subset V$ converges to v ∈ V in $\mathcal{F}$ if and only if $v_n \to v$ in $\mathcal{F}_\alpha$ for every α ∈ A.

Outline of the proof. Firstly, one can show that the collection

$\mathcal{B} = \left\{ \bigcap_{i=1}^{n} F_i : n \in \mathbb{N},\ \alpha_i \in A \text{ and } F_i \in \mathcal{F}_{\alpha_i} \text{ for all } i = 1, \ldots, n \right\}$

is a basis that generates $\mathcal{F}$. Then, to prove the statement concerning the convergence, note that the “only if”-part follows immediately, as every open neighborhood in $\mathcal{F}_\alpha$ is also an open neighborhood in $\mathcal{F}$, since $\mathcal{F}$ contains every $\mathcal{F}_\alpha$. For the “if”-part, it suffices to check that the convergence criterion holds for open neighborhoods in $\mathcal{B}$: this follows quickly, since every such neighborhood is a finite intersection of sets for which we already know that the convergence happens.

From Proposition 1.1.32, we can look at the weak topology, the smallest topology on the normed vector space X that contains the seminorm-induced topology $T_f$ for every f ∈ X∗; a normed vector space X equipped with the weak topology is a Hausdorff space, by virtue of the Hahn-Banach Theorem.

Using Proposition 1.1.32 again, we can give the definition of weak convergence: a sequence $(x_n)_{n \in \mathbb{N}}$ converges to x ∈ X in the weak topology if and only if $x_n \to x$ in $T_f$ for every f ∈ X∗. The latter definition is equivalent to the condition that $\|x_n - x\|_f \to 0$ for every f ∈ X∗, which, in turn, by the linearity of each f ∈ X∗, is equivalent to $f(x_n) \to f(x)$ for every bounded linear functional f ∈ X∗. The last definition will be used to define weak convergence of a sequence $(x_n)_{n \in \mathbb{N}}$ to an element x ∈ X. The following notation will be used to denote weak convergence,

$x_n \xrightarrow{w} x,$

or, sometimes, we will just write “xn → x weakly”. Weak limits are necessarily unique. To

distinguish between this notion of convergence and the usual convergence on X, we will refer to the latter as “norm convergence” where confusion could arise and sometimes, we will call the usual topology on X that is induced by the norm the norm topology.

Remark 1.1.33. Other authors take another route to define the weak topology, by defining it as the coarsest topology to make all elements of the dual space continuous. One can show that the two definitions coincide, by checking that $T_f$ coincides with the topology generated by the inverse images of open sets with respect to f for every f ∈ X∗. A similar approach is found on page 161 in [13].

We can use the uniform boundedness principle to show that every weakly convergent sequence is bounded. Moreover, we have the following bounds on a weakly convergent sequence.

Lemma 1.1.34. Let X be a Banach space and suppose that the sequence $(x_n)_{n \in \mathbb{N}} \subset X$ converges weakly to x ∈ X. Then, the following inequalities

$\|x\|_X \le \liminf_{n \to \infty} \|x_n\|_X \le \sup_{n \in \mathbb{N}} \|x_n\|_X$

hold.
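The first inequality in Lemma 1.1.34 can be sketched as follows; the second one is trivial. The argument assumes x ≠ 0 and picks a norming functional $f_0 \in X^*$ with $\|f_0\|_{X^*} = 1$ and $f_0(x) = \|x\|_X$, which exists by the Hahn-Banach Theorem. The name $f_0$ is introduced here for illustration.

```latex
\begin{align*}
\|x\|_X = f_0(x) = \lim_{n \to \infty} f_0(x_n)
  = \liminf_{n \to \infty} f_0(x_n)
  \le \liminf_{n \to \infty} \|f_0\|_{X^*} \|x_n\|_X
  = \liminf_{n \to \infty} \|x_n\|_X .
\end{align*}
```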

The continuity of a linear functional f ∈ X∗ implies that if a sequence (xn)n∈N converges in

norm to x ∈ X, the sequence converges weakly to x as well. The converse of this statement does not hold in general: we will give an example in Example 1.1.38 below, where we connect weak convergence in Hilbert spaces to the inner product structure as described in Proposition 1.1.37. We would like to point out that the notions of norm and weak convergence do however coincide when the underlying normed vector space is finite-dimensional.

Lemma 1.1.35 (Lemma 5.70(c) in [31]). Norm convergence and weak convergence are equivalent in finite-dimensional normed vector spaces.

Regarding continuity with respect to the weak topology, we have the following well-known result that we will use later on.

Theorem 1.1.36. Let X and Y be normed vector spaces. Then, a linear operator T : X → Y is continuous with respect to the norm topologies on X and Y (norm-to-norm continuous) if and only if T is continuous with respect to the weak topologies on X and Y (weak-to-weak continuous).


For Hilbert spaces, we can define weak convergence in a different, equivalent manner.

Proposition 1.1.37 (Equivalent definition of weak convergence in Hilbert spaces). Let H be a Hilbert space. A sequence {hn : n ≥ 1} converges weakly to h ∈ H if and only if $\langle h', h_n \rangle_H \to \langle h', h \rangle_H$ for all h′ ∈ H.

Proof. Follows from the Riesz-Fréchet Theorem.

We can use this proposition to show that weak convergence does not generally imply norm convergence.

Example 1.1.38. Consider the Hilbert space $\ell^2$ of square-summable sequences. Let $\{e^{(n)} : n \in \mathbb{N}\}$ be the canonical basis of $\ell^2$ and let 0 = (0, 0, . . . ) denote the all-zeros sequence in $\ell^2$. The entries of every sequence $a = (a_1, a_2, \ldots) \in \ell^2$ converge to zero, as a is square summable, and this means that

$\langle a, e^{(n)} \rangle_{\ell^2} = \sum_{k \in \mathbb{N}} a_k e^{(n)}_k = a_n \to 0 = \langle a, 0 \rangle_{\ell^2},$

hence $e^{(n)} \xrightarrow{w} 0$. The strong convergence of $\{e^{(n)} : n \in \mathbb{N}\}$ to 0 cannot hold, as

$\|e^{(n)} - 0\|_{\ell^2} = 1$

for every n ∈ N, which proves that the converse does not hold in general.
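Example 1.1.38 can be illustrated numerically. The following is a minimal sketch, assuming a truncation of $\ell^2$ to finitely many coordinates and the particular test sequence $a_k = 1/k$; the truncation level `DIM` and all function names are hypothetical choices made for illustration and do not appear in the text.

```python
import math

def basis_vector(n, dim):
    """n-th canonical basis vector (1-indexed), truncated to `dim` coordinates."""
    return [1.0 if k == n else 0.0 for k in range(1, dim + 1)]

def inner(a, b):
    """Euclidean inner product, standing in for the l^2 inner product."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(inner(a, a))

DIM = 200  # truncation level (illustrative assumption)
a = [1.0 / k for k in range(1, DIM + 1)]  # a_k = 1/k is square summable

indices = (1, 10, 100)
# <a, e^(n)> = a_n, which tends to zero as n grows ...
pairings = [inner(a, basis_vector(n, DIM)) for n in indices]
# ... while the norm of e^(n) - 0 stays equal to one.
norms = [norm(basis_vector(n, DIM)) for n in indices]
```

The pairings shrink like 1/n while the norms do not move, which is exactly the gap between weak and norm convergence described in the example.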

Later on, we will encounter functions that we will call weakly continuous into a Banach space.

Definition 1.1.39. Let [0, T ] ⊂ R be an interval and let X be a normed vector space. A function f : [0, T ] → X is said to be weakly continuous into X if $t_n \to t$ implies $f(t_n) \xrightarrow{w} f(t)$.

Regarding this kind of function, we have the following result.

Proposition 1.1.40. Let Y be a Banach space and let X ⊂ Y be a linear subspace that is dense in Y. Let f : [0, T ] → Y be continuous. Then, if the map f only takes values in X, it is weakly continuous as a map from [0, T ] into X.

Proof. We will check that for every S ∈ X∗, it holds that $S(f(t_n)) \to S(f(t))$ when $t_n \to t$. For every A ∈ Y∗, we have $A(f(t_n)) \to A(f(t))$ when $t_n \to t$, as both f : [0, T ] → Y and A : Y → R are continuous maps. Let S ∈ X∗ be arbitrary. Then, since X is a dense linear subspace of Y, the functional S ∈ X∗ can be extended to a functional $\tilde{S} \in Y^*$ that coincides with S on X by the Hahn-Banach Theorem. As $\tilde{S}(f(t_n)) \to \tilde{S}(f(t))$, since $\tilde{S} \in Y^*$, and f is X-valued, it follows that $S(f(t_n)) = \tilde{S}(f(t_n)) \to \tilde{S}(f(t)) = S(f(t))$.

We end the discussion on the weak topology with two useful weak convergence results.

Proposition 1.1.41. Let $X_0$ and $X_1$ be Banach spaces. Suppose that $X_0$ and $X_1$ both are linear subspaces of a normed vector space X and assume that the inclusion map $\iota_k : X_k \to X,\ x \mapsto x$ is continuous for k = 0, 1. Consider the Banach space $X_0 \cap X_1$ defined in Theorem 1.1.7. If a sequence $(x_n)_{n \in \mathbb{N}} \subset X_0 \cap X_1$ converges weakly to $x \in X_0 \cap X_1$ in the intersection space $X_0 \cap X_1$, then $x_n \xrightarrow{w} x$ in $X_0$ and $X_1$.

Proof. Restrict functionals on X0 (respectively X1) to X0∩ X1; the continuity of these restricted

functionals is dealt with by the norm on the intersection.

Using a similar restriction argument, we obtain a comparable result.

Proposition 1.1.42. Let X be a linear subspace of Y and suppose that $(x_n) \subset X$ converges weakly in X to x ∈ X. Then, it also holds that $x_n \xrightarrow{w} x$ in Y.

1.1.3.2 Weak* topology and weak* convergence

We can also define a topology on the dual space X∗ of a normed vector space X that is different from the norm topology on the dual generated by the operator norm. The topology we will construct on the dual space is called the weak* topology (on X∗); the construction of this topology will be similar to that of the weak topology we saw earlier.

Just like before, we will construct seminorms, but in this case, they will be seminorms on the dual space X∗. For every x ∈ X, the function $\|\cdot\|_x$, given by $\|f\|_x := |f(x)|$ for all f ∈ X∗, is a seminorm on the dual space. In contrast with the seminorms that generated the weak topology, here, the input is a function, an element of the dual space, and not a point of X.

Every seminorm $\|\cdot\|_x$ generates a topology on X∗. Then, we define the weak* topology to be the coarsest topology on X∗ to contain every topology that is generated by some $\|\cdot\|_x$. Equivalently, it is also the coarsest topology that will make the evaluation map $F_x : X^* \to \mathbb{R},\ f \mapsto f(x)$ continuous for every x ∈ X, which can be shown in a similar fashion as for the weak topology (see Remark 1.1.33). This topology turns the dual space X∗ into a Hausdorff space.

We can again invoke Proposition 1.1.32 to characterize weak* convergence: we say that a sequence (fn)n∈N⊂ X∗ converges to a functional f ∈ X∗ in weak* sense if fn(x) → f (x) for all

x ∈ X.

It is known that, in general, the weak* topology on X∗ is coarser than the weak topology on the dual space X∗ (which is in this case defined to be the smallest topology on X∗ to make every map S ∈ X∗∗ continuous). This means that when we have a sequence (fn)n∈N ⊂ X∗ that

converges weakly to an element f ∈ X∗, we also have weak* convergence of this sequence to f ∈ X∗, but the converse need not be true. However, when the normed vector space is reflexive, the notions of weak and weak* convergence on X∗ coincide.

Proposition 1.1.43. Let $(X, \|\cdot\|_X)$ be a reflexive Banach space. Let $(f_n)_{n \in \mathbb{N}}$ be a sequence in X∗ and let f ∈ X∗. Then, we have weak convergence of the sequence $(f_n)_{n \in \mathbb{N}}$ to f if and only if $(f_n)_{n \in \mathbb{N}}$ converges to f in weak* sense.

In the context of the weak* topology and weak* convergence, we have the following important result.

Theorem 1.1.44 (Banach-Alaoglu). Let $(X, \|\cdot\|_X)$ be a normed vector space. Then, the norm-closed unit ball in X∗,

$\{ f \in X^* : \|f\|_{X^*} \le 1 \},$

is compact in the weak* topology.

Proof. See Theorem 5.18 in [13].

In contrast: if X is a normed vector space with an infinite-dimensional dual space X∗, the norm-closed unit ball in X∗ is not compact in the operator norm topology by Riesz’ lemma: see also Theorem 2.26 in [31].

We would like to end here with an important consequence of the Banach-Alaoglu Theorem in reflexive Banach spaces.

Theorem 1.1.45. Let $(X, \|\cdot\|_X)$ be a reflexive Banach space. If $(x_n)_{n \in \mathbb{N}}$ is a bounded sequence in X, it has a weakly convergent subsequence.


1.1.3.3 Strong operator topology on L(X)

This section will be a short one, mentioning the strong operator topology and the convergence in said topology. Fix a normed vector space X. When we look at L(X), we see that every x ∈ X generates a seminorm $\|\cdot\|_x : L(X) \to [0, \infty),\ T \mapsto \|Tx\|_X$ and therefore also generates a topology $T_x$. Then, the coarsest topology on L(X) that contains every topology $T_x$ is called the strong operator topology. One can show that this topology coincides with the smallest topology that makes the evaluation maps $T \mapsto Tx$ continuous for every x ∈ X. We end here with the next observation, which is a consequence of Proposition 1.1.32.

Lemma 1.1.46. A sequence {Tn : n ≥ 1} ⊂ L(X) converges to T ∈ L(X) in the strong operator topology if and only if $\|T_n x - Tx\|_X \to 0$ for each x ∈ X.

1.2 Bochner spaces and Bochner integral

We can extend our notion of $L^p$-spaces, which consist of functions from some measure space into R, to functions taking values in a more general Banach space $(E, \|\cdot\|_E)$. This section is based on Sections 1.2 and 1.3 in [36].

Before we can define Bochner spaces and Bochner integrals, we have to introduce a different notion of measurability: µ-strong measurability, which we can formalize through simple functions. Let $(A, \mathcal{A}, \mu)$ be a measure space.

A function s : A → E is called a simple function when s can be written in the following form

$s = \sum_{k=1}^{m} \mathbf{1}_{A_k} e_k$ (1.14)

where $A_1, \ldots, A_m \in \mathcal{A}$ are disjoint sets and $e_1, \ldots, e_m \in E$ are distinct elements of E.

Now, we say that a function f : A → E is µ-strongly measurable if there exists a sequence (fn)n∈N of simple functions that converge pointwise µ-almost everywhere to f .

The collection of µ-strongly measurable functions is easily seen to be a vector space and every simple function s is µ-strongly measurable (just take fn= s for every n ∈ N). Regarding

(strong) measurability, we have the following observations.

Observation 1.2.1. If the measure space $(A, \mathcal{A}, \mu)$ is complete, every µ-strongly measurable function is measurable as well.

Observation 1.2.2 (Consequence of Proposition 1.8 in [36]). Let $(A, \mathcal{A}, \mu)$ be a finite measure space and let E be a Banach space. If E is separable, the notions of µ-strong measurability and measurability coincide.

The following result will be used to define the Bochner integral.

Lemma 1.2.3. Let $(A, \mathcal{A}, \mu)$ be a finite measure space and let E and F be Banach spaces. If the map f : A → E is µ-strongly measurable and the function φ : E → F is continuous, the composition φ ◦ f : A → F is µ-strongly measurable. If the finite measure space $(A, \mathcal{A}, \mu)$ is complete, the composition φ ◦ f : A → F is measurable as well.

Proof. The first statement is Corollary 1.13 in [36]. The second statement follows immediately from the observation we made earlier.

1.2.1 Bochner integral

The Bochner integral is a generalisation of the Lebesgue integral to vector-valued functions. This section is based on Section 1.3 in [36]. As was the case with Lebesgue integration, we will construct the integral through simple functions. The construction is straightforward and we will dive right into it.


Throughout this section, we let $(A, \mathcal{A}, \mu)$ be a finite measure space and let $(E, \|\cdot\|_E)$ be a Banach space. We say that a function f : A → E is µ-Bochner integrable if there exists a sequence $(f_n)_{n \in \mathbb{N}}$ of simple functions from A to E such that the following two conditions hold:

(i) the sequence $(f_n)_{n \in \mathbb{N}}$ converges pointwise to f µ-almost everywhere;

(ii) $\lim_{n \to \infty} \int_A \|f_n - f\|_E \, d\mu = 0$.

Note that a µ-Bochner integrable function f is µ-strongly measurable. In addition, note that in the second condition, the function $f_n - f$ is µ-strongly measurable, so that $\|f_n - f\|_E$ is a measurable R-valued function for every n ∈ N by Observation 1.2.2; this means that the integral in Condition (ii) is to be interpreted in the ordinary Lebesgue sense. Every simple function is easily seen to be µ-Bochner integrable.

As with the construction of the Lebesgue integral, we define the Bochner integral of a simple function s to be

$\int_A s \, d\mu := \sum_{k=1}^{m} \mu(A_k) e_k$

and one can verify that this definition is independent of our chosen representation. On the vector space of simple functions, the Bochner integral is linear and satisfies the following inequality

$\left\| \int_A s \, d\mu \right\|_E \le \int_A \|s\|_E \, d\mu.$ (1.15)
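The definition of the Bochner integral of a simple function and the inequality (1.15) can be checked on a toy example. The sketch below assumes E = R² with the Euclidean norm and a simple function taking three values on disjoint sets of measure 0.5, 0.3 and 0.2; these numbers and the function names are hypothetical and serve only as an illustration.

```python
import math

def bochner_integral_simple(measures, values):
    """Integral of s = sum_k 1_{A_k} e_k, i.e. the vector sum_k mu(A_k) e_k."""
    dim = len(values[0])
    return [sum(m * e[i] for m, e in zip(measures, values)) for i in range(dim)]

def euclidean_norm(v):
    return math.sqrt(sum(x * x for x in v))

# mu(A_1), mu(A_2), mu(A_3) for disjoint sets A_k (illustrative values)
measures = [0.5, 0.3, 0.2]
# the distinct values e_1, e_2, e_3 in E = R^2
values = [(1.0, 0.0), (0.0, 2.0), (-1.0, 1.0)]

integral = bochner_integral_simple(measures, values)

# Left- and right-hand side of inequality (1.15)
lhs = euclidean_norm(integral)
rhs = sum(m * euclidean_norm(e) for m, e in zip(measures, values))
```

Here the integral is a vector in R², whereas the right-hand side of (1.15) is an ordinary Lebesgue integral of the real-valued function $\|s\|_E$; the inequality is the triangle inequality applied to the finite sum.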

Using this inequality, we can define the Bochner integral of a µ-Bochner integrable function f : A → E. Note that the sequence $\left( \int_A f_n \, d\mu \right)_{n \in \mathbb{N}}$ is Cauchy by (1.15), since

$\sup_{n \ge m} \left\| \int_A f_n \, d\mu - \int_A f_m \, d\mu \right\|_E = \sup_{n \ge m} \left\| \int_A (f_n - f_m) \, d\mu \right\|_E \le \sup_{n \ge m} \int_A \|f_n - f_m\|_E \, d\mu \le 2 \sup_{n \ge m} \int_A \|f_n - f\|_E \, d\mu \to 0$

as m → ∞ by Condition (ii). Hence, the limit

$\int_A f \, d\mu := \lim_{n \to \infty} \int_A f_n \, d\mu$ (1.16)

exists in E and can again be shown to be independent of the chosen approximating sequence of simple functions by merging two approximation sequences into one. Also, the Bochner integral is linear, as is readily verified.
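The independence of the approximating sequence can also be seen by a direct estimate, as an alternative to merging two sequences. In the sketch below, $(f_n)$ and $(g_n)$ are two sequences of simple functions that both satisfy Conditions (i) and (ii) for the same f; the name $(g_n)$ is introduced here for illustration. Applying (1.15) to the simple function $f_n - g_n$ gives

```latex
\begin{align*}
\left\| \int_A f_n \, d\mu - \int_A g_n \, d\mu \right\|_E
  \le \int_A \|f_n - g_n\|_E \, d\mu
  \le \int_A \|f_n - f\|_E \, d\mu + \int_A \|f - g_n\|_E \, d\mu
  \longrightarrow 0 .
\end{align*}
```

so both sequences of integrals have the same limit in E.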

The next result gives an alternative characterization of µ-Bochner integrable functions.

Proposition 1.2.4 (Proposition 1.16 in [36]). A µ-strongly measurable function f : A → E is µ-Bochner integrable if and only if

$\int_A \|f\|_E \, d\mu < \infty,$

in which case Bochner’s inequality

$\left\| \int_A f \, d\mu \right\|_E \le \int_A \|f\|_E \, d\mu$ (1.17)

holds.


Lemma 1.2.5 (Proposition A.2.2 in [23]). Let $(A, \mathcal{A}, \mu)$ be a finite measure space and let $(E, \|\cdot\|_E)$ be a Banach space. Let f : A → E be a µ-Bochner integrable function. Let ℓ ∈ E∗. Then, we have

$\ell\left( \int_A f \, d\mu \right) = \int_A (\ell \circ f) \, d\mu.$ (1.18)

An important consequence of the previous lemma is that in Hilbert spaces, the inner product commutes with the Bochner integral.
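Spelled out, this consequence reads as follows. The sketch assumes E = H is a Hilbert space and applies Lemma 1.2.5 to the functional $\ell_h := \langle \cdot, h \rangle_H$ for a fixed h ∈ H; the symbol $\ell_h$ is notation introduced here.

```latex
\begin{align*}
\left\langle \int_A f \, d\mu, h \right\rangle_H
  = \ell_h\!\left( \int_A f \, d\mu \right)
  = \int_A (\ell_h \circ f) \, d\mu
  = \int_A \langle f, h \rangle_H \, d\mu ,
  \qquad \ell_h := \langle \cdot, h \rangle_H \in H^* .
\end{align*}
```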

There also exists a similar statement regarding E∗, the dual space of E. As the dual space is a Banach space as well, the Bochner integral exists for µ-Bochner integrable functions g : A → E∗ and the Bochner integral of g is again an element of E∗. Then, one has the following result: the Bochner integral of E∗-valued (Bochner integrable) functions commutes with evaluation at elements of E.

Lemma 1.2.6. Let $(A, \mathcal{A}, \mu)$ be a finite measure space and let E∗ be the dual of a Banach space E. Let g : A → E∗ be a µ-Bochner integrable function. Then, we have

$\left( \int_A g \, d\mu \right)(e) = \int_A g(e) \, d\mu$ for all e ∈ E.

As noted on page 10 in [36], “results from the theory of Lebesgue integration carry over to the Bochner integral as long as there are no non-negativity assumptions involved.” This means that there are, in general, no analogues of Fatou’s lemma and the Monotone Convergence Theorem, but there does exist a Dominated Convergence Theorem for Bochner integrals and a version of Fubini’s theorem.

Theorem 1.2.7 (Dominated Convergence Theorem for Bochner integrals). Let $(A, \mathcal{A}, \mu)$ be a finite measure space and let $(E, \|\cdot\|_E)$ be a Banach space. Let {fn : n ≥ 1} be a sequence of µ-Bochner integrable functions. Suppose that there exists a function f : A → E and a non-negative Lebesgue integrable function g : A → R such that the sequence {fn : n ≥ 1} converges µ-almost everywhere to f and that the inequality $\|f_n\|_E \le g$ holds µ-almost everywhere for every n ∈ N.

Then the function f is Bochner-integrable and

$\int_A \|f - f_n\|_E \, d\mu \to 0$

when n tends to infinity.

Proof. See Proposition 1.18 in [36].

Theorem 1.2.8 (Fubini’s theorem). Let $(A_1, \mathcal{A}_1, \mu)$ and $(A_2, \mathcal{A}_2, \nu)$ be σ-finite measure spaces and let $(E, \|\cdot\|_E)$ be a Banach space. Suppose that the map $f : A_1 \times A_2 \to E$ is µ ⊗ ν-Bochner integrable. Then, the following assertions hold.

(1) For µ-almost all $a_1 \in A_1$, the function $a_2 \mapsto f(a_1, a_2)$ is ν-Bochner integrable.

(2) For ν-almost all $a_2 \in A_2$, the function $a_1 \mapsto f(a_1, a_2)$ is µ-Bochner integrable.

(3) The function $a_1 \mapsto \int_{A_2} f(a_1, a_2) \, d\nu(a_2)$, respectively $a_2 \mapsto \int_{A_1} f(a_1, a_2) \, d\mu(a_1)$, is µ-Bochner integrable, respectively ν-Bochner integrable, and

$\int_{A_1 \times A_2} f(a_1, a_2) \, d(\mu \otimes \nu)(a_1, a_2) = \int_{A_1} \left( \int_{A_2} f(a_1, a_2) \, d\nu(a_2) \right) d\mu(a_1) = \int_{A_2} \left( \int_{A_1} f(a_1, a_2) \, d\mu(a_1) \right) d\nu(a_2).$

Proof. See Proposition 1.2.7 in [17].


1.2.2 Bochner spaces

When one is familiar with the “ordinary” $\mathcal{L}^p$- and $L^p$-spaces, the construction of Bochner spaces will come as no surprise.

Let $(A, \mathcal{A}, \mu)$ be a finite measure space and let $(E, \|\cdot\|_E)$ be a Banach space. Let p ≥ 1 be a real number. Then, we let $\mathcal{L}^p(A; E)$ consist of all µ-strongly measurable functions f : A → E for which

$\int_A \|f\|_E^p \, d\mu < \infty.$

We call two functions equivalent if they are equal µ-almost everywhere: from this equivalence relation, we obtain the $L^p(A; E)$-space, the space of µ-equivalence classes of functions from A to E. If it is of particular importance, we will explicitly mention the underlying σ-algebra and measure. As usual, we will use the accepted abuse of notation and write f for its corresponding µ-equivalence class [f ] in $L^p(A; E)$ and refer to these µ-equivalence classes as functions or processes. This is a vector space that is complete under the norm

$\|f\|_{L^p(A;E)} := \left( \int_A \|f\|_E^p \, d\mu \right)^{1/p},$ (1.19)

as the proof for the ordinary $L^p$-spaces carries over to this setting; see page 12 in [36]. By construction, it holds that the E-valued simple functions are dense in $L^p(A; E)$ with respect to the $\|\cdot\|_{L^p(A;E)}$-norm.

If F ⊂ E is a dense subset, the F-valued simple functions are dense in $L^p(A; E)$ as well.

Lemma 1.2.9. Let p ∈ [1, ∞). Consider a complete finite measure space $(A, \mathcal{A}, \mu)$ and suppose that F is a dense subset of the Banach space E. Then, the F-valued simple functions are dense in $L^p(A; E)$.

Proof. Approximate an E-valued simple function with an F-valued simple function.

Moreover, if the Banach space E is reflexive, the same is true for the $L^p$-Bochner space $L^p(A; E)$.

Proposition 1.2.10 (Proposition 2.2.3(c) in [14]). Let p ∈ (1, ∞), let $(A, \mathcal{A}, \mu)$ be a complete finite measure space and let $(E, \|\cdot\|_E)$ be a reflexive Banach space. Then, $L^p(A; E)$ is reflexive.

Again, to draw comparison to the ordinary Lp spaces, the dual of the Bochner space Lp(A; E) can, under certain conditions, be identified with the conjugate Bochner space of E∗-valued functions.

Theorem 1.2.11. Let $(A, \mathcal{A}, \mu)$ be a complete finite measure space and let $(E, \|\cdot\|_E)$ be a Banach space. Let p ∈ [1, ∞). Then, if the dual space E∗ is reflexive or separable, the map

$L^{\frac{p}{p-1}}(A; E^*) \to \left( L^p(A; E) \right)^* : g \mapsto \int_A \langle g, \cdot \rangle_{E^* \times E} \, d\mu$

is an isometric isomorphism. In particular, this map is an isometric isomorphism if E is reflexive.

Proof/remark. The stated conditions are sufficient, but not necessary, as this result holds for all Banach spaces for which the dual has the Radon-Nikodým property with respect to $(A, \mathcal{A}, \mu)$. It is known that reflexive and separable spaces possess the Radon-Nikodým property. For a proof of the first statement, we refer to Theorem 1.3.10 in [17].

For the “In particular” part, we note that the reflexivity of E implies the reflexivity of the dual space E∗, see Lemma 1.1.28, and so the assertion follows directly.


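Corollary 1.2.12. Let $(A, \mathcal{A}, \mu)$ be a complete finite measure space and let $(E, \|\cdot\|_E)$ be a reflexive Banach space. Let p ∈ [1, ∞). Then,

$I : L^{\frac{p}{p-1}}([0, T] \times \Omega; E^*) \to \left( L^p([0, T] \times \Omega; E) \right)^* : \Phi \mapsto \mathbb{E}\left[ \int_0^T \langle \cdot, \Phi(t) \rangle_{E \times E^*} \, dt \right]$

is an isometric isomorphism.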

The last statement we will discuss here will be used regularly.

Proposition 1.2.13 (Proposition 1.2.24 in [17]). Let $(A_1, \mathcal{A}_1, \mu_1)$ and $(A_2, \mathcal{A}_2, \mu_2)$ be finite measure spaces, let $(E, \|\cdot\|_E)$ be a Banach space and let p ∈ [1, ∞). Suppose that $F \in L^p(A_1 \times A_2; E)$. Then, for $\mu_1$-almost all $x \in A_1$, the map $y \mapsto F(x, y)$ belongs to $L^p(A_2; E)$.

When we replace the Banach space $(E, \|\cdot\|_E)$ with a Hilbert space $(H, \langle \cdot, \cdot \rangle_H)$ and look at the space $L^2(A; H)$, this is an inner product space, with the inner product given by

$\langle f, g \rangle_{L^2(A;H)} = \int_A \langle f, g \rangle_H \, d\mu$

for all (µ-equivalence classes of) functions $f, g \in L^2(A; H)$. From this inner product, one immediately recognizes that the inner product induced norm is the same as the norm in (1.19): this makes the inner product space $L^2(A; H)$ complete, hence, a Hilbert space.

There also exists an L∞(A; E)-Bochner space: we will not use this Bochner space here and will therefore refer the interested reader to Section 1.3.2 in [36].

1.3 A short overview of results in measure, probability and real-valued (local) martingale theory

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and let [0, T ] ⊂ R be a finite interval. A filtration $\mathbb{F} := \{\mathcal{F}_t : t \in [0, T]\}$ is said to be normal if $\mathcal{F}_0$ contains all $\mathbb{P}$-null sets and if the filtration is right-continuous, that is, if $\mathcal{F}_t$ is equal to

$\mathcal{F}_t^+ := \bigcap_{t < u \le T} \mathcal{F}_u$

for every t ∈ [0, T ).

A sequence of stopping times $(\tau_n)_{n \in \mathbb{N}}$ on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is said to be a localizing sequence of stopping times if $\tau_n \le \tau_{n+1}$ for all n ∈ N and if $\tau_n \uparrow T$ $\mathbb{P}$-almost surely.

When the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is endowed with a normal filtration $\mathbb{F}$, we say that a right-continuous, $\mathbb{F}$-adapted, real-valued process X = {X(t) : t ≤ T } is a local martingale if there exists a localizing sequence of stopping times {τn : n ≥ 1} such that for every n ∈ N, the stopped process $X^{\tau_n} := \{X(\tau_n \wedge t) : t \le T\}$ is a uniformly integrable martingale. When X is continuous, even more is true.

Proposition 1.3.1 (Proposition 4.5 in [33]). If X is a continuous local martingale starting at zero $\mathbb{P}$-almost surely, then there exists a localizing sequence of stopping times {τn : n ∈ N} such that the processes $X^{\tau_n}$ are bounded martingales.

Denote by $\langle X \rangle$ the quadratic variation process for a real-valued continuous local martingale X starting at zero. That is: the process $\langle X \rangle$ is the natural increasing process, unique up to $\mathbb{P}$-indistinguishability, that turns $X^2 - \langle X \rangle$ into a local martingale.

Then, the Burkholder-Davis-Gundy inequality holds, a result that relates the maximum of a local martingale to its quadratic variation.


Proposition 1.3.2 (Burkholder-Davis-Gundy inequality). For every p > 0, there exists a universal constant $C_p$, only depending on p, such that for all continuous local martingales M = {M (t) : t ≤ T } starting at zero with respect to a normal filtration $\mathbb{F}$, the inequality

$\mathbb{E}\left[ \sup_{t \le \tau} |M(t)|^{2p} \right] \le C_p \, \mathbb{E}\left[ \langle M \rangle_\tau^p \right]$ (1.20)

holds for every finite stopping time τ.

Proof. See Theorem 3.28 (page 166) in [21].

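From the Burkholder-Davis-Gundy inequality, we obtain the following auxiliary results.

Corollary 1.3.3. Let δ, ε ∈ (0, ∞). Let M = {M (t) : t ≤ T } be a real-valued continuous local martingale starting at zero with respect to a normal filtration $\mathbb{F}$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Then, we have

$\mathbb{P}\left( \sup_{t \le T} |M(t)| \ge \varepsilon \right) \le \frac{3}{\varepsilon} \, \mathbb{E}\left[ \langle M \rangle_T^{1/2} \wedge \delta \right] + \mathbb{P}\left( \langle M \rangle_T > \delta^2 \right).$

Proof. See Corollary D.0.2 in [23].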

Proposition 1.3.4. Let M = {M (t) : t ≤ T } be a real-valued continuous local martingale starting at zero with respect to a normal filtration $\mathbb{F}$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Suppose that $\mathbb{E}\left[ \langle M \rangle_T^{1/2} \right] < \infty$. Then, the local martingale M is a martingale starting at zero.

Proof. See Proposition D.0.1(ii) in [23].

We end with two measure theoretic results.

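Lemma 1.3.5 (Lemma 2.12.5 in [4]). Let $(X, \mathcal{A}_X), (Y_1, \mathcal{A}_1), \ldots, (Y_k, \mathcal{A}_k)$ be measurable spaces. Let the space $Y := Y_1 \times \cdots \times Y_k$ be equipped with the σ-algebra $\mathcal{A}_Y := \mathcal{A}_1 \otimes \cdots \otimes \mathcal{A}_k$. Then, the mapping $F := (F_1, \ldots, F_k) : X \to Y$ is $\mathcal{A}_X / \mathcal{A}_Y$-measurable if and only if every $F_i$ is $\mathcal{A}_X / \mathcal{A}_i$-measurable.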

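Lemma 1.3.6. Let $(A_1, \mathcal{A}_1, \mu)$ and $(A_2, \mathcal{A}_2, \nu)$ be finite measure spaces. If $\bar{\mu}$ is a measure on $(A_1, \mathcal{A}_1)$ that is absolutely continuous with respect to µ, then we have $\bar{\mu} \otimes \nu \ll \mu \otimes \nu$ on the measurable space $(A_1 \times A_2, \mathcal{A}_1 \otimes \mathcal{A}_2)$.

Proof. The assertion is valid on the π-system $\{A_1 \times A_2 : A_1 \in \mathcal{A}_1, A_2 \in \mathcal{A}_2\}$ that generates the product σ-algebra $\mathcal{A}_1 \otimes \mathcal{A}_2$.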

1.4 A brief summary of properties of functions

Later on in this thesis, we will need some results on lower semi-continuous functions and functions of bounded variation. This topic is not a main focus of this thesis and we will therefore only list the necessary definitions and results.


1.4.1 Lower semi-continuity

This section is based on Section 2.10 in [2].

Definition 1.4.1 (Lower semi-continuous function). Let X be a topological space. Then, a function f : X → [−∞, ∞] is called lower semi-continuous if the set $f^{-1}([-\infty, c])$ is closed in X for each c ∈ R.

In particular, this means that every lower semi-continuous function is measurable and that every continuous function is lower semi-continuous as well. When X is a metric space, we have the following results.

Lemma 1.4.2 (Lemma 2.42 in [2]). Let (X, d) be a metric space. A function f : X → [−∞, ∞] is lower semi-continuous if and only if $\liminf_{x \to x_0} f(x) \ge f(x_0)$ for all $x_0 \in X$.

Lemma 1.4.3. Let (X, d) be a metric space and let f : X → [0, ∞] be a non-negative lower semi-continuous function. If the set A ⊂ X is dense in X, it holds that $\sup_{x \in A} f(x) = \sup_{x \in X} f(x)$.

Proof. One inequality is obvious from A ⊂ X, whereas the other inequality can be shown with Lemma 1.4.2.
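The non-trivial inequality in Lemma 1.4.3 can be sketched as follows. Fix an arbitrary $x_0 \in X$ and, using the density of A, pick a sequence $(a_k)_{k \in \mathbb{N}} \subset A$ with $a_k \to x_0$; the sequence $(a_k)$ is notation introduced here. Lemma 1.4.2 then yields

```latex
\begin{align*}
f(x_0) \le \liminf_{k \to \infty} f(a_k)
  \le \sup_{x \in A} f(x),
  \qquad \text{where } (a_k)_{k \in \mathbb{N}} \subset A,\ a_k \to x_0 .
\end{align*}
```

and taking the supremum over $x_0 \in X$ gives $\sup_{x \in X} f(x) \le \sup_{x \in A} f(x)$.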

We end with two useful results on lower semi-continuity.

Lemma 1.4.4 (Lemma 2.41 in [2]). The pointwise supremum of lower semi-continuous functions is lower semi-continuous.

Lemma 1.4.5. Let X and Y be topological spaces. Suppose that the function ψ : X → Y is continuous and that the function f : Y → [−∞, ∞] is lower semi-continuous. Then, the map f ◦ ψ : X → [−∞, ∞] is lower semi-continuous as well.

Proof. The set $(f \circ \psi)^{-1}([-\infty, c]) = \psi^{-1}\left( f^{-1}([-\infty, c]) \right)$ is closed in X for every c ∈ R, as the preimage of a closed set under a continuous map.

1.4.2 Functions of bounded variation

We will list some results on functions of bounded variation.

Proposition 1.4.6 (Jordan decomposition for functions of bounded variation). Let [a, b] ⊂ R be a finite interval. A function f : [a, b] → R is of bounded variation if and only if f is the difference of two monotonically non-decreasing real-valued functions on [a, b].

Proof. See [29], Theorem 5 in Part 1, Chapter 5, Section 2 (p. 103).

We call a function f : [a, b] → R monotone if it is either monotonically non-decreasing or monotonically non-increasing.

Corollary 1.4.7. If f : [a, b] → R is monotone, it is of bounded variation.

Lemma 1.4.8. Let g : [a, b] → R be integrable and let c ∈ R. Then, the function $f : [a, b] \to \mathbb{R},\ x \mapsto c + \int_a^x g(s) \, ds$ is a continuous function of bounded variation.

Proof. Follows from Lemma 7 in Part 1, Chapter 5, Section 3 (p. 105) in [29].

We will later employ the ‘integration by parts’-formula for continuous functions of bounded variation.

Proposition 1.4.9 (Integration by parts). Let f, g : [a, b] → R be continuous functions of bounded variation. Then, the following ‘integration by parts’-formula holds:

$\int_a^b f(x) \, dg(x) + \int_a^b g(x) \, df(x) = f(b)g(b) - f(a)g(a).$
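The formula can be checked numerically with Riemann-Stieltjes sums. The sketch below is an illustration only: the functions f(x) = x² and g(x) = sin x on [0, 1], the left-endpoint rule and the number of subintervals are all assumptions chosen here, not taken from the text.

```python
import math

def stieltjes_sum(f, g, a, b, n):
    """Left-endpoint Riemann-Stieltjes sum approximating the integral of f dg over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x0 = a + i * h
        x1 = a + (i + 1) * h
        total += f(x0) * (g(x1) - g(x0))
    return total

def f(x):
    return x * x  # continuous and monotone on [0, 1], hence of bounded variation

def g(x):
    return math.sin(x)  # continuous and monotone on [0, 1]

a, b, n = 0.0, 1.0, 20000

# Left-hand side: integral of f dg plus integral of g df (approximated)
lhs = stieltjes_sum(f, g, a, b, n) + stieltjes_sum(g, f, a, b, n)
# Right-hand side: boundary terms f(b)g(b) - f(a)g(a)
rhs = f(b) * g(b) - f(a) * g(a)
```

With left-endpoint sums the two sides differ by the sum of the products of increments of f and g, which vanishes as the mesh tends to zero because both functions are continuous and of bounded variation.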


Anyone that is familiar with real-valued stochastic integrals knows what the ingredients are for the construction of a stochastic integral. In a nutshell, one has to introduce Brownian motion, also known as a Wiener process, consider simple processes and then approach a, say, continuous process with these simple processes. From this approximation follows the definition of a stochastic integral.

The approach that we take for stochastic integrals in Hilbert spaces is not that much different, but we need to translate the real-valued setting into a Hilbert space setting, and that requires some effort. Luckily, many properties of real-valued stochastic integrals carry over, such as the Itô isometry, albeit in a slightly different form, and the optional sampling property.

In this chapter, we fix two separable (real) Hilbert spaces U and H. The approach we take is more or less the same as is done in Chapter 2 in [27] and in Chapter 2 in [23] and we will mostly omit the proofs of the statements posed in this chapter by referring the reader to the corresponding statements in the literature.

Before we can define the stochastic integral, we need to introduce the Hilbert-space equivalent of Brownian motion, known as a Q-Wiener process, that depends on a non-negative symmetric finite trace operator Q on U; see Section 2.2. Just like in the real-valued setting, this is a continuous process with independent, Gaussian increments (see Definition 2.2.1). However, we first need to define what “Gaussian” means in this context and introduce more probability and martingale theory in Banach and Hilbert spaces: we will therefore start this chapter with a section on that topic (Section 2.1).

We will also encounter cylindrical Wiener processes in Section 2.4 and construct the stochastic integral with respect to a standard cylindrical Wiener process; more on that later.

2.1 Probability and martingale theory in Banach spaces

We will introduce Gaussian measures on Hilbert spaces and give a summary on the general theory of E-valued random variables and stochastic processes.

2.1.1 Probability theory: Gaussian measures on Hilbert spaces

The Riesz-Fréchet Theorem allows us to identify every element of U∗ with a unique element of U. We will denote the Riesz-Fréchet isomorphism by Φ. Recall that we write B(U) for the Borel-σ-algebra on U. Let µ be a measure on (U, B(U)). Since every element in U∗ is continuous, it is measurable with respect to the Borel-σ-algebra B(U) on U, which allows us to define the pushforward measure µ_u for each u ∈ U by

$$\mu_u : \mathcal{B}(\mathbb{R}) \to [0, \infty) : A \mapsto \mu\bigl( (\Phi(u))^{-1}[A] \bigr).$$

This measure µ is called Gaussian if the pushforward measure µ_u admits a Gaussian law for every u ∈ U, that is, for every u ∈ U, there exist m_u ∈ R and σ_u ∈ [0, ∞) such that µ_u ∼ N(m_u, σ_u²), i.e.,

$$\mu_u(A) = \frac{1}{\sigma_u \sqrt{2\pi}} \int_A e^{-\frac{(x - m_u)^2}{2\sigma_u^2}}\,dx$$

for all A ∈ B(R) if σ_u > 0, or, if σ_u = 0, µ_u(A) = δ_{m_u}(A) for all A ∈ B(R).

There is another way to determine whether a measure µ on (U, B(U)) is Gaussian.

Theorem 2.1.1. A measure µ on (U, B(U)) is Gaussian if and only if there exists an m ∈ U and a nonnegative, symmetric Q ∈ L(U) with finite trace such that

$$\hat{\mu}(u) := \int_U e^{i\langle u, v\rangle_U}\,d\mu(v) = e^{i\langle m, u\rangle_U - \frac{1}{2}\langle Qu, u\rangle_U}$$

for all u ∈ U. Moreover, this µ is uniquely determined by m and Q.

Proof. See Theorem 2.1.2 in [27].

The element m ∈ U in Theorem 2.1.1 is called the mean of µ and the operator Q ∈ L(U) is called the covariance (operator). If µ has mean m and covariance Q, we will write µ ∼ N(m, Q).
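
The characteristic function in Theorem 2.1.1 also identifies the parameters of the one-dimensional pushforward measures µ_u used in the definition of a Gaussian measure. The following is a sketch, assuming that Φ(u) denotes the functional v ↦ ⟨u, v⟩_U:

```latex
% Characteristic function of \mu_u at t \in \mathbb{R}, by change of variables:
\[
  \int_{\mathbb{R}} e^{itx}\,d\mu_u(x)
  = \int_U e^{it\langle u, v\rangle_U}\,d\mu(v)
  = \hat{\mu}(tu)
  = e^{\,it\langle m, u\rangle_U - \frac{1}{2} t^2 \langle Qu, u\rangle_U}.
\]
% This is the characteristic function of a real Gaussian law, so
\[
  \mu_u \sim N\bigl( m_u, \sigma_u^2 \bigr)
  \quad\text{with}\quad
  m_u = \langle m, u\rangle_U,
  \qquad
  \sigma_u^2 = \langle Qu, u\rangle_U .
\]
```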

A Gaussian measure µ on (U, B(U)) has the following properties.

Lemma 2.1.2. Let µ be a Gaussian measure on (U, B(U)) with mean m and covariance operator Q. Then, the following properties hold.

(i) For all u ∈ U, we have ∫_U ⟨x, u⟩_U dµ(x) = ⟨m, u⟩_U.

(ii) For all u, v ∈ U, we have ∫_U ⟨x − m, u⟩_U ⟨x − m, v⟩_U dµ(x) = ⟨Qu, v⟩_U.

(iii) The identity ∫_U ‖x − m‖²_U dµ(x) = tr(Q) holds.

Proof. See Theorem 2.1.2 in [27].
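
The three identities of Lemma 2.1.2 can be illustrated by a small Monte Carlo experiment in finite dimensions. The sketch below is only an illustration under assumptions made here: U = R³ with the Euclidean inner product, a covariance Q that is diagonal in the standard basis, and arbitrarily chosen vectors m, u and v. None of these values come from the text.

```python
import math
import random

random.seed(0)  # fixed seed so that the run is reproducible

# Assumed toy data: U = R^3, Q = diag(eigenvalues) in the standard basis.
m = [1.0, -2.0, 0.5]
eigenvalues = [1.0, 0.5, 0.25]


def sample():
    """Draw x = m + sum_k sqrt(lambda_k) * xi_k * e_k with xi_k i.i.d. N(0, 1)."""
    return [m_k + math.sqrt(lam) * random.gauss(0.0, 1.0)
            for m_k, lam in zip(m, eigenvalues)]


def inner(x, y):
    """Euclidean inner product on R^3."""
    return sum(a * b for a, b in zip(x, y))


u = [1.0, 1.0, 0.0]
v = [0.0, 2.0, -1.0]
n = 50_000

samples = [sample() for _ in range(n)]
centered = [[x_k - m_k for x_k, m_k in zip(x, m)] for x in samples]

# Empirical counterparts of the integrals in (i), (ii) and (iii).
mean_est = sum(inner(x, u) for x in samples) / n
cov_est = sum(inner(c, u) * inner(c, v) for c in centered) / n
trace_est = sum(inner(c, c) for c in centered) / n

# Exact right-hand sides from the lemma.
Qu = [lam * u_k for lam, u_k in zip(eigenvalues, u)]
mean_exact = inner(m, u)         # <m, u> = -1.0
cov_exact = inner(Qu, v)         # <Qu, v> = 1.0
trace_exact = sum(eigenvalues)   # tr(Q) = 1.75

print(f"(i)   estimate {mean_est:.3f}, exact {mean_exact:.3f}")
print(f"(ii)  estimate {cov_est:.3f}, exact {cov_exact:.3f}")
print(f"(iii) estimate {trace_est:.3f}, exact {trace_exact:.3f}")
```

The estimates fluctuate around the exact values with an error of order n^(−1/2); the sampling scheme itself is just one convenient way to realize N(m, Q) when Q is diagonal.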

2.1.2 Random variables and stochastic processes

We will summarize the general theory of E-valued random variables and stochastic processes. In this section, we will assume throughout that E is a separable Banach space.

2.1.2.1 Random variables

A map X from Ω into E is called a random variable if it is measurable. Normally, we would require this map to be P-strongly measurable, but since E is separable, measurability is sufficient; see Observation 1.2.2. The random variable X is called p-(Bochner) integrable if E‖X‖_E^p < ∞.

When p = 2, we say that X is square-integrable and when p = 1, we will just say that X is integrable.

If a random variable X on a probability space (Ω, F, P) takes values in a separable Hilbert space U, we say that X is Gaussian if its law

$$\mathbb{P}_X := \mathbb{P} \circ X^{-1} : \mathcal{B}(U) \to [0, 1] : A \mapsto \mathbb{P}\bigl( X^{-1}[A] \bigr)$$

is Gaussian, that is, there is an element m ∈ U and a nonnegative, symmetric, finite trace operator Q ∈ L(U) such that P_X ∼ N(m, Q).

2.1.2.2 Stochastic processes

We say that a family X := {X_t : t ∈ [0, T]} is an E-valued (stochastic) process if X_t is an E-valued random variable for every t ∈ [0, T]. We will sometimes refer to an element t ∈ [0, T] as a time point.

Comparable to the real-valued case, we say that a process X is continuous if all sample paths [0, T] → E : t ↦ X_t(ω) are continuous, and we call a process X P-almost surely continuous if the sample paths are continuous P-almost surely. In addition, a process X is said to be p-integrable if X_t is p-integrable for each t ∈ [0, T], with natural definitions for integrable and
