
Epidemic Models

Maaike Koninx

July 12, 2015

Bachelor thesis

Supervisor: dr. S.G. Cox

Korteweg-de Vries Instituut voor Wiskunde


Abstract

In our everyday lives, we are challenged with many questions concerning our health. In this thesis, we will explore the spread of infectious diseases using a mathematical approach. Our primary goal is to examine multiple epidemic models and their related traits that contribute to the occurrence of an epidemic. We will consider discrete and continuous time epidemic models and argue that for all models the same general result emerges: in the supercritical regime an epidemic may occur, while in the subcritical regime the disease will surely die out. For our discrete time models, we will compute the probability that an infectious disease will die out and discuss the probability that a certain proportion of the population is infected. For our continuous time models, we will explore the conditions that lead to a decrease in the number of infected people. We will argue that if we view the number of infected people as a Markov chain on N, the number of infected people approximates the solution to a differential equation in large populations.

Title: Epidemic Models

Author: Maaike Koninx, maaike.koninx@student.uva.nl, 6060102

Supervisor: dr. S.G. Cox

Second examiner: dr. B.J.K. Kleijn

Date: July 12, 2015

Korteweg-de Vries Instituut voor Wiskunde
Universiteit van Amsterdam
Science Park 904, 1098 XH Amsterdam
http://www.science.uva.nl/math


Contents

1. Introduction
2. The Galton-Watson branching process
3. Discrete time SIR epidemic models
   3.1. The Reed-Frost model
   3.2. The Erdős-Rényi model
   3.3. Simulation
4. Continuous time epidemic models
   4.1. The deterministic model
   4.2. Kurtz's Theorem
   4.3. Proof of Lemma 4.5
   4.4. Simulation
5. Conclusion
6. Populaire samenvatting
Bibliography
A. Appendix A
B. Appendix B
   B.1. Supporting theorems for Chapter 4
C. Appendix C
   C.1. The number of infected people in a Reed-Frost epidemic
   C.2. The number of infected people on G(n,p)
   C.3. The mean number of infected people for 100 simulations on G(n,p)
   C.4. Number of infected people in a Markov chain model


1. Introduction

In our everyday lives, we are challenged with many questions concerning our health. New infectious diseases are discovered on a regular basis, while older, well-known diseases are still present in our world today. Some of the most well-known infectious diseases have played a major part in history, and even altered its course. For example, in the 1300s a series of plague epidemics, also known as the Second plague pandemic, started with the Black Death and wiped out 30 to 70 percent of the population of Europe. Since then, many more outbreaks of the plague have been well documented, as well as major outbreaks of cholera, influenza, measles, yellow fever, etc. The world is becoming increasingly interconnected and naturally a concern for our health arises. In this thesis, we will explore the spread of infectious diseases using a mathematical approach.

Figure 1.1.: A depiction of the Black Death from a 15th century bible.

Our main concern in this thesis is under which circumstances an epidemic occurs. We will define an epidemic as the presence of a certain number of individuals infected with an infectious disease in a population for a substantial amount of time. In general, an epidemic may occur when a single individual or a small proportion of the population is exposed to an infectious disease. If the infectious individuals make contact with other members of the population, the disease spreads throughout the population. Whether or not an epidemic occurs is determined by certain traits of the disease as well as of the population. When the population is exposed to the disease, an epidemic may or may not occur according to these traits. For example, a highly infectious disease and a high rate at which people make contact in a population may lead to an epidemic, while other traits, like the rate at which sick people recover, may lead to a decline in the number of infected people.

The primary goal of this thesis is to examine multiple epidemic models and their related traits that contribute to the occurrence of an epidemic. Furthermore, if an epidemic occurs, we are interested in the number of people that become infected and in the probability that the disease will die out eventually. We will support our theoretical findings with computer simulations. We argue that the same general result emerges for all of our epidemic models, which is the following:

If the expected number of people a single infected person infects is greater than one, an epidemic may occur. If the expected number of people he infects is less than or equal to one, the disease will surely die out.

In Chapter 2, we consider the spread of an infectious disease in a branching process and we compute the probability that an infectious disease will die out. In Chapter 3 we discuss discrete time epidemic models, where we argue that the Reed-Frost SIR model is equivalent to an epidemic on an Erdős-Rényi (E-R) random graph given the same initial state. Here we discuss the probability that a certain proportion of the population is infected. In Chapter 4, we introduce two continuous time epidemic models. We argue that if we view the number of infected people as a Markov chain on N, we find, by applying Kurtz's Theorem, that the number of infected people approximates the solution to a differential equation in large populations.

Acknowledgments. I would like to sincerely thank my supervisor Sonja Cox for her guidance throughout this project.


2. The Galton-Watson branching process

For our first exploration of epidemics, we consider a simple branching process in which an epidemic starts at a so-called patient zero and spreads with a certain probability to a number of his contacts. In turn, the infected individuals infect their contacts with a certain probability. This process continues as long as the disease is passed on between generations and is called the Galton-Watson branching process.¹

¹This model is best known not for its application to epidemics, but for its application in genealogy. The model was first introduced by Francis Galton to study the extinction of family names. The terminology used in this chapter is therefore closely related to common word usage in genealogy.

Definition 2.1. Let the random variable N represent the number of people to whom each infected individual passes the disease. The distribution of N is referred to as the offspring distribution, and its expected value R0 = E[N] is called the basic reproductive number.

We are interested in the probability that the disease lives on for a certain amount of time. The probability p^ext_n that the disease has died out by the nth generation forms a sequence in [0, 1] that converges to a point p^ext, the probability that the disease dies out eventually. By Theorem 2.2, whether the epidemic dies out or may continue to exist indefinitely depends entirely on the offspring distribution. If the expected number of new infections an infected individual causes is at most one, the disease will surely die out. If this number is greater than one, the disease may continue to exist indefinitely. This is stated in the next theorem.

Note that in Theorem 2.2 below we do not consider the case where P[N = 1] = 1. In this case, every infected individual spreads his disease to exactly one individual. It is clear that p^ext_n = 0 for every n ∈ N, and therefore p^ext = 0.

Theorem 2.2. Suppose P[N = 1] < 1. Then the following regimes occur:

(a) If R0 ≤ 1, then p^ext = 1.

(b) If R0 > 1, then p^ext < 1.

Proof. Let X_n be the number of infected people in the nth generation. We view an individual i from the first generation and notice that, if i is infected, he can be considered as patient zero in his own branching process. In particular, if Z_{n,i} represents the number of infected people in the nth generation after i, (in)directly caused by i ∈ X_1, then Z_{n+1,i} ∼ X_n. Furthermore, for i = 1, 2, . . . ∈ X_1, the {Z_{n,i}} are i.i.d. by definition and we can deduce the following:
\[
E[s^{X_{n+1}} : X_1 = j] = E\big[s^{\sum_{i=1}^{j}Z_{n+1,i}}\big] = E\big[s^{Z_{n+1,1}}\cdot s^{Z_{n+1,2}}\cdots s^{Z_{n+1,j}}\big]\overset{\text{i.i.d.}}{=}E\big[s^{Z_{n+1,1}}\big]^{j}=E\big[s^{X_n}\big]^{j}. \qquad (2.1)
\]
Now we can introduce the probability generating function φ_n(s) of X_n, by
\[
\varphi_n(s)=E[s^{X_n}]=\sum_{j=0}^{\infty}E[s^{X_n} : X_1=j]\,P[X_1=j]\overset{(2.1)}{=}\sum_{j=0}^{\infty}E[s^{X_{n-1}}]^{j}\,P[N=j]=\sum_{j=0}^{\infty}(\varphi_{n-1}(s))^{j}\,P[N=j].
\]
Note that φ_n(0) = P[X_n = 0] = p^ext_n. Consequently, setting s = 0 in the previous equation gives us
\[
p^{ext}_n=\sum_{j=0}^{\infty}\big(p^{ext}_{n-1}\big)^{j}\,P[N=j]. \qquad (2.2)
\]
Since p^ext_n converges to a point p^ext in [0, 1], letting n → ∞ on both sides of (2.2) shows that p^ext is a solution to the equation x = Σ_{j} x^j P[N = j]. We will now argue that p^ext is the smallest solution to this equation in [0, 1]. This will finalize our proof.

Let f : [0, 1] → R be defined by f(x) = Σ_{j=0}^∞ x^j P[N = j]. We deduce the following:

(i) f is non-decreasing, as f′(x) = Σ_{j=1}^∞ j x^{j−1} P[N = j] ≥ 0 for all x ∈ [0, 1].

(ii) f is strictly convex whenever P[N ≥ 2] > 0, which holds in particular if P[N = 1] < 1 and E[N] ≥ 1; in that case f″(x) = Σ_{j=2}^∞ j(j−1) x^{j−2} P[N = j] > 0 for all x ∈ (0, 1].

(iii) f(1) = Σ_{j=0}^∞ P[N = j] = 1.

Figure 2.1.: A graph of f in the case where (a) R0 ≤ 1 and (b) R0 > 1.

We can see from Figure 2.1 that in the cases where (a) R0 ≤ 1 and (b) R0 > 1 the following is true:

(a) In the case where R0 ≤ 1, the convexity of f together with the condition that f′(1) = E[N] = R0 ≤ 1 forces the graph of f to lie strictly above the line y = x on [0, 1). This implies that the sequence p^ext_n is increasing and converges to p^ext = 1.

(b) In the case where R0 > 1, the condition that f′(1) = R0 > 1 together with the strict convexity of f and the properties above imply that the graph of f intersects the line y = x in exactly one point a ∈ [0, 1), and that a is the unique solution to f(x) = x in [0, 1). Indeed, if p^ext_n ∈ (a, 1) then (p^ext_n) is a decreasing sequence, and if p^ext_n ∈ [0, a) then (p^ext_n) is increasing. We conclude that the sequence p^ext_n converges to a = p^ext < 1.

Definition 2.3. We will denote R0 < 1 as the subcritical regime, R0 = 1 as the critical regime and R0 > 1 as the supercritical regime.

The previous theorem tells us that if we know the value of R0, we can predict whether a disease will surely die out or may persist indefinitely in a population. We refer the interested reader to Appendix A for an overview of well-known infectious diseases and their basic reproductive numbers. In particular, if we know the distribution of N, we can calculate the extinction probability of the disease by finding the minimal non-negative solution to 2.2. We conclude this chapter by examining the extinction probability for some distributions of N.

Example 2.4. Let 0 < p < 1 and consider p_k = P[N = k] = p^k(1 − p) for k = 0, 1, . . . as the probability of infecting k people. Clearly p_k ∈ (0, 1) for all k = 0, 1, . . ., and these probabilities add up to one, seeing as
\[
\sum_{k=0}^{\infty}p_k=(1-p)\sum_{k=0}^{\infty}p^{k}=(1-p)(1-p)^{-1}=1.
\]
Therefore the set {p_k} is a probability distribution.

Suppose that p ≤ 1/2. Since
\[
E[N]=\sum_{k=0}^{\infty}k\,p_k=(1-p)\sum_{k=0}^{\infty}k\,p^{k}=(1-p)\frac{p}{(1-p)^{2}}=\frac{p}{1-p},
\]
we know that E[N] ≤ 1. Seeing as p_1 < 1, Theorem 2.2 tells us that p^ext = 1. In this case, the disease will almost surely die out at a certain point in time.

Now suppose p > 1/2; then E[N] > 1 and by Theorem 2.2 p^ext < 1. We are interested in finding the value of p^ext. Therefore we have to find the minimal non-negative solution to φ(x) = x for
\[
\varphi(x)=\sum_{k=0}^{\infty}x^{k}p_k=(1-p)\sum_{k=0}^{\infty}(xp)^{k}=(1-p)\frac{1}{1-xp}.
\]
Some calculations show that φ(x) = x on [0, 1] when x equals 1 or (1 − p)/p. Seeing as p^ext < 1, we find that p^ext = (1 − p)/p. In this case, the disease will live on indefinitely with probability 1 − p^ext = (2p − 1)/p.
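As a quick numerical check of this example (a sketch added for illustration, not part of the thesis code), one can simulate many Galton-Watson processes with this geometric offspring distribution and estimate p^ext empirically. The sketch assumes Matlab's geornd from the Statistics Toolbox, which with parameter 1 − p samples exactly P[N = k] = p^k(1 − p); the values of p, trials, maxgen and maxpop are example choices.

p = 0.6;                                    % supercritical: E[N] = p/(1-p) = 1.5
trials = 2000; maxgen = 100; maxpop = 1e4;
extinct = 0;
for t = 1:trials
    X = 1;                                  % start with patient zero
    for gen = 1:maxgen
        X = sum(geornd(1-p, 1, X));         % total offspring of the current generation
        if X == 0
            extinct = extinct + 1; break    % the disease has died out
        elseif X > maxpop
            break                           % population explodes: count as survival
        end
    end
end
extinct/trials                              % estimate of pext; theory gives (1-p)/p = 0.667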

Example 2.5. We can consider the spread of an infectious disease not only by counting infectious individuals but also by counting the healthy ones. Whenever a strand of DNA is exposed to a certain amount of radiation, its molecular composition can be permanently altered. These errors in genetic material are called mutations. Suppose that a healthy strand of DNA replicates itself during radiation and the number of healthy replicas N is Poisson distributed with parameter λ. The healthy replicas in turn replicate themselves, with the same offspring distribution N. This process is again a Galton-Watson branching process and E[N] = R0 = λ, so the Poisson parameter equals the expected number of non-mutated replica DNA strands. The probability p^ext that the healthy DNA strands die out at a certain point in time is the minimal non-negative solution to
\[
x=\varphi(x)=\sum_{k=0}^{\infty}x^{k}\frac{\lambda^{k}e^{-\lambda}}{k!}=e^{-\lambda}\sum_{k=0}^{\infty}\frac{(x\lambda)^{k}}{k!}=e^{\lambda(x-1)}. \qquad (2.3)
\]
The exact solution of this equation cannot be obtained analytically, but we can obtain it numerically. For instance, if λ = 2 we get p^ext = 0.2032, which tells us that when we expect every healthy DNA strand to generate 2 healthy replicas, the probability that the healthy DNA strands die out is about 20.3%.
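The minimal non-negative solution of equation (2.3) is easy to approximate in a few lines of plain Matlab (a sketch added for illustration): fixed-point iteration starting from x = 0, which, like the sequence p^ext_n itself, increases to the smallest solution.

lambda = 2;                    % expected number of healthy replicas
x = 0;                         % start below the smallest fixed point
for k = 1:100
    x = exp(lambda*(x - 1));   % iterate equation (2.3)
end
x                              % approximately 0.2032 for lambda = 2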


3. Discrete time SIR epidemic models

As we saw in Chapter 2, the Galton-Watson branching process is an effective model for describing the extinction probability of an infectious disease that starts at a patient zero. However, the Galton-Watson branching process is hardly an effective model for describing epidemics in real life. In real-life epidemics, a healthy person can contract an infectious disease from any of the multiple infected individuals he meets. In the Galton-Watson process a healthy person can only contract the disease from his direct contact in the former generation. Furthermore, in real epidemics, infectious diseases often do not get discovered until a certain proportion of the population is sick. One cannot easily trace the generations in which the disease has spread, much less find patient zero.

Hence we need to find another model that describes the spread of infectious diseases without assuming that the process starts at a patient zero, while allowing the disease to reach healthy people through multiple channels. One such model is the Reed-Frost model.

3.1. The Reed-Frost model

The Reed-Frost model is a discrete time SIR epidemic process. An SIR epidemic process involves three types of individuals: susceptible (S), infectious (I) and resistant (R). In a population of n individuals, during each unit of time, an infected person infects each susceptible with probability p, independently of other infected people. The infected person is then assigned as resistant. During the following unit of time, the process continues for the newly infectious group of people. Once resistant, a person cannot become susceptible again, since being resistant equals being immune. The process can be represented as

S → I → R

A more detailed description is as follows. Let S(t), I(t) and R(t) be the number of susceptible, infectious and resistant people respectively at time t. From the assumptions it is clear that S(t) + I(t) + R(t) = n. Then Z(t) = (S(t), I(t), R(t)) ∈ {0, 1, . . . , n}^3 is a random variable that represents the state of the population at time t. Let z, z̃ ∈ {0, 1, . . . , n}^3 be two states. If the number of susceptibles S(z) in state z equals S(z) = I(z̃) + S(z̃), then it is possible to reach state z̃ from state z in one step (notation: z → z̃). If {Z(t)}_{t∈N} is a random walk on {0, 1, . . . , n}^3 and z → z̃ is possible, then the transition probabilities equal
\[
P[Z(t+1)=\tilde z : Z(t)=z]=\binom{S(z)}{S(\tilde z)}\Big((1-p)^{I(z)}\Big)^{S(\tilde z)}\Big(1-(1-p)^{I(z)}\Big)^{S(z)-S(\tilde z)} \qquad (3.1)
\]
\[
=\underbrace{\binom{S(z)}{S(\tilde z)}}_{(i)}\;\underbrace{\Big((1-p)^{I(z)}\Big)^{S(\tilde z)}}_{(ii)}\;\underbrace{\Big(1-(1-p)^{I(z)}\Big)^{I(\tilde z)}}_{(iii)}.
\]

Explanation. (i) The number of ways to obtain S(z̃) susceptibles out of S(z) susceptibles.

(ii) The probability that S(z̃) people do not get infected. For each susceptible, the probability of not getting infected is (1 − p)^{I(z)}, so for all S(z̃) susceptibles this probability equals term (ii).

(iii) The probability that I(z̃) people get infected. The probability for each susceptible of getting infected is 1 − (1 − p)^{I(z)}, so for all I(z̃) of them this probability equals (iii). Note that one has to encounter only one infected person to possibly get sick, but we do not exclude the case where one meets multiple infected persons. Therefore, this probability does not equal p^{I(z)I(z̃)}.

The reader can check that these probabilities add up to 1 on the set {z̃ : z → z̃ is possible}.

Remark. Suppose that {Z(t)}_{t∈N} is a random walk on {0, 1, . . . , n}^3. Then the conditional transition probability P[Z(t + 1) = z̃ : Z(t) = z, . . . , Z(0) = z_0] depends only on the present state Z(t), i.e.
\[
P[Z(t+1)=\tilde z : Z(t)=z,\dots,Z(0)=z_0]=P[Z(t+1)=\tilde z : Z(t)=z].
\]
This means that the future of the process depends only on the present state and not on the history of the process. We call this memoryless property the Markov property, making Z(t) a Markov chain on {0, 1, . . . , n}^3.

As can be seen from equation 3.1, the conditional random variable S(t + 1)|Z(t) has a binomial distribution with parameters (S(t), (1 − p)^{I(t)}). Using the law of large numbers, it can be shown that this distribution approximates a Poisson distribution when n → ∞ and I(t) is a finite number, i.e. in a population with a large number of healthy individuals and a small fraction of infected people.
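Because S(t + 1)|Z(t) is binomial, a Reed-Frost epidemic can also be simulated directly, without tabulating the transition probabilities in 3.1. The following sketch (added for illustration; the thesis script is in Appendix C.1) assumes Matlab's binornd from the Statistics Toolbox and the example values n = 1000, R0 = 2 and I(0) = 10.

n = 1000; R0 = 2; p = R0/n;          % example parameters
S = n - 10; I = 10; R = 0;           % initial state Z(0) with I(0) = 10
while I > 0
    Snew = binornd(S, (1-p)^I);      % susceptibles that escape infection this step
    Inew = S - Snew;                 % newly infected people
    R = R + I;                       % current infectives become resistant
    S = Snew; I = Inew;
end
R                                    % total number of people that were ever infected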

Theorem 3.1 (Law of large numbers). Let t ≥ 0 be fixed and let λ = S(t)(1 − p)^{I(t)}. Then for finite I(t) and n → ∞, S(t + 1)|Z(t) is approximately Poisson(λ) distributed.

Proof. We write p̂ = (1 − p)^{I(t)}, so that S(t + 1)|Z(t) ∼ Bin(S(t), p̂) and its generating function equals
\[
\varphi(s)=E\big[s^{S(t+1)}\mid Z(t)\big]=\sum_{k=0}^{S(t)}s^{k}\binom{S(t)}{k}\hat p^{\,k}(1-\hat p)^{S(t)-k}=\big(1+(s-1)\hat p\big)^{S(t)}=\Big(1+\frac{(s-1)\lambda}{S(t)}\Big)^{S(t)}\to e^{(s-1)\lambda}
\]
as n → ∞, which is exactly the generating function of a Poisson(λ) distribution, as we saw in Example 2.5.

Now that we know that S(t + 1)|Z(t) is approximately Poisson distributed for large enough n, provided that S(t) increases accordingly, one might think this would make calculations involving S(t + 1) easier and consequently lead to insights into the Reed-Frost epidemic model. However, since each transition probability depends on the former state, the previous model fails to provide insight into the long-term infection rate. We therefore introduce a new model based on random graphs, which we will argue is equivalent to the Reed-Frost model.

3.2. The Erdős-Rényi model

The Erdős-Rényi (E-R) random graph G(n, p) is a graph on n nodes in which each pair of nodes is connected with probability p, independently of the other edges. Our model then simulates an epidemic in the following way. A finite number of nodes, which we will refer to as I(0), is infected at random at t = 0. During the next unit of time, each infected node infects its direct neighbours, after which it is assigned as resistant. This process continues until all nodes in the connected components of the initially infected nodes are infected.

Now, if S(t) and I(t) represent the number of susceptibles and infectives at time t respectively, it can be argued that the conditional transition probability P[Z(t + 1) = z̃ : Z(t) = z] in our new model equals that of the Reed-Frost model, since it is exactly (i) · (ii) · (iii) as in 3.1; see Example 3.2. We conclude that the two models are equivalent, given that both models start with the same number of infected people I(0).

Example 3.2. We now consider two states z, z̃ of a graph in which the nodes are assigned as susceptible, infectious or resistant, see Figure 3.1. Suppose it is possible to reach state z̃ from state z in one timestep, i.e. all susceptible nodes in z are assigned as either susceptible or infected in state z̃. The probability that all nodes were connected accordingly at t = 0 in our model equals (i) · (ii) · (iii) as in 3.1, since

(i) there are binom(S(z), S(z̃)) ways of choosing which S(z̃) of the S(z) susceptible nodes in z remain susceptible in z̃;

(ii) the probability that the susceptible nodes in z̃ were not connected to the infectious ones in z equals ((1 − p)^{I(z)})^{S(z̃)};

(iii) the probability that the infectious nodes in z̃ were connected to at least one of the infectious nodes in z equals (1 − (1 − p)^{I(z)})^{I(z̃)}.

Figure 3.1.: Two states z, z̃ on a graph of 6 nodes. The green nodes are susceptible, the red ones are infectious and the yellow ones are resistant. At least one of the solid drawn connections has to be made at t = 0 to reach state z̃ from state z in one timestep.

Through our new model we can deduce some valuable results. We can obtain an upper bound for the number of people that get infected in terms of R0, the expected number of people to whom every infected person passes his disease. It is clear that R0 equals the expected number of neighbours pn of an infected person. The theorem below states that in the subcritical regime, when R0 < 1, all connected components are sufficiently small so that there is at most a small outbreak. In the supercritical regime a so-called giant component appears, which may lead to a large outbreak.

We denote by C1 and C2 the largest and second-largest connected component respectively in G(n, p). Let |Ci| denote the number of nodes in Ci.

Theorem 3.3. In an E-R graph G(n, p) with R0 = pn, the following regimes occur.

(i) Subcritical regime: If R0 < 1 then there exists a constant c depending on R0, so that
\[
\lim_{n\to\infty}P\big[|C_1|\le c\log(n)\big]=1.
\]
(ii) Critical regime: If R0 = 1 then for all α > 2/3 and all δ > 0,
\[
\lim_{n\to\infty}P\Big[\frac{|C_1|}{n^{\alpha}}\ge\delta\Big]=0.
\]
Furthermore, if p^ext_{R0} is the extinction probability of a Galton-Watson branching process with N ∼ Poisson(R0), i.e. p^ext_{R0} is the minimal non-negative solution to 2.3, then the following holds.

(iii) Supercritical regime: If R0 > 1 then for some constant c depending on R0 and all δ > 0,
\[
\lim_{n\to\infty}P\Big[\Big|\frac{|C_1|}{n}-(1-p^{ext}_{R_0})\Big|\le\delta \text{ and } |C_2|\le c\log(n)\Big]=1.
\]

The proof of this theorem is beyond the scope of this chapter; we refer the interested reader to [3]. We used NetLogo (version 5.10) to simulate an E-R graph for various values of R0. For 100 nodes, we connected each pair with probability p = R0/100. The results agreed with Theorem 3.3. Indeed, the largest connected components in the subcritical and critical regimes were relatively small, while in the supercritical case one can easily see the giant component emerge. See Figure 3.2.

Figure 3.2.: Simulation of an E-R random graph on 100 nodes. The red nodes are part of the largest component C1; only the connections of C1 are shown. (a) R0 = 0.5, |C1| = 6; (b) R0 = 1, |C1| = 13; (c) R0 = 2, |C1| = 82.
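The regimes of Theorem 3.3 can also be checked directly in Matlab by generating G(n, p) and measuring |C1|. The sketch below (added for illustration, not the thesis code) builds the adjacency matrix with binornd, as in Appendix C.2, and finds the largest component with a simple breadth-first search; n = 1000 and R0 = 2 are example values.

n = 1000; R0 = 2; p = R0/n;                     % example values
A = triu(binornd(1, p, n, n), 1); A = A + A';   % symmetric adjacency matrix of G(n,p)
visited = false(1, n); sizes = [];
for v = 1:n
    if ~visited(v)                              % explore the component containing v
        queue = v; visited(v) = true; c = 0;
        while ~isempty(queue)
            u = queue(1); queue(1) = []; c = c + 1;
            nb = find(A(u,:) & ~visited);       % unvisited neighbours of u
            visited(nb) = true;
            queue = [queue nb];                 % enqueue them
        end
        sizes(end+1) = c;                       % record the component size
    end
end
max(sizes)                                      % |C1|; compare with Theorem 3.3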

The previous theorem leads to some interesting results regarding the number of people that get infected when we simulate the spread of an epidemic. We will discuss this for each case:

(i) For R0 < 1 the largest component is, with high probability for large n, at most of logarithmic size with respect to the total population, that is |C1| ≤ c log(n) for some c > 0. This implies that every component is at most of logarithmic size. When a finite number I(0) of the nodes is infected, at most I(0) · c · log(n) people eventually become infected. Hence, the outbreak is restricted to a logarithmically sized part of the population. Seeing as
\[
\frac{I(0)\,c\log(n)}{n}\to 0 \quad\text{as } n\to\infty,
\]
this is a negligible proportion of the total population. Therefore this will result in a small outbreak.

(ii) For R0 = 1 the size of the largest component is at most of order n^{2/3} with high probability. Consequently, with high probability at most I(0) · c · n^{2/3} people are infected for some c > 0, and there is only a small outbreak of the disease, as
\[
\frac{I(0)\,c\,n^{2/3}}{n}\to 0 \quad\text{as } n\to\infty.
\]

(iii) For R0 > 1, the size of the largest component in proportion to n converges in probability to 1 − p^ext_{R0} as n → ∞. That is, in large populations a proportion of 1 − p^ext_{R0} of the population is interconnected, i.e.
\[
\frac{|C_1|}{n}\to 1-p^{ext}_{R_0} \quad\text{as } n\to\infty.
\]
We call C1 the giant component. All other components, including the second-largest component, are at most of logarithmic size with high probability. If a person in one of these logarithmically sized components is infected, the disease will be limited to a negligible proportion of the total population as n → ∞, as we saw in (i). If a person in the giant component is infected, this will lead to a large outbreak of the disease in which at least a proportion of 1 − p^ext_{R0} of the population is infected.

3.3. Simulation.

We ran a simulation to find out more about the number of sick people in a population of n people and to verify the previously stated theoretical bounds. In this section, we are interested in the mean number of infected people for R0 = 0.5, 1 and 2 as a function of n.

First, we used Matlab (version R2014b) to simulate the spread of an epidemic based on the Reed-Frost epidemic model. For every R0, we chose a certain n and p = R0/n accordingly. Also, we chose I(0) = 10. For our simulation, we allowed the infected people to infect the susceptibles in each iteration, after which they were labeled as resistant. Every new number of susceptibles was selected from all susceptibles using the rand function and the transition probabilities stated in 3.1. See Appendix C.1 for the script. Although this procedure ran flawlessly for small n, the program failed to accurately produce all transition probabilities for n > 68. One might explain this by looking back at the transition probabilities in 3.1 and noticing that for S(z) > 68 and some values of S(z̃),
\[
\binom{S(z)}{S(\tilde z)}>10^{19}.
\]
A double is stored in 64 bits, of which 53 form the significand; hence integers of this size cannot be represented exactly as a double. It is therefore favorable to simulate the E-R graph, which is equivalent to the Reed-Frost model but does not require the same calculations.

Again, we used Matlab to generate an E-R graph. By using the binornd function, we generated a symmetric n × n adjacency matrix in which each pair of nodes is connected with probability p. Next, 10 nodes were assigned as 'infected' and during the next iteration we allowed them to infect their direct neighbours. During the iterations, the infected nodes infected their direct neighbours until there were no infected nodes left. See Appendix C.2 for the script. For every value of R0, we repeated the previous script for increasing values of n with p = R0/n until the script exceeded a CPU time of 300 s. For every n, which we chose as a power of two starting from n = 2^4 = 16, we ran the script 100 times and calculated the mean number of infected people. See Appendix C.3 for this script.
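A compact version of the procedure just described (a sketch along the same lines; the actual scripts are in Appendices C.2 and C.3) spreads the infection one generation per iteration using the adjacency matrix and averages the final number of infected people over repeated runs; n, R0 and the number of runs below are example values.

n = 512; R0 = 2; p = R0/n; runs = 50;
total = zeros(1, runs);
for r = 1:runs
    A = triu(binornd(1, p, n, n), 1); A = A + A';       % E-R adjacency matrix
    status = zeros(1, n);                               % 0 susceptible, 1 infectious, 2 resistant
    status(randperm(n, 10)) = 1;                        % infect I(0) = 10 random nodes
    while any(status == 1)
        newly = (status == 0) & any(A(status == 1, :), 1);  % susceptible neighbours of infectives
        status(status == 1) = 2;                        % infectives become resistant
        status(newly) = 1;                              % their neighbours become infectious
    end
    total(r) = sum(status == 2);                        % number of people ever infected
end
mean(total)                                             % mean number of infected people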

Results. In the subcritical regime, the mean number of infected people was indeed bounded by c · log(n) for c = 4. When fitted to a logarithmic function, the data was best fitted by Y = 1.64 · log(n) + 8.71. However, as can be seen from Figure 3.3, the data disagrees with this fit for some n. This could be explained by variance, or the data may be better fitted by another, non-logarithmic function.

In the critical regime, the mean number of infected people was clearly bounded by, and well fitted to, a function of the form c · n^{2/3}, see Figure 3.3. The data appeared to fit the function Y = 1.10 · n^{2/3} + 11.1 very well, confirming the theoretical upper bound from Theorem 3.3 for c = 1.10.

Figure 3.3.: The mean number of infected nodes in an E-R graph for I(0) = 10. The blue markers represent the mean number of infected people, and the dotted lines represent the upper bounds stated in Theorem 3.3 for various values of c. Top row: R0 = 0.5, total cpu = 379 s. Bottom row: R0 = 1, total cpu = 399 s.

In the supercritical regime, the mean proportion of infected people appeared to be converging towards 1 − pext for increasing values of n, suggesting the appearance of a giant component; see Figure 3.4. The difference δ between the data and 1 − pext is positive for every n, which tells us that, in proportion to n, even more people are infected than 1 − pext. The data does appear to be converging to 1 − pext, as we can see from the rapidly decreasing value of δ.


Figure 3.4.: The mean proportion of infected nodes in an E-R graph for R0 = 2 and I(0) = 10. The blue markers represent the mean proportion of infected people, and the dotted line is the theoretical proportion of people in the giant component. The difference δ between the data and 1 − pext is shown on the right. Total cpu = 379 s.

The size of I(0). The probability of a large outbreak of an infectious disease in the supercritical regime depends on the size of I(0). If we take I(0) large enough, there is a substantial probability of infecting a node in C1 and causing a large outbreak. In contrast, if we take I(0) small, there might not be a large outbreak. The probability of a large outbreak depends on I(0) as follows.

Suppose we assign each of the initial infections randomly to a node. The probability of assigning an infection to a node that is not in the giant component equals p^ext_{R0}, since the giant component contains a proportion 1 − p^ext_{R0} of the n nodes. Therefore, with probability (p^ext_{R0})^{I(0)}, none of the initial infections is assigned to a node in the giant component, and with probability 1 − (p^ext_{R0})^{I(0)} at least one of the I(0) initially infected nodes lies in the giant component. Consequently, with probability 1 − (p^ext_{R0})^{I(0)}, there is a large outbreak in which at least a proportion 1 − p^ext_{R0} of the total population is infected.

Example 3.4. Suppose R0 = 2, i.e. the probability of infection is p = 2/n. Then pext = 0.2032 (see Example 2.5). Suppose I(0) = 1. For a population of size n, a proportion of 1 − pext = 0.7968 of the total population is infected with probability 1 − pext = 0.7968 as n → ∞. For I(0) = 2 at least the same proportion of the population is infected with probability 1 − (pext)^2 = 0.9588.

Earlier, we assumed that I(0) was finite. Yet, there is a way of writing I(0) as a function of n such that the previous results still hold. For instance, if we take I(0) = n^α · (log(n))^{-1} for α < 1 in the subcritical case, at most I(0) · log(n) = n^α people are infected. Since
\[
\lim_{n\to\infty}\frac{n^{\alpha}}{n}=\lim_{n\to\infty}n^{\alpha-1}=0,
\]
this is still a negligible proportion of the total population, so this results in just a small outbreak. The same goes for I(0) = n^{α−2/3} with α < 1 in the critical regime, since again at most I(0) · n^{2/3} = n^α people are infected and this proportion again equals n^{α−1} → 0 as n → ∞.


The SIS epidemic process. We now consider the spread of a disease in which each infected person recovers and becomes susceptible again, as opposed to becoming resistant as in the SIR model.

S → I → S

An example of an SIS epidemic is the well-known 'common' flu. We will now deduce some results based on previous observations. For I(z̃) = k the reader can check that the transition probabilities equal
\[
P[Z(t+1)=\tilde z : Z(t)=z]=\binom{S(z)}{k}\Big((1-p)^{I(z)}\Big)^{S(z)-k}\Big(1-(1-p)^{I(z)}\Big)^{k}, \qquad (3.2)
\]
which implies that the conditional number of new infections I(t + 1)|Z(t) ∼ Bin(S(z), 1 − (1 − p)^{I(z)}). We now want to deduce some results, similar to Theorem 3.3, that will predict whether there will be a large or a small outbreak. Therefore we first have to find a random graph that is equivalent to the SIS epidemic process. An obvious choice is similar to the E-R graph, but with a slight alteration: each infected node infects his direct neighbours, after which he becomes susceptible again. This model indeed has the same transition probabilities as 3.2, but unfortunately does not depict the same process. In the SIS epidemic model, every time a person becomes susceptible again after infection, he becomes infected again in one timestep with probability 1 − (1 − p)^{I(z)}. The same probability in our altered E-R graph is either 1 if the person has neighbours (seeing as they are surely infected), or 0 if the person has no neighbours. To properly adjust our altered E-R graph, we would have to randomly reassign neighbours, with probability p, to every person that becomes susceptible again after infection. Gaining further insight into this model would be complicated and unnecessary, as there are easier ways of examining the SIS model. In Chapter 4 we will show that if the number of infected individuals in our SIS model is represented by a Markov chain Xn(t) on N, we can construct a differential equation whose solution approximates it in large populations.


4. Continuous time epidemic models

We now look into some continuous time epidemic models. In continuous time epidemic models, just like in real life, people can get infected at any point in time and stay infected for any length of time. We will refer to the rate at which an infectious individual infects each susceptible as β, and to the rate at which an infected person becomes immune as γ. Our main goal for this chapter is to examine how β and γ contribute to the occurrence of an epidemic and how many people get infected. In order to do so, we will have to look at large populations, seeing as continuous time epidemic models are more involved in small populations.

The first continuous time model we discuss is a deterministic model which, as opposed to all previous models, contains no stochastic elements. Deterministic models are often described by differential equations and the output is fully determined by the parameter values and the initial conditions. Second, we will discuss a stochastic model in which the number of infected people is represented by a Markov jump process. By applying Kurtz's theorem we argue that in large populations, the number of infected people approximates the solution to a differential equation. We will finalize this chapter with a proof of Kurtz's theorem as well as a simulation of the above results.

4.1. The deterministic model.

The SIR model. We denote by s, i, r the proportions of susceptible, infectious and resistant people relative to n, i.e. s = S/n, i = I/n and r = R/n, with s(t) + i(t) + r(t) = 1 for all t ≥ 0. This process can be represented as
\[
s \xrightarrow{\ \beta\ } i \xrightarrow{\ \gamma\ } r.
\]
We stated that an infectious individual infects each susceptible at rate β, where there are s susceptible and i infected people relative to n. Consequently, the rate at which all infected people infect all susceptibles equals βis. Also, since γ is the rate at which each infected person becomes immune, γi is the rate at which all infected people become immune. Therefore we obtain the following system of differential equations:
\[
\frac{ds(t)}{dt}=-\beta i(t)s(t), \qquad (4.1)
\]
\[
\frac{di(t)}{dt}=\beta i(t)s(t)-\gamma i(t), \qquad (4.2)
\]
\[
\frac{dr(t)}{dt}=\gamma i(t).
\]


By setting ds(t)/dt = di(t)/dt = 0, we conclude that an equilibrium arises only when i(t) = 0, i.e. when the disease has died out. We consider two cases which eventually lead to an equilibrium, assuming that i(0) > 0. First note that the mean duration of an infection is γ^{-1} units of time. Since an infected individual infects each susceptible at rate β, the expected proportion of susceptibles he infects during his infectious period equals R0 = β/γ. Seeing as R0 is the reproductive number for this model, it is not surprising that the two cases we consider are related to the (sub)critical and the supercritical regimes.

(i) s(0) ≤ R0^{-1}. Note that R0 ≤ 1 always satisfies this condition. Since s(t) is decreasing for t > 0, one can deduce that
\[
\frac{di}{dt}(t)\le i(t)\big(\beta s(0)-\gamma\big)
\begin{cases}
=0 & \text{if } s(0)=R_0^{-1} \text{ or } i(t)=0,\\
<0 & \text{if } s(0)<R_0^{-1}.
\end{cases}
\]
Seeing as s(0) = R0^{-1} is not an equilibrium, we conclude that the number of infected people will strictly decrease until i(t) = 0 and the disease has died out. No epidemic occurs.

(ii) s(0) > R0^{-1}. Note that this implies R0 > 1. Clearly,
\[
\frac{di}{dt}(t)=i(t)\big(\beta s(t)-\gamma\big)
\begin{cases}
>0 & \text{if } s(t)>R_0^{-1},\\
=0 & \text{if } s(t)=R_0^{-1} \text{ or } i(t)=0,\\
<0 & \text{if } s(t)<R_0^{-1}.
\end{cases}
\]
Seeing as s(0) > R0^{-1}, we deduce that the number of infected people will increase until s(t) = R0^{-1} for some t. Hence, an epidemic occurs. Since s(t) = R0^{-1} is not an equilibrium, the number of infected people then decreases until i(t) = 0 and the disease has died out. At the end of the epidemic, no more than a proportion R0^{-1} of the population is still susceptible. See Figure 4.1 for how this plays out in the subcritical and the supercritical regime.

Figure 4.1.: Phase portrait for the equations 4.1 and 4.2 in the case where R0 = 0.5
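The system 4.1, 4.2 can also be integrated numerically. The sketch below (added for illustration, not part of the thesis) uses Matlab's ode45 with the example values β = 2, γ = 1, s(0) = 0.99 and i(0) = 0.01, which corresponds to the supercritical case R0 = 2.

beta = 2; gamma = 1;                          % example rates, R0 = beta/gamma = 2
% y = [s; i; r]; right-hand side of (4.1), (4.2) and dr/dt = gamma*i
f = @(t, y) [-beta*y(1)*y(2); beta*y(1)*y(2) - gamma*y(2); gamma*y(2)];
[t, y] = ode45(f, [0 20], [0.99; 0.01; 0]);   % s(0) = 0.99, i(0) = 0.01, r(0) = 0
plot(t, y); legend('s(t)', 'i(t)', 'r(t)')    % i(t) first rises, then dies out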


The SIS-model. For our SIS-model we get a similar result, with the alteration that infected people become susceptible again, as opposed to resistant, at rate γi:
\[
\frac{ds(t)}{dt}=-\beta i(t)s(t)+\gamma i(t),
\]
\[
\frac{di(t)}{dt}=\beta i(t)s(t)-\gamma i(t).
\]
It is clear that an equilibrium occurs under either of the following conditions:

(i) i(t) = 0 and s(t) = 1, or

(ii) i(t) = 1 − R0^{-1} and s(t) = R0^{-1}.

Clearly
\[
\frac{di(t)}{dt}=i(t)\big(\beta s(t)-\gamma\big)
\begin{cases}
>0 & \text{if } s(t)>R_0^{-1},\\
=0 & \text{if } s(t)=R_0^{-1} \text{ or } i(t)=0,\\
<0 & \text{if } s(t)<R_0^{-1},
\end{cases}
\]
implies that in the case where s(0) < R0^{-1} the number of infected people decreases until either s(t) = R0^{-1} and i(t) = 1 − R0^{-1} if R0 > 1, or, if R0 ≤ 1, until the disease has died out. In the case where s(0) > R0^{-1}, and therefore R0 > 1, the number of infected people increases until i(t) = 1 − R0^{-1}. We conclude that, provided i(0) > 0 and R0 > 1, for all values of s(0) an equilibrium arises between the proportions of susceptible and infected people, with s(t) = R0^{-1} and i(t) = 1 − R0^{-1}.

4.2. Kurtz’s Theorem.

Next, we will discuss a continuous time SIS model based on a continuous time stochastic process. We will first introduce the concept of a continuous time Markov chain.

Let (X(t))_{t≥0} be a right-continuous process with values in a set N, which is called the state space. Let Λ = (λ_{x,y})_{x,y∈N} be a Q-matrix with jump matrix Π.

Definition 4.1. We call (X(t))_{t≥0} a Markov chain on N with jump rates (λ_{x,y})_{x,y∈N} ∈ Λ if, for all t, h ≥ 0, conditional on X(t) = x, X(t + h) is independent of (X(s))_{s≤t} and, as h ↓ 0, uniformly in t for all y ∈ N,
\[
P[X(t+h)=y\mid X(t)=x]=\delta_{x,y}+\lambda_{x,y}h+o(h),
\]
where δ_{x,y} is the Kronecker delta and o(h) is the little-o of h.

Remark. Recall that f(h) ∈ o(h) if and only if
\[
\lim_{h\to 0}\frac{f(h)}{h}=0.
\]


This implies that for any Markov chain (X(t))_{t≥0} on N and for any states x, y ∈ N, the process jumps from x to y with a certain probability upon expiration of an exponentially distributed timer, whose parameter depends solely on the pair (x, y).

In the remaining part of this chapter, we consider a family of Markov chains X_n, n ∈ N, on state space N. We will assume that for all n, X_n has only a finite number k of jump directions (e_i)_{i=1,...,k} ∈ Z. For every n, we define the functions λ_i : R → R_+ such that the transition rate of jumping from x to x + e_i equals nλ_i(x/n) for any state x ∈ N of X_n. In this case, we call X_n a birth and death process and the following holds for all x ∈ N:
\[
P[X_n(t+h)=x+e_i\mid X_n(t)=x]=nh\lambda_i(x/n)+o(h),
\]
\[
P[X_n(t+h)=x\mid X_n(t)=x]=1-nh\sum_{i=1}^{k}\lambda_i(x/n)+o(h).
\]

Example 4.2. Suppose X_n(t) is a Markov chain on N that represents the number of infectives at time t in a population of n individuals. Then the jump directions are restricted to belong to {−1, 0, 1}. As we saw in the former paragraph, for jump directions e_+ = +1 and e_− = −1 the corresponding functions λ_± equal λ_−(I/n) = γ I/n and λ_+(I/n) = β (I/n)(1 − I/n). Therefore, for every state I ∈ N, the rate of jumping to I − 1 equals nλ_−(I/n) = γI and the rate of jumping to I + 1 equals nλ_+(I/n) = (β/n) I(n − I). We obtain the following equations:
\[
P[X_n(t+h)=I+1\mid X_n(t)=I]=\frac{\beta}{n}I(n-I)h+o(h), \qquad (4.3)
\]
\[
P[X_n(t+h)=I-1\mid X_n(t)=I]=\gamma I h+o(h), \qquad (4.4)
\]
\[
P[X_n(t+h)=I\mid X_n(t)=I]=1-\Big(\frac{\beta}{n}I(n-I)+\gamma I\Big)h+o(h). \qquad (4.5)
\]
We will now show that when n goes to infinity, the trajectories t ↦ (1/n)X_n(t) converge in probability to the solution x(t) of an ordinary differential equation. In particular, this convergence is uniform and almost sure on [0, T] for any T > 0.

Theorem 4.3 (Kurtz's Theorem). Let ē := max{|e_i| : i = 1, . . . , k} for some arbitrary norm | · |. Assume that
\[
\bar\lambda=\max_{i=1,\dots,k}\ \sup_{x\in\mathbb{R}}\lambda_i(x)
\]
is finite and that the function F : R → R defined by
\[
F(x)=\sum_{i=1}^{k}e_i\,\lambda_i(x)
\]
is Lipschitz-continuous, i.e. there exists a constant M such that |F(x) − F(y)| ≤ M|x − y| for all x, y ∈ R. Assume that lim_{n→∞} (1/n)X_n(0) = x(0) almost surely, and let x : R_+ → R be the solution to the integral equation
\[
x(t)=x(0)+\int_0^t F(x(s))\,ds. \qquad (4.6)
\]
Then for any fixed ε, T > 0 and sufficiently large n,
\[
P\Big[\sup_{0\le t\le T}\Big|\tfrac{1}{n}X_n(t)-x(t)\Big|\ge\varepsilon\Big]\le 2k\exp\Big(-nT\bar\lambda\,h\Big(\frac{\varepsilon e^{-MT}}{2k\bar e T\bar\lambda}\Big)\Big), \qquad (4.7)
\]
where h(t) = (1 + t) log(1 + t) − t. Moreover,
\[
\lim_{n\to\infty}\ \sup_{0\le t\le T}\Big|\tfrac{1}{n}X_n(t)-x(t)\Big|=0 \quad\text{a.s.} \qquad (4.8)
\]

lim n→∞0≤t≤Tsup 1 nXn(t) − x(t) = 0 a.s. (4.8) This theorem provides a law of large numbers for a birth and death process. Equation 4.7 tells us that the right hand side of the equation converges to zero for n → ∞, we deduce that n1Xn(t) converges in probability to x(t) for all fixed t. Moreover, this

convergence is a.s. and uniform for all t ∈ [0, T ] for any T > 0, i.e. on all compact subsets of R≥0. Hence, Kurtz’s Theorem provides us with a limit of n1Xn as n → ∞, as well as a

characterisation of the limiting process. Before we prove Kurtz’s Theorem, we will first apply Kurtz’s Theorem on our SIS-model.

Solution for the SIS-model. For our SIS-model this implies the following. Equations 4.3, 4.4 and 4.5 imply that F equals
\[
F(x)=\lambda_+(x)-\lambda_-(x)=\beta x(1-x)-\gamma x.
\]
We will now provide a solution to dx/dt = F(x) for γ = 1. For β ≠ 1 and some y ∈ R, the solution to
\[
\frac{dx}{dt}=F(x)=\beta x(1-x)-x,\qquad x(0)=y,
\]
equals
\[
x(t)=\frac{(\beta-1)y}{(\beta-1-\beta y)e^{(1-\beta)t}+\beta y}. \qquad (4.9)
\]
It is clear that if β > 1 then
\[
\lim_{t\to\infty}x(t)=\frac{(\beta-1)y}{0+\beta y}=\frac{\beta-1}{\beta},
\]
and if β < 1, then lim_{t→∞} x(t) = 0. Since Kurtz's Theorem tells us that x(t) is the probability limit of (1/n)X_n(t) as n → ∞, i.e. the proportion of infected people in a population of infinite size, we deduce that an epidemic occurs almost surely if β > 1, i.e. if R0 > 1. In the subcritical regime, when R0 < 1, no epidemic occurs. See Figure 4.2 for the solution in both regimes.


Figure 4.2.: Solution to 4.6 for our SIS-model in the case where R0 = 0.5 (left) and R0 = 2 (right).
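For completeness, equation 4.9 can be verified by recognizing dx/dt = βx(1 − x) − x as a logistic equation (a short derivation added here for illustration):
\[
\frac{dx}{dt}=\beta x(1-x)-x=(\beta-1)\,x\Big(1-\frac{x}{K}\Big),\qquad K:=\frac{\beta-1}{\beta},
\]
which is a logistic equation with growth rate r = β − 1 and carrying capacity K. Its standard solution with x(0) = y is
\[
x(t)=\frac{Ky}{y+(K-y)e^{-rt}}=\frac{(\beta-1)y}{\beta y+(\beta-1-\beta y)e^{(1-\beta)t}},
\]
which is exactly 4.9; letting t → ∞ gives x(t) → K = (β − 1)/β when β > 1.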

Proof of Kurtz's Theorem. Let (N_i)_{i=1,...,k} be a family of independent unit-rate Poisson processes. That is, for each t > 0, N_i(t) is Poisson distributed with parameter t for all i = 1, . . . , k. By induction it can be shown that we can write
\[
X_n(t)=X_n(0)+\sum_{i=1}^{k}e_i\,N_i\Big(\int_0^t n\lambda_i(X_n(s)/n)\,ds\Big).
\]
The proof of this statement is beyond the scope of this thesis. We now introduce the notation Y_n(t) = (1/n)X_n(t); then it follows that
\[
\begin{aligned}
Y_n(t)&=Y_n(0)+\sum_{i=1}^{k}\frac{e_i}{n}N_i\Big(\int_0^t n\lambda_i(X_n(s)/n)\,ds\Big)\\
&=Y_n(0)+\sum_{i=1}^{k}\frac{e_i}{n}\bar N_i\Big(\int_0^t n\lambda_i(X_n(s)/n)\,ds\Big)+\sum_{i=1}^{k}\frac{e_i}{n}\int_0^t n\lambda_i(X_n(s)/n)\,ds\\
&\overset{(i)}{=}Y_n(0)+\sum_{i=1}^{k}\frac{e_i}{n}\bar N_i\Big(\int_0^t n\lambda_i(X_n(s)/n)\,ds\Big)+\int_0^t\Big(\sum_{i=1}^{k}e_i\lambda_i(X_n(s)/n)\Big)ds\\
&=Y_n(0)+\sum_{i=1}^{k}\frac{e_i}{n}\bar N_i\Big(\int_0^t n\lambda_i(X_n(s)/n)\,ds\Big)+\int_0^t F(Y_n(s))\,ds,
\end{aligned} \qquad (4.10)
\]
where N̄_i(t) = N_i(t) − t is the centered Poisson process. For (i) we used the linearity of the integral. Clearly we are looking for an estimate of |Y_n(t) − x(t)|. Combining equations 4.6 and 4.10 gives us
\[
|Y_n(t)-x(t)|\le|Y_n(0)-x(0)|+\int_0^t|F(Y_n(s))-F(x(s))|\,ds+\sum_{i=1}^{k}\frac{|e_i|}{n}\Big|\bar N_i\Big(n\int_0^t\lambda_i(X_n(s)/n)\,ds\Big)\Big|.
\]


Now we can use the fact that lim_{n→∞} Y_n(0) = x(0) a.s. and that F is M-Lipschitz to obtain the following inequality for large n:
\[
|Y_n(t)-x(t)|\le\varepsilon+M\int_0^t|Y_n(s)-x(s)|\,ds+\sum_{i=1}^{k}\frac{1}{n}\varepsilon_{n,i}(t), \qquad (4.11)
\]
where ε_{n,i}(t) = |e_i| |N̄_i(n ∫_0^t λ_i(X_n(s)/n) ds)|.

We are going to prove that the right-hand side of this inequality is small for large values of n. First, we will show that Σ_{i=1}^k (1/n)ε_{n,i}(t) is indeed small for large n. To obtain this result we need Lemma 4.5 from Section 4.3, which provides an upper bound for Σ_{i=1}^k (1/n)ε_{n,i}(t). Seeing as the ε_{n,i}(t) are non-negative for all t and i = 1, . . . , k, we know that
\[
\Big\{\sup_{0\le t\le T}\sum_{i=1}^{k}\frac{1}{n}\varepsilon_{n,i}(t)\ge\varepsilon\Big\}\subseteq\Big\{\exists i:\sup_{0\le t\le T}\frac{1}{n}\varepsilon_{n,i}(t)\ge\frac{\varepsilon}{k}\Big\}.
\]
Hence we deduce that
\[
P\Big[\sup_{0\le t\le T}\sum_{i=1}^{k}\frac{1}{n}\varepsilon_{n,i}(t)\ge\varepsilon\Big]\le P\Big[\exists i:\sup_{0\le t\le T}\frac{1}{n}\varepsilon_{n,i}(t)\ge\frac{\varepsilon}{k}\Big]=P\Big[\bigcup_{i=1}^{k}\Big\{\sup_{0\le t\le T}\frac{1}{n}\varepsilon_{n,i}(t)\ge\frac{\varepsilon}{k}\Big\}\Big]\le\sum_{i=1}^{k}P\Big[\sup_{0\le t\le T}\varepsilon_{n,i}(t)\ge\frac{n\varepsilon}{k}\Big]. \qquad (4.12)
\]
Note that ∫_0^t λ_i(X_n(s)/n) ds ≤ λ̄t for all t ≥ 0, so as t ranges over [0, T] the argument n∫_0^t λ_i(X_n(s)/n) ds stays within [0, nλ̄T]. We therefore deduce the following:
\[
\sum_{i=1}^{k}P\Big[\sup_{0\le t\le T}\varepsilon_{n,i}(t)\ge\frac{n\varepsilon}{k}\Big]\le\sum_{i=1}^{k}P\Big[\sup_{0\le t\le T}\bar e\,\Big|\bar N_i\Big(n\int_0^t\lambda_i(X_n(s)/n)\,ds\Big)\Big|\ge\frac{n\varepsilon}{k}\Big]\le\sum_{i=1}^{k}P\Big[\sup_{0\le t\le nT\bar\lambda}|\bar N_i(t)|\ge\frac{n\varepsilon}{k\bar e}\Big], \qquad (4.13)
\]
where ē := max{|e_i| : i = 1, . . . , k}. Since N̄_i(t) = N_i(t) − t and N_i(t) is a unit-rate Poisson process, we can apply Lemma 4.5 in estimate (ii) below to obtain an upper bound. Let Z_n(t) = |Y_n(t) − x(t)|. Combining equation 4.11 with 4.12 and 4.13 gives us, for large n,
\[
P\Big[\sup_{0\le t\le T}\Big(Z_n(t)-M\int_0^t Z_n(s)\,ds\Big)\ge 2\varepsilon\Big]\overset{(4.11)}{\le}P\Big[\sup_{0\le t\le T}\sum_{i=1}^{k}\frac{1}{n}\varepsilon_{n,i}(t)\ge\varepsilon\Big]\le\sum_{i=1}^{k}P\Big[\sup_{0\le t\le nT\bar\lambda}|\bar N_i(t)|\ge\frac{n\varepsilon}{k\bar e}\Big]\overset{(ii)}{\le}2k\exp\Big(-(nT\bar\lambda)\,h\Big(\frac{n\varepsilon}{k\bar e\,(nT\bar\lambda)}\Big)\Big)=2k\exp\Big(-nT\bar\lambda\,h\Big(\frac{\varepsilon}{k\bar e T\bar\lambda}\Big)\Big), \qquad (4.14)
\]
where h(t) = (1 + t) log(1 + t) − t.

To finalize our proof, we want to show that Z_n(t) is small for all t ∈ [0, T] and large n. In particular, we want to show that
\[
P\Big[\sup_{0\le t\le T}Z_n(t)\ge\varepsilon\Big]
\]
is small for large n. To obtain this result, we apply Grönwall's lemma¹ to Z_n(t). To see that Z_n(t) is bounded on [0, T], note that x(t) is a continuous function and therefore bounded on closed intervals, and that Y_n(t) is a jump process with almost surely finitely many jumps on [0, T], so it is also bounded on [0, T]. Therefore Z_n(t) = |Y_n(t) − x(t)| is bounded on [0, T]. By Grönwall's lemma,
\[
Z_n(t)\le 2\varepsilon+M\int_0^t Z_n(s)\,ds \quad\text{for all } t\in[0,T]
\]
implies
\[
Z_n(t)\le 2\varepsilon e^{Mt} \quad\text{for all } t\in[0,T].
\]
Clearly this implies that
\[
\Big\{\sup_{0\le t\le T}\Big(Z_n(t)-M\int_0^t Z_n(s)\,ds\Big)\le 2\varepsilon\Big\}\subseteq\Big\{\sup_{0\le t\le T}Z_n(t)\le 2\varepsilon e^{MT}\Big\}.
\]
Hence, we deduce that
\[
\Big\{\sup_{0\le t\le T}\Big(Z_n(t)-M\int_0^t Z_n(s)\,ds\Big)\ge 2\varepsilon\Big\}\supseteq\Big\{\sup_{0\le t\le T}Z_n(t)\ge 2\varepsilon e^{MT}\Big\}.
\]
We finalize the proof of equation 4.7 by
\[
P\Big[\sup_{0\le t\le T}Z_n(t)\ge 2\varepsilon e^{MT}\Big]\le P\Big[\sup_{0\le t\le T}\Big(Z_n(t)-M\int_0^t Z_n(s)\,ds\Big)\ge 2\varepsilon\Big]\overset{(4.14)}{\le}2k\exp\Big(-nT\bar\lambda\,h\Big(\frac{\varepsilon}{k\bar e T\bar\lambda}\Big)\Big).
\]

¹see Appendix B.1


Or alternatively, if we substitute ε by εe^{−MT}/2,
\[
P\Big[\sup_{0\le t\le T}Z_n(t)\ge\varepsilon\Big]\le 2k\exp\Big(-nT\bar\lambda\,h\Big(\frac{\varepsilon e^{-MT}}{2k\bar e T\bar\lambda}\Big)\Big),
\]
which proves equation 4.7. To deduce that 4.15 holds, note that for any ε > 0,
\[
\sum_{n\in\mathbb{N}}P\Big[\sup_{0\le t\le T}Z_n(t)\ge\varepsilon\Big]\le 2k\sum_{n\in\mathbb{N}}e^{-nc(\varepsilon)}\le 2k\,\frac{1}{1-e^{-c(\varepsilon)}}<\infty, \quad\text{where}\quad c(\varepsilon)=T\bar\lambda\,h\Big(\frac{\varepsilon e^{-MT}}{2k\bar e T\bar\lambda}\Big)>0.
\]
We apply the Borel-Cantelli Lemma² to deduce that for any ε > 0,
\[
P\Big[\limsup_{n\to\infty}\Big\{\sup_{0\le t\le T}Z_n(t)\ge\varepsilon\Big\}\Big]=0,
\]
which tells us that sup_{0≤t≤T} Z_n(t) converges to 0 almost surely. Hence
\[
\lim_{n\to\infty}\ \sup_{0\le t\le T}\Big|\tfrac{1}{n}X_n(t)-x(t)\Big|=0 \quad\text{a.s.} \qquad (4.15)
\]
must be true.

²see Appendix B.2

4.3. Proof of Lemma 4.5

The next proof requires some knowledge of martingales.

Definition 4.4. A real-valued process {X(t)}_{t≥0} which satisfies
\[
E\big[X(t)\mid\{X(u)\}_{u\le s}\big]=X(s) \quad\text{for all } 0\le s\le t
\]
is called a martingale.

Recall that we still need to prove the following statement.

Lemma 4.5. Let N be a unit-rate Poisson process. Then for any ε > 0 and T > 0,
\[
P\Big[\sup_{0\le t\le T}|N(t)-t|\ge\varepsilon\Big]\le 2e^{-Th(\varepsilon/T)},
\]
where h(t) = (1 + t) log(1 + t) − t.


Proof. Let θ > 0 be fixed. Seeing as the exponential is an increasing function, we have
\[
P\Big[\sup_{0\le t\le T}|N(t)-t|\ge\varepsilon\Big]\overset{(i)}{\le}\underbrace{P\Big[\sup_{0\le t\le T}(N(t)-t)\ge\varepsilon\Big]}_{(a)}+\underbrace{P\Big[\sup_{0\le t\le T}(t-N(t))\ge\varepsilon\Big]}_{(b)}=P\Big[\sup_{0\le t\le T}e^{\theta(N(t)-t)}\ge e^{\theta\varepsilon}\Big]+P\Big[\sup_{0\le t\le T}e^{\theta(t-N(t))}\ge e^{\theta\varepsilon}\Big],
\]
where we used Boole's inequality for (i). Next, we are going to find an upper bound for (a) and (b).

Example 4.6. {N(t) − t}_{t≥0} and {t − N(t)}_{t≥0} are martingales. The proof of this statement is beyond the scope of this chapter and is left to the reader.

Example 4.7. {e^{θ(N(t)−t)}}_{t≥0} and {e^{θ(t−N(t))}}_{t≥0} are non-negative submartingales, by Example 4.6 and Jensen's inequality³. The proof of this statement is again beyond the scope of this chapter and is left to the reader.

By applying Doob's inequality⁴ we obtain
\[
P\Big[\sup_{0\le t\le T}e^{\theta(N(t)-t)}\ge e^{\theta\varepsilon}\Big]\le e^{-\theta\varepsilon}E\big[e^{\theta(N(T)-T)}\big] \quad\text{and}\quad P\Big[\sup_{0\le t\le T}e^{\theta(t-N(t))}\ge e^{\theta\varepsilon}\Big]\le e^{-\theta\varepsilon}E\big[e^{-\theta(N(T)-T)}\big].
\]
Seeing as N(T) ∼ Poisson(T), we know by Example 2.5 that E[e^{θN(T)}] = φ(e^θ) = exp(T(e^θ − 1)). Hence we deduce
\[
P\Big[\sup_{0\le t\le T}e^{\theta(N(t)-t)}\ge e^{\theta\varepsilon}\Big]\le e^{-\theta(\varepsilon+T)}E\big[e^{\theta N(T)}\big]=e^{-\theta(\varepsilon+T)}\exp(T(e^{\theta}-1))=\exp\big(-\theta(\varepsilon+T)+T(e^{\theta}-1)\big).
\]
We now want to minimize the exponent −θ(ε + T) + T(e^θ − 1) over θ. We obtain this minimum by setting
\[
0=\frac{d}{d\theta}\exp\big(-\theta(\varepsilon+T)+T(e^{\theta}-1)\big)=\big(-(\varepsilon+T)+Te^{\theta}\big)\exp\big(-\theta(\varepsilon+T)+T(e^{\theta}-1)\big)\ \implies\ -(\varepsilon+T)+Te^{\theta}=0.
\]

³see Appendix B.3   ⁴see Appendix B.4


Thus, θ = log(1 + ε/T). Combining the above yields the following upper bound for (a):
\[
P\Big[\sup_{0\le t\le T}(N(t)-t)\ge\varepsilon\Big]\le\exp\big(-\log(1+\varepsilon/T)(\varepsilon+T)+\varepsilon\big)=\exp\big(-T\big((1+\varepsilon/T)\log(1+\varepsilon/T)-\varepsilon/T\big)\big)=\exp(-Th(\varepsilon/T)),
\]
where h(t) = (1 + t) log(1 + t) − t. We find an upper bound for (b) in a similar fashion:
\[
P\Big[\sup_{0\le t\le T}e^{\theta(t-N(t))}\ge e^{\theta\varepsilon}\Big]\le\exp\big(\theta(T-\varepsilon)+T(e^{-\theta}-1)\big).
\]
Since the right-hand side attains its minimum at θ = −log(1 − ε/T),
\[
P\Big[\sup_{0\le t\le T}(t-N(t))\ge\varepsilon\Big]\le\exp\big(-\log(1-\varepsilon/T)(T-\varepsilon)+T(e^{\log(1-\varepsilon/T)}-1)\big)=\exp\big(-T\big((1-\varepsilon/T)\log(1-\varepsilon/T)+\varepsilon/T\big)\big)=\exp(-Th(-\varepsilon/T)).
\]
Clearly h(−t) ≥ h(t) for all t ∈ [0, 1]. Therefore, we conclude that
\[
P\Big[\sup_{0\le t\le T}|N(t)-t|\ge\varepsilon\Big]\le e^{-Th(\varepsilon/T)}+e^{-Th(-\varepsilon/T)}\le 2e^{-Th(\varepsilon/T)}
\]
holds.

4.4. Simulation.

We ran a simulation in Matlab to confirm that the Markov chain X_n(t) described by equations 4.3, 4.4 and 4.5 indeed converges in probability to x(t) in 4.9 for large n. Again, we ran the program for R0 = 0.5 and R0 = 2 with γ = 1, i.e. β = R0γ = R0.

The jump matrix and mean sojourn times of X_n(t) were constructed according to equations 4.3, 4.4 and 4.5. We chose the number of infected people at t = 0 as ten percent of the population, i.e. I(0) = n/10, and constructed the initial distribution of X_n(t) accordingly. Jump times were generated according to the sojourn times using the rand function. Jump directions were generated using the rand function to choose a direction according to the jump probabilities. See Appendix C.4 and C.5 for the scripts. For every value of R0, we repeated the previous script for increasing values of n.
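A compact way to generate such a trajectory (a sketch of the same idea; the actual scripts are in Appendix C.4 and C.5) is to draw an exponential sojourn time with the total jump rate and then pick the jump direction with the corresponding probabilities, using the rates in equations 4.3 and 4.4. The values of n, R0 and T below are example choices.

n = 2^10; R0 = 2; beta = R0; gamma = 1;     % rates as in the simulation above
I = round(n/10); t = 0; T = 10;             % I(0) = n/10, simulate on [0, T]
ts = t; Is = I;                             % recorded trajectory
while t < T && I > 0
    up = beta/n*I*(n - I);                  % rate of jumping from I to I+1
    down = gamma*I;                         % rate of jumping from I to I-1
    t = t - log(rand)/(up + down);          % exponential sojourn time
    if rand < up/(up + down)
        I = I + 1;                          % a susceptible becomes infected
    else
        I = I - 1;                          % an infected person recovers
    end
    ts(end+1) = t; Is(end+1) = I;           % record the jump
end
plot(ts, Is/n)                              % compare with x(t) in (4.9)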


Results. In the subcritical as well as the supercritical regime, the value of (1/n)X_n(t) was indeed close to x(t) for all three simulations. Naturally, an epidemic occurred in all simulations of the supercritical regime where R0 = 2, while the disease decreased to a small fraction in all simulations of the subcritical regime where R0 = 0.5. See Figures 4.3 and 4.4.

The difference between x(t) and (1/n)X_n(t) did not seem to decrease for larger n. An explanation for this could be that our largest value of n, 2^12, may not be sufficiently large to show convergence.

Figure 4.3.: Number of infected people in a Markov jump process for R0 = 0.5. The blue line equals x(t), the others equal (1/n)X_n(t) for n = 2^10 (yellow), n = 2^11 (orange) and n = 2^12 (red).

Figure 4.4.: Number of infected people in a Markov jump process for R0 = 2. The blue line equals x(t), the others equal (1/n)X_n(t) for n = 2^10 (yellow), n = 2^11 (orange) and n = 2^12 (red).


5. Conclusion

As stated, a general result emerges for all of our epidemic models, which is the following:

If the expected number of people a single infected person infects is greater than one, an epidemic may occur. If the expected number of people he infects is less than or equal to one, the disease will surely die out.

We argued that the probability that an infectious disease lives on indefinitely in the Galton-Watson branching process can be calculated. While there exists a positive probability that the disease lives on indefinitely in the supercritical case, this probability equals zero in the subcritical case. Next, we explored the Reed-Frost SIR model. By showing equivalence to the E-R random graph, we stated that there are upper and lower bounds on the number of people that get infected in large populations in the subcritical and supercritical case respectively. Where in the supercritical case the proportion of infected individuals in a population of n people approaches 1 − p^ext_{R0} as n → ∞, the same proportion decreases to zero in the subcritical case. For our deterministic SIR model, we argued that the number of infected people will decrease when the proportion of susceptibles is less than R0^{-1} of the population. In the supercritical regime, if the proportion of susceptibles is larger than R0^{-1}, an epidemic will occur and no more than a proportion R0^{-1} of the population remains susceptible. If we view the number of infected people as a Markov chain on N, we find, by applying Kurtz's Theorem, that the number of infected people approximates the solution to a differential equation in large populations. This result agrees with our simulations.

All epidemic models are based upon certain assumptions on how an infectious disease is passed on. Based on these assumptions, the epidemic models give us an idea of how the disease spreads theoretically, e.g. the probability that an epidemic occurs, the probability that the disease will die out eventually, or the maximum number of people that get infected or stay healthy. Although these are valuable results, one should keep in mind that some of these models are a gross simplification of real-life epidemics and should not be applied as though they were a true representation of real epidemics. For instance, most epidemic models discussed in this thesis assume that susceptibles get infected by infected people independently of other susceptibles and infected people. Clearly, this is not a representation of the transmission of real infectious diseases. Furthermore, we assumed that the rate at which susceptibles get infected upon contact is the same for all individuals. Maybe the greatest objection of all is that all of our models assume that people make contact at random. In reality our social networks are highly complex, and to deduce results about the spread of epidemics accordingly, one would require substantial knowledge of the social sciences.


To overcome these limitations one would require information on how social networks are composed, as well as on births and deaths within a population and on individual differences in susceptibility and recovery rate. This information can be implemented in new epidemic models. These models may be complex to analyse mathematically but can be simulated. If the new epidemic models could form a true representation of real epidemics, they could help us understand the spread of real epidemics and may even provide us with insight into how epidemics could be prevented.


6. Populaire samenvatting

We horen wel vaker in het nieuws dat er een nieuwe besmettelijke ziekte is ontdekt. Als nog niet bekend is hoe besmettelijk de ziekte is en hoe het wordt overgedragen, kan dat best beangstigend zijn. Een besmettelijke ziekte kan zich namelijk snel verspreiden en uitgroeien tot een epidemie. We spreken van een epidemie als een aandeel van de bevolking voor een lange tijd ge¨ınfecteerd is met een besmettelijke ziekte. Niet elke be-smettelijke ziekte veroorzaakt een epidemie. Daarom is het belangrijk dat we weten wat er voor zorgt dat sommige besmettelijke ziektes uitgroeien tot een epidemie en andere niet.

De meeste besmettelijke ziektes beginnen bij een persoon. We noemen deze persoon patient zero. Patient zero geeft de ziekte met een bepaalde kans door aan de mensen die hij of zij tegenkomt. Deze mensen geven op hun beurt hun ziekte door aan de mensen die zij tegen komen, enzovoorts. Als een epidemie ontstaat dan is er na een lange tijd nog een groep mensen in de bevolking die de ziekte heeft. Zo niet, dan zeggen we dat de ziekte is uitgestorven. We willen nu weten in welke omstandigheden een besmettelijke ziekte uitgroeit tot een epidemie.

Hiervoor introduceren we eerst het begrip ’verwachtingswaarde’. Wanneer je het over kansen hebt, heb je het vaak ook over verwachtingswaardes. Iedereen heeft weleens te maken met verwachtingswaardes, ook al heb je dat misschien niet door. Als je een grote kans hebt om de loterij te winnen dan zou je altijd meespelen, omdat je verwacht te winnen. Maar dit hoeft natuurlijk niet altijd te gebeuren! De verwachtingswaarde van je winst is een soort gemiddelde van de kans dat je wint maal het bedrag wat je zou winnen. Zo heb je ook verwachtingswaardes in de overdracht van ziektes. We kunnen namelijk kijken naar de verwachtingswaarde van het aantal mensen dat door elk ziek persoon wordt besmet. We noemen dit de reproductiewaarde van een ziekte.

De reproductiewaarde vormt de basis voor ons begrip voor de verspreiding van epi-demie¨en. We kunnen namelijk het volgende stellen:

Wanneer elk ziek persoon naar verwachting meer dan een persoon besmet is er een grote kans op een epidemie. Wanneer hij of zij naar verwachting ´e´en iemand of minder besmet sterft de ziekte na verloop van tijd uit.

This may seem quite logical. When each sick person passes the disease on to more than one person, the disease spreads very quickly. Note that even in this case an epidemic does not necessarily arise! There is always a chance that all sick people pass their disease on to no one (even though you do not expect this). When each sick person passes the disease on to fewer than one person, it is not unexpected that the disease dies out over time. Perhaps the last case is the most surprising: when each sick person infects exactly one person on average, there is a point in time at which no one is sick any more. This may not be intuitive, but with mathematical proofs it can be shown that this happens with certainty. This is why it is important to have mathematical methods for studying epidemics.

We have now seen under which circumstances an epidemic can arise. We can use several mathematical methods to also look at the maximum number of people who become sick, or at the number of sick people at each point in time. These methods can help us to better understand the spread of epidemics.


Bibliography

[1] R. Clark Robinson, An Introduction to Dynamical Systems, American Mathematical Society, Providence, 2012.

[2] R.L. Schilling, Measures, Integrals and Martingales, Cambridge University Press, Cambridge, 2005.

[3] M. Draief, L. Massoulié, Epidemics and Rumours in Complex Networks, Cambridge University Press, Cambridge, 2010.


A. Appendix A

Disease                      | Average R0 | Fatality rate | Transmission
Malaria                      | 16-80      | 50%           | Mosquito bite
Measles                      | 15         | 30%           | Airborne
Pertussis (Whooping Cough)   | 14.5       | 4%            | Airborne droplet
Mumps                        | 12.5       | 1%            | Airborne droplet
Tuberculosis (untreated)     | 10         | 60%           | Airborne droplet
Chicken pox                  | 8.5        | 0%            | Airborne droplet
Polio                        | 6          | 22%           | Fecal-oral route
Smallpox                     | 6          | 15%           | Airborne droplet
Diphtheria                   | 6.5        | 7.5%          | Saliva
Rubella                      | 6          | 0%            | Airborne droplet
Lyme disease                 | 4.4        | 0.2%          | Tick bite
HIV (untreated)              | 3.5        | 80%           | Sexual contact
HIV (treated)                | 3.5        | 2.1%          | Sexual contact
Pneumonic Plague             | 3.2        | 100%          | Airborne droplet
Influenza                    | 3          | 2.5%          | Airborne droplet
Ebola                        | 2.5        | 50%           | Bodily fluids
Seasonal Flu                 | 2.5        | 0.10%         | Airborne droplet
SARS                         | 2.4        | 9.6%          | Airborne droplet
Cholera                      | 2.13       | 1.63%         | Water


B. Appendix B

B.1. Supporting theorems for Chapter 4

Lemma B.1 (Grönwall's Lemma). Let u be a bounded real-valued function on [0, T] satisfying
\[
u(t) \leq a + b \int_0^t u(s)\,\mathrm{d}s \quad \text{for all } t \in [0, T],
\]
where a, b are non-negative constants. Then
\[
u(t) \leq a e^{bt} \quad \text{for all } t \in [0, T].
\]
See [1] for a proof of this theorem.
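As a brief illustration of how this lemma is typically applied (an added sketch, not taken from the thesis): if y and z solve the same ODE y' = f(y) with an L-Lipschitz right-hand side f, then writing both solutions in integral form and setting u(t) = |y(t) - z(t)| gives
\[
|y(t) - z(t)| \leq |y(0) - z(0)| + L \int_0^t |y(s) - z(s)|\,\mathrm{d}s
\quad\Longrightarrow\quad
|y(t) - z(t)| \leq |y(0) - z(0)|\, e^{Lt},
\]
so two solutions with nearby initial values stay close on bounded time intervals. This is the standard way Grönwall-type bounds enter approximation arguments.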

Lemma B.2 (Borel-Cantelli Lemma). Let (Σ, A, P) be a probability space and let {A_j}_{j∈N} ⊂ A. Then
\[
\sum_{j=1}^{\infty} P(A_j) < \infty \;\Longrightarrow\; P\Bigl(\limsup_{j \to \infty} A_j\Bigr) = 0.
\]
See [2] for a proof of this theorem.

Lemma B.3 (Jensen's inequality). Let X be a random variable and let φ be a convex function. Then
\[
\varphi(E[X]) \leq E[\varphi(X)].
\]
See [2] for a proof of this theorem.

Lemma B.4 (Doob's inequality). Let {X(t)}_{t≥0} be a right-continuous submartingale with left limits. Then for every x ≥ 0 and T ≥ 0,
\[
x \, P\Bigl[\sup_{0 \leq t \leq T} |X(t)| \geq x\Bigr] \leq E\bigl[|X(T)|\bigr].
\]
See [2] for a proof of this theorem.


C. Appendix C

C.1. The number of infected people in a Reed-Frost epidemic

n = 66;
p = 0.066;
sick = 1; SIR = zeros(1, n+1); x = 0:n; tau = 0;
healthy = n+1;                        % number of healthy people

while healthy > 1 && sick > 0         % run while there are sick & healthy people
    for Sz = 0:x(healthy)             % potential number of people who don't get sick
        SIR(1, Sz+1) = nchoosek(x(healthy), Sz) * (1-p)^(sick*Sz) ...
            * (1-(1-p)^sick)^(x(healthy)-Sz);    % transition probabilities
    end
    SIR(1, x(healthy)+2:end) = zeros(1, length(SIR(1, x(healthy)+2:end)));  % delete old probabilities
    cdf = cumsum(SIR(1,:));
    u = rand; healthy = 1;
    while cdf(healthy) <= u           % choose a new number of healthy people according to trans. prob.
        healthy = healthy + 1;
    end
    x(healthy)                        % new number of healthy people in population +1
    sick = size(find(SIR), 2) - x(healthy)   % new number of sick people
    tau = tau + 1;
end
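For reference, the quantity filled into SIR inside the for-loop above is the binomial transition rule of the Reed-Frost chain: writing S_t = x(healthy) for the current number of susceptibles and I_t = sick for the current number of infectives,
\[
P\bigl(S_{t+1} = k \mid S_t, I_t\bigr) = \binom{S_t}{k}\,\bigl((1-p)^{I_t}\bigr)^{k}\,\bigl(1-(1-p)^{I_t}\bigr)^{S_t - k}, \qquad k = 0, \dots, S_t,
\]
i.e. each susceptible escapes infection independently with probability (1 - p)^{I_t}.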

C.2. The number of infected people on G(n,p)

function [R0, C, tau] = afschattingnagaan2functie(n, p)
Iznu = 10;
tau = 1;
Iz = 1:Iznu;                          % list of all infected people
R1 = binornd(ones(n), p);             % create edges with probability p
R2 = triu(R1,1) + triu(R1,1).';       % create symmetric adjacency matrix
R2(1:(n+1):Iznu*(n+1)) = Iz;          % put infected people in matrix
I = R2(1:(n+1):end);
while sum(I~=0) < n                   % while there are still susceptibles left
    I = sum(R2(:,Iz), 2);
    if sum(I~=0) == sum(Iz~=0)        % break if disease died out
        break
    elseif tau > 100
        disp('takes too long')
        break
    else
        Iz = find(I);                 % new list of infectious people
        R2(1:(n+1):end) = I;          % put new infected ones in matrix
        tau = tau + 1;
    end
end
R0 = p*n;
tau;
C = sum(Iz~=0);
end
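As a small usage sketch (values chosen only for illustration; the edge probability 1/(2n) reproduces the R_0 = 0.5 setting used in Section C.3):

n = 1024;                                              % population size
[R0, C, tau] = afschattingnagaan2functie(n, 1/(2*n));  % one realisation with R0 = 0.5
fprintf('R0 = %.2f: %d infected after %d steps\n', R0, C, tau);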

C.3. The mean number of infected people for 100 simulations on G(n,p)

aanttst = 100;                        % number of simulations
C = zeros(10, aanttst);               % lists all data
e = 0; k = 4;
t = cputime;
while e < 300
    n = 2^k;
    for test = 1:aanttst              % run simulation 100 times
        [r, c, tau] = afschattingnagaan2functie(n, 1./(2*n));
        C(k-3, test) = c;             % lists the number of infected people
    end
    k = k + 1;
    e = cputime - t
end
cpu = cputime - t

C = sum(C')./aanttst;                 % calculate mean number of infected people
x = 4:size(C,2);
y1 = 2*log(2.^x); y2 = 3*log(2.^x); y3 = 4*log(2.^x);

figure                                % plots the mean number of infected people for n = 16, 32, etc.
plot(2.^x, C(1,x-3), 'b-', 2.^x, y1, '--', 2.^x, y2, '--', 2.^x, y3, '--')
title('R_0 = 0.5')
xlabel('n')
ylabel('number of infected people')

figure                                % fits to the appropriate function
plot(log(2.^x), C(1,x-3), 'bo')
title('R_0 = 0.5')
xlabel('log(n)')
ylabel('number of infected people')

C.4. Number of infected people in a Markov chain model

% setting up our jump matrix, sojourn times and initial distribution
clear all
beta = 2; gamma = 1; N = 2048; I0 = 204;

S = [0:N+1];                 % set up state space (in a population of N people)
Q = zeros(N+1);              % set up a matrix of zeros, of the right size
                             % rows 1..N+1 correspond to 0..N infected people
lambda = zeros(1,N+1);       % set up the jump rates (inverse mean sojourn times)
for i = 2:N                  % set up jump matrix Q
    qi = (gamma*(i-1) + (beta/N)*(i-1)*(N-(i-1)));   % total jump rate out of state i
    Q(i,i-1) = gamma*(i-1)/qi;                       % number of infected decreases by 1
    Q(i,i+1) = (beta/N)*(i-1)*(N-(i-1))/qi;          % number of infected increases by 1
    lambda(i) = qi;
end
lambda(1) = 1;               % 0 is an absorbing state (disease died out)
lambda(N+1) = 1;             % N+1 is an absorbing state (everybody is sick)
Q(1,1) = 1;
Q(N+1,N+1) = 1;

mu = zeros(1,N+1);           % set up initial distribution
mu(I0) = 1;

clear T                      % T lists our jump times
clear x                      % x lists our jump directions
T(1) = 0;                    % start times at 0
x(1) = rando(mu);            % generate first x value (time 0, not time 1)
i = 1;

% simulation
while T(i) < 9               % repeat until time 9 is reached
    T(i+1) = T(i) - log(rand)/lambda(x(i));   % calculate jump times by
                                              % generating an exponential random variable
    x(i+1) = rando(Q(x(i),:));                % use Q to make state transitions
    if x(i+1) == 1           % the disease died out if 0 is hit
        disp('disease died out')
        break
    else
        i = i+1;             % if not died out, continue
    end
end

% plotting
S(x) = S(x)./N;              % scale with respect to N
hold all
for i = 1:(length(T)-1)
    hold all
    plot([T(i) T(i+1)], [S(x(i)) S(x(i))])    % plot the jump process
    hold all
end
stairs(T, S(x));
t = 0:0.01:T(i);
y = ((beta - 1)*(I0/N)*exp((beta-1)*t)) ./ ((beta - 1) - beta*(I0/N)*(1 - exp((beta - 1)*t)));
plot(t, y, 'b')              % plot the solution to the differential equation
axis([0 T(i+1)*1.1 0 (max(x)+1)*1.1./N]);
xlabel('Time');
ylabel('Number of infected people')
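For reference, a sketch of where the plotted curve y comes from (assuming, as the jump rates above suggest, that the scaled number of infected people i(t) = I(t)/N approximately follows the corresponding limit equation): the up-rate (β/N)I(N - I) and down-rate γI lead to
\[
\frac{\mathrm{d}i}{\mathrm{d}t} = \beta\, i\,(1 - i) - \gamma\, i, \qquad i(0) = I_0/N,
\]
and for γ = 1, as set in the script, this logistic-type equation has the explicit solution
\[
i(t) = \frac{(\beta - 1)\,\frac{I_0}{N}\, e^{(\beta - 1)t}}{(\beta - 1) - \beta\,\frac{I_0}{N}\bigl(1 - e^{(\beta - 1)t}\bigr)},
\]
which is exactly the expression assigned to y.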

C.5. Accessory function for C.4

% rando.m generates a random variable in 1, 2, ..., n given a distribution vector p
function [index] = rando(p)
u = rand; i = 1; s = p(1);
while ((u > s) && (i < length(p)))
    i = i + 1;
    s = s + p(i);
end
index = i;
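For example (a small usage sketch):

p = [0.2 0.5 0.3];   % distribution over the values 1, 2 and 3
idx = rando(p)       % returns 1, 2 or 3 with probabilities 0.2, 0.5 and 0.3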

