
C.F. van Oosten

c.f.van.oosten@umail.leidenuniv.nl

The EM-algorithm for Poisson data

Bachelor thesis

Thesis supervisor: Dr. J. Schmidt-Hieber

Date of publication: August 14, 2014

Mathematical Institute of Leiden University


Contents

1 Problem formulation
2 Summary
3 Application
4 EM-algorithm
5 EM-algorithm for Poisson data
6 EM-algorithm for normal data
7 Landweber iteration
8 EM-algorithm for normal data revisited
9 Linking the formulas of the Normal and Poisson problems
10 Simulations
   10.1 Normal model with variance independent of λ (First model)
   10.2 Normal model with variance dependent on λ (Second model)
   10.3 Poisson Model
   10.4 Discussion of simulation results
11 Technical Appendix


1 Problem formulation

The goal is to find stable estimators for the parameters $\lambda = (\lambda_j)_j$ in the following problem.

We are given a matrix $A$ of known weights,
\[
A = (a_{ij})_{i=1,\dots,n;\ j=1,\dots,m}, \qquad \sum_{j=1}^{m} a_{ij} = 1, \quad a_{ij} \ge 0, \qquad \forall i = 1,\dots,n,\ j = 1,\dots,m.
\]
These conditions are necessary for identifiability of the parameter $\lambda$. The distribution of $N_{ij}$ is given by
\[
N_{ij} \sim \mathcal{P}(a_{ij}\lambda_j), \qquad i = 1,\dots,n,\ j = 1,\dots,m,
\]
where $\mathcal{P}$ denotes the Poisson distribution. While we do not observe these $N_{ij}$, we do have the observations $\{Y_i\}_{i=1,\dots,n}$, which have the following distribution:
\[
Y_i = \sum_{j=1}^{m} N_{ij} \sim \mathcal{P}\Big( \sum_{j=1}^{m} a_{ij}\lambda_j \Big). \tag{1}
\]
We would like to find stable estimators for $\lambda$ using the data $\{Y_i\}_{i=1,\dots,n}$.

2 Summary

Using the EM-algorithm we derive iterative formulas to estimate $\lambda$. Because the iteration formula for Poisson data is difficult to analyse, we study a related problem: a normal distribution with mean $a_{ij}\lambda_j$ and variance $a_{ij}$, and another normal distribution with mean and variance both equal to $a_{ij}\lambda_j$, which approximates the Poisson distribution for large $a_{ij}\lambda_j$. For the first normal model we find the Landweber iteration formula, which has an explicit solution. We compare the second normal model with the Poisson case to find the similarities. In the end we run some simulations to examine the convergence behaviour of the iteration formulas.

3 Application

This problem occurs in positron emission tomography, molecular microscopy and various problems in astrophysics, as mentioned in [4]. In positron emission tomography (PET), the Poisson data models the emission density. The model also fits the distribution of the number of photons in double-slit interference of light. This works as follows: after the light passes through the double slit it diverges, where a larger divergence results in a lower light intensity.

4 EM-algorithm

The EM-algorithm is an iterative method used to find maximum likelihood estimates of parameters and can be used when some of the data is not available. We will first define the EM-algorithm as in [2, p. 134]. Let $p(X, \theta)$ be the probability function of $X$ with parameter $\theta$, and let $S(X)$ be a linear function of $X$. Let
\[
J(\theta \mid \theta_0) \equiv E_{\theta_0}\Big[ \log \frac{p(X, \theta)}{p(X, \theta_0)} \,\Big|\, S(X) = s \Big]. \tag{2}
\]

Then do the following steps:

Step 1: Initialize the parameter $\theta_{\text{old}} = \theta_0$.
Step 2: Compute $J(\theta \mid \theta_{\text{old}})$ for as many values of $\theta$ as needed.
Step 3: Maximize $J(\theta \mid \theta_{\text{old}})$ as a function of $\theta$.
Step 4: Define $\theta_{\text{new}} = \arg\max_\theta J(\theta \mid \theta_{\text{old}})$, set $\theta_{\text{old}} = \theta_{\text{new}}$ and continue with Step 2.
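The four steps above amount to a simple loop. The following is a minimal MATLAB sketch, assuming a model-specific helper function expected_loglik (hypothetical, not part of the thesis) that evaluates $J(\theta \mid \theta_{\text{old}})$ from the observed data Y, and using fminsearch for the maximization:

theta_old = theta0;                                    %Step 1: initialize
for step = 1:maxsteps
    J = @(theta) expected_loglik(theta, theta_old, Y); %Step 2 (hypothetical helper)
    theta_new = fminsearch(@(t) -J(t), theta_old);     %Step 3: maximize J
    if norm(theta_new - theta_old) < tol, break; end   %stop when stable
    theta_old = theta_new;                             %Step 4: iterate
end

For the models in this thesis the maximization in Step 3 is available in closed form, which is what the following sections derive.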

5 EM-algorithm for Poisson data

In model (1) we can easily compute the likelihood function, so we can use the EM-algorithm to estimate $\lambda$. For the likelihood function we find
\[
L_{(N_{ij})_{ij}}(\lambda) = \prod_{i=1}^{n} \prod_{j=1}^{m} \frac{e^{-a_{ij}\lambda_j} (\lambda_j a_{ij})^{N_{ij}}}{N_{ij}!}
\]
and for the log-likelihood function we find
\[
l_{(N_{ij})_{ij}}(\lambda) = \sum_{i=1}^{n} \sum_{j=1}^{m} \big( -\lambda_j a_{ij} + N_{ij} \log(\lambda_j a_{ij}) - \log(N_{ij}!) \big).
\]
Looking at the derivative of the log-likelihood with respect to $\lambda_j$ we obtain
\[
\frac{d}{d\lambda_j} E[l_{(N_{ij})_{ij}} \mid (Y_i)_i] = \sum_{i=1}^{n} \Big( -a_{ij} + \frac{1}{\lambda_j} E[N_{ij} \mid (Y_i)_i] \Big) \qquad \forall j = 1,\dots,m.
\]

Note that $N_{ij} \mid (Y_i)_i$ has the same distribution as $N_{ij} \mid Y_i$, because $N_{ij}$ is independent of all $Y_k$ with $k \neq i$. We use the following lemma to determine the distribution of $N_{ij} \mid Y_i$; the proof can be found in the technical appendix.

Lemma 5.1. Let $X_1, X_2$ be independent Poisson random variables with $X_1 \sim \mathcal{P}(\lambda_1)$ and $X_2 \sim \mathcal{P}(\lambda_2)$. Then
\[
X_1 \mid (X_1 + X_2) \sim \mathrm{Bin}\Big( X_1 + X_2,\; \frac{\lambda_1}{\lambda_1 + \lambda_2} \Big).
\]

By taking $X_1 = N_{ij}$ and $X_2 = Y_i - N_{ij}$ we find
\[
N_{ij} \mid Y_i \sim \mathrm{Bin}\Big( Y_i,\; \frac{a_{ij}\lambda_j}{\sum_{k=1}^{m} a_{ik}\lambda_k} \Big).
\]
Because the expectation of a binomial distribution with parameters $n, p$ is $np$, we have
\[
E[N_{ij} \mid Y_i] = \frac{Y_i a_{ij}\lambda_j}{\sum_{k=1}^{m} a_{ik}\lambda_k}.
\]

Therefore
\[
\frac{d}{d\lambda_j} E[l_{(N_{ij})_{ij}} \mid (Y_i)_i] = \sum_{i=1}^{n} \Big( -a_{ij} + \frac{1}{\lambda_j} \frac{Y_i a_{ij}\lambda_j^{\text{old}}}{\sum_{k=1}^{m} a_{ik}\lambda_k^{\text{old}}} \Big) \qquad \forall j = 1,\dots,m. \tag{3}
\]
Setting the derivative to 0 to find a possible maximum gives
\[
0 = \sum_{i=1}^{n} \Big( -a_{ij} + \frac{1}{\lambda_j} \frac{Y_i a_{ij}\lambda_j^{\text{old}}}{\sum_{k=1}^{m} a_{ik}\lambda_k^{\text{old}}} \Big).
\]
Solving this for each $\lambda_j$ gives
\[
\lambda_j = \frac{\lambda_j^{\text{old}}}{\sum_{i=1}^{n} a_{ij}} \sum_{i=1}^{n} \frac{Y_i a_{ij}}{\sum_{k=1}^{m} a_{ik}\lambda_k^{\text{old}}}. \tag{4}
\]

Looking back at (3), we see that the derivative is a decreasing function of $\lambda_j$, so the value we found is a maximum.
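Update (4) is straightforward to vectorize. The following is a minimal MATLAB sketch of the iteration, assuming the weight matrix A (n-by-m, rows summing to 1), the data vector Y (n-by-1) and the number of steps k are already defined; the full simulation scripts are in the Technical Appendix:

lambda = 500*ones(m,1);                  %starting value, as in Section 10
colsums = sum(A,1)';                     %colsums(j) = sum_i a_ij
for t = 1:k
    denom = A*lambda;                    %denom(i) = sum_k a_ik lambda_k^old
    lambda = lambda .* (A'*(Y./denom)) ./ colsums;   %update (4)
end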

6 EM-algorithm for normal data

Since the Poisson problem resists an exact analysis, we first work in a toy model. For large parameters $\lambda$, the Poisson distribution approximates a normal distribution with mean $\lambda$ and variance $\lambda$. So we look at the similar problem with normal distributions, which we formulate as
\[
N_{ij} \sim \mathcal{N}(a_{ij}\lambda_j, a_{ij}), \qquad A = (a_{ij})_{i=1,\dots,n;\ j=1,\dots,m},
\]
with the following conditions on the matrix $A$:
\[
\sum_{j=1}^{m} a_{ij} = 1, \quad a_{ij} \ge 0, \qquad \forall i = 1,\dots,n,\ j = 1,\dots,m.
\]
Here we observe the data $\{Y_i\}_{i=1,\dots,n}$, which has the following distribution:
\[
Y_i = \sum_{j=1}^{m} N_{ij} \sim \mathcal{N}\Big( \sum_{j=1}^{m} a_{ij}\lambda_j,\; 1 \Big). \tag{5}
\]

In this model we can easily compute the likelihood function of normal distributions, so we use the EM-algorithm to estimate $\lambda$. We find the likelihood function for $N_{ij}$ to be
\[
L_{(N_{ij})_{ij}}(\lambda) = \prod_{i=1}^{n} \prod_{j=1}^{m} \frac{1}{\sqrt{2\pi a_{ij}}} \exp\Big( -\frac{1}{2a_{ij}} (N_{ij} - a_{ij}\lambda_j)^2 \Big)
\]
and the log-likelihood function of $N_{ij}$ is
\[
l_{(N_{ij})_{ij}}(\lambda) = \sum_{i=1}^{n} \sum_{j=1}^{m} \Big( -\log\big(\sqrt{2\pi a_{ij}}\big) - \frac{1}{2a_{ij}} (N_{ij} - a_{ij}\lambda_j)^2 \Big).
\]

Thus the derivative of the log-likelihood with respect to $\lambda_j$ is
\[
\frac{d}{d\lambda_j} E[l_{(N_{ij})_{ij}} \mid Y_i] = \sum_{i=1}^{n} -\frac{1}{2a_{ij}} \frac{d}{d\lambda_j} E[(N_{ij} - a_{ij}\lambda_j)^2 \mid Y_i] \qquad \forall j = 1,\dots,m.
\]
To find a maximum of the log-likelihood we have to compute $\frac{d}{d\lambda_j} E[(N_{ij} - a_{ij}\lambda_j)^2 \mid Y_i]$. Note that $(N_{ij} - a_{ij}\lambda_j)^2$ is a continuous function, and so is its derivative $\frac{d}{d\lambda_j}(N_{ij} - a_{ij}\lambda_j)^2 = -2a_{ij}(N_{ij} - \lambda_j a_{ij})$. Since $N_{ij} - a_{ij}\lambda_j$ has a normal density, it is also bounded. Therefore $(N_{ij} - a_{ij}\lambda_j)^2$ and $-2a_{ij}(N_{ij} - \lambda_j a_{ij})$ are also bounded. This satisfies the conditions of Theorem 10.3 in [3], so we can switch the order of differentiation and expectation. By switching them we find
\[
\frac{d}{d\lambda_j} E[(N_{ij} - a_{ij}\lambda_j)^2 \mid Y_i] = -2a_{ij} E[N_{ij} - \lambda_j a_{ij} \mid Y_i],
\]
which gives us
\[
\frac{d}{d\lambda_j} E[l_{(N_{ij})_{ij}} \mid Y_i] = \sum_{i=1}^{n} E[N_{ij} - \lambda_j a_{ij} \mid Y_i] \qquad \forall j = 1,\dots,m.
\]

To simplify this further, we need an explicit formula for the conditional distribution of one normal random variable given another. We use the following lemma, whose proof can be found in the technical appendix.

Lemma 6.1. Let $X_1, X_2$ be non-degenerate jointly normal random variables with $X_1 \sim \mathcal{N}(\mu_1, \sigma_1^2)$ and $X_2 \sim \mathcal{N}(\mu_2, \sigma_2^2)$. Then
\[
X_1 \mid X_2 \sim \mathcal{N}\Big( \mu_1 + \frac{\sigma_1}{\sigma_2}\rho (X_2 - \mu_2),\; (1 - \rho^2)\sigma_1^2 \Big),
\]
with $\rho$ the correlation coefficient between $X_1$ and $X_2$.

For $N_{ij}$ and $Y_i$ we find that $\mathrm{Cov}(N_{ij}, Y_i) = \mathrm{Cov}(N_{ij}, \sum_{j=1}^{m} N_{ij}) = \mathrm{Cov}(N_{ij}, N_{ij}) = \mathrm{Var}(N_{ij}) = a_{ij}$. So the correlation coefficient between $N_{ij}$ and $Y_i$ is
\[
\rho = \frac{\mathrm{Cov}(N_{ij}, Y_i)}{\sqrt{\mathrm{Var}(N_{ij})}\sqrt{\mathrm{Var}(Y_i)}} = \sqrt{a_{ij}}.
\]

Using the previous lemma we obtain
\[
N_{ij} \mid Y_i \sim \mathcal{N}\Big( a_{ij}\lambda_j^{\text{old}} + a_{ij}\Big( Y_i - \sum_{l=1}^{m} a_{il}\lambda_l^{\text{old}} \Big),\; (1 - a_{ij}) a_{ij} \Big).
\]
This means $E[N_{ij} - \lambda_j a_{ij} \mid Y_i] = a_{ij}(\lambda_j^{\text{old}} - \lambda_j) + a_{ij}\big( Y_i - \sum_{l=1}^{m} a_{il}\lambda_l^{\text{old}} \big)$, which gives
\[
\frac{d}{d\lambda_j} E[l_{(N_{ij})_{ij}}(\lambda) \mid Y_i] = \sum_{i=1}^{n} \Big( a_{ij}(\lambda_j^{\text{old}} - \lambda_j) + a_{ij}\Big( Y_i - \sum_{l=1}^{m} a_{il}\lambda_l^{\text{old}} \Big) \Big) \qquad \forall j = 1,\dots,m.
\]

To maximize the likelihood we have to solve $\frac{d}{d\lambda_j} E[l_{(N_{ij})_{ij}}(\lambda) \mid Y_i] = 0$, or equivalently
\[
\Big[ \sum_{i=1}^{n} a_{ij} \Big] (\lambda_j^{\text{old}} - \lambda_j) + \sum_{i=1}^{n} \Big[ a_{ij}\Big( Y_i - \sum_{l=1}^{m} a_{il}\lambda_l^{\text{old}} \Big) \Big] = 0 \qquad \forall j = 1,\dots,m.
\]
Bringing all terms other than $\lambda_j$ to the other side gives
\[
\lambda_j = \lambda_j^{\text{old}} + \frac{1}{\sum_{i=1}^{n} a_{ij}} \sum_{i=1}^{n} a_{ij}\Big( Y_i - \sum_{l=1}^{m} a_{il}\lambda_l^{\text{old}} \Big) \qquad \forall j = 1,\dots,m. \tag{6}
\]
Define $w = (w_j)_{j=1,\dots,m}$ with $w_j = \frac{1}{\sum_{i=1}^{n} a_{ij}}$ for all $j = 1,\dots,m$, and take $W$ the square matrix with the elements $w_j$ on the diagonal. Then we can rewrite equation (6) in matrix notation, which gives the following updating formula:
\[
\lambda = \lambda^{\text{old}} + W A^T (Y - A\lambda^{\text{old}}). \tag{7}
\]
This iterative method is also known as the Landweber iteration and will be studied in more detail in the next section.
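In matrix form the update is a one-line step. A minimal MATLAB sketch, assuming A (n-by-m), Y (n-by-1) and the number of steps k are given:

W = diag(1 ./ sum(A,1));                 %diagonal matrix with w_j = 1/sum_i a_ij
lambda = zeros(m,1);                     %initialization lambda^(0) = 0 (see Section 7)
for t = 1:k
    lambda = lambda + W*(A'*(Y - A*lambda));   %Landweber step (7)
end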

7 Landweber iteration

We have now found an iterative method to estimate our parameters, but can we find an explicit expression for each iteration step without having to go through all the preceding steps? Fortunately there is such an expression if we initialize the EM-algorithm with $\lambda^{(0)} = 0$, as shown by the following lemma.

Lemma 7.1. Let $W$ in (7) be constant on the diagonal and write $\lambda^{(k)} = p_k(W A^T A) W A^T Y$ for $k = 0, 1, 2, \dots$ Then $\lambda^{(k)}$ is an explicit solution to the Landweber iteration if
\[
p_k(x) = -\frac{1}{x}(1 - x)^k + \frac{1}{x}.
\]

Proof: Define $x = W A^T A$. Substituting $\lambda^{(k)}$ into equation (7) gives
\[
p_{k+1}(x)\, W A^T Y = p_k(x)\, W A^T Y + W A^T Y - x\, p_k(x)\, W A^T Y,
\]
that is,
\[
p_{k+1}(x) = p_k(x) + 1 - x\, p_k(x).
\]
Substituting the $p_k$ we have in the lemma gives
\[
(1 - x)\Big( -\frac{1}{x}(1 - x)^k + \frac{1}{x} \Big) + 1 = -\frac{1}{x}(1 - x)^{k+1} + \frac{1 - x}{x} + 1 = -\frac{1}{x}(1 - x)^{k+1} + \frac{1}{x} = p_{k+1}(x).
\]
This holds for all $x \neq 0$, so $\lambda^{(k)}$ satisfies the iteration. We find $\lambda^{(0)} = p_0(x) W A^T Y = 0 \cdot W A^T Y = 0$. Thus $\lambda^{(k)} = p_k(W A^T A) W A^T Y$ is a solution to the Landweber iteration if $p_k(x) = -\frac{1}{x}(1 - x)^k + \frac{1}{x}$.

Since $A^T A$ is a symmetric matrix, there exist an orthogonal matrix $D$ and a diagonal matrix $\Delta$ such that $A^T A = D^T \Delta D$. Then we can easily rewrite
\[
p_k(W A^T A) = D^T p_k(W \Delta) D.
\]
For large $k$, $p_k(x)$ approaches $\frac{1}{x}$ on $0 < x < 1$. Moreover, the supremum of $p_k$ is bounded, that is, $\sup_{x \in [0,1]} p_k(x) \le k$. [Figure showing $p_k$ on $[0,1]$ for several $k$, omitted.]
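The omitted figure can be reproduced with a few lines of MATLAB. This sketch plots $p_k(x) = (1 - (1 - x)^k)/x$ for a few values of $k$, together with the limit $\frac{1}{x}$:

x = linspace(0.001, 1, 1000);
hold on
for k = [1 5 20 100]
    plot(x, (1 - (1-x).^k)./x)           %p_k(x); note p_k(x) -> k as x -> 0
end
plot(x, 1./x, '--')                      %pointwise limit 1/x on (0,1)
ylim([0 110]); xlabel('x'); ylabel('p_k(x)')
legend('k = 1','k = 5','k = 20','k = 100','1/x')
hold off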


8 EM-algorithm for normal data revisited

Now that we have found an explicit solution for the normal model with variance independent of $\lambda$ (5), we can improve the model further. We expand the toy model by making the variance of the unobserved variables dependent on $\lambda_j$. This means the problem is formulated as
\[
M_{ij} \sim \mathcal{N}(a_{ij}\lambda_j, a_{ij}\lambda_j).
\]
We are interested in this normal distribution because for large parameters $a_{ij}\lambda_j$ this distribution approximates a Poisson distribution according to the Central Limit Theorem. We have the same weight matrix $A = (a_{ij})_{i=1,\dots,n;\ j=1,\dots,m}$, with the following conditions on the matrix $A$:
\[
\sum_{j=1}^{m} a_{ij} = 1, \quad a_{ij} \ge 0, \qquad \forall i = 1,\dots,n,\ j = 1,\dots,m.
\]

We observe the data $\{Y_i\}_{i=1,\dots,n}$, which has the following distribution:
\[
Y_i = \sum_{j=1}^{m} M_{ij} \sim \mathcal{N}\Big( \sum_{j=1}^{m} a_{ij}\lambda_j,\; \sum_{j=1}^{m} a_{ij}\lambda_j \Big). \tag{8}
\]

We find that the likelihood function of $M_{ij}$ is
\[
L_{M_{ij}}(\lambda) = \prod_{i=1}^{n} \prod_{j=1}^{m} \frac{1}{\sqrt{2\pi a_{ij}\lambda_j}} \exp\Big( -\frac{1}{2a_{ij}\lambda_j} (M_{ij} - a_{ij}\lambda_j)^2 \Big).
\]
Thus the log-likelihood becomes
\[
l_{M_{ij}}(\lambda) = \sum_{i=1}^{n} \sum_{j=1}^{m} \Big( -\frac{1}{2}\log(2\pi a_{ij}\lambda_j) - \frac{1}{2a_{ij}\lambda_j} (M_{ij} - a_{ij}\lambda_j)^2 \Big).
\]
Looking at the derivative of the log-likelihood with respect to $\lambda_j$ we obtain
\[
\frac{d}{d\lambda_j} E[l_{M_{ij}}(\lambda)] = \sum_{i=1}^{n} \Big( -\frac{1}{2\lambda_j} - \frac{d}{d\lambda_j} E\Big[ \frac{1}{2a_{ij}\lambda_j} (M_{ij} - a_{ij}\lambda_j)^2 \Big] \Big).
\]

We know $M_{ij} - a_{ij}\lambda_j$ has a normal density, so it is continuous and bounded. Thus $\frac{1}{2a_{ij}\lambda_j}(M_{ij} - a_{ij}\lambda_j)^2$ is continuous and bounded for $\lambda_j \neq 0$. We find the same for the derivative
\[
\frac{d}{d\lambda_j} \Big( \frac{1}{2a_{ij}\lambda_j} (M_{ij} - a_{ij}\lambda_j)^2 \Big) = -\frac{1}{2a_{ij}\lambda_j^2} (M_{ij} - a_{ij}\lambda_j)^2 - \frac{1}{\lambda_j} (M_{ij} - a_{ij}\lambda_j).
\]
This satisfies the conditions of Theorem 10.3 in [3], so we can switch the order of differentiation and expectation. By switching them we obtain
\[
\frac{d}{d\lambda_j} E[l_{M_{ij}}(\lambda) \mid Y_i] = \sum_{i=1}^{n} \Big( -\frac{1}{2\lambda_j} + \frac{1}{2a_{ij}\lambda_j^2} E[(M_{ij} - a_{ij}\lambda_j)^2 \mid Y_i] + \frac{1}{\lambda_j} E[M_{ij} - a_{ij}\lambda_j \mid Y_i] \Big). \tag{9}
\]
To find these expectations we use the following claim, whose proof can be found in the technical appendix.

Claim 8.1. Let $M'_{ij} = \alpha Y_i + \xi_{ij}$. Then $M_{ij}$ and $M'_{ij}$ have the same distribution when
\[
\xi_{ij} \sim \mathcal{N}\Big( 0,\; a_{ij}\lambda_j - \frac{(a_{ij}\lambda_j)^2}{\sum_{j=1}^{m} a_{ij}\lambda_j} \Big), \qquad \alpha = \frac{a_{ij}\lambda_j}{\sum_{j=1}^{m} a_{ij}\lambda_j}.
\]

With this claim we can find the expectations in (9). We have
\[
E[(M'_{ij})^2 \mid Y_i] = E[\alpha^2 Y_i^2 + 2\alpha Y_i \xi_{ij} + \xi_{ij}^2 \mid Y_i] = \frac{(a_{ij}\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 + 0 + a_{ij}\lambda_j^{\text{old}} - \frac{(a_{ij}\lambda_j^{\text{old}})^2}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}},
\]
\[
E[M'_{ij} \mid Y_i] = \frac{a_{ij}\lambda_j^{\text{old}}}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} Y_i.
\]
Therefore we obtain
\[
E[(M'_{ij} - a_{ij}\lambda_j)^2 \mid Y_i] = E[(M'_{ij})^2 - 2a_{ij}\lambda_j M'_{ij} + a_{ij}^2\lambda_j^2 \mid Y_i]
= \frac{(a_{ij}\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 + a_{ij}\lambda_j^{\text{old}} - \frac{(a_{ij}\lambda_j^{\text{old}})^2}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} - 2a_{ij}\lambda_j \frac{a_{ij}\lambda_j^{\text{old}}}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} Y_i + a_{ij}^2\lambda_j^2.
\]
Substituting these in equation (9) results in

\[
\begin{aligned}
\frac{d}{d\lambda_j} E[l_{M_{ij}}(\lambda) \mid Y_i]
&= \sum_{i=1}^{n} \Big( -\frac{1}{2\lambda_j} + \frac{1}{2a_{ij}\lambda_j^2} E[(M'_{ij} - a_{ij}\lambda_j)^2 \mid Y_i] + \frac{1}{\lambda_j} E[M'_{ij} - a_{ij}\lambda_j \mid Y_i] \Big) \\
&= \sum_{i=1}^{n} \Bigg( -\frac{1}{2\lambda_j} + \frac{1}{2a_{ij}\lambda_j^2} \Big[ \frac{(a_{ij}\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 + a_{ij}\lambda_j^{\text{old}} - \frac{(a_{ij}\lambda_j^{\text{old}})^2}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} - 2a_{ij}\lambda_j \frac{a_{ij}\lambda_j^{\text{old}}}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} Y_i + a_{ij}^2\lambda_j^2 \Big] \\
&\qquad\qquad + \frac{1}{\lambda_j} \Big[ \frac{a_{ij}\lambda_j^{\text{old}}}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} Y_i - a_{ij}\lambda_j \Big] \Bigg) \\
&= \sum_{i=1}^{n} \Bigg( -\frac{1}{2\lambda_j} + \frac{1}{2a_{ij}\lambda_j^2} \Big[ \frac{(a_{ij}\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 + a_{ij}\lambda_j^{\text{old}} - \frac{(a_{ij}\lambda_j^{\text{old}})^2}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} \Big] - \frac{1}{2} a_{ij} \Bigg) \\
&= \sum_{i=1}^{n} \Bigg( -\frac{1}{2\lambda_j} + \frac{1}{2\lambda_j^2} \Big[ \lambda_j^{\text{old}} + \frac{a_{ij}(\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 - \frac{a_{ij}(\lambda_j^{\text{old}})^2}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} \Big] - \frac{1}{2} a_{ij} \Bigg).
\end{aligned} \tag{10}
\]
Setting the derivative to 0 and multiplying by $2\lambda_j^2$ on both sides gives
\[
0 = \sum_{i=1}^{n} \Bigg( -\lambda_j + \Big[ \lambda_j^{\text{old}} + \frac{a_{ij}(\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 - \frac{a_{ij}(\lambda_j^{\text{old}})^2}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} \Big] - a_{ij}\lambda_j^2 \Bigg).
\]

Or equivalently,
\[
0 = \frac{n}{\sum_{i=1}^{n} a_{ij}} \lambda_j - \frac{\sum_{i=1}^{n} \Big( \lambda_j^{\text{old}} + \frac{a_{ij}(\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 - \frac{a_{ij}(\lambda_j^{\text{old}})^2}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} \Big)}{\sum_{i=1}^{n} a_{ij}} + \lambda_j^2. \tag{11}
\]

So we have a quadratic equation. We can look at the behaviour when $\lambda_j$ is either large or small. When $\lambda_j$ is large, the linear term is of lower order than the quadratic term; when $\lambda_j$ is small, the quadratic term is of lower order than the linear term.

Because we have a quadratic equation of the form $x^2 + px + q = 0$, we can solve it using the quadratic formula, which gives the solutions $x_\pm = -\frac{p}{2} \pm \sqrt{\frac{p^2}{4} - q}$. Let
\[
p = \frac{n}{\sum_{i=1}^{n} a_{ij}}, \qquad q = -\frac{\sum_{i=1}^{n} \Big( \lambda_j^{\text{old}} + \frac{a_{ij}(\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 - \frac{a_{ij}(\lambda_j^{\text{old}})^2}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} \Big)}{\sum_{i=1}^{n} a_{ij}}.
\]

This equation has two solutions, namely $\lambda_{j\pm} = -\frac{p}{2} \pm \sqrt{\frac{p^2}{4} - q}$. Note that $p > 0$ since $n > 0$, so $\lambda_{j-} < 0$. This does not fit our problem, so we look at the other solution. We find that
\[
\lambda_j^{\text{old}} + \frac{a_{ij}(\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 - \frac{a_{ij}(\lambda_j^{\text{old}})^2}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} \;>\; \lambda_j^{\text{old}} + \frac{a_{ij}(\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 - \lambda_j^{\text{old}} \;=\; \frac{a_{ij}(\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 \;\ge\; 0
\]
for all $j$. This means that $q \le 0$, thus
\[
\lambda_{j+} = -\frac{p}{2} + \sqrt{\frac{p^2}{4} - q} \tag{12}
\]
is a positive solution.

Note that for $\lambda_j \downarrow 0$ the derivative (10) goes to $+\infty$, because $\lambda_j^{\text{old}} + \frac{a_{ij}(\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 - \frac{a_{ij}(\lambda_j^{\text{old}})^2}{\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}}} \ge 0$. Since $\lambda_{j+}$ is the only positive zero of the derivative, this means that near $\lambda_{j+}$ we have $\frac{d}{d\lambda_j} E[l_{M_{ij}}(\lambda) \mid Y_i] > 0$ for $\lambda_j < \lambda_{j+}$ and $\frac{d}{d\lambda_j} E[l_{M_{ij}}(\lambda) \mid Y_i] < 0$ for $\lambda_j > \lambda_{j+}$. Thus $\lambda_{j+}$ is a maximum.
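A minimal MATLAB sketch of update (12) for one iteration step, assuming A (n-by-m), Y (n-by-1) and the current iterate lambda_old (m-by-1) are given; the full simulation scripts are in the Technical Appendix:

S = A*lambda_old;                          %S(i) = sum_k a_ik lambda_k^old
lambda_new = zeros(m,1);
for j = 1:m
    w = 1/sum(A(:,j));                     %1 / sum_i a_ij
    p = n*w;
    q = -w*sum( lambda_old(j) ...
        + A(:,j)*lambda_old(j)^2 .* Y.^2 ./ S.^2 ...
        - A(:,j)*lambda_old(j)^2 ./ S );
    lambda_new(j) = -p/2 + sqrt(p^2/4 - q);  %positive root (12)
end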

9 Linking the formulas of the Normal and Poisson problems

Let us look further into equation (11) when $\lambda_j$ is large. We are looking for the stable value of the iteration formula, so we require $\lambda_j^{\text{old}} = \lambda_j$. We noticed that the linear terms of (11) are of smaller order than the quadratic terms when $\lambda_j$ is large. Since $\lambda_j^{\text{old}} = \lambda_j$, the $\lambda_j^{\text{old}}$ terms are also of smaller order when $\lambda_j$ is large. This means we can rewrite (11) as
\[
0 = \sum_{i=1}^{n} \Big( \frac{a_{ij}(\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2 - a_{ij}\lambda_j^2 \Big),
\qquad\text{that is,}\qquad
\lambda_j^2 = \frac{1}{\sum_{i=1}^{n} a_{ij}} \sum_{i=1}^{n} \frac{a_{ij}(\lambda_j^{\text{old}})^2}{(\sum_{j=1}^{m} a_{ij}\lambda_j^{\text{old}})^2} Y_i^2. \tag{13}
\]

We know that $Y_i \sim \mathcal{N}\big( \sum_{j=1}^{m} a_{ij}\lambda_j, \sum_{j=1}^{m} a_{ij}\lambda_j \big)$, so we can write
\[
Y_i = \sum_{j=1}^{m} a_{ij}\lambda_j + \sqrt{\sum_{j=1}^{m} a_{ij}\lambda_j}\; \xi_i,
\]
where the $\xi_i \sim \mathcal{N}(0, 1)$ are independent for all $i$. This means
\[
Y_i^2 = Y_i \Big( \sum_{j=1}^{m} a_{ij}\lambda_j + \sqrt{\sum_{j=1}^{m} a_{ij}\lambda_j}\; \xi_i \Big) = Y_i \sum_{j=1}^{m} a_{ij}\lambda_j + \text{lower order terms in } \lambda_j.
\]
Substituting this in (13) and taking $(\lambda_j^{\text{old}})^2$ out of the sum, we find
\[
\lambda_j^2 = \frac{(\lambda_j^{\text{old}})^2}{\sum_{i=1}^{n} a_{ij}} \sum_{i=1}^{n} \frac{a_{ij}}{\sum_{k=1}^{m} a_{ik}\lambda_k^{\text{old}}} Y_i. \tag{14}
\]

We can compare this equation with the iteration formula (4) of the Poisson problem,
\[
\lambda_j = \frac{\lambda_j^{\text{old}}}{\sum_{i=1}^{n} a_{ij}} \sum_{i=1}^{n} \frac{Y_i a_{ij}}{\sum_{k=1}^{m} a_{ik}\lambda_k^{\text{old}}}.
\]
Note that the $Y_i$ of the normal model (8) and the $Y_i$ of the Poisson model (1) have approximately the same distribution when $\sum_{j=1}^{m} a_{ij}\lambda_j$ is large. Let
\[
D_j = \frac{1}{\sum_{i=1}^{n} a_{ij}} \sum_{i=1}^{n} \frac{Y_i a_{ij}}{\sum_{k=1}^{m} a_{ik}\lambda_k^{\text{old}}}.
\]
Then we can rewrite the iterative formula (14) as
\[
\lambda_j^2 = (\lambda_j^{\text{old}})^2 D_j,
\]
and for the Poisson formula (4) we obtain
\[
\lambda_j = \lambda_j^{\text{old}} D_j.
\]
Since $\lambda_j^{\text{old}} \to \lambda_j$ for any convergent solution, we find that $D_j \to 1$ for all $j = 1, \dots, m$ in both models. So there are similarities between the two, and if it can be proven that $D_j = 1$ has a unique solution, then both formulas have the same stable values.
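The quantity $D_j$ also gives a simple numerical convergence check. A small MATLAB sketch, assuming A, Y and the current iterate lambda (m-by-1) are given:

D = (A'*(Y ./ (A*lambda))) ./ sum(A,1)';   %D_j at the current lambda
disp(max(abs(D - 1)))                      %distance from the fixed point D_j = 1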


10 Simulations

We now have an iteration formula for the normal problem (8), but it is not easy to find an explicit solution. That is why we run some simulations and check whether the iterative formula (12) has the same behaviour as the original formula (4) for the Poisson data. We will generate the independent weights $a_{ij}$ in two different ways: exponentially distributed weights and fixed weights. Generating the weights $a_{ij}$ exponentially with mean $\frac{1}{m}$, we find $a_{ij} \ge 0$ and
\[
E[a_{ij}] = \frac{1}{m} \qquad\text{and}\qquad E\Big[ \sum_{j=1}^{m} a_{ij} \Big] = 1.
\]
Then we normalize the weights $a_{ij}$,
\[
a_{ij} \leftarrow \frac{a_{ij}}{\sum_{j=1}^{m} a_{ij}},
\]
to make sure that $\sum_{j=1}^{m} a_{ij} = 1$.
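A minimal MATLAB sketch of this weight generation (the full simulation scripts are in the Technical Appendix):

A = -log(rand(n,m))./m;                  %exponential values with mean 1/m
A = A ./ repmat(sum(A,2), 1, m);         %normalize rows so that sum_j a_ij = 1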

We can deduce a fixed scheme for the weights from the double-slit interference problem. Since the weight $a_{ij}$ represents the proportion of photons moving from point $i$ to point $j$, we can assume that this proportion depends only on the distance between $i$ and $j$. This results in $A$ having the same elements on every subdiagonal. We can take the following values for the weights on every subdiagonal:
\[
b_k = \begin{cases} \frac{1}{(k+1)^z} & k \ge 0, \\ b_{-k} & k < 0, \end{cases} \qquad a_{ij} = b_{i-j},
\]
where we choose the diversion factor $z \in (0, 2]$. We see that the main diagonal has the largest value because most photons stay at the same point. Note that $\sum_{j=1}^{m} a_{ij} \neq 1$, so we have to normalize the weights:
\[
a_{ij} \leftarrow \frac{a_{ij}}{\sum_{j=1}^{m} a_{ij}}.
\]
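A small MATLAB sketch of this construction for the case $n = m$, assuming the exponent z is defined, and using the symmetric Toeplitz structure of $A$:

A = toeplitz(1 ./ (1:m).^z);             %A(i,j) = b_{|i-j|} = 1/(|i-j|+1)^z
A = A ./ repmat(sum(A,2), 1, m);         %normalize rows so that sum_j a_ij = 1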

With this in mind we look into five scenarios in our simulations.

1. The weights are exponentially generated and $\lambda_{\text{real}} = 1$.
2. The weights are exponentially generated and $\lambda_{\text{real}} = 10000$.
3. The weights are fixed and $\lambda_{\text{real}} = 1$, $z = 2$.
4. The weights are fixed and $\lambda_{\text{real}} = 10000$, $z = 2$.
5. The weights are fixed and $\lambda_{\text{real}} = 10000$, $z = 1/100$.

In the following simulations we use $m = n = 100$ and take 500 as the starting value for our approximation.


10.1 Normal model with variance independent of λ (First model)

We simulate the first model, which resulted in the Landweber iteration (7). This gives the following results:

Scenario 1) The weights are exponentially generated and $\lambda_{\text{real}} = 1$. [Figure omitted.]

Scenario 2) The weights are exponentially generated and $\lambda_{\text{real}} = 10000$. [Figure omitted.]

Scenario 3) The weights are fixed and $\lambda_{\text{real}} = 1$, $z = 2$. [Figure omitted.]

Scenario 4) The weights are fixed and $\lambda_{\text{real}} = 10000$, $z = 2$. [Figure omitted.]

For scenarios 1-4 we notice that the minimal value was always attained in the second iteration step.

Scenario 5) The weights are fixed and $\lambda_{\text{real}} = 10000$, $z = 1/100$. [Figure omitted.]

We see that the iteration formula gives a linear result.

10.2 Normal model with variance dependent on λ (Second model)

We use the iterative formula (12) to obtain the following simulations:

Scenario 1) The weights are exponentially generated and $\lambda_{\text{real}} = 1$. [Figure omitted.]

Scenario 2) The weights are exponentially generated and $\lambda_{\text{real}} = 10000$. [Figure omitted.]

Scenario 3) The weights are fixed and $\lambda_{\text{real}} = 1$, $z = 2$. [Figure omitted.]

Scenario 4) The weights are fixed and $\lambda_{\text{real}} = 10000$, $z = 2$. [Figure omitted.]

Scenario 5) The weights are fixed and $\lambda_{\text{real}} = 10000$, $z = 1/100$. [Figure omitted.]

10.3 Poisson Model

We can compare this with simulations done using the iterative formula (4) that we found for the Poisson model:

Scenario 1) The weights are exponentially generated and $\lambda_{\text{real}} = 1$. [Figure omitted.]

Scenario 2) The weights are exponentially generated and $\lambda_{\text{real}} = 10000$. [Figure omitted.]

Scenario 3) The weights are fixed and $\lambda_{\text{real}} = 1$, $z = 2$. [Figure omitted.]

Scenario 4) The weights are fixed and $\lambda_{\text{real}} = 10000$, $z = 2$. [Figure omitted.]

Scenario 5) The weights are fixed and $\lambda_{\text{real}} = 10000$, $z = 1/100$. [Figure omitted.]

10.4 Discussion of simulation results

By comparing the models with large values of $\lambda_{\text{real}}$, we see that the results are very similar for the normal model with variance dependent on $\lambda$ and the Poisson model. This supports our assumption that the EM-algorithm for the normal model we used is comparable with the EM-algorithm for the Poisson model. For the lower values of $\lambda_{\text{real}}$ we see very different behaviour for each of the models, and we cannot base any conclusions on them.


11 Technical Appendix

In this appendix, we give the proofs of Lemma 5.1, Lemma 6.1 and Claim 8.1. In the second part, we provide the MATLAB code used for the simulations in Section 10.

Proof of Lemma 5.1: Since $X_1$ and $X_2$ are independent, we know that $X_1 + X_2$ has a Poisson distribution with parameter $\lambda_1 + \lambda_2$ [1]. Using the definition of conditional probability and the independence of $X_1$ and $X_2$ we find
\[
\begin{aligned}
p(X_1 = x \mid X_1 + X_2 = y) &= \frac{p(X_1 = x,\, X_1 + X_2 = y)}{p(X_1 + X_2 = y)} = \frac{p(X_1 = x,\, X_2 = y - x)}{p(X_1 + X_2 = y)} \\
&= \frac{\frac{\lambda_1^x}{x!} e^{-\lambda_1} \cdot \frac{\lambda_2^{y-x}}{(y-x)!} e^{-\lambda_2}}{\frac{(\lambda_1 + \lambda_2)^y}{y!} e^{-\lambda_1 - \lambda_2}} = \frac{y!}{x!\,(y-x)!} \cdot \frac{\lambda_1^x \lambda_2^{y-x}}{(\lambda_1 + \lambda_2)^y} \\
&= \binom{y}{x} \Big( \frac{\lambda_1}{\lambda_1 + \lambda_2} \Big)^x \Big( \frac{\lambda_2}{\lambda_1 + \lambda_2} \Big)^{y-x} = \binom{y}{x} \Big( \frac{\lambda_1}{\lambda_1 + \lambda_2} \Big)^x \Big( 1 - \frac{\lambda_1}{\lambda_1 + \lambda_2} \Big)^{y-x}.
\end{aligned}
\]
This is the probability function of $\mathrm{Bin}\big( y,\, \frac{\lambda_1}{\lambda_1 + \lambda_2} \big)$.

Proof of Lemma 6.1: Using the definition of conditional probability we find
\[
p(X_1 = x \mid X_2 = y) = \frac{p(X_1 = x, X_2 = y)}{p(X_2 = y)}.
\]
We know that the joint density is
\[
p(X_1 = x, X_2 = y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}} \exp\Big( -\frac{1}{2(1 - \rho^2)} \Big[ \frac{(x - \mu_1)^2}{\sigma_1^2} - \frac{2\rho}{\sigma_1\sigma_2}(x - \mu_1)(y - \mu_2) + \frac{(y - \mu_2)^2}{\sigma_2^2} \Big] \Big).
\]
Dividing by the density of $X_2$ gives us the following result:
\[
\begin{aligned}
p(X_1 = x \mid X_2 = y) &= \frac{1}{\sigma_1\sqrt{1 - \rho^2}\sqrt{2\pi}} \exp\Big( -\frac{1}{2(1 - \rho^2)} \Big[ \frac{(x - \mu_1)^2}{\sigma_1^2} - \frac{2\rho}{\sigma_1\sigma_2}(x - \mu_1)(y - \mu_2) + \big( 1 - (1 - \rho^2) \big) \frac{(y - \mu_2)^2}{\sigma_2^2} \Big] \Big) \\
&= \frac{1}{\sigma_1\sqrt{1 - \rho^2}\sqrt{2\pi}} \exp\Bigg( -\frac{1}{2(1 - \rho^2)} \Big( \frac{x - \mu_1}{\sigma_1} - \frac{\rho(y - \mu_2)}{\sigma_2} \Big)^2 \Bigg).
\end{aligned}
\]
This is the density of $\mathcal{N}\big( \mu_1 + \frac{\sigma_1}{\sigma_2}\rho(y - \mu_2),\, (1 - \rho^2)\sigma_1^2 \big)$, which proves the lemma.

Proof of Claim 8.1: We can write the joint distribution of $M_{ij}$ and $Y_i$ as
\[
\begin{pmatrix} M_{ij} \\ Y_i \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} a_{ij}\lambda_j \\ \sum_{j=1}^{m} a_{ij}\lambda_j \end{pmatrix}, \begin{pmatrix} a_{ij}\lambda_j & a_{ij}\lambda_j \\ a_{ij}\lambda_j & \sum_{j=1}^{m} a_{ij}\lambda_j \end{pmatrix} \right).
\]
We can look at $M'_{ij} = \alpha Y_i + \xi_{ij}$. We have
\[
E[M'_{ij}] = \alpha E[Y_i] + E[\xi_{ij}], \qquad \mathrm{Var}(M'_{ij}) = \alpha^2 \mathrm{Var}(Y_i) + \mathrm{Var}(\xi_{ij}).
\]
Thus we find that
\[
\begin{pmatrix} M'_{ij} \\ Y_i \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} \alpha \sum_{j=1}^{m} a_{ij}\lambda_j + E[\xi_{ij}] \\ \sum_{j=1}^{m} a_{ij}\lambda_j \end{pmatrix}, \begin{pmatrix} \alpha^2 \sum_{j=1}^{m} a_{ij}\lambda_j + \mathrm{Var}(\xi_{ij}) & B_{ij} \\ B_{ij} & \sum_{j=1}^{m} a_{ij}\lambda_j \end{pmatrix} \right),
\]
where $B_{ij} = \mathrm{Cov}(M'_{ij}, Y_i)$. Note that $\xi_{ij}$ and $Y_i$ are independent, so we find
\[
\mathrm{Cov}(M'_{ij}, Y_i) = \alpha\, \mathrm{Var}(Y_i) = \alpha \sum_{j=1}^{m} a_{ij}\lambda_j.
\]
Substituting $\alpha$ and $\xi_{ij}$ as in the claim we find
\[
\begin{pmatrix} M'_{ij} \\ Y_i \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} a_{ij}\lambda_j \\ \sum_{j=1}^{m} a_{ij}\lambda_j \end{pmatrix}, \begin{pmatrix} a_{ij}\lambda_j & a_{ij}\lambda_j \\ a_{ij}\lambda_j & \sum_{j=1}^{m} a_{ij}\lambda_j \end{pmatrix} \right).
\]
So $M_{ij}$ and $M'_{ij}$ have the same distribution.


Code for normal problem with variance independent of λ, exponential weights

%This program runs the iterative formula found for the
%Normal(a_ij lambda_j, a_ij) distribution
%with exponentially distributed weights,
%for simulations

%%%%%%%%%%%%%%%%%%%% PARAMETERS %%%%%%%%%%%%%%%%%%%%%
n = 100;                     %rows of matrix A
m = 100;                     %columns of matrix A
true = 10000;
lambdareal = true*ones(1,m); %true lambda
k = 300;                     %nr of iteration steps

%%%%%%%%%%%%% Objects for computations %%%%%%%%%%%%%%%%
A = zeros(n,m);              %weight matrix
A_init = zeros(n,m);         %for construction of weight matrix
%lambda = 0.001*ones(k,m);
lambda = 500*ones(k,m);      %starting value
N = zeros(n,m);              %matrix of normal variables N_ij
u = zeros(k,1);              %for approx. error in each iteration step

%%%%%%%%% COMPUTATION OF MATRIX A %%%%%%%%%%%%%%%%%%%%%
A_init = -log(rand(n,m))./m; %generate exponential values
S = sum(A_init,2);
for i = 1:n                  %normalize the values
    for j = 1:m
        A(i,j) = A_init(i,j)/S(i);
    end
end

%%%%%%%%% COMPUTATION OF VARIABLES N %%%%%%%%%%%%%%%%%%
for i = 1:n                  %generate normal data
    for j = 1:m
        N(i,j) = A(i,j)*lambdareal(1,j) + sqrt(A(i,j))*randn(1);
    end
end

Sumi = sum(A);               %sum over the i : sum_i A_ij
Y = sum(N,2);                %sum over the j : sum_j N_ij

u(1) = (lambda(1,1)-lambdareal(1,1))^2;
%average error of starting value

%%%%%%%%%%%%%%%% ITERATIONS %%%%%%%%%%%%%%%%%%%%%
temp = zeros(n,1);           %for constructing lambda
for t = 1:k-1
    Q = A*lambda';           %Q(i,t) = sum_j a_ij lambda_j^t
    for j = 1:m
        weight = 1/Sumi(j);
        for i = 1:n
            temp(i) = A(i,j)*(Y(i)-Q(i,t));
        end
        lambda(t+1,j) = lambda(t,j) + weight*sum(temp);
    end
    M = (lambda(t+1,:)-lambdareal).^2;
    u(t+1) = 1/m * sum(M);   %average squared error
end

%%%%%%%%%%%%%%% PLOTTING %%%%%%%%%%%%%%%%%%%%%
x = 1:k;
z = 2;
w = 2;
c(x) = lambdareal(1);
plot(x(z:end), lambda(x(z:end),1), x(z:end), lambda(x(z:end),25),...
    x(z:end), lambda(x(z:end),50), x(z:end), lambda(x(z:end),75),...
    x(1:end), c(x(1:end)))
xlabel('iteration step')
ylabel('value')
title(['evaluation of certain \lambda_j with ',...
    'exponentially generated constants, \lambda_{start} = 500'])
legend('\lambda_1','\lambda_{25}','\lambda_{50}',...
    '\lambda_{75}','real \lambda')
figure;
scatter(x(w:end), u(x(w:end)))
xlabel('iteration step')
ylabel('error')
title(['error of approximation with exponentially ',...
    'generated constants, \lambda_{start} = 500'])

Code for normal problem with variance independent of λ, fixed weights

%This program runs the iterative formula found for the
%Normal(a_ij lambda_j, a_ij) distribution
%with fixed weights,
%for simulations

%%%%%%%%%%%%%%%%%%%% PARAMETERS %%%%%%%%%%%%%%%%%%%%%
n = 100;                     %rows of matrix A
m = 100;                     %columns of matrix A
true = 10000;
lambdareal = true*ones(1,m); %true lambda
k = 300;                     %nr of iteration steps

%%%%%%%%%%%%% Objects for computations %%%%%%%%%%%%%%%%
A = zeros(n,m);              %weight matrix
A_init = zeros(n,m);         %for construction of weight matrix
%lambda = 0.001*ones(k,m);
lambda = 500*ones(k,m);      %starting value
N = zeros(n,m);              %matrix of normal variables N_ij
u = zeros(k,1);              %for approx. error in each iteration step

%%%%%%%%% COMPUTATION OF MATRIX A %%%%%%%%%%%%%%%%%%%%%
b = zeros(n,1);
for l = 1:n
    b(l) = 1/(l+1)^2;        %z = 2
    %b(l) = 1/(l+1)^1;
end
for z = 1:n/m                %assumes n is divisible by m
    for i = 1:m
        for j = 1:m
            A_init((z-1)*m+i,j) = b(abs(i-j)+1);
        end
    end
end
for i = 1:n                  %normalize the values
    for j = 1:m
        A(i,j) = A_init(i,j)/sum(A_init(i,:));
    end
end

%%%%%%%%% COMPUTATION OF VARIABLES N %%%%%%%%%%%%%%%%%%
for i = 1:n                  %generate normal data
    for j = 1:m
        N(i,j) = A(i,j)*lambdareal(1,j) + sqrt(A(i,j))*randn(1);
    end
end

Sumi = sum(A);               %sum over the i : sum_i A_ij
Y = sum(N,2);                %sum over the j : sum_j N_ij
u(1) = (lambda(1,1)-lambdareal(1,1))^2;
%average error of starting value

%%%%%%%%%%%%%%%% ITERATIONS %%%%%%%%%%%%%%%%%%%%%
temp = zeros(n,1);           %for constructing lambda
for t = 1:k-1
    Q = A*transpose(lambda); %Q(i,t) = sum_j a_ij lambda_j^t
    for j = 1:m
        weight = 1/Sumi(j);
        for i = 1:n
            temp(i) = A(i,j)*(Y(i)-Q(i,t));
        end
        lambda(t+1,j) = lambda(t,j) + weight*sum(temp);
    end
    M = (lambda(t+1,:)-lambdareal).^2;
    u(t+1) = 1/m * sum(M);   %average squared error
end

%%%%%%%%%%%%%%% PLOTTING %%%%%%%%%%%%%%%%%%%%%
x = 1:k;
z = 2;
w = 2;
c(x) = lambdareal(1);
plot(x(z:end), lambda(x(z:end),1), x(z:end), lambda(x(z:end),25),...
    x(z:end), lambda(x(z:end),50), x(z:end), lambda(x(z:end),75),...
    x(1:end), c(x(1:end)))
xlabel('iteration step')
ylabel('value')
title(['evaluation of certain \lambda_j with ',...
    'fixed constants, \lambda_{start} = 500'])
legend('\lambda_1','\lambda_{25}','\lambda_{50}',...
    '\lambda_{75}','real \lambda')
figure;
scatter(x(w:end), u(x(w:end)))
xlabel('iteration step')
ylabel('error')
title(['error of approximation with ',...
    'fixed constants, \lambda_{start} = 500'])

Code for normal problem with variance dependent on λ, exponential weights

%This program runs the iterative formula found for the
%Normal(a_ij lambda_j, a_ij lambda_j) distribution
%with exponentially distributed weights,
%for simulations

%%%%%%%%%%%%%%%%%%%% PARAMETERS %%%%%%%%%%%%%%%%%%%%%
n = 100;                     %rows of matrix A
m = 100;                     %columns of matrix A
true = 10000;
lambdareal = true*ones(1,m); %true lambda
k = 300;                     %nr of iteration steps

%%%%%%%%%%%%% Objects for computations %%%%%%%%%%%%%%%%
A = zeros(n,m);              %weight matrix
A_init = zeros(n,m);         %for construction of weight matrix
%lambda = 0.001*ones(k,m);
lambda = 500*ones(k,m);      %starting value
N = zeros(n,m);              %matrix of normal variables N_ij
u = zeros(k,1);              %for approx. error in each iteration step

%%%%%%%%% COMPUTATION OF MATRIX A %%%%%%%%%%%%%%%%%%%%%
A_init = -log(rand(n,m))./m; %generate exponential values
S = sum(A_init,2);
for i = 1:n                  %normalize the values
    for j = 1:m
        A(i,j) = A_init(i,j)/S(i);
    end
end

%%%%%%%%% COMPUTATION OF VARIABLES N %%%%%%%%%%%%%%%%%%
for i = 1:n                  %generate normal data
    for j = 1:m
        par = A(i,j)*lambdareal(1,j);
        N(i,j) = par + sqrt(par)*randn(1);
    end
end

Sumi = sum(A);               %sum over the i : sum_i A_ij
Y = sum(N,2);                %sum over the j : sum_j N_ij
u(1) = (lambda(1,1)-lambdareal(1,1))^2;
%average error of starting value

%%%%%%%%%%%%%%%% ITERATIONS %%%%%%%%%%%%%%%%%%%%%
temp = zeros(n,1);           %for constructing lambda
for t = 1:k-1
    Q = A*lambda';           %Q(i,t) = sum_j a_ij lambda_j^t
    for j = 1:m
        weight = 1/Sumi(j);
        for i = 1:n
            temp(i) = -lambda(t,j) - A(i,j)*lambda(t,j)^2/Q(i,t)^2*Y(i)^2 +...
                A(i,j)*lambda(t,j)^2/Q(i,t);
        end
        q = sum(temp)*weight;
        p = n * weight;
        lambda(t+1,j) = -p/2 + (p^2/4-q)^(1/2);   %positive root (12)
    end
    M = (lambda(t+1,:)-lambdareal).^2;
    u(t+1) = 1/m * sum(M);   %average squared error
end

%%%%%%%%%%%%%%% PLOTTING %%%%%%%%%%%%%%%%%%%%%
x = 1:k;
z = 2;
w = 2;
c(x) = lambdareal(1);
plot(x(z:end), lambda(x(z:end),1), x(z:end), lambda(x(z:end),25),...
    x(z:end), lambda(x(z:end),50), x(z:end), lambda(x(z:end),75),...
    x(1:end), c(x(1:end)))
xlabel('iteration step')
ylabel('value')
title(['evaluation of certain \lambda_j with ',...
    'exponentially generated constants, \lambda_{start} = 500'])
legend('\lambda_1','\lambda_{25}',...
    '\lambda_{50}','\lambda_{75}','real \lambda')
figure;
scatter(x(w:end), u(x(w:end)))
xlabel('iteration step')
ylabel('error')
title(['error of approximation with exponentially ',...
    'generated constants, \lambda_{start} = 500'])

Code for normal problem with variance dependent on λ, fixed weights

%This program runs the iterative formula found for the
%Normal(a_ij lambda_j, a_ij lambda_j) distribution
%with fixed weights,
%for simulations

%%%%%%%%%%%%%%%%%%%% PARAMETERS %%%%%%%%%%%%%%%%%%%%%
n = 100;                     %rows of matrix A
m = 100;                     %columns of matrix A
true = 10000;
lambdareal = true*ones(1,m); %true lambda
k = 300;                     %nr of iteration steps

%%%%%%%%%%%%% Objects for computations %%%%%%%%%%%%%%%%
A = zeros(n,m);              %weight matrix
A_init = zeros(n,m);         %for construction of weight matrix
%lambda = 0.001*ones(k,m);
lambda = 500*ones(k,m);      %starting value
N = zeros(n,m);              %matrix of normal variables N_ij
u = zeros(k,1);              %for approx. error in each iteration step

%%%%%%%%% COMPUTATION OF MATRIX A %%%%%%%%%%%%%%%%%%%%%
b = zeros(n,1);
for l = 1:n
    b(l) = 1/(l+1)^2;        %z = 2
    %b(l) = 1/(l+1)^1;
end
for z = 1:n/m                %assumes n is divisible by m
    for i = 1:m
        for j = 1:m
            A_init((z-1)*m+i,j) = b(abs(i-j)+1);
        end
    end
end
for i = 1:n                  %normalize the values
    for j = 1:m
        A(i,j) = A_init(i,j)/sum(A_init(i,:));
    end
end

%%%%%%%%% COMPUTATION OF VARIABLES N %%%%%%%%%%%%%%%%%%
for i = 1:n                  %generate normal data
    for j = 1:m
        par = A(i,j)*lambdareal(1,j);
        N(i,j) = par + sqrt(par)*randn(1);
    end
end

Sumi = sum(A);               %sum over the i : sum_i A_ij
Y = sum(N,2);                %sum over the j : sum_j N_ij
u(1) = (lambda(1,1)-lambdareal(1,1))^2; %average error of starting value

%%%%%%%%%%%%%%%% ITERATIONS %%%%%%%%%%%%%%%%%%%%%
temp = zeros(n,1);           %for constructing lambda
for t = 1:k-1
    Q = A*lambda';           %Q(i,t) = sum_j a_ij lambda_j^t
    for j = 1:m
        weight = 1/Sumi(j);
        for i = 1:n
            temp(i) = -lambda(t,j) - A(i,j)*lambda(t,j)^2/Q(i,t)^2*Y(i)^2 +...
                A(i,j)*lambda(t,j)^2/Q(i,t);
        end
        q = sum(temp)*weight;
        p = n * weight;
        lambda(t+1,j) = -p/2 + (p^2/4-q)^(1/2);   %positive root (12)
    end
    M = (lambda(t+1,:)-lambdareal).^2;
    u(t+1) = 1/m * sum(M);   %average squared error
end

%%%%%%%%%%%%%%% PLOTTING %%%%%%%%%%%%%%%%%%%%%
x = 1:k;
z = 9;
w = 17;
c(x) = lambdareal(1);
plot(x(z:end), lambda(x(z:end),1), x(z:end), lambda(x(z:end),25),...
    x(z:end), lambda(x(z:end),50), x(z:end), lambda(x(z:end),75),...
    x(1:end), c(x(1:end)))
xlabel('iteration step')
ylabel('value')
title(['evaluation of certain \lambda_j ',...
    'with fixed constants, \lambda_{start} = 500'])
legend('\lambda_1','\lambda_{25}','\lambda_{50}',...
    '\lambda_{75}','real \lambda')
figure;
scatter(x(w:end), u(x(w:end)))
xlabel('iteration step')
ylabel('error')
title(['error of approximation with ',...
    'fixed constants, \lambda_{start} = 500'])

Code for Poisson problem with parameter a_ij λ_j, exponential weights

%This program runs the iterative formula found for the
%Poisson(a_ij lambda_j) distribution
%with exponentially distributed weights,
%for simulations

[The remainder of this listing is missing in the source.]
