• No results found

in a Multiple State Model for Disability Insurance

N/A
N/A
Protected

Academic year: 2021

Share "in a Multiple State Model for Disability Insurance"

Copied!
78
0
0

Bezig met laden.... (Bekijk nu de volledige tekst)

Hele tekst

(1)

Master’s Thesis Actuarial Studies Application of Claim Reserving

in a Multiple State Model for Disability Insurance

Author: P.W. Bultena

1502786

pieter@bultenanet.nl

Date: October 2009

Faculty: Economics & Business

Master: Econometrics, Operations Research & Actuarial Studies Specialisation: Actuarial Studies

Supervisors: prof. dr. R.H. Koning

dr. L. Spierdijk

(2)

ii

(3)

Master’s Thesis Actuarial Studies

Application of Claim Reserving in a Multiple State Model for Disability Insurance

Abstract

In this thesis, we present a simulation model for the estimation of the prospective reserves of a

disability insurance. The model is based on data from a large Dutch insurance company. We

estimate the reserves for a disabled individual and take into account covariates about age and

unemployment rate. Besides that we estimate the future claims in a portfolio of an insurer. The

estimation outcomes are compared to actual observations. The simulation procedure is based on

a multiple state model. The transitions in this model are estimated by using proportional hazard

models. In order to include estimation uncertainty in our model, we describe parametric and

nonparametric bootstrap techniques.

(4)

iv

(5)

Preface

This master thesis is the result of my graduation project in order to obtain the degree Master of Science in Econometrics, Operations Research and Actuarial Studies. My specialization in this master study is Actuarial Studies.

First of all I want to thank prof. dr. R.H. Koning and dr. L. Spierdijk for giving me the op- portunity for this research. I have benefited from comments of two supervisors instead of one supervisor. I also would like to thank Bj¨orn Wijbenga for his critical review of this thesis.

Although a master study lasts one year, including the previous bachelor study, it feels like gradu- ation after five years. Therefore, I also want to thank my fellow students for a great time during this period. I also want to thank my parents for giving me the opportunity to complete my study.

Pieter Bultena

Groningen, October 2009

(6)

vi

(7)

Contents

1 Introduction 1

1.1 Solvency II . . . . 1

1.2 Research question and literature review . . . . 2

1.3 Thesis outline . . . . 2

2 Multiple state disability models 3 2.1 Introduction of the disability model . . . . 3

2.2 Markov and semi-Markov processes . . . . 4

2.3 Derivation of transition probabilities . . . . 5

2.4 Transition probabilities in a simulation process . . . . 9

2.5 Summary . . . . 9

3 Data and sample statistics 11 4 Modeling durations in the disability states 15 4.1 Nelson-Aalen and Kaplan-Meier estimators . . . . 16

4.1.1 Estimation results . . . . 17

4.2 Proportional Hazard Models . . . . 17

4.2.1 Tied data . . . . 18

4.2.2 Coefficient estimation results . . . . 19

4.2.3 Testing the proportional hazard assumption . . . . 20

4.3 Smoothing the hazard function . . . . 24

4.3.1 B-spline models of the hazard rate . . . . 24

4.4 Estimating uncertainty using bootstrap methods . . . . 28

4.4.1 Literature research about bootstrapping censored survival data . . . . 28

4.4.2 Parametric versus nonparametric bootstrap . . . . 29

4.4.3 Nonparametric bootstrap with split episodes . . . . 30

4.5 Summary and remarks . . . . 31

5 Disability reserves 33 5.1 Individual reserves . . . . 34

5.2 Portfolio reserves of an insurer . . . . 40

5.3 Comparison to actual losses . . . . 45

5.4 Summary and remarks . . . . 45

6 Conclusion and recommendations 47 A Coefficient estimation in the proportional hazard model 49 A.1 Newton-Raphson method . . . . 50

B Time-dependent coefficient plots 51

C Penalized B-spline regression 57

(8)

Contents

D Additional figures and tables for section 4.4 59

E Parallel computing 63

F Additional figures and tables for section 5.2 65

(9)

Chapter 1

Introduction

Since August 2004, disability insurance for self-employed workers in the Netherlands is provided by private insurance companies. Disability insurance is a form of insurance that insures the income of the beneficiary against the risk of sick leave that makes working impossible. Before August 2004, self-employed workers were protected by the ‘Wet Arbeidsongeschiktheid Zelfstandigen’ (Law on Disability of Self-Employed). The self-employed were obliged to pay a premium for a collective disability insurance provided by the government.

When a self-employed contracts for a disability insurance, the insured is assumed to be healthy.

The insured receives benefit payments when he is not able to work at a 100% level due to illness.

We say that the insured is disabled, which intensity can vary from 0 to 100%. The benefit payments correspond to this intensity. Consider for example an insured with a disability factor of 40% during a period of two months. This means that the insured is expected to work three days in a week.

If the insured has an annual compensation benefit level of e60,000, the insurer’s payment equals e4,000.

When an insured becomes disabled, the insurer has to put reserves aside for future claims of that person. The estimation of reserves for a disability insurance at an insurance company is currently known as not very precise. With the advent of Solvency II, insurers have to develop a model based framework for decisions about the reserves of a disability insurance. The objective of this thesis is to create a model for these reserves, based on real data of an insurer.

1.1 Solvency II

Solvency II is the new being developed risk management regulatory framework by the European Union (EU). Eling, Schmeiser, and Schmit (2007) give a clear overview about the development process and a critical analysis. Solvency II will consist of a three-pillar structure of insurance supervision. First we have quantitative requirements. These are rules about determining the minimal capital required and the target capital. The minimal capital requirements depend on whether the business is related to life or non-life insurance. The target capital corresponds to the insurer’s economic capital for running its business, within a given safety level. The process of claims reserving falls under the first pillar.

The second pillar is about supervisory activities. This means that the risks from the quanti- tative models in the first pillar must be treated with suitable processes and decisions. The third pillar is about supervisory reporting and public disclosure. This implies considerations about market transparency and disclosure requirements. However, our attention is focused on reserves, which are covered by the first pillar.

A major implication of Solvency II is the possible use of internal, instead of standard, risk

models to determine the target capital. An internal model is constructed by the insurer for its

specific needs. In contrast, a standard model is designed by the regulator and used uniformly

across insurers. Internal models are expected to result in more accurate analysis of the insurer’s

(10)

Chapter 1. Introduction

financial situation than the more generic standard models. The model must be certified by the supervisor. This process requires detailed documentation of the internal model and its underlying assumptions. The model must be examined periodically to ensure that the model is properly adjusted to the dynamic financial environment.

The Solvency II model is not yet fully developed, but one of the European Commission’s main requirements is that the model should be easy to use. Internal models are usually very detailed and insurer-specific, which could lead to the conclusion that internal models may have more accuracy but also have high transaction costs. Smaller insurers dislike high transaction costs. So there must be an appropriate trade-off between the costs of an internal model and the increasing accuracy.

An argument for using standard models is that insurers which use it can be compared to each another. An argument against is the possibility of systematic risk. An unusual event in the capital or insurance market could encourage all insurers to take the exact same response.

1.2 Research question and literature review

We will present a disability insurance reserve model which can be used as an internal model. We have to model the disability durations of the insureds, but also the process of going from one disability state to another. We present a multiple state model, where insureds are in an active (healthy) state, in a mild disabled state, or in a severe disabled state. For both disability states we will estimate the prospective claim reserves. We are not only interested in a point estimate of the reserve, but also in its uncertainty. This leads us to the following research question of this thesis: “How can we measure the claim reserves and its uncertainty for a disability insurance in a multiple state model?”. We will answer this for an individual and for a portfolio of an insurer.

We use Spierdijk and Koning (2009) as starting point for this thesis. They have used the same data as we use, and have already estimated the transition intensities between the disability states by using proportional hazard models. There are many papers about health-related multiple state models. For example, Putter, Van Der Hage, De Bock, Elgalta, and Van De Velde (2006) present a multiple state model for the estimation and prediction of breast cancer. Czado and Rudolph (2002) present a multiple state model about long-term care insurance. Similarly as we will do, they estimate the transition intensities by proportional hazard models. These papers will be discussed shortly in this thesis. The computations of the reserves will be strongly based on Haberman and Pitacco (1999) and England and Verrall (2002). Especially the first one is a leading source about actuarial models in the field of disability insurance. However, they make some simplifying assumption, which are incorrect in our opinion. This will be made clear in the next chapter.

1.3 Thesis outline

In order to answer the research question, we first have to define a multiple state model. This model is presented in Chapter 2, where we also introduce notations about transition probabilities and transition intensities. We will fit historical data of an insurer to this multiple state model. This dataset is summarized in Chapter 3. With this data we can fit survival models for each disability state. Outcomes of Nelson-Aalen estimators, Kaplan-Meier estimators, and proportional hazard model fits are presented in Chapter 4. The results of the previous chapters come together in Chapter 5. In that chapter we use a simulation model to estimate future losses of an insurer.

In Section 5.1 the reserves for several individuals with certain characteristics are estimated and compared. In Section 5.2 we estimate the future portfolio losses of an insurer and compare them with actual observations. In Section 5.3 we do a similar estimation, but with true values of losses, in order to check whether the simulation model really works. In Chapter 6 we will draw our conclusions and give recommendations for further research.

The research in this thesis is done with a lot of techniques from the field of econometrics. Most

parts of this thesis will be very technically in order to come to our conclusions. We assume that

the reader has a significant level of statistics, equivalent to graduate students in econometrics.

(11)

Chapter 2

Multiple state disability models

In this chapter we will introduce a multiple state model for a disability insurance. The multiple state model will be the general framework in this thesis, which will be connected to calculations of prospective claim reserves. We will introduce basic notations which will be used in the remainder of this thesis.

2.1 Introduction of the disability model

When an insured is disabled, his level of disability, and corresponding benefit payment, is expressed in a percentage. Disability percentages can take any value between 0 and 100%, but it is not convenient to define all these states in our model. This would make the model huge. The objective is to create a simple model which approximates the true behaviour of insureds. We will define three disability states. The first state refers to a disability ratio of 0-25%. In this state, no benefit payments are made by the insurer. So you are either at work or you are in a unpaid sick leave situation. This state is denoted by the number 0. The two other states correspond to a disability ratio of 26-50% and 51-100%, which are respectivily denoted by the numbers 1 and 2. An insured can jump between these states with time-dependent intensities. There are six possible transitions:

0→1, 0→2, 1→0, 1→2, 2→0, and 2→1. However, an insurer does not record information about the duration in state 0 of an insured. Therefore we have no information about the transitions 0→1 and 0→2, and consequently we analyse four possible transtions. The definitions in this chapter are from Haberman and Pitacco (1999), which is a leading source about disability insurance. We will also present some papers which depend their analysis on their theories.

In this chapter we will introduce stochastic processes. From Ross (2007) we know that a stochastic process {X(t), t ∈ T } is a collection of random variables. This means that for each t ∈ T , X(t) is a random variable. The set T is the index of the process and is interpreted as time.

Therefore, X(t) is referred to the state of the process at time t. The stochastic process is called a discrete-time process when T is a countable set. When T is an interval of the real line, then the stochastic process is a called a continuous-time process.

In the disability model we formally have a state space S = {0, 1, 2}. Then we denote S(t) as the random state at time t, t ≥ 0, with values in the set S . Between one state and another state transitions take place. Transitions are formally defined as pairs (i, j) from the subset T :

T ⊆ {(i, j) | i 6= j; i, j ∈ S } . (2.1)

In our model, transitions take place with intensity µ

ij

(t), i, j ∈ S . This three-state model is

illustrated in Figure 2.1. We observe that state 0 is an absorbing state and states 1 and 2 are

transient states.

(12)

Chapter 2. Multiple state disability models

0

1 2

µ01(t) µ02(t)

µ12(t)

µ10(t) µ20(t)

µ21(t)

Figure 2.1: Multiple state disability model. The three states refer respectively to a disability ratio of 0-25%, 26-50% and 51-100%. The two transitions which are illustrated by the dashed arrows are not observed.

2.2 Markov and semi-Markov processes

On the topic of multiple state models, we have to introduce the subject about the Markov property.

Consider a stochastic process {S(t); t ≥ 0}, with a finite state space S consisting of the states i

0

, . . . , i

n1

, i

n

, j ∈ S . The state space corresponds to a finite set of times 0 ≤ t

0

< · · · < t

n−1

<

t

n

< u. If the following holds:

Pr {S(t

0

) = i

0

∧ · · · ∧ S(t

n−1

) = i

n−1

∧ S(t

n

) = i

n

∧ S(u) = j} > 0,

then {S(t); t ≥ 0} is called a time-continuous Markov chain if the following Markov property is satisfied:

Pr {S(u) = j | S(t

0

) ∧ · · · ∧ S(t

n−1

) = i

n−1

∧ S(t

n

) = i

n

}

= Pr {S(u) = j | S(t

n

) = i

n

} . (2.2)

This means that the transition probability only depends on the current state S(t

n

) = i

n

, and is independent of the path before i

n

. For 0 ≤ t < u we can define the transition probabilities:

P

ij

(t, u) := Pr {S(u) = j | S(t) = i} . (2.3) The relationship between transition probabilities and transition intensities is defined as follows:

µ

ij

(t) = lim

u→t

P

ij

(t, u)

u − t . (2.4)

The transition intensity can be interpreted as an instantaneous probability of going from state i to state j. The use of transition intensities is very useful since it depends on a single time variable, instead of two time variables in the transition probability. Actually, we first fit the transition intensities, from which we derive the transition probabilities. If we make a plot of the transition intensity versus time, we get an easy interpretation about the transition behavior.

In the three-state disability model illustrated in Figure 2.1, it is very likely that the transition

probabilities depend on the history of an insured. Consider for example two insureds, person A

and person B, both staying in state 2. Person A is staying there for one month and Person B is

staying there already for six months. At this moment it is likely that person A will move much

sooner to the active state, state 0, than person B. To model this, we have to consider semi-Markov

(13)

Application of Claim Reserving in a Multiple State Model for Disability Insurance

processes. Compared to a Markov process, we add a variable R(t), which represents the time spent in state S(t) up to time t since the last transition. Then we have the following stochastic process:

{S(t), R(t); t ≥ 0} .

This process is called a time-continuous, time-inhomogeneous Markov process. The process can also be denoted without the variable R(t), and is then is called a time-continuous, time- inhomogeneous semi-Markov process. Formally we have:

R(t) = max {τ : τ ≤ t, S(t − h) = S(t) ∀h ∈ [0, τ]} . (2.5) The transition probabilities now depend on the current state S(t) = i and the time spent in that state since the last transition, R(t) = r. The transition probabilities for 0 ≤ t ≤ u and r, w ≥ 0 are now defined as:

P

ij

(t, u, r, w) = Pr {S(u) = j ∧ R(u) ≤ w | S(t) = i ∧ R(t) = r} . (2.6) Then we can also redefine the transition intensity for all i 6= j, t ≥ 0, r ≥ 0:

µ

ij

(t, r) = lim

u→t

P

ij

(t, u, r, +∞)

u − t . (2.7)

In the next section we will introduce the Kolmogorov differential equations which solve the tran- sition probabilities in a multiple state model for a set of transition intensities.

2.3 Derivation of transition probabilities

Transition probabilities and transition intensities satisfy the Kolmogorov forward differential equa- tion:

d

dt P

ij

(z, t) = X

k:k6=j

P

ik

(z, t)µ

kj

(t) − P

ij

(z, t)µ

j

(u), (2.8) with

µ

i

(t) := X

j:j6=i

µ

ij

(t). (2.9)

Definition (2.9) is called the total intensity of decrement from state i. The Kolmogorov backward differential equation is as follows:

d

dz P

ij

(z, t) = P

ij

(z, t)µ

i

(z) − X

k:k6=i

P

kj

(z, t)µ

ik

(z). (2.10)

The probability of staying in a state, for 0 ≤ t < u and i ∈ S , is called the occupancy probability:

P

ii

(t, u) := Pr {S(z) = i ∀z ∈ [t, u] | S(t) = i} . (2.11) This definition is different from the definition of P

ii

(t, u), which also indicates the probability of being in the same state for 0 ≤ t < u, but one could have visited other states j, j 6= i between t and u. The relationship between occupancy probabilities and transition intensities can be derived from the Kolmogorov backward differential equation:

d

dt P

ii

(z, t) = −P

ii

(z, t)µ

i

(t). (2.12) Using the boundary condition P

ii

(z, z) = 1, we obtain a closed form formula:

P

ii

(z, t) = exp 

− Z

t

z

µ

i

(u)du 

. (2.13)

(14)

Chapter 2. Multiple state disability models

1 2

3

Figure 2.2: Three state model from Haberman and Pitacco (1999). The numbers refer respec- tively to ‘active’, ‘disabled’ and ‘dead’.

The derivation of the transition probabilities from the Kolmogorov differential equations is more complex. We will illustrate this by a problem described by Haberman and Pitacco (1999, Example 1.9). They present a three-state model, where an insured can be active (1), disabled (2) or dead (3). The model allows for recovery, so an insured can jump from the disabled state to the active state. The model is illustrated in Figure 2.2. Technically, this model is equal to the model presented in Figure 2.1, but with another interpretation. The transition intensities have the following interpretation:

µ

12

(t) = intensity of (total) disability;

µ

21

(t) = intensity of recovery;

µ

13

(t) = intensity of mortality for active lives;

µ

23

(t) = intensity of mortality for disabled lives.

In order to compute the transition probabilities, the following set of Kolmogorov forward differ- ential equations have been derived:

d

dt P

11

(z, t) = P

12

(z, t)µ

21

(t) − P

11

(z, t) [µ

12

(t) + µ

13

(t)] ; (2.14a) d

dt P

12

(z, t) = P

11

(z, t)µ

12

(t) − P

12

(z, t) [µ

21

(t) + µ

23

(t)] ; (2.14b) d

dt P

13

(z, t) = P

11

(z, t)µ

13

(t) + P

12

(z, t)µ

23

(t); (2.14c) d

dt P

21

(z, t) = P

22

(z, t)µ

21

(t) − P

21

(z, t) [µ

12

(t) + µ

13

(t)] ; (2.14d) d

dt P

22

(z, t) = P

21

(z, t)µ

12

(t) − P

22

(z, t) [µ

21

(t) + µ

23

(t)] ; (2.14e) d

dt P

23

(z, t) = P

22

(z, t)µ

23

(t) + P

21

(z, t)µ

13

(t). (2.14f) For the two transient states, 1 and 2, we can simply derive the occupancy probabilities:

P

11

(z, t) = exp 

− Z

t

z

12

(u) + µ

13

(u)] du 

; (2.15a)

P

22

(z, t) = exp 

− Z

t

z

21

(u) + µ

23

(u)] du 

. (2.15b)

The set of differential equations in (2.14) can be solved as follows. Equations (2.14a) and (2.14b)

form a pair of differential equations with two unknown functions P

11

(z, t) and P

12

(z, t). With

(15)

Application of Claim Reserving in a Multiple State Model for Disability Insurance

these two solutions, we can solve (2.14c). Similar computations can be done for (2.14d)-(2.14f).

This procedure can be illustrated as follows. From (2.14a) we have:

P

12

(z, t) = 1 µ

12

(t)

 d

dt P

11

(z, t) − P

11

(z, t) [µ

12

(t) + µ

13

(t)] 

. (2.16)

This can be substituted in (2.14b). After some manipulations we have:

d

2

dt

2

P

11

(z, t) +  d

dt log µ

21

(t) + µ

12

(t) + µ

13

(t) + µ

21

(t) + µ

23

(t)  d

dt P

11

(z, t) +  d

dt log µ

21

(t) [µ

12

(t)µ

13

(t)] + d

dt µ

12

(t) + d

dt µ

13

(t) − µ

12

(t)µ

21

(t) + [µ

12

(t) + µ

13

(t)] [µ

21

(t) + µ

23

(t)] 

P

11

(z, t) = 0. (2.17)

This can be solved with the boundary conditions:

P

11

(z, z) = 1;

d

dt P

11

(z, t)

t=z

= −µ

12

(z) − µ

13

(z).

The derivation of the transition probabilities is very complex. Therefore, the authors propose a simplified model with time-homogeneous transition intensities:

µ

12

(t) = µ

12

, µ

13

(t) = µ

13

, µ

21

(t) = µ

21

, µ

23

(t) = µ

23

.

Later on they make another simplified assumption: µ

13

= µ

23

= µ. This means that there is no difference in mortality for active or disabled insureds. This is an assumption which is not plausible for our disability model (Figure 2.1). An insured with a more severe state of disability has certainly a lower intensity of jumping to the active state 0.

The analysis of Haberman and Pitacco (1999) is continued by a semi-Markov process. The transition intensities are now defined as:

µ

12

(t) = intensity of (total) disability;

µ

21

(t, r) = intensity of recovery;

µ

13

(t) = intensity of mortality for active lives;

µ

23

(t, r) = intensity of mortality for disabled lives.

Now it is assumed that the transition from the disabled state to the active state or to the dead state depends on the duration of disability. Probably for high values of r, µ

21

(t) becomes smaller and µ

23

(t) becomes higher. For the transition probabilities we now have:

d

dt P

11

(z, t) = Z

t z

P

11

(z, v)µ

12

(v)P

22

(v, t, 0)µ

21

(t, t − v)dv

−P

11

(z, t) [µ

12

(t) + µ

13

(t)] . (2.18) For the occupancy probabilities we have:

d

dt P

22

(z, t, r) = −P

22

(z, t, r) [µ

21

(t, t − z + r) + µ

23

(t, t − z + r)] , (2.19) whose solution is:

P

22

(z, t, r) = exp 

− Z

t

z

21

(v, v − z + r) + µ

23

(v, v − z − r)] dv 

. (2.20)

(16)

Chapter 2. Multiple state disability models

The question is whether these solutions are plausible in our three-state disability model. Consider again person A and B, both staying in state 2, for respectively one and six months. We assume that before entering state 2, person A stayed in state 1 for 12 months, and person B stayed in the active state 0. Now it is maybe more likely that person B recovers sooner than person A, in contrast to our previous conclusion. However, it is not possible to include information about previous states in the semi-Markov model. Therefore we turn to a simulation algorithm in order to estimate the transition probabilities. The information for this simulation algorithm will be introduced in Section 2.4 and later on performed in Chapter 5.

The techniques of Haberman and Pitacco (1999) are used by many other authors. For example, Putter, Van Der Hage, De Bock, Elgalta, and Van De Velde (2006) present a multiple state model for the estimation and prediction of breast cancer. The model consists of a ‘surgery state’, a

‘death state’ and several states in between. The objective is to give insight into what happens to

a patient after the first event. Czado and Rudolph (2002) present a multiple state model about

long-term care insurance. They present a model in which an insured is either healthy, has care at

home, has care in a nursing home, or is dead. The authors connect this model to the calculations

of premiums and reserves. Similar computations will be done in this thesis. However, their model

does not allow for recovery, which is a crucial assumption in our model.

(17)

Application of Claim Reserving in a Multiple State Model for Disability Insurance

2.4 Transition probabilities in a simulation process

From now on we assume that the time-axis t is on a monthly scale. We have no solution for a closed formula of P

ij

(z, t) with information about the duration in the current state and the previous states. Consider Figure 2.3, where we have drawn a probability tree which starts in state 1 at t = 0 and which ultimately ends at t = 3. An insured walks a random path in this probability tree in which the transition probabilities P

ij

(t, t + 1) depend on the time since the previous transition. We say that the clock is reset after each transition. When an insured enters state 0, he leaves the system. One can observe that an insured can walk 15 possible unique paths if we look three months ahead. When we want to look 24 months ahead, the insured can walk 2

25

− 1 = 33, 554, 431 possible paths. The possible paths in the three-state model increases exponentially as the ‘distance’ between z and t becomes larger. This makes it computationally very hard to calculate P

ij

(z, t) analytically. Therefore we have to simulate sample paths in order to compute P

ij

(z, t). When we simulate sample paths, we walk through a probability tree. At each point in the tree, we draw a sample of the next step. Therefore we only have to calculate P

ij

(t, t + 1). Between t and t + 1, the insured can make only one transition or he stays in the current state. In that case we have P

ii

(t, t + 1) = P

ii

(t, t + 1). When P

ii

(t, t + 1) is computed by (2.13), the portion 1−P

ii

(t, t+1) ‘is left’ for the transition probabilities P

ij

(t, t+1). That portion can be divided proportionally to the transition intensities. This yields the following one-month transition and occupancy probabilities:

P

11

(t, t + 1) = exp



− Z

t+1

t

10

(u) + µ

12

(u)] du



; (2.21a)

P

10

(t, t + 1) = (1 − P

11

(t, t + 1))

R

t+1

t

µ

10

(u)du R

t+1

t

10

(u) + µ

12

(u)] du ; (2.21b) P

12

(t, t + 1) = 1 − P

11

(t, t + 1) − P

10

(t, t + 1); (2.21c)

P

22

(t, t + 1) = exp 

− Z

t+1

t

20

(u) + µ

21

(u)] du 

; (2.21d)

P

20

(t, t + 1) = (1 − P

22

(t, t + 1))

R

t+1

t

µ

20

(u)du R

t+1

t

20

(u) + µ

21

(u)] du ; (2.21e) P

21

(t, t + 1) = 1 − P

22

(t, t + 1) − P

20

(t, t + 1). (2.21f) These transition and occupancy probabilities will be used in Chapter 5. When we simulate sample paths, we can also easily depend P

ij

(t, t + 1) on covariates about the previous state and the total duration in the previous states, as we will show later on. In Chapter 4 we will estimate the transition intensities, based on data of an insurer, in order get results for (2.21). First, we will present the data of the insurer in Chapter 3.

2.5 Summary

In this chapter we presented a multiple state model for a disability insurance. We tried to analyze

the transitions in this model by a Markov process. In a Markov process, the transition probabilities

only depend on the current state. Since we believe that the transition probabilities must also

depend on the duration in the current state, we turned to a semi-Markov process. But we also

wanted to depend the transition probabilities on information about the previous states. Then

the use of analytic models stops and we have to make use of simulation models. When we do

a simulation, we only have to look one month ahead. For this simulation, we presented the

one-month transition probabilities.

(18)

Chapter 2. Multiple state disability models

1

2

2

2

P

22

(1, 2) 0

P

20

(1, 2)

1 P

21

(1, 2)

P

22

(0 , 1) P

20

(0, 1) 0

1

2

P

12

(0, 1) 0

P

10

(0, 1)

1 P

11

(0, 1)

P

21

(0, 1) P

12

(0 ,1) P

10

(0, 1) 0

1

2

2

P

22

(0, 1) 0

P

20

(0, 1)

1 P

21

(0, 1)

P

12

(1 , 2) P

10

(1, 2) 0

1

2

P

12

(2, 3) 0

P

10

(2, 3)

1 P

11

(2, 3)

P

11

(1, 2)

P

11

(0 ,1)

Figure 2.3: Probability tree with initial state 1 for 0 ≤ t ≤ 3. The clock is reset after a transition

to a new state. When an insured enters state 0 he leaves the system.

(19)

Chapter 3

Data and sample statistics

Our analysis about disability losses is based on a dataset provided by a large insurance company from the Netherlands. The dataset consists of 19,170 cases, which are observed from January 2000 up to and including September 2008. One case consists of consecutive monthly information about an individual insured in one of the two disability states. When an insured enters the active state, and later returns to a disability state, a new case is created. To illustrate these cases, we have outlined the information about case number 3 in Table 3.1. The variables start and stop indicate the monthly intervals in which the insured is disabled. The variable event indicates whether a transition event occurred after the interval (1) or not (0). When an event occurs and the insured jumps to another disability state, the clock is reset to 0. In this case we say that the insured starts a new episode sequence. At the end of September 2008 we have insureds in the dataset for which the episode sequence has not ended yet. For these episode sequences we do not observe a transition. We say that the durations of these episode sequences are right censored. The information about the disabled insured is presented by several covariates which are listed below:

age.at.t0 measures the age in years of an insured, when he enters the system. The age varies from 18.65 to 65.13 with mean equal to 42.82 and median equal to 42.59.

offset measures the number of months in the system before the previous transition. In 58.4% of the events, the offset is equal to zero. The remaining part varies from 1 to 102 with mean and median equal to respectivily 10.3 and 5.0.

previous.state indicates the state in which the insured was before the previous transition.

unemp stands for the unemployed labor force measured in percentage as measured by the Dutch

‘Centraal Bureau voor de Statistiek’ (CBS). See Figure 3.1 for an illustration of this rate over the time range of our dataset. The unemployment rate is given monthly, but is actually measured as the average of three months around that particular month.

growth stands for the quarterly data of the growth of the GDP, measured in percentages, com- pared to the previous period, as measured by the CBS. See Figure 3.2 for an illustration of this growth versus time.

Unfortunately, we have no information about gender in our dataset. There are many other covari- ates in the original dataset, but in order to keep our survey clear, we restrict our attention to these covariates. Since we split the data for each month in each case, we call this episode split data.

Episode splitting is needed when we deal with time-dependent covariates, like unemp and growth.

The episode split dataset consists of 161,738 rows. There are in total 27,162 transition events, divided in 9,234 events for transition 1→0, 2,454 for transition 1→2, 7,541 for transition 2→0 and 7,933 events for transition 2→1. There are 3,375 episode sequences which are right censored.

In Chapter 4 we will fit several survival models based on these data. Particularly, the propor-

tional hazard model is of interest, since this model allows for covariates.

(20)

Chapter 3. Data and sample statistics

Table 3.1: Disability data of a disabled insured. The two events in this case correspond respec- tively to transition 2→1 and transition 1→0. This case id consists of two episode sequences.

case id start end event age.at.t0 offset previous.state unemp growth

3 0 1 0 42.0 0 0 4.3 0.1

3 1 2 0 42.0 0 0 4.3 0.7

3 2 3 0 42.0 0 0 4.5 0.7

3 3 4 1 42.0 0 0 5.0 0.7

3 0 1 0 42.0 4 2 5.2 0.3

3 1 2 0 42.0 4 2 5.2 0.3

3 2 3 0 42.0 4 2 5.4 0.3

3 3 4 1 42.0 4 2 5.5 0.4

4 5 6 7

year

unemplo yment r ate

2002 2004 2006 2008

Figure 3.1: Monthly data of the covariate unemp.

(21)

Application of Claim Reserving in a Multiple State Model for Disability Insurance

0.5 1.0 1.5 2.0 2.5 3.0

year

gro wth

2000 2002 2004 2006 2008

Figure 3.2: Quarterly data of the covariate growth.

(22)

Chapter 3. Data and sample statistics

(23)

Chapter 4

Modeling durations in the disability states

In Chapter 2 we described a method to calculate the transition probabilities with the use of transition intensities in a multiple state model. In order to estimate the transition intensities we have to model the duration in each state. There are numerous ways of estimating durations. In this chapter we will focus on the hazard rate. The hazard rate can be used as transition intensity.

One often estimates the hazard rate in a model where only one transition is possible, i.e. a model without competing risks. Consider a number of n insured who are disabled (so we do not differentiate on different disability states). The durations T

i

until recovery of the ith insured, i = 1, . . . , n, are i.i.d. with density function f (t) and distribution function F (t). Since we do not observe all recoveries, we deal with right censoring. We have T

i

= min (T

i

, C

i

), where C

i

is the censoring time. We define a counting process Y

i

(t) = I {T

i

≥ t}, which indicates whether insured i is still in the active state, Y

i

= 1, or not, Y

i

= 0, at time t. The observations Y

i

are right censored.

The dataset Y is called survival data. The hazard rate is now defined as:

λ(t) = lim

∆→0

Pr {Y ∈ [t, t + ∆t) | Y > t}

∆t . (4.1)

From this definition we can interpret the hazard rate as an instantaneous probability of going from one state to another state. An equivalent definition of the hazard rate is given by:

λ(t) = f (t)

S(t) , (4.2)

where

S(t) = 1 − F (t). (4.3)

Definition (4.3) is called the survival function

1

. In the previous example, we have one possible transition. In the three-state disability model (Figure 2.1), there are four possible transitions. The data from Chapter 3 is used to model these four transitions. The dataset is split in four subsets, denoted by tr10, tr12, tr20 and tr21, consisting respectively by the episode sequences with the transitions 1→0, 1→2, 2→0, 2→1. Insureds in state 1 for which no transition is observed, are at risk for transitions to state 0 and state 2. Therefore we add the censored episode sequences, denoted by 1→NA, to both tr10 and tr12. Similarly we add the censored episode sequences in state 2, denoted by 2→NA, to tr20 and tr21. Adding censored data does not change the likelihood estimators as we see later on. For each subset of transition data, we will estimate the hazard rate. The hazard rate is then denoted by λ

ij

(t), where the subscripts refer to transition from state i to j, i, j ∈ S , i 6= j.

1Note that we have the same notation for the state spaces. It will be clear from the context whether we deal with a survival function or a notation for a state space.

(24)

Chapter 4. Modeling durations in the disability states

In the remainder of this chapter we will introduce different estimation methods of the hazard rate. These methods give estimations for the integrated hazard rate, which is defined as:

Λ(t) = Z

t 0

λ(u)du, (4.4)

which turns out to be much easier to estimate than the hazard rate. The integrated hazard rate has no useful interpretation, but it has a useful connection with the survival function:

S(t) = exp {−Λ(t)} . (4.5)

In Section 4.1 we introduce the Nelson-Aalen and Kaplan-Meier estimators. These estimators do not allow for covariates. In contrast, the proportional hazard model does allow for covariates and will be discussed in Section 4.2. All these estimators yield non-smooth functions. Since we prefer smooth functions, we will smooth these estimators by way of splines in Section 4.3. In Section 4.4 we will introduce bootstrap methods for survival models. Bootstrapping is necessary in order to get understanding about the estimation uncertainty in our disability insurance model. In the remainder of this chapter we use the notation of Therneau and Grambsch (2000).

4.1 Nelson-Aalen and Kaplan-Meier estimators

The Nelson-Aalen estimator is an estimator for the integrated hazard rate, which does not allow for covariates. Consider a survival dataset with n subjects. Besides the counting process notation Y

i

, we use the complementary function N

i

(t). It counts the number of observed events for the ith subject. N

i

(t) = 0 until the moment of an event and equals 1 thereafter. We denote the aggregate processes by Y (t) = P

i

Y

i

(t) and N(t) = P

i

N

i

(t). Y (t) is the number of subjects at risk at time t. N (t) is the total number of events up to and including t. The Nelson-Aalen estimator is then defined as:

ˆΛ(t) = Z

t 0

dN(s)

Y (s) , (4.6)

To explain this estimator, consider a small interval of time of the integrated hazard rate:

Λ(s + h) − Λ(s) ≈ λ(s)h

= Pr(event in (s, s + h] | at risk at s).

It is reasonable to estimate this probability by:

N (s + h) − N(s)

Y (s) .

Integrating these quantities over the range (0, t] yields the Nelson-Aalen estimator. Since we deal with discrete time intervals in the disability dataset, it is more convenient to define the Nelson- Aalen estimator by the equivalent sum:

ˆΛ(t) = X

i:ti≤t

∆N(t

i

)

Y (t

i

) , (4.7)

where ∆N(t

i

) denotes the number of events occurring precisely at time t

i

, the time until the ith event. The Nelson-Aalen estimator is a method of moments estimator. Then the variance is estimated by:

Var h

ˆΛ(t)i = X

i:ti≤t

∆N(t

i

)

Y

2

(t

i

) . (4.8)

Since the integrated hazard rate has no useful interpretation, we will transform it to a survival

function in order to present the estimation of the four possible transitions in the disability model.

(25)

Application of Claim Reserving in a Multiple State Model for Disability Insurance

0 10 20 30 40 50 60

0.0 0.2 0.4 0.6 0.8 1.0

time (months)

propor tion in initial state

1−>0 1−>2 2−>0 2−>1

Figure 4.1: Kaplan-Meier estimators of the transitions in the three-state disability model.

The Nelson-Aalen estimator has a close connection to the Kaplan-Meier survival function. Let dˆΛ(t

i

) = dN(t

i

)/Y (t

i

), the increment of the Nelson-Aalen estimator at the ith failure. Then the Kaplan-Meier estimator of the survival function is defined as:

S ˆ

KM

(t) = Y

j:tj≤t

h1 − dˆΛ(t

j

) i

. (4.9)

This estimator gives slightly different results than the more obvious estimator proposed by Breslow (1972):

S ˆ

B

(t) = exp h

−ˆ Λ(t) i

. (4.10)

But since exp(−x) ≈ 1−x for small x, the Kaplan-Meier estimator works fine for small increments dˆΛ.

4.1.1 Estimation results

The results of the Kaplan-Meier estimators in the three-state disability model are presented in Figure 4.1. We observe that duration till the transitions from state 1 to state 2 is significantly longer than in the other transitions. This is the only transition in which the insured is worse off than before. In the other transitions, the insureds are forced to make the transition quickly, since they will be better off. The durations till these transitions are distributed almost equally.

4.2 Proportional Hazard Models

In order to fit a hazard rate which allows for covariates, we use the proportional hazard model presented by Cox (1972). The proportional hazard function assumes for an individual i:

λ

i

(t) = λ

0

(t) exp(X

i

(t)β), (4.11)

(26)

Chapter 4. Modeling durations in the disability states

where λ

0

(t) is the baseline hazard, β is the vector of regression coefficients, and X

i

(t) is vector of covariates which may depend on time. The baseline hazard λ

0

(t) is an unspecified function, just like the Nelson-Aalen estimator. Therefore we call the proportional hazard model a semi- parametric model. The model is displayed as a hazard function, but in the estimation procedure we actually fit an integrated hazard function. The model is called proportional since the hazard ratio for two subjects with fixed covariate vectors X

i

and X

j

is given by:

λ

i

(t)

λ

j

(t) = λ

0

(t) exp(X

i

β)

λ

0

(t) exp(X

j

β) = exp(X

i

β)

exp(X

j

β) . (4.12)

When we draw two curves of the same proportional hazard model for X

i

and X

j

, both curves are only scaled upward/downward with respect to each other. The scaling factor is independent of t.

In order to fit a proportional hazard model, we first have to estimate the vector coefficient β. The solution of β is derived by maximizing the following partial likelihood function:

P L(β) =

n

Y

i=1

Y

t≥0

( Y

i

(t)r

i

(β, t) P

j

Y

j

(t)r

j

(β, t) )

dNi(t)

, (4.13)

with

r

i

(β, t) = exp [X

i

(t)β] ≡ r

i

(t). (4.14) Definition (4.14) is called the risk score for subject i. Definition (4.13) is called a partial likelihood function, since it only depends on β and not on the baseline hazard. The exact derivation of β and its variance can be found in Appendix A. The integrated baseline hazard is fitted similarly as the Nelson-Aalen estimator. In this case, we give weights to the observations Y

i

by the corresponding risk score. The integrated baseline hazard is then given by:

ˆΛ

0

(t, ˆ β) = Z

t

0

dN(s) P

n

j=1

Y

j

(s)ˆr

j

(s) . (4.15)

When we fit a proportional hazard model without covariates, this estimator is obviously equal to the Nelson-Aalen estimator. The variance of the baseline hazard is derived similarly as (4.8):

Var h

ˆΛ

0

(t, ˆ β)i = Z

t 0

dN(s) h P

n

j=1

Y

j

(s)ˆr

j

(s) i

2

. (4.16) In order to show the results of the integrated baseline hazard, the results are often transformed to a survival curve, similarly to what we do with a Nelson-Aalen estimator. The results in Figure 4.1 can be interpreted as the transformed integrated baseline hazard, where the covariate values are at the corresponding means from the original data. We have not plotted the confidence bounds of the integrated hazard rate. This subject will be discussed in Section 4.4.

4.2.1 Tied data

The partial likelihood function in Equation (4.13) does not allow for tied events. The function is designed for continuous data where two events almost never occur at the same time. However, the durations in our disability model are measured in months. In this dataset there are numerous tied events. Let us consider four subjects for which we measure the time until an event. The first two subjects have an event at exactly the same time. Then according to the partial likelihood function we have to compute either

 r

1

r

1

+ r

2

+ r

3

+ r

4

  r

2

r

2

+ r

3

+ r

4



(4.17)

or  r

2

r

1

+ r

2

+ r

3

+ r

4

  r

1

r

1

+ r

3

+ r

4



, (4.18)

(27)

Application of Claim Reserving in a Multiple State Model for Disability Insurance

Table 4.1: Coefficient estimation outcomes of the proportional hazard models of the four different transition.

coef exp(coef) p-value

1→ 0 age.at.t0 -0.0131 0.987 0.0e+00

unemp 0.0601 1.062 3.4e-08

offset -0.0650 0.937 0.0e+00

previous.state 0 -0.2228 0.800 0.0e+00

1→ 2 age.at.t0 -0.0092 0.991 2.6e-04

unemp 0.1313 1.140 1.1e-10

offset -0.0189 0.981 0.0e+00

previous.state 0 0.1861 1.205 4.5e-05

2→ 0 age.at.t0 -0.0126 0.987 0.0e+00

unemp 0.1757 1.192 0.0e+00

offset -0.0385 0.962 0.0e+00

previous.state 0 0.4274 1.533 1.1e-16

2→ 1 age.at.t0 -0.0071 0.993 1.1e-07

unemp 0.1288 1.137 0.0e+00

offset -0.0084 0.992 1.1e-13

previous.state 0 0.0154 1.015 6.6e-01

but we do not know which one we have to choose. Note that we have abbreviated the risk score by r

i

. Efron (1977) proposes the following approximation of the contribution to the partial likelihood, for which both subjects have a share:

 r

1

r

1

+ r

2

+ r

3

+ r

4

  r

2

1

2

r

1

+

12

r

2

+ r

3

+ r

4



. (4.19)

To interpret this approximation, we have to look at Equations (4.17) and (4.18). When there are two tied events, they are both for sure in the first denominator. The second event is with probability 1/2 in the second denominator. Efron puts the average denominator

12

r

1

+

12

r

2

+r

3

+r

4

in the second term of likelihood contribution. When there would be three tied events, we compute:

 r

1

r

1

+ r

2

+ r

3

+ r

4

  r

2

2

3

r

1

+

23

r

2

+

23

r

3

+ r

4

  r

3

1

3

r

1

+

13

r

2

+

13

r

3

+ r

4



. (4.20) This method is used by default when fitting a proportional hazard model in the computer program R, R Development Core Team (2009). There is also a method to compute the partial likelihood exactly, but if there are a large number of tied events, the computational time will be excessive.

Another method is proposed by Breslow (1972). We will not discuss those two methods further in detail.

4.2.2 Coefficient estimation results

At this point we have completely defined the estimation method for the coefficients in a propor-

tional hazard model. We estimated the coefficients of the four proportional hazard models in

the three-state model. The results of these estimations, together with the exponent of the coef-

ficient and the results of the Wald test, are shown in Table 4.1. The estimated coefficients are

asymptotically normally distributed. The coefficients are tested by the hypothesis H

0

: ˆ β = 0,

see Appendix A. Almost all coefficients are significant at a 5% level, except the coeffient for the

covariate previous.state in transition 2→1. When a coefficient of a covariate is negative, this

means that this covariate scales the hazard rate down, which causes a longer expected duration in

the disability state. This can be clearly observed from the exponent of the coefficient. The other

(28)

Chapter 4. Modeling durations in the disability states

way around holds for a positive value of a coefficient. We observe that for each covariate the sign of the coefficient is the same in each state, except for the covariate previous.state.

We observe that an increase of one year in age.at.t0, decreases all transition intensities by about 1%. An older insured has obviously a longer expected duration in the disability states, compared to a younger insured. An increase of 1 percentage point in the unemployment rate shifts the transition intensities upwards by 6% (transition 1→0) to 19% (transition 2→0). When unemployment is high, self-employed workers are more driven to go back to work, afraid of losing their competitive advantage. The disability history of the insured plays also an important role. The covariate offset shifts the transition intensities down by 1% (transition 2→1) to 6% (transition 1 → 0) for each month of increase. The longer an insured has been in the system (before the previous transition), the slower the recovery. This covariate plays a less important role in the transitions between the two disability states. It can be interpreted that the covariate offset is especially inmportant for determining the duration till the transition to the active state 0.

The covariate previous.state is a factor variable. For the transitions 1→0 and 1→2, the previous state 2 is taken as basis variable. When the previous state is 0, the transition intensities are respectively shifted upward by 20% and shifted downward by 20%. An insured in state 1, which was previously in state 2, has a higher transition intensity to state 0 than an insured which was previously in state 0. This can be explained from the fact that in the first case the insured has an improving health condition, compared to the second case. The insureds which make a transition from state 1 to state 2, which were previously in state 0, have a higher transition intensity than those which were previously in state 2. In the first case, the insureds are in a process of a deteriorating condition from the active state. Those insureds make this transition faster. For the transitions 2→0 and 2→1, the previous state 1 is taken as basis variable. When the previous state is 0, the transition intensities are respectively shifted upward by 53% and 1%.

However, for transition 2→1, this covariate is not significant. It does not matter for insureds which make a transition from a severe disability state (2) to a mild disability state (1), whether their previous state was 0 or 1. It is convenient to have the same covariates in all proportional hazard models. Therefore, we leave this covariate in the system. The insureds which make the transition from state 2 tot state 0, make the transition much faster when they were previously in state 0 compared to the insureds which were previously in state 1. The insureds in the first case are probably the ones with a short-term disability. The insureds in the second case have a long-term disability and make transitions to the active state much slower.

The covariate growth, which represents the economic growth, turns out to be insignificant in all the four proportional hazard fits. Therefore, we ignore this covariate in our model.

Note that we have only included a small number of the available covariates in the proportional hazard models. Spierdijk and Koning (2009) show that among others covariates about benefit compensations, type of industry and geographic location are significant. We did not include all these covariates in the model, in order to make the model for the prospective reserves not too complex. The authors also show that frailty is significant. A frailty variable in the proportional hazard model measures the individual-specific unobserved heterogeneity, such as moral hazard.

4.2.3 Testing the proportional hazard assumption

After fitting a proportional hazard model, one should test the proportional hazard assumption.

Testing proportional hazards can be done visually or formally. Let us first turn to a visual test of

proportionality. If a proportional hazard model is fitted correctly, then the Kaplan-Meier curves

for different levels of the same covariate should not cross each other. In Figure 4.2 we have

plotted Kaplan-Meier curves at two levels of each covariate for transition 1→0. For the continuous

covariates age.at.t0, unemp and offset, we have compared the Kaplan-Meier curves at levels

above and below its mean value. We observe no crossing for each covariate, which indicates that

the fitted proportional hazard model might be correct. However, the visual test is not very useful

when testing the proportionality of continuous covariates, since they have many levels for which

we could test the proportionality. Therefore we turn to a formal test of proportionality. But first

we have to introduce some new variables to describe this test.

(29)

Application of Claim Reserving in a Multiple State Model for Disability Insurance

0 10 20 30 40 50 60

0.00.20.40.60.81.0

time (months)

proportion in initial state

age.at.t0<mean age.at.t0>mean

(a) age.at.t0

0 10 20 30 40 50 60

0.00.20.40.60.81.0

time (months)

proportion in initial state

unemp<mean unemp>mean

(b) unemp

0 10 20 30 40 50 60

0.00.20.40.60.81.0

time (months)

proportion in initial state

offset<mean offset>mean

(c) offset

0 10 20 30 40 50 60

0.00.20.40.60.81.0

time (months)

proportion in initial state

State 0 State 2

(d) previous.state

Figure 4.2: Visual tests of proportionality for transition 1→0.

(30)

Chapter 4. Modeling durations in the disability states

Let us first define martingale residuals, introduced by Barlow and Prentice (1988) and further developed by Therneau, Grambsch, and Fleming (1990). A martingale residual process is defined as:

M c

i

(t) = N

i

(t) − Z

t 0

Y

i

(s) exp [X

i

(s)β] dˆΛ

0

(s). (4.21) The martingale residual is the difference between the observed and expected number of events.

The martingale residuals have expectation equal to zero and are uncorrelated. Schoenfeld (1980) proposed a residual that depends on each kth event time. Let 0 < t

1

< · · · < t

k

< · · · t

d

be the time axis, with d = N(∞) the number of event times. Then the Schoenfeld residual is defined as:

s

k

= Z

tk tk−1

X

i

h X

i

− ¯x( ˆ β, s)i d M c

i

(s), (4.22)

where

¯x(β, s) = P Y

i

(s)r

i

(s)X

i

(s)

P Y

i

(s)r

i

(s) . (4.23)

Definition (4.23) is the weighted mean of the covariates over those at risk at time s. The Schoenfeld residual is derived from the the score process, which is a three-way array with dimensions of subject i, covariate j and time k:

U

ijk

(β) = Z

tk tk−1

[X

ij

(s) − ¯x

j

(β, s)] dM

i

(s). (4.24)

Since dM

i

is not observed, the estimator d M c

i

(t) is used.

Assume now that the survival data violates the proportionality assumption. In that case, the hazard rate can be fitted by a model similar as the proportional hazard model, but with time-dependent coefficients:

λ(t) = λ

0

(t) exp [Xβ(t)] . (4.25)

Grambsch and Therneau (1994) show that the following relation must hold for the coefficient of the jth covariate, where ˆ β is the vector of coefficients from a fit of a proportional hazard model:

E(s

kj

) + ˆ β

j

≈ β

j

(t

k

), (4.26) where

s

k

:= V

−1

( ˆ β, t

k

)s

k

. (4.27) Definition (4.27) is called the scaled Schoenfeld residual. It is scaled by the weighted variance of X at time s:

V (β, s) = P

i

Y

i

(s)r

i

(s) [X

i

(s) − ¯x(β, s)]

0

[X

i

(s) − ¯x(β, s)]

P

i

Y

i

(s)r

i

(s) . (4.28)

To check the proportionality we can plot s

kj

+ ˆ β

j

versus time, or some function of time g(t). A line can be fitted through these points. A line with a zero slope indicates that β(t) is constant and the proportionality assumption holds. In Figure 4.3 we plotted a smooth spline (more about smooth splines in Section 4.3) of order three. The spline is fitted on the observations s

kj

+ ˆ β

j

versus log(t

k

), where we used the covariate age.at.t0 from transition 1→0. The dashed lines represent the 90% confidence intervals. The plot has been augmented with two horizontal dotted lines, one at 0 and one at the value of ˆ β

j

. The value ˆ β

j

does not fit exactly between the confidence bounds of the spline fit, but deviations are numerically not very large when we look at the y-axis.

We should reject the proportionality assumption from this plot, but the results are ‘good enough’

in order to coninue with the proportional hazard model. Moreover, only the first 24 months are

of interest, since we will do a prediction of reserves for at most 24 months from t = 0. This will

be made clear later on. So the large outlier of the spline at the end of the plot does not matter.

(31)

Application of Claim Reserving in a Multiple State Model for Disability Insurance

1 2 5 10 20 50

−0.02 0.00 0.01 0.02 0.03 0.04

Time

Beta(t) f or age .at.t0

Figure 4.3: Time-dependent coefficient plot of the covariate age.at.t0 for the transition data 1→0.

With Figure 4.3 in mind we can set up a formal test of proportionality. We regress β(t) on the function of time g(t):

β

j

(t) = β

j

+ θ

j

[g

j

(t) − ¯g

j

] , j = 1, . . . , p, (4.29) where ¯g

j

is the mean of the observations g

j

(t

k

) and we assume that g

j

(t) ≡ g(t). Let S be a p × d matrix of unscaled Schoenfeld residuals, and let S

= dSI

−1

be a matrix of scaled Schoenfeld residuals. Let I( ˆβ) = P

k

V ( ˆ β, t

k

), the summation of the variances at each time t

k

. Then the global test of proportional hazards is given by the following χ

2

-statistic:

T = (g − ¯g)

0

S I

−1

S

0

(g − ¯g)

P (g

k

− ¯g)

2

/d = (g − ¯g)

0

S

IS

∗0

(g − ¯g)

d P (g

k

− ¯g)

2

. (4.30)

A univariate test for non-proportionality is given by the χ

2

-statistic:

T

j

= n P

k

(g

k

− ¯g)s

kj

o

2

d I

jj

P

k

(g

k

− ¯g)

2

, (4.31)

with I

jj

as the diagonal elements of I. A derivation of (4.30) and (4.31) can be found in Therneau

and Grambsch (2000, Chapter 6). The results of both test statistics for all covariates of the four

different transitions are given in Table 4.2. From these results we must conclude that in most

cases proportionality does not hold. Notice that a large test statistic (small p-value) of the global

proportionality test suggests no proportionality, and a large test statistic of the univariate test

suggests that the proportionality assumption holds. However, does it matter that not all the

test statistics for proportionality give nice results? In Appendix B we plotted similar graphs as in

Figure 4.3 for each covariate of the four transitions models. At each plot we have added the p-value

of the univariate test for non-proportionality. We observe that in the most case of acceptance of

non-proportionality, the deviation outside the confidence bounds of ˆ β

j

is numerically small. We

have relatively large deviations for the covariates age.at.t0 and offset. These deviations are

(32)

Chapter 4. Modeling durations in the disability states

Table 4.2: Test statistics of the univariate test for non-proportionality and global test of propor- tional hazard for the four proportional hazard models.

χ

2

-stat. p-value χ

2

-stat. p-value

1→ 0

age.at.t0 4.3174 3.77e-02

2→ 0

7.16e+01 0.000

unemp 3.3416 6.75e-02 5.29e-03 0.942

offset 67.7296 2.22e-16 7.18e+01 0.000

previous.state 0 0.0285 8.66e-01 1.01e+00 0.315

GLOBAL 88.7823 0.00e+00 1.78e+02 0.000

1→ 2

age.at.t0 4.11 0.04265

2→ 1

18.55 1.65e-05

unemp 7.11 0.00766 0.49 4.84e-01

offset 88.30 0.00000 8.94 2.79e-03

previous.state 0 8.03 0.00460 7.93 4.85e-03

GLOBAL 141.30 0.00000 65.70 1.83e-13

in most of the cases at the end of the time-horizon. We can conclude that the results are ‘good enough’ to continue working with the proportional hazard models. We ignore the results from Table 4.2.

4.3 Smoothing the hazard function

The estimation of the integrated hazard rate by the methods of Nelson-Aalen and Cox yield non- smooth functions. Especially when we do not observe many events, the estimation of the hazard curve is very rough. However, if we want to apply the function of the hazard rate to the estimation of the transition probabilities in the equations of (2.21), we prefer smooth functions. Therefore we choose to fit the integrated hazard rate by B-splines with a monotonicity restriction, similarly as described by Rosenberg (1995). When we have fitted a B-spline to the integrated hazard rate we can easily take the derivative in order to compute the hazard rate. Similar smoothing procedures are done by Anderson and Senthilselvan (1980) and Etezadi-Amoli and Ciampi (1987). First we will discuss B-spline models in general. After that we will introduce B-spline models with a monotonicity restriction.

4.3.1 B-spline models of the hazard rate

Consider a dataset with m data points (x

i

, y

i

), where the x-axis can be defined as a time axis.

To fit a curve through these points over the interval [T

min

, T

max

], one divides the interval by a number of K knots. This yield K + 1 intervals T

min

< t

1

< t

2

< · · · < t

K

< T

max

. One can span that space by n = K + 4 basis functions B

j,k

(t) of degree k. The basis functions are defined by the Cox-de Boor recursion formula, see Eilers and Marx (1996):

B

j,0

(x) := (1 if t

j

≤ x < t

j+1

,

0 otherwise; (4.32)

B

j,k

(x) := x − t

j

t

j+k

− t

j

B

j,k−1

(x) + t

j+k+1

− x

t

j+k+1

− t

j+1

B

j+1,k−1

(x). (4.33) We want to regress the data points (x

i

, y

i

) on the basis functions B

j

of certain degree n. We leave out the notation n in the subscript of the basis functions, since this is predefined. The regression curve is the linear combination ˆy(x) = P

nj=1

ˆa

j

B

j

(x). The least squares objective function to minimize is:

S =

m

X

i=1

 y

i

n

X

j=1

a

j

B

j

(x

i

)

2

. (4.34)

Referenties

GERELATEERDE DOCUMENTEN

Die kenmerke van die begrip “gemeenskapsmusiek”, soos dit in die laaste 15 jaar in ander lande geformuleer is, word aan die hand van die Perseverance Kersfeesorkes

For that reason, we propose an algorithm, called the smoothed SCA (SSCA), that additionally upper-bounds the weight vector of the pruned solution and, for the commonly used

Crosstab analyses are applied to study the sensitivity and specificity of the DASS-21 cut-off score, in comparison with the BDI-II cut-off scores indicating mild, moderate, and

De glastuin- bouwsector heeft zichzelf doelen gesteld om duurzamer te telen en Waddenglas wil daaraan zeker meewerken.. Ook de overheden hebben immers doelen

The rise of a contested science of madness in the nineteenth century provides a basis for the medical treatment of mental states as illnesses in the systematic classification of

By conducting the main analysis 1 (i.e. zero measurement) by means of a quantitative questionnaire, the score of three luxury brands (e.g. Chanel, Armani and

The results have been put in table 7, which presents percentages that indicate the increase or decrease of the formants before elimination with respect to the vowels before

However, most of this research was cross-sectional and it has rarely been explored if somatization in itself leads to long-term disability, or whether concurrent ill mental