
Influence assessment in claims reserving


Academic year: 2021


Saskia Peters


Master thesis

Econometrics, Operations Research and Actuarial Studies

Specialization Actuarial Studies


First supervisor: Prof. dr. R.H. Koning

Second supervisor: Prof. dr. H.L. Trentelman

University of Groningen

Faculty of Economics and Business

P.O. Box 800


Abstract

Thomas and Cook (1990) developed a method to assess influence of a single observation on point predictions in the generalized linear model. We apply their ideas to claims reserving.

We consider the overdispersed Poisson model, a particular model in claims reserving that is often used in practice. We develop methods to assess the influence of payments on the reserve and on the reserve variability. Perturbation of payments, perturbation of case weights, and case deletion are considered as influence measures. The results of case deletion turn out to be useful in the assessment of influence. The reserve is especially sensitive to fluctuations of payments in the corner points of the triangle. Large payments in the north-west corner have a decreasing effect on the reserve, whereas large payments in the south-west and north-east corners have an increasing effect. The method to assess influence of payments on the reserve detects influential payments and indicates whether the effect is increasing or decreasing. Furthermore, we describe a method to assess the influence on an estimate of the reserve variability. There is no clear relation between this influence and the location of a payment. Finally, the described methods are not applicable to the south-west and north-east corners of the triangle. Therefore, we describe an alternative method to test whether these payments differ from other payments: t-tests on the estimated parameters provide a measure to indicate whether the payments in the corners differ from other payments.


Preface

This master thesis is the result of my graduation project to obtain the degree Master of Science in Econometrics, Operations Research and Actuarial Studies. The research was carried out during an internship at Ernst & Young in Utrecht.

I would like to thank some people for their help and support during my project. First, I would like to express my gratitude to my supervisor Ruud Koning for his guidance. His comments and suggestions were very valuable, especially in some technical parts of my research. I also would like to thank my second supervisor Harry Trentelman for his support.

Of course, I would like to thank my supervisors Angela van Heerwaarden and Loes Andringa from Ernst & Young for their help and their enthusiasm about my project. I also would like to thank my other colleagues for the pleasant time during my internship.

Finally, I would like to thank my parents for giving me the opportunity to study and for supporting me during my study.

Saskia Peters


Contents

1 Introduction
2 Generalized linear models
  2.1 Maximum likelihood
    2.1.1 Estimation
    2.1.2 Score function and Fisher information
    2.1.3 Asymptotic properties
  2.2 Quasi-likelihood
    2.2.1 Estimation
    2.2.2 ‘Score function’ and ‘Fisher information’
    2.2.3 Asymptotic properties
  2.3 Maximum likelihood versus quasi-likelihood
3 Claims reserving
  3.1 Overdispersed Poisson model
  3.2 Model estimation
    3.2.1 Parameter estimation
    3.2.2 Reserve estimate
  3.3 Reserve variability
  3.4 Example
4 Influential observations: an introduction
  4.1 Examples in the linear model
  4.2 Overview of literature
5 Influence on reserve
  5.1 Perturbations
    5.1.1 Perturbations of payments
    5.1.2 Perturbations of case weights
  5.2 Case deletion
  5.3 Scaling
  5.4 Example
6 Influence on reserve variability
  6.1 Perturbations
    6.1.1 Perturbations of payments
    6.1.2 Perturbations of case weights
  6.2 Case deletion
  6.3 Interpretations of the effects
  6.4 Example
7 Corner points
  7.1 South-west corner
8 Performance of the methods
  8.1 The simulation
  8.2 Influence on reserve
  8.3 Influence on reserve variability
  8.4 Corner points
9 Influence assessment in practice
  9.1 Choice of influence assessment methods
    9.1.1 Influence on reserve
    9.1.2 Influence on reserve variability
    9.1.3 Corner points
  9.2 Model adaptation
    9.2.1 Influence on reserve: model specification
    9.2.2 Influence on reserve: influential payments
    9.2.3 Influence on reserve variability
    9.2.4 Corner points
10 Conclusion
  10.1 Conclusions
  10.2 Directions for further research
Bibliography
A List of symbols and abbreviations
B Derivations
  B.1 Moments of the EDF
  B.2 Moments of the score function
  B.3 ‘Score function’ of the quasi-likelihood
  B.4 Delta method
C Proofs
  C.1 Proof of Theorem 1
D Examples
  D.1 Triangle of Taylor and Ashe
  D.2 Triangle of England and Verrall
  D.3 Triangle of Mack
E Simulation results
  E.1 Influence on reserve


Chapter 1

Introduction

Background

Today, many developments in supervision of financial institutions take place. Insurance companies face different risks and they have to value these risks. Rules, such as Solvency II, are developed to monitor whether insurance companies (and other financial institutions) will meet their obligations. An important part of the valuation of risks for non-life insurers is the estimation of the reserve. The insurer earns premiums in every accident year. However, the payments of claims could occur years after the accident year. Therefore, insurance companies put up a technical reserve to cover future payments originating from a past accident year.

The reserve is an important part of the profit and loss sheet. It is taken into account when determining profit and the payment of taxes. Furthermore, it is important to the policyholder that the insurer is able to meet its obligations. Therefore, the insurer should determine the reserve with care such that it represents an accurate estimate.

Different methods are available to estimate the reserve. The insurance company has to choose a suitable method and has to be able to show how the reserve is determined. Rules and supervision monitor whether the reserve is accurate and the insurer is able to meet obligations.

Previous research

Claims reserving methods assume a model for the observed claims. The model is used to predict future claims and to determine the corresponding reserve. One of the first developed reserving methods is the chain ladder technique. This method assumes a pattern in the claims. The pattern is projected forward to estimate future claims.

The chain ladder method is deterministic since the uncertainty in future claims is not modelled. To include the uncertainty, stochastic methods were developed. Examples are the models described by Renshaw and Verrall (1998) and Zehnwirth (1994). These methods are able to recognize and incorporate uncertainty of the claim process. Consequently, different functions of the distribution of the reserve are available, such as the mean, variance and percentiles. Furthermore, measures of precision of the estimates are available and hypothesis testing is possible.

Gap in previous research


Influence assessment is important for several reasons. Primarily, it is an aid to quickly detect influential payments that need special attention. In practice, experts detect remarkable payments by close inspection of the data. However, this detection method is very subjective. Systematic influence assessment complements expert’s judgement in three ways. First, the results of the influence measurement can confirm the expert’s impression. In this case, the identified claims need particular attention and background information needs to be considered. Second, the influence measurement can identify influential claims that are overlooked by the expert. Finally, the results of the influence measurement can contradict the expert’s impression. A claim could be remarkable but not influential. In this case, we do not have to put effort into studying this claim in detail.

Research in this thesis

The identification of influential observations in claims reserving is the topic of this thesis. We analyse a generalized linear model (GLM) as claims reserving method; in particular the overdispersed Poisson model. We have three reasons to consider this particular model. First, the model is extensively described in literature, for example by McCullagh and Nelder (1989). Therefore, the model is well understood. Second, the model allows for negative claim amounts. Finally, the overdispersed Poisson model is often used in practice.

The research question in this thesis is:

How should we measure the influence of observed payments on functions of the overdispersed Poisson model in claims reserving, and how should we incorporate this knowledge into the model?

The thesis consists of ten chapters. First, in Chapter 2 we consider the generalized linear model. This model is described and two estimation methods are presented. The material in that chapter is based on a study of literature.

In Chapter 3, we discuss a specific generalized linear model, often used in claims reserving: the overdispersed Poisson model. First, we introduce a triangular representation of the observed claims. Furthermore, we describe the overdispersed Poisson model and we consider the variability of the predictions. The generalized linear model in claims reserving is extensively described in literature and we base Chapter 3 on parts of that literature.

In Chapter 4, we introduce influential observations in general. We give some examples of influential observations in a linear model and we give an overview of existing literature about influence assessment.

Influence of payments in claims reserving is described in Chapters 5 and 6. We consider influence on the mean and variability of the reserve. Assessing influence on functions is considered in literature in different frameworks. We apply influence assessment methods in generalized linear models from literature to the special case of claims reserving and the overdispersed Poisson model. Two payments are special cases that need a special approach. In Chapter 7, we describe how to consider these two payments in the influence assessment.

We consider the performance of the influence measures by a simulation study. The results are described in Chapter 8.

In Chapter 9, we describe the use of the influence measure in practice. We explain the choice of specific measures to assess the influence of payments on the reserve and reserve variability. Furthermore, we consider in which cases, and how, we incorporate knowledge of influential payments into the model estimation.


Chapter 2

Generalized linear models

The generalized linear model is a generalization of the classical linear model. A linear model assumes normally distributed error terms with a constant variance. The random variables $Y_1, \ldots, Y_n$ are assumed to be independent and the model is specified by

\[ E(Y_i) = \mu_i \quad\text{and}\quad \mu_i = \sum_{j=1}^{p} x_{ij}\beta_j, \qquad i = 1, \ldots, n, \]

where $x_{ij}$ are known explanatory variables and $\beta_j$ are unknown parameters.

The assumptions of the generalized linear model, described by McCullagh and Nelder (1989), are:

1. Random component
The observations $y_i$ are realizations of the random variables $Y_i$ ($i = 1, \ldots, n$). These random variables are independently distributed and the density function of $Y_i$ belongs to the exponential dispersion family:

\[ f(y_i; \theta_i, \phi) = \exp\left\{ \frac{1}{a_i(\phi)}\bigl(y_i\theta_i - b(\theta_i)\bigr) + c(y_i, \phi) \right\}, \tag{2.1} \]

where $\phi$ and $\theta$ are unknown parameters ($\phi$ is called the dispersion parameter) and $a(\cdot)$, $b(\cdot)$ and $c(\cdot,\cdot)$ are known functions that determine the density function. We write $Y \sim \mathrm{EDF}$.

2. Systematic component
Each observation is described by a linear predictor:

\[ \eta_i = \sum_{j=1}^{p} x_{ij}\beta_j. \]

3. The link
Let $E(Y_i) = \mu_i$. The link function $g(\cdot)$ connects the random and systematic component: $\eta_i = g(\mu_i)$.

Note that independently, normally distributed observations and the identity function as link function imply the classical linear model.

The mean and variance of the random variable $Y$ follow from the model assumptions. The results are

\[ E(Y) = b'(\theta) \quad\text{and}\quad \mathrm{Var}(Y) = a(\phi)\,b''(\theta). \]
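As a concrete check of these two formulas, the Poisson distribution can be written in the form (2.1). This is a standard textbook example rather than something taken from the text above; the symbols follow Equation (2.1).

```latex
\begin{align*}
f(y;\mu) &= \frac{\mu^{y}e^{-\mu}}{y!}
          = \exp\{\, y\log\mu - \mu - \log y! \,\}, \\
\theta &= \log\mu, \qquad b(\theta) = e^{\theta}, \qquad a(\phi) = 1, \qquad c(y,\phi) = -\log y!, \\
E(Y) &= b'(\theta) = e^{\theta} = \mu, \qquad
\mathrm{Var}(Y) = a(\phi)\,b''(\theta) = e^{\theta} = \mu .
\end{align*}
```

The mean and variance coincide, as expected for the Poisson distribution.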


The derivations are given in Appendix B.1. The variance can be written as

\[ \mathrm{Var}(Y) = a(\phi)V(\mu), \]

where $V(\mu) = b''(\theta)$ is called the variance function. Usually, the function $a(\phi)$ has the following form:

\[ a_i(\phi) = \frac{\phi}{w_i}, \tag{2.2} \]

2.1 Maximum likelihood

The generalized linear model assumes a specific density function as specified in Equation (2.1). If a specific distribution is assumed, maximum likelihood estimation is commonly used. This estimation method provides parameter estimates, standard errors and theory about the asymptotic distribution. Furthermore, goodness-of-fit measures are available. We are interested in model predictions. Therefore, we will not focus on the fit of the model. We refer to McCullagh and Nelder (1989) for goodness-of-fit measures. We consider maximum likelihood estimation of the generalized linear model and properties of the estimates.

2.1.1 Estimation

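We obtain parameter estimates by maximizing the log-likelihood function with respect to the parameters. The log-likelihood for the generalized linear model is given by

\[ l(\mu; y) = \sum_{i=1}^{n} \log f(y_i; \theta_i, \phi) = \sum_{i=1}^{n} \left\{ \frac{w_i}{\phi}\bigl(y_i\theta_i - b(\theta_i)\bigr) + c(y_i, \phi) \right\} = \sum_{i=1}^{n} l_i(\mu_i; y_i). \tag{2.3} \]

To maximize the log-likelihood, we set its derivative with respect to $\beta$ equal to zero. By the chain rule, the derivative of each term $l_i(\mu_i; y_i)$ is given by

\[ \frac{\partial l_i}{\partial\beta_j} = \frac{\partial l_i}{\partial\theta_i}\,\frac{\partial\theta_i}{\partial\mu_i}\,\frac{\partial\mu_i}{\partial\eta_i}\,\frac{\partial\eta_i}{\partial\beta_j}, \qquad j = 1, \ldots, p, \]

where

\[ \frac{\partial l_i}{\partial\theta_i} = \frac{w_i}{\phi}\bigl(y_i - b'(\theta_i)\bigr) = \frac{w_i}{\phi}(y_i - \mu_i), \qquad \frac{\partial\mu_i}{\partial\theta_i} = b''(\theta_i) = V(\mu_i), \qquad \frac{\partial\eta_i}{\partial\beta_j} = x_{ij}. \]

Using these terms, we obtain the derivative of Equation (2.3) and set it equal to zero:

\[ \sum_{i=1}^{n} \frac{\partial l_i(\mu_i; y_i)}{\partial\beta_j} = \sum_{i=1}^{n} \frac{w_i(y_i - \mu_i)}{\phi V(\mu_i)}\,\frac{\partial\mu_i}{\partial\eta_i}\,x_{ij} = 0, \qquad j = 1, \ldots, p. \tag{2.4} \]

Because $\phi$ cancels out, the estimating equations in (2.4) become

\[ \sum_{i=1}^{n} \frac{w_i(y_i - \mu_i)}{V(\mu_i)}\,\frac{\partial\mu_i}{\partial\eta_i}\,x_{ij} = 0, \qquad j = 1, \ldots, p. \tag{2.5} \]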

By solving this system of equations, the maximum likelihood estimates (MLE) of the parameters are obtained. The iteratively reweighted least-squares (IRWLS) algorithm can be used to solve the system of equations. The IRWLS algorithm is described in, for example, Azzalini (1996). The dispersion parameter $\phi$ is not involved in the parameter estimation. If $\phi$ is unknown, the following estimate is commonly used (for example by McCullagh and Nelder (1989)):

2.1.2 Score function and Fisher information

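The derivative of the log-likelihood with respect to the parameters is called the score function $U$:

\[ U(\beta) = \frac{\partial l(\mu; y)}{\partial\beta} \quad\text{and}\quad U_j(\beta) = \frac{\partial l(\mu; y)}{\partial\beta_j}. \]

Two properties of the score function are

\[ E(U(\beta)) = 0, \qquad \mathrm{Var}(U(\beta)) = I(\beta), \]

where $I(\beta) = E\bigl(U(\beta)U'(\beta)\bigr) = -E\bigl(\tfrac{\partial}{\partial\beta'}U(\beta)\bigr)$ is the expected Fisher information matrix. These properties of the score function and Fisher information matrix are derived in Appendix B.2. The $(j,k)$th element of the expected Fisher information matrix for the generalized linear model equals

\begin{align*}
I_{jk}(\beta) = E\bigl(U_j(\beta)U_k(\beta)\bigr)
 &= E\left( \sum_{i=1}^{n} \frac{Y_i - \mu_i}{\phi V(\mu_i)}\,\frac{\partial\mu_i}{\partial\eta_i}\,x_{ij} \;\sum_{l=1}^{n} \frac{Y_l - \mu_l}{\phi V(\mu_l)}\,\frac{\partial\mu_l}{\partial\eta_l}\,x_{lk} \right) \\
 &= \sum_{i=1}^{n} \frac{E\bigl((Y_i - \mu_i)^2\bigr)\,x_{ij}x_{ik}}{(\phi V(\mu_i))^2} \left( \frac{\partial\mu_i}{\partial\eta_i} \right)^{2},
\end{align*}

where we used $E\bigl((Y_i - \mu_i)(Y_l - \mu_l)\bigr) = 0$ for $i \neq l$, because we assume independence of the random variables. Furthermore, we have $E\bigl((Y_i - \mu_i)^2\bigr) = \mathrm{Var}(Y_i)$ and $\phi V(\mu_i) = \mathrm{Var}(Y_i)$, so

\[ I_{jk} = \sum_{i=1}^{n} \frac{x_{ij}x_{ik}}{\mathrm{Var}(Y_i)} \left( \frac{\partial\mu_i}{\partial\eta_i} \right)^{2}. \]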
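The score and the expected information are the two ingredients of Fisher scoring, which updates the parameters by $\beta \leftarrow \beta + I(\beta)^{-1}U(\beta)$ and which, for generalized linear models, coincides with the IRWLS iteration. The following is a minimal sketch for the special case of a log link with $V(\mu) = \mu$ and unit weights; the function name `fit_log_link_glm` and the small data set are hypothetical and serve only as an illustration.

```python
import numpy as np

def fit_log_link_glm(X, y, n_iter=50, tol=1e-10):
    """Fisher scoring for a GLM with log link and V(mu) = mu (unit weights).

    Assumes column 0 of X is an intercept. The dispersion phi cancels
    in the update step, so it does not appear here.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start from a constant mean
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        # Log link: d mu / d eta = mu, so phi * U_j = sum_i (y_i - mu_i) x_ij
        score = X.T @ (y - mu)
        # phi * I_jk = sum_i x_ij x_ik mu_i
        info = X.T @ (X * mu[:, None])
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    mu = np.exp(X @ beta)
    info = X.T @ (X * mu[:, None])
    return beta, info

# Hypothetical data: intercept plus one covariate
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([1.0, 3.0, 4.0, 9.0, 15.0])

beta, info = fit_log_link_glm(X, y)
cov_if_phi_is_one = np.linalg.inv(info)  # multiply by phi when phi != 1
print(beta)
print(np.sqrt(np.diag(cov_if_phi_is_one)))
```

At convergence the score is numerically zero, which is exactly the system (2.5) for this link and variance function.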

2.1.3 Asymptotic properties

Denote the MLE of $\beta$ by $\hat\beta$. Under some mild conditions, the MLE is consistent:

\[ \hat\beta \xrightarrow{p} \beta, \]

where $\beta$ is the true but unknown parameter vector. Furthermore, under some conditions, the scaled MLE is asymptotically normal:

\[ \sqrt{n}\,(\hat\beta - \beta) \xrightarrow{d} N(0, nI^{-1}). \]

This result applies when the MLE is based on $n$ observations, where $n \to \infty$. For finite $n$, we have approximately

\[ \hat\beta \sim N(\beta, I^{-1}). \]

2.2 Quasi-likelihood

In linear model estimation, two commonly used methods are ordinary least squares and maximum likelihood. Maximum likelihood estimation assumes normally distributed random variables Y . On the other hand, least squares assumes no particular distribution of Y and only requires a specification of the mean and variance. Although the assumptions of least squares are weaker than the assumptions of maximum likelihood, both methods obtain the same estimates of the parameters. Similarly, for generalized linear models we can use quasi-likelihood estimation instead of maximum likelihood.

2.2.1 Estimation

Maximum likelihood assumes that the distribution of $Y$ belongs to the exponential dispersion family. Instead of specifying the whole distribution, quasi-likelihood only requires the specification of the mean and variance:

\[ E(Y_i) = \mu_i, \qquad \mathrm{Var}(Y_i) = \phi V(\mu_i), \tag{2.7} \]

where $g(\mu_i) = \eta_i = \sum_{j=1}^{p} x_{ij}\beta_j$. Wedderburn (1974) introduced the quasi-likelihood function. This function $Q$ has to satisfy the following relation:

\[ \frac{\partial Q(\mu_i; y_i)}{\partial\mu_i} = \frac{y_i - \mu_i}{\phi V(\mu_i)}. \tag{2.8} \]

McCullagh and Nelder (1989) defined $Q(\mu_i; y_i)$ as

\[ Q(\mu_i; y_i) = \int_{y_i}^{\mu_i} \frac{y_i - t}{\phi V(t)}\,dt. \]

This expression satisfies Equation (2.8). For $n$ observations from independent random variables, the quasi-likelihood is obtained by the following summation (as for the log-likelihood):

\[ Q(\mu; y) = \sum_{i=1}^{n} Q(\mu_i; y_i). \]

Wedderburn (1974) has shown that Q “has properties similar to those of log likelihoods”. This will be discussed in Section 2.2.2.
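To make the definition concrete, the integral can be evaluated for the variance function $V(\mu) = \mu$, the case used later for claims reserving. This is a worked sketch under the assumption $y > 0$; it is not part of the derivation in the cited sources.

```latex
\begin{align*}
Q(\mu; y) &= \int_{y}^{\mu} \frac{y - t}{\phi t}\,dt
           = \frac{1}{\phi}\Bigl[\, y\log t - t \,\Bigr]_{t=y}^{t=\mu}
           = \frac{1}{\phi}\Bigl\{ y\log\frac{\mu}{y} - (\mu - y) \Bigr\}, \\
\frac{\partial Q(\mu; y)}{\partial\mu} &= \frac{1}{\phi}\Bigl( \frac{y}{\mu} - 1 \Bigr)
           = \frac{y - \mu}{\phi\mu}.
\end{align*}
```

The derivative agrees with Equation (2.8) for $V(\mu) = \mu$. Apart from terms that do not depend on $\mu$, $Q$ equals the Poisson log-likelihood divided by $\phi$.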

Assume that µi is a function of parameters β and assume the specification in (2.7). We estimate the parameters β by maximizing the quasi-likelihood with respect to these parameters. We have to solve the following estimating equations:

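\[ \frac{\partial Q}{\partial\beta_j} = \sum_{i=1}^{n} \frac{\partial Q_i}{\partial\mu_i}\,\frac{\partial\mu_i}{\partial\eta_i}\,\frac{\partial\eta_i}{\partial\beta_j} = \sum_{i=1}^{n} \frac{y_i - \mu_i}{\phi V(\mu_i)}\,\frac{\partial\mu_i}{\partial\eta_i}\,x_{ij} = 0, \qquad j = 1, \ldots, p. \]

Because $\phi$ cancels out, the estimating equations can be written as

\[ \sum_{i=1}^{n} \frac{y_i - \mu_i}{V(\mu_i)}\,\frac{\partial\mu_i}{\partial\eta_i}\,x_{ij} = 0, \qquad j = 1, \ldots, p. \tag{2.9} \]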

The dispersion parameter φ is not involved in the quasi-likelihood estimation. Wedderburn (1974) suggests and motivates the following estimate:

2.2.2 ‘Score function’ and ‘Fisher information’

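Quasi-likelihood functions and estimates have many properties in common with likelihood functions and estimates. We consider the analogue of the score function and the asymptotic distribution of the quasi-likelihood estimates.

Define

\[ u(\beta) = \frac{\partial Q}{\partial\beta} \quad\text{and}\quad u_j = \frac{\partial Q}{\partial\beta_j}. \]

The function $u$ satisfies the properties of a score function:

\[ E(u(\beta)) = 0, \qquad \mathrm{Var}(u(\beta)) = i(\beta), \]

where $i(\beta) = E\bigl(u(\beta)u'(\beta)\bigr) = -E\bigl(\tfrac{\partial u(\beta)}{\partial\beta'}\bigr)$. The $(j,k)$th element of the matrix $i(\beta)$ is given by

\[ i_{jk}(\beta) = \sum_{i=1}^{n} \frac{1}{\phi V(\mu_i)}\,\frac{\partial\mu_i}{\partial\beta_j}\,\frac{\partial\mu_i}{\partial\beta_k}. \tag{2.11} \]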

The proofs of these properties are given in Appendix B.3.

2.2.3 Asymptotic properties

Let $\hat\beta$ be the quasi-likelihood estimate of $\beta$. Under some mild conditions, the quasi-likelihood estimate is consistent:

\[ \hat\beta \xrightarrow{p} \beta. \]

Furthermore, the estimate $\hat\beta$ is asymptotically normally distributed:

\[ \sqrt{n}\,(\hat\beta - \beta) \sim N(0, n\,i(\beta)^{-1}). \]

This result applies when $n \to \infty$. For finite $n$, we have approximately

\[ \hat\beta \sim N(\beta, i(\beta)^{-1}). \]

2.3 Maximum likelihood versus quasi-likelihood

The generalized linear model can be estimated by using maximum likelihood or quasi-likelihood. We summarize the assumptions and properties of these methods in Table 2.1.

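| | Maximum likelihood | Quasi-likelihood |
|---|---|---|
| Assumptions | $Y_1, \ldots, Y_n$ independent | $Y_1, \ldots, Y_n$ independent |
| | $E(Y_i) = \mu_i$ where $g(\mu_i) = \sum_{j=1}^{p} x_{ij}\beta_j$ | $E(Y_i) = \mu_i$ where $g(\mu_i) = \sum_{j=1}^{p} x_{ij}\beta_j$ |
| | $Y_i \sim \mathrm{EDF}$ | $\mathrm{Var}(Y_i) = \phi V(\mu_i)$ |
| Log-likelihood | $l(\mu; y) = \sum_{i=1}^{n} l_i(\mu_i; y_i)$ | $Q(\mu; y) = \sum_{i=1}^{n} Q(\mu_i; y_i)$ where $Q(\mu_i; y_i) = \int_{y_i}^{\mu_i} \frac{y_i - t}{\phi V(t)}\,dt$ |
| Estimating equations | $\sum_{i=1}^{n} \frac{y_i - \mu_i}{V(\mu_i)} \frac{\partial\mu_i}{\partial\eta_i} x_{ij} = 0$ | $\sum_{i=1}^{n} \frac{y_i - \mu_i}{V(\mu_i)} \frac{\partial\mu_i}{\partial\eta_i} x_{ij} = 0$ |

Table 2.1: Maximum likelihood versus quasi-likelihood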

For the generalized linear model, the two methods obtain the same estimating equations. Therefore, the parameter estimates are equal. In fact, the two methods obtain the same estimates only if the distribution function belongs to the exponential dispersion family. This result is described in a theorem of Wedderburn (1974):

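Theorem 1. For one observation $y$, the log-likelihood function $l$ has the property

\[ \frac{\partial l}{\partial\mu} = \frac{y - \mu}{\phi V(\mu)}, \]

where $\mu = E(y)$ and $\phi V(\mu) = \mathrm{Var}(y)$, if and only if the density function of $y$ can be written in the form

\[ \exp\left\{ \frac{1}{\phi}\bigl(y\theta - b(\theta)\bigr) + c(y, \phi) \right\}, \tag{2.12} \]

where $\theta$ is some function of $\mu$.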

The proof of this theorem is given in Appendix C.1.

However, an underlying generalized linear model does not have to exist to use quasi-likelihood estimation. The mean and variance can be specified such that they do not imply a distribution function. For example, consider the following specification:

\[ E(Y) = \mu, \qquad \mathrm{Var}(Y) = \phi\mu. \tag{2.13} \]

Note that the function $V(\cdot)$ is the identity function. No distribution of the exponential dispersion family satisfies the specification in (2.13), except for $\phi = 1$. In that case, the variance equals the mean, which implies the Poisson distribution. The model specified in (2.13) is called the overdispersed Poisson distribution. Because the specification implies no specific distribution function, we can only use quasi-likelihood estimation.


Chapter 3

Claims reserving

An insurance policy usually covers claims incurred in one year; the accident year. However, the payment could be made years later. There are several reasons for this. First, some claims are not reported during the accident year. For example, when a car accident occurs at the end of the policy year, it will probably be reported in the next year. Second, the size of the claim could be settled after the end of the policy year. Long legal proceedings can delay the settlement of the payment. These are not unusual for liability claims. Third, the claim could consist of yearly payments. This often occurs in disability insurance. Finally, the damage could be manifested (far) after the policy year, for example pollution claims.

Insurance companies put up a technical reserve to cover future payments originating from a past accident year. A model is used to predict future payments and to determine the corresponding reserve. In this chapter, we first introduce a schematic representation of the observed payments. Second, we consider the overdispersed Poisson model to predict future payments.

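The observed payments are schematically represented in a triangle:

| Accident year \ Development year | 1 | 2 | ⋯ | j | ⋯ | n − 1 | n |
|---|---|---|---|---|---|---|---|
| 1 | $C_{1,1}$ | $C_{1,2}$ | ⋯ | $C_{1,j}$ | ⋯ | $C_{1,n-1}$ | $C_{1,n}$ |
| 2 | $C_{2,1}$ | $C_{2,2}$ | ⋯ | $C_{2,j}$ | ⋯ | $C_{2,n-1}$ | |
| ⋮ | ⋮ | ⋮ | ⋯ | ⋮ | ⋯ | | |
| i | $C_{i,1}$ | $C_{i,2}$ | ⋯ | $C_{i,n-i+1}$ | | | |
| ⋮ | ⋮ | ⋮ | | | | | |
| n − 1 | $C_{n-1,1}$ | $C_{n-1,2}$ | | | | | |
| n | $C_{n,1}$ | | | | | | |

Table 3.1: Development triangle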

Observation $C_{ij}$ represents the payment of claims originating from accident year $i$ and paid in development year $j$.¹ Payments of claims originating from accident years $i = 1, \ldots, n$ and paid in development years $j = 1, \ldots, n - i + 1$ are observed. $C_{ij}$ is unknown for $i = 2, \ldots, n$ and $j = n - i + 2, \ldots, n$.

The triangle is called the development triangle, because it presents the development of payments for different accident years. This notation is often used in literature, for example by Kaas et al. (2001) and Renshaw and Verrall (1998). In the next section, we consider the overdispersed Poisson model to predict future payments.

¹ $C_{ij}$ can also stand for incurred claims or number of claims reported. We consider claim payments, but the

3.1 Overdispersed Poisson model

The overdispersed Poisson model as reserving method is extensively described in literature, for example by England and Verrall (1999, 2001, 2002) and Renshaw and Verrall (1998). Therefore, many properties are known. We assume the overdispersed Poisson model for the payments $C_{ij}$, which are assumed to be independent. The mean and variance are specified by

\[ E(C_{ij}) = \mu_{ij}, \qquad \mathrm{Var}(C_{ij}) = \phi\mu_{ij}. \tag{3.1} \]

To give an interpretation of the overdispersed Poisson model, we consider the distribution of $C_{ij}/\phi$. This random variable has the following mean and variance:

\[ E\left( \frac{C_{ij}}{\phi} \right) = \frac{\mu_{ij}}{\phi}, \qquad \mathrm{Var}\left( \frac{C_{ij}}{\phi} \right) = \frac{\phi\mu_{ij}}{\phi^{2}} = \frac{\mu_{ij}}{\phi}. \]

Because the variance equals the mean, $C_{ij}/\phi$ is Poisson distributed with parameter $\mu_{ij}/\phi$. Therefore, $\phi$ could be interpreted as the average payment and $C_{ij}/\phi$ as the number of payments, which is Poisson distributed.
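This interpretation can be checked by simulation: drawing a Poisson number of payments with mean $\mu/\phi$ and multiplying by $\phi$ should give a sample whose mean is close to $\mu$ and whose variance-to-mean ratio is close to $\phi$. The sketch below uses hypothetical values for $\mu$ and $\phi$ and an arbitrary seed.

```python
import numpy as np

rng = np.random.default_rng(seed=1)   # arbitrary seed, for reproducibility only
mu, phi = 500.0, 50.0                 # hypothetical mean and dispersion

# Number of payments N ~ Poisson(mu / phi); payment amount C = phi * N
counts = rng.poisson(lam=mu / phi, size=200_000)
payments = phi * counts

sample_mean = payments.mean()
sample_var = payments.var(ddof=1)
dispersion_ratio = sample_var / sample_mean

print(sample_mean)        # close to mu
print(dispersion_ratio)   # close to phi
```

Note that this construction only produces multiples of $\phi$, whereas the quasi-likelihood specification (3.1) places no such restriction on the data.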

To complete the model, we specify a linear predictor and link function. Many different specifications are possible. The following specification is common in claims reserving²:

\[ \log(\mu_{ij}) = \eta_{ij}, \qquad \eta_{ij} = c + \alpha_i + \beta_j. \tag{3.2} \]

This structure is used in for example England and Verrall (1999). It models patterns of the data through the mean. In particular, we model the log of the mean. For every accident year i and every development year j a separate parameter is included. The βj’s determine the development pattern: how the mean changes with development year.

Sometimes, variations of the structure in (3.2) are specified. For example, an additional parameter can be included to incorporate a calendar year effect. Furthermore, restrictions can be imposed on the parameters, such as αi = (i − 1)α and βj = (j − 1)β. They imply a particular pattern in the parameters. Because the structure specified in (3.2) is often used in literature, we consider this structure.

The reserve is the sum of future payments originating from past accident years:

\[ R = \sum_{i=2}^{n} \sum_{j=n-i+2}^{n} C_{ij}. \]

However, the Cij’s are random variables and the outcome of the distribution of these random variables is uncertain. Consequently, the reserve is the uncertain outcome of a distribution. An estimate of the reserve is the centre of the reserve distribution: the mean. This is given by

\[ E(R) = \sum_{i=2}^{n} \sum_{j=n-i+2}^{n} E(C_{ij}) = \sum_{i=2}^{n} \sum_{j=n-i+2}^{n} \exp(c + \alpha_i + \beta_j). \]

The theoretical variance of the reserve distribution is obtained from the specified variance of the (independent) payments $C_{ij}$:

\[ \mathrm{Var}(R) = \sum_{i=2}^{n} \sum_{j=n-i+2}^{n} \mathrm{Var}(C_{ij}) = \phi E(R). \]

² Here $\beta_j$ does not refer to $\beta$ defined in Chapter 2. In this case $\beta_j$ represents one parameter for development

3.2 Model estimation

In this section, we describe the model estimation. No distribution satisfies the specification in Equation (3.1). Therefore, we use quasi-likelihood estimation. This method is flexible with respect to the data. The data do not need to be integers and are allowed to contain negative observations. For comparison, the Poisson distribution only allows non-negative integers. First we estimate the model parameters. Using these parameters, we derive the fitted values and predictions of the future payments.

3.2.1 Parameter estimation

The model parameters are not uniquely defined; the parameters $c$, $\alpha_i$ and $\beta_j$ imply the same $\mu_{ij}$ as $c$, $\alpha_i + \delta$ and $\beta_j - \delta$. So several parameter vectors imply the same mean and variance. This is called overparameterization. To avoid overparameterization, we introduce the corner point restriction:

\[ \alpha_1 = \beta_1 = 0. \]

The described model could be written as

\[ E(C_{ij}) = \mu_{ij} = \exp(c)\exp(\alpha_i)\exp(\beta_j). \]

Therefore, this model is called a multiplicative model. Because of the corner point restriction, $\exp(c)$ represents the expectation of $C_{1,1}$. Furthermore, the model specification implies that payments $C_{ij}$ in accident year $i$ are proportional to the first payment of that accident year, $C_{i1}$. This proportionality is represented by $\exp(\beta_j)$. On the other hand, payments $C_{ij}$ in development year $j$ are proportional to the first payment of that development year, $C_{1j}$. This proportionality is represented by $\exp(\alpha_i)$.

From Equation (2.9), we derive the system of estimating equations:

\begin{align*}
\sum_{i=1}^{n} \sum_{j=1}^{n-i+1} \frac{C_{ij} - \mu_{ij}}{\mu_{ij}}\,\frac{\partial\mu_{ij}}{\partial c} &= 0, \\
\sum_{i=1}^{n} \sum_{j=1}^{n-i+1} \frac{C_{ij} - \mu_{ij}}{\mu_{ij}}\,\frac{\partial\mu_{ij}}{\partial\alpha_k} &= 0, \qquad k = 2, \ldots, n, \\
\sum_{i=1}^{n} \sum_{j=1}^{n-i+1} \frac{C_{ij} - \mu_{ij}}{\mu_{ij}}\,\frac{\partial\mu_{ij}}{\partial\beta_k} &= 0, \qquad k = 2, \ldots, n,
\end{align*}

where

\[ \frac{\partial\mu_{ij}}{\partial c} = \mu_{ij}, \qquad \frac{\partial\mu_{ij}}{\partial\alpha_k} = \begin{cases} \mu_{ij} & \text{if } i = k \\ 0 & \text{if } i \neq k \end{cases}, \qquad \frac{\partial\mu_{ij}}{\partial\beta_k} = \begin{cases} \mu_{ij} & \text{if } j = k \\ 0 & \text{if } j \neq k \end{cases}. \]

Substituting these derivatives into the estimating equations and rearranging, we arrive at the following estimating equations:
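Carrying out the substitution explicitly may help. Each derivative is either $\mu_{ij}$ or zero, so the factor $\mu_{ij}$ in the denominator cancels and only sums of $C_{ij} - \mu_{ij}$ over the observed cells remain. The following is a sketch of the resulting system with the fitted totals on the left; the layout is an assumption, only the algebra follows from the derivatives given above.

```latex
\begin{align*}
\sum_{i=1}^{n}\sum_{j=1}^{n-i+1} \mu_{ij} &= \sum_{i=1}^{n}\sum_{j=1}^{n-i+1} C_{ij}, \\
\sum_{j=1}^{n-k+1} \mu_{kj} &= \sum_{j=1}^{n-k+1} C_{kj}, \qquad k = 2, \ldots, n, \\
\sum_{i=1}^{n-k+1} \mu_{ik} &= \sum_{i=1}^{n-k+1} C_{ik}, \qquad k = 2, \ldots, n.
\end{align*}
```

In words, the fitted values reproduce the observed overall total and the observed row and column totals of the triangle. The totals of the first row and first column follow by subtraction.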


The left side of the estimating equations is positive, because of the logarithmic link function. Therefore, positive row and column sums in the development triangle are a restriction of the model.

The standard errors of the parameters are estimated from the Fisher information matrix. The inverse of this matrix gives an estimate of the covariance matrix, and the square roots of its diagonal elements give the estimated standard errors.

From Equation (2.10), we derive the estimate of the dispersion parameter:

\[ \hat\phi = \frac{1}{N - p} \sum_{i=1}^{n} \sum_{j=1}^{n-i+1} \frac{(C_{ij} - \hat\mu_{ij})^{2}}{\hat\mu_{ij}}, \tag{3.4} \]

where $N$ is the number of observations and $p$ the number of estimated parameters. In case of the development triangle, $N = \tfrac{1}{2}n(n + 1)$ and $p = 2n - 1$. The estimates $\hat\mu_{ij}$ are derived in the next section.

3.2.2 Reserve estimate

The estimated payments are obtained by

\[ \widehat{E(C_{ij})} = \hat\mu_{ij} = \exp(\hat c + \hat\alpha_i + \hat\beta_j). \tag{3.5} \]

The fitted values of the observed payments are found for $i = 1, \ldots, n$ and $j = 1, \ldots, n - i + 1$, whereas the predicted future payments are found for $i = 2, \ldots, n$ and $j = n - i + 2, \ldots, n$. From the predicted payments, we estimate the expected reserve:
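In computational terms, the expected-reserve estimate is the sum of the predicted cells below the anti-diagonal of the triangle. A minimal sketch follows; the helper names and the parameter values are hypothetical and are not estimates from any data set in this thesis.

```python
import numpy as np

def expected_payments(c, alpha, beta):
    """n x n matrix with entries mu_ij = exp(c + alpha_i + beta_j)."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    return np.exp(c + alpha[:, None] + beta[None, :])

def future_mask(n):
    """True for future cells.

    With 0-based indices a cell is observed when i + j <= n - 1, which
    matches the 1-based condition j <= n - i + 1 used in the text.
    """
    i, j = np.indices((n, n))
    return i + j > n - 1

def reserve(c, alpha, beta):
    """Sum of the predicted payments over all future cells."""
    mu = expected_payments(c, alpha, beta)
    return mu[future_mask(len(alpha))].sum()

# Hypothetical estimates for a 3 x 3 triangle (alpha_1 = beta_1 = 0)
c_hat = 6.0
alpha_hat = [0.0, 0.1, 0.2]
beta_hat = [0.0, 0.5, -0.3]

R_hat = reserve(c_hat, alpha_hat, beta_hat)
print(round(R_hat, 2))
```

For an $n \times n$ triangle the mask selects $n(n-1)/2$ future cells, one for accident year 2 up to $n - 1$ for accident year $n$.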

3.3 Reserve variability

The expected reserve indicates the centre of the reserve distribution. Besides the expectation, we are interested in the reserve variability. The variability is important if a risk margin is determined for prudence.

We consider the root mean squared error of prediction (RMSEP), also known as the prediction error. We follow the derivation described in England and Verrall (1999), because they derived an analytical expression for the estimate of the prediction error. This is useful in the next chapters. Another method to estimate the prediction error is the bootstrap technique. This is a resampling method. We refer to England and Verrall (1999, 2002) for the bootstrap technique and a comparison between the two different methods to estimate the prediction error.

We consider future payments $C_{ij}$, where $i = 2, \ldots, n$ and $j = n - i + 2, \ldots, n$. The mean squared error of prediction (MSEP) is defined by

\[ \mathrm{MSEP}(\hat\mu_{ij}) = E\bigl[(C_{ij} - \hat\mu_{ij})^{2}\bigr] = E\Bigl[\bigl\{(C_{ij} - E(C_{ij})) - (\hat\mu_{ij} - E(C_{ij}))\bigr\}^{2}\Bigr]. \]

The quasi-likelihood estimates of the parameters are asymptotically normally distributed. Applying the delta method, the claim estimates $\hat\mu_{ij}$ are asymptotically normally distributed with mean $\mu_{ij}$. Explanation of the delta method and a derivation of this result can be found in Appendix B.4. In a finite sample, we have that $E(\hat\mu_{ij}) \approx \mu_{ij} = E(C_{ij})$. Therefore,

\begin{align*}
\mathrm{MSEP}(\hat\mu_{ij}) &\approx E\bigl[\{C_{ij} - E(C_{ij})\}^{2}\bigr] - 2E\bigl[\{C_{ij} - E(C_{ij})\}\{\hat\mu_{ij} - E(\hat\mu_{ij})\}\bigr] + E\bigl[\{\hat\mu_{ij} - E(\hat\mu_{ij})\}^{2}\bigr] \tag{3.6} \\
&= \mathrm{Var}(C_{ij}) - 2\,\mathrm{Cov}(C_{ij}, \hat\mu_{ij}) + \mathrm{Var}(\hat\mu_{ij}).
\end{align*}

We assume independence between future observations and past observations. This results in

\[ \mathrm{MSEP}(\hat\mu_{ij}) \approx \mathrm{Var}(C_{ij}) + \mathrm{Var}(\hat\mu_{ij}). \]

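The variance of $C_{ij}$ is the process variance, reflecting the variability in the data. The variance of $\hat{\mu}_{ij}$ is the estimation variance, reflecting the variability due to estimation. The overdispersed Poisson model assumes the following process variance:

$$
\mathrm{Var}(C_{ij}) = \phi\mu_{ij}.
$$

For the estimation variance, the delta method gives us

$$
\mathrm{Var}(\hat{\mu}_{ij}) \approx \left(\frac{\partial\mu_{ij}}{\partial\eta_{ij}}\right)^2 \mathrm{Var}(\hat{\eta}_{ij}) = \mu_{ij}^2\,\mathrm{Var}(\hat{\eta}_{ij}). \tag{3.7}
$$

This result is derived in Appendix B.4. So the MSEP of the predicted claim amount $\hat{\mu}_{ij}$ can be approximated by

$$
\mathrm{MSEP}(\hat{\mu}_{ij}) \approx \phi\mu_{ij} + \mu_{ij}^2\,\mathrm{Var}(\hat{\eta}_{ij}),
$$

and this is estimated by

$$
\widehat{\mathrm{MSEP}}(\hat{\mu}_{ij}) = \hat{\phi}\hat{\mu}_{ij} + \hat{\mu}_{ij}^2\,\mathrm{Var}(\hat{\eta}_{ij}).
$$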
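As a quick numerical illustration of the two components, take the purely hypothetical values $\hat{\phi} = 50$, $\hat{\mu}_{ij} = 400$ and $\mathrm{Var}(\hat{\eta}_{ij}) = 0.01$ (these are not results from any triangle in this thesis):

$$
\begin{aligned}
\widehat{\mathrm{MSEP}}(\hat{\mu}_{ij}) &= 50 \cdot 400 + 400^2 \cdot 0.01 = 20{,}000 + 1{,}600 = 21{,}600, \\
\widehat{\mathrm{RMSEP}}(\hat{\mu}_{ij}) &= \sqrt{21{,}600} \approx 147.0.
\end{aligned}
$$

With these numbers, the process variance (20,000) dominates the estimation variance (1,600).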

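The MSEP of the reserve estimate can be approximated in the same way. Let the sums below run over the indices $(i, j)$ of the predicted payments, that is, $i = 2, \ldots, n$ and $j = n-i+2, \ldots, n$. Then the estimate of the MSEP of the reserve estimate is obtained by

$$
\widehat{\mathrm{MSEP}}(\widehat{E(R)}) = \sum_{i,j} \hat{\phi}\hat{\mu}_{ij} + \sum_{i,j} \hat{\mu}_{ij}^2\,\mathrm{Var}(\hat{\eta}_{ij}) + 2 \sum_{\substack{i_1,j_1;\ i_2,j_2 \\ (i_1,j_1) \neq (i_2,j_2)}} \hat{\mu}_{i_1j_1}\hat{\mu}_{i_2j_2}\,\mathrm{Cov}(\hat{\eta}_{i_1j_1}, \hat{\eta}_{i_2j_2}), \tag{3.8}
$$

where the last sum counts each pair of distinct future cells once.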


3.4 Example

We illustrate the overdispersed Poisson model in claims reserving with an example. We consider the development triangle of Taylor and Ashe (1983). This triangle is often analysed in literature. The triangle is given in Table 3.2.

                                  Development year
Accident year      1      2      3      4      5      6      7      8      9     10
            1    358    767    611    483    527    574    146    140    227     68
            2    352    884    934  1,183    446    321    528    266    425
            3    291  1,002    926  1,017    751    147    496    280
            4    311  1,108    776  1,562    272    352    206
            5    443    693    992    769    505    471
            6    396    937    847    805    706
            7    441    848  1,131  1,063
            8    359  1,062  1,443
            9    377    987
           10    344

Table 3.2: Development triangle of Taylor and Ashe (×$1,000)

We fit an overdispersed Poisson model to the data. The parameters are estimated with quasi-likelihood estimation. The results are given in Table D.1 in Appendix D.1. For the development years, the first parameters increase and the later parameters decrease. Therefore, we expect smaller payments in later development years. Furthermore, we expect that the payments in accident years two up to and including ten are higher than the payments in the first accident year, because every estimated $\alpha_i$ ($i = 2, \ldots, 10$) is greater than zero. Finally, accident year eight has a relatively large parameter estimate ($\hat{\alpha}_8$) compared with the other $\hat{\alpha}$'s. This is probably caused by the high payment $C_{8,3}$.

The estimated payments and predicted future payments are given in Table D.2 in Appendix D.1. First, we note that $\hat{\mu}_{1,10} = C_{1,10}$ and $\hat{\mu}_{10,1} = C_{10,1}$. This is a consequence of the estimating equations and of the single observation in the last column and in the last row. Furthermore, we can check that the other estimating equations in (3.3) hold.
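The fit described above can be reproduced with a short script. The following is a minimal sketch, not the software used for this thesis: the helper names `design` and `fit_odp` are hypothetical, the data are the rounded values of Table 3.2, and the fit uses plain iteratively reweighted least squares for a log-link model with variance proportional to the mean. It assumes that the estimating equations (3.3) are the usual score equations, so that fitted and observed payments have equal row sums.

```python
import numpy as np

# Incremental payments of Taylor and Ashe (1983), in thousands (Table 3.2).
TRIANGLE = [
    [358, 767, 611, 483, 527, 574, 146, 140, 227, 68],
    [352, 884, 934, 1183, 446, 321, 528, 266, 425],
    [291, 1002, 926, 1017, 751, 147, 496, 280],
    [311, 1108, 776, 1562, 272, 352, 206],
    [443, 693, 992, 769, 505, 471],
    [396, 937, 847, 805, 706],
    [441, 848, 1131, 1063],
    [359, 1062, 1443],
    [377, 987],
    [344],
]

def design(n):
    """Design matrix for all n*n cells: columns c, alpha_2..n, beta_2..n."""
    X = np.zeros((n * n, 2 * n - 1))
    for i in range(n):
        for j in range(n):
            X[i * n + j, 0] = 1.0
            if i > 0:
                X[i * n + j, i] = 1.0
            if j > 0:
                X[i * n + j, n - 1 + j] = 1.0
    return X

def fit_odp(y, X, iters=50):
    """Quasi-likelihood fit by iteratively reweighted least squares."""
    eta = np.log(y)                      # start at the data (requires y > 0)
    for _ in range(iters):
        mu = np.exp(eta)
        z = eta + (y - mu) / mu          # working response
        gamma = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (mu * z))
        eta = X @ gamma
    return gamma

n = len(TRIANGLE)
cells = [(i, j) for i in range(n) for j in range(n - i)]      # observed cells
y = np.array([TRIANGLE[i][j] for i, j in cells], dtype=float)
X_all = design(n)
X_obs = X_all[[i * n + j for i, j in cells]]

gamma_hat = fit_odp(y, X_obs)
mu_hat = np.exp(X_all @ gamma_hat).reshape(n, n)    # fitted and predicted payments
is_future = np.add.outer(np.arange(n), np.arange(n)) >= n
reserve = mu_hat[is_future].sum()

print("corner fits:", mu_hat[0, n - 1], mu_hat[n - 1, 0])
print("estimated total reserve (x 1,000):", round(reserve))
```

Because the table is rounded to thousands, the reserve printed by this sketch need not match figures computed from unrounded data.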


Chapter 4

Influential observations: an introduction

In the previous chapter, we described the estimation of the overdispersed Poisson model. Generally, after an estimation procedure, diagnostic checks are performed to establish whether the model fits well. In claims reserving, it is not common to additionally assess the influence of observed payments on the estimates and predictions. However, knowledge of the influence of payments will provide useful information.

Belsley et al. (1980) describe four sources of influential observations in general datasets. First, data could be reported incorrectly. Second, errors in the data could also be caused by observational errors. Third, extreme events often result in outlying observations. These observations could be influential. Belsley et al. (1980) note that such observations usually contain important information. Although the presence of these data points could improve the model, it is important to determine their influence on the estimates. Fourth, influential observations could be generated by a different process than the other observations.

In practice, experts in claims reserving detect remarkable payments by close inspection of the data. These remarkable payments could have large influence on the model predictions and need special attention. However, this detection method is very subjective. A systematic assessment of influence on predictions could aid the expert in the detection of influential payments. Influence assessment complements the expert’s opinion in three ways. First, the influence measure could confirm the expert’s suspicions. Second, the influence measure could identify influential payments that are not remarkable. These payments are not detected by expert judgement. Third, the influence measure could contradict the expert’s suspicions. Remarkable payments that are not influential need no special attention; we do not have to put effort into studying the payment in detail. Besides, the influence measure could give important insights into the model.


4.1 Examples in the linear model

We indicate possible situations in which observations have a large influence. For simplicity, we illustrate these situations with a linear model. The same situations can occur in generalized linear models. In Figure 4.1, we show some interesting situations.

[Figure 4.1, panels (a) to (d): scatter plots of y (0 to 500) against x (0 to 40), each with the fitted regression line.]

Figure 4.1: Influential observations in linear models

The observations are indicated by dots and the line shows the fitted linear model to the data. In addition, the estimated model parameters and a model prediction at x = 50 are given in Table 4.1.

             Intercept       Slope         Prediction at x = 50
Figure (a)   103.1 (1.9)      9.9 (0.2)    596.3 (6.7)
Figure (b)   117.1 (28.2)     9.9 (2.4)    610.3 (97.6)
Figure (c)    86.4 (17.5)    12.5 (1.5)    711.5 (60.7)
Figure (d)   103.1 (1.6)      9.9 (0.1)    596.3 (4.4)

Table 4.1: Influential observations in linear models

The values in parentheses are the estimated standard errors.

In Figure 4.1(a), we observe no atypical observation. The model fits well to the data and the standard errors of the estimated parameters are small.


In Figure 4.1(c), we observe a large observation at the last observed x-value (x = 20). Consequently, the intercept and slope estimates are affected: the intercept decreases and the slope increases. Furthermore, the standard errors are larger than in the model in Figure (a). Finally, we observe a large increase in the prediction.

In Figure 4.1(d), we observe an additional observation at x = 40. This observation fits exactly to the estimated model in Figure (a). Therefore, the estimated parameters are not affected. Also the predictions are similar. However, we observe a reduction of the standard errors, because the additional observation shows more evidence of the linear trend.

In conclusion, the influence of an observation depends on the function under consideration. An outlying observation can affect one function, while another function is not affected. For example, the intercept estimate could be affected, while the slope estimate remains the same. Furthermore, an observation does not have to be outlying in the y-value to be influential. For example, the last observation in Figure (d) fits the model exactly, but this observation does influence the estimated standard errors. As the influence could depend on the function, we consider the influence on the reserve and on the reserve variability separately.
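The situations of panels (c) and (d) are easy to reproduce numerically. The sketch below uses hypothetical data (a line $y = 100 + 10x$ with small alternating noise), not the data behind Figure 4.1, and a hypothetical helper `ols`; it is only meant to show the qualitative effects described above.

```python
import numpy as np

def ols(x, y):
    """Least-squares line y = a + b*x with standard errors of (a, b)."""
    A = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.solve(A.T @ A, A.T @ y)
    resid = y - A @ beta
    sigma2 = resid @ resid / (len(x) - 2)
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    return beta, se

def predict(beta, x0=50.0):
    """Point prediction of the fitted line at x0."""
    return beta[0] + beta[1] * x0

# Hypothetical data: x = 0, ..., 20 and a noisy straight line.
x = np.arange(21, dtype=float)
y = 100 + 10 * x + 5 * (-1) ** np.arange(21)
base, se_base = ols(x, y)

# As in panel (c): the last observation is much too large.
y_out = y.copy()
y_out[-1] += 200
out, se_out = ols(x, y_out)

# As in panel (d): one extra observation exactly on the fitted line.
x_add = np.append(x, 40.0)
y_add = np.append(y, predict(base, 40.0))
add, se_add = ols(x_add, y_add)

for name, b, se in [("base", base, se_base), ("outlier", out, se_out), ("extra", add, se_add)]:
    print(f"{name:8s} intercept={b[0]:7.2f} slope={b[1]:6.3f} "
          f"pred(50)={predict(b):7.2f} se={np.round(se, 3)}")
```

In this sketch the outlier lowers the intercept and raises the slope and the prediction, while the extra point on the line leaves the estimates unchanged and only shrinks the standard errors.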

4.2

Overview of literature

Different influence measures are described in the literature. Most articles focus on outputs of the model estimation, such as parameter estimates, fitted values and standard errors. Two approaches are often used to assess the influence on these estimates. The first approach measures the effect of deleting an observation, see for example Belsley et al. (1980) and Pregibon (1981). The second approach measures the effect of small perturbations of (elements of) the data.

Cook (1986) considers local normal curvature to assess the influence of model perturbations on parameter estimates. The behaviour around the point where no perturbation takes place is considered. Cook measures the effect of model perturbations on parameter estimates by the likelihood displacement function. The first derivative of this function in the point where no perturbation takes place equals zero, because the function achieves a local minimum in this point. Therefore, Cook uses the second derivative.

Thomas and Cook (1990) apply the method of Cook to assess the influence of model perturbations on point predictions. Because the likelihood displacement function is not defined for predictions, they consider another objective function: a squared scaled residual. The residuals measure the difference in point predictions from the modified and unmodified data. Again, the objective function achieves a minimum at the point where no perturbation takes place. Consequently, the first derivative gives no information about the change caused by the perturbation. Therefore, they apply the local normal curvature method too.

Fung and Kwan (1997) investigate some properties of the local normal curvature method. They conclude that the results are not scale invariant if the first derivative of the objective function is unequal to zero. In this case, the conclusions from the local normal curvature method could be different if the objective function is multiplied by a scalar. Fung and Kwan suggest using the first derivative of the objective function for investigating local influence if this derivative is non-zero. Furthermore, they conclude that, in general, this measure is scale invariant.


Chapter 5

Influence on reserve

We are interested in the influence of payments on the estimated total reserve. In this chapter, we consider methods to measure this influence. First, we consider the effect of model perturbations and second, we consider the effect of case deletion. Furthermore, we illustrate the theory with an example, using the triangle of Taylor and Ashe (1983).

5.1 Perturbations

We consider the influence of local model perturbations on the estimated total reserve. Suppose we modify the payments or the model specification of the payments by a perturbation vector $\omega = (\omega_{1,1}, \omega_{1,2}, \ldots, \omega_{ij}, \ldots, \omega_{n,1})$, where $i = 1, \ldots, n$, $j = 1, \ldots, n-i+1$ and $\omega_{ij}$ corresponds to payment $C_{ij}$. Denote the perturbation vector that implies no perturbation in the data or model specification by $\omega_0$. Furthermore, let $\hat{\mu}_{ij}(\omega)$ denote the estimates based on the model including the perturbation. Consequently, we have $\hat{\mu}_{ij}(\omega_0) = \hat{\mu}_{ij}$.

The perturbation vector can be written as $\omega = \omega_0 + \lambda l$, where $l = (l_{1,1}, l_{1,2}, \ldots, l_{ij}, \ldots, l_{n,1})$ is a direction vector. This can be seen as moving away from $\omega_0$ in the direction of $l$ with magnitude $\lambda$. Usually, $l$ represents a unit direction vector.

We are interested in the question: in which direction $l$ at $\omega_0$ do we obtain the largest change in the estimate of the total reserve, and how large is this change? To answer this question, we consider the estimated reserve under different perturbation schemes. We move along $\omega = \omega_0 + \lambda l$ for different direction vectors and we observe the estimated total reserve. This is described by the $(N+1) \times 1$ dimensional surface

$$
\begin{pmatrix} \omega \\ \widehat{E(R)}(\omega) \end{pmatrix}.
$$

The change in the estimated total reserve, caused by the direction vector $l$, is measured by the directional derivative. The following definition of the directional derivative is given in Marsden and Tromba (2003):

Definition 1. The directional derivative of the function $f$ at the point $x$ along the vector $v$ is given by

$$
\left.\frac{d}{dt} f(x + tv)\right|_{t=0}.
$$

In the definition of a directional derivative, we normally choose v to be a unit vector.


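The directional derivative of $\widehat{E(R)}(\omega)$ with respect to $\lambda$, evaluated in $\lambda = 0$, is obtained by

$$
\left.\frac{\partial\widehat{E(R)}(\omega)}{\partial\lambda}\right|_{\lambda=0} = \sum_{i=1}^{n}\sum_{j=1}^{n-i+1} \left(\left.\frac{\partial\widehat{E(R)}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}\right) l_{ij} = \left\langle \nabla\widehat{E(R)}(\omega), l \right\rangle,
$$

where $\nabla\widehat{E(R)}(\omega)$ is the gradient of $\widehat{E(R)}(\omega)$, $\langle \cdot, \cdot \rangle$ denotes the inner product and

$$
\begin{aligned}
\left.\frac{\partial\widehat{E(R)}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}
&= \sum_{r=2}^{n}\sum_{s=n-r+2}^{n} \left[\exp\left(\hat{c}(\omega) + \hat{\alpha}_r(\omega) + \hat{\beta}_s(\omega)\right)\left(\frac{\partial\hat{c}(\omega)}{\partial\omega_{ij}} + \frac{\partial\hat{\alpha}_r(\omega)}{\partial\omega_{ij}} + \frac{\partial\hat{\beta}_s(\omega)}{\partial\omega_{ij}}\right)\right]_{\omega=\omega_0} \\
&= \sum_{r=2}^{n}\sum_{s=n-r+2}^{n} \hat{\mu}_{r,s}\left.\left(\frac{\partial\hat{c}(\omega)}{\partial\omega_{ij}} + \frac{\partial\hat{\alpha}_r(\omega)}{\partial\omega_{ij}} + \frac{\partial\hat{\beta}_s(\omega)}{\partial\omega_{ij}}\right)\right|_{\omega=\omega_0}.
\end{aligned}
\tag{5.1}
$$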

So we only need to obtain the partial derivatives of the parameters with respect to $\omega$. To simplify notation, we first collect the model parameters in the $p \times 1$ vector $\gamma = (c, \alpha_2, \ldots, \alpha_n, \beta_2, \ldots, \beta_n)$.

We follow the article of Cook (1986) to derive the derivatives of the parameters. We use the quasi-likelihood equations to obtain these derivatives. For the quasi-likelihood function, we have

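$$
\left.\frac{\partial Q(\omega)}{\partial\gamma_q}\right|_{\gamma=\hat{\gamma}(\omega)} = 0, \qquad q = 1, \ldots, p,
$$

for all perturbation schemes $\omega$. In matrix notation, we have

$$
\left.\frac{\partial Q(\omega)}{\partial\gamma}\right|_{\gamma=\hat{\gamma}(\omega)} = \left.\left(\frac{\partial Q(\omega)}{\partial c}, \frac{\partial Q(\omega)}{\partial\alpha_2}, \cdots, \frac{\partial Q(\omega)}{\partial\alpha_n}, \frac{\partial Q(\omega)}{\partial\beta_2}, \cdots, \frac{\partial Q(\omega)}{\partial\beta_n}\right)^{T}\right|_{\gamma=\hat{\gamma}(\omega)} = (0, 0, \cdots, 0, 0, \cdots, 0)^{T}.
$$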

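Differentiating both sides with respect to $\omega$ and evaluating in $\omega = \omega_0$, we obtain

$$
\left.\frac{\partial^2 Q(\omega)}{\partial\gamma\,\partial\gamma'}\right|_{\gamma=\hat{\gamma},\ \omega=\omega_0} \frac{\partial\hat{\gamma}(\omega)}{\partial\omega} + \left.\frac{\partial^2 Q(\omega)}{\partial\gamma\,\partial\omega}\right|_{\gamma=\hat{\gamma},\ \omega=\omega_0} = 0. \tag{5.2}
$$

Define

$$
\ddot{Q} = \left.\frac{\partial^2 Q(\omega)}{\partial\gamma\,\partial\gamma'}\right|_{\gamma=\hat{\gamma},\ \omega=\omega_0} \quad\text{and}\quad \Delta = \left.\frac{\partial^2 Q(\omega)}{\partial\gamma\,\partial\omega}\right|_{\gamma=\hat{\gamma},\ \omega=\omega_0}.
$$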

We recognize that $-E(\ddot{Q})$ is the expected information matrix. Finally, we obtain from Equation (5.2) the derivatives of the parameters with respect to the perturbations:

$$
\ddot{Q}\,\frac{\partial\hat{\gamma}(\omega)}{\partial\omega} + \Delta = 0 \quad\Longrightarrow\quad \frac{\partial\hat{\gamma}(\omega)}{\partial\omega} = -\ddot{Q}^{-1}\Delta. \tag{5.3}
$$

$\ddot{Q}$ is derived from the original model and does not depend on the different perturbations. $\Delta$ varies with the chosen perturbation. Now, we have the ingredients to measure the influence of perturbations on the total reserve. Next, a theorem from Marsden and Tromba (2003) provides the direction of the fastest change in the estimated total reserve:


The unit direction vector that produces the largest increase in the total reserve is given by

$$
l_{\max} = \frac{\dot{R}}{\|\dot{R}\|} = \frac{\dot{R}}{\sqrt{\dot{R}'\dot{R}}}, \tag{5.4}
$$

where $\dot{R} = \left.\dfrac{\partial\widehat{E(R)}(\omega)}{\partial\omega}\right|_{\omega=\omega_0}$. Furthermore, $-l_{\max}$ produces the largest decrease in the total reserve. In conclusion, large absolute elements in $\dot{R}$ (or $l_{\max}$) indicate perturbations that have a large effect on the reserve. Furthermore, the elements in $\dot{R}$ provide a measure of the change of the total reserve after a perturbation.

In the next section, we develop the method for different perturbations. First, we consider perturbations of the payment itself by a small addition. Second, we introduce case weights and we consider perturbations of these weights.

5.1.1 Perturbations of payments

We consider the effect of perturbations of the observed payments on the total reserve. The results of the influence measure will provide information on the stability of the estimated total reserve: is the estimated total reserve sensitive to small fluctuations of the observed data?

Renshaw (1989) considers the log-normal model in claims reserving. As in the overdispersed Poisson model, Renshaw includes a parameter for every accident year and development year. He pays attention to predictor stability in this model. He concludes that instability of a predicted value depends on the number of parameters used to make this prediction and on the sensitivity of these parameters to fluctuations in the data. Renshaw performed a simulation study, and concluded: “...that predictions are sufficiently robust to data fluctuations in the heart of and in the north-west corner of the run-off triangle; and that stability deteriorates as data points further into the other two corners of the run-off triangle are varied.” We investigate whether these conclusions apply to the overdispersed Poisson model.

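We modify payment $C_{ij}$ by $C_{ij}(\omega) = C_{ij} + \sqrt{\hat{\phi}\hat{\mu}_{ij}}\,\omega_{ij}$, where $\hat{\phi}$ and $\hat{\mu}_{ij}$ are estimates from the original model. The perturbation by addition is suggested by Thomas and Cook (1990). The perturbation $\omega_{ij}$ is scaled by an estimate of the standard deviation of the payment, because each payment has a different variance. Note that we have $\omega_0 = (0, \ldots, 0)$. The estimating equations for the modified payments are given by

$$
\frac{\partial Q(\omega)}{\partial\gamma_q} = \sum_{i=1}^{n}\sum_{j=1}^{n-i+1} \frac{C_{ij} + \sqrt{\hat{\phi}\hat{\mu}_{ij}}\,\omega_{ij} - \mu_{ij}}{\phi\mu_{ij}}\,\frac{\partial\mu_{ij}}{\partial\gamma_q} = 0.
$$

The $(q, r)$th element of the matrix $\Delta$ is obtained by

$$
\begin{aligned}
\Delta_{q,r} = \left.\frac{\partial^2 Q(\omega)}{\partial\gamma_q\,\partial\omega_r}\right|_{\gamma=\hat{\gamma},\ \omega=\omega_0}
&= \left.\frac{\partial}{\partial\omega_r}\left(\sum_{i=1}^{n}\sum_{j=1}^{n-i+1} \frac{C_{ij} + \sqrt{\hat{\phi}\hat{\mu}_{ij}}\,\omega_{ij} - \mu_{ij}}{\phi\mu_{ij}}\,\frac{\partial\mu_{ij}}{\partial\gamma_q}\right)\right|_{\gamma=\hat{\gamma},\ \omega=\omega_0} \\
&= \frac{1}{\sqrt{\hat{\phi}\hat{\mu}_r}}\left.\frac{\partial\mu_r}{\partial\gamma_q}\right|_{\gamma=\hat{\gamma},\ \omega=\omega_0},
\end{aligned}
\tag{5.5}
$$

where $\mu_r$ is the $r$th element of $\mu = (\mu_{1,1}, \mu_{1,2}, \ldots, \mu_{ij}, \ldots, \mu_{n,1})$.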
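To make the chain of Equations (5.1), (5.3), (5.4) and (5.5) concrete, the following sketch computes $\dot{R}$ and $l_{\max}$ for payment perturbation and compares the analytical derivatives with finite differences of the refitted reserve. It is an illustration under stated assumptions, not the implementation used in this thesis: the 4 × 4 triangle is hypothetical, `design` and `fit_odp` are hypothetical helper names, and the dispersion is estimated with the Pearson statistic divided by the degrees of freedom, which is an assumption about the estimator.

```python
import numpy as np

# Hypothetical 4x4 triangle of incremental payments (illustration only).
TRIANGLE = [[100, 60, 30, 10], [110, 70, 28], [95, 66], [120]]

def design(n):
    """Design matrix for all n*n cells: columns c, alpha_2..n, beta_2..n."""
    X = np.zeros((n * n, 2 * n - 1))
    for i in range(n):
        for j in range(n):
            X[i * n + j, 0] = 1.0
            if i > 0:
                X[i * n + j, i] = 1.0
            if j > 0:
                X[i * n + j, n - 1 + j] = 1.0
    return X

def fit_odp(y, X, iters=50):
    """Quasi-likelihood fit by iteratively reweighted least squares."""
    eta = np.log(y)                      # start at the data (requires y > 0)
    for _ in range(iters):
        mu = np.exp(eta)
        z = eta + (y - mu) / mu
        gamma = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (mu * z))
        eta = X @ gamma
    return gamma

n = len(TRIANGLE)
cells = [(i, j) for i in range(n) for j in range(n - i)]
y = np.array([TRIANGLE[i][j] for i, j in cells], dtype=float)
X_all = design(n)
obs = np.array([i * n + j for i, j in cells])
fut = np.setdiff1d(np.arange(n * n), obs)
X, X_fut = X_all[obs], X_all[fut]

def reserve(y_obs):
    """Estimated total reserve after refitting the model to y_obs."""
    return np.exp(X_fut @ fit_odp(y_obs, X)).sum()

gamma = fit_odp(y, X)
mu, mu_fut = np.exp(X @ gamma), np.exp(X_fut @ gamma)
phi = np.sum((y - mu) ** 2 / mu) / (len(y) - X.shape[1])   # Pearson estimate (assumption)
scale = np.sqrt(phi * mu)                                  # estimated sd of each payment

# (5.3): d gamma / d omega = -Qdd^{-1} Delta, with Qdd = -X'VX / phi
# and Delta[q, r] = (1 / sqrt(phi * mu_r)) * d mu_r / d gamma_q as in (5.5).
Qdd = -(X.T @ (mu[:, None] * X)) / phi
Delta = X.T * (mu / scale)
dgamma = -np.linalg.solve(Qdd, Delta)
R_dot = (mu_fut @ X_fut) @ dgamma            # (5.1), one element per payment
l_max = R_dot / np.linalg.norm(R_dot)        # (5.4)

# Central finite differences of the refitted reserve as a check.
eps = 1e-4
fd = np.empty(len(y))
for r in range(len(y)):
    step = np.zeros(len(y))
    step[r] = eps * scale[r]
    fd[r] = (reserve(y + step) - reserve(y - step)) / (2 * eps)

for (i, j), a, b in zip(cells, R_dot, fd):
    print(f"C[{i + 1},{j + 1}]: analytical {a:9.4f}  finite difference {b:9.4f}")
```

The agreement between the two columns is a useful sanity check before applying the formulas to a full-size triangle.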


5.1.2 Perturbations of case weights

The overdispersed Poisson model enables the assignment of different weights to payments. The quasi-likelihood estimating equations, including weights, are given by

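$$
\frac{\partial Q(\omega)}{\partial\gamma_q} = \sum_{i=1}^{n}\sum_{j=1}^{n-i+1} \frac{\omega_{ij}(C_{ij} - \mu_{ij})}{\phi\mu_{ij}}\,\frac{\partial\mu_{ij}}{\partial\gamma_q} = 0, \qquad q = 1, \ldots, p. \tag{5.6}
$$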

Thomas and Cook (1990) give some interpretations of case weight perturbation. First, it extends case deletion. With case deletion, weight $\omega_{ij}$ is set equal to zero, such that payment $C_{ij}$ is not involved in the model estimation. Second, varying the weight of a payment varies its importance in the estimation. If we increase weight $\omega_{ij}$, the difference between the observed payment $C_{ij}$ and the estimate $\hat{\mu}_{ij}$ decreases. Third, a weight alters the variance of a payment. The variance definition becomes $\mathrm{Var}(C_{ij}) = \frac{\phi}{\omega_{ij}}\mu_{ij}$. Attaching a larger weight to a payment results in a smaller variance.

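In the original model, every observation has weight one, so $\omega_0 = (1, \ldots, 1)$. From the estimating equations in (5.6), we derive the elements of the matrix $\Delta$:

$$
\begin{aligned}
\Delta_{q,r} = \left.\frac{\partial^2 Q(\omega)}{\partial\gamma_q\,\partial\omega_r}\right|_{\gamma=\hat{\gamma},\ \omega=\omega_0}
&= \left.\frac{\partial}{\partial\omega_r}\left(\sum_{i=1}^{n}\sum_{j=1}^{n-i+1} \frac{\omega_{ij}(C_{ij} - \mu_{ij})}{\phi\mu_{ij}}\,\frac{\partial\mu_{ij}}{\partial\gamma_q}\right)\right|_{\gamma=\hat{\gamma},\ \omega=\omega_0} \\
&= \left.\frac{C_r - \mu_r}{\phi\mu_r}\,\frac{\partial\mu_r}{\partial\gamma_q}\right|_{\gamma=\hat{\gamma}},
\end{aligned}
\tag{5.7}
$$

where $C_r$ is the $r$th element of $(C_{1,1}, C_{1,2}, \ldots, C_{ij}, \ldots, C_{n,1})$.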

Again, we derive the directional derivative from Equation (5.3), using the derivation of $\Delta$ in Equation (5.7). Furthermore, we obtain the direction of maximum change $l_{\max}$ by using Equation (5.4).

Special case

Suppose we analyse a triangle that is precisely fitted by an overdispersed Poisson model. Consequently, the fitted values equal the observed claims ($\hat{\mu}_{ij} = C_{ij}$) and the residuals equal zero. This results in the following elements of the matrix $\Delta$ in case of case weight perturbation:

$$
\Delta_{q,r} = \left.\frac{C_r - \mu_r}{\phi\mu_r}\,\frac{\partial\mu_r}{\partial\gamma_q}\right|_{\gamma=\hat{\gamma}} = \frac{C_r - \hat{\mu}_r}{\hat{\phi}\hat{\mu}_r}\left.\frac{\partial\mu_r}{\partial\gamma_q}\right|_{\gamma=\hat{\gamma}} = 0.
$$

The matrix ∆ consists of only zeros and varying case weights has no effect on the parameters. Consequently, varying case weights causes no change in the estimated total reserve. Therefore, a zero effect could serve as a benchmark in the influence assessment. If a payment perfectly agrees with the pattern in the other payments, the influence on the reserve will be small. This is reflected in the zero effect of perturbing case weights.
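The zero benchmark can be checked numerically. The sketch below builds a hypothetical triangle that is exactly multiplicative ($C_{ij} = a_i b_j$, with made-up levels `a` and pattern `b`), refits the model with one case weight doubled, and compares the reserves. The helpers `design` and `fit_odp` are hypothetical names, and the weighted fit simply multiplies the working weights by $\omega_{ij}$, as in the estimating equations (5.6).

```python
import numpy as np

def design(n):
    """Design matrix for all n*n cells: columns c, alpha_2..n, beta_2..n."""
    X = np.zeros((n * n, 2 * n - 1))
    for i in range(n):
        for j in range(n):
            X[i * n + j, 0] = 1.0
            if i > 0:
                X[i * n + j, i] = 1.0
            if j > 0:
                X[i * n + j, n - 1 + j] = 1.0
    return X

def fit_odp(y, X, w, iters=50):
    """Weighted quasi-likelihood fit by iteratively reweighted least squares."""
    eta = np.log(y)                      # start at the data (requires y > 0)
    for _ in range(iters):
        mu = np.exp(eta)
        W = w * mu                       # case weight times variance function
        z = eta + (y - mu) / mu
        gamma = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
        eta = X @ gamma
    return gamma

n = 4
a = np.array([1.0, 1.2, 0.9, 1.1])         # hypothetical accident-year levels
b = np.array([100.0, 60.0, 30.0, 10.0])    # hypothetical development pattern
cells = [(i, j) for i in range(n) for j in range(n - i)]
y = np.array([a[i] * b[j] for i, j in cells])   # perfectly fitted by the model

X_all = design(n)
obs = np.array([i * n + j for i, j in cells])
fut = np.setdiff1d(np.arange(n * n), obs)
X, X_fut = X_all[obs], X_all[fut]

def reserve(w):
    """Estimated total reserve for a given vector of case weights."""
    return np.exp(X_fut @ fit_odp(y, X, w)).sum()

w0 = np.ones(len(y))
w = w0.copy()
w[cells.index((1, 1))] = 2.0               # double the weight of one payment
print("reserve with unit weights:   ", reserve(w0))
print("reserve with one weight at 2:", reserve(w))
```

Any payment could be chosen here; because all residuals are zero, changing a weight leaves the score equations satisfied at the same parameter values.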


5.2 Case deletion

Case deletion is a restricted version of case weight perturbation. The weights are restricted to zero and one. If the weight of a payment equals one, the payment is involved in the estimation. If the weight of a payment equals zero, the payment is not involved in the estimation. We consider case deletion as an alternative to case weight perturbation.

We estimate the overdispersed Poisson model twice. First, we estimate the model including the payment, resulting in the reserve estimate $\widehat{E(R)}(\omega_0)$. Next, we estimate the model excluding the payment. This results in the reserve estimate $\widehat{E(R)}(\omega_1)$, where $\omega_1 = (1, \ldots, 1, 0, 1, \ldots, 1)$ and the zero corresponds to the deleted payment. The following difference is a measure of influence:

$$
\widehat{E(R)}(\omega_1) - \widehat{E(R)}(\omega_0).
$$

We describe the relation between the results from case weight perturbation and case deletion. Suppose we vary the weight ωij of one payment, while the weights of the other payments equal one. The rate of change of the reserve caused by the case weight perturbation is measured by the derivative in ωij = 1. Suppose that the relation between the weight ωij and the change in the reserve is linear. In that case, the derivative represents the change in the reserve estimate caused by increasing the weight from one to two. On the other hand, the negative of the derivative represents the change in the reserve estimate, caused by decreasing the weight from one to zero; that is case deletion. Concluding, if the relation between the weight ωij and the estimated reserve is linear, the following relation holds:

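$$
\left.\frac{\partial\widehat{E(R)}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0} = -\left(\widehat{E(R)}(\omega_1) - \widehat{E(R)}(\omega_0)\right).
$$

If the relation between the weight and the estimated reserve is non-linear, the results of case deletion and case weight perturbation could be different.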

As with case weight perturbation, the two payments $C_{1,n}$ and $C_{n,1}$ cause problems. Payment $C_{1,n}$ completely determines $\hat{\beta}_n$ and $C_{n,1}$ completely determines $\hat{\alpha}_n$. Therefore, it is impossible to estimate these parameters if the corresponding corner point is deleted. An alternative approach for these payments is described in Chapter 7.
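The comparison between case deletion and the case weight derivative can be sketched as follows. The example uses a hypothetical 4 × 4 triangle and hypothetical helper names (`design`, `fit_odp`); deletion is implemented as a zero case weight, and the case weight derivative is approximated by a central finite difference rather than by Equation (5.7). An interior payment is deleted, because deleting a corner payment would leave a parameter unidentified.

```python
import numpy as np

# Hypothetical 4x4 triangle of incremental payments (illustration only).
TRIANGLE = [[100, 60, 30, 10], [110, 70, 28], [95, 66], [120]]

def design(n):
    """Design matrix for all n*n cells: columns c, alpha_2..n, beta_2..n."""
    X = np.zeros((n * n, 2 * n - 1))
    for i in range(n):
        for j in range(n):
            X[i * n + j, 0] = 1.0
            if i > 0:
                X[i * n + j, i] = 1.0
            if j > 0:
                X[i * n + j, n - 1 + j] = 1.0
    return X

def fit_odp(y, X, w=None, iters=50):
    """Weighted quasi-likelihood fit by iteratively reweighted least squares."""
    w = np.ones_like(y) if w is None else w
    eta = np.log(y)                      # start at the data (requires y > 0)
    for _ in range(iters):
        mu = np.exp(eta)
        W = w * mu
        z = eta + (y - mu) / mu
        gamma = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
        eta = X @ gamma
    return gamma

n = len(TRIANGLE)
cells = [(i, j) for i in range(n) for j in range(n - i)]
y = np.array([TRIANGLE[i][j] for i, j in cells], dtype=float)
X_all = design(n)
obs = np.array([i * n + j for i, j in cells])
fut = np.setdiff1d(np.arange(n * n), obs)
X, X_fut = X_all[obs], X_all[fut]

def reserve(w):
    """Estimated total reserve for a given vector of case weights."""
    return np.exp(X_fut @ fit_odp(y, X, w)).sum()

r = cells.index((1, 1))                  # an interior payment, not a corner
w0 = np.ones(len(y))
w1 = w0.copy()
w1[r] = 0.0
deletion = reserve(w1) - reserve(w0)     # influence measure of case deletion

# Removing the row from the data is equivalent to a zero weight.
keep = np.arange(len(y)) != r
dropped = np.exp(X_fut @ fit_odp(y[keep], X[keep])).sum()

# Slope of the reserve in the case weight at omega = 1 (central difference).
eps = 1e-5
up, down = w0.copy(), w0.copy()
up[r] += eps
down[r] -= eps
slope = (reserve(up) - reserve(down)) / (2 * eps)

print("negative of deletion effect:", -deletion)
print("case weight derivative:     ", slope)
```

The two printed numbers coincide only if the reserve is linear in the weight; any gap between them reflects the non-linearity discussed above.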

5.3 Scaling

Suppose the total reserve increases by one thousand after deleting a payment. This increase is more serious if the original reserve was five thousand than if it was ten thousand. Likewise, it is more serious if the reserve variability was one hundred than if it was one thousand. Therefore, we prefer to scale the results of the influence measurement. Denote the influence measure for payment $C_{ij}$ by $\Delta E(R)_{ij}$. Two possible scaled measures are:

$$
\frac{\Delta E(R)_{ij}}{\widehat{E(R)}} \quad\text{and}\quad \frac{\Delta E(R)_{ij}}{\widehat{\mathrm{RMSEP}}}.
$$

However, a payment that has a large influence on the reserve could also influence the RMSEP. Therefore, the second scaled measure could misrepresent the influence. If the influence measure is scaled by the RMSEP, we suggest using the estimated RMSEP obtained from the model excluding payment $C_{ij}$.
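With the illustrative numbers from the start of this section, an influence of $\Delta E(R)_{ij} = 1{,}000$ gives the following scaled values:

$$
\frac{1{,}000}{5{,}000} = 0.20, \qquad \frac{1{,}000}{10{,}000} = 0.10, \qquad \frac{1{,}000}{100} = 10, \qquad \frac{1{,}000}{1{,}000} = 1.
$$

The first two values scale by the reserve, the last two by the RMSEP. In both cases the smaller denominator yields the larger scaled influence.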


5.4 Example

To illustrate the theory, we consider payment perturbation, case weight perturbation and case deletion for the triangle of Taylor and Ashe (1983).

Payment perturbation

First, we consider Figure D.1 in the appendix. Every graph shows the relation between the perturbation of a payment in the triangle and the change in the reserve (the change is expressed as a percentage of the original reserve). We observe a particular pattern in the graphs: in the first development years the graphs are decreasing and in later years the graphs are increasing.

The effect of payment perturbation on the reserve is measured by the derivatives of these graphs in ω = ω0, defined in Sections 5.1 and 5.1.1. Figure 5.1 represents these derivatives for every payment in the triangle, scaled by the original reserve. Furthermore, lmax is given.

[Figure 5.1, two heat maps over accident year (1 to 10) and development year (1 to 10). First panel: payment perturbations scaled by the original reserve (colour scale −0.08 to 0.08). Second panel: L_max for payment perturbation (colour scale −0.6 to 0.6).]

Figure 5.1: Results for payment perturbation

Apparently, the location of the payment in the triangle is important. The reserve is especially sensitive to perturbations of payments in the corners of the triangle. Large payments in the north-west corner of the triangle have a decreasing effect on the reserve and large payments in the north-east and south-west corner have an increasing effect on the reserve. The results in the north-east and south-west corner are probably a result of the model specification. These two payments completely determine the parameters $\beta_{10}$ and $\alpha_{10}$, and these parameters determine to a large extent the future payments in the last row and last column. Because the payments in the last row are in general larger than the payments in the last column, the effect of perturbing the payment in the south-west corner is large.

We investigate the large influence of $C_{10,1}$ in more detail. First, we obtain the maximum possible change in the reserve, which is caused by the unit direction vector $l_{\max}$. The observed payments are modified by $C_{ij}(\omega) = C_{ij} + \sqrt{\hat{\phi}\hat{\mu}_{ij}}\; l^{\max}_{ij}$. The estimated total reserve increases by 2,903 (×1,000). Next, suppose we only modify $C_{10,1}$ by applying the unit direction vector $l = (0, \ldots, 0, 1)$. Consequently, only payment $C_{10,1}$ is increased, by $\sqrt{\hat{\phi}\hat{\mu}_{10,1}}$.


Case weight perturbation

Next, we consider the effect of case weight perturbation. First, we note that we search for influential payments by mutual comparison. However, this will not answer the question: is the largest influence really large? Therefore, we call the payments with the largest influences ‘potentially influential’. In Chapter 8, we determine a boundary between influential and not influential. To indicate the effect of varying one case weight, we consider Figure D.2 in the appendix. The graphs show the relation between the case weight of a payment and the change in the estimated total reserve (the change is expressed as a percentage of the original reserve). We observe no particular pattern in the graphs.

We measure the effect of case weight perturbation on the reserve by the derivatives of these graphs in $\omega = \omega_0$. Figure 5.2 shows these derivatives for every payment in the development triangle, scaled by the original reserve. Furthermore, $l_{\max}$ is given in the same figure.

[Figure 5.2, two heat maps over accident year (1 to 10) and development year (1 to 10). First panel: case weight perturbations scaled by the original reserve (colour scale −0.020 to 0.020). Second panel: L_max for case weight perturbation (colour scale −0.3 to 0.3).]

Figure 5.2: Results for case weight perturbation (scaled by the original reserve)

The figure shows the sensitivity of the estimated total reserve to the original case weight specification. We notice the influence of payment $C_{8,3}$. If case weight $\omega_{8,3}$ increases, the total reserve increases by a relatively large amount. Conversely, if the case weight decreases, the total reserve decreases by a relatively large amount. Therefore, the (high) payment $C_{8,3}$ has a potentially large influence on the estimated total reserve. Furthermore, we notice that varying the case weight of $C_{1,n}$ or $C_{n,1}$ has no effect on the estimated reserve.

We have a closer look at the effect of $C_{8,3}$ by considering $l_{\max}$. The unit direction vector $l_{\max}$ produces the largest increase in the total reserve. Therefore, we estimate the model incorporating the weights $\omega = \omega_0 + l_{\max}$. The estimated reserve increases by 1,114 (×1,000). Next, suppose we only modify the weight of payment $C_{8,3}$. We use the unit direction vector $l = (0, \ldots, 0, l_{8,3} = 1, 0, \ldots, 0)$ and specify the weights by $\omega = \omega_0 + l$. This produces an increase in the reserve estimate of 281 (×1,000). This increase in the total reserve is approximately 25% of the maximum increase, reached by applying $l_{\max}$.


Case deletion

Finally, we consider case deletion and we compare the results with case weight perturbation. To enable the comparison, we consider the negative of the influence measure from case deletion. Consequently, the results indicate whether a payment has an increasing or a decreasing effect on the reserve.

Again, consider the effects of case weight perturbation in Figure D.2 in the appendix. The result of the deletion of a payment is shown by the negative of the y-value corresponding to $\omega = 0$ on the x-axis. Furthermore, most graphs seem to be non-linear. Therefore, the results from case weight perturbation and case deletion are possibly different.

In Figure 5.3 the (negative of the) results of case deletion are presented. Furthermore, the results of case weight perturbation are given. Both results are scaled by the original total reserve.

[Figure 5.3, two heat maps over accident year (1 to 10) and development year (1 to 10). First panel: case deletion scaled by the original reserve (colour scale −0.04 to 0.04). Second panel: case weight perturbations scaled by the original reserve (colour scale −0.020 to 0.020).]

Figure 5.3: Results for case deletion (scaled by the original reserve)

The two figures look very similar. We observe some small differences, caused by the non-linear relation between the case weight of a payment and the estimated reserve. As with case weight perturbation, again payment C8,3 has the largest influence of all payments.

The results are scaled by the original reserve. Another possibility is to scale the results by the RMSEP from the model excluding the payment. However, if we compare the influence of payments mutually, the scaling is not important. We scale the results by a constant number and this does not affect the mutual results. Therefore, the results scaled by the RMSEP show exactly the same figure as the result in Figure 5.3 (left). Only the scale will be different.


Chapter 6

Influence on reserve variability

As well as the total reserve, the reserve variability is an important function in claims reserving. In this chapter, we consider the influence of payments on the reserve variability.

The RMSEP is a measure of the reserve variability. Suppose we consider the influence on the RMSEP to indicate the influence of payments on this variability. This will cause a difficulty, which we explain with an example. Suppose that the influence of a payment on the RMSEP indicates that this payment has an increasing effect on the RMSEP. We would conclude that this payment has a large influence on the reserve variability and needs special attention. However, suppose that this payment has an increasing effect on the total reserve too. In that case, the increasing effect on the RMSEP is possibly an effect of the increasing effect on the reserve. Therefore, we prefer a variability measure relative to the reserve.

We consider the prediction error as a percentage of the total reserve:

$$
\widehat{\mathrm{CV}} = \frac{\widehat{\mathrm{RMSEP}}}{\widehat{E(R)}} \times 100.
$$

CV stands for coefficient of variation. In this chapter, we describe methods to measure the influence of payments on the CV.

6.1 Perturbations

We consider the influence of model perturbations on the estimated CV. We use the same notation and approach as in Chapter 5.

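We consider the surface

$$
\begin{pmatrix} \omega \\ \widehat{\mathrm{CV}}(\omega) \end{pmatrix},
$$

where $\omega = \omega_0 + \lambda l$. The directional derivative of $\widehat{\mathrm{CV}}(\omega)$ with respect to $\lambda$, evaluated in $\lambda = 0$, is given by

$$
\left.\frac{\partial\widehat{\mathrm{CV}}(\omega)}{\partial\lambda}\right|_{\lambda=0} = \sum_{i=1}^{n}\sum_{j=1}^{n-i+1} \left(\left.\frac{\partial\widehat{\mathrm{CV}}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}\right) l_{ij} = \left\langle \nabla\widehat{\mathrm{CV}}(\omega), l \right\rangle.
$$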


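This derivative is given by

$$
\begin{aligned}
\left.\frac{\partial\widehat{\mathrm{CV}}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}
&= \left.\frac{\partial}{\partial\omega_{ij}}\left(\frac{\widehat{\mathrm{RMSEP}}(\omega)}{\widehat{E(R)}(\omega)} \times 100\right)\right|_{\omega=\omega_0} \\
&= \frac{\widehat{E(R)}\left.\dfrac{\partial\widehat{\mathrm{RMSEP}}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0} - \widehat{\mathrm{RMSEP}}\left.\dfrac{\partial\widehat{E(R)}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}}{\widehat{E(R)}^{\,2}} \times 100.
\end{aligned}
\tag{6.1}
$$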

The derivative of the estimated reserve with respect to ωij is derived in the previous chapter. We only have to obtain the derivative of the RMSEP. We described an analytical approximation of the RMSEP. Another method to estimate the RMSEP is by the bootstrap technique. We consider the analytical approximation, because then we could apply the same approach as described for the influence on the total reserve. Furthermore, the bootstrap is a resampling method, that increases the computation time to estimate the reserve variability.

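The RMSEP is approximated by

$$
\begin{aligned}
\widehat{\mathrm{RMSEP}} &= \sqrt{\text{process variance} + \text{estimation variance}} = \sqrt{\widehat{\mathrm{Var}}(R) + \widehat{\mathrm{Var}}(\widehat{E(R)})} \\
&= \sqrt{\left(\sum_{i,j} \hat{\phi}\hat{\mu}_{ij}\right) + \left(\sum_{i,j} \hat{\mu}_{ij}^2\,\mathrm{Var}(\hat{\eta}_{ij}) + 2 \sum_{\substack{i_1,j_1;\ i_2,j_2 \\ (i_1,j_1) \neq (i_2,j_2)}} \hat{\mu}_{i_1j_1}\hat{\mu}_{i_2j_2}\,\mathrm{Cov}(\hat{\eta}_{i_1j_1}, \hat{\eta}_{i_2j_2})\right)},
\end{aligned}
$$

where the sums run over the indices $(i, j)$ of the predicted payments.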

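The derivative of the RMSEP is obtained by

$$
\begin{aligned}
\left.\frac{\partial\widehat{\mathrm{RMSEP}}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}
&= \left.\frac{\partial}{\partial\omega_{ij}}\sqrt{\widehat{\mathrm{Var}}(R)(\omega) + \widehat{\mathrm{Var}}(\widehat{E(R)})(\omega)}\,\right|_{\omega=\omega_0} \\
&= \frac{1}{2\sqrt{\widehat{\mathrm{Var}}(R) + \widehat{\mathrm{Var}}(\widehat{E(R)})}} \left.\left(\frac{\partial\widehat{\mathrm{Var}}(R)(\omega)}{\partial\omega_{ij}} + \frac{\partial\widehat{\mathrm{Var}}(\widehat{E(R)})(\omega)}{\partial\omega_{ij}}\right)\right|_{\omega=\omega_0}.
\end{aligned}
\tag{6.2}
$$

We consider the two derivatives in Equation (6.2) separately.

Process variance

For the process variance, we have

$$
\left.\frac{\partial\widehat{\mathrm{Var}}(R)(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0} = \left.\frac{\partial\,\hat{\phi}(\omega)\widehat{E(R)}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0} = \left.\frac{\partial\hat{\phi}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}\widehat{E(R)} + \left.\frac{\partial\widehat{E(R)}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}\hat{\phi}.
$$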


Estimation variance

For the estimation variance, we first introduce matrix notation to simplify notation. We number the observed payments by r = 1, . . . , N , where C1,1 is observation 1 and we continue the number-ing from row to row. The last payment Cn,1has number N . As introduced before, the p × 1-vector γ contains the model parameters. Furthermore, X is the N × p-design matrix of the observed payments. Every row corresponds to a payment and every column corresponds to a parameter. An element xij in X equals 1 if payment i depends on parameter γj and otherwise it equals 0. For example, the first column consists of ones, because every payment depends on the constant c. Finally X is the n2× p-design matrix for all payments, including the predicted payments. The estimation variance is obtained with the delta method, as explained in Appendix B.4. First, the variance matrix ofbγ is estimated by the inverse of the Fisher information matrix, defined in Equation 2.11. In matrix notation, we have

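$$
\widehat{\mathrm{Var}}(\hat\gamma) = \big(X' W X\big)^{-1} = \left(\frac{X' V X}{\hat\phi}\right)^{-1} = \hat\phi\,\big(X' V X\big)^{-1},
$$

where $W$ is a diagonal matrix with diagonal elements $\hat\mu_r / \hat\phi$, $r = 1, \dots, N$, and $V$ is a diagonal matrix with diagonal elements $\hat\mu_r$. If we consider case weights unequal to one, then we have to multiply the diagonal elements of both matrices by the corresponding case weight $\omega_{ij}$.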

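Next, the variance matrix of $\hat\eta$ is estimated by

$$
\widehat{\mathrm{Var}}(\hat\eta) = X_{\text{all}}\,\widehat{\mathrm{Var}}(\hat\gamma)\,X_{\text{all}}' = \hat\phi\,X_{\text{all}}\big(X' V X\big)^{-1}X_{\text{all}}',
$$

where $X_{\text{all}}$ is the $n^2 \times p$ design matrix for all payments, so that $\eta$ concerns both observed and future payments.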

Finally, we estimate the estimation variance by

$$
\widehat{\mathrm{Var}}\big(\widehat{E}(R)\big) = \hat\mu_{\text{future}}\,\widehat{\mathrm{Var}}(\hat\eta)\,\hat\mu_{\text{future}}',
$$

where $\hat\mu_{\text{future}}$ is defined as a $1 \times n^2$ vector containing the estimated and predicted payments, but with the estimated payments replaced by zero (we are concerned with future observations only).
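The three matrix steps (variance of the parameter estimates, variance of the linear predictors, estimation variance) can be sketched for a small triangle. This is an illustration under assumptions, not the thesis's code: the parameter ordering (constant, then row effects, then column effects), the helper names and the parameter values are hypothetical, a log link is assumed, and the Fisher information is formed as the $p \times p$ matrix built from the $N \times p$ design matrix of the observed cells.

```python
import numpy as np

def design_matrices(n):
    """Return (X_obs, X_all, observed) for an n x n run-off triangle.

    Cells are numbered row by row. Assumed parameter order:
    c, alpha_2..alpha_n, beta_2..beta_n, so p = 2n - 1.
    """
    p = 2 * n - 1
    rows, observed = [], []
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            x = np.zeros(p)
            x[0] = 1.0                 # every payment depends on the constant c
            if i > 1:
                x[i - 1] = 1.0         # row effect alpha_i
            if j > 1:
                x[n + j - 2] = 1.0     # column effect beta_j
            rows.append(x)
            observed.append(i + j <= n + 1)
    X_all = np.array(rows)
    observed = np.array(observed)
    return X_all[observed], X_all, observed

def estimation_variance(gamma, phi, n):
    """Delta-method estimation variance of the total reserve (log link assumed)."""
    X_obs, X_all, observed = design_matrices(n)
    mu_all = np.exp(X_all @ gamma)                   # fitted and predicted payments
    V = np.diag(mu_all[observed])                    # fitted values of observed cells
    cov_gamma = phi * np.linalg.inv(X_obs.T @ V @ X_obs)
    cov_eta = X_all @ cov_gamma @ X_all.T            # n^2 x n^2
    mu_future = np.where(observed, 0.0, mu_all)      # observed cells set to zero
    return mu_future @ cov_eta @ mu_future, cov_gamma, cov_eta, mu_future

# Hypothetical parameter values for a 3 x 3 triangle (illustration only).
gamma_hat = np.array([5.0, 0.1, 0.2, -0.5, -1.0])
est_var, cov_gamma, cov_eta, mu_future = estimation_variance(gamma_hat, 1.5, 3)
```

For $n = 3$ this gives $N = 6$ observed cells, $p = 5$ parameters and three future cells. In practice `gamma_hat` and the dispersion would come from the fitted model rather than being chosen by hand.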

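Now we can derive the derivative of the estimation variance with respect to $\omega$. This derivative is obtained by

$$
\left.\frac{\partial\,\widehat{\mathrm{Var}}\big(\widehat{E}(R)\big)(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}
= \left.\frac{\partial\hat\mu_{\text{future}}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}\widehat{\mathrm{Var}}(\hat\eta)\,\hat\mu_{\text{future}}'
+ \hat\mu_{\text{future}}\left.\frac{\partial\,\widehat{\mathrm{Var}}(\hat\eta)(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}\hat\mu_{\text{future}}'
+ \hat\mu_{\text{future}}\,\widehat{\mathrm{Var}}(\hat\eta)\left.\frac{\partial\hat\mu_{\text{future}}'(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}.
$$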


The derivative of $\hat\mu_{\text{future}}(\omega)$ was obtained in Chapter 5. The derivative of the estimated variance matrix of $\hat\eta$ is given by

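$$
\left.\frac{\partial\,\widehat{\mathrm{Var}}(\hat\eta)(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}
= \left.\frac{\partial\,\hat\phi(\omega)\,X_{\text{all}}\big(X' V(\omega) X\big)^{-1}X_{\text{all}}'}{\partial\omega_{ij}}\right|_{\omega=\omega_0}
= \left.\frac{\partial\hat\phi(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0} X_{\text{all}}\big(X' V X\big)^{-1}X_{\text{all}}'
+ \hat\phi\,X_{\text{all}}\left.\frac{\partial\big(X' V(\omega) X\big)^{-1}}{\partial\omega_{ij}}\right|_{\omega=\omega_0}X_{\text{all}}',
$$

where $X$ is the design matrix of the observed payments, $X_{\text{all}}$ is the design matrix for all payments, and

$$
\left.\frac{\partial\big(X' V(\omega) X\big)^{-1}}{\partial\omega_{ij}}\right|_{\omega=\omega_0}
= -\big(X' V X\big)^{-1}\left.\frac{\partial\big(X' V(\omega) X\big)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}\big(X' V X\big)^{-1}
= -\big(X' V X\big)^{-1}X'\left.\frac{\partial V(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}X\,\big(X' V X\big)^{-1}.
$$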
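The rule for differentiating a matrix inverse used above can be verified with a finite-difference check. The sketch below is an illustration only: the small design matrix, the diagonal entries `v0` and their assumed derivatives `dv` are hypothetical numbers, and the diagonal matrix is taken to depend linearly on a scalar perturbation `w`.

```python
import numpy as np

def d_inverse(X, V, dV):
    """Derivative of (X' V X)^{-1}, given the derivative dV of the diagonal matrix V."""
    A_inv = np.linalg.inv(X.T @ V @ X)
    return -A_inv @ (X.T @ dV @ X) @ A_inv

# Hypothetical full-rank design matrix with 6 observations and 3 parameters.
X = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 0]], dtype=float)
v0 = np.array([4.0, 3.0, 2.0, 5.0, 1.0, 6.0])    # diagonal of V at w = 0
dv = np.array([0.5, -0.2, 0.1, 0.3, 0.0, -0.4])  # assumed derivative of that diagonal

def inverse_at(w):
    # (X' V(w) X)^{-1} with V(w) = diag(v0 + w * dv).
    return np.linalg.inv(X.T @ np.diag(v0 + w * dv) @ X)

analytic = d_inverse(X, np.diag(v0), np.diag(dv))

# Central finite difference of the inverse around w = 0.
h = 1e-6
numeric = (inverse_at(h) - inverse_at(-h)) / (2.0 * h)
```

The analytic and numerical derivatives agree entry by entry up to the finite-difference error, which supports using the closed form instead of refitting for every perturbation.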

Concluding, the derivative of the estimation variance consists of different components. The derivative of the dispersion parameter and the derivative of $V(\omega)$ with respect to $\omega$ depend on the kind of perturbation. Therefore, we derive these formulas for the two perturbation schemes separately in the next two sections.

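Finally, the unit direction vector that produces the largest increase in the CV is given by

$$
l_{\max} = \frac{\dot{CV}}{\lVert\dot{CV}\rVert} = \frac{\dot{CV}}{\sqrt{\dot{CV}'\,\dot{CV}}}, \qquad (6.3)
$$

where $\dot{CV} = \left.\dfrac{\partial\,\widehat{CV}(\omega)}{\partial\omega}\right|_{\omega=\omega_0}$. Furthermore, $-l_{\max}$ produces the largest decrease in the CV.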
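Equation (6.3) amounts to normalising the gradient of the CV to unit length. The sketch below shows that step together with a quotient rule for the CV gradient. It rests on an assumption that is not restated in this section, namely that the CV is the RMSEP divided by the estimated reserve; the function names and the numbers are hypothetical.

```python
import numpy as np

def d_cv(rmsep, reserve, d_rmsep, d_reserve):
    # Quotient rule, assuming CV = RMSEP / reserve (an assumption of this sketch).
    d_rmsep = np.asarray(d_rmsep, dtype=float)
    d_reserve = np.asarray(d_reserve, dtype=float)
    return (d_rmsep * reserve - rmsep * d_reserve) / reserve**2

def max_direction(gradient):
    """Unit vector in the direction of the largest local increase, as in Equation (6.3)."""
    g = np.asarray(gradient, dtype=float)
    norm = np.sqrt(g @ g)
    if norm == 0.0:
        raise ValueError("gradient is zero, so no direction of largest increase exists")
    return g / norm

# Hypothetical gradient with one entry per observed payment.
l_max = max_direction([3.0, -4.0, 0.0])

# Hypothetical inputs for the quotient rule.
cv_gradient = d_cv(35.0, 175.0, [1.0, 0.0], [0.0, 5.0])
```

Entries of `l_max` that are large in absolute value point to the payments whose perturbation changes the CV most, and their sign indicates whether the CV increases or decreases.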


6.1.1 Perturbations of payments

We consider the effect of perturbations of payments on the reserve variability. The question of interest is: is the estimated CV sensitive to small fluctuations of the observed data?

The payments are modified by $C_{ij}(\omega) = C_{ij} + \sqrt{\hat\phi\,\hat\mu_{ij}}\;\omega_{ij}$ and we measure the effect on the CV. First, the effect on the dispersion parameter is measured by

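$$
\left.\frac{\partial\hat\phi(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}
= \left.\frac{\partial}{\partial\omega_{ij}}\left\{\frac{1}{N-p}\sum_{r=1}^{n}\sum_{s=1}^{n-r+1}\frac{\Big(C_{rs} + \omega_{rs}\sqrt{\hat\phi\,\hat\mu_{rs}} - \hat\mu_{rs}(\omega)\Big)^{2}}{\hat\mu_{rs}(\omega)}\right\}\right|_{\omega=\omega_0}
$$

$$
= \frac{1}{N-p}\sum_{r=1}^{n}\sum_{s=1}^{n-r+1}
\frac{2\hat\mu_{rs}\,(C_{rs}-\hat\mu_{rs})\left(\sqrt{\hat\phi\,\hat\mu_{ij}}\;I(r=i,\,s=j) - \left.\dfrac{\partial\hat\mu_{rs}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}\right) - (C_{rs}-\hat\mu_{rs})^{2}\left.\dfrac{\partial\hat\mu_{rs}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}}{\hat\mu_{rs}^{2}},
$$

where $\sqrt{\hat\phi\,\hat\mu_{rs}}$ and $\sqrt{\hat\phi\,\hat\mu_{ij}}$ are estimates from the model based on the original (unmodified) data. These are constants and do not vary with $\omega$. The function $I(r=i,\,s=j)$ is the indicator function that equals one if $r = i$ and $s = j$ and equals zero otherwise.

For the diagonal matrix $V$, we have

$$
\mathrm{diag}\big(V(\omega)\big) = \big(\hat\mu_{1,1}(\omega),\,\hat\mu_{1,2}(\omega),\,\dots,\,\hat\mu_{n,1}(\omega)\big),
$$

and the derivative $\left.\dfrac{\partial V(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}$ is a diagonal matrix containing the diagonal elements

$$
\left(\left.\frac{\partial\hat\mu_{1,1}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0},\;\left.\frac{\partial\hat\mu_{1,2}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0},\;\dots,\;\left.\frac{\partial\hat\mu_{n,1}(\omega)}{\partial\omega_{ij}}\right|_{\omega=\omega_0}\right).
$$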

These elements were (implicitly) derived in Chapter 5 in Equation 5.1.
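The derivative of the dispersion parameter under the payment perturbation can be checked against a finite difference that refits the model. The sketch below rests on several assumptions that are not taken from the thesis: the triangle values are made up, the parameter ordering and function names are hypothetical, the model is fitted by iteratively reweighted least squares with a log link, and the derivative of the fitted values is obtained by implicitly differentiating the quasi-likelihood score equations $X'(C(\omega) - \hat\mu(\omega)) = 0$, which may differ in form from Equation 5.1.

```python
import numpy as np

def design_observed(n):
    """Design matrix of the observed cells, numbered row by row (assumed parameter order)."""
    rows = []
    for i in range(1, n + 1):
        for j in range(1, n - i + 2):
            x = np.zeros(2 * n - 1)
            x[0] = 1.0                 # constant c
            if i > 1:
                x[i - 1] = 1.0         # row effect alpha_i
            if j > 1:
                x[n + j - 2] = 1.0     # column effect beta_j
            rows.append(x)
    return np.array(rows)

def fit_mu(X, C, iters=200):
    """Fitted values of a log-link quasi-Poisson model via IRLS (positive payments assumed)."""
    mu = C.astype(float).copy()
    for _ in range(iters):
        z = np.log(mu) + (C - mu) / mu           # working response
        gamma = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (mu * z))
        mu = np.exp(X @ gamma)
    return mu

def pearson_phi(C, mu, dof):
    return np.sum((C - mu) ** 2 / mu) / dof

def d_phi_analytic(X, C, k):
    """Derivative of the dispersion estimate with respect to the perturbation of payment k."""
    N, p = X.shape
    mu = fit_mu(X, C)
    phi = pearson_phi(C, mu, N - p)
    dC = np.zeros(N)
    dC[k] = np.sqrt(phi * mu[k])                 # constant scale from the original fit
    # Implicit differentiation of the score equations X'(C - mu) = 0.
    d_gamma = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ dC)
    d_mu = mu * (X @ d_gamma)
    resid = C - mu
    terms = (2.0 * mu * resid * (dC - d_mu) - resid**2 * d_mu) / mu**2
    return terms.sum() / (N - p)

def d_phi_numeric(X, C, k, h=1e-5):
    """Central finite difference, refitting the model on the perturbed payments."""
    N, p = X.shape
    mu0 = fit_mu(X, C)
    step = np.zeros(N)
    step[k] = np.sqrt(pearson_phi(C, mu0, N - p) * mu0[k])
    def phi_at(w):
        Cw = C + w * step
        return pearson_phi(Cw, fit_mu(X, Cw), N - p)
    return (phi_at(h) - phi_at(-h)) / (2.0 * h)

# Hypothetical 4 x 4 triangle of incremental payments, flattened row by row.
C_obs = np.array([100.0, 60.0, 30.0, 10.0, 110.0, 70.0, 25.0, 120.0, 55.0, 130.0])
X_obs = design_observed(4)
```

Calling `d_phi_analytic(X_obs, C_obs, k)` and `d_phi_numeric(X_obs, C_obs, k)` for the same payment index `k` allows a direct comparison of the closed form with the refitted model.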
