
Disability durations of Dutch self-employed


Academic year: 2021


Assessing how certain risk factors, particularly variables related to the business cycle, influence disability durations of Dutch self-employed

Remko Amelink

August 2010


Abstract

For insurance companies it is crucial to understand which risk factors affect the return-to-work process of self-employed. More knowledge about such risk factors gives insurance companies more insight into the riskiness of applicants. During the financial and economic crisis the insurer that provided the dataset observed a drop in recovery rates. The reason for this drop is unknown to the insurer, and it is also unknown whether the recovery rates will increase again after the financial and economic crisis.

In this thesis we assess the influence of several risk factors, in particular variables related to the business cycle, on the sick leave durations of self-employed. Besides analyzing the determinants of the total disability duration, we also zoom in on the health condition of the self-employed during his incapacity. We create three health states and analyze the transitions from one health state to another. In this way, we assess which risk factors have an impact on the full recovery rate, the partial recovery rate and the fall-back rate of a self-employed.

Variables related to the business cycle are the GDP growth rate and the business confidence index. A decrease in the GDP growth rate leads to higher recovery rates, hence shorter durations. We allow for a structural change in the coefficient of the business confidence index after March 2009. After this date, most of the self-employed started to experience the consequences of the financial and economic crisis. Until March 2009, an increase in the business confidence index led to a significant increase in recovery rates. After March 2009, a change in the business confidence index no longer has a significant effect on the recovery rates of self-employed. Overall, the effect of a change in the business confidence index is much larger than the effect of a change in the GDP growth rate, and in times of crisis self-employed recover more slowly than in other periods.

Furthermore, we find that older claimants and females have significantly lower recovery rates than younger claimants and males. The type of disorder also plays an important role in explaining the return-to-work process. Self-employed with a cardiovascular, neurological or psychological disorder recover more slowly than claimants with another type of disorder, and self-employed with a skin disease or with a respiratory or digestive disorder recover faster than claimants with other types of disorders.

Keywords: Sick leave durations, Self-employed, Disability durations, Business confidence index

Supervisor: dr. L. Spierdijk
Co-assessor: prof. dr. R.H. Koning


This Master thesis is the result of my Master program in Econometrics, Operations Research and Actuarial Studies at the University of Groningen. My specialization in this Master study is Actuarial Studies.

First, I would like to thank my supervisor dr. L. Spierdijk and co-assessor prof. dr. R.H. Koning. I would like to thank dr. L. Spierdijk for giving me the opportunity for this research and for her comments and good suggestions during the writing of my thesis. Furthermore, I would like to thank her for supporting me in the periods of rapid progress but especially in the periods of slow progress of my thesis. I would like to thank prof. dr. R.H. Koning for his comments.

Second, I would like to thank Niels Holtrop for his critical review of this thesis and his useful comments about spelling and style. Third, I would like to thank my fellow students, especially Niels Holtrop, Richard Kamst, Wim Siekman and Fred Heijnen, for the great time during classes. Last but certainly not least, I thank my parents for giving me the opportunity to study.


Contents

1 Introduction and research question

2 Literature review
  2.1 Absenteeism among employees
  2.2 Absenteeism among self-employed

3 Income insurance for self-employed
  3.1 Introduction to income insurance for self-employed
  3.2 Income insurance for self-employed

4 Data
  4.1 Data description
  4.2 Replacement rates
  4.3 Preliminary statistics

5 Proportional hazards model
  5.1 Proportional hazards model
  5.2 Mixed proportional hazard model
    5.2.1 Unobserved heterogeneity
    5.2.2 Mixed proportional hazard model with continuous frailty distribution
    5.2.3 Mixed proportional hazard model with Heckman-Singer frailty
  5.3 Multi-state mixed proportional hazards model
  5.4 Baseline hazard specification in the MMPH model with HS frailty

6 Empirical results of the two states representation
  6.1 Estimation results
  6.2 Testing the proportional hazard assumption
  6.3 Robustness analysis

7 Empirical results of the three states representation
  7.1 Risk factors
  7.2 Baseline hazard
  7.3 Unobserved heterogeneity

8 Simulation analysis
  8.1 Transition probabilities and transition intensities in the multi-state model
  8.2 Simulation methods
  8.3 Simulation results

9 Conclusions

Appendix
  A Kaplan-Meier Survival Functions
  B Episode Splitting
  C Estimation results multi-state MPH model
  D Schematic scheme of estimation the expected duration until recovery

Bibliography


1

Introduction and research question

Self-employed are exposed to several kinds of risk, and against some of these risks they can protect themselves by means of insurance. For example, they can buy income insurance that protects against income loss due to sickness or disability to work. Such insurance is better known as disability insurance. Since August 2004 disability insurance for self-employed workers in the Netherlands has been provided by private insurance companies. Until August 2004 self-employed were protected against income loss due to disability by means of a compulsory social disability insurance, also known as WAZ (Wet Arbeidsongeschiktheidsverzekering Zelfstandigen in Dutch, or Self-Employed Income Insurance Act in English).

For insurance companies that provide disability insurance it is crucial to understand which risk factors affect the return-to-work process. More knowledge about these risk factors gives insurance companies more insight into the riskiness of applicants. It also gives them a better understanding of the underwriting criteria, which can help them determine the premium they should ask.


Bakker et al. (2006) review many papers on disability and conclude that risk factors that affect sick leave durations of employees do not necessarily apply to self-employed. Empirical studies on sick leave durations of self-employed are very scarce. Two empirical studies on the return-to-work process of self-employed are Spierdijk et al. (2009) and Spierdijk and Koning (2009). Spierdijk et al. (2009) analyze a unique data set of sick leave claims by Dutch self-employed working in various professions, provided by a private Dutch insurance company. They show that various socio-demographic, contract-specific and business-cycle-specific factors significantly affect the disability durations of self-employed. Furthermore, they find several differences in work absence behaviour between self-employed and employees. Instead of analyzing the determinants of the total disability duration, Spierdijk and Koning (2009) adopt a different approach. The authors zoom in on the health condition of the self-employed during his incapacity to work. They create three health states and analyze the transitions from one health state to another. The three health states used are an active (healthy) state, a mildly disabled state and a severely disabled state. By analyzing the transitions from one health state to another, they assess which risk factors have an impact on the full recovery rate, the partial recovery rate and the fall-back rate of a self-employed.¹

Spierdijk and Koning (2009) assess the role of certain risk factors in the return-to-work process. Variables related to the business cycle were not included in their models because they act as time-varying covariates, which would make estimation of their model very time consuming and computationally intensive. Spierdijk et al. (2009) showed, however, that these factors play an important role in explaining the sick leave durations of employees and self-employed. It would therefore be interesting to assess the effect of variables related to the business cycle on the full recovery rate, the partial recovery rate and the fall-back rate of a self-employed.

As already mentioned, an insurance company should use model-based calculation of loss reserves. If self-employed recover more slowly when the economy is in a downturn, the insurance company should increase its loss reserves; if self-employed recover faster when the economy is in a downturn, the insurance company can decrease its loss reserves. So, understanding the effects of changes in variables related to the business cycle on the return-to-work process gives the insurer more insight into the required loss reserves. There is a second reason for the insurer to understand these effects. Reduction and prevention of long absence durations lead to cost reductions for the insurer. So, when insurance companies know in which periods self-employed recover more slowly, they can provide additional prevention and reintegration services in these periods to reduce costs. For these reasons, including variables related to the business cycle in a multi-state disability model will be the central theme of this thesis.

¹ A claimant falls back in health status when he goes from the mildly disabled state to the severely disabled state.


and we assess the influence of several risk factors on the full recovery rate, the partial recovery rate and the fall-back rate of a self-employed in a multi-state disability model. We are especially interested in assessing the impact of business cycle related variables on these rates. Because Spierdijk and Koning (2009) did a similar study, only without including variables related to the business cycle, we will use their paper as a starting point for our research. The research question of this thesis is therefore as follows: Which risk factors, and in particular which variables related to the business cycle, influence disability durations of self-employed?


2

Literature review

The main goal of this study is to assess how certain risk factors influence disability durations of self-employed. In this chapter we discuss relevant theoretical and empirical literature dealing with worker absenteeism. There is a rich literature on sick leave duration among employees, but the literature on sick leave duration among self-employed is very scarce, see Bakker et al. (2006). The number of studies available for direct comparison is therefore relatively low. As already mentioned in the introduction, the paper of Spierdijk and Koning (2009) is used as a starting point of our study. Spierdijk et al. (2009) provide an extensive literature review on worker absenteeism. For this reason we keep the literature review short and refer to Spierdijk et al. (2009) for a more extensive review.

2.1 Absenteeism among employees


2.2 Absenteeism among self-employed

A self-employed is his own boss, and for this reason we would expect that some of the risk factors act differently on the sick leave durations of self-employed compared with the sick leave durations of employees. For example, it is not very likely that self-employed react in the same way to an increase of the unemployment rate as employees. Employees' sick leave durations are usually shorter in periods of high unemployment. A possible explanation is reduced career opportunities. Such an explanation would not make much sense for self-employed: reduced career opportunities are not very likely to matter for them because they are their own boss. Empirical literature on sick leave duration of self-employed is very scarce, see Bakker et al. (2006), Spierdijk et al. (2009) and Spierdijk and Koning (2009). Spierdijk et al. (2009) and Spierdijk and Koning (2009) find a negative dependence between age and the length of the disability duration. Furthermore, they find that women recover more slowly than men. Spierdijk and Koning (2009) find some evidence of a negative relation between higher replacement income and the length of the disability spell. Spierdijk et al. (2009), however, find no relation between the replacement income and the length of the disability spell.

In most of the studies employees recover more quickly in an economic downturn. The question is whether this is also the case for self-employed. One can argue that a self-employed wants to return to work faster when the economy is booming, because a large profit and hence a large income can then be achieved. On the other hand, during an economic upturn self-employed may have to work harder, resulting in relatively severe disorders. Moral hazard can also play a role when the economy is in a downturn. During a recession the income of a self-employed is likely to be lower than in periods of high economic growth and hence a replacement income paid by disability insurance may become an attractive alternative. On the other hand, when the economy is in a downturn the risk of going bankrupt increases, which could force self-employed to return to work as soon as possible. Spierdijk et al. (2009) found a negative relation between the GDP growth rate and the recovery rate: an increase in the GDP growth rate leads to a decrease in the recovery rate, hence longer sick leave durations. For a more extensive literature review we refer to Spierdijk et al. (2009).


3

Income insurance for self-employed

3.1 Introduction to income insurance for self-employed

Self-employed are exposed to several kinds of risks, for example personal and business related risks. Examples of personal risks are mortality and income loss due to sickness. Self-employed can protect themselves against some of these risks by means of insurance. They can buy insurance which protects, for example, their family against negative financial consequences in case the self-employed dies. They can also buy income insurance that protects against income loss due to sickness.

Until August 2004 a self-employed was, after one year of sickness, protected against income loss by means of a compulsory social disability insurance in the Netherlands. The disability insurance was known as WAZ, which stands for Wet Arbeidsongeschiktheidsverzekering Zelfstandigen. The cost of the WAZ insurance was funded from taxes paid by the self-employed. So, until August 2004 the state provided a self-employed, after one year of sickness, with a compensation income. This compensation income depended on the percentage of incapacity of the self-employed and was bounded to a maximum of 70% of the minimum wage of an employee in the Netherlands, which was around 16,000 euro (gross) a year in 2004.¹ To protect against income loss in the first year of a disability spell, a self-employed could buy additional insurance from private insurance companies. Additional income insurance was also available for the other years of a disability spell. The compensation income provided by the government was only 70% of the minimum wage of an employee, and therefore many self-employed still faced a large income loss due to sickness after the first year of sickness.

An additional income insurance that covers the income loss in the first year of disability is called an A-cover, and additional income insurance that covers the income loss after the first year of disability is called a B-cover. The A-cover policy conditions include a deferment period of seven days to six months. The deferment period is the time between the start of the illness and the start of the benefit payments. The insured is entitled to a replacement income when he is at least 25% disabled. The maximum period of benefit payments equals one year. The B-cover policy conditions include a one-year deferment period. The benefit payments can stop for several reasons, for example when the insured dies or when he reaches a contract-specific age. The insured amounts can be indexed (yearly) according to fixed or time-dependent rates. Until August 2004 most self-employed with an A- and B-cover had a replacement income in the B-cover that equaled the replacement income of the A-cover minus the replacement income provided by the government. In Section 3.2 we will come back to the policy conditions of an income insurance.

¹ More conditions on when the Dutch government provided a replacement income can be found on the website:

In August 2004 the WAZ was abolished by the Dutch government, and since then income insurance against sickness has only been available from private insurance companies. A self-employed can still choose which type of income insurance he would like to buy: for example an A-cover, a B-cover or both. Many self-employed with an income insurance have both covers, and the replacement income in both covers is almost always the same. Therefore we only discuss the policy conditions of the following income insurance: a combination of both covers where the replacement income in both cases is assumed to be the same. The A- and B-cover are in this way special cases of the aforementioned income insurance.

3.2 Income insurance for self-employed

In this section we will have a closer look at the policy conditions belonging to the income insurance sold by the insurance company that provided the data for this study. We will discuss several terms and conditions such as replacement income, amount insured, disability period, premium payments and end of contract.

The compensation income, or replacement income, depends on the percentage of incapacity of a self-employed, which is also known as the replacement rate. In some cases the compensation income equals the percentage of incapacity times the amount insured, and in other cases it equals the payout percentage times the amount insured, where the payout percentage is defined as follows:

disability percentage payout percentage


Throughout this study we will assume that a claimant's compensation income equals the percentage of incapacity times the amount insured.
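The assumption above can be written as a one-line calculation. The following is a minimal sketch in Python; the function name and the example amounts are hypothetical and only serve as illustration, they are not taken from the dataset.

```python
def compensation_income(amount_insured: float, incapacity_pct: float) -> float:
    """Yearly compensation income under the simplifying assumption of this
    study: percentage of incapacity times the amount insured."""
    if not 0.0 <= incapacity_pct <= 100.0:
        raise ValueError("incapacity_pct must lie between 0 and 100")
    return amount_insured * incapacity_pct / 100.0


# Hypothetical claimant: 30,000 euro insured, 60% incapacity.
print(compensation_income(30_000, 60))  # 18000.0
```

The sketch deliberately ignores the alternative payout-percentage rule mentioned above, since that rule is not used in the remainder of the analysis.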

The amount insured equals the replacement income the insured would receive if he were fully disabled to work. Each year, at the renewal date of the contract, the amount insured can be adjusted and if needed corrected for inflation by means of an indexation. This indexation can be done by a fixed rate, for example 3, 4 or 5%, or by means of a time-dependent rate, for example the price index. The indexation rate is set equal to 0% in the years when the price index is negative. Once a year the insured is allowed to increase the amount insured if the following conditions are satisfied: the insured is not older than 54, the insured has had a disability percentage of 0% for the last 60 days before the increase, the total number of days that the insured was disabled during the past contract period is less than 365, and the time between two increases is at least three months.

The insured can increase the amount insured by at most 10%, and the amount insured may never exceed 3 times the average yearly income of the insured, nor may it exceed 150,000 euro. This is done to avoid moral hazard.
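The conditions and caps described above can be summarized in a small rule check. The sketch below is an illustration only: the class, field and function names are hypothetical, and the way the conditions are measured (for example months since the last increase) is an assumption about how an insurer might record them.

```python
from dataclasses import dataclass

MAX_RELATIVE_INCREASE = 0.10  # at most 10% per increase
INCOME_MULTIPLE_CAP = 3.0     # at most 3 times the average yearly income
ABSOLUTE_CAP = 150_000.0      # euro


@dataclass
class Insured:
    age: int
    days_at_zero_disability: int    # consecutive days at 0% before the increase
    disabled_days_past_period: int  # disabled days in the past contract period
    months_since_last_increase: float
    amount_insured: float
    average_yearly_income: float


def may_increase(p: Insured) -> bool:
    """Check the four eligibility conditions listed in the text."""
    return (
        p.age <= 54
        and p.days_at_zero_disability >= 60
        and p.disabled_days_past_period < 365
        and p.months_since_last_increase >= 3
    )


def max_new_amount(p: Insured) -> float:
    """Highest amount insured allowed after one increase (unchanged if not eligible)."""
    if not may_increase(p):
        return p.amount_insured
    return min(
        p.amount_insured * (1 + MAX_RELATIVE_INCREASE),
        INCOME_MULTIPLE_CAP * p.average_yearly_income,
        ABSOLUTE_CAP,
    )


# Hypothetical insured: the 10% increase would give about 154,000 euro,
# so the absolute cap of 150,000 euro is binding.
example = Insured(45, 90, 0, 12, 140_000.0, 60_000.0)
print(max_new_amount(example))  # 150000.0
```

The three caps are combined with `min` because the text states that all of them must hold at the same time.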


4

Data

This chapter gives a detailed description of our dataset and provides sample statistics. A description of the data is provided in Section 4.1. To simplify the calculations later on we create several disability states. The definition of these disability states is provided in Section 4.2. A preliminary duration analysis will be done in Section ??, and some sample statistics related to the available variables are provided and discussed in Section 4.3.

4.1 Data description

The dataset used in this thesis is provided by a large private Dutch insurance company. The dataset contains information on disability spells of Dutch self-employed. In our analysis we focus on the disability duration of each claimant. For each claim the duration in months is known, together with the development of the replacement rate over time. This replacement rate equals the incapacity of the claimant, which is determined at the end of each calendar month. For some of the claims the first couple of replacement rates are missing, which can be due to several reasons. The first reason is that the claimant has a deferment period of a couple of days, meaning that there will be some time between the start of the duration and the start of the observation. A second reason is that some of the claimants do not report their claim immediately but wait for some time, sometimes even up to several years. A third reason is that the insurance company was not able to determine the replacement rate for the first month(s), for example when a claimant reports his claim on the last day of a given month.


durations. A second feature of many survival data sets is right censoring. A claim is right censored if we do not observe the end of the duration, i.e. the disability spell continues after our observation window. Without taking right censoring into account, we would implicitly assume that all claims have ended at the end of our observation window, which is certainly not the case.

There are different reasons for a claim to end. We are interested in those claims that have ended because the claimant has recovered from his incapacity to work. Claims that ended for other reasons, for example death, will be treated as right censored. Furthermore, the dataset contains characteristics of the claimant such as age, gender and occupational class. It also contains characteristics of the insurance product, such as brand, amount insured and indexation rate, and information on the type of illness.
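The censoring rule described above amounts to building an event indicator per claim. The following is a minimal Python sketch; the end-reason labels and the example claims are hypothetical, since the coding used in the insurer's dataset is not given in the text.

```python
RECOVERY = "recovered"  # hypothetical label for a claim that ended by recovery


def event_indicator(end_reason):
    """Return 1 if recovery is observed, 0 if the claim is right censored.

    Every other end reason (death, contract end, still ongoing at the end of
    the observation window) is treated as right censored.
    """
    return 1 if end_reason == RECOVERY else 0


# Hypothetical claims, for illustration only.
claims = [
    {"duration_months": 4, "end_reason": "recovered"},
    {"duration_months": 30, "end_reason": "death"},
    {"duration_months": 12, "end_reason": "end_age_reached"},
    {"duration_months": 9, "end_reason": None},  # still ongoing
]

events = [event_indicator(c["end_reason"]) for c in claims]
print(events)  # [1, 0, 0, 0]
```

In a duration model the pair (duration, event indicator) is the usual input: censored claims still contribute the information that the spell lasted at least as long as observed.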

The start of the spells varies from December 2002 up until May 2010. As already mentioned in Chapter 3, there was a public income insurance for self-employed, called WAZ, until August 2004.¹ Claimants for whom the incapacity to work started before August 2004 also received a WAZ replacement income besides a replacement income from the private insurance company. Claimants whose spell started after August 2004 only received a replacement income from the private insurance company. The abolishment of the WAZ could have led to a structural break around August 2004. To avoid any irregularities due to this possible structural break, we remove all claims that started before August 2004.

Claims with irregularities regarding the replacement rate are also removed from the dataset, together with claims due to pregnancy. Finally, some corrections are made for multiple observations belonging to the same claim. The final dataset contains 19,524 observations on 14,311 claimants. Of the claimants, 24.8% have multiple spells; 43.3% of the claims have a delayed entry and 23.3% of the claims are right censored.

A summary of the number of claims per starting year can be found in Table 4.1. Notice that the number of self-employed that are at risk varies over time. For this reason we cannot draw any conclusions about the incidence rate. The values in the table are only meant as an overview. For the year 2004 the number of claims is the number of claims started in the period 01-08-2004 up until 31-12-2004, and for the year 2010 the number of claims is the number of claims started in the period 01-01-2010 up until 31-05-2010.
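Because 2004 and 2010 are only partially observed, the yearly counts in Table 4.1 are easier to compare after dividing by the number of observed months. The sketch below does this in Python using the counts from Table 4.1; the variable names are hypothetical, and the monthly averages are only a descriptive aid, not an incidence rate, since the number of self-employed at risk is unknown.

```python
# Number of claims per starting year, copied from Table 4.1.
claims_per_year = {
    2004: 1407, 2005: 3244, 2006: 3239, 2007: 3217,
    2008: 3363, 2009: 3713, 2010: 1341,
}

# Observed months per year: 2004 covers August-December, 2010 January-May.
months_observed = {year: 12 for year in claims_per_year}
months_observed[2004] = 5
months_observed[2010] = 5

total = sum(claims_per_year.values())
print(total)  # 19524

monthly_average = {
    year: claims_per_year[year] / months_observed[year]
    for year in claims_per_year
}
for year, avg in monthly_average.items():
    print(year, round(avg, 1))
```

The total agrees with the 19,524 observations reported for the final dataset.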

Year    # claims
2004    1407
2005    3244
2006    3239
2007    3217
2008    3363
2009    3713
2010    1341

Table 4.1: Number of claims per starting year of the duration

A short description of each available variable in the dataset can be found in Table 4.2. From the literature it turns out that business cycle variables play an important role in explaining the sick leave durations of employees and self-employed. Variables related to the business cycle, such as the GDP growth rate and the business confidence index, will therefore be used as explanatory variables in our models. A short description of these variables can be found in Table 4.2. Later on, in Section 4.3, we provide more information on the variables related to the business cycle.

¹ WAZ: Wet Arbeidsongeschiktheidsverzekering Zelfstandigen.

Variable name                 Description

Personal characteristics
Age                           Age of the claimant (in years)
Gender                        Gender of the claimant (male/female)
Occupational class            Occupational class of the claimant. Either Retail, Medical services, Catering, recreation and tourism, Industry and construction, Horticulture, Livestock farming, Business services, Other Agriculture or Others (dummy variables)

Contract characteristics
Brand                         Brand chosen by the insured (Brand 1, 2, 3 or 4)
EndAge                        Age of the claimant at which the contract will end no matter the disability status (in years)
Amount insured                Yearly amount insured (in euro)
Index                         Yearly indexation rate of the replacement income (%)

Type of illness
                              Cardiovascular diseases, Skin diseases, Locomotive diseases, Neurological diseases, Psychological diseases, Digestive diseases, Urogenous diseases, Ophthalmology diseases, Other diseases (dummy variables)

Time varying variables
DP(i)                         Disability percentage belonging to the i-th month of the duration
GDP growth rate ¹             Quarterly data on the real growth rate of the Dutch economy (in percentage)
Business confidence index ²   Quarterly data on business confidence (in percentage)

Table 4.2: Description of the available variables

¹ Data can be found on the website: www.cbs.nl


4.2 Replacement rates

For each claim in the dataset the duration in months is known, together with the development of the replacement rate over time. This replacement rate equals the percentage of incapacity of the claimant, which is determined at the end of each calendar month and can take any value between 0% and 100%. To simplify the calculations, we create several disability states. The simplest case is the case of two states. The first disability state corresponds to a replacement rate of 0–25% and the other disability state corresponds to replacement rates higher than 25%. The first state is labeled 'W/U', which stands for at work or unpaid sick leave, and the second state is labeled 'P', which stands for paid sick leave. The first state actually consists of two groups: claimants with a replacement rate of 1–25% and healthy self-employed. These two groups are aggregated into one group because it is difficult to distinguish between them: many self-employed do not inform the insurance company when they move from unpaid sick leave to at work.² Of the two possible transitions between these states, one reflects an improvement of the health status while the other is related to a decline of the health status. The data only contain information on the sick leave durations, hence the relevant transition on which we focus is the transition P → W/U. Notice that all claimants start in the same state, namely state 'P', and that we only focus on the transition P → W/U. For this reason we call the model with two states a single-state model.

Besides a single-state duration model we also use a multi-state duration model. Such a model consists of three or more states; in our case it consists of three states. The first state is exactly the same as the first state in the two-state situation. In the three-state model we split the second state into two separate states. A graphical representation of the three states can be found in the right panel of Figure 4.1. The state 'P50' corresponds to a replacement rate of 26–50% and the state 'P100' corresponds to replacement rates higher than 50%. In the three-state model there are six transitions. The relevant transitions for our analysis are P100 → W/U, P50 → W/U, P100 → P50 and P50 → P100. The first three transitions reflect an improvement of the health status, while the last one is related to a decline in the health status.
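The state definitions above can be sketched as a simple mapping from monthly replacement rates to states, followed by a classification of the observed transitions. The Python below is an illustration under the thresholds stated in the text; the function names and the example sequence of replacement rates are hypothetical.

```python
def disability_state(replacement_rate: float, three_states: bool = True) -> str:
    """Map a monthly replacement rate (in %) to a disability state."""
    if not 0 <= replacement_rate <= 100:
        raise ValueError("replacement rate must lie between 0 and 100")
    if replacement_rate <= 25:
        return "W/U"  # at work or unpaid sick leave
    if not three_states:
        return "P"    # paid sick leave (two-state representation)
    return "P50" if replacement_rate <= 50 else "P100"


# Ordering of the three states by severity, used to classify transitions.
SEVERITY = {"W/U": 0, "P50": 1, "P100": 2}


def transitions(rates):
    """List the state changes in a sequence of monthly replacement rates."""
    states = [disability_state(r) for r in rates]
    result = []
    for previous, current in zip(states, states[1:]):
        if previous != current:
            kind = "improvement" if SEVERITY[current] < SEVERITY[previous] else "decline"
            result.append((previous, current, kind))
    return result


# Hypothetical claim with five monthly replacement rates.
print(transitions([100, 80, 40, 60, 20]))
# [('P100', 'P50', 'improvement'), ('P50', 'P100', 'decline'), ('P100', 'W/U', 'improvement')]
```

Months in which the state does not change (here 100% followed by 80%, both 'P100') produce no transition, which is why only state changes enter a multi-state duration model.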

4.3 Preliminary statistics

To get some insight into the available variables, we first calculate some statistics. Preliminary statistics of the personal characteristics can be found in Table 4.3, preliminary statistics of the contract characteristics can be found in Table 4.4 and preliminary statistics of the type of illness can be found in Table 4.5. A plot of the time varying variables can be found in Figure 4.2.

² The insurance company only provides a replacement income when the claimant is disabled for more than 25%.

Figure 4.1: Graphical representation of the different states. Left panel (a): two-state representation; right panel (b): three-state representation.

Table 4.3: Preliminary statistics (personal characteristics)
Age: Min, Q1, Mean, Q3, Max
Group: Agriculture, SME, Other
Occupational Group: Industry & construction, Farmer (livestock), Other Agriculture, Medical Services, Others
Gender: M %, F %

The youngest claimant is ... years old at the start of his spell and the oldest claimant is ... years old at the start of his spell. ... of the claimants are younger than ... and ... of the claimants are older than .... The average age of the claimants equals ....

For each claimant information is available about the occupational class he belongs to. Only the four largest occupational classes are shown in Table 4.3. The largest occupational class is Industry & Construction with ... claimants, the second largest class is Livestock Farming with ... claimants, and the third, fourth and fifth largest classes are Other Agriculture, Medical Services and Horticulture with ..., ... and ... claimants respectively. From the occupational classes we see that a lot of claimants are active in agriculture and horticulture. This is not surprising, because a part of today's insurance company has traditionally been an insurance company for, and founded by, farmers and horticulturists.

The last personal characteristic discussed here is gender. Of all claimants ... are male and only ... female. Either males claim more often than females or there are many more self-employed males than self-employed females. Most likely, there are many more self-employed males at risk than self-employed females.

Indexation rate      Amount insured    EndAge         Brand
0%: ...              Min   ...         Min   ...      Brand1 ...
3%: ...              Q1    ...         Q1    ...      Brand2 ...
5%: ...              Mean  27150       Mean  60.63    Others ...
Wage Index: ...      Q3    ...         Q3    ...
Price Index: ...     Max   ...         Max   ...
Other: ...

Table 4.4: Preliminary statistics (contract characteristics)

The amount insured in Table 4.4 is only meant to give some insight into the true amount insured. The actual values are slightly different because the amount insured is a time-varying variable. Each year, at the renewal date of the contract, the amount insured can be adjusted and, if needed, corrected for inflation by means of an indexation. The replacement income of a claimant depends on the amount insured at the start of the incapacity, the indexation rate and the percentage of incapacity. The replacement income of a claimant with an incapacity of 100% equals the amount insured at the start of the incapacity. When the claimant is still unable to work after a year, the replacement income for the next year will be the replacement income of the previous year plus a correction for inflation. The indexation rate used for this correction is in general not the same as the indexation rate used to correct the amount insured. The indexation rate is contract specific. Frequently used fixed indexation rates are 0%, 3%, 4% and 5%. Sometimes a claimant opts for a flexible, time-varying indexation rate. Frequently used time-varying indexation rates are the price and wage index. In some rare cases a claimant has chosen a mix of the given indexations. For example, one part of the replacement income is indexed by a fixed rate while the other part is indexed by a time-varying rate.
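The yearly indexation of the replacement income described above can be sketched as follows. This is only an illustration: the function name is hypothetical, and scaling the amount insured proportionally by the percentage of incapacity is an assumption (the text only states that the income depends on that percentage and equals the amount insured at 100% incapacity).

```python
def replacement_income_path(amount_insured, incapacity, indexation_rates):
    """Yearly replacement income for a claim lasting len(indexation_rates) + 1 years.

    amount_insured   -- amount insured at the start of the incapacity
    incapacity       -- fraction of incapacity in [0, 1]; proportional scaling is an assumption
    indexation_rates -- one rate per later year, e.g. [0.03, 0.03] for a fixed 3% contract
    """
    income = amount_insured * incapacity
    path = [income]
    for rate in indexation_rates:
        # Next year's income is the previous year's income plus the inflation correction.
        income = income * (1.0 + rate)
        path.append(income)
    return path


# Mean amount insured from Table 4.4, full incapacity, fixed 3% indexation.
fixed_3_percent = replacement_income_path(27150, 1.0, [0.03, 0.03])
```

A time-varying indexation such as the wage or price index would simply supply a different rate for each year in `indexation_rates`.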

The variable Endage stands for the age at which the contract will end. Claims that ended because the claimant reached his Endage are treated as right censored, as mentioned earlier in this chapter. Most of the claimants have an Endage of ... years. The minimum Endage equals ... years and the maximum equals ... years.

The last contract characteristic is Brand. There are four brands. Because of the sensitive information in the dataset we cannot give the real brand names. Instead we label them brand1 up to brand4. The numbers of observations for brand3 and brand4 are low, so these two brands are grouped together. From Table 4.4 it is clear that the largest brand in the dataset is brand1; ...% of the claimants are insured by brand1. The other brands are underrepresented in the dataset compared to brand1.

CAS Code

A: Other: ...
B: Blood vessel diseases: ...
C: Cardiovascular diseases: ...
D: Skin diseases: ...
E: Endocrine organs diseases: ...
H: Otolaryngology diseases: ...
L: Locomotive diseases: ...
N: Neurological diseases: ...
P: Psychological diseases: ...
R: Respiratory diseases: ...
S: Digestive diseases: ...
U: Urogenous diseases: ...
V: Ophthalmology diseases: ...

Table 4.5: Preliminary statistics (type of illness)

Each claim is based on a medical certificate which contains a medical diagnosis. Each medical diagnosis has a CAS code. Examples are the medical diagnoses neck pain and enlarged lymph node, which have CAS codes L101 and C104 respectively. There are many different codes. Including all CAS codes in our models would lead to a very large number of parameters to be estimated. Therefore we aggregate the CAS codes into groups. Each CAS code starts with a letter followed by three digits. All CAS codes with the same letter are aggregated into one group. A short summary of the medical diagnoses and the CAS codes can be found in Table 4.5.
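The aggregation by first letter can be sketched as below. The mapping is copied from Table 4.5; the function name and the example codes `L205` and `P310` are made up for illustration and are not taken from the dataset.

```python
from collections import Counter

# Letter-to-group mapping copied from Table 4.5.
CAS_GROUPS = {
    "A": "Other",
    "B": "Blood vessel diseases",
    "C": "Cardiovascular diseases",
    "D": "Skin diseases",
    "E": "Endocrine organs diseases",
    "H": "Otolaryngology diseases",
    "L": "Locomotive diseases",
    "N": "Neurological diseases",
    "P": "Psychological diseases",
    "R": "Respiratory diseases",
    "S": "Digestive diseases",
    "U": "Urogenous diseases",
    "V": "Ophthalmology diseases",
}


def cas_group(code):
    """Return the disease group of a CAS code such as 'L101' (one letter, three digits)."""
    code = code.strip().upper()
    if len(code) != 4 or not code[1:].isdigit() or code[0] not in CAS_GROUPS:
        raise ValueError(f"not a recognised CAS code: {code!r}")
    return CAS_GROUPS[code[0]]


# L101 and C104 appear in the text; L205 and P310 are hypothetical codes.
claims = ["L101", "C104", "L205", "P310"]
counts = Counter(cas_group(code) for code in claims)
```

Each group would then enter the model as one dummy variable instead of one dummy per individual code.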


explanatory variables in our models. Examples of such variables are the GDP growth rate and the business confidence index. We use the business confidence index used by the Chamber of Commerce (in Dutch: Kamer van Koophandel). Four times a year, i.e. quarterly, a large business survey is held by the VNO-NCW, MKB, CBS, EIB and KvK to measure the most important developments and expectations of Dutch trade and industry. The results of the business survey are published afterwards together with the business confidence index, which is the sum of the negative and positive expectations for four indicators, namely sales, investment, export and employment, and which ranges over [−100, 100]. A positive value implies that companies generally expect that, compared to the previous quarter, most of the indicators will go up. A negative value implies that companies generally expect that most of the indicators will go down. It would be better to include the four indicators separately in our models. Unfortunately, the required data on the four indicators have only been published since the fourth quarter of 2008. We are therefore limited to using the sum of the four indicators, hence the business confidence index. More information on the business confidence index can be found on the website of the Chamber of Commerce.³ Plots of the GDP growth rate and the business confidence index over time can be found in Figure 4.2. Notice that the values of the business confidence index in the plot are divided by 10.

Figure 4.2: GDP growth rate (%) and business confidence index / 10 (%) over time

From January 2004 up to June 2008 the GDP growth rate and the business confidence index are positive. From June 2008 we observe four quarters of negative GDP growth. The business confidence index continues to be positive for the next two quarters. After December 2008 we observe four quarters of negative business confidence. The negative GDP growth and the negative business confidence are probably due to the international financial and economic crisis, which started around July 2008. It seems that the business confidence index is more sensitive to this financial crisis than the GDP growth rate. During the financial crisis we observe extremely negative business confidence.

³ www.kvk.nl.

5 Proportional hazards model

In our analysis we use several models. A mixed proportional hazards model will be used for the two states representation of the data and a multi-state mixed proportional hazards model will be used for the three states representation. The mixed proportional hazards model will be introduced and discussed in Sections 5.1 and 5.2 of this chapter. The multi-state mixed proportional hazards model will be introduced and discussed in Section 5.3.

5.1 Proportional hazards model

The hazard rate is a useful tool in the analysis of survival data. The hazard rate expresses the instantaneous 'probability' of ending the spell at time $t$, given that this did not occur before time $t$. Notice that this instantaneous probability is not a real probability because it can be larger than 1. Usually the hazard rate will depend on certain covariates. The conditional hazard rate of duration $Y$ given covariates $X = x$ is defined as

\[
\theta(t \mid x) = \lim_{\Delta t \downarrow 0} \frac{P(Y \le t + \Delta t \mid Y > t, X = x)}{\Delta t}. \tag{5.1}
\]

The proportional hazards model proposed by Cox (1972) is widely used to analyze the effects of covariates on the hazard rate. Proportional hazards models are also known as multiplicative hazard models. The hazard rate in the proportional hazards model is of the form

\[
\theta(t \mid X) = \theta_0(t) \exp(\beta' X), \tag{5.2}
\]

where $X$ is a $K$-dimensional vector of covariates, $\beta$ a $K$-dimensional vector of coefficients and $\theta_0(t)$ is the baseline hazard, which only depends on $t$.¹ Suppose we take a fixed time point $t$, let us call it $\tilde{t}$. Then the ratio of the hazards of any two individuals $i$ and $j$ equals

\[
\frac{\theta(\tilde{t} \mid X_i)}{\theta(\tilde{t} \mid X_j)} = \exp\bigl(\beta'(X_i - X_j)\bigr), \tag{5.3}
\]

which is independent of time. In order to fit a proportional hazards model, we have to estimate the $\beta$ parameters. The estimate of $\beta$, which we denote by $\hat{\beta}$, is obtained by partial maximum likelihood, where the partial likelihood function to be maximized is given by

\[
PL(\beta) = \prod_{i=1}^{n} \prod_{t \ge 0} \left( \frac{Y_i(t)\, r_i(\beta, t)}{\sum_j Y_j(t)\, r_j(\beta, t)} \right)^{dN_i(t)}, \tag{5.4}
\]

where $Y_i(t)$ equals one when claimant $i$ is under observation and at risk at time $t$ and zero otherwise.² Furthermore, $N_i(t)$ equals the number of observed events of claimant $i$ in the period $[0, t]$, and $r_i(\beta, t)$ is the risk score of claimant $i$, defined as

\[
r_i(\beta, t) = \exp(X_i(t)\beta). \tag{5.5}
\]

One of the great advantages of the proportional hazards (PH) model is that the partial likelihood function does not depend on the baseline hazard. Therefore, we do not need to specify the baseline hazard in the PH model. The exact derivation of ˆβ together with its variance is rather long and complicated. Therefore we refer to Therneau and Grambsch (2000) for the exact derivation.
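To make the structure of the partial likelihood in equation (5.4) concrete, the following is a minimal sketch of its logarithm in plain Python. It assumes time-constant covariates, no tied event times and no delayed entry, so it is simpler than the setting of this thesis; the function name and the toy data are made up. Note that the baseline hazard does not appear anywhere in the computation.

```python
import math


def cox_partial_loglik(beta, times, events, covariates):
    """Log partial likelihood for time-constant covariates and no tied event times.

    beta       -- list of K coefficients
    times      -- observed durations (event or censoring time)
    events     -- 1 if the spell ended at that time, 0 if right censored
    covariates -- list of K-dimensional covariate lists, one per claimant
    """
    # Risk score r_i = exp(X_i beta) for every claimant.
    scores = [math.exp(sum(b * x for b, x in zip(beta, xs))) for xs in covariates]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i == 0:
            continue  # censored spells only enter through the risk sets
        # Risk set: everyone still under observation at time t_i.
        at_risk = sum(r for r, t in zip(scores, times) if t >= t_i)
        loglik += math.log(scores[i] / at_risk)
    return loglik


# Made-up toy data: three claimants and one covariate (for example a gender dummy).
times = [2.0, 5.0, 7.0]
events = [1, 0, 1]
covariates = [[1.0], [0.0], [0.0]]
ll_at_zero = cox_partial_loglik([0.0], times, events, covariates)
```

Maximizing this function over `beta` with a numerical optimizer would give the partial maximum likelihood estimate for the toy data.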

The partial likelihood is derived under the assumption of continuous survival times. This assumption is doubtful because survival times, in our case recovery times, are measured in discrete time units. This implies that there will be ties in the survival times. One option for dealing with discrete-time data is to use a discrete survival model. Examples are the logistic and the cloglog models. In both cases a duration dependence should be specified, for example by means of a $p$-th degree polynomial. Another option is to keep assuming continuous survival times. Under that assumption there are three common methods for dealing with ties, namely the Breslow approximation, the Efron approximation and the exact partial likelihood. For the estimation of the parameters in the PH model we will use the Efron approximation.³ For a detailed description of tied data and the three methods for handling ties, we refer to Everitt and Rabe-Hesketh (2001) and Therneau and Grambsch (2000).
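The difference between the Breslow and the Efron approximation can be illustrated for a single event time with tied events. The sketch below uses the standard textbook forms of both approximations; the function name and the numbers are made up and it is not the code used for the estimations in this thesis.

```python
import math


def tied_loglik_contribution(event_scores, risk_set_total, method="efron"):
    """Log-likelihood contribution of one event time with d tied events.

    event_scores   -- risk scores r_i of the d claimants whose spell ends at this time
    risk_set_total -- sum of risk scores over everyone at risk (tied events included)
    """
    d = len(event_scores)
    tied_total = sum(event_scores)
    numerator = sum(math.log(r) for r in event_scores)
    if method == "breslow":
        # Every tied event is compared with the full risk set.
        denominator = d * math.log(risk_set_total)
    elif method == "efron":
        # The k-th tied event sees the risk set with a fraction k/d of the tied scores removed.
        denominator = sum(
            math.log(risk_set_total - (k / d) * tied_total) for k in range(d)
        )
    else:
        raise ValueError("method must be 'breslow' or 'efron'")
    return numerator - denominator


scores = [1.2, 0.8]  # two spells ending in the same month (made-up values)
total = 5.0          # made-up risk-set total
breslow = tied_loglik_contribution(scores, total, "breslow")
efron = tied_loglik_contribution(scores, total, "efron")
```

Without ties (`d = 1`) both methods give the same contribution; the more ties there are, the further the two approximations drift apart.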

5.2 Mixed proportional hazard model

5.2.1 Unobserved heterogeneity

² Notice that at risk at time t means that the claimant can experience a transition from one state to another state at time t.

³ Also in the mixed proportional hazards (MPH) model the Efron approximation will be used for dealing with ties.

A particular concern in survival data is unobserved heterogeneity. A clear description of why this is the case can be found in Box-Steffensmeier and Zorn (1999), who give a simple illustration which we repeat here. Consider a population that consists of two subpopulations with different risks of experiencing the event. Both baseline hazards are assumed to be constant, i.e. the survival functions are exponential. The group with the higher hazard rate has a higher risk of experiencing an event than the group with the lower hazard rate, implying that the share of the high-hazard group in the remaining sample declines over time. As a result, the hazard rate in the total population will appear to fall over time, despite the fact that the hazard of both groups remains constant over time. So, unobserved heterogeneity may give the appearance of an aggregate decline in hazard rates even though individual hazards are constant. This phenomenon is also known as 'weeding out' and 'sorting'.
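The two-group illustration can be reproduced numerically. In the sketch below the population hazard is the density of the mixture divided by its survival function; the group shares and hazard levels are made-up numbers chosen only to show the effect.

```python
import math


def population_hazard(t, share_high, hazard_high, hazard_low):
    """Hazard rate of a mixture of two groups with constant individual hazards."""
    surv_high = share_high * math.exp(-hazard_high * t)
    surv_low = (1.0 - share_high) * math.exp(-hazard_low * t)
    # Density of the mixture divided by its survival function.
    return (hazard_high * surv_high + hazard_low * surv_low) / (surv_high + surv_low)


# Made-up numbers: half of the population recovers at rate 0.5 per month, the other half at 0.1.
for month in (0, 6, 12, 24):
    print(month, round(population_hazard(month, 0.5, 0.5, 0.1), 3))
```

The printed hazard starts at the average of the two group hazards and falls towards the hazard of the slow group, although neither group's own hazard changes.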

A problem of models that do not allow for unobserved heterogeneity is that the estimated hazard rate will be biased towards negative duration dependence, i.e. the degree of negative duration dependence will be overestimated. To allow for unobserved heterogeneity we can add a frailty term, or random effect, $v$ to the model in equation (5.2). This random effect $v$ can be interpreted as a function of unobserved explanatory variables, see for example Van den Berg (1997), Van den Berg (2001) and Hougaard (2001). Examples of possible unobserved explanatory variables are risk aversion, motivation to recover, willingness to take prescribed medication and education (Spierdijk and Koning (2009)). We will use the specification in which the frailty term $\tilde{v} \equiv \exp(v)$ operates multiplicatively on the hazard rate, hence the hazard rate can be written as

\begin{align}
\theta(t \mid X, v) &= \theta_0(t) \exp(\beta' X + v) \tag{5.6} \\
&= \theta(t \mid X)\, \tilde{v}. \tag{5.7}
\end{align}

There is no information available on the random effect $\tilde{v}$. This implies that we have to make assumptions about the individual values of $\tilde{v}$. For identification we assume that $\tilde{v}$ is independent of $X$. The hazard rate cannot be negative. Therefore the distribution of $\tilde{v}$ is usually chosen from the class of positive distributions, see Box-Steffensmeier and Zorn (1999). The most frequently used distributions are the gamma, the normal and the $t$ distribution. Another popular frailty distribution is a discrete distribution with $Q$ mass points, which was proposed by Heckman and Singer (1984). This type of frailty is therefore also known as Heckman-Singer frailty.

5.2.2 Mixed proportional hazard model with continuous frailty distribution

A PH model with a random effect is better known as a mixed proportional hazard (MPH) model. Inserting a frailty term implies that the proportionate response of the hazard rate to a change in a specific regressor will no longer be constant. The proportionate response will decline over time. Furthermore, the true proportionate response of the hazard to a change in a specific regressor will almost always be underestimated when frailty is not taken into account.
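Why the proportionate response declines can be made concrete for one particular frailty distribution. The following is a sketch under the assumptions that $\tilde{v}$ is gamma distributed with mean 1 and variance $\sigma^2$ and that the covariates are constant over time; the symbols $\Lambda_0(t) = \int_0^t \theta_0(s)\,ds$ and $\theta_{obs}$ are notation introduced here for the illustration. Integrating $\tilde{v}$ out of the survival function gives the standard result

\[
S(t \mid X) = E\bigl[\exp\bigl(-\tilde{v}\, \Lambda_0(t) e^{\beta' X}\bigr)\bigr] = \bigl(1 + \sigma^2 \Lambda_0(t) e^{\beta' X}\bigr)^{-1/\sigma^2},
\]

so that the hazard rate observed at the population level is

\[
\theta_{obs}(t \mid X) = -\frac{d}{dt} \log S(t \mid X) = \frac{\theta_0(t) e^{\beta' X}}{1 + \sigma^2 \Lambda_0(t) e^{\beta' X}},
\qquad
\frac{\partial \log \theta_{obs}(t \mid X)}{\partial X_k} = \frac{\beta_k}{1 + \sigma^2 \Lambda_0(t) e^{\beta' X}}.
\]

At $t = 0$ the proportionate response equals $\beta_k$, and it shrinks in absolute value as $\Lambda_0(t)$ grows. A model that ignores frailty ($\sigma^2 = 0$) forces the response to be constant and therefore tends to report a value below the true $\beta_k$ in absolute terms.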

Suppose that person $l$ has $k_l$ different claims, each with its own frailty term $v$.⁴ Conditional on $X = x$ and $v$, the individual hazard function $\theta(t \mid x, v)$ is the same for all these spells. For the sake of simplicity, we assume for the moment that there is no censoring and that none of the claims has a delayed entry. The likelihood contribution of person $l$ is given by

\[
\int_0^\infty \cdots \int_0^\infty \prod_{k=1}^{k_l} f(t_{l,k} \mid X_{l,k}, v_k)\, dG(\tilde{v}_1, \ldots, \tilde{v}_{k_l}), \tag{5.8}
\]

where $t_{l,k}$ is the length of the $k$-th duration of individual $l$ and where $G(\tilde{v}_1, \ldots, \tilde{v}_{k_l})$ is a joint distribution, see Van den Berg (2001). Following Klein and Moeschberger (2003) we will assume that an individual has a shared frailty term for different spells. Under this assumption the individual likelihood contribution in equation (5.8) reduces to

\[
\int_0^\infty \prod_{k=1}^{k_l} f(t_{l,k} \mid X_{l,k}, v)\, dG(\tilde{v}), \tag{5.9}
\]

where $G(\tilde{v})$ is a particular distribution, for example a gamma, a normal, or a $t$ distribution. In our calculations we will use the gamma distribution as frailty distribution because it is the most frequently used distribution. The mean of the frailty distribution will be normalized to 1. Therefore we only have to estimate the variance of $\tilde{v}$. The individual likelihood contribution in equation (5.9) should of course be corrected for delayed entry and censoring. The estimation of the parameters in the MPH models will be done in the program R.⁵

The only remaining concern is identification of the MPH model. Van den Berg (2001) lists some assumptions for the identification of the single-spell MPH model: (1) there should be enough variation in the observed explanatory variables, (2) the right-hand tail of the frailty distribution should not be too fat, to ensure a finite mean of the frailty term, and (3) the frailty term should be independent of the covariates. Because of the large number of covariates, both continuous and discrete, assumption (1) seems to be satisfied. Assumption (2) is clearly satisfied because a gamma distribution has a finite mean, and assumption (3) has already been made. Because of the relatively few multiple spells in the data, the non-parametric identification of the MPH model in equation (5.7) is guaranteed, see Van den Berg (2001) and Spierdijk et al. (2009).

5.2.3 Mixed proportional hazard model with Heckman-Singer frailty

In this section we provide an introduction to Heckman-Singer frailty. For simplicity, we first discuss Heckman-Singer frailty in the MPH model. Heckman-Singer frailty in the multi-state MPH model will be discussed in Section 5.3.

⁴ We will use an episode split dataset, i.e. each claim of spell length $n$ will be split into $n$ sub-periods. More information on the episode split dataset can be found in Appendix B.

⁵ 1) Version 2.10.1 of R is used for all calculations. 2) An MPH model will be estimated in the case of two states. In the three states case we will estimate a multi-state MPH model. 3) In R there is a survival library which contains pre-programmed functions for estimating PH and MPH models by means of penalized partial likelihood. In both cases the baseline will be unspecified.

Heckman and Singer (1984) proposed to use a discrete frailty distribution instead of a continuous frailty distribution. Suppose this discrete frailty distribution has $Q$ mass points. In that case the individual likelihood contribution in equation (5.9) becomes⁶

\[
\sum_{q=1}^{Q} \prod_{k=1}^{k_l} f(t_{l,k} \mid X_{l,k}, v_q)\, P_q. \tag{5.10}
\]

Here $P_q$ represents the mixing probability of the likelihoods for each mass point and $v_q$ is a group-specific intercept, which models the between-group heterogeneity. The likelihood function equals the product of the individual likelihood contributions adjusted for right censoring and delayed entry. The adjusted likelihood is given by

\[
L = \prod_{l=1}^{n} \sum_{q=1}^{Q} \prod_{k=1}^{k_l} \frac{S(t_{l,k} \mid X_{l,k}, v_q)}{S(t^o_{l,k} \mid X_{l,k}, v_q)}\, \theta^{\delta_{l,k}}(t_{l,k} \mid X_{l,k}, v_q)\, P_q, \tag{5.11}
\]

or in somewhat more convenient notation

\[
L = \prod_{l=1}^{n} \sum_{q=1}^{Q} \prod_{k=1}^{k_l} \left[ \frac{S(t_{l,k} \mid X_{l,k}, v_q)}{S(t^o_{l,k} \mid X_{l,k}, v_q)} \right]^{1-\delta_{l,k}} \left[ \frac{S(t_{l,k} \mid X_{l,k}, v_q)}{S(t^o_{l,k} \mid X_{l,k}, v_q)}\, \theta(t_{l,k} \mid X_{l,k}, v_q) \right]^{\delta_{l,k}} P_q, \tag{5.12}
\]

where we used the fact that $f(t \mid x, v) = S(t \mid x, v)\,\theta(t \mid x, v)$. The start of the $k$-th spell of individual $l$ is given by $t^o_{l,k}$ and the end of this spell is given by $t_{l,k}$. Furthermore, $\delta_{l,k}$ is an indicator function. It equals one when we observe the end of the spell and zero if the observation is censored. The first term in equation (5.12) is the likelihood contribution of a censored spell ($\delta_{l,k} = 0$) and the second term is the likelihood contribution of a completed spell ($\delta_{l,k} = 1$).

In the case of a discrete frailty distribution we should make assumptions on the baseline hazard. We could specify the baseline hazard for example by a Weibull distribution. Another option is to assume that the baseline hazard is piecewise constant. We will come back to the baseline hazard specification in Section 5.4.
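The individual contribution inside equation (5.11) can be sketched in a few lines once a baseline hazard is chosen. The example below assumes a constant baseline hazard purely to keep the survival ratio in closed form; this is an illustrative simplification, not the specification used later, and the function name and inputs are hypothetical.

```python
import math


def hs_likelihood_contribution(spells, beta, mass_points, probabilities, baseline=1.0):
    """Likelihood contribution of one claimant under a Q-point discrete frailty.

    spells        -- list of (t_start, t_end, delta, x): entry time, exit time,
                     delta = 1 if the end of the spell is observed, covariate list x
    mass_points   -- the Q values v_q
    probabilities -- the Q mixing probabilities P_q (should sum to one)
    baseline      -- constant baseline hazard (illustrative assumption)
    """
    total = 0.0
    for v_q, p_q in zip(mass_points, probabilities):
        product = 1.0
        for t_start, t_end, delta, x in spells:
            hazard = baseline * math.exp(sum(b * xi for b, xi in zip(beta, x)) + v_q)
            # With a constant hazard, S(t_end) / S(t_start) = exp(-hazard * (t_end - t_start)).
            survival_ratio = math.exp(-hazard * (t_end - t_start))
            product *= survival_ratio * hazard ** delta
        total += product * p_q
    return total


# One completed spell of two months, evaluated with one and with two mass points.
spell = [(0.0, 2.0, 1, [0.0])]
single_point = hs_likelihood_contribution(spell, [0.0], [0.0], [1.0])
two_points = hs_likelihood_contribution(spell, [0.0], [-0.5, 0.5], [0.6, 0.4])
```

The full likelihood is the product of these contributions over all claimants, and its logarithm is what a numerical optimizer would maximize over the coefficients, the mass points and the mixing probabilities.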

5.3 Multi-state mixed proportional hazards model

As already mentioned in Section 4.2 we will estimate a multi-state duration model. In Section 4.2 we constructed three disability states. For convenience we again provide the graphical representation of these three states, see Figure 5.1.

In the multi-state model we will focus on four transitions, namely P100 → W/U, P50 → W/U, P100 → P50 and P50 → P100. To simplify the mathematical equations later on, we label the three states as follows: 0: W/U, 1: P50 and 2: P100. By doing so, the transition P100 → W/U is given by 2 → 0, etc. In the multi-state MPH model each transition intensity will be modeled according to an MPH model.

Figure 5.1: Transition diagram

The MPH model for the transition $i \to j$, with $(i,j) \in A \equiv \{(1,0), (2,0), (2,1), (1,2)\}$ and conditional on the covariates $X = x$ and frailty term $v_{ij}$, is then defined as

\[
\theta_{ij}(t \mid x, v_{ij}) = \theta_{ij0}(t) \exp(x'\beta_{ij} + v_{ij}), \tag{5.13}
\]

where $\beta_{ij}$ is a $K$-dimensional vector of coefficients. The first three transitions are related to an improvement in the health status while the last one is related to a decline in the health status. Because the nature of the transitions differs, we explicitly allow the coefficients $\beta_{ij}$ to differ between transitions.

⁶ For simplicity we still assume here no censoring and no delayed entries, in the calculations we will adjust the likelihood function

Without taking unobserved heterogeneity into account, the estimation of the multi-state MPH (MMPH) model boils down to estimating four separate PH models, one for each transition. In the previous subsection we already mentioned that we will model unobserved heterogeneity by means of Heckman-Singer frailty. Assuming the same number of mass points for each frailty distribution, say $Q$ mass points, the individual likelihood contribution of individual $l$ is defined as follows:

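\[
L_l = \sum_{v_{10}} \sum_{v_{20}} \sum_{v_{21}} \sum_{v_{12}} \prod_{(i,j) \in A} \prod_{k=1}^{k_{ij}} \left[ \frac{S_{ij}(t_{l,ij,k} \mid X_{l,ij,k}, v_{ij})}{S_{ij}(t^o_{l,ij,k} \mid X_{l,ij,k}, v_{ij})} \right]^{1-\delta_{l,ij,k}} \times \left[ \frac{S_{ij}(t_{l,ij,k} \mid X_{l,ij,k}, v_{ij})}{S_{ij}(t^o_{l,ij,k} \mid X_{l,ij,k}, v_{ij})}\, \theta_{ij}(t_{l,ij,k} \mid X_{l,ij,k}, v_{ij}) \right]^{\delta_{l,ij,k}} p(v_{10}, v_{20}, v_{21}, v_{12}), \tag{5.14}
\]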

where $k_{ij}$ is the number of observed transitions $i \to j$ of individual $l$. Notice that $\sum_{v_{10}}$ is defined as the sum over all possible outcomes of $v_{10}$. The other three summations are defined in a similar way. The distribution of $(v_{10}, v_{20}, v_{21}, v_{12})$ is determined by the probability distribution $p(v_{10}, v_{20}, v_{21}, v_{12})$. Because the joint probabilities should add up to one, we have to estimate $4Q + Q^4 - 1$ frailty parameters: $4Q$ mass points and $Q^4 - 1$ joint probabilities. For $Q = 2$ the computational burden is already very large. Therefore we will follow, for example, Heckman and Walker (1990), Van den Berg (2001) and Spierdijk and Koning (2009), who opt for a one-factor loading frailty specification. In that case we take an individual-specific random variable $\omega$ with a discrete distribution on $Q$ mass points, i.e. $P_q \equiv P(\omega = \omega_q)$ for $q = 1, 2, \ldots, Q$, such that $\sum_{q=1}^{Q} P_q = 1$. Furthermore, they assume that

\[
v_{ij,q} = a_{ij}\, \omega_q + b_{ij}, \tag{5.15}
\]

for certain constants $a_{ij}$ and $b_{ij}$ and with $(i,j) \in A$. By using a one-factor loading frailty specification the individual likelihood contribution in equation (5.14) reduces to

\[
L_l = \sum_{q=1}^{Q} \prod_{(i,j) \in A} \prod_{k=1}^{k_{ij}} \left[ \frac{S_{ij}(t_{l,ij,k} \mid X_{l,ij,k}, v_{ij,q})}{S_{ij}(t^o_{l,ij,k} \mid X_{l,ij,k}, v_{ij,q})} \right]^{1-\delta_{l,ij,k}} \times \left[ \frac{S_{ij}(t_{l,ij,k} \mid X_{l,ij,k}, v_{ij,q})}{S_{ij}(t^o_{l,ij,k} \mid X_{l,ij,k}, v_{ij,q})}\, \theta_{ij}(t_{l,ij,k} \mid X_{l,ij,k}, v_{ij,q}) \right]^{\delta_{l,ij,k}} P_q, \tag{5.16}
\]

where $\delta_{l,ij,k}$ is an indicator function. It equals one when the $k$-th observation of individual $l$ belonging to the transition $i \to j$ is observed, and zero otherwise. Notice again that $k_{ij}$ is the total number of $i \to j$ transitions in the episode split dataset of a specific claimant. The likelihood function of all individuals is given by

\[
L = \prod_{l=1}^{n} \sum_{q=1}^{Q} \prod_{(i,j) \in A} \prod_{k=1}^{k_{ij}} \left[ \frac{S_{ij}(t_{l,ij,k} \mid X_{l,ij,k}, v_{ij,q})}{S_{ij}(t^o_{l,ij,k} \mid X_{l,ij,k}, v_{ij,q})} \right]^{1-\delta_{l,ij,k}} \times \left[ \frac{S_{ij}(t_{l,ij,k} \mid X_{l,ij,k}, v_{ij,q})}{S_{ij}(t^o_{l,ij,k} \mid X_{l,ij,k}, v_{ij,q})}\, \theta_{ij}(t_{l,ij,k} \mid X_{l,ij,k}, v_{ij,q}) \right]^{\delta_{l,ij,k}} P_q. \tag{5.17}
\]

By imposing a one-factor loading frailty specification the number of unknown frailty parameters is reduced to $2Q + 5$. For more details on the properties of one-factor loading frailty distributions and identification criteria of the MMPH model we refer to Van den Berg (2001).
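The gain from the one-factor loading specification is easy to quantify with the two parameter counts given in the text. The short sketch below only evaluates those two formulas; the function names are made up.

```python
def unrestricted_frailty_parameters(q):
    """4Q mass points plus Q**4 - 1 free joint probabilities."""
    return 4 * q + q ** 4 - 1


def one_factor_frailty_parameters(q):
    """Count stated in the text for the one-factor loading specification."""
    return 2 * q + 5


for q in (2, 3, 4):
    print(q, unrestricted_frailty_parameters(q), one_factor_frailty_parameters(q))
```

The unrestricted count grows with the fourth power of the number of mass points, while the one-factor count grows only linearly.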

The estimates of $\beta_{ij}$ in equation (5.13) will be derived by maximum likelihood, where the likelihood function is given in equation (5.17).


5.4 Baseline hazard specification in the MMPH model with HS frailty

In the previous sections we already mentioned that maximum likelihood will be applied to estimate the parameters in the MMPH model and that in the case of a discrete frailty distribution we should specify the baseline hazard. In the MMPH model we will assume transition-specific baseline hazards $\theta_{ij0}(t)$. Furthermore, we will assume that these transition-specific baseline hazards are piecewise constant on the following intervals: $I_1 := (0, 1]$, $I_2 := (1, 2]$, $I_3 := (2, 4]$, $I_4 := (4, 6]$, $I_5 := (6, 12]$, $I_6 := (12, \infty)$. These assumptions imply that the transition-specific baseline hazard $\theta_{ij0}(t)$ equals

\[
\theta_{ij0}(t) = \sum_{k=1}^{6} c_{ij,k}\, I(t \in I_k). \tag{5.18}
\]

For each transition-specific baseline hazard we can normalize one of these parameters, say $c_{ij,1} \equiv 1$. Estimates of the coefficients will be obtained by maximizing the likelihood function in equation (5.17).
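A piecewise-constant baseline hazard as in equation (5.18), together with the cumulative hazard needed for the survival functions in the likelihood, can be sketched as follows. The interval boundaries are those given in the text; the function names and the level values are made up for illustration.

```python
import math

# Interval boundaries in months, taken from the text:
# (0, 1], (1, 2], (2, 4], (4, 6], (6, 12], (12, inf).
CUTS = [0.0, 1.0, 2.0, 4.0, 6.0, 12.0, math.inf]


def baseline_hazard(t, levels):
    """Piecewise-constant baseline hazard; levels holds the six values c_1, ..., c_6."""
    if t <= 0:
        raise ValueError("t must be positive")
    for k in range(6):
        if CUTS[k] < t <= CUTS[k + 1]:
            return levels[k]


def cumulative_baseline_hazard(t, levels):
    """Integral of the baseline hazard over (0, t]."""
    total = 0.0
    for k in range(6):
        overlap = min(t, CUTS[k + 1]) - CUTS[k]
        if overlap <= 0:
            break  # t lies before this interval, so later intervals contribute nothing
        total += levels[k] * overlap
    return total


levels = [1.0, 0.8, 0.6, 0.5, 0.3, 0.1]  # made-up values with c_1 normalised to 1
survival_at_3 = math.exp(-cumulative_baseline_hazard(3.0, levels))
```

With covariates and frailty, the survival function of a transition would multiply the cumulative baseline hazard by the corresponding exponential term before exponentiating.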

6 Empirical results of the two states representation

Figure 6.1: Transition diagram

In the two-state representation, the transition P → W/U is modeled by an MPH model with gamma frailty. The model is given by equation (5.6), which is reproduced here for convenience:

\[
\theta(t \mid X, v) = \theta_0(t) \exp(\beta' X + v). \tag{6.1}
\]

Sick leave is associated with numerous determinants and over the years, extensive research has been done to establish the precise influence of those determinants. In Chapter 2 we already mentioned that according to the literature, sick leave durations are related to gender, age, education and marital status. It would therefore be a good idea to include these variables in our model. Unfortunately, not all of these variables are available in the dataset. Those variables that are motivated by previous studies and that are available in the dataset will therefore be included in the models. The included variables are listed in Table 4.2.

Different types of disorders will therefore be included in the model as explanatory variables. Some disorder groups are aggregated into one group, called other disorders. The reason for this aggregation is that for some types of disorders the number of observations in the dataset is very low. The same is done for some occupational classes. The contract characteristic Brand is left out of the model because only a few claimants have an insurance contract of Brand 2, Brand 3 or Brand 4.

Besides the time-constant variables (we assume that the above variables do not change over time) we will also include several time-varying variables as explanatory variables. The time-varying variables that will be included in the model are the GDP growth rate and the business confidence index (BCI). In the first quarter of 2009 there was a very large drop in the business confidence index. It even became negative. In the COEN reports of the first and second quarter of 2009 it is stated that this large drop is due to the financial and economic crisis.¹ In the fourth quarter of 2008 the sectors industry and transport were most affected by the financial and economic crisis, but from the first quarter of 2009 other sectors also started to observe the effects of the crisis. So, from the beginning of 2009 the effects of the financial and economic crisis became more and more visible, and it is therefore doubtful that the coefficient of the business confidence index is constant over time. Note that time here is calendar time and not the number of months that a self-employed is unable to work. To allow for a structural change in the coefficient of the business confidence index we define this coefficient as follows:

\[
\beta_{BCI}(t) = \beta_{BCI,1} + \beta_{BCI,2}\, I(t \ge \text{2009-04-01}), \tag{6.2}
\]

where $I(\cdot)$ is the indicator function. It equals zero when the record in the episode split dataset is observed before 2009-04-01 (year-month-day) and one otherwise.² We have chosen to set the break at 2009-04-01 because at that moment most of the self-employed started to observe the effects of the financial and economic crisis. Furthermore, at that moment the claimants observed the large drop of the business confidence index.³ For the GDP growth rate we could also introduce a structural break. However, a preliminary estimation showed no significant evidence for such a break. For this reason we only allow a structural break in the coefficient of the business confidence index.
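The break in equation (6.2) amounts to switching the coefficient depending on the calendar date of each record in the episode split dataset. A minimal sketch, using the rounded estimates reported in Table 6.1 and hypothetical function names:

```python
import math
from datetime import date

BREAK_DATE = date(2009, 4, 1)
BETA_BCI_1 = 0.022    # coefficient before the break (Table 6.1, rounded)
BETA_BCI_2 = -0.025   # change in the coefficient from the break onwards (Table 6.1, rounded)


def beta_bci(record_date):
    """Coefficient of the business confidence index for a record observed on record_date."""
    indicator = 1 if record_date >= BREAK_DATE else 0
    return BETA_BCI_1 + BETA_BCI_2 * indicator


def bci_hazard_multiplier(record_date, bci_value):
    """Factor by which the BCI term scales the hazard: exp(beta_BCI(t) * BCI)."""
    return math.exp(beta_bci(record_date) * bci_value)


before = beta_bci(date(2009, 3, 31))
after = beta_bci(date(2009, 4, 1))
```

In a regression package the same effect is usually obtained by adding an interaction of the index with a post-break dummy as an extra covariate.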

6.1 Estimation results

The estimated coefficients of the MPH model with gamma frailty together with their standard errors can be found in Table 6.1. The coefficients are asymptotically normally distributed. We used a Wald test to test the null hypothesis $H_0: \beta = 0$ against the alternative $H_a: \beta \neq 0$. The results of the test can be found in Table 6.1. The null hypothesis is rejected at a 5% significance level. So, for example, when the p-value of a specific coefficient is smaller than 5% we reject the null hypothesis. In that case the specific variable has a significant effect on the recovery rate.

¹ After each quarter, the business confidence index is published in a so-called COEN report together with the most important developments and expectations regarding Dutch trade and industry. For more information on the COEN reports we refer to the website http://www.kvk.nl/brancheinformatie/conjunctuurenquetenederland/conjunctuurenquetenederland.

² See Appendix B for a clear description of the notion of episode splitting.

³ The time-dependent covariates used in the models are lagged by one quarter; for an explanation we refer to Appendix B.

The recovery rate decreases significantly with age, which is in agreement with the findings of Spierdijk et al. (2009). The older a claimant is, the longer his disability spell will be. The recovery rate of a 55-year-old claimant is only 47% of that of a 25-year-old claimant. The recovery rate of females is 18% lower than that of males. Females recover more slowly than males, which is in agreement with the results found by Spierdijk and Koning (2009). The coefficient of the replacement income is negative, but not significant. This suggests that there is no moral hazard regarding the replacement rate. This finding is in agreement with Spierdijk et al. (2009), who also found that economic incentives do not affect the recovery of self-employed. However, it contradicts the finding of Galizzi and Boden (2003), who find that higher wage replacement benefits lead to longer absence durations of employees.
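The percentages quoted above follow from exponentiating the coefficients in Table 6.1. As a small worked check, using the rounded coefficients for age and gender (the function name is made up):

```python
import math


def relative_recovery_rate(coefficient, difference):
    """Hazard ratio implied by a change of `difference` units in a covariate."""
    return math.exp(coefficient * difference)


# Age coefficient -0.025 per year, comparing a 55-year-old with a 25-year-old claimant.
age_ratio = relative_recovery_rate(-0.025, 55 - 25)

# Female dummy coefficient -0.202, comparing females with males.
female_ratio = relative_recovery_rate(-0.202, 1)
```

Here `age_ratio` is roughly 0.47 and `female_ratio` is roughly 0.82, matching the 47% and the 18% reduction mentioned in the text up to rounding of the published coefficients.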

Claimants with a cardiovascular, neurological or psychological disorder recover significantly more slowly than claimants with another type of disorder. Especially claimants with a cardiovascular or a psychological disorder recover very slowly. The recovery rates for these two disorders are respectively 32% and 31% lower compared with other disorders. On the other hand, claimants with a skin disease or with a respiratory or digestive disorder recover significantly faster than claimants with other types of disorders. For example, the hazard rate of a claimant with a skin disease is 83% higher than that of a claimant with another type of disorder. The recovery rate of claimants with a locomotive disorder is not significantly different from that of claimants with other disorders. These findings are in line with the findings of Spierdijk et al. (2009).

The occupational class agriculture used in the model is an aggregate of two sub-classes in the dataset, namely agriculture and agriculture others. Only self-employed who are working in agriculture, the livestock farming industry or the medical services sector have a significantly different hazard rate compared with self-employed working in other sectors. The hazard rates of self-employed who are working in agriculture, the livestock farming industry or the medical services sector are significantly higher than those of self-employed working in other sectors, which implies that these self-employed recover faster than others.

                             Coef     Exp(Coef)   se(Coef)   p-Value
Age                          -0.025   0.975       0.001      0.00
Female                       -0.202   0.821       0.034      0.00
Replacement income           -0.000   1.000       0.000      0.84
Cardiovascular               -0.390   0.677       0.056      0.00
Skin                          0.605   1.830       0.058      0.00
Locomotive                    0.055   1.056       0.033      0.10
Neurological                 -0.239   0.788       0.055      0.00
Psychological                -0.367   0.693       0.040      0.00
Respiratory                   0.443   1.409       0.057      0.00
Digestive                     0.414   1.512       0.048      0.00
Livestock farming             0.102   1.108       0.029      0.00
Horticulture                  0.037   1.038       0.032      0.27
Agriculture                   0.088   1.092       0.029      0.00
Business services             0.027   1.028       0.036      0.46
Industry and construction    -0.007   0.993       0.025      0.79
Medical services              0.127   1.136       0.036      0.00
GDP growth rate              -0.041   0.957       0.012      0.00
BCI,1                         0.022   1.022       0.002      0.00
BCI,2                        -0.025   0.976       0.004      0.00
Gamma frailty (variance)      0.253                          0.00

Table 6.1: Estimation results MPH model with gamma frailty

there is a large difference in the magnitude of the effect of a 1% change in the GDP growth rate on the recovery rate. Spierdijk et al. (2009) found that a 1% increase of the GDP growth rate would lead to a 27% decrease in the recovery rate, while we find that a 1% increase of the GDP growth rate would only lead to a 4% decrease of the recovery rate. A possible explanation for the negative relation is higher working pressure when the economy is booming, or the need to return to work as soon as possible during a recession.

For the business confidence index both coefficients are significantly different from zero. For the period before 2009-04-01, claimants recover more slowly if the business confidence index is low. A decrease of the business confidence index by 1% would lead to a decrease of the recovery rate by 2%. So, claimants have longer durations when the business confidence index is low. A possible explanation of this finding is a secondary-gain effect. When self-employed have high confidence in the economy they expect an increase in sales, investment, export and employment, which will probably lead to a positive expectation of the profit they can make. This positive expectation can make it attractive to go back to work as soon as possible.

It may be somewhat confusing that the sign of the business confidence index coefficient in the period before 2009-04-01 is positive. The GDP growth rate and the business confidence index both give information on the growth of the economy, so it may be surprising that we find opposite signs for the two coefficients. Notice however that the GDP growth rate and the business confidence index are two different variables and that we should not interpret them in the same way. We give three reasons why. The first reason is that the GDP growth rate measures the real growth of the whole economy, while the business confidence index is a measure of how many companies expect an increase in sales, investment, export and employment. So the business confidence index gives information on how many companies expect an increase in sales, investment, export and employment, but it gives no information on the size of these increases. The second reason is that the two variables depend on different underlying quantities. The GDP growth rate is based on consumption, government spending, employment, export and import of all consumers and companies, while the business confidence index depends on sales, investment, export and employment of companies. This implies that an increase of the GDP growth rate does not have to lead to an increase of the business confidence index. The third reason is that the business confidence index is an expectation of the growth of the economy, and an expectation can differ from reality. Companies' expectations are likely based on two aspects. The first aspect is realized values, such as the sales in the previous period. The second aspect is not based on realized values but on a feeling, i.e. psychological effects. When all colleagues of a self-employed are negative about the growth of the economy, this will probably affect his own expectation of the economy. Changes in the business confidence index can therefore be caused by psychological effects, while the GDP growth rate is not directly affected by psychological effects.⁴

In Section 2.2 we provided a possible reason why self-employed could recover faster in times of economic welfare (higher working pressure) and a possible reason why self-employed could recover more slowly in times of economic welfare (secondary-gain effect). Both reasons are plausible and both could operate simultaneously. This could explain why we find opposite signs for the business confidence index and the GDP growth rate.

4 Psychological effects can of course indirectly affect the GDP growth rate. When consumers have less confidence in the economy

Although the coefficient of the GDP growth rate is twice as large in size as the coefficient of the business confidence index (in the period until March 2009), the economic impact of a change in the business confidence index on the hazard rate is much larger than the impact of a change in the GDP growth rate. In the period before 2009-04-01 the GDP growth rate fluctuates between -1.1% and 1.6%, while the business confidence index fluctuates between 3.1% and 16.4%. The main driver of a change in the recovery rate over time (for the period before 2009-04-01) is therefore the business confidence index. Notice again that by time we mean calendar time and not the number of months that a self-employed is disabled.
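The comparison of economic impact can be illustrated with a rough sketch that moves each covariate across its observed pre-break range. The business confidence index coefficient (0.02204) and the two ranges come from the text. The GDP coefficient is not quoted in this passage, so the value below is an assumption built only from the statements that it is about twice as large in size and has the opposite sign.

```python
import math

# Pre-break coefficient of the business confidence index (from the text).
BETA_BCI_PRE = 0.02204
# ASSUMPTION: the text only says the GDP coefficient is "twice as large" in
# size with the opposite sign; the exact estimate is not quoted here.
BETA_GDP_ASSUMED = -2 * BETA_BCI_PRE


def range_effect(beta: float, low: float, high: float) -> float:
    """Hazard ratio when a covariate moves from `low` to `high`."""
    return math.exp(beta * (high - low))


# Observed pre-break ranges reported in the text (in %).
bci_effect = range_effect(BETA_BCI_PRE, 3.1, 16.4)
gdp_effect = range_effect(BETA_GDP_ASSUMED, -1.1, 1.6)

print(f"BCI across its range: x{bci_effect:.2f}")
print(f"GDP across its range: x{gdp_effect:.2f}")
```

With these inputs the index moves the recovery rate by roughly a third across its range, while the assumed GDP coefficient moves it by a little over a tenth, which is in line with the conclusion that the index is the main driver.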

After March 2009, we have a significant break in the coefficient of the business confidence index. The coefficient of the business confidence index then equals 0.02204 − 0.02482 = −0.00278. The asymptotic standard error of the coefficient of the business confidence index after the break equals

$$\sqrt{\mathrm{Var}(\hat{\beta}_{BCI,1}) + \mathrm{Var}(\hat{\beta}_{BCI,2}) + 2\,\mathrm{Cov}(\hat{\beta}_{BCI,1}, \hat{\beta}_{BCI,2})} = 0.0015.$$

Based on a Wald test, the coefficient after the break is not significantly different from zero. After March 2009, a change in the business confidence index therefore has no significant effect on the recovery rate of a claimant.
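The test can be reproduced approximately from the reported numbers. The sketch below assumes a two-sided test based on the asymptotic normal distribution and a 5% significance level, neither of which is stated explicitly in this passage. Because the standard error is reported rounded to 0.0015, the resulting p-value is only indicative.

```python
import math

# Values reported in the text.
beta_pre = 0.02204     # coefficient of the index before the break
beta_break = -0.02482  # change in the coefficient after March 2009
se_post = 0.0015       # asymptotic standard error of the sum (rounded)

# Coefficient that applies after the break.
beta_post = beta_pre + beta_break

# Wald z-statistic and two-sided p-value under the standard normal
# (equivalently, z**2 compared with a chi-square with 1 degree of freedom).
z = beta_post / se_post
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"beta_post = {beta_post:.5f}, z = {z:.2f}, p = {p_value:.3f}")
```

Under these assumptions the p-value lies above 0.05, consistent with the statement that the post-break coefficient is not significantly different from zero.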

Self-employed do not react to a change in the business confidence index in periods when it is negative. It is possible that the business confidence index is not a very good or reliable indicator of the economic climate in times of a financial and economic crisis. Because of psychological effects, companies could be too pessimistic in their expectations on sales, investment, export and employment. Another reason is that a financial and economic crisis of such a scale is very rare, so it could be difficult for a company to come up with proper estimates of the increases in sales, investment, export and employment. It is likely that a self-employed searches for certainty in times of a crisis. This may explain why the effect of a change in the business confidence index on the recovery rate of claimants is not significant in times of a financial and economic crisis. Another possible reason is that there is no secondary-gain effect. All values of the business confidence index are negative during the financial and economic crisis. This implies that during the crisis the majority of the companies expect a decrease in sales, investment, export and employment and therefore probably also a decrease in their profits.

In sum, claimants tend to recover faster when the business confidence index is positive. Claimants’ recovery rates are not significantly affected by a change in the business confidence index in periods when the business confidence index is negative. In the period before the structural break, the average observed business confidence index equaled 11.9% and in the period after the structural break the average observed business confidence index equaled -16.2%. The recovery rate after the structural break is therefore on average $100 \exp[\hat{\beta}_{BCI,1}(-16.2 - 11.9) + \hat{\beta}_{BCI,2}(-16.2 - 0)]\% = 80\%$ of the recovery rate before the structural break. We conclude that claimants recover much faster in periods when the business confidence index is positive than in periods when it is negative.
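The 80% figure follows directly from the two reported coefficients and the two period averages. A minimal sketch of the arithmetic is given below; the variable names are illustrative and all numbers are taken from this section.

```python
import math

# Coefficients reported in the text.
beta_bci_1 = 0.02204   # business confidence index, whole period
beta_bci_2 = -0.02482  # additional effect after the structural break

# Average observed business confidence index (in %).
mean_pre = 11.9
mean_post = -16.2

# Log hazard ratio of the average post-break month relative to the average
# pre-break month; the break term is zero before the break.
log_ratio = beta_bci_1 * (mean_post - mean_pre) + beta_bci_2 * (mean_post - 0.0)
ratio = math.exp(log_ratio)

print(f"post-break recovery rate = {ratio * 100:.0f}% of the pre-break rate")
```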

In Figure 6.2a we plotted the effects of the GDP growth rate on the recovery rate over time (calendar time). For each moment in time we calculated the ratio $\exp(\hat{\beta}_{GDP}(X_i - X_j))$ of individuals $i$ and $j$, where $X_i$ equals the observed GDP growth rate and where $X_j$ equals 0 for all time moments, hence the recovery rate of person $j$ will not be affected by a change in the GDP growth rate. By doing so, we can see what the effect of the GDP growth rate is on the recovery rate for different time periods (calendar time). In March 2005 the ratio equaled 0.99, which implies that the recovery rate of person $i$ is 1% lower than the recovery rate of person $j$. In December 2007 the ratio equaled 0.96, which implies that the recovery rate of person $i$ is 4% lower than that of person $j$. For person $i$ this also implies that his recovery rate is lower in December 2007 than in March 2005. So, the higher the ratio, the faster person $i$ recovers compared to our benchmark, person $j$. A similar plot can be made for the business confidence index and for the overall effect of both time-dependent covariates, see Figures 6.2a and 6.2b respectively.

Figure 6.2: Effect of the GDP growth rate and the business confidence index over time (calendar time). (a) GDP growth rate and the business confidence index; (b) GDP growth rate and the business confidence index together.
