Understanding policyholder lapse behavior : an empirical analysis of the Dutch market

Academic year: 2021

MSc Actuarial Science and Mathematical Finance

FEB, University of Amsterdam

Understanding policyholder lapse behavior

An empirical analysis of the Dutch market

Author:

Raymond Rosa, 6038689

Supervisors: Dr. T.J. Boonen, Prof. Dr. R.J.A. Laeven

Contents

1 Introduction
2 Theoretical background
  2.1 Literature Research
  2.2 Hypothesis Development
3 Data
  3.1 Data description
  3.2 Summary statistics
4 Methodology
  4.1 Generalized Linear Models
    4.1.1 Logistic Regression
  4.2 Company model
  4.3 Measures for model comparison
5 Results
  5.1 Empirical Analysis
  5.2 Model Comparison
6 Remarks and Conclusion
7 References
A Additional Tables

1 Introduction

Insurance policies provide policyholders with numerous options that can significantly influence the extent of the insurer’s liabilities (Gatzert, 2009). For example, the surrender option gives policyholders the right to surrender their policy and receive a surrender value. Another example is the paid-up option, where policyholders can decide to discontinue premium payments. In the literature the term “lapse” is interpreted in different ways. Under the new risk-based EU regulatory regime Solvency II, the term lapse covers all legal or contractual policyholder options which can significantly change the value of the future cash flows. Lapse therefore consists of options to fully or partly terminate, decrease, restrict or suspend the insurance cover, along with options which allow the full or partial establishment, renewal, increase, extension or resumption of insurance coverage (CEIOPS, 2010, p.155).

Previous studies show that numerous companies assume irrational lapse behavior (Chen et al., 2008). Life insurers characterize irrationality as the decision to lapse not being influenced by (economic) factors, for some of their policyholders. This assumption may cause significant losses for the insurer if the actual lapse rates deviate from the assumed lapse rates used in the projections. Hence a proper understanding of lapse dynamics is particularly important for insurance managers, regulators and customers. The profitability and liquidity of life insurers can be heavily influenced through acquisition cost, adverse selection¹, and cash surrender values (Kuo et al., 2003). A massive lapse event can threaten the insurer’s liquidity, force the selling of assets and diminish the effectiveness of risk pooling. Furthermore, lapse can result in a decrease of potential future profits. In particular, early lapse results in substantial losses if the insurer is unable to recover acquisition costs (Prestele, 2006).

For regulators, the quantitative impact studies of Solvency II have demonstrated that lapse risk is among the main risk drivers of risk-based capital requirements for life insurance companies. Lapse risk accounts for half of the capital requirements in the life underwriting module (CEIOPS, 2008). Accordingly, regulators should also have a thorough understanding of lapse dynamics in order to define reasonable capital requirements in the life underwriting module. Lastly, customers use lapse as one of the main indicators to assess the product and service quality of life insurance companies (Kiesenbauer, 2012). Companies with above-average lapse rates might offer more expensive products for the same coverage, or provide fewer services than competitors. Customers might use such qualitative indicators as an additional source of information when making a purchasing decision for life insurance contracts.

As a result, the regulators have implemented a rule within the Solvency II regime under which life insurance companies are required to determine the likelihood that policyholders will exercise contractual options based on an analysis of past policyholder behavior. The analysis should take into account the benefits of exercising the options for the policyholder, the influence of past economic conditions, the impact of management actions and any other relevant circumstances. Unless empirical evidence states otherwise, the likelihood of exercising contractual options shall not be assumed independent (EC, 2011).

Kim (2005) examines economic variables as well as policyholder information as determinants of lapse. Kim employs logistic regression models to identify lapse drivers and to develop a predictive lapse model using Korean data. Renshaw and Haberman (1986) are the first to analyze lapse data, of seven Scottish life insurers, using policyholder characteristics and/or account products. Recent studies include Cherchiara et al. (2009), Milhaud et al. (2010) and Kiesenbauer (2012) analyzing Italian, Spanish and German data, respectively.

This paper extends the existing literature on lapse in the Dutch life insurance industry by analyzing the portfolio of a Dutch life insurance company. The starting point for this analysis is the logistic regression model given by Kim (2005). The logistic regression model is utilized to determine the drivers of lapse and to derive a model for predicting future lapse rates. Interestingly this approach coincides with the Solvency II regulation pertaining to lapse. Consequently a comparison will be made between the logistic regression predictions and the company’s current model predictions, regarding future lapse. This serves as an assessment to evaluate whether the company’s current model predicts lapse rates prudently.

¹ Customers in poor health might be less inclined to lapse a contract including death cover as they will hardly find

The findings show that the explanatory variables that determine lapse behavior are similar across all product types; however, the direction and impact of the variables are product specific. In particular, the interest rate hypothesis and the emergency fund hypothesis hold only for Unit Linked products in the Dutch market. The assessment of the prediction accuracy, using the RMSE, suggests that the results of the Dutch market are not comparable to those of Kim (2005) and Kiesenbauer (2012); however, the prediction accuracy depends on the selected explanatory variables. Furthermore, the results of the company model are comparable to the logistic regression model. This implies that both the company model and the logistic regression model provide reasonable predictions for lapse rate developments in the near future. The predictions require assumptions regarding the future developments of the underlying explanatory variables. As a result, predicted lapse rates cannot be understood as point estimates.

The remainder of this paper is structured as follows: Section 2 depicts the existing theoretical background on life insurance lapse and derives the hypotheses. Section 3 briefly outlines the data and discusses some summary statistics. Section 4 describes the methodology of this paper. Section 5 presents and examines the findings and section 6 concludes this paper.

2 Theoretical background

This section gives an overview of the existing literature on lapse behavior in the life insurance industry. The second subsection then derives the hypotheses.

2.1 Literature Research

Lapse rate modeling has been a very active field of research in recent years. Moreover, a fair amount of empirical work has been done, especially on the issue of how environmental variables affect lapse. In the literature there are two main hypotheses that are used to describe policyholder lapse behavior. The first one is the interest rate hypothesis (IRH). According to the IRH, the market interest rate can be seen as the opportunity cost of owning a life insurance policy. As interest rates go up, premiums for new policies usually decrease. Therefore policyholders are inclined to surrender their contracts and purchase a new one, taking advantage of the higher yield. Conversely, policyholders are less inclined to surrender their policy in times of falling interest rates.

The second hypothesis, originally proposed by Linton (1932), is based on the premise that households regard their savings in life policies as a source of emergency funds which can be drawn upon in times of necessity, either by taking out policy loans or by seeking surrender. In other words, the emergency fund hypothesis (EFH) states that policyholders utilize cash surrender values as emergency funds when facing personal financial distress. It implies that during an economic downturn policyholders have greater incentive to surrender their policies, hence lapse rates increase during these circumstances.

Outreville (1990) studies the EFH with data of whole-life insurance in the U.S. and Canada. The results provide consistent evidence for the emergency fund hypothesis. Dar and Dodds (1989) test both hypotheses using endowment policies of U.K. life insurers. They find evidence in favor of the emergency fund hypothesis, but no significant relationship between surrenders and rate of return. Kuo et al. (2003) analyze lapse rates in the U.S. using the cointegration approach to address long-term lapse dynamics. They perform an impulse response analysis and find that the interest rate effect is economically more significant than the unemployment rate in explaining the lapse rate dynamics. Kiesenbauer (2012) analyzes the main determinants of lapse in the German life insurance industry. He finds that buyer confidence, current yield, and gross domestic product (GDP) development are the most relevant economic indicators, while distributional focus, participation rate spread and company age are identified as the most relevant company characteristics. He finds that neither the IRH nor the EFH holds for traditional life insurance products, whereas both hypotheses are supported when unit-linked products are considered. Knoller et al. (2013) analyze surrender behavior for variable annuity contracts using individual Japanese policy data. They conclude that the majority of the surrender behavior is explained by the moneyness of the policy. Also, the influence of moneyness on the surrender rate depends on the size of the policy. Owners of large policies tend to behave more sensitively towards the moneyness of the policy.

Eling and Kochanski (2013) review research done on lapse in life insurance and argue that there is no consistency in the findings of the empirical work. This may be due to differences in the markets, methods, time periods, product types and variable specifications employed. Milhaud et al. (2010) determine that fiscal repercussions influence lapse risk considerably. They study policies with a fiscal constraint, where surrender charges only apply for a certain part of the policy duration. As soon as the contract reaches the point when the policyholder can surrender it without penalty, the lapse risk increases significantly. They also find that other relevant risk factors include policyholder age and method of payment (i.e., regular versus single premiums).

The choice of appropriate lapse functions to model lapse rates, for example for use in the internal models under Solvency II, has been discussed in recent literature. Lapse rate models in the literature can be divided into two classes: deterministic and dynamic models. Deterministic lapsation is not scenario specific and thus can be determined with “offline” calculations. For performance reasons, internal risk models should be designed to perform as many calculations as possible outside of the scenario-specific online calculations in order to avoid redundancy. Lapse drivers that are scenario-specific are classified as dynamic. Dynamic lapsation results from an internal decision process, therefore endogenous lapse is classified as dynamic according to Bacinello (2005). Exogenous lapsation, which depends on external factors such as social, individual and economic factors, is classified as deterministic. However, some of these factors (e.g., GDP, market rates, health status) are suitable for stochastic modeling and can therefore be used for modeling dynamic lapsation. The interest rate hypothesis is categorized as rational lapsation, since rational lapse depends on factors such as market rates and in-the-moneyness or value of guarantees. Irrational lapsation is a result of social, individual and economic factors that can be associated with the emergency fund hypothesis. Most academic literature treats policyholders as financially rational and risk neutral (Eling and Kochanski, 2013). Given a financial model, policyholders are assumed to act optimally in terms of maximizing the terminal value of their investment. Assuming rationality, homogeneous insurance portfolios will either lapse entirely or not at all.

Grosen and Jorgensen (2000) develop a dynamic model and use contingent claims analysis to value the surrender option. They find that the option to surrender can be quite expensive, up to 50% of the contract’s fair value, under certain market conditions. Bacinello (2003) defines the lapse option as an American-style put option that allows the policyholder to sell back the contract to the insurer at the cash surrender value. By analyzing the value of the surrender option in Italian endowment policies, the author finds that the value of the surrender option can account for up to 10% of the premium, depending on the penalty function used to calculate surrender charges. Gatzert and Schmeiser (2008) evaluate the risk potential of the paid-up option. Additionally they study the resumption option, where the policyholder can resume premium payments once after exercising the paid-up option, and flexible payment options, where the policyholder is able to stop and resume premium payments at multiple points in time. The value of the pure paid-up option increases sharply when the guaranteed interest rate is reduced, possibly accounting for more than 10% of the present value of expected premium payments. Although these frameworks do not reflect reality, they can be viewed as a worst-case model of policyholder behavior from an insurer’s point of view.

General lapse models that include dynamic and deterministic lapse offer the possibility to model sub-optimal lapsation. Two popular modeling techniques have emerged for constructing combined lapse rates. In the first, the lapse intensity is broken down into a constant, representing the deterministic lapse, and a dynamic variable, representing dynamic lapse. The combined lapse rate is then calculated using the multiple decrement model. The other technique uses the deterministic lapse rates as base rates, which can be adjusted by dynamic factors, the so-called dynamic lapse multipliers.
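As a rough illustration of the two techniques described above, the following Python sketch combines a deterministic lapse component with a dynamic one. The functional forms are assumptions made for illustration only and are not taken from the reviewed literature: the first function assumes a constant total intensity over the year with lapse as the only decrement, and the second uses a hypothetical multiplier driven by the spread between a market rate and a credited rate, with hypothetical parameters `sensitivity`, `floor` and `cap`.

```python
import math


def rate_from_intensities(deterministic_intensity, dynamic_intensity):
    """Annual lapse rate from a constant plus a dynamic lapse intensity.

    Assumption: the total intensity is constant over the year and lapse
    is the only decrement considered.
    """
    total = deterministic_intensity + dynamic_intensity
    return 1.0 - math.exp(-total)


def dynamic_multiplier(market_rate, credited_rate,
                       sensitivity=10.0, floor=0.5, cap=3.0):
    """Hypothetical dynamic lapse multiplier.

    Grows linearly with the spread between the market rate and the
    credited rate, and is clamped to the interval [floor, cap].
    """
    raw = 1.0 + sensitivity * (market_rate - credited_rate)
    return max(floor, min(cap, raw))


def rate_from_multiplier(base_rate, market_rate, credited_rate):
    """Deterministic base rate adjusted by the dynamic multiplier."""
    adjusted = base_rate * dynamic_multiplier(market_rate, credited_rate)
    return min(1.0, adjusted)  # a rate cannot exceed 100%
```

With a base rate of 5%, a market rate of 4% and a credited rate of 2%, the multiplier in this sketch equals 1.2, so the adjusted lapse rate is 6%.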

Due to lack of statistical data and the wide variety of factors influencing policyholder behavior, most of these approaches are theoretical and not directly applicable to the real world. Kim (2005), Cherchiara et al. (2009) and Milhaud et al. (2010) first develop lapse rate models based on empirical data. Kim (2005) models lapse rates of a Korean life insurer using the logit and the complementary log-log function, respectively. Kim (2005) studies both economic indicators (e.g. interest rates, unemployment rates, economic growth rates) and policy characteristics (policy age since inception) as explanatory variables. Afterwards the results are compared to the less sophisticated arctangent model. German, Scottish, and Italian data are analyzed by Eling and Kiesenbauer (2013), Renshaw and Haberman (1986) and Cherchiara et al. (2009), respectively. These analyses employ generalized linear models to determine relevant contract features and policyholder characteristics concerning lapse behavior. Eling and Kiesenbauer (2013) analyze policyholder age, contract age and product type. The case study of Cherchiara et al. (2009) shows the importance of policy duration, calendar year, product class, and policyholder age for lapse rates. Renshaw and Haberman (1986) focus their analysis on age at entry, duration of policy, type of policy, and company.
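The logit and complementary log-log functions mentioned above are standard link functions that map a lapse probability in (0, 1) to the real line. The short Python sketch below shows both links and their inverses. It only illustrates the general definitions and does not reproduce the fitted model of any of the cited studies.

```python
import math


def logit(p):
    """Logit link: log-odds of the probability p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Inverse logit: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def cloglog(p):
    """Complementary log-log link, for 0 < p < 1."""
    return math.log(-math.log(1.0 - p))


def inv_cloglog(eta):
    """Inverse complementary log-log link."""
    return 1.0 - math.exp(-math.exp(eta))
```

The logit link is symmetric around a probability of 0.5, whereas the complementary log-log link is asymmetric.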

To summarize, an outline of the academic literature is given, covering the empirical work describing policyholder behavior and several modeling approaches. The lack of consistency in the empirical work is possibly attributable to differences in markets, methods, time periods, product types and variable specifications employed. Furthermore, most of the modeling approaches discussed are theoretical and not applicable to the real world. A feasible modeling approach is presented by Kim (2005), who uses generalized linear models to determine both relevant economic indicators and policy characteristics. The next subsection presents the hypotheses that will be investigated.

2.2 Hypothesis Development

Based on the existing literature, a number of hypotheses are tested in the statistical analyses. This study analyzes factors that have already been studied in other countries. Taking both the literature and the specifics of the Dutch life insurance market into account, we analyze the following hypotheses:

I Product type: The existing literature has corroborated variations in lapse rates when different product types are regarded (Renshaw and Haberman, 1986; Cherchiara et al., 2009; Milhaud et al., 2010). Kim (2005) and Kiesenbauer (2012) analyze determinants of lapse for different product types in the Korean and German life insurance market, respectively. They find that the main drivers of lapse are similar across all product types, however the impact and direction of these drivers differ according to product types. Country-specific differences between the products should be kept in mind when interpreting these results. The initial assumption will be that the explanatory variables affect the product types differently, as a result lapse behavior is product specific.

II Policyholder gender: Kagraoka (2005) and Eling and Kiesenbauer (2013) analyze lapse in the Japanese and German life insurance markets, respectively, and find that females have a lower propensity to lapse their policy in comparison to males. The willingness to purchase an insurance product that is not completely comprehensible might be lower for females. Kagraoka (2005) argues that housewives purchase life insurance products conditional on the level of the household income. As a consequence, the expected outcome for the policyholder’s gender is a lower lapse rate for females in comparison to males.

III Contract age: Contract age, the time since policy initiation, is regarded as an explanatory variable in most empirical studies (Renshaw and Haberman, 1986; Kagraoka, 2005; Milhaud et al., 2010). The results are remarkably consistent in that the lapse rates are highest in the initial contract years and then gradually decline. Thus the expectation is that the lapse rate decreases as the contract age increases.

IV Policyholder age: The existing literature shows that there is a decreasing lapse rate with increasing policyholder age (Renshaw and Haberman, 1986; Kagraoka, 2005; Cherchiara et al., 2009; Milhaud et al., 2010). However, the modeling approach differs. Cherchiara et al. (2009) and Eling and Kiesenbauer (2013) study the current policyholder age, which coincides with the approach applied in this study. Other studies focus instead on the underwriting age of the policyholder, that is, the age of the policyholder at policy inception. Eling and Kiesenbauer (2013) argue that the current policyholder age reflects the current policyholder status, which might be more influential for the decision to lapse. All the empirical studies use age classes aggregating up to 40 years rather than analyzing the effects for each age. Furthermore, the age classes differ across studies. Therefore the results of the analysis are not directly comparable to the existing findings. Nonetheless a reasonable expectation would be that lapse rates vary significantly with respect to the policyholder age.

V Premium payments: Milhaud et al. (2010) and Eling and Kiesenbauer (2013) find that single premium business lapses less often than regular premium business. They argue that policyholders investing a sizable amount into a single premium policy might have a better understanding of the product. Moreover, as there is no requirement for future premium payments, such a contract is less likely to be lapsed due to financial distress. An argument can be made that regular premium policyholders, who notice a reduction in their monthly disposable income due to premium payments, may have greater incentive to lapse, especially when the premium is substantial. For this reason a plausible expectation is that lapse rates will increase with premium size for regular premium business.

VI Calendar year: Calendar year effects should reflect the economic environment, that is, the interest rate hypothesis and the emergency fund hypothesis. Cherchiara et al. (2009) and Eling and Kiesenbauer (2013) study calendar year effects. Cherchiara et al. (2009) observe a decrease in lapse rates until the end of the 1990s. In the subsequent years, the lapse rates rise and reach their maximum in 2007. The results of Eling and Kiesenbauer (2013) are consistent with the empirical results of Cherchiara et al. (2009). They conclude that increasing lapse rates might be a consequence of economic crisis. As a result the initial expectation will be an increase in lapse rates from 2000 onward. Moreover, both global and local developments should be reflected in the calendar year results, for example, the financial crisis at the global level and the profiteering affair (“Woekerpolisaffaire”) at the local level.

VII Reserve: The policy reserve has not been analyzed as extensively as the other explanatory factors. Knoller et al. (2013) suggest that the policy size is negatively correlated with the probability of surrender; however, this has not been verified. It is not clear what effect the reserve might have on lapse since different effects are feasible. For example, contracts with a considerable reserve might lapse less often due to the fiscal repercussions associated with lapsing². Conversely, these contracts might be more sensitive to adverse economic conditions, as the policyholder’s confidence in the financial market diminishes.

VIII Macro-economic indicators: Kuo et al. (2003) find that the unemployment rate influences the lapse rate in the short term as well as in the long term, whereas the interest rate is only marginally significant in the short term, yet has statistically significant power in explaining the long-term behavior of lapse rates. Kim (2005) concludes that policy lapses are not only dependent on interest rates but also on other exogenous factors. It is clear that exogenous factors do affect policyholder behavior; as a result an extended set of macro-economic indicators will be studied. Besides unemployment rates, these include gross domestic product (GDP) and the consumer price index (CPI). This study also regards factors that indicate the investment climate, such as the AEX index and Euro Stoxx. Furthermore, to examine how policyholders react to prospective rates, this paper includes Dutch zero yields with different maturities.

In addition to statistically analyzing the Dutch insurer’s portfolio, an assessment of the company’s current model will be made. Kim (2005) and Kiesenbauer (2012) employ logistic regression to predict lapse rates, and both conclude that the prediction accuracy of the model is moderate. Kiesenbauer (2012) finds that the predicted lapse rates are reasonable for endowment, annuity, and term life, but limited for group and other business. Consequently a comparison will be made between the predictions of the logistic regression model and the company’s current model. If the logistic regression model significantly outperforms the company’s current model, then the company has to reevaluate its model. On the other hand, if there is no significant difference in the predictions or if the company’s current model outperforms the logistic regression model, then the company approaches lapse prudently.

² For Annuity products, if the surrender value exceeds a threshold, the policyholder has to pay income tax of 52% plus

To recap the main points of this subsection, the hypotheses have been derived based on the existing literature and the specifics of the Dutch life insurance market. Insight was given into the potential relationship between the factors and the lapse rates. Additionally a comparison will be made between the logistic regression model and the company’s current model, to assess whether the company approaches lapse prudently. In the next section a description of the data follows.

3 Data

This section describes the data that is used to examine the hypotheses in Subsection 2.2. The section begins with a description of the data, afterwards we provide some summary statistics and preliminary findings.

3.1 Data description

This study analyzes data provided by a Dutch life insurer, covering the time period 1996 to 2012. The data consists of 5 product types: Annuity (“lijfrente”), Unit-Linked, Term life (“risico”), Savings (“spaar”) and Mortgage-Linked Savings (“spaarhypotheek”). These products are common in the Dutch insurance market. The insurance company has authorized access to the individuals’ raw data; as a result the yearly policyholder characteristics are included in the analysis. These characteristics are: the policyholder’s age, gender, the contract age, premium payments and the reserve. Besides these characteristics, calendar year and macroeconomic indicators are also included. These indicators act as a proxy for the economic situation during the time period. The macroeconomic variables are defined as follows:

GDP_t = yearly change (in percent) of gross domestic product in year t,
UR_t = yearly average unemployment rate (in percent) in year t,
CPI_t = yearly consumer price index (CPI) with reference year 2006 in year t,
AEX_t = (primo) yearly stock price (in euros) of the AEX-index in year t,
EUR_t = (primo) yearly stock price (in euros) of the Euro Stoxx 50 in year t,
B1_t = (primo) yearly zero yield (in percent) of Dutch government bonds with one year maturity in year t,
B3_t = (primo) yearly zero yield (in percent) of Dutch government bonds with three year maturity in year t,
B5_t = (primo) yearly zero yield (in percent) of Dutch government bonds with five year maturity in year t,
B10_t = (primo) yearly zero yield (in percent) of Dutch government bonds with ten year maturity in year t.

The choice of macroeconomic variables is based on the existing literature and assumptions of policyholder behavior in the Dutch life insurance market (see Subsection 2.2). They are displayed in Table A.1, in Appendix A, from 1996 to 2012.

During this time period the company underwent conversions to other administration systems. During the conversion process there has been loss of data, in particular policies that are no longer relevant for the administration, such as expired and lapsed policies. As a result the data has some inconsistencies, especially the earlier data in this time period. These inconsistencies tend to give a diluted portrayal of the past, therefore not being as reliable. Ultimately this influences the estimates of the models. Accordingly to minimize these inconsistencies the data is cleansed by removing incomplete observations. This entails removing policies without gender specification, as well as policies that have other inconsistent features such as negative duration and unlikely age specifications.
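The cleansing step described above can be sketched in a few lines of Python. The record layout (dictionaries with the keys `gender`, `duration` and `age`) and the age bounds used to flag unlikely ages are hypothetical, since the exact rules applied to the company data are not stated.

```python
def is_consistent(policy, min_age=0, max_age=110):
    """Return True if a policy record passes the basic consistency checks.

    Assumed record layout: a dict with the keys 'gender', 'duration'
    (contract age in years) and 'age' (policyholder age in years).
    The bounds min_age and max_age are illustrative assumptions.
    """
    if policy.get("gender") not in ("M", "F"):
        return False  # no gender specification
    duration = policy.get("duration")
    if duration is None or duration < 0:
        return False  # missing or negative duration
    age = policy.get("age")
    if age is None or not (min_age <= age <= max_age):
        return False  # missing or unlikely age
    return True


def cleanse(policies):
    """Keep only the records that pass all consistency checks."""
    return [p for p in policies if is_consistent(p)]
```

A record such as `{"gender": None, "duration": 3, "age": 45}` would be dropped by this sketch, while `{"gender": "F", "duration": 3, "age": 45}` would be kept.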

Afterwards the data needs to be prepared for the analysis. The product types are studied separately, on account of the data provided by the company and the fact that the empirical literature confirms that product types affect lapse rates differently (see Subsection 2.2). Due to the size of data sets (smallest data set has over a million entries) and for model comparison purposes, all policyholder characteristics will be categorized. For example, age is categorized into several levels to capture life cycle effects instead of a single (linear) age effect. A single age effect might be misleading when the real age effect is not monotone (see Subsection 2.2). The same arguments can be made for the other policyholder characteristics contract age, reserve and premium payment. These characteristics are categorized in such a manner that the relative dependence between factor and lapse rates remains unchanged, for each product type.

The individual contracts are split up into all possible combinations of the different products and policyholder characteristics. For example such a combination would consist of annuity (product type), 2010 (calendar year), 10-15 (level contract age), 40-60 (level policyholder age), male (policyholder gender), and single (level premium payment). After splitting the individual data, into categorized combinations of all policyholder characteristics, grouping takes place where homogeneous groups are created from individual data with the same combinations. These homogeneous groups are then used in the analysis.
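The categorization and grouping described above can be sketched as follows. The level boundaries, the field names and the half-open interval convention are assumptions for illustration; the levels actually used in this study are listed in Table A.2.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical level boundaries, chosen to reproduce the example levels
# "10-15" (contract age) and "40-60" (policyholder age).
AGE_EDGES = [0, 20, 40, 60, 80]
CONTRACT_AGE_EDGES = [0, 5, 10, 15, 20]


def level(value, edges):
    """Map a value to its level label, using half-open intervals [lo, hi)."""
    i = bisect_right(edges, value) - 1
    if i < 0:
        raise ValueError(f"value {value} lies below the first edge")
    if i == len(edges) - 1:
        return f"{edges[i]}+"  # open-ended last level
    return f"{edges[i]}-{edges[i + 1]}"


def group_policies(policies):
    """Aggregate individual policies into homogeneous groups.

    Each group key is one combination of product type, calendar year,
    contract age level, policyholder age level, gender and premium type.
    """
    groups = defaultdict(lambda: {"exposure": 0, "lapsed": 0})
    for p in policies:
        key = (
            p["product"],
            p["year"],
            level(p["contract_age"], CONTRACT_AGE_EDGES),
            level(p["age"], AGE_EDGES),
            p["gender"],
            p["premium"],
        )
        groups[key]["exposure"] += 1
        groups[key]["lapsed"] += int(p["lapsed"])
    return dict(groups)
```

Each resulting group carries the number of policies and the number of lapses, which is the information needed to compute a lapse rate per group.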

3.2 Summary statistics

Before proceeding to the next section, a summary of the data will be given along with some preliminary findings. The lapse rates are measured as the number of lapsed contracts divided by the total number of policies in the appropriate homogeneous group. The exposure is measured at the beginning of the year. This choice does not correspond with the exposure in the company’s current model (see Subsection 4.2); however, for model comparison purposes the preferred exposure is the beginning of the year. It follows that the lapse rates are calculated as

\[
\text{lapse rate} = \frac{\#\text{ of lapsed policies}}{\#\text{ of policies at the beginning of the year}}, \tag{1}
\]

where # denotes the number of policies. The choice is rather intuitive: based on the policies in force at the beginning of a year, the expected number of lapsed contracts is estimated, therefore taking a prospective view on lapse. Additional arguments for the exposure choice will be given in the next section. The levels of the policyholder characteristics for each product type are displayed in Table A.2, in Appendix A. The exposure varies depending on the level of the characteristic; in particular, the exposure in the extremities is limited. Limited exposure undermines the significance of the coefficients in the statistical analysis, and may therefore give misleading results. In order to have a common reference level for the analysis, the level with the largest exposure is chosen for each characteristic.
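Equation (1) and the choice of reference level can be illustrated with a short Python sketch. The function names and the example exposures are hypothetical and serve only to show the calculation.

```python
def lapse_rate(n_lapsed, n_start):
    """Lapse rate as in equation (1): lapsed policies over policies
    in force at the beginning of the year."""
    if n_start <= 0:
        raise ValueError("exposure at the beginning of the year must be positive")
    return n_lapsed / n_start


def reference_level(exposure_by_level):
    """Pick the level with the largest exposure as the reference level."""
    return max(exposure_by_level, key=exposure_by_level.get)
```

For example, 12 lapses out of 1,000 policies in force at the beginning of the year give a lapse rate of 1.2%, and for the hypothetical exposures `{"0-40": 300, "40-60": 900, "60+": 150}` the reference level would be `"40-60"`.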

Figure 1 displays the relationship between some categorized homogeneous groups (classes) and the lapse rates. The rows represent the product types Unit Linked and Annuity, respectively, while the columns represent the policyholder characteristics age and contract age. A first look at the figure shows that these characteristics clearly affect Unit Linked products, while Annuity products seem unaffected. Annuity products also exhibit remarkably low lapse rates: further analysis reveals that the percentage of policyholders that lapsed in the Annuity data set is approximately 0.12%. These low lapse rates, along with the lack of dependence between lapse rates and policyholder characteristics, are attributed to the fact that Annuity products are usually associated with single payments (in the data set about 98% of policyholders). Moreover, annuities are fiscally attractive, since policyholders do not pay tax at acquisition of the insurance policy. The taxes are deducted when the policy reaches the distribution phase, in which the insurance company makes income payments to the policyholder. By exercising the surrender option (i.e. lapsing) the policyholder is penalized, because this enforces a direct payment of the taxes. These repercussions involve a significant amount of money. As a result, policyholders seldom lapse their annuity insurance. Aside from giving insight into the low lapse rates for Annuity products, these arguments also support the hypotheses that single premium business lapses less often than regular premium business and that lapse behavior differs between product types (see Subsection 2.2). They also reinforce the finding of Milhaud et al. (2010) that fiscal repercussions affect lapse considerably.


Another observation concerning Figure 1 is that the lapse rates vary considerably among the levels of each policyholder characteristic. There is no apparent linear relationship between the policyholder characteristics and the lapse rates (see footnote 3). These preliminary findings coincide with the hypotheses in Subsection 2.2; however, these statements still have to be tested statistically in subsequent sections. To summarize, this section describes the data used for the analysis. Before the data were analyzed, some data cleansing and ordering took place. After these data manipulations, the subsection on summary statistics discusses the relationship between some policyholder characteristics and the lapse rates and provides some preliminary findings. The next section presents the methodology of this paper.

3 The relationship between the policyholder characteristics and the lapse rates for the other product types (Term Life, Savings and Mortgage-Linked Savings) resembles that of the Unit Linked product type, in that the policyholder characteristics affect the lapse rates nonlinearly. These product types are not shown because they add nothing in this context. Detailed results are available upon request.


Figure 1: Dependence between policyholders' characteristics and lapse rates per product type; average rate, median rate. Panels: (a) Unit Linked: Lapse vs Age; (b) Unit Linked: Lapse vs Contract Age; (c) Annuity: Lapse vs Age; (d) Annuity: Lapse vs Contract Age.

Boxplots show the overall patterns of response of a group. They provide a useful way to visualize the range and other characteristics of responses for a large group. A boxplot consists of a lower quartile, median, upper quartile, lower whisker and upper whisker. The "box" represents the middle 50% of the scores of the group. The box ranges from the lower quartile (25% of scores) to the upper quartile (75% of scores) and is referred to as the inter-quartile range. The median marks the mid-point of the data (50% of scores) and is shown by the line that divides the box into two parts. The upper and lower whiskers represent scores outside the middle 50%. They represent the maximum and minimum value, respectively, excluding outliers.
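The quantities described in the note above can be computed as in the following minimal sketch. It assumes the common 1.5 × IQR rule for deciding which points count as outliers; the paper does not state which rule its figures use, and the lapse rates in the example are hypothetical.

```python
import statistics

def boxplot_summary(values):
    """Return the numbers a boxplot draws, assuming the 1.5 * IQR outlier rule."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # inter-quartile range: the height of the box
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "lower_whisker": min(inside),  # smallest value that is not an outlier
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_whisker": max(inside),  # largest value that is not an outlier
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Hypothetical lapse rates (in percent) for one class; not taken from the data set.
rates = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 30.0]
summary = boxplot_summary(rates)
```

In this example the value 30.0 lies above the upper fence, so the upper whisker stops at 8.0 and 30.0 is reported as an outlier.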


4 Methodology

The purpose of this paper is to investigate the influence of economic and policyholder characteristics on lapse rates within the Dutch life insurance market. Lapsing an insurance contract is a binary event: a contract is either lapsed or retained (and perhaps lapsed in a later time period). Kim (2005) discusses two possible functions to model lapse in this regard, the logit function and the complementary log-log function, and compares the corresponding models with the arctangent model. The findings reveal that the differences between the logit and complementary log-log functions are minimal, whereas both models perform significantly better than the arctangent model. These results are not surprising: the arctangent model expresses the lapse rate as a function of the interest rate only, i.e., one explanatory variable and three additional model parameters, whereas the logit and complementary log-log models take several explanatory variables into account.

Based on the models proposed by Kim (2005), this paper focuses on the logit function and the corresponding logistic regression. Since the aim of this paper is to identify determinants of lapse behavior in the Dutch life insurance market, it is not necessary to also use the complementary log-log function, because it yields results similar to those of the logit function (Kiesenbauer, 2012).

In the following subsections, Generalized Linear Models (GLMs), and in particular the logistic regression model, are described along with the company's current model. Afterwards, some measures to assess goodness of fit and to compare the models are presented.

4.1 Generalized Linear Models

Generalized linear models were first introduced by Nelder and Baker (1972) and further developed by McCullagh and Nelder (1989) as a generalization of the (classical) linear model. A suitable starting point is a brief description of the linear model.

Suppose there are n observations $\{y_i, x_i\}_{i=1}^{n}$, where each observation consists of the response $y_i$ and a vector of p predictors $x_i$. In a linear regression model the response variable $y_i$ is a linear function of the regressors:

$$ y_i = x_i'\beta + \varepsilon_i, \tag{2} $$

where the vector of observations $y = (y_1, \ldots, y_n)'$ is assumed to be a realization of a random variable Y whose components are independently distributed with means µ. The systematic part of the model is a specification for the vector µ in terms of a small number of unknown parameters $\beta_1, \ldots, \beta_p$. The random component assumes independent errors with constant variance. A further specialization of the linear model involves the stronger assumption that the errors follow a normal distribution with constant variance $\sigma^2$.

The components of Y are then independent normal variables with constant variance $\sigma^2$ and

$$ E(Y) = \mu \quad \text{where} \quad \mu = X\beta, \tag{3} $$

where µ is the n × 1 vector of means, X is the n × p matrix of observed predictors and β is the p × 1 vector of unknown parameters.

Generalized linear models are an extension of (ordinary) linear models. To simplify the transition from linear models to generalized linear models, (3) is slightly rearranged to produce the following three-part specification:

1. The random component: the components of Y have identical independent normal distributions with E(Y ) = µ and a constant variance σ2.

2. The systematic component: refers to the dependence of the response variable y on the vector of explanatory variables $x_j$, j = 1, . . . , p. The dependence of y on the covariates $x_1, x_2, \ldots, x_p$ is linear in the parameters; this relationship is expressed by the linear predictor η given by

$$ \eta = \sum_{j=1}^{p} x_j \beta_j. \tag{4} $$


3. The link describes how the mean, E(Y ) = µ, of the random variable Y depends on the systematic components (linear predictor), for the ordinary linear model this link is given by:

µ = η. (5)

This generalization introduces a new symbol η for the linear predictor and the third component then specifies that µ and η are in fact identical. If

$$ \eta_i = g(\mu_i), \tag{6} $$

then g(·) is called the link function. Generalized linear models allow two extensions. First, the distribution in the first component may come from an exponential family other than the normal distribution. Second, the link function in the third component may be any monotonic differentiable function.

Assume that each component of Y has a distribution in the exponential family, taking the form

$$ f_Y(y; \theta, \phi) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right\}, \tag{7} $$

for some specific functions a(·), b(·) and c(·). The parameter θ, known as the location parameter, is related to the mean µ of the distribution. The parameter φ, known as the scale parameter, is related to the variance $\sigma^2$ of the distribution. If φ is known, this is an exponential family with canonical parameter θ. Write $l(\theta, \phi; y) = \ln f_Y(y; \theta, \phi)$ for the log-likelihood function, considered as a function of θ and φ with y given. The mean and the variance of Y can be easily derived from the known relations (see Kendall and Stuart, 1967)

$$ E\left(\frac{\partial l}{\partial \theta}\right) = 0, \tag{8} $$

and

$$ E\left(\frac{\partial^2 l}{\partial \theta^2}\right) = -E\left[\left(\frac{\partial l}{\partial \theta}\right)^2\right]. \tag{9} $$

From (7) we have

$$ \frac{\partial l}{\partial \theta} = \frac{y - b'(\theta)}{a(\phi)} \quad \text{and} \quad \frac{\partial^2 l}{\partial \theta^2} = -\frac{b''(\theta)}{a(\phi)}. $$

Equation (8) then implies that $E\{y - b'(\theta)\}/a(\phi) = 0$, which leads to

$$ \mu = E(y) = b'(\theta). \tag{10} $$

From equation (9) we obtain $b''(\theta)/a(\phi) = \operatorname{var}(y)/a(\phi)^2$, hence

$$ \operatorname{var}(y) = b''(\theta)\,a(\phi). \tag{11} $$

The reader is referred to McCullagh and Nelder (1989) and Nelder and Wedderburn (1972) for a detailed discussion. Thus the variance of Y is the product of two functions: the first, $b''(\theta)$, depends on the canonical parameter only and is called the variance function, while the other is independent of θ and depends only on φ.
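As an illustration that is not part of the original derivation, relations (10) and (11) can be checked for a Bernoulli response with success probability π, the case that underlies logistic regression. The identification of a(·), b(·) and c(·) below is the standard textbook parametrization and is stated here as an assumption rather than taken from the paper.

```latex
\begin{align*}
f_Y(y;\pi) &= \pi^{y}(1-\pi)^{1-y}
  = \exp\left\{ y \ln\frac{\pi}{1-\pi} + \ln(1-\pi) \right\}, \qquad y \in \{0,1\},\\
\theta &= \ln\frac{\pi}{1-\pi}, \qquad b(\theta) = \ln\left(1+e^{\theta}\right), \qquad a(\phi) = 1, \qquad c(y,\phi) = 0,\\
\mu &= b'(\theta) = \frac{e^{\theta}}{1+e^{\theta}} = \pi,\\
\operatorname{var}(y) &= b''(\theta)\,a(\phi) = \frac{e^{\theta}}{\left(1+e^{\theta}\right)^{2}} = \pi(1-\pi).
\end{align*}
```

The second line uses $\ln(1-\pi) = -\ln(1+e^{\theta})$, so the density has exactly the form (7), and the resulting mean and variance are the familiar Bernoulli ones.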

The link function relates the linear predictor η to the expected value µ of y. Each distribution has a special link function for which there exists a sufficient statistic equal in dimension to β in the linear predictor $\eta = \sum_{j=1}^{p} x_j \beta_j$. These canonical link functions, as they are called, express θ in terms of µ, i.e., θ = z(µ) for some function z(·). For most distributions, the mean µ is one of the parameters in the standard form of the distribution's density function. The function z(µ) turns the distribution's density function into its canonical form (it expresses the density function in terms of θ instead of µ, see Equation (7)). When using the canonical link function it follows that

θ = z(µ) = Xβ = η, (12)

where θ is the canonical parameter as defined in (7). In other words, the canonical link is the link for which θ, the parameter of the distribution of the random component, and η, the linear predictor of the model, coincide. The canonical links for common distributions in the exponential family are given in Table 1.

Table 1: Common distributions with canonical link function

Distribution        Link
Normal              η = µ
Poisson             η = ln{µ}
Binomial            η = ln{π/(1 − π)}
Gamma               η = µ^(−1)
Inverse Gaussian    η = µ^(−2)
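The links in Table 1 can be written down directly, together with their inverses (which map a linear predictor back to a mean). The sketch below is only an illustration; the dictionary name and the test value are hypothetical.

```python
import math

# Canonical link g(mu) and its inverse for each distribution in Table 1.
CANONICAL_LINKS = {
    "Normal": (lambda mu: mu, lambda eta: eta),
    "Poisson": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "Binomial": (
        lambda pi: math.log(pi / (1.0 - pi)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
    "Gamma": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "Inverse Gaussian": (lambda mu: mu ** -2, lambda eta: eta ** -0.5),
}

# 0.25 is a valid mean for every distribution listed (a probability for the Binomial).
mu = 0.25
round_trip = {
    name: inverse(link(mu)) for name, (link, inverse) in CANONICAL_LINKS.items()
}
```

Applying a link and then its inverse returns the original mean, which is the property that makes each link usable for moving between the scale of µ and the scale of η.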

Although the canonical links lead to desirable statistical properties of the model, specifically in small samples, there is in general no a priori reason why the systematic effects in a model should be additive on the scale given by that link.

4.1.1 Logistic Regression

The distinguishing feature of a logistic regression is that the outcome variable is binary. This difference is reflected both in the choice of a parametric model and in the assumptions. Once this difference is accounted for, the methods employed in an analysis using logistic regression follow the same general principles as those used in linear regression. Using the notation of Hosmer et al. (2013), the specific form of the multiple logistic regression model is:

$$ E(Y \mid x) = \pi(x; \beta) = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p}}. \tag{13} $$

The canonical link of π(x; β) is the logit transformation. The relevance of this transformation is that g(x), the link function, has many of the desirable properties of a linear regression model. As a result of the logit transformation, g(x) is linear in its parameters, may be continuous, and may range from −∞ to +∞ depending on the range of x. This transformation is defined as:

$$ g(x) = \ln\left[\frac{\pi(x; \beta)}{1 - \pi(x; \beta)}\right] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p. \tag{14} $$
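A small numerical check can make the relationship between equations (13) and (14) concrete. The sketch below uses hypothetical coefficients and covariate values; the function names are illustrative and do not come from the paper.

```python
import math

def lapse_probability(beta, x):
    """Equation (13): beta[0] is the intercept, beta[1:] pairs with the covariates x."""
    eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-eta))  # algebraically equal to e^eta / (1 + e^eta)

def logit(p):
    """Equation (14): the log-odds of a probability p."""
    return math.log(p / (1.0 - p))

beta = [-2.0, 0.5, -0.3]  # hypothetical coefficients (intercept first)
x = [1.0, 2.0]            # hypothetical covariate values
p = lapse_probability(beta, x)
g = logit(p)              # recovers the linear predictor -2.0 + 0.5 * 1.0 - 0.3 * 2.0
```

The probability p always lies strictly between 0 and 1, while its logit g is unbounded and linear in the coefficients.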

The errors of a logistic regression model, expressed as y = π(x; β) + ε, are not normally distributed. Here the quantity ε can take one of two possible values. If y = 1 then ε = 1 − π(x; β) with probability π(x; β), and if y = 0 then ε = −π(x; β) with probability 1 − π(x; β). Thus, ε has a distribution with mean zero and variance equal to π(x; β)[1 − π(x; β)]. That is, the (conditional) distribution of the outcome variable y is binomial with probability given by the conditional mean, π(x; β).

The general method of estimation that leads to the least squares function under the classical linear model is maximum likelihood. This method provides the foundation for the approach to estimating the logistic regression model. The likelihood function expresses the probability of the observed data as a function of the unknown parameters β. The maximum likelihood estimators $\hat{\beta}$ of these parameters maximize the value of this function. Assuming independent observations, the likelihood function is:

$$ L(x_1, \ldots, x_n; y_1, \ldots, y_n; \beta) = \prod_{i=1}^{n} \left\{ \pi(x_i; \beta)^{y_i} \cdot [1 - \pi(x_i; \beta)]^{1 - y_i} \right\}. \tag{15} $$


The principle of maximum likelihood states that the estimates of the vector β are the values that maximize the expression in equation (15). However, it is computationally easier to work with the log of equation (15). The log-likelihood is defined as

$$ l(x; y; \beta) = \ln[L(x_1, \ldots, x_n; y_1, \ldots, y_n; \beta)] = \sum_{i=1}^{n} \left\{ y_i \cdot \ln[\pi(x_i; \beta)] + (1 - y_i) \cdot \ln[1 - \pi(x_i; \beta)] \right\}. \tag{16} $$

To find the value of β that maximizes l(x; y; β), differentiate l(x; y; β) with respect to the vector $\beta' = (\beta_0, \beta_1, \ldots, \beta_p)$ and set the resulting expressions equal to zero. These conditions, known as the likelihood equations, are:

$$ \sum_{i=1}^{n} [y_i - \pi(x_i; \beta)] = 0, \tag{17} $$

and

$$ \sum_{i=1}^{n} x_{ij} [y_i - \pi(x_i; \beta)] = 0, \tag{18} $$

for j = 1, 2, . . . , p. For a logistic regression the expressions in equations (17) and (18) are nonlinear in β and therefore require special methods to solve. These methods are iterative in nature; McCullagh and Nelder (1989) show that the maximum likelihood estimates of the parameters β can be obtained by iterative weighted least squares. The reader is referred to the text by McCullagh and Nelder (1989) for a general discussion.
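To make the iterative solution concrete, the following minimal sketch solves the likelihood equations (17) and (18) with Newton-Raphson steps for a model with an intercept and a single covariate. For the logit link this is equivalent to iterative weighted least squares, but it is written here as plain Newton-Raphson and is not the software used in the paper. The data are hypothetical: x = 1 could stand for a regular premium policy and x = 0 for a single premium policy.

```python
import math

def fit_logistic(xs, ys, iterations=25):
    """Newton-Raphson for a logistic regression with intercept b0 and slope b1."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        # Score vector: the left-hand sides of equations (17) and (18).
        s0 = sum(y - p for y, p in zip(ys, ps))
        s1 = sum(x * (y - p) for x, y, p in zip(xs, ys, ps))
        # Information matrix X'WX with weights w = p(1 - p).
        ws = [p * (1.0 - p) for p in ps]
        i00 = sum(ws)
        i01 = sum(w * x for w, x in zip(ws, xs))
        i11 = sum(w * x * x for w, x in zip(ws, xs))
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + inverse(information) * score (2x2 inverse written out).
        b0 += (i11 * s0 - i01 * s1) / det
        b1 += (i00 * s1 - i01 * s0) / det
    return b0, b1

# Hypothetical data: 1 of 4 policies lapses when x = 0, and 3 of 4 lapse when x = 1.
xs = [0, 0, 0, 0, 1, 1, 1, 1]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(xs, ys)
```

Because this example has one binary covariate, the fitted probabilities must reproduce the observed group rates, so the estimates can be checked by hand: b0 equals the logit of 0.25 and b0 + b1 equals the logit of 0.75.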

The interpretation of any fitted model requires the ability to draw practical inferences from the estimated coefficients in the model. For the ordinary linear regression model, recall that the slope coefficient $\beta_j$ is equal to the difference between the value of the dependent variable at x + 1 and the value of the dependent variable at x, for any value of x. For example, if $y(x) = \beta_0 + \beta_1 x$, it follows that $\beta_1 = y(x + 1) - y(x)$. In the logistic regression model, the slope coefficient represents the change in the logit, see equation (14), corresponding to a change of one unit in the independent variable (i.e., $\beta_1 = g(x + 1) - g(x)$). Proper interpretation of a coefficient in a logistic regression depends on being able to place meaning on the difference between two logits.

To interpret the estimated coefficients, a measure of association termed the odds ratio is introduced. The odds ratio is a widely used measure of association, as it approximates how much more likely (or unlikely) it is for the outcome to be present in a given categorical group than in the reference group.
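The conversion from a log-odds coefficient to an odds ratio is a single exponentiation. The sketch below applies it to three coefficients reported for Unit Linked products in Table 2; the function and variable names are illustrative.

```python
import math

def odds_ratio(beta):
    """Odds of lapse for a group relative to the reference level, given its coefficient."""
    return math.exp(beta)

# Log-odds coefficients taken from Table 2 (Unit Linked products).
coefficients = {
    "Female vs Male": -0.090,
    "Single vs Small premium": -0.761,
    "Medium vs Small premium": -0.207,
}

for label, beta in coefficients.items():
    ratio = odds_ratio(beta)
    change = (ratio - 1.0) * 100.0  # percentage change in the odds of lapsing
    print(f"{label}: odds ratio {ratio:.2f}, change in odds {change:+.0f}%")
```

An odds ratio below 1 means lower odds of lapsing than the reference level, and the percentage change is the odds ratio minus one.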

4.2 Company model

The Dutch life insurance company characterizes policyholder behavior as the alteration or withdrawal of policies apart from the conditions directly described in the policy clause. The company distinguishes between the number of policies and the policy reserves as the measurement basis of the lapse rates. The choice of the measurement basis depends on which cash flow dominates in the projections, as determined by the latest lapse sensitivity run of the Market Consistent Embedded Value (MCEV). If the absolute value of the change in the present value of the cost cash flows is greater than the absolute value of the change in the present value of the benefit cash flows, then the measurement basis is the number of policies; otherwise it is the policy reserves. The insurer also argues that policyholder behavior depends greatly on product type; as a result, the measurement basis has to be determined for each product type.

To estimate the lapse rates the insurer uses the following:

$$ \text{lapse rate} = \frac{\#\text{ of lapsed policies}}{0.5 \times (\#\text{ of policies primo} + \#\text{ of policies ultimo} + \#\text{ of lapsed policies})}, \tag{19} $$

where # denotes the number of policies. The lapse rate in (19) uses the number of policies as the measurement basis. The lapse rate estimate with the policy reserve as the measurement basis is:

lapse rate = reserve of lapsed policies


An important fact is that in the event of early lapse, i.e. the policy is lapsed within the first contract year, the numerator accounts for a factor 0.5 in equations (19) and (20). The exposure to the lapse risk coincides with the assumption the insurer has that lapse occurs halfway through the year.
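The count-based lapse rate of equation (19) can be sketched as follows. The function name and the portfolio figures are hypothetical and only illustrate how the exposure in the denominator is built.

```python
def company_lapse_rate(lapsed, policies_primo, policies_ultimo):
    """Equation (19): lapsed policies divided by the interpolated exposure."""
    exposure = 0.5 * (policies_primo + policies_ultimo + lapsed)
    return lapsed / exposure

# Hypothetical year: 10,000 policies at the start, 400 lapses,
# no new business and no other exits, hence 9,600 policies at year end.
rate = company_lapse_rate(lapsed=400, policies_primo=10_000, policies_ultimo=9_600)
```

In this closed example the exposure equals the opening number of policies, so the rate is 400 / 10,000. The two quantities differ only when policies enter or leave the portfolio for reasons other than lapse.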

For the best estimate prediction, the insurer uses a weighted average. To calibrate the best estimate the insurer uses a five-year period. The insurer argues that this time period takes the trend sensitivity of lapse into account, while limiting the width of the confidence interval compared with a three- or four-year period. Incidents are not included in the determination of the lapse rates. An incident is defined as an external event that strongly influences policyholder behavior and has a small probability of recurrence.

In order to compare the company's current model with the logistic regression model, we use the company's numerical approach to the lapse rate (i.e., based on the number of policies); the other approach, based on the policy reserve, is outside the scope of this paper. The logistic regression model uses policy counts to estimate the coefficients, so for a proper comparison of both models the company's approach has to be numerical as well.

For confidentiality purposes the derivation of these statistics is not included in this paper. The numerical approach of the insurer yields the following best estimate:

$$ \hat{\mu} = \frac{\sum_{t=1}^{T} VN_t \cdot n_t}{\sum_{t=1}^{T} (n_t)^2}, \tag{21} $$

where $VN_t$ is the number of lapsed policies in year t, $n_t$ is the exposure (i.e., the total number of policies) in year t and T is the number of historical years. The variance of the best estimate $\hat{\mu}$ is

$$ \operatorname{Var}(\hat{\mu}) = \mu(1 - \mu)\, \frac{\sum_{t=1}^{T} (n_t)^3}{\left( \sum_{t=1}^{T} (n_t)^2 \right)^2}. \tag{22} $$

Consequently the confidence interval of the best estimate is given as:

$$ \left[ \hat{\mu} + t^{\beta/2}_{T-1} \sqrt{\operatorname{Var}[\hat{\mu}]},\; \hat{\mu} + t^{1-\beta/2}_{T-1} \sqrt{\operatorname{Var}[\hat{\mu}]} \right], \tag{23} $$

where $t^{\beta/2}_{T-1}$ and $t^{1-\beta/2}_{T-1}$ are the critical values of the Student-t distribution with T − 1 degrees of freedom at levels β/2 and 1 − β/2, respectively. In this particular case, the company uses T = 5 to determine the best estimates.
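Equations (21) to (23) can be evaluated with a few lines of code. The sketch below rests on two assumptions that the paper does not state: the unknown µ in (22) is replaced by its estimate, and the Student-t critical values are passed in by the caller (here the tabulated 2.5% and 97.5% values for 4 degrees of freedom). The five-year history is hypothetical.

```python
import math

def best_estimate(lapsed, exposure):
    """Equation (21): exposure-weighted lapse rate over T historical years."""
    return sum(v * n for v, n in zip(lapsed, exposure)) / sum(n ** 2 for n in exposure)

def variance(mu, exposure):
    """Equation (22), evaluated at a given lapse rate mu."""
    return mu * (1.0 - mu) * sum(n ** 3 for n in exposure) / sum(n ** 2 for n in exposure) ** 2

def confidence_interval(lapsed, exposure, t_lower, t_upper):
    """Equation (23); t_lower and t_upper are Student-t critical values with T - 1 df."""
    mu_hat = best_estimate(lapsed, exposure)
    sd = math.sqrt(variance(mu_hat, exposure))  # assumption: mu replaced by its estimate
    return mu_hat + t_lower * sd, mu_hat + t_upper * sd

# Hypothetical history with T = 5 years.
exposure = [10_000, 10_000, 10_000, 10_000, 10_000]
lapsed = [300, 350, 400, 450, 500]

mu_hat = best_estimate(lapsed, exposure)
# -2.776 and 2.776 are the 2.5% and 97.5% points of the Student-t distribution with 4 df.
low, high = confidence_interval(lapsed, exposure, t_lower=-2.776, t_upper=2.776)
```

With equal exposure in every year the best estimate reduces to the overall lapse rate, here 2,000 lapses over 50,000 policy-years.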

Before proceeding to the next subsection, we make an important remark. The Dutch insurance company estimates lapse rates as the number of lapsed policies divided by a linear interpolation of the total number of policies (see Equation (19)); however, the exposure the insurer employs is not compatible with the logistic regression model. First, the logistic regression requires integer counts; second, the interpretation of the results would be affected. In order to improve the comparability of the models, we use the number of policyholders at the beginning of the year as the exposure instead of the company's linear interpolation. The lapse rates are then calculated as in equation (1). This change does not significantly affect the estimates of the company's model, as there is relatively little mid-year policy initiation in the data set.

4.3 Measures for model comparison

Due to the rudimentary nature of the company model, only a limited number of tests are available to assess its global goodness of fit in comparison with the logistic regression model. Moreover, Browne and Cudeck (1992) argue that even models that approximate the data closely will be rejected if the sample size is sufficiently large. As a result, statistical tests of goodness of fit are of limited use in the present analysis, which covers several years of data and millions of insurance contracts.

To compare both models we differentiate between prediction accuracy and prediction prudence. Prediction accuracy concerns the errors between predicted and real lapse rates and assesses how well the model fits the observations. The use of estimated errors is taken from Kim (2005) and Kiesenbauer (2012), which allows for a direct comparison of the results. The root mean square error (RMSE) is calculated as

$$ RMSE = \frac{1}{\sqrt{n}} \sqrt{\sum_{k=1}^{n} (y_k - \hat{y}_k)^2}, \tag{24} $$

where $y_k$ denotes the k-th real value, $\hat{y}_k$ denotes the k-th predicted value, and n is the sample size; the notation is adopted from Kim (2005).
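Equation (24) translates directly into code. The observed and predicted lapse rates below are hypothetical and serve only to show the calculation.

```python
import math

def rmse(actual, predicted):
    """Equation (24): root of the summed squared errors, scaled by 1 / sqrt(n)."""
    n = len(actual)
    squared_errors = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(squared_errors) / math.sqrt(n)

actual = [0.040, 0.050, 0.030]     # hypothetical real lapse rates
predicted = [0.045, 0.045, 0.035]  # hypothetical predicted lapse rates
error = rmse(actual, predicted)
```

Every prediction in this example is off by half a percentage point, so the RMSE is 0.005 as well.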

Prediction accuracy gives a proper indication of the quality of the best estimate and of how well the model can extrapolate into the future. However, this is not the insurer's primary interest; the insurer is concerned with the prudence of the company model. In this context, prudence is defined as whether or not the real lapse rates lie within the predicted confidence interval. To compare the prudence of both models, the following measure is used

$$ \text{Prudence} = \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}_{A_k}, \tag{25} $$

where $\mathbf{1}_{A_k}$ is an indicator function for the k-th predicted value and n is the sample size. The indicator function is defined as

$$ \mathbf{1}_{A_k} = \begin{cases} 1 & \text{if } y_k \in A_k \\ 0 & \text{if } y_k \notin A_k \end{cases}, \quad \text{where } A_k = [a_k, b_k], \tag{26} $$

where $y_k$ denotes the k-th real value, $a_k$ denotes the lower endpoint of the confidence interval of the k-th predicted best estimate and $b_k$ denotes the upper endpoint. By comparing the prudence of the company's model and the logistic regression model, the insurer can gauge how well its model performs in comparison to a model that performs reasonably well (Kiesenbauer, 2012).
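The prudence measure of equations (25) and (26) is the share of observations that fall inside their predicted interval. A minimal sketch with hypothetical lapse rates and intervals:

```python
def prudence(actual, intervals):
    """Equation (25): fraction of real values y_k inside their interval [a_k, b_k]."""
    hits = sum(1 for y, (a, b) in zip(actual, intervals) if a <= y <= b)
    return hits / len(actual)

actual = [0.040, 0.050, 0.030, 0.070]  # hypothetical real lapse rates
intervals = [                          # hypothetical predicted confidence intervals
    (0.035, 0.045),
    (0.040, 0.048),
    (0.025, 0.035),
    (0.050, 0.080),
]
score = prudence(actual, intervals)
```

Three of the four real values lie inside their interval, so the prudence is 0.75. A value close to 1 means the predicted intervals rarely miss the realized lapse rate.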

In summary, this section gives some insight into the proposed models. Generalized linear models extend classical linear models by relaxing the assumption that the errors are normally distributed and the assumption on the link between the random and the systematic component. The distinguishing feature of the logistic regression model is that the outcome variable is binary; as a result, the distribution underlying a logistic regression is the binomial distribution. The method of estimation is maximum likelihood, and the coefficients are interpreted through odds ratios. After describing the logistic regression, we discussed the company model and described some measures for comparing the models. The following section presents the results of the analysis.

5 Results

In this section, we detail our empirical results and the comparison of the global goodness of fit between the company model and the logistic regression model. We begin by examining the effect of economic indicators and policyholder characteristics on lapse rates. Afterwards the logistic regression model and company model prediction are compared and discussed.

5.1 Empirical Analysis

This study covers a wide array of explanatory variables (see Subsection 3.1). In order to reduce model complexity, explanatory variables are omitted using ANOVA, the t-test and the F-test (see Appendix B) until the remaining variables are significant at the 1% significance level. These variables remain in the reduced model for each product type studied. The reduced model of the Unit Linked product type is displayed in Table 2; the reduced models of the other product types (Annuity, Savings, Mortgage-Linked Savings and Term Life) are displayed in Table A.3 in Appendix A.


Table 2: Regression output for Unit Linked products.

Coefficients                                  Estimate   Std. Error   P(>|z|)
Intercept                                       -1.498      0.78      0.054
Premium (reference level: Small)
  Single                                        -0.761      0.01      0.000***
  Medium                                        -0.207      0.01      0.000***
  Large                                         -0.006      0.01      0.670
Year                                             0.450      0.02      0.000***
Gender (reference level: Male)
  Female                                        -0.090      0.01      0.000***
Contract age (reference level: (5.5, 14.5])
  (-1, 1.5]                                      0.634      0.02      0.000***
  (1.5, 5.5]                                     0.567      0.01      0.000***
  (14.5, 40.5]                                  -0.598      0.02      0.000***
Reserve (reference level: New)
  Small                                          0.351      0.01      0.000***
  Medium                                         0.513      0.01      0.000***
  Large                                          0.509      0.02      0.000***
Age (reference level: (40, 60])
  (-1, 5]                                       -0.168      0.05      0.001***
  (5, 15.5]                                     -0.126      0.02      0.000***
  (15.5, 25.5]                                   0.672      0.02      0.000***
  (25.5, 40]                                     0.483      0.01      0.000***
  (60, 96.5]                                    -0.191      0.02      0.000***
GDP                                             -9.750      0.39      0.000***
Unemployment                                    39.39       1.17      0.000***
CPI.PI06                                        -9.863      1.01      0.000***
AEX                                              0.010      0.00      0.000***
Euro Stoxx                                      -0.001      0.00      0.000***
b1                                              37.12       1.42      0.000***

Significant for: *** P(>|z|) < 0.001, ** P(>|z|) < 0.05, * P(>|z|) < 0.1 (Wald test)

The coefficients in the reduced models of Table 2 and Table A.3 (see Appendix A) are expressed in log-odds. A positive coefficient indicates that the log-odds of lapse increase (decrease) with an increasing (decreasing) value of the corresponding explanatory variable. For negative regression coefficients the log-odds of lapse decrease (increase) with an increasing (decreasing) value of the explanatory variable. By exponentiating the coefficients, we can interpret them as odds ratios. For example, for Unit Linked products the odds of lapsing for females are exp(−0.090) ≈ 0.91 times the odds for males, that is, about 9 percent lower, keeping all other variables fixed (ceteris paribus) (see Table 2). We now discuss the results for each characteristic in detail, focusing on the logistic regression output (see Tables 2 and A.3). Comparing the product types is difficult due to the differences in the categorical partitioning of some explanatory variables; e.g., Unit Linked products have 6 levels for the variable policyholder age whereas Annuity products have 3 levels (see Table A.2 in Appendix A). Furthermore, the values of these levels differ per product type. The selection of the reference level within the logistic regression differs as well: for example, the reference level for the explanatory variable premium is Small for Unit Linked products (see Table 2), whereas for Term Life products the reference level is Single (see Table A.3 in Appendix A). As a result we discuss each product type separately.

Product type. The results show that the explanatory variables in the reduced models are similar across all product types. However, the regression estimates indicate that the impact and direction differ between product types. For example, the variable Year for the Unit Linked product type suggests that lapse increases yearly, while the same variable for the Mortgage-Linked Savings product type indicates that lapse decreases yearly. We also notice that some product types are less influenced by macro-economic indicators and policyholder characteristics than others. Annuity products have the fewest significant variables, which is mostly attributed to the fiscal repercussions associated with lapsing this product (see Subsection 3.2). These findings agree with the existing literature (Kim, 2005; Kiesenbauer, 2012); consequently we have confirmed hypothesis I on page 5: lapse behavior is indeed product specific.

Policyholder gender. The lapse rate for females is lower than for males for Unit Linked, Term Life and Savings products: females are approximately 9, 10 and 10 percent less likely to lapse than males, respectively. Conversely, Mortgage-Linked Savings products exhibit a higher probability of lapse for females than for males: females are about 3 percent more likely to lapse than males. This might be explained by females having a greater incentive to lapse an insurance contract they do not completely comprehend. Another explanation might be a higher risk aversion among females in financial matters (Halek and Eisenhauer, 2001). Policyholder gender is not significant for Annuity products, so there is insufficient evidence of a difference between male and female lapse rates. These findings confirm hypothesis II (see page 5) for Unit Linked, Term Life and Savings products. However, the hypothesis is rejected for Mortgage-Linked Savings and Annuity products. It should be noted that the existing literature mostly studies Traditional, Unit-Linked and Variable annuities; Mortgage-Linked Savings products have not been extensively researched. Our findings for the Unit Linked, Term Life and Savings products coincide with the findings of Kagraoka (2005) and Eling and Kiesenbauer (2013).

Contract Age. All products exhibit a relative decrease in lapse rates with an increase in contract age. To illustrate how contract age affects lapse rates, we refer to Figure 1(b). The regression coefficients of each product type indicate that as the contract age increases, the probability of lapsing decreases. These results are consistent with the empirical findings of Renshaw and Haberman (1986), Cerchiara et al. (2009), and Milhaud et al. (2010). Most policyholders realize quickly whether they really need a purchased policy and whether they have been properly advised by a salesperson. If the customer, for instance, cannot afford the regular premium payments, the customer might lapse the contract within the first years after policy inception. If a product really fits the policyholder's needs, it is less likely that the policy will be lapsed. Life insurance savings might then only be used in the case of personal financial distress, in line with the EFH (Dar and Dodds, 1989; Kuo et al., 2003). These findings confirm hypothesis III on page 5 for all product types studied.

Policyholder Age. The relationship between the age of the policyholder and lapse rates can be characterized as a parabola (see Figure 1(a)). All product types exhibit a low, high and low development in lapse rates. We distinguish three phases: young, middle and old age. The young-age phase represents policyholders up to age 25; the lapse rate in this phase is remarkably low but increases steadily. These policies might initially be purchased by the policyholder's parents. When family circumstances change (e.g., marriage or birth of children), the needs might change and the premiums might no longer be affordable. The middle-age phase covers policyholders between 26 and 60; this phase shows an increase in lapse rates. Policyholders in this age group might have other expenditures such as mortgage payments, as a result of which premium payments are no longer practical. Another explanation, for products with a savings component, is that it might become difficult to find a new job as people get older. According to the EFH (Outreville, 1990), they might use their life insurance savings as emergency funds. Other policyholders in their late 50s might retire early and live on their life insurance savings until they can draw a pension. The old-age phase comprises policyholders older than 60; this phase exhibits a decline in lapse rates. As policyholders grow older the policy maturity date approaches, in particular for products with a savings component, and as a consequence policyholders might be less inclined to lapse their contracts. The regression coefficients of all product types, taking the reference levels of policyholder age for each product type into account, suggest a low, high and low development in lapse rates. These results confirm hypothesis IV (see page 5): lapse rates vary with respect to policyholder age for all product types.


Premium payments. Single premium policies lapses less often for all but one product type. The regression coefficient of Mortgage-Linked Savings for single premium is not significant, therefore there is no sufficient evidence to suggest a difference between single premium and the reference level (i.e., small). This is intuitive, if a policyholder has enough funds to purchase a single premium policy (lump-sum), then the policyholder would have enough funds to buy a house therefore not needing a Mortgage-Linked Savings contract. Term Life, Savings and Annuity products have the same reference level single premium policies, interestingly these products reveal an increasing odds of lapsing with an increase in regular premium level. Small regular premium policies lapses approximately 486, 32 and 962 percent more often than single premium business, for Term Life, Savings and Annuity products respectively. These odds increase when the size of the premium payments increase. As the amount of premium payments increase, policyholders might have greater incentive to lapse their contract as they find premium payments no longer feasible, especially during economic hardship. Unit Linked products are more complicated to interpret than the other product types, because the regression coefficients indicate that single premium business lapses 53 percent less often than regular small premium poli-cies. However as the size of the premium policies increase, the probability of lapse decreases, regular medium premium policies lapse about 19 percent less than regular small premium policies. The re-gression coefficient of large premium policies are not significant, as a result there is a lack of evidence to suggest a significant difference between large premium policies and the reference level (i.e., small). The decreasing lapse rates with increasing regular premium size for Unit Linked products might be credited to the product specifications. 
If the product has a profit sharing clause or a guaranteed yield added, then the premium would be higher; however, the policyholder would bear less risk and therefore be less inclined to lapse. Behrman et al. (2010) find a positive correlation between wealth accumulation and financial literacy4. Policy size is likely to be correlated with financial literacy: as the size of premium payments increases, policyholders are likely to have a better understanding of the insurance product. A financially literate policyholder might be less inclined to lapse his/her insurance contract, having carefully reasoned before purchasing a financial instrument. The finding that single premium policies have lower lapse rates coincides with Milhaud et al. (2010) and Eling and Kiesenbauer (2013). Furthermore, this paper adds to the existing literature by studying the relationship between premium size and lapse rates. These results confirm hypothesis V (see page 6) for Term Life, Savings and Annuity products. Unit Linked and Mortgage-Linked Savings products partially confirm hypothesis V: single premium policies have a lower lapse rate than regular premium policies. However, an increase in premium size does not necessarily indicate an increase in lapse rates.
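The percentages quoted in this paragraph can be related to logistic regression coefficients, assuming they are read as changes in the odds of lapsing relative to the reference level: a coefficient β then corresponds to a change of (exp(β) − 1) × 100 percent. The following minimal sketch illustrates the conversion in both directions. The function names are hypothetical, and the coefficients are back-computed from the quoted percentages for illustration only; they are not taken from the regression tables.

```python
import math


def pct_change_in_odds(beta):
    """Percentage change in odds implied by a logit coefficient beta."""
    return (math.exp(beta) - 1.0) * 100.0


def coefficient_from_pct(pct):
    """Inverse mapping: logit coefficient implied by a percentage change in odds."""
    return math.log(1.0 + pct / 100.0)


# Percentages quoted in the text: small regular premium vs. single premium.
quoted = {"Term Life": 486.0, "Savings": 32.0, "Annuity": 962.0}

# Illustrative coefficients back-computed from the quoted percentages
# (assumption: the percentages are changes in odds, not in probabilities).
implied = {product: coefficient_from_pct(pct) for product, pct in quoted.items()}

for product, beta in implied.items():
    print(f"{product}: beta ~ {beta:.3f}, odds ratio ~ {math.exp(beta):.2f}")
```

Under the same assumption, a negative coefficient gives an odds ratio below one, which is how a statement such as "53 percent less often" would arise.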

Calendar year. During the analysis we notice a linear trend for the variable Year; accordingly we include the variable Year as a continuous variable in the logistic regression model. The reduced regression model of Annuity products does not contain the variable Year, which implies that the lapse rates of this product type are not affected by yearly changes in the economic environment. This lack of influence can be explained by the fiscal repercussion associated with lapsing this product type (see Subsection 3.2). Unit Linked and Savings products show an increase of approximately 56 and 11 percent, respectively, for each additional year. The increasing lapse rates per year coincide with the empirical results of Cerchiara et al. (2009) and Eling and Kiesenbauer (2013). The increasing lapse rate during the time period 1996 to 2012 might be a consequence of both global and local developments, such as the financial crisis at the global level and the profiteering affair at the local level (see Subsection 2.2). This increase in lapse rates supports the EFH: as the economic environment worsens, policyholders surrender their insurance policy in times of necessity. Surprisingly, Term Life and Mortgage-Linked Savings products experience a decrease in lapse rates per year; policyholders might be aware that as the economic environment worsens (interest rates fall) they will hardly find comparable insurance coverage at the same premium level. This decrease in lapse rates supports the IRH: as interest rates fall, policyholders are less inclined to lapse their policy. Unit Linked and Savings products confirm hypothesis VI: the lapse rates increase during the period. Mortgage-Linked Savings and Term Life

4 Financial literacy is regarded as the ability to use knowledge and skills to manage financial resources effectively for a


products partially confirm hypothesis VI (see page 6): calendar year effects influence lapse rates. However, these products experience a decrease in lapse rates during the time period. Calendar year effects do not influence the lapse rates of Annuity products.
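Because Year enters the model as a continuous variable, a per-year percentage change compounds multiplicatively over several years, holding all other covariates fixed. The sketch below illustrates this, assuming the quoted 56 and 11 percent are per-year changes in the odds of lapsing. The function names and the baseline lapse probability of 2 percent are hypothetical and serve only as an illustration.

```python
def odds_multiplier(pct_per_year, years):
    """Cumulative odds ratio after `years` years of a constant per-year change."""
    return (1.0 + pct_per_year / 100.0) ** years


def shifted_probability(p0, multiplier):
    """Probability obtained by scaling the odds of baseline probability p0."""
    odds = p0 / (1.0 - p0) * multiplier
    return odds / (1.0 + odds)


# Hypothetical baseline lapse probability; not a figure from the thesis.
baseline = 0.02

# Per-year percentages quoted in the text (assumed to be changes in odds).
per_year = {"Unit Linked": 56.0, "Savings": 11.0}

for product, pct in per_year.items():
    m = odds_multiplier(pct, years=5)
    p = shifted_probability(baseline, m)
    print(f"{product}: odds ratio after 5 years ~ {m:.2f}, probability ~ {p:.4f}")
```

The conversion through odds matters: a 56 percent rise in odds is not a 56 percent rise in lapse probability, although the two are close when the baseline probability is small.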

Reserve. The effect of policy reserve on lapse rates is intricate. The Unit Linked and Savings products indicate an increase in lapse rates as the reserve increases. Policyholders might be more inclined to exercise the paid-up option (stop premium payments) when the policy reserve reaches a certain size, as the policyholder receives a reduced payment at the distribution phase. However, the increase in lapse rates levels off when the policy reserve reaches a certain size. The difference in lapse rates between medium and large reserves is minimal; as a result one might argue that policy reserve influences lapse only up to a certain size. The policy reserve reflects the contract age: as the contract age increases, the size of the reserve increases. Consequently, the policyholder is less inclined to lapse as the size of the policy reserve (i.e., contract age) increases. Annuity and Term Life products show a decrease in lapse rates as the reserve increases. The decrease in lapse rates for Annuity products arises from the fiscal repercussion associated with lapsing (see Subsection 3.2): as the size of the policy reserve increases, so does the size of the tax penalty. The model for Mortgage-Linked Savings does not contain policy reserve as an explanatory variable, which implies that policy reserve does not significantly affect lapse rates of this product type. These findings confirm hypothesis VII (see page 6): policy reserve affects Unit Linked, Savings, Term Life and Annuity products. An increasing policy reserve increases lapse rates for Unit Linked and Savings products, whereas an increasing policy reserve decreases lapse rates for Term Life and Annuity products. Policy reserve does not affect the lapse rates of Mortgage-Linked Savings products.

Macro-economic indicators. A wide variety of macro-economic indicators is used in this study. To avoid multicollinearity5 we include the variables sequentially and choose the variables which give the greatest decrease in deviance (see Appendix B). The unemployment rate only affects lapse rates of Unit Linked products; the other product types are not affected. The regression coefficient of the unemployment rate suggests that an increase in the unemployment rate increases the odds of lapsing considerably. This increase in lapse supports the EFH: as unemployment rises, policyholders might not be able to find a job and resort to surrendering their insurance policy for capital. The positive relation between unemployment rates and lapse rates for Unit Linked products coincides with Kiesenbauer (2012). Unit Linked and Savings products suggest a decrease in lapse rates as the GDP rate increases. This supports the EFH: as economic conditions improve (i.e., GDP rises), policyholders are less inclined to lapse their contract. Interestingly, Mortgage-Linked Savings and Term Life products show a positive regression coefficient: an increasing GDP rate increases the lapse rates. This positive relationship between GDP and lapse rates contradicts the EFH. A possible explanation might be that, in favorable economic conditions, policyholders use their accumulated funds for a larger acquisition, e.g., to purchase a house. To reflect the investment climate we include the AEX index and the Euro Stoxx index; these indexes are significant for Unit Linked, Savings, Mortgage-Linked Savings and Annuity products. However, the regression coefficients for these indexes are minimal for all product types; as a result we only include them for fitting purposes (see Subsection 5.2). To measure the effects of inflation on lapse rates we add the CPI to the logistic regression model.
Term Life and Mortgage-Linked Savings suggest a positive relation between CPI and lapse rates: the lapse rates increase as the CPI increases. This positive relation supports the EFH, since a rising CPI decreases the policyholders’ purchasing power. Policyholders are more likely to lapse their insurance contract as their financial position worsens. On the other hand, Unit Linked shows a negative coefficient for CPI: lapse rates decrease as the CPI increases. This negative impact of CPI on lapse rates contradicts the EFH. Policyholders might be less inclined to lapse their contract as the real interest rate on the market decreases in comparison to the yield of the insurance contract (supporting the IRH). The CPI does not influence lapse rates of Annuity products. Lastly, we include Dutch zero yields to measure the relationship between interest rates and lapse rates. We notice that

5 Multicollinearity is a statistical anomaly in which two or more explanatory variables in a multiple regression model are
