A modified elemental percentile method for estimation of generalized pareto distribution parameters


Kibum Kwon

Master’s Thesis to obtain the degree in Actuarial Science and Mathematical Finance University of Amsterdam

Faculty of Economics and Business Amsterdam School of Economics

Author: Kibum Kwon

Student nr: 11374993

Email: kvkwon@gmail.com

Date: July 7, 2017

Supervisor: Merrick Li

Second reader: Roger Laeven

Statement of Originality

This document is written by Student Kibum Kwon who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Abstract

Accurate estimation of the generalized Pareto distribution parameters is of great importance in modeling threshold exceedances. A number of estimation methods have been introduced, including the elemental percentile method (EPM) of Castillo and Hadi (1997). EPM estimators exist for all parameter values and do not show convergence problems. However, the EPM is computationally inefficient, and some of its initial estimates lack accuracy. In this paper, we propose the modified elemental percentile method (ModEPM), which modifies the EPM algorithm by taking a stratified sampling approach. Through extensive simulation studies and a real-world application, we show that the ModEPM is a sound alternative to the EPM in terms of accurate parameter estimation and better fit to the data.

Keywords Generalized Pareto distribution, Parameter estimation, Elemental percentile method, Stratified sampling, Monte Carlo simulation


Preface

Working on the thesis was mainly about going through a cycle of squeezing out ideas, trying them out, and failing. Whenever I struggled, I found a way out after lengthy discussions with Merrick. Words cannot fully describe how grateful I am for his invaluable feedback. I also sincerely appreciate the warm support and inspiring messages I received from family members and friends back in South Korea. Finally, it was my friends in Amsterdam who kept me going through this year-long academic journey. I will never forget those discussions on study topics, coffee breaks in the library, and drinks and frisbee training sessions in the parks.

Chapter 1

Introduction

Modeling the distribution of extreme values is of great interest in both theory and practice. Extreme value theory (EVT) is a branch of statistics which deals with the limiting distribution models for extreme values in large samples. In hydrology, modeling the distribution of high waves or extreme precipitation is important for building sea dykes or preparing precautionary measures against flooding. In the financial world, EVT plays a significant role in modeling the extremal behavior of various financial risk factors (McNeil et al. (2015)). For instance, participants in the stock market are concerned about unexpected downside moves of the market, and would therefore like to know how to model extreme events in market risk.

The traditional approach to modeling extremes is the block maxima method. This approach describes the limiting distribution of the normalized block maxima of the sample with the generalized extreme value distribution (GEVD). The block maxima approach has been criticized because it wastes too much data by taking only the maximum value of each block into account. An alternative approach is to analyze threshold exceedances, that is, to model the exceedances of the given data over a high threshold. Compared to the block maxima approach, it makes use of all the data above a certain threshold, and is thus less wasteful of the data. The core concept in modeling threshold exceedances is that the excess distribution over a large threshold is approximated by a member of the generalized Pareto distribution (GPD) family.

Since the introduction of the GPD in Pickands III (1975), various methods for estimating the parameters have been proposed, each with its advantages and disadvantages. In this paper, we propose a modified method for estimating the GPD parameters. The method is a modification of the elemental percentile method proposed by Castillo and Hadi (1997). By taking a stratified sampling approach, the modified method is computationally efficient, and estimates the parameters more accurately than the elemental percentile method.

The rest of the paper is organized as follows. In Chapter 2 we review the literature regarding the GPD and its implications for modeling the tail of a distribution, some well-known estimation methods, an accuracy measure for the estimators, and stratified random sampling. The modified estimation method is introduced in Chapter 3. In Chapter 4, extensive simulation studies are conducted to compare the performance of the modified estimators to the existing ones. In Chapter 5, we apply the modified method to real-world data in the context of modeling extreme insurance losses. Chapter 6 concludes with final remarks and a suggestion of when to use the modified method.

Chapter 2

Literature Review

2.1 Threshold exceedances

One of the methods to model the behavior of extreme values is modeling the threshold exceedances. As we mention in Chapter 1, these are models for all observations that exceed a pre-specified, high threshold level. McNeil et al. (2015) stated that these models are considered to be the most useful for practical applications, due to their efficient use of the data on extreme outcomes, which are often limited in size.

The distribution of threshold exceedances is approximated by the two-parameter generalized Pareto distribution (GPD) with parameters ξ and β, introduced by Pickands III (1975), whose cumulative distribution function (cdf) is defined by

$$
G_{\xi,\beta}(x) =
\begin{cases}
1 - (1 + \xi x/\beta)^{-1/\xi}, & \xi \neq 0, \\
1 - \exp(-x/\beta), & \xi = 0,
\end{cases}
\tag{2.1}
$$

where ξ and β are referred to as the shape and scale parameters, respectively. The domain of x is x ≥ 0 when ξ ≥ 0 and 0 ≤ x ≤ −β/ξ when ξ < 0.
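As a small illustration of (2.1), the following Python sketch evaluates the GPD cdf and its inverse. The function names `gpd_cdf` and `gpd_quantile` are illustrative choices, not part of the original text, and the handling of points outside the support (returning 0 or 1) is an assumption made for convenience.

```python
import math


def gpd_cdf(x, xi, beta):
    """CDF of the two-parameter GPD in (2.1).

    Returns 0 below the support and 1 above the upper end point
    -beta/xi that exists when xi < 0 (a convenience assumption).
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= 0:
        return 0.0
    if xi == 0:
        return 1.0 - math.exp(-x / beta)
    if xi < 0 and x >= -beta / xi:
        return 1.0
    return 1.0 - (1.0 + xi * x / beta) ** (-1.0 / xi)


def gpd_quantile(p, xi, beta):
    """Inverse of (2.1) for 0 <= p < 1."""
    if not (0.0 <= p < 1.0):
        raise ValueError("p must lie in [0, 1)")
    if xi == 0:
        return -beta * math.log(1.0 - p)
    return beta / xi * ((1.0 - p) ** (-xi) - 1.0)
```

For example, `gpd_quantile(0.995, 0.5, 1.0)` evaluates the 99.5% quantile for ξ = 0.5 and β = 1.0, and `gpd_cdf` applied to that value returns 0.995 up to rounding.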

2.2 GPD parameter estimation

In this section, we briefly introduce some well-known methods for estimating the GPD parameters ξ and β, as well as their benefits and shortfalls. Assume that our data $x = (x_1, \ldots, x_n)$ are i.i.d. with common distribution G, where G is the GPD with parameters ξ and β ($G_{\xi,\beta}(x)$ in (2.1)). Note that the following methods of GPD parameter estimation can be found in Embrechts et al. (2013).

2.2.1 Maximum likelihood method

The most conventional parameter estimation method is the maximum likelihood method. It has been considered by many authors, and Grimshaw (1993) presented an algorithm for computing the maximum likelihood estimates.

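From (2.1), we can derive the density and the log-likelihood function of the GPD when ξ ≠ 0:

$$
g_{\xi,\beta}(x) = \frac{1}{\beta}\left(1 + \frac{\xi x}{\beta}\right)^{-1/\xi - 1} \tag{2.2}
$$

and

$$
l_x(\xi,\beta) = -n\ln\beta - \left(\frac{1}{\xi} + 1\right)\sum_{i=1}^{n}\ln\left(1 + \frac{\xi}{\beta}x_i\right). \tag{2.3}
$$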

By numerically solving the maximization problem of (2.3), one can obtain the maximum likelihood estimators (MLEs) $\hat{\xi}_{\mathrm{MLE}}$ and $\hat{\beta}_{\mathrm{MLE}}$.
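The objective in (2.3) can be sketched in Python as below. This is only the function to be maximized, not Grimshaw's algorithm; the name `gpd_loglik`, the exponential limit used at ξ = 0, and the convention of returning minus infinity outside the support are assumptions made for illustration.

```python
import math


def gpd_loglik(xs, xi, beta):
    """Log-likelihood (2.3) of an i.i.d. GPD sample xs.

    Returns -inf when beta <= 0 or when an observation falls outside
    the support implied by (xi, beta); this convention is an assumption
    that keeps a numerical optimizer away from infeasible points.
    """
    if beta <= 0:
        return -math.inf
    n = len(xs)
    if xi == 0:
        # Limit of (2.3) as xi -> 0, i.e. the exponential log-likelihood.
        return -n * math.log(beta) - sum(xs) / beta
    total = 0.0
    for x in xs:
        z = xi * x / beta
        if 1.0 + z <= 0.0:
            return -math.inf
        total += math.log1p(z)
    return -n * math.log(beta) - (1.0 / xi + 1.0) * total
```

Any general-purpose optimizer can then be applied to `gpd_loglik` over (ξ, β); the convergence issues discussed in the text concern that optimization step, not the evaluation shown here.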

Figure 2.1: GPD density with ξ = 0.5, ξ = 0.0 and ξ = −0.5.

Although they are asymptotically efficient, Hosking and Wallis (1987) reported that the MLEs do not display their asymptotic efficiency even in samples as large as 500. It is also computationally difficult to obtain the MLEs, and they may have convergence problems (see Hosking and Wallis (1987), Grimshaw (1993) and Castillo and Hadi (1997)). Moreover, for ξ ≤ −1, the likelihood function can be infinite, resulting in non-existence of the MLEs.

2.2.2 Method of moments

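The method of moments (MOM) estimates are introduced in Hosking and Wallis (1987). The estimators are given by:

$$
\hat{\xi}_{\mathrm{MOM}} = \frac{1}{2}\left(1 - \frac{\bar{x}^2}{s^2}\right)
\quad\text{and}\quad
\hat{\beta}_{\mathrm{MOM}} = \frac{\bar{x}}{2}\left(1 + \frac{\bar{x}^2}{s^2}\right), \tag{2.4}
$$

where $\bar{x}$ and $s^2$ are the sample mean and the sample variance of our data x.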
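The estimators in (2.4) translate directly into code. The sketch below is illustrative: the function name `gpd_mom` is hypothetical, and the use of the n − 1 denominator for the sample variance is an assumption, since the text does not state which denominator is used.

```python
import statistics


def gpd_mom(xs):
    """Method of moments estimates (2.4) for a GPD sample xs."""
    xbar = statistics.fmean(xs)
    s2 = statistics.variance(xs)  # n - 1 denominator (assumption)
    ratio = xbar * xbar / s2
    xi_hat = 0.5 * (1.0 - ratio)
    beta_hat = 0.5 * xbar * (1.0 + ratio)
    return xi_hat, beta_hat
```

For the toy sample `[1.0, 2.0, 3.0]`, the mean is 2 and the sample variance is 1, so the function returns ξ̂ = −1.5 and β̂ = 5.0.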

The advantage of the method of moments estimates is that they are very easy to compute. Jocković (2012) stated that MOM estimates are often used as the initial estimates for other estimation methods involving numerical techniques. However, for ξ ≥ 0.5, Var[x] = ∞, and thus the MOM estimates do not exist. Also, Hosking and Wallis

2.2.3 Probability weighted moments

Hosking and Wallis (1987) also approached the estimation problem by considering the method of probability weighted moments (PWM). For the sample data x, the PWM estimates are:

$$
\hat{\xi}_{\mathrm{PWM}} = 2 - \frac{\bar{x}}{\bar{x} - 2t}
\quad\text{and}\quad
\hat{\beta}_{\mathrm{PWM}} = \frac{2\bar{x}t}{\bar{x} - 2t}, \tag{2.5}
$$

where

$$
t = n^{-1}\sum_{i=1}^{n}(1 - p_{i:n})\,x_{i:n},
$$

$p_{i:n} = (i - 0.35)/n$, and $x_{i:n}$ is the ith order statistic in the sample, i = 1, …, n.
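A minimal Python sketch of the PWM estimates follows. The function name `gpd_pwm` is hypothetical. The shape estimate is written in the sign convention of (2.1), that is ξ̂ = 2 − x̄/(x̄ − 2t), which tends to zero for exponential data, where x̄ ≈ β and t ≈ β/4.

```python
def gpd_pwm(xs):
    """Probability weighted moments estimates for a GPD sample xs."""
    n = len(xs)
    ordered = sorted(xs)
    xbar = sum(ordered) / n
    # t = (1/n) * sum_i (1 - p_{i:n}) * x_{i:n}, with p_{i:n} = (i - 0.35)/n
    t = sum((1.0 - (i - 0.35) / n) * x
            for i, x in enumerate(ordered, start=1)) / n
    denom = xbar - 2.0 * t
    xi_hat = 2.0 - xbar / denom      # sign convention of (2.1)
    beta_hat = 2.0 * xbar * t / denom
    return xi_hat, beta_hat
```

For the toy sample `[1.0, 2.0, 3.0]` one gets t = 61/90, so ξ̂ = −32/29 and β̂ = 122/29.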

The PWM works well when 0 ≤ ξ ≤ 0.5, but the estimates do not exist when ξ ≥ 0.5. In addition, both MOM and PWM estimators have low asymptotic efficiencies, and they may not be consistent with the sample. In particular, some of the sample values may fall outside the domain implied by the MOM or PWM estimates (see Castillo and Hadi (1997)).

2.2.4 Elemental percentile method

Castillo and Hadi (1997) gave a two-stage procedure for the elemental percentile method (EPM). The stages are:

1. Initial estimates

Let $x_{i:n}$ and $x_{j:n}$ be two distinct order statistics (i = 1, …, n − 1, j = 2, …, n, i < j) in a random sample of size n from $G_{\xi,\beta}(x)$ in (2.1). Then, by equating the theoretical cdf values at the observed order statistics with the corresponding percentile values, we have

$$
G_{\xi,\beta}(x_{i:n}) = p_{i:n} \quad\text{and}\quad G_{\xi,\beta}(x_{j:n}) = p_{j:n}, \tag{2.6}
$$

where

$$
p_{i:n} = \frac{i - \gamma}{n + \alpha}
$$

is a suitable plotting position. Castillo and Hadi (1997) reported that γ = 0 and α = 1 yield the best results. Letting δ = β/ξ, substituting into (2.6) and eliminating ξ, we obtain

$$
C_j \ln\left(1 + \frac{x_{i:n}}{\delta}\right) = C_i \ln\left(1 + \frac{x_{j:n}}{\delta}\right), \tag{2.7}
$$

where $C_i = \ln(1 - p_{i:n}) < 0$. Eliminating β instead, we obtain

$$
\left[(1 - p_{j:n})^{-\xi} - 1\right]x_{i:n} = \left[(1 - p_{i:n})^{-\xi} - 1\right]x_{j:n}. \tag{2.8}
$$

Solving (2.7) for $\hat{\delta}(i,j)$ and substituting this value into (2.6) yields the initial estimates $\hat{\xi}(i,j)$ and $\hat{\beta}(i,j)$.

2. Final estimates

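After computing $\hat{\xi}(i,j)$ and $\hat{\beta}(i,j)$ for all (i, j) pairs, the overall estimators of ξ and β are obtained by taking the median of the initial estimators:

$$
\hat{\xi}_{\mathrm{EPM}} = \operatorname{median}\left(\hat{\xi}(1,2), \hat{\xi}(1,3), \ldots, \hat{\xi}(n-1,n)\right)
$$

and

$$
\hat{\beta}_{\mathrm{EPM}} = \operatorname{median}\left(\hat{\beta}(1,2), \hat{\beta}(1,3), \ldots, \hat{\beta}(n-1,n)\right). \tag{2.9}
$$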
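For the initial estimates in stage 1, the elimination steps leading from (2.6) to (2.7) and (2.8) can be spelled out as follows. This is a sketch for the case ξ ≠ 0 that uses only (2.1) and (2.6); the closed-form expressions for the initial estimates in the last line are derived here and are not quoted from the source.

```latex
% Start from (2.6) with \xi \neq 0 and \delta = \beta/\xi:
1 - \left(1 + \frac{x_{i:n}}{\delta}\right)^{-1/\xi} = p_{i:n}
\;\Longrightarrow\;
C_i = \ln(1 - p_{i:n}) = -\frac{1}{\xi}\ln\left(1 + \frac{x_{i:n}}{\delta}\right).

% The same identity holds for j. Dividing the two identities removes \xi:
\frac{C_i}{C_j} = \frac{\ln\left(1 + x_{i:n}/\delta\right)}{\ln\left(1 + x_{j:n}/\delta\right)}
\;\Longleftrightarrow\;
C_j \ln\left(1 + \frac{x_{i:n}}{\delta}\right) = C_i \ln\left(1 + \frac{x_{j:n}}{\delta}\right),
\quad\text{which is (2.7).}

% Alternatively, solve (2.6) for x and divide to remove \beta:
x_{i:n} = \frac{\beta}{\xi}\left[(1 - p_{i:n})^{-\xi} - 1\right]
\;\Longrightarrow\;
\left[(1 - p_{j:n})^{-\xi} - 1\right]x_{i:n} = \left[(1 - p_{i:n})^{-\xi} - 1\right]x_{j:n},
\quad\text{which is (2.8).}

% Given a root \hat{\delta}(i,j) of (2.7), the first identity gives
\hat{\xi}(i,j) = -\frac{\ln\left(1 + x_{i:n}/\hat{\delta}(i,j)\right)}{C_i},
\qquad
\hat{\beta}(i,j) = \hat{\xi}(i,j)\,\hat{\delta}(i,j).
```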

Castillo and Hadi (1997) proposed the EPM as an alternative to the previously introduced estimation methods (MLE, MOM, PWM). They suggested using the MLE if the sample size is large (e.g., n > 500) and −0.5 < ξ < 0.5, and the PWM if the sample size is small and 0 ≤ ξ ≤ 0.5. In all other cases, they proposed using the EPM, especially when the MLE has convergence problems or the MOM/PWM gives estimates that are not consistent with the sample. This is because the EPM estimators exist for all values of ξ and β, and have no convergence problems. However, the EPM has shortfalls in terms of computational efficiency. We look deeper into this, and propose our model to remedy this problem, in Chapter 3.

2.3 Modeling tails and value at risk using GPD

The GPD model for excess distributions is used to estimate the quantiles in the tail of the underlying distribution (see p. 154 of McNeil et al. (2015)). Assume a continuous random variable X > 0 with distribution function F(x) = Pr[X ≤ x], and suppose that for some high threshold u the excess distribution function $F_u(x)$ can be approximated by $G_{\xi,\beta}(x)$ in (2.1). Then, for x ≥ u, the upper tail $\bar{F}(x) = 1 - F(x)$ can be represented as follows:

$$
\begin{aligned}
\bar{F}(x) &= \Pr[X > u]\,\Pr[X > x \mid X > u] \\
&= \bar{F}(u)\,\Pr[X - u > x - u \mid X > u] \\
&= \bar{F}(u)\,\bar{F}_u(x - u) \\
&= \bar{F}(u)\left(1 + \xi\,\frac{x - u}{\beta}\right)^{-1/\xi}.
\end{aligned}
\tag{2.10}
$$

Then, we can obtain the quantile of F by inverting (2.10), which is interpreted as the value at risk (VaR). That is, for 0 ≤ F(u) ≤ p ≤ 1,

$$
\mathrm{VaR}_p = q_p(F) = u + \frac{\beta}{\xi}\left(\left(\frac{1 - p}{\bar{F}(u)}\right)^{-\xi} - 1\right). \tag{2.11}
$$

In practice, we estimate ξ and β by $\hat{\xi}$ and $\hat{\beta}$, using an appropriate estimation method. As an estimate of $\bar{F}(u)$, we take the empirical estimator $N_u/n$, where $N_u$ is the number of observations that exceed the threshold u in a given sample, and n is the sample size. Substituting these values into (2.11), we obtain the point estimate of the value at risk,

$$
\widehat{\mathrm{VaR}}_p = u + \frac{\hat{\beta}}{\hat{\xi}}\left(\left(\frac{n(1 - p)}{N_u}\right)^{-\hat{\xi}} - 1\right). \tag{2.12}
$$
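The point estimate (2.12) can be computed with a few lines of Python. The sketch below is illustrative: the function name `gpd_var` and its argument names are hypothetical, and the branch for ξ̂ = 0 uses the exponential limit of (2.12), which is an addition not stated in the text.

```python
import math


def gpd_var(p, u, xi_hat, beta_hat, n, n_exceed):
    """Point estimate (2.12) of VaR_p from a GPD fitted above threshold u.

    n is the sample size and n_exceed the number of observations above u,
    so n_exceed / n is the empirical estimate of the tail probability at u.
    """
    if not (0 < n_exceed <= n):
        raise ValueError("need 0 < n_exceed <= n")
    tail_prob = n_exceed / n
    if not (1.0 - tail_prob <= p < 1.0):
        raise ValueError("p must lie between F(u) and 1")
    ratio = (1.0 - p) / tail_prob
    if xi_hat == 0:
        # Limit of (2.12) as xi_hat -> 0 (assumption: exponential tail).
        return u - beta_hat * math.log(ratio)
    return u + beta_hat / xi_hat * (ratio ** (-xi_hat) - 1.0)
```

As a usage example, with u = 10, ξ̂ = 0.5, β̂ = 2, n = 1000 and 100 exceedances, `gpd_var(0.995, 10.0, 0.5, 2.0, 1000, 100)` evaluates 10 + 4(√20 − 1), roughly 23.89.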

Under the current Solvency II regulatory framework, an insurance company calculates its solvency capital requirement (SCR). The SCR is the amount of capital an insurer has to hold in order to ensure that the probability of insolvency over a one-year horizon is less than 0.5% (see p. 25 of McNeil et al. (2015)). In essence, the SCR is the estimated VaR at a confidence level of 99.5% over a one-year period. Thus, it is crucial for insurers to accurately estimate the VaR of loss distributions arising from various risk factors that often exhibit heavy-tailed behavior.

2.4 Accuracy measures of estimators

When comparing different parameter estimation methods, it is useful to assess the accuracy of the estimators. A good estimator should be accurate, which in statistical terms means that the estimator is close to the true parameter value (Kotz and Johnson (1988)). Accuracy measures consider the bias and precision of an estimator simultaneously. Walther and Moore (2005) provided a rich review of the statistical concepts of bias, precision and accuracy, and of some measures for each concept. Here, we review a common accuracy measure used in studies of GPD parameter estimation.

2.4.1 Root mean square error

Mean square error (MSE) is a common accuracy measure in statistical studies. It is the mean of the squared differences between the estimator and the true parameter value. Let N be the number of simulation runs, and let $\hat{\xi}_i$ and $\hat{\beta}_i$ be the estimates of ξ and β from the ith sample. The MSE is calculated as follows:

$$
\mathrm{MSE}(\hat{\xi}) = \frac{1}{N}\sum_{i=1}^{N}(\hat{\xi}_i - \xi)^2
\quad\text{and}\quad
\mathrm{MSE}(\hat{\beta}) = \frac{1}{N}\sum_{i=1}^{N}(\hat{\beta}_i - \beta)^2. \tag{2.13}
$$

The MSE incorporates both bias and precision, as it is in fact the sum of the variance and the squared bias of an estimator. Here we briefly show that this holds; the proof can be found in a number of references, for instance on p. 303 of Casella and Berger (2002). Writing $\bar{\xi} = \frac{1}{N}\sum_{i=1}^{N}\hat{\xi}_i$ for the mean of the estimates, the MSE of an estimator $\hat{\xi}$ is

$$
\begin{aligned}
\mathrm{MSE}(\hat{\xi}) &= \frac{1}{N}\sum_{i=1}^{N}(\hat{\xi}_i - \xi)^2 \\
&= \frac{1}{N}\sum_{i=1}^{N}(\hat{\xi}_i - \bar{\xi} + \bar{\xi} - \xi)^2 \\
&= \frac{1}{N}\left\{\sum_{i=1}^{N}(\hat{\xi}_i - \bar{\xi})^2 + \sum_{i=1}^{N}(\bar{\xi} - \xi)^2\right\},
\end{aligned}
$$

where the cross term vanishes because $\sum_{i=1}^{N}(\hat{\xi}_i - \bar{\xi}) = 0$. The first sum in the last line represents the variance part and the second sum the squared bias part. A small MSE value of an estimator implies high accuracy. Meanwhile, a large MSE value of an inaccurate estimator may be due to large bias, high variance (low precision), or both (see Walther and Moore (2005)).

Since the differences are squared in the MSE calculation, the scale differs from that of the original sample measurements. Taking the square root returns us to the original scale, and the resulting measure is called the root mean square error (RMSE). The RMSE is widely used as a comparison criterion for GPD parameter estimation methods.

$$
\mathrm{RMSE}(\hat{\xi}) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(\hat{\xi}_i - \xi)^2}
\quad\text{and}\quad
\mathrm{RMSE}(\hat{\beta}) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(\hat{\beta}_i - \beta)^2}. \tag{2.14}
$$

Naturally, the implication of RMSE is the same as MSE. The smaller the RMSE value, the more accurate an estimator is.
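The accuracy measures above, together with the variance and squared-bias split of the MSE, can be checked numerically with a short Python sketch. The function names are illustrative, and the variance uses the 1/N denominator so that the two parts add up exactly to the MSE in (2.13).

```python
import math


def mse(estimates, true_value):
    """Mean square error (2.13) over simulation runs."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def rmse(estimates, true_value):
    """Root mean square error (2.14)."""
    return math.sqrt(mse(estimates, true_value))


def mse_decomposition(estimates, true_value):
    """Return (variance, squared bias); their sum equals the MSE."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    variance = sum((e - mean_est) ** 2 for e in estimates) / n  # 1/N denominator
    squared_bias = (mean_est - true_value) ** 2
    return variance, squared_bias
```

For the toy estimates `[0.4, 0.6, 0.7, 0.5]` of a true value 0.5, the MSE is 0.015, which splits into a variance of 0.0125 and a squared bias of 0.0025.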

2.5 Stratified random sampling

Our modified estimation method involves a random sampling technique called stratified random sampling. Cochran (2007) explained that in stratified random sampling, the population of size N is divided into subgroups, or strata, which are nonoverlapping and together compose the whole population. That is, $N_1 + N_2 + \ldots + N_L = N$, where $N_k$, k = 1, …, L, is the size of stratum k. This division is also referred to as stratification. After stratification, simple random samples of sizes $n_1, n_2, \ldots, n_L$ are drawn from the respective strata, where the drawings are made independently in different strata.

Cochran (2007) also pointed out, as one of the principal reasons for stratification, that it may produce a gain in precision in the estimation of characteristics of the whole population. Estimates obtained from each stratum can be combined into a precise estimate. This idea is at the center of our modified estimation model, which is proposed in Chapter 3.

Chapter 3

Model Proposal

Before we proceed to our model, we first look into the details of the two-stage algorithm of the EPM, since our model is a modification of that algorithm. The following is the algorithm from Castillo and Hadi (1997) to solve for δ in (2.7).

Algorithm 1

1. Select any two distinct order statistics $x_{i:n} < x_{j:n}$, where i < j, to compute $C_i$ and $C_j$ in (2.7). Then, let $d = C_j x_{i:n} - C_i x_{j:n}$.

2. If d = 0, then let $\hat{\delta}(i,j) = \pm\infty$, $\hat{\xi}(i,j) = 0$, and go to step 5 to finalize the estimation; otherwise, proceed to step 3.

3. Compute $\delta_0 = x_{i:n}x_{j:n}(C_j - C_i)/d$. If $\delta_0 > 0$, then it follows that $\delta_0 > x_{j:n}$. Thus apply the bisection method on the interval $[x_{j:n}, \delta_0]$ to obtain $\hat{\delta}(i,j)$ and go to step 5; otherwise, go to step 4.

4. Use the bisection method on the interval $[\delta_0, 0]$ and obtain $\hat{\delta}(i,j)$.

5. Use $\hat{\delta}(i,j)$ to compute $\hat{\xi}(i,j)$ and $\hat{\beta}(i,j)$ by substituting this value into (2.6).

The estimates $\hat{\xi}(i,j)$ and $\hat{\beta}(i,j)$ are based on only one arbitrary pair of order statistics $(x_{i:n}, x_{j:n})$. To make use of the complete information in the sample, and thus to be able to obtain statistically efficient estimators, Castillo and Hadi (1997) proceeded to the next step.
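The computation of one pair of initial estimates can be sketched in Python as below. As a deliberate simplification, this sketch does not follow the bracketing rules of Algorithm 1: it solves (2.8) for ξ by plain bisection with an expanding bracket, using the fact that the ratio of two GPD quantiles increases with ξ, and then recovers β from (2.6). Both routes solve the same pair of equations, so they should agree up to numerical tolerance. The function names, the tolerance and the starting bracket are illustrative assumptions.

```python
import math


def _log_abs_expm1(s):
    """log|exp(s) - 1| for s != 0, guarded against overflow for large s."""
    if s > 700.0:
        return s
    return math.log(abs(math.expm1(s)))


def _log_quantile_ratio(xi, a, b):
    """log of x(p_j)/x(p_i) under the GPD, with a = -C_i and b = -C_j."""
    if xi == 0.0:
        return math.log(b / a)
    return _log_abs_expm1(b * xi) - _log_abs_expm1(a * xi)


def elemental_estimate(x_i, x_j, p_i, p_j, tol=1e-10):
    """Initial estimates (xi, beta) from one pair of order statistics.

    Solves (2.8) for xi by bisection (not the delta-based bracketing of
    Algorithm 1), then obtains beta from (2.6).
    """
    if not (0.0 < x_i < x_j and 0.0 < p_i < p_j < 1.0):
        raise ValueError("need 0 < x_i < x_j and 0 < p_i < p_j < 1")
    a, b = -math.log1p(-p_i), -math.log1p(-p_j)
    target = math.log(x_j / x_i)

    def f(xi):
        # Increasing in xi; its root is the elemental estimate of xi.
        return _log_quantile_ratio(xi, a, b) - target

    lo, hi = -1.0, 1.0
    while f(lo) > 0.0:  # expand the bracket to the left
        lo *= 2.0
    while f(hi) < 0.0:  # expand the bracket to the right
        hi *= 2.0
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    xi = 0.5 * (lo + hi)
    if abs(xi) < 1e-12:
        beta = x_i / a  # exponential limit of (2.6)
    else:
        beta = xi * x_i / math.expm1(a * xi)
    return xi, beta
```

When the two inputs are exact GPD quantiles, for example those of ξ = 0.5 and β = 1.0 at p = 0.3 and p = 0.9, the function recovers the parameters up to the bisection tolerance.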

Algorithm 2

1. Use Algorithm 1 to compute $\hat{\xi}(i,j)$ and $\hat{\beta}(i,j)$ for all distinct pairs $x_{i:n} < x_{j:n}$.

2. Use the median of the sets of estimators to obtain the final EPM estimators of ξ and β; that is,

$$
\hat{\xi}_{\mathrm{EPM}} = \operatorname{median}\left(\hat{\xi}(1,2), \hat{\xi}(1,3), \ldots, \hat{\xi}(n-1,n)\right) \tag{3.1}
$$

and

$$
\hat{\beta}_{\mathrm{EPM}} = \operatorname{median}\left(\hat{\beta}(1,2), \hat{\beta}(1,3), \ldots, \hat{\beta}(n-1,n)\right). \tag{3.2}
$$

As we state in Chapter 2, the strength of the EPM lies in the fact that the EPM estimates exist for all values of the parameters ξ and β without showing any convergence problems. This is a major advantage of the EPM over other estimation methods such as MLE, MOM and PWM. However, as the sample size n becomes larger, the number of distinct pairs of order statistics grows in proportion to the square of n (the number of pairs is n(n − 1)/2), making the estimation process computationally intensive. Thus, it would be beneficial to reduce the number of quantile pairs involved in the estimation.

Castillo and Hadi (1997) acknowledged this issue, and proposed some modifications to reduce the number of calculations as follows:

1. Random sampling. Select M distinct order statistics at random with replacement.

2. Systematic sampling. Select pairs of distinct order statistics at least r steps apart.

3. Setting j = n and choosing the pairs $x_{i:n}$ and $x_{n:n}$, for i = 1, 2, …, n − 1.

The drawback of the first and second modifications is that the number M or r needs to be pre-specified by the data analyst, which makes the estimation results subjective, while the performance remains similar to that of the original EPM. A potential problem with the third method is that the initial estimates are influenced by the maximum of the sample, which might be an outlier (see Castillo and Hadi (1997)).

Table 3.1: The optimal (p, q) pairs for the estimators $\hat{\xi}_{\mathrm{asym}}$ and $\hat{\beta}_{\mathrm{asym}}$ which yield the minimal asymptotic variance, for different values of the true shape parameter ξ.

            ξ̂_asym           β̂_asym
   ξ       p      q         p      q
  2.0    0.19   0.92      0.29   0.97
  1.0    0.27   0.95      0.39   0.98
  0.5    0.33   0.97      0.46   0.99
 -0.5    0.66   0.99      0.69   0.99
 -1.0    0.77   0.99      0.77   0.99
 -2.0    0.80   0.99      0.80   0.99

On the other hand, one can think of finding an 'optimal' choice for p and q, and using that specific quantile pair to derive the parameter estimators. Based on the asymptotic results from Castillo and Hadi (1997), we can find the optimal choices for p and q in the sense of minimum asymptotic variance. Table 3.1 shows the p and q choices with the minimum asymptotic variance. Note that the optimal (p, q) pairs for the ξ and β estimators do not match for ξ = {-0.5, 0.5, 1.0, 2.0}. Thus, minimum asymptotic variance cannot serve as the selection criterion. A similar problem arises when we switch the criterion to the minimum root mean square error. From Table 3.2, we see that the optimal (p, q) pairs do not coincide for ξ = {0.5, 1.0, 2.0} under the minimum RMSE criterion.

3.1 Modified Elemental Percentile Method (ModEPM)

Apart from computational inefficiency, the existence of quantile pairs that lead to high RMSE values of the initial estimates is another possible shortcoming of the EPM. Figure 3.1 shows the RMSE values of the initial estimates for sample size 50, ξ = 0.5 and β = 1.0. We observe a generally decreasing pattern in the RMSE values of the initial estimates of ξ as the upper quantile q gets closer to 1.0. For the initial estimates of β, there is also some decrease in the RMSE values, but the pattern is rather concave, with low RMSE values for the mid-range quantile pairs. Note that we observe similar patterns under the other sample sizes and parameter values considered. Given this pattern, it is possible to reduce the number of quantile pairs while at the same time using accurate initial estimates, thereby improving the accuracy of the final estimates. Here, we propose the modified elemental percentile method (ModEPM) as a modification of the elemental percentile method. Denoting the modified EPM estimators by $\hat{\xi}_{\mathrm{ModEPM}}$ and $\hat{\beta}_{\mathrm{ModEPM}}$, the ModEPM algorithm is as follows.

Modified algorithm 2

1. Stratified sampling. Initially, categorize the quantile pairs by the upper order statistic $x_{j:n}$, j = n, …, 2, and denote the corresponding groups as strata n − 1, n − 2, …, 1, respectively. Then, randomly sample m, m − 1, …, 1 pairs from strata n − 1, n − 2, …, n − m, respectively, where 1 < m < n. One ends up with M = m(m + 1)/2 quantile pairs.

Table 3.2: The optimal (p, q) pairs yielding the minimum RMSE, under different sample sizes n and true shape parameter values ξ. The scale parameter β is set to 1.0, since the results are invariant with respect to β. $\hat{\xi}_{\mathrm{optim}}$ and $\hat{\beta}_{\mathrm{optim}}$ refer to the shape and scale estimators based on the optimal pair, respectively, while $\hat{\xi}_{\mathrm{EPM}}$ and $\hat{\beta}_{\mathrm{EPM}}$ are the EPM estimators of the parameters.

                     ξ̂_optim              ξ̂_EPM            β̂_optim              β̂_EPM
   n     ξ       p       q      RMSE      RMSE        p       q      RMSE      RMSE
  15   2.0   0.2000  0.9333  1.0740    1.0033    0.2000  1.0000  0.8907    0.8339
  15   1.0   0.3333  0.9333  0.7548    0.7793    0.2667  1.0000  0.7014    0.6650
  15   0.5   0.3333  1.0000  0.5715    0.7026    0.4000  1.0000  0.6017    0.6014
  15  -0.5   0.6000  1.0000  0.4308    0.6944    0.6000  1.0000  0.4576    0.5080
  15  -1.0   0.6000  1.0000  0.5022    0.7740    0.6000  1.0000  0.4135    0.4807
  15  -2.0   0.7333  1.0000  0.7699    1.0409    0.7333  1.0000  0.3769    0.4478
  50   2.0   0.1400  0.9600  0.3344    0.3212    0.2800  0.9800  0.2572    0.2415
  50   1.0   0.2600  0.9600  0.3597    0.3876    0.4000  1.0000  0.3266    0.3343
  50   0.5   0.2800  0.9800  0.2697    0.3477    0.4400  1.0000  0.2810    0.3089
  50  -0.5   0.6800  1.0000  0.1672    0.3514    0.6800  1.0000  0.2143    0.2700
  50  -1.0   0.6800  1.0000  0.2119    0.3994    0.6800  1.0000  0.1965    0.2568
  50  -2.0   0.7800  1.0000  0.3677    0.5452    0.7800  1.0000  0.1835    0.2393
 100   2.0   0.3000  0.8600  0.1630    0.1589    0.2000  0.9700  0.1215    0.1133
 100   1.0   0.2500  0.9300  0.2415    0.2607    0.3700  0.9900  0.2284    0.2212
 100   0.5   0.2900  0.9600  0.1927    0.2323    0.4300  0.9900  0.2012    0.2064
 100  -0.5   0.6600  1.0000  0.1021    0.2326    0.6600  1.0000  0.1417    0.1822
 100  -1.0   0.7200  1.0000  0.1377    0.2663    0.7200  1.0000  0.1321    0.1740
 100  -2.0   0.8200  1.0000  0.2498    0.3684    0.8200  1.0000  0.1249    0.1634

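2. Use Algorithm 1 to compute $\hat{\xi}(i,j)$ and $\hat{\beta}(i,j)$ for the M pairs $(x_{i:n}, x_{j:n})$ sampled in Step 1.

3. Use the median of $\hat{\xi}(i,j)$ and $\hat{\beta}(i,j)$ to obtain the modified EPM estimators; that is,

$$
\hat{\xi}_{\mathrm{ModEPM}} = \operatorname{median}\left(\hat{\xi}(i,j)\right) \tag{3.3}
$$

and

$$
\hat{\beta}_{\mathrm{ModEPM}} = \operatorname{median}\left(\hat{\beta}(i,j)\right). \tag{3.4}
$$

Note that $\hat{\xi}_{\mathrm{ModEPM}} = \hat{\xi}_{\mathrm{EPM}}$ and $\hat{\beta}_{\mathrm{ModEPM}} = \hat{\beta}_{\mathrm{EPM}}$ when m = n − 1.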

The modified elemental percentile method (ModEPM) takes a stratified random sampling approach to sample the quantile pairs. Recall that in stratified random sampling the population is subdivided with respect to some category. In the modified EPM, the quantile pairs are categorized by the upper order statistic. For example, the quantile pairs $(x_{1:n}, x_{n:n}), \ldots, (x_{(n-1):n}, x_{n:n})$ together form stratum n − 1. As a major goal of the modified EPM is to derive more accurate estimates than the original EPM estimates, we would like to sample more quantile pairs from the strata with high values of the upper order statistic $x_{j:n}$, based on the decreasing RMSE pattern observed in Figure 3.1. One way to do this is to adjust the sample size in each stratum, sampling m quantile pairs from stratum n − 1 and then decreasing the sample size when moving towards lower strata. The stratified sampling of the quantile pairs is illustrated in Table 3.3. After the sampling process, we compute $\hat{\xi}(i,j)$ and $\hat{\beta}(i,j)$ and take the medians of the estimates to derive the final estimates.
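Step 1 of the ModEPM can be sketched in Python as follows. The sketch only produces the index pairs (i, j); the initial estimates for those pairs and their medians are then computed as in Steps 2 and 3. The function name `sample_quantile_pairs` is hypothetical, and drawing without replacement within each stratum is an assumption, since the text only says that the pairs are sampled randomly.

```python
import random


def sample_quantile_pairs(n, m, rng=None):
    """Stratified sample of index pairs (i, j) with 1 <= i < j <= n.

    Stratum n-1 (upper index j = n) contributes m pairs, stratum n-2
    (j = n-1) contributes m-1 pairs, and so on down to one pair from
    stratum n-m, for a total of m(m+1)/2 pairs.
    """
    if not (1 <= m <= n - 1):
        raise ValueError("m must satisfy 1 <= m <= n - 1")
    rng = rng or random.Random()
    pairs = []
    for k in range(m):
        j = n - k        # upper order statistic index of this stratum
        size = m - k     # sample sizes m, m-1, ..., 1
        # The stratum holds the j-1 pairs (1, j), ..., (j-1, j);
        # draw `size` of them without replacement (assumption).
        lower_indices = rng.sample(range(1, j), size)
        pairs.extend((i, j) for i in sorted(lower_indices))
    return pairs
```

With n = 50 and m = 35, the function returns 630 distinct pairs, 35 of which have upper index j = 50. With m = n − 1 every stratum is sampled in full, so all n(n − 1)/2 pairs are returned, matching the remark that the ModEPM then coincides with the EPM.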

In Chapter 4, we perform simulation studies to assess the performance of the modified EPM estimators and compare them to the original EPM estimators.

Table 3.3: An illustration of Step 1 of the ModEPM. Each element of the matrix represents a quantile pair $(x_{i:n}, x_{j:n})$, where i < j (elements with i ≥ j are denoted −). These quantile pairs are divided columnwise to form the strata. Then m, m − 1, …, 1 quantile pairs are randomly sampled from strata n − 1, n − 2, …, n − m. Note that the pairs are sampled such that more are drawn from strata with a higher upper order statistic $x_{j:n}$. The resulting total number of quantile pairs is M = m(m + 1)/2.

                        strata n − m                    strata n − 2                strata n − 1
      −    (x_{1:n}, x_{(n−m+1):n})    ···    (x_{1:n}, x_{(n−1):n})    (x_{1:n}, x_{n:n})
      −    (x_{2:n}, x_{(n−m+1):n})    ···    (x_{2:n}, x_{(n−1):n})    (x_{2:n}, x_{n:n})
      ⋮                ⋮                ⋱                ⋮                        ⋮
      −                −               ···               −              (x_{(n−1):n}, x_{n:n})
      −                −               ···               −                        −
                       ↓                                 ↓                        ↓
 Sample sizes →        1               ···             m − 1                      m

(a) RMSE values of the initial estimators of ξ (axes: p, q, RMSE).

(b) RMSE values of the initial estimators of β (axes: p, q, RMSE).

Figure 3.1: RMSE values of the initial estimators of ξ and β for different (p, q) pairs (p < q). The sample of size n = 50 is generated from the GPD with parameter values ξ = 0.5 and β = 1.0.

Chapter 4

Simulation Studies

In this chapter, we carry out extensive simulation studies to compare the performance of the modified EPM estimators and the original EPM estimators. We consider several combinations of sample size and true parameter values, as the RMSE values depend on these conditions. Setting the conditions in line with Castillo and Hadi (1997), we consider ξ = {2.0, 1.0, 0.5, -0.5, -1.0, -2.0} and β = 1.0, as the results are invariant with respect to β. Moreover, we consider sample sizes up to 100, n = {15, 50, 100}. As the number of quantile pairs to use, we choose m so that M is close to half the number of all quantile pairs, $M \approx \frac{1}{2}\cdot\frac{n(n-1)}{2}$. For instance, when n = 50, $M = \frac{m(m+1)}{2} \approx \frac{1}{2}\cdot\frac{49\cdot 50}{2} = 612.5$, and therefore we take m = 35 so that M = 630 ≈ 612.5. Throughout the simulations, we consider samples $x = \{x_1, \ldots, x_n\}$ from the two-parameter GPD in (2.1). Under each case, 1000 simulation runs are performed in R.
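The choice of m described above can be sketched as a small Python helper. The function name `choose_m` is hypothetical, and the tie-breaking rule (prefer the larger m when two values are equally close) is an assumption; it is chosen because for n = 50 both m = 34 (M = 595) and m = 35 (M = 630) lie exactly 17.5 away from 612.5, and the text takes m = 35.

```python
def choose_m(n, fraction=0.5):
    """Pick m so that M = m(m+1)/2 is closest to fraction * n(n-1)/2.

    Ties are resolved in favour of the larger m (assumption).
    """
    target = fraction * n * (n - 1) / 2
    return min(range(1, n),
               key=lambda m: (abs(m * (m + 1) / 2 - target), -m))
```

Under this rule, the helper returns m = 10, 35 and 70 for n = 15, 50 and 100, which are the values of m listed in Table 4.1.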

Table 4.1: Simulated data: Root mean square error values of the ξ and β estimators. m is chosen such that M = m(m + 1)/2 is close to half the number of all quantile pairs. EPM estimators and modified EPM estimators are denoted as EPM and ModEPM, respectively. Samples of size n are generated from the GPD with true parameter values ξ and β = 1.0.

                                                   ξ
 Estimate    n    m   Method     -2.0    -1.0    -0.5     0.5     1.0     2.0
 ξ̂          15        EPM      1.0409  0.7740  0.6944  0.7026  0.7793  1.0033
                  10   ModEPM   0.9676  0.6881  0.5945  0.6095  0.6998  0.9348
             50        EPM      0.5452  0.3994  0.3514  0.3477  0.3876  0.3212
                  35   ModEPM   0.4844  0.3436  0.2988  0.3117  0.3577  0.5032
            100        EPM      0.3684  0.2663  0.2326  0.2323  0.2607  0.1589
                  70   ModEPM   0.3296  0.2337  0.2041  0.2143  0.2471  0.3336
 β̂          15        EPM      0.4478  0.4807  0.5080  0.6014  0.6650  0.8339
                  10   ModEPM   0.4344  0.4746  0.5097  0.6222  0.6996  0.8356
             50        EPM      0.2393  0.2568  0.2700  0.3089  0.3343  0.2415
                  35   ModEPM   0.2191  0.2363  0.2498  0.2925  0.3214  0.4047
            100        EPM      0.1634  0.1740  0.1822  0.2064  0.2212  0.1133
                  70   ModEPM   0.1495  0.1605  0.1693  0.1961  0.2133  0.2533

4.1 Properties of estimators

In this section, we compare the RMSE values of the modified EPM estimators and the original EPM estimators. Table 4.1 contains the results and they are summarized as follows:

For ξ, the RMSE values of the modified EPM estimators are smaller than those of the EPM estimators, indicating better accuracy, except when ξ = 2.0. For β, on the other hand, the modified EPM estimators perform better than the original EPM estimators when ξ = {−2.0, −1.0, −0.5}. When ξ = {0.5, 1.0}, the performance depends on the sample size n: when n = {50, 100}, the modified EPM estimates the true β more accurately. Also, the RMSE decreases as the sample size n increases, which suggests that the modified EPM estimators are consistent, as the EPM estimators are. We also observe that the RMSE gap between the two methods narrows as ξ increases. This indicates that the modified EPM performs considerably better than the EPM when ξ is not too large (e.g., ξ < 0.5). We give an example of the development of the RMSE values of $\hat{\xi}_{\mathrm{ModEPM}}$ and $\hat{\xi}_{\mathrm{EPM}}$, as well as of $\hat{\beta}_{\mathrm{ModEPM}}$ and $\hat{\beta}_{\mathrm{EPM}}$, in Figure 4.1.

4.2 Goodness of fit

We observe in Section 4.1 that, in general, the modified EPM estimators perform better than the original EPM estimators, but not in every case considered. Even in the remaining cases, this would be a minor problem if the modified EPM estimates gave a better fit to the data than the original EPM estimates. Thus, as the next step, we consider the average scaled absolute error (ASAE) of the two methods to judge the overall goodness of fit (see Castillo and Hadi (1997)). The ASAE is computed as:

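$$
\mathrm{ASAE} = \frac{1}{n}\sum_{i=1}^{n}\frac{|\hat{x}_{i:n} - x_{i:n}|}{x_{n:n} - x_{1:n}}, \tag{4.1}
$$

where

$$
\hat{x}_{i:n} = \frac{\hat{\beta}}{\hat{\xi}}\left[(1 - p_{i:n})^{-\hat{\xi}} - 1\right].
$$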
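A direct Python transcription of (4.1) is given below as a sketch. The function name `asae` is hypothetical. The plotting position p_{i:n} = (i − γ)/(n + α) with γ = 0 and α = 1 is assumed to be the same as in the EPM section, and the branch for ξ̂ = 0 uses the exponential quantile, which is an addition not stated in the text.

```python
import math


def asae(xs, xi_hat, beta_hat, gamma=0.0, alpha=1.0):
    """Average scaled absolute error (4.1) of a fitted GPD."""
    ordered = sorted(xs)
    n = len(ordered)
    spread = ordered[-1] - ordered[0]  # x_{n:n} - x_{1:n}
    total = 0.0
    for i, x in enumerate(ordered, start=1):
        p = (i - gamma) / (n + alpha)  # plotting position (assumed)
        if xi_hat == 0:
            fitted = -beta_hat * math.log1p(-p)
        else:
            fitted = beta_hat / xi_hat * ((1.0 - p) ** (-xi_hat) - 1.0)
        total += abs(fitted - x)
    return total / (n * spread)
```

For instance, with ξ̂ = −1 and β̂ = 1 the fitted quantiles equal the plotting positions, so the sample `[0.25, 0.5, 1.0]` has fitted values 0.25, 0.5 and 0.75 and an ASAE of 0.25 / (3 × 0.75) = 1/9.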

The results are contained in Table 4.2, and the following can be concluded:

Overall, the fit to the data with the ModEPM estimates is similar to, but better than, that with the EPM estimates. Also, the ASAE values drop as the sample size n increases, which again indicates the consistency of both methods.

Table 4.2: Simulated data: Average scaled absolute errors of the ModEPM and EPM estimators. Samples of size n are generated from the GPD with true parameter values ξ and β = 1.0.

                                        ξ
   n    m   Method     -2.0    -1.0    -0.5     0.5     1.0     2.0
  15        EPM      0.0658  0.0711  0.0733  0.0812  0.0969  0.1786
       10   ModEPM   0.0732  0.0692  0.0669  0.0666  0.0743  0.1137
  50        EPM      0.0310  0.0339  0.0339  0.0321  0.0365  0.0272
       35   ModEPM   0.0296  0.0303  0.0295  0.0267  0.0301  0.0214
 100        EPM      0.0211  0.0232  0.0230  0.0183  0.0193  0.0187
       70   ModEPM   0.0199  0.0208  0.0200  0.0159  0.0171  0.0166

4.3 Quantile estimation

As we state in Section 2.3, one crucial objective of GPD parameter estimation would be to obtain a more accurate estimate of the higher quantiles that lie in the tail of the loss distribution. In this section we present the simulation study results for the 99.5% quantile estimation using the EPM and the modified EPM. The estimated 99.5% quantile values and the corresponding RMSE values are presented in Table 4.3.

(21)

[Figure 4.1 (a): "RMSE of estimates for ξ, n=50". Horizontal axis: ξ; vertical axis: RMSE; curves: EPM and ModEPM. Panel caption: RMSE values of $\hat{\xi}_{\mathrm{ModEPM}}$ and $\hat{\xi}_{\mathrm{EPM}}$.]

[Figure 4.1 (b): "RMSE of estimates for β, n=50". Horizontal axis: ξ; vertical axis: RMSE; curves: EPM and ModEPM. Panel caption: RMSE values of $\hat{\beta}_{\mathrm{ModEPM}}$ and $\hat{\beta}_{\mathrm{EPM}}$.]

Figure 4.1: RMSE values of the modified EPM estimators and the original EPM estimators. The solid curve shows the RMSE values of the modified EPM estimators, while the dotted curve represents the RMSE of the original EPM estimators. The sample of size n = 50 is generated from the GPD with true parameter values ξ and β = 1.0.

Quantile estimation with the EPM and modified EPM estimators is stable and accurate when ξ ∈ {−2.0, −1.0, −0.5}, in the sense that the RMSE values of the estimates are fairly small. In these cases, the modified EPM shows outstanding performance compared to the original EPM. When ξ = 0.5, the quantile estimation becomes rather inaccurate, but the accuracy improves as the sample size n increases; even so, the modified EPM performs much better than the EPM. When ξ ∈ {1.0, 2.0}, the quality of estimation worsens seriously, which calls into question the practical use of both methods under these conditions.

Table 4.3: Simulated data: GPD 99.5% quantile estimates under different values of n and ξ, with β = 1.0. The RMSE of each estimate is given in parentheses after the estimated quantile value. The row labeled "True 99.5% quantile" gives the true GPD 99.5% quantile values.

  n    m   Method    ξ = -2.0             ξ = -1.0             ξ = -0.5
True 99.5% quantile  0.4999875            0.995                1.858579
 15        EPM       0.5753 (0.7145)      1.8809 (9.0709)      6.6627 (48.1185)
      10   ModEPM    0.5366 (0.3032)      1.3298 (1.8855)      3.6272 (8.5859)
 50        EPM       0.5087 (0.0433)      1.0823 (0.3472)      2.2931 (1.5448)
      35   ModEPM    0.5057 (0.0295)      1.0474 (0.2143)      2.1236 (0.9178)
100        EPM       0.5039 (0.0202)      1.0329 (0.1534)      2.0433 (0.6539)
      70   ModEPM    0.5024 (0.0133)      1.0188 (0.1028)      1.9792 (0.4457)

  n    m   Method    ξ = 0.5              ξ = 1.0              ξ = 2.0
True 99.5% quantile  26.28427             199                  19999.5
 15        EPM       842.6029 (7351.126)  40368.57 (368399.5)  986408444 (4964985638)
      10   ModEPM    427.1498 (2068.109)  37689.98 (382527.5)  35064741216 (178400476329)
 50        EPM       67.3477 (139.577)    943.2782 (3503.911)  564028.3 (2785459)
      35   ModEPM    55.7267 (92.5491)    770.9061 (2640.123)  627221.6 (4124371)
100        EPM       40.7806 (40.7424)    402.9828 (568.7335)  83521.52 (215222.9)
      70   ModEPM    37.3735 (30.9136)    367.6659 (471.2294)  78529.05 (204370.6)

Chapter 5

Application in Insurance

In Chapter 4, we conducted simulation studies to examine the properties and performance of the modified EPM estimates. In practice, a financial risk manager within an insurance company might apply the modified EPM to estimate the distributional properties of a heavy-tailed loss distribution. As an example of a real-world application of the ModEPM, we apply our method to a well-studied dataset in the field of insurance and extreme value theory, namely the Danish fire insurance data.

5.1 Danish fire loss data

The Danish fire loss data comprises 2156 fire insurance losses from 1980 to 1990 inclusive, in financial units of 1,000,000 Danish kroner. This dataset is well studied throughout the history of extreme value theory, as it exhibits features that seem to comply with an i.i.d. model, with some extreme loss values recorded. More details of the dataset follow in the appendix. A number of papers and articles, such as McNeil (1997), thoroughly report the findings from applying the techniques of extreme value theory to the dataset, and the dataset also appears in various texts on quantitative risk management (see McNeil et al. (2015)).

Table 5.1: Parameter estimates, ASAE values and VaR estimates. The GPD is fit to the 100 exceedances over the threshold u = 10.5. The methods used are MLE = maximum likelihood, MOM = method of moments, PWM = probability weighted moments, EPM = elemental percentile method and ModEPM = modified elemental percentile method.

Estimator   $\hat{\xi}$   $\hat{\beta}$   ASAE     $\widehat{\mathrm{VaR}}_{.99}$   $\widehat{\mathrm{VaR}}_{.995}$   $\widehat{\mathrm{VaR}}_{.999}$
MLE         0.47          7.58            0.0124   27.52                            40.36                             92.81
MOM         0.39          9.01            0.0154   29.39                            42.47                             90.82
PWM         0.51          7.24            0.0120   27.29                            40.47                             96.91
EPM         0.29          8.46            0.0163   26.74                            36.80                             69.52
ModEPM      0.31          8.38            0.0157   26.94                            37.42                             72.60

5.2 Estimation

Now we apply the ModEPM, along with the estimation methods mentioned in Chapter 2, to the Danish fire loss data. One of the key results of McNeil (1997) was that the GPD fits the 109 exceedances over the threshold u = 10 well. When we raise the threshold slightly to u = 10.5, the sample size becomes 100 without losing the good GPD fit.

Figure 5.1 shows the GPD fits of both the EPM and the ModEPM. Both methods exhibit similar and reasonable fits, although they tend to overestimate the excess distribution above a loss amount of 100 (mil DKK). We also include the GPD fit graph of the other three estimation methods considered. The parameter estimates, ASAE values and quantile estimates are given in Table 5.1. While the modified EPM performs better than the original EPM and similarly to the MOM, the MLE and PWM seem to give a better fit to the data.
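To illustrate how VaR estimates of the kind reported in Table 5.1 can be obtained from a fitted GPD, the sketch below uses the standard peaks-over-threshold tail estimator, VaR_α = u + (β/ξ)[((1 − α) N / N_u)^{−ξ} − 1], where N is the total number of losses and N_u the number of exceedances over u. This section does not state the exact estimator used for the table, so the formula is an assumption, and the function name `gpd_var` is hypothetical.

```python
def gpd_var(alpha, xi, beta, u, n_total, n_exceed):
    """VaR at level alpha from a GPD fitted to the excesses over threshold u.

    ASSUMPTION: standard peaks-over-threshold tail estimator (xi != 0);
    the estimator behind Table 5.1 is not restated in this section.
    """
    tail_fraction = n_exceed / n_total          # empirical P(X > u)
    return u + beta / xi * (((1.0 - alpha) / tail_fraction) ** (-xi) - 1.0)


# Rounded MLE estimates from Table 5.1: 100 exceedances of u = 10.5
# among the 2156 Danish fire losses.
for alpha in (0.99, 0.995, 0.999):
    var = gpd_var(alpha, xi=0.47, beta=7.58, u=10.5, n_total=2156, n_exceed=100)
    print(f"VaR at level {alpha}: {var:.2f}")
```

Under these assumptions the rounded MLE estimates give values close to the MLE row of Table 5.1. The remaining differences are consistent with the parameter estimates being reported to only two decimals, and they grow with α because the higher quantiles are more sensitive to ξ.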

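[Figure 5.1 (a): Fitted GPD for the ModEPM and the EPM. Horizontal axis: Losses (log scale); vertical axis: Empirical excess distribution. The parameter estimates are $\hat{\xi}_{\mathrm{EPM}} = 0.29$, $\hat{\beta}_{\mathrm{EPM}} = 8.46$ and $\hat{\xi}_{\mathrm{ModEPM}} = 0.31$, $\hat{\beta}_{\mathrm{ModEPM}} = 8.38$.]

[Figure 5.1 (b): Fitted GPD for the MLE, MOM and PWM. Horizontal axis: Losses (log scale); vertical axis: Empirical excess distribution. The parameter estimates are $\hat{\xi}_{\mathrm{MLE}} = 0.47$, $\hat{\beta}_{\mathrm{MLE}} = 7.58$; $\hat{\xi}_{\mathrm{MOM}} = 0.39$, $\hat{\beta}_{\mathrm{MOM}} = 9.01$; and $\hat{\xi}_{\mathrm{PWM}} = 0.51$, $\hat{\beta}_{\mathrm{PWM}} = 7.24$ (see Table 5.1).]

Figure 5.1: The empirical excess distribution (dots) vs. the fitted GPD to exceedances over the threshold u = 10.5.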

Chapter 6

Conclusion

In this paper, we propose a modified elemental percentile method (ModEPM) for estimating the parameters of the two-parameter generalized Pareto distribution (GPD). Among the available parameter estimation methods, the elemental percentile method (EPM) proposed by Castillo and Hadi (1997) is advantageous in that its estimators are guaranteed to exist for all values of ξ and β and have no convergence problems. Yet the EPM comes with two major disadvantages: it is computationally inefficient when the sample size n is large, and it takes inaccurate initial estimates into account.

The ModEPM addresses both problems by taking a stratified sampling approach: it samples the quantile pairs from which initial estimates with low root mean squared error (RMSE) are derived. In this way, the ModEPM uses fewer quantile pairs than the EPM while achieving better accuracy in the final estimates. We also consider the average scaled absolute error (ASAE) to measure the goodness-of-fit to the data. The ModEPM estimates generally fit better than the EPM estimates.

In the insurance context, under the current Solvency II framework, accurately estimating higher quantiles (e.g., the 99.5% quantile) of a loss distribution is of central importance. As the tail of a distribution can be modeled by the GPD, the parameter estimates should lead to accurate quantile estimates. We compare the ModEPM and the EPM by assessing the RMSE of the 99.5% quantile estimates. The ModEPM attains the lower RMSE in every case with ξ ≤ 0.5, whereas for ξ ∈ {1.0, 2.0} neither method is uniformly better (Table 4.3). Moreover, when ξ ∈ {1.0, 2.0} the performance of both the EPM and the ModEPM is extremely poor, so one might consider a different estimation method, such as maximum likelihood. When ξ = 0.5, the performance depends strongly on the sample size n: the accuracy improves considerably as n approaches 100. A demonstration of the ModEPM on real-world heavy-tailed loss data also shows that the ModEPM could be a better alternative to the EPM. Its performance, measured by the ASAE, suggests that the ModEPM is comparable to the MOM.

Therefore, based on the results from our extensive simulation studies and those by Castillo and Hadi (1997), we suggest the following as a guideline for when the ModEPM should be preferred to the EPM:

1. If it is believed that −2.0 < ξ < 0, use the ModEPM over the original EPM.
2. If the sample size n is large (e.g., n = 100) and there is some evidence that 0 < ξ ≤ 0.5, use the ModEPM.

Appendix: Description of the Danish fire insurance dataset

Here, we present some details on the Danish fire insurance dataset used in Chapter 5. First, a visualization of the data is presented in Figure 6.1, and some descriptive statistics are given in Table 6.1.

[Figure 6.1 (a): Time series plot. Horizontal axis: year, 1980 to 1990; vertical axis: Loss (mil DKK).]

[Figure 6.1 (b): Histogram. Horizontal axis: Loss (mil DKK); vertical axis: Frequency.]

Figure 6.1: Time series plot and histogram of the Danish fire insurance dataset. One can observe extreme fire losses that have occurred throughout the time period.

Table 6.1: Descriptive statistics of the Danish fire loss data.

Statistic   Value    Unit
Mean        3.385    mil DKK
Variance    72.38    (mil DKK)^2
Skewness    18.74    dimensionless

Bibliography

Casella, G. and Berger, R. L. (2002). Statistical inference, volume 2. Duxbury Pacific Grove, CA.

Castillo, E. and Hadi, A. S. (1997). Fitting the generalized Pareto distribution to data. Journal of the American Statistical Association, 92(440):1609–1620.

Cochran, W. G. (2007). Sampling techniques. John Wiley & Sons.

Embrechts, P., Klüppelberg, C., and Mikosch, T. (2013). Modelling extremal events: for insurance and finance, volume 33. Springer Science & Business Media.

Grimshaw, S. D. (1993). Computing maximum likelihood estimates for the generalized Pareto distribution. Technometrics, 35(2):185–191.

Hosking, J. R. and Wallis, J. R. (1987). Parameter and quantile estimation for the generalized Pareto distribution. Technometrics, 29(3):339–349.

Jocković, J. (2012). Quantile estimation for the generalized Pareto distribution with application to finance. Yugoslav Journal of Operations Research, 22(2):297–311.

Kotz, S. and Johnson, N. L. (1982-1988). Encyclopedia of Statistical Sciences: Vol. 1-9. John Wiley & Sons.

McNeil, A. J. (1997). Estimating the tails of loss severity distributions using extreme value theory. ASTIN Bulletin, 27(1):117–137.

McNeil, A. J., Frey, R., and Embrechts, P. (2015). Quantitative risk management: Concepts, techniques and tools. Princeton University Press.

Pickands III, J. (1975). Statistical inference using extreme order statistics. The Annals of Statistics, pages 119–131.

Walther, B. A. and Moore, J. L. (2005). The concepts of bias, precision and accuracy, and their use in testing the performance of species richness estimators, with a literature review of estimator performance. Ecography, 28(6):815–829.
