
MSc in Econometrics Thesis

Financial Econometrics

Challenging the Challenger Model

Author: Marthe van der Klein (student number 10002001)
Supervisor: prof. dr. H.P. Boswijk
Second reader: dhr. dr. N.P.A. van Giersbergen
Supervisors DNB: F.S. Armandillo, P.C. de Rijke


Contents

1 Introduction
  1.1 Background
  1.2 Research Question
2 Credit Risk
  2.1 Unexpected Loss
    2.1.1 Probability of Default
    2.1.2 Loss Given Default
  2.2 Expected Loss
  2.3 Challenger Model
3 The Challenger Model
  3.1 Data
  3.2 PI - Base scenario
  3.3 CR - Base scenario
  3.4 LGL - Base scenario
4 Challenging the Challenger Model
  4.1 PI - Alternative scenarios
    4.1.1 Alternative scenario I
    4.1.2 Alternative scenario II
  4.2 CR - Alternative scenarios
    4.2.1 Alternative scenario I
    4.2.2 Alternative scenario II
    4.2.3 Alternative scenario III
  4.3 LGL - Alternative scenarios
    4.3.1 Alternative scenario I
    4.3.2 Alternative scenario II
5 Provision Results
  5.1 Base scenario
  5.2 Alternative scenario
6 Conclusion
7 Literature
Appendices
  A Glossary
  B Data Loan tape
  C Historical Data

List of Figures

1.1 Phase 2 AQR: work blocks
2.1 Loss distribution
3.1 Alignment Loan tapes
3.2 Cure rate fitted on the Weibull CDF
3.3 Expected Recoveries
4.1 LTV buckets
4.2 LIR buckets
4.3 ROC curve PI model
4.4 Cure rate, cut-off point 24 months
4.5 Cure rate, cut-off point 24 months and no fall-back parameters
4.6 Cure rate fitting

List of Tables

3.1 PI - Base scenario
3.2 Example of the transition matrix
3.3 Sales ratio
3.4 LGL per LTVI bucket
4.1 LTV buckets alternative scenario I
4.2 PI output alternative scenario I
4.3 PI - Logit model regression output
4.4 LTV buckets alternative scenario I
4.5 LGL - output scenario II
5.1 Provision per alternative scenario


Chapter 1: Introduction

1.1 Background

The 2008 financial crisis was the worst economic disaster since the Great Depression of 1929. A slowdown in the US economy caused, among other things, American homeowners to default on their mortgages. Banks all over the world with investments linked to those subprime mortgages started losing money. In an attempt to stop European banks from failing, governments came to the rescue in many EU countries. As Europe fell into recession in 2009, a problem that started in the banks began to affect governments more and more, as markets worried that some countries could not afford to rescue banks in trouble; the situation thus developed into a sovereign debt crisis.

The Eurozone crisis showed the potentially vicious circle between banks and sovereign debt. The need for a monetary union for a single currency to work in the long run became clear. A banking union should lead to a single centralized mechanism for the supervision and restructuring of banks. The Capital Requirements Regulation (CRR) and the Capital Requirements Directive IV (CRD IV) are primary binding legislation implementing Basel III standards. This is the legal foundation of the Banking Union and will ensure uniform application of the Basel III measures in all Member States. The Single Supervisory Mechanism (SSM) places the European Central Bank (ECB) as the central prudential supervisor of financial institutions in the Eurozone and in those non-Euro EU countries that choose to join the SSM. The ECB directly supervises the largest banks, while the national supervisors continue to monitor the remaining banks. The Single Resolution Mechanism (SRM) applies to banks covered by the SSM. If a bank faces serious difficulties, the Single Resolution Board will restructure the institution in order to preserve financial stability and restore the viability of that institution. This will be done with minimal cost to taxpayers and the real economy through a Single Resolution Fund, financed by the banking sector.

In preparation for supervising 130 banks in the Eurozone, the ECB conducted a comprehensive assessment between November 2013 and October 2014. The aims were to enhance the quality of the information available, to identify problems and implement the necessary actions, and to ensure that the banks are fundamentally sound and trustworthy (ECB, 2013).

The comprehensive assessment comprised three pillars. The first pillar was a risk assessment, in which the risk profile of each bank was determined by assessing key risks in the banks' balance sheets. The second pillar was an asset quality review (AQR), in which the asset side of the banks' balance sheets as at 31 December 2013 was examined. This was conducted based on harmonised definitions, which made all participating banks comparable. The last pillar was a stress test, the forward-looking view of the banks' shock-absorption capacity under stress. Together, these three pillars formed an in-depth review of the banks' balance sheets.


Zooming in on the AQR, it was composed of two phases. The first phase involved the portfolio selection of the banks and the second phase comprised nine work blocks, displayed in the figure below.

Figure 1.1: Phase 2 AQR: work blocks (Source: European Central Bank, 2014)

Phase 2 started with a review of the banks' policies, processes and accounting practices. Work block 2 involved data checks on the loan tapes provided by the banks. These loan tapes are files containing every loan per portfolio. The loan tape included basic account information such as identifiers of the loan, missed-payments status, product type and geographical information. Work block 3 was the sampling. Given the time and volume of the AQR, not all exposures and portfolios could be reviewed. Therefore, sampling was conducted such that the sample was both large enough and representative enough. Work block 4, the credit file review, verified the correct classification of each exposure in the banks' systems. Work block 5 took care of the collateral and real estate valuation. Work block 6 projected the findings of work block 4. Projection of findings was applied to homogeneous pools of exposure within each portfolio. Work block 7 was the collective provisioning analysis. The provision levels of banks were verified using a statistical model and compared with the banks' practices. The model used to estimate provisioning levels is called the Challenger Model (CM). The Challenger Model is the subject of this thesis. Work block 8 re-evaluated banks with a specific type of exposure. Work block 9 was a join-up of the aforementioned work blocks; all AQR adjustments were combined to adjust the common equity tier 1 (CET 1) capital ratio. This was used as input for the stress test, pillar three of the comprehensive assessment.

1.2 Research Question

The Challenger Model is a relatively simple statistical model that was used to estimate provisioning levels for all asset classes of banks during the AQR. Assumptions underlying the methodology of the ECB and choices made during calibration of the model could lead to a different outcome. In this thesis, the sensitivity of the Challenger Model will be investigated. The Challenger Model will be built based on the ECB methodology, and parts of the framework will be replaced or adjusted to assess the sensitivity of the outcome.


Since a new AQR could be performed in the future, it is interesting to investigate how sensitive the Challenger Model is. This will be investigated on a residential real estate (RRE) portfolio of a large Dutch bank. The same loan tapes used in the AQR will be used in this thesis: the loan tapes of 31 December 2012 and 31 December 2013.

The remainder of the document is organized as follows. First, an introduction to credit risk will be given. The terms unexpected loss and expected loss will be introduced, together with the way banks can use credit risk models to cover these losses. Chapter 3 covers the Challenger Model. In that chapter, the data will be explained and then each of the parameters of the model will be examined separately. Alternative scenarios for each of the parameters of the Challenger Model are discussed in Chapter 4. The results of the base case and the alternative scenarios are described in Chapter 5. Finally, the conclusion on the alternative models is drawn in Chapter 6.

Chapter 2: Credit Risk

According to Basel III, banks are required to hold a minimum amount of capital based on their exposure to credit risk, market risk and operational risk. Credit risk is the main risk for banks and is defined as the possibility that a bank, borrower or counterparty may fail to meet its obligations in accordance with the agreed terms. Losses can be divided into two types: expected losses (EL) and unexpected losses (UL). EL cover average economic situations: the average level of credit losses a bank can reasonably expect to experience and can therefore forecast. UL are the losses above the expected levels. The loss distribution divided over EL and UL is displayed in the figure below.

Figure 2.1: Loss distribution

2.1 Unexpected Loss

Unexpected losses are covered by risk-based own funds capital requirements. The own funds capital requirement, hereafter regulatory capital, is obtained from the risk weighted exposure (RWE). Every exposure class has a different risk weight, since different types of assets have different risk profiles.

RWE = RW × exposure value

The risk weight (RW) can be calculated using one of two approaches: the standard approach (SA) or the internal rating based approach (IRB). According to the SA, banks have to divide their credit exposure into classes based on observable characteristics of the exposures, corporate loans versus mortgages for example. For all classes, a fixed risk weight is determined in the regulation. Under the IRB approach, a risk weight formula is prescribed for each type of exposure. The RW formula for retail exposures, including residential real estate exposures, is based on the Vasicek model (see below):

RW = \left[ \mathrm{LGD} \cdot \Phi\!\left( \frac{1}{\sqrt{1-\rho}}\,\Phi^{-1}(\mathrm{PD}) + \sqrt{\frac{\rho}{1-\rho}}\,\Phi^{-1}(0.999) \right) - \mathrm{LGD} \cdot \mathrm{PD} \right] \times 12.5 \times 1.06

according to Article 154 of the CRR, where

• PD is the long run average probability of default

• LGD is the loss given default in a downturn situation, the conditional expectation of loss given that default already has occurred.

• ρ is the coefficient of correlation of the default occurrence with a systematic risk factor
• Φ is the cumulative distribution function of a standard normal random variable.

The first part of the formula represents the loss in a situation which arises with a statistical probability of 1 in 1000. The second part, −LGD × PD, adjusts the potential loss for the expected loss under the current circumstances. The last part, 12.5 × 1.06, translates the capital requirement into a risk weight. It is chosen such that the total capital level in the banking system is expected to remain stable.
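As an illustration, the retail risk weight can be evaluated directly once PD, LGD and ρ are given. The following sketch is not part of the AQR or the thesis; it merely restates the formula above in Python, with scipy supplying Φ and Φ⁻¹ and with purely illustrative parameter values.

```python
# Minimal sketch of the retail IRB risk-weight formula quoted above (illustrative, not thesis code).
from scipy.stats import norm

def retail_risk_weight(pd_: float, lgd: float, rho: float) -> float:
    """Risk weight following the formula above; pd_, lgd and rho are assumed to be given."""
    # Conditional PD at the 99.9th percentile of the systematic factor
    stressed_pd = norm.cdf(
        norm.ppf(pd_) / (1 - rho) ** 0.5
        + (rho / (1 - rho)) ** 0.5 * norm.ppf(0.999)
    )
    # Subtract the expected loss and scale the capital requirement into a risk weight
    return (lgd * stressed_pd - lgd * pd_) * 12.5 * 1.06

# Example with illustrative values: PD = 1%, downturn LGD = 20%, rho = 0.15
print(retail_risk_weight(0.01, 0.20, 0.15))
```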

IRB banks develop their own models for the PD and LGD parameters. In the next sections, the PD and LGD will be explained more thoroughly with some modeling concepts.

2.1.1 Probability of Default

The amount of capital necessary to support a portfolio depends on the loss distribution of the portfolio, which is described by the Vasicek model (Vasicek, 2002). The Vasicek model uses a geometric Brownian motion for asset values. Let A_i be the value of the ith borrower's assets. The process of A_i is

dA_i = \mu_i A_i \, dt + \sigma_i A_i \, dW_i,

where W_i is a Brownian motion, and the asset value at time t = T can be represented as

\log A_i(T) = \log A_i(0) + (\mu_i - \tfrac{1}{2}\sigma_i^2)T + \sigma_i W_i(T).

Let W_i(T) = X_i \sqrt{T}. The variables X_i are jointly standard normal with equal pairwise correlations ρ and can therefore be represented as

X_i = Y\rho + Z_i\sqrt{1-\rho^2},

where Y, Z_1, ..., Z_n are mutually independent standard normal variables. The variable Y can be interpreted as a portfolio of common factors, such as an economic index. The term Yρ is then the company's exposure to the common factor and the term Z_i\sqrt{1-\rho^2} represents the company-specific risk.

When the asset value falls below a certain threshold at the loan maturity T, the loan will be in default. In other words, when the asset value A_i falls below the contractual value B_i, or equivalently when the standard normal variable X_i falls below the corresponding threshold c_i, the borrower is in default. The probability of this for loan i is

p_i = P[A_i(T) < B_i] = P[X_i < c_i] = \Phi(c_i),

with Φ(c_i) the cumulative normal distribution function evaluated at c_i and the threshold c_i equal to

c_i = -\frac{\log\frac{A_i(0)}{B_i} + (\mu_i - \tfrac{1}{2}\sigma_i^2)T}{\sigma_i\sqrt{T}}.

Now consider a portfolio of n loans with the probability of default of each loan equal to p_i = p and the correlation between two loans equal to the coefficient ρ. Assume a maturity of T for all loans in the portfolio. The loss on the ith loan is expressed as L_i, with L_i = 1 when the loan is in default and L_i = 0 otherwise. The percentage loss of the portfolio is

L = \frac{1}{n}\sum_{i=1}^{n} L_i.

When the common factor Y is fixed, the conditional probability of loss on any loan is

p(Y) = P[L_i = 1 \mid Y] = \Phi\!\left( \frac{\Phi^{-1}(p) - Y\rho}{\sqrt{1-\rho^2}} \right),

where p(Y) equals the default probability of a loan under the given scenario Y, and p equals the unconditional default probability, i.e. the average of the conditional probabilities over the scenarios.

Conditional on the value of Y, the variables L_i are independent and identically distributed with a finite variance. The portfolio loss conditional on Y therefore converges, by the law of large numbers, to its expectation p(Y) as n goes to infinity. Then

P[L \le x] \to P[p(Y) \le x] = P[Y \ge p^{-1}(x)] = \Phi(-p^{-1}(x)).

On a very large portfolio, in the limit this is equal to

P[L \le x] = \Phi\!\left( \frac{\sqrt{1-\rho^2}\,\Phi^{-1}(x) - \Phi^{-1}(p)}{\rho} \right).

In other words, the portfolio loss distribution given by the cumulative distribution function is

F(x, p, \rho) = \Phi\!\left( \frac{\sqrt{1-\rho^2}\,\Phi^{-1}(x) - \Phi^{-1}(p)}{\rho} \right).
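The loss distribution above is straightforward to evaluate numerically. The sketch below is only an illustration of the derivation in this section; it keeps the thesis's parameterisation X_i = ρY + √(1−ρ²) Z_i and uses arbitrary example values for p and ρ.

```python
# Sketch of the conditional PD and the Vasicek portfolio loss CDF derived above.
from scipy.stats import norm

def conditional_pd(p: float, rho: float, y: float) -> float:
    """Default probability of a single loan given the common factor Y = y."""
    return norm.cdf((norm.ppf(p) - rho * y) / (1 - rho ** 2) ** 0.5)

def loss_cdf(x: float, p: float, rho: float) -> float:
    """F(x, p, rho): probability that the portfolio loss fraction stays below x."""
    return norm.cdf(((1 - rho ** 2) ** 0.5 * norm.ppf(x) - norm.ppf(p)) / rho)

# Example values: unconditional PD of 2% and factor loading rho = 0.3
print(conditional_pd(0.02, 0.3, y=-2.0))   # conditional PD in a bad scenario
print(loss_cdf(0.05, 0.02, 0.3))           # P[L <= 5%]
```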

Logit

A common statistical method to estimate the PD is a logit model. The logistic regression uses as dependent variable a binary variable that takes the value 1 if a borrower defaulted in the observation period and zero otherwise. The independent variables are all parameters potentially relevant to credit risk.

We have n independent observations y_1, ..., y_n in the portfolio, and the ith observation can be treated as the realisation of a Bernoulli random variable that equals one with probability π_i. Now suppose the logit of the underlying probability π_i is a linear function of the predictors,

\mathrm{logit}(\pi_i) = \log\frac{\pi_i}{1-\pi_i} = x_i'\beta,

where x_i is a vector of covariates and β is a vector of regression coefficients. Solving for the probability π_i gives the following model:

\pi_i = \frac{\exp(x_i'\beta)}{1+\exp(x_i'\beta)}.

For retail portfolios, financial ratios measuring payment behaviour have the most statistical power in differentiating defaulted from non-defaulted loans.
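As a small illustration of this approach, the sketch below fits a logit model on simulated loan data with statsmodels; the column names and the data-generating process are invented for the example and are not the thesis's data.

```python
# Minimal sketch of a logit PD regression (hypothetical column names, simulated data).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
loans = pd.DataFrame({
    "months_past_due": rng.integers(0, 4, size=5_000),
    "ltv": rng.uniform(0.2, 1.4, size=5_000),
})
# Simulated default flag: worse payment behaviour and higher LTV raise the default odds
linear_index = -5 + 1.2 * loans["months_past_due"] + 1.5 * loans["ltv"]
loans["default"] = rng.uniform(size=5_000) < 1 / (1 + np.exp(-linear_index))

X = sm.add_constant(loans[["months_past_due", "ltv"]])
model = sm.Logit(loans["default"].astype(int), X).fit(disp=0)
print(model.params)          # estimated coefficients beta
print(model.predict(X)[:5])  # fitted probabilities pi_i
```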

Expert

Another way to estimate the PD is with an expert model (Medema et al., 2009). Expert models attempt to use past experience to evaluate the future creditworthiness of a potential borrower. Credit experts choose relevant creditworthiness factors and their weights based on experience. In classic rating questionnaires, the credit experts define clearly answerable questions regarding factors relevant to creditworthiness and assign a fixed number of points to specific questions. Based on the information from the questionnaire and the expert opinion, a minimum PD for a particular group of loans can then be set.

Neural Networks

Artificial neural networks have been introduced to develop more objective expert systems. Both neural networks and expert models use historic repayment experience and default data. Neural networks are 'trained' using these data and are therefore more objective than expert models. In neural networks, structural matches are found that coincide with defaulting firms and are then used to determine a weighting scheme to forecast the PD. Each time the neural network evaluates the credit risk of a new loan opportunity, it updates its weighting scheme so that it continually 'learns' from experience; this is why the network is said to be 'trained'.

Neural networks use information technology in an attempt to simulate the complicated way in which the human brain processes information. In each stage, hidden correlations among the explanatory variables are identified, making the procedure a black-box model that suffers from a lack of transparency. Since no clear economic interpretation can be attached to the hidden intermediate steps, the system cannot be checked for plausibility and accuracy. Structural errors will not be detected until the PD estimates become noticeably inaccurate (Allen et al., 2004).

The models mentioned above are rarely used in their pure form. Even though statistical and causal models are generally seen as better rating procedures, the inclusion of credit experts' knowledge generally improves ratings. Experts' knowledge can be included in the form of overrides (Einarsson, 2008). The rating obtained from the statistical model is then altered by the expert. This should only be done if it is considered necessary; otherwise the model itself should be questioned.


2.1.2 Loss Given Default

The loss given default is the conditional expectation of loss, given that default already occurred. Depending on what happens after a loan goes into default, the loss can be determined. There are several scenarios which could occur after a counterparty goes into default. The bank could contact the debtor for a re-evaluation of the loan whereby the debtor would have to pay a slightly higher interest rate on the remaining loan but would have lower and more manageable monthly repayment amounts. The bank could also decide to sell the loan to a separate company which works specifically towards the collection of repayments from defaulted loans or repossess the property (enter foreclosure) and sell it to cover losses.

The LGD formula can be expressed as:

LGD = (1 − P [R]) × Loss

where

• P[R] is the probability of recovery, i.e. the probability that the loan will go back to performing without loss, with R ∈ [0, 1].

• Loss is the loss of the loan when foreclosing.

There are no regulatory requirements on the risk drivers used in mortgage LGD models. However, most banks use loan to value, collateral value and the presence of the ’Nationale Hypotheek Garantie’ which is a mortgage indemnity guarantee in the Netherlands.

Bastos (2009) investigated how to model the probability of recovery. Using a parametric fractional response regression, where the dependent variable is bounded on the unit interval, he used the following functions:

E[R \mid x] = G(x'\beta) =
\begin{cases}
\dfrac{\exp(x'\beta)}{1+\exp(x'\beta)} & \text{(logistic function)} \\
\exp(-\exp(-x'\beta)) & \text{(log-log function)}
\end{cases}

where x are the characteristics of the loan. Bastos used the log-log functional form, but found similar conclusions with the logistic functional form. This is one possible way of modelling recovery, not necessarily the most common method.
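A fractional response regression of this kind can be estimated by Bernoulli quasi-maximum likelihood. The sketch below illustrates this for the log-log link on simulated recoveries; the data and the single covariate are placeholders, not Bastos's data or the thesis's portfolio.

```python
# Sketch of a fractional response regression for recoveries with a log-log link,
# estimated by Bernoulli quasi-maximum likelihood (in the spirit of Bastos, 2009).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 2_000
X = np.column_stack([np.ones(n), rng.uniform(0, 2, n)])   # intercept + one loan characteristic
true_beta = np.array([0.3, 0.8])
R = np.clip(np.exp(-np.exp(-X @ true_beta)) + rng.normal(0, 0.1, n), 0, 1)  # recoveries in [0, 1]

def loglog_mean(beta, X):
    return np.exp(-np.exp(-X @ beta))                      # E[R | x] = exp(-exp(-x'beta))

def neg_quasi_loglik(beta, X, R):
    g = np.clip(loglog_mean(beta, X), 1e-10, 1 - 1e-10)
    return -np.sum(R * np.log(g) + (1 - R) * np.log(1 - g))  # Bernoulli quasi-likelihood

fit = minimize(neg_quasi_loglik, x0=np.zeros(2), args=(X, R), method="BFGS")
print(fit.x)   # estimated beta, roughly recovering true_beta in this simulation
```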

Most LGD models share the same shortcoming, namely that LGD and PD are estimated independently (Hlawatsch & Reichling, 2010). The Basel Committee on Banking Supervision requires banks to consider the extent of any dependence between the risk of the borrower and that of the collateral.

Downturn LGD

To calculate the capital requirement, banks need to use the downturn LGD. The downturn LGD is an estimate of the losses the bank will incur on defaults that start in the coming year, under the assumption that an economic downturn scenario will unfold. Some models have a fixed downturn factor with which the LGD is multiplied. A more sophisticated approach is to stress several input parameters separately. This could be a scenario with a sovereign debt crisis or a mortgage crisis, for example.


2.2 Expected Loss

Expected losses are covered by the provision of the bank and are calculated as follows:

EL = PD × LGD × EAD

with EAD the exposure at default.

Adjusted versions of the PD and LGD from the bank's IRB model are used in the EL model. The PD in the UL model is defined as an average, through-the-cycle (TTC) PD. The horizon of a TTC PD is long enough for business-cycle effects to mostly wash out, a period of five or more years. This PD is adjusted to a point-in-time (PIT) PD for the EL model. The PIT PD is the PD measured over a short horizon, typically a year or less (Aguais et al., 2004).

Moreover, the LGD in the UL model is estimated for a downturn situation. Estimating the LGD for a downturn situation reflects the relevant economic conditions where necessary: the forecasts of recovery rates on exposures that default are based on periods where credit losses are expected to be substantially higher than average (Basel Committee on Banking Supervision, 2005). This makes the LGD more conservative than a PIT estimate of the LGD.

Comparing the expected loss amount with the total amount of provisions of the bank produces a 'shortfall' if the expected loss amount exceeds the total provision amount, and an 'excess' otherwise. A shortfall must be deducted from capital, while an excess provision is eligible as an element of capital.
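The comparison can be summarised in a few lines; the numbers below are purely hypothetical.

```python
# Toy illustration of the expected-loss and shortfall/excess comparison described above.
def expected_loss(pd_: float, lgd: float, ead: float) -> float:
    return pd_ * lgd * ead

el = expected_loss(pd_=0.02, lgd=0.25, ead=200_000)   # hypothetical loan
provision = 800.0                                     # hypothetical provision held

shortfall = max(el - provision, 0.0)    # must be deducted from capital
excess = max(provision - el, 0.0)       # eligible as an element of capital
print(el, shortfall, excess)
```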

2.3 Challenger Model

The Challenger Model is a collective provision model, with parameters comparable to the parameters in the EL model. The Challenger Model is based on the following formula:

\text{Collective Provision} =
\begin{cases}
\mathrm{PI} \times (1-\mathrm{CR}) \times \mathrm{LGL} \times \mathrm{EAD} & \text{for performing loans} \\
(1-\mathrm{CR}) \times \mathrm{LGL} \times \mathrm{EAD} & \text{for non-performing loans}
\end{cases}

with PI the probability of impairment, CR the cure rate, LGL loss given loss and EAD exposure at default.

PI is based on the simplified approach definition of the European Banking Authority (EBA). This covers loans which are impaired, defaulted or more than 90 days past due. According to the CRR, a loan is defaulted when the obligor is unlikely to pay, for example when there is a history of debt problems, and/or when the loan is more than 90 or 180 days past due, depending on the default definition the bank is allowed to use. A loan is impaired, according to IAS 39, when a loss event impacts future cash flows. A loss event is, for example, financial difficulties of the borrower. For performing exposures, the PI is reduced by the ratio between the bank's emergence period in months and 12 months. The emergence period is the amount of time between the event of loss and the observation of the loss. The simplified approach default definition is thus the combination of defaulted and impaired loans. In other words, the PI is defined as the probability that a loan will be in default and impaired within a certain period, whereas the PD is the probability that a loan will be in default within a certain period.

CR is the long-term likelihood of an impaired loan returning to the performing state. Loans that are not past due and carry no risk of non-repayment are performing. LGL is the level of loss, after discounted recoveries such as collateral, that can be expected when a loan is in default. The product of (1-CR) and LGL is comparable to the LGD of the EL model.
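For a single loan the formula reduces to a one-line calculation. The sketch below only restates the formula above; the PI, CR, LGL and EAD values are hypothetical.

```python
# Sketch of the collective-provision formula per loan (hypothetical inputs).
def collective_provision(performing: bool, pi: float, cr: float, lgl: float, ead: float) -> float:
    if performing:
        return pi * (1 - cr) * lgl * ead
    return (1 - cr) * lgl * ead          # for non-performing loans the PI term drops out

print(collective_provision(True, pi=0.0147, cr=0.35, lgl=0.17, ead=250_000))
print(collective_provision(False, pi=1.0, cr=0.35, lgl=0.17, ead=250_000))
```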


Chapter 3: The Challenger Model

The Challenger Model is based on the PI, the CR and the LGL. The method to obtain each of these components, and the data needed, will be discussed in this chapter. The formulas of the parameters will be derived and the possible segmentations are discussed.

3.1 Data

This thesis is based on the residential real estate portfolio of a large Dutch bank. Two loan tapes are available, a loan tape of December 2012 and one of December 2013. The loan tape of December 2012, t = 0, and that of December 2013, t = 1, are aligned into one file. A part of the loans existing at t = 0 but missing at t = 1 can be found in the write-off list. The write-off list consists of the loans that went into foreclosure between t = 0 and t = 1. The rest of the missing loans could be located somewhere else in the bank's systems, for example by migrating into a different portfolio, but a large part of these missing loans is assumed to be missing due to data issues at the bank.

Moreover, there are loans included in the t = 1 loan tape but excluded from the t = 0 loan tape. These could be new loans starting between t = 0 and t = 1 or loans migrated from another portfolio into the residential real estate portfolio. This is not a problem, since the model will be calibrated on the merged loan tape at t = 0, from which these new loans at t = 1 are excluded. Once built and calibrated, the model will be applied to the whole loan tape of t = 1, and thus the new loans will then be included.


Looking more closely at the loan tape of t = 0, twenty loans occur twice. It is not clear whether this is a result of constructing the loan tape from the internal data system of the bank or a problem of the bank's data management. Therefore, these loans were excluded. The write-off list also included 9 loans multiple times. In this case it is assumed to be a mistake made when the list was constructed, since it is a separate, much smaller list. Therefore, the duplicate loans are removed.

Missing data also occur in the loan-specific information. For example, the region of the house backing the loan can be missing. This information is used to index the collateral value. The missing region is resolved by using an index value based on the country instead of the region. More difficult issues arise when the collateral value at the last appraisal is left blank: for five thousand loans the collateral value at the last appraisal at t = 0 is not available. For some missing data, assumptions have been used to fill in the missing values. For example, the date of the last appraisal is missing for some loans. Since the CRR requires banks to re-appraise the collateral of the loans every three years, an appraisal date of January 2012 is used instead. Appendices B and C give more information on the data included in the loan tape.

3.2 PI - Base scenario

The PI calculation is based on the status of the loan at t = 0 and t = 1. The status could be performing, which implies the loan is less than 90 days past due, or the status could be non-performing, which implies more than 90 days past due, defaulted and impaired. Each loan receives a NPE flag, a non-performing flag, equal to 1 or 0. When a loan has a performing status at t=0 and non-performing at t=1, the NPE flag is equal to 1. When a loan has a performing status at t = 0 and at t = 1, the NPE flag is equal to 0. Also, the NPE flag equals 1 when a loan is performing at t = 0 and t = 1, but non-performing between t = 0 and t = 1. The loans included on the write-off list will also get a flag equal to one. These loans were included on the loan tape at t=0 but not on the loan tape at t = 1. So a part of these loans were foreclosed and therefore not included in the loan tape at t = 1. The other possible type, a loan with a non-performing status at t = 0, is already in default and therefore the PI equals 1.

The NPE flag equals:

• 1 if the loan is performing at t = 0 and non-performing at t = 1;
• 1 if the loan is performing at t = 0 and at t = 1, but non-performing between t = 0 and t = 1;
• 1 if the loan is included in the write-off list;
• 0 if the loan is performing at t = 0 and at t = 1.

These types of loans with their statuses form the basis to calculate the PI. The horizon of the PI calculation is thus 12 months, between t = 0 and t = 1. The PI calculation will be explained below.
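A minimal sketch of this flag assignment on an aligned loan tape could look as follows; the column names and the four example loans are hypothetical, not the bank's loan tape.

```python
# Sketch of the NPE-flag assignment described above, on a hypothetical aligned loan tape.
import pandas as pd

tape = pd.DataFrame({
    "loan_id":     [1, 2, 3, 4],
    "np_t0":       [False, False, False, True],    # non-performing at t = 0
    "np_t1":       [False, True, False, True],     # non-performing at t = 1
    "np_between":  [False, False, True, False],    # non-performing between t = 0 and t = 1
    "in_writeoff": [False, False, False, False],   # included in the write-off list
})

performing_t0 = ~tape["np_t0"]
tape["npe_flag"] = (
    performing_t0 & (tape["np_t1"] | tape["np_between"] | tape["in_writeoff"])
).astype(int)
# Loans already non-performing at t = 0 simply get PI = 1 and are not part of this calibration
print(tape)
```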

For the PI calculation, the distinction between performing and non-performing loans at t = 0 is made. The performing loans at t = 0 are divided into groups. The loans are classified in segments, so that each group of loans forms a homogeneous group for which the PI calculation is done.


The segments are based on:

• Loan to Value
• Risk type
• Product type
• Channel

The loan to value (LTV) is the ratio of the loan divided by the value of the collateral provided for the loan. The higher the LTV, the riskier the loan. In the base scenario, the LTV ratios are divided over the following five buckets: 0-60%, 60-80%, 80-100%, 100-120% and more than 120%.

The risk type segments in the base scenario are based on two flags, namely the current status flag and the cured flag. The current status flag equals:

• Default: NPE = 1 (non-performing) or Internal NPE = 1 (non-performing in the bank's internal system);
• High risk: NPE = 0 and Internal NPE = 0 (performing in both systems), and the loan is more than 15 days past due or the LTV ratio is higher than 500%;
• Normal risk: otherwise.

The cured flag equals:

• 1 if the current status flag ≠ Default and NPE in the last 12 months = 1;
• 0 if the current status flag ≠ Default and NPE in the last 12 months = 0.

Combining the current status flag and the cured flag, the following risk segments can be formed:

• High risk cured: current status flag = High risk and Cured = 1
• High risk: current status flag = High risk and Cured = 0
• Normal cured: current status flag = Normal and Cured = 1
• Normal: current status flag = Normal and Cured = 0
• Default: current status flag = Default

The segment product type is the exact product of the loan, for example an 'Aflossingsvrije hypotheek', which is an interest-only mortgage.

The segment channel is the channel through which the application for the loan was made. This could be directly at the bank, through a broker (third party), via self-service or at a branch, for example. In the base scenario, the distinction is made between broker and otherwise.

The PI can be calculated for the whole portfolio of performing loans, for the different segmentations and for combinations of those segmentations. We expect a monotone relationship over the different segmentation classes; for example, a higher PI for high LTV than for low LTV, since the risk of a high-LTV loan is higher than that of a low-LTV loan. When observed relationships between segments are judged to be counterintuitive, adjacent segments should be merged until a logical relationship is obtained. The best combination of the segmentations will be determined and then applied to the data of t = 1.

The probability of impairment for segmentation x with loans i = 1 . . . N is then calculated as:

\mathrm{PI}_x = \frac{\sum_{i=1}^{N} \mathrm{NPE}_i \,(\mathrm{ONBAL}_i + \mathrm{CCF}_i \times \mathrm{OFFBAL}_i)}{\sum_{i=1}^{N} (\mathrm{ONBAL}_i + \mathrm{CCF}_i \times \mathrm{OFFBAL}_i)}

with:

• NPE: the non-performing flag
• ONBAL: the on-balance-sheet exposure of the loan
• OFFBAL: the off-balance-sheet exposure of the loan
• CCF: credit conversion factor

The second factor in the numerator, (ONBAL_i + CCF_i × OFFBAL_i), is the current exposure of loan i. The PI is thus the ratio of the exposure of non-performing loans in segment x to the exposure of all loans in segment x. Segment x can be any combination of the dimensions product, LTV, channel and risk, but can also be none of the dimensions and thus equal to the whole portfolio.
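The exposure-weighted PI per segment then follows from a grouped aggregation. The sketch below assumes hypothetical column names and CCF values; it illustrates the formula above rather than the thesis's implementation.

```python
# Sketch of the exposure-weighted PI per segment (hypothetical column names and data).
import pandas as pd

loans = pd.DataFrame({
    "segment":  ["A", "A", "A", "B", "B"],
    "npe_flag": [0, 1, 0, 0, 1],
    "onbal":    [200_000, 150_000, 300_000, 250_000, 100_000],
    "offbal":   [10_000, 0, 5_000, 0, 20_000],
    "ccf":      [0.5, 0.5, 0.5, 0.5, 0.5],
})
loans["exposure"] = loans["onbal"] + loans["ccf"] * loans["offbal"]
loans["npe_exposure"] = loans["npe_flag"] * loans["exposure"]

g = loans.groupby("segment")[["npe_exposure", "exposure"]].sum()
pi_per_segment = g["npe_exposure"] / g["exposure"]
print(pi_per_segment)
```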

In the base scenario, the existence of a monotone relationship between the risk-ordered buckets determines which segments are included and excluded. If all segments are used, i.e. the LTV, risk and channel segments, no clear monotone relationship can be found. For example, the PI of the bucket LTV 60-80%, broker, normal risk is lower than the PI of the bucket LTV 0-60%, broker, normal risk. Since a loan with an LTV of 60-80% is riskier than a loan with an LTV of 0-60%, a higher PI is expected. Moreover, the exposure in the buckets with the channel segment 'broker' is small: it is less than 5% of the total exposure at t = 0. Given this small percentage of exposure in this channel and the missing monotone relationship in multiple buckets, the channel segment is excluded from the base scenario.

When only the segments LTV and risk are used to calculate the PI buckets, a more monotone relationship can be found. Again, the relationship is not monotone for every bucket, but it fails for fewer buckets than under the previous segmentation. Since these buckets are distinctive in risk and divide the loans into homogeneous groups, the base scenario uses the segmentation by LTV and risk.

In Table 3.1 the PI per risk and LTV segment is displayed. Also, the exposure per PI bucket is displayed in amount and in percentage of the whole portfolio. When the PI is calculated for the whole portfolio, so no segments used, all performing loans will have a PI of 1.47% and all non-performing loans a PI of 100%.

Risk bucket        LTV bucket   PI        Exposure        Percentage
Normal risk        LTV 0-0.6    0.56 %    33.528.409.512  22.55 %
Normal risk cured  LTV 0-0.6    30.37 %   73.316.087      0.05 %
High risk          LTV 0-0.6    10.42 %   1.221.482.022   0.82 %
High risk cured    LTV 0-0.6    50.18 %   24.132.565      0.02 %
Normal risk        LTV 0.6-0.8  0.47 %    22.900.610.709  15.40 %
Normal risk cured  LTV 0.6-0.8  22.36 %   48.915.717      0.03 %
High risk          LTV 0.6-0.8  3.96 %    2.195.048.127   1.48 %
High risk cured    LTV 0.6-0.8  47.82 %   32.532.903      0.02 %
Normal risk        LTV 0.8-1.0  0.63 %    35.226.775.940  23.69 %
Normal risk cured  LTV 0.8-1.0  26.93 %   91.013.203      0.06 %
High risk          LTV 0.8-1.0  4.43 %    5.140.584.420   3.46 %
High risk cured    LTV 0.8-1.0  44.34 %   76.146.474      0.05 %
Normal risk        LTV 1.0-1.2  1.14 %    36.115.209.533  24.29 %
Normal risk cured  LTV 1.0-1.2  30.34 %   179.575.454     0.12 %
High risk          LTV 1.0-1.2  5.81 %    6.796.106.281   4.57 %
High risk cured    LTV 1.0-1.2  50.47 %   15.509.355      0.10 %
Normal risk        LTV 1.2+     2.40 %    177.523.864     1.19 %
Normal risk cured  LTV 1.2+     43.25 %   38.830.802      0.03 %
High risk          LTV 1.2+     18.37 %   210.567.625     0.14 %
High risk cured    LTV 1.2+     57.35 %   20.732.043      0.01 %
Normal risk        LTV missing  1.09 %    687.940.882     0.46 %
Normal risk cured  LTV missing  40.95 %   5.156.336       0.00 %
High risk          LTV missing  8.36 %    8.343.611       0.06 %
High risk cured    LTV missing  70.84 %   513.561         0.00 %

Table 3.1: PI - Base scenario

3.3 CR - Base scenario

The cure rate model is based on a transition matrix. A transition matrix describes the migration probability between states over two points in time. For these two points in time, the current state of each loan is determined. The percentage of the exposure of the loans in state j at t = 0 that migrated to state k at t = 1 is the entry of the transition matrix in row j and column k.


According to the base scenario, the transition matrix consists of the states performing, forborne, 1 to 36 months past due, write-off and missing. Loans are performing when they are less than 1 month past due; they are forborne when they are performing but refinancing or modifications of the terms and conditions have taken place. The states 1 to 36 months past due are self-explanatory, except that there is a cut-off point at 36 months past due: loans more than 36 months past due are classified in the bucket 36 months past due. There are no strict conditions before loans are written off; this depends on the policy of the bank and the debtor, since a voluntary foreclosure could take place.

Status   P      F    1    2    L
P        100%   0    0    0    0
F        60%    0    0    0    40%
1        .      .    .    .    .
2        .      .    .    .    .
L        0      0    0    0    100%

Table 3.2: Example of the transition matrix

In the base scenario, the performing and write-off states are absorbing states. This implies that once a loan has the status performing or write-off at t = 0, it will not lose this status and will have the same status at t = 1. Moreover, if a bank has no loans in the forborne status at t = 0, a fall-back parameter is used: 60% of the forborne loans is set to be performing at t = 1 and 40% is set to be written off at t = 1. It is important to note that certain states overrule other states when a loan is classified in multiple states: the state L overrules state F, state F overrules x months past due, and x months past due overrules state P. Table 3.2 displays the outline of the transition matrix.

The transition matrix describes the changes in states over a one-year period. In practice, a loan can migrate to the performing state after more than one year. Assuming loan behaviour is Markovian, the matrix can be multiplied with itself x times to describe the change in states over x years. The base scenario assumes that the maximum resolving period is four years.
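Under the Markov assumption, the four-year matrix is simply the fourth power of the one-year matrix. The sketch below uses a toy five-state matrix with hypothetical migration percentages for the past-due states; only the forborne fall-back row reflects the 60%/40% parameters mentioned above.

```python
# Sketch of extending the one-year transition matrix to four years under the Markov assumption.
import numpy as np

# Toy matrix over the states [P, F, 1 MPD, 2 MPD, L]; rows sum to one, P and L are absorbing.
T = np.array([
    [1.00, 0.00, 0.00, 0.00, 0.00],   # performing
    [0.60, 0.00, 0.00, 0.00, 0.40],   # forborne fall-back: 60% cure, 40% write-off
    [0.45, 0.05, 0.20, 0.10, 0.20],   # 1 month past due (hypothetical)
    [0.30, 0.05, 0.10, 0.25, 0.30],   # 2 months past due (hypothetical)
    [0.00, 0.00, 0.00, 0.00, 1.00],   # write-off / loss
])

T4 = np.linalg.matrix_power(T, 4)     # four-year resolving period
cure_rate = T4[:, 0]                  # first column: probability of ending up performing
print(cure_rate.round(3))
```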

The first column of the four-year transition matrix is the cure rate. It represents the loans migrating from state x = 3, ..., 36 months past due at t = 0 to the performing state at t = 1. Given the number of observations, the cure rate will be noisy. Therefore, a relationship between the time past due and the cure rate needs to be fitted. One would expect a downward-sloping, concave relationship, given the character of a cure rate. To achieve the best fit, a least squares approach can be adopted. Since not every bucket has the same amount of exposure, an exposure-weighted method will be used to fit the distribution. The weighted residual sum of squares (WSS) will be minimised:

\mathrm{WSS} = \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2 \, W_i}{\sum_{i=1}^{N} W_i}

with

• y_i the cure rate for state i at t = 0 and state P at t = 1
• ŷ_i the estimator of the cure rate for state i at t = 0 and state P at t = 1
• W_i the exposure weight of state i at t = 0

The method mentioned above hopefully contributes to a better fit of the estimators ŷ_i on the data y_i compared to a non-weighted residual sum of squares. The amount of exposure decreases over time, so less weight should be assigned to estimators of higher states compared to lower states. The next step is finding a relationship between the cure rate y and the time past due. In the base scenario, 1 minus the cumulative distribution function, 1 − CDF, of the Weibull distribution is used:

\hat{y}^{\,\mathrm{WB\,CDF}} = 1 - \left(1 - e^{-(x/\lambda)^k}\right) = e^{-(x/\lambda)^k}

In Figure 3.2 the cure rate is fitted to 1 minus the Weibull CDF, taking the weights W_i into account. The first months-past-due buckets contain the largest part of the exposure, and therefore most of the weight of the curve is situated at the first part of the curve. The high peak at 32 months past due, for example, is the result of the small exposure of loans with 32 months past due at t = 0, of which a relatively large part migrates to the performing state at t = 1. Therefore, the Weibull curve does not reach this point of the cure rate.
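The exposure-weighted fit can be reproduced with scipy's curve_fit, where passing sigma = 1/√W yields the weighted sum of squares described above. The cure-rate observations and weights below are simulated placeholders, not the bank's data.

```python
# Sketch of the exposure-weighted fit of the cure rate on 1 - CDF of the Weibull distribution.
import numpy as np
from scipy.optimize import curve_fit

months = np.arange(3, 37)                        # months past due at t = 0
weights = np.exp(-months / 8)                    # exposure weights, decreasing with months past due
true_lambda, true_k = 10.0, 1.2
cure_obs = (np.exp(-(months / true_lambda) ** true_k)
            + np.random.default_rng(2).normal(0, 0.03, months.size))   # noisy observed cure rates

def weibull_survival(x, lam, k):
    return 1 - (1 - np.exp(-(x / lam) ** k))     # = exp(-(x/lambda)^k)

params, _ = curve_fit(weibull_survival, months, cure_obs,
                      p0=[10.0, 1.0], sigma=1 / np.sqrt(weights))
print(params)   # fitted (lambda, k)
```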

Figure 3.2: Cure rate fitted on the Weibull 1-CDF (x-axis: months past due; y-axis: cured in percentage)

Now each loan at t = 1 is assigned the value of the fitted cure rate curve corresponding to its months past due at t = 1. For loans between 3 and 36 months past due, the cure rate for that number of months past due is used. Loans more than 36 months past due receive the same cure rate as loans 36 months past due. Loans less than 3 months past due receive the cure rate of loans 3 months past due.

3.4 LGL - Base scenario

The LGL is determined using a structural approach based on the collateral value, taking into consideration the impact of mortgage indemnity guarantees (MIG). The Dutch version of the MIG is the 'Nationale Hypotheek Garantie' (NHG). The LGL captures the stochastic nature of the recovery values eventually observed. The intuition behind this approach is that recoveries are assumed to be normally distributed and the expected value of the recoveries is computed starting from the assumption that if the observed recovery value (v') is greater than the loan amount (L) the bank recovers L, and otherwise the recovery to the bank is v'.

The LGL is computed as 1 minus the recovered portion of the collateral value as a percentage of the loan balance weighted by the probability of any relevant MIG being successfully claimed. Since collateral is not the only source to reduce the loss, the LGL is reduced by other recoveries. This could be insurance, savings and investment accounts pledged to mortgages for example.

The LGL formula is:

\mathrm{LGL} = \left[ (1 - P[\mathrm{MIG}]) \times \left( 1 - \frac{E[RV]}{\mathrm{LTVI}} \right) + P[\mathrm{MIG}] \times \left( 1 - \frac{E[RV]}{\mathrm{LTVI}} - \frac{\text{MIG Insurance}}{L} \right) \right] \times (1 - \mathrm{OR})

where

• L is the loan amount

• LTVI is the indexed LTV, which takes into consideration sales cost, time to sale and appraisal discount

• E[RV ] is the expected value of the recoveries

• P[MIG] is the probability that the MIG is successfully claimed
• MIG Insurance is the insurance on the loan

• OR is other recoveries

LTVI for loan i is defined as follows:

LTVIi=

LTVAi× (1 + LTVACosts

i) × (1 + Effective interest rate)

time to sale

Index to today × Index to sale × (1+Appraiser discount) where

• LTVA_i is the current on and off-balance-sheet exposure divided by the property value at appraisal

• Appraiser discount is the average difference between last bank appraisal indexed to date of appraisal and the independent, external party appraisal of the property value for the AQR sample of residential property

• Costs is the average foreclosure expenses as a percentage of the balance
• Time to sale is the observed average time to sale in years

• Index to today is the average property price for the region today (t = 1) divided by the average property price for the region at the date of appraisal

• Index to sale is the forward-looking change in the house price index for the region
• The discount rate used should be the effective interest rate


In the numerator of the LTVI, the loan to property value at appraisal ratio (LTVA) is increased by the cost of sale and the time value over the time-to-sale period. In the denominator of the LTVI, the index corresponding to the last appraisal date is first indexed to t = 1 (Index to today) and then indexed forward to the time of sale (Index to sale), where the time to sale is the average time between the moment of default and the sale of the property. The resulting index is adjusted with an appraisal discount, which a third party established by reappraising the properties; so when the bank overvalued the properties by x%, the resulting index is lowered by x%. In conclusion, the LTVI is the LTV adjusted for the cost and time to sale and indexed from the last appraisal date to the time of sale. The LGL segments are the LTVI buckets.
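Per loan, the LTVI is a direct function evaluation. The sketch below restates the formula; the LTVA and the two index values are hypothetical, while the cost, interest rate, time to sale and appraisal discount follow the base-scenario values quoted later in this section.

```python
# Sketch of the indexed loan-to-value calculation per loan (hypothetical inputs).
def ltvi(ltva: float, costs: float, eff_rate: float, time_to_sale_years: float,
         index_to_today: float, index_to_sale: float, appraiser_discount: float) -> float:
    numerator = ltva * (1 + costs) * (1 + eff_rate) ** time_to_sale_years
    denominator = index_to_today * index_to_sale * (1 + appraiser_discount)
    return numerator / denominator

# Example: costs 3.8%, effective rate 4.4%, time to sale 18 months, appraisal discount -9.8%
print(ltvi(ltva=0.85, costs=0.038, eff_rate=0.044, time_to_sale_years=1.5,
           index_to_today=0.95, index_to_sale=1.01, appraiser_discount=-0.098))
```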

The formula behind the expected recoveries E[RV] is based on the assumption that the recoveries are normally distributed. If the observed recovery value (v') is greater than the loan amount (L), the bank recovers L. If the observed recovery value is lower than the loan amount, the bank recovers v'. So if v' ≥ L the recovery is L, and if v' ≤ L the recovery is v'. This is displayed in Figure 3.3.

Figure 3.3: Expected Recoveries

The expected recovery E[RV] is determined as follows:

E[RV] = \frac{1}{\sigma\sqrt{2\pi}} \int_0^L v \exp\!\left(-\frac{(v-\mu)^2}{2\sigma^2}\right) dv + \frac{1}{\sigma\sqrt{2\pi}} \int_L^\infty L \exp\!\left(-\frac{(v-\mu)^2}{2\sigma^2}\right) dv

The recovery values are defined as values between zero and infinity; therefore the integral starts at zero and ends at infinity over the observed recovery value distribution.

To calculate E[RV], the substitution z := (v - \mu)/\sigma is used, which leads to:

E[RV] = \frac{1}{\sqrt{2\pi}} \int_{-\mu/\sigma}^{(L-\mu)/\sigma} (\sigma z + \mu) \exp\!\left(-\frac{z^2}{2}\right) dz + \frac{1}{\sqrt{2\pi}} \int_{(L-\mu)/\sigma}^{\infty} L \exp\!\left(-\frac{z^2}{2}\right) dz

= \frac{\sigma}{\sqrt{2\pi}} \left[ \exp\!\left(-\frac{\mu^2}{2\sigma^2}\right) - \exp\!\left(-\frac{(L-\mu)^2}{2\sigma^2}\right) \right] + \mu\left[ \Phi\!\left(\frac{L-\mu}{\sigma}\right) - \Phi\!\left(-\frac{\mu}{\sigma}\right) \right] + L\left[ 1 - \Phi\!\left(\frac{L-\mu}{\sigma}\right) \right]
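The closed-form expression above can be checked against a simple Monte Carlo simulation, as in the sketch below; L, µ and σ are arbitrary illustrative values.

```python
# Sketch of the closed-form expected recovery derived above, with a Monte Carlo check.
import numpy as np
from scipy.stats import norm

def expected_recovery(L: float, mu: float, sigma: float) -> float:
    """E[RV] for recoveries v ~ N(mu, sigma^2), capped at L and floored at 0 (formula above)."""
    a, b = -mu / sigma, (L - mu) / sigma
    density_term = sigma / np.sqrt(2 * np.pi) * (
        np.exp(-mu ** 2 / (2 * sigma ** 2)) - np.exp(-(L - mu) ** 2 / (2 * sigma ** 2))
    )
    return density_term + mu * (norm.cdf(b) - norm.cdf(a)) + L * (1 - norm.cdf(b))

L, mu, sigma = 200_000, 180_000, 40_000
v = np.random.default_rng(3).normal(mu, sigma, 1_000_000)
print(expected_recovery(L, mu, sigma))          # closed form
print(np.maximum(np.minimum(v, L), 0).mean())   # simulation, should be close
```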

Depending on the mortgage type, a MIG can be part of the loan. If the MIG claim is accepted, the loss is reduced by the loss insurance. The MIG is assumed to be accepted in 75% of the cases, based on information from an external party.

In summary, the LGL formula is split into a part for loans whose loss is not successfully reduced by their insurance and a part for loans whose loss is successfully reduced by their insurance. The LGL for the part of loans which are not successfully reduced is 1 minus the expected recovery value of the collateral as a percentage of the LTVI: 1 − E[RV]/LTVI. The LGL for the part of loans which are successfully reduced is equal to the previous expression, but reduced by the MIG insurance as a percentage of the loan amount: 1 − E[RV]/LTVI − MIG insurance/L.

Loans without a loss insurance will have a value equal to zero as loss insurance. These loans will be split up over the two parts of the LGL formula but when the loss insurance is equal to zero, those parts will become the same and trivially, (1-P[MIG]) and P[MIG] will add up to one.

Finally, the LGL is reduced by other recoveries since collateral is not the only source of reducing LGL.

To obtain the LGL, a defaulted portfolio is needed. Since the loan tapes at t = 0 and t = 1 contain only information on loans which are not foreclosed, additional information on foreclosure cases is needed. Based on historical data of all foreclosure cases in the 36 months before t = 1, the sales ratio, the costs and the time to sale can be determined. These parameters will be discussed below.


The sales ratio is the average ratio of the observed sales price to the indexed valuation of the property:

\text{Sales ratio} = \frac{\text{Sale price} - \text{Sale costs}}{\text{Valuation at sale date}}

with

\text{Valuation at sale date} = \text{Property value} \times \frac{\text{Index at sale date}_x}{\text{Index at valuation date}_x}

The sales ratio will be calculated for segments based on the last valuation date. The segments are:

• Valuation date before 2000

• Valuation date between 2001 and 2005
• Valuation date between 2006 and 2009
• Valuation date after 2010

The indexation of the portfolio is based on an indexation list from a third party. They based the indexation on year, month and region of the house. The x in the formula above refers to the region of the house. Four regions were created:

• Region 1: Drente, Friesland, Groningen
• Region 2: Gelderland, Overijssel, Flevoland
• Region 3: Zeeland, Noord-Holland, Zuid-Holland, Utrecht
• Region 4: Noord-Brabant, Limburg

So the sales ratio depends on the region of the house and the valuation date of the last appraisal. Due to missing data on valuation and sale dates, outliers on both sides will occur. In the base scenario, the cut-off points of the sales ratio are 0.3 on the left side and 3 on the right side. This means a house with an observed sales price more than three times as large as the indexed valuation of the property is considered an outlier; likewise, observed sale prices of less than 30 per cent of the indexed valuation of the property are outliers. By removing these outliers in the base scenario, the sales ratios are not affected by data issues.
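A sketch of the sales-ratio calculation with these cut-offs is given below; the five example foreclosure cases are invented and the last one is deliberately an outlier.

```python
# Sketch of the sales-ratio calculation per valuation-date bucket with the 0.3 / 3 outlier cut-offs.
import pandas as pd

sales = pd.DataFrame({
    "valuation_bucket":  ["<2000", "2001-2005", "2006-2009", ">=2010", ">=2010"],
    "sale_price":        [180_000, 210_000, 240_000, 260_000, 2_600_000],   # last case is an outlier
    "sale_costs":        [6_000, 7_000, 8_000, 9_000, 9_000],
    "valuation_at_sale": [300_000, 280_000, 300_000, 255_000, 255_000],
})
sales["sales_ratio"] = (sales["sale_price"] - sales["sale_costs"]) / sales["valuation_at_sale"]

clean = sales[(sales["sales_ratio"] >= 0.3) & (sales["sales_ratio"] <= 3.0)]
print(clean.groupby("valuation_bucket")["sales_ratio"].mean())
```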

Valuation date bucket      Sales ratio
Before 2000                0.5468
Between 2001 and 2006      0.7218
Between 2006 and 2009      0.7732
After 2010                 0.9849

Table 3.3: Sales ratio

Table 3.3 shows that when the valuation date is before 2000, the indexation is less likely to be correct: more indexation mistakes are made for collateral with a valuation date before 2000. Otherwise the ratios would be comparable over the time buckets.


The costs parameter is the average observed cost of foreclosure of the collateral divided by the average exposure and is equal to 3.8%. The time to sale is the average time between the default of a mortgage and the sale of the underlying property over the whole historical foreclosure data set; this is equal to 18 months.

Thus, with the parameters based on the historical foreclosure data, the indexed loan-to-value ratio for each loan in the loan tape at t = 1 is determined. In the base scenario, the LGL segmentation is based on the LTVI buckets. The sizes of the LTVI buckets are equal to the LTV buckets in the base scenario, namely between zero and 60%, 60-80%, 80-100%, 100-120%, more than 120% and missing LTVI.

The effective interest rate is fixed at 4.4% and the appraisal discount is -9.8%. This is based on an external party who re-appraised the collateral; it means the collateral is assumed to be valued 9.8% too high in the bank's books.

The parameters based on the defaulted loan portfolio are now applied to the loan tape at t = 1. First, the indexed LTV is calculated for each loan. For 0.6% of the loans, the missing collateral value is replaced by the collateral value at t = 0. The index to sale is based on the time to sale: it uses the forward house price index 18 months ahead. A missing date of collateral valuation is replaced with the index of January 2012. Every three years, collateral needs to be reappraised, so given that this is done, an index date of January 2012 is reasonable.

The LGL formula is now calculated as explained above. The MIG acceptance rate is set at 75%, based on the AQR findings; this includes both the payout ratio and the decreasing coverage of the guarantee over the years. The loans of the loan tape at t = 1 are divided over the LTVI buckets, and the exposure-weighted LGL is then set for each LTVI bucket.

LTVI bucket   LGL
0.01-0.60     4.03%
0.60-0.80     10.37%
0.80-1.00     17.29%
1.00-1.20     25.50%
1.20+         35.00%
Missing       90.00%

Table 3.4: LGL per LTVI bucket


Chapter 4: Challenging the Challenger Model

Based on the Challenger Model, alternative scenarios for the PI, CR and LGL models are formed. The alternative scenarios will be discussed one by one in the sections below.

4.1 PI - Alternative scenarios

4.1.1 Alternative Scenario I

In the alternative scenario I of the PI model, the LTV ratios will be divided over five equal buckets in terms of number of loans per bucket. An extra bucket ’LTV others’ is created for the loans with missing data in one of the two parts of the LTV formula. In this scenario, there is no strict lower or upper boundary. Table 4.1 displays the LTV thresholds per bucket.

The difference between alternative scenario I and the base scenario is that in the former the LTV buckets are spread more over the lower values of LTV. This can be explained by the high number of loans with a low LTV value in the loan tape. The risk buckets stay the same, since only the LTV buckets have changed.

As one can see in Table 4.2, the PI values differ from the base scenario. There is a monotone relationship between the buckets, so each combination of risk bucket and LTV bucket has a different risk profile. The loans are now divided into more homogeneous groups than in the base scenario. A lower percentage of the exposure is located in the lower LTV buckets, since these buckets are very small. The impact for an individual loan can be small in this scenario, but the impact on the provision can be large, since the PI enters the provision formula linearly.

4.1.2 Alternative scenario II

Alternative scenario II is not based on the base scenario but is an alternative method to calculate the PI: a logit model. The NPE flag, introduced in Section 3.2, is taken as y_i. The covariates x_i are determined by optimizing the discriminatory power.

LTV bucket   Values
1            0-0.21
2            0.21-0.52
3            0.52-0.81
4            0.81-1.03
5            1.03+

Table 4.1: LTV buckets alternative scenario I

Risk bucket        LTV bucket      PI       Exposure       Percentage
Normal risk        LTV 0-0.21      0.78 %   6413367593     4.31 %
Normal risk cured  LTV 0-0.21      24.28 %  0015040631     0.01 %
High risk          LTV 0-0.21      25.63 %  0089743519     0.06 %
High risk cured    LTV 0-0.21      56.74 %  0002952436     0.00 %
Normal risk        LTV 0.21-0.52   0.53 %   2.0248676850   13.62 %
Normal risk cured  LTV 0.21-0.52   32.27 %  0.0047318378   0.03 %
High risk          LTV 0.21-0.52   10.82 %  0.0695077235   0.47 %
High risk cured    LTV 0.21-0.52   46.56 %  0.0015607393   0.1 %
Normal risk        LTV 0.52-0.81   0.48 %   3.2007926069   21.53 %
Normal risk cured  LTV 0.52-0.81   23.93 %  0.0065412745   0.04 %
High risk          LTV 0.52-0.81   4.27 %   0.2912367783   1.96 %
High risk cured    LTV 0.52-0.81   48.38 %  0.0041997866   0.03 %
Normal risk        LTV 0.81-1.03   0.66 %   3.7636867889   25.31 %
Normal risk cured  LTV 0.81-1.03   27.21 %  0.0101032812   0.07 %
High risk          LTV 0.81-1.03   4.44 %   0.5740650271   3.86 %
High risk cured    LTV 0.81-1.03   46.17 %  0.0083508661   0.06 %
Normal risk        LTV 1.03+       1.25 %   3.3247360507   22.36 %
Normal risk cured  LTV 1.03+       2.99 %   0.0202859798   0.14 %
High risk          LTV 1.03+       6.50 %   0.6126011799   4.12 %
High risk cured    LTV 1.03+       50.91 %  0.0164573352   0.11 %
Normal risk        LTV missing     1.10 %   0.067998631    0.46 %
Normal risk cured  LTV missing     40.88 %  0.0005143235   0.00 %
High risk          LTV missing     8.36 %   0.0083373978   0.06 %
High risk cured    LTV missing     70.86 %  0.0005133437   0.00 %

Table 4.2: PI output alternative scenario I

The discriminatory power is the ability to discriminate between performing and non-performing loans and is commonly used in validating models (Basel Committee on Banking Supervision, 2005). Discriminatory power is also a way to evaluate the goodness of fit of a logistic regression according to econometric theory (Cameron & Trivedi, 2005). In both model validation and econometric theory, the receiver operating characteristic (ROC) curve is used to display the goodness of fit of the model. The ROC curve is a graph with the false positive rate on the horizontal axis and the true positive rate on the vertical axis, obtained by comparing the true observations y_i with the calculated probabilities π_i. The area under the ROC curve (AUC) indicates the classifier performance. The steeper the ROC curve is at the left and the closer the ROC curve's position is to the point (0,1), the better the model's performance; in other words, the larger the area under the ROC curve, the better the model's performance. A perfect model has an AUC value equal to 1 and a random model without discriminatory power has an AUC value equal to 0.5. The ROC curve of a model without discriminatory power is a straight line between (0,0) and (1,1).

To construct the alternative PI model, the logit model, a univariate analysis is conducted to determine which explanatory variables are significant in the model. In the univariate analysis, the explanatory variable with the highest AUC value forms the basis of the model. After that, the variable with the second highest AUC value is included. The new variable is kept if the AUC value increases and the variable is significant.
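The AUC-based screening can be illustrated with scikit-learn on simulated data, as in the sketch below; the two candidate 'variables' are artificial scores, one informative and one pure noise.

```python
# Sketch of ROC/AUC-based univariate screening on simulated data (not the thesis's data).
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(4)
y = rng.integers(0, 2, size=10_000)                      # observed NPE flags
score_good = y * 0.6 + rng.uniform(size=10_000) * 0.8    # informative candidate variable
score_poor = rng.uniform(size=10_000)                    # uninformative candidate variable

print(roc_auc_score(y, score_good))     # clearly above 0.5
print(roc_auc_score(y, score_poor))     # close to 0.5: no discriminatory power
fpr, tpr, _ = roc_curve(y, score_good)  # points of the ROC curve (false vs. true positive rate)
```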

The first variable x_1 is the indicator function equal to one if the loan is past due and equal to zero otherwise. The variable is significant, with a standard error of 0.02 and a p-value of zero. The coefficient of x_1, β_1, is positive, implying a positive relation between a loan being past due and a loan being non-performing. The discriminatory power of this variable is good: the AUC value is equal to 0.7082. So the variable x_1 alone already gives a good result in terms of model performance.

The second variable x_2 is the LTV divided into three buckets. The LTV buckets are constructed in such a way that the three buckets form three different risk profiles: bucket one has the lowest probability of impairment and bucket three the highest. The NPE rate p_i of one bucket should not lie in the confidence interval of the p_i of another bucket; then it is clear the LTV buckets have different risk profiles.

The first LTV bucket includes every loan with an LTV smaller than 80%, with p_1 equal to 0.8% and confidence interval CI_1 = [0.0078, 0.0082]. The second LTV bucket includes every loan with an LTV between 80% and 120%; p_2 is 1.8% with CI_2 = [0.0177, 0.0183]. The third LTV bucket includes every loan with an LTV higher than 120%; the corresponding NPE rate is p_3 = 4.3% with CI_3 = [0.0425, 0.0433]. In Figure 4.1 the NPE rate per LTV bucket is shown.

Figure 4.1: LTV buckets (NPE rate with confidence interval per LTV bucket: <0.8, 0.8-1.2, >1.2)

The standard error of the model with only the LTV bucket variable is 0.02 and the p-value is equal to zero. The AUC of the model is equal to 0.619.

The third variable x_3 is the indicator function equal to one when the loan product is an 'Aflossingsvrije hypotheek', an interest-only mortgage. The standard error of the model with only the distinction between 'Aflossingsvrije hypotheek' and other products is 0.02 and the p-value is zero. The AUC value is equal to 0.509 when x_3 is the only explanatory variable in the model. This AUC value implies there is no discriminatory power between performing and non-performing loans.

The fourth variable x_4 is the current interest rate of the loan. The standard error of the model is equal to 0.73 and the p-value is zero. The AUC is equal to 0.535. These results indicate that the current interest rate is not a variable on which the model should be based by itself.

The fifth variable x_5 is the loan to income ratio (LIR). The LIR buckets are constructed in the same way as the LTV buckets, so each LIR bucket corresponds to a different risk profile of the loan: the higher the LIR, the higher the risk. The LIR is missing for a large part of the loans. Therefore, only two buckets are constructed: loans with a LIR lower than 200% and loans with a LIR of 200% or higher. Loans with a LIR lower than 200% have an NPE rate p_1 = 1.2% with CI_1 = [0.0121, 0.0125] and loans with a LIR higher than 200% have an NPE rate p_2 = 1.4% with CI_2 = [0.0141, 0.0146]. This is displayed in Figure 4.2. The standard error of the model is 0.06 and the p-value is 0. The AUC is equal to 0.506, which corresponds to a non-discriminatory model.

Figure 4.2: LIR buckets — NPE rate with confidence interval per loan to income bucket (threshold 200%).

The above variables are included in the final logit model one by one, based on their AUC values, to avoid overfitting. A variable is kept when the AUC value increases and the variable is significant. The first variable is x1, since it has the highest univariate AUC. When the second variable x2 is added, the AUC increases to 0.761. The third variable x3 also has a positive effect on the AUC, which is then equal to 0.7645, with every coefficient still significant, standard errors smaller than 0.05 and p-values equal to zero. Although the AUC increases when x4 is added to the model, the variable is not significant and is therefore left out. The final model thus contains x1, x2 and x3 and has an AUC of 0.7645, which indicates a well-performing model, since an AUC lies between 0.5 (no discrimination) and 1 (perfect discrimination). Table 4.3 displays the regression output with the coefficients and corresponding standard errors and p-values. The ROC curve of the final model is displayed in Figure 4.3.

logit(y) ∼ 1 + x1 + x2 + x3, distribution = Binomial

Coefficients   Estimate   SE      tStat     pValue
Intercept      -5.71      0.032   -179.33   0
x1              3.35      0.021    162.50   0
x2              0.49      0.016     29.70   0
x3              0.14      0.020      6.95   0

Table 4.3: PI - Logit model regression output

Figure 4.3: ROC curve PI model — ROC for classification by logistic regression (true positive rate against false positive rate).
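For reference, a comparable specification can be refitted with Python's statsmodels formula API, regardless of the software actually used for the thesis; the loan tape below is simulated, with the "true" coefficients set to the Table 4.3 estimates purely so the snippet runs and produces output of a similar shape.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
loans = pd.DataFrame({
    "x1": rng.integers(0, 2, 50_000),     # past-due indicator
    "x2": rng.integers(1, 4, 50_000),     # LTV bucket 1-3 (assumed ordinal coding)
    "x3": rng.integers(0, 2, 50_000),     # interest-only indicator
})
eta = -5.71 + 3.35 * loans.x1 + 0.49 * loans.x2 + 0.14 * loans.x3
loans["npe"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

fit = smf.glm("npe ~ x1 + x2 + x3", data=loans,
              family=sm.families.Binomial()).fit()
print(fit.summary())                      # layout comparable to Table 4.3
```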

The obtained model is tested for multicollinearity by checking the correlation between the variables. x2 and x3 have the highest correlation, namely −21%. Since multicollinearity is assumed to be a problem only from a correlation of 60% onwards, we can conclude there is no multicollinearity in this model. Nonetheless, it is clear that x2 and x3 are somewhat correlated: the mortgage product ’Aflossingsvrije hypotheek’ is not repaid until the end of the mortgage term, so the LTV does not change over time.
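A minimal sketch of this check, assuming the selected regressors are columns of a DataFrame `X` and using the 60% threshold mentioned above:

```python
import numpy as np
import pandas as pd

def flag_collinear(X: pd.DataFrame, threshold: float = 0.60) -> pd.Series:
    """Return the regressor pairs whose absolute pairwise correlation exceeds the threshold."""
    corr = X.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))  # keep each pair once
    pairs = upper.stack()
    return pairs[pairs > threshold]
```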

Furthermore, the obtained ŷ and the internal PD of the bank show a correlation of 0.7397. As mentioned before, the internal PD is based on the bank's own default definition and is estimated on through-the-cycle data, whereas the PI of our model is based on the simplified default definition and on point-in-time data. The two measures are therefore correlated, but differences are to be expected since we are comparing two different definitions with each other. Moreover, the bank has access to longer time series for these loans and could therefore predict a lower PD based on longer time periods.

The coefficients βi for i = 0, 1, ..., 3 (an intercept is included as β0) can now be applied to the data at t = 1. Again, loans with an NPE indicator equal to 1 at t = 1 get a PI equal to 1; the remaining loans get the forecasted PI.
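A small sketch of this step, using the point estimates from Table 4.3; the ordinal 1-3 coding of the LTV bucket is an assumption (the exact encoding is not shown here) and the input arrays are placeholders.

```python
import numpy as np

beta = np.array([-5.71, 3.35, 0.49, 0.14])        # intercept, x1, x2, x3 (Table 4.3)

def predict_pi(x1, x2, x3, npe_t1):
    """Forecast PI at t = 1; loans already non-performing get PI = 1."""
    eta = beta[0] + beta[1] * x1 + beta[2] * x2 + beta[3] * x3
    pi = 1.0 / (1.0 + np.exp(-eta))               # logistic link
    return np.where(npe_t1 == 1, 1.0, pi)

# Placeholder example: a performing loan that is past due, in LTV bucket 2,
# with an interest-only product.
print(predict_pi(np.array([1]), np.array([2]), np.array([1]), np.array([0])))
```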

4.2 CR - Alternative scenarios

4.2.1 Alternative scenario I

In alternative scenario I, the transition matrix as a whole is challenged. First, the cut-off point for the number of months past due is changed; in the base scenario this is equal to 36 months. In Figure 3.1 large peaks arise after 24 months, and the exposure of the loans is very low beyond that point, so a cut-off point of 24 months is more appropriate. In Figure 4.4 one can see that the cure rate is less erratic compared to the base scenario.

Figure 4.4: Cure rate, cut-off point 24 months — empirical cure rate (cured percentage) and fitted Weibull 1−CDF against months past due.

When the parameter for the maximum number of years in which loans can cure after default (equal to 4 in the base scenario) is set to 3 years, the cure rate is lower than in the base scenario. The shape of the curve is comparable, but for every months-past-due bucket the probability to cure is lower.

The opposite occurs when this parameter is set to 5 years: the cure rate is then higher than in the base scenario. The shape of the curve remains comparable, but every months-past-due bucket now has a higher probability to cure.

Next, the fall back parameters are left out, so it is assumed that there is no forborne loan. The cut-off point is 24 months, but the maximum number of years in which loans can cure after default is, as in the base scenario, equal to 4. Figure 4.5 shows the resulting cure rate, which is lower than in the alternative scenario with fall back variables. Since this bank has no forborne loans at t = 0, the forborne status row of the transition matrix is empty. When the matrix is multiplied, the zeros in the forborne status row lower the values in the first column of the transition matrix and therefore lower the cure rate.

Figure 4.5: Cure rate, cut-off point 24 months and no fall back parameters — empirical cure rate (cured percentage) and fitted Weibull 1−CDF against months past due.

In conclusion, a lower cut-off point smooths the cure rate at the end of the curve. Furthermore, a shorter period between default and the time to cure leads to a lower cure rate, while a longer period leads to a higher cure rate. The fall back parameters are beneficial for a high cure rate: when they are left out, a lower cure rate arises.

4.2.2 Alternative scenario II

In alternative scenario II, a different method to assign a weight to each bucket of state i is used, $W_i^2$. Compared to the base scenario, the weight of bucket i is now the exposure of all loans in state i at t = 0 migrating to any state j at t = 1:

\[
W_i^2 = \sum_{j=1}^{N} \sum_{k=1}^{M} \text{Exposure}_k \, I_{i,j}(k), \qquad i = 1, \dots, N .
\]

Due to the different weighting method, the fitted cure rate curve is, over the different states, 1 to 3% lower than under the weighting method of the base scenario. The largest part of the exposure is concentrated in the first rows and columns of the transition matrix: the likelihood of loans migrating from a state with zero to a low number of months past due to a similar state is much higher than the likelihood of migrating to a state with a high number of months past due. Therefore, the impact of the different weighting method is not significantly different from the base scenario and will not result in a significantly different value of the provision.
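Since the alternative weight reduces to the total exposure per starting state, it is straightforward to compute from a loan-level migration table. A minimal sketch with hypothetical column names:

```python
import pandas as pd

def state_weights(migrations: pd.DataFrame) -> pd.Series:
    """W_i^2 per state: total exposure of all loans in state i at t = 0,
    regardless of the state they migrate to at t = 1.
    Expects columns 'state_t0', 'state_t1' and 'exposure'."""
    return migrations.groupby("state_t0")["exposure"].sum()
```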

4.2.3 Alternative scenario III

In alternative scenario III, the relationship between the cure rate and the time past due is fitted to the exponential probability density function (PDF) and to the Weibull function, the latter being the Weibull PDF with three parameters. This is displayed in Figure 4.6 (a). The Weibull 1−CDF of the base scenario and the exponential PDF lie close to each other; with weighted residual sums of squares of 0.0019 and 0.0027, respectively, these two curves fit better than the Weibull function, which has a weighted residual sum of squares of 0.0157. The Weibull function lies below the other two curves, and a lower cure rate affects the amount of the provision enormously. The impact on the provision will be shown in the next chapter.

A polynomial function is also fitted to the cure rate, see Figure 4.6 (b). One would expect a concave, downward-sloping relationship between months past due and the cure rate. Since the polynomial function does not comply with this expectation, the base scenario with the Weibull 1−CDF fit is the best fit compared to the alternatives.
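A sketch of such a fitting comparison is given below. The cure-rate points and weights are placeholders, and the two-parameter Weibull survival function and scaled exponential density are simplified stand-ins for the curves used above; the snippet only illustrates fitting the candidate curves and comparing their weighted residual sums of squares.

```python
import numpy as np
from scipy.optimize import curve_fit

months = np.arange(1.0, 25.0)                       # months past due (placeholder grid)
cure_rate = 0.8 * np.exp(-months / 8.0) + 0.02       # placeholder empirical cure rates
weights = np.ones_like(months)                       # e.g. exposure-based weights

def weibull_survival(t, k, lam):                     # 1 - CDF of a two-parameter Weibull
    return np.exp(-(t / lam) ** k)

def exponential_pdf(t, a, lam):                      # exponential density with free scale a
    return a * lam * np.exp(-lam * t)

def wrss(y, yhat, w):                                # weighted residual sum of squares
    return float(np.sum(w * (y - yhat) ** 2))

p_w, _ = curve_fit(weibull_survival, months, cure_rate, p0=[1.0, 10.0])
p_e, _ = curve_fit(exponential_pdf, months, cure_rate, p0=[1.0, 0.1])
poly = np.polyfit(months, cure_rate, deg=4)

for name, fitted in [("Weibull 1-CDF", weibull_survival(months, *p_w)),
                     ("Exponential PDF", exponential_pdf(months, *p_e)),
                     ("Polynomial degree 4", np.polyval(poly, months))]:
    print(name, wrss(cure_rate, fitted, weights))
```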

Figure 4.6: Cure rate fitting — (a) Weibull 1−CDF, Weibull function and exponential PDF fitted to the cure rate; (b) polynomial of degree 4. Both panels show the cured percentage against months past due.

4.3 LGL - Alternative scenarios

4.3.1 Alternative scenario I

In alternative scenario I of the LGL model, the LTVI ratios are divided into five buckets of equal size in terms of the number of loans per bucket. An extra bucket, ’LTVI others’, is created for loans with missing data in one of the components of the LTVI formula. In this scenario there is no strict lower or upper boundary. Table 4.4 displays the LTVI thresholds per bucket with the corresponding LGL.

LTVI bucket   Values        LGL
1             0 - 0.29      1.55%
2             0.29 - 0.66   5.69%
3             0.66 - 1.10   17.08%
4             1.10 - 1.41   30.23%
5             1.41 +        38.06%
6             missing       90.00%

Table 4.4: LTV buckets alternative scenario I

Comparing this with the LGL of the base scenario, the lower LTVI buckets have a lower LGL value and the higher LTVI buckets have a higher LGL value. Since a loan in a higher LTVI bucket is more likely to default, the higher LGL values for those buckets will have more effect on the amount of the provision than the lower LGL values for the lower LTVI buckets.
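The quintile bucketing itself can be expressed compactly; the sketch below assumes the LTVI ratio is available as a pandas Series and labels loans with a missing LTVI as a sixth bucket. Names and labels are illustrative.

```python
import pandas as pd

def ltvi_buckets(ltvi: pd.Series) -> pd.Series:
    """Split LTVI into five equally sized buckets (by number of loans);
    loans with a missing LTVI go into bucket 6."""
    buckets = pd.qcut(ltvi, q=5, labels=[1, 2, 3, 4, 5])
    return buckets.cat.add_categories([6]).fillna(6)
```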

4.3.2 Alternative scenario II

In alternative scenario II, the LGL formula is challenged. Since the formula is based on the assumption that the sales ratio is normally distributed, this assumption is investigated. The sales ratio obtained from the historical default data per defaulted loan is used, with a cut-off point of 300% to exclude outliers.

First, a Kolmogorov-Smirnov (KS) test is used to test whether the sales ratio is normally distributed. The null hypothesis is that the sales ratio is normally distributed; the alternative hypothesis is that it is not. With α = 0.05, the null hypothesis is rejected with a p-value equal to 0.017.

The same test is used to examine whether the sales ratio can be fitted to a Student's t-distribution with unknown location and scale. The null hypothesis that the sales ratio follows a Student's t-distribution with unknown location and scale is rejected against the alternative, with a p-value equal to zero.

Next, a distinction is made between loans based on the last appraisal date. If only loans with a last appraisal date before July 2004 are fitted to the t-location-scale distribution, the KS test does not reject the null hypothesis, with a p-value equal to 0.32. The KS test is also performed under the null hypothesis that this selection of loans is normally distributed, but that hypothesis is rejected with a p-value equal to 0.01.

The remaining loans, those with a last appraisal date after July 2004, are also fitted to the normal and the t-location-scale distribution. The KS test with the normal distribution as null hypothesis is rejected with a p-value equal to zero, and the KS test with the t-location-scale distribution as null hypothesis is likewise rejected with a p-value equal to zero.

Since the KS test does not reject the t-location-scale distribution for the sales ratio of loans with a last appraisal date before July 2004, this distribution is used in the LGL formula for those loans. For the remaining loans, the base scenario is used.
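A sketch of these KS tests with SciPy is given below; the sales ratios are simulated so the snippet runs. Note that using parameters estimated from the same sample makes the KS p-values approximate, a caveat that applies to this type of test in general.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sales_ratio = 0.9 + 0.25 * rng.standard_t(df=6, size=2_000)         # simulated sales ratios
sales_ratio = sales_ratio[(sales_ratio > 0) & (sales_ratio < 3.0)]  # 300% cut-off

# KS test against a fitted normal distribution
mu, sigma = stats.norm.fit(sales_ratio)
print(stats.kstest(sales_ratio, "norm", args=(mu, sigma)))

# KS test against a fitted t-location-scale distribution
df_, loc, scale = stats.t.fit(sales_ratio)
print(stats.kstest(sales_ratio, "t", args=(df_, loc, scale)))
```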

To apply the t-location-scale distribution in the LGL formula, E[RV] needs to be rewritten. As in the base scenario, if the observed recovery value is greater than the loan amount L, the bank recovers L; if the observed recovery value is lower than the loan amount, the bank recovers that observed value.


The expected recoveries E[RV] can now be expressed as:

\[
E[RV] = \int_0^L v\, f_\nu(v)\, dv + \int_L^\infty L\, f_\nu(v)\, dv \qquad (4.1)
\]

with

\[
f_\nu(v) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sigma\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}
\left[1 + \frac{1}{\nu}\left(\frac{v-\mu}{\sigma}\right)^2\right]^{-\frac{\nu+1}{2}},
\qquad
F_\nu(v) = \int_{-\infty}^{v} f_\nu(u)\, du .
\]

If \(\nu > 2\), then \(E[v] = \mu\) and \(\mathrm{Var}[v] = \sigma^2 \nu/(\nu-2)\).

Let \(z = (v-\mu)/\sigma\), implying \(v = \mu + \sigma z\), so that \(v < L\) corresponds to \(z < (L-\mu)/\sigma\). After this substitution, \(f_\nu(z)\) and \(F_\nu(z)\) denote the density and CDF of the standard Student's t distribution with \(\nu\) degrees of freedom. Evaluating (4.1) in parts leads to the following:

\[
\int_0^L v\, f_\nu(v)\, dv
= \int_{-\mu/\sigma}^{(L-\mu)/\sigma} (\mu + \sigma z)\, f_\nu(z)\, dz
= \underbrace{\mu\left[F_\nu\!\left(\tfrac{L-\mu}{\sigma}\right) - F_\nu\!\left(\tfrac{-\mu}{\sigma}\right)\right]}_{(4.2)}
+ \underbrace{\sigma \int_{-\mu/\sigma}^{(L-\mu)/\sigma} z\, f_\nu(z)\, dz}_{(4.3)}
\]

\[
\int_L^\infty L\, f_\nu(v)\, dv
= \int_{(L-\mu)/\sigma}^{\infty} L\, f_\nu(z)\, dz
= L\left[1 - F_\nu\!\left(\tfrac{L-\mu}{\sigma}\right)\right] \qquad (4.4)
\]

Given \(-\mu/\sigma < 0 < (L-\mu)/\sigma\), we can rewrite (4.3):

\[
\sigma \int_{-\mu/\sigma}^{(L-\mu)/\sigma} z\, f_\nu(z)\, dz
= \sigma\, \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}
\left[-\int_0^{\mu/\sigma} \left(1+\frac{z^2}{\nu}\right)^{-\frac{\nu+1}{2}} z\, dz
+ \int_0^{(L-\mu)/\sigma} \left(1+\frac{z^2}{\nu}\right)^{-\frac{\nu+1}{2}} z\, dz\right]
\]

Now substitute \(y = z^2\) and \(dy = 2z\, dz\):

\[
= \sigma\, \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}
\left[-\frac{1}{2}\int_0^{(\mu/\sigma)^2} \left(1+\frac{y}{\nu}\right)^{-\frac{\nu+1}{2}} dy
+ \frac{1}{2}\int_0^{((L-\mu)/\sigma)^2} \left(1+\frac{y}{\nu}\right)^{-\frac{\nu+1}{2}} dy\right]
\]

\[
= \sigma\, \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}\,
\frac{\nu}{\nu-1}
\left[\left(1+\frac{(\mu/\sigma)^2}{\nu}\right)^{-\frac{\nu-1}{2}}
- \left(1+\frac{((L-\mu)/\sigma)^2}{\nu}\right)^{-\frac{\nu-1}{2}}\right] \qquad (4.5)
\]

Adding (4.2), (4.5) and (4.4), E[RV] in (4.1) can be rewritten as:

\[
E[RV] = \mu\left[F_\nu\!\left(\tfrac{L-\mu}{\sigma}\right) - F_\nu\!\left(\tfrac{-\mu}{\sigma}\right)\right]
+ \sigma\, \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}\,
\frac{\nu}{\nu-1}
\left[\left(1+\frac{(\mu/\sigma)^2}{\nu}\right)^{-\frac{\nu-1}{2}}
- \left(1+\frac{((L-\mu)/\sigma)^2}{\nu}\right)^{-\frac{\nu-1}{2}}\right]
+ L\left[1 - F_\nu\!\left(\tfrac{L-\mu}{\sigma}\right)\right] \qquad (4.6)
\]
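As a sanity check, the closed-form expression (4.6) can be compared with direct numerical integration of (4.1); the two should coincide up to integration error. The parameter values below are purely illustrative, not the calibrated values used in the thesis.

```python
import numpy as np
from math import gamma, sqrt, pi
from scipy import stats, integrate

nu, mu, sigma, L = 5.0, 0.95, 0.20, 1.0   # degrees of freedom, location, scale, loan amount (illustrative)

def erv_closed_form(nu, mu, sigma, L):
    """Closed-form E[RV] following equation (4.6); F is the standard Student's t CDF."""
    a, b = -mu / sigma, (L - mu) / sigma
    F = lambda x: stats.t.cdf(x, nu)
    c = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))
    middle = sigma * c * nu / (nu - 1) * (
        (1 + (mu / sigma) ** 2 / nu) ** (-(nu - 1) / 2)
        - (1 + ((L - mu) / sigma) ** 2 / nu) ** (-(nu - 1) / 2)
    )
    return mu * (F(b) - F(a)) + middle + L * (1 - F(b))

def erv_numerical(nu, mu, sigma, L):
    """E[RV] by direct numerical integration of equation (4.1)."""
    pdf = lambda v: stats.t.pdf(v, nu, loc=mu, scale=sigma)
    part1, _ = integrate.quad(lambda v: v * pdf(v), 0.0, L)
    part2, _ = integrate.quad(lambda v: L * pdf(v), L, np.inf)
    return part1 + part2

print(erv_closed_form(nu, mu, sigma, L))   # the two printed values should agree
print(erv_numerical(nu, mu, sigma, L))
```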

Applying this scenario in the LGL formula and using the LTVI buckets of the base scenario, the following holds:

LTVI bucket    LGL
0.01 - 0.60    4.83%
0.60 - 0.80    10.81%
0.80 - 1.00    17.50%
1.00 - 1.20    25.59%
1.20 +         35.00%
Missing        90.00%

Table 4.5: LGL - output scenario II

The results are comparable to the base scenario; only the lowest LTVI bucket has a slightly higher LGL value. The impact on the provision will be discussed in the next chapter.
