Information asymmetry in P2P finance : a research for asymmetry lowering determinants on P2P lending platforms

Academic year: 2021

INFORMATION ASYMMETRY IN P2P FINANCE

A research for asymmetry lowering determinants on P2P lending platforms

UNIVERSITY OF AMSTERDAM

Amsterdam Business School

MSc Business Economics – Finance

Author: Johan C.J. Biesjot
Student number: 6045391

Thesis supervisor: Dr J.J.G. Lemmen
Finish date: July 3rd, 2016

PREFACE AND ACKNOWLEDGEMENTS

In January 2014, I became interested in Crowdfunding when I wrote a paper about the status of the Dutch Crowdfunding market. I learned that this market is relatively young and that this financing alternative has considerable future potential. These positive thoughts were translated into the establishment of Milestone, the first cross-border investment fund in P2P loans in continental Europe, which I founded in June 2015.

I wish to thank my thesis supervisor Dr Jan Lemmen for providing the opportunity to write my thesis about a subject that deeply interests me. Whenever I ran into trouble or had a question, the door was always open. He allowed this thesis to be my own work and directed me the right way when I did not know what to do. Furthermore, I would like to thank my fellow students who helped me with their abilities to overcome the hurdles I faced during this academic year.

I am grateful to my family for their constant encouragement to focus on my academic career. It would be misplaced to not name Gijs Zaalberg for his listening ears during the darkest hours of that very same career.

Finally, I must express my very profound gratitude to the love of my life, Sandra Manders. With her unfailing support and trust in me throughout my years of study, she is the one who got me through the process. The accomplishment of finishing a Master of Science would not have been possible without her.

ABSTRACT

This study examines variables that lower existing information asymmetries on P2P lending platforms. Determining a set of significant influential variables is useful for regulatory frameworks across the globe and helps investors make better-informed investment decisions. Using the loan book of Lending Club, the biggest US lending platform, 16 preselected variables are tested for their predictive power toward defaults through logistic regression analyses. The sub-grade and interest rate assigned to a loan are by far the most important variables that indicate the probability of future defaults. The purpose of a loan, the housing situation and the borrower’s personal finance records are variables that also provide significant results. The most predictive variable is the grade assigned (53.60%); this implies that Lending Club’s grading system controls for the most obvious factors that influence an applicant’s creditworthiness.

Keywords: P2P lending platforms, consumer loans, information asymmetry, default prediction

JEL Classification: G23, G24, G28

NON-PLAGIARISM STATEMENT

By submitting this thesis the author declares to have written this thesis completely by himself/herself, and not to have used sources or resources other than the ones mentioned. All sources used, quotes and citations that were literally taken from publications, or that were in close accordance with the meaning of those publications, are indicated as such.

COPYRIGHT STATEMENT

The author has copyright of this thesis, but also acknowledges the intellectual copyright of contributions made by the thesis supervisor, which may include important research ideas and data. Author and thesis supervisor will have made clear agreements about issues such as confidentiality.

TABLE OF CONTENTS

PREFACE & ACKNOWLEDGEMENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF ABBREVIATIONS
1: INTRODUCTION
1.1 P2P Lending
1.2 Regulatory challenges
1.3 Information asymmetry
1.4 Relevance
1.5 Hypothesis development
2: LITERATURE REVIEW
2.1 Online environment
2.1.1 Crowdfunding
2.1.2 Environmental framework
2.2 Determinants of P2P finance growth
2.2.1 Operational P2P processes
2.2.2 Transaction costs
2.2.3 Credit rationing
2.3 Credit risk in P2P finance
2.3.1 Profits from lending activities
2.3.2 Credit risk with P2P assets
2.4 Information asymmetry
2.4.1 Classic asymmetries in a new market
2.4.2 Information overview
2.5 Default prediction in P2P lending
2.5.1 Credit ratings
2.5.2 Default prediction models
2.5.3 Consumer credit models
2.5.4 Consumer creditworthiness determinants
2.6 Conclusions
3: DATA
3.1 Data set description
4: METHODOLOGY
4.1 Correlation coefficients
4.2 Testing for predictive variables
4.3 Logistic regression
5: EMPIRICAL RESULTS
5.1 Predictive variables study
5.2 Logistic regression results
6: CONCLUSIONS
7: DISCUSSION
8: APPENDICES

LIST OF TABLES

Table 1 Variables selection for empirical study
Table 2 PCC of continuous variables (N=31.465)
Table 3 SRCC for discrete variables (N=31.465)
Table 4 Simple T-test on predictive continuous variables
Table 5 Chi-square test on predictive discrete variables
Table 6 Logistic regressions on default probability variables

LIST OF ABBREVIATIONS

AFM Dutch Financial Authority
DNB Dutch Central Bank
EU European Union
FCA Financial Conduct Authority
FICO Fair, Isaac and Company
Fin-Tech Financial-Technological
LC Lending Club
P2O peer-to-organization
P2P peer-to-peer
PCC Pearson Correlation Coefficients
SME Small & Medium-sized Enterprise
SRCC Spearman’s Rank Correlation Coefficients
UK United Kingdom

1: INTRODUCTION

1.1 P2P Lending

P2P Lending can be defined according to Tyler Aveni (2015) as “the loan-making between borrowers and lenders who are directly matched via online marketplaces”. A fast-growing part of the world’s economy involves peer-to-peer (P2P) based transactions. These marketplaces make it possible to obtain goods and services, coordinated through community-based services (Hamari et al., 2015). The common factor these P2P services share is the disintermediation of offline-based middle layers, which are replaced by data-driven technologies that create faster and less expensive solutions (Zervas et al., 2015). As part of this development, P2P lending fully emerged after the 2007-2008 financial crisis and is rapidly changing the way banking works. By connecting borrowers and lenders directly through the Internet, and thereby further disintermediating traditional banking institutions, P2P lending opens up investment possibilities that were previously available exclusively to large financial institutions, such as consumer and Small & Medium-sized Enterprise (SME) loans (Morse, 2015). The development of P2P Lending fits well in the current age of Financial-Technological (Fin-Tech) companies that disrupt markets through disintermediation by using the power of the Internet. P2P marketplaces manage to provide lower interest rates to borrowers and higher (potential) returns to investors on the platform. Due to their more efficient processes, these actors have become serious challengers of the traditional banking industry (Meyer et al., 2007).

1.2 Regulatory challenges

Offline-based services are often subject to industry-specific regulations. These traditional players often deal with specific taxes, regulations and moral principles. Online-based P2P platforms blur the line between personal help and professional service provision. Currently, P2P platforms often do not fall under the heaviest regulatory licenses. Cohen & Sundararajan (2015) state that regulators across the globe face difficulties in regulating these new types of services. The resulting regulation may slow the growth of employment associated with the exponential growth of P2P services. On the other hand, Verhaar & van Zaal (2016) state that it is essential for P2P platforms to apply for heavier regulatory licenses. They believe that, when platforms hold these types of licenses, borrowers and lenders are better off in the long run because clear rules and codes of conduct are implemented. At the same time, they faced many problems when applying for the heavier licenses at the Dutch Financial Authority (AFM) and the Dutch Central Bank (DNB). Verhaar & van Zaal’s application process took more than 1.5 years, and they conclude that the regulatory institutes need to be better aligned in order to accommodate these new service providers more efficiently. The British Financial Conduct Authority (FCA) stands in contrast to these bureaucratic procedures of the Dutch authorities. This regulatory institute created an Innovation Hub where innovative financial services can apply for special treatment with respect to regulatory licenses. In this case, the FCA collaborates with these parties to seek tailored solutions that stimulate the growth of financial innovations. Furthermore, in order to stimulate the growth of P2P investments, British retail investors are able to deduct losses on P2P financed assets from their taxable income (Financial Conduct Authority, 2016).

P2P-platforms avoid complex asset engineering processes such as loan term transformation, securitization or loan collateralization. As an example, Funding Circle, a P2P lending platform that is active in multiple countries, states that critics argue that alternative financing comes with large risks.1 P2P-platforms are regulated less strictly than banks, because they do not have billions of dollars on their balance sheets. The difference in capital requirements and regulations does not provide a level playing field. Funding Circle disagrees and states that P2P-platforms are not subject to systemic risk, because they do not create money and do not have a transformation function. This discussion indicates that regulatory frameworks differ among countries.

FG Lawyers (2016), who published the report “Crowdfunding crossing borders” in March 2016, support this. Their report states that varying regulatory frameworks hinder the growth of cross-border activities in the European Union (EU). The report focuses on the liability risks that are associated with cross-border Crowdfunding investments. First, regulators interpret differently whether investments in limited liability companies are considered a security. The Swedish financial regulator, for example, does not consider this an investment. Furthermore, lending-based Crowdfunding distinguishes between a credit provider and a credit intermediary; the difference lies in whether the platform itself provides loans to semi-anonymous consumers. Bekaert (2014) states that cross-border Crowdfunding activities depend on the applicable taxation rules, where special agreements between EU member states have to be studied to calculate the net returns from foreign Crowdfunding investments. Combining this set of barriers, it can be stated that the EU Crowdfunding market has to overcome certain hurdles to catch up with the developments that take place in the United States (US) and the United Kingdom (UK). One of the main drivers of the varying growth rates is the information provision by the most important actors in this market. For example, major platforms such as Lending Club (from now on LC), Funding Circle (US, UK, EU) and Zopa (UK) provide a loan book with information about the loans they settled in past years. These datasets provide clear information about the loans that were delinquent, charged off or had belated payments. This standardization of information makes it possible for professional investment companies to build investment strategies in this relatively new financing alternative. This development is reflected in the level of securitization and the emergence of investment advisory services like Lending Robot and PeerTrader. To further explain this driver of volume growth, Appendix 1 contains a table that displays the most important lending platforms in the Netherlands. This table gives a clear picture of the varying financial information levels in this market. Analyzing this overview, it can be stated that, in the current market situation, it is hard for professional investors to develop investment strategies and to securitize these types of loans based on the existing information levels.

Besides the investor view on market development, regulators could stimulate stable market growth by requiring standardized information levels. Hartmann & Goodhart (1998) published a book that presents the main implementations for financial regulation. The different natures and costs of regulations are discussed, but more distinctively new techniques for risk management are shared. P2P platforms are not subject to balance sheet risk. Regulation of this risk includes, for example, rules for minimum capital requirements, limits on leverage ratios and liquidity reserves. Due to this different nature, new regulatory techniques are required to assess the stability of a P2P platform. Lending platforms like LC and Funding Circle (US, UK, EU) came into existence less than a decade ago, but have already mediated billions of dollars in consumer and SME loans. Since these types of platforms are highly dependent on client engagement, they invest heavily to address the systematic misfits between the securities that they offer and the current regulations. They centralize activities associated with commercial services such as branding, trust and payments. On the other hand, they decentralize activities that have to do with pricing, supply infrastructure and service provision (Koopman & Mitchell, 2014). These renewed processes of loan settlement require regulators to review systems without a science-backed manual in order to stimulate sustainable growth for this market.

1.3 Information asymmetry

P2P exchanges are characterized by classical information asymmetry problems. That is why existing literature on P2P mainly focuses on information asymmetries and the behavior of market participants who adopted P2P in its early stage. For example, Freedman and Jin (2011) argue that early lenders did not fully understand the market risk, and show that lender “learning by doing” can reduce risk over time. They predict that the market will exclude more and more subprime borrowers and will evolve toward serving a large fraction of the population served by traditional credit markets.

Furthermore, Everett (2015) addresses unobserved lenders’ risk characteristics that arise on these lending platforms. He measures the impact of adverse selection, moral hazard and hold-up problems in this industry. P2P financed assets attract private and public investors, each bidding on loans based on the same set of variables. Although a clear distinction cannot be made, a certain level of unobserved risk characteristics cannot be mitigated. P2P lending platforms provide a variety of influential variables, but since this field of finance is relatively new, a clear set of variables to mitigate information asymmetry problems has not yet been established. Xiaoguang & Yi (2011) also find that regulatory frameworks require significantly varying information levels with respect to historic financial statements and grading systems.

Not much research has been done on the investment side of P2P, because of the lack of mature financial data. Since the vast majority of actors in this market were established less than a decade ago and the average loan term is 40 months, science-based research on minimum information levels is scarce. Retail investors drive the first growth phase of this market. In order to maintain existing growth rates, more transparency is required for institutional investors to enter the marketplace. Greater transparency will also stimulate cross-border investment activity.

1.4 Relevance

Currently, regulators do not have a predefined set of asymmetry-lowering variables that could be required from P2P intermediaries. Prior research focuses only on certain types of information asymmetries in P2P Lending, for example herding behavior (Zhang, 2012), beauty (Gonzalez et al., 2014), age, gender and racial disparity effects (Pope, 2011) and the failure to transform the provided loan statistics into the right decisions (Ruiqiong & Junwen, 2014). The relevance of this thesis lies in the constitution of a clear variable set that lowers the overall existence of information asymmetries. This adds value in two ways. First, sophisticated investors become able to build an investment strategy based upon variables that significantly influence specific investment risks. Second, the categorization of information makes it easier for regulators to monitor this market and to distinguish various quality levels of platform-specific procedures. This research provides investors and regulators with a conceptual framework to require baseline specifications from P2P lending platforms. The empirical results are based on the loan book of LC, the largest US P2P lender, which provides a dataset large enough to exclude loans that are still repaying. Furthermore, the number of defaulted loans is large enough to determine the main drivers behind this subsample. Since public data in this market is scarce, this dataset provides the best way to empirically test the existence of asymmetries in P2P lending and what is needed to minimize them.

1.5 Hypothesis development

The relevance of information on P2P lending platforms is a relatively new field of research. The main reason is that data about P2P financed loans, in particular whether a loan is fully repaid or charged off, were not available until recently. This thesis contributes to a better understanding of the risks that come with investing in P2P assets. A better understanding of these risks helps regulators to implement tailored frameworks for P2P financed assets. Therefore, the research question is: “What type of data should be displayed at a P2P lending platform in order to lower the existing levels of information asymmetry?” To answer this question, the following two null hypotheses are formulated:

H0: Specific loan details displayed at Lending Club do not lower information asymmetry levels

H0: Specific loan details do not have predictive power toward defaults

The corresponding alternative hypotheses are as follows:

H1: Specific loan details displayed at Lending Club lower information asymmetry levels

H1: Specific loan details have predictive power toward defaults

To test these hypotheses empirically, relevant loan specifications need to be determined from LC’s loan book. The answers to these hypotheses are valuable for two reasons. First, they deliver insights for lenders to better review the associated risks when investing in a specific loan. Secondly, P2P lending intermediaries can use the findings of this thesis to increase the transparency level on their platform by providing certain variables that lower the existing asymmetry problem. This way, a better functioning market mechanism is enhanced.

The interest rate and sub-grade assigned to a loan are often the most influential factors that determine the risk. They provide a structured and understandable way to determine the risk factors that are embedded in a financial security. Furthermore, the purpose of a loan and the applicable financial ratios are risk components that contain valuable information for potential investors.
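
To illustrate what “predictive power toward defaults” means in practice, the following minimal Python sketch fits a one-variable logistic model of default on the interest rate. It is only an illustration: the ten observations are made up and are not taken from LC’s loan book, and the function names, the single regressor and the simple gradient-ascent fit are assumptions of this sketch rather than the specification used in this thesis.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps a linear score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.05, steps=5000):
    """Fit P(default) = sigmoid(b0 + b1 * x) by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # observed outcome minus predicted probability
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Made-up observations for illustration only (NOT Lending Club data):
# interest rate in percent, and 1 if the loan defaulted, 0 otherwise.
rates = [6.0, 7.5, 9.0, 11.0, 13.0, 15.0, 17.0, 19.0, 21.0, 23.0]
defaulted = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]

# Centre the regressor so that plain gradient ascent converges quickly.
mean_rate = sum(rates) / len(rates)
centred = [r - mean_rate for r in rates]

b0, b1 = fit_logistic(centred, defaulted)

# Predicted default probabilities for the lowest and highest rate in the sample.
p_low = sigmoid(b0 + b1 * (min(rates) - mean_rate))
p_high = sigmoid(b0 + b1 * (max(rates) - mean_rate))
print(f"slope on interest rate: {b1:.3f}")
print(f"P(default) at {min(rates)}%: {p_low:.2f}, at {max(rates)}%: {p_high:.2f}")
```

A positive slope in such a model means that loans with a higher interest rate are predicted to default more often. Rejecting the second null hypothesis would additionally require a significance test on the estimated coefficient, which this sketch leaves out.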

The remainder of this thesis is structured as follows: chapter 2 presents a literature review where previous empirical results are reviewed for this thesis. This review provides information about the online funding environment and analyzes various credit theories that are applicable to this dataset. Chapter 3 provides a description of the loan data that are used for this research. Chapter 4 describes the methodology where statistical tests are explained and motivated. This comes together with the first descriptive statistics. Chapter 5 presents empirical results of this study and chapter 6 presents the conclusions on the formulated hypotheses and answers the research question. Lastly, chapter 7 provides a discussion for future research and the limitations of this study.

2: LITERATURE REVIEW

2.1 Online environment

2.1.1 Crowdfunding

Crowdfunding is a collective effort by consumers who network and pool their assets together, usually via the Internet, in order to invest in and support efforts initiated by other people or organizations (Ordanini et al., 2011). The Crowdfunding model has a much longer history than is often known. The construction of the Statue of Liberty’s pedestal was paid for by small donations from the inhabitants of New York. This campaign, which took place in 1885, has become the broadly used example of how Crowdfunding has always been around (Gwynne, 2012).

2.1.2 Environmental framework

A distinction that should be made in order to understand the online funding environment is whether assets are allocated to persons or to organizations. The allocation of assets from one individual to another individual is called P2P. When a person offers certain assets to an organization, the transaction is categorized as peer-to-organization (P2O). This distinction is important because the legal interpretation and regulatory frameworks differ fundamentally between these two domains.

The phenomenon Crowdfunding covers the domains of non-financial and financial rewarded propositions. Both domains can be subdivided into two subcategories. The first subcategory of non-financial rewarded propositions is donations. This type of Crowdfunding consists of funding non-profit organizations or creative collectives. For P2O transactions, this is called Crowd Donating and for P2P-based transactions, this is called P2P Donating. The second subcategory of non-financial rewarded propositions is product-based funding. A famous platform in this subcategory is Kickstarter, where people can buy products, often before they are actually brought into production. When a transaction from a person to an organization takes place, it is called Crowd Purchasing; in case of a P2P transaction, it is called P2P Purchasing.

The second domain covers the financial-rewarded propositions and is often referred to as the “Crowdfinance” domain. Here, too, a distinction can be made between two subcategories. The first subcategory covers debt-financing activities. This specific category has many synonyms, such as Marketplace lending, Crowdfinancing and P2P financing. When an individual funds an SME, it is called Crowd Lending. When a person lends to another person, it is called P2P Lending. This part covers the lending activities for consumer loans. The second subcategory of financial rewarded propositions is investing, where people can obtain a profit share or equity in start-ups and SMEs through online platforms. When a person invests in an organization, it is called Crowd Investing. When an individual invests in another individual, it is called P2P Investing. An overview of the online funding environment is graphically displayed in Figure 1. This study focuses on consumer loans settled by LC, which means that this research is focused on P2P lending. This part of the online funding environment has the largest share measured in market volumes (Mollick, 2014).

Figure 1: The online funding environment graphically displayed

[Figure: the online funding environment. Crowdfunding covers non-financial reward (Donation, Product) and financial reward, or Crowdfinance (Debt, Profit share, Equity). P2O forms: Crowd Donating, Crowd Purchasing, Crowd Lending, Crowd Investing. P2P forms: P2P Donating, P2P Purchasing, P2P Lending, P2P Investing. Retrieved from Zaat, 2011.]

2.2 Determinants of P2P finance growth

2.2.1 Operational P2P processes

This research focuses on P2P lending marketplaces that intermediate within an online-based environment enabling various forms of collective intelligence, actions and resources. P2P lending systems have emerged as a disruptive technology that has threatening implications for large financial institutions. These lending systems provide the possibility of social identity and personal transparency by providing specific loan information (Feller et al., 2016). The processes of these platforms are relatively simple. They provide connections between borrowers and lenders for loans that are usually less than $1 million. The loans are displayed in a standardized way at a competitive rate, so that investors receive good returns and borrowers get access to money in a fast and efficient way (Liu et al., 2015). The process is as follows: loan applicants provide information that influences their creditworthiness. This consists of their historic and current wealth position, the loan purpose and the total amount that is needed. Once a loan is displayed on the platform, investors allocate a part of the required amount to it. Once the loan is settled, it is paid back like a traditional loan, with the interest rate set by the platform’s grading system.

2.2.2 Transaction costs

Campbell & Kracaw (1980) explain financial intermediation with a transaction costs approach. They state that institutions bear evaluation costs before a gain can be reported. Once a loan is settled, the costs of monitoring and possibly recovering are also for the mediator’s account. Since lending platforms do not collect deposits, these types of intermediaries are not subject to constraining international capital requirements, as stated in the introduction. P2P financed loans are not recorded on the platform’s balance sheet, which makes platforms significantly more capital efficient. As an example: a traditional bank that lends $100 billion per annum needs to hold liquid reserves amounting to approximately $10 billion, whereas a P2P platform needs almost no capital to settle this amount of loans. Furthermore, these parties do not face troubles with the coexistence of short- and long-term debt (Desai, 2015).

The automated processes of platforms evaluate an applicant’s creditworthiness much faster, lowering the costs that traditional lenders face. Saunders & Schumacher (2000) research the factors that influence interest rate margins at large financial institutions. They find that the most influential factor in this process is the level of operating costs. The resulting pressure on the operating margin is translated into lower saving interest rates and higher lending costs. Lin et al. (2013) found that the improved operating margins at P2P platforms are partially transferred to the lender, who receives higher interest rates, and to the borrower, who has lower interest costs compared to traditional lenders. Thus, it could be stated that loans originated through these marketplaces create more societal value for each participant involved.

2.2.3 Credit rationing

Jaffee & Russell (1976) researched credit rationing in markets with imperfect information. Credit rationing can be approached as an externality interrupting the theory of efficient markets. The most basic understanding is that demand should equal supply. If demand exceeds the existing supply, prices will increase until the new equilibrium price is reached. Credit rationing describes a “temporary disequilibrium” where exogenous reasons do not fully explain the rationale of a perfectly working market mechanism. Translated to consumer loan markets, lower graded loan applicants are not able to receive a loan, although they are willing to pay the interest rates that are reasonable for the risk they represent. P2P platforms partially solve this problem by providing an audience for financially excluded people, who can now present their proposition to a large number of lenders. This is especially interesting in economic downturns, where credit rationing is even more present (Jacklin & Bhattacharya, 1988).

2.3 Credit Risk in P2P Finance

2.3.1 Profits from lending activities

Pareto was a nineteenth-century economist who first reported the observation that about 80% of wealth was concentrated in about 20% of the population. This so-called Pareto Principle is present when analyzing the profitability of banking customers (Duboff, 1992). Duboff concluded that the vast majority of profits come from large clients and that the statistical distribution of financing profits is fat-tailed. This means there is a large part of the sample that is not necessarily profitable. The majority of clients situated in the tail of this distribution are SMEs. These types of loans are less profitable because of the relatively high operating costs that come with them. Traditional lenders have uniform processes where the largest profit comes from the largest transaction number. This means that the same administrative tasks have to be done, whether the client applies for a loan of $100,000 or $10 million. However, the profit margins on these two loans differ significantly.

What can be concluded is that P2P financed loans are situated in the long tail of the distribution and that these new intermediaries do not directly jeopardize the level of profitability for traditional lenders. Traditional financial service providers have standardized systems to assess loan applicants where the majority of the profits come from the loans with the highest transaction number. P2P platforms work more efficiently and have highly automated processes, which allows them to serve consumers and SMEs better.

2.3.2 Credit risk with P2P assets

Credit risk can be defined as the sum of any potential or real factor that influences the creditworthiness of a borrower. Creditworthiness is defined by the borrower’s ability and willingness to repay. In order to categorize borrowers’ risk, credit scores exist. These scores describe someone’s creditworthiness, often translated into the probability of default during the time to maturity (Altman & Saunders, 1998).

The credit risk in P2P financed assets is transferred to each individual lender. Since there are no guarantees on these loans, investors bear the costs in case of a bankruptcy. The credit scores used by LC can be regarded as reliable, since LC makes use of credit risk models that are also used by the majority of traditional lenders when assessing a person’s creditworthiness (Verstein, 2011). An example of these credit-scoring models is the Fair, Isaac and Company (FICO) measure. This measure is widely used by banks and credit grantors and assesses a consumer’s payment history, debt burden, length of credit history and type of credit.

2.4 Information Asymmetry

2.4.1 Classic asymmetries in a new market

In 1970, George Akerlof wrote an article about the market for “Lemons”. This paper examines information asymmetries that arise when one party is better informed than the other. As an example, he uses the market for second-hand cars, where a consumer cannot distinguish a low-quality vehicle from a high-quality vehicle. In this case, consumers are willing to pay an average price. This indicates that, in the long run, the dealer would only sell low-quality cars in order to stay profitable. This type

Jaffee & Russell (1976) also refer to this type of failure that leads to suboptimal outcomes. They state that the existence of asymmetries is one of the main concerns in credit markets. This also holds for P2P financed credit assets. Borrowers are always better informed than investors, which leads to a situation where lenders are at a disadvantage. According to Healy & Palepu (2001), information asymmetries are one of the primary reasons financial intermediaries exist. Since it is hard to assess a borrower’s probability of default, credit risk expertise is needed. A theoretical consequence is that asymmetries lead to adverse selection when a clear distinction cannot be made.
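
The adverse selection mechanism described above can be sketched in a few lines of Python. This is a deliberately simplified illustration and not Akerlof’s full model: it assumes that buyers offer the average value of the cars currently on offer, that a seller withdraws as soon as the car is worth more than that offer, and it uses hypothetical car values. The function name `unravel` is likewise an assumption of this sketch.

```python
def unravel(values, max_rounds=10):
    """Simulate a simplified adverse selection spiral.

    Each round, buyers who cannot observe quality offer the average value
    of the cars still on offer. Sellers whose car is worth more than that
    offer withdraw, which lowers the average in the next round.
    Returns the offered price per round and the cars left on the market.
    """
    on_offer = sorted(values)
    prices = []
    for _ in range(max_rounds):
        if not on_offer:
            break
        price = sum(on_offer) / len(on_offer)
        prices.append(price)
        remaining = [v for v in on_offer if v <= price]
        if remaining == on_offer:
            break  # no seller withdraws any more: the market has settled
        on_offer = remaining
    return prices, on_offer


# Hypothetical values of five second-hand cars, known only to the sellers.
car_values = [1000, 2000, 3000, 4000, 5000]
prices, left = unravel(car_values)
print("offered price per round:", prices)
print("cars still on the market:", left)
```

Under these assumptions the offered price falls round by round until only the lowest-quality car remains. Applied to P2P lending, the same logic suggests that good borrowers leave the platform if lenders cannot tell them apart from bad ones.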

2.4.2 Information overview

The solution to this fundamental problem is a perfect alignment of interests and information. The only way to achieve this is transparency. Although P2P intermediaries are characterized as innovative service providers, they collaborate with existing credit rating agencies in order to lower the existing asymmetries. Miller (2015) constructed a natural experiment to study the relationship between the information provided and defaults in consumer credit markets. He finds that the provision of information drastically lowers the chance of default for high-risk loans. Furthermore, he finds that more information improves the screening done by lenders. This implies that lending platforms facilitate market growth and a better working mechanism by becoming more transparent about their processes and information provision.

Nowak et al. (2015) also find that more detailed information leads to an increased chance a loan will be funded. This is combined with an improved performance of the loan during its term. Thus, providing more specific information on a P2P lending platform lowers information asymmetries.

However, there are suboptimal results for both sides. Providing too much detailed information about a loan makes it harder for investors to identify the most influential determinants. Nowak et al. (2015) confirm that a clearly defined set of required variables would lead to better investment decisions by lenders.

2.5 Default prediction in P2P Lending

2.5.1 Credit ratings

P2P lending platforms rely on scoring models from third parties such as FICO. Grades from these models are associated with varying credit risk rates. They relate to certain levels of default probabilities where riskier loans have higher default predictions compared to less risky loans. The interest rates represent the specific risk factors that are associated with each loan grade. The value of the spread between a risk-free rate and interest level of the loan describes the riskiness of the security. The riskiness of a security is linked to the chance of bankruptcy during the loan’s term.

2.5.2 Default prediction models

A model that explains credit risk is the structural model of Merton (1974). In this model, he addresses credit risk using the option pricing theory of Black & Scholes. He states that the value of debt depends on three main items. The first is the required rate of return on the investment. The second is a summary of loan-specific items, where the main drivers are maturity date, interest rate, call terms and seniority. The last item focuses on the chance that the borrower is not able to repay its debt. The probability of default is explained by the structure of the borrower’s liabilities, combined with the volatility of the asset values. The number of derivations from this model is enormous, but in most cases the default probability is randomized and related to the creditworthiness of the borrower. Furthermore, two considerations are important when implementing this model: the external processes during the loan’s duration and the process of recovery (Delianedis & Geske, 2003).
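To make the structural approach more concrete, the following minimal sketch computes the risk-neutral default probability in the textbook formulation of the Merton model, N(-d2). The function names and all example inputs (asset value, face value of debt, asset volatility, risk-free rate and horizon) are illustrative assumptions and are not taken from this study.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def merton_default_probability(asset_value, debt_face_value, asset_volatility,
                               risk_free_rate, horizon_years):
    """Risk-neutral probability that assets fall below the debt at maturity."""
    d2 = (math.log(asset_value / debt_face_value)
          + (risk_free_rate - 0.5 * asset_volatility ** 2) * horizon_years) \
        / (asset_volatility * math.sqrt(horizon_years))
    return norm_cdf(-d2)


# Hypothetical borrowers with identical assets but different liabilities.
pd_low_leverage = merton_default_probability(100.0, 50.0, 0.30, 0.02, 3.0)
pd_high_leverage = merton_default_probability(100.0, 90.0, 0.30, 0.02, 3.0)
print(f"low leverage:  {pd_low_leverage:.3f}")
print(f"high leverage: {pd_high_leverage:.3f}")
```

With the same asset volatility, the borrower with the larger liabilities ends up with the higher default probability, which is the mechanism the model describes.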

2.5.3 Consumer credit models

Parameters predicting defaults in consumer credit are modeled in various ways. Credit risk models for consumer credit are based on corporate default models, where the value of the firm’s assets is replaced by characteristics of the individual borrower. Crook et al. (2007) describe a theoretical consumer credit model that finds its origin in option pricing theory. Taking Merton’s model as a basis, they apply this model to consumer credit risk. Most consumer credit risk models are based on theoretical models that are not able to capture all the variables needed to arrive at the optimal score. To understand how a credit score in consumer credit is constructed, a clear distinction needs to be made. Credit scoring models can be subdivided into two approaches: statistical and judgmental. The statistical approach relies on historical data from previously granted loans. By combining the outcomes of the separate data points, a prediction of default can be formulated. The judgmental approach is based on the expertise of credit analysts and is of use when there is a lack of data to evaluate someone’s creditworthiness. Combining both types of models would lead to the most effective scoring.

Gonzalez et al. (2014) examined the effects of perceived attractiveness in the form of general characteristics such as gender and age. They find a so-called “Beauty premium”, where more attractive individuals are more likely to secure a loan on P2P lending platforms. This implies that the use of the judgmental approach could lead to suboptimal results, since external factors like attractiveness influence the final decision to fund a borrower. Lee & Lee (2012) studied the behavior of lenders on P2P lending platforms. They empirically tested herding behavior on P2P platforms and find strong evidence of herding, with a diminishing effect as the bidding on a loan advances.

2.5.4 Consumer creditworthiness determinants

The probability of default is influenced by a large number of variables. For this research, only the main factors are discussed. The purpose of a loan is considered to be one of the drivers of default probability (Vasicek, 2002). A loan to start a business has a different risk profile than debt refinancing or the purchase of a car. Many findings about small business failure rates have been presented, but a reliable benchmark cannot be determined. Reported rates vary from 40% of businesses that did not survive the first three years (Gorgievski et al., 2011) to 60% that went bankrupt in the first five years of their existence (Okolie, 2004). Although these findings indicate that these investments are risky, this does not necessarily mean that these default rates can be extrapolated to the P2P lending market, since the composition of industries, geographies and information provision is different. Compared to small business loans, car loan defaults are low, at less than 4% according to Agarwal et al. (2008). This indicates that the purpose of a loan on P2P lending platforms is an important determinant of the riskiness of the offered loan. Another important determinant of defaults is the size of the loan. This is a widely discussed topic that can be summarized as follows: some empirical studies show that risk decreases as the loan size increases (Narajabad, 2012; Chatterjee et al., 2007), but other studies show that the opposite also holds. Salas & Zambrano (2004) explain that there is a negative correlation between the loan size and the specific risks. They argue that the larger a loan is, the stricter the due diligence controls are before it is granted. Besides this negative relationship, it is important to incorporate the individual’s ability to repay the debt and the part of the loan that would be lost in case of a default.

The third determinant describing someone’s creditworthiness is the credit history. This determinant is said to be more predictive of defaults than certain financial ratios from a firm’s balance sheet are for corporate valuations (Altman & Saunders, 1998). In scoring models like FICO, credit history makes up the largest part of the overall score (Allen et al., 2004). Credit history can be subdivided into different parameters such as the number of credit cards, the length of credit history, the number of inquiries at other credit institutions and the size of other debts such as mortgages. Altman et al. (2005) categorized scoring models like FICO into four C’s of credit: Character is a measure of the borrower’s reputation, Capital summarizes one’s financial position, Capacity assesses the dynamics of income figures and, lastly, Collateral describes the overall economic situation when a loan is requested.

P2P financing is a new way of providing loans. It is expected that the factors that usually predict loan default can be extrapolated to this market. However, since the sample of consumer loans on Lending Club is not representative (it is less than 1% of the total US consumer credit market), it cannot be stated that the variables presented earlier are directly applicable to this new financing alternative.

2.6 Conclusions

The exponential growth of this financing alternative is driven by three main factors. The first driver is the efficiency of the platforms’ processes compared to traditional lenders. The second and perhaps most important driver is the capital efficiency of online-based lenders. Since these financial intermediaries do not bear balance sheet risk, they are more flexible and capital efficient. This results in lower operating margins and better rates for lenders and borrowers. The third driver of the exponential market growth can be derived from the theory of credit rationing, where borrowers in the long tail of the loan distribution have the possibility to receive capital on acceptable terms that traditional lenders could otherwise not provide.

Information asymmetries are one of the fundamental reasons financial intermediaries exist. Asymmetries can be minimized through transparency and a perfect alignment of interests. Thus, providing a healthy level of information leads to a more efficient market mechanism. Credit ratings classify the risk of default in an understandable manner. Structural credit models are based on returns, loan specifications and the probability of default. This applies to both corporate and personal loans.

The purpose and size of the loan, combined with credit history, are the main drivers for predicting a person’s creditworthiness. Based on existing literature, various implications are given. Since P2P lending is a relatively unexplored field in finance and represents a small fraction of the total lending volume, the insights from the credit literature are extrapolated to this market.

3: DATA

3.1 Data set description

The dataset for this research consists of all the loans that were issued via LC’s platform from June 2007 to December 2015. This platform is the largest US P2P lender and is listed on the New York Stock Exchange. Since December 2014, its shares have been traded daily under the ticker LC. A subsample from its loan book is created over the period January 2008 to December 2012. The consumer loans that were issued in 2007 have different loan specifications and are therefore excluded from the sample. The duration of a loan is either 36 months or 60 months. For this research, only the 36-month loans have been selected, since most of the longer-term loans are still being repaid. Furthermore, this research does not take late payments into account, since the focus is on the final status (whether a loan defaulted or not). The information on loans that were arranged in 2013 or later will become available in 2016 or in the near future. Thus, these loans are also excluded from the subsample. After these adjustments, the total number of loans (N) amounts to 31,465.

This loan book contains 109 variables that can be subdivided into three main focus areas. The first focus area concerns specific loan details: the loan’s grade, its interest rate and its main financing goal. The second group of variables focuses on the borrower’s personal financial situation. This covers personal delinquency statistics, a variety of housing situation statistics and the annual income. The last part of the selected data focuses on the borrower’s credit history and indebtedness. This covers financial risk measures like the debt to income ratio, the number of open accounts (how much debt a borrower has during the application process) and other historic inquiries.

Table 1 provides the selected variables for this research. The variables have been selected based on the related literature in the previous chapter, where certain variables significantly contribute to explaining the probability of default and therefore lower existing asymmetries. The first variable is the Loan Purpose. LC classifies fourteen different motivations to apply for a loan on its platform. The most common is Debt Refinancing and the least common is a loan for a Wedding. Although LC focuses on consumer loans, it entered the SME loan market in 2015. Because the first of these loans were initiated in 2015, they are currently being repaid and therefore fall outside the subsample. For the purpose of this empirical study, it is assumed that all loans are allocated to consumers. Together with the Loan Purpose comes the total Loan Size as a variable. Besides these two influential factors, the most obvious parameter for a uniform loan classification is the loan’s grade. LC has a grading system that ranges from A to G. Each grade has 5 sub-grades, numbered one to five, which gives 35 possible sub-grades in total. The platform assesses the borrower’s risk profile using data from common scoring models like FICO, but also incorporates other risk modifiers that consumers need to fill in during the application process. Another assessment variable is the Interest Rate. This is compiled by adding a compensation for risk factors to Lending Club’s base rate. In May 2016, the average return on A-graded loans was 6.81% and on G-graded loans 29.54%. An overview of the returns per sub-grade can be found in appendix A.
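The grade structure can be illustrated with a small sketch that maps each of the 35 sub-grades (A1 to G5) to an ordinal rank, so that the variable can be used in correlations and regressions. The function name and the direction of the coding (A1 = 1 as the safest category, G5 = 35) are assumptions for illustration; the study does not state how the sub-grades were encoded numerically.

```python
GRADES = "ABCDEFG"          # seven letter grades, A is the safest
SUBGRADES_PER_GRADE = 5     # sub-grades 1 to 5 within each letter


def subgrade_to_rank(sub_grade: str) -> int:
    """Map a sub-grade such as 'C3' to an ordinal rank between 1 and 35."""
    letter = sub_grade[0].upper()
    number = int(sub_grade[1:])
    if letter not in GRADES or not 1 <= number <= SUBGRADES_PER_GRADE:
        raise ValueError(f"unknown sub-grade: {sub_grade!r}")
    return GRADES.index(letter) * SUBGRADES_PER_GRADE + number


all_ranks = [subgrade_to_rank(f"{g}{n}") for g in GRADES for n in range(1, 6)]
print(len(all_ranks), all_ranks[0], all_ranks[-1])
```

An ordinal coding of this kind keeps the ordering of the grades but assumes nothing about the distance in risk between neighbouring sub-grades.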

The second field of variables is focused on the borrowers’ personal characteristics. The first variable that is selected for this study is the Yearly Income. The client provides this number during the application process. Other important variables in this field are the Employment Period and the Housing Situation. LC subdivides the Housing Situation variable into four different options: Ownership, Mortgage, Rent and Other. Continuing to the field of financial information, seven drivers of credit history are selected. These cover the length of the credit line, the total number of earlier requests for credit and the payment history over multiple periods. Furthermore, the following financial ratios are taken into account to assess the borrower’s probability of default: first, the Loan to Income ratio; secondly, the ratio between payments and total income; and lastly, the Debt to Income ratio.
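A minimal sketch of how the three indebtedness ratios could be computed for a single 36-month loan is given below. The function names, the use of a standard annuity formula for the monthly payment and the definition of Debt to Income as other annual debt payments divided by annual income are assumptions for illustration; the exact definitions used by LC may differ. The example inputs reuse the average interest rate and income reported in this study, while the loan size and other debt payments are hypothetical.

```python
def monthly_annuity_payment(loan_size, annual_rate, months=36):
    """Fixed monthly payment of an amortizing loan (standard annuity formula)."""
    monthly_rate = annual_rate / 12.0
    if monthly_rate == 0:
        return loan_size / months
    return loan_size * monthly_rate / (1.0 - (1.0 + monthly_rate) ** -months)


def indebtedness_ratios(loan_size, annual_rate, yearly_income,
                        other_monthly_debt_payments):
    """Return the three ratios as fractions of yearly income."""
    payment = monthly_annuity_payment(loan_size, annual_rate)
    return {
        "loan_to_income": loan_size / yearly_income,
        "payments_to_income": 12.0 * payment / yearly_income,
        # Assumed definition: other debt service relative to income.
        "debt_to_income": 12.0 * other_monthly_debt_payments / yearly_income,
    }


# Hypothetical $10,000 loan at 11.50% for a borrower earning $68,800 per year.
ratios = indebtedness_ratios(10_000.0, 0.1150, 68_800.0, 900.0)
for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
```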

Table 1: Variable selection for empirical study

| Variable | Definition |
|---|---|
| General loan variables | |
| Loan Purpose | The loan's objective. 14 different purposes are formulated |
| Loan Size | The displayed size of the loan |
| Grade | Grading system from A to G. A is considered the safest category |
| Sub-grade | There are 5 Sub-grades per letter, summing up to 35 Sub-grades |
| Interest Rate | Targeted annualized return on loan |
| Personal finance variables | |
| Yearly Income | The borrower's annual income provided during the application |
| Employment Period | The number of years at the current employer |
| Housing Situation | 4 possible situations: Ownership, Mortgage, Rent or Other |
| Creditworthiness variables | |
| Length Credit Line | The period a borrower's credit line was open |
| Historical Requests | The number of historical requests |
| Public Records | Number of refused public records |
| Open Accounts | The number of other credit lines |
| Last Delinquency | The number of months since last delinquency |
| Loan to Income | The ratio between the loan and income level |
| Payments to Income ratio | The annual payments weighted against the annual income |
| Debt to Income ratio | The borrower's debt to income ratio |

Table 1 provides information about the variables selected for this research. A total of 16 variables are used, which can be subdivided into three focus areas: (1) general loan variables, (2) personal finance variables and (3) creditworthiness variables.

4: METHODOLOGY

The set of variables discussed in the previous chapter contains discrete and continuous variables. Discrete variables can only take on a finite number of values, whereas continuous variables can take on an infinite number of values. The parameters that can be labeled as discrete are: Sub-grade, Interest Rate, Housing Situation (Ownership, Mortgage, Rent or Other) and the fourteen different Loan Purposes (Wedding, Credit Card Refinancing, Car Loan, Major Purchase, Home Improvement, Debt Refinancing, Housing, Vacation, Medical Treatments, Moving, Renewable Energy Installment, Education, Small Business financing and Other). Continuous variables in this dataset are the Sub-grade, Interest Rate, Loan Size, Yearly Income, Employment Period, Length Credit Line, Historical Requests, Public Records, Open Accounts, Last Delinquency, Loan to Income, Payments to Income ratio and the Debt to Income ratio.

4.1 Correlation coefficients

This empirical study starts with the calculation of correlation coefficients. These coefficients provide valuable information about the determinants that explain the Sub-grade assigned to a loan. In this case, the loan’s Sub-grade is selected as the dependent variable and the other variables are correlated with this specific classification variable. Since this variable set contains continuous and discrete variables, different methods need to be used to calculate the correlation coefficients. For the continuous variables, the Pearson Correlation Coefficients (PCC) are calculated. This type of correlation measures the linear relationship between two variables. The outcomes of this measure lie between -1 and 1, where 1 is a perfect positive correlation between the two variables and -1 a complete negative correlation.

For the discrete variables, another correlation method has to be used, since these variables can only take on a finite number of values. In this case, the Spearman’s Rank Correlation Coefficients (SRCC) are calculated. The SRCC is equal to the PCC between the ranks of the values of two variables. However, whereas the PCC assesses linear relationships, the SRCC assesses monotonic relationships, whether linear or not. For the SRCC, the outcomes also lie between -1 and 1. When the outcome is close to 1, the observations have a relatively similar rank for both variables. When the outcome is close to -1, the ranks of the two variables are nearly opposite.
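The difference between the two coefficients can be illustrated with a short sketch in plain Python. The data below are hypothetical (five loans with an interest rate that rises faster than linearly with an ordinal sub-grade rank) and the helper names are chosen for illustration only. Tied values receive the average of their ranks, which is a common convention.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: monotonic but non-linear relation.
sub_grade_rank = [1, 2, 3, 4, 5]
interest_rate = [6.0, 8.0, 11.0, 15.0, 25.0]
print(f"Pearson:  {pearson(sub_grade_rank, interest_rate):.3f}")
print(f"Spearman: {spearman(sub_grade_rank, interest_rate):.3f}")
```

Because the relation is perfectly monotonic, the rank correlation reaches its maximum while the linear correlation stays below it.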

Although it is known that LC assigns grades by combining the FICO score, credit variables and other personal information, the exact calculations are not publicly available. When calculating correlation coefficients, linear relationships are assumed. However, West (2000) states that classification algorithms could contain non-linear relationships. Thus, this assumption needs to be treated carefully. To illustrate the added value of these coefficients, the following example is given: different loan applicants could be renting a house, but the age of the borrowers could influence their creditworthiness in a negative (relatively short Employment Period) or positive (relatively long Employment Period) way.

To determine the correlation coefficients, one variable has to serve as the reference variable. In this case, the Sub-grade is selected, and the coefficients of the other discrete and continuous variables show to what extent they are correlated with the Sub-grade assigned to a loan. Table 2 presents the PCC for the continuous variables as described above. These results show that there is a very strong negative correlation between the Interest Rate and the Sub-grade (-0.978). The rest of the correlation coefficients are moderate at best. The next strongest correlation, with the Loan Size, has a value of -0.354.

The SRCC for the discrete variables can be found in Table 3. Here, the Interest Rate (-0.978) is also the most correlated variable. Next in line is the variable Housing Situation: Rental (-0.134), closely followed by the loan purpose Education (-0.131). These descriptive results indicate that the associated Interest Rate best explains the Sub-grade assigned to a loan. Furthermore, it can be concluded that the majority of the continuous and discrete variables selected for this research are only weakly related to the grade assigned to a loan. Although there are different correlation levels between the personal finance variables and credit history variables, this test does not provide clear statistical proof that these variables have a close relation to the Sub-grade assigned.

Table 2: Pearson Correlation Coefficients for continuous variables (N = 31,465)

| | Sub-grade | Interest Rate | Loan Size | Yearly Income | Employment Period | Length Credit Line | Historical Requests | Public Records | Open Accounts | Last Delinquency | Loan to Income | Payments to Income | Debt to Income |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sub-grade | 1 | -0.978 | -0.354 | -0.019 | 0.071 | 0.156 | -0.129 | -0.107 | 0.019 | 0.132 | -0.114 | -0.231 | -0.092 |
| Interest Rate | | 1 | 0.201 | 0.018 | -0.059 | -0.143 | 0.123 | 0.0124 | -0.032 | -0.153 | 0.112 | 0.201 | 0.088 |
| Loan Size | | | 1 | 0.301 | 0.098 | 0.167 | -0.08 | -0.05 | 0.091 | 0.002 | 0.675 | 0.874 | 0.036 |
| Yearly Income | | | | 1 | 0.102 | 0.159 | 0.031 | -0.014 | 0 | -0.253 | -0.327 | -0.342 | -0.126 |
| Employment Period | | | | | 1 | 0.311 | 0.015 | 0.071 | -0.048 | 0.032 | -0.064 | -0.642 | 0.056 |
| Length Credit Line | | | | | | 1 | 0.02 | 0.048 | -0.029 | -0.03 | -0.071 | -0.075 | 0.045 |
| Historical Requests | | | | | | | 1 | 0.023 | 0.117 | -0.045 | -0.048 | -0.048 | -0.034 |
| Public Records | | | | | | | | 1 | -0.751 | -0.543 | -0.029 | 0.033 | -0.002 |
| Open Accounts | | | | | | | | | 1 | 0.019 | -0.063 | 0.003 | 0.003 |
| Last Delinquency | | | | | | | | | | 1 | 0.004 | -0.089 | 0.029 |
| Loan to Income | | | | | | | | | | | 1 | -0.092 | 0.065 |
| Payments to Income ratio | | | | | | | | | | | | 1 | 0.011 |
| Debt to Income ratio | | | | | | | | | | | | | 1 |

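Table 3: Spearman’s Rank Correlation Coefficients for discrete variables

| | Sub-grade | Interest Rate | House Own | House Mortgage | House Rental | House Other | Car loan | Credit card | Debt consolidation | Education | Home improvement | House | Major purchase | Medical | Moving | Other | Renewable energy | Small business | Vacation | Wedding |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sub-grade | 1 | -0.978 | 0.015 | 0.112 | -0.134 | -0.012 | 0.086 | 0.072 | 0.068 | -0.131 | 0.068 | 0.002 | 0.056 | 0.016 | 0.003 | 0.005 | 0.011 | -0.086 | 0.019 | 0.003 |
| Interest Rate | | 1 | -0.092 | -0.128 | 0.135 | 0.017 | -0.076 | -0.043 | 0.111 | 0.021 | -0.071 | -0.004 | -0.076 | -0.015 | -0.002 | 0.011 | -0.071 | 0.081 | -0.023 | 0.001 |
| House Own | | | 1 | -0.217 | -0.301 | -0.021 | 0.023 | -0.041 | -0.025 | -0.008 | 0.039 | 0.003 | 0.030 | 0.021 | -0.012 | 0.029 | 0.002 | -0.021 | 0.001 | -0.011 |
| House Mortgage | | | | 1 | -0.798 | -0.058 | 0.005 | 0.014 | -0.062 | -0.031 | 0.199 | 0.001 | 0.199 | -0.006 | -0.056 | -0.052 | -0.065 | 0.045 | -0.012 | -0.034 |
| House Rental | | | | | 1 | -0.065 | -0.022 | 0.004 | -0.034 | 0.030 | -0.213 | -0.009 | -0.003 | -0.003 | 0.055 | 0.025 | -0.009 | -0.016 | 0.017 | 0.041 |
| House Other | | | | | | 1 | -0.074 | -0.003 | 0.078 | 0.011 | -0.004 | 0.021 | 0.003 | 0.008 | -0.003 | 0.042 | -0.073 | 0.021 | -0.007 | -0.007 |
| Car loan | | | | | | | 1 | 0.159 | 0.031 | 0.018 | -0.054 | -0.043 | -0.039 | -0.021 | -0.021 | 0.053 | -0.064 | -0.043 | -0.017 | -0.029 |
| Credit card | | | | | | | | 1 | -0.126 | -0.042 | -0.121 | -0.032 | -0.011 | 0.003 | -0.076 | 0.001 | -0.071 | -0.064 | -0.041 | -0.030 |
| Debt consolidation | | | | | | | | | 1 | -0.022 | -0.054 | -0.025 | -0.054 | -0.052 | -0.029 | -0.067 | -0.048 | -0.095 | -0.028 | -0.071 |
| Education | | | | | | | | | | 1 | -0.032 | -0.014 | -0.064 | -0.036 | -0.057 | -0.064 | -0.029 | 0.054 | -0.033 | -0.075 |
| Home improvement | | | | | | | | | | | 1 | -0.064 | -0.003 | 0.011 | -0.161 | -0.032 | -0.063 | -0.06 | -0.094 | -0.076 |
| House | | | | | | | | | | | | 1 | 0.045 | -0.001 | -0.012 | -0.074 | -0.05 | -0.019 | -0.011 | -0.029 |
| Major purchase | | | | | | | | | | | | | 1 | -0.013 | -0.034 | -0.031 | -0.05 | -0.032 | -0.056 | -0.011 |
| Medical | | | | | | | | | | | | | | 1 | -0.019 | -0.014 | -0.18 | -0.061 | 0.002 | -0.002 |
| Moving | | | | | | | | | | | | | | | 1 | 0.004 | -0.07 | -0.097 | -0.076 | -0.065 |
| Other | | | | | | | | | | | | | | | | 1 | -0.06 | -0.058 | -0.037 | -0.039 |
| Renewable energy | | | | | | | | | | | | | | | | | 1 | 0.003 | -0.027 | -0.075 |
| Small business | | | | | | | | | | | | | | | | | | 1 | 0.001 | 0.003 |
| Vacation | | | | | | | | | | | | | | | | | | | 1 | -0.002 |
| Wedding | | | | | | | | | | | | | | | | | | | | 1 |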

4.2 Testing for predictive variables

After generating the descriptive correlation coefficients, an analysis is constructed to test for significant differences between two categorized subsamples. The two subsamples are labeled “Defaulted” (N = 3,261) and “Non-Defaulted” (N = 28,204). As stated before, late payments do not influence the outcomes of this analysis, since this research focuses on the final outcome: whether a loan is fully repaid or charged off.

This test is constructed to explore significant differences between the two subsamples. The purpose of this analysis is to identify variables that have a significant influence on the loan’s default probability and therefore deliver valuable insights to platform facilitators to lower the existing level of information asymmetries. When significantly influential variables are displayed, investors are better informed about the potential risks of investing in loans that contain numerous influential default determinants. In this test, too, two different methods are used. For the continuous variables, a simple T-test is constructed where significance is assessed at the 90%, 95% and 99% confidence levels. The same confidence levels hold for the discrete variables, for which a Chi-Square test is used. This test is used for these types of variables because it is a standard statistical tool for measuring the association between two categorical variables. The H0 for both tests is that the predefined variables have no influence on the status “Defaulted”, meaning that when there are no significant results, the H0 is not rejected and the variables do not contribute to the existing asymmetries that are embedded in these loans.
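Both tests can be sketched in a few lines of plain Python. The study does not state which variant of the T-test was used, so the unequal-variance (Welch) form is used here as an assumption. The income figures and the split between renters and other borrowers are hypothetical; only the column totals of the contingency table (3,261 defaulted and 28,204 non-defaulted loans) come from the study.

```python
import math


def welch_t_statistic(sample_a, sample_b):
    """t-statistic for a difference in means, allowing unequal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    var_a = sum((v - mean_a) ** 2 for v in sample_a) / (n_a - 1)
    var_b = sum((v - mean_b) ** 2 for v in sample_b) / (n_b - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical yearly incomes in thousands of dollars.
defaulted_income = [52.0, 61.0, 58.0, 64.0, 55.0]
non_defaulted_income = [70.0, 66.0, 75.0, 68.0, 72.0]
t_stat = welch_t_statistic(defaulted_income, non_defaulted_income)

# Hypothetical 2x2 table; columns are defaulted and non-defaulted loans.
rent_table = [[1600, 12000],   # renters
              [1661, 16204]]   # all other housing situations
chi2_stat, chi2_p = chi_square_2x2(rent_table)
print(f"t = {t_stat:.2f}, chi-square = {chi2_stat:.2f}, p = {chi2_p:.2g}")
```

H0 is rejected for a variable when the p-value falls below the chosen significance level, for example 0.05 for the 95% confidence level.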

4.3 Logistic regression

After testing for variables that show significant differences between the two categories, a logistic regression analysis is constructed. This regression is a powerful statistical tool that estimates the chance that a default will occur. The dependent variable here is the same categorical measure as named above, namely the binary “Defaulted” vs. “Non-Defaulted”. Regressions on discrete and continuous variables are performed separately. A logistic regression provides a suitable way to test the predictive power of each variable. To increase the robustness of the regression outcomes, seven different regressions are performed, each defined as a separate model. The first model runs a regression on only the loan’s Sub-grade. The second model runs a regression on the variable Interest Rate; apart from this model, each of the following regressions contains the Sub-grade as a regression variable. The third model is focused on the fourteen specific loan purposes. Model 4 focuses on the various Housing Situation variables (i.e. Ownership, Mortgage, Rent or Other) and Yearly Income. The fifth model performs a regression on the Historical Requests and Public Records. Model 6 runs a regression on the indebtedness of the borrowers, assessing their Loan to Income and Payments to Income ratios. Model 7 combines all the selected variables for this research, and thus models 1 to 6. By separating the set of variables into different models, the explanatory power of a variable can be better assessed in a narrower model. Before conclusions can be drawn from the outcomes of these regressions, the fit of the models should be checked. To control for this, the so-called goodness-of-fit test of Hosmer & Lemeshow (2004) is included. Here, the observed outcomes are compared to the expected outcomes. Because this model tries to predict risk, including this measure provides valuable information
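A minimal sketch of a single-variable logistic regression and a Hosmer-Lemeshow style statistic is given below. It is not the estimation routine used in this study, which is not specified; the model is fitted here by plain gradient ascent on the log-likelihood, and the data (standardized interest rates with default flags), the number of groups and all function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, learning_rate=0.5, iterations=5000):
    """Fit P(default) = sigmoid(b0 + b1 * x) by batch gradient ascent."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(iterations):
        grad0 = grad1 = 0.0
        for xi, yi in zip(x, y):
            error = yi - sigmoid(b0 + b1 * xi)
            grad0 += error
            grad1 += error * xi
        b0 += learning_rate * grad0 / n
        b1 += learning_rate * grad1 / n
    return b0, b1


def hosmer_lemeshow(probabilities, outcomes, groups=10):
    """Sum over risk groups of (observed - expected)^2 / (n * p * (1 - p)).

    The statistic is usually compared to a chi-square distribution with
    (groups - 2) degrees of freedom.
    """
    ordered = sorted(zip(probabilities, outcomes))
    n = len(ordered)
    statistic = 0.0
    for g in range(groups):
        chunk = ordered[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        observed = sum(outcome for _, outcome in chunk)
        expected = sum(prob for prob, _ in chunk)
        mean_prob = expected / len(chunk)
        statistic += (observed - expected) ** 2 / (
            len(chunk) * mean_prob * (1.0 - mean_prob))
    return statistic


# Hypothetical standardized interest rates and default flags (1 = defaulted).
rates = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, -1.0, 0.0, 1.0]
defaulted = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1]
intercept, slope = fit_logistic(rates, defaulted)
fitted = [sigmoid(intercept + slope * r) for r in rates]
hl_statistic = hosmer_lemeshow(fitted, defaulted, groups=3)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, HL = {hl_statistic:.3f}")
```

A small Hosmer-Lemeshow statistic indicates that the predicted number of defaults per risk group is close to the observed number, which is the comparison of observed and expected outcomes described above.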

5: EMPIRICAL RESULTS

5.1 Predictive variables study

Table 4 contains information about the predictive power of each continuous variable toward defaults. The mean and the standard deviation are included for the whole sample and for each sub-sample (i.e. “Defaulted” and “Non-Defaulted”). A logical conclusion that can be drawn from this simple T-test is that the variable Interest Rate has predictive power toward the probability of default. The average rate on Non-Defaulted loans is 11.50%, whereas the average rate on Defaulted loans is 15.41%. Comparing these two groups, it can be concluded that there is a statistically significant difference between them (P < 0.001). In other words, loans with higher interest rates default more often. As long as the economy is doing well and unemployment rates are stable, fixed interest rates are not a problem. But if the economy moves toward a downturn, this could be problematic. The vast majority of P2P platforms emerged after the financial crisis of 2007-2008 and thus have not faced periods of economic crisis. This is important to know, since defaults are higher in periods of economic downturn. The main reasons for defaults are unexpected expenses, disruption in borrowers’ earnings due to the loss of a job or a business, and the lack of collateral security and personal guarantees (Cognicant, 2014). Samir Desai, CEO of Funding Circle, stated that they stress test their credit risk models for extreme economic downturns. He stated that the default rates would be around 20%, indicating that this investment alternative outperforms the majority of other financial securities.

Another variable that delivers significant results is the Yearly Income. For the total sample, the average annual income was $68,800, and for the Defaulted and Non-Defaulted groups the average income levels were $58,987 and $69,935, respectively. This result is also significant at a 99% confidence level. Furthermore, all the variables linked to the borrower’s loan history and personal indebtedness provide significant results. The variable Open Accounts, which covers other outstanding credit lines, is significant at a 95% confidence level, and the variables Length Credit Line, Last Delinquency, Public Records, Loan to Income, Payments to Income and Debt to Income are statistically significant at a 99% confidence level. This indicates that when the credit history length, the number of inquiries or the number of outstanding accounts increases, the chance that a loan will perform badly during its term becomes higher. The same reasoning applies to the personal indebtedness ratios: an increase in these measures indicates a simultaneous increase in the probability of default. Thus, for the continuous variables, H0 should be rejected, and it can be concluded that personal finance, credit history and personal indebtedness variables have predictive power toward the probability of default.

5.2 Logistic regression results

The categorical distinction between a defaulted and a non-defaulted loan is the dependent variable in this logistic regression analysis. This analysis consists of seven different models, each focusing on a specific set of variables used in this empirical study. Table 7 provides the results of these regressions and shows to what extent each variable contributes to default prediction. The first model is regressed on the Sub-grade assigned to a loan. Results show that this variable predicts 53.60% of all the “Defaulted” cases and is significant at a 99% confidence level. Model 2, which is entirely focused on the variable Interest Rate, also provides statistically significant information about the probability of default. This result is in line with the PCC from Table 2, where the correlation between the Sub-grade and Interest Rate was very strong (-0.978). The third model provides the regression results for the fourteen loan purposes defined by LC. The purposes “Car Loan”, “Major Purchases”, “Vacation” and “Wedding” show statistically significant results at a 95% confidence level. However, the other ten purposes do not provide significant results. This indicates that the majority of the loan purposes do not have significant predictive power toward defaults. Analyzing the outcomes of the four predictive purposes in Table 5, it can be stated that the default rates of these purposes are below average. This indicates that investing in loans with the purposes “Car Loan” (92.3% non-defaulted), “Major Purchases” (91.4%), “Vacation” (89.7%) and “Wedding” (92.4%) would lead to better investment returns. However, it is remarkable that the purpose with the lowest default percentage, “Credit Card” (93.2% non-defaulted), is not among the default-predictive purposes.

The fourth model addresses the personal finance variables. Here, regressions on the Housing Situation and Yearly Income do not provide clear information about the loan’s probability of default during its term. The fifth model regresses the variables that are associated with credit history. The variable Historical Requests is significant at a 90% confidence level, but the other associated variables (Public Records, Last Delinquency) are not statistically significant. This is somewhat surprising, since one would expect that applicants who faced delinquencies in the past are more likely to face delayed payments in the future. The sixth model regresses the indebtedness ratios. Both “Loan to Income” and “Payments to Income” have predictive power when it comes to estimating defaults. These variables are statistically significant at a 95% confidence level. The last model combines all the variables that are used in this regression. In this seventh model, the variables Interest Rate, Historical Requests and the indebtedness ratios deliver significant results. The Interest Rate and Historical Requests are significant at a 99% confidence level, and the Loan to Income and Payments to Income ratios are significant at a 95% confidence level.

Although some models deliver significant results, it is important to analyze the models’ cumulative contribution toward predicting defaults. Comparing the various model contributions, the Sub-grade and Interest Rate are the best predictors of future defaults, with percentages of 53.60% and 56.90% respectively. For models 3, 4, 5, 6 and 7, the predictive power of the variables barely increases. Listed from model 2 onward, the respective percentages are 56.90%, 59.30%, 61.30%, 61.90%, 62.10% and 63.90%.

The best predictive variables after the Interest Rate and Sub-grade are the indebtedness ratios. Combining the Sub-grade with these ratios leads to a predictive percentage of 62.10%. Thus, adding these financial ratios increases the predictive power by 8.50 percentage points. The last model shows that combining all the variables into one regression does not lead to a strong further increase in predictive power. Of its 63.90%, the vast majority can be attributed to the Sub-grade and Interest Rate.
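The increments discussed above can be reproduced with simple arithmetic. The sketch below uses the percentages reported in the text. The figures for models 1, 2, 6 and 7 are stated explicitly, while assigning 59.30%, 61.30% and 61.90% to models 3, 4 and 5 is an assumption based on the order in which they are listed. The variable and function names are illustrative only.

```python
# Predictive percentages per model as reported in the text.
# Assumption: 59.30, 61.30 and 61.90 belong to models 3, 4 and 5,
# inferred from the order in which the percentages are listed.
predictive_pct = {1: 53.60, 2: 56.90, 3: 59.30, 4: 61.30,
                  5: 61.90, 6: 62.10, 7: 63.90}

def gain(pct, base_model, model):
    """Difference in percentage points between two models, rounded to two decimals."""
    return round(pct[model] - pct[base_model], 2)

# Sub-grade alone (model 1) versus Sub-grade plus indebtedness ratios (62.10%).
gain_ratios_vs_subgrade = gain(predictive_pct, 1, 6)

# Sub-grade alone versus the full model with all variables.
gain_full_vs_subgrade = gain(predictive_pct, 1, 7)

# Step-by-step gain of each model over the previous one.
step_gains = {m: gain(predictive_pct, m - 1, m) for m in range(2, 8)}

print(gain_ratios_vs_subgrade)
print(gain_full_vs_subgrade)
print(step_gains)
```

The first difference corresponds to the 8.50 percentage points mentioned in the text, and the step-by-step gains make visible how little each later model adds.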

The most plausible explanation for the poor increase in the predictive percentages is that the last regression model is overloaded with overlapping variables. This is a case of spillover effects, since LC uses the FICO score and other inputs to assign a Sub-grade to a loan applicant. As stated before, the exact grading systems are not publicly available, which makes it reasonable to assume that LC’s grading system already controls for the variables used in this study. The associated risks are translated into the various interest rates, which appear to be the most important determinant in these regressions. Since it is not known how LC grades the various applicants, it is hard to conclude what the actual drivers of defaults are. Nonetheless, these results show that LC’s grading system can be regarded as reliable. Although these results do not deliver a complete overview of the determinants that lower existing asymmetries, this empirical study does identify factors that stimulate a better working market mechanism.
