Re-classify financial constraints : k-means clustering with PCA

Academic year: 2021

University of Amsterdam

Amsterdam Business School

MSc Finance

Master Specialization Quantitative Finance

Re-classify Financial Constraints: k-means Clustering with PCA

Author: Hanlin Shen

Student Number: 10621776

Supervisor: Prof. Tomislav Ladika

Statement of Originality

This document is written by Hanlin Shen who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Abstract

This paper introduces a new quantitative measure for identifying firms’ financial constraints that depends solely on balance sheet and income statement information: the PCA-Clustering approach. Unlike existing measures, the PCA-Clustering approach lets the mathematical model detect the major sources of variation in financial constraints and cluster firms with similar principal components into the same group. The PCA result suggests that cash holdings, R&D expenses and profitability, which are mainly contrasted by the first principal component, contribute the most variation in financial constraints. To evaluate the performance of the PCA-Clustering approach, this paper runs cash flow sensitivity of cash regressions. The significant cash flow sensitivity of cash displayed by the financially constrained firms identified by the PCA-Clustering method supports the validity of this method. For completeness, this paper examines the investment-cash flow sensitivity of constrained firms. However, no convincing evidence is found to support a monotonically positive correlation between the level of constraints and the investment-cash flow sensitivity.

Table of Contents

1. Introduction
2. Literature Review
2.1 Current Methodologies of Measuring Financial Constraints
2.2 Shortcomings of Existing Measures of Financial Constraints
3. Theoretical Framework
3.1 Feature Selection
3.2 Principal Component Analysis (PCA) and Model Optimization
3.3 k-means Clustering
4. Sample and Data
4.1 Sample
4.2 Summary Statistics
4.3 Cross-tabulations
5. Empirical Result
5.1 PCA Result and Interpretation of Principal Components
5.2 Classification Scheme of Financial Constraints using Principal Components
5.3 Clustering Result
6. Discussion
6.1 PCA-Clustering method, and other measures of Financial Constraints
6.2 Robustness
6.3 Investment-cash flow sensitivity
7. Conclusion
8. Reference
Appendix

1. Introduction

The influence of financial constraints on firm behavior is one of the core questions in corporate finance. A large body of literature examines whether and how external financing frictions affect firms’ business decisions. Researchers have proposed various hypotheses under which financial constraints have substantial effects on firms’ investment decisions, choice of capital structure and, for public firms, stock returns.

The widely agreed definition is that financially constrained firms face a wedge between the costs of external and internal financing. The larger the wedge between a firm’s external and internal cost of capital, the more financially constrained the firm is considered to be. In other words, financially constrained firms are more likely to encounter credit frictions. To study the effect of financial constraints on firm behavior, researchers need an approach that can identify the degree of financial constraints accurately. Unlike bankruptcy, however, financial constraints are difficult to observe directly. Empirical researchers therefore rely on indirect measures. The literature has proposed many alternatives, such as investment-cash flow sensitivity (Fazzari, Hubbard and Petersen, 1988), the KZ index (Lamont, Polk and Saa-Requejo, 2001), the WW index (Whited and Wu, 2006), the SA index (Hadlock and Pierce, 2010) and a variety of other measures based on firms’ financial status.

Although various measures exist for identifying the severity of financial constraints, there is still considerable debate over the different approaches. Researchers have examined the performance of the widely adopted measures and, unfortunately, find that these approaches perform “not so well” (Farre-Mensa and Ljungqvist, 2016). Farre-Mensa and Ljungqvist point out that the firms identified as constrained by existing approaches are able to borrow more in response to an increase in local income tax and to a negative shock to the credit supply. Moreover, they find that the responses of “constrained” and “unconstrained” firms to an increase in their demand for external funding differ little. Constrained firms, like unconstrained firms, can raise external funds without difficulty when they require more credit. This is to be expected, since each method is built on specific theoretical and/or empirical assumptions, which may or may not be correct, and is estimated from a particular small sample. Without an accurate identification of financial constraints, it is impossible to give a correct answer to how financial constraints influence firms’ behavior and decisions. From an academic perspective, developing a reliable measure for classifying financial constraints helps to uncover the true influence of financial constraints on firms’ behavior. In practice, it might help outside investors to form a more complete picture of a firm’s financial status.

This article introduces a new data-driven methodology, the PCA-Clustering approach. In particular, this approach classifies financial constraints quantitatively by exploiting accounting data from the balance sheets and income statements of a sample of firms from 1997 to 2016. Moreover, studying the constraint classifications together with firms’ financial characteristics makes it possible to shed light on the mixed evidence concerning financial constraints and to re-evaluate previous findings.

The spirit of the PCA-Clustering method is to extract the major variation in financial constraints from firm accounting data (i.e. principal component analysis) and to group firms with similar principal components together (i.e. clustering). The motivation for running principal component analysis (PCA) before clustering is to reduce the dimensionality of a dataset with many inter-related variables and to detect the main source of variation in financial constraints. At the same time, PCA retains as much as possible of the variation in the original firm accounting dataset. The clustering result is based on the artificially constructed principal components. When the clustering is successful, firms with similar financial constraint status (i.e. high homogeneity) are clustered into one group and between-group heterogeneity is maximized. In this way, the sample firms are segmented according to their financial status.

The PCA-Clustering method proposed in this paper addresses several drawbacks of existing approaches. First, it makes it possible to analyze more variables than any other index. The PCA-Clustering approach evaluates 15 accounting variables covering multiple aspects of firms’ financial characteristics, including firm size, firm age, dividend payment, profitability, asset tangibility, innovation expenses, cash holdings and pension funding. These variables are not randomly chosen: they have been discussed frequently in the previous literature. Some of the variables (e.g. intangible assets, cash reserves) have been shown to be correlated with financial constraints (e.g. Almeida and Campello, 2007; Denis and Sibilkov, 2010), while others (e.g. payout ratio, book asset size) have been used as measures of the degree of financial frictions (e.g. Fazzari, Hubbard and Petersen, 1988; Hadlock and Pierce, 2010). The PCA-Clustering approach summarizes the overall financial status of firms, whereas existing methods evaluate only one or a few aspects. Intuitively, the PCA-Clustering method is more complete than other measures. Second, the general concern about out-of-sample validity is eliminated. In practice, researchers borrow the indices directly rather than re-estimating them on different samples. Several studies have cast serious doubt on the stability of the coefficients (e.g. Whited and Wu, 2006). The PCA-Clustering method, however, is an algorithm that can easily be implemented with standard software packages on different samples, so researchers can re-estimate financial constraints on each sample. Finally, in contrast to the “ad hoc” selection of variables, the mathematical properties of PCA allow the model itself to detect the major variation in financial constraints. The method selects the most important features of financial constraints automatically.

The PCA result reveals that the level of financial constraints can be identified by two critical principal components, which are systematically consistent over time. The first principal component, which is positively related to cash holdings and innovation expenses and negatively related to profitability, explains the most variance in financial constraints. The second principal component, which is positively correlated with firm size and age, contributes the second most important source of variation. A relatively large value of the first principal component combined with a relatively low value of the second suggests a higher level of financial constraints. The clustering is based on the values of the retained principal components and classifies firms into groups according to their level of financial constraints.

After studying the financial characteristics of firms belonging to the different financial constraint categories, this paper casts serious doubt on the validity of leverage as a monotonic indicator of financial constraints. This finding, to some extent, supports the argument advocated by Hennessy and Whited (2007) and by Acharya, Almeida and Campello (2007). The “endogeneity” of leverage might result in a non-monotonic or non-universal relation to financial constraints.

To evaluate the validity of the PCA-Clustering method, this article conducts quantitative tests and robustness checks that provide corroborating evidence. The cash flow sensitivity of cash, introduced by Almeida, Campello and Weisbach (2004), is used. When firms are sorted into constrained and unconstrained groups using the PCA-Clustering approach, the constrained group displays a significant and positive cash flow sensitivity of cash. In robustness checks on alternative samples, the cash flow sensitivity of cash is again positive and significant for the constrained firms identified by the PCA-Clustering approach. This evidence strengthens the credibility of the PCA-Clustering method.

It is difficult to prove that the PCA-Clustering approach is the optimal way of measuring financial constraints. The only feasible analysis is to show its advantages. In this spirit, this paper contrasts the PCA-Clustering approach with four other popular measures of financial constraints: the KZ-index, the SA-index, payout ratio and asset size. The constrained firms identified by the other four methods display a positive and significant cash flow sensitivity of cash; however, the unconstrained firms classified by these measures also exhibit an unexpectedly positive and significant cash flow sensitivity of cash. The evidence provided in this paper indicates that the PCA-Clustering approach performs better in identifying unconstrained firms.

For completeness, this paper reviews the important but contested question of whether investment-cash flow sensitivity is a monotonic indicator of financial constraints. It replicates and modifies the methodology of Kaplan and Zingales (1997) and examines investment-cash flow sensitivity on a broader sample. The findings confirm the critiques raised by various authors (e.g. Cleary, 1999; Kaplan and Zingales, 1997). In particular, no systematically positive and significant investment-cash flow sensitivity is found among the financially constrained firms identified by the various approaches.

The rest of the paper is organized as follows. Section 2 reviews the literature on measures of financial constraints and describes their main drawbacks. Section 3 explains the theoretical framework. Section 4 introduces the sample selection procedure and descriptive statistics. Section 5 presents and interprets the empirical results. Section 6 further explores the relationship between financial constraints and financial characteristics, and examines the validity of the PCA-Clustering method. Finally, Section 7 summarizes and concludes the paper.

2. Literature Review

2.1 Current Methodologies of Measuring Financial Constraints

Before extending the study of how to categorize financial constraints, it is necessary to define them. The common understanding is that firms identified as financially constrained encounter a wedge between the external and internal cost of capital. In a theoretically frictionless environment, firms can freely fund value-enhancing investment projects (Modigliani and Miller, 1958). The real capital market, however, is imperfect. Capital market imperfections, such as information asymmetry (e.g. Myers and Majluf, 1984) and agency costs (e.g. Bernanke and Gertler, 1990), increase the cost of external financing and constrain firms’ investment opportunities. Consequently, some firms forgo attractive value-increasing investments because external funds are unavailable or too costly, resulting in lower future growth and firm value.

The real influence of financial frictions on firms’ investment decisions remains a vital issue in contemporary corporate finance. The first task researchers have to solve is to identify the severity of financial constraints. Various methodologies exist for measuring financial constraints, both qualitative and quantitative, but there is still considerable debate over the different approaches. Generally, there are three types of approaches. Unidimensional methods (i.e. single variable) capture a single aspect of constraints (e.g. Fazzari, Hubbard and Petersen, 1988), multidimensional methods (i.e. indices) summarize various aspects of constraints in one index (e.g. Hadlock and Pierce, 2010; Whited and Wu, 2006), and text-based methods extract information related to financial constraints from firms’ annual reports or financial filings (e.g. Hoberg and Maksimovic, 2014). The goal of existing methods is to infer financial constraints from changes in firms’ funding situation or investment plans (such as postponing investment), from their behavior (such as not paying dividends) or from their characteristics (such as credit rating, firm size, firm age, etc.). The empirical literature debates which of these methods captures financial constraints best, and there is still no winner in this competition.

Among all of these methods, the KZ-index is currently the most popular measure of financial constraints according to Google Scholar citations (Farre-Mensa and Ljungqvist, 2016). The KZ-index was developed by Lamont, Polk and Saa-Requejo (2001). The authors follow the methodology of Kaplan and Zingales (1997) and estimate a logit model of the degree of financial constraints on five accounting variables: cash flow, debt, dividends, market value and cash holdings, each normalized by total assets. The authors implicitly assume that the regression coefficients are stable over time and across samples, and adopt them as the parameters of the KZ-index. A higher score implies that a firm is more likely to be constrained.

Whited and Wu (2006) contribute another popular index for identifying financial constraints. They obtain their index by estimating a structural model that includes cash flow to total assets, a dividend-paying indicator, long-term debt to total assets, firm size, sales growth and industry sales growth. Despite concerns about the stability of the parameters, in practice researchers use the regression coefficients reported in Whited and Wu (2006) as the index parameters instead of re-estimating the structural model on different samples.

Another popular class of methods is text-based. Hadlock and Pierce (2010) offer an updated narrative-based approach: they examine the financial filings (i.e. 10-Ks) of 356 randomly selected companies over 1995-2004 and look for evidence that firms describe themselves as financially constrained. Ball, Hoberg and Maksimovic (2012) follow a similar methodology. They identify financial constraints by machine-reading the financial filings of public firms over the period 1997-2009. Under their classification, firms that mention having recently postponed investment projects are considered financially constrained.

Apart from these multivariate methods, univariate methods are also adopted in the literature. Fazzari, Hubbard and Petersen (1988) identify financially constrained firms by payout ratio. The idea is that not paying dividends suggests potential constraints. Almeida, Campello and Weisbach (2004) examine the cash flow sensitivity of cash using several unidimensional measures of financial constraints, such as firm size, bond ratings and commercial paper ratings. The rationales for their classification schemes are: (1) small firms generally suffer more from information asymmetry and have difficulty accessing the capital market; (2) the credit quality of firms that have never obtained a public bond rating or commercial paper rating has not been validated by the market. Faulkender and Petersen (2006) also suggest that firms without access to bond markets have to borrow from financial intermediaries and borrow less. Additionally, Whited (1992) points out that a rating may reduce the information asymmetry between companies and investors, which implies that firms without a rating tend to be rationed by lenders. Therefore, firms that are relatively small or lack professional ratings are considered financially constrained.

2.2 Shortcomings of Existing Measures of Financial Constraints

Although there are plenty of methodologies for identifying financial constraints, none of them performs well in practice. Farre-Mensa and Ljungqvist (2016) use three different tests to examine the performance of five popular financial-constraints measures. They find that none of the five measures identifies firms that behave as if they were truly constrained. Specifically, they state that public firms identified as constrained by the indices have no difficulty raising debt when there is an exogenous increase in their demand for debt, for example when the corporate income tax increases or their demand for external funding rises. Single-variable methods concentrate on exactly one aspect of financial constraints. Financial constraint, however, is a complicated concept that may be determined by various factors. Univariate measures therefore cannot capture the whole picture and are not adequate measures of financial constraints.

Multivariate measures, such as index-based methods, partially solve the unidimensionality problem and capture more aspects of financial constraints. However, the performance of the indices is still unsatisfactory because of several shortcomings. First, some of the determining variables in the indices are outdated. Several existing studies identify financial constraints based on dividend payments: firms that pay no or few dividends are classified as financially constrained (Fazzari, Hubbard and Petersen, 1988). However, some well-known and profitable companies with promising prospects in the high-technology or pharmaceutical industries, such as Google LLC and Apple Inc., never pay dividends or have low payout ratios. Traditional measures of financial constraints will place these firms in the financially constrained group, which obviously contradicts the ground truth.

Second, one of the greatest concerns about the validity of financial-constraints measures is the stability of parameters across firms and over time. The coefficients of the various financial-constraints indices are estimated from a small sample over a particular period. They are not like physical constants (e.g. the Planck constant), which are stable under all circumstances and in every time period. In practice, however, researchers continue to use the regression coefficients reported in the original paper rather than replicating the methodology and re-estimating the parameters. Hoberg and Maksimovic (2014) point out that reduced-form predictive models estimated from a particular small sample using pre-determined accounting variables might be unstable and not applicable to other populations. In other words, the stability of the index parameters is doubtful. Whited and Wu (2006) also admit that it is difficult to demonstrate the stability of parameters convincingly, both across firms and over different periods. Applying coefficients estimated from one sample to a much larger group of firms in a different period leaves open the question of whether the index is truly qualified to measure financial constraints.

As for the newer methodologies, text-based measures identify financial constraints from the information in firms’ financial filings. They look for evidence in the text of the filings that firms describe themselves as financially constrained. The key words include dividend omission, dividend increases, equity recycling, underfunded pension plans, etc. (Bodnaruk, Loughran and McDonald, 2015). The quality of firms’ disclosures therefore becomes highly important. If firms misjudge their own financial status, the results may be biased. Moreover, firms may deliberately disclose misleading information in their financial filings. Therefore, text-based measures might not always be valid or perform better than peer methodologies.

In short, the limitations of existing measures of financial constraints are as follows: single-variable methods are univariate and incomplete; index-based measures include outdated variables and potentially unstable parameters, which may result in misclassification; text-based methods rely on subjective criteria and may not provide consistent and reliable predictions. It is therefore necessary to develop an objective methodology that captures the most important aspects of financial constraints, adapts to different groups of firms and sample periods without concerns about parameter stability, and delivers consistent predictive performance.

3. Theoretical Framework

This section introduces the model. The main objective is to re-classify financial constraints in terms of firms’ financial characteristics and structure. As mentioned in the previous section, financial constraint is a fairly complex concept that may be determined by many factors. To re-identify financial constraints, it is therefore important to take into consideration as many variables as possible that may be related to financial constraints. For the classification criteria, instead of using a single indicator or a narrow set of accounting variables (e.g. firm size and firm age), it is more plausible to use jointly all the variables that can measure financial constraints, conditional on their availability in the public database. The underlying assumption of the model is that a group of firms that can be regarded as financially constrained (or financially unconstrained) can be identified through their similar within-group financial characteristics and status.

3.1 Feature Selection

Although the model is data-driven, it is still vital to select the variables that should be included in it. Including accounting variables that are irrelevant to financial constraints might introduce bias or even error. The selection of variables is based on the existing literature on financial constraints, drawing on variables that have been shown to identify firms’ financial status directly or indirectly. The categories of financial-constraints features mainly include: firm size (e.g. Erickson and Whited, 2000), firm age (e.g. Hadlock and Pierce, 2010), dividend payment (e.g. Fazzari, Hubbard and Petersen, 1988; Almeida, Campello and Weisbach, 2004), debt level (e.g. Kaplan and Zingales, 1997), cash holdings (e.g. Denis and Sibilkov, 2010), asset tangibility (e.g. Almeida and Campello, 2007), sales and profitability (e.g. Whited and Wu, 2006), pension funding (e.g. Rauh, 2006) and R&D expenses (e.g. Li, 2011). This paper excludes investment from the explanatory variables because of potential reverse causality: financial constraints may influence corporate investment, and vice versa, so including investment in the analysis might result in endogeneity. Appendix 3 reports the detailed accounting variables listed in each category. Taken together, these categories of accounting variables are able to provide a clear-cut discrimination among the different segments in the sample in terms of their financial status.

3.2 Principal Component Analysis (PCA)

Every selected accounting variable in the model might have some ability to predict financial constraints, so there is no reason to exclude any of these possible determinants from the analysis. The methods in the previous literature, which construct indices from a narrow set of accounting variables, do not clarify their variable selection criteria. The variables used in these studies are, to some extent, ad hoc. Researchers hypothesize that certain factors might be determinants of financial constraints and support their arguments with empirical evidence. The drawback is that they focus on only a few aspects of financial constraints and neglect the whole picture. The readily available accounting variables selected by researchers might not contribute the most variation in financial constraints. Consequently, it is difficult to identify the most important source of variation. This leads to an open but crucial question: how to select the most important variables for measuring financial constraints without any prior assumptions.

The solution given in this article is principal component analysis (PCA). The idea behind PCA is to map the original dataset onto a lower-dimensional subspace through a linear orthogonal transformation that condenses correlated variables into a few uncorrelated principal components (Jolliffe, 2002). Geometrically, PCA finds the major axes of variation on which to project the original data. With this method, the number of variables can be reduced efficiently by omitting higher-order components (i.e. essentially noise terms), and the constructed variables explain the data approximately as well as the original dataset, at the cost of losing some variance. This method allows the mathematical model to pick, automatically, the components that explain the most variation in the dataset and carry them into the analysis. Figure 1 demonstrates the intuition of PCA with an example of mapping two-dimensional data onto a one-dimensional subspace.

In particular, PCA satisfies two conditions that help to build a better-performing model: (1) the new artificial variables transformed from the original features “explain” a large proportion of the variation in the original dataset; (2) the newly constructed variables are uncorrelated with each other (i.e. geometrically orthogonal). The first condition is intuitive: if the new artificial variables did not explain the major variation in firms’ financial characteristics, they would not be useful for further analysis. The second condition is also intuitive: uncorrelated principal components summarize firms’ financial characteristics from entirely different aspects. Compared with previous measures of financial constraints, PCA provides more plausible criteria for selecting important explanatory variables, and automatically chooses the factors that account for the most variation in firms’ financial status.
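The two conditions can be checked on a toy example in the spirit of Figure 1, where two-dimensional data are reduced to one dimension. The following is only a minimal sketch and not the implementation used in this paper: the data points are made up, the function name pca_2d is hypothetical, and the closed-form eigen-decomposition works only for two variables (with 15 accounting variables one would normally rely on a numerical library).

```python
import math


def pca_2d(points):
    """Closed-form PCA for 2-D data (toy illustration only)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix, largest first.
    half_trace = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    eigenvalues = (half_trace + radius, half_trace - radius)

    # Unit eigenvectors: pc1 is the axis of largest variance, pc2 is orthogonal to it.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    pc1 = (math.cos(angle), math.sin(angle))
    pc2 = (-math.sin(angle), math.cos(angle))

    # Principal component scores: projections of the centered data onto each axis.
    scores = [(x * pc1[0] + y * pc1[1], x * pc2[0] + y * pc2[1]) for x, y in centered]
    return eigenvalues, (pc1, pc2), scores


# Made-up toy data lying close to the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
eigenvalues, (pc1, pc2), scores = pca_2d(points)

# Condition (1): share of total variance explained by the first component.
explained = eigenvalues[0] / sum(eigenvalues)
# Condition (2): the two score series are uncorrelated (covariance close to zero).
score_cov = sum(s1 * s2 for s1, s2 in scores) / (len(scores) - 1)
print(f"explained by PC1: {explained:.3f}, covariance of scores: {score_cov:.2e}")
```

Because the toy points lie almost on a straight line, nearly all of the variance falls on the first component, so dropping the second component loses very little information.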

Technically, PCA can also improve the performance of the clustering algorithm implemented in the next step. Most clustering algorithms depend on a distance measure between data points. They work inefficiently in high-dimensional space because the data become sparse and distance measures are no longer qualitatively meaningful (Aggarwal, Hinneburg, and Keim, 2001).

Figure 1

Demonstration of PCA

Figure 1 visualizes an example of PCA. A two-dimensional dataset is projected onto a one-dimensional subspace. The first figure in the second column contains the most information (i.e. variation) of the original dataset, which reflects the intuition of PCA. Appendix 1 documents the mathematical terms of PCA.

3.3 k-means Clustering

The ultimate goal of the proposed methodology is to separate the sample into categories based on financial status. A clustering algorithm is therefore implemented to accomplish this task. Clustering recognizes patterns in the dataset and groups samples according to some notion of similarity among the features describing each observation (Wagstaff, Cardie, Rogers and Schroedl, 2001). This article uses the k-means clustering algorithm, one of the most widely used clustering algorithms, because of its simplicity and efficiency.

Figure 2 demonstrates the procedure of the k-means clustering algorithm. As described by Hartigan (1975), k-means clustering consists of two steps: assigning (E-step in Figure 2) and updating (M-step in Figure 2). In detail, the algorithm first initializes k cluster centers arbitrarily, then assigns each data point to the closest cluster center, and finally updates each cluster center to the mean of the points assigned to it. The algorithm iterates the assignment and update steps until the cluster centers no longer change. When a clustering analysis is successful, variability within the clusters is minimized and heterogeneity between groups is maximized. For a full introduction to k-means clustering, see Appendix 2. Figure 3 provides an intuitive overview of the theoretical model.
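The assign-and-update loop described above can be sketched in a few lines. This is an illustrative sketch rather than the code used in this paper: the function names and toy points are hypothetical, squared Euclidean distance is assumed, and the "arbitrary" initialization is taken to be the first k data points so that the example is deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100):
    """Plain k-means: alternate assignment and update until the centers stop moving."""
    # Arbitrary initialization (assumption for this sketch): the first k points.
    centers = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its closest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center unchanged
        if new_centers == centers:  # converged: centers no longer change
            break
        centers = new_centers
    return labels, centers


# Two made-up, well-separated groups of points (e.g. scores on two principal components).
toy_points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
labels, centers = k_means(toy_points, k=2)
print(labels)
```

Even though both initial centers lie in the same group here, the update step pulls one of them toward the distant points, and the loop settles on the two visible groups.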


Figure 2

Demonstration of k-means clustering algorithm

Figure 2 illustrates the k-means clustering algorithm. First, the algorithm initializes the cluster centroids. Then, it assigns each data point to its closest cluster center and updates each cluster center to the mean of the points assigned to it.

Naturally, no methodology is perfect in dealing with real problems. The PCA-Clustering approach advocated in this paper has a few possible drawbacks: (1) its performance might be influenced by the distribution of the data set; (2) the classification boundary might be ambiguous; (3) its out-of-sample performance still needs further statistical testing.

4. Sample and Data

4.1 Sample

To ensure the availability of accounting data and to control for country-specific firm characteristics, the sample consists of U.S. firms listed in the Compustat database. The sample period runs from 1997 to 2016. Following Heider and Ljungqvist (2013), the same sample filter is applied, which excludes firms with special properties or under specific regulations: financial firms (SIC codes 6000-6999), utilities (SIC codes 4900-4999), public-sector entities (SIC codes 9000-9999), and non-U.S. firms. Firms with any missing value are also filtered out to avoid the bias generated by missing values. To mitigate the effect of outliers, all variables are winsorized at the 1st and 99th percentiles. Each firm-year is a single observation. During early attempts to implement the PCA-Clustering approach, firm size tended to be the overwhelming factor and generated a large bias, leading to a meaningless classification into large firms and small firms. To mitigate the influence of firm size, all variables (except total book assets and firm age) are normalized by the firm's total book assets before further analysis. In this way, the PCA-Clustering approach contrasts ratios rather than absolute values of financial characteristics.

Figure 3

Overview of Theoretical Framework

[Flow diagram: Raw data: correlated high-dimensional data -> PCA -> Transformed data: uncorrelated principal components -> Optimization -> Optimal selection of retained components -> Optimization -> Optimal number of clusters based on principal components -> Clustering -> Result: clusters in terms of financial constraint -> Interpretation]

This figure gives an intuitive overview of the PCA-Clustering method of identifying financial constraints. The methodology consists of five main steps, which process the raw accounting data related to financial constraints into clusters ordered by the degree of financial constraints.

Apart from the analysis of the whole sample period, this paper also considers potential time variation in financial constraints. The sample period is divided into four 5-year subsample periods. All sample firms are required to stay in a subsample period for at least 3 consecutive years, which guarantees that long-term and consistent company performance can be observed. This minimum number of years is also useful for constructing the lag structure of the later regression analysis, and it is consistent with many existing studies (e.g. Almeida and Campello, 2007). Several studies use a relatively shorter sample period and require firms to provide observations during the whole period under research (e.g. Whited, 1992). The advantage of imposing more restrictive rules for sample selection, namely series consistency and data stability, is obvious; however, imposing them on a 20-year sample would raise an apparent concern about survivorship bias. Moreover, segmenting the sample period makes it possible to discover time-varying financial conditions (e.g. Hadlock and Pierce, 2010). In fact, it is reasonable to assume that a firm that finds itself financially constrained for several years might subsequently change its business plan and enjoy better financial status, and vice versa. It is logical that the financial obstacles and macro-economic environment that firms face change over time. After filtering, the whole-period panel consists of 37,260 firm-year observations.
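The industry filter, the removal of missing values, the scaling by total book assets and the winsorization described in Section 4.1 can be sketched with pandas as follows. This is only an illustration under stated assumptions: the column names (`sic`, `at`, `che`, `xrd`), the helper name `prepare_sample` and the toy numbers are hypothetical, the order of scaling and winsorizing is assumed, and the requirement of at least 3 consecutive years per subsample is not included.

```python
import pandas as pd


def prepare_sample(df, ratio_cols):
    """Apply SIC filters, drop missing rows, scale by assets, winsorize."""
    sic = df["sic"]
    excluded = (
        sic.between(6000, 6999)    # financial firms
        | sic.between(4900, 4999)  # utilities
        | sic.between(9000, 9999)  # public-sector entities
    )
    out = df.loc[~excluded].dropna().copy()
    # Scale the chosen variables by total book assets (hypothetical column "at").
    out[ratio_cols] = out[ratio_cols].div(out["at"], axis=0)
    # Winsorize every analysis variable at the 1st and 99th percentiles.
    for col in ["at"] + ratio_cols:
        lo, hi = out[col].quantile([0.01, 0.99])
        out[col] = out[col].clip(lower=lo, upper=hi)
    return out


# Toy firm-year rows (made-up numbers, not Compustat data).
toy = pd.DataFrame({
    "sic": [3571, 6020, 4911, 2834, 7372],
    "at": [100.0, 500.0, 300.0, 50.0, 200.0],
    "che": [20.0, 50.0, 30.0, None, 50.0],
    "xrd": [10.0, 0.0, 0.0, 5.0, 20.0],
})
clean = prepare_sample(toy, ratio_cols=["che", "xrd"])
```

In the toy data, the bank (SIC 6020) and the utility (SIC 4911) are removed by the industry filter, and the row with a missing cash value is removed by the missing-value filter.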

4.2 Summary Statistics

Table 1 reports the descriptive statistics for the firm characteristics in each subsample period. As Table 1 indicates, some firm characteristics show a clear increasing pattern. The average firm size, adjusted by the average U.S. CPI, is more than 2.5 times as large in the last subsample period as in the first: mean total assets increase from 1487.2 to 3915.48. Firms' cash holding ratio also increases: the cash stock rises from 16.6% of total book assets to more than 20%. Moreover, over the last 20 years, firms pay more dividends: the mean payout ratio increases by about 0.3 percentage points from the first to the last subsample period. Mean leverage ranges from 18.9% to 21.3% of total assets and changes little over the whole sample period. Pension and retirement expense also remains stable, at around 0.5% to 0.6% of assets. Profitability drops from 0.326 to 0.283 in the last subsample period. R&D expenses increase in the final period to 10.4% of assets, up from 9.0% in the third period.

4.3 Cross Tabulations

Table 2 presents Pearson’s correlation coefficients for the firms’ accounting variables, including total assets, cash holdings, leverage, payout ratio, EBITDA, intangible assets, pension funding, R&D expenses and firm age. The correlations between the normalized accounting variables are not as large as the correlations between the raw variables. Only a few standardized variables have large correlations. For instance, the absolute values of the correlations among cash holdings, R&D expenses and EBITDA are higher than most other correlations, ranging from 0.201 to 0.763. The two non-standardized variables also tend to have a relatively large correlation: firm size is positively correlated with firm age, with a correlation of 0.259. This implies that the longer a firm operates, the larger it tends to be. The cross tabulation suggests that some highly correlated variables could probably be condensed into one component.

Table 1

Descriptive statistics and univariate comparison by subsample periods

                        1st period              2nd period              3rd period              4th period
Variable             Mean (A)  Median (B)    Mean (A)  Median (B)    Mean (A)  Median (B)    Mean (A)  Median (B)
Book asset             1487.2      108.43     2305.46      191.79     3399.60      273.97     3915.48      308.23
Age                    14.682          10      18.331          14      20.210          16      21.973          20
Leverage                0.210       0.159       0.191       0.144       0.189       0.138       0.213       0.170
Cash holding            0.166       0.101       0.175       0.123       0.190       0.135       0.215       0.143
Payout ratio            0.013       0.000       0.014       0.000       0.013       0.000       0.016       0.000
Profitability           0.326       0.328       0.324       0.327       0.329       0.332       0.283       0.294
R&D expenses            0.095       0.047       0.092       0.044       0.090       0.034       0.104       0.042
Pension funding         0.005       0.003       0.005       0.003       0.005       0.003       0.006       0.003

Data are from Compustat for the period 1997 to 2016. The whole sample period is segmented into 4 subsample periods: 1997-2001; 2002-2006; 2007-2011; 2012-2016. Columns (A) and (B) report the mean and median of each firm characteristic, respectively. See Section 4.1 for the sample filtering criteria. Size is total book assets (item 6), measured in dollars and adjusted by the average U.S. CPI; all firm sizes are expressed in 1999 dollars. Firm age is calculated as the last year of the specific subsample period minus the first year in which the firm appears in Compustat. Cash holding is calculated as cash stock over total book assets, item 162 / item 6; leverage is defined as short-term debt plus long-term debt over total book assets, (item 9 + item 34) / item 6; the dividend payout ratio is defined as common dividends plus preferred dividends over total book assets, (item 19 + item 21) / item 6; profitability is computed as gross profit over total book assets; R&D expenses and pension funding are scaled by total book assets. To mitigate the influence of outliers, all variables are winsorized at the 1st and 99th percentiles.


Table 2

Pearson’s correlation of various firm accounting variables.

                 AT       CH Leverage   Payout   EBITDA    INTAN      XPR      XRD      AGE
AT                1
CH           -0.150        1
Leverage     -0.005   -0.028        1
Payout        0.005   -0.005   -0.000        1
EBITDA        0.077   -0.201   -0.234    0.010        1
INTAN         0.037   -0.064   -0.001   -0.001    0.103        1
XPR           0.013   -0.050    0.078    0.001   -0.049   -0.018        1
XRD          -0.078    0.251    0.176   -0.004   -0.763   -0.114    0.058        1
AGE           0.259   -0.241   -0.003    0.008    0.131    0.071    0.116   -0.132        1

This table reports the average cross-sectional Pearson’s correlation coefficients of the firm accounting variables, which are normalized by firm total book assets, for the whole sample period from 1997 to 2016. Firm total assets and firm age are not standardized and remain unchanged. The figure in each cell is rounded to three decimals. The accounting variables are listed as follows: total assets (AT), cash holdings (CH), leverage, payout ratio, EBITDA, intangible assets (INTAN), pension and retirement expenses (XPR), R&D expenses (XRD) and firm age (AGE). Leverage is the sum of long-term and short-term debt normalized by total book assets. The payout ratio is calculated as the sum of common dividends and preferred dividends, scaled by operating income before extraordinary items.


5. Empirical Result

5.1 PCA Result and Interpretation of Principal Components

First, PCA is applied directly to the whole sample as well as to the four subsamples. The principal components are derived from the variance-covariance matrix of the firm accounting data set. The full mathematical interpretation is documented in Appendix 1. The PCA is implemented in Python. After the principal components are obtained, they are sorted in descending order of their corresponding eigenvalues.
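As a minimal sketch of this step, the eigen-decomposition of the variance-covariance matrix and the descending sort can be written with NumPy as below. The code is not taken from this paper: the function name `pca_from_covariance` and the synthetic data are assumptions, and whether the variables are additionally standardized before the decomposition is left open here.

```python
import numpy as np


def pca_from_covariance(X):
    """Return eigenvalues (descending), eigenvectors (columns) and scores."""
    Xc = X - X.mean(axis=0)                 # center each variable
    cov = np.cov(Xc, rowvar=False)          # variance-covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                   # observations in component space
    return eigvals, eigvecs, scores


# Synthetic data (not the thesis sample): two correlated columns plus noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X = np.hstack([base,
               0.8 * base + 0.1 * rng.normal(size=(200, 1)),
               rng.normal(size=(200, 1))])
eigvals, eigvecs, scores = pca_from_covariance(X)
explained = eigvals / eigvals.sum()         # share of total variance per component
```

The vector `explained` corresponds to the proportion of total variation attributed to each component, and the columns of `scores` are uncorrelated by construction.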

As introduced in the previous section, PCA helps to extract the major variation of the original data. Therefore, in this article, the interpretation of the principal components allows researchers to learn about the main sources of variation of financial constraints. Before attempting to interpret the principal components, a simplified version of the coefficients of each principal component needs to be produced. Typically, computer packages that calculate principal components report the coefficients to several decimal places. However, as with other sorts of tabular data, the real interest lies in the general pattern of the coefficients rather than in their exact values (Jolliffe, 2002). The tables in Appendix 4 report the coefficients rounded to two decimal places, and the tables in Appendix 5 simplify the results still further.

The tables in Appendix 5 report the Pearson’s correlation coefficients between the principal components and the original accounting variables. A + or – sign indicates that the absolute value of a correlation is larger than 0.5, with the sign giving the direction of the correlation. Correlations below the threshold are omitted, leaving a blank cell. Interpretation of the principal components is based on which of the original accounting variables are most strongly correlated with the corresponding component. In other words, it is important to find the original variables whose coefficients are large in magnitude in either direction. A correlation between an original variable and the corresponding principal component above 0.5 in absolute value is deemed important.
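The simplification used in Appendix 5 can be sketched as follows: correlate each original variable with each component and keep only the sign where the absolute correlation exceeds 0.5. The function name `loading_signs` and the toy data are illustrative assumptions and do not reproduce the appendix tables.

```python
import numpy as np


def loading_signs(X, scores, threshold=0.5):
    """Mark each variable/component pair with '+', '-' or '' by correlation."""
    n_vars = X.shape[1]
    # corrcoef on the stacked columns; the off-diagonal block holds the
    # variable-versus-component correlations.
    corr = np.corrcoef(X, scores, rowvar=False)[:n_vars, n_vars:]
    signs = np.where(corr > threshold, "+",
                     np.where(corr < -threshold, "-", ""))
    return corr, signs


# Toy data: three variables and one component that tracks the first variable.
X = np.array([[1.0, 5.0, 1.0],
              [2.0, 4.0, -1.0],
              [3.0, 3.0, 0.0],
              [4.0, 2.0, -1.0],
              [5.0, 1.0, 1.0]])
scores = X[:, [0]] - X[:, [0]].mean()
corr, signs = loading_signs(X, scores)
```

In this toy example the first variable is marked "+", the second "-", and the third is left blank because it is uncorrelated with the component.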

Turning to the interpretation of the principal components, the first principal component for each subsample, as well as for the whole sample, is correlated mainly with cash holdings, R&D expenses and profitability. The correlations between the first principal component and the accounting variables are systematically consistent across all of these samples. The first component accounts for approximately 42% of the total variation of financial constraints. It increases with increasing cash holdings and R&D expenses, and with decreasing profitability (i.e. revenue, EBITDA, and gross profit). This suggests that the main variation of financial constraints comes from firms' cash holdings and innovation expenses and, with the opposite sign, from profitability.

The second principal component increases with two factors: firm size and age. This component suggests that, after accounting for firms' cash stock, innovation expenses and profitability, the main source of variation lies in firms’ total book value and years of operation. The second principal component explains slightly less than 32% of the total variation in each sample. As with the first component, the positive correlations between the second component and firm size and age are systematically consistent across all samples. In other words, the second principal component in every sample contrasts firm size and age. Therefore, the first and second principal components are stable over time, even though the subsamples cover different periods.

However, unlike the first and second principal components, the remaining principal components are generally not stable over time. For instance, the third principal components for the four subsample periods share some similarities but nevertheless differ. In the first and fourth periods, the third component is driven entirely by firms' leverage: in both periods it is positively correlated with the debt level. However, in the second and third periods, this component contrasts the long-term debt level with gross profit, which enter with opposite signs. The fourth principal components differ even more between the subsamples. In the first three subsamples, this component contrasts revenue. However, in each period, other variables also contribute to this component. In the second period, it increases with short-term liabilities, while in the third period, R&D expenses also make a major contribution. In the last period, it contrasts firm size (i.e. property, plant and equipment) with intangible assets, which enter with opposite signs.


Since the first and second principal components together contribute 74% of the total variation of financial constraints, the remaining components account for only the other 26%. The first two principal components are thus capable of explaining the majority of the variation of financial constraints, and the remaining components are not as critical. Moreover, the accounting variables that the first and second principal components mainly contrast are widely accepted in the literature. Various researchers have recognized that some firm accounting variables, such as leverage and cash flow, have an “endogenous nature” which might lead to a sample-specific or non-monotonic relation to financial constraints (e.g. Hennessy and Whited, 2007). By contrast, the validity of the accounting variables contrasted by the first and second principal components is generally not questioned by the previous literature. Additionally, the correlations between the remaining components and the firm accounting variables are not stable over time, and dropping the third principal component onwards does not affect the results significantly. Hence, the previous discussion, examples and evidence make a convincing case that the first and second principal components should figure prominently among the components that contribute to the variation of financial constraints. It is therefore reasonable to proceed with the analysis based exclusively on the first two principal components.

This section has described the major sources of variation of financial constraints within the firm datasets. As introduced in Section 3.1, the selected variables are potential measures of financial constraints which have been widely studied in the previous literature. Therefore, the interpretation of the principal components provides insight into the major ingredients of financial constraints. The results of PCA indicate that the first principal component, which contrasts firm cash holdings, R&D expenses and profitability, accounts for the most variation of financial constraints. The second principal component, which measures firm size and age, contributes the second largest proportion of the total variation. These two principal components are systematically consistent over time and are used for further investigation.

5.2 Classification Scheme of Financial Constraints using Principal Components

This paper proceeds to assign each firm-year observation to a financial constraint group based on the linearly transformed data (i.e. principal components). Therefore, the classification scheme is determined by the magnitude of the principal components instead of the raw values of firm accounting variables. Recalling the interpretation of the first principal component, it is positively correlated with cash reserves and with research and development expenses, while it is negatively related to the profitability of the firm. Denis and Sibilkov (2010) point out that financially constrained firms tend to hold greater cash reserves than unconstrained firms. Their findings suggest that higher cash stocks allow financially constrained firms to pursue value-enhancing investments which would otherwise very likely be bypassed due to the high cost of external financing (e.g. Opler, Pinkowitz, Stulz and Williamson, 1999). Thus, the likelihood of a firm being financially constrained rises with its cash holdings.

Another important factor which is positively correlated with the first component is innovation expense. This factor links to the findings of Almeida, Hsu and Li (2012). They study the relationship between financial constraints and firm innovation, and conclude that more constrained firms spend more on research and development than unconstrained firms. The reasoning behind their findings is that firms with low free cash flow are more likely to make productive and value-increasing R&D investments. Therefore, financially constrained firms might be willing to spend more on innovative projects. As a result, a higher ratio of R&D expenses to total book assets increases the likelihood of a firm being financially constrained. The last important factor, profitability, is negatively correlated with the first component. Naturally, firms with higher profitability may enjoy better financial health, and hence are less likely to be constrained. Summarizing these arguments, a larger value of the first principal component suggests more severe financial constraints.

Turning to the interpretation of the second principal component, it is positively related to the size and age of firms. Numerous empirical studies emphasize the importance of firm size and age as indicators of financial constraints (e.g. Hadlock and Pierce, 2010; Whited and Wu, 2006). Larger firms usually have consolidated businesses (recall that size and age are positively correlated, see Table 2) and might encounter fewer financial frictions when their demand for external financing increases, whereas smaller firms are less recognized by the capital market due to their shorter operating history, and hence suffer more from capital market imperfections (Gilchrist and Himmelberg, 1995). Following this argument, firms with smaller book assets and a shorter history tend to be considered constrained. Thus, a smaller value of the second principal component indicates a higher level of financial constraints (recall that the second component is positively correlated with firm size and age).

In general, the baseline classification scheme of financial constraints in terms of the two principal components is as follows: (1) firms with a larger value of the first principal component and simultaneously a smaller value of the second principal component are identified as more financially constrained; (2) firms with a smaller value of the first principal component and simultaneously a larger value of the second principal component are considered less financially constrained.

5.3 Classification Result

The clustering of firms with regard to their financial status is based on the transformed data, namely the first and second principal components derived in the previous section. This article does not follow the KZ methodology and terminology, which creates five mutually exclusive groups. The reason is that the validity of five pre-determined clusters based on firm financial status is still doubtful. Given that the ground truth of the number of clusters is unknown, evaluation has to be performed using the model itself. Following the methodology proposed by Rousseeuw (1987), the number of clusters is determined by the silhouette score. Without any prior assumption about the number of clusters, the silhouette score evaluates the quality of a clustering by comparing within-cluster tightness with between-cluster separation. A higher score implies that the clusters are dense and well separated. With a maximum of 10 possible clusters imposed, the silhouette score is calculated for each candidate number of clusters and the number with the highest score is chosen.
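The selection rule described here can be illustrated with scikit-learn, using `KMeans` for the clustering and `silhouette_score` for the evaluation. This is a hedged sketch rather than the code behind the results in this paper: the helper `choose_n_clusters`, its parameters and the synthetic data standing in for the two retained components are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def choose_n_clusters(X, k_min=2, k_max=10, seed=0):
    """Return (best_k, {k: silhouette}) over the candidate cluster counts."""
    scores = {}
    for k in range(k_min, k_max + 1):
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    # The candidate with the highest silhouette score is selected.
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic stand-in for the two retained components: three separated blobs.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * rng.normal(size=(40, 2)) for c in blob_centers])
best_k, scores = choose_n_clusters(X, k_max=6)
```

The silhouette score is only defined for at least two clusters, which is why the candidate range starts at 2.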

Figure 4 presents the silhouette score for each period. As shown in Figure 4, the optimal number of groups generated by the silhouette algorithm is four for all subsamples. Therefore, each subsample is separated into four cohesive clusters in terms of financial status. The result of the silhouette score challenges the classification scheme of the KZ methodology. It implies that the five-group classification scheme suggested by KZ is possibly not correct. In other words, the boundaries, especially between LNFC (likely not financially constrained), PFC (potentially financially constrained) and LFC (likely financially constrained), are ambiguous. The sample firms possibly cannot be segmented into five groups in terms of their financial status. The advantage of adopting the silhouette score to determine the optimal number of clusters is that the model is entirely data-driven. It does not depend on any prior assumptions or hypotheses; the optimal number of clusters is calculated from the data itself.

Figure 4

Silhouette score for each subsample

Figure 4 plots the silhouette score corresponding to each number of clusters in the range from 2 to 10 for each subsample (above: 1st and 2nd period; below: 3rd and 4th period). The silhouette algorithm calculates the score by contrasting within-group tightness with between-group separation. A higher score implies that the clusters are cohesive and well separated. The silhouette score corresponding to four clusters reaches its peak in every subsample period. Thus, the algorithm chooses four as the optimal number of clusters.

Table 3 presents the frequency of the financial constraint categories for each subsample. The frequencies share some similarities across all subsample periods. In every subsample, cluster 2 accounts for the lowest proportion of the sample firms, ranging from 1.2% to 4.5%. Of all four clusters, cluster 3 accounts for the highest percentage of the sample, at roughly 50% of the total observations.


Table 3

Proportion of Financial Constraint Classifications

                     Whole (1)  1st period (2)  2nd period (3)  3rd period (4)  4th period (5)
Cluster 1               17.48%          23.86%          10.36%           9.15%          17.03%
Cluster 2                3.06%           4.51%           1.71%           1.84%           1.19%
Cluster 3               51.72%          49.00%          49.18%          54.39%          51.40%
Cluster 4               27.74%          22.63%          38.75%          34.62%          30.38%
Observations            37,260           7,504           9,697           8,531           7,844

This table reports the fraction of subsample observations in which a single observation is classified to the indicated financial constraint group. The figures in Columns 1 to 5 belong to the sample of 37,260 Compustat firm-year observations through sample period 1997 to 2016. Each figure represents the proportion of indicated financial constraint cluster to the subsample. By calculating the silhouette score, the sample will be segmented into 4 groups: Cluster 1 to Cluster 4. The clustering is based on the transformed firm accounting dataset, and the algorithm used in classification has been discussed exhaustively in the previous sections.

Note that the sum of the observations of the four subsample periods does not equal the number of observations of the whole sample. The reason is that firms are required to stay in each subsample for at least 3 years. Some firms are therefore excluded from the subsamples by the sample filter. Table 4 reports, for each of the four clusters, the mean values of the two principal components on which the clustering depends. In each subsample period, cluster 1 has the largest value of the first principal component and the smallest value of the second principal component among all four clusters. Following the classification scheme introduced in the previous section, cluster 1 can be labeled as the financially constrained (FC) group. Cluster 4, compared with the other clusters, has the smallest first principal component and the largest second principal component. Therefore, cluster 4 can be identified as the not financially constrained (NFC) group.

Clusters 2 and 3 display mixed information. Cluster 2 has the second largest mean value of the first principal component and the second largest mean value of the second principal component. Compared with cluster 1, the firms in cluster 2 enjoy better financial health. However, in contrast with cluster 3, cluster 2 has relatively larger values of both principal components. Since the first principal component contributes more variation of financial constraints than the second, it is reasonable that the severity of financial constraints depends more strongly on the first principal component. This means that the financial status of the firms categorized in cluster 3 is better than that of the firms classified in cluster 2. Among all clusters, cluster 3 has the second smallest mean value of the first component and the second smallest mean value of the second component. The financial health of cluster 3 is, therefore, slightly worse than that of cluster 4. Hence, cluster 2 is identified as the likely financially constrained (LFC) group, while cluster 3 is considered the likely not financially constrained (LNFC) group.

Table 4

Clustering Result and Mean of the Corresponding Principal Components

Clusters           Cluster 1 (1)  Cluster 2 (2)  Cluster 3 (3)  Cluster 4 (4)

I. Whole period through 1997 to 2016:
1st Component              1.935          0.928         -1.417         -2.571
2nd Component             -0.839          0.622         -0.363          1.158

II. 1st period through 1997 to 2001:
1st Component              3.264          0.736         -0.989         -2.228
2nd Component             -0.558          0.993         -0.443          2.024

III. 2nd period through 2002 to 2006:
1st Component              1.803          1.060         -0.565         -1.950
2nd Component             -0.543          0.929         -0.288          1.336

IV. 3rd period through 2007 to 2011:
1st Component              2.279          0.564         -1.363         -2.769
2nd Component             -0.302          0.822         -0.122          2.972

V. 4th period through 2012 to 2016:
1st Component              3.376          1.055         -1.487         -2.483
2nd Component             -0.329          0.547         -0.288          1.310

This table reports the mean value of the first and second principal components for each cluster. Panel I presents the values of the principal components for the whole sample. Panels II to V present the values for the respective subsamples. The values are derived by linearly transforming the original data set, i.e. by multiplying the eigenvector of the corresponding principal component with the vector of accounting variables. Each value in a cell is the mean of the corresponding principal component. A larger value of the 1st component and a smaller value of the 2nd component imply a higher degree of financial constraints. The 1st principal component contributes more variation of financial constraints than the 2nd component.
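The labeling argument above gives priority to the first principal component. A simplified sketch of that rule, which ranks the clusters by their mean first component only, is shown below; the helper name `label_clusters` is hypothetical, and the input values are the whole-period means reported in Table 4.

```python
import numpy as np


def label_clusters(pc1_means):
    """Map cluster numbers to labels by descending mean of the first component."""
    names = ["FC", "LFC", "LNFC", "NFC"]
    order = np.argsort(pc1_means)[::-1]   # most constrained cluster first
    return {int(cluster) + 1: name for cluster, name in zip(order, names)}


# Whole-period means of the first component, as reported in Table 4.
pc1_means = np.array([1.935, 0.928, -1.417, -2.571])
labels = label_clusters(pc1_means)
```

A fuller version could use the mean of the second component as a secondary criterion, in line with the classification scheme of Section 5.2.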

Table 5 reports the summary statistics of the accounting variables for the sample firms, for each pre-determined subsample and by the degree of constraints under the advocated classification scheme. The figure in each cell is the mean of the accounting variable. Several differences between the firms that belong to the more constrained and the less constrained groups can be found. The patterns of cash holdings, R&D expenses, firm size and age are consistent with expectations, as they are so by construction. Comparing leverage across the groups ordered by the severity of financial constraints shows that the debt level is not a monotonic indicator of financial constraints. In other words, leverage is not monotonically correlated with the level of financial constraints. This finding agrees with the arguments proposed by Hennessy and Whited (2007) and by Acharya, Almeida and Campello (2007). These researchers are concerned that the “endogenous nature” of leverage might lead to a non-monotonic or sample-specific relation to financial constraints. The finding about leverage in this paper casts doubt on the role of leverage as a predictor of financial constraints.

Apart from the peculiar finding about the relation between leverage and financial constraints, the descriptive statistics of the payout ratio (excluding stock repurchases) are also worth studying. The payout ratio reveals time variation. In the first and second subsample periods, the payout ratio is negatively correlated with the level of financial constraints. Specifically, more constrained firms (FC and LFC) pay fewer dividends than less constrained firms. However, the situation reverses in the last two subsample periods, where the evidence suggests that the more financially constrained firms tend to have a higher payout ratio than the unconstrained firms. This is not surprising, as several authors have noticed that the average payout ratio has a declining trend (e.g. Fama and French, 2000).


Table 5: Descriptive statistics for financial constraint groups of subsamples

1st period               FC       LFC      LNFC       NFC
Size                  75.73   2325.91   1034.61   3788.11
Age                   7.151    21.060    14.701    22.306
Leverage              0.086     0.211     0.216     0.339
Cash holding          0.390     0.110     0.098     0.063
Payout ratio          0.004     0.005     0.041     0.032
R&D expenses          0.236     0.062     0.048     0.026
Pension funding       0.003     0.005     0.005     0.005

2nd period               FC       LFC      LNFC       NFC
Size                  46.60   1038.17    628.93   4532.09
Age                  10.389    14.399    14.900    25.596
Leverage              0.149     0.135     0.196     0.272
Cash holding          0.436     0.222     0.195     0.078
Payout ratio          0.007     0.005     0.024     0.052
R&D expenses          0.368     0.086     0.040     0.024
Pension funding       0.007     0.004     0.003     0.008

3rd period               FC       LFC      LNFC       NFC
Size                  47.80   1868.73   1253.15   6805.72
Age                  19.281    26.200    24.668    33.354
Leverage              0.148     0.138     0.281     0.281
Cash holding          0.513     0.251     0.180     0.095
Payout ratio          0.022     0.006     0.016     0.013
R&D expenses          0.425     0.077     0.045     0.024
Pension funding       0.004     0.004     0.007     0.008

4th period               FC       LFC      LNFC       NFC
Size                  97.91   4469.93   1026.86  10643.54
Age                  10.261    24.893    21.225    29.134
Leverage              0.152     0.195     0.279     0.275
Cash holding          0.540     0.247     0.171     0.105
Payout ratio          0.015     0.011     0.022     0.016
R&D expenses          0.345     0.072     0.047     0.024
Pension funding       0.003     0.005     0.008     0.009

This table reports the mean of the indicated variable for each financial constraint group in each subsample period. All variables are calculated from Compustat variables. “FC” represents the financially constrained group, “LFC” the likely financially constrained group, “LNFC” the likely not financially constrained group and “NFC” the not financially constrained group. Size is total book assets adjusted by the average U.S. CPI and measured in 1999 dollars. Firm age is calculated as the last year of the specific subsample period minus the first year in which the firm appears in Compustat. Cash holding, R&D expenses and pension funding are scaled by total book assets.


Fama and French (2000) explain that this phenomenon is partially caused by the changing financial characteristics of firms and by a growing reluctance to distribute income (i.e. regardless of their characteristics, firms are less willing to pay dividends). The declining propensity of firms to pay dividends also supports the argument made in several articles (e.g. Grullon and Michaely, 2002; Skinner, 2008) that stock repurchases are increasingly used as a substitute for dividends, which results in a lower payout ratio.

6. Discussion

6.1 PCA-Clustering Method and Other Measures of Financial Constraints

The PCA-Clustering method has several advantages over other measures of financial constraints. First, the PCA-Clustering model can include more variables in the analysis than other indices and processes the selected variables in a more sophisticated way. In theory, the model can take into consideration as many variables related to financial constraints as are available. PCA transforms the correlated variables into orthogonal principal components, condenses the important information in the accounting variables, discards redundant variables and detects the time-varying main source of variation in financial constraints. By contrast, the coefficients of the existing indices are estimated directly from a small sample with a narrow set of pre-determined variables. These methods must trade off completeness against the risk of multicollinearity. PCA does not suffer from this dilemma, because the principal components are orthogonal by construction.

Second, unlike the previous indices, the PCA-Clustering method requires no prior assumptions. During variable selection and engineering, PCA automatically identifies the variables that explain the most variation in financial constraints. In the previous literature, the number of groups in terms of the degree of financial constraints is determined by prior assumptions (e.g. Kaplan and Zingales, 1997; Hadlock and Pierce, 2010). This paper instead uses the silhouette score to determine the optimal number of clusters. Therefore, every step of the PCA-Clustering model can be optimized mathematically. This avoids errors caused by wrong or unproven prior assumptions and improves the performance and validity of the model.
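To illustrate how a silhouette score can select the number of clusters, the following is a minimal sketch in Python using only the standard library. It is not the code used in this paper: the synthetic two-dimensional points merely stand in for principal-component scores, and the function names, the farthest-point initialisation and the candidate range of k are assumptions made for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100):
    """Lloyd's k-means with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels


def silhouette_score(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical data: three well-separated groups of 30 points each.
rng = random.Random(0)
centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in centres
    for _ in range(30)
]

# Evaluate each candidate k and keep the one with the highest score.
scores = {k: silhouette_score(points, kmeans(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

The number of clusters is chosen by the data rather than fixed in advance: each candidate k is scored, and the k with the highest mean silhouette is retained.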


Finally, the PCA-Clustering method can be implemented easily on different samples and is more robust than other existing measures. The PCA-Clustering algorithm is available in various software packages. Robustness is less of a concern for this measure because the parameters are re-estimated every time the algorithm is applied to a different sample. Therefore, the general concern about the stability of coefficients raised by Whited and Wu (2006) might be mitigated.

To examine the validity of the PCA-Clustering method, this paper conducts quantitative tests. It borrows the methodology of Almeida et al. (2004) to check whether the PCA-Clustering method successfully separates constrained from unconstrained firms. The idea of Almeida et al. (2004) is to detect financial constraints within a sample of firms by examining whether the firms exhibit a significant sensitivity of cash holdings to cash flow. They hypothesize that a positive and significant cash flow sensitivity of cash indicates financial constraints. Their main argument is that constrained firms save more cash as cash flow increases, whereas the cash savings of unconstrained firms do not change systematically. Following Almeida et al. (2004), this paper runs several cash flow sensitivity of cash regressions.
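For reference, the baseline specification behind this test can be written as below. This is a sketch in notation introduced here for illustration: the subscripts i and t for firm and year, the coefficients \alpha and the error term \varepsilon are assumptions, and the regressors are taken to be cash flow, Q and size as in Almeida et al. (2004). The exact variable definitions and any fixed effects used in this paper may differ.

$$ \Delta \mathit{CashHoldings}_{i,t} = \alpha_0 + \alpha_1 \mathit{CashFlow}_{i,t} + \alpha_2 Q_{i,t} + \alpha_3 \mathit{Size}_{i,t} + \varepsilon_{i,t} $$

Under the hypothesis above, \alpha_1 should be positive and significant for constrained firms and not systematically different from zero for unconstrained firms.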

Testing the validity of the PCA-Clustering model requires a comparison with other a priori measures of financial constraints. There are various plausible approaches to classifying firms into financially constrained and unconstrained groups, and the choice of a particular measure is still debated in the literature. Without convincing priors about which method is optimal, the PCA-Clustering approach and four other widely used measures of financial constraints are adopted to classify the sample:

• Scheme #1: Firms are identified as financially constrained or unconstrained by the PCA-Clustering approach. Firms classified in the “NFC” group are assigned to the financially unconstrained group. Firms labeled “FC” or “LFC” are assigned to the financially constrained group. The PCA-Clustering approach is described in detail in the previous sections.


• Scheme #2: The KZ-index is computed for every year over the 1997 to 2016 period based on the results in Kaplan and Zingales (1997), using the original variable definitions:

$$ \mathit{KZindex} = -1.002 \times \mathit{CashFlow} + 0.283 \times Q + 3.139 \times \mathit{Leverage} - 39.368 \times \mathit{Dividends} - 1.315 \times \mathit{CashHoldings} $$

Firms in the top (bottom) three deciles of KZ index ranking are identified as financially constrained (unconstrained).

• Scheme #3: The SA-index, developed by Hadlock and Pierce (2010), is calculated for every year of the sample period 1997 to 2016. Firms are sorted into financial status groups based on this index. The SA-index is computed as follows:

$$ \mathit{SAindex} = -0.737 \times \mathit{Size} + 0.043 \times \mathit{Size}^2 - 0.040 \times \mathit{Age} $$

Following Hadlock and Pierce (2010), firms in the top (bottom) three deciles of the SA-index ranking are considered financially constrained (unconstrained).

• Scheme #4: Firms are ranked based on their payout ratio in every year over the 1997 to 2016 period. Firms in the bottom (top) tercile of the annual payout ratio distribution are classified as financially constrained (unconstrained). The payout ratio is computed as the ratio of total payout distributions (dividends plus stock repurchases) to operating income before extraordinary items.

• Scheme #5: Firms are sorted based on their book asset size over 1997 to 2016. Firms in the bottom (top) tercile of the size distribution are assigned to the financially constrained (unconstrained) group. The rankings are made on an annual basis. This approach follows Gilchrist and Himmelberg (1995), who measure financial constraints by comparing firm asset size. They argue that small firms are generally young and less well known, and therefore more vulnerable to asymmetric information in the capital market.
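The index-based and rank-based schemes above can be sketched in Python as follows. This is an illustrative sketch, not the code used in this paper. The coefficients are those quoted in Schemes #2 and #3, while the function names, the "middle" label, the tie-breaking by list order and the ten hypothetical firms are assumptions. The inputs are assumed to be already constructed according to the original variable definitions; in particular, size is taken here to be the natural logarithm of book assets.

```python
import math


def kz_index(cash_flow, q, leverage, dividends, cash_holdings):
    """KZ-index with the coefficients quoted in Scheme #2."""
    return (-1.002 * cash_flow + 0.283 * q + 3.139 * leverage
            - 39.368 * dividends - 1.315 * cash_holdings)


def sa_index(size, age):
    """SA-index with the coefficients quoted in Scheme #3."""
    return -0.737 * size + 0.043 * size ** 2 - 0.040 * age


def classify_by_rank(values, tail_fraction, high_is_constrained):
    """Label firms in one year's cross-section by their rank in `values`.

    The lowest and highest `tail_fraction` of firms receive opposite labels,
    and the remaining firms are labeled "middle". Ties are broken by list order.
    """
    n = len(values)
    n_tail = math.floor(n * tail_fraction + 1e-9)  # firms in each tail
    order = sorted(range(n), key=lambda i: values[i])  # ascending rank
    if high_is_constrained:
        low, high = "unconstrained", "constrained"
    else:
        low, high = "constrained", "unconstrained"
    labels = ["middle"] * n
    for i in order[:n_tail]:
        labels[i] = low
    for i in order[n - n_tail:]:
        labels[i] = high
    return labels


# Hypothetical firms in a single year: (log book assets, age in years).
firms = [(2.1, 3), (3.0, 5), (3.8, 8), (4.5, 12), (5.2, 15),
         (5.9, 18), (6.4, 22), (7.0, 26), (7.7, 30), (8.5, 35)]
sa_scores = [sa_index(size, age) for size, age in firms]

# Schemes #2 and #3: top three deciles of the index are constrained.
sa_groups = classify_by_rank(sa_scores, 0.3, high_is_constrained=True)

# Scheme #4: bottom tercile of the payout ratio is constrained.
payout = [0.00, 0.01, 0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30, 0.40]
payout_groups = classify_by_rank(payout, 1 / 3, high_is_constrained=False)

# Scheme #5: bottom tercile of size is constrained.
sizes = [size for size, _ in firms]
size_groups = classify_by_rank(sizes, 1 / 3, high_is_constrained=False)

print(sa_groups)
print(payout_groups)
print(size_groups)
```

In a panel setting, `classify_by_rank` would be applied separately to each year's cross-section, since all four benchmark schemes rank firms on an annual basis.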

Table 6 reports the results of the cash flow sensitivity of cash regressions. Following the methodology of Almeida et al. (2004), this paper regresses the annual change in standardized cash holdings on cash flow, Q, and the natural logarithm of book assets. The coefficient of interest is the estimated coefficient on cash flow. A positive and significant
